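String Operations, File Operations, and English Word-Frequency Preprocessing - 浅锘晗
Assignment source: https://edu.cnblogs.com/campus/gzcc/GZCC-16SE1/homework/2684
1. String operations:
- Parse an ID card number: date of birth, sex, place of registration, and so on.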
id_number = input("Enter the ID card number: ")
# Digits 1-2: province code; digits 7-14: date of birth (YYYYMMDD); digit 17: sex.
shengfen = id_number[0:2]
nian = id_number[6:10]
yue = id_number[10:12]
ri = id_number[12:14]
# Province name -> two-digit province code.
diqu = {'北京': 11, '上海': 31, '湖北': 42, '云南': 53, '天津': 12, '江苏': 32, '湖南': 43, '西藏': 54,
        '河北': 13, '浙江': 33, '广东': 44, '陕西': 61, '山西': 14, '安徽': 34, '广西': 45, '甘肃': 62,
        '内蒙': 15, '福建': 35, '海南': 46, '青海': 63, '辽宁': 21, '江西': 36, '重庆': 50, '宁夏': 64,
        '吉林': 22, '山东': 37, '四川': 51, '新疆': 65, '黑龙江': 23, '河南': 41, '贵州': 52,
        '新疆兵团': 66, '台湾': 71, '香港': 81, '澳门': 82}
# Replace the two-digit code with the province name, if the code is known.
for key in diqu:
    if shengfen == str(diqu[key]):
        shengfen = key
        break
print("Parsed result:")
print('Province: ' + shengfen)
print("Date of birth: " + nian + "-" + yue + "-" + ri)
# The second-to-last digit is even for female and odd for male.
if int(id_number[-2]) % 2 == 0:
    print("Sex: female")
else:
    print("Sex: male")
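The slicing above can be wrapped in a function that also validates its input. The sketch below is one possible version, not part of the original assignment: the name `parse_id` is hypothetical, and the check-character rule (the weights and the `10X98765432` table) is the one defined for 18-digit numbers in GB 11643-1999, which the script above does not verify. The sample number is a commonly used illustrative value.

```python
from datetime import date

# Weights and check characters for 18-digit ID numbers (GB 11643-1999).
WEIGHTS = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2]
CHECK_CHARS = "10X98765432"


def parse_id(id_number):
    """Hypothetical helper: split an 18-digit ID number into its fields."""
    if len(id_number) != 18 or not id_number[:17].isdigit():
        raise ValueError("expected 17 digits followed by a check character")
    # Weighted sum of the first 17 digits, reduced modulo 11, selects the check character.
    total = sum(int(d) * w for d, w in zip(id_number[:17], WEIGHTS))
    if CHECK_CHARS[total % 11] != id_number[17].upper():
        raise ValueError("check character does not match")
    # date() raises ValueError for impossible dates such as month 13.
    birthday = date(int(id_number[6:10]), int(id_number[10:12]), int(id_number[12:14]))
    sex = "male" if int(id_number[16]) % 2 else "female"
    return {"province_code": id_number[:2], "birthday": birthday, "sex": sex}


print(parse_id("11010519491231002X"))
```

Returning a dictionary instead of printing keeps the parsing separate from the output, so the province lookup from the script above can be applied to `province_code` afterwards.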
- Caesar cipher encoding and decoding
Encryption:
def Du():
    # Read the plaintext from the file.
    XinXi = open(r"..\bd\venv\Password\monkey.txt", 'r', encoding='utf8')
    JiaMi = XinXi.read()
    XinXi.close()
    return JiaMi

def Xie():
    # Write the module-level JiaMi back to the same file.
    XinXi = open(r"..\bd\venv\Password\monkey.txt", 'w', encoding='utf8')
    XinXi.write(JiaMi)
    XinXi.close()

def JM(b):
    # Shift every character forward by 3 code points (the built-in chr replaces unichr).
    c = ''
    for i in b:
        c += chr(ord(i) + 3)
    return c

JiaMi = Du()
print("Before encryption: " + JiaMi)
JiaMi = JM(JiaMi)
print("After encryption: " + JiaMi)
Xie()
Decryption:
XinXi = open(r"..\bd\venv\Password\monkey.txt", 'r+', encoding='utf8')
JieMi = XinXi.read()
c = ''
for i in JieMi:
    # Shift every character back by 3 code points.
    c += chr(ord(i) - 3)
print("Before decryption: " + JieMi)
JieMi = c
# Rewind and empty the file before writing the decrypted text.
XinXi.seek(0)
XinXi.truncate()
print("After decryption: " + JieMi)
XinXi.write(JieMi)
XinXi.close()
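The scripts above shift every character, including spaces and newlines, by three code points, so `z` becomes `}` instead of wrapping around to `c`. A classic Caesar cipher rotates only the letters within the alphabet. The following is a minimal sketch of that variant, offered as an alternative rather than as the assignment's method; the function name `caesar` is hypothetical, and the default shift of 3 is chosen to match the scripts above.

```python
import string


def caesar(text, shift=3):
    """Rotate ASCII letters by `shift` positions; leave other characters unchanged."""
    shift %= 26
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Map each letter to the letter `shift` places later, wrapping at the end.
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)


secret = caesar("Hello, xyz!")
print(secret)              # Khoor, abc!
print(caesar(secret, -3))  # Hello, xyz!
```

Decryption is the same function with the negated shift, so a single helper covers both directions.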
- Inspecting URL patterns and generating URLs in batch
for i in range(1, 16):
    print('http://news.gzcc.cn/html/xiaoyuanxinwen/{}.html'.format(i))
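If the URLs are needed later, for example to fetch each page, it may be more convenient to collect them in a list than to print them. A small sketch, where `build_urls` and `BASE` are hypothetical names and the page range 1 to 15 is taken from the loop above:

```python
BASE = "http://news.gzcc.cn/html/xiaoyuanxinwen/{}.html"


def build_urls(first=1, last=15):
    """Return the list-page URLs for pages first..last (inclusive)."""
    return [BASE.format(i) for i in range(first, last + 1)]


urls = build_urls()
print(len(urls))
print(urls[0])
print(urls[-1])
```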
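2. English word-frequency preprocessing
- Download the lyrics of an English song, or an English article or novel.
- Convert all uppercase letters to lowercase.
- Replace all other separators (,.?!) with spaces.
- Split the text into individual words.
- Count how many times each word occurs.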
XinXi = open(r"..\bd\venv\GeCi\geci.txt", 'r', encoding='utf8')
GeCi = XinXi.read()
XinXi.close()
# Lowercase the text and turn punctuation into spaces.
GeCi = GeCi.lower()
for fuhao in ',.?!()':
    GeCi = GeCi.replace(fuhao, ' ')
# split() with no argument splits on any whitespace (including newlines) and yields no empty strings.
GeCi_ist = GeCi.split()
ZiDian = {}
for word in GeCi_ist:
    ZiDian[word] = ZiDian.get(word, 0) + 1
# Sort by count, most frequent first.
print(sorted(ZiDian.items(), key=lambda x: x[1], reverse=True))
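The steps listed above (lowercase, remove punctuation, split, count) can also be written with the standard library's `re` and `collections.Counter`. This is an alternative sketch, not the approach the assignment asks for; the sample lyric and the name `word_counts` are made up for illustration.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lowercase words; a word is a run of letters and apostrophes."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "Let it go, let it go! Can't hold it back anymore."
counts = word_counts(sample)
print(counts["it"])           # 3
print(counts.most_common(3))
```

Because `re.findall` extracts the words directly, no separator has to be replaced by hand and no empty strings appear in the result.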
3. File operations
- Same directory, absolute path, relative path
open(r"..\bd\venv\GeCi\geci.txt", 'r+', encoding='utf8')
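The three kinds of path named in the bullet above differ only in how the file name is resolved. A short sketch using `pathlib`: the file name `geci.txt` and the folder `GeCi` come from the post, while the variable names are placeholders chosen for this example.

```python
from pathlib import Path

# Same directory: a bare file name is looked up in the current working directory.
same_dir = Path("geci.txt")
# Relative path: go up one level, then into the GeCi folder.
relative = Path("..") / "GeCi" / "geci.txt"
# Absolute path: starts at the filesystem root, independent of the working directory.
absolute = Path.cwd() / "geci.txt"

print(same_dir.is_absolute())
print(relative.is_absolute())
print(absolute.is_absolute())
```

Any of these objects can be passed directly to `open(...)` in place of a string.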
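- Caesar cipher: read the secret message from a file, encrypt or decrypt it, and save it back to the file.
XinXi = open(r"..\bd\venv\Password\monkey.txt", 'r+', encoding='utf8')
JieMi = XinXi.read()
c = ''
for i in JieMi:
    # Shift every character back by 3 code points.
    c += chr(ord(i) - 3)
print("Before decryption: " + JieMi)
JieMi = c
# Rewind and empty the file before writing the decrypted text.
XinXi.seek(0)
XinXi.truncate()
print("After decryption: " + JieMi)
XinXi.write(JieMi)
XinXi.close()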
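- Word frequency: download the lyrics of an English song, or an English article or novel, and save it as a UTF-8 file. Read the text from the file and process it.
XinXi = open(r"..\bd\venv\GeCi\geci.txt", 'r', encoding='utf8')
GeCi = XinXi.read()
XinXi.close()
# Lowercase the text and turn punctuation into spaces.
GeCi = GeCi.lower()
for fuhao in ',.?!()':
    GeCi = GeCi.replace(fuhao, ' ')
# split() with no argument splits on any whitespace (including newlines) and yields no empty strings.
GeCi_ist = GeCi.split()
ZiDian = {}
for word in GeCi_ist:
    ZiDian[word] = ZiDian.get(word, 0) + 1
# Sort by count, most frequent first.
print(sorted(ZiDian.items(), key=lambda x: x[1], reverse=True))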
4. Function definitions
- Encryption function
def JiaM(b):
    # Shift every character forward by 3 code points.
    c = ''
    for i in b:
        c += chr(ord(i) + 3)
    return c
- Decryption function
def JieM(b):
    # Shift every character back by 3 code points.
    c = ''
    for i in b:
        c += chr(ord(i) - 3)
    return c
- Text-reading function
def Du():
    # Read and return the whole file as one string.
    XinXi = open(r"..\bd\venv\Password\monkey.txt", 'r', encoding='utf8')
    JiaMi = XinXi.read()
    XinXi.close()
    return JiaMi
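The functions in this section can be combined into a read, transform, write round trip. The sketch below assumes hypothetical helpers `read_text`, `write_text`, and `shift_text` that take the file path and the offset as parameters instead of hard-coding them. It works in a temporary directory so that it does not touch `monkey.txt`, and it uses `with` so the file is closed even if an error occurs.

```python
import os
import tempfile


def shift_text(text, offset):
    """Shift every character by `offset` code points (same rule as JiaM and JieM)."""
    return ''.join(chr(ord(ch) + offset) for ch in text)


def read_text(path):
    with open(path, 'r', encoding='utf8') as f:
        return f.read()


def write_text(path, text):
    with open(path, 'w', encoding='utf8') as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "message.txt")  # placeholder file name
    write_text(path, "monkey")
    # Encrypt in place, then decrypt in place.
    write_text(path, shift_text(read_text(path), 3))
    encrypted = read_text(path)
    write_text(path, shift_text(read_text(path), -3))
    decrypted = read_text(path)

print(encrypted)  # prqnh|
print(decrypted)  # monkey
```

Passing the offset as a parameter means one function serves for both encryption (`3`) and decryption (`-3`).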