Converting an Image to Text in Python and Copying the Translation to the Clipboard
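I. Environment setup:
1. PySimpleGUI: pip3 install pysimplegui
2. pytesseract needs a working Tesseract installation:
(1) Install Tesseract first (macOS, via Homebrew):
brew install tesseract        # install the Tesseract OCR engine
brew install tesseract-lang   # install the language packs (600+ MB, which hurts a little)
(2) Install pytesseract:
pip3 install pytesseract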
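Before moving on, it can help to confirm that the pieces installed above are actually visible to Python. The following is a minimal sketch that uses only the standard library; `check_environment` is a hypothetical helper name, not part of any of the packages mentioned here.

```python
import importlib.util
import shutil


def check_environment(modules=("PySimpleGUI", "pytesseract", "PIL"),
                      binaries=("tesseract",)):
    """Return a dict mapping each required name to True if it was found."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    for name in binaries:
        # shutil.which searches PATH the same way the shell would.
        status[name] = shutil.which(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If `tesseract` shows up as missing even though Homebrew installed it, the likely cause is that the Homebrew bin directory is not on the PATH of the shell running Python.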
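II. Basic workflow:
1. Design a window that lets the user pick an image file and set the related options. It looks roughly like this:
    import PySimpleGUI as sg

    # values[0]: image path; values[1]: translation on/off;
    # values[2]: selected means English -> Chinese, otherwise Chinese -> English.
    layout = [
        [sg.Text('Upload an image')],
        [sg.Input(), sg.FileBrowse('Browse')],
        [sg.Radio('Translate', 'flag'), sg.Radio('Chinese/English', 'choose')],
        [sg.OK('OK'), sg.Cancel('Cancel')],
    ]
    event, values = sg.Window('Choose an image to convert', layout).read()
GUI code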
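The window hands back an `(event, values)` pair, and it is worth checking that pair before passing the path on to the OCR step. A minimal sketch follows, assuming the path is stored under key 0 as in the layout above and that closing the window yields an event of None; `selected_image` is a hypothetical helper, and the label of the confirm button is passed in rather than hard-coded.

```python
import os


def selected_image(event, values, ok_event):
    """Return the chosen image path, or None if the form was not confirmed."""
    # Anything other than the confirm button (Cancel, or the window being
    # closed, assumed here to produce None) counts as "no selection".
    if event != ok_event or not values:
        return None
    path = values.get(0) if isinstance(values, dict) else values[0]
    # Reject an empty input box or a path that does not point at a file.
    if not path or not os.path.isfile(path):
        return None
    return path
```

Calling `selected_image(event, values, ok_event)` right after the window returns, with `ok_event` set to the text of the confirm button, gives a single place to decide whether the script should continue.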
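Branching:
    if values[1]:  # translation is enabled
        if values[2]:
            res = translate(text, 'en', 'zh')  # English -> Chinese
        else:
            res = translate(text, 'zh', 'en')  # Chinese -> English
        # Join the translated segments, one per line.
        text = ''
        for ans in res['trans_result']:
            text += ans['dst'] + '\n'
Translating between Chinese and English according to the form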
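The branch above can also be written as two small pure functions, which makes the direction logic and the response handling easy to test without a network call. This is only a sketch: `pick_direction` and `join_translation` are hypothetical names, and the `error_code` / `error_msg` fields are an assumption about how the API reports a failure.

```python
def pick_direction(en_to_zh):
    """Map the second form option to a (source, target) language pair."""
    return ("en", "zh") if en_to_zh else ("zh", "en")


def join_translation(res):
    """Join the 'dst' field of every segment in a translation response."""
    if not res or "trans_result" not in res:
        # Assumed failure shape: {'error_code': ..., 'error_msg': ...}
        code = res.get("error_code") if res else None
        msg = res.get("error_msg") if res else "no response"
        raise ValueError(f"translation failed ({code}): {msg}")
    return "\n".join(item["dst"] for item in res["trans_result"])


sample = {"from": "en", "to": "zh",
          "trans_result": [{"src": "hello", "dst": "你好"},
                           {"src": "world", "dst": "世界"}]}
print(pick_direction(True))      # ('en', 'zh')
print(join_translation(sample))
```

Unlike the loop above, `join_translation` does not leave a trailing newline after the last segment, and it raises instead of failing with a KeyError when the response carries no `trans_result`.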
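2. Image to text:
This step relies on the pytesseract library; once it is imported, a single line of code does the job.
    text = pytesseract.image_to_string(Image.open(values[0]), lang='chi_sim')
Parameters: values[0] is the path of the selected image, and lang is the language Tesseract uses for recognition ('chi_sim' is Simplified Chinese).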
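Tesseract accepts several language codes joined with `+`, which is useful when an image mixes Chinese and English. The sketch below wraps the one-line call with that in mind; `build_lang` and `ocr_image` are hypothetical helper names, and every language code passed in is assumed to have its language pack installed.

```python
def build_lang(codes):
    """Join Tesseract language codes into the form expected by `lang`."""
    codes = [c.strip() for c in codes if c and c.strip()]
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_image(path, codes=("chi_sim",)):
    """Run OCR on the image at `path` and return the text on one line."""
    # Imported lazily so build_lang can be used without the OCR stack.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        text = pytesseract.image_to_string(img, lang=build_lang(codes))
    # Drop line breaks so the text is passed on as a single line.
    return text.replace("\n", "")
```

For example, `ocr_image(values[0], ("chi_sim", "eng"))` would ask Tesseract to recognize both Simplified Chinese and English in the same image.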
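3. Calling the Baidu Translate API:
First, register on the Baidu Translate Open Platform and apply for API access.
Your appid and secret key are then listed under the developer information in the console.
Based on the official demo, I put together a function that makes the call:
    import hashlib
    import http.client
    import json
    import random
    from urllib import parse

    def translate(q, fromLang, toLang):
        # q is the text to translate, fromLang the source language, toLang the target language
        appid = 'your appid'
        secretKey = 'your secret key'
        httpClient = None
        ans = None  # stays None if the request fails
        salt = random.randint(32768, 65536)
        # The signature is the MD5 of appid + q + salt + secretKey.
        sign = hashlib.md5((appid + q + str(salt) + secretKey).encode('utf-8')).hexdigest()
        myurl = ('/api/trans/vip/translate'
                 + '?appid=' + appid + '&q=' + parse.quote(q)
                 + '&from=' + fromLang + '&to=' + toLang
                 + '&salt=' + str(salt) + '&sign=' + sign)
        try:
            httpClient = http.client.HTTPConnection('api.fanyi.baidu.com')
            httpClient.request('GET', myurl)
            response = httpClient.getresponse()
            # The response body is JSON.
            ans = json.loads(response.read().decode('utf-8'))
        except Exception as e:
            print(e)
        finally:
            if httpClient:
                httpClient.close()
        return ans
Calling the Baidu Translate API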
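The signing step is the part of the request that is easiest to get wrong, so it is shown here in isolation. This is a sketch that follows the same scheme as the function above (MD5 over appid + q + salt + secret key); `make_sign` and `build_query` are hypothetical names, and the credentials in the example call are placeholders.

```python
import hashlib
import random
from urllib import parse


def make_sign(appid, q, salt, secret_key):
    """MD5 hex digest of appid + q + salt + secret_key, encoded as UTF-8."""
    raw = f"{appid}{q}{salt}{secret_key}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def build_query(appid, secret_key, q, from_lang, to_lang, salt=None):
    """Build the request path and query string for one translation call."""
    if salt is None:
        salt = random.randint(32768, 65536)
    params = {
        "appid": appid,
        "q": q,  # percent-encoded by urlencode, including non-ASCII text
        "from": from_lang,
        "to": to_lang,
        "salt": str(salt),
        # The signature is computed over the raw text, not the encoded one.
        "sign": make_sign(appid, q, salt, secret_key),
    }
    # quote_via=parse.quote encodes spaces as %20, like parse.quote(q) does.
    return "/api/trans/vip/translate?" + parse.urlencode(params, quote_via=parse.quote)


print(build_query("your-appid", "your-secret", "hello", "en", "zh", salt=40000))
```

Passing a fixed `salt` makes the output reproducible, which is handy when comparing a locally computed signature against one produced by another tool.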
4. Copying to the clipboard:
Use the subprocess library to pipe the text into pbcopy, the macOS clipboard command.
    # pbcopy reads the data to copy from standard input.
    subprocess.run(['pbcopy'], input=text.encode('utf-8'), check=True)
Copying to the clipboard
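Since `pbcopy` only exists on macOS, the same idea can be extended to other systems by choosing the command per platform. The sketch below is an assumption-laden illustration: `clipboard_command` and `copy_to_clipboard` are hypothetical helpers, `clip` is the Windows counterpart, and `xclip` is assumed to be installed on Linux. UTF-8 is what the `pbcopy` call uses; other commands may expect a different encoding.

```python
import subprocess
import sys


def clipboard_command(platform=sys.platform):
    """Return the argv of a command that copies stdin to the clipboard."""
    if platform == "darwin":
        return ["pbcopy"]
    if platform.startswith("win"):
        return ["clip"]
    # Assumption: a Linux desktop with xclip installed.
    return ["xclip", "-selection", "clipboard"]


def copy_to_clipboard(text, platform=sys.platform):
    """Send `text` to the clipboard command for the given platform."""
    subprocess.run(clipboard_command(platform),
                   input=text.encode("utf-8"), check=True)
```

With `check=True`, a missing or failing clipboard command raises an exception instead of silently leaving the clipboard unchanged.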
III. Complete code:
    import hashlib
    import http.client
    import json
    import random
    import subprocess
    import sys
    from urllib import parse

    import PySimpleGUI as sg
    import pytesseract
    from PIL import Image


    def translate(q, fromLang, toLang):
        # q is the text to translate, fromLang the source language, toLang the target language
        appid = 'your appid'
        secretKey = 'your secret key'
        httpClient = None
        ans = None  # stays None if the request fails
        salt = random.randint(32768, 65536)
        # The signature is the MD5 of appid + q + salt + secretKey.
        sign = hashlib.md5((appid + q + str(salt) + secretKey).encode('utf-8')).hexdigest()
        myurl = ('/api/trans/vip/translate'
                 + '?appid=' + appid + '&q=' + parse.quote(q)
                 + '&from=' + fromLang + '&to=' + toLang
                 + '&salt=' + str(salt) + '&sign=' + sign)
        try:
            httpClient = http.client.HTTPConnection('api.fanyi.baidu.com')
            httpClient.request('GET', myurl)
            response = httpClient.getresponse()
            # The response body is JSON.
            ans = json.loads(response.read().decode('utf-8'))
        except Exception as e:
            print(e)
        finally:
            if httpClient:
                httpClient.close()
        return ans


    # values[0]: image path; values[1]: translation on/off;
    # values[2]: selected means English -> Chinese, otherwise Chinese -> English.
    layout = [
        [sg.Text('Upload an image')],
        [sg.Input(), sg.FileBrowse('Browse')],
        [sg.Radio('Translate', 'flag'), sg.Radio('Chinese/English', 'choose')],
        [sg.OK('OK'), sg.Cancel('Cancel')],
    ]
    event, values = sg.Window('Choose an image to convert', layout).read()
    # Cancel, a closed window, or an empty path all mean no image was chosen.
    if event != 'OK' or not values[0]:
        sys.exit('no image file selected!')

    text = pytesseract.image_to_string(Image.open(values[0]), lang='chi_sim')
    text = str(text).replace('\n', '')
    if values[1]:
        if values[2]:
            res = translate(text, 'en', 'zh')
        else:
            res = translate(text, 'zh', 'en')
        if not res or 'trans_result' not in res:
            sys.exit('translation failed: {}'.format(res))
        text = ''
        for ans in res['trans_result']:
            text += ans['dst'] + '\n'
    # pbcopy (macOS) reads the data to copy from standard input.
    subprocess.run(['pbcopy'], input=text.encode('utf-8'), check=True)
Source Code