Converting Word documents to PDF and PDF files to JPG on Ubuntu
This post uses Python to convert PDF files into JPG images and Word documents into PDF files.
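Environment setup
Language: python3
Install ImageMagick (Wand calls this tool internally for the PDF to JPG conversion)
apt-get install imagemagick
Install LibreOffice (this tool converts Word documents to PDF files)
apt-get install libreoffice
Install the Python libraries Wand and Pillow (Pillow is the maintained fork of PIL and is still imported as PIL)
pip install wand
pip install Pillow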
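Before running the conversions, it can help to confirm that everything above is actually installed. The following is a minimal sketch, not part of the original post. The helper name check_environment is hypothetical, and it assumes the ImageMagick binary is called convert (as in ImageMagick 6; ImageMagick 7 names it magick).

```python
import importlib.util
import shutil


def check_environment():
    """Report which of the required tools and libraries can be found."""
    return {
        # Command-line tools installed with apt-get.
        # "convert" is an assumption: it is the ImageMagick 6 binary name.
        "convert": shutil.which("convert") is not None,
        "libreoffice": shutil.which("libreoffice") is not None,
        # Python packages installed with pip (Pillow is imported as PIL).
        "wand": importlib.util.find_spec("wand") is not None,
        "PIL": importlib.util.find_spec("PIL") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print("%-12s %s" % (name, "ok" if found else "MISSING"))
```

Any entry reported as MISSING points at the install step that still needs to be run.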
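PDF to JPG
The PDF is converted to PNG first and then to JPG. This avoids black or transparent backgrounds, which would make the converted image look different from the PDF.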
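The background problem can be illustrated on single pixels. JPG has no alpha channel, so a transparent pixel has to be blended onto some background before saving. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and it assumes a fully transparent pixel stores black as its color, which is common but not guaranteed.

```python
def composite_over_white(pixel):
    """Blend one RGBA pixel (0-255 channels) onto an opaque white background."""
    r, g, b, a = pixel
    alpha = a / 255.0
    # Standard "over" blend: result = alpha * foreground + (1 - alpha) * white
    return tuple(round(alpha * c + (1.0 - alpha) * 255) for c in (r, g, b))


def drop_alpha(pixel):
    """A naive RGBA to RGB conversion: simply discard the alpha channel."""
    return pixel[:3]


transparent = (0, 0, 0, 0)   # fully transparent; the stored color is assumed black
half_red = (255, 0, 0, 128)  # roughly half-opaque red

print(drop_alpha(transparent))            # (0, 0, 0): rendered as black
print(composite_over_white(transparent))  # (255, 255, 255): white, like the page
print(composite_over_white(half_red))
```

Pasting the PNG onto a white canvas with its own alpha channel as the mask performs this blend for every pixel of the page.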
    import os

    from PIL import Image as Image2
    from wand.image import Image
    from wand.color import Color

    # static_host was an undefined global in the original listing; it is taken
    # here as a parameter (the base URL prefixed to each returned path).
    def convert_pdf_to_jpg(filename, static_host):
        # File name without directory or extension
        title = os.path.splitext(os.path.basename(filename))[0]

        # resolution is the rendering resolution, background is the background color
        with Image(filename=filename, resolution=150, background=Color('White')) as img:
            # Number of pages
            length = len(img.sequence)

            # If there is more than one page, the page number is appended to each file name
            with img.convert('png') as converted:
                converted.save(filename='static/local_images/%s.png' % title)

        if length == 1:
            image_list = ['static/local_images/%s.png' % title]
        else:
            image_list = ['static/local_images/%s-%d.png' % (title, i) for i in range(length)]

        jpg_list = []
        for png_path in image_list:
            # Paste the page onto an opaque white canvas, using its own alpha channel as the mask
            image = Image2.open(png_path).convert('RGBA')
            background = Image2.new('RGBA', image.size, (255, 255, 255, 255))
            background.paste(image, (0, 0), image)

            name = os.path.splitext(png_path)[0] + '.jpg'
            background.convert('RGB').save(name)
            os.remove(png_path)
            jpg_list.append('%s/%s' % (static_host, name))

        return jpg_list
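As a worked example of the naming scheme used above, the following sketch predicts the list that the function returns for a given PDF name and page count, without touching any files. The helper name expected_jpg_urls and the host http://example.com are hypothetical and used for illustration only.

```python
import os


def expected_jpg_urls(pdf_path, page_count, static_host, out_dir="static/local_images"):
    """Predict the URLs returned for a PDF with the given number of pages."""
    title = os.path.splitext(os.path.basename(pdf_path))[0]
    if page_count == 1:
        names = [title]
    else:
        # Multi-page output is numbered from 0: title-0, title-1, ...
        names = ["%s-%d" % (title, i) for i in range(page_count)]
    return ["%s/%s/%s.jpg" % (static_host, out_dir, n) for n in names]


# http://example.com is a placeholder host, not taken from the original post.
print(expected_jpg_urls("/tmp/report.pdf", 1, "http://example.com"))
print(expected_jpg_urls("/tmp/report.pdf", 3, "http://example.com"))
```

A single-page PDF yields one name without a suffix, while a multi-page PDF yields one numbered name per page.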
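Word document to PDF
Python has no built-in way to convert a Word document directly to PDF. The workaround is to install LibreOffice first and then invoke it through a system call from the os library.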
    import os
    import shlex

    def convert_doc_to_pdf(filename):
        # Quote the path so that spaces and shell metacharacters cannot break the command
        cmd = 'libreoffice --convert-to pdf %s' % shlex.quote(filename)
        os.system(cmd)
        # LibreOffice writes the PDF into the current working directory
        return os.path.splitext(os.path.basename(filename))[0] + '.pdf'
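An alternative to building a shell string is to pass LibreOffice an argument list through subprocess, which avoids quoting problems and lets the output directory be chosen explicitly. This is a sketch rather than the method used in this post: the function names, the default timeout, and the existence check are assumptions, while --headless, --convert-to, and --outdir are standard LibreOffice options.

```python
import os
import subprocess


def build_convert_command(filename, outdir):
    """Build the LibreOffice command line as an argument list (no shell)."""
    return ["libreoffice", "--headless", "--convert-to", "pdf",
            "--outdir", outdir, filename]


def convert_with_subprocess(filename, outdir=".", timeout=120):
    """Convert a Word document to PDF and return the path of the PDF."""
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_convert_command(filename, outdir),
                   check=True, timeout=timeout)
    pdf_name = os.path.splitext(os.path.basename(filename))[0] + ".pdf"
    pdf_path = os.path.join(outdir, pdf_name)
    # Do not rely on the exit status alone; confirm the file was written.
    if not os.path.exists(pdf_path):
        raise RuntimeError("LibreOffice did not produce %s" % pdf_path)
    return pdf_path
```

For example, convert_with_subprocess("docs/my report.docx", outdir="/tmp/out") would be expected to return /tmp/out/my report.pdf, with the space in the file name handled without any manual quoting.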