Using Python to convert PDF files to JPG images and Word documents to PDF files

Environment setup

  Language: Python 3

  Install ImageMagick (the PDF-to-JPG conversion calls this tool internally)

    apt-get install imagemagick

  Install LibreOffice (used to convert Word documents to PDF files)

    apt-get install libreoffice

  Install the Python libraries Wand and PIL (PIL is installed from the package named Pillow)

    pip install wand

    pip install Pillow
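
  Before running the conversions, it can help to confirm that the tools above are actually reachable. The following is a minimal sketch and not part of the original setup: the function name `check_environment` is made up for illustration, and it assumes the command-line tools are exposed as `convert` (ImageMagick 6; ImageMagick 7 uses `magick`) and `libreoffice` (some installations only provide `soffice`).

```python
import importlib.util
import shutil


def check_environment(commands=('convert', 'libreoffice'), modules=('wand', 'PIL')):
    """Report which external commands and Python modules can be found.

    The default names are assumptions about a typical Debian/Ubuntu install.
    """
    status = {}
    for command in commands:
        # shutil.which returns None when the command is not on PATH
        status[command] = shutil.which(command) is not None
    for module in modules:
        # find_spec returns None when a top-level module is not installed
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == '__main__':
    for name, available in check_environment().items():
        print('%-12s %s' % (name, 'ok' if available else 'MISSING'))
```

  Any entry reported as MISSING points to the install step that still needs to be done, or to a tool installed under a different name.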

  

PDF to JPG

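The PDF is first converted to PNG and only then to JPG. Going through PNG makes it possible to flatten each page onto a white background, which avoids black or transparent backgrounds that would make the resulting images look different from the PDF.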

import os

from PIL import Image as PILImage
from wand.color import Color
from wand.image import Image

# Assumption: the original code uses static_host without defining it.
# Set it to the URL prefix under which the generated images are served.
static_host = ''


def convert_pdf_to_jpg(filename):
    # Base name of the PDF, without directory and extension
    title = os.path.splitext(os.path.basename(filename))[0]

    # resolution is the rendering resolution, background is the background color
    with Image(filename=filename, resolution=150, background=Color('white')) as img:
        # Number of pages
        length = len(img.sequence)

        # If there is more than one page, the page index is appended to each
        # generated file name. The directory static/local_images must exist.
        with img.convert('png') as converted:
            converted.save(filename='static/local_images/%s.png' % title)

    if length == 1:
        image_list = ['static/local_images/%s.png' % title]
    else:
        image_list = ['static/local_images/%s-%d.png' % (title, i) for i in range(length)]

    jpg_list = []
    for png_path in image_list:
        # Convert to RGBA so the image can always serve as its own paste mask
        image = PILImage.open(png_path).convert('RGBA')
        # Flatten onto an opaque white background to get rid of transparency
        background = PILImage.new('RGBA', image.size, (255, 255, 255, 255))
        background.paste(image, (0, 0), image)

        # Save as JPG next to the PNG, then delete the intermediate PNG
        name = os.path.splitext(png_path)[0] + '.jpg'
        background.convert('RGB').save(name)
        os.remove(png_path)
        jpg_list.append('%s/%s' % (static_host, name))

    return jpg_list
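
The reason for flattening onto a white background can be seen with a little arithmetic. JPG has no alpha channel, and a fully transparent pixel often stores black as its color, so simply dropping the alpha channel can turn transparent areas black. The sketch below is a pure-Python illustration that approximates the per-pixel blend performed when an image is pasted with itself as the mask; `flatten_pixel` is a hypothetical helper written for this explanation, not a Pillow function.

```python
def flatten_pixel(rgba, background=(255, 255, 255)):
    """Blend one RGBA pixel onto an opaque background color.

    Each channel becomes alpha * foreground + (1 - alpha) * background,
    with alpha scaled from 0..255 to 0..1.
    """
    r, g, b, a = rgba
    alpha = a / 255
    return tuple(
        round(alpha * fg + (1 - alpha) * bg)
        for fg, bg in zip((r, g, b), background)
    )


transparent = (0, 0, 0, 0)        # fully transparent, stored color is black
opaque_red = (255, 0, 0, 255)     # fully opaque red

# Dropping the alpha channel keeps the stored color, so the pixel turns black
print(transparent[:3])                # (0, 0, 0)
# Blending onto white first gives the white page background instead
print(flatten_pixel(transparent))     # (255, 255, 255)
print(flatten_pixel(opaque_red))      # (255, 0, 0)
```

Opaque pixels pass through unchanged, while transparent ones take on the background color, which is why the pages end up looking like the PDF.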

 

Word document to PDF

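Python has no standard-library support for converting a Word document to PDF directly, so the approach here is to install LibreOffice first and then invoke it from Python as an external command.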

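import os
import subprocess


def convert_doc_to_pdf(filename):
    # LibreOffice writes the PDF into the current working directory.
    # Passing the arguments as a list (no shell) keeps file names with spaces intact.
    subprocess.run(['libreoffice', '--headless', '--convert-to', 'pdf', filename], check=True)
    # The generated file has the same base name with a .pdf extension
    return os.path.splitext(os.path.basename(filename))[0] + '.pdf'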
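
If the PDF should land in a specific directory rather than the current working directory, LibreOffice's `--outdir` option can be used. The following is a minimal sketch under stated assumptions: the helper names `build_convert_command` and `convert_doc_to_pdf_in`, the `outdir` parameter, and the 120-second timeout are illustrative choices, and the executable is assumed to be on the PATH as `libreoffice`.

```python
import os
import subprocess


def build_convert_command(filename, outdir):
    """Return the LibreOffice command line as an argument list."""
    return [
        'libreoffice',
        '--headless',            # run without opening a window
        '--convert-to', 'pdf',
        '--outdir', outdir,      # directory that receives the PDF
        filename,
    ]


def convert_doc_to_pdf_in(filename, outdir='.', timeout=120):
    """Convert a Word document and return the path of the resulting PDF."""
    # An argument list (no shell) keeps file names with spaces intact
    subprocess.run(build_convert_command(filename, outdir), check=True, timeout=timeout)
    base = os.path.splitext(os.path.basename(filename))[0]
    return os.path.join(outdir, base + '.pdf')
```

Keeping the command construction in its own function makes it easy to inspect or log the exact command before anything is executed.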

 

  

 

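Copyright notice: This is an original article by cityking, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/cityking/p/pdf_word_to_jpg.html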