export pdf pages as bitmap files

setup

routinely, i store figures as vectors as separate pdf files. but sometimes figures are required as separate bitmap files such as jpeg, png, or tiff. acrobat pro's one way to go, but unfortunately, my work laptop doesn't have it.

solution

the plan

  1. place all pdf files to export under one folder and batch process with a for loop
  2. export each figure, specifically:
    • read pdf,
    • get page 1, since each pdf has only 1 page,
    • export with a zoom factor of 4 - that should be large enough.

programming language & module(s)

file preps

  • the source pdf(s)

variables to customize

  • folder, for pdf file path,
  • image_ext, for bitmap file extension, e.g. 'jpg', 'jpeg'. note that 'tif(f)' is unfortunately not supported.
  • zoom_in_factor, controls the size of the exported images.

the script

pdf_to_image.py
#! /usr/bin/python
def look_cool(slogan, font = 'slant'):
import pyfiglet
result = pyfiglet.figlet_format(slogan, font = font)
print('\n' + result)
def pdf2img(folder, pdf_name, zoom_in_factor, image_ext):
import os
import fitz
pdf_path = os.path.join(folder, pdf_name)
print('\t%s' % pdf_path)
pdf = fitz.open(pdf_path)
# create a transformation matrix
mat = fitz.Matrix(zoom_in_factor, zoom_in_factor)
# render the 1st page to a pixmap
pix = pdf[0].get_pixmap(matrix = mat)
# save the pixmap as an image
image_name = pdf_name.replace('.pdf', image_ext)
pix.save(os.path.join(folder, image_name), jpg_quality = 100)
pdf.close()
import os
folder = r'C:\Users\xiao\Desktop\pdf_to_export'
# options: '.png', '.jpg', '.jpeg', '.psd', '.ps'
image_ext = '.jpeg'
# tweak the number to get bitmaps of different zoom-in scales.
zoom_in_factor = 4.0
print('\nexporting the following pdfs as %s:' % image_ext)
pdfs = [item for item in os.listdir(folder) if item.endswith('.pdf')]
for pdf_name in pdfs:
pdf2img(folder, pdf_name, zoom_in_factor, image_ext)
look_cool('All done !')

output

the exported images can be found in the same folder as the pdf docs.

note to self

  • this script assumes there's only 1 page in each pdf. tweak it in case more pages are to be exported.
Some rights reserved
Except where otherwise noted, content on this page is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.