Update: poppler now includes pdftocairo, which does this. No need to do this anymore! Blog post here for reference.
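For anyone who wants to drive pdftocairo from Python anyway, here is a minimal sketch. The function names, the default of 150 DPI and the file names in the usage note are my own assumptions for illustration; the only pdftocairo options used are `-png` (PNG output) and `-r` (resolution in DPI).

```python
import subprocess


def build_pdftocairo_cmd(pdf_path, output_prefix, dpi=150):
    """Return the argument list for rendering every page of a PDF to PNG.

    The helper name and the 150 DPI default are illustrative assumptions.
    """
    # -png selects PNG output, -r sets the resolution in DPI.
    # pdftocairo appends the page number and ".png" to output_prefix.
    return ["pdftocairo", "-png", "-r", str(dpi), pdf_path, output_prefix]


def pdf_to_png(pdf_path, output_prefix, dpi=150):
    """Run pdftocairo; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_pdftocairo_cmd(pdf_path, output_prefix, dpi))
```

Calling `pdf_to_png("slides.pdf", "slides", dpi=300)` should leave one numbered PNG per page next to the PDF, assuming pdftocairo is on your PATH.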
I’ve been using convert from ImageMagick to convert PDF files to PNG files. However, the results are butt ugly, or rather fugly. So I created a pdf2png script (a Python program) to do it better.
Just look at the text here from convert. Rather ugly. Look at the kerning. It’s truly horrible.
That’s not to say poppler doesn’t have its share of problems, but it looks much better, don’t you agree?
Since I had to manually edit a presentation, I ended up spending some time writing a PDF-to-PNG converter, because I couldn’t find an existing pdf2png.
So without further ado, here is the script:
    #!/usr/bin/env python
    # Python 2 script; needs the python-poppler and pycairo bindings.
    import poppler
    import cairo
    import gtk
    import urllib
    import sys, os

    width = height = 0

    # Accept either just a filename, or a filename plus width and height.
    if len(sys.argv) != 2 and len(sys.argv) != 4:
        print("Usage: %s <filename> [width height]" % sys.argv[0])
        sys.exit()

    if len(sys.argv) == 4:
        width = sys.argv[2]
        height = sys.argv[3]

    input_filename = os.path.abspath(sys.argv[1])
    # One PNG per page: <basename>-00.png, <basename>-01.png, ...
    output_filename = os.path.splitext(os.path.basename(sys.argv[1]))[0] + '-%.2d.png'

    # poppler wants a file:// URI rather than a plain path.
    doc = poppler.document_new_from_file('file://%s' % \
            urllib.pathname2url(input_filename), password=None)

    for i in xrange(doc.get_n_pages()):
        page = doc.get_page(i)
        if width and height:
            # Fixed-size surface; the page itself is not scaled to fit it.
            surface = cairo.ImageSurface(cairo.FORMAT_ARGB32, int(width), int(height))
            ctx = cairo.Context(surface)
        else:
            # Default: render at twice the page size.
            surface = cairo.ImageSurface(cairo.FORMAT_ARGB32,
                                         int(page.get_size()[0] * 2),
                                         int(page.get_size()[1] * 2))
            ctx = cairo.Context(surface)
            ctx.scale(2, 2)

        page.render(ctx)
        # Paint white behind the rendered page so the PNG is not transparent.
        ctx.set_operator(cairo.OPERATOR_DEST_OVER)
        ctx.set_source_rgb(1, 1, 1)
        ctx.paint()
        surface.write_to_png(output_filename % i)
It’s very far from perfect.
Note the hard-coded scale factor and the way the height and width are handled; these are all things that could be fixed. I didn’t find any python-poppler documentation, so I used the C++ docs instead, and they were helpful enough.
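One possible fix for the hard-coded sizes, as a rough sketch: compute a single scale factor from the page size and the requested size, so the page fits the target without being distorted. The helper names `fit_scale` and `scaled_size` are made up for this example and are not part of poppler or cairo.

```python
def fit_scale(page_width, page_height, target_width, target_height):
    """Largest uniform scale at which the page still fits inside the target."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    # Take the smaller ratio so neither dimension overflows the target.
    return min(float(target_width) / page_width,
               float(target_height) / page_height)


def scaled_size(page_width, page_height, scale):
    """Pixel size of the surface needed for the page at the given scale."""
    return int(round(page_width * scale)), int(round(page_height * scale))
```

In the script, the idea would be to create the surface with `scaled_size(...)` and call `ctx.scale(s, s)` with the value from `fit_scale(...)` before `page.render(ctx)`, instead of always scaling by 2.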
If you make any improvements (thanks Raimund) or just use it, it’d make me happy if you told me in a comment. :-)