Saturday, November 18, 2023

Comparing PDFs

I'm working on final corrections for my book Modes of Presentation, which I'm again typesetting myself via LyX and LaTeX (as I did Frege's Theorem and Reading Frege's Grundgesetze). I'm paranoid about something weird creeping into the book and have been comparing the new and old pages as I go. I figured there had to be a better way to do that than flipping back and forth between the two PDFs, all the more so since the flipping feels like one of those change-blindness experiments.

Well, Linux and the command line to the rescue. The ImageMagick suite contains a 'compare' command that takes two images and produces a new one that shows the differences between them. So all I have to do is explode the PDF into a bunch of page images and run the compare command on them. Here's a generic script to do it:

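#!/bin/bash

# Uncomment to test with just the first 10 pages
#D2="-l 10";
# Image resolution (DPI)
RES=100;

# Render each page of both PDFs as a grayscale TIFF: New-*.tif and Old-*.tif.
# $D2 is deliberately unquoted so that "-l 10" splits into two arguments.
pdftocairo $D2 -gray -tiff -r "$RES" NEWPDF.pdf New;
pdftocairo $D2 -gray -tiff -r "$RES" OLDPDF.pdf Old;
for NEW in New*tif; do
    # Strip the "New" prefix to get the page suffix, e.g. "-12.tif"
    BASE=${NEW#New};
    OLD=Old$BASE;
    # Skip pages with no counterpart (e.g. if the page count changed)
    [ -e "$OLD" ] || continue;
    # Write an image highlighting the differences to Comp-*.tif
    compare "$NEW" "$OLD" "Comp$BASE";
done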

Here's an example of what you get:

It's not readable, but you can easily see where the changes have been made and, if need be, check the actual page. Mostly, I want to make sure that nothing dramatic has changed with the page breaks, etc., anyway.
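
If you would rather get a list of the pages that changed than look through every comparison image, something like the following might do. It is only a sketch, and it assumes the New*.tif and Old*.tif page images from the script above are in the current directory. The function name changed_pages is made up for illustration. It relies on the exit status of compare, which, as far as I know, is 0 when the images match, 1 when they differ, and 2 on error; 'null:' just tells ImageMagick to discard the difference image.

```shell
#!/bin/bash

# List the New*.tif pages that differ from their Old*.tif counterparts.
# (changed_pages is a hypothetical helper, not part of ImageMagick.)
changed_pages() {
    local new old status
    for new in New*tif; do
        # If the glob matched nothing, the literal pattern is left; skip it
        [ -e "$new" ] || continue
        old="Old${new#New}"
        if [ ! -e "$old" ]; then
            echo "$new: no matching old page"
            continue
        fi
        # -metric AE counts differing pixels; we only use the exit status
        if compare -metric AE "$new" "$old" null: >/dev/null 2>&1; then
            continue                      # pages are identical
        else
            status=$?
        fi
        if [ "$status" -eq 1 ]; then
            echo "$new differs"
        else
            echo "$new: compare failed (exit status $status)"
        fi
    done
}

changed_pages
```

Pages that are pixel-for-pixel identical produce no output, so an empty result means nothing changed at the chosen resolution.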

Of course, you can also do this on other operating systems, but they do not encourage you, as Linux does, to use the command line.

