Tuesday, March 12, 2013

Creating PDFs for EBrary Reader

Turns out that Brown has a subscription to at least part of the Ebrary archive of online books. I found this out because I was looking for a book in our library, using the online catalog, and it turned out that said book was available online. Very cool!

Well, kind of cool, until I found out what this involved. By default, you basically have to read the document online. You can download the "EBrary Reader", which is a Java application, and read documents using it. But it's kind of clunky, to say the least. What I wanted was a PDF that I could then use as I wanted. How to get one?

I noticed that the EBrary Reader would allow me to print, so I thought maybe I could print the file as a PDF, since the default print dialog for Fedora lets you do that. Unfortunately, the Java application was not using the system dialog, but a Java dialog, so that didn't work.

A little googling led me to the cups-pdf package, which installs a system-wide PDF printer for the Common Unix Printing System (CUPS). A quick "sudo yum install cups-pdf" was enough to give me access to that, and printing to the new printer from the Java dialog gave me my PDF.

The next step was to convert this file to DjVu, which tends to be much smaller than the corresponding PDF. I've done this a million times before, so I figured it would be pretty easy. Unfortunately, it was not.

The first step was to run the pdfimages command (from the poppler-utils package):
pdfimages -p file.pdf p
to extract the page images. Imagine my surprise when I got 420 images from a 20 page paper! It turned out that each page was constructed from 21 different images, stacked on top of each other. (To help with download times?)
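
A quick way to confirm that kind of split is to count the extracted files per page. The sketch below assumes the files are named PREFIX-PPP-NNN.ppm (page number, then image number), which is what the -p option produces; the function name is made up for illustration.

```shell
# Count extracted images per page.
# Assumes names of the form PREFIX-PPP-NNN.ppm, as written by `pdfimages -p`.
count_images_per_page() {
    prefix=$1
    for f in "$prefix"-*-*.ppm; do
        [ -e "$f" ] || continue      # no matches: skip the literal glob
        page=${f#"$prefix"-}         # strip the leading "PREFIX-"
        page=${page%%-*}             # keep only the page number
        printf '%s\n' "$page"
    done | sort | uniq -c | awk '{print $2, $1}'
}
```

Running "count_images_per_page p" in the directory with the images prints one line per page: the page number followed by the number of images found for it.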

Fortunately, I've had enough experience with ImageMagick to know this was a problem that could be solved. It took a little more googling, and a little experimentation, but eventually I found out that:
for i in $(seq -w 1 20); do
  montage p-0${i}*.ppm -geometry +0+0 -background none -tile 1x21 page-$i.tiff
done
would stack all the images back on top of each other.
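
The loop above hard-codes twenty pages and twenty-one strips per page. A more general version might look like the sketch below, which works out both from the file names. It assumes the PREFIX-PPP-NNN.ppm naming that pdfimages -p uses. The function name and the MONTAGE variable are my own inventions; the variable is only there so the commands can be previewed with MONTAGE=echo before anything is written.

```shell
# Stitch the strips of each page into one TIFF, discovering pages from file names.
# Assumes PREFIX-PPP-NNN.ppm names; set MONTAGE=echo for a dry run.
stitch_pages() {
    prefix=$1
    montage_cmd=${MONTAGE:-montage}
    # Collect the distinct page numbers present in the directory.
    pages=$(for f in "$prefix"-*-*.ppm; do
                [ -e "$f" ] || continue
                p=${f#"$prefix"-}
                printf '%s\n' "${p%%-*}"
            done | sort -u)
    for page in $pages; do
        # The strips of this page, in sorted glob order; $# is their count.
        set -- "$prefix-$page"-*.ppm
        "$montage_cmd" "$@" -geometry +0+0 -background none -tile "1x$#" "page-$page.tiff"
    done
}
```

Called as "stitch_pages p", this writes page-001.tiff, page-002.tiff, and so on, so the output names carry three-digit page numbers rather than the two-digit ones used above.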

So now I had 20 page images, all as TIFFs, and those could then be fed to ScanTailor for processing on the way to creating a DjVu.

Friday, March 8, 2013

I Love Perl

I posted about how to write simple Perl filters before, but I have to say that I just love doing this:
perl -i.bak -pe 's/plaintext\(odocstream &(.*?), OutputParams const &(.*?)\)/plaintext(odocstringstream &$1, OutputParams const &$2, int max_length)/' *.h *.cpp
How much time did that save over doing it manually?

Thursday, March 7, 2013

The Grand Teaching Experiment (7)

I haven't posted for a while about the teaching experiment, because it was basically on hiatus. For the last few weeks, we have been doing technical material, and it does seem worth lecturing about that. Now, though, we are back to more philosophical material, so we are back to having discussions.

Up this week were Field's paper "Tarski's Theory of Truth" and Etchemendy's paper "Tarski on Truth and Logical Consequence". Both of them are pretty clear, though the dialectical structure of Etchemendy's paper is complicated (as I argue in my own paper on the topic).

The discussions seemed to go pretty well, perhaps better on Etchemendy than on Field, and perhaps that is because I have a deeper understanding of that paper myself. As previously, the students' written responses to the papers were very good. Everyone seemed to have a solid understanding of the basic outlines of the arguments, with some students (unsurprisingly) a little ahead of others. But what I'm really coming to appreciate about this way of doing things is that, as we start class, I already have a pretty good idea what people understand and what they do not, and we can focus our attention either on filling in the gaps or else, even better, digging more deeply. And because everything is based on discussion, we dig in the direction the students find interesting, or where their questions naturally lead.

It's definitely clear, as I mentioned in a previous post, that the sorts of detailed reading notes I've been giving the students recently are important to this kind of approach. I've had more than one student remark on this.