Tofu 2.0 alpha
Version 2.0 is an update to Tofu that is currently in development. It adds PDF support, and other features and improvements will follow over the next few weeks and months.
I am using the PDF Kit framework, which was introduced in Mac OS X 10.4 Tiger, as it allows me to easily get a formatted string of text from any PDF file.
One limitation, however, is that it can't distinguish between line wraps (which occur at the end of each line) and real paragraph breaks. This is because PDF files don't really store continuous text, but rather the position of each character on the page.
My way around this is to guess when a paragraph break occurs, for example when there are two successive line breaks, or when the font size changes after a line break. This approach should work okay for simple PDF files, but will not be reliable for more complex ones. Let me know how well this works for you.
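To make the heuristic above concrete, here is a minimal sketch in Python. It is an illustration only, not Tofu's actual code: it assumes the extracted text has already been split into visual lines, each paired with its font size, and the names `Line` and `guess_paragraphs` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed input shape: (text of one visual line, font size in points).
# An empty text stands for a blank line, i.e. two successive line breaks.
Line = Tuple[str, float]


def guess_paragraphs(lines: List[Line]) -> List[str]:
    """Join wrapped lines into paragraphs using two guesses:
    a blank line, or a font size change after a line break."""
    paragraphs: List[str] = []
    current: List[str] = []
    previous_size: Optional[float] = None

    def flush() -> None:
        # Close the paragraph being built, if there is one.
        if current:
            paragraphs.append(" ".join(current))
            current.clear()

    for text, size in lines:
        stripped = text.strip()
        if not stripped:
            # Blank line: treat as a real paragraph break.
            flush()
            previous_size = None
            continue
        if previous_size is not None and abs(size - previous_size) > 0.01:
            # Font size changed after a line break: start a new paragraph.
            flush()
        current.append(stripped)
        previous_size = size

    flush()
    return paragraphs


if __name__ == "__main__":
    sample = [
        ("Tofu 2.0", 18.0),
        ("This is a long", 12.0),
        ("wrapped line.", 12.0),
        ("", 12.0),
        ("Second paragraph.", 12.0),
    ]
    for paragraph in guess_paragraphs(sample):
        print(paragraph)
```

As in the description above, a sketch like this handles simple layouts only. Multi-column pages, or a line that ends a paragraph without a blank line or font change after it, would still be joined incorrectly.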
Please send any feedback to contact@amarsagoo.info.
Tofu 2.0a2 supports Intel Macs and features improved full-screen viewing and minor usability enhancements.
Released 2006-03-28
License: Free
Requires Mac OS X 10.4 or later