You can download this software right after you purchase it.
This product is a Windows COM component.
When you buy this product, you'll receive one developer license and five runtime licenses. That's $475.00 for the developer license and $18.00 for each runtime license: a total of $565.00. This is the minimum required purchase for this product. If you need to purchase more than five runtime licenses, or if you already have a developer license and need to purchase additional runtime licenses, please contact us.
XpdfText is a very affordable programmer's toolkit that makes it easy to extract plain text from PDF files. The PDF file can be on disk or in memory, and likewise, the text can be extracted to memory or directly to disk.
XpdfText can be used in different ways:
- Convert entire PDF files or individual pages to plain text
  - maintaining layout, or
  - converting to "reading order"
- Extract text from a specified rectangle on a page
  - useful for extracting text from forms
- Convert pages into word lists – for each word, you can retrieve:
  - font name and font size
  - text color
  - word position on the page
  - character offset (for highlight files)
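To make the word-list idea concrete, here is a minimal Python sketch of what a per-word record could look like and how such records might be filtered by a rectangle, as in form-field extraction. This is not the XpdfText API: the `Word` class, its field names, the `words_in_rect` helper, and the sample coordinates are all hypothetical and only mirror the attributes listed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    """Hypothetical per-word record; field names are illustrative only."""
    text: str
    font_name: str
    font_size: float
    color: Tuple[float, float, float]  # assumed RGB components in 0..1
    x_min: float                       # bounding box on the page
    y_min: float
    x_max: float
    y_max: float
    char_offset: int                   # offset of the word in the extracted text


def words_in_rect(words: List[Word], x0: float, y0: float,
                  x1: float, y1: float) -> List[Word]:
    """Return the words whose bounding box lies entirely inside the rectangle."""
    return [
        w for w in words
        if w.x_min >= x0 and w.x_max <= x1 and w.y_min >= y0 and w.y_max <= y1
    ]


# Made-up sample data standing in for one page's word list.
words = [
    Word("Name:", "Helvetica", 10.0, (0.0, 0.0, 0.0), 72, 700, 100, 712, 0),
    Word("Alice", "Courier", 10.0, (0.0, 0.0, 1.0), 110, 700, 140, 712, 6),
    Word("Footer", "Helvetica", 8.0, (0.0, 0.0, 0.0), 72, 30, 100, 38, 12),
]

# Pick out the words that fall inside an assumed form-field rectangle.
field = words_in_rect(words, 100, 690, 300, 720)
print(" ".join(w.text for w in field))
```

With the sample data above, only the word inside the field rectangle is kept; the label to its left and the footer are filtered out.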
The extracted text can be converted to a wide choice of standard encodings, including UTF-8 Unicode, ISO-8859-1 (Latin-1), 7-bit ASCII, and various other language-specific encodings.
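The choice of output encoding matters because not every character exists in every encoding. The short Python sketch below uses only Python's built-in codecs, not XpdfText, to show how one sample string comes out in three of the encodings named above; the sample text and the use of `?` as a replacement character are assumptions of this illustration, and the toolkit's own handling of unmappable characters may differ.

```python
# Sample text with one character outside ASCII (é) and one outside Latin-1 (€).
text = "Café: 5 €"

# UTF-8 can represent every Unicode character, using 1 to 4 bytes each.
utf8 = text.encode("utf-8")

# ISO-8859-1 (Latin-1) has é but no euro sign; unmappable characters become "?".
latin1 = text.encode("iso-8859-1", errors="replace")

# 7-bit ASCII has neither, so both characters are replaced.
ascii7 = text.encode("ascii", errors="replace")

print(len(text), len(utf8))  # characters vs. UTF-8 bytes
print(latin1)
print(ascii7)
```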
The XpdfText toolkit also includes all the functionality of the XpdfInfo toolkit.
XpdfText is easy to use:
' create the toolkit object (a PDF must be loaded into it before the conversion calls below)
pdf = new XpdfText.XpdfText
' convert to a text file on disk...
pdf.convertToTextFile(1, 5, "output.txt")
' ... or convert in memory
s = pdf.convertToTextString(1, 5)
If you need to convert to XML instead of plain text, consider the PDFdeconstruct product.