Standards and proprietary extensions
All converters comply with the established standard to some degree. However, the showcases demonstrate that their rendering behaviour varies, partly because of bugs and partly because the standard is ambiguous or leaves room for interpretation. In addition to implementing the standard, each converter ships its own extensions that provide functionality that is not (yet) part of the standard. This makes it harder to choose the right tool. For more complex layout requirements, you need to know the tools in detail and check whether a particular requirement is supported by a particular tool. Often you will need a workaround to implement a particular feature (for example, sidenotes are only supported by Antennahouse, as a proprietary extension; the same applies to placing footnotes in the same column as the reference in a multi-column layout). In addition, none of the converters implements the CSS Paged Media standard completely, and the vendor-specific extensions make interoperability difficult. We cannot expect different converters to produce identical results from identical content and styles.
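As a point of reference, the standards-based part of a print stylesheet might look like the sketch below: page size and margins, a page counter in a margin box, and a footnote declared as described in the CSS Generated Content for Paged Media draft. The class name `footnote` is a hypothetical choice for illustration, and how closely each converter follows the draft for footnotes is exactly the kind of detail that has to be checked per tool.

```css
/* Page geometry and a running page number (CSS Paged Media) */
@page {
  size: A4;
  margin: 25mm 20mm;

  @bottom-center {
    content: counter(page);
  }
}

/* Footnote as described in the CSS GCPM draft.
   "footnote" is a hypothetical class name; support differs per converter. */
span.footnote {
  float: footnote;
}
```

Anything beyond this, such as sidenotes or per-column footnotes, falls into vendor-specific territory and should be kept in a separate, clearly marked part of the stylesheet.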
All converters work with standard image formats (GIF, PNG, JPG) and vector graphics (SVG). RGB and CMYK colour spaces are supported. The main problem with graphics is the automatic positioning and resizing of images. CSS provides limited options for controlling image placement and positioning, especially in edge cases where an automatic size reduction of images could result in a better layout. Tools like PDFreactor provide limited access to the renderer internals for implementing an adaptive image layout.
MathML is supported by PrinceXML, PDFreactor and Antennahouse. PDFreactor and PrinceXML have various issues with MathML. The best MathML support is provided by Antennahouse, though it also has various rendering issues.
PDF forms are widely available and used. PDFreactor is the only tool that can generate PDF files with form support.
Support for line grids, or grids in general, is an upcoming feature. A W3C draft, CSS Line Grid Module Level 1, is in the making. Currently, line grids are supported only in PDFreactor Version 9, through the vendor-specific properties -ro-line-snap and -ro-line-grid, and in Antennahouse through its own extension.
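The following sketch shows how a baseline grid could be declared. The first half uses the property names and values from the W3C draft; the second half uses the PDFreactor property names mentioned above. That the prefixed properties accept the same values as the draft is an assumption here and should be checked against the PDFreactor manual for your version.

```css
/* W3C draft syntax (CSS Line Grid Module Level 1) */
body {
  line-grid: create;    /* establish a line grid from the body's line height */
}
p {
  line-snap: baseline;  /* snap paragraph baselines to that grid */
}

/* PDFreactor vendor-prefixed variants.
   Assumption: the values mirror those of the draft. */
body {
  -ro-line-grid: create;
}
p {
  -ro-line-snap: baseline;
}
```

Declaring both forms side by side is harmless, since a converter ignores properties it does not recognise.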
Multimedia (video and audio)
While PDF allows the embedding of multimedia content such as video and audio, the overall value of this is questionable. Antennahouse is the only tool that supports multimedia content in PDF files. The only PDF reader on the market with multimedia support seems to be Acrobat (Reader). The standard PDF viewer on Mac OS X ("Preview") does not support multimedia PDF files. As such, the toolchain for generating multimedia PDFs is limited, and the options on the consumer side are even more limited.
Further PDF features
Advanced features like
- Digital signatures
- Tagged PDFs
- Accessible PDFs
- Archive PDFs (PDF/A)
are best supported by Antennahouse, PrinceXML and PDFreactor.
XML vs. HTML
Missing features and major pain points
Shapes and exclusions
Although there is a W3C CSS draft for shapes and exclusions, none of the converters supports it so far.
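For orientation, this is the kind of rule the shapes draft describes. The selector and dimensions are hypothetical. A converter without shapes support will presumably ignore `shape-outside` and wrap the text around the rectangular float box instead.

```css
/* Hypothetical example: text should follow the circular outline of a portrait */
img.portrait {
  float: left;
  width: 40mm;
  height: 40mm;
  shape-outside: circle(50%);  /* from the CSS Shapes draft */
}
```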
Better and more flexible support for floats
All converters support the standard float: left and float: right properties (in particular for images combined with text). Vendor-specific extensions have been implemented, most notably by Antennahouse.
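The portable baseline referred to here can be sketched as follows. The class names and margins are hypothetical; only `float: left`, `float: right` and `clear` are assumed, which are plain CSS and do not rely on any vendor extension.

```css
/* Image on the left, text flows around it on the right */
img.figure-left {
  float: left;
  margin: 0 5mm 3mm 0;
}

/* Image on the right, text flows around it on the left */
img.figure-right {
  float: right;
  margin: 0 0 3mm 5mm;
}

/* Start headings below any preceding float instead of beside it */
h2 {
  clear: both;
}
```

More advanced placements, for example floating a figure to the top or bottom of a page or column, are where the vendor-specific extensions come into play.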
Support for influencing the rendering process
I used the Oxygen user guide for a quick benchmark of the tools. I converted the user guide to a single HTML file (20 MB) using the DITA OT and then ran that file through all four converters (4 CPU box, 2.4 GHz, 8 GB RAM). The resulting PDF files were between 2200 and 2400 pages long.
|150 secs||24 secs||220 secs|
Which tool should I choose?
In my experience, the general rule is: you get what you pay for. The open-source solution WeasyPrint will work for standard jobs without fancy layout requirements. PDFreactor and PrinceXML both worked for us in enterprise projects. Our current preference is PDFreactor because of its better documentation and lower price compared to PrinceXML. Antennahouse is more expensive (you pay for each CPU and each extension), but it provides several extensions (e.g. better float support) that you might need in your projects. As such, it is not possible to issue a one-size-fits-all recommendation. The choice of tool depends on your requirements and budget. (ZOPYX offers vendor-neutral consulting on CSS Paged Media issues.)