Breaking it Down: the ePub 3 Spec
By Eric Freese, Solutions Architect, Aptara and member of the EPUB3 Working Group
EPUB is widely accepted as the de facto digital format standard for eBooks, with its signature reflowable text that can be read on the greatest variety of reading systems (including the iPad, nook/nookColor, Kobo, and Sony readers, to name a few).
The much-anticipated revised edition, EPUB3, will include new features that promise to greatly enhance the reader experience, such as embedded audio, video, and interactivity.
Meanwhile, publishers hold out hope that the new and improved EPUB standard will end the frustration of EPUB 2.0.1 files behaving differently on different reading systems (which has led me, and others, to stress the importance of testing your files on every intended device).
With speculation abounding since the introduction of the spec in the spring of this year, publishers have been waiting to see what’s really possible with EPUB3 — and what reading systems will support it.
To help manage expectations and alleviate confusion, I’ve provided a brief snapshot of some of the spec’s new features, as well as what publishers can do to start preparing for them, and notes of caution as to what may, or may not, be available in EPUB3 reading systems.
There has been some confusion as to whether HTML5 and EPUB3 will work together. To set the record straight, HTML5 is the base language of EPUB3 (with some minor adjustments to allow for pagination and other reading behaviors). Since EPUB3 content is written in HTML5, the two work hand in hand.
EPUB3 reading systems must be able to process XHTML files written in HTML5. This doesn’t mean that web browsers will be able to display EPUB files, unless they are able to process the additional navigation information contained within the EPUB file. That being said, there are some reading systems that are implemented within a browser environment.
The new baseline for style sheets is CSS2.1 with some CSS3 features added. This will provide much richer layout including multi-column layout, better font support, and directional printing, to name a few. Reading systems are NOT required to support CSS, but almost all of them do. One of the leading causes of frustration is the difference in CSS support between reading systems. In the current EPUB environment, many reading systems do not allow stylesheets within an EPUB file to override the system’s default settings. EPUB3 does not do anything to alleviate this situation and, in fact, might exacerbate it somewhat due to the additional capabilities that are possible. Reading systems also have the ability to implement their own proprietary CSS extensions, which would then be ignored by other reading systems.
Audio can be inserted into eBook files using the HTML5 <audio> tag. This is what Apple, Barnes & Noble, and Amazon have been using all along to embed audio in enhanced eBooks. Now, it’s simply part of the EPUB3 spec. Reading systems are NOT required to support audio, although many do. If a reading system supports audio, it must support MP3. In addition, support of MP4 AAC and media overlays (explained later) is optional.
Video can be inserted into eBook files using the HTML5 <video> tag. Again, this is what has already been occurring. And again, reading systems are NOT required to support video. In fact, most of the e-Ink devices are not able to show video in a satisfactory manner.
One of my main bones of contention with the EPUB 3 spec is that there is no specified video format that must be supported. If a reading system supports video, the spec recommends support of at least one of either H.264 (also known as MPEG-4 AVC) or VP8 video compression formats, but neither is required. Unfortunately, the spec does not prohibit other formats either. Essentially, there is nothing to stop a reading system developer from implementing some other video format (Flash?). Whether that happens remains to be seen, but the opening is there.
In the meantime, publishers are going to need to prepare videos in both formats to support the widest range of reading systems. As has been discussed in the past, this could lead to very large EPUB3 files, or different versions that target specific reading systems.
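To make the two-format approach concrete, here is a minimal Python sketch that builds a video element offering both an H.264 file and a VP8 file, plus fallback text for systems that cannot play either. The file names, the MP4 and WebM containers, and the build_video helper are my own assumptions for illustration; they are not taken from the spec.

```python
import xml.etree.ElementTree as ET


def build_video(sources, fallback_text):
    """Return markup for a <video> element with one <source> per (src, mime) pair.

    Hypothetical helper: a reading system is expected to pick the first
    source it can play and to show the nested paragraph otherwise.
    """
    video = ET.Element("video", {"controls": "controls"})
    for src, mime in sources:
        ET.SubElement(video, "source", {"src": src, "type": mime})
    # Fallback content for reading systems without video support.
    fallback = ET.SubElement(video, "p")
    fallback.text = fallback_text
    return ET.tostring(video, encoding="unicode")


# Assumed file names: H.264 in an MP4 container, VP8 in a WebM container.
markup = build_video(
    [("video/intro.mp4", "video/mp4"), ("video/intro.webm", "video/webm")],
    "This reading system cannot play the embedded video.",
)
print(markup)
```

Both files still have to ship inside the EPUB, which is exactly where the file-size concern comes from.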
Media overlay functionality was added to the spec to enable text and media to be presented in a combined manner. For example, text can be highlighted as it is spoken by the computer or as part of a soundtrack. In order to employ these overlays, special Synchronized Multimedia Integration Language (SMIL) files will have to be created.
Reading systems are not required to provide this functionality, but if they do, they should allow readers to skip or escape out of overlays. Overlays can also be used to provide text-to-speech functionality. The spec mentions the Pronunciation Lexicon Specification (PLS) and the Speech Synthesis Markup Language (SSML) as the means for providing assistance in generating synthetic speech, but does not require reading systems to use that information.
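For a sense of what one of these SMIL files looks like, the following Python sketch generates a small overlay that pairs text fragments with clips from a single audio file. The file names, fragment IDs, timings, and the build_overlay helper are hypothetical; treat this as an outline of the general shape rather than a complete, validated overlay.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_overlay(text_doc, audio_file, clips):
    """Return SMIL markup with one <par> per (fragment_id, begin, end) clip.

    Each <par> ties a fragment of the XHTML content to the stretch of
    audio (in seconds) during which that fragment is spoken.
    """
    smil = ET.Element("smil", {"xmlns": SMIL_NS, "version": "3.0"})
    body = ET.SubElement(smil, "body")
    for index, (fragment_id, begin, end) in enumerate(clips, start=1):
        par = ET.SubElement(body, "par", {"id": f"par{index}"})
        ET.SubElement(par, "text", {"src": f"{text_doc}#{fragment_id}"})
        ET.SubElement(
            par,
            "audio",
            {"src": audio_file, "clipBegin": f"{begin}s", "clipEnd": f"{end}s"},
        )
    return ET.tostring(smil, encoding="unicode")


# Assumed inputs: two sentences in chapter1.xhtml, narrated in one MP3 file.
overlay = build_overlay(
    "chapter1.xhtml",
    "audio/chapter1.mp3",
    [("sentence1", 0.0, 2.4), ("sentence2", 2.4, 5.1)],
)
print(overlay)
```

The practical takeaway for publishers is that the text needs an ID on every fragment to be synchronized, and someone has to supply the timings.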
Scalable Vector Graphics (SVG) files have been allowed within EPUB files for some time. However, their use was limited, due largely to a lack of reading system support. EPUB3 now mandates that reading systems be able to process SVG within the eBook, including allowing users to select text and search within the content of the SVG files. The only portion of SVG that is not allowed is the animation capability.
MathML is part of HTML5, and therefore, it is part of EPUB3. Reading systems must be able to process the presentation form of MathML, but may also support the content form of MathML. I won’t go into a lot of detail here. But publishers that deal with mathematical and scientific content may be interested, as it will allow formulas to be included as part of the XHTML markup — rather than as images. This means the content will be scalable, among other things. It is still recommended that images of the formulas be included as fallbacks.
Foreign resources are pieces of content that are not a core media type. For example, PDFs might be considered foreign resources. I have seen cases where PDF files are incorporated into EPUB files. When this is done, at least one fallback (perhaps a plain text equivalent) should be included to allow reading systems that don’t support the resource to operate.
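In the package manifest, a fallback is declared with a fallback attribute that points at another item. The Python sketch below checks that every foreign resource in a manifest eventually falls back to a core media type. The sample manifest, the item IDs, and the helper name are hypothetical, and CORE_TYPES is only a partial list that I am assuming for the example.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Partial, assumed list of core media types; extend it before real use.
CORE_TYPES = {
    "application/xhtml+xml", "image/svg+xml", "image/png",
    "image/jpeg", "image/gif", "text/css", "audio/mpeg", "audio/mp4",
}


def items_without_core_fallback(opf_xml):
    """Return IDs of manifest items whose fallback chain never reaches a core type."""
    root = ET.fromstring(opf_xml)
    items = {i.get("id"): i for i in root.iter(OPF_NS + "item") if i.get("id")}
    missing = []
    for item_id, item in items.items():
        seen = set()
        current = item
        while current is not None and current.get("media-type") not in CORE_TYPES:
            if current.get("id") in seen:  # circular fallback chain
                current = None
                break
            seen.add(current.get("id"))
            fallback_id = current.get("fallback")
            current = items.get(fallback_id) if fallback_id else None
        if current is None:
            missing.append(item_id)
    return missing


SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="report" href="report.pdf" media-type="application/pdf" fallback="report-html"/>
    <item id="report-html" href="report.xhtml" media-type="application/xhtml+xml"/>
    <item id="chart" href="chart.pdf" media-type="application/pdf"/>
  </manifest>
</package>"""

print(items_without_core_fallback(SAMPLE_OPF))  # ['chart']
```

Here the report PDF is fine because it falls back to an XHTML version, while the chart PDF has no fallback and is flagged.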
That being said, publishers should be thinking about possible ways that content can be made more interactive and beginning to plan for creating those enhancements. However, they should also make sure that the reading experience is not adversely affected if a reader decides to turn scripting off, or if a reading system does not provide it.
EPUB3 created a Canonical Fragment Identifier (EPUBCFI) specification for creating and accessing various locations within the content. This allows very fine-grained access to the content, even at the word or phrase level. The use of this spec could allow indexes to link to the exact word within the content. It is also the basis of an inter-document linking spec due out in the near future.
Publishers should consider how best to create additional target IDs within their content to speed the linking process. The good news is that reading systems are required to be able to process EPUBCFI addresses, making them more interoperable.
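To show why those IDs matter, here is a rough Python sketch that splits a simple EPUBCFI string into its steps. It is deliberately simplified: it ignores ranges, escaped characters, and the other offset types, and it assumes a plain character offset. The parse_cfi helper and the IDs in the sample string are illustrative only.

```python
import re

# One step looks like "/4" or "/4[some-id]".
STEP = re.compile(r"/(\d+)(?:\[([^\]]*)\])?")


def parse_cfi(cfi):
    """Split a simple EPUBCFI into path parts and an optional character offset.

    Returns (parts, offset). Each part is a list of (number, id_or_None)
    steps, and "!" separates the parts, marking the jump from one document
    into the one it references. Even step numbers address elements; odd
    step numbers address the text between them.
    """
    match = re.fullmatch(r"epubcfi\((.*)\)", cfi)
    if not match:
        raise ValueError("not an EPUBCFI string")
    path, _, offset = match.group(1).partition(":")
    parts = []
    for segment in path.split("!"):
        steps = [(int(number), ident or None) for number, ident in STEP.findall(segment)]
        parts.append(steps)
    return parts, (int(offset) if offset else None)


parts, offset = parse_cfi("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)")
print(parts)   # [[(6, None), (4, 'chap01ref')], [(4, 'body01'), (10, 'para05'), (3, None)]]
print(offset)  # 10
```

The bracketed IDs are optional, but when they are present a reading system can use them to confirm it has landed on the intended element even if the content has been edited since the link was made.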
EPUB 2.0.1 actually consists of two schemas: EPUB and DTBook. DTBook was intended to provide content to assistive systems for visually impaired readers through Braille readers and other technologies. Because of the accessibility features within HTML5, it was decided that DTBook could be deprecated and the functionality rolled into EPUB. So technically, EPUB3 files are accessible ‘by design.’
Publishers should do everything within reason to ensure that all items within their content are accessible. This includes descriptions of all images and alternative text for MathML and scripts.
Hopefully this quick dive into the spec provides enough context for you to at least know what to expect as we move into the new eBook formatting realm of EPUB3. Going forward, there will undoubtedly be lots of new capabilities as best practices get solidified and reading systems become even more advanced. So stay tuned. The final membership vote is expected in late August or early September, and I’ll be reporting back with updates.
Eric Freese is a Solutions Architect with Aptara, which provides digital publishing solutions that deliver significant gains in quality, time-to-market and production costs for eBook publishers.