Open is More Than A License: Why File Formats Matter for Revising & Remixing

I once wrote a 2500-word essay on why PDFs are terrible, so it’s safe to say that I have some thoughts about file formats (and also that I’m great fun at the right kind of parties). However, it’s also safe to say that not many people give file formats as much thought as I do. We probably collect quite a few of those who do in the OER community, but for anyone who doesn’t, here’s a little primer on why they’re important to what we do:

There is an ongoing conversation in the Open world about what being “Open” means beyond allowing cost-free access, from accessible language to inclusivity to leaving open unknown future uses of your work. In the educational context, we’ve talked about what it means to us here, and most famously, David Wiley has encapsulated “Openness” in OER in the 5Rs – the right to Retain, Reuse, Revise, Remix and Redistribute content. Formats have big implications for all of these, and directly inform our approach to Openness at Rebus. Open content needs to do more than give permission to exercise the 5Rs (i.e. through open licenses); it also needs to offer the technical capability (and ease) to exercise them.

As an example, when it comes to revising and remixing, it helps if the content exists in a format that lends itself to editing. The Rebus Community is currently supporting a project that will see two OpenStax biology textbooks combined to create a new text. However, the books are only available in formats that don’t lend themselves to editing: web/HTML, PDF and OpenStax’s own format, CNXML. This means that before being able to work on the content, the team needs to figure out a way to turn one of those formats into something easier to work with.

BCcampus has been working on bringing the OpenStax books into Pressbooks (open source book production software widely used for producing OER, including in the Rebus Press), and Rebus is working on converting OpenStax Biology in a similar way so it can be adapted. We’re currently doing this manually, but hope to build an automated process in the coming months.

While, in theory, we could have dropped the content into any editing software (e.g. Microsoft Word) for this current project, the advantage of Pressbooks is that it makes it easy for the work we do to benefit others. This book, like many other Pressbooks books, can be made available for download through a distribution option, in a range of formats that each serve a different purpose. Once activated, the distribution option automatically adds the most recently exported book files to the book landing page, where anyone can access them. The main formats available are:

  • PDF: This is best for print, and is preferred by some for digital reading (especially offline)
  • Ebook (EPUB and MOBI): Ebooks are another popular option for reading (but much less popular with those doing anything remotely technical with the files)
  • XHTML: The standardised nature of this format makes it very useful for moving content between systems & formats. HTML is the language the web speaks, and XHTML is the central source from which PDF and ebooks are created in the Pressbooks system.
  • Pressbooks XML: This is an extension of the standard WordPress XML output format, and allows a clone of the book to be uploaded to a new Pressbooks shell, with either all content or a selection imported. It makes revising and remixing in Pressbooks incredibly easy, with a new, editable version of a book able to be created in a matter of minutes.
  • OpenDocument Format (ODT): ODT is an open file format that is compatible with MS Word and similar word processors. While the ODT output produced by Pressbooks isn’t very pretty in terms of formatting, it does allow for content to be edited in a familiar system, which is sometimes a useful option.
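
To make the point about XHTML a little more concrete: because it is well-formed XML, a general-purpose XML parser can read it without any special tooling. The sketch below uses Python's standard library to pull chapter titles and paragraph text out of a small XHTML snippet. The sample markup, the `chapter` class name and the `extract_chapters` function are all made up for illustration; a real Pressbooks export will be structured differently, so treat this as a rough idea rather than a working converter.

```python
import xml.etree.ElementTree as ET

# Standard XHTML namespace; ElementTree needs it to match element names.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample standing in for an exported book file.
# The "chapter" class is an assumption, not the real Pressbooks markup.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample Book</title></head>
  <body>
    <div class="chapter"><h1>Cells</h1><p>Cells are the basic unit of life.</p></div>
    <div class="chapter"><h1>Genetics</h1><p>Genes carry <em>heritable</em> information.</p></div>
  </body>
</html>"""


def extract_chapters(xhtml_text):
    """Return a list of (title, [paragraph text, ...]) tuples."""
    root = ET.fromstring(xhtml_text)
    chapters = []
    for div in root.iter(f"{{{XHTML_NS}}}div"):
        if div.get("class") != "chapter":
            continue
        heading = div.find(f"{{{XHTML_NS}}}h1")
        title = "".join(heading.itertext()) if heading is not None else ""
        # itertext() flattens inline markup such as <em> into plain text.
        paragraphs = [
            "".join(p.itertext()) for p in div.findall(f"{{{XHTML_NS}}}p")
        ]
        chapters.append((title, paragraphs))
    return chapters


if __name__ == "__main__":
    for title, paragraphs in extract_chapters(SAMPLE):
        print(title, "-", len(paragraphs), "paragraph(s)")
```

Trying the same thing with a PDF would mean reconstructing chapters and paragraphs from positioned text on a page, which is a large part of why format matters for revising and remixing.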

One of our goals is to take the entire collection of OpenStax books and make them available in the Pressbooks format, so any Pressbooks-based network can host a copy of the originals that can then be taken and adapted by downloading the XHTML, Pressbooks XML or ODT (with the disclaimers about formatting).

The OER refrain of not reinventing the wheel applies here, too — openness means that we share our work so that it doesn’t have to be replicated, and everyone can build on it. We usually think about this in terms of content and licenses, but we should also consider the practicalities of how we can (or can’t) work with that content. An open textbook that exists as a PDF available online and a Word document buried on someone’s computer just isn’t reaching its full potential!

We should also consider how this kind of openness applies to all the other work that goes into creating an open textbook. That’s a big part of what we’re trying to do at the Rebus Community: engage with the people who are making open textbooks and leverage their experiences to create tools and resources that can be used by everyone in the community.

Want to be a part of it? Join the forum and sign up to one of our projects.

The distribution option is currently available, or can be activated, on any Pressbooks network except the main network, where infrastructure upgrades are in progress to support it at the scale required. It is expected to be available there within the next couple of months.

