Getting Started with Docbook Book Authoring on Ubuntu

Aug 07 2007

[[FYI, this has been sitting in my writing queue for a while. I took a quick look at it and am shoving it out the door. Let me know if it's deficient and I'll fix it. Consider this version 1.0 of the article.]]

Ever since I spent my time in the technical writing trenches right out of college, I've been interested in doing my writing in a single format and generating whatever target formats I need from that.

At different points over the ensuing years, I've been drawn to DocBook as that single format. It's SGML/XML, first of all, which makes it relatively easy to write in a text editor. It uses XSLT to transform to other formats. It has pre-built toolchains for outputting HTML, PDF, etc. Also, because it's plain text, it is easily versioned using Subversion.

An additional vote of confidence for DocBook is the fact that O'Reilly has been using it for many of their recent books. They're actually also storing their content in an Atom Publishing Protocol repository. That matches where my intuition about a personal publishing stack has been leading: when written content is stored in a robust container (DocBook, Atom, etc.), you can repurpose it.

Technical documentation doesn't always work in every format (which is why many of the single source experiments in that space failed). However, for things that *do* work in multiple formats, the technology for producing those formats gets in the way. Not so with DocBook.

Now, while I've looked into it at several different points, I've never really dug into it well enough to get much done with it. Over the last couple of weeks I set out to actually get completely up and running with DocBook. This time I powered through and got it working. I suspect that the motivations were more concrete this time.

Along the way, I discovered that there wasn't a tutorial that matched what I was looking for, and some of the tutorials I did find had non-functional code. Despite that, I got a basic DocBook book up and running and figured out a much simpler way to get started, so I figured I'd share.

First of all, the prerequisites and assumptions.

  • I did this on Ubuntu Linux. While I know it's possible to do this on Windows or Mac, I'm not doing it there. And, actually, I know from having tried it that msxsl.exe on Windows doesn't cooperate with the DocBook XSLT.
  • If you want PDF output, you need a Java environment. See my article on Yanel for how to set up Java and the right environment variables.
  • I used xsltproc for XSLT transformations. It was already installed on my machine, but is available via "apt-get install xsltproc".
  • I am using FOP to output PDF files. I'll cover installing that in a bit.
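Before going further, it can help to confirm which of those tools are already on your PATH. This is only a rough sketch: check_tool is a helper name I made up for illustration, and the install hints are the same ones mentioned in the list above.

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
# check_tool is a hypothetical helper, not part of any package.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# xsltproc does the XSLT work; java is only needed for PDF output via FOP.
check_tool xsltproc || echo "  install with: sudo apt-get install xsltproc"
check_tool java || echo "  needed only for PDF output via FOP"
```

Anything reported as missing needs to be installed before the matching step later in the article will work.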

To install FOP, download the latest .tar.gz from the FOP site. After downloading it, I un-tarred it, renamed the resulting directory and moved it to the "bin" directory in my home directory.

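# unpack the gzipped tarball, then give the directory a version-free name
tar -xzvf fop-0.20.5-bin.tar.gz
mv fop-0.20.5 fop
# make sure the target directory exists before moving FOP into it
mkdir -p /home/j/bin
mv fop /home/j/bin/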

Originally I grabbed a small, sample article XML from the web, changed the name and email address and saved it. I've since built a "book" as well and everything worked pretty much the same. I'm using jEdit to do the editing. With the XML plugin, you get tag completion and hinting, making for a decent editing environment.

Here's the sample XML file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article>
  <articleinfo>
    <title>An example article</title>

    <author>
      <firstname>J</firstname>
      <surname>Wynia</surname>
      <affiliation>
        <address>
          <email>j@wynia.org</email>
        </address>
      </affiliation>
    </author>

    <copyright>
      <year>2007</year>
      <holder>J Wynia. Licensed under the Creative Commons Attribution 3.0 license. http://creativecommons.org</holder>
    </copyright>

    <abstract>
      <para>If your article has an abstract then it should go here.</para>
    </abstract>
  </articleinfo>

  <sect1 id="sect1">
    <title>My first section</title>

    <para>This is the first section in my article.</para>

    <sect2 id="sect2">
      <title>My first sub-section</title>

      <para>This is the first sub-section in my article.</para>
    </sect2>
  </sect1>
</article>
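
If you want to catch markup mistakes before running any transformations, one option is to validate the file against the DTD named in its DOCTYPE. This wasn't part of my original setup, so treat it as a sketch: it assumes xmllint is installed (on Ubuntu it ships in the libxml2-utils package), that the sample is saved as demoarticle.xml, and validate_docbook is just a name I picked.

```shell
#!/bin/sh
# Check a DocBook file against the DTD named in its DOCTYPE.
# Assumption: xmllint is available; validate_docbook is a hypothetical helper.
validate_docbook() {
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found; try: sudo apt-get install libxml2-utils" >&2
        return 2
    fi
    # --valid checks against the DTD, --noout suppresses the echoed document.
    xmllint --noout --valid "$1"
}

# Only validate when the sample file is actually present.
if [ -f demoarticle.xml ]; then
    validate_docbook demoarticle.xml
fi
```

Unless the DocBook DTD is installed locally, xmllint will most likely need network access to fetch it from the URL in the DOCTYPE.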

Create a simple XSLT file for HTML conversion (the simpler conversion to start with). Now, something that you won't see on the surface is that you don't need to download the DocBook XSLT in order to use it. You can reference it by its URL.

It's worth noting that that little fact is one of the hidden bits of information that was really hard for me to find. Every other tutorial I read on this subject had you downloading the XSLT and setting paths to them, etc. However, that's all totally unnecessary to start out. Sure, after you get going, you may want your own copy of the XSLT files, but for now, skip it.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/xhtml/docbook.xsl"/>
<xsl:param name="html.stylesheet">gtb_docbook.css</xsl:param>
</xsl:stylesheet>

In case you're curious, there's a nice list of available parameters that you can use.
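
Running the HTML conversion looks much like the FO command further down. Here's a minimal sketch, assuming the stylesheet above was saved as gtb_docbook_html.xsl (a made-up name, use whatever you called yours) and the sample document as demoarticle.xml. The to_html function is only there for illustration.

```shell
#!/bin/sh
# Assumption: the HTML customization layer was saved as gtb_docbook_html.xsl.
# That file name is hypothetical; override it with HTML_XSL=yourfile.xsl.
HTML_XSL=${HTML_XSL:-gtb_docbook_html.xsl}

# Transform a DocBook file to XHTML next to the source (foo.xml -> foo.html).
to_html() {
    xsltproc --output "${1%.xml}.html" "$HTML_XSL" "$1"
}

# Only convert when the sample file is actually present.
if [ -f demoarticle.xml ]; then
    to_html demoarticle.xml
fi
```

Because the stylesheet imports the DocBook XSLT by URL, xsltproc needs network access when this runs.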

A similar XSLT file will load in the FO XSLT and override a couple of items with new settings. In this case, I overrode the page size, since one of the reasons I want to start using DocBook is to do print-on-demand publishing through either my own company or some place like Lulu.

Most of those sites use a 6"x9" format that I think works well for a mix of content, so I wanted to be able to generate PDFs in that size. Here's the XSLT that sets the page size and takes the defaults for everything else.

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>
<xsl:param name="page.width">6in</xsl:param>
<xsl:param name="page.height">9in</xsl:param>
</xsl:stylesheet>

The FO stylesheets have their own list of available parameters. Apply this XSLT to generate the FO document:

xsltproc --output demoarticle.fo gtb_docbook_fo.xsl demoarticle.xml
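
As an alternative for quick experiments, xsltproc can set stylesheet parameters on the command line with --stringparam, so you can try a page size against the stock FO stylesheet without editing any XSLT. This is a sketch rather than what I actually ran: fo_with_page_size is a hypothetical helper, and the URL is the same one imported in the customization layer.

```shell
#!/bin/sh
# Stock DocBook FO stylesheet, referenced by URL as in the customization layer.
STOCK_FO=http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl

# Hypothetical helper: $1 = page width, $2 = page height, $3 = DocBook source.
fo_with_page_size() {
    xsltproc --stringparam page.width "$1" \
             --stringparam page.height "$2" \
             --output "${3%.xml}.fo" "$STOCK_FO" "$3"
}

# Only run when the sample file is actually present.
if [ -f demoarticle.xml ]; then
    fo_with_page_size 6in 9in demoarticle.xml
fi
```

Once a setting has proven itself, moving it into the customization layer keeps the build command short.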

Once you have the XSL-FO document (which describes all of the nitty-gritty layout details), you can turn that into PDF using FOP.

/home/j/bin/fop/fop.sh demoarticle.fo demoarticle.pdf

And, that's it. You've got a PDF from your XML document.

This is obviously a bit of a pain to set up initially and get going, but it ended up not being as bad as my previous attempts were. And, if you automate the workflow into a single shell script, you can run one thing every time you save and re-generate all of your outputs.
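
That single script could look something like the sketch below. It strings together the steps from this article under a few assumptions: gtb_docbook_html.xsl is a made-up name for the HTML customization layer, build_all is a hypothetical function name, and the FOP path is the one from my own machine, so adjust it to yours.

```shell
#!/bin/sh
# Rebuild HTML, FO and PDF from one DocBook source file.
FOP=${FOP:-/home/j/bin/fop/fop.sh}   # path used earlier in this article
HTML_XSL=gtb_docbook_html.xsl        # hypothetical name for the HTML layer
FO_XSL=gtb_docbook_fo.xsl            # FO layer used in the command above

# Each step only runs if the previous one succeeded.
build_all() {
    base=${1%.xml}
    xsltproc --output "$base.html" "$HTML_XSL" "$1" &&
    xsltproc --output "$base.fo" "$FO_XSL" "$1" &&
    "$FOP" "$base.fo" "$base.pdf"
}

# Only build when the sample file is actually present.
if [ -f demoarticle.xml ]; then
    build_all demoarticle.xml
fi
```

Chaining the steps with && means a broken XML file stops the run before FOP is handed a stale or missing FO file.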

Incidentally, I'm actually planning on creating a VMWare appliance that takes various flavors of XML, converts them to DocBook as necessary and then gives you the PDF. It'll be a web app/web service on that virtual machine and I'm thinking about putting it up as a public service once that's done.

If that service existed, how might you use it?

 

Comments on this post

Feedback is always welcome. Read some from other folks or leave your own below. Just keep things civil and remember that what you post lives on in public. Forever.

Thanks,
J

4 Responses to “Getting Started with Docbook Book Authoring on Ubuntu”

  1. S Says:

    "That's another vote for where my intuition about a personal publishing stack"

    Might want to reword that?

    S

  2. J Wynia Says:

    Yep. I'll rewrite that bit.

  3. J Wynia Says:

    I just got the VMWare instance created and ran a sample doc through this toolchain via a web upload. Here's the sample doc as a PDF

  4. Alexey Says:

    Preety nice article. The most valuable part for me is information about automatic docbook deploying. Thanks

© 2003-2008 J Wynia. All original content is licensed under the terms of the Creative Commons Attribution license unless otherwise noted. Content from other sources is licensed under its original terms.