
Docbook

Docbook is a typesetting and layout tool for authors. Specifically, it is an XML schema intended to produce both print and electronic copies of text. Combined with any one of the many powerful XML processors available, it achieves the goal of letting an author write once and publish to anything. It is widely used in technical documentation (such as the first few editions of Slackermedia itself), but also for works of fiction, academia, and more.

Strengths [Weaknesses]

Familiar

XML is vastly different from HTML, but the concept is very similar. If you are good with HTML, then XML will feel like the “pro” version of what you already know.

Strict

XML is rigid in what its processors accept, so there is an absolutism to how you structure your documents. This may help you organise information better, and it guarantees predictable output in the end. You won't spend time shifting indents and special meta-characters around in your text editor; you will spend time writing content within a well-structured framework.

Documented

XML is a long-standing format, and Docbook is a well-respected schema. Docbook is well documented on http://docbook.org and XML is so well-known that you can take classes on the subject.

[Ex]Portable

Once your text is in XML format, it is structured and predictable. This probably means that if there is another format (html, epub, pdf, ps, plain text, rtf, odt, and so on) that you want to output to, you can convert to it from XML. There just isn't any ambiguity about XML, and there are heaps of post-processors.

Weaknesses [Strengths]

Complex

The process of creating well-formed XML is not simple. It is a very verbose format, it will fail at the smallest error, it enforces inheritance, and it requires some number of post-processors in order to get it out of the XML format.

Strict

Unlike markdown or HTML, XML is intolerant of any deviation from its defined schema. Something as simple as a missing closing tag will break the processor. There are tools, such as xmllint, to help ensure well-formed XML, but it is not uncommon to attempt at least three builds before a successful one.
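A well-formedness check before each build can catch a missing closing tag early. Here is a minimal sketch, assuming xmllint (part of libxml2) is on your PATH; broken.xml is a made-up file that exists only to show the failure case:

```shell
#!/bin/sh
# Sketch: catch a missing closing tag before running a full build.
# broken.xml is a hypothetical file, written to a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/broken.xml" <<'EOF'
<?xml version="1.0"?>
<article>
  <title>Example</title>
  <para>This paragraph is never closed.
</article>
EOF

if command -v xmllint >/dev/null 2>&1; then
    # --noout suppresses the parsed document, so only errors are printed.
    if xmllint --noout "$workdir/broken.xml"; then
        result="well-formed"
    else
        result="not well-formed"
    fi
else
    result="xmllint not installed"
fi
echo "$result"
```

Run the same check against your own files before handing them to a processor; xmllint reports the line where the parser gave up.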

Style

The look of documents output from Docbook is clean and professional, but to change the look and feel of your output, you probably need to learn XSL. XSL can be complex, especially if you have only just learnt XML and how to process it.

Install

Docbook is not an application, but a schema, meaning that it is nothing more than a set of rules that you follow whilst writing text in any plain text editor of your choice. If you have ever used HTML, it's a little like that; you don't install HTML, you just write it, and other programmes bear the burden of interpreting it and processing it into a form for public consumption.

The docbook schema, along with a number of XML tools, comes pre-installed on Slackware. The schemas are located in /usr/share/xml/docbook/xml-dtd-X.Y (where X.Y is a version number).

Quickstart

The best quickstart guide to Docbook is a short work by David Rugge, Mark Galassi, and Eric Bischoff, located at http://xml.web.cern.ch/XML/goossens/dbatcern/dbatcern.html.

Here is a basic summary, featuring a severely limited set of functionality:

Docbook Header

The Docbook header is a line of text at the top of a Docbook file which identifies the file as being an XML document following the Docbook schema, and points to where the schema's rules are located on your computer (or a networked location, if you have confidence in your network environment).

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "/usr/local/share/xml/docbook/4.5/docbookx.dtd">

Docbook can format articles or books; so if you're writing a book the header should have: DOCTYPE book PUBLIC

If you are writing an article, then you want: DOCTYPE article PUBLIC
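To make the book/article distinction concrete, here is a minimal sketch of an article. The file name article.xml and its contents are invented for illustration, and the DTD path is only the example path from the header above, so swap in your own. The public identifier is spelled the way OASIS publishes it, with a capital B. The thing to notice is that the name after DOCTYPE has to match the root element.

```shell
#!/bin/sh
# Sketch: write a tiny Docbook article into a scratch directory.
# article.xml and its contents are made up; replace the DTD path with yours.
workdir=$(mktemp -d)

cat > "$workdir/article.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "/usr/local/share/xml/docbook/4.5/docbookx.dtd">
<article>
  <title>A Very Short Article</title>
  <para>The root element is article because the DOCTYPE says article.</para>
</article>
EOF

# Print the DOCTYPE line and the root element to show that they agree.
grep -n -e '<!DOCTYPE' -e '<article>' "$workdir/article.xml"
```

For a book, both places would say book instead, and the paragraphs would live inside chapter elements.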

And, as I've said already, the actual PATH to docbookx.dtd is something you'll need to define for the system. You can always find that out with this command:

p---------------------------------------------------q
|  bash$ find /usr/share/xml/ -iname 'docbook*dtd'  |
|                                                   |
b---------------------------------------/fig 6. find!
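If you would rather not paste the path by hand, the find result can be dropped straight into the header. This is only a sketch under a few assumptions: you want the 4.5 DTD, it lives somewhere under /usr/share/xml, and the fallback is the example path used above. The header is written to a scratch directory here; in practice you would write it next to your XML files.

```shell
#!/bin/sh
# Sketch: build docbook.header from whatever DTD path find reports.
# Assumes DocBook XML 4.5; the fallback path is only the example from this page.
search_root=/usr/share/xml
workdir=$(mktemp -d)

# Quote the pattern so the shell passes it to find untouched.
dtd_path=$(find "$search_root" -iname 'docbookx.dtd' -path '*4.5*' 2>/dev/null | head -n 1)

if [ -z "$dtd_path" ]; then
    dtd_path=/usr/local/share/xml/docbook/4.5/docbookx.dtd
    echo "no 4.5 DTD found under $search_root, falling back to $dtd_path" >&2
fi

printf '<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "%s">\n' "$dtd_path" > "$workdir/docbook.header"
cat "$workdir/docbook.header"
```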

> xmlto html and xmlto pdf <--

Now we're ready to take all that confusing xml and make it into HTML so we can look at it in a web browser and a pdf so we can send it to all of our friends and carry it around on our mobiles and tablets. The application we use first is xmlto.

First, get all the xml files into one big document:

p-----------------------------------------------------q
|  bash$ cat docbook.header *.docbook.xml > tmp.xml   |
|                                                     |
b------------------------------/fig 7. creating tmp.xml
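The concatenated file still has to be one well-formed document, so the order of the pieces matters. Here is a sketch of one way to arrange that, and it is an assumption rather than a rule from this page: the shell expands *.docbook.xml in sorted order, so numbered file names (all hypothetical) keep the opening tag first and the closing tag last.

```shell
#!/bin/sh
# Sketch: numbered file names keep the glob in a predictable order.
# All file names and contents below are made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "/usr/local/share/xml/docbook/4.5/docbookx.dtd">' > docbook.header
printf '%s\n' '<book><title>My Book</title>' > 00_open.docbook.xml
printf '%s\n' '<chapter><title>One</title><para>First.</para></chapter>' > 01_one.docbook.xml
printf '%s\n' '<chapter><title>Two</title><para>Second.</para></chapter>' > 02_two.docbook.xml
printf '%s\n' '</book>' > 99_close.docbook.xml

# The header comes first, then the chapters in sorted order.
cat docbook.header *.docbook.xml > tmp.xml
cat tmp.xml
```

Note that tmp.xml does not match *.docbook.xml, so re-running the cat never swallows its own output.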

And now to create a directory for the HTML files, and process tmp.xml with xmlto:

p---------------------------------------------q 
|  bash$ mkdir ./html                         |
|  bash$ xmlto html tmp.xml -o ./html         |
|                                             |
b----------------------/fig 8. xmlto in action!

The syntax of xmlto breaks down like this:

xmlto - the command
html - the type of output we want
tmp.xml - the file to process
-o - the flag to tell xmlto where to dump the output files

And now if you navigate into the html folder you'll find a BUNCH of html files, and if you launch konqueror or some other web browser to that folder, then you'll see it lookin' all pretty and really nicely laid out and stuff.

For a pdf, the first and second steps are essentially the same; if you already have a concatenated tmp.xml then you can skip that step, and the second is similar:

p---------------------------------------------q 
|  bash$ mkdir ./pdf                          |
|  bash$ xmlto fo tmp.xml -o ./pdf            |
|                                             |
b----------------/fig 9. xmlto in action again!

WTF is an fo file? It's XSL-FO (XSL Formatting Objects), the intermediate step between raw unadulterated XML and a fancy hot-link-clickable PDF. xmlto dumps out a tmp.fo in your ./pdf directory.

To get the tmp.fo into pdf, we use Apache's fop:

p-----------------------------------------------------q 
|  bash$ fop ./pdf/tmp.fo ./pdf/myBook_by_myName.pdf  |
|                                                     |
b-----------------------------------------/fig 10. fop!

And now in your ./pdf directory you have a really really cool pdf with a table of contents that is clickable, and text that can be copied and pasted, and all that good stuff, just like the pros. Except, in our case, we didn't have to sell our souls to the evil that is Ad0be :^)
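If you find yourself retyping those commands, they can be wrapped in a small shell function. This is a sketch, not anything shipped with xmlto or fop: build_docs is a hypothetical name, the return codes are arbitrary, and it assumes (as seen above) that xmlto names the .fo file after the input file.

```shell
#!/bin/sh
# Sketch: the HTML and PDF steps from this page in one function.
# build_docs is a hypothetical helper, not part of xmlto or fop.
build_docs() {
    src=$1
    pdf_name=${2:-myBook_by_myName.pdf}

    # Refuse to start without an input file or without both tools.
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 2; }
    for tool in xmlto fop; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 3; }
    done

    base=$(basename "$src" .xml)
    mkdir -p ./html ./pdf

    # HTML pages go into ./html.
    xmlto html "$src" -o ./html || return 1
    # tmp.xml becomes ./pdf/tmp.fo, which fop then turns into the PDF.
    xmlto fo "$src" -o ./pdf || return 1
    fop "./pdf/$base.fo" "./pdf/$pdf_name"
}
```

Source the file and call it as build_docs tmp.xml, optionally with a PDF name as the second argument.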

So, that's it, you're done. Oh, well, unless you want to take it to the next level. I mean, if you think you can handle it. Well, take a moment, think it over, and if you want this to be a really lean-and-mean docbook-wielding machine, gather your party and venture forth:

> The Makefile <--