Maria Nicolae's Website


epubsynth

Release Archives:
epubsynth-v1.0.4.tar.gz
epubsynth-v1.0.4.zip
Older versions

Git Repository: git clone https://git.marianicolae.com/epubsynth.git
RSS Feed: https://marianicolae.com/software/epubsynth/rss.xml

README

epubsynth

A command-line program for generating EPUB documents.

In a Nutshell

epubsynth generates EPUBs from source files and provided metadata:

epubsynth \
    --output book.epub \
    --spine titlepage.xhtml \
            chapter1.xhtml \
            chapter2.xhtml \
            chapter3.xhtml \
    --stylesheets style.css \
    --resources fig1.png \
                fig2.jpg \
    --dc-title "A Book" \
    --dc-creator "Ann Author" \
    --dc-contributor "Ann Other Author"

Furthermore, epubsynth handles templating so that boilerplate does not need to be included in XHTML source files; their contents are inserted into the <body> element of a template.

This program is primarily intended to be used inside a shell script, or as part of a build system like make, rather than directly in an interactive shell. Accordingly, the command-line syntax is quite verbose, and includes metadata that is meant to be the same between invocations.
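As an illustration of scripted use, here is a minimal Python sketch of a build script that keeps the shared metadata in one place and assembles the command line from it. It only uses options shown in the example above; the helper names build_command and run_build, the file names, and the metadata values are hypothetical, and a shell script or Makefile would serve equally well.

```python
import shutil
import subprocess

# Metadata that stays the same between invocations (placeholder values).
COMMON_METADATA = [
    "--dc-title", "A Book",
    "--dc-creator", "Ann Author",
]


def build_command(output, spine, stylesheets=(), resources=()):
    """Assemble an epubsynth argument list from the options shown above."""
    cmd = ["epubsynth", "--output", output, "--spine", *spine]
    if stylesheets:
        cmd += ["--stylesheets", *stylesheets]
    if resources:
        cmd += ["--resources", *resources]
    return cmd + COMMON_METADATA


def run_build(cmd):
    """Run the command if the program is installed; return False otherwise."""
    if shutil.which(cmd[0]) is None:
        print("not found on PATH:", cmd[0])
        return False
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return True
```

Calling run_build(build_command("book.epub", ["titlepage.xhtml", "chapter1.xhtml"], stylesheets=["style.css"])) would then perform one build, with the metadata defined only once.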

Installation

On Arch Linux, install epubsynth using the Arch User Repository (AUR) package, which I maintain. Otherwise, use make install to install and make uninstall to uninstall.

Usage

The command-line interface for epubsynth takes no positional arguments, only options, of which --output and --spine are required:

epubsynth --output OUTPUT --spine NAME... [OPTION...]

Note that, while epubsynth aims for correctness in any boilerplate it generates, it does not validate input source files against their respective formats (e.g. XHTML), or check these source files for broken links. For this reason, it is recommended to validate the generated EPUB file with a tool like epubcheck.

Output and Source Files

The file path of the generated EPUB is specified using the --output option (alias -o). An EPUB file is a container of content files (in fact, a ZIP archive), and these content files are specified using options such as --spine, --stylesheets, --resources, and --toc.

The NAMEs of the content files are the names/paths inside the EPUB container, which do not necessarily match the paths to the corresponding source files in the file system. The source paths can be configured using the option --source-dir=. (aliases --src-dir and -d). For example, --source-dir src --resources fig1.png inserts src/fig1.png in the file system as fig1.png in the container. Additionally, the names mimetype, package.opf, and any name beginning with META-INF are reserved in the EPUB container, and therefore cannot be used.
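The mapping from container names to source paths, and the reserved-name rule, can be sketched in Python as follows. This is only an illustration of the behaviour described above, not epubsynth's actual code; the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Names that the text above lists as reserved in the EPUB container.
RESERVED_NAMES = {"mimetype", "package.opf"}


def is_reserved(name):
    """True if a container name collides with a reserved name."""
    return name in RESERVED_NAMES or name.startswith("META-INF")


def source_path(source_dir, name):
    """Return the file-system path that is read for a container name."""
    if is_reserved(name):
        raise ValueError(f"{name!r} is reserved in the EPUB container")
    return str(PurePosixPath(source_dir) / name)
```

Here source_path("src", "fig1.png") gives "src/fig1.png", matching the example above, while a name such as "META-INF/container.xml" is rejected.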

XHTML files specified in --spine, by default, have their contents inserted into the <body> element of a template; see Templating to learn how to configure this behaviour.

Finally, note that --toc has a default value, and therefore cannot be blank. Unlike for all other input file options, $SOURCE_DIR/$TOC need not exist in the file system, and if it does not, the table of contents will be automatically generated (see Table of Contents). If it does exist, its contents will, by default, be inserted into the <nav epub:type='toc'> element of a template; see Templating to learn how to configure this behaviour.
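Because an EPUB is a ZIP archive, the names that ended up in the container can be inspected with standard tools. The following Python sketch uses the standard zipfile module; the function names are hypothetical, and the check on the mimetype entry reflects the EPUB format in general rather than anything specific to epubsynth.

```python
import zipfile


def container_names(epub_path):
    """Return the names stored in an EPUB container, in archive order."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


def has_epub_mimetype(epub_path):
    """Check that the reserved mimetype entry holds application/epub+zip."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.read("mimetype").strip() == b"application/epub+zip"
```

For instance, container_names("book.epub") lists entries such as chapter1.xhtml alongside the reserved names. This is a quick sanity check only, not a substitute for epubcheck.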

Document Metadata

Metadata in EPUB documents follows the Dublin Core vocabulary, and can be set using command-line options. The following options correspond to fields that are mandatory in EPUB:

Other options for optional metadata are

These cover all of the /elements/1.1/ namespace, except for dc:format, which is always set to application/epub+zip in EPUBs generated by epubsynth, and cannot be changed.
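To make the namespace concrete, the Python sketch below builds a simplified metadata element with Dublin Core children using the standard library. The element layout is a simplification for illustration and is not epubsynth's actual output; the function name dc_metadata and the example values are hypothetical. The only behaviour taken from the text above is that dc:format is fixed to application/epub+zip.

```python
import xml.etree.ElementTree as ET

# The Dublin Core /elements/1.1/ namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_metadata(fields):
    """Serialise a dict of Dublin Core fields as dc:* child elements."""
    metadata = ET.Element("metadata")
    for name, value in fields.items():
        ET.SubElement(metadata, f"{{{DC_NS}}}{name}").text = value
    # dc:format is always application/epub+zip and cannot be overridden.
    ET.SubElement(metadata, f"{{{DC_NS}}}format").text = "application/epub+zip"
    return ET.tostring(metadata, encoding="unicode")
```

Calling dc_metadata({"title": "A Book", "creator": "Ann Author"}) yields a string containing dc:title, dc:creator, and dc:format elements bound to the namespace above.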

Table of Contents

The table of contents for the EPUB document can be set in one of two ways: automatic generation and manual authoring.

Automatic Generation

If the file referred to by --toc does not exist inside --source-dir, a table of contents will be automatically generated. The option --toc-headings can be used to specify a list of headings, synchronised with the list of files in --spine, for the table of contents. For example,

--spine        titlepage.xhtml chapter1.xhtml chapter2.xhtml \
--toc-headings "Title Page"    "Chapter 1"    "Chapter 2"

will result in the table of contents

1. Title Page
2. Chapter 1
3. Chapter 2

Files in --spine can be omitted from the table of contents by setting their corresponding value in --toc-headings to the empty string "", as well as by making --toc-headings shorter than --spine. For example,

--spine        titlepage.xhtml chapter1.xhtml chapter2.xhtml backpage.xhtml \
--toc-headings ""              "Chapter 1"    "Chapter 2"

will result in the table of contents

1. Chapter 1
2. Chapter 2

If --toc-headings is not set, the file names in --spine themselves are used as the headings. For example,

--spine chapter1.xhtml chapter2.xhtml

without --toc-headings will result in the table of contents

1. chapter1.xhtml
2. chapter2.xhtml
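The pairing rules for automatic generation can be summarised in a short Python sketch. This illustrates the behaviour described in this section and is not epubsynth's actual implementation; the function name toc_entries is hypothetical, and ignoring headings beyond the end of --spine is an assumption, since that case is not covered above.

```python
from itertools import zip_longest


def toc_entries(spine, toc_headings=None):
    """Pair spine files with headings, returning (file, heading) tuples."""
    if toc_headings is None:
        # No --toc-headings: the file names themselves are the headings.
        return [(name, name) for name in spine]
    # Assumption: surplus headings beyond the spine are ignored.
    headings = toc_headings[:len(spine)]
    entries = []
    for name, heading in zip_longest(spine, headings, fillvalue=""):
        if heading:  # "" or a missing heading omits the file
            entries.append((name, heading))
    return entries
```

With the second example above, toc_entries(["titlepage.xhtml", "chapter1.xhtml", "chapter2.xhtml", "backpage.xhtml"], ["", "Chapter 1", "Chapter 2"]) keeps only the two chapters.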

Manual Authoring

If the file referred to by --toc exists inside --source-dir, it will be inserted into the EPUB container as the table of contents (after templating, as mentioned in Output and Source Files and detailed in Templating). For example, the source file

<ol>
    <li><a href='chapter1.xhtml'>First Chapter</a></li>
    <li><a href='chapter2.xhtml'>Another Chapter</a>
    <ol>
        <li><a href='chapter2.xhtml#section1'>Some Section</a></li>
    </ol>
    </li>
</ol>

will result in the table of contents

1. First Chapter
2. Another Chapter
    1. Some Section

See the relevant section of the EPUB 3 specification for the requirements this file must meet.

Templating

There are two circumstances in which the contents of an input file are inserted into a template before being added to the EPUB container: XHTML files in --spine and the file in --toc (the latter regardless of whether the table of contents is automatically generated or manually authored).

For an XHTML file in --spine, the templates --template-xhtml and --template-html are used, which default to

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'>
    {html}
</html>

and

<head>
    <title>{dc_title}</title>
    {stylelinks}
</head>
<body>
    {file}
</body>

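respectively. These are Python format strings for which the fields are html, the contents of --template-html after substitution; dc_title, the document title set with --dc-title; stylelinks, the links to the files given in --stylesheets; and file, the contents of the source file.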

For the --toc file, the templates --template-toc and --template-toc-nav are used, which default to

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'
      xmlns:epub='http://www.idpf.org/2007/ops'>
<head>
    <title>{dc_title}</title>
</head>
<body>
    {nav}
</body>
</html>

and

<nav epub:type='toc'>
    {file}
</nav>

respectively. The fields are nav, the contents of --template-toc-nav after substitution, as well as dc_title and file with the same meaning as before.

As an example, to disable all templating, one would set --template-xhtml "{file}" --template-toc "{file}".
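The two-stage substitution for --spine files can be sketched with Python's str.format. The template strings are the defaults quoted above; the function name render_spine_file and the assumption that every field is passed to both templates are illustrative, not a description of epubsynth's internals.

```python
TEMPLATE_XHTML = """<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'>
    {html}
</html>"""

TEMPLATE_HTML = """<head>
    <title>{dc_title}</title>
    {stylelinks}
</head>
<body>
    {file}
</body>"""


def render_spine_file(file, dc_title, stylelinks="",
                      template_xhtml=TEMPLATE_XHTML,
                      template_html=TEMPLATE_HTML):
    """Substitute the inner template first, then the outer one."""
    fields = {"file": file, "dc_title": dc_title, "stylelinks": stylelinks}
    # The rendered inner template becomes the {html} field of the outer one.
    html = template_html.format(**fields)
    # Assumption: all fields are available to the outer template as well.
    return template_xhtml.format(html=html, **fields)
```

Under that assumption, passing template_xhtml="{file}" returns the source contents unchanged, which is the effect of the option shown above for disabling templating.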