epubsynth
A command-line program for generating EPUB documents.
In a Nutshell
epubsynth generates EPUBs from source files and the metadata provided on the command line:
epubsynth \
--output book.epub \
--spine titlepage.xhtml \
chapter1.xhtml \
chapter2.xhtml \
chapter3.xhtml \
--stylesheets style.css \
--resources fig1.png \
fig2.jpg \
--dc-title "A Book" \
--dc-creator "Ann Author" \
--dc-contributor "Ann Other Author"
Furthermore, epubsynth handles templating, so boilerplate does not need to be included in XHTML source files; their contents are inserted into the <body> element of a template.
This program is primarily intended to be used inside a shell script, or as part of a build system like make, rather than directly in an interactive shell. Accordingly, the command-line syntax is quite verbose, and includes metadata that is meant to be the same between invocations.
Installation
On Arch Linux, install epubsynth from the Arch User Repository (AUR) package, which I maintain. Otherwise, use make install and make uninstall to install and uninstall respectively.
Usage
The command-line interface for epubsynth takes no positional arguments, only options, of which --output and --spine are required:
epubsynth --output OUTPUT --spine NAME... [OPTION...]
Note that, while epubsynth aims for correctness in any boilerplate it generates, it does not validate input source files against their respective formats (e.g. XHTML), nor does it check for broken links in these source files. For this reason, it is recommended to use a tool like epubcheck to validate the generated EPUB file.
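Before running a full validator, a build script can do a quick sanity check of the container itself. The sketch below relies only on general EPUB container rules (the `mimetype` entry comes first, is stored uncompressed, and contains `application/epub+zip`; `META-INF/container.xml` exists). The function name is hypothetical, and this is in no way a substitute for epubcheck.

```python
import zipfile


def quick_container_check(path):
    """Return a list of problems found in the EPUB container at `path`."""
    problems = []
    with zipfile.ZipFile(path) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry")
        elif entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("'mimetype' is compressed")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("'mimetype' has unexpected contents")
        if "META-INF/container.xml" not in archive.namelist():
            problems.append("'META-INF/container.xml' is missing")
    return problems
```

An empty list means only that these basic container rules hold; content-level problems such as invalid XHTML or broken links still need a real validator.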
Output and Source Files
The file path of the generated EPUB is specified using the --output option (alias -o). An EPUB file is a container of content files (in fact, a ZIP archive), and these content files are specified using the options:
- --spine NAME... (alias -s): The sequence of XHTML and/or SVG files that define the reading order (required).
- --stylesheets NAME...: The CSS stylesheets to be included.
- --toc=toc.xhtml: The table of contents file. This must be XHTML type: EPUB 2 table of contents files (NCX type) are not supported.
- --resources NAME...: All other resources to be included, that are not better categorised into the preceding options.
The NAMEs of the content files are the names/paths inside the EPUB container, which do not necessarily match the paths to the corresponding source files in the file system. The source paths can be configured using the option --source-dir=. (aliases --src-dir and -d). For example, --source-dir src --resources fig1.png inserts src/fig1.png in the file system as fig1.png in the container. Additionally, the names mimetype, package.opf, and any name beginning with META-INF are reserved in the EPUB container, and therefore cannot be used.
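The relationship between container names and source paths can be illustrated with a short Python sketch. It is a model of the rules stated above, not epubsynth's actual implementation; the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Names that are reserved in the EPUB container, as listed above.
RESERVED_NAMES = {"mimetype", "package.opf"}


def is_reserved(name):
    """True if `name` may not be used inside the EPUB container."""
    return name in RESERVED_NAMES or name.startswith("META-INF")


def source_path(source_dir, name):
    """Map a container NAME to the file system path it is read from."""
    if is_reserved(name):
        raise ValueError(f"reserved container name: {name}")
    return str(PurePosixPath(source_dir) / name)
```

For instance, `source_path("src", "fig1.png")` yields `src/fig1.png`, while the container name stays `fig1.png`.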
XHTML files specified in --spine, by default, have their contents inserted into the <body> element of a template; see Templating to learn how to configure this behaviour.
Finally, note that --toc has a default value, and therefore cannot be blank. Unlike for all other input file options, $SOURCE_DIR/$TOC need not exist in the file system, and if it does not, the table of contents will be automatically generated (see Table of Contents). If it does exist, its contents will, by default, be inserted into the <nav epub:type='toc'> element of a template; see Templating to learn how to configure this behaviour.
Document Metadata
Metadata in EPUB documents follows the Dublin Core vocabulary, and can be set using command-line options. The following options correspond to fields that are mandatory in EPUB:
- --dc-title=Untitled: Document title.
- --dc-identifier=urn:uuid:$RANDOM_UUID: Document identifier, either a URL or an RFC 8141 NID. Defaults to a random UUID, but because this field is expected to be consistent between document revisions, it is strongly recommended that you set this explicitly. To this end, epubsynth will emit a warning if this option is not set.
- --dc-language=en: Document language, as an RFC 5646 identifier.
Other options for optional metadata are
- --dc-creator NAME: Document author.
- --dc-contributor NAME...: List of non-author contributors.
- --dc-coverage COVERAGE: String describing the “spatial or temporal topic of the resource, spatial applicability of the resource, or jurisdiction under which the resource is relevant.”
- --dc-date DATE: ISO 8601 string for the document date. Mutually exclusive with --dc-date-now.
- --dc-date-now: Set the document’s dc:date to the current date and time. Mutually exclusive with --dc-date.
- --dc-description DESCRIPTION: Document description.
- --dc-publisher PUBLISHER: Document publisher.
- --dc-relation RELATION...: List of URIs (or other formal identifiers) of related resources. See the relevant specification.
- --dc-rights RIGHTS: Statement about intellectual property rights associated with the document.
- --dc-source SOURCE: A source from which this document is derived, ideally formally identified. See the relevant specification.
- --dc-subject SUBJECT: Document subject/topic.
- --dc-type TYPE: The “nature or genre of the resource.”
These cover all of the /elements/1.1/ namespace, except for dc:format, which is always set to application/epub+zip in EPUBs generated by epubsynth, and cannot be changed.
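Because --dc-identifier should stay the same between revisions and --dc-date expects an ISO 8601 string, a build script can compute both values before invoking epubsynth. The sketch below shows one way to do that in Python. The helper names and the seed URL are hypothetical, and deriving the identifier from a URL with a name-based UUID is a choice made here, not something epubsynth requires; which ISO 8601 variants epubsynth accepts is not stated in this README.

```python
import uuid
from datetime import datetime, timezone


def stable_identifier(seed):
    """Derive a urn:uuid identifier that is identical on every run for `seed`."""
    # uuid5 is name-based, so the same seed always gives the same UUID.
    return f"urn:uuid:{uuid.uuid5(uuid.NAMESPACE_URL, seed)}"


def iso_date(moment=None):
    """Format `moment` (default: now, in UTC) as an ISO 8601 string."""
    moment = moment or datetime.now(timezone.utc)
    return moment.isoformat(timespec="seconds")


if __name__ == "__main__":
    # The seed URL is a placeholder; any string that is stable for the
    # lifetime of the document will do.
    print(stable_identifier("https://example.org/books/a-book"))
    print(iso_date())
```

The printed values can then be passed to --dc-identifier and --dc-date respectively.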
Table of Contents
The table of contents for the EPUB document can be set in one of two ways: automatic generation and manual authoring.
Automatic Generation
If the file referred to by --toc does not exist inside --source-dir, a table of contents will be automatically generated. The option --toc-headings can be used to specify a list of headings, synchronised with the list of files in --spine, for the table of contents. For example,
--spine titlepage.xhtml chapter1.xhtml chapter2.xhtml \
--toc-headings "Title Page" "Chapter 1" "Chapter 2"
will result in the table of contents
- Title Page
- Chapter 1
- Chapter 2
Files in --spine can be omitted from the table of contents by setting their corresponding value in --toc-headings to the empty string "", as well as by making --toc-headings shorter than --spine. For example,
--spine titlepage.xhtml chapter1.xhtml chapter2.xhtml backpage.xhtml \
--toc-headings "" "Chapter 1" "Chapter 2"
will result in the table of contents
- Chapter 1
- Chapter 2
If --toc-headings is not set, the file names in --spine themselves are used as the headings. For example,
--spine chapter1.xhtml chapter2.xhtml
without --toc-headings will result in the table of contents
- chapter1.xhtml
- chapter2.xhtml
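The three rules above (paired headings, omission via empty or missing headings, and the file-name fallback) can be modelled in a few lines of Python. This is a sketch of the documented behaviour, not epubsynth's own code; the function name is hypothetical.

```python
def toc_entries(spine, headings=None):
    """Pair spine files with headings, following the rules described above."""
    if headings is None:
        # No --toc-headings: the file names themselves are the headings.
        return [(name, name) for name in spine]
    # zip() stops at the shorter list, so spine files without a heading are
    # omitted; empty headings are skipped explicitly. What epubsynth does
    # with more headings than files is not documented; this sketch simply
    # ignores the extras.
    return [(name, heading) for name, heading in zip(spine, headings) if heading]
```

With the second example above, `toc_entries(["titlepage.xhtml", "chapter1.xhtml", "chapter2.xhtml", "backpage.xhtml"], ["", "Chapter 1", "Chapter 2"])` keeps only the two chapters.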
Manual Authoring
If the file referred to by --toc exists inside --source-dir, it will be inserted into the EPUB container as the table of contents (after templating, as mentioned in the previous section and detailed in Templating). For example, the source file
<ol>
<li><a href='chapter1.xhtml'>First Chapter</a></li>
<li><a href='chapter2.xhtml'>Another Chapter</a>
<ol>
<li><a href='chapter2.xhtml#section1'>Some Section</a></li>
</ol>
</li>
</ol>
will result in the table of contents
- First Chapter
- Another Chapter
  - Some Section
See the relevant section of the EPUB 3 specification for what this file must adhere to.
Templating
There are two circumstances in which the contents of an input file are inserted into a template before being added to the EPUB container: XHTML files in --spine and the file in --toc (the latter regardless of whether the table of contents is automatically generated or manually authored).
For an XHTML file in --spine, the templates --template-xhtml and --template-html are used, which default to
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'>
{html}
</html>
and
<head>
<title>{dc_title}</title>
{stylelinks}
</head>
<body>
{file}
</body>
respectively. These are Python format strings for which the fields are
- html: the contents of --template-html after substitution,
- dc_title: the (escaped) value of --dc-title,
- stylelinks: a concatenation of <link> tags to each of the CSS stylesheets listed in --stylesheets, and
- file: the contents of the input file.
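To make the two-stage substitution concrete, here is a minimal Python sketch that applies the default templates with `str.format`. It assumes epubsynth substitutes fields in roughly this way; the exact `<link>` markup it generates is not shown in this README, so the one used here is a guess, and the function name `render` is hypothetical.

```python
from html import escape

# Defaults of --template-xhtml and --template-html, as listed above.
TEMPLATE_XHTML = (
    "<?xml version='1.0' encoding='utf-8'?>\n"
    "<!DOCTYPE html>\n"
    "<html xmlns='http://www.w3.org/1999/xhtml'>\n"
    "{html}\n"
    "</html>\n"
)
TEMPLATE_HTML = (
    "<head>\n"
    "<title>{dc_title}</title>\n"
    "{stylelinks}\n"
    "</head>\n"
    "<body>\n"
    "{file}\n"
    "</body>"
)


def render(file, dc_title, stylesheets):
    """Fill the inner template first, then the outer one."""
    # Assumed <link> markup; epubsynth's real output may differ.
    stylelinks = "\n".join(
        f"<link rel='stylesheet' href='{escape(name)}'/>" for name in stylesheets
    )
    fields = {"file": file, "dc_title": escape(dc_title), "stylelinks": stylelinks}
    fields["html"] = TEMPLATE_HTML.format(**fields)
    return TEMPLATE_XHTML.format(**fields)
```

Since these are Python format strings, literal braces in a custom template (for example in inline CSS) have to be doubled as `{{` and `}}`. Braces inside the substituted file contents are left alone, because `str.format` does not rescan the values it inserts.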
For the --toc file, the templates --template-toc and --template-toc-nav are used, which default to
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'
xmlns:epub='http://www.idpf.org/2007/ops'>
<head>
<title>{dc_title}</title>
</head>
<body>
{nav}
</body>
</html>
and
<nav epub:type='toc'>
{file}
</nav>
respectively. The fields are nav, the contents of --template-toc-nav after substitution, as well as dc_title and file with the same meaning as before.
As an example, to disable all templating, one would set --template-xhtml "{file}" --template-toc "{file}".