The EuroTeX2003 paper about Bibulus

(Converted to HTML and slightly modified.)

Background

BibTeX is a great tool -- something which is demonstrated by the fact that it is still being used today, almost twenty years after it was created -- but it has some problems. Some of these have been solved by additional packages:

However, to the best of my knowledge, there are still problems which are not satisfactorily handled by any existing programs:

Enter Bibulus

Bibulus is an attempt to address the problems listed above. The following sections introduce some of its main features.

XML

Bibulus requires its databases to be in Bibulus XML, which is specified in a DTD bundled with Bibulus. (The Bibulus DTD was inspired by the one described in The LaTeX Web Companion and by various bibliographic DTDs that can be found on the Internet.) This is not as problematic as it sounds, for two reasons.

First, Bibulus comes with a program which will convert old BibTeX databases to XML.

Second, data in XML is generally very easy to convert to another kind of XML, for instance by means of an XSLT script. Although no such scripts are included in the Bibulus module at the moment, they could easily be added if there were a need for them, for instance in order to use XML bibliographies conforming to other bibliographic DTDs with Bibulus.
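As a rough illustration of such a conversion, the sketch below maps a record from an invented source format onto element names that appear in the Bibulus XML examples in this paper. It uses Python's xml.etree.ElementTree rather than XSLT, the source element names (record, creator, forename, surname, heading, range) are hypothetical, and the result is not checked against the Bibulus DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical source record; these element names are invented for the example.
SOURCE = """
<record key="doe99">
  <creator><forename>Jane</forename><surname>Doe</surname></creator>
  <heading>An example title</heading>
  <range>10-20</range>
</record>
"""


def to_bibulus_like(record):
    """Map one source record onto element names seen in the Bibulus examples."""
    article = ET.Element("article", id=record.get("key"))
    author = ET.SubElement(article, "author")
    name = ET.SubElement(author, "name")
    ET.SubElement(name, "given").text = record.findtext("creator/forename")
    ET.SubElement(name, "family").text = record.findtext("creator/surname")
    ET.SubElement(article, "title").text = record.findtext("heading")
    ET.SubElement(article, "pages").text = record.findtext("range")
    return article


converted = to_bibulus_like(ET.fromstring(SOURCE))
print(ET.tostring(converted, encoding="unicode"))
```

An XSLT stylesheet would express the same mapping declaratively, with one template per source element.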

Unicode

Bibulus is truly multilingual. It uses Unicode internally, but it can both read and write other character sets.
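The general pattern of decoding on input, working in Unicode, and encoding on output can be sketched as follows. This is an illustration in Python, not Bibulus's own Perl code; the sample string and the choice of fallback for unencodable characters are assumptions made for the example.

```python
# Bytes as they might arrive from a Latin-1 encoded file (sample data).
latin1_bytes = "Cæsar, Gaius Iulius".encode("latin-1")

# Read: decode into a Unicode string for all internal processing.
text = latin1_bytes.decode("latin-1")
surname = text.split(",")[0]

# Write: encode to whatever character set the output needs.
utf8_bytes = surname.encode("utf-8")

# Characters missing from the target set need a fallback;
# here they become XML character references.
ascii_bytes = surname.encode("ascii", errors="xmlcharrefreplace")

print(utf8_bytes, ascii_bytes)
```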

Perl

Bibulus is written in pure Perl. This means it is very portable, as Perl is available on most platforms.

Not just LaTeX

To Bibulus, LaTeX is just yet another input/output format. If you want, you can also get your bibliography in pure ASCII or in HTML, and other formats are easily added (for instance, as suggested by a reviewer, it would be easy to make Bibulus output BibTeX databases).

Specifying the style in your document

Bibliography styles can be defined in the LaTeX file. While \bibliographystyle definitions are still supported, one can also use this new format:
  \bibulus{citationstyle=numerical,
           surname=comes-first,
           givennames=initials,
           blockpunctuation=.}
Of course, many more options are possible.

Entries can also be specified or modified within the LaTeX document. For instance, the following command will add a note with the text "Great!" to the entry called sample.

  \bibulusadd{sample}{note}{Great!}

Open Source

Bibulus is released under the GNU General Public License, which among other things means that you get the source code and are free to make any changes you want. Bibulus is very much a work in progress, both in the sense that many features have not been implemented yet and in the sense that there is a good chance your requests will be implemented.

Name

"Bibulus" means fond of drink, thirsty in Latin. Furthermore, M. Calpurnius Bibulus was consul in Rome together with C. Iulius Cæsar in the year 59 BC.

The name was given in the hope that the program will be fond of "drinking" many books, and that it will rule together with the best typesetting systems (hopefully more happily than its ancient namesake).

Tour of Bibulus

In the following we shall have a closer look at various aspects of Bibulus.

Converting BibTeX databases

Bibulus comes with a conversion program called bib2xml which will convert a BibTeX database to Bibulus XML. As an example, let us convert the file xampl.bib which comes with BibTeX.
> bib2xml xampl
This is BibTeX, Version 0.99c (Web2C 7.3.1)
The top-level auxiliary file: tmp6600.aux
The style file: bib2xml.bst
Database file #1: xampl.bib
Odd edition number: Silver

It should be clear from the above that bib2xml calls BibTeX to do its job, thus ensuring it can parse all documents that BibTeX can. It produced one warning, since Bibulus assumes that editions can only be numbers, but xampl.bib contains a "silver edition".

The result of this is a file, xampl.xml, which conforms to the Bibulus DTD.

For instance, in the original file there is an entry which looks like this:

@PHDTHESIS{phdthesis-full,
   author = "F. Phidias Phony-Baloney",
   title = "Fighting Fire with Fire:
            Festooning {F}rench Phrases",
   school = "Fanstord University",
   type = "{PhD} Dissertation",
   address = "Department of French",
   month = jun # "-" # aug,
   year = 1988,
   note = "This is a full PHDTHESIS entry",
}

In Bibulus XML, this has become:

<thesis id="phdthesis-full" type="phd">
  <author>
    <name gender="unknown"
          nametype="familylast">
      <given>F. Phidias</given>
      <family>Phony-Baloney</family>
    </name>
  </author>
  <title>Fighting fire with fire:
         Festooning French
         phrases</title>
  <institution>Fanstord
               University</institution>
  <place>Department of French</place>
  <year month="8">1988</year>
  <note>This is a full
        PHDTHESIS entry</note>
</thesis>

Some notes:

Let us look at a further example from the same file.

@ARTICLE{article-full,
   author = {L[eslie] A. Aamport},
   title = {The Gnats and Gnus Document
            Preparation System},
   journal = {\mbox{G-Animal's} Journal},
   year = 1986,
   volume = 41,
   number = 7,
   pages = "73+",
   month = jul,
   note = "This is a full ARTICLE entry",
}

This is transformed by bib2xml to the following:

<article id="article-full">
  <crossref id="article-full-PART2"/>
  <author>
    <name gender="unknown"
          nametype="familylast">
      <given>L[eslie] A.</given>
      <family>Aamport</family>
    </name>
  </author>
  <title>The gnats and gnus document
         preparation system</title>
  <pages>73+</pages>
  <note>This is a full ARTICLE
        entry</note>
</article>
<magazine id="article-full-PART2">
  <journal>G-Animal's Journal</journal>
  <volume>41</volume>
  <number>7</number>
  <year month="7">1986</year>
</magazine>

Most of this is hardly surprising by now, except for the fact that the entry has been split into two. This does not affect the output since Bibulus (like BibTeX) will inline cross-references that are only used a limited number of times (specified by the user). This allows for a significant simplification of the DTD.
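The inlining of rarely used cross-references can be pictured with the following sketch. It is not Bibulus's implementation: the bibliography root element, the merge strategy (copying the target's child elements into the citing entry) and the limit parameter are all assumptions made for the example.

```python
import copy
import xml.etree.ElementTree as ET
from collections import Counter

# The <bibliography> wrapper is an assumed root element for this sketch.
DATABASE = """
<bibliography>
  <article id="article-full">
    <crossref id="article-full-PART2"/>
    <title>The gnats and gnus document preparation system</title>
    <pages>73+</pages>
  </article>
  <magazine id="article-full-PART2">
    <journal>G-Animal's Journal</journal>
    <volume>41</volume>
  </magazine>
</bibliography>
"""


def inline_crossrefs(root, limit=1):
    """Inline a cross-referenced entry when at most `limit` entries cite it."""
    targets = {entry.get("id"): entry for entry in root}
    uses = Counter(ref.get("id") for ref in root.iter("crossref"))
    for entry in list(root):
        for ref in entry.findall("crossref"):
            target = targets.get(ref.get("id"))
            if target is not None and uses[ref.get("id")] <= limit:
                # Replace the reference by copies of the target's fields.
                entry.remove(ref)
                entry.extend([copy.deepcopy(child) for child in target])


root = ET.fromstring(DATABASE)
inline_crossrefs(root)
print(ET.tostring(root.find("article"), encoding="unicode"))
```

With limit=0 nothing is inlined, so the cross-reference stays in place and the shared entry would be listed separately.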

Editing

There is no Bibulus editor (for the time being), but there exist many XML editors, all of which ought to work well with Bibulus XML. However, Bibulus XML is really not any more complicated than BibTeX databases, so it is also quite feasible to edit the files in a plain text editor.

The same situation holds for validation, i.e., checking that an XML file conforms to the definitions in the DTD: There is no Bibulus validator, but many standard tools can be used, and it is highly recommended to validate Bibulus bibliographic databases in this way instead of relying on built-in error handling.

Notes and annotations in the text

BibTeX requires us to write notes and annotations in the bibliographic database, but there are problems with this approach. Annotations are typically unique to each bibliography (this is often true for notes, too). The bibliographic database is therefore the wrong place to specify them -- it should be done in the main text instead. Furthermore, these fields require translation when the document is translated, something which is much easier if they are kept together with the main text. For backwards compatibility, Bibulus allows both approaches.

Transliterations and translations

One of the most important raisons d'être for a bibliography formatting system is to make it possible to define an entry once and then extract it in many different formats. To achieve this, Bibulus is able to transliterate names and titles automatically, and it is possible to add translations of titles, either in the XML database or in the LaTeX source.
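Automatic transliteration is, at its simplest, a character-by-character mapping. The sketch below shows the idea in Python with a deliberately tiny Cyrillic-to-Latin table; the table and the sample name are assumptions for illustration and say nothing about the transliteration schemes Bibulus actually uses.

```python
# A deliberately tiny Cyrillic-to-Latin table; a real scheme covers the
# whole alphabet and may map one letter to several.
TABLE = str.maketrans({
    "Д": "D", "о": "o", "с": "s", "т": "t", "е": "e",
    "в": "v", "к": "k", "и": "i", "й": "j",
})


def transliterate(text):
    """Replace every character found in TABLE; others pass through unchanged."""
    return text.translate(TABLE)


print(transliterate("Достоевский"))
```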

Migrating to Bibulus

In the following we shall see how a LaTeX user can move from BibTeX to Bibulus.

Getting started

The very first step is to convert the BibTeX databases to Bibulus XML, as described above.

Without making any changes to the LaTeX document, one can then start to use bibulustex instead of bibtex. If a standard bibliography style is used (e.g., plain), this should produce equivalent output.

However, only a few bibliography styles have been defined so far, so this alone will probably not cover every need.

Farewell to \bibliographystyle

The second step is to use the \bibulus command in LaTeX to define the style of the bibliography.

The default is a style close to BibTeX plain, so only options that differ need to be defined. For instance, if one wants alpha labels (first letters of the last name + last two digits of the year) instead of numerical labels and furthermore wants author names to be written in small caps, one can just write the following in the LaTeX document:

  \bibulus{citationstyle=alpha,
           namefont=sc}
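The alpha label rule described above can be written down as a small function. This is a sketch under stated assumptions: the number of letters taken from the name (three) is chosen for the example, and entries with several authors are not handled.

```python
def alpha_label(family_name, year, letters=3):
    """First letters of the family name plus the last two digits of the year."""
    initials = "".join(ch for ch in family_name if ch.isalpha())[:letters]
    return f"{initials}{year % 100:02d}"


print(alpha_label("Knuth", 1986))
print(alpha_label("Phony-Baloney", 1988))
```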

More Bibulus commands in LaTeX

The next step is to start to use the LaTeX commands for manipulating the bibliography, e.g., by adding notes, annotations or translations of titles.

It is also possible to create an alias for an entry if one is not happy with its ID in the XML file. For instance, if The TeXbook is stored with the ID knuth86, one might want to issue the following command:

 \bibalias{knuth86}{texbook}
After this, citing knuth86 and texbook will be fully equivalent.
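One way to picture such an alias is as an extra lookup step before the entry is fetched. The sketch below is a Python illustration only; the entries and aliases tables and the function names are invented for the example and are not part of Bibulus.

```python
# Hypothetical in-memory database keyed by entry ID.
entries = {"knuth86": {"title": "The TeXbook"}}
aliases = {}


def add_alias(entry_id, alias):
    """Record that `alias` refers to the entry stored under `entry_id`."""
    aliases[alias] = entry_id


def lookup(key):
    """Resolve an alias, if any, and return the entry."""
    return entries[aliases.get(key, key)]


add_alias("knuth86", "texbook")
print(lookup("texbook")["title"])
```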

Goodbye to bibulustex

All that bibulustex really does is to get the filename from the command line and then do the following:
my $bib = new Bibulus::LaTeX;
$bib->procaux($filename);
# Write the formatted bibliography to the .bbl file read by LaTeX.
open (my $bbl, '>', "$filename.bbl")
 or die "Could not write $filename.bbl: $!\n";
print {$bbl} $bib->getbib;
close $bbl;

If more functionality is needed, one can thus make a personalised version of bibulustex.

For instance, to output all years ab urbe condita (after the foundation of Rome), add the following after the first line:

$bib->whenparsing('year',
                  sub {
                    # AD 1 corresponds to 754 AUC, so add 753.
                    return $_[0] + 753;
                  });

Most of this is just Perl code. What matters is the following: the code within the sub is executed when a <year> is encountered; the contents of this XML chunk are passed to the sub in $_[0]; and whatever the sub returns replaces the old contents of the chunk.

This is a very simple and silly example, but the possibilities are endless, especially as the full power of Perl is available.
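The hook mechanism described above (a callback registered for an element name, called with the element's contents, whose return value replaces those contents) can be pictured with a small stand-alone sketch. The HookedParser class and its method names are invented for this illustration and are not the Bibulus API.

```python
import xml.etree.ElementTree as ET


class HookedParser:
    """Tiny stand-in for a parser with per-element callbacks."""

    def __init__(self):
        self.hooks = {}

    def when_parsing(self, tag, callback):
        """Register `callback` to rewrite the contents of every <tag>."""
        self.hooks[tag] = callback

    def parse(self, xml_text):
        root = ET.fromstring(xml_text)
        for element in root.iter():
            hook = self.hooks.get(element.tag)
            if hook is not None:
                # The return value replaces the old contents.
                element.text = str(hook(element.text))
        return root


parser = HookedParser()
parser.when_parsing("title", lambda contents: contents.upper())
entry = parser.parse(
    '<thesis id="phdthesis-full">'
    "<title>Fighting fire with fire</title>"
    '<year month="8">1988</year>'
    "</thesis>"
)
print(entry.findtext("title"))
```

Elements without a registered hook, such as the year here, pass through unchanged.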

Extending Bibulus

If the built-in extension hooks do not provide enough freedom, one can extend Bibulus quite easily.

Create a file called, say, myBibulus.pm and put the following into it:

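package myBibulus;
use Bibulus::LaTeX;
our @ISA = qw(Bibulus::LaTeX);

# Single quotes keep the backslash; "\par " would yield "par ".
sub newblock {
  return '\par ';
}

1;  # a module file must end with a true value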

Now replace the line

 my $bib = new Bibulus::LaTeX;
with
 my $bib = new myBibulus;
in your personalised version of bibulustex, and the bibliographies produced will now have the blocks separated by \par instead of \newblock.

Any internal Bibulus function can be overridden in this way.

Final words

This has been a brief introduction to the main features of Bibulus. As the program is being developed continuously, there may be features available now that have not been described in this paper, and the syntax of certain commands might have changed slightly.

For more information, please visit the project's website.

There are also two mailing lists, one for developers and one for users -- please consider joining one of them.

Bibulus still has some way to go, but with the help of the user community, we can do it!