How to parse dblp.xml?

The dblp.xml file is a simple, plain ASCII XML file. It uses the named entities declared in the accompanying dblp.dtd file. Daily updated XML dumps as well as persistent snapshot releases are available on the dblp web server:

Detailed information on the XML structure of the dblp records and several design decisions can be found in the following paper:

The dblp.xml file can be parsed by essentially any out-of-the-box XML parser.
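
To illustrate this, here is a minimal sketch that uses the SAX parser shipped with the JDK. It counts the elements directly below the root element, grouped by element name. The class name DblpSaxSketch, the method countRecords, and the inline sample document are hypothetical names chosen for illustration; they are not part of the dblp example code. The sketch assumes that the DTD named in the DOCTYPE declaration lies in the same directory as the XML file, and that for the complete file the JDK's entity expansion limit may have to be raised (for instance with -Djdk.xml.entityExpansionLimit=0).

```java
import java.io.File;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DblpSaxSketch {

    // Tiny stand-in document (hypothetical data). It declares one named entity
    // inline, in place of the many entities declared in dblp.dtd.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE dblp [<!ENTITY uuml \"&#252;\">]>\n"
        + "<dblp>"
        + "<article key=\"x/1\"><author>A. M&uuml;ller</author><title>T1</title></article>"
        + "<article key=\"x/2\"><title>T2</title></article>"
        + "<inproceedings key=\"y/1\"><title>T3</title></inproceedings>"
        + "</dblp>";

    /** Counts the direct children of the root element, grouped by element name. */
    static Map<String, Integer> countRecords(InputSource source) throws Exception {
        final Map<String, Integer> counts = new TreeMap<>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(source, new DefaultHandler() {
            private int depth = 0; // 0 = outside the root, 1 = directly below the root

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (depth == 1) {
                    counts.merge(qName, 1, Integer::sum);
                }
                depth++;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                depth--;
            }
        });
        return counts;
    }

    public static void main(String[] args) throws Exception {
        InputSource source;
        if (args.length > 0) {
            // Passing a system id lets the parser resolve a relative DTD
            // reference against the location of the XML file.
            // Assumption: the complete dblp.xml may exceed the JDK's default
            // entity expansion limit; -Djdk.xml.entityExpansionLimit=0 lifts it.
            source = new InputSource(new File(args[0]).toURI().toString());
        } else {
            source = new InputSource(new StringReader(SAMPLE));
        }
        System.out.println(countRecords(source));
    }
}
```

A streaming SAX handler is used here because it keeps only the counters in memory, which matters for a file of this size; a DOM parser would have to hold the entire document tree.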

Example parser

As an example, we provide a simple main-memory data structure, written in Java, for parsing and querying the complete dblp data. The code in this section has been tested using the following environment:

Note that Java 8 or later is required to run our example parser; the code will not work with earlier Java versions.

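Please download the example parser source (DblpExampleParser.java), the mmdb library (mmdb-2016-12-09.jar), a dblp XML dump (dblp-2016-12-01.xml.gz), and the DTD (dblp-2016-10-01.dtd) from our web server into a local directory. For example, you may run the following command: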

wget http://dblp.org/src/DblpExampleParser.java \
	http://dblp.org/src/mmdb-2016-12-09.jar \
	http://dblp.org/xml/release/dblp-2016-12-01.xml.gz \
	http://dblp.org/xml/release/dblp-2016-10-01.dtd

Unzip the dblp.xml.gz file using:

gunzip dblp-2016-12-01.xml.gz
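
Where gunzip is not available, the same step can be sketched in Java using only the JDK's java.util.zip classes. The class name GunzipSketch and its gunzip method are hypothetical names chosen for illustration; they are not part of the dblp example code. Unlike the gunzip command, this sketch leaves the compressed file in place.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;

class GunzipSketch {

    /** Decompresses gzFile into target, replacing target if it already exists. */
    static void gunzip(Path gzFile, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile))) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java GunzipSketch <file.gz>");
            return;
        }
        String name = args[0];
        // Derive the output name by dropping a trailing ".gz", if present.
        String outName = name.endsWith(".gz")
            ? name.substring(0, name.length() - 3)
            : name + ".out";
        gunzip(Paths.get(name), Paths.get(outName));
    }
}
```

For example, running java GunzipSketch dblp-2016-12-01.xml.gz would write dblp-2016-12-01.xml next to the compressed file.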

Compile the parser:

javac -cp mmdb-2016-12-09.jar DblpExampleParser.java

Run the example application:

java -cp mmdb-2016-12-09.jar:. DblpExampleParser dblp-2016-12-01.xml

The JavaDoc pages for the org.dblp.mmdb package are also available for download:

maintained by Schloss Dagstuhl LZI at University of Trier