10M RDF Triples

A conversion of the ~10M RDF triples published by UMBC, taken from the
original SQL dump, with some fixes and filtering to give legal syntax.

Outputs:
  triples.nt.bz2   N-Triples version - 10,429,951 triples
  triples.rdf.bz2  RDF/XML version   - 10,429,947 triples


Processing Notes

I started from the data described at
  http://ebiquity.umbc.edu/resource/html/id/126
in file
  http://ebiquity.umbc.edu/share/triples.sql.gz
which is a SQL database dump (rather useless in itself):

  -rw-r--r-- 1 dajobe dajobe 1799342725 Jun 16 13:44 triples.sql

I loaded it into a MySQL database and wrote a custom Perl DBI script,
mysql2ntriples, to turn it into legal N-Triples. It generated the
N-Triples (updated 2006-04-14 to fix bad \u hex):

  -rw-r--r-- 1 dajobe dajobe 1625025120 Apr 14 11:59 triples.nt

which via rapper (see Makefile) generated the RDF/XML:

  -rw-r--r-- 1 dajobe dajobe 2117989148 Mar 14 12:04 triples.rdf

However, the N-Triples data contains triples that cannot be written in
XML, such as ones referring to ASCII 7 in literals, which isn't allowed
in XML 1.0 character data. The RDF/XML conversion also discards some
triples with predicates that cannot be written in RDF/XML, such as:

  http://example.org/#
  http://www.w3.org/2000/10/swap/pim/doc.n3#@@
  http://example.org/property/

Only the syntax is valid; the data also contains invented RDF namespace
terms such as:

  rdf:date
  rdf:testCsae
  rdf:asserts
  rdf:range
  rdf:domain
  rdf:dtype

See Makefile for the actual operation for making triples.rdf from
triples.nt using rapper (from http://librdf.org/raptor/)

Dave Beckett
http://purl.org/net/dajobe/
2006-03-14
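
Illustrative sketch (not part of the original tooling): the two reasons
given above for triples missing from the RDF/XML output can be
approximated in a few lines of Python. The function names
literal_is_xml10_safe and predicate_has_local_name are hypothetical, the
local-name test is an ASCII-only simplification of the XML NCName rule,
and none of this is meant to reproduce what rapper does internally.

```python
import re

# Characters outside the XML 1.0 "Char" production: anything other than
# tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF.
_XML10_INVALID = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# ASSUMPTION: ASCII-only approximation of an XML NCName at the very end
# of the URI. Real NCNames also allow many non-ASCII characters.
_LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*\Z")


def literal_is_xml10_safe(text: str) -> bool:
    """True if every character of the literal may appear in XML 1.0 text."""
    return _XML10_INVALID.search(text) is None


def predicate_has_local_name(uri: str) -> bool:
    """True if the URI ends in something usable as an XML element name.

    RDF/XML writes each predicate as an element, so the URI must split
    into a namespace part plus a non-empty local name.
    """
    return _LOCAL_NAME.search(uri) is not None


if __name__ == "__main__":
    # ASCII 7 (BEL) is the example mentioned in the notes above.
    print(literal_is_xml10_safe("bell\x07"))
    for uri in (
        "http://example.org/#",
        "http://www.w3.org/2000/10/swap/pim/doc.n3#@@",
        "http://example.org/property/",
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    ):
        print(uri, predicate_has_local_name(uri))
```

A filter along these lines could be run over triples.nt before
conversion to list the affected triples, assuming the literals and
predicate URIs have already been parsed out of each N-Triples line.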