ProvBase
ProvBase is an experimental distributed database management system for efficient storage and querying of large scientific workflow provenance datasets represented in the Resource Description Framework (RDF). ProvBase can currently be deployed on an HBase/Hadoop cluster or a MySQL cluster. The HBase flavor of ProvBase builds on more recent cloud computing technologies to provide efficient, distributed triple pattern matching and evaluation of SPARQL queries over an HBase provenance database. The MySQL flavor of ProvBase uses more traditional distributed relational database technology: it stores RDF data in relations and queries it with SQL produced by a semantics-preserving SPARQL-to-SQL translation.
ProvBase - HBase/Hadoop Cluster
- A scalable, distributed database built using cloud computing technologies.
- Hosts large amounts of data using commodity hardware.
- Uses a customizable storage schema and unique query optimizations.
- Builds on top of the HBase/Hadoop infrastructure.
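
To make "triple pattern matching" over a key-ordered store concrete, here is a minimal in-memory sketch. It is not ProvBase's actual storage schema, which this page does not describe. It assumes a common layout in which each triple is stored under three key orderings (SPO, POS, OSP) so that any pattern with bound terms becomes a prefix scan, similar in spirit to a row-key prefix scan in HBase. The class name `TripleIndex`, the separator, and the sample terms are all hypothetical.

```python
from bisect import bisect_left

SEP = "\x00"  # key separator; assumed never to occur inside a term


class TripleIndex:
    """In-memory stand-in for three key-ordered tables (hypothetical layout)."""

    # Each ordering lists which triple position (0=s, 1=p, 2=o) comes first in the key.
    ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}

    def __init__(self, triples):
        self.tables = {
            name: sorted("".join(t[i] + SEP for i in order) for t in triples)
            for name, order in self.ORDERS.items()
        }

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None marks a variable."""
        pattern = (s, p, o)
        bound = sum(term is not None for term in pattern)
        # Pick the ordering whose key starts with exactly the bound terms.
        for name, order in self.ORDERS.items():
            prefix_terms = []
            for i in order:
                if pattern[i] is None:
                    break
                prefix_terms.append(pattern[i])
            if len(prefix_terms) == bound:
                break
        prefix = "".join(term + SEP for term in prefix_terms)
        table = self.tables[name]
        results = []
        # Prefix scan: matching keys are contiguous in the sorted table.
        for key in table[bisect_left(table, prefix):]:
            if not key.startswith(prefix):
                break
            parts = key.split(SEP)[:3]
            triple = [None] * 3
            for pos, i in enumerate(order):
                triple[i] = parts[pos]
            results.append(tuple(triple))
        return results


# Hypothetical provenance triples, for illustration only.
index = TripleIndex([
    ("run1", "used", "fileA"),
    ("run1", "generated", "fileB"),
    ("run2", "used", "fileB"),
    ("run10", "used", "fileC"),
])
used_file_b = index.match(p="used", o="fileB")
```

Here `index.match(p="used", o="fileB")` is answered from the POS ordering with a single prefix scan. In a distributed store, each ordering would be a separate table and the scan would be served by whichever nodes hold that key range.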
ProvBase - MySQL Cluster
- A scalable, distributed database built using distributed relational database technology.
- Features schema-oblivious, schema-aware, and data-driven approaches to database schema generation.
- Uses a semantics-preserving SPARQL-to-SQL translation.
- Builds on top of MySQL Cluster.
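
As an illustration of the schema-oblivious approach and of SPARQL-to-SQL translation, the sketch below stores all triples in one generic three-column relation and translates a SPARQL basic graph pattern into a SQL self-join. It is a simplified example under stated assumptions, not ProvBase's translation algorithm: the table name `triples`, the function `bgp_to_sql`, and the sample data are hypothetical, and SQLite stands in for MySQL Cluster so the example is self-contained.

```python
import sqlite3


def bgp_to_sql(patterns, select_vars):
    """Translate a basic graph pattern into SQL over triples(s, p, o).

    patterns: list of (s, p, o) tuples; terms starting with '?' are variables.
    Returns (sql, params). Uses '?' placeholders (SQLite style); MySQL
    connectors typically use '%s' instead.
    """
    first_seen = {}  # variable name -> column where it was first bound
    conditions, params = [], []
    for i, pattern in enumerate(patterns):
        for col, term in zip("spo", pattern):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in first_seen:
                    # Repeated variable: becomes a join condition.
                    conditions.append(f"{ref} = {first_seen[term]}")
                else:
                    first_seen[term] = ref
            else:
                # Constant term: becomes a selection condition.
                conditions.append(f"{ref} = ?")
                params.append(term)
    select = ", ".join(f'{first_seen[v]} AS "{v[1:]}"' for v in select_vars)
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    where = " AND ".join(conditions) or "1 = 1"
    return f"SELECT {select} FROM {tables} WHERE {where}", params


# Schema-oblivious storage: one relation, regardless of the RDF vocabulary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT, PRIMARY KEY (s, p, o))")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [  # hypothetical provenance triples
        ("run1", "generated", "fileB"),
        ("run1", "used", "fileA"),
        ("run2", "used", "fileB"),
        ("run3", "used", "fileA"),
    ],
)

# SPARQL: SELECT ?r WHERE { run1 generated ?f . ?r used ?f }
sql, params = bgp_to_sql(
    [("run1", "generated", "?f"), ("?r", "used", "?f")], ["?r"]
)
rows = conn.execute(sql, params).fetchall()
```

Each triple pattern maps to one alias of the `triples` relation, constants become selections, and a shared variable such as `?f` becomes a join condition. Because the relation holds no duplicate triples, the join returns the same multiset of bindings as the SPARQL pattern. Schema-aware and data-driven approaches would instead derive several relations from the ontology or from the data itself.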

