The rdf4j-spring module allows for using an RDF4J repository as the data backend of a spring application. A self-contained demo application can be found at rdf4j-spring-demo.
To use an RDF4J repository as the data backend of a spring application built with Maven, add this dependency:
<dependency>
<groupId>org.eclipse.rdf4j</groupId>
<artifactId>rdf4j-spring</artifactId>
<version>${rdf4j.version}</version>
</dependency>
…, setting the property rdf4j.version to the RDF4J version you want (minimum 4.0.0).
In order for the application to run, a repository has to be configured.
To configure the application to access an existing repository, set the following configuration properties, e.g. in application.properties:
rdf4j.spring.repository.remote.manager-url=http://localhost:7200
rdf4j.spring.repository.remote.name=myrepo
# optionally, provide username and password
rdf4j.spring.repository.remote.username=admin
rdf4j.spring.repository.remote.password=1234
To use an in-memory repository (for example, for unit tests), use
rdf4j.spring.repository.inmemory.enabled=true
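With these properties in place, bean creation is handled by Spring's auto-configuration (see the Configuration section below), so a plain Spring Boot application class suffices to bootstrap repository access. A minimal sketch, assuming a Spring Boot setup as in the demo; the class name is illustrative:
@SpringBootApplication
public class MyApplication {
    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
    }
}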
The main purpose of rdf4j-spring is to support accessing an RDF4J repository using the DAO pattern. DAOs are subclasses of RDF4JDao and use the RDF4JTemplate for accessing the RDF4J repository configured for the application.
The RDF4JTemplate is the class used to access a Repository in rdf4j-spring. A bean of this type is configured at start up and is available for wiring into beans. The RDF4JTemplate accesses the Repository through a RepositoryConnection that it obtains from a RepositoryConnectionFactory. This indirection allows for using a connection pool, connecting RDF4J to spring's transaction management, and providing query logging to a file or exposing query statistics via JMX. These features can be enabled/disabled using configuration properties (see Configuration).
To use the RDF4JTemplate in a bean, define that bean in the spring application's configuration and wire the RDF4JTemplate in:
@Configuration
@Import(RDF4JConfig.class)
public class MyAppConfig {
@Bean
public MyBeanClass getMyBean(@Autowired RDF4JTemplate template){
return new MyBeanClass(template);
}
}
public class MyBeanClass {
private RDF4JTemplate rdf4JTemplate;
public MyBeanClass(RDF4JTemplate template){
this.rdf4JTemplate = template;
}
}
The RDF4JTemplate offers various ways to access the repository. For example, to evaluate a TupleQuery using the RDF4JTemplate (in this case, counting all triples):
int count = rdf4JTemplate
.tupleQuery("SELECT (count(?a) as ?cnt) WHERE { ?a ?b ?c }")
.evaluateAndConvert()
.toSingleton(bs -> TypeMappingUtils.toInt(QueryResultUtils.getValue(bs, "cnt")));
The query, provided through the tupleQuery method, is executed with the call to evaluateAndConvert(), which returns a TupleQueryResultConverter. The latter provides methods for converting the TupleQueryResult of the query into an object, an Optional, a Map, Set, List, or Stream. In the example, we are just interested in the count as an int (one single object), so we use the toSingleton() method and convert the value of the projection variable to an int. The conversion is done using TypeMappingUtils; the extraction of the variable's value from the BindingSet bs is done using QueryResultUtils.
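For instance, a two-variable result can be streamed and collected into a Map. A sketch using the same utilities as above; the query and variable names are illustrative:
Map<IRI, String> surnamesByArtist = rdf4JTemplate
        .tupleQuery("PREFIX foaf: <http://xmlns.com/foaf/0.1/> "
                + "SELECT ?artist ?surname WHERE { ?artist foaf:surname ?surname }")
        .evaluateAndConvert()
        .toStream()
        .collect(Collectors.toMap(
                bs -> QueryResultUtils.getIRI(bs, "artist"),       // key: the artist IRI
                bs -> QueryResultUtils.getString(bs, "surname"))); // value: the surname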
For binding variables before executing a query or update, use the OperationBuilder returned by the tupleQuery(), graphQuery(), or update() methods. It provides various withBinding() methods following the builder pattern, allowing for binding variables, as illustrated in the following example.
Set<IRI> artists = rdf4JTemplate
.tupleQuery("PREFIX ex: <http://example.org/>"
+ "SELECT distinct ?artist "
+ "WHERE { ?artist a ?type }")
.withBinding("type", EX.Artist)
.evaluateAndConvert()
.toSet(bs -> QueryResultUtils.getIRI(bs, "artist"));
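The EX class used in these examples is a vocabulary class from the demo application. A minimal sketch of such a class, using RDF4J's Values utility:
public class EX {
    public static final String NAMESPACE = "http://example.org/";
    public static final IRI Artist = Values.iri(NAMESPACE, "Artist");
    public static final IRI Picasso = Values.iri(NAMESPACE, "Picasso");
    public static final IRI creatorOf = Values.iri(NAMESPACE, "creatorOf");
}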
To use the RepositoryConnection directly, without the need to generate a result, the consumeConnection() method is used:
rdf4JTemplate.consumeConnection(con -> con.remove(EX.Picasso, RDF.TYPE, EX.Artist));
Alternatively, to generate a result, the applyToConnection() method is used:
boolean isPresent = rdf4JTemplate.applyToConnection(
        con -> con.hasStatement(EX.Picasso, RDF.TYPE, EX.Artist, true));
For running queries or updates from external resources, the tupleQueryFromResource, graphQueryFromResource, and updateFromResource methods can be used. For example, the sparql/construct-artists.rq file on the classpath might contain this query:
PREFIX ex: <http://example.org/>
CONSTRUCT {?artist ?p ?o } WHERE { ?artist a ex:Artist; ?p ?o }
and could be evaluated using
Model model = rdf4JTemplate.graphQueryFromResource(
getClass(),
"classpath:sparql/construct-artists.rq")
.evaluateAndConvert()
.toModel();
The resource to be read is resolved by spring's ResourceLoader, which supports fully qualified URLs (e.g., file:// URLs), relative paths, and classpath: pseudo-URLs.
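Updates from resources work analogously. A sketch: the update file is hypothetical, and the terminal execute() call on the update builder is assumed:
rdf4JTemplate.updateFromResource(getClass(), "classpath:sparql/delete-artist.rq")
        .withBinding("artist", EX.Picasso)
        .execute();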
Any spring bean that uses the RDF4JTemplate can be seen as a DAO and participates in transactionality, query logging, caching, etc. However, rdf4j-spring offers a few base classes that provide frequently used functionality.
RDF4JDao is a suitable base class for a general-purpose DAO. It provides two functionalities to subclasses:
- The RDF4JTemplate is automatically wired into the bean and is available through getRDF4JTemplate().
- It provides a simple management facility for SPARQL query/update strings. This allows SPARQL query strings to be generated only once (by String concatenation, read from a file, or built with the SparqlBuilder). The queries are prepared in the template method prepareNamedSparqlSuppliers():
In the following example, we:
- subclass RDF4JDao
- annotate the class with @Component so it gets auto-detected during Spring's component scan
- define a static class QUERY_KEYS as a container for the String constants we use for query keys
- override the prepareNamedSparqlSuppliers method and add one query
- use the prepared query in a method (getArtistsWithoutPaintings()). We access it with getNamedTupleQuery(String), passing the constant we defined in QUERY_KEYS.
@Component // make the DAO a spring component so it's auto-detected in the classpath scan
public class ArtistDao extends RDF4JDao {
// constructor, other methods etc
// recommended: encapsulate the keys for queries in an object
// so it's easier to find them when you need them
static abstract class QUERY_KEYS {
public static final String ARTISTS_WITHOUT_PAINTINGS = "artists-without-paintings";
}
// prepare the named queries, assigning each one of the keys
@Override
protected NamedSparqlSupplierPreparer prepareNamedSparqlSuppliers(NamedSparqlSupplierPreparer preparer) {
return preparer
.forKey(QUERY_KEYS.ARTISTS_WITHOUT_PAINTINGS)
.supplySparql(Queries.SELECT(
ARTIST_ID)
.where(
ARTIST_ID.isA(iri(EX.Artist))
.and(ARTIST_ID.has(iri(EX.creatorOf), Painting.PAINTING_ID).optional())
.filter(not(bound(Painting.PAINTING_ID)))).getQueryString()
);
}
// use the named query with getNamedTupleQuery(String)
public Set<Artist> getArtistsWithoutPaintings(){
return getNamedTupleQuery(QUERY_KEYS.ARTISTS_WITHOUT_PAINTINGS)
.evaluateAndConvert()
.toStream()
.map(bs -> QueryResultUtils.getIRI(bs, ARTIST_ID))
.map(iri -> getById(iri))
.collect(Collectors.toSet());
}
// ...
}
The SimpleRDF4JCRUDDao is a suitable base class for a DAO for creating, reading, updating, and deleting one class of entities. It requires two type parameters, ENTITY and ID. It provides create, read, update, and delete functionality for the ENTITY class, using the ID class wherever the entity's identifier is required.
Subclasses of SimpleRDF4JCRUDDao must implement a couple of template methods in order to customize the generic behaviour for the specific entity and id classes.
In the following, we use the entity Artist (as used in the demo application) as an example. Note that we define public constants of type Variable, one corresponding to each of the entity's fields.
public class Artist {
// recommended pattern: use a public Variable constant for each of the entity's fields
// for use in queries and result processing.
public static final Variable ARTIST_ID = SparqlBuilder.var("artist_id");
public static final Variable ARTIST_FIRST_NAME = SparqlBuilder.var("artist_firstName");
public static final Variable ARTIST_LAST_NAME = SparqlBuilder.var("artist_lastName");
private IRI id;
private String firstName;
private String lastName;
// getter, setter, constructor, ...
// be sure to implement equals() and hashCode() for proper behaviour of collections!
}
The ArtistDao is shown in the following code snippets. We recommend using @Component for auto-detection. Implementing the constructor is required.
@Component // again, make it a component (see above)
public class ArtistDao extends SimpleRDF4JCRUDDao<Artist, IRI> {
public ArtistDao(RDF4JTemplate rdf4JTemplate) {
super(rdf4JTemplate);
}
The populateIdBindings method is called by the superclass to bind the id to variable(s) in a SPARQL query.
@Override
protected void populateIdBindings(MutableBindings bindingsBuilder, IRI iri) {
bindingsBuilder.add(ARTIST_ID, iri);
}
The populateBindingsForUpdate method is called by the superclass to bind all non-id variables when performing an update.
@Override
protected void populateBindingsForUpdate(MutableBindings bindingsBuilder, Artist artist) {
bindingsBuilder
.add(ARTIST_FIRST_NAME, artist.getFirstName())
.add(ARTIST_LAST_NAME, artist.getLastName());
}
The mapSolution method converts a query solution, i.e., a BindingSet, to an instance of the entity.
@Override
protected Artist mapSolution(BindingSet querySolution) {
Artist artist = new Artist();
artist.setId(QueryResultUtils.getIRI(querySolution, ARTIST_ID));
artist.setFirstName(QueryResultUtils.getString(querySolution, ARTIST_FIRST_NAME));
artist.setLastName(QueryResultUtils.getString(querySolution, ARTIST_LAST_NAME));
return artist;
}
The getReadQuery method provides the SPARQL string used to read one entity. Note that the variable names must be the same ones used in mapSolution(BindingSet). It may be cleaner to use the SparqlBuilder for generating this string, as shown in the sketch after this snippet.
@Override
protected String getReadQuery() {
return "prefix foaf: <http://xmlns.com/foaf/0.1/> "
+ "prefix ex: <http://example.org/> "
+ "SELECT ?artist_id ?artist_firstName ?artist_lastName where {"
+ "?artist_id a ex:Artist; "
+ " foaf:firstName ?artist_firstName; "
+ " foaf:surname ?artist_lastName ."
+ " } ";
}
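As noted above, the SparqlBuilder may be the cleaner option. A sketch of the same read query built with it, reusing the Variable constants and assuming the same static imports (iri, Queries) as in the insert example below:
@Override
protected String getReadQuery() {
    return Queries.SELECT(ARTIST_ID, ARTIST_FIRST_NAME, ARTIST_LAST_NAME)
            .where(ARTIST_ID.isA(iri(EX.Artist))
                    .andHas(iri(FOAF.FIRST_NAME), ARTIST_FIRST_NAME)
                    .andHas(iri(FOAF.SURNAME), ARTIST_LAST_NAME))
            .getQueryString();
}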
The getInsertSparql(ENTITY) method provides the SPARQL string for inserting a new instance. This SPARQL operation will also be used for updates. If updates require a different operation from inserts, it must be provided by implementing getUpdateSparql(ENTITY).
@Override
protected NamedSparqlSupplier getInsertSparql(Artist artist) {
return NamedSparqlSupplier.of("insert", () -> Queries.INSERT(ARTIST_ID.isA(iri(EX.Artist))
.andHas(iri(FOAF.FIRST_NAME), ARTIST_FIRST_NAME)
.andHas(iri(FOAF.SURNAME), ARTIST_LAST_NAME))
.getQueryString());
}
The getInputId(ENTITY) method is used to generate the id of an entity to be inserted. Here, we use the id of the specified artist object; if it is null, we generate a new IRI using getRdf4JTemplate().getNewUUID().
@Override
protected IRI getInputId(Artist artist) {
if (artist.getId() == null) {
return getRdf4JTemplate().getNewUUID();
}
return artist.getId();
}
}
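With these template methods in place, the DAO can be used as follows. A usage sketch; the delete(ID) method is assumed to be among the CRUD methods inherited from the base class:
Artist artist = new Artist();
artist.setFirstName("Pablo");
artist.setLastName("Picasso");
artist = artistDao.save(artist);                  // insert; the id is generated via getInputId
Artist reloaded = artistDao.getById(artist.getId());
artistDao.delete(artist.getId());                 // assumed: delete by id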
If the entity uses a composite key, a class implementing CompositeKey must be used for the ID type parameter. For a key consisting of two components, the CompositeKey2 class is available. If more components are needed, the key class can be modeled after that one.
It is not uncommon for an application to read a relation present in the repository data into a Map. For example, we might want to group painting ids by artist id. The RelationMapBuilder provides the necessary functionality for such cases:
RelationMapBuilder b = new RelationMapBuilder(getRDF4JTemplate(), EX.creatorOf);
Map<IRI, Set<IRI>> paintingsByArtists = b.buildOneToMany();
Additional functionality:
- the constraints(GraphPattern) method restricts the relation
- the relationIsOptional() method allows for the object to be missing, in which case an empty set is generated for the subject key
- the useRelationObjectAsKey() method flips the map such that the objects of the relation are used as keys and the subjects are aggregated
- the buildOneToOne() method returns a one-to-one mapping, which dies horribly if the data is not 1:1
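For example, to also include artists that have not created any paintings, a sketch using the methods listed above:
Map<IRI, Set<IRI>> paintingsByArtists = new RelationMapBuilder(getRDF4JTemplate(), EX.creatorOf)
        .relationIsOptional() // artists without paintings map to an empty set
        .buildOneToMany();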
The RDF4JCRUDDao is essentially the same as the SimpleRDF4JCRUDDao, with the one difference that it has three type parameters, ENTITY, INPUT, and ID. The class thus allows different classes for input and output: creation and updates use INPUT, e.g. save(INPUT); reading methods use ENTITY, e.g. ENTITY getById(ID).
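A DAO using this base class might be declared as follows. A sketch; ArtistInput is a hypothetical input DTO, not part of the demo:
public class ArtistDao extends RDF4JCRUDDao<Artist, ArtistInput, IRI> {
    public ArtistDao(RDF4JTemplate rdf4JTemplate) {
        super(rdf4JTemplate);
    }
    // save(ArtistInput) creates or updates an artist; getById(IRI) returns an Artist
    // plus the template methods shown for SimpleRDF4JCRUDDao above
}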
Usually, the functionality offered by DAOs is rather narrow, e.g. CRUD methods for one entity class. DAOs are combined to provide a wider range of functionality in the service layer. The only thing one needs to know when implementing the service layer with rdf4j-spring DAOs is that service methods need to participate in spring's transaction management. The most straightforward way to do this is to use the @Transactional method annotation, causing the service object to be wrapped with a proxy that takes care of transactionality.
The following code snippet, taken from the demo's ArtService class, shows part of a simple service.
@Component
public class ArtService {
@Autowired
private ArtistDao artistDao;
@Autowired
private PaintingDao paintingDao;
@Transactional
public Artist createArtist(String firstName, String lastName) {
Artist artist = new Artist();
artist.setFirstName(firstName);
artist.setLastName(lastName);
return artistDao.save(artist);
}
@Transactional
public Painting createPainting(String title, String technique, IRI artist) {
Painting painting = new Painting();
painting.setTitle(title);
painting.setTechnique(technique);
painting.setArtistId(artist);
return paintingDao.save(painting);
}
@Transactional
public List<Painting> getPaintings() {
return paintingDao.list();
}
@Transactional
public List<Artist> getArtists() {
return artistDao.list();
}
@Transactional
public Set<Artist> getArtistsWithoutPaintings(){
return artistDao.getArtistsWithoutPaintings();
}
// ...
}
Testing an application built with rdf4j-spring can be done at the DAO layer as well as at the service layer. Generally, applications will have more than one test class.
The common approach is to have a configuration for tests that is shared by all tests; this configuration prepares the spring context with all the required facilities. A minimal shared test configuration is the following. Note that it imports RDF4JTestConfig:
@TestConfiguration
@EnableTransactionManagement
@Import(RDF4JTestConfig.class)
@ComponentScan("com.example.myapp.dao")
public class TestConfig {
@Bean
DataInserter getDataInserter() {
return new DataInserter();
}
}
With this configuration, a test class can use the dataInserter bean to insert data into an in-memory repository before the tests are run:
@ExtendWith(SpringExtension.class)
@Transactional
@ContextConfiguration(classes = { TestConfig.class })
@TestPropertySource("classpath:application.properties")
@TestPropertySource(
properties = {
"rdf4j.spring.repository.inmemory.enabled=true",
"rdf4j.spring.repository.inmemory.use-shacl-sail=true",
"rdf4j.spring.tx.enabled=true",
"rdf4j.spring.resultcache.enabled=false",
"rdf4j.spring.operationcache.enabled=false",
"rdf4j.spring.pool.enabled=true",
"rdf4j.spring.pool.max-connections=2"
})
@DirtiesContext
public class ArtistDaoTests {
@Autowired
private ArtistDao artistDao;
@BeforeAll
public static void insertTestData(
@Autowired DataInserter dataInserter,
@Value("classpath:/data/my-testdata.ttl") Resource dataFile) {
dataInserter.insertData(dataFile);
}
@Test
public void testReadArtist(){
// ...
}
}
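A test method might then look like this. A sketch; the IRI and the assertion depend on the contents of my-testdata.ttl:
@Test
public void testReadArtist() {
    Artist artist = artistDao.getById(Values.iri("http://example.org/Picasso"));
    assertNotNull(artist);
}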
The in-memory repository is likely to behave differently from the database used in production in some edge cases. It is therefore recommended to also test against a local installation of the production database, in addition to testing against the in-memory repository.
With rdf4j-spring, this is quite straightforward:
- create a properties file for connecting to the local database (setting the rdf4j.spring.repository.remote.* properties)
- subclass the existing test class and point it to these properties with the @TestPropertySource annotation
- mark the subclass with a @Tag annotation, so you can easily switch the test on or off using the configuration of your test environment (most likely the Maven Surefire Plugin), as the local database installation will not be present in many build environments.
Example:
@Tag("requires-local-database")
@TestPropertySource("classpath:/repository-localdb.properties")
public class ArtistDaoDbTests extends ArtistDaoTests {
// no code needed, the class is just created to run your ArtistDaoTests with a different configuration
}
In addition to query logging, if you need to get a close look at what's happening inside the rdf4j-spring code, set the loglevel for org.eclipse.rdf4j.spring to DEBUG. Sometimes it may be required to look into what spring is doing; in this case, set org.springframework to DEBUG or even TRACE.
One way to do this is to provide a logback.xml file on the classpath, such as the one found in the source at rdf4j-spring/src/test/resources/logback.xml.
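A minimal sketch of such a file, assuming a simple console appender:
<configuration>
    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
        </encoder>
    </appender>
    <!-- verbose logging for rdf4j-spring only -->
    <logger name="org.eclipse.rdf4j.spring" level="DEBUG"/>
    <root level="INFO">
        <appender-ref ref="STDOUT"/>
    </root>
</configuration>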
Another way to set the loglevel is to provide an application property starting with logging.level., e.g.
logging.level.org.eclipse.rdf4j.spring=DEBUG
which can be provided in application.properties (for details and other ways to do this, have a look at the Spring documentation on Externalized Configuration).
rdf4j-spring makes use of Spring's auto-configuration feature (configured in the source file rdf4j-spring/META-INF/spring.factories). This means that bean creation at start up is governed by configuration properties, all of which are prefixed with rdf4j.spring.
The following table shows all subsystems with their property prefixes, the packages they reside in, and the classes holding their properties.
| Subsystem | Property prefix | Package | Properties class(es) |
|---|---|---|---|
| Repository | rdf4j.spring.repository. | org.eclipse.rdf4j.spring.repository | RemoteRepositoryProperties and InMemoryRepositoryProperties |
| Transaction management | rdf4j.spring.tx. | org.eclipse.rdf4j.spring.tx | TxProperties |
| Connection pooling | rdf4j.spring.pool. | org.eclipse.rdf4j.spring.pool | PoolProperties |
| Operation caching | rdf4j.spring.operationcache. | org.eclipse.rdf4j.spring.operationcache | OperationCacheProperties |
| Operation logging | rdf4j.spring.operationlog. | org.eclipse.rdf4j.spring.operationlog | OperationLogProperties and OperationLogJmxProperties |
| Query result caching | rdf4j.spring.resultcache. | org.eclipse.rdf4j.spring.resultcache | ResultCacheProperties |
| UUIDSource | rdf4j.spring.uuidsource. | org.eclipse.rdf4j.spring.uuidsource | SimpleUUIDSourceProperties, NoveltyCheckingUUIDSourceProperties, UUIDSequenceProperties, and PredictableUUIDSourceProperties |
These subsystems and their configuration are described in more detail below.
As stated in the Getting Started section, to configure the application
to access an existing repository, set the following configuration properties, e.g. in application.properties
:
rdf4j.spring.repository.remote.manager-url=[manager-url]
rdf4j.spring.repository.remote.name=[name]
To use an in-memory repository (for example, for unit tests), use
rdf4j.spring.repository.inmemory.enabled=true
By default, rdf4j-spring
connects with Spring’s PlatformTransactionManager. To disable this connection, use
rdf4j.spring.tx.enabled=false
Creating a RepositoryConnection has a certain overhead that many applications wish to avoid. rdf4j-spring allows for pooling of such connections. Several configuration options, such as the maximum number of connections, are available (see PoolProperties).
To enable, use
rdf4j.spring.pool.enabled=true
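For example, to also cap the pool size (the property name is the one used in the test configuration shown earlier):
rdf4j.spring.pool.max-connections=10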
SPARQL operations (queries and updates) require some computation time to prepare from the SPARQL string they are based on.
In rdf4j-spring
, this process is hidden from clients and happens in the RDF4JTemplate
. By default, operations
are not cached, and the same operation executed multiple times always has the overhead of parsing the SPARQL string and
generating the operation. If this feature is enabled, operations are cached per connection.
Note: If connection pooling is enabled, operations created in different threads may use different connections and will therefore each generate their own instance of the SPARQL operation, reducing the speedup gained by operation caching.
To enable, use
rdf4j.spring.operationcache.enabled=true
Two options are available for logging the operations (queries and updates, aka query logging) sent to the repository:
Each operation is written to the logger org.eclipse.rdf4j.spring.operationlog.log.slf4
with loglevel DEBUG
.
To enable, use
rdf4j.spring.operationlog.enabled=true
Each operation is recorded (if identical operations are executed, statistics are aggregated) and exposed via JMX.
To enable, use
rdf4j.spring.operationlog.jmx.enabled=true
Applications that frequently execute the same queries might profit from result caching. If enabled, query results are cached on a per-connection basis. By default, this cache is cleared at the end of the ongoing transaction. The performance impact of result caching is application-specific and may well be negative. Measure carefully!
However, if the application is the only one using the repository, and therefore no updates are possible that the application does not know about, the property rdf4j.spring.resultcache.assumeNoOtherRepositoryClients=true can be set. In this case, results are copied to a global cache that all connections have access to, and which is only cleared when the application executes an update.
To enable result caching, use
rdf4j.spring.resultcache.enabled=true
Using UUIDs as identifiers for entities is a common strategy for applications using an RDF store as their backend. Doing this requires a source of new, previously unused UUIDs for new entities created by the application. Conversely, in unit tests, it is sometimes required that the UUIDs are generated in a predictable manner, so that actual results can be compared with expected results containing generated UUIDs.
The UUIDSource subsystem provides different implementations of the UUIDSource interface. The configuration of this subsystem determines which implementation is wired into the RDF4JTemplate at start up and gets used by the application.
In our opinion, the default implementation, DefaultUUIDSource, is sufficient for generating previously unused UUIDs. Collisions are possible but sufficiently unlikely, so using any of the noveltychecking, sequence, or simple subsystems should not be necessary.
For using the predictable
UUIDSource, which always produces the same sequence of UUIDs, use
rdf4j.spring.uuidsource.predictable.enabled=true
The RDF4J-Spring module, the RDF4J-Spring-Demo, and this documentation have been developed in the project ‘BIM-Interoperables Merkmalservice’, funded by the Austrian Research Promotion Agency and Österreichische Bautechnik Veranstaltungs GmbH.