The SPARQL query language is extensible by nature: it allows you to add your own custom functions if the standard set of operators is not sufficient for your needs. The RDF4J SPARQL engine has been designed with this extensibility in mind: you can define your own custom function and use it as part of your SPARQL queries.
In this short tutorial, we’ll create a simple custom function and add it to RDF4J. Specifically, we are going to implement a boolean function that detects whether a string literal is a palindrome.
Suppose we have the following RDF data:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix ex: <http://example.org/> .
ex:a rdfs:label "step on no pets" .
ex:b rdfs:label "go on, try it" .
We would like to be able to formulate a SPARQL query that allows us to retrieve all resources that have a palindrome as their label:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cfn: <http://example.org/custom-function/>
SELECT ?x ?label
WHERE {
?x rdfs:label ?label .
FILTER(cfn:palindrome(str(?label)))
}
The expected result of this query, given the above data, would be:
| x | label |
|---|---|
| ex:a | "step on no pets" |
Unfortunately, the function cfn:palindrome is not a standard SPARQL function, so this query won’t work: the RDF4J SPARQL engine will simply report an error.
We could of course retrieve all label values in the database and then check for ourselves whether they are palindromes. However, if we add a custom function instead, we no longer need to fetch and loop over every label ourselves: the SPARQL engine itself determines whether each value is a palindrome.
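There are two basic steps in adding custom functions to RDF4J: implementing the function as a Java class, and packaging that class in a JAR file together with an SPI configuration file.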
In the RDF4J SPARQL engine, functions are expected to implement the Function interface.
The Function interface defines two methods: evaluate() and getURI(). The latter of these is a simple method that returns a string representation of the URI of the function, which in our case is http://example.org/custom-function/palindrome.
The real proof of the pudding is in the evaluate() method: this is where the function logic is implemented. In other words, in this method we check the incoming value to see if it is, first of all, a valid argument for the function, and second of all, a palindrome, and return the result.
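The palindrome test at the core of evaluate() can be sketched in plain Java, independent of any RDF4J types. This is a minimal sketch rather than the tutorial's original listing: the class name PalindromeCheck is hypothetical, and the choice to ignore case while treating spaces and punctuation as significant is an assumption, picked so that the sample data above yields the expected result. An evaluate() implementation would first verify that it received a single literal argument, then pass that literal's label to a helper like this and return the outcome as a Value.

```java
/** Plain-Java palindrome check; PalindromeCheck is a hypothetical helper name. */
final class PalindromeCheck {

    private PalindromeCheck() {
        // static helper, not meant to be instantiated
    }

    /**
     * Returns true if the label reads the same forwards and backwards.
     * Assumption: case is ignored, spaces and punctuation are compared as-is.
     */
    static boolean isPalindrome(String label) {
        if (label == null) {
            throw new IllegalArgumentException("label must not be null");
        }
        // A string is a palindrome if it is equal to its own reverse.
        String reversed = new StringBuilder(label).reverse().toString();
        return reversed.equalsIgnoreCase(label);
    }
}
```

With the labels from the sample data, PalindromeCheck.isPalindrome("step on no pets") holds, while "go on, try it" is rejected because its reverse differs.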
Example 1 shows how we put everything together:
You are completely free to implement your function logic: in the above example, we have created a function that only returns true or false, but since the actual return type of an RDF4J function is Value, you can create functions that return string literals, numbers, dates, or even IRIs or blank nodes.
There are two important things to keep in mind though:
The evaluate() method is invoked for every single solution in the query result, so you should make sure that the implementation of your function is not overly complex or memory-intensive.
Once we have created the Java class for our function, we need some way to make the RDF4J SPARQL engine aware of it. This is where the Service Provider Interface (SPI) comes into play.
RDF4J’s set of SPARQL functions is determined dynamically through the java.util.ServiceLoader class. Specifically, RDF4J has a class called FunctionRegistry which keeps track of all implementations of the Function interface. Java’s SPI mechanism depends on the presence of configuration files in the JAR files that contain service implementations. Each such configuration file is expected to be present in the directory META-INF/services in your JAR file.
In the case of the SPARQL function registry, the name of this configuration file should be org.eclipse.rdf4j.query.algebra.evaluation.function.Function (in other words, the file name is equal to the fully-qualified name of the service interface we are providing an implementation for). The contents are really quite simple: an SPI configuration is a text file containing the fully-qualified name of each Java class that provides an SPI implementation, one per line. So in our case, the contents of the file would be:
org.eclipse.rdf4j.example.function.PalindromeFunction
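To make the discovery mechanism concrete, the following self-contained sketch exercises java.util.ServiceLoader on its own, without RDF4J. It does not show how FunctionRegistry works internally. Every name in it (SpiSketch, NamedFunction, PalindromeStandIn) is a hypothetical stand-in, and a temporary directory plays the role of the JAR file on the classpath. The sketch writes a provider-configuration file named after the service interface, lists the implementation class in it, and lets ServiceLoader find and instantiate that class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/** Self-contained ServiceLoader demo; every name in here is hypothetical. */
final class SpiSketch {

    /** Stand-in for a service interface such as RDF4J's Function. */
    public interface NamedFunction {
        String getURI();
    }

    /** Stand-in provider: a public class with a public no-argument constructor. */
    public static class PalindromeStandIn implements NamedFunction {
        public PalindromeStandIn() {
        }

        @Override
        public String getURI() {
            return "http://example.org/custom-function/palindrome";
        }
    }

    /** Writes a provider-configuration file and lets ServiceLoader discover the provider. */
    static List<String> discoverFunctionUris() {
        try {
            // The temp directory stands in for the JAR; it is not cleaned up here.
            Path root = Files.createTempDirectory("spi-sketch");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));

            // File name: fully-qualified name of the service interface.
            Path config = services.resolve(NamedFunction.class.getName());
            // File contents: fully-qualified name of each implementation, one per line.
            String contents = PalindromeStandIn.class.getName() + "\n";
            Files.write(config, contents.getBytes(StandardCharsets.UTF_8));

            URL[] classpath = { root.toUri().toURL() };
            List<String> uris = new ArrayList<>();
            try (URLClassLoader loader =
                    new URLClassLoader(classpath, NamedFunction.class.getClassLoader())) {
                // ServiceLoader reads the configuration file and instantiates each listed class.
                for (NamedFunction function : ServiceLoader.load(NamedFunction.class, loader)) {
                    uris.add(function.getURI());
                }
            }
            return uris;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling SpiSketch.discoverFunctionUris() returns a list holding the single URI reported by the stand-in provider. In a real project you would not write the file at run time: it would simply be packaged under META-INF/services in your JAR.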
Apart from this configuration file, your JAR file should of course also contain the actual compiled class. All of this is fairly easy to do, for example from your Eclipse project:
1. Create a directory META-INF and a subdirectory META-INF/services within the src directory of your project (or, if you use Maven, within src/main/resources). See our example resources dir for an example.
2. Add a text file named org.eclipse.rdf4j.query.algebra.evaluation.function.Function to this new directory. Make sure it contains a single line with the fully qualified name of your custom function class (in our example, that’s org.eclipse.rdf4j.example.function.PalindromeFunction).
3. Use Eclipse’s export function (or Maven’s package command) to create a JAR file (select the project, click ‘File’ -> ‘Export’ -> ‘JAR file’). Make sure the JAR file produced contains your compiled code and the service registry config file.
Once you have a proper JAR file, you need to add it to the runtime classpath of your RDF4J project (or, if you’re aiming to use this in an RDF4J Server, add it to the RDF4J Server webapp classpath and restart). After that, you’re done: RDF4J should automatically pick up your new custom function, and you can use it in your SPARQL queries from now on.
If you require any further help, you can contact us to get support. We welcome your feedback.