Class ExtensibleDynamicEvaluationStatistics

  • All Implemented Interfaces:
    DynamicStatistics

    @Experimental
    public class ExtensibleDynamicEvaluationStatistics
    extends ExtensibleEvaluationStatistics
    implements DynamicStatistics

    ExtensibleDynamicEvaluationStatistics aims to keep an internal estimate of the cardinality of various statement patterns.

    It supports getting the overall size, the cardinality of any single-dimension pattern (e.g. ?a rdf:type ?b), and the cardinality of two multidimensional patterns (:Peter rdf:type ?b and ?a rdf:type foaf:Person).

    Since evaluation statistics are best-effort, we use HyperLogLog (HLL) as sets to track the number of statements for each pattern we support. HLL is a very memory-efficient, approximate set implementation. Furthermore, we hash each pattern into a fixed number of buckets: 1024 for single-dimension patterns and 64 per dimension for multidimensional patterns.

    This means that adding ':peter rdf:type foaf:Person' and ':lisa rdf:type foaf:Person' could potentially return getCardinality(:peter, ?b, ?c) = 2 if both :peter and :lisa hash to the same one of the 1024 buckets in subjectIndex.
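
    The collision effect described above can be sketched in a few lines. This is a minimal illustration, not the RDF4J implementation: the class BucketedSubjectIndex, its methods, and the hashing scheme (String.hashCode() modulo the bucket count) are assumptions, and a plain HashSet stands in for the HLL sketch. The subjects :Aa and :BB are chosen only because their String.hashCode() values are equal, which forces them into the same bucket.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of bucketed counting; not the RDF4J implementation.
final class BucketedSubjectIndex {
    // Bucket count for single-dimension patterns, as stated in the text.
    static final int BUCKETS = 1024;

    // One set per bucket. A HashSet stands in for an HLL sketch here.
    private final List<Set<String>> buckets = new ArrayList<>(BUCKETS);

    BucketedSubjectIndex() {
        for (int i = 0; i < BUCKETS; i++) {
            buckets.add(new HashSet<>());
        }
    }

    // Assumed hashing scheme: String.hashCode() reduced modulo the bucket count.
    static int bucketOf(String term) {
        return Math.floorMod(term.hashCode(), BUCKETS);
    }

    // Records the statement in the bucket selected by its subject.
    void add(String subject, String predicate, String object) {
        buckets.get(bucketOf(subject)).add(subject + " " + predicate + " " + object);
    }

    // Estimated number of statements matching (subject, ?p, ?o).
    long cardinality(String subject) {
        return buckets.get(bucketOf(subject)).size();
    }

    public static void main(String[] args) {
        BucketedSubjectIndex index = new BucketedSubjectIndex();
        // ":Aa" and ":BB" have equal String.hashCode() values, so they share a bucket.
        index.add(":Aa", "rdf:type", "foaf:Person");
        index.add(":BB", "rdf:type", "foaf:Person");
        // Prints 2, although only one statement has the subject :Aa.
        System.out.println(index.cardinality(":Aa"));
    }
}
```

    The estimate for a subject is the size of its whole bucket, so it can be too high but never lower than the true count in this sketch. Fewer buckets save memory at the cost of more collisions.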

    HLL does not support "remove" operations, so there are two sets for every index: one for all added statements and one for all removed statements. If the user adds, removes, and then re-adds the same statement, the cardinality for that statement will be incorrect. We call this effect "staleness". To prevent staleness from affecting the returned cardinalities, this class needs to be monitored by calling the staleness(...) method. The ExtensibleSailStore does this automatically every 60 seconds.
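
    The add, remove, re-add scenario can be reproduced with a small sketch. This is a hypothetical illustration, not the RDF4J code: the class AddedRemovedIndex and its methods are invented for the example, HashSet stands in for HLL, and the staleness formula shown (relative difference between the estimated and the actual size) is an assumption about how such a check might be computed, not the documented behaviour of staleness(...).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the two-set scheme; not the RDF4J implementation.
final class AddedRemovedIndex {
    // Insert-only sets, mirroring the fact that HLL cannot remove elements.
    private final Set<String> added = new HashSet<>();
    private final Set<String> removed = new HashSet<>();

    void add(String statement) {
        added.add(statement);
    }

    // A removal is recorded as an insertion into the second set.
    void remove(String statement) {
        removed.add(statement);
    }

    // Estimated cardinality: distinct additions minus distinct removals.
    long estimate() {
        return Math.max(0, added.size() - removed.size());
    }

    // Assumed staleness measure: relative error against the real size, in [0, 1].
    double staleness(long actualSize) {
        long estimated = estimate();
        long max = Math.max(estimated, actualSize);
        return max == 0 ? 0.0 : Math.abs(estimated - actualSize) / (double) max;
    }

    public static void main(String[] args) {
        AddedRemovedIndex index = new AddedRemovedIndex();
        String statement = ":peter rdf:type foaf:Person";
        index.add(statement);
        index.remove(statement);
        index.add(statement); // the store now really contains one statement

        System.out.println(index.estimate());   // prints 0, because both sets hold the statement
        System.out.println(index.staleness(1)); // prints 1.0, i.e. fully stale in this sketch
    }
}
```

    A caller that observes a high staleness value could rebuild the statistics from the actual data. The threshold and the rebuild strategy are left open here because the text above does not specify them.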