Class SketchBasedJoinEstimator
- All Implemented Interfaces:
QueryOptimizationScopeProvider
Features:
- Array-of-doubles tuple sketches over S, P, O, C singles, component degree sketches, and all six pairs.
- Lock‑free reads; double‑buffered rebuilds.
- Incremental addStatement/deleteStatement with signed multiplicity summaries.
- Configurable via SketchBasedJoinEstimator.Config and system properties (see below).
Configuration
Applications should prefer SketchBasedJoinEstimator(SketchStatementSource, Config) to set options
programmatically. For convenience, SketchBasedJoinEstimator(SketchStatementSource, int, long, long)
delegates to SketchBasedJoinEstimator.Config.defaults() and will pick up system properties as well.
System properties (overlay)
All options can be overridden at construction time by JVM system properties with prefix
org.eclipse.rdf4j.query.algebra.evaluation.sketch.SketchBasedJoinEstimator.. When present, the system
property value takes precedence over the corresponding value provided through SketchBasedJoinEstimator.Config. The legacy
org.eclipse.rdf4j.sail.base.SketchBasedJoinEstimator. prefix is also accepted as a fallback. Supported keys
(defaults shown in SketchBasedJoinEstimator.Config):
- nominalEntries (int ≥ 16, sketch nominal entries)
- subjectBucketCount (int ≥ 4)
- predicateBucketCount (int ≥ 4)
- objectBucketCount (int ≥ 4)
- contextBucketCount (int ≥ 4)
- contextPairSketchesEnabled (boolean)
- throttleEveryN (long)
- throttleMillis (long)
- refreshSleepMillis (long)
- estimateCacheSeconds (long)
- defaultContextString (String)
- roundJoinEstimates (boolean)
- churnSampleMin (int)
- churnSamplePercent (double 0..1)
- churnSampleMax (int)
- churnReaddThreshold (double 0..1)
- churnRemovalRatioThreshold (double 0..1)
- incrementalQueueInitialLimit (int)
- incrementalQueueIdleResetMillis (long)
- incrementalQueueEstimatedStatementBytes (long)
Example (configure default context and reduce refresh cadence):
System.setProperty(
"org.eclipse.rdf4j.query.algebra.evaluation.sketch.SketchBasedJoinEstimator.defaultContextString", "urn:ctx");
System.setProperty(
"org.eclipse.rdf4j.query.algebra.evaluation.sketch.SketchBasedJoinEstimator.refreshSleepMillis", "500");
var est = new SketchBasedJoinEstimator(source, Config.defaults().withNominalEntries(1024));
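The lookup order described in this section (primary prefix first, then the legacy prefix, then the value from Config) can be illustrated in plain Java. This is a minimal sketch, not the estimator's actual code: the class PropertyOverlay and the method resolveLong are hypothetical names, and ignoring an unparsable value is an assumption. Only the two prefixes are taken from the text above.

```java
import java.util.Properties;

/** Hypothetical helper illustrating the overlay order; not part of the documented API. */
final class PropertyOverlay {
    static final String PREFIX =
            "org.eclipse.rdf4j.query.algebra.evaluation.sketch.SketchBasedJoinEstimator.";
    static final String LEGACY_PREFIX =
            "org.eclipse.rdf4j.sail.base.SketchBasedJoinEstimator.";

    private PropertyOverlay() {
    }

    /**
     * Returns the value found under the primary prefix, else under the legacy
     * prefix, else the programmatically configured value.
     */
    static long resolveLong(Properties props, String key, long configured) {
        String raw = props.getProperty(PREFIX + key);
        if (raw == null) {
            raw = props.getProperty(LEGACY_PREFIX + key);
        }
        if (raw == null) {
            return configured;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            // Assumption: an unparsable override is ignored rather than fatal.
            return configured;
        }
    }
}
```

A caller would pass System.getProperties() as the first argument; taking the Properties object as a parameter simply keeps the sketch easy to test.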
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- static final class
- static enum
- static final class: Configuration for SketchBasedJoinEstimator.
- final class
- static interface
- static enum
- static interface
- static final record
- static interface
- static final class: Immutable staleness snapshot.
Nested classes/interfaces inherited from interface QueryOptimizationScopeProvider:
- static interface QueryOptimizationScopeProvider.QueryOptimizationScope
-
Field Summary
Fields
- REFRESH_THREAD_NAME
Fields inherited from interface QueryOptimizationScopeProvider
- NO_OP_SCOPE
-
Constructor Summary
Constructors (Constructor, Description):
- SketchBasedJoinEstimator(SketchStatementSource statementSource, int nominalEntries, long throttleEveryN, long throttleMillis): Convenience constructor that uses SketchBasedJoinEstimator.Config.defaults() with the given basics.
- SketchBasedJoinEstimator(SketchStatementSource statementSource, SketchBasedJoinEstimator.Config cfg): Full configuration constructor.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- SketchBasedJoinEstimator.AccessShape accessShapeForJoinOrdering(TupleExpr tupleExpr, String[] sourceVariableNames, long sourceBoundVarMask)
- SketchBasedJoinEstimator.AccessShape accessShapeForJoinOrdering(TupleExpr tupleExpr, Set<String> currentlyBoundVars)
- void addStatement(Resource s, IRI p, Value o)
- void addStatement(Resource s, IRI p, Value o, Resource c)
- void addStatement
- void addStatements(List<? extends Statement> statements)
- boolean awaitReady(long timeout, TimeUnit unit)
- boolean canAcceptIncrementalUpdatesNonBlocking()
- double cardinality(List<TupleExpr> tupleExprs)
- double cardinality(Filter node)
- double cardinality(Join node)
- double cardinality(LeftJoin node)
- double cardinalityPair
- double cardinalitySingle
- void close()
- void configurePersistence(Path file, boolean lazyLoad): Configure persistence for this estimator.
- void deleteStatement
- void deleteStatements(List<? extends Statement> statements)
- void discardAndMarkForRebuild(): Discard any in-memory estimator state and force a rebuild from store data on the next refresh cycle.
- double estimateCount(SketchBasedJoinEstimator.Component joinVar, String s, String p, String o, String c)
- estimateFilterPass(Filter filter)
- double estimateFilterPassRatio(Filter filter)
- double estimateJoinOn(SketchBasedJoinEstimator.Component j, SketchBasedJoinEstimator.Component a, String av, SketchBasedJoinEstimator.Component b, String bv)
- double estimateJoinOn(SketchBasedJoinEstimator.Component join, SketchBasedJoinEstimator.Pair a, String ax, String ay, SketchBasedJoinEstimator.Pair b, String bx, String by)
- Optional<JoinOrderPlanner.JoinOrderPlan> estimateJoinOrder(List<TupleExpr> orderedArgs, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, JoinFactorCostModel factorCostModel, List<JoinOrderPlanner.FilterConstraint> deferredFilters)
- double factorOutputRowsForJoinOrdering(TupleExpr tupleExpr, String[] sourceVariableNames, long sourceBoundVarMask)
- double factorOutputRowsForJoinOrdering(TupleExpr tupleExpr, Set<String> currentlyBoundVars)
- boolean isReady()
- boolean isReadyNonBlocking()
- boolean isStale(double threshold): Convenience: true if combined staleness score exceeds a given threshold.
- boolean persistIfDirty(): Best-effort sketch persistence intended for scheduled store-level sync/commit paths.
- Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm)
- Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster)
- Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster, List<JoinOrderPlanner.FilterConstraint> deferredFilters)
- JoinOrderPlanner.PlanningAttempt planJoinOrderAttempt(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, JoinFactorCostModel factorCostModel, List<JoinOrderPlanner.FilterConstraint> deferredFilters)
- JoinOrderPlanner.PlanningAttempt planJoinOrderAttempt(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster, List<JoinOrderPlanner.FilterConstraint> deferredFilters)
- long rebuild(): Rebuild the inactive buffer from scratch (blocking).
- void recordStoreSizeDelta(long additions, long deletions): Records committed store-size changes when exact incremental sketch updates are intentionally deferred.
- void setLearnedStatsProvider(JoinStatsProvider learnedStatsProvider)
- void setPatternCardinalityProvider(SketchBasedJoinEstimator.PatternCardinalityProvider patternCardinalityProvider)
- void setPatternFilterSamplingEstimator(SketchBasedJoinEstimator.PatternFilterSamplingEstimator patternFilterSamplingEstimator)
- void setRebuildAllowedSupplier(BooleanSupplier supplier): Install a store-provided gate for background rebuilds.
- staleness: Compute a staleness snapshot using the *current* published State.
- void startBackgroundRefresh(int stalenessThreshold)
- void stop()
- void unload(): Release in-memory sketches.
-
Field Details
-
REFRESH_THREAD_NAME
-
Constructor Details
-
SketchBasedJoinEstimator
public SketchBasedJoinEstimator(SketchStatementSource statementSource, int nominalEntries, long throttleEveryN, long throttleMillis)
Convenience constructor that uses SketchBasedJoinEstimator.Config.defaults() with the given basics. All options can still be overridden via system properties (see class‑level Javadoc).
-
SketchBasedJoinEstimator
public SketchBasedJoinEstimator(SketchStatementSource statementSource, SketchBasedJoinEstimator.Config cfg)
Full configuration constructor.
Values from cfg are overlaid by system properties with prefix org.eclipse.rdf4j.query.algebra.evaluation.sketch.SketchBasedJoinEstimator.. If a property is set, it takes precedence. See class‑level Javadoc for the list of keys.
-
-
Method Details
-
beginQueryOptimizationScope
- Specified by:
beginQueryOptimizationScope in interface QueryOptimizationScopeProvider
-
isReady
public boolean isReady() -
isReadyNonBlocking
public boolean isReadyNonBlocking() -
awaitReady
- Throws:
InterruptedException
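As an illustration of how a timed readiness wait that throws InterruptedException is commonly built, here is a minimal sketch using a CountDownLatch. It is an assumption about one possible design, not the estimator's implementation; the class ReadinessSketch and the method markReady are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical readiness tracker; names other than awaitReady/isReadyNonBlocking are invented. */
final class ReadinessSketch {
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called once by whatever finishes the first build (assumed). */
    void markReady() {
        ready.countDown();
    }

    /** Cheap check that never waits. */
    boolean isReadyNonBlocking() {
        return ready.getCount() == 0;
    }

    /** Waits up to the timeout; returns true if readiness was reached in time. */
    boolean awaitReady(long timeout, TimeUnit unit) throws InterruptedException {
        return ready.await(timeout, unit);
    }
}
```

The caller decides what to do on a false return, for example planning without sketch-based estimates until a later attempt succeeds.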
-
configurePersistence
Configure persistence for this estimator.
- Parameters:
file - persisted snapshot file
lazyLoad - if true and snapshot exists, keep sketches unloaded until first demand.
-
setLearnedStatsProvider
-
setPatternFilterSamplingEstimator
public void setPatternFilterSamplingEstimator(SketchBasedJoinEstimator.PatternFilterSamplingEstimator patternFilterSamplingEstimator) -
setPatternCardinalityProvider
public void setPatternCardinalityProvider(SketchBasedJoinEstimator.PatternCardinalityProvider patternCardinalityProvider) -
setRebuildAllowedSupplier
Install a store-provided gate for background rebuilds. Returning false pauses refresh-thread rebuild attempts until the gate opens again.
-
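The gate semantics can be illustrated with a small stand-alone sketch: the refresh loop consults a BooleanSupplier before each rebuild attempt and skips the attempt while it returns false. The class RebuildGateSketch and its refreshOnce and rebuildCount methods are hypothetical and only mimic the described behaviour; treating a null supplier as "always allowed" is an assumption.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of a gated refresh loop; not the estimator's code. */
final class RebuildGateSketch {
    private volatile BooleanSupplier rebuildAllowed = () -> true;
    private int rebuilds;

    void setRebuildAllowedSupplier(BooleanSupplier supplier) {
        // Assumption: a null supplier means rebuilds are always allowed.
        this.rebuildAllowed = supplier == null ? () -> true : supplier;
    }

    /** One refresh cycle: rebuild only if the gate is open. */
    boolean refreshOnce() {
        if (!rebuildAllowed.getAsBoolean()) {
            return false; // gate closed: skip this attempt, try again next cycle
        }
        rebuilds++; // stands in for the real rebuild work
        return true;
    }

    int rebuildCount() {
        return rebuilds;
    }
}
```

A store could, for instance, supply a gate that returns false while a large bulk load is in progress, so the refresh thread does not repeatedly rebuild from data that is still changing.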
startBackgroundRefresh
public void startBackgroundRefresh(int stalenessThreshold) -
stop
public void stop() -
close
public void close() -
rebuild
public long rebuild()
Rebuild the inactive buffer from scratch (blocking).
Readers stay lock‑free; once complete a single volatile store publishes the fresh State.
- Returns:
number of statements scanned.
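The publish-by-volatile-store pattern that rebuild() describes can be shown with a toy model. This is a minimal sketch under stated assumptions, not the estimator's code: DoubleBufferSketch, its nested State, and the per-predicate counts are invented stand-ins for the real sketches, and serialising writers with synchronized is an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of double-buffered rebuilds with lock-free readers. */
final class DoubleBufferSketch {
    /** Immutable snapshot; readers only ever see a fully built instance. */
    static final class State {
        final Map<String, Long> countsByPredicate;

        State(Map<String, Long> counts) {
            this.countsByPredicate = Map.copyOf(counts);
        }
    }

    private volatile State current = new State(Map.of());

    /** Lock-free read: one volatile load, then reads of immutable data. */
    long estimate(String predicate) {
        return current.countsByPredicate.getOrDefault(predicate, 0L);
    }

    /**
     * Builds a fresh state off to the side, then publishes it with a single
     * volatile store. Writers are serialised; readers never block.
     */
    synchronized long rebuild(List<String> predicates) {
        Map<String, Long> fresh = new HashMap<>();
        long scanned = 0;
        for (String p : predicates) {
            fresh.merge(p, 1L, Long::sum);
            scanned++;
        }
        current = new State(fresh); // the only point where readers can observe a change
        return scanned;
    }
}
```

Because the old State stays reachable until the store, a reader that started before the rebuild finishes keeps working against a consistent, if slightly stale, snapshot.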
-
addStatement
-
addStatements
-
addStatement
-
addStatement
-
recordStoreSizeDelta
public void recordStoreSizeDelta(long additions, long deletions) Records committed store-size changes when exact incremental sketch updates are intentionally deferred. -
deleteStatement
-
deleteStatements
-
canAcceptIncrementalUpdatesNonBlocking
public boolean canAcceptIncrementalUpdatesNonBlocking() -
cardinalitySingle
-
cardinalityPair
-
estimateJoinOn
public double estimateJoinOn(SketchBasedJoinEstimator.Component join, SketchBasedJoinEstimator.Pair a, String ax, String ay, SketchBasedJoinEstimator.Pair b, String bx, String by) -
estimateJoinOn
public double estimateJoinOn(SketchBasedJoinEstimator.Component j, SketchBasedJoinEstimator.Component a, String av, SketchBasedJoinEstimator.Component b, String bv) -
estimate
public SketchBasedJoinEstimator.JoinEstimate estimate(SketchBasedJoinEstimator.Component joinVar, String s, String p, String o, String c) -
estimateCount
public double estimateCount(SketchBasedJoinEstimator.Component joinVar, String s, String p, String o, String c) -
planJoinOrder
public Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm) -
planJoinOrder
public Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster) -
planJoinOrder
public Optional<JoinOrderPlanner.JoinOrderPlan> planJoinOrder(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster, List<JoinOrderPlanner.FilterConstraint> deferredFilters) -
planJoinOrderAttempt
public JoinOrderPlanner.PlanningAttempt planJoinOrderAttempt(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, SketchBasedJoinEstimator.JoinOrderWorkAdjuster workAdjuster, List<JoinOrderPlanner.FilterConstraint> deferredFilters) -
planJoinOrderAttempt
public JoinOrderPlanner.PlanningAttempt planJoinOrderAttempt(List<TupleExpr> args, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, JoinFactorCostModel factorCostModel, List<JoinOrderPlanner.FilterConstraint> deferredFilters) -
estimateJoinOrder
public Optional<JoinOrderPlanner.JoinOrderPlan> estimateJoinOrder(List<TupleExpr> orderedArgs, Set<String> initiallyBoundVars, JoinOrderPlanner.Algorithm algorithm, JoinFactorCostModel factorCostModel, List<JoinOrderPlanner.FilterConstraint> deferredFilters) -
estimateFilterPassRatio
-
estimateFilterPass
-
cardinality
-
cardinality
-
cardinality
-
cardinality
-
accessShapeForJoinOrdering
public SketchBasedJoinEstimator.AccessShape accessShapeForJoinOrdering(TupleExpr tupleExpr, Set<String> currentlyBoundVars) -
factorOutputRowsForJoinOrdering
-
factorOutputRowsForJoinOrdering
-
accessShapeForJoinOrdering
public SketchBasedJoinEstimator.AccessShape accessShapeForJoinOrdering(TupleExpr tupleExpr, String[] sourceVariableNames, long sourceBoundVarMask) -
staleness
Compute a staleness snapshot using the *current* published State. No locks taken. This is O(total number of populated sketch keys) and intended for occasional diagnostics or adaptive scheduling. All numbers are approximate by design of the tuple sketches. -
isStale
public boolean isStale(double threshold) Convenience: true if combined staleness score exceeds a given threshold. -
persistIfDirty
public boolean persistIfDirty()
Best-effort sketch persistence intended for scheduled store-level sync/commit paths.
- Returns:
true if persisted state was updated.
-
unload
public void unload()
Release in-memory sketches. Query planning falls back until sketches are lazy-loaded or rebuilt.
-
discardAndMarkForRebuild
public void discardAndMarkForRebuild()
Discard any in-memory estimator state and force a rebuild from store data on the next refresh cycle.
-