Class StageMetricsRegistry
java.lang.Object
com.linkedin.venice.spark.datawriter.task.StageMetricsRegistry
Central registry for per-stage diagnostic metrics in the VPJ Spark pipeline.
Maintains insertion order so the report reflects the pipeline execution sequence.
Usage:
StageMetricsRegistry registry = new StageMetricsRegistry(sparkContext);
StageMetrics ttlMetrics = registry.register("ttl_filter");
// ... instrument stage with ttlMetrics.recordsIn, ttlMetrics.recordsOut, etc.
// After dataFrame.count():
StageMetricsSnapshot snapshot = registry.snapshot();
LOGGER.info("VPJ Pipeline Diagnostics:\n{}", snapshot.getFormattedReport());
-
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionGet a previously registered stage's metrics.Register a new pipeline stage for tracking.snapshot()Capture an immutable snapshot of all stage metrics.
-
Constructor Details
-
StageMetricsRegistry
public StageMetricsRegistry(org.apache.spark.SparkContext sparkContext)
-
-
Method Details
-
register
Register a new pipeline stage for tracking. Stages are reported in registration order.- Parameters:
stageName- unique identifier for the stage (e.g., "ttl_filter", "compaction")- Returns:
- the StageMetrics instance for this stage (returns existing instance if already registered)
-
getStage
Get a previously registered stage's metrics.- Returns:
- the StageMetrics, or null if not registered
-
snapshot
Capture an immutable snapshot of all stage metrics. Call this afterdataFrame.count()triggers execution so accumulator values are finalized.- Returns:
- an engine-agnostic snapshot with per-stage summaries and a formatted report
-