Collecting Diagnostics#
When investigating Venice issues, collect diagnostics before restarting services — they are essential for root cause analysis. These tools apply to any Venice JVM process (server, router, controller, or client).
Heap Dump#
Use when: high memory usage, suspected memory leaks, store buffer memory issues, or when you need to inspect runtime state that isn't visible in logs.
Heap dumps capture the full state of the JVM heap — which objects are consuming memory, and the actual values of variables and object fields at the time of capture. Analyze with HeapDumpStarDiver, which converts heap dumps to Parquet files for SQL analysis and includes an MCP server that enables AI agents to query heap data and assist with memory analysis.
Thread Dump#
Use when: suspected deadlocks, stuck threads, high CPU with no obvious cause.
Thread dumps show what every thread is doing at that moment. Look for threads in BLOCKED or WAITING state, and for lock contention between threads.
JFR (Java Flight Recorder) Profile#
Use when: latency spikes, CPU contention, thread contention, or when you need to understand where time is being spent.
JFR captures CPU profiling, thread contention, lock waits, GC (garbage collection) activity, and I/O with low overhead. Especially useful for:
- Latency spikes — JFR can identify JIT deoptimization. Collecting a profile may itself resolve latency spikes by triggering JIT recompilation.
- Thread contention — shows which threads are competing for locks (e.g., SSL handshake thread pool saturation).
- CPU hotspots — identifies which methods are consuming the most CPU time.
Analyze JFR recordings with JDK Mission Control.