Shotness Analyzer¶
The shotness analyzer measures structural hotness -- the change frequency of individual code entities (functions, methods, classes) across Git history. Unlike the couples analyzer, which operates at file granularity, shotness operates at the UAST node level and so provides fine-grained co-change analysis.
Quick Start¶
With custom node selection:
codefang run -a history/shotness \
--shotness-dsl-struct 'filter(.roles has "Function")' \
--shotness-dsl-name '.props.name' \
.
Requires UAST
The shotness analyzer needs UAST support to identify code structures. It is automatically enabled when the UAST pipeline is available.
What It Measures¶
Node Change Frequency¶
For each code entity matched by the DSL query (functions by default), the analyzer counts how many commits modified lines within that entity's span. Entities that change frequently are "hot" -- they are likely volatile, complex, or central to the system.
Node Co-Change Coupling¶
When two code entities are modified in the same commit, their coupling counter is incremented. This produces a fine-grained coupling matrix at the function level, which is more precise than file-level coupling from the couples analyzer.
Coupling Strength¶
Coupling strength is normalized to a 0-1 scale by dividing a pair's co-change count by the change count of the more active of the two functions.
This keeps the result in [0, 1] and makes it usable as a confidence metric. A strength of 1.0 means the functions always change together; 0.5 means they co-change in half of the commits that touch the more active function.
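Written out, a formula consistent with this description is shown below. The symbols are introduced here for illustration and are an assumption, not a quote from the codefang source: $c_{ab}$ is the number of commits that touch both nodes, and $n_a$, $n_b$ are the individual change counts of the two nodes.

```latex
\operatorname{strength}(a, b) = \frac{c_{ab}}{\max(n_a,\, n_b)},
\qquad 0 \le c_{ab} \le \min(n_a, n_b)
```

Because $c_{ab}$ can never exceed the smaller of the two change counts, the ratio stays in [0, 1]. For example, 15 co-changes against a more active node with 42 changes gives 15 / 42 ≈ 0.36.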
Risk Classification¶
Nodes are classified into risk levels based on absolute change counts:
| Risk Level | Threshold | Meaning |
|---|---|---|
| HIGH | ≥ 20 changes | Requires immediate attention and robust test coverage |
| MEDIUM | ≥ 10 changes | Should be monitored and potentially refactored |
| LOW | < 10 changes | Normal change frequency |
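The thresholds in the table can be read as a simple cascade. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes each threshold is inclusive, as the ≥ signs in the table suggest.

```python
HIGH_THRESHOLD = 20    # change counts at or above this are HIGH
MEDIUM_THRESHOLD = 10  # change counts at or above this (but below HIGH) are MEDIUM


def classify_risk(change_count: int) -> str:
    """Map an absolute change count to a risk level from the table above."""
    if change_count >= HIGH_THRESHOLD:
        return "HIGH"
    if change_count >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"


if __name__ == "__main__":
    for count in (42, 12, 3):
        print(count, classify_risk(count))
```

Because the classification uses absolute counts, the same function can move from LOW to HIGH simply because a longer history is analyzed.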
How It Works¶
For each commit:
- Parse the before and after versions of each changed file into UAST
- Apply the `dsl_struct` query to select target nodes (e.g., functions)
- Apply the `dsl_name` query to extract the name of each node
- Map diff hunks to nodes using line-range overlaps
- Emit a per-commit TC (Transient Commit result) with touched node deltas and coupling pairs
After all commits are processed, the Aggregator accumulates TCs into a final report with sorted nodes and a sparse co-change matrix.
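The hunk-to-node mapping and the pair generation can be sketched as follows. This is a minimal illustration, not codefang's implementation: `Node`, `touched_nodes`, and `coupling_pairs` are hypothetical names, line ranges are assumed to be inclusive, and all hunks are assumed to belong to the same file as the nodes.

```python
from itertools import combinations
from typing import NamedTuple


class Node(NamedTuple):
    file: str
    name: str
    start_line: int  # inclusive
    end_line: int    # inclusive


def overlaps(node: Node, hunk_start: int, hunk_end: int) -> bool:
    """True when the two inclusive line ranges share at least one line."""
    return node.start_line <= hunk_end and hunk_start <= node.end_line


def touched_nodes(nodes, hunks):
    """Return nodes whose span overlaps any (start, end) hunk, in input order."""
    return [n for n in nodes if any(overlaps(n, s, e) for s, e in hunks)]


def coupling_pairs(touched):
    """Every unordered pair of node keys touched in the same commit."""
    keys = sorted((n.file, n.name) for n in touched)
    return list(combinations(keys, 2))


if __name__ == "__main__":
    nodes = [
        Node("engine.go", "parse", 1, 10),
        Node("engine.go", "validate", 12, 30),
        Node("engine.go", "emit", 32, 40),
    ]
    hit = touched_nodes(nodes, [(8, 13)])  # one hunk spanning lines 8-13
    print([n.name for n in hit])
    print(coupling_pairs(hit))
```

Sorting the keys before pairing gives each pair a canonical order, so a pair is counted once regardless of which node was encountered first.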
Architecture¶
The shotness analyzer follows the TC/Aggregator pattern:
- Consume phase: Per-commit processing builds working state (`nodes`, `files` maps for deletion/rename tracking) and emits a `TC{Data: *CommitData}` with node touch deltas and coupling pairs.
- Aggregation phase: The `Aggregator` accumulates node counts and coupling matrices from the TC stream. It supports disk-backed spilling via `SpillStore` for memory-bounded operation.
- Serialization phase: `SerializeTICKs()` converts aggregated tick data into the `Nodes`/`Counters` report consumed by `ComputeAllMetrics()` and plot generation.
The `nodes` map remains in the analyzer as working state because `handleDeletion`, `handleInsertion`, `handleModification`, and `applyRename` read and mutate it during `Consume()`. The aggregator maintains its own separate accumulation of counts and couplings.
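The split between per-commit results and accumulation can be illustrated with a small in-memory sketch. Everything here is a hypothetical stand-in: `CommitResult` plays the role of one TC payload, this `Aggregator` is not codefang's type of the same name, and disk-backed spilling is left out entirely.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CommitResult:
    """Hypothetical stand-in for one per-commit TC payload."""
    touched: list  # node keys touched in this commit
    pairs: list    # (key_a, key_b) tuples that changed together


@dataclass
class Aggregator:
    """Accumulates per-commit results into counts and a sparse pair matrix."""
    counts: Counter = field(default_factory=Counter)
    couplings: Counter = field(default_factory=Counter)  # only observed pairs

    def add(self, tc: CommitResult) -> None:
        self.counts.update(tc.touched)
        for a, b in tc.pairs:
            # Canonical ordering so (a, b) and (b, a) share one cell.
            self.couplings[tuple(sorted((a, b)))] += 1

    def report(self):
        """Nodes sorted by descending change count, plus the sparse matrix."""
        nodes = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return nodes, dict(self.couplings)


if __name__ == "__main__":
    agg = Aggregator()
    agg.add(CommitResult(["a", "b"], [("a", "b")]))
    agg.add(CommitResult(["b", "a", "c"], [("b", "a"), ("a", "c"), ("b", "c")]))
    print(agg.report())
```

Keeping only observed pairs is what makes the matrix sparse: most function pairs in a repository never change together and need no entry.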
Output Formats¶
The shotness analyzer supports four output formats: JSON, YAML, text, and plot.
Terminal output with color-coded sections:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Shotness Analysis 42 nodes ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Summary
──────────────────────────────────────────────────────────
Total Nodes 42
Total Changes 385
Avg Changes/Node 9.2
Total Couplings 156
Avg Coupling Strength 34%
Hot Nodes 8
Hottest Functions
──────────────────────────────────────────────────────────
processPayment (engine [████████████████████░] 1.0 (42 changes)
validateInput (engine. [████████████████░░░░░] 0.8 (34 changes)
Risk Assessment
──────────────────────────────────────────────────────────
processPayment (engine HIGH (42 changes)
validateInput (engine. HIGH (34 changes)
Strongest Couplings
──────────────────────────────────────────────────────────
processPayment ↔ validateInput 85% (12 co-changes)
handleRequest ↔ parseBody 72% (8 co-changes)
JSON output:
{
  "node_hotness": [
    {
      "name": "processFile",
      "type": "Function",
      "file": "pkg/core/engine.go",
      "change_count": 42,
      "coupled_nodes": 3,
      "hotness_score": 1.0
    }
  ],
  "node_coupling": [
    {
      "node1_name": "processFile",
      "node1_file": "pkg/core/engine.go",
      "node2_name": "validate",
      "node2_file": "pkg/core/engine.go",
      "co_changes": 15,
      "coupling_strength": 0.36
    }
  ],
  "hotspot_nodes": [
    {
      "name": "processFile",
      "type": "Function",
      "file": "pkg/core/engine.go",
      "change_count": 42,
      "risk_level": "HIGH"
    }
  ],
  "aggregate": {
    "total_nodes": 3,
    "total_changes": 105,
    "total_couplings": 3,
    "avg_changes_per_node": 35.0,
    "avg_coupling_strength": 0.42,
    "hot_nodes": 2
  }
}
YAML output:
node_hotness:
  - name: processFile
    type: Function
    file: pkg/core/engine.go
    change_count: 42
    coupled_nodes: 3
    hotness_score: 1.0
node_coupling:
  - node1_name: processFile
    node1_file: pkg/core/engine.go
    node2_name: validate
    node2_file: pkg/core/engine.go
    co_changes: 15
    coupling_strength: 0.36
hotspot_nodes:
  - name: processFile
    type: Function
    file: pkg/core/engine.go
    change_count: 42
    risk_level: HIGH
aggregate:
  total_nodes: 3
  total_changes: 105
  total_couplings: 3
  avg_changes_per_node: 35.0
  avg_coupling_strength: 0.42
  hot_nodes: 2
Plot output generates an interactive HTML dashboard with three visualizations:
- Code Hotness TreeMap: Hierarchical file → function view sized by change frequency
- Function Coupling Matrix: Heatmap showing co-change frequency between functions
- Top Hot Functions: Bar chart comparing self-changes vs coupled changes
Configuration Options¶
| Option | Type | Default | Description |
|---|---|---|---|
| `Shotness.DSLStruct` | string | `filter(.roles has "Function")` | UAST DSL query to select which code structures to track. |
| `Shotness.DSLName` | string | `.props.name` | UAST DSL expression to extract the name from each matched node. |
# .codefang.yml
history:
  shotness:
    dsl_struct: 'filter(.roles has "Function")'
    dsl_name: '.props.name'
Custom DSL Examples¶
Metrics Reference¶
Node Hotness¶
| Field | Type | Description |
|---|---|---|
| `name` | string | Function/method name |
| `type` | string | UAST node type (e.g., "Function") |
| `file` | string | Source file path |
| `change_count` | int | Number of commits that modified this node |
| `coupled_nodes` | int | Number of other nodes that co-changed with this node |
| `hotness_score` | float | Normalized score [0, 1] relative to the hottest node |
Node Coupling¶
| Field | Type | Description |
|---|---|---|
| `node1_name` / `node2_name` | string | Names of the coupled nodes |
| `node1_file` / `node2_file` | string | File paths of the coupled nodes |
| `co_changes` | int | Number of commits where both nodes changed |
| `coupling_strength` | float | Normalized strength [0, 1] |
Aggregate¶
| Field | Type | Description |
|---|---|---|
| `total_nodes` | int | Total tracked nodes |
| `total_changes` | int | Sum of all node change counts |
| `total_couplings` | int | Number of unique coupling pairs |
| `avg_changes_per_node` | float | Mean changes per node |
| `avg_coupling_strength` | float | Mean coupling strength across all pairs |
| `hot_nodes` | int | Nodes with change count ≥ 10 (MEDIUM or HIGH risk) |
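The derived fields in these tables follow from the raw change counts and pair strengths. The sketch below shows one way to compute them. It is an assumption based on the field descriptions, with hypothetical function names, and it reads "relative to the hottest node" as division by the largest change count.

```python
def hotness_scores(change_counts: dict) -> dict:
    """Scale each count by the largest one so the hottest node scores 1.0."""
    hottest = max(change_counts.values(), default=0)
    if hottest == 0:
        return {name: 0.0 for name in change_counts}
    return {name: count / hottest for name, count in change_counts.items()}


def aggregate(change_counts: dict, strengths: list, hot_threshold: int = 10) -> dict:
    """Build the aggregate block from per-node counts and per-pair strengths."""
    total_nodes = len(change_counts)
    total_changes = sum(change_counts.values())
    return {
        "total_nodes": total_nodes,
        "total_changes": total_changes,
        "total_couplings": len(strengths),
        "avg_changes_per_node": total_changes / total_nodes if total_nodes else 0.0,
        "avg_coupling_strength": sum(strengths) / len(strengths) if strengths else 0.0,
        "hot_nodes": sum(1 for c in change_counts.values() if c >= hot_threshold),
    }


if __name__ == "__main__":
    counts = {"processPayment": 42, "validateInput": 34, "log": 5}
    print(hotness_scores(counts))
    print(aggregate(counts, [0.5, 0.25]))
```

Both helpers guard against empty input so that a repository with no matched nodes yields zeros rather than a division error.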
Use Cases¶
- Function-level hotspot detection: Find the most frequently changed functions in the codebase. These are the highest-risk points for bugs.
- Fine-grained coupling analysis: Discover which functions always change together. This reveals implicit dependencies that file-level coupling misses.
- Refactoring prioritization: Functions that are both hot (high change count) and coupled (always change with others) are the best refactoring candidates.
- Architecture validation: Functions from different packages that are highly coupled may indicate a leaking abstraction.
- Test prioritization: Focus testing resources on the hottest functions.
Interpreting Results¶
Reading the Coupling Strength¶
| Strength | Interpretation |
|---|---|
| 0.8 - 1.0 | Very tight coupling. Functions almost always change together. Consider merging or extracting shared logic. |
| 0.5 - 0.8 | Moderate coupling. There is a significant shared dependency. Review if coupling is intentional. |
| 0.2 - 0.5 | Loose coupling. Occasional co-changes, likely due to shared APIs or data structures. |
| < 0.2 | Minimal coupling. Co-changes are incidental. |
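For scripted triage of the JSON or YAML report, the bands above can be applied mechanically. The helper below is a hypothetical sketch. Because the table's ranges share their endpoints, it assumes each lower bound is inclusive, so 0.8 counts as very tight and 0.5 as moderate.

```python
def interpret_strength(strength: float) -> str:
    """Return the interpretation band for a coupling strength in [0, 1]."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("coupling strength must be in [0, 1]")
    if strength >= 0.8:
        return "very tight"
    if strength >= 0.5:
        return "moderate"
    if strength >= 0.2:
        return "loose"
    return "minimal"


if __name__ == "__main__":
    for value in (0.85, 0.72, 0.36, 0.1):
        print(value, interpret_strength(value))
```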
Actionable Insights¶
- High hotness + High coupling: Core function that drives many changes. Candidate for splitting or stabilizing the interface.
- High hotness + Low coupling: Isolated function that is fixed frequently. Needs better tests and potentially a redesign.
- Low hotness + High coupling: Stable function that always changes with others. Check if coupling is necessary or indicates a design smell.
Limitations¶
- UAST required: Only languages with UAST parser support are analyzed. Files in unsupported languages are skipped entirely.
- CPU intensive: The analyzer performs UAST parsing on both the before and after versions of every changed file in every commit. This makes it one of the most expensive analyzers. It benefits from parallel execution.
- Name collisions: Two functions with the same name in different files are tracked as distinct nodes, because the file path is part of the key. When a file is renamed, the analyzer updates all associated nodes.
- Shallow extraction within a file: When multiple structural nodes in the same file share the same extracted name (e.g., nested functions with identical names), only one is tracked. The last one encountered wins. Qualified paths (e.g., `OuterClass.innerMethod`) are not built.
- DSL limitations: The DSL query must match nodes that have position information (`Pos` field) in the UAST. Nodes without position data cannot be mapped to diff hunks.
- Large functions: A change anywhere within a function's line range counts as a change to that function. Very large functions (hundreds of lines) will have inflated change counts.
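The name-collision and shallow-extraction behavior follows from keying nodes by file path and extracted name. The sketch below illustrates the idea with a plain dictionary. The function names and the `(file, name)` tuple key are assumptions for illustration, not codefang's actual data structure.

```python
def index_nodes(file: str, extracted: list) -> dict:
    """Key nodes by (file, name); a later duplicate overwrites an earlier one."""
    index = {}
    for name, start, end in extracted:
        index[(file, name)] = (start, end)
    return index


def rename_file(index: dict, old: str, new: str) -> dict:
    """Re-key every node of `old` under `new`, leaving other files untouched."""
    return {
        ((new if f == old else f), name): span
        for (f, name), span in index.items()
    }


if __name__ == "__main__":
    # Two nodes named "helper" in the same file: only the last one survives.
    idx = index_nodes("a.py", [("helper", 1, 5), ("helper", 10, 20), ("run", 22, 30)])
    print(idx)
    print(rename_file(idx, "a.py", "b.py"))
```

Building qualified names such as `OuterClass.innerMethod` into the key would avoid the overwrite, which is exactly what the analyzer does not do.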