Shotness Analyzer

The shotness analyzer measures structural hotness: the change frequency of individual code entities (functions, methods, classes) across Git history. Unlike the couples analyzer, which operates at file granularity, shotness operates at the UAST node level and provides fine-grained co-change analysis.

Quick Start

codefang run -a history/shotness .

With custom node selection:

codefang run -a history/shotness \
  --shotness-dsl-struct 'filter(.roles has "Function")' \
  --shotness-dsl-name '.props.name' \
  .

Requires UAST

The shotness analyzer needs UAST support to identify code structures. It is automatically enabled when the UAST pipeline is available.

What It Measures

Node Change Frequency

For each code entity matched by the DSL query (functions by default), the analyzer counts how many commits modified lines within that entity's span. Entities that change frequently are "hot" -- they are likely volatile, complex, or central to the system.

Node Co-Change Coupling

When two code entities are modified in the same commit, their coupling counter is incremented. This produces a fine-grained coupling matrix at the function level, which is more precise than file-level coupling from the couples analyzer.
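The two counters described above can be sketched in a few lines. This is an illustration only, not the analyzer's implementation: the input shape (one set of touched node keys per commit), the `(file, name)` key layout as used here, and the `count_changes` helper are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: for each commit, the set of node keys it touched.
# A node key is (file path, node name); all names below are made up.
commits = [
    {("engine.go", "processFile"), ("engine.go", "validate")},
    {("engine.go", "processFile")},
    {("engine.go", "processFile"), ("engine.go", "validate"), ("io.go", "readAll")},
]


def count_changes(commits):
    """Return (per-node change counts, per-pair co-change counts)."""
    changes = Counter()
    co_changes = Counter()
    for touched in commits:
        # Each commit contributes at most one change per node.
        changes.update(touched)
        # Sorting gives every unordered pair a single canonical key.
        for a, b in combinations(sorted(touched), 2):
            co_changes[(a, b)] += 1
    return changes, co_changes


changes, co_changes = count_changes(commits)
print(changes.most_common(1))
```

Storing only the pairs that actually co-occur keeps the matrix sparse, which matters because the number of possible pairs grows quadratically with the number of tracked nodes.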

Coupling Strength

Coupling strength is normalized to a 0-1 scale using the formula:

strength(A, B) = co_changes(A, B) / max(co_changes(A, B), changes(A), changes(B))

Because co_changes(A, B) is also part of the denominator, the result can never exceed 1, so it is always in [0, 1] and can be read as a confidence measure. A strength of 1.0 means the two functions always change together; 0.5 means they co-change in half of the commits that touch the more active function.
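As a worked example of the formula, here is a minimal sketch. The function name `coupling_strength` is hypothetical, and returning 0.0 when nothing has changed is an assumption of this example rather than documented behavior.

```python
def coupling_strength(co_changes: int, changes_a: int, changes_b: int) -> float:
    """strength(A, B) = co_changes / max(co_changes, changes(A), changes(B))."""
    denominator = max(co_changes, changes_a, changes_b)
    if denominator == 0:
        # Assumption for this sketch: no recorded changes means no coupling.
        return 0.0
    return co_changes / denominator


# 15 co-changes, with the more active node changed 42 times: 15 / 42
print(round(coupling_strength(15, 42, 20), 2))  # 0.36
```

The first two inputs match the JSON sample in this document (15 co-changes against a change count of 42, strength 0.36); the third value, 20, is made up.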

Risk Classification

Nodes are classified into risk levels based on absolute change counts:

Risk Level	Threshold	Meaning
HIGH	≥ 20 changes	Requires immediate attention and robust test coverage
MEDIUM	10-19 changes	Should be monitored and potentially refactored
LOW	< 10 changes	Normal change frequency
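The thresholds in the table translate directly into code. A minimal sketch, where the constant and function names are hypothetical:

```python
HIGH_THRESHOLD = 20    # from the table: HIGH at 20 or more changes
MEDIUM_THRESHOLD = 10  # from the table: MEDIUM at 10 or more changes


def risk_level(change_count: int) -> str:
    """Classify a node by its absolute change count."""
    if change_count >= HIGH_THRESHOLD:
        return "HIGH"
    if change_count >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"


print(risk_level(42), risk_level(12), risk_level(3))  # HIGH MEDIUM LOW
```

Because the thresholds are absolute, the same function body can look riskier in a long-lived repository than in a young one; compare nodes within one repository rather than across repositories.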

How It Works

For each commit:

Parse the before and after versions of each changed file into UAST
Apply the dsl_struct query to select target nodes (e.g., functions)
Apply the dsl_name query to extract the name of each node
Map diff hunks to nodes using line-range overlaps
Emit a per-commit TC (Transient Commit result) with touched node deltas and coupling pairs

After all commits are processed, the Aggregator accumulates TCs into a final report with sorted nodes and a sparse co-change matrix.
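The hunk-to-node mapping step relies on a standard interval-overlap test. The sketch below assumes inclusive 1-based line ranges for both nodes and hunks; the `Node` type and `touched_nodes` helper are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    start_line: int  # inclusive
    end_line: int    # inclusive


def touched_nodes(nodes, hunks):
    """Return names of nodes whose span overlaps any (start, end) hunk."""
    touched = []
    for node in nodes:
        # Two inclusive ranges overlap when each starts before the other ends.
        if any(start <= node.end_line and node.start_line <= end
               for start, end in hunks):
            touched.append(node.name)
    return touched


nodes = [Node("parse", 1, 10), Node("validate", 12, 30), Node("render", 32, 40)]
hunks = [(8, 13), (50, 52)]
print(touched_nodes(nodes, hunks))  # ['parse', 'validate']
```

A single hunk that spans a function boundary, such as (8, 13) here, marks both neighbors as touched, which is one source of co-change pairs.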

Architecture

The shotness analyzer follows the TC/Aggregator pattern:

Consume phase: Per-commit processing builds working state (the nodes and files maps, used for deletion and rename tracking) and emits a TC{Data: *CommitData} with node touch deltas and coupling pairs.
Aggregation phase: The Aggregator accumulates node counts and coupling matrices from the TC stream. It supports disk-backed spilling via SpillStore for memory-bounded operation.
Serialization phase: SerializeTICKs() converts aggregated tick data into the Nodes/Counters report consumed by ComputeAllMetrics() and plot generation.

The nodes map remains in the analyzer as working state because handleDeletion, handleInsertion, handleModification, and applyRename read and mutate it during Consume(). The aggregator maintains its own separate accumulation of counts and couplings.

Output Formats

The shotness analyzer supports four output formats: JSON, YAML, text, and plot.

Text:

codefang run -a history/shotness -f text .

Terminal output with color-coded sections:

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Shotness Analysis                              42 nodes   ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

  Summary
  ──────────────────────────────────────────────────────────
  Total Nodes            42
  Total Changes          385
  Avg Changes/Node       9.2
  Total Couplings        156
  Avg Coupling Strength  34%
  Hot Nodes              8

  Hottest Functions
  ──────────────────────────────────────────────────────────
  processPayment (engine [████████████████████░] 1.0  (42 changes)
  validateInput (engine. [████████████████░░░░░] 0.8  (34 changes)

  Risk Assessment
  ──────────────────────────────────────────────────────────
  processPayment (engine HIGH    (42 changes)
  validateInput (engine. HIGH    (34 changes)

  Strongest Couplings
  ──────────────────────────────────────────────────────────
  processPayment ↔ validateInput   85%  (12 co-changes)
  handleRequest  ↔ parseBody       72%  (8 co-changes)

JSON:

codefang run -a history/shotness -f json .

{
  "node_hotness": [
    {
      "name": "processFile",
      "type": "Function",
      "file": "pkg/core/engine.go",
      "change_count": 42,
      "coupled_nodes": 3,
      "hotness_score": 1.0
    }
  ],
  "node_coupling": [
    {
      "node1_name": "processFile",
      "node1_file": "pkg/core/engine.go",
      "node2_name": "validate",
      "node2_file": "pkg/core/engine.go",
      "co_changes": 15,
      "coupling_strength": 0.36
    }
  ],
  "hotspot_nodes": [
    {
      "name": "processFile",
      "type": "Function",
      "file": "pkg/core/engine.go",
      "change_count": 42,
      "risk_level": "HIGH"
    }
  ],
  "aggregate": {
    "total_nodes": 3,
    "total_changes": 105,
    "total_couplings": 3,
    "avg_changes_per_node": 35.0,
    "avg_coupling_strength": 0.42,
    "hot_nodes": 2
  }
}
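The JSON report is convenient for scripting. The sketch below assumes the field names shown in the sample above; the embedded string is a trimmed stand-in for real output, and `high_risk` is a hypothetical helper.

```python
import json

# Trimmed stand-in for the JSON report; a real run contains more sections.
report_text = """
{
  "hotspot_nodes": [
    {"name": "processFile", "type": "Function",
     "file": "pkg/core/engine.go", "change_count": 42, "risk_level": "HIGH"},
    {"name": "validate", "type": "Function",
     "file": "pkg/core/engine.go", "change_count": 12, "risk_level": "MEDIUM"}
  ]
}
"""


def high_risk(report):
    """Return (file, name, change_count) for every HIGH-risk hotspot."""
    return [
        (node["file"], node["name"], node["change_count"])
        for node in report["hotspot_nodes"]
        if node["risk_level"] == "HIGH"
    ]


report = json.loads(report_text)
for file, name, count in high_risk(report):
    print(f"{file}: {name} ({count} changes)")
```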

YAML:

codefang run -a history/shotness -f yaml .

node_hotness:
  - name: processFile
    type: Function
    file: pkg/core/engine.go
    change_count: 42
    coupled_nodes: 3
    hotness_score: 1.0
node_coupling:
  - node1_name: processFile
    node1_file: pkg/core/engine.go
    node2_name: validate
    node2_file: pkg/core/engine.go
    co_changes: 15
    coupling_strength: 0.36
hotspot_nodes:
  - name: processFile
    type: Function
    file: pkg/core/engine.go
    change_count: 42
    risk_level: HIGH
aggregate:
  total_nodes: 3
  total_changes: 105
  total_couplings: 3
  avg_changes_per_node: 35.0
  avg_coupling_strength: 0.42
  hot_nodes: 2

Plot:

codefang run -a history/shotness -f plot -o shotness.html .

Generates an interactive HTML dashboard with three visualizations:

Code Hotness TreeMap: Hierarchical file → function view sized by change frequency
Function Coupling Matrix: Heatmap showing co-change frequency between functions
Top Hot Functions: Bar chart comparing self-changes vs coupled changes

Configuration Options

Option	Type	Default	Description
`Shotness.DSLStruct`	`string`	`filter(.roles has "Function")`	UAST DSL query to select which code structures to track.
`Shotness.DSLName`	`string`	`.props.name`	UAST DSL expression to extract the name from each matched node.

# .codefang.yml
history:
  shotness:
    dsl_struct: 'filter(.roles has "Function")'
    dsl_name: '.props.name'

Custom DSL Examples

Track classes instead of functions:

dsl_struct: 'filter(.roles has "Class")'
dsl_name: '.props.name'

Track both functions and methods:

dsl_struct: 'filter(.roles has "Function" or .roles has "Method")'
dsl_name: '.props.name'

Track interfaces:

dsl_struct: 'filter(.roles has "Interface")'
dsl_name: '.props.name'

Metrics Reference

Node Hotness

Field	Type	Description
`name`	string	Function/method name
`type`	string	UAST node type (e.g., "Function")
`file`	string	Source file path
`change_count`	int	Number of commits that modified this node
`coupled_nodes`	int	Number of other nodes that co-changed with this node
`hotness_score`	float	Normalized score [0, 1] relative to the hottest node
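The exact normalization behind `hotness_score` is not spelled out here. The sketch below assumes the simplest reading, a linear ratio of each node's change count to the highest change count; the helper name is hypothetical.

```python
def hotness_scores(change_counts: dict) -> dict:
    """Scale each change count by the maximum so the hottest node scores 1.0."""
    hottest = max(change_counts.values(), default=0)
    if hottest == 0:
        # Assumption for this sketch: with no changes, every score is 0.
        return {name: 0.0 for name in change_counts}
    return {name: count / hottest for name, count in change_counts.items()}


scores = hotness_scores({"processPayment": 42, "validateInput": 34})
print({name: round(score, 1) for name, score in scores.items()})
```

Under this assumption, the counts from the text output sample (42 and 34 changes) give 1.0 and roughly 0.8, which agrees with the bars shown there.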

Node Coupling

Field	Type	Description
`node1_name` / `node2_name`	string	Names of the coupled nodes
`node1_file` / `node2_file`	string	File paths of the coupled nodes
`co_changes`	int	Number of commits where both nodes changed
`coupling_strength`	float	Normalized strength [0, 1]

Aggregate

Field	Type	Description
`total_nodes`	int	Total tracked nodes
`total_changes`	int	Sum of all node change counts
`total_couplings`	int	Number of unique coupling pairs
`avg_changes_per_node`	float	Mean changes per node
`avg_coupling_strength`	float	Mean coupling strength across all pairs
`hot_nodes`	int	Nodes with change count ≥ 10 (MEDIUM or HIGH risk)
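The aggregate fields follow from the per-node and per-pair data. A minimal sketch using the field definitions in the table; the `aggregate` helper and its input shapes (a name-to-count mapping and a list of pair strengths) are assumptions, and the numbers are made up.

```python
def aggregate(change_counts, strengths, hot_threshold=10):
    """Summarize node change counts and coupling strengths."""
    total_nodes = len(change_counts)
    total_changes = sum(change_counts.values())
    return {
        "total_nodes": total_nodes,
        "total_changes": total_changes,
        "total_couplings": len(strengths),
        "avg_changes_per_node": total_changes / total_nodes if total_nodes else 0.0,
        "avg_coupling_strength": sum(strengths) / len(strengths) if strengths else 0.0,
        # Hot nodes are those at MEDIUM risk or above (10 or more changes).
        "hot_nodes": sum(1 for count in change_counts.values() if count >= hot_threshold),
    }


summary = aggregate({"a": 42, "b": 20, "c": 4}, [0.5, 0.25])
print(summary)
```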

Use Cases

Function-level hotspot detection: Find the most frequently changed functions in the codebase. These are the highest-risk points for bugs.
Fine-grained coupling analysis: Discover which functions always change together. This reveals implicit dependencies that file-level coupling misses.
Refactoring prioritization: Functions that are both hot (high change count) and coupled (always change with others) are the best refactoring candidates.
Architecture validation: Functions from different packages that are highly coupled may indicate a leaking abstraction.
Test prioritization: Focus testing resources on the hottest functions.

Interpreting Results

Reading the Coupling Strength

Strength	Interpretation
0.8 - 1.0	Very tight coupling. Functions almost always change together. Consider merging or extracting shared logic.
0.5 - 0.8	Moderate coupling. There is a significant shared dependency. Review if coupling is intentional.
0.2 - 0.5	Loose coupling. Occasional co-changes, likely due to shared APIs or data structures.
< 0.2	Minimal coupling. Co-changes are incidental.

Actionable Insights

High hotness + High coupling: Core function that drives many changes. Candidate for splitting or stabilizing the interface.
High hotness + Low coupling: Isolated function that receives frequent fixes. Needs better tests and potentially a redesign.
Low hotness + High coupling: Stable function that always changes with others. Check if coupling is necessary or indicates a design smell.
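The three combinations above can be turned into a simple triage rule. The cut-offs are assumptions chosen for this sketch: "hot" reuses the change count of 10 from the `hot_nodes` definition, and "coupled" uses 0.5, the lower bound of moderate coupling. The `insight` helper is hypothetical.

```python
def insight(change_count, max_strength, hot_threshold=10, coupling_threshold=0.5):
    """Map a node's hotness and strongest coupling to a suggested action."""
    hot = change_count >= hot_threshold
    coupled = max_strength >= coupling_threshold
    if hot and coupled:
        return "split or stabilize the interface"
    if hot:
        return "improve tests, consider a redesign"
    if coupled:
        return "check whether the coupling is necessary"
    return "no action suggested"


print(insight(42, 0.85))
print(insight(42, 0.10))
print(insight(3, 0.90))
```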

Limitations

UAST required: Only languages with UAST parser support are analyzed. Files in unsupported languages are skipped entirely.
CPU intensive: The analyzer performs UAST parsing on both the before and after versions of every changed file in every commit. This makes it one of the most expensive analyzers. It benefits from parallel execution.
Name collisions: Two functions with the same name in different files are tracked as distinct nodes, because the file path is part of the key. When a file is renamed, the analyzer updates all nodes associated with it.
Shallow extraction within a file: When multiple structural nodes in the same file share the same extracted name (e.g., nested functions with identical names), only one is tracked. The last one encountered wins. Qualified paths (e.g., OuterClass.innerMethod) are not built.
DSL limitations: The DSL query must match nodes that have position information (Pos field) in the UAST. Nodes without position data cannot be mapped to diff hunks.
Large functions: A change anywhere within a function's line range counts as a change to that function. Very large functions (hundreds of lines) will have inflated change counts.
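The two naming limitations above are easy to see with a plain mapping. The sketch assumes nodes are indexed by a `(file, name)` key and that later matches overwrite earlier ones; `index_nodes` and the sample data are hypothetical, not the analyzer's data structures.

```python
def index_nodes(file_path, extracted):
    """Index (name, start_line, end_line) tuples by (file_path, name)."""
    index = {}
    for name, start, end in extracted:
        # A repeated name in the same file replaces the earlier entry.
        index[(file_path, name)] = (start, end)
    return index


# "helper" appears twice in a.py, e.g. once at top level and once nested.
nodes = index_nodes("a.py", [("helper", 1, 5), ("run", 7, 20), ("helper", 10, 12)])
# The same name in another file gets its own key.
nodes.update(index_nodes("b.py", [("helper", 3, 9)]))

print(len(nodes))                # 3
print(nodes[("a.py", "helper")])  # (10, 12)
```

If nested or overloaded names matter for a codebase, a `dsl_name` expression that yields a more specific name may reduce collisions, provided the UAST exposes a suitable property.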