Introduction
Debtmap is a code complexity sensor for AI-assisted development. It identifies technical debt hotspots and provides the structured data AI coding tools need to understand and fix them.
What is Debtmap?
Debtmap is different from traditional static analysis tools. Instead of telling you what to fix, it provides signals that AI assistants can use to make informed decisions:
- Where to look - Prioritized list of debt items with exact file locations
- What to read - Context suggestions (callers, callees, test files)
- What signals matter - Complexity, coverage, coupling metrics
The key insight: AI coding assistants are great at fixing code, but they need guidance on where to focus and what to read. Debtmap provides that guidance.
The AI Sensor Model
Debtmap is a sensor, not a prescriber. It measures and reports; it doesn’t tell you what to do.
What Debtmap provides:
- Quantified complexity signals (cyclomatic, cognitive, nesting)
- Test coverage gaps with risk prioritization
- Context suggestions for AI consumption
- Structured output (JSON, LLM-markdown) for machine consumption
What Debtmap doesn’t provide:
- “Fix this by doing X” recommendations
- “You should consider Y” advice
- Template-based refactoring suggestions
This design is intentional. AI assistants can consider business context, team preferences, and constraints that Debtmap can’t know. The AI decides what to do; Debtmap tells it where to look.
Quick Start
# Install
cargo install debtmap
# Analyze and pipe to Claude Code
debtmap analyze . --format markdown --top 3 | claude "Fix the top item"
# Get structured signals for your AI workflow
debtmap analyze . --format json --top 10 > debt.json
# With coverage data for accurate risk assessment
cargo llvm-cov --lcov --output-path coverage.lcov
debtmap analyze . --lcov coverage.lcov --format markdown
Key Features
Signal Generation
- Complexity signals - Cyclomatic, cognitive, nesting depth, lines of code
- Coverage signals - Line coverage, branch coverage, function coverage
- Coupling signals - Fan-in, fan-out, call graph depth
- Quality signals - Entropy (false positive reduction), purity (testability)
For a complete list of metrics and their formulas, see the Metrics Reference.
AI-Optimized Output
- LLM markdown format - Minimal tokens, maximum information
- Context suggestions - File ranges the AI should read
- Structured JSON - Stable schema for programmatic access
- Deterministic output - Same input = same output
Analysis Capabilities
- Rust-first analysis - Full AST parsing, macro expansion, trait resolution
- Coverage integration - Native LCOV support for risk assessment
- Debt pattern detection - God objects, boilerplate code, error handling anti-patterns
- Entropy analysis - Reduces false positives from repetitive code
- Parallel processing - Fast analysis (typically 10-100x faster than comparable JVM- or Python-based tools)
Workflow Integration
- Direct piping - Pipe output to Claude, Cursor, or custom agents
- CI/CD gates - Validate debt thresholds with the `validate` command
- Progress tracking - Compare debt across commits with the `compare` and `validate-improvement` commands
Current Status
Debtmap focuses exclusively on Rust. This focused approach allows us to:
- Build deep Rust-specific analysis (macros, traits, lifetimes)
- Perfect core algorithms before expanding
- Deliver the best possible AI sensor for Rust
Multi-language support (Python, JavaScript/TypeScript, Go) is planned for future releases.
Target Audience
Debtmap is designed for:
- AI-assisted developers - Get signals that help AI assistants make better decisions
- Development teams - Prioritize debt remediation with quantified metrics
- CI/CD engineers - Enforce quality gates with coverage-aware thresholds
- Legacy codebase maintainers - Identify where AI can help most
Getting Started
Ready to start? Check out:
- Getting Started - Installation and first analysis
- LLM Integration - AI workflow patterns
- Why Debtmap? - The AI sensor model explained
- TUI Guide - Interactive exploration with the terminal UI
Quick tip: Start with debtmap analyze . --format markdown --top 5 to see the top priority items with context suggestions.
Why Debtmap?
Debtmap is a code complexity sensor designed for AI-assisted development workflows. It identifies technical debt hotspots and provides the structured data AI coding tools need to understand and fix them.
The AI Development Paradox
AI coding assistants like Claude Code, GitHub Copilot, and Cursor are transforming software development. They can write code faster than ever before. But this creates a paradox:
AI creates technical debt faster than humans can manage it.
When an AI generates hundreds of lines of code per hour, traditional code review and refactoring processes break down. Teams accumulate debt faster than they can pay it down.
At the same time, AI assistants struggle to fix the debt they create:
- Limited context window - They can’t see the entire codebase at once
- No test awareness - They don’t know which code is tested vs untested
- No prioritization - They can’t identify what matters most
- Wasted tokens - They read irrelevant code while missing critical context
What AI Coding Tools Need
For an AI assistant to effectively fix technical debt, it needs:
1. Prioritized Targets
Not “here are 500 complex functions,” but “here are the 10 functions that matter most, ranked by severity.”
Debtmap provides a severity score (0-10) that combines:
- Complexity metrics (cyclomatic, cognitive, nesting)
- Test coverage gaps
- Coupling and dependency impact
- Pattern-based false positive reduction
2. Context Suggestions
Not “this function is complex,” but “read lines 38-85 of parser.rs, plus lines 100-120 of handler.rs where it’s called, and lines 50-75 of the test file.”
Debtmap’s context suggestions tell the AI exactly which code to read:
CONTEXT:
├─ Primary: src/parser.rs:38-85 (the debt item)
├─ Caller: src/handler.rs:100-120 (usage context)
└─ Tests: tests/parser_test.rs:50-75 (expected behavior)
3. Quantified Signals
Not “this code is bad,” but “cyclomatic complexity: 12, cognitive complexity: 18, test coverage: 0%, called by 8 functions.”
These signals let the AI make informed decisions about the best approach:
- High complexity + good coverage = risky to refactor
- Low complexity + no coverage = easy test target
- High coupling + high complexity = incremental approach needed
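As an illustration, a custom pipeline could branch on these signals before prompting an AI assistant. The sketch below is hypothetical glue code, not part of Debtmap; it assumes the JSON field names shown later in this guide (`metrics`, `coverage`, `context`):

```python
def pick_strategy(item: dict) -> str:
    """Map debtmap signals to a remediation strategy (illustrative heuristics)."""
    cyclomatic = item["metrics"]["cyclomatic"]
    coverage = item["coverage"]["line_percent"]
    fan_in = len(item["context"].get("callers", []))  # caller entries as a fan-in proxy

    if cyclomatic > 10 and coverage >= 80:
        return "risky-refactor: tests exist, but change carefully"
    if cyclomatic <= 5 and coverage == 0:
        return "easy-test-target: add unit tests first"
    if fan_in >= 8 and cyclomatic > 10:
        return "incremental: extract small pieces, re-run analysis each step"
    return "review: no dominant signal"
```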
4. Structured Output
Not free-form text, but JSON and markdown optimized for LLM consumption:
- Consistent structure across all debt items
- Minimal tokens for maximum information
- Deterministic output for reproducible workflows
- Stable IDs for referencing items across runs
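For example, stable IDs make it straightforward to track a single item across two runs. A minimal sketch, assuming two JSON reports shaped like the example in the LLM Integration guide:

```python
import json

def score_delta(before_path: str, after_path: str) -> dict:
    """Match items by stable ID and report per-item score changes."""
    with open(before_path) as f:
        before = {i["id"]: i["score"] for i in json.load(f)["items"]}
    with open(after_path) as f:
        after = {i["id"]: i["score"] for i in json.load(f)["items"]}
    # Negative delta = improvement; IDs missing from `after` were resolved
    return {item_id: after.get(item_id, 0.0) - score
            for item_id, score in before.items()}

print(score_delta("before.json", "after.json"))
```

Debtmap's own `compare` command does this more thoroughly; the sketch just shows the ID mechanism.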
What Debtmap Provides
Complexity Signals
| Signal | What It Measures | Why It Matters |
|---|---|---|
| Cyclomatic | Decision points (if, match, loop) | Number of execution paths |
| Cognitive | Readability difficulty | How hard code is to understand |
| Nesting | Indentation depth | Compound complexity |
| Lines | Function length | Scope of changes needed |
Coverage Signals
| Signal | What It Measures | Why It Matters |
|---|---|---|
| Line coverage | % of lines executed by tests | Basic test coverage |
| Branch coverage | % of branches taken | Edge case coverage |
| Function coverage | Whether function is tested at all | Critical gap detection |
Coupling Signals
| Signal | What It Measures | Why It Matters |
|---|---|---|
| Fan-in | Functions that call this function | Impact of changes |
| Fan-out | Functions this function calls | Dependency risk |
| Call depth | Distance from entry points | Integration complexity |
Quality Signals
| Signal | What It Measures | Why It Matters |
|---|---|---|
| Entropy | Pattern variety in code | False positive filtering |
| Purity | Side effect presence | Testability indicator |
| Dead code | Unused functions | Cleanup candidates |
What Debtmap Doesn’t Do
Debtmap is a sensor, not a prescriber. It measures and reports; it doesn’t tell you what to do.
No Fix Suggestions
Debtmap doesn’t say “split this function into 5 modules” or “add 8 unit tests.” Those decisions require understanding the business context, architectural constraints, and team preferences that only humans (or AI with proper context) can evaluate.
No “Should” Statements
Debtmap doesn’t say “you should refactor this” or “consider extracting a helper function.” It reports facts: “complexity: 18, coverage: 0%, called by 12 functions.” The AI or developer decides what to do with that information.
No Impact Predictions
Debtmap doesn’t claim “refactoring this will reduce bugs by 40%.” Such predictions are speculative. Debtmap reports what it can measure accurately and leaves interpretation to the consumer.
Comparison with Alternatives
vs Static Analysis Tools (SonarQube, CodeClimate)
| Aspect | Traditional Tools | Debtmap |
|---|---|---|
| Output | Recommendations | Signals |
| Audience | Humans | AI + Humans |
| Format | Dashboards | JSON/Markdown |
| Speed | Minutes | Seconds |
| Focus | “Fix this” | “Here’s what exists” |
Traditional tools are designed for human code review workflows. Debtmap is designed for AI-assisted development.
vs Linters (Clippy, ESLint)
| Aspect | Linters | Debtmap |
|---|---|---|
| Focus | Style/idioms | Complexity/debt |
| Scope | Line-level | Function-level |
| Output | Warnings | Prioritized signals |
| Coverage | Not integrated | Core feature |
Linters catch code style issues. Debtmap identifies complexity hotspots. Use both.
vs Coverage Tools (Tarpaulin, pytest-cov)
| Aspect | Coverage Tools | Debtmap |
|---|---|---|
| Output | Coverage % | Risk-prioritized gaps |
| Complexity | Not considered | Core metric |
| Context | None | File ranges for AI |
Coverage tools tell you what’s tested. Debtmap tells you what untested code is most risky.
How Debtmap Fits in Your Workflow
AI-Assisted Development
# Generate debt signals
debtmap analyze . --format markdown --lcov coverage.lcov > debt.md
# Pipe to AI
cat debt.md | claude "Fix the top item, read the suggested context first"
CI/CD Integration
# Fail build if debt exceeds thresholds
debtmap validate . --max-debt-density 10.0
# Generate report for PR review
debtmap analyze . --format json --output debt-report.json
Exploratory Analysis
# Quick overview
debtmap analyze . --top 10
# Deep dive with coverage
debtmap analyze . --lcov coverage.lcov --format terminal -vv
Key Insights
- Debtmap is a sensor - It measures, it doesn’t prescribe
- AI does the thinking - Debtmap provides data, AI decides action
- Context is key - Knowing what to read is as valuable as what to fix
- Signals over interpretations - Raw metrics, not template advice
- Speed matters - Fast enough for local development loops
Next Steps
Ready to try it? Head to Getting Started to install debtmap and run your first analysis.
Want to integrate with your AI workflow? See LLM Integration for detailed guidance.
Want to understand how it works under the hood? See Architecture for the analysis pipeline.
Getting Started
This guide will help you install Debtmap and run your first analysis in just a few minutes.
Prerequisites
Before installing Debtmap, you’ll need:
- For pre-built binaries: No prerequisites! The install script handles everything.
- For cargo install or building from source:
- Rust toolchain (rustc and cargo)
- Supported platforms: Linux, macOS, Windows
- Rust edition 2021 or later
Optional (for coverage-based risk analysis):
- Rust projects: `cargo-tarpaulin` or `cargo-llvm-cov` for coverage data
Installation
Quick Install (Recommended)
Install the latest release with a single command:
curl -sSL https://raw.githubusercontent.com/iepathos/debtmap/master/install.sh | bash
Or with wget:
wget -qO- https://raw.githubusercontent.com/iepathos/debtmap/master/install.sh | bash
This will:
- Automatically detect your OS and architecture
- Download the appropriate pre-built binary from the latest GitHub release
- Install debtmap to `~/.cargo/bin` if it exists, otherwise `~/.local/bin`
- Offer to automatically add the install directory to your PATH if needed
Using Cargo
If you have Rust installed:
cargo install debtmap
From Source
For the latest development version:
# Clone the repository
git clone https://github.com/iepathos/debtmap.git
cd debtmap
# Build and install
cargo install --path .
Verify Installation
After installation, verify Debtmap is working:
# Check version
debtmap --version
# See available commands
debtmap --help
Quick Start
Here are the most common commands to get you started:
# Basic analysis (simplest command)
debtmap analyze .
# Markdown output for AI workflows (recommended)
debtmap analyze . --format markdown
# Pipe directly to Claude Code
debtmap analyze . --format markdown --top 3 | claude "Fix the top item"
# JSON output for programmatic access
debtmap analyze . --format json --top 10 > debt.json
# With coverage data for accurate risk assessment
cargo llvm-cov --lcov --output-path coverage.lcov
debtmap analyze . --lcov coverage.lcov
# Show only critical/high priority items
debtmap analyze . --min-priority high --top 10
# Terminal output for human exploration
debtmap analyze . --format terminal
First Analysis
Let’s run your first analysis! Navigate to a project directory and run:
debtmap analyze .
What happens during analysis:
- File Discovery - Debtmap scans your project for source files (currently Rust `.rs` with full analysis; Python `.py` file detection only)
- Parsing - Each file is parsed into an Abstract Syntax Tree (AST)
- Metric Extraction - Complexity, coverage gaps, and coupling are measured
- Prioritization - Results are ranked by severity (CRITICAL, HIGH, MEDIUM, LOW, MINIMAL)
- Context Generation - File ranges are suggested for each debt item
- Output - Results are displayed in your chosen format
Expected timing: Analyzing a 10,000 LOC project typically takes 2-5 seconds.
Example Output
When you run debtmap analyze . --format markdown, you’ll see output like this:
# Technical Debt Analysis
## Summary
- Total items: 47
- Critical: 3, High: 12, Moderate: 20, Low: 12
## #1 [CRITICAL] parse_complex_input
**Location:** src/parser.rs:38-85
**Score:** 8.9/10
**Signals:**
| Metric | Value |
|--------|-------|
| Cyclomatic | 12 |
| Cognitive | 18 |
| Coverage | 0% |
| Nesting | 4 |
**Context:**
- Primary: src/parser.rs:38-85
- Caller: src/handler.rs:100-120
- Test: tests/parser_test.rs:50-75
Understanding the Output
Priority Tiers
| Tier | Score | Meaning |
|---|---|---|
| CRITICAL | 9.0-10.0 | Severe risk requiring immediate attention |
| HIGH | 7.0-8.9 | Significant risk, address this sprint |
| MEDIUM | 5.0-6.9 | Moderate risk, plan for next sprint |
| LOW | 3.0-4.9 | Minor risk, monitor |
| MINIMAL | 0.0-2.9 | Well-managed code |
Key Signals
Complexity signals:
- Cyclomatic: Decision points (if, match, loop)
- Cognitive: How hard code is to understand
- Nesting: Indentation depth
- Lines: Function length
Coverage signals:
- Line coverage: % of lines executed by tests
- Branch coverage: % of branches taken
Coupling signals:
- Fan-in: Functions that call this function
- Fan-out: Functions this function calls
Context Suggestions
Each debt item includes file ranges the AI should read:
Context:
├─ Primary: src/parser.rs:38-85 (the debt item)
├─ Caller: src/handler.rs:100-120 (usage context)
└─ Test: tests/parser_test.rs:50-75 (expected behavior)
These suggestions help AI assistants understand the code before making changes.
Output Formats
Markdown (for AI workflows)
debtmap analyze . --format markdown
Optimized for minimal token usage while providing all necessary context.
JSON (for programmatic access)
debtmap analyze . --format json --output debt.json
Structured data for CI/CD integration and custom tooling.
Terminal (for human exploration)
debtmap analyze . --format terminal
Color-coded, interactive output for manual review.
Adding Coverage Data
Coverage data enables accurate risk assessment:
# Generate coverage with cargo-llvm-cov
cargo llvm-cov --lcov --output-path coverage.lcov
# Or with cargo-tarpaulin
cargo tarpaulin --out lcov --output-dir target/coverage
# Analyze with coverage
debtmap analyze . --lcov coverage.lcov
With coverage data:
- Complex code with good tests = lower priority
- Simple code with no tests = higher priority
- Untested error paths are identified
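The CLI Reference documents the dampening rule as multiplier = 1.0 - coverage. A quick worked example (illustrative base scores, not real output) shows why a simple untested function can outrank a well-tested complex one:

```python
def dampened(base_score: float, line_coverage: float) -> float:
    """Apply the documented dampening: multiplier = 1.0 - coverage."""
    return base_score * (1.0 - line_coverage)

print(dampened(8.0, 0.9))  # complex but well tested  -> 0.8
print(dampened(5.0, 0.0))  # simpler but untested     -> 5.0
```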
AI Workflow Examples
Claude Code
# Direct piping
debtmap analyze . --format markdown --top 3 | claude "Fix the top item"
# With coverage
cargo llvm-cov --lcov --output-path coverage.lcov
debtmap analyze . --format markdown --lcov coverage.lcov --top 1 | \
claude "Add tests for this function"
Cursor
# Generate report for Cursor to reference
debtmap analyze . --format markdown --top 10 > debt-report.md
# In Cursor: @debt-report.md Fix the top critical item
Custom Pipelines
# Get JSON for programmatic processing
debtmap analyze . --format json --top 5 | \
jq '.items[0].context.primary' | \
xargs -I {} echo "Read {}"
Configuration
Create a project-specific configuration:
debtmap init
This creates .debtmap.toml:
[thresholds]
complexity = 10
duplication = 40
[tiers]
critical = 9.0
high = 7.0
medium = 5.0
[ignore]
patterns = ["**/target/**", "**/tests/**"]
CLI Options Reference
Analysis Options
| Option | Description |
|---|---|
| `--format <FORMAT>` | Output format: terminal, json, markdown |
| `--output <FILE>` | Write to file instead of stdout |
| `--lcov <FILE>` | LCOV coverage file for risk analysis |
| `--top <N>` | Show only top N priority items |
| `--min-priority <TIER>` | Filter by minimum priority (low, medium, high, critical) |
| `--min-score <N>` | Filter items below score N |
Verbosity Options
| Option | Description |
|---|---|
| `-v` | Show main score factors |
| `-vv` | Show detailed calculations |
| `-vvv` | Show all debug information |
| `--quiet` | Suppress progress output |
Performance Options
| Option | Description |
|---|---|
| `--jobs <N>` | Number of threads (0 = all cores) |
| `--no-parallel` | Disable parallel processing |
| `--max-files <N>` | Limit analysis to N files |
| `--no-tui` | Disable TUI progress visualization (use simple progress bars) |
| `--streaming` | Enable streaming output mode for large codebases (O(1) memory) |
| `--stream-to <FILE>` | Output file for streaming mode (use "-" for stdout) |
Troubleshooting
Installation Issues
- Binary not in PATH: Add `~/.cargo/bin` or `~/.local/bin` to your PATH: `export PATH="$HOME/.cargo/bin:$PATH"` (add to `~/.bashrc` or `~/.zshrc`)
- Permission issues: Run the install script with your current user (don't use sudo)
- Cargo not found: Install Rust from https://rustup.rs
Analysis Issues
- Empty output: Check that your project contains supported source files (`.rs` for Rust, `.py` for Python)
- Parser failures: Run with `-vv` for debug output
- Performance issues: Limit parallel jobs with `--jobs 4`
- Large codebase slowness: Use `--streaming` mode for O(1) memory overhead
Coverage Issues
- Coverage not applied: Verify LCOV file path is correct
- Low coverage detected: Ensure tests actually run during coverage generation
- Coverage debugging: Use `debtmap diagnose-coverage <file.lcov>` to validate coverage file parsing
What’s Next?
Now that you’ve run your first analysis:
- Integrate with AI: See LLM Integration for AI workflow patterns
- Understand metrics: See Metrics Reference for signal definitions
- Configure thresholds: See Configuration for customization
- CI/CD integration: See Prodigy Integration for automation
Need help? Report issues at https://github.com/iepathos/debtmap/issues
LLM Integration Guide
How to use debtmap output in AI coding workflows.
Overview
Debtmap is designed to provide AI coding assistants with the signals they need to understand and fix technical debt. This guide covers:
- Output formats optimized for LLMs
- Context suggestions and how to use them
- Example workflows for different AI tools
- Interpreting signals for effective prompts
Output Formats
Markdown (Recommended)
The markdown format is specifically designed for LLM consumption:
debtmap analyze . --format markdown --top 5
Output structure:
# Technical Debt Analysis
## Summary
- Total items: 47
- Critical: 3
- High: 12
- Moderate: 20
- Low: 12
## Top Priority Items
### #1 [CRITICAL] parse_complex_input
**Location:** src/parser.rs:38-85
**Score:** 8.9/10
**Signals:**
| Metric | Value | Threshold |
|--------|-------|-----------|
| Cyclomatic | 12 | 10 |
| Cognitive | 18 | 15 |
| Coverage | 0% | 80% |
| Nesting | 4 | 3 |
**Context to read:**
- Primary: src/parser.rs:38-85
- Caller: src/handler.rs:100-120
- Caller: src/api.rs:45-60
- Test: tests/parser_test.rs:50-75
---
### #2 [CRITICAL] validate_auth
...
Why this format works:
- Consistent structure across all items
- Tables for easy metric comparison
- Context suggestions inline
- Minimal tokens for maximum information
JSON
For programmatic access and CI/CD integration:
debtmap analyze . --format json --output debt.json
Structure:
{
  "version": "1.0",
  "timestamp": "2024-01-15T10:30:00Z",
  "summary": {
    "total_items": 47,
    "by_tier": {
      "critical": 3,
      "high": 12,
      "moderate": 20,
      "low": 12
    },
    "total_loc": 15420
  },
  "items": [
    {
      "rank": 1,
      "id": "parse_complex_input_38",
      "tier": "critical",
      "score": 8.9,
      "location": {
        "file": "src/parser.rs",
        "line_start": 38,
        "line_end": 85,
        "function": "parse_complex_input"
      },
      "metrics": {
        "cyclomatic": 12,
        "cognitive": 18,
        "nesting": 4,
        "loc": 47
      },
      "coverage": {
        "line_percent": 0.0,
        "branch_percent": 0.0
      },
      "context": {
        "primary": "src/parser.rs:38-85",
        "callers": [
          {"file": "src/handler.rs", "lines": "100-120", "calls": 12},
          {"file": "src/api.rs", "lines": "45-60", "calls": 8}
        ],
        "tests": [
          {"file": "tests/parser_test.rs", "lines": "50-75"}
        ]
      }
    }
  ]
}
Terminal
For human exploration (not recommended for AI piping):
debtmap analyze . --format terminal
Context Suggestions
Each debt item includes a context field that tells the AI exactly what code to read:
CONTEXT:
├─ Primary: src/parser.rs:38-85 (the debt item)
├─ Caller: src/handler.rs:100-120 (understands usage)
├─ Caller: src/api.rs:45-60 (understands usage)
├─ Callee: src/tokenizer.rs:15-40 (understands dependencies)
└─ Test: tests/parser_test.rs:50-75 (understands expected behavior)
Context Types
| Type | What It Contains | Why It Matters |
|---|---|---|
| Primary | The function with debt | Core code to understand/fix |
| Caller | Functions that call this | Usage patterns, constraints |
| Callee | Functions this calls | Dependencies, side effects |
| Test | Related test files | Expected behavior, test patterns |
| Type | Struct/enum definitions | Data structures being used |
Using Context Effectively
Minimal context (fastest):
# Just get the primary location
debtmap analyze . --format json | jq '.items[0].location'
Full context (most accurate):
# Read all suggested files
debtmap analyze . --format markdown --top 1
# Then have the AI read each file in the context section
Example Workflows
Claude Code Integration
Direct piping:
debtmap analyze . --format markdown --top 3 | claude "Fix the top item. Read the context files first."
Two-step workflow:
# Step 1: Get the analysis
debtmap analyze . --format markdown --lcov coverage.lcov --top 5 > debt.md
# Step 2: Send to Claude with context
cat debt.md | claude "
Read the context files for item #1 before making changes.
Then fix the debt item following these rules:
1. Add tests first
2. Refactor only after tests pass
3. Keep functions under 20 lines
"
Focused fix with coverage:
# Generate fresh coverage
cargo llvm-cov --lcov --output-path coverage.lcov
# Analyze with coverage
debtmap analyze . --format markdown --lcov coverage.lcov --top 1
# Send the top item to Claude
debtmap analyze . --format markdown --lcov coverage.lcov --top 1 | \
claude "Add tests for this function to reach 80% coverage"
Cursor Integration
Cursor works best with file-based context:
# Generate debt report
debtmap analyze . --format markdown --top 10 > .cursor/debt-report.md
# In Cursor, reference the report:
# @debt-report.md Fix the top critical item
Custom Agent Workflow
For building your own AI pipeline:
import json
import subprocess

def parse_file_spec(spec):
    # Split a "path:start-end" spec into a path and a (start, end) line range
    path, _, line_range = spec.partition(":")
    start, _, end = line_range.partition("-")
    return path, (int(start), int(end))

def read_file_lines(path, lines):
    # Return the inclusive line range from a file as a single string
    start, end = lines
    with open(path) as f:
        return "".join(f.readlines()[start - 1:end])

# Run debtmap analysis
result = subprocess.run(
    ["debtmap", "analyze", ".", "--format", "json", "--top", "10"],
    capture_output=True,
    text=True,
)
debt_data = json.loads(result.stdout)

# Process each item
for item in debt_data["items"]:
    # Extract context specs (primary is "path:start-end";
    # caller entries carry separate "file" and "lines" fields)
    context_specs = [item["context"]["primary"]]
    context_specs.extend(
        f'{c["file"]}:{c["lines"]}' for c in item["context"].get("callers", [])
    )

    # Read context files
    context_content = ""
    for file_spec in context_specs:
        file_path, lines = parse_file_spec(file_spec)
        context_content += read_file_lines(file_path, lines)

    # Build prompt
    prompt = f"""
Fix this technical debt item:
Location: {item["location"]["file"]}:{item["location"]["line_start"]}
Function: {item["location"]["function"]}
Score: {item["score"]}/10
Signals:
- Cyclomatic complexity: {item["metrics"]["cyclomatic"]}
- Test coverage: {item["coverage"]["line_percent"]}%
Context code:
{context_content}
Instructions:
1. Add tests first
2. Refactor to reduce complexity
3. Keep the same public API
"""

    # Send to your LLM (replace call_llm with your own client)
    response = call_llm(prompt)
CI/CD Integration
GitHub Actions example:
name: Debt Analysis
on: [pull_request]

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install debtmap
        run: cargo install debtmap
      - name: Generate coverage
        run: cargo llvm-cov --lcov --output-path coverage.lcov
      - name: Analyze debt
        run: |
          debtmap analyze . --format json --lcov coverage.lcov > debt.json
      - name: Check for critical items
        run: |
          CRITICAL=$(jq '.summary.by_tier.critical' debt.json)
          if [ "$CRITICAL" -gt 0 ]; then
            echo "::warning::Found $CRITICAL critical debt items"
          fi
      - name: Upload report
        uses: actions/upload-artifact@v3
        with:
          name: debt-report
          path: debt.json
Interpreting Signals
Severity Score (0-10)
The severity score combines multiple signals:
| Score | Tier | Interpretation |
|---|---|---|
| 8.0-10.0 | Critical | High complexity, no tests, high coupling |
| 5.0-7.9 | High | Moderate risk, coverage gaps |
| 2.0-4.9 | Moderate | Lower risk, monitor |
| 0.0-1.9 | Low | Acceptable state |
Complexity Signals
Cyclomatic complexity:
- 1-5: Simple, easy to test
- 6-10: Moderate, manageable
- 11-20: Complex, consider splitting
- 21+: Very complex, high priority
Cognitive complexity:
- Measures how hard code is to understand
- Penalizes nesting more than cyclomatic
- Higher values = harder to reason about
Nesting depth:
- 1-2: Normal
- 3: Getting complex
- 4+: Strongly consider refactoring
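To make these numbers concrete, here is a small Python function annotated with the decision points that drive cyclomatic complexity. Exact counting rules vary by tool; this sketch uses the common decisions-plus-one convention:

```python
def classify(order):                 # cyclomatic: 3 decisions + 1 = 4
    if order is None:                # decision 1 (nesting depth 1)
        return "invalid"
    for item in order.items:         # decision 2 (nesting depth 1)
        if item.quantity > 100:      # decision 3 (nesting depth 2)
            return "bulk"
    return "standard"
```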
Coverage Signals
Line coverage:
- 0%: Critical gap, no tests at all
- 1-50%: Poor coverage
- 51-80%: Moderate coverage
- 81%+: Good coverage
Branch coverage:
- More important than line coverage
- Missing branches = missing edge cases
- 0% branch = high risk
Coupling Signals
Fan-in (callers):
- High fan-in = many dependents
- Changes affect many places
- Higher priority for stability
Fan-out (callees):
- High fan-out = many dependencies
- Complex testing requirements
- Consider dependency injection
Best Practices
For Effective AI Prompts
1. Always include context files
   - AI makes better decisions with more context
   - Context suggestions are curated for relevance
2. Specify your constraints
   - "Keep the same public API"
   - "Add tests before refactoring"
   - "Functions must be under 20 lines"
3. One item at a time
   - Focus on the top priority item
   - Complete the fix before moving on
   - Re-run analysis after changes
4. Verify with coverage
   - Regenerate coverage after changes
   - Run debtmap again to confirm improvement
   - Track score reduction
For CI/CD Integration
1. Set appropriate thresholds
   - Don't fail on existing debt
   - Fail on new critical items
   - Track trends over time
2. Cache analysis results
   - Use git-based caching
   - Only re-analyze changed files
3. Integrate with PR comments
   - Show debt impact of changes
   - Suggest focus areas for review
Troubleshooting
Empty context suggestions
Cause: Call graph analysis couldn’t resolve callers/callees
Solution: Ensure file parsing succeeded:
debtmap analyze . -vv # Verbose mode shows parsing issues
Inconsistent scores between runs
Cause: Non-deterministic analysis (should not happen)
Solution: Report as a bug with reproducible example
Large context suggestions
Cause: High coupling in codebase
Solution: Use --max-context-lines to limit:
debtmap analyze . --format markdown --max-context-lines 300
API Reference
CLI Options for LLM Integration
| Option | Description |
|---|---|
| `--format markdown` | LLM-optimized markdown output |
| `--format json` | Structured JSON output |
| `--top N` | Limit to top N items |
| `--lcov FILE` | Include coverage data |
| `--min-score N` | Filter items below score N |
| `--output FILE` | Write to file instead of stdout |
JSON Schema
The JSON output follows a stable schema. See Output Formats for the complete schema definition.
Next Steps
- Configure analysis: See Configuration
- Understand metrics: See Metrics Reference
- View examples: See Examples
CLI Reference
Complete reference for Debtmap command-line interface.
Quick Start
# Basic analysis
debtmap analyze src/
# With coverage integration
debtmap analyze src/ --coverage-file coverage.lcov
# Generate JSON report
debtmap analyze . --format json --output report.json
# Show top 10 priority items only
debtmap analyze . --top 10 --min-priority high
# Initialize configuration and validate
debtmap init
debtmap validate . --config debtmap.toml
Global Options
Options that apply to all commands:
- `--show-config-sources` - Show where each configuration value came from (spec 201)
  - Displays the source of each configuration value (default, config file, environment, CLI)
  - Useful for debugging configuration precedence issues
- `--config <PATH>` - Custom config file path (overrides default locations)
  - Can also use the `DEBTMAP_CONFIG` environment variable
Example:
# Show configuration sources
debtmap --show-config-sources analyze .
# Use custom config file
debtmap --config /path/to/debtmap.toml validate .
Commands
Debtmap provides seven main commands: five for analysis and validation, plus two debugging tools.
analyze
Analyze code for complexity and technical debt.
Usage:
debtmap analyze <PATH> [OPTIONS]
Arguments:
- `<PATH>` - Path to analyze (file or directory)
Description: Primary command for code analysis. Supports multiple output formats (json, markdown, terminal), coverage file integration, parallel processing, context-aware risk analysis, and comprehensive filtering options.
See Options section below for all available flags.
init
Initialize a Debtmap configuration file.
Usage:
debtmap init [OPTIONS]
Options:
- `-f, --force` - Force overwrite existing config
Description:
Creates a .debtmap.toml configuration file in the current directory with default settings. Use --force to overwrite an existing configuration file.
validate
Validate code against thresholds defined in configuration file.
Usage:
debtmap validate <PATH> [OPTIONS]
Arguments:
- `<PATH>` - Path to analyze
Options:
Configuration & Output:
- `-c, --config <CONFIG>` - Configuration file path
- `-f, --format <FORMAT>` - Output format: json, markdown, terminal
- `-o, --output <OUTPUT>` - Output file path (defaults to stdout)
Coverage & Context:
- `--coverage-file <PATH>` / `--lcov <PATH>` - LCOV coverage file for risk analysis
- `--context` / `--enable-context` - Enable context-aware risk analysis
- `--context-providers <PROVIDERS>` - Context providers to use (comma-separated)
- `--disable-context <PROVIDERS>` - Disable specific context providers
Thresholds & Validation:
- `--max-debt-density <N>` - Maximum debt density allowed (per 1000 LOC)
Display Filtering:
- `--top <N>` / `--head <N>` - Show only top N priority items
- `--tail <N>` - Show only bottom N priority items
- `-s, --summary` - Use summary format with tiered priority display
Analysis Control:
- `--semantic-off` - Disable semantic analysis (fallback mode)
Performance Control:
- `--no-parallel` - Disable parallel call graph construction (enabled by default)
- `-j, --jobs <N>` - Number of threads for parallel processing (0 = use all cores)
  - Can also use the `DEBTMAP_JOBS` environment variable
Debugging & Verbosity:
- `-v, --verbose` - Increase verbosity level (can be repeated: -v, -vv, -vvv)
Description:
Similar to analyze but enforces thresholds defined in configuration file. Returns non-zero exit code if thresholds are exceeded, making it suitable for CI/CD integration.
The validate command supports a focused subset of analyze options, primarily for output control, coverage integration, context-aware analysis, and display filtering.
Note: The following analyze options are NOT available in the validate command:
- `--threshold-complexity`, `--threshold-duplication`, `--threshold-preset` (configure these in `.debtmap.toml` instead)
- `--languages` (language filtering)
Note: The --explain-score flag exists in the validate command but is deprecated (hidden). Use -v, -vv, or -vvv for verbosity instead.
Configure analysis thresholds in your .debtmap.toml configuration file for use with the validate command.
Exit Codes:
- `0` - Success (no errors, all thresholds passed)
- Non-zero - Failure (errors occurred or thresholds exceeded)
compare
Compare two analysis results and generate a diff report.
Usage:
debtmap compare --before <FILE> --after <FILE> [OPTIONS]
Required Options:
- `--before <FILE>` - Path to "before" analysis JSON
- `--after <FILE>` - Path to "after" analysis JSON
Optional Target Location:
- `--plan <FILE>` - Path to implementation plan (to extract target location)
- `--target-location <LOCATION>` - Target location in the format `file:function:line`
Note: --plan and --target-location are mutually exclusive options. Using both together will cause a CLI error:
error: the argument '--plan <FILE>' cannot be used with '--target-location <LOCATION>'
Use one or the other to specify the target location.
Output Options:
- `-f, --format <FORMAT>` - Output format: json, markdown, terminal (default: json)
- `-o, --output <OUTPUT>` - Output file (defaults to stdout)
Description: Compares two analysis results and generates a diff showing improvements or regressions in code quality metrics.
validate-improvement
Validate that technical debt improvements meet quality thresholds.
Usage:
debtmap validate-improvement --comparison <FILE> [OPTIONS]
Required Options:
- `--comparison <FILE>` - Path to comparison JSON file (from `debtmap compare`)
Optional Options:
- `-o, --output <FILE>` - Output file path for validation results (default: `.prodigy/debtmap-validation.json`)
- `--previous-validation <FILE>` - Path to previous validation result for trend tracking
- `--threshold <N>` - Improvement threshold percentage (default: 75.0)
- `-f, --format <FORMAT>` - Output format: json, markdown, terminal (default: json)
- `--quiet` - Suppress console output (useful for automation)
Description:
Validates improvement quality by analyzing comparison output from debtmap compare. Calculates a composite improvement score based on weighted components:
Composite Score Calculation:
1. Target Item Improvement (50% weight) - Measures direct improvement to the specific target item being fixed
   - Compares complexity, debt score, and other metrics before/after
   - Weight: 0.5 × (target improvement percentage)
2. Overall Project Health (30% weight) - Evaluates project-wide quality changes
   - Analyzes aggregate metrics across the entire codebase
   - Considers new issues introduced and total debt changes
   - Weight: 0.3 × (project health percentage)
3. Absence of Regressions (20% weight) - Penalizes introduction of new technical debt
   - Checks for new high-priority issues in other parts of the codebase
   - Ensures improvements don't shift complexity elsewhere
   - Weight: 0.2 × (regression-free percentage)
Final Score: composite_score = (0.5 × target) + (0.3 × health) + (0.2 × no_regressions)
The default threshold (75%) requires strong improvement in the target item while maintaining overall project quality.
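For example, plugging illustrative component scores into the documented weights:

```python
# Illustrative numbers, not output from a real run
target = 90.0           # target item improved 90%
health = 70.0           # project-wide health improved 70%
no_regressions = 100.0  # no new high-priority issues introduced

composite = 0.5 * target + 0.3 * health + 0.2 * no_regressions
print(composite)          # 86.0
print(composite >= 75.0)  # True: passes the default threshold
```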
When --previous-validation is provided, tracks progress trends across multiple attempts and provides recommendations for continuing or adjusting the improvement approach.
Example:
# Basic validation
debtmap validate-improvement --comparison comparison.json
# With trend tracking and custom threshold
debtmap validate-improvement \
--comparison .prodigy/comparison.json \
--previous-validation .prodigy/validation.json \
--output .prodigy/validation.json \
--threshold 80.0
diagnose-coverage (Debugging)
Diagnose and validate LCOV coverage files.
Usage:
debtmap diagnose-coverage <COVERAGE_FILE> [OPTIONS]
Arguments:
- `<COVERAGE_FILE>` - Path to the LCOV coverage file to diagnose
Options:
- `--format <FORMAT>` - Output format: text or json (default: text)
Description:
Validates and diagnoses LCOV coverage files with detailed analysis. This command helps identify issues with coverage data before using it with the analyze command. Use this when:
- Coverage data seems incorrect or incomplete
- You want to validate an LCOV file before integration
- Debugging coverage parsing issues
Example:
# Validate coverage file with text output
debtmap diagnose-coverage coverage.lcov
# Get JSON output for automation
debtmap diagnose-coverage lcov.info --format json
explain-coverage (Debugging)
Explain coverage detection for a specific function.
Usage:
debtmap explain-coverage <PATH> --coverage-file <FILE> --function <NAME> [OPTIONS]
Arguments:
- `<PATH>` - Path to the codebase to analyze
Required Options:
- `--coverage-file <FILE>` / `--lcov <FILE>` - LCOV coverage file
- `--function <NAME>` - Function name to explain (e.g., "create_auto_commit")
Optional Options:
- `--file <PATH>` - File path containing the function (helps narrow search)
- `-v, --verbose` - Show all attempted matching strategies
- `-f, --format <FORMAT>` - Output format: text or json (default: text)
Description: Debugging tool that explains how coverage detection works for a specific function. Shows all attempted matching strategies and helps diagnose coverage mapping issues. This command is particularly useful when:
- Coverage appears incorrect for specific functions
- You need to understand why a function isn’t matched in coverage data
- Debugging LCOV line mapping issues
Example:
# Explain coverage for a specific function
debtmap explain-coverage src/ --coverage-file coverage.lcov --function "process_file"
# Narrow search to specific file with verbose output
debtmap explain-coverage . --lcov lcov.info --function "analyze_complexity" --file "src/analyzer.rs" -v
# JSON output for automation
debtmap explain-coverage . --coverage-file coverage.lcov --function "my_function" --format json
Options
Options are organized by category for clarity. Most options apply to the analyze command, with a subset available for validate.
Output Control
Control how analysis results are formatted and displayed.
Format Options:
- `-f, --format <FORMAT>` - Output format: json, markdown, terminal, html, dot (default: terminal for analyze)
  - `json` - JSON format for programmatic processing
  - `markdown` - Markdown format with comprehensive analysis (uses LLM-optimized writer)
  - `terminal` - Human-readable terminal output with colors and formatting
  - `html` - HTML format for web display
  - `dot` - Graphviz DOT format for dependency visualization
- `-o, --output <OUTPUT>` - Output file path (defaults to stdout)
- `--plain` - Plain output mode: ASCII-only, no colors, no emoji, machine-parseable
Display Filtering:
- `--top <N>` / `--head <N>` - Show only top N priority items
- `--tail <N>` - Show only bottom N priority items (lowest priority)
- `-s, --summary` - Use summary format with tiered priority display (compact output)
- `-c, --compact` - Use compact output format (minimal details, top metrics only). Conflicts with verbosity flags (-v, -vv, -vvv). Only available in the `analyze` command (note: `validate` uses `-c` for `--config`)
- `--min-priority <PRIORITY>` - Minimum priority to display: low, medium, high, critical
- `--min-score <N>` - Minimum score threshold for filtering T3/T4 recommendations (default: 3.0)
  - T1 Critical Architecture and T2 Complex Untested items bypass this filter and are always shown
  - Overrides config file setting (Spec 193, 205)
- `--show-filter-stats` - Show filter statistics (how many items were filtered and why)
- `--filter <CATEGORIES>` - Filter by debt categories (comma-separated)
- `--aggregate-only` - Show only aggregated file-level scores
- `--group-by-category` - Group output by debt category
Progress Display:
- `--no-tui` - Disable TUI progress visualization (use simple progress bars instead)
- `-q, --quiet` - Suppress progress output (quiet mode)
Streaming Output:
- `--streaming` - Enable streaming output mode for large codebases (Spec 003)
  - Streams results to output as they're generated with O(1) memory overhead
  - Useful for processing large codebases that would otherwise exhaust memory
- `--stream-to <FILE>` - Output file for streaming mode (requires `--streaming`)
  - Use "-" for stdout
Dependency Display Options:
- `--show-dependencies` - Show caller/callee information in output
- `--no-dependencies` - Hide dependency information (conflicts with `--show-dependencies`)
- `--max-callers <N>` - Maximum number of callers to display (default: 5)
- `--max-callees <N>` - Maximum number of callees to display (default: 5)
- `--show-external-calls` - Include external crate calls in dependencies
- `--show-std-lib-calls` - Include standard library calls in dependencies
Analysis Control
Configure analysis behavior, thresholds, and language selection.
Thresholds:
- `--threshold-complexity <N>` - Complexity threshold (default: 10) [analyze command]
- `--threshold-duplication <N>` - Duplication threshold in lines (default: 50) [analyze command]
- `--threshold-preset <PRESET>` - Complexity threshold preset: strict, balanced, lenient [analyze command]
  - `strict` - Strict thresholds for high code quality standards
  - `balanced` - Balanced thresholds for typical projects (default)
  - `lenient` - Lenient thresholds for legacy or complex domains
- `--max-debt-density <N>` - Maximum debt density allowed per 1000 LOC [validate command]
Note: Threshold options (--threshold-complexity, --threshold-duplication, --threshold-preset) are command-line options for the analyze command. For the validate command, these thresholds are configured via the --config file (.debtmap.toml) rather than as command-line flags.
Language Selection:
- `--languages <LANGS>` - Comma-separated list of languages to analyze
  - Example: `--languages rust,python,javascript`
  - Supported: rust, python, javascript, typescript
Analysis Modes:
- `--semantic-off` - Disable semantic analysis (fallback mode)
- `--no-context-aware` - Disable context-aware false positive reduction (enabled by default)
- `--no-multi-pass` - Disable multi-pass analysis (use single-pass for performance)
- `--attribution` - Show complexity attribution details (requires multi-pass, which is the default)
Functional Programming Analysis:
- `--ast-functional-analysis` - Enable AST-based functional composition analysis (spec 111)
  - Analyzes code for functional programming patterns and composition
  - Detects pure functions, immutability, and side effects
- `--functional-analysis-profile <PROFILE>` - Set functional analysis profile
  - `strict` - Strict functional purity requirements (for pure FP codebases)
  - `balanced` - Balanced analysis suitable for mixed paradigms (default)
  - `lenient` - Lenient thresholds for imperative codebases
Context & Coverage
Enable context-aware risk analysis and integrate test coverage data.
Context-Aware Risk Analysis:
- `--context` / `--enable-context` - Enable context-aware risk analysis
- `--context-providers <PROVIDERS>` - Context providers to use (comma-separated)
  - Available: `critical_path`, `dependency`, `git_history`
  - Example: `--context-providers critical_path,git_history`
- `--disable-context <PROVIDERS>` - Disable specific context providers (comma-separated)
Coverage Integration:
- `--coverage-file <PATH>` / `--lcov <PATH>` - LCOV coverage file for risk analysis
  - Coverage data dampens debt scores for well-tested code (multiplier = 1.0 - coverage)
  - Surfaces untested complex functions as higher priority
  - Total debt score with coverage ≤ score without coverage
- `--validate-loc` - Validate LOC consistency across analysis modes (with/without coverage)
Performance
Optimize analysis performance through parallelization.
Parallel Processing:
- `--no-parallel` - Disable parallel call graph construction (enabled by default)
- `-j, --jobs <N>` - Number of threads for parallel processing
  - `0` = use all available CPU cores (default)
  - Specify a number to limit thread count
Other Performance:
- `--max-files <N>` - Maximum number of files to analyze (0 = no limit)
Profiling:
- `--profile` - Enable profiling to identify performance bottlenecks (Spec 001)
  - Outputs timing breakdown for each analysis phase when complete
- `--profile-output <FILE>` - Write profiling data to file in JSON format (requires `--profile`)
  - Use for post-analysis performance investigation
Debugging & Verbosity
Control diagnostic output and debugging information.
Verbosity Levels:
- `-v, --verbose` - Increase verbosity level (can be repeated: -v, -vv, -vvv)
  - `-v` - Show main score factors
  - `-vv` - Show detailed calculations
  - `-vvv` - Show all debug information
Specialized Debugging:
- `--explain-metrics` - Explain metric definitions and formulas (measured vs estimated)
- `--verbose-macro-warnings` - Show verbose macro parsing warnings (Rust analysis)
- `--show-macro-stats` - Show macro expansion statistics at end of analysis
- `--detail-level <LEVEL>` - Detail level for diagnostic reports
  - Options: summary, standard, comprehensive, debug (default: standard)
Call Graph Debugging:
- `--debug-call-graph` - Enable detailed call graph debugging with resolution information
- `--trace-function <FUNCTIONS>` - Trace specific functions during call resolution (comma-separated)
  - Example: `--trace-function 'my_function,another_function'`
- `--call-graph-stats` - Show only call graph statistics (no detailed failure list)
- `--validate-call-graph` - Validate call graph structure and report issues
- `--debug-format <FORMAT>` - Debug output format: text or json (default: text)
  - Use with call graph debugging flags to control output format
Aggregation
Control file-level aggregation and god object detection.
File Aggregation:
- `--aggregate-only` - Show only aggregated file-level scores
- `--no-aggregation` - Disable file-level aggregation
- `--aggregation-method <METHOD>` - File aggregation method (default: weighted_sum)
  - Options: sum, weighted_sum, logarithmic_sum, max_plus_average
- `--min-problematic <N>` - Minimum number of problematic functions for file aggregation
- `--no-god-object` - Disable god object detection
God Object Split Recommendations:
- `--show-splits` - Show detailed module split recommendations for god objects and large files (experimental)
  - Suggests how to decompose large files into smaller, focused modules
- `--min-split-methods <N>` - Minimum methods per god object split recommendation (default: 10)
- `--min-split-lines <N>` - Minimum lines per god object split recommendation (default: 150)
Dead Code Analysis:
- `--no-public-api-detection` - Disable public API detection heuristics for dead code analysis
- `--public-api-threshold <N>` - Public API confidence threshold (0.0-1.0, default: 0.7)
  - Functions above this threshold are considered public APIs
Pattern Detection:
- `--no-pattern-detection` - Disable pattern recognition
- `--patterns <PATTERNS>` - Enable specific patterns only (comma-separated)
  - Available patterns: observer, singleton, factory, strategy, callback, template_method
- `--pattern-threshold <N>` - Pattern confidence threshold (0.0-1.0, default: 0.7)
- `--show-pattern-warnings` - Show pattern warnings for uncertain detections
Option Aliases
Common option shortcuts and aliases for convenience:
- `--lcov` is an alias for `--coverage-file`
- `--enable-context` is an alias for `--context`
- `--head` is an alias for `--top`
- `-s` is short for `--summary`
- `-v` is short for `--verbose`
- `-f` is short for `--format`
- `-o` is short for `--output`
- `-c` is short for `--config`
- `-j` is short for `--jobs`
Deprecated Options
The following options are deprecated and should be migrated:
- `--explain-score` (hidden) - Deprecated: use `-v` instead
  - Migration: Use `-v`, `-vv`, or `-vvv` for increasing verbosity levels
Configuration
Configuration File
Created via debtmap init command. The configuration file (.debtmap.toml) is used by the validate command for threshold enforcement and default settings.
Creating Configuration:
# Create new config
debtmap init
# Overwrite existing config
debtmap init --force
Environment Variables
- `DEBTMAP_CONFIG` - Custom config file path (same as the `--config` global flag)
  - Example: `export DEBTMAP_CONFIG=/path/to/debtmap.toml`
  - Overrides default configuration file locations
  - Useful for CI/CD environments with centralized config
  - Source: src/cli.rs:45
- `DEBTMAP_JOBS` - Number of threads for parallel processing (same as the `--jobs`/`-j` flag)
  - Example: `export DEBTMAP_JOBS=8` (same as `--jobs 8`)
  - Use `0` to utilize all available CPU cores
  - Controls thread pool size for parallel call graph construction
- `DEBTMAP_SINGLE_PASS` - Disable multi-pass analysis globally (same as the `--no-multi-pass` flag)
  - Example: `export DEBTMAP_SINGLE_PASS=1`
  - Set to `1` or `true` to disable multi-pass analysis by default
  - Useful for CI/CD environments where performance is critical
  - Can be overridden by command-line flags
- `PRODIGY_AUTOMATION` - Enable automation mode for the `validate-improvement` command
  - Example: `export PRODIGY_AUTOMATION=true`
  - Set to `true` to enable automation mode (suppresses interactive prompts)
  - Used in automated workflows for continuous improvement validation
  - Source: src/main.rs:549
- `PRODIGY_VALIDATION` - Alternative flag for automation mode (same effect as `PRODIGY_AUTOMATION`)
  - Example: `export PRODIGY_VALIDATION=true`
  - Set to `true` to enable automation mode
  - Provided for backward compatibility with existing workflows
  - Source: src/main.rs:552
Getting Help
Get help for any command:
# General help
debtmap --help
# Command-specific help
debtmap analyze --help
debtmap validate --help
debtmap compare --help
debtmap init --help
Common Workflows
Basic Analysis
Analyze a project and view results in terminal:
debtmap analyze src/
Generate JSON report for further processing:
debtmap analyze . --format json --output report.json
Generate Markdown report:
debtmap analyze . --format markdown --output report.md
Coverage-Integrated Analysis
Analyze with test coverage to surface untested complex code:
# Generate coverage file first (example for Rust)
cargo tarpaulin --out lcov
# Run analysis with coverage
debtmap analyze src/ --coverage-file lcov.info
Coverage dampens debt scores for well-tested code, making untested complex functions more visible.
Context-Aware Analysis
Enable context providers for risk-aware prioritization:
# Use all context providers
debtmap analyze . --context
# Use specific context providers
debtmap analyze . --context --context-providers critical_path,git_history
Context-aware analysis reduces false positives and prioritizes code based on:
- Critical execution paths
- Dependency relationships
- Git history (change frequency)
Filtered & Focused Analysis
Show only top priority items:
debtmap analyze . --top 10 --min-priority high
Filter by specific debt categories:
debtmap analyze . --filter complexity,duplication
Use summary mode for compact output:
debtmap analyze . --summary
Show only file-level aggregations:
debtmap analyze . --aggregate-only
Performance Tuning
Control parallelization:
# Use 8 threads
debtmap analyze . --jobs 8
# Disable parallel processing
debtmap analyze . --no-parallel
Limit analysis scope:
# Analyze maximum 100 files
debtmap analyze . --max-files 100
# Analyze specific languages only
debtmap analyze . --languages rust,python
CI/CD Integration
Use the validate command in CI/CD pipelines:
# Initialize configuration (one time)
debtmap init
# Edit debtmap.toml to set thresholds
# ...
# In CI pipeline: validate against thresholds
debtmap validate . --config debtmap.toml --max-debt-density 50
The validate command returns non-zero exit code if thresholds are exceeded, failing the build.
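Because the pass/fail result rides on the exit code, wrapping validate in a script needs no output parsing. A minimal sketch:

```python
import subprocess
import sys

# Run the gate; debtmap validate exits non-zero when thresholds are exceeded
result = subprocess.run(["debtmap", "validate", ".", "--config", "debtmap.toml"])
if result.returncode != 0:
    print("debt thresholds exceeded; failing the build", file=sys.stderr)
sys.exit(result.returncode)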
Comparison & Tracking
Compare analysis results before and after changes:
# Before changes
debtmap analyze . --format json --output before.json
# Make code changes...
# After changes
debtmap analyze . --format json --output after.json
# Generate comparison report
debtmap compare --before before.json --after after.json --format markdown
With implementation plan:
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
Debugging Analysis
Increase verbosity to understand scoring:
# Show main score factors
debtmap analyze src/ -v
# Show detailed calculations
debtmap analyze src/ -vv
# Show all debug information
debtmap analyze src/ -vvv
Debug call graph resolution issues:
# Enable call graph debugging
debtmap analyze . --debug-call-graph
# Trace specific functions
debtmap analyze . --debug-call-graph --trace-function 'problematic_function'
# Validate call graph structure
debtmap analyze . --validate-call-graph --debug-format json
Show macro expansion statistics (Rust):
debtmap analyze . --show-macro-stats --verbose-macro-warnings
Use detailed diagnostic reports:
debtmap analyze . --detail-level comprehensive
Analyze functional programming patterns:
# Enable functional analysis
debtmap analyze . --ast-functional-analysis
# Use strict profile for pure FP codebases
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
Examples
Basic Analysis
# Analyze current directory
debtmap analyze .
# Analyze specific directory
debtmap analyze src/
# Generate JSON output
debtmap analyze . --format json --output report.json
With Coverage
# Analyze with LCOV coverage file
debtmap analyze src/ --coverage-file coverage.lcov
# Alternative alias
debtmap analyze src/ --lcov coverage.lcov
Context-Aware Analysis
# Enable all context providers
debtmap analyze . --context
# Use specific context providers
debtmap analyze . --context --context-providers critical_path,git_history
# Disable specific providers
debtmap analyze . --context --disable-context dependency
Filtered Output
# Top 10 priority items only
debtmap analyze . --top 10
# High priority and above
debtmap analyze . --min-priority high
# Specific categories
debtmap analyze . --filter complexity,duplication
# Summary format
debtmap analyze . --summary
# Group by category
debtmap analyze . --group-by-category
Performance Tuning
# Use 8 threads
debtmap analyze . --jobs 8
# Disable parallelization
debtmap analyze . --no-parallel
# Limit file count
debtmap analyze . --max-files 100
Validation
# Initialize config
debtmap init --force
# Validate against config
debtmap validate . --config debtmap.toml
# With max debt density threshold
debtmap validate . --max-debt-density 50
Comparison
# Compare two analyses
debtmap compare --before before.json --after after.json
# With markdown output
debtmap compare --before before.json --after after.json --format markdown
# With implementation plan
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
# With target location
debtmap compare --before before.json --after after.json --target-location "src/main.rs:process_file:42"
Language Selection
# Analyze only Rust files
debtmap analyze . --languages rust
# Multiple languages
debtmap analyze . --languages rust,python,javascript
Threshold Configuration
# Custom complexity threshold
debtmap analyze . --threshold-complexity 15
# Use preset
debtmap analyze . --threshold-preset strict
# Custom duplication threshold
debtmap analyze . --threshold-duplication 100
Plain/Machine-Readable Output
# Plain output (no colors, no emoji)
debtmap analyze . --plain
# Combine with JSON for CI
debtmap analyze . --format json --plain --output report.json
Advanced Debugging
# Call graph debugging with detailed information
debtmap analyze . --debug-call-graph --debug-format json
# Trace specific functions during call resolution
debtmap analyze . --debug-call-graph --trace-function 'process_file,analyze_complexity'
# Validate call graph structure
debtmap analyze . --validate-call-graph
# Show only call graph statistics
debtmap analyze . --debug-call-graph --call-graph-stats
# Functional programming analysis with strict profile
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
# Explain metric definitions
debtmap analyze . --explain-metrics -v
Command Compatibility Matrix
| Option | analyze | validate | compare | init | diagnose-coverage | explain-coverage |
|---|---|---|---|---|---|---|
| `<PATH>` argument | ✓ | ✓ | ✗ | ✗ | ✓ | ✓ |
| `--format` | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ |
| `--output` | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ |
| `--coverage-file` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ |
| `--context` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--threshold-*` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--top` / `--tail` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--min-score` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--show-filter-stats` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--quiet` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--no-tui` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--streaming` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--stream-to` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--jobs` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--no-parallel` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--profile` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--profile-output` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--verbose` | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ |
| `--show-splits` | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--min-split-methods` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--min-split-lines` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--no-public-api-detection` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--public-api-threshold` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--no-pattern-detection` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--patterns` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--pattern-threshold` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--show-pattern-warnings` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--explain-metrics` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--debug-call-graph` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--trace-function` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--call-graph-stats` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--validate-call-graph` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--debug-format` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--show-dependencies` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--no-dependencies` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--max-callers` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--max-callees` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--show-external-calls` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--show-std-lib-calls` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--ast-functional-analysis` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--functional-analysis-profile` | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| `--function` | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| `--file` | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| `--config` | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| `--before` / `--after` | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ |
| `--force` | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ |
Note: The validate command supports output control (--format, --output), coverage integration (--coverage-file), context-aware analysis (--context), display filtering (--top, --tail, --summary, --show-splits), performance control (--jobs, --no-parallel), and verbosity options (--verbose) from the analyze command. Analysis thresholds (--threshold-complexity, --threshold-duplication, --threshold-preset) are configured via the --config file rather than as command-line options. Debugging features like call graph debugging, functional analysis, profiling, and pattern detection are specific to the analyze command. The diagnose-coverage and explain-coverage commands are specialized debugging tools for diagnosing coverage issues and have their own unique options.
Troubleshooting
Performance Issues
Problem: Analysis is slow on large codebases
Solutions:
# Use more threads (if you have CPU cores available)
debtmap analyze . --jobs 16
# Limit analysis scope
debtmap analyze . --max-files 500 --languages rust
Memory Issues
Problem: Analysis runs out of memory
Solutions:
# Disable parallelization
debtmap analyze . --no-parallel
# Limit file count
debtmap analyze . --max-files 100
# Analyze in batches by language
debtmap analyze . --languages rust
debtmap analyze . --languages python
Output Issues
Problem: Terminal output has garbled characters
Solution:
# Use plain mode
debtmap analyze . --plain
Problem: Want machine-readable output
Solution:
# Use JSON with plain mode
debtmap analyze . --format json --plain --output report.json
Threshold Issues
Problem: Too many items flagged
Solutions:
# Use lenient preset
debtmap analyze . --threshold-preset lenient
# Increase threshold
debtmap analyze . --threshold-complexity 20
# Filter to high priority only
debtmap analyze . --min-priority high
Problem: Not enough items flagged
Solutions:
# Use strict preset
debtmap analyze . --threshold-preset strict
# Lower threshold
debtmap analyze . --threshold-complexity 5
# Show all items
debtmap analyze . --min-priority low
Best Practices
Regular Analysis
Run analysis regularly to track code quality trends:
# Daily in CI
debtmap validate . --config debtmap.toml
# Weekly deep analysis with coverage
debtmap analyze . --coverage-file coverage.lcov --format json --output weekly-report.json
Performance Optimization
For large codebases:
# Use maximum parallelization
debtmap analyze . --jobs 0 # 0 = all cores
# Focus on changed files in CI
# (implement via custom scripts to analyze git diff)
Integration with Coverage
Always analyze with coverage when available:
# Rust example
cargo tarpaulin --out lcov
debtmap analyze src/ --coverage-file lcov.info
# Python example
pytest --cov --cov-report=lcov
debtmap analyze . --coverage-file coverage.lcov
Coverage integration helps prioritize untested complex code.
Additional Tools
prodigy-validate-debtmap-improvement
Specialized validation tool for Prodigy workflow integration.
Description: This binary is part of the Prodigy workflow system and provides specialized validation for Debtmap improvement workflows.
Usage: See Prodigy documentation for detailed usage instructions.
See Also
- Configuration Format - Detailed configuration file format
- Output Formats - Understanding JSON, Markdown, and Terminal output
- Coverage Integration - Integrating test coverage data
- Context Providers - Understanding context-aware analysis
- Examples - More comprehensive usage examples
Analysis Guide
This guide explains Debtmap’s analysis capabilities, metrics, and methodologies in depth. Use this to understand what Debtmap measures, how it scores technical debt, and how to interpret analysis results for maximum impact.
Subsections
This chapter is organized into the following subsections:
- Overview - Introduction to analysis capabilities and overall approach
- Complexity Metrics - Comprehensive coverage of cyclomatic, cognitive, and entropy-based complexity with extensive examples
- Debt Patterns - Detailed documentation of all debt pattern types with detection criteria
- Risk Scoring - Risk calculation methodology, scoring formulas, and prioritization strategies
- Interpreting Results - Result interpretation, action strategies, and detailed examples
- Analyzer Types - Documentation of different analyzer modes and their use cases
- Advanced Features - Advanced analysis capabilities and techniques
- Example Outputs - Real-world output examples and interpretation guidance
Overview
Debtmap analyzes code through multiple lenses to provide a comprehensive view of technical health. The goal is to move beyond simple problem identification to evidence-based prioritization - showing what to fix first based on risk scores, test coverage gaps, and ROI calculations, with actionable recommendations backed by impact metrics.
Supported Languages
- Rust - Full analysis support with AST-based parsing via syn
- Python - File detection only; analysis not yet implemented (language enum defined for future expansion)
Source: src/organization/language.rs:4-7 (Language enum), src/analyzers/implementations.rs:185-186 (analysis status)
Analysis Capabilities
Complexity Metrics
Calculates multiple dimensions of code complexity:
- Cyclomatic Complexity - Measures linearly independent paths through code (control flow branching)
- Cognitive Complexity - Quantifies human comprehension difficulty beyond raw paths
- Nesting Depth - Tracks maximum indentation levels
- Function Length - Lines of code per function
- Parameter Count - Number of function parameters
- Entropy Score - Pattern-based complexity adjustment that reduces false positives by up to 70%
- Purity Level - Functional purity classification (StrictlyPure, LocallyPure, ReadOnly, Impure)
Source: src/core/mod.rs:62-92 (FunctionMetrics struct)
See Complexity Metrics for detailed explanations and examples.
Debt Patterns
Identifies 35 types of technical debt across 4 major categories:
Testing Issues (6 types):
- Testing gaps in complex code
- Complex test code requiring refactoring
- Test duplication and flaky patterns
- Over-complex assertions
Architectural Problems (6 types):
- God objects and god modules
- Feature envy and primitive obsession
- Scattered type implementations
- Orphaned functions and utilities sprawl
Performance Issues (8 types):
- Async/await misuse and blocking I/O
- Collection inefficiencies and nested loops
- Memory allocation problems
- Suboptimal data structures
Code Quality Issues (6 types):
- Complexity hotspots without test coverage
- Dead code and duplication
- Error swallowing and magic values
Source: src/priority/debt_types.rs:47 (DebtType enum)
See Debt Patterns for detailed detection rules and examples.
Risk Scoring
Combines complexity, test coverage, coupling, and change frequency through a multi-factor risk model:
Risk Categories:
- Critical - High complexity (>15) + low coverage (<30%)
- High - High complexity (>10) + moderate coverage (<60%)
- Medium - Moderate complexity (>5) + low coverage (<50%)
- Low - Low complexity or high coverage
- WellTested - High complexity with high coverage (good examples to learn from)
Coverage Penalty Calculation:
- Untested code receives a 2.0x multiplier
- Partially tested code receives a 1.5x multiplier
- Coverage gaps are penalized exponentially
Risk Score Weights (configurable):
coverage: 0.5 // Coverage weight
complexity: 0.3 // Cyclomatic weight
cognitive: 0.45 // Cognitive weight
debt: 0.2 // Debt factor weight
untested_penalty: 2.0 // Multiplier for untested code
Source: src/risk/mod.rs:36-42, src/risk/strategy.rs:8-28
See Risk Scoring for detailed scoring algorithms.
Prioritization
Uses a multi-stage pipeline to assign priority tiers and estimate test writing impact:
- Evidence Collection - Gather complexity, coverage, and coupling metrics
- Context Enrichment - Add architectural context and change frequency
- Baseline Scoring - Calculate initial risk scores using multi-factor model
- ROI Calculation - Estimate return on investment for test writing
- Final Priority - Assign priority tiers with risk reduction impact estimates
Recommendation Tiers (based on strategic value):
- T1 Critical Architecture - God objects, excessive complexity; must address before adding features
- T2 Complex Untested - High complexity with no tests; risk of bugs in critical paths
- T3 Testing Gaps - Moderate complexity without tests; improve coverage to prevent future issues
- T4 Maintenance - Low-complexity issues; address opportunistically
Source: src/priority/tiers/mod.rs:29-45 (RecommendationTier enum)
See Interpreting Results for guidance on using priority rankings.
How It Works
Debtmap uses a functional, multi-layered architecture for accurate and performant analysis:
Three-Phase Analysis Pipeline
Phase 1: Parallel Parsing
- Language-specific AST generation using tree-sitter (Rust via syn)
- Pure functional transformation: source code → AST
- Files parsed once, ASTs cached and cloned for reuse (44% faster)
- Runs in parallel using Rayon for CPU-intensive parsing
Phase 2: Parallel Analysis with Batching
- Data flow graph construction (O(1) lookups via multi-index)
- Purity analysis tracking pure vs impure functions
- Pattern detection with entropy analysis
- Metrics computation through pure functions
- Default batch size: 100 items (configurable via --batch-size)
Phase 3: Sequential Aggregation
- Combine parallel results into unified analysis
- Apply multi-dimensional scoring (complexity + coverage + coupling)
- Priority ranking and tier classification
- Generate actionable recommendations with impact metrics
Source: ARCHITECTURE.md, src/builders/parallel_unified_analysis.rs:21-87
Key Architectural Patterns
Functional Core, Imperative Shell:
- Pure functions for all metric calculations
- I/O isolated to file reading and output formatting
- Enables easy testing and parallelization
Multi-Index Call Graph (O(1) lookups):
- Primary index: exact function ID lookup
- Fuzzy index: name + file matching for generics
- Name index: cross-file function resolution
- Memory overhead: ~7MB for 10,000 functions
Parallel Processing with Rayon:
- CPU-bound work runs in parallel
- Sequential aggregation maintains consistency
- Adaptive batching optimizes memory usage
Source: ARCHITECTURE.md:120-200
See Architecture for detailed design documentation.
Key Differentiators
What makes debtmap effective:
- Pattern-Based Complexity Adjustment - Entropy analysis reduces false positives by identifying boilerplate patterns
- Multi-Pass Analysis - Compares raw vs normalized complexity for accurate attribution
- Coverage-Risk Correlation - Finds genuinely risky code, not just complex code
- Functional Purity Tracking - Identifies side effects and pure functions for targeted refactoring
- Context-Aware Detection - Considers architectural context, not just isolated metrics
- Evidence-Based Prioritization - ROI-driven recommendations backed by multiple signals
Performance
Debtmap achieves high throughput through parallel processing:
| Codebase Size | Target | Actual | Speedup |
|---|---|---|---|
| 50 files | <0.5s | ~0.3s | 4x |
| 250 files | <1s | ~0.8s | 6.25x |
| 1000 files | <5s | ~3.5s | 5.7x |
Source: ARCHITECTURE.md:100-106
See Parallel Processing for optimization details.
Complexity Metrics
Debtmap measures complexity using multiple complementary approaches. Each metric captures a different aspect of code difficulty.
Cyclomatic Complexity
Measures the number of linearly independent paths through code - essentially counting decision points.
How it works:
- Start with a base complexity of 1
- Add 1 for each: if, else if, match arm, while, for, &&, ||, ? operator
- Does NOT increase for else (it’s the alternate path, not a new decision)
Thresholds:
- 1-5: Simple, easy to test - typically needs 1-3 test cases
- 6-10: Moderate complexity - needs 4-8 test cases
- 11-20: Complex, consider refactoring - needs 9+ test cases
- 20+: Very complex, high risk - difficult to test thoroughly
Example:
fn validate_user(age: u32, has_license: bool, country: &str) -> bool {
// Complexity: 4
// Base (1) + if (1) + && (1) + match (1) = 4
if age >= 18 && has_license {
match country {
"US" | "CA" => true,
_ => false,
}
} else {
false
}
}
Cognitive Complexity
Measures how difficult code is to understand by considering nesting depth and control flow interruptions.
How it differs from cyclomatic:
- Nesting increases weight (deeply nested code is harder to understand)
- Linear sequences don’t increase complexity (easier to follow)
- Breaks and continues add complexity (interrupt normal flow)
Calculation:
- Each structure (if, loop, match) gets a base score
- Nesting increases weight linearly: Each nesting level adds to the complexity score
- Base level (no nesting): weight = 1
- First nesting level: weight = 2
- Second nesting level: weight = 3
- Formula: complexity = 1 + nesting_level (from src/complexity/cognitive.rs:167)
- Break/continue/return in middle of function adds cognitive load
Example:
// Cyclomatic: 5, Cognitive: 8
fn process_items(items: Vec<Item>) -> Vec<Result> {
let mut results = vec![];
for item in items { // +1 cognitive
if item.is_valid() { // +2 (nested in loop)
match item.type { // +3 (nested 2 levels)
Type::A => results.push(process_a(item)),
Type::B => {
if item.priority > 5 { // +4 (nested 3 levels)
results.push(process_b_priority(item));
}
}
_ => continue, // +1 (control flow interruption)
}
}
}
results
}
Thresholds:
- 0-5: Trivial - anyone can understand
- 6-10: Simple - straightforward logic
- 11-20: Moderate - requires careful reading
- 21-40: Complex - difficult to understand
- 40+: Very complex - needs refactoring
Entropy-Based Complexity Analysis
Uses information theory to distinguish genuinely complex code from pattern-based repetitive code. This dramatically reduces false positives for validation functions, dispatchers, and configuration parsers.
How it works:
- Token Entropy (0.0-1.0): Measures variety in code tokens
  - High entropy (0.7+): Diverse logic, genuinely complex
  - Low entropy (0.0-0.4): Repetitive patterns, less complex than it appears
- Pattern Repetition (0.0-1.0): Detects repetitive structures in the AST
  - High repetition (0.7+): Similar blocks repeated (validation checks, case handlers)
  - Low repetition: Unique logic throughout
- Branch Similarity (0.0-1.0): Analyzes similarity between conditional branches
  - High similarity (0.8+): Branches do similar things (consistent handling)
  - Low similarity: Each branch has unique logic
- Token Classification: Categorizes tokens by type with weighted importance (src/complexity/entropy_core.rs:267-277)
  - Token categories and weights (from src/complexity/entropy_traits.rs:24-44):
    - ControlFlow (1.2): if, match, for, while - highest weight for control structures
    - Keyword (1.0): language keywords like fn, let, pub
    - FunctionCall (0.9): method calls and API usage
    - Operator (0.8): +, -, *, ==, etc.
    - Identifier (0.5): variable and function names
    - Literal (0.3): string, number, boolean literals - lowest weight
  - Higher weights emphasize structural complexity over superficial differences
  - Focuses entropy calculation on control flow and logic rather than data values
Dampening logic: Dampening is applied when multiple factors indicate repetitive patterns:
- Low token entropy (< 0.4) indicates simple, repetitive patterns
- High pattern repetition (> 0.6) shows similar code blocks (measured via PatternMetrics)
- High branch similarity (> 0.7) indicates consistent branching logic
Pattern detection (src/complexity/entropy_core.rs:279-308):
- PatternMetrics tracks intermediate calculations:
  - total_patterns: Total number of code patterns detected
  - unique_patterns: Count of distinct patterns
  - repetition_ratio: Calculated as 1.0 - (unique_patterns / total_patterns)
- High repetition ratio indicates validation functions, dispatchers, and configuration parsers
When these conditions are met:
effective_complexity = entropy × pattern_factor × similarity_factor
Note on metrics (src/complexity/entropy_core.rs:228-265):
- token_entropy: Measures unpredictability of code tokens (0.0-1.0), used for pattern detection
- effective_complexity: Final composite score after applying dampening adjustments
- These are distinct metrics: effective_complexity combines multiple factors, while token_entropy is a single entropy measurement
Dampening cap: The dampening factor is clamped between 0.5 and 1.0 (src/complexity/entropy_core.rs:114), allowing a maximum of 50% complexity reduction. The configuration option max_combined_reduction (default 0.30) provides additional control over the maximum allowed reduction. This prevents over-correction of pattern-based code while still providing meaningful adjustments for repetitive structures.
Example:
// Without entropy: Cyclomatic = 15 (appears very complex)
// With entropy: Effective = 8 (pattern-based, dampened ~47%)
fn validate_config(config: &Config) -> Result<(), ValidationError> {
if config.name.is_empty() { return Err(ValidationError::EmptyName); }
if config.port == 0 { return Err(ValidationError::InvalidPort); }
if config.host.is_empty() { return Err(ValidationError::EmptyHost); }
if config.timeout == 0 { return Err(ValidationError::InvalidTimeout); }
// ... 11 more similar checks
Ok(())
}
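To make the dampening concrete, here is a minimal Rust sketch. The thresholds and the 0.5-1.0 clamp come from the description above; deriving pattern_factor and similarity_factor as (1.0 - ratio) is an illustrative assumption, not the exact implementation.

```rust
// Minimal sketch of entropy dampening, assuming pattern_factor and
// similarity_factor are derived as (1.0 - ratio); the real code in
// src/complexity/entropy_core.rs may combine these differently.
fn dampened_complexity(
    raw_complexity: f64,
    token_entropy: f64,     // 0.0-1.0
    repetition_ratio: f64,  // 0.0-1.0
    branch_similarity: f64, // 0.0-1.0
) -> f64 {
    // Dampen only when all three signals indicate repetitive code
    let repetitive =
        token_entropy < 0.4 && repetition_ratio > 0.6 && branch_similarity > 0.7;
    if !repetitive {
        return raw_complexity;
    }
    // effective_complexity = entropy × pattern_factor × similarity_factor,
    // clamped to 0.5-1.0 so at most 50% of complexity is removed
    let factor = (token_entropy * (1.0 - repetition_ratio) * (1.0 - branch_similarity))
        .clamp(0.5, 1.0);
    raw_complexity * factor
}

fn main() {
    // The validate_config example above: cyclomatic 15, highly repetitive
    println!("{:.1}", dampened_complexity(15.0, 0.3, 0.8, 0.9)); // 7.5
}
```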
Enable in .debtmap.toml:
[entropy]
enabled = true # Enable entropy analysis (default: true)
weight = 0.5 # Weight in adjustment (0.0-1.0)
use_classification = true # Advanced token classification
pattern_threshold = 0.7 # Pattern detection threshold
entropy_threshold = 0.4 # Entropy below this triggers dampening
branch_threshold = 0.8 # Branch similarity threshold
max_combined_reduction = 0.3 # Maximum 30% reduction
Output fields in EntropyScore:
- unique_variables: Count of distinct variables in the function (measures variable diversity)
- max_nesting: Maximum nesting depth detected (contributes to dampening calculation)
- dampening_applied: Actual dampening factor applied to the complexity score
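Taken together, the entropy signals conceptually form a record like the following sketch (field names are taken from this section; the actual EntropyScore struct layout may differ):

```rust
// Conceptual sketch of an entropy score record; not the literal struct.
struct EntropyScore {
    token_entropy: f64,      // 0.0-1.0, variety in code tokens
    pattern_repetition: f64, // 0.0-1.0, repeated AST structures
    branch_similarity: f64,  // 0.0-1.0, similarity between branches
    unique_variables: usize, // distinct variables in the function
    max_nesting: u32,        // maximum nesting depth detected
    dampening_applied: f64,  // final factor applied to the complexity score
}
```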
Nesting Depth
Maximum level of indentation in a function. Deep nesting makes code hard to follow.
Thresholds:
- 1-2: Flat, easy to read
- 3-4: Moderate nesting
- 5+: Deep nesting, consider extracting functions
Example:
// Nesting depth: 4 (difficult to follow)
fn process(data: Data) -> Result<Output> {
if data.is_valid() { // Level 1
for item in data.items { // Level 2
if item.active { // Level 3
match item.type { // Level 4
Type::A => { /* ... */ }
Type::B => { /* ... */ }
}
}
}
}
}
Refactored:
// Nesting depth: 2 (much clearer)
fn process(data: Data) -> Result<Output> {
if !data.is_valid() {
return Err(Error::Invalid);
}
data.items
.iter()
.filter(|item| item.active)
.map(|item| process_item(item)) // Extract to separate function
.collect()
}
Function Length
Number of lines in a function. Long functions often violate single responsibility principle.
Thresholds:
- 1-20 lines: Good - focused, single purpose
- 21-50 lines: Acceptable - may have multiple steps
- 51-100 lines: Long - consider breaking up
- 100+ lines: Very long - definitely needs refactoring
Why length matters:
- Harder to understand and remember
- Harder to test thoroughly
- Often violates single responsibility
- Difficult to reuse
Constructor Detection
Debtmap identifies constructor functions using AST-based analysis (Spec 122), which goes beyond simple name-based detection to catch non-standard constructor patterns.
Detection Strategy:
- Return Type Analysis: Functions returning Self, Result<Self>, or Option<Self>
- Body Pattern Analysis: Struct initialization or simple field assignments
- Complexity Check: Low cyclomatic complexity (≤5), no loops, minimal branching
Why AST-based detection?
Name-based detection (looking for new, new_*, from_*) misses non-standard constructors:
// Caught by name-based detection
fn new() -> Self {
Self { timeout: 30 }
}
// Missed by name-based, caught by AST detection
pub fn create_default_client() -> Self {
Self { timeout: Duration::from_secs(30) }
}
pub fn initialized() -> Self {
Self::new()
}
Builder vs Constructor:
AST analysis distinguishes between constructors and builder methods:
// Constructor: creates new instance
pub fn new(timeout: u32) -> Self {
Self { timeout }
}
// Builder method: modifies existing instance (NOT a constructor)
pub fn set_timeout(mut self, timeout: Duration) -> Self {
self.timeout = timeout;
self // Returns modified self, not new instance
}
Detection Criteria:
A function is classified as a constructor if:
- Returns Self, Result<Self>, or Option<Self>
- Contains struct initialization (Self { ... }) without loops
- OR delegates to another constructor (Self::new()) with minimal logic
Fallback Behavior:
If AST parsing fails (syntax errors, unsupported language), Debtmap gracefully falls back to name-based detection (Spec 117):
new, new_*, try_new*, from_*
This ensures analysis always completes, even on partially broken code.
Performance:
AST-based detection adds < 5% overhead compared to name-only detection. See benchmarks:
cargo bench --bench constructor_detection_bench
Why it matters:
Accurately identifying constructors helps:
- Exclude them from complexity thresholds (constructors naturally have high complexity)
- Focus refactoring on business logic, not initialization code
- Understand initialization patterns across the codebase
Debt Patterns
Debtmap detects 35 types of technical debt, organized into 4 strategic categories. Each debt type is mapped to a category that guides prioritization and remediation strategies.
The 35 debt types consist of:
- 26 priority-specific variants: Specialized debt patterns with detailed diagnostic data
- 9 legacy variants: Backward-compatible types from earlier versions (prefer specialized variants)
Source: DebtType enum defined in src/priority/debt_types.rs:47-205, category mappings in src/priority/mod.rs:216-254
Debt Type Enum
The DebtType enum defines all specific debt patterns that Debtmap can detect.
Note: Each DebtType variant carries structured diagnostic data specific to that pattern. For example, ComplexityHotspot includes cyclomatic and cognitive fields that provide detailed metrics for the detected issue. This structured data enables precise prioritization.
Source: DebtType structure defined in src/priority/debt_types.rs:47-205
Priority-Specific Debt Types (26 types)
These are the primary debt types with specialized detection logic and detailed diagnostic data.
Testing Debt (6 types):
- TestingGap - Functions with insufficient test coverage (fields: coverage, cyclomatic, cognitive)
- TestTodo - TODO comments in test code (fields: priority, reason)
- TestDuplication - Duplicated code in test files (fields: instances, total_lines, similarity)
- TestComplexityHotspot - Complex test logic that’s hard to maintain (fields: cyclomatic, cognitive, threshold)
- AssertionComplexity - Complex test assertions (fields: assertion_count, complexity_score)
- FlakyTestPattern - Non-deterministic test behavior (fields: pattern_type, reliability_impact)
Architecture Debt (6 types):
- GodObject - Unified variant for classes/files/modules with too many responsibilities (fields: methods, fields, responsibilities, god_object_score, lines). The detection type (GodClass, GodFile, GodModule) is determined by god_object_indicators.detection_type in the parent structure.
- FeatureEnvy - Using more data from other objects than own (fields: external_class, usage_ratio)
- PrimitiveObsession - Overusing basic types instead of domain objects (fields: primitive_type, domain_concept)
- ScatteredType - Types with methods scattered across multiple files (fields: type_name, total_methods, file_count, severity as String)
- OrphanedFunctions - Functions that should be methods on a type (fields: target_type, function_count, file_count)
- UtilitiesSprawl - Utility modules with poor cohesion (fields: function_count, distinct_types)
Performance Debt (8 types):
- AllocationInefficiency - Inefficient memory allocations (fields: pattern, impact)
- StringConcatenation - Inefficient string building in loops (fields: loop_type, iterations)
- NestedLoops - Multiple nested iterations O(n²) or worse (fields: depth, complexity_estimate)
- BlockingIO - Blocking I/O in async contexts (fields: operation, context)
- SuboptimalDataStructure - Wrong data structure for access pattern (fields: current_type, recommended_type)
- AsyncMisuse - Improper async/await usage (fields: pattern, performance_impact)
- ResourceLeak - Resources not properly released (fields: resource_type, cleanup_missing)
- CollectionInefficiency - Inefficient collection operations (fields: collection_type, inefficiency_type)
Code Quality Debt (6 types):
- ComplexityHotspot - Functions exceeding complexity thresholds (fields: cyclomatic, cognitive)
- DeadCode - Unreachable or unused code (fields: visibility, cyclomatic, cognitive, usage_hints)
- MagicValues - Unexplained literal values (fields: value, occurrences)
- Risk - High-risk code combining complexity + poor test coverage (fields: risk_score, factors)
- Duplication - Duplicated code blocks (fields: instances, total_lines)
- ErrorSwallowing - Errors caught but ignored (fields: pattern, context)
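Conceptually, each variant bundles its diagnostic fields, as in this illustrative sketch (a few variants shown; not the literal enum from src/priority/debt_types.rs):

```rust
// Illustrative sketch of how DebtType variants carry structured data.
enum DebtType {
    TestingGap { coverage: f64, cyclomatic: u32, cognitive: u32 },
    ComplexityHotspot { cyclomatic: u32, cognitive: u32 },
    NestedLoops { depth: u32, complexity_estimate: String },
    ErrorSwallowing { pattern: String, context: Option<String> },
    // ... remaining variants follow the same shape
}
```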
Legacy Debt Types (9 types)
These variants are maintained for backward compatibility. For new analysis, prefer the specialized priority-specific variants listed above.
Source: Legacy variants defined in src/priority/debt_types.rs:48-77
| Legacy Type | Preferred Alternative | Migration Note |
|---|---|---|
| Todo | Use for general TODO markers | Fields: reason |
| Fixme | Use for general FIXME markers | Fields: reason |
| CodeSmell | Use specific debt types | Fields: smell_type |
| Complexity | ComplexityHotspot | Fields: cyclomatic, cognitive |
| Dependency | Specific architecture types | Fields: dependency_type |
| ResourceManagement | ResourceLeak, AllocationInefficiency | Fields: issue_type |
| CodeOrganization | ScatteredType, OrphanedFunctions, UtilitiesSprawl | Fields: issue_type |
| TestComplexity | TestComplexityHotspot | Fields: cyclomatic, cognitive |
| TestQuality | FlakyTestPattern, AssertionComplexity | Fields: issue_type |
Note: Legacy types are categorized under CodeQuality by default (see src/priority/mod.rs:251-252).
Debt Categories
The DebtCategory enum groups debt types into strategic categories:
Source: src/priority/mod.rs:196-213
pub enum DebtCategory {
Architecture, // Structure, design, complexity
Testing, // Coverage, test quality
Performance, // Speed, memory, efficiency
CodeQuality, // Maintainability, readability
}
Category Mapping
Source: Category assignment logic in src/priority/mod.rs:216-254
| Category | Debt Types | Strategic Focus |
|---|---|---|
| Architecture | GodObject, FeatureEnvy, PrimitiveObsession, ScatteredType, OrphanedFunctions, UtilitiesSprawl | Structural improvements, design patterns, type organization |
| Testing | TestingGap, TestComplexityHotspot, TestTodo, TestDuplication, AssertionComplexity, FlakyTestPattern | Test coverage, test quality |
| Performance | AsyncMisuse, CollectionInefficiency, NestedLoops, BlockingIO, AllocationInefficiency, StringConcatenation, SuboptimalDataStructure, ResourceLeak | Runtime efficiency, resource usage |
| CodeQuality | ComplexityHotspot, DeadCode, Duplication, Risk, ErrorSwallowing, MagicValues, all legacy types | Maintainability, reliability, code clarity |
Note: GodObject is a unified variant that handles all god object detection types (GodClass, GodFile, GodModule). There is no separate GodModule debt type in the enum.
Language-Specific Debt Patterns:
Some debt patterns only apply to languages with specific features:
- BlockingIO, AsyncMisuse: Async-capable languages (Rust)
- AllocationInefficiency, ResourceLeak: Languages with manual memory management (Rust)
- Error handling patterns: Vary by language error model (Result in Rust)
Debtmap automatically applies only the relevant debt patterns during analysis.
Examples by Category
Architecture Debt
GodObject (unified variant for GodClass/GodFile/GodModule): Too many responsibilities
Source: src/priority/debt_types.rs:143-154
// God module: handles parsing, validation, storage, notifications
mod user_service {
fn parse_user() { /* ... */ }
fn validate_user() { /* ... */ }
fn save_user() { /* ... */ }
fn send_email() { /* ... */ }
fn log_activity() { /* ... */ }
// ... 20+ more functions
}
The GodObject debt type captures all god object patterns through a unified variant. The specific detection type (GodClass, GodFile, or GodModule) is stored in the parent UnifiedDebtItem.god_object_indicators.detection_type field:
| Detection Type | When Used | Fields Used |
|---|---|---|
| GodClass | Class with too many methods/fields | methods, fields (Some), responsibilities |
| GodFile | File with too many functions | methods, fields (None), lines |
| GodModule | Module with too many responsibilities | methods, fields (None), responsibilities |
When detected: Complexity-weighted scoring system (see detailed explanation below)
Action: Split into focused modules (parser, validator, repository, notifier)
Complexity-Weighted God Object Detection
Debtmap uses complexity-weighted scoring for god object detection to reduce false positives on well-refactored code. This ensures that a file with 100 simple helper functions doesn’t rank higher than a file with 10 complex functions.
The Problem:
Traditional god object detection counts methods:
- File A: 100 methods (average complexity: 1.5) → Flagged as god object
- File B: 10 methods (average complexity: 17.0) → Not flagged
But File A might be a well-organized utility module with many small helpers, while File B is truly problematic with highly complex functions that need refactoring.
The Solution:
Debtmap weights each function by its cyclomatic complexity using this formula:
weight = (max(1, complexity) / 3)^1.5
Weight Examples:
- Simple helper (complexity 1): weight ≈ 0.19
- Baseline function (complexity 3): weight = 1.0
- Moderate function (complexity 9): weight ≈ 5.2
- Complex function (complexity 17): weight ≈ 13.5
- Critical function (complexity 33): weight ≈ 36.5
God Object Score Calculation:
weighted_method_count = sum(weight for each function)
complexity_penalty = 0.7 if avg_complexity < 3, 1.0 if 3-10, 1.5 if > 10
god_object_score = (
(weighted_method_count / threshold) * 40% +
(field_count / threshold) * 20% +
(responsibility_count / threshold) * 15% +
(lines_of_code / 500) * 25%
) * complexity_penalty
Threshold: God object detected if score >= 70.0
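The weight formula translates directly to Rust. This sketch reproduces only the weighting step; the full scorer also applies the field, responsibility, and lines-of-code terms plus the complexity penalty shown above:

```rust
// Sketch of complexity weighting for god object detection.
fn method_weight(cyclomatic: u32) -> f64 {
    // weight = (max(1, complexity) / 3)^1.5
    (cyclomatic.max(1) as f64 / 3.0).powf(1.5)
}

fn weighted_method_count(complexities: &[u32]) -> f64 {
    complexities.iter().map(|&c| method_weight(c)).sum()
}

fn main() {
    println!("{:.2}", method_weight(1));  // ~0.19
    println!("{:.2}", method_weight(3));  // 1.00
    println!("{:.2}", method_weight(17)); // ~13.49
}
```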
Real-World Example:
shared_cache.rs:
- 100 functions, average complexity: 1.5
- Weighted score: ~19.0 (100 * 0.19)
- God object score: 45.2
- Result: Not a god object ✓
legacy_parser.rs:
- 10 functions, average complexity: 17.0
- Weighted score: ~135.0 (10 * 13.5)
- God object score: 87.3
- Result: God object detected ✓
Benefits:
- Reduces false positives on utility modules with many simple functions
- Focuses attention on truly problematic complex modules
- Rewards good refactoring - breaking large functions into small helpers improves score
- Aligns with reality - complexity matters more than count for maintainability
How to View:
When Debtmap detects a god object, the output includes:
- Raw method count
- Weighted method count
- Average complexity
- God object score
- Recommended module splits based on responsibility clustering
Type Organization Debt
Source: Type organization patterns defined in src/priority/debt_types.rs:189-205 (Spec 187), detection logic in src/organization/codebase_type_analyzer.rs
These patterns detect issues with how types and their associated functions are organized across the codebase.
ScatteredType: Type with methods scattered across multiple files
// Type definition in types/user.rs
pub struct User {
id: UserId,
name: String,
}
// Methods scattered across files:
// In modules/auth.rs:
impl User {
fn authenticate(&self) -> Result<Session> { /* ... */ }
}
// In modules/validation.rs:
impl User {
fn validate_email(&self) -> Result<()> { /* ... */ }
}
// In modules/persistence.rs:
impl User {
fn save(&self) -> Result<()> { /* ... */ }
}
// Problem: User methods spread across 4+ files!
When detected: Type has methods in 2+ files
Severity levels: Stored as a String field derived from file_count:
- Low = 2 files
- Medium = 3-5 files
- High = 6+ files
Action: Consolidate methods into primary file or create focused trait implementations
Source: Detection criteria from src/organization/codebase_type_analyzer.rs:46-47, type definition in src/priority/debt_types.rs:190-195
OrphanedFunctions: Functions that should be methods on a type
// Bad: Orphaned functions operating on User
fn validate_user_email(user: &User) -> Result<()> {
// Email validation logic
}
fn calculate_user_age(user: &User) -> u32 {
// Age calculation from birthdate
}
fn format_user_display(user: &User) -> String {
// Display formatting
}
// Good: Functions converted to methods
impl User {
fn validate_email(&self) -> Result<()> { /* ... */ }
fn age(&self) -> u32 { /* ... */ }
fn display(&self) -> String { /* ... */ }
}
When detected: Multiple functions with shared type parameter pattern (e.g., all take &User)
Action: Convert functions to methods on the target type
Source: Detection in src/organization/codebase_type_analyzer.rs:58-71, type definition in src/priority/debt_types.rs:196-200
UtilitiesSprawl: Utility module with poor cohesion
// Bad: utils.rs with mixed responsibilities
mod utils {
fn parse_date(s: &str) -> Date { /* ... */ }
fn validate_email(s: &str) -> bool { /* ... */ }
fn calculate_hash(data: &[u8]) -> Hash { /* ... */ }
fn format_currency(amount: f64) -> String { /* ... */ }
fn send_notification(msg: &str) { /* ... */ }
// ... 20+ more unrelated functions
}
// Good: Focused modules
mod date_utils { fn parse(s: &str) -> Date { /* ... */ } }
mod validators { fn email(s: &str) -> bool { /* ... */ } }
mod crypto { fn hash(data: &[u8]) -> Hash { /* ... */ } }
mod formatters { fn currency(amount: f64) -> String { /* ... */ } }
mod notifications { fn send(msg: &str) { /* ... */ } }
When detected: Utility module has many functions operating on diverse types with low cohesion
Action: Split into focused modules based on domain or responsibility
Source: Detection in src/organization/codebase_type_analyzer.rs:74-80, type definition in src/priority/debt_types.rs:201-204
Testing Debt
TestingGap: Functions with insufficient test coverage
// 0% coverage - critical business logic untested
fn calculate_tax(amount: f64, region: &str) -> f64 {
// Complex tax calculation logic
// No tests exist for this function!
}
When detected: Coverage data shows function has < 80% line coverage
Action: Add unit tests to cover all branches and edge cases
TestComplexity: Test functions too complex
#[test]
fn complex_test() {
// Cyclomatic: 12 (too complex for a test)
for input in test_cases {
if input.is_special() {
match input.type {
/* complex test logic */
}
}
}
}
When detected: Test functions with cyclomatic > 10 or cognitive > 15
Action: Split into multiple focused tests, use test fixtures
FlakyTestPattern: Non-deterministic tests
#[test]
fn flaky_test() {
let result = async_operation().await; // Timing-dependent
thread::sleep(Duration::from_millis(100)); // Race condition!
assert_eq!(result.status, "complete");
}
When detected: Pattern analysis for timing dependencies, random values
Action: Use mocks, deterministic test data, proper async test utilities
Performance Debt
AllocationInefficiency: Excessive allocations
// Bad: Allocates on every iteration
fn process_items(items: &[Item]) -> Vec<String> {
let mut results = Vec::new();
for item in items {
results.push(item.name.clone()); // Unnecessary clone
}
results
}
// Good: Pre-allocate, avoid clones
fn process_items(items: &[Item]) -> Vec<&str> {
items.iter().map(|item| item.name.as_str()).collect()
}
BlockingIO: Blocking operations in async contexts
// Bad: Blocks async runtime
async fn load_data() -> Result<Data> {
let file = std::fs::read_to_string("data.json")?; // Blocking!
parse_json(&file)
}
// Good: Async I/O
async fn load_data() -> Result<Data> {
let file = tokio::fs::read_to_string("data.json").await?;
parse_json(&file)
}
NestedLoops: O(n²) or worse complexity
// Bad: O(n²) nested loops
fn find_duplicates(items: &[Item]) -> Vec<(Item, Item)> {
let mut dupes = vec![];
for i in 0..items.len() {
for j in i+1..items.len() {
if items[i] == items[j] {
dupes.push((items[i].clone(), items[j].clone()));
}
}
}
dupes
}
// Good: O(n) with HashSet
fn find_duplicates(items: &[Item]) -> Vec<Item> {
let mut seen = HashSet::new();
items.iter().filter(|item| !seen.insert(item)).cloned().collect()
}
Code Quality Debt
ComplexityHotspot: Functions exceeding complexity thresholds
Source: Type definition in src/priority/debt_types.rs:84-87, categorized as CodeQuality in src/priority/mod.rs:245
// Cyclomatic: 22, Cognitive: 35
fn process_transaction(tx: Transaction, account: &mut Account) -> Result<Receipt> {
if tx.amount <= 0 {
return Err(Error::InvalidAmount);
}
if account.balance < tx.amount {
if account.overdraft_enabled {
if account.overdraft_limit >= tx.amount {
// More nested branches...
}
} else {
return Err(Error::InsufficientFunds);
}
}
// ... deeply nested logic with many branches
Ok(receipt)
}
Fields captured: cyclomatic: u32, cognitive: u32
When detected: Cyclomatic > 10 OR Cognitive > 15 (configurable)
Action: Break into smaller functions, extract validation, simplify control flow
DeadCode: Unreachable or unused code
Source: Type definition in src/priority/debt_types.rs:88-93, categorized as CodeQuality in src/priority/mod.rs:246
// Private function never called within module
fn obsolete_calculation(x: i32) -> i32 {
x * 2 + 5 // Dead code - no callers
}
// Public function but no external usage
pub fn deprecated_api(data: &str) -> Result<()> {
// Unreachable in practice
Ok(())
}
When detected: Function visibility analysis + call graph shows no callers
Action: Remove dead code or document if intentionally kept for future use
MagicValues: Unexplained literal values
Source: Type definition in src/priority/debt_types.rs:163-166, categorized as CodeQuality in src/priority/mod.rs:250
// Bad: Magic numbers
fn calculate_price(quantity: u32) -> f64 {
quantity as f64 * 19.99 + 5.0 // What are these numbers?
}
// Good: Named constants
const UNIT_PRICE: f64 = 19.99;
const SHIPPING_COST: f64 = 5.0;
fn calculate_price(quantity: u32) -> f64 {
quantity as f64 * UNIT_PRICE + SHIPPING_COST
}
When detected: Numeric or string literals without clear context (excludes 0, 1, common patterns)
Action: Extract to named constants or configuration
Duplication: Duplicated code blocks
// File A:
fn process_user(user: User) -> Result<()> {
validate_email(&user.email)?;
validate_age(user.age)?;
save_to_database(&user)?;
send_welcome_email(&user.email)?;
Ok(())
}
// File B: Duplicated validation
fn process_admin(admin: Admin) -> Result<()> {
validate_email(&admin.email)?; // Duplicated
validate_age(admin.age)?; // Duplicated
save_to_database(&admin)?;
grant_admin_privileges(&admin)?;
Ok(())
}
When detected: Similar code blocks > 50 lines (configurable)
Action: Extract shared code into reusable functions
ErrorSwallowing: Errors caught but ignored
// Bad: Error swallowed, no context
match risky_operation() {
Ok(result) => process(result),
Err(_) => {}, // Silent failure!
}
// Good: Error handled with context
match risky_operation() {
Ok(result) => process(result),
Err(e) => {
log::error!("Risky operation failed: {}", e);
return Err(e.into());
}
}
When detected: Empty catch blocks, ignored Results
Action: Add proper error logging and propagation
Risk: High-risk code (complex + poorly tested)
// Cyclomatic: 18, Coverage: 20%, Risk Score: 47.6 (HIGH)
fn process_payment(tx: Transaction) -> Result<Receipt> {
// Complex payment logic with minimal testing
// High risk of bugs in production
}
When detected: Combines complexity metrics with coverage data
Action: Either add comprehensive tests OR refactor to reduce complexity
Debt Scoring Formula
Each debt item gets a score based on priority and type:
debt_score = priority_weight × type_weight
Priority weights:
- Low = 1
- Medium = 3
- High = 5
- Critical = 10
Combined examples:
- Low Todo = 1 × 1 = 1
- Medium Fixme = 3 × 2 = 6
- High Complexity = 5 × 5 = 25
- Critical Complexity = 10 × 5 = 50
Total debt score = Sum of all debt item scores
Lower is better. Track over time to measure improvement.
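A sketch of the scoring arithmetic, assuming the type weights implied by the examples above (Todo = 1, Fixme = 2, Complexity = 5; the full weight table lives in the implementation):

```rust
// Sketch of debt_score = priority_weight × type_weight.
fn priority_weight(priority: &str) -> u32 {
    match priority {
        "low" => 1,
        "medium" => 3,
        "high" => 5,
        "critical" => 10,
        _ => 1,
    }
}

fn main() {
    let fixme_type_weight = 2;      // inferred from "Medium Fixme = 3 × 2 = 6"
    let complexity_type_weight = 5; // inferred from "High Complexity = 5 × 5 = 25"
    assert_eq!(priority_weight("medium") * fixme_type_weight, 6);
    assert_eq!(priority_weight("critical") * complexity_type_weight, 50);
}
```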
Risk Scoring
Debtmap’s risk scoring identifies code that is both complex AND poorly tested - the true risk hotspots.
Unified Scoring System
Debtmap uses a unified scoring system (0-10 scale) as the primary prioritization mechanism. This multi-factor approach balances complexity, test coverage, and dependency impact, adjusted by function role.
Source: src/priority/unified_scorer.rs:97-144 (UnifiedScore struct)
Score Scale and Priority Classifications
Functions receive scores from 0 (minimal risk) to 10 (critical risk):
| Score Range | Priority | Description | Action |
|---|---|---|---|
| 9.0-10.0 | Critical | Severe risk requiring immediate attention | Address immediately |
| 7.0-8.9 | High | Significant risk, should be addressed soon | Plan for this sprint |
| 5.0-6.9 | Medium | Moderate risk, plan for future work | Schedule for next sprint |
| 3.0-4.9 | Low | Minor risk, lower priority | Monitor and address as time permits |
| 0.0-2.9 | Minimal | Well-managed code | Continue monitoring |
Scoring Formula
The unified score combines three weighted factors:
Base Score = (Complexity Factor × Weight) + (Coverage Factor × Weight) + (Dependency Factor × Weight)
Final Score = Base Score × Role Multiplier × Purity Adjustment
Source: src/priority/unified_scorer.rs:300-325 (calculate_unified_priority_with_debt)
Dynamic Weight Adjustment
IMPORTANT: Weights are dynamically adjusted based on coverage data availability.
When coverage data is available (default):
- Complexity: ~35-40% (via complexity_factor)
- Coverage: ~35-40% (via coverage multiplier dampening)
- Dependency: ~20-25%
When coverage data is NOT available:
- Complexity: 50%
- Dependency: 25%
- Debt patterns: 25% (reserved for additive adjustments)
Source:
- With coverage: src/priority/scoring/calculation.rs:68-82 (calculate_base_score_with_coverage_multiplier)
- Without coverage: src/priority/scoring/calculation.rs:119-129 (calculate_base_score_no_coverage)
These weights can be adjusted in .debtmap.toml to match your team’s priorities.
Factor Calculations
Complexity Factor (0-10 scale):
// Source: src/priority/scoring/calculation.rs:54-59
Complexity Factor = (raw_complexity / 2.0).clamp(0.0, 10.0)
// Where raw_complexity is weighted combination:
// Default: 30% cyclomatic + 70% cognitive
// For orchestrators: 25% cyclomatic + 75% cognitive
Maps normalized complexity (0-20 range) to 0-10 scale. Uses configurable weights that prioritize cognitive complexity (70%) over cyclomatic complexity (30%) as it correlates better with defect density. See Complexity Metrics for detailed explanations of cyclomatic vs cognitive complexity.
Source: src/config/scoring.rs:221-267 (ComplexityWeightsConfig)
Coverage Factor (0-10 scale):
// Source: src/priority/scoring/calculation.rs:8-21
Coverage Multiplier = 1.0 - coverage_percentage
// Applied as dampening:
Base Score × Coverage Multiplier
Coverage acts as a dampening multiplier:
- 0% coverage → multiplier = 1.0 (no dampening)
- 50% coverage → multiplier = 0.5 (50% reduction)
- 100% coverage → multiplier = 0.0 (maximum dampening)
Uncovered complex code scores higher than uncovered simple code. Well-tested code gets lower scores.
Dependency Factor (0-10 scale):
// Source: src/priority/scoring/calculation.rs:61-66
Dependency Factor = (upstream_caller_count / 2.0).min(10.0)
Based on call graph analysis with linear scaling:
- 0-1 upstream callers → score 0-0.5 (low impact)
- 2-4 upstream callers → score 1.0-2.0 (moderate impact)
- 5+ upstream callers → score 2.5-10.0 (high impact, capped at 10.0)
Critical path bonus: Functions on critical paths from entry points receive additional dependency weight.
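Putting the three factor formulas together, here is a sketch of the with-coverage base score. The 0.35/0.20 weights mirror the worked example later in this chapter; in the real implementation they are configurable and dynamically adjusted:

```rust
// Sketch of the with-coverage base score from the formulas above.
fn complexity_factor(cyclomatic: f64, cognitive: f64) -> f64 {
    let raw = cyclomatic * 0.3 + cognitive * 0.7; // default weights
    (raw / 2.0).clamp(0.0, 10.0)
}

fn dependency_factor(upstream_callers: f64) -> f64 {
    (upstream_callers / 2.0).min(10.0)
}

fn base_score(cyclomatic: f64, cognitive: f64, coverage: f64, callers: f64) -> f64 {
    let coverage_multiplier = 1.0 - coverage; // dampening, not additive
    (complexity_factor(cyclomatic, cognitive) * 0.35
        + dependency_factor(callers) * 0.20)
        * coverage_multiplier
}

fn main() {
    // process_payment: cyclomatic 18, cognitive 25, 20% coverage, 3 callers
    println!("{:.2}", base_score(18.0, 25.0, 0.20, 3.0)); // 3.04
}
```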
Role-Based Prioritization
The unified score is multiplied by a role multiplier based on the function’s semantic classification.
Source: src/priority/semantic_classifier/mod.rs:24-33 (FunctionRole enum)
Role Multipliers
| Role | Multiplier | Description | When Applied |
|---|---|---|---|
| EntryPoint | 1.5× | main(), HTTP handlers, API endpoints | User-facing code where bugs have immediate impact |
| PureLogic (complex) | 1.3× | Business logic with complexity > 5.0 | Critical domain functions |
| PureLogic (simple) | 1.0× | Business logic with complexity ≤ 5.0 | Baseline importance for domain code |
| Orchestrator | 0.8× | Coordinates 5+ other functions | Delegation-heavy code with low cognitive load |
| PatternMatch | 0.6× | Simple pattern matching functions | Low complexity branching logic |
| IOWrapper | 0.5× | Thin I/O layer (file, network, database) | Simple wrappers around external systems |
| Debug | 0.3× | Debug/diagnostic functions | Lowest test priority |
Source:
- Multiplier values: src/priority/unified_scorer.rs:624-635 (calculate_role_multiplier)
- Configuration defaults: src/config/scoring.rs:147-220 (RoleMultipliers)
Note: Configuration allows overriding these default multipliers via .debtmap.toml. See Configuration for details on customizing role weights.
Note: PureLogic has a dynamic multiplier that adjusts based on complexity. Simple business logic (≤ 5.0 complexity) gets baseline priority, while complex business logic (> 5.0) receives elevated priority (1.3×).
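As a sketch, the table translates to a match like the following (values are the documented defaults; the implementation reads them from configuration):

```rust
// Sketch of role multipliers, including the dynamic PureLogic case.
enum FunctionRole {
    EntryPoint,
    PureLogic,
    Orchestrator,
    PatternMatch,
    IOWrapper,
    Debug,
}

fn role_multiplier(role: &FunctionRole, complexity: f64) -> f64 {
    match role {
        FunctionRole::EntryPoint => 1.5,
        // PureLogic is elevated only for complex business logic
        FunctionRole::PureLogic => {
            if complexity > 5.0 { 1.3 } else { 1.0 }
        }
        FunctionRole::Orchestrator => 0.8,
        FunctionRole::PatternMatch => 0.6,
        FunctionRole::IOWrapper => 0.5,
        FunctionRole::Debug => 0.3,
    }
}
```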
How Role Classification Works
Debtmap identifies function roles through a rule-based classifier with specific detection heuristics:
Source: src/priority/semantic_classifier/mod.rs:46-114 (classify_by_rules)
Detection Rules (in priority order):
1. EntryPoint - Detected by:
   - Name patterns: main, handle_*, run_*
   - Call graph analysis: no upstream callers (entry point to call graph)
   - Source: Line 54
2. Debug - Detected by:
   - Name patterns: debug_*, dump_*, log_*, print_*, display_*, trace_*, *_diagnostics, *_debug, *_stats
   - Complexity limit: cognitive ≤ 10
   - Source: Line 59, src/priority/semantic_classifier/classifiers.rs:14-65
3. Constructors (classified as PureLogic) - Detected by:
   - Name patterns: new, with_*, from_*, default, create_*, make_*, build_*
   - Complexity thresholds: cyclomatic ≤ 2, cognitive ≤ 3, length < 15, nesting ≤ 1
   - Source: Line 64, src/priority/semantic_classifier/classifiers.rs:67-115
4. Accessors (classified as IOWrapper) - Detected by:
   - Name patterns: get_*, is_*, has_*, can_*, should_*, as_*, to_*, single-word accessors (id, name, value, etc.)
   - Complexity thresholds: cyclomatic ≤ 2, cognitive ≤ 1, length < 10, nesting ≤ 1
   - Source: Line 77, src/priority/semantic_classifier/mod.rs:147-177 (is_accessor_method)
5. PatternMatch - Detected by:
   - Simple match/if-else chains
   - Low complexity relative to branch count
   - Source: Line 99
6. IOWrapper - Detected by:
   - Simple file/network/database operations
   - Thin wrapper around I/O primitives
   - Source: Line 104
7. Orchestrator - Detected by:
   - High delegation ratio (calls 5+ functions)
   - Low cognitive complexity relative to cyclomatic complexity
   - Coordinates other functions without complex logic
   - Source: Line 109
8. PureLogic (default) - Applied when:
   - None of the above patterns match
   - Assumed to be core business logic
Example: Same Complexity, Different Priorities
Consider a function with base score 8.0:
If classified as EntryPoint:
Final Score = 8.0 × 1.5 = 12.0 (capped at 10.0) → CRITICAL priority
If classified as PureLogic (complex):
Final Score = 8.0 × 1.3 = 10.4 (capped at 10.0) → CRITICAL priority
If classified as PureLogic (simple):
Final Score = 8.0 × 1.0 = 8.0 → HIGH priority
If classified as Orchestrator:
Final Score = 8.0 × 0.8 = 6.4 → MEDIUM priority
If classified as IOWrapper:
Final Score = 8.0 × 0.5 = 4.0 → LOW priority
This ensures that complex code in critical paths gets higher priority than equally complex utility code.
Real Example from Codebase:
A payment processing function with cyclomatic complexity 18 and cognitive complexity 25:
- If it directly implements business logic → PureLogic (complex) → 1.3× multiplier
- If it mainly delegates to other payment functions → Orchestrator → 0.8× multiplier
- If it’s a thin wrapper around a payment API → IOWrapper → 0.5× multiplier
Coverage Propagation
Coverage impact flows through the call graph using transitive coverage and indirect coverage analysis.
Source: src/priority/coverage_propagation.rs:291-387
How It Works
Transitive coverage is calculated via call graph traversal with distance-based dampening:
// Source: src/priority/coverage_propagation.rs:342-364
Indirect Coverage = Σ(Caller Coverage × 0.7^distance)
Where:
- distance = hops from tested code (MAX_DEPTH = 3)
- DISTANCE_DISCOUNT = 0.7 (70% per hop)
- Well-tested threshold = 0.8 (80% coverage)
Implementation Details:
- Transitive coverage is calculated via recursive call graph traversal
- Results are stored in the UnifiedDebtItem.transitive_coverage field (Source: src/priority/unified_scorer.rs:154)
- Weights decay exponentially with call graph depth (see the sketch after this list):
- 2 hops away: contribution × 0.49 (0.7²)
- 3 hops away: contribution × 0.343 (0.7³)
- Used to adjust coverage factor in scoring, reducing false positives for utility functions
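Here is the sketch referenced above. It assumes the highest-contributing caller determines indirect coverage, which matches Scenario 1 below; the actual traversal may aggregate contributions differently:

```rust
// Sketch of distance-discounted coverage propagation.
const DISTANCE_DISCOUNT: f64 = 0.7;
const MAX_DEPTH: u32 = 3;

// Each entry is (caller coverage, distance in hops).
fn indirect_coverage(callers: &[(f64, u32)]) -> f64 {
    callers
        .iter()
        .filter(|(_, d)| *d >= 1 && *d <= MAX_DEPTH)
        .map(|(cov, d)| cov * DISTANCE_DISCOUNT.powi(*d as i32))
        .fold(0.0, f64::max) // take the highest contributor
}

fn main() {
    // Scenario 1 below: three well-tested direct callers
    let callers = [(0.95, 1), (0.90, 1), (0.88, 1)];
    println!("{:.1}%", indirect_coverage(&callers) * 100.0); // 66.5%
}
```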
Coverage Urgency Calculation
The system calculates coverage urgency (0-10 scale) by blending direct and transitive coverage:
// Source: src/priority/coverage_propagation.rs:237-270
Effective Coverage = (Direct Coverage × 0.7) + (Transitive Coverage × 0.3)
Coverage Urgency = (1.0 - Effective Coverage) × Complexity Weight × 10.0
Complexity weighting uses logarithmic scaling to prioritize complex functions.
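A sketch of the blend, with complexity_weight left as an input (the implementation derives it via logarithmic scaling):

```rust
// Sketch of coverage urgency from the formula above.
fn coverage_urgency(direct: f64, transitive: f64, complexity_weight: f64) -> f64 {
    let effective = direct * 0.7 + transitive * 0.3;
    (1.0 - effective) * complexity_weight * 10.0
}

fn main() {
    // Scenario 1 below: 0% direct, ~66% transitive -> ~20% effective coverage
    println!("{:.1}", coverage_urgency(0.0, 0.66, 1.0)); // 8.0 urgency at weight 1.0
}
```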
Example Scenarios
Scenario 1: Untested function with well-tested callers
Function A: 0% direct coverage
Called by (1 hop):
- handle_request (95% coverage): contributes 95% × 0.7 = 66.5%
- process_payment (90% coverage): contributes 90% × 0.7 = 63%
- validate_order (88% coverage): contributes 88% × 0.7 = 61.6%
Indirect coverage: ~66% (highest contributor)
Effective coverage: (0% × 0.7) + (66% × 0.3) = ~20%
Final priority: Lower than isolated 0% coverage function
Scenario 2: Untested function on critical path
Function B: 0% direct coverage
Called by (1 hop):
- main (0% coverage): contributes 0% × 0.7 = 0%
- startup (10% coverage): contributes 10% × 0.7 = 7%
Indirect coverage: ~7% (minimal coverage benefit)
Effective coverage: (0% × 0.7) + (7% × 0.3) = ~2%
Final priority: Higher - on critical path with no safety net
Scenario 3: Multi-hop propagation
Function C: 0% direct coverage
Called by utility_helper (40% coverage, 1 hop):
utility_helper is called by:
- api_handler (95% coverage, 2 hops): contributes 95% × 0.7² = 46.6%
Indirect coverage via 2-hop path: ~46%
Effective coverage: ~14%
Final priority: Moderate - benefits from indirect testing
Coverage propagation prevents false alarms about utility functions called only by well-tested code, while highlighting genuinely risky untested code on critical paths.
Unified Score Example
A worked example using the actual implementation:
Function: process_payment
Location: src/payments.rs:145
Metrics:
- Cyclomatic complexity: 18
- Cognitive complexity: 25
- Test coverage: 20%
- Upstream callers: 3
- Classified role: PureLogic (complex, since complexity > 5.0)
Step 1: Calculate raw complexity
Raw Complexity = (cyclomatic × 0.3) + (cognitive × 0.7)
= (18 × 0.3) + (25 × 0.7)
= 5.4 + 17.5
= 22.9
Step 2: Normalize to 0-10 scale
Complexity Factor = (22.9 / 2.0).clamp(0.0, 10.0)
= 10.0 (capped)
// Source: src/priority/scoring/calculation.rs:54-59
Step 3: Calculate coverage multiplier
Coverage Multiplier = 1.0 - 0.20 = 0.80
// Source: src/priority/scoring/calculation.rs:8-21
Step 4: Calculate dependency factor
Dependency Factor = (3 / 2.0).min(10.0) = 1.5
// Source: src/priority/scoring/calculation.rs:61-66
Step 5: Calculate base score (with dynamic weights)
Base Score = (Complexity Factor × weight) + (Coverage dampening) + (Dependency Factor × weight)
// Actual implementation uses coverage as dampening multiplier
Base = ((10.0 × 0.35) + (1.5 × 0.20)) × 0.80
= (3.5 + 0.3) × 0.80
= 3.04
// Source: src/priority/scoring/calculation.rs:68-82
Step 6: Apply role multiplier
Role Multiplier = 1.3 (PureLogic with complexity > 5.0)
// Source: src/priority/unified_scorer.rs:624-635
Final Score = 3.04 × 1.3 = 3.95 → LOW priority
Note: The 20% coverage dampening significantly reduces the final score.
If this function had 0% coverage:
Coverage Multiplier = 1.0 (no dampening)
Base Score = 3.8
Final Score = 3.8 × 1.3 = 4.94 → LOW priority
If this function had 0% coverage AND higher dependency (8 callers):
Dependency Factor = (8 / 2.0).min(10.0) = 4.0
Base Score = ((10.0 × 0.35) + (4.0 × 0.20)) × 1.0 = 4.3
Final Score = 4.3 × 1.3 = 5.59 → MEDIUM priority
Key Insight: Coverage acts as a dampening multiplier, not an additive factor. The example in the original documentation overestimated risk by treating coverage as additive. The actual implementation properly dampens scores for tested code.
Legacy Risk Scoring (Pre-0.2.x)
Prior to the unified scoring system, Debtmap used a simpler additive risk formula. This is still available for compatibility but unified scoring is now the default and provides better prioritization.
Risk Categories
Note: The RiskLevel enum (Low, Medium, High, Critical) is used for legacy risk scoring compatibility. When using unified scoring (0-10 scale), refer to the priority classifications shown in the Unified Scoring System section above.
Legacy RiskLevel Enum
For legacy risk scoring, Debtmap classifies functions into four risk levels:
pub enum RiskLevel {
Low, // Score < 10
Medium, // Score 10-24
High, // Score 25-49
Critical, // Score ≥ 50
}
Critical (legacy score ≥ 50)
- High complexity (cyclomatic > 15) AND low coverage (< 30%)
- Untested code that’s likely to break and hard to fix
- Action: Immediate attention required - add tests or refactor
High (legacy score 25-49)
- High complexity (cyclomatic > 10) AND moderate coverage (< 60%)
- Risky code with incomplete testing
- Action: Should be addressed soon
Medium (legacy score 10-24)
- Moderate complexity (cyclomatic > 5) AND low coverage (< 50%)
- OR: High complexity with good coverage
- Action: Plan for next sprint
Low (legacy score < 10)
- Low complexity OR high coverage
- Well-managed code
- Action: Monitor, low priority
Unified Scoring Priority Levels
When using unified scoring (default), functions are classified using the 0-10 scale:
- Critical (9.0-10.0): Immediate attention
- High (7.0-8.9): Address this sprint
- Medium (5.0-6.9): Plan for next sprint
- Low (3.0-4.9): Monitor and address as time permits
- Minimal (0.0-2.9): Well-managed code
Well-tested complex code is an outcome in both systems, not a separate category:
- Complex function (cyclomatic 18, cognitive 25) with 95% coverage
- Unified score: ~2.5 (Minimal priority due to coverage dampening)
- Legacy risk score: ~8 (Low risk)
- Falls into low-priority categories because good testing mitigates complexity
- This is the desired state for inherently complex business logic
Legacy Risk Calculation
Note: The legacy risk calculation is still supported for compatibility but has been superseded by the unified scoring system (see above). Unified scoring provides better prioritization through its multi-factor, weighted approach with role-based adjustments.
The legacy risk score uses a simpler additive formula:
/// Legacy additive risk formula (the docs' pseudocode, made concrete).
fn legacy_risk_score(cyclomatic: f64, cognitive: f64, coverage: f64, debt_score: f64) -> f64 {
    let complexity_factor = (cyclomatic / 5.0) + (cognitive / 10.0);
    let coverage_factor = (1.0 - coverage) * 50.0;
    let debt_factor = debt_score / 10.0; // only if debt data is available
    complexity_factor + coverage_factor + debt_factor
}
Note on debt_score: The debt_score comes from DebtAggregator which combines multiple debt dimensions:
- Testing debt (unwrap calls, untested error paths)
- Resource debt (unclosed files, memory leaks)
- Duplication debt (code clones)
Source: src/priority/debt_aggregator.rs
Example (legacy scoring):
Function: process_payment
- Cyclomatic complexity: 18
- Cognitive complexity: 25
- Coverage: 20%
- Debt score: 15 (from DebtAggregator)
Calculation:
complexity_factor = (18 / 5) + (25 / 10) = 3.6 + 2.5 = 6.1
coverage_factor = (1 - 0.20) × 50 = 40
debt_factor = 15 / 10 = 1.5
risk_score = 6.1 + 40 + 1.5 = 47.6 (HIGH RISK)
When to use legacy scoring:
- Comparing with historical data from older Debtmap versions
- Teams with existing workflows built around the old scale
- Gradual migration to unified scoring
Why unified scoring is better:
- Normalized 0-10 scale is more intuitive
- Dynamic weights adjust based on coverage data availability
- Role multipliers adjust priority based on function importance
- Coverage propagation reduces false positives for utility functions
- Purity adjustments reward functional programming patterns
Test Effort Assessment
Debtmap estimates testing difficulty based on complexity metrics using an advanced effort model.
Source: src/risk/roi/effort.rs (AdvancedEffortModel)
How Effort is Calculated
Test effort estimation involves two components:
1. Test case count: Estimated from cyclomatic complexity (branch coverage)
   - Each branch represents a code path that needs testing
   - Formula approximates test cases needed for comprehensive branch coverage
2. Time estimate: Calculated from cognitive complexity (comprehension difficulty)
   - Higher cognitive complexity means more time to understand and write tests
   - Includes setup cost, assertion cost, and complexity multipliers
   - Optional learning system can adjust estimates based on historical data
Difficulty Levels:
- Trivial (cognitive < 5): 1-2 test cases, < 1 hour
- Simple (cognitive 5-10): 3-5 test cases, 1-2 hours
- Moderate (cognitive 10-20): 6-10 test cases, 2-4 hours
- Complex (cognitive 20-40): 11-20 test cases, 4-8 hours
- VeryComplex (cognitive > 40): 20+ test cases, 8+ hours
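As a rough sketch, this banding can be read as a mapping from cognitive complexity to a difficulty level. The half-open boundaries below are an assumption, since the documented ranges overlap at their endpoints:

enum Difficulty { Trivial, Simple, Moderate, Complex, VeryComplex }

// Boundary handling is an assumption; the documented bands overlap at endpoints.
fn difficulty(cognitive: u32) -> Difficulty {
    match cognitive {
        0..=4 => Difficulty::Trivial,
        5..=9 => Difficulty::Simple,
        10..=19 => Difficulty::Moderate,
        20..=40 => Difficulty::Complex,
        _ => Difficulty::VeryComplex,
    }
}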
Test Effort includes:
- Cognitive load: How hard to understand the function
- Branch count (cyclomatic): Number of paths to test
- Recommended test cases: Estimated from cyclomatic complexity
- Estimated hours: Derived from cognitive complexity with setup overhead
Risk Distribution
Debtmap provides codebase-wide risk metrics:
{
"risk_distribution": {
"critical_count": 12,
"high_count": 45,
"medium_count": 123,
"low_count": 456,
"minimal_count": 234,
"total_functions": 870
},
"codebase_risk_score": 1247.5
}
Interpreting distribution:
- Healthy codebase: Most functions in Low/Minimal priority (unified scoring) or Low/WellTested (legacy)
- Needs attention: Many Critical/High priority functions
- Technical debt: High codebase risk score
Legacy vs Unified Risk Distribution Fields
IMPORTANT: The field names differ between legacy and unified scoring systems:
| Unified Scoring (0-10 scale) | Legacy Scoring (RiskCategory enum) |
|---|---|
| minimal_count (0-2.9) | Not present |
| low_count (3.0-4.9) | low_count |
| medium_count (5.0-6.9) | medium_count |
| high_count (7.0-8.9) | high_count |
| critical_count (9.0-10.0) | critical_count |
| Not present | well_tested_count (legacy outcome) |
Sources:
- Unified priority tiers: src/priority/tiers/mod.rs
- Legacy RiskCategory enum: src/risk/mod.rs:36-42
Note on minimal_count:
In unified scoring (0-10 scale), minimal_count represents functions scoring 0-2.9, which includes:
- Simple utility functions with low complexity
- Helper functions with minimal risk
- Well-tested complex code that scores low due to coverage dampening
This is not a separate risk category but an outcome of the unified scoring system. Complex business logic with 95% test coverage appropriately receives a minimal score (0-2.9), reflecting that good testing mitigates complexity risk.
When using legacy scoring, there is NO minimal_count field. Instead, you’ll see well_tested_count which represents functions that are both complex and well-tested (the desired outcome).
Testing Recommendations
When coverage data is provided, Debtmap generates prioritized testing recommendations with ROI analysis.
Source: src/risk/roi/mod.rs:66-113
ROI Calculation
The ROI calculation is much richer than a simple risk/effort ratio. It includes cascade impacts, module multipliers, and complexity weighting:
// Source: src/risk/roi/mod.rs:66-113
ROI = ((Direct_Impact × Module_Multiplier) + (Cascade_Impact × Cascade_Weight))
      × Dependency_Factor × Complexity_Weight / Adjusted_Effort
Formula Components:
1. Direct Impact: Risk reduction from testing this function directly
2. Module Multiplier (based on module type):
   - EntryPoint = 2.0 (highest priority for user-facing code)
   - Core = 1.5 (domain logic)
   - Api = 1.2 (API endpoints)
   - Model = 1.1 (data models)
   - IO = 1.0 (baseline for I/O operations)
3. Cascade Impact: Risk reduction in dependent functions
   - Calculated using cascade analyzer
   - Cascade Weight: Configurable (default 0.5)
   - Max Cascade Depth: 3 hops (configurable)
4. Dependency Factor: Amplifies ROI based on number of dependents
   - Dependency_Factor = 1.0 + min(dependent_count × 0.1, 1.0)
   - Capped at 2.0× multiplier
   - Rewards testing functions with many dependents
5. Complexity Weight: Penalizes trivial delegation functions
   - (cyclomatic=1, cognitive=0-1): 0.1 (trivial delegation)
   - (cyclomatic=1, cognitive=2-3): 0.3 (very simple)
   - (cyclomatic=2-3, any): 0.5 (simple)
   - (cyclomatic=4-5, any): 0.7 (moderate)
   - Other: 1.0 (complex, full weight)
6. Adjusted Effort: Base effort adjusted by learning system (if enabled)
   - Learning system tracks historical test writing effort
   - Adjusts estimates based on actual time spent
ROI Scaling (for intuitive 0-10 scale):
- raw_roi > 20.0: 10.0 + ln(raw_roi - 20.0) (logarithmic dampening)
- 10.0 < raw_roi ≤ 20.0: 5.0 + (raw_roi - 20.0) × 0.5 (linear dampening)
- Otherwise: raw_roi (no scaling)
Sources:
- ROI model: src/risk/roi/models.rs:4-11
- Effort estimation: src/risk/roi/effort.rs
- Cascade impact: src/risk/roi/cascade.rs
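Putting the components together, the raw ROI computation can be sketched as follows. Parameter names are illustrative, not the actual API, and the 0-10 scaling above is applied to the returned value afterwards:

// Illustrative sketch of the documented formula, not the actual implementation.
fn raw_roi(
    direct_impact: f64, module_multiplier: f64,
    cascade_impact: f64, cascade_weight: f64,
    dependent_count: usize, complexity_weight: f64, adjusted_effort: f64,
) -> f64 {
    // Amplify by number of dependents, capped at a 2.0x multiplier
    let dependency_factor = 1.0 + (dependent_count as f64 * 0.1).min(1.0);
    ((direct_impact * module_multiplier) + (cascade_impact * cascade_weight))
        * dependency_factor * complexity_weight / adjusted_effort
}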
Example ROI Output
{
"function": "process_transaction",
"file": "src/payments.rs",
"line": 145,
"current_risk": 47.6,
"potential_risk_reduction": 35.2,
"test_effort_estimate": {
"estimated_difficulty": "Complex",
"cognitive_load": 25,
"branch_count": 18,
"recommended_test_cases": 12,
"estimated_hours": 6.5
},
"roi": 8.2,
"roi_breakdown": {
"direct_impact": 35.2,
"module_multiplier": 1.5,
"cascade_impact": 12.4,
"cascade_weight": 0.5,
"dependency_factor": 1.3,
"complexity_weight": 1.0,
"adjusted_effort": 6.5
},
"rationale": "High complexity with low coverage (20%) and 3 downstream dependencies. Testing will reduce risk by 74%. Cascade effect improves 8 dependent functions.",
"dependencies": {
"upstream_callers": ["handle_payment_request"],
"downstream_callees": ["validate_amount", "check_balance", "record_transaction"],
"dependent_count": 13
},
"confidence": 0.85
}
Interpreting ROI:
- ROI > 5.0: Excellent return on investment, prioritize highly
- ROI 3.0-5.0: Good return, address soon
- ROI 1.0-3.0: Moderate return, plan for future work
- ROI < 1.0: Low return, consider other priorities
Key Insight: The cascade impact calculation means that testing a critical utility function with many dependents can have higher ROI than testing a complex but isolated function. This helps identify “force multiplier” tests that improve coverage across multiple modules.
Interpreting Results
Understanding Output Formats
Debtmap provides four output formats:
Terminal (default): Human-readable with colors and tables
debtmap analyze .
JSON: Machine-readable for CI/CD integration
debtmap analyze . --format json --output report.json
Markdown: Documentation-friendly
debtmap analyze . --format markdown --output report.md
HTML: Interactive dashboard with visualizations
debtmap analyze . --format html --output report.html
Source: Output formats defined in src/io/writers/ (html.rs, markdown.rs, terminal.rs, json.rs)
JSON Structure
Debtmap uses a unified JSON format (spec 108) that provides consistent structure for all debt items. The output is generated by converting the internal UnifiedAnalysis to UnifiedOutput format (src/output/unified.rs).
Top-level JSON structure:
{
"format_version": "1.0",
"metadata": {
"debtmap_version": "0.11.0",
"generated_at": "2025-12-04T12:00:00Z",
"project_root": "/path/to/project",
"analysis_type": "full"
},
"summary": {
"total_items": 45,
"total_debt_score": 2847.3,
"debt_density": 113.89,
"total_loc": 25000,
"by_type": {
"File": 3,
"Function": 42
},
"by_category": {
"Architecture": 5,
"Testing": 23,
"Performance": 7,
"CodeQuality": 10
},
"score_distribution": {
"critical": 2, // score >= 100
"high": 8, // score >= 50
"medium": 18, // score >= 20
"low": 17 // score < 20
},
"cohesion": { // optional, spec 198
"average": 0.72, // codebase average (0.0-1.0)
"high_cohesion_files": 15, // >= 0.7
"medium_cohesion_files": 8, // 0.4-0.7
"low_cohesion_files": 3 // < 0.4
}
},
"items": [ /* array of UnifiedDebtItemOutput */ ]
}
Score Distribution Thresholds (Priority enum):
The score_distribution in JSON uses the Priority classification (src/output/unified/priority.rs:15-33):
| Priority | Score Threshold | Description |
|---|---|---|
| critical | score >= 100 | Immediate action required |
| high | score >= 50 | Address this sprint |
| medium | score >= 20 | Plan for next sprint |
| low | score < 20 | Background maintenance |
Note: This differs from the Tier system used for terminal display (see Tiered Prioritization).
Cohesion Summary (optional, spec 198):
When cohesion analysis is enabled, the summary includes codebase-wide cohesion statistics (src/output/unified/cohesion.rs:64-73):
- average: Mean cohesion score across all files (0.0-1.0, higher = better)
- high_cohesion_files: Files with cohesion >= 0.7 (well-organized)
- medium_cohesion_files: Files with cohesion 0.4-0.7 (acceptable)
- low_cohesion_files: Files with cohesion < 0.4 (candidates for refactoring)
Individual debt item structure (function-level):
{
"type": "Function",
"score": 87.3,
"category": "Testing",
"priority": "high",
"location": {
"file": "src/main.rs",
"line": 42,
"function": "process_data"
},
"metrics": {
"cyclomatic_complexity": 15,
"cognitive_complexity": 22,
"length": 68,
"nesting_depth": 4,
"coverage": 0.0,
"uncovered_lines": [42, 43, 44, 45, 46],
"entropy_score": 0.65
},
"debt_type": {
"ComplexityHotspot": {
"cyclomatic": 15,
"cognitive": 22,
"adjusted_cyclomatic": null
}
},
"function_role": "BusinessLogic",
"purity_analysis": {
"is_pure": false,
"confidence": 0.8,
"side_effects": ["file_io", "network"]
},
"dependencies": {
"upstream_count": 2,
"downstream_count": 3,
"upstream_callers": ["main", "process_request"],
"downstream_callees": ["validate", "save", "notify"]
},
"recommendation": {
"action": "Add test coverage for complex function",
"priority": "High",
"implementation_steps": [
"Write unit tests covering all 15 branches",
"Consider extracting validation logic to reduce complexity"
]
},
"impact": {
"coverage_improvement": 0.85,
"complexity_reduction": 12.0,
"risk_reduction": 3.7
},
"adjusted_complexity": {
"dampened_cyclomatic": 12.5,
"dampening_factor": 0.83
},
"complexity_pattern": "validation",
"pattern_type": "state_machine",
"pattern_confidence": 0.72,
"context": { // optional, spec 263
"primary": {
"file": "src/main.rs",
"start_line": 35,
"end_line": 110,
"symbol": "process_data"
},
"related": [
{
"range": {
"file": "src/validation.rs",
"start_line": 10,
"end_line": 45
},
"relationship": "calls",
"reason": "Validation helper called by process_data"
}
],
"total_lines": 120,
"completeness_confidence": 0.85
},
"git_history": { // optional, spec 264
"change_frequency": 2.5, // changes per month
"bug_density": 0.15, // ratio of bug fixes to total commits
"age_days": 180, // code age
"author_count": 3, // unique contributors
"stability": "Frequently Changed" // stability classification
}
}
Source: JSON structure defined in src/output/unified.rs:18-24 (UnifiedOutput), lines 66-71 (UnifiedDebtItemOutput enum), lines 156-183 (FunctionDebtItemOutput)
Note: The "type" field uses a tagged enum format where the value is either "Function" or "File". All items have consistent top-level fields (score, category, priority, location) regardless of type, simplifying filtering and sorting across mixed results.
Reading Function Metrics
Key fields in metrics object (src/output/unified/func_item.rs:340-373):
- cyclomatic_complexity: Decision points - guides test case count
- cognitive_complexity: Understanding difficulty - guides refactoring priority
- length: Lines of code - signals SRP violations
- nesting_depth: Indentation depth - signals need for extraction
- coverage: Test coverage percentage (0.0-1.0, optional if no coverage data)
- uncovered_lines: Array of line numbers not covered by tests (optional)
- entropy_score: Pattern analysis score for false positive reduction (0.0-1.0, optional)
- pattern_repetition: Repetition score (0.0-1.0, higher = more repetitive code) - optional, spec 264
- branch_similarity: Branch similarity score (0.0-1.0, higher = similar branches) - optional, spec 264
- entropy_adjusted_cognitive: Cognitive complexity adjusted for entropy patterns - optional, spec 264
- transitive_coverage: Coverage propagated from calling functions (0.0-1.0) - optional, spec 264
Fields at item level:
- function_role: Function importance ("EntryPoint", "BusinessLogic", "Utility", "TestHelper") - affects score multiplier (src/priority/mod.rs:231)
- purity_analysis: Whether function has side effects (optional) - affects testability assessment
  - is_pure: Boolean indicating no side effects
  - confidence: How certain (0.0-1.0)
  - side_effects: Array of detected side effect types (e.g., "file_io", "network", "mutation")
- dependencies: Call graph relationships
  - upstream_count: Number of functions that call this one (impact radius)
  - downstream_count: Number of functions this calls (complexity)
  - upstream_callers: Names of calling functions (optional, included if count > 0)
  - downstream_callees: Names of called functions (optional, included if count > 0)
Complexity adjustment fields:
- adjusted_complexity: Entropy-based dampening applied (optional, included when dampening factor < 1.0)
  - dampened_cyclomatic: Reduced complexity after pattern analysis
  - dampening_factor: Multiplier applied (e.g., 0.83 = 17% reduction)
- complexity_pattern: Detected pattern name (e.g., "validation", "dispatch", "error_handling")
- pattern_type: Pattern category (e.g., "state_machine", "coordinator") - high-level classification
- pattern_confidence: Confidence in pattern detection (0.0-1.0, shown if ≥ 0.5)
AI Agent Context fields (spec 263, src/output/unified/func_item.rs:78-83):
- context: AI agent context window suggestion (optional) - helps LLMs understand code
  - primary: Main code range containing the function
    - file: Source file path
    - start_line, end_line: Code range
    - symbol: Function/method name (optional)
  - related: Array of related code ranges (callees, callers, types)
    - range: File range with path and line numbers
    - relationship: How it relates ("calls", "called_by", "uses_type", etc.)
    - reason: Human-readable explanation
  - total_lines: Estimated total lines for AI to read
  - completeness_confidence: Confidence that context is sufficient (0.0-1.0)
Git History fields (spec 264, src/output/unified/func_item.rs:476-520):
- git_history: Code stability analysis from git history (optional)
  - change_frequency: Average changes per month (higher = more active)
  - bug_density: Ratio of bug-fix commits to total commits (0.0-1.0)
  - age_days: Days since code was first introduced
  - author_count: Number of unique contributors
  - stability: Classification based on patterns:
    - "Highly Unstable": change_frequency > 5.0 and bug_density > 0.3
    - "Frequently Changed": change_frequency > 2.0
    - "Bug Prone": bug_density > 0.2
    - "Stable": Low change frequency and bug density
    - "Stale": age_days > 365 with minimal changes
Pattern detection and complexity adjustment:
Debtmap detects common code patterns that appear complex but are actually maintainable (src/complexity/entropy_analysis.rs). When detected with high confidence (≥ 0.5), complexity metrics are dampened:
Validation patterns - Repetitive input checking:
// Appears complex (cyclomatic = 15) but is repetitive
if field1.is_empty() { return Err(...) }
if field2.is_empty() { return Err(...) }
// ... repeated for many fields
Dampening: ~20-40% reduction if similarity is high
State machine patterns - Structured state transitions:
match state {
    State::Init => { /* transition logic */ }
    State::Running => { /* transition logic */ }
    // ... many similar cases
}
Dampening: ~30-50% reduction for consistent transition logic
Error handling patterns - Systematic error checking:
let x = operation1().map_err(|e| /* wrap error */)?;
let y = operation2().map_err(|e| /* wrap error */)?;
// ... repeated error wrapping
Dampening: ~15-30% reduction for consistent error propagation
When to trust adjusted_complexity:
- pattern_confidence ≥ 0.7: High confidence, use dampened_cyclomatic for priority decisions
- pattern_confidence 0.5-0.7: Moderate confidence, consider both original and dampened values
- pattern_confidence < 0.5 or missing: Use original cyclomatic_complexity
Entropy score interpretation:
- entropy_score ≥ 0.7: High entropy, genuinely complex code - prioritize refactoring
- entropy_score 0.4-0.7: Moderate entropy, some repetition - review manually
- entropy_score < 0.4: Low entropy, highly repetitive pattern - likely false positive if flagged complex
Prioritizing Work
Debtmap provides multiple prioritization strategies, with unified scoring (0-10 scale) as the recommended default for most workflows:
1. By Unified Score (default - recommended)
debtmap analyze . --top 10
Shows top 10 items by combined complexity, coverage, and dependency factors, weighted and adjusted by function role.
Why use unified scoring:
- Balances complexity (40%), coverage (40%), and dependency impact (20%)
- Adjusts for function importance (entry points prioritized over utilities)
- Normalized 0-10 scale is intuitive and consistent
- Reduces false positives through coverage propagation
- Best for sprint planning and function-level refactoring decisions
Example:
# Show the top 20 items at High priority or above (score >= 7.0)
debtmap analyze . --min-priority 7.0 --top 20
# Focus on high-impact functions (score >= 7.0)
debtmap analyze . --format json | jq '.functions[] | select(.unified_score >= 7.0)'
2. By Risk Category (legacy compatibility)
debtmap analyze . --min-priority high
Shows only HIGH and CRITICAL priority items using legacy risk scoring.
Note: Legacy risk scoring uses additive formulas and unbounded scales. Prefer unified scoring for new workflows.
3. By Debt Type
debtmap analyze . --filter Architecture,Testing
Focuses on specific categories:
- Architecture: God objects, complexity, dead code
- Testing: Coverage gaps, test quality
- Performance: Resource leaks, inefficiencies
- CodeQuality: Code smells, maintainability
4. By ROI (with coverage)
debtmap analyze . --lcov coverage.lcov --top 20
Prioritizes by return on investment for testing/refactoring. Combines unified scoring with test effort estimates to identify high-value work.
Choosing the right strategy:
- Sprint planning for developers: Use unified scoring (--top N)
- Architectural review: Use tiered prioritization (--summary)
- Category-focused work: Use debt type filtering (--filter)
- Testing priorities: Use ROI analysis with coverage data (--lcov)
- Historical comparisons: Use legacy risk scoring (for consistency with old reports)
Tiered Prioritization
Note: Tiered prioritization uses traditional debt scoring (additive, higher = worse) and is complementary to the unified scoring system (0-10 scale). Both systems can be used together:
- Unified scoring (0-10 scale): Best for function-level prioritization and sprint planning
- Tiered prioritization (debt tiers): Best for architectural focus and strategic debt planning
Use --summary for tiered view focusing on architectural issues, or default output for function-level unified scores.
Debtmap uses a tier-based system to map debt scores to actionable priority levels. Each tier includes effort estimates and strategic guidance for efficient debt remediation.
Tier vs Priority: Two Classification Systems
Important: Debtmap uses two different threshold systems for classifying debt items:
| Classification | Used For | Critical | High | Medium/Moderate | Low |
|---|---|---|---|---|---|
| Tier (src/priority/mod.rs:372-386) | Terminal display, grouping | score >= 90 | score 70-89.9 | score 50-69.9 | score < 50 |
| Priority (src/output/unified/priority.rs:15-33) | JSON priority field, score_distribution | score >= 100 | score >= 50 | score >= 20 | score < 20 |
Why two systems?
- Tier is designed for terminal display and human-readable grouping. It uses tighter thresholds (90/70/50) to create more balanced visual categories when browsing results interactively.
- Priority is designed for JSON output and CI/CD integration. It uses wider thresholds (100/50/20) that align with typical CI gate requirements and allow for more granular filtering in automated pipelines.
Practical implications:
- A function with score 85 appears as HIGH tier in terminal output and has "priority": "high" in JSON (both treat it as high priority)
- A function with score 55 appears as MODERATE tier in terminal but has "priority": "high" in JSON (Priority uses the >= 50 threshold)
- The score_distribution in the JSON summary counts items using Priority thresholds, not Tier thresholds
When filtering:
# Terminal tier filtering (uses Tier thresholds)
debtmap analyze . --min-priority high # Shows score >= 70
# JSON priority field (uses Priority thresholds)
debtmap analyze . --format json | jq '.items[] | select(.priority == "high")' # Shows score >= 50
Tier Levels
The Tier enum defines four priority levels based on score thresholds (src/priority/mod.rs:372-386):
pub enum Tier {
    Critical, // Score >= 90
    High,     // Score 70-89.9
    Moderate, // Score 50-69.9
    Low,      // Score < 50
}
Score-to-Tier Mapping:
- Critical (>= 90): Immediate action required - blocks progress
- High (70-89.9): Should be addressed this sprint
- Moderate (50-69.9): Plan for next sprint
- Low (< 50): Background maintenance work
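Written out as simple classifiers, the two documented threshold systems are easy to compare side by side. This is a sketch of the mappings, not the actual enum methods:

// Tier thresholds (terminal display): 90 / 70 / 50
fn tier_for(score: f64) -> Tier {
    if score >= 90.0 { Tier::Critical }
    else if score >= 70.0 { Tier::High }
    else if score >= 50.0 { Tier::Moderate }
    else { Tier::Low }
}

// Priority thresholds (JSON `priority` field): 100 / 50 / 20
fn priority_for(score: f64) -> &'static str {
    if score >= 100.0 { "critical" }
    else if score >= 50.0 { "high" }
    else if score >= 20.0 { "medium" }
    else { "low" }
}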
Effort Estimates Per Tier
Each tier includes estimated effort based on typical remediation patterns:
| Tier | Estimated Effort | Typical Work |
|---|---|---|
| Critical | 1-2 days | Major refactoring, comprehensive testing, architectural changes |
| High | 2-4 hours | Extract functions, add test coverage, fix resource leaks |
| Moderate | 1-2 hours | Simplify logic, reduce duplication, improve error handling |
| Low | 30 minutes | Address TODOs, minor cleanup, documentation |
Effort calculation considers:
- Complexity metrics (cyclomatic, cognitive)
- Test coverage gaps
- Number of dependencies (upstream/downstream)
- Debt category (Architecture debt takes longer than CodeQuality)
Tiered Display Grouping
Note: TieredDisplay is an internal structure (src/priority/mod.rs:416-422) used for terminal output formatting and is not serialized to JSON. JSON output includes individual items with their priority field (critical, high, medium, low) based on Priority thresholds (100/50/20), not Tier thresholds (90/70/50). See Tier vs Priority above.
The internal TieredDisplay structure groups similar debt items for batch action recommendations in terminal output:
Grouping strategy:
- Groups items by tier (Critical/High/Moderate/Low) and similarity pattern
- Prevents grouping of god objects (always show individually)
- Prevents grouping of Critical items (each needs individual attention)
- Suggests batch actions for similar Low/Moderate items in terminal output
To view items by priority in JSON:
# Get all Critical items (priority: "critical", score >= 100)
debtmap analyze . --format json | jq '.items[] | select(.priority == "critical")'
# Get High priority items (score >= 50)
debtmap analyze . --format json | jq '.items[] | select(.priority == "high")'
# Count items by priority (uses Priority thresholds: 100/50/20)
debtmap analyze . --format json | jq '.summary.score_distribution'
The score_distribution in the summary provides counts using Priority thresholds (critical >= 100, high >= 50, medium >= 20, low < 20).
Using Tiered Prioritization
1. Start with Critical tier:
debtmap analyze . --min-priority critical
Focus on items with score ≥ 90. These typically represent:
- Complex functions with 0% coverage
- God objects blocking feature development
- Critical resource leaks or security issues
2. Plan High tier work:
debtmap analyze . --min-priority high --format json > sprint-plan.json
Schedule 2-4 hours per item for this sprint. Look for:
- Functions approaching complexity thresholds
- Moderate coverage gaps on important code paths
- Performance bottlenecks with clear solutions
3. Batch Moderate tier items:
debtmap analyze . --min-priority moderate
Review batch recommendations. Examples:
- “10 validation functions detected - extract common pattern”
- “5 similar test files with duplication - create shared fixtures”
- “8 functions with magic values - create constants module”
4. Schedule Low tier background work: Address during slack time or as warm-up tasks for new contributors.
Strategic Guidance by Tier
Critical Tier Strategy:
- Block new features until addressed
- Pair programming recommended for complex items
- Architectural review before major refactoring
- Comprehensive testing after changes
High Tier Strategy:
- Sprint planning priority
- Impact analysis before changes
- Code review from senior developers
- Integration testing after changes
Moderate Tier Strategy:
- Batch similar items for efficiency
- Extract patterns across multiple files
- Incremental improvement over multiple PRs
- Regression testing for affected areas
Low Tier Strategy:
- Good first issues for new contributors
- Documentation improvements
- Code cleanup during refactoring nearby code
- Technical debt gardening sessions
Categorized Debt Analysis
Note: CategorizedDebt is an internal analysis structure (src/priority/mod.rs:419) used for query operations and markdown formatting. It is not serialized to JSON output. The JSON output uses by_category in the summary section instead (see JSON Structure above).
Debtmap’s internal CategorizedDebt analysis groups debt items by category and identifies cross-category dependencies. This analysis powers the markdown output and internal query methods but is not directly exposed in JSON format.
CategorySummary
Each category gets a summary with metrics for planning:
pub struct CategorySummary {
    pub category: DebtCategory,
    pub total_score: f64,
    pub item_count: usize,
    pub estimated_effort_hours: f64,
    pub average_severity: f64,
    pub top_items: Vec<DebtItem>, // Up to 5 highest priority
}
Effort estimation formulas:
- Architecture debt: complexity_score / 10 × 2 hours (structural changes take longer)
- Testing debt: complexity_score / 10 × 1.5 hours (writing tests)
- Performance debt: complexity_score / 10 × 1.8 hours (profiling + optimization)
- CodeQuality debt: complexity_score / 10 × 1.2 hours (refactoring)
Example category summary:
{
"category": "Architecture",
"total_score": 487.5,
"item_count": 15,
"estimated_effort_hours": 97.5,
"average_severity": 32.5,
"top_items": [
{
"debt_type": "GodObject",
"file": "src/services/user_service.rs",
"score": 95.0,
"estimated_effort_hours": 16.0
},
{
"debt_type": "ComplexityHotspot",
"file": "src/payments/processor.rs",
"score": 87.3,
"estimated_effort_hours": 14.0
}
]
}
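The effort figures follow directly from the per-category formulas; a minimal sketch, where the string-based category parameter stands in for the DebtCategory enum:

// Sketch of the per-category effort formulas above.
fn estimated_effort_hours(category: &str, complexity_score: f64) -> f64 {
    let multiplier = match category {
        "Architecture" => 2.0, // structural changes take longer
        "Performance" => 1.8,  // profiling + optimization
        "Testing" => 1.5,      // writing tests
        _ => 1.2,              // CodeQuality refactoring
    };
    (complexity_score / 10.0) * multiplier
}
// estimated_effort_hours("Architecture", 487.5) == 97.5, matching the summary above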
Cross-Category Dependencies
CrossCategoryDependency identifies blocking relationships between different debt categories:
pub struct CrossCategoryDependency {
    pub from_category: DebtCategory,
    pub to_category: DebtCategory,
    pub blocking_items: Vec<(DebtItem, DebtItem)>,
    pub impact_level: ImpactLevel, // Critical, High, Medium, Low
    pub recommendation: String,
}
Common dependency patterns:
1. Architecture blocks Testing:
- Pattern: God objects are too complex to test effectively
- Example: UserService has 50+ functions, making comprehensive testing impractical
- Recommendation: “Split god object into 4-5 focused modules before adding tests”
2. Async issues require Architecture changes:
- Pattern: Blocking I/O in async contexts requires architectural redesign
- Example: Sync database calls in async handlers
- Impact: High - performance problems require design changes
- Recommendation: “Introduce async database layer before optimizing handlers”
3. Complexity affects Testability:
- Pattern: High cyclomatic complexity makes thorough testing difficult
- Example: Function with 22 branches needs 22+ test cases
- Impact: High - testing effort grows exponentially with complexity
- Recommendation: “Reduce complexity to < 10 before writing comprehensive tests”
4. Performance requires Architecture:
- Pattern: O(n²) nested loops need different data structures
- Example: Linear search in loops should use HashMap
- Impact: Medium - optimization requires structural changes
- Recommendation: “Refactor data structure before micro-optimizations”
Example cross-category dependency:
{
"from_category": "Architecture",
"to_category": "Testing",
"impact_level": "Critical",
"blocking_items": [
{
"blocker": {
"debt_type": "GodObject",
"file": "src/services/user_service.rs",
"functions": 52,
"score": 95.0
},
"blocked": {
"debt_type": "TestingGap",
"file": "src/services/user_service.rs",
"coverage": 15,
"score": 78.0
}
}
],
"recommendation": "Split UserService into focused modules (auth, profile, settings, notifications) before attempting to improve test coverage. Current structure makes comprehensive testing impractical.",
"estimated_unblock_effort_hours": 16.0
}
Using Category-Based Analysis
View category distribution in JSON:
debtmap analyze . --format json | jq '.summary.by_category'
This shows item counts per category (Architecture, Testing, Performance, CodeQuality).
Filter items by category:
debtmap analyze . --format json | jq '.items[] | select(.category == "Architecture")'
Focus on specific category with CLI:
debtmap analyze . --filter Architecture --top 10
Generate markdown for detailed category analysis:
debtmap analyze . --format markdown --output report.md
The markdown format includes full CategorySummary details with effort estimates and cross-category dependency analysis.
Strategic planning workflow:
1. Review category summaries:
   - Identify which category has highest total score
   - Check estimated effort hours per category
   - Note average severity to gauge urgency
2. Check cross-category dependencies:
   - Find Critical and High impact blockers
   - Prioritize blockers before blocked items
   - Plan architectural changes before optimization
3. Plan remediation order. Example decision tree:
   - Architecture score > 400? → Address god objects first
   - Testing gap with low complexity? → Quick wins, add tests
   - Performance issues + architecture debt? → Refactor structure first
   - High code quality debt but good architecture? → Incremental cleanup
4. Use category-specific strategies:
   - Architecture: Pair programming, design reviews, incremental refactoring
   - Testing: TDD for new code, characterization tests for legacy
   - Performance: Profiling first, optimize hot paths, avoid premature optimization
   - CodeQuality: Code review focus, linting rules, consistent patterns
Note on output formats: CategorySummary and CrossCategoryDependency details are available in markdown format only. The JSON output provides category counts in summary.by_category and you can filter items by category using the category field on each item.
Debt Density Metric
Debt density normalizes technical debt scores across projects of different sizes, providing a per-1000-lines-of-code metric for fair comparison.
Formula
debt_density = (total_debt_score / total_lines_of_code) × 1000
Example calculation:
Project A:
- Total debt score: 1,250
- Total lines of code: 25,000
- Debt density: (1,250 / 25,000) × 1000 = 50
Project B:
- Total debt score: 2,500
- Total lines of code: 50,000
- Debt density: (2,500 / 50,000) × 1000 = 50
Projects A and B have equal debt density (50) despite B having twice the absolute debt, because B is also twice as large. They have proportionally similar technical debt.
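The formula itself is a one-liner; a minimal sketch with a guard for empty projects:

// Debt density per 1000 lines of code, as defined above.
fn debt_density(total_debt_score: f64, total_loc: u64) -> f64 {
    if total_loc == 0 {
        return 0.0; // avoid division by zero on empty projects
    }
    (total_debt_score / total_loc as f64) * 1000.0
}
// debt_density(1_250.0, 25_000) == 50.0 (Project A above)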
Interpretation Guidelines
Use these thresholds to assess codebase health:
| Debt Density | Assessment | Description |
|---|---|---|
| 0-50 | Clean | Well-maintained codebase, minimal debt |
| 51-100 | Moderate | Typical technical debt, manageable |
| 101-150 | High | Significant debt, prioritize remediation |
| 150+ | Critical | Severe debt burden, may impede development |
Context matters:
- Early-stage projects: Often have higher density (rapid iteration)
- Mature projects: Should trend toward lower density over time
- Legacy systems: May have high density, track trend over time
- Greenfield rewrites: Aim for density < 50
Using Debt Density
1. Compare projects fairly:
# Small microservice (5,000 LOC, debt = 250)
# Debt density: 50
# Large monolith (100,000 LOC, debt = 5,000)
# Debt density: 50
# Equal health despite size difference
2. Track improvement over time:
Sprint 1: 50,000 LOC, debt = 7,500, density = 150 (High)
Sprint 5: 52,000 LOC, debt = 6,500, density = 125 (Improving)
Sprint 10: 54,000 LOC, debt = 4,860, density = 90 (Moderate)
3. Set team goals:
Current density: 120
Target density: < 80 (by Q4)
Reduction needed: 40 points
Strategy:
- Fix 2-3 Critical items per sprint
- Prevent new debt (enforce thresholds)
- Refactor before adding features in high-debt modules
4. Benchmark across teams/projects:
{
"team_metrics": [
{
"project": "auth-service",
"debt_density": 45,
"assessment": "Clean",
"trend": "stable"
},
{
"project": "billing-service",
"debt_density": 95,
"assessment": "Moderate",
"trend": "improving"
},
{
"project": "legacy-api",
"debt_density": 165,
"assessment": "Critical",
"trend": "worsening"
}
]
}
Limitations
Debt density doesn’t account for:
- Code importance: 100 LOC in payment logic ≠ 100 LOC in logging utils
- Complexity distribution: One 1000-line god object vs. 1000 simple functions
- Test coverage: 50% coverage on critical paths vs. low-priority features
- Team familiarity: New codebase vs. well-understood legacy system
Best practices:
- Use density as one metric among many
- Combine with category analysis and tiered prioritization
- Focus on trend (improving/stable/worsening) over absolute number
- Consider debt per module for more granular insights
Debt Density in CI/CD
Track density over time:
# Generate report with density
debtmap analyze . --format json --output debt-report.json
# Extract density for trending
DENSITY=$(jq '.summary.debt_density' debt-report.json)
# Store in metrics database
echo "debtmap.density:${DENSITY}|g" | nc -u -w0 statsd 8125
Set threshold gates:
# .github/workflows/debt-check.yml
- name: Check debt density
run: |
DENSITY=$(debtmap analyze . --format json | jq '.summary.debt_density')
if (( $(echo "$DENSITY > 150" | bc -l) )); then
echo "❌ Debt density too high: $DENSITY (limit: 150)"
exit 1
fi
echo "✅ Debt density acceptable: $DENSITY"
Actionable Insights
Each recommendation includes:
ACTION: What to do
- “Add 6 unit tests for full coverage”
- “Refactor into 3 smaller functions”
- “Extract validation to separate function”
IMPACT: Expected improvement
- “Full test coverage, -3.7 risk”
- “Reduce complexity from 22 to 8”
- “Eliminate 120 lines of duplication”
WHY: Rationale
- “Business logic with 0% coverage, manageable complexity”
- “High complexity with low coverage threatens stability”
- “Repeated validation pattern across 5 files”
Example workflow:
1. Run analysis with coverage: debtmap analyze . --lcov coverage.lcov
2. Filter to CRITICAL items: --min-priority critical
3. Review top 5 recommendations
4. Start with highest ROI items
5. Rerun analysis to track progress
Common Patterns to Recognize
Pattern 1: High Complexity, Well Tested
Complexity: 25, Coverage: 95%, Risk: LOW
This is actually good! Complex but thoroughly tested code. Learn from this approach.
Pattern 2: Moderate Complexity, No Tests
Complexity: 12, Coverage: 0%, Risk: CRITICAL
Highest priority - manageable complexity, should be easy to test.
Pattern 3: Low Complexity, No Tests
Complexity: 3, Coverage: 0%, Risk: LOW
Low priority - simple code, less risky without tests.
Pattern 4: Repetitive High Complexity (Dampened)
Cyclomatic: 20, Effective: 7 (65% dampened), Risk: LOW
Validation or dispatch pattern - looks complex but is repetitive. Lower priority.
Pattern 5: God Object
File: services.rs, Functions: 50+, Responsibilities: 15+
Architectural issue - split before adding features.
See Also
- Complexity Metrics - Detailed explanation of cyclomatic, cognitive, and entropy metrics
- Risk Scoring - How debt scores and risk assessments are calculated
- CLI Reference - Complete command-line options for output formats and filtering
- Output Formats - Detailed documentation of JSON, Markdown, and HTML output
Analyzer Types
Overview
Debtmap is a Rust-only code analysis tool. As of specification 191, debtmap focuses exclusively on Rust codebases to provide deep, language-specific insights into code complexity, technical debt, and architectural patterns.
While the architecture supports extensibility through the Analyzer trait, only Rust is actively supported and maintained. Files in other programming languages are automatically filtered during discovery and never reach the analysis phase.
Source: Language enum at src/core/mod.rs:416-421, AnalyzerFactory at src/core/injection.rs:190-203
Rust Analyzer
Debtmap provides comprehensive analysis for Rust codebases using the syn crate for native AST parsing.
Core Capabilities
The Rust analyzer (src/analyzers/rust.rs) provides:
- Complexity Metrics: Cyclomatic complexity, cognitive complexity, and entropy analysis
- Purity Detection: Identifies pure functions with confidence scoring
- Call Graph Analysis: Tracks upstream callers and downstream callees with transitive relationships
- Trait Implementation Tracking: Monitors trait implementations across the codebase
- Macro Expansion Support: Analyzes complexity within macros accurately
- Pattern-Based Adjustments: Recognizes and adjusts for code generation patterns
- Visibility Tracking: Distinguishes pub, pub(crate), and private functions
- Test Module Detection: Identifies #[cfg(test)] modules and #[test] functions
Source: Capabilities verified in src/analyzers/rust.rs:1-100
Semantic Function Classification
The Rust analyzer automatically classifies functions by their role in the system. This classification feeds into the unified scoring system’s role multiplier for accurate technical debt assessment.
Classification Categories (src/analyzers/rust.rs):
- Entry Points: Functions named main, start, or public functions in bin/ modules
- Data Access: Functions performing database queries, file I/O, or network operations
- Infrastructure: Logging, configuration, monitoring, and error handling utilities
- Utilities: Helper functions, formatters, type converters, and validation functions
- Test Code: Functions in
#[cfg(test)]modules or marked with#[test]attribute
These classifications are used to calculate role-based priority multipliers in the risk scoring system. See Risk Scoring for details on how semantic classification affects debt prioritization.
Language Support
Supported: Rust Only
Debtmap exclusively analyzes Rust source files (.rs extension). All analysis features, metrics, and debt detection patterns are designed specifically for Rust’s syntax and semantics.
Language Detection (src/core/mod.rs:435-440):
pub fn from_path(path: &std::path::Path) -> Self {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(Self::from_extension)
        .unwrap_or(Language::Unknown)
}
The Language enum (src/core/mod.rs:416-421) includes Rust, Python, and Unknown variants, but only Rust is actively processed:
pub enum Language {
    Rust,
    Python, // Architectural placeholder, not supported
    Unknown,
}
File Filtering Behavior
During file discovery, debtmap filters files by extension:
- Rust files (.rs): Parsed and analyzed
- All other files: Silently filtered out (no warnings or errors generated)
- Unknown extensions: Mapped to Language::Unknown and filtered during discovery
Source: Language detection implemented in src/core/mod.rs:423-440
Example Usage:
# Analyze all Rust files in current directory
debtmap analyze .
# Analyze specific Rust file
debtmap analyze src/main.rs
# Python, JavaScript, and other files are ignored
# (no error messages, just skipped)
Extensibility
While debtmap currently focuses on Rust-only analysis, the architecture is designed to support additional languages in the future through the Analyzer trait.
Analyzer Trait
The core Analyzer trait defines the interface for language-specific analyzers (src/analyzers/mod.rs:40-44):
pub trait Analyzer: Send + Sync {
    fn parse(&self, content: &str, path: std::path::PathBuf) -> Result<Ast>;
    fn analyze(&self, ast: &Ast) -> FileMetrics;
    fn language(&self) -> crate::core::Language;
}
Note: There is also a generic Analyzer trait with associated types in src/core/traits.rs:11-16, used for internal abstractions. The trait shown above is the public extension point for language analyzers.
Current Implementation
The AnalyzerFactory (src/core/injection.rs:190-203) creates language-specific analyzers:
impl AnalyzerFactory {
    pub fn create_analyzer(&self, language: Language) -> Box<dyn Analyzer<...>> {
        match language {
            Language::Rust => Box::new(RustAnalyzerAdapter::new()),
            Language::Python => {
                panic!(
                    "Python analysis is not currently supported. \
                     Debtmap is focusing exclusively on Rust analysis."
                )
            }
            // Unknown languages are filtered during discovery (see above)
            // and never reach the factory.
            Language::Unknown => unreachable!("filtered during discovery"),
        }
    }
}
Adding Language Support (Future)
To add support for a new language (see the sketch after these steps):
1. Implement the Analyzer trait with language-specific parsing and analysis
2. Add the language variant to the Language enum (src/core/mod.rs:416-421)
3. Update from_extension() to recognize the file extension (src/core/mod.rs:423-433)
4. Register in AnalyzerFactory to instantiate your analyzer (src/core/injection.rs:190-203)
Reference Implementation: See src/analyzers/rust.rs for a complete example of implementing the Analyzer trait with full complexity analysis, purity detection, and call graph support.
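For illustration, a hypothetical analyzer for a new language might be wired up as below. GoAnalyzer and the Language::Go variant are invented for this sketch and do not exist in debtmap today:

struct GoAnalyzer;

impl Analyzer for GoAnalyzer {
    fn parse(&self, content: &str, path: std::path::PathBuf) -> Result<Ast> {
        // Step 1: parse `content` with a language-specific parser and
        // wrap the result in debtmap's Ast type.
        todo!("language-specific parsing")
    }

    fn analyze(&self, ast: &Ast) -> FileMetrics {
        // Extract complexity, purity, and call graph metrics from the AST.
        todo!("metric extraction")
    }

    fn language(&self) -> crate::core::Language {
        // Step 2: requires a new Language variant (hypothetical here).
        todo!("Language::Go")
    }
}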
See Also
- Overview - Analysis pipeline and workflow
- Complexity Metrics - Detailed metric calculations
- Risk Scoring - How semantic classification affects prioritization
Advanced Features
This section covers Debtmap’s advanced analysis capabilities: purity detection, data flow analysis, entropy-based complexity, and context-aware analysis.
Purity Detection
Debtmap detects pure functions - those without side effects that always return the same output for the same input.
What makes a function pure:
- No I/O operations (file, network, database)
- No mutable global state
- No random number generation
- No system calls
- Deterministic output
Purity detection is optional:
- Both is_pure and purity_confidence are Option types
- May be None for some functions or languages where detection is not available
The PurityLevel enum (src/core/mod.rs:49-62) provides more nuanced classification than the binary is_pure:
- StrictlyPure: No mutations whatsoever - pure mathematical functions
- LocallyPure: Uses local mutations for efficiency but no external side effects (builder patterns, accumulators, owned
mut self) - ReadOnly: Reads external state but doesn’t modify it (constants,
&selfmethods) - Impure: Modifies external state or performs I/O (
&mut self, statics, I/O)
This four-level classification enables better scoring for functions that use local mutations for efficiency but are functionally pure (referentially transparent). See Complexity Metrics for how purity affects scoring.
Confidence scoring (when available):
- 0.9-1.0: Very confident (no side effects detected)
- 0.7-0.8: Likely pure (minimal suspicious patterns)
- 0.5-0.6: Uncertain (some suspicious patterns)
- 0.0-0.4: Likely impure (side effects detected)
Example:
// Pure: confidence = 0.95
fn calculate_total(items: &[Item]) -> f64 {
    items.iter().map(|i| i.price).sum()
}

// Impure: confidence = 0.1 (I/O detected)
fn save_total(items: &[Item]) -> Result<()> {
    let total: f64 = items.iter().map(|i| i.price).sum(); // annotation needed for sum()
    write_to_file(total) // Side effect!
}
Benefits:
- Pure functions are easier to test
- Can be safely cached or memoized
- Safe to parallelize
- Easier to reason about
Data Flow Analysis
Debtmap builds a comprehensive DataFlowGraph that extends basic call graph analysis with variable dependencies, data transformations, I/O operations, and purity tracking.
Call Graph Foundation
Upstream callers - Who calls this function
- Indicates impact radius
- More callers = higher impact if it breaks
Downstream callees - What this function calls
- Indicates dependencies
- More callees = more integration testing needed
Example:
{
"name": "process_payment",
"upstream_callers": [
"handle_checkout",
"process_subscription",
"handle_refund"
],
"downstream_callees": [
"validate_payment_method",
"calculate_fees",
"record_transaction",
"send_receipt"
]
}
Variable Dependency Tracking
DataFlowGraph tracks which variables each function depends on (src/data_flow/mod.rs:119):
pub struct DataFlowGraph {
    // Maps function_id -> set of variable names used
    variable_deps: HashMap<FunctionId, HashSet<String>>,
    // ...
}
What it tracks:
- Function parameters (primary source via extraction adapters)
- Local variables accessed in function body
- Captured variables (closures)
Note: Variable dependency tracking stores variable names only (as HashSet<String>). It does not track mutability information - that analysis is handled separately by the purity detection system.
Benefits:
- Identify functions coupled through shared state
- Detect potential side effect chains
- Guide refactoring to reduce coupling
Example output:
{
"function": "calculate_total",
"variable_dependencies": ["items", "tax_rate", "discount", "total"],
"parameter_count": 3,
"local_var_count": 1
}
Data Transformation Patterns
DataFlowGraph tracks data transformations between functions. The TransformationType enum (src/organization/data_flow_analyzer.rs:35-46) classifies transformations by their input/output cardinality:
pub enum TransformationType {
    Direct,        // A → B (pure transformation)
    Aggregation,   // (A, B) → C (multiple inputs to single output)
    Decomposition, // A → (B, C) (single input to multiple outputs)
    Enrichment,    // A → Result<B> (validation/enrichment with Result/Option)
    Expansion,     // A → Vec<B> (single input to collection)
}
Classification logic (src/organization/data_flow_analyzer.rs:124-146):
- Multiple input parameters → Aggregation
- Return type is Result<T> or Option<T> → Enrichment
- Return type is Vec<T> → Expansion
- Return type is tuple → Decomposition
- Default → Direct
Example usage:
// Aggregation: (items, discount_rate) → f64
fn calculate_discounted_total(items: &[Item], discount_rate: f64) -> f64 {
    items.iter().map(|i| i.price).sum::<f64>() * (1.0 - discount_rate)
}

// Enrichment: Config → Result<ValidatedConfig>
fn validate_config(config: Config) -> Result<ValidatedConfig> {
    // ...
}

// Expansion: Order → Vec<LineItem>
fn extract_line_items(order: &Order) -> Vec<LineItem> {
    order.items.clone()
}
Note: The DataFlowGraph.data_transformations field (src/data_flow/mod.rs:149) stores transformation_type as a String, allowing flexible pattern descriptions beyond the enum variants.
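A hypothetical classifier following the cardinality logic above might inspect a simplified signature summary. The string-based return type is an assumption made for this sketch; the real classifier operates on parsed syntax trees:

// Sketch only; illustrative, not the actual implementation.
fn classify(input_count: usize, return_type: &str) -> TransformationType {
    if input_count > 1 {
        TransformationType::Aggregation
    } else if return_type.starts_with("Result<") || return_type.starts_with("Option<") {
        TransformationType::Enrichment
    } else if return_type.starts_with("Vec<") {
        TransformationType::Expansion
    } else if return_type.starts_with('(') {
        TransformationType::Decomposition
    } else {
        TransformationType::Direct
    }
}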
I/O Operation Detection
Tracks functions performing I/O operations for purity and performance analysis:
I/O categories tracked:
- File I/O: std::fs, File::open, read_to_string
- Network I/O: HTTP requests, socket operations
- Database I/O: SQL queries, ORM operations
- System calls: Process spawning, environment access
- Blocking operations: thread::sleep, synchronous I/O in async
Example detection:
// Detected I/O operations: FileWrite
fn save_config(config: &Config, path: &Path) -> Result<()> {
    let json = serde_json::to_string(config)?; // No I/O
    std::fs::write(path, json)?;               // FileWrite detected
    Ok(())
}
I/O metadata:
{
"function": "save_config",
"io_operations": ["FileWrite"],
"is_blocking": true,
"affects_purity": true,
"async_safe": false
}
Purity Analysis Integration
DataFlowGraph integrates with purity detection to provide comprehensive side effect analysis:
Side effect tracking:
- I/O operations (file, network, console)
- Global state mutations
- Random number generation
- System time access
- Non-deterministic behavior
Purity confidence factors:
- 1.0: Pure mathematical function, no side effects
- 0.8: Pure with deterministic data transformations
- 0.5: Mixed - some suspicious patterns
- 0.2: Likely impure - I/O detected
- 0.0: Definitely impure - multiple side effects
Example analysis:
{
"function": "calculate_discount",
"is_pure": true,
"purity_confidence": 0.95,
"side_effects": [],
"deterministic": true,
"safe_to_parallelize": true,
"safe_to_cache": true
}
Modification Impact Analysis
DataFlowGraph calculates the impact of modifying a function:
pub struct ModificationImpact {
    pub function_name: String,
    pub affected_functions: Vec<String>, // Upstream callers
    pub dependency_count: usize,         // Downstream callees
    pub has_side_effects: bool,
    pub risk_level: RiskLevel,
}
Risk level calculation:
- Critical: Many upstream callers + side effects + low test coverage
- High: Many callers OR side effects with moderate coverage
- Medium: Few callers with side effects OR many callers with good coverage
- Low: Few callers, no side effects, or well-tested
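The tiers above overlap slightly, so any concrete rendering has to pick a precedence. The sketch below resolves the rules top-down; the thresholds (5 callers, 60% coverage) are illustrative assumptions, not the implementation's values:

// Hedged sketch; uses the RiskLevel from the struct above.
fn modification_risk(callers: usize, side_effects: bool, coverage: f64) -> RiskLevel {
    let many_callers = callers > 5;    // assumed threshold
    let low_coverage = coverage < 0.6; // assumed threshold
    if many_callers && side_effects && low_coverage {
        RiskLevel::Critical
    } else if (many_callers && low_coverage) || (side_effects && low_coverage) {
        RiskLevel::High
    } else if side_effects || many_callers {
        RiskLevel::Medium
    } else {
        RiskLevel::Low
    }
}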
Example impact analysis:
{
"function": "validate_payment_method",
"modification_impact": {
"affected_functions": 4,
"dependency_count": 8,
"has_side_effects": true,
"risk_level": "High"
}
}
Note: The affected_functions field contains the count of upstream callers. The actual function names can be obtained from the upstream_callers field in the function metadata.
Using modification impact:
# Analyze impact before refactoring
debtmap analyze . --format json | jq '.functions[] | select(.name == "validate_payment_method") | .modification_impact'
Impact analysis uses:
- Refactoring planning: Understand blast radius before changes
- Test prioritization: Focus tests on high-impact functions
- Code review: Flag high-risk changes for extra scrutiny
- Dependency management: Identify tightly coupled components
DataFlowGraph Methods
Key methods for data flow analysis:
// Add function with its dependencies
pub fn add_function(&mut self, function_id: String, callees: Vec<String>)

// Track variable dependencies
pub fn add_variable_dependency(&mut self, function_id: String, var_name: String)

// Record I/O operations
pub fn add_io_operation(&mut self, function_id: String, io_type: IoType)

// Calculate modification impact
pub fn calculate_modification_impact(&self, function_id: &str) -> ModificationImpact

// Get all functions affected by a change
pub fn get_affected_functions(&self, function_id: &str) -> Vec<String>

// Find functions with side effects
pub fn find_functions_with_side_effects(&self) -> Vec<String>
Integration in analysis pipeline:
- Parser builds initial call graph
- DataFlowGraph extends with variable/I/O tracking
- Purity analyzer adds side effect information
- Modification impact calculated for each function
- Results used in prioritization and risk scoring
Connection to Unified Scoring:
The dependency analysis from DataFlowGraph directly feeds into the unified scoring system’s dependency factor (20% weight):
- Dependency Factor Calculation: Functions with high upstream caller count or on critical paths from entry points receive higher dependency scores (8-10)
- Isolated Utilities: Functions with few or no callers score lower (1-3) on dependency factor
- Impact Prioritization: This helps prioritize functions where bugs have wider impact across the codebase
- Modification Risk: The modification impact analysis uses dependency data to calculate blast radius when changes are made
Example:
Function: validate_payment_method
Upstream callers: 4 (high impact)
→ Dependency Factor: 8.0
Function: format_currency_string
Upstream callers: 0 (utility)
→ Dependency Factor: 1.5
Both have same complexity, but validate_payment_method gets higher unified score
due to its critical role in the call graph.
This integration ensures that the unified scoring system considers not just internal function complexity and test coverage, but also the function’s importance in the broader codebase architecture.
Entropy-Based Complexity
Advanced pattern detection to reduce false positives.
Token Classification:
enum TokenType {
    Variable, // Weight: 1.0
    Method,   // Weight: 1.5 (more important)
    Literal,  // Weight: 0.5 (less important)
    Keyword,  // Weight: 0.8
    Operator, // Weight: 0.6
}
Shannon Entropy Calculation:
H(X) = -Σ p(x) × log₂(p(x))
where p(x) is the probability of each token type.
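A direct translation of the formula, computing entropy over observed token-type counts (token weighting is omitted to keep the sketch minimal):

// Shannon entropy over token-type frequencies.
fn shannon_entropy(counts: &[usize]) -> f64 {
    let total = counts.iter().sum::<usize>() as f64;
    counts.iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum()
}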
Dampening Decision:
if entropy_score.token_entropy < 0.4
    && entropy_score.pattern_repetition > 0.6
    && entropy_score.branch_similarity > 0.7
{
    // Apply dampening
    effective_complexity = base_complexity * (1.0 - dampening_factor);
}
Output explanation:
Function: validate_input
Cyclomatic: 15 → Effective: 5
Reasoning:
- High pattern repetition detected (85%)
- Low token entropy indicates simple patterns (0.32)
- Similar branch structures found (92% similarity)
- Complexity reduced by 67% due to pattern-based code
Entropy Analysis Caching
EntropyAnalyzer includes an LRU-style cache for performance optimization when analyzing large codebases or performing repeated analysis.
Cache Structure
struct CacheEntry {
    score: EntropyScore,
    timestamp: Instant,
    hit_count: usize,
}
Cache configuration:
- Default size: 1000 entries
- Eviction policy: LRU (Least Recently Used)
- Memory per entry: ~128 bytes
- Total memory overhead: ~128 KB for default size
Cache Statistics
The analyzer tracks cache performance:
pub struct CacheStats {
    pub hits: usize,
    pub misses: usize,
    pub evictions: usize,
    pub hit_rate: f64,
    pub memory_usage: usize,
}
Example stats output:
{
"entropy_cache_stats": {
"hits": 3427,
"misses": 1573,
"evictions": 573,
"hit_rate": 0.685,
"memory_usage": 128000
}
}
Hit rate interpretation:
- > 0.7: Excellent - many repeated analyses, cache is effective
- 0.4-0.7: Good - moderate reuse, typical for incremental analysis
- < 0.4: Low - mostly unique functions, cache less helpful
Performance Benefits
Typical performance gains:
- Cold analysis: 100ms baseline (no cache benefit)
- Incremental analysis: 30-40ms (~60-70% faster) for unchanged functions
- Re-analysis: 15-20ms (~80-85% faster) for recently analyzed functions
Best for:
- Watch mode: Analyzing on file save (repeated analysis of same files)
- CI/CD: Comparing feature branch to main (overlap in functions)
- Large codebases: Many similar functions benefit from pattern caching
Memory estimation:
Total cache memory = entry_count × 128 bytes
Examples:
- 1,000 entries: ~128 KB (default)
- 5,000 entries: ~640 KB (large projects)
- 10,000 entries: ~1.25 MB (very large)
Cache Management
Automatic eviction:
- When cache reaches size limit, oldest entries evicted
- Hit count influences retention (frequently accessed stay longer)
- Timestamp used for LRU ordering
Cache invalidation:
- Function source changes invalidate entry
- Cache cleared between major analysis runs
- No manual invalidation needed
Configuration (if exposed in future):
[entropy.cache]
enabled = true
size = 1000 # Number of entries
ttl_seconds = 3600 # Optional: expire after 1 hour
Context-Aware Analysis
Debtmap adjusts analysis based on code context:
Pattern Recognition:
- Validation patterns (repetitive checks)
- Dispatcher patterns (routing logic)
- Builder patterns (fluent APIs)
- Configuration parsers (key-value processing)
Adjustment Strategies:
- Reduce false positives for recognized patterns
- Apply appropriate thresholds by pattern type
- Consider pattern confidence in scoring
Example:
// Recognized as "validation_pattern"
// Complexity dampening applied
fn validate_user_input(input: &UserInput) -> Result<()> {
    if input.name.is_empty() { return Err(Error::EmptyName); }
    if input.email.is_empty() { return Err(Error::EmptyEmail); }
    if input.age < 13 { return Err(Error::TooYoung); }
    // ... more similar validations
    Ok(())
}
Coverage Integration
Debtmap parses LCOV coverage data for risk analysis:
LCOV Support:
- Standard format from most coverage tools
- Line-level coverage tracking
- Function-level aggregation
Coverage Index:
- O(1) exact name lookups (~0.5μs)
- O(log n) line-based fallback (~5-8μs)
- ~200 bytes per function
- Thread-safe (`Arc<CoverageIndex>`)
Performance Characteristics
Index Build Performance:
- Index construction: O(n), approximately 20-30ms for 5,000 functions
- Memory usage: ~200 bytes per record (~2MB for 5,000 functions)
- Scales linearly with function count
Lookup Performance:
- Exact match (function name): O(1) average, ~0.5μs per lookup
- Line-based fallback: O(log n), ~5-8μs per lookup
- Cache-friendly data structure for hot paths
Analysis Overhead:
- Coverage integration overhead: ~2.5x baseline analysis time
- Target overhead: ≤3x (maintained through optimizations)
- Example timing: 53ms baseline → 130ms with coverage (2.45x overhead)
- Overhead includes index build + lookups + coverage propagation
When to use coverage integration:
- Skip coverage (faster iteration): For rapid development iteration or quick local checks, omit `--lcov` to get baseline results ~2.5x faster
- Include coverage (comprehensive analysis): Use coverage integration for final validation, sprint planning, and CI/CD gates where comprehensive risk analysis is needed
Thread Safety:
- Coverage index wrapped in `Arc<CoverageIndex>` for lock-free parallel access
- Multiple analyzer threads can query coverage simultaneously
- No contention on reads, suitable for parallel analysis pipelines
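For illustration, sharing the index across analyzer threads could look like the following sketch (`CoverageIndex` here is a hypothetical stand-in, not debtmap's actual type):

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the real coverage index.
struct CoverageIndex;

impl CoverageIndex {
    fn coverage_for(&self, _function: &str) -> Option<f64> {
        // The real index does an O(1) name lookup with an
        // O(log n) line-based fallback; stubbed out here.
        None
    }
}

fn main() {
    let index = Arc::new(CoverageIndex);
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let index = Arc::clone(&index);
            // Reads are lock-free: every thread shares the immutable index.
            thread::spawn(move || index.coverage_for(&format!("function_{i}")))
        })
        .collect();
    for handle in handles {
        let _ = handle.join().unwrap();
    }
}
```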
Memory Footprint:
Total memory = (function_count × 200 bytes) + index overhead
Examples:
- 1,000 functions: ~200 KB
- 5,000 functions: ~2 MB
- 10,000 functions: ~4 MB
Scalability:
- Tested with codebases up to 10,000 functions
- Performance remains predictable and acceptable
- Memory usage stays bounded and reasonable
Generating coverage:
# Rust (using cargo-tarpaulin)
cargo tarpaulin --out lcov --output-dir target/coverage
# Or using cargo-llvm-cov
cargo llvm-cov --lcov --output-path target/coverage/lcov.info
Using with Debtmap:
debtmap analyze . --lcov target/coverage/lcov.info
Coverage dampening: When coverage data is provided, debt scores are dampened for well-tested code:
final_score = base_score × (1 - coverage_percentage)
This ensures well-tested complex code gets lower priority than untested simple code.
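For example, a complex function with base score 40.0 and 90% coverage drops to 40.0 × (1 − 0.9) = 4.0, while a simpler untested function with base score 15.0 keeps its full 15.0 and ranks higher.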
See Also
- Complexity Metrics - Detailed metrics used in analysis
- Risk Scoring - How advanced features influence risk scores
- Interpreting Results - Using analysis results effectively
Example Outputs
This page demonstrates realistic examples of debtmap’s terminal and JSON output formats using the unified format (spec 108).
High Complexity Function (Needs Refactoring)
Terminal Output:
#1 SCORE: 9.2 [CRITICAL]
├─ COMPLEXITY: ./src/payments/processor.rs:145 process_transaction()
├─ ACTION: Refactor into 4 smaller functions
├─ IMPACT: Reduce complexity from 25 to 8, improve testability
├─ COMPLEXITY: cyclomatic=25, branches=25, cognitive=38, nesting=5, lines=120
├─ DEPENDENCIES: 3 upstream, 8 downstream
└─ WHY: Exceeds all complexity thresholds, difficult to test and maintain
JSON Output (Unified Format):
{
"type": "Function",
"score": 92.5,
"category": "CodeQuality",
"priority": "critical",
"location": {
"file": "src/payments/processor.rs",
"line": 145,
"function": "process_transaction"
},
"metrics": {
"cyclomatic_complexity": 25,
"cognitive_complexity": 38,
"length": 120,
"nesting_depth": 5,
"coverage": 0.15,
"uncovered_lines": [145, 147, 152, 158, 165, 172, 180, 185]
},
"debt_type": {
"ComplexityHotspot": {
"cyclomatic": 25,
"cognitive": 38,
"adjusted_cyclomatic": null
}
},
"function_role": "Orchestrator",
"purity_analysis": {
"is_pure": false,
"confidence": 0.15,
"side_effects": ["mutates_state", "io_operations", "database_access"]
},
"dependencies": {
"upstream_count": 3,
"downstream_count": 8,
"upstream_callers": [
"handle_payment",
"handle_subscription",
"handle_refund"
],
"downstream_callees": [
"validate",
"calculate_fees",
"record_transaction",
"send_receipt",
"update_balance",
"log_transaction",
"check_fraud",
"notify_user"
]
},
"recommendation": {
"action": "Refactor into 4 smaller, focused functions",
"implementation_steps": [
"Extract validation logic into validate_payment_request",
"Create calculate_payment_totals for fee calculation",
"Move side effects to separate transaction recorder",
"Keep process_transaction as thin orchestrator"
]
},
"impact": {
"coverage_improvement": 0.55,
"complexity_reduction": 68.0,
"risk_reduction": 7.8
},
"scoring_details": {
"coverage_score": 45.0,
"complexity_score": 38.5,
"dependency_score": 9.0,
"base_score": 92.5,
"role_multiplier": 1.0,
"final_score": 92.5
}
}
Source: Structure from src/output/unified/func_item.rs:FunctionDebtItemOutput (lines 50-84)
Key Fields Explained:
- `type` - Always `"Function"` for function-level debt items
- `score` - Unified debt score (same scoring path for File and Function items)
- `category` - One of `CodeQuality`, `Architecture`, `Testing`, `Performance`
- `priority` - Derived from score (`critical` >= 100, `high` >= 50, `medium` >= 20, `low` < 20)
- `location` - Unified location structure with file, line, and function name
- `function_role` - Classification from the `FunctionRole` enum (see below)
- `debt_type` - Tagged enum with variant-specific data
Function Role Classification
The function_role field helps prioritize testing and refactoring efforts based on the function’s architectural purpose.
Source: src/priority/semantic_classifier/mod.rs:25-33
{
"function_role": "PureLogic"
}
Available Roles:
- `PureLogic` - Business logic, high test priority
- `Orchestrator` - Coordinates other functions (like the example above)
- `IOWrapper` - Thin I/O layer, lower test priority
- `EntryPoint` - Main entry points (main, CLI handlers)
- `PatternMatch` - Pattern matching function (often low complexity)
- `Debug` - Debug/diagnostic functions (low test priority)
- `Unknown` - Cannot classify automatically
File-Level Debt (God Object)
Terminal Output:
#2 SCORE: 8.7 [HIGH]
├─ GOD OBJECT: ./src/services/user_manager.rs
├─ ACTION: Split into 4 focused modules
├─ METRICS: 1250 lines, 45 functions, avg complexity 12.3
├─ INDICATORS: High responsibility count (8), excessive dependencies
└─ WHY: File has too many responsibilities, difficult to maintain
JSON Output (Unified Format):
{
"type": "File",
"score": 87.0,
"category": "Architecture",
"priority": "high",
"location": {
"file": "src/services/user_manager.rs"
},
"metrics": {
"lines": 1250,
"functions": 45,
"classes": 3,
"avg_complexity": 12.3,
"max_complexity": 28,
"total_complexity": 554,
"coverage": 0.62,
"uncovered_lines": 125
},
"god_object_indicators": {
"responsibility_count": 8,
"data_class_count": 12,
"method_groups": [
"authentication",
"authorization",
"profile_management",
"session_handling",
"notification_preferences",
"audit_logging",
"password_management",
"role_management"
],
"coupling_score": 0.78,
"cohesion_score": 0.34
},
"recommendation": {
"action": "Split into focused modules by responsibility",
"implementation_steps": [
"Extract authentication into auth_service.rs",
"Move authorization to permission_service.rs",
"Create profile_service.rs for user data management",
"Separate audit concerns into audit_logger.rs"
]
},
"impact": {
"complexity_reduction": 45.0,
"maintainability_improvement": 0.68,
"test_effort": 8.5
}
}
Source: Structure from src/output/unified/file_item.rs:FileDebtItemOutput and src/priority/file_metrics.rs:GodObjectIndicators
Test Gap (Needs Testing)
Terminal Output:
#3 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/analyzers/rust_call_graph.rs:38 add_function_to_graph()
├─ ACTION: Add 6 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk reduction
├─ COMPLEXITY: cyclomatic=6, branches=6, cognitive=8, nesting=2, lines=32
├─ DEPENDENCIES: 0 upstream, 11 downstream
├─ TEST EFFORT: Simple (2-3 hours)
└─ WHY: Business logic with 0% coverage, manageable complexity (cyclo=6, cog=8)
High impact - 11 functions depend on this
JSON Output (Unified Format):
{
"type": "Function",
"score": 89.0,
"category": "Testing",
"priority": "critical",
"location": {
"file": "src/analyzers/rust_call_graph.rs",
"line": 38,
"function": "add_function_to_graph"
},
"metrics": {
"cyclomatic_complexity": 6,
"cognitive_complexity": 8,
"length": 32,
"nesting_depth": 2,
"coverage": 0.0,
"uncovered_lines": [38, 39, 40, 42, 45, 48, 51, 54, 57, 60, 63, 66]
},
"debt_type": {
"TestingGap": {
"coverage": 0.0,
"cyclomatic": 6,
"cognitive": 8
}
},
"function_role": "PureLogic",
"purity_analysis": {
"is_pure": false,
"confidence": 0.65
},
"dependencies": {
"upstream_count": 0,
"downstream_count": 11,
"downstream_callees": [
"get_function_name",
"extract_parameters",
"parse_return_type",
"add_to_registry",
"update_call_sites",
"resolve_types",
"track_visibility",
"record_location",
"increment_counter",
"validate_signature",
"log_registration"
]
},
"recommendation": {
"action": "Add unit tests for core business logic",
"implementation_steps": [
"Test happy path with valid function definition",
"Test error cases: null input, invalid syntax",
"Test edge cases: complex generics, lifetimes",
"Test integration with registry updates",
"Verify correct handling of visibility modifiers",
"Test type resolution edge cases"
]
},
"impact": {
"coverage_improvement": 1.0,
"complexity_reduction": 0.0,
"risk_reduction": 3.7
}
}
Source: Structure from src/output/unified/func_item.rs:FunctionDebtItemOutput with debt_type from src/priority/debt_types.rs:DebtType
Entropy-Dampened Validation Function
This example shows how debtmap’s entropy analysis reduces false positives for repetitive code patterns.
Terminal Output:
Function: validate_config
File: src/config/validator.rs:23
Cyclomatic: 20 → Effective: 7 (65% dampened)
Risk: LOW
Entropy Analysis:
├─ Token Entropy: 0.28 (low variety - repetitive patterns)
├─ Pattern Repetition: 0.88 (high similarity between checks)
├─ Branch Similarity: 0.91 (consistent validation structure)
└─ Reasoning: Complexity reduced by 65% due to pattern-based code
This appears complex but is actually a repetitive validation pattern.
Lower priority for refactoring.
JSON Output (Unified Format):
{
"type": "Function",
"score": 15.2,
"category": "CodeQuality",
"priority": "low",
"location": {
"file": "src/config/validator.rs",
"line": 23,
"function": "validate_config"
},
"metrics": {
"cyclomatic_complexity": 20,
"cognitive_complexity": 18,
"length": 85,
"nesting_depth": 3,
"coverage": 0.95,
"entropy_score": 0.28
},
"debt_type": {
"ComplexityHotspot": {
"cyclomatic": 20,
"cognitive": 18,
"adjusted_cyclomatic": 7
}
},
"adjusted_complexity": {
"dampened_cyclomatic": 7.0,
"dampening_factor": 0.65
},
"function_role": "PatternMatch",
"recommendation": {
"action": "Low priority - repetitive validation pattern"
},
"impact": {
"coverage_improvement": 0.05,
"complexity_reduction": 0.0,
"risk_reduction": 0.8
},
"scoring_details": {
"coverage_score": 2.5,
"complexity_score": 7.0,
"dependency_score": 0.0,
"base_score": 43.5,
"entropy_dampening": 0.65,
"role_multiplier": 0.35,
"final_score": 15.2
}
}
Source: adjusted_complexity from src/output/unified/func_item.rs:AdjustedComplexity, entropy dampening spec 182
Key Points:
- `adjusted_cyclomatic` - Entropy-dampened complexity value (7 vs original 20)
- `dampening_factor` - Amount of reduction applied (0.65 = 65% reduction)
- `entropy_score` - Low value (0.28) indicates repetitive patterns
- Score reduced from 43.5 to 15.2 due to entropy analysis
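The `scoring_details` show the arithmetic: the final score is the base score times the role multiplier (43.5 × 0.35 ≈ 15.2), with the entropy dampening already reflected in the reduced `complexity_score` of 7.0.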
Pattern Detection Example
When debtmap detects a specific complexity pattern, it includes pattern metadata.
JSON Output:
{
"type": "Function",
"score": 65.0,
"category": "CodeQuality",
"priority": "high",
"location": {
"file": "src/state/workflow_executor.rs",
"line": 78,
"function": "execute_transition"
},
"pattern_type": "state_machine",
"pattern_confidence": 0.87,
"pattern_details": {
"state_count": 12,
"transition_count": 34,
"branch_entropy": 0.82,
"state_cohesion": 0.91
},
"complexity_pattern": "State machine with 12 states, high cohesion"
}
Source: pattern_type and pattern_confidence from src/output/unified/func_item.rs:FunctionDebtItemOutput
Available Pattern Types:
- `state_machine` - State transition logic
- `coordinator` - Function orchestrating multiple operations
- Pattern detection threshold: 0.7 confidence (from `src/io/writers/pattern_display.rs:PATTERN_CONFIDENCE_THRESHOLD`)
Test File Detection
Debtmap automatically labels test files using the file_context_label field (spec 166).
JSON Output:
{
"type": "Function",
"location": {
"file": "tests/integration/payment_test.rs",
"line": 45,
"function": "test_payment_processing",
"file_context_label": "TEST FILE"
}
}
Labels:
"TEST FILE"- File is definitively a test file"PROBABLE TEST"- File likely contains tests but not confirmed
Source: file_context_label from src/output/unified/location.rs:UnifiedLocation
Summary Statistics
The unified format includes summary statistics at the top level.
JSON Output:
{
"format_version": "1.0.0",
"metadata": {
"debtmap_version": "0.5.0",
"generated_at": "2025-12-04T22:15:00Z",
"project_root": "/home/user/myproject",
"analysis_type": "full"
},
"summary": {
"total_items": 127,
"total_debt_score": 2845.6,
"debt_density": 0.18,
"total_loc": 15823,
"by_type": {
"File": 8,
"Function": 119
},
"by_category": {
"CodeQuality": 67,
"Architecture": 12,
"Testing": 42,
"Performance": 6
},
"score_distribution": {
"critical": 15,
"high": 34,
"medium": 58,
"low": 20
}
},
"items": []
}
Source: UnifiedOutput structure from src/output/unified/types.rs:14-20 and DebtSummary from lines 32-44
Key Summary Fields:
- `debt_density` - Total debt score divided by total lines of code
- `by_type` - Count of File vs Function debt items
- `by_category` - Count by debt category
- `score_distribution` - Count of items by priority level
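In the sample above, `debt_density` works out to 2845.6 / 15823 ≈ 0.18.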
Before/After Refactoring Comparison
Before:
Function: process_order
Cyclomatic: 22
Cognitive: 35
Coverage: 15%
Risk Score: 52.3 (CRITICAL)
Debt Score: 50 (Critical Complexity)
After:
Function: process_order (refactored)
Cyclomatic: 5
Cognitive: 6
Coverage: 92%
Risk Score: 2.1 (LOW)
Debt Score: 0 (no debt)
Extracted functions:
- validate_order (Cyclomatic: 4, Coverage: 100%)
- calculate_totals (Cyclomatic: 3, Coverage: 95%)
- apply_discounts (Cyclomatic: 6, Coverage: 88%)
- finalize_order (Cyclomatic: 4, Coverage: 90%)
Impact:
✓ Complexity reduced by 77%
✓ Coverage improved by 513%
✓ Risk reduced by 96%
✓ Created 4 focused, testable functions
Well-Tested Complex Function (Good Example)
Not all complexity is bad. This example shows a legitimately complex function with excellent test coverage.
Terminal Output:
Function: calculate_tax (WELL TESTED - Good Example!)
File: src/tax/calculator.rs:78
Complexity: Cyclomatic=18, Cognitive=22
Coverage: 98%
Risk: LOW
Why this is good:
- High complexity is necessary (tax rules are complex)
- Thoroughly tested with 45 test cases
- Clear documentation of edge cases
- Good example to follow for other complex logic
Next Steps
- Output Formats - Complete JSON schema and format documentation
- Configuration - Customize thresholds and analysis behavior
- Advanced Features - Purity analysis, entropy dampening, pattern detection
For questions or issues, visit GitHub Issues.
Compare Analysis
The compare command enables you to track technical debt changes over time by comparing two analysis results. This is essential for validating refactoring efforts, detecting regressions in pull requests, and monitoring project health trends.
Implementation Status
All Features Available Now:
- ✅ Target location tracking with intelligent fuzzy matching
- ✅ Detailed improvement percentage calculations (per-item)
- ✅ Multiple output formats (JSON, Markdown, Terminal, HTML)
- ✅ Implementation plan parsing for target extraction
- ✅ Four match strategies (Exact, FunctionLevel, ApproximateName, FileLevel)
- ✅ Resolved items tracking (debt eliminated)
- ✅ Improved items detection (score reduction ≥ 30%)
- ✅ New critical items detection (regressions)
- ✅ Project health metrics and trends
- ✅ CI/CD integration support
Overview
The compare command analyzes differences between “before” and “after” debtmap analyses, providing:
- Target location tracking - Monitor specific code locations through refactoring with fuzzy matching
- Validation tracking - Verify debt items are resolved or improved
- Project health metrics - Track overall debt trends across your codebase
- Regression detection - Identify new critical debt items introduced (score ≥ 60.0)
- Improvement tracking - Measure and celebrate debt reduction with detailed per-item metrics
- CI/CD integration - Automate quality gates in your pipeline
Basic Usage
Command Syntax
debtmap compare \
--before path/to/before.json \
--after path/to/after.json \
--output validation.json
Command-Line Options
| Option | Required | Description |
|---|---|---|
| `--before FILE` | Yes | Path to “before” analysis JSON |
| `--after FILE` | Yes | Path to “after” analysis JSON |
| `--output FILE` | No | Output file path (default: stdout) |
| `--plan FILE` | No | Implementation plan to extract target location |
| `--target-location LOCATION` | No | Manual target location (format: `file:function:line`) |
| `--format FORMAT` | No | Output format: `json`, `markdown`, `terminal`, or `html` (default: `json`) |
All comparison features are available now, including target location tracking, fuzzy matching, and multiple output formats.
Target Location Tracking
Target location tracking allows you to monitor specific code locations through refactoring changes. The compare command uses intelligent fuzzy matching to find your target even when code is moved or renamed.
Location Format
Target locations use the format: file:function:line
Examples:
- `src/main.rs:complex_function:42`
- `lib/parser.rs:parse_expression:156`
- `api/handler.rs:process_request:89`
Location Format Variations (src/comparison/location_matcher.rs:12-19):
| Pattern | Description | Example |
|---|---|---|
| `file:function:line` | Exact match | `src/main.rs:process:42` |
| `file:function` | Function-level match (any line) | `src/main.rs:process` |
| `file:*:line` | Line range match (any function at line) | `src/main.rs:*:42` |
| `file` | File-level match (all items in file) | `src/main.rs` |
The file:*:line format is useful when you want to target a specific line number regardless of the function name, which is helpful when function names change during refactoring.
Specifying Target Locations
Option 1: Via Implementation Plan
Create an IMPLEMENTATION_PLAN.md file with a target location section:
# Implementation Plan
## Target Item
**Location**: ./src/example.rs:complex_function:45
**Current Debt Score**: 85.5
**Severity**: critical
## Problem Analysis
The `complex_function` has high cognitive complexity...
## Proposed Solution
1. Extract nested conditionals into separate functions
2. Use early returns to reduce nesting depth
3. Add comprehensive unit tests
Then run compare with the plan:
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
Option 2: Manual Target Location
Specify the target directly via command-line:
debtmap compare \
--before before.json \
--after after.json \
--target-location "src/example.rs:complex_function:45"
Matching Strategies
Debtmap uses intelligent matching to find your target item even when code changes. The matcher tries multiple strategies in order, using the most precise match available:
| Strategy | When Used | Confidence |
|---|---|---|
| Exact | file:function:line matches exactly | 1.0 |
| FunctionLevel | file:function matches (any line) | 0.8 |
| ApproximateName | Fuzzy name matching finds similar function | 0.6 |
| FileLevel | All items in file match | 0.4 |
The comparison result includes the match strategy and confidence score used, along with the count of matched items (useful when fuzzy matching finds multiple candidates).
Target Status Values
After comparing, the target item will have one of these statuses:
- Resolved - Item no longer exists in after analysis (debt eliminated!)
- Improved - Item exists but with lower debt score
- Unchanged - Item exists with similar metrics (within 5%)
- Regressed - Item exists but got worse
- NotFoundBefore - Item didn’t exist in before analysis
- NotFound - Item not found in either analysis
Project Health Metrics
The compare command tracks project-wide health metrics to show overall trends.
Tracked Metrics
{
"project_health": {
"before": {
"total_debt_score": 450.5,
"total_items": 25,
"critical_items": 5,
"high_priority_items": 12,
"average_score": 18.02
},
"after": {
"total_debt_score": 380.2,
"total_items": 22,
"critical_items": 3,
"high_priority_items": 10,
"average_score": 17.28
},
"changes": {
"debt_score_change": -70.3,
"debt_score_change_pct": -15.6,
"items_change": -3,
"critical_items_change": -2
}
}
}
Understanding Metrics
- total_debt_score - Sum of all debt item scores
- total_items - Total number of debt items detected
- critical_items - Items with score ≥ 60.0 (critical threshold)
- high_priority_items - Items with score ≥ 40.0 (high priority threshold)
- average_score - Mean debt score across all items
- debt_score_change - Absolute change in total debt
- debt_score_change_pct - Percentage change in total debt
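For example, the `debt_score_change_pct` shown above follows directly from the totals: (380.2 − 450.5) / 450.5 × 100 ≈ −15.6%.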
Debt Trends
The comparison calculates an overall debt trend based on the percentage change:
- Improving - Debt decreased by more than 5%
- Stable - Debt changed by less than 5% (within normal variance)
- Regressing - Debt increased by more than 5%
Regression Detection
Regressions are new critical debt items (score ≥ 60.0) that appear in the after analysis.
What Counts as a Regression
A regression is detected when:
- An item exists in the after analysis
- The item does NOT exist in the before analysis
- The item has a debt score ≥ 60.0 (critical severity threshold)
Regression Output
The compare command returns a ComparisonResult with detailed regression information:
{
"regressions": [
{
"location": "src/new_feature.rs:process_data:23",
"score": 65.5,
"debt_type": "high_complexity",
"description": "Function has cyclomatic complexity of 12 and cognitive complexity of 15"
}
]
}
Each regression item includes:
- location - Full path with function and line number
- score - Debt score (≥ 60.0 for regressions)
- debt_type - Type of debt detected (e.g., “high_complexity”, “god_object”)
- description - Human-readable explanation of the issue
Using Regressions in CI/CD
Fail your CI build if regressions are detected:
# Run comparison
debtmap compare --before before.json --after after.json --output result.json
# Check for regressions
REGRESSION_COUNT=$(jq '.regressions | length' result.json)
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ Regression detected - $REGRESSION_COUNT new critical debt items found"
jq '.regressions[]' result.json
exit 1
fi
# Check overall debt trend
TREND=$(jq -r '.summary.overall_debt_trend' result.json)
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ Warning: Overall debt is increasing"
fi
Improvement Tracking
The compare command tracks improvements as a list of ImprovementItem objects with detailed before/after metrics.
Improvement Types
The ImprovementType enum (src/comparison/types.rs:121-126) defines four improvement categories:
- Resolved - Debt item completely eliminated (no longer in after analysis) ✓ Auto-detected
- ScoreReduced - Overall debt score reduced significantly (≥ 30% reduction) ✓ Auto-detected
- ComplexityReduced - Cyclomatic or cognitive complexity decreased (defined but not auto-classified)
- CoverageImproved - Test coverage increased (defined but not auto-classified)
Note: Currently, the comparator automatically detects and classifies Resolved and ScoreReduced improvements (src/comparison/comparator.rs:134-191). The ComplexityReduced and CoverageImproved types are defined in the type system but not yet auto-assigned during comparison. Future versions may extend auto-detection to classify complexity and coverage improvements.
Improvement Items Structure
{
"improvements": [
{
"location": "src/example.rs:complex_function:45",
"before_score": 68.5,
"after_score": 35.2,
"improvement_type": "ScoreReduced"
},
{
"location": "src/legacy.rs:old_code:120",
"before_score": 72.0,
"after_score": null,
"improvement_type": "Resolved"
},
{
"location": "src/utils.rs:helper_function:88",
"before_score": 45.0,
"after_score": 28.0,
"improvement_type": "ComplexityReduced"
}
]
}
Each improvement item includes:
- location - Full path with function and line number
- before_score - Original debt score
- after_score - New debt score (null if resolved)
- improvement_type - Type of improvement achieved
Before/After Metrics
When you specify a target location (via --plan or --target-location), the compare command provides detailed before/after metrics for that specific code location.
Target Item Comparison
{
"target_item": {
"location": "src/example.rs:complex_function:45",
"match_strategy": "Exact",
"match_confidence": 1.0,
"matched_items_count": 1,
"before": {
"score": 68.5,
"cyclomatic_complexity": 8,
"cognitive_complexity": 15,
"coverage": 45.0,
"function_length": 120,
"nesting_depth": 4
},
"after": {
"score": 35.1,
"cyclomatic_complexity": 3,
"cognitive_complexity": 5,
"coverage": 85.0,
"function_length": 45,
"nesting_depth": 2
},
"improvements": {
"score_reduction_pct": 48.8,
"complexity_reduction_pct": 66.7,
"coverage_improvement_pct": 88.9
},
"status": "Improved"
}
}
Target Metrics Fields
Each TargetMetrics object (before/after) includes:
- score - Unified debt score
- cyclomatic_complexity - Cyclomatic complexity metric
- cognitive_complexity - Cognitive complexity metric
- coverage - Test coverage percentage
- function_length - Lines of code in function
- nesting_depth - Maximum nesting depth
Improvement Percentages
The improvements object provides percentage improvements:
- score_reduction_pct - Percentage reduction in overall debt score
- complexity_reduction_pct - Reduction in cyclomatic/cognitive complexity
- coverage_improvement_pct - Increase in test coverage
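In the sample above, `score_reduction_pct` = (68.5 − 35.1) / 68.5 × 100 ≈ 48.8% and `coverage_improvement_pct` = (85.0 − 45.0) / 45.0 × 100 ≈ 88.9%; the 66.7% complexity figure corresponds to the cognitive complexity drop from 15 to 5.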
Metric Aggregation
When multiple items match the target location (due to fuzzy matching), metrics are aggregated:
- score - Average across matched items
- cyclomatic_complexity - Average
- cognitive_complexity - Average
- coverage - Average
- function_length - Average
- nesting_depth - Maximum (worst case)
The matched_items_count field tells you how many items were aggregated.
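A sketch of these aggregation rules (the item struct and field names are illustrative, not debtmap's actual types):

```rust
/// Aggregate metrics across matched items: averages for most fields,
/// maximum (worst case) for nesting depth.
struct MatchedItem {
    score: f64,
    nesting_depth: u32,
}

fn aggregate(items: &[MatchedItem]) -> Option<(f64, u32)> {
    if items.is_empty() {
        return None;
    }
    let avg_score = items.iter().map(|i| i.score).sum::<f64>() / items.len() as f64;
    let max_nesting = items.iter().map(|i| i.nesting_depth).max().unwrap_or(0);
    Some((avg_score, max_nesting))
}
```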
Validating Refactoring Success
Use the comparison output to verify your refactoring:
# Check target status
STATUS=$(jq -r '.target_item.status' result.json)
SCORE_REDUCTION=$(jq '.target_item.improvements.score_reduction_pct' result.json)
echo "Target Status: $STATUS"
echo "Score Reduction: ${SCORE_REDUCTION}%"
# Check for improvements
IMPROVEMENT_COUNT=$(jq '.improvements | length' result.json)
echo "Improvements: $IMPROVEMENT_COUNT items"
# Verify no regressions
REGRESSION_COUNT=$(jq '.regressions | length' result.json)
if [ "$REGRESSION_COUNT" -eq 0 ]; then
echo "✅ No regressions detected!"
else
echo "⚠️ $REGRESSION_COUNT new critical items"
fi
Output Formats
JSON Format
The default JSON format provides complete comparison results:
debtmap compare --before before.json --after after.json --output result.json
The ComparisonResult JSON output (src/comparison/types.rs:3-22) includes:
- `metadata` - Comparison metadata (timestamp, file paths, target location)
- `target_item` - Target item comparison with before/after metrics (if specified)
- `project_health` - Project-wide health metrics comparison
- `regressions` - List of new critical items (score ≥ 60.0)
- `improvements` - List of improved/resolved items (≥ 30% score reduction or resolved)
- `summary` - Summary statistics and overall debt trend
Example output:
{
"metadata": {
"comparison_date": "2024-01-15T10:30:00Z",
"before_file": "before.json",
"after_file": "after.json",
"target_location": "src/example.rs:complex_function:45"
},
"target_item": {
"status": "Improved",
"improvements": {
"score_reduction_pct": 48.8
}
},
"summary": {
"target_improved": true,
"new_critical_count": 0,
"resolved_count": 3,
"overall_debt_trend": "Improving"
}
}
Markdown Format
Generate human-readable markdown reports for pull request comments:
debtmap compare --before before.json --after after.json --format markdown
Implementation: src/io/writers/markdown/mod.rs, src/main.rs:1027-1072
The markdown output is suitable for:
- Pull request comments
- Documentation
- Email reports
- Team dashboards
Terminal Format
Display colorized output directly in the terminal:
debtmap compare --before before.json --after after.json --format terminal
Implementation: src/io/writers/terminal.rs, src/main.rs:1011
The terminal format provides:
- Color-coded status indicators
- Formatted tables for metrics
- Human-readable summaries
- Easy scanning of results
HTML Format
Generate HTML reports with structured styling:
debtmap compare --before before.json --after after.json --format html
Implementation: src/io/writers/html.rs
The HTML format is suitable for:
- Web-based dashboards
- Archived reports
- Integration with documentation sites
CI/CD Integration
GitHub Actions Example
name: Technical Debt Check
on: [pull_request]
jobs:
debt-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0 # Need history for before/after
- name: Install debtmap
run: cargo install debtmap
- name: Analyze main branch
run: |
git checkout main
debtmap analyze --output before.json
- name: Analyze PR branch
run: |
git checkout ${{ github.head_ref }}
debtmap analyze --output after.json
- name: Compare analyses
run: |
debtmap compare \
--before before.json \
--after after.json \
--output comparison.json
- name: Check comparison result
run: |
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
IMPROVEMENT_COUNT=$(jq '.improvements | length' comparison.json)
echo "Debt Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
echo "Improvements: $IMPROVEMENT_COUNT"
# Fail on regression
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ Regression detected"
jq '.regressions[]' comparison.json
exit 1
fi
# Warn if debt is increasing
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ Warning: Overall debt is increasing"
fi
- name: Post comparison to PR
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const comparison = JSON.parse(fs.readFileSync('comparison.json', 'utf8'));
const body = `## Technical Debt Comparison
**Overall Trend:** ${comparison.summary.overall_debt_trend}
**Regressions:** ${comparison.summary.new_critical_count}
**Improvements:** ${comparison.summary.resolved_count}
${comparison.improvements.length > 0 ? `
### Improvements
${comparison.improvements.map(i => `- ${i.location}: ${i.before_score.toFixed(1)} → ${i.after_score ? i.after_score.toFixed(1) : 'resolved'}`).join('\n')}
` : ''}
${comparison.regressions.length > 0 ? `
### ⚠️ Regressions
${comparison.regressions.map(r => `- ${r.location}: ${r.score.toFixed(1)} (${r.debt_type})`).join('\n')}
` : ''}`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: body
});
GitLab CI Example
debt_check:
stage: test
script:
# Analyze main branch
- git fetch origin main
- git checkout origin/main
- debtmap analyze --output before.json
# Analyze current branch
- git checkout $CI_COMMIT_SHA
- debtmap analyze --output after.json
# Compare and check status
- debtmap compare --before before.json --after after.json --output comparison.json
- |
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
echo "Debt Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "Failed: Regression detected"
jq '.regressions[]' comparison.json
exit 1
fi
artifacts:
paths:
- before.json
- after.json
- comparison.json
expire_in: 1 week
Best Practices for CI/CD
- Store analyses as artifacts - Keep before/after JSON for debugging
- Check status field - Use `status` to determine pass/fail
- Track completion percentage - Monitor progress toward debt resolution
- Review improvements - Celebrate and document successful refactorings
- Act on remaining issues - Create follow-up tasks for unresolved items
- Set completion thresholds - Require minimum completion percentage for merges
Practical Examples
Example 1: Basic Comparison
Compare two analyses to track debt changes:
# Run before analysis
debtmap analyze --output before.json
# Make changes to codebase...
# Run after analysis
debtmap analyze --output after.json
# Compare
debtmap compare --before before.json --after after.json --output comparison.json
# Check results
cat comparison.json | jq '.'
# Output shows: target_item, project_health, regressions, improvements, summary
Example 2: Validating Function Refactoring
Validate your refactoring work with target location tracking:
# Run before analysis
debtmap analyze --output before.json
# Identify critical items to fix
jq '.items[] | select(.unified_score.final_score >= 60.0)' before.json
# Refactor the high-priority functions...
# Run after analysis
debtmap analyze --output after.json
# Compare and validate with target location
debtmap compare \
--before before.json \
--after after.json \
--target-location "src/example.rs:complex_function:45" \
--output comparison.json
# Check target status
STATUS=$(jq -r '.target_item.status' comparison.json)
SCORE_REDUCTION=$(jq '.target_item.improvements.score_reduction_pct' comparison.json)
echo "Target Status: $STATUS"
echo "Score Reduction: ${SCORE_REDUCTION}%"
# Review all improvements
jq '.improvements[]' comparison.json
Example 3: Detecting PR Regressions
Check if a pull request introduces new critical debt:
# Analyze base branch
git checkout main
debtmap analyze --output main.json
# Analyze PR branch
git checkout feature/new-feature
debtmap analyze --output feature.json
# Compare
debtmap compare \
--before main.json \
--after feature.json \
--output comparison.json
# Check for regressions
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
echo "Regressions: $REGRESSION_COUNT"
echo "Debt Trend: $TREND"
# Example output structure:
jq '.' comparison.json
# {
# "summary": {
# "overall_debt_trend": "Improving", // or "Regressing"
# "new_critical_count": 0,
# "resolved_count": 3
# },
# "regressions": [],
# "improvements": [...],
# "project_health": {...}
# }
Example 4: Monitoring Project Health Over Releases
Track overall project health across releases:
# Analyze release v1.0
git checkout v1.0
debtmap analyze --output v1.0.json
# Analyze release v1.1
git checkout v1.1
debtmap analyze --output v1.1.json
# Compare
debtmap compare \
--before v1.0.json \
--after v1.1.json \
--output v1.0-to-v1.1.json
# Check project health metrics
echo "Before (v1.0):"
jq '.project_health.before' v1.0-to-v1.1.json
echo "After (v1.1):"
jq '.project_health.after' v1.0-to-v1.1.json
# Check overall trend
TREND=$(jq -r '.summary.overall_debt_trend' v1.0-to-v1.1.json)
DEBT_CHANGE=$(jq '.project_health.changes.debt_score_change_pct' v1.0-to-v1.1.json)
echo "Debt Trend: $TREND"
echo "Debt Score Change: ${DEBT_CHANGE}%"
Example 5: Full CI/CD Workflow
Complete workflow for continuous debt monitoring:
#!/bin/bash
# ci-debt-check.sh
set -e
BEFORE="before.json"
AFTER="after.json"
COMPARISON="comparison.json"
# Step 1: Analyze baseline (main branch)
echo "📊 Analyzing baseline..."
git checkout main
debtmap analyze --output "$BEFORE"
# Step 2: Analyze current branch
echo "📊 Analyzing current branch..."
git checkout -
debtmap analyze --output "$AFTER"
# Step 3: Run comparison
echo "🔍 Running comparison..."
debtmap compare \
--before "$BEFORE" \
--after "$AFTER" \
--output "$COMPARISON"
# Step 4: Extract metrics
TREND=$(jq -r '.summary.overall_debt_trend' "$COMPARISON")
REGRESSION_COUNT=$(jq '.regressions | length' "$COMPARISON")
IMPROVEMENT_COUNT=$(jq '.improvements | length' "$COMPARISON")
RESOLVED_COUNT=$(jq '.summary.resolved_count' "$COMPARISON")
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "📈 Debt Comparison Results"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
echo "Improvements: $IMPROVEMENT_COUNT"
echo "Resolved: $RESOLVED_COUNT"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Step 5: Quality gate
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ FAILED: Regression detected"
jq '.regressions[]' "$COMPARISON"
exit 1
fi
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ WARNING: Overall debt is increasing"
# Don't fail, just warn
fi
if [ "$RESOLVED_COUNT" -gt 0 ]; then
echo "🎉 SUCCESS: $RESOLVED_COUNT debt items resolved!"
fi
echo "✅ PASSED: No regressions detected"
Example 6: Interpreting Comparison Results
Understanding the comparison output:
# Run comparison
debtmap compare --before before.json --after after.json --output comparison.json
# Check debt trend
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
IMPROVEMENT_COUNT=$(jq '.improvements | length' comparison.json)
case "$TREND" in
"Improving")
echo "🎉 Success! Debt is decreasing"
echo "Improvements: $IMPROVEMENT_COUNT"
jq '.improvements[] | "\(.location): \(.improvement_type)"' comparison.json
;;
"Stable")
echo "➡️ Stable - no significant debt change"
echo "Improvements: $IMPROVEMENT_COUNT"
echo "Regressions: $REGRESSION_COUNT"
;;
"Regressing")
echo "❌ Warning! Debt is increasing"
echo "New critical items: $REGRESSION_COUNT"
jq '.regressions[] | "\(.location): \(.score) (\(.debt_type))"' comparison.json
;;
esac
# Check if target improved (if target was specified)
if jq -e '.target_item' comparison.json > /dev/null; then
TARGET_STATUS=$(jq -r '.target_item.status' comparison.json)
echo "Target Status: $TARGET_STATUS"
fi
Troubleshooting
Understanding Debt Trends
Issue: Confused about what the debt trend means
Solution: Check the summary.overall_debt_trend field in comparison output:
- `Improving` - Total debt decreased by more than 5%
- `Stable` - Total debt changed by less than 5% (within normal variance)
- `Regressing` - Total debt increased by more than 5%
Check the trend:
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
DEBT_CHANGE=$(jq '.project_health.changes.debt_score_change_pct' comparison.json)
echo "Debt Trend: $TREND (${DEBT_CHANGE}% change)"
No Improvements Detected
Issue: Made changes but comparison shows no improvements
Possible causes:
- Changes didn’t reduce debt scores by ≥30% (improvement threshold)
- Refactored items had scores <60.0 (not tracked as critical)
- Changes were neutral (e.g., code moved but complexity unchanged)
Solution: Check the details:
# Compare before/after project health
jq '.project_health.before' result.json
jq '.project_health.after' result.json
# Look for critical items in before analysis
jq '.items[] | select(.unified_score.final_score >= 60.0)' before.json
JSON Parsing Errors
Problem: Error parsing JSON file
Solutions:
- Verify the file is valid JSON: `jq . before.json`
- Ensure the file is a debtmap analysis output
- Check file permissions and path
- Regenerate the analysis if corrupted
Understanding Target Status Values
| Status | Meaning | Action Required |
|---|---|---|
| `Resolved` | Item eliminated completely | ✅ Celebrate! Item no longer exists |
| `Improved` | Score reduced significantly | ✅ Good progress, verify metrics improved |
| `Unchanged` | No significant change | ⚠️ Review approach, may need different strategy |
| `Regressed` | Item got worse | ❌ Investigate and fix before merging |
| `NotFoundBefore` | Item didn’t exist before | ℹ️ New code, ensure quality is acceptable |
| `NotFound` | Item not found in either | ⚠️ Check target location format |
Handling Missing Files
Problem: No such file or directory
Solutions:
# Verify files exist
ls -la before.json after.json
# Check current directory
pwd
# Use absolute paths if needed
debtmap compare \
--before /absolute/path/to/before.json \
--after /absolute/path/to/after.json
Interpreting Edge Cases
All Items Resolved:
{
"summary": {
"resolved_count": 25,
"new_critical_count": 0,
"overall_debt_trend": "Improving"
},
"project_health": {
"after": {
"total_items": 0,
"critical_items": 0
}
}
}
All debt items resolved - excellent work!
New Project (Empty Before):
{
"summary": {
"new_critical_count": 15,
"resolved_count": 0,
"overall_debt_trend": "Stable"
},
"project_health": {
"before": {
"total_items": 0
}
}
}
New project or first analysis - establish baseline for future comparisons.
No Changes:
{
"summary": {
"overall_debt_trend": "Stable",
"new_critical_count": 0,
"resolved_count": 0
},
"improvements": [],
"regressions": []
}
No changes detected - either no code changes or changes were neutral to debt.
Related Documentation
- Validation Gates - Quality gate thresholds and validation strategies
- Prodigy Integration - Automated refactoring workflows with compare validation
- Output Formats - Understanding analysis JSON structure and formatters
- Scoring Strategies - How debt scores are calculated (critical ≥ 60.0, high priority ≥ 40.0)
- Threshold Configuration - Configuring severity thresholds for your project
Summary
The compare command provides validation for refactoring efforts:
Current Capabilities:
- ✅ Target location tracking with intelligent fuzzy matching (4 strategies: Exact, FunctionLevel, ApproximateName, FileLevel)
- ✅ Detect regressions (new critical items with score ≥ 60.0)
- ✅ Track resolved items and improvements (≥30% score reduction, auto-detected)
- ✅ Detailed per-item improvement metrics with before/after scores
- ✅ Multiple output formats (JSON, Markdown, Terminal, HTML)
- ✅ Implementation plan parsing for target extraction (via `--plan` flag)
- ✅ Project-wide health metrics and debt trends
- ✅ Automate quality gates in CI/CD pipelines
Note: For improvement classification, Resolved and ScoreReduced types are detected automatically; ComplexityReduced and CoverageImproved are defined for future use.
Use the compare command regularly to maintain visibility into your codebase’s technical health and ensure continuous improvement.
Configuration
Debtmap is highly configurable through a TOML configuration file. This section covers all configuration options and best practices for tuning debtmap for your codebase.
Quick Start
Create a .debtmap.toml file in your project root:
[scoring]
coverage = 0.50
complexity = 0.35
dependency = 0.15
[thresholds]
complexity = 15
lines = 80
coverage = 0.8
[languages]
rust = true
python = true
javascript = true
Configuration Topics
- Scoring Configuration - Tune debt scoring weights and role multipliers
- Thresholds Configuration - Set complexity and coverage thresholds
- Language Configuration - Enable/disable language support and tune language-specific settings
- Display and Output - Configure output formats and display options
- Advanced Options - Advanced configuration for power users
- Best Practices - Guidelines for effective configuration
Configuration File Location
Debtmap searches for configuration in the following order:
1. Path specified with `--config` flag
2. `.debtmap.toml` in current directory
3. `.debtmap.toml` in git repository root
4. Built-in defaults
Validation
Debtmap validates your configuration on startup. Invalid configurations will produce clear error messages:
$ debtmap analyze .
Error: Invalid configuration
- scoring.coverage + scoring.complexity + scoring.dependency must equal 1.0
- Current sum: 1.10
Default Values
All configuration options have sensible defaults. You only need to specify values you want to override from the defaults documented in each section.
Scoring Configuration
Debtmap uses a weighted scoring model to calculate technical debt priority. This chapter explains how to configure scoring weights, role multipliers, and related settings that affect how functions and files are prioritized.
Quick Reference
Here’s a quick overview of all scoring defaults (from src/config/scoring.rs):
| Configuration | Default Value | Purpose |
|---|---|---|
| **Scoring Weights** | | |
| `coverage` | 0.50 | Weight for test coverage gaps |
| `complexity` | 0.35 | Weight for code complexity |
| `dependency` | 0.15 | Weight for dependency criticality |
| **Role Multipliers** | | |
| `pure_logic` | 1.2 | Prioritize pure computation |
| `orchestrator` | 0.8 | Reduce for delegation functions |
| `io_wrapper` | 0.7 | Reduce for I/O wrappers |
| `entry_point` | 0.9 | Slight reduction for main/CLI |
| `pattern_match` | 0.6 | Reduce for pattern matching |
| `debug` | 0.3 | Debug/diagnostic functions |
| `unknown` | 1.0 | No adjustment |
| **Role Coverage Weights** | | |
| `entry_point` | 0.6 | Reduce coverage penalty |
| `orchestrator` | 0.8 | Reduce coverage penalty |
| `pure_logic` | 1.0 | No reduction |
| `io_wrapper` | 0.5 | Reduce for I/O wrappers |
| `pattern_match` | 1.0 | Standard penalty |
| `debug` | 0.3 | Lowest coverage expectations |
| `unknown` | 1.0 | Standard penalty |
| **Role Multiplier Clamping** | | |
| `clamp_min` | 0.3 | Minimum multiplier |
| `clamp_max` | 1.8 | Maximum multiplier |
| `enable_clamping` | true | Enable clamping |
Scoring Weights
The [scoring] section controls how different factors contribute to the overall debt score. Debtmap uses a weighted sum model where weights must sum to 1.0.
[scoring]
coverage = 0.50 # Weight for test coverage gaps (default: 0.50)
complexity = 0.35 # Weight for code complexity (default: 0.35)
dependency = 0.15 # Weight for dependency criticality (default: 0.15)
Active weights (used in scoring):
- `coverage` - Prioritizes untested code (default: 0.50)
- `complexity` - Identifies complex areas (default: 0.35)
- `dependency` - Considers impact radius (default: 0.15)
Unused weights (reserved for future features):
- `semantic` - Not currently used (default: 0.00)
- `security` - Not currently used (default: 0.00)
- `organization` - Not currently used (default: 0.00)
Validation rules:
- All weights must be between 0.0 and 1.0
- Active weights (coverage + complexity + dependency) must sum to 1.0 (±0.001 tolerance)
- If weights don’t sum to 1.0, they will be automatically normalized
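For example, weights of coverage = 0.60, complexity = 0.42, and dependency = 0.18 sum to 1.20 and would be normalized to 0.50, 0.35, and 0.15.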
Example - Prioritize complexity over coverage:
[scoring]
coverage = 0.30
complexity = 0.55
dependency = 0.15
Source: src/config/scoring.rs:14-40 (ScoringWeights)
Complexity Weights
The [complexity_weights] section controls how cyclomatic and cognitive complexity are combined in the final scoring:
[complexity_weights]
cyclomatic = 0.3 # Weight for cyclomatic complexity (default: 0.3)
cognitive = 0.7 # Weight for cognitive complexity (default: 0.7)
max_cyclomatic = 50.0 # Maximum cyclomatic for normalization (default: 50.0)
max_cognitive = 100.0 # Maximum cognitive for normalization (default: 100.0)
Why cognitive complexity is weighted higher:
- Cognitive complexity correlates better with bug density
- Cyclomatic complexity is a proxy for test cases needed
- The 70/30 split balances maintainability with testability
Source: src/config/scoring.rs:335-381 (ComplexityWeightsConfig)
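As a rough sketch of how the weighted combination works (assuming each metric is normalized by its configured maximum; the exact formula in debtmap's source may differ):

```rust
/// Combine cyclomatic and cognitive complexity into a single 0.0-1.0
/// score using the [complexity_weights] defaults shown above.
fn combined_complexity(cyclomatic: f64, cognitive: f64) -> f64 {
    let (w_cyclo, w_cog) = (0.3, 0.7); // cyclomatic / cognitive weights
    let (max_cyclo, max_cog) = (50.0, 100.0); // normalization caps
    w_cyclo * (cyclomatic / max_cyclo).min(1.0)
        + w_cog * (cognitive / max_cog).min(1.0)
}
```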
Role Multipliers
Role multipliers adjust complexity scores based on a function’s semantic role:
[role_multipliers]
pure_logic = 1.2 # Prioritize pure computation (default: 1.2)
orchestrator = 0.8 # Reduce for delegation functions (default: 0.8)
io_wrapper = 0.7 # Reduce for I/O wrappers (default: 0.7)
entry_point = 0.9 # Slight reduction for main/CLI (default: 0.9)
pattern_match = 0.6 # Reduce for pattern matching (default: 0.6)
debug = 0.3 # Debug/diagnostic functions (default: 0.3)
unknown = 1.0 # No adjustment (default: 1.0)
These multipliers help reduce false positives by recognizing that different function types have naturally different complexity levels. The debug role has the lowest multiplier (0.3) since debug and diagnostic functions typically have low testing priority.
Source: src/config/scoring.rs:206-333 (RoleMultipliers)
Role-Based Scoring Configuration
DebtMap uses a two-stage role adjustment mechanism to accurately score functions based on their architectural role and testing strategy. This section explains how to configure both stages.
Stage 1: Role Coverage Weights
The first stage adjusts how much coverage gaps penalize different function types. This recognizes that not all functions need the same level of unit test coverage.
Configuration (.debtmap.toml under [scoring.role_coverage_weights]):
[scoring.role_coverage_weights]
entry_point = 0.6 # Reduce coverage penalty (often integration tested)
orchestrator = 0.8 # Reduce coverage penalty (tested via higher-level tests)
pure_logic = 1.0 # Pure logic should have unit tests, no reduction (default: 1.0)
io_wrapper = 0.5 # I/O wrappers are integration tested (default: 0.5)
pattern_match = 1.0 # Standard penalty
debug = 0.3 # Debug functions have lowest coverage expectations (default: 0.3)
unknown = 1.0 # Standard penalty (default behavior)
Rationale:
| Function Role | Weight | Why This Value? |
|---|---|---|
| Entry Point | 0.6 | CLI handlers, HTTP routes, main functions are integration tested, not unit tested |
| Orchestrator | 0.8 | Coordination functions tested via higher-level tests |
| Pure Logic | 1.0 | Core business logic should have unit tests (default: 1.0) |
| I/O Wrapper | 0.5 | File/network operations tested via integration tests (default: 0.5) |
| Pattern Match | 1.0 | Standard coverage expectations |
| Debug | 0.3 | Debug/diagnostic functions have lowest testing priority (default: 0.3) |
| Unknown | 1.0 | Default when role cannot be determined |
Example Impact:
# Emphasize pure logic testing strongly
[scoring.role_coverage_weights]
pure_logic = 1.5 # 50% higher penalty for untested logic
entry_point = 0.5 # 50% lower penalty for untested entry points
io_wrapper = 0.4 # 60% lower penalty for untested I/O
# Conservative approach (smaller adjustments)
[scoring.role_coverage_weights]
pure_logic = 1.1 # Only 10% increase
entry_point = 0.9 # Only 10% decrease
How It Works:
When a function has 0% coverage:
- Entry Point (weight 0.6): Gets 60% penalty instead of 100% penalty
- Pure Logic (weight 1.0): Gets 100% penalty (standard emphasis on testing)
- I/O Wrapper (weight 0.5): Gets 50% penalty
This prevents entry points from dominating the priority list due to low unit test coverage while emphasizing the importance of testing pure business logic.
Source: src/config/scoring.rs:383-455 (RoleCoverageWeights)
Stage 2: Role Multiplier with Clamping
The second stage applies a final role-based multiplier to reflect architectural importance. This multiplier is clamped by default to prevent extreme score variations.
Configuration (.debtmap.toml under [scoring.role_multiplier]):
[scoring.role_multiplier]
clamp_min = 0.3 # Minimum multiplier (default: 0.3)
clamp_max = 1.8 # Maximum multiplier (default: 1.8)
enable_clamping = true # Enable clamping (default: true)
Parameters:
| Parameter | Default | Description |
|---|---|---|
| `clamp_min` | 0.3 | Minimum allowed multiplier - prevents functions from becoming invisible |
| `clamp_max` | 1.8 | Maximum allowed multiplier - prevents extreme score spikes |
| `enable_clamping` | true | Whether to apply clamping (disable for prototyping only) |
Clamp Range Rationale:
Default [0.3, 1.8]: Balances differentiation with stability
- Lower bound (0.3): I/O wrappers still contribute 30% of their base score
  - Prevents them from becoming invisible in the priority list
  - Ensures simple wrappers aren’t completely ignored
- Upper bound (1.8): Critical functions get at most 180% of base score
  - Prevents one complex function from dominating the entire list
  - Maintains balanced prioritization across different issues
When to Adjust Clamp Range:
# Wider range for more differentiation
[scoring.role_multiplier]
clamp_min = 0.2 # Allow more reduction
clamp_max = 2.5 # Allow more emphasis
# Narrower range for more stability
[scoring.role_multiplier]
clamp_min = 0.5 # Less reduction
clamp_max = 1.5 # Less emphasis
# Disable clamping (not recommended for production)
[scoring.role_multiplier]
enable_clamping = false # Allow unclamped multipliers
# Warning: May cause unstable prioritization
When to Disable Clamping:
- Prototyping: Testing extreme multiplier values for custom scoring strategies
- Special cases: Very specific project needs requiring wide multiplier ranges
- Not recommended for production use as it can lead to unstable prioritization
Example Impact:
Without clamping:
Function: critical_business_logic (Pure Logic)
Base Score: 45.0
Role Multiplier: 2.5 (unclamped)
Final Score: 112.5 (dominates entire list)
With clamping (default):
Function: critical_business_logic (Pure Logic)
Base Score: 45.0
Role Multiplier: 1.8 (clamped from 2.5)
Final Score: 81.0 (high priority, but balanced)
Source: src/config/scoring.rs:457-493 (RoleMultiplierConfig)
Complete Example Configuration
Here’s a complete example showing both stages configured together:
# Stage 1: Coverage weight adjustments
[scoring.role_coverage_weights]
pure_logic = 1.0 # Pure logic should have unit tests (default: 1.0)
entry_point = 0.6 # Reduce penalty for integration-tested entry points
orchestrator = 0.8 # Partially reduce penalty for orchestrators
io_wrapper = 0.5 # I/O wrappers are integration tested (default: 0.5)
pattern_match = 1.0 # Standard
debug = 0.3 # Debug functions have lowest coverage expectations (default: 0.3)
unknown = 1.0 # Standard
# Stage 2: Role multiplier with clamping
[scoring.role_multiplier]
clamp_min = 0.3 # I/O wrappers contribute at least 30%
clamp_max = 1.8 # Critical functions get at most 180%
enable_clamping = true # Keep clamping enabled for stability
How the Two Stages Work Together
The two-stage approach ensures role-based coverage adjustments and architectural importance multipliers work independently:
Example Workflow:
1. Calculate base score from complexity (10) and dependencies (5)
-> Base = 15.0
2. Stage 1: Apply coverage weight based on role (Entry Point, weight 0.6)
   -> Coverage penalty reduced from 1.0 to 0.6 (penalty × weight)
   -> Preliminary score = 15.0 * 0.6 = 9.0
3. Stage 2: Apply clamped role multiplier (Entry Point, multiplier 1.2)
   -> Clamped to [0.3, 1.8] -> stays 1.2
   -> Final score = 9.0 * 1.2 = 10.8
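A minimal sketch of the two-stage adjustment traced above (structure and names are illustrative, not debtmap's actual API):

```rust
/// Stage 1 scales the coverage penalty by the role's coverage weight;
/// Stage 2 applies the role multiplier, clamped to [0.3, 1.8].
fn two_stage_score(
    base_score: f64,           // from complexity + dependencies
    coverage_gap: f64,         // 1.0 = completely untested
    role_coverage_weight: f64, // Stage 1 weight, e.g. 0.6 for entry points
    role_multiplier: f64,      // Stage 2 multiplier, e.g. 1.2
) -> f64 {
    let preliminary = base_score * (coverage_gap * role_coverage_weight);
    preliminary * role_multiplier.clamp(0.3, 1.8)
}

fn main() {
    // Entry point, fully untested: 15.0 * (1.0 * 0.6) * 1.2 = 10.8
    println!("{}", two_stage_score(15.0, 1.0, 0.6, 1.2));
}
```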
Key Benefits:
- Coverage adjustments don’t interfere with role multiplier
- Both mechanisms contribute independently to final score
- Clamping prevents instability from extreme values
- Configuration flexibility for different project needs
Verification
To see how role-based adjustments affect your codebase:
# Show detailed scoring breakdown
debtmap analyze . --verbose
# Look for lines like:
# Coverage Weight: 0.6 (Entry Point adjustment)
# Adjusted Coverage Penalty: 0.4 (reduced from 1.0)
# Role Multiplier: 1.2 (clamped from 1.5)
For more details on how role-based adjustments reduce false positives, see the Role-Based Adjustments section in the Scoring Strategies guide.
Score Normalization
The [normalization] section controls how raw scores are normalized to a 0-10 scale:
[normalization]
linear_threshold = 10.0 # Use linear scaling below this value
logarithmic_threshold = 100.0 # Use logarithmic scaling above this value
sqrt_multiplier = 3.33 # Multiplier for square root scaling
log_multiplier = 10.0 # Multiplier for logarithmic scaling
show_raw_scores = true # Show both raw and normalized scores
Normalization ensures scores are comparable across different codebases and prevents extreme outliers from dominating the results.
Source: src/config/scoring.rs:557-610 (NormalizationConfig)
Context Multipliers
The [context_multipliers] section dampens scores for non-production code (spec 191):
[context_multipliers]
examples = 0.1 # 90% reduction for example files
tests = 0.2 # 80% reduction for test files
benchmarks = 0.3 # 70% reduction for benchmarks
build_scripts = 0.3 # 70% reduction for build scripts
documentation = 0.1 # 90% reduction for documentation
enable_context_dampening = false # Disabled by default (use --context flag)
When enabled with --context, these multipliers reduce false positive urgency scores for code that doesn’t represent production complexity.
Source: src/config/scoring.rs:612-676 (ContextMultipliers)
Data Flow Scoring
The [data_flow_scoring] section configures analysis of data flow patterns (spec 218):
[data_flow_scoring]
enabled = true # Enable data flow scoring
purity_weight = 0.4 # Weight for function purity
refactorability_weight = 0.3 # Weight for refactorability
pattern_weight = 0.3 # Weight for pattern recognition
Data flow scoring rewards functions with:
- Pure data transformations (no side effects)
- High refactorability potential
- Recognized functional patterns (map, filter, fold)
Source: src/config/scoring.rs:678-724 (DataFlowScoringConfig)
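The weighted combination is straightforward; here is a sketch using the default weights above (illustrative, not the actual implementation):

// Each input is a 0.0-1.0 signal; the weights mirror the TOML defaults.
fn data_flow_score(purity: f64, refactorability: f64, pattern: f64) -> f64 {
    0.4 * purity + 0.3 * refactorability + 0.3 * pattern
}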
Rebalanced Scoring Presets
The [rebalanced_scoring] section allows using predefined presets or custom weights (spec 136):
[rebalanced_scoring]
preset = "balanced" # Preset: balanced, quality-focused, size-focused, test-coverage
# Or override with custom weights:
# complexity_weight = 0.35
# coverage_weight = 0.30
# structural_weight = 0.15
# size_weight = 0.10
# smell_weight = 0.10
Available Presets:
| Preset | Focus | Best For |
|---|---|---|
| balanced | Even distribution | General use |
| quality-focused | Complexity + coverage | Mature projects |
| size-focused | File/function size | Refactoring projects |
| test-coverage | Coverage gaps | Test improvement efforts |
Source: src/config/scoring.rs:495-554 (RebalancedScoringConfig)
Related Documentation
- Scoring Strategies - In-depth explanation of scoring algorithms
- Role-Based Adjustments - How semantic roles affect scoring
- Thresholds Configuration - Configure when code is flagged as debt
- Tiered Prioritization - How scores map to priority tiers
Thresholds Configuration
This subsection covers threshold configuration in Debtmap, including basic thresholds, minimum thresholds for filtering, and validation thresholds for CI/CD pipelines.
Overview
Thresholds control what gets flagged as technical debt. Debtmap provides multiple threshold categories:
- Basic thresholds - Core complexity and size limits
- Minimum thresholds - Filter out trivial functions
- Validation thresholds - CI/CD quality gates
- Coverage expectations - Role-based test coverage requirements
Basic Thresholds
Basic thresholds define when code is flagged as technical debt. Configure them in the [thresholds] section of .debtmap.toml.
Source: src/config/thresholds.rs:83-118 (ThresholdsConfig)
[thresholds]
complexity = 10 # Cyclomatic complexity threshold (default: 10)
duplication = 50 # Duplication threshold in lines (default: 50)
max_file_length = 500 # Maximum lines per file (default: 500)
max_function_length = 50 # Maximum lines per function (default: 50)
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| complexity | u32 | 10 | Cyclomatic complexity threshold |
| duplication | u32 | 50 | Minimum duplicate lines to flag |
| max_file_length | usize | 500 | Maximum lines per file |
| max_function_length | usize | 50 | Maximum lines per function |
CLI Override
You can override basic thresholds from the command line:
# Override cyclomatic complexity threshold
debtmap analyze . --threshold-complexity 15
# Override duplication threshold
debtmap analyze . --threshold-duplication 30
# Combine multiple threshold flags
debtmap analyze . --threshold-complexity 15 --threshold-duplication 30
Minimum Thresholds
Minimum thresholds filter out trivial functions that aren’t significant technical debt. This helps focus analysis on meaningful issues.
Source: src/config/thresholds.rs:90-109 (ThresholdsConfig)
[thresholds]
# Filter items below these scores
minimum_debt_score = 2.0 # Only show debt score >= 2.0 (default: none)
minimum_risk_score = 2.0 # Only show risk score >= 2.0 (default: none)
min_score_threshold = 3.0 # Hide LOW severity items (default: 3.0)
# Complexity minimums (filter out simple functions)
minimum_cyclomatic_complexity = 3 # Ignore cyclomatic < 3 (default: none)
minimum_cognitive_complexity = 5 # Ignore cognitive < 5 (default: none)
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| minimum_debt_score | f64 | None | Minimum debt score to include in results |
| minimum_risk_score | f64 | None | Minimum risk score (0-10) to include |
| min_score_threshold | f64 | 3.0 | Score threshold for recommendations |
| minimum_cyclomatic_complexity | u32 | None | Minimum cyclomatic to analyze |
| minimum_cognitive_complexity | u32 | None | Minimum cognitive to analyze |
Use Cases
Focus on High-Priority Issues:
[thresholds]
minimum_debt_score = 5.0 # Only show significant debt
minimum_risk_score = 4.0 # Only show meaningful risk
Legacy Codebase (Reduce Noise):
[thresholds]
minimum_cyclomatic_complexity = 10 # Ignore moderate complexity
minimum_cognitive_complexity = 15 # Focus on worst offenders
Validation Thresholds
Validation thresholds are used by the debtmap validate command to enforce quality gates in CI/CD pipelines.
Source: src/config/thresholds.rs:120-196 (ValidationThresholds)
Primary Quality Metrics (Scale-Independent)
These metrics work for codebases of any size:
[thresholds.validation]
# Primary quality metrics
max_average_complexity = 10.0 # Maximum average complexity per function (default: 10.0)
max_debt_density = 50.0 # Debt items per 1000 LOC (default: 50.0)
max_codebase_risk_score = 7.0 # Overall risk score 0-10 (default: 7.0)
# Optional metrics
min_coverage_percentage = 0.0 # Minimum test coverage % (default: 0.0 - disabled)
# Safety net
max_total_debt_score = 10000 # Maximum total debt (default: 10000)
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| max_average_complexity | f64 | 10.0 | Maximum average complexity per function |
| max_debt_density | f64 | 50.0 | Maximum debt per 1000 LOC (scale-independent) |
| max_codebase_risk_score | f64 | 7.0 | Maximum overall risk score (0-10) |
| min_coverage_percentage | f64 | 0.0 | Minimum required test coverage (0 = disabled) |
| max_total_debt_score | u32 | 10000 | Safety ceiling for total debt |
Using Validation in CI/CD
# Run validation (exits with error if thresholds exceeded)
debtmap validate . --config .debtmap.toml
# Example CI/CD pipeline
debtmap analyze . --output report.json
debtmap validate . --config .debtmap.toml || exit 1
Deprecated Fields
The following fields are deprecated since v0.3.0 and will be removed in v1.0:
| Deprecated Field | Replacement |
|---|---|
| max_high_complexity_count | Use max_debt_density |
| max_debt_items | Use max_debt_density |
| max_high_risk_functions | Use max_debt_density + max_codebase_risk_score |
Migration Example:
# Old (deprecated)
[thresholds.validation]
max_debt_items = 100
# New (scale-independent)
[thresholds.validation]
max_debt_density = 50.0 # Works for any codebase size
Coverage Expectations
Coverage expectations define role-based test coverage requirements. Different function types have different testing strategies.
Source: src/priority/scoring/coverage_expectations.rs:103-173
# High expectations for pure functions
[coverage_expectations.pure]
min = 90.0 # Minimum acceptable coverage
target = 95.0 # Target/ideal coverage
max = 100.0 # Maximum meaningful coverage
# Moderate expectations for I/O operations
[coverage_expectations.io_operations]
min = 60.0
target = 70.0
max = 80.0
# Low expectations for debug/diagnostic code
[coverage_expectations.debug]
min = 20.0
target = 30.0
max = 40.0
Default Coverage Expectations by Role
| Function Role | Min | Target | Max | Rationale |
|---|---|---|---|---|
| Pure Logic | 90% | 95% | 100% | Easy to unit test, should be comprehensive |
| Business Logic | 80% | 90% | 95% | Core functionality requires thorough testing |
| Validation | 85% | 92% | 98% | Input validation is critical |
| State Management | 75% | 85% | 90% | State transitions need coverage |
| Error Handling | 70% | 80% | 90% | Error paths should be tested |
| Orchestration | 65% | 75% | 85% | Tested via higher-level tests |
| Configuration | 60% | 70% | 80% | Often integration tested |
| I/O Operations | 60% | 70% | 80% | Often integration tested |
| Initialization | 50% | 65% | 75% | Setup code with less testing priority |
| Performance | 40% | 50% | 60% | Optimization code with lower priority |
| Debug | 20% | 30% | 40% | Diagnostic code has lowest priority |
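To illustrate how a role’s expectation band might translate into a coverage gap, here is a hypothetical sketch; the real logic in coverage_expectations.rs is more nuanced:

// Hypothetical: measure how far actual coverage falls below a role's target,
// normalized so roles with lower targets produce proportionally smaller gaps.
struct CoverageExpectation { min: f64, target: f64, max: f64 }

fn coverage_gap(actual: f64, exp: &CoverageExpectation) -> f64 {
    if actual >= exp.target {
        0.0 // at or above target: no gap to report
    } else {
        (exp.target - actual) / exp.target
    }
}

// A pure-logic function at 60% coverage against a 95% target:
// coverage_gap(60.0, &CoverageExpectation { min: 90.0, target: 95.0, max: 100.0 }) ≈ 0.37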
Complete Coverage Configuration
# Strict quality standards
[coverage_expectations.pure]
min = 95.0
target = 98.0
max = 100.0
[coverage_expectations.business_logic]
min = 90.0
target = 95.0
max = 98.0
[coverage_expectations.validation]
min = 92.0
target = 96.0
max = 100.0
# More lenient for I/O
[coverage_expectations.io_operations]
min = 50.0
target = 65.0
max = 75.0
Complexity Thresholds
For fine-grained control over function complexity detection, use the [complexity_thresholds] section.
Source: src/complexity/threshold_manager.rs:16-58 (ComplexityThresholds)
[complexity_thresholds]
# Core complexity metrics
minimum_total_complexity = 8 # Sum of cyclomatic + cognitive (default: 8)
minimum_cyclomatic_complexity = 5 # Decision points (default: 5)
minimum_cognitive_complexity = 10 # Mental effort to understand (default: 10)
# Structural complexity metrics
minimum_match_arms = 4 # Match/switch arms before flagging (default: 4)
minimum_if_else_chain = 3 # If-else chain length before flagging (default: 3)
minimum_function_length = 20 # Minimum lines before flagging (default: 20)
# Role-based multipliers (adjust thresholds by function role)
entry_point_multiplier = 1.5 # main(), handlers - more lenient (default: 1.5)
core_logic_multiplier = 1.0 # Business logic - standard (default: 1.0)
utility_multiplier = 0.8 # Getters, setters - stricter (default: 0.8)
test_function_multiplier = 2.0 # Test functions - most lenient (default: 2.0)
Threshold Presets
Use --threshold-preset to apply predefined configurations:
# Strict preset (new projects, libraries)
debtmap analyze . --threshold-preset strict
# Balanced preset (default - typical projects)
debtmap analyze . --threshold-preset balanced
# Lenient preset (legacy codebases)
debtmap analyze . --threshold-preset lenient
Preset Comparison:
| Threshold | Strict | Balanced | Lenient |
|---|---|---|---|
| Cyclomatic Complexity | 3 | 5 | 10 |
| Cognitive Complexity | 7 | 10 | 20 |
| Total Complexity | 5 | 8 | 15 |
| Function Length | 15 | 20 | 50 |
| Match Arms | 3 | 4 | 8 |
| If-Else Chain | 2 | 3 | 5 |
Source: src/complexity/threshold_manager.rs:120-148 (from_preset)
Role-Based Multiplier Examples
Role multipliers adjust ALL thresholds for different function types:
Entry Point with Balanced preset (1.5x multiplier):
- Cyclomatic threshold: 7.5 (5 x 1.5)
- Cognitive threshold: 15 (10 x 1.5)
- Total threshold: 12 (8 x 1.5)
Utility function with Balanced preset (0.8x multiplier):
- Cyclomatic threshold: 4 (5 x 0.8)
- Cognitive threshold: 8 (10 x 0.8)
- Total threshold: 6.4 (8 x 0.8)
Validation Rules
Debtmap validates thresholds to prevent misconfiguration:
Source: src/complexity/threshold_manager.rs:191-217
- Core complexity metrics must be > 0
- Role multipliers must be positive (> 0)
- Invalid configurations fall back to defaults with a warning
File Size Thresholds
Context-aware file size limits vary by file type to avoid unrealistic recommendations.
Source: src/config/thresholds.rs:293-375 (FileSizeThresholds)
[thresholds.file_size]
business_logic = 400 # Strict limit for business logic (default: 400)
test_code = 650 # Moderate limit for tests (default: 650)
declarative_config = 1200 # Lenient for config files (default: 1200)
generated_code = 5000 # Very lenient for generated (default: 5000)
proc_macro = 500 # Moderate-strict for macros (default: 500)
build_script = 300 # Strict for build scripts (default: 300)
min_lines_per_function = 3.0 # Safety threshold (default: 3.0)
File-Specific Overrides
Use glob patterns for file-specific thresholds:
[thresholds.file_size.overrides]
"src/generated/*.rs" = 10000 # Allow large generated files
"src/migrations/*.rs" = 2000 # Allow larger migration files
Related Topics
- Configuration Overview - Full configuration reference
- Threshold Configuration Guide - Extended guide with examples
- Scoring Strategies - How thresholds affect scoring
- Validation and Quality Gates - CI/CD integration
Language Configuration
Configure language-specific analysis behavior in debtmap.
Overview
Debtmap analyzes source code for technical debt and complexity issues. Language configuration allows you to:
- Enable or disable specific languages
- Toggle analysis features per language
- Configure language-specific detection behavior
Supported Languages
Full Support (AST-Based Analysis):
- Rust - Full AST parsing with tree-sitter
- Python - Full AST parsing with tree-sitter
Source: src/core/types.rs:9-12 (Language enum) and src/organization/language.rs:4-7
The Language enum in the codebase currently supports:
#![allow(unused)]
fn main() {
// From src/core/types.rs:9-12
pub enum Language {
Rust,
Python,
}
}
Language Detection: Debtmap determines file language by extension:
- .rs files are analyzed as Rust
- .py and .pyw files are analyzed as Python
Source: src/organization/language.rs:10-17
#![allow(unused)]
fn main() {
pub fn from_path(path: &Path) -> Option<Language> {
path.extension()
.and_then(|ext| ext.to_str())
.and_then(|ext| match ext {
"rs" => Some(Language::Rust),
"py" => Some(Language::Python),
_ => None,
})
}
}
Configuration Structure
The [languages] section in your debtmap.toml configures language analysis.
Source: src/config/languages.rs:4-22 (LanguagesConfig struct)
[languages]
enabled = ["rust", "python"]
[languages.rust]
detect_dead_code = false # Disabled by default for Rust
detect_complexity = true
detect_duplication = true
[languages.python]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
Language-Specific Features
Each language supports three configurable feature toggles:
Source: src/config/languages.rs:24-38 (LanguageFeatures struct)
| Feature | Description | Default |
|---|---|---|
| detect_dead_code | Identify unused code paths | true (except Rust) |
| detect_complexity | Calculate cyclomatic and cognitive complexity | true |
| detect_duplication | Find code duplication patterns | true |
Rust Configuration
[languages.rust]
detect_dead_code = false # Disabled: rustc already reports unused code
detect_complexity = true
detect_duplication = true
Why dead code detection is disabled for Rust: The Rust compiler (rustc) already provides excellent unused code warnings via #[warn(dead_code)]. Enabling debtmap’s dead code detection would duplicate these warnings without adding value.
Source: src/config/accessors.rs:104-111
#![allow(unused)]
fn main() {
Language::Rust => {
languages_config
.and_then(|lc| lc.rust.clone())
.unwrap_or(LanguageFeatures {
detect_dead_code: false, // Rust dead code detection disabled by default
detect_complexity: true,
detect_duplication: true,
})
}
}
Python Configuration
[languages.python]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
All features enabled by default for Python. Dead code detection is valuable since Python lacks a compiler phase that catches unused code.
Source: src/config/accessors.rs:113-115
Enabling Languages
Specify which languages to analyze with the enabled array:
[languages]
enabled = ["rust", "python"]
Note: The configuration structure supports JavaScript and TypeScript entries (see src/config/languages.rs:15-21), but the core Language enum and detection logic only implement Rust and Python. JavaScript/TypeScript support may be planned for future releases.
Feature Defaults
Source: src/config/languages.rs:40-61
#![allow(unused)]
fn main() {
impl Default for LanguageFeatures {
fn default() -> Self {
Self {
detect_dead_code: true, // default_detect_dead_code()
detect_complexity: true, // default_detect_complexity()
detect_duplication: true, // default_detect_duplication()
}
}
}
}
When no language-specific configuration is provided, debtmap uses these defaults (with the Rust dead code exception handled at the accessor level).
Using Language Configuration
Analyze Only Rust Code
[languages]
enabled = ["rust"]
[languages.rust]
detect_dead_code = false
detect_complexity = true
detect_duplication = true
Full Python Analysis
[languages]
enabled = ["python"]
[languages.python]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
Multi-Language Projects
[languages]
enabled = ["rust", "python"]
# Different settings per language
[languages.rust]
detect_dead_code = false # Trust rustc
detect_complexity = true
detect_duplication = true
[languages.python]
detect_dead_code = true # Python needs this
detect_complexity = true
detect_duplication = false # Skip if not needed
Complexity Calculations by Language
Debtmap uses language-specific complexity analyzers:
Source: src/complexity/languages/rust.rs and src/analyzers/implementations.rs
Rust Complexity
- Uses tree-sitter for AST parsing
- Calculates cyclomatic complexity from control flow
- Tracks cognitive complexity with nesting depth penalties
- Detects pattern match complexity with entropy analysis
Python Complexity
- Uses tree-sitter for AST parsing
- Handles Python-specific constructs (decorators, comprehensions)
- Tracks class method complexity
- Analyzes exception handling patterns
Integration with Other Configuration
Language configuration interacts with other debtmap settings:
Ignore Patterns
File exclusions in [ignore] apply before language detection:
[ignore]
patterns = [
"target/**", # Rust build output
"venv/**", # Python virtual environment
"*.min.js", # Would be skipped anyway (JS not supported)
]
See Thresholds Configuration for complexity thresholds that can be language-aware.
Classification
Semantic classification (function roles like “orchestrator”, “pure_logic”) operates independently of language but uses language-specific patterns for detection.
See Advanced Options for classification configuration.
API Reference
LanguagesConfig
Source: src/config/languages.rs:4-22
| Field | Type | Description |
|---|---|---|
| enabled | Vec<String> | Languages to analyze |
| rust | Option<LanguageFeatures> | Rust-specific settings |
| python | Option<LanguageFeatures> | Python-specific settings |
| javascript | Option<LanguageFeatures> | JavaScript settings (planned) |
| typescript | Option<LanguageFeatures> | TypeScript settings (planned) |
LanguageFeatures
Source: src/config/languages.rs:24-38
| Field | Type | Default | Description |
|---|---|---|---|
| detect_dead_code | bool | true | Enable dead code detection |
| detect_complexity | bool | true | Enable complexity analysis |
| detect_duplication | bool | true | Enable duplication detection |
Troubleshooting
Language Not Detected
If files aren’t being analyzed:
- Check that the file extension is recognized (.rs, .py, .pyw)
- Verify the language is in the enabled array
- Check that ignore patterns aren’t excluding the files
Missing Dead Code Warnings
For Rust: Enable dead code detection explicitly if you want debtmap to report it:
[languages.rust]
detect_dead_code = true # Override default
For Python: Ensure detect_dead_code = true (the default).
Unexpected Complexity Scores
Language-specific complexity varies due to:
- Different control flow constructs
- Pattern matching complexity (Rust)
- Exception handling (Python)
See Entropy Analysis for how entropy affects complexity scoring.
See Also
- Scoring Configuration - Configure how complexity translates to debt scores
- Thresholds Configuration - Set complexity thresholds
- Advanced Options - Classification and detection settings
Display and Output Configuration
This chapter covers configuration options that control how debtmap displays analysis results and formats output. These settings affect terminal rendering, verbosity levels, color modes, and evidence display for multi-signal classification.
Quick Reference
Key display and output configuration options (from src/config/display.rs):
| Configuration | Default | Purpose |
|---|---|---|
| Display Settings | | |
| tiered | true | Use tiered priority display |
| items_per_tier | 5 | Items shown per priority tier |
| verbosity | detailed | Output verbosity level |
| Output Settings | | |
| evidence_verbosity | minimal | Multi-signal evidence detail |
| min_confidence_warning | 0.80 | Minimum confidence for warnings |
| Color Mode | | |
| color | auto | Color output control |
| Formatting | | |
| show_splits | false | Show god object split recommendations |
| max_callers | 5 | Maximum callers to display |
| max_callees | 5 | Maximum callees to display |
Display Settings
The [display] section controls how analysis results are organized and presented. These settings are defined in the DisplayConfig struct (src/config/display.rs:56-68).
Basic Display Configuration
[display]
# Use tiered priority display for organizing results
tiered = true
# Maximum items to show per tier
items_per_tier = 5
# Verbosity level (summary, detailed, comprehensive)
verbosity = "detailed"
Verbosity Levels
The VerbosityLevel enum (src/config/display.rs:7-15) controls how much detail appears in output:
| Level | Description | Use Case |
|---|---|---|
| summary | Essential information only | Quick health checks |
| detailed | Includes module structure details | Normal development use |
| comprehensive | All available analysis data | Debugging and deep analysis |
Example:
[display]
# Summary mode - minimal output
verbosity = "summary"
[display]
# Comprehensive mode - all details
verbosity = "comprehensive"
The default is detailed, which provides a balance between information density and readability.
Tiered Priority Display
When tiered = true, debtmap organizes debt items by priority tier (Critical, High, Medium, Low) and limits output to items_per_tier items per tier:
[display]
tiered = true
items_per_tier = 10 # Show up to 10 items per tier
This prevents output from being overwhelming when analyzing large codebases with many debt items.
Output Format Configuration
The format field in [display] sets the default output format:
[display]
# Default output format (terminal, json, markdown, html)
format = "terminal"
Available formats (from src/cli/args.rs:574-583):
| Format | Description |
|---|---|
| terminal | Interactive colored output for terminals |
| json | Machine-readable structured data |
| markdown | Documentation-friendly reports |
| html | Interactive web dashboard |
| dot | Graphviz DOT format for dependency visualization |
See Output Formats for detailed format documentation.
Color and Terminal Options
Color Mode
The ColorMode enum (src/formatting/mod.rs:6-11) controls color output:
| Mode | Behavior |
|---|---|
| auto | Detect terminal color support automatically |
| always | Force colors on (even when piping) |
| never | Disable colors entirely |
[display]
# Color configuration is typically handled via CLI or environment
# but can be set in config
plain = false # false = colors enabled (when supported)
Environment Variable Controls
Debtmap respects standard environment variables for color control (src/formatting/mod.rs:67-90):
| Variable | Effect |
|---|---|
| NO_COLOR | If set, disables colors (per no-color.org) |
| CLICOLOR=0 | Disables colors |
| CLICOLOR_FORCE=1 | Forces colors even when not a TTY |
| TERM=dumb | Disables colors for dumb terminals |
Precedence (highest to lowest):
1. CLICOLOR_FORCE=1 - Forces colors on
2. NO_COLOR or CLICOLOR=0 - Disables colors
3. Terminal detection - Auto-detect based on TTY status
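The precedence rules are easy to mirror in code; a minimal sketch, not the actual implementation in src/formatting/mod.rs:

// Resolve color support following the precedence list above.
fn colors_enabled(is_tty: bool) -> bool {
    use std::env;
    if env::var_os("CLICOLOR_FORCE").is_some_and(|v| v == "1") {
        return true; // forced on, even when piping
    }
    if env::var_os("NO_COLOR").is_some()
        || env::var_os("CLICOLOR").is_some_and(|v| v == "0")
        || env::var_os("TERM").is_some_and(|v| v == "dumb")
    {
        return false; // explicitly disabled
    }
    is_tty // fall back to terminal detection
}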
Plain Mode
For environments without color support or when piping output:
[display]
plain = true # ASCII-only, no colors, no emoji
Or via CLI: debtmap analyze . --plain
Evidence Display Configuration
Multi-signal classification produces evidence that can be displayed at varying levels of detail. The OutputConfig struct (src/config/display.rs:122-145) controls evidence output.
Evidence Verbosity
The EvidenceVerbosity enum (src/config/display.rs:18-30) maps to -v flag counts:
| Level | -v Count | Description |
|---|---|---|
| minimal | 0 | Category and confidence only |
| standard | 1 | Signal summary |
| verbose | 2 | Detailed breakdown |
| very_verbose | 3 | All signals including low-weight ones |
[output]
# Set evidence verbosity in config
evidence_verbosity = "standard"
# Minimum confidence for showing warnings
min_confidence_warning = 0.80
Signal Filters
The SignalFilterConfig (src/config/display.rs:152-190) controls which classification signals appear in output:
[output.signal_filters]
# Show I/O detection signal
show_io_detection = true
# Show call graph signal
show_call_graph = true
# Show type signatures signal
show_type_signatures = true
# Show purity signal
show_purity = true
# Show framework signal
show_framework = true
# Show name heuristics signal (low weight, hidden by default)
show_name_heuristics = false
Signal Filter Defaults:
| Filter | Default | Purpose |
|---|---|---|
| show_io_detection | true | I/O operation detection signals |
| show_call_graph | true | Call graph analysis signals |
| show_type_signatures | true | Type signature analysis |
| show_purity | true | Function purity classification |
| show_framework | true | Framework pattern detection |
| show_name_heuristics | false | Low-weight naming heuristics |
Name heuristics are hidden by default because they are a low-weight fallback signal.
Formatting Configuration
The FormattingConfig struct (src/formatting/mod.rs:32-48) controls advanced formatting options.
Caller/Callee Display
Configure how call graph relationships are displayed:
[display.formatting]
# Maximum number of callers to display per function
max_callers = 5
# Maximum number of callees to display per function
max_callees = 5
# Show calls to external crates
show_external = false
# Show standard library calls
show_std_lib = false
These settings are defined in CallerCalleeConfig (src/config/classification.rs:7-23).
Show Splits
Enable detailed god object module split recommendations:
[display.formatting]
# Show detailed module split recommendations
show_splits = false
When enabled, debtmap provides specific recommendations for breaking down god objects into smaller modules. This is defined in FormattingConfig (src/formatting/mod.rs:36-37).
Complete Configuration Example
Here’s a complete [display] configuration section showing all options:
# Display and output configuration
[display]
# Output format (terminal, json, markdown, html)
format = "terminal"
# Verbosity level (0-3 or named: summary, detailed, comprehensive)
verbosity = "detailed"
# Use compact output format
compact = false
# Use summary format with tiered priority display
summary = false
# Enable tiered priority display
tiered = true
# Items per priority tier
items_per_tier = 5
# Group output by debt category
group_by_category = false
# Show complexity attribution details
show_attribution = false
# Detail level (summary, standard, comprehensive, debug)
detail_level = "standard"
# Disable TUI progress visualization
no_tui = false
# Plain output mode (ASCII-only, no colors, no emoji)
plain = false
# Formatting options
[display.formatting]
# Show dependency information (callers/callees)
show_dependencies = false
# Maximum callers to display
max_callers = 5
# Maximum callees to display
max_callees = 5
# Show external crate calls
show_external = false
# Show standard library calls
show_std_lib = false
# Show god object split recommendations
show_splits = false
# Evidence and output configuration
[output]
# Evidence verbosity for multi-signal classification
evidence_verbosity = "minimal"
# Minimum confidence for showing warnings (0.0-1.0)
min_confidence_warning = 0.80
# Signal filters for evidence display
[output.signal_filters]
show_io_detection = true
show_call_graph = true
show_type_signatures = true
show_purity = true
show_framework = true
show_name_heuristics = false
CLI Flag Overrides
Many display settings can be overridden via CLI flags:
| Config Option | CLI Flag |
|---|---|
| format | -f, --format |
| plain | --plain |
| verbosity | -v (repeatable) |
| summary | --summary |
| compact | --compact |
| show_dependencies | --show-dependencies |
| show_splits | --show-splits |
| tiered | --tiered |
Precedence: CLI flags override config file settings.
See Also
- Output Formats - Detailed format documentation
- Scoring Configuration - Configure scoring weights
- CLI Reference - Command-line options
Advanced Options
This subsection covers advanced configuration options in Debtmap, including entropy analysis, god object detection, context-aware false positive reduction, and parallel processing.
Overview
Debtmap provides several advanced analysis features that can be tuned for specific project needs:
- Entropy Analysis - Information theory-based complexity dampening
- God Object Detection - Detection of overly complex types and modules
- Context-Aware Detection - Smart false positive reduction based on code context
- Parallel Processing - Multi-threaded analysis for large codebases
Entropy Analysis
Entropy analysis uses information theory to identify repetitive code patterns that inflate complexity metrics. When code has low entropy (highly repetitive), its complexity score is dampened to reflect its true cognitive load.
Source: src/config/languages.rs:65-127 (EntropyConfig)
Configuration
Configure entropy analysis in the [entropy] section of .debtmap.toml:
[entropy]
enabled = true # Enable entropy-based scoring (default: true)
weight = 1.0 # Weight of entropy in adjustment (0.0-1.0, default: 1.0)
min_tokens = 20 # Minimum tokens for calculation (default: 20)
pattern_threshold = 0.7 # Pattern similarity threshold (0.0-1.0, default: 0.7)
entropy_threshold = 0.4 # Low entropy detection threshold (default: 0.4)
# Branch analysis
branch_threshold = 0.8 # Branch similarity threshold (default: 0.8)
# Reduction caps
max_repetition_reduction = 0.20 # Max 20% reduction for repetition (default: 0.20)
max_entropy_reduction = 0.15 # Max 15% reduction for low entropy (default: 0.15)
max_branch_reduction = 0.25 # Max 25% reduction for similar branches (default: 0.25)
max_combined_reduction = 0.30 # Max 30% total reduction cap (default: 0.30)
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable entropy-based scoring |
| weight | f64 | 1.0 | Weight of entropy in complexity adjustment |
| min_tokens | usize | 20 | Minimum tokens required for calculation |
| pattern_threshold | f64 | 0.7 | Threshold for pattern repetition detection |
| entropy_threshold | f64 | 0.4 | Threshold for low entropy detection |
| branch_threshold | f64 | 0.8 | Threshold for branch similarity detection |
| max_repetition_reduction | f64 | 0.20 | Maximum reduction for high repetition |
| max_entropy_reduction | f64 | 0.15 | Maximum reduction for low entropy |
| max_branch_reduction | f64 | 0.25 | Maximum reduction for similar branches |
| max_combined_reduction | f64 | 0.30 | Maximum combined reduction cap |
How Entropy Dampening Works
Source: src/complexity/entropy_core.rs:19-48 (EntropyAnalysis)
Entropy analysis calculates several metrics for each function:
- Token Entropy (entropy_score) - Shannon entropy of code tokens (0.0-1.0)
  - High entropy (>0.4): Unique, varied code patterns
  - Low entropy (<0.4): Repetitive patterns, triggers dampening
- Pattern Repetition (pattern_repetition) - How much code repeats (0.0-1.0)
  - High values indicate repeated code blocks
- Branch Similarity (branch_similarity) - Similarity between conditional branches
  - High values indicate similar match/if-else arms
- Dampening Factor - Applied to complexity (0.5-1.0)
  - 1.0 = no dampening (genuine complexity)
  - 0.5 = maximum dampening (very repetitive code)
Example Impact:
Function: format_match_arms (20 cyclomatic complexity)
Token Entropy: 0.3 (low - repetitive formatting)
Pattern Repetition: 0.8 (high - repeated patterns)
Dampening Factor: 0.7
Adjusted Complexity: 14 (20 × 0.7)
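The adjustment itself is a simple multiplication once the dampening factor is known; an illustrative sketch:

// Apply a dampening factor (0.5-1.0) to a raw complexity value, as in the
// example above; deriving the factor from the entropy signals is the hard part.
fn adjusted_complexity(raw_complexity: u32, dampening_factor: f64) -> u32 {
    (raw_complexity as f64 * dampening_factor).round() as u32
}

// adjusted_complexity(20, 0.7) == 14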
Use Cases
Reduce false positives from match statements:
[entropy]
enabled = true
pattern_threshold = 0.6 # More aggressive pattern detection
max_branch_reduction = 0.30 # Allow higher reduction for similar branches
Strict analysis (minimal dampening):
[entropy]
enabled = true
max_combined_reduction = 0.15 # Cap total reduction at 15%
Disable entropy analysis:
[entropy]
enabled = false
God Object Detection
God object detection identifies types and files that have grown too large, accumulating too many responsibilities. Debtmap detects three types:
Source: src/organization/god_object/core_types.rs:12-46 (DetectionType)
- GodClass - A single struct with >20 impl methods across multiple responsibilities
- GodFile - A file with >50 standalone functions and no struct definitions
- GodModule - A hybrid with both structs AND many standalone functions
Detection Thresholds
Source: src/organization/god_object/thresholds.rs:63-80 (GodObjectThresholds)
Default thresholds for detection:
| Threshold | Default | Description |
|---|---|---|
| max_methods | 20 | Maximum methods before flagging as GodClass |
| max_fields | 15 | Maximum fields before flagging |
| max_traits | 5 | Maximum trait implementations |
| max_lines | 1000 | Maximum lines of code |
| max_complexity | 200 | Maximum total complexity |
Fallback heuristics for non-Rust files (src/organization/god_object/heuristics.rs:20-22):
| Threshold | Value | Description |
|---|---|---|
| HEURISTIC_MAX_FUNCTIONS | 50 | Maximum functions in a file |
| HEURISTIC_MAX_LINES | 2000 | Maximum lines for heuristic detection |
| HEURISTIC_MAX_FIELDS | 30 | Maximum fields for heuristic detection |
Language-Specific Thresholds
Source: src/organization/god_object/thresholds.rs:84-102
Debtmap provides language-specific thresholds:
Rust (default):
max_methods: 20, max_fields: 15, max_traits: 5
max_lines: 1000, max_complexity: 200
Python (stricter):
max_methods: 15, max_fields: 10, max_traits: 3
max_lines: 500, max_complexity: 150
God Object Score Calculation
The god_object_score is calculated using a weighted algorithm that considers:
- Method count relative to threshold
- Field count relative to threshold
- Number of distinct responsibilities
- Lines of code
- Average complexity per method
A higher score indicates a more severe god object problem. Scores are used to prioritize which types/files to refactor first.
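The exact weights are internal, but the shape of the calculation can be sketched as threshold-relative ratios; the weights below are invented for illustration only:

// Hypothetical: each factor is expressed relative to its threshold so that
// exceeding a limit pushes its ratio above 1.0; higher totals = worse.
struct GodObjectLimits { max_methods: f64, max_fields: f64, max_lines: f64 }

fn god_object_score(
    methods: f64,
    fields: f64,
    responsibilities: f64,
    lines: f64,
    avg_complexity: f64,
    limits: &GodObjectLimits,
) -> f64 {
    methods / limits.max_methods
        + fields / limits.max_fields
        + lines / limits.max_lines
        + responsibilities / 3.0 // invented weight
        + avg_complexity / 10.0 // invented weight
}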
Viewing God Object Analysis
# Show detailed god object information
debtmap analyze . --show-god-objects
# Include split recommendations
debtmap analyze . --show-god-objects --verbose
Context-Aware Detection
Context-aware detection reduces false positives by adjusting severity based on code context. Test files, example code, and debug functions receive different treatment than production code.
Source: src/analyzers/context_aware.rs:18-35 (ContextAwareAnalyzer)
Enabling Context-Aware Detection
Context-aware detection is enabled by default. To disable it:
# Disable context-aware detection
debtmap analyze . --no-context-aware
# Or via environment variable
DEBTMAP_CONTEXT_AWARE=false debtmap analyze .
How Context Detection Works
Source: src/cli/setup.rs:56-61
When context-aware detection is enabled:
- File Type Detection - Identifies test files, examples, benchmarks
- Function Context Analysis - Detects function roles (entry point, debug, etc.)
- Rule-Based Adjustment - Applies severity adjustments based on context
Context Actions:
- Allow / Skip - Remove the debt item entirely
- Warn - Reduce severity by 2 levels
- ReduceSeverity(n) - Reduce severity by n levels
- Deny - Keep the item unchanged
Rule Actions
Source: src/analyzers/context_aware.rs:50-63 (process_rule_action)
| Action | Effect | Example Use |
|---|---|---|
| Allow | Filters out item | Ignore TODOs in test files |
| Skip | Filters out item | Skip complexity in examples |
| Warn | Reduces severity by 2 | Flag but deprioritize |
| ReduceSeverity(n) | Reduces severity by n | Custom adjustment |
| Deny | No change | Keep full severity |
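A minimal sketch of how these actions could be applied to an item’s severity; the enum mirrors the table, while the actual types live in context_aware.rs:

// Returns None when the item is filtered out, otherwise the new severity.
enum RuleAction { Allow, Skip, Warn, ReduceSeverity(u8), Deny }

fn apply_rule_action(action: RuleAction, severity: u8) -> Option<u8> {
    match action {
        RuleAction::Allow | RuleAction::Skip => None, // drop the item
        RuleAction::Warn => Some(severity.saturating_sub(2)),
        RuleAction::ReduceSeverity(n) => Some(severity.saturating_sub(n)),
        RuleAction::Deny => Some(severity), // keep full severity
    }
}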
Use Cases
Analyze only production code (strict mode):
# Disable context awareness - analyze everything equally
debtmap analyze . --no-context-aware
Default behavior (recommended):
# Context-aware is enabled by default
debtmap analyze .
# Test files, examples get reduced severity
Parallel Processing
Parallel processing enables multi-threaded analysis for faster results on large codebases.
Source: src/config/parallel.rs:36-57 (ParallelConfig)
Configuration
Configure parallel processing in .debtmap.toml:
[parallel]
enabled = true # Enable parallel processing (default: true)
max_concurrency = 8 # Maximum concurrent operations (default: num_cpus)
batch_size = 100 # Files per batch (default: 100)
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable parallel processing |
| max_concurrency | Option<usize> | None (all cores) | Maximum concurrent operations |
| batch_size | Option<usize> | 100 | Batch size for chunked processing |
CLI Flags
# Disable parallel processing (sequential mode)
debtmap analyze . --no-parallel
# Set specific number of worker threads
debtmap analyze . --jobs 4
# Use all available cores (default behavior)
debtmap analyze . --jobs 0
Source: src/cli/args.rs:461-464 (--jobs flag)
Batch Analysis Configuration
Source: src/config/parallel.rs:125-143 (BatchAnalysisConfig)
For advanced batch processing control:
[batch_analysis]
fail_fast = false # Stop at first error (default: false)
collect_timing = false # Track analysis duration (default: false)
[batch_analysis.parallelism]
enabled = true
max_concurrency = 4
batch_size = 50
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| fail_fast | bool | false | Stop on first error vs accumulate all |
| collect_timing | bool | false | Track per-file analysis duration |
| parallelism | ParallelConfig | default | Nested parallelism settings |
Performance Considerations
When to use parallel processing (default):
- Large codebases (>100 files)
- Multi-core systems
- CI/CD pipelines
When to disable parallel processing:
- Debugging analysis issues
- Memory-constrained environments
- Reproducible/deterministic output needed
# Sequential mode for debugging
debtmap analyze . --no-parallel
# Limited concurrency for memory constraints
debtmap analyze . --jobs 2
Batch Processing Modes
Source: src/config/parallel.rs:145-174 (BatchAnalysisConfig methods)
#![allow(unused)]
fn main() {
// Accumulating mode - collect all errors (default)
let accumulate_errors = BatchAnalysisConfig::accumulating();
// Fail-fast mode - stop at first error
let stop_early = BatchAnalysisConfig::fail_fast();
// With timing collection for profiling
let with_timing = BatchAnalysisConfig::default().with_timing();
// Sequential processing for debugging
let sequential = BatchAnalysisConfig::default().sequential();
}
Environment Variables
Several advanced options can be controlled via environment variables:
| Variable | Effect | Example |
|---|---|---|
| DEBTMAP_CONTEXT_AWARE | Enable/disable context-aware detection | DEBTMAP_CONTEXT_AWARE=false |
| DEBTMAP_JOBS | Set worker thread count | DEBTMAP_JOBS=4 |
Related Topics
- Thresholds Configuration - Configure detection thresholds
- Scoring Configuration - Configure scoring weights
- Parallel Processing - Detailed parallel processing guide
- Entropy Analysis - In-depth entropy analysis documentation
- God Object Detection - Detailed god object detection guide
Configuration Best Practices
This subsection provides practical configuration guidance for different project types and development contexts. Use these recommendations as starting points, then tune based on your specific codebase.
Overview
Effective Debtmap configuration depends on your project’s characteristics:
- Maturity level - New projects vs legacy codebases
- Quality standards - Strict quality gates vs gradual improvement
- Team context - Open source library vs internal tool
- Workflow integration - CI/CD pipeline vs local development
The key is choosing thresholds that surface actionable issues without overwhelming noise.
Configuration for Strict Quality Standards
Use strict configurations for new projects, libraries with public APIs, or teams committed to maintaining low technical debt.
Source: src/config/thresholds.rs:120-196 (ValidationThresholds)
Complete Strict Configuration
# Strict quality configuration for new/greenfield projects
# Prioritize coverage and early debt prevention
[scoring]
coverage = 0.60 # Emphasize test coverage
complexity = 0.30 # Moderate complexity weight
dependency = 0.10 # Low dependency weight
[thresholds]
complexity = 8 # Lower cyclomatic threshold
max_function_length = 30 # Enforce smaller functions
minimum_debt_score = 3.0 # Higher bar for flagging issues
[thresholds.validation]
max_average_complexity = 8.0 # Strict complexity limits
max_debt_density = 30.0 # Low debt density tolerance
max_codebase_risk_score = 6.0 # Stricter risk tolerance
min_coverage_percentage = 80.0 # Require 80% coverage
[complexity_thresholds]
minimum_cyclomatic_complexity = 3
minimum_cognitive_complexity = 7
minimum_function_length = 15
Key Principles for Strict Standards
- Start strict, loosen if needed - It’s easier to relax thresholds than tighten them later
- Enforce coverage early - Set min_coverage_percentage from day one
- Use validation in CI - Block merges that exceed thresholds
- Review high-scoring items weekly - Prevent debt accumulation
CLI Example
# Use strict preset for quick strictness
debtmap analyze . --threshold-preset strict
# Validate with strict configuration
debtmap validate . --config .debtmap.toml
Configuration for Legacy Codebases
Legacy codebases need gradual improvement rather than strict enforcement. Focus on identifying the highest-priority items without overwhelming the team.
Source: src/config/thresholds.rs:83-118 (ThresholdsConfig)
Complete Legacy Configuration
# Legacy codebase configuration
# Focus on high-priority issues, reduce noise from moderate complexity
[scoring]
coverage = 0.30 # Reduce coverage weight (legacy often lacks tests)
complexity = 0.50 # Focus on complexity
dependency = 0.20 # Higher dependency weight for coupling issues
[thresholds]
minimum_debt_score = 5.0 # Only show highest priority items
minimum_cyclomatic_complexity = 10 # Filter out moderate complexity
minimum_cognitive_complexity = 15 # Focus on worst offenders
minimum_risk_score = 4.0 # High-risk items only
[thresholds.validation]
max_debt_density = 100.0 # Accommodate existing debt
max_average_complexity = 15.0 # Start lenient
max_total_debt_score = 5000 # Higher limits for legacy code
max_codebase_risk_score = 8.0 # More tolerant risk threshold
[complexity_thresholds]
minimum_total_complexity = 15
minimum_function_length = 50
Gradual Threshold Tightening Strategy
- Week 1-2: Run analysis with lenient thresholds, establish baseline
- Month 1: Lower minimum_debt_score from 5.0 to 4.5, address top 10 items
- Month 2: Lower max_debt_density by 10%, continue addressing high-priority items
- Quarterly: Reduce thresholds by 10-15% until reaching target
Example progression:
# Month 1
[thresholds.validation]
max_debt_density = 100.0
# Month 3
[thresholds.validation]
max_debt_density = 85.0
# Month 6
[thresholds.validation]
max_debt_density = 70.0
# Target (Year 1)
[thresholds.validation]
max_debt_density = 50.0
CLI Example
# Focus on high-priority items only
debtmap analyze . --min-score 5.0 --top 20
# Use lenient preset for legacy code
debtmap analyze . --threshold-preset lenient
Configuration for Open Source Libraries
Open source libraries require high test coverage for public APIs and clear documentation of complexity.
Complete Open Source Configuration
# Open source library configuration
# Prioritize test coverage and public API detection
[scoring]
coverage = 0.55 # High coverage weight (public API focus)
complexity = 0.30 # Moderate complexity weight
dependency = 0.15 # Standard dependency weight
[analysis]
detect_external_api = true # Flag untested public APIs
public_api_threshold = 0.8 # 80% threshold for public API detection
[thresholds]
max_function_length = 40 # Moderate function size limit
minimum_debt_score = 2.5 # Surface more issues for thorough review
[thresholds.validation]
min_coverage_percentage = 90.0 # High coverage for public API
max_average_complexity = 9.0 # Strict complexity average
max_debt_density = 40.0 # Low debt tolerance
# Coverage expectations by function role
[coverage_expectations.pure]
min = 95.0
target = 98.0
max = 100.0
[coverage_expectations.io_operations]
min = 70.0
target = 80.0
max = 90.0
Key Principles for Open Source
- Prioritize public API coverage - Users depend on your public interface
- Document complexity trade-offs - Explain why complex functions exist
- Use semantic classification - Apply role-based scoring for accurate prioritization
- Enable pattern detection - Identify boilerplate for macro opportunities
CLI Example
# Analyze with public API detection
debtmap analyze . --no-public-api-detection=false --public-api-threshold 0.8
# Generate markdown report for documentation
debtmap analyze . --format markdown --output TECH_DEBT.md
CI/CD Integration Best Practices
Integrate Debtmap into your CI/CD pipeline to enforce quality gates automatically.
Source: .github/workflows/debtmap.yml (example workflow)
GitHub Actions Configuration
name: Technical Debt Validation
on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Debtmap
        run: cargo install debtmap
      - name: Generate coverage data
        run: |
          cargo install cargo-llvm-cov
          cargo llvm-cov --lcov --output-path coverage.lcov
      - name: Run Debtmap validation
        run: |
          debtmap validate . \
            --coverage-file coverage.lcov \
            --format json \
            --output debtmap-report.json
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: debtmap-report
          path: debtmap-report.json
GitLab CI Configuration
stages:
  - test
  - quality
coverage:
  stage: test
  script:
    - cargo install cargo-llvm-cov
    - cargo llvm-cov --lcov --output-path coverage.lcov
  artifacts:
    paths:
      - coverage.lcov
debtmap:
  stage: quality
  needs: [coverage]
  script:
    - cargo install debtmap
    - debtmap validate . --coverage-file coverage.lcov --format json --output report.json
  artifacts:
    paths:
      - report.json
    reports:
      codequality: report.json
  allow_failure: false # Block merge on validation failure
Generic Pipeline Configuration
For any CI system, the key steps are:
- Install Debtmap - cargo install debtmap
- Generate coverage - Use cargo-llvm-cov, tarpaulin, or your coverage tool
- Run validation - debtmap validate . --coverage-file coverage.lcov
- Check exit code - Non-zero exit indicates threshold violation
- Archive report - Store JSON output for trend analysis
CI/CD-Specific Thresholds
Use scale-independent metrics for CI/CD validation:
[thresholds.validation]
# Scale-independent metrics - no adjustment needed as codebase grows
max_debt_density = 50.0 # Debt per 1000 LOC
max_average_complexity = 10.0 # Per-function average
max_codebase_risk_score = 7.0 # Overall risk level
# Enable coverage gate if you have coverage data
min_coverage_percentage = 75.0
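Debt density is what makes these gates scale-independent; the calculation is simply debt items per thousand lines of code, e.g.:

// Debt density: debt items per 1000 lines of code.
fn debt_density(debt_items: usize, total_loc: usize) -> f64 {
    debt_items as f64 * 1000.0 / total_loc as f64
}

// debt_density(120, 40_000) == 3.0, comfortably under max_debt_density = 50.0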
Exit Codes
The validate command returns specific exit codes for CI integration:
| Exit Code | Meaning |
|---|---|
| 0 | All thresholds passed |
| 1 | One or more thresholds exceeded |
| 2 | Configuration or runtime error |
Common Configuration Anti-Patterns
Avoid these configuration mistakes:
Absolute Count Thresholds in CI
Problem: Absolute counts punish healthy codebase growth.
# BAD - Will fail as codebase grows
[thresholds.validation]
max_debt_items = 100
max_high_complexity_count = 50
Solution: Use density-based metrics instead.
# GOOD - Scales with codebase size
[thresholds.validation]
max_debt_density = 50.0
max_average_complexity = 10.0
Ignoring Coverage in Scoring
Problem: High-complexity, well-tested code scores the same as untested code.
# BAD - Ignores coverage entirely
[scoring]
coverage = 0.0
complexity = 0.85
dependency = 0.15
Solution: Include coverage weight proportional to your testing culture.
# GOOD - Balanced scoring
[scoring]
coverage = 0.50
complexity = 0.35
dependency = 0.15
Over-Suppression
Problem: Too many suppressions hide real issues.
Solution: Use minimum thresholds to filter noise instead of inline suppressions:
[thresholds]
minimum_debt_score = 3.0 # Filter low-priority items
minimum_cyclomatic_complexity = 5 # Filter simple functions
Inconsistent Team Configuration
Problem: Different developers use different thresholds.
Solution: Commit .debtmap.toml to version control and reference it explicitly:
# Always reference committed config
debtmap validate . --config .debtmap.toml
Related Topics
- Thresholds Configuration - Detailed threshold reference
- Scoring Configuration - Scoring weight configuration
- Validation and Quality Gates - CI/CD integration guide
- CLI Reference - Command-line options
Threshold Configuration
Debtmap uses configurable thresholds to determine when code complexity, duplication, or structural issues should be flagged as technical debt. This chapter explains how to configure thresholds to match your project’s quality standards.
Overview
Thresholds control what gets flagged as technical debt. You can configure thresholds using:
- Preset configurations - Quick start with strict, balanced, or lenient settings
- CLI flags - Override thresholds for a single analysis run
- Configuration file - Project-specific thresholds in .debtmap.toml
Threshold Presets
Debtmap provides three preset threshold configurations to match different project needs:
Preset Comparison
| Threshold | Strict | Balanced (Default) | Lenient |
|---|---|---|---|
| Cyclomatic Complexity | 3 | 5 | 10 |
| Cognitive Complexity | 7 | 10 | 20 |
| Total Complexity | 5 | 8 | 15 |
| Function Length (lines) | 15 | 20 | 50 |
| Match Arms | 3 | 4 | 8 |
| If-Else Chain | 2 | 3 | 5 |
| Role Multipliers | |||
| Entry Point Multiplier | 1.2x | 1.5x | 2.0x |
| Utility Multiplier | 0.6x | 0.8x | 1.0x |
| Test Function Multiplier | 3.0x | 2.0x | 3.0x |
When to Use Each Preset
Strict Preset
- New projects aiming for high code quality standards
- Libraries and reusable components
- Critical systems requiring high reliability
- Teams enforcing strict coding standards
debtmap analyze . --threshold-preset strict
Balanced Preset (Default)
- Typical production applications
- Projects with moderate complexity tolerance
- Recommended starting point for most projects
- Good balance between catching issues and avoiding false positives
debtmap analyze . # Uses balanced preset by default
Lenient Preset
- Legacy codebases during initial assessment
- Complex domains (compilers, scientific computing)
- Gradual debt reduction strategies
- Temporary relaxation during major refactoring
debtmap analyze . --threshold-preset lenient
Understanding Complexity Thresholds
Debtmap tracks multiple complexity metrics and uses conjunction logic: a function must exceed ALL thresholds to be flagged as technical debt.
Cyclomatic Complexity
Counts decision points in code: if, while, for, match, &&, ||, etc.
- What it measures: Number of independent paths through code
- Why it matters: More paths = harder to test completely
- Default threshold: 5
Cognitive Complexity
Measures the mental effort required to understand code by weighing nested structures and breaks in linear flow.
- What it measures: How hard code is to read and comprehend
- Why it matters: High cognitive load = maintenance burden
- Default threshold: 10
Total Complexity
Sum of cyclomatic and cognitive complexity.
- What it measures: Combined complexity burden
- Why it matters: Catches functions high in either metric
- Default threshold: 8
Function Length
Number of lines of code in the function body.
- What it measures: Physical size of function
- Why it matters: Long functions are hard to understand and test
- Default threshold: 20 lines
Structural Complexity
Additional metrics for specific patterns:
- Match arms: Flags large match/switch statements (default: 4)
- If-else chains: Flags long conditional chains (default: 3)
Critical: Debtmap uses conjunction logic - functions are flagged only when they meet ALL of these conditions simultaneously:
- Cyclomatic complexity >= adjusted cyclomatic threshold
- Cognitive complexity >= adjusted cognitive threshold
- Function length >= minimum function length
- Total complexity (cyclomatic + cognitive) >= adjusted total threshold
The thresholds are first adjusted by role-based multipliers, then all four checks must pass for the function to be flagged.
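In sketch form, the check described above looks like this (illustrative names, using the balanced-preset defaults):

// A function is flagged only when every role-adjusted threshold is met.
struct FnMetrics { cyclomatic: f64, cognitive: f64, length: f64 }

fn is_flagged(m: &FnMetrics, role_multiplier: f64) -> bool {
    let cyclomatic_t = 5.0 * role_multiplier; // balanced defaults
    let cognitive_t = 10.0 * role_multiplier;
    let total_t = 8.0 * role_multiplier;
    let length_t = 20.0 * role_multiplier;
    m.cyclomatic >= cyclomatic_t
        && m.cognitive >= cognitive_t
        && m.length >= length_t
        && (m.cyclomatic + m.cognitive) >= total_t
}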
Threshold Validation
Debtmap validates critical thresholds to prevent misconfiguration. The following validation rules apply (src/complexity/threshold_manager.rs:191-217):
Core Complexity Metrics (must not be zero):
- minimum_total_complexity > 0
- minimum_cyclomatic_complexity > 0
- minimum_cognitive_complexity > 0
Role Multipliers (must be positive):
- entry_point_multiplier > 0
- core_logic_multiplier > 0
- utility_multiplier > 0
- test_function_multiplier > 0
Note: Structural thresholds (minimum_match_arms, minimum_if_else_chain, minimum_function_length) are not validated and can be set to any value including zero. Zero values effectively disable those checks.
If any validated field fails validation, Debtmap will reject the configuration with an error message and use default values instead.
Role-Based Multipliers
Debtmap automatically adjusts thresholds based on function role, recognizing that different types of functions have different complexity expectations:
| Function Role | Multiplier | Effect | Examples |
|---|---|---|---|
| Entry Points | 1.2x - 2.0x (preset-specific) | More lenient | main(), HTTP handlers, CLI commands |
| Core Logic | 1.0x | Standard | Business logic, algorithms |
| Utility Functions | 0.6x - 1.0x (preset-specific) | Stricter | Getters, setters, simple helpers |
| Test Functions | 2.0x - 3.0x (preset-specific) | Most lenient | Unit tests, integration tests |
| Unknown Functions | 1.0x (defaults to core logic) | Standard | Functions that don’t match any role pattern |
Function Role Classification
Debtmap automatically determines function roles based on naming patterns (src/complexity/threshold_manager.rs:264-278):
| Role | Matching Patterns | Examples |
|---|---|---|
| Entry Point | main, *_handler, handle_* | main(), request_handler(), handle_event() |
| Test | test_*, *_test | test_validation(), parser_test() |
| Utility | get_*, set_*, is_*, has_* | get_name(), set_config(), is_valid(), has_permission() |
| Core Logic | All other functions | process_data(), calculate_score(), validate() |
| Unknown | Fallback (uses core logic multiplier) | Rare edge cases |
Note: The “Unknown” role uses the same multiplier as Core Logic (1.0x). In practice, functions are classified as Core Logic if they don’t match any specific pattern.
Note: Some multipliers vary by preset:
- Entry Points: Strict=1.2x, Balanced=1.5x, Lenient=2.0x
- Utility Functions: Strict=0.6x, Balanced=0.8x, Lenient=1.0x
- Test Functions: Strict=3.0x, Balanced=2.0x, Lenient=3.0x
How multipliers work:
A higher multiplier makes thresholds more lenient by adjusting ALL thresholds. The multiplier values vary by preset - for example, entry point functions use 1.2x (strict), 1.5x (balanced), or 2.0x (lenient).
Example: Entry Point function with Balanced preset (multiplier = 1.5x):
- Cyclomatic threshold: 7.5 (5 × 1.5)
- Cognitive threshold: 15 (10 × 1.5)
- Total threshold: 12 (8 × 1.5)
- Length threshold: 30 lines (20 × 1.5)
The function is flagged only if ALL conditions are met:
- Cyclomatic complexity >= 7.5 AND
- Cognitive complexity >= 15 AND
- Function length >= 30 lines AND
- Total complexity (cyclomatic + cognitive) >= 12
Comparison across roles (Balanced preset):
| Role | Cyclomatic | Cognitive | Total | Length | Flagged When |
|---|---|---|---|---|---|
| Entry Point (1.5x) | 7.5 | 15 | 12 | 30 | ALL conditions met |
| Core Logic (1.0x) | 5 | 10 | 8 | 20 | ALL conditions met |
| Utility (0.8x) | 4 | 8 | 6.4 | 16 | ALL conditions met |
| Test (2.0x) | 10 | 20 | 16 | 40 | ALL conditions met |
Note: Entry point multipliers differ by preset. With the strict preset, entry points use 1.2x (cyclomatic=3.6, cognitive=8.4), while the lenient preset uses 2.0x (cyclomatic=20, cognitive=40).
This allows test functions and entry points to be more complex without false positives, while keeping utility functions clean and simple.
CLI Threshold Flags
Override thresholds for a single analysis run using command-line flags:
Preset-Based Configuration (Recommended)
Use --threshold-preset to apply a predefined threshold configuration:
# Use strict preset (cyclomatic=3, cognitive=7, total=5, length=15)
debtmap analyze . --threshold-preset strict
# Use balanced preset (default - cyclomatic=5, cognitive=10, total=8, length=20)
debtmap analyze . --threshold-preset balanced
# Use lenient preset (cyclomatic=10, cognitive=20, total=15, length=50)
debtmap analyze . --threshold-preset lenient
Individual Threshold Overrides
You can also override specific thresholds:
# Override cyclomatic complexity threshold (legacy flag, default: 10)
debtmap analyze . --threshold-complexity 15
# Override duplication threshold in lines (default: 50)
debtmap analyze . --threshold-duplication 30
# Combine multiple threshold flags
debtmap analyze . --threshold-complexity 15 --threshold-duplication 30
Note:
- --threshold-preset provides the most comprehensive threshold configuration (it includes all complexity metrics and role multipliers)
- Individual flags like --threshold-complexity are legacy flags that set only a single cyclomatic complexity threshold, without configuring cognitive complexity, total complexity, function length, or role multipliers
- For full control over all complexity metrics and role-based multipliers, use the .debtmap.toml configuration file
- CLI flags override configuration file settings for that run only
Configuration File
For project-specific thresholds, create a .debtmap.toml file in your project root.
Complexity Thresholds Configuration
The [complexity_thresholds] section in .debtmap.toml allows fine-grained control over function complexity detection:
[complexity_thresholds]
# Core complexity metrics
minimum_total_complexity = 8 # Sum of cyclomatic + cognitive
minimum_cyclomatic_complexity = 5 # Decision points (if, match, etc.)
minimum_cognitive_complexity = 10 # Mental effort to understand code
# Structural complexity metrics
minimum_match_arms = 4 # Minimum match/switch arms before flagging
minimum_if_else_chain = 3 # Minimum if-else chain length before flagging
minimum_function_length = 20 # Minimum lines before flagging
# Role-based multipliers (applied to all thresholds above)
entry_point_multiplier = 1.5 # main(), handlers, CLI commands
core_logic_multiplier = 1.0 # Standard business logic
utility_multiplier = 0.8 # Getters, setters, helpers
test_function_multiplier = 2.0 # Unit tests, integration tests
Note: The multipliers are applied to thresholds before comparison. For example, with entry_point_multiplier = 1.5 and minimum_cyclomatic_complexity = 5, an entry point function would be flagged at cyclomatic complexity 7.5 (5 × 1.5).
Validation: Core complexity metrics (minimum_total_complexity, minimum_cyclomatic_complexity, minimum_cognitive_complexity) and all role multipliers must be positive (> 0). Zero or negative values for these fields will cause validation errors and Debtmap will use default values. Structural thresholds (minimum_match_arms, minimum_if_else_chain, minimum_function_length) are not validated and can be set to zero to disable those checks.
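The stated rules translate into a short validation pass. A sketch of that logic (Python; `validate_thresholds` is a hypothetical helper, not Debtmap's implementation):

```python
CORE_FIELDS = (
    "minimum_total_complexity",
    "minimum_cyclomatic_complexity",
    "minimum_cognitive_complexity",
)
MULTIPLIER_FIELDS = (
    "entry_point_multiplier",
    "core_logic_multiplier",
    "utility_multiplier",
    "test_function_multiplier",
)

def validate_thresholds(cfg: dict) -> list:
    """Core metrics and role multipliers must be > 0; structural thresholds
    (match arms, if-else chains, function length) are not validated."""
    return [
        f"{field} must be positive, got {cfg[field]}"
        for field in CORE_FIELDS + MULTIPLIER_FIELDS
        if field in cfg and cfg[field] <= 0
    ]

print(validate_thresholds({"minimum_cyclomatic_complexity": 0}))
# ['minimum_cyclomatic_complexity must be positive, got 0']
```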
Complete Example
# Legacy threshold settings (simple configuration)
# Note: For comprehensive control, use [complexity_thresholds] instead
[thresholds]
complexity = 15 # Cyclomatic complexity threshold (legacy)
cognitive = 20 # Cognitive complexity threshold (legacy)
max_file_length = 500 # Maximum file length in lines
# Validation thresholds for CI/CD
[thresholds.validation]
max_average_complexity = 10.0 # Maximum average complexity across codebase
max_debt_density = 50.0 # Maximum debt items per 1000 LOC
max_codebase_risk_score = 7.0 # Maximum overall risk score
min_coverage_percentage = 0.0 # Minimum test coverage (0 = disabled)
max_total_debt_score = 10000 # Safety net for total debt score
# God object detection
[god_object]
enabled = true
# Rust-specific thresholds
[god_object.rust]
max_methods = 20 # Maximum methods before flagging as god object
max_fields = 15 # Maximum fields
max_traits = 5 # Maximum trait implementations
max_lines = 1000 # Maximum lines in impl block
max_complexity = 200 # Maximum total complexity
# Python-specific thresholds
[god_object.python]
max_methods = 15
max_fields = 10
max_traits = 3
max_lines = 500
max_complexity = 150
# JavaScript/TypeScript-specific thresholds
[god_object.javascript]
max_methods = 15
max_fields = 20
max_traits = 3
max_lines = 500
max_complexity = 150
Configuration Section Notes:
- `[thresholds]`: Legacy/simple threshold configuration. Sets basic complexity thresholds without role multipliers or comprehensive metric control.
- `[complexity_thresholds]`: Modern/comprehensive threshold configuration. Provides fine-grained control over all complexity metrics, structural thresholds, and role-based multipliers. Use this for full control.
- Recommendation: For new projects, use `[complexity_thresholds]` for comprehensive configuration. The `[thresholds]` section is maintained for backward compatibility.
Using Configuration File
# Initialize with default configuration
debtmap init
# Edit .debtmap.toml to customize thresholds
# Then run analysis (automatically uses config file)
debtmap analyze .
# Validate against thresholds in CI/CD
debtmap validate . --config .debtmap.toml
God Object Thresholds
God objects are classes/structs with too many responsibilities. Debtmap uses language-specific thresholds to detect them:
Rust Thresholds
[god_object.rust]
max_methods = 20 # Methods in impl blocks
max_fields = 15 # Struct fields
max_traits = 5 # Trait implementations
max_lines = 1000 # Lines in impl blocks
max_complexity = 200 # Total complexity
Python Thresholds
[god_object.python]
max_methods = 15
max_fields = 10
max_traits = 3 # Base classes
max_lines = 500
max_complexity = 150
JavaScript/TypeScript Thresholds
[god_object.javascript]
max_methods = 15
max_fields = 20
max_traits = 3 # Extended classes
max_lines = 500
max_complexity = 150
Why language-specific thresholds?
Different languages have different idioms:
- Rust: Many small methods and trait implementations are idiomatic, so method, trait, and line limits are higher
- Python: Smaller, focused classes are idiomatic, so method and field limits are lower
- JavaScript: Prototype-based objects typically carry more properties, so the field limit is higher
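Checking a type against these limits is a set of straightforward comparisons. A hypothetical sketch (Python; the limits are the Rust defaults shown above):

```python
RUST_LIMITS = {"methods": 20, "fields": 15, "traits": 5, "lines": 1000, "complexity": 200}

def god_object_violations(stats: dict, limits: dict = RUST_LIMITS) -> list:
    """Report every limit the type exceeds; any hit suggests a god object."""
    return [
        f"{name}: {stats[name]} exceeds max of {limit}"
        for name, limit in limits.items()
        if stats[name] > limit
    ]

stats = {"methods": 25, "fields": 8, "traits": 2, "lines": 700, "complexity": 150}
print(god_object_violations(stats))  # ['methods: 25 exceeds max of 20']
```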
Validation Thresholds
Use validation thresholds in CI/CD pipelines to enforce quality gates:
Scale-Independent Metrics (Recommended)
These metrics work for codebases of any size:
[thresholds.validation]
# Average complexity per function (default: 10.0)
max_average_complexity = 10.0
# Debt items per 1000 lines of code (default: 50.0)
max_debt_density = 50.0
# Overall risk score 0-10 (default: 7.0)
max_codebase_risk_score = 7.0
Optional Metrics
[thresholds.validation]
# Minimum test coverage percentage (default: 0.0 = disabled)
min_coverage_percentage = 80.0
# Safety net for total debt score (default: 10000)
max_total_debt_score = 5000
Using Validation in CI/CD
# Run validation (exits with error if thresholds exceeded)
debtmap validate . --config .debtmap.toml
# Example CI/CD workflow
debtmap analyze . --output report.json
debtmap validate . --config .debtmap.toml || exit 1
CI/CD Best Practices:
- Start with lenient thresholds to establish baseline
- Gradually tighten thresholds as you pay down debt
- Use `max_debt_density` as a stable quality metric
- Track trends over time, not just point-in-time values
Tuning Guidelines
How to choose and adjust thresholds for your project:
1. Start with Defaults
Begin with the balanced preset to understand your codebase:
debtmap analyze .
Review the output to see what gets flagged and what doesn’t.
2. Run Baseline Analysis
Understand your current state:
# Analyze and save results
debtmap analyze . --output baseline.json
# Review high-priority items
debtmap analyze . --top 20
3. Adjust Based on Project Type
New Projects:
- Use strict preset to enforce high quality from the start
- Prevents accumulation of technical debt
Typical Projects:
- Use balanced preset (recommended)
- Good middle ground for most teams
Legacy Codebases:
- Use lenient preset initially
- Focus on worst offenders first
- Gradually tighten thresholds as you refactor
4. Fine-Tune in Configuration File
Create .debtmap.toml and adjust specific thresholds:
# Initialize config file
debtmap init
# Edit .debtmap.toml
# Adjust thresholds based on your baseline analysis
5. Validate and Iterate
# Test your thresholds
debtmap validate . --config .debtmap.toml
# Adjust if needed
# Iterate until you find the right balance
Troubleshooting Threshold Configuration
Too many false positives?
- Increase thresholds or switch to lenient preset
- Check if role multipliers are appropriate
- Review god object thresholds for your language
Missing important issues?
- Decrease thresholds or switch to strict preset
- Verify `.debtmap.toml` is being loaded
- Check for suppression patterns hiding issues
Different standards for tests?
- Don’t worry - role multipliers automatically handle this
- Test functions get 2-3x multiplier by default
Inconsistent results?
- Ensure `.debtmap.toml` is in project root
- CLI flags override config file - remove them for consistency
- Use `--config` flag to specify config file explicitly
Examples
Example 1: Quick Analysis with Strict Preset
# Use strict thresholds for new project
debtmap analyze . --threshold-preset strict
Example 2: Custom CLI Thresholds
# Analyze with custom thresholds (no config file)
debtmap analyze . \
--threshold-complexity 15 \
--threshold-duplication 30
Example 3: Project-Specific Configuration
# Initialize configuration
debtmap init
# Creates .debtmap.toml - edit to customize
# Example: Increase complexity threshold to 15
# Run analysis with project config
debtmap analyze .
Example 4: CI/CD Validation
# Create strict validation configuration
cat > .debtmap.toml << EOF
[thresholds.validation]
max_average_complexity = 8.0
max_debt_density = 30.0
max_codebase_risk_score = 6.0
min_coverage_percentage = 75.0
EOF
# Run in CI/CD pipeline
debtmap analyze . --output report.json
debtmap validate . --config .debtmap.toml
Example 5: Gradual Debt Reduction
# Month 1: Start lenient
debtmap analyze . --threshold-preset lenient --output month1.json
# Month 2: Switch to balanced
debtmap analyze . --threshold-preset balanced --output month2.json
# Month 3: Tighten further
debtmap analyze . --threshold-preset strict --output month3.json
# Compare progress
debtmap analyze . --output current.json
# Review trend: month1.json -> month2.json -> month3.json -> current.json
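The saved snapshots can also be compared programmatically using the JSON schema described in the Output Formats chapter. A minimal sketch (Python; assumes the documented `technical_debt.items` field):

```python
import json

def debt_count(path: str) -> int:
    with open(path) as f:
        return len(json.load(f)["technical_debt"]["items"])

for snapshot in ["month1.json", "month2.json", "month3.json", "current.json"]:
    print(f"{snapshot}: {debt_count(snapshot)} debt items")
```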
Decision Tree: Choosing Your Preset
Start here: What kind of project are you working on?
│
├─ New project or library
│ └─ Use STRICT preset
│ └─ Prevent debt accumulation from day one
│
├─ Existing production application
│ └─ What's your goal?
│ ├─ Maintain current quality
│ │ └─ Use BALANCED preset
│ │ └─ Good default for most teams
│ │
│ └─ Reduce existing debt gradually
│ └─ Start with LENIENT preset
│ └─ Focus on worst issues first
│ └─ Tighten thresholds over time
│
└─ Legacy codebase or complex domain
└─ Use LENIENT preset
└─ Avoid overwhelming with false positives
└─ Create baseline and improve incrementally
Best Practices
- Start with defaults - Don’t over-customize initially
- Track trends - Monitor debt over time, not just point values
- Be consistent - Use same thresholds across team
- Document choices - Comment your `.debtmap.toml` to explain custom thresholds
- Automate validation - Run `debtmap validate` in CI/CD
- Review regularly - Reassess thresholds quarterly
- Gradual tightening - Don’t make thresholds stricter too quickly
- Trust role multipliers - Let Debtmap handle different function types
Related Topics
- Getting Started - Initial setup and first analysis
- CLI Reference - Complete command-line flag documentation
- Configuration - Full `.debtmap.toml` reference
- Scoring Strategies - How thresholds affect debt scores
- God Object Detection - Deep dive into god object analysis
Suppression Patterns
Debtmap provides flexible suppression mechanisms to help you focus on the technical debt that matters most. You can suppress specific debt items inline with comments, or exclude entire files and functions through configuration.
Why Use Suppressions?
Not all detected technical debt requires immediate action. Suppressions allow you to:
- Focus on priorities: Hide known, accepted debt to see new issues clearly
- Handle false positives: Suppress patterns that don’t apply to your context
- Document decisions: Explain why certain debt is acceptable using reason annotations
- Exclude test code: Ignore complexity in test fixtures and setup functions
Inline Comment Suppression
Debtmap supports four inline comment formats that work with your language’s comment syntax:
Single-Line Suppression
Suppress debt on the same line as the comment:
// debtmap:ignore
// TODO: Implement caching later - performance is acceptable for now
# debtmap:ignore
# FIXME: Refactor this after the Q2 release
The suppression applies to debt detected on the same line as the comment.
Next-Line Suppression
Suppress debt on the line immediately following the comment:
// debtmap:ignore-next-line
fn complex_algorithm() {
    // ...20 lines of complex code...
}

// debtmap:ignore-next-line
function calculateMetrics(data: DataPoint[]): Metrics {
    // ...complex implementation...
}
This format is useful when you want the suppression comment to appear before the code it affects.
Block Suppression
Suppress multiple lines of code between start and end markers:
// debtmap:ignore-start
fn setup_test_environment() {
    // TODO: Add more test cases
    // FIXME: Handle edge cases
    // Complex test setup code...
}
// debtmap:ignore-end

# debtmap:ignore-start
def mock_api_responses():
    # TODO: Add more mock scenarios
    # Multiple lines of mock setup
    pass
# debtmap:ignore-end
Important: Every ignore-start must have a matching ignore-end. Debtmap tracks unclosed blocks and can warn you about them.
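If you want to catch unmatched markers in your own pre-commit tooling, a line scan is enough. A minimal sketch (Python; `unclosed_blocks` is a hypothetical helper, not part of Debtmap):

```python
def unclosed_blocks(source: str) -> list:
    """Return line numbers of ignore-start markers with no matching ignore-end."""
    open_starts = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if "debtmap:ignore-start" in line:
            open_starts.append(lineno)
        elif "debtmap:ignore-end" in line and open_starts:
            open_starts.pop()
    return open_starts

print(unclosed_blocks("// debtmap:ignore-start\nfn helper() {}\n"))  # [1]
```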
Type-Specific Suppression
You can suppress specific types of debt using bracket notation instead of suppressing everything:
Quick Reference: Debt Type Suppression
| Debt Type | Bracket Name(s) | Example | Notes |
|---|---|---|---|
| TODO comments | [todo] | // debtmap:ignore[todo] | Also suppresses TestTodo |
| FIXME comments | [fixme] | // debtmap:ignore[fixme] | |
| Code smells | [smell] or [codesmell] | // debtmap:ignore[smell] | |
| High complexity | [complexity] | // debtmap:ignore[complexity] | Also suppresses TestComplexity |
| Code duplication | [duplication] or [duplicate] | // debtmap:ignore[duplication] | Also suppresses TestDuplication |
| Dependency issues | [dependency] | // debtmap:ignore[dependency] | |
| Error swallowing | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Resource management | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Code organization | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Test quality | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| All types | [*] | // debtmap:ignore[*] | Wildcard matches everything |
Suppress Specific Types
// debtmap:ignore[todo]
// TODO: This TODO is ignored, but FIXMEs and complexity are still reported

// debtmap:ignore[todo,fixme]
// TODO: Both TODOs and FIXMEs are ignored here
// FIXME: But complexity issues would still be detected
Supported Debt Types
You can suppress the following debt types by name in bracket notation:
Currently Supported:
- `todo` - TODO comments (also detects test-specific TODOs)
- `fixme` - FIXME comments
- `smell` or `codesmell` - Code smell patterns
- `complexity` - High cognitive complexity (also detects test complexity)
- `duplication` or `duplicate` - Code duplication (also detects test duplication)
- `dependency` - Dependency issues
- `*` - All types (wildcard)
Auto-Detected Types (cannot be suppressed by name):
The following debt types are detected by code analysis rather than comment scanning. These types cannot be suppressed using bracket notation like [error_swallowing] because they are not included in the suppression parser’s type mapping.
Why bracket notation doesn’t work: The suppression parser only recognizes specific type names in its internal mapping (DEBT_TYPE_MAP): todo, fixme, smell/codesmell, complexity, duplication/duplicate, and dependency. Types detected through AST analysis (like error swallowing and resource management) don’t have string identifiers in the parser. To suppress these, use the general debtmap:ignore marker without brackets:
- `error_swallowing` - Error handling issues (empty catch blocks, ignored errors)
- `resource_management` - Resource cleanup issues (file handles, connections)
- `code_organization` - Structural issues (god objects, large classes)
Example:
// ✅ Correct: General suppression without brackets
// debtmap:ignore -- Intentional empty catch for cleanup
match result {
    Err(_) => {} // Empty catch block
    Ok(v) => process(v)
}

// ❌ Wrong: Bracket notation not supported for auto-detected types
// debtmap:ignore[error_swallowing]
Test-Specific Debt Types:
Test-specific variants like TestComplexity, TestTodo, TestDuplication, and TestQuality are suppressed through their base types:
- `TestComplexity` → suppressed with `[complexity]`
- `TestTodo` → suppressed with `[todo]`
- `TestDuplication` → suppressed with `[duplication]`
- `TestQuality` → suppressed with general `debtmap:ignore` (no bracket notation)
Example:
#[cfg(test)]
mod tests {
    // Suppresses both Complexity and TestComplexity
    // debtmap:ignore[complexity] -- Complex test setup acceptable
    fn setup_test_environment() {
        // Complex test initialization
    }

    // debtmap:ignore[todo] -- Suppresses both Todo and TestTodo
    // TODO: Add more test cases
    fn test_feature() { }
}
Wildcard Suppression
Use [*] to explicitly suppress all types (equivalent to no bracket notation):
// debtmap:ignore[*]
// Suppresses all debt types
Type-Specific Blocks
Block suppressions also support type filtering:
#![allow(unused)]
fn main() {
// debtmap:ignore-start[complexity]
fn intentionally_complex_for_performance() {
// Complex nested logic is intentional here
// Complexity warnings suppressed, but TODOs still detected
}
// debtmap:ignore-end
}
Suppression Reasons
Document why you’re suppressing debt using the -- separator:
// debtmap:ignore -- Intentional for backward compatibility
// TODO: Remove this after all clients upgrade to v2.0

# debtmap:ignore[complexity] -- Performance-critical hot path
def optimize_query(params):
    # Complex but necessary for performance
    pass

// debtmap:ignore-next-line -- Waiting on upstream library fix
function workaroundBug() {
    // FIXME: Remove when library v3.0 is released
}
Best Practice: Always include reasons for suppressions. This helps future maintainers understand the context and know when suppressions can be removed.
Config File Exclusions
For broader exclusions, use the [ignore] section in .debtmap.toml:
File Pattern Exclusions
[ignore]
patterns = [
"target/**", # Build artifacts
"node_modules/**", # Dependencies
"**/*_test.rs", # Test files with _test suffix
"tests/**", # All test directories
"**/fixtures/**", # Test fixtures
"**/mocks/**", # Mock implementations
"**/*.min.js", # Minified files
"**/demo/**", # Demo code
"**/*.generated.rs", # Generated files
"vendor/**", # Vendor code
"third_party/**", # Third-party code
]
Function Name Exclusions (Planned)
Note: Function-level exclusions by name pattern are not yet implemented. This is a planned feature for a future release.
When implemented, you will be able to exclude entire function families by name pattern:
# Planned feature - not yet available
[ignore.functions]
patterns = [
# Test setup functions
"setup_test_*",
"teardown_test_*",
"create_test_*",
"mock_*",
# Generated code
"derive_*",
"__*", # Python dunder methods
# CLI parsing (naturally complex)
"parse_args",
"parse_cli",
"build_cli",
# Serialization (naturally complex pattern matching)
"serialize_*",
"deserialize_*",
"to_json",
"from_json",
]
Current workaround: Use inline suppression comments (debtmap:ignore) for specific functions, or use file pattern exclusions to exclude entire test files.
Glob Pattern Syntax
File patterns use standard glob syntax:
| Pattern | Matches | Example |
|---|---|---|
| `*` | Any characters within a path component | `*.rs` matches `main.rs` |
| `**` | Any directories (recursive) | `tests/**` matches `tests/unit/foo.rs` |
| `?` | Single character | `test?.rs` matches `test1.rs` |
| `[abc]` | Character class | `test[123].rs` matches `test1.rs` |
| `[!abc]` | Negated class | `test[!0].rs` matches `test1.rs` but not `test0.rs` |
Glob Pattern Examples
[ignore]
patterns = [
"src/**/*_generated.rs", # Generated files in any subdirectory
"**/test_*.py", # Python test files anywhere
"legacy/**/[!i]*.js", # Legacy JS files not starting with 'i'
"**/*.min.js", # Minified JavaScript
"**/*.min.css", # Minified CSS
]
Note: Brace expansion (e.g., `*.{js,css}`) is not supported. Use separate patterns for each file extension.
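To preview what a pattern will match before committing it to `.debtmap.toml`, any recursive glob implementation gives a reasonable approximation, though exact `**` semantics vary slightly between tools. For example (Python sketch):

```python
import glob

# Preview what "tests/**/*" would match, relative to the project root.
# Python's recursive glob approximates debtmap's ** semantics.
for path in glob.glob("tests/**/*", recursive=True):
    print(path)
```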
Language-Specific Comment Syntax
Debtmap automatically uses the correct comment syntax for each language:
| Language | Comment Prefix | Example |
|---|---|---|
| Rust | // | // debtmap:ignore |
| JavaScript* | // | // debtmap:ignore |
| TypeScript* | // | // debtmap:ignore |
| Python | # | # debtmap:ignore |
| Other languages | // | // debtmap:ignore |
Note: Languages not explicitly listed use // as the default comment prefix.
*JavaScript and TypeScript: While debtmap recognizes suppression comments using // syntax in these files, full language analysis (AST parsing, complexity metrics, etc.) currently only supports Rust and Python. JavaScript/TypeScript files will have suppression comments detected if processed, but complete debt analysis for these languages is planned for future releases.
You don’t need to configure this—Debtmap detects the language and uses the appropriate syntax.
Explicitly Specified Files
Important behavior: When you analyze a specific file directly, ignore patterns are bypassed:
# Respects [ignore] patterns in .debtmap.toml
debtmap analyze .
debtmap analyze src/
# Bypasses ignore patterns - analyzes the file even if patterns would exclude it
debtmap analyze src/test_helper.rs
This ensures you can always analyze specific files when needed, even if they match an ignore pattern.
Suppression Statistics
Debtmap internally tracks suppression usage during analysis:
- Total suppressions: Count of active suppressions across all files
- Suppressions by type: How many of each debt type are suppressed
- Unclosed blocks: Detection of `ignore-start` without matching `ignore-end`
Current Status: These statistics are computed during analysis (via the SuppressionContext::get_stats() method) but are not currently displayed in any output format. The SuppressionStats struct exists and tracks all metrics, but there is no user-facing command or report format that exposes them. Future releases may add a dedicated command to view suppression metrics.
Auditing Suppressions Now: You can audit your suppressions using standard tools:
# Find all suppressions in Rust code
rg "debtmap:ignore" --type rust
# Count suppressions by type
rg "debtmap:ignore\[" --type rust | grep -o "\[.*\]" | sort | uniq -c
# Look for potentially unclosed blocks (rough heuristic)
rg "debtmap:ignore-start" --type rust -A 100 | grep -v "debtmap:ignore-end"
# List files with suppressions
rg "debtmap:ignore" --files-with-matches
Best Practices
Use Suppressions Sparingly
Suppressions hide information, so use them intentionally:
✅ Good use cases:
- Test fixtures and mock data
- Known technical debt with an accepted timeline
- Intentional complexity for performance
- False positives specific to your domain
❌ Poor use cases:
- Hiding all debt to make reports look clean
- Suppressing instead of fixing simple issues
- Using wildcards when specific types would work
Always Include Reasons
// ✅ Good: Clear reason and timeline
// debtmap:ignore[complexity] -- Hot path optimization, profiled and necessary
fn fast_algorithm() { }

// ❌ Bad: No context for future maintainers
// debtmap:ignore
fn fast_algorithm() { }
Prefer Specific Over Broad
// ✅ Good: Only suppress the specific debt type
// debtmap:ignore[todo] -- Remove after v2.0 migration
// TODO: Migrate to new API

// ❌ Bad: Suppresses everything, including real issues
// debtmap:ignore
// TODO: Migrate to new API
Use Config for Systematic Exclusions
For patterns that apply project-wide, use .debtmap.toml instead of inline comments:
# ✅ Good: One config applies to all test files
[ignore]
patterns = ["tests/**"]
# ❌ Bad: Repetitive inline suppressions in every test file
Review Suppressions Periodically
Suppressions can become outdated:
- Remove suppressions when the reason no longer applies
- Check if suppressed debt can now be fixed
- Verify reasons are still accurate after refactoring
Solution: Periodically search for suppressions:
rg "debtmap:ignore" --type rust
Ensure Blocks Are Closed
// ✅ Good: Properly closed block
// debtmap:ignore-start
fn test_setup() { }
// debtmap:ignore-end

// ❌ Bad: Unclosed block affects all subsequent code
// debtmap:ignore-start
fn test_setup() { }
// (missing ignore-end)
Debtmap detects unclosed blocks and can warn you about them.
Common Patterns
Suppressing Test Code
# In .debtmap.toml
[ignore]
patterns = [
"tests/**/*",
"**/*_test.rs",
"**/test_*.py",
"**/fixtures/**",
]
For test functions within production files, use inline suppressions:
#[cfg(test)]
mod tests {
    // debtmap:ignore-start -- Test code
    fn setup_test_environment() { }
    // debtmap:ignore-end
}
Suppressing Generated Code
[ignore]
patterns = [
"**/*_generated.*",
"**/proto/**",
"**/bindings/**",
]
Temporary Suppressions with Timeline
// debtmap:ignore[complexity] -- TODO: Refactor during Q2 2025 sprint
fn legacy_payment_processor() {
    // Complex legacy code scheduled for refactoring
}
Suppressing False Positives
# debtmap:ignore[duplication] -- Similar but semantically different
def calculate_tax_us():
    # US tax calculation
    pass

# debtmap:ignore[duplication] -- Similar but semantically different
def calculate_tax_eu():
    # EU tax calculation with different rules
    pass
Conditional Suppression
#[cfg(test)]
// debtmap:ignore[complexity]
fn test_helper() {
    // Complex test setup is acceptable
}
Suppression with Detailed Justification
// debtmap:ignore[complexity] -- Required by specification XYZ-123
// This function implements the state machine defined in spec XYZ-123.
// Complexity is inherent to the specification and cannot be reduced
// without violating requirements.
fn state_machine() { ... }
Troubleshooting
Suppression Not Working
- Check comment syntax: Ensure you're using the correct comment prefix for your language (`//` for Rust/JS/TS, `#` for Python)
- Verify spelling: It's `debtmap:ignore`, not `debtmap-ignore` or `debtmap_ignore`
- Check line matching: For same-line suppressions, ensure the debt is on the same line as the comment
- Verify type names: Use `todo`, `fixme`, `complexity`, etc. (lowercase)
Common syntax errors:
// Wrong: debtmap: ignore (space after colon)
// Right: debtmap:ignore

// Wrong: debtmap:ignore[Complexity] (capital C)
// Right: debtmap:ignore[complexity]
Check placement:
// Wrong: comment after code
fn function() { } // debtmap:ignore

// Right: comment before code
// debtmap:ignore
fn function() { }
Unclosed Block Warning
If you see warnings about unclosed blocks:
// Problem: Missing ignore-end
// debtmap:ignore-start
fn test_helper() { }
// (Should have debtmap:ignore-end here)

// Solution: Add the closing marker
// debtmap:ignore-start
fn test_helper() { }
// debtmap:ignore-end
File Still Being Analyzed
If a file in your ignore patterns is still being analyzed:
- Check if you’re analyzing the specific file directly (bypasses ignore patterns)
- Verify the glob pattern matches the file path
- Check for typos in the pattern
- Test the pattern in isolation
Test pattern with find:
# find prints paths with a leading "./", and its * crosses directory boundaries
find . -path "./tests/*" -type f
Use double asterisk for subdirectories:
# Wrong: "tests/*" (only direct children)
# Right: "tests/**/*" (all descendants)
Check relative paths:
# Patterns are relative to project root
patterns = [
"src/legacy/**", # ✓ Correct
"/src/legacy/**", # ✗ Wrong (absolute path)
]
Function Suppression Not Working
Function-level exclusions by name pattern are not yet implemented. To suppress specific functions:
- Use inline suppressions: `// debtmap:ignore` before the function
- Use block suppressions: `// debtmap:ignore-start` … `// debtmap:ignore-end`
- Exclude entire files using `[ignore]` patterns if the functions are in dedicated files
Related Topics
- Configuration - Full `.debtmap.toml` reference
- CLI Reference - Command-line analysis options
- Analysis Guide - Understanding debt detection
- Output Formats - Viewing suppressed items in reports
Summary
Suppressions help you focus on actionable technical debt:
- Inline comments: `debtmap:ignore`, `ignore-next-line`, `ignore-start`/`ignore-end`
- Type-specific: Use `[type1,type2]` to suppress selectively
- Reasons: Always use `-- reason` to document why
- Config patterns: Use `.debtmap.toml` for systematic file exclusions
- Best practices: Use sparingly, prefer specific over broad, review periodically
With proper use of suppressions, your Debtmap reports show only the debt that matters to your team.
Output Formats
Debtmap provides multiple output formats to suit different workflows, from interactive terminal reports to machine-readable JSON for CI/CD integration. This chapter covers all available formats and how to use them effectively.
Format Selection
Select the output format using the -f or --format flag:
# Terminal output (default) - human-readable with colors
debtmap analyze .
# JSON output - machine-readable for tooling
debtmap analyze . --format json
# Markdown output - documentation and reports
debtmap analyze . --format markdown
# HTML output - interactive web dashboard
debtmap analyze . --format html
# DOT output - Graphviz dependency visualization
debtmap analyze . --format dot
Available formats:
- terminal (default): Interactive output with colors, emoji, and formatting
- json: Structured data for programmatic processing
- markdown: Reports suitable for documentation and PR comments
- html: Web-viewable HTML reports with interactive dashboard
- dot: Graphviz DOT format for dependency graph visualization
Writing to Files
By default, output goes to stdout. Use -o or --output to write to a file:
# Write JSON to file
debtmap analyze . --format json -o report.json
# Write markdown report
debtmap analyze . --format markdown -o DEBT_REPORT.md
# Terminal output to file (preserves colors)
debtmap analyze . -o analysis.txt
Terminal Output
The terminal format provides an interactive, color-coded report designed for developer workflows. It’s the default format and optimized for readability.
Output Structure
Terminal output is organized into five main sections:
- Header - Analysis report title
- Codebase Summary - High-level metrics and debt score
- Complexity Hotspots - Top 5 most complex functions with refactoring guidance
- Technical Debt - High-priority debt items requiring attention
- Pass/Fail Status - Overall quality assessment
Example Terminal Output
═══════════════════════════════════════════
DEBTMAP ANALYSIS REPORT
═══════════════════════════════════════════
📊 CODEBASE Summary
───────────────────────────────────────────
Files analyzed: 42
Total functions: 287
Average complexity: 6.3
Debt items: 15
Total debt score: 156 (threshold: 100)
⚠️ COMPLEXITY HOTSPOTS (Top 5)
───────────────────────────────────────────
1. src/analyzers/rust.rs:245 parse_function() - Cyclomatic: 18, Cognitive: 24
ACTION: Extract 3-5 pure functions using decompose-then-transform strategy
PATTERNS: Decompose into logical units, then apply functional patterns
BENEFIT: Pure functions are easily testable and composable
2. src/debt/smells.rs:196 detect_data_clumps() - Cyclomatic: 15, Cognitive: 20
↓ Entropy: 0.32, Repetition: 85%, Effective: 0.6x
High pattern repetition detected (85%)
🔧 TECHNICAL DEBT (15 items)
───────────────────────────────────────────
High Priority (5):
- src/risk/scoring.rs:142 - TODO: Implement caching for score calculations
- src/core/metrics.rs:89 - High complexity: cyclomatic=16
- src/debt/patterns.rs:201 - Code duplication: 65 lines duplicated
✓ Pass/Fail: PASS
Color Coding and Symbols
The terminal output uses colors and symbols for quick visual scanning:
Status Indicators:
- ✓ Green: Passing, good, well-tested
- ⚠️ Yellow: Warning, moderate complexity
- ✗ Red: Failing, critical, high complexity
- 📊 Blue: Information, metrics
- 🔧 Orange: Technical debt items
- 🎯 Cyan: Recommendations
Complexity Classification:
- LOW (0-5): Green - Simple, easy to maintain
- MODERATE (6-10): Yellow - Consider refactoring
- HIGH (11-15): Orange - Should refactor
- SEVERE (>15): Red - Urgent refactoring needed
Note: These levels match the `ComplexityLevel` enum in the implementation.
Debt Score Thresholds:
The default debt threshold is 100. Scores are colored based on this threshold:
- Green (≤50): Healthy - Below half threshold (score ≤ threshold/2)
- Yellow (51-100): Attention needed - Between half and full threshold (threshold/2 < score ≤ threshold)
- Red (>100): Action required - Exceeds threshold (score > threshold)
Note: Boundary values use strict inequalities: 50 is Green, 100 is Yellow (not Red), 101+ is Red.
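The coloring rule reduces to two comparisons against the threshold. A sketch (Python; `debt_color` is a hypothetical helper mirroring the boundaries above):

```python
def debt_color(score: float, threshold: float = 100) -> str:
    # Boundaries per the note above: 50 -> green, 100 -> yellow, 101+ -> red.
    if score <= threshold / 2:
        return "green"
    if score <= threshold:
        return "yellow"
    return "red"

assert debt_color(50) == "green"
assert debt_color(100) == "yellow"
assert debt_color(101) == "red"
```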
Refactoring Guidance
For complex functions (cyclomatic complexity > 5), the terminal output provides actionable refactoring recommendations:
ACTION: Extract 3-5 pure functions using decompose-then-transform strategy
PATTERNS: Decompose into logical units, then apply functional patterns
BENEFIT: Pure functions are easily testable and composable
Guidance levels:
- Moderate (6-10): Extract 2-3 pure functions using direct functional transformation
- High (11-15): Extract 3-5 pure functions using decompose-then-transform strategy
- Severe (>15): Extract 5+ pure functions into modules with functional core/imperative shell
See the Analysis Guide for metric explanations.
Plain Terminal Mode
For environments without color support or when piping to tools, use --plain:
# ASCII-only output, no colors
debtmap analyze . --plain
Plain mode:
- Removes ANSI color codes
- Uses ASCII box-drawing characters
- Machine-parseable structure
Note: Terminal output formatting is controlled internally via `FormattingConfig` (found in `src/formatting` and `src/io/writers/terminal.rs`), which manages color mode settings. The `--plain` flag and environment variables provide user-facing control over these settings:
- `--plain` flag - Disables colors and fancy formatting
- `NO_COLOR=1` - Disables colors (per no-color.org standard)
- `CLICOLOR=0` - Disables colors
- `CLICOLOR_FORCE=1` - Forces colors even when output is not a terminal

`FormattingConfig` is not directly exposed to CLI users but can be accessed when using debtmap as a library through `TerminalWriter::with_formatting`.
Verbosity Levels
Control detail level with -v flags (can be repeated):
# Standard output
debtmap analyze .
# Level 1: Show main score factors
debtmap analyze . -v
# Level 2: Show detailed calculations
debtmap analyze . -vv
# Level 3: Show all debug information
debtmap analyze . -vvv
Verbosity features:
- `-v`: Show main score factors (complexity, coverage, dependency breakdown)
- `-vv`: Show detailed calculations with formulas and intermediate values
- `-vvv`: Show all debug information including entropy metrics and role detection
Note: Verbosity flags affect terminal output only. JSON and markdown formats include all data regardless of verbosity level.
Each level includes all information from the previous levels, progressively adding more detail to help understand how scores are calculated.
Example Output Differences:
Standard output shows basic metrics:
Total debt score: 156 (threshold: 100)
Level 1 (-v) adds score breakdowns:
Total debt score: 156 (threshold: 100)
Complexity contribution: 85 (54%)
Coverage gaps: 45 (29%)
Dependency issues: 26 (17%)
Level 2 (-vv) adds detailed calculations:
Total debt score: 156 (threshold: 100)
Complexity contribution: 85 (54%)
Formula: sum(cyclomatic_weight * severity_multiplier)
High complexity functions: 5 × 12 = 60
Medium complexity: 8 × 3 = 24
Base penalty: 1
Coverage gaps: 45 (29%)
Uncovered complex functions: 3 × 15 = 45
Level 3 (-vvv) adds all internal details:
Total debt score: 156 (threshold: 100)
... (all level 2 output) ...
Debug info:
Entropy metrics analyzed: 42/50 functions
Function role detection: BusinessLogic=12, Utility=8, TestHelper=5
Parse time: 245ms
Understanding Metrics
To get detailed explanations of how metrics are calculated, use the --explain-metrics flag:
# Get explanations of metric definitions and formulas
debtmap analyze . --explain-metrics
This flag provides:
- Metric definitions - Detailed explanations of what each metric measures
- Calculation formulas - How scores are computed from raw data
- Measured vs estimated - Which metrics are exact and which are heuristic-based
- Interpretation guidance - How to understand and act on metric values
The explanations appear inline with the analysis output, helping you understand:
- What cyclomatic and cognitive complexity measure
- How debt scores are calculated
- What entropy metrics indicate
- How risk scores are determined
This is particularly useful when:
- Learning how debtmap evaluates code quality
- Understanding why certain functions have high scores
- Explaining analysis results to team members
- Tuning thresholds based on metric meanings
Risk Analysis Output
When coverage data is provided via --lcov, terminal output includes a dedicated risk analysis section:
═══════════════════════════════════════════
RISK ANALYSIS REPORT
═══════════════════════════════════════════
📈 RISK Summary
───────────────────────────────────────────
Codebase Risk Score: 45.5 (MEDIUM)
Complexity-Coverage Correlation: -0.65
Risk Distribution:
Critical: 2 functions
High: 5 functions
Medium: 10 functions
Low: 15 functions
Well Tested: 20 functions
🎯 CRITICAL RISKS
───────────────────────────────────────────
1. src/core/parser.rs:142 parse_complex_ast()
Risk: 85.0 | Complexity: 15 | Coverage: 0%
Recommendation: Add 5 unit tests (est: 2-3 hours)
Impact: -40 risk reduction
💡 RECOMMENDATIONS (by ROI)
───────────────────────────────────────────
1. test_me() - ROI: 5.0x
Current Risk: 75 | Reduction: 40 | Effort: Moderate
Rationale: High risk function with low coverage
Risk Level Classification:
- LOW (<30): Green - score < 30.0
- MEDIUM (30-59): Yellow - 30.0 ≤ score < 60.0
- HIGH (≥60): Red - score ≥ 60.0
Note: 60 is the start of HIGH risk level.
JSON Output
JSON output provides complete analysis results in a machine-readable format, ideal for CI/CD pipelines, custom tooling, and programmatic analysis.
Basic Usage
# Generate JSON output
debtmap analyze . --format json
# Save to file
debtmap analyze . --format json -o report.json
# Pretty-printed by default for readability
debtmap analyze . --format json | jq .
Note: JSON output is automatically pretty-printed for readability.
JSON Schema Structure
Debtmap outputs a structured JSON document with the following top-level fields:
{
"project_path": "/path/to/project",
"timestamp": "2025-01-09T12:00:00Z",
"complexity": { ... },
"technical_debt": { ... },
"dependencies": { ... },
"duplications": [ ... ]
}
Full Schema Example
Here’s a complete annotated JSON output example:
{
// Project metadata
"project_path": "/Users/dev/myproject",
"timestamp": "2025-01-09T15:30:00Z",
// Complexity analysis results
"complexity": {
"metrics": [
{
"name": "calculate_risk_score",
"file": "src/risk/scoring.rs",
"line": 142,
"cyclomatic": 12,
"cognitive": 18,
"nesting": 4,
"length": 85,
"is_test": false,
"visibility": "pub",
"is_trait_method": false,
"in_test_module": false,
"entropy_score": {
"token_entropy": 0.65,
"pattern_repetition": 0.30,
"branch_similarity": 0.45,
"effective_complexity": 0.85
},
"is_pure": false,
"purity_confidence": 0.75,
"detected_patterns": ["nested_loops", "complex_conditionals"],
"upstream_callers": ["analyze_codebase", "generate_report"],
"downstream_callees": ["get_metrics", "apply_weights"]
}
],
"summary": {
"total_functions": 287,
"average_complexity": 6.3,
"max_complexity": 24,
"high_complexity_count": 12
}
},
// Technical debt items
"technical_debt": {
"items": [
{
"id": "debt_001",
"debt_type": "Complexity",
"priority": "High",
"file": "src/analyzers/rust.rs",
"line": 245,
"column": 5,
"message": "High cyclomatic complexity: 18",
"context": "Function parse_function has excessive branching"
},
{
"id": "debt_002",
"debt_type": "Todo",
"priority": "Medium",
"file": "src/core/cache.rs",
"line": 89,
"column": null,
"message": "TODO: Implement LRU eviction policy",
"context": null
}
],
"by_type": {
"Complexity": [ /* same structure as items */ ],
"Todo": [ /* ... */ ],
"Duplication": [ /* ... */ ]
},
"priorities": ["Low", "Medium", "High", "Critical"]
},
// Dependency analysis
"dependencies": {
"modules": [
{
"module": "risk::scoring",
"dependencies": ["core::metrics", "debt::patterns"],
"dependents": ["commands::analyze", "io::output"]
}
],
"circular": [
{
"cycle": ["module_a", "module_b", "module_c", "module_a"]
}
]
},
// Code duplication blocks
"duplications": [
{
"hash": "abc123def456",
"lines": 15,
"locations": [
{
"file": "src/parser/rust.rs",
"start_line": 42,
"end_line": 57
},
{
"file": "src/parser/python.rs",
"start_line": 89,
"end_line": 104
}
]
}
]
}
Field Descriptions
FunctionMetrics Fields:
- `name`: Function name
- `file`: Path to source file
- `line`: Line number where function is defined
- `cyclomatic`: Cyclomatic complexity score
- `cognitive`: Cognitive complexity score
- `nesting`: Maximum nesting depth
- `length`: Lines of code in function
- `is_test`: Whether this is a test function
- `visibility`: Rust visibility modifier (`pub`, `pub(crate)`, or null)
- `is_trait_method`: Whether this implements a trait
- `in_test_module`: Whether inside `#[cfg(test)]`
- `entropy_score`: Optional entropy analysis with structure:

  {
    "token_entropy": 0.65,        // Token distribution entropy (0-1): measures variety of tokens
    "pattern_repetition": 0.30,   // Pattern repetition score (0-1): detects repeated code patterns
    "branch_similarity": 0.45,    // Branch similarity metric (0-1): compares similarity between branches
    "effective_complexity": 0.85  // Adjusted complexity multiplier: complexity adjusted for entropy
  }

  EntropyScore Fields:
  - `token_entropy`: Measures the variety and distribution of tokens in the function (0-1, higher = more variety)
  - `pattern_repetition`: Detects repeated code patterns within the function (0-1, higher = more repetition)
  - `branch_similarity`: Measures similarity between different code branches (0-1, higher = more similar)
  - `effective_complexity`: The overall complexity multiplier adjusted for entropy effects

- `is_pure`: Whether function is pure (no side effects)
- `purity_confidence`: Confidence level (0.0-1.0)
- `detected_patterns`: List of detected code patterns
- `upstream_callers`: Functions that call this one
- `downstream_callees`: Functions this one calls
DebtItem Fields:
- `id`: Unique identifier
- `debt_type`: Type of debt (see DebtType enum below)
- `priority`: Priority level (Low, Medium, High, Critical)
- `file`: Path to file containing debt
- `line`: Line number
- `column`: Optional column number
- `message`: Human-readable description
- `context`: Optional additional context
DebtType Enum:
- `Todo`: TODO markers
- `Fixme`: FIXME markers
- `CodeSmell`: Code smell patterns
- `Duplication`: Duplicated code
- `Complexity`: Excessive complexity
- `Dependency`: Dependency issues
- `ErrorSwallowing`: Suppressed errors
- `ResourceManagement`: Resource management issues
- `CodeOrganization`: Organizational problems
- `TestComplexity`: Complex test code
- `TestTodo`: TODOs in tests
- `TestDuplication`: Duplicated test code
- `TestQuality`: Test quality issues
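With the `debt_type` and `priority` fields above, grouping a report by type takes only a few lines. A sketch (Python; reads a `report.json` produced with `--format json`):

```python
import json
from collections import Counter

with open("report.json") as f:
    items = json.load(f)["technical_debt"]["items"]

counts = Counter(item["debt_type"] for item in items)
for debt_type, count in counts.most_common():
    print(f"{debt_type}: {count}")
```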
Risk Insights JSON
When coverage data is provided via --lcov, risk insights are included as part of the analysis output. The write_risk_insights method (found in src/io/writers/json.rs, terminal.rs, and markdown/core.rs) outputs risk analysis data in the following JSON structure:
{
"items": [
{
"location": {
"file": "src/risk/scoring.rs",
"function": "calculate_priority",
"line": 66
},
"debt_type": "TestGap",
"unified_score": {
"complexity_factor": 3.2,
"coverage_factor": 10.0,
"dependency_factor": 2.5,
"role_multiplier": 1.2,
"final_score": 9.4
},
"function_role": "BusinessLogic",
"recommendation": {
"action": "Add unit tests",
"details": "Add 6 unit tests for full coverage",
"effort_estimate": "2-3 hours"
},
"expected_impact": {
"risk_reduction": 3.9,
"complexity_reduction": 0,
"coverage_improvement": 100
},
"upstream_dependencies": 0,
"downstream_dependencies": 3,
"nesting_depth": 1,
"function_length": 13
}
],
"call_graph": {
"total_functions": 1523,
"entry_points": 12,
"test_functions": 456,
"max_depth": 8
},
"overall_coverage": 82.3,
"total_impact": {
"risk_reduction": 45.2,
"complexity_reduction": 12.3,
"coverage_improvement": 18.5
}
}
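Because each item carries a `unified_score.final_score`, ranking risk items programmatically is straightforward. A sketch (Python; field names follow the structure above, the `risk.json` file name is hypothetical):

```python
import json

with open("risk.json") as f:
    report = json.load(f)

# Highest unified score first
ranked = sorted(report["items"], key=lambda i: i["unified_score"]["final_score"], reverse=True)
for item in ranked[:5]:
    loc = item["location"]
    score = item["unified_score"]["final_score"]
    print(f"{loc['file']}:{loc['line']} {loc['function']} -> {score}")
```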
Markdown Output
Markdown format generates documentation-friendly reports suitable for README files, PR comments, and technical documentation.
Basic Usage
# Generate markdown report
debtmap analyze . --format markdown
# Save to documentation
debtmap analyze . --format markdown -o docs/DEBT_REPORT.md
Markdown Structure
Markdown output includes:
- Executive Summary - High-level metrics and health dashboard
- Complexity Analysis - Detailed complexity breakdown by file
- Technical Debt - Categorized debt items with priorities
- Dependencies - Module dependencies and circular references
- Recommendations - Prioritized action items
Example Markdown Output
# Debtmap Analysis Report
**Generated:** 2025-01-09 15:30:00 UTC
**Project:** /Users/dev/myproject
## Executive Summary
- **Files Analyzed:** 42
- **Total Functions:** 287
- **Average Complexity:** 6.3
- **Total Debt Items:** 15
- **Debt Score:** 156/100 ⚠️
### Health Dashboard
| Metric | Value | Status |
|--------|-------|--------|
| Complexity | 6.3 avg | ✅ Good |
| Debt Score | 156 | ⚠️ Attention |
| High Priority Items | 5 | ⚠️ Action Needed |
## Complexity Analysis
### Top 5 Complex Functions
| Function | File | Cyclomatic | Cognitive | Priority |
|----------|------|-----------|-----------|----------|
| parse_function | src/analyzers/rust.rs:245 | 18 | 24 | High |
| detect_data_clumps | src/debt/smells.rs:196 | 15 | 20 | Medium |
| analyze_dependencies | src/core/deps.rs:89 | 14 | 18 | Medium |
### Refactoring Recommendations
**src/analyzers/rust.rs:245** - `parse_function()`
- **Complexity:** Cyclomatic: 18, Cognitive: 24
- **Action:** Extract 3-5 pure functions using decompose-then-transform strategy
- **Patterns:** Decompose into logical units, then apply functional patterns
- **Benefit:** Improved testability and maintainability
## Technical Debt
### High Priority (5 items)
- **src/risk/scoring.rs:142** - TODO: Implement caching for score calculations
- **src/core/metrics.rs:89** - High complexity: cyclomatic=16
- **src/debt/patterns.rs:201** - Code duplication: 65 lines duplicated
### Medium Priority (8 items)
...
## Dependencies
### Circular Dependencies
- `risk::scoring` → `core::metrics` → `risk::scoring`
## Recommendations
1. **Refactor parse_function** (High Priority)
- Reduce complexity from 18 to <10
- Extract helper functions
- Estimated effort: 4-6 hours
2. **Add tests for scoring module** (High Priority)
- Current coverage: 35%
- Target coverage: 80%
- Estimated effort: 2-3 hours
## Data Flow Analysis
When verbosity is enabled (`-v` or higher), detailed data flow analysis is included in markdown reports for top priority items. This section provides deep insights into mutations, I/O operations, and escape analysis.
### Mutation Analysis
Shows variable mutation patterns within functions:
**Example:**
```markdown
**Data Flow Analysis**
- Mutations: 10 total, 2 live, 2 dead stores
- **Opportunity**: Remove 2 dead store(s) to simplify code
- **Almost Pure**: Only 2 live mutation(s), consider extracting pure subset
```
Mutation Metrics:
- Total Mutations: Count of all variable assignments and mutations
- Live Mutations: Mutations where the new value is actually used
- Dead Stores: Assignments that are never read (can be removed)
Refactoring Opportunities:
- Functions with many dead stores can be simplified by removing unused assignments
- Functions with few live mutations relative to total mutations are “almost pure” and may benefit from extracting pure subsets
I/O Operations
Detects and lists input/output operations within functions:
Example:
- I/O Operations: 3 detected
- File Read at line 105
- Network Call at line 110
- Database Query at line 120
- **Recommendation**: Consider isolating I/O in separate functions
Detected I/O Types:
- File system operations (read, write, open, close)
- Network operations (HTTP, TCP, UDP)
- Database queries and updates
- Standard I/O (print, input)
- System calls
Best Practices:
- Isolate I/O operations in dedicated functions
- Keep business logic pure and separate from I/O
- Easier testing when I/O is at function boundaries
Escape Analysis
Shows which variables escape the function scope or affect the return value:
Example:
- Escape Analysis: 2 variables escape
- Return dependencies: result, accumulator
- **Insight**: These variables directly affect function output
Escape Metrics:
- Escaping Variables: Variables whose values leave the function scope
- Return Dependencies: Variables that contribute to the return value
Implications:
- Functions with many escaping variables have complex data flow
- Clear return dependencies indicate focused, single-purpose functions
- Excessive escaping may indicate the function is doing too much
Purity Analysis
Indicates whether a function is pure and reasons for any impurity:
Example:
- Purity: Not pure (95% confidence)
- Reasons: Mutates shared state, Performs I/O operations
- **Benefit**: Converting to pure functions improves testability
Purity Assessment:
- Is Pure: Boolean indicating if the function is pure (deterministic, no side effects)
- Confidence Level: Percentage confidence in the assessment (0-100%)
- Impurity Reasons: Specific reasons why the function is not pure
Common Impurity Reasons:
- Mutates shared or global state
- Performs I/O operations
- Calls other impure functions
- Uses random number generation
- Depends on system time or external state
Value of Pure Functions:
- Easier to test (no setup/teardown needed)
- Easier to reason about (same input always gives same output)
- Safe to parallelize and memoize
- More composable and reusable
Enabling Data Flow Analysis
Data flow analysis appears in markdown output when using verbose mode:
# Include data flow analysis in markdown reports
debtmap analyze . --format markdown -v
# Higher verbosity includes more details
debtmap analyze . --format markdown -vv
The data flow section appears in the score breakdown for each of the top 3 priority items, providing actionable insights for refactoring.
### CLI vs Library Markdown Features
**CLI Markdown Output (`--format markdown`):**
When you use `debtmap analyze . --format markdown`, you get comprehensive reports that include:
- Executive summary with health dashboard
- Complexity analysis with refactoring recommendations
- Technical debt categorization by priority
- Dependency analysis with circular reference detection
- Actionable recommendations
This uses the base `MarkdownWriter` implementation and provides everything needed for documentation and PR comments.
**Enhanced Library Features:**
If you're using debtmap as a Rust library in your own tools, additional markdown capabilities are available:
- **`EnhancedMarkdownWriter` trait** (`src/io/writers/markdown/enhanced.rs`) - Provides advanced formatting and analysis features
- **Enhanced markdown modules** (`src/io/writers/enhanced_markdown/`) - Building blocks for custom visualizations including:
- Priority-based debt rankings with unified scoring
- Dead code detection and reporting
- Call graph insights and dependency visualization
- Testing recommendations with ROI analysis
To use enhanced features in your Rust code:
```rust
use debtmap::io::writers::markdown::enhanced::EnhancedMarkdownWriter;
use debtmap::io::writers::enhanced_markdown::*;

// Create custom reports with enhanced features
let mut writer = create_enhanced_writer(output)?;
writer.write_priority_rankings(&analysis)?;
writer.write_dead_code_analysis(&call_graph)?;
```
Note: Enhanced markdown features are only available through the library API, not via the CLI. The CLI `--format markdown` output is comprehensive for most use cases.
Rendering to HTML/PDF
Markdown reports can be converted to other formats:
# Generate markdown
debtmap analyze . --format markdown -o report.md
# Convert to HTML with pandoc
pandoc report.md -o report.html --standalone --css style.css
# Convert to PDF
pandoc report.md -o report.pdf --pdf-engine=xelatex
HTML Output
HTML format generates an interactive web dashboard with visual metrics and navigation. This format is ideal for viewing analysis results in a browser and sharing reports with stakeholders.
Source: src/io/writers/html.rs, src/cli.rs:492
Basic Usage
# Generate HTML dashboard
debtmap analyze . --format html
# Save to file
debtmap analyze . --format html -o dashboard.html
# Open in browser
debtmap analyze . --format html -o dashboard.html && open dashboard.html
Dashboard Features
The HTML output provides an interactive dashboard with:
- Executive Summary - High-level metrics with visual indicators
- Debt Score Dashboard - Priority distribution (Critical, High, Medium, Low)
- Complexity Metrics - Average complexity and function counts
- Debt Density - Technical debt per function ratio
- Interactive Data - Full analysis results embedded as JSON
Dashboard Structure
The HTML dashboard uses an embedded template (src/io/writers/templates/dashboard.html) that includes:
- Metrics Cards - Visual representation of key metrics
- Priority Breakdown - Count of items by priority level
- Health Indicators - Color-coded status based on thresholds
- Raw Data Access - Complete JSON analysis results in the page
Example Output
When you generate an HTML dashboard, it displays:
┌─────────────────────────────────────┐
│ Debtmap Analysis Dashboard │
├─────────────────────────────────────┤
│ │
│ Total Items: 156 │
│ Critical: 5 High: 12 │
│ Medium: 45 Low: 94 │
│ │
│ Total Functions: 287 │
│ Avg Complexity: 6.3 │
│ Debt Density: 0.54 │
│ │
└─────────────────────────────────────┘
When to Use HTML Format
Use HTML Format When:
- Sharing reports with non-technical stakeholders
- Creating dashboards for team visibility
- Generating reports for management review
- Viewing analysis results in a browser
- Embedding reports in internal documentation sites
- Publishing analysis results to a web server
HTML vs Markdown:
- HTML: Interactive, visual dashboard with embedded data
- Markdown: Text-based, suitable for conversion to multiple formats
- HTML: Better for standalone viewing in browsers
- Markdown: Better for version control and text processing
Customization
The HTML output uses a built-in template. For custom styling, you can:
- Generate the HTML file
- Extract and modify the CSS within the file
- Serve with custom stylesheets
Note: Direct template customization requires modifying
src/io/writers/templates/dashboard.htmlin the debtmap source code.
DOT Output
DOT format generates Graphviz-compatible output for visualizing file dependencies and technical debt as a graph. This is useful for understanding code architecture and identifying coupling patterns.
Source: src/io/writers/dot.rs (Spec 204)
Basic Usage
# Generate DOT output for dependency visualization
debtmap analyze . --format dot
# Save to file
debtmap analyze . --format dot -o deps.dot
Rendering the Graph
DOT files can be rendered using Graphviz tools:
# Generate SVG (recommended for web)
dot -Tsvg deps.dot -o deps.svg
# Generate PNG
dot -Tpng deps.dot -o deps.png
# Interactive exploration with xdot
xdot deps.dot
Graph Features
The DOT output provides:
- File Nodes - Each analyzed file appears as a node
- Dependency Edges - Arrows show which files depend on which
- Color-Coded Debt Scores - Node colors indicate debt severity:
- Green (#6BCB77): Low debt (<20)
- Yellow (#FFD93D): Medium debt (≥20)
- Orange (#FF8C00): High debt (≥50)
- Red (#FF6B6B): Critical debt (≥100)
- Module Clustering - Files are grouped by directory/module
- Tooltips - Hover for detailed metrics (score, functions, lines, coupling)
- Legend - Built-in legend explains the color coding
Example Output
digraph debtmap {
rankdir=TB;
node [shape=box, style=filled, fontname="Helvetica"];
edge [fontname="Helvetica", fontsize=10];
subgraph cluster_legend {
label="Debt Score Legend";
legend_critical [label="Critical (>=100)", fillcolor="#FF6B6B"];
legend_high [label="High (>=50)", fillcolor="#FF8C00"];
legend_medium [label="Medium (>=20)", fillcolor="#FFD93D"];
legend_low [label="Low (<20)", fillcolor="#6BCB77"];
}
subgraph cluster_io {
label="io";
"src_io_output_rs" [label="output.rs", fillcolor="#6BCB77"];
"src_io_writers_json_rs" [label="json.rs", fillcolor="#FFD93D"];
}
// Dependencies
"src_io_output_rs" -> "src_io_writers_json_rs";
}
When to Use DOT Format
Use DOT Format When:
- Visualizing module dependencies and architecture
- Identifying tightly coupled file clusters
- Finding isolated or dead code modules
- Presenting architecture to team members
- Detecting circular dependency patterns
- Understanding code organization
DOT vs Other Formats:
- DOT: Best for dependency visualization and architecture understanding
- JSON: Best for programmatic processing and CI/CD integration
- HTML: Best for interactive dashboards and metrics overview
- Terminal/Markdown: Best for readable reports and documentation
Tool Integration
CI/CD Pipelines
Debtmap JSON output integrates seamlessly with CI/CD systems.
GitHub Actions
name: Code Quality
on: [pull_request]
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install debtmap
        run: cargo install debtmap
      - name: Run analysis
        run: |
          debtmap analyze . \
            --format json \
            --output analysis.json \
            --lcov coverage/lcov.info
      - name: Check thresholds
        run: |
          DEBT_SCORE=$(jq '.technical_debt.items | length' analysis.json)
          if [ "$DEBT_SCORE" -gt 100 ]; then
            echo "❌ Debt score too high: $DEBT_SCORE"
            exit 1
          fi
      - name: Comment on PR
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            const analysis = JSON.parse(fs.readFileSync('analysis.json'));
            const summary = `## Debtmap Analysis
            - **Debt Items:** ${analysis.technical_debt.items.length}
            - **Average Complexity:** ${analysis.complexity.summary.average_complexity}
            - **High Complexity Functions:** ${analysis.complexity.summary.high_complexity_count}
            `;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: summary
            });
GitLab CI
code_quality:
  stage: test
  script:
    - cargo install debtmap
    - debtmap analyze . --format json --output gl-code-quality.json
    - |
      DEBT=$(jq '.technical_debt.items | length' gl-code-quality.json)
      if [ "$DEBT" -gt 50 ]; then
        echo "Debt threshold exceeded"
        exit 1
      fi
  artifacts:
    reports:
      codequality: gl-code-quality.json
Jenkins Pipeline
pipeline {
    agent any
    stages {
        stage('Analyze') {
            steps {
                sh 'debtmap analyze . --format json -o report.json'
                script {
                    def json = readJSON file: 'report.json'
                    def debtScore = json.technical_debt.items.size()
                    if (debtScore > 100) {
                        error("Debt score ${debtScore} exceeds threshold")
                    }
                }
            }
        }
    }
    post {
        always {
            archiveArtifacts artifacts: 'report.json'
        }
    }
}
Querying JSON with jq
Common jq queries for analyzing debtmap output:
# Get total debt items
jq '.technical_debt.items | length' report.json
# Get high-priority items only
jq '.technical_debt.items[] | select(.priority == "High")' report.json
# Get functions with complexity > 10
jq '.complexity.metrics[] | select(.cyclomatic > 10)' report.json
# Calculate average complexity
jq '.complexity.summary.average_complexity' report.json
# Get all TODO items
jq '.technical_debt.items[] | select(.debt_type == "Todo")' report.json
# Get top 5 complex functions
jq '.complexity.metrics | sort_by(-.cyclomatic) | .[0:5] | .[] | {name, file, cyclomatic}' report.json
# Get files with circular dependencies
jq '.dependencies.circular[] | .cycle' report.json
# Count debt items by type
jq '.technical_debt.items | group_by(.debt_type) | map({type: .[0].debt_type, count: length})' report.json
# Get functions with 0% coverage (when using --lcov)
jq '.complexity.metrics[] | select(.coverage == 0)' report.json
# Extract file paths with high debt
jq '.technical_debt.items[] | select(.priority == "High" or .priority == "Critical") | .file' report.json | sort -u
Filtering and Transformation Examples
Python Script to Parse JSON
#!/usr/bin/env python3
import json
import sys

def analyze_debtmap_output(json_file):
    with open(json_file) as f:
        data = json.load(f)

    # Get high-priority items
    high_priority = [
        item for item in data['technical_debt']['items']
        if item['priority'] in ['High', 'Critical']
    ]

    # Group by file
    by_file = {}
    for item in high_priority:
        file = item['file']
        if file not in by_file:
            by_file[file] = []
        by_file[file].append(item)

    # Print summary
    print(f"High-priority debt items: {len(high_priority)}")
    print(f"Files affected: {len(by_file)}")
    print("\nBy file:")
    for file, items in sorted(by_file.items(), key=lambda x: -len(x[1])):
        print(f"  {file}: {len(items)} items")

    return by_file

if __name__ == '__main__':
    analyze_debtmap_output(sys.argv[1])
Shell Script for Threshold Checking
#!/bin/bash
set -e

REPORT="$1"
DEBT_THRESHOLD=100
COMPLEXITY_THRESHOLD=10

# Check debt score
DEBT_SCORE=$(jq '.technical_debt.items | length' "$REPORT")
if [ "$DEBT_SCORE" -gt "$DEBT_THRESHOLD" ]; then
  echo "❌ Debt score $DEBT_SCORE exceeds threshold $DEBT_THRESHOLD"
  exit 1
fi

# Check average complexity
AVG_COMPLEXITY=$(jq '.complexity.summary.average_complexity' "$REPORT")
if (( $(echo "$AVG_COMPLEXITY > $COMPLEXITY_THRESHOLD" | bc -l) )); then
  echo "❌ Average complexity $AVG_COMPLEXITY exceeds threshold $COMPLEXITY_THRESHOLD"
  exit 1
fi

echo "✅ All quality checks passed"
echo "  Debt score: $DEBT_SCORE/$DEBT_THRESHOLD"
echo "  Avg complexity: $AVG_COMPLEXITY"
Editor Integration
VS Code Tasks
Create .vscode/tasks.json:
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Debtmap: Analyze",
      "type": "shell",
      "command": "debtmap",
      "args": [
        "analyze",
        ".",
        "--format",
        "terminal"
      ],
      "problemMatcher": [],
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    },
    {
      "label": "Debtmap: Generate Report",
      "type": "shell",
      "command": "debtmap",
      "args": [
        "analyze",
        ".",
        "--format",
        "markdown",
        "-o",
        "DEBT_REPORT.md"
      ],
      "problemMatcher": []
    }
  ]
}
Problem Matcher for VS Code
Parse debtmap output in VS Code’s Problems panel:
{
  "problemMatcher": {
    "owner": "debtmap",
    "fileLocation": "absolute",
    "pattern": {
      "regexp": "^(.+?):(\\d+):(\\d+)?\\s*-\\s*(.+)$",
      "file": 1,
      "line": 2,
      "column": 3,
      "message": 4
    }
  }
}
Webhook Integration
Send debtmap results to webhooks for notifications:
#!/bin/bash
# Run analysis
debtmap analyze . --format json -o report.json
# Send to Slack
DEBT_SCORE=$(jq '.technical_debt.items | length' report.json)
curl -X POST "$SLACK_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d "{\"text\": \"Debtmap Analysis Complete\n• Debt Score: $DEBT_SCORE\n• High Priority: $(jq '[.technical_debt.items[] | select(.priority == "High")] | length' report.json)\"}"
# Send to custom webhook
curl -X POST "$CUSTOM_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d @report.json
Output Filtering
Debtmap provides several flags to filter and limit output:
Note: Filtering options (`--top`, `--tail`, `--summary`, `--filter`) apply to all output formats (terminal, JSON, and markdown). Filtering happens at the analysis level, before formatting, ensuring consistent results across all output types.
Limiting Results
# Show only top 10 priority items
debtmap analyze . --top 10
# Show bottom 5 lowest priority items
debtmap analyze . --tail 5
Priority Filtering
# Show only high and critical priority items
debtmap analyze . --min-priority high
# Filter by specific debt categories
debtmap analyze . --filter Architecture,Testing
Available categories:
- `Architecture`: God objects, complexity hotspots, dead code
- `Testing`: Testing gaps, coverage issues
- `Performance`: Resource leaks, inefficient patterns
- `CodeQuality`: Code smells, maintainability
Grouping Output
# Group results by debt category
debtmap analyze . --group-by-category
# Combine filters for focused analysis
debtmap analyze . --filter Architecture --min-priority high --top 5
Summary Mode
# Compact tiered priority display
debtmap analyze . --summary
# Combines well with filtering
debtmap analyze . --summary --min-priority medium
Best Practices
When to Use Each Format
Use Terminal Format When:
- Developing locally and reviewing code
- Getting quick feedback on changes
- Presenting results to team members
- Exploring complexity hotspots interactively
Use JSON Format When:
- Integrating with CI/CD pipelines
- Building custom analysis tools
- Tracking metrics over time
- Programmatically processing results
- Feeding into dashboards or monitoring systems
Use Markdown Format When:
- Generating documentation
- Creating PR comments
- Sharing reports with stakeholders
- Archiving analysis results
- Producing executive summaries
Use HTML Format When:
- Viewing analysis in a web browser
- Sharing visual dashboards with stakeholders
- Publishing reports to internal documentation sites
- Creating interactive reports for management review
- Embedding analysis results in web applications
Quick Reference Table
| Format | Best For | Machine Readable | Human Readable | File Extension |
|---|---|---|---|---|
| Terminal | Development | No | Yes | .txt |
| JSON | Automation | Yes | No | .json |
| Markdown | Documentation | Partially | Yes | .md |
| HTML | Visualization | Partially | Yes | .html |
| DOT | Architecture | Yes | Partially | .dot |
Combining Formats
Use multiple formats for comprehensive workflows:
# Generate terminal output for review
debtmap analyze .
# Generate JSON for automation
debtmap analyze . --format json -o ci-report.json
# Generate markdown for documentation
debtmap analyze . --format markdown -o docs/DEBT.md
# Generate DOT for architecture visualization
debtmap analyze . --format dot -o deps.dot && dot -Tsvg deps.dot -o deps.svg
Performance Considerations
- Terminal format: Fastest, minimal overhead
- JSON format: Fast serialization, efficient for large codebases
- Markdown format: Slightly slower due to formatting, but still performant
For very large codebases (>10,000 files), use --top or --filter to limit output size.
Troubleshooting
Common Issues
Colors not showing in terminal:
- Check if terminal supports ANSI colors
- Use the `--plain` flag for ASCII-only output
- Some CI systems may not support color codes
JSON parsing errors:
- Ensure output is complete (check for errors during analysis)
- Validate JSON with `jq` or online validators
- Check for special characters in file paths
Markdown rendering issues:
- Some markdown renderers don’t support all features
- Use standard markdown for maximum compatibility
- Test with pandoc or GitHub/GitLab preview
File encoding problems:
- Ensure UTF-8 encoding for all output files
- Use `--plain` for pure ASCII output
- Check locale settings (`LC_ALL`, `LANG` environment variables)
Exit Codes
Current behavior (as verified in src/main.rs):
- `0`: Successful analysis completed without errors
- Non-zero: Error during analysis (invalid path, parsing error, etc.)
Note: Threshold-based exit codes (where analysis succeeds but fails quality gates) are not currently implemented. The `analyze` command returns 0 on successful analysis regardless of debt scores or complexity thresholds.
To enforce quality gates based on thresholds, use the validate command or parse JSON output:
# Use validate command for threshold enforcement
debtmap validate . --config debtmap.toml
# Or parse JSON output for threshold checking
debtmap analyze . --format json -o report.json
DEBT_SCORE=$(jq '.technical_debt.items | length' report.json)
if [ "$DEBT_SCORE" -gt 100 ]; then
echo "Debt threshold exceeded"
exit 1
fi
See Also
- Getting Started - Basic usage and examples
- Analysis Guide - Understanding metrics and scores
- Configuration - Customizing analysis behavior
TUI Results Viewer Guide
Debtmap includes an interactive TUI (Text User Interface) results viewer that allows you to explore analysis results in detail without re-running the analysis.
Launching the TUI
After running an analysis that produces a results file, launch the TUI viewer:
# Run analysis and save results
debtmap analyze . -o results.json --format json
# Launch interactive TUI viewer
debtmap results results.json
The TUI provides an interactive interface for exploring technical debt items, with detailed views and keyboard navigation.
Navigation
List View (Main Screen)
The main screen shows a list of technical debt items sorted by priority score:
Keyboard Controls:
- `↑`/`↓` or `j`/`k` - Navigate up/down through the list
- `Enter` - View detailed information for the selected item
- `q` - Quit the application
- `?` - Show help
Display Information:
- Item rank and priority score
- Function location (file:line)
- Debt type indicator
- Brief description
Detail View
When you select an item with Enter, the detail view opens with multiple pages of information.
Detail Pages
The detail view provides 5 pages of in-depth analysis for each debt item:
Page 1: Overview
Displays core information about the technical debt item:
- Priority Score: Final unified score and breakdown
- Location: File path, line number, and function name
- Debt Type: Classification (Complexity, Side Effects, etc.)
- Description: Detailed explanation of the issue
- Function Role: Role in the codebase (Entry Point, Core Logic, etc.)
Score Breakdown:
- Base complexity score
- Impact multipliers (downstream dependencies, test coverage)
- Contextual adjustments
- Final unified score
Page 2: Metrics
Shows detailed complexity and quality metrics:
Complexity Metrics:
- Cyclomatic Complexity
- Cognitive Complexity
- Lines of Code (LOC)
- Nesting Depth
Understanding Complexity Metric Labels:
The “Complexity” section uses different metric labels depending on whether you’re viewing a regular function or a god object:
- Regular Functions: Shows individual function metrics with simple labels:
  - `cyclomatic`: The function's own cyclomatic complexity
  - `cognitive`: The function's own cognitive complexity
  - `nesting`: Maximum nesting depth within the function
- God Objects: Shows aggregated metrics with descriptive labels indicating the aggregation strategy:
  - `accumulated cyclomatic`: Sum of cyclomatic complexity across all methods
  - `accumulated cognitive`: Sum of cognitive complexity across all methods
  - `max nesting`: Maximum nesting depth found in any method
The descriptive labels (“accumulated” for sums, “max” for maximum) clearly indicate how the metrics are calculated across all functions in the class. This helps identify classes that have grown too large and may need to be split into smaller, more focused components.
Quality Indicators:
- Maintainability Index
- Halstead Metrics (if available)
- Code duplication percentage
- Comment density
Boilerplate Analysis:
- Boilerplate percentage
- Pattern repetition score
- Effective complexity (adjusted for boilerplate)
Page 3: Recommendations
Provides actionable refactoring guidance:
Refactoring Strategy:
- Recommended approach (Extract Method, Decompose, etc.)
- Estimated effort (hours)
- Expected benefit/improvement
Step-by-Step Actions:
- Specific refactoring steps
- Pattern recommendations
- Testing strategies
Code Examples:
- Before/after snippets (when applicable)
- Suggested function signatures
- Pattern implementations
Page 4: Context
Displays contextual information about the function’s role and impact:
Call Graph Analysis:
- Upstream callers (who calls this function)
- Downstream dependencies (what this function calls)
- Depth in call graph
Module Dependencies:
- Module coupling metrics
- Circular dependency warnings
- Cross-module calls
Git History Insights:
- Change frequency
- Recent modifications
- Contributing authors
- Historical complexity trend
Page 5: Data Flow
Shows detailed data flow analysis for understanding mutations, I/O operations, and variable escape behavior.
Mutation Analysis:
- Total Mutations: Count of all variable mutations in the function
- Live Mutations: Variables that are mutated and their new values are used
- Dead Stores: Variables that are assigned but never read (optimization opportunity)
Example display:
Total Mutations: 5
Live Mutations: 2
Dead Stores: 1
Live Mutations:
• counter
• state
Dead Stores:
• temp (never read)
I/O Operations:
- Detected I/O operations by type (File, Network, Database, etc.)
- Line numbers where I/O occurs
- Variables involved in I/O
Example display:
File Read at line 105 (variables: file)
Network Call at line 110 (variables: socket)
Database Query at line 120 (variables: db)
Escape Analysis:
- Escaping Variables: Variables whose values escape the function scope
- Return Dependencies: Variables that affect the return value
Example display:
Escaping Variables: 2
Variables affecting return value:
• result
• accumulator
Purity Analysis:
- Is Pure: Whether the function is pure (no side effects, deterministic)
- Confidence: Confidence level of the purity assessment (0-100%)
- Impurity Reasons: Specific reasons why the function is not pure
Example display:
Is Pure: No
Confidence: 95.0%
Impurity Reasons:
• Mutates shared state
• Performs I/O operations
• Calls impure functions
Navigation in Data Flow Page:
- Use `←`/`→` keys to switch between detail pages
- All data flow insights help identify refactoring opportunities
- Pure functions are easier to test and reason about
- High mutation counts suggest opportunities for functional refactoring
Page Navigation
In Detail View:
- `←`/`→` or `h`/`l` - Switch between detail pages (1-5)
- `Esc` or `q` - Return to list view
- `?` - Show help
Page Indicator: The bottom of the screen shows which page you’re viewing:
[1/5] Overview [2/5] Metrics [3/5] Recommendations [4/5] Context [5/5] Data Flow
Filtering and Sorting
The TUI supports filtering results by various criteria:
Filter by Debt Type
# Only show complexity issues
debtmap results results.json --filter complexity
# Only show side effect issues
debtmap results results.json --filter side-effects
Sort Options
Results are pre-sorted by priority score, but you can change the sort order:
# Sort by cyclomatic complexity
debtmap results results.json --sort complexity
# Sort by lines of code
debtmap results results.json --sort loc
# Sort by file name
debtmap results results.json --sort file
Theme Customization
The TUI supports custom color themes for better readability:
# Use light theme
debtmap results results.json --theme light
# Use dark theme (default)
debtmap results results.json --theme dark
# Use high-contrast theme
debtmap results results.json --theme contrast
Export from TUI
While viewing results in the TUI, you can export specific items:
- `e` - Export current item to markdown
- `E` - Export all visible items to markdown
The exported markdown includes all detail page information in a readable format.
Tips and Best Practices
1. Use Data Flow Page: The data flow page (Page 5) provides deep insights into mutations and I/O operations, helping identify refactoring opportunities for functional programming patterns.
2. Focus on Top Items: Start with the highest-priority items (top 5-10) for maximum impact.
3. Review Recommendations: Always check Page 3 (Recommendations) for specific refactoring steps before making changes.
4. Check Context: Use Page 4 (Context) to understand the function's role before refactoring to avoid breaking critical paths.
5. Understand Purity: Pure functions (shown in the Data Flow page) are easier to test and maintain. High mutation counts indicate opportunities for functional refactoring.
6. Look for Dead Stores: Variables in the "Dead Stores" section can be removed to simplify code.
7. Identify I/O Boundaries: Functions with many I/O operations should isolate I/O from business logic.
8. Track Git History: Frequently changing functions (shown in the Context page) may indicate design issues.
Troubleshooting
TUI Won’t Launch
Ensure you have a valid results file:
# Check file exists and is valid JSON
cat results.json | jq .
Colors Not Showing
Some terminals don’t support 256-color mode. Try:
# Use basic colors
export TERM=xterm-color
debtmap results results.json
Navigation Keys Not Working
Ensure your terminal supports the required key codes. Try alternative keys:
- Use `j`/`k` instead of arrow keys for up/down
- Use `h`/`l` instead of arrow keys for left/right
Integration with Workflows
CI/CD Integration
Generate results in CI and review locally:
# In CI pipeline
debtmap analyze . -o ci-results.json --format json
# Download and review locally
scp ci-server:ci-results.json .
debtmap results ci-results.json
Team Review
Share results files with team members:
# Generate results
debtmap analyze . -o team-review.json --format json
# Commit to repository (small file)
git add team-review.json
git commit -m "Add debtmap analysis results for review"
# Team members can view
debtmap results team-review.json
Progressive Refactoring
Track progress across refactoring sessions:
# Initial analysis
debtmap analyze . -o before.json --format json
# After refactoring
debtmap analyze . -o after.json --format json
# Compare results
debtmap compare before.json after.json
Advanced Features
Custom Data Retention
Control how much detail is stored in results:
# Minimal results (scores only)
debtmap analyze . -o minimal.json --detail-level minimal
# Full results (all metrics and context)
debtmap analyze . -o full.json --detail-level full
Performance Optimization
For large codebases, optimize TUI performance:
# Limit results to top N items
debtmap analyze . -o top100.json --top 100
# View subset
debtmap results top100.json
See Also
- Output Formats - Other output format options
- CLI Reference - Complete command reference
- Configuration - Customization options
Architecture
This chapter explains how debtmap’s analysis pipeline works, from discovering files to producing prioritized technical debt signals.
Analysis Pipeline Overview
Debtmap’s analysis follows a multi-stage pipeline that transforms source code into structured signals:
┌─────────────────┐
│ File Discovery  │
└────────┬────────┘
         │
         ▼
┌──────────────────┐
│Language Detection│
└────────┬─────────┘
         │
         ▼
    ┌────────┐
    │ Parser │
    └────┬───┘
         │
  ┌──────┼────────────┐
  │      │            │
  ▼      ▼            ▼
┌─────┐ ┌──────────┐ ┌───────────┐
│ syn │ │rustpython│ │tree-sitter│
│ AST │ │   AST    │ │    AST    │
└──┬──┘ └────┬─────┘ └─────┬─────┘
   │         │             │
   └─────────┼─────────────┘
             │
             ▼
   ┌───────────────────┐
   │ Metric Extraction │
   └─────────┬─────────┘
             │
     ┌───────┼────────┐
     │       │        │
     ▼       ▼        ▼
┌──────────┐ ┌─────┐ ┌─────────┐
│Complexity│ │Call │ │ Pattern │
│   Calc   │ │Graph│ │Detection│
└────┬─────┘ └──┬──┘ └────┬────┘
     │          │         │
     ▼          │         │
┌─────────┐     │         │
│ Entropy │     │         │
│ Analysis│     │         │
└────┬────┘     │         │
     │          │         │
     ▼          ▼         ▼
┌──────────┐ ┌──────────┐ ┌──────┐ ┌──────────┐
│Effective │ │Dependency│ │ Debt │ │   LCOV   │
│Complexity│ │ Analysis │ │Class │ │ Coverage │
└────┬─────┘ └────┬─────┘ └──┬───┘ └────┬─────┘
     │            │          │          │
     └────────────┴────┬─────┴──────────┘
                       │
                       ▼
              ┌─────────────────┐
              │  Risk Scoring   │
              └────────┬────────┘
                       │
                       ▼
            ┌──────────────────────┐
            │Tiered Prioritization │
            └──────────┬───────────┘
                       │
                       ▼
            ┌──────────────────────┐
            │  Context Suggestion  │
            │      Generation      │
            └──────────┬───────────┘
                       │
                       ▼
             ┌─────────────────┐
             │Output Formatting│
             └────────┬────────┘
                      │
           ┌──────────┼──────────┐
           │          │          │
           ▼          ▼          ▼
       ┌──────┐  ┌────────┐  ┌──────┐
       │ JSON │  │ LLM-MD │  │ Term │
       └──────┘  └────────┘  └──────┘
Key Components
1. File Discovery and Language Detection
Purpose: Identify source files to analyze and determine their language.
How it works:
- Walks the project directory tree (respecting `.gitignore` and `.debtmapignore`)
- Detects language based on file extension (`.rs`, `.py`, `.js`, `.ts`)
- Filters out test files, build artifacts, and vendored dependencies
- Groups files by language for parallel processing
Configuration:
[analysis]
exclude_patterns = ["**/tests/**", "**/target/**", "**/node_modules/**"]
include_patterns = ["src/**/*.rs", "lib/**/*.py"]
2. Parser Layer
Purpose: Convert source code into Abstract Syntax Trees (ASTs) for analysis.
Language-Specific Parsers:
Rust (syn):
- Uses the `syn` crate for full Rust syntax support
- Extracts: functions, structs, impls, traits, macros
- Handles: async/await, generic types, lifetime annotations
- Performance: ~10-20ms per file
Python (rustpython):
- Uses rustpython’s parser for Python 3.x syntax
- Extracts: functions, classes, methods, decorators
- Handles: comprehensions, async/await, type hints
- Performance: ~5-15ms per file
JavaScript/TypeScript (tree-sitter):
- Uses tree-sitter for JS/TS parsing
- Extracts: functions, classes, arrow functions, hooks
- Handles: JSX/TSX, decorators, generics
- Performance: ~8-18ms per file
Error Handling:
- Syntax errors logged but don’t stop analysis
- Partial ASTs used when possible
- Files with parse errors excluded from final report
3. Metric Extraction
Purpose: Extract raw metrics from ASTs.
Metrics Computed:
Function-Level:
- Lines of code (LOC)
- Cyclomatic complexity (branch count)
- Nesting depth (max indentation level)
- Parameter count
- Return path count
- Comment ratio
File-Level:
- Total LOC
- Number of functions/classes
- Dependency count (imports)
- Documentation coverage
Implementation:
pub struct FunctionMetrics {
    pub name: String,
    pub location: Location,
    pub loc: u32,
    pub cyclomatic_complexity: u32,
    pub nesting_depth: u32,
    pub parameter_count: u32,
    pub return_paths: u32,
}
4. Complexity Calculation and Entropy Analysis
Purpose: Compute effective complexity using entropy-adjusted metrics.
Traditional Cyclomatic Complexity:
- Count decision points (if, match, loop, etc.)
- Each branch adds +1 to complexity
- Does not distinguish between repetitive and varied logic
Entropy-Based Adjustment:
Debtmap calculates pattern entropy to adjust cyclomatic complexity:
- Extract patterns - Identify branch structures (e.g., all if/return patterns)
- Calculate variety - Measure information entropy of patterns
- Adjust complexity - Reduce score for low-entropy (repetitive) code
Formula:
Entropy = -Σ(p_i * log2(p_i))
where p_i = frequency of pattern i
Effective Complexity = Cyclomatic * (1 - (1 - Entropy/Max_Entropy) * 0.75)
Example:
// 20 similar if/return statements
// Cyclomatic: 20, Entropy: 0.3
// Effective: 20 * (1 - (1 - 0.3/4.32) * 0.75) ≈ 6.0
This approach reduces false positives from validation/configuration code while still flagging genuinely complex logic.
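A compact sketch of both formulas (illustrative only; how debtmap extracts pattern counts and chooses `Max_Entropy` is simplified away here):

// Shannon entropy over branch-pattern frequencies, then the adjustment
// formula from above. `counts[i]` is how often pattern i appears.
fn pattern_entropy(counts: &[usize]) -> f64 {
    let total = counts.iter().sum::<usize>() as f64;
    -counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            p * p.log2()
        })
        .sum::<f64>()
}

fn effective_complexity(cyclomatic: u32, entropy: f64, max_entropy: f64) -> f64 {
    f64::from(cyclomatic) * (1.0 - (1.0 - entropy / max_entropy) * 0.75)
}

// effective_complexity(20, 0.3, 4.32) ≈ 6.0, matching the example above.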
5. Call Graph Construction
Purpose: Understand function dependencies and identify critical paths.
What’s Tracked:
- Function calls within the same file
- Cross-file calls (when possible to resolve)
- Method calls on structs/classes
- Trait/interface implementations
Analysis:
- Fan-in: How many functions call this function
- Fan-out: How many functions this function calls
- Depth: Distance from entry points (main, handlers)
- Cycles: Detect recursive calls
Usage:
- Prioritize functions called from many untested paths
- Identify central functions (high fan-in/fan-out)
- Detect test coverage gaps in critical paths
Limitations:
- Dynamic dispatch not fully resolved
- Cross-crate calls require additional analysis
- Closures and function pointers approximated
6. Pattern Detection and Debt Classification
Purpose: Identify specific technical debt patterns.
Debt Categories:
Test Gaps:
- Functions with 0% coverage and high complexity
- Untested error paths
- Missing edge case tests
Complexity Issues:
- Functions exceeding thresholds (default: 10)
- Deep nesting (3+ levels)
- Long functions (200+ LOC)
Design Smells:
- God functions (high fan-out)
- Unused code (fan-in = 0)
- Circular dependencies
Implementation:
pub enum DebtType {
    TestingGap { coverage: f64, cyclomatic: u32, cognitive: u32 },
    ComplexityHotspot { cyclomatic: u32, cognitive: u32 },
    DeadCode { visibility: FunctionVisibility, cyclomatic: u32, cognitive: u32, usage_hints: Vec<String> },
    GodObject { methods: u32, fields: Option<u32>, responsibilities: u32, god_object_score: f64, lines: u32 },
    // ... 35 total variants
}
7. Coverage Integration
Purpose: Map test coverage data to complexity metrics for risk scoring.
Coverage Data Flow:
- Read LCOV file - Parse coverage report from test runners
- Map to source - Match coverage lines to functions/branches
- Calculate coverage % - For each function, compute:
- Line coverage: % of lines executed
- Branch coverage: % of branches taken
- Identify gaps - Find untested branches in complex functions
Coverage Scoring:
pub struct CoverageMetrics {
    pub lines_covered: u32,
    pub lines_total: u32,
    pub branches_covered: u32,
    pub branches_total: u32,
    pub coverage_percent: f64,
}
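Deriving the percentage from these counts is straightforward; a hypothetical helper, not the actual implementation:

// Assumed convention: a function with no executable lines counts as covered.
fn line_coverage_percent(lines_covered: u32, lines_total: u32) -> f64 {
    if lines_total == 0 {
        100.0
    } else {
        f64::from(lines_covered) / f64::from(lines_total) * 100.0
    }
}

// e.g. 5 of 5 lines covered -> 100.0; 3 of 4 branches covered -> 75.0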
Special Cases:
- Entry points (main, handlers) expect integration test coverage
- Generated code excluded from coverage requirements
- Test files themselves not analyzed for coverage
8. Risk Scoring
Purpose: Combine complexity and coverage into a unified risk score.
Risk Formula:
Risk Score = (Effective Complexity * Coverage Gap Weight) + (Call Graph Depth * Path Weight)
where:
- Effective Complexity: Entropy-adjusted cyclomatic complexity
- Coverage Gap Weight: 1.0 for 0% coverage, decreasing to 0.1 for 95%+
- Call Graph Depth: Distance from entry points
- Path Weight: Number of untested paths leading to this function
Example Calculation:
Effective Complexity: 8.5
Coverage: 30%
Coverage Gap Weight: 0.7
Call Graph Depth: 3
Untested Paths: 2

Risk = (8.5 * 0.7) + (3 * 2 * 0.3) = 5.95 + 1.8 = 7.75
Risk Tiers (Unified Score 0-10):
- Critical (9.0-10.0): Severe risk requiring immediate attention
- High (7.0-8.9): Significant risk, address this sprint
- Medium (5.0-6.9): Moderate risk, plan for next sprint
- Low (3.0-4.9): Minor risk, monitor
- Minimal (0.0-2.9): Well-managed code
9. Tiered Prioritization
Purpose: Classify and rank technical debt items by severity.
Prioritization Algorithm:
1. Calculate base risk score (from the Risk Scoring step)
2. Apply context adjustments:
   - Entry points: -2.0 score (integration test coverage expected)
   - Core business logic: +1.5 score (higher priority)
   - Frequently changed files: +1.0 score (git history analysis)
   - Critical paths: +0.5 score per untested caller
3. Classify into tiers (see the sketch after this list):
   - Critical: score >= 9.0
   - High: score >= 7.0
   - Medium: score >= 5.0
   - Low: score >= 3.0
   - Minimal: score < 3.0
4. Sort within tiers by:
   - Severity score
   - Coupling impact
   - File location (group related items)
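The classification step maps the adjusted score onto tiers; a minimal standalone sketch of that mapping (the real `Tier` type lives in debtmap's source):

#[derive(Debug)]
enum Tier {
    Critical,
    High,
    Medium,
    Low,
    Minimal,
}

// Maps a unified 0-10 score onto the tier boundaries listed above.
fn classify_tier(score: f64) -> Tier {
    if score >= 9.0 {
        Tier::Critical
    } else if score >= 7.0 {
        Tier::High
    } else if score >= 5.0 {
        Tier::Medium
    } else if score >= 3.0 {
        Tier::Low
    } else {
        Tier::Minimal
    }
}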
Output:
pub struct PrioritizedDebtItem {
    pub rank: u32,
    pub score: f64,
    pub tier: Tier,
    pub location: Location,
    pub debt_type: DebtType,
    pub metrics: ComplexityMetrics,
    pub coverage: Option<CoverageMetrics>,
    pub context: ContextSuggestion,
}
See Tiered Prioritization for detailed explanation of the ranking algorithm.
10. Context Suggestion Generation
Purpose: Provide AI agents with specific file ranges to read for understanding the debt item.
Context Types:
Primary Context:
- The function/struct where debt is located
- Start and end line numbers
- File path
Related Context:
- Callers: Functions that call this function
- Callees: Functions this function calls
- Tests: Existing test files that cover related code
- Types: Struct/enum definitions used by this function
Selection Algorithm:
- Include primary location (always)
- Add top 3 callers by call frequency
- Add callees that are untested
- Add test files with matching function names
- Limit total context to ~500 lines (configurable)
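A hypothetical sketch of these selection rules; all types, field names, and the trimming strategy here are invented for illustration and are not debtmap's actual API:

#[derive(Clone)]
struct Span { file: String, start: u32, end: u32 }

struct Caller { span: Span, call_count: u32 }
struct Callee { span: Span, is_tested: bool }

fn lines(s: &Span) -> u32 { s.end - s.start + 1 }

fn select_context(
    primary: Span,
    mut callers: Vec<Caller>,
    callees: Vec<Callee>,
    tests: Vec<Span>,
    line_budget: u32, // ~500 by default
) -> Vec<Span> {
    let mut ctx = vec![primary];                              // always include primary
    callers.sort_by_key(|c| std::cmp::Reverse(c.call_count)); // most-called first
    ctx.extend(callers.into_iter().take(3).map(|c| c.span));  // top 3 callers
    ctx.extend(callees.into_iter().filter(|c| !c.is_tested).map(|c| c.span)); // untested callees
    ctx.extend(tests);                                        // matching test files
    // Trim from the end until the total stays within the line budget.
    while ctx.len() > 1 && ctx.iter().map(lines).sum::<u32>() > line_budget {
        ctx.pop();
    }
    ctx
}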
Output Format:
CONTEXT:
├─ Primary: src/parser.rs:38-85
├─ Caller: src/handler.rs:100-120 (calls 12x)
├─ Caller: src/api.rs:45-60 (calls 8x)
├─ Callee: src/tokenizer.rs:15-40 (untested)
└─ Test: tests/parser_test.rs:50-120
11. Output Formatting
Purpose: Present analysis results in formats optimized for different consumers.
Output Formats:
LLM Markdown (`--format markdown`):
- Structured for minimal token usage
- Context suggestions included inline
- Metrics in consistent tabular format
- Designed for piping to AI assistants
JSON (`--format json`):
- Machine-readable for CI/CD integration
- Full metadata for each debt item
- Stable schema for programmatic consumption
- Schema-versioned for compatibility
Terminal (`--format terminal`):
- Color-coded by tier (red=critical, yellow=high, etc.)
- Hierarchical tree view with unicode box characters
- Progress bars for analysis phases
- Summary statistics at top
Markdown (`--format markdown`):
- Rendered in GitHub/GitLab for PR comments
- Embedded code blocks with syntax highlighting
- Collapsible details sections
- Linked to source code locations
See Output Formats for examples and configuration options.
Data Flow Example
Let’s trace a single function through the entire pipeline:
Input: Source File
// src/handlers.rs
pub fn process_request(req: Request) -> Result<Response> {
    validate_auth(&req)?;
    let data = parse_payload(&req.body)?;
    let result = apply_business_logic(data)?;
    format_response(result)
}
Stage 1: Parsing
FunctionAst {
    name: "process_request",
    location: Location { file: "src/handlers.rs", line: 2 },
    calls: ["validate_auth", "parse_payload", "apply_business_logic", "format_response"],
    ...
}
Stage 2: Metric Extraction
FunctionMetrics {
    name: "process_request",
    cyclomatic_complexity: 4, // 3 ?-operators + base
    nesting_depth: 1,
    loc: 5,
    ...
}
Stage 3: Entropy Analysis
// Pattern: repetitive ?-operator error handling
Entropy: 0.4 (low variety)
Effective Complexity: 4 * 0.85 = 3.4
Stage 4: Call Graph
CallGraphNode {
    function: "process_request",
    fan_in: 3,  // called from 3 handlers
    fan_out: 4, // calls 4 functions
    depth: 1,   // direct handler (entry point)
}
Stage 5: Coverage (from LCOV)
CoverageMetrics {
    lines_covered: 5,
    lines_total: 5,
    branches_covered: 3,
    branches_total: 4, // Missing one error path
    coverage_percent: 75.0,
}
Stage 6: Risk Scoring
Risk = (3.4 * 0.25) + (1 * 1 * 0.2) = 0.85 + 0.2 = 1.05
Tier: LOW (entry point with decent coverage)
Stage 7: Context Suggestion
CONTEXT:
├─ Primary: src/handlers.rs:2-6
├─ Callee: src/auth.rs:15-30 (validate_auth)
└─ Test: tests/integration/handlers_test.rs:10-25
Stage 8: Output
#23 SCORE: 1.1 [LOW]
├─ src/handlers.rs:2 process_request()
├─ COMPLEXITY: cyclomatic=4, cognitive=3, nesting=1
├─ COVERAGE: 75% (1 branch untested)
└─ CONTEXT: Primary + 1 callee + 1 test file
Performance Characteristics
Analysis Speed:
- Small project (< 10k LOC): 1-3 seconds
- Medium project (10-50k LOC): 5-15 seconds
- Large project (50-200k LOC): 20-60 seconds
- Very large project (200k+ LOC): 1-5 minutes
Parallelization:
- File parsing: Parallel across all available cores
- Metric extraction: Parallel per-file
- Call graph construction: Parallel with work stealing
- Risk scoring: Parallel per-function
- Output formatting: Sequential
Memory Usage:
- Approx 100-200 KB per file analyzed
- Peak memory for large projects: 500 MB - 1 GB
- Streaming mode available for very large codebases
Optimization Strategies:
- Skip unchanged files (git diff integration)
- Parallel processing with rayon
- Efficient AST traversal (visitor pattern)
- Memory-efficient streaming for large codebases
Extension Points
Custom Analyzers:
Implement the Analyzer trait to add language support:
pub trait Analyzer {
    fn parse(&self, content: &str) -> Result<Ast>;
    fn extract_metrics(&self, ast: &Ast) -> Vec<FunctionMetrics>;
    fn detect_patterns(&self, ast: &Ast) -> Vec<DebtPattern>;
}
Custom Scoring:
Implement the RiskScorer trait to adjust scoring logic:
pub trait RiskScorer {
    fn calculate_risk(&self, metrics: &FunctionMetrics, coverage: &CoverageMetrics) -> f64;
    fn classify_tier(&self, score: f64) -> Tier;
}
Custom Output:
Implement the OutputFormatter trait for new formats:
pub trait OutputFormatter {
    fn format(&self, items: &[PrioritizedDebtItem]) -> Result<String>;
}
Next Steps
- Understand prioritization: See Tiered Prioritization
- Learn scoring strategies: See Scoring Strategies
- Configure analysis: See Configuration
- Integrate with AI: See LLM Integration
Architectural Analysis
Debtmap provides comprehensive architectural analysis capabilities based on Robert C. Martin’s software engineering principles. These tools help identify structural issues, coupling problems, and architectural anti-patterns in your codebase.
Overview
Architectural analysis examines module-level relationships and dependencies to identify:
- Circular Dependencies - Modules that create dependency cycles
- Coupling Metrics - Afferent and efferent coupling measurements
- Bidirectional Dependencies - Inappropriate intimacy between modules
- Stable Dependencies Principle Violations - Unstable modules being depended upon
- Zone of Pain - Rigid, concrete implementations heavily depended upon
- Zone of Uselessness - Overly abstract, unstable modules
- Code Duplication - Identical or similar code blocks across files
These analyses help you maintain clean architecture and identify refactoring opportunities.
Circular Dependency Detection
Circular dependencies occur when modules form a dependency cycle (A depends on B, B depends on C, C depends on A). These violations break architectural boundaries and make code harder to understand, test, and maintain.
How It Works
Debtmap builds a dependency graph from module imports and uses depth-first search (DFS) with recursion stack tracking to detect cycles:
- Parse all files to extract import/module dependencies
- Build a directed graph where nodes are modules and edges are dependencies
- Run DFS from each unvisited module
- Track visited nodes and recursion stack
- When a node is reached that’s already in the recursion stack, a cycle is detected
Implementation: src/debt/circular.rs:46-66 (detect_circular_dependencies)
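The core of the check can be sketched as a recursive DFS that reports the stack slice when it revisits a node already on the recursion stack; this is an illustration of the approach, not the actual detect_circular_dependencies code:

use std::collections::{HashMap, HashSet};

fn find_cycle(
    node: &str,
    graph: &HashMap<&str, Vec<&str>>,
    visited: &mut HashSet<String>,
    stack: &mut Vec<String>,
) -> Option<Vec<String>> {
    visited.insert(node.to_string());
    stack.push(node.to_string());
    if let Some(deps) = graph.get(node) {
        for &dep in deps {
            // Dependency already on the recursion stack: we found a cycle.
            if let Some(pos) = stack.iter().position(|n| n == dep) {
                let mut cycle = stack[pos..].to_vec();
                cycle.push(dep.to_string());
                return Some(cycle); // e.g. ["auth", "session", "auth"]
            }
            if !visited.contains(dep) {
                if let Some(cycle) = find_cycle(dep, graph, visited, stack) {
                    return Some(cycle);
                }
            }
        }
    }
    stack.pop();
    None
}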
Example
// Module A (src/auth.rs)
use crate::user::User;
use crate::session::validate_session;

// Module B (src/user.rs)
use crate::session::Session;

// Module C (src/session.rs)
use crate::auth::authenticate; // Creates cycle: auth → session → auth
Debtmap detects:
Circular dependency detected: auth → session → auth
Refactoring Recommendations
To break circular dependencies:
- Extract Interface - Create a trait that both modules depend on
- Dependency Inversion - Introduce an abstraction layer
- Move Shared Code - Extract common functionality to a new module
- Remove Dependency - Inline or duplicate small amounts of code
Coupling Metrics
Coupling metrics measure how interconnected modules are. Debtmap calculates two primary metrics:
Afferent Coupling (Ca)
Afferent coupling is the number of modules that depend on this module. High afferent coupling means many modules rely on this code.
pub struct CouplingMetrics {
    pub module: String,
    pub afferent_coupling: usize, // Number of modules depending on this module
    pub efferent_coupling: usize, // Number of modules this module depends on
    pub instability: f64,         // Calculated from Ca and Ce
    pub abstractness: f64,        // Ratio of abstract types
}
Implementation: src/debt/coupling.rs:6-30
Efferent Coupling (Ce)
Efferent coupling is the number of modules this module depends on. High efferent coupling means this module has many dependencies.
Note on Abstractness: The abstractness field in CouplingMetrics requires advanced type analysis to calculate properly. The current implementation uses a placeholder value (0.0) as full abstractness calculation would need semantic analysis of trait definitions, abstract types, and implementation ratios. This is similar to the cohesion analysis limitation documented below (see “Cohesion Analysis” section).
Source: src/debt/coupling.rs:44
Example Coupling Analysis
Module: api_handler
Afferent coupling (Ca): 8 // 8 modules depend on api_handler
Efferent coupling (Ce): 3 // api_handler depends on 3 modules
Instability: 0.27 // Relatively stable
High afferent or efferent coupling (typically >5) indicates potential maintainability issues.
Instability Metric
The instability metric measures how resistant a module is to change. It’s calculated as:
I = Ce / (Ca + Ce)
Interpretation:
- I = 0.0 - Maximally stable (no dependencies, many dependents)
- I = 1.0 - Maximally unstable (many dependencies, no dependents)
Implementation: src/debt/coupling.rs:16-24 (calculate_instability)
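A minimal sketch of the formula (the zero-coupling guard is an assumed convention here, not confirmed from the source):

fn calculate_instability(afferent: usize, efferent: usize) -> f64 {
    let total = afferent + efferent;
    if total == 0 {
        0.0 // isolated module: treat as maximally stable (assumption)
    } else {
        efferent as f64 / total as f64
    }
}

// api_handler from the earlier example: Ca = 8, Ce = 3 -> I = 3/11 ≈ 0.27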
Stability Guidelines
- Stable modules (I < 0.3) - Hard to change but depended upon; should contain stable abstractions
- Balanced modules (0.3 ≤ I ≤ 0.7) - Normal modules with both dependencies and dependents
- Unstable modules (I > 0.7) - Change frequently; should have few or no dependents
Example
// Stable module (I = 0.1)
// core/types.rs - defines fundamental types, depended on by 20 modules
pub struct User { ... }
pub struct Session { ... }

// Unstable module (I = 0.9)
// handlers/admin_dashboard.rs - depends on 10 modules, no dependents
use crate::auth::*;
use crate::database::*;
use crate::templates::*;
// ... 7 more imports
Stable Dependencies Principle
The Stable Dependencies Principle (SDP) states: Depend in the direction of stability. Modules should depend on modules that are more stable than themselves.
SDP Violations
Debtmap flags violations when a module has:
- Instability > 0.8 (very unstable)
- Afferent coupling > 2 (multiple modules depend on it)
This means an unstable, frequently changing module is being depended upon by multiple other modules - a recipe for maintenance problems.
Implementation: src/debt/coupling.rs:69-76
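Expressed as a predicate over the two metrics (a sketch using the thresholds above, not the actual source):

fn violates_sdp(instability: f64, afferent_coupling: usize) -> bool {
    instability > 0.8 && afferent_coupling > 2
}

// temp_utils in the example below: I = 0.85, Ca = 5 -> violation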
Example Violation
Module 'temp_utils' violates Stable Dependencies Principle
(instability: 0.85, depended on by 5 modules)
Problem: This module changes frequently but is heavily depended upon.
Solution: Extract stable interface or reduce dependencies on this module.
Fixing SDP Violations
- Increase stability - Reduce the module’s dependencies
- Reduce afferent coupling - Extract interface, use dependency injection
- Split module - Separate stable and unstable parts
Bidirectional Dependencies
Bidirectional dependencies (also called inappropriate intimacy) occur when two modules depend on each other:
Module A depends on Module B
Module B depends on Module A
This creates tight coupling and makes both modules harder to change, test, or reuse independently.
Implementation: src/debt/coupling.rs:98-117 (detect_inappropriate_intimacy)
Example
// order.rs
use crate::customer::Customer;

pub struct Order {
    customer: Customer,
}

// customer.rs
use crate::order::Order; // Bidirectional dependency!

pub struct Customer {
    orders: Vec<Order>,
}
Debtmap detects:
Inappropriate intimacy detected between 'order' and 'customer'
Refactoring Recommendations
- Create Mediator - Introduce a third module to manage the relationship
- Break into Separate Modules - Split concerns more clearly
- Use Events - Replace direct dependencies with event-driven communication
- Dependency Inversion - Introduce interfaces/traits both depend on
Zone of Pain Detection
The zone of pain contains modules with:
- Low abstractness (< 0.2) - Concrete implementations, no abstractions
- Low instability (< 0.2) - Stable, hard to change
- High afferent coupling (> 3) - Many modules depend on them
These modules are rigid concrete implementations that are heavily used but hard to change - causing pain when modifications are needed.
Implementation: src/debt/coupling.rs:125-138
Example
Module 'database_client' is in the zone of pain (rigid and hard to change)
Abstractness: 0.1 (all concrete implementation)
Instability: 0.15 (very stable, many dependents)
Afferent coupling: 12 (12 modules depend on it)
Problem: This concrete database client is used everywhere.
Any change to its implementation requires updating many modules.
Refactoring Recommendations
- Extract Interfaces - Create a `DatabaseClient` trait
- Introduce Abstractions - Define abstract operations others depend on
- Break into Smaller Modules - Separate concerns to reduce coupling
- Use Dependency Injection - Pass implementations via interfaces
Zone of Uselessness Detection
The zone of uselessness contains modules with:
- High abstractness (> 0.8) - Mostly abstract, few concrete implementations
- High instability (> 0.8) - Frequently changing
These modules are overly abstract and unstable, providing little stable value to the system.
Implementation: src/debt/coupling.rs:141-153
Example
Module 'base_processor' is in the zone of uselessness
(too abstract and unstable)
Abstractness: 0.9 (mostly traits and interfaces)
Instability: 0.85 (changes frequently)
Problem: This module defines many abstractions but provides little
concrete value. It changes often, breaking implementations.
Refactoring Recommendations
- Add Concrete Implementations - Make the module useful by implementing functionality
- Remove if Unused - Delete if no real value is provided
- Stabilize Interfaces - Stop changing abstractions frequently
- Merge with Implementations - Combine abstract and concrete code
Distance from Main Sequence
The main sequence represents the ideal balance between abstractness and instability. Modules should lie on the line:
A + I = 1
Where:
- A = Abstractness (ratio of abstract types to total types)
- I = Instability (Ce / (Ca + Ce))
Distance from the main sequence:
D = |A + I - 1|
Implementation: src/debt/coupling.rs:119-123
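As code, the distance metric is a one-liner (illustrative):

fn distance_from_main_sequence(abstractness: f64, instability: f64) -> f64 {
    (abstractness + instability - 1.0).abs()
}

// Example from the Zone of Pain section below:
// A = 0.05, I = 0.12 -> D = |0.05 + 0.12 - 1| = 0.83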
Interpretation
- D ≈ 0.0 - Module is on the main sequence (ideal)
- D > 0.5 - Module is far from ideal
- High D with low A and I → Zone of Pain
- High D with high A and I → Zone of Uselessness
Visual Representation
Abstractness
 1.0 ┤╲      Zone of Uselessness
     │ ╲     (high A, high I)
     │  ╲
 0.5 ┤   ╲   Main Sequence (A + I = 1)
     │    ╲
     │     ╲
 0.0 ┤──────╲─────────────────
     0.0    0.5         1.0
             Instability

 Zone of Pain (low A, low I)
Code Duplication Detection
Debtmap detects code duplication using hash-based chunk comparison:
- Extract chunks - Split files into fixed-size chunks (default: 50 lines)
- Normalize - Remove whitespace and comments
- Calculate hash - Compute xxHash64 hash for each normalized chunk (10-20x faster than SHA-256)
- Match duplicates - Find chunks with identical hashes
- Merge adjacent - Consolidate consecutive duplicate blocks
Note: The minimum chunk size is configurable via the --threshold-duplication flag or in .debtmap.toml (default: 50 lines).
Implementation: src/debt/duplication.rs:6-44 (detect_duplication)
Algorithm Details
pub fn detect_duplication(
    files: Vec<(PathBuf, String)>,
    min_lines: usize,           // Default: 50
    _similarity_threshold: f64, // Currently unused (exact matching)
) -> Vec<DuplicationBlock>
The algorithm:
- Extracts overlapping chunks from each file
- Normalizes by trimming whitespace and removing comments
- Calculates an xxHash64 hash for each normalized chunk (returns a `u64` value)
- Groups chunks by hash using thread-safe concurrent aggregation
- Returns groups with 2+ locations (duplicates found)
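A condensed sketch of the normalize-hash-group steps, assuming the `xxhash-rust` crate (with the `xxh64` feature) for the hash; the real chunk extraction and adjacent-block merging are more involved:

use std::collections::HashMap;
use xxhash_rust::xxh64::xxh64;

// Normalize a chunk: trim whitespace and drop comment/empty lines.
fn normalize(lines: &[&str]) -> String {
    lines
        .iter()
        .map(|l| l.trim())
        .filter(|l| !l.is_empty() && !l.starts_with("//"))
        .collect::<Vec<_>>()
        .join("\n")
}

// Group chunk locations by the hash of their normalized content and keep
// only hashes seen at 2+ locations (i.e., duplicates).
fn group_duplicates<'a>(chunks: &[(&'a str, Vec<&'a str>)]) -> HashMap<u64, Vec<&'a str>> {
    let mut groups: HashMap<u64, Vec<&'a str>> = HashMap::new();
    for (location, lines) in chunks {
        let hash = xxh64(normalize(lines).as_bytes(), 0);
        groups.entry(hash).or_default().push(*location);
    }
    groups.retain(|_, locations| locations.len() >= 2);
    groups
}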
Example Output
Code duplication detected:
Hash: 14628437538729542158 (xxHash64)
Lines: 50
Locations:
- src/handlers/user.rs:120-169
- src/handlers/admin.rs:85-134
- src/handlers/guest.rs:200-249
Recommendation: Extract common validation logic to shared module
Duplication Configuration
Configure duplication detection in .debtmap.toml:
# Minimum lines for duplication detection
threshold_duplication = 50 # Default value
# Smaller values catch more duplications but increase noise
# threshold_duplication = 30 # More sensitive
# Larger values only catch major duplications
# threshold_duplication = 100 # Less noise
Configuration reference: src/cli/args.rs:86-87 (threshold_duplication flag definition)
Implementation: src/debt/duplication.rs:11-15
Current Limitations
- Exact matching only - Currently uses hash-based exact matching
- similarity_threshold parameter - Defined in function signature but not implemented yet
- Future enhancement - Fuzzy matching for near-duplicates using similarity algorithms (e.g., Levenshtein distance, token-based similarity)
The similarity_threshold parameter exists for future extensibility but is currently unused. All duplication detection uses exact hash matching. Track progress on fuzzy matching in the project’s issue tracker or roadmap.
Refactoring Recommendations
Debtmap provides specific refactoring recommendations for each architectural issue:
For Circular Dependencies
- Extract Interface - Create shared abstraction both modules use
- Dependency Inversion - Introduce interfaces to reverse dependency direction
- Move Shared Code - Extract to new module both can depend on
- Event-Driven - Replace direct calls with event publishing/subscribing
For High Coupling
- Facade Pattern - Provide simplified interface hiding complex dependencies
- Reduce Dependencies - Remove unnecessary imports and calls
- Dependency Injection - Pass dependencies via constructors/parameters
- Interface Segregation - Split large interfaces into focused ones
For Zone of Pain
- Introduce Abstractions - Extract traits/interfaces for flexibility
- Adapter Pattern - Wrap concrete implementations with adapters
- Strategy Pattern - Make algorithms pluggable via interfaces
For Zone of Uselessness
- Add Concrete Implementations - Provide useful functionality
- Remove Unused Code - Delete if providing no value
- Stabilize Interfaces - Stop changing abstractions frequently
For Bidirectional Dependencies
- Create Mediator - Third module manages relationship
- Break into Separate Modules - Clearer separation of concerns
- Observer Pattern - One-way communication via observers
For Code Duplication
- Extract Common Code - Create shared function/module
- Use Inheritance/Composition - Share via traits or composition
- Parameterize Differences - Extract variable parts as parameters
- Template Method - Define algorithm structure, vary specific steps
Examples and Use Cases
Running Architectural Analysis
# Architectural analysis runs automatically with standard analysis
debtmap analyze .
# Duplication detection with custom chunk size
debtmap analyze . --threshold-duplication 30
# Note: Circular dependencies, coupling metrics, and SDP violations
# are analyzed automatically. There are no separate flags to enable
# or disable specific architectural checks.
Example: Circular Dependency
Before:
src/auth.rs → src/session.rs → src/user.rs → src/auth.rs
Circular dependency detected: auth → session → user → auth
After refactoring:
src/auth.rs → src/auth_interface.rs ← src/session.rs
                      ↑
                 src/user.rs
No circular dependencies found.
Example: Coupling Metrics Table
Module Analysis Results:
Module            Ca   Ce   Instability   Issues
-------------------------------------------------
core/types        15    0   0.00          None
api/handlers       2    8   0.80          High Ce
database/client    8    2   0.20          None
utils/temp         5   12   0.71          SDP violation
auth/session       3    3   0.50          None
Example: Zone of Pain
Module: legacy_db_client
Metrics:
Abstractness: 0.05 (all concrete code)
Instability: 0.12 (depended on by 25 modules)
Afferent coupling: 25
Distance from main sequence: 0.83
Status: Zone of Pain - rigid and hard to change
Refactoring steps:
1. Extract interface DatabaseClient trait
2. Create adapter wrapping legacy implementation
3. Gradually migrate dependents to use trait
4. Introduce alternative implementations
Interpreting Results
Prioritization
Address architectural issues in this order:
1. Circular Dependencies (Highest Priority)
   - Break architectural boundaries
   - Make testing impossible
   - Cause build issues
2. Bidirectional Dependencies (High Priority)
   - Create tight coupling
   - Prevent independent testing
   - Block modular changes
3. Zone of Pain Issues (Medium-High Priority)
   - Indicate rigid architecture
   - Block future changes
   - High risk for bugs
4. SDP Violations (Medium Priority)
   - Cause ripple effects
   - Increase maintenance cost
   - Unstable foundation
5. High Coupling (Medium Priority)
   - Maintainability risk
   - Testing difficulty
   - Change amplification
6. Code Duplication (Lower Priority)
   - Maintenance burden
   - Bug multiplication
   - Inconsistency risk
Decision Flowchart
Is there a circular dependency?
├─ YES → Break immediately (extract interface, DI)
└─ NO → Continue
Is there bidirectional dependency?
├─ YES → Refactor (mediator, event-driven)
└─ NO → Continue
Is module in zone of pain?
├─ YES → Introduce abstractions
└─ NO → Continue
Is SDP violated?
├─ YES → Stabilize or reduce afferent coupling
└─ NO → Continue
Is coupling > threshold?
├─ YES → Reduce dependencies
└─ NO → Continue
Is there significant duplication?
├─ YES → Extract common code
└─ NO → Architecture is good!
Integration with Debt Categories
Architectural analysis results are integrated with debtmap’s debt categorization system:
Debt Type Mapping
Architectural issues are mapped to existing DebtType enum variants:
- Duplication - Duplicated code blocks found
- Dependency - Used for circular dependencies and coupling issues
- CodeOrganization - May be used for architectural violations (SDP, zone issues)
Note: The DebtType enum does not have dedicated variants for CircularDependency, HighCoupling, or ArchitecturalViolation. Architectural issues are mapped to existing general-purpose debt types.
Reference: src/core/mod.rs:220-236 for actual DebtType enum definition
Tiered Prioritization
Architectural issues are assigned priority tiers:
- Tier 1 (Critical) - Circular dependencies, bidirectional dependencies
- Tier 2 (High) - Zone of pain, SDP violations
- Tier 3 (Medium) - High coupling, large duplications
- Tier 4 (Low) - Small duplications, minor coupling issues
Reference: See Tiered Prioritization for complete priority assignment logic
Cohesion Analysis
Note: Module cohesion analysis is currently a simplified placeholder implementation.
Current status: src/debt/coupling.rs:82-95 (analyze_module_cohesion)
The function exists but provides basic cohesion calculation. Full cohesion analysis (measuring how well module elements belong together) is planned for a future release.
Future Enhancement
Full cohesion analysis would measure:
- Functional cohesion (functions operating on related data)
- Sequential cohesion (output of one function feeds another)
- Communicational cohesion (functions operating on same data structures)
Configuration
Configurable Parameters
Configure duplication detection in .debtmap.toml or via CLI:
# Minimum lines for duplication detection
threshold_duplication = 50 # Default value
Or via command line:
debtmap analyze . --threshold-duplication 50
Configuration reference: src/cli/args.rs:86-87 (threshold_duplication flag definition)
Hardcoded Thresholds
Note: Most architectural thresholds are currently hardcoded in the implementation and cannot be configured. These thresholds are based on industry-standard metrics from Robert C. Martin’s research and empirical software engineering studies:
- Coupling threshold: 5 (modules with >5 dependencies are flagged)
- Instability threshold: 0.8 (for SDP violations)
- SDP afferent threshold: 2 (minimum dependents for SDP violations)
- Zone of pain thresholds:
- Abstractness < 0.2
- Instability < 0.2
- Afferent coupling > 3
- Zone of uselessness thresholds:
- Abstractness > 0.8
- Instability > 0.8
These values represent widely-accepted boundaries in software architecture literature. While they work well for most projects, configurable thresholds may be added in a future release to support domain-specific tuning.
Source: src/debt/coupling.rs:70-76, 130, 145 (hardcoded threshold definitions)
See Configuration for complete options.
Troubleshooting
“No circular dependencies detected but build fails”
Cause: Circular dependencies at the package/crate level, not module level.
Solution: Use cargo tree to analyze package-level dependencies.
“Too many coupling warnings”
Cause: Default threshold of 5 may be too strict for your codebase.
Solution: The coupling threshold is currently hardcoded at 5 in the implementation (src/debt/coupling.rs:62). To adjust it, you would need to modify the source code. Consider using suppression patterns to exclude specific modules if needed. See Suppression Patterns.
“Duplication detected in generated code”
Cause: Code generation tools create similar patterns.
Solution: Use suppression patterns to exclude generated files. See Suppression Patterns.
“Zone of pain false positives”
Cause: Utility modules are intentionally stable and concrete.
Solution: This is often correct - utility modules should be stable. Consider whether the module should be more abstract.
Further Reading
Robert C. Martin’s Principles
The architectural metrics in debtmap are based on:
- Clean Architecture by Robert C. Martin
- Agile Software Development: Principles, Patterns, and Practices by Robert C. Martin
- Stable Dependencies Principle (SDP)
- Stable Abstractions Principle (SAP)
- Main Sequence distance metric
Related Topics
- Analysis Guide - Complete analysis workflow
- Configuration - Configuration options
- Entropy Analysis - Complexity vs. entropy
- Scoring Strategies - How debt is scored
- Tiered Prioritization - Priority assignment
Boilerplate Detection
Debtmap identifies repetitive code patterns that could benefit from macro-ification or other abstraction techniques. This helps reduce maintenance burden and improve code consistency.
Overview
Boilerplate detection analyzes low-complexity repetitive code to identify opportunities for:
- Macro-ification - Convert repetitive patterns to declarative or procedural macros
- Code generation - Use build scripts to generate repetitive implementations
- Generic abstractions - Replace duplicate implementations with generic code
- Trait derivation - Use derive macros instead of manual implementations
Boilerplate detection runs automatically as part of the god object analysis pipeline. When a file has many impl blocks, it’s classified as either a true god object (complex code needing splitting), a builder pattern (intentional fluent API), or a boilerplate pattern (low complexity needing macro-ification). This prevents false positives where repetitive low-complexity code is misclassified as god objects.
Source: Integration with god object detection in src/organization/god_object/classification_types.rs:45-50, src/analyzers/file_analyzer.rs:366-372
Detection Criteria
Debtmap identifies boilerplate using trait pattern analysis (src/organization/trait_pattern_analyzer.rs:158-176):
- Multiple similar trait implementations - 20+ impl blocks with shared structure
- High method uniformity - 70%+ of implementations share the same methods
- Low complexity repetitive code - Average cyclomatic complexity < 2.0
- Low complexity variance - Consistent complexity across implementations
- Single dominant trait - One trait accounts for 80%+ of implementations
The TraitPatternAnalyzer computes these metrics:
- impl_block_count - Number of trait implementations in the file
- unique_traits - Set of distinct traits implemented
- most_common_trait - Most frequently implemented trait and its count
- method_uniformity - Ratio of the most common method's appearances to total impls
- shared_methods - Methods appearing in 50%+ of implementations
- avg_method_complexity - Average cyclomatic complexity per method
- complexity_variance - Variance in complexity across methods
- avg_method_lines - Average lines of code per method
Detection Signals
The boilerplate detector extracts detection signals (src/organization/boilerplate_detector.rs:246-253, 164-190):
- HighImplCount(usize) - Number of impl blocks exceeds threshold
- HighMethodUniformity(f64) - Methods are highly uniform across implementations
- LowAvgComplexity(f64) - Average complexity is below threshold
- LowComplexityVariance(f64) - Complexity variance is low (consistent complexity)
Note: The HighStructDensity signal is defined as an enum variant but is not currently extracted by the detection algorithm. It is reserved for future enhancement to detect struct-level boilerplate patterns.
Boilerplate Scoring Algorithm
The confidence score is calculated using weighted signals (src/organization/boilerplate_detector.rs:124-161):
- High impl count (30% weight) - Files with 20+ impl blocks score higher
  - Normalized: min(impl_count / 100, 1.0) × 30%
- Method uniformity (25% weight) - Methods shared across implementations
  - Score: method_uniformity × 25% (if ≥ 0.7 threshold)
- Low average complexity (20% weight) - Simple, repetitive code
  - Score: (1 - complexity / 2.0) × 20% (if complexity < 2.0)
- Low complexity variance (15% weight) - Consistent complexity
  - Score: (1 - min(variance / 10.0, 1.0)) × 15% (if variance < 2.0)
- Single dominant trait (10% weight) - One trait dominates
  - Score: trait_ratio × 10% (if trait_ratio > 0.8)
Threshold: Patterns with confidence ≥ 0.7 (70%) are reported as boilerplate.
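As a sketch of how these weighted signals combine (field and function names here are illustrative assumptions, not the actual API of boilerplate_detector.rs):

struct TraitPatternMetrics {
    impl_count: usize,
    method_uniformity: f64,    // 0.0-1.0, share of impls containing the most common method
    avg_complexity: f64,       // average cyclomatic complexity per method
    complexity_variance: f64,
    dominant_trait_ratio: f64, // share of impls belonging to the most common trait
}

fn boilerplate_confidence(m: &TraitPatternMetrics) -> f64 {
    let mut score = 0.0;
    // High impl count (30%), normalized against 100 impl blocks
    score += (m.impl_count as f64 / 100.0).min(1.0) * 0.30;
    // Method uniformity (25%), only counted above the 0.7 threshold
    if m.method_uniformity >= 0.7 {
        score += m.method_uniformity * 0.25;
    }
    // Low average complexity (20%)
    if m.avg_complexity < 2.0 {
        score += (1.0 - m.avg_complexity / 2.0) * 0.20;
    }
    // Low complexity variance (15%)
    if m.complexity_variance < 2.0 {
        score += (1.0 - (m.complexity_variance / 10.0).min(1.0)) * 0.15;
    }
    // Single dominant trait (10%)
    if m.dominant_trait_ratio > 0.8 {
        score += m.dominant_trait_ratio * 0.10;
    }
    score // reported as boilerplate when >= 0.7
}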
Pattern Types
Trait Implementation Boilerplate
Detected when a file has many similar trait implementations with low complexity.
Example (from tests/boilerplate_integration_test.rs:14-48):
#![allow(unused)]
fn main() {
// 26 From<Format> implementations with identical structure
pub enum Format { A, B, C, /* ... */ Z }
pub struct Target { name: String }
impl From<Format> for Target {
fn from(f: Format) -> Self {
match f {
Format::A => Target { name: "a".to_string() },
Format::B => Target { name: "b".to_string() },
// ... 24 more identical patterns
}
}
}
// Detected: 26 impl blocks, 1.0 method uniformity, complexity ~2.0
}
Recommendation: Use a declarative macro to reduce ~250 lines to ~30 lines.
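As an illustration of what that macro-ification might look like (a hypothetical macro_rules! sketch using the Format and Target types from the example above, not output produced by debtmap):

// Hypothetical declarative macro replacing the hand-written match arms.
macro_rules! format_to_target {
    ($($variant:ident => $name:expr),* $(,)?) => {
        impl From<Format> for Target {
            fn from(f: Format) -> Self {
                match f {
                    $(Format::$variant => Target { name: $name.to_string() },)*
                }
            }
        }
    };
}

// One line per variant instead of a full match arm each:
format_to_target! {
    A => "a",
    B => "b",
    // ... remaining variants
}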
Builder Pattern
Detected when a file has repetitive setter methods returning Self.
Example (from book/src/boilerplate-detection.md:46-61):
#![allow(unused)]
fn main() {
impl ConfigBuilder {
pub fn host(mut self, host: String) -> Self {
self.host = host;
self
}
pub fn port(mut self, port: u16) -> Self {
self.port = port;
self
}
// ... more identical setter methods
}
}
Recommendation: Use the derive_builder crate or a custom derive macro.
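A minimal sketch of that replacement, assuming the derive_builder crate (fields abbreviated from the example above):

use derive_builder::Builder;

// derive_builder generates ConfigBuilder with one setter per field,
// replacing the hand-written fluent setters shown above.
#[derive(Builder)]
pub struct Config {
    host: String,
    port: u16,
}

// Usage:
// let config = ConfigBuilder::default()
//     .host("localhost".to_string())
//     .port(8080)
//     .build()?;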
Test Boilerplate
Detected when test functions have shared structure and repetitive assertions (src/organization/boilerplate_detector.rs:238-243).
Example:
#![allow(unused)]
fn main() {
#[test]
fn test_case_a() {
let input = create_input_a();
let result = process(input);
assert_eq!(result.status, Status::Success);
}
#[test]
fn test_case_b() {
let input = create_input_b();
let result = process(input);
assert_eq!(result.status, Status::Success);
}
// ... 20 more similar test functions
}
Recommendation: Use parameterized tests with rstest or table-driven tests.
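A sketch of the parameterized equivalent using rstest (assuming a hypothetical Input type and the helpers from the example above):

use rstest::rstest;

// One parameterized test replaces the repetitive functions above;
// each #[case] supplies one input.
#[rstest]
#[case(create_input_a())]
#[case(create_input_b())]
// ... one #[case] per former test function
fn test_process(#[case] input: Input) {
    let result = process(input);
    assert_eq!(result.status, Status::Success);
}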
Planned Pattern Types
The following pattern types are defined in the feature specification but not yet fully implemented:
- State machine patterns - Repetitive state transition implementations
- Registry patterns - Similar registration logic across modules
- Delegation patterns - Wrapper types with passthrough implementations
These are tracked for future enhancement. Currently, the implementation focuses on trait implementation boilerplate detection with partial support for builder and test patterns.
Configuration
Boilerplate detection is controlled via configuration file (src/organization/boilerplate_detector.rs:256-322):
[boilerplate_detection]
# Enable boilerplate detection (default: true)
enabled = true
# Minimum impl blocks to consider (default: 20)
min_impl_blocks = 20
# Method uniformity threshold 0.0-1.0 (default: 0.7)
method_uniformity_threshold = 0.7
# Maximum average complexity for boilerplate (default: 2.0)
max_avg_complexity = 2.0
# Minimum confidence to report 0.0-1.0 (default: 0.7)
confidence_threshold = 0.7
# Enable trait implementation detection (default: true)
detect_trait_impls = true
# Enable builder pattern detection (default: true)
detect_builders = true
# Enable test boilerplate detection (default: true)
detect_test_boilerplate = true
Implementation Status: The detect_trait_impls option is fully implemented. The detect_builders and detect_test_boilerplate options are configuration placeholders with partial implementation - the current detect() method primarily focuses on trait implementation detection (src/organization/boilerplate_detector.rs:75-121).
Field reference (src/organization/boilerplate_detector.rs:48-57, 310-322):
- All fields have serde defaults
- Configuration can be provided via TOML or JSON
- Missing fields use default values
Usage
Boilerplate detection runs automatically when enabled in configuration:
# Run analysis with default configuration
debtmap analyze .
# Use custom config file
debtmap analyze . --config custom-config.toml
# Show where configuration values came from
debtmap analyze . --show-config-sources
Note: There are no dedicated CLI flags like --detect-boilerplate or --show-macro-suggestions. Boilerplate detection is integrated into the god object analysis pipeline and controlled via configuration file only (verified in src/cli.rs - no boilerplate-specific flags exist).
Macro Recommendations
The MacroRecommendationEngine generates specific refactoring guidance (src/organization/macro_recommendations.rs:13-150):
For Trait Implementations
Detected boilerplate: 25 implementations of From trait
Estimated line reduction: 220 lines → 35 lines (84% reduction)
Recommendation:
- Use declarative macro (macro_rules!) for simple conversions
- Use procedural derive macro for complex transformations
- Consider code generation in build.rs for large enums
For Builder Patterns
The MacroRecommendationEngine suggests multiple builder crates (src/organization/macro_recommendations.rs:119-131):
BUILDER PATTERN BOILERPLATE DETECTED: 5 builder structs
Consider using existing builder libraries:
- `bon` crate for declarative builders
- `typed-builder` for compile-time checked builders
- `derive_builder` for macro-based generation
This reduces boilerplate while maintaining type safety.
For Test Boilerplate
Detected boilerplate: 20 similar test functions
Estimated line reduction: 120 lines → 25 lines (79% reduction)
Recommendation:
- Use rstest with #[rstest] and #[case(...)] for parameterized tests
- Extract common test setup into fixture functions
- Use table-driven tests with Vec<TestCase> for data-driven testing
Integration with God Object Detection
Boilerplate detection prevents false positives in god object analysis:
- File with many impl blocks detected → Analyze trait patterns
- High complexity + many impls → Classified as GodObject (needs module splitting)
- Low complexity + many impls → Classified as BoilerplatePattern (needs macro-ification)
- Builder pattern detected → Classified as BuilderPattern (intentional design)
This distinction ensures appropriate recommendations for different code patterns.
Source: src/organization/god_object/classification_types.rs:45-50
See Also
- God Object Detection - Complexity-based refactoring
- Design Pattern Detection - Higher-level pattern recognition
- Boilerplate vs Complexity - Understanding the distinction
- Configuration - Full configuration reference
Boilerplate vs Complexity
Overview
Debtmap distinguishes between boilerplate code (necessary but mechanical patterns) and true complexity (business logic requiring cognitive effort). This distinction is critical for:
- Avoiding false positives in complexity analysis
- Focusing refactoring efforts on actual problems
- Understanding which high-complexity code is acceptable
- Providing actionable recommendations
This chapter explains how Debtmap identifies boilerplate patterns, why they differ from complexity, and how to interpret the analysis results.
The Distinction
What is Boilerplate?
Boilerplate code consists of repetitive, mechanical patterns that are:
- Required by language/framework - Type conversions, trait implementations, builder patterns
- Structurally necessary - Match arms for enums, error propagation, validation chains
- Low cognitive load - Pattern-based code that developers scan rather than deeply analyze
- Not actual complexity - High cyclomatic complexity but mechanistic structure
Examples:
- From trait implementations converting between types
- Display formatting with exhaustive enum match arms
- Builder pattern setters with validation
- Error conversion implementations
- Serialization/deserialization code
What is True Complexity?
True complexity consists of business logic that requires:
- Domain understanding - Knowledge of problem space and requirements
- Cognitive effort - Careful analysis to understand behavior
- Algorithmic decisions - Non-obvious control flow or data transformations
- Maintainability risk - Changes may introduce subtle bugs
Examples:
- Graph traversal algorithms
- Complex business rules with multiple conditions
- State machine implementations with non-trivial transitions
- Performance-critical optimizations
- Error recovery with fallback strategies
Real Example: ripgrep’s defs.rs
The ripgrep codebase provides an excellent real-world example of boilerplate vs complexity.
File: crates/printer/src/defs.rs
This file contains type conversion implementations with high cyclomatic complexity scores but minimal actual complexity:
#![allow(unused)]
fn main() {
impl From<HyperlinkFormat> for ColorHyperlink {
fn from(format: HyperlinkFormat) -> ColorHyperlink {
match format {
HyperlinkFormat::Default => ColorHyperlink::default(),
HyperlinkFormat::Grep => ColorHyperlink::grep(),
HyperlinkFormat::GrepPlus => ColorHyperlink::grep_plus(),
HyperlinkFormat::Ripgrep => ColorHyperlink::ripgrep(),
HyperlinkFormat::FileNone => ColorHyperlink::file_none(),
// ... 10+ more variants
}
}
}
}
Analysis:
- Cyclomatic Complexity: 15+ (one branch per enum variant)
- Cognitive Complexity: Low (simple delegation pattern)
- Boilerplate Confidence: 95% (trait implementation with mechanical structure)
Why This Matters
Without boilerplate detection, this file would be flagged as:
- High complexity debt
- Requiring refactoring
- Priority for review
With boilerplate detection, it’s correctly classified as:
- Necessary type conversion code
- Low maintenance risk
- Can be safely skipped in debt prioritization
Detection Methodology
Debtmap uses a multi-phase analysis pipeline to detect boilerplate:
Phase 1: Trait Analysis
Identifies trait implementations known to produce boilerplate:
High-confidence boilerplate traits:
- From, Into - Type conversions
- Display, Debug - Formatting
- Default - Default value construction
- Clone, Copy - Value semantics
- Eq, PartialEq, Ord, PartialOrd - Comparisons
- Hash - Hashing implementations
Medium-confidence boilerplate traits:
- Serialize, Deserialize - Serialization
- AsRef, AsMut, Deref, DerefMut - Reference conversions
- Custom builder traits
See src/debt/boilerplate/boilerplate_traits.rs:10-58 for complete trait categorization.
Phase 2: Pattern Analysis
Analyzes code structure for boilerplate patterns:
Pattern 1: Simple Delegation
#![allow(unused)]
fn main() {
fn operation(&self) -> Result<T> {
self.inner.operation() // Single delegation call
}
}
Score: 90% confidence
Pattern 2: Trivial Match Arms
#![allow(unused)]
fn main() {
match variant {
A => handler_a(),
B => handler_b(),
C => handler_c(),
}
}
Each arm calls a single function with no additional logic. Score: 85% confidence
Pattern 3: Validation Chains
#![allow(unused)]
fn main() {
fn validate(&self) -> Result<()> {
check_condition_1()?;
check_condition_2()?;
check_condition_3()?;
Ok(())
}
}
Sequential validation with early returns. Score: 75% confidence
Pattern 4: Builder Setters
#![allow(unused)]
fn main() {
pub fn with_field(mut self, value: T) -> Self {
self.field = value;
self
}
}
Simple field assignment with fluent return. Score: 95% confidence
See src/debt/boilerplate/pattern_detector.rs:18-82 for pattern detection logic.
Phase 3: Macro Analysis
Detects macro-generated code and provides recommendations:
Derivable Traits:
Debtmap suggests using #[derive(...)] when it detects manual implementations of:
- Clone, Copy, Debug, Default
- Eq, PartialEq, Ord, PartialOrd
- Hash
Custom Macros: Recommends creating custom derive macros for:
- Repeated builder pattern implementations
- Repeated conversion trait implementations
- Repeated validation logic
Existing Crates: Suggests established crates for common patterns:
- derive_more - Extended derive macros
- thiserror - Error type boilerplate
- typed-builder - Builder pattern macros
- delegate - Delegation patterns
See src/debt/boilerplate/macro_recommender.rs:9-136 for macro recommendation logic.
Common Boilerplate Patterns
Type Conversions
#![allow(unused)]
fn main() {
// High complexity (15+), but boilerplate
impl From<ConfigFormat> for Config {
fn from(format: ConfigFormat) -> Config {
match format {
ConfigFormat::Json => Config::json(),
ConfigFormat::Yaml => Config::yaml(),
ConfigFormat::Toml => Config::toml(),
// ... many variants
}
}
}
}
Boilerplate Confidence: 90%+
Recommendation: Consider using a macro if the pattern repeats
Error Propagation
#![allow(unused)]
fn main() {
// High nesting, but boilerplate pattern
fn complex_operation(&self) -> Result<Output> {
let step1 = self.step_one()
.context("Step one failed")?;
let step2 = self.step_two(&step1)
.context("Step two failed")?;
let step3 = self.step_three(&step2)
.context("Step three failed")?;
Ok(Output::new(step3))
}
}
Boilerplate Confidence: 75%
Recommendation: Acceptable pattern for error handling
Builder Patterns
#![allow(unused)]
fn main() {
// Many methods, but all boilerplate
impl ConfigBuilder {
pub fn with_timeout(mut self, timeout: Duration) -> Self {
self.timeout = Some(timeout);
self
}
pub fn with_retries(mut self, retries: u32) -> Self {
self.retries = Some(retries);
self
}
// ... 20+ more setters
}
}
Boilerplate Confidence: 95%
Recommendation: Use typed-builder or similar crate
Display Formatting
#![allow(unused)]
fn main() {
// High complexity due to match, but boilerplate
impl Display for Status {
fn fmt(&self, f: &mut Formatter) -> fmt::Result {
match self {
Status::Pending => write!(f, "pending"),
Status::Running => write!(f, "running"),
Status::Success => write!(f, "success"),
Status::Failed(err) => write!(f, "failed: {}", err),
// ... many variants
}
}
}
}
Boilerplate Confidence: 90%
Recommendation: Consider using strum or derive_more
Decision Table
Use this table to interpret boilerplate confidence scores:
| Confidence | Interpretation | Action |
|---|---|---|
| 90-100% | Definite boilerplate | Exclude from complexity prioritization; consider macro optimization |
| 70-89% | Probable boilerplate | Review pattern; likely acceptable; low refactoring priority |
| 50-69% | Mixed boilerplate/logic | Investigate; may contain hidden complexity; medium priority |
| 30-49% | Mostly real complexity | Standard complexity analysis; normal refactoring priority |
| 0-29% | True complexity | High priority; focus refactoring efforts here |
Example Classifications
Boilerplate (90%+ confidence):
#![allow(unused)]
fn main() {
// Simple trait delegation - skip in debt analysis
impl AsRef<str> for CustomString {
fn as_ref(&self) -> &str {
&self.inner
}
}
}
Mixed (50-70% confidence):
#![allow(unused)]
fn main() {
// Match with some logic - review case by case
fn process_event(&mut self, event: Event) -> Result<()> {
match event {
Event::Simple => self.handle_simple(), // Boilerplate
Event::Complex(data) => { // Real logic
if data.priority > 10 {
self.handle_urgent(data)?;
} else {
self.queue_normal(data)?;
}
self.update_metrics()?;
Ok(())
}
}
}
}
True Complexity (0-30% confidence):
#![allow(unused)]
fn main() {
// Business logic requiring domain knowledge
fn calculate_optimal_strategy(&self, market: &Market) -> Strategy {
let volatility = market.calculate_volatility();
let trend = market.detect_trend();
if volatility > self.risk_threshold {
if trend.is_bullish() && self.can_hedge() {
Strategy::hedged_long(self.calculate_position_size())
} else {
Strategy::defensive()
}
} else {
Strategy::momentum_based(trend, self.confidence_level())
}
}
}
Integration with Complexity Analysis
Boilerplate Scoring
Debtmap calculates a BoilerplateScore for each function:
#![allow(unused)]
fn main() {
pub struct BoilerplateScore {
pub confidence: f64, // 0.0-1.0 (0% to 100%)
pub primary_pattern: Pattern, // Strongest detected pattern
pub contributing_patterns: Vec<Pattern>,
pub macro_recommendation: Option<MacroRecommendation>,
}
}
Complexity Adjustment
High-confidence boilerplate reduces effective complexity:
effective_complexity = raw_complexity × (1.0 - boilerplate_confidence)
Example:
- Raw cyclomatic complexity: 15
- Boilerplate confidence: 0.90 (90%)
- Effective complexity: 15 × (1.0 - 0.90) = 1.5
This prevents boilerplate from dominating debt prioritization.
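In code form, the adjustment is a single multiplication (a sketch; names assumed):

fn effective_complexity(raw_complexity: f64, boilerplate_confidence: f64) -> f64 {
    // e.g. effective_complexity(15.0, 0.90) ≈ 1.5
    raw_complexity * (1.0 - boilerplate_confidence)
}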
Output Display
Debtmap annotates boilerplate functions in analysis output:
src/types/conversions.rs:
├─ from (complexity: 15, boilerplate: 92%)
│ Pattern: Trait Implementation (From)
│ Recommendation: Consider #[derive(From)] via derive_more
│ Priority: Low (boilerplate)
├─ process_request (complexity: 12, boilerplate: 15%)
│ Priority: High (true complexity)
Best Practices
When to Accept Boilerplate
Accept high-complexity boilerplate when:
- Required by language - Trait implementations, type conversions
- Pattern is clear - Developers can scan quickly without deep analysis
- Covered by tests - Mechanical patterns verified by unit tests
- No simpler alternative - Refactoring would reduce clarity
Example: Exhaustive match arms for enum variants with simple delegation.
When to Refactor Boilerplate
Refactor boilerplate when:
- Pattern repeats extensively - 10+ similar implementations
- Macro alternative exists - Can use derive or custom macro
- Maintenance burden - Changes require updating many copies
- Error-prone - Manual pattern increases bug risk
Example: 50+ builder setters that could use typed-builder crate.
Configuring Thresholds
Adjust boilerplate sensitivity in .debtmap.toml:
[boilerplate_detection]
enabled = true
min_confidence_to_exclude = 0.85 # Only exclude 85%+ confidence
trait_delegation_threshold = 0.90 # Trait impl confidence
pattern_match_threshold = 0.75 # Match pattern confidence
Strict mode (minimize false negatives):
min_confidence_to_exclude = 0.95 # Very high bar for exclusion
Lenient mode (minimize false positives):
min_confidence_to_exclude = 0.70 # More aggressive exclusion
Validation and Testing
Integration Test Example
Debtmap’s test suite includes real-world boilerplate validation:
#![allow(unused)]
fn main() {
#[test]
fn test_ripgrep_defs_boilerplate() {
let code = r#"
impl From<HyperlinkFormat> for ColorHyperlink {
fn from(format: HyperlinkFormat) -> ColorHyperlink {
match format {
HyperlinkFormat::Default => ColorHyperlink::default(),
// ... 15 variants
}
}
}
"#;
let result = analyze_boilerplate(code);
assert!(result.confidence >= 0.85, "Should detect trait boilerplate");
assert_eq!(result.primary_pattern, Pattern::TraitImplementation);
}
}
See tests/boilerplate_integration_test.rs for complete test cases.
Performance Overhead
Boilerplate detection adds minimal overhead:
- Measurement: <5% increase in analysis time
- Reason: Single-pass AST analysis with cached pattern matching
- Optimization: Trait analysis uses fast HashMap lookups
See tests/boilerplate_performance_test.rs for benchmark details.
Troubleshooting
“Why is my code marked as boilerplate?”
Check:
- Is it a trait implementation? (From, Display, etc.)
- Does it follow a mechanical pattern?
- Are all branches simple delegations?
If incorrectly classified:
- Adjust the min_confidence_to_exclude threshold
- Report a false positive if confidence is very high
“My boilerplate isn’t detected”
Common causes:
- Custom logic mixed with boilerplate pattern
- Non-standard trait names
- Complex match arm logic
Solutions:
- Extract pure boilerplate into separate functions
- Use standard traits when possible
- Check confidence score - may be detected with lower confidence
“Boilerplate detection seems too aggressive”
Adjust configuration:
[boilerplate_detection]
min_confidence_to_exclude = 0.95 # Raise threshold
trait_delegation_threshold = 0.95
Related Documentation
- Complexity Metrics - Understanding cyclomatic complexity
- Configuration - Complete .debtmap.toml reference
- Tiered Prioritization - How boilerplate affects debt ranking
Summary
Boilerplate detection is a critical feature that:
- Distinguishes mechanical patterns from true complexity
- Reduces false positives in debt analysis
- Provides actionable macro recommendations
- Integrates seamlessly with complexity scoring
- Helps teams focus on real maintainability issues
By identifying boilerplate with 85%+ confidence, Debtmap ensures that high-complexity scores reflect actual cognitive burden rather than necessary but mechanical code patterns.
Call Graph Analysis
Debtmap constructs detailed call graphs to track function relationships and dependencies across your codebase. This enables critical path identification, circular dependency detection, and transitive coverage propagation.
Overview
Call graph analysis builds a comprehensive map of which functions call which other functions. This information powers several key features:
- Critical path identification - Find frequently-called functions that deserve extra attention
- Circular dependency detection - Identify problematic circular call patterns
- Transitive coverage - Propagate test coverage through the call graph
- Dependency visualization - See caller/callee relationships in output
- Risk assessment - Factor calling patterns into priority scoring
Call Graph Construction
Debtmap builds call graphs through a three-phase AST-based construction process:
- Extract functions and collect unresolved calls - Parse each file to identify function definitions and call expressions
- Resolve calls using CallResolver and PathResolver - Match call expressions to function definitions within the same file
- Final cross-file resolution - Resolve remaining calls across module boundaries
This multi-phase approach ensures accurate resolution while handling complex scenarios like trait methods, macros, and module imports.
#![allow(unused)]
fn main() {
// Example: Debtmap tracks these relationships
fn process_data(input: &str) -> Result<Data> {
validate_input(input)?; // Call edge: process_data -> validate_input
parse_data(input) // Call edge: process_data -> parse_data
}
fn validate_input(input: &str) -> Result<()> {
// Call graph tracks this function as a callee
Ok(())
}
}
Source: Example pattern from tests/call_graph_comprehensive_test.rs:48-94
Resolution Mechanisms
The call graph analyzer handles complex resolution scenarios:
- Trait method resolution - Resolves trait method calls to implementations using struct prefixes (e.g., Processor::process)
- Macro expansion tracking - Classifies and tracks calls within macros (collection, formatting, assertion, and logging macros)
- Module path resolution - Resolves fully-qualified paths across module boundaries
- Cross-file resolution - Matches unresolved calls (marked with line 0) to actual function definitions
Source: Resolution mechanisms from src/analyzers/call_graph/trait_handling.rs, src/analyzers/call_graph/macro_expansion.rs, src/analyzers/call_graph/path_resolver.rs
Parallel Construction
Call graph construction runs in parallel by default for improved performance. You can disable parallel processing with --no-parallel for debugging purposes, though this affects overall analysis performance, not just call graph construction.
Configuration
Call graph behavior is controlled through two configuration sections:
Analysis Settings
Configure advanced analysis features in the [analysis] section:
[analysis]
# Enable trait method resolution (default: depends on context)
enable_trait_analysis = true
# Enable function pointer and closure tracking (default: depends on context)
enable_function_pointer_tracking = true
# Enable framework pattern detection for tests and handlers (default: depends on context)
enable_framework_patterns = true
# Enable cross-module dependency analysis (default: depends on context)
enable_cross_module_analysis = true
# Maximum depth for transitive analysis (optional)
max_analysis_depth = 10
Source: Configuration fields from src/config/core.rs:159-175 (AnalysisSettings)
Caller/Callee Display Settings
Configure how dependencies are displayed in the [classification.caller_callee] section:
[classification.caller_callee]
# Maximum number of callers to display per function (default: 5)
max_callers = 5
# Maximum number of callees to display per function (default: 5)
max_callees = 5
# Show external crate calls in dependencies (default: false)
show_external = false
# Show standard library calls in dependencies (default: false)
show_std_lib = false
Source: Configuration fields from src/config/classification.rs:5-50 (CallerCalleeConfig)
CLI Reference
Display Control Flags
| Flag | Default | Description |
|---|---|---|
| --show-dependencies | false | Show dependency information (callers/callees) in output |
| --no-dependencies | false | Hide dependency information (conflicts with --show-dependencies) |
| --max-callers <N> | 5 | Maximum number of callers to display per function |
| --max-callees <N> | 5 | Maximum number of callees to display per function |
| --show-external-calls | false | Show external crate calls in dependencies |
| --show-std-lib-calls | false | Show standard library calls in dependencies |
Analysis Control Flags
| Flag | Default | Description |
|---|---|---|
| --no-parallel | false | Disable parallel processing (enabled by default) |
Debug and Validation Flags
| Flag | Default | Description |
|---|---|---|
| --debug-call-graph | false | Enable detailed call graph debugging output |
| --validate-call-graph | false | Validate call graph structure and report issues |
| --call-graph-stats | false | Show call graph statistics with resolution percentiles (p50, p95, p99) |
| --trace-function <NAMES> | none | Trace specific functions during call resolution (comma-separated) |
| --debug-format <FORMAT> | text | Debug output format (text or json) |
Source: CLI flags from src/cli/args.rs:341-350
Usage
Basic Call Graph Analysis
# Analyze with call graph enabled (default)
debtmap analyze .
# Show caller/callee relationships in output
debtmap analyze . --show-dependencies
# Limit displayed relationships
debtmap analyze . --show-dependencies --max-callers 3 --max-callees 3
Filtering External Calls
By default, Debtmap filters both external crate calls and standard library calls for cleaner output. The call graph contains all edges; filtering only affects display output.
# Default: external and standard library calls are hidden
debtmap analyze .
# Show external crate calls (e.g., from dependencies)
debtmap analyze . --show-dependencies --show-external-calls
# Show standard library calls (std::, core::, alloc::)
debtmap analyze . --show-dependencies --show-std-lib-calls
# Show both external and standard library calls
debtmap analyze . --show-dependencies --show-external-calls --show-std-lib-calls
Important: External call filtering happens at display time, not during graph construction. This means --debug-call-graph may show more calls than regular output.
Source: Filtering logic from src/priority/formatter/dependencies.rs:filter_dependencies
Debugging Call Resolution
# Enable detailed call graph debugging
debtmap analyze . --debug-call-graph
# Trace specific functions during resolution
debtmap analyze . --trace-function "process_data,validate_input"
# Show call graph statistics with percentiles
debtmap analyze . --call-graph-stats
# Validate call graph structure
debtmap analyze . --validate-call-graph
# Disable parallel processing for debugging
debtmap analyze . --no-parallel
Debug Output Format
Debug output includes:
- Resolution statistics - Success rates with percentiles (p50, p95, p99)
- Timing information - Performance metrics for each resolution phase
- Function tracing - Detailed resolution attempts for specified functions
- Unresolved calls - Calls that couldn’t be matched to definitions
Macro expansion statistics show classification breakdown (collection macros, formatting macros, assertion macros, logging macros).
Source: Debug capabilities from src/analyzers/call_graph/debug.rs (DebugConfig, ResolutionStatistics)
Visualization
Call graph information appears in output using Unicode tree-style rendering:
├─ DEPENDENCIES:
│ ├─ Called by (2):
│ │ * main
│ │ * handle_request
│ │ ... (showing 2 of 2)
│ ├─ Calls (3):
│ * validate_input
│ * parse_data
│ * transform
│ ... (showing 3 of 5)
Source: Tree-style rendering from src/priority/formatter/sections.rs:240-329
Path Simplification
Long paths are simplified for readability:
- Short names: unchanged (e.g., my_function)
- Two-segment paths: unchanged (e.g., helper::read_file)
- Long paths: simplified to the last two segments (e.g., crate::utils::io::helper::read_file → helper::read_file)
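A sketch of that rule as a standalone function (helper name assumed):

fn simplify_path(path: &str) -> String {
    let segments: Vec<&str> = path.split("::").collect();
    if segments.len() <= 2 {
        // short names and two-segment paths are unchanged
        path.to_string()
    } else {
        // keep only the last two segments:
        // crate::utils::io::helper::read_file -> helper::read_file
        segments[segments.len() - 2..].join("::")
    }
}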
Empty States
- No callers: “Called by: No direct callers detected”
- No callees: “Calls: Calls no other functions”
Standard Library Detection
Standard library calls are filtered by default and include:
- Functions starting with std::, core::, or alloc::
- Common macros: println, print, eprintln, eprint, write, writeln, format, panic, assert, debug_assert
External crate calls are identified as functions containing :: that aren’t in the standard library or the current crate (crate::).
Source: Detection logic from src/priority/formatter/dependencies.rs:is_standard_library_call, is_external_crate_call
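A sketch of that classification, using the function names from the source reference above (the bodies are assumptions for illustration, not the actual implementation):

const STD_MACROS: &[&str] = &[
    "println", "print", "eprintln", "eprint", "write",
    "writeln", "format", "panic", "assert", "debug_assert",
];

fn is_standard_library_call(name: &str) -> bool {
    name.starts_with("std::")
        || name.starts_with("core::")
        || name.starts_with("alloc::")
        || STD_MACROS.contains(&name)
}

fn is_external_crate_call(name: &str) -> bool {
    // a qualified path that is neither std nor part of the current crate
    name.contains("::") && !is_standard_library_call(name) && !name.starts_with("crate::")
}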
Validation and Health Scoring
The call graph validator checks for structural issues:
debtmap analyze . --validate-call-graph
Validation reports include:
- Health score - Overall graph quality (0-100)
- Structural issues - Orphaned functions, disconnected components
- Warnings - Potential resolution problems
Source: Validation implementation from src/analyzers/call_graph/validation.rs:185 (CallGraphValidator)
Performance Tuning
For large codebases, consider these performance optimizations:
- Disable parallel processing (--no-parallel) - Only for debugging; reduces performance
- Control analysis depth - Use max_analysis_depth in configuration to limit transitive analysis
- Disable optional analysis - Turn off enable_trait_analysis, enable_function_pointer_tracking, or enable_framework_patterns if not needed
Troubleshooting
Unresolved Calls
If you see unresolved calls in debug output:
- Check imports - Ensure all modules are properly imported
- Verify visibility - Confirm functions are accessible (not private across module boundaries)
- Review module structure - Complex module hierarchies may require explicit path configuration
- Use tracing - Run with --trace-function to see detailed resolution attempts
Incorrect Caller/Callee Counts
If counts seem wrong:
- Check filtering - Use --show-external-calls and --show-std-lib-calls to see all edges
- Validate structure - Run --validate-call-graph to check for structural issues
- Review debug output - Use --debug-call-graph to see the complete graph before filtering
See Also
- Architectural Analysis - Circular dependency detection
- Context Providers - Critical path analysis
- Coverage Integration - Transitive coverage propagation
- Configuration - Complete configuration reference
- CLI Reference - All command-line flags
Context Providers
Context providers enhance debtmap’s risk analysis by incorporating additional factors beyond complexity and test coverage. They analyze critical execution paths, dependency relationships, and version control history to provide a more comprehensive understanding of technical risk.
Table of Contents
- Overview
- Critical Path Provider
- Dependency Provider
- Git History Provider
- Enabling Context Providers
- Provider Weights
- How Context Affects Scoring
- Context Details Structure
- Practical Examples
- Configuration
- Performance Considerations
- Troubleshooting
- Advanced Usage
- Best Practices
- See Also
Overview
Context providers implement the ContextProvider trait, which gathers risk-relevant information about functions and modules. Each provider analyzes a specific dimension of risk:
- Critical Path Provider: Identifies functions on critical execution paths
- Dependency Provider: Analyzes call graph relationships and blast radius
- Git History Provider: Integrates version control history for change patterns
Context providers help debtmap understand:
- Which code paths are most critical
- How functions depend on each other
- Which code changes most frequently
- Where bugs are likely to occur
This context-aware analysis improves prioritization accuracy and reduces false positives.
The ContextAggregator combines context from multiple enabled providers and adjusts risk scores using the formula:
contextual_risk = base_risk × (1.0 + context_contribution)
Where context_contribution is the weighted sum of all provider contributions:
context_contribution = Σ(provider.contribution × provider.weight)
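A minimal sketch of that aggregation (struct shape assumed for illustration):

struct ProviderOutput {
    contribution: f64, // provider-specific risk contribution
    weight: f64,       // provider weight (see Provider Weights below)
}

fn contextual_risk(base_risk: f64, outputs: &[ProviderOutput]) -> f64 {
    let context_contribution: f64 =
        outputs.iter().map(|o| o.contribution * o.weight).sum();
    base_risk * (1.0 + context_contribution)
}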
Configuration Note: Provider-specific TOML configuration (like [context.critical_path]) is a planned feature not yet implemented. All providers currently use hard-coded defaults from the implementation. Use CLI flags (--context, --context-providers, --disable-context) to control providers. See the Enabling Context Providers section for working examples.
Critical Path Provider
The Critical Path provider identifies functions that lie on critical execution paths through your application. Functions on these paths have elevated risk because failures directly impact user-facing functionality.
Entry Point Detection
The provider automatically detects entry points based on function names and file paths. These weights determine the base criticality of execution paths:
| Entry Type | Weight | Detection Pattern | User-Facing |
|---|---|---|---|
| Main | 10.0 | Function named main | Yes |
| API Endpoint | 8.0 | handle_*, *_handler, get_*, post_* in api/, handler/, route/ paths | Yes |
| CLI Command | 7.0 | cmd_*, command_*, *_command in cli/, command/ paths | Yes |
| Web Handler | 7.0 | Functions with route, handler in web/, http/ paths | Yes |
| Event Handler | 5.0 | on_*, *_listener, contains event | No |
| Test Entry | 2.0 | test_*, in test/ paths | No |
Note on API Endpoint detection: Detection requires BOTH conditions: (1) path contains api/, handler/, or route/ AND (2) function starts with handle_*, get_*, post_*, put_*, delete_* or ends with *_handler. This combined matching ensures accurate classification of HTTP endpoint handlers.
What it detects:
- Entry points (main functions, CLI handlers, API endpoints)
- Error handling paths
- Data processing pipelines
- Resource initialization
Path Weighting
Functions on critical paths receive contribution scores based on:
- Path weight: The maximum entry point weight leading to the function
- User-facing flag: Doubles contribution for user-facing paths
The contribution formula consists of two steps:
#![allow(unused)]
fn main() {
// Step 1: Calculate base contribution (normalized 0-1)
base_contribution = path_weight / max_weight
// Step 2: Apply user-facing multiplier
final_contribution = base_contribution × user_facing_multiplier
// Example: main entry path (weight 10.0, user-facing)
base = 10.0 / 10.0 = 1.0
final = 1.0 × 2.0 = 2.0
// Example: event handler path (weight 5.0, non-user-facing)
base = 5.0 / 10.0 = 0.5
final = 0.5 × 1.0 = 0.5
}
Impact on scoring:
- Functions on critical paths get higher priority
- Entry point multiplier: 1.5x
- Business logic multiplier: 1.2x
Use Cases
- API prioritization: Identify critical endpoints that need careful review
- Refactoring safety: Avoid breaking user-facing execution paths
- Test coverage: Ensure critical paths have adequate test coverage
Enable
debtmap analyze . --context-providers critical_path
Configuration:
[analysis]
context_providers = ["critical_path"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.critical_path]
# Multiplier for entry points (default: 1.5)
entry_point_multiplier = 1.5
# Multiplier for business logic (default: 1.2)
business_logic_multiplier = 1.2
Dependency Provider
The Dependency provider analyzes call graph relationships to identify functions with high architectural impact. It calculates how changes propagate through the dependency graph and determines the blast radius of modifications.
Dependency Chain Analysis
The provider builds a dependency graph where:
- Modules contain functions and have intrinsic risk scores
- Edges represent dependencies with coupling strength (0.0-1.0)
- Risk propagation flows through dependencies using iterative refinement
Convergence Parameters: The risk propagation algorithm uses iterative convergence with a maximum of 10 iterations. Convergence is reached when the maximum risk change between iterations falls below 0.01. This ensures risk stabilizes throughout the dependency graph.
What it detects:
- Upstream dependencies (functions this function calls)
- Downstream dependencies (functions that call this function)
- Transitive dependencies through the call graph
- Dependency criticality
Blast Radius Calculation
The blast radius represents how many modules would be affected by changes to a function. It counts unique modules reachable through transitive dependencies by traversing the dependency graph edges.
| Blast Radius | Contribution | Impact Level |
|---|---|---|
| > 10 modules | 1.5 | Critical dependency affecting many modules |
| > 5 modules | 1.0 | Important dependency with moderate impact |
| > 2 modules | 0.5 | Medium impact |
| ≤ 2 modules | 0.2 | Minimal or isolated component |
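Expressed as code, the mapping is a simple threshold ladder (a sketch of the table above):

fn blast_radius_contribution(modules_affected: usize) -> f64 {
    match modules_affected {
        n if n > 10 => 1.5, // critical dependency affecting many modules
        n if n > 5 => 1.0,  // important dependency with moderate impact
        n if n > 2 => 0.5,  // medium impact
        _ => 0.2,           // minimal or isolated component
    }
}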
Risk Propagation Formula
Risk propagation uses an iterative convergence algorithm to stabilize risk scores throughout the dependency graph:
#![allow(unused)]
fn main() {
propagated_risk = base_risk × criticality_factor + Σ(caller.risk × 0.3)
where:
criticality_factor = 1.0 + min(0.5, dependents.len() × 0.1)
The 0.3 factor dampens risk propagation from callers
}
Iterative Convergence: The algorithm runs with a maximum of 10 iterations and converges when the maximum risk change between iterations falls below 0.01. This ensures risk stabilizes throughout the dependency graph without requiring manual tuning.
Note: The constants (0.5, 0.1, 0.3) are currently hard-coded based on empirical analysis. Future versions may make these configurable.
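A sketch of the convergence loop under those stated constants (graph representation assumed):

struct Node {
    base_risk: f64,
    dependents: Vec<usize>, // indices of nodes that depend on this one
    callers: Vec<usize>,    // indices of nodes that call this one
}

fn propagate_risk(nodes: &[Node]) -> Vec<f64> {
    let mut risk: Vec<f64> = nodes.iter().map(|n| n.base_risk).collect();
    for _ in 0..10 { // maximum of 10 iterations
        let prev = risk.clone();
        let mut max_change: f64 = 0.0;
        for (i, node) in nodes.iter().enumerate() {
            let criticality = 1.0 + (node.dependents.len() as f64 * 0.1).min(0.5);
            // the 0.3 factor dampens risk propagated from callers
            let caller_risk: f64 = node.callers.iter().map(|&c| prev[c] * 0.3).sum();
            let new_risk = node.base_risk * criticality + caller_risk;
            max_change = max_change.max((new_risk - prev[i]).abs());
            risk[i] = new_risk;
        }
        if max_change < 0.01 { // converged
            break;
        }
    }
    risk
}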
Impact on scoring:
dependency_factor = normalized_to_0_10(upstream + downstream)
Ranges:
- Entry points: 8-10 (critical path)
- Business logic: 6-8 (core functionality)
- Data access: 5-7 (important but stable)
- Utilities: 3-5 (lower priority)
- Test helpers: 1-3 (lowest priority)
Use Cases
- Architectural refactoring: Identify high-impact modules to refactor carefully
- Change impact analysis: Understand downstream effects of modifications
- Module decoupling: Find tightly coupled modules with high blast radius
Enable
debtmap analyze . --context-providers dependency
Configuration:
[analysis]
context_providers = ["dependency"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.dependency]
# Include transitive dependencies (default: true)
include_transitive = true
# Maximum depth for transitive analysis (default: 5)
max_depth = 5
Git History Provider
The Git History provider integrates version control data to detect change-prone code and bug patterns. Files with frequent changes and bug fixes indicate higher maintenance risk.
Metrics Collected
The provider analyzes Git history to calculate:
- Change frequency: Commits per month (recent activity indicator)
- Bug density: Ratio of bug fix commits to total commits
- Age: Days since first commit (maturity indicator)
- Author count: Number of unique contributors (complexity indicator)
- Total commits: Total number of commits to the file
- Last modified: Timestamp of the most recent commit
- Stability score: Weighted combination of churn, bug fixes, and age (0.0-1.0)
What it analyzes:
- Commit frequency per file/function
- Bug fix patterns (commits with “fix” in message)
- Code churn (lines added/removed)
- Recent activity
Risk Classification
The table below shows approximate contribution thresholds for understanding risk levels:
| Category | Conditions | Contribution | Explanation |
|---|---|---|---|
| Very unstable | freq > 5.0 AND bug_density > 0.3 | 2.0 | High churn with many bug fixes |
| Moderately unstable | freq > 2.0 OR bug_density > 0.2 | 1.0 | Frequent changes or bug-prone |
| Slightly unstable | freq > 1.0 OR bug_density > 0.1 | 0.5 | Some instability |
| Stable | freq ≤ 1.0 AND bug_density ≤ 0.1 | 0.1 | Low change rate, few bugs |
Continuous Scoring Model
The actual implementation uses a continuous scoring formula rather than discrete thresholds. This provides more accurate differentiation between risk levels (from src/risk/context/git_history.rs:495):
#![allow(unused)]
fn main() {
contribution = (bug_density * 1.5) + min(change_frequency / 20.0, 0.5)
}
Scoring breakdown:
- Bug density (primary signal): Scales linearly from 0.0 to 1.5
  - 0% bugs → 0.0 contribution
  - 50% bugs → 0.75 contribution
  - 100% bugs → 1.5 contribution
- Change frequency (secondary signal): Scales from 0.0 to 0.5, saturating at 10+/month
  - 0/month → 0.0
  - 5/month → 0.25
  - 10+/month → 0.5 (capped)
- The total is capped at 2.0 to prevent excessive score amplification
- Stable code with no bugs and no changes contributes 0.0 (no risk increase)
Example calculations:
| Scenario | Frequency | Bug Density | Contribution |
|---|---|---|---|
| Stable code | 0.0/month | 0% | 0.0 |
| Low activity, some bugs | 2.0/month | 25% | 0.475 |
| High churn, no bugs | 10.0/month | 0% | 0.5 |
| Bug-prone | 0.5/month | 100% | 1.525 |
| Critical hotspot | 10.0/month | 50% | 1.25 |
Bug Fix Detection
The provider identifies bug fixes using sophisticated pattern matching with word boundary detection to minimize false positives from substring matches like “prefix” or “debug”.
Detection Patterns:
The analyzer searches commit messages for these word-boundary-matched patterns (case-insensitive):
- \bfix\b, \bfixes\b, \bfixed\b, \bfixing\b - Matches “fix” as a complete word, excluding “prefix”, “suffix”, “fixture”
- \bbug\b - Matches “bug” as a complete word, excluding “debug”, “debugging”
- \bhotfix\b - Matches emergency fixes
Git Command:
git log --oneline \
--grep='\bfix\b' --grep='\bfixes\b' --grep='\bfixed\b' \
--grep='\bfixing\b' --grep='\bbug\b' --grep='\bhotfix\b' \
-i -- <file>
Exclusion Filters:
To further reduce false positives, commits are filtered out if they match non-bug-fix patterns:
| Exclusion Type | Keywords | Rationale |
|---|---|---|
| Conventional Commits | style:, chore:, docs:, test: | Not bug fixes, just maintenance |
| Maintenance | formatting, linting, whitespace, typo | Cosmetic changes, not functional bugs |
| Refactoring | refactor: (without bug keywords) | Code improvements without fixing bugs |
Examples of Detection:
| Commit Message | Detected? | Reason |
|---|---|---|
| fix: resolve login bug | ✅ Yes | Contains “fix” and “bug” as complete words |
| Fixed the payment issue | ✅ Yes | Contains “fixed” as a complete word |
| hotfix: urgent database fix | ✅ Yes | Contains “hotfix” and “fix” |
| Bug fix for issue #123 | ✅ Yes | Contains “bug” and “fix” |
| style: apply formatting fixes | ❌ No | Excluded: conventional commit type “style:” |
| refactor: improve prefix handling | ❌ No | Excluded: refactor without bug keywords; “prefix” is only a substring match |
| Add debugging tools | ❌ No | “debugging” contains “bug” but not as a complete word |
| chore: fix linting issues | ❌ No | Excluded: conventional commit type “chore:” |
| update: add fixture for testing | ❌ No | “fixture” contains “fix” but not as a complete word |
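A sketch of this matching and filtering logic using the regex crate (a simplified illustration; the conditional refactor exclusion from the table above is omitted for brevity):

use regex::RegexBuilder;

fn is_bug_fix_commit(message: &str) -> bool {
    let msg = message.to_lowercase();
    // Word-boundary patterns: "prefix", "fixture", "debugging" do not match.
    let bug_pattern = RegexBuilder::new(r"\b(fix(es|ed|ing)?|bug|hotfix)\b")
        .case_insensitive(true)
        .build()
        .unwrap();
    if !bug_pattern.is_match(&msg) {
        return false;
    }
    // Exclusion filters: conventional-commit maintenance types and cosmetic keywords.
    let excluded_prefixes = ["style:", "chore:", "docs:", "test:"];
    let excluded_keywords = ["formatting", "linting", "whitespace", "typo"];
    !(excluded_prefixes.iter().any(|p| msg.starts_with(p))
        || excluded_keywords.iter().any(|k| msg.contains(k)))
}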
Bug Density Calculation:
bug_density = bug_fix_count / total_commits
For example, if a file has 10 total commits and 3 are genuine bug fixes (after exclusion filtering):
- Bug density = 3/10 = 0.30 (30%)
A file with 100% bug density means every commit to that file was a bug fix, which is a strong signal that the code is problematic.
Research Background: The Bug Magnet Hypothesis
The use of git history for risk assessment is backed by extensive empirical research in software engineering. The core theory, often called the “Bug Magnet Hypothesis”, states that code with a history of bugs is significantly more likely to contain future bugs.
Empirical Evidence
Multiple large-scale studies have validated this approach:
- Microsoft Study (Nagappan & Ball, 2005): Analyzed Windows Server 2003 and found that modules with prior bugs were 4-16 times more likely to have future bugs than modules without bug history.
- Mozilla Study (Hassan, 2009): Found that bug prediction models based on change history achieved 73% accuracy in identifying future buggy files.
- Linux Kernel Study (Kim et al., 2007): Showed that files with bug fixes in their recent history had a significantly higher probability of containing latent defects.
Why Past Bugs Predict Future Bugs
There are four key mechanisms that explain this phenomenon:
1. Inherent Complexity: Code that attracted bugs in the past is often more complex, making it harder to fix correctly and more prone to regression.
2. Incomplete Understanding: If developers repeatedly introduce bugs in the same area, it suggests the code is difficult to understand or has subtle edge cases.
3. Technical Debt Accumulation: Bug fixes under time pressure often introduce workarounds rather than proper solutions, creating technical debt that leads to more bugs.
4. Broken Window Effect: Once code develops a reputation for being buggy, it may receive less careful maintenance, perpetuating the cycle.
Interpreting Bug Density Scores
| Bug Density | Interpretation | Action |
|---|---|---|
| 0% - 10% | Healthy | Typical for stable, well-tested code |
| 10% - 30% | Moderate concern | Consider adding tests or documentation |
| 30% - 50% | High risk | Strong candidate for refactoring |
| 50% - 100% | Critical | Almost certainly needs redesign or major refactoring |
Example from real output:
├─ GIT HISTORY: 2.0 changes/month, 100.0% bugs, 30 days old, 1 authors
│ └─ Risk Impact: base_risk=39.7 → contextual_risk=88.9 (2.2x multiplier)
This shows a file where every single commit was a bug fix (100% bug density), resulting in a 2.2x risk multiplier. This is a critical red flag indicating code that needs immediate attention.
Limitations and False Positives
The improved detection methodology with word boundary matching and exclusion filters significantly reduces false positives, but some limitations remain:
Reduced Issues (handled by word boundaries and exclusion filters):
- ✅ Substring matches: Word boundary matching (\bfix\b) now correctly excludes “prefix”, “fixture”, “suffix”, and “debugging”
- ✅ Style/formatting commits: Conventional commit prefixes (style:, chore:) are automatically excluded
- ✅ Maintenance changes: Keywords like “formatting”, “linting”, “whitespace”, “typo” are filtered out
- ✅ Non-bug refactorings: Refactoring commits without bug-related keywords are excluded
Remaining Limitations:
- Commit message quality: Still relies on developers mentioning “fix” or “bug” in commit messages
- Underreporting: Some bug fixes may not be mentioned in commit messages
  - Example: A commit “Update authentication logic” that actually fixes a bug won’t be detected
- New code: Cannot assess code without commit history
  - Files with <5 commits may not have enough data for reliable bug density calculation
- Language barriers: Non-English commit messages may use different bug-fix keywords
- Informal commits: Internal repos may use different conventions (e.g., ticket IDs only)
Accuracy Improvements:
Before word boundary matching and exclusion filters:
- False positive rate: ~15-25% (substring matches, style commits, etc.)
After improvements:
- False positive rate: ~5-10% (primarily commit message quality issues)
- Precision: Significantly improved, particularly for conventional commit style repositories
Recommended practice: Use git history as one signal among many. Combine it with complexity metrics, test coverage, and dependency analysis for a complete risk picture. The bug density metric is most reliable when:
- Repository uses consistent commit message conventions
- Files have at least 10+ commits in their history
- Development team follows English-based conventional commit style
Stability Score
Stability is calculated using weighted factors:
#![allow(unused)]
fn main() {
stability = (churn_factor × 0.4) + (bug_factor × 0.4) + (age_factor × 0.2)
where:
churn_factor = 1.0 / (1.0 + monthly_churn)
bug_factor = 1.0 - (bug_fixes / total_commits)
age_factor = min(1.0, age_days / 365.0)
}
Stability Status Classifications
The provider internally classifies files into stability statuses based on the calculated metrics:
| Status | Criteria | Explanation |
|---|---|---|
| HighlyUnstable | freq > 5.0 AND bug_density > 0.3 | Extremely high churn combined with many bug fixes |
| FrequentlyChanged | freq > 2.0 | High change rate regardless of bug density |
| BugProne | bug_density > 0.2 | High proportion of bug fix commits |
| MatureStable | age > 365 days | Code older than one year (unless unstable) |
| RelativelyStable | (default) | Moderate activity, typical stability |
These classifications are used internally for contribution calculations and appear in verbose output.
Impact on scoring:
- High-churn functions get higher priority
- Recently fixed bugs indicate risk areas
- Stable code (no recent changes) gets lower priority
Use Cases
- Find change-prone code: Identify files that change frequently and need attention
- Detect bug hotspots: Locate areas with high bug fix rates
- Prioritize refactoring: Target unstable code for improvement
- Team collaboration patterns: Files touched by many authors may need better documentation
Enable
debtmap analyze . --context-providers git_history
Configuration:
[analysis]
context_providers = ["git_history"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.git_history]
# Commits to analyze (default: 100)
max_commits = 100
# Time range in days (default: 90)
time_range_days = 90
# Minimum commits to consider "high churn" (default: 10)
high_churn_threshold = 10
Troubleshooting
Git repository not found: The provider requires a Git repository. If analysis fails:
# Verify you're in a git repository
git rev-parse --git-dir
# If not a git repo, initialize one or disable git_history provider
# Option 1: Enable context but exclude git_history
debtmap analyze --context --disable-context git_history
# Option 2: Use only specific providers
debtmap analyze --context-providers critical_path,dependency
Performance issues: Git history analysis can be slow for large repositories:
# Use only lightweight providers
debtmap analyze --context-providers critical_path,dependency
Enabling Context Providers
Context-aware analysis is disabled by default. Enable it using CLI flags:
Enable All Providers
# Enable all available context providers
debtmap analyze --context
# or
debtmap analyze --enable-context
Enable Specific Providers
# Enable only critical_path and dependency
debtmap analyze --context-providers critical_path,dependency
# Enable only git_history
debtmap analyze --context-providers git_history
# Enable all three explicitly
debtmap analyze --context-providers critical_path,dependency,git_history
Disable Specific Providers
# Enable context but disable git_history (useful for non-git repos)
debtmap analyze --context --disable-context git_history
# Enable context but disable dependency analysis
debtmap analyze --context --disable-context dependency
Enabling Multiple Providers
Combine providers for comprehensive analysis:
debtmap analyze . --context-providers critical_path,dependency,git_history
Or via config:
[analysis]
context_providers = ["critical_path", "dependency", "git_history"]
Provider Weights
Each provider has a weight that determines its influence on the final risk score:
| Provider | Weight | Rationale |
|---|---|---|
| critical_path | 1.5 | Critical paths have high impact on users |
| dependency_risk | 1.2 | Architectural dependencies affect many modules |
| git_history | 1.0 | Historical patterns indicate future risk |
The total context contribution is calculated as:
#![allow(unused)]
fn main() {
total_contribution = Σ(contribution_i × weight_i)
Example with all providers:
critical_path: 2.0 × 1.5 = 3.0
dependency: 1.0 × 1.2 = 1.2
git_history: 0.5 × 1.0 = 0.5
────────────────────────────
total_contribution = 4.7
contextual_risk = base_risk × (1.0 + 4.7) = base_risk × 5.7
}
How Context Affects Scoring
Base Scoring (No Context)
Base Score = (Complexity × 0.40) + (Coverage × 0.40) + (Dependency × 0.20)
With Context Providers
Context-Adjusted Score = Base Score × Role Multiplier × Churn Multiplier
Role Multiplier (from critical path & dependency analysis):
- Entry points: 1.5x
- Business logic: 1.2x
- Data access: 1.0x
- Infrastructure: 0.8x
- Utilities: 0.5x
- Test code: 0.1x
Churn Multiplier (from git history):
- High churn (10+ commits/month): 1.3x
- Medium churn (5-10 commits/month): 1.1x
- Low churn (1-5 commits/month): 1.0x
- Stable (0 commits/6 months): 0.8x
Context Details Structure
When using --format json, context information is included in the output. The ContextDetails enum contains provider-specific data:
CriticalPath
{
"provider": "critical_path",
"weight": 1.5,
"contribution": 2.0,
"details": {
"CriticalPath": {
"entry_points": ["main (Main)", "handle_request (ApiEndpoint)"],
"path_weight": 10.0,
"is_user_facing": true
}
}
}
DependencyChain
{
"provider": "dependency_risk",
"weight": 1.2,
"contribution": 1.5,
"details": {
"DependencyChain": {
"depth": 3,
"propagated_risk": 8.5,
"dependents": ["module_a", "module_b", "module_c"],
"blast_radius": 12
}
}
}
Historical
{
"provider": "git_history",
"weight": 1.0,
"contribution": 1.0,
"details": {
"Historical": {
"change_frequency": 3.5,
"bug_density": 0.25,
"age_days": 180,
"author_count": 5
}
}
}
Practical Examples
Example 1: Entry Point vs Utility
Without context providers:
Function: main() - Entry point
Complexity: 8
Coverage: 50%
Score: 6.0 [MEDIUM]
Function: format_string() - Utility
Complexity: 8
Coverage: 50%
Score: 6.0 [MEDIUM]
Both functions have the same score.
With context providers:
Function: main() - Entry point
Complexity: 8
Coverage: 50%
Base Score: 6.0
Role Multiplier: 1.5x (entry point)
Final Score: 9.0 [CRITICAL]
Function: format_string() - Utility
Complexity: 8
Coverage: 50%
Base Score: 6.0
Role Multiplier: 0.5x (utility)
Final Score: 3.0 [LOW]
Entry point is prioritized over utility.
Example 2: High-Churn Function
Without git history:
Function: process_payment()
Complexity: 12
Coverage: 60%
Score: 7.5 [HIGH]
With git history:
Function: process_payment()
Complexity: 12
Coverage: 60%
Base Score: 7.5
Churn: 15 commits in last month (bug fixes)
Churn Multiplier: 1.3x
Final Score: 9.75 [CRITICAL]
High-churn function is elevated to critical.
Example 3: Stable Well-Tested Code
Without context:
Function: legacy_parser()
Complexity: 15
Coverage: 95%
Score: 3.5 [LOW]
With context:
Function: legacy_parser()
Complexity: 15
Coverage: 95%
Base Score: 3.5
Churn: 0 commits in last 2 years
Churn Multiplier: 0.8x
Role: Data access (stable)
Role Multiplier: 1.0x
Final Score: 2.8 [LOW]
Stable, well-tested code gets even lower priority.
Example 4: API Endpoint Prioritization
Analyze a web service to identify critical API endpoints:
debtmap analyze --context-providers critical_path --format json
Functions on API endpoint paths will receive elevated risk scores. Use this to prioritize code review and testing for user-facing functionality.
Example 5: Finding Change-Prone Code
Identify files with high change frequency and bug fixes:
debtmap analyze --context-providers git_history --top 20
This highlights unstable areas of the codebase that may benefit from refactoring or increased test coverage.
Example 6: Architectural Impact Analysis
Find high-impact modules with large blast radius:
debtmap analyze --context-providers dependency --format json | \
jq '.[] | select(.blast_radius > 10)'
Use this to identify architectural choke points that require careful change management.
Example 7: Comprehensive Risk Assessment
Combine all providers for holistic risk analysis:
debtmap analyze --context -v
The verbose output shows how each provider contributes to the final risk score:
function: process_payment
base_risk: 5.0
critical_path: +3.0 (on main path, user-facing)
dependency: +1.2 (12 dependent modules)
git_history: +1.0 (3.5 changes/month, 0.25 bug density)
──────────────────
contextual_risk: 31.0
Configuration
⚠️ Configuration Limitation: Provider-specific TOML configuration sections shown below are planned features not yet implemented. Currently, all provider settings use hard-coded defaults from the implementation. Use CLI flags (--context, --context-providers, --disable-context) to control providers. See the CLI examples throughout this chapter for working configurations.
Configure context providers in .debtmap.toml:
[analysis]
# Enable context-aware analysis (default: false)
enable_context = true
# Specify which providers to use
context_providers = ["critical_path", "dependency", "git_history"]
# Disable specific providers (use CLI flag --disable-context instead)
# disable_context = ["git_history"] # Not yet implemented in config
[context.git_history]
# Commits to analyze (default: 100) - PLANNED
max_commits = 100
# Time range in days (default: 90) - PLANNED
time_range_days = 90
# Minimum commits to consider "high churn" (default: 10) - PLANNED
high_churn_threshold = 10
[context.critical_path]
# Multiplier for entry points (default: 1.5) - PLANNED
entry_point_multiplier = 1.5
# Multiplier for business logic (default: 1.2) - PLANNED
business_logic_multiplier = 1.2
[context.dependency]
# Include transitive dependencies (default: true) - PLANNED
include_transitive = true
# Maximum depth for transitive analysis (default: 5) - PLANNED
max_depth = 5
Performance Considerations
Context providers add computational overhead to analysis:
Impact on analysis time:
- Critical path: +10-15% (fast - call graph traversal)
- Dependency: +20-30% (moderate - iterative risk propagation)
- Git history: +30-50% (slow for large repos - multiple git commands per file)
Combined overhead: ~60-80% increase in analysis time
Batched Git History Optimization
The git history provider uses a batched loading strategy to minimize subprocess overhead. Instead of running separate git commands for each file (which would create N subprocess calls for N files), the provider:
- Fetches all git history data upfront in a single batch (BatchedGitHistory)
- Parses the complete log output into an in-memory index
- Serves subsequent file lookups via O(1) HashMap access
This optimization reduces the git subprocess calls from O(N) to O(1), making git history analysis practical even for large repositories.
Implementation details (from src/risk/context/git_history/batched.rs):
- On initialization, runs a single git log --numstat to fetch all file changes
- Builds per-file indexes for change frequency, bug density, authors, etc.
- Falls back to direct git queries only if batch loading fails
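As a rough illustration of the batching strategy, the sketch below runs a single git log --numstat and builds a per-file change-count index. The real BatchedGitHistory tracks additional signals (bug density, authors) and more robust error handling:
use std::collections::HashMap;
use std::process::Command;

// Simplified sketch of batched git history loading: one subprocess call,
// then O(1) per-file lookups. Not the actual implementation.
fn build_change_index() -> std::io::Result<HashMap<String, u32>> {
    // Single `git log --numstat` call instead of one call per file
    let output = Command::new("git")
        .args(["log", "--numstat", "--format="])
        .output()?;
    let text = String::from_utf8_lossy(&output.stdout);
    let mut index: HashMap<String, u32> = HashMap::new();
    for line in text.lines() {
        // numstat lines look like: "<added>\t<deleted>\t<path>"
        if let Some(path) = line.split('\t').nth(2) {
            *index.entry(path.to_string()).or_insert(0) += 1;
        }
    }
    Ok(index)
}

fn main() -> std::io::Result<()> {
    let index = build_change_index()?;
    // Subsequent lookups are O(1) HashMap access
    println!("src/main.rs changed {} times", index.get("src/main.rs").unwrap_or(&0));
    Ok(())
}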
Optimization Tips
- Start minimal: Use --context-providers critical_path,dependency initially
- Add git_history selectively: Enable for critical modules only
- Use caching: The ContextAggregator caches results by file:function key
- Profile with verbose flags: Use -vvv to see provider execution times
For Large Projects
# Disable git history for faster analysis
debtmap analyze . --disable-context git_history
# Or disable all context
debtmap analyze . --no-context-aware
For CI/CD
# Full analysis with context (run nightly)
debtmap analyze . --context-providers critical_path,dependency,git_history
# Fast analysis without context (run on every commit)
debtmap analyze . --no-context-aware
When to Use Each Provider
| Scenario | Recommended Providers |
|---|---|
| API service refactoring | critical_path |
| Legacy codebase analysis | git_history |
| Microservice boundaries | dependency |
| Pre-release risk review | All providers (--context) |
| CI/CD integration | critical_path,dependency (faster) |
Troubleshooting
Git History Analysis Slow
Issue: Analysis takes much longer with git history enabled
Solutions:
Reduce commit history:
[context.git_history]
max_commits = 50
time_range_days = 30
Use shallow clone in CI:
git clone --depth 50 repo.git
debtmap analyze . --context-providers critical_path,dependency
Incorrect Role Classification
Issue: Function classified as wrong role (e.g., utility instead of business logic)
Possible causes:
- Function naming doesn’t match patterns
- Call graph analysis incomplete
- Function is misplaced in codebase
Solutions:
Check with verbose output:
debtmap analyze . -vv | grep "Role classification"
Manually verify call graph:
debtmap analyze . --show-call-graph
Context Providers Not Available
Issue: --context-providers flag not recognized
Solution: Ensure you’re using a recent version:
debtmap --version
# Should be 0.2.0 or later
Update debtmap:
cargo install debtmap --force
Common Issues
Issue: Context providers not affecting scores
Solution: Ensure providers are enabled with --context or --context-providers
# Wrong: context flag missing
debtmap analyze
# Correct: context enabled
debtmap analyze --context
Issue: Git history provider fails with “Not a git repository”
Solution: Disable git_history if not using version control
debtmap analyze --context --disable-context git_history
Issue: Dependency analysis errors
Solution: Check for circular dependencies or disable dependency provider
debtmap analyze --context --disable-context dependency
Issue: Slow analysis with all providers
Solution: Use selective providers or increase verbosity to identify bottlenecks
# Faster: skip git_history
debtmap analyze --context-providers critical_path,dependency
# Debug: see provider execution times
debtmap analyze --context -vvv
For more troubleshooting guidance, see the Troubleshooting chapter.
Advanced Usage
Interpreting Context Contribution
Enable verbose output to see detailed context contributions:
debtmap analyze --context -v
Each function shows:
- Base risk score from complexity/coverage
- Individual provider contributions
- Total contextual risk score
- Provider-specific explanations
Architecture Exploration
The ContextAggregator caches context by file:function key to avoid redundant analysis during a single run.
Cache Lifetime: The cache is in-memory per ContextAggregator instance and is cleared when a new instance is created or when analyzing a different codebase. This enables efficient re-analysis within the same run without requiring external cache management:
let mut aggregator = ContextAggregator::new()
    .with_provider(Box::new(CriticalPathProvider::new(analyzer)))
    .with_provider(Box::new(DependencyRiskProvider::new(graph)))
    .with_provider(Box::new(GitHistoryProvider::new(repo_root)?));
let context = aggregator.analyze(&target);
let contribution = context.total_contribution();
Custom Provider Implementation
Advanced users can implement custom context providers by implementing the ContextProvider trait:
pub trait ContextProvider: Send + Sync {
    fn name(&self) -> &str;
    fn gather(&self, target: &AnalysisTarget) -> Result<Context>;
    fn weight(&self) -> f64;
    fn explain(&self, context: &Context) -> String;
}
See src/risk/context/mod.rs for the trait definition and src/risk/context/ for built-in provider implementations.
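As a hedged illustration of the trait, a custom provider might look like the sketch below. The helper details (the AnalysisTarget field, Context::new, and contribution()) are simplified assumptions for the example; consult the source files above for the real signatures:
// Hypothetical custom provider; field and constructor details are
// simplified assumptions, not Debtmap's actual types.
struct FileSizeProvider;

impl ContextProvider for FileSizeProvider {
    fn name(&self) -> &str {
        "file_size"
    }

    fn gather(&self, target: &AnalysisTarget) -> Result<Context> {
        // Contribute extra risk for very large files (illustrative heuristic;
        // `line_count` is an assumed field)
        let contribution = if target.line_count > 1_000 { 1.0 } else { 0.0 };
        Ok(Context::new(self.name(), contribution))
    }

    fn weight(&self) -> f64 {
        0.5 // low influence relative to built-in providers
    }

    fn explain(&self, context: &Context) -> String {
        format!("file size contributed {:.1} to risk", context.contribution())
    }
}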
Future Enhancements
Business Context Provider (Planned)
A Business context provider is defined but not yet implemented. It will support:
Business {
    priority: Priority,      // Critical, High, Medium, Low
    impact: Impact,          // Revenue, UserExperience, Security, Compliance
    annotations: Vec<String> // Custom business metadata
}
This will allow manual prioritization based on business requirements through code annotations or configuration files.
Best Practices
- Use all providers for comprehensive analysis - Especially for production code
- Disable git history in CI - Use shallow clones or disable for speed
- Verify role classifications - Use -vv to see how functions are classified
- Tune multipliers for your project - Adjust in config based on architecture
- Combine with coverage data - Context providers enhance coverage-based risk analysis
Summary
Context providers transform debtmap from a static complexity analyzer into a comprehensive risk assessment tool. By combining:
- Critical path analysis for user impact
- Dependency analysis for architectural risk
- Git history analysis for maintenance patterns
You gain actionable insights for prioritizing technical debt and refactoring efforts. Start with --context to enable all providers, then refine based on your project’s needs.
See Also
- Analysis Guide - Core analysis concepts
- Risk Scoring - Risk scoring methodology
- Scoring Strategies - File-level and function-level scoring
- Configuration - Complete configuration reference
- Parallel Processing - Performance optimization
- Troubleshooting - Common issues and solutions
Coverage Gap Analysis
Debtmap provides precise line-level coverage gap reporting to help you understand exactly which lines of code lack test coverage, rather than relying on misleading function-level percentages.
Understanding Coverage Gaps
A coverage gap represents the portion of a function that is not executed during tests. Traditional tools report this as a simple percentage (e.g., “50% covered”), but this can be misleading:
- A 100-line function with 1 uncovered line shows “99% covered” - sounds great!
- A 10-line function with 1 uncovered line shows “90% covered” - sounds worse, but is actually better
Debtmap improves on this by:
- Reporting the actual number of uncovered lines
- Showing which specific lines are uncovered
- Calculating the gap as a percentage of instrumented lines (not total lines)
- Providing visual severity indicators based on gap size
Precise vs Estimated Gaps
Debtmap uses different precision levels depending on available coverage data:
Precise Gaps (Line-Level Data Available)
When LCOV coverage data is available, debtmap provides exact line-level reporting:
Business logic - 1 line uncovered (11% gap) - line 52
Complex calculation - 4 lines uncovered (20% gap) - lines 10-12, 15
Benefits:
- Exact line numbers for uncovered code
- Accurate gap percentage based on instrumented lines
- Compact line range formatting (e.g., “10-12, 15, 20-21”)
- Distinguishes between code that can’t be instrumented vs uncovered code
How it works:
- Debtmap reads LCOV coverage data from your test runs
- Matches functions by file path and name
- Extracts precise uncovered line numbers
- Calculates percentage as: (uncovered_lines / instrumented_lines) * 100
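The sketch below shows the precise-gap arithmetic and the compact line-range formatting in plain Rust. Function names are illustrative, not Debtmap's internals:
// Sketch: precise gap percentage and compact range formatting ("10-12, 15").
fn gap_percentage(uncovered: usize, instrumented: usize) -> f64 {
    (uncovered as f64 / instrumented as f64) * 100.0
}

fn format_ranges(mut lines: Vec<usize>) -> String {
    lines.sort_unstable();
    let mut parts: Vec<String> = Vec::new();
    let mut i = 0;
    while i < lines.len() {
        let start = lines[i];
        let mut end = start;
        // Extend the range while line numbers are consecutive
        while i + 1 < lines.len() && lines[i + 1] == end + 1 {
            i += 1;
            end = lines[i];
        }
        parts.push(if start == end {
            start.to_string()
        } else {
            format!("{start}-{end}")
        });
        i += 1;
    }
    parts.join(", ")
}

fn main() {
    // Matches the example above: 4 of 20 instrumented lines uncovered
    println!("{:.0}% gap - lines {}", gap_percentage(4, 20), format_ranges(vec![10, 11, 12, 15]));
    // Prints: 20% gap - lines 10-12, 15
}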
Estimated Gaps (Function-Level Data Only)
When only function-level coverage percentages are available:
Data processing - ~50% gap (estimated, ~25 lines)
Helper function - ~100% gap (estimated, 15 lines)
Utility - ~3% gap (mostly covered)
Characteristics:
- Estimates uncovered line count from percentage
- Uses tilde (~) prefix to indicate estimation
- Special cases:
- ≥99% gap → “~100% gap”
- <5% gap → “mostly covered”
- Otherwise → “~X% gap (estimated, ~Y lines)”
How it works:
- Falls back when LCOV data unavailable or function not found
- Calculates: estimated_uncovered = total_lines * (gap_percentage / 100)
- Useful for a quick overview but less actionable than precise gaps
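A minimal sketch of the display rules above, assuming hypothetical function names:
// Sketch of the estimated-gap display rules described above.
fn describe_estimated_gap(gap_pct: f64, total_lines: usize) -> String {
    let est_uncovered = (total_lines as f64 * gap_pct / 100.0).round() as usize;
    if gap_pct >= 99.0 {
        "~100% gap".to_string()
    } else if gap_pct < 5.0 {
        "mostly covered".to_string()
    } else {
        format!("~{gap_pct:.0}% gap (estimated, ~{est_uncovered} lines)")
    }
}

fn main() {
    println!("{}", describe_estimated_gap(50.0, 50)); // ~50% gap (estimated, ~25 lines)
    println!("{}", describe_estimated_gap(3.0, 100)); // mostly covered
}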
Unknown Coverage
When no coverage data is available:
Untested module - Coverage data unavailable (42 lines)
This typically occurs when:
- No coverage collection has been run
- File not included in coverage report
- Coverage data file path mismatch
Gap Severity Indicators
Debtmap uses visual indicators to quickly identify the severity of coverage gaps:
| Indicator | Range | Severity | Meaning |
|---|---|---|---|
| 🟡 | 1-25% | LOW | Minor gaps, mostly covered |
| 🟠 | 26-50% | MODERATE | Significant gaps, needs attention |
| 🔴 | 51-75% | HIGH | Major gaps, high priority |
| 🔴🔴 | 76-100% | CRITICAL | Severe gaps, critical priority |
These indicators appear in debtmap’s priority output to help you quickly identify which functions need testing most urgently.
Severity Calculation
Gap severity is based on the percentage of uncovered code:
fn get_severity(gap_percentage: f64) -> &'static str {
    match gap_percentage {
        p if p <= 25.0 => "🟡 LOW",
        p if p <= 50.0 => "🟠 MODERATE",
        p if p <= 75.0 => "🔴 HIGH",
        _ => "🔴🔴 CRITICAL"
    }
}
This works for both precise and estimated gaps, ensuring consistent severity classification across your codebase.
Example Output
High Verbosity Mode
Priority 1: Authentication Logic (CRITICAL)
File: src/auth/login.rs:45
Coverage Gap: 2 lines uncovered (89% gap) 🔴🔴 - lines 67, 89
Complexity: Cyclomatic 8, Cognitive 12
Impact: High-risk business logic with critical coverage gaps
Priority 2: Data Validation (HIGH)
File: src/validation/rules.rs:120
Coverage Gap: 15 lines uncovered (65% gap) 🔴 - lines 145-152, 167-173
Complexity: Cyclomatic 5, Cognitive 8
Impact: Complex validation logic needs comprehensive testing
Priority 3: Helper Function (MODERATE)
File: src/utils/helpers.rs:30
Coverage Gap: ~45% gap (estimated, ~12 lines) 🟠
Complexity: Cyclomatic 3, Cognitive 4
Impact: Moderate complexity with estimated coverage gaps
Standard Mode
1. Authentication Logic (src/auth/login.rs:45)
Gap: 2 lines uncovered (89%) 🔴🔴 [lines 67, 89]
2. Data Validation (src/validation/rules.rs:120)
Gap: 15 lines uncovered (65%) 🔴 [lines 145-152, 167-173]
3. Helper Function (src/utils/helpers.rs:30)
Gap: ~45% (estimated) 🟠
Integration with Coverage Tools
Generating LCOV Data
For precise gap reporting, generate LCOV coverage data with your test framework:
Rust (using cargo-tarpaulin):
cargo tarpaulin --out Lcov --output-dir ./coverage
Python (using pytest-cov):
pytest --cov=mypackage --cov-report=lcov:coverage/lcov.info
JavaScript (using Jest):
jest --coverage --coverageReporters=lcov
Configuring Debtmap
Point debtmap to your coverage data:
debtmap analyze --lcov ./coverage/lcov.info
Or in .debtmap.toml:
[coverage]
lcov_path = "./coverage/lcov.info"
Best Practices
1. Use Precise Gaps When Possible
Always generate LCOV data for actionable coverage insights:
- Precise line numbers help you quickly locate untested code
- Accurate percentages prevent over/under-estimating gaps
- Line ranges show if gaps are concentrated or scattered
2. Focus on High Severity Gaps First
Prioritize based on severity indicators:
- 🔴🔴 CRITICAL (76-100%) - Address immediately
- 🔴 HIGH (51-75%) - Schedule for next sprint
- 🟠 MODERATE (26-50%) - Address when convenient
- 🟡 LOW (1-25%) - Acceptable for some code
3. Consider Context
Gap severity should be weighted by:
- Function role: Business logic vs utilities
- Complexity: High complexity + high gap = top priority
- Change frequency: Frequently changed code needs better coverage
- Risk: Security, data integrity, financial calculations
4. Track Progress Over Time
Run debtmap regularly to track coverage improvements:
# Weekly coverage check
debtmap analyze --lcov ./coverage/lcov.info > weekly-gaps.txt
Compare reports to see gap reduction progress.
Troubleshooting
“Coverage data unavailable” for all functions
Cause: Debtmap can’t find or parse LCOV file
Solutions:
- Verify --lcov points to a valid LCOV file
- Ensure the LCOV file was generated recently
- Check file permissions (readable by debtmap)
- Validate LCOV format:
head -20 ./coverage/lcov.info
Line numbers don’t match source code
Cause: Source code changed since coverage was generated
Solutions:
- Re-run tests with coverage collection
- Ensure clean build before coverage run
- Commit code before running coverage
Estimated gaps for functions with LCOV data
Cause: Function name or path mismatch
Solutions:
- Check function names match exactly (case-sensitive)
- Verify file paths are consistent (relative vs absolute)
- Enable debug logging:
debtmap analyze --log-level debug
Missing functions in coverage report
Cause: Functions not instrumented or filtered out
Solutions:
- Check coverage tool configuration
- Ensure test execution reaches those functions
- Verify functions aren’t in excluded paths
Related Topics
- Coverage Integration - Detailed coverage tool setup
- Tiered Prioritization - How coverage gaps affect priority
- Scoring Strategies - Coverage weight in debt scoring
- Metrics Reference - All coverage-related metrics
Coverage Integration
Coverage integration is one of Debtmap’s most powerful capabilities, enabling risk-based prioritization by correlating complexity metrics with test coverage. This helps you identify truly risky code—functions that are both complex and untested—rather than just highlighting complex but well-tested functions.
Why Coverage Matters
Without coverage data, complexity analysis shows you what’s complex, but not what’s risky. A complex function with 100% test coverage poses far less risk than a simple function with 0% coverage on a critical path.
Coverage integration transforms Debtmap from a complexity analyzer into a risk assessment tool:
- Prioritize testing efforts: Focus on high-complexity functions with low coverage
- Validate refactoring safety: See which complex code is already protected by tests
- Risk-based sprint planning: Surface truly risky code ahead of well-tested complexity
- Quantify risk reduction: Measure how coverage improvements reduce project risk
LCOV Format: The Universal Standard
Debtmap uses the LCOV format for coverage data. LCOV is a language-agnostic standard supported by virtually all coverage tools across all major languages.
Why LCOV?
- Universal compatibility: Works with Rust, Python, JavaScript, TypeScript, Go, and more
- Tool independence: Not tied to any specific test framework
- Simple text format: Easy to inspect and debug
- Widely supported: Generated by most modern coverage tools
LCOV File Structure
An LCOV file contains line-by-line coverage information:
SF:src/analyzer.rs
FN:42,calculate_complexity
FNDA:15,calculate_complexity
DA:42,15
DA:43,15
DA:44,12
DA:45,0
LH:3
LF:4
end_of_record
- SF: - Source file path
- FN: - Function name and starting line
- FNDA: - Function execution count
- DA: - Line execution data (line number, hit count)
- LH: - Lines hit
- LF: - Lines found (total)
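As a small illustration of the format (not Debtmap's actual parser), the sketch below collects uncovered lines from DA: records:
// Illustration of the LCOV DA: record format: collect lines with zero hits.
fn uncovered_lines(lcov: &str) -> Vec<usize> {
    lcov.lines()
        .filter_map(|l| l.strip_prefix("DA:"))
        .filter_map(|rest| {
            // DA:<line>,<hit count>
            let mut parts = rest.split(',');
            let line: usize = parts.next()?.parse().ok()?;
            let hits: u64 = parts.next()?.parse().ok()?;
            (hits == 0).then_some(line)
        })
        .collect()
}

fn main() {
    let record = "DA:42,15\nDA:43,15\nDA:44,12\nDA:45,0";
    assert_eq!(uncovered_lines(record), vec![45]); // only line 45 has 0 hits
}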
Rust Name Demangling
For Rust projects, Debtmap includes sophisticated name demangling to correlate LCOV coverage with analyzed functions. The demangling system handles:
Mangling Schemes:
- v0 scheme: Starts with _RNv (modern Rust, default since 1.38)
- Legacy scheme: Starts with _ZN (older Rust versions)
Normalization Process (see src/risk/lcov/demangle.rs:demangle_function_name and src/risk/lcov/normalize.rs:normalize_demangled_name):
- Demangle: Convert mangled symbols to human-readable names
- Strip crate hashes: Remove build-specific hash IDs (e.g., debtmap[71f4b4990cdcf1ab] → debtmap)
- Strip generic parameters: Remove type parameters (e.g., HashMap<K,V>::insert → HashMap::insert)
- Extract method names: Store both full path and method name for flexible matching
Examples:
| Original (in LCOV) | After Demangling | Normalized Full Path | Method Name |
|---|---|---|---|
_RNvXs0_14debtmap...visit_expr | <debtmap[hash]::Type>::visit_expr | debtmap::Type::visit_expr | visit_expr |
Type::method::<T> | Type::method::<T> | Type::method | method |
std::vec::Vec<T>::push | std::vec::Vec<T>::push | std::vec::Vec::push | push |
This normalization enables Debtmap to match coverage data even when:
- Crate hashes change between builds
- Multiple monomorphizations of generic functions exist
- LCOV stores simplified names while Debtmap uses qualified names
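A rough sketch of the normalization step as plain string processing; the real logic in src/risk/lcov/normalize.rs handles more edge cases:
// Rough sketch: drop text inside <...> (generics) and [...] (crate hashes).
fn normalize(name: &str) -> String {
    let mut out = String::new();
    let mut depth = 0usize;  // nesting level inside <...>
    let mut in_hash = false; // inside [...]
    for c in name.chars() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            '[' => in_hash = true,
            ']' => in_hash = false,
            _ if depth == 0 && !in_hash => out.push(c),
            _ => {}
        }
    }
    // Drop a trailing "::" left behind by "method::<T>"
    out.trim_end_matches("::").to_string()
}

fn main() {
    assert_eq!(normalize("debtmap[71f4b4990cdcf1ab]::Type::visit_expr"), "debtmap::Type::visit_expr");
    assert_eq!(normalize("HashMap<K,V>::insert"), "HashMap::insert");
    assert_eq!(normalize("Type::method::<T>"), "Type::method");
}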
Generating Coverage Data
Rust: cargo-tarpaulin
Installation:
cargo install cargo-tarpaulin
Generate LCOV:
cargo tarpaulin --out lcov --output-dir target/coverage
Analyze with Debtmap:
debtmap analyze . --lcov target/coverage/lcov.info
Common Issues:
- Ensure tests compile before running tarpaulin
- Use --ignore-tests if tests themselves show up in coverage
- Check paths match your project structure (relative to project root)
JavaScript/TypeScript: Jest
Configuration (package.json or jest.config.js):
{
"jest": {
"coverageReporters": ["lcov", "text"]
}
}
Generate Coverage:
npm test -- --coverage
Analyze with Debtmap:
debtmap analyze . --lcov coverage/lcov.info
Python: pytest-cov
Installation:
pip install pytest-cov
Generate LCOV:
pytest --cov=src --cov-report=lcov
Analyze with Debtmap:
debtmap analyze . --lcov coverage.lcov
Go: go test with gcov2lcov
Install gcov2lcov:
go install github.com/jandelgado/gcov2lcov@latest
Generate Coverage:
# Generate Go coverage profile
go test -coverprofile=coverage.out ./...
# Convert to LCOV format
gcov2lcov -infile=coverage.out -outfile=coverage.lcov
Analyze with Debtmap:
debtmap analyze . --lcov coverage.lcov --languages go
Note: Go’s native coverage format requires conversion to LCOV. The gcov2lcov tool provides direct conversion without intermediate formats.
Role-Based Coverage Expectations
Not all functions need the same level of test coverage. Debtmap uses a role-based coverage expectation system to adjust scoring based on function purpose (see src/priority/scoring/coverage_expectations.rs and src/risk/evidence/coverage_analyzer.rs).
Function Roles and Coverage Targets
| Role | Coverage Range | Rationale | Role Weight |
|---|---|---|---|
| PureLogic | 90-95% | Business logic requires comprehensive testing | 1.0 |
| EntryPoint | 70-80% | Better tested with integration tests | 0.9 |
| Orchestrator | 60-80% | Coordinates other functions, moderate complexity | 0.6 |
| IOWrapper | 40-60% | Thin I/O layer, often integration-tested | 0.4 |
| PatternMatch | 50-70% | Simple pattern matching, lower complexity | 0.3 |
| Debug | 20-30% | Diagnostic functions, low priority | 0.2 |
| Unknown | 70-90% | Default for unclassified functions | 0.8 |
Source: See src/priority/semantic_classifier/mod.rs:25-32 for role definitions and src/risk/evidence/coverage_analyzer.rs:63-73 for role weight calculation.
Coverage Gap Severity Classification
Debtmap classifies coverage gaps into 4 severity levels (see src/priority/scoring/coverage_expectations.rs:GapSeverity):
| Severity | Condition | Impact on Score | Visual |
|---|---|---|---|
| None | Coverage ≥ target | No penalty | 🟢 |
| Minor | Coverage between min and target | Small penalty | 🟡 |
| Moderate | Coverage between 50% of min and min | Medium penalty | 🟠 |
| Critical | Coverage < 50% of min | High penalty | 🔴 |
Example: For PureLogic (target: 95%, min: 90%):
- 96% coverage → None (🟢)
- 92% coverage → Minor (🟡)
- 75% coverage → Moderate (🟠)
- 40% coverage → Critical (🔴)
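A minimal sketch of the severity rules, parameterized by a role's min and target (here the PureLogic thresholds; names are illustrative):
// Sketch of role-based gap severity using min/target thresholds.
fn gap_severity(coverage: f64, min: f64, target: f64) -> &'static str {
    if coverage >= target {
        "None 🟢"
    } else if coverage >= min {
        "Minor 🟡"
    } else if coverage >= min * 0.5 {
        "Moderate 🟠"
    } else {
        "Critical 🔴"
    }
}

fn main() {
    // PureLogic: target 95%, min 90%
    for cov in [96.0, 92.0, 75.0, 40.0] {
        println!("{cov}% -> {}", gap_severity(cov, 90.0, 95.0));
    }
}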
Test Quality Classification
Coverage is classified into quality tiers based on both percentage and complexity (see src/risk/evidence/coverage_analyzer.rs:44-52):
fn classify_test_quality(coverage: f64, complexity: u32) -> TestQuality {
    match () {
        _ if coverage >= 90.0 && complexity <= 5 => TestQuality::Excellent,
        _ if coverage >= 80.0 => TestQuality::Good,
        _ if coverage >= 60.0 => TestQuality::Adequate,
        _ if coverage > 0.0 => TestQuality::Poor,
        _ => TestQuality::Missing,
    }
}
Quality Levels:
- Excellent: ≥90% coverage AND complexity ≤5 (simple, well-tested)
- Good: ≥80% coverage (comprehensive testing)
- Adequate: ≥60% coverage (basic testing)
- Poor: >0% but <60% coverage (incomplete testing)
- Missing: 0% coverage (no tests)
How Roles Affect Scoring
Role weights adjust the coverage penalty applied to functions (see src/risk/evidence/coverage_analyzer.rs:63-73):
Example: A function with 50% coverage:
- PureLogic (weight: 1.0): Full penalty, high urgency
- Orchestrator (weight: 0.6): 60% of full penalty
- Debug (weight: 0.2): Only 20% of full penalty, low urgency
This ensures that:
- Business logic functions are prioritized for testing
- Entry points rely more on integration tests
- Diagnostic/debug functions don’t create noise
How Coverage Affects Scoring
Coverage data fundamentally changes how Debtmap calculates debt scores. The scoring system operates in two different modes depending on whether coverage data is available.
Scoring Modes
Mode 1: With Coverage Data (Dampening Multiplier)
When you provide an LCOV file with --lcov, coverage acts as a dampening multiplier that reduces scores for well-tested code:
Base Score = (Complexity Factor × 0.50) + (Dependency Factor × 0.25)
Coverage Multiplier = 1.0 - coverage_percentage
Final Score = Base Score × Coverage Multiplier
This is the current implementation as of spec 122. Coverage dampens the base score rather than contributing as an additive component.
Mode 2: Without Coverage Data (Weighted Sum)
When no coverage data is available, Debtmap falls back to a weighted sum model:
Final Score = (Coverage × 0.50) + (Complexity × 0.35) + (Dependency × 0.15)
In this mode, coverage is assumed to be 0% (worst case), giving it a weight of 50% in the total score. See src/priority/scoring/calculation.rs:119-129 for the implementation.
Coverage Dampening Multiplier
When coverage data is provided, it acts as a multiplier that dampens the base score:
Coverage Multiplier = 1.0 - coverage_percentage
Final Score = Base Score × Coverage Multiplier
Examples:
| Base Score | Coverage | Multiplier | Final Score | Priority |
|---|---|---|---|---|
| 8.5 | 100% | 0.0 | 0.0 | Minimal (well-tested) |
| 8.5 | 50% | 0.5 | 4.25 | Medium |
| 8.5 | 0% | 1.0 | 8.5 | High (untested) |
Key Insight: Complex but well-tested code automatically drops in priority, while untested complex code rises to the top.
Special Cases:
- Test functions: Coverage multiplier = 0.0 (tests get near-zero scores regardless of complexity)
- Entry points: Handled through semantic classification (FunctionRole) system with role multipliers, not coverage-specific weighting
Invariant: Total debt score with coverage ≤ total debt score without coverage.
Implementation: See src/priority/scoring/calculation.rs:68-82 for the coverage dampening calculation.
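A simplified sketch of the dampening calculation (coverage as a fraction from 0.0 to 1.0); the weighted-sum fallback mode is omitted here:
// Sketch of the dampening multiplier; function name is illustrative.
fn dampened_score(base_score: f64, coverage: f64, is_test_fn: bool) -> f64 {
    if is_test_fn {
        return 0.0; // test functions get near-zero scores regardless of complexity
    }
    base_score * (1.0 - coverage)
}

fn main() {
    assert_eq!(dampened_score(8.5, 1.0, false), 0.0);  // well-tested: minimal priority
    assert_eq!(dampened_score(8.5, 0.5, false), 4.25); // medium
    assert_eq!(dampened_score(8.5, 0.0, false), 8.5);  // untested: full base score
}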
Transitive Coverage Propagation
Debtmap doesn’t just look at direct coverage—it propagates coverage through the call graph using transitive analysis.
How It Works
A function’s effective coverage considers:
- Direct coverage: Lines executed by tests
- Caller coverage: Coverage of functions that call this function
Transitive Coverage = Direct Coverage + Σ(Caller Coverage × Weight)
Algorithm Parameters
The transitive coverage propagation uses carefully tuned parameters to balance accuracy and performance:
- Well-Tested Threshold: 80% - Only functions with ≥80% direct coverage contribute to indirect coverage, ensuring high confidence
- Distance Discount: 70% per hop - Each level of indirection reduces contribution by 30%, reflecting decreased confidence
- Maximum Distance: 3 hops - Limits recursion depth to prevent exponential complexity (after 3 hops, contribution drops to ~34%)
These parameters ensure that indirect coverage signals are meaningful while preventing false confidence from distant call relationships. See src/priority/coverage_propagation.rs:38-46 for the implementation.
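A minimal sketch of a single caller's indirect contribution under these parameters; the real propagation aggregates across all callers in the call graph:
// Sketch of the propagation parameters: only callers with ≥80% coverage
// contribute, discounted 0.7 per hop, up to 3 hops.
const WELL_TESTED: f64 = 0.80;
const DISCOUNT_PER_HOP: f64 = 0.70;
const MAX_HOPS: u32 = 3;

fn indirect_contribution(caller_coverage: f64, hops: u32) -> f64 {
    if hops == 0 || hops > MAX_HOPS || caller_coverage < WELL_TESTED {
        return 0.0;
    }
    caller_coverage * DISCOUNT_PER_HOP.powi(hops as i32)
}

fn main() {
    // A 100%-covered direct caller contributes 0.70;
    // the same caller three hops away contributes ~0.34.
    println!("{:.2}", indirect_contribution(1.0, 1)); // 0.70
    println!("{:.2}", indirect_contribution(1.0, 3)); // 0.34
}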
Why It Matters
A function with 0% direct coverage might have high transitive coverage if it’s only called by well-tested functions:
// direct_coverage = 0%
// But called only by `process_request` (100% coverage)
// → transitive_coverage = 85%
fn validate_input(data: &str) -> bool {
    !data.is_empty()
}

// direct_coverage = 100%
fn process_request(input: String) -> Result<(), &'static str> {
    if !validate_input(&input) {
        return Err("Invalid");
    }
    // ...
    Ok(())
}
Effect: validate_input has reduced urgency because it’s only reachable through well-tested code paths.
Generic Function Coverage (Monomorphization)
Challenge: Generic functions in Rust get monomorphized into multiple versions, each appearing as a separate function in LCOV with different coverage.
For example, execute::<T>() might appear as:
- execute::<WorkflowExecutor> - 70% coverage, uncovered: [10, 20, 30]
- execute::<MockExecutor> - 80% coverage, uncovered: [20, 40]
Debtmap’s Solution (see src/risk/coverage_index.rs:merge_coverage):
- Base Function Index: Maps base names to all monomorphized versions: (file, "execute") → ["execute::<WorkflowExecutor>", "execute::<MockExecutor>"]
- Intersection Merge Strategy: A line is uncovered ONLY if ALL versions leave it uncovered
  - Coverage percentage: Average across all versions (75% in example)
  - Uncovered lines: Intersection of uncovered sets ([20] in example)
- Conservative Approach: Ensures we don't claim coverage that doesn't exist in all code paths
Example Aggregation:
| Version | Coverage | Uncovered Lines |
|---|---|---|
execute::<WorkflowExecutor> | 70% | [10, 20, 30] |
execute::<MockExecutor> | 80% | [20, 40] |
| Aggregated Result | 75% | [20] |
Line 20 is uncovered in BOTH versions, so it’s risky. Lines 10, 30, 40 are covered in at least one version, so they’re considered safer.
Implementation: See src/risk/coverage_index.rs:AggregateCoverage and merge_coverage (lines 50-139).
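A simplified sketch of the intersection merge (the real merge_coverage tracks per-line hit counts and more metadata):
use std::collections::HashSet;

// Sketch: average the coverage percentages and keep only lines
// uncovered in EVERY monomorphized version.
fn merge(versions: &[(f64, Vec<usize>)]) -> (f64, Vec<usize>) {
    let avg = versions.iter().map(|(pct, _)| pct).sum::<f64>() / versions.len() as f64;
    let mut uncovered: HashSet<usize> = versions[0].1.iter().copied().collect();
    for (_, lines) in &versions[1..] {
        let set: HashSet<usize> = lines.iter().copied().collect();
        uncovered.retain(|l| set.contains(l));
    }
    let mut uncovered: Vec<usize> = uncovered.into_iter().collect();
    uncovered.sort_unstable();
    (avg, uncovered)
}

fn main() {
    let versions = vec![
        (70.0, vec![10, 20, 30]), // execute::<WorkflowExecutor>
        (80.0, vec![20, 40]),     // execute::<MockExecutor>
    ];
    assert_eq!(merge(&versions), (75.0, vec![20]));
}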
Trait Method Coverage Matching
Challenge: LCOV files may store trait method implementations with simplified names while Debtmap tracks fully qualified names.
Example Mismatch:
- LCOV stores: visit_expr (demangled method name)
- Debtmap queries: RecursiveMatchDetector::visit_expr (fully qualified)
Debtmap’s Solution (see src/risk/coverage_index.rs:generate_name_variants and method_name_index):
- Name Variant Generation: Extract the method name from qualified paths: RecursiveMatchDetector::visit_expr → generates variant visit_expr
- Method Name Index: O(1) lookup from method name to all implementations: (file, "visit_expr") → ["RecursiveMatchDetector::visit_expr", "_RNvXs0_...visit_expr"]
- Multi-Strategy Matching: Try variants if exact match fails
  - First: Exact qualified name match
  - Second: Method name variant match
  - Third: Line-based fallback
Implementation: See src/risk/coverage_index.rs:192 (method_name_index) and generate_name_variants (lines 12-48).
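A minimal sketch of the variant-generation idea (illustrative, not the actual generate_name_variants):
// Sketch: the last path segment of a qualified name becomes a lookup variant.
fn method_name_variant(qualified: &str) -> Option<&str> {
    qualified.rsplit("::").next().filter(|v| *v != qualified)
}

fn main() {
    assert_eq!(
        method_name_variant("RecursiveMatchDetector::visit_expr"),
        Some("visit_expr")
    );
    // Unqualified names produce no extra variant
    assert_eq!(method_name_variant("visit_expr"), None);
}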
Performance Characteristics
Coverage integration is highly optimized for large codebases using a multi-strategy lookup system.
Coverage Index Structure
The coverage index uses nested HashMaps plus supporting indexes for O(1) lookups (see src/risk/coverage_index.rs:172-196):
- by_file: HashMap<PathBuf, HashMap<String, FunctionCoverage>> - Primary nested index
- by_line: HashMap<PathBuf, BTreeMap<usize, FunctionCoverage>> - Line-based range queries
- method_name_index: Maps method names to full qualified names (trait methods)
Lookup Strategy Waterfall
Debtmap tries 5 strategies in order, stopping at the first match (see src/risk/coverage_index.rs:get_function_coverage_with_line):
| Strategy | Complexity | When It Matches | Typical Latency |
|---|---|---|---|
| 1. Exact Match | O(1) | File path and function name exactly match LCOV | ~0.5μs |
| 2. Suffix Matching | O(files) | Query path ends with LCOV file path, then O(1) function lookup | ~5-8μs |
| 3. Reverse Suffix | O(files) | LCOV file path ends with query path, then O(1) function lookup | ~5-8μs |
| 4. Normalized Equality | O(files) | Paths equal after normalizing ./ prefix, then O(1) function lookup | ~5-8μs |
| 5. Line-Based Fallback | O(log n) | Match by line number ±2 tolerance using BTreeMap range query | ~10-15μs |
Strategy Optimizations:
- Path strategies iterate over FILES (typically ~375) not functions (~1,500+), providing 4x-50x speedup
- Each path strategy tries 3 name matching techniques per file:
- Exact name match
- Function name suffix match (handles qualified vs short names)
- Method name match (handles trait implementations)
Performance Benchmarks
- Index Build: O(n), ~20-30ms for 5,000 functions
- Exact Lookup: O(1), ~0.5μs per lookup
- Path Strategy Fallback: O(files) × O(1), ~5-8μs when exact match fails
- Line-Based Fallback: O(log n), ~10-15μs when all path strategies fail
- Memory Usage: ~200 bytes per record (~2MB for 5,000 functions)
- Thread Safety: Lock-free parallel access via Arc<CoverageIndex>
- Analysis Overhead: ~2.5x baseline (target: ≤3x)
Result: Coverage integration adds minimal overhead even on projects with thousands of functions.
Debugging Lookup Performance
The coverage index tracks detailed statistics for performance analysis (see src/risk/coverage_index.rs:CoverageIndexStats):
pub struct CoverageIndexStats {
    pub total_files: usize,
    pub total_records: usize,
    pub index_build_time: Duration,
    pub estimated_memory_bytes: usize,
}
Enable verbose logging to see which strategy matched:
debtmap analyze . --lcov coverage.info -vv
Output shows strategy attempts:
Looking up coverage for function 'visit_expr' in file 'src/detector.rs'
Strategy 1: Suffix matching...
Found path match: 'src/detector.rs'
✓ Matched method name 'visit_expr' -> 'RecursiveMatchDetector::visit_expr': 85%
CLI Options Reference
Primary Coverage Options
# Provide LCOV coverage file
debtmap analyze . --coverage-file path/to/lcov.info
# Shorthand alias
debtmap analyze . --lcov path/to/lcov.info
Context Providers
Coverage can be combined with other context providers for nuanced risk assessment:
# Enable all context providers (includes coverage propagation)
debtmap analyze . --lcov coverage.info --enable-context
# Specify specific providers
debtmap analyze . --lcov coverage.info \
--context-providers critical_path,dependency,git_history
# Disable specific providers
debtmap analyze . --lcov coverage.info \
--disable-context git_history
Available Context Providers:
- critical_path: Identifies functions on critical execution paths
- dependency: Analyzes dependency relationships and impact
- git_history: Uses change frequency from version control
See Scoring Strategies for details on how these combine.
Validate Command Support
The validate command also supports coverage integration for risk-based quality gates:
# Fail CI builds if untested complex code exceeds thresholds
debtmap validate . --lcov coverage.info --max-debt-density 50
See CLI Reference for complete validation options.
Troubleshooting Coverage Integration
Coverage Not Correlating with Functions
Symptoms:
- Debtmap shows 0% coverage for all functions
- Warning: “No coverage data correlated with analyzed functions”
Solutions:
- Verify LCOV Format:
head coverage.info
# Should show: SF:, FN:, DA: lines
- Check Path Matching: Coverage file paths must be relative to project root:
# Good: SF:src/analyzer.rs
# Bad: SF:/home/user/project/src/analyzer.rs
- Use explain-coverage Command:
debtmap explain-coverage . --lcov coverage.info \
--function validate_input \
--file src/validator.rs \
--format json
The explain-coverage command provides detailed diagnostics:
JSON Output Structure (see src/commands/explain_coverage.rs:ExplainCoverageResult):
{
"function_name": "validate_input",
"file_path": "src/validator.rs",
"coverage_found": true,
"coverage_percentage": 85.0,
"matched_by_strategy": "Suffix Matching",
"attempts": [
{
"strategy": "Exact Match",
"success": false,
"matched_function": null,
"coverage_percentage": null
},
{
"strategy": "Suffix Matching",
"success": true,
"matched_function": "validator::validate_input",
"matched_file": "src/validator.rs",
"coverage_percentage": 85.0
}
],
"available_functions": [
"src/validator.rs::validate_input",
"src/processor.rs::process_request"
],
"available_files": [
"src/validator.rs",
"src/processor.rs"
]
}
Key Fields:
- attempts[]: Shows all 5 strategies tried and which succeeded
- available_functions[]: All functions found in LCOV (helps identify naming mismatches)
- available_files[]: All files in LCOV (helps debug path issues)
- Enable Verbose Logging:
debtmap analyze . --lcov coverage.info -vv
This shows coverage lookup details for each function during analysis.
- Verify Coverage Tool Output:
# Ensure your coverage tool generated line data (DA: records)
grep "^DA:" coverage.info | head
Functions Still Show Up Despite 100% Coverage
This is expected behavior when:
- Function has high complexity (cyclomatic > 10)
- Function has other debt issues (duplication, nesting, etc.)
- You’re viewing function-level output (coverage dampens but doesn’t eliminate)
Coverage reduces priority but doesn’t hide issues. Use filters to focus:
# Show only critical and high priority items
debtmap analyze . --lcov coverage.info --min-priority high
# Show top 10 most urgent items
debtmap analyze . --lcov coverage.info --top 10
Coverage File Path Issues
Problem: Can’t find coverage file
Solutions:
# Use absolute path
debtmap analyze . --lcov /absolute/path/to/coverage.info
# Or ensure relative path is from project root
debtmap analyze . --lcov ./target/coverage/lcov.info
LCOV Format Errors
Problem: “Invalid LCOV format” error
Causes:
- Non-LCOV format (Cobertura XML, JaCoCo, etc.)
- Corrupted file
- Wrong file encoding
Solutions:
- Verify your coverage tool is configured for LCOV output
- Check for binary/encoding issues: file coverage.info
- Regenerate coverage with an explicit LCOV format flag
See Troubleshooting for more debugging tips.
Best Practices
Analysis Workflow
- Generate Coverage Before Analysis:
  # Rust example
  cargo tarpaulin --out lcov --output-dir target/coverage
  debtmap analyze . --lcov target/coverage/lcov.info
- Use Coverage for Sprint Planning:
  # Focus on untested complex code
  debtmap analyze . --lcov coverage.info --top 20
- Combine with Tiered Prioritization: Coverage automatically feeds into Tiered Prioritization:
  - Tier 1: Architectural issues (less affected by coverage)
  - Tier 2: Complex untested code (coverage < 50%, complexity > 15)
  - Tier 3: Testing gaps (coverage < 80%, complexity 10-15)
- Validate Refactoring Impact:
  # Before refactoring
  debtmap analyze . --lcov coverage-before.info -o before.json
  # After refactoring
  debtmap analyze . --lcov coverage-after.info -o after.json
  # Compare
  debtmap compare --before before.json --after after.json
Testing Strategy
Prioritize testing based on risk:
- High Complexity + Low Coverage = Highest Priority:
  debtmap analyze . --lcov coverage.info \
    --filter Risk --min-priority high
- Focus on Business Logic: Entry points and infrastructure code have natural coverage patterns. Focus unit tests on business logic functions.
- Use Dependency Analysis:
  debtmap analyze . --lcov coverage.info \
    --context-providers dependency -vv
  Test high-dependency functions first; they have the most impact.
- Don't Over-Test Entry Points: Entry points (main, handlers) are better tested with integration tests, not unit tests. Debtmap applies role multipliers through its semantic classification system (FunctionRole) to adjust scoring for different function types. See src/priority/unified_scorer.rs:149 and src/priority/scoring/classification.rs for the classification system.
Configuration
In .debtmap.toml:
[scoring]
# Default weights for scoring WITHOUT coverage data
# When coverage data IS provided, it acts as a dampening multiplier instead
coverage = 0.50 # Default: 50% (only used when no LCOV provided)
complexity = 0.35 # Default: 35%
dependency = 0.15 # Default: 15%
[thresholds]
# Set minimum risk score to filter low-priority items
minimum_risk_score = 15.0
# Skip simple functions even if uncovered
minimum_cyclomatic_complexity = 5
Important: These weights are from the deprecated additive scoring model. The current implementation (spec 122) calculates a base score from complexity (50%) and dependency (25%) factors, then applies coverage as a dampening multiplier: Final Score = Base Score × (1.0 - coverage_pct). These weights only apply when coverage data is not available. See src/priority/scoring/calculation.rs:68-82 for the coverage dampening calculation and src/priority/scoring/calculation.rs:119-129 for the fallback weighted sum mode.
See Configuration for complete options.
CI Integration
Example GitHub Actions Workflow:
- name: Generate Coverage
run: cargo tarpaulin --out lcov --output-dir target/coverage
- name: Analyze with Debtmap
run: |
debtmap analyze . \
--lcov target/coverage/lcov.info \
--format json \
--output debtmap-report.json
- name: Validate Quality Gates
run: |
debtmap validate . \
--lcov target/coverage/lcov.info \
--max-debt-density 50
Quality Gate Strategy:
- Fail builds on new critical debt (Tier 1 architectural issues)
- Warn on new high-priority untested code (Tier 2)
- Track coverage trends over time with the compare command
Complete Language Examples
Rust End-to-End
# 1. Generate coverage
cargo tarpaulin --out lcov --output-dir target/coverage
# 2. Verify LCOV output
head target/coverage/lcov.info
# 3. Run Debtmap with coverage
debtmap analyze . --lcov target/coverage/lcov.info
# 4. Interpret results (look for [UNTESTED] markers on high-complexity functions)
JavaScript/TypeScript End-to-End
# 1. Configure Jest for LCOV (in package.json or jest.config.js)
# "coverageReporters": ["lcov", "text"]
# 2. Generate coverage
npm test -- --coverage
# 3. Verify LCOV output
head coverage/lcov.info
# 4. Run Debtmap
debtmap analyze . --lcov coverage/lcov.info --languages javascript,typescript
Python End-to-End
# 1. Install pytest-cov
pip install pytest-cov
# 2. Generate LCOV coverage
pytest --cov=src --cov-report=lcov
# 3. Verify output
head coverage.lcov
# 4. Run Debtmap
debtmap analyze . --lcov coverage.lcov --languages python
Go End-to-End
# 1. Install gcov2lcov (one-time setup)
go install github.com/jandelgado/gcov2lcov@latest
# 2. Generate coverage
go test -coverprofile=coverage.out ./...
# 3. Convert to LCOV
gcov2lcov -infile=coverage.out -outfile=coverage.lcov
# 4. Verify LCOV output
head coverage.lcov
# 5. Run Debtmap
debtmap analyze . --lcov coverage.lcov --languages go
FAQ
Why does my 100% covered function still show up?
Coverage dampens debt scores but doesn’t eliminate debt. A function with cyclomatic complexity 25 and 100% coverage still represents technical debt—it’s just lower priority than untested complex code.
Use filters to focus on high-priority items:
debtmap analyze . --lcov coverage.info --top 10
What’s the difference between direct and transitive coverage?
- Direct coverage: Lines executed directly by tests
- Transitive coverage: Coverage considering call graph (functions called by well-tested code)
Transitive coverage reduces urgency for functions only reachable through well-tested paths.
Should I test everything to 100% coverage?
No. Use Debtmap’s risk scores to prioritize:
- Test high-complexity, low-coverage functions first
- Entry points are better tested with integration tests
- Simple utility functions (complexity < 5) may not need dedicated unit tests
Debtmap helps you achieve optimal coverage, not maximal coverage.
How do I debug coverage correlation issues?
Use verbose logging:
debtmap analyze . --lcov coverage.info -vv
This shows:
- Coverage file parsing details
- Function-to-coverage correlation attempts
- Path matching diagnostics
Can I use coverage with validate command?
Yes! The validate command supports --lcov for risk-based quality gates:
debtmap validate . --lcov coverage.info --max-debt-density 50
See CLI Reference for details.
Further Reading
- Scoring Strategies - Deep dive into how coverage affects unified scoring
- Tiered Prioritization - How coverage fits into tiered priority levels
- CLI Reference - Complete coverage option documentation
- Configuration - Customizing coverage scoring weights
- Troubleshooting - More debugging tips for coverage issues
Design Pattern Detection
Debtmap automatically detects common design patterns in your codebase to provide better architectural insights and reduce false positives in complexity analysis. When recognized design patterns are detected, Debtmap applies appropriate complexity adjustments to avoid penalizing idiomatic code.
Overview
Debtmap detects 7 user-facing design patterns across Python, JavaScript, TypeScript, and Rust, plus 2 internal-only patterns (Builder, Visitor) used for scoring adjustments:
| Pattern | Primary Language | Detection Confidence | Type |
|---|---|---|---|
| Observer | Python, Rust | High (0.8-0.9) | User-facing |
| Singleton | Python | High (0.85-0.95) | User-facing |
| Factory | Python | Medium-High (0.7-0.85) | User-facing |
| Strategy | Python | Medium (0.7-0.8) | User-facing |
| Callback | Python, JavaScript | High (0.8-0.9) | User-facing |
| Template Method | Python | Medium (0.7-0.8) | User-facing |
| Dependency Injection | Python | Medium (0.65-0.75) | User-facing |
| Builder | Rust | Internal | Internal Only |
| Visitor | Rust | Internal | Internal Only |
Pattern detection serves multiple purposes:
- Reduces false positives: Avoids flagging idiomatic pattern implementations as overly complex
- Documents architecture: Automatically identifies architectural patterns in your codebase
- Validates consistency: Helps ensure patterns are used correctly and completely
- Guides refactoring: Identifies incomplete pattern implementations
Pattern Detection Details
Observer Pattern
The Observer pattern is detected in Python and Rust by identifying abstract base classes with concrete implementations.
Detection Criteria (Python):
- Abstract base class with
ABC,Protocol, orInterfacemarkers - Abstract methods decorated with
@abstractmethod - Concrete implementations inheriting from the interface
- Methods prefixed with
on_,handle_, ornotify_ - Registration methods like
add_observer,register, orsubscribe - Notification methods like
notify,notify_all,trigger,emit
Detection Criteria (Rust):
- Trait definitions with callback-style methods
- Multiple implementations of the same trait
- Trait registry tracking for cross-module detection
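Example (Rust, illustrative sketch): the shape the detector looks for is a trait with callback-style methods and multiple implementations. This is a hand-written illustration of the criteria above, not output from Debtmap:
// A trait with a callback-style method...
trait EventObserver {
    fn on_event(&self, data: &str);
}

// ...and multiple implementations of that trait.
struct LoggingObserver;
impl EventObserver for LoggingObserver {
    fn on_event(&self, data: &str) {
        println!("Event occurred: {data}");
    }
}

struct MetricsObserver;
impl EventObserver for MetricsObserver {
    fn on_event(&self, data: &str) {
        println!("recording metric for {data}");
    }
}

// Notification over registered observers
fn notify_all(observers: &[Box<dyn EventObserver>], data: &str) {
    for obs in observers {
        obs.on_event(data);
    }
}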
Example (Python):
from abc import ABC, abstractmethod
class EventObserver(ABC):
@abstractmethod
def on_event(self, data):
"""Handle event notification"""
pass
class LoggingObserver(EventObserver):
def on_event(self, data):
print(f"Event occurred: {data}")
class EmailObserver(EventObserver):
def on_event(self, data):
send_email(f"Alert: {data}")
class EventManager:
def __init__(self):
self.observers = []
def add_observer(self, observer: EventObserver):
self.observers.append(observer)
def notify_all(self, data):
for observer in self.observers:
observer.on_event(data)
Confidence: High (0.8-0.9) when abstract base class, implementations, and registration/notification methods are present. Lower confidence (0.5-0.7) for partial implementations.
Singleton Pattern
Singleton pattern detection identifies three common Python implementations: module-level singletons, __new__ override, and decorator-based patterns.
Detection Criteria:
- Module-level variable assignments (e.g.,
instance = MyClass()) - Classes overriding
__new__to enforce single instance - Classes decorated with
@singletonor similar decorators - Presence of instance caching logic
Example (Module-level):
# config.py
class Config:
def __init__(self):
self.settings = {}
def load(self, path):
# Load configuration
pass
# Single instance created at module level
config = Config()
Example (__new__ override):
class DatabaseConnection:
_instance = None
def __new__(cls):
if cls._instance is None:
cls._instance = super().__new__(cls)
return cls._instance
def __init__(self):
if not hasattr(self, 'initialized'):
self.initialized = True
self.connect()
Example (Decorator-based):
def singleton(cls):
instances = {}
def get_instance(*args, **kwargs):
if cls not in instances:
instances[cls] = cls(*args, **kwargs)
return instances[cls]
return get_instance
@singleton
class Logger:
def __init__(self):
self.log_file = open('app.log', 'a')
Confidence: Very High (0.9-0.95) for __new__ override and decorator patterns. High (0.85) for module-level singletons with clear naming.
Factory Pattern
Factory pattern detection identifies factory functions, factory classes, and factory registries based on naming conventions and structural patterns.
Detection Criteria:
- Functions with names containing
create_,make_,build_, or_factory - Factory registry patterns (dictionaries mapping types to constructors)
- Functions that return instances of different types based on parameters
- Classes with factory methods
Example (Factory Function):
def create_logger(log_type: str):
if log_type == "file":
return FileLogger()
elif log_type == "console":
return ConsoleLogger()
elif log_type == "network":
return NetworkLogger()
else:
raise ValueError(f"Unknown logger type: {log_type}")
Example (Registry-based Factory):
# Parser registry
PARSERS = {
'json': JSONParser,
'xml': XMLParser,
'yaml': YAMLParser,
}
def create_parser(format: str):
parser_class = PARSERS.get(format)
if parser_class is None:
raise ValueError(f"No parser for format: {format}")
return parser_class()
Example (Factory Method):
class DocumentFactory:
@staticmethod
def create_document(doc_type: str):
if doc_type == "pdf":
return PDFDocument()
elif doc_type == "word":
return WordDocument()
else:
return PlainTextDocument()
Confidence: Medium-High (0.75-0.85) for functions with factory naming patterns. Lower confidence (0.6-0.7) for registry patterns without factory names.
Strategy Pattern
Strategy pattern detection identifies interfaces with multiple implementations representing interchangeable algorithms.
Detection Criteria:
- Abstract base class or Protocol defining strategy interface
- Multiple concrete implementations
- Strategy interface typically has 1-2 core methods
- Used via composition (strategy object passed to context)
Example:
from abc import ABC, abstractmethod
class CompressionStrategy(ABC):
@abstractmethod
def compress(self, data: bytes) -> bytes:
pass
class ZipCompression(CompressionStrategy):
def compress(self, data: bytes) -> bytes:
return zlib.compress(data)
class GzipCompression(CompressionStrategy):
def compress(self, data: bytes) -> bytes:
return gzip.compress(data)
class LzmaCompression(CompressionStrategy):
def compress(self, data: bytes) -> bytes:
return lzma.compress(data)
class FileCompressor:
def __init__(self, strategy: CompressionStrategy):
self.strategy = strategy
def compress_file(self, path):
data = read_file(path)
return self.strategy.compress(data)
Confidence: Medium (0.7-0.8) based on interface structure and implementation count.
Callback Pattern
Callback pattern detection identifies decorator-based callbacks commonly used in web frameworks and event handlers.
Detection Criteria:
- Decorators with patterns like
@route,@handler,@app.,@on,@callback - Framework-specific decorators (Flask routes, FastAPI endpoints, event handlers)
- Functions registered as callbacks for events or hooks
Example (Flask Routes):
from flask import Flask
app = Flask(__name__)
@app.route('/api/users')
def get_users():
return {"users": []}
@app.route('/api/users/<id>')
def get_user(id):
return {"user": find_user(id)}
Example (Event Handler):
class EventBus:
def __init__(self):
self.handlers = {}
def on(self, event_name):
def decorator(func):
self.handlers[event_name] = func
return func
return decorator
bus = EventBus()
@bus.on('user.created')
def handle_user_created(user):
send_welcome_email(user)
@bus.on('order.placed')
def handle_order_placed(order):
process_payment(order)
Confidence: High (0.8-0.9) for framework decorator patterns. Medium (0.6-0.7) for custom callback implementations.
Template Method Pattern
Template method pattern detection identifies base classes with template methods that call abstract hook methods.
Detection Criteria:
- Base class with concrete methods (template methods)
- Abstract methods intended to be overridden (hook methods)
- Template method calls hook methods in a defined sequence
- Subclasses override hook methods but not template method
Example:
from abc import ABC, abstractmethod
class DataProcessor(ABC):
def process(self, data):
"""Template method defining the algorithm skeleton"""
raw = self.load_data(data)
validated = self.validate(raw)
transformed = self.transform(validated)
self.save(transformed)
@abstractmethod
def load_data(self, source):
"""Hook: Load data from source"""
pass
@abstractmethod
def validate(self, data):
"""Hook: Validate data"""
pass
def transform(self, data):
"""Hook: Transform data (optional override)"""
return data
@abstractmethod
def save(self, data):
"""Hook: Save processed data"""
pass
class CSVProcessor(DataProcessor):
def load_data(self, source):
return read_csv(source)
def validate(self, data):
return [row for row in data if row]
def save(self, data):
write_csv('output.csv', data)
Confidence: Medium (0.7-0.8) based on combination of abstract and concrete methods in base class.
Dependency Injection Pattern
Dependency injection pattern detection identifies classes that receive dependencies through constructors or setters rather than creating them internally.
Detection Criteria:
- Constructor parameters accepting interface/protocol types
- Setter methods for injecting dependencies
- Optional dependencies with default values
- Absence of hard-coded object instantiation inside the class
Example (Constructor Injection):
class UserService:
def __init__(self,
user_repository: UserRepository,
email_service: EmailService,
logger: Logger):
self.user_repo = user_repository
self.email_service = email_service
self.logger = logger
def create_user(self, username, email):
user = self.user_repo.create(username, email)
self.email_service.send_welcome(email)
self.logger.info(f"Created user: {username}")
return user
Example (Setter Injection):
class ReportGenerator:
def __init__(self):
self.data_source = None
self.formatter = None
def set_data_source(self, source):
self.data_source = source
def set_formatter(self, formatter):
self.formatter = formatter
def generate(self):
data = self.data_source.fetch()
return self.formatter.format(data)
Confidence: Medium (0.65-0.75) based on constructor signatures and absence of direct instantiation.
Internal Pattern Detection
Debtmap also detects certain patterns internally for analysis purposes, but these are not exposed as user-facing design pattern detection features. These internal patterns help improve the accuracy of other analyses like god object detection and complexity calculations.
Builder Pattern (Internal Use Only)
The Builder pattern is detected internally during god object detection to avoid false positives. Classes that follow the builder pattern are given adjusted scores in god object analysis since builder classes naturally have many methods and fields.
Note: Builder pattern detection is not available via the --patterns CLI flag. It’s used only internally for scoring adjustments.
Internal Detection Criteria:
- Struct with builder suffix or builder-related naming
- Methods returning
Selffor chaining - Final
build()method returning the constructed type - Type-state pattern usage (optional)
Example (Internal Detection):
pub struct HttpClientBuilder {
    base_url: Option<String>,
    timeout: Duration,
    headers: HashMap<String, String>,
}

impl HttpClientBuilder {
    pub fn new() -> Self { /* ... */ }

    // Chaining methods detected internally
    pub fn base_url(mut self, url: impl Into<String>) -> Self { /* ... */ }
    pub fn timeout(mut self, timeout: Duration) -> Self { /* ... */ }
    pub fn header(mut self, key: String, value: String) -> Self { /* ... */ }

    pub fn build(self) -> Result<HttpClient> { /* ... */ }
}
Why Internal Only: Builder patterns are a legitimate design choice for complex object construction. Debtmap detects them to prevent flagging builder classes as god objects, but doesn’t report them as design patterns since they don’t require complexity adjustments like other patterns.
Source: src/organization/builder_pattern.rs - Used for god object detection score adjustment
Visitor Pattern (Internal Use Only)
The Visitor pattern is detected internally for complexity analysis normalization. When exhaustive pattern matching is detected (typical of visitor patterns), Debtmap applies logarithmic complexity scaling instead of linear scaling to avoid penalizing idiomatic exhaustive match expressions.
Note: Visitor pattern detection is not available via the --patterns CLI flag. It’s used only internally for complexity scaling adjustments.
Internal Detection Criteria:
- Trait with visit methods for different types
- Implementations providing behavior for each visited type
- Exhaustive pattern matching across enum variants
- Used primarily for AST traversal or data structure processing
Example (Internal Detection):
#![allow(unused)]
fn main() {
trait Visitor {
fn visit_function(&mut self, func: &Function);
fn visit_class(&mut self, class: &Class);
fn visit_module(&mut self, module: &Module);
}
impl Visitor for ComplexityVisitor {
fn visit_function(&mut self, func: &Function) {
// Exhaustive matching detected for complexity scaling
match &func.body {
FunctionBody::Simple => { /* ... */ }
FunctionBody::Complex(statements) => { /* ... */ }
}
}
}
}
Why Internal Only: Visitor patterns often involve exhaustive pattern matching which can appear complex by traditional metrics. Debtmap detects these patterns to apply logarithmic scaling (log2(match_arms) * avg_complexity) instead of linear, preventing false positives in complexity analysis. This is a complexity adjustment mechanism, not a user-visible pattern detection feature.
Source: src/complexity/visitor_detector.rs - Used for complexity analysis, not pattern reporting
Configuration
Current Implementation Status
Pattern detection is currently internal-only and used for analysis adjustments. The CLI flags for pattern detection exist in the codebase but are not yet fully integrated into the analysis pipeline.
Status Summary:
- ✅ Pattern detection logic implemented (7 user-facing patterns)
- ✅ CLI flags defined (--no-pattern-detection, --patterns, --pattern-threshold, --show-pattern-warnings)
- ⚠️ CLI flags not yet wired to analysis pipeline
- ⚠️ Pattern detection results not currently exposed in output formats
Pattern detection is primarily used internally for:
- Adjusting complexity scores to avoid false positives
- Informing god object detection (Builder pattern)
- Normalizing exhaustive pattern matching complexity (Visitor pattern)
CLI Options (Defined but Not Yet Active)
The following CLI flags are defined in the codebase (src/cli.rs:228-241) but are not yet fully integrated:
# Disable all pattern detection (planned)
debtmap analyze --no-pattern-detection
# Enable only specific patterns (planned)
debtmap analyze --patterns observer,singleton,factory,strategy,callback,template_method,dependency_injection
# Set confidence threshold (planned)
debtmap analyze --pattern-threshold 0.8
# Show warnings for uncertain pattern detections (planned)
debtmap analyze --show-pattern-warnings
Planned Patterns for --patterns Flag (when integration is complete):
- observer - Observer pattern detection
- singleton - Singleton pattern detection
- factory - Factory pattern detection
- strategy - Strategy pattern detection
- callback - Callback pattern detection
- template_method - Template method pattern detection
- dependency_injection - Dependency injection detection
Note: Builder and Visitor patterns are detected internally but will not be available via the --patterns flag. See Internal Pattern Detection for details.
Roadmap: Pattern Detection Output
When fully integrated, pattern detection results will appear in debtmap’s output formats:
Planned Terminal Format:
Design Patterns Detected:
Observer Pattern (confidence: 0.88)
Interface: EventListener (event_system.py:4)
Implementations: AuditLogger, SessionManager
Planned JSON Format:
{
"pattern_instances": [
{
"pattern_type": "Observer",
"confidence": 0.88,
"location": "event_system.py:4",
"implementations": ["AuditLogger", "SessionManager"]
}
]
}
Current Workaround: Pattern detection is used internally during analysis to improve accuracy. To see the effects of pattern detection:
- Run analysis with and without --no-pattern-detection (when implemented)
- Compare complexity scores and god object detection results
- Patterns are being detected, but not explicitly reported in output
Confidence Scoring
Pattern detection uses a confidence scoring system (0.0-1.0) to indicate match quality:
- 0.9-1.0: Very High - Strong structural match with all key elements present
- 0.8-0.9: High - Clear pattern with most elements present
- 0.7-0.8: Medium-High - Pattern present with some uncertainty
- 0.6-0.7: Medium - Possible pattern with limited evidence
- 0.5-0.6: Low - Weak match, may be false positive
Default Threshold: 0.7 - Only patterns with 70% or higher confidence are reported by default.
Adjusting Thresholds:
# More strict (fewer patterns, higher confidence)
debtmap analyze --pattern-threshold 0.85
# More lenient (more patterns, lower confidence)
debtmap analyze --pattern-threshold 0.6 --show-pattern-warnings
How Confidence is Calculated:
Each pattern detector calculates confidence holistically based on multiple factors:
- Structural completeness: Are all expected elements present?
- Naming conventions: Do names match expected patterns?
- Implementation count: Are there enough implementations to confirm the pattern?
- Cross-validation: Do different detection heuristics agree?
For example, Observer pattern confidence is calculated holistically based on:
- Presence of an abstract base class with appropriate markers (ABC, Protocol, etc.)
- Number of concrete implementations found
- Detection of registration methods (add_observer, register, subscribe)
- Detection of notification methods (notify, notify_all, trigger, emit)
- Naming conventions matching observer patterns
Higher confidence requires more structural elements to be present. The calculation is not a simple sum of individual weights but rather a holistic assessment of pattern completeness.
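As an illustration of what such a holistic assessment might look like, here is a hypothetical sketch; the struct, gating rule, base value, and increments are assumptions for exposition, not Debtmap's actual algorithm:

// Hypothetical sketch of holistic Observer-pattern confidence.
// All names, weights, and thresholds are illustrative assumptions.
struct ObserverEvidence {
    has_abstract_base: bool,     // ABC/Protocol/trait marker present
    implementation_count: usize, // concrete implementations found
    has_registration: bool,      // add_observer / register / subscribe
    has_notification: bool,      // notify / notify_all / trigger / emit
    naming_matches: bool,        // observer-style naming conventions
}

fn observer_confidence(e: &ObserverEvidence) -> f64 {
    // Holistic gating: without the structural core, no confidence accrues
    if !e.has_abstract_base || e.implementation_count == 0 {
        return 0.0;
    }
    let mut confidence: f64 = 0.6; // minimal structural match
    // Corroborating signals raise confidence toward (but never past) 1.0
    if e.implementation_count >= 2 { confidence += 0.10; }
    if e.has_registration { confidence += 0.10; }
    if e.has_notification { confidence += 0.10; }
    if e.naming_matches { confidence += 0.05; }
    confidence.min(1.0)
}

fn main() {
    let evidence = ObserverEvidence {
        has_abstract_base: true,
        implementation_count: 2,
        has_registration: true,
        has_notification: true,
        naming_matches: false,
    };
    println!("confidence: {:.2}", observer_confidence(&evidence)); // 0.90
}

The gating on required elements loosely captures the "holistic" requirement: confidence is not a free-floating sum, since nothing accrues without the core structure in place.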
Cross-File Pattern Detection
Debtmap can detect patterns that span multiple files, particularly for the Observer pattern where interfaces and implementations may be in separate modules.
How Cross-File Detection Works:
- Import Tracking: Debtmap tracks imports to understand module dependencies
- Interface Registry: Abstract base classes are registered globally (see the sketch after this list)
- Implementation Matching: Implementations in other files are matched to registered interfaces
- Cross-Module Context: A shared context links related files
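Conceptually (a hypothetical sketch, not Debtmap's actual types), the interface registry can be modeled as a map from interface names to their definition sites and implementations:

use std::collections::HashMap;

// Hypothetical sketch of a cross-module interface registry.
// Field and type names are illustrative, not Debtmap's actual types.
#[derive(Default)]
struct InterfaceRegistry {
    // interface name -> file where it is defined
    interfaces: HashMap<String, String>,
    // interface name -> implementing class names (possibly in other files)
    implementations: HashMap<String, Vec<String>>,
}

impl InterfaceRegistry {
    fn register_interface(&mut self, name: &str, file: &str) {
        self.interfaces.insert(name.to_string(), file.to_string());
    }

    fn register_implementation(&mut self, interface: &str, class: &str) {
        self.implementations
            .entry(interface.to_string())
            .or_default()
            .push(class.to_string());
    }
}

fn main() {
    let mut registry = InterfaceRegistry::default();
    registry.register_interface("EventObserver", "interfaces/observer.py");
    registry.register_implementation("EventObserver", "LoggingObserver");
    registry.register_implementation("EventObserver", "EmailObserver");
    // One Observer pattern, two cross-file implementations
    println!("{:?}", registry.implementations.get("EventObserver"));
}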
Example:
# interfaces/observer.py
from abc import ABC, abstractmethod

class EventObserver(ABC):
    @abstractmethod
    def on_event(self, data):
        pass

# observers/logging_observer.py
from interfaces.observer import EventObserver

class LoggingObserver(EventObserver):
    def on_event(self, data):
        log(data)

# observers/email_observer.py
from interfaces.observer import EventObserver

class EmailObserver(EventObserver):
    def on_event(self, data):
        send_email(data)
Debtmap detects this as a single Observer pattern with cross-file implementations.
Limitations:
- Only works for explicitly imported interfaces
- Requires static import analysis (dynamic imports may not be tracked)
- Most effective within a single project (not across external dependencies)
Rust-Specific Pattern Detection
Trait-Based Patterns
Rust pattern detection leverages the trait system for identifying patterns:
Trait Registry: Tracks trait definitions and implementations across modules
#![allow(unused)]
fn main() {
// Trait registered for pattern detection
pub trait EventHandler {
fn handle(&self, event: &Event);
}
// Multiple implementations tracked
impl EventHandler for LogHandler { /* ... */ }
impl EventHandler for MetricsHandler { /* ... */ }
impl EventHandler for AlertHandler { /* ... */ }
}
Observer Pattern via Traits:
#![allow(unused)]
fn main() {
pub trait Observable {
fn subscribe(&mut self, observer: Box<dyn Observer>);
fn notify(&self, event: &Event);
}
pub trait Observer {
fn on_event(&self, event: &Event);
}
}
Differences from Python Detection:
- Traits are more explicit than Python’s ABC
- Type system ensures implementation correctness
- No runtime reflection needed for detection
- Pattern matching exhaustiveness helps identify Visitor pattern
Integration with Complexity Analysis
Debtmap has two separate but complementary systems for patterns:
1. Design Pattern Detection (This Feature)
The 7 user-facing design patterns documented in this chapter (Observer, Singleton, Factory, Strategy, Callback, Template Method, Dependency Injection) are detected and reported to users. These patterns appear in the output to document architectural choices but do not directly adjust complexity scores.
Purpose: Architectural documentation and pattern identification
Output: Pattern instances with confidence scores in terminal, JSON, and markdown formats
2. Complexity Pattern Adjustments (Internal System)
Debtmap has a separate internal system in src/complexity/python_pattern_adjustments.rs that detects specific complexity patterns and applies multipliers. These are different patterns from the user-facing design patterns:
Internal complexity patterns include:
- Dictionary Dispatch (0.5x multiplier)
- Strategy Pattern detection via conditionals (0.6x multiplier)
- Comprehension patterns (0.8x multiplier)
- Other Python-specific complexity patterns
Purpose: Adjust complexity scores to avoid penalizing idiomatic code
Output: Applied automatically during complexity calculation, not reported separately
Relationship Between the Systems
Currently, these are independent systems:
- Design pattern detection focuses on architectural patterns
- Complexity adjustments focus on implementation patterns
The design pattern detection results are primarily for documentation and architectural insights. The complexity scoring uses its own pattern recognition to apply appropriate adjustments.
Visitor Pattern Special Case
The Visitor pattern (internal-only) is used for complexity analysis. When exhaustive pattern matching is detected, debtmap applies logarithmic scaling:
visitor_complexity = log2(match_arms) * average_arm_complexity
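For instance (an illustrative calculation), an exhaustive match with 16 arms averaging complexity 2 per arm scores log2(16) × 2 = 8 under this scaling, rather than roughly 32 under linear accumulation.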
This prevents exhaustive pattern matching from being flagged as overly complex. See Visitor Pattern (Internal Use Only) for more details.
See Also:
- Complexity Analysis - How complexity is calculated
- Scoring Strategies - Complexity adjustments and multipliers
Practical Examples
Example 1: Observer Pattern Code Structure
Pattern detection identifies Observer implementations even though results are not yet shown in output:
# event_system.py
from abc import ABC, abstractmethod

class EventListener(ABC):
    @abstractmethod
    def on_user_login(self, user):
        pass

class AuditLogger(EventListener):
    def on_user_login(self, user):
        audit_log.write(f"User {user.id} logged in")

class SessionManager(EventListener):
    def on_user_login(self, user):
        create_session(user)

class EventDispatcher:
    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def notify_login(self, user):
        for listener in self.listeners:
            listener.on_user_login(user)
Current Behavior: Debtmap internally detects this Observer pattern (confidence ~0.88) and uses it to adjust complexity scoring. The pattern structure is recognized but not reported in output.
Source: Pattern detection logic in src/analysis/patterns/observer.rs
Example 2: Factory Pattern Detection Criteria
Debtmap detects factory patterns based on naming and structure:
def create_logger(log_type: str):
    """Factory function - detected by 'create_' prefix"""
    if log_type == "file":
        return FileLogger()
    elif log_type == "console":
        return ConsoleLogger()
    else:
        return NetworkLogger()
Current Behavior: The factory pattern detector (in src/analysis/patterns/factory.rs) identifies this as a Factory pattern with medium-high confidence (~0.75-0.85) based on:
- Function name contains create_
- Returns different types based on parameter
- Multiple instantiation paths
This information is used internally to adjust complexity scores for factory functions.
Example 3: Impact on Complexity Analysis
While pattern detection results aren’t directly shown, their effect can be observed:
# Run standard analysis
debtmap analyze myapp/
# When --no-pattern-detection is fully integrated, compare results
# (currently this flag exists but isn't fully wired)
Expected Differences:
- Factory functions: Lower complexity scores with pattern detection
- Observer implementations: Adjusted scores for callback registration
- Template methods: Reduced penalty for abstract method patterns
- Builder classes: Not flagged as god objects despite many methods
Use Cases
1. False Positive Reduction (Active)
Problem: Complex factory functions flagged as too complex
Solution: Pattern detection automatically adjusts complexity scores for recognized factory patterns
Current Behavior:
debtmap analyze myapp/
Factory functions are automatically detected and receive adjusted complexity scores. This happens internally without requiring specific flags.
Source: src/analysis/patterns/factory.rs applies multipliers to factory function complexity
2. Builder Pattern God Object Prevention (Active)
Problem: Builder classes flagged as god objects due to many chaining methods
Solution: Builder pattern detection automatically excludes builder classes from god object analysis
Current Behavior:
#![allow(unused)]
fn main() {
// This builder class is automatically recognized
pub struct HttpClientBuilder {
    // Many fields
}

impl HttpClientBuilder {
    // Many chaining methods
    pub fn base_url(mut self, url: String) -> Self { /* ... */ }
    pub fn timeout(mut self, duration: Duration) -> Self { /* ... */ }
    pub fn build(self) -> HttpClient { /* ... */ }
}
}
Debtmap detects the builder pattern (chaining methods returning Self, final build() method) and adjusts scoring accordingly.
Source: src/organization/builder_pattern.rs for god object detection adjustment
3. Future Use Case: Architecture Documentation (Planned)
Problem: Undocumented design patterns in legacy codebase
Solution: When pattern output is integrated, users will be able to generate architectural pattern reports
Planned Command:
debtmap analyze --show-pattern-warnings --output-format json > architecture-report.json
Current Workaround: Review complexity adjustments and god object scoring to infer where patterns are detected
4. Future Use Case: Pattern Consistency Validation (Planned)
Problem: Inconsistent Observer implementations across the codebase
Solution: When integrated, users can filter analysis to specific pattern types
Planned Command:
debtmap analyze --patterns observer --output-format json > observers.json
Current Status: Pattern detection logic exists and works internally, but results aren’t yet exposed in output
Troubleshooting
Pattern Detection Not Visible in Output
Symptoms: Cannot see detected patterns in analysis output
Explanation: Pattern detection is currently internal-only. Patterns are detected and used to adjust complexity scoring, but results are not exposed in terminal, JSON, or markdown output.
Current Behavior:
- Patterns ARE being detected (see src/analysis/patterns/mod.rs)
- Detection results affect complexity scores and god object analysis
- Pattern information is not included in output formatting
Solution: To benefit from pattern detection:
- Run standard analysis - patterns are automatically detected
- Check if complexity scores seem adjusted for factory/callback patterns
- Verify builder classes aren’t flagged as god objects
Future Integration: CLI flags and output formatting will be connected when pattern output integration is complete.
CLI Flags Have No Effect
Symptoms: Using --patterns, --pattern-threshold, or --show-pattern-warnings doesn’t change results
Explanation: These CLI flags are defined in src/cli.rs:228-241 but are not yet fully wired to the analysis pipeline.
Current Status:
- ✅ CLI argument parsing works
- ⚠️ Values not passed to pattern detector
- ⚠️ Pattern detection runs with default settings regardless of flags
Workaround: Pattern detection runs automatically with default settings (threshold 0.7, all 7 patterns enabled).
Builder or Visitor Pattern Not Available via CLI
Symptoms: Cannot specify --patterns builder or --patterns visitor
Explanation: Builder and Visitor patterns are intentionally internal-only and will not be available as user-facing pattern detection features:
- Builder: Used during god object detection to adjust scores for builder classes (see src/organization/builder_pattern.rs)
- Visitor: Used for complexity analysis to apply logarithmic scaling to exhaustive match expressions (see src/complexity/visitor_detector.rs)
Solution: These patterns are automatically detected when needed for internal analyses. They won’t appear in the --patterns flag even when that feature is fully integrated.
Available user-facing patterns: observer, singleton, factory, strategy, callback, template_method, dependency_injection
False Positive Complexity Adjustments
Symptoms: Function with create_ prefix gets lower complexity score but isn’t actually a factory
Possible Causes:
- Naming collision (e.g., create_session() that doesn't create objects)
- Overly broad pattern matching heuristics
Current Workaround: Pattern detection cannot currently be disabled or tuned per-file. The --no-pattern-detection flag exists but isn’t yet wired to the detector.
Future Solution: When CLI integration is complete:
debtmap analyze --no-pattern-detection # Disable all pattern detection
debtmap analyze --pattern-threshold 0.9 # Require very high confidence
Best Practices
Current Recommendations
- Trust automatic detection: Pattern detection runs automatically with sensible defaults (threshold 0.7, all 7 patterns enabled)
- Review complexity scores: Lower-than-expected complexity for factory/callback functions indicates pattern detection is working
- Check builder classes: If builder classes aren’t flagged as god objects, builder pattern detection is working correctly
- Follow pattern idioms: Use standard naming conventions (create_, make_, @abstractmethod, etc.) to ensure patterns are recognized
- Structure code clearly: Well-structured patterns (clear base classes, explicit implementations) have higher confidence scores
When CLI Integration is Complete
Future best practices when CLI flags are fully wired:
- Start with defaults: The default 0.7 threshold will work well for most projects
- Use --show-pattern-warnings during initial analysis to see borderline detections
- Tune thresholds per-project: Adjust --pattern-threshold based on your codebase's idioms
- Disable selectively: Use --no-pattern-detection to compare scores with and without adjustments
- Review pattern reports: Examine detected patterns to understand architectural decisions
Summary
Current State
Debtmap’s design pattern detection exists and works internally with the following characteristics:
Implemented Features:
- ✅ 7 user-facing patterns: Observer, Singleton, Factory, Strategy, Callback, Template Method, Dependency Injection
- ✅ 2 internal patterns: Builder (for god object detection), Visitor (for complexity normalization)
- ✅ Pattern detection logic: Fully implemented in src/analysis/patterns/
- ✅ Confidence scoring: 0.0-1.0 scale with holistic assessment
- ✅ Cross-file detection: Tracks imports and interfaces across modules
- ✅ Rust trait support: Leverages trait system for pattern detection
- ✅ Complexity integration: Automatically adjusts scores to reduce false positives
Partially Implemented:
- ⚠️ CLI flags: Defined in src/cli.rs but not wired to pattern detector
- ⚠️ Output formatting: PatternInstance type exists but not exposed in output
Not Yet Implemented:
- ❌ Pattern output in terminal/JSON/markdown: Detection results not shown to users
- ❌ User configuration: Cannot currently control pattern detection via CLI or config file
- ❌ Pattern-specific reports: Cannot filter or focus on specific pattern types
Impact
Pattern detection significantly improves analysis accuracy even without visible output:
- Reduces false positives: Factory functions, callbacks, and template methods get appropriate complexity scores
- Prevents god object misclassification: Builder classes recognized and excluded from god object detection
- Normalizes exhaustive matching: Visitor pattern detection applies logarithmic scaling to pattern matching
- Supports multiple languages: Works across Python, JavaScript, TypeScript, and Rust
Future Integration
When CLI and output integration is complete, users will be able to:
- View detected patterns in analysis output
- Control pattern detection via the --patterns, --pattern-threshold, and --show-pattern-warnings flags
- Generate architectural documentation from pattern detection results
- Validate pattern consistency across codebases
The foundation is solid - pattern detection works correctly and provides value. The remaining work is connecting the detection logic to user-facing configuration and output.
Entropy Analysis
Entropy analysis is Debtmap’s unique approach to distinguishing genuinely complex code from repetitive pattern-based code. This reduces false positives by 60-75% compared to traditional cyclomatic complexity metrics.
Overview
Traditional static analysis tools flag code as “complex” based purely on cyclomatic complexity or lines of code. However, not all complexity is equal:
- Repetitive patterns (validation functions, dispatchers) have high cyclomatic complexity but low cognitive load
- Diverse logic (state machines, business rules) may have moderate cyclomatic complexity but high cognitive load
Entropy analysis uses information theory to distinguish between these cases.
How It Works
Debtmap’s entropy analysis is language-agnostic, working across Rust, Python, JavaScript, and TypeScript codebases using a universal token classification approach. This ensures consistent complexity assessment regardless of the programming language used.
Language-Agnostic Analysis
The same entropy concepts apply consistently across all supported languages. Here’s how a validation function would be analyzed in different languages:
Rust:
#![allow(unused)]
fn main() {
fn validate_config(config: &Config) -> Result<()> {
if config.output_dir.is_none() { return Err(anyhow!("output_dir required")); }
if config.max_workers.is_none() { return Err(anyhow!("max_workers required")); }
if config.timeout_secs.is_none() { return Err(anyhow!("timeout_secs required")); }
Ok(())
}
// Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
}
Python:
def validate_config(config: Config) -> None:
    if config.output_dir is None: raise ValueError("output_dir required")
    if config.max_workers is None: raise ValueError("max_workers required")
    if config.timeout_secs is None: raise ValueError("timeout_secs required")

# Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
JavaScript/TypeScript:
function validateConfig(config: Config): void {
if (!config.outputDir) throw new Error("outputDir required");
if (!config.maxWorkers) throw new Error("maxWorkers required");
if (!config.timeoutSecs) throw new Error("timeoutSecs required");
}
// Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
All three receive similar entropy scores because they share the same repetitive validation pattern, demonstrating how Debtmap’s analysis transcends language syntax to identify underlying code structure patterns.
Shannon Entropy
Shannon entropy measures the variety and unpredictability of code patterns:
H(X) = -Σ p(x) × log₂(p(x))
Where:
- p(x) = probability of each token type
- High entropy (0.8-1.0) = many different patterns
- Low entropy (0.0-0.3) = repetitive patterns
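For instance (an illustrative calculation), a function whose tokens split evenly across two kinds has H = -(0.5 × log₂ 0.5 + 0.5 × log₂ 0.5) = 1.0 bit, while one dominated 90/10 by a single kind has H ≈ 0.47 bits; normalizing by the maximum possible entropy maps such values onto the 0.0-1.0 scale used here.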
Token Classification
Debtmap can classify tokens by importance to give more weight to semantically significant tokens in entropy calculations. This is controlled by the use_classification configuration option.
When enabled (use_classification = true; it defaults to false for backward compatibility), tokens are weighted by importance:
High importance (weight: 1.0):
- Control flow keywords (if, match, for, while)
- Error handling (try, catch, ?, unwrap)
- Async keywords (async, await)
Medium importance (weight: 0.7):
- Function calls
- Method invocations
- Operators
Low importance (weight: 0.3):
- Identifiers (variable names)
- Literals (strings, numbers)
- Punctuation
When disabled (use_classification = false), all tokens are treated equally, which may be useful for debugging or when you want unweighted entropy scores.
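As a concrete sketch of how weighted token entropy could be computed - the token kinds, example weights, and normalization below are illustrative assumptions, not Debtmap's code:

use std::collections::HashMap;

// Minimal sketch: weighted Shannon entropy over token kinds.
// Token kinds and weights mirror the classes above; the function itself
// is an illustration, not Debtmap's implementation.
fn weighted_entropy(tokens: &[(&str, f64)]) -> f64 {
    let mut mass: HashMap<&str, f64> = HashMap::new();
    let mut total = 0.0;
    for &(kind, weight) in tokens {
        *mass.entry(kind).or_insert(0.0) += weight; // weighted count per kind
        total += weight;
    }
    // H(X) = -Σ p(x) log2 p(x), normalized to 0.0-1.0 by the max possible entropy
    let h: f64 = mass
        .values()
        .map(|m| {
            let p = *m / total;
            -p * p.log2()
        })
        .sum();
    let max_h = (mass.len() as f64).log2();
    if max_h > 0.0 { h / max_h } else { 0.0 }
}

fn main() {
    // Repetitive validation: few distinct token kinds dominate
    let repetitive = [("if", 1.0), ("ident", 0.3), ("if", 1.0), ("ident", 0.3)];
    // Diverse logic: more distinct kinds, more evenly spread
    let diverse = [("match", 1.0), ("call", 0.7), ("ident", 0.3), ("await", 1.0)];
    println!("repetitive: {:.2}", weighted_entropy(&repetitive)); // lower score
    println!("diverse:    {:.2}", weighted_entropy(&diverse));    // higher score
}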
Pattern Repetition Detection
Detects repetitive structures in the AST:
#![allow(unused)]
fn main() {
// High pattern repetition (0.9) - all branches nearly identical
if a.is_none() { return Err(...) }
if b.is_none() { return Err(...) }
if c.is_none() { return Err(...) }
// Low pattern repetition (0.2) - diverse branches
match state {
Active => transition_to_standby(),
Standby => transition_to_active(),
Maintenance => schedule_restart(),
}
}
Branch Similarity Analysis
Analyzes similarity between conditional branches:
#![allow(unused)]
fn main() {
// High branch similarity (0.9) - branches are nearly identical
if condition_a {
log("A happened");
process_a();
}
if condition_b {
log("B happened");
process_b();
}
// Low branch similarity (0.2) - branches are very different
if needs_auth {
authenticate_user()?;
load_profile()?;
} else {
show_guest_ui();
}
}
Effective Complexity Adjustment
Debtmap uses a multi-factor dampening approach that analyzes three dimensions of code repetitiveness:
- Pattern Repetition - Detects repetitive AST structures
- Token Entropy - Measures variety in token usage
- Branch Similarity - Compares similarity between conditional branches
These factors are combined multiplicatively with a minimum floor of 0.7 (preserving at least 70% of original complexity):
dampening_factor = (repetition_factor × entropy_factor × branch_factor).max(0.7)
effective_complexity = raw_complexity × dampening_factor
Historical Note: Spec 68
Spec 68: Graduated Entropy Dampening was the original simple algorithm that only considered entropy < 0.2:
dampening_factor = 0.5 + 0.5 × (entropy / 0.2) [when entropy < 0.2]
The current implementation uses a more sophisticated graduated dampening approach that considers all three factors (repetition, entropy, branch similarity) with separate thresholds and ranges for each. The test suite references Spec 68 to verify backward compatibility with the original behavior.
When Dampening Applies
Dampening is applied based on multiple thresholds:
- Pattern Repetition: Values approaching 1.0 trigger dampening (high repetition detected)
- Token Entropy: Values below 0.4 trigger graduated dampening (low variety)
- Branch Similarity: Values above 0.8 trigger dampening (similar branches)
Graduated Dampening Formula
Each factor is dampened individually using a graduated calculation:
#![allow(unused)]
fn main() {
// Conceptual pseudocode showing the three-factor approach
// Actual implementation in src/complexity/entropy.rs:185-195 and :429-439
fn calculate_dampening_factor(
repetition: f64, // 0.0-1.0
entropy: f64, // 0.0-1.0
branch_similarity: f64 // 0.0-1.0
) -> f64 {
// Each factor uses calculate_graduated_dampening with its own threshold/range
    let repetition_factor = graduated_dampening(repetition, /* threshold */ 1.0, /* max_reduction */ 0.20);
    let entropy_factor = graduated_dampening(entropy, /* threshold */ 0.4, /* max_reduction */ 0.15);
    let branch_factor = graduated_dampening(branch_similarity, /* threshold */ 0.8, /* max_reduction */ 0.25);
(repetition_factor * entropy_factor * branch_factor).max(0.7) // Never reduce below 70%
}
}
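The graduated_dampening helper is not shown in the excerpt above. One plausible reading, sketched below under assumptions, is a linear ramp: no reduction while the signal is far from its threshold, scaling up to that factor's maximum reduction as the signal reaches it. The ramp width is an assumption, and the entropy factor (which dampens low values below 0.4) would need a mirrored ramp or an inverted input; this is not the actual implementation:

// Hypothetical sketch of graduated dampening (assumed linear ramp).
// Not Debtmap's actual code: ramp width and directionality are assumptions.
fn graduated_dampening(value: f64, threshold: f64, max_reduction: f64) -> f64 {
    let ramp = 0.2; // assumed width of the transition zone below the threshold
    let start = threshold - ramp;
    if value <= start {
        1.0 // signal far from threshold: no reduction
    } else {
        // Linear interpolation from 1.0 down to (1.0 - max_reduction)
        let t = ((value - start) / ramp).min(1.0);
        1.0 - max_reduction * t
    }
}

fn main() {
    // Repetition 0.95 vs threshold 1.0 with max 20% reduction:
    // t = (0.95 - 0.80) / 0.2 = 0.75, factor = 1.0 - 0.20 * 0.75 = 0.85,
    // matching the repetition_factor ≈ 0.85 in the worked example below.
    println!("{:.2}", graduated_dampening(0.95, 1.0, 0.20)); // 0.85
}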
Key Parameters:
- Repetition: Threshold 1.0, max 20% reduction (configurable via max_repetition_reduction)
- Entropy: Threshold 0.4 (hardcoded), max 15% reduction (configurable via max_entropy_reduction)
- Branch Similarity: Threshold 0.8 (configurable via branch_threshold), max 25% reduction (configurable via max_branch_reduction)
- Combined Floor: Minimum 70% of original complexity preserved (configurable via max_combined_reduction)
Example: Repetitive Validation Function
Raw Complexity: 20
Pattern Repetition: 0.95 (very high)
Token Entropy: 0.3 (low variety)
Branch Similarity: 0.9 (very similar branches)
repetition_factor ≈ 0.85 (15% reduction)
entropy_factor ≈ 0.90 (10% reduction)
branch_factor ≈ 0.80 (20% reduction)
dampening_factor = (0.85 × 0.90 × 0.80) = 0.612
dampening_factor = max(0.612, 0.7) = 0.7 // Floor applied
Effective Complexity = 20 × 0.7 = 14
Result: 30% reduction (maximum allowed)
Example: Diverse State Machine
Raw Complexity: 20
Pattern Repetition: 0.2 (low - not repetitive)
Token Entropy: 0.8 (high variety)
Branch Similarity: 0.3 (diverse branches)
repetition_factor ≈ 1.0 (no reduction)
entropy_factor ≈ 1.0 (no reduction)
branch_factor ≈ 1.0 (no reduction)
dampening_factor = (1.0 × 1.0 × 1.0) = 1.0
Effective Complexity = 20 × 1.0 = 20
Result: 0% reduction (complexity preserved)
Real-World Examples
Example 1: Validation Function
#![allow(unused)]
fn main() {
fn validate_config(config: &Config) -> Result<()> {
if config.output_dir.is_none() {
return Err(anyhow!("output_dir required"));
}
if config.max_workers.is_none() {
return Err(anyhow!("max_workers required"));
}
if config.timeout_secs.is_none() {
return Err(anyhow!("timeout_secs required"));
}
// ... 17 more similar checks
Ok(())
}
}
Traditional analysis:
- Cyclomatic Complexity: 20
- Assessment: CRITICAL
Entropy analysis:
- Shannon Entropy: 0.3 (low variety)
- Pattern Repetition: 0.9 (highly repetitive)
- Branch Similarity: 0.95 (nearly identical)
- Effective Complexity: 5
- Assessment: LOW PRIORITY
Example 2: State Machine Logic
#![allow(unused)]
fn main() {
fn reconcile_state(current: &State, desired: &State) -> Vec<Action> {
let mut actions = vec![];
match (current.mode, desired.mode) {
(Mode::Active, Mode::Standby) => {
if current.has_active_connections() {
actions.push(Action::DrainConnections);
actions.push(Action::WaitForDrain);
}
actions.push(Action::TransitionToStandby);
}
(Mode::Standby, Mode::Active) => {
if desired.requires_warmup() {
actions.push(Action::Warmup);
}
actions.push(Action::TransitionToActive);
}
// ... more diverse state transitions
_ => {}
}
actions
}
}
Traditional analysis:
- Cyclomatic Complexity: 8
- Assessment: MODERATE
Entropy analysis:
- Shannon Entropy: 0.85 (high variety)
- Pattern Repetition: 0.2 (not repetitive)
- Branch Similarity: 0.3 (diverse branches)
- Effective Complexity: 9
- Assessment: HIGH PRIORITY
Configuration
Configure entropy analysis in .debtmap.toml or disable via the --semantic-off CLI flag.
[entropy]
# Enable entropy analysis (default: true)
enabled = true
# Weight of entropy in overall complexity scoring (0.0-1.0, default: 1.0)
# Note: This affects scoring, not dampening thresholds
weight = 1.0
# Minimum tokens required for entropy calculation (default: 20)
min_tokens = 20
# Pattern similarity threshold for repetition detection (0.0-1.0, default: 0.7)
pattern_threshold = 0.7
# Enable advanced token classification (default: false for backward compatibility)
# When true, weights tokens by semantic importance (control flow > operators > identifiers)
use_classification = false
# Branch similarity threshold (0.0-1.0, default: 0.8)
# Branches with similarity above this threshold contribute to dampening
branch_threshold = 0.8
# Maximum reduction limits (these are configurable)
max_repetition_reduction = 0.20 # Max 20% reduction from pattern repetition
max_entropy_reduction = 0.15 # Max 15% reduction from low token entropy
max_branch_reduction = 0.25 # Max 25% reduction from branch similarity
max_combined_reduction = 0.30 # Overall cap at 30% reduction (minimum 70% preserved)
Important Notes:
- Dampening thresholds - Some are configurable, some are hardcoded (src/complexity/entropy.rs:185-195):
  - Entropy factor threshold: 0.4 - Hardcoded in the implementation (src/complexity/entropy.rs:192). Although an entropy_threshold field exists in the config struct (src/config/languages.rs:84), it is not wired into the dampening calculation. The value 0.4 was chosen based on empirical analysis across multiple codebases to balance false positive reduction with sensitivity to genuinely complex code.
  - Branch threshold: 0.8 - Configurable via branch_threshold in the config file
  - Pattern threshold: 0.7/1.0 - Configurable via pattern_threshold in the config file
- The weight parameter affects how entropy scores contribute to overall complexity scoring, but does not change the dampening thresholds or reductions.
- Token classification defaults to false (disabled) for backward compatibility, even though it provides more accurate entropy analysis when enabled.
Tuning for Your Project
Enable token classification for better accuracy:
[entropy]
enabled = true
use_classification = true # Weight control flow keywords more heavily
Strict mode (fewer reductions, flag more code):
[entropy]
enabled = true
max_repetition_reduction = 0.10 # Reduce from default 0.20
max_entropy_reduction = 0.08 # Reduce from default 0.15
max_branch_reduction = 0.12 # Reduce from default 0.25
max_combined_reduction = 0.20 # Reduce from default 0.30 (preserve 80%)
Lenient mode (more aggressive reduction):
[entropy]
enabled = true
max_repetition_reduction = 0.30 # Increase from default 0.20
max_entropy_reduction = 0.25 # Increase from default 0.15
max_branch_reduction = 0.35 # Increase from default 0.25
max_combined_reduction = 0.50 # Increase from default 0.30 (preserve 50%)
Disable entropy dampening entirely:
[entropy]
enabled = false
Or via CLI (disables entropy-based complexity adjustments):
# Disables semantic analysis features including entropy dampening
debtmap analyze . --semantic-off
Note: The --semantic-off flag disables all semantic analysis features, including entropy-based complexity adjustments. This is useful when you want raw cyclomatic complexity without any dampening.
Interpreting Entropy-Adjusted Output
When entropy analysis detects repetitive patterns, debtmap displays both the original and adjusted complexity values to help you understand the adjustment. This transparency allows you to verify the analysis and understand why certain code receives lower priority.
Output Format
When viewing detailed output (verbosity level 2 with -vv), entropy-adjusted complexity is shown in the COMPLEXITY section:
COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
And in the Entropy Impact scoring section:
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
Understanding the Values
- cyclomatic=20 - Original cyclomatic complexity before adjustment
- dampened: 14 - Adjusted complexity after entropy analysis (20 × 0.70 = 14)
- factor: 0.70 - The dampening factor applied (0.70 = 30% reduction)
- entropy=0.30 - Shannon entropy score (0.0-1.0, lower = more repetitive)
- repetition: 95% - Pattern repetition score (higher = more repetitive)
Reconstructing the Calculation
You can verify the adjustment by multiplying:
original_complexity × dampening_factor = adjusted_complexity
20 × 0.70 = 14
The dampening percentage shown in the Entropy Impact section is:
dampening_percentage = (1.0 - dampening_factor) × 100%
(1.0 - 0.70) × 100% = 30%
When Entropy Data is Unavailable
If a function is too small for entropy analysis (< 20 tokens) or entropy is disabled, the output shows complexity without dampening:
COMPLEXITY: cyclomatic=5, est_branches=10, cognitive=8, nesting=2
No “dampened” or “factor” values are shown, indicating the raw complexity is used for scoring.
Example Output Comparison
Before entropy-adjustment:
#1 SCORE: 95.5 [CRITICAL]
├─ COMPLEXITY: cyclomatic=20, est_branches=40, cognitive=25, nesting=3
After entropy-adjustment:
#15 SCORE: 68.2 [HIGH]
├─ COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
The item dropped from rank #1 to #15 because entropy analysis detected the high complexity was primarily due to repetitive validation patterns rather than genuine cognitive complexity.
Understanding the Impact
Measuring False Positive Reduction
Run analysis with and without entropy:
# Without entropy
debtmap analyze . --semantic-off --top 20 > without_entropy.txt
# With entropy (default)
debtmap analyze . --top 20 > with_entropy.txt
# Compare
diff without_entropy.txt with_entropy.txt
Expected results:
- 60-75% reduction in flagged validation functions
- 40-50% reduction in flagged dispatcher functions
- 20-30% reduction in flagged configuration parsers
- No reduction in genuinely complex state machines or business logic
Verifying Correctness
Entropy analysis should:
- Reduce flags on repetitive code (validators, dispatchers)
- Preserve flags on genuinely complex code (state machines, business logic)
If entropy analysis incorrectly reduces flags on genuinely complex code, adjust configuration:
[entropy]
max_combined_reduction = 0.20 # Reduce from default 0.30 (preserve 80%)
max_repetition_reduction = 0.10 # Reduce individual factors
max_entropy_reduction = 0.08
max_branch_reduction = 0.12
Best Practices
- Use default settings - They work well for most projects
- Verify results - Spot-check top-priority items to ensure correctness
- Tune conservatively - Start with default settings, adjust if needed
- Disable for debugging - Use --semantic-off if entropy seems incorrect
- Report issues - If entropy incorrectly flags code, report it
Limitations
Entropy analysis works best for:
- Functions with cyclomatic complexity 10-50
- Code with clear repetitive patterns
- Validation, dispatch, and configuration functions
Entropy analysis is less effective for:
- Very simple functions (complexity < 5)
- Very complex functions (complexity > 100)
- Obfuscated or generated code
Comparison with Other Approaches
| Approach | False Positive Rate | Complexity | Speed |
|---|---|---|---|
| Raw Cyclomatic Complexity | High (many false positives) | Low | Fast |
| Cognitive Complexity | Medium | Medium | Medium |
| Entropy Analysis (Debtmap) | Low | High | Fast |
| Manual Code Review | Very Low | Very High | Very Slow |
Debtmap’s entropy analysis provides the best balance of accuracy and speed.
See Also
- Why Debtmap? - Real-world examples of entropy analysis
- Analysis Guide - General analysis concepts
- Configuration - Complete configuration reference
Error Handling Analysis
Debtmap provides comprehensive error handling analysis for Rust codebases, detecting anti-patterns that lead to silent failures, production panics, and difficult-to-debug issues.
Implementation Status
| Language | Error Swallowing | Panic Patterns | Async Errors | Context Loss | Propagation Analysis |
|---|---|---|---|---|---|
| Rust | ✅ Full (7 patterns) | ✅ Full (6 patterns) | ✅ Full (5 patterns) | ✅ Full (5 patterns) | ✅ Full (4 patterns) |
| Python | ⚠️ Planned | ⚠️ Planned | N/A | ⚠️ Planned | ⚠️ Planned |
| JavaScript/TypeScript | ⚠️ Limited | ⚠️ Limited | ⚠️ Planned | ❌ Not yet | ❌ Not yet |
Current Focus: Rust error handling analysis is fully implemented and production-ready. Python and JavaScript/TypeScript support is limited or planned for future releases.
Overview
Error handling issues are classified as ErrorSwallowing debt with Major severity (weight 4), reflecting their significant impact on code reliability and debuggability.
Fully Implemented for Rust:
- Error swallowing (7 patterns): Exception handlers that silently catch errors without logging or re-raising
- Panic patterns (6 patterns): Code that can panic in production (unwrap, expect, panic!)
- Error propagation issues (4 patterns): Missing error context in Result chains
- Async error handling (5 patterns): Dropped futures, unhandled JoinHandles, silent task panics
- Context loss detection (5 patterns): Error propagation without meaningful context
All error handling patterns are filtered intelligently - code detected in test modules (e.g., #[cfg(test)], #[test] attributes) receives lower priority or is excluded entirely.
Rust Error Handling Analysis
Panic Pattern Detection
Debtmap identifies Rust code that can panic at runtime instead of returning Result:
Detected patterns:
#![allow(unused)]
fn main() {
// ❌ CRITICAL: Direct panic in production code
fn process_data(value: Option<i32>) -> i32 {
panic!("not implemented"); // Detected: PanicInNonTest
}
// ❌ HIGH: Unwrap on Result
fn read_config(path: &Path) -> Config {
let content = fs::read_to_string(path).unwrap(); // Detected: UnwrapOnResult
parse_config(&content)
}
// ❌ HIGH: Unwrap on Option
fn get_user(id: u32) -> User {
users.get(&id).unwrap() // Detected: UnwrapOnOption
}
// ❌ MEDIUM: Expect with generic message
fn parse_value(s: &str) -> i32 {
s.parse().expect("parse failed") // Detected: ExpectWithGenericMessage
}
// ❌ MEDIUM: TODO in production
fn calculate_tax(amount: f64) -> f64 {
todo!("implement tax calculation") // Detected: TodoInProduction
}
}
Recommended alternatives:
#![allow(unused)]
fn main() {
// ✅ GOOD: Propagate errors with ?
fn read_config(path: &Path) -> Result<Config> {
let content = fs::read_to_string(path)?;
parse_config(&content)
}
// ✅ GOOD: Handle Option explicitly
fn get_user(id: u32) -> Result<User> {
users.get(&id)
.ok_or_else(|| anyhow!("User {} not found", id))
}
// ✅ GOOD: Add meaningful context
fn parse_value(s: &str) -> Result<i32> {
s.parse()
.with_context(|| format!("Failed to parse '{}' as integer", s))
}
}
Test code exceptions:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
#[test]
fn test_parsing() {
let result = "42".parse::<i32>().unwrap(); // ✅ OK in tests (LOW priority)
assert_eq!(result, 42);
}
}
}
Debtmap detects #[cfg(test)] attributes and test function contexts, automatically assigning Low priority to panic patterns in test code.
Error Propagation Analysis
Debtmap detects missing error context in Result chains.
Note: Context loss detection uses AST-based heuristics without full type information. This means some edge cases may produce false positives or negatives. Type information would improve accuracy but would require a full compilation environment.
#![allow(unused)]
fn main() {
// ❌ Missing context - which file failed? What was the error?
fn load_multiple_configs(paths: &[PathBuf]) -> Result<Vec<Config>> {
paths.iter()
.map(|p| fs::read_to_string(p)) // Error loses file path information
.collect::<Result<Vec<_>>>()?
.into_iter()
.map(|c| parse_config(&c)) // Error loses which config failed
.collect()
}
// ✅ GOOD: Preserve context through the chain
fn load_multiple_configs(paths: &[PathBuf]) -> Result<Vec<Config>> {
paths.iter()
.map(|p| {
fs::read_to_string(p)
.with_context(|| format!("Failed to read config from {}", p.display()))
})
.collect::<Result<Vec<_>>>()?
.into_iter()
.enumerate()
.map(|(i, content)| {
parse_config(&content)
.with_context(|| format!("Failed to parse config #{}", i))
})
.collect()
}
}
Best practices:
- Use .context() or .with_context() from anyhow (thiserror is for defining structured error types)
- Include relevant values in error messages (file paths, indices, input values)
- Maintain error context at each transformation in the chain
Error Swallowing in Rust
Debtmap detects seven distinct patterns of error swallowing in Rust, where errors are silently ignored without logging or propagation:
1. IfLetOkNoElse - Missing else branch
#![allow(unused)]
fn main() {
// ❌ Detected: if let Ok without else branch
fn try_update(value: &str) {
if let Ok(parsed) = value.parse::<i32>() {
update_value(parsed);
}
// Error case silently ignored - no logging or handling
}
// ✅ GOOD: Handle both cases
fn try_update(value: &str) -> Result<()> {
if let Ok(parsed) = value.parse::<i32>() {
update_value(parsed);
Ok(())
} else {
Err(anyhow!("Failed to parse value: {}", value))
}
}
}
2. IfLetOkEmptyElse - Empty else branch
#![allow(unused)]
fn main() {
// ❌ Detected: if let Ok with empty else
fn process_result(result: Result<Data, Error>) {
if let Ok(data) = result {
process(data);
} else {
// Empty else - error silently swallowed
}
}
// ✅ GOOD: Log the error
fn process_result(result: Result<Data, Error>) {
if let Ok(data) = result {
process(data);
} else {
log::error!("Failed to process: {:?}", result);
}
}
}
3. LetUnderscoreResult - Discarding Result with let _
#![allow(unused)]
fn main() {
// ❌ Detected: Result discarded with let _
fn save_data(data: &Data) {
let _ = fs::write("data.json", serde_json::to_string(data).unwrap());
// Write failure silently ignored
}
// ✅ GOOD: Handle or propagate the error
fn save_data(data: &Data) -> Result<()> {
fs::write("data.json", serde_json::to_string(data)?)
.context("Failed to save data")?;
Ok(())
}
}
4. OkMethodDiscard - Calling .ok() and discarding
#![allow(unused)]
fn main() {
// ❌ Detected: .ok() called but result discarded
fn try_parse(s: &str) -> Option<i32> {
s.parse::<i32>().ok(); // Result immediately discarded
None
}
// ✅ GOOD: Use the Ok value or log the error
fn try_parse(s: &str) -> Option<i32> {
match s.parse::<i32>() {
Ok(v) => Some(v),
Err(e) => {
log::warn!("Failed to parse '{}': {}", s, e);
None
}
}
}
}
5. MatchIgnoredErr - Match with ignored error variant
#![allow(unused)]
fn main() {
// ❌ Detected: match with _ in Err branch
fn try_load(path: &Path) -> Option<String> {
match fs::read_to_string(path) {
Ok(content) => Some(content),
Err(_) => None, // Error details ignored
}
}
// ✅ GOOD: Log the error with context
fn try_load(path: &Path) -> Option<String> {
match fs::read_to_string(path) {
Ok(content) => Some(content),
Err(e) => {
log::error!("Failed to read {}: {}", path.display(), e);
None
}
}
}
}
6. UnwrapOrNoLog - .unwrap_or() without logging
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap_or without logging
fn get_config_value(key: &str) -> String {
load_config()
.and_then(|c| c.get(key))
.unwrap_or_else(|_| "default".to_string())
// Error silently replaced with default
}
// ✅ GOOD: Log before falling back to default
fn get_config_value(key: &str) -> String {
match load_config().and_then(|c| c.get(key)) {
Ok(value) => value,
Err(e) => {
log::warn!("Config key '{}' not found: {}. Using default.", key, e);
"default".to_string()
}
}
}
}
7. UnwrapOrDefaultNoLog - .unwrap_or_default() without logging
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap_or_default without logging
fn load_settings() -> Settings {
read_settings_file().unwrap_or_default()
// Error silently replaced with default settings
}
// ✅ GOOD: Log the fallback to defaults
fn load_settings() -> Settings {
match read_settings_file() {
Ok(settings) => settings,
Err(e) => {
log::warn!("Failed to load settings: {}. Using defaults.", e);
Settings::default()
}
}
}
}
Summary of Error Swallowing Patterns:
| Pattern | Description | Common Cause |
|---|---|---|
| IfLetOkNoElse | if let Ok(..) without else | Quick prototyping, forgotten error path |
| IfLetOkEmptyElse | if let Ok(..) with empty else | Incomplete implementation |
| LetUnderscoreResult | let _ = result | Intentional ignore without thought |
| OkMethodDiscard | .ok() result not used | Misunderstanding of .ok() semantics |
| MatchIgnoredErr | Err(_) => ... with no logging | Generic error handling |
| UnwrapOrNoLog | .unwrap_or() without logging | Convenience over observability |
| UnwrapOrDefaultNoLog | .unwrap_or_default() without logging | Default fallback without visibility |
All these patterns are detected at Medium to High priority depending on context, as they represent lost error information that makes debugging difficult.
Source: Error swallowing patterns are defined in src/debt/error_swallowing.rs:234-243 and comprehensively tested in tests/error_swallowing_test.rs.
Python Error Handling Analysis (Planned)
⚠️ Python error handling detection is planned but not yet implemented. Currently only Rust error patterns are fully supported.
The patterns described below represent the intended future behavior once Python analysis is implemented.
Bare Except Clause Detection (Planned)
Python’s bare except: catches all exceptions, including system exits and keyboard interrupts:
# ❌ CRITICAL: Bare except catches everything
def process_file(path):
    try:
        with open(path) as f:
            return f.read()
    except:  # Detected: BareExceptClause
        return None  # Catches SystemExit, KeyboardInterrupt, etc.

# ❌ HIGH: Catching Exception is too broad
def load_config(path):
    try:
        return yaml.load(open(path))
    except Exception:  # Detected: OverlyBroadException
        return {}  # Silent failure loses error information

# ✅ GOOD: Specific exception types
def process_file(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        log.error(f"File not found: {path}")
        return None
    except PermissionError:
        log.error(f"Permission denied: {path}")
        return None
Why bare except is dangerous:
- Catches SystemExit (prevents clean shutdown)
- Catches KeyboardInterrupt (prevents Ctrl+C)
- Catches GeneratorExit (breaks generator protocol)
- Masks programming errors like NameError and AttributeError
Best practices:
- Always specify exception types: except ValueError, except (TypeError, KeyError)
- Use except Exception only when truly catching all application errors
- Never use bare except: in production code
- Log exceptions with full context before suppressing
Silent Exception Handling (Planned)
# ❌ Silent exception handling
def get_user_age(user_id):
    try:
        user = db.get_user(user_id)
        return user.age
    except:  # Detected: SilentException (no logging, no re-raise)
        pass

# ✅ GOOD: Log and provide meaningful default
def get_user_age(user_id):
    try:
        user = db.get_user(user_id)
        return user.age
    except UserNotFound:
        logger.warning(f"User {user_id} not found")
        return None
    except DatabaseError as e:
        logger.error(f"Database error fetching user {user_id}: {e}")
        raise  # Re-raise for caller to handle
Contextlib Suppress Detection (Planned)
Python’s contextlib.suppress() intentionally silences exceptions, which can hide errors:
from contextlib import suppress

# ❌ MEDIUM: contextlib.suppress hides errors
def cleanup_temp_files(paths):
    for path in paths:
        with suppress(FileNotFoundError, PermissionError):
            os.remove(path)  # Detected: ContextlibSuppress
            # Errors silently suppressed - no visibility into failures

# ✅ GOOD: Log suppressed errors
def cleanup_temp_files(paths):
    for path in paths:
        try:
            os.remove(path)
        except FileNotFoundError:
            logger.debug(f"File already deleted: {path}")
        except PermissionError as e:
            logger.warning(f"Permission denied removing {path}: {e}")
        except Exception as e:
            logger.error(f"Unexpected error removing {path}: {e}")

# ✅ ACCEPTABLE: Use suppress only for truly ignorable cases
def best_effort_cleanup(paths):
    """Best-effort cleanup - failures are expected and acceptable."""
    for path in paths:
        with suppress(OSError):  # OK if documented and intentional
            os.remove(path)
When contextlib.suppress is acceptable:
- Cleanup operations where failures are genuinely unimportant
- Operations explicitly documented as “best effort”
- Code where logging would create noise without value
When to avoid contextlib.suppress:
- Production code where error visibility matters
- Operations where partial failure should be noticed
- Any case where debugging might be needed later
Exception Flow Analysis (Planned)
Python exception flow analysis is planned for future implementation. This would track exception propagation through Python codebases to identify functions that can raise exceptions without proper handling.
# Potential issue: Exceptions may propagate unhandled
def process_batch(items):
    for item in items:
        validate_item(item)   # Can raise ValueError
        transform_item(item)  # Can raise TransformError
        save_item(item)       # Can raise DatabaseError

# ✅ GOOD: Handle exceptions appropriately
def process_batch(items):
    results = {"success": 0, "failed": 0}
    for item in items:
        try:
            validate_item(item)
            transform_item(item)
            save_item(item)
            results["success"] += 1
        except ValueError as e:
            logger.warning(f"Invalid item {item.id}: {e}")
            results["failed"] += 1
        except (TransformError, DatabaseError) as e:
            logger.error(f"Failed to process item {item.id}: {e}")
            results["failed"] += 1
            # Optionally re-raise critical errors
            if isinstance(e, DatabaseError):
                raise
    return results
Async Error Handling
JavaScript/TypeScript Async Patterns (Planned)
⚠️ JavaScript and TypeScript async error detection is planned but not yet fully implemented.
Current Status:
- JavaScript/TypeScript support focuses on complexity analysis and basic error patterns
- Async error handling detection (unhandled promise rejections, missing await) is fully implemented for Rust only
- Enhanced JavaScript/TypeScript async error detection is planned for future releases
The examples below show intended future behavior:
// ❌ CRITICAL: Unhandled promise rejection
async function loadUserData(userId) {
const response = await fetch(`/api/users/${userId}`);
// If fetch rejects, promise is unhandled
return response.json();
}
loadUserData(123); // Detected: UnhandledPromiseRejection
// ✅ GOOD: Handle rejections
async function loadUserData(userId) {
try {
const response = await fetch(`/api/users/${userId}`);
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
return await response.json();
} catch (error) {
console.error(`Failed to load user ${userId}:`, error);
throw error; // Re-throw or return default
}
}
loadUserData(123).catch(err => {
console.error("Top-level error handler:", err);
});
Missing Await Detection (Planned)
// ❌ HIGH: Missing await - promise dropped
async function saveAndNotify(data) {
await saveToDatabase(data);
sendNotification(data.userId); // Detected: MissingAwait
// Function returns before notification completes
}
// ✅ GOOD: Await all async operations
async function saveAndNotify(data) {
await saveToDatabase(data);
await sendNotification(data.userId);
}
Async Rust Error Handling
Debtmap detects five async-specific error handling patterns in Rust.
Note: Async error detection uses pattern matching on tokio APIs and AST-based heuristics. Some patterns (particularly select! macro handling and future dropping) may require manual review, as full semantic analysis would require type information.
1. DroppedFuture - Future dropped without awaiting
#![allow(unused)]
fn main() {
// ❌ HIGH: Dropped future without error handling
async fn process_requests(requests: Vec<Request>) {
for req in requests {
tokio::spawn(async move {
handle_request(req).await // Detected: DroppedFuture
// Errors silently dropped
});
}
}
// ✅ GOOD: Join handles and propagate errors
async fn process_requests(requests: Vec<Request>) -> Result<()> {
let handles: Vec<_> = requests.into_iter()
.map(|req| {
tokio::spawn(async move {
handle_request(req).await
})
})
.collect();
for handle in handles {
handle.await??; // Propagate both JoinError and handler errors
}
Ok(())
}
}
2. UnhandledJoinHandle - Spawned task without join
#![allow(unused)]
fn main() {
// ❌ HIGH: Task spawned but handle never checked
async fn background_sync() {
tokio::spawn(async {
sync_to_database().await // Detected: UnhandledJoinHandle
});
// Handle dropped - can't detect if task panicked or failed
}
// ✅ GOOD: Store and check join handle
async fn background_sync() -> Result<()> {
let handle = tokio::spawn(async {
sync_to_database().await
});
handle.await? // Wait for completion and check for panic
}
}
3. SilentTaskPanic - Task panic without monitoring
#![allow(unused)]
fn main() {
// ❌ HIGH: Task panic silently ignored
tokio::spawn(async {
panic!("task failed"); // Detected: SilentTaskPanic
});
// ✅ GOOD: Handle task panics
let handle = tokio::spawn(async {
critical_operation().await
});
match handle.await {
Ok(Ok(result)) => println!("Success: {:?}", result),
Ok(Err(e)) => eprintln!("Task failed: {}", e),
Err(e) => eprintln!("Task panicked: {}", e),
}
}
4. SpawnWithoutJoin - Spawning without storing handle
#![allow(unused)]
fn main() {
// ❌ MEDIUM: Spawn without storing handle
async fn fire_and_forget_tasks(items: Vec<Item>) {
for item in items {
tokio::spawn(process_item(item)); // Detected: SpawnWithoutJoin
// No way to check task completion or errors
}
}
// ✅ GOOD: Collect handles for later checking
async fn process_tasks_with_monitoring(items: Vec<Item>) -> Result<()> {
let handles: Vec<_> = items.into_iter()
.map(|item| tokio::spawn(process_item(item)))
.collect();
for handle in handles {
handle.await??;
}
Ok(())
}
}
5. SelectBranchIgnored - Select branch without error handling
#![allow(unused)]
fn main() {
// ❌ MEDIUM: tokio::select! branch error ignored
async fn process_with_timeout(data: Data) {
tokio::select! {
result = process_data(data) => {
// Detected: SelectBranchIgnored
// result could be Err but not checked
}
_ = tokio::time::sleep(Duration::from_secs(5)) => {
println!("Timeout");
}
}
}
// ✅ GOOD: Handle errors in select branches
async fn process_with_timeout(data: Data) -> Result<()> {
tokio::select! {
result = process_data(data) => {
result?; // Propagate error
Ok(())
}
_ = tokio::time::sleep(Duration::from_secs(5)) => {
Err(anyhow!("Processing timeout after 5s"))
}
}
}
}
Async Error Pattern Summary:
| Pattern | Severity | Description | Common in |
|---|---|---|---|
| DroppedFuture | High | Future result ignored | Fire-and-forget spawns |
| UnhandledJoinHandle | High | JoinHandle never checked | Background tasks |
| SilentTaskPanic | High | Task panic not monitored | Unmonitored spawns |
| SpawnWithoutJoin | Medium | Handle not stored | Quick prototypes |
| SelectBranchIgnored | Medium | select! branch error ignored | Concurrent operations |
All async error patterns emphasize the importance of properly handling errors in concurrent Rust code, where failures can easily go unnoticed.
Source: Async error patterns are defined in src/debt/async_errors.rs:210-217 and tested with tokio-specific patterns.
Severity Levels and Prioritization
Error handling issues are assigned severity based on their impact:
| Pattern | Severity | Weight | Priority | Rationale |
|---|---|---|---|---|
| Panic in production | CRITICAL | 4 | Critical | Crashes the process |
| Bare except clause | CRITICAL | 4 | Critical | Masks system signals |
| Silent task panic | CRITICAL | 4 | Critical | Hidden failures |
| Unwrap on Result/Option | HIGH | 4 | High | Likely to panic |
| Dropped future | HIGH | 4 | High | Lost error information |
| Unhandled promise rejection | HIGH | 4 | High | Silently fails |
| Error swallowing | MEDIUM | 4 | Medium | Loses debugging context |
| Missing error context | MEDIUM | 4 | Medium | Hard to debug |
| Expect with generic message | MEDIUM | 4 | Medium | Uninformative errors |
| TODO in production | MEDIUM | 4 | Medium | Incomplete implementation |
All ErrorSwallowing debt has weight 4 (Major severity), but individual patterns receive different priorities based on production impact.
Integration with Risk Scoring
Error handling issues contribute to the debt_factor in Debtmap’s risk scoring formula:
risk_score = (complexity_factor * 0.4) + (debt_factor * 0.3) + (coverage_factor * 0.3)
where debt_factor includes:
- ErrorSwallowing count * weight (4)
- Combined with other debt types
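As a quick illustration of the formula (factor values here are hypothetical, assuming each factor is normalized to 0-1):
risk_score = (0.8 × 0.4) + (0.6 × 0.3) + (0.7 × 0.3)
           = 0.32 + 0.18 + 0.21
           = 0.71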
Compound risk example:
#![allow(unused)]
fn main() {
// HIGH RISK: High complexity + error swallowing + low coverage
fn process_transaction(tx: Transaction) -> bool { // Cyclomatic: 12, Cognitive: 18
if tx.amount > 1000 {
if tx.verified {
if validate_funds(&tx).unwrap() { // ❌ Panic pattern
if tx.user_type == "premium" {
match apply_premium_discount(&tx) {
Ok(_) => {},
Err(_) => return false, // ❌ Error swallowed
}
}
charge_account(&tx).unwrap(); // ❌ Another panic
return true;
}
}
}
false
}
// Coverage: 45% (untested error paths)
// Risk Score: Very High (complexity + error handling + coverage gaps)
}
This function would be flagged as Priority 1 in Debtmap’s output due to:
- High cyclomatic complexity (12)
- Multiple panic patterns (unwrap calls)
- Error swallowing (ignored Result)
- Coverage gaps in error handling paths
Configuration
Error Handling Configuration Options
All Rust error handling detection is enabled by default. You typically don’t need to configure anything - Debtmap will automatically detect all error patterns in your Rust code.
When to Use Configuration:
- Gradual adoption: Disable some patterns while fixing others
- Project-specific needs: Turn off patterns that don’t apply to your codebase
- Performance tuning: Disable expensive analyzers if not needed
Configure error handling analysis in .debtmap.toml:
[error_handling]
# All patterns enabled by default - only add config to DISABLE patterns
# detect_panic_patterns = true # Default: enabled
# detect_swallowing = true # Default: enabled
# detect_async_errors = true # Default: enabled
# detect_context_loss = true # Default: enabled
# detect_propagation = true # Default: enabled
# Example: Gradual adoption - start with just panic patterns
detect_panic_patterns = true # Keep enabled
detect_swallowing = false # Disable initially
detect_async_errors = false # Disable initially
detect_context_loss = false # Disable initially
Default Behavior (No Configuration):
All error handling patterns are detected with the ErrorSwallowing debt category (weight 4). Test code automatically receives lower priority.
Detection Examples
What Gets Detected vs. Not Detected
Rust Examples (Fully Implemented)
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap() in production code
pub fn get_config() -> Config {
load_config().unwrap()
}
// ✅ Not detected: ? operator (proper error propagation)
pub fn get_config() -> Result<Config> {
Ok(load_config()?)
}
// ✅ Not detected: unwrap() in test
#[test]
fn test_config() {
let config = load_config().unwrap(); // OK in tests
assert_eq!(config.port, 8080);
}
// ❌ Detected: expect() with generic message
let value = map.get("key").expect("missing");
// ✅ Not detected: expect() with descriptive context
let value = map.get("key")
.expect("Configuration must contain 'key' field");
}
Python Examples (Planned - Not Yet Implemented)
# ❌ Detected: bare except
try:
risky_operation()
except:
pass
# ✅ Not detected: specific exception
try:
risky_operation()
except ValueError:
handle_value_error()
# ❌ Detected: silent exception (no logging/re-raise)
try:
db.save(record)
except DatabaseError:
pass # Silent failure
# ✅ Not detected: logged exception
try:
db.save(record)
except DatabaseError as e:
logger.error(f"Failed to save record: {e}")
raise
Suppression Patterns
For cases where error handling patterns are intentional, use suppression comments:
Rust:
#![allow(unused)]
fn main() {
// debtmap: ignore - Unwrap is safe here due to prior validation
let value = validated_map.get("key").unwrap();
}
Python:
try:
experimental_feature()
except: # debtmap: ignore - Intentional catch-all during migration
use_fallback()
See Suppression Patterns for complete syntax and usage.
Best Practices
Rust Error Handling
- Prefer the ? operator over unwrap/expect
  // Instead of:
  fs::read_to_string(path).unwrap()
  // Use:
  fs::read_to_string(path)?
- Use anyhow for application errors, thiserror for libraries
  use anyhow::{Context, Result};
  fn load_data(path: &Path) -> Result<Data> {
      let content = fs::read_to_string(path)
          .with_context(|| format!("Failed to read {}", path.display()))?;
      parse_data(&content)
          .context("Invalid data format")
  }
- Add context at each error boundary
  .with_context(|| format!("meaningful message with {}", value))
- Handle Option explicitly
  map.get(key).ok_or_else(|| anyhow!("Missing key: {}", key))?
Python Error Handling (Planned - Future Implementation)
Note: Python error handling detection is planned. These best practices represent intended future behavior.
- Always use specific exception types
  except (ValueError, KeyError) as e:
- Log before suppressing
  except DatabaseError as e:
      logger.error(f"Database operation failed: {e}", exc_info=True)
      # Then decide: re-raise, return default, or handle
- Avoid bare except completely
  # If you must catch everything:
  except Exception as e:  # Not bare except:
      logger.exception("Unexpected error")
      raise
- Use context managers for resource cleanup
  with open(path) as f:  # Ensures cleanup even on exception
      process(f)
JavaScript/TypeScript Error Handling (Planned - Limited Implementation)
Note: JavaScript/TypeScript async error detection is planned. Currently only basic patterns are supported.
- Always handle promise rejections
  fetchData().catch(err => console.error(err));
  // Or use try/catch with async/await
- Use async/await consistently
  async function process() {
    try {
      const data = await fetchData();
      await saveData(data);
    } catch (error) {
      console.error("Failed:", error);
      throw error;
    }
  }
- Don’t forget await
  await asyncOperation(); // Don't drop promises
Improving Rust Error Handling Based on Debtmap Reports
Workflow
- Run analysis with error focus
  debtmap analyze --filter-categories ErrorSwallowing
  Note: This analyzes Rust error patterns. Python and JavaScript/TypeScript support is limited or planned.
- Review priority issues first
  - Address CRITICAL (panic in production, bare except) immediately
  - Schedule HIGH (unwrap, dropped futures) for next sprint
  - Plan MEDIUM (missing context) for gradual improvement
- Fix systematically
  - One file or module at a time
  - Add tests as you improve error handling
  - Run debtmap after each fix to verify
- Validate improvements
  # Before fixes
  debtmap analyze --output before.json
  # After fixes
  debtmap analyze --output after.json
  # Compare
  debtmap compare before.json after.json
Migration Strategy for Legacy Code
# .debtmap.toml - Gradual adoption
[error_handling]
# Start with just critical panic patterns
detect_panic_patterns = true
detect_swallowing = false # Add later
detect_async_errors = false # Add later
detect_context_loss = false # Add later
# After fixing panic patterns, enable error swallowing detection
# detect_swallowing = true
# Eventually enable all patterns
# detect_swallowing = true
# detect_async_errors = true
# detect_context_loss = true
# detect_propagation = true
Track progress over time:
# Weekly error handling health check
debtmap analyze --filter-categories ErrorSwallowing | tee weekly-error-health.txt
Troubleshooting
Too Many False Positives in Test Code
Problem: Debtmap flagging unwrap() in test functions
Solution: Debtmap should automatically detect test code via:
- #[cfg(test)] modules in Rust
- #[test] attributes
- test_ function name prefix in Python
- *.test.ts, *.spec.js file patterns
If false positives persist:
#![allow(unused)]
fn main() {
// Use suppression comment
let value = result.unwrap(); // debtmap: ignore - Test assertion
}
Error Patterns Not Being Detected
Problem: Known error patterns not appearing in report
Causes and solutions:
- Language support not enabled
  debtmap analyze --languages rust,python,javascript
- Pattern disabled in config
  [error_handling]
  detect_panic_patterns = true
  detect_swallowing = true
  detect_async_errors = true
  # Ensure relevant detectors are enabled
- Suppression comment present
  - Check for debtmap: ignore comments
  - Review .debtmap.toml ignore patterns
Disagreement with Severity Levels
Problem: Severity feels too high/low for your codebase
Solution: Customize in .debtmap.toml:
[debt_categories.ErrorSwallowing]
weight = 2 # Reduce from default 4 to Warning level
severity = "Warning"
# Or increase for stricter enforcement
# weight = 5
# severity = "Critical"
Can’t Find Which Line Has the Issue
Problem: Debtmap reports error at wrong line number
Causes:
- Source code changed since analysis
- Parser approximation for line numbers
Solutions:
- Re-run analysis: debtmap analyze
- Search for pattern: rg "\.unwrap\(\)" src/
- Enable debug logging: debtmap analyze --log-level debug
Validating Error Handling Improvements
Problem: Unsure if fixes actually improved code quality
Solution: Use compare workflow:
# Baseline before fixes
git checkout main
debtmap analyze --output baseline.json
# After fixes
git checkout feature/improve-errors
debtmap analyze --output improved.json
# Compare reports
debtmap compare baseline.json improved.json
Look for:
- Reduced ErrorSwallowing debt count
- Lower risk scores for affected functions
- Improved coverage of error paths (if running with coverage)
Related Topics
- Configuration - Complete .debtmap.toml reference
- Suppression Patterns - Suppress false positives
- Scoring Strategies - How error handling affects risk scores
- Coverage Integration - Detect untested error paths
- CLI Reference - Command-line options for error analysis
- Troubleshooting - General debugging guide
Functional Composition Analysis
Debtmap provides deep AST-based analysis to detect and evaluate functional programming patterns in Rust code. This feature helps you understand how effectively your codebase uses functional composition patterns like iterator pipelines, identifies opportunities to refactor imperative code into functional style, and rewards pure, side-effect-free functions in complexity scoring.
Overview
Functional analysis examines your code at the AST level to detect:
- Iterator pipelines - Chains like .iter().map().filter().collect()
- Purity analysis - Functions with no mutable state or side effects
- Composition quality metrics - Overall functional programming quality scores
- Side effect classification - Categorization of Pure, Benign, and Impure side effects
This analysis integrates with debtmap’s scoring system, providing score bonuses for high-quality functional code and reducing god object warnings for codebases with many small pure helper functions.
Specification: This feature implements Specification 111: AST-Based Functional Pattern Detection with accuracy targets of precision ≥90%, recall ≥85%, F1 ≥0.87, and performance overhead <10%.
Configuration Profiles
Debtmap provides three pre-configured analysis profiles to match different codebases:
| Profile | Use Case | Min Pipeline Depth | Max Closure Complexity | Purity Threshold | Quality Threshold |
|---|---|---|---|---|---|
| Strict | Functional-first codebases | 3 | 3 | 0.9 | 0.7 |
| Balanced (default) | Typical Rust projects | 2 | 5 | 0.8 | 0.6 |
| Lenient | Imperative-heavy legacy code | 2 | 10 | 0.5 | 0.4 |
Choosing a Profile
Use Strict when:
- Your codebase emphasizes functional programming patterns
- You want to enforce high purity standards
- You’re building a new project with functional-first principles
- You want to detect even simple pipelines (3+ stages)
Use Balanced (default) when:
- You have a typical Rust codebase mixing functional and imperative styles
- You want reasonable detection without being overly strict
- You’re working on a mature project with mixed patterns
- You want to reward functional patterns without penalizing pragmatic imperative code
Use Lenient when:
- You’re analyzing legacy code with heavy imperative patterns
- You want to identify only the most obviously functional code
- You’re migrating from an imperative codebase and want gradual improvement
- You have complex closures that are still fundamentally functional
CLI Usage
Enable functional analysis with the --ast-functional-analysis flag and select a profile with --functional-analysis-profile:
# Enable with balanced profile (default)
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced
# Use strict profile for functional-first codebases
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
# Use lenient profile for legacy code
debtmap analyze . --ast-functional-analysis --functional-analysis-profile lenient
Note: The --ast-functional-analysis flag enables the feature, while --functional-analysis-profile selects the configuration profile (strict/balanced/lenient).
Pure Function Detection
A function is considered pure when it:
- Returns same output for same input (deterministic)
- Has no observable side effects
- Doesn’t mutate external state
- Doesn’t perform I/O
Examples
#![allow(unused)]
fn main() {
// Pure function
fn add(a: i32, b: i32) -> i32 {
a + b
}
// Pure function with internal iteration
fn factorial(n: u32) -> u32 {
(1..=n).product() // Pure despite internal iteration
}
// Not pure: I/O side effect
fn log_and_add(a: i32, b: i32) -> i32 {
println!("Adding {} and {}", a, b); // Side effect!
a + b
}
// Not pure: mutates external state
fn increment_counter(counter: &mut i32) -> i32 {
*counter += 1; // Side effect!
*counter
}
}
Pipeline Detection
Debtmap detects functional pipelines through deep AST analysis, identifying iterator chains and their transformations.
Pipeline Stages
The analyzer recognizes these pipeline stage types:
1. Iterator Initialization
Methods that start an iterator chain:
- .iter() - Immutable iteration
- .into_iter() - Consuming iteration
- .iter_mut() - Mutable iteration
#![allow(unused)]
fn main() {
// Detected iterator initialization
let results = collection.iter()
.map(|x| x * 2)
.collect();
}
2. Map Transformations
Applies a transformation function to each element:
#![allow(unused)]
fn main() {
// Detected Map stage
items.iter()
.map(|x| x * 2) // Simple closure (low complexity)
.map(|x| { // Complex closure (higher complexity)
let doubled = x * 2;
doubled + 1
})
.collect()
}
The analyzer tracks closure complexity for each map operation. Complex closures may indicate code smells and affect quality scoring based on your max_closure_complexity threshold.
3. Filter Predicates
Selects elements based on a predicate:
#![allow(unused)]
fn main() {
// Detected Filter stage
items.iter()
.filter(|x| **x > 0) // Simple predicate
.filter(|x| { // Complex predicate
x.is_positive() && **x < 100
})
.collect()
}
4. Fold/Reduce Aggregation
Combines elements into a single value:
#![allow(unused)]
fn main() {
// Detected Fold stage
items.iter()
.fold(0, |acc, x| acc + x)
// Or using reduce
items.iter().copied()
.reduce(|a, b| a + b)
}
5. FlatMap Transformations
Maps and flattens nested structures:
#![allow(unused)]
fn main() {
// Detected FlatMap stage
items.iter()
.flat_map(|&x| vec![x, x * 2])
.collect()
}
6. Inspect (Side-Effect Aware)
Performs side effects while passing through values:
#![allow(unused)]
fn main() {
// Detected Inspect stage (affects purity scoring)
items.iter()
.inspect(|x| println!("Processing: {}", x))
.map(|x| x * 2)
.collect()
}
7. Result/Option Chaining
Specialized stages for error handling:
#![allow(unused)]
fn main() {
// Note: and_then and map_err chain on Result/Option values rather than
// on iterators, so these stages appear in error-handling chains:
// Detected AndThen stage
let config = parse_config(input)
    .and_then(validate_config);
// Detected MapErr stage
let settings = load_settings(path)
    .map_err(|e| format!("Error: {}", e));
}
Terminal Operations
Pipelines typically end with a terminal operation that consumes the iterator:
- collect() - Gather elements into a collection
- sum() - Sum numeric values
- count() - Count elements
- any() - Check if any element matches
- all() - Check if all elements match
- find() - Find first matching element
- reduce() - Reduce to single value
- for_each() - Execute side effects for each element
#![allow(unused)]
fn main() {
// Complete pipeline with terminal operation
let total: i32 = items.iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.sum(); // Terminal operation: sum
}
Nested Pipelines
Debtmap detects pipelines nested within closures, indicating highly functional code patterns:
#![allow(unused)]
fn main() {
// Nested pipeline detected
let results = outer_items.iter()
.map(|item| {
// Inner pipeline (nesting_level = 1)
item.values.iter()
.filter(|v| **v > 0)
.collect()
})
.collect();
}
Nesting level tracking helps identify sophisticated functional composition patterns.
Parallel Pipelines
Parallel iteration using Rayon is automatically detected:
#![allow(unused)]
fn main() {
use rayon::prelude::*;
// Detected as parallel pipeline (is_parallel = true)
let results: Vec<_> = items.par_iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.collect();
}
Parallel pipelines indicate high-performance functional patterns and receive positive quality scoring.
Builder Pattern Filtering
To avoid false positives, debtmap distinguishes builder patterns from functional pipelines:
#![allow(unused)]
fn main() {
// This is a builder pattern, NOT counted as a functional pipeline
let config = ConfigBuilder::new()
.with_host("localhost")
.with_port(8080)
.build();
// This IS a functional pipeline
let values = items.iter()
.map(|x| x * 2)
.collect();
}
Builder patterns are filtered out to ensure accurate functional composition metrics.
Purity Analysis
Debtmap analyzes functions to determine their purity level - whether they have side effects and mutable state.
Purity Levels
Functions are classified into three purity levels for god object weighting (defined in src/organization/purity_analyzer.rs:19-26):
Note: Debtmap has two purity analysis systems serving different purposes:
Three-level system (this section) - Used for god object scoring with weight multipliers:
- Defined in src/organization/purity_analyzer.rs
- Categories: Pure, ProbablyPure, Impure
- Purpose: Dampen god object scores for pure functions via weight multipliers (0.3, 0.5, 1.0)
Four-level system - Used for detailed responsibility classification:
- Defined in src/core/mod.rs:47-60 and src/analysis/purity_analysis.rs:34-43
- Categories: StrictlyPure, LocallyPure, ReadOnly, Impure
- Purpose: Fine-grained analysis of function behavior and side effects
- Used in responsibility analysis and technical debt assessment
The three-level system provides a simplified classification optimized for god object detection. The four-level system offers more granular distinctions (e.g., builder patterns with local mutations vs. read-only accessors).
This chapter focuses on the three-level system used for god object integration. For detailed responsibility classification, see Responsibility Analysis.
Pure (Weight 0.3)
Guaranteed no side effects:
- No mutable parameters (&mut, mut self)
- No I/O operations
- No global mutations
- No unsafe blocks
- Only immutable bindings
#![allow(unused)]
fn main() {
// Pure function
fn calculate_total(items: &[i32]) -> i32 {
items.iter().sum()
}
// Pure function with immutable bindings
fn process_value(x: i32) -> i32 {
let doubled = x * 2; // Immutable binding
let result = doubled + 10;
result
}
}
Probably Pure (Weight 0.5)
Likely no side effects:
- Static functions (fn items, not methods)
- Associated functions (no self)
- No obvious side effects detected
#![allow(unused)]
fn main() {
// Probably pure - static function
fn transform(value: i32) -> i32 {
value * 2
}
// Probably pure - associated function
impl MyType {
fn create_default() -> Self {
MyType { value: 0 }
}
}
}
Impure (Weight 1.0)
Has side effects:
- Uses mutable references (&mut, mut self)
- Performs I/O operations (println!, file I/O, network)
- Uses async (potential side effects)
- Mutates global state
- Uses unsafe
#![allow(unused)]
fn main() {
// Impure - mutable reference
fn increment(value: &mut i32) {
*value += 1;
}
// Impure - I/O operation
fn log_value(value: i32) {
println!("Value: {}", value);
}
// Impure - mutation
fn process_items(items: &mut Vec<i32>) {
items.push(42);
}
}
Purity Weight Multipliers
Purity levels affect god object detection through weight multipliers (implemented in src/organization/purity_analyzer.rs:29-39). Pure functions contribute less to god object scores, rewarding codebases with many small pure helper functions:
- Pure (0.3): A pure function counts as 30% of a regular function in god object method count calculations
- Probably Pure (0.5): Counts as 50%
- Impure (1.0): Full weight
The purity_score dampens god object scores via the weight_multiplier calculation. For example, pure functions with weight 0.3 count as only 30% of a regular function when calculating method counts for god object detection.
Example: A module with 20 pure helper functions (20 × 0.3 = 6.0 effective) is less likely to trigger god object warnings than a module with 10 impure functions (10 × 1.0 = 10.0 effective).
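The weighting above can be pictured in a few lines of Rust (an illustration of the arithmetic, not Debtmap's internal API):
#[derive(Clone, Copy)]
enum Purity {
    Pure,         // weight 0.3
    ProbablyPure, // weight 0.5
    Impure,       // weight 1.0
}

// Effective method count compared against god object thresholds,
// using the multipliers described above.
fn effective_method_count(methods: &[Purity]) -> f64 {
    methods
        .iter()
        .map(|p| match p {
            Purity::Pure => 0.3,
            Purity::ProbablyPure => 0.5,
            Purity::Impure => 1.0,
        })
        .sum()
}

fn main() {
    // 20 pure helpers count as only ~6.0 effective methods.
    let helpers = vec![Purity::Pure; 20];
    assert!((effective_method_count(&helpers) - 6.0).abs() < 1e-9);
}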
Side Effect Detection
Detected Side Effects
I/O Operations:
- File reading/writing
- Network calls
- Console output
- Database queries
State Mutation:
- Mutable global variables
- Shared mutable state
- Reference mutations
Randomness:
- Random number generation
- Time-dependent behavior
System Interaction:
- Environment variable access
- System calls
- Thread spawning
Rust-Specific Detection
#![allow(unused)]
fn main() {
// Interior mutability detection
use std::cell::RefCell;
fn has_side_effect() {
let data = RefCell::new(vec![]);
data.borrow_mut().push(1); // Detected as mutation
}
// Unsafe code detection
fn unsafe_side_effect() {
unsafe {
// Automatically flagged as potentially impure
}
}
}
Side Effect Classification
Side effects are categorized by severity:
Pure - No Side Effects
No mutations, I/O, or global state changes:
#![allow(unused)]
fn main() {
// Pure - only computation
fn fibonacci(n: u32) -> u32 {
match n {
0 => 0,
1 => 1,
_ => fibonacci(n - 1) + fibonacci(n - 2),
}
}
}
Benign - Small Penalty
Only logging, tracing, or metrics:
#![allow(unused)]
fn main() {
use tracing::debug;
// Benign - logging side effect
fn process(value: i32) -> i32 {
debug!("Processing value: {}", value);
value * 2
}
}
Benign side effects receive a small penalty in purity scoring. Logging and observability are recognized as practical necessities.
Impure - Large Penalty
I/O, mutations, network operations:
#![allow(unused)]
fn main() {
// Impure - file I/O
fn save_to_file(data: &str) -> std::io::Result<()> {
std::fs::write("output.txt", data)
}
// Impure - network operation
async fn fetch_data(url: &str) -> Result<String, reqwest::Error> {
reqwest::get(url).await?.text().await
}
}
Impure side effects receive a large penalty in purity scoring.
Purity Metrics
For each function, debtmap calculates purity metrics through the functional composition analysis (src/analysis/functional_composition.rs). These metrics are computed by analyze_composition() and returned in CompositionMetrics and PurityMetrics:
- has_mutable_state - Whether the function uses mutable bindings
- has_side_effects - Whether I/O or global mutations are detected
- immutability_ratio - Ratio of immutable to total bindings (0.0-1.0)
- is_const_fn - Whether declared as const fn
- side_effect_kind - Classification: Pure, Benign, or Impure
- purity_score - Overall purity score (0.0 impure to 1.0 pure)
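As a rough picture of the shape these metrics take (field names from the list above; the exact types and enum layout are assumptions, not the crate's definitions):
// Illustrative only - mirrors the fields documented above.
enum SideEffectKind {
    Pure,
    Benign, // logging, tracing, metrics
    Impure, // I/O, mutation, network
}

struct PurityMetrics {
    has_mutable_state: bool,          // any mutable bindings?
    has_side_effects: bool,           // I/O or global mutation detected?
    immutability_ratio: f64,          // immutable / total bindings, 0.0-1.0
    is_const_fn: bool,                // declared as const fn?
    side_effect_kind: SideEffectKind, // Pure | Benign | Impure
    purity_score: f64,                // 0.0 (impure) to 1.0 (pure)
}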
Immutability Ratio
The immutability ratio measures how much of a function’s local state is immutable:
#![allow(unused)]
fn main() {
fn example() {
let x = 10; // Immutable
let y = 20; // Immutable
let mut z = 30; // Mutable
z += 1;
// immutability_ratio = 2/3 = 0.67
}
}
Higher immutability ratios contribute to better purity scores.
Composition Pattern Recognition
Function Composition
#![allow(unused)]
fn main() {
// Detected composition pattern
fn process_data(input: String) -> Result<Output> {
input
.parse()
.map(validate)
.and_then(transform)
.map(normalize)
}
}
Higher-Order Functions
#![allow(unused)]
fn main() {
// Detected HOF pattern
fn apply_twice<F>(f: F, x: i32) -> i32
where
F: Fn(i32) -> i32,
{
f(f(x))
}
}
Map/Filter/Fold Chains
#![allow(unused)]
fn main() {
// Detected functional pipeline
let result = items
.iter()
.filter(|x| x.is_valid())
.map(|x| x.transform())
.fold(0, |acc, x| acc + x);
}
Composition Quality Scoring
Debtmap combines pipeline metrics and purity analysis into an overall composition quality score (0.0-1.0).
Scoring Factors
The composition quality score considers:
- Pipeline depth - Longer pipelines indicate more functional composition
- Purity score - Higher purity means better functional programming
- Immutability ratio - More immutable bindings improve the score
- Closure complexity - Simpler closures score better
- Parallel execution - Parallel pipelines receive bonuses
- Nested pipelines - Sophisticated composition patterns score higher
Quality Thresholds
Based on your configuration profile, functions with composition quality above the threshold receive score boosts in debtmap’s overall analysis:
- Strict: Quality ≥ 0.7 required for boost
- Balanced: Quality ≥ 0.6 required for boost
- Lenient: Quality ≥ 0.4 required for boost
High-quality functional code can offset complexity in other areas of your codebase.
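The exact blend is internal to Debtmap, but a minimal sketch of how such a score might combine these factors (weights and normalization here are assumptions for illustration only):
// Hypothetical blend of the factors above into a 0.0-1.0 score.
fn composition_quality(
    pipeline_depth: u32,     // stages in the longest pipeline
    purity_score: f64,       // 0.0-1.0
    immutability_ratio: f64, // 0.0-1.0
    max_closure_complexity: u32,
    is_parallel: bool,
) -> f64 {
    let depth_score = (pipeline_depth as f64 / 5.0).min(1.0);
    let closure_score = 1.0 / (1.0 + max_closure_complexity as f64 / 5.0);
    let base = 0.3 * depth_score
        + 0.3 * purity_score
        + 0.2 * immutability_ratio
        + 0.2 * closure_score;
    let bonus = if is_parallel { 0.1 } else { 0.0 };
    (base + bonus).min(1.0)
}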
Purity Scoring
Distribution Analysis
Debtmap calculates purity distribution:
- Pure functions: 0 side effects detected
- Mostly pure: Minor side effects (e.g., logging)
- Impure: Multiple side effects
- Highly impure: Extensive state mutation and I/O
Scoring Formula
Purity Score = (pure_functions / total_functions) × 100
Side Effect Density = total_side_effects / total_functions
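For example (hypothetical counts): a crate with 200 functions, 110 of them pure and 50 detected side effects in total, scores:
Purity Score = (110 / 200) × 100 = 55%
Side Effect Density = 50 / 200 = 0.25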
Codebase Health Metrics
Target Purity Levels:
- Core business logic: 80%+ pure
- Utilities: 70%+ pure
- I/O layer: 20-30% pure (expected)
- Overall: 50%+ pure
Integration with Risk Scoring
Functional composition quality integrates with debtmap’s risk scoring system and multi-signal aggregation framework:
- High composition quality → Lower risk scores (functions with quality above threshold receive score boosts)
- Pure functions → Reduced god object penalties (via weight multipliers in purity_analyzer.rs)
- Deep pipelines → Bonus for functional patterns
- Impure side effects → Risk penalties applied
Multi-Signal Integration: Functional composition analysis is one of several signals aggregated in the unified analysis system (src/builders/unified_analysis.rs and src/analysis/multi_signal_aggregation.rs) alongside complexity metrics, god object detection, and risk assessment. This ensures that functional programming quality contributes to the comprehensive technical debt assessment across multiple dimensions.
This integration ensures that well-written functional code is properly rewarded in the overall technical debt assessment.
Practical Examples
Example 1: Detecting Imperative vs Functional Code
Imperative style (lower composition quality):
#![allow(unused)]
fn main() {
fn process_items_imperative(items: Vec<i32>) -> Vec<i32> {
let mut results = Vec::new();
for item in items {
if item > 0 {
results.push(item * 2);
}
}
results
}
// Detected: No pipelines, mutable state, lower purity score
}
Functional style (higher composition quality):
#![allow(unused)]
fn main() {
fn process_items_functional(items: Vec<i32>) -> Vec<i32> {
items.iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.collect()
}
// Detected: Pipeline depth 3, pure function, high composition quality
}
Example 2: Identifying Refactoring Opportunities
When debtmap detects low composition quality, it suggests refactoring:
#![allow(unused)]
fn main() {
// Original: Imperative with mutations
fn calculate_statistics(data: &[f64]) -> (f64, f64, f64) {
let mut sum = 0.0;
let mut min = f64::MAX;
let mut max = f64::MIN;
for &value in data {
sum += value;
if value < min { min = value; }
if value > max { max = value; }
}
(sum / data.len() as f64, min, max)
}
// Refactored: Functional style
fn calculate_statistics_functional(data: &[f64]) -> (f64, f64, f64) {
let sum: f64 = data.iter().sum();
let min = data.iter().min_by(|a, b| a.partial_cmp(b).unwrap()).unwrap();
let max = data.iter().max_by(|a, b| a.partial_cmp(b).unwrap()).unwrap();
(sum / data.len() as f64, *min, *max)
}
// Higher purity score, multiple pipelines detected
}
Example 3: Using Profiles for Different Codebases
Strict profile - Catches subtle functional patterns:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile strict src/
# Detects pipelines with 3+ stages
# Requires purity ≥ 0.9 for "pure" classification
# Flags closures with complexity > 3
Balanced profile - Default for most projects:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile balanced src/
# Detects pipelines with 2+ stages
# Requires purity ≥ 0.8 for "pure" classification
# Flags closures with complexity > 5
Lenient profile - For legacy code:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile lenient src/
# Detects pipelines with 2+ stages
# Requires purity ≥ 0.5 for "pure" classification
# Flags closures with complexity > 10
Example 4: Interpreting Purity Scores
Pure function (score: 1.0):
#![allow(unused)]
fn main() {
fn add(a: i32, b: i32) -> i32 {
a + b
}
// Purity: 1.0 (perfect)
// Immutability ratio: 1.0 (no bindings)
// Side effects: None
}
Mostly pure (score: 0.8):
#![allow(unused)]
fn main() {
fn process(values: &[i32]) -> i32 {
let doubled: Vec<_> = values.iter().map(|x| x * 2).collect();
let sum: i32 = doubled.iter().sum();
sum
}
// Purity: 0.8 (high)
// Immutability ratio: 1.0 (both bindings immutable)
// Side effects: None
// Pipelines: 2 detected
}
Impure function (score: 0.2):
#![allow(unused)]
fn main() {
fn log_and_process(values: &mut Vec<i32>) {
println!("Processing {} items", values.len());
values.iter_mut().for_each(|x| *x *= 2);
}
// Purity: 0.2 (low)
// Immutability ratio: 0.0 (mutable parameter)
// Side effects: I/O (println), mutation
}
Best Practices
Writing Functional Rust Code
To achieve high composition quality scores:
- Prefer iterator chains over manual loops
  // Good
  let evens: Vec<_> = items.iter().filter(|x| *x % 2 == 0).collect();
  // Avoid
  let mut evens = Vec::new();
  for item in &items {
      if item % 2 == 0 {
          evens.push(item);
      }
  }
- Minimize mutable state
  // Good
  let result = calculate(input);
  // Avoid
  let mut result = 0;
  result = calculate(input);
- Separate pure logic from side effects
  // Good - pure computation
  fn calculate_price(quantity: u32, unit_price: f64) -> f64 {
      quantity as f64 * unit_price
  }
  // Good - I/O at the boundary
  fn display_price(price: f64) {
      println!("Total: ${:.2}", price);
  }
- Keep closures simple
  // Good - simple closure
  items.map(|x| x * 2)
  // Consider extracting - complex closure
  items.map(|x| {
      let temp = expensive_operation(x);
      transform(temp)
  })
  // Better
  fn transform_item(x: i32) -> i32 {
      let temp = expensive_operation(x);
      transform(temp)
  }
  items.map(transform_item)
- Use parallel iteration for CPU-intensive work
  use rayon::prelude::*;
  let results: Vec<_> = large_dataset.par_iter()
      .map(|item| expensive_computation(item))
      .collect();
Code Organization
Separate pure from impure:
- Keep pure logic in core modules
- Isolate I/O at boundaries
- Use dependency injection for testability
Maximize purity in:
- Business logic
- Calculations and transformations
- Validation functions
- Data structure operations
Accept impurity in:
- I/O layers
- Logging and monitoring
- External system integration
- Application boundaries
Refactoring strategy:
- Identify impure functions
- Extract pure logic
- Push side effects to boundaries
- Test pure functions exhaustively
Migration Guide
To enable functional analysis on existing projects:
- Start with lenient profile to understand current state:
  debtmap analyze --ast-functional-analysis --functional-analysis-profile lenient .
- Identify quick wins - functions that are almost functional:
  - Look for loops that can become iterator chains
  - Find mutable variables that can be immutable
  - Spot side effects that can be extracted
- Gradually refactor to functional patterns:
  - Convert one function at a time
  - Run tests after each change
  - Measure improvements with debtmap
- Tighten profile as codebase improves:
  # After refactoring
  debtmap analyze --ast-functional-analysis --functional-analysis-profile balanced .
  # For new modules
  debtmap analyze --ast-functional-analysis --functional-analysis-profile strict src/new_module/
- Monitor composition quality trends over time
Use Cases
Code Quality Audit
# Assess functional purity
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced --format markdown
Refactoring Targets
# Find impure functions in core logic
debtmap analyze src/core/ --ast-functional-analysis --functional-analysis-profile strict
Onboarding Guide
# Show functional patterns in codebase
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced --summary
Troubleshooting
“No pipelines detected” but I have iterator chains
- Check pipeline depth: Your chains may be too short for the profile
- Strict requires 3+ stages
- Balanced/Lenient require 2+ stages
- Check for builder patterns: Method chaining for construction is filtered out
- Verify terminal operation: Ensure the chain ends with collect(), sum(), etc.
“Low purity score” for seemingly pure functions
- Check for hidden side effects:
  - println! or logging statements
  - Calls to impure helper functions
  - unsafe blocks
- Review immutability ratio: Unnecessary mut bindings lower the score
- Verify no I/O operations: File access, network calls affect purity
“High complexity closures flagged”
- Extract complex closures into named functions:
  // Instead of
  items.map(|x| { /* 10 lines */ })
  // Use
  fn process_item(x: Item) -> Result { /* 10 lines */ }
  items.map(process_item)
- Adjust max_closure_complexity: Consider lenient profile if needed
- Refactor closure logic: Break down complex operations
Too Many False Positives
Issue: Pure functions flagged as impure
Solution:
- Use lenient profile
- Suppress known patterns
- Review detection criteria
- Report false positives
Missing Side Effects
Issue: Known impure functions not detected
Solution:
- Use strict profile
- Check for exotic side effect patterns
- Enable comprehensive analysis
Performance impact concerns
- Spec 111 targets <10% overhead: Performance impact should be minimal
- Disable for hot paths: Analyze functional patterns in separate runs if needed
- Use parallel processing: Leverage multi-core parallelism for faster analysis
Related Chapters
- Analysis Guide - Understanding analysis types and methodologies
- Complexity Metrics - How functional patterns affect complexity metrics
- Scoring Strategies - Integration with overall technical debt scoring
- God Object Detection - How purity weights reduce false positives
- Configuration - Advanced functional analysis configuration options
Summary
Functional composition analysis helps you:
- Identify functional patterns in your Rust codebase through AST-based pipeline detection
- Measure purity with side effect detection and immutability analysis
- Improve code quality by refactoring imperative code to functional style
- Get scoring benefits for high-quality functional programming patterns
- Choose appropriate profiles (strict/balanced/lenient) for different codebases
Enable it with --ast-functional-analysis, and select a profile with --functional-analysis-profile, to start benefiting from functional programming insights in your technical debt analysis.
God Object Detection
Overview
Debtmap includes sophisticated god object detection that identifies files and types that have grown too large and taken on too many responsibilities. God objects (also called “god classes” or “god modules”) are a significant source of technical debt as they:
- Violate the Single Responsibility Principle
- Become difficult to maintain and test effectively
- Create bottlenecks in development
- Have high coupling with many other modules, increasing the risk of bugs
This chapter explains how Debtmap identifies god objects, calculates their scores, and provides actionable refactoring recommendations.
Detection Criteria
Debtmap uses two distinct detection strategies depending on the file structure:
God Class Criteria
A struct/class is classified as a god class when it violates multiple thresholds:
- Method Count - Number of impl methods on the struct
- Field Count - Number of struct/class fields
- Responsibility Count - Distinct responsibilities inferred from method names (max_traits in config)
- Lines of Code - Estimated lines for the struct and its impl blocks
- Complexity Sum - Combined cyclomatic complexity of struct methods
Note: All five criteria are evaluated by the determine_confidence function to calculate confidence levels. Each criterion that exceeds its threshold contributes to the violation count.
God Module Criteria
A file is classified as a god module when it has excessive standalone functions:
- Standalone Function Count - Total standalone functions (not in impl blocks)
- Responsibility Count - Distinct responsibilities across all functions
- Lines of Code - Total lines in the file
- Complexity Sum - Combined cyclomatic complexity (estimated as function_count × 5)
Key Difference: God class detection focuses on a single struct’s methods, while god module detection counts standalone functions across the entire file.
Language-Specific Thresholds
Rust
- Max Methods: 20 (includes both impl methods and standalone functions)
- Max Fields: 15
- Max Responsibilities: 5
- Max Lines: 1000
- Max Complexity: 200
Python
- Max Methods: 15
- Max Fields: 10
- Max Responsibilities: 3
- Max Lines: 500
- Max Complexity: 150
JavaScript/TypeScript
- Max Methods: 15
- Max Fields: 20
- Max Responsibilities: 3
- Max Lines: 500
- Max Complexity: 150
Note: TypeScript uses the same thresholds as JavaScript since both languages have similar structural patterns. The implementation treats them identically for god object detection purposes.
These thresholds can be customized per-language in your .debtmap.toml configuration file.
God Class vs God Module Detection
Debtmap distinguishes between two distinct types of god objects:
God Class Detection
A god class is a single struct/class with excessive methods and fields. Debtmap analyzes the largest type in a file using:
- Find the largest type (struct/class) by method_count + field_count × 2
- Check against thresholds:
- Rust: >20 methods, >15 fields
- Python: >15 methods, >10 fields
- JavaScript/TypeScript: >15 methods, >20 fields
Example: A struct with 25 methods and 18 fields would be flagged as a god class.
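Using the ranking metric above, that struct scores 25 + 18 × 2 = 61 when identifying the file's dominant type.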
God Module Detection
A god module is a file with excessive standalone functions (no dominant struct). Debtmap counts standalone functions when:
- No struct/class is found, OR
- The file has many standalone functions outside of any impl blocks
Implementation Detail: All three detection types (GodClass, GodFile, GodModule) share a single unified DebtType::GodObject variant. The specific detection type is distinguished using the detection_type field within the variant:
#![allow(unused)]
fn main() {
DebtType::GodObject {
methods: u32,
fields: Option<u32>,
responsibilities: u32,
god_object_score: Score0To100,
lines: u32,
}
}
The detection_type field (tracked separately in the analysis) indicates whether this god object was detected as:
- GodClass - Single struct with excessive methods/fields (has fields: Some(count))
- GodFile - File with excessive functions or lines of code (has fields: None)
- GodModule - Same as GodFile, used conceptually for files with many standalone functions (has fields: None)
This unified structure ensures consistent handling across all god object types while the fields value (Some vs None) provides the key distinction between class-based and file-based detection.
Example: A file like rust_call_graph.rs with 270 standalone functions would be flagged as a god module (using the GodFile/GodModule detection type).
Why Separate Analysis?
Previously, Debtmap combined standalone functions with struct methods, causing false positives for functional/procedural modules. The current implementation analyzes them separately to:
- Avoid penalizing pure functional modules
- Distinguish between architectural issues (god class) and organizational issues (god module)
- Provide more accurate refactoring recommendations
Key Distinction: A file containing a struct with 15 methods plus 20 standalone functions is analyzed as:
- God Class: No (15 methods < 20 threshold)
- God Module: Possibly (20 standalone functions, approaching threshold)
See src/organization/god_object/detector.rs and src/organization/god_object/classifier.rs for implementation details.
Confidence Levels
Debtmap assigns confidence levels based solely on the number of thresholds violated:
- Definite (5 violations) - All five metrics exceed thresholds - clear god object requiring immediate refactoring
- Probable (3-4 violations) - Most metrics exceed thresholds - likely god object that should be refactored
- Possible (1-2 violations) - Some metrics exceed thresholds - potential god object worth reviewing
- NotGodObject (0 violations) - All metrics within acceptable limits
Note: The confidence level is determined by violation count alone. The god object score (calculated separately) is used for prioritization and ranking, but does not affect the confidence classification.
Example: Consider two files both with violation_count=2 (Possible confidence):
- File A: 21 methods, 16 fields (just over the threshold)
- File B: 100 methods, 50 fields (severely over the threshold)
Both receive the same “Possible” confidence level, but File B will have a much higher god object score for prioritization purposes. This separation ensures consistent confidence classification while still allowing scores to reflect severity.
See src/organization/god_object/classifier.rs:582 for the determine_confidence function.
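The mapping is simple enough to sketch (a simplification of determine_confidence; the enum and signature here are illustrative, not the source definition):
enum Confidence {
    Definite,
    Probable,
    Possible,
    NotGodObject,
}

// Confidence depends only on how many of the five thresholds are violated.
fn determine_confidence(violation_count: u32) -> Confidence {
    match violation_count {
        0 => Confidence::NotGodObject,
        1..=2 => Confidence::Possible,
        3..=4 => Confidence::Probable,
        _ => Confidence::Definite, // all five violated
    }
}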
Scoring Algorithms
Debtmap provides three scoring algorithms to accommodate different analysis needs.
Simple Scoring
The base scoring algorithm calculates god object score using four factors:
method_factor = min(method_count / max_methods, 3.0)
field_factor = min(field_count / max_fields, 3.0)
responsibility_factor = min(responsibility_count / 3, 3.0)
size_factor = min(lines_of_code / max_lines, 3.0)
base_score = method_factor × field_factor × responsibility_factor × size_factor
Score Enforcement:
- If violation_count > 0: final_score = max(base_score × 20 × violation_count, min_score)
  - Where min_score is: 1 violation → 30.0, 2 violations → 50.0, 3+ violations → 70.0
- Else: final_score = base_score × 10
Source: src/organization/god_object/scoring.rs:44-90
The graduated minimum scores ensure that severity matches the number of violations while preventing over-flagging of moderate files.
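A worked example using the Rust defaults (file metrics hypothetical: 30 methods, 18 fields, 6 responsibilities, 1,200 lines, complexity within limits):
method_factor = min(30 / 20, 3.0) = 1.5
field_factor = min(18 / 15, 3.0) = 1.2
responsibility_factor = min(6 / 3, 3.0) = 2.0
size_factor = min(1200 / 1000, 3.0) = 1.2
base_score = 1.5 × 1.2 × 2.0 × 1.2 = 4.32
violation_count = 4 (methods, fields, responsibilities, lines)
final_score = max(4.32 × 20 × 4, 70.0) ≈ 345.6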
Complexity-Weighted Scoring
Unlike raw method counting, this algorithm weights each method by its cyclomatic complexity. This ensures that 100 simple functions (complexity 1-3) score better than 10 highly complex functions (complexity 17+).
The formula is similar to simple scoring, but uses weighted_method_count (sum of complexity weights) instead of raw counts:
method_factor = min(weighted_method_count / max_methods, 3.0)
Additionally, a complexity factor is applied:
- Average complexity < 3.0: 0.7 (reward simple functions)
- Average complexity > 10.0: 1.5 (penalize complex functions)
- Otherwise: 1.0
The final score becomes:
final_score = max(base_score × 20 × complexity_factor × violation_count, min_score)
Where min_score is: 1 violation → 30.0, 2 violations → 50.0, 3+ violations → 70.0
This approach better reflects the true maintainability burden of a large module while using conservative scaling to prevent small files from being over-flagged.
Source: src/organization/god_object/scoring.rs:118-178
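Continuing the worked example above (values still hypothetical): if the same file's functions are mostly simple, with weighted_method_count = 24 and average complexity 2.5, then:
method_factor = min(24 / 20, 3.0) = 1.2
base_score = 1.2 × 1.2 × 2.0 × 1.2 = 3.456
final_score = max(3.456 × 20 × 0.7 × 4, 70.0) ≈ 193.5
The 0.7 reward for simple functions cuts the score by nearly half compared to the raw-count calculation.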
Purity-Weighted Scoring (Advanced)
Available for Rust only (requires syn::ItemFn analysis)
This advanced scoring variant combines both complexity weighting and purity analysis, building on top of complexity-weighted scoring to further reduce the impact of pure functions. This prevents pure functional modules from being unfairly penalized. The algorithm:
- Analyzes each function for purity using three levels:
  - Pure (no side effects): Functions with read-only operations, no I/O, no mutation
    - Weight multiplier: 0.3
    - Examples: calculate_sum(), format_string(), is_valid()
  - Probably Pure (likely no side effects): Functions that appear pure but may have hidden side effects
    - Weight multiplier: 0.5
    - Examples: Functions using trait methods (could have side effects), generic operations
  - Impure (has side effects): Functions with clear side effects like I/O, mutation, external calls
    - Weight multiplier: 1.0
    - Examples: save_to_file(), update_state(), send_request()
- Purity Detection Heuristics:
  - Pure indicators: No mut references, no I/O operations, no external function calls
  - Impure indicators: File/network operations, mutable state, database access, logging
  - Probably Pure: Generic functions, trait method calls, or ambiguous patterns
- Combines complexity and purity weights to calculate the total contribution:
  total_weight = complexity_weight × purity_multiplier
  This means pure functions get both the complexity-based weight AND the purity multiplier applied together.
  Example: A pure function with complexity 5 contributes only 5 × 0.3 = 1.5 to the weighted count (compared to 5.0 for an impure function of the same complexity).
- Tracks the PurityDistribution:
  - pure_count, probably_pure_count, impure_count
  - pure_weight_contribution, probably_pure_weight_contribution, impure_weight_contribution
Impact: A file with 100 pure helper functions (total complexity 150) might have a weighted method count of only 150 × 0.3 = 45, avoiding false positives while still catching stateful god objects with many impure methods.
See src/organization/god_object/scoring.rs and src/organization/purity_analyzer.rs.
Responsibility Detection
Responsibilities are inferred from method names using behavioral heuristics. Debtmap recognizes the following categories based on the BehaviorCategory enum (src/organization/behavioral_decomposition/types.rs:8-41):
| Category | Method Patterns | Examples |
|---|---|---|
| Lifecycle | new, init, setup, destroy, cleanup | Initialization and teardown |
| State Management | update_*, mutate_*, *_state | State transitions and mutations |
| Rendering | render, draw, paint, display, format | Display and formatting |
| Event Handling | handle_*, on_* | Event dispatchers and handlers |
| Persistence | save, load, serialize, deserialize | Data storage and retrieval |
| Validation | validate_*, check_*, verify_*, ensure_*, is_* | Data validation |
| Computation | calculate, compute, evaluate | Pure deterministic calculations |
| Parsing | parse, read, extract, decode, unmarshal, scan | Data parsing and reading |
| Filtering | filter, select, find, search, query, lookup, match | Search and filtering |
| Transformation | transform, convert, map, apply, adapt | Data transformation |
| Data Access | get, set, fetch, retrieve, access | Getter/setter methods |
| Construction | create, build, make, construct | Object construction |
| Processing | process, handle, execute, run | General processing |
| Communication | send, receive, transmit, broadcast, notify | Inter-process communication |
| Utilities | with_*, from_*, into_*, as_*, default, any | Helpers, builders, factories, converters |
| Domain | (no match) | Domain-specific with custom name |
Note: The Domain category serves as the fallback when no behavioral pattern matches. It extracts a domain name from the method’s first word to provide context-specific categorization.
Classification Order: More specific categories are checked first (Construction before Lifecycle, Validation before Rendering) to ensure accurate categorization. See BehavioralCategorizer::categorize_method() in src/organization/behavioral_decomposition/categorization.rs:32.
Distinct Responsibility Counting: Debtmap counts the number of unique responsibility categories used by a struct/module’s methods. A high responsibility count (e.g., >5) indicates the module is handling too many different concerns, violating the Single Responsibility Principle.
Responsibility count directly affects:
- God object scoring (via
responsibility_factor) - Refactoring recommendations (methods grouped by responsibility for suggested splits)
- Detection confidence (counted as one of the five violation criteria)
See BehavioralCategorizer::categorize_method() in src/organization/behavioral_decomposition/categorization.rs:32 and infer_responsibility_with_confidence() in src/organization/god_object/classifier.rs:662.
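A simplified sketch of this name-based categorization (patterns drawn from the table above; the real categorizer covers far more cases and checks specific categories before general ones):
fn categorize_method(name: &str) -> &'static str {
    // More specific categories first, as noted above.
    if name.starts_with("create") || name.starts_with("build") || name.starts_with("make") {
        "Construction"
    } else if name.starts_with("validate_") || name.starts_with("check_") || name.starts_with("is_") {
        "Validation"
    } else if name.starts_with("handle_") || name.starts_with("on_") {
        "EventHandling"
    } else if name.starts_with("parse") || name.starts_with("decode") {
        "Parsing"
    } else if name.starts_with("get") || name.starts_with("set") || name.starts_with("fetch") {
        "DataAccess"
    } else {
        "Domain" // fallback: derive a domain-specific category from the first word
    }
}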
Examples and Case Studies
Example 1: Large Rust Module
File: rust_call_graph.rs with 270 standalone functions
Detection:
- Is God Object: Yes
- Method Count: 270
- Field Count: 0 (no struct)
- Responsibilities: 8
- Confidence: Definite
- Score: >1000 (severe violation)
Recommendation: Break into multiple focused modules:
CallGraphBuilder(construction methods)CallGraphAnalyzer(analysis methods)CallGraphFormatter(output methods)
Example 2: Complex Python Class
File: data_manager.py with class containing 25 methods and 12 fields
Detection:
- Is God Object: Yes
- Method Count: 25
- Field Count: 12
- Responsibilities: 6 (Data Access, Validation, Persistence, etc.)
- Confidence: Probable
- Score: ~150-200
Recommendation: Split by responsibility:
DataAccessLayer(get/set methods)DataValidator(validate/check methods)DataPersistence(save/load methods)
Example 3: Mixed Paradigm File (God Module)
File: utils.rs with small struct (5 methods, 3 fields) + 60 standalone functions
Detection:
- God Class (struct): No (5 methods < 20 threshold, 3 fields < 15 threshold)
- God Module (file): Yes (60 standalone functions > 50 threshold)
- Confidence: Probable
- Score: ~120
Analysis: The struct and standalone functions are analyzed separately. The struct is not a god class, but the file is a god module due to the excessive standalone functions. This indicates an overgrown utility module that should be split into smaller, focused modules.
Recommendation: Split standalone functions into focused utility modules:
StringUtils(formatting, parsing)FileUtils(file operations)MathUtils(calculations)
Refactoring Recommendations
When is_god_object = true, Debtmap generates recommended module splits using the recommended_splits field (see src/organization/god_object/split_types.rs for the ModuleSplit type). This feature:
- Groups methods by their inferred responsibilities
- Creates a ModuleSplit for each responsibility group containing:
  - suggested_name (e.g., “DataAccessManager”, “ValidationManager”)
  - methods_to_move (list of method names)
  - responsibility (category name)
  - estimated_lines (approximate LOC for the new module)
- Orders splits by cohesion (most focused responsibility groups first)
Example output:
Recommended Splits:
1. DataAccessManager (12 methods, ~150 lines)
2. ValidationManager (8 methods, ~100 lines)
3. PersistenceManager (5 methods, ~75 lines)
This provides an actionable roadmap for breaking down god objects into focused, single-responsibility modules.
See src/organization/god_object/split_types.rs:35-125 for the ModuleSplit struct definition and src/organization/god_object/core_types.rs:80 for how splits are stored in detection results.
Code Examples
Split by Responsibility
#![allow(unused)]
fn main() {
// Before: UserManager (god object)
struct UserManager { ... }
// After: Split into focused modules
struct AuthService { ... }
struct ProfileService { ... }
struct PermissionService { ... }
struct NotificationService { ... }
}
Extract Common Functionality
#![allow(unused)]
fn main() {
// Extract shared dependencies
struct ServiceContext {
db: Database,
cache: Cache,
logger: Logger,
}
// Each service gets a reference
struct AuthService<'a> {
context: &'a ServiceContext,
}
}
Use Composition
#![allow(unused)]
fn main() {
// Compose services instead of inheriting
struct UserFacade {
auth: AuthService,
profile: ProfileService,
permissions: PermissionService,
}
impl UserFacade {
fn login(&mut self, credentials: Credentials) -> Result<Session> {
self.auth.login(credentials)
}
}
}
Configuration
TOML Configuration
Add a [god_object_detection] section to your .debtmap.toml:
[god_object_detection]
enabled = true
[god_object_detection.rust]
max_methods = 20
max_fields = 15
max_traits = 5 # max_traits = max responsibilities
max_lines = 1000
max_complexity = 200
# Note: The configuration field is named 'max_traits' for historical reasons,
# but it controls the maximum number of responsibilities/concerns, not Rust traits.
# This is a legacy naming issue from early development.
[god_object_detection.python]
max_methods = 15
max_fields = 10
max_traits = 3
max_lines = 500
max_complexity = 150
[god_object_detection.javascript]
max_methods = 15
max_fields = 20
max_traits = 3
max_lines = 500
max_complexity = 150
Note: enabled defaults to true. Set to false to disable god object detection entirely (equivalent to --no-god-object CLI flag).
See src/config/display.rs:90-115 for GodObjectConfig and src/organization/god_object/thresholds.rs:63-113 for GodObjectThresholds.
Tuning for Your Project
Strict mode (smaller modules):
[god_object_detection.rust]
max_methods = 15
max_fields = 10
max_traits = 3
Lenient mode (larger modules acceptable):
[god_object_detection.rust]
max_methods = 30
max_fields = 20
max_traits = 7
CLI Options
Debtmap provides several CLI flags to control god object detection behavior:
--no-god-object
Disables god object detection entirely.
debtmap analyze . --no-god-object
Use case: When you only want function-level complexity analysis without file-level aggregation.
--aggregate-only
Shows only file-level god object scores, hiding individual function details.
debtmap analyze . --aggregate-only
Use case: High-level overview of which files are god objects without function-by-function breakdowns.
--no-aggregation
Disables file-level aggregation, showing only individual function metrics.
debtmap analyze . --no-aggregation
Use case: Detailed function-level analysis without combining into file scores.
--aggregation-method <METHOD>
Chooses how to combine function scores into file-level scores:
- sum - Add all function scores
- weighted_sum - Weight by complexity (default)
- logarithmic_sum - Logarithmic scaling for large files
- max_plus_average - Max score + average of others
debtmap analyze . --aggregation-method logarithmic_sum
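The exact formulas are internal to Debtmap, but the intent of each strategy can be sketched roughly (illustrative only, not the actual implementation; assumes a non-empty score list):
// `scores` are per-function debt scores within one file.
fn sum(scores: &[f64]) -> f64 {
    scores.iter().sum()
}

fn weighted_sum(scores_with_weights: &[(f64, f64)]) -> f64 {
    // (score, complexity weight) pairs - complex functions count for more.
    scores_with_weights.iter().map(|(s, w)| s * w).sum()
}

fn logarithmic_sum(scores: &[f64]) -> f64 {
    // Dampens very large files: grows logarithmically with total score.
    (1.0 + scores.iter().sum::<f64>()).ln()
}

fn max_plus_average(scores: &[f64]) -> f64 {
    // Worst offender plus an average term (simplified: average over all scores).
    let max = scores.iter().cloned().fold(f64::MIN, f64::max);
    let avg = scores.iter().sum::<f64>() / scores.len() as f64;
    max + avg
}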
--min-problematic <N>
Sets minimum number of problematic functions required for file-level aggregation.
debtmap analyze . --min-problematic 3
Use case: Avoid flagging files with only 1-2 complex functions as god objects.
See features.json:65-71 and features.json:507-512.
Output Display
File-Level Display
When a god object is detected, Debtmap displays:
⚠️ God Object: 270 methods, 0 fields, 8 responsibilities
Score: 1350 (Confidence: Definite)
Function-Level Display
Within a god object file, individual functions show:
├─ ⚠️ God Object: 45 methods, 20 fields, 5 responsibilities
│ Score: 250 (Confidence: Probable)
The ⚠️ God Object indicator makes it immediately clear which files need architectural refactoring.
Integration with File-Level Scoring
God object detection affects the overall technical debt prioritization through a god object multiplier:
god_object_multiplier = 2.0 + normalized_god_object_score
Normalization
The normalized_god_object_score is scaled to the 0-1 range using:
normalized_score = min(god_object_score / max_expected_score, 1.0)
Where max_expected_score is typically based on the maximum score in the analysis (e.g., 1000 for severe violations).
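The arithmetic is small enough to sketch directly (illustrative only; 1000 is the example max_expected_score from above):
fn god_object_multiplier(god_object_score: f64, max_expected_score: f64) -> f64 {
    // Scale the raw score into 0-1, capping severe violations at 1.0
    let normalized = (god_object_score / max_expected_score).min(1.0);
    // Baseline 2.0 for non-god objects, rising to 3.0 for severe ones
    2.0 + normalized
}
// god_object_multiplier(0.0, 1000.0)    -> 2.0 (baseline)
// god_object_multiplier(200.0, 1000.0)  -> 2.2 (moderate)
// god_object_multiplier(1000.0, 1000.0) -> 3.0 (maximum)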
Impact on Prioritization
This multiplier means:
- Non-god objects (score = 0): multiplier = 2.0 (baseline)
- Moderate god objects (score = 200): multiplier ≈ 2.2-2.5
- Severe god objects (score = 1000+): multiplier ≈ 3.0 (maximum)
Result: God objects receive 2-3× higher priority in debt rankings, ensuring that:
- Functions within god objects inherit elevated scores due to architectural concerns
- God objects surface in the “top 10 most problematic” lists
- Architectural debt is weighted appropriately alongside function-level complexity
See the Scoring Strategies documentation for complete details on how this multiplier integrates into the overall debt calculation.
Metrics Tracking (Advanced)
For teams tracking god object evolution over time, Debtmap provides GodObjectMetrics with:
- Snapshots - Historical god object data per file
- Trends - Improving/Stable/Worsening classification (based on ±10 point score changes; sketched below)
- New God Objects - Files that crossed the threshold
- Resolved God Objects - Files that were refactored below thresholds
This enables longitudinal analysis: “Are we reducing god objects sprint-over-sprint?”
See src/organization/god_object_metrics.rs:1-228.
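The trend classification reduces to a comparison against the ±10 point band; a minimal sketch (type and function names are illustrative, not debtmap's actual API):
enum Trend {
    Improving,
    Stable,
    Worsening,
}

fn classify_trend(previous_score: f64, current_score: f64) -> Trend {
    let delta = current_score - previous_score;
    if delta <= -10.0 {
        Trend::Improving // score dropped by 10+ points
    } else if delta >= 10.0 {
        Trend::Worsening // score rose by 10+ points
    } else {
        Trend::Stable // within the ±10 point band
    }
}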
Troubleshooting
“Why is my functional module flagged as a god object?”
Answer: Debtmap now analyzes god classes (structs) separately from god modules (standalone functions). If your functional module with 100 pure helper functions is flagged, it’s being detected as a god module (not a god class), which indicates the file has grown too large and should be split for better organization.
Solutions:
- Accept the finding: 100+ functions in one file is difficult to navigate and maintain, even if each function is simple
- Split by responsibility: Organize functions into smaller, focused modules (e.g., string_utils.rs, file_utils.rs, math_utils.rs)
- Use purity-weighted scoring (Rust only): Pure functions contribute only 0.3× weight, dramatically reducing scores for functional modules
- Adjust thresholds: Increase max_methods in .debtmap.toml if your project standards allow larger modules
“My god object score seems too high”
Answer: The scoring algorithm uses scaling (base_score × 20 × violation_count) to ensure god objects are prioritized. The multiplier was reduced from 50 to 20 for more conservative scoring.
Solutions:
- Check the violation count - 5 violations means severe issues
- Review each metric - are method count, field count, responsibilities, LOC, and complexity all high?
- Consider if the score accurately reflects maintainability burden
“Why does my test file show as a god object?”
Answer: Test files often have many test functions, which can trigger god module detection. However, this is usually expected for comprehensive test suites.
Solutions:
- Accept the finding: Large test files (100+ test functions) can be difficult to navigate and maintain
- Split by feature: Organize tests into smaller files grouped by the feature they test (e.g., user_auth_tests.rs, user_profile_tests.rs)
- Adjust thresholds: If your project standards accept large test files, increase max_methods in .debtmap.toml
- Use test organization: Group related tests in modules within the test file for better structure
Note: Debtmap does not automatically exclude test files from god object detection. Consider the trade-offs between comprehensive test coverage in one file versus better organization across multiple test files.
“Can I disable god object detection for specific files?”
Answer: Currently, god object detection is global. However, you can:
- Use --no-god-object to disable detection entirely
- Use --no-aggregation to skip file-level analysis
- Adjust thresholds in .debtmap.toml to be more lenient
Best Practices
To avoid god objects:
- Follow Single Responsibility Principle - Each module should have one clear purpose
- Regular Refactoring - Split modules before they reach thresholds
- Monitor Growth - Track method and field counts as modules evolve
- Use Composition - Prefer smaller, composable units over large monoliths
- Clear Boundaries - Define clear module interfaces and responsibilities
- Leverage Purity - Keep pure functions separate from stateful logic (reduces scores in Rust)
- Set Project Thresholds - Customize .debtmap.toml to match your team’s standards
Configuration Tradeoffs
Strict Thresholds (e.g., Rust: 10 methods):
- ✅ Catch problems early
- ✅ Enforce strong modularity
- ❌ May flag legitimate large modules
- ❌ More noise in reports
Lenient Thresholds (e.g., Rust: 50 methods):
- ✅ Reduce false positives
- ✅ Focus on egregious violations
- ❌ Miss real god objects
- ❌ Allow technical debt to grow
Recommended: Start with defaults, then adjust based on your codebase’s characteristics. Use metrics tracking to monitor trends over time.
Related Documentation
- Scoring Strategies - How god objects affect overall file scores
- Configuration - Complete .debtmap.toml reference
- CLI Reference - All command-line options
- Tiered Prioritization - How god objects are prioritized
Summary
God object detection is a powerful architectural analysis feature that:
- Identifies files/types violating single responsibility principle
- Provides multiple scoring algorithms (simple, complexity-weighted, purity-weighted)
- Generates actionable refactoring recommendations
- Integrates with file-level scoring for holistic debt prioritization
- Supports customization via TOML config and CLI flags
By combining quantitative metrics (method count, LOC, complexity) with qualitative analysis (responsibility detection, purity), Debtmap helps teams systematically address architectural debt.
Multi-Pass Analysis
Multi-pass analysis is enabled by default in debtmap. It performs two separate complexity analyses on your code to distinguish between genuine logical complexity and complexity artifacts introduced by code formatting. By comparing raw and normalized versions of your code, debtmap can attribute complexity to specific sources and provide actionable insights for refactoring.
Overview
Traditional complexity analysis treats all code as-is, which means formatting choices like multiline expressions, whitespace, and indentation can artificially inflate complexity metrics. Multi-pass analysis solves this problem by:
- Raw Analysis - Measures complexity of code exactly as written
- Normalized Analysis - Measures complexity after removing formatting artifacts
- Attribution - Compares the two analyses to identify complexity sources
The difference between raw and normalized complexity reveals how much “complexity” comes from formatting versus genuine logical complexity from control flow, branching, and nesting.
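The attribution arithmetic itself is simple; a minimal sketch (field names illustrative):
struct AttributionSummary {
    logical: u32,              // normalized complexity = genuine logic
    formatting_artifacts: u32, // raw - normalized = formatting noise
    formatting_impact: f64,    // fraction of raw complexity due to formatting
}

fn attribute(raw: u32, normalized: u32) -> AttributionSummary {
    let artifacts = raw.saturating_sub(normalized);
    AttributionSummary {
        logical: normalized,
        formatting_artifacts: artifacts,
        formatting_impact: if raw == 0 { 0.0 } else { artifacts as f64 / raw as f64 },
    }
}
// attribute(45, 32) -> 13 artifact points, impact ≈ 0.289 (the 28.9% shown in the sample output below)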
How It Works
Two-Pass Analysis Process
┌─────────────┐
│ Raw Code │
└──────┬──────┘
│
├─────────────────────┐
│ │
▼ ▼
┌──────────────┐ ┌────────────────────┐
│ Raw Analysis │ │ Normalize Formatting│
└──────┬───────┘ └─────────┬──────────┘
│ │
│ ▼
│ ┌──────────────────────┐
│ │ Normalized Analysis │
│ └─────────┬────────────┘
│ │
└──────────┬───────────┘
▼
┌──────────────────┐
│ Attribution │
│ Engine │
└─────────┬────────┘
│
┌─────────┴──────────┐
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ Insights │ │ Recommendations │
└─────────────────┘ └─────────────────┘
Raw Analysis examines your code as-is, capturing all complexity including:
- Logical control flow (if, loops, match, try/catch)
- Function calls and closures
- Formatting artifacts (multiline expressions, whitespace, indentation)
Normalized Analysis processes semantically equivalent code with standardized formatting:
- Removes excessive whitespace
- Normalizes multiline expressions to single lines where appropriate
- Standardizes indentation
- Preserves logical structure
Attribution Engine compares the results to categorize complexity sources:
- Logical Complexity - From control flow and branching (normalized result)
- Formatting Artifacts - From code formatting choices (difference between raw and normalized)
- Pattern Complexity - From recognized code patterns (error handling, validation, etc.)
Note: Pattern complexity analysis is part of the standard multi-pass analysis. No additional configuration is required to enable pattern detection.
CLI Usage
Multi-pass analysis runs by default. You can disable it if needed for performance-constrained scenarios:
# Basic analysis (multi-pass enabled by default)
debtmap analyze .
# Multi-pass with detailed attribution breakdown
debtmap analyze . --attribution
# Control detail level
debtmap analyze . --attribution --detail-level comprehensive
# Output as JSON for tooling integration
debtmap analyze . --attribution --json
# Disable multi-pass for faster single-pass analysis
debtmap analyze . --no-multi-pass
Available Flags
| Flag | Description |
|---|---|
| --no-multi-pass | Disable multi-pass analysis (use single-pass for performance) |
| --attribution | Show detailed complexity attribution breakdown (requires multi-pass) |
| --detail-level <level> | Set output detail: summary, standard, comprehensive, debug (CLI accepts lowercase values) |
| --json | Output results in JSON format |
Note: The --attribution flag requires multi-pass analysis to be enabled (the default), as attribution depends on comparing raw and normalized analyses. Use --no-multi-pass only when performance is critical.
Attribution Engine
The attribution engine breaks down complexity into three main categories, each with detailed tracking and suggestions.
Logical Complexity
Represents inherent complexity from your code’s control flow and structure:
- Function complexity - Cyclomatic and cognitive complexity per function
- Control flow - If statements, loops, match expressions
- Error handling - Try/catch blocks, Result/Option handling
- Closures and callbacks - Anonymous functions and callbacks
- Nesting levels - Depth of nested control structures
Each logical complexity component includes:
- Contribution - Complexity points from this construct
- Location - File, line, column, and span information
- Suggestions - Specific refactoring recommendations
Example:
#![allow(unused)]
fn main() {
// Function with high logical complexity
fn process_data(items: Vec<Item>) -> Result<Vec<Output>> {
let mut results = Vec::new();
for item in items { // +1 (loop)
if item.is_valid() { // +1 (if)
match item.category { // +1 (match)
Category::A => {
if item.value > 100 { // +2 (nested if)
results.push(transform_a(&item)?);
}
}
Category::B => {
results.push(transform_b(&item)?);
}
_ => continue, // +1 (match arm)
}
}
}
Ok(results)
}
// Logical complexity: ~7 points
}
Formatting Artifacts
Identifies complexity introduced by code formatting choices:
- Multiline expressions - Long expressions split across multiple lines
- Excessive whitespace - Blank lines within code blocks
- Inconsistent indentation - Mixed tabs/spaces or irregular indentation
- Line breaks in chains - Method chains split across many lines
Formatting artifacts are categorized by severity:
- Low - Minor formatting inconsistencies (<10% impact)
- Medium - Noticeable formatting impact (10-25% impact)
- High - Significant complexity inflation (>25% impact)
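These severity bands map directly onto the impact ratio; a small sketch using the thresholds from the list above (types illustrative):
enum ArtifactSeverity {
    Low,
    Medium,
    High,
}

fn artifact_severity(impact: f64) -> ArtifactSeverity {
    if impact > 0.25 {
        ArtifactSeverity::High // >25% of measured complexity is formatting
    } else if impact >= 0.10 {
        ArtifactSeverity::Medium // 10-25% impact
    } else {
        ArtifactSeverity::Low // <10% impact
    }
}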
Example:
#![allow(unused)]
fn main() {
// Same function with formatting that inflates complexity
fn process_data(
items: Vec<Item>
) -> Result<Vec<Output>> {
let mut results =
Vec::new();
for item in
items
{
if item
.is_valid()
{
match item
.category
{
Category::A =>
{
if item
.value
> 100
{
results
.push(
transform_a(
&item
)?
);
}
}
Category::B =>
{
results
.push(
transform_b(
&item
)?
);
}
_ => continue,
}
}
}
Ok(results)
}
// Raw complexity: ~12 points (formatting adds ~5 points)
// Normalized complexity: ~7 points (true logical complexity)
}
Pattern Complexity
Recognizes common code patterns and their complexity characteristics:
- Error handling patterns - Result/Option propagation, error conversion
- Validation patterns - Input validation, constraint checking
- Data transformation - Map/filter/fold chains, data conversions
- Builder patterns - Fluent interfaces and builders
- State machines - Explicit state management
Each pattern includes:
- Confidence score (0.0-1.0) - How certain the pattern recognition is
- Opportunities - Suggestions for pattern extraction or improvement
Example:
#![allow(unused)]
fn main() {
// Error handling pattern (confidence: 0.85)
fn load_config(path: &Path) -> Result<Config> {
let contents = fs::read_to_string(path)
.context("Failed to read config file")?;
let config: Config = serde_json::from_str(&contents)
.context("Failed to parse config JSON")?;
config.validate()
.context("Config validation failed")?;
Ok(config)
}
// Pattern complexity: moderate error handling overhead
// Suggestion: Consider error enum for better type safety
}
Understanding Attribution Output
When you run with --attribution, you’ll see a detailed breakdown:
$ debtmap analyze src/main.rs --attribution --detail-level comprehensive
Sample Output
Multi-Pass Analysis Results
============================
File: src/main.rs
Raw Complexity: 45
Normalized Complexity: 32
Formatting Impact: 28.9%
Attribution Breakdown
---------------------
Logical Complexity: 32 points
├─ Function 'main' (line 10): 8 points
│ ├─ Control flow: 5 points (2 if, 1 match, 2 loops)
│ ├─ Nesting: 3 points (max depth: 3)
│ └─ Suggestions:
│ - Break down into smaller functions
│ - Extract complex conditions into named variables
│
├─ Function 'process_request' (line 45): 12 points
│ ├─ Control flow: 8 points (4 if, 1 match, 3 early returns)
│ ├─ Nesting: 4 points (max depth: 4)
│ └─ Suggestions:
│ - Consider using early returns to reduce nesting
│ - Extract validation logic into separate function
│
└─ Function 'handle_error' (line 89): 12 points
├─ Control flow: 9 points (5 match arms, 4 if conditions)
├─ Pattern: Error handling (confidence: 0.90)
└─ Suggestions:
- Consider error enum instead of multiple match arms
Formatting Artifacts: 13 points (28.9% of raw complexity)
├─ Multiline expressions: 8 points (Medium severity)
│ └─ Locations: lines 23, 45, 67, 89
├─ Excessive whitespace: 3 points (Low severity)
│ └─ Locations: lines 12-14, 56-58
└─ Inconsistent indentation: 2 points (Low severity)
└─ Locations: lines 34, 78
Pattern Complexity: 3 recognized patterns
├─ Error handling (confidence: 0.85): 8 occurrences
│ └─ Opportunity: Consider centralizing error handling
├─ Validation (confidence: 0.72): 5 occurrences
│ └─ Opportunity: Extract validation to separate module
└─ Data transformation (confidence: 0.68): 3 occurrences
└─ Opportunity: Review for functional composition
Interpreting the Results
Logical Complexity Breakdown
- Each function is listed with its complexity contribution
- Control flow elements are itemized (if, loops, match, etc.)
- Nesting depth shows how deeply structures are nested
- Suggestions are specific to that function’s complexity patterns
Formatting Artifacts
- Shows percentage of “false” complexity from formatting
- Severity indicates impact on metrics
- Locations help you find the formatting issues
- High formatting impact (>25%) suggests inconsistent style
Pattern Analysis
- Confidence score shows pattern recognition certainty
- High confidence (>0.7) means reliable pattern detection
- Low confidence (<0.5) suggests unique code structure
- Opportunities highlight potential refactoring
Insights and Recommendations
Multi-pass analysis automatically generates insights and recommendations based on the attribution results.
Insight Types
FormattingImpact
- Triggered when formatting contributes >20% of measured complexity
- Suggests using automated formatting tools
- Recommends standardizing team coding style
PatternOpportunity
- Triggered when pattern confidence is low (<0.5)
- Suggests extracting common patterns
- Recommends reviewing for code duplication
RefactoringCandidate
- Triggered when logical complexity exceeds threshold (>20)
- Identifies functions needing breakdown
- Provides specific refactoring strategies
ComplexityHotspot
- Identifies areas of concentrated complexity
- Highlights files or modules needing attention
- Suggests architectural improvements
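A minimal sketch of how these triggers combine, using the thresholds named above (the enum and function are illustrative, not debtmap's actual types):
enum Insight {
    FormattingImpact,
    PatternOpportunity,
    RefactoringCandidate,
}

fn derive_insights(formatting_impact: f64, pattern_confidence: f64, logical_complexity: u32) -> Vec<Insight> {
    let mut insights = Vec::new();
    if formatting_impact > 0.20 {
        insights.push(Insight::FormattingImpact); // formatting >20% of measured complexity
    }
    if pattern_confidence < 0.5 {
        insights.push(Insight::PatternOpportunity); // low-confidence pattern recognition
    }
    if logical_complexity > 20 {
        insights.push(Insight::RefactoringCandidate); // function needs breakdown
    }
    insights
}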
Recommendation Structure
Each recommendation includes:
- Priority: Low, Medium, High
- Category: Refactoring, Pattern, Formatting, General
- Title: Brief description of the issue
- Description: Detailed explanation
- Estimated Impact: Expected complexity reduction (in points)
- Suggested Actions: Specific steps to take
Example Recommendations
{
"recommendations": [
{
"priority": "High",
"category": "Refactoring",
"title": "Simplify control flow in 'process_request'",
"description": "This function contributes 12 complexity points with deeply nested conditions",
"estimated_impact": 6,
"suggested_actions": [
"Extract validation logic into separate function",
"Use early returns to reduce nesting depth",
"Consider state pattern for complex branching"
]
},
{
"priority": "Medium",
"category": "Formatting",
"title": "Formatting contributes 29% of measured complexity",
"description": "Code formatting choices are inflating complexity metrics",
"estimated_impact": 13,
"suggested_actions": [
"Use automated formatting tools (rustfmt, prettier)",
"Standardize code formatting across the team",
"Configure editor to format on save"
]
},
{
"priority": "Low",
"category": "Pattern",
"title": "Low pattern recognition suggests unique code structure",
"description": "Pattern confidence score of 0.45 indicates non-standard patterns",
"estimated_impact": 3,
"suggested_actions": [
"Consider extracting common patterns into utilities",
"Review for code duplication opportunities",
"Document unique patterns for team understanding"
]
}
]
}
Performance Considerations
Multi-pass analysis adds overhead compared to single-pass analysis, but debtmap monitors and limits this overhead.
Performance Metrics
When performance tracking is enabled, you’ll see:
Performance Metrics
-------------------
Raw analysis: 145ms
Normalized analysis: 132ms
Attribution: 45ms
Total time: 322ms
Memory used: 12.3 MB
Overhead: 121.7% vs single-pass (145ms baseline)
⚠️ Warning: Overhead exceeds 25% target
Note: Memory usage values are estimates based on parallelism level, not precise heap measurements.
Tracked Metrics:
- Raw analysis time - Time to analyze original code
- Normalized analysis time - Time to analyze normalized code
- Attribution time - Time to compute attribution breakdown
- Total time - Complete multi-pass analysis duration
- Memory used - Estimated additional memory for two-pass analysis
Performance Overhead
Target Overhead: ≤25% compared to single-pass analysis
Multi-pass analysis aims to add no more than 25% overhead versus standard single-pass analysis. If overhead exceeds this threshold, a warning is issued.
Typical Overhead:
- Attribution adds ~10-15% on average
- Normalization adds ~5-10% on average
- Total overhead usually 15-25%
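The reported overhead is just extra time relative to the single-pass baseline; a quick sketch (the exact baseline debtmap uses may differ slightly):
fn overhead_percent(total_ms: f64, single_pass_baseline_ms: f64) -> f64 {
    (total_ms - single_pass_baseline_ms) / single_pass_baseline_ms * 100.0
}
// overhead_percent(322.0, 145.0) ≈ 122%, far above the 25% target,
// which is why the sample report above prints a warning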
Factors Affecting Performance:
- File size - Larger files take proportionally longer
- Complexity - More complex code requires more analysis time
- Language - Some languages (TypeScript) are slower to parse
- Parallel processing - Overhead is per-file, parallel reduces impact
Optimization Tips
Disable Performance Tracking in Production
#![allow(unused)]
fn main() {
MultiPassOptions {
performance_tracking: false, // Reduces overhead slightly
..Default::default()
}
}
Use Parallel Processing
# Parallel analysis amortizes overhead across cores
# Note: --jobs is a general debtmap flag controlling parallelism for all analysis
debtmap analyze . --jobs 8
Target Specific Files
# Analyze only files that need detailed attribution
debtmap analyze src/complex_module.rs --attribution
Comparative Analysis
Multi-pass analysis supports comparing code changes to validate refactoring efforts.
Basic Comparison
The compare_complexity function is a standalone convenience function that performs complete multi-pass analysis on both code versions and returns the computed differences:
#![allow(unused)]
fn main() {
use debtmap::analysis::multi_pass::compare_complexity;
use debtmap::core::Language;
let before_code = r#"
fn process(items: Vec<i32>) -> i32 {
let mut sum = 0;
for item in items {
if item > 0 {
if item % 2 == 0 {
sum += item * 2;
} else {
sum += item;
}
}
}
sum
}
"#;
let after_code = r#"
fn process(items: Vec<i32>) -> i32 {
items
.into_iter()
.filter(|&item| item > 0)
.map(|item| if item % 2 == 0 { item * 2 } else { item })
.sum()
}
"#;
let comparison = compare_complexity(before_code, after_code, Language::Rust)?;
println!("Complexity change: {}", comparison.complexity_change);
println!("Cognitive complexity change: {}", comparison.cognitive_change);
println!("Formatting impact change: {}", comparison.formatting_impact_change);
}
Comparison Results
The ComparativeAnalysis struct contains the computed differences between before and after analyses:
#![allow(unused)]
fn main() {
pub struct ComparativeAnalysis {
pub complexity_change: i32, // Negative = improvement
pub cognitive_change: i32, // Negative = improvement
pub formatting_impact_change: f32, // Negative = less formatting noise
pub improvements: Vec<String>,
pub regressions: Vec<String>,
}
}
Note: The compare_complexity function performs both analyses internally and returns only the change metrics. To access the full before/after results, perform separate analyses using MultiPassAnalyzer.
Interpreting Changes:
- Negative complexity change - Refactoring reduced complexity ✓
- Positive complexity change - Refactoring increased complexity ✗
- Improvements - List of detected improvements (reduced nesting, extracted functions, etc.)
- Regressions - List of detected regressions (increased complexity, new anti-patterns, etc.)
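Those fields make simple refactoring gates easy to express; a usage sketch built on the ComparativeAnalysis struct shown above (the function name is illustrative):
fn refactoring_verdict(cmp: &ComparativeAnalysis) -> &'static str {
    if cmp.complexity_change < 0 && cmp.regressions.is_empty() {
        "improvement" // complexity dropped with no detected regressions
    } else if cmp.complexity_change > 0 || !cmp.regressions.is_empty() {
        "regression" // complexity rose or regressions were detected
    } else {
        "neutral"
    }
}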
Example Output
Comparative Analysis
====================
Complexity Changes:
├─ Cyclomatic: 8 → 4 (-4, -50%)
├─ Cognitive: 12 → 5 (-7, -58.3%)
└─ Formatting Impact: 25% → 10% (-15%, -60%)
Improvements Detected:
✓ Reduced nesting depth (3 → 1)
✓ Eliminated mutable state
✓ Replaced imperative loop with functional chain
✓ Improved formatting consistency
No regressions detected.
Verdict: Refactoring reduced complexity by 50% and improved code clarity.
Configuration Options
Configure multi-pass analysis programmatically:
#![allow(unused)]
fn main() {
use debtmap::analysis::multi_pass::{MultiPassAnalyzer, MultiPassOptions};
use debtmap::analysis::diagnostics::{DetailLevel, OutputFormat};
use debtmap::core::Language;
let options = MultiPassOptions {
language: Language::Rust,
detail_level: DetailLevel::Comprehensive,
enable_recommendations: true,
track_source_locations: true,
generate_insights: true,
output_format: OutputFormat::Json, // Also available: Yaml, Markdown, Html, Text
performance_tracking: true,
};
let analyzer = MultiPassAnalyzer::new(options);
}
Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| language | Language | Rust | Target programming language |
| detail_level | DetailLevel | Standard | Output detail: Summary, Standard, Comprehensive, Debug (CLI uses lowercase: --detail-level standard) |
| enable_recommendations | bool | true | Generate actionable recommendations |
| track_source_locations | bool | true | Include file/line/column in attribution |
| generate_insights | bool | true | Automatically generate insights |
| output_format | OutputFormat | Json | Output format: Json, Yaml, Markdown, Html, Text |
| performance_tracking | bool | false | Track and report performance metrics |
Use Cases
When to Use Multi-Pass Analysis (Default)
Multi-pass analysis is the default because it provides the most valuable insights:
Refactoring Validation
- Compare before/after complexity to validate refactoring
- Ensure complexity actually decreased
- Identify unintended complexity increases
Formatting Impact Assessment
- Determine how much formatting affects your metrics
- Justify automated formatting tool adoption
- Identify formatting inconsistencies
Targeted Refactoring
- Use attribution to find highest-impact refactoring targets
- Focus on logical complexity, not formatting artifacts
- Prioritize functions with actionable suggestions
Code Review
- Provide objective complexity data in pull requests
- Identify genuine complexity increases vs formatting changes
- Guide refactoring discussions with data
Codebase Health Monitoring
- Track logical complexity trends over time
- Separate signal (logic) from noise (formatting)
- Identify complexity hotspots for architectural review
When to Disable Multi-Pass (--no-multi-pass)
Use --no-multi-pass for single-pass analysis only when:
Performance is Critical
- Fast complexity checks during development
- CI/CD gates where every second matters
- Very large codebases (>100k LOC) where overhead is significant
Simple Use Cases
- When overall complexity trends are enough
- No need for detailed attribution
- Formatting is already standardized
Resource Constraints
- Limited CPU or memory available
- Running on CI infrastructure with strict time limits
Future Enhancements
Spec 84: Detailed AST-Based Source Mapping
The current implementation uses estimated complexity locations based on function metrics. Spec 84 will enhance attribution with precise AST-based source mapping:
Planned Improvements:
- Exact AST node locations - Precise line, column, and span for each complexity point
- 100% accurate mapping - No estimation, direct AST-to-source mapping
- IDE integration - Jump from complexity reports directly to source code
- Inline visualization - Show complexity heat maps in your editor
- Statement-level tracking - Complexity attribution at statement granularity
Current vs Future:
Current (estimated):
#![allow(unused)]
fn main() {
ComplexityComponent {
location: CodeLocation {
line: 45, // Function start line
column: 0, // Estimated
span: None, // Not available
},
description: "Function: process_request",
}
}
Future (precise):
#![allow(unused)]
fn main() {
ComplexityComponent {
location: SourceLocation {
line: 47, // Exact if statement line
column: 8, // Exact column
span: Some((47, 52)), // Exact span of construct
ast_path: "fn::process_request::body::if[0]",
},
description: "If condition: item.is_valid()",
}
}
This will enable:
- Click-to-navigate from reports to exact code locations
- Visual Studio Code / IntelliJ integration for inline complexity display
- More precise refactoring suggestions
- Better complexity trend tracking at fine granularity
Summary
Multi-pass analysis (enabled by default) provides deep insights into your code’s complexity by:
- Separating signal from noise - Distinguishing logical complexity from formatting artifacts
- Attributing complexity sources - Identifying what contributes to complexity and why
- Generating actionable insights - Providing specific refactoring recommendations
- Validating refactoring - Comparing before/after to prove complexity reduction
- Monitoring performance - Ensuring overhead stays within acceptable bounds
Multi-pass analysis runs by default, providing the most valuable insights out of the box. The overhead (typically 15-25%) is worthwhile for understanding why code is complex and how to improve it.
For performance-critical scenarios or very large codebases, use --no-multi-pass to disable multi-pass analysis and run faster single-pass analysis instead. You can also use the DEBTMAP_SINGLE_PASS=1 environment variable to disable multi-pass analysis globally.
See Also:
- Analysis Guide - General analysis capabilities
- Scoring Strategies - How complexity affects debt scores
- Coverage Integration - Combining complexity with coverage
- Examples - Real-world multi-pass analysis examples
Parallel Processing
Debtmap leverages Rust’s powerful parallel processing capabilities to analyze large codebases efficiently. Built on Rayon for data parallelism and DashMap for lock-free concurrent data structures, debtmap achieves 10-100x faster performance than Java/Python-based competitors.
Overview
Debtmap’s parallel processing architecture uses a three-phase approach:
- Parallel File Parsing - Parse source files concurrently across all available CPU cores
- Parallel Multi-File Extraction - Extract call graphs from parsed files in parallel
- Parallel Enhanced Analysis - Analyze trait dispatch, function pointers, and framework patterns
This parallel pipeline is controlled by CLI flags that let you tune performance for your environment.
Performance Characteristics
Typical analysis times:
- Small project (1k-5k LOC): <1 second
- Medium project (10k-50k LOC): 2-8 seconds
- Large project (100k-500k LOC): 10-45 seconds
Comparison with other tools (medium-sized Rust project, ~50k LOC):
- SonarQube: 3-4 minutes
- CodeClimate: 2-3 minutes
- Debtmap: 5-8 seconds
CLI Flags for Parallelization
Debtmap provides two flags to control parallel processing behavior:
--jobs / -j
Control the number of worker threads for parallel processing:
# Use all available CPU cores (default)
debtmap analyze --jobs 0
# Limit to 4 threads
debtmap analyze --jobs 4
debtmap analyze -j 4
Behavior:
- --jobs 0 (default): Auto-detects available CPU cores using std::thread::available_parallelism(). Falls back to 4 threads if detection fails.
- --jobs N: Explicitly sets the thread pool to N threads.
When to use:
- Use --jobs 0 for maximum performance on developer workstations
- Use --jobs 1-4 in memory-constrained environments like CI/CD
- Use --jobs 1 for deterministic analysis order during debugging
Environment Variables:
You can also set the default via environment variables:
DEBTMAP_JOBS - Set the default thread count:
export DEBTMAP_JOBS=4
debtmap analyze # Uses 4 threads
DEBTMAP_PARALLEL - Enable/disable parallel processing programmatically:
export DEBTMAP_PARALLEL=true
debtmap analyze # Parallel processing enabled
export DEBTMAP_PARALLEL=1
debtmap analyze # Parallel processing enabled (also accepts '1')
The DEBTMAP_PARALLEL variable accepts true or 1 to enable parallel processing. This is useful for programmatic control in scripts or CI environments.
The CLI flags (--jobs, --no-parallel) take precedence over environment variables.
--no-parallel
Disable parallel call graph construction entirely:
debtmap analyze --no-parallel
When to use:
- Debugging concurrency issues: Isolate whether a problem is parallelism-related
- Memory-constrained environments: Parallel processing increases memory usage
- Deterministic analysis: Ensures consistent ordering for reproducibility
Performance Impact:
Disabling parallelization significantly increases analysis time:
- Small projects (< 100 files): 2-3x slower
- Medium projects (100-1000 files): 5-10x slower
- Large projects (> 1000 files): 10-50x slower
For more details on both flags, see the CLI Reference.
Rayon Parallel Iterators
Debtmap uses Rayon, a data parallelism library for Rust, to parallelize file processing operations.
Thread Pool Configuration
The global Rayon thread pool is configured at startup based on the --jobs parameter. Thread pool configuration is centralized in src/cli/setup.rs:
#![allow(unused)]
fn main() {
// From src/cli/setup.rs:15-26
/// Configure rayon global thread pool once at startup
pub fn configure_thread_pool(jobs: usize) {
let mut builder = rayon::ThreadPoolBuilder::new().stack_size(RAYON_STACK_SIZE);
if jobs > 0 {
builder = builder.num_threads(jobs);
}
if let Err(e) = builder.build_global() {
// Already configured - this is fine, just ignore
eprintln!("Note: Thread pool already configured: {}", e);
}
}
}
This configures Rayon to use a specific number of worker threads for all parallel operations throughout the analysis. The Rayon stack size is set to 8MB per thread to handle deeply nested AST traversals.
Worker Thread Selection
The get_worker_count() function determines how many threads to use:
#![allow(unused)]
fn main() {
// From src/cli/setup.rs:29-37
/// Get the number of worker threads to use
pub fn get_worker_count(jobs: usize) -> usize {
if jobs == 0 {
std::thread::available_parallelism()
.map(|n| n.get())
.unwrap_or(4) // Fallback if detection fails
} else {
jobs // Use explicit value
}
}
}
Auto-detection behavior:
- Queries the OS for available parallelism (CPU cores)
- Respects cgroup limits in containers (Docker, Kubernetes)
- Falls back to 4 threads if detection fails (rare)
Manual configuration:
- Useful in shared environments (CI/CD, shared build servers)
- Prevents resource contention with other processes
- Enables reproducible benchmarking
Parallel File Processing
Phase 1: Parallel File I/O and Sequential Parsing
File reading is parallelized, but AST parsing is sequential due to syn::File not being Send. Files are processed in batches to prevent proc-macro2 SourceMap overflow (Spec 210):
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:217-249
/// Parse a batch of files without progress tracking (used in batched processing)
fn parallel_parse_files_batch(
&self,
batch: &[PathBuf],
parallel_graph: &Arc<ParallelCallGraph>,
) -> Result<Vec<(PathBuf, syn::File)>> {
// Read file contents in parallel (I/O bound, content is Send)
let file_contents: Vec<_> = batch
.par_iter()
.filter_map(|file_path| {
io::read_file(file_path)
.ok()
.map(|content| (file_path.clone(), content))
})
.collect();
// Parse sequentially (syn::File is not Send)
let parsed_files: Vec<_> = file_contents
.iter()
.filter_map(|(file_path, content)| {
let parsed = syn::parse_file(content).ok()?;
parallel_graph.stats().increment_files();
Some((file_path.clone(), parsed))
})
.collect();
Ok(parsed_files)
}
}
Key features:
- Parallel I/O: File reading uses .par_iter() to maximize disk throughput
- Sequential parsing: AST parsing is sequential because syn::File lacks a Send bound
- Why this works: I/O operations dominate analysis time, so parallelizing file reads provides most of the speedup
- Progress tracking uses atomic counters and unified progress system (see Parallel Call Graph Statistics)
Phase 2: Batched File Extraction (Spec 210)
Files are processed in batches of 200 to prevent proc-macro2 SourceMap overflow on large codebases. The SourceMap is reset between batches to free memory:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:128-181
// Spec 210: Batch size to prevent SourceMap overflow
// 200 files * ~50KB avg = ~10MB per batch, well under the 4GB limit
const BATCH_SIZE: usize = 200;
// Process files in batches to prevent SourceMap overflow
for batch in rust_files.chunks(BATCH_SIZE) {
// Phase 2: Parse ASTs for this batch
let parsed_files = self.parallel_parse_files_batch(batch, ¶llel_graph)?;
// Phase 3: Extract calls for this batch
self.parallel_multi_file_extraction(&parsed_files, ¶llel_graph)?;
// Phase 4: Enhanced analysis for this batch
let (batch_framework_exclusions, batch_function_pointer_used) =
self.parallel_enhanced_analysis(&parsed_files, ¶llel_graph)?;
// Reset SourceMap after each batch to prevent overflow
crate::core::parsing::reset_span_locations();
}
}
Design rationale:
- Batched processing: 200 files per batch to prevent SourceMap overflow (4GB limit)
- SourceMap reset: Each batch releases span location memory before processing the next
- Cross-file resolution: Within each batch, extract_call_graph_multi_file provides full cross-file visibility
- Scalability: Can analyze codebases with 10,000+ files without running out of memory
AST Parsing Optimization (Spec 132)
Prior to spec 132, files were parsed twice during call graph construction:
- Phase 1: Read files and store content as strings
- Phase 2: Re-parse the same content to extract call graphs
This redundant parsing was eliminated by parsing each file exactly once and reusing the parsed syn::File AST:
#![allow(unused)]
fn main() {
// Optimized: Parse once in Phase 1
let parsed_files: Vec<(PathBuf, syn::File)> = rust_files
.par_iter()
.filter_map(|file_path| {
let content = io::read_file(file_path).ok()?;
let parsed = syn::parse_file(&content).ok()?; // Parse ONCE
Some((file_path.clone(), parsed))
})
.collect();
// Phase 2: Reuse parsed ASTs (no re-parsing)
for chunk in parsed_files.chunks(chunk_size) {
let chunk_for_extraction: Vec<_> = chunk
.iter()
.map(|(path, parsed)| (parsed.clone(), path.clone())) // Clone AST
.collect();
// Extract call graph...
}
}
Performance Impact:
- Before: 2N parse operations (404 files × 2 = 808 parses)
- After: N parse operations (404 files × 1 = 404 parses)
- Speedup: Cloning a parsed AST is 44% faster than re-parsing
- Time saved: ~432ms per analysis run on 400-file projects
- Memory overhead: <100MB for parsed AST storage
Why Clone Instead of Borrow?
- syn::File is not Send + Sync (cannot be shared across threads)
- Call graph extraction requires owned AST values
- Cloning is still significantly faster than re-parsing (1.33ms vs 2.40ms per file)
Benchmarks showed 44% faster analysis times after eliminating redundant parsing.
Phase 3: Enhanced Analysis
The third phase analyzes trait dispatch, function pointers, and framework patterns. This phase is currently sequential due to complex shared state requirements, but benefits from the parallel foundation built in phases 1-2.
Parallel Architecture
Debtmap processes files in parallel using Rayon’s parallel iterators:
#![allow(unused)]
fn main() {
files.par_iter()
.map(|file| analyze_file(file))
.collect()
}
Each file is:
- Parsed independently
- Analyzed for complexity
- Scored and prioritized
DashMap for Lock-Free Concurrency
Debtmap uses DashMap, a concurrent hash map implementation, for lock-free data structures during parallel call graph construction.
Why DashMap?
Traditional approaches to concurrent hash maps use a single Mutex<HashMap>, which creates contention:
#![allow(unused)]
fn main() {
// ❌ Traditional approach - serializes all access
let map: Arc<Mutex<HashMap<K, V>>> = Arc::new(Mutex::new(HashMap::new()));
// Thread 1 blocks Thread 2, even for reads
let val = map.lock().unwrap().get(&key);
}
DashMap provides lock-free reads and fine-grained write locking through internal sharding:
#![allow(unused)]
fn main() {
// ✅ DashMap approach - concurrent reads, fine-grained writes
let map: Arc<DashMap<K, V>> = Arc::new(DashMap::new());
// Multiple threads can read concurrently without blocking
let val = map.get(&key);
// Writes only lock the specific shard, not the whole map
map.insert(key, value);
}
ParallelCallGraph Implementation
The ParallelCallGraph uses DashMap for all concurrent data structures:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:50-56
pub struct ParallelCallGraph {
nodes: Arc<DashMap<FunctionId, NodeInfo>>, // Functions
edges: Arc<DashSet<FunctionCall>>, // Calls
caller_index: Arc<DashMap<FunctionId, DashSet<FunctionId>>>, // Who calls this?
callee_index: Arc<DashMap<FunctionId, DashSet<FunctionId>>>, // Who does this call?
stats: Arc<ParallelStats>, // Atomic counters
}
}
Key components:
- nodes: Maps function identifiers to metadata (complexity, lines, flags)
- edges: Set of all function calls (deduplicated automatically)
- caller_index: Reverse index for “who calls this function?”
- callee_index: Forward index for “what does this function call?”
- stats: Atomic counters for progress tracking
Concurrent Operations
Adding Functions Concurrently
Multiple analyzer threads can add functions simultaneously:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:79-96
pub fn add_function(
&self,
id: FunctionId,
is_entry_point: bool,
is_test: bool,
complexity: u32,
lines: usize,
) {
let node_info = NodeInfo {
id: id.clone(),
is_entry_point,
is_test,
complexity,
lines,
};
self.nodes.insert(id, node_info);
self.stats.add_nodes(1); // Atomic increment
}
}
Atomicity guarantees:
- DashMap::insert() is atomic - no data races
- AtomicUsize counters can be incremented from multiple threads safely
- No locks required for reading existing nodes
Adding Calls Concurrently
Function calls are added with automatic deduplication:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:99-117
pub fn add_call(&self, caller: FunctionId, callee: FunctionId, call_type: CallType) {
let call = FunctionCall {
caller: caller.clone(),
callee: callee.clone(),
call_type,
};
if self.edges.insert(call) { // DashSet deduplicates automatically
// Update indices concurrently
self.caller_index
.entry(caller.clone())
.or_default()
.insert(callee.clone());
self.callee_index.entry(callee).or_default().insert(caller);
self.stats.add_edges(1); // Only increment if actually inserted
}
}
}
Deduplication:
- DashSet::insert() returns true only for new items
- Duplicate calls from multiple threads are safely ignored
- Indices are updated atomically using the entry() API
Analysis configuration and indexes are shared across threads:
#![allow(unused)]
fn main() {
let coverage_index = Arc::new(build_coverage_index());
// All threads share the same index
files.par_iter()
.map(|file| analyze_with_coverage(file, &coverage_index))
.collect::<Vec<_>>()
}
Memory Overhead
DashMap uses internal sharding for parallelism, which has a memory overhead:
- DashMap overhead: ~2x the memory of a regular HashMap due to sharding
- DashSet overhead: Similar to DashMap
- Benefit: Enables concurrent access without contention
- Trade-off: Debtmap prioritizes speed over memory for large codebases
For memory-constrained environments, use --jobs 2-4 or --no-parallel to reduce parallel overhead.
Parallel Call Graph Statistics
Debtmap tracks parallel processing progress using atomic counters that can be safely updated from multiple threads. These statistics are integrated with a unified progress system that provides consolidated reporting across all analysis phases.
Unified Progress System Integration
Parallel statistics now integrate with crate::io::progress::AnalysisProgress (src/builders/parallel_call_graph.rs:134-139), which replaced older per-phase progress bars. This provides:
- Consolidated progress reporting: Single progress view showing “3/4 Building call graph”
- Cross-phase coordination: Progress updates coordinated across parsing, extraction, and analysis phases
- Reduced visual clutter: Replaces multiple progress bars with unified display
ParallelStats Structure
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:7-14
pub struct ParallelStats {
pub total_nodes: AtomicUsize, // Functions processed
pub total_edges: AtomicUsize, // Calls discovered
pub files_processed: AtomicUsize, // Files completed
pub total_files: AtomicUsize, // Total files to process
}
}
Atomic operations:
- fetch_add() - Atomically increment counters from any thread
- load() - Read current value without blocking
- Ordering::Relaxed - Sufficient for statistics (no synchronization needed)
Progress Tracking
Progress ratio calculation for long-running analysis:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:38-46
pub fn progress_ratio(&self) -> f64 {
let processed = self.files_processed.load(Ordering::Relaxed) as f64;
let total = self.total_files.load(Ordering::Relaxed) as f64;
if total > 0.0 {
processed / total
} else {
0.0
}
}
}
This enables progress callbacks during analysis:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:110-121
parallel_graph.stats().increment_files();
if let Some(ref callback) = self.config.progress_callback {
let processed = parallel_graph
.stats()
.files_processed
.load(std::sync::atomic::Ordering::Relaxed);
let total = parallel_graph
.stats()
.total_files
.load(std::sync::atomic::Ordering::Relaxed);
callback(processed, total);
}
}
Log Output Format
After analysis completes, debtmap reports final statistics:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:85-93
log::info!(
"Parallel call graph complete: {} nodes, {} edges, {} files processed",
stats.total_nodes.load(std::sync::atomic::Ordering::Relaxed),
stats.total_edges.load(std::sync::atomic::Ordering::Relaxed),
stats
.files_processed
.load(std::sync::atomic::Ordering::Relaxed),
);
}
Example output:
INFO - Processing 1247 Rust files in parallel
INFO - Progress: 100/1247 files processed
INFO - Progress: 500/1247 files processed
INFO - Progress: 1000/1247 files processed
INFO - Parallel call graph complete: 8942 nodes, 23451 edges, 1247 files processed
Cross-File Call Resolution
Debtmap uses a two-phase parallel resolution approach for resolving cross-file function calls, achieving 10-15% faster call graph construction on multi-core systems.
Two-Phase Architecture
Phase 1: Parallel Resolution (Read-Only)
The first phase processes unresolved calls concurrently using Rayon’s parallel iterators:
#![allow(unused)]
fn main() {
// From src/priority/call_graph/cross_file.rs
let resolutions: Vec<(FunctionCall, FunctionId)> = calls_to_resolve
.par_iter() // Parallel iteration
.filter_map(|call| {
// Pure function - safe for parallel execution
Self::resolve_call_with_advanced_matching(
&all_functions,
&call.callee.name,
&call.caller.file,
).map(|resolved_callee| {
(call.clone(), resolved_callee)
})
})
.collect();
}
Key benefits:
- Pure functional resolution: No side effects, safe for concurrent execution
- Immutable data: All inputs are read-only during the parallel phase
- Independent operations: Each call resolution is independent of others
- Parallel efficiency: Utilizes all available CPU cores
- Sophisticated matching: resolve_call_with_advanced_matching delegates to CallResolver (src/analyzers/call_graph/call_resolution.rs:44-58), which handles associated functions, qualified paths, and type hints
Phase 2: Sequential Updates (Mutation)
The second phase applies all resolutions to the graph sequentially:
#![allow(unused)]
fn main() {
// Apply resolutions to graph in sequence
for (original_call, resolved_callee) in resolutions {
self.apply_call_resolution(&original_call, &resolved_callee);
}
}
Key benefits:
- Batch updates: All resolutions processed together
- Data consistency: Sequential updates maintain index synchronization
- Deterministic: Same results regardless of parallel execution order
Performance Impact
The two-phase approach provides significant speedups on multi-core systems:
| CPU Cores | Speedup | Example Time (1500 calls) |
|---|---|---|
| 1 | 0% | 100ms (baseline) |
| 2 | ~8% | 92ms |
| 4 | ~12% | 88ms |
| 8 | ~15% | 85ms |
Performance characteristics:
- Best case: 10-15% reduction in call graph construction time
- Scaling: Diminishing returns beyond 8 cores due to batching overhead
- Memory overhead: <10MB for resolutions vector, even for large projects
Thread Safety
The parallel resolution phase is thread-safe without locks because:
- Pure resolution logic: resolve_call_with_advanced_matching() is a static method with no side effects
- Immutable inputs: All function data is read-only during the parallel phase
- Independent resolutions: No dependencies between different call resolutions
- Safe collection: Rayon handles thread synchronization for result collection
The sequential update phase requires no synchronization since it runs single-threaded.
Memory Efficiency
Resolutions vector overhead:
- Per-resolution size: ~200 bytes (FunctionCall + FunctionId)
- For 1000 resolutions: ~200KB
- For 2000 resolutions: ~400KB
- Maximum overhead: <10MB even for very large projects
Total memory footprint:
Total Memory = Base Graph + Resolutions Vector
≈ 5-10MB + 0.2-0.4MB
≈ 5-10MB (negligible overhead)
Integration with Call Graph Construction
The two-phase resolution integrates seamlessly into the existing call graph construction pipeline:
File Parsing (Parallel)
↓
Function Extraction (Parallel)
↓
Build Initial Call Graph
↓
[NEW] Parallel Cross-File Resolution
├─ Phase 1: Parallel resolution → collect resolutions
└─ Phase 2: Sequential updates → apply to graph
↓
Call Graph Complete
Configuration
Cross-file resolution respects the --jobs flag for thread pool sizing:
# Use all cores for maximum speedup
debtmap analyze --jobs 0
# Limit to 4 threads
debtmap analyze --jobs 4
# Disable parallelism (debugging)
debtmap analyze --no-parallel
The --no-parallel flag disables parallel call graph construction entirely, including cross-file resolution parallelization.
Debugging
To verify parallel resolution is working:
# Enable verbose logging
debtmap analyze -vv
# Look for messages like:
# "Resolving 1523 cross-file calls in parallel"
# "Parallel resolution complete: 1423 resolved in 87ms"
To compare parallel vs sequential performance:
# Parallel (default)
time debtmap analyze .
# Sequential (for comparison)
time debtmap analyze . --no-parallel
Expected difference: 10-15% faster with parallel resolution on 4-8 core systems.
Concurrent Merging
The merge_concurrent() method combines call graphs from different analysis phases using parallel iteration.
Implementation
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:120-138
pub fn merge_concurrent(&self, other: CallGraph) {
// Parallelize node merging
let nodes_vec: Vec<_> = other.get_all_functions().collect();
nodes_vec.par_iter().for_each(|func_id| {
if let Some((is_entry, is_test, complexity, lines)) = other.get_function_info(func_id) {
self.add_function((*func_id).clone(), is_entry, is_test, complexity, lines);
}
});
// Parallelize edge merging
let calls_vec: Vec<_> = other.get_all_calls();
calls_vec.par_iter().for_each(|call| {
self.add_call(
call.caller.clone(),
call.callee.clone(),
call.call_type.clone(),
);
});
}
}
How it works:
- Extract all nodes and edges from the source CallGraph
- Use par_iter() to merge nodes in parallel
- Use par_iter() to merge edges in parallel
- DashMap/DashSet automatically handle concurrent insertions
Converting Between Representations
Debtmap uses two call graph representations:
- ParallelCallGraph: Concurrent data structures (DashMap/DashSet) for parallel construction
- CallGraph: Sequential data structures (HashMap/HashSet) for analysis algorithms
Conversion happens at phase boundaries:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:141-162
pub fn to_call_graph(&self) -> CallGraph {
let mut call_graph = CallGraph::new();
// Add all nodes
for entry in self.nodes.iter() {
let node = entry.value();
call_graph.add_function(
node.id.clone(),
node.is_entry_point,
node.is_test,
node.complexity,
node.lines,
);
}
// Add all edges
for call in self.edges.iter() {
call_graph.add_call(call.clone());
}
call_graph
}
}
Why two representations?
- ParallelCallGraph: Optimized for concurrent writes during construction
- CallGraph: Optimized for graph algorithms (PageRank, connectivity, transitive reduction)
- Conversion overhead is negligible compared to analysis time
Coverage Index Optimization
Debtmap uses an optimized nested HashMap structure for coverage data lookups, providing significant performance improvements for coverage-enabled analysis.
Nested HashMap Architecture
The CoverageIndex structure uses a two-level nested HashMap instead of a flat structure:
#![allow(unused)]
fn main() {
// Optimized structure (nested)
pub struct CoverageIndex {
/// Outer map: file path → inner map of functions
by_file: HashMap<PathBuf, HashMap<String, FunctionCoverage>>,
/// Line-based index for range queries
by_line: HashMap<PathBuf, BTreeMap<usize, FunctionCoverage>>,
/// Pre-computed file paths for efficient iteration
file_paths: Vec<PathBuf>,
}
// OLD structure (flat) - no longer used:
// HashMap<(PathBuf, String), FunctionCoverage>
}
Performance Characteristics
The nested structure provides dramatic performance improvements:
Lookup Complexity:
- Exact match: O(1) file hash + O(1) function hash
- Path strategies: O(files) instead of O(functions)
- Line-based: O(log functions_in_file) binary search
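The line-based index answers "which function contains this line?" with a BTreeMap range scan; a minimal sketch, assuming each coverage record knows its line span (field names illustrative):
use std::collections::BTreeMap;

struct FunctionCoverage {
    start_line: usize,
    end_line: usize,
    coverage_percentage: f64,
}

/// Take the greatest start_line <= line, then confirm the function's span contains it.
fn coverage_at_line(index: &BTreeMap<usize, FunctionCoverage>, line: usize) -> Option<f64> {
    index
        .range(..=line) // O(log n) descent to the candidate entry
        .next_back()    // greatest key <= line
        .filter(|(_, cov)| cov.end_line >= line)
        .map(|(_, cov)| cov.coverage_percentage / 100.0)
}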
Real-World Performance:
- Exact match lookups: ~100 nanoseconds
- Path matching fallback: ~10 microseconds (375 file checks vs 1,500 function checks)
- Overall speedup: 50-100x faster coverage lookups
Why This Matters
When analyzing a typical Rust project with coverage enabled:
- Function count: ~1,500 functions (after demangling)
- File count: ~375 files
- Lookups per analysis: ~19,600
- Average functions per file: ~4
OLD flat structure (O(n) scans):
- 19,600 lookups × 4,500 comparisons = 88 million operations
- Estimated time: ~1 minute
NEW nested structure (O(1) lookups):
- 19,600 lookups × 1-3 operations = ~60,000 operations
- Estimated time: ~3 seconds
Speedup: ~20x faster just from index structure optimization
Combined with Function Demangling
This optimization works synergistically with LLVM coverage function name demangling (Spec 134):
Original (no demangling, flat structure):
- 18,631 mangled functions
- O(n) linear scans
- Total time: 10+ minutes
After demangling (Spec 134):
- 1,500 demangled functions
- O(n) linear scans (still)
- Total time: ~1 minute
After nested structure (Spec 135):
- 1,500 demangled functions
- O(1) hash lookups
- Total time: ~3 seconds
Combined speedup: ~200x (10+ minutes → 3 seconds)
Implementation Details
Exact Match Lookup (O(1)):
#![allow(unused)]
fn main() {
pub fn get_function_coverage(&self, file: &Path, function_name: &str) -> Option<f64> {
// Two O(1) hash lookups
if let Some(file_functions) = self.by_file.get(file) {
if let Some(coverage) = file_functions.get(function_name) {
return Some(coverage.coverage_percentage / 100.0);
}
}
// Fallback to path strategies (rare)
self.find_by_path_strategies(file, function_name)
}
}
Path Strategy Fallback (O(files)):
#![allow(unused)]
fn main() {
fn find_by_path_strategies(&self, query_path: &Path, function_name: &str) -> Option<f64> {
// Iterate over FILES not FUNCTIONS (375 vs 1,500 = 4x faster)
for file_path in &self.file_paths {
if query_path.ends_with(file_path) {
// O(1) lookup once we find the right file
if let Some(file_functions) = self.by_file.get(file_path) {
if let Some(coverage) = file_functions.get(function_name) {
return Some(coverage.coverage_percentage / 100.0);
}
}
}
}
None
}
}
Memory Overhead
The nested structure has minimal memory overhead:
Flat structure:
- 1,500 entries × ~200 bytes = 300KB
Nested structure:
- Outer HashMap: 375 entries × ~50 bytes = 18.75KB
- Inner HashMaps: 375 × ~4 functions × ~200 bytes = 300KB
- File paths vector: 375 × ~100 bytes = 37.5KB
- Total: ~356KB
Memory increase: ~56KB (18%) - negligible cost for 50-100x speedup
Benchmarking Coverage Performance
Debtmap includes benchmarks to validate coverage index performance:
# Run coverage performance benchmarks
cargo bench --bench coverage_performance
# Compare old flat structure vs new nested structure
# Expected results:
# old_flat_structure: 450ms
# new_nested_structure: 8ms
# Speedup: ~56x
The flat_vs_nested_comparison benchmark simulates the old O(n) scan behavior and compares it with the new nested structure, demonstrating the 50-100x improvement.
Impact on Analysis Time
Coverage lookups are now negligible overhead:
Without coverage optimization:
- Analysis overhead from coverage: ~1 minute
- Percentage of total time: 60-80%
With coverage optimization:
- Analysis overhead from coverage: ~3 seconds
- Percentage of total time: 5-10%
This makes coverage-enabled analysis practical for CI/CD pipelines and real-time feedback during development.
Performance Tuning
Optimal Thread Count
General rule: Use physical core count, not logical cores.
# Check physical core count
lscpu | grep "Core(s) per socket"
# macOS
sysctl hw.physicalcpu
Recommended settings:
| System | Cores | Recommended --jobs |
|---|---|---|
| Laptop | 4 | Default or 4 |
| Desktop | 8 | Default |
| Workstation | 16+ | Default |
| CI/CD | Varies | 2-4 (shared resources) |
Memory Considerations
Each thread requires memory for:
- AST parsing (~1-5 MB per file)
- Analysis state (~500 KB per file)
- Temporary buffers
Memory usage estimate:
Total Memory ≈ (Thread Count) × (Average File Size) × 2-3
Example (50 files, average 10 KB each, 8 threads):
Memory ≈ 8 × 10 KB × 3 = 240 KB (negligible)
For very large files (>1 MB), consider reducing thread count.
Memory vs Speed Tradeoffs
Parallel processing uses more memory:
| Configuration | Memory Overhead | Speed Benefit |
|---|---|---|
| --no-parallel | Baseline | Baseline |
| --jobs 1 | +10% (data structures) | 1x |
| --jobs 4 | +30% (+ worker buffers) | 4-6x |
| --jobs 8 | +50% (+ worker buffers) | 6-10x |
| --jobs 16 | +80% (+ worker buffers) | 10-15x |
Memory overhead sources:
- DashMap internal sharding (~2x HashMap)
- Per-worker thread stacks and buffers
- Parallel iterator intermediates
I/O Bound vs CPU Bound
CPU-bound analysis (default):
- Complexity calculations
- Pattern detection
- Risk scoring
Parallel processing provides 4-8x speedup.
I/O-bound operations:
- Reading files from disk
- Loading coverage data
Limited speedup from parallelism (1.5-2x).
If analysis is I/O-bound:
- Use SSD storage
- Reduce thread count (less I/O contention)
- Use --max-files to limit scope
Scaling Strategies
Small Projects (<10k LOC)
# Default settings are fine
debtmap analyze .
Parallel overhead may exceed benefits. Consider --no-parallel if analysis is <1 second.
Medium Projects (10k-100k LOC)
# Use all cores
debtmap analyze .
Optimal parallel efficiency. Expect 4-8x speedup from parallelism.
Large Projects (>100k LOC)
# Use all cores
debtmap analyze . --jobs 0 # 0 = all cores
Maximize parallel processing for large codebases.
CI/CD Environments
# Limit threads to avoid resource contention
debtmap analyze . --jobs 2
CI environments often limit CPU cores per job.
Scaling Behavior
Debtmap’s parallel processing scales with CPU core count:
Strong Scaling (Fixed Problem Size):
| CPU Cores | Speedup | Efficiency |
|---|---|---|
| 1 | 1x | 100% |
| 2 | 1.8x | 90% |
| 4 | 3.4x | 85% |
| 8 | 6.2x | 78% |
| 16 | 10.5x | 66% |
| 32 | 16.8x | 53% |
Efficiency decreases at higher core counts due to:
- Synchronization overhead (atomic operations, DashMap locking)
- Memory bandwidth saturation
- Diminishing returns from Amdahl’s law (sequential portions)
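To put the Amdahl component in concrete terms: the ideal speedup for a workload with parallel fraction p on n cores is 1 / ((1 - p) + p / n). The sketch below compares these ideals against the measured table; the parallel fraction of 0.97 is an assumed value for illustration, not a measured one.
// Ideal Amdahl speedup for parallel fraction `p` on `n` cores
fn amdahl_speedup(p: f64, n: f64) -> f64 {
    1.0 / ((1.0 - p) + p / n)
}
fn main() {
    let p = 0.97; // assumed parallel fraction (illustrative only)
    for n in [2.0_f64, 4.0, 8.0, 16.0, 32.0] {
        // Measured speedups also pay synchronization and memory-bandwidth
        // costs not captured by Amdahl's law alone.
        println!("{:>4} cores: {:.1}x ideal", n, amdahl_speedup(p, n));
    }
}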
Weak Scaling (Problem Size Grows with Cores):
Debtmap maintains high efficiency when problem size scales with core count, making it ideal for analyzing larger codebases on more powerful machines.
Tuning Guidelines
Development Workstations:
# Use all cores for maximum speed
debtmap analyze --jobs 0
CI/CD Environments:
# Limit threads to avoid resource contention
debtmap analyze --jobs 2
# Or disable parallelism on very constrained runners
debtmap analyze --no-parallel
Containers:
# Auto-detection respects cgroup limits
debtmap analyze --jobs 0
# Or explicitly match container CPU allocation
debtmap analyze --jobs 4
Benchmarking:
# Use fixed thread count for reproducible results
debtmap analyze --jobs 8
Profiling and Debugging
Measure Analysis Time
time debtmap analyze .
Disable Parallelism for Debugging
debtmap analyze . --no-parallel -vv
Single-threaded mode with verbose output for debugging.
Profile Thread Usage
Use system tools to monitor thread usage:
# Linux
htop
# macOS: use Activity Monitor (View > CPU Usage > Show Threads)
Look for:
- All cores at ~100% utilization (optimal)
- Some cores idle (I/O bound or insufficient work)
- Excessive context switching (too many threads)
Finding Optimal Settings
To find the best setting for your system:
# Benchmark different configurations
time debtmap analyze --jobs 0 # Auto
time debtmap analyze --jobs 4 # 4 threads
time debtmap analyze --jobs 8 # 8 threads
time debtmap analyze --no-parallel # Sequential
Monitor memory usage during analysis:
# Monitor peak memory usage
/usr/bin/time -v debtmap analyze --jobs 8
Best Practices
- Use default settings - Debtmap auto-detects optimal thread count
- Limit threads in CI - Use --jobs 2 or --jobs 4 in shared environments
- Profile before tuning - Measure actual performance impact
- Consider I/O - If using slow storage, reduce thread count
Troubleshooting
Analysis is Slow Despite Parallelism
Possible causes:
- I/O bottleneck (slow disk)
- Memory pressure (swapping)
- Thread contention
Solutions:
- Use faster storage (SSD)
- Reduce thread count to avoid memory pressure
- Limit analysis scope with --max-files
Slow Analysis Performance
If analysis is slower than expected:
1. Check thread count:
   # Ensure you're using all cores
   debtmap analyze --jobs 0 -vv | grep "threads"
2. Check I/O bottleneck:
   # Use iotop or similar to check disk saturation
   # SSD storage significantly improves performance
3. Check memory pressure:
   # Monitor memory usage during analysis
   top -p $(pgrep debtmap)
4. Try different thread counts:
   # Sometimes fewer threads = less contention
   debtmap analyze --jobs 4
High CPU Usage But No Progress
Possible cause: Analyzing very complex files (large ASTs)
Solution:
# Reduce thread count to avoid memory thrashing
debtmap analyze . --jobs 2
High Memory Usage
If debtmap uses too much memory:
1. Reduce parallelism:
   debtmap analyze --jobs 2
2. Disable parallel call graph:
   debtmap analyze --no-parallel
3. Analyze subdirectories separately:
   # Process codebase in chunks
   debtmap analyze src/module1
   debtmap analyze src/module2
Inconsistent Results Between Runs
Possible cause: Non-deterministic parallel aggregation (rare)
Solution:
# Use single-threaded mode
debtmap analyze . --no-parallel
If results differ, report as a bug.
Debugging Concurrency Issues
If you suspect a concurrency bug:
1. Run sequentially to isolate:
   debtmap analyze --no-parallel
2. Use deterministic mode:
   # Single-threaded = deterministic order
   debtmap analyze --jobs 1
3. Enable verbose logging:
   debtmap analyze -vvv --no-parallel > debug.log 2>&1
4. Report the issue: If behavior differs between --no-parallel and parallel mode, please report it with:
   - Command used
   - Platform (OS, CPU core count)
   - Debtmap version
   - Minimal reproduction case
Thread Contention Warning
If you see warnings about thread contention:
WARN - High contention detected on parallel call graph
This indicates too many threads competing for locks. Try:
# Reduce thread count
debtmap analyze --jobs 4
See Also
- CLI Reference - Performance - Complete flag documentation
- Configuration - Project-specific settings
- Troubleshooting - General troubleshooting guide
- Troubleshooting - Slow Analysis - Performance debugging guide
- Troubleshooting - Out of Memory Errors - Memory optimization tips
- FAQ - Reducing Parallelism - Common questions about parallel processing
- Architecture - High-level system design
Summary
Debtmap’s parallel processing architecture provides:
- 10-100x speedup over sequential analysis using Rayon parallel iterators
- Lock-free concurrency with DashMap for minimal contention
- Flexible configuration via --jobs and --no-parallel flags
- Automatic thread pool tuning that respects system resources
- Production-grade reliability with atomic progress tracking and concurrent merging
The three-phase parallel pipeline (parse → extract → analyze) maximizes parallelism while maintaining correctness through carefully designed concurrent data structures.
Prodigy Integration
Debtmap integrates with Prodigy to provide fully automated technical debt reduction through AI-driven workflows. This chapter explains how to set up and use Prodigy workflows to automatically refactor code, add tests, and improve codebase quality.
Prerequisites Checklist
Before using Prodigy with Debtmap, ensure you have:
- Rust 1.70 or later installed
- Debtmap installed (cargo install debtmap)
- Prodigy installed (cargo install --git https://github.com/iepathos/prodigy prodigy)
- Anthropic API key for Claude access
- Git (for worktree management)
- Optional: just command runner (a command runner like make), or use direct cargo commands as alternatives
What is Prodigy?
Note: Prodigy is a separate open-source tool (https://github.com/iepathos/prodigy). You need to install both Debtmap and Prodigy to use this integration.
Prodigy is an AI-powered workflow automation system that uses Claude to execute complex multi-step tasks. When integrated with Debtmap, it can:
- Automatically refactor high-complexity functions identified by Debtmap
- Add unit tests for untested code
- Fix code duplication by extracting shared logic
- Improve code organization by addressing architectural issues
- Validate improvements with automated testing
All changes are made in isolated git worktrees, validated with tests and linting, and only committed if all checks pass.
Benefits
Automated Debt Reduction
Instead of manually addressing each technical debt item, Prodigy can:
- Analyze Debtmap’s output
- Select high-priority items
- Generate refactoring plans
- Execute refactorings automatically
- Validate with tests
- Commit clean changes
Iterative Improvement
Prodigy supports iterative workflows:
- Run analysis → fix top items → re-analyze → fix more
- Configurable iteration count (default: 5 iterations)
- Each iteration focuses on highest-priority remaining items
Safe Experimentation
All changes happen in isolated git worktrees:
- Original branch remains untouched
- Failed attempts don’t affect main codebase
- Easy to review before merging
- Automatic cleanup after workflow
Prerequisites
Install Prodigy
# Install Prodigy from GitHub repository
cargo install --git https://github.com/iepathos/prodigy prodigy
# Verify installation
prodigy --version
Note: Currently, Prodigy must be installed from GitHub. Check the Prodigy repository for the latest installation instructions.
Requirements:
- Rust 1.70 or later
- Git (for worktree management)
- Anthropic API key for Claude access
Configure Claude API
# Set Claude API key
export ANTHROPIC_API_KEY="your-api-key-here"
# Or in ~/.prodigy/config.toml:
[api]
anthropic_key = "your-api-key-here"
Ensure Debtmap is Installed
# Install Debtmap
cargo install debtmap
# Verify installation
debtmap --version
Quick Start
1. Initialize Workflow
Create a workflow file workflows/debtmap.yml:
# Sequential workflow: fix the top technical debt item
# Phase 1: Generate coverage data
- shell: "just coverage-lcov" # or: cargo tarpaulin --out lcov --output-dir target/coverage
# Phase 2: Analyze tech debt and capture baseline
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Phase 3: Create implementation plan (PLANNING PHASE)
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
capture_output: true
validate:
commands:
- claude: "/prodigy-validate-debtmap-plan --before .prodigy/debtmap-before.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/plan-validation.json"
result_file: ".prodigy/plan-validation.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
max_attempts: 3
fail_workflow: false
# Phase 4: Execute the plan (IMPLEMENTATION PHASE)
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
validate:
commands:
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
- shell: "debtmap validate-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
result_file: ".prodigy/debtmap-validation.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-complete-debtmap-fix --plan .prodigy/IMPLEMENTATION_PLAN.md --validation .prodigy/debtmap-validation.json --attempt ${validation.attempt_number}"
commit_required: true
- shell: "just coverage-lcov"
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
max_attempts: 5
fail_workflow: true
# Phase 5: Run tests with automatic fixing
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
# Phase 6: Run linting and formatting
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
fail_workflow: true
Note about just: This example uses just (a command runner like make). If you don't have a justfile, replace just coverage-lcov with cargo tarpaulin --out lcov --output-dir target/coverage, just test with cargo test, and just fmt-check && just lint with cargo fmt --check && cargo clippy -- -D warnings.
2. Run Workflow
# Run with auto-confirm, 5 iterations
prodigy run workflows/debtmap.yml -yn 5
# Run with custom iteration count
prodigy run workflows/debtmap.yml -yn 10
# Run single iteration for testing
prodigy run workflows/debtmap.yml -yn 1
Command Flags:
- -y (--yes) - Auto-confirm workflow steps (skip prompts)
- -n 5 (--max-iterations 5) - Run workflow for up to 5 iterations
Note: Worktrees are managed separately via the prodigy worktree command. In MapReduce mode, Prodigy automatically creates isolated worktrees for each parallel agent.
3. Review Results
Prodigy creates a detailed report:
📊 WORKFLOW SUMMARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Iterations: 5
Items Fixed: 12
Tests Added: 8
Complexity Reduced: 145 → 78 (-46%)
Coverage Improved: 45% → 72% (+27%)
✅ All validations passed
Useful Prodigy Commands
Beyond prodigy run, several commands help manage workflows and sessions:
Resume Interrupted Workflows
# Resume an interrupted sequential workflow
prodigy resume <SESSION_ID>
# Resume an interrupted MapReduce job
prodigy resume-job <JOB_ID>
# List all sessions to find the SESSION_ID
prodigy sessions
When to use: If a workflow is interrupted (Ctrl-C, system crash, network issues), you can resume from the last checkpoint rather than starting over.
View Checkpoints
# List all available checkpoints
prodigy checkpoints
# List checkpoints for specific session
prodigy checkpoints --session <SESSION_ID>
When to use: To see available restore points for interrupted workflows.
Manage Worktrees
# List all Prodigy worktrees
prodigy worktree list
# Clean up old worktrees
prodigy worktree clean
# Remove specific worktree
prodigy worktree remove <SESSION_ID>
When to use: MapReduce workflows create many worktrees. Clean them up periodically to save disk space.
Monitor MapReduce Progress
# View progress of running MapReduce job
prodigy progress <JOB_ID>
# View events and logs from MapReduce job
prodigy events <JOB_ID>
# Filter events by type
prodigy events <JOB_ID> --type agent_started
prodigy events <JOB_ID> --type agent_completed
prodigy events <JOB_ID> --type agent_failed
When to use: Monitor long-running MapReduce jobs to see how many agents have completed, which are still running, and which have failed.
Manage Dead Letter Queue
# View failed MapReduce items in DLQ
prodigy dlq list <JOB_ID>
# Retry failed items from DLQ
prodigy dlq retry <JOB_ID> <ITEM_ID>
# Remove items from DLQ
prodigy dlq remove <JOB_ID> <ITEM_ID>
When to use: When some MapReduce agents fail, their items go to the Dead Letter Queue. You can retry them individually or investigate why they failed.
Session Management
# List all workflow sessions
prodigy sessions
# Clean up old sessions
prodigy clean
When to use: View history of workflow runs and clean up old data.
Workflow Configuration
Prodigy workflows are defined as YAML lists of steps. Each step can be either a shell command or a claude slash command.
Workflow Step Types
Shell Commands
Execute shell commands directly:
# Simple shell command
- shell: "cargo test"
# With timeout (in seconds)
- shell: "just coverage-lcov"
timeout: 900 # 15 minutes
# With error handling
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
Shell Command Fields:
- shell: Command to execute (string)
- timeout: Maximum execution time in seconds (optional)
- on_failure: Error handler configuration (optional)
  - claude: Slash command to run on failure
  - max_attempts: Maximum retry attempts
  - fail_workflow: If true, fail entire workflow after max attempts
Claude Commands
Execute Claude Code slash commands:
# Simple Claude command
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
# With output capture (makes command output available in ${shell.output})
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
capture_output: true
# With commit requirement (workflow fails if no git commit made)
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
# With timeout and validation
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
timeout: 1800 # 30 minutes
validate:
commands:
- shell: "cargo test"
result_file: ".prodigy/validation.json"
threshold: 75
Claude Command Fields:
- claude: Slash command to execute (string)
- capture_output: If true, command output is available in ${shell.output} variable (optional)
- commit_required: If true, workflow fails if command doesn't create a git commit (optional)
- timeout: Maximum execution time in seconds (optional)
- validate: Validation configuration (optional, see Step-Level Validation below)
Step-Level Validation
Steps can include validation that must pass:
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
validate:
commands:
- shell: "cargo test"
- shell: "cargo clippy -- -D warnings"
result_file: ".prodigy/validation.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-complete-debtmap-fix --plan .prodigy/IMPLEMENTATION_PLAN.md --validation .prodigy/debtmap-validation.json --attempt ${validation.attempt_number}"
commit_required: true
max_attempts: 5
fail_workflow: true
Validation Options:
- commands: List of commands to run for validation
- result_file: JSON file containing validation results
- threshold: Minimum score (0-100) required to pass
- on_incomplete: Actions to take if validation score < threshold
- max_attempts: Maximum retry attempts
- fail_workflow: Whether to fail entire workflow if validation never passes
Error Handling
Use on_failure to handle command failures:
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
fail_workflow: true
Error Handling Options:
- claude: Slash command to fix the failure
- max_attempts: Maximum fix attempts
- fail_workflow: If true, workflow fails after max_attempts; if false, continues to next step
Coverage Integration
Generate and use coverage data in workflows. See Coverage Integration for details on generating LCOV files and understanding coverage metrics.
# Generate coverage
- shell: "just coverage-lcov"
# Use coverage in analysis
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
Claude Slash Commands
Important: The slash commands documented below are custom commands provided in Debtmap's .claude/commands/ directory. They are included in the Debtmap repository as working examples. You can use them as-is or create your own based on these patterns.
Note on workflow styles: The sequential workflow (workflows/debtmap.yml) uses shell: commands directly, while the MapReduce workflow (workflows/debtmap-reduce.yml) uses claude: wrapper commands for some operations like validate-improvement. Both approaches are valid - use whichever fits your workflow style.
Prodigy workflows use Claude Code slash commands to perform analysis, planning, and implementation. The key commands used in the debtmap workflow are:
Planning Commands
/prodigy-debtmap-plan
Creates an implementation plan for the top priority debt item.
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
capture_output: true
Parameters:
- --before: Path to debtmap analysis JSON file
- --output: Path to write implementation plan
/prodigy-validate-debtmap-plan
Validates that the implementation plan is complete and addresses the debt item.
- claude: "/prodigy-validate-debtmap-plan --before .prodigy/debtmap-before.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/plan-validation.json"
Parameters:
- --before: Original debtmap analysis
- --plan: Implementation plan to validate
- --output: Validation results JSON (with score 0-100)
/prodigy-revise-debtmap-plan
Revises an incomplete plan based on validation gaps.
- claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
Parameters:
- --gaps: List of missing items from validation
- --plan: Plan file to update
Implementation Commands
/prodigy-debtmap-implement
Executes the implementation plan.
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
Parameters:
- --plan: Path to implementation plan
/prodigy-validate-debtmap-improvement
Validates that the implementation successfully addressed the debt item.
Note: The sequential workflow (workflows/debtmap.yml:31) uses the debtmap validate-improvement shell command directly, while the MapReduce workflow (workflows/debtmap-reduce.yml:44) uses the Claude slash command wrapper /prodigy-validate-debtmap-improvement. Both approaches are valid.
# Standard approach (used in actual workflows)
- shell: "debtmap validate-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
Parameters:
- --comparison: Debtmap comparison results (before vs after)
- --output: Validation results JSON (with score 0-100)
- --previous-validation: (Optional) Previous validation result for trend tracking
- --threshold: (Optional) Improvement threshold percentage (default: 75.0)
/prodigy-complete-debtmap-fix
Completes a partial fix based on validation results.
- claude: "/prodigy-complete-debtmap-fix --plan .prodigy/IMPLEMENTATION_PLAN.md --validation .prodigy/debtmap-validation.json --attempt ${validation.attempt_number}"
commit_required: true
Parameters:
- --plan: Path to implementation plan file
- --validation: Path to validation JSON file with completion status
- --attempt: Current retry attempt number (from ${validation.attempt_number})
Testing and Quality Commands
/prodigy-debug-test-failure
Automatically fixes failing tests.
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
Parameters:
- --output: Test failure output from shell command
/prodigy-lint
Fixes linting and formatting issues.
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
Parameters:
- Shell output with linting errors
Target Selection
Target selection happens through the debtmap analysis and slash commands, not through workflow configuration:
How Targets Are Selected
- Debtmap analyzes the codebase and scores all items by complexity, coverage, and risk
- Planning command (/prodigy-debtmap-plan) selects the highest priority item
- Implementation command (/prodigy-debtmap-implement) fixes that specific item
- Next iteration re-analyzes and selects the next highest priority item
Factors in Prioritization
- Complexity score: Functions with cyclomatic complexity > 10
- Coverage percentage: Lower coverage increases priority
- Risk score: Complexity × (100 - coverage%)
- Debt type: Complexity, TestGap, Duplication, GodObject, DeepNesting
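A quick sketch of the risk formula above (illustrative only; Debtmap's unified score combines additional signals):
// Risk score: complexity weighted by missing coverage
fn risk_score(complexity: u32, coverage_pct: f64) -> f64 {
    complexity as f64 * (100.0 - coverage_pct)
}
// A complexity-22 function with 0% coverage scores 2200.0;
// at 85% coverage the same function drops to 330.0.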
Customizing Target Selection
To focus on specific debt types or modules, modify the slash commands or create custom commands in .claude/commands/.
MapReduce Workflows
Prodigy supports MapReduce workflows for processing multiple items in parallel. This is powerful for large-scale refactoring where you want to fix many debt items simultaneously.
When to Use MapReduce
- Processing multiple independent debt items simultaneously (e.g., refactor 10 high-complexity functions in parallel)
- Applying the same fix pattern across many files
- Large-scale codebase cleanup tasks
- Situations where sequential iteration would be too slow
MapReduce vs Sequential Workflows
Sequential Workflow (-n 5):
- Runs entire workflow N times in sequence
- Fixes one item per iteration
- Each iteration re-analyzes the codebase
- Total time: N × workflow_duration
MapReduce Workflow:
- Processes multiple items in parallel in a single run
- Setup phase runs once
- Map phase spawns N parallel agents (each in isolated worktree)
- Reduce phase aggregates results
- Total time: setup + max(map_agent_durations) + reduce
Complete MapReduce Example
Create workflows/debtmap-reduce.yml:
name: debtmap-parallel-elimination
mode: mapreduce
# Setup phase: Analyze the codebase and generate debt items
setup:
timeout: 900 # 15 minutes for coverage generation
commands:
# Generate coverage data with tarpaulin
- shell: "just coverage-lcov"
# Run debtmap with coverage data to establish baseline
- shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Map phase: Process each debt item in parallel with planning and validation
map:
# Input configuration - debtmap-before.json contains items array
input: .prodigy/debtmap-before.json
json_path: "$.items[*]"
# Commands to execute for each debt item
agent_template:
# Phase 1: Create implementation plan
- claude: "/prodigy-debtmap-plan --item '${item}' --output .prodigy/plan-${item_id}.md"
capture_output: true
validate:
commands:
- claude: "/prodigy-validate-debtmap-plan --item '${item}' --plan .prodigy/plan-${item_id}.md --output .prodigy/validation-${item_id}.json"
result_file: ".prodigy/validation-${item_id}.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/plan-${item_id}.md"
max_attempts: 3
fail_workflow: false
# Phase 2: Execute the plan
- claude: "/prodigy-debtmap-implement --plan .prodigy/plan-${item_id}.md"
commit_required: true
validate:
commands:
- shell: "just coverage-lcov"
- shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after-${item_id}.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after-${item_id}.json --plan .prodigy/plan-${item_id}.md --output .prodigy/comparison-${item_id}.json --format json"
- shell: "debtmap validate-improvement --comparison .prodigy/comparison-${item_id}.json --output .prodigy/debtmap-validation-${item_id}.json"
result_file: ".prodigy/debtmap-validation-${item_id}.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-complete-debtmap-fix --plan .prodigy/plan-${item_id}.md --validation .prodigy/debtmap-validation-${item_id}.json --attempt ${validation.attempt_number}"
commit_required: true
- shell: "just coverage-lcov"
- shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after-${item_id}.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after-${item_id}.json --plan .prodigy/plan-${item_id}.md --output .prodigy/comparison-${item_id}.json --format json"
max_attempts: 5
fail_workflow: true
# Phase 3: Verify tests pass
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
# Phase 4: Check formatting and linting
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
fail_workflow: true
# Parallelization settings
max_parallel: 5 # Run up to 5 agents in parallel
# Filter and sort items
# Note: NULLS LAST ensures File items (with null Function.unified_score.final_score)
# and Function items (with null File.score) sort correctly
filter: "File.score >= 10 OR Function.unified_score.final_score >= 10"
sort_by: "File.score DESC NULLS LAST, Function.unified_score.final_score DESC NULLS LAST"
max_items: 10 # Limit to 10 items per run
# Reduce phase: Aggregate results and verify overall improvements
reduce:
# Phase 1: Run final tests across all changes
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
# Phase 2: Check formatting and linting
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
fail_workflow: true
# Phase 3: Re-run debtmap to measure cumulative improvements
- shell: "just coverage-lcov"
- shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
# Phase 4: Create final commit with summary
- write_file:
path: ".prodigy/map-results.json"
content: "${map.results}"
format: json
create_dirs: true
- claude: |
/prodigy-compare-debt-results \
--before .prodigy/debtmap-before.json \
--after .prodigy/debtmap-after.json \
--map-results-file .prodigy/map-results.json \
--successful ${map.successful} \
--failed ${map.failed} \
--total ${map.total}
commit_required: true
Running MapReduce Workflows
# Run MapReduce workflow (single execution processes multiple items in parallel)
prodigy run workflows/debtmap-reduce.yml
# Run with auto-confirm
prodigy run workflows/debtmap-reduce.yml -y
Note: MapReduce workflows don’t typically use -n for iterations. Instead, they process multiple items in a single run through parallel map agents.
MapReduce Configuration Options
Top-Level Fields
- name: Workflow name (string)
- mode: mapreduce: Enables MapReduce mode (required)
- setup: Commands to run once before map phase
- map: Map phase configuration
- reduce: Commands to run after all map agents complete
Setup Phase Fields
- timeout: Maximum time in seconds for setup phase
- commands: List of shell or claude commands to run
Map Phase Fields
- input: Path to JSON file containing items to process
- json_path: JSONPath expression to extract items array (e.g., $.items[*])
- agent_template: List of commands to run for each item (each item gets its own agent in an isolated worktree)
- max_parallel: Maximum number of agents to run concurrently
- filter: Expression to filter which items to process (e.g., "score >= 10")
- sort_by: Expression to sort items (e.g., "score DESC")
- max_items: Limit total items processed
MapReduce-Specific Variables
| Variable | Available In | Type | Description |
|---|---|---|---|
| ${item} | map phase | JSON | The full JSON object for current item (File or Function type) |
| ${item_id} | map phase | string | Unique ID for current item (auto-generated) |
| ${validation.gaps} | map phase | array | List of validation gaps from failed validation |
| ${validation.attempt_number} | map phase | number | Current retry attempt number (1, 2, 3, etc.) |
| ${shell.output} | both phases | string | Output from previous shell command |
| ${map.results} | reduce phase | array | All map agent results as JSON |
| ${map.successful} | reduce phase | number | Count of successful map agents |
| ${map.failed} | reduce phase | number | Count of failed map agents |
| ${map.total} | reduce phase | number | Total number of map agents |
Understanding ${item} Structure:
The ${item} variable contains different fields depending on whether it’s a File or Function debt item:
- File items: Have a File.score field (non-null); Function.unified_score.final_score is null
- Function items: Have a Function.unified_score.final_score field (non-null); File.score is null
This distinction matters when filtering and sorting items in MapReduce workflows. See the filter/sort_by examples below for proper handling of both types.
MapReduce Architecture
┌─────────────────────────────────────────────────────────┐
│ Setup Phase (main worktree) │
│ - Generate coverage data │
│ - Run debtmap analysis │
│ - Output: .prodigy/debtmap-before.json │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Map Phase (parallel worktrees) │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Agent 1 │ │ Agent 2 │ │ Agent 3 │ │
│ │ Item #1 │ │ Item #2 │ │ Item #3 │ │
│ │ Worktree A │ │ Worktree B │ │ Worktree C │ │
│ │ │ │ │ │ │ │
│ │ Plan → Fix │ │ Plan → Fix │ │ Plan → Fix │ │
│ │ → Validate │ │ → Validate │ │ → Validate │ │
│ │ → Test │ │ → Test │ │ → Test │ │
│ │ → Commit │ │ → Commit │ │ → Commit │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Agent 4 │ │ Agent 5 │ │
│ │ Item #4 │ │ Item #5 │ │
│ │ Worktree D │ │ Worktree E │ │
│ │ │ │ │ │
│ │ Plan → Fix │ │ Plan → Fix │ │
│ │ → Validate │ │ → Validate │ │
│ │ → Test │ │ → Test │ │
│ │ → Commit │ │ → Commit │ │
│ └──────────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Reduce Phase (main worktree) │
│ - Merge all agent worktrees │
│ - Run final tests on merged code │
│ - Run final linting │
│ - Re-analyze with debtmap │
│ - Generate summary commit │
└─────────────────────────────────────────────────────────┘
Key Concepts:
- Isolation: Each map agent works in its own git worktree
- Parallelism: Multiple agents process different items simultaneously
- Validation: Each agent validates its changes independently
- Merging: Reduce phase merges all successful agent worktrees
- Final Validation: Reduce phase ensures merged code passes all tests
Iteration Strategy
How Iterations Work
When you run prodigy run workflows/debtmap.yml -yn 5, the workflow executes up to 5 times:
1. Iteration 1:
   - Analyze codebase with debtmap
   - Select highest priority item
   - Create implementation plan
   - Execute plan and validate
   - Run tests and linting
2. Iteration 2:
   - Re-analyze codebase (scores updated based on Iteration 1 changes)
   - Select next highest priority item
   - Repeat plan/implement/validate cycle
3. Continue until the iteration limit is reached or the workflow completes without finding issues
Controlling Iterations
Iterations are controlled via the -n flag:
# Single iteration (testing)
prodigy run workflows/debtmap.yml -yn 1
# Standard run (5 iterations)
prodigy run workflows/debtmap.yml -yn 5
# Deep cleanup (10+ iterations)
prodigy run workflows/debtmap.yml -yn 20
What Happens Each Iteration
Each iteration runs the entire workflow from start to finish:
- Generate coverage data
- Analyze technical debt
- Create implementation plan
- Execute plan
- Validate improvement
- Run tests (with auto-fixing)
- Run linting (with auto-fixing)
The workflow continues to the next iteration automatically if all steps succeed.
Example Output
Iteration 1:
- Fixed: parse_expression() (9.2 → 5.1)
- Fixed: calculate_score() (8.8 → 4.2)
- Fixed: apply_weights() (8.5 → 5.8)
✓ Tests pass
Iteration 2:
- Fixed: normalize_results() (7.5 → 3.9)
- Fixed: aggregate_data() (7.2 → 4.1)
✓ Tests pass
Iteration 3:
- No items above threshold (6.0)
✓ Early stop
Final Results:
Items fixed: 5
Average complexity: 15.2 → 8.6
Validation
Prodigy validates changes at the workflow step level, not as a standalone configuration.
Step-Level Validation
Validation is attached to specific workflow steps:
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
commit_required: true
validate:
commands:
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
- shell: "debtmap validate-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
result_file: ".prodigy/debtmap-validation.json"
threshold: 75
on_incomplete:
commands:
- claude: "/prodigy-complete-debtmap-fix --plan .prodigy/IMPLEMENTATION_PLAN.md --validation .prodigy/debtmap-validation.json --attempt ${validation.attempt_number}"
commit_required: true
- shell: "just coverage-lcov"
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
- shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
max_attempts: 5
fail_workflow: true
Validation Process
- Commands run: Execute validation commands (shell or claude)
- Check result file: Read JSON file specified in result_file
- Compare to threshold: Score must be >= threshold (0-100 scale)
- On incomplete: If score < threshold, run on_incomplete commands
- Retry: Repeat up to max_attempts times
- Fail or continue: If fail_workflow: true, stop workflow; otherwise continue
Validation Result Format
The result_file JSON should contain:
{
"score": 85,
"passed": true,
"gaps": [],
"details": "All debt improvement criteria met"
}
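If you script your own validation step, the result file is easy to emit. Below is a minimal sketch in Rust, assuming the serde_json crate; the score and gaps are placeholders for values your validator would compute.
use serde_json::json;
use std::fs;

fn main() -> std::io::Result<()> {
    // Placeholder values; a real validator derives these from its checks
    let result = json!({
        "score": 85,
        "passed": true,
        "gaps": [],
        "details": "All debt improvement criteria met"
    });
    fs::write(".prodigy/validation.json", result.to_string())?;
    Ok(())
}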
Test Validation with Auto-Fix
Tests are validated with automatic fixing on failure:
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
If tests fail, Prodigy automatically attempts to fix them up to 5 times before failing the workflow.
Output and Metrics
Workflow Report
{
"workflow": "debtmap-debt-reduction",
"iterations": 5,
"items_processed": 12,
"items_fixed": 10,
"items_failed": 2,
"metrics": {
"complexity_before": 145,
"complexity_after": 78,
"complexity_reduction": -46.2,
"coverage_before": 45.3,
"coverage_after": 72.1,
"coverage_improvement": 26.8
},
"changes": [
{
"file": "src/parser.rs",
"function": "parse_expression",
"before_score": 9.2,
"after_score": 5.1,
"improvements": ["Reduced complexity", "Added tests"]
}
]
}
Commit Messages
Prodigy generates descriptive commit messages:
refactor(parser): reduce complexity in parse_expression
- Extract nested conditionals to helper functions
- Add unit tests for edge cases
- Coverage: 0% → 85%
- Complexity: 22 → 8
Generated by Prodigy workflow: debtmap-debt-reduction
Iteration: 1/5
Integration with CI/CD
GitHub Actions
name: Prodigy Debt Reduction
on:
schedule:
- cron: '0 0 * * 0' # Weekly on Sunday
workflow_dispatch:
jobs:
reduce-debt:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Rust
uses: actions-rs/toolchain@v1
with:
toolchain: stable
- name: Install Prodigy
run: cargo install --git https://github.com/iepathos/prodigy prodigy
- name: Install dependencies
run: |
cargo install debtmap
cargo install just
- name: Run Prodigy workflow
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
run: prodigy run workflows/debtmap.yml -yn 5
- name: Create PR
uses: peter-evans/create-pull-request@v5
with:
title: "chore: automated debt reduction via Prodigy"
body: |
Automated technical debt reduction using Prodigy workflow.
This PR was generated by the weekly debt reduction workflow.
Review changes carefully before merging.
branch: prodigy-debt-reduction
GitLab CI
prodigy-debt-reduction:
stage: quality
rules:
- if: '$CI_PIPELINE_SOURCE == "schedule"'
script:
- cargo install --git https://github.com/iepathos/prodigy prodigy
- cargo install debtmap
- cargo install just
- prodigy run workflows/debtmap.yml -yn 5
artifacts:
paths:
- .prodigy/debtmap-*.json
- .prodigy/comparison.json
Important CI Considerations
- API Keys: Store ANTHROPIC_API_KEY as a secret
- Worktrees: MapReduce mode creates isolated worktrees automatically for parallel processing
- Dependencies: Install prodigy, debtmap, and just (or your build tool)
- Timeout: CI jobs may need extended timeout for multiple iterations
- Review: Always create a PR for human review before merging automated changes
Best Practices
1. Start Small
Begin with low iteration counts:
# First run: 1 iteration to test workflow
prodigy run workflows/debtmap.yml -yn 1
# Standard run: 3-5 iterations
prodigy run workflows/debtmap.yml -yn 5
2. Focus on High-Priority Items
The debtmap analysis automatically prioritizes by:
- Complexity score (cyclomatic complexity)
- Coverage percentage (lower coverage = higher priority)
- Risk score (complexity × (100 - coverage%))
To focus on specific areas, create custom slash commands in .claude/commands/ that filter by:
- Module/file patterns
- Specific debt types (Complexity, TestGap, Duplication)
- Score thresholds
3. Validate Thoroughly
Use comprehensive validation in your workflow:
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
- shell: "just fmt-check && just lint"
on_failure:
claude: "/prodigy-lint ${shell.output}"
max_attempts: 5
fail_workflow: true
4. Review Before Merging
Always review Prodigy’s changes:
# Find your worktree
ls ~/.prodigy/worktrees/
# Check changes
cd ~/.prodigy/worktrees/session-xxx
git diff main
# Review commit history
git log --oneline
# Run full test suite
cargo test --all-features
5. Monitor Progress
Track debt reduction over iterations:
# Compare before and after
debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json
# View detailed metrics
cat .prodigy/comparison.json | jq
Troubleshooting
Workflow Fails to Start
Issue: “Prodigy not found” or “API key missing”
Solution:
# Install Prodigy
cargo install --git https://github.com/iepathos/prodigy prodigy
# Set API key
export ANTHROPIC_API_KEY="your-key"
# Verify installation
prodigy --version
Validation Failures
Issue: Validation score below threshold
Solution: Check validation results:
# View validation details
cat .prodigy/debtmap-validation.json
# Check what gaps remain
cat .prodigy/debtmap-validation.json | jq '.gaps'
# Review comparison results
cat .prodigy/comparison.json
The workflow will automatically retry up to max_attempts times with /prodigy-complete-debtmap-fix.
Test Failures
Issue: Tests fail after implementation
Solution: The workflow includes automatic test fixing:
- shell: "just test"
on_failure:
claude: "/prodigy-debug-test-failure --output ${shell.output}"
max_attempts: 5
fail_workflow: true
If tests still fail after 5 attempts, review manually:
# Check test output
just test
# Review recent changes
git diff HEAD~1
No Items Processed
Issue: Workflow completes but doesn’t find debt to fix
Possible Causes:
- Codebase has very low debt scores (below selection threshold)
- Coverage data not generated properly
- Debtmap analysis found no high-priority items
Solution:
# Check debtmap analysis results
cat .prodigy/debtmap-before.json | jq '.items | sort_by(-.unified_score.final_score) | .[0:5]'
# Verify coverage was generated
ls -lh target/coverage/lcov.info
# Run debtmap manually to see what's detected
debtmap analyze . --lcov target/coverage/lcov.info
Workflow Hangs or Times Out
Issue: Workflow takes too long or appears stuck
Possible Causes:
- Large codebase with many files
- Complex refactoring requiring extensive analysis
- Network issues with Claude API
Solution:
- Reduce iteration count for testing (-n 1)
- Check Claude API connectivity
- Monitor worktree for progress: cd ~/.prodigy/worktrees/session-xxx && git log
MapReduce-Specific Troubleshooting
Resuming Failed MapReduce Jobs
Issue: MapReduce job was interrupted or failed
Solution:
# Find the job ID from recent sessions
prodigy sessions
# Resume the MapReduce job from checkpoint
prodigy resume-job <JOB_ID>
The job will resume from where it left off, skipping already-completed items.
Checking MapReduce Progress
Issue: Want to monitor long-running MapReduce job
Solution:
# View overall progress
prodigy progress <JOB_ID>
# View detailed events
prodigy events <JOB_ID>
# Filter for specific event types
prodigy events <JOB_ID> --type agent_completed
prodigy events <JOB_ID> --type agent_failed
Output example:
MapReduce Job: job-abc123
Status: running
Progress: 7/10 items (70%)
- Completed: 5
- Running: 2
- Failed: 3
Managing Failed MapReduce Items
Issue: Some agents failed, items in Dead Letter Queue
Solution:
# View failed items
prodigy dlq list <JOB_ID>
# Review why an item failed (check events)
prodigy events <JOB_ID> --item <ITEM_ID>
# Retry specific failed item
prodigy dlq retry <JOB_ID> <ITEM_ID>
# Remove unfixable items from DLQ
prodigy dlq remove <JOB_ID> <ITEM_ID>
Common failure reasons:
- Validation threshold not met after max_attempts
- Tests fail and can’t be fixed automatically
- Merge conflicts with other agents’ changes
- Timeout exceeded for complex refactoring
Cleaning Up MapReduce Worktrees
Issue: Disk space consumed by many MapReduce worktrees
Solution:
# List all worktrees
prodigy worktree list
# Clean up completed job worktrees
prodigy worktree clean
# Remove specific session's worktrees
prodigy worktree remove <SESSION_ID>
# Manual cleanup (if Prodigy commands don't work)
rm -rf ~/.prodigy/worktrees/session-xxx
When to clean:
- After successful job completion and merge
- When disk space is low
- After abandoned or failed jobs
MapReduce Merge Conflicts
Issue: Reduce phase fails due to merge conflicts between agent worktrees
Possible Causes:
- Multiple agents modified overlapping code
- Agents made conflicting architectural changes
- Shared dependencies updated differently
Solution:
# Review which agents succeeded
prodigy events <JOB_ID> --type agent_completed
# Check merge conflicts
cd ~/.prodigy/worktrees/session-xxx
git status
# Manually resolve conflicts
# Edit conflicting files
git add .
git commit -m "Resolve MapReduce merge conflicts"
# Resume the job
prodigy resume-job <JOB_ID>
Prevention:
- Use filter to ensure agents work on independent items
- Reduce max_parallel to minimize conflicts
- Design debt items to be truly independent
Understanding MapReduce Variables
If you’re debugging workflow files, these variables are available:
In map phase (agent_template):
- ${item}: Full JSON of current item being processed
- ${item_id}: Unique ID for current item
- ${validation.gaps}: Validation gaps from validation result
- ${validation.attempt_number}: Current retry attempt (1, 2, 3…)
- ${shell.output}: Output from previous shell command
In reduce phase:
- ${map.results}: All map agent results as JSON
- ${map.successful}: Count of successful agents
- ${map.failed}: Count of failed agents
- ${map.total}: Total number of agents
Example debug command:
# In agent_template, log the item being processed
- shell: "echo 'Processing item: ${item_id}' >> .prodigy/debug.log"
Example Workflows
Full Repository Cleanup
For comprehensive debt reduction, use a higher iteration count:
# Run 10 iterations for deeper cleanup
prodigy run workflows/debtmap.yml -yn 10
# Run 20 iterations for major refactoring
prodigy run workflows/debtmap.yml -yn 20
The workflow automatically:
- Selects highest priority items each iteration
- Addresses different debt types (Complexity, TestGap, Duplication)
- Validates all changes with tests and linting
- Commits only successful improvements
Custom Workflow for Specific Focus
Create a custom workflow file for focused improvements:
workflows/add-tests.yml - Focus on test coverage:
# Generate coverage
- shell: "just coverage-lcov"
# Analyze with focus on test gaps
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Create plan (slash command will prioritize TestGap items)
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
# ... rest of standard workflow steps
Run with:
prodigy run workflows/add-tests.yml -yn 5
Targeted Module Cleanup
Create a custom slash command to focus on specific modules:
.claude/commands/refactor-module.md:
# /refactor-module
Refactor the highest complexity item in the specified module.
Arguments: --module <module_name>
... implementation details ...
Then create a workflow using this command for targeted refactoring.
See Also
- Debtmap CLI Reference - All Debtmap command options including analyze, compare, and validate
- Coverage Integration - Generating and using LCOV coverage data with Debtmap
- Configuration - Debtmap configuration file options
- Tiered Prioritization - Understanding how Debtmap scores and prioritizes debt items
- Prodigy Documentation - Full Prodigy reference and advanced features
Responsibility Analysis
Overview
Responsibility analysis is a core feature of Debtmap that helps identify violations of the Single Responsibility Principle (SRP), one of the fundamental SOLID design principles. By analyzing function and method names, Debtmap automatically infers the distinct functional responsibilities within a code unit and detects when a single module, struct, or class has taken on too many concerns.
This chapter provides an in-depth look at how Debtmap determines responsibilities, categorizes them, and uses this information to guide refactoring decisions.
What Are Responsibilities?
In the context of Debtmap, a responsibility is a distinct functional domain or concern that a code unit handles. Examples include:
- Data Access - Getting and setting values from data structures
- Validation - Checking inputs, verifying constraints, ensuring correctness
- Persistence - Saving and loading data to/from storage
- Computation - Performing calculations and transformations
- Communication - Sending and receiving messages or events
According to the Single Responsibility Principle, each module should have one and only one reason to change. When a module handles multiple unrelated responsibilities (e.g., validation, persistence, AND computation), it becomes:
- Harder to understand - Developers must mentally juggle multiple concerns
- More fragile - Changes to one responsibility can break others
- Difficult to test - Testing requires complex setup across multiple domains
- Prone to coupling - Dependencies from different domains become entangled
Debtmap’s responsibility analysis automatically identifies these violations and provides concrete recommendations for splitting modules along responsibility boundaries.
How Responsibilities Are Detected
Pattern-Based Inference
Debtmap uses prefix-based pattern matching to infer responsibilities from function and method names. This approach is both simple and effective because well-named functions naturally express their intent through conventional prefixes.
Implementation Location: src/organization/god_object_analysis.rs:316-386
The infer_responsibility_from_method() function performs case-insensitive prefix matching:
#![allow(unused)]
fn main() {
pub fn infer_responsibility_from_method(method_name: &str) -> String {
let lower_name = method_name.to_lowercase();
if lower_name.starts_with("format_") || lower_name.starts_with("render_") {
return "Formatting & Output".to_string();
}
if lower_name.starts_with("parse_") || lower_name.starts_with("read_") {
return "Parsing & Input".to_string();
}
// ... additional patterns for the remaining categories
"Utilities".to_string() // fallback when no prefix matches
}
}
This approach works across languages (Rust, Python, JavaScript/TypeScript) because naming conventions are relatively consistent in modern codebases.
Responsibility Categories
Debtmap recognizes 11 built-in responsibility categories plus a generic “Utilities” fallback:
| Category | Prefixes | Examples |
|---|---|---|
| Formatting & Output | format_, render_, write_, print_ | format_json(), render_table(), write_report() |
| Parsing & Input | parse_, read_, extract_ | parse_config(), read_file(), extract_fields() |
| Filtering & Selection | filter_, select_, find_ | filter_results(), select_top(), find_item() |
| Transformation | transform_, convert_, map_, apply_ | transform_data(), convert_type(), map_fields() |
| Data Access | get_, set_ | get_value(), set_name() |
| Validation | validate_, check_, verify_, is_* | validate_input(), check_bounds(), is_valid() |
| Computation | calculate_, compute_ | calculate_score(), compute_sum() |
| Construction | create_, build_, new_*, make_ | create_instance(), build_config(), new_user() |
| Persistence | save_, load_, store_ | save_data(), load_cache(), store_result() |
| Processing | process_, handle_ | process_request(), handle_error() |
| Communication | send_, receive_ | send_message(), receive_data() |
| Utilities | (all others) | helper(), do_work(), utility_fn() |
Grouping Methods by Responsibility
Once individual methods are categorized, Debtmap groups them using group_methods_by_responsibility():
Implementation Location: src/organization/god_object_analysis.rs:268-280
#![allow(unused)]
fn main() {
use std::collections::HashMap;
pub fn group_methods_by_responsibility(methods: &[String]) -> HashMap<String, Vec<String>> {
let mut groups: HashMap<String, Vec<String>> = HashMap::new();
for method in methods {
let responsibility = infer_responsibility_from_method(method);
groups.entry(responsibility).or_default().push(method.clone());
}
groups
}
}
Output Structure:
- Keys: Responsibility category names (e.g., “Data Access”, “Validation”)
- Values: Lists of method names belonging to each category
The responsibility count is simply the number of unique keys in this HashMap.
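Counting responsibilities is then just counting keys. A usage sketch of the function above, with categories taken from the prefix table earlier in this chapter:
let methods = vec![
    "get_user".to_string(),       // Data Access
    "validate_email".to_string(), // Validation
    "save_user".to_string(),      // Persistence
];
let groups = group_methods_by_responsibility(&methods);
assert_eq!(groups.len(), 3); // three distinct responsibilities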
Example Analysis
Consider a Rust struct with these methods:
#![allow(unused)]
fn main() {
impl UserManager {
fn get_user(&self, id: UserId) -> Option<User> { }
fn set_password(&mut self, id: UserId, password: &str) { }
fn validate_email(&self, email: &str) -> bool { }
fn validate_password(&self, password: &str) -> bool { }
fn save_user(&self, user: &User) -> Result<()> { }
fn load_user(&self, id: UserId) -> Result<User> { }
fn send_notification(&self, user_id: UserId, msg: &str) { }
fn format_user_profile(&self, user: &User) -> String { }
}
}
Debtmap’s Analysis:
| Method | Inferred Responsibility |
|---|---|
get_user | Data Access |
set_password | Data Access |
validate_email | Validation |
validate_password | Validation |
save_user | Persistence |
load_user | Persistence |
send_notification | Communication |
format_user_profile | Formatting & Output |
Result:
- Responsibility Count: 5 (Data Access, Validation, Persistence, Communication, Formatting)
- Assessment: This violates SRP - UserManager has too many distinct concerns
Responsibility Scoring
Integration with God Object Detection
Responsibility count is a critical factor in God Object Detection. The scoring algorithm includes:
responsibility_factor = min(responsibility_count / 3.0, 3.0)
god_object_score = method_factor × field_factor × responsibility_factor × size_factor
Why divide by 3.0?
- 1-3 responsibilities: Normal, well-scoped module
- 4-6 responsibilities: Warning signs, approaching problematic territory
- 7+ responsibilities: Severe violation, likely a god object
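In isolation, the responsibility factor looks like this (an illustrative sketch of the formula above, not Debtmap's exact code):
// responsibility_factor = min(responsibility_count / 3.0, 3.0)
fn responsibility_factor(responsibility_count: usize) -> f64 {
    (responsibility_count as f64 / 3.0).min(3.0)
}
// 2 responsibilities -> ~0.67 (dampens the god object score)
// 5 responsibilities -> ~1.67 (amplifies it)
// 9 or more -> capped at 3.0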
Language-Specific Thresholds
Different languages have different expectations for responsibility counts:
| Language | Max Responsibilities | Rationale |
|---|---|---|
| Rust | 5 | Strong module system encourages tight boundaries |
| Python | 3 | Duck typing makes mixing concerns more dangerous |
| JavaScript/TypeScript | 3 | Prototype-based, benefits from focused classes |
These thresholds can be customized in .debtmap.toml:
[god_object_detection.rust]
max_traits = 5 # max_traits = max responsibilities
[god_object_detection.python]
max_traits = 3
Confidence Determination
Responsibility count contributes to overall confidence levels:
Implementation Location: src/organization/god_object_analysis.rs:234-266
#![allow(unused)]
fn main() {
pub fn determine_confidence(
method_count: usize,
field_count: usize,
responsibility_count: usize,
lines_of_code: usize,
complexity_sum: u32,
thresholds: &GodObjectThresholds,
) -> GodObjectConfidence {
let mut violations = 0;
if responsibility_count > thresholds.max_traits {
violations += 1;
}
// ... check other metrics
match violations {
5 => GodObjectConfidence::Definite,
3..=4 => GodObjectConfidence::Probable,
1..=2 => GodObjectConfidence::Possible,
_ => GodObjectConfidence::NotGodObject,
}
}
}
Advanced Responsibility Detection
Module-Level Analysis
For large modules without a single dominant struct, Debtmap performs module-level responsibility detection:
Implementation Location: src/organization/god_object_detector.rs:682-697
The classify_responsibility() function provides extended categorization:
#![allow(unused)]
fn main() {
fn classify_responsibility(prefix: &str) -> String {
match prefix {
"get" | "set" => "Data Access",
"calculate" | "compute" => "Computation",
"validate" | "check" | "verify" | "ensure" => "Validation",
"save" | "load" | "store" | "retrieve" | "fetch" => "Persistence",
"create" | "build" | "new" | "make" | "init" => "Construction",
"send" | "receive" | "handle" | "manage" => "Communication",
"update" | "modify" | "change" | "edit" => "Modification",
"delete" | "remove" | "clear" | "reset" => "Deletion",
"is" | "has" | "can" | "should" | "will" => "State Query",
"process" | "transform" => "Processing",
_ => format!("{} Operations", capitalize_first(prefix)),
}
}
}
This extended mapping covers 10 core categories plus dynamic fallback for custom prefixes.
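Usage follows directly from the mapping (a sketch; it assumes capitalize_first uppercases the first letter of the prefix):
assert_eq!(classify_responsibility("fetch"), "Persistence");
assert_eq!(classify_responsibility("is"), "State Query");
// Unrecognized prefixes fall through to the dynamic case:
assert_eq!(classify_responsibility("deploy"), "Deploy Operations");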
Responsibility Groups
The ResponsibilityGroup data structure tracks detailed information about each responsibility:
Implementation Location: src/organization/mod.rs:156-161
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, PartialEq)]
pub struct ResponsibilityGroup {
pub name: String, // e.g., "DataAccessManager"
pub methods: Vec<String>, // Methods in this group
pub fields: Vec<String>, // Associated fields
pub responsibility: String, // e.g., "Data Access"
}
}
This structure enables:
- Refactoring recommendations - Suggest splitting by responsibility group
- Cohesion analysis - Measure how tightly methods are related
- Field-method correlation - Identify which fields belong to which responsibilities
Refactoring Based on Responsibilities
Recommended Module Splits
When Debtmap detects a module with multiple responsibilities, it generates actionable refactoring recommendations using recommend_module_splits():
Implementation Location: src/organization/god_object_detector.rs:165-177
Process:
- Group all methods by their inferred responsibilities
- Create a
ModuleSplitfor each responsibility group - Suggest module names (e.g., “DataAccessManager”, “ValidationManager”)
- Estimate lines of code for each new module
- Order by cohesion (most focused groups first)
Example Output:
Recommended Splits for UserManager:
1. DataAccessManager (5 methods, ~80 lines)
- get_user, set_password, get_email, set_email, update_profile
2. ValidationManager (4 methods, ~60 lines)
- validate_email, validate_password, check_permissions, verify_token
3. PersistenceManager (3 methods, ~50 lines)
- save_user, load_user, delete_user
4. NotificationManager (2 methods, ~30 lines)
- send_notification, send_bulk_notifications
Practical Refactoring Patterns
Pattern 1: Extract Service Classes
Before (God Object):
#![allow(unused)]
fn main() {
struct UserManager {
db: Database,
cache: Cache,
notifier: Notifier,
}
impl UserManager {
fn get_user(&self, id: UserId) -> Option<User> { }
fn validate_email(&self, email: &str) -> bool { }
fn save_user(&self, user: &User) -> Result<()> { }
fn send_notification(&self, id: UserId, msg: &str) { }
}
}
After (Split by Responsibility):
#![allow(unused)]
fn main() {
// Data Access
struct UserRepository {
db: Database,
cache: Cache,
}
// Validation
struct UserValidator;
// Persistence
struct UserPersistence {
db: Database,
}
// Communication
struct NotificationService {
notifier: Notifier,
}
}
Pattern 2: Use Facade for Composition
After splitting, create a facade to coordinate:
#![allow(unused)]
fn main() {
struct UserFacade {
repository: UserRepository,
validator: UserValidator,
persistence: UserPersistence,
notifier: NotificationService,
}
impl UserFacade {
fn register_user(&mut self, user: User) -> Result<()> {
self.validator.validate_email(&user.email)?;
self.persistence.save_user(&user)?;
self.notifier.send_welcome(&user.id)?;
Ok(())
}
}
}
Pattern 3: Trait-Based Separation (Rust)
Use traits to define responsibility boundaries:
#![allow(unused)]
fn main() {
trait DataAccess {
fn get_user(&self, id: UserId) -> Option<User>;
}
trait Validation {
fn validate_email(&self, email: &str) -> bool;
}
trait Persistence {
fn save_user(&self, user: &User) -> Result<()>;
}
// Implement only the needed traits per struct
impl DataAccess for UserRepository { }
impl Validation for UserValidator { }
impl Persistence for UserPersistence { }
}
Data Structures
GodObjectAnalysis
The main result structure includes responsibility information:
Implementation Location: src/organization/god_object_analysis.rs:5-18
#![allow(unused)]
fn main() {
pub struct GodObjectAnalysis {
pub is_god_object: bool,
pub method_count: usize,
pub field_count: usize,
pub responsibility_count: usize, // Number of distinct responsibilities
pub lines_of_code: usize,
pub complexity_sum: u32,
pub god_object_score: f64,
pub recommended_splits: Vec<ModuleSplit>,
pub confidence: GodObjectConfidence,
pub responsibilities: Vec<String>, // List of responsibility names
pub purity_distribution: Option<PurityDistribution>,
}
}
ModuleSplit
Recommendations for splitting modules:
Implementation Location: src/organization/god_object_analysis.rs:40-45
#![allow(unused)]
fn main() {
pub struct ModuleSplit {
pub suggested_name: String, // e.g., "ValidationManager"
pub methods_to_move: Vec<String>, // Methods for this module
pub responsibility: String, // Responsibility category
pub estimated_lines: usize, // Approximate LOC
}
}
Testing Responsibility Detection
Debtmap includes comprehensive tests for responsibility detection:
Implementation Location: src/organization/god_object_analysis.rs:623-838
Test Coverage
Key test cases:
- Prefix recognition - Each of the 11 categories is tested individually
- Case insensitivity - Format_Output and format_output both map to “Formatting & Output”
- Multiple responsibilities - Grouping diverse methods correctly
- Empty input handling - Graceful handling of empty method lists
- Edge cases - Methods without recognized prefixes default to “Utilities”
Example Test:
#![allow(unused)]
fn main() {
#[test]
fn test_multiple_responsibility_groups() {
let methods = vec![
"format_output".to_string(),
"parse_input".to_string(),
"get_value".to_string(),
"validate_data".to_string(),
];
let groups = group_methods_by_responsibility(&methods);
assert_eq!(groups.len(), 4); // 4 distinct responsibilities
assert!(groups.contains_key("Formatting & Output"));
assert!(groups.contains_key("Parsing & Input"));
assert!(groups.contains_key("Data Access"));
assert!(groups.contains_key("Validation"));
}
}
Configuration
TOML Configuration
Customize responsibility thresholds in .debtmap.toml:
[god_object_detection]
enabled = true
[god_object_detection.rust]
max_traits = 5 # Max responsibilities for Rust
[god_object_detection.python]
max_traits = 3 # Max responsibilities for Python
[god_object_detection.javascript]
max_traits = 3 # Max responsibilities for JavaScript/TypeScript
Tuning Guidelines
Strict SRP Enforcement:
[god_object_detection.rust]
max_traits = 3
- Enforces very tight single responsibility
- Suitable for greenfield projects or strict refactoring efforts
Balanced Approach (Default):
[god_object_detection.rust]
max_traits = 5
- Allows some flexibility while catching major violations
- Works well for most projects
Lenient Mode:
[god_object_detection.rust]
max_traits = 7
- Only flags severe SRP violations
- Useful for large legacy codebases during initial assessment
Output and Reporting
Console Output
When analyzing a file with multiple responsibilities:
src/services/user_manager.rs
⚠️ God Object: 18 methods, 8 fields, 5 responsibilities
Score: 185 (Confidence: Probable)
Responsibilities:
- Data Access (5 methods)
- Validation (4 methods)
- Persistence (3 methods)
- Communication (3 methods)
- Formatting & Output (3 methods)
Recommended Splits:
1. DataAccessManager (5 methods, ~75 lines)
2. ValidationManager (4 methods, ~60 lines)
3. PersistenceManager (3 methods, ~45 lines)
JSON Output
For programmatic analysis, use --format json:
{
"file": "src/services/user_manager.rs",
"is_god_object": true,
"responsibility_count": 5,
"responsibilities": [
"Data Access",
"Validation",
"Persistence",
"Communication",
"Formatting & Output"
],
"recommended_splits": [
{
"suggested_name": "DataAccessManager",
"methods_to_move": ["get_user", "set_password", "get_email"],
"responsibility": "Data Access",
"estimated_lines": 75
}
]
}
Best Practices
Writing SRP-Compliant Code
- Name functions descriptively - Use standard prefixes (get_, validate_, etc.)
- Group related functions - Keep similar responsibilities together
- Limit responsibility count - Aim for 1-3 responsibilities per module
- Review regularly - Run Debtmap periodically to catch responsibility creep
- Refactor early - Split modules before they hit thresholds
Code Review Guidelines
When reviewing responsibility analysis results:
- Check responsibility boundaries - Are they logically distinct?
- Validate groupings - Do the recommended splits make sense?
- Consider dependencies - Will splitting introduce more coupling?
- Estimate refactoring cost - Is the improvement worth the effort?
- Prioritize by score - Focus on high-scoring god objects first
Team Adoption
Phase 1: Assessment
- Run Debtmap on codebase
- Review responsibility violations
- Identify top 10 problematic modules
Phase 2: Education
- Share responsibility analysis results with team
- Discuss SRP and its benefits
- Agree on responsibility threshold standards
Phase 3: Incremental Refactoring
- Start with highest-scoring modules
- Apply recommended splits
- Measure improvement with follow-up analysis
Phase 4: Continuous Monitoring
- Integrate Debtmap into CI/CD
- Track responsibility counts over time
- Prevent new SRP violations from merging
Limitations and Edge Cases
False Positives
Scenario 1: Utilities Module
#![allow(unused)]
fn main() {
// utilities.rs - 15 helper functions with different prefixes
fn format_date() { }
fn parse_config() { }
fn validate_email() { }
// ... 12 more diverse utilities
}
Issue: Flagged as having multiple responsibilities, but it’s intentionally a utility collection.
Solution: Either accept the flagging (utilities should perhaps be split) or increase the max_traits threshold.
False Negatives
Scenario 2: Poor Naming
#![allow(unused)]
fn main() {
impl DataProcessor {
fn process_data(&mut self) { /* does everything */ }
fn handle_stuff(&mut self) { /* also does everything */ }
fn do_work(&mut self) { /* yet more mixed concerns */ }
}
}
Issue: All methods map to “Processing” or “Utilities”, so responsibility count is low despite clear SRP violations.
Solution: Encourage better naming conventions in your team. Debtmap relies on descriptive function names.
Language-Specific Challenges
Rust: Trait implementations may group methods by trait rather than responsibility, artificially inflating counts.
Python: Dynamic typing and duck typing make responsibility boundaries less clear from signatures alone.
JavaScript: Prototype methods and closures may not follow conventional naming patterns.
Integration with Other Features
God Object Detection
Responsibility analysis is a core component of God Object Detection. The responsibility count contributes to:
- God object scoring
- Confidence level determination
- Refactoring recommendations
Tiered Prioritization
High responsibility counts increase priority in Tiered Prioritization through the god object multiplier.
Risk Assessment
Modules with multiple responsibilities receive higher risk scores in risk assessment, as they are more prone to bugs and harder to maintain.
Related Documentation
- God Object Detection - Full god object analysis including responsibility detection
- Configuration - TOML configuration reference
- Metrics Reference - All metrics including responsibility count
- Architecture - High-level design including analysis pipelines
Summary
Responsibility analysis in Debtmap:
- Automatically detects SRP violations through pattern-based method name analysis
- Categorizes methods into 11 built-in responsibility types
- Provides actionable refactoring recommendations with suggested module splits
- Integrates with god object detection for holistic architectural analysis
- Supports language-specific thresholds for Rust, Python, and JavaScript/TypeScript
- Is fully configurable via .debtmap.toml and CLI flags
By surfacing responsibility violations early and suggesting concrete refactoring paths, Debtmap helps teams maintain clean, modular architectures that follow the Single Responsibility Principle.
Scoring Strategies
Debtmap uses sophisticated scoring strategies to prioritize technical debt based on multiple factors including complexity, test coverage, and functional role. This section explains the different scoring approaches available.
Overview
Scoring strategies determine how debtmap calculates priority scores for functions and files. The goal is to identify the most impactful technical debt that provides the best return on investment for refactoring efforts.
Strategy Types
File-Level vs Function-Level
Debtmap operates at two granularity levels:
- Function-Level Scoring - Identifies specific functions that need attention, considering complexity, coverage, and role
- File-Level Scoring - Aggregates function metrics to identify module-level architectural issues
Role-Based Adjustments
Not all code deserves equal scrutiny. A complex orchestrator function has different testing requirements than pure business logic:
- Role-Based Adjustments - Automatic multipliers based on detected function roles (pure logic, I/O wrappers, entry points, etc.)
Scoring Algorithms
Different algorithms for calculating final scores:
- Rebalanced Scoring - The default balanced approach that weighs coverage, complexity, and dependencies
- Exponential Scaling - Aggressive scaling for codebases where highest-priority items need strong emphasis
- Data Flow Scoring - Scoring based on how data flows through functions (sources, sinks, transforms)
Choosing a Strategy
| Strategy | Best For | Characteristics |
|---|---|---|
| Rebalanced (default) | Most projects | Balanced, fair prioritization |
| Exponential | Large legacy codebases | Emphasizes worst offenders |
| Data Flow | Pipeline-heavy code | Prioritizes data transformation logic |
Configuration
Scoring can be tuned via the [scoring] section in .debtmap.toml:
[scoring]
coverage = 0.50 # Weight for test coverage
complexity = 0.35 # Weight for complexity metrics
dependency = 0.15 # Weight for dependency analysis
See Scoring Configuration for full details on available options.
File-Level Scoring
File-level scoring aggregates metrics across all functions in a file to identify architectural problems and module-level refactoring opportunities.
Overview
While function-level scoring helps identify specific hot spots, file-level scoring reveals broader architectural issues:
- God objects with too many responsibilities
- Files with poor cohesion
- Modules that should be split
- Files with excessive function counts
File-level analysis is essential for planning major refactoring initiatives and understanding module-level technical debt.
Formula
File Score = Size × Complexity × Coverage Factor × Density × GodObject × FunctionScores
Note: This is a conceptual formula showing the multiplicative relationship between factors. The actual implementation in src/priority/file_metrics.rs:262-306 includes additional normalization steps and conditional adjustments.
Where each factor is calculated as:
- Size = sqrt(total_lines / 100)
- Complexity = (avg_complexity / 5.0) × sqrt(total_complexity / 50.0)
- Coverage Factor = ((1.0 - coverage_percent) × 2.0) + 1.0
- Density = 1.0 + ((function_count - 50) × 0.02) if function_count > 50, else 1.0
- GodObject = 1.0 + (god_object_score / 50.0) if detected (maps score 0-100 to multiplier 1.0x-3.0x)
- FunctionScores = max(sum(function_scores) / 10, 1.0)
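As a reading aid, here is a minimal sketch of the conceptual formula, assuming the factor definitions above (the 3.0 cap on average complexity comes from the detailed factor description below). The actual implementation in src/priority/file_metrics.rs adds normalization and conditional adjustments, so treat the names and signature as illustrative:
#![allow(unused)]
fn main() {
// Illustrative sketch of the conceptual file score formula (not the real implementation).
fn conceptual_file_score(
    total_lines: f64,
    avg_complexity: f64,
    total_complexity: f64,
    coverage_percent: f64,         // 0.0..=1.0
    function_count: usize,
    god_object_score: Option<f64>, // Some(0..=100) when a god object is detected
    function_score_sum: f64,
) -> f64 {
    let size = (total_lines / 100.0).sqrt();
    let complexity = (avg_complexity / 5.0).min(3.0) * (total_complexity / 50.0).sqrt();
    let coverage = ((1.0 - coverage_percent) * 2.0) + 1.0;
    let density = if function_count > 50 {
        1.0 + ((function_count - 50) as f64 * 0.02)
    } else {
        1.0
    };
    let god_object = god_object_score.map_or(1.0, |s| 1.0 + s / 50.0);
    let function_scores = (function_score_sum / 10.0).max(1.0);
    size * complexity * coverage * density * god_object * function_scores
}
}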
Scoring Factors
Size Factor
Formula: sqrt(total_lines / 100)
Source: src/priority/file_metrics.rs:264
The size factor reflects the impact scope of refactoring:
- A 100-line file has factor 1.0
- A 400-line file has factor 2.0
- A 1000-line file has factor 3.16
The square root dampens the effect to avoid over-penalizing large files while still accounting for their broader impact.
Rationale: Refactoring a 1000-line file affects more code than a 100-line file, but the impact doesn’t scale linearly.
Complexity Factor
Formula: (avg_complexity / 5.0).min(3.0) × sqrt(total_complexity / 50.0)
Source: src/priority/file_metrics.rs:267-269
Combines average and total complexity to balance per-function and aggregate complexity:
- Average factor: Capped at 3.0 to prevent extreme values
- Total factor: Square root dampening of aggregate complexity
Rationale: Both concentrated complexity (one very complex function) and spread-out complexity (many moderately complex functions) matter for maintainability.
Coverage Factor
Formula: ((1.0 - coverage_percent) × 2.0) + 1.0
Source: src/priority/file_metrics.rs:272-273
The coverage factor acts as a multiplicative amplifier:
- 100% coverage: Factor = 1.0 (no amplification)
- 50% coverage: Factor = 2.0 (doubles the score)
- 0% coverage: Factor = 3.0 (triples the score)
Example: A file with 50% coverage and base score 10 becomes 20 after coverage adjustment.
Rationale: Untested files amplify existing complexity and risk. The multiplicative approach reflects that testing gaps compound other issues.
Density Factor
Formula: 1.0 + ((function_count - 50) × 0.02) when function_count > 50, else 1.0
Source: src/priority/file_metrics.rs:276-280
Penalizes files with excessive function counts:
- 50 or fewer functions: Factor = 1.0 (no penalty)
- 75 functions: Factor = 1.5 (50% increase)
- 100 functions: Factor = 2.0 (doubles the score)
Rationale: Files with many functions likely violate single responsibility principle and should be split.
God Object Multiplier
Formula: 1.0 + (god_object_score / 50.0) when detected, else 1.0
Source: src/priority/file_metrics.rs:282-293
The god object multiplier applies proportional scaling based on severity:
- Score 0 (borderline): Multiplier = 1.0x
- Score 50 (moderate): Multiplier = 2.0x
- Score 100 (severe): Multiplier = 3.0x
Implementation Comment (from src/priority/file_metrics.rs:282-284):
Score 0-100 maps to multiplier 1.0x-3.0x (not 2x-102x!) This aligns with contextual risk cap (max 3x) for consistent scoring
Rationale: God objects need architectural attention. The proportional scaling ensures severe god objects rank higher while keeping multipliers within a reasonable range that aligns with other contextual risk factors.
Function Scores Factor
Formula: max(sum(function_scores) / 10.0, 1.0)
Source: src/priority/file_metrics.rs:296-297
Aggregates individual function debt scores with normalization:
- Sum of all function scores divided by 10
- Minimum value of 1.0 to prevent near-zero scores
Rationale: The aggregate function debt provides a baseline before other multiplicative modifiers are applied.
FileScoreFactors Struct
For transparency in score calculations, Debtmap provides the FileScoreFactors struct that breaks down how each factor contributes to the final score.
Source: src/priority/file_metrics.rs:217-259
#![allow(unused)]
fn main() {
pub struct FileScoreFactors {
pub size_factor: f64,
pub size_basis: usize,
pub complexity_factor: f64,
pub avg_complexity: f64,
pub total_complexity: u32,
pub coverage_factor: f64,
pub coverage_percent: f64,
pub coverage_gap: f64,
pub density_factor: f64,
pub function_count: usize,
pub god_object_multiplier: f64,
pub god_object_score: f64,
pub is_god_object: bool,
pub function_factor: f64,
pub function_score_sum: f64,
}
}
Using get_score_factors()
You can access the score breakdown via the get_score_factors() method:
#![allow(unused)]
fn main() {
let factors = metrics.get_score_factors();
println!("Coverage factor: {:.2} ({:.0}% coverage)",
factors.coverage_factor,
factors.coverage_percent * 100.0);
}
Source: src/priority/file_metrics.rs:333-394
File Context Adjustments
File-level scoring integrates with context-aware adjustments that reduce scores for test files, generated files, and other non-production code.
Source: src/priority/scoring/file_context_scoring.rs
Adjustment Rules
| File Context | Confidence | Score Multiplier | Effect |
|---|---|---|---|
| Test file | > 0.8 | 0.2 | 80% reduction |
| Probable test | 0.5-0.8 | 0.6 | 40% reduction |
| Generated file | - | 0.1 | 90% reduction |
| Production | - | 1.0 | No adjustment |
Example
#![allow(unused)]
fn main() {
use debtmap::priority::{FileDebtItem, FileDebtMetrics};
use debtmap::analysis::FileContext;
let metrics = FileDebtMetrics::default();
let test_context = FileContext::Test {
confidence: 0.95,
test_framework: Some("rust-std".to_string()),
test_count: 10,
};
let item = FileDebtItem::from_metrics(metrics, Some(&test_context));
// item.score is now reduced by 80% due to test file context
}
Source: src/priority/file_metrics.rs:675-706
Aggregation Methods
Debtmap supports multiple aggregation methods for combining function scores into file-level scores, configurable via CLI or configuration file.
Weighted Sum (Default)
Formula: Σ(function_score × complexity_weight × coverage_weight)
debtmap analyze . --aggregation-method weighted_sum
Or via configuration:
[aggregation]
method = "weighted_sum"
Characteristics:
- Weights functions by their complexity and coverage gaps
- Emphasizes high-impact functions over trivial ones
- Best for most use cases where you want to focus on significant issues
Simple Sum
Formula: Σ(function_scores)
[aggregation]
method = "sum"
Characteristics:
- Adds all function scores directly without weighting
- Treats all functions equally regardless of complexity
- Useful for broad overview and trend analysis
Logarithmic Sum
Formula: log(1 + Σ(function_scores))
[aggregation]
method = "logarithmic_sum"
Characteristics:
- Dampens impact of many small issues to prevent score explosion
- Prevents files with hundreds of minor issues from dominating
- Creates more balanced comparisons across files of different sizes
Best for: Legacy codebases with many small issues where you want to avoid extreme scores.
Max Plus Average
Formula: max_score × 0.6 + avg_score × 0.4
[aggregation]
method = "max_plus_average"
Characteristics:
- Considers worst function (60%) plus average of all functions (40%)
- Balances worst-case and typical-case scenarios
- Highlights files with both a critical hot spot and general issues
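To make the differences concrete, the sketch below applies the sum, logarithmic, and max-plus-average formulas to one set of function scores (weighted sum is omitted since it needs per-function complexity and coverage weights). This is illustrative code, not from the codebase:
fn main() {
    let scores = [12.0_f64, 3.0, 1.5, 0.5];
    let sum: f64 = scores.iter().sum();
    let logarithmic = (1.0 + sum).ln();
    let max = scores.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let avg = sum / scores.len() as f64;
    let max_plus_average = max * 0.6 + avg * 0.4;
    println!("sum = {sum:.2}");                           // 17.00
    println!("logarithmic_sum = {logarithmic:.2}");       // ln(18) ≈ 2.89
    println!("max_plus_average = {max_plus_average:.2}"); // 12.0×0.6 + 4.25×0.4 = 8.90
}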
Choosing an Aggregation Method
| Codebase Type | Recommended Method | Rationale |
|---|---|---|
| New/Modern | weighted_sum | Proportional emphasis on real issues |
| Legacy with many small issues | logarithmic_sum | Prevents score explosion |
| Mixed quality | max_plus_average | Balances hot spots with overall quality |
| Trend analysis | sum | Simple, consistent metric over time |
Performance Note: All aggregation methods have O(n) complexity where n = number of functions. Performance differences are negligible for typical codebases (<100k functions).
CLI Usage
Show File-Level Results Only
# Show top 10 files needing architectural refactoring
debtmap analyze . --aggregate-only --top 10
Note: The --aggregate-only flag changes output to show only file-level metrics instead of function-level details.
Focus on Architectural Problems
debtmap analyze . --aggregate-only --filter Architecture
Find Files with Excessive Function Counts
debtmap analyze . --aggregate-only --min-problematic 50
Generate File-Level Report
debtmap analyze . --aggregate-only --format markdown -o report.md
Configuration
IMPORTANT: The configuration file must be named .debtmap.toml (not debtmap.yml or other variants) and placed in your project root directory.
[aggregation]
method = "weighted_sum"
min_problematic = 3 # Need 3+ problematic functions for file-level score
[god_object_detection]
enabled = true
max_methods = 20
max_fields = 15
max_responsibilities = 5
Use Cases
Planning Major Refactoring Initiatives
When planning sprint or quarterly refactoring work, file-level scoring helps identify which modules need the most attention:
debtmap analyze . --aggregate-only --top 10
Use when:
- Planning sprint or quarterly refactoring work
- Deciding which modules to split
- Prioritizing architectural improvements
- Allocating team resources
Identifying God Objects
File-level scoring excels at finding files with architectural issues:
debtmap analyze . --aggregate-only --filter Architecture
Look for files with:
- High god object scores
- Many functions (50+)
- Low coverage combined with high complexity
Breaking Up Monolithic Modules
# Find files with excessive function counts
debtmap analyze . --aggregate-only --min-problematic 50
Evaluating Overall Codebase Health
# Generate file-level report for executive summary
debtmap analyze . --aggregate-only --format markdown -o report.md
Best Practices
- Start with file-level analysis for strategic planning before drilling into function-level details
- Use logarithmic aggregation for legacy codebases to prevent score explosion
- Track file-level trends over time to measure architectural improvement
- Combine with god object detection to identify structural issues beyond simple size metrics
- Consider context adjustments - test and generated files should not dominate your priority list
See Also
- Function-Level Scoring - For targeted hot spot identification
- Rebalanced Scoring - Advanced scoring algorithm that de-emphasizes size
- God Object Detection - Detailed god object analysis
- Configuration - Full scoring configuration options
Function-Level Scoring
Function-level scoring identifies specific functions needing attention for targeted improvements. This subsection covers the scoring formula, constructor detection, role classification, and role multipliers.
Overview
Function-level scoring combines complexity, coverage, and dependency metrics to calculate a priority score for each function. The formula uses coverage as a multiplicative dampener rather than an additive factor, reflecting that testing gaps amplify existing complexity.
Key Principle: Untested complex code is riskier than well-tested complex code. Coverage acts as a multiplier that reduces the score for well-tested functions and preserves the full score for untested functions.
Scoring Formula
The function-level scoring formula consists of three stages:
Stage 1: Factor Calculation
Complexity Factor (src/priority/scoring/calculation.rs:55-59):
complexity_factor = raw_complexity / 2.0 (clamped to 0-10)
Complexity of 20+ maps to the maximum factor of 10.0. This provides linear scaling with a reasonable cap.
Dependency Factor (src/priority/scoring/calculation.rs:62-66):
dependency_factor = upstream_count / 2.0 (capped at 10.0)
20+ upstream dependencies map to the maximum factor of 10.0.
Stage 2: Base Score Calculation
Without Coverage Data (src/priority/scoring/calculation.rs:119-129):
base_score = (complexity_factor × 10 × 0.50) + (dependency_factor × 10 × 0.25)
Weights:
- 50% weight on complexity
- 25% weight on dependencies
- 25% reserved for debt pattern adjustments
With Coverage Data (src/priority/scoring/calculation.rs:70-82):
coverage_multiplier = 1.0 - coverage_percent
base_score = base_score_no_coverage × coverage_multiplier
Coverage acts as a dampening multiplier:
- 0% coverage (multiplier = 1.0): Full base score preserved
- 50% coverage (multiplier = 0.5): Half the base score
- 100% coverage (multiplier = 0.0): Near-zero score
Stage 3: Role Multiplier
The final score applies a role-based multiplier:
final_score = base_score × role_multiplier
See Role Multipliers for the specific values.
Complete Example
Function: calculate_risk_score()
Cyclomatic: 12, Cognitive: 18
Coverage: 20%
Upstream dependencies: 8
Step 1: Calculate factors
complexity_factor = (12 + 18) / 2 / 2.0 = 7.5 (capped at 10)
dependency_factor = 8 / 2.0 = 4.0
Step 2: Base score (no coverage)
base = (7.5 × 10 × 0.50) + (4.0 × 10 × 0.25)
base = 37.5 + 10.0 = 47.5
Step 3: Apply coverage multiplier
coverage_multiplier = 1.0 - 0.20 = 0.80
score_with_coverage = 47.5 × 0.80 = 38.0
Step 4: Apply role multiplier (PureLogic = 1.2)
final_score = 38.0 × 1.2 = 45.6
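The walkthrough maps directly onto a small sketch (illustrative names; the raw complexity here is the simple average of cyclomatic and cognitive used in the example above, while the configurable 30/70 weighting is described under Complexity Weight Configuration below):
fn function_score(
    cyclomatic: f64,
    cognitive: f64,
    upstream_count: f64,
    coverage: Option<f64>, // None when no LCOV data is provided
    role_multiplier: f64,
) -> f64 {
    // Stage 1: factor calculation
    let raw_complexity = (cyclomatic + cognitive) / 2.0;
    let complexity_factor = (raw_complexity / 2.0).clamp(0.0, 10.0);
    let dependency_factor = (upstream_count / 2.0).min(10.0);

    // Stage 2: base score, with coverage as a dampening multiplier
    let base = complexity_factor * 10.0 * 0.50 + dependency_factor * 10.0 * 0.25;
    let dampened = match coverage {
        Some(pct) => base * (1.0 - pct), // 0% coverage keeps the full score
        None => base,
    };

    // Stage 3: role multiplier
    dampened * role_multiplier
}

fn main() {
    // Reproduces the calculate_risk_score() walkthrough above
    let score = function_score(12.0, 18.0, 8.0, Some(0.20), 1.2);
    println!("{score:.1}"); // 45.6
}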
Metrics
Cyclomatic Complexity
Counts decision points (if, match, loops, boolean operators). Guides the number of test cases needed for full branch coverage.
Interpretation:
- 1-5: Low complexity, easy to test
- 6-10: Moderate complexity, reasonable test effort
- 11-20: High complexity, significant test effort
- 20+: Very high complexity, consider refactoring
Cognitive Complexity
Measures how difficult code is to understand, accounting for:
- Nesting depth (deeper nesting = higher penalty)
- Control flow breaks
- Recursion
Why Cognitive Gets Higher Weight (src/config/scoring.rs:367-373):
- Cyclomatic weight: 30%
- Cognitive weight: 70%
Cognitive complexity correlates better with bug density because it measures comprehension difficulty, not just branching paths.
Coverage Percentage
Direct line coverage from LCOV data. Functions with 0% coverage receive maximum urgency.
Coverage Dampening (src/priority/scoring/calculation.rs:8-21):
- Test code automatically receives 0.0 multiplier (near-zero score)
- Production code: multiplier = 1.0 - coverage_percent
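Expressed as a tiny sketch (hypothetical helper; how test code is detected is not shown here):
#![allow(unused)]
fn main() {
// Test code is effectively zeroed out; production code is dampened by coverage.
fn coverage_multiplier(is_test_code: bool, coverage_percent: f64) -> f64 {
    if is_test_code { 0.0 } else { 1.0 - coverage_percent }
}
}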
Dependency Count
Upstream callers indicate impact radius. Functions with many callers are riskier to modify.
Constructor Detection
Debtmap identifies simple constructors to prevent false positives where trivial initialization functions are flagged as critical business logic.
Detection Strategy
A function is classified as a constructor if it meets these criteria (src/analyzers/rust_constructor_detector.rs:1-50):
1. Name Pattern Match:
- Exact: new, default, empty, zero, any
- Prefix: from_*, with_*, create_*, make_*, build_*
2. Complexity Thresholds:
- Cyclomatic complexity: <= 2
- Cognitive complexity: <= 3
- Function length: < 15 lines
- Nesting depth: <= 1
3. AST Analysis (when enabled):
- Return type: Self, Result<Self, E>, or Option<Self>
- Body pattern: Struct initialization, no loops
- No complex control flow
Return Type Classification
The AST detector classifies return types (src/analyzers/rust_constructor_detector.rs:36-42):
| Return Type | Classification |
|---|---|
| Self | OwnedSelf |
| Result<Self, E> | ResultSelf |
| Option<Self> | OptionSelf |
| &Self, &mut Self | RefSelf (builder pattern) |
| Other | Other |
Body Pattern Analysis
The constructor detector visits the function body (src/analyzers/rust_constructor_detector.rs:104-130):
#![allow(unused)]
fn main() {
// Tracks these patterns:
struct BodyPattern {
struct_init_count: usize, // Struct initialization expressions
self_refs: usize, // References to Self
field_assignments: usize, // Field assignment expressions
has_if: bool, // Contains if expression
has_match: bool, // Contains match expression
has_loop: bool, // Contains any loop
early_returns: usize, // Return statements
}
}
Constructor-Like Pattern (src/analyzers/rust_constructor_detector.rs:152-158):
- Has struct initialization AND no loops, OR
- Has Self references AND no loops AND no match AND no field assignments
Examples
Detected as Constructor (classified as IOWrapper, score reduced):
#![allow(unused)]
fn main() {
fn new() -> Self {
Self { field: 0 }
}
fn from_config(config: Config) -> Self {
Self {
timeout: config.timeout,
retries: 3,
}
}
fn try_new(value: i32) -> Result<Self, Error> {
if value > 0 {
Ok(Self { value })
} else {
Err(Error::InvalidValue)
}
}
}
NOT Detected as Constructor (remains PureLogic):
#![allow(unused)]
fn main() {
// Has loop - disqualified
fn process_items() -> Self {
let mut result = Self::new();
for item in items {
result.add(item);
}
result
}
// High complexity - disqualified
fn create_complex(data: Data) -> Result<Self> {
validate(&data)?;
// ... 30 lines of logic
Ok(Self { ... })
}
}
Role Classification
Functions are classified by semantic role to adjust their priority scores appropriately.
Classification Order
The classifier applies rules in precedence order (src/priority/semantic_classifier/mod.rs:47-114):
- EntryPoint: Main functions, CLI handlers, routes
- Debug: Functions with debug/diagnostic patterns
- Constructor: Simple object construction (enhanced detection)
- EnumConverter: Match-based enum to value conversion
- Accessor: Getters, is_, has_ methods
- DataFlow: High transformation ratio (spec 126)
- PatternMatch: Pattern matching functions
- IOWrapper: File/network I/O thin wrappers
- Orchestrator: Functions coordinating other functions
- PureLogic: Default for unclassified functions
Entry Point Detection
Entry points are identified by:
- Call graph analysis: No upstream callers
- Name patterns: main, handle_*, run_*, execute_*
Debug Function Detection
Debug/diagnostic functions are detected via (src/priority/semantic_classifier/mod.rs:59-61):
- Name patterns: debug_*, print_*, dump_*, trace_*, *_diagnostics, *_stats
- Low complexity threshold
- Output-focused behavior
Accessor Detection
Accessor methods are identified when (src/priority/semantic_classifier/mod.rs:147-177):
- Name matches accessor pattern: id, name, get_*, is_*, has_*, as_*, to_*
- Cyclomatic complexity <= 2
- Cognitive complexity <= 1
- Length < 10 lines
- (With AST) Simple field access body
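A simplified precedence sketch tying these detection rules together (patterns and thresholds follow this documentation; the real classifier in src/priority/semantic_classifier/mod.rs covers all the roles listed earlier and adds AST checks):
#![allow(unused)]
fn main() {
#[derive(Debug)]
enum Role { EntryPoint, Debug, Accessor, PureLogic }

fn classify(name: &str, upstream_callers: usize, cyclomatic: u32, cognitive: u32, lines: usize) -> Role {
    // Entry points: no upstream callers or a matching name pattern
    if upstream_callers == 0
        || name == "main"
        || ["handle_", "run_", "execute_"].iter().any(|p| name.starts_with(p))
    {
        return Role::EntryPoint;
    }
    // Debug/diagnostic name patterns
    if ["debug_", "print_", "dump_", "trace_"].iter().any(|p| name.starts_with(p)) {
        return Role::Debug;
    }
    // Accessors: name pattern plus tight complexity and size limits
    let accessor_name = ["get_", "is_", "has_", "as_", "to_"].iter().any(|p| name.starts_with(p));
    if accessor_name && cyclomatic <= 2 && cognitive <= 1 && lines < 10 {
        return Role::Accessor;
    }
    Role::PureLogic // default for unclassified functions
}
}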
Role Multipliers
Each role receives a score multiplier based on test priority importance (src/config/scoring.rs:307-333):
| Role | Multiplier | Rationale |
|---|---|---|
| PureLogic | 1.2 | Core business logic deserves high test priority |
| Unknown | 1.0 | Default, no adjustment |
| EntryPoint | 0.9 | Often integration tested, slight reduction |
| Orchestrator | 0.8 | Coordinates tested functions, reduced priority |
| IOWrapper | 0.7 | Thin I/O wrappers, integration tested |
| PatternMatch | 0.6 | Simple pattern dispatch, lower priority |
| Debug | 0.3 | Diagnostic functions, lowest priority |
Multiplier Rationale
PureLogic (1.2x): Business rules and algorithms should have comprehensive unit tests. They’re easy to test in isolation and contain the core value of the application.
Orchestrator (0.8x): Orchestrators coordinate other tested functions. If the delegated functions are well-tested, the orchestrator is partially covered through integration.
IOWrapper (0.7x): Thin I/O wrappers are often tested via integration tests. Unit testing them provides limited value compared to integration testing.
Debug (0.3x): Diagnostic and debug functions have the lowest test priority. They’re not production-critical and are often exercised manually during development.
Configuration
Role multipliers are configurable in .debtmap.toml:
[role_multipliers]
pure_logic = 1.2
orchestrator = 0.8
io_wrapper = 0.7
entry_point = 0.9
pattern_match = 0.6
debug = 0.3
unknown = 1.0
Role Multiplier Clamping
To prevent extreme score swings, multipliers can be clamped (src/config/scoring.rs:457-493):
[scoring.role_multiplier]
clamp_min = 0.3 # Floor for all multipliers
clamp_max = 1.8 # Ceiling for all multipliers
enable_clamping = true
Complexity Weight Configuration
The balance between cyclomatic and cognitive complexity is configurable (src/config/scoring.rs:335-381):
[complexity_weights]
cyclomatic = 0.3 # 30% weight
cognitive = 0.7 # 70% weight
max_cyclomatic = 50.0
max_cognitive = 100.0
Default Rationale:
- Cognitive complexity (70%) correlates better with bug density
- Cyclomatic complexity (30%) guides test case count
- Combined weighting provides balanced assessment
Score Normalization
Raw scores undergo normalization for display (src/priority/scoring/calculation.rs:174-206):
| Score Range | Method | Formula |
|---|---|---|
| 0-10 | Linear | score (unchanged) |
| 10-100 | Square root | 10.0 + sqrt(score - 10.0) × 3.33 |
| 100+ | Logarithmic | 41.59 + ln(score / 100.0) × 10.0 |
This multi-phase approach:
- Preserves distinctions for low scores
- Moderately dampens medium scores
- Strongly dampens extreme values
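The table translates directly into code (a sketch of the documented formulas; the phases meet continuously, since 10 + sqrt(90) × 3.33 ≈ 41.59 matches the logarithmic phase at score 100):
#![allow(unused)]
fn main() {
// Multi-phase display normalization from the table above.
fn normalize_for_display(score: f64) -> f64 {
    if score <= 10.0 {
        score // linear: preserve distinctions for low scores
    } else if score <= 100.0 {
        10.0 + (score - 10.0).sqrt() * 3.33 // square root: moderate dampening
    } else {
        41.59 + (score / 100.0).ln() * 10.0 // logarithmic: strong dampening
    }
}
}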
See Also
- File-Level Scoring: Aggregate file scoring
- Role-Based Adjustments: Detailed role adjustment mechanics
- Rebalanced Scoring: Alternative scoring algorithm
- Data Flow Scoring: Purity-based adjustments
Role-Based Adjustments
Debtmap uses a sophisticated two-stage role adjustment mechanism to ensure that scores accurately reflect both the testing strategy appropriate for each function type and the architectural importance of different roles.
Why Role-Based Adjustments?
Problem: Traditional scoring treats all functions equally, leading to false positives:
- Entry points (CLI handlers, HTTP routes, main functions) typically use integration tests rather than unit tests
  - Flagging them for “low unit test coverage” misses that they’re tested differently
  - They orchestrate other code but contain minimal business logic
- Pure business logic functions should have comprehensive unit tests
  - Easy to test in isolation with deterministic inputs/outputs
  - Core value of the application lives here
- I/O wrappers are often tested implicitly through integration tests
  - Thin abstractions over file system, network, or database operations
  - Unit testing them provides limited value compared to integration testing
Solution: Debtmap applies role-based adjustments in two stages to address both coverage expectations and architectural importance.
Stage 1: Role-Based Coverage Weighting
The first stage adjusts coverage penalty expectations based on function role. This prevents functions that use different testing strategies from unfairly dominating the priority list.
How It Works
For each function, Debtmap:
- Detects the function’s role (entry point, pure logic, I/O wrapper, etc.)
- Applies a coverage weight multiplier based on that role
- Reduces or increases the coverage penalty accordingly
Default Coverage Weights
The RoleCoverageWeights struct (src/config/scoring.rs:384-413) defines these defaults:
| Function Role | Coverage Weight | Impact on Scoring |
|---|---|---|
| Pure Logic | 1.0 | Standard penalty (should have unit tests) |
| Pattern Match | 1.0 | Standard penalty |
| Unknown | 1.0 | Standard penalty |
| Orchestrator | 0.8 | Reduced penalty (partially integration tested) |
| Entry Point | 0.6 | Significantly reduced penalty (integration tested) |
| I/O Wrapper | 0.5 | Reduced penalty (integration tested) |
| Debug | 0.3 | Minimal penalty (low priority for testing) |
Source: Default values from src/config/scoring.rs:429-455
Coverage Expectations by Role
Debtmap also defines role-specific coverage targets in CoverageExpectations (src/priority/scoring/coverage_expectations.rs:107-133):
| Role | Minimum | Target | Maximum |
|---|---|---|---|
| Pure | 90% | 95% | 100% |
| Business Logic | 80% | 90% | 95% |
| Validation | 85% | 92% | 98% |
| State Management | 75% | 85% | 90% |
| Utilities | 75% | 85% | 95% |
| Error Handling | 70% | 80% | 90% |
| Orchestration | 65% | 75% | 85% |
| I/O Operations | 60% | 70% | 80% |
| Configuration | 60% | 70% | 80% |
| Initialization | 50% | 65% | 75% |
| Performance | 40% | 50% | 60% |
| Debug | 20% | 30% | 40% |
Example Score Changes
Before role-based coverage adjustment:
Function: handle_request (Entry Point)
Complexity: 5
Coverage: 0%
Raw Coverage Penalty: 1.0 (full penalty)
Score: 8.5 (flagged as high priority)
After role-based coverage adjustment:
Function: handle_request (Entry Point)
Complexity: 5
Coverage: 0%
Adjusted Coverage Penalty: 0.4 (60% reduction via 0.6 weight)
Score: 4.2 (medium priority - more realistic)
Rationale: Entry points are integration tested, not unit tested.
This function is likely tested via API/CLI integration tests.
Comparison with Pure Logic:
Function: calculate_discount (Pure Logic)
Complexity: 5
Coverage: 0%
Adjusted Coverage Penalty: 1.0 (standard penalty)
Score: 9.8 (critical priority)
Rationale: Pure logic should have unit tests.
This function needs immediate test coverage.
Stage 2: Role Multiplier
The second stage applies a final role-based multiplier to reflect architectural importance. This multiplier is clamped by default to prevent extreme score swings.
Role Multiplier Defaults
The RoleMultipliers struct (src/config/scoring.rs:207-236) defines these multipliers:
| Role | Multiplier | Impact |
|---|---|---|
| Pure Logic | 1.2 | +20% (prioritized for testing) |
| Unknown | 1.0 | No adjustment |
| Entry Point | 0.9 | -10% (integration tested) |
| Orchestrator | 0.8 | -20% (higher-level tests) |
| I/O Wrapper | 0.7 | -30% (often integration tested) |
| Pattern Match | 0.6 | -40% (less complex) |
| Debug | 0.3 | -70% (lowest priority) |
Source: Default values from src/config/scoring.rs:307-333
Multiplier Clamping
The RoleMultiplierConfig (src/config/scoring.rs:457-493) controls clamping:
[scoring.role_multiplier]
clamp_min = 0.3 # Minimum multiplier (default: 0.3)
clamp_max = 1.8 # Maximum multiplier (default: 1.8)
enable_clamping = true # Enable clamping (default: true)
Clamp Range Rationale:
- Default [0.3, 1.8]: Balances differentiation with stability
- Lower bound (0.3): I/O wrappers still contribute 30% of base score (not invisible)
- Upper bound (1.8): High-multiplier roles don’t overwhelm other issues (max 180%)
- Configurable: Adjust based on project priorities
Example with Clamping:
Function: process_data (Complex Pure Logic)
Base Score: 45.0
Unclamped Role Multiplier: 2.5
Clamped Multiplier: 1.8 (clamp_max)
Final Score: 45.0 x 1.8 = 81.0
Effect: Prevents one complex function from dominating entire priority list
Why Two Stages?
The separation of coverage weight adjustment and role multiplier ensures they work together without interfering:
Stage 1 (Coverage Weight): Adjusts testing expectations
- Question: “How much should we penalize missing unit tests for this type of function?”
- Example: Entry points get 60% of normal coverage penalty (they’re integration tested)
Stage 2 (Role Multiplier): Adjusts architectural importance
- Question: “How important is this function relative to others with similar complexity?”
- Example: Pure logic gets a 1.2x multiplier, while debug functions get 0.3x
Scoring Pipeline
The functional scoring pipeline (src/priority/scoring/coverage_scoring.rs:20-31):
#![allow(unused)]
fn main() {
pub fn calculate_coverage_score(
actual_coverage: f64,
role: &str,
expectations: &CoverageExpectations,
) -> f64 {
let range = expectations.for_role(role);
let gap = CoverageGap::calculate(actual_coverage, range);
calculate_gap_score(&gap)
.pipe(|score| weight_by_severity(score, gap.severity))
.pipe(|score| weight_by_role(score, role))
}
}
Independent Contributions:
- Calculate base score from complexity + dependencies
- Apply coverage weight by role -> adjusted coverage penalty
- Combine into preliminary score
- Apply clamped role multiplier -> final score
This approach ensures:
- Coverage adjustments don’t interfere with role multiplier
- Both mechanisms contribute independently
- Clamping prevents instability from extreme multipliers
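As a sketch, the four steps could compose like this, assuming Stage 1 scales the coverage penalty directly by the role's coverage weight (an illustrative reading; the exact arithmetic lives in the scoring modules cited above):
#![allow(unused)]
fn main() {
// Illustrative two-stage adjustment (names and exact arithmetic are assumptions).
fn two_stage_score(
    base_score: f64,       // from complexity + dependencies
    coverage_percent: f64, // 0.0..=1.0
    coverage_weight: f64,  // Stage 1 weight, e.g. 0.6 for entry points
    role_multiplier: f64,  // Stage 2 multiplier, e.g. 0.9 for entry points
) -> f64 {
    // Stage 1: role-weighted coverage penalty
    let coverage_penalty = (1.0 - coverage_percent) * coverage_weight;
    let preliminary = base_score * coverage_penalty;
    // Stage 2: clamped role multiplier (defaults [0.3, 1.8])
    preliminary * role_multiplier.clamp(0.3, 1.8)
}
}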
How This Reduces False Positives
False Positive #1: Entry Points Flagged for Low Coverage
Before:
Top Priority Items:
1. main() - Score: 9.2 (0% unit test coverage)
2. handle_cli_command() - Score: 8.8 (5% unit test coverage)
3. run_server() - Score: 8.5 (0% unit test coverage)
After:
Top Priority Items:
1. calculate_tax() - Score: 9.8 (0% coverage, Pure Logic)
2. validate_payment() - Score: 9.2 (10% coverage, Pure Logic)
3. main() - Score: 4.2 (0% coverage, Entry Point - integration tested)
Result: Business logic functions that actually need unit tests rise to the top.
False Positive #2: I/O Wrappers Over-Prioritized
Before:
Function: read_config_file
Complexity: 3
Coverage: 0%
Score: 7.5 (high priority)
Issue: This is a thin wrapper over std::fs::read_to_string.
Unit testing it provides minimal value vs integration tests.
After:
Function: read_config_file
Complexity: 3
Coverage: 0%
Adjusted Coverage Weight: 0.5
Role Multiplier: 0.7
Score: 2.6 (low priority)
Rationale: I/O wrappers are integration tested.
Focus on business logic instead.
Configuration Examples
Emphasize Pure Logic Testing
[scoring.role_coverage_weights]
pure_logic = 1.5 # Strong penalty for untested pure logic
entry_point = 0.5 # Minimal penalty for untested entry points
io_wrapper = 0.4 # Minimal penalty for untested I/O wrappers
debug = 0.2 # Minimal penalty for debug code
Conservative Approach (Smaller Adjustments)
[scoring.role_coverage_weights]
pure_logic = 1.1 # Slight increase
entry_point = 0.9 # Slight decrease
io_wrapper = 0.8 # Slight decrease
orchestrator = 0.9 # Slight decrease
Disable Multiplier Clamping (Not Recommended)
[scoring.role_multiplier]
enable_clamping = false # Allow unclamped multipliers
# Warning: May cause unstable prioritization
Strict Clamping Range
[scoring.role_multiplier]
clamp_min = 0.5 # More conservative minimum
clamp_max = 1.5 # More conservative maximum
enable_clamping = true
Verification
To see how role-based adjustments affect your codebase:
# Show detailed scoring breakdown
debtmap analyze . --verbose
# Compare with role adjustments disabled (using minimal config)
debtmap analyze . --config minimal.toml
Sample verbose output:
Function: src/handlers/request.rs:handle_request
Role: Entry Point
Complexity: 5
Coverage: 0%
Coverage Weight: 0.6 (Entry Point adjustment)
Adjusted Coverage Penalty: 0.4 (reduced from 1.0)
Base Score: 15.0
Role Multiplier: 0.9 (clamped)
Final Score: 13.5
Interpretation:
- Entry point gets 60% coverage penalty instead of 100%
- Likely tested via integration tests
- Still flagged due to complexity, but not over-penalized for coverage
Benefits Summary
- Fewer false positives: Entry points and I/O wrappers no longer dominate priority lists
- Better resource allocation: Testing efforts focus on pure logic where unit tests provide most value
- Recognition of testing strategies: Integration tests are valued equally with unit tests
- Stable prioritization: Clamping prevents extreme multipliers from causing volatile rankings
- Configurable: Adjust weights and clamp ranges to match your project’s testing philosophy
See Also
- Semantic Classification - How roles are detected
- File-Level Scoring - Aggregating function scores to file level
- Function-Level Scoring - Detailed function scoring mechanics
- Coverage Integration - How coverage data is integrated
Rebalanced Scoring
Rebalanced scoring is a multi-dimensional scoring algorithm that prioritizes actual code quality issues over pure file size concerns. It provides a more nuanced approach to technical debt prioritization by considering complexity, coverage gaps, structural problems, and code smells.
Overview
Traditional scoring often over-emphasizes file size, causing large but simple files to rank higher than complex, untested code. The rebalanced algorithm fixes this by:
- De-emphasizing size: Reduces size weight from ~1.5 to 0.3 (80% reduction)
- Emphasizing quality: Increases weights for complexity (1.0) and coverage gaps (1.0)
- Additive bonuses: Provides +20 bonus for complex + untested code (not multiplicative)
- Context-aware detection: Automatically detects and reduces scores for generated code
Source: src/priority/scoring/rebalanced.rs:1-10
Severity Levels
The rebalanced algorithm assigns severity levels based on normalized total scores and risk factors. Scores are normalized to a 0-200 range.
Source: src/priority/scoring/rebalanced.rs:12-30 (Severity enum), src/priority/scoring/rebalanced.rs:416-448 (determine_severity)
| Severity | Score Threshold | Additional Criteria | Description |
|---|---|---|---|
| CRITICAL | > 120 | OR complexity > 60 AND coverage > 40 | Requires immediate attention |
| HIGH | > 80 | OR complexity > 40 AND coverage > 20 OR structural > 50 | High priority for next sprint |
| MEDIUM | > 40 | OR single moderate issue (complexity/coverage/structural > 30) | Plan for future sprint |
| LOW | Everything else | - | Minor concerns, size-only issues |
Severity Determination Logic (from src/priority/scoring/rebalanced.rs:416-448):
#![allow(unused)]
fn main() {
fn determine_severity(components: &ScoreComponents, ...) -> Severity {
let total = components.weighted_total(&ScoreWeights::default());
// CRITICAL: Total score > 120 OR high complexity + low coverage
if total > 120.0 || (components.complexity_score > 60.0 && components.coverage_score > 40.0) {
return Severity::Critical;
}
// HIGH: Total score > 80 OR moderate complexity + coverage gap OR severe structural issue
if total > 80.0
|| (components.complexity_score > 40.0 && components.coverage_score > 20.0)
|| components.structural_score > 50.0 {
return Severity::High;
}
// MEDIUM: Total score > 40 OR single moderate issue
if total > 40.0
|| components.complexity_score > 30.0
|| components.coverage_score > 30.0
|| components.structural_score > 30.0 {
return Severity::Medium;
}
// LOW: Everything else
Severity::Low
}
}
Score Components
The rebalanced algorithm computes five distinct scoring components, each contributing to the weighted total.
Source: src/priority/scoring/rebalanced.rs:32-55 (ScoreComponents struct)
| Component | Weight Range | Default Weight | Description |
|---|---|---|---|
| complexity_score | 0-100 | 1.0 | Cyclomatic + cognitive complexity combined |
| coverage_score | 0-80 | 1.0 | Testing coverage deficit with complexity bonus |
| structural_score | 0-60 | 0.8 | God objects and architectural issues |
| size_score | 0-30 | 0.3 | File/function size (reduced from legacy ~1.5) |
| smell_score | 0-40 | 0.6 | Long functions, deep nesting, impure logic |
ScoreComponents Struct
#![allow(unused)]
fn main() {
/// Individual scoring components with their contributions
pub struct ScoreComponents {
pub complexity_score: f64, // Weight: 0-100
pub coverage_score: f64, // Weight: 0-80
pub structural_score: f64, // Weight: 0-60
pub size_score: f64, // Weight: 0-30 (reduced from current)
pub smell_score: f64, // Weight: 0-40
}
}
Source: src/priority/scoring/rebalanced.rs:32-40
Weighted Total Calculation
The weighted total is normalized to a 0-200 range:
#![allow(unused)]
fn main() {
pub fn weighted_total(&self, weights: &ScoreWeights) -> f64 {
let raw_total = self.complexity_score * weights.complexity_weight
+ self.coverage_score * weights.coverage_weight
+ self.structural_score * weights.structural_weight
+ self.size_score * weights.size_weight
+ self.smell_score * weights.smell_weight;
// Normalize to 0-200 range
// Theoretical max: 100×1.0 + 80×1.0 + 60×0.8 + 30×0.3 + 40×0.6 = 237
(raw_total / 237.0) * 200.0
}
}
Source: src/priority/scoring/rebalanced.rs:44-54
Presets
Debtmap provides four weight presets for different prioritization strategies.
Source: src/priority/scoring/rebalanced.rs:73-127 (ScoreWeights presets)
Balanced (Default)
The default preset prioritizes complexity and coverage over size.
[scoring_rebalanced]
preset = "balanced"
Weights:
- complexity_weight: 1.0
- coverage_weight: 1.0
- structural_weight: 0.8
- size_weight: 0.3
- smell_weight: 0.6
Use when: Standard development with focus on actual code quality.
Source: src/priority/scoring/rebalanced.rs:74-83
Quality-Focused
Maximum emphasis on code quality, minimal concern for file size.
[scoring_rebalanced]
preset = "quality-focused"
Weights:
- complexity_weight: 1.2
- coverage_weight: 1.1
- structural_weight: 0.9
- size_weight: 0.2
- smell_weight: 0.7
Use when: You want maximum emphasis on code quality over size.
Source: src/priority/scoring/rebalanced.rs:85-94
Size-Focused (Legacy)
Restores legacy behavior for backward compatibility.
[scoring_rebalanced]
preset = "size-focused"
Weights:
- complexity_weight: 0.5
- coverage_weight: 0.4
- structural_weight: 0.6
- size_weight: 1.5
- smell_weight: 0.3
Use when: Maintaining legacy scoring behavior or when file size is the primary concern.
Source: src/priority/scoring/rebalanced.rs:96-105
Test-Coverage-Focused
Emphasizes testing gaps above all other factors.
[scoring_rebalanced]
preset = "test-coverage"
Weights:
- complexity_weight: 0.8
- coverage_weight: 1.3
- structural_weight: 0.6
- size_weight: 0.2
- smell_weight: 0.5
Use when: Prioritizing test coverage improvements.
Source: src/priority/scoring/rebalanced.rs:107-116
Preset Comparison Table
| Preset | Complexity | Coverage | Structural | Size | Smells | Best For |
|---|---|---|---|---|---|---|
| balanced | 1.0 | 1.0 | 0.8 | 0.3 | 0.6 | Standard development |
| quality-focused | 1.2 | 1.1 | 0.9 | 0.2 | 0.7 | Quality-first teams |
| size-focused | 0.5 | 0.4 | 0.6 | 1.5 | 0.3 | Legacy compatibility |
| test-coverage | 0.8 | 1.3 | 0.6 | 0.2 | 0.5 | Coverage campaigns |
Preset Name Aliases
Presets support multiple naming conventions for convenience:
Source: src/priority/scoring/rebalanced.rs:118-127
| Preset | Aliases |
|---|---|
| balanced | balanced |
| quality-focused | quality-focused, quality_focused, quality |
| size-focused | size-focused, size_focused, legacy |
| test-coverage | test-coverage, test_coverage, testing |
Generated Code Detection
The rebalanced scoring automatically detects and reduces scores for generated code, applying a 90% reduction to the size score.
Source: src/priority/scoring/rebalanced.rs:450-467 (is_generated_file), src/priority/scoring/rebalanced.rs:246-249 (reduction logic)
Detection Patterns
Generated files are identified by common naming patterns:
#![allow(unused)]
fn main() {
let generated_patterns = [
".generated.rs",
".pb.rs", // Protocol buffers
".g.rs", // Grammar generated files
"_pb.rs", // Alternative protobuf naming
"generated/", // Generated directory
"/gen/", // Gen directory
];
}
Source: src/priority/scoring/rebalanced.rs:455-462
Score Reduction
When a file matches a generated pattern, the size score is reduced by 90%:
#![allow(unused)]
fn main() {
// Apply generated code detection and scoring reduction
if is_generated_file(&func.file) {
// Reduce size score by 90% for generated code
components.size_score *= 0.1;
}
}
Source: src/priority/scoring/rebalanced.rs:246-249
Examples
| File Path | Pattern Match | Size Score Adjustment |
|---|---|---|
| src/proto/api.pb.rs | .pb.rs | 90% reduction |
| src/generated/schema.rs | generated/ | 90% reduction |
| src/parser.g.rs | .g.rs | 90% reduction |
| src/main.rs | None | No adjustment |
Configuration
Rebalanced scoring is configured in your .debtmap.toml configuration file.
Source: src/config/scoring.rs:495-554 (RebalancedScoringConfig)
Enabling Rebalanced Scoring
Add the [scoring_rebalanced] section to activate rebalanced scoring:
# .debtmap.toml
[scoring_rebalanced]
preset = "balanced" # Activates rebalanced scoring with balanced preset
Using Presets
Select a preset to use predefined weight configurations:
[scoring_rebalanced]
preset = "quality-focused"
Custom Weight Overrides
Override individual weights while using a preset as base:
[scoring_rebalanced]
preset = "balanced"
complexity_weight = 1.2 # Override complexity weight
coverage_weight = 1.0 # Keep default
# Other weights inherit from preset
Full Custom Configuration
Define all weights manually without a preset:
[scoring_rebalanced]
# Custom weights (no preset)
complexity_weight = 1.0
coverage_weight = 1.0
structural_weight = 0.8
size_weight = 0.3
smell_weight = 0.6
RebalancedScoringConfig Structure
#![allow(unused)]
fn main() {
pub struct RebalancedScoringConfig {
/// Preset name (balanced, quality-focused, size-focused, test-coverage)
pub preset: Option<String>,
/// Custom complexity weight (overrides preset if specified)
pub complexity_weight: Option<f64>,
/// Custom coverage weight (overrides preset if specified)
pub coverage_weight: Option<f64>,
/// Custom structural weight (overrides preset if specified)
pub structural_weight: Option<f64>,
/// Custom size weight (overrides preset if specified)
pub size_weight: Option<f64>,
/// Custom smell weight (overrides preset if specified)
pub smell_weight: Option<f64>,
}
}
Source: src/config/scoring.rs:496-521
Scoring Rationale
Each debt item includes a detailed rationale explaining why a score was assigned. The rationale includes primary factors, bonuses, and context adjustments.
Source: src/priority/scoring/rebalanced.rs:130-223 (ScoringRationale)
ScoringRationale Struct
#![allow(unused)]
fn main() {
pub struct ScoringRationale {
pub primary_factors: Vec<String>,
pub bonuses: Vec<String>,
pub context_adjustments: Vec<String>,
}
}
Source: src/priority/scoring/rebalanced.rs:131-136
Primary Factors
Primary factors are the main contributors to the score:
- High cyclomatic complexity (complexity_score > 40): "High cyclomatic complexity (+{score})"
- Significant coverage gap (coverage_score > 30): "Significant coverage gap (+{score})"
- Structural issues (structural_score > 30): "Structural issues (+{score})"
Bonuses
Additive enhancements to the score:
- Complex + untested (complexity > 40 AND coverage > 20): "Complex + untested: +20 bonus applied"
- Code smells (smell_score > 20): "Code smells detected (+{score})"
Context Adjustments
Explanations for score adjustments based on context:
- Size context-adjusted (0 < size_score < 10): "File size context-adjusted (reduced weight for file type)"
- Size de-emphasized (size_weight < 0.5): "Size de-emphasized (weight: {weight})"
Example Rationale Output
Debt Item: src/payment/processor.rs:142 - process_payment()
Score: 95.3 (CRITICAL)
Primary factors:
- High cyclomatic complexity (+100.0)
- Significant coverage gap (+57.2)
Bonuses:
- Complex + untested: +20 bonus applied
- Code smells detected (+25.0)
Context adjustments:
- Size de-emphasized (weight: 0.3)
Complexity Scoring Details
The complexity score is computed from both cyclomatic and cognitive complexity.
Source: src/priority/scoring/rebalanced.rs:265-317 (score_complexity)
Cyclomatic Complexity Thresholds
| Cyclomatic Complexity | Base Score |
|---|---|
| > 30 | 100.0 |
| > 20 | 80.0 |
| > 15 | 60.0 |
| > 10 | 40.0 |
| > 5 | 20.0 |
| <= 5 | 0.0 |
Cognitive Complexity Bonus
An additive bonus is applied based on cognitive complexity:
| Cognitive Complexity | Bonus |
|---|---|
| > 50 | +20.0 |
| > 30 | +15.0 |
| > 20 | +10.0 |
| > 15 | +5.0 |
| <= 15 | 0.0 |
The final complexity score is capped at 100.0:
#![allow(unused)]
fn main() {
(cyclomatic_score + cognitive_bonus).min(100.0)
}
Example Calculations
| Function | Cyclomatic | Cognitive | Base Score | Bonus | Final |
|---|---|---|---|---|---|
| Simple getter | 2 | 3 | 0.0 | 0.0 | 0.0 |
| Moderate logic | 12 | 18 | 40.0 | 5.0 | 45.0 |
| Complex parser | 25 | 45 | 80.0 | 15.0 | 95.0 |
| Very complex | 35 | 60 | 100.0 | 20.0 | 100.0 (capped) |
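The thresholds reduce to a straightforward lookup, sketched here from the tables above (the actual logic is score_complexity in src/priority/scoring/rebalanced.rs:265-317):
#![allow(unused)]
fn main() {
fn complexity_score(cyclomatic: u32, cognitive: u32) -> f64 {
    // Base score from cyclomatic thresholds
    let base = match cyclomatic {
        c if c > 30 => 100.0,
        c if c > 20 => 80.0,
        c if c > 15 => 60.0,
        c if c > 10 => 40.0,
        c if c > 5 => 20.0,
        _ => 0.0,
    };
    // Additive bonus from cognitive thresholds
    let bonus = match cognitive {
        c if c > 50 => 20.0,
        c if c > 30 => 15.0,
        c if c > 20 => 10.0,
        c if c > 15 => 5.0,
        _ => 0.0,
    };
    (base + bonus).min(100.0) // capped at 100.0
}
}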
Score Normalization
Weighted totals are normalized to a 0-200 range for consistent comparison across projects.
Source: src/priority/scoring/rebalanced.rs:44-54
Normalization Formula
normalized_score = (raw_total / 237.0) * 200.0
Theoretical Maximum
The normalization divisor is 237. The weighted component maxima are:
- Complexity: 100 × 1.0 = 100
- Coverage: 80 × 1.0 = 80
- Structural: 60 × 0.8 = 48
- Size: 30 × 0.3 = 9
- Smells: 40 × 0.6 = 24
Note that these maxima sum to 261, not 237; the 237 divisor matches the sum of the first four components (excluding smells). A raw total of 237 maps to a normalized score of 200, so a file maxing out every component would land slightly above the nominal ceiling (261 / 237 × 200 ≈ 220).
CLI Usage
Enable Rebalanced Scoring via Config
Create or modify .debtmap.toml:
[scoring_rebalanced]
preset = "balanced"
Compare Standard vs Rebalanced Scoring
# Create test config with rebalanced scoring
cat > .debtmap-rebalanced.toml <<EOF
[scoring_rebalanced]
preset = "balanced"
EOF
# Compare results
debtmap analyze . --format terminal # Standard scoring
debtmap analyze . --config .debtmap-rebalanced.toml --format terminal # Rebalanced scoring
Test Different Presets
# Quality-focused preset
cat > .debtmap-quality.toml <<EOF
[scoring_rebalanced]
preset = "quality-focused"
EOF
debtmap analyze . --config .debtmap-quality.toml --top 10
Best Practices
- Start with balanced preset for standard development workflows
- Use quality-focused when code quality is the primary concern
- Use test-coverage during coverage improvement campaigns
- Use size-focused (legacy) only for backward compatibility
- Review rationale output to understand why items are prioritized
- Combine with file-level analysis for comprehensive debt assessment
- Track severity distributions over time to measure improvement
Migration from Legacy Scoring
Breaking Changes
- Scores will change significantly for all debt items
- Large files with low complexity will rank lower
- Complex untested code will rank higher
- Size-based prioritization reduced by 80%
Restoring Legacy Behavior
[scoring_rebalanced]
preset = "size-focused"
Gradual Migration Steps
- Test first: Add a [scoring_rebalanced] section to a test config file
- Compare: Run analysis with both standard and rebalanced scoring
- Evaluate: Review how priorities change
- Adopt: Switch your primary config after validation
- Tune: Adjust preset or custom weights based on team priorities
See Also
- File-Level Scoring - For architectural issue identification
- Scoring Configuration - Full scoring configuration options
- Analysis Guide - Interpreting analysis results
Exponential Scaling
Debtmap uses exponential scaling and risk boosting to amplify high-severity technical debt items, ensuring critical issues stand out clearly in priority lists. This section explains how these mechanisms work.
Overview
Traditional linear multipliers preserve the relative gaps between scores:
- Linear 2x multiplier: Score 50 → 100, Score 100 → 200 (the 2:1 ratio between items is unchanged)
Exponential scaling widens the gaps so critical issues are impossible to miss:
- Exponential scaling (^1.4): Score 50 → 239, Score 100 → 631 (the gap grows from 2:1 to roughly 2.6:1)
Key Benefits:
- Visual Separation: Critical items have dramatically higher scores than medium items
- Natural Clustering: Similar-severity items cluster together in ranked lists
- Actionable Ordering: Work through the list from top to bottom with confidence
- No Arbitrary Thresholds: Pure score-based ranking eliminates debates about tier boundaries
How Exponential Scaling Works
After calculating the base score (complexity + coverage + dependencies), Debtmap applies pattern-specific exponential scaling.
Formula (from src/priority/scoring/scaling.rs:77):
#![allow(unused)]
fn main() {
scaled_score = base_score.max(1.0).powf(exponent)
}
The max(1.0) clamps the base to at least 1.0 before exponentiation, guaranteeing a minimum scaled score of 1.0 and preventing fractional bases from shrinking when raised to a power greater than 1.
Debt-Type-Specific Exponents
The exponents are defined in ScalingConfig (src/priority/scoring/scaling.rs:16-28):
| Debt Type | Exponent | Condition | Rationale |
|---|---|---|---|
| God Objects | 1.4 | All God Objects | Highest amplification - architectural issues deserve top priority |
| God Modules | 1.4 | All God Modules | Same as God Objects - file-level architectural issues |
| High Complexity | 1.2 | cyclomatic > 30 | Moderate amplification - major complexity issues |
| Moderate Complexity | 1.1 | cyclomatic > 15 | Light amplification - notable but less severe |
| Complex Testing Gap | 1.1 | cyclomatic > 20 | Functions with complex untested code |
| All Other Types | 1.0 | Default | Linear scaling (no amplification) |
Source: src/priority/scoring/scaling.rs:53-75 - apply_exponential_scaling() function
Example: God Object Scaling (exponent = 1.4)
Comparing three God Objects with different base scores:
| Base Score | Calculation | Scaled Score | Amplification |
|---|---|---|---|
| 10 | 10^1.4 | 25.1 | 2.5x |
| 50 | 50^1.4 | 239.1 | 4.8x |
| 100 | 100^1.4 | 631.0 | 6.3x |
Result: The highest-severity God Object (score 100) is amplified 6.3x, while a minor issue (score 10) gets only 2.5x. This creates clear visual separation in priority lists.
Source: Test validation in src/priority/scoring/scaling.rs:271-293
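These amplification figures follow directly from the published formula and can be reproduced in a few lines of Rust (an illustrative check, not the project's test code):
fn scaled(base: f64, exponent: f64) -> f64 {
    base.max(1.0).powf(exponent)
}

fn main() {
    for base in [10.0_f64, 50.0, 100.0] {
        let s = scaled(base, 1.4); // God Object exponent
        println!("base {base:>5.1} -> scaled {s:>6.1} ({:.1}x)", s / base);
    }
}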
Risk Boosting
After exponential scaling, Debtmap applies additional risk multipliers based on architectural position and code patterns.
Formula (from src/priority/scoring/scaling.rs:84-109):
#![allow(unused)]
fn main() {
final_score = scaled_score * risk_multiplier
}
Risk Multipliers
Risk factors are applied multiplicatively. The multipliers are defined in ScalingConfig (src/priority/scoring/scaling.rs:23-28):
| Risk Factor | Multiplier | Condition | Rationale |
|---|---|---|---|
| High Dependency Count | 1.2x | total_deps > 15 | Central code that affects more of the codebase |
| Entry Point | 1.15x | FunctionRole::EntryPoint | Failures cascade to all downstream code |
| Complex + Untested | 1.25x | cyclomatic > 20 AND coverage < 10% | High-risk combination of complexity and no tests |
| Error Swallowing | 1.15x | error_swallowing_count > 0 | Functions with poor error handling patterns |
Note: Multiple risk factors combine multiplicatively. A function with high dependencies AND entry point status would get 1.2 * 1.15 = 1.38x boost.
Source: src/priority/scoring/scaling.rs:84-109 - apply_risk_boosts() function
Error Swallowing Boost
Functions that swallow errors receive a 1.15x boost to encourage proper error handling. This includes patterns like:
- if let Ok(x) = expr without handling the Err case
- Ignoring error returns from fallible operations
Source: src/priority/scoring/scaling.rs:103-107
Example: Complete Score Calculation
Function: process_payment (God Object)
Base Score: 85.0
Step 1 - Exponential Scaling (exponent 1.4):
85.0^1.4 = 502.6
Step 2 - Risk Boosting:
- Entry point: ×1.15 → 578.0
- High dependencies (20 deps): ×1.2 → 693.6
Final Score: 693.6
Source: Integration test in src/priority/scoring/scaling.rs:355-383
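The same walkthrough, condensed into code. This is a sketch assuming the default exponent and boosts listed earlier, not the actual apply_exponential_scaling/apply_risk_boosts signatures:
fn main() {
    let base: f64 = 85.0;
    let scaled = base.max(1.0).powf(1.4); // Step 1: ~502.6
    let boosted = scaled * 1.15 * 1.2;    // Step 2: entry point ×1.15, high deps ×1.2
    println!("final score: {boosted:.1}"); // ~693.6
}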
Complete Scoring Pipeline
Debtmap processes scores through multiple stages:
1. Base Score Calculation
↓
Weighted sum of:
- Coverage factor (50% weight)
- Complexity factor (35% weight)
- Dependency factor (15% weight)
2. Exponential Scaling
↓
Debt-type-specific exponent applied:
- God Objects/Modules: ^1.4
- High Complexity (>30): ^1.2
- Moderate Complexity (>15): ^1.1
- Complex Testing Gap (>20): ^1.1
- Others: ^1.0 (linear)
3. Risk Boosting
↓
Architectural position multipliers:
- High dependencies (>15 total): ×1.2
- Entry points: ×1.15
- Complex + untested: ×1.25
- Error swallowing: ×1.15
4. Final Score
↓
Used for ranking (no tier bucketing)
5. Output
↓
Sorted descending by final score
Note on weights: The default scoring weights are defined in src/config/scoring.rs:187-198:
- coverage_weight: 0.50 (50%)
- complexity_weight: 0.35 (35%)
- dependency_weight: 0.15 (15%)
Configuration
The exponential scaling parameters are defined in the ScalingConfig struct but are not currently configurable via TOML. The system uses hardcoded defaults:
#![allow(unused)]
fn main() {
// From src/priority/scoring/scaling.rs:30-43
impl Default for ScalingConfig {
fn default() -> Self {
Self {
god_object_exponent: 1.4,
god_module_exponent: 1.4,
high_complexity_exponent: 1.2,
moderate_complexity_exponent: 1.1,
high_dependency_boost: 1.2,
entry_point_boost: 1.15,
complex_untested_boost: 1.25,
error_swallowing_boost: 1.15,
}
}
}
}
Future Enhancement: TOML configuration support for these parameters could be added to allow per-project tuning.
Comparing With vs Without Exponential Scaling
Without Exponential Scaling (Linear Multipliers):
Priority List:
1. God Object (base: 85) → final: 170 (2x multiplier)
2. Long Function (base: 80) → final: 160 (2x multiplier)
3. Complex Function (base: 75) → final: 150 (2x multiplier)
4. Medium Issue (base: 70) → final: 140 (2x multiplier)
Problem: Gaps are uniform (10 points). Hard to distinguish critical from medium issues.
With Exponential Scaling:
Priority List:
1. God Object (base: 85) → scaled: 503 → with risk: 694
2. Complex Function (base: 80, cyclomatic>30) → scaled: 192 → with risk: 240
3. Moderate Complexity (base: 75, cyclomatic>15) → scaled: 116 → with risk: 116
4. Simple Issue (base: 70) → scaled: 70 → with risk: 70
Result: Clear separation. The God Object stands out at nearly 10x the score of the simple issue.
Score Ordering Guarantees
The exponential scaling implementation provides mathematical guarantees validated by property tests:
- Monotonicity: Higher base scores always result in higher scaled scores
- Non-decreasing boosts: Risk boosts never decrease scores (all multipliers ≥ 1.0)
- Strict ordering: No score inversions in final ranking
Source: Property tests in src/priority/scoring/scaling.rs:513-694
Practical Example
debtmap analyze . --top 10
Output:
Top 10 Technical Debt Items (Sorted by Score)
1. src/services/user_service.rs:45 - UserService::authenticate
Score: 1247.3 | Pattern: God Object | Coverage: 12%
→ 45 methods, 892 lines, high complexity
→ Risk factors: Entry point (×1.15), High dependencies (×1.2)
2. src/payment/processor.rs:142 - process_payment
Score: 891.2 | Pattern: Complexity Hotspot | Coverage: 8%
→ Cyclomatic: 42, Cognitive: 77
→ Risk factors: Entry point (×1.15), Complex untested (×1.25)
3. src/reporting/generator.rs:234 - generate_monthly_report
Score: 654.1 | Pattern: Complexity Hotspot | Coverage: 45%
→ Cyclomatic: 35, Cognitive: 50
→ Risk factors: High dependencies (×1.2)
Action: Focus on top 3 items first - they have dramatically higher scores than items 4-10.
Performance Impact
Exponential scaling has negligible performance impact:
- Computation: Simple powf() operation per item
- Overhead: <1% additional analysis time
- Scalability: Works with parallel processing (no synchronization needed)
- Memory: No additional data structures required
See Also
- Function-Level Scoring - Base score calculation at function level
- File-Level Scoring - Aggregation and file-level metrics
- Rebalanced Scoring - Score normalization and balancing
- Data Flow Scoring - Purity and refactorability adjustments
Data Flow Scoring
Data flow scoring enhances Debtmap’s technical debt analysis by evaluating function purity, refactorability, and code patterns. This subsection explains how data flow analysis affects debt prioritization through three key factors: purity, refactorability, and pattern recognition.
Overview
Data flow scoring is an optional scoring layer that adjusts debt priorities based on functional programming principles. Functions that are pure, easily refactorable, or follow recognized patterns receive reduced priority scores, reflecting their lower maintenance burden.
Key principle: Pure functions and data transformation pipelines represent less technical debt than impure functions with side effects, because they’re easier to test, reason about, and refactor.
Source: src/priority/unified_scorer.rs:995-1020 (calculate_unified_priority_with_data_flow)
How Data Flow Scoring Works
Data flow scoring applies three weighted factors to the base debt score:
adjusted_score = base_score * combined_adjustment
combined_adjustment = (purity_factor * purity_weight
+ refactorability_factor * refactorability_weight
+ pattern_factor * pattern_weight)
/ total_weight
Each factor ranges from 0.0 to 1.0, where lower values reduce the final priority score.
Source: src/priority/unified_scorer.rs:1058-1075
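A sketch of that weighted combination as a standalone function; the parameter layout is an assumption for illustration, not Debtmap's API:
fn combined_adjustment(
    factors: (f64, f64, f64), // (purity, refactorability, pattern)
    weights: (f64, f64, f64), // (purity_weight, refactorability_weight, pattern_weight)
) -> f64 {
    let (p, r, t) = factors;
    let (wp, wr, wt) = weights;
    // Weighted average; dividing by the total weight normalizes
    // weights that do not sum to 1.0
    (p * wp + r * wr + t * wt) / (wp + wr + wt)
}
// With the default weights (0.4, 0.3, 0.3), a strictly pure data pipeline
// (purity 0.0, refactorability 1.0, pattern 0.7) yields
// (0.0 + 0.3 + 0.21) / 1.0 = 0.51, roughly halving the base score.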
Purity Spectrum
The purity spectrum classifies functions into five levels based on their side effects and mutation behavior. Pure functions receive the lowest priority multipliers since they represent minimal technical debt.
Classification Levels
| Level | Multiplier | Description |
|---|---|---|
| StrictlyPure | 0.0 | No mutations, no I/O, referentially transparent |
| LocallyPure | 0.3 | Pure interface but uses local mutations internally |
| IOIsolated | 0.6 | I/O operations clearly separated from logic |
| IOMixed | 0.9 | I/O mixed with business logic |
| Impure | 1.0 | Mutable state, side effects throughout |
Source: src/priority/unified_scorer.rs:64-94 (PuritySpectrum enum)
Classification Algorithm
The purity factor is calculated by analyzing three sources of information from the data flow graph:
- Purity Analysis Results: High-confidence purity (>80%) indicates strict or local purity
- Mutation Analysis: Tracks whether a function has local mutations
- I/O Operations: Identifies I/O patterns for non-pure functions
#![allow(unused)]
fn main() {
// Classification logic (simplified)
if purity.is_pure && purity.confidence > 0.8 {
if mutations.has_mutations {
PuritySpectrum::LocallyPure // 0.3 multiplier
} else {
PuritySpectrum::StrictlyPure // 0.0 multiplier
}
} else if purity.is_pure {
PuritySpectrum::LocallyPure // 0.3 multiplier
} else {
classify_io_isolation(io_ops) // 0.6-1.0 multiplier
}
}
Source: src/priority/unified_scorer.rs:878-918 (calculate_purity_factor)
I/O Isolation Classification
For impure functions, the system evaluates I/O isolation based on concentration:
- IOIsolated (0.6): At most 2 unique I/O operation types and 3 total operations
- IOMixed (0.9): More than 2 unique types or more than 3 operations
- Impure (1.0): No I/O information available
Source: src/priority/unified_scorer.rs:921-935 (classify_io_isolation)
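The concentration rule can be sketched as follows, using the variant names from the table above (the body illustrates the stated thresholds, not the exact source):
use std::collections::HashSet;

enum PuritySpectrum { IOIsolated, IOMixed, Impure }

fn classify_io_isolation(io_ops: &[String]) -> PuritySpectrum {
    if io_ops.is_empty() {
        return PuritySpectrum::Impure; // no I/O information available: 1.0
    }
    let unique_types: HashSet<&String> = io_ops.iter().collect();
    if unique_types.len() <= 2 && io_ops.len() <= 3 {
        PuritySpectrum::IOIsolated // concentrated I/O: 0.6
    } else {
        PuritySpectrum::IOMixed // scattered I/O: 0.9
    }
}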
Purity Level vs Purity Spectrum
Debtmap uses two related but distinct purity classifications:
| Aspect | PurityLevel | PuritySpectrum |
|---|---|---|
| Purpose | Analysis classification | Scoring multiplier |
| Levels | 4 (StrictlyPure, LocallyPure, ReadOnly, Impure) | 5 (adds IOIsolated, IOMixed) |
| Usage | src/analysis/purity_analysis.rs | src/priority/unified_scorer.rs |
| Focus | Categorizing purity type | Assigning debt priority |
PurityLevel (from purity analysis) describes what kind of function this is. PuritySpectrum (for scoring) determines how much this affects debt priority, with finer granularity for I/O patterns.
Source: src/analysis/purity_analysis.rs:32-43 (PurityLevel), src/priority/unified_scorer.rs:64-94 (PuritySpectrum)
Pattern Factor
The pattern factor distinguishes data flow pipelines from business logic. Pure data transformation chains (map/filter/reduce patterns) receive reduced priority.
Calculation
#![allow(unused)]
fn main() {
// Pattern factor ranges from 0.7 to 1.0
let transform_ratio = transform_count / dependency_count;
if transform_ratio > 0.5 {
0.7 // Data flow pipeline - lowest priority
} else if transform_ratio > 0.3 {
0.85 // Mixed - moderate reduction
} else {
1.0 // Business logic - no reduction
}
}
Rationale: Functions with high transformation-to-dependency ratios are likely data flow pipelines, which are easier to test and maintain than complex business logic.
Source: src/priority/unified_scorer.rs:949-978 (calculate_pattern_factor)
Data Transformation Detection
The system counts data transformations by examining the data flow graph for:
- Outgoing function calls with associated data transformations
- Variable dependencies passed between functions
- Transformation types (map, filter, reduce, etc.)
Source: src/priority/unified_scorer.rs:980-993 (count_data_transformations)
Refactorability Factor
The refactorability factor was designed to identify dead stores and unused mutations. However, this analysis produced too many false positives and has been simplified.
Current behavior: Returns a neutral factor of 1.0 (no adjustment).
#![allow(unused)]
fn main() {
fn calculate_refactorability_factor(
_func_id: &FunctionId,
_data_flow: &DataFlowGraph,
_config: &DataFlowScoringConfig,
) -> f64 {
// Dead store analysis has been removed as it produced
// too many false positives.
1.0
}
}
Future plans: More sophisticated dead store analysis may be reintroduced with improved heuristics.
Source: src/priority/unified_scorer.rs:937-947
Data Flow Graph
The DataFlowGraph struct provides the underlying data for all data flow scoring calculations:
#![allow(unused)]
fn main() {
pub struct DataFlowGraph {
call_graph: CallGraph,
variable_deps: HashMap<FunctionId, HashSet<String>>,
data_transformations: HashMap<(FunctionId, FunctionId), DataTransformation>,
io_operations: HashMap<FunctionId, Vec<IoOperation>>,
purity_analysis: HashMap<FunctionId, PurityInfo>,
mutation_analysis: HashMap<FunctionId, MutationInfo>,
// ... (CFG analysis fields omitted for brevity)
}
}
Key data used for scoring:
- purity_analysis: Results from purity detection
- mutation_analysis: Tracks live vs dead mutations
- io_operations: I/O operation locations and types
- variable_deps: Variable dependencies for pattern detection
- data_transformations: Transformation relationships between functions
Source: src/data_flow/mod.rs:113-140
Configuration
Configure data flow scoring in your .debtmap.toml:
[data_flow_scoring]
enabled = true # Enable/disable data flow scoring (default: true)
purity_weight = 0.4 # Weight for purity factor (default: 0.4)
refactorability_weight = 0.3 # Weight for refactorability factor (default: 0.3)
pattern_weight = 0.3 # Weight for pattern factor (default: 0.3)
Weight guidelines:
- All weights should be between 0.0 and 1.0
- Weights are normalized internally (don’t need to sum to 1.0)
- Higher purity_weight emphasizes functional programming style
- Higher pattern_weight rewards data transformation pipelines
Source: src/config/scoring.rs:678-723 (DataFlowScoringConfig)
Disabling Data Flow Scoring
To disable data flow scoring entirely:
[data_flow_scoring]
enabled = false
When disabled, calculate_unified_priority_with_data_flow returns the base score without any data flow adjustments.
Practical Examples
Example 1: Strictly Pure Function
#![allow(unused)]
fn main() {
fn calculate_total(prices: &[f64]) -> f64 {
prices.iter().sum()
}
}
Analysis:
- No mutations: has_mutations = false
- No I/O operations: io_ops = []
- High purity confidence: confidence > 0.8
Result: PuritySpectrum::StrictlyPure (multiplier: 0.0)
This function’s debt score is reduced by the purity factor, deprioritizing it for refactoring.
Example 2: I/O Isolated Function
#![allow(unused)]
fn main() {
fn save_report(report: &Report) -> std::io::Result<()> {
let json = serde_json::to_string(report)?;
std::fs::write("report.json", json)?;
Ok(())
}
}
Analysis:
- I/O operations: [file_write] (1 unique type, 1 operation)
- Concentrated I/O: unique_types.len() <= 2 && ops.len() <= 3
Result: PuritySpectrum::IOIsolated (multiplier: 0.6)
Example 3: Data Flow Pipeline
#![allow(unused)]
fn main() {
fn process_transactions(transactions: Vec<Transaction>) -> Vec<Summary> {
transactions
.into_iter()
.filter(|t| t.amount > 0.0)
.map(|t| Summary::from(t))
.collect()
}
}
Analysis:
- High transformation ratio (filter + map chains): transform_ratio > 0.5
Result: Pattern factor = 0.7, reducing debt priority for this data pipeline.
Integration with Unified Scoring
Data flow scoring integrates with the broader unified scoring system. The entry point is:
#![allow(unused)]
fn main() {
pub fn calculate_unified_priority_with_data_flow(
func: &FunctionMetrics,
call_graph: &CallGraph,
data_flow: &DataFlowGraph,
coverage: Option<&LcovData>,
_organization_issues: Option<f64>,
debt_aggregator: Option<&DebtAggregator>,
config: &DataFlowScoringConfig,
) -> UnifiedScore
}
The function:
- Calculates base score using role-aware unified scoring
- If data flow scoring is enabled, calculates the three factors
- Applies weighted combination to adjust final score
- Records factors in UnifiedScore for debugging
Source: src/priority/unified_scorer.rs:995-1080
Related Documentation
- Rebalanced Scoring - How data flow factors combine with other scoring weights
- Function-Level Scoring - Base scoring for individual functions
- File-Level Scoring - Aggregated scoring at file level
- Scoring Configuration - Configuration reference
Semantic Classification
Debtmap performs semantic analysis to classify functions by their architectural role, enabling more accurate complexity scoring and prioritization.
Overview
Semantic classification identifies the purpose of each function based on AST patterns, helping debtmap:
- Apply appropriate complexity expectations
- Adjust scoring based on function role
- Provide role-specific recommendations
Function Roles
Debtmap classifies functions into seven distinct roles, each with specific detection criteria and scoring behavior.
Pure Logic
Functions that compute without side effects. These are the core business logic functions that deserve highest test priority.
Detection Criteria (from src/priority/semantic_classifier/mod.rs:43):
- Default classification when no other role matches
- Does not match entry point, debug, constructor, enum converter, accessor, pattern matching, I/O wrapper, or orchestrator patterns
Example:
#![allow(unused)]
fn main() {
fn calculate_total(items: &[Item]) -> u32 {
items.iter().map(|i| i.price).sum()
}
}
Orchestrator
Functions that coordinate other functions with simple delegation logic.
Detection Criteria (from src/priority/semantic_classifier/classifiers.rs:257-328):
- Name matches orchestration patterns: orchestrate, coordinate, manage, dispatch, route, delegate, forward
- Name prefixes: workflow_, pipeline_, process_, orchestrate_, coordinate_, execute_flow_
- Must have at least 2 meaningful callees (non-stdlib functions)
- Cyclomatic complexity ≤ 5
- Delegation ratio ≥ 20% (function calls / total lines)
- Excludes adapter/wrapper patterns (single delegation)
Example:
#![allow(unused)]
fn main() {
fn process_order(order: Order) -> Result<Receipt> {
let validated = validate_order(&order)?;
let priced = calculate_prices(&validated)?;
finalize_order(&priced)
}
}
I/O Wrapper
Functions that wrap I/O operations. Includes simple constructors, accessors, and enum converters.
Detection Criteria (from src/priority/semantic_classifier/classifiers.rs:331-343):
- Name contains I/O keywords: read, write, file, socket, http, request, response, stream, serialize, deserialize, save, load, etc.
- Short I/O functions (< 20 lines) are always classified as I/O wrappers
- Longer functions (≤ 50 lines) with strong I/O name patterns (output_, write_, print_, etc.) and low nesting (≤ 3)
Example:
#![allow(unused)]
fn main() {
fn read_config(path: &Path) -> Result<Config> {
let content = fs::read_to_string(path)?;
toml::from_str(&content)
}
}
Entry Point
Main functions and public API endpoints. These have highest classification precedence.
Detection Criteria (from src/priority/semantic_classifier/pattern_matchers.rs:54-63):
- Name patterns: main, run, start, init, handle, process, execute, serve, listen
- Functions at the top of the call graph
- Highest classification precedence (checked before all other roles)
Example:
fn main() {
let args = Args::parse();
run(args).unwrap();
}
Pattern Match
Functions dominated by pattern matching logic, typically with many branches but low cyclomatic complexity.
Detection Criteria (from src/priority/semantic_classifier/classifiers.rs:213-252):
- Name suggests pattern matching: detect, classify, identify, determine, resolve, match, parse_type, get_type, find_type
- Low cyclomatic complexity (≤ 2) but higher cognitive complexity
- Cognitive/cyclomatic ratio > 5.0 (indicates many if/else or match branches)
Example:
#![allow(unused)]
fn main() {
fn handle_event(event: Event) -> Action {
match event {
Event::Click(pos) => Action::Select(pos),
Event::Drag(from, to) => Action::Move(from, to),
Event::Release => Action::Confirm,
}
}
}
Debug
Functions used for troubleshooting and diagnostics. These have the lowest test priority.
Detection Criteria (from src/priority/semantic_classifier/classifiers.rs:14-53):
- Name prefixes: debug_, print_, dump_, trace_
- Name suffixes: _diagnostics, _debug, _stats
- Name contains: diagnostics
- Cognitive complexity ≤ 10 (prevents misclassifying complex functions with debug-like names)
- Alternatively: Very simple functions (cognitive < 5, length < 20) with output-focused I/O patterns (print, display, show, log, trace, dump)
Example:
#![allow(unused)]
fn main() {
fn print_call_graph_diagnostics(graph: &CallGraph) {
for node in graph.nodes() {
println!("{}: {} callers, {} callees",
node.name, node.callers.len(), node.callees.len());
}
}
}
Unknown
Functions that cannot be classified into any specific role. These receive neutral scoring adjustments.
Detection Criteria (from src/priority/semantic_classifier/mod.rs:32):
- Reserved for edge cases where classification fails
- In practice, functions default to PureLogic when no other role matches
Classification Precedence
The classifier applies rules in a specific order to ensure correct classification when multiple patterns match (from src/priority/semantic_classifier/mod.rs:47-113):
1. Entry Point - Highest precedence, checked first
2. Debug - Diagnostic functions detected early
3. Constructor - Simple constructors (AST-based detection, spec 117/122)
4. Enum Converter - Simple enum-to-value converters (spec 124)
5. Accessor - Simple getter/accessor methods (spec 125)
6. Data Flow - Data transformation orchestrators (spec 126, opt-in)
7. Pattern Match - Pattern matching functions
8. I/O Wrapper - I/O-focused functions
9. Orchestrator - Coordination functions
10. Pure Logic - Default fallback
AST-Based Detection
Semantic classification uses AST analysis to detect function roles beyond simple name matching.
Constructor Detection (Spec 117/122)
Source: src/priority/semantic_classifier/classifiers.rs:115-183
Detects constructor functions even with non-standard names by analyzing:
- Return type: Must return Self, Result<Self>, or Option<Self>
- Body patterns: Contains struct initialization with Self { ... }
- No loops in function body
- Complexity thresholds: cyclomatic ≤ 5, nesting ≤ 2, length < 30
Detected Patterns:
- Standard names: new, default, from_*, with_*, create_*, make_*, build_*
- Non-standard names with AST analysis: create_default_client() returning Self
Example:
#![allow(unused)]
fn main() {
// Detected even without standard naming
pub fn create_default_client() -> Self {
Self {
timeout: Duration::from_secs(30),
retries: 3,
}
}
}
Enum Converter Detection (Spec 124)
Source: src/priority/semantic_classifier/classifiers.rs:185-211
Detects simple enum-to-string converter functions:
- Name patterns: name, as_str, to_*
- Body contains exhaustive match statement on self
- All match arms return string/numeric literals only
- No function calls in match arms (e.g., no format!())
- Cognitive complexity ≤ 3
Example:
#![allow(unused)]
fn main() {
pub fn name(&self) -> &'static str {
match self {
FrameworkType::Django => "Django",
FrameworkType::Flask => "Flask",
FrameworkType::PyQt => "PyQt",
}
}
}
Accessor Method Detection (Spec 125)
Source: src/priority/semantic_classifier/mod.rs:121-177
Detects simple accessor and getter methods:
Single-word patterns: id, name, value, kind, type, status, code, key, index
Prefix patterns: get_*, is_*, has_*, can_*, should_*, as_*, to_*, into_*
Complexity thresholds:
- Cyclomatic complexity ≤ 2
- Cognitive complexity ≤ 1
- Length < 10 lines
- Nesting ≤ 1 level
- If AST available, verifies body is simple accessor pattern
Example:
#![allow(unused)]
fn main() {
pub fn id(&self) -> u32 {
self.id
}
}
Data Flow Classification (Spec 126)
Source: src/priority/semantic_classifier/mod.rs:81-96
Analyzes data flow patterns to identify orchestration functions based on transformation chains.
Detection Criteria:
- Enabled via configuration (opt-in by default)
- High confidence (≥ 0.8)
- High transformation ratio (≥ 0.7)
- Low business logic ratio (< 0.3)
Debug Function Detection (Spec 119)
Source: src/priority/semantic_classifier/pattern_matchers.rs:7-20
Detects debug/diagnostic functions using:
Name patterns:
- Prefixes: debug_, print_, dump_, trace_
- Suffixes: _diagnostics, _debug, _stats
- Contains: diagnostics
Behavioral characteristics:
- Low cognitive complexity (< 5)
- Short length (< 20 lines)
- Output-focused I/O patterns: print, display, show, log, trace, dump
Role-Specific Expectations
Different roles have different coverage and complexity expectations:
| Role | Coverage Expectation | Complexity Tolerance |
|---|---|---|
| Pure Logic | High | Low |
| Orchestrator | Medium | Medium |
| I/O Wrapper | Low | Low |
| Entry Point | Low | Medium |
| Pattern Match | Medium | Variable |
| Debug | Low | Low |
| Unknown | Medium | Medium |
Scoring Adjustments
Semantic classification affects scoring through role multipliers. These values adjust the priority score for each function role (from src/config/scoring.rs:307-333):
[scoring.role_multipliers]
pure_logic = 1.2 # Prioritized (highest test priority)
orchestrator = 0.8 # Reduced priority
io_wrapper = 0.7 # Minor reduction
entry_point = 0.9 # Slight reduction
pattern_match = 0.6 # Moderate reduction
debug = 0.3 # Lowest test priority
unknown = 1.0 # No adjustment
Scoring Formula:
- Higher multipliers (> 1.0) increase function priority
- Lower multipliers (< 1.0) decrease function priority
- pure_logic = 1.2 means pure logic functions are prioritized 20% higher
- debug = 0.3 means debug functions are de-prioritized significantly
Configuration
Semantic Classification
[semantic]
enabled = true
role_detection = true
adjust_coverage_expectations = true
Constructor Detection (Spec 117/122)
From src/config/detection.rs:54-98:
[classification.constructors]
# Name patterns for constructor functions
patterns = ["new", "default", "from_", "with_", "create_", "make_", "build_", "of_", "empty", "zero", "any"]
# Complexity thresholds
max_cyclomatic = 2 # Maximum cyclomatic complexity
max_cognitive = 3 # Maximum cognitive complexity
max_length = 15 # Maximum function length
max_nesting = 1 # Maximum nesting depth
# Enable AST-based detection for non-standard constructor names
ast_detection = true # Analyzes return types and body patterns
Accessor Detection (Spec 125)
From src/config/detection.rs:135-226:
[classification.accessors]
enabled = true
# Single-word accessor names
single_word_patterns = ["id", "name", "value", "kind", "type", "status", "code", "key", "index"]
# Prefix patterns for accessors
prefix_patterns = ["get_", "is_", "has_", "can_", "should_", "as_", "to_", "into_"]
# Complexity thresholds
max_cyclomatic = 2 # Maximum cyclomatic complexity
max_cognitive = 1 # Maximum cognitive complexity (stricter than constructors)
max_length = 10 # Maximum function length
max_nesting = 1 # Maximum nesting depth
Data Flow Classification (Spec 126)
From src/config/detection.rs:228-273:
[classification.data_flow]
enabled = false # Opt-in feature
min_confidence = 0.8 # Minimum confidence required
min_transformation_ratio = 0.7 # Minimum transformation ratio for orchestrator
max_business_logic_ratio = 0.3 # Maximum business logic for orchestrator
Debug Function Detection
Debug function detection is controlled by name patterns in src/priority/semantic_classifier/pattern_matchers.rs:7-20. The detection thresholds are:
- Cognitive complexity ≤ 10 for name-matched functions
- Cognitive complexity < 5 and length < 20 for behavior-matched functions
Troubleshooting
Function Classified Incorrectly
If a function is classified with the wrong role:
- Check classification precedence - Entry points take highest precedence
- Review complexity thresholds - High complexity can disqualify certain roles
- Examine name patterns - Some roles require specific naming conventions
- Enable AST detection - Set classification.constructors.ast_detection = true for better constructor detection
Constructor Not Detected
If a simple constructor is classified as PureLogic:
- Ensure function name matches patterns or returns Self
- Check complexity thresholds: cyclomatic ≤ 2, cognitive ≤ 3, length < 15
- Enable AST detection for non-standard names
- Verify no loops in function body
Debug Function Not Detected
If a diagnostic function has high priority:
- Ensure name matches debug patterns (debug_*, print_*, *_diagnostics, etc.)
- Check cognitive complexity is ≤ 10
- Functions with high complexity are intentionally excluded to prevent misclassification
Test Quality Analysis
Debtmap provides comprehensive analysis of test code quality, detecting anti-patterns, identifying potentially flaky tests, and providing actionable recommendations for improvement.
Overview
Test quality analysis examines your test suite to identify:
- Assertion patterns - Missing or weak assertions that reduce test effectiveness
- Test complexity - Overly complex tests that are hard to maintain
- Flaky test patterns - Tests that may fail intermittently
- Framework detection - Automatic detection of testing frameworks
- Test type classification - Categorization of tests by type (unit, integration, property, benchmark)
The analysis pipeline uses multiple detectors defined in src/testing/mod.rs:111-115:
#![allow(unused)]
fn main() {
let detectors: Vec<Box<dyn TestingDetector>> = vec![
Box::new(assertion_detector::AssertionDetector::new()),
Box::new(complexity_detector::TestComplexityDetector::new()),
Box::new(flaky_detector::FlakyTestDetector::new()),
];
}
Test Type Classification
Tests are classified into distinct types based on attributes, file paths, and naming patterns. The classification is handled by TestClassifier in src/testing/rust/test_classifier.rs.
Test Types
From src/testing/rust/mod.rs:69-76:
| Test Type | Description | Detection Method |
|---|---|---|
| UnitTest | Isolated function testing | Default for src/ tests |
| IntegrationTest | Cross-module testing | Tests in tests/ directory |
| BenchmarkTest | Performance benchmarks | #[bench] attribute |
| PropertyTest | Generative testing | proptest! or quickcheck! macros |
| DocTest | Documentation tests | Extracted from doc comments |
Classification Logic
The classifier checks in order (src/testing/rust/test_classifier.rs:22-44):
1. Benchmark detection: Functions with the #[bench] attribute
2. Property test detection: Functions using proptest or quickcheck
3. Integration test path: Files in the tests/ directory
4. Default: Unit test
#![allow(unused)]
fn main() {
// Example: Integration test detection
fn is_integration_test_path(&self, path: &Path) -> bool {
let path_str = path.to_string_lossy();
path_str.contains("/tests/") || path_str.starts_with("tests/")
}
}
Assertion Pattern Detection
The AssertionDetector (src/testing/assertion_detector.rs) identifies tests with missing or inadequate assertions.
Detected Assertion Types
From src/testing/rust/mod.rs:79-95:
| Assertion Type | Description | Quality Rating |
|---|---|---|
| Assert | assert!(condition) | Weak - no context on failure |
| AssertEq | assert_eq!(left, right) | Strong - shows expected vs actual |
| AssertNe | assert_ne!(left, right) | Strong - shows values |
| Matches | matches!(value, pattern) | Medium - pattern-based |
| ShouldPanic | #[should_panic] attribute | Valid for panic tests |
| ResultOk | Ok(()) return | Valid for Result-based tests |
| Custom(String) | Custom assertion macros | Depends on implementation |
Assertion Macro Recognition
The detector recognizes these macros (src/testing/assertion_detector.rs:227-237):
#![allow(unused)]
fn main() {
fn is_assertion_macro(name: &str) -> bool {
matches!(
name,
"assert" | "assert_eq" | "assert_ne" | "assert_matches"
| "debug_assert" | "debug_assert_eq" | "debug_assert_ne"
)
}
}
Tests Without Assertions
Tests flagged as having no assertions receive suggested fixes:
#![allow(unused)]
fn main() {
// From src/testing/assertion_detector.rs:257-278
fn suggest_assertions(analysis: &TestStructureAnalysis) -> Vec<String> {
if analysis.has_action && !analysis.has_assertions {
vec![
"Add assertions to verify the behavior".to_string(),
"Consider using assert!, assert_eq!, or assert_ne!".to_string(),
]
}
// ...
}
}
Test Complexity Analysis
The TestComplexityDetector (src/testing/complexity_detector.rs) measures test complexity and suggests simplifications.
Complexity Sources
From src/testing/mod.rs:39-46:
| Source | Description | Threshold |
|---|---|---|
| ExcessiveMocking | Too many mock setups | > 3 mocks |
| NestedConditionals | Deeply nested if/match | Nesting > 1 level |
| MultipleAssertions | Too many assertions | > 5 assertions |
| LoopInTest | Loops in test code | Any loop detected |
| ExcessiveSetup | Long test functions | > 30 lines |
Complexity Scoring
The complexity score is calculated in src/testing/complexity_detector.rs:304-309:
#![allow(unused)]
fn main() {
pub(crate) fn calculate_total_complexity(analysis: &TestComplexityAnalysis) -> u32 {
analysis.cyclomatic_complexity
+ (analysis.mock_setup_count as u32 * 2)
+ analysis.assertion_complexity
+ (analysis.line_count as u32 / 10) // Penalty for long tests
}
}
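Plugging hypothetical numbers into this formula shows how quickly mocks and length add up:
fn main() {
    // Hypothetical test: cyclomatic 6, 4 mock setups,
    // assertion complexity 3, 40 lines long
    let total = 6 + (4 * 2) + 3 + (40 / 10);
    println!("total complexity: {total}"); // 21, well over the default threshold of 10
}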
The Rust-specific complexity scoring (src/testing/rust/mod.rs:24-29 in doc comments):
- Conditional statements: +2 per if/match
- Loops: +3 per loop
- Assertions beyond 5: +1 per additional assertion
- Nesting depth > 2: +2 per level
- Tests > 30 lines: +(lines-30)/10
Simplification Recommendations
From src/testing/mod.rs:48-55:
| Recommendation | When Applied | Action |
|---|---|---|
| ExtractHelper | Long tests with shared setup | Extract common code to helper function |
| SplitTest | Many assertions + many mocks | Split into focused tests |
| ParameterizeTest | High cyclomatic complexity (> 5) | Use parameterized testing |
| SimplifySetup | Default recommendation | Reduce test setup complexity |
| ReduceMocking | Excessive mocks (> max_mock_setups) | Use real implementations or simpler mocks |
The suggestion logic (src/testing/complexity_detector.rs:320-334):
#![allow(unused)]
fn main() {
pub(crate) fn suggest_simplification(
analysis: &TestComplexityAnalysis,
detector: &TestComplexityDetector,
) -> TestSimplification {
match () {
_ if analysis.mock_setup_count > detector.max_mock_setups => {
TestSimplification::ReduceMocking
}
_ if analysis.line_count > detector.max_test_length => {
classify_length_based_simplification(analysis)
}
_ if analysis.cyclomatic_complexity > 5 => TestSimplification::ParameterizeTest,
_ => TestSimplification::SimplifySetup,
}
}
}
Flaky Test Detection
The FlakyTestDetector (src/testing/flaky_detector.rs) identifies patterns that can cause intermittent test failures.
Flakiness Types
From src/testing/mod.rs:57-65:
| Type | Description | Impact |
|---|---|---|
| TimingDependency | Uses sleep, timeouts, or time measurements | High |
| RandomValues | Uses random number generation | Medium |
| ExternalDependency | Calls external services or APIs | Critical |
| FilesystemDependency | Reads/writes files | Medium |
| NetworkDependency | Network operations | Critical |
| ThreadingIssue | Thread spawning or synchronization | High |
Rust-Specific Flakiness Types
From src/testing/rust/mod.rs:98-107:
| Type | Description |
|---|---|
| HashOrdering | HashMap iteration (non-deterministic order) |
| ThreadingIssue | Unsynchronized concurrent access |
Reliability Impact Levels
From src/testing/mod.rs:67-73:
- Critical: External dependencies, network calls - high failure probability
- High: Timing dependencies, threading issues - moderate failure probability
- Medium: Random values, filesystem operations - occasional failures
- Low: Minor ordering issues - rare failures
Detection Patterns
The detector uses pattern categories defined in src/testing/flaky_detector.rs:190-284:
Timing Patterns (TimingDependency):
sleep, Instant::now, SystemTime::now, Duration::from, delay,
timeout, wait_for, park_timeout, recv_timeout
Random Patterns (RandomValues):
rand, random, thread_rng, StdRng, SmallRng, gen_range,
sample, shuffle, choose
External Service Patterns (ExternalDependency):
reqwest, hyper, http, Client::new, HttpClient, ApiClient,
database, db, postgres, mysql, redis, mongodb, sqlx, diesel
Filesystem Patterns (FilesystemDependency):
fs::, File::, std::fs, tokio::fs, async_std::fs,
read_to_string, write, create, remove_file, remove_dir
Network Patterns (NetworkDependency):
TcpStream, TcpListener, UdpSocket, connect, bind,
listen, accept, send_to, recv_from
Stabilization Suggestions
Each flaky pattern includes a stabilization suggestion:
#![allow(unused)]
fn main() {
// From src/testing/flaky_detector.rs:156-187
_ if is_timing_function(path_str) => Some(FlakinessIndicator {
flakiness_type: FlakinessType::TimingDependency,
impact: ReliabilityImpact::High,
suggestion: "Replace sleep/timing dependencies with deterministic waits or mocks"
.to_string(),
}),
_ if is_external_service_call(path_str) => Some(FlakinessIndicator {
flakiness_type: FlakinessType::ExternalDependency,
impact: ReliabilityImpact::Critical,
suggestion: "Mock external service calls for unit tests".to_string(),
}),
}
Timing Classification
The timing classifier (src/testing/timing_classifier.rs) categorizes timing-related operations to assess flakiness risk.
Timing Categories
From src/testing/timing_classifier.rs:31-44:
| Category | Description | Flaky Risk |
|---|---|---|
| CurrentTime | Instant::now() | Yes |
| SystemTime | SystemTime::now() | Yes |
| DurationCreation | Duration::from_*() | No |
| ElapsedTime | elapsed(), duration_since() | Yes |
| Sleep | Thread sleep operations | Yes |
| Timeout | Operations with timeout | Yes |
| Wait | Waiting operations (not await) | Yes |
| ThreadTimeout | park_timeout, recv_timeout | Yes |
| Delay | Delay operations | Yes |
| Timer | Timer-based operations | Yes |
| Unknown | Unrecognized patterns | No |
Only DurationCreation and Unknown are considered non-flaky.
Test Quality Issue Types
The Rust-specific module tracks comprehensive issue types (src/testing/rust/mod.rs:131-140):
| Issue Type | Severity | Description |
|---|---|---|
| NoAssertions | High | Test has no assertions |
| TooComplex(u32) | Medium | Complexity score exceeds threshold |
| FlakyPattern(type) | High | Contains flaky pattern |
| ExcessiveMocking(usize) | Medium | Too many mock setups |
| IsolationIssue | High | Test affects shared state |
| TestsTooMuch | Medium | Tests too many concerns |
| SlowTest | Low | Test may be slow |
Severity Levels
From src/testing/rust/mod.rs:110-116:
- Critical: Fundamental test quality issues
- High: Significant problems affecting reliability
- Medium: Quality concerns worth addressing
- Low: Minor improvements possible
Framework Detection
Debtmap automatically detects testing frameworks to provide context-aware analysis.
Supported Frameworks
From src/testing/rust/mod.rs:54-66:
| Framework | Detection | Description |
|---|---|---|
| Std | #[test] attribute | Standard library test |
| Criterion | criterion crate usage | Benchmarking framework |
| Proptest | proptest! macro | Property-based testing |
| Quickcheck | quickcheck! macro | Property-based testing |
| Rstest | #[rstest] attribute | Parameterized testing |
Multi-Language Support
From src/testing/mod.rs:89-106:
#![allow(unused)]
fn main() {
// Test attribute detection
path_str == "test"
|| path_str == "tokio::test"
|| path_str == "async_std::test"
|| path_str == "bench"
|| path_str.ends_with("::test")
}
Framework detection also covers:
- Rust: #[test], #[tokio::test], proptest, rstest, criterion
- Python: pytest, unittest
- JavaScript: jest, mocha
Fast vs Slow Test Detection
Slow tests are identified as a quality issue (src/testing/rust/mod.rs:139):
#![allow(unused)]
fn main() {
RustTestIssueType::SlowTest
}
Detection criteria include:
- Long test functions (> 50 lines by default)
- Timing operations that suggest waiting
- External service calls that may have latency
Configuration
Basic Configuration
[test_quality]
enabled = true
complexity_threshold = 10
Advanced Configuration
From src/testing/complexity_detector.rs:9-13:
[test_quality]
enabled = true
# Maximum allowed test complexity score
complexity_threshold = 10
# Maximum number of mock setups per test
max_mock_setups = 5
# Maximum test function length (lines)
max_test_length = 50
Default values from TestComplexityDetector::new():
- max_test_complexity: 10
- max_mock_setups: 5
- max_test_length: 50
Common Issues and Solutions
Issue: Test Without Assertions
Detection: TestWithoutAssertions anti-pattern
Example of problematic code:
#![allow(unused)]
fn main() {
#[test]
fn test_without_assertion() {
let result = calculate(10);
// No assertion!
}
}
Fix:
#![allow(unused)]
fn main() {
#[test]
fn test_with_assertion() {
let result = calculate(10);
assert_eq!(result, 20);
}
}
Issue: Timing-Dependent Test
Detection: FlakinessType::TimingDependency
Example of problematic code:
#![allow(unused)]
fn main() {
#[test]
fn test_timing_dependent() {
let start = Instant::now();
do_work();
assert!(start.elapsed() < Duration::from_millis(100));
}
}
Fix:
#![allow(unused)]
fn main() {
#[test]
fn test_deterministic() {
let result = do_work();
assert!(result.is_success());
}
}
Issue: Excessive Mocking
Detection: ComplexitySource::ExcessiveMocking
Solution: Consider using real implementations, test doubles, or restructuring to reduce mock count below 5.
Issue: Tests with Loops
Detection: ComplexitySource::LoopInTest
Solution: Use parameterized tests with #[rstest] or property-based testing with proptest instead of loops.
See Also
- Coverage Integration - Combining coverage with test quality analysis
- Complexity Metrics - Understanding complexity scoring
- Configuration - Full configuration reference
Tiered Prioritization
Debtmap uses a sophisticated tiered prioritization system to surface critical architectural issues above simple testing gaps. This chapter explains the tier strategy, how to interpret tier classifications, and how to customize tier thresholds for your project.
Overview
The tiered prioritization system organizes technical debt into four distinct tiers based on impact, urgency, and architectural significance. This prevents “walls of similar-scored items” and ensures critical issues don’t get lost among minor problems.
Three Display Systems: Debtmap uses three complementary tier/priority systems:
- RecommendationTier (T1-T4): Internal classification based on architectural significance and testing needs. Controls tier weight application during scoring.
- Tier (Critical/High/Moderate/Low): Score-based tiers for terminal and markdown output using 90/70/50 thresholds (src/priority/mod.rs:371-386)
- Priority (critical/high/medium/low): Score-based tiers for JSON output using 100/50/20 thresholds (src/output/unified/priority.rs:23-32)
The configuration examples below control the RecommendationTier classification logic, which influences scoring through tier weights. The final display uses different score-based tiers depending on output format.
The Four Tiers
Tier 1: Critical Architecture
Description: God Objects, God Modules, excessive complexity requiring immediate architectural attention
Priority: Must address before adding new features
Weight: 1.5x (highest priority multiplier)
Impact: High impact on maintainability and team velocity
Classification Criteria (src/priority/tiers/pure.rs:26-41 and src/priority/tiers/predicates.rs):
Debtmap uses sophisticated multi-factor analysis for Tier 1 classification, not just raw cyclomatic complexity. Items qualify for Tier 1 if they meet any of these criteria:
- Critical Patterns: AsyncMisuse or ErrorSwallowing debt types (always Tier 1)
- High Final Score: final_score > 10.0 after exponential scaling
- Extreme Cyclomatic: Entropy-dampened cyclomatic complexity > 50
- High Cognitive Load: Cognitive complexity >= 20
- Deep Nesting: Nesting depth >= 5 levels
- High Weighted Complexity: complexity_factor > 5.0 (weighted: 30% cyclomatic + 70% cognitive)
Examples:
- Files with 15+ responsibilities (God Objects)
- Modules with 50+ methods (God Modules)
- ComplexityHotspot debt items meeting any of the criteria above
- Functions with cognitive complexity >= 20 (extreme mental load)
- Deeply nested code (5+ nesting levels)
- Circular dependencies affecting core modules
When to Address: Immediately, before sprint work begins. These issues compound over time and block progress.
# Focus on Tier 1 items
debtmap analyze . --min-priority high --top 5
Tier 2: Complex Untested
Description: Untested code with high complexity or critical dependencies, plus moderate complexity hotspots not severe enough for Tier 1.
Priority: Risk of bugs in critical paths
Weight: 1.0x (standard multiplier)
Action: Should be tested before refactoring to prevent regressions
Classification Criteria (src/priority/tiers/pure.rs:64-99):
Items qualify for Tier 2 through two distinct paths:
Path 1: Testing Gaps - Untested code meeting ANY of:
- Cyclomatic complexity ≥ 15
- Total dependencies ≥ 10
- Entry point functions with any coverage gap
Path 2: Moderate Complexity Hotspots - ComplexityHotspot items with meaningful but non-extreme complexity, meeting ANY of:
- Complexity factor >= 2.0 (weighted: 30% cyclomatic + 70% cognitive, scaled 0-10)
- Cognitive complexity >= 12 (moderate to high mental load)
- Nesting depth >= 3 (meaningful nested control flow)
- Entropy-dampened cyclomatic complexity 8-50 (after filtering repetitive patterns)
Examples:
- Functions with cyclomatic complexity ≥ 15 and 0% coverage
- Functions with 10+ dependencies and low test coverage
- Business logic entry points without tests
- Complex error handling without validation
- Moderate complexity hotspots (complexity_factor >= 2.0, cognitive >= 12, or nesting >= 3)
When to Address: Within current sprint. Add tests before making changes.
# See Tier 2 testing gaps
debtmap analyze . --lcov coverage.lcov --min-priority high
Tier 3: Testing Gaps
Description: Untested code with moderate complexity
Priority: Improve coverage to prevent future issues
Weight: 0.7x (reduced multiplier)
Action: Add tests opportunistically or during related changes
Examples:
- Functions with cyclomatic complexity 10-15 and low coverage
- Utility functions without edge case tests
- Moderate complexity with partial coverage
When to Address: Next sprint or when touching related code.
Tier 4: Maintenance
Description: Low-complexity issues and code quality improvements
Priority: Address opportunistically during other work
Weight: 0.3x (lowest multiplier)
Action: Fix when convenient, low urgency
Examples:
- Simple functions with minor code quality issues
- TODO markers in well-tested code
- Minor duplication in test code
When to Address: During cleanup sprints or when refactoring nearby code.
Configuration
Tier configuration is optional in .debtmap.toml. If not specified, Debtmap uses the balanced defaults shown below.
Default Tier Thresholds
[tiers]
# Tier 2 thresholds (Complex Untested)
t2_complexity_threshold = 15 # Cyclomatic complexity cutoff
t2_dependency_threshold = 10 # Dependency count cutoff
# Tier 3 thresholds (Testing Gaps)
t3_complexity_threshold = 10 # Lower complexity threshold
# Display options
show_t4_in_main_report = false # Hide Tier 4 from main output (default: false)
# Tier weights (multipliers applied to base scores)
t1_weight = 1.5 # Critical architecture
t2_weight = 1.0 # Complex untested
t3_weight = 0.7 # Testing gaps
t4_weight = 0.3 # Maintenance
To use tier-based prioritization with custom settings, add the [tiers] section to your .debtmap.toml configuration file:
# Analyze with custom tier configuration
debtmap analyze . --config .debtmap.toml
Tier Preset Configurations
Debtmap provides three built-in tier presets for different project needs:
Balanced (Default)
[tiers]
t2_complexity_threshold = 15
t2_dependency_threshold = 10
t3_complexity_threshold = 10
Suitable for most projects. Balances detection sensitivity with manageable issue counts.
Strict
[tiers]
t2_complexity_threshold = 10
t2_dependency_threshold = 7
t3_complexity_threshold = 7
For high-quality codebases or teams with strict quality standards. Flags more items as requiring attention.
Lenient
[tiers]
t2_complexity_threshold = 20
t2_dependency_threshold = 15
t3_complexity_threshold = 15
For legacy codebases or gradual technical debt reduction. Focuses on the most critical issues first.
Programmatic Access: These presets are also available as methods when using Debtmap as a library:
- TierConfig::balanced() - Equivalent to the balanced preset above
- TierConfig::strict() - Equivalent to the strict preset above
- TierConfig::lenient() - Equivalent to the lenient preset above
These methods can be used in Rust code to configure tier settings programmatically without manual TOML configuration.
Customizing Tier Thresholds
You can also create custom threshold configurations tailored to your project:
# Custom thresholds for specific project needs
[tiers]
t2_complexity_threshold = 12
t2_dependency_threshold = 8
t3_complexity_threshold = 8
Tier Weight Customization
Tier weights are multipliers applied to base debt scores during prioritization. A weight of 1.5 means items in that tier will score 50% higher than equivalent items in a tier with weight 1.0, pushing them higher in priority rankings.
Adjust weights based on your priorities:
# Emphasize testing over architecture
[tiers]
t1_weight = 1.2 # Reduce architecture weight
t2_weight = 1.3 # Increase testing weight
t3_weight = 0.8
t4_weight = 0.3
# Focus on architecture first
[tiers]
t1_weight = 2.0 # Maximize architecture weight
t2_weight = 1.0
t3_weight = 0.5
t4_weight = 0.2
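Because tier weights are plain multipliers, their effect is easy to reason about. A sketch (the function name is assumed for illustration):
fn weighted_score(base_score: f64, tier_weight: f64) -> f64 {
    base_score * tier_weight
}

fn main() {
    // Two items with the same base score land in different display tiers:
    println!("{}", weighted_score(60.0, 1.5)); // Tier 1 weight -> 90.0 (Critical)
    println!("{}", weighted_score(60.0, 0.7)); // Tier 3 weight -> 42.0
}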
Use Cases
Sprint Planning
Use tiered prioritization to allocate work:
# See Tier 1 items for architectural planning
debtmap analyze . --min-priority high --top 5
# See Tier 2/3 for testing sprint work
debtmap analyze . --lcov coverage.lcov --min-priority medium
Code Review Focus
Prioritize review attention based on tiers:
- Tier 1: Architectural review required, senior dev attention
- Tier 2: Test coverage validation critical
- Tier 3: Standard review process
- Tier 4: Quick review or automated checks
Refactoring Strategy
# Phase 1: Address Tier 1 architectural issues
debtmap analyze . --min-priority high
# Phase 2: Add tests for Tier 2 complex code
debtmap analyze . --lcov coverage.lcov --min-priority high
# Phase 3: Improve Tier 3 coverage
debtmap analyze . --lcov coverage.lcov --min-priority medium
Best Practices
- Always address Tier 1 before feature work - Architectural issues compound
- Test Tier 2 items before refactoring - Avoid regressions
- Batch Tier 3 items - Address multiple in one sprint
- Defer Tier 4 items - Only fix during cleanup or when convenient
- Track tier distribution over time - Aim to reduce Tier 1/2 counts
Interpreting Tier Output
Terminal Output
Terminal output displays items grouped by score-based tiers:
TECHNICAL DEBT ANALYSIS - PRIORITY TIERS
Critical (score >= 90)
src/services.rs - God Object (score: 127.5)
src/core/engine.rs - Circular dependency (score: 95.2)
High (score 70-89.9)
src/processing/transform.rs:145 - UntestableComplexity (score: 85.0)
src/api/handlers.rs - God Module (score: 78.3)
...
Moderate (score 50-69.9)
src/utils/parser.rs:220 - TestingGap (score: 62.1)
...
Low (score < 50)
[Items with score < 50 appear here]
Note: The scores shown reflect tier weight multipliers applied during classification. Items classified as Tier 1 (Critical Architecture) receive a 1.5x weight boost, which often elevates them into the Critical or High score ranges.
JSON Output
JSON output uses score-based priority levels with different thresholds than terminal output:
{
"summary": {
"score_distribution": {
"critical": 2,
"high": 5,
"medium": 12,
"low": 45
}
},
"items": [
{
"type": "File",
"score": 127.5,
"priority": "critical",
"location": {
"file": "src/services.rs"
},
"debt_type": "GodObject"
},
{
"type": "Function",
"score": 85.0,
"priority": "high",
"location": {
"file": "src/processing/transform.rs",
"line": 145,
"function": "process_data"
},
"debt_type": "UntestableComplexity"
}
]
}
The priority field is derived from the score field using these thresholds (src/output/unified/priority.rs:23-32):
- critical: score >= 100.0
- high: score >= 50.0
- medium: score >= 20.0
- low: score < 20.0
Note: JSON output uses different thresholds than terminal output. Terminal/markdown uses the Tier enum (90/70/50 thresholds), while JSON uses the Priority enum (100/50/20 thresholds) for broader score distribution.
Note: While RecommendationTier (T1-T4) classifications exist internally for applying tier weights, they are not included in JSON output. The output shows final calculated scores and their corresponding priority levels.
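A sketch of the two score-to-label mappings side by side, assuming only the thresholds quoted in this chapter:
// JSON output (Priority): 100/50/20 thresholds
fn json_priority(score: f64) -> &'static str {
    if score >= 100.0 { "critical" }
    else if score >= 50.0 { "high" }
    else if score >= 20.0 { "medium" }
    else { "low" }
}

// Terminal/markdown output (Tier): 90/70/50 thresholds
fn terminal_tier(score: f64) -> &'static str {
    if score >= 90.0 { "Critical" }
    else if score >= 70.0 { "High" }
    else if score >= 50.0 { "Moderate" }
    else { "Low" }
}
// The same item can read differently per format: a score of 95.0 is
// "Critical" in the terminal but "high" in JSON.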
Troubleshooting
Issue: Too many Tier 1 items
Solution: Lower tier weights or increase thresholds temporarily:
[tiers]
t1_weight = 1.2 # Reduce from 1.5
Issue: Not enough items in Tier 1
Solution: Check if god object detection is enabled:
[god_object_detection]
enabled = true
Issue: All items in Tier 4
Solution: Lower minimum thresholds:
[thresholds]
minimum_debt_score = 1.0
minimum_cyclomatic_complexity = 2
See Also
- Scoring Strategies - Understanding file-level vs function-level scoring
- Configuration - Complete configuration reference
- Analysis Guide - Detailed metric explanations and analysis techniques
Validation and Quality Gates
The validate command enforces quality gates in your development workflow, making it ideal for CI/CD integration. Unlike the analyze command which focuses on exploration and reporting, validate checks your codebase against configured thresholds and returns appropriate exit codes for automated workflows.
Table of Contents
- Validate vs Analyze
- Quick Start
- Understanding Density-Based Validation
- Configuration Setup
- Validation Metrics
- Exit Codes and CI Integration
- Coverage Integration
- Context-Aware Validation
- CI/CD Examples
- Migrating from Deprecated Thresholds
- Troubleshooting
- Best Practices
Validate vs Analyze
Understanding when to use each command is crucial:
| Aspect | validate | analyze |
|---|---|---|
| Purpose | Enforce quality gates | Explore and understand debt |
| Exit Codes | Returns non-zero on failure | Always returns 0 (unless error) |
| Thresholds | From .debtmap.toml config | Command-line flags |
| Use Case | CI/CD pipelines, pre-commit hooks | Interactive analysis, reports |
| Output Focus | Pass/fail with violation details | Comprehensive metrics and insights |
| Configuration | Requires .debtmap.toml | Works without config file |
Rule of thumb: Use validate for automation and analyze for investigation.
Quick Start
1. Initialize configuration:
   debtmap init
2. Edit .debtmap.toml to set thresholds:
   [thresholds.validation]
   max_debt_density = 50.0         # Debt items per 1000 LOC
   max_average_complexity = 10.0   # Average cyclomatic complexity
   max_codebase_risk_score = 7.0   # Overall risk level (1-10)
3. Run validation:
   debtmap validate .
4. Check exit code:
   echo $?  # 0 = pass, non-zero = fail
Understanding Density-Based Validation
Debtmap uses density-based metrics as the primary quality measure. This approach provides several advantages over traditional absolute count metrics.
Why Density Matters
Traditional metrics like “maximum 50 high-complexity functions” fail as your codebase grows:
Scenario: Your team adds 10,000 LOC of high-quality code
- Old metric: "max 50 complex functions" → FAILS (now 55 total)
- Density metric: "max 50 per 1000 LOC" → PASSES (density improved)
Scale-dependent metrics (absolute counts):
- Grow linearly with codebase size
- Require constant threshold adjustments
- Punish healthy growth
- Don’t reflect actual code quality
Density metrics (per 1000 LOC):
- Remain stable as codebase grows
- Measure true quality ratios
- No adjustment needed for growth
- Directly comparable across projects
Calculating Debt Density
Debt Density = (Total Debt Items / Total LOC) × 1000
Example:
- 25 debt items in 5,000 LOC project
- Density = (25 / 5000) × 1000 = 5.0 debt items per 1000 LOC
This density remains meaningful whether your codebase is 5,000 or 500,000 LOC.
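To see where you stand before picking a threshold, you can approximate the current density yourself. A rough sketch, assuming the JSON report's items array (see JSON Output) and a crude wc-based line count; substitute your preferred LOC tool:
# Count debt items from the JSON report and divide by approximate LOC
ITEMS=$(debtmap analyze . --format json | jq '.items | length')
LOC=$(find src -name '*.rs' -exec cat {} + | wc -l)
awk -v i="$ITEMS" -v l="$LOC" 'BEGIN { printf "debt density: %.1f per 1000 LOC\n", (i / l) * 1000 }'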
Recommended Density Thresholds
| Project Type | max_debt_density | Rationale |
|---|---|---|
| New/Greenfield | 20.0 | High quality bar for new code |
| Active Development | 50.0 | Balanced quality/velocity (default) |
| Legacy Modernization | 100.0 | Prevent regression during refactoring |
| Mature/Critical | 30.0 | Maintain quality in stable systems |
Configuration Setup
Creating Configuration File
The debtmap init command generates a .debtmap.toml with sensible defaults:
debtmap init
This creates:
[thresholds.validation]
# Primary quality metrics (scale-independent)
max_average_complexity = 10.0
max_debt_density = 50.0
max_codebase_risk_score = 7.0
# Optional metrics
min_coverage_percentage = 0.0 # Disabled by default
# Safety net (high ceiling for extreme cases)
max_total_debt_score = 10000
Editing Thresholds
Edit the [thresholds.validation] section to match your quality requirements:
[thresholds.validation]
# Enforce stricter quality for new project
max_debt_density = 30.0 # Tighter density requirement
max_average_complexity = 8.0 # Lower complexity tolerance
max_codebase_risk_score = 6.0 # Reduced risk threshold
min_coverage_percentage = 80.0 # Require 80% test coverage
Override via Command Line
You can override the density threshold from the command line:
# Temporarily use stricter threshold
debtmap validate . --max-debt-density 40.0
Validation Metrics
Debtmap organizes validation metrics into three categories:
Primary Metrics (Scale-Independent)
These are the core quality measures that every project should monitor:
- max_average_complexity (default: 10.0)
  - Average cyclomatic complexity per function
  - Measures typical function complexity across the codebase
  - Lower values indicate simpler, more maintainable code
  max_average_complexity = 10.0
- max_debt_density (default: 50.0) - BLOCKING METRIC
  - Debt items per 1000 lines of code
  - Scale-independent quality measure
  - Remains stable as the codebase grows
  - This is the only metric that determines validation pass/fail
  max_debt_density = 50.0
- max_codebase_risk_score (default: 7.0)
  - Overall risk level combining complexity, coverage, and criticality
  - Score ranges from 1 (low risk) to 10 (high risk)
  - Considers context-aware analysis when enabled
  max_codebase_risk_score = 7.0
Optional Metrics
Configure these when you want additional quality enforcement:
- min_coverage_percentage (default: 0.0 - disabled)
  - Minimum required test coverage percentage
  - Only enforced when coverage data is provided via --coverage-file
  - Set to 0.0 to disable coverage requirements
  min_coverage_percentage = 75.0  # Require 75% coverage
Safety Net Metrics
High ceilings to catch extreme cases:
- max_total_debt_score (default: 10000)
  - Absolute ceiling on total technical debt
  - Prevents runaway growth even if density stays low
  - Rarely triggers in normal operation
  max_total_debt_score = 10000
How Validation Works
Debt density is the sole blocking metric. The validation pass/fail decision is based only on whether max_debt_density is exceeded (src/commands/validate/thresholds.rs:60-61).
Other metrics like average complexity, codebase risk score, and coverage percentage are displayed in the output for visibility, but they do not cause validation to fail. When these metrics exceed their thresholds, the output shows [ERROR] as an informational warning, helping you identify areas for improvement even when validation passes.
This design provides:
- Simple CI integration - One clear quality signal (debt density)
- Visibility without blocking - See all threshold violations without false failures
- Scale-independent quality - Debt density remains meaningful as your codebase grows
Recommended Remediation Priority
When improving code quality, address issues in this order:
1. Critical: max_debt_density violations (this actually blocks validation)
2. High: max_average_complexity violations (function-level quality)
3. High: max_codebase_risk_score violations (overall risk)
4. Medium: min_coverage_percentage violations (test coverage)
5. Low: max_total_debt_score violations (extreme cases only)
Note: Only debt density violations block validation. The other items are prioritized for improving code quality, not for making validation pass.
Exit Codes and CI Integration
The validate command uses exit codes to signal success or failure:
Exit Code Behavior
debtmap validate .
echo $?
Exit codes:
- 0 - Success: all thresholds passed
- Non-zero - Failure: one or more thresholds exceeded or errors occurred
Implementation detail: When validation fails, the command uses anyhow::bail!("Validation failed") to return a non-zero exit code. This ensures the failure propagates correctly to CI/CD systems (src/commands/validate.rs:166).
Using Exit Codes in CI
Exit codes integrate naturally with CI/CD systems:
GitHub Actions:
- name: Validate code quality
run: debtmap validate .
# Step fails automatically if exit code is non-zero
GitLab CI:
script:
- debtmap validate .
# Pipeline fails if exit code is non-zero
Shell scripts:
#!/bin/bash
if debtmap validate .; then
echo "✅ Validation passed"
else
echo "❌ Validation failed"
exit 1
fi
Understanding Validation Output
The output format shown below is the actual format produced by src/utils/validation_printer.rs.
Success output:
[OK] Validation PASSED
Primary Quality Metrics:
Debt Density: 32.5 per 1K LOC (threshold: 50.0)
└─ Using 65% of max density (35% headroom)
Average complexity: 7.2 (threshold: 10.0)
Codebase risk score: 5.8 (threshold: 7.0)
Codebase Statistics (informational):
High complexity functions: 8
Technical debt items: 42
Total debt score: 1250 (safety net threshold: 10000)
All validation checks passed
Failure output:
[ERROR] Validation FAILED - Some metrics exceed thresholds
Primary Quality Metrics:
Debt Density: 65.8 per 1K LOC (threshold: 50.0)
└─ Using 132% of max density (-32% headroom)
Average complexity: 12.3 (threshold: 10.0)
Codebase risk score: 5.2 (threshold: 7.0)
Codebase Statistics (informational):
High complexity functions: 25
Technical debt items: 87
Total debt score: 2100 (safety net threshold: 10000)
Failed checks:
[ERROR] Average complexity: 12.3 > 10.0
[ERROR] Debt density: 65.8 per 1K LOC > 50.0
Note: In the example above, validation failed because debt density (65.8) exceeds its threshold (50.0). The average complexity [ERROR] is shown as an informational warning but does not affect the exit code. Even if average complexity were within its threshold, validation would still fail due to debt density. See How Validation Works for details.
The output format emphasizes:
- Debt Density as the primary metric - shown first with percentage usage
- Headroom visualization - shows how much threshold capacity remains
- Clear failure indicators - [ERROR] prefix for failed checks
- Informational statistics - absolute counts shown as context, not validation criteria
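The headroom figures are plain ratios against the threshold: in the success example, 32.5 / 50.0 = 65% of the allowed density is used, leaving 35% headroom; in the failure example, 65.8 / 50.0 ≈ 132%, which is reported as -32% headroom.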
Coverage Integration
Integrate test coverage data to enable risk-based validation:
Generating Coverage Data
For Rust projects with cargo-tarpaulin:
cargo tarpaulin --out Lcov --output-dir target/coverage
For Python projects with pytest-cov:
pytest --cov --cov-report=lcov:coverage/lcov.info
For JavaScript projects with Jest:
jest --coverage --coverageReporters=lcov
Running Validation with Coverage
debtmap validate . --coverage-file target/coverage/lcov.info
Benefits of Coverage Integration
With coverage data, validation gains additional insights:
- Risk-based prioritization - Identifies untested complex code
- Coverage threshold enforcement - via
min_coverage_percentage - Enhanced risk scoring - Combines complexity + coverage + context
- Better failure diagnostics - Shows which untested areas need attention
Coverage-Enhanced Output
debtmap validate . --coverage-file coverage/lcov.info -vv
Output includes:
- Overall coverage percentage
- High-risk uncovered functions
- Coverage-adjusted risk scores
- Prioritized remediation recommendations
Context-Aware Validation
Enable context-aware analysis for deeper risk insights:
Available Context Providers
- critical_path - Analyzes call graph to find execution bottlenecks
- dependency - Identifies highly-coupled modules
- git_history - Detects frequently-changed code (churn)
Enabling Context Providers
Enable all providers:
debtmap validate . --enable-context
Select specific providers:
debtmap validate . --enable-context --context-providers critical_path,git_history
Disable specific providers:
debtmap validate . --enable-context --disable-context dependency
Context-Aware Configuration
Add context settings to .debtmap.toml:
[analysis]
enable_context = true
context_providers = ["critical_path", "git_history"]
Then run validation:
debtmap validate . # Uses config settings
Context Benefits for Validation
Context-aware analysis improves risk scoring by:
- Prioritizing frequently-called functions
- Weighting high-churn code more heavily
- Identifying architectural bottlenecks
- Surfacing critical code paths
CI/CD Examples
GitHub Actions
Complete workflow with coverage generation and validation:
name: Code Quality Validation
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
jobs:
validate:
name: Technical Debt Validation
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0 # Full history for git context
- name: Setup Rust
uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt, clippy
- name: Cache cargo dependencies
uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
- name: Install cargo-tarpaulin
run: |
if ! command -v cargo-tarpaulin &> /dev/null; then
cargo install cargo-tarpaulin
fi
- name: Build debtmap
run: cargo build --release
- name: Generate coverage data
run: cargo tarpaulin --out Lcov --output-dir target/coverage --timeout 300
- name: Run debtmap validation with coverage
run: |
if [ -f "target/coverage/lcov.info" ]; then
./target/release/debtmap validate . \
--coverage-file target/coverage/lcov.info \
--enable-context \
--format json \
--output debtmap-report.json
else
echo "Warning: LCOV file not found, running without coverage"
./target/release/debtmap validate . \
--format json \
--output debtmap-report.json
fi
- name: Upload debtmap report
if: always()
uses: actions/upload-artifact@v4
with:
name: debtmap-analysis-artifacts
path: |
debtmap-report.json
target/coverage/lcov.info
retention-days: 7
GitLab CI
stages:
- test
- quality
variables:
CARGO_HOME: $CI_PROJECT_DIR/.cargo
debtmap:
stage: quality
image: rust:latest
cache:
paths:
- .cargo/
- target/
before_script:
# Install debtmap and coverage tools
- cargo install debtmap
- cargo install cargo-tarpaulin
script:
# Generate coverage
- cargo tarpaulin --out Lcov --output-dir coverage
# Validate with debtmap
- debtmap validate . --coverage-file coverage/lcov.info -v
artifacts:
when: always
paths:
- coverage/
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura.xml
CircleCI
version: 2.1
jobs:
validate:
docker:
- image: cimg/rust:1.75
steps:
- checkout
- restore_cache:
keys:
- cargo-{{ checksum "Cargo.lock" }}
- run:
name: Install tools
command: |
cargo install debtmap
cargo install cargo-tarpaulin
- run:
name: Generate coverage
command: cargo tarpaulin --out Lcov
- run:
name: Validate code quality
command: debtmap validate . --coverage-file lcov.info
- save_cache:
key: cargo-{{ checksum "Cargo.lock" }}
paths:
- ~/.cargo
- target
workflows:
version: 2
quality:
jobs:
- validate
Migrating from Deprecated Thresholds
Debtmap version 0.3.0 deprecated scale-dependent absolute count metrics in favor of density-based metrics.
Deprecated Metrics
The following metrics will be removed in v1.0:
| Deprecated Metric | Migration Path |
|---|---|
| max_high_complexity_count | Use max_debt_density |
| max_debt_items | Use max_debt_density |
| max_high_risk_functions | Use max_debt_density + max_codebase_risk_score |
Migration Example
Old configuration (deprecated):
[thresholds.validation]
max_high_complexity_count = 50 # ❌ Scale-dependent
max_debt_items = 100 # ❌ Scale-dependent
max_high_risk_functions = 20 # ❌ Scale-dependent
New configuration (recommended):
[thresholds.validation]
max_debt_density = 50.0 # ✅ Scale-independent
max_average_complexity = 10.0 # ✅ Quality ratio
max_codebase_risk_score = 7.0 # ✅ Risk level
Calculating Equivalent Density Threshold
Convert your old absolute thresholds to density:
Old: max_debt_items = 100 in 10,000 LOC codebase
New: max_debt_density = (100 / 10000) × 1000 = 10.0
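The same arithmetic works as a quick shell calculation; the two input values below are hypothetical placeholders for your project's numbers:
DEBT_ITEMS_LIMIT=100   # your old max_debt_items value
TOTAL_LOC=10000        # your current codebase size
awk -v n="$DEBT_ITEMS_LIMIT" -v l="$TOTAL_LOC" 'BEGIN { printf "max_debt_density = %.1f\n", (n / l) * 1000 }'
# -> max_debt_density = 10.0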
Deprecation Warnings
When you run validate with deprecated metrics, you’ll see:
⚠️ DEPRECATION WARNING:
The following validation thresholds are deprecated:
- max_high_complexity_count
- max_debt_items
These scale-dependent metrics will be removed in v1.0.
Please migrate to density-based validation:
- Use 'max_debt_density' instead of absolute counts
- Density metrics remain stable as your codebase grows
Migration Timeline
- v0.3.0 - Density metrics introduced, old metrics deprecated
- v0.4.0 - v0.9.x - Deprecation warnings shown
- v1.0.0 - Deprecated metrics removed
Troubleshooting
Debugging Validation Failures
Use verbosity flags to understand why validation failed:
Level 1: Basic details (-v)
debtmap validate . -v
Shows which thresholds failed, by how much, and timing breakdown:
- Call graph building time
- Trait resolution time
- Coverage loading time
- Individual analysis phase durations
Suppressing timing output:
# Set DEBTMAP_QUIET to disable timing information (src/commands/validate.rs:406)
DEBTMAP_QUIET=1 debtmap validate .
Level 2: Detailed breakdown (-vv)
debtmap validate . -vv
Shows everything from -v plus:
- Score calculation factors and weights
- Top violating functions with metrics
- Detailed phase timing information
- Risk score component breakdown
Level 3: Full diagnostic output (-vvv)
debtmap validate . -vvv
Shows complete debug information:
- All debt items with full details
- Complete risk calculations for each function
- All timing information including sub-phases
- File-level and function-level analysis data
- Context provider outputs (if enabled)
Common Issues
Issue: Validation fails but output unclear
# Solution: Increase verbosity
debtmap validate . -vv
Issue: Want to see only the worst problems
# Solution: Use --top flag
debtmap validate . --top 10 -v
Issue: Output is too verbose for CI logs
# Solution: Use lower verbosity or filter output
debtmap validate . --top 10 # Show only top 10 issues
# or reduce output detail
debtmap validate . # Default verbosity (clean output)
At default verbosity (0), validation output is already concise and CI-friendly, showing only metrics and pass/fail status.
Issue: Validation passes locally but fails in CI
# Possible causes:
# 1. Different code (stale local branch)
# 2. Different config file (check .debtmap.toml in CI)
# 3. Missing coverage data (check LCOV generation in CI)
# Debug in CI:
debtmap validate . -vvv # Maximum verbosity
Issue: Coverage threshold fails unexpectedly
# Check if coverage file is being read
debtmap validate . --coverage-file coverage/lcov.info -v
# Verify coverage file exists and is valid
ls -lh coverage/lcov.info
Issue: Context providers causing performance issues
# Disable expensive providers
debtmap validate . --enable-context --disable-context git_history
Issue: Semantic analysis causing errors or unexpected behavior
Debtmap uses semantic analysis by default, powered by tree-sitter for deep AST (Abstract Syntax Tree) analysis. This provides accurate understanding of code structure, control flow, and complexity patterns.
However, semantic analysis may encounter issues with:
- Unsupported or experimental language features
- Malformed or incomplete syntax
- Complex macro expansions
- Very large files that timeout during parsing
# Solution: Disable semantic analysis with fallback mode
debtmap validate . --semantic-off
When semantic analysis is disabled with --semantic-off, debtmap falls back to basic syntax analysis, which is faster but less accurate for complexity calculations. Use this flag if:
- You encounter parsing errors or timeouts
- You are working with bleeding-edge language features
- You need faster validation at the cost of precision
Validation Report Generation
Generate detailed reports for debugging:
JSON format for programmatic analysis:
debtmap validate . --format json --output validation-report.json
cat validation-report.json | jq '.validation_details'
Markdown format for documentation:
debtmap validate . --format markdown --output validation-report.md
Terminal format with filtering:
debtmap validate . --format terminal --top 20 -vv
Best Practices
Setting Initial Thresholds
1. Establish baseline:
# Run analysis to see current metrics
debtmap analyze . --format json > baseline.json
cat baseline.json | jq '.unified_analysis.debt_density'
2. Set pragmatic thresholds:
[thresholds.validation]
# Start slightly above current values to prevent regression
max_debt_density = 60.0 # Current: 55.0
max_average_complexity = 12.0 # Current: 10.5
3. Gradually tighten:
# After 1 month of cleanup
max_debt_density = 50.0
max_average_complexity = 10.0
Progressive Threshold Tightening
Month 1-2: Prevent regression
max_debt_density = 60.0 # Above current baseline
Month 3-4: Incremental improvement
max_debt_density = 50.0 # Industry standard
Month 5-6: Quality leadership
max_debt_density = 30.0 # Best-in-class
Project-Specific Recommendations
Greenfield projects:
# Start with high quality bar
max_debt_density = 20.0
max_average_complexity = 8.0
min_coverage_percentage = 80.0
Active development:
# Balanced quality/velocity
max_debt_density = 50.0
max_average_complexity = 10.0
min_coverage_percentage = 70.0
Legacy modernization:
# Prevent regression during refactoring
max_debt_density = 100.0
max_average_complexity = 15.0
min_coverage_percentage = 50.0
Pre-Commit Hook Integration
Add validation as a pre-commit hook:
#!/bin/bash
# .git/hooks/pre-commit
echo "Running debtmap validation..."
if debtmap validate . -v; then
echo "✅ Validation passed"
exit 0
else
echo "❌ Validation failed - commit blocked"
exit 1
fi
Make it executable:
chmod +x .git/hooks/pre-commit
Performance Optimization
Enable parallel processing: Validation uses parallel processing by default for fast execution on multi-core systems.
Disable for resource-constrained environments:
# Limit parallelism
debtmap validate . --jobs 2
# Disable completely
debtmap validate . --no-parallel
Performance characteristics:
- Parallel call graph construction
- Multi-threaded file analysis
- Same performance as the analyze command
Monitoring Trends
Track validation metrics over time:
# Generate timestamped reports
debtmap validate . --format json --output "reports/validation-$(date +%Y%m%d).json"
# Compare trends
jq -s 'map(.unified_analysis.debt_density)' reports/validation-*.json
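To turn the trend into a gate, you can compare the two most recent reports. A sketch assuming at least two timestamped reports exist and that lexicographic filename order matches chronological order:
# Fail if the newest report's density exceeds the previous one
jq -s 'map(.unified_analysis.debt_density)' reports/validation-*.json \
  | jq -e '.[-1] <= .[-2]' > /dev/null \
  || { echo "Debt density increased since last report"; exit 1; }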
Documentation
Document your threshold decisions:
# .debtmap.toml
[thresholds.validation]
# Rationale: Team agreed 50.0 density balances quality and velocity
# Review: Quarterly (next: 2025-04-01)
max_debt_density = 50.0
# Rationale: Enforces single-responsibility principle
# Review: After 3 months of metrics
max_average_complexity = 10.0
Summary
The validate command provides automated quality gates for CI/CD integration:
- Use density-based metrics for scale-independent quality measurement
- Configure in .debtmap.toml for consistent, version-controlled thresholds
- Enable coverage and context for risk-based validation
- Migrate from deprecated metrics to density-based approach
- Debug with verbosity flags when validation fails unexpectedly
- Tighten thresholds progressively as code quality improves
Next steps:
- Review Configuration Reference for detailed threshold options
- See Examples for more CI/CD integration patterns
- Check CLI Reference for complete command documentation
Metrics Reference
Comprehensive guide to all signals measured by Debtmap and how to interpret them.
Debtmap acts as a sensor, providing quantified signals about code complexity and risk. These signals are designed for consumption by AI coding tools and developers alike.
Metric Categories (Spec 118)
Debtmap distinguishes between two fundamental categories of metrics:
Measured Metrics
Definition: Metrics directly computed from the Abstract Syntax Tree (AST) through precise analysis.
These metrics are:
- Deterministic: Same code always produces the same metric value
- Precise: Exact counts from syntax analysis, not estimates
- Suitable for thresholds: Reliable for CI/CD quality gates
- Language-specific: Computed using language parsers (syn for Rust, tree-sitter for others)
Measured metrics include:
| Metric | Description | Example |
|---|---|---|
| cyclomatic_complexity | Count of decision points (if, match, while, for, etc.) | Function with 3 if statements = complexity 4 |
| cognitive_complexity | Weighted measure of code understandability | Nested loops increase cognitive load |
| nesting_depth | Maximum levels of nested control structures | 3 nested if statements = depth 3 |
| loc | Lines of code in the function | Physical line count |
| parameter_count | Number of function parameters | fn foo(a: i32, b: String) = 2 |
Estimated Metrics
Definition: Heuristic approximations calculated using formulas, not direct AST measurements.
These metrics are:
- Heuristic: Based on mathematical formulas and assumptions
- Approximate: Close estimates, not exact counts
- Useful for prioritization: Help estimate effort and risk
- Not suitable for hard thresholds: Use for relative comparisons, not absolute gates
Estimated metrics include:
| Metric | Formula | Purpose | Example |
|---|---|---|---|
| est_branches | max(nesting, 1) × cyclomatic ÷ 3 | Estimate test cases needed for branch coverage | Complexity 12, nesting 3 → ~12 branches |
Important: The est_branches metric was previously called branches. It was renamed in Spec 118 to make it explicit that this is an estimate, not a precise count from the AST.
Why the Distinction Matters
For Code Quality Gates
# GOOD: Use measured metrics for CI/CD thresholds
debtmap validate . --threshold-complexity 15
# AVOID: Don't use estimated metrics for hard gates
# (est_branches is not exposed as a threshold flag)
Rationale: Measured metrics are deterministic and precise, making them suitable for build-breaking quality gates.
For Prioritization
# GOOD: Use est_branches for prioritization
debtmap analyze . --top 10 # Sorts by est_branches (among other factors)
# GOOD: Estimated metrics help understand testing effort
debtmap analyze . --lcov coverage.info --verbose
Rationale: Estimated metrics provide useful heuristics for understanding where to focus testing and refactoring efforts.
For Comparison Across Codebases
- Measured metrics: Comparable across projects (cyclomatic 10 means the same everywhere)
- Estimated metrics: Project-specific heuristics (est_branches depends on nesting patterns)
Detailed Metric Descriptions
Cyclomatic Complexity (Measured)
What it measures: The number of linearly independent paths through a function’s control flow.
How it’s calculated:
- Start with a base of 1
- Add 1 for each decision point:
  - if, else if
  - match arms
  - while, for, loop
  - &&, || in conditions
  - ? operator (early return)
Example:
fn example(x: i32, y: i32) -> bool {
    if x > 0 {            // +1
        if y > 0 {        // +1
            true
        } else {          // implicit in if/else
            false
        }
    } else if x < 0 {     // +1
        false
    } else {
        y == 0            // no additional branches
    }
}
// Cyclomatic complexity = 1 + 3 = 4
Thresholds:
- 1-5: Simple, easy to test
- 6-10: Moderate, manageable complexity
- 11-20: Complex, consider refactoring
- 21+: Very complex, high maintenance cost
Cognitive Complexity (Measured)
What it measures: How difficult the code is for humans to understand.
How it differs from cyclomatic:
- Weights nested structures more heavily (nested if is worse than sequential if)
- Ignores shorthand structures (early returns, guard clauses)
- Focuses on readability, not just logic paths
Example:
fn cyclomatic_low_cognitive_low(status: Status) -> bool {
    match status {            // Cyclomatic: 4, Cognitive: 1
        Status::Active => true,
        Status::Pending => false,
        Status::Closed => false,
        Status::Error => false,
    }
}

fn cyclomatic_low_cognitive_high(x: i32, y: i32, z: i32) -> bool {
    if x > 0 {
        if y > 0 {            // Nested: +2 cognitive penalty
            if z > 0 {        // Deeply nested: +3 cognitive penalty
                return true;
            }
        }
    }
    false
}
// Cyclomatic: 4, Cognitive: 7 (nesting penalty applied)
Thresholds:
- 1-5: Easy to understand
- 6-10: Moderate mental load
- 11-15: Difficult to follow
- 16+: Refactor recommended
Estimated Branches (Estimated)
What it estimates: Approximate number of execution paths that would need test coverage.
Formula:
est_branches = max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
Why this formula:
- Nesting multiplier: Deeper nesting creates more combinations
- Cyclomatic base: Higher complexity → more paths
- ÷ 3 adjustment: Empirical factor to align with typical branch coverage needs
Example scenarios:
| Cyclomatic | Nesting | est_branches | Interpretation |
|---|---|---|---|
| 3 | 1 | 1 | Simple linear code |
| 12 | 1 | 4 | Multiple sequential branches |
| 12 | 3 | 12 | Nested conditions, many paths |
| 20 | 5 | 33 | Complex nested logic |
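The table can be reproduced directly from the formula. A small sketch; truncating toward zero after the division is an assumption about the implementation's rounding:
# Usage: est_branches <nesting> <cyclomatic>
est_branches() {
  awk -v n="$1" -v c="$2" 'BEGIN { m = (n > 1) ? n : 1; printf "%d\n", int(m * c / 3) }'
}
est_branches 1 3    # -> 1
est_branches 3 12   # -> 12
est_branches 5 20   # -> 33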
Use cases:
- Estimating test case requirements
- Prioritizing untested complex code
- Understanding coverage gaps
Limitations:
- Not a precise count: This is a heuristic approximation
- Don’t use for coverage percentage calculation: Use actual coverage tools
- Varies by code style: Heavily nested code scores higher
Terminology Change (Spec 118)
Before: branches
Previously, this metric was displayed as branches=X, which was confusing because:
- Users thought it was a precise count from AST analysis
- It was mistaken for cyclomatic complexity (actual branch count)
- The estimation nature was not obvious
After: est_branches
Now displayed as est_branches=X to:
- Make estimation explicit: “est_” prefix indicates this is approximate
- Avoid confusion: Clearly different from cyclomatic complexity
- Set correct expectations: Users know this is a heuristic, not a measurement
Migration Guide
Terminal Output:
- Old: COMPLEXITY: cyclomatic=12, branches=8, cognitive=15
- New: COMPLEXITY: cyclomatic=12, est_branches=8, cognitive=15
Code:
- Internal variable names updated from branches to est_branches
- Comments added explaining the estimation formula
JSON Output:
- No change: the ComplexityMetrics struct does not include this field
- est_branches is calculated on-demand for display purposes only
Practical Usage Examples
Example 1: Code Quality Gate
# Fail build if any function exceeds cyclomatic complexity 15
debtmap validate . --threshold-complexity 15 --max-high 0
# Why: Cyclomatic is measured, precise, and repeatable
Example 2: Prioritize Testing Effort
# Show top 10 functions by risk (uses est_branches in scoring)
debtmap analyze . --lcov coverage.info --top 10
# Functions with high est_branches and low coverage appear first
Example 3: Understanding Test Requirements
# Verbose output shows est_branches for each function
debtmap analyze . --verbose
# Output:
# └─ COMPLEXITY: cyclomatic=12, est_branches=8, cognitive=15, nesting=2
#
# Interpretation: ~8 test cases likely needed for good branch coverage
Example 4: Explaining Metrics to Team
# Display comprehensive metric definitions
debtmap analyze --explain-metrics
# Shows:
# - Measured vs Estimated categories
# - Formulas and thresholds
# - When to use each metric
Metric Selection Guide
When to Use Cyclomatic Complexity
✅ Use for:
- CI/CD quality gates
- Code review guidelines
- Consistent cross-project comparison
- Identifying refactoring candidates
❌ Don’t use for:
- Estimating test effort (use est_branches)
- Readability assessment (use cognitive complexity)
When to Use Cognitive Complexity
✅ Use for:
- Readability reviews
- Identifying hard-to-maintain code
- Onboarding difficulty assessment
❌ Don’t use for:
- Test coverage planning
- Strict quality gates (more subjective than cyclomatic)
When to Use est_branches
✅ Use for:
- Estimating test case requirements
- Prioritizing test coverage work
- Understanding coverage gaps
❌ Don’t use for:
- CI/CD quality gates (it’s an estimate)
- Calculating coverage percentages (use actual coverage data)
- Cross-project comparison (formula is heuristic)
Combining Metrics for Insights
High Complexity, Low Coverage
cyclomatic=18, est_branches=12, coverage=0%
Interpretation: High-risk code needing ~12 test cases for adequate coverage.
Action: Prioritize writing tests, consider refactoring.
High Cyclomatic, Low Cognitive
cyclomatic=15, cognitive=5
Interpretation: Many branches, but simple linear logic (e.g., validation checks).
Action: Acceptable pattern, tests should be straightforward.
Low Cyclomatic, High Cognitive
cyclomatic=8, cognitive=18
Interpretation: Deeply nested logic, hard to understand despite fewer branches.
Action: Refactor to reduce nesting, extract functions.
High est_branches, Low Cyclomatic
cyclomatic=9, nesting=5, est_branches=15
Interpretation: Deep nesting creates many path combinations.
Action: Flatten nesting, use early returns, extract nested logic.
Frequently Asked Questions
Q: Why is est_branches different from cyclomatic complexity?
A: Cyclomatic is the measured count of decision points. est_branches is an estimated number of execution paths, calculated using nesting depth to account for path combinations.
Q: Can I use est_branches in CI/CD thresholds?
A: No. Use measured metrics (cyclomatic_complexity, cognitive_complexity) for quality gates. est_branches is a heuristic for prioritization, not a precise measurement.
Q: Why did the metric name change from “branches” to “est_branches”?
A: To make it explicit that this is an estimate, not a measured value. Users were confused, thinking it was a precise count from the AST.
Q: How accurate is est_branches for estimating test cases?
A: It’s a rough approximation. Actual test case requirements depend on:
- Business logic complexity
- Edge cases
- Error handling paths
- Integration points
Use est_branches as a starting point, not an exact requirement.
Q: Should I refactor code with high est_branches?
A: Not necessarily. High est_branches indicates complex logic that may need thorough testing. If the logic is unavoidable (e.g., state machines, complex business rules), focus on comprehensive test coverage rather than refactoring.
Signal Categories Summary
| Category | Signals | Purpose |
|---|---|---|
| Complexity | cyclomatic, cognitive, nesting, loc | How hard code is to understand |
| Coverage | line_percent, branch_percent | How risky changes are |
| Coupling | fan_in, fan_out, call_depth | How changes ripple |
| Quality | entropy, purity, dead_code | False positive reduction |
Using Signals with AI
When piping debtmap output to an AI assistant, signals provide the context needed for intelligent fixes:
# Get structured signals for AI consumption
debtmap analyze . --format markdown --top 5 | claude "Fix the top item"
The AI uses these signals to:
- Understand code complexity before reading it
- Prioritize which files to examine first
- Decide between refactoring vs testing approaches
- Estimate the scope of changes needed
Further Reading
- Why Debtmap? - Sensor model explained
- LLM Integration - AI workflow patterns
- Configuration - Threshold customization
- Scoring Strategies - How signals combine
Examples
This chapter provides practical, real-world examples of using Debtmap across different project types and workflows. All examples use current CLI syntax verified against the source code.
Quick Start: New to Debtmap? Start with Basic Rust Analysis for the simplest introduction, then explore Coverage Integration for risk-based prioritization.
Quick Navigation: For detailed explanations of all CLI options, see the CLI Reference chapter.
Overview
This chapter demonstrates:
- Language-specific analysis: Rust, Python, JavaScript/TypeScript with their respective testing tools
- CI/CD integration: GitHub Actions, GitLab CI, CircleCI with validation gates
- Output formats: Terminal, JSON, and Markdown with interpretation guidance
- Advanced features: Context-aware analysis, multi-pass processing
- Configuration patterns: Tailored settings for different project types
- Progress tracking: Using the compare command to validate refactoring improvements
All examples are copy-paste ready and tested against the current Debtmap implementation.
Table of Contents
- Analyzing Rust Projects
- Python Analysis
- JavaScript/TypeScript
- CI Integration
- Output Formats
- Advanced Usage
- Configuration Examples
- Compare Command
Analyzing Rust Projects
Basic Rust Analysis
Start with a simple analysis of your Rust project:
# Analyze current directory (path defaults to '.' - both commands are identical)
debtmap analyze
# Same as above with explicit path
debtmap analyze .
# Analyze specific directory
debtmap analyze ./src
# Analyze with custom complexity threshold
debtmap analyze ./src --threshold-complexity 15
Coverage Integration with cargo-tarpaulin
Combine complexity analysis with test coverage for risk-based prioritization:
# Generate LCOV coverage data
cargo tarpaulin --out lcov --output-dir target/coverage
# Analyze with coverage data
debtmap analyze . --lcov target/coverage/lcov.info
# Or use the shorter alias
debtmap analyze . --coverage-file target/coverage/lcov.info
Note: --lcov is an alias for --coverage-file - both work identically.
What this does:
- Functions with 0% coverage and high complexity get marked as [CRITICAL]
- Shows risk reduction potential for each untested function
Custom Thresholds
Configure thresholds to match your project standards:
# Set both complexity and duplication thresholds
debtmap analyze . \
--threshold-complexity 15 \
--threshold-duplication 50
# Use preset configurations for quick setup
debtmap analyze . --threshold-preset strict # Strict standards
debtmap analyze . --threshold-preset balanced # Default balanced
debtmap analyze . --threshold-preset lenient # Lenient for legacy code
Preset configurations:
- Strict: Lower thresholds for high quality standards (good for new projects)
- Balanced: Default thresholds suitable for typical projects
- Lenient: Higher thresholds designed for legacy codebases with existing technical debt
Preset Threshold Values:
These presets control the minimum complexity levels that trigger debt flagging. Source: src/complexity/threshold_manager.rs:120-148
| Preset | Min Cyclomatic | Min Cognitive | Min Function Lines |
|---|---|---|---|
| Strict | 3 | 7 | 15 |
| Balanced | 5 | 10 | 20 |
| Lenient | 10 | 20 | 50 |
Note: These are minimum thresholds for flagging functions. Functions below these thresholds are considered simple and won’t appear in debt reports.
God Object Detection
Identify classes and modules with too many responsibilities:
# Standard analysis includes god object detection
debtmap analyze .
# Disable god object detection for specific run
debtmap analyze . --no-god-object
# Show detailed module split recommendations (experimental)
debtmap analyze . --show-splits
God objects are flagged with detailed metrics:
- Number of methods and fields
- Responsibility count (grouped by naming patterns)
- God object score (0-100%)
- Recommendations for splitting
The --show-splits option provides experimental decomposition suggestions:
# Get detailed recommendations for breaking up large modules
debtmap analyze . --show-splits
This shows suggested module boundaries, responsibility groupings, and how to decompose large files into smaller, focused modules.
Purity-Weighted God Object Scoring
Debtmap uses purity analysis to distinguish functional programming patterns from actual god objects. Enable verbose mode to see purity distribution:
# See purity distribution in god object analysis
debtmap analyze . -v
Example Output:
GOD OBJECT ANALYSIS: src/core/processor.rs
Total functions: 107
PURITY DISTRIBUTION:
Pure: 70 functions (65%) → complexity weight: 6.3
Impure: 37 functions (35%) → complexity weight: 14.0
Total weighted complexity: 20.3
God object score: 12.0 (threshold: 70.0)
Status: ✓ Not a god object (functional design)
This shows:
- Pure functions (no side effects, immutable) receive 0.3× weight
- Impure functions (I/O, mutations, side effects) receive 1.0× weight
- Functional modules with many pure helpers avoid false positives
- Focus shifts to modules with excessive stateful code
Why This Matters:
Without purity weighting:
Module with 100 pure helpers → Flagged as god object ❌
With purity weighting:
Module with 100 pure helpers → Normal (functional design) ✅
Module with 100 impure functions → God object detected ✅
Compare Two Modules:
Functional module (70 pure, 30 impure):
Pure: 70 × 0.3 = 21.0
Impure: 30 × 1.0 = 30.0
Score: 35.0 → Not a god object ✓
Procedural module (100 impure):
Impure: 100 × 1.0 = 100.0
Score: 125.0 → God object detected ✗
Filtering and Focusing
# Analyze only Rust files
debtmap analyze . --languages rust
# Focus on architecture issues (god objects, complexity)
debtmap analyze . --filter Architecture
# Focus on testing gaps
debtmap analyze . --filter Testing
# Filter by multiple categories
debtmap analyze . --filter Architecture,Testing
# Show only top 10 issues
debtmap analyze . --top 10
# Show only high-priority items
debtmap analyze . --min-priority high
Valid filter categories:
- Architecture - God objects, high complexity, structural issues
- Testing - Test coverage gaps, untested critical code
- Duplication - Code duplication and similar patterns
- Maintainability - Long functions, deep nesting, readability issues
Output Formats
# JSON output for CI integration
debtmap analyze . --format json --output report.json
# Markdown report
debtmap analyze . --format markdown --output DEBT_REPORT.md
# Terminal output (default) - prettified
debtmap analyze .
Multi-Pass Analysis
For deeper analysis with context awareness:
# Enable context-aware analysis with multiple providers
debtmap analyze . \
--context \
--context-providers critical_path,dependency,git_history
# Multi-pass analysis with attribution (multi-pass is default)
debtmap analyze . --attribution
Complete CI Example
This is from Debtmap’s own .github/workflows/debtmap.yml:
# 1. Install cargo-tarpaulin
cargo install cargo-tarpaulin
# 2. Build debtmap
cargo build --release
# 3. Generate coverage
cargo tarpaulin --config .tarpaulin.toml --out Lcov --timeout 300
# 4. Run validation with coverage
./target/release/debtmap validate . \
--coverage-file target/coverage/lcov.info \
--format json \
--output debtmap-report.json
Python Analysis
Basic Python Analysis
# Analyze Python files only
debtmap analyze . --languages python
# Analyze specific Python directory
debtmap analyze src --languages python
Coverage Integration with pytest
Generate coverage and analyze risk:
# Generate LCOV coverage with pytest
pytest --cov --cov-report=lcov
# Analyze with coverage data
debtmap analyze . \
--languages python \
--lcov coverage.lcov
Python-Specific Patterns
# Focus on testing gaps in Python code
debtmap analyze . \
--languages python \
--filter Testing
# Find god objects in Python modules
debtmap analyze . \
--languages python \
--filter Architecture
Example Configuration for Python Projects
Create .debtmap.toml:
[languages]
enabled = ["python"]
[thresholds]
complexity = 12
max_function_lines = 40
[ignore]
patterns = [
"**/*_test.py",
"tests/**",
".venv/**",
"**/__pycache__/**",
]
[god_object]
enabled = true
max_methods = 15
max_responsibilities = 4
JavaScript/TypeScript
Analyzing JS/TS Projects
# Analyze JavaScript and TypeScript
debtmap analyze . --languages javascript,typescript
# TypeScript only
debtmap analyze . --languages typescript
Coverage Integration with Jest
# Generate LCOV with Jest
jest --coverage --coverageReporters=lcov
# Analyze with coverage
debtmap analyze . \
--languages javascript,typescript \
--lcov coverage/lcov.info
Node.js Project Patterns
# Exclude node_modules and focus on source
debtmap analyze src --languages javascript,typescript
# With custom complexity thresholds for JS
debtmap analyze . \
--languages javascript,typescript \
--threshold-complexity 10
TypeScript Configuration Example
Create .debtmap.toml:
[languages]
enabled = ["typescript", "javascript"]
[thresholds]
complexity = 10
max_function_lines = 50
[ignore]
patterns = [
"node_modules/**",
"**/*.test.ts",
"**/*.spec.ts",
"dist/**",
"build/**",
"**/*.d.ts",
]
Monorepo Analysis
# Analyze specific package
debtmap analyze packages/api --languages typescript
# Analyze all packages, grouped by category
debtmap analyze packages \
--languages typescript \
--group-by-category
CI Integration
GitHub Actions
Complete workflow example (from .github/workflows/debtmap.yml):
name: Debtmap
on:
push:
branches: [ main, master ]
pull_request:
branches: [ main, master ]
workflow_dispatch:
env:
CARGO_TERM_COLOR: always
RUST_BACKTRACE: 1
jobs:
validate:
name: Technical Debt Validation
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v5
with:
fetch-depth: 0
- name: Setup Rust
uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt, clippy
- name: Cache cargo dependencies
uses: actions/cache@v4
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
- name: Install cargo-tarpaulin
run: |
if ! command -v cargo-tarpaulin &> /dev/null; then
cargo install cargo-tarpaulin
else
echo "cargo-tarpaulin already installed"
fi
- name: Build debtmap
run: cargo build --release
- name: Generate coverage data
run: cargo tarpaulin --config .tarpaulin.toml --out Lcov --timeout 300
- name: Run debtmap validation with coverage
run: |
if [ -f "target/coverage/lcov.info" ]; then
./target/release/debtmap validate . --coverage-file target/coverage/lcov.info --format json --output debtmap-report.json
else
echo "Warning: LCOV file not found, running validation without coverage data"
./target/release/debtmap validate . --format json --output debtmap-report.json
fi
- name: Upload debtmap report and coverage
if: always()
uses: actions/upload-artifact@v4
with:
name: debtmap-analysis-artifacts
path: |
debtmap-report.json
target/coverage/lcov.info
retention-days: 7
GitLab CI
debtmap:
stage: quality
image: rust:latest
script:
# Install debtmap
- cargo install debtmap
# Run tests with coverage (generates LCOV format)
- cargo install cargo-tarpaulin
- cargo tarpaulin --out Lcov
# Validate with debtmap (using LCOV format)
- debtmap validate .
--coverage-file lcov.info
--format json
--output debtmap-report.json
artifacts:
paths:
- lcov.info
- debtmap-report.json
expire_in: 1 week
CircleCI
version: 2.1
jobs:
debtmap:
docker:
- image: cimg/rust:1.75
steps:
- checkout
- run:
name: Install debtmap
command: cargo install debtmap
- run:
name: Generate coverage
command: |
cargo install cargo-tarpaulin
cargo tarpaulin --out Lcov
- run:
name: Run debtmap
command: |
debtmap validate . \
--coverage-file lcov.info \
--format json \
--output debtmap.json
- store_artifacts:
path: debtmap.json
workflows:
version: 2
build:
jobs:
- debtmap
Using debtmap validate for PR Gates
# Fail build if thresholds are exceeded
debtmap validate . --coverage-file lcov.info
# With custom thresholds
debtmap validate . \
--coverage-file lcov.info \
--threshold-complexity 15
# Exit code 0 if passing, 1 if failing
Compare Command in CI
Track technical debt trends over time:
# Generate baseline (on main branch)
debtmap analyze . --format json --output baseline.json
# After PR changes
debtmap analyze . --format json --output current.json
# Compare and fail if regressions detected
debtmap compare \
--before baseline.json \
--after current.json \
--format json
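In a pipeline script, the same check can gate the merge, assuming compare exits non-zero when it detects a regression (as described above):
# Fail the job on detected debt regressions
if ! debtmap compare --before baseline.json --after current.json --format json > compare.json; then
  echo "Debt regression detected - see compare.json"
  exit 1
fi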
Output Formats
Terminal Output (Default)
The default terminal output is prettified with colors and priorities:
debtmap analyze . --lcov coverage.lcov --top 3
Example output:
════════════════════════════════════════════
PRIORITY TECHNICAL DEBT FIXES
════════════════════════════════════════════
🎯 TOP 3 RECOMMENDATIONS (by unified priority)
#1 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/analyzers/rust.rs:38 parse_function()
├─ ACTION: Add 6 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk
├─ COMPLEXITY: cyclomatic=6, cognitive=8, nesting=2, lines=32
├─ DEPENDENCIES: 0 upstream, 11 downstream
└─ WHY: Business logic with 0% coverage, manageable complexity
📊 TOTAL DEBT SCORE: 4907
📈 OVERALL COVERAGE: 67.12%
JSON Output
Machine-readable format for CI/CD integration:
debtmap analyze . --format json --output report.json
Using JSON output programmatically:
# Extract total debt score
debtmap analyze . --format json | jq '.total_debt_score'
# Count critical items
debtmap analyze . --format json | jq '[.items[] | select(.unified_score.final_score >= 8)] | length'
# Get top 5 functions by score
debtmap analyze . --format json | jq '.items | sort_by(-.unified_score.final_score) | .[0:5] | .[].location'
# Extract all test gap items
debtmap analyze . --format json | jq '[.items[] | select(.debt_type == "TestGap")]'
Structure:
{
"items": [
{
"location": {
"file": "src/main.rs",
"function": "process_data",
"line": 42
},
"debt_type": "TestGap",
"unified_score": {
"complexity_factor": 3.2,
"coverage_factor": 10.0,
"dependency_factor": 2.5,
"role_multiplier": 1.2,
"final_score": 9.4
},
"function_role": "BusinessLogic",
"recommendation": {
"action": "Add unit tests",
"details": "Add 6 unit tests for full coverage",
"effort_estimate": "2-3 hours"
},
"expected_impact": {
"risk_reduction": 3.9,
"complexity_reduction": 0,
"coverage_improvement": 100
}
}
],
"overall_coverage": 67.12,
"total_debt_score": 4907
}
Markdown Report
# Standard markdown report
debtmap analyze . --format markdown --output DEBT_REPORT.md
# Summary level for executives (minimal detail)
debtmap analyze . --format markdown --detail-level summary --output SUMMARY.md
# Standard level for team review (default)
debtmap analyze . --format markdown --detail-level standard --output DEBT.md
# Comprehensive level for deep analysis
debtmap analyze . --format markdown --detail-level comprehensive --output DETAILED.md
# Debug level for troubleshooting
debtmap analyze . --format markdown --detail-level debug --output DEBUG.md
Detail levels:
- summary: Executive summary with key metrics and top issues only
- standard: Balanced detail suitable for team reviews (default)
- comprehensive: Full details including all debt items and analysis
- debug: Maximum detail including AST information and parser internals
Use cases:
- Summary: Management reports, PR comments
- Standard: Regular team reviews
- Comprehensive: Deep dives, refactoring planning
- Debug: Troubleshooting debtmap behavior
Great for documentation or PR comments.
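For example, the summary level pairs well with automated PR comments. A sketch assuming the GitHub CLI is installed and PR_NUMBER is set by your CI environment:
# Post the executive summary on the pull request
debtmap analyze . --format markdown --detail-level summary --output SUMMARY.md
gh pr comment "$PR_NUMBER" --body-file SUMMARY.md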
Advanced Usage
Design Pattern Detection
Detect design patterns in your codebase to understand architectural choices and identify potential overuse of certain patterns:
# Detect specific design patterns
debtmap analyze . --patterns observer,singleton,factory
# Adjust pattern confidence threshold (0.0-1.0, default: 0.7)
debtmap analyze . --pattern-threshold 0.8
# Show uncertain pattern matches with warnings
debtmap analyze . --show-pattern-warnings
# Disable pattern detection entirely
debtmap analyze . --no-pattern-detection
Available patterns:
- observer - Event listener registrations, callback patterns
- singleton - Static instance management
- factory - Object creation methods
- strategy - Algorithm selection via traits/interfaces
- callback - Function passing and invocation
- template_method - Abstract methods with concrete implementations
Use cases:
- Identify architectural patterns in unfamiliar codebases
- Detect potential overuse of certain patterns (e.g., too many singletons)
- Understand code organization and design decisions
Dead Code and Public API Analysis
Control how Debtmap detects public APIs to prevent false positives when analyzing libraries vs CLI tools:
# Analyze with public API awareness (default for libraries)
debtmap analyze . --context
# Disable public API heuristics (useful for CLI tools)
debtmap analyze . --no-public-api-detection
# Adjust public API confidence threshold (default: 0.7)
debtmap analyze . --public-api-threshold 0.8
When to use:
- Libraries: Keep default public API detection to avoid flagging exported functions as unused
- CLI tools: Use --no-public-api-detection since there's no public API
- Mixed projects: Adjust threshold based on false positive rate
What it detects:
- Functions exported in lib.rs or api.rs
- Functions matching API naming patterns
- Prevents false positives for “unused” library functions
Context-Aware Analysis
Enable advanced context providers for more accurate prioritization:
# Enable all context providers for comprehensive analysis
debtmap analyze . \
--context \
--context-providers critical_path,dependency,git_history
Context Providers:
critical_path - Identifies functions on critical execution paths
- Analyzes call graph to find frequently-called functions
- Prioritizes functions that affect many code paths
- Use for: Understanding impact of potential failures
dependency - Analyzes dependency impact and cascade effects
- Tracks caller/callee relationships
- Calculates cascade impact of changes
- Use for: Understanding change propagation and refactoring risk
git_history - Tracks change frequency and churn
- Analyzes git blame and commit history
- Identifies frequently-changed functions
- Use for: Finding volatile code that needs stabilization
Example workflows:
# Find volatile high-complexity code
debtmap analyze . --context --context-providers git_history
# Understand refactoring impact
debtmap analyze . --context --context-providers dependency
# Disable slow provider for faster analysis
debtmap analyze . --context --disable-context git_history
Multi-Pass Analysis
Multi-pass analysis is enabled by default for deeper analysis with context awareness:
# Multi-pass analysis is default behavior (no flag needed)
debtmap analyze .
# Multi-pass with attribution tracking
debtmap analyze . --attribution
# Disable multi-pass for single-pass performance mode
debtmap analyze . --no-multi-pass
When to use --no-multi-pass:
- Performance-critical CI environments: Faster analysis for large codebases
- Quick validation: When you need fast feedback
- Single-file analysis: When deep context isn’t needed
Performance comparison example:
# Fast single-pass (skip context analysis)
time debtmap analyze . --no-multi-pass
# Default multi-pass (includes context analysis)
time debtmap analyze .
Multi-pass analysis provides better prioritization by analyzing dependencies and relationships across files, but single-pass mode can be 2-3x faster for large codebases.
Aggregation Methods
# Use logarithmic sum for aggregation
debtmap analyze . --aggregation-method logarithmic_sum
# Standard sum (default)
debtmap analyze . --aggregation-method sum
Filtering and Grouping
# Group results by debt category
debtmap analyze . --group-by-category
# Filter specific categories
debtmap analyze . --filter Architecture,Testing
# Show only high-priority items
debtmap analyze . --min-priority high --top 10
Call Graph Debugging
Debug and validate call graph construction for accurate dependency analysis:
# Enable call graph debugging output
debtmap analyze . --debug-call-graph
# Trace specific function resolution
debtmap analyze . --trace-function my_function --trace-function another_fn
# Validate call graph structure (detect orphans and cycles)
debtmap analyze . --validate-call-graph
# Show detailed caller/callee relationships
debtmap analyze . --show-dependencies
Use cases:
Troubleshooting resolution failures:
# When a function isn't being analyzed correctly
debtmap analyze . --debug-call-graph --trace-function problematic_function
Understanding function relationships:
# See who calls what
debtmap analyze . --show-dependencies --top 10
Validating call graph integrity:
# Detect cycles and orphaned nodes
debtmap analyze . --validate-call-graph
Output includes:
- Resolution statistics (success/failure rates)
- DFS cycle detection results
- Orphan node detection
- Cross-module resolution details
Verbosity Levels
Control the level of diagnostic output for debugging and understanding analysis decisions:
# Level 1: Show main score factors
debtmap analyze . -v
# Level 2: Show detailed calculations
debtmap analyze . -vv
# Level 3: Show all debug information
debtmap analyze . -vvv
# Long form also available
debtmap analyze . --verbose
# Show macro expansion details (Rust)
debtmap analyze . --verbose-macro-warnings --show-macro-stats
What each level shows:
- -v: Score factor breakdowns and purity distribution in god object analysis
- -vv: Detailed calculations, coverage lookups, and metric computations
- -vvv: Full debug information including AST details and parser internals
Understanding Metrics
Learn how Debtmap calculates complexity metrics and scores:
# Show metric definitions and formulas
debtmap analyze . --explain-metrics
What it explains:
Measured metrics (counted from AST):
- cyclomatic_complexity - Decision points (if, match, while, for, etc.)
- cognitive_complexity - Weighted readability measure
- nesting_depth - Maximum nested control structure levels
- loc - Lines of code in function
- parameter_count - Number of function parameters
Estimated metrics (formula-based approximations):
- est_branches - Estimated execution paths
  - Formula: max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
  - Purpose: estimate test cases needed for branch coverage
  - Note: this is an ESTIMATE, not a count from the AST
Scoring formulas:
- Complexity factor calculation
- Coverage factor weight
- Dependency factor impact
- Role multiplier application
- Final score aggregation
Use --explain-metrics when:
- First learning debtmap
- Questioning why something is flagged
- Understanding score differences
- Teaching team members about technical debt metrics
AST Functional Analysis
Enable AST-based functional composition analysis to detect functional programming patterns:
# Enable AST-based functional composition analysis
debtmap analyze . --ast-functional-analysis
# Combine with verbose mode to see purity analysis
debtmap analyze . --ast-functional-analysis -v
What it detects:
- Pure functions (no side effects, immutable)
- Impure functions (I/O, mutations, side effects)
- Function composition patterns
- Immutability patterns
Benefits:
- Distinguishes functional patterns from god objects (see purity weighting in God Object Detection section)
- Identifies opportunities for better testability
- Highlights side effect boundaries
- Supports functional programming code reviews
Example output with -v:
PURITY DISTRIBUTION:
Pure: 70 functions (65%) → complexity weight: 6.3
Impure: 37 functions (35%) → complexity weight: 14.0
Total weighted complexity: 20.3
Parallel Processing Control
Control thread count for CPU-bound systems or to limit resource usage in CI environments. By default, Debtmap uses all available cores for optimal performance.
# Use 8 parallel jobs
debtmap analyze . --jobs 8
# Disable parallel processing
debtmap analyze . --no-parallel
When to adjust:
- CI environments: Limit thread count to avoid resource contention with other jobs (see the example after this list)
- CPU-bound systems: Reduce threads if machine is under load
- Large codebases: Default parallelism provides best performance
- Debugging: Use `--no-parallel` to force sequential execution, which simplifies troubleshooting
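For CI runners where other jobs need headroom, the thread count can be derived from the core count (a sketch assuming Linux's nproc utility; adjust for your platform):
# Leave two cores free for other CI jobs (assumes a multi-core runner)
debtmap analyze . --jobs $(($(nproc) - 2))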
Configuration Examples
Basic Configuration
Create .debtmap.toml:
[thresholds]
complexity = 15
duplication = 25
max_function_lines = 50
max_nesting_depth = 4
[languages]
enabled = ["rust", "python"]
[ignore]
patterns = [
"tests/**/*",
"**/*.test.rs",
"target/**",
]
Entropy-Based Complexity
[entropy]
enabled = true
weight = 0.5
use_classification = true
pattern_threshold = 0.7
entropy_threshold = 0.4
branch_threshold = 0.8
max_combined_reduction = 0.3
This reduces false positives for repetitive code patterns.
Understanding entropy-adjusted output:
When entropy analysis detects repetitive patterns, detailed output (-vv) shows both original and adjusted complexity:
debtmap analyze . -vv --top 5
Example output for repetitive validation function:
#15 SCORE: 68.2 [HIGH]
├─ COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
Interpreting the adjustment:
- `cyclomatic=20`: Original complexity before entropy adjustment
- `dampened: 14`: Adjusted complexity (20 × 0.70 = 14)
- `factor: 0.70`: Dampening factor (30% reduction applied)
- `entropy: 0.30`: Low entropy indicates repetitive patterns
- `repetition: 95%`: High pattern repetition detected
When no dampening is applied:
#5 SCORE: 85.5 [CRITICAL]
├─ COMPLEXITY: cyclomatic=15, est_branches=30, cognitive=22, nesting=4
No “dampened” indicator means the function has diverse logic without repetitive patterns, so the full complexity is used for scoring.
See Entropy Analysis for more details.
Custom Scoring Weights
[scoring]
coverage = 0.40 # Test coverage gaps
complexity = 0.40 # Code complexity
dependency = 0.20 # Dependency criticality
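To see how a weight change shifts priorities, one option is to snapshot results before and after editing the weights and diff them with the compare workflow described later in this chapter (file names are illustrative):
# Snapshot current priorities
debtmap analyze . --format json --output weights-before.json
# ... edit [scoring] in .debtmap.toml ...
debtmap analyze . --format json --output weights-after.json
debtmap compare --before weights-before.json --after weights-after.json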
God Object Detection Tuning
[god_object]
enabled = true
# Purity-based scoring reduces false positives for functional code
# Pure functions (no side effects) get lower weight in god object scoring
purity_weight_pure = 0.3 # Pure function complexity weight (default: 0.3)
purity_weight_impure = 1.0 # Impure function complexity weight (default: 1.0)
# Rust-specific thresholds
[god_object.rust]
max_methods = 25
max_fields = 15
max_traits = 5
max_lines = 400
max_complexity = 50
# Python-specific thresholds
[god_object.python]
max_methods = 20
max_fields = 12
max_lines = 350
max_complexity = 45
# JavaScript/TypeScript-specific thresholds
[god_object.javascript]
max_methods = 20
max_fields = 12
max_lines = 300
max_complexity = 40
Why purity weighting matters: See the Purity-Weighted God Object Scoring section for detailed explanation. In short:
- Modules with many pure helper functions avoid false god object flags
- Focus shifts to modules with excessive stateful/impure code
- Functional programming patterns are properly recognized
Example:
- Module with 100 pure functions → Normal (functional design) ✅
- Module with 100 impure functions → God object detected ✅
External API Configuration
For libraries (not CLI tools):
[external_api]
detect_external_api = true
api_functions = [
"parse",
"Parser::new",
"client::connect",
]
api_files = [
"src/lib.rs",
"src/api.rs",
"src/public/*.rs",
]
Complete Multi-Language Configuration
[thresholds]
complexity = 12
duplication = 30
max_file_lines = 400
max_function_lines = 40
minimum_debt_score = 1.0
minimum_cyclomatic_complexity = 2
[entropy]
enabled = true
weight = 0.5
[scoring]
coverage = 0.40
complexity = 0.40
dependency = 0.20
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
[ignore]
patterns = [
# Tests
"tests/**/*",
"**/*.test.*",
"**/*_test.*",
# Build artifacts
"target/**",
"dist/**",
"build/**",
"node_modules/**",
# Python
".venv/**",
"**/__pycache__/**",
# Generated code
"*.generated.*",
"*.pb.*",
]
[god_object]
enabled = true
max_methods = 18
max_fields = 12
Compare Command
The compare command helps validate that refactoring achieved its goals.
Basic Comparison Workflow
# 1. Generate baseline before refactoring
debtmap analyze . --format json --output before.json
# 2. Make your code improvements
# ... refactor, add tests, etc ...
# 3. Generate new analysis
debtmap analyze . --format json --output after.json
# 4. Compare and verify improvements
debtmap compare --before before.json --after after.json
Target-Specific Comparison
Focus on whether a specific function improved:
# Target format: file:function:line
debtmap compare \
--before before.json \
--after after.json \
--target-location src/main.rs:process_data:100
Using with Implementation Plans
Extract target automatically from plan files:
# If IMPLEMENTATION_PLAN.md contains:
# **Target**: src/parser.rs:parse_expression:45
debtmap compare \
--before before.json \
--after after.json \
--plan IMPLEMENTATION_PLAN.md
Output Formats
# JSON output (default)
debtmap compare --before before.json --after after.json
# Terminal output (explicit)
debtmap compare \
--before before.json \
--after after.json \
--format terminal
# JSON for CI integration (explicit output file)
debtmap compare \
--before before.json \
--after after.json \
--format json \
--output comparison.json
# Markdown report
debtmap compare \
--before before.json \
--after after.json \
--format markdown \
--output COMPARISON.md
Interpreting Results
Target Status:
- Resolved: Function no longer appears (complexity reduced below threshold)
- Improved: Metrics improved (complexity down, coverage up)
- Unchanged: No significant change
- Regressed: Metrics got worse
- Not Found: Target not found in baseline
Overall Trend:
- Improving: More items resolved/improved than regressed
- Stable: No significant changes
- Regressing: New critical debt introduced
Example Output:
Target Status: Resolved ✅
- src/parser.rs:parse_expression:45 reduced from complexity 22 to 8
- Coverage improved from 0% to 85%
Overall Trend: Improving
- 3 items resolved
- 2 items improved
- 0 regressions
- Total debt score: 450 → 285 (-37%)
CI Integration
Use in pull request validation:
# In CI script
debtmap compare \
--before baseline.json \
--after current.json \
--format json | jq -e '.overall_trend == "Improving"'
# Exit code 0 if improving, 1 otherwise
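The same gate can also publish a human-readable report for reviewers, using the markdown format documented above:
# Produce a markdown comparison to attach as a CI artifact
debtmap compare \
  --before baseline.json \
  --after current.json \
  --format markdown \
  --output COMPARISON.md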
Tips and Best Practices
- Start Simple: Begin with basic analysis, add coverage later
- Use Filters: Focus on one category at a time (Architecture, Testing)
- Iterate: Run analysis, fix top items, repeat
- CI Integration: Automate validation in your build pipeline
- Track Progress: Use the `compare` command to validate improvements
- Configure Thresholds: Adjust to match your team’s standards
- Leverage Coverage: Always include coverage data for accurate risk assessment
Next Steps
- CLI Reference - Complete CLI documentation
- Analysis Guide - Understanding analysis results
- Configuration - Advanced configuration options
Frequently Asked Questions
Common questions about debtmap’s features, usage, and AI integration.
AI Integration
How does debtmap work with AI coding assistants?
Debtmap is designed as a sensor that provides structured data for AI consumption. Instead of telling you what to fix, it tells AI assistants:
- Where to look - Prioritized list of debt items with file locations
- What to read - Context suggestions (callers, callees, tests)
- What signals matter - Complexity, coverage, coupling metrics
Example workflow:
# Pipe directly to Claude Code
debtmap analyze . --format markdown --top 3 | claude "Fix the top item"
What output format is best for AI?
Use --format markdown for AI workflows. This format:
- Minimizes tokens while maximizing information
- Includes context suggestions inline
- Uses consistent structure for reliable parsing
- Avoids verbose descriptions that waste context window
debtmap analyze . --format markdown --top 5
Does debtmap provide fix suggestions?
No. Debtmap is a sensor, not a prescriber. It provides signals (metrics, coverage, coupling) and lets the AI decide how to fix issues.
This design is intentional:
- AI can consider business context you provide
- Different situations require different approaches
- Template recommendations are often wrong
How do I use context suggestions?
Each debt item includes file ranges the AI should read:
Context:
├─ Primary: src/parser.rs:38-85 (the debt item)
├─ Caller: src/handler.rs:100-120 (usage context)
└─ Test: tests/parser_test.rs:50-75 (expected behavior)
Tell your AI to read these files before making changes:
debtmap analyze . --format markdown --top 1 | \
claude "Read the context files first, then fix the top item"
Can I integrate debtmap with Cursor?
Yes. Generate a report file and reference it in Cursor:
# Generate report
debtmap analyze . --format markdown --top 10 > debt-report.md
# In Cursor, use: @debt-report.md Fix the top critical item
Features & Capabilities
What’s the difference between measured and estimated metrics?
Measured Metrics - Precise values from AST analysis:
- `cyclomatic_complexity`: Exact count of decision points
- `cognitive_complexity`: Weighted readability measure
- `nesting_depth`: Maximum nesting levels
- `loc`: Lines of code
Estimated Metrics - Heuristic approximations:
- `est_branches`: Estimated execution paths (formula-based)
Use measured metrics for thresholds and gates. Use estimated metrics for prioritization.
What is entropy-based complexity analysis?
Entropy analysis uses information theory to distinguish between genuinely complex code and repetitive pattern-based code.
A function with 20 identical if/return validation checks has the same cyclomatic complexity as a function with 20 diverse conditional branches. Entropy analysis gives the validation function a much lower effective complexity score because it follows a simple, repetitive pattern.
Result: 60-75% reduction in false positives compared to traditional complexity metrics.
What languages are supported?
Currently supported:
- Rust - Full support with AST parsing, macro expansion, and trait resolution
Planned:
- Python, JavaScript/TypeScript, Go (after Rust analysis is mature)
Why is debtmap Rust-only right now?
We’re taking a focused approach to deliver the best possible Rust code analyzer before expanding. This allows us to:
- Perfect core algorithms with one language
- Build Rust-specific features (macros, traits, lifetimes)
- Establish trust in the Rust community
- Apply learnings to future languages
How does coverage integration work?
Debtmap reads LCOV format coverage data and maps it to functions:
# Generate coverage
cargo llvm-cov --lcov --output-path coverage.lcov
# Analyze with coverage
debtmap analyze . --lcov coverage.lcov
Coverage affects prioritization:
- Complex function with good coverage = lower priority
- Simple function with no coverage = higher priority
- High complexity + zero coverage = critical priority
Usage & Configuration
How do I exclude test files from analysis?
By default, debtmap excludes common test directories. To customize:
# .debtmap.toml
[analysis]
exclude_patterns = [
"**/tests/**",
"**/*_test.rs",
"**/target/**",
]
Can I customize the complexity thresholds?
Yes. Configure in .debtmap.toml:
[thresholds]
cyclomatic_complexity = 10
nesting_depth = 3
loc = 200
[tiers]
critical = 8.0
high = 5.0
moderate = 2.0
Does debtmap integrate with CI/CD?
Yes. Use the validate command:
debtmap validate . --max-debt-density 10.0
Exit codes:
- `0` = validation passed
- `1` = validation failed (debt exceeds thresholds)
- `2` = analysis error
GitHub Actions example:
- name: Check technical debt
run: |
cargo llvm-cov --lcov --output-path coverage.lcov
debtmap validate . --lcov coverage.lcov --max-debt-density 10.0
What if debtmap reports false positives?
- Verify entropy analysis is enabled (default):
  [analysis]
  enable_entropy_analysis = true
- Adjust thresholds for your project:
  [thresholds]
  cyclomatic_complexity = 15
- Use ignore comments for specific functions:
  // debtmap:ignore - acceptable validation pattern
  fn validate_many_fields() { ... }
- Report issues - If you believe analysis is incorrect, open an issue with a code example.
How accurate is the risk scoring?
Risk scores are relative prioritization metrics, not absolute measures. They help you answer “which code should I focus on first?” rather than “exactly how risky is this code?”
Use risk scores for prioritization, but apply your domain knowledge when deciding what to fix.
Comparison with Other Tools
How is debtmap different from SonarQube?
| Aspect | Debtmap | SonarQube |
|---|---|---|
| Output | Signals for AI | Recommendations |
| Speed | Seconds | Minutes |
| Coverage | Built-in | Enterprise only |
| Entropy | Yes | No |
| Setup | Single binary | Server required |
When to use SonarQube: Multi-language enterprise dashboards. When to use debtmap: AI-assisted Rust development.
Should I replace clippy with debtmap?
No—use both. They serve different purposes:
clippy:
- Rust idioms and patterns
- Common mistakes
- Runs in milliseconds
debtmap:
- Technical debt prioritization
- Coverage-based risk
- Context for AI
cargo clippy -- -D warnings
debtmap analyze . --lcov coverage.lcov --top 10
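A minimal CI step that runs both, gating on the validate command shown earlier (the threshold is illustrative):
# Lint idioms first, then gate on debt density
cargo clippy -- -D warnings
debtmap validate . --max-debt-density 10.0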
How does debtmap compare to coverage tools?
Coverage tools (tarpaulin, llvm-cov) tell you what’s tested. Debtmap tells you which untested code is most risky.
Coverage tools say: “You have 75% coverage.”
Debtmap says: “Function X has 0% coverage and complexity 12; fix this first.”
Troubleshooting
Analysis is slow on my large codebase
Optimization strategies:
- Exclude unnecessary files:
  [analysis]
  exclude_patterns = ["**/target/**", "**/vendor/**"]
- Analyze specific directories:
  debtmap analyze src/
- Reduce parallelism:
  debtmap analyze . --jobs 4
Coverage data isn’t being applied
Check:
- LCOV file path is correct
- LCOV file contains data: grep -c "^SF:" coverage.lcov
- Source paths match between LCOV and project (see the check below)
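A quick way to compare the two (standard shell tools; file names are illustrative):
# Paths recorded in the coverage file
grep "^SF:" coverage.lcov | head -5
# Paths as they exist in the project
find src -name "*.rs" | head -5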
Debtmap reports “No functions found”
Check:
- Project contains Rust files (`.rs`) - see the check below
- Files aren’t excluded by ignore patterns
- No syntax errors: debtmap analyze . -vv
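To confirm Rust sources are actually visible from the analysis root (standard shell tools):
# List Rust files outside build output
find . -name "*.rs" -not -path "./target/*" | head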
Getting Help
- Documentation: debtmap.dev
- GitHub Issues: Report bugs
- LLM Integration: See LLM Integration Guide
- Examples: See Examples
Troubleshooting
Common issues and solutions for using debtmap effectively.
Getting Started with Troubleshooting
When you encounter an issue with debtmap, start with these steps:
- Try a quick fix - See Quick Fixes for common problems and immediate solutions
- Enable debug mode - Use `-v`, `-vv`, or `-vvv` for increasing levels of detail
- Check error messages - See Error Messages Reference for explanations
- Review your configuration - Check `.debtmap/config.toml` for any settings that might cause issues
Subsections
This chapter is organized into focused troubleshooting topics:
- Quick Fixes - Common problems with immediate solutions
- Debug Mode - Verbosity levels and diagnostic options
- Context Provider Issues - Troubleshooting critical_path, dependency, and git_history providers
- Advanced Analysis Issues - Call graph, pattern detection, functional analysis issues
- Error Messages Reference - Detailed error message explanations
- Output and Command Issues - Output formatting and command-specific problems
- FAQ - Frequently asked questions
Common Problem Categories
Analysis Problems
- Slow analysis: See Quick Fixes or Debug Mode
- Parse errors: See Quick Fixes or Error Messages Reference
- No output: See Quick Fixes
- Inconsistent results: See Quick Fixes
Coverage and Context Problems
- Coverage not applied: See Quick Fixes
- Context provider errors: See Context Provider Issues
Output Problems
- JSON parsing issues: See Quick Fixes or Output and Command Issues
- Compare command errors: See Output and Command Issues
Detection Problems
- God object false positives: See Quick Fixes or Advanced Analysis Issues
- Pattern detection issues: See Advanced Analysis Issues
- Call graph problems: See Quick Fixes or Advanced Analysis Issues
Diagnostic Commands
Basic Diagnostics
# Check version
debtmap --version
# Run with basic verbosity
debtmap analyze . -v
# Run with detailed output
debtmap analyze . -vv
# Run with full debug output
debtmap analyze . -vvv
Coverage Diagnostics
# Debug coverage matching for a function
debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "function_name" \
-v
Performance Diagnostics
# Time the analysis
time debtmap analyze .
# Profile with verbosity
debtmap analyze . -vv 2>&1 | grep "processed in"
When to File Bug Reports
File a bug report when:
- Parse errors on valid syntax
- Crashes or panics
- Incorrect complexity calculations
- Concurrency errors
- Incorrect error messages
Before filing:
- Check this troubleshooting guide
- Try `--semantic-off` fallback mode
- Update to the latest version
- Search existing issues on GitHub
See FAQ for detailed guidance.
Related Documentation
- Configuration - Configure debtmap behavior
- CLI Reference - Complete CLI flag documentation
- Analysis Guide - Understanding analysis results
- Examples - Practical usage examples
Quick Fixes
If you’re experiencing problems with debtmap, try these quick solutions before diving into detailed troubleshooting.
Slow Analysis
Problem: Analysis takes too long to complete
Quick Solutions:
# Use all available CPU cores (default)
debtmap analyze . --jobs 0
# Disable multi-pass analysis for faster single-pass
debtmap analyze . --no-multi-pass
# Use faster fallback mode (less accurate but much faster)
debtmap analyze . --semantic-off
# Limit files for testing
debtmap analyze . --max-files 100
# Analyze a specific subdirectory only
debtmap analyze src/specific/module
Source: CLI flags defined in src/cli/args.rs:227-240
When to use each approach:
- `--jobs 0`: Default, maximum performance on a dedicated machine
- `--no-multi-pass`: CI/CD pipelines, large codebases (>100k LOC)
- `--semantic-off`: Quick complexity checks during development
- `--max-files`: Testing configuration before full analysis
See Debug Mode for performance monitoring options.
Parse Errors
Problem: “Parse error in file:line:column” messages
Quick Solutions:
# Try fallback mode without semantic analysis
debtmap analyze . --semantic-off
# For Rust macro issues, see detailed warnings
debtmap analyze . --verbose-macro-warnings --show-macro-stats
# Exclude problematic files via configuration
# Add to .debtmap/config.toml:
# exclude = ["path/to/problematic/file.rs"]
Source: Verbose macro flags in src/cli/args.rs:317-326
Common causes:
- Unsupported language syntax or version
- Complex macro expansions (Rust)
- Type inference edge cases (Python, TypeScript)
See Error Messages Reference for detailed parse error explanations.
No Output
Problem: Running debtmap produces no output or results
Quick Solutions:
# Increase verbosity to see what's happening
debtmap analyze . -v
# Lower the minimum priority threshold
debtmap analyze . --min-priority 0
# Lower the minimum score threshold
debtmap analyze . --min-score 0
# Check if files are being analyzed
debtmap analyze . -vv 2>&1 | grep "Processing"
# Use lenient threshold preset
debtmap analyze . --threshold-preset lenient
# Show all items without filtering
debtmap analyze . --min-priority 0 --top 100
Source: Filtering options in src/cli/args.rs:143-163
Common causes:
- Threshold preset too strict (try `--threshold-preset lenient`)
- Category filters excluding all results
- Min-priority too high
- Min-score too high
Inconsistent Results
Problem: Results differ between runs
Quick Solutions:
# Check if coverage file changed
debtmap analyze . --coverage-file coverage.info -v
# Disable context providers for consistent baseline
debtmap analyze . --no-context-aware
# Compare runs with debug output
debtmap analyze . -vv > run1.txt
debtmap analyze . -vv > run2.txt
diff run1.txt run2.txt
Source: Context-aware flag in src/cli/args.rs:202-204
Common causes:
- Coverage file changed between runs
- Context providers enabled/disabled (`--context`)
- Git history changes affecting the git_history provider
- Different threshold settings
Coverage Data Not Matching Functions
Problem: Coverage data not being applied to functions
Quick Solutions:
# Debug coverage data parsing for a specific function
debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "process_file"
# See detailed function matching diagnostics
debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "process_file" \
-v
# Narrow search to specific file
debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "calculate_score" \
--file src/scoring.rs
Source: explain-coverage command in src/commands/explain_coverage/mod.rs
Example output:
Coverage Detection Explanation
==============================
Function: process_file
File: src/processor.rs
Matched: suffix_match
Coverage: 87.5%
See Context Provider Issues for detailed coverage troubleshooting.
Too Many Low-Priority Warnings
Problem: Results overwhelmed with low-priority items
Quick Solutions:
# Increase minimum score threshold
debtmap analyze . --min-score 5.0
# Use stricter threshold preset
debtmap analyze . --threshold-preset strict
# Filter by specific debt categories
debtmap analyze . --filter "complexity,debt"
# Limit output to top N items
debtmap analyze . --top 20
# Use summary mode for compact output
debtmap analyze . --summary
Source: Filtering flags in src/cli/args.rs:147-159
Min-score filter behavior (spec 193, 205):
- T1 Critical Architecture and T2 Complex Untested items bypass the min-score filter
- T3 and T4 items are filtered by min-score threshold
- Default min-score is 3.0
Call Graph Resolution Failures
Problem: Call graph shows incomplete or missing relationships
Quick Solutions:
# Enable debug output to see graph construction
debtmap analyze . --debug-call-graph -vv
# Validate the call graph consistency
debtmap analyze . --validate-call-graph
# Include external dependencies if relevant
debtmap analyze . --show-external-calls --show-std-lib-calls
# Trace specific functions to see their relationships
debtmap analyze . --trace-function "my_function" -vv
# Show only statistics, not the full graph
debtmap analyze . --call-graph-stats
Source: Call graph debug flags in src/cli/args.rs:332-350
See Advanced Analysis Issues for detailed call graph debugging.
God Object False Positives
Problem: God object detection flagging legitimate large files
Quick Solutions:
# Disable god object detection entirely
debtmap analyze . --no-god-object
# See god object analysis with responsibility metrics
debtmap analyze . -vv
# Check specific file for god object patterns
debtmap analyze path/to/large/file.rs -vv
Source: God object flag in src/cli/args.rs:242-244
When to disable:
- Framework files with intentionally large aggregator classes
- Generated code files
- Files that are legitimately large due to single responsibility
See Advanced Analysis Issues for god object configuration options.
JSON Format Parsing Errors
Problem: JSON output parsing errors or unexpected structure
Quick Solutions:
# Use unified JSON format (consistent structure, recommended)
debtmap analyze . --format json --output-format unified
# Validate JSON output with jq
debtmap analyze . --format json | jq .
# Write to file for easier inspection
debtmap analyze . --format json --output results.json
Source: Output format options from CLI help
Understanding the two formats:
| Format | Structure | Recommended For |
|---|---|---|
| Legacy (default) | {"File": {...}} | Backwards compatibility |
| Unified | {"type": "File", ...} | Tool integration, parsing |
See Output and Command Issues for detailed JSON format documentation.
Context Provider Errors
Problem: Errors with critical_path, dependency, or git_history providers
Quick Solutions:
# Disable all context providers
debtmap analyze . --no-context-aware
# Disable specific problematic provider
debtmap analyze . --context --disable-context git_history
# Enable specific providers only
debtmap analyze . --context --context-providers critical_path,dependency
# Check context provider details
debtmap analyze . --context -vvv
Source: Context provider flags in src/cli/args.rs:119-125
Provider-specific issues:
- git_history: Requires a git repository; fails outside git repos (see the guard below)
- dependency: Complex import structures may not resolve
- critical_path: Requires valid call graph
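Because git_history hard-requires a repository, a simple guard in scripts avoids the failure entirely (plain shell; git rev-parse is the same check the provider runs internally, per the Context Provider Issues section):
if git rev-parse --git-dir >/dev/null 2>&1; then
  debtmap analyze . --context
else
  debtmap analyze . --context --disable-context git_history
fi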
See Context Provider Issues for detailed troubleshooting.
Quick Reference Table
| Problem | Quick Fix |
|---|---|
| Slow analysis | --no-multi-pass or --semantic-off |
| Parse errors | --semantic-off or exclude files |
| No output | --min-priority 0 or -v |
| Coverage issues | explain-coverage command |
| Too many warnings | --min-score 5.0 or --top 20 |
| Call graph issues | --debug-call-graph -vv |
| God object false positives | --no-god-object |
| JSON parsing | --output-format unified |
| Context errors | --no-context-aware |
When Quick Fixes Don’t Work
If these quick fixes don’t resolve your issue, consult:
- Debug Mode - Detailed debugging options
- Context Provider Issues - Provider-specific troubleshooting
- Advanced Analysis Issues - Complex analysis problems
- Error Messages Reference - Error message explanations
- FAQ - Common questions and answers
For issues not covered in this documentation, consider:
- Running with maximum verbosity: debtmap analyze . -vvv 2>&1 | tee debug.log
- Checking the GitHub issues
- Filing a bug report with the debug output
Debug Mode
Use verbosity flags and environment variables to diagnose issues and understand analysis behavior.
Verbosity Levels
Debtmap provides three verbosity levels for progressive debugging:
# Level 1: Show main score factors
debtmap analyze . -v
# Level 2: Show detailed calculations
debtmap analyze . -vv
# Level 3: Show all debug information
debtmap analyze . -vvv
What each level shows:
| Level | Flag | Information Displayed |
|---|---|---|
| 1 | -v | Score breakdowns, main contributing factors |
| 2 | -vv | Detailed metric calculations, file processing |
| 3 | -vvv | Full debug output, context provider details |
Example Output at Each Level
Level 1 (-v) - Score factors:
[Score] src/main.rs::process_file
Complexity: 15 (cyclomatic) + 20 (cognitive)
Coverage: 45%
Priority: 7.2
Level 2 (-vv) - Detailed calculations:
[Score] src/main.rs::process_file
Base risk: 3.6
Coverage penalty: 1.5× (40-60% coverage tier)
Debt factor: 1.2×
Final: 7.2
Processing time: 45ms
Level 3 (-vvv) - Full debug:
[Debug] Parsing src/main.rs
[Debug] AST nodes: 1,234
[Debug] Functions detected: 15
[Context] critical_path: distance=2 from entry point
[Context] dependency: 8 callers, 3 callees
...
Diagnostic Options
Macro Debugging (Rust)
# Show macro parsing warnings
debtmap analyze . --verbose-macro-warnings
# Show macro expansion statistics
debtmap analyze . --show-macro-stats
These flags help diagnose Rust macro-related parse issues:
- `--verbose-macro-warnings`: Shows warnings for each macro that couldn’t be fully expanded
- `--show-macro-stats`: Displays statistics about macro expansion success rates
Semantic Analysis Control
# Disable semantic analysis (faster fallback mode)
debtmap analyze . --semantic-off
# Validate LOC consistency
debtmap analyze . --validate-loc
Use --semantic-off when:
- Encountering parse errors on valid syntax
- Need faster analysis with reduced accuracy
- Testing files with unsupported language constructs
Explain Metrics
The --explain-metrics flag displays detailed explanations of all metrics and scoring formulas:
debtmap --explain-metrics
This shows:
- Metric definitions (cyclomatic complexity, cognitive complexity, etc.)
- Scoring formula breakdowns
- Threshold explanations
- Priority calculation details
Source: src/cli/args.rs:313-314
Note: The --explain-score flag has been deprecated in favor of granular verbosity levels (-v, -vv, -vvv).
Performance Profiling
Enable Profiling
Use the --profile flag to identify performance bottlenecks (Spec 001):
# Enable profiling output
debtmap --profile
# Write profiling data to JSON file
debtmap --profile --profile-output profile-data.json
Source: src/cli/args.rs:356-368
Profiling output includes:
- Time spent in each analysis phase
- File processing times
- Memory usage statistics
- Bottleneck identification
Analyze Profiling Results
# Generate profile and analyze
debtmap --profile --profile-output analysis.json
jq '.phases | to_entries | sort_by(.value.duration_ms) | reverse' analysis.json
Debugging Score Calculations
# Use verbosity levels to see score breakdown
debtmap analyze . -v # Shows score factors
# See how coverage affects scores
debtmap analyze . --coverage-file coverage.info -v
# See how context affects scores
debtmap analyze . --context --context-providers critical_path -v
Debug Session Example
# Step 1: Run with verbosity to see what's happening
debtmap analyze . -vv
# Step 2: Try without semantic analysis
debtmap analyze . --semantic-off -v
# Step 3: Check specific file
debtmap analyze path/to/file.rs -vvv
# Step 4: Validate results
debtmap analyze . --validate-loc
Environment Variables
Debtmap supports various environment variables for debugging and diagnostics without command-line flags.
Debug Environment Variables
| Variable | Purpose | Source |
|---|---|---|
| `DEBTMAP_COVERAGE_DEBUG` | Enable detailed coverage matching diagnostics | src/risk/lcov/diagnostics.rs:4 |
| `DEBTMAP_TIMING` | Show timing information for analysis phases | src/analyzers/effects.rs:98 |
| `DEBTMAP_DEBUG_SCORING` | Enable detailed score calculation output | src/priority/mod.rs:551 |
| `DEBTMAP_LOG_FILE` | Write tracing output to a log file | src/observability/tracing.rs:63 |
| `DEBTMAP_SHOW_FILTER_STATS` | Show filter statistics summary | src/priority/mod.rs:649 |
| `DEBTMAP_ENTROPY_ENABLED` | Enable entropy-based pattern dampening | src/complexity/entropy.rs |
| `DEBTMAP_FILE_TIMEOUT` | Override per-file analysis timeout (seconds) | src/analysis_utils.rs:206 |
| `DEBTMAP_NO_TIMEOUT` | Disable file analysis timeouts entirely | src/analysis_utils.rs:220 |
Coverage Debugging
Enable detailed coverage matching diagnostics when troubleshooting coverage file issues:
# Enable coverage debug mode
DEBTMAP_COVERAGE_DEBUG=1 debtmap analyze .
# Combine with explain-coverage for detailed matching
DEBTMAP_COVERAGE_DEBUG=1 debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "my_function" \
-v
Source: src/risk/lcov/diagnostics.rs:4-18
When enabled, this shows:
- Coverage file parsing statistics
- Function name matching attempts and results
- Path normalization details
- Match success rates by strategy (exact, suffix, method name, etc.)
Example output:
=== Coverage Matching Statistics ===
Total functions analyzed: 150
Functions with coverage: 123 (82.0%)
Functions without coverage: 27 (18.0%)
Match Strategy Breakdown:
exact_match: 95 (77.2%)
suffix_match: 20 (16.3%)
method_name_match: 8 (6.5%)
Timing Information
Enable timing output to identify slow files or analysis phases:
# Enable timing output
DEBTMAP_TIMING=1 debtmap analyze .
Source: src/analyzers/effects.rs:81-99, src/analyzers/rust.rs:211-213
Example output:
[TIMING] analyze_rust_file src/main.rs: total=0.45s (analysis=0.32s, debt=0.08s, deps=0.05s)
[TIMING] analyze_rust_file src/lib.rs: total=0.12s (analysis=0.09s, debt=0.02s, deps=0.01s)
Score Calculation Debugging
For detailed score calculation output:
DEBTMAP_DEBUG_SCORING=1 debtmap analyze .
Source: src/priority/mod.rs:551-553
Example output:
=== Score Calculation Debug ===
Function-level items count: 150
Base risk calculation:
complexity_component: 0.27
coverage_component: 0.45
base_risk: 3.6
...
Filter Statistics
Show statistics about filtering operations:
DEBTMAP_SHOW_FILTER_STATS=1 debtmap analyze .
Source: src/priority/mod.rs:649-653
Example output:
=== Filter Statistics ===
Items before filtering: 500
Items after priority filter: 150
Items after category filter: 120
Items in final output: 100
File Logging
Write tracing output to a file instead of stderr (useful for TUI debugging):
DEBTMAP_LOG_FILE=debtmap.log debtmap analyze .
Source: src/observability/tracing.rs:41-42, src/observability/tracing.rs:63-65
This is particularly useful when:
- Using the TUI mode (prevents display corruption)
- Need to capture debug output for later analysis
- Debugging issues that require verbose output
File Timeout Control
Control per-file analysis timeouts:
# Set custom timeout (in seconds, default varies by file size)
DEBTMAP_FILE_TIMEOUT=30 debtmap analyze .
# Disable timeouts entirely (use with caution)
DEBTMAP_NO_TIMEOUT=1 debtmap analyze .
Source: src/analysis_utils.rs:206-221
When to adjust timeouts:
- `DEBTMAP_FILE_TIMEOUT`: When analyzing very large or complex files that need more time
- `DEBTMAP_NO_TIMEOUT`: When debugging timeout issues or analyzing files that consistently time out
Warning: Disabling timeouts can cause the analysis to hang on problematic files.
RUST_LOG for Tracing
Control the structured tracing verbosity:
# Default: warnings and errors only
debtmap analyze .
# Show phase-level progress
RUST_LOG=info debtmap analyze .
# Detailed debugging output
RUST_LOG=debug debtmap analyze .
# Debug only debtmap crate
RUST_LOG=debtmap=debug debtmap analyze .
# Trace-level (very verbose)
RUST_LOG=trace debtmap analyze .
Source: src/observability/tracing.rs:17-31
Analysis Feature Flags
# Enable context-aware analysis by default
export DEBTMAP_CONTEXT_AWARE=true
# Enable functional analysis by default
export DEBTMAP_FUNCTIONAL_ANALYSIS=true
# Single-pass analysis (disable multi-pass)
export DEBTMAP_SINGLE_PASS=1
Output Customization
# Disable emoji in output
export NO_EMOJI=1
# Force plain text output (no colors)
export NO_COLOR=1
CI/CD Variables
# Enable automation-friendly output
export PRODIGY_AUTOMATION=true
# Enable validation mode (stricter checks)
export PRODIGY_VALIDATION=true
Precedence Rules
When both environment variables and CLI flags are present:
- CLI flags take precedence over environment variables
- Environment variables override config file defaults
- Config file settings override built-in defaults
# CLI flag overrides environment variable
DEBTMAP_CONTEXT_AWARE=false debtmap --context # Flag wins, context enabled
Performance Tips
Parallel Processing
# Use all CPU cores (default)
debtmap --jobs 0
# Limit to 4 threads
debtmap --jobs 4
# Disable parallel processing (debugging)
debtmap --no-parallel
When to adjust parallelism:
- Use `--jobs 0` (default): Maximum performance on a dedicated machine
- Use `--jobs N`: Limit resource usage while other tasks run
- Use `--no-parallel`: Debugging concurrency issues
Analysis Optimizations
# Faster: disable multi-pass analysis (single-pass mode)
debtmap --no-multi-pass
# Fast mode: disable semantic analysis
debtmap --semantic-off
# Plain output: faster terminal rendering
debtmap --plain
# Limit files for testing
debtmap --max-files 100
# Analyze subdirectory only
debtmap src/specific/module
# Reduce output with filters
debtmap --min-priority 4 --top 20
Performance Comparison
| Configuration | Speed | Accuracy |
|---|---|---|
| Default (multi-pass) | Fast | Highest |
| `--no-multi-pass` | Faster | High |
| `--semantic-off` | Fastest | Medium |
| `--no-parallel` | Slowest | High |
| `--jobs 4` | Medium | High |
Monitoring Performance
# Time analysis
time debtmap
# Profile with verbosity
debtmap -vv 2>&1 | grep "processed in"
# Use built-in profiling
debtmap --profile
Troubleshooting Environment Variables
# Test with specific environment
env DEBTMAP_CONTEXT_AWARE=true debtmap -v
# See all debtmap-related environment variables
env | grep -i debtmap
env | grep -i prodigy
# Combine multiple debug variables
DEBTMAP_COVERAGE_DEBUG=1 DEBTMAP_TIMING=1 debtmap analyze . -v
See Also
- Quick Fixes - Common problems with immediate solutions
- Error Messages Reference - Detailed error message explanations
- CLI Reference - Complete command-line documentation
- Configuration Guide - Configure debtmap behavior
Context Provider Issues
This section covers troubleshooting for debtmap’s context providers: critical_path, dependency, and git_history. Context providers gather additional risk-relevant information to enhance the base complexity analysis.
Overview
Context providers implement the ContextProvider trait (defined in src/risk/context/mod.rs:13-25) and gather context for analysis targets. Each provider:
- Has a name identifying the provider
- Gathers context for a given analysis target
- Has a weight for its contribution to overall risk
- Can explain its context contribution
The three available providers are:
| Provider | Purpose | Source |
|---|---|---|
| `critical_path` | Analyzes call graph paths from entry points | src/risk/context/critical_path.rs |
| `dependency` | Calculates risk propagation through dependencies | src/risk/context/dependency.rs |
| `git_history` | Provides change frequency and bug density metrics | src/risk/context/git_history.rs |
Enabling Context Analysis
# Enable with default providers
debtmap analyze . --context
# Or use the explicit flag
debtmap analyze . --enable-context
# Specify specific providers
debtmap analyze . --context --context-providers critical_path,dependency,git_history
Source: CLI arguments defined in src/cli/args.rs:111-125
Disabling Specific Providers
# Disable git_history only
debtmap analyze . --context --disable-context git_history
# Disable multiple providers
debtmap analyze . --context --disable-context git_history,dependency
# Disable context-aware filtering (keeps providers but disables filtering)
debtmap analyze . --no-context-aware
Source: CLI arguments defined in src/cli/args.rs:123-125 and src/cli/args.rs:202-204
Git History Provider Issues
The GitHistoryProvider (defined in src/risk/context/git_history.rs:27-33) analyzes git commit history to provide:
- Change frequency
- Bug fix count
- Author count
- File age in days
- Stability score
Problem: “Git history error” or “Not a git repository”
Causes:
- Not in a git repository
- No git history for files
- Git not installed or accessible
Solutions:
# Verify you're in a git repository
git status
# If not a git repository, disable git_history provider
debtmap analyze . --context --disable-context git_history
# Initialize git repo if needed
git init
Technical detail: The provider verifies the git repository by running git rev-parse --git-dir (see src/risk/context/git_history.rs:38-46). If this fails, the provider returns an error.
Problem: Git History Analysis is Slow
The git history provider fetches all git data upfront using batched operations.
Solutions:
Reduce commit history depth in your CI pipeline:
# Use shallow clone in CI
git clone --depth 50 repo.git
debtmap analyze . --context-providers critical_path,dependency
Disable git_history for faster analysis:
debtmap analyze . --context --disable-context git_history
Technical detail: The provider uses batched::BatchedGitHistory (see src/risk/context/git_history.rs:49-62) to load git data in bulk rather than per-file queries.
Dependency Provider Issues
The DependencyRiskProvider (defined in src/risk/context/dependency.rs:181-191) calculates:
- Propagated risk through the dependency graph
- Blast radius (how many modules would be affected by changes)
- Coupling strength between modules
Problem: “Dependency error” or incomplete dependency graph
Causes:
- Complex import structures
- Circular dependencies
- Unsupported dependency patterns
Solutions:
# Disable dependency provider
debtmap analyze . --context --disable-context dependency
# Try with verbosity to see details
debtmap analyze . --context -vvv
# Use without context
debtmap analyze .
Problem: Dependency Analysis Errors
The dependency calculator uses iterative risk propagation with a maximum of 10 iterations and a convergence threshold of 0.01 (see src/risk/context/dependency.rs:93-118).
If analysis errors occur:
# Check for circular dependencies in your codebase
debtmap analyze . --context -vvv 2>&1 | grep -i "dependency\|circular"
# Disable dependency provider if issues persist
debtmap analyze . --context --disable-context dependency
Critical Path Provider Issues
The CriticalPathProvider (defined in src/risk/context/critical_path.rs) analyzes:
- Entry points (main, API endpoints, CLI commands, event handlers)
- Call graph paths
- User-facing code paths
Problem: Critical path analysis fails or produces unexpected results
Causes:
- Invalid call graph
- Missing function definitions
- Complex control flow
Solutions:
# Disable critical_path provider
debtmap analyze . --context --disable-context critical_path
# Try with semantic analysis disabled
debtmap analyze . --context --semantic-off
# Debug with verbosity to see entry point detection
debtmap analyze . --context --context-providers critical_path -vvv
Problem: Incorrect Entry Point Classification
Entry points are classified by function name and file path patterns (see src/risk/context/critical_path.rs:82-98). The following entry types are detected:
| Entry Type | Detection Pattern |
|---|---|
| `Main` | Functions named main |
| `CliCommand` | Functions in CLI-related paths |
| `ApiEndpoint` | Functions matching API patterns |
| `WebHandler` | Functions in web handler paths |
| `EventHandler` | Functions matching event handler patterns |
| `TestEntry` | Functions in test files |
Solutions:
# Check with verbose output to see classification
debtmap analyze . -vv | grep "entry point\|Entry"
# Verify call graph is being built
debtmap analyze . --show-call-graph
Context Impact on Scoring
Context providers add additional risk factors to the base complexity score. The contribution is calculated in ContextualRisk::new() (see src/risk/context/mod.rs:215-238):
- Context contribution is capped at 2.0 to prevent excessive score amplification
- This means a maximum 3x multiplier on base risk
- Formula: contextual_risk = base_risk * (1.0 + context_contribution) (worked example below)
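Worked example (illustrative arithmetic, not tool output): a function with base risk 3.6 whose providers reach the 2.0 contribution cap scores 3.6 * (1.0 + 2.0) = 10.8, exactly the 3x ceiling.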
# See context contribution to scores
debtmap analyze . --context -v
# Compare with and without context
debtmap analyze . --format json --output baseline.json
debtmap analyze . --context --format json --output with_context.json
debtmap compare --before baseline.json --after with_context.json
Problem: Context Providers Not Affecting Scores
Solution: Ensure providers are enabled with --context or --context-providers:
# Wrong: context flag missing
debtmap analyze .
# Correct: context enabled
debtmap analyze . --context
Performance Considerations
Context analysis adds processing overhead. The ContextAggregator (defined in src/risk/context/mod.rs:98-155) uses:
- Lock-free `DashMap` for caching (thread-safe concurrent access)
- `Arc` for cheap cloning across threads
- Cache key: `{file_path}:{function_name}`
Performance comparison:
# Fastest: no context
debtmap analyze .
# Slowest: all context providers
debtmap analyze . --context --context-providers critical_path,dependency,git_history
# Medium: selective providers (skip git_history)
debtmap analyze . --context --context-providers critical_path,dependency
Debugging Context Providers
# See detailed context provider output
debtmap analyze . --context -vvv
# Check which providers are active
debtmap analyze . --context -v 2>&1 | grep -i "context provider\|provider"
# See provider execution times
debtmap analyze . --context -vvv 2>&1 | grep -i "time\|duration\|elapsed"
Understanding Provider Output
When running with verbose mode, you’ll see context details including:
CriticalPath context (from src/risk/context/mod.rs:48-52):
- `entry_points`: List of reachable entry points
- `path_weight`: Weight of the critical path
- `is_user_facing`: Whether the function is user-facing
Dependency context (from src/risk/context/mod.rs:53-58):
- `depth`: Dependency chain depth
- `propagated_risk`: Risk propagated from dependencies
- `dependents`: List of dependent modules
- `blast_radius`: Number of affected modules
Historical context (from src/risk/context/mod.rs:59-64):
- `change_frequency`: How often the file changes
- `bug_density`: Bug fix density
- `age_days`: File age in days
- `author_count`: Number of contributors
Common Issues Summary
| Issue | Quick Fix |
|---|---|
| Git history error | --disable-context git_history |
| Dependency analysis errors | --disable-context dependency |
| Critical path fails | --disable-context critical_path |
| Slow analysis | --disable-context git_history |
| Providers not affecting scores | Add --context flag |
| Need version 0.2.0+ | cargo install debtmap --force |
See Also
- Quick Fixes - Common problems with immediate solutions
- Debug Mode - General debugging techniques
- Advanced Analysis Issues - Multi-pass and semantic analysis
- Context Providers - Full context providers guide
Advanced Analysis Issues
Advanced troubleshooting for call graph, pattern detection, functional analysis, and other complex analysis features.
Multi-Pass Analysis
Multi-pass analysis is enabled by default and performs two iterations to distinguish logical complexity from formatting artifacts.
# Multi-pass analysis runs by default
debtmap analyze .
# Disable for performance-critical scenarios
debtmap analyze . --no-multi-pass
When to disable (--no-multi-pass):
- Performance-critical CI/CD pipelines
- Very large codebases (>100k LOC)
- Quick complexity checks during development
Call Graph Debugging
Available Flags:
# Enable call graph debug output
debtmap analyze . --debug-call-graph
# Trace specific functions through call graph
debtmap analyze . --trace-function "function_name,another_function"
# Show only call graph statistics
debtmap analyze . --call-graph-stats
# Validate call graph consistency
debtmap analyze . --validate-call-graph
Dependency Control Flags:
# Show dependency information in results
debtmap analyze . --show-dependencies
# Limit number of callers shown per function
debtmap analyze . --max-callers 10
# Include external crate calls in call graph
debtmap analyze . --show-external-calls
# Include standard library calls
debtmap analyze . --show-std-lib-calls
God Object Detection
Flag: --no-god-object
Disables god object (large class/module) detection.
God Object Types:
- god_class: Files with excessive complexity excluding test functions
- god_file: Files with excessive complexity including all functions
- god_module: Alias for god_file
# Disable god object detection entirely
debtmap analyze . --no-god-object
# See god object analysis with responsibility metrics
debtmap analyze . -vv
Pattern Detection Issues
Control Pattern Detection:
# Disable pattern detection entirely
debtmap analyze . --no-pattern-detection
# Specify specific patterns to detect
debtmap analyze . --patterns "god_object,long_function,complex_conditional"
# Adjust pattern detection sensitivity (default: 0.7)
debtmap analyze . --pattern-threshold 0.6
# Show pattern detection warnings
debtmap analyze . --show-pattern-warnings
Detected Patterns:
Debt patterns (src/cli/args.rs:84):
- `god_object`: Classes/modules with too many responsibilities
- `long_function`: Functions exceeding length thresholds
- `complex_conditional`: Nested or complex branching logic
- `deep_nesting`: Excessive indentation depth
Design patterns (src/cli/args.rs:272-273):
- `observer`: Event-driven observer pattern implementations
- `singleton`: Singleton pattern usages
- `factory`: Factory pattern implementations
- `strategy`: Strategy pattern for interchangeable algorithms
- `callback`: Callback-based async patterns
- `template_method`: Template method pattern implementations
Functional Analysis Issues
Enable Functional Analysis:
# Enable AST-based functional analysis
debtmap analyze . --ast-functional-analysis
# Use different strictness profiles
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced
debtmap analyze . --ast-functional-analysis --functional-analysis-profile lenient
Common Issues:
Too many false positives for legitimate imperative code:
# Use lenient profile
debtmap analyze . --ast-functional-analysis --functional-analysis-profile lenient
Public API Detection Issues
Control Public API Detection:
# Disable public API detection
debtmap analyze . --no-public-api-detection
# Adjust public API detection threshold (default: 0.7)
debtmap analyze . --public-api-threshold 0.5
Attribution Issues
Attribution analysis tracks where complexity originates in your code, separating logical complexity from formatting artifacts.
Enable Attribution (src/cli/args.rs:206-208):
# Show complexity attribution details
debtmap analyze . --attribution
Understanding Attribution Output:
Attribution breaks complexity into three categories:
- Logical Complexity (confidence: ~0.9): Genuine control flow and decision points
- Formatting Artifacts (confidence: ~0.75): Complexity from code formatting style
- Pattern Complexity: Complexity from recognized patterns
Common Issues:
Attribution shows high formatting artifacts:
# Multi-pass analysis (enabled by default) filters formatting artifacts
debtmap analyze .
# Don't disable it with --no-multi-pass, or formatting artifacts
# will be counted as logical complexity
Attribution confidence is too low:
- Low confidence indicates the analysis couldn’t reliably determine complexity sources
- This often happens with heavily macro-generated code or unusual control flow patterns
Missing source mappings:
- Source mappings require AST-level analysis
- Some dynamic patterns may not map to specific source locations
See Also
- Quick Fixes - Common problems with immediate solutions
- Context Provider Issues - Provider-specific troubleshooting
- Debug Mode - Verbosity levels and diagnostics
Error Messages Reference
Understanding common error messages, error codes, and how to resolve them.
Error Code System
Debtmap uses structured error codes for programmatic handling and documentation lookup. Error codes are assigned by category (from src/debtmap_error.rs:62-116):
| Code Range | Category | Description |
|---|---|---|
| E001-E009 | I/O | File system and I/O errors |
| E010-E019 | Parse | Source code parsing errors |
| E020-E029 | Config | Configuration file errors |
| E030-E039 | Analysis | Analysis algorithm errors |
| E040-E049 | CLI | Command-line argument errors |
| E050-E059 | Validation | Input validation errors |
I/O Error Codes (E001-E009)
| Code | Name | Description |
|---|---|---|
| E001 | IO_FILE_NOT_FOUND | File or directory does not exist |
| E002 | IO_PERMISSION_DENIED | Insufficient permissions to read file |
| E003 | IO_RESOURCE_BUSY | File is locked or resource temporarily unavailable |
| E009 | IO_GENERIC | Other I/O errors |
Parse Error Codes (E010-E019)
| Code | Name | Description |
|---|---|---|
| E010 | PARSE_SYNTAX | Syntax error in source file |
| E011 | PARSE_UNSUPPORTED | Unsupported language or feature |
| E012 | PARSE_ENCODING | Invalid file encoding |
| E019 | PARSE_GENERIC | Other parse errors |
Configuration Error Codes (E020-E029)
| Code | Name | Description |
|---|---|---|
| E020 | CONFIG_INVALID | Invalid configuration value |
| E021 | CONFIG_MISSING | Missing required configuration field |
| E022 | CONFIG_FILE_NOT_FOUND | Configuration file not found |
| E029 | CONFIG_GENERIC | Other configuration errors |
Analysis Error Codes (E030-E039)
| Code | Name | Description |
|---|---|---|
| E030 | ANALYSIS_COMPLEXITY | Complexity calculation failed |
| E031 | ANALYSIS_COVERAGE | Coverage data loading failed |
| E032 | ANALYSIS_SCORING | Debt scoring calculation failed |
| E039 | ANALYSIS_GENERIC | Other analysis errors |
CLI Error Codes (E040-E049)
| Code | Name | Description |
|---|---|---|
| E040 | CLI_INVALID_COMMAND | Unknown or invalid command |
| E041 | CLI_MISSING_ARG | Required argument not provided |
| E042 | CLI_INVALID_ARG | Invalid argument value |
| E049 | CLI_GENERIC | Other CLI errors |
Validation Error Codes (E050-E059)
| Code | Name | Description |
|---|---|---|
| E050 | VALIDATION_GENERIC | General validation failure |
| E051 | VALIDATION_THRESHOLD | Threshold validation exceeded |
| E052 | VALIDATION_CONSTRAINT | Constraint violation |
Error Classification
Debtmap classifies errors to help determine appropriate action (from src/debtmap_error.rs:466-519):
Retryable vs Non-Retryable Errors
Retryable errors may succeed on a subsequent attempt:
- Resource busy / file locks (E003)
- Network timeouts
- Connection issues
- “Temporarily unavailable” errors
Non-retryable errors require user intervention:
- Parse/syntax errors
- Configuration errors
- File not found (permanent)
- Validation errors
User-Fixable vs System Errors
User-fixable errors can be resolved by modifying input:
- Configuration errors (fix config file)
- CLI errors (fix command arguments)
- Validation errors (fix input values)
- Parse errors (fix source code syntax)
System errors are outside user control:
- I/O errors (system issues)
- Analysis errors (internal algorithm issues)
Exit Codes
Debtmap returns specific exit codes for CI/CD integration (from src/debtmap_error.rs:521-532):
| Exit Code | Meaning | When Returned |
|---|---|---|
| 0 | Success | Analysis completed successfully |
| 1 | Analysis/I/O Error | Analysis failed or I/O error occurred |
| 2 | CLI Error | Invalid command-line usage |
| 3 | Config Error | Configuration file or value error |
| 4 | Validation Error | Validation threshold exceeded |
| 5 | Parse Error | Source code parsing failed |
CI/CD Integration Example:
#!/bin/bash
debtmap analyze .
EXIT_CODE=$?
case $EXIT_CODE in
0)
echo "Analysis passed"
;;
1)
echo "Analysis or I/O error - check logs"
exit 1
;;
2)
echo "CLI usage error - check command syntax"
exit 1
;;
3)
echo "Configuration error - check .debtmap.toml"
exit 1
;;
4)
echo "Validation failed - thresholds exceeded"
exit 1
;;
5)
echo "Parse error - source code issues"
exit 1
;;
esac
File System Errors
Message: [E001] I/O error: No such file or directory
Meaning: File or directory does not exist
Solutions:
- Verify path is correct: `ls -la <path>`
- Check current working directory: `pwd`
- Use absolute paths if needed
Message: [E002] I/O error: Permission denied
Meaning: Cannot read file or directory due to permissions
Solutions:
- Check file permissions: `ls -la <file>`
- Ensure user has read access
- Run from appropriate directory
Message: [E003] I/O error: Resource busy
Meaning: File is locked or temporarily unavailable (retryable)
Solutions:
- Wait and retry the analysis
- Close other programs using the file
- Check for file locks
Parse Errors
Message: [E010] Parse error in file.rs: unexpected token at line 42, column 10
Meaning: Syntax that debtmap cannot parse
Solutions:
# Try fallback mode without semantic analysis
debtmap analyze . --semantic-off
# For Rust macro issues, enable verbose warnings
debtmap analyze . --verbose-macro-warnings --show-macro-stats
Message: [E011] Parse error: unsupported language
Meaning: The file type is not supported or has unsupported features
Solutions:
- Check that the file is a supported language (Rust, Python, JavaScript, TypeScript)
- For Rust, complex procedural macros may require `--semantic-off`
Configuration Errors
Message: [E020] Configuration error: invalid config value
Meaning: Invalid configuration in .debtmap.toml or CLI flags
Configuration File Locations (checked in order; see the check below):
- `.debtmap.toml` in project root (project-level)
- `~/.config/debtmap/config.toml` (global user config)
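To see which configuration file is present (plain shell; checks the two locations in order):
ls -la .debtmap.toml 2>/dev/null || ls -la ~/.config/debtmap/config.toml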
Solutions:
- Check `.debtmap.toml` syntax with a TOML validator
- Review CLI flag values
- Check for typos in configuration keys
Example valid configuration:
# .debtmap.toml
[thresholds]
complexity = 10
max_file_length = 500
[scoring]
coverage = 0.40
complexity = 0.30
Message: [E022] Configuration error: file not found
Meaning: Specified configuration file does not exist
Solutions:
- Create `.debtmap.toml` from the example: cp .debtmap.toml.example .debtmap.toml
- Verify the configuration path if using the `--config` flag
Validation Errors
Message: [E051] Validation error: threshold validation failed
Meaning: Analysis results exceed configured thresholds
Solutions:
- Check threshold values in `.debtmap.toml` under `[thresholds.validation]`
- Ensure `--min-priority` is in valid range (0-10)
- Use `--threshold-preset` with a valid preset name
Validation thresholds example:
[thresholds.validation]
max_average_complexity = 10.0
max_high_complexity_count = 100
max_debt_items = 2000
max_total_debt_score = 10000
max_codebase_risk_score = 7.0
Analysis Errors
Message: [E039] Analysis error: internal analysis failure
Meaning: Internal error during analysis phase
Solutions:
# Try fallback mode
debtmap analyze . --semantic-off
# Report with debug info
debtmap analyze . -vvv 2>&1 | tee error.log
# Isolate problem file
debtmap analyze . --max-files 1 path/to/suspected/file
Message: [E031] Analysis error: Coverage error
Meaning: Failed to load or process coverage data
Solutions:
# Analyze without coverage data
debtmap analyze . --no-coverage
# Check coverage file format
debtmap explain-coverage . --coverage-file <coverage-file>
# Verify function name matching
debtmap analyze . -vvv | grep -i coverage
Dependency Errors
Message: Dependency error: cannot resolve dependency graph
Meaning: Cannot build dependency relationships
Solutions:
# Disable dependency provider
debtmap analyze . --context --disable-context dependency
# Try without context
debtmap analyze .
Concurrency Errors
Message: Concurrency error: parallel processing failure
Meaning: Error during parallel execution
Solutions:
# Disable parallel processing
debtmap analyze . --no-parallel
# Reduce thread count
debtmap analyze . --jobs 1
Pattern Errors
Message: Pattern error: invalid glob pattern
Meaning: Invalid glob pattern in configuration or CLI
Solutions:
- Check glob pattern syntax (see the examples below)
- Escape special characters if needed
- Use simpler patterns or path prefixes
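Illustrative patterns (where globs are accepted depends on the flag or configuration key in use):
# Valid: all Rust files, recursively
**/*.rs
# Valid: any generated.rs below src/
src/**/generated.rs
# Invalid: unbalanced character class
src/[ab.rs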
Handling False Positives
Debtmap includes context-aware false positive reduction, enabled by default. This uses pattern-based classification to reduce spurious debt reports (from src/cli/args.rs:202-204).
Controlling False Positive Reduction
# Default: context-aware analysis enabled
debtmap analyze .
# Disable context-aware analysis for raw results
debtmap analyze . --no-context-aware
When to Expect False Positives
False positives are more likely with:
- Complex macro-generated code
- Unusual code patterns
- Generated or vendored code
- Test utilities with intentionally complex patterns
Reducing false positives:
- Use context-aware analysis (default)
- Configure exclusion patterns in .debtmap.toml (see the sketch below)
- Use --verbose-macro-warnings to identify macro issues
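A sketch of exclusion patterns in .debtmap.toml. The table and key names here are assumptions for illustration, not confirmed debtmap configuration; check the configuration reference before copying:
# .debtmap.toml (hypothetical keys - verify against the config reference)
[ignore]
patterns = ["target/**", "**/generated/**", "vendor/**"]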
Boilerplate Detection Messages
Debtmap identifies boilerplate patterns that may be better suited for macros or code generation (from src/organization/boilerplate_detector.rs:33-58).
Understanding Boilerplate Reports
Message: Boilerplate detected: high trait implementation density
Meaning: File contains many similar trait implementations with low complexity, suggesting macro-ification would reduce maintenance burden.
Detection criteria:
- 20+ impl blocks (configurable via min_impl_blocks; see the config sketch below)
- 70%+ method uniformity across implementations
- Average complexity below 2.0 (simple, repetitive code)
- 70%+ confidence threshold
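If legitimate code trips the detector, raising the threshold may help. A hedged sketch: min_impl_blocks is named by the detector above, but the enclosing table name is an assumption:
# .debtmap.toml (table name assumed)
[boilerplate]
min_impl_blocks = 30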
Acting on Boilerplate Recommendations
When a file is flagged as boilerplate:
- Consider macro extraction: If implementing the same trait for many types, use a declarative macro
- Consider code generation: For very large patterns, use build.rs or proc-macros
- Review the recommendation: Debtmap provides specific suggestions for each pattern
Boilerplate is NOT the same as complex code:
- Complex code (high cyclomatic complexity) → split into modules
- Boilerplate code (low complexity, high repetition) → macro-ify or generate
Language-Specific Issues
Rust Macro Handling
Rust macros may produce parse warnings or analysis limitations:
# Enable verbose macro warnings
debtmap analyze . --verbose-macro-warnings
# Show macro expansion statistics
debtmap analyze . --show-macro-stats
# Both together for full diagnostics
debtmap analyze . --verbose-macro-warnings --show-macro-stats
Language Support Status
| Language | Support Level | Notes |
|---|---|---|
| Rust | Full | Primary language, best analysis |
| Python | Stub | Basic complexity analysis |
| JavaScript | Stub | Basic complexity analysis |
| TypeScript | Stub | Basic complexity analysis |
See Also
- Quick Fixes - Common problems with immediate solutions
- Debug Mode - Verbosity levels for diagnostics
- FAQ - Frequently asked questions
Output and Command Issues
Troubleshooting output formatting and command-specific problems.
Output Format Selection
Debtmap supports multiple output formats for different use cases:
# Terminal format (default, human-readable with colors and emoji)
debtmap analyze .
# JSON format (unified format for programmatic processing)
debtmap analyze . --format json
# Markdown format (comprehensive analysis with LLM-optimized structure)
debtmap analyze . --format markdown
# HTML format (for web display)
debtmap analyze . --format html
# DOT format (Graphviz dependency visualization)
debtmap analyze . --format dot
Source: src/cli/args.rs:573-583 (OutputFormat enum)
JSON Format
JSON output uses the unified format internally (spec 202 removed legacy format). All JSON output is automatically structured with a consistent schema.
# Generate JSON output
debtmap analyze . --format json
# Validate JSON with jq
debtmap analyze . --format json | jq .
# Write to file
debtmap analyze . --format json --output results.json
Source: src/output/json.rs:32 - “Always use unified format (spec 202 - removed legacy format)”
Note: There is no --output-format flag. The unified format is always used automatically.
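Because the schema is stable, jq works well for quick checks. For example (the .items array also appears in the compare examples later in this chapter):
# Count reported items
debtmap analyze . --format json | jq '.items | length'
# Inspect the first item
debtmap analyze . --format json | jq '.items[0]'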
Plain Output Mode
For environments without color or emoji support (CI/CD, terminals without UTF-8):
# ASCII-only, no colors, no emoji
debtmap analyze . --plain
# Combine with JSON for machine-readable CI output
debtmap analyze . --format json --plain --output report.json
Source: src/cli/args.rs:176-178 (--plain flag definition)
Note: Only the --plain flag is supported. There is no NO_EMOJI environment variable.
Compare Command Issues
The compare command generates diff reports between two analysis snapshots.
Note: The compare command defaults to JSON output format (unlike analyze which defaults to terminal).
Basic Usage
# Save baseline results
debtmap analyze . --format json --output before.json
# Make code changes...
# Save new results
debtmap analyze . --format json --output after.json
# Compare results (outputs JSON by default)
debtmap compare --before before.json --after after.json
# Compare with terminal output
debtmap compare --before before.json --after after.json --format terminal
Target Location Format
When specifying a target location, use the format file:function:line:
# With explicit target location
debtmap compare --before before.json --after after.json \
--target-location "src/main.rs:process_file:42"
# Or use an implementation plan file
debtmap compare --before before.json --after after.json \
--plan IMPLEMENTATION_PLAN.md
Source: src/cli/args.rs:483-489 (target_location format specification)
Note: --plan and --target-location are mutually exclusive. Using both causes an error.
Common Compare Command Errors
Problem: “Cannot use --plan and --target-location together”
Solution: Use only one method to specify the target:
# Option 1: Use plan file
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
# Option 2: Use explicit target location
debtmap compare --before before.json --after after.json --target-location "src/lib.rs:main:10"
Problem: Empty comparison results
Solution: Ensure both JSON files contain valid analysis output:
# Verify files have content
jq '.items | length' before.json
jq '.items | length' after.json
# Regenerate if empty
debtmap analyze . --format json --output before.json
Problem: Target location not found in comparison
Solution: Verify the target location format is file:function:line:
# Correct format
--target-location "src/parser.rs:parse_expression:45"
# Incorrect formats (missing components)
--target-location "src/parser.rs" # Missing function and line
--target-location "parse_expression" # Missing file and line
Validate Command Issues
The validate command checks if a codebase meets specified quality thresholds.
Basic Validation
# Validate codebase against thresholds
debtmap validate /path/to/project
# Set maximum acceptable debt density
debtmap validate /path/to/project --max-debt-density 10.0
# Use configuration file for thresholds
debtmap validate /path/to/project --config debtmap.toml
CI/CD Integration
# In CI pipeline (fails build if validation fails)
debtmap validate . --max-debt-density 10.0 || exit 1
# With verbose output for debugging
debtmap validate . --max-debt-density 10.0 -v
# With coverage integration
debtmap validate . --coverage-file coverage.lcov --max-debt-density 10.0
Exit Codes
The validate command returns:
- 0 - Success (all thresholds passed)
- Non-zero - Failure (thresholds exceeded or errors occurred)
Source: src/cli/args.rs:378-471 (Validate command definition)
Validate-Improvement Command Issues
The validate-improvement command validates that technical debt improvements meet quality thresholds.
Basic Usage
# First, create a comparison file
debtmap compare --before before.json --after after.json --output comparison.json
# Then validate the improvement
debtmap validate-improvement --comparison comparison.json
Source: src/commands/validate_improvement/mod.rs:1-45
Configuration Options
# Set custom improvement threshold (default: 75%)
debtmap validate-improvement --comparison comparison.json --threshold 80.0
# Custom output location (default: .prodigy/debtmap-validation.json)
debtmap validate-improvement --comparison comparison.json --output validation.json
# Track progress across multiple attempts
debtmap validate-improvement --comparison comparison.json \
--previous-validation .prodigy/previous-validation.json
# Suppress console output (for automation)
debtmap validate-improvement --comparison comparison.json --quiet
# Output format (json, terminal, or markdown)
debtmap validate-improvement --comparison comparison.json --format terminal
Source: src/cli/args.rs:502-531 (ValidateImprovement command definition)
Composite Score Calculation
The validation score combines three weighted components:
| Component | Weight | Description |
|---|---|---|
| Target Improvement | 50% | Did the specific target item improve? |
| Project Health | 30% | Did overall project debt decrease? |
| No Regressions | 20% | Were new critical items introduced? |
Formula: score = (0.5 × target) + (0.3 × health) + (0.2 × no_regressions)
Source: src/commands/validate_improvement/scoring.rs (calculate_composite_score function)
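As a worked example, assuming each component is normalized to the 0.0-1.0 range: a fully improved target (1.0), a modest project-health gain (0.5), and no new regressions (1.0) give:
score = (0.5 × 1.0) + (0.3 × 0.5) + (0.2 × 1.0) = 0.85
That is 85%, which clears the default 75% threshold.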
Common Validate-Improvement Errors
Problem: “Cannot read comparison file”
Solution: Ensure the comparison file exists and is valid JSON:
# Verify file exists
ls -la comparison.json
# Validate JSON format
jq . comparison.json
# Regenerate if needed
debtmap compare --before before.json --after after.json --output comparison.json
Problem: Low improvement score despite fixing issues
Cause: The composite score considers overall project health and regressions, not just the target fix.
Solution: Check for regressions introduced elsewhere:
# Use verbose output to see score breakdown
debtmap validate-improvement --comparison comparison.json --format terminal
Problem: Threshold not being met
Solution: Adjust the threshold or improve more areas:
# Lower threshold for partial improvements
debtmap validate-improvement --comparison comparison.json --threshold 50.0
# Or continue improving until the default 75% threshold is met
Trend Tracking
Track improvement trends across multiple validation attempts:
# First attempt
debtmap validate-improvement --comparison comparison.json \
--output .prodigy/validation-1.json
# Second attempt (after more improvements)
debtmap validate-improvement --comparison comparison-2.json \
--previous-validation .prodigy/validation-1.json \
--output .prodigy/validation-2.json
The output includes trend analysis with direction and recommendations.
Source: src/commands/validate_improvement/types.rs:81-88 (TrendAnalysis structure)
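To see what was recorded between attempts, inspecting the validation JSON with jq avoids guessing at field names:
# List top-level fields, then drill into the trend data from there
jq 'keys' .prodigy/validation-2.json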
Summary vs Full Output
# Summary mode (compact tiered display)
debtmap analyze . --summary
debtmap analyze . -s
# Full output (default)
debtmap analyze .
# Limit number of items
debtmap analyze . --top 10
debtmap analyze . --tail 10
Filtering Output
# Minimum priority level
debtmap analyze . --min-priority high
# Filter by debt categories (comma-separated)
debtmap analyze . --filter complexity,duplication
# Combine filters
debtmap analyze . --min-priority medium --top 20 --filter complexity
# Show filter statistics
debtmap analyze . --filter complexity --show-filter-stats
# Minimum score threshold for T3/T4 items (default: 3.0)
debtmap analyze . --min-score 5.0
Source: src/cli/args.rs:153-155 (--filter with value_delimiter=',')
Note: Categories are comma-separated values. Use --show-filter-stats to see how many items were filtered.
See Also
- Quick Fixes - Common problems with immediate solutions
- CLI Reference - Complete CLI flag documentation
- Debug Mode - Debugging analysis issues
Troubleshooting FAQ
Frequently asked questions about troubleshooting debtmap issues.
General Questions
Q: Why is my analysis slow?
A: Check several factors:
# Use all CPU cores
debtmap analyze . --jobs 0
# Disable multi-pass for faster analysis
debtmap analyze . --no-multi-pass
# Try faster fallback mode
debtmap analyze . --semantic-off
Q: What does ‘Parse error’ mean?
A: File contains syntax debtmap cannot parse. Try:
- --semantic-off for fallback mode
- --verbose-macro-warnings for Rust macros
- Exclude problematic files in .debtmap.toml
Q: Why do scores differ between runs?
A: Several factors affect scores:
- Coverage file changed
- Context providers enabled/disabled
- Code changes (intended behavior)
- Different threshold settings
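To rule out nondeterminism itself, run the same analysis twice and diff the output; any difference should trace back to one of the factors above:
# Same input should produce identical output
debtmap analyze . --format json --output run1.json
debtmap analyze . --format json --output run2.json
diff run1.json run2.json && echo "outputs identical"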
Coverage Questions
Q: How does coverage affect scores?
A: Coverage affects scores through multiplicative factors (from src/risk/strategy.rs:189-204):
- < 20% coverage: 3.0x penalty
- 20-40% coverage: 2.0x penalty
- 40-60% coverage: 1.5x penalty
- 60-80% coverage: 1.2x penalty
- ≥ 80% coverage: 0.8x reduction (bonus for high coverage)
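For example, assuming the multiplier applies directly to a function's base risk score: a function scoring 5.0 at 15% coverage becomes 5.0 × 3.0 = 15.0, while the same function at 85% coverage drops to 5.0 × 0.8 = 4.0, pushing untested complex code to the top of the list.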
Q: Why isn’t my coverage data being applied?
A: Use the explain-coverage command:
debtmap explain-coverage . \
--coverage-file coverage.lcov \
--function "function_name" \
-v
Output Questions
Q: Why no output?
A: Check verbosity and filtering:
# Increase verbosity
debtmap analyze . -v
# Lower priority threshold
debtmap analyze . --min-priority 0
# Use lenient threshold
debtmap analyze . --threshold-preset lenient
Q: What’s the difference between legacy and unified JSON?
A:
- Legacy: {File: {...}} - nested structure
- Unified: {type: "File", ...} - consistent structure
The legacy format was removed in spec 202; all JSON output now uses the unified structure.
Scoring Questions
Q: What’s the difference between cyclomatic and cognitive complexity?
A: Cyclomatic complexity counts decision points (branches, conditions), while cognitive complexity measures human comprehension difficulty (nested structures, breaks in linear flow). Both are measured metrics computed directly from the AST.
Q: How does coverage dampening work?
A: Well-tested code gets lower debt scores through a dampening multiplier. Functions with high coverage (≥80%) receive a 0.8x reduction, surfacing untested complex functions as higher priority targets for improvement.
Q: When should I use god object detection vs boilerplate detection?
A: Use god object detection for large, complex classes that have too many responsibilities and need to be split into modules. Use boilerplate detection for repetitive, low-complexity code patterns that could benefit from macros or code generation.
Q: What are measured vs estimated metrics?
A: Measured metrics (cyclomatic, cognitive complexity, nesting depth) are precise values computed from AST analysis. Estimated metrics (branches) are heuristic approximations used for test planning and coverage predictions.
When to File Bug Reports
File a bug report when:
- Parse errors on valid syntax
- Crashes or panics
- Incorrect complexity calculations
- Concurrency errors
Before filing:
- Check this troubleshooting guide
- Try --semantic-off fallback mode
- Update to the latest version
- Search existing GitHub issues
See Also
- Quick Fixes - Common problems with immediate solutions
- Debug Mode - Verbosity and diagnostics
- Error Messages Reference - Error explanations