Introduction
🚧 Early Prototype - This project is under active development and APIs may change
Debtmap is a code complexity and technical debt analyzer that identifies which code to refactor for maximum cognitive debt reduction and which code to test for maximum risk reduction.
What is Debtmap?
Unlike traditional static analysis tools that simply flag complex code, Debtmap answers two critical questions:
- “What should I refactor to reduce cognitive burden?” - Identifies overly complex code that slows down development
- “What should I test first to reduce the most risk?” - Pinpoints untested complex code that threatens stability
Debtmap analyzes your codebase to identify complexity hotspots, technical debt patterns, and architectural risks. It supports Rust, Python, JavaScript, and TypeScript with full AST parsing and analysis capabilities. Rust includes additional advanced features like macro expansion and trait tracking.
What Makes Debtmap Different:
- Coverage-Risk Correlation: Combines complexity metrics with test coverage to identify genuinely risky code (high complexity + low coverage = critical risk)
- Multi-Factor Analysis: Analyzes complexity, coverage, dependencies, and call graphs for comprehensive prioritization
- Reduced False Positives: Uses entropy analysis and pattern detection to distinguish genuinely complex code from repetitive patterns, reducing false positives by up to 70% compared to traditional complexity-only analyzers. This is achieved through an advanced token classification system that categorizes code tokens and applies weighted entropy to accurately assess complexity.
- Actionable Guidance: Provides specific recommendations like “extract nested conditions” or “split this 80-line function” with quantified impact metrics
- Performance: Significantly faster than Java/Python-based competitors (written in Rust with parallel processing)
Why Use Debtmap?
Debtmap helps you make data-driven decisions about where to focus your refactoring and testing efforts:
- Identify Complexity - Find complex functions and modules that need refactoring, with concrete metrics showing which changes will have the most impact
- Detect Technical Debt - Discover 30+ debt patterns including code smells, security vulnerabilities, resource management issues, and architectural problems
- Assess Risk - Prioritize improvements based on sophisticated risk scoring that combines complexity, test coverage, and dependency impact
- Track Quality - Monitor code quality metrics over time with the compare command (which can use --plan to automatically extract target locations from implementation plans and track improvements) to verify that refactoring efforts achieved their goals
- Get Actionable Recommendations - Receive specific guidance like “refactoring this will reduce complexity by 60%” or “testing this will reduce risk by 5%”
- Automated Debt Reduction - Integrates with Prodigy workflows for AI-driven automated refactoring with iterative validation and testing (via external integration)
Key Features
Analysis Capabilities
- Multi-language support - Full support for Rust, Python, JavaScript, and TypeScript with AST parsing, complexity analysis, and debt detection
- Reduced false positives - Uses entropy analysis and pattern detection to distinguish genuinely complex code from repetitive patterns (up to 70% reduction compared to traditional analyzers)
- Threshold presets - Quick setup with strict, balanced (default), or lenient presets matching different project types and quality standards
- Comprehensive debt detection - Identifies 30+ technical debt patterns across security (5 types), code organization (god objects, feature envy, magic values), resource management (5 types), testing quality (3 types), and error handling (4 types: error swallowing, poor error propagation, panic patterns, inadequate exception handling)
- Security vulnerability detection - Finds hardcoded secrets, weak crypto, SQL injection risks, and unsafe code patterns
- Resource management analysis - Identifies inefficient allocations, nested loops, and blocking I/O patterns
- Code organization analysis - Detects god objects, feature envy, primitive obsession, and magic values
- Testing quality assessment - Analyzes test complexity, flaky patterns, and assertion quality
- File-level aggregation - Multiple aggregation methods (sum, weighted, logarithmic) for identifying files needing organizational refactoring
- Context-aware analysis - Reduces false positives through intelligent context detection (enabled by default)
Risk Analysis & Prioritization
- Coverage-based risk analysis - Correlates complexity with test coverage to identify truly risky code
- Risk-driven testing recommendations - Prioritizes testing efforts based on complexity-coverage correlation and dependency impact
- Call graph analysis - Tracks upstream callers and downstream callees to understand dependency impact
- Tiered prioritization - Multi-stage pipeline (zero coverage, complexity-risk, critical path, dependency impact, effort optimization) surfaces critical architectural issues above simple testing gaps
- Quantified impact - Shows concrete metrics like “refactoring this will reduce complexity by 60%”
Performance & Output
- Parallel processing - Built with Rust and Rayon for blazing-fast analysis of large codebases
- Multiple output formats - JSON (legacy and unified structures), Markdown, and human-readable terminal formats for different tool integration needs
- Configurable thresholds - Customize complexity and duplication thresholds to match your standards
- Verbosity controls - Multiple verbosity levels (-v, -vv, -vvv) for progressive detail
Configuration & Customization
- Flexible suppression - Inline comment-based suppression for specific code sections
- Configuration file - .debtmap.toml or debtmap.toml (TOML/JSON/YAML formats supported) for project-specific settings
- Test-friendly - Easily exclude test fixtures and example code from debt analysis
- Macro expansion support - Handles Rust macro expansions with configurable warnings
Commands
- analyze - Comprehensive debt analysis with unified prioritization
- validate - Enforce quality thresholds in CI/CD pipelines
- validate-improvement - Validate refactoring improvements against quality thresholds (automates verification that changes actually improved code quality)
- compare - Track improvements over time and verify refactoring goals
- init - Generate configuration file with sensible defaults (--force to overwrite)
Target Audience
Debtmap is designed for:
- Development teams - Get concrete metrics for planning sprints. Know exactly which refactoring will reduce complexity by 60% or which function needs 6 unit tests for full coverage.
- Engineering managers - Track quality trends over time with the compare command. Monitor whether refactoring efforts are actually improving codebase health.
- Code reviewers - Focus reviews on high-risk areas identified by Debtmap. Prioritize reviewing untested complex code over simple utility functions.
- CI/CD engineers - Enforce quality gates with the validate command. Automate refactoring verification with validate-improvement for continuous quality monitoring in pipelines.
- Developers refactoring legacy codebases - Receive actionable guidance like “extract nested conditions”, “split this 80-line function into 3 smaller functions”, or “add error handling for this catch block”.
Getting Started
Ready to analyze your codebase? Check out:
- Getting Started - Installation and first analysis
- Analysis Guide - Understanding the metrics and output
- Output Formats - JSON, Markdown, and terminal formats
Tip: Start with debtmap analyze . --summary for a quick overview, or use --top 10 to focus on highest-priority items. Consider --threshold-preset balanced for typical projects or strict for high-quality standards.
Why Debtmap?
Technical debt analysis tools are everywhere. So why another one? Debtmap takes a fundamentally different approach to code quality analysis—one that reduces false positives and gives you actionable insights instead of just flagging “complex” code.
The Problem with Traditional Static Analysis
Most static analysis tools flag code as “complex” based purely on metrics like cyclomatic complexity or lines of code. The problem? Not all complexity is equal.
Consider this common pattern:
fn validate_config(config: &Config) -> Result<()> {
    if config.output_dir.is_none() {
        return Err(anyhow!("output_dir required"));
    }
    if config.max_workers.is_none() {
        return Err(anyhow!("max_workers required"));
    }
    if config.timeout_secs.is_none() {
        return Err(anyhow!("timeout_secs required"));
    }
    if config.log_level.is_none() {
        return Err(anyhow!("log_level required"));
    }
    if config.cache_dir.is_none() {
        return Err(anyhow!("cache_dir required"));
    }
    // ... 15 more similar checks
    Ok(())
}
Traditional tools say: “Cyclomatic complexity: 20 - CRITICAL! Refactor immediately!”
Reality: This is a simple validation function with a repetitive pattern. Yes, it has 20 branches, but they’re all identical in structure. An experienced developer can read and understand this in seconds.
Now compare with this function:
fn reconcile_state(current: &State, desired: &State) -> Vec<Action> {
    let mut actions = vec![];
    match (current.mode, desired.mode) {
        (Mode::Active, Mode::Standby) => {
            if current.has_active_connections() {
                actions.push(Action::DrainConnections);
                actions.push(Action::WaitForDrain);
            }
            actions.push(Action::TransitionToStandby);
        }
        (Mode::Standby, Mode::Active) => {
            if desired.requires_warmup() {
                actions.push(Action::Warmup);
            }
            actions.push(Action::TransitionToActive);
        }
        (Mode::Active, Mode::Maintenance) => {
            // Complex state transitions based on multiple conditions
            if current.has_pending_operations() {
                if desired.force_maintenance {
                    actions.push(Action::AbortPending);
                } else {
                    actions.push(Action::FinishPending);
                }
            }
            actions.push(Action::TransitionToMaintenance);
        }
        // ... more complex state transitions
        _ => {}
    }
    actions
}
Traditional tools say: “Cyclomatic complexity: 8 - moderate”
Reality: This function involves complex state machine logic with conditional transitions, side effects, and non-obvious control flow. It’s genuinely complex and error-prone.
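The key insight: Traditional metrics rank these two functions backwards. The repetitive validator scores 20 and is flagged as critical, while the state machine scores 8 and passes as moderate, even though the second carries far more cognitive load and risk.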
Debtmap’s Unique Approach
1. Entropy-Based Complexity Analysis
Debtmap uses information theory to distinguish between genuinely complex code and repetitive pattern-based code.
How it works:
- Calculate the variety of code patterns in a function
- High variety (many different patterns) = high entropy = genuinely complex
- Low variety (repetitive patterns) = low entropy = simple despite high branch count
Applied to our examples:
validate_config():
- Cyclomatic complexity: 20
- Pattern entropy: 0.3 (low - all branches identical)
- Entropy-adjusted complexity: 5
- Assessment: Low risk despite high branch count
reconcile_state():
- Cyclomatic complexity: 8
- Pattern entropy: 0.85 (high - diverse conditional logic)
- Entropy-adjusted complexity: 9
- Assessment: High risk - genuinely complex logic
This approach significantly reduces false positives compared to traditional cyclomatic complexity metrics by recognizing that repetitive patterns are easier to understand than diverse, complex logic.
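The idea can be pictured with a small sketch of normalized Shannon entropy over a function's branch patterns. This is only an illustration under stated assumptions: the pattern labels, the `normalized_entropy` and `dampened_complexity` names, and the 0.25 floor are all hypothetical, and Debtmap's actual token classification and weighting are not reproduced here, so the numbers will not match the figures quoted above.

```rust
use std::collections::HashMap;

/// Normalized Shannon entropy of a sequence of pattern labels, in 0.0..=1.0.
/// 0.0 means every label is the same; 1.0 means every label is different.
/// (Hypothetical helper, not Debtmap's implementation.)
fn normalized_entropy(patterns: &[&str]) -> f64 {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for p in patterns {
        *counts.entry(*p).or_insert(0) += 1;
    }
    // Zero or one distinct pattern carries no information.
    if counts.len() < 2 {
        return 0.0;
    }
    let total = patterns.len() as f64;
    let entropy: f64 = counts
        .values()
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum();
    // Divide by the maximum entropy achievable with this many labels.
    entropy / total.log2()
}

/// Assumed dampening rule: scale cyclomatic complexity between 25% and 100%
/// of its raw value depending on entropy. The 0.25 floor is an arbitrary choice.
fn dampened_complexity(cyclomatic: u32, entropy: f64) -> f64 {
    cyclomatic as f64 * (0.25 + 0.75 * entropy.clamp(0.0, 1.0))
}
```

Under this sketch, twenty identical guard clauses have entropy 0.0 and a raw complexity of 20 is dampened to 5, while a function whose branches all differ keeps its full score.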
2. Coverage-Risk Correlation
Debtmap is the only Rust analysis tool that natively combines code complexity with test coverage to compute risk scores.
Why this matters:
- Complex code with good tests = managed risk
- Simple code without tests = unmanaged risk (but low priority)
- Complex code without tests = CRITICAL gap
Example:
// Function A: Complex but well-tested
fn parse_query(sql: &str) -> Result<Query> {
    // Complexity: 15, Coverage: 95%
    // Risk Score: 3.2 (moderate - complexity managed by tests)
}
// Function B: Moderate complexity, no tests
fn apply_migrations(db: &mut Database) -> Result<()> {
    // Complexity: 8, Coverage: 0%
    // Risk Score: 8.9 (critical - untested with moderate complexity)
}
Debtmap integrates with LCOV coverage data to automatically prioritize Function B over Function A, even though A is more complex. This is because the risk is about untested complexity, not just complexity alone.
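A minimal sketch of why Function B outranks Function A, assuming the simplest possible model: risk is complexity multiplied by the uncovered fraction. The `FunctionMetrics` struct and both functions are hypothetical, and this formula is not Debtmap's scoring, so it produces different absolute numbers than the 3.2 and 8.9 shown above. Only the ordering is the point.

```rust
/// Per-function inputs for the sketch (hypothetical type).
#[derive(Debug, Clone)]
struct FunctionMetrics {
    name: &'static str,
    complexity: u32,
    /// Fraction of the function covered by tests, 0.0..=1.0.
    coverage: f64,
}

/// Assumed model: complexity weighted by how much of it is untested.
fn risk_score(f: &FunctionMetrics) -> f64 {
    let uncovered = 1.0 - f.coverage.clamp(0.0, 1.0);
    f.complexity as f64 * uncovered
}

/// Sort functions so the riskiest comes first.
fn rank_by_risk(mut functions: Vec<FunctionMetrics>) -> Vec<FunctionMetrics> {
    functions.sort_by(|a, b| risk_score(b).total_cmp(&risk_score(a)));
    functions
}
```

With complexity 15 at 95% coverage and complexity 8 at 0% coverage, the untested function sorts first even though it is the simpler of the two.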
What makes this unique:
Debtmap is the only Rust-focused tool that natively combines complexity analysis with LCOV coverage data to compute risk scores. While other tools support coverage reporting, they don’t correlate it with complexity metrics to prioritize technical debt and testing efforts.
3. Actionable Recommendations
Most tools tell you what is wrong. Debtmap tells you what to do about it and what impact it will have.
Compare:
SonarQube:
Function 'process_request' has complexity 15 (threshold: 10)
Severity: Major
Debtmap:
#1 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/handlers.rs:127 process_request()
├─ ACTION: Add 8 unit tests for full coverage
├─ IMPACT: -5.2 risk reduction
├─ WHY: Complex logic (cyclo=15) with 0% coverage
└─ SUGGEST: Extract validation to separate functions, test each independently
Debtmap tells you:
- Specific location (file:line)
- Quantified gap (8 missing tests)
- Expected impact (-5.2 risk reduction)
- Rationale (complexity + no coverage)
- Refactoring suggestions (extract functions)
4. Context-Aware Analysis
Debtmap understands that not all code needs the same level of scrutiny.
Entry Points: Main functions, CLI handlers, and framework integration points are typically tested via integration tests, not unit tests. Debtmap’s analysis accounts for this:
// Entry point - flagged as low priority for unit test coverage
fn main() {
    // Debtmap: "Integration test coverage expected - low priority"
}
// Core business logic - flagged as high priority
fn calculate_risk_score(metrics: &Metrics) -> f64 {
    // Debtmap: "High complexity + low coverage = CRITICAL"
}
Call Graph Analysis: Debtmap traces function dependencies to prioritize functions called by many untested paths:
parse_input() [untested]
├─ called by: main() [integration tested]
└─ called by: process_batch() [untested]
Priority: HIGH (called from untested code path)
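The call graph example above could be modeled roughly as follows. This is a sketch only: the `CallGraph` type, its fields, and the "untested function with at least one untested caller" rule are assumptions made for illustration, not Debtmap's internal data structures or its real prioritization logic.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical call graph keyed by function name.
struct CallGraph<'a> {
    /// Maps each function to the functions that call it (upstream callers).
    callers: HashMap<&'a str, Vec<&'a str>>,
    /// Functions that are exercised by some test.
    tested: HashSet<&'a str>,
}

impl<'a> CallGraph<'a> {
    /// Callers of `function` that have no test coverage themselves.
    fn untested_callers(&self, function: &str) -> Vec<&'a str> {
        self.callers
            .get(function)
            .into_iter()
            .flatten()
            .copied()
            .filter(|caller| !self.tested.contains(caller))
            .collect()
    }

    /// Assumed rule: an untested function reached from untested code is high priority.
    fn is_high_priority(&self, function: &str) -> bool {
        !self.tested.contains(function) && !self.untested_callers(function).is_empty()
    }
}
```

For the graph shown above, parse_input has one untested caller (process_batch), so this rule marks it as high priority.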
5. Performance
Debtmap is written in Rust and uses parallel processing for analysis. Being a native Rust binary with no JVM overhead, it’s designed for fast local development workflow integration.
Typical analysis time:
- Small project (~10k LOC): 1-2 seconds
- Medium project (~50k LOC): 5-8 seconds
- Large project (~200k LOC): 20-30 seconds
This speed means you can run debtmap in your local development workflow without breaking flow, not just in CI.
What Problem Does Debtmap Solve?
Debtmap addresses a gap that existing tools don’t fill: quantified technical debt prioritization with actionable refactoring guidance.
The Gap in Existing Tools
| Tool Type | What It Does | What It Doesn’t Do |
|---|---|---|
| Linters (clippy, ESLint) | Find code style issues and common mistakes | Don’t quantify risk or prioritize by impact |
| Complexity Analyzers (SonarQube, CodeClimate) | Flag complex code | Don’t correlate with test coverage or provide refactoring impact estimates |
| Coverage Tools (tarpaulin, codecov) | Show what code is tested | Don’t identify which untested code is most risky |
Note: Debtmap is not a security scanner. Use tools like cargo-audit and cargo-geiger for security vulnerability detection. Debtmap focuses on technical debt prioritization, though complex untested code can sometimes harbor security issues.
What Debtmap uniquely provides:
- Quantified Debt Scoring - Not just “this is complex,” but “this scores 8.9/10 on risk”
- Coverage-Risk Correlation - Identifies untested complex code, not just complex code
- Impact Quantification - “Adding 6 tests will reduce risk by 3.7 points”
- Actionable Recommendations - Specific refactoring suggestions with effort estimates
- Dependency-Aware Prioritization - Prioritizes code that impacts many other functions
Debtmap vs Traditional Tools
SonarQube / CodeClimate:
- They say: “Function has complexity 15 (threshold exceeded)”
- Debtmap says: “Add 8 tests (-5.2 risk). Extract validation logic to reduce complexity by 60%”
Coverage Tools (tarpaulin, codecov):
- They say: “67% line coverage, 54% branch coverage”
- Debtmap says: “3 critical gaps: untested complex functions that are called from 12+ code paths”
Linters (clippy):
- They say: “Consider using Iterator::any() instead of a for loop”
- Debtmap says: “This function has high cognitive complexity (12) and is called by 8 untested modules - prioritize adding tests before refactoring”
When to Use Debtmap
Use Debtmap when you need to:
- Decide which technical debt to tackle first (limited time/resources)
- Identify critical testing gaps (high-complexity, zero-coverage code)
- Quantify the impact of refactoring efforts
- Reduce false positives from repetitive validation code
- Prioritize refactoring based on risk, not just complexity
- Get specific, actionable recommendations with effort estimates
Use other tools for different needs:
- clippy - Catch Rust idiom violations and common mistakes
- tarpaulin - Generate LCOV coverage data (Debtmap analyzes it)
- SonarQube - Multi-language analysis with centralized dashboards
Security is a separate concern:
- cargo-audit - Find known vulnerabilities in dependencies
- cargo-geiger - Detect unsafe code usage
- Debtmap doesn’t scan for security issues, though complex code may harbor security risks
Recommended Workflow
Debtmap works alongside existing tools, not instead of them:
# 1. Local development loop (before commit)
cargo fmt # Format code
cargo clippy # Check idioms and common issues
cargo test # Run tests
debtmap analyze . # Identify new technical debt
# 2. CI/CD pipeline (PR validation)
cargo test --all-features # Full test suite
cargo clippy -- -D warnings # Fail on warnings
debtmap validate . # Enforce debt thresholds
# 3. Weekly planning (prioritize work)
cargo tarpaulin --out lcov # Generate coverage
debtmap analyze . --lcov lcov.info --top 20
# Review top 20 debt items, plan sprint work
# 4. Monthly review (track trends)
debtmap analyze . --format json --output debt-$(date +%Y%m).json
debtmap compare --before debt-202410.json --after debt-202411.json
The Bottom Line
Debtmap isn’t a replacement for linters or coverage tools. It solves a different problem: turning raw complexity and coverage data into prioritized, actionable technical debt recommendations.
If you’re asking “Where should I focus my refactoring efforts?” or “Which code needs tests most urgently?”, that’s what Debtmap is built for.
Key Differentiators
- Entropy analysis - Reduces false positives from repetitive code
- Native coverage integration - Built-in LCOV support for risk scoring
- Actionable recommendations - Specific steps with quantified impact
- Context-aware - Understands entry points, call graphs, and testing patterns
- Fast - Rust performance for local development workflow
- Tiered prioritization - Critical/High/Moderate/Low classification with clear rationale
Next Steps
Ready to try it? Head to Getting Started to install debtmap and run your first analysis.
Want to understand how it works under the hood? See Architecture for the analysis pipeline.
Have questions? Check the FAQ for common questions and answers.
Getting Started
This guide will help you install Debtmap and run your first analysis in just a few minutes.
Prerequisites
Before installing Debtmap, you’ll need:
- For pre-built binaries: No prerequisites! The install script handles everything.
- For cargo install or building from source:
  - Rust toolchain (rustc and cargo)
  - Supported platforms: Linux, macOS, Windows
  - Rust edition 2021 or later
Optional (for coverage-based risk analysis):
- Rust projects: cargo-tarpaulin for coverage data
- JavaScript/TypeScript: Jest or other tools generating LCOV format
- Python: pytest with coverage plugin
Installation
Quick Install (Recommended)
Install the latest release with a single command:
curl -sSL https://raw.githubusercontent.com/iepathos/debtmap/master/install.sh | bash
Or with wget:
wget -qO- https://raw.githubusercontent.com/iepathos/debtmap/master/install.sh | bash
This will:
- Automatically detect your OS and architecture
- Download the appropriate pre-built binary from the latest GitHub release
- Install debtmap to ~/.cargo/bin if it exists, otherwise ~/.local/bin
- Offer to automatically add the install directory to your PATH if needed
Using Cargo
If you have Rust installed:
cargo install debtmap
From Source
For the latest development version:
# Clone the repository
git clone https://github.com/iepathos/debtmap.git
cd debtmap
# Build and install
cargo install --path .
Verify Installation
After installation, verify Debtmap is working:
# Check version
debtmap --version
# See available commands
debtmap --help
Troubleshooting
Installation Issues
- Binary not in PATH: Add ~/.cargo/bin or ~/.local/bin to your PATH
  export PATH="$HOME/.cargo/bin:$PATH" # Add to ~/.bashrc or ~/.zshrc
- Permission issues: Run the install script with your current user (don’t use sudo)
- Cargo not found: Install Rust from https://rustup.rs
First Run Issues
- Empty output or no items found: Check that your project contains supported source files (.rs, .py, .js, .ts, .tsx). Verify with debtmap analyze . -vvv for debug output.
- Parser failures: If analysis fails with parsing errors:
  # Run with verbose output to identify problematic files
  debtmap analyze . -vv
  # Exclude problematic files temporarily
  debtmap init # Creates .debtmap.toml
  # Edit .debtmap.toml to add ignore patterns
- Performance issues: For large codebases (10,000+ files):
  # Limit parallel jobs to reduce memory usage
  debtmap analyze . --jobs 4
  # Or analyze specific directories
  debtmap analyze ./src
Quick Start
Here are the most common commands to get you started:
# Analyze current directory (simplest command)
debtmap analyze .
# Analyze with coverage data for risk scoring (recommended)
# Note: --lcov is a shorthand alias for --coverage-file
debtmap analyze . --lcov target/coverage/lcov.info
# Generate coverage first (for Rust projects)
cargo tarpaulin --out lcov --output-dir target/coverage
debtmap analyze . --lcov target/coverage/lcov.info
# Analyze with custom thresholds
# Note: threshold-duplication specifies minimum lines of duplicated code to detect
debtmap analyze ./src --threshold-complexity 15 --threshold-duplication 50
# Output as JSON (for CI/CD integration)
debtmap analyze ./src --format json --output report.json
# Show only top 10 high-priority issues
debtmap analyze . --top 10
# Initialize configuration file for project-specific settings
debtmap init
# Validate against thresholds (CI/CD integration)
debtmap validate ./src --max-debt-density 5.0
# Validate using config file settings
# Note: CLI flags override config file values when both are specified
debtmap validate . --config .debtmap.toml --max-debt-density 5.0
# Compare before/after to track improvements
debtmap analyze . --format json --output before.json
# ... make improvements ...
debtmap analyze . --format json --output after.json
debtmap compare --before before.json --after after.json
# Advanced comparison: focus on specific function
debtmap compare --before before.json --after after.json --target-location src/main.rs:main:10
# Extract target from implementation plan
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
Advanced Options
Debtmap provides many powerful options to customize your analysis:
Verbosity Levels:
# Show main factors contributing to scores
debtmap analyze . -v
# Show detailed calculations
debtmap analyze . -vv
# Show all debug information
debtmap analyze . -vvv
Filtering and Prioritization:
# Only show high-priority items
debtmap analyze . --min-priority high
# Filter by specific categories
debtmap analyze . --filter Architecture,Testing
# Group results by debt category
debtmap analyze . --group-by-category
Performance Control:
# Limit parallel jobs
debtmap analyze . --jobs 4
# Disable parallel processing
debtmap analyze . --no-parallel
Output Control:
# Plain output (no colors/emoji, for CI/CD)
debtmap analyze . --plain
# Compact summary output
debtmap analyze . --summary
# Control aggregation behavior
debtmap analyze . --aggregate-only # Show only aggregated results
debtmap analyze . --no-aggregation # Skip aggregation entirely
debtmap analyze . --aggregation-method sum # Choose aggregation method
# Adjust detail level in output
debtmap analyze . --detail-level high # More detailed output
Expert Options:
These advanced options are available for power users and specialized use cases:
# Analysis behavior
--semantic-off # Disable semantic analysis
--no-context-aware # Disable context-aware analysis
--multi-pass # Enable multi-pass analysis for deeper insights
--validate-loc # Validate lines of code calculations
# Rust-specific options
--verbose-macro-warnings # Show detailed macro expansion warnings
--show-macro-stats # Display macro usage statistics
# Filtering and thresholds
--threshold-preset <name> # Use predefined threshold preset
--min-problematic <count> # Minimum problematic items to report
--max-files <count> # Limit analysis to N files
--no-god-object # Disable god object detection
# Advanced reporting
--attribution # Include code attribution information
For detailed documentation of these options, run debtmap analyze --help.
First Analysis
Let’s run your first analysis! Navigate to a project directory and run:
debtmap analyze .
What happens during analysis:
- File Discovery - Debtmap scans your project for supported source files (Rust, Python, JavaScript, TypeScript)
- Parsing - Each file is parsed into an Abstract Syntax Tree (AST)
- Metrics Calculation - Complexity, debt patterns, and risk scores are computed
- Prioritization - Results are ranked by priority (CRITICAL, HIGH, MEDIUM, LOW)
- Output - Results are displayed in your chosen format
Expected timing: Analyzing a 10,000 LOC project typically takes 2-5 seconds.
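The file discovery step listed above comes down to deciding which files belong to a supported language. The sketch below maps the extensions named in the troubleshooting notes (.rs, .py, .js, .ts, .tsx) to a language. The `Language` enum and `detect_language` function are hypothetical names for illustration and may not match how Debtmap itself classifies files.

```rust
use std::path::Path;

/// Languages named in this guide (hypothetical enum for the sketch).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Language {
    Rust,
    Python,
    JavaScript,
    TypeScript,
}

/// Map a file path to a supported language based on its extension.
/// Returns None for files that would be skipped during discovery.
fn detect_language(path: &Path) -> Option<Language> {
    match path.extension()?.to_str()? {
        "rs" => Some(Language::Rust),
        "py" => Some(Language::Python),
        "js" => Some(Language::JavaScript),
        "ts" | "tsx" => Some(Language::TypeScript),
        _ => None,
    }
}
```

A file such as src/main.rs maps to Rust, while README.md or an extensionless Makefile is skipped.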
Language Support
Debtmap supports multiple programming languages with varying feature completeness:
Rust (Full Support)
All analysis features available:
- Complexity metrics (cyclomatic, cognitive, nesting, lines)
- Technical debt detection (code smells, anti-patterns)
- Test gap analysis with coverage integration
- Advanced features:
  - Trait detection and analysis
  - Function purity analysis
  - Call graph generation
  - Macro expansion tracking
Python (Partial Support)
Core features available:
- Complexity metrics (cyclomatic, cognitive, nesting, lines)
- Basic debt detection (code smells, god objects)
- Test gap analysis with coverage integration
Not yet available: Purity analysis, detailed call graphs
JavaScript/TypeScript (Partial Support)
Core features available:
- Complexity metrics (cyclomatic, cognitive, nesting, lines)
- Basic debt detection (code smells, god objects)
- Test gap analysis with coverage integration
Not yet available: Purity analysis, detailed call graphs
Note: All languages benefit from coverage integration for accurate risk assessment. The core analysis workflow (complexity → debt patterns → prioritization) works consistently across all supported languages.
Example Output
When you run debtmap analyze ., you’ll see output similar to this:
════════════════════════════════════════════
PRIORITY TECHNICAL DEBT FIXES
════════════════════════════════════════════
🎯 TOP 3 RECOMMENDATIONS (by unified priority)
#1 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/analyzers/rust_call_graph.rs:38 add_function_to_graph()
├─ ACTION: Add 6 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk
├─ COMPLEXITY: cyclomatic=6, branches=6, cognitive=8, nesting=2, lines=32
├─ DEPENDENCIES: 0 upstream, 11 downstream
└─ WHY: Business logic with 0% coverage, manageable complexity (cyclo=6, cog=8)
#2 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/debt/smells.rs:196 detect_data_clumps()
├─ ACTION: Add 5 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk
├─ COMPLEXITY: cyclomatic=5, branches=5, cognitive=11, nesting=5, lines=31
├─ DEPENDENCIES: 0 upstream, 4 downstream
└─ WHY: Business logic with 0% coverage, manageable complexity (cyclo=5, cog=11)
#3 SCORE: 8.6 [CRITICAL]
├─ TEST GAP: ./src/risk/context/dependency.rs:247 explain()
├─ ACTION: Add 5 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.6 risk
├─ COMPLEXITY: cyclomatic=5, branches=5, cognitive=9, nesting=1, lines=24
├─ DEPENDENCIES: 0 upstream, 1 downstream
└─ WHY: Business logic with 0% coverage, manageable complexity (cyclo=5, cog=9)
📊 TOTAL DEBT SCORE: 4907
📈 OVERALL COVERAGE: 67.12%
Understanding the Output
Let’s break down what this output means:
Priority Levels
- CRITICAL (9.0-10.0): Immediate action required - high complexity with no test coverage
- HIGH (7.0-8.9): Should be addressed soon - moderate-high complexity with poor coverage
- MEDIUM (5.0-6.9): Plan for next sprint - moderate complexity or partial coverage gaps
- LOW (3.0-4.9): Nice to have - well-tested or simple functions
Note: These are default priority thresholds. You can customize them in .debtmap.toml under the [tiers] section to match your team’s standards.
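As a rough sketch, the default boundaries above (9.0, 7.0, 5.0) translate into a simple threshold lookup. The `Priority` enum, `Tiers` struct, and `classify` function are hypothetical names. The sketch assumes each boundary is inclusive and that anything below the medium boundary is LOW.

```rust
/// Priority levels as named in this guide (hypothetical enum).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Priority {
    Critical,
    High,
    Medium,
    Low,
}

/// Tier boundaries on the 0-10 unified score scale.
struct Tiers {
    critical: f64,
    high: f64,
    medium: f64,
}

impl Default for Tiers {
    /// Defaults taken from the priority levels listed above.
    fn default() -> Self {
        Tiers { critical: 9.0, high: 7.0, medium: 5.0 }
    }
}

/// Assign a priority by checking boundaries from highest to lowest.
fn classify(score: f64, tiers: &Tiers) -> Priority {
    if score >= tiers.critical {
        Priority::Critical
    } else if score >= tiers.high {
        Priority::High
    } else if score >= tiers.medium {
        Priority::Medium
    } else {
        Priority::Low
    }
}
```

Raising or lowering a boundary in `Tiers` shifts which scores land in each level, which is the effect customizing the thresholds is meant to have.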
Key Metrics
- Unified Score (0-10 scale): Overall priority combining complexity, coverage, and dependencies
  - Higher score = higher priority
  - Takes into account multiple risk factors
- Debt Type: Category of the issue
  - TestGap: Missing test coverage
  - Complexity: Exceeds complexity thresholds
  - Duplication: Repeated code blocks
  - CodeSmell: Anti-patterns and bad practices
- Complexity Metrics:
  - Cyclomatic: Number of decision points (branches, loops)
  - Cognitive: How difficult the code is to understand
  - Nesting: Maximum indentation depth
  - Lines: Function length
- Dependencies:
  - Upstream callers: Functions that call this function
  - Downstream callees: Functions this function calls
  - More dependencies = higher impact when this code breaks
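To give a feel for the cyclomatic and nesting metrics, here is a deliberately naive, text-based approximation. It is an assumption-laden sketch: Debtmap works on a parsed AST, whereas these hypothetical helpers just scan characters and words, so they miscount braces or keywords inside strings and comments, treat a whole match as one decision, and count the function body's own braces as the first nesting level.

```rust
/// Maximum brace nesting depth in a snippet (text-based approximation).
fn max_nesting_depth(source: &str) -> usize {
    let mut depth = 0usize;
    let mut max_depth = 0usize;
    for ch in source.chars() {
        match ch {
            '{' => {
                depth += 1;
                max_depth = max_depth.max(depth);
            }
            '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max_depth
}

/// Approximate cyclomatic complexity: 1 plus the number of branching keywords.
fn approx_cyclomatic(source: &str) -> usize {
    const BRANCH_KEYWORDS: [&str; 4] = ["if", "while", "for", "match"];
    let decisions = source
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .filter(|word| BRANCH_KEYWORDS.contains(word))
        .count();
    1 + decisions
}
```

A function with no branches scores 1, and each additional if, loop, or match adds one more path through the code.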
Recommendation Structure
Each recommendation shows:
- ACTION: What you should do (e.g., “Add 6 unit tests”)
- IMPACT: Expected improvement (e.g., “Full test coverage, -3.7 risk”)
- WHY: The reasoning behind this recommendation
Organizing Results
When analyzing large codebases, you can organize and filter results to focus on specific areas:
Group by Debt Category:
debtmap analyze . --group-by-category
This organizes results by type: Architecture, Testing, Performance, CodeQuality
Filter by Priority:
# Show only high and critical priority items
debtmap analyze . --min-priority high
# Combine with --top to limit results
debtmap analyze . --min-priority high --top 10
Filter by Category:
# Focus on specific debt types
debtmap analyze . --filter Architecture,Testing
# Available categories: Architecture, Testing, Performance, CodeQuality
These filtering options help you focus on specific types of technical debt, making it easier to plan targeted improvements.
Summary Statistics
- Total Debt Score: Sum of all debt scores across your codebase
  - Lower is better
  - Track over time to measure improvement
- Overall Coverage: Percentage of code covered by tests
  - Only shown when coverage data is provided
Output Formats
Debtmap supports multiple output formats:
- Terminal (default): Human-readable colored output with tables
- JSON: Machine-readable format for CI/CD integration
- Markdown: Documentation-friendly format for reports
Generating JSON output:
# By default, JSON uses legacy format
debtmap analyze . --format json --output report.json
# For the new unified format (with consistent structure and type field):
debtmap analyze . --format json --output-format unified --output report.json
JSON Format Options:
- legacy (default): Uses {File: {...}} and {Function: {...}} wrappers for backward compatibility with existing tools
- unified: New format (spec 108) with consistent structure and type field for all items
Recommendation: Use unified for new integrations, legacy only for compatibility with existing tooling.
Example Markdown output:
debtmap analyze . --format markdown --output report.md
What’s Next?
Now that you’ve run your first analysis, explore these topics:
- Analysis Guide - Deep dive into complexity metrics, debt patterns, and risk scoring
- Output Formats - Detailed guide to JSON schema and integration options
- Configuration - Customize thresholds and filters with .debtmap.toml
- CI/CD Integration - Use the validate command to enforce quality gates
Generate a Configuration File
Create a project-specific configuration:
debtmap init
This creates a .debtmap.toml file with sensible defaults that you can customize for your project.
Example Configuration File:
Here’s a typical .debtmap.toml with common settings:
# Debtmap Configuration File
# Generated by: debtmap init
# Analysis thresholds
[thresholds]
complexity = 10 # Flag functions with cyclomatic complexity > 10
duplication = 40 # Minimum lines for duplicate code detection
file_size = 500 # Warn on files exceeding 500 lines
# Priority tier boundaries (for unified priority scores 0-10)
[tiers]
critical = 9.0 # Score >= 9.0 = CRITICAL priority
high = 7.0 # Score >= 7.0 = HIGH priority
medium = 5.0 # Score >= 5.0 = MEDIUM priority
# Anything below 5.0 = LOW priority
# Risk scoring weights (must sum to 1.0)
[weights]
coverage = 0.4 # Weight for test coverage gaps
complexity = 0.35 # Weight for complexity metrics
dependencies = 0.25 # Weight for call graph dependencies
# Language-specific settings
[languages]
rust = true # Enable Rust analysis
python = true # Enable Python analysis
javascript = true # Enable JavaScript/TypeScript analysis
# Files and directories to ignore
[ignore]
patterns = [
"**/target/**", # Rust build artifacts
"**/node_modules/**", # JavaScript dependencies
"**/__pycache__/**", # Python bytecode
"**/tests/**", # Test directories (optional)
"**/*.test.ts", # Test files (optional)
]
# God object detection thresholds
[god_object]
methods = 20 # Warn on classes/modules with > 20 methods
lines = 500 # Warn on classes/modules with > 500 lines
# Entropy-based complexity detection
[entropy]
enabled = true # Enable entropy analysis
threshold = 0.7 # Flag functions with entropy > 0.7
Key Configuration Options:
The configuration file allows you to customize:
- Threshold customization - Adjust complexity, duplication, and file size thresholds
- Scoring weights - Fine-tune how coverage, complexity, and dependencies are weighted
- Language selection - Enable/disable specific language analyzers
- Ignore patterns - Exclude test files or generated code from analysis
- God object thresholds - Configure what constitutes a “god object” anti-pattern
- Entropy analysis - Control entropy-based complexity detection
- Priority tiers - Customize CRITICAL/HIGH/MEDIUM/LOW threshold ranges
See the Configuration chapter for complete documentation of all available options.
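The tier boundaries in the example configuration behave like a simple threshold ladder. The following is a minimal sketch of that reading, assuming the example values (9.0, 7.0, 5.0); the `Tier` enum and `classify` function are hypothetical names for illustration, not part of Debtmap's API.

```rust
/// Priority tiers, mirroring the [tiers] table in the example configuration.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Tier {
    Critical,
    High,
    Medium,
    Low,
}

/// Hypothetical helper: maps a unified priority score (0-10) to a tier
/// using the boundaries from the example file (9.0 / 7.0 / 5.0).
fn classify(score: f64) -> Tier {
    if score >= 9.0 {
        Tier::Critical
    } else if score >= 7.0 {
        Tier::High
    } else if score >= 5.0 {
        Tier::Medium
    } else {
        // Anything below the medium boundary is LOW priority.
        Tier::Low
    }
}
```

Because each boundary is inclusive, raising `high` from 7.0 to 7.5 would move a score of 7.2 from HIGH down to MEDIUM.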
Try Analysis with Coverage
For more accurate risk assessment, run analysis with coverage data. Coverage helps Debtmap identify truly risky code - functions that are both complex AND untested.
Generating Coverage Data
Rust Projects:
# Install cargo-tarpaulin if not already installed
cargo install cargo-tarpaulin
# Generate LCOV coverage report
cargo tarpaulin --out lcov --output-dir target/coverage
# Run debtmap with coverage
debtmap analyze . --lcov target/coverage/lcov.info
Python Projects:
# Install coverage plugin if not already installed
pip install pytest-cov
# Generate LCOV coverage report
pytest --cov=. --cov-report=lcov:coverage.lcov
# Run debtmap with coverage
debtmap analyze . --lcov coverage.lcov
JavaScript/TypeScript Projects:
# Using Jest (add to package.json or run directly)
jest --coverage --coverageReporters=lcov
# Run debtmap with coverage
debtmap analyze . --lcov coverage/lcov.info
# Using NYC with other test runners
npx nyc --reporter=lcovonly npm test
debtmap analyze . --lcov coverage/lcov.info
# Using Vitest
vitest run --coverage --coverage.reporter=lcov
debtmap analyze . --lcov coverage/lcov.info
Note: The --lcov flag is a shorthand alias for --coverage-file. Both accept LCOV format coverage reports.
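If a coverage run produces unexpected results, it can help to look inside the LCOV report itself. An LCOV file is plain text: each source file section starts with an SF: record, each instrumented line appears as DA:<line>,<hits>, and end_of_record closes the section. The sketch below computes per-file line coverage from those records. It is an illustration only, not Debtmap's own parser, and `line_coverage` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Returns (covered_lines, instrumented_lines) per source file, computed
/// from the `SF:` and `DA:<line>,<hits>` records of an LCOV report.
/// Hypothetical helper for illustration; not Debtmap's parser.
fn line_coverage(lcov: &str) -> BTreeMap<String, (u32, u32)> {
    let mut per_file = BTreeMap::new();
    let mut current: Option<String> = None;
    for line in lcov.lines().map(str::trim) {
        if let Some(path) = line.strip_prefix("SF:") {
            // Start of a new source file section.
            current = Some(path.to_string());
        } else if let Some(rest) = line.strip_prefix("DA:") {
            // DA:<line number>,<execution count>[,<checksum>]
            let hits = rest.split(',').nth(1).and_then(|h| h.parse::<u64>().ok());
            if let (Some(file), Some(hits)) = (current.as_ref(), hits) {
                let entry = per_file.entry(file.clone()).or_insert((0u32, 0u32));
                entry.1 += 1; // one more instrumented line
                if hits > 0 {
                    entry.0 += 1; // the line was executed at least once
                }
            }
        } else if line == "end_of_record" {
            current = None;
        }
    }
    per_file
}
```

A file that reports (0, n) here is fully untested, which is the case where a complex function carries the most risk.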
Need help? Report issues at https://github.com/iepathos/debtmap/issues
CLI Reference
Complete reference for Debtmap command-line interface.
Quick Start
# Basic analysis
debtmap analyze src/
# With coverage integration
debtmap analyze src/ --coverage-file coverage.lcov
# Generate JSON report
debtmap analyze . --format json --output report.json
# Show top 10 priority items only
debtmap analyze . --top 10 --min-priority high
# Initialize configuration and validate
debtmap init
debtmap validate . --config debtmap.toml
Commands
Debtmap provides six main commands: five for analysis and validation, plus one debugging tool.
analyze
Analyze code for complexity and technical debt.
Usage:
debtmap analyze <PATH> [OPTIONS]
Arguments:
- <PATH> - Path to analyze (file or directory)
Description: Primary command for code analysis. Supports multiple output formats (json, markdown, terminal), coverage file integration, parallel processing, context-aware risk analysis, and comprehensive filtering options.
See Options section below for all available flags.
init
Initialize a Debtmap configuration file.
Usage:
debtmap init [OPTIONS]
Options:
- -f, --force - Force overwrite existing config
Description:
Creates a debtmap.toml configuration file in the current directory with default settings. Use --force to overwrite an existing configuration file.
validate
Validate code against thresholds defined in configuration file.
Usage:
debtmap validate <PATH> [OPTIONS]
Arguments:
- <PATH> - Path to analyze
Options:
Configuration & Output:
- -c, --config <CONFIG> - Configuration file path
- -f, --format <FORMAT> - Output format: json, markdown, terminal
- -o, --output <OUTPUT> - Output file path (defaults to stdout)
Coverage & Context:
- --coverage-file <PATH> / --lcov <PATH> - LCOV coverage file for risk analysis
- --context / --enable-context - Enable context-aware risk analysis
- --context-providers <PROVIDERS> - Context providers to use (comma-separated)
- --disable-context <PROVIDERS> - Disable specific context providers
Thresholds & Validation:
- --max-debt-density <N> - Maximum debt density allowed (per 1000 LOC)
Display Filtering:
- --top <N> / --head <N> - Show only top N priority items
- --tail <N> - Show only bottom N priority items
- -s, --summary - Use summary format with tiered priority display
Analysis Control:
- --semantic-off - Disable semantic analysis (fallback mode)
Performance Control:
- --no-parallel - Disable parallel call graph construction (enabled by default)
- -j, --jobs <N> - Number of threads for parallel processing (0 = use all cores)
  - Can also use DEBTMAP_JOBS environment variable
Debugging & Verbosity:
- -v, --verbose - Increase verbosity level (can be repeated: -v, -vv, -vvv)
Description:
Similar to analyze but enforces thresholds defined in configuration file. Returns non-zero exit code if thresholds are exceeded, making it suitable for CI/CD integration.
The validate command supports a focused subset of analyze options, primarily for output control, coverage integration, context-aware analysis, and display filtering.
Note: The following analyze options are NOT available in the validate command:
- --threshold-complexity, --threshold-duplication, --threshold-preset (configure these in .debtmap.toml instead)
- --languages (language filtering)
Note: The --explain-score flag exists in the validate command but is deprecated (hidden). Use -v, -vv, or -vvv for verbosity instead.
Configure analysis thresholds in your .debtmap.toml configuration file for use with the validate command.
Exit Codes:
- 0 - Success (no errors, all thresholds passed)
- Non-zero - Failure (errors occurred or thresholds exceeded)
compare
Compare two analysis results and generate a diff report.
Usage:
debtmap compare --before <FILE> --after <FILE> [OPTIONS]
Required Options:
- --before <FILE> - Path to “before” analysis JSON
- --after <FILE> - Path to “after” analysis JSON
Optional Target Location:
- --plan <FILE> - Path to implementation plan (to extract target location)
- --target-location <LOCATION> - Target location in format file:function:line
Note: --plan and --target-location are mutually exclusive options. Using both together will cause a CLI error:
error: the argument '--plan <FILE>' cannot be used with '--target-location <LOCATION>'
Use one or the other to specify the target location.
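The file:function:line format can be illustrated with a small parser. This is a sketch under stated assumptions: `TargetLocation` and `parse_target_location` are hypothetical names, and Debtmap's own parsing may differ, for example in how it handles function names that contain colons.

```rust
/// Hypothetical parsed form of a `file:function:line` target location.
#[derive(Debug, PartialEq)]
struct TargetLocation {
    file: String,
    function: String,
    line: u32,
}

/// Splits from the right so that a colon inside the file path
/// (for example a Windows drive letter) stays part of `file`.
/// Returns None when a part is missing or the line is not a number.
fn parse_target_location(s: &str) -> Option<TargetLocation> {
    let mut parts = s.rsplitn(3, ':');
    let line = parts.next()?.parse().ok()?;
    let function = parts.next()?.to_string();
    let file = parts.next()?.to_string();
    if file.is_empty() || function.is_empty() {
        return None;
    }
    Some(TargetLocation { file, function, line })
}
```

With this reading, "src/main.rs:process_file:42" names the function process_file at line 42 of src/main.rs.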
Output Options:
- -f, --format <FORMAT> - Output format: json, markdown, terminal (default: json)
- -o, --output <OUTPUT> - Output file (defaults to stdout)
Description: Compares two analysis results and generates a diff showing improvements or regressions in code quality metrics.
validate-improvement
Validate that technical debt improvements meet quality thresholds.
Usage:
debtmap validate-improvement --comparison <FILE> [OPTIONS]
Required Options:
- --comparison <FILE> - Path to comparison JSON file (from debtmap compare)
Optional Options:
- -o, --output <FILE> - Output file path for validation results (default: .prodigy/debtmap-validation.json)
- --previous-validation <FILE> - Path to previous validation result for trend tracking
- --threshold <N> - Improvement threshold percentage (default: 75.0)
- -f, --format <FORMAT> - Output format: json, markdown, terminal (default: json)
- --quiet - Suppress console output (useful for automation)
Description:
Validates improvement quality by analyzing comparison output from debtmap compare. Calculates a composite improvement score based on:
- Target item improvement (50% weight)
- Overall project health (30% weight)
- Absence of regressions (20% weight)
When --previous-validation is provided, tracks progress trends across multiple attempts and provides recommendations for continuing or adjusting the improvement approach.
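The weighting described above can be sketched as a weighted sum. This is an illustration under assumptions, not Debtmap's implementation: it assumes each component is expressed as a percentage from 0 to 100 and that validation passes when the composite score reaches the --threshold value. The `Components` struct and both function names are hypothetical.

```rust
/// Component scores, each assumed to be a percentage in 0.0..=100.0.
/// Hypothetical type for illustration.
struct Components {
    target_improvement: f64,
    project_health: f64,
    no_regressions: f64,
}

/// Weighted sum using the documented weights: 50% / 30% / 20%.
fn composite_score(c: &Components) -> f64 {
    0.5 * c.target_improvement + 0.3 * c.project_health + 0.2 * c.no_regressions
}

/// Assumption: the check passes when the score meets or exceeds the threshold.
fn passes(c: &Components, threshold: f64) -> bool {
    composite_score(c) >= threshold
}
```

For example, components of 80, 70, and 100 give 40 + 21 + 20 = 81, which clears the default threshold of 75.0 but not a stricter threshold of 85.0.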
Example:
# Basic validation
debtmap validate-improvement --comparison comparison.json
# With trend tracking and custom threshold
debtmap validate-improvement \
--comparison .prodigy/comparison.json \
--previous-validation .prodigy/validation.json \
--output .prodigy/validation.json \
--threshold 80.0
explain-coverage (Debugging)
Explain coverage detection for a specific function.
Usage:
debtmap explain-coverage <PATH> --coverage-file <FILE> --function <NAME> [OPTIONS]
Arguments:
- <PATH> - Path to the codebase to analyze
Required Options:
- --coverage-file <FILE> / --lcov <FILE> - LCOV coverage file
- --function <NAME> - Function name to explain (e.g., “create_auto_commit”)
Optional Options:
- --file <PATH> - File path containing the function (helps narrow search)
- -v, --verbose - Show all attempted matching strategies
- -f, --format <FORMAT> - Output format: text or json (default: text)
Description: Debugging tool that explains how coverage detection works for a specific function. Shows all attempted matching strategies and helps diagnose coverage mapping issues. This command is particularly useful when:
- Coverage appears incorrect for specific functions
- You need to understand why a function isn’t matched in coverage data
- Debugging LCOV line mapping issues
Example:
# Explain coverage for a specific function
debtmap explain-coverage src/ --coverage-file coverage.lcov --function "process_file"
# Narrow search to specific file with verbose output
debtmap explain-coverage . --lcov lcov.info --function "analyze_complexity" --file "src/analyzer.rs" -v
# JSON output for automation
debtmap explain-coverage . --coverage-file coverage.lcov --function "my_function" --format json
Options
Options are organized by category for clarity. Most options apply to the analyze command, with a subset available for validate.
Output Control
Control how analysis results are formatted and displayed.
Format Options:
- -f, --format <FORMAT> - Output format: json, markdown, terminal (default: terminal for analyze)
- --output-format <JSON_FORMAT> - JSON structure format: legacy or unified (default: legacy)
  - legacy - Current format with {File: {...}} and {Function: {...}} wrappers
  - unified - New format with consistent structure and ‘type’ field
- -o, --output <OUTPUT> - Output file path (defaults to stdout)
- --plain - Plain output mode: ASCII-only, no colors, no emoji, machine-parseable
Display Filtering:
- --top <N> / --head <N> - Show only top N priority items
- --tail <N> - Show only bottom N priority items (lowest priority)
- -s, --summary - Use summary format with tiered priority display (compact output)
- -c, --compact - Use compact output format (minimal details, top metrics only). Conflicts with verbosity flags (-v, -vv, -vvv). Only available in analyze command (note: validate uses -c for --config)
- --min-priority <PRIORITY> - Minimum priority to display: low, medium, high, critical
- --filter <CATEGORIES> - Filter by debt categories (comma-separated)
- --aggregate-only - Show only aggregated file-level scores
- --group-by-category - Group output by debt category
Dependency Display Options:
- --show-dependencies - Show caller/callee information in output
- --no-dependencies - Hide dependency information (conflicts with --show-dependencies)
- --max-callers <N> - Maximum number of callers to display (default: 5)
- --max-callees <N> - Maximum number of callees to display (default: 5)
- --show-external-calls - Include external crate calls in dependencies
- --show-std-lib-calls - Include standard library calls in dependencies
Analysis Control
Configure analysis behavior, thresholds, and language selection.
Thresholds:
- --threshold-complexity <N> - Complexity threshold (default: 10) [analyze command]
- --threshold-duplication <N> - Duplication threshold in lines (default: 50) [analyze command]
- --threshold-preset <PRESET> - Complexity threshold preset: strict, balanced, lenient [analyze command]
  - strict - Strict thresholds for high code quality standards
  - balanced - Balanced thresholds for typical projects (default)
  - lenient - Lenient thresholds for legacy or complex domains
- --max-debt-density <N> - Maximum debt density allowed per 1000 LOC [validate command]
Note: Threshold options (--threshold-complexity, --threshold-duplication, --threshold-preset) are command-line options for the analyze command. For the validate command, these thresholds are configured via the --config file (debtmap.toml) rather than as command-line flags.
Language Selection:
- --languages <LANGS> - Comma-separated list of languages to analyze
  - Example: --languages rust,python,javascript
  - Supported: rust, python, javascript, typescript
Analysis Modes:
- --semantic-off - Disable semantic analysis (fallback mode)
- --no-context-aware - Disable context-aware false positive reduction (enabled by default)
- --multi-pass - Enable multi-pass analysis with attribution
- --attribution - Show complexity attribution details
Functional Programming Analysis:
- --ast-functional-analysis - Enable AST-based functional composition analysis (spec 111)
  - Analyzes code for functional programming patterns and composition
  - Detects pure functions, immutability, and side effects
- --functional-analysis-profile <PROFILE> - Set functional analysis profile
  - strict - Strict functional purity requirements (for pure FP codebases)
  - balanced - Balanced analysis suitable for mixed paradigms (default)
  - lenient - Lenient thresholds for imperative codebases
Context & Coverage
Enable context-aware risk analysis and integrate test coverage data.
Context-Aware Risk Analysis:
- --context / --enable-context - Enable context-aware risk analysis
- --context-providers <PROVIDERS> - Context providers to use (comma-separated)
  - Available: critical_path, dependency, git_history
  - Example: --context-providers critical_path,git_history
- --disable-context <PROVIDERS> - Disable specific context providers (comma-separated)
Coverage Integration:
- --coverage-file <PATH> / --lcov <PATH> - LCOV coverage file for risk analysis
  - Coverage data dampens debt scores for well-tested code (multiplier = 1.0 - coverage)
  - Surfaces untested complex functions as higher priority
  - Total debt score with coverage ≤ score without coverage
- --validate-loc - Validate LOC consistency across analysis modes (with/without coverage)
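The coverage multiplier stated above (multiplier = 1.0 - coverage) can be sketched as follows. This assumes coverage is given as a fraction between 0.0 and 1.0; `dampened_score` is a hypothetical name, and the real scoring pipeline combines more factors than this single step.

```rust
/// Dampens a base debt score by coverage, where `coverage` is a fraction
/// in 0.0..=1.0 and the multiplier is (1.0 - coverage).
/// Hypothetical helper for illustration.
fn dampened_score(base: f64, coverage: f64) -> f64 {
    // Clamp defensively so out-of-range input cannot raise the score.
    base * (1.0 - coverage.clamp(0.0, 1.0))
}
```

Since the multiplier never exceeds 1.0, a dampened score is never higher than the base score, which is why the total with coverage cannot exceed the total without it.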
Performance
Optimize analysis performance through parallelization.
Parallel Processing:
- --no-parallel - Disable parallel call graph construction (enabled by default)
- -j, --jobs <N> - Number of threads for parallel processing
  - 0 = use all available CPU cores (default)
  - Specify number to limit thread count
Other Performance:
--max-files <N>- Maximum number of files to analyze (0 = no limit)
Debugging & Verbosity
Control diagnostic output and debugging information.
Verbosity Levels:
- -v, --verbose - Increase verbosity level (can be repeated: -v, -vv, -vvv)
  - -v - Show main score factors
  - -vv - Show detailed calculations
  - -vvv - Show all debug information
Specialized Debugging:
- --explain-metrics - Explain metric definitions and formulas (measured vs estimated)
- --verbose-macro-warnings - Show verbose macro parsing warnings (Rust analysis)
- --show-macro-stats - Show macro expansion statistics at end of analysis
- --detail-level <LEVEL> - Detail level for diagnostic reports
  - Options: summary, standard, comprehensive, debug (default: standard)
Call Graph Debugging:
- --debug-call-graph - Enable detailed call graph debugging with resolution information
- --trace-function <FUNCTIONS> - Trace specific functions during call resolution (comma-separated)
  - Example: --trace-function 'my_function,another_function'
- --call-graph-stats - Show only call graph statistics (no detailed failure list)
- --validate-call-graph - Validate call graph structure and report issues
- --debug-format <FORMAT> - Debug output format: text or json (default: text)
  - Use with call graph debugging flags to control output format
Aggregation
Control file-level aggregation and god object detection.
File Aggregation:
- --aggregate-only - Show only aggregated file-level scores
- --no-aggregation - Disable file-level aggregation
- --aggregation-method <METHOD> - File aggregation method (default: weighted_sum)
  - Options: sum, weighted_sum, logarithmic_sum, max_plus_average
- --min-problematic <N> - Minimum number of problematic functions for file aggregation
- --no-god-object - Disable god object detection
Option Aliases
Common option shortcuts and aliases for convenience:
- --lcov is an alias for --coverage-file
- --enable-context is an alias for --context
- --head is an alias for --top
- -s is short form for --summary
- -v is short form for --verbose
- -f is short form for --format
- -o is short form for --output
- -c is short form for --config
- -j is short form for --jobs
Deprecated Options
The following options are deprecated and should be migrated:
- --explain-score (hidden) - Deprecated: use -v instead
  - Migration: Use -v, -vv, or -vvv for increasing verbosity levels
Configuration
Configuration File
The configuration file (debtmap.toml) is created by the debtmap init command and is used by the validate command for threshold enforcement and default settings.
Creating Configuration:
# Create new config
debtmap init
# Overwrite existing config
debtmap init --force
Environment Variables
- DEBTMAP_JOBS - Number of threads for parallel processing (same as --jobs / -j flag)
  - Example: export DEBTMAP_JOBS=8 # Same as --jobs 8
  - Use 0 to utilize all available CPU cores
  - Controls thread pool size for parallel call graph construction
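The way a thread count of 0 expands to "all cores" can be sketched with the standard library. The precedence shown here (command-line flag first, then the environment variable) is an assumption made for illustration, and `resolve_jobs` is a hypothetical name rather than Debtmap's code.

```rust
use std::thread;

/// Resolves the worker thread count from an optional --jobs value and an
/// optional DEBTMAP_JOBS value. Assumption: the flag takes precedence.
/// A result of 0 from either source means "use all available cores".
fn resolve_jobs(flag: Option<usize>, env_value: Option<&str>) -> usize {
    let requested = flag
        .or_else(|| env_value.and_then(|v| v.trim().parse().ok()))
        .unwrap_or(0);
    if requested == 0 {
        // Fall back to a single thread if the core count cannot be queried.
        thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
    } else {
        requested
    }
}
```

A caller could pass the environment value with `std::env::var("DEBTMAP_JOBS").ok().as_deref()`.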
Getting Help
Get help for any command:
# General help
debtmap --help
# Command-specific help
debtmap analyze --help
debtmap validate --help
debtmap compare --help
debtmap init --help
Common Workflows
Basic Analysis
Analyze a project and view results in terminal:
debtmap analyze src/
Generate JSON report for further processing:
debtmap analyze . --format json --output report.json
Generate Markdown report:
debtmap analyze . --format markdown --output report.md
Coverage-Integrated Analysis
Analyze with test coverage to surface untested complex code:
# Generate coverage file first (example for Rust)
cargo tarpaulin --out lcov
# Run analysis with coverage
debtmap analyze src/ --coverage-file lcov.info
Coverage dampens debt scores for well-tested code, making untested complex functions more visible.
Context-Aware Analysis
Enable context providers for risk-aware prioritization:
# Use all context providers
debtmap analyze . --context
# Use specific context providers
debtmap analyze . --context --context-providers critical_path,git_history
Context-aware analysis reduces false positives and prioritizes code based on:
- Critical execution paths
- Dependency relationships
- Git history (change frequency)
Filtered & Focused Analysis
Show only top priority items:
debtmap analyze . --top 10 --min-priority high
Filter by specific debt categories:
debtmap analyze . --filter Architecture,Testing
Use summary mode for compact output:
debtmap analyze . --summary
Show only file-level aggregations:
debtmap analyze . --aggregate-only
Performance Tuning
Control parallelization:
# Use 8 threads
debtmap analyze . --jobs 8
# Disable parallel processing
debtmap analyze . --no-parallel
Limit analysis scope:
# Analyze maximum 100 files
debtmap analyze . --max-files 100
# Analyze specific languages only
debtmap analyze . --languages rust,python
CI/CD Integration
Use the validate command in CI/CD pipelines:
# Initialize configuration (one time)
debtmap init
# Edit debtmap.toml to set thresholds
# ...
# In CI pipeline: validate against thresholds
debtmap validate . --config debtmap.toml --max-debt-density 50
The validate command returns non-zero exit code if thresholds are exceeded, failing the build.
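To make the --max-debt-density value concrete, the sketch below normalizes a total debt score to 1000 lines of code and compares it with the limit. The exact formula Debtmap uses is not shown in this reference, so treat this as an assumption based on the "per 1000 LOC" wording; both function names are hypothetical.

```rust
/// Debt per 1000 lines of code. Assumed formula for illustration:
/// total debt score divided by LOC, scaled by 1000.
fn debt_density(total_debt_score: f64, lines_of_code: usize) -> f64 {
    if lines_of_code == 0 {
        return 0.0; // avoid dividing by zero for an empty project
    }
    total_debt_score / lines_of_code as f64 * 1000.0
}

/// True when the project stays at or below the allowed density.
fn within_limit(total_debt_score: f64, lines_of_code: usize, max_density: f64) -> bool {
    debt_density(total_debt_score, lines_of_code) <= max_density
}
```

Under this reading, a total debt score of 250 across 10,000 lines is a density of 25, which would pass a limit of 50.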
Comparison & Tracking
Compare analysis results before and after changes:
# Before changes
debtmap analyze . --format json --output before.json
# Make code changes...
# After changes
debtmap analyze . --format json --output after.json
# Generate comparison report
debtmap compare --before before.json --after after.json --format markdown
With implementation plan:
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
Debugging Analysis
Increase verbosity to understand scoring:
# Show main score factors
debtmap analyze src/ -v
# Show detailed calculations
debtmap analyze src/ -vv
# Show all debug information
debtmap analyze src/ -vvv
Debug call graph resolution issues:
# Enable call graph debugging
debtmap analyze . --debug-call-graph
# Trace specific functions
debtmap analyze . --debug-call-graph --trace-function 'problematic_function'
# Validate call graph structure
debtmap analyze . --validate-call-graph --debug-format json
Show macro expansion statistics (Rust):
debtmap analyze . --show-macro-stats --verbose-macro-warnings
Use detailed diagnostic reports:
debtmap analyze . --detail-level comprehensive
Analyze functional programming patterns:
# Enable functional analysis
debtmap analyze . --ast-functional-analysis
# Use strict profile for pure FP codebases
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
Examples
Basic Analysis
# Analyze current directory
debtmap analyze .
# Analyze specific directory
debtmap analyze src/
# Generate JSON output
debtmap analyze . --format json --output report.json
With Coverage
# Analyze with LCOV coverage file
debtmap analyze src/ --coverage-file coverage.lcov
# Alternative alias
debtmap analyze src/ --lcov coverage.lcov
Context-Aware Analysis
# Enable all context providers
debtmap analyze . --context
# Use specific context providers
debtmap analyze . --context --context-providers critical_path,git_history
# Disable specific providers
debtmap analyze . --context --disable-context dependency
Filtered Output
# Top 10 priority items only
debtmap analyze . --top 10
# High priority and above
debtmap analyze . --min-priority high
# Specific categories
debtmap analyze . --filter Architecture,Testing
# Summary format
debtmap analyze . --summary
# Group by category
debtmap analyze . --group-by-category
Performance Tuning
# Use 8 threads
debtmap analyze . --jobs 8
# Disable parallelization
debtmap analyze . --no-parallel
# Limit file count
debtmap analyze . --max-files 100
Validation
# Initialize config
debtmap init --force
# Validate against config
debtmap validate . --config debtmap.toml
# With max debt density threshold
debtmap validate . --max-debt-density 50
Comparison
# Compare two analyses
debtmap compare --before before.json --after after.json
# With markdown output
debtmap compare --before before.json --after after.json --format markdown
# With implementation plan
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
# With target location
debtmap compare --before before.json --after after.json --target-location "src/main.rs:process_file:42"
Language Selection
# Analyze only Rust files
debtmap analyze . --languages rust
# Multiple languages
debtmap analyze . --languages rust,python,javascript
Threshold Configuration
# Custom complexity threshold
debtmap analyze . --threshold-complexity 15
# Use preset
debtmap analyze . --threshold-preset strict
# Custom duplication threshold
debtmap analyze . --threshold-duplication 100
Plain/Machine-Readable Output
# Plain output (no colors, no emoji)
debtmap analyze . --plain
# Combine with JSON for CI
debtmap analyze . --format json --plain --output report.json
Advanced Debugging
# Call graph debugging with detailed information
debtmap analyze . --debug-call-graph --debug-format json
# Trace specific functions during call resolution
debtmap analyze . --debug-call-graph --trace-function 'process_file,analyze_complexity'
# Validate call graph structure
debtmap analyze . --validate-call-graph
# Show only call graph statistics
debtmap analyze . --debug-call-graph --call-graph-stats
# Functional programming analysis with strict profile
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
# Explain metric definitions
debtmap analyze . --explain-metrics -v
Command Compatibility Matrix
| Option | analyze | validate | compare | init | explain-coverage |
|---|---|---|---|---|---|
| <PATH> argument | ✓ | ✓ | ✗ | ✗ | ✓ |
| --format | ✓ | ✓ | ✓ | ✗ | ✓ |
| --output | ✓ | ✓ | ✓ | ✗ | ✗ |
| --coverage-file | ✓ | ✓ | ✗ | ✗ | ✓ |
| --context | ✓ | ✓ | ✗ | ✗ | ✗ |
| --threshold-* | ✓ | ✗ | ✗ | ✗ | ✗ |
| --top / --tail | ✓ | ✓ | ✗ | ✗ | ✗ |
| --jobs | ✓ | ✓ | ✗ | ✗ | ✗ |
| --no-parallel | ✓ | ✓ | ✗ | ✗ | ✗ |
| --verbose | ✓ | ✓ | ✗ | ✗ | ✓ |
| --explain-metrics | ✓ | ✗ | ✗ | ✗ | ✗ |
| --debug-call-graph | ✓ | ✗ | ✗ | ✗ | ✗ |
| --trace-function | ✓ | ✗ | ✗ | ✗ | ✗ |
| --call-graph-stats | ✓ | ✗ | ✗ | ✗ | ✗ |
| --validate-call-graph | ✓ | ✗ | ✗ | ✗ | ✗ |
| --debug-format | ✓ | ✗ | ✗ | ✗ | ✗ |
| --show-dependencies | ✓ | ✗ | ✗ | ✗ | ✗ |
| --no-dependencies | ✓ | ✗ | ✗ | ✗ | ✗ |
| --max-callers | ✓ | ✗ | ✗ | ✗ | ✗ |
| --max-callees | ✓ | ✗ | ✗ | ✗ | ✗ |
| --show-external-calls | ✓ | ✗ | ✗ | ✗ | ✗ |
| --show-std-lib-calls | ✓ | ✗ | ✗ | ✗ | ✗ |
| --ast-functional-analysis | ✓ | ✗ | ✗ | ✗ | ✗ |
| --functional-analysis-profile | ✓ | ✗ | ✗ | ✗ | ✗ |
| --function | ✗ | ✗ | ✗ | ✗ | ✓ |
| --file | ✗ | ✗ | ✗ | ✗ | ✓ |
| --config | ✗ | ✓ | ✗ | ✗ | ✗ |
| --before / --after | ✗ | ✗ | ✓ | ✗ | ✗ |
| --force | ✗ | ✗ | ✗ | ✓ | ✗ |
Note: The validate command supports output control (--format, --output), coverage integration (--coverage-file), context-aware analysis (--context), display filtering (--top, --tail, --summary), performance control (--jobs, --no-parallel), and verbosity options (--verbose) from the analyze command. Analysis thresholds (--threshold-complexity, --threshold-duplication, --threshold-preset) are configured via the --config file rather than as command-line options. Debugging features like call graph debugging and functional analysis are specific to the analyze command. The explain-coverage command is a specialized debugging tool for diagnosing coverage detection issues and has its own unique options (--function, --file).
Troubleshooting
Performance Issues
Problem: Analysis is slow on large codebases
Solutions:
# Use more threads (if you have CPU cores available)
debtmap analyze . --jobs 16
# Limit analysis scope
debtmap analyze . --max-files 500 --languages rust
Memory Issues
Problem: Analysis runs out of memory
Solutions:
# Disable parallelization
debtmap analyze . --no-parallel
# Limit file count
debtmap analyze . --max-files 100
# Analyze in batches by language
debtmap analyze . --languages rust
debtmap analyze . --languages python
Output Issues
Problem: Terminal output has garbled characters
Solution:
# Use plain mode
debtmap analyze . --plain
Problem: Want machine-readable output
Solution:
# Use JSON with plain mode
debtmap analyze . --format json --plain --output report.json
Threshold Issues
Problem: Too many items flagged
Solutions:
# Use lenient preset
debtmap analyze . --threshold-preset lenient
# Increase threshold
debtmap analyze . --threshold-complexity 20
# Filter to high priority only
debtmap analyze . --min-priority high
Problem: Not enough items flagged
Solutions:
# Use strict preset
debtmap analyze . --threshold-preset strict
# Lower threshold
debtmap analyze . --threshold-complexity 5
# Show all items
debtmap analyze . --min-priority low
Best Practices
Regular Analysis
Run analysis regularly to track code quality trends:
# Daily in CI
debtmap validate . --config debtmap.toml
# Weekly deep analysis with coverage
debtmap analyze . --coverage-file coverage.lcov --format json --output weekly-report.json
Performance Optimization
For large codebases:
# Use maximum parallelization
debtmap analyze . --jobs 0 # 0 = all cores
# Focus on changed files in CI
# (implement via custom scripts to analyze git diff)
Integration with Coverage
Always analyze with coverage when available:
# Rust example
cargo tarpaulin --out lcov
debtmap analyze src/ --coverage-file lcov.info
# Python example
pytest --cov --cov-report=lcov
debtmap analyze . --coverage-file coverage.lcov
Coverage integration helps prioritize untested complex code.
Additional Tools
prodigy-validate-debtmap-improvement
Specialized validation tool for Prodigy workflow integration.
Description: This binary is part of the Prodigy workflow system and provides specialized validation for Debtmap improvement workflows.
Usage: See Prodigy documentation for detailed usage instructions.
See Also
- Configuration Format - Detailed configuration file format
- Output Formats - Understanding JSON, Markdown, and Terminal output
- Coverage Integration - Integrating test coverage data
- Context Providers - Understanding context-aware analysis
- Examples - More comprehensive usage examples
Analysis Guide
This guide explains Debtmap’s analysis capabilities, metrics, and methodologies in depth. Use this to understand what Debtmap measures, how it scores technical debt, and how to interpret analysis results for maximum impact.
Overview
Debtmap analyzes code through multiple lenses to provide a comprehensive view of technical health:
- Complexity Metrics - Quantifies how difficult code is to understand and test
- Debt Patterns - Identifies 13 types of technical debt requiring attention
- Risk Scoring - Correlates complexity with test coverage to find truly risky code
- Prioritization - Ranks findings by impact to guide refactoring efforts
The goal is to move beyond simple “here are your problems” to “here’s what to fix first and why.”
Complexity Metrics
Debtmap measures complexity using multiple complementary approaches. Each metric captures a different aspect of code difficulty.
Cyclomatic Complexity
Measures the number of linearly independent paths through code - essentially counting decision points.
How it works:
- Start with a base complexity of 1
- Add 1 for each: if, else if, match arm, while, for, &&, ||, ? operator
- Does NOT increase for else (it’s the alternate path, not a new decision)
Thresholds:
- 1-5: Simple, easy to test - typically needs 1-3 test cases
- 6-10: Moderate complexity - needs 4-8 test cases
- 11-20: Complex, consider refactoring - needs 9+ test cases
- 21+: Very complex, high risk - difficult to test thoroughly
Example:
fn validate_user(age: u32, has_license: bool, country: &str) -> bool {
    // Complexity: 4
    // Base (1) + if (1) + && (1) + match (1) = 4
    if age >= 18 && has_license {
        match country {
            "US" | "CA" => true,
            _ => false,
        }
    } else {
        false
    }
}
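The counting rule and the threshold bands can be combined into a small helper. This is a sketch for illustration only: `cyclomatic` and `cyclomatic_band` are hypothetical names, the caller supplies the number of decision points instead of the function parsing code, and a score of exactly 20 is placed in the "complex" band.

```rust
/// Cyclomatic complexity: a base of 1 plus one per decision point.
fn cyclomatic(decision_points: u32) -> u32 {
    1 + decision_points
}

/// Maps a complexity value to the threshold bands listed in this guide.
/// Assumption: exactly 20 counts as "complex", 21 and above as "very complex".
fn cyclomatic_band(complexity: u32) -> &'static str {
    match complexity {
        0..=5 => "simple",
        6..=10 => "moderate",
        11..=20 => "complex",
        _ => "very complex",
    }
}
```

A function with three decision points scores 4 and lands in the "simple" band, so a handful of test cases should cover its paths.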
Cognitive Complexity
Measures how difficult code is to understand by considering nesting depth and control flow interruptions.
How it differs from cyclomatic:
- Nesting increases weight (deeply nested code is harder to understand)
- Linear sequences don’t increase complexity (easier to follow)
- Breaks and continues add complexity (interrupt normal flow)
Calculation:
- Each structure (if, loop, match) gets a base score
- Nesting multiplies the weight (nested structures = harder to understand)
- Break/continue/return in middle of function adds cognitive load
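The calculation rules above can be sketched as a small scoring function. This is a minimal illustration rather than Debtmap's implementation: the `Construct` type and `cognitive_score` function are hypothetical, and the sketch assumes that each enclosing level adds one point to a structure's base score of 1.

```rust
/// One control-flow structure (if, loop, match), recorded with the number
/// of structures that enclose it. Hypothetical type for illustration only.
struct Construct {
    nesting: u32, // 0 = directly in the function body
}

/// Assumed additive weighting: each structure costs 1 plus its nesting
/// level, and each flow interruption (break, continue, early return)
/// costs 1 more.
fn cognitive_score(constructs: &[Construct], interruptions: u32) -> u32 {
    let structural: u32 = constructs.iter().map(|c| 1 + c.nesting).sum();
    structural + interruptions
}
```

Under these assumptions, four structures nested at levels 0 through 3 plus one `continue` score 1 + 2 + 3 + 4 + 1 = 11, while the same four structures written side by side at the top level would score only 4.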
Example:
// Cyclomatic: 5, Cognitive: 11
fn process_items(items: Vec<Item>) -> Vec<Result> {
    let mut results = vec![];
    for item in items { // +1 cognitive
        if item.is_valid() { // +2 (nested in loop)
            match item.kind { // +3 (nested 2 levels)
                Type::A => results.push(process_a(item)),
                Type::B => {
                    if item.priority > 5 { // +4 (nested 3 levels)
                        results.push(process_b_priority(item));
                    }
                }
                _ => continue, // +1 (control flow interruption)
            }
        }
    }
    results
}
Thresholds:
- 0-5: Trivial - anyone can understand
- 6-10: Simple - straightforward logic
- 11-20: Moderate - requires careful reading
- 21-40: Complex - difficult to understand
- 41+: Very complex - needs refactoring
Entropy-Based Complexity Analysis
Uses information theory to distinguish genuinely complex code from pattern-based repetitive code. This dramatically reduces false positives for validation functions, dispatchers, and configuration parsers.
How it works:
- Token Entropy (0.0-1.0): Measures variety in code tokens
  - High entropy (0.7+): Diverse logic, genuinely complex
  - Low entropy (0.0-0.4): Repetitive patterns, less complex than it appears
- Pattern Repetition (0.0-1.0): Detects repetitive structures in AST
  - High repetition (0.7+): Similar blocks repeated (validation checks, case handlers)
  - Low repetition: Unique logic throughout
- Branch Similarity (0.0-1.0): Analyzes similarity between conditional branches
  - High similarity (0.8+): Branches do similar things (consistent handling)
  - Low similarity: Each branch has unique logic
- Token Classification: Categorizes tokens by type with weighted importance
  - Variables, methods, literals weighted differently
  - Focuses on structural complexity over superficial differences
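To make the token entropy idea concrete, here is a minimal sketch of a normalized Shannon entropy over a token stream. It is an assumption-laden illustration, not Debtmap's algorithm: the function name is hypothetical, tokens are plain strings with no classification weights, and the result is normalized by the maximum entropy possible for that number of tokens so it lands in the 0.0-1.0 range.

```rust
use std::collections::HashMap;

/// Shannon entropy of the token distribution, divided by log2(token count)
/// so that "every token identical" gives 0.0 and "every token distinct"
/// gives 1.0. The normalization choice is an assumption of this sketch.
fn normalized_token_entropy(tokens: &[&str]) -> f64 {
    if tokens.len() < 2 {
        return 0.0;
    }
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for token in tokens {
        *counts.entry(*token).or_insert(0) += 1;
    }
    let total = tokens.len() as f64;
    let entropy: f64 = counts
        .values()
        .map(|&count| {
            let p = count as f64 / total;
            p * (1.0 / p).log2()
        })
        .sum();
    entropy / total.log2()
}
```

A long run of near-identical validation checks reuses the same few tokens, so its value falls toward the low end of the range, which is the signal the dampening logic looks for.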
Dampening logic: Dampening is applied when multiple factors indicate repetitive patterns:
- Low token entropy (< 0.4) indicates simple, repetitive patterns
- High pattern repetition (> 0.7) shows similar code blocks
- High branch similarity (> 0.8) indicates consistent branching logic
When these conditions are met:
effective_complexity = entropy × pattern_factor × similarity_factor
Dampening cap: The dampening factor has a minimum of 0.7, ensuring no more than 30% reduction in complexity scores. This prevents over-correction of pattern-based code and maintains a baseline complexity floor for functions that still require understanding and maintenance.
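The effect of the cap can be sketched in a few lines. This is an illustration under stated assumptions: `apply_dampening` and `MIN_DAMPENING_FACTOR` are hypothetical names, and `raw_factor` stands in for whatever factor the entropy analysis produces before the cap is applied.

```rust
/// Floor on the dampening factor described above: at most a 30% reduction.
const MIN_DAMPENING_FACTOR: f64 = 0.7;

/// Scales a complexity score by the dampening factor after clamping the
/// factor to [0.7, 1.0], so dampening can neither remove more than 30%
/// of the score nor increase it.
fn apply_dampening(complexity: f64, raw_factor: f64) -> f64 {
    complexity * raw_factor.clamp(MIN_DAMPENING_FACTOR, 1.0)
}
```

For example, a function with cyclomatic complexity 15 and a raw factor of 0.33 is reduced only to 15 × 0.7 = 10.5, not to 5.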
Example:
// Without entropy: Cyclomatic = 15 (appears very complex)
// With entropy: Effective ≈ 10.5 (pattern-based, dampened by the maximum 30%)
fn validate_config(config: &Config) -> Result<(), ValidationError> {
    if config.name.is_empty() { return Err(ValidationError::EmptyName); }
    if config.port == 0 { return Err(ValidationError::InvalidPort); }
    if config.host.is_empty() { return Err(ValidationError::EmptyHost); }
    if config.timeout == 0 { return Err(ValidationError::InvalidTimeout); }
    // ... 10 more similar checks
    Ok(())
}
Enable in .debtmap.toml:
[entropy]
enabled = true # Enable entropy analysis (default: true)
weight = 0.5 # Weight in adjustment (0.0-1.0)
use_classification = true # Advanced token classification
pattern_threshold = 0.7 # Pattern detection threshold
entropy_threshold = 0.4 # Entropy below this triggers dampening
branch_threshold = 0.8 # Branch similarity threshold
max_combined_reduction = 0.3 # Maximum 30% reduction
Output fields in EntropyScore:
- unique_variables: Count of distinct variables in the function (measures variable diversity)
- max_nesting: Maximum nesting depth detected (contributes to dampening calculation)
- dampening_applied: Actual dampening factor applied to the complexity score
Nesting Depth
Maximum level of indentation in a function. Deep nesting makes code hard to follow.
Thresholds:
- 1-2: Flat, easy to read
- 3-4: Moderate nesting
- 5+: Deep nesting, consider extracting functions
Example:
// Nesting depth: 4 (difficult to follow)
fn process(data: Data) -> Result<Output> {
    if data.is_valid() { // Level 1
        for item in data.items { // Level 2
            if item.active { // Level 3
                match item.kind { // Level 4
                    Type::A => { /* ... */ }
                    Type::B => { /* ... */ }
                }
            }
        }
    }
}
Refactored:
// Nesting depth: 2 (much clearer)
fn process(data: Data) -> Result<Output> {
    if !data.is_valid() {
        return Err(Error::Invalid);
    }
    data.items
        .iter()
        .filter(|item| item.active)
        .map(|item| process_item(item)) // Extract to separate function
        .collect()
}
Function Length
Number of lines in a function. Long functions often violate single responsibility principle.
Thresholds:
- 1-20 lines: Good - focused, single purpose
- 21-50 lines: Acceptable - may have multiple steps
- 51-100 lines: Long - consider breaking up
- 100+ lines: Very long - definitely needs refactoring
Why length matters:
- Harder to understand and remember
- Harder to test thoroughly
- Often violates single responsibility
- Difficult to reuse
Constructor Detection
Debtmap identifies constructor functions using AST-based analysis (Spec 122), which goes beyond simple name-based detection to catch non-standard constructor patterns.
Detection Strategy:
- Return Type Analysis: Functions returning Self, Result<Self>, or Option<Self>
- Body Pattern Analysis: Struct initialization or simple field assignments
- Complexity Check: Low cyclomatic complexity (≤5), no loops, minimal branching
Why AST-based detection?
Name-based detection (looking for new, new_*, from_*) misses non-standard constructors:
// Caught by name-based detection
fn new() -> Self {
    Self { timeout: 30 }
}

// Missed by name-based, caught by AST detection
pub fn create_default_client() -> Self {
    Self { timeout: Duration::from_secs(30) }
}

pub fn initialized() -> Self {
    Self::new()
}
Builder vs Constructor:
AST analysis distinguishes between constructors and builder methods:
// Constructor: creates new instance
pub fn new(timeout: Duration) -> Self {
    Self { timeout }
}

// Builder method: modifies existing instance (NOT a constructor)
pub fn set_timeout(mut self, timeout: Duration) -> Self {
    self.timeout = timeout;
    self // Returns modified self, not new instance
}
Detection Criteria:
A function is classified as a constructor if:
- Returns Self, Result<Self>, or Option<Self>
- Contains struct initialization (Self { ... }) without loops
- OR delegates to another constructor (Self::new()) with minimal logic
Fallback Behavior:
If AST parsing fails (syntax errors, unsupported language), Debtmap gracefully falls back to name-based detection (Spec 117):
- new, new_*
- try_new*
- from_*
This ensures analysis always completes, even on partially broken code.
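The name-based fallback amounts to a prefix check. The sketch below shows one way to express the patterns listed above; `looks_like_constructor` is a hypothetical helper, not a function from Debtmap's codebase.

```rust
/// Name-only heuristic: matches `new`, `new_*`, `try_new*`, and `from_*`.
fn looks_like_constructor(name: &str) -> bool {
    name == "new"
        || name.starts_with("new_")
        || name.starts_with("try_new")
        || name.starts_with("from_")
}
```

A name such as `create_default_client` does not match any of these patterns, which is the kind of constructor that only the AST-based analysis catches.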
Performance:
AST-based detection adds < 5% overhead compared to name-only detection. See benchmarks:
cargo bench --bench constructor_detection_bench
Why it matters:
Accurately identifying constructors helps:
- Exclude them from complexity thresholds (constructors that initialize many fields can look long or complex without being hard to maintain)
- Focus refactoring on business logic, not initialization code
- Understand initialization patterns across the codebase
Debt Patterns
Debtmap detects 25 types of technical debt, organized into 4 strategic categories. Each debt type is mapped to a category that guides prioritization and remediation strategies.
Debt Type Enum
The DebtType enum defines all specific debt patterns that Debtmap can detect:
Testing Debt:
- TestingGap - Functions with insufficient test coverage
- TestTodo - TODO comments in test code
- TestComplexity - Test functions exceeding complexity thresholds
- TestDuplication - Duplicated code in test files
- TestComplexityHotspot - Complex test logic that’s hard to maintain
- AssertionComplexity - Complex test assertions
- FlakyTestPattern - Non-deterministic test behavior
Architecture Debt:
- ComplexityHotspot - Functions exceeding complexity thresholds
- DeadCode - Unreachable or unused code
- GodObject - Classes with too many responsibilities
- GodModule - Modules with too many responsibilities
- FeatureEnvy - Using more data from other objects than own
- PrimitiveObsession - Overusing basic types instead of domain objects
- MagicValues - Unexplained literal values
Performance Debt:
- AllocationInefficiency - Inefficient memory allocations
- StringConcatenation - Inefficient string building in loops
- NestedLoops - Multiple nested iterations (O(n²) or worse)
- BlockingIO - Blocking I/O in async contexts
- SuboptimalDataStructure - Wrong data structure for access pattern
- AsyncMisuse - Improper async/await usage
- ResourceLeak - Resources not properly released
- CollectionInefficiency - Inefficient collection operations
Code Quality Debt:
- Risk - High-risk code (complex + poorly tested)
- Duplication - Duplicated code blocks
- ErrorSwallowing - Errors caught but ignored
Debt Categories
The DebtCategory enum groups debt types into strategic categories:
pub enum DebtCategory {
    Architecture, // Structure, design, complexity
    Testing,      // Coverage, test quality
    Performance,  // Speed, memory, efficiency
    CodeQuality,  // Maintainability, readability
}
Category Mapping:
| Debt Type | Category | Strategic Focus |
|---|---|---|
| ComplexityHotspot, DeadCode, GodObject, GodModule, FeatureEnvy, PrimitiveObsession, MagicValues | Architecture | Structural improvements, design patterns |
| TestingGap, TestTodo, TestComplexity, TestDuplication, TestComplexityHotspot, AssertionComplexity, FlakyTestPattern | Testing | Test coverage, test quality |
| AllocationInefficiency, StringConcatenation, NestedLoops, BlockingIO, SuboptimalDataStructure, AsyncMisuse, ResourceLeak, CollectionInefficiency | Performance | Runtime efficiency, resource usage |
| Risk, Duplication, ErrorSwallowing | CodeQuality | Maintainability, reliability |
Language-Specific Debt Patterns:
Some debt patterns only apply to languages with specific features:
- BlockingIO, AsyncMisuse: Async-capable languages (Rust, JavaScript, TypeScript)
- AllocationInefficiency, ResourceLeak: Languages with explicit control over allocation and resource ownership (Rust)
- Error handling patterns: Vary by language error model (Result in Rust, exceptions in Python/JS)
Debtmap automatically applies only the relevant debt patterns for each language during analysis.
Examples by Category
Architecture Debt
ComplexityHotspot: Functions exceeding complexity thresholds
// Cyclomatic: 22, Cognitive: 35
fn process_transaction(tx: Transaction, account: &mut Account) -> Result<Receipt> {
    if tx.amount <= 0 {
        return Err(Error::InvalidAmount);
    }
    // ... deeply nested logic with many branches
    Ok(receipt)
}
When detected: Cyclomatic > 10 OR Cognitive > 15 (configurable)
Action: Break into smaller functions, extract validation, simplify control flow
GodObject / GodModule: Too many responsibilities
// God module: handles parsing, validation, storage, notifications
mod user_service {
    fn parse_user() { /* ... */ }
    fn validate_user() { /* ... */ }
    fn save_user() { /* ... */ }
    fn send_email() { /* ... */ }
    fn log_activity() { /* ... */ }
    // ... 20+ more functions
}
When detected: Complexity-weighted scoring system (see detailed explanation below)
Action: Split into focused modules (parser, validator, repository, notifier)
Complexity-Weighted God Object Detection
Debtmap uses complexity-weighted scoring for god object detection to reduce false positives on well-refactored code. This ensures that a file with 100 simple helper functions doesn’t rank higher than a file with 10 complex functions.
The Problem:
Traditional god object detection counts methods:
- File A: 100 methods (average complexity: 1.5) → Flagged as god object
- File B: 10 methods (average complexity: 17.0) → Not flagged
But File A might be a well-organized utility module with many small helpers, while File B is truly problematic with highly complex functions that need refactoring.
The Solution:
Debtmap weights each function by its cyclomatic complexity using this formula:
weight = (max(1, complexity) / 3)^1.5
Weight Examples:
- Simple helper (complexity 1): weight ≈ 0.19
- Baseline function (complexity 3): weight = 1.0
- Moderate function (complexity 9): weight ≈ 5.2
- Complex function (complexity 17): weight ≈ 13.5
- Critical function (complexity 33): weight ≈ 36.5
God Object Score Calculation:
weighted_method_count = sum(weight for each function)
complexity_penalty = 0.7 if avg_complexity < 3, 1.0 if 3-10, 1.5 if > 10
god_object_score = (
(weighted_method_count / threshold) * 40% +
(field_count / threshold) * 20% +
(responsibility_count / threshold) * 15% +
(lines_of_code / 500) * 25%
) * complexity_penalty
Threshold: God object detected if score >= 70.0
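The weighting step can be reproduced directly from the formula. The sketch below covers only the per-function weight, the weighted method count, and the complexity penalty; the function names are hypothetical, and treating an average complexity of exactly 10 as part of the 3-10 band is an assumption.

```rust
/// weight = (max(1, complexity) / 3)^1.5
fn complexity_weight(cyclomatic: u32) -> f64 {
    (f64::from(cyclomatic.max(1)) / 3.0).powf(1.5)
}

/// Sum of the weights of every function in a file.
fn weighted_method_count(complexities: &[u32]) -> f64 {
    complexities.iter().map(|&c| complexity_weight(c)).sum()
}

/// 0.7 below an average of 3, 1.0 from 3 through 10, 1.5 above 10.
fn complexity_penalty(avg_complexity: f64) -> f64 {
    if avg_complexity < 3.0 {
        0.7
    } else if avg_complexity <= 10.0 {
        1.0
    } else {
        1.5
    }
}
```

Ten functions of complexity 17 give a weighted count of about 135, while one hundred functions of complexity 1 give about 19, so the smaller but more complex file dominates the score.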
Real-World Example:
shared_cache.rs:
- 100 functions, average complexity: 1.5
- Weighted score: ~19.0 (100 * 0.19)
- God object score: 45.2
- Result: Not a god object ✓
legacy_parser.rs:
- 10 functions, average complexity: 17.0
- Weighted score: ~135.0 (10 * 13.5)
- God object score: 87.3
- Result: God object detected ✓
Benefits:
- Reduces false positives on utility modules with many simple functions
- Focuses attention on truly problematic complex modules
- Rewards good refactoring - breaking large functions into small helpers improves score
- Aligns with reality - complexity matters more than count for maintainability
How to View:
When Debtmap detects a god object, the output includes:
- Raw method count
- Weighted method count
- Average complexity
- God object score
- Recommended module splits based on responsibility clustering
MagicValues: Unexplained literals
// Bad: Magic numbers
fn calculate_price(quantity: u32) -> f64 {
    quantity as f64 * 19.99 + 5.0 // What are these numbers?
}

// Good: Named constants
const UNIT_PRICE: f64 = 19.99;
const SHIPPING_COST: f64 = 5.0;

fn calculate_price(quantity: u32) -> f64 {
    quantity as f64 * UNIT_PRICE + SHIPPING_COST
}
Testing Debt
TestingGap: Functions with insufficient test coverage
// 0% coverage - critical business logic untested
fn calculate_tax(amount: f64, region: &str) -> f64 {
    // Complex tax calculation logic
    // No tests exist for this function!
}
When detected: Coverage data shows function has < 80% line coverage
Action: Add unit tests to cover all branches and edge cases
TestComplexity: Test functions too complex
#[test]
fn complex_test() {
    // Cyclomatic: 12 (too complex for a test)
    for input in test_cases {
        if input.is_special() {
            match input.kind {
                /* complex test logic */
            }
        }
    }
}
When detected: Test functions with cyclomatic > 10 or cognitive > 15
Action: Split into multiple focused tests, use test fixtures
FlakyTestPattern: Non-deterministic tests
#[tokio::test]
async fn flaky_test() {
    let result = async_operation().await; // Timing-dependent
    thread::sleep(Duration::from_millis(100)); // Race condition!
    assert_eq!(result.status, "complete");
}
When detected: Pattern analysis for timing dependencies, random values
Action: Use mocks, deterministic test data, proper async test utilities
Performance Debt
AllocationInefficiency: Excessive allocations
// Bad: Allocates on every iteration
fn process_items(items: &[Item]) -> Vec<String> {
    let mut results = Vec::new();
    for item in items {
        results.push(item.name.clone()); // Unnecessary clone
    }
    results
}

// Good: Pre-allocate, avoid clones
fn process_items(items: &[Item]) -> Vec<&str> {
    items.iter().map(|item| item.name.as_str()).collect()
}
BlockingIO: Blocking operations in async contexts
// Bad: Blocks async runtime
async fn load_data() -> Result<Data> {
    let file = std::fs::read_to_string("data.json")?; // Blocking!
    parse_json(&file)
}

// Good: Async I/O
async fn load_data() -> Result<Data> {
    let file = tokio::fs::read_to_string("data.json").await?;
    parse_json(&file)
}
NestedLoops: O(n²) or worse complexity
// Bad: O(n²) nested loops
fn find_duplicates(items: &[Item]) -> Vec<(Item, Item)> {
    let mut dupes = vec![];
    for i in 0..items.len() {
        for j in i+1..items.len() {
            if items[i] == items[j] {
                dupes.push((items[i].clone(), items[j].clone()));
            }
        }
    }
    dupes
}

// Good: O(n) with HashSet (requires Item: Hash + Eq + Clone)
fn find_duplicates(items: &[Item]) -> Vec<Item> {
    let mut seen = HashSet::new();
    // insert() returns false when the item was already present
    items.iter().filter(|item| !seen.insert(*item)).cloned().collect()
}
Code Quality Debt
Duplication: Duplicated code blocks
// File A:
fn process_user(user: User) -> Result<()> {
    validate_email(&user.email)?;
    validate_age(user.age)?;
    save_to_database(&user)?;
    send_welcome_email(&user.email)?;
    Ok(())
}

// File B: Duplicated validation
fn process_admin(admin: Admin) -> Result<()> {
    validate_email(&admin.email)?; // Duplicated
    validate_age(admin.age)?; // Duplicated
    save_to_database(&admin)?;
    grant_admin_privileges(&admin)?;
    Ok(())
}
When detected: Similar code blocks > 50 lines (configurable)
Action: Extract shared code into reusable functions
ErrorSwallowing: Errors caught but ignored
// Bad: Error swallowed, no context
match risky_operation() {
    Ok(result) => process(result),
    Err(_) => {}, // Silent failure!
}

// Good: Error handled with context
match risky_operation() {
    Ok(result) => process(result),
    Err(e) => {
        log::error!("Risky operation failed: {}", e);
        return Err(e.into());
    }
}
When detected: Empty catch blocks, ignored Results
Action: Add proper error logging and propagation
Risk: High-risk code (complex + poorly tested)
// Cyclomatic: 18, Coverage: 20%, Risk Score: 47.6 (HIGH)
fn process_payment(tx: Transaction) -> Result<Receipt> {
    // Complex payment logic with minimal testing
    // High risk of bugs in production
}
When detected: Combines complexity metrics with coverage data
Action: Either add comprehensive tests OR refactor to reduce complexity
Debt Scoring Formula
Each debt item gets a score based on priority and type:
debt_score = priority_weight × type_weight
Priority weights:
- Low = 1
- Medium = 3
- High = 5
- Critical = 10
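Type weights used in the examples below: Todo = 1, Fixme = 2, Complexity = 5
Combined examples (priority weight × type weight):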
- Low Todo = 1 × 1 = 1
- Medium Fixme = 3 × 2 = 6
- High Complexity = 5 × 5 = 25
- Critical Complexity = 10 × 5 = 50
Total debt score = Sum of all debt item scores
Lower is better. Track over time to measure improvement.
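As a sketch of how these scores add up, the code below multiplies each item's priority weight by its type weight and sums the results. The `Priority` and `DebtItem` types are hypothetical stand-ins for illustration, and the type weights are passed in as plain numbers taken from the combined examples.

```rust
/// Priority levels with the weights listed above.
#[derive(Clone, Copy)]
enum Priority {
    Low,
    Medium,
    High,
    Critical,
}

impl Priority {
    fn weight(self) -> u32 {
        match self {
            Priority::Low => 1,
            Priority::Medium => 3,
            Priority::High => 5,
            Priority::Critical => 10,
        }
    }
}

/// Hypothetical debt item: a priority plus the weight of its debt type.
struct DebtItem {
    priority: Priority,
    type_weight: u32,
}

/// debt_score = priority_weight × type_weight
fn debt_score(item: &DebtItem) -> u32 {
    item.priority.weight() * item.type_weight
}

/// Total debt score = sum of all debt item scores.
fn total_debt_score(items: &[DebtItem]) -> u32 {
    items.iter().map(debt_score).sum()
}
```

With the four combined examples as input, the total is 1 + 6 + 25 + 50 = 82.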
Risk Scoring
Debtmap’s risk scoring identifies code that is both complex AND poorly tested - the true risk hotspots.
Unified Scoring System
Debtmap uses a unified scoring system (0-10 scale) as the primary prioritization mechanism. This multi-factor approach balances complexity, test coverage, and dependency impact, adjusted by function role.
Score Scale and Priority Classifications
Functions receive scores from 0 (minimal risk) to 10 (critical risk):
| Score Range | Priority | Description | Action |
|---|---|---|---|
| 9.0-10.0 | Critical | Severe risk requiring immediate attention | Address immediately |
| 7.0-8.9 | High | Significant risk, should be addressed soon | Plan for this sprint |
| 5.0-6.9 | Medium | Moderate risk, plan for future work | Schedule for next sprint |
| 3.0-4.9 | Low | Minor risk, lower priority | Monitor and address as time permits |
| 0.0-2.9 | Minimal | Well-managed code | Continue monitoring |
Scoring Formula
The unified score combines three weighted factors:
Base Score = (Complexity Factor × 0.40) + (Coverage Factor × 0.40) + (Dependency Factor × 0.20)
Final Score = Base Score × Role Multiplier
Factor Calculations:
Complexity Factor (0-10 scale):
Complexity Factor = min(10, ((cyclomatic / 10) + (cognitive / 20)) × 5)
Normalized to 0-10 range based on cyclomatic and cognitive complexity.
Coverage Factor (0-10 scale):
Coverage Factor = 10 × (1 - coverage_percentage) × complexity_weight
Uncovered complex code scores higher than uncovered simple code. Coverage dampens the score - well-tested code gets lower scores.
Dependency Factor (0-10 scale): Based on call graph analysis with specific thresholds:
- High impact (score 8-10): 5+ upstream callers, or on critical path from entry point (adds 2-3 points)
- Moderate impact (score 4-6): 2-4 upstream callers
- Low impact (score 1-3): 0-1 upstream callers
- Critical path bonus: Being on a critical path from an entry point adds 2-3 points to the base dependency score
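The factor formulas translate into code almost line for line. The following is a minimal sketch with hypothetical function names; it uses the default 40/40/20 weights and takes the dependency factor as an input, since that value comes from call graph analysis rather than a closed formula.

```rust
/// Complexity Factor = min(10, ((cyclomatic / 10) + (cognitive / 20)) × 5)
fn complexity_factor(cyclomatic: u32, cognitive: u32) -> f64 {
    let raw = (f64::from(cyclomatic) / 10.0 + f64::from(cognitive) / 20.0) * 5.0;
    raw.min(10.0)
}

/// Coverage Factor = 10 × (1 - coverage) × complexity_weight,
/// where coverage is a fraction between 0.0 and 1.0.
fn coverage_factor(coverage: f64, complexity_weight: f64) -> f64 {
    10.0 * (1.0 - coverage) * complexity_weight
}

/// Base Score with the default weights: 40% complexity, 40% coverage,
/// 20% dependency.
fn base_score(complexity: f64, coverage: f64, dependency: f64) -> f64 {
    complexity * 0.40 + coverage * 0.40 + dependency * 0.20
}
```

For instance, a function with cyclomatic 5 and cognitive 10 has a complexity factor of (0.5 + 0.5) × 5 = 5.0. At 50% coverage with a complexity weight of 1.0 its coverage factor is also 5.0, and with a dependency factor of 2.0 the base score is 2.0 + 2.0 + 0.4 = 4.4.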
Default Weights
The scoring formula uses configurable weights (default values shown):
- Complexity: 40% - How difficult the code is to understand and test
- Coverage: 40% - How well the code is tested
- Dependency: 20% - How many other functions depend on this code
These weights can be adjusted in .debtmap.toml to match your team’s priorities.
Role-Based Prioritization
The unified score is multiplied by a role multiplier based on the function’s semantic classification:
| Role | Multiplier | Typical functions | Why it matters |
|---|---|---|---|
| Entry Points | 1.5× | main(), HTTP handlers, API endpoints | User-facing code where bugs have immediate impact |
| Business Logic | 1.2× | Core domain functions, algorithms | Critical functionality |
| Data Access | 1.0× | Database queries, file I/O | Baseline importance |
| Infrastructure | 0.8× | Logging, configuration, monitoring | Supporting code |
| Utilities | 0.5× | Helpers, formatters, converters | Lower impact |
| Test Code | 0.1× | Test functions, fixtures, mocks | Internal quality |
How role classification works:
Debtmap identifies function roles through pattern analysis:
- Entry points: Functions named main, handlers with routing decorators, public API functions
- Business logic: Core domain operations, calculation functions, decision-making code
- Data access: Database queries, file operations, network calls
- Infrastructure: Logging, config parsing, monitoring, error handling
- Utilities: Helper functions, formatters, type converters, validators
- Test code: Functions in test modules, test functions, fixtures
Example: Same complexity, different priorities
Consider a function with base score 8.0:
If classified as Entry Point:
Final Score = 8.0 × 1.5 = 12.0 (capped at 10.0) → CRITICAL priority
If classified as Business Logic:
Final Score = 8.0 × 1.2 = 9.6 → CRITICAL priority
If classified as Data Access:
Final Score = 8.0 × 1.0 = 8.0 → HIGH priority
If classified as Utility:
Final Score = 8.0 × 0.5 = 4.0 → LOW priority
This ensures that complex code in critical paths gets higher priority than equally complex utility code.
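The multiplier table and the priority bands can be combined in a short sketch. The `Role` enum and function names here are hypothetical; the multipliers, the cap at 10.0, and the band boundaries are taken from the tables above.

```rust
/// Function roles with the multipliers from the table above.
#[derive(Clone, Copy)]
enum Role {
    EntryPoint,
    BusinessLogic,
    DataAccess,
    Infrastructure,
    Utility,
    Test,
}

fn role_multiplier(role: Role) -> f64 {
    match role {
        Role::EntryPoint => 1.5,
        Role::BusinessLogic => 1.2,
        Role::DataAccess => 1.0,
        Role::Infrastructure => 0.8,
        Role::Utility => 0.5,
        Role::Test => 0.1,
    }
}

/// Final Score = Base Score × Role Multiplier, capped at 10.0.
fn final_score(base_score: f64, role: Role) -> f64 {
    (base_score * role_multiplier(role)).min(10.0)
}

/// Maps a 0-10 score onto the priority classifications.
fn priority_label(score: f64) -> &'static str {
    if score >= 9.0 {
        "Critical"
    } else if score >= 7.0 {
        "High"
    } else if score >= 5.0 {
        "Medium"
    } else if score >= 3.0 {
        "Low"
    } else {
        "Minimal"
    }
}
```

Calling `priority_label(final_score(8.0, Role::Utility))` yields "Low", while the same base score with `Role::EntryPoint` is capped at 10.0 and yields "Critical".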
Coverage Propagation
Coverage impact flows through the call graph using transitive coverage:
Transitive Coverage = Direct Coverage + Σ(Caller Coverage × Weight)
How it works:
Functions called by well-tested code inherit some coverage benefit, reducing their urgency. This helps identify which untested functions are on critical paths versus safely isolated utilities.
Example scenarios:
Scenario 1: Untested function with well-tested callers
Function A: 0% direct coverage
Called by:
- handle_request (95% coverage)
- process_payment (90% coverage)
- validate_order (88% coverage)
Transitive coverage: ~40% (inherits coverage benefit from callers)
Final priority: Lower than isolated 0% coverage function
Scenario 2: Untested function on critical path
Function B: 0% direct coverage
Called by:
- main (0% coverage)
- startup (10% coverage)
Transitive coverage: ~5% (minimal coverage benefit)
Final priority: Higher - on critical path with no safety net
Coverage propagation prevents false alarms about utility functions called only by well-tested code, while highlighting genuinely risky untested code on critical paths.
Unified Score Example
Function: process_payment
Location: src/payments.rs:145
Metrics:
- Cyclomatic complexity: 18
- Cognitive complexity: 25
- Test coverage: 20%
- Upstream callers: 3 (moderate impact, on a critical path from an entry point)
- Role: Business Logic
Calculation:
Complexity Factor = min(10, ((18/10) + (25/20)) × 5) = min(10, 15.25) = 10.0
Coverage Factor = 10 × (1 - 0.20) × 1.0 = 8.0
Dependency Factor = 7.5 (3 upstream callers in the moderate band, plus a critical-path bonus)
Base Score = (10.0 × 0.40) + (8.0 × 0.40) + (7.5 × 0.20)
= 4.0 + 3.2 + 1.5
= 8.7
Final Score = 8.7 × 1.2 (Business Logic multiplier)
= 10.44 (capped at 10.0) → CRITICAL priority
Legacy Risk Scoring (Pre-0.2.x)
Prior to the unified scoring system, Debtmap used a simpler additive risk formula. This is still available for compatibility but unified scoring is now the default and provides better prioritization.
Risk Categories
Note: The RiskLevel enum (Low, Medium, High, Critical) is used for legacy risk scoring compatibility. When using unified scoring (0-10 scale), refer to the priority classifications shown in the Unified Scoring System section above.
Legacy RiskLevel Enum
For legacy risk scoring, Debtmap classifies functions into four risk levels:
pub enum RiskLevel {
    Low,      // Score < 10
    Medium,   // Score 10-24
    High,     // Score 25-49
    Critical, // Score ≥ 50
}
Critical (legacy score ≥ 50)
- High complexity (cyclomatic > 15) AND low coverage (< 30%)
- Untested code that’s likely to break and hard to fix
- Action: Immediate attention required - add tests or refactor
High (legacy score 25-49)
- High complexity (cyclomatic > 10) AND moderate coverage (< 60%)
- Risky code with incomplete testing
- Action: Should be addressed soon
Medium (legacy score 10-24)
- Moderate complexity (cyclomatic > 5) AND low coverage (< 50%)
- OR: High complexity with good coverage
- Action: Plan for next sprint
Low (legacy score < 10)
- Low complexity OR high coverage
- Well-managed code
- Action: Monitor, low priority
Unified Scoring Priority Levels
When using unified scoring (default), functions are classified using the 0-10 scale:
- Critical (9.0-10.0): Immediate attention
- High (7.0-8.9): Address this sprint
- Medium (5.0-6.9): Plan for next sprint
- Low (3.0-4.9): Monitor and address as time permits
- Minimal (0.0-2.9): Well-managed code
Well-tested complex code is an outcome in both systems, not a separate category:
- Complex function (cyclomatic 18, cognitive 25) with 95% coverage
- Unified score: ~2.5 (Minimal priority due to coverage dampening)
- Legacy risk score: ~8 (Low risk)
- Falls into low-priority categories because good testing mitigates complexity
- This is the desired state for inherently complex business logic
Legacy Risk Calculation
Note: The legacy risk calculation is still supported for compatibility but has been superseded by the unified scoring system (see above). Unified scoring provides better prioritization through its multi-factor, weighted approach with role-based adjustments.
The legacy risk score uses a simpler additive formula:
risk_score = complexity_factor + coverage_factor + debt_factor
where:
  complexity_factor = (cyclomatic / 5) + (cognitive / 10)
  coverage_factor = (1 - coverage_percentage) × 50
  debt_factor = debt_score / 10 (if debt data is available)
Example (legacy scoring):
Function: process_payment
- Cyclomatic complexity: 18
- Cognitive complexity: 25
- Coverage: 20%
- Debt score: 15
Calculation:
complexity_factor = (18 / 5) + (25 / 10) = 3.6 + 2.5 = 6.1
coverage_factor = (1 - 0.20) × 50 = 40
debt_factor = 15 / 10 = 1.5
risk_score = 6.1 + 40 + 1.5 = 47.6 (HIGH RISK)
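The legacy formula and its risk bands can be written out as follows. This is a sketch for illustration: the function names are hypothetical, the enum is redefined locally so the snippet is self-contained, and a missing debt score is assumed to contribute nothing.

```rust
#[derive(Debug, PartialEq)]
enum RiskLevel {
    Low,
    Medium,
    High,
    Critical,
}

/// risk_score = complexity_factor + coverage_factor + debt_factor,
/// with coverage given as a fraction between 0.0 and 1.0.
fn legacy_risk_score(
    cyclomatic: u32,
    cognitive: u32,
    coverage: f64,
    debt_score: Option<f64>,
) -> f64 {
    let complexity_factor = f64::from(cyclomatic) / 5.0 + f64::from(cognitive) / 10.0;
    let coverage_factor = (1.0 - coverage) * 50.0;
    let debt_factor = debt_score.map_or(0.0, |d| d / 10.0);
    complexity_factor + coverage_factor + debt_factor
}

/// Maps a legacy score onto the four RiskLevel bands.
fn risk_level(score: f64) -> RiskLevel {
    if score >= 50.0 {
        RiskLevel::Critical
    } else if score >= 25.0 {
        RiskLevel::High
    } else if score >= 10.0 {
        RiskLevel::Medium
    } else {
        RiskLevel::Low
    }
}
```

Feeding in the `process_payment` metrics (18, 25, 0.20, debt score 15) reproduces the 47.6 score and the High classification.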
When to use legacy scoring:
- Comparing with historical data from older Debtmap versions
- Teams with existing workflows built around the old scale
- Gradual migration to unified scoring
Why unified scoring is better:
- Normalized 0-10 scale is more intuitive
- Weighted factors (40% complexity, 40% coverage, 20% dependency) provide better balance
- Role multipliers adjust priority based on function importance
- Coverage propagation reduces false positives for utility functions
Test Effort Assessment
Debtmap estimates testing difficulty based on cognitive complexity:
Difficulty Levels:
- Trivial (cognitive < 5): 1-2 test cases, < 1 hour
- Simple (cognitive 5-9): 3-5 test cases, 1-2 hours
- Moderate (cognitive 10-19): 6-10 test cases, 2-4 hours
- Complex (cognitive 20-40): 11-20 test cases, 4-8 hours
- VeryComplex (cognitive > 40): 20+ test cases, 8+ hours
Test Effort includes:
- Cognitive load: How hard to understand the function
- Branch count: Number of paths to test
- Recommended test cases: Suggested number of tests
Risk Distribution
Debtmap provides codebase-wide risk metrics:
{
"risk_distribution": {
"critical_count": 12,
"high_count": 45,
"medium_count": 123,
"low_count": 456,
"minimal_count": 234,
"total_functions": 870
},
"codebase_risk_score": 1247.5
}
Interpreting distribution:
- Healthy codebase: Most functions in Low/Minimal priority (unified scoring) or Low/WellTested (legacy)
- Needs attention: Many Critical/High priority functions
- Technical debt: High codebase risk score
Note on minimal_count:
In unified scoring (0-10 scale), minimal_count represents functions scoring 0-2.9, which includes:
- Simple utility functions
- Helper functions with low complexity
- Well-tested complex code that scores low due to coverage dampening
This is not a separate risk category but an outcome of the unified scoring system. Complex business logic with 95% test coverage appropriately receives a minimal score, reflecting that good testing mitigates complexity risk.
Important: minimal_count does not appear in the standard risk_categories from features.json (Low, Medium, High, Critical, WellTested). It’s specific to unified scoring’s 0-10 scale priority classifications (Minimal, Low, Medium, High, Critical).
Testing Recommendations
When coverage data is provided, Debtmap generates prioritized testing recommendations with ROI analysis:
{
"function": "process_transaction",
"file": "src/payments.rs",
"line": 145,
"current_risk": 47.6,
"potential_risk_reduction": 35.2,
"test_effort_estimate": {
"estimated_difficulty": "Complex",
"cognitive_load": 25,
"branch_count": 18,
"recommended_test_cases": 12
},
"roi": 4.4,
"rationale": "High complexity with low coverage (20%) and 3 downstream dependencies. Testing will reduce risk by 74%.",
"dependencies": {
"upstream_callers": ["handle_payment_request"],
"downstream_callees": ["validate_amount", "check_balance", "record_transaction"]
}
}
ROI calculation:
roi = potential_risk_reduction / estimated_effort_hours
Higher ROI = better return on testing investment
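A minimal sketch of the ROI calculation follows. The function name is hypothetical, and returning `None` for a non-positive effort estimate is a design choice of this sketch to avoid dividing by zero.

```rust
/// roi = potential_risk_reduction / estimated_effort_hours
fn testing_roi(potential_risk_reduction: f64, estimated_effort_hours: f64) -> Option<f64> {
    if estimated_effort_hours > 0.0 {
        Some(potential_risk_reduction / estimated_effort_hours)
    } else {
        None
    }
}
```

Assuming an 8-hour effort estimate, a potential risk reduction of 35.2 gives an ROI of 4.4.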
Interpreting Results
Understanding Output Formats
Debtmap provides three output formats:
Terminal (default): Human-readable with colors and tables
debtmap analyze .
JSON: Machine-readable for CI/CD integration
debtmap analyze . --format json --output report.json
Markdown: Documentation-friendly
debtmap analyze . --format markdown --output report.md
JSON Structure
{
"timestamp": "2025-10-09T12:00:00Z",
"project_path": "/path/to/project",
"complexity": {
"metrics": [
{
"name": "process_data",
"file": "src/main.rs",
"line": 42,
"cyclomatic": 15,
"cognitive": 22,
"est_branches": 20,
"nesting": 4,
"length": 68,
"is_test": false,
"visibility": "Public",
"is_trait_method": false,
"in_test_module": false,
"entropy_score": {
"token_entropy": 0.65,
"pattern_repetition": 0.25,
"branch_similarity": 0.30,
"effective_complexity": 0.85
},
"is_pure": false,
"purity_confidence": 0.8,
"detected_patterns": ["validation_pattern"],
"upstream_callers": ["main", "process_request"],
"downstream_callees": ["validate", "save", "notify"]
}
],
"summary": {
"total_functions": 150,
"average_complexity": 5.3,
"max_complexity": 22,
"high_complexity_count": 8
}
},
"technical_debt": {
"items": [
{
"id": "complexity_src_main_rs_42",
"debt_type": "Complexity",
"priority": "High",
"file": "src/main.rs",
"line": 42,
"column": 1,
"message": "Function exceeds complexity threshold",
"context": "Cyclomatic: 15, Cognitive: 22"
}
],
"by_type": {
"Complexity": [...],
"Duplication": [...],
"Todo": [...]
}
}
}
JSON Output Format Variants
Debtmap supports two JSON output format variants for different integration needs:
Legacy Format (default):
- Uses wrapper objects: {"File": {...}} and {"Function": {...}}
- Compatible with existing tooling and scripts
- Shown in the JSON structure example above
Unified Format (spec 108 - future enhancement):
- Uses consistent structure with "type" field discriminator
- Simpler parsing for new integrations
- Example structure:
{
"type": "function",
"name": "process_data",
"file": "src/main.rs",
"line": 42,
"metrics": { /* ... */ }
}
Note: The unified format is currently an internal representation and is not available as a user-facing CLI option. The legacy format remains the stable default for all current integrations. If you need the unified format exposed as a CLI option (--format json-unified), please open a feature request on GitHub.
Reading Function Metrics
Key fields:
- cyclomatic: Decision points - guides test case count
- cognitive: Understanding difficulty - guides refactoring priority
- est_branches: Estimated execution paths (formula: max(nesting_depth, 1) × cyclomatic ÷ 3) - approximates test cases needed for branch coverage
- nesting: Indentation depth - signals need for extraction
- length: Lines of code - signals SRP violations
- visibility: Function visibility ("Private", "Crate", or "Public" from FunctionVisibility enum)
- is_pure: No side effects - easier to test (Option type, may be None)
- purity_confidence: How certain we are about purity 0.0-1.0 (Option type, may be None)
- is_trait_method: Whether this function implements a trait method
- in_test_module: Whether function is inside a #[cfg(test)] module
- detected_patterns: Complexity adjustment patterns identified (e.g., “validation_pattern”)
- entropy_score: Pattern analysis for false positive reduction
- upstream_callers: Impact radius if this function breaks
- downstream_callees: Functions this depends on
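The est_branches formula from the field list can be checked against the JSON example above (cyclomatic 15, nesting 4, est_branches 20). The sketch below is one way to write it; the function name `estimate_branches` and the use of integer division are assumptions, since the text does not say how fractional results are rounded.

```rust
/// Sketch of the documented formula:
/// est_branches = max(nesting_depth, 1) * cyclomatic / 3.
/// Integer division is an assumption of this sketch, not confirmed
/// behavior of Debtmap.
pub fn estimate_branches(cyclomatic: u32, nesting_depth: u32) -> u32 {
    nesting_depth.max(1) * cyclomatic / 3
}
```

With the values from the JSON example, `estimate_branches(15, 4)` evaluates to 20, which agrees with the reported `est_branches`.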
Entropy interpretation:
- token_entropy < 0.4: Repetitive code, likely pattern-based
- pattern_repetition > 0.7: High similarity between blocks
- branch_similarity > 0.8: Similar conditional branches
- effective_complexity < 1.0: Dampening applied
Prioritizing Work
Debtmap provides multiple prioritization strategies, with unified scoring (0-10 scale) as the recommended default for most workflows:
1. By Unified Score (default - recommended)
debtmap analyze . --top 10
Shows top 10 items by combined complexity, coverage, and dependency factors, weighted and adjusted by function role.
Why use unified scoring:
- Balances complexity (40%), coverage (40%), and dependency impact (20%)
- Adjusts for function importance (entry points prioritized over utilities)
- Normalized 0-10 scale is intuitive and consistent
- Reduces false positives through coverage propagation
- Best for sprint planning and function-level refactoring decisions
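To make the 40/40/20 weighting concrete, here is a minimal sketch of a weighted score on a 0-10 scale. Only the three weights come from the list above. The assumption that each factor is already normalized to 0-10, the `role_multiplier` parameter, and the final clamp are illustrative choices, not Debtmap's actual formula.

```rust
// Weights taken from the list above.
const COMPLEXITY_WEIGHT: f64 = 0.4;
const COVERAGE_WEIGHT: f64 = 0.4;
const DEPENDENCY_WEIGHT: f64 = 0.2;

/// Hypothetical unified score. Each factor is assumed to be
/// pre-normalized to the 0..=10 range; `role_multiplier` is a stand-in
/// for the function-role adjustment described in the text.
pub fn unified_score(
    complexity_factor: f64,
    coverage_factor: f64,
    dependency_factor: f64,
    role_multiplier: f64,
) -> f64 {
    let base = COMPLEXITY_WEIGHT * complexity_factor
        + COVERAGE_WEIGHT * coverage_factor
        + DEPENDENCY_WEIGHT * dependency_factor;
    // Keep the result on the documented 0-10 scale.
    (base * role_multiplier).clamp(0.0, 10.0)
}
```

Because the weights sum to 1.0, three factors of 5.0 with a neutral multiplier of 1.0 yield a score of 5.0, and any multiplier that pushes the result past 10 is capped.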
Example:
# Show top 20 critical items
debtmap analyze . --min-priority 7.0 --top 20
# Focus on high-impact functions (score >= 7.0)
debtmap analyze . --format json | jq '.functions[] | select(.unified_score >= 7.0)'
2. By Risk Category (legacy compatibility)
debtmap analyze . --min-priority high
Shows only HIGH and CRITICAL priority items using legacy risk scoring.
Note: Legacy risk scoring uses additive formulas and unbounded scales. Prefer unified scoring for new workflows.
3. By Debt Type
debtmap analyze . --filter Architecture,Testing
Focuses on specific categories:
- Architecture: God objects, complexity, dead code
- Testing: Coverage gaps, test quality
- Performance: Resource leaks, inefficiencies
- CodeQuality: Code smells, maintainability
4. By ROI (with coverage)
debtmap analyze . --lcov coverage.lcov --top 20
Prioritizes by return on investment for testing/refactoring. Combines unified scoring with test effort estimates to identify high-value work.
Choosing the right strategy:
- Sprint planning for developers: Use unified scoring (--top N)
- Architectural review: Use tiered prioritization (--summary)
- Category-focused work: Use debt type filtering (--filter)
- Testing priorities: Use ROI analysis with coverage data (--lcov)
- Historical comparisons: Use legacy risk scoring (for consistency with old reports)
Tiered Prioritization
Note: Tiered prioritization uses traditional debt scoring (additive, higher = worse) and is complementary to the unified scoring system (0-10 scale). Both systems can be used together:
- Unified scoring (0-10 scale): Best for function-level prioritization and sprint planning
- Tiered prioritization (debt tiers): Best for architectural focus and strategic debt planning
Use --summary for tiered view focusing on architectural issues, or default output for function-level unified scores.
Debtmap uses a tier-based system to map debt scores to actionable priority levels. Each tier includes effort estimates and strategic guidance for efficient debt remediation.
Tier Levels
The Tier enum defines four priority levels based on score thresholds:
pub enum Tier {
    Critical, // Score ≥ 90
    High,     // Score 70-89.9
    Moderate, // Score 50-69.9
    Low,      // Score < 50
}
Score-to-Tier Mapping:
- Critical (≥ 90): Immediate action required - blocks progress
- High (70-89.9): Should be addressed this sprint
- Moderate (50-69.9): Plan for next sprint
- Low (< 50): Background maintenance work
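The score-to-tier mapping above can be expressed as a single function. This is a sketch only: the enum is redeclared so the block is self-contained, and the constructor name `from_score` is hypothetical.

```rust
/// Local copy of the documented tiers so the sketch is self-contained.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Critical,
    High,
    Moderate,
    Low,
}

impl Tier {
    /// Hypothetical constructor applying the documented thresholds:
    /// >= 90 Critical, 70-89.9 High, 50-69.9 Moderate, < 50 Low.
    pub fn from_score(score: f64) -> Tier {
        if score >= 90.0 {
            Tier::Critical
        } else if score >= 70.0 {
            Tier::High
        } else if score >= 50.0 {
            Tier::Moderate
        } else {
            Tier::Low
        }
    }
}
```

Checking the thresholds from highest to lowest means each score falls into exactly one tier, including fractional values such as 89.95 that sit between the listed ranges.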
Effort Estimates Per Tier
Each tier includes estimated effort based on typical remediation patterns:
| Tier | Estimated Effort | Typical Work |
|---|---|---|
| Critical | 1-2 days | Major refactoring, comprehensive testing, architectural changes |
| High | 2-4 hours | Extract functions, add test coverage, fix resource leaks |
| Moderate | 1-2 hours | Simplify logic, reduce duplication, improve error handling |
| Low | 30 minutes | Address TODOs, minor cleanup, documentation |
Effort calculation considers:
- Complexity metrics (cyclomatic, cognitive)
- Test coverage gaps
- Number of dependencies (upstream/downstream)
- Debt category (Architecture debt takes longer than CodeQuality)
Tiered Display Grouping
TieredDisplay groups similar debt items for batch action recommendations:
pub struct TieredDisplay {
    pub tier: Tier,
    pub items: Vec<DebtItem>,
    pub total_score: f64,
    pub estimated_total_effort_hours: f64,
    pub batch_recommendations: Vec<String>,
}
Grouping strategy:
- Groups items by tier and similarity pattern
- Prevents grouping of god objects (always show individually)
- Prevents grouping of Critical items (each needs individual attention)
- Suggests batch actions for similar Low/Moderate items
Example batch recommendations:
{
"tier": "Moderate",
"total_score": 245.8,
"estimated_total_effort_hours": 12.5,
"batch_recommendations": [
"Extract 5 validation functions from similar patterns",
"Add test coverage for 8 moderately complex functions (grouped by module)",
"Refactor 3 functions with similar nested loop patterns"
]
}
Using Tiered Prioritization
1. Start with Critical tier:
debtmap analyze . --min-priority critical
Focus on items with score ≥ 90. These typically represent:
- Complex functions with 0% coverage
- God objects blocking feature development
- Critical resource leaks or security issues
2. Plan High tier work:
debtmap analyze . --min-priority high --format json > sprint-plan.json
Schedule 2-4 hours per item for this sprint. Look for:
- Functions approaching complexity thresholds
- Moderate coverage gaps on important code paths
- Performance bottlenecks with clear solutions
3. Batch Moderate tier items:
debtmap analyze . --min-priority moderate
Review batch recommendations. Examples:
- “10 validation functions detected - extract common pattern”
- “5 similar test files with duplication - create shared fixtures”
- “8 functions with magic values - create constants module”
4. Schedule Low tier background work: Address during slack time or as warm-up tasks for new contributors.
Strategic Guidance by Tier
Critical Tier Strategy:
- Block new features until addressed
- Pair programming recommended for complex items
- Architectural review before major refactoring
- Comprehensive testing after changes
High Tier Strategy:
- Sprint planning priority
- Impact analysis before changes
- Code review from senior developers
- Integration testing after changes
Moderate Tier Strategy:
- Batch similar items for efficiency
- Extract patterns across multiple files
- Incremental improvement over multiple PRs
- Regression testing for affected areas
Low Tier Strategy:
- Good first issues for new contributors
- Documentation improvements
- Code cleanup during refactoring nearby code
- Technical debt gardening sessions
Categorized Debt Analysis
Debtmap provides CategorizedDebt analysis that groups debt items by category and identifies cross-category dependencies. This helps teams understand strategic relationships between different types of technical debt.
CategorySummary
Each category gets a summary with metrics for planning:
pub struct CategorySummary {
    pub category: DebtCategory,
    pub total_score: f64,
    pub item_count: usize,
    pub estimated_effort_hours: f64,
    pub average_severity: f64,
    pub top_items: Vec<DebtItem>, // Up to 5 highest priority
}
Effort estimation formulas:
- Architecture debt: complexity_score / 10 × 2 hours (structural changes take longer)
- Testing debt: complexity_score / 10 × 1.5 hours (writing tests)
- Performance debt: complexity_score / 10 × 1.8 hours (profiling + optimization)
- CodeQuality debt: complexity_score / 10 × 1.2 hours (refactoring)
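The four effort formulas differ only in their multiplier, so they can be sketched as one function over a category enum. The enum is redeclared here to keep the block self-contained, and the function name `estimated_effort_hours` is an illustrative choice rather than a confirmed Debtmap API.

```rust
/// Local stand-in for the debt categories named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DebtCategory {
    Architecture,
    Testing,
    Performance,
    CodeQuality,
}

/// Sketch of the documented formulas:
/// hours = complexity_score / 10 * category multiplier.
pub fn estimated_effort_hours(category: DebtCategory, complexity_score: f64) -> f64 {
    let hours_per_ten_points = match category {
        DebtCategory::Architecture => 2.0, // structural changes take longer
        DebtCategory::Testing => 1.5,      // writing tests
        DebtCategory::Performance => 1.8,  // profiling + optimization
        DebtCategory::CodeQuality => 1.2,  // refactoring
    };
    complexity_score / 10.0 * hours_per_ten_points
}
```

An Architecture score of 487.5 gives 487.5 / 10 × 2 = 97.5 hours under this sketch.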
Example category summary:
{
"category": "Architecture",
"total_score": 487.5,
"item_count": 15,
"estimated_effort_hours": 97.5,
"average_severity": 32.5,
"top_items": [
{
"debt_type": "GodObject",
"file": "src/services/user_service.rs",
"score": 95.0,
"estimated_effort_hours": 16.0
},
{
"debt_type": "ComplexityHotspot",
"file": "src/payments/processor.rs",
"score": 87.3,
"estimated_effort_hours": 14.0
}
]
}
Cross-Category Dependencies
CrossCategoryDependency identifies blocking relationships between different debt categories:
pub struct CrossCategoryDependency {
    pub from_category: DebtCategory,
    pub to_category: DebtCategory,
    pub blocking_items: Vec<(DebtItem, DebtItem)>,
    pub impact_level: ImpactLevel, // Critical, High, Medium, Low
    pub recommendation: String,
}
Common dependency patterns:
1. Architecture blocks Testing:
- Pattern: God objects are too complex to test effectively
- Example: UserService has 50+ functions, making comprehensive testing impractical
- Impact: Critical - cannot improve test coverage without refactoring
- Recommendation: “Split god object into 4-5 focused modules before adding tests”
2. Async issues require Architecture changes:
- Pattern: Blocking I/O in async contexts requires architectural redesign
- Example: Sync database calls in async handlers
- Impact: High - performance problems require design changes
- Recommendation: “Introduce async database layer before optimizing handlers”
3. Complexity affects Testability:
- Pattern: High cyclomatic complexity makes thorough testing difficult
- Example: Function with 22 branches needs 22+ test cases
- Impact: High - testing effort grows exponentially with complexity
- Recommendation: “Reduce complexity to < 10 before writing comprehensive tests”
4. Performance requires Architecture:
- Pattern: O(n²) nested loops need different data structures
- Example: Linear search in loops should use HashMap
- Impact: Medium - optimization requires structural changes
- Recommendation: “Refactor data structure before micro-optimizations”
Example cross-category dependency:
{
"from_category": "Architecture",
"to_category": "Testing",
"impact_level": "Critical",
"blocking_items": [
{
"blocker": {
"debt_type": "GodObject",
"file": "src/services/user_service.rs",
"functions": 52,
"score": 95.0
},
"blocked": {
"debt_type": "TestingGap",
"file": "src/services/user_service.rs",
"coverage": 15,
"score": 78.0
}
}
],
"recommendation": "Split UserService into focused modules (auth, profile, settings, notifications) before attempting to improve test coverage. Current structure makes comprehensive testing impractical.",
"estimated_unblock_effort_hours": 16.0
}
Using Categorized Debt Analysis
View all category summaries:
debtmap analyze . --format json | jq '.categorized_debt.summaries'
Focus on specific category:
debtmap analyze . --filter Architecture --top 10
Identify blocking relationships:
debtmap analyze . --format json | jq '.categorized_debt.cross_category_dependencies[] | select(.impact_level == "Critical")'
Strategic planning workflow:
1. Review category summaries:
- Identify which category has highest total score
- Check estimated effort hours per category
- Note average severity to gauge urgency
2. Check cross-category dependencies:
- Find Critical and High impact blockers
- Prioritize blockers before blocked items
- Plan architectural changes before optimization
3. Plan remediation order. Example decision tree:
- Architecture score > 400? → Address god objects first
- Testing gap with low complexity? → Quick wins, add tests
- Performance issues + architecture debt? → Refactor structure first
- High code quality debt but good architecture? → Incremental cleanup
4. Use category-specific strategies:
- Architecture: Pair programming, design reviews, incremental refactoring
- Testing: TDD for new code, characterization tests for legacy
- Performance: Profiling first, optimize hot paths, avoid premature optimization
- CodeQuality: Code review focus, linting rules, consistent patterns
CategorizedDebt Output Structure
{
"categorized_debt": {
"summaries": [
{
"category": "Architecture",
"total_score": 487.5,
"item_count": 15,
"estimated_effort_hours": 97.5,
"average_severity": 32.5,
"top_items": [...]
},
{
"category": "Testing",
"total_score": 356.2,
"item_count": 23,
"estimated_effort_hours": 53.4,
"average_severity": 15.5,
"top_items": [...]
},
{
"category": "Performance",
"total_score": 234.8,
"item_count": 12,
"estimated_effort_hours": 42.3,
"average_severity": 19.6,
"top_items": [...]
},
{
"category": "CodeQuality",
"total_score": 189.3,
"item_count": 31,
"estimated_effort_hours": 22.7,
"average_severity": 6.1,
"top_items": [...]
}
],
"cross_category_dependencies": [
{
"from_category": "Architecture",
"to_category": "Testing",
"impact_level": "Critical",
"blocking_items": [...],
"recommendation": "..."
}
]
}
}
Debt Density Metric
Debt density normalizes technical debt scores across projects of different sizes, providing a per-1000-lines-of-code metric for fair comparison.
Formula
debt_density = (total_debt_score / total_lines_of_code) × 1000
Example calculation:
Project A:
- Total debt score: 1,250
- Total lines of code: 25,000
- Debt density: (1,250 / 25,000) × 1000 = 50
Project B:
- Total debt score: 2,500
- Total lines of code: 50,000
- Debt density: (2,500 / 50,000) × 1000 = 50
Projects A and B have equal debt density (50) despite B having twice the absolute debt, because B is also twice as large. They have proportionally similar technical debt.
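The same calculation as a small Rust sketch. The function name `debt_density` and the `None` result for an empty codebase are assumptions made for illustration.

```rust
/// Sketch of the documented formula:
/// debt_density = (total_debt_score / total_lines_of_code) * 1000.
/// Returns None for a codebase with zero lines, where the ratio is
/// undefined (this guard is an assumption of the sketch).
pub fn debt_density(total_debt_score: f64, total_lines_of_code: usize) -> Option<f64> {
    if total_lines_of_code == 0 {
        return None;
    }
    Some(total_debt_score / total_lines_of_code as f64 * 1000.0)
}
```

Projects A (1,250 debt over 25,000 lines) and B (2,500 debt over 50,000 lines) both come out at a density of 50 with this function.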
Interpretation Guidelines
Use these thresholds to assess codebase health:
| Debt Density | Assessment | Description |
|---|---|---|
| 0-50 | Clean | Well-maintained codebase, minimal debt |
| 51-100 | Moderate | Typical technical debt, manageable |
| 101-150 | High | Significant debt, prioritize remediation |
| > 150 | Critical | Severe debt burden, may impede development |
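The thresholds in the table can be turned into a lookup like the one below. This is a sketch under two assumptions: fractional densities fall into the band whose upper bound they do not exceed, and a density of exactly 150 counts as High, which matches a gate that fails only above 150. The function name `assess_density` is hypothetical.

```rust
/// Maps a debt density to the assessment labels from the table.
/// Boundary handling (<= 50, <= 100, <= 150) is an assumption of this
/// sketch; the table lists integer ranges only.
pub fn assess_density(density: f64) -> &'static str {
    if density <= 50.0 {
        "Clean"
    } else if density <= 100.0 {
        "Moderate"
    } else if density <= 150.0 {
        "High"
    } else {
        "Critical"
    }
}
```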
Context matters:
- Early-stage projects: Often have higher density (rapid iteration)
- Mature projects: Should trend toward lower density over time
- Legacy systems: May have high density, track trend over time
- Greenfield rewrites: Aim for density < 50
Using Debt Density
1. Compare projects fairly:
# Small microservice (5,000 LOC, debt = 250)
# Debt density: 50
# Large monolith (100,000 LOC, debt = 5,000)
# Debt density: 50
# Equal health despite size difference
2. Track improvement over time:
Sprint 1: 50,000 LOC, debt = 7,500, density = 150 (High)
Sprint 5: 52,000 LOC, debt = 6,500, density = 125 (Improving)
Sprint 10: 54,000 LOC, debt = 4,860, density = 90 (Moderate)
3. Set team goals:
Current density: 120
Target density: < 80 (by Q4)
Reduction needed: 40 points
Strategy:
- Fix 2-3 Critical items per sprint
- Prevent new debt (enforce thresholds)
- Refactor before adding features in high-debt modules
4. Benchmark across teams/projects:
{
"team_metrics": [
{
"project": "auth-service",
"debt_density": 45,
"assessment": "Clean",
"trend": "stable"
},
{
"project": "billing-service",
"debt_density": 95,
"assessment": "Moderate",
"trend": "improving"
},
{
"project": "legacy-api",
"debt_density": 165,
"assessment": "Critical",
"trend": "worsening"
}
]
}
Limitations
Debt density doesn’t account for:
- Code importance: 100 LOC in payment logic ≠ 100 LOC in logging utils
- Complexity distribution: One 1000-line god object vs. 1000 simple functions
- Test coverage: 50% coverage on critical paths vs. low-priority features
- Team familiarity: New codebase vs. well-understood legacy system
Best practices:
- Use density as one metric among many
- Combine with category analysis and tiered prioritization
- Focus on trend (improving/stable/worsening) over absolute number
- Consider debt per module for more granular insights
Debt Density in CI/CD
Track density over time:
# Generate report with density
debtmap analyze . --format json --output debt-report.json
# Extract density for trending
DENSITY=$(jq '.debt_density' debt-report.json)
# Store in metrics database
echo "debtmap.density:${DENSITY}|g" | nc -u -w0 statsd 8125
Set threshold gates:
# .github/workflows/debt-check.yml
- name: Check debt density
  run: |
    DENSITY=$(debtmap analyze . --format json | jq '.debt_density')
    if (( $(echo "$DENSITY > 150" | bc -l) )); then
      echo "❌ Debt density too high: $DENSITY (limit: 150)"
      exit 1
    fi
    echo "✅ Debt density acceptable: $DENSITY"
Actionable Insights
Each recommendation includes:
ACTION: What to do
- “Add 6 unit tests for full coverage”
- “Refactor into 3 smaller functions”
- “Extract validation to separate function”
IMPACT: Expected improvement
- “Full test coverage, -3.7 risk”
- “Reduce complexity from 22 to 8”
- “Eliminate 120 lines of duplication”
WHY: Rationale
- “Business logic with 0% coverage, manageable complexity”
- “High complexity with low coverage threatens stability”
- “Repeated validation pattern across 5 files”
Example workflow:
- Run analysis with coverage: debtmap analyze . --lcov coverage.lcov
- Filter to CRITICAL items: --min-priority critical
- Review top 5 recommendations
- Start with highest ROI items
- Rerun analysis to track progress
Common Patterns to Recognize
Pattern 1: High Complexity, Well Tested
Complexity: 25, Coverage: 95%, Risk: LOW
This is actually good! Complex but thoroughly tested code. Learn from this approach.
Pattern 2: Moderate Complexity, No Tests
Complexity: 12, Coverage: 0%, Risk: CRITICAL
Highest priority - manageable complexity, should be easy to test.
Pattern 3: Low Complexity, No Tests
Complexity: 3, Coverage: 0%, Risk: LOW
Low priority - simple code, less risky without tests.
Pattern 4: Repetitive High Complexity (Dampened)
Cyclomatic: 20, Effective: 7 (65% dampened), Risk: LOW
Validation or dispatch pattern - looks complex but is repetitive. Lower priority.
Pattern 5: God Object
File: services.rs, Functions: 50+, Responsibilities: 15+
Architectural issue - split before adding features.
Analyzer Types
Debtmap supports multiple programming languages with varying levels of analysis capability.
Supported Languages
Rust (Full Support)
- Parser: syn (native Rust AST)
- Capabilities:
- Full complexity metrics (cyclomatic, cognitive, entropy)
- Trait implementation tracking
- Purity detection with confidence scoring
- Call graph analysis (upstream callers, downstream callees)
- Semantic function classification (entry points, business logic, data access, infrastructure, utilities, test code)
- Enhanced call graph with transitive relationships
- Macro expansion support for accurate complexity analysis
- Pattern-based adjustments for macros and code generation
- Visibility tracking (pub, pub(crate), private)
- Test module detection (#[cfg(test)])
Semantic Classification:
Debtmap automatically identifies function roles in Rust code to apply appropriate role multipliers in unified scoring:
- Entry Points: Functions named main, start, or public functions in bin/ modules
- Business Logic: Core domain functions with complex logic, algorithms, business rules
- Data Access: Functions performing database queries, file I/O, network operations
- Infrastructure: Logging, configuration, monitoring, error handling utilities
- Utilities: Helper functions, formatters, type converters, validation functions
- Test Code: Functions in #[cfg(test)] modules, functions with #[test] attribute
This classification feeds directly into the unified scoring system’s role multiplier (see Risk Scoring section).
Python (Partial Support)
- Parser: rustpython-parser
- Capabilities:
- Complexity metrics (cyclomatic, cognitive)
- Python-specific error handling patterns
- Purity detection for pure functions
- Basic debt pattern detection
- Limited call graph support
JavaScript (Partial Support)
- Parser: tree-sitter (JavaScript grammar)
- File extensions: .js, .jsx, .mjs, .cjs
- Capabilities:
- ECMAScript complexity patterns
- Basic complexity metrics
- Function extraction
- Limited pattern detection
TypeScript (Partial Support)
- Parser: tree-sitter (TypeScript grammar)
- File extensions: .ts, .tsx, .mts, .cts
- Capabilities:
- Similar to JavaScript support
- Type information currently not utilized
- Basic complexity metrics
- Limited pattern detection
Unsupported Languages:
Debtmap’s Language enum contains only the four supported languages: Rust, Python, JavaScript, and TypeScript. Files with unsupported extensions are filtered out during the file discovery phase and never reach the analysis stage.
Files with extensions like .cpp (C++), .java, .go, .rb (Ruby), .php, .cs (C#), .swift, .kt (Kotlin), .scala, and others are silently filtered during discovery.
File filtering behavior:
- Discovery scans project for files matching supported extensions
- Unsupported files are skipped silently (no warnings or errors)
- No analysis, metrics, or debt patterns are generated for filtered files
- Use the --languages flag to explicitly control which languages to analyze
Example:
# Only analyze Rust files (skip Python/JS/TS)
debtmap analyze . --languages rust
# Analyze Rust and Python only
debtmap analyze . --languages rust,python
Language Detection
Automatic detection by file extension:
let language = Language::from_path(&path);
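As an illustration of extension-based detection, the sketch below maps a path to one of the four supported languages. It is not Debtmap's `Language::from_path`: the free function `detect_language`, the `Option` return type, and the `.rs` and `.py` mappings are assumptions, while the JavaScript and TypeScript extensions are the ones listed earlier in this chapter.

```rust
use std::path::Path;

/// Local stand-in for the four supported languages.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Rust,
    Python,
    JavaScript,
    TypeScript,
}

/// Hypothetical detector: returns None for unsupported extensions,
/// mirroring how such files are skipped during discovery.
pub fn detect_language(path: &Path) -> Option<Language> {
    let ext = path.extension()?.to_str()?;
    match ext {
        "rs" => Some(Language::Rust),   // assumed extension
        "py" => Some(Language::Python), // assumed extension
        "js" | "jsx" | "mjs" | "cjs" => Some(Language::JavaScript),
        "ts" | "tsx" | "mts" | "cts" => Some(Language::TypeScript),
        _ => None,
    }
}
```

Returning `None` instead of an error keeps the caller's discovery loop simple: unsupported files are dropped with a `filter_map` and never reach an analyzer.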
Explicit language selection:
debtmap analyze . --languages rust,python
Extensibility
Debtmap’s architecture allows adding new languages:
- Implement Analyzer trait:
pub trait Analyzer: Send + Sync {
    fn parse(&self, content: &str, path: PathBuf) -> Result<Ast>;
    fn analyze(&self, ast: &Ast) -> FileMetrics;
    fn language(&self) -> Language;
}
- Register in get_analyzer():
pub fn get_analyzer(language: Language) -> Box<dyn Analyzer> {
    match language {
        Language::Rust => Box::new(RustAnalyzer::new()),
        Language::YourLanguage => Box::new(YourAnalyzer::new()),
        // ...
    }
}
See src/analyzers/rust.rs for a complete implementation example.
Advanced Features
Purity Detection
Debtmap detects pure functions - those without side effects that always return the same output for the same input.
What makes a function pure:
- No I/O operations (file, network, database)
- No mutable global state
- No random number generation
- No system calls
- Deterministic output
Purity detection is optional:
- Both is_pure and purity_confidence are Option types
- May be None for some functions or languages where detection is not available
- Rust has the most comprehensive purity detection support
Confidence scoring (when available):
- 0.9-1.0: Very confident (no side effects detected)
- 0.7-0.8: Likely pure (minimal suspicious patterns)
- 0.5-0.6: Uncertain (some suspicious patterns)
- 0.0-0.4: Likely impure (side effects detected)
Example:
// Pure: confidence = 0.95
fn calculate_total(items: &[Item]) -> f64 {
    items.iter().map(|i| i.price).sum()
}
// Impure: confidence = 0.1 (I/O detected)
fn save_total(items: &[Item]) -> Result<()> {
    let total = items.iter().map(|i| i.price).sum();
    write_to_file(total) // Side effect!
}
Benefits:
- Pure functions are easier to test
- Can be safely cached or memoized
- Safe to parallelize
- Easier to reason about
Data Flow Analysis
Debtmap builds a comprehensive DataFlowGraph that extends basic call graph analysis with variable dependencies, data transformations, I/O operations, and purity tracking.
Call Graph Foundation
Upstream callers - Who calls this function
- Indicates impact radius
- More callers = higher impact if it breaks
Downstream callees - What this function calls
- Indicates dependencies
- More callees = more integration testing needed
Example:
{
"name": "process_payment",
"upstream_callers": [
"handle_checkout",
"process_subscription",
"handle_refund"
],
"downstream_callees": [
"validate_payment_method",
"calculate_fees",
"record_transaction",
"send_receipt"
]
}
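Upstream callers are the inverse of downstream callees, so one direction can be derived from the other. The following sketch builds a caller index from a callee map; the function name `upstream_callers` and the use of `BTreeMap` (chosen for deterministic ordering) are illustrative assumptions, not Debtmap's internal representation.

```rust
use std::collections::BTreeMap;

/// Inverts a callee map (function -> functions it calls) into a
/// caller map (function -> functions that call it).
pub fn upstream_callers(
    callees: &BTreeMap<String, Vec<String>>,
) -> BTreeMap<String, Vec<String>> {
    let mut callers: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (caller, targets) in callees {
        for target in targets {
            // Record `caller` as an upstream caller of `target`.
            callers.entry(target.clone()).or_default().push(caller.clone());
        }
    }
    callers
}
```

The length of each resulting list is the impact radius described above: the more entries a function has, the more call sites are affected if it breaks.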
Variable Dependency Tracking
DataFlowGraph tracks which variables each function depends on:
pub struct DataFlowGraph {
    // Maps function_id -> set of variable names used
    variable_dependencies: HashMap<String, HashSet<String>>,
    // ...
}
What it tracks:
- Local variables accessed in function body
- Function parameters
- Captured variables (closures)
- Mutable vs immutable references
Benefits:
- Identify functions coupled through shared state
- Detect potential side effect chains
- Guide refactoring to reduce coupling
Example output:
{
"function": "calculate_total",
"variable_dependencies": ["items", "tax_rate", "discount", "total"],
"parameter_count": 3,
"local_var_count": 1
}
Data Transformation Patterns
DataFlowGraph identifies common functional programming patterns:
pub enum TransformationType {
    Map,     // Transform each element
    Filter,  // Select subset of elements
    Reduce,  // Aggregate to single value
    FlatMap, // Transform and flatten
    Unknown, // Other transformations
}
Pattern detection:
- Recognizes iterator chains (.map(), .filter(), .fold())
- Identifies functional vs imperative data flow
- Tracks input/output variable relationships
Example:
// Detected as: Filter → Map → Reduce pattern
fn total_active_users(users: &[User]) -> f64 {
    users.iter()
        .filter(|u| u.active) // Filter transformation
        .map(|u| u.balance)   // Map transformation
        .sum()                // Reduce transformation
}
Transformation metadata:
{
"function": "total_active_users",
"input_vars": ["users"],
"output_vars": ["sum_result"],
"transformation_type": "Reduce",
"is_functional_style": true,
"pipeline_length": 3
}
I/O Operation Detection
Tracks functions performing I/O operations for purity and performance analysis:
I/O categories tracked:
- File I/O: std::fs, File::open, read_to_string
- Network I/O: HTTP requests, socket operations
- Database I/O: SQL queries, ORM operations
- System calls: Process spawning, environment access
- Blocking operations: thread::sleep, synchronous I/O in async
Example detection:
// Detected I/O operations: FileWrite
fn save_config(config: &Config, path: &Path) -> Result<()> {
    let json = serde_json::to_string(config)?; // No I/O
    std::fs::write(path, json)?;               // FileWrite detected
    Ok(())
}
I/O metadata:
{
"function": "save_config",
"io_operations": ["FileWrite"],
"is_blocking": true,
"affects_purity": true,
"async_safe": false
}
Purity Analysis Integration
DataFlowGraph integrates with purity detection to provide comprehensive side effect analysis:
Side effect tracking:
- I/O operations (file, network, console)
- Global state mutations
- Random number generation
- System time access
- Non-deterministic behavior
Purity confidence factors:
- 1.0: Pure mathematical function, no side effects
- 0.8: Pure with deterministic data transformations
- 0.5: Mixed - some suspicious patterns
- 0.2: Likely impure - I/O detected
- 0.0: Definitely impure - multiple side effects
Example analysis:
{
"function": "calculate_discount",
"is_pure": true,
"purity_confidence": 0.95,
"side_effects": [],
"deterministic": true,
"safe_to_parallelize": true,
"safe_to_cache": true
}
Modification Impact Analysis
DataFlowGraph calculates the impact of modifying a function:
pub struct ModificationImpact {
    pub function_name: String,
    pub affected_functions: Vec<String>, // Upstream callers
    pub dependency_count: usize,         // Downstream callees
    pub has_side_effects: bool,
    pub risk_level: RiskLevel,
}
Risk level calculation:
- Critical: Many upstream callers + side effects + low test coverage
- High: Many callers OR side effects with moderate coverage
- Medium: Few callers with side effects OR many callers with good coverage
- Low: Few callers, no side effects, or well-tested
Example impact analysis:
{
"function": "validate_payment_method",
"modification_impact": {
"affected_functions": [
"process_payment",
"refund_payment",
"update_payment_method",
"validate_subscription"
],
"affected_count": 4,
"dependency_count": 8,
"has_side_effects": true,
"io_operations": ["DatabaseRead", "NetworkCall"],
"risk_level": "High",
"recommendation": "Comprehensive testing required - 4 functions depend on this, performs I/O"
}
}
Using modification impact:
# Analyze impact before refactoring
debtmap analyze . --format json | jq '.functions[] | select(.name == "validate_payment_method") | .modification_impact'
Impact analysis uses:
- Refactoring planning: Understand blast radius before changes
- Test prioritization: Focus tests on high-impact functions
- Code review: Flag high-risk changes for extra scrutiny
- Dependency management: Identify tightly coupled components
DataFlowGraph Methods
Key methods for data flow analysis:
// Add function with its dependencies
pub fn add_function(&mut self, function_id: String, callees: Vec<String>)
// Track variable dependencies
pub fn add_variable_dependency(&mut self, function_id: String, var_name: String)
// Record I/O operations
pub fn add_io_operation(&mut self, function_id: String, io_type: IoType)
// Calculate modification impact
pub fn calculate_modification_impact(&self, function_id: &str) -> ModificationImpact
// Get all functions affected by a change
pub fn get_affected_functions(&self, function_id: &str) -> Vec<String>
// Find functions with side effects
pub fn find_functions_with_side_effects(&self) -> Vec<String>
Integration in analysis pipeline:
- Parser builds initial call graph
- DataFlowGraph extends with variable/I/O tracking
- Purity analyzer adds side effect information
- Modification impact calculated for each function
- Results used in prioritization and risk scoring
Connection to Unified Scoring:
The dependency analysis from DataFlowGraph directly feeds into the unified scoring system’s dependency factor (20% weight):
- Dependency Factor Calculation: Functions with high upstream caller count or on critical paths from entry points receive higher dependency scores (8-10)
- Isolated Utilities: Functions with few or no callers score lower (1-3) on dependency factor
- Impact Prioritization: This helps prioritize functions where bugs have wider impact across the codebase
- Modification Risk: The modification impact analysis uses dependency data to calculate blast radius when changes are made
Example:
Function: validate_payment_method
Upstream callers: 4 (high impact)
→ Dependency Factor: 8.0
Function: format_currency_string
Upstream callers: 0 (utility)
→ Dependency Factor: 1.5
Both have same complexity, but validate_payment_method gets higher unified score
due to its critical role in the call graph.
This integration ensures that the unified scoring system considers not just internal function complexity and test coverage, but also the function’s importance in the broader codebase architecture.
Entropy-Based Complexity
Advanced pattern detection to reduce false positives.
Token Classification:
enum TokenType {
    Variable, // Weight: 1.0
    Method,   // Weight: 1.5 (more important)
    Literal,  // Weight: 0.5 (less important)
    Keyword,  // Weight: 0.8
    Operator, // Weight: 0.6
}
Shannon Entropy Calculation:
H(X) = -Σ p(x) × log₂(p(x))
where p(x) is the probability of each token type.
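As a worked illustration of the formula, the sketch below computes the raw Shannon entropy (in bits) of a sequence of token labels. The function name `shannon_entropy` and the use of string labels are assumptions made for the example. How Debtmap applies the token weights and scales the result to the 0 to 1 range used by its thresholds is not shown here.

```rust
use std::collections::HashMap;

/// Shannon entropy, in bits, of a sequence of token labels.
/// Returns 0.0 for an empty sequence.
pub fn shannon_entropy(tokens: &[&str]) -> f64 {
    if tokens.is_empty() {
        return 0.0;
    }
    // Count how often each label occurs.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for token in tokens {
        *counts.entry(*token).or_insert(0) += 1;
    }
    let total = tokens.len() as f64;
    // H(X) = -sum over x of p(x) * log2(p(x))
    counts
        .values()
        .map(|&count| {
            let p = count as f64 / total;
            -p * p.log2()
        })
        .sum::<f64>()
}
```

A sequence made of a single repeated label has zero entropy, while four equally likely labels give 2 bits, which is why repetitive code scores low.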
Dampening Decision:
if entropy_score.token_entropy < 0.4
    && entropy_score.pattern_repetition > 0.6
    && entropy_score.branch_similarity > 0.7
{
    // Apply dampening: scale the base complexity down by the dampening factor
    effective_complexity = base_complexity * (1.0 - dampening_factor);
}
Output explanation:
Function: validate_input
Cyclomatic: 15 → Effective: 5
Reasoning:
- High pattern repetition detected (85%)
- Low token entropy indicates simple patterns (0.32)
- Similar branch structures found (92% similarity)
- Complexity reduced by 67% due to pattern-based code
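The dampening decision and the sample output can be tied together in a small function. This is a sketch only: the `EntropyScore` field names and thresholds come from the snippet above, while `effective_complexity` and the choice to pass `dampening_factor` in as a parameter are assumptions, since the way Debtmap derives that factor is not described here.

```rust
/// Entropy measurements for one function, each in the range 0.0..=1.0.
pub struct EntropyScore {
    pub token_entropy: f64,
    pub pattern_repetition: f64,
    pub branch_similarity: f64,
}

/// Returns the dampened complexity when all three thresholds indicate
/// repetitive, pattern-based code; otherwise returns `base` unchanged.
pub fn effective_complexity(base: u32, score: &EntropyScore, dampening_factor: f64) -> u32 {
    let repetitive = score.token_entropy < 0.4
        && score.pattern_repetition > 0.6
        && score.branch_similarity > 0.7;
    if !repetitive {
        return base;
    }
    // Keep the factor within 0..=1 so the result never goes negative.
    let factor = dampening_factor.clamp(0.0, 1.0);
    (base as f64 * (1.0 - factor)).round() as u32
}
```

With the values from the output above (entropy 0.32, repetition 0.85, similarity 0.92) and a factor of 0.67, a cyclomatic complexity of 15 becomes 5.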
Entropy Analysis Caching
EntropyAnalyzer includes an LRU-style cache for performance optimization when analyzing large codebases or performing repeated analysis.
Cache Structure
use std::time::Instant;

struct CacheEntry {
    score: EntropyScore, // Cached entropy result
    timestamp: Instant,  // Used for LRU ordering
    hit_count: usize,    // Frequently accessed entries stay longer
}
Cache configuration:
- Default size: 1000 entries
- Eviction policy: LRU (Least Recently Used)
- Memory per entry: ~128 bytes
- Total memory overhead: ~128 KB for default size
Cache Statistics
The analyzer tracks cache performance:
pub struct CacheStats {
    pub hits: usize,
    pub misses: usize,
    pub evictions: usize,
    pub hit_rate: f64,
    pub memory_bytes: usize,
}
Example stats output:
{
"entropy_cache_stats": {
"hits": 3427,
"misses": 1573,
"evictions": 573,
"hit_rate": 0.685,
"memory_bytes": 128000
}
}
Hit rate interpretation:
- > 0.7: Excellent - many repeated analyses, cache is effective
- 0.4-0.7: Good - moderate reuse, typical for incremental analysis
- < 0.4: Low - mostly unique functions, cache less helpful
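The hit rate in the stats output is the share of lookups served from the cache. The following sketch shows that arithmetic and the interpretation bands listed above. The helper names `hit_rate` and `interpret_hit_rate` are hypothetical and not part of Debtmap's API.

```rust
/// Fraction of lookups served from the cache; 0.0 when there were no lookups.
pub fn hit_rate(hits: usize, misses: usize) -> f64 {
    let total = hits + misses;
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}

/// Maps a hit rate to the interpretation bands described above.
pub fn interpret_hit_rate(rate: f64) -> &'static str {
    if rate > 0.7 {
        "Excellent"
    } else if rate >= 0.4 {
        "Good"
    } else {
        "Low"
    }
}
```

For the example stats, 3427 hits and 1573 misses give 3427 / 5000 = 0.6854, which falls in the "Good" band.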
Performance Benefits
Typical performance gains:
- Cold analysis: 100ms baseline (no cache benefit)
- Incremental analysis: 30-40ms (~60-70% faster) for unchanged functions
- Re-analysis: 15-20ms (~80-85% faster) for recently analyzed functions
Best for:
- Watch mode: Analyzing on file save (repeated analysis of same files)
- CI/CD: Comparing feature branch to main (overlap in functions)
- Large codebases: Many similar functions benefit from pattern caching
Memory estimation:
Total cache memory = entry_count × 128 bytes
Examples:
- 1,000 entries: ~128 KB (default)
- 5,000 entries: ~640 KB (large projects)
- 10,000 entries: ~1.28 MB (very large)
Cache Management
Automatic eviction:
- When cache reaches size limit, oldest entries evicted
- Hit count influences retention (frequently accessed stay longer)
- Timestamp used for LRU ordering
Cache invalidation:
- Function source changes invalidate entry
- Cache cleared between major analysis runs
- No manual invalidation needed
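The eviction behavior can be pictured with a small bounded cache. This is an illustrative sketch rather than the EntropyAnalyzer's code: `EntropyCache` is a hypothetical type, entries are keyed by an assumed hash of the function source, a logical tick counter stands in for `Instant` so the behavior is deterministic, and hit-count based retention is left out.

```rust
use std::collections::HashMap;

/// Hypothetical bounded cache with least-recently-used eviction.
pub struct EntropyCache {
    capacity: usize,
    tick: u64,
    // key (assumed source hash) -> (cached score, tick of last use)
    entries: HashMap<u64, (f64, u64)>,
    pub evictions: usize,
}

impl EntropyCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, tick: 0, entries: HashMap::new(), evictions: 0 }
    }

    /// Returns the cached score and marks the entry as recently used.
    pub fn get(&mut self, key: u64) -> Option<f64> {
        self.tick += 1;
        let tick = self.tick;
        self.entries.get_mut(&key).map(|entry| {
            entry.1 = tick;
            entry.0
        })
    }

    /// Inserts a score, evicting the least recently used entry when full.
    pub fn insert(&mut self, key: u64, score: f64) {
        self.tick += 1;
        if self.capacity == 0 {
            return;
        }
        if !self.entries.contains_key(&key) && self.entries.len() >= self.capacity {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, value)| value.1)
                .map(|(k, _)| *k);
            if let Some(k) = oldest {
                self.entries.remove(&k);
                self.evictions += 1;
            }
        }
        self.entries.insert(key, (score, self.tick));
    }
}
```

Because a changed function body produces a different key, stale entries are simply never hit again and age out through normal eviction, which matches the "no manual invalidation" note above.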
Configuration (if exposed in future):
[entropy.cache]
enabled = true
size = 1000 # Number of entries
ttl_seconds = 3600 # Optional: expire after 1 hour
Context-Aware Analysis
Debtmap adjusts analysis based on code context:
Pattern Recognition:
- Validation patterns (repetitive checks)
- Dispatcher patterns (routing logic)
- Builder patterns (fluent APIs)
- Configuration parsers (key-value processing)
Adjustment Strategies:
- Reduce false positives for recognized patterns
- Apply appropriate thresholds by pattern type
- Consider pattern confidence in scoring
Example:
// Recognized as "validation_pattern"
// Complexity dampening applied
fn validate_user_input(input: &UserInput) -> Result<()> {
    if input.name.is_empty() { return Err(Error::EmptyName); }
    if input.email.is_empty() { return Err(Error::EmptyEmail); }
    if input.age < 13 { return Err(Error::TooYoung); }
    // ... more similar validations
    Ok(())
}
Coverage Integration
Debtmap parses LCOV coverage data for risk analysis:
LCOV Support:
- Standard format from most coverage tools
- Line-level coverage tracking
- Function-level aggregation
Coverage Index:
- O(1) exact name lookups (~0.5μs)
- O(log n) line-based fallback (~5-8μs)
- ~200 bytes per function
- Thread-safe (Arc<CoverageIndex>)
Performance Characteristics
Index Build Performance:
- Index construction: O(n), approximately 20-30ms for 5,000 functions
- Memory usage: ~200 bytes per record (~1 MB for 5,000 functions, before index overhead)
- Scales linearly with function count
Lookup Performance:
- Exact match (function name): O(1) average, ~0.5μs per lookup
- Line-based fallback: O(log n), ~5-8μs per lookup
- Cache-friendly data structure for hot paths
Analysis Overhead:
- Coverage integration overhead: ~2.5x baseline analysis time
- Target overhead: ≤3x (maintained through optimizations)
- Example timing: 53ms baseline → 130ms with coverage (2.45x overhead)
- Overhead includes index build + lookups + coverage propagation
When to use coverage integration:
- Skip coverage (faster iteration): For rapid development iteration or quick local checks, omit --lcov to get baseline results about 2.5x faster
- Include coverage (comprehensive analysis): Use coverage integration for final validation, sprint planning, and CI/CD gates where comprehensive risk analysis is needed
Thread Safety:
- Coverage index wrapped in Arc<CoverageIndex> for lock-free parallel access
- Multiple analyzer threads can query coverage simultaneously
- No contention on reads, suitable for parallel analysis pipelines
Memory Footprint:
Total memory = (function_count × 200 bytes) + index overhead
Examples:
- 1,000 functions: ~200 KB
- 5,000 functions: ~1 MB
- 10,000 functions: ~2 MB
Scalability:
- Tested with codebases up to 10,000 functions
- Performance remains predictable and acceptable
- Memory usage stays bounded and reasonable
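The two lookup paths and the shared read-only access can be sketched as follows. This is an assumption-laden illustration, not Debtmap's actual `CoverageIndex`: the field layout, the `insert`, `lookup`, and `into_shared` methods, and the rule "fall back to the nearest function starting at or before the line" are all hypothetical. A production index would also avoid allocating a key on each lookup.

```rust
use std::collections::{BTreeMap, HashMap};
use std::sync::Arc;

/// Hypothetical coverage index with an exact-name path and a line-based fallback.
#[derive(Default)]
pub struct CoverageIndex {
    // (file, function) -> coverage fraction; average O(1) lookup
    by_name: HashMap<(String, String), f64>,
    // file -> function start line -> coverage fraction; O(log n) lookup
    by_line: HashMap<String, BTreeMap<u32, f64>>,
}

impl CoverageIndex {
    pub fn insert(&mut self, file: &str, function: &str, start_line: u32, coverage: f64) {
        self.by_name
            .insert((file.to_string(), function.to_string()), coverage);
        self.by_line
            .entry(file.to_string())
            .or_default()
            .insert(start_line, coverage);
    }

    /// Tries the exact name first, then the nearest function that starts
    /// at or before `line` in the same file.
    pub fn lookup(&self, file: &str, function: &str, line: u32) -> Option<f64> {
        let key = (file.to_string(), function.to_string());
        if let Some(coverage) = self.by_name.get(&key) {
            return Some(*coverage);
        }
        self.by_line
            .get(file)?
            .range(..=line)
            .next_back()
            .map(|(_, coverage)| *coverage)
    }

    /// Freezes the index behind an `Arc` so threads can read it without locks.
    pub fn into_shared(self) -> Arc<CoverageIndex> {
        Arc::new(self)
    }
}
```

Once built, the index is never mutated, so cloning the `Arc` into each analyzer thread gives contention-free reads.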
Generating coverage:
# Rust
cargo tarpaulin --out lcov --output-dir target/coverage
# Python
pytest --cov --cov-report=lcov
# JavaScript/TypeScript
jest --coverage --coverageReporters=lcov
# Go (convert the Go cover profile to LCOV, for example with gcov2lcov)
go test -coverprofile=coverage.out
gcov2lcov -infile=coverage.out -outfile=coverage.lcov
Using with Debtmap:
debtmap analyze . --lcov target/coverage/lcov.info
Coverage dampening: When coverage data is provided, debt scores are dampened for well-tested code:
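final_score = base_score × (1 - coverage_percentage)
Here coverage_percentage is expressed as a fraction between 0.0 and 1.0, so fully covered code is dampened to a score of 0. As a result, well-tested complex code can rank below untested simple code.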
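A short worked example of the dampening formula follows. The function name `dampened_score` is hypothetical, and coverage is assumed to be a fraction rather than a 0 to 100 percentage.

```rust
/// Dampens a debt score by test coverage.
/// `coverage` is a fraction; values outside 0.0..=1.0 are clamped.
pub fn dampened_score(base_score: f64, coverage: f64) -> f64 {
    base_score * (1.0 - coverage.clamp(0.0, 1.0))
}
```

For instance, a complex function with a base score of 9.0 and 98% coverage ends up near 0.18, while a simple function with a base score of 3.0 and no coverage keeps its full 3.0.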
Example Outputs
High Complexity Function (Needs Refactoring)
Terminal Output:
#1 SCORE: 9.2 [CRITICAL]
├─ COMPLEXITY: ./src/payments/processor.rs:145 process_transaction()
├─ ACTION: Refactor into 4 smaller functions
├─ IMPACT: Reduce complexity from 25 to 8, improve testability
├─ COMPLEXITY: cyclomatic=25, branches=25, cognitive=38, nesting=5, lines=120
├─ DEPENDENCIES: 3 upstream, 8 downstream
└─ WHY: Exceeds all complexity thresholds, difficult to test and maintain
JSON Output:
{
"id": "complexity_src_payments_processor_rs_145",
"debt_type": "Complexity",
"priority": "Critical",
"file": "src/payments/processor.rs",
"line": 145,
"message": "Function exceeds complexity threshold",
"context": "Cyclomatic: 25, Cognitive: 38, Nesting: 5",
"function_metrics": {
"name": "process_transaction",
"cyclomatic": 25,
"cognitive": 38,
"nesting": 5,
"length": 120,
"is_pure": false,
"purity_confidence": 0.15,
"upstream_callers": ["handle_payment", "handle_subscription", "handle_refund"],
"downstream_callees": ["validate", "calculate_fees", "record_transaction", "send_receipt", "update_balance", "log_transaction", "check_fraud", "notify_user"]
}
}
Well-Tested Complex Function (Good Example)
Terminal Output:
Function: calculate_tax (WELL TESTED - Good Example!)
File: src/tax/calculator.rs:78
Complexity: Cyclomatic=18, Cognitive=22
Coverage: 98%
Risk: LOW
Why this is good:
- High complexity is necessary (tax rules are complex)
- Thoroughly tested with 45 test cases
- Clear documentation of edge cases
- Good example to follow for other complex logic
Test Gap (Needs Testing)
Terminal Output:
#2 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/analyzers/rust_call_graph.rs:38 add_function_to_graph()
├─ ACTION: Add 6 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk reduction
├─ COMPLEXITY: cyclomatic=6, branches=6, cognitive=8, nesting=2, lines=32
├─ DEPENDENCIES: 0 upstream, 11 downstream
├─ TEST EFFORT: Simple (2-3 hours)
└─ WHY: Business logic with 0% coverage, manageable complexity (cyclo=6, cog=8)
High impact - 11 functions depend on this
JSON Output:
{
"function": "add_function_to_graph",
"file": "src/analyzers/rust_call_graph.rs",
"line": 38,
"current_risk": 8.9,
"potential_risk_reduction": 3.7,
"recommendation": {
"action": "Add unit tests",
"details": "Add 6 unit tests for full coverage",
"effort_estimate": "2-3 hours"
},
"test_effort": {
"estimated_difficulty": "Simple",
"cognitive_load": 8,
"branch_count": 6,
"recommended_test_cases": 6
},
"complexity": {
"cyclomatic": 6,
"cognitive": 8,
"nesting": 2,
"length": 32
},
"dependencies": {
"upstream_callers": [],
"downstream_callees": [
"get_function_name", "extract_parameters", "parse_return_type",
"add_to_registry", "update_call_sites", "resolve_types",
"track_visibility", "record_location", "increment_counter",
"validate_signature", "log_registration"
]
},
"roi": 4.5
}
Entropy-Dampened Validation Function
Terminal Output:
Function: validate_config
File: src/config/validator.rs:23
Cyclomatic: 20 → Effective: 7 (65% dampened)
Risk: LOW
Entropy Analysis:
├─ Token Entropy: 0.28 (low variety - repetitive patterns)
├─ Pattern Repetition: 0.88 (high similarity between checks)
├─ Branch Similarity: 0.91 (consistent validation structure)
└─ Reasoning: Complexity reduced by 65% due to pattern-based code
This appears complex but is actually a repetitive validation pattern.
Lower priority for refactoring.
Before/After Refactoring Comparison
Before:
Function: process_order
Cyclomatic: 22
Cognitive: 35
Coverage: 15%
Risk Score: 52.3 (CRITICAL)
Debt Score: 50 (Critical Complexity)
After:
Function: process_order (refactored)
Cyclomatic: 5
Cognitive: 6
Coverage: 92%
Risk Score: 2.1 (LOW)
Debt Score: 0 (no debt)
Extracted functions:
- validate_order (Cyclomatic: 4, Coverage: 100%)
- calculate_totals (Cyclomatic: 3, Coverage: 95%)
- apply_discounts (Cyclomatic: 6, Coverage: 88%)
- finalize_order (Cyclomatic: 4, Coverage: 90%)
Impact:
✓ Complexity reduced by 77%
✓ Coverage improved by 513%
✓ Risk reduced by 96%
✓ Created 4 focused, testable functions
Next Steps
- Output Formats - Detailed JSON schema and integration patterns
- Configuration - Customize thresholds and analysis behavior
For questions or issues, visit GitHub Issues.
Compare Analysis
The compare command enables you to track technical debt changes over time by comparing two analysis results. This is essential for validating refactoring efforts, detecting regressions in pull requests, and monitoring project health trends.
Implementation Status
All Features Available Now:
- ✅ Target location tracking with intelligent fuzzy matching
- ✅ Detailed improvement percentage calculations (per-item)
- ✅ Multiple output formats (JSON, Markdown, Terminal)
- ✅ Implementation plan parsing for target extraction
- ✅ Four match strategies (Exact, FunctionLevel, ApproximateName, FileLevel)
- ✅ Resolved items tracking (debt eliminated)
- ✅ Improved items detection (score reduction ≥ 30%)
- ✅ New critical items detection (regressions)
- ✅ Project health metrics and trends
- ✅ CI/CD integration support
Overview
The compare command analyzes differences between “before” and “after” debtmap analyses, providing:
- Target location tracking - Monitor specific code locations through refactoring with fuzzy matching
- Validation tracking - Verify debt items are resolved or improved
- Project health metrics - Track overall debt trends across your codebase
- Regression detection - Identify new critical debt items introduced (score ≥ 60.0)
- Improvement tracking - Measure and celebrate debt reduction with detailed per-item metrics
- CI/CD integration - Automate quality gates in your pipeline
Basic Usage
Command Syntax
debtmap compare \
--before path/to/before.json \
--after path/to/after.json \
--output validation.json
Command-Line Options
| Option | Required | Description |
|---|---|---|
| --before FILE | Yes | Path to “before” analysis JSON |
| --after FILE | Yes | Path to “after” analysis JSON |
| --output FILE | No | Output file path (default: stdout) |
| --plan FILE | No | Implementation plan to extract target location |
| --target-location LOCATION | No | Manual target location (format: file:function:line) |
| --format FORMAT | No | Output format: json, markdown, or terminal (default: json) |
All comparison features are available now, including target location tracking, fuzzy matching, and multiple output formats.
Target Location Tracking
Target location tracking allows you to monitor specific code locations through refactoring changes. The compare command uses intelligent fuzzy matching to find your target even when code is moved or renamed.
Location Format
Target locations use the format: file:function:line
Examples:
- src/main.rs:complex_function:42
- lib/parser.rs:parse_expression:156
- api/handler.rs:process_request:89
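Parsing the file:function:line format is straightforward when the string is split from the right, so colons earlier in the path do not confuse the parser. The sketch below is illustrative only: `TargetLocation` and `parse_target_location` are hypothetical names, not Debtmap's own parser.

```rust
/// A parsed target location of the form `file:function:line`.
#[derive(Debug, PartialEq)]
pub struct TargetLocation {
    pub file: String,
    pub function: String,
    pub line: u32,
}

/// Parses `file:function:line`, returning `None` for malformed input.
pub fn parse_target_location(input: &str) -> Option<TargetLocation> {
    // Split from the right: the last field is the line, the one before it
    // the function, and everything else is the file path.
    let mut parts = input.rsplitn(3, ':');
    let line = parts.next()?.trim().parse::<u32>().ok()?;
    let function = parts.next()?.trim();
    let file = parts.next()?.trim();
    if file.is_empty() || function.is_empty() {
        return None;
    }
    Some(TargetLocation {
        file: file.to_string(),
        function: function.to_string(),
        line,
    })
}
```

Input such as `src/main.rs:complex_function` (no line number) is rejected rather than guessed at, which surfaces format mistakes early.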
Specifying Target Locations
Option 1: Via Implementation Plan
Create an IMPLEMENTATION_PLAN.md file with a target location section:
# Implementation Plan
## Target Item
**Location**: ./src/example.rs:complex_function:45
**Current Debt Score**: 85.5
**Severity**: critical
## Problem Analysis
The `complex_function` has high cognitive complexity...
## Proposed Solution
1. Extract nested conditionals into separate functions
2. Use early returns to reduce nesting depth
3. Add comprehensive unit tests
Then run compare with the plan:
debtmap compare --before before.json --after after.json --plan IMPLEMENTATION_PLAN.md
Option 2: Manual Target Location
Specify the target directly via command-line:
debtmap compare \
--before before.json \
--after after.json \
--target-location "src/example.rs:complex_function:45"
Matching Strategies
Debtmap uses intelligent matching to find your target item even when code changes. The matcher tries multiple strategies in order, using the most precise match available:
| Strategy | When Used | Confidence |
|---|---|---|
| Exact | file:function:line matches exactly | 1.0 |
| FunctionLevel | file:function matches (any line) | 0.8 |
| ApproximateName | Fuzzy name matching finds similar function | 0.6 |
| FileLevel | All items in file match | 0.4 |
The comparison result includes the match strategy and confidence score used, along with the count of matched items (useful when fuzzy matching finds multiple candidates).
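The cascade of strategies can be sketched as a function that tries the most precise match first and reports how many items matched. This is a simplified illustration: `Item` and `find_match` are hypothetical, and the approximate-name step uses case-insensitive substring containment as a stand-in for whatever fuzzy matching Debtmap actually performs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MatchStrategy {
    Exact,
    FunctionLevel,
    ApproximateName,
    FileLevel,
}

impl MatchStrategy {
    /// Confidence values from the table above.
    pub fn confidence(self) -> f64 {
        match self {
            MatchStrategy::Exact => 1.0,
            MatchStrategy::FunctionLevel => 0.8,
            MatchStrategy::ApproximateName => 0.6,
            MatchStrategy::FileLevel => 0.4,
        }
    }
}

pub struct Item {
    pub file: String,
    pub function: String,
    pub line: u32,
}

/// Returns the most precise strategy that matches, with the matched item count.
pub fn find_match(items: &[Item], file: &str, function: &str, line: u32) -> Option<(MatchStrategy, usize)> {
    let in_file: Vec<&Item> = items.iter().filter(|i| i.file == file).collect();
    let wanted = function.to_lowercase();
    let exact = in_file.iter().filter(|i| i.function == function && i.line == line).count();
    let by_function = in_file.iter().filter(|i| i.function == function).count();
    // Stand-in for fuzzy matching: one name contains the other.
    let approximate = in_file
        .iter()
        .filter(|i| {
            let name = i.function.to_lowercase();
            name.contains(&wanted) || wanted.contains(&name)
        })
        .count();
    if exact > 0 {
        Some((MatchStrategy::Exact, exact))
    } else if by_function > 0 {
        Some((MatchStrategy::FunctionLevel, by_function))
    } else if approximate > 0 {
        Some((MatchStrategy::ApproximateName, approximate))
    } else if !in_file.is_empty() {
        Some((MatchStrategy::FileLevel, in_file.len()))
    } else {
        None
    }
}
```

A count greater than one signals that the reported metrics are aggregated across several candidates, so the lower confidence should be taken into account when interpreting the result.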
Target Status Values
After comparing, the target item will have one of these statuses:
- Resolved - Item no longer exists in after analysis (debt eliminated!)
- Improved - Item exists but with lower debt score
- Unchanged - Item exists with similar metrics (within 5%)
- Regressed - Item exists but got worse
- NotFoundBefore - Item didn’t exist in before analysis
- NotFound - Item not found in either analysis
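The status rules above can be expressed as a single match on the before and after scores. This sketch assumes the 5% "unchanged" band is applied to the debt score alone; `TargetStatus` mirrors the status names listed above, while `classify_target` is a hypothetical helper.

```rust
#[derive(Debug, PartialEq)]
pub enum TargetStatus {
    Resolved,
    Improved,
    Unchanged,
    Regressed,
    NotFoundBefore,
    NotFound,
}

/// Classifies a target from its score in each analysis (`None` = item absent).
pub fn classify_target(before: Option<f64>, after: Option<f64>) -> TargetStatus {
    match (before, after) {
        (None, None) => TargetStatus::NotFound,
        (None, Some(_)) => TargetStatus::NotFoundBefore,
        (Some(_), None) => TargetStatus::Resolved,
        (Some(b), Some(a)) => {
            // Changes within 5% of the original score count as unchanged.
            let tolerance = 0.05 * b.abs();
            if (a - b).abs() <= tolerance {
                TargetStatus::Unchanged
            } else if a < b {
                TargetStatus::Improved
            } else {
                TargetStatus::Regressed
            }
        }
    }
}
```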
Project Health Metrics
The compare command tracks project-wide health metrics to show overall trends.
Tracked Metrics
{
"project_health": {
"before": {
"total_debt_score": 450.5,
"total_items": 25,
"critical_items": 5,
"high_priority_items": 12,
"average_score": 18.02
},
"after": {
"total_debt_score": 380.2,
"total_items": 22,
"critical_items": 3,
"high_priority_items": 10,
"average_score": 17.28
},
"changes": {
"debt_score_change": -70.3,
"debt_score_change_pct": -15.6,
"items_change": -3,
"critical_items_change": -2
}
}
}
Understanding Metrics
- total_debt_score - Sum of all debt item scores
- total_items - Total number of debt items detected
- critical_items - Items with score ≥ 60.0 (critical threshold)
- high_priority_items - Items with score ≥ 40.0 (high priority threshold)
- average_score - Mean debt score across all items
- debt_score_change - Absolute change in total debt
- debt_score_change_pct - Percentage change in total debt
Debt Trends
The comparison calculates an overall debt trend based on the percentage change:
- Improving - Debt decreased by more than 5%
- Stable - Debt changed by less than 5% (within normal variance)
- Regressing - Debt increased by more than 5%
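The trend follows directly from the percentage change in total debt. The sketch below reproduces that arithmetic with hypothetical helper names (`debt_change_pct`, `classify_trend`). A change of exactly 5% is treated as Stable here, which is an assumption because the boundary case is not specified above.

```rust
#[derive(Debug, PartialEq)]
pub enum DebtTrend {
    Improving,
    Stable,
    Regressing,
}

/// Percentage change in total debt; 0.0 when there was no debt before.
pub fn debt_change_pct(before_total: f64, after_total: f64) -> f64 {
    if before_total == 0.0 {
        0.0
    } else {
        (after_total - before_total) / before_total * 100.0
    }
}

/// Applies the 5% bands described above.
pub fn classify_trend(change_pct: f64) -> DebtTrend {
    if change_pct < -5.0 {
        DebtTrend::Improving
    } else if change_pct > 5.0 {
        DebtTrend::Regressing
    } else {
        DebtTrend::Stable
    }
}
```

With the tracked metrics shown earlier, (380.2 - 450.5) / 450.5 × 100 is about -15.6, so the trend is Improving.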
Regression Detection
Regressions are new critical debt items (score ≥ 60.0) that appear in the after analysis.
What Counts as a Regression
A regression is detected when:
- An item exists in the after analysis
- The item does NOT exist in the before analysis
- The item has a debt score ≥ 60.0 (critical severity threshold)
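These three conditions amount to a set difference with a score filter. The following sketch assumes items are identified by their exact location string, which is a simplification (the real matcher may tolerate line shifts); `DebtItem` and `find_regressions` are hypothetical names.

```rust
use std::collections::HashSet;

pub struct DebtItem {
    pub location: String,
    pub score: f64,
}

/// Items that are new in `after` and at or above the critical threshold (60.0).
pub fn find_regressions<'a>(before: &[DebtItem], after: &'a [DebtItem]) -> Vec<&'a DebtItem> {
    // Locations already present in the "before" analysis.
    let known: HashSet<&str> = before.iter().map(|item| item.location.as_str()).collect();
    after
        .iter()
        .filter(|item| item.score >= 60.0 && !known.contains(item.location.as_str()))
        .collect()
}
```

Note that an existing item whose score rises above 60.0 is not returned here, because it already existed in the before analysis.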
Regression Output
The compare command returns a ComparisonResult with detailed regression information:
{
"regressions": [
{
"location": "src/new_feature.rs:process_data:23",
"score": 65.5,
"debt_type": "high_complexity",
"description": "Function has cyclomatic complexity of 12 and cognitive complexity of 15"
}
]
}
Each regression item includes:
- location - Full path with function and line number
- score - Debt score (≥ 60.0 for regressions)
- debt_type - Type of debt detected (e.g., “high_complexity”, “god_object”)
- description - Human-readable explanation of the issue
Using Regressions in CI/CD
Fail your CI build if regressions are detected:
# Run comparison
debtmap compare --before before.json --after after.json --output result.json
# Check for regressions
REGRESSION_COUNT=$(jq '.regressions | length' result.json)
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ Regression detected - $REGRESSION_COUNT new critical debt items found"
jq '.regressions[]' result.json
exit 1
fi
# Check overall debt trend
TREND=$(jq -r '.summary.overall_debt_trend' result.json)
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ Warning: Overall debt is increasing"
fi
Improvement Tracking
The compare command tracks improvements as a list of ImprovementItem objects with detailed before/after metrics.
Improvement Types
The improvement_type field indicates the kind of improvement:
- Resolved - Debt item completely eliminated (no longer in after analysis)
- ScoreReduced - Overall debt score reduced significantly (≥ 30% reduction)
- ComplexityReduced - Cyclomatic or cognitive complexity decreased
- CoverageImproved - Test coverage increased
Improvement Items Structure
{
"improvements": [
{
"location": "src/example.rs:complex_function:45",
"before_score": 68.5,
"after_score": 35.2,
"improvement_type": "ScoreReduced"
},
{
"location": "src/legacy.rs:old_code:120",
"before_score": 72.0,
"after_score": null,
"improvement_type": "Resolved"
},
{
"location": "src/utils.rs:helper_function:88",
"before_score": 45.0,
"after_score": 28.0,
"improvement_type": "ComplexityReduced"
}
]
}
Each improvement item includes:
- location - Full path with function and line number
- before_score - Original debt score
- after_score - New debt score (null if resolved)
- improvement_type - Type of improvement achieved
Before/After Metrics
When you specify a target location (via --plan or --target-location), the compare command provides detailed before/after metrics for that specific code location.
Target Item Comparison
{
"target_item": {
"location": "src/example.rs:complex_function:45",
"match_strategy": "Exact",
"match_confidence": 1.0,
"matched_items_count": 1,
"before": {
"score": 68.5,
"cyclomatic_complexity": 8,
"cognitive_complexity": 15,
"coverage": 45.0,
"function_length": 120,
"nesting_depth": 4
},
"after": {
"score": 35.1,
"cyclomatic_complexity": 3,
"cognitive_complexity": 5,
"coverage": 85.0,
"function_length": 45,
"nesting_depth": 2
},
"improvements": {
"score_reduction_pct": 48.8,
"complexity_reduction_pct": 66.7,
"coverage_improvement_pct": 88.9
},
"status": "Improved"
}
}
Target Metrics Fields
Each TargetMetrics object (before/after) includes:
- score - Unified debt score
- cyclomatic_complexity - Cyclomatic complexity metric
- cognitive_complexity - Cognitive complexity metric
- coverage - Test coverage percentage
- function_length - Lines of code in function
- nesting_depth - Maximum nesting depth
Improvement Percentages
The improvements object provides percentage improvements:
- score_reduction_pct - Percentage reduction in overall debt score
- complexity_reduction_pct - Reduction in cyclomatic/cognitive complexity
- coverage_improvement_pct - Increase in test coverage
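The percentages in the example output can be reproduced with two small formulas. The helper names below are hypothetical. Note that the sample's complexity_reduction_pct of 66.7 matches the cognitive complexity change (15 to 5); which metric Debtmap uses for that field is not stated here.

```rust
/// Percentage decrease from `before` to `after`; 0.0 when `before` is zero.
pub fn reduction_pct(before: f64, after: f64) -> f64 {
    if before == 0.0 {
        0.0
    } else {
        (before - after) / before * 100.0
    }
}

/// Percentage increase from `before` to `after`; 0.0 when `before` is zero.
pub fn improvement_pct(before: f64, after: f64) -> f64 {
    if before == 0.0 {
        0.0
    } else {
        (after - before) / before * 100.0
    }
}

/// Rounds to one decimal place, as in the JSON output.
pub fn round1(value: f64) -> f64 {
    (value * 10.0).round() / 10.0
}
```

Applied to the target item above: the score drop from 68.5 to 35.1 is 48.8%, the cognitive complexity drop from 15 to 5 is 66.7%, and the coverage rise from 45.0 to 85.0 is 88.9%.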
Metric Aggregation
When multiple items match the target location (due to fuzzy matching), metrics are aggregated:
- score - Average across matched items
- cyclomatic_complexity - Average
- cognitive_complexity - Average
- coverage - Average
- function_length - Average
- nesting_depth - Maximum (worst case)
The matched_items_count field tells you how many items were aggregated.
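The aggregation rules can be sketched as an average over most fields with a maximum for nesting depth. This is an illustrative version only: the struct reuses the TargetMetrics field names but stores the averaged fields as `f64`, and `aggregate` and `mean` are hypothetical helpers.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TargetMetrics {
    pub score: f64,
    pub cyclomatic_complexity: f64,
    pub cognitive_complexity: f64,
    pub coverage: f64,
    pub function_length: f64,
    pub nesting_depth: u32,
}

/// Mean of one field across all matched items (caller guarantees non-empty).
fn mean(items: &[TargetMetrics], field: impl Fn(&TargetMetrics) -> f64) -> f64 {
    items.iter().map(|m| field(m)).sum::<f64>() / items.len() as f64
}

/// Averages every metric except nesting depth, which takes the worst case.
/// Returns `None` when nothing matched.
pub fn aggregate(items: &[TargetMetrics]) -> Option<TargetMetrics> {
    if items.is_empty() {
        return None;
    }
    Some(TargetMetrics {
        score: mean(items, |m| m.score),
        cyclomatic_complexity: mean(items, |m| m.cyclomatic_complexity),
        cognitive_complexity: mean(items, |m| m.cognitive_complexity),
        coverage: mean(items, |m| m.coverage),
        function_length: mean(items, |m| m.function_length),
        nesting_depth: items.iter().map(|m| m.nesting_depth).max().unwrap_or(0),
    })
}
```

Taking the maximum for nesting depth keeps a single deeply nested function from being hidden by shallower neighbors in the same match set.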
Validating Refactoring Success
Use the comparison output to verify your refactoring:
# Check target status
STATUS=$(jq -r '.target_item.status' result.json)
SCORE_REDUCTION=$(jq '.target_item.improvements.score_reduction_pct' result.json)
echo "Target Status: $STATUS"
echo "Score Reduction: ${SCORE_REDUCTION}%"
# Check for improvements
IMPROVEMENT_COUNT=$(jq '.improvements | length' result.json)
echo "Improvements: $IMPROVEMENT_COUNT items"
# Verify no regressions
REGRESSION_COUNT=$(jq '.regressions | length' result.json)
if [ "$REGRESSION_COUNT" -eq 0 ]; then
echo "✅ No regressions detected!"
else
echo "⚠️ $REGRESSION_COUNT new critical items"
fi
Output Formats
JSON Format
The default JSON format provides complete comparison results:
debtmap compare --before before.json --after after.json --output result.json
The ComparisonResult JSON output includes:
- metadata - Comparison metadata (date, file paths, target location)
- target_item - Target item comparison with before/after metrics (if specified)
- project_health - Project-wide health metrics comparison
- regressions - List of new critical items
- improvements - List of improved/resolved items
- summary - Summary statistics and overall debt trend
Example output:
{
"metadata": {
"comparison_date": "2024-01-15T10:30:00Z",
"before_file": "before.json",
"after_file": "after.json",
"target_location": "src/example.rs:complex_function:45"
},
"target_item": {
"status": "Improved",
"improvements": {
"score_reduction_pct": 48.8
}
},
"summary": {
"target_improved": true,
"new_critical_count": 0,
"resolved_count": 3,
"overall_debt_trend": "Improving"
}
}
Markdown Format
Generate human-readable markdown reports for pull request comments:
debtmap compare --before before.json --after after.json --format markdown
The markdown output is suitable for:
- Pull request comments
- Documentation
- Email reports
- Team dashboards
Terminal Format
Display colorized output directly in the terminal:
debtmap compare --before before.json --after after.json --format terminal
The terminal format provides:
- Color-coded status indicators
- Formatted tables for metrics
- Human-readable summaries
- Easy scanning of results
CI/CD Integration
GitHub Actions Example
name: Technical Debt Check
on: [pull_request]
jobs:
debt-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0 # Need history for before/after
- name: Install debtmap
run: cargo install debtmap
- name: Analyze main branch
run: |
git checkout main
debtmap analyze --output before.json
- name: Analyze PR branch
run: |
git checkout ${{ github.head_ref }}
debtmap analyze --output after.json
- name: Compare analyses
run: |
debtmap compare \
--before before.json \
--after after.json \
--output comparison.json
- name: Check comparison result
run: |
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
IMPROVEMENT_COUNT=$(jq '.improvements | length' comparison.json)
echo "Debt Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
echo "Improvements: $IMPROVEMENT_COUNT"
# Fail on regression
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ Regression detected"
jq '.regressions[]' comparison.json
exit 1
fi
# Warn if debt is increasing
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ Warning: Overall debt is increasing"
fi
- name: Post comparison to PR
uses: actions/github-script@v6
with:
script: |
const fs = require('fs');
const comparison = JSON.parse(fs.readFileSync('comparison.json', 'utf8'));
const body = `## Technical Debt Comparison
**Overall Trend:** ${comparison.summary.overall_debt_trend}
**Regressions:** ${comparison.summary.new_critical_count}
**Resolved:** ${comparison.summary.resolved_count}
${comparison.improvements.length > 0 ? `
### Improvements
${comparison.improvements.map(i => `- ${i.location}: ${i.before_score.toFixed(1)} → ${i.after_score != null ? i.after_score.toFixed(1) : 'resolved'}`).join('\n')}
` : ''}
${comparison.regressions.length > 0 ? `
### ⚠️ Regressions
${comparison.regressions.map(r => `- ${r.location}: ${r.score.toFixed(1)} (${r.debt_type})`).join('\n')}
` : ''}`;
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: body
});
GitLab CI Example
debt_check:
stage: test
script:
# Analyze main branch
- git fetch origin main
- git checkout origin/main
- debtmap analyze --output before.json
# Analyze current branch
- git checkout $CI_COMMIT_SHA
- debtmap analyze --output after.json
# Compare and check status
- debtmap compare --before before.json --after after.json --output comparison.json
- |
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
echo "Debt Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "Failed: Regression detected"
jq '.regressions[]' comparison.json
exit 1
fi
artifacts:
paths:
- before.json
- after.json
- comparison.json
expire_in: 1 week
Best Practices for CI/CD
- Store analyses as artifacts - Keep before/after JSON for debugging
- Check status field - Use status to determine pass/fail
- Track completion percentage - Monitor progress toward debt resolution
- Review improvements - Celebrate and document successful refactorings
- Act on remaining issues - Create follow-up tasks for unresolved items
- Set completion thresholds - Require minimum completion percentage for merges
Practical Examples
Example 1: Basic Comparison
Compare two analyses to track debt changes:
# Run before analysis
debtmap analyze --output before.json
# Make changes to codebase...
# Run after analysis
debtmap analyze --output after.json
# Compare
debtmap compare --before before.json --after after.json --output comparison.json
# Check results
jq '.' comparison.json
# Output shows: target_item, project_health, regressions, improvements, summary
Example 2: Validating Function Refactoring
Validate your refactoring work with target location tracking:
# Run before analysis
debtmap analyze --output before.json
# Identify critical items to fix
jq '.items[] | select(.unified_score.final_score >= 60.0)' before.json
# Refactor the high-priority functions...
# Run after analysis
debtmap analyze --output after.json
# Compare and validate with target location
debtmap compare \
--before before.json \
--after after.json \
--target-location "src/example.rs:complex_function:45" \
--output comparison.json
# Check target status
STATUS=$(jq -r '.target_item.status' comparison.json)
SCORE_REDUCTION=$(jq '.target_item.improvements.score_reduction_pct' comparison.json)
echo "Target Status: $STATUS"
echo "Score Reduction: ${SCORE_REDUCTION}%"
# Review all improvements
jq '.improvements[]' comparison.json
Example 3: Detecting PR Regressions
Check if a pull request introduces new critical debt:
# Analyze base branch
git checkout main
debtmap analyze --output main.json
# Analyze PR branch
git checkout feature/new-feature
debtmap analyze --output feature.json
# Compare
debtmap compare \
--before main.json \
--after feature.json \
--output comparison.json
# Check for regressions
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
echo "Regressions: $REGRESSION_COUNT"
echo "Debt Trend: $TREND"
# Example output structure:
jq '.' comparison.json
# {
# "summary": {
# "overall_debt_trend": "Improving", // or "Regressing"
# "new_critical_count": 0,
# "resolved_count": 3
# },
# "regressions": [],
# "improvements": [...],
# "project_health": {...}
# }
Example 4: Monitoring Project Health Over Releases
Track overall project health across releases:
# Analyze release v1.0
git checkout v1.0
debtmap analyze --output v1.0.json
# Analyze release v1.1
git checkout v1.1
debtmap analyze --output v1.1.json
# Compare
debtmap compare \
--before v1.0.json \
--after v1.1.json \
--output v1.0-to-v1.1.json
# Check project health metrics
echo "Before (v1.0):"
jq '.project_health.before' v1.0-to-v1.1.json
echo "After (v1.1):"
jq '.project_health.after' v1.0-to-v1.1.json
# Check overall trend
TREND=$(jq -r '.summary.overall_debt_trend' v1.0-to-v1.1.json)
DEBT_CHANGE=$(jq '.project_health.changes.debt_score_change_pct' v1.0-to-v1.1.json)
echo "Debt Trend: $TREND"
echo "Debt Score Change: ${DEBT_CHANGE}%"
Example 5: Full CI/CD Workflow
Complete workflow for continuous debt monitoring:
#!/bin/bash
# ci-debt-check.sh
set -e
BEFORE="before.json"
AFTER="after.json"
COMPARISON="comparison.json"
# Step 1: Analyze baseline (main branch)
echo "📊 Analyzing baseline..."
git checkout main
debtmap analyze --output "$BEFORE"
# Step 2: Analyze current branch
echo "📊 Analyzing current branch..."
git checkout -
debtmap analyze --output "$AFTER"
# Step 3: Run comparison
echo "🔍 Running comparison..."
debtmap compare \
--before "$BEFORE" \
--after "$AFTER" \
--output "$COMPARISON"
# Step 4: Extract metrics
TREND=$(jq -r '.summary.overall_debt_trend' "$COMPARISON")
REGRESSION_COUNT=$(jq '.regressions | length' "$COMPARISON")
IMPROVEMENT_COUNT=$(jq '.improvements | length' "$COMPARISON")
RESOLVED_COUNT=$(jq '.summary.resolved_count' "$COMPARISON")
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "📈 Debt Comparison Results"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Trend: $TREND"
echo "Regressions: $REGRESSION_COUNT"
echo "Improvements: $IMPROVEMENT_COUNT"
echo "Resolved: $RESOLVED_COUNT"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
# Step 5: Quality gate
if [ "$REGRESSION_COUNT" -gt 0 ]; then
echo "❌ FAILED: Regression detected"
jq '.regressions[]' "$COMPARISON"
exit 1
fi
if [ "$TREND" = "Regressing" ]; then
echo "⚠️ WARNING: Overall debt is increasing"
# Don't fail, just warn
fi
if [ "$RESOLVED_COUNT" -gt 0 ]; then
echo "🎉 SUCCESS: $RESOLVED_COUNT debt items resolved!"
fi
echo "✅ PASSED: No regressions detected"
Example 6: Interpreting Comparison Results
Understanding the comparison output:
# Run comparison
debtmap compare --before before.json --after after.json --output comparison.json
# Check debt trend
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
REGRESSION_COUNT=$(jq '.regressions | length' comparison.json)
IMPROVEMENT_COUNT=$(jq '.improvements | length' comparison.json)
case "$TREND" in
"Improving")
echo "🎉 Success! Debt is decreasing"
echo "Improvements: $IMPROVEMENT_COUNT"
jq '.improvements[] | "\(.location): \(.improvement_type)"' comparison.json
;;
"Stable")
echo "➡️ Stable - no significant debt change"
echo "Improvements: $IMPROVEMENT_COUNT"
echo "Regressions: $REGRESSION_COUNT"
;;
"Regressing")
echo "❌ Warning! Debt is increasing"
echo "New critical items: $REGRESSION_COUNT"
jq '.regressions[] | "\(.location): \(.score) (\(.debt_type))"' comparison.json
;;
esac
# Check if target improved (if target was specified)
if jq -e '.target_item' comparison.json > /dev/null; then
TARGET_STATUS=$(jq -r '.target_item.status' comparison.json)
echo "Target Status: $TARGET_STATUS"
fi
Troubleshooting
Understanding Debt Trends
Issue: Confused about what the debt trend means
Solution: Check the summary.overall_debt_trend field in comparison output:
- Improving - Total debt decreased by more than 5%
- Stable - Total debt changed by less than 5% (within normal variance)
- Regressing - Total debt increased by more than 5%
Check the trend:
TREND=$(jq -r '.summary.overall_debt_trend' comparison.json)
DEBT_CHANGE=$(jq '.project_health.changes.debt_score_change_pct' comparison.json)
echo "Debt Trend: $TREND (${DEBT_CHANGE}% change)"
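The 5% thresholds described above can be sketched in Rust as follows. This is a minimal illustration, not Debtmap's actual implementation: the `DebtTrend` enum and `classify_trend` function are hypothetical names, a negative percentage is assumed to mean that debt decreased, and a change of exactly 5% is assumed to count as Stable.

```rust
/// Mirrors the three values of `summary.overall_debt_trend` (hypothetical type).
#[derive(Debug, PartialEq)]
enum DebtTrend {
    Improving,
    Stable,
    Regressing,
}

/// Classify a percentage change in total debt.
/// Assumptions: negative values mean debt went down, and a change of
/// exactly +/-5% is treated as Stable.
fn classify_trend(debt_change_pct: f64) -> DebtTrend {
    const THRESHOLD_PCT: f64 = 5.0;
    if debt_change_pct < -THRESHOLD_PCT {
        DebtTrend::Improving
    } else if debt_change_pct > THRESHOLD_PCT {
        DebtTrend::Regressing
    } else {
        DebtTrend::Stable
    }
}
```

Under these assumptions, a change of -12.5% classifies as Improving, while +2.0% stays Stable.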
No Improvements Detected
Issue: Made changes but comparison shows no improvements
Possible causes:
- Changes didn’t reduce debt scores by ≥30% (improvement threshold)
- Refactored items had scores <60.0 (not tracked as critical)
- Changes were neutral (e.g., code moved but complexity unchanged)
Solution: Check the details:
# Compare before/after project health
jq '.project_health.before' comparison.json
jq '.project_health.after' comparison.json
# Look for critical items in before analysis
jq '.items[] | select(.unified_score.final_score >= 60.0)' before.json
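To make the two thresholds listed above concrete, here is a small Rust sketch of how they might combine. The function names are hypothetical, and the way the two conditions are joined (an item must start at a critical score and drop by at least 30%) is an assumption drawn from the list of possible causes, not a statement of how Debtmap is implemented.

```rust
const IMPROVEMENT_THRESHOLD: f64 = 0.30; // at least a 30% score reduction
const CRITICAL_SCORE: f64 = 60.0; // items below this are not tracked as critical

/// True when a score is high enough to be tracked as critical.
fn is_critical(score: f64) -> bool {
    score >= CRITICAL_SCORE
}

/// Fraction by which the score dropped; 0.0 when there was nothing to reduce.
fn relative_reduction(before: f64, after: f64) -> f64 {
    if before <= 0.0 {
        0.0
    } else {
        (before - after) / before
    }
}

/// Assumed rule: the item was critical before and improved by the threshold.
fn counts_as_improvement(before: f64, after: f64) -> bool {
    is_critical(before) && relative_reduction(before, after) >= IMPROVEMENT_THRESHOLD
}
```

With this sketch, a drop from 80.0 to 50.0 (37.5%) counts as an improvement, while 80.0 to 60.0 (25%) or 50.0 to 20.0 (never critical) does not.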
JSON Parsing Errors
Problem: Error parsing JSON file
Solutions:
- Verify the file is valid JSON:
jq . before.json
- Ensure the file is a debtmap analysis output
- Check file permissions and path
- Regenerate the analysis if corrupted
Understanding Target Status Values
| Status | Meaning | Action Required |
|---|---|---|
| Resolved | Item eliminated completely | ✅ Celebrate! Item no longer exists |
| Improved | Score reduced significantly | ✅ Good progress, verify metrics improved |
| Unchanged | No significant change | ⚠️ Review approach, may need different strategy |
| Regressed | Item got worse | ❌ Investigate and fix before merging |
| NotFoundBefore | Item didn’t exist before | ℹ️ New code, ensure quality is acceptable |
| NotFound | Item not found in either | ⚠️ Check target location format |
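If you consume `target_item.status` from your own tooling, the table above maps naturally onto an enum. The sketch below is illustrative only: `TargetStatus`, `parse`, and `blocks_merge` are hypothetical names, and the policy that only Regressed blocks a merge is an assumption based on the "fix before merging" guidance in the table.

```rust
/// The six status values from the table (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum TargetStatus {
    Resolved,
    Improved,
    Unchanged,
    Regressed,
    NotFoundBefore,
    NotFound,
}

impl TargetStatus {
    /// Parse the string found in the comparison JSON; unknown values yield None.
    fn parse(s: &str) -> Option<Self> {
        match s {
            "Resolved" => Some(Self::Resolved),
            "Improved" => Some(Self::Improved),
            "Unchanged" => Some(Self::Unchanged),
            "Regressed" => Some(Self::Regressed),
            "NotFoundBefore" => Some(Self::NotFoundBefore),
            "NotFound" => Some(Self::NotFound),
            _ => None,
        }
    }

    /// Assumed policy: only a regression should stop a merge.
    fn blocks_merge(self) -> bool {
        matches!(self, Self::Regressed)
    }
}
```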
Handling Missing Files
Problem: No such file or directory
Solutions:
# Verify files exist
ls -la before.json after.json
# Check current directory
pwd
# Use absolute paths if needed
debtmap compare \
--before /absolute/path/to/before.json \
--after /absolute/path/to/after.json
Interpreting Edge Cases
All Items Resolved:
{
"summary": {
"resolved_count": 25,
"new_critical_count": 0,
"overall_debt_trend": "Improving"
},
"project_health": {
"after": {
"total_items": 0,
"critical_items": 0
}
}
}
All debt items resolved - excellent work!
New Project (Empty Before):
{
"summary": {
"new_critical_count": 15,
"resolved_count": 0,
"overall_debt_trend": "Stable"
},
"project_health": {
"before": {
"total_items": 0
}
}
}
New project or first analysis - establish baseline for future comparisons.
No Changes:
{
"summary": {
"overall_debt_trend": "Stable",
"new_critical_count": 0,
"resolved_count": 0
},
"improvements": [],
"regressions": []
}
No changes detected - either no code changes or changes were neutral to debt.
Related Documentation
- Validation Command - Validate implementation plans match analysis
- Prodigy Integration - Automated refactoring workflows
- Output Formats - Understanding analysis JSON structure
- Scoring Strategies - How debt scores are calculated
- CI/CD Integration - Advanced pipeline configurations
Summary
The compare command provides validation for refactoring efforts:
Current Capabilities:
- ✅ Target location tracking with intelligent fuzzy matching
- ✅ Detect regressions (new critical items with score ≥ 60.0)
- ✅ Track resolved items and improvements (≥30% score reduction)
- ✅ Detailed per-item improvement metrics with before/after scores
- ✅ Multiple output formats (JSON, Markdown, Terminal)
- ✅ Implementation plan parsing for target extraction
- ✅ Project-wide health metrics and debt trends
- ✅ Automate quality gates in CI/CD pipelines
Use the compare command regularly to maintain visibility into your codebase’s technical health and ensure continuous improvement. All features are fully implemented and ready for production use.
Configuration
Debtmap is highly configurable through a .debtmap.toml file. This chapter explains how to customize Debtmap’s behavior for your project’s specific needs.
Config Files
Debtmap uses TOML format for configuration files (.debtmap.toml). TOML provides a clear, readable syntax well-suited for configuration.
Creating a Configuration File
Debtmap looks for a .debtmap.toml file in the current directory and up to 10 parent directories. To create an initial configuration:
debtmap init
This command creates a .debtmap.toml file with sensible defaults.
Configuration File Discovery
When you run debtmap, it searches for .debtmap.toml starting in your current directory and traversing up to 10 parent directories. The first configuration file found is used.
If no configuration file is found, Debtmap uses built-in defaults that work well for most projects.
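The discovery rule described above can be sketched as a walk over the starting directory and its ancestors. This is a hedged illustration rather than Debtmap's own code: `find_config` is a hypothetical helper, and it assumes that "up to 10 parent directories" means the starting directory plus at most 10 ancestors. The existence check is passed in as a closure so the logic can be exercised without touching the filesystem; real code would typically pass `|p| p.is_file()`.

```rust
use std::path::{Path, PathBuf};

const CONFIG_FILE_NAME: &str = ".debtmap.toml";
const MAX_PARENT_DEPTH: usize = 10;

/// Return the first `.debtmap.toml` found in `start` or in up to
/// `MAX_PARENT_DEPTH` of its parent directories, nearest first.
fn find_config<F>(start: &Path, exists: F) -> Option<PathBuf>
where
    F: Fn(&Path) -> bool,
{
    start
        .ancestors() // yields `start` itself, then each parent in turn
        .take(MAX_PARENT_DEPTH + 1) // the start directory plus 10 parents
        .map(|dir| dir.join(CONFIG_FILE_NAME))
        .find(|candidate| exists(candidate.as_path()))
}
```

A `None` result corresponds to the fallback case, where built-in defaults apply.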
Basic Example
Here’s a minimal .debtmap.toml configuration:
[scoring]
coverage = 0.50 # 50% weight for test coverage gaps
complexity = 0.35 # 35% weight for code complexity
dependency = 0.15 # 15% weight for dependency criticality
[thresholds]
complexity = 10
max_file_length = 500
max_function_length = 50
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
Scoring Configuration
Scoring Weights
The [scoring] section controls how different factors contribute to the overall debt score. Debtmap uses a weighted sum model where weights must sum to 1.0.
[scoring]
coverage = 0.50 # Weight for test coverage gaps (default: 0.50)
complexity = 0.35 # Weight for code complexity (default: 0.35)
dependency = 0.15 # Weight for dependency criticality (default: 0.15)
Active weights (used in scoring):
- coverage - Prioritizes untested code (default: 0.50)
- complexity - Identifies complex areas (default: 0.35)
- dependency - Considers impact radius (default: 0.15)
Unused weights (reserved for future features):
- semantic - Not currently used (default: 0.00)
- security - Not currently used (default: 0.00)
- organization - Not currently used (default: 0.00)
Validation rules:
- All weights must be between 0.0 and 1.0
- Active weights (coverage + complexity + dependency) must sum to 1.0 (±0.001 tolerance)
- If weights don’t sum to 1.0, they will be automatically normalized
Example - Prioritize complexity over coverage:
[scoring]
coverage = 0.30
complexity = 0.55
dependency = 0.15
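The weighted sum model described in this section can be illustrated with a short Rust sketch. The `ScoringWeights` and `Factors` types are hypothetical, and the factor values are assumed to already be on a common scale; how Debtmap derives each factor is not shown here.

```rust
/// The three active weights from the [scoring] section.
struct ScoringWeights {
    coverage: f64,
    complexity: f64,
    dependency: f64,
}

impl Default for ScoringWeights {
    fn default() -> Self {
        Self { coverage: 0.50, complexity: 0.35, dependency: 0.15 }
    }
}

/// Hypothetical per-function factor values, assumed to share one scale.
struct Factors {
    coverage_gap: f64,
    complexity: f64,
    dependency: f64,
}

/// Weighted sum of the three factors.
fn weighted_score(w: &ScoringWeights, f: &Factors) -> f64 {
    w.coverage * f.coverage_gap + w.complexity * f.complexity + w.dependency * f.dependency
}
```

For a well-tested but complex function with factors (2.0, 8.0, 4.0), the default weights give 4.4, while the complexity-first weights from the example above give 5.6, so the function moves up the list.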
Role Multipliers
Role multipliers adjust complexity scores based on a function’s semantic role:
[role_multipliers]
pure_logic = 1.2 # Prioritize pure computation (default: 1.2)
orchestrator = 0.8 # Reduce for delegation functions (default: 0.8)
io_wrapper = 0.7 # Reduce for I/O wrappers (default: 0.7)
entry_point = 0.9 # Slight reduction for main/CLI (default: 0.9)
pattern_match = 0.6 # Reduce for pattern matching (default: 0.6)
debug = 0.3 # Debug/diagnostic functions (default: 0.3)
unknown = 1.0 # No adjustment (default: 1.0)
These multipliers help reduce false positives by recognizing that different function types have naturally different complexity levels. The debug role has the lowest multiplier (0.3) since debug and diagnostic functions typically have low testing priority.
Role-Based Scoring Configuration
Debtmap uses a two-stage role adjustment mechanism to score functions according to their architectural role and testing strategy. This section explains how to configure both stages.
Stage 1: Role Coverage Weights
The first stage adjusts how much coverage gaps penalize different function types. This recognizes that not all functions need the same level of unit test coverage.
Configuration (.debtmap.toml under [scoring.role_coverage_weights]):
[scoring.role_coverage_weights]
entry_point = 0.6 # Reduce coverage penalty (often integration tested)
orchestrator = 0.8 # Reduce coverage penalty (tested via higher-level tests)
pure_logic = 1.0 # Pure logic should have unit tests, no reduction (default: 1.0)
io_wrapper = 0.5 # I/O wrappers are integration tested (default: 0.5)
pattern_match = 1.0 # Standard penalty
debug = 0.3 # Debug functions have lowest coverage expectations (default: 0.3)
unknown = 1.0 # Standard penalty (default behavior)
Rationale:
| Function Role | Weight | Why This Value? |
|---|---|---|
| Entry Point | 0.6 | CLI handlers, HTTP routes, main functions are integration tested, not unit tested |
| Orchestrator | 0.8 | Coordination functions tested via higher-level tests |
| Pure Logic | 1.0 | Core business logic should have unit tests (default: 1.0) |
| I/O Wrapper | 0.5 | File/network operations tested via integration tests (default: 0.5) |
| Pattern Match | 1.0 | Standard coverage expectations |
| Debug | 0.3 | Debug/diagnostic functions have lowest testing priority (default: 0.3) |
| Unknown | 1.0 | Default when role cannot be determined |
Example Impact:
# Emphasize pure logic testing strongly
[scoring.role_coverage_weights]
pure_logic = 1.5 # 50% higher penalty for untested logic
entry_point = 0.5 # 50% lower penalty for untested entry points
io_wrapper = 0.4 # 60% lower penalty for untested I/O
# Conservative approach (smaller adjustments)
[scoring.role_coverage_weights]
pure_logic = 1.1 # Only 10% increase
entry_point = 0.9 # Only 10% decrease
How It Works:
When a function has 0% coverage:
- Entry Point (weight 0.6): Gets 60% penalty instead of 100% penalty
- Pure Logic (weight 1.0): Gets 100% penalty (standard emphasis on testing)
- I/O Wrapper (weight 0.5): Gets 50% penalty
This prevents entry points from dominating the priority list due to low unit test coverage while emphasizing the importance of testing pure business logic.
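As a rough sketch of Stage 1, the role weight can be read as a scale factor on the coverage gap. The `FunctionRole` enum and both functions below are hypothetical names, the weights are the defaults listed in this section, and treating the penalty as `gap × weight` is an assumption consistent with the 0% coverage figures above.

```rust
/// Function roles from the configuration table (hypothetical type).
#[derive(Clone, Copy)]
enum FunctionRole {
    EntryPoint,
    Orchestrator,
    PureLogic,
    IoWrapper,
    PatternMatch,
    Debug,
    Unknown,
}

/// Default values from [scoring.role_coverage_weights].
fn coverage_weight(role: FunctionRole) -> f64 {
    match role {
        FunctionRole::EntryPoint => 0.6,
        FunctionRole::Orchestrator => 0.8,
        FunctionRole::PureLogic => 1.0,
        FunctionRole::IoWrapper => 0.5,
        FunctionRole::PatternMatch => 1.0,
        FunctionRole::Debug => 0.3,
        FunctionRole::Unknown => 1.0,
    }
}

/// Coverage penalty for a function; `coverage` is a fraction in [0.0, 1.0].
/// Assumption: the penalty is the uncovered fraction scaled by the role weight.
fn coverage_penalty(role: FunctionRole, coverage: f64) -> f64 {
    let gap = 1.0 - coverage.clamp(0.0, 1.0);
    gap * coverage_weight(role)
}
```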
Stage 2: Role Multiplier with Clamping
The second stage applies a final role-based multiplier to reflect architectural importance. This multiplier is clamped by default to prevent extreme score variations.
Configuration (.debtmap.toml under [scoring.role_multiplier]):
[scoring.role_multiplier]
clamp_min = 0.3 # Minimum multiplier (default: 0.3)
clamp_max = 1.8 # Maximum multiplier (default: 1.8)
enable_clamping = true # Enable clamping (default: true)
Parameters:
| Parameter | Default | Description |
|---|---|---|
| clamp_min | 0.3 | Minimum allowed multiplier - prevents functions from becoming invisible |
| clamp_max | 1.8 | Maximum allowed multiplier - prevents extreme score spikes |
| enable_clamping | true | Whether to apply clamping (disable for prototyping only) |
Clamp Range Rationale:
Default [0.3, 1.8]: Balances differentiation with stability
- Lower bound (0.3): I/O wrappers still contribute 30% of their base score
  - Prevents them from becoming invisible in the priority list
  - Ensures simple wrappers aren’t completely ignored
- Upper bound (1.8): Critical functions get at most 180% of base score
  - Prevents one complex function from dominating the entire list
  - Maintains balanced prioritization across different issues
When to Adjust Clamp Range:
# Wider range for more differentiation
[scoring.role_multiplier]
clamp_min = 0.2 # Allow more reduction
clamp_max = 2.5 # Allow more emphasis
# Narrower range for more stability
[scoring.role_multiplier]
clamp_min = 0.5 # Less reduction
clamp_max = 1.5 # Less emphasis
# Disable clamping (not recommended for production)
[scoring.role_multiplier]
enable_clamping = false # Allow unclamped multipliers
# Warning: May cause unstable prioritization
When to Disable Clamping:
- Prototyping: Testing extreme multiplier values for custom scoring strategies
- Special cases: Very specific project needs requiring wide multiplier ranges
- Not recommended for production use as it can lead to unstable prioritization
Example Impact:
Without clamping:
Function: critical_business_logic (Pure Logic)
Base Score: 45.0
Role Multiplier: 2.5 (unclamped)
Final Score: 112.5 (dominates entire list)
With clamping (default):
Function: critical_business_logic (Pure Logic)
Base Score: 45.0
Role Multiplier: 1.8 (clamped from 2.5)
Final Score: 81.0 (high priority, but balanced)
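The clamping arithmetic in the example above can be reproduced with a few lines of Rust. `RoleMultiplierConfig`, `effective_multiplier`, and `final_score` are hypothetical names that mirror the three settings in `[scoring.role_multiplier]`; this is a sketch of the arithmetic, not Debtmap's source.

```rust
/// Mirrors the [scoring.role_multiplier] settings (hypothetical type).
struct RoleMultiplierConfig {
    clamp_min: f64,
    clamp_max: f64,
    enable_clamping: bool,
}

impl Default for RoleMultiplierConfig {
    fn default() -> Self {
        Self { clamp_min: 0.3, clamp_max: 1.8, enable_clamping: true }
    }
}

/// Apply the clamp when enabled.
/// Note: `f64::clamp` panics if `clamp_min > clamp_max` or either bound is NaN.
fn effective_multiplier(raw: f64, cfg: &RoleMultiplierConfig) -> f64 {
    if cfg.enable_clamping {
        raw.clamp(cfg.clamp_min, cfg.clamp_max)
    } else {
        raw
    }
}

/// Final score is the base score scaled by the (possibly clamped) multiplier.
fn final_score(base: f64, raw_multiplier: f64, cfg: &RoleMultiplierConfig) -> f64 {
    base * effective_multiplier(raw_multiplier, cfg)
}
```

With the defaults, a base score of 45.0 and a raw multiplier of 2.5 yields 81.0; with `enable_clamping = false` the same inputs yield 112.5.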
Complete Example Configuration
Here’s a complete example showing both stages configured together:
# Stage 1: Coverage weight adjustments
[scoring.role_coverage_weights]
pure_logic = 1.0 # Pure logic should have unit tests (default: 1.0)
entry_point = 0.6 # Reduce penalty for integration-tested entry points
orchestrator = 0.8 # Partially reduce penalty for orchestrators
io_wrapper = 0.5 # I/O wrappers are integration tested (default: 0.5)
pattern_match = 1.0 # Standard
debug = 0.3 # Debug functions have lowest coverage expectations (default: 0.3)
unknown = 1.0 # Standard
# Stage 2: Role multiplier with clamping
[scoring.role_multiplier]
clamp_min = 0.3 # I/O wrappers contribute at least 30%
clamp_max = 1.8 # Critical functions get at most 180%
enable_clamping = true # Keep clamping enabled for stability
How the Two Stages Work Together
The two-stage approach ensures role-based coverage adjustments and architectural importance multipliers work independently:
Example Workflow:
1. Calculate base score from complexity (10) and dependencies (5)
→ Base = 15.0
2. Stage 1: Apply coverage weight based on role (Entry Point, weight 0.6)
→ Coverage penalty reduced from 1.0 to 0.6
→ Preliminary score = 15.0 × 0.6 = 9.0
3. Stage 2: Apply clamped role multiplier (Entry Point, multiplier 1.2)
→ Clamped to [0.3, 1.8] → stays 1.2
→ Final score = 9.0 × 1.2 = 10.8
Key Benefits:
- Coverage adjustments don’t interfere with role multiplier
- Both mechanisms contribute independently to final score
- Clamping prevents instability from extreme values
- Configuration flexibility for different project needs
Verification
To see how role-based adjustments affect your codebase:
# Show detailed scoring breakdown
debtmap analyze . --verbose
# Look for lines like:
# Coverage Weight: 0.6 (Entry Point adjustment)
# Adjusted Coverage Penalty: 0.6 (reduced from 1.0)
# Role Multiplier: 1.2 (clamped from 1.5)
For more details on how role-based adjustments reduce false positives, see the Role-Based Adjustments section in the Scoring Strategies guide.
Thresholds Configuration
Basic Thresholds
Control when code is flagged as technical debt:
[thresholds]
complexity = 10 # Cyclomatic complexity threshold
duplication = 50 # Duplication threshold
max_file_length = 500 # Maximum lines per file
max_function_length = 50 # Maximum lines per function
Note: The TOML configuration accepts max_file_length (shown above), which maps to the internal struct field max_file_lines. Both names refer to the same setting.
Minimum Thresholds
Filter out trivial functions that aren’t really technical debt:
[thresholds]
minimum_debt_score = 2.0 # Only show items with debt score ≥ 2.0
minimum_cyclomatic_complexity = 3 # Ignore functions with cyclomatic < 3
minimum_cognitive_complexity = 5 # Ignore functions with cognitive < 5
minimum_risk_score = 2.0 # Only show Risk items with score ≥ 2.0
These minimum thresholds help focus on significant issues by filtering out simple functions with minor complexity.
Validation Thresholds
The [thresholds.validation] subsection configures limits for the debtmap validate command:
[thresholds.validation]
max_average_complexity = 10.0 # Maximum allowed average complexity (default: 10.0)
max_high_complexity_count = 100 # DEPRECATED: Use max_debt_density instead (default: 100)
max_debt_items = 2000 # DEPRECATED: Use max_debt_density instead (default: 2000)
max_total_debt_score = 10000 # Maximum total debt score (default: 10000)
max_codebase_risk_score = 7.0 # Maximum codebase risk score (default: 7.0)
max_high_risk_functions = 50 # DEPRECATED: Use max_debt_density instead (default: 50)
min_coverage_percentage = 0.0 # Minimum required coverage % (default: 0.0)
max_debt_density = 50.0 # Maximum debt per 1000 LOC (default: 50.0)
Deprecated Fields (v0.3.0+):
The following validation thresholds are deprecated since v0.3.0 and will be removed in v1.0:
- max_high_complexity_count - Replaced by max_debt_density (scale-independent)
- max_debt_items - Replaced by max_debt_density (scale-independent)
- max_high_risk_functions - Replaced by max_debt_density (scale-independent)
Migration: Use max_debt_density instead, which provides a scale-independent metric (debt per 1000 lines of code). This allows the same threshold to work across codebases of different sizes.
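To illustrate why a density metric is scale-independent, here is a minimal Rust sketch. It assumes that "debt" means the total debt score and that density is that score per 1000 lines of code; `debt_density` and `passes_density_gate` are hypothetical helpers, not part of Debtmap's API.

```rust
/// Debt per 1000 lines of code. Returns 0.0 for an empty codebase.
/// Assumption: "debt" here is the total debt score.
fn debt_density(total_debt_score: f64, lines_of_code: usize) -> f64 {
    if lines_of_code == 0 {
        return 0.0;
    }
    total_debt_score / lines_of_code as f64 * 1000.0
}

/// True when the density is within the configured `max_debt_density`.
fn passes_density_gate(density: f64, max_debt_density: f64) -> bool {
    density <= max_debt_density
}
```

Under these assumptions, a total score of 1200 passes the default gate of 50.0 in a 40,000-line codebase (density 30) but fails in a 12,000-line one (density 100), which a fixed item count could not distinguish.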
Use debtmap validate in CI to enforce code quality standards:
# Fail build if validation thresholds are exceeded
debtmap validate
Language Configuration
Enabling Languages
Specify which languages to analyze:
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
Language-Specific Features
Configure features for individual languages:
[languages.rust]
detect_dead_code = false # Rust: disabled by default (compiler handles it)
detect_complexity = true
detect_duplication = true
[languages.python]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
[languages.javascript]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
[languages.typescript]
detect_dead_code = true
detect_complexity = true
detect_duplication = true
Note: Rust’s dead code detection is disabled by default since the Rust compiler already provides excellent unused code warnings.
Exclusion Patterns
File and Directory Exclusion
Use glob patterns to exclude files and directories from analysis:
[ignore]
patterns = [
"target/**", # Rust build output
"venv/**", # Python virtual environment
"node_modules/**", # JavaScript dependencies
"*.min.js", # Minified files
"benches/**", # Benchmark code
"tests/**/*", # Test files
"**/test_*.rs", # Test files (prefix)
"**/*_test.rs", # Test files (suffix)
"**/fixtures/**", # Test fixtures
"**/mocks/**", # Mock implementations
"**/stubs/**", # Stub implementations
"**/examples/**", # Example code
"**/demo/**", # Demo code
]
Glob pattern syntax:
- * - Matches any characters except /
- ** - Matches any characters including / (recursive)
- ? - Matches a single character
- [abc] - Matches any character in the set
Note: Function-level filtering (e.g., ignoring specific function name patterns) is handled by role detection and context-aware analysis rather than explicit ignore patterns. See the Context-Aware Detection section for function-level filtering options.
Display Configuration
Control how results are displayed:
[display]
tiered = true # Use tiered priority display (default: true)
items_per_tier = 5 # Show 5 items per tier (default: 5)
When tiered = true, Debtmap groups results into priority tiers (Critical, High, Medium, Low) and shows the top items from each tier.
Output Configuration
Set the default output format:
[output]
default_format = "terminal" # Options: "terminal", "json", "markdown"
Supported formats:
- "terminal" - Human-readable colored output for the terminal (default)
- "json" - Machine-readable JSON for integration with other tools
- "markdown" - Markdown format for documentation and reports
This can be overridden with the --format CLI flag:
debtmap analyze --format json # JSON output
debtmap analyze --format markdown # Markdown output
Normalization Configuration
Control how raw scores are normalized to a 0-10 scale:
[normalization]
linear_threshold = 10.0 # Use linear scaling below this value
logarithmic_threshold = 100.0 # Use logarithmic scaling above this value
sqrt_multiplier = 3.33 # Multiplier for square root scaling
log_multiplier = 10.0 # Multiplier for logarithmic scaling
show_raw_scores = true # Show both raw and normalized scores
Normalization ensures scores are comparable across different codebases and prevents extreme outliers from dominating the results.
Advanced Configuration
Entropy-Based Complexity Scoring
Entropy analysis helps identify repetitive code patterns (like large match statements) that inflate complexity metrics:
[entropy]
enabled = true # Enable entropy analysis (default: true)
weight = 1.0 # Weight in complexity adjustment (default: 1.0)
min_tokens = 20 # Minimum tokens for analysis (default: 20)
pattern_threshold = 0.7 # Pattern similarity threshold (default: 0.7)
entropy_threshold = 0.4 # Low entropy threshold (default: 0.4)
branch_threshold = 0.8 # Branch similarity threshold (default: 0.8)
use_classification = false # Use smarter token classification (default: false)
# Maximum reductions to prevent over-correction
max_repetition_reduction = 0.20 # Max 20% reduction for repetition (default: 0.20)
max_entropy_reduction = 0.15 # Max 15% reduction for low entropy (default: 0.15)
max_branch_reduction = 0.25 # Max 25% reduction for similar branches (default: 0.25)
max_combined_reduction = 0.30 # Max 30% total reduction (default: 0.30)
Entropy scoring reduces false positives from functions like parsers and state machines that have high cyclomatic complexity but are actually simple and maintainable.
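The four `max_*_reduction` settings act as caps. One plausible way they could interact is sketched below: each individual reduction is limited by its own cap, the results are added, and the total is limited by `max_combined_reduction`. The additive combination is an assumption made for illustration, and `EntropyCaps` and both functions are hypothetical names.

```rust
/// Caps from the [entropy] section (hypothetical type).
struct EntropyCaps {
    max_repetition_reduction: f64,
    max_entropy_reduction: f64,
    max_branch_reduction: f64,
    max_combined_reduction: f64,
}

impl Default for EntropyCaps {
    fn default() -> Self {
        Self {
            max_repetition_reduction: 0.20,
            max_entropy_reduction: 0.15,
            max_branch_reduction: 0.25,
            max_combined_reduction: 0.30,
        }
    }
}

/// Cap each reduction individually, add them, then cap the total.
/// Assumption: reductions combine additively.
fn combined_reduction(repetition: f64, entropy: f64, branch: f64, caps: &EntropyCaps) -> f64 {
    let total = repetition.min(caps.max_repetition_reduction)
        + entropy.min(caps.max_entropy_reduction)
        + branch.min(caps.max_branch_reduction);
    total.min(caps.max_combined_reduction)
}

/// Scale a raw complexity value by the fraction that remains after reduction.
fn adjusted_complexity(raw: f64, reduction: f64) -> f64 {
    raw * (1.0 - reduction)
}
```

Even when every signal is strong, this sketch never removes more than 30% of the raw complexity, which is the over-correction guard the caps are meant to provide.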
God Object Detection
Configure detection of classes/structs with too many responsibilities:
[god_object_detection]
enabled = true
# Rust-specific thresholds
[god_object_detection.rust]
max_methods = 20 # Maximum methods before flagging (default: 20)
max_fields = 15 # Maximum fields before flagging (default: 15)
max_traits = 5 # Maximum implemented traits
max_lines = 1000 # Maximum lines of code
max_complexity = 200 # Maximum total complexity
# Python-specific thresholds
[god_object_detection.python]
max_methods = 15
max_fields = 10
max_traits = 3
max_lines = 500
max_complexity = 150
# JavaScript-specific thresholds
[god_object_detection.javascript]
max_methods = 15
max_fields = 20 # JavaScript classes often have more properties
max_traits = 3
max_lines = 500
max_complexity = 150
Note: Different languages have different defaults. Rust allows more methods since trait implementations add methods, while JavaScript classes should be smaller.
Context-Aware Detection
Enable context-aware pattern detection to reduce false positives:
[context]
enabled = false # Opt-in (default: false)
# Custom context rules
[[context.rules]]
name = "allow_blocking_in_main"
pattern = "blocking_io"
action = "allow"
priority = 100
reason = "Main function can use blocking I/O"
[context.rules.context]
role = "main"
# Function pattern configuration
[context.function_patterns]
test_patterns = ["test_*", "bench_*"]
config_patterns = ["load_*_config", "parse_*_config"]
handler_patterns = ["handle_*", "*_handler"]
init_patterns = ["initialize_*", "setup_*"]
Context-aware detection adjusts severity based on where code appears (main functions, test code, configuration loaders, etc.).
Error Handling Detection
Configure detection of error handling anti-patterns:
[error_handling]
detect_async_errors = true # Detect async error issues (default: true)
detect_context_loss = true # Detect error context loss (default: true)
detect_propagation = true # Analyze error propagation (default: true)
detect_panic_patterns = true # Detect panic/unwrap usage (default: true)
detect_swallowing = true # Detect swallowed errors (default: true)
# Custom error patterns
[[error_handling.custom_patterns]]
name = "custom_panic"
pattern = "my_panic_macro"
pattern_type = "macro_name"
severity = "high"
description = "Custom panic macro usage"
remediation = "Replace with Result-based error handling"
# Severity overrides
[[error_handling.severity_overrides]]
pattern = "unwrap"
context = "test"
severity = "low" # Unwrap is acceptable in test code
Pure Mapping Pattern Detection
Configure detection of pure mapping patterns to reduce false positives from exhaustive match expressions:
[mapping_patterns]
enabled = true # Enable mapping pattern detection (default: true)
complexity_reduction = 0.30 # Reduce complexity by 30% (default: 0.30)
min_branches = 3 # Minimum match arms to consider (default: 3)
What are pure mapping patterns?
Pure mapping patterns are exhaustive match expressions that transform input to output without side effects. These patterns have high cyclomatic complexity due to many branches, but are actually simple and maintainable because:
- Each branch is independent and straightforward
- No mutation or side effects occur
- The pattern is predictable and easy to understand
- Adding new cases requires minimal changes
Example:
fn status_to_string(status: Status) -> &'static str {
    match status {
        Status::Success => "success",
        Status::Pending => "pending",
        Status::Failed => "failed",
        Status::Cancelled => "cancelled",
        // ... many more cases
    }
}
This function has high cyclomatic complexity (one branch per case), but is simple to maintain. Mapping pattern detection recognizes this and reduces the complexity score appropriately.
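The effect of the three `[mapping_patterns]` settings can be sketched as below. This is an illustration under stated assumptions: `MappingPatternConfig` and `adjust_for_mapping` are hypothetical names, and whether a function is a pure mapping is taken as an input because the detection itself is not described here.

```rust
/// Mirrors the [mapping_patterns] settings (hypothetical type).
struct MappingPatternConfig {
    enabled: bool,
    complexity_reduction: f64,
    min_branches: usize,
}

impl Default for MappingPatternConfig {
    fn default() -> Self {
        Self { enabled: true, complexity_reduction: 0.30, min_branches: 3 }
    }
}

/// Reduce complexity for a detected pure mapping with enough match arms;
/// otherwise return the complexity unchanged.
fn adjust_for_mapping(
    cyclomatic: u32,
    match_arms: usize,
    is_pure_mapping: bool,
    cfg: &MappingPatternConfig,
) -> f64 {
    let raw = f64::from(cyclomatic);
    if cfg.enabled && is_pure_mapping && match_arms >= cfg.min_branches {
        raw * (1.0 - cfg.complexity_reduction)
    } else {
        raw
    }
}
```

With the defaults, a 12-arm pure mapping with cyclomatic complexity 12 would be scored as 8.4, while a 2-arm match is left untouched.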
Configuration options:
| Parameter | Default | Description |
|---|---|---|
| enabled | true | Enable mapping pattern detection |
| complexity_reduction | 0.30 | Percentage to reduce complexity (0.0-1.0) |
| min_branches | 3 | Minimum match arms to be considered a mapping pattern |
Example configuration:
# Conservative reduction
[mapping_patterns]
complexity_reduction = 0.20 # Only 20% reduction
# Aggressive reduction for codebases with many mapping patterns
[mapping_patterns]
complexity_reduction = 0.50 # 50% reduction
# Disable if you want to see all match complexity
[mapping_patterns]
enabled = false
When to adjust:
- Increase complexity_reduction if you have many simple mapping functions being flagged as complex
- Decrease complexity_reduction if you want more conservative adjustments
- Increase min_branches to only apply reduction to very large match statements
- Disable entirely if you want raw complexity scores without adjustment
External API Configuration
Mark functions as public API for enhanced testing recommendations:
[external_api]
detect_external_api = false # Auto-detect public APIs (default: false)
api_functions = [] # Explicitly mark API functions
api_files = [] # Explicitly mark API files
When enabled, public API functions receive higher priority for test coverage.
Classification Configuration
The [classification] section controls how Debtmap classifies functions by their semantic role (constructor, accessor, data flow, etc.). This classification drives role-based adjustments and reduces false positives.
[classification]
# Constructor detection
[classification.constructors]
detect_constructors = true # Enable constructor detection (default: true)
constructor_patterns = ["new", "create", "build", "from"] # Common constructor names
# Accessor detection
[classification.accessors]
detect_accessors = true # Enable accessor/getter detection (default: true)
accessor_patterns = ["get_*", "set_*", "is_*", "has_*"] # Common accessor patterns
# Data flow detection
[classification.data_flow]
detect_data_flow = true # Enable data flow analysis (default: true)
Configuration Options:
| Section | Option | Default | Description |
|---|---|---|---|
| constructors | detect_constructors | true | Identify constructor functions |
| constructors | constructor_patterns | ["new", "create", "build", "from"] | Name patterns for constructors |
| accessors | detect_accessors | true | Identify accessor/getter functions |
| accessors | accessor_patterns | ["get_*", "set_*", "is_*", "has_*"] | Name patterns for accessors |
| data_flow | detect_data_flow | true | Enable data flow analysis |
Why Classification Matters:
Classification helps Debtmap understand function intent and apply appropriate complexity adjustments:
- Constructors typically have boilerplate initialization code with naturally higher complexity
- Accessors are simple getters/setters that shouldn’t be flagged as debt
- Data flow functions (mappers, filters) have predictable patterns that inflate metrics
By detecting these patterns, Debtmap reduces false positives and focuses on genuine technical debt.
Additional Advanced Options
Debtmap supports additional advanced configuration options:
Lines of Code Configuration
The [loc] section controls how lines of code are counted for metrics and reporting:
[loc]
include_tests = false # Exclude test files from LOC counts (default: false)
include_generated = false # Exclude generated files from LOC counts (default: false)
count_comments = false # Include comment lines in LOC counts (default: false)
count_blank_lines = false # Include blank lines in LOC counts (default: false)
Configuration options:
| Option | Default | Description |
|---|---|---|
| include_tests | false | Whether to include test files in LOC metrics |
| include_generated | false | Whether to include generated files in LOC metrics |
| count_comments | false | Whether to count comment lines as LOC |
| count_blank_lines | false | Whether to count blank lines as LOC |
Example - Strict LOC counting:
[loc]
include_tests = false # Focus on production code
include_generated = false # Exclude auto-generated code
count_comments = false # Only count executable code
count_blank_lines = false # Exclude whitespace
Tier Configuration
The [tiers] section configures tier threshold boundaries for prioritization:
[tiers]
t2_complexity_threshold = 15 # Complexity threshold for Tier 2 (default: 15)
t2_dependency_threshold = 10 # Dependency threshold for Tier 2 (default: 10)
t3_complexity_threshold = 10 # Complexity threshold for Tier 3 (default: 10)
show_t4_in_main_report = false # Show Tier 4 items in main report (default: false)
Tier priority levels:
- Tier 1 (Critical): Highest priority items
- Tier 2 (High): Items above t2_* thresholds
- Tier 3 (Medium): Items above t3_* thresholds
- Tier 4 (Low): Items below all thresholds
Example - Stricter tier boundaries:
[tiers]
t2_complexity_threshold = 12 # Lower threshold = more items in high priority
t2_dependency_threshold = 8
t3_complexity_threshold = 8
show_t4_in_main_report = true # Include low-priority items
Enhanced Complexity Thresholds
The [complexity_thresholds] section provides more granular control over complexity detection. It supplements the basic [thresholds] section with minimum total, cyclomatic, and cognitive complexity thresholds for flagging functions.
These options are advanced features with sensible defaults. Most users won’t need to configure them explicitly.
Orchestration Adjustment
The [orchestration_adjustment] section configures complexity reduction for orchestrator functions that primarily delegate to other functions:
[orchestration_adjustment]
enabled = true # Enable orchestration detection (default: true)
min_delegation_ratio = 0.6 # Minimum ratio of delegated calls (default: 0.6)
complexity_reduction = 0.25 # Reduce complexity by 25% (default: 0.25)
Configuration Options:
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable orchestration pattern detection |
| min_delegation_ratio | 0.6 | Minimum fraction of a function that must delegate to other functions for it to count as an orchestrator (0.0-1.0) |
| complexity_reduction | 0.25 | Fraction by which to reduce the complexity score (0.0-1.0) |
Orchestrator functions coordinate multiple operations but don’t contain complex logic themselves. This adjustment prevents them from being over-penalized.
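As a rough sketch of how `min_delegation_ratio` and `complexity_reduction` could work together, consider the Rust below. It assumes the ratio is the number of delegating statements divided by the total number of statements in the function; that definition, along with the `OrchestrationConfig` type and both functions, is hypothetical.

```rust
/// Mirrors the [orchestration_adjustment] settings (hypothetical type).
struct OrchestrationConfig {
    enabled: bool,
    min_delegation_ratio: f64,
    complexity_reduction: f64,
}

impl Default for OrchestrationConfig {
    fn default() -> Self {
        Self { enabled: true, min_delegation_ratio: 0.6, complexity_reduction: 0.25 }
    }
}

/// Fraction of statements that delegate to other functions (assumed definition).
fn delegation_ratio(delegating: usize, total: usize) -> f64 {
    if total == 0 {
        0.0
    } else {
        delegating as f64 / total as f64
    }
}

/// Apply the reduction only when the function qualifies as an orchestrator.
fn adjust_for_orchestration(
    complexity: f64,
    delegating: usize,
    total: usize,
    cfg: &OrchestrationConfig,
) -> f64 {
    if cfg.enabled && delegation_ratio(delegating, total) >= cfg.min_delegation_ratio {
        complexity * (1.0 - cfg.complexity_reduction)
    } else {
        complexity
    }
}
```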
Boilerplate Detection
The [boilerplate_detection] section identifies and reduces penalties for boilerplate code patterns:
[boilerplate_detection]
enabled = true # Enable boilerplate detection (default: true)
detect_constructors = true # Detect constructor boilerplate (default: true)
detect_error_conversions = true # Detect error conversion boilerplate (default: true)
complexity_reduction = 0.20 # Reduce complexity by 20% (default: 0.20)
Configuration Options:
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable boilerplate pattern detection |
| detect_constructors | true | Identify constructor initialization boilerplate |
| detect_error_conversions | true | Identify error type conversion boilerplate |
| complexity_reduction | 0.20 | Percentage to reduce complexity for boilerplate (0.0-1.0) |
Boilerplate code often inflates complexity metrics without representing true technical debt. This detection reduces false positives from necessary but repetitive code.
Functional Analysis
The [functional_analysis] section configures detection of functional programming patterns:
[functional_analysis]
enabled = true # Enable functional pattern detection (default: true)
detect_pure_functions = true # Detect pure functions (default: true)
detect_higher_order = true # Detect higher-order functions (default: true)
detect_immutable_patterns = true # Detect immutable data patterns (default: true)
Configuration Options:
| Option | Default | Description |
|---|---|---|
| enabled | true | Enable functional programming analysis |
| detect_pure_functions | true | Identify functions without side effects |
| detect_higher_order | true | Identify functions that take/return functions |
| detect_immutable_patterns | true | Identify immutable data structure usage |
Functional patterns often lead to cleaner, more testable code. This analysis helps Debtmap recognize and appropriately score functional programming idioms.
CLI Integration
CLI flags can override configuration file settings:
# Override complexity threshold
debtmap analyze --threshold-complexity 15
# Provide coverage file
debtmap analyze --coverage-file coverage.json
# Enable context-aware detection
debtmap analyze --context
# Override output format
debtmap analyze --format json
Configuration Precedence
Debtmap resolves configuration values in the following order (highest to lowest priority):
- CLI flags - Command-line arguments (e.g., --threshold-complexity 15)
- Configuration file - Settings from .debtmap.toml
- Built-in defaults - Debtmap’s sensible default values
This allows you to set project-wide defaults in .debtmap.toml while customizing specific runs with CLI flags.
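The precedence order can be pictured as a simple fallback chain. The sketch below is only an illustration of that rule. `resolve_setting` is a hypothetical helper, not part of Debtmap's API.

```rust
/// Resolve one setting using the precedence described above:
/// CLI flag first, then the configuration file, then the built-in default.
/// `resolve_setting` is a hypothetical helper used for illustration only.
fn resolve_setting<T>(cli_flag: Option<T>, config_file: Option<T>, default: T) -> T {
    cli_flag.or(config_file).unwrap_or(default)
}
```

For example, `resolve_setting(Some(15), Some(12), 10)` yields 15, while `resolve_setting(None, Some(12), 10)` falls through to the file value 12.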
Configuration Validation
Automatic Validation
Debtmap automatically validates your configuration when loading:
- Scoring weights must sum to 1.0 (±0.001 tolerance)
- Individual weights must be between 0.0 and 1.0
- Invalid configurations fall back to defaults with a warning
Normalization
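If scoring weights don’t sum exactly to 1.0, Debtmap normalizes them by dividing each weight by their total. Because sums outside the ±0.001 tolerance are rejected during validation, normalization in practice only corrects small rounding differences; the example below uses exaggerated values to make the arithmetic visible: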
# Input (sums to 0.80)
[scoring]
coverage = 0.40
complexity = 0.30
dependency = 0.10
# Automatically normalized to:
# coverage = 0.50
# complexity = 0.375
# dependency = 0.125
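The arithmetic behind validation and normalization can be sketched as follows. This is a minimal illustration under stated assumptions: the `ScoringWeights` struct and its method names are invented for this example and may not match Debtmap's source.

```rust
/// Scoring weights as used in the `[scoring]` examples.
/// Hypothetical struct for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ScoringWeights {
    coverage: f64,
    complexity: f64,
    dependency: f64,
}

impl ScoringWeights {
    fn sum(&self) -> f64 {
        self.coverage + self.complexity + self.dependency
    }

    /// True when the weights sum to 1.0 within the ±0.001 tolerance.
    fn sums_to_one(&self) -> bool {
        (self.sum() - 1.0).abs() <= 0.001
    }

    /// Divide each weight by the total so the result sums to 1.0.
    /// Returns `None` when the total is not positive.
    fn normalized(&self) -> Option<ScoringWeights> {
        let total = self.sum();
        if total <= 0.0 {
            return None;
        }
        Some(ScoringWeights {
            coverage: self.coverage / total,
            complexity: self.complexity / total,
            dependency: self.dependency / total,
        })
    }
}
```

With the input above (0.40, 0.30, 0.10), each weight is divided by 0.80, which gives 0.50, 0.375, and 0.125.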
Debug Validation
To verify which configuration file is being loaded, check debug logs:
RUST_LOG=debug debtmap analyze
Look for log messages like:
DEBUG debtmap::config: Loaded config from /path/to/.debtmap.toml
Complete Configuration Example
Here’s a comprehensive configuration showing all major sections:
# Scoring configuration
[scoring]
coverage = 0.50
complexity = 0.35
dependency = 0.15
# Basic thresholds
[thresholds]
complexity = 10
duplication = 50
max_file_length = 500
max_function_length = 50
minimum_debt_score = 2.0
minimum_cyclomatic_complexity = 3
minimum_cognitive_complexity = 5
minimum_risk_score = 2.0
# Validation thresholds for CI
[thresholds.validation]
max_average_complexity = 10.0
max_high_complexity_count = 100 # DEPRECATED: Use max_debt_density
max_debt_items = 2000 # DEPRECATED: Use max_debt_density
max_total_debt_score = 10000
max_codebase_risk_score = 7.0
max_high_risk_functions = 50 # DEPRECATED: Use max_debt_density
min_coverage_percentage = 0.0
max_debt_density = 50.0
# Language configuration
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
[languages.rust]
detect_dead_code = false
detect_complexity = true
detect_duplication = true
# Exclusion patterns
[ignore]
patterns = [
"target/**",
"node_modules/**",
"tests/**/*",
"**/*_test.rs",
]
# Display configuration
[display]
tiered = true
items_per_tier = 5
# Output configuration
[output]
default_format = "terminal"
# Entropy configuration
[entropy]
enabled = true
weight = 1.0
min_tokens = 20
# God object detection
[god_object_detection]
enabled = true
[god_object_detection.rust]
max_methods = 20
max_fields = 15
# Classification configuration
[classification.constructors]
detect_constructors = true
constructor_patterns = ["new", "create", "build", "from"]
[classification.accessors]
detect_accessors = true
accessor_patterns = ["get_*", "set_*", "is_*", "has_*"]
[classification.data_flow]
detect_data_flow = true
# Advanced analysis
[orchestration_adjustment]
enabled = true
min_delegation_ratio = 0.6
complexity_reduction = 0.25
[boilerplate_detection]
enabled = true
detect_constructors = true
detect_error_conversions = true
complexity_reduction = 0.20
[functional_analysis]
enabled = true
detect_pure_functions = true
detect_higher_order = true
detect_immutable_patterns = true
Configuration Best Practices
For Strict Quality Standards
[scoring]
coverage = 0.60 # Emphasize test coverage
complexity = 0.30
dependency = 0.10
[thresholds]
minimum_debt_score = 3.0 # Higher bar for flagging issues
max_function_length = 30 # Enforce smaller functions
[thresholds.validation]
max_average_complexity = 8.0 # Stricter complexity limits
max_debt_items = 500 # Stricter debt limits (deprecated option; prefer max_debt_density)
min_coverage_percentage = 80.0 # Require 80% coverage
For Legacy Codebases
[scoring]
coverage = 0.30 # Reduce coverage weight (legacy code often lacks tests)
complexity = 0.50 # Focus on complexity
dependency = 0.20
[thresholds]
minimum_debt_score = 5.0 # Only show highest priority items
minimum_cyclomatic_complexity = 10 # Filter out moderate complexity
[thresholds.validation]
max_debt_items = 10000 # Accommodate large debt (deprecated option; prefer max_debt_density)
max_total_debt_score = 5000 # Note: lower than the 10000 default; raise it if legacy debt exceeds this
For Open Source Libraries
[scoring]
coverage = 0.55 # Prioritize test coverage (public API)
complexity = 0.30
dependency = 0.15
[external_api]
detect_external_api = true # Flag untested public APIs
[thresholds.validation]
min_coverage_percentage = 90.0 # High coverage for public API
max_high_complexity_count = 20 # Keep complexity low (deprecated option; prefer max_debt_density)
Troubleshooting
Configuration Not Loading
Check file location:
# Ensure file is named .debtmap.toml (note the dot prefix)
ls -la .debtmap.toml
# Debtmap searches current directory + 10 parent directories
pwd
Check file syntax:
# Verify TOML syntax is valid
debtmap analyze 2>&1 | grep -i "failed to parse"
Weights Don’t Sum to 1.0
Error message:
Warning: Invalid scoring weights: Active scoring weights must sum to 1.0, but sum to 0.800. Using defaults.
Fix: Ensure coverage + complexity + dependency = 1.0
[scoring]
coverage = 0.50
complexity = 0.35
dependency = 0.15 # Sum = 1.0 ✓
No Results Shown
Possible causes:
- Minimum thresholds too high
- All code excluded by ignore patterns
- No supported languages in project
Solutions:
# Lower minimum thresholds
[thresholds]
minimum_debt_score = 1.0
minimum_cyclomatic_complexity = 1
# Check language configuration
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
# Review ignore patterns
[ignore]
patterns = [
# Make sure you're not excluding too much
]
Related Chapters
- Getting Started - Initial setup and basic usage
- Analysis Guide - Understanding scoring and prioritization
- Output Formats - Formatting and exporting results
Threshold Configuration
Debtmap uses configurable thresholds to determine when code complexity, duplication, or structural issues should be flagged as technical debt. This chapter explains how to configure thresholds to match your project’s quality standards.
Overview
Thresholds control what gets flagged as technical debt. You can configure thresholds using:
- Preset configurations - Quick start with strict, balanced, or lenient settings
- CLI flags - Override thresholds for a single analysis run
- Configuration file - Project-specific thresholds in .debtmap.toml
Threshold Presets
Debtmap provides three preset threshold configurations to match different project needs:
Preset Comparison
| Threshold | Strict | Balanced (Default) | Lenient |
|---|---|---|---|
| Cyclomatic Complexity | 3 | 5 | 10 |
| Cognitive Complexity | 7 | 10 | 20 |
| Total Complexity | 5 | 8 | 15 |
| Function Length (lines) | 15 | 20 | 50 |
| Match Arms | 3 | 4 | 8 |
| If-Else Chain | 2 | 3 | 5 |
| Role Multipliers | | | |
| Entry Point Multiplier | 1.2x | 1.5x | 2.0x |
| Utility Multiplier | 0.6x | 0.8x | 1.0x |
| Test Function Multiplier | 3.0x | 2.0x | 3.0x |
When to Use Each Preset
Strict Preset
- New projects aiming for high code quality standards
- Libraries and reusable components
- Critical systems requiring high reliability
- Teams enforcing strict coding standards
debtmap analyze . --threshold-preset strict
Balanced Preset (Default)
- Typical production applications
- Projects with moderate complexity tolerance
- Recommended starting point for most projects
- Good balance between catching issues and avoiding false positives
debtmap analyze . # Uses balanced preset by default
Lenient Preset
- Legacy codebases during initial assessment
- Complex domains (compilers, scientific computing)
- Gradual debt reduction strategies
- Temporary relaxation during major refactoring
debtmap analyze . --threshold-preset lenient
Understanding Complexity Thresholds
Debtmap tracks multiple complexity metrics. A function is flagged only when it meets or exceeds ALL of the applicable thresholds:
Cyclomatic Complexity
Counts decision points in code: if, while, for, match, &&, ||, etc.
- What it measures: Number of independent paths through code
- Why it matters: More paths = harder to test completely
- Default threshold: 5
Cognitive Complexity
Measures the mental effort required to understand code by weighing nested structures and breaks in linear flow.
- What it measures: How hard code is to read and comprehend
- Why it matters: High cognitive load = maintenance burden
- Default threshold: 10
Total Complexity
Sum of cyclomatic and cognitive complexity.
- What it measures: Combined complexity burden
- Why it matters: Catches functions high in either metric
- Default threshold: 8
Function Length
Number of lines of code in the function body.
- What it measures: Physical size of function
- Why it matters: Long functions are hard to understand and test
- Default threshold: 20 lines
Structural Complexity
Additional metrics for specific patterns:
- Match arms: Flags large match/switch statements (default: 4)
- If-else chains: Flags long conditional chains (default: 3)
Important: Functions are flagged when they meet ALL of these conditions simultaneously:
- Cyclomatic complexity >= adjusted cyclomatic threshold
- Cognitive complexity >= adjusted cognitive threshold
- Function length >= minimum function length
- Total complexity (cyclomatic + cognitive) >= adjusted total threshold
The thresholds are first adjusted by role-based multipliers, then all four checks must pass for the function to be flagged. This is a conjunction (AND) of individual threshold checks.
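The conjunction described above can be written as a single boolean expression. The following is a minimal sketch, not Debtmap's implementation. `FunctionMetrics`, `AdjustedThresholds`, and `is_flagged` are hypothetical names introduced for illustration.

```rust
/// Raw metrics for one function (hypothetical struct for illustration).
struct FunctionMetrics {
    cyclomatic: u32,
    cognitive: u32,
    length: u32,
}

/// Thresholds after role adjustment. Fractional values such as 7.5 can
/// occur once a multiplier has been applied, so they are stored as f64.
struct AdjustedThresholds {
    cyclomatic: f64,
    cognitive: f64,
    total: f64,
    length: f64,
}

/// A function is flagged only when all four checks pass (logical AND).
fn is_flagged(m: &FunctionMetrics, t: &AdjustedThresholds) -> bool {
    let total = m.cyclomatic + m.cognitive;
    f64::from(m.cyclomatic) >= t.cyclomatic
        && f64::from(m.cognitive) >= t.cognitive
        && f64::from(m.length) >= t.length
        && f64::from(total) >= t.total
}
```

With the balanced core-logic thresholds (5, 10, 8, 20), a function with cyclomatic 6, cognitive 12, and 25 lines is flagged, while the same function at 10 lines is not, because the length check fails.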
Threshold Validation
All threshold configurations are validated to ensure they are positive (non-zero) values. The following validation rules apply:
- minimum_total_complexity > 0
- minimum_cyclomatic_complexity > 0
- minimum_cognitive_complexity > 0
- minimum_match_arms > 0
- minimum_if_else_chain > 0
- minimum_function_length > 0
- All role multipliers > 0
If any threshold is set to zero or a negative value, Debtmap will reject the configuration and use default values instead. This ensures that thresholds are always meaningful and prevents misconfiguration.
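The reject-and-fall-back rule might look like the sketch below. It covers only a subset of the fields, and the struct, function names, and warning text are assumptions for illustration rather than Debtmap's actual code or log output.

```rust
/// A subset of the `[complexity_thresholds]` values (hypothetical struct).
/// Signed integers are used so that negative input can be represented.
#[derive(Debug, Clone, PartialEq)]
struct ComplexityThresholds {
    minimum_total_complexity: i64,
    minimum_cyclomatic_complexity: i64,
    minimum_cognitive_complexity: i64,
    minimum_function_length: i64,
    entry_point_multiplier: f64,
}

impl Default for ComplexityThresholds {
    /// Balanced-preset values from the comparison table.
    fn default() -> Self {
        ComplexityThresholds {
            minimum_total_complexity: 8,
            minimum_cyclomatic_complexity: 5,
            minimum_cognitive_complexity: 10,
            minimum_function_length: 20,
            entry_point_multiplier: 1.5,
        }
    }
}

/// Return the name of the first field that is not strictly positive.
fn first_invalid_field(t: &ComplexityThresholds) -> Option<&'static str> {
    let integer_fields = [
        ("minimum_total_complexity", t.minimum_total_complexity),
        ("minimum_cyclomatic_complexity", t.minimum_cyclomatic_complexity),
        ("minimum_cognitive_complexity", t.minimum_cognitive_complexity),
        ("minimum_function_length", t.minimum_function_length),
    ];
    for (name, value) in integer_fields {
        if value <= 0 {
            return Some(name);
        }
    }
    // Written as a negated comparison so that NaN is rejected as well.
    if !(t.entry_point_multiplier > 0.0) {
        return Some("entry_point_multiplier");
    }
    None
}

/// Keep a valid configuration; otherwise fall back to the defaults.
fn validated_or_default(t: ComplexityThresholds) -> ComplexityThresholds {
    match first_invalid_field(&t) {
        None => t,
        Some(field) => {
            // Illustrative message only, not Debtmap's real warning text.
            eprintln!("invalid threshold `{}`: must be > 0, using defaults", field);
            ComplexityThresholds::default()
        }
    }
}
```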
Role-Based Multipliers
Debtmap automatically adjusts thresholds based on function role, recognizing that different types of functions have different complexity expectations:
| Function Role | Multiplier | Effect | Examples |
|---|---|---|---|
| Entry Points | 1.2x - 2.0x (preset-specific) | More lenient | main(), HTTP handlers, CLI commands |
| Core Logic | 1.0x | Standard | Business logic, algorithms |
| Utility Functions | 0.6x - 1.0x (preset-specific) | Stricter | Getters, setters, simple helpers |
| Test Functions | 2.0x - 3.0x (preset-specific) | Most lenient | Unit tests, integration tests |
| Unknown Functions | 1.0x (defaults to core logic) | Standard | Functions that don’t match any role pattern |
Note: Some multipliers vary by preset:
- Entry Points: Strict=1.2x, Balanced=1.5x, Lenient=2.0x
- Utility Functions: Strict=0.6x, Balanced=0.8x, Lenient=1.0x
- Test Functions: Strict=3.0x, Balanced=2.0x, Lenient=3.0x
How multipliers work:
A higher multiplier makes thresholds more lenient by adjusting ALL thresholds. The multiplier values vary by preset - for example, entry point functions use 1.2x (strict), 1.5x (balanced), or 2.0x (lenient).
Example: Entry Point function with Balanced preset (multiplier = 1.5x):
- Cyclomatic threshold: 7.5 (5 × 1.5)
- Cognitive threshold: 15 (10 × 1.5)
- Total threshold: 12 (8 × 1.5)
- Length threshold: 30 lines (20 × 1.5)
The function is flagged only if ALL conditions are met:
- Cyclomatic complexity >= 7.5 AND
- Cognitive complexity >= 15 AND
- Function length >= 30 lines AND
- Total complexity (cyclomatic + cognitive) >= 12
Comparison across roles (Balanced preset):
| Role | Cyclomatic | Cognitive | Total | Length | Flagged When |
|---|---|---|---|---|---|
| Entry Point (1.5x) | 7.5 | 15 | 12 | 30 | ALL conditions met |
| Core Logic (1.0x) | 5 | 10 | 8 | 20 | ALL conditions met |
| Utility (0.8x) | 4 | 8 | 6.4 | 16 | ALL conditions met |
| Test (2.0x) | 10 | 20 | 16 | 40 | ALL conditions met |
Note: Entry point multipliers differ by preset. With the strict preset, entry points use 1.2x (cyclomatic=3.6, cognitive=8.4), while the lenient preset uses 2.0x (cyclomatic=20, cognitive=40).
This allows test functions and entry points to be more complex without false positives, while keeping utility functions clean and simple.
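The comparison table can be reproduced by scaling each balanced base threshold by the role multiplier. The sketch below assumes plain floating-point multiplication with no rounding; how Debtmap rounds fractional thresholds is not stated here. `Role`, `BALANCED`, and the function names are hypothetical.

```rust
/// Function roles from the table above (hypothetical enum).
#[derive(Debug, Clone, Copy)]
enum Role {
    EntryPoint,
    CoreLogic,
    Utility,
    Test,
}

/// Balanced base thresholds: [cyclomatic, cognitive, total, length].
const BALANCED: [f64; 4] = [5.0, 10.0, 8.0, 20.0];

/// Balanced-preset multipliers as listed in this chapter.
fn balanced_multiplier(role: Role) -> f64 {
    match role {
        Role::EntryPoint => 1.5,
        Role::CoreLogic => 1.0,
        Role::Utility => 0.8,
        Role::Test => 2.0,
    }
}

/// Scale every base threshold by the role multiplier.
fn adjusted_thresholds(base: [f64; 4], role: Role) -> [f64; 4] {
    let multiplier = balanced_multiplier(role);
    base.map(|threshold| threshold * multiplier)
}
```

Calling `adjusted_thresholds(BALANCED, Role::EntryPoint)` gives approximately `[7.5, 15.0, 12.0, 30.0]`, matching the entry point row.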
CLI Threshold Flags
Override thresholds for a single analysis run using command-line flags:
Preset-Based Configuration (Recommended)
Use --threshold-preset to apply a predefined threshold configuration:
# Use strict preset (cyclomatic=3, cognitive=7, total=5, length=15)
debtmap analyze . --threshold-preset strict
# Use balanced preset (default - cyclomatic=5, cognitive=10, total=8, length=20)
debtmap analyze . --threshold-preset balanced
# Use lenient preset (cyclomatic=10, cognitive=20, total=15, length=50)
debtmap analyze . --threshold-preset lenient
Individual Threshold Overrides
You can also override specific thresholds:
# Override cyclomatic complexity threshold (legacy flag, default: 10)
debtmap analyze . --threshold-complexity 15
# Override duplication threshold in lines (default: 50)
debtmap analyze . --threshold-duplication 30
# Combine multiple threshold flags
debtmap analyze . --threshold-complexity 15 --threshold-duplication 30
Note:
- --threshold-preset provides the most comprehensive threshold configuration (includes all complexity metrics and role multipliers)
- Individual flags like --threshold-complexity are legacy flags that only set a single cyclomatic complexity threshold, without configuring cognitive complexity, total complexity, function length, or role multipliers
- For full control over all complexity metrics and role-based multipliers, use the .debtmap.toml configuration file
- CLI flags override configuration file settings for that run only
Configuration File
For project-specific thresholds, create a .debtmap.toml file in your project root.
Complexity Thresholds Configuration
The [complexity_thresholds] section in .debtmap.toml allows fine-grained control over function complexity detection:
[complexity_thresholds]
# Core complexity metrics
minimum_total_complexity = 8 # Sum of cyclomatic + cognitive
minimum_cyclomatic_complexity = 5 # Decision points (if, match, etc.)
minimum_cognitive_complexity = 10 # Mental effort to understand code
# Structural complexity metrics
minimum_match_arms = 4 # Match/switch arms before flagging
minimum_if_else_chain = 3 # If-else chain length before flagging
minimum_function_length = 20 # Minimum lines before flagging
# Role-based multipliers (applied to all thresholds above)
entry_point_multiplier = 1.5 # main(), handlers, CLI commands
core_logic_multiplier = 1.0 # Standard business logic
utility_multiplier = 0.8 # Getters, setters, helpers
test_function_multiplier = 2.0 # Unit tests, integration tests
Note: The multipliers are applied to thresholds before comparison. For example, with entry_point_multiplier = 1.5 and minimum_cyclomatic_complexity = 5, the cyclomatic threshold for an entry point function becomes 7.5 (5 × 1.5). The function is flagged only if the other adjusted thresholds are met as well.
Validation: All threshold values must be positive (> 0). Zero or negative values will cause validation errors and Debtmap will use default values instead. This ensures that thresholds are always meaningful and prevents misconfiguration.
Complete Example
# Legacy threshold settings (simple configuration)
# Note: For comprehensive control, use [complexity_thresholds] instead
[thresholds]
complexity = 15 # Cyclomatic complexity threshold (legacy)
cognitive = 20 # Cognitive complexity threshold (legacy)
max_file_length = 500 # Maximum file length in lines
# Validation thresholds for CI/CD
[thresholds.validation]
max_average_complexity = 10.0 # Maximum average complexity across codebase
max_debt_density = 50.0 # Maximum debt items per 1000 LOC
max_codebase_risk_score = 7.0 # Maximum overall risk score
min_coverage_percentage = 0.0 # Minimum test coverage (0 = disabled)
max_total_debt_score = 10000 # Safety net for total debt score
# God object detection
[god_object]
enabled = true
# Rust-specific thresholds
[god_object.rust]
max_methods = 20 # Maximum methods before flagging as god object
max_fields = 15 # Maximum fields
max_traits = 5 # Maximum trait implementations
max_lines = 1000 # Maximum lines in impl block
max_complexity = 200 # Maximum total complexity
# Python-specific thresholds
[god_object.python]
max_methods = 15
max_fields = 10
max_traits = 3
max_lines = 500
max_complexity = 150
# JavaScript/TypeScript-specific thresholds
[god_object.javascript]
max_methods = 15
max_fields = 20
max_traits = 3
max_lines = 500
max_complexity = 150
Configuration Section Notes:
- [thresholds]: Legacy/simple threshold configuration. Sets basic complexity thresholds without role multipliers or comprehensive metric control.
- [complexity_thresholds]: Modern/comprehensive threshold configuration. Provides fine-grained control over all complexity metrics, structural thresholds, and role-based multipliers. Use this for full control.
- Recommendation: For new projects, use [complexity_thresholds] for comprehensive configuration. The [thresholds] section is maintained for backward compatibility.
Using Configuration File
# Initialize with default configuration
debtmap init
# Edit .debtmap.toml to customize thresholds
# Then run analysis (automatically uses config file)
debtmap analyze .
# Validate against thresholds in CI/CD
debtmap validate . --config .debtmap.toml
God Object Thresholds
God objects are classes/structs with too many responsibilities. Debtmap uses language-specific thresholds to detect them:
Rust Thresholds
[god_object.rust]
max_methods = 20 # Methods in impl blocks
max_fields = 15 # Struct fields
max_traits = 5 # Trait implementations
max_lines = 1000 # Lines in impl blocks
max_complexity = 200 # Total complexity
Python Thresholds
[god_object.python]
max_methods = 15
max_fields = 10
max_traits = 3 # Base classes
max_lines = 500
max_complexity = 150
JavaScript/TypeScript Thresholds
[god_object.javascript]
max_methods = 15
max_fields = 20
max_traits = 3 # Extended classes
max_lines = 500
max_complexity = 150
Why language-specific thresholds?
Different languages have different idioms:
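- Rust: Encourages small traits and composition; the defaults allow the most methods (20) and trait implementations (5)
- Python: Uses the lowest field limit (10) together with a 15-method limit
- JavaScript: Prototype-based, typically has more properties, so it gets the highest field limit (20)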
Validation Thresholds
Use validation thresholds in CI/CD pipelines to enforce quality gates:
Scale-Independent Metrics (Recommended)
These metrics work for codebases of any size:
[thresholds.validation]
# Average complexity per function (default: 10.0)
max_average_complexity = 10.0
# Debt items per 1000 lines of code (default: 50.0)
max_debt_density = 50.0
# Overall risk score 0-10 (default: 7.0)
max_codebase_risk_score = 7.0
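Debt density is defined above as debt items per 1000 lines of code, which is why it stays comparable across codebases of different sizes. A minimal sketch of that calculation follows. The function names are hypothetical, and treating a density equal to the limit as passing is an assumption.

```rust
/// Debt items per 1000 lines of code. Returns 0.0 for an empty codebase.
fn debt_density(debt_items: usize, total_lines: usize) -> f64 {
    if total_lines == 0 {
        return 0.0;
    }
    debt_items as f64 * 1000.0 / total_lines as f64
}

/// True when the density does not exceed the configured `max_debt_density`.
/// Assumption: a density exactly equal to the limit still passes.
fn passes_density_gate(debt_items: usize, total_lines: usize, max_debt_density: f64) -> bool {
    debt_density(debt_items, total_lines) <= max_debt_density
}
```

For example, 150 debt items in 5,000 lines and 1,500 items in 50,000 lines both give a density of 30, so both pass a limit of 50.0.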
Optional Metrics
[thresholds.validation]
# Minimum test coverage percentage (default: 0.0 = disabled)
min_coverage_percentage = 80.0
# Safety net for total debt score (default: 10000)
max_total_debt_score = 5000
Using Validation in CI/CD
# Run validation (exits with error if thresholds exceeded)
debtmap validate . --config .debtmap.toml
# Example CI/CD workflow
debtmap analyze . --output report.json
debtmap validate . --config .debtmap.toml || exit 1
CI/CD Best Practices:
- Start with lenient thresholds to establish baseline
- Gradually tighten thresholds as you pay down debt
- Use max_debt_density for a stable quality metric
- Track trends over time, not just point-in-time values
Tuning Guidelines
How to choose and adjust thresholds for your project:
1. Start with Defaults
Begin with the balanced preset to understand your codebase:
debtmap analyze .
Review the output to see what gets flagged and what doesn’t.
2. Run Baseline Analysis
Understand your current state:
# Analyze and save results
debtmap analyze . --output baseline.json
# Review high-priority items
debtmap analyze . --top 20
3. Adjust Based on Project Type
New Projects:
- Use strict preset to enforce high quality from the start
- Prevents accumulation of technical debt
Typical Projects:
- Use balanced preset (recommended)
- Good middle ground for most teams
Legacy Codebases:
- Use lenient preset initially
- Focus on worst offenders first
- Gradually tighten thresholds as you refactor
4. Fine-Tune in Configuration File
Create .debtmap.toml and adjust specific thresholds:
# Initialize config file
debtmap init
# Edit .debtmap.toml
# Adjust thresholds based on your baseline analysis
5. Validate and Iterate
# Test your thresholds
debtmap validate . --config .debtmap.toml
# Adjust if needed
# Iterate until you find the right balance
Troubleshooting Threshold Configuration
Too many false positives?
- Increase thresholds or switch to lenient preset
- Check if role multipliers are appropriate
- Review god object thresholds for your language
Missing important issues?
- Decrease thresholds or switch to strict preset
- Verify .debtmap.toml is being loaded
- Check for suppression patterns hiding issues
Different standards for tests?
- Don’t worry - role multipliers automatically handle this
- Test functions get 2-3x multiplier by default
Inconsistent results?
- Ensure .debtmap.toml is in project root
- CLI flags override config file - remove them for consistency
- Use --config flag to specify config file explicitly
Examples
Example 1: Quick Analysis with Strict Preset
# Use strict thresholds for new project
debtmap analyze . --threshold-preset strict
Example 2: Custom CLI Thresholds
# Analyze with custom thresholds (no config file)
debtmap analyze . \
--threshold-complexity 15 \
--threshold-duplication 30
Example 3: Project-Specific Configuration
# Initialize configuration
debtmap init
# Creates .debtmap.toml - edit to customize
# Example: Increase complexity threshold to 15
# Run analysis with project config
debtmap analyze .
Example 4: CI/CD Validation
# Create strict validation configuration
cat > .debtmap.toml << EOF
[thresholds.validation]
max_average_complexity = 8.0
max_debt_density = 30.0
max_codebase_risk_score = 6.0
min_coverage_percentage = 75.0
EOF
# Run in CI/CD pipeline
debtmap analyze . --output report.json
debtmap validate . --config .debtmap.toml
Example 5: Gradual Debt Reduction
# Month 1: Start lenient
debtmap analyze . --threshold-preset lenient --output month1.json
# Month 2: Switch to balanced
debtmap analyze . --threshold-preset balanced --output month2.json
# Month 3: Tighten further
debtmap analyze . --threshold-preset strict --output month3.json
# Compare progress
debtmap analyze . --output current.json
# Review trend: month1.json -> month2.json -> month3.json -> current.json
Decision Tree: Choosing Your Preset
Start here: What kind of project are you working on?
│
├─ New project or library
│  └─ Use STRICT preset
│     └─ Prevent debt accumulation from day one
│
├─ Existing production application
│  └─ What's your goal?
│     ├─ Maintain current quality
│     │  └─ Use BALANCED preset
│     │     └─ Good default for most teams
│     │
│     └─ Reduce existing debt gradually
│        └─ Start with LENIENT preset
│           └─ Focus on worst issues first
│           └─ Tighten thresholds over time
│
└─ Legacy codebase or complex domain
   └─ Use LENIENT preset
      └─ Avoid overwhelming with false positives
      └─ Create baseline and improve incrementally
Best Practices
- Start with defaults - Don’t over-customize initially
- Track trends - Monitor debt over time, not just point values
- Be consistent - Use same thresholds across team
- Document choices - Comment your .debtmap.toml to explain custom thresholds
- Automate validation - Run debtmap validate in CI/CD
- Review regularly - Reassess thresholds quarterly
- Gradual tightening - Don’t make thresholds stricter too quickly
- Trust role multipliers - Let Debtmap handle different function types
Related Topics
- Getting Started - Initial setup and first analysis
- CLI Reference - Complete command-line flag documentation
- Configuration - Full .debtmap.toml reference
- Scoring Strategies - How thresholds affect debt scores
- God Object Detection - Deep dive into god object analysis
Suppression Patterns
Debtmap provides flexible suppression mechanisms to help you focus on the technical debt that matters most. You can suppress specific debt items inline with comments, or exclude entire files and functions through configuration.
Why Use Suppressions?
Not all detected technical debt requires immediate action. Suppressions allow you to:
- Focus on priorities: Hide known, accepted debt to see new issues clearly
- Handle false positives: Suppress patterns that don’t apply to your context
- Document decisions: Explain why certain debt is acceptable using reason annotations
- Exclude test code: Ignore complexity in test fixtures and setup functions
Inline Comment Suppression
Debtmap supports four inline comment formats that work with your language’s comment syntax:
Single-Line Suppression
Suppress debt on the same line as the comment:
// debtmap:ignore
// TODO: Implement caching later - performance is acceptable for now

# debtmap:ignore
# FIXME: Refactor this after the Q2 release
The suppression applies to debt detected on the same line as the comment.
Next-Line Suppression
Suppress debt on the line immediately following the comment:
// debtmap:ignore-next-line
fn complex_algorithm() {
    // ...20 lines of complex code...
}

// debtmap:ignore-next-line
function calculateMetrics(data: DataPoint[]): Metrics {
    // ...complex implementation...
}
This format is useful when you want the suppression comment to appear before the code it affects.
Block Suppression
Suppress multiple lines of code between start and end markers:
// debtmap:ignore-start
fn setup_test_environment() {
    // TODO: Add more test cases
    // FIXME: Handle edge cases
    // Complex test setup code...
}
// debtmap:ignore-end

# debtmap:ignore-start
def mock_api_responses():
    # TODO: Add more mock scenarios
    # Multiple lines of mock setup
    pass
# debtmap:ignore-end
Important: Every ignore-start must have a matching ignore-end. Debtmap tracks unclosed blocks and can warn you about them.
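To illustrate what tracking unclosed blocks involves, the sketch below scans source text for start markers that never reach an end marker. It is a simplified stand-in, not Debtmap's parser: `unclosed_block_starts` is a hypothetical function, it matches the marker text anywhere on a line, and it assumes blocks do not nest.

```rust
/// Return the 1-based line numbers of `debtmap:ignore-start` markers that
/// never get a matching `debtmap:ignore-end`.
/// Assumption: blocks do not nest, so a second start before an end leaves
/// the earlier start reported as unclosed.
fn unclosed_block_starts(source: &str) -> Vec<usize> {
    let mut unclosed = Vec::new();
    let mut open_start: Option<usize> = None;
    for (index, line) in source.lines().enumerate() {
        let line_no = index + 1;
        if line.contains("debtmap:ignore-start") {
            // Remember this start; report any start that was still open.
            if let Some(previous) = open_start.replace(line_no) {
                unclosed.push(previous);
            }
        } else if line.contains("debtmap:ignore-end") {
            open_start = None;
        }
    }
    // A start that is still open at end of file is unclosed too.
    unclosed.extend(open_start);
    unclosed
}
```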
Type-Specific Suppression
You can suppress specific types of debt using bracket notation instead of suppressing everything:
Quick Reference: Debt Type Suppression
| Debt Type | Bracket Name(s) | Example | Notes |
|---|---|---|---|
| TODO comments | [todo] | // debtmap:ignore[todo] | Also suppresses TestTodo |
| FIXME comments | [fixme] | // debtmap:ignore[fixme] | |
| Code smells | [smell] or [codesmell] | // debtmap:ignore[smell] | |
| High complexity | [complexity] | // debtmap:ignore[complexity] | Also suppresses TestComplexity |
| Code duplication | [duplication] or [duplicate] | // debtmap:ignore[duplication] | Also suppresses TestDuplication |
| Dependency issues | [dependency] | // debtmap:ignore[dependency] | |
| Error swallowing | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Resource management | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Code organization | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| Test quality | ❌ Not supported | // debtmap:ignore | Use general suppression only |
| All types | [*] | // debtmap:ignore[*] | Wildcard matches everything |
Suppress Specific Types
// debtmap:ignore[todo]
// TODO: This TODO is ignored, but FIXMEs and complexity are still reported

// debtmap:ignore[todo,fixme]
// TODO: Both TODOs and FIXMEs are ignored here
// FIXME: But complexity issues would still be detected
Supported Debt Types
You can suppress the following debt types by name in bracket notation:
Currently Supported:
- todo - TODO comments (also covers test-specific TODOs)
- fixme - FIXME comments
- smell or codesmell - Code smell patterns
- complexity - High cognitive complexity (also covers test complexity)
- duplication or duplicate - Code duplication (also covers test duplication)
- dependency - Dependency issues
- * - All types (wildcard)
Auto-Detected Types (cannot be suppressed by name):
The following debt types are detected by code analysis rather than comment scanning. These types cannot be suppressed using bracket notation like [error_swallowing] because they are not included in the suppression parser’s type mapping.
Why bracket notation doesn’t work: The suppression parser only recognizes specific type names in its internal mapping (DEBT_TYPE_MAP): todo, fixme, smell/codesmell, complexity, duplication/duplicate, and dependency. Types detected through AST analysis (like error swallowing and resource management) don’t have string identifiers in the parser. To suppress these, use the general debtmap:ignore marker without brackets:
- error_swallowing - Error handling issues (empty catch blocks, ignored errors)
- resource_management - Resource cleanup issues (file handles, connections)
- code_organization - Structural issues (god objects, large classes)
Example:
// ✅ Correct: General suppression without brackets
// debtmap:ignore -- Intentional empty catch for cleanup
match result {
    Err(_) => {} // Empty catch block
    Ok(v) => process(v)
}
// ❌ Wrong: Bracket notation not supported for auto-detected types
// debtmap:ignore[error_swallowing]
Test-Specific Debt Types:
Test-specific variants like TestComplexity, TestTodo, TestDuplication, and TestQuality are suppressed through their base types:
- TestComplexity → suppressed with [complexity]
- TestTodo → suppressed with [todo]
- TestDuplication → suppressed with [duplication]
- TestQuality → suppressed with general debtmap:ignore (no bracket notation)
Example:
#[cfg(test)]
mod tests {
    // Suppresses both Complexity and TestComplexity
    // debtmap:ignore[complexity] -- Complex test setup acceptable
    fn setup_test_environment() {
        // Complex test initialization
    }
    // debtmap:ignore[todo] -- Suppresses both Todo and TestTodo
    // TODO: Add more test cases
    fn test_feature() { }
}
Wildcard Suppression
Use [*] to explicitly suppress all types (equivalent to no bracket notation):
// debtmap:ignore[*]
// Suppresses all debt types
Type-Specific Blocks
Block suppressions also support type filtering:
// debtmap:ignore-start[complexity]
fn intentionally_complex_for_performance() {
    // Complex nested logic is intentional here
    // Complexity warnings suppressed, but TODOs still detected
}
// debtmap:ignore-end
Suppression Reasons
Document why you’re suppressing debt using the -- separator:
// debtmap:ignore -- Intentional for backward compatibility
// TODO: Remove this after all clients upgrade to v2.0

# debtmap:ignore[complexity] -- Performance-critical hot path
def optimize_query(params):
    # Complex but necessary for performance
    pass

// debtmap:ignore-next-line -- Waiting on upstream library fix
function workaroundBug() {
    // FIXME: Remove when library v3.0 is released
}
Best Practice: Always include reasons for suppressions. This helps future maintainers understand the context and know when suppressions can be removed.
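The marker syntax shown so far (optional bracketed types followed by an optional `-- reason`) can be parsed with a few string operations. The sketch below is an illustration under stated assumptions, not Debtmap's parser. `Suppression` and `parse_suppression` are hypothetical names, and only the plain `debtmap:ignore` form is handled.

```rust
/// Parsed form of a `debtmap:ignore` comment (hypothetical type).
#[derive(Debug, PartialEq)]
struct Suppression {
    /// Debt type names from the brackets; empty means no type filter.
    types: Vec<String>,
    /// Text after the `--` separator, if any.
    reason: Option<String>,
}

/// Parse `debtmap:ignore`, optionally followed by `[types]` and `-- reason`.
/// The `-next-line`, `-start`, and `-end` variants are deliberately skipped.
fn parse_suppression(comment: &str) -> Option<Suppression> {
    let marker = "debtmap:ignore";
    let start = comment.find(marker)?;
    let mut rest = &comment[start + marker.len()..];
    let other_forms = ["-next-line", "-start", "-end"];
    if other_forms.iter().any(|suffix| rest.starts_with(*suffix)) {
        return None;
    }
    let mut types = Vec::new();
    if let Some(after_bracket) = rest.strip_prefix('[') {
        // An opening bracket without a closing one is treated as invalid.
        let close = after_bracket.find(']')?;
        types = after_bracket[..close]
            .split(',')
            .map(|name| name.trim().to_lowercase())
            .filter(|name| !name.is_empty())
            .collect();
        rest = &after_bracket[close + 1..];
    }
    let reason = rest
        .trim_start()
        .strip_prefix("--")
        .map(|text| text.trim().to_string())
        .filter(|text| !text.is_empty());
    Some(Suppression { types, reason })
}
```

Keeping the reason as a separate field makes it straightforward to list every suppression that lacks one, which supports the best practice above.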
Config File Exclusions
For broader exclusions, use the [ignore] section in .debtmap.toml:
File Pattern Exclusions
[ignore]
patterns = [
"target/**", # Build artifacts
"node_modules/**", # Dependencies
"**/*_test.rs", # Test files with _test suffix
"tests/**", # All test directories
"**/fixtures/**", # Test fixtures
"**/mocks/**", # Mock implementations
"**/*.min.js", # Minified files
"**/demo/**", # Demo code
"**/*.generated.rs", # Generated files
"vendor/**", # Vendor code
"third_party/**", # Third-party code
]
Function Name Exclusions (Planned)
Note: Function-level exclusions by name pattern are not yet implemented. This is a planned feature for a future release.
When implemented, you will be able to exclude entire function families by name pattern:
# Planned feature - not yet available
[ignore.functions]
patterns = [
# Test setup functions
"setup_test_*",
"teardown_test_*",
"create_test_*",
"mock_*",
# Generated code
"derive_*",
"__*", # Python dunder methods
# CLI parsing (naturally complex)
"parse_args",
"parse_cli",
"build_cli",
# Serialization (naturally complex pattern matching)
"serialize_*",
"deserialize_*",
"to_json",
"from_json",
]
Current workaround: Use inline suppression comments (debtmap:ignore) for specific functions, or use file pattern exclusions to exclude entire test files.
Glob Pattern Syntax
File patterns use standard glob syntax:
| Pattern | Matches | Example |
|---|---|---|
| `*` | Any characters within a path component | `*.rs` matches `main.rs` |
| `**` | Any directories (recursive) | `tests/**` matches `tests/unit/foo.rs` |
| `?` | Single character | `test?.rs` matches `test1.rs` |
| `[abc]` | Character class | `test[123].rs` matches `test1.rs` |
| `[!abc]` | Negated class | `test[!0].rs` matches `test1.rs` but not `test0.rs` |
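When a pattern does not behave as expected, it can help to check it against the rules in the table outside of Debtmap. The sketch below translates a glob into a regular expression following those rules only. `glob_to_regex` is a hypothetical helper, and Debtmap's own matcher may differ in edge cases such as hidden files or path separators.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into a regex: * stays inside one path component, ** crosses directories."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")      # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")            # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")         # anything except a path separator
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")          # exactly one non-separator character
            i += 1
        elif pattern[i] == "[" and "]" in pattern[i + 1:]:
            end = pattern.index("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]     # [!abc] becomes the regex class [^abc]
            parts.append("[" + body + "]")
            i = end + 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def matches(pattern: str, path: str) -> bool:
    return glob_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    for pattern, path in [("tests/**", "tests/unit/foo.rs"), ("test[!0].rs", "test0.rs")]:
        print(pattern, path, matches(pattern, path))
```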
Glob Pattern Examples
[ignore]
patterns = [
"src/**/*_generated.rs", # Generated files in any subdirectory
"**/test_*.py", # Python test files anywhere
"legacy/**/[!i]*.js", # Legacy JS files not starting with 'i'
"**/*.min.js", # Minified JavaScript
"**/*.min.css", # Minified CSS
]
Note: Brace expansion (e.g., *.{js,css}) is not supported. Use separate patterns for each file extension.
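If you keep patterns in a shorthand with braces elsewhere (for example in a shared template), one option is to expand them before writing `.debtmap.toml`. The following is a small sketch of that idea. `expand_braces` is a hypothetical helper, not part of Debtmap, and it only handles simple, non-nested groups.

```python
import re
from typing import List

_BRACE_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> List[str]:
    """Expand each {a,b} group into separate patterns, left to right."""
    match = _BRACE_GROUP.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


if __name__ == "__main__":
    for line in expand_braces("**/*.min.{js,css}"):
        print(f'"{line}",')
```

The printed lines can be pasted into the `patterns` array as separate entries.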
Language-Specific Comment Syntax
Debtmap automatically uses the correct comment syntax for each language:
| Language | Comment Prefix | Example |
|---|---|---|
| Rust | // | // debtmap:ignore |
| JavaScript | // | // debtmap:ignore |
| TypeScript | // | // debtmap:ignore |
| Python | # | # debtmap:ignore |
| Other languages | // | // debtmap:ignore |
Note: Languages not explicitly listed use // as the default comment prefix.
You don’t need to configure this—Debtmap detects the language and uses the appropriate syntax.
Explicitly Specified Files
Important behavior: When you analyze a specific file directly, ignore patterns are bypassed:
# Respects [ignore] patterns in .debtmap.toml
debtmap analyze .
debtmap analyze src/
# Bypasses ignore patterns - analyzes the file even if patterns would exclude it
debtmap analyze src/test_helper.rs
This ensures you can always analyze specific files when needed, even if they match an ignore pattern.
Suppression Statistics
Debtmap internally tracks suppression usage during analysis:
- Total suppressions: Count of active suppressions across all files
- Suppressions by type: How many of each debt type are suppressed
- Unclosed blocks: Detection of ignore-start without a matching ignore-end
Current Status: These statistics are computed during analysis (via the SuppressionContext::get_stats() method) but are not currently displayed in any output format. The SuppressionStats struct exists and tracks all metrics, but there is no user-facing command or report format that exposes them. Future releases may add a dedicated command to view suppression metrics.
Auditing Suppressions Now: You can audit your suppressions using standard tools:
# Find all suppressions in Rust code
rg "debtmap:ignore" --type rust
# Count suppressions by type
rg "debtmap:ignore\[" --type rust | grep -o "\[.*\]" | sort | uniq -c
# Find unclosed blocks: compare marker counts per file (a mismatch suggests a missing ignore-end)
rg -c "debtmap:ignore-start" --type rust
rg -c "debtmap:ignore-end" --type rust
# List files with suppressions
rg "debtmap:ignore" --files-with-matches
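Counting markers shows that something is unbalanced but not where. As a rough complement, the Python sketch below reports the line numbers of `ignore-start` markers that are never closed. `find_unclosed_starts` is a hypothetical helper. It assumes each `ignore-end` closes the most recent open `ignore-start`, which may not match how Debtmap pairs markers internally.

```python
import re
from typing import List

_START = re.compile(r"debtmap:ignore-start\b")
_END = re.compile(r"debtmap:ignore-end\b")


def find_unclosed_starts(text: str) -> List[int]:
    """Return 1-based line numbers of ignore-start markers with no later ignore-end."""
    open_starts: List[int] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if _START.search(line):
            open_starts.append(lineno)
        elif _END.search(line) and open_starts:
            open_starts.pop()  # assumption: an end marker closes the nearest open start
    return open_starts


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno in find_unclosed_starts(handle.read()):
                print(f"{path}:{lineno}: ignore-start without ignore-end")
```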
Best Practices
Use Suppressions Sparingly
Suppressions hide information, so use them intentionally:
✅ Good use cases:
- Test fixtures and mock data
- Known technical debt with an accepted timeline
- Intentional complexity for performance
- False positives specific to your domain
❌ Poor use cases:
- Hiding all debt to make reports look clean
- Suppressing instead of fixing simple issues
- Using wildcards when specific types would work
Always Include Reasons
// ✅ Good: Clear reason and timeline
// debtmap:ignore[complexity] -- Hot path optimization, profiled and necessary
fn fast_algorithm() { }

// ❌ Bad: No context for future maintainers
// debtmap:ignore
fn fast_algorithm() { }
Prefer Specific Over Broad
// ✅ Good: Only suppress the specific debt type
// debtmap:ignore[todo] -- Remove after v2.0 migration
// TODO: Migrate to new API

// ❌ Bad: Suppresses everything, including real issues
// debtmap:ignore
// TODO: Migrate to new API
Use Config for Systematic Exclusions
For patterns that apply project-wide, use .debtmap.toml instead of inline comments:
# ✅ Good: One config applies to all test files
[ignore]
patterns = ["tests/**"]
# ❌ Bad: Repetitive inline suppressions in every test file
Review Suppressions Periodically
Suppressions can become outdated:
- Remove suppressions when the reason no longer applies
- Check if suppressed debt can now be fixed
- Verify reasons are still accurate after refactoring
Solution: Periodically search for suppressions:
rg "debtmap:ignore" --type rust
Ensure Blocks Are Closed
// ✅ Good: Properly closed block
// debtmap:ignore-start
fn test_setup() { }
// debtmap:ignore-end

// ❌ Bad: Unclosed block affects all subsequent code
// debtmap:ignore-start
fn test_setup() { }
// (missing ignore-end)
Debtmap detects unclosed blocks and can warn you about them.
Common Patterns
Suppressing Test Code
# In .debtmap.toml
[ignore]
patterns = [
"tests/**/*",
"**/*_test.rs",
"**/test_*.py",
"**/fixtures/**",
]
For test functions within production files, use inline suppressions:
#[cfg(test)]
mod tests {
    // debtmap:ignore-start -- Test code
    fn setup_test_environment() { }
    // debtmap:ignore-end
}
Suppressing Generated Code
[ignore]
patterns = [
"**/*_generated.*",
"**/proto/**",
"**/bindings/**",
]
Temporary Suppressions with Timeline
// debtmap:ignore[complexity] -- TODO: Refactor during Q2 2025 sprint
fn legacy_payment_processor() {
    // Complex legacy code scheduled for refactoring
}
Suppressing False Positives
# debtmap:ignore[duplication] -- Similar but semantically different
def calculate_tax_us():
# US tax calculation
pass
# debtmap:ignore[duplication] -- Similar but semantically different
def calculate_tax_eu():
# EU tax calculation with different rules
pass
Conditional Suppression
#[cfg(test)]
// debtmap:ignore[complexity]
fn test_helper() {
    // Complex test setup is acceptable
}
Suppression with Detailed Justification
// debtmap:ignore[complexity] -- Required by specification XYZ-123
// This function implements the state machine defined in spec XYZ-123.
// Complexity is inherent to the specification and cannot be reduced
// without violating requirements.
fn state_machine() { ... }
Troubleshooting
Suppression Not Working
- Check comment syntax: Ensure you’re using the correct comment prefix for your language (// for Rust/JS/TS, # for Python)
- Verify spelling: It’s debtmap:ignore, not debtmap-ignore or debtmap_ignore
- Check line matching: For same-line suppressions, ensure the debt is on the same line as the comment
- Verify type names: Use todo, fixme, complexity, etc. (lowercase)
Common syntax errors:
// Wrong: debtmap: ignore (space after colon)
// Right: debtmap:ignore

// Wrong: debtmap:ignore[Complexity] (capital C)
// Right: debtmap:ignore[complexity]
Check placement:
// Wrong: comment after code
fn function() { } // debtmap:ignore

// Right: comment before code
// debtmap:ignore
fn function() { }
Unclosed Block Warning
If you see warnings about unclosed blocks:
// Problem: Missing ignore-end
// debtmap:ignore-start
fn test_helper() { }
// (Should have debtmap:ignore-end here)

// Solution: Add the closing marker
// debtmap:ignore-start
fn test_helper() { }
// debtmap:ignore-end
File Still Being Analyzed
If a file in your ignore patterns is still being analyzed:
- Check if you’re analyzing the specific file directly (bypasses ignore patterns)
- Verify the glob pattern matches the file path
- Check for typos in the pattern
- Test the pattern in isolation
Test pattern with find:
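# find prefixes results with "./", and its * also matches "/", so this lists every file under tests/
find . -path "./tests/*" -type f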
Use double asterisk for subdirectories:
# Wrong: "tests/*" (only direct children)
# Right: "tests/**/*" (all descendants)
Check relative paths:
# Patterns are relative to project root
patterns = [
"src/legacy/**", # ✓ Correct
"/src/legacy/**", # ✗ Wrong (absolute path)
]
Function Suppression Not Working
Function-level exclusions by name pattern are not yet implemented. To suppress specific functions:
- Use inline suppressions: // debtmap:ignore before the function
- Use block suppressions: // debtmap:ignore-start … // debtmap:ignore-end
- Exclude entire files using [ignore] patterns if the functions are in dedicated files
Related Topics
- Configuration - Full .debtmap.toml reference
- CLI Reference - Command-line analysis options
- Analysis Guide - Understanding debt detection
- Output Formats - Viewing suppressed items in reports
Summary
Suppressions help you focus on actionable technical debt:
- Inline comments: debtmap:ignore, ignore-next-line, ignore-start/end
- Type-specific: Use [type1,type2] to suppress selectively
- Reasons: Always use -- reason to document why
- Config patterns: Use .debtmap.toml for systematic file exclusions
- Best practices: Use sparingly, prefer specific over broad, review periodically
With proper use of suppressions, your Debtmap reports show only the debt that matters to your team.
Output Formats
Debtmap provides multiple output formats to suit different workflows, from interactive terminal reports to machine-readable JSON for CI/CD integration. This chapter covers all available formats and how to use them effectively.
Format Selection
Select the output format using the -f or --format flag:
# Terminal output (default) - human-readable with colors
debtmap analyze .
# JSON output - machine-readable for tooling
debtmap analyze . --format json
# Markdown output - documentation and reports
debtmap analyze . --format markdown
Available formats:
- terminal (default): Interactive output with colors, emoji, and formatting
- json: Structured data for programmatic processing
- markdown: Reports suitable for documentation and PR comments
Note: The codebase also includes an Html format variant in src/core/types.rs and src/analysis/diagnostics/, but this is only available internally for diagnostic reporting and is not exposed as a CLI option. For HTML output, convert markdown reports using tools like pandoc (see Rendering to HTML/PDF).
Writing to Files
By default, output goes to stdout. Use -o or --output to write to a file:
# Write JSON to file
debtmap analyze . --format json -o report.json
# Write markdown report
debtmap analyze . --format markdown -o DEBT_REPORT.md
# Terminal output to file (preserves colors)
debtmap analyze . -o analysis.txt
Terminal Output
The terminal format provides an interactive, color-coded report designed for developer workflows. It’s the default format and optimized for readability.
Output Structure
Terminal output is organized into five main sections:
- Header - Analysis report title
- Codebase Summary - High-level metrics and debt score
- Complexity Hotspots - Top 5 most complex functions with refactoring guidance
- Technical Debt - High-priority debt items requiring attention
- Pass/Fail Status - Overall quality assessment
Example Terminal Output
═══════════════════════════════════════════
DEBTMAP ANALYSIS REPORT
═══════════════════════════════════════════
📊 CODEBASE Summary
───────────────────────────────────────────
Files analyzed: 42
Total functions: 287
Average complexity: 6.3
Debt items: 15
Total debt score: 156 (threshold: 100)
⚠️ COMPLEXITY HOTSPOTS (Top 5)
───────────────────────────────────────────
1. src/analyzers/rust.rs:245 parse_function() - Cyclomatic: 18, Cognitive: 24
ACTION: Extract 3-5 pure functions using decompose-then-transform strategy
PATTERNS: Decompose into logical units, then apply functional patterns
BENEFIT: Pure functions are easily testable and composable
2. src/debt/smells.rs:196 detect_data_clumps() - Cyclomatic: 15, Cognitive: 20
↓ Entropy: 0.32, Repetition: 85%, Effective: 0.6x
High pattern repetition detected (85%)
🔧 TECHNICAL DEBT (15 items)
───────────────────────────────────────────
High Priority (5):
- src/risk/scoring.rs:142 - TODO: Implement caching for score calculations
- src/core/metrics.rs:89 - High complexity: cyclomatic=16
- src/debt/patterns.rs:201 - Code duplication: 65 lines duplicated
✓ Pass/Fail: PASS
Color Coding and Symbols
The terminal output uses colors and symbols for quick visual scanning:
Status Indicators:
- ✓ Green: Passing, good, well-tested
- ⚠️ Yellow: Warning, moderate complexity
- ✗ Red: Failing, critical, high complexity
- 📊 Blue: Information, metrics
- 🔧 Orange: Technical debt items
- 🎯 Cyan: Recommendations
Complexity Classification:
- LOW (0-5): Green - Simple, easy to maintain
- MODERATE (6-10): Yellow - Consider refactoring
- HIGH (11-15): Orange - Should refactor
- SEVERE (>15): Red - Urgent refactoring needed
Note: These levels match the ComplexityLevel enum in the implementation.
Debt Score Thresholds:
The default debt threshold is 100. Scores are colored based on this threshold:
- Green (≤50): Healthy - Below half threshold (score ≤ threshold/2)
- Yellow (51-100): Attention needed - Between half and full threshold (threshold/2 < score ≤ threshold)
- Red (>100): Action required - Exceeds threshold (score > threshold)
Note: Upper boundaries are inclusive: a score of 50 is Green, 100 is Yellow (not Red), and anything above 100 is Red.
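The colouring rule is easy to restate in code when you want the same bands in your own dashboards. The sketch below follows the three ranges listed above. `debt_score_color` is a hypothetical helper name, not a Debtmap API.

```python
def debt_score_color(score: float, threshold: float = 100.0) -> str:
    """Map a total debt score to the colour band described in this section."""
    if score <= threshold / 2:
        return "green"   # healthy: at or below half the threshold
    if score <= threshold:
        return "yellow"  # attention needed: above half, up to the threshold
    return "red"         # action required: above the threshold


if __name__ == "__main__":
    for score in (50, 51, 100, 101, 156):
        print(score, debt_score_color(score))
```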
Refactoring Guidance
For complex functions (cyclomatic complexity > 5), the terminal output provides actionable refactoring recommendations:
ACTION: Extract 3-5 pure functions using decompose-then-transform strategy
PATTERNS: Decompose into logical units, then apply functional patterns
BENEFIT: Pure functions are easily testable and composable
Guidance levels:
- Moderate (6-10): Extract 2-3 pure functions using direct functional transformation
- High (11-15): Extract 3-5 pure functions using decompose-then-transform strategy
- Severe (>15): Extract 5+ pure functions into modules with functional core/imperative shell
See the Analysis Guide for metric explanations.
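As an illustration of how the complexity bands and guidance levels line up, here is a short Python sketch built only from the ranges listed in this chapter. The function and dictionary names are hypothetical, and the actual selection logic inside Debtmap may take more inputs than cyclomatic complexity.

```python
from typing import Optional


def complexity_level(cyclomatic: int) -> str:
    """Classify a cyclomatic complexity value into the documented bands."""
    if cyclomatic <= 5:
        return "LOW"
    if cyclomatic <= 10:
        return "MODERATE"
    if cyclomatic <= 15:
        return "HIGH"
    return "SEVERE"


# Guidance text copied from the list above; LOW has no entry because no guidance is shown.
GUIDANCE = {
    "MODERATE": "Extract 2-3 pure functions using direct functional transformation",
    "HIGH": "Extract 3-5 pure functions using decompose-then-transform strategy",
    "SEVERE": "Extract 5+ pure functions into modules with functional core/imperative shell",
}


def refactoring_guidance(cyclomatic: int) -> Optional[str]:
    """Return the guidance line for a function, or None when complexity is 5 or lower."""
    return GUIDANCE.get(complexity_level(cyclomatic))


if __name__ == "__main__":
    for value in (4, 8, 12, 18):
        print(value, complexity_level(value), refactoring_guidance(value))
```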
Plain Terminal Mode
For environments without color support or when piping to tools, use --plain:
# ASCII-only output, no colors
debtmap analyze . --plain
Plain mode:
- Removes ANSI color codes
- Uses ASCII characters in place of Unicode box-drawing characters
- Machine-parseable structure
Note: Terminal output formatting is controlled internally via FormattingConfig (found in src/formatting and src/io/writers/terminal.rs), which manages color mode settings. The --plain flag and environment variables provide user-facing control over these settings:
- --plain flag - Disables colors and fancy formatting
- NO_COLOR=1 - Disables colors (per no-color.org standard)
- CLICOLOR=0 - Disables colors
- CLICOLOR_FORCE=1 - Forces colors even when output is not a terminal
FormattingConfig is not directly exposed to CLI users but can be accessed when using debtmap as a library through TerminalWriter::with_formatting.
Verbosity Levels
Control detail level with -v flags (can be repeated):
# Standard output
debtmap analyze .
# Level 1: Show main score factors
debtmap analyze . -v
# Level 2: Show detailed calculations
debtmap analyze . -vv
# Level 3: Show all debug information
debtmap analyze . -vvv
Verbosity features:
- -v: Show main score factors (complexity, coverage, dependency breakdown)
- -vv: Show detailed calculations with formulas and intermediate values
- -vvv: Show all debug information including entropy metrics and role detection
Note: Verbosity flags affect terminal output only. JSON and markdown formats include all data regardless of verbosity level.
Each level includes all information from the previous levels, progressively adding more detail to help understand how scores are calculated.
Example Output Differences:
Standard output shows basic metrics:
Total debt score: 156 (threshold: 100)
Level 1 (-v) adds score breakdowns:
Total debt score: 156 (threshold: 100)
Complexity contribution: 85 (54%)
Coverage gaps: 45 (29%)
Dependency issues: 26 (17%)
Level 2 (-vv) adds detailed calculations:
Total debt score: 156 (threshold: 100)
Complexity contribution: 85 (54%)
Formula: sum(cyclomatic_weight * severity_multiplier)
High complexity functions: 5 × 12 = 60
Medium complexity: 8 × 3 = 24
Base penalty: 1
Coverage gaps: 45 (29%)
Uncovered complex functions: 3 × 15 = 45
Level 3 (-vvv) adds all internal details:
Total debt score: 156 (threshold: 100)
... (all level 2 output) ...
Debug info:
Entropy metrics analyzed: 42/50 functions
Function role detection: BusinessLogic=12, Utility=8, TestHelper=5
Parse time: 245ms
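The percentages in the sample breakdown can be reproduced with simple arithmetic, which is a useful sanity check when reading -v output. The sketch below uses only the numbers from the example above. The variable names are illustrative and nothing here calls Debtmap.

```python
# Contributions taken from the sample -v output above.
contributions = {"complexity": 85, "coverage": 45, "dependency": 26}

total = sum(contributions.values())

# Share of the total debt score per factor, rounded to whole percent.
shares = {name: round(100 * value / total) for name, value in contributions.items()}

# The -vv example splits the complexity contribution into three terms.
complexity_terms = 5 * 12 + 8 * 3 + 1

if __name__ == "__main__":
    print("total:", total)
    print("shares:", shares)
    print("complexity terms:", complexity_terms)
```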
Understanding Metrics
To get detailed explanations of how metrics are calculated, use the --explain-metrics flag:
# Get explanations of metric definitions and formulas
debtmap analyze . --explain-metrics
This flag provides:
- Metric definitions - Detailed explanations of what each metric measures
- Calculation formulas - How scores are computed from raw data
- Measured vs estimated - Which metrics are exact and which are heuristic-based
- Interpretation guidance - How to understand and act on metric values
The explanations appear inline with the analysis output, helping you understand:
- What cyclomatic and cognitive complexity measure
- How debt scores are calculated
- What entropy metrics indicate
- How risk scores are determined
This is particularly useful when:
- Learning how debtmap evaluates code quality
- Understanding why certain functions have high scores
- Explaining analysis results to team members
- Tuning thresholds based on metric meanings
Risk Analysis Output
When coverage data is provided via --lcov, terminal output includes a dedicated risk analysis section:
═══════════════════════════════════════════
RISK ANALYSIS REPORT
═══════════════════════════════════════════
📈 RISK Summary
───────────────────────────────────────────
Codebase Risk Score: 45.5 (MEDIUM)
Complexity-Coverage Correlation: -0.65
Risk Distribution:
Critical: 2 functions
High: 5 functions
Medium: 10 functions
Low: 15 functions
Well Tested: 20 functions
🎯 CRITICAL RISKS
───────────────────────────────────────────
1. src/core/parser.rs:142 parse_complex_ast()
Risk: 85.0 | Complexity: 15 | Coverage: 0%
Recommendation: Add 5 unit tests (est: 2-3 hours)
Impact: -40 risk reduction
💡 RECOMMENDATIONS (by ROI)
───────────────────────────────────────────
1. test_me() - ROI: 5.0x
Current Risk: 75 | Reduction: 40 | Effort: Moderate
Rationale: High risk function with low coverage
Risk Level Classification:
- LOW (<30): Green - score < 30.0
- MEDIUM (30-59): Yellow - 30.0 ≤ score < 60.0
- HIGH (≥60): Red - score ≥ 60.0
Note: 60 is the start of HIGH risk level.
JSON Output
JSON output provides complete analysis results in a machine-readable format, ideal for CI/CD pipelines, custom tooling, and programmatic analysis.
Basic Usage
# Generate JSON output
debtmap analyze . --format json
# Save to file
debtmap analyze . --format json -o report.json
# Pretty-printed by default for readability
debtmap analyze . --format json | jq .
Note: JSON output is automatically pretty-printed for readability.
JSON Schema Structure
Debtmap outputs a structured JSON document with the following top-level fields:
{
"project_path": "/path/to/project",
"timestamp": "2025-01-09T12:00:00Z",
"complexity": { ... },
"technical_debt": { ... },
"dependencies": { ... },
"duplications": [ ... ]
}
Full Schema Example
Here’s a complete annotated JSON output example:
{
// Project metadata
"project_path": "/Users/dev/myproject",
"timestamp": "2025-01-09T15:30:00Z",
// Complexity analysis results
"complexity": {
"metrics": [
{
"name": "calculate_risk_score",
"file": "src/risk/scoring.rs",
"line": 142,
"cyclomatic": 12,
"cognitive": 18,
"nesting": 4,
"length": 85,
"is_test": false,
"visibility": "pub",
"is_trait_method": false,
"in_test_module": false,
"entropy_score": {
"token_entropy": 0.65,
"pattern_repetition": 0.30,
"branch_similarity": 0.45,
"effective_complexity": 0.85
},
"is_pure": false,
"purity_confidence": 0.75,
"detected_patterns": ["nested_loops", "complex_conditionals"],
"upstream_callers": ["analyze_codebase", "generate_report"],
"downstream_callees": ["get_metrics", "apply_weights"]
}
],
"summary": {
"total_functions": 287,
"average_complexity": 6.3,
"max_complexity": 24,
"high_complexity_count": 12
}
},
// Technical debt items
"technical_debt": {
"items": [
{
"id": "debt_001",
"debt_type": "Complexity",
"priority": "High",
"file": "src/analyzers/rust.rs",
"line": 245,
"column": 5,
"message": "High cyclomatic complexity: 18",
"context": "Function parse_function has excessive branching"
},
{
"id": "debt_002",
"debt_type": "Todo",
"priority": "Medium",
"file": "src/core/cache.rs",
"line": 89,
"column": null,
"message": "TODO: Implement LRU eviction policy",
"context": null
}
],
"by_type": {
"Complexity": [ /* same structure as items */ ],
"Todo": [ /* ... */ ],
"Duplication": [ /* ... */ ]
},
"priorities": ["Low", "Medium", "High", "Critical"]
},
// Dependency analysis
"dependencies": {
"modules": [
{
"module": "risk::scoring",
"dependencies": ["core::metrics", "debt::patterns"],
"dependents": ["commands::analyze", "io::output"]
}
],
"circular": [
{
"cycle": ["module_a", "module_b", "module_c", "module_a"]
}
]
},
// Code duplication blocks
"duplications": [
{
"hash": "abc123def456",
"lines": 15,
"locations": [
{
"file": "src/parser/rust.rs",
"start_line": 42,
"end_line": 57
},
{
"file": "src/parser/python.rs",
"start_line": 89,
"end_line": 104
}
]
}
]
}
Field Descriptions
FunctionMetrics Fields:
- name: Function name
- file: Path to source file
- line: Line number where function is defined
- cyclomatic: Cyclomatic complexity score
- cognitive: Cognitive complexity score
- nesting: Maximum nesting depth
- length: Lines of code in function
- is_test: Whether this is a test function
- visibility: Rust visibility modifier (pub, pub(crate), or null)
- is_trait_method: Whether this implements a trait
- in_test_module: Whether inside #[cfg(test)]
- entropy_score: Optional entropy analysis with structure:
  {
    "token_entropy": 0.65,        // Token distribution entropy (0-1): measures variety of tokens
    "pattern_repetition": 0.30,   // Pattern repetition score (0-1): detects repeated code patterns
    "branch_similarity": 0.45,    // Branch similarity metric (0-1): compares similarity between branches
    "effective_complexity": 0.85  // Adjusted complexity multiplier: complexity adjusted for entropy
  }
  EntropyScore Fields:
  - token_entropy: Measures the variety and distribution of tokens in the function (0-1, higher = more variety)
  - pattern_repetition: Detects repeated code patterns within the function (0-1, higher = more repetition)
  - branch_similarity: Measures similarity between different code branches (0-1, higher = more similar)
  - effective_complexity: The overall complexity multiplier adjusted for entropy effects
- is_pure: Whether function is pure (no side effects)
- purity_confidence: Confidence level (0.0-1.0)
- detected_patterns: List of detected code patterns
- upstream_callers: Functions that call this one
- downstream_callees: Functions this one calls
DebtItem Fields:
- id: Unique identifier
- debt_type: Type of debt (see DebtType enum below)
- priority: Priority level (Low, Medium, High, Critical)
- file: Path to file containing debt
- line: Line number
- column: Optional column number
- message: Human-readable description
- context: Optional additional context
DebtType Enum:
- Todo: TODO markers
- Fixme: FIXME markers
- CodeSmell: Code smell patterns
- Duplication: Duplicated code
- Complexity: Excessive complexity
- Dependency: Dependency issues
- ErrorSwallowing: Suppressed errors
- ResourceManagement: Resource management issues
- CodeOrganization: Organizational problems
- TestComplexity: Complex test code
- TestTodo: TODOs in tests
- TestDuplication: Duplicated test code
- TestQuality: Test quality issues
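To show how the DebtItem fields can be consumed, the following Python sketch counts items per priority and filters by a minimum priority using the `priorities` ordering from the schema example. The embedded sample is a trimmed copy of that example, and `count_by_priority` and `items_at_least` are hypothetical helpers rather than anything Debtmap ships.

```python
import json
from collections import Counter

# Trimmed copy of the technical_debt section from the schema example above.
SAMPLE_REPORT = """
{
  "technical_debt": {
    "items": [
      {"id": "debt_001", "debt_type": "Complexity", "priority": "High",
       "file": "src/analyzers/rust.rs", "line": 245, "column": 5,
       "message": "High cyclomatic complexity: 18",
       "context": "Function parse_function has excessive branching"},
      {"id": "debt_002", "debt_type": "Todo", "priority": "Medium",
       "file": "src/core/cache.rs", "line": 89, "column": null,
       "message": "TODO: Implement LRU eviction policy", "context": null}
    ],
    "priorities": ["Low", "Medium", "High", "Critical"]
  }
}
"""


def count_by_priority(report: dict) -> Counter:
    """Count debt items per priority label."""
    return Counter(item["priority"] for item in report["technical_debt"]["items"])


def items_at_least(report: dict, minimum: str) -> list:
    """Return items whose priority is at or above `minimum` in the report's own ordering."""
    order = report["technical_debt"]["priorities"]
    floor = order.index(minimum)
    return [
        item for item in report["technical_debt"]["items"]
        if order.index(item["priority"]) >= floor
    ]


if __name__ == "__main__":
    report = json.loads(SAMPLE_REPORT)
    print(count_by_priority(report))
    for item in items_at_least(report, "High"):
        print(item["file"], item["line"], item["message"])
```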
JSON Format Variants
Debtmap supports two JSON output formats:
# Legacy format (default) - backward compatible
debtmap analyze . --format json --output-format legacy
# Unified format - new consistent structure
debtmap analyze . --format json --output-format unified
Note: The --output-format flag only applies when using --format json. It has no effect with markdown or terminal formats.
Format Comparison
Legacy format: Uses {File: {...}} and {Function: {...}} wrappers for backward compatibility with existing tooling.
Unified format: Consistent structure with a type field, making parsing simpler and more predictable. Recommended for new integrations.
When to use each format:
- Use legacy format if:
  - You have existing tooling that expects the old structure
  - You need backward compatibility with version 1.x parsers
  - You’re integrating with third-party tools expecting the legacy format
- Use unified format for:
  - All new integrations and tooling
  - Cleaner, more predictable JSON parsing
  - Future-proof implementations
  - Simpler type discrimination in statically-typed languages
Migration strategy:
The legacy format will be maintained for backward compatibility, but unified is the recommended format going forward. If you’re starting a new integration, use unified format from the beginning. If migrating existing tooling:
- Test unified format with a subset of your codebase
- Update parsers to handle the type field instead of key-based discrimination
- Validate results match between legacy and unified formats
- Switch to unified format once validation passes
Structural Differences
Legacy format example:
{
"complexity": {
"metrics": [
{
"File": {
"path": "src/main.rs",
"functions": 12,
"average_complexity": 5.3
}
},
{
"Function": {
"name": "calculate_score",
"file": "src/scoring.rs",
"line": 42,
"cyclomatic": 8
}
}
]
}
}
Unified format example:
{
"complexity": {
"metrics": [
{
"type": "File",
"path": "src/main.rs",
"functions": 12,
"average_complexity": 5.3
},
{
"type": "Function",
"name": "calculate_score",
"file": "src/scoring.rs",
"line": 42,
"cyclomatic": 8
}
]
}
}
Key difference: Legacy uses {File: {...}} wrapper objects, while unified uses a flat structure with "type": "File" field. This makes unified format easier to parse in most programming languages.
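During a migration it can be convenient to normalise both shapes to the unified one so that downstream code handles a single structure. The sketch below does that for the two entry kinds shown in the examples. `to_unified` is a hypothetical helper, and it assumes legacy entries are single-key objects wrapping either `File` or `Function`, as in the example above.

```python
WRAPPER_KINDS = ("File", "Function")


def to_unified(metric: dict) -> dict:
    """Return a metric entry in the flat, type-tagged shape."""
    if "type" in metric:
        return dict(metric)  # already unified; return a copy
    if len(metric) == 1:
        kind, body = next(iter(metric.items()))
        if kind in WRAPPER_KINDS and isinstance(body, dict):
            return {"type": kind, **body}
    raise ValueError(f"unrecognized metric shape: {sorted(metric)}")


if __name__ == "__main__":
    legacy_metrics = [
        {"File": {"path": "src/main.rs", "functions": 12, "average_complexity": 5.3}},
        {"Function": {"name": "calculate_score", "file": "src/scoring.rs",
                      "line": 42, "cyclomatic": 8}},
    ]
    for entry in legacy_metrics:
        print(to_unified(entry))
```

Because already-unified entries pass through unchanged, the same function can sit in front of a parser while both formats are still in use.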
Risk Insights JSON
When coverage data is provided via --lcov, risk insights are included as part of the analysis output. The write_risk_insights method (found in src/io/writers/json.rs, terminal.rs, and markdown/core.rs) outputs risk analysis data in the following JSON structure:
{
"items": [
{
"location": {
"file": "src/risk/scoring.rs",
"function": "calculate_priority",
"line": 66
},
"debt_type": "TestGap",
"unified_score": {
"complexity_factor": 3.2,
"coverage_factor": 10.0,
"dependency_factor": 2.5,
"role_multiplier": 1.2,
"final_score": 9.4
},
"function_role": "BusinessLogic",
"recommendation": {
"action": "Add unit tests",
"details": "Add 6 unit tests for full coverage",
"effort_estimate": "2-3 hours"
},
"expected_impact": {
"risk_reduction": 3.9,
"complexity_reduction": 0,
"coverage_improvement": 100
},
"upstream_dependencies": 0,
"downstream_dependencies": 3,
"nesting_depth": 1,
"function_length": 13
}
],
"call_graph": {
"total_functions": 1523,
"entry_points": 12,
"test_functions": 456,
"max_depth": 8
},
"overall_coverage": 82.3,
"total_impact": {
"risk_reduction": 45.2,
"complexity_reduction": 12.3,
"coverage_improvement": 18.5
}
}
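A common first step with this structure is ranking items by `unified_score.final_score`. The Python sketch below shows one way to do that. The first sample item is abridged from the example above, the second item is invented purely for illustration, and `rank_by_score` is a hypothetical helper.

```python
import json

# First item abridged from the example above; the second is a made-up entry for illustration.
SAMPLE_INSIGHTS = json.loads("""
{
  "items": [
    {"location": {"file": "src/io/example.rs", "function": "hypothetical_fn", "line": 10},
     "debt_type": "TestGap",
     "unified_score": {"final_score": 4.1}},
    {"location": {"file": "src/risk/scoring.rs", "function": "calculate_priority", "line": 66},
     "debt_type": "TestGap",
     "unified_score": {"final_score": 9.4}}
  ]
}
""")


def rank_by_score(insights: dict, limit: int = 5) -> list:
    """Return one summary line per item, highest final_score first."""
    ranked = sorted(
        insights["items"],
        key=lambda item: item["unified_score"]["final_score"],
        reverse=True,
    )
    lines = []
    for item in ranked[:limit]:
        location = item["location"]
        score = item["unified_score"]["final_score"]
        lines.append(
            f"{location['file']}:{location['line']} {location['function']}() score={score:.1f}"
        )
    return lines


if __name__ == "__main__":
    for line in rank_by_score(SAMPLE_INSIGHTS):
        print(line)
```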
Markdown Output
Markdown format generates documentation-friendly reports suitable for README files, PR comments, and technical documentation.
Basic Usage
# Generate markdown report
debtmap analyze . --format markdown
# Save to documentation
debtmap analyze . --format markdown -o docs/DEBT_REPORT.md
Markdown Structure
Markdown output includes:
- Executive Summary - High-level metrics and health dashboard
- Complexity Analysis - Detailed complexity breakdown by file
- Technical Debt - Categorized debt items with priorities
- Dependencies - Module dependencies and circular references
- Recommendations - Prioritized action items
Example Markdown Output
# Debtmap Analysis Report
**Generated:** 2025-01-09 15:30:00 UTC
**Project:** /Users/dev/myproject
## Executive Summary
- **Files Analyzed:** 42
- **Total Functions:** 287
- **Average Complexity:** 6.3
- **Total Debt Items:** 15
- **Debt Score:** 156/100 ⚠️
### Health Dashboard
| Metric | Value | Status |
|--------|-------|--------|
| Complexity | 6.3 avg | ✅ Good |
| Debt Score | 156 | ⚠️ Attention |
| High Priority Items | 5 | ⚠️ Action Needed |
## Complexity Analysis
### Top 5 Complex Functions
| Function | File | Cyclomatic | Cognitive | Priority |
|----------|------|-----------|-----------|----------|
| parse_function | src/analyzers/rust.rs:245 | 18 | 24 | High |
| detect_data_clumps | src/debt/smells.rs:196 | 15 | 20 | Medium |
| analyze_dependencies | src/core/deps.rs:89 | 14 | 18 | Medium |
### Refactoring Recommendations
**src/analyzers/rust.rs:245** - `parse_function()`
- **Complexity:** Cyclomatic: 18, Cognitive: 24
- **Action:** Extract 3-5 pure functions using decompose-then-transform strategy
- **Patterns:** Decompose into logical units, then apply functional patterns
- **Benefit:** Improved testability and maintainability
## Technical Debt
### High Priority (5 items)
- **src/risk/scoring.rs:142** - TODO: Implement caching for score calculations
- **src/core/metrics.rs:89** - High complexity: cyclomatic=16
- **src/debt/patterns.rs:201** - Code duplication: 65 lines duplicated
### Medium Priority (8 items)
...
## Dependencies
### Circular Dependencies
- `risk::scoring` → `core::metrics` → `risk::scoring`
## Recommendations
1. **Refactor parse_function** (High Priority)
- Reduce complexity from 18 to <10
- Extract helper functions
- Estimated effort: 4-6 hours
2. **Add tests for scoring module** (High Priority)
- Current coverage: 35%
- Target coverage: 80%
- Estimated effort: 2-3 hours
CLI vs Library Markdown Features
CLI Markdown Output (--format markdown):
When you use debtmap analyze . --format markdown, you get comprehensive reports that include:
- Executive summary with health dashboard
- Complexity analysis with refactoring recommendations
- Technical debt categorization by priority
- Dependency analysis with circular reference detection
- Actionable recommendations
This uses the base MarkdownWriter implementation and provides everything needed for documentation and PR comments.
Enhanced Library Features:
If you’re using debtmap as a Rust library in your own tools, additional markdown capabilities are available:
- EnhancedMarkdownWriter trait (src/io/writers/markdown/enhanced.rs) - Provides advanced formatting and analysis features
- Enhanced markdown modules (src/io/writers/enhanced_markdown/) - Building blocks for custom visualizations including:
  - Priority-based debt rankings with unified scoring
  - Dead code detection and reporting
  - Call graph insights and dependency visualization
  - Testing recommendations with ROI analysis
To use enhanced features in your Rust code:
use debtmap::io::writers::markdown::enhanced::EnhancedMarkdownWriter;
use debtmap::io::writers::enhanced_markdown::*;

// Create custom reports with enhanced features
let mut writer = create_enhanced_writer(output)?;
writer.write_priority_rankings(&analysis)?;
writer.write_dead_code_analysis(&call_graph)?;
Note: Enhanced markdown features are only available through the library API, not via the CLI. The CLI --format markdown output is comprehensive for most use cases.
Rendering to HTML/PDF
Markdown reports can be converted to other formats:
# Generate markdown
debtmap analyze . --format markdown -o report.md
# Convert to HTML with pandoc
pandoc report.md -o report.html --standalone --css style.css
# Convert to PDF
pandoc report.md -o report.pdf --pdf-engine=xelatex
Tool Integration
CI/CD Pipelines
Debtmap JSON output integrates seamlessly with CI/CD systems.
GitHub Actions
name: Code Quality
on: [pull_request]
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install debtmap
        run: cargo install debtmap
      - name: Run analysis
        run: |
          debtmap analyze . \
            --format json \
            --output analysis.json \
            --lcov coverage/lcov.info
      - name: Check thresholds
        run: |
          DEBT_SCORE=$(jq '.technical_debt.items | length' analysis.json)
          if [ "$DEBT_SCORE" -gt 100 ]; then
            echo "❌ Debt score too high: $DEBT_SCORE"
            exit 1
          fi
      - name: Comment on PR
        uses: actions/github-script@v6
        with:
          script: |
            const fs = require('fs');
            const analysis = JSON.parse(fs.readFileSync('analysis.json'));
            const summary = `## Debtmap Analysis
            - **Debt Items:** ${analysis.technical_debt.items.length}
            - **Average Complexity:** ${analysis.complexity.summary.average_complexity}
            - **High Complexity Functions:** ${analysis.complexity.summary.high_complexity_count}
            `;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: summary
            });
GitLab CI
code_quality:
  stage: test
  script:
    - cargo install debtmap
    - debtmap analyze . --format json --output gl-code-quality.json
    - |
      DEBT=$(jq '.technical_debt.items | length' gl-code-quality.json)
      if [ "$DEBT" -gt 50 ]; then
        echo "Debt threshold exceeded"
        exit 1
      fi
  artifacts:
    reports:
      codequality: gl-code-quality.json
Jenkins Pipeline
pipeline {
agent any
stages {
stage('Analyze') {
steps {
sh 'debtmap analyze . --format json -o report.json'
script {
def json = readJSON file: 'report.json'
def debtScore = json.technical_debt.items.size()
if (debtScore > 100) {
error("Debt score ${debtScore} exceeds threshold")
}
}
}
}
}
post {
always {
archiveArtifacts artifacts: 'report.json'
}
}
}
Querying JSON with jq
Common jq queries for analyzing debtmap output:
# Get total debt items
jq '.technical_debt.items | length' report.json
# Get high-priority items only
jq '.technical_debt.items[] | select(.priority == "High")' report.json
# Get functions with complexity > 10
jq '.complexity.metrics[] | select(.cyclomatic > 10)' report.json
# Calculate average complexity
jq '.complexity.summary.average_complexity' report.json
# Get all TODO items
jq '.technical_debt.items[] | select(.debt_type == "Todo")' report.json
# Get top 5 complex functions
jq '.complexity.metrics | sort_by(-.cyclomatic) | .[0:5] | .[] | {name, file, cyclomatic}' report.json
# Get files with circular dependencies
jq '.dependencies.circular[] | .cycle' report.json
# Count debt items by type
jq '.technical_debt.items | group_by(.debt_type) | map({type: .[0].debt_type, count: length})' report.json
# Get functions with 0% coverage (when using --lcov)
jq '.complexity.metrics[] | select(.coverage == 0)' report.json
# Extract file paths with high debt
jq '.technical_debt.items[] | select(.priority == "High" or .priority == "Critical") | .file' report.json | sort -u
Filtering and Transformation Examples
Python Script to Parse JSON
#!/usr/bin/env python3
import json
import sys

def analyze_debtmap_output(json_file):
    with open(json_file) as f:
        data = json.load(f)

    # Get high-priority items
    high_priority = [
        item for item in data['technical_debt']['items']
        if item['priority'] in ['High', 'Critical']
    ]

    # Group by file
    by_file = {}
    for item in high_priority:
        file = item['file']
        if file not in by_file:
            by_file[file] = []
        by_file[file].append(item)

    # Print summary
    print(f"High-priority debt items: {len(high_priority)}")
    print(f"Files affected: {len(by_file)}")
    print("\nBy file:")
    for file, items in sorted(by_file.items(), key=lambda x: -len(x[1])):
        print(f"  {file}: {len(items)} items")

    return by_file

if __name__ == '__main__':
    analyze_debtmap_output(sys.argv[1])
Shell Script for Threshold Checking
#!/bin/bash
set -e
REPORT="$1"
DEBT_THRESHOLD=100
COMPLEXITY_THRESHOLD=10
# Check debt score
DEBT_SCORE=$(jq '.technical_debt.items | length' "$REPORT")
if [ "$DEBT_SCORE" -gt "$DEBT_THRESHOLD" ]; then
echo "❌ Debt score $DEBT_SCORE exceeds threshold $DEBT_THRESHOLD"
exit 1
fi
# Check average complexity
AVG_COMPLEXITY=$(jq '.complexity.summary.average_complexity' "$REPORT")
if (( $(echo "$AVG_COMPLEXITY > $COMPLEXITY_THRESHOLD" | bc -l) )); then
echo "❌ Average complexity $AVG_COMPLEXITY exceeds threshold $COMPLEXITY_THRESHOLD"
exit 1
fi
echo "✅ All quality checks passed"
echo " Debt score: $DEBT_SCORE/$DEBT_THRESHOLD"
echo " Avg complexity: $AVG_COMPLEXITY"
Editor Integration
VS Code Tasks
Create .vscode/tasks.json:
{
"version": "2.0.0",
"tasks": [
{
"label": "Debtmap: Analyze",
"type": "shell",
"command": "debtmap",
"args": [
"analyze",
".",
"--format",
"terminal"
],
"problemMatcher": [],
"presentation": {
"reveal": "always",
"panel": "new"
}
},
{
"label": "Debtmap: Generate Report",
"type": "shell",
"command": "debtmap",
"args": [
"analyze",
".",
"--format",
"markdown",
"-o",
"DEBT_REPORT.md"
],
"problemMatcher": []
}
]
}
Problem Matcher for VS Code
Parse debtmap output in VS Code’s Problems panel:
{
"problemMatcher": {
"owner": "debtmap",
"fileLocation": "absolute",
"pattern": {
"regexp": "^(.+?):(\\d+):(\\d+)?\\s*-\\s*(.+)$",
"file": 1,
"line": 2,
"column": 3,
"message": 4
}
}
}
Webhook Integration
Send debtmap results to webhooks for notifications:
#!/bin/bash
# Run analysis
debtmap analyze . --format json -o report.json
# Send to Slack
DEBT_SCORE=$(jq '.technical_debt.items | length' report.json)
curl -X POST "$SLACK_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d "{\"text\": \"Debtmap Analysis Complete\n• Debt Score: $DEBT_SCORE\n• High Priority: $(jq '[.technical_debt.items[] | select(.priority == "High")] | length' report.json)\"}"
# Send to custom webhook
curl -X POST "$CUSTOM_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d @report.json
Output Filtering
Debtmap provides several flags to filter and limit output:
Note: Filtering options (--top, --tail, --summary, --filter) apply to all output formats (terminal, JSON, and markdown). Filtering is applied at the analysis level before formatting, so results are consistent across all output types.
Limiting Results
# Show only top 10 priority items
debtmap analyze . --top 10
# Show bottom 5 lowest priority items
debtmap analyze . --tail 5
Priority Filtering
# Show only high and critical priority items
debtmap analyze . --min-priority high
# Filter by specific debt categories
debtmap analyze . --filter Architecture,Testing
Available categories:
- Architecture: God objects, complexity hotspots, dead code
- Testing: Testing gaps, coverage issues
- Performance: Resource leaks, inefficient patterns
- CodeQuality: Code smells, maintainability
Grouping Output
# Group results by debt category
debtmap analyze . --group-by-category
# Combine filters for focused analysis
debtmap analyze . --filter Architecture --min-priority high --top 5
Summary Mode
# Compact tiered priority display
debtmap analyze . --summary
# Combines well with filtering
debtmap analyze . --summary --min-priority medium
Best Practices
When to Use Each Format
Use Terminal Format When:
- Developing locally and reviewing code
- Getting quick feedback on changes
- Presenting results to team members
- Exploring complexity hotspots interactively
Use JSON Format When:
- Integrating with CI/CD pipelines
- Building custom analysis tools
- Tracking metrics over time
- Programmatically processing results
- Feeding into dashboards or monitoring systems
Use Markdown Format When:
- Generating documentation
- Creating PR comments
- Sharing reports with stakeholders
- Archiving analysis results
- Producing executive summaries
Quick Reference Table
| Format | Best For | Machine Readable | Human Readable | File Extension |
|---|---|---|---|---|
| Terminal | Development | No | Yes | .txt |
| JSON | Automation | Yes | No | .json |
| Markdown | Documentation | Partially | Yes | .md |
Combining Formats
Use multiple formats for comprehensive workflows:
# Generate terminal output for review
debtmap analyze .
# Generate JSON for automation
debtmap analyze . --format json -o ci-report.json
# Generate markdown for documentation
debtmap analyze . --format markdown -o docs/DEBT.md
Performance Considerations
- Terminal format: Fastest, minimal overhead
- JSON format: Fast serialization, efficient for large codebases
- Markdown format: Slightly slower due to formatting, but still performant
For very large codebases (>10,000 files), use --top or --filter to limit output size.
Troubleshooting
Common Issues
Colors not showing in terminal:
- Check if terminal supports ANSI colors
- Use --plain flag for ASCII-only output
- Some CI systems may not support color codes
JSON parsing errors:
- Ensure output is complete (check for errors during analysis)
- Validate JSON with jq or online validators
- Check for special characters in file paths
Markdown rendering issues:
- Some markdown renderers don’t support all features
- Use standard markdown for maximum compatibility
- Test with pandoc or GitHub/GitLab preview
File encoding problems:
- Ensure UTF-8 encoding for all output files
- Use --plain for pure ASCII output
- Check locale settings (LC_ALL, LANG environment variables)
Exit Codes
Current behavior (as verified in src/main.rs):
- 0: Successful analysis completed without errors
- Non-zero: Error during analysis (invalid path, parsing error, etc.)
Note: Threshold-based exit codes (where analysis succeeds but fails quality gates) are not currently implemented. The analyze command returns 0 on successful analysis regardless of debt scores or complexity thresholds.
To enforce quality gates based on thresholds, use the validate command or parse JSON output:
# Use validate command for threshold enforcement
debtmap validate . --config debtmap.toml
# Or parse JSON output for threshold checking
debtmap analyze . --format json -o report.json
DEBT_SCORE=$(jq '.technical_debt.items | length' report.json)
if [ "$DEBT_SCORE" -gt 100 ]; then
echo "Debt threshold exceeded"
exit 1
fi
See Also
- Getting Started - Basic usage and examples
- Analysis Guide - Understanding metrics and scores
- Configuration - Customizing analysis behavior
Architecture
This chapter explains how debtmap’s analysis pipeline works, from discovering files to producing prioritized technical debt recommendations.
Analysis Pipeline Overview
Debtmap’s analysis follows a multi-stage pipeline that transforms source code into actionable recommendations:
┌─────────────────┐
│ File Discovery │
└────────┬────────┘
│
▼
┌─────────────────┐
│Language Detection│
└────────┬────────┘
│
▼
┌────────┐
│ Parser │
└────┬───┘
│
┌────┼────────────┐
│ │ │
▼ ▼ ▼
┌─────┐ ┌──────────┐ ┌───────────┐
│ syn │ │rustpython│ │tree-sitter│
│ AST │ │ AST │ │ AST │
└──┬──┘ └────┬─────┘ └─────┬─────┘
│ │ │
└─────────┼─────────────┘
│
▼
┌──────────────────┐
│ Metric Extraction │
└─────────┬────────┘
│
┌───────┼───────┐
│ │ │
▼ ▼ ▼
┌────────┐ ┌─────┐ ┌─────────┐
│Complexity│ │Call │ │ Pattern │
│ Calc │ │Graph│ │Detection│
└────┬───┘ └──┬──┘ └────┬────┘
│ │ │
▼ │ │
┌─────────┐ │ │
│ Entropy │ │ │
│ Analysis│ │ │
└────┬────┘ │ │
│ │ │
▼ ▼ ▼
┌─────────┐ ┌────────┐ ┌──────┐ ┌──────────┐
│Effective│ │Dependency│ │ Debt │ │ LCOV │
│Complexity│ │Analysis│ │Class │ │ Coverage │
└────┬────┘ └────┬───┘ └──┬───┘ └────┬─────┘
│ │ │ │
└───────────┼────────┼─────────────┘
│ │
▼ ▼
┌─────────────────┐
│ Risk Scoring │
└────────┬────────┘
│
▼
┌───────────────────┐
│Tiered Prioritization│
└─────────┬─────────┘
│
▼
┌────────────────┐
│Output Formatting│
└────────┬───────┘
│
▼
┌─────────────────────┐
│Terminal/JSON/Markdown│
└─────────────────────┘
Key Components
1. File Discovery and Language Detection
Purpose: Identify source files to analyze and determine their language.
How it works:
- Walks the project directory tree (respecting .gitignore and .debtmapignore)
- Detects language based on file extension (.rs, .py, .js, .ts)
- Filters out test files, build artifacts, and vendored dependencies
- Groups files by language for parallel processing
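The extension-based detection step can be sketched as follows. This is a minimal illustration, not debtmap's actual code: the `Language` enum and `detect_language` function are hypothetical names, and only the four extensions listed above are handled.

```rust
use std::path::Path;

/// Languages named in the text above (hypothetical enum for this sketch).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Language {
    Rust,
    Python,
    JavaScript,
    TypeScript,
}

/// Map a file extension to a language; `None` means the file is skipped.
pub fn detect_language(path: &Path) -> Option<Language> {
    match path.extension()?.to_str()? {
        "rs" => Some(Language::Rust),
        "py" => Some(Language::Python),
        "js" => Some(Language::JavaScript),
        "ts" => Some(Language::TypeScript),
        _ => None,
    }
}
```

Returning `Option` keeps unknown files (for example a README or a Makefile) out of the pipeline without treating them as errors.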
Configuration:
[analysis]
exclude_patterns = ["**/tests/**", "**/target/**", "**/node_modules/**"]
include_patterns = ["src/**/*.rs", "lib/**/*.py"]
2. Parser Layer
Purpose: Convert source code into Abstract Syntax Trees (ASTs) for analysis.
Language-Specific Parsers:
Rust (syn):
- Uses the syn crate for full Rust syntax support
- Extracts: functions, structs, impls, traits, macros
- Handles: async/await, generic types, lifetime annotations
- Performance: ~10-20ms per file
Python (rustpython):
- Uses rustpython’s parser for Python 3.x syntax
- Extracts: functions, classes, methods, decorators
- Handles: comprehensions, async/await, type hints
- Performance: ~5-15ms per file
JavaScript/TypeScript (tree-sitter):
- Uses tree-sitter for JS/TS parsing
- Extracts: functions, classes, arrow functions, hooks
- Handles: JSX/TSX, decorators, generics
- Performance: ~8-18ms per file
Error Handling:
- Syntax errors logged but don’t stop analysis
- Partial ASTs used when possible
- Files with parse errors excluded from final report
3. Metric Extraction
Purpose: Extract raw metrics from ASTs.
Metrics Computed:
Function-Level:
- Lines of code (LOC)
- Cyclomatic complexity (branch count)
- Nesting depth (max indentation level)
- Parameter count
- Return path count
- Comment ratio
File-Level:
- Total LOC
- Number of functions/classes
- Dependency count (imports)
- Documentation coverage
Implementation:
pub struct FunctionMetrics {
    pub name: String,
    pub location: Location,
    pub loc: u32,
    pub cyclomatic_complexity: u32,
    pub nesting_depth: u32,
    pub parameter_count: u32,
    pub return_paths: u32,
}
4. Complexity Calculation and Entropy Analysis
Purpose: Compute effective complexity using entropy-adjusted metrics.
Traditional Cyclomatic Complexity:
- Count decision points (if, match, loop, etc.)
- Each branch adds +1 to complexity
- Does not distinguish between repetitive and varied logic
Entropy-Based Adjustment:
Debtmap calculates pattern entropy to adjust cyclomatic complexity:
- Extract patterns - Identify branch structures (e.g., all if/return patterns)
- Calculate variety - Measure information entropy of patterns
- Adjust complexity - Reduce score for low-entropy (repetitive) code
Formula:
Entropy = -Σ(p_i * log2(p_i))
where p_i = frequency of pattern i
Effective Complexity = Cyclomatic * (1 - (1 - Entropy/Max_Entropy) * 0.75)
Example:
// 20 similar if/return statements
// Cyclomatic: 20, Entropy: 0.3, Max entropy: log2(20) ≈ 4.32
// Effective: 20 * (1 - (1 - 0.3/4.32) * 0.75) ≈ 6.0
This approach reduces false positives from validation/configuration code while still flagging genuinely complex logic.
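The formula above can be sketched in a few lines of Rust. This is an illustration under stated assumptions, not debtmap's implementation: the function names are hypothetical, the patterns are assumed to be already counted, and Max_Entropy is assumed to be supplied by the caller (the example above uses log2 of the branch count).

```rust
/// Shannon entropy (in bits) of a list of pattern occurrence counts.
pub fn shannon_entropy(counts: &[usize]) -> f64 {
    let total: usize = counts.iter().sum();
    if total == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Effective complexity per the formula above.
/// The 0.75 dampening factor is taken from the text; the rest is a sketch.
pub fn effective_complexity(cyclomatic: u32, entropy: f64, max_entropy: f64) -> f64 {
    const DAMPENING: f64 = 0.75;
    if max_entropy <= 0.0 {
        // No variety can be measured: leave the raw score untouched.
        return cyclomatic as f64;
    }
    let normalized = (entropy / max_entropy).clamp(0.0, 1.0);
    cyclomatic as f64 * (1.0 - (1.0 - normalized) * DAMPENING)
}
```

With a cyclomatic complexity of 20, an entropy of 0.3, and a maximum entropy of log2(20), `effective_complexity` returns roughly 6.04. Code whose entropy equals the maximum keeps its full cyclomatic score.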
5. Call Graph Construction
Purpose: Understand function dependencies and identify critical paths.
What’s Tracked:
- Function calls within the same file
- Cross-file calls (when possible to resolve)
- Method calls on structs/classes
- Trait/interface implementations
Analysis:
- Fan-in: How many functions call this function
- Fan-out: How many functions this function calls
- Depth: Distance from entry points (main, handlers)
- Cycles: Detect recursive calls
Usage:
- Prioritize functions called from many untested paths
- Identify central functions (high fan-in/fan-out)
- Detect test coverage gaps in critical paths
Limitations:
- Dynamic dispatch not fully resolved
- Cross-crate calls require additional analysis
- Closures and function pointers approximated
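As a rough illustration of the fan-in and fan-out metrics described above, the sketch below counts callers and callees from a flat list of call edges. The `Fan` struct and `fan_metrics` function are hypothetical names; a real call graph would also carry file and location data.

```rust
use std::collections::HashMap;

/// Fan-in and fan-out for one function (hypothetical struct for this sketch).
#[derive(Debug, Default, PartialEq, Eq, Clone, Copy)]
pub struct Fan {
    pub fan_in: usize,
    pub fan_out: usize,
}

/// Count distinct callers and callees from (caller, callee) edges.
/// Duplicate edges are counted once.
pub fn fan_metrics<'a>(edges: &[(&'a str, &'a str)]) -> HashMap<&'a str, Fan> {
    let mut unique: Vec<(&'a str, &'a str)> = edges.to_vec();
    unique.sort();
    unique.dedup();

    let mut metrics: HashMap<&'a str, Fan> = HashMap::new();
    for (caller, callee) in unique {
        metrics.entry(caller).or_default().fan_out += 1;
        metrics.entry(callee).or_default().fan_in += 1;
    }
    metrics
}
```

A function that never appears as a callee has a fan-in of 0, which is the signal the text associates with potentially unused code.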
6. Pattern Detection and Debt Classification
Purpose: Identify specific technical debt patterns.
Debt Categories:
Test Gaps:
- Functions with 0% coverage and high complexity
- Untested error paths
- Missing edge case tests
Complexity Issues:
- Functions exceeding thresholds (default: 10)
- Deep nesting (3+ levels)
- Long functions (200+ LOC)
Design Smells:
- God functions (high fan-out)
- Unused code (fan-in = 0)
- Circular dependencies
Implementation:
pub enum DebtType {
    TestGap { missing_tests: u32 },
    HighComplexity { score: u32 },
    DeepNesting { depth: u32 },
    LongFunction { loc: u32 },
    TooManyParams { count: u32 },
}
7. Coverage Integration
Purpose: Map test coverage data to complexity metrics for risk scoring.
Coverage Data Flow:
- Read LCOV file - Parse coverage report from test runners
- Map to source - Match coverage lines to functions/branches
- Calculate coverage % - For each function, compute:
  - Line coverage: % of lines executed
  - Branch coverage: % of branches taken
- Identify gaps - Find untested branches in complex functions
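The line-coverage part of this flow can be sketched with the LCOV `DA:<line>,<hits>` record type. This is a simplified illustration, not debtmap's parser: the function names are hypothetical, it assumes a single-file report, and it ignores branch (`BRDA`) and function (`FN`) records.

```rust
/// Parse `DA:<line>,<hits>` records from LCOV text into (line, hits) pairs.
/// Other record types (SF, FN, BRDA, ...) are ignored in this sketch.
pub fn parse_da_records(lcov: &str) -> Vec<(u32, u64)> {
    lcov.lines()
        .filter_map(|l| l.trim().strip_prefix("DA:"))
        .filter_map(|rest| {
            let mut parts = rest.split(',');
            let line: u32 = parts.next()?.parse().ok()?;
            let hits: u64 = parts.next()?.parse().ok()?;
            Some((line, hits))
        })
        .collect()
}

/// Line coverage (0.0..=100.0) for a function spanning `start..=end`.
/// Returns `None` when no instrumented lines fall inside the range.
pub fn line_coverage(records: &[(u32, u64)], start: u32, end: u32) -> Option<f64> {
    let in_range: Vec<&(u32, u64)> = records
        .iter()
        .filter(|(line, _)| (start..=end).contains(line))
        .collect();
    if in_range.is_empty() {
        return None;
    }
    let covered = in_range.iter().filter(|(_, hits)| *hits > 0).count();
    Some(100.0 * covered as f64 / in_range.len() as f64)
}
```

Returning `None` for a range with no instrumented lines distinguishes "no data" from "0% covered", which matters when a function is missing from the coverage report entirely.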
Coverage Scoring:
pub struct CoverageMetrics {
    pub lines_covered: u32,
    pub lines_total: u32,
    pub branches_covered: u32,
    pub branches_total: u32,
    pub coverage_percent: f64,
}
Special Cases:
- Entry points (main, handlers) expect integration test coverage
- Generated code excluded from coverage requirements
- Test files themselves not analyzed for coverage
8. Risk Scoring
Purpose: Combine complexity and coverage into a unified risk score.
Risk Formula:
Risk Score = (Effective Complexity * Coverage Gap Weight) + (Call Graph Depth * Path Weight)
where:
- Effective Complexity: Entropy-adjusted cyclomatic complexity
- Coverage Gap Weight: 1.0 for 0% coverage, decreasing to 0.1 for 95%+
- Call Graph Depth: Distance from entry points
- Path Weight: Number of untested paths leading to this function
Example Calculation:
Function: calculate_risk_score()
  Effective Complexity: 8.5
  Coverage: 30%
  Coverage Gap Weight: 0.7
  Call Graph Depth: 3
  Untested Paths: 2
Risk = (8.5 * 0.7) + (3 * 2 * 0.3) = 5.95 + 1.8 = 7.75
(In this calculation the path term scales the 2 untested paths by a per-path factor of 0.3.)
Risk Tiers:
- Critical (8.0+): Immediate attention required
- High (5.0-7.9): Priority for next sprint
- Moderate (2.0-4.9): Address when refactoring nearby code
- Low (<2.0): Monitor but no immediate action
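A minimal sketch of the risk formula and tier thresholds follows. It is illustrative only: the function names are hypothetical, the linear coverage weight with a floor of 0.1 is an assumption (the text gives only the endpoints), and the per-path factor is passed in explicitly because the worked example uses 0.3 without the formula defining it.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Tier {
    Critical,
    High,
    Moderate,
    Low,
}

/// Coverage gap weight: 1.0 at 0% coverage, floored at 0.1.
/// `coverage` is a fraction in 0.0..=1.0; linear interpolation is assumed.
pub fn coverage_gap_weight(coverage: f64) -> f64 {
    (1.0 - coverage.clamp(0.0, 1.0)).max(0.1)
}

/// Risk = effective complexity * gap weight + depth * path weight,
/// where path weight = untested paths * per-path factor (assumed).
pub fn risk_score(
    effective_complexity: f64,
    coverage: f64,
    call_graph_depth: u32,
    untested_paths: u32,
    per_path_factor: f64,
) -> f64 {
    let path_weight = untested_paths as f64 * per_path_factor;
    effective_complexity * coverage_gap_weight(coverage) + call_graph_depth as f64 * path_weight
}

/// Map a score onto the risk tiers listed above.
pub fn classify_tier(score: f64) -> Tier {
    if score >= 8.0 {
        Tier::Critical
    } else if score >= 5.0 {
        Tier::High
    } else if score >= 2.0 {
        Tier::Moderate
    } else {
        Tier::Low
    }
}
```

Calling `risk_score(8.5, 0.30, 3, 2, 0.3)` reproduces the 7.75 from the example calculation, which `classify_tier` places in the High tier.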
9. Tiered Prioritization
Purpose: Classify and rank technical debt items by urgency and impact.
Prioritization Algorithm:
- Calculate base risk score (from Risk Scoring step)
- Apply context adjustments:
  - Entry points: -2.0 score (lower priority for unit tests)
  - Core business logic: +1.5 score (higher priority)
  - Frequently changed files: +1.0 score (git history analysis)
  - Critical paths: +0.5 score per untested caller
- Classify into tiers:
  - Critical: score >= 8.0
  - High: score >= 5.0
  - Moderate: score >= 2.0
  - Low: score < 2.0
- Sort within tiers by:
  - Impact (estimated risk reduction)
  - Effort (test count or refactoring size)
  - ROI (impact / effort)
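The algorithm above can be approximated as follows. This is a sketch under assumptions rather than the real ranking code: the `Item` struct and its fields are hypothetical, the adjustments use the values listed above, and the within-tier ordering is simplified to ROI alone.

```rust
/// Hypothetical input record for this sketch.
#[derive(Debug, Clone)]
pub struct Item {
    pub name: &'static str,
    pub base_score: f64,
    pub is_entry_point: bool,
    pub is_core_logic: bool,
    pub frequently_changed: bool,
    pub untested_callers: u32,
    pub impact: f64,
    pub effort: f64,
}

/// Apply the context adjustments listed above to the base risk score.
pub fn adjusted_score(item: &Item) -> f64 {
    let mut score = item.base_score;
    if item.is_entry_point {
        score -= 2.0;
    }
    if item.is_core_logic {
        score += 1.5;
    }
    if item.frequently_changed {
        score += 1.0;
    }
    score + 0.5 * item.untested_callers as f64
}

/// 0 = Critical, 1 = High, 2 = Moderate, 3 = Low.
pub fn tier_rank(score: f64) -> u8 {
    if score >= 8.0 {
        0
    } else if score >= 5.0 {
        1
    } else if score >= 2.0 {
        2
    } else {
        3
    }
}

/// Order items by tier, then by descending ROI (impact / effort) within a tier.
pub fn prioritize(mut items: Vec<Item>) -> Vec<Item> {
    items.sort_by(|a, b| {
        let roi = |i: &Item| i.impact / i.effort.max(f64::EPSILON);
        let (ta, tb) = (tier_rank(adjusted_score(a)), tier_rank(adjusted_score(b)));
        ta.cmp(&tb).then(roi(b).total_cmp(&roi(a)))
    });
    items
}
```

Sorting on the tier first keeps a cheap Moderate item from outranking an expensive Critical one, while ROI decides the order among items of equal urgency.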
Output:
pub struct PrioritizedDebtItem {
    pub rank: u32,
    pub score: f64,
    pub tier: Tier,
    pub location: Location,
    pub debt_type: DebtType,
    pub action: String,
    pub impact: f64,
    pub effort: Effort,
}
See Tiered Prioritization for detailed explanation of the ranking algorithm.
10. Output Formatting
Purpose: Present analysis results in user-friendly formats.
Output Formats:
Terminal (default):
- Color-coded by tier (red=critical, yellow=high, etc.)
- Hierarchical tree view with unicode box characters
- Collapsible sections for detailed recommendations
- Summary statistics at top
JSON:
- Machine-readable for CI/CD integration
- Full metadata for each debt item
- Structured for programmatic consumption
- Schema-versioned for compatibility
Markdown:
- Rendered in GitHub/GitLab for PR comments
- Embedded code blocks with syntax highlighting
- Collapsible details sections
- Linked to source code locations
GitHub PR Comments:
- Automated comments on pull requests
- Inline annotations at specific lines
- Comparison with base branch (new vs existing debt)
- Summary card with key metrics
See Output Formats for examples and configuration options.
Data Flow Example
Let’s trace a single function through the entire pipeline:
Input: Source File
// src/handlers.rs
pub fn process_request(req: Request) -> Result<Response> {
    validate_auth(&req)?;
    let data = parse_payload(&req.body)?;
    let result = apply_business_logic(data)?;
    format_response(result)
}
Stage 1: Parsing
FunctionAst {
    name: "process_request",
    location: Location { file: "src/handlers.rs", line: 2 },
    calls: ["validate_auth", "parse_payload", "apply_business_logic", "format_response"],
    ...
}
Stage 2: Metric Extraction
FunctionMetrics {
    name: "process_request",
    cyclomatic_complexity: 4, // 3 ?-operators + base
    nesting_depth: 1,
    loc: 5,
    ...
}
Stage 3: Entropy Analysis
// Pattern: repetitive ?-operator error handling
Entropy: 0.4 (low variety)
Effective Complexity: 4 * 0.85 = 3.4
Stage 4: Call Graph
CallGraphNode {
    function: "process_request",
    fan_in: 3,  // called from 3 handlers
    fan_out: 4, // calls 4 functions
    depth: 1,   // direct handler (entry point)
}
Stage 5: Coverage (from LCOV)
CoverageMetrics {
    lines_covered: 5,
    lines_total: 5,
    branches_covered: 3,
    branches_total: 4, // Missing one error path
    coverage_percent: 75%,
}
Stage 6: Risk Scoring
Risk = (3.4 * 0.25) + (1 * 1 * 0.2) = 0.85 + 0.2 = 1.05
Tier: LOW (entry point with decent coverage)
Stage 7: Recommendation
#23 SCORE: 1.1 [LOW]
├─ MINOR GAP: ./src/handlers.rs:2 process_request()
├─ ACTION: Add 1 test for error path at line 3
├─ IMPACT: -0.3 risk reduction
└─ WHY: Entry point with 75% branch coverage, missing error case
Performance Characteristics
Analysis Speed:
- Small project (< 10k LOC): 1-3 seconds
- Medium project (10-50k LOC): 5-15 seconds
- Large project (50-200k LOC): 20-60 seconds
- Very large project (200k+ LOC): 1-5 minutes
Parallelization:
- File parsing: Parallel across all available cores
- Metric extraction: Parallel per-file
- Call graph construction: Sequential (requires cross-file state)
- Risk scoring: Parallel per-function
- Output formatting: Sequential
Memory Usage:
- Approx 100-200 KB per file analyzed
- Peak memory for large projects: 500 MB - 1 GB
- Streaming mode available for very large codebases
Optimization Strategies:
- Skip unchanged files (git diff integration)
- Parallel processing with rayon
- Efficient AST traversal (visitor pattern)
- Memory-efficient streaming for large codebases
Extension Points
Custom Analyzers:
Implement the Analyzer trait to add language support:
pub trait Analyzer {
    fn parse(&self, content: &str) -> Result<Ast>;
    fn extract_metrics(&self, ast: &Ast) -> Vec<FunctionMetrics>;
    fn detect_patterns(&self, ast: &Ast) -> Vec<DebtPattern>;
}
Custom Scoring:
Implement the RiskScorer trait to adjust scoring logic:
pub trait RiskScorer {
    fn calculate_risk(&self, metrics: &FunctionMetrics, coverage: &CoverageMetrics) -> f64;
    fn classify_tier(&self, score: f64) -> Tier;
}
Custom Output:
Implement the OutputFormatter trait for new formats:
pub trait OutputFormatter {
    fn format(&self, items: &[PrioritizedDebtItem]) -> Result<String>;
}
Next Steps
- Understand prioritization: See Tiered Prioritization
- Learn scoring strategies: See Scoring Strategies
- Configure analysis: See Configuration
- View examples: See Examples
Architectural Analysis
Debtmap provides comprehensive architectural analysis capabilities based on Robert C. Martin’s software engineering principles. These tools help identify structural issues, coupling problems, and architectural anti-patterns in your codebase.
Overview
Architectural analysis examines module-level relationships and dependencies to identify:
- Circular Dependencies - Modules that create dependency cycles
- Coupling Metrics - Afferent and efferent coupling measurements
- Bidirectional Dependencies - Inappropriate intimacy between modules
- Stable Dependencies Principle Violations - Unstable modules being depended upon
- Zone of Pain - Rigid, concrete implementations heavily depended upon
- Zone of Uselessness - Overly abstract, unstable modules
- Code Duplication - Identical or similar code blocks across files
These analyses help you maintain clean architecture and identify refactoring opportunities.
Circular Dependency Detection
Circular dependencies occur when modules form a dependency cycle (A depends on B, B depends on C, C depends on A). These violations break architectural boundaries and make code harder to understand, test, and maintain.
How It Works
Debtmap builds a dependency graph from module imports and uses depth-first search (DFS) with recursion stack tracking to detect cycles:
- Parse all files to extract import/module dependencies
- Build a directed graph where nodes are modules and edges are dependencies
- Run DFS from each unvisited module
- Track visited nodes and recursion stack
- When a node is reached that’s already in the recursion stack, a cycle is detected
Implementation: src/debt/circular.rs:44-66 (detect_circular_dependencies)
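The DFS with recursion-stack tracking described above can be sketched like this. It is a simplified stand-in for the referenced implementation, not a copy of it: `find_cycle` is a hypothetical name, modules are plain string slices, and only the first cycle found is returned.

```rust
use std::collections::{HashMap, HashSet};

/// Return one dependency cycle as a path that starts and ends on the same
/// module, or `None` if the graph is acyclic.
pub fn find_cycle<'a>(graph: &HashMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    fn dfs<'a>(
        node: &'a str,
        graph: &HashMap<&'a str, Vec<&'a str>>,
        visited: &mut HashSet<&'a str>,
        stack: &mut Vec<&'a str>,
    ) -> Option<Vec<String>> {
        if let Some(pos) = stack.iter().position(|&n| n == node) {
            // `node` is already on the recursion stack: a cycle is closed.
            let mut cycle: Vec<String> = stack[pos..].iter().map(|s| s.to_string()).collect();
            cycle.push(node.to_string());
            return Some(cycle);
        }
        if !visited.insert(node) {
            // Fully explored earlier; no cycle passes through this node.
            return None;
        }
        stack.push(node);
        for &next in graph.get(node).map(Vec::as_slice).unwrap_or(&[]) {
            if let Some(cycle) = dfs(next, graph, visited, stack) {
                return Some(cycle);
            }
        }
        stack.pop();
        None
    }

    let mut visited = HashSet::new();
    let mut roots: Vec<&str> = graph.keys().copied().collect();
    roots.sort(); // deterministic start order
    for root in roots {
        let mut stack = Vec::new();
        if let Some(cycle) = dfs(root, graph, &mut visited, &mut stack) {
            return Some(cycle);
        }
    }
    None
}
```

The `visited` set keeps the search linear in the size of the graph, while the separate `stack` is what distinguishes a true back edge (a cycle) from a node that was merely reached twice by different paths.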
Example
// Module A (src/auth.rs)
use crate::user::User;
use crate::session::validate_session;

// Module B (src/user.rs)
use crate::session::Session;

// Module C (src/session.rs)
use crate::auth::authenticate; // Creates cycle: auth → session → auth
Debtmap detects:
Circular dependency detected: auth → session → auth
Refactoring Recommendations
To break circular dependencies:
- Extract Interface - Create a trait that both modules depend on
- Dependency Inversion - Introduce an abstraction layer
- Move Shared Code - Extract common functionality to a new module
- Remove Dependency - Inline or duplicate small amounts of code
Coupling Metrics
Coupling metrics measure how interconnected modules are. Debtmap calculates two primary metrics:
Afferent Coupling (Ca)
Afferent coupling is the number of modules that depend on this module. High afferent coupling means many modules rely on this code.
pub struct CouplingMetrics {
    pub module: String,
    pub afferent_coupling: usize, // Number depending on this module
    pub efferent_coupling: usize, // Number this module depends on
    pub instability: f64,         // Calculated from Ca and Ce
    pub abstractness: f64,        // Ratio of abstract types
}
Implementation: src/debt/coupling.rs:6-30
Efferent Coupling (Ce)
Efferent coupling is the number of modules this module depends on. High efferent coupling means this module has many dependencies.
Example Coupling Analysis
Module: api_handler
Afferent coupling (Ca): 8 // 8 modules depend on api_handler
Efferent coupling (Ce): 3 // api_handler depends on 3 modules
Instability: 0.27 // Relatively stable
High afferent or efferent coupling (typically >5) indicates potential maintainability issues.
Instability Metric
The instability metric measures how susceptible a module is to change: the lower the value, the more resistant the module is to change. It’s calculated as:
I = Ce / (Ca + Ce)
Interpretation:
- I = 0.0 - Maximally stable (no dependencies, many dependents)
- I = 1.0 - Maximally unstable (many dependencies, no dependents)
Implementation: src/debt/coupling.rs:16-24 (calculate_instability)
Stability Guidelines
- Stable modules (I < 0.3) - Hard to change but depended upon; should contain stable abstractions
- Balanced modules (0.3 ≤ I ≤ 0.7) - Normal modules with both dependencies and dependents
- Unstable modules (I > 0.7) - Change frequently; should have few or no dependents
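The instability formula and the guideline bands above can be expressed as a short sketch. The names are hypothetical, and treating a module with no couplings at all (Ca + Ce = 0) as having instability 0.0 is an assumption made here to avoid dividing by zero.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum StabilityBand {
    Stable,
    Balanced,
    Unstable,
}

/// I = Ce / (Ca + Ce). An isolated module (Ca + Ce = 0) is reported as 0.0.
pub fn instability(afferent: usize, efferent: usize) -> f64 {
    let total = afferent + efferent;
    if total == 0 {
        0.0
    } else {
        efferent as f64 / total as f64
    }
}

/// Classify an instability value using the guideline thresholds above.
pub fn stability_band(instability: f64) -> StabilityBand {
    if instability < 0.3 {
        StabilityBand::Stable
    } else if instability > 0.7 {
        StabilityBand::Unstable
    } else {
        StabilityBand::Balanced
    }
}
```

For a module with Ca = 8 and Ce = 3, `instability` returns 3/11 ≈ 0.27, which falls in the stable band.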
Example
// Stable module (I = 0.1)
// core/types.rs - defines fundamental types, depended on by 20 modules
pub struct User { ... }
pub struct Session { ... }

// Unstable module (I = 1.0)
// handlers/admin_dashboard.rs - depends on 10 modules, no dependents
use crate::auth::*;
use crate::database::*;
use crate::templates::*;
// ... 7 more imports
Stable Dependencies Principle
The Stable Dependencies Principle (SDP) states: Depend in the direction of stability. Modules should depend on modules that are more stable than themselves.
SDP Violations
Debtmap flags violations when a module has:
- Instability > 0.8 (very unstable)
- Afferent coupling > 2 (multiple modules depend on it)
This means an unstable, frequently changing module is being depended upon by multiple other modules - a recipe for maintenance problems.
Implementation: src/debt/coupling.rs:69-76
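The two conditions above combine into a simple filter. The sketch below is illustrative and not the code at the referenced location: `ModuleCoupling` and `sdp_violations` are hypothetical names, and instability is recomputed inline from Ca and Ce.

```rust
/// Hypothetical per-module coupling record for this sketch.
pub struct ModuleCoupling {
    pub name: String,
    pub afferent: usize,
    pub efferent: usize,
}

/// Names of modules with instability > 0.8 and afferent coupling > 2.
pub fn sdp_violations(modules: &[ModuleCoupling]) -> Vec<&str> {
    modules
        .iter()
        .filter(|m| {
            let total = m.afferent + m.efferent;
            let instability = if total == 0 {
                0.0
            } else {
                m.efferent as f64 / total as f64
            };
            instability > 0.8 && m.afferent > 2
        })
        .map(|m| m.name.as_str())
        .collect()
}
```

Both conditions must hold: a highly unstable module with a single dependent is not flagged, and neither is a heavily used module that is itself stable.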
Example Violation
Module 'temp_utils' violates Stable Dependencies Principle
(instability: 0.85, depended on by 5 modules)
Problem: This module changes frequently but is heavily depended upon.
Solution: Extract stable interface or reduce dependencies on this module.
Fixing SDP Violations
- Increase stability - Reduce the module’s dependencies
- Reduce afferent coupling - Extract interface, use dependency injection
- Split module - Separate stable and unstable parts
Bidirectional Dependencies
Bidirectional dependencies (also called inappropriate intimacy) occur when two modules depend on each other:
Module A depends on Module B
Module B depends on Module A
This creates tight coupling and makes both modules harder to change, test, or reuse independently.
Implementation: src/debt/coupling.rs:98-117 (detect_inappropriate_intimacy)
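Detecting such pairs amounts to checking whether the reverse of an edge is also present. A minimal sketch, assuming dependencies are available as (from, to) pairs and using the hypothetical name `bidirectional_pairs`:

```rust
use std::collections::BTreeSet;

/// Return each pair of modules that depend on each other, reported once
/// with the names in alphabetical order. Self-dependencies are ignored.
pub fn bidirectional_pairs<'a>(edges: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
    let set: BTreeSet<(&'a str, &'a str)> = edges.iter().copied().collect();
    set.iter()
        .filter(|&&(a, b)| a < b && set.contains(&(b, a)))
        .copied()
        .collect()
}
```

The `a < b` check ensures that `order`/`customer` is reported once rather than twice, and the `BTreeSet` gives a stable, sorted result.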
Example
// order.rs
use crate::customer::Customer;

pub struct Order {
    customer: Customer,
}

// customer.rs
use crate::order::Order; // Bidirectional dependency!

pub struct Customer {
    orders: Vec<Order>,
}
Debtmap detects:
Inappropriate intimacy detected between 'order' and 'customer'
Refactoring Recommendations
- Create Mediator - Introduce a third module to manage the relationship
- Break into Separate Modules - Split concerns more clearly
- Use Events - Replace direct dependencies with event-driven communication
- Dependency Inversion - Introduce interfaces/traits both depend on
Zone of Pain Detection
The zone of pain contains modules with:
- Low abstractness (< 0.2) - Concrete implementations, no abstractions
- Low instability (< 0.2) - Stable, hard to change
- High afferent coupling (> 3) - Many modules depend on them
These modules are rigid concrete implementations that are heavily used but hard to change - causing pain when modifications are needed.
Implementation: src/debt/coupling.rs:125-138
Example
Module 'database_client' is in the zone of pain (rigid and hard to change)
Abstractness: 0.1 (all concrete implementation)
Instability: 0.15 (very stable, many dependents)
Afferent coupling: 12 (12 modules depend on it)
Problem: This concrete database client is used everywhere.
Any change to its implementation requires updating many modules.
Refactoring Recommendations
- Extract Interfaces - Create a DatabaseClient trait
- Introduce Abstractions - Define abstract operations others depend on
- Break into Smaller Modules - Separate concerns to reduce coupling
- Use Dependency Injection - Pass implementations via interfaces
Zone of Uselessness Detection
The zone of uselessness contains modules with:
- High abstractness (> 0.8) - Mostly abstract, few concrete implementations
- High instability (> 0.8) - Frequently changing
These modules are overly abstract and unstable, providing little stable value to the system.
Implementation: src/debt/coupling.rs:141-153
Example
Module 'base_processor' is in the zone of uselessness
(too abstract and unstable)
Abstractness: 0.9 (mostly traits and interfaces)
Instability: 0.85 (changes frequently)
Problem: This module defines many abstractions but provides little
concrete value. It changes often, breaking implementations.
Refactoring Recommendations
- Add Concrete Implementations - Make the module useful by implementing functionality
- Remove if Unused - Delete if no real value is provided
- Stabilize Interfaces - Stop changing abstractions frequently
- Merge with Implementations - Combine abstract and concrete code
Distance from Main Sequence
The main sequence represents the ideal balance between abstractness and instability. Modules should lie on the line:
A + I = 1
Where:
- A = Abstractness (ratio of abstract types to total types)
- I = Instability (Ce / (Ca + Ce))
Distance from the main sequence:
D = |A + I - 1|
Implementation: src/debt/coupling.rs:119-123
Interpretation
- D ≈ 0.0 - Module is on the main sequence (ideal)
- D > 0.5 - Module is far from ideal
- High D with low A and I → Zone of Pain
- High D with high A and I → Zone of Uselessness
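The distance formula and the two zone definitions can be combined in one small sketch. The thresholds are the ones stated in the zone sections of this chapter; the `Zone` enum and function names are hypothetical.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Zone {
    Pain,
    Uselessness,
    Neither,
}

/// D = |A + I - 1|
pub fn distance_from_main_sequence(abstractness: f64, instability: f64) -> f64 {
    (abstractness + instability - 1.0).abs()
}

/// Apply the zone thresholds from the text:
/// pain = A < 0.2, I < 0.2, Ca > 3; uselessness = A > 0.8, I > 0.8.
pub fn classify_zone(abstractness: f64, instability: f64, afferent: usize) -> Zone {
    if abstractness < 0.2 && instability < 0.2 && afferent > 3 {
        Zone::Pain
    } else if abstractness > 0.8 && instability > 0.8 {
        Zone::Uselessness
    } else {
        Zone::Neither
    }
}
```

A module with A = 0.1, I = 0.15 and 12 dependents has D = 0.75 and lands in the zone of pain; a module with A = I = 0.5 sits exactly on the main sequence with D = 0.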
Visual Representation
Abstractness
1.0 ┤╲            Zone of Uselessness (high A, high I)
    │  ╲
    │    ╲
0.5 ┤      ╲  Main Sequence (A + I = 1)
    │        ╲
    │          ╲
0.0 ┤────────────╲──
    0.0   0.5   1.0
        Instability
Zone of Pain (low A, low I)
Code Duplication Detection
Debtmap detects code duplication using hash-based chunk comparison:
- Extract chunks - Split files into fixed-size chunks (default: 50 lines)
- Normalize - Remove whitespace and comments
- Calculate hash - Compute SHA-256 hash for each normalized chunk
- Match duplicates - Find chunks with identical hashes
- Merge adjacent - Consolidate consecutive duplicate blocks
Note: The minimum chunk size is configurable via the --threshold-duplication flag or in .debtmap.toml (default: 50 lines).
Implementation: src/debt/duplication.rs:6-44 (detect_duplication)
Algorithm Details
pub fn detect_duplication(
    files: Vec<(PathBuf, String)>,
    min_lines: usize,            // Default: 50
    _similarity_threshold: f64,  // Currently unused (exact matching)
) -> Vec<DuplicationBlock>
The algorithm:
- Extracts overlapping chunks from each file
- Normalizes by trimming whitespace and removing comments
- Calculates SHA-256 hash for each normalized chunk
- Groups chunks by hash
- Returns groups with 2+ locations (duplicates found)
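The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the code in src/debt/duplication.rs: `find_duplicates`, `normalize`, and `Location` are hypothetical names, normalization only handles blank lines and `//` line comments, and the standard library's `DefaultHasher` stands in for SHA-256 so the example needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical location record: (file name, first line, last line), 1-based.
pub type Location = (String, usize, usize);

/// Simplified normalization: trim whitespace, drop blank and `//` comment lines.
fn normalize(line: &str) -> Option<&str> {
    let trimmed = line.trim();
    if trimmed.is_empty() || trimmed.starts_with("//") {
        None
    } else {
        Some(trimmed)
    }
}

/// Groups overlapping `min_lines`-sized chunks by the hash of their
/// normalized content and returns every group with two or more locations.
pub fn find_duplicates(files: &[(String, String)], min_lines: usize) -> Vec<Vec<Location>> {
    if min_lines == 0 {
        return Vec::new(); // `windows(0)` would panic
    }
    let mut by_hash: HashMap<u64, Vec<Location>> = HashMap::new();
    for (name, content) in files {
        let lines: Vec<&str> = content.lines().collect();
        for (start, window) in lines.windows(min_lines).enumerate() {
            let normalized: Vec<&str> = window.iter().copied().filter_map(normalize).collect();
            if normalized.is_empty() {
                continue;
            }
            // Stand-in for SHA-256: a 64-bit hash of the normalized lines.
            let mut hasher = DefaultHasher::new();
            normalized.hash(&mut hasher);
            by_hash
                .entry(hasher.finish())
                .or_default()
                .push((name.clone(), start + 1, start + min_lines));
        }
    }
    let mut groups: Vec<Vec<Location>> = by_hash
        .into_values()
        .filter(|locations| locations.len() >= 2)
        .collect();
    groups.sort(); // deterministic output order
    groups
}
```

The sketch stops at grouping. It does not merge adjacent duplicate chunks into larger blocks, which the documented pipeline performs as a final step.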
Example Output
Code duplication detected:
Hash: a3f2b9c1...
Lines: 50
Locations:
- src/handlers/user.rs:120-169
- src/handlers/admin.rs:85-134
- src/handlers/guest.rs:200-249
Recommendation: Extract common validation logic to shared module
Duplication Configuration
Configure duplication detection in .debtmap.toml:
# Minimum lines for duplication detection
threshold_duplication = 50 # Default value
# Smaller values catch more duplications but increase noise
# threshold_duplication = 30 # More sensitive
# Larger values only catch major duplications
# threshold_duplication = 100 # Less noise
Configuration reference: src/cli.rs:69 (threshold_duplication flag definition)
Implementation: src/debt/duplication.rs:6-10
Current Limitations
- Exact matching only - Currently uses hash-based exact matching
- similarity_threshold parameter - Defined but not implemented yet
- Future enhancement - Fuzzy matching for near-duplicates
Refactoring Recommendations
Debtmap provides specific refactoring recommendations for each architectural issue:
For Circular Dependencies
- Extract Interface - Create shared abstraction both modules use
- Dependency Inversion - Introduce interfaces to reverse dependency direction
- Move Shared Code - Extract to new module both can depend on
- Event-Driven - Replace direct calls with event publishing/subscribing
For High Coupling
- Facade Pattern - Provide simplified interface hiding complex dependencies
- Reduce Dependencies - Remove unnecessary imports and calls
- Dependency Injection - Pass dependencies via constructors/parameters
- Interface Segregation - Split large interfaces into focused ones
For Zone of Pain
- Introduce Abstractions - Extract traits/interfaces for flexibility
- Adapter Pattern - Wrap concrete implementations with adapters
- Strategy Pattern - Make algorithms pluggable via interfaces
For Zone of Uselessness
- Add Concrete Implementations - Provide useful functionality
- Remove Unused Code - Delete if providing no value
- Stabilize Interfaces - Stop changing abstractions frequently
For Bidirectional Dependencies
- Create Mediator - Third module manages relationship
- Break into Separate Modules - Clearer separation of concerns
- Observer Pattern - One-way communication via observers
For Code Duplication
- Extract Common Code - Create shared function/module
- Use Inheritance/Composition - Share via traits or composition
- Parameterize Differences - Extract variable parts as parameters
- Template Method - Define algorithm structure, vary specific steps
Examples and Use Cases
Running Architectural Analysis
# Architectural analysis runs automatically with standard analysis
debtmap analyze .
# Duplication detection with custom chunk size
debtmap analyze . --threshold-duplication 30
# Note: Circular dependencies, coupling metrics, and SDP violations
# are analyzed automatically. There are no separate flags to enable
# or disable specific architectural checks.
Example: Circular Dependency
Before:
src/auth.rs → src/session.rs → src/user.rs → src/auth.rs
Circular dependency detected: auth → session → user → auth
After refactoring:
src/auth.rs → src/auth_interface.rs ← src/session.rs
↑
src/user.rs
No circular dependencies found.
Example: Coupling Metrics Table
Module Analysis Results:
Module            Ca   Ce   Instability   Issues
-------------------------------------------------
core/types        15    0   0.00          None
api/handlers       2    8   0.80          High Ce
database/client    8    2   0.20          None
utils/temp         5   12   0.71          SDP violation
auth/session       3    3   0.50          None
Example: Zone of Pain
Module: legacy_db_client
Metrics:
Abstractness: 0.05 (all concrete code)
Instability: 0.12 (depended on by 25 modules)
Afferent coupling: 25
Distance from main sequence: 0.83
Status: Zone of Pain - rigid and hard to change
Refactoring steps:
1. Extract interface DatabaseClient trait
2. Create adapter wrapping legacy implementation
3. Gradually migrate dependents to use trait
4. Introduce alternative implementations
Interpreting Results
Prioritization
Address architectural issues in this order:
- Circular Dependencies (Highest Priority)
  - Break architectural boundaries
  - Make testing impossible
  - Cause build issues
- Bidirectional Dependencies (High Priority)
  - Create tight coupling
  - Prevent independent testing
  - Block modular changes
- Zone of Pain Issues (Medium-High Priority)
  - Indicate rigid architecture
  - Block future changes
  - High risk for bugs
- SDP Violations (Medium Priority)
  - Cause ripple effects
  - Increase maintenance cost
  - Unstable foundation
- High Coupling (Medium Priority)
  - Maintainability risk
  - Testing difficulty
  - Change amplification
- Code Duplication (Lower Priority)
  - Maintenance burden
  - Bug multiplication
  - Inconsistency risk
Decision Flowchart
Is there a circular dependency?
├─ YES → Break immediately (extract interface, DI)
└─ NO → Continue
Is there bidirectional dependency?
├─ YES → Refactor (mediator, event-driven)
└─ NO → Continue
Is module in zone of pain?
├─ YES → Introduce abstractions
└─ NO → Continue
Is SDP violated?
├─ YES → Stabilize or reduce afferent coupling
└─ NO → Continue
Is coupling > threshold?
├─ YES → Reduce dependencies
└─ NO → Continue
Is there significant duplication?
├─ YES → Extract common code
└─ NO → Architecture is good!
Integration with Debt Categories
Architectural analysis results are integrated with debtmap’s debt categorization system:
Debt Type Mapping
Architectural issues are mapped to existing DebtType enum variants:
- Duplication - Duplicated code blocks found
- Dependency - Used for circular dependencies and coupling issues
- CodeOrganization - May be used for architectural violations (SDP, zone issues)
Note: The DebtType enum does not have dedicated variants for CircularDependency, HighCoupling, or ArchitecturalViolation. Architectural issues are mapped to existing general-purpose debt types.
Reference: src/core/mod.rs:220-236 for actual DebtType enum definition
Tiered Prioritization
Architectural issues are assigned priority tiers:
- Tier 1 (Critical) - Circular dependencies, bidirectional dependencies
- Tier 2 (High) - Zone of pain, SDP violations
- Tier 3 (Medium) - High coupling, large duplications
- Tier 4 (Low) - Small duplications, minor coupling issues
Reference: See Tiered Prioritization for complete priority assignment logic
Cohesion Analysis
Note: Module cohesion analysis is currently a simplified placeholder implementation.
Current status: src/debt/coupling.rs:82-95 (analyze_module_cohesion)
The function exists but provides basic cohesion calculation. Full cohesion analysis (measuring how well module elements belong together) is planned for a future release.
Future Enhancement
Full cohesion analysis would measure:
- Functional cohesion (functions operating on related data)
- Sequential cohesion (output of one function feeds another)
- Communicational cohesion (functions operating on same data structures)
Configuration
Configurable Parameters
Configure duplication detection in .debtmap.toml or via CLI:
# Minimum lines for duplication detection
threshold_duplication = 50 # Default value
Or via command line:
debtmap analyze . --threshold-duplication 50
Configuration reference: src/cli.rs:69 (threshold_duplication flag definition)
Hardcoded Thresholds
Note: Most architectural thresholds are currently hardcoded in the implementation and cannot be configured:
- Coupling threshold: 5 (modules with >5 dependencies are flagged)
- Instability threshold: 0.8 (for SDP violations)
- SDP afferent threshold: 2 (minimum dependents for SDP violations)
- Zone of pain thresholds:
  - Abstractness < 0.2
  - Instability < 0.2
  - Afferent coupling > 3
- Zone of uselessness thresholds:
  - Abstractness > 0.8
  - Instability > 0.8
Source: src/debt/coupling.rs:70-76 (hardcoded threshold definitions)
See Configuration for complete options.
Troubleshooting
“No circular dependencies detected but build fails”
Cause: Circular dependencies at the package/crate level, not module level.
Solution: Use cargo tree to analyze package-level dependencies.
“Too many coupling warnings”
Cause: Default threshold of 5 may be too strict for your codebase.
Solution: The coupling threshold is currently hardcoded at 5 in the implementation (src/debt/coupling.rs:62). To adjust it, you would need to modify the source code. Consider using suppression patterns to exclude specific modules if needed. See Suppression Patterns.
“Duplication detected in generated code”
Cause: Code generation tools create similar patterns.
Solution: Use suppression patterns to exclude generated files. See Suppression Patterns.
“Zone of pain false positives”
Cause: Utility modules are intentionally stable and concrete.
Solution: This is often correct - utility modules should be stable. Consider whether the module should be more abstract.
Further Reading
Robert C. Martin’s Principles
The architectural metrics in debtmap are based on:
- Clean Architecture by Robert C. Martin
- Agile Software Development: Principles, Patterns, and Practices by Robert C. Martin
- Stable Dependencies Principle (SDP)
- Stable Abstractions Principle (SAP)
- Main Sequence distance metric
Related Topics
- Analysis Guide - Complete analysis workflow
- Configuration - Configuration options
- Entropy Analysis - Complexity vs. entropy
- Scoring Strategies - How debt is scored
- Tiered Prioritization - Priority assignment
Boilerplate vs Complexity
Overview
Debtmap distinguishes between boilerplate code (necessary but mechanical patterns) and true complexity (business logic requiring cognitive effort). This distinction is critical for:
- Avoiding false positives in complexity analysis
- Focusing refactoring efforts on actual problems
- Understanding which high-complexity code is acceptable
- Providing actionable recommendations
This chapter explains how Debtmap identifies boilerplate patterns, why they differ from complexity, and how to interpret the analysis results.
The Distinction
What is Boilerplate?
Boilerplate code consists of repetitive, mechanical patterns that are:
- Required by language/framework - Type conversions, trait implementations, builder patterns
- Structurally necessary - Match arms for enums, error propagation, validation chains
- Low cognitive load - Pattern-based code that developers scan rather than deeply analyze
- Not actual complexity - High cyclomatic complexity but mechanistic structure
Examples:
- From trait implementations converting between types
- Display formatting with exhaustive enum match arms
- Builder pattern setters with validation
- Error conversion implementations
- Serialization/deserialization code
What is True Complexity?
True complexity consists of business logic that requires:
- Domain understanding - Knowledge of problem space and requirements
- Cognitive effort - Careful analysis to understand behavior
- Algorithmic decisions - Non-obvious control flow or data transformations
- Maintainability risk - Changes may introduce subtle bugs
Examples:
- Graph traversal algorithms
- Complex business rules with multiple conditions
- State machine implementations with non-trivial transitions
- Performance-critical optimizations
- Error recovery with fallback strategies
Real Example: ripgrep’s defs.rs
The ripgrep codebase provides an excellent real-world example of boilerplate vs complexity.
File: crates/printer/src/defs.rs
This file contains type conversion implementations with high cyclomatic complexity scores but minimal actual complexity:
impl From<HyperlinkFormat> for ColorHyperlink {
    fn from(format: HyperlinkFormat) -> ColorHyperlink {
        match format {
            HyperlinkFormat::Default => ColorHyperlink::default(),
            HyperlinkFormat::Grep => ColorHyperlink::grep(),
            HyperlinkFormat::GrepPlus => ColorHyperlink::grep_plus(),
            HyperlinkFormat::Ripgrep => ColorHyperlink::ripgrep(),
            HyperlinkFormat::FileNone => ColorHyperlink::file_none(),
            // ... 10+ more variants
        }
    }
}
Analysis:
- Cyclomatic Complexity: 15+ (one branch per enum variant)
- Cognitive Complexity: Low (simple delegation pattern)
- Boilerplate Confidence: 95% (trait implementation with mechanical structure)
Why This Matters
Without boilerplate detection, this file would be flagged as:
- High complexity debt
- Requiring refactoring
- Priority for review
With boilerplate detection, it’s correctly classified as:
- Necessary type conversion code
- Low maintenance risk
- Can be safely skipped in debt prioritization
Detection Methodology
Debtmap uses a multi-phase analysis pipeline to detect boilerplate:
Phase 1: Trait Analysis
Identifies trait implementations known to produce boilerplate:
High-confidence boilerplate traits:
- From, Into - Type conversions
- Display, Debug - Formatting
- Default - Default value construction
- Clone, Copy - Value semantics
- Eq, PartialEq, Ord, PartialOrd - Comparisons
- Hash - Hashing implementations
Medium-confidence boilerplate traits:
- Serialize, Deserialize - Serialization
- AsRef, AsMut, Deref, DerefMut - Reference conversions
- Custom builder traits
See src/debt/boilerplate/boilerplate_traits.rs:10-58 for complete trait categorization.
Phase 2: Pattern Analysis
Analyzes code structure for boilerplate patterns:
Pattern 1: Simple Delegation
fn operation(&self) -> Result<T> {
    self.inner.operation() // Single delegation call
}
Score: 90% confidence
Pattern 2: Trivial Match Arms
match variant {
    A => handler_a(),
    B => handler_b(),
    C => handler_c(),
}
Each arm calls a single function with no additional logic. Score: 85% confidence
Pattern 3: Validation Chains
fn validate(&self) -> Result<()> {
    check_condition_1()?;
    check_condition_2()?;
    check_condition_3()?;
    Ok(())
}
Sequential validation with early returns. Score: 75% confidence
Pattern 4: Builder Setters
pub fn with_field(mut self, value: T) -> Self {
    self.field = value;
    self
}
Simple field assignment with fluent return. Score: 95% confidence
See src/debt/boilerplate/pattern_detector.rs:18-82 for pattern detection logic.
Phase 3: Macro Analysis
Detects macro-generated code and provides recommendations:
Derivable Traits:
Debtmap suggests using #[derive(...)] when it detects manual implementations of:
- Clone, Copy, Debug, Default
- Eq, PartialEq, Ord, PartialOrd
- Hash
Custom Macros: Recommends creating custom derive macros for:
- Repeated builder pattern implementations
- Repeated conversion trait implementations
- Repeated validation logic
Existing Crates: Suggests established crates for common patterns:
- derive_more - Extended derive macros
- thiserror - Error type boilerplate
- typed-builder - Builder pattern macros
- delegate - Delegation patterns
See src/debt/boilerplate/macro_recommender.rs:9-136 for macro recommendation logic.
Common Boilerplate Patterns
Type Conversions
// High complexity (15+), but boilerplate
impl From<ConfigFormat> for Config {
    fn from(format: ConfigFormat) -> Config {
        match format {
            ConfigFormat::Json => Config::json(),
            ConfigFormat::Yaml => Config::yaml(),
            ConfigFormat::Toml => Config::toml(),
            // ... many variants
        }
    }
}
Boilerplate Confidence: 90%+
Recommendation: Consider using a macro if pattern repeats
Error Propagation
// High nesting, but boilerplate pattern
fn complex_operation(&self) -> Result<Output> {
    let step1 = self.step_one()
        .context("Step one failed")?;
    let step2 = self.step_two(&step1)
        .context("Step two failed")?;
    let step3 = self.step_three(&step2)
        .context("Step three failed")?;
    Ok(Output::new(step3))
}
Boilerplate Confidence: 75%
Recommendation: Acceptable pattern for error handling
Builder Patterns
// Many methods, but all boilerplate
impl ConfigBuilder {
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }
    pub fn with_retries(mut self, retries: u32) -> Self {
        self.retries = Some(retries);
        self
    }
    // ... 20+ more setters
}
Boilerplate Confidence: 95%
Recommendation: Use typed-builder or similar crate
Display Formatting
use std::fmt::{self, Display, Formatter};

// High complexity due to match, but boilerplate
impl Display for Status {
    fn fmt(&self, f: &mut Formatter) -> fmt::Result {
        match self {
            Status::Pending => write!(f, "pending"),
            Status::Running => write!(f, "running"),
            Status::Success => write!(f, "success"),
            Status::Failed(err) => write!(f, "failed: {}", err),
            // ... many variants
        }
    }
}
Boilerplate Confidence: 90%
Recommendation: Consider using strum or derive_more
Decision Table
Use this table to interpret boilerplate confidence scores:
| Confidence | Interpretation | Action |
|---|---|---|
| 90-100% | Definite boilerplate | Exclude from complexity prioritization; consider macro optimization |
| 70-89% | Probable boilerplate | Review pattern; likely acceptable; low refactoring priority |
| 50-69% | Mixed boilerplate/logic | Investigate; may contain hidden complexity; medium priority |
| 30-49% | Mostly real complexity | Standard complexity analysis; normal refactoring priority |
| 0-29% | True complexity | High priority; focus refactoring efforts here |
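The decision table can be read as a simple lookup. The function below is a hypothetical helper for illustration only; it rounds a 0.0-1.0 confidence to a whole percentage and returns the interpretation column of the table above.

```rust
/// Maps a boilerplate confidence (0.0-1.0) to the interpretation column
/// of the decision table. Out-of-range input is clamped, which is an
/// assumption of this sketch.
pub fn interpret_confidence(confidence: f64) -> &'static str {
    let percent = (confidence.clamp(0.0, 1.0) * 100.0).round() as u32;
    match percent {
        90..=100 => "Definite boilerplate",
        70..=89 => "Probable boilerplate",
        50..=69 => "Mixed boilerplate/logic",
        30..=49 => "Mostly real complexity",
        _ => "True complexity",
    }
}
```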
Example Classifications
Boilerplate (90%+ confidence):
// Simple trait delegation - skip in debt analysis
impl AsRef<str> for CustomString {
    fn as_ref(&self) -> &str {
        &self.inner
    }
}
Mixed (50-69% confidence):
// Match with some logic - review case by case
fn process_event(&mut self, event: Event) -> Result<()> {
    match event {
        Event::Simple => self.handle_simple(), // Boilerplate
        Event::Complex(data) => {              // Real logic
            if data.priority > 10 {
                self.handle_urgent(data)?;
            } else {
                self.queue_normal(data)?;
            }
            self.update_metrics()?;
            Ok(())
        }
    }
}
True Complexity (0-29% confidence):
// Business logic requiring domain knowledge
fn calculate_optimal_strategy(&self, market: &Market) -> Strategy {
    let volatility = market.calculate_volatility();
    let trend = market.detect_trend();
    if volatility > self.risk_threshold {
        if trend.is_bullish() && self.can_hedge() {
            Strategy::hedged_long(self.calculate_position_size())
        } else {
            Strategy::defensive()
        }
    } else {
        Strategy::momentum_based(trend, self.confidence_level())
    }
}
Integration with Complexity Analysis
Boilerplate Scoring
Debtmap calculates a BoilerplateScore for each function:
pub struct BoilerplateScore {
    pub confidence: f64,                 // 0.0-1.0 (0% to 100%)
    pub primary_pattern: Pattern,        // Strongest detected pattern
    pub contributing_patterns: Vec<Pattern>,
    pub macro_recommendation: Option<MacroRecommendation>,
}
Complexity Adjustment
High-confidence boilerplate reduces effective complexity:
effective_complexity = raw_complexity × (1.0 - boilerplate_confidence)
Example:
- Raw cyclomatic complexity: 15
- Boilerplate confidence: 0.90 (90%)
- Effective complexity: 15 × (1.0 - 0.90) = 1.5
This prevents boilerplate from dominating debt prioritization.
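As a worked version of the adjustment formula, the hypothetical function below applies it directly. Clamping the confidence to the 0.0-1.0 range is an assumption of this sketch, not documented behavior.

```rust
/// effective_complexity = raw_complexity × (1.0 - boilerplate_confidence)
pub fn effective_complexity(raw_complexity: u32, boilerplate_confidence: f64) -> f64 {
    // Assumption: out-of-range confidence values are clamped.
    let confidence = boilerplate_confidence.clamp(0.0, 1.0);
    f64::from(raw_complexity) * (1.0 - confidence)
}
```

With the two functions from the output example further down, `effective_complexity(15, 0.92)` is about 1.2 while `effective_complexity(12, 0.15)` is about 10.2, so the function with the lower raw score ranks higher.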
Output Display
Debtmap annotates boilerplate functions in analysis output:
src/types/conversions.rs:
├─ from (complexity: 15, boilerplate: 92%)
│ Pattern: Trait Implementation (From)
│ Recommendation: Consider #[derive(From)] via derive_more
│ Priority: Low (boilerplate)
├─ process_request (complexity: 12, boilerplate: 15%)
│ Priority: High (true complexity)
Best Practices
When to Accept Boilerplate
Accept high-complexity boilerplate when:
- Required by language - Trait implementations, type conversions
- Pattern is clear - Developers can scan quickly without deep analysis
- Covered by tests - Mechanical patterns verified by unit tests
- No simpler alternative - Refactoring would reduce clarity
Example: Exhaustive match arms for enum variants with simple delegation.
When to Refactor Boilerplate
Refactor boilerplate when:
- Pattern repeats extensively - 10+ similar implementations
- Macro alternative exists - Can use derive or custom macro
- Maintenance burden - Changes require updating many copies
- Error-prone - Manual pattern increases bug risk
Example: 50+ builder setters that could use typed-builder crate.
Configuring Thresholds
Adjust boilerplate sensitivity in .debtmap.toml:
[boilerplate_detection]
enabled = true
min_confidence_to_exclude = 0.85 # Only exclude 85%+ confidence
trait_delegation_threshold = 0.90 # Trait impl confidence
pattern_match_threshold = 0.75 # Match pattern confidence
Strict mode (minimize false negatives):
min_confidence_to_exclude = 0.95 # Very high bar for exclusion
Lenient mode (minimize false positives):
min_confidence_to_exclude = 0.70 # More aggressive exclusion
Validation and Testing
Integration Test Example
Debtmap’s test suite includes real-world boilerplate validation:
#[test]
fn test_ripgrep_defs_boilerplate() {
    let code = r#"
        impl From<HyperlinkFormat> for ColorHyperlink {
            fn from(format: HyperlinkFormat) -> ColorHyperlink {
                match format {
                    HyperlinkFormat::Default => ColorHyperlink::default(),
                    // ... 15 variants
                }
            }
        }
    "#;
    let result = analyze_boilerplate(code);
    assert!(result.confidence >= 0.85, "Should detect trait boilerplate");
    assert_eq!(result.primary_pattern, Pattern::TraitImplementation);
}
See tests/boilerplate_integration_test.rs for complete test cases.
Performance Overhead
Boilerplate detection adds minimal overhead:
- Measurement: <5% increase in analysis time
- Reason: Single-pass AST analysis with cached pattern matching
- Optimization: Trait analysis uses fast HashMap lookups
See tests/boilerplate_performance_test.rs for benchmark details.
Troubleshooting
“Why is my code marked as boilerplate?”
Check:
- Is it a trait implementation? (From, Display, etc.)
- Does it follow a mechanical pattern?
- Are all branches simple delegations?
If incorrectly classified:
- Adjust min_confidence_to_exclude threshold
- Report false positive if confidence is very high
“My boilerplate isn’t detected”
Common causes:
- Custom logic mixed with boilerplate pattern
- Non-standard trait names
- Complex match arm logic
Solutions:
- Extract pure boilerplate into separate functions
- Use standard traits when possible
- Check confidence score - may be detected with lower confidence
“Boilerplate detection seems too aggressive”
Adjust configuration:
[boilerplate_detection]
min_confidence_to_exclude = 0.95 # Raise threshold
trait_delegation_threshold = 0.95
Related Documentation
- Complexity Metrics - Understanding cyclomatic complexity
- Configuration - Complete .debtmap.toml reference
- Tiered Prioritization - How boilerplate affects debt ranking
Summary
Boilerplate detection is a critical feature that:
- Distinguishes mechanical patterns from true complexity
- Reduces false positives in debt analysis
- Provides actionable macro recommendations
- Integrates seamlessly with complexity scoring
- Helps teams focus on real maintainability issues
By identifying boilerplate with 85%+ confidence, Debtmap ensures that high-complexity scores reflect actual cognitive burden rather than necessary but mechanical code patterns.
Context Providers
Context providers enhance debtmap’s risk analysis by incorporating additional factors beyond complexity and test coverage. They analyze critical execution paths, dependency relationships, and version control history to provide a more comprehensive understanding of technical risk.
Overview
Context providers implement the ContextProvider trait, which gathers risk-relevant information about functions and modules. Each provider analyzes a specific dimension of risk:
- Critical Path Provider: Identifies functions on critical execution paths
- Dependency Provider: Analyzes call graph relationships and blast radius
- Git History Provider: Integrates version control history for change patterns
Context providers help debtmap understand:
- Which code paths are most critical
- How functions depend on each other
- Which code changes most frequently
- Where bugs are likely to occur
This context-aware analysis improves prioritization accuracy and reduces false positives.
The ContextAggregator combines context from multiple enabled providers and adjusts risk scores using the formula:
contextual_risk = base_risk × (1.0 + context_contribution)
Where context_contribution is the weighted sum of all provider contributions:
context_contribution = Σ(provider.contribution × provider.weight)
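A minimal sketch of the aggregation formula follows. The `ProviderContribution` struct and `contextual_risk` function are hypothetical names for illustration; the actual `ContextAggregator` API and the per-provider weights are not shown in this chapter.

```rust
/// Hypothetical record of one provider's output and its weight.
pub struct ProviderContribution {
    pub contribution: f64,
    pub weight: f64,
}

/// contextual_risk = base_risk × (1.0 + Σ(contribution × weight))
pub fn contextual_risk(base_risk: f64, providers: &[ProviderContribution]) -> f64 {
    let context_contribution: f64 = providers
        .iter()
        .map(|p| p.contribution * p.weight)
        .sum();
    base_risk * (1.0 + context_contribution)
}
```

With no enabled providers the sum is zero and the contextual risk equals the base risk.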
Critical Path Provider
The Critical Path provider identifies functions that lie on critical execution paths through your application. Functions on these paths have elevated risk because failures directly impact user-facing functionality.
Entry Point Detection
The provider automatically detects entry points based on function names and file paths. These weights determine the base criticality of execution paths:
| Entry Type | Weight | Detection Pattern | User-Facing |
|---|---|---|---|
| Main | 10.0 | Function named main | Yes |
| API Endpoint | 8.0 | handle_*, *_handler, get_*, post_* in api/, handler/, route/ paths | Yes |
| CLI Command | 7.0 | cmd_*, command_*, *_command in cli/, command/ paths | Yes |
| Web Handler | 7.0 | Functions with route, handler in web/, http/ paths | Yes |
| Event Handler | 5.0 | on_*, *_listener, contains event | No |
| Test Entry | 2.0 | test_*, in test/ paths | No |
Note on API Endpoint detection: Detection requires BOTH conditions: (1) path contains api/, handler/, or route/ AND (2) function starts with handle_*, get_*, post_*, put_*, delete_* or ends with *_handler. This combined matching ensures accurate classification of HTTP endpoint handlers.
What it detects:
- Entry points (main functions, CLI handlers, API endpoints)
- Error handling paths
- Data processing pipelines
- Resource initialization
Path Weighting
Functions on critical paths receive contribution scores based on:
- Path weight: The maximum entry point weight leading to the function
- User-facing flag: Doubles contribution for user-facing paths
The contribution formula consists of two steps:
// Step 1: Calculate base contribution (normalized 0-1)
base_contribution = path_weight / max_weight
// Step 2: Apply user-facing multiplier
final_contribution = base_contribution × user_facing_multiplier
// Example: main entry path (weight 10.0, user-facing)
base = 10.0 / 10.0 = 1.0
final = 1.0 × 2.0 = 2.0
// Example: event handler path (weight 5.0, non-user-facing)
base = 5.0 / 10.0 = 0.5
final = 0.5 × 1.0 = 0.5
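The two-step calculation can be expressed as compilable Rust. The `EntryType` enum and function names below are hypothetical; the weights and user-facing flags are copied from the entry point table, and the maximum weight is assumed to be the Main weight of 10.0.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EntryType {
    Main,
    ApiEndpoint,
    CliCommand,
    WebHandler,
    EventHandler,
    TestEntry,
}

impl EntryType {
    /// Weights from the entry point table.
    pub fn weight(self) -> f64 {
        match self {
            EntryType::Main => 10.0,
            EntryType::ApiEndpoint => 8.0,
            EntryType::CliCommand | EntryType::WebHandler => 7.0,
            EntryType::EventHandler => 5.0,
            EntryType::TestEntry => 2.0,
        }
    }

    /// User-facing column from the entry point table.
    pub fn is_user_facing(self) -> bool {
        !matches!(self, EntryType::EventHandler | EntryType::TestEntry)
    }
}

/// Assumed normalization constant: the largest weight in the table.
const MAX_WEIGHT: f64 = 10.0;

pub fn critical_path_contribution(entry: EntryType) -> f64 {
    let base = entry.weight() / MAX_WEIGHT; // Step 1: normalize to 0-1
    let multiplier = if entry.is_user_facing() { 2.0 } else { 1.0 };
    base * multiplier // Step 2: apply user-facing multiplier
}
```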
Impact on scoring:
- Functions on critical paths get higher priority
- Entry point multiplier: 1.5x
- Business logic multiplier: 1.2x
Use Cases
- API prioritization: Identify critical endpoints that need careful review
- Refactoring safety: Avoid breaking user-facing execution paths
- Test coverage: Ensure critical paths have adequate test coverage
Enable
debtmap analyze . --context-providers critical_path
Configuration:
[analysis]
context_providers = ["critical_path"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.critical_path]
# Multiplier for entry points (default: 1.5)
entry_point_multiplier = 1.5
# Multiplier for business logic (default: 1.2)
business_logic_multiplier = 1.2
Dependency Provider
The Dependency provider analyzes call graph relationships to identify functions with high architectural impact. It calculates how changes propagate through the dependency graph and determines the blast radius of modifications.
Dependency Chain Analysis
The provider builds a dependency graph where:
- Modules contain functions and have intrinsic risk scores
- Edges represent dependencies with coupling strength (0.0-1.0)
- Risk propagation flows through dependencies using iterative refinement
Convergence Parameters: The risk propagation algorithm uses iterative convergence with a maximum of 10 iterations. Convergence is reached when the maximum risk change between iterations falls below 0.01. This ensures risk stabilizes throughout the dependency graph.
What it detects:
- Upstream dependencies (functions this function calls)
- Downstream dependencies (functions that call this function)
- Transitive dependencies through the call graph
- Dependency criticality
Blast Radius Calculation
The blast radius represents how many modules would be affected by changes to a function. It counts unique modules reachable through transitive dependencies by traversing the dependency graph edges.
| Blast Radius | Contribution | Impact Level |
|---|---|---|
| > 10 modules | 1.5 | Critical dependency affecting many modules |
| > 5 modules | 1.0 | Important dependency with moderate impact |
| > 2 modules | 0.5 | Medium impact |
| ≤ 2 modules | 0.2 | Minimal or isolated component |
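One way to picture the blast radius is a breadth-first traversal that counts the unique modules reachable from a starting module, followed by the table lookup. The graph representation (a map from a module to the modules affected by it) and both function names are assumptions of this sketch.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Counts unique modules reachable from `start`, excluding `start` itself.
/// `affected_by` maps a module to the modules directly affected by it.
pub fn blast_radius(affected_by: &HashMap<&str, Vec<&str>>, start: &str) -> usize {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::from([start]);
    while let Some(module) = queue.pop_front() {
        if let Some(next_modules) = affected_by.get(module) {
            for &next in next_modules {
                // `insert` returns false for already-visited modules,
                // which also keeps cycles from looping forever.
                if next != start && seen.insert(next) {
                    queue.push_back(next);
                }
            }
        }
    }
    seen.len()
}

/// Contribution lookup from the blast radius table.
pub fn blast_radius_contribution(affected_modules: usize) -> f64 {
    match affected_modules {
        n if n > 10 => 1.5,
        n if n > 5 => 1.0,
        n if n > 2 => 0.5,
        _ => 0.2,
    }
}
```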
Risk Propagation Formula
Risk propagation uses an iterative convergence algorithm to stabilize risk scores throughout the dependency graph:
propagated_risk = base_risk × criticality_factor + Σ(caller.risk × 0.3)
where:
    criticality_factor = 1.0 + min(0.5, dependents.len() × 0.1)
    The 0.3 factor dampens risk propagation from callers
Iterative Convergence: The algorithm runs with a maximum of 10 iterations and converges when the maximum risk change between iterations falls below 0.01. This ensures risk stabilizes throughout the dependency graph without requiring manual tuning.
Note: The constants (0.5, 0.1, 0.3) are currently hard-coded based on empirical analysis. Future versions may make these configurable.
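The following sketch shows one possible reading of the propagation formula with the documented limits (10 iterations, 0.01 convergence threshold). The `Node` struct, the index-based caller lists, and the choice to recompute every node from the previous iteration's values are assumptions, not a description of Debtmap's internals.

```rust
/// Hypothetical graph node; `callers` holds indices into the node slice.
pub struct Node {
    pub base_risk: f64,
    pub callers: Vec<usize>,
    pub dependents: usize,
}

const MAX_ITERATIONS: usize = 10;
const CONVERGENCE_THRESHOLD: f64 = 0.01;
const CALLER_DAMPENING: f64 = 0.3;

/// Iterates the propagation formula until the largest per-node change
/// drops below the threshold or the iteration limit is reached.
/// Panics if a caller index is out of bounds.
pub fn propagate_risk(nodes: &[Node]) -> Vec<f64> {
    let mut risk: Vec<f64> = nodes.iter().map(|n| n.base_risk).collect();
    for _ in 0..MAX_ITERATIONS {
        let next: Vec<f64> = nodes
            .iter()
            .map(|n| {
                let criticality = 1.0 + (n.dependents as f64 * 0.1).min(0.5);
                let from_callers: f64 = n
                    .callers
                    .iter()
                    .map(|&c| risk[c] * CALLER_DAMPENING)
                    .sum();
                n.base_risk * criticality + from_callers
            })
            .collect();
        let max_change = next
            .iter()
            .zip(&risk)
            .map(|(new, old)| (new - old).abs())
            .fold(0.0, f64::max);
        risk = next;
        if max_change < CONVERGENCE_THRESHOLD {
            break;
        }
    }
    risk
}
```

The iteration cap matters for cyclic call graphs, where caller risk can feed back into itself and the values may not settle within the threshold.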
Impact on scoring:
dependency_factor = normalized_to_0_10(upstream + downstream)
Ranges:
- Entry points: 8-10 (critical path)
- Business logic: 6-8 (core functionality)
- Data access: 5-7 (important but stable)
- Utilities: 3-5 (lower priority)
- Test helpers: 1-3 (lowest priority)
Use Cases
- Architectural refactoring: Identify high-impact modules to refactor carefully
- Change impact analysis: Understand downstream effects of modifications
- Module decoupling: Find tightly coupled modules with high blast radius
Enable
debtmap analyze . --context-providers dependency
Configuration:
[analysis]
context_providers = ["dependency"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.dependency]
# Include transitive dependencies (default: true)
include_transitive = true
# Maximum depth for transitive analysis (default: 5)
max_depth = 5
Git History Provider
The Git History provider integrates version control data to detect change-prone code and bug patterns. Files with frequent changes and bug fixes indicate higher maintenance risk.
Metrics Collected
The provider analyzes Git history to calculate:
- Change frequency: Commits per month (recent activity indicator)
- Bug density: Ratio of bug fix commits to total commits
- Age: Days since first commit (maturity indicator)
- Author count: Number of unique contributors (complexity indicator)
- Total commits: Total number of commits to the file
- Last modified: Timestamp of the most recent commit
- Stability score: Weighted combination of churn, bug fixes, and age (0.0-1.0)
What it analyzes:
- Commit frequency per file/function
- Bug fix patterns (commits with “fix” in message)
- Code churn (lines added/removed)
- Recent activity
Risk Classification
| Category | Conditions | Contribution | Explanation |
|---|---|---|---|
| Very unstable | freq > 5.0 AND bug_density > 0.3 | 2.0 | High churn with many bug fixes |
| Moderately unstable | freq > 2.0 OR bug_density > 0.2 | 1.0 | Frequent changes or bug-prone |
| Slightly unstable | freq > 1.0 OR bug_density > 0.1 | 0.5 | Some instability |
| Stable | freq ≤ 1.0 AND bug_density ≤ 0.1 | 0.1 | Low change rate, few bugs |
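The table above can be read as a first-match cascade. The sketch below encodes it directly; the function name `git_history_contribution` is hypothetical, and evaluating the rows top to bottom is an assumption about precedence.

```rust
/// Maps change frequency (commits per month) and bug density (0.0-1.0)
/// to a contribution value, checking the table rows in order.
fn git_history_contribution(freq: f64, bug_density: f64) -> f64 {
    if freq > 5.0 && bug_density > 0.3 {
        2.0 // Very unstable
    } else if freq > 2.0 || bug_density > 0.2 {
        1.0 // Moderately unstable
    } else if freq > 1.0 || bug_density > 0.1 {
        0.5 // Slightly unstable
    } else {
        0.1 // Stable
    }
}
```

Note that a file with very high churn but few bug fixes (for example `freq = 6.0`, `bug_density = 0.1`) falls through to the second row, because the first row requires both conditions.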
Bug Fix Detection
The provider identifies bug fixes by searching commit messages for patterns:
git log --grep=fix --grep=bug --grep=Fix --grep=Bug -- <file>
Stability Score
Stability is calculated using weighted factors:
stability = (churn_factor × 0.4) + (bug_factor × 0.4) + (age_factor × 0.2)
where:
  churn_factor = 1.0 / (1.0 + monthly_churn)
  bug_factor = 1.0 - (bug_fixes / total_commits)
  age_factor = min(1.0, age_days / 365.0)
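As a rough sketch, the stability formula translates to Rust as follows. The function name and parameter types are hypothetical, and the handling of a file with zero commits is an assumption that the text does not cover.

```rust
/// Weighted stability score in the range 0.0-1.0 (higher is more stable).
fn stability_score(monthly_churn: f64, bug_fixes: u32, total_commits: u32, age_days: u32) -> f64 {
    let churn_factor = 1.0 / (1.0 + monthly_churn);
    // Assumption: with no commits there are no bug fixes, so bug_factor is 1.0.
    let bug_factor = if total_commits == 0 {
        1.0
    } else {
        1.0 - f64::from(bug_fixes) / f64::from(total_commits)
    };
    // Age saturates after one year.
    let age_factor = (f64::from(age_days) / 365.0).min(1.0);
    churn_factor * 0.4 + bug_factor * 0.4 + age_factor * 0.2
}
```

For example, a year-old file with one change per month and one bug fix in four commits gives 0.5 × 0.4 + 0.75 × 0.4 + 1.0 × 0.2 = 0.7.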
Stability Status Classifications
The provider internally classifies files into stability statuses based on the calculated metrics:
| Status | Criteria | Explanation |
|---|---|---|
| HighlyUnstable | freq > 5.0 AND bug_density > 0.3 | Extremely high churn combined with many bug fixes |
| FrequentlyChanged | freq > 2.0 | High change rate regardless of bug density |
| BugProne | bug_density > 0.2 | High proportion of bug fix commits |
| MatureStable | age > 365 days | Code older than one year (unless unstable) |
| RelativelyStable | (default) | Moderate activity, typical stability |
These classifications are used internally for contribution calculations and appear in verbose output.
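The status table can be sketched as an ordered series of checks. The variant names come from the table; the enum name `StabilityStatus`, the `classify` function, and the top-to-bottom precedence are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum StabilityStatus {
    HighlyUnstable,
    FrequentlyChanged,
    BugProne,
    MatureStable,
    RelativelyStable,
}

/// Classifies a file, checking the unstable statuses before MatureStable
/// so that old but unstable code is not reported as mature.
fn classify(freq: f64, bug_density: f64, age_days: u32) -> StabilityStatus {
    if freq > 5.0 && bug_density > 0.3 {
        StabilityStatus::HighlyUnstable
    } else if freq > 2.0 {
        StabilityStatus::FrequentlyChanged
    } else if bug_density > 0.2 {
        StabilityStatus::BugProne
    } else if age_days > 365 {
        StabilityStatus::MatureStable
    } else {
        StabilityStatus::RelativelyStable
    }
}
```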
Impact on scoring:
- High-churn functions get higher priority
- Recently fixed bugs indicate risk areas
- Stable code (no recent changes) gets lower priority
Use Cases
- Find change-prone code: Identify files that change frequently and need attention
- Detect bug hotspots: Locate areas with high bug fix rates
- Prioritize refactoring: Target unstable code for improvement
- Team collaboration patterns: Files touched by many authors may need better documentation
Enable
debtmap analyze . --context-providers git_history
Configuration:
[analysis]
context_providers = ["git_history"]
# Note: Provider-specific TOML sections below are planned features.
# Currently, providers use hard-coded defaults. Use CLI flags for now.
[context.git_history]
# Commits to analyze (default: 100)
max_commits = 100
# Time range in days (default: 90)
time_range_days = 90
# Minimum commits to consider "high churn" (default: 10)
high_churn_threshold = 10
Troubleshooting
Git repository not found: The provider requires a Git repository. If analysis fails:
# Verify you're in a git repository
git rev-parse --git-dir
# If not a git repo, initialize one or disable git_history provider
# Option 1: Enable context but exclude git_history
debtmap analyze --context --disable-context git_history
# Option 2: Use only specific providers
debtmap analyze --context-providers critical_path,dependency
Performance issues: Git history analysis can be slow for large repositories:
# Use only lightweight providers
debtmap analyze --context-providers critical_path,dependency
Enabling Context Providers
Context-aware analysis is disabled by default. Enable it using CLI flags:
Enable All Providers
# Enable all available context providers
debtmap analyze --context
# or
debtmap analyze --enable-context
Enable Specific Providers
# Enable only critical_path and dependency
debtmap analyze --context-providers critical_path,dependency
# Enable only git_history
debtmap analyze --context-providers git_history
# Enable all three explicitly
debtmap analyze --context-providers critical_path,dependency,git_history
Disable Specific Providers
# Enable context but disable git_history (useful for non-git repos)
debtmap analyze --context --disable-context git_history
# Enable context but disable dependency analysis
debtmap analyze --context --disable-context dependency
Enabling Multiple Providers
Combine providers for comprehensive analysis:
debtmap analyze . --context-providers critical_path,dependency,git_history
Or via config:
[analysis]
context_providers = ["critical_path", "dependency", "git_history"]
Provider Weights
Each provider has a weight that determines its influence on the final risk score:
| Provider | Weight | Rationale |
|---|---|---|
| critical_path | 1.5 | Critical paths have high impact on users |
| dependency_risk | 1.2 | Architectural dependencies affect many modules |
| git_history | 1.0 | Historical patterns indicate future risk |
The total context contribution is calculated as:
total_contribution = Σ(contribution_i × weight_i)
Example with all providers:
critical_path: 2.0 × 1.5 = 3.0
dependency: 1.0 × 1.2 = 1.2
git_history: 0.5 × 1.0 = 0.5
────────────────────────────
total_contribution = 4.7
contextual_risk = base_risk × (1.0 + 4.7) = base_risk × 5.7
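The weighted sum is small enough to sketch in a few lines. The function names and the `(contribution, weight)` tuple representation are hypothetical; the arithmetic follows the formula above.

```rust
/// Sums contribution × weight over (contribution, weight) pairs.
fn total_contribution(providers: &[(f64, f64)]) -> f64 {
    providers.iter().map(|(contribution, weight)| contribution * weight).sum()
}

/// contextual_risk = base_risk × (1.0 + total_contribution)
fn contextual_risk(base_risk: f64, providers: &[(f64, f64)]) -> f64 {
    base_risk * (1.0 + total_contribution(providers))
}
```

Passing `[(2.0, 1.5), (1.0, 1.2), (0.5, 1.0)]` reproduces the worked example: a total contribution of about 4.7, so a base risk of 1.0 becomes roughly 5.7.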
How Context Affects Scoring
Base Scoring (No Context)
Base Score = (Complexity × 0.40) + (Coverage × 0.40) + (Dependency × 0.20)
With Context Providers
Context-Adjusted Score = Base Score × Role Multiplier × Churn Multiplier
Role Multiplier (from critical path & dependency analysis):
- Entry points: 1.5x
- Business logic: 1.2x
- Data access: 1.0x
- Infrastructure: 0.8x
- Utilities: 0.5x
- Test code: 0.1x
Churn Multiplier (from git history):
- High churn (10+ commits/month): 1.3x
- Medium churn (5-10 commits/month): 1.1x
- Low churn (1-5 commits/month): 1.0x
- Stable (0 commits/6 months): 0.8x
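A minimal sketch of the context-adjusted score, using the multipliers listed above. The `Role` enum and function names are hypothetical. The churn ranges in the list overlap at 5 and 10 commits per month and do not say what happens between 0 and 1, so the boundary handling here is an assumption.

```rust
#[derive(Clone, Copy)]
enum Role {
    EntryPoint,
    BusinessLogic,
    DataAccess,
    Infrastructure,
    Utility,
    Test,
}

fn role_multiplier(role: Role) -> f64 {
    match role {
        Role::EntryPoint => 1.5,
        Role::BusinessLogic => 1.2,
        Role::DataAccess => 1.0,
        Role::Infrastructure => 0.8,
        Role::Utility => 0.5,
        Role::Test => 0.1,
    }
}

/// Assumed boundaries: 10 or more is high, 5 up to 10 is medium,
/// exactly zero is stable, everything else is treated as low churn.
fn churn_multiplier(commits_per_month: f64) -> f64 {
    if commits_per_month >= 10.0 {
        1.3
    } else if commits_per_month >= 5.0 {
        1.1
    } else if commits_per_month > 0.0 {
        1.0
    } else {
        0.8
    }
}

/// Context-Adjusted Score = Base Score × Role Multiplier × Churn Multiplier
fn adjusted_score(base_score: f64, role: Role, commits_per_month: f64) -> f64 {
    base_score * role_multiplier(role) * churn_multiplier(commits_per_month)
}
```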
Context Details Structure
When using --format json, context information is included in the output. The ContextDetails enum contains provider-specific data:
CriticalPath
{
"provider": "critical_path",
"weight": 1.5,
"contribution": 2.0,
"details": {
"CriticalPath": {
"entry_points": ["main (Main)", "handle_request (ApiEndpoint)"],
"path_weight": 10.0,
"is_user_facing": true
}
}
}
DependencyChain
{
"provider": "dependency_risk",
"weight": 1.2,
"contribution": 1.5,
"details": {
"DependencyChain": {
"depth": 3,
"propagated_risk": 8.5,
"dependents": ["module_a", "module_b", "module_c"],
"blast_radius": 12
}
}
}
Historical
{
"provider": "git_history",
"weight": 1.0,
"contribution": 1.0,
"details": {
"Historical": {
"change_frequency": 3.5,
"bug_density": 0.25,
"age_days": 180,
"author_count": 5
}
}
}
Practical Examples
Example 1: Entry Point vs Utility
Without context providers:
Function: main() - Entry point
Complexity: 8
Coverage: 50%
Score: 6.0 [MEDIUM]
Function: format_string() - Utility
Complexity: 8
Coverage: 50%
Score: 6.0 [MEDIUM]
Both functions have the same score.
With context providers:
Function: main() - Entry point
Complexity: 8
Coverage: 50%
Base Score: 6.0
Role Multiplier: 1.5x (entry point)
Final Score: 9.0 [CRITICAL]
Function: format_string() - Utility
Complexity: 8
Coverage: 50%
Base Score: 6.0
Role Multiplier: 0.5x (utility)
Final Score: 3.0 [LOW]
Entry point is prioritized over utility.
Example 2: High-Churn Function
Without git history:
Function: process_payment()
Complexity: 12
Coverage: 60%
Score: 7.5 [HIGH]
With git history:
Function: process_payment()
Complexity: 12
Coverage: 60%
Base Score: 7.5
Churn: 15 commits in last month (bug fixes)
Churn Multiplier: 1.3x
Final Score: 9.75 [CRITICAL]
High-churn function is elevated to critical.
Example 3: Stable Well-Tested Code
Without context:
Function: legacy_parser()
Complexity: 15
Coverage: 95%
Score: 3.5 [LOW]
With context:
Function: legacy_parser()
Complexity: 15
Coverage: 95%
Base Score: 3.5
Churn: 0 commits in last 2 years
Churn Multiplier: 0.8x
Role: Data access (stable)
Role Multiplier: 1.0x
Final Score: 2.8 [LOW]
Stable, well-tested code gets even lower priority.
Example 4: API Endpoint Prioritization
Analyze a web service to identify critical API endpoints:
debtmap analyze --context-providers critical_path --format json
Functions on API endpoint paths will receive elevated risk scores. Use this to prioritize code review and testing for user-facing functionality.
Example 5: Finding Change-Prone Code
Identify files with high change frequency and bug fixes:
debtmap analyze --context-providers git_history --top 20
This highlights unstable areas of the codebase that may benefit from refactoring or increased test coverage.
Example 6: Architectural Impact Analysis
Find high-impact modules with large blast radius:
debtmap analyze --context-providers dependency --format json | \
jq '.[] | select(.blast_radius > 10)'
Use this to identify architectural choke points that require careful change management.
Example 7: Comprehensive Risk Assessment
Combine all providers for holistic risk analysis:
debtmap analyze --context -v
The verbose output shows how each provider contributes to the final risk score:
function: process_payment
base_risk: 5.0
critical_path: +3.0 (on main path, user-facing)
dependency: +1.2 (12 dependent modules)
git_history: +1.0 (3.5 changes/month, 0.25 bug density)
──────────────────
contextual_risk: 31.0
Configuration
⚠️ Configuration Limitation: Provider-specific TOML configuration sections shown below are planned features not yet implemented. Currently, all provider settings use hard-coded defaults from the implementation. Use CLI flags (--context, --context-providers, --disable-context) to control providers. See the CLI examples throughout this chapter for working configurations.
Configure context providers in .debtmap.toml:
[analysis]
# Enable context-aware analysis (default: false)
enable_context = true
# Specify which providers to use
context_providers = ["critical_path", "dependency", "git_history"]
# Disable specific providers (use CLI flag --disable-context instead)
# disable_context = ["git_history"] # Not yet implemented in config
[context.git_history]
# Commits to analyze (default: 100) - PLANNED
max_commits = 100
# Time range in days (default: 90) - PLANNED
time_range_days = 90
# Minimum commits to consider "high churn" (default: 10) - PLANNED
high_churn_threshold = 10
[context.critical_path]
# Multiplier for entry points (default: 1.5) - PLANNED
entry_point_multiplier = 1.5
# Multiplier for business logic (default: 1.2) - PLANNED
business_logic_multiplier = 1.2
[context.dependency]
# Include transitive dependencies (default: true) - PLANNED
include_transitive = true
# Maximum depth for transitive analysis (default: 5) - PLANNED
max_depth = 5
Performance Considerations
Context providers add computational overhead to analysis:
Impact on analysis time:
- Critical path: +10-15% (fast - call graph traversal)
- Dependency: +20-30% (moderate - iterative risk propagation)
- Git history: +30-50% (slow for large repos - multiple git commands per file)
Combined overhead: ~60-80% increase in analysis time
Optimization Tips
- Start minimal: Use --context-providers critical_path,dependency initially
- Add git_history selectively: Enable for critical modules only
- Use caching: The ContextAggregator caches results by file:function key
- Profile with verbose flags: Use -vvv to see provider execution times
For Large Projects
# Disable git history for faster analysis
debtmap analyze . --context --disable-context git_history
# Or disable all context
debtmap analyze . --no-context-aware
For CI/CD
# Full analysis with context (run nightly)
debtmap analyze . --context-providers critical_path,dependency,git_history
# Fast analysis without context (run on every commit)
debtmap analyze . --no-context-aware
When to Use Each Provider
| Scenario | Recommended Providers |
|---|---|
| API service refactoring | critical_path |
| Legacy codebase analysis | git_history |
| Microservice boundaries | dependency |
| Pre-release risk review | All providers (--context) |
| CI/CD integration | critical_path,dependency (faster) |
Troubleshooting
Git History Analysis Slow
Issue: Analysis takes much longer with git history enabled
Solutions:
Reduce commit history (planned setting; provider-specific TOML sections are not yet implemented):
[context.git_history]
max_commits = 50
time_range_days = 30
Use shallow clone in CI:
git clone --depth 50 repo.git
debtmap analyze . --context-providers critical_path,dependency
Incorrect Role Classification
Issue: Function classified as wrong role (e.g., utility instead of business logic)
Possible causes:
- Function naming doesn’t match patterns
- Call graph analysis incomplete
- Function is misplaced in codebase
Solutions:
Check with verbose output:
debtmap analyze . -vv | grep "Role classification"
Manually verify call graph:
debtmap analyze . --show-call-graph
Context Providers Not Available
Issue: --context-providers flag not recognized
Solution: Ensure you’re using a recent version:
debtmap --version
# Should be 0.2.0 or later
Update debtmap:
cargo install debtmap --force
Common Issues
Issue: Context providers not affecting scores
Solution: Ensure providers are enabled with --context or --context-providers
# Wrong: context flag missing
debtmap analyze
# Correct: context enabled
debtmap analyze --context
Issue: Git history provider fails with “Not a git repository”
Solution: Disable git_history if not using version control
debtmap analyze --context --disable-context git_history
Issue: Dependency analysis errors
Solution: Check for circular dependencies or disable dependency provider
debtmap analyze --context --disable-context dependency
Issue: Slow analysis with all providers
Solution: Use selective providers or increase verbosity to identify bottlenecks
# Faster: skip git_history
debtmap analyze --context-providers critical_path,dependency
# Debug: see provider execution times
debtmap analyze --context -vvv
For more troubleshooting guidance, see the Troubleshooting chapter.
Advanced Usage
Interpreting Context Contribution
Enable verbose output to see detailed context contributions:
debtmap analyze --context -v
Each function shows:
- Base risk score from complexity/coverage
- Individual provider contributions
- Total contextual risk score
- Provider-specific explanations
Architecture Exploration
The ContextAggregator caches context by file:function key to avoid redundant analysis during a single run.
Cache Lifetime: The cache is in-memory per ContextAggregator instance and is cleared when a new instance is created or when analyzing a different codebase. This enables efficient re-analysis within the same run without requiring external cache management:
let mut aggregator = ContextAggregator::new()
    .with_provider(Box::new(CriticalPathProvider::new(analyzer)))
    .with_provider(Box::new(DependencyRiskProvider::new(graph)))
    .with_provider(Box::new(GitHistoryProvider::new(repo_root)?));
let context = aggregator.analyze(&target);
let contribution = context.total_contribution();
Custom Provider Implementation
Advanced users can implement custom context providers by implementing the ContextProvider trait:
pub trait ContextProvider: Send + Sync {
    fn name(&self) -> &str;
    fn gather(&self, target: &AnalysisTarget) -> Result<Context>;
    fn weight(&self) -> f64;
    fn explain(&self, context: &Context) -> String;
}
See src/risk/context/mod.rs for implementation examples.
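To show the shape of an implementation, here is a self-contained sketch. The `AnalysisTarget` and `Context` structs and the `ProviderResult` alias below are stand-ins invented so the example compiles on its own; the real debtmap types have different fields and the real trait returns the crate's own `Result`. `PathKeywordProvider` is a hypothetical provider, not part of debtmap.

```rust
// Stand-in definitions (assumptions): the real types live in debtmap.
type ProviderResult<T> = std::result::Result<T, String>;

struct AnalysisTarget {
    file: String,
    function: String,
}

struct Context {
    provider: String,
    contribution: f64,
    explanation: String,
}

// Same method set as the trait shown above, using the stand-in types.
trait ContextProvider: Send + Sync {
    fn name(&self) -> &str;
    fn gather(&self, target: &AnalysisTarget) -> ProviderResult<Context>;
    fn weight(&self) -> f64;
    fn explain(&self, context: &Context) -> String;
}

/// Hypothetical provider: raises risk for files whose path contains a keyword.
struct PathKeywordProvider {
    keyword: String,
}

impl ContextProvider for PathKeywordProvider {
    fn name(&self) -> &str {
        "path_keyword"
    }

    fn gather(&self, target: &AnalysisTarget) -> ProviderResult<Context> {
        let hit = target.file.contains(&self.keyword);
        Ok(Context {
            provider: self.name().to_string(),
            contribution: if hit { 1.0 } else { 0.0 },
            explanation: format!("{} in {}", target.function, target.file),
        })
    }

    fn weight(&self) -> f64 {
        1.0
    }

    fn explain(&self, context: &Context) -> String {
        format!(
            "{}: contribution {:.1} ({})",
            context.provider, context.contribution, context.explanation
        )
    }
}
```

The `Send + Sync` bound means a provider should hold only thread-safe state, which fits the parallel analysis described elsewhere in this guide.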
Future Enhancements
Business Context Provider (Planned)
A Business context provider is defined but not yet implemented. It will support:
Business {
    priority: Priority,       // Critical, High, Medium, Low
    impact: Impact,           // Revenue, UserExperience, Security, Compliance
    annotations: Vec<String>, // Custom business metadata
}
This will allow manual prioritization based on business requirements through code annotations or configuration files.
Best Practices
- Use all providers for comprehensive analysis - Especially for production code
- Disable git history in CI - Use shallow clones or disable for speed
- Verify role classifications - Use -vv to see how functions are classified
- Tune multipliers for your project - Adjust in config based on architecture
- Combine with coverage data - Context providers enhance coverage-based risk analysis
Summary
Context providers transform debtmap from a static complexity analyzer into a comprehensive risk assessment tool. By combining:
- Critical path analysis for user impact
- Dependency analysis for architectural risk
- Git history analysis for maintenance patterns
You gain actionable insights for prioritizing technical debt and refactoring efforts. Start with --context to enable all providers, then refine based on your project’s needs.
See Also
- Analysis Guide - Core analysis concepts
- Risk Assessment - Risk scoring methodology
- Configuration - Complete configuration reference
- Parallel Processing - Performance optimization
- Troubleshooting - Common issues and solutions
Coverage Gap Analysis
Debtmap provides precise line-level coverage gap reporting to help you understand exactly which lines of code lack test coverage, rather than relying on misleading function-level percentages.
Understanding Coverage Gaps
A coverage gap represents the portion of a function that is not executed during tests. Traditional tools report this as a simple percentage (e.g., “50% covered”), but this can be misleading:
- A 100-line function with 1 uncovered line shows “99% covered” - sounds great!
- A 10-line function with 1 uncovered line shows “90% covered” - sounds worse, although the actual gap is the same single line
Debtmap improves on this by:
- Reporting the actual number of uncovered lines
- Showing which specific lines are uncovered
- Calculating the gap as a percentage of instrumented lines (not total lines)
- Providing visual severity indicators based on gap size
Precise vs Estimated Gaps
Debtmap uses different precision levels depending on available coverage data:
Precise Gaps (Line-Level Data Available)
When LCOV coverage data is available, debtmap provides exact line-level reporting:
Business logic - 1 line uncovered (11% gap) - line 52
Complex calculation - 4 lines uncovered (20% gap) - lines 10-12, 15
Benefits:
- Exact line numbers for uncovered code
- Accurate gap percentage based on instrumented lines
- Compact line range formatting (e.g., “10-12, 15, 20-21”)
- Distinguishes between code that can’t be instrumented vs uncovered code
How it works:
- Debtmap reads LCOV coverage data from your test runs
- Matches functions by file path and name
- Extracts precise uncovered line numbers
- Calculates percentage as: (uncovered_lines / instrumented_lines) * 100
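The compact range formatting and the gap percentage can be sketched as below. Both function names are hypothetical, and returning 0.0 for a function with no instrumented lines is an assumption.

```rust
/// Collapses line numbers into compact ranges, e.g. "10-12, 15, 20-21".
fn format_line_ranges(lines: &[u32]) -> String {
    let mut sorted = lines.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut parts: Vec<String> = Vec::new();
    let mut i = 0;
    while i < sorted.len() {
        let start = sorted[i];
        let mut end = start;
        // Extend the range while the next line number is consecutive.
        while i + 1 < sorted.len() && sorted[i + 1] == end + 1 {
            i += 1;
            end = sorted[i];
        }
        parts.push(if start == end {
            start.to_string()
        } else {
            format!("{start}-{end}")
        });
        i += 1;
    }
    parts.join(", ")
}

/// Gap as a percentage of instrumented lines (not total lines).
fn gap_percentage(uncovered: usize, instrumented: usize) -> f64 {
    if instrumented == 0 {
        0.0 // Assumption: nothing instrumented means no measurable gap.
    } else {
        uncovered as f64 / instrumented as f64 * 100.0
    }
}
```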
Estimated Gaps (Function-Level Data Only)
When only function-level coverage percentages are available:
Data processing - ~50% gap (estimated, ~25 lines)
Helper function - ~100% gap (estimated, 15 lines)
Utility - ~3% gap (mostly covered)
Characteristics:
- Estimates uncovered line count from percentage
- Uses tilde (~) prefix to indicate estimation
- Special cases:
- ≥99% gap → “~100% gap”
- <5% gap → “mostly covered”
- Otherwise → “~X% gap (estimated, ~Y lines)”
How it works:
- Falls back when LCOV data unavailable or function not found
- Calculates: estimated_uncovered = total_lines * (gap_percentage / 100)
- Useful for quick overview but less actionable than precise gaps
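A sketch of the estimated-gap formatting, modeled on the three sample lines above. The function name is hypothetical, and the exact output strings and rounding are inferred from the examples rather than taken from the implementation.

```rust
/// Formats an estimated gap from a function-level gap percentage.
fn describe_estimated_gap(gap_percentage: f64, total_lines: usize) -> String {
    if gap_percentage >= 99.0 {
        // Effectively uncovered: report the whole function.
        format!("~100% gap (estimated, {total_lines} lines)")
    } else if gap_percentage < 5.0 {
        format!("~{gap_percentage:.0}% gap (mostly covered)")
    } else {
        // estimated_uncovered = total_lines * (gap_percentage / 100)
        let estimated = (total_lines as f64 * gap_percentage / 100.0).round() as usize;
        format!("~{gap_percentage:.0}% gap (estimated, ~{estimated} lines)")
    }
}
```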
Unknown Coverage
When no coverage data is available:
Untested module - Coverage data unavailable (42 lines)
This typically occurs when:
- No coverage collection has been run
- File not included in coverage report
- Coverage data file path mismatch
Gap Severity Indicators
Debtmap uses visual indicators to quickly identify the severity of coverage gaps:
| Indicator | Range | Severity | Meaning |
|---|---|---|---|
| 🟡 | 1-25% | LOW | Minor gaps, mostly covered |
| 🟠 | 26-50% | MODERATE | Significant gaps, needs attention |
| 🔴 | 51-75% | HIGH | Major gaps, high priority |
| 🔴🔴 | 76-100% | CRITICAL | Severe gaps, critical priority |
These indicators appear in debtmap’s priority output to help you quickly identify which functions need testing most urgently.
Severity Calculation
Gap severity is based on the percentage of uncovered code:
fn get_severity(gap_percentage: f64) -> &'static str {
    match gap_percentage {
        p if p <= 25.0 => "🟡 LOW",
        p if p <= 50.0 => "🟠 MODERATE",
        p if p <= 75.0 => "🔴 HIGH",
        _ => "🔴🔴 CRITICAL",
    }
}
This works for both precise and estimated gaps, ensuring consistent severity classification across your codebase.
Example Output
High Verbosity Mode
Priority 1: Authentication Logic (CRITICAL)
File: src/auth/login.rs:45
Coverage Gap: 2 lines uncovered (89% gap) 🔴🔴 - lines 67, 89
Complexity: Cyclomatic 8, Cognitive 12
Impact: High-risk business logic with critical coverage gaps
Priority 2: Data Validation (HIGH)
File: src/validation/rules.rs:120
Coverage Gap: 15 lines uncovered (65% gap) 🔴 - lines 145-152, 167-173
Complexity: Cyclomatic 5, Cognitive 8
Impact: Complex validation logic needs comprehensive testing
Priority 3: Helper Function (MODERATE)
File: src/utils/helpers.rs:30
Coverage Gap: ~45% gap (estimated, ~12 lines) 🟠
Complexity: Cyclomatic 3, Cognitive 4
Impact: Moderate complexity with estimated coverage gaps
Standard Mode
1. Authentication Logic (src/auth/login.rs:45)
Gap: 2 lines uncovered (89%) 🔴🔴 [lines 67, 89]
2. Data Validation (src/validation/rules.rs:120)
Gap: 15 lines uncovered (65%) 🔴 [lines 145-152, 167-173]
3. Helper Function (src/utils/helpers.rs:30)
Gap: ~45% (estimated) 🟠
Integration with Coverage Tools
Generating LCOV Data
For precise gap reporting, generate LCOV coverage data with your test framework:
Rust (using cargo-tarpaulin):
cargo tarpaulin --out Lcov --output-dir ./coverage
Python (using pytest-cov):
pytest --cov=mypackage --cov-report=lcov:coverage/lcov.info
JavaScript (using Jest):
jest --coverage --coverageReporters=lcov
Configuring Debtmap
Point debtmap to your coverage data:
debtmap analyze --lcov ./coverage/lcov.info
Or in .debtmap.toml:
[coverage]
lcov_path = "./coverage/lcov.info"
Best Practices
1. Use Precise Gaps When Possible
Always generate LCOV data for actionable coverage insights:
- Precise line numbers help you quickly locate untested code
- Accurate percentages prevent over/under-estimating gaps
- Line ranges show if gaps are concentrated or scattered
2. Focus on High Severity Gaps First
Prioritize based on severity indicators:
- 🔴🔴 CRITICAL (76-100%) - Address immediately
- 🔴 HIGH (51-75%) - Schedule for next sprint
- 🟠 MODERATE (26-50%) - Address when convenient
- 🟡 LOW (1-25%) - Acceptable for some code
3. Consider Context
Gap severity should be weighted by:
- Function role: Business logic vs utilities
- Complexity: High complexity + high gap = top priority
- Change frequency: Frequently changed code needs better coverage
- Risk: Security, data integrity, financial calculations
4. Track Progress Over Time
Run debtmap regularly to track coverage improvements:
# Weekly coverage check
debtmap analyze --lcov ./coverage/lcov.info > weekly-gaps.txt
Compare reports to see gap reduction progress.
Troubleshooting
“Coverage data unavailable” for all functions
Cause: Debtmap can’t find or parse LCOV file
Solutions:
- Verify --lcov points to a valid LCOV file
- Ensure the LCOV file was generated recently
- Check file permissions (readable by debtmap)
- Validate LCOV format:
head -20 ./coverage/lcov.info
Line numbers don’t match source code
Cause: Source code changed since coverage was generated
Solutions:
- Re-run tests with coverage collection
- Ensure clean build before coverage run
- Commit code before running coverage
Estimated gaps for functions with LCOV data
Cause: Function name or path mismatch
Solutions:
- Check function names match exactly (case-sensitive)
- Verify file paths are consistent (relative vs absolute)
- Enable debug logging:
debtmap analyze --log-level debug
Missing functions in coverage report
Cause: Functions not instrumented or filtered out
Solutions:
- Check coverage tool configuration
- Ensure test execution reaches those functions
- Verify functions aren’t in excluded paths
Related Topics
- Coverage Integration - Detailed coverage tool setup
- Tiered Prioritization - How coverage gaps affect priority
- Scoring Strategies - Coverage weight in debt scoring
- Metrics Reference - All coverage-related metrics
Coverage Integration
Coverage integration is one of Debtmap’s most powerful capabilities, enabling risk-based prioritization by correlating complexity metrics with test coverage. This helps you identify truly risky code—functions that are both complex and untested—rather than just highlighting complex but well-tested functions.
Why Coverage Matters
Without coverage data, complexity analysis shows you what’s complex, but not what’s risky. A complex function with 100% test coverage poses far less risk than a simple function with 0% coverage on a critical path.
Coverage integration transforms Debtmap from a complexity analyzer into a risk assessment tool:
- Prioritize testing efforts: Focus on high-complexity functions with low coverage
- Validate refactoring safety: See which complex code is already protected by tests
- Risk-based sprint planning: Surface truly risky code ahead of well-tested complexity
- Quantify risk reduction: Measure how coverage improvements reduce project risk
LCOV Format: The Universal Standard
Debtmap uses the LCOV format for coverage data. LCOV is a language-agnostic standard supported by virtually all coverage tools across all major languages.
Why LCOV?
- Universal compatibility: Works with Rust, Python, JavaScript, TypeScript, Go, and more
- Tool independence: Not tied to any specific test framework
- Simple text format: Easy to inspect and debug
- Widely supported: Generated by most modern coverage tools
LCOV File Structure
An LCOV file contains line-by-line coverage information:
SF:src/analyzer.rs
FN:42,calculate_complexity
FNDA:15,calculate_complexity
DA:42,15
DA:43,15
DA:44,12
DA:45,0
LH:3
LF:4
end_of_record
- SF: - Source file path
- FN: - Function name and starting line
- FNDA: - Function execution count
- DA: - Line execution data (line number, hit count)
- LH: - Lines hit
- LF: - Lines found (total)
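To make the record structure concrete, the following sketch parses the `SF:`, `DA:`, and `end_of_record` lines and derives the line totals from the `DA:` entries. It is a simplified reader written for this guide, not debtmap's parser; the `FileCoverage` struct and `parse_lcov` function are hypothetical, and all other record types are ignored.

```rust
#[derive(Debug, Default, PartialEq)]
struct FileCoverage {
    path: String,
    lines_found: usize,
    lines_hit: usize,
    uncovered: Vec<u32>,
}

/// Parses SF/DA/end_of_record lines; other record types are skipped.
fn parse_lcov(input: &str) -> Vec<FileCoverage> {
    let mut records = Vec::new();
    let mut current: Option<FileCoverage> = None;

    for line in input.lines().map(str::trim) {
        if let Some(path) = line.strip_prefix("SF:") {
            current = Some(FileCoverage { path: path.to_string(), ..Default::default() });
        } else if let Some(data) = line.strip_prefix("DA:") {
            // DA:<line number>,<hit count>
            let mut fields = data.split(',');
            let line_no = fields.next().and_then(|s| s.parse::<u32>().ok());
            let hits = fields.next().and_then(|s| s.parse::<u64>().ok());
            if let (Some(record), Some(line_no), Some(hits)) = (current.as_mut(), line_no, hits) {
                record.lines_found += 1;
                if hits > 0 {
                    record.lines_hit += 1;
                } else {
                    record.uncovered.push(line_no);
                }
            }
        } else if line == "end_of_record" {
            if let Some(record) = current.take() {
                records.push(record);
            }
        }
    }
    records
}
```

Run against the sample record above, this yields four lines found, three hit, and line 45 uncovered, matching the `LF:4` and `LH:3` totals.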
Generating Coverage Data
Rust: cargo-tarpaulin
Installation:
cargo install cargo-tarpaulin
Generate LCOV:
cargo tarpaulin --out Lcov --output-dir target/coverage
Analyze with Debtmap:
debtmap analyze . --lcov target/coverage/lcov.info
Common Issues:
- Ensure tests compile before running tarpaulin
- Use --ignore-tests if tests themselves show up in coverage
- Check paths match your project structure (relative to project root)
JavaScript/TypeScript: Jest
Configuration (package.json or jest.config.js):
{
"jest": {
"coverageReporters": ["lcov", "text"]
}
}
Generate Coverage:
npm test -- --coverage
Analyze with Debtmap:
debtmap analyze . --lcov coverage/lcov.info
Python: pytest-cov
Installation:
pip install pytest-cov
Generate LCOV:
pytest --cov=src --cov-report=lcov
Analyze with Debtmap:
debtmap analyze . --lcov coverage.lcov
Go: go test with gocover-cobertura
Generate Coverage:
go test -coverprofile=coverage.out ./...
gocover-cobertura < coverage.out > coverage.xml
# Convert to LCOV using lcov tools
Note: Go’s native coverage format requires conversion. Most CI systems support LCOV conversion plugins.
How Coverage Affects Scoring
Coverage data fundamentally changes how Debtmap calculates debt scores. The scoring system operates in two different modes depending on whether coverage data is available.
Scoring Modes
Mode 1: With Coverage Data (Dampening Multiplier)
When you provide an LCOV file with --lcov, coverage acts as a dampening multiplier that reduces scores for well-tested code:
Base Score = (Complexity Factor × 0.50) + (Dependency Factor × 0.25)
Coverage Multiplier = 1.0 - coverage_percentage
Final Score = Base Score × Coverage Multiplier
This is the current implementation as of spec 122. Coverage dampens the base score rather than contributing as an additive component.
Mode 2: Without Coverage Data (Weighted Sum)
When no coverage data is available, Debtmap falls back to a weighted sum model:
Final Score = (Coverage × 0.50) + (Complexity × 0.35) + (Dependency × 0.15)
In this mode, coverage is assumed to be 0% (worst case), giving it a weight of 50% in the total score. See src/priority/scoring/calculation.rs:119-129 for the implementation.
Coverage Dampening Multiplier
When coverage data is provided, it acts as a multiplier that dampens the base score:
Coverage Multiplier = 1.0 - coverage_percentage
Final Score = Base Score × Coverage Multiplier
Examples:
| Base Score | Coverage | Multiplier | Final Score | Priority |
|---|---|---|---|---|
| 8.5 | 100% | 0.0 | 0.0 | Minimal (well-tested) |
| 8.5 | 50% | 0.5 | 4.25 | Medium |
| 8.5 | 0% | 1.0 | 8.5 | High (untested) |
Key Insight: Complex but well-tested code automatically drops in priority, while untested complex code rises to the top.
Special Cases:
- Test functions: Coverage multiplier = 0.0 (tests get near-zero scores regardless of complexity)
- Entry points: Handled through semantic classification (FunctionRole) system with role multipliers, not coverage-specific weighting
Invariant: Total debt score with coverage ≤ total debt score without coverage.
Implementation: See src/priority/scoring/calculation.rs:68-82 for the coverage dampening calculation.
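The dampening rule reduces to a single multiplication. In this sketch the function name and the `is_test` flag are hypothetical, coverage is expressed as a fraction between 0.0 and 1.0, and clamping out-of-range input is an added assumption.

```rust
/// Final Score = Base Score × (1.0 - coverage), with test functions forced to 0.
fn dampened_score(base_score: f64, coverage: f64, is_test: bool) -> f64 {
    let multiplier = if is_test {
        0.0
    } else {
        1.0 - coverage.clamp(0.0, 1.0)
    };
    base_score * multiplier
}
```

With a base score of 8.5, coverage of 1.0, 0.5, and 0.0 gives 0.0, 4.25, and 8.5, matching the table above. Because the multiplier never exceeds 1.0, the dampened score can never exceed the base score.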
Transitive Coverage Propagation
Debtmap doesn’t just look at direct coverage—it propagates coverage through the call graph using transitive analysis.
How It Works
A function’s effective coverage considers:
- Direct coverage: Lines executed by tests
- Caller coverage: Coverage of functions that call this function
Transitive Coverage = Direct Coverage + Σ(Caller Coverage × Weight)
Algorithm Parameters
The transitive coverage propagation uses carefully tuned parameters to balance accuracy and performance:
- Well-Tested Threshold: 80% - Only functions with ≥80% direct coverage contribute to indirect coverage, ensuring high confidence
- Distance Discount: 70% per hop - Each level of indirection reduces contribution by 30%, reflecting decreased confidence
- Maximum Distance: 3 hops - Limits recursion depth to prevent exponential complexity (after 3 hops, contribution drops to ~34%)
These parameters ensure that indirect coverage signals are meaningful while preventing false confidence from distant call relationships. See src/priority/coverage_propagation.rs:38-46 for the implementation.
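The three parameters can be illustrated with a per-caller contribution function. This is a sketch only: the function name is hypothetical, and how debtmap aggregates contributions from several callers into a final transitive coverage value is not shown here.

```rust
const WELL_TESTED_THRESHOLD: f64 = 0.8;
const DISTANCE_DISCOUNT: f64 = 0.7;
const MAX_DISTANCE: u32 = 3;

/// Coverage contributed by one caller that is `hops` calls away.
/// Returns 0.0 when the caller is not well tested or is too far away.
fn indirect_contribution(caller_coverage: f64, hops: u32) -> f64 {
    if hops == 0 || hops > MAX_DISTANCE || caller_coverage < WELL_TESTED_THRESHOLD {
        return 0.0;
    }
    // Each hop keeps 70% of the previous level's contribution.
    caller_coverage * DISTANCE_DISCOUNT.powi(hops as i32)
}
```

A fully covered caller three hops away contributes 0.7³ ≈ 0.34, which is the "~34%" figure quoted above.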
Why It Matters
A function with 0% direct coverage might have high transitive coverage if it’s only called by well-tested functions:
// direct_coverage = 0%
// But called only by `process_request` (100% coverage)
// → transitive_coverage = 85%
fn validate_input(data: &str) -> bool {
    !data.is_empty()
}

// direct_coverage = 100%
fn process_request(input: String) -> Result<(), &'static str> {
    if !validate_input(&input) {
        return Err("Invalid");
    }
    // ...
    Ok(())
}
Effect: validate_input has reduced urgency because it’s only reachable through well-tested code paths.
Performance Characteristics
Coverage integration is highly optimized for large codebases:
- Index Build: O(n), ~20-30ms for 5,000 functions
- Exact Lookup: O(1), ~0.5μs per lookup
- Fallback Lookup: O(files) iteration with O(1) per-file lookup, ~5-8μs when exact match fails. Uses suffix matching and normalized path equality strategies
- Memory Usage: ~200 bytes per record (~2MB for 5,000 functions)
- Thread Safety: Lock-free parallel access via Arc<CoverageIndex>
- Analysis Overhead: ~2.5x baseline (target: ≤3x)
Result: Coverage integration adds minimal overhead even on projects with thousands of functions.
Implementation: See src/risk/coverage_index.rs:13-38 for the CoverageIndex struct and performance documentation.
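The two lookup strategies can be sketched with nested hash maps. `SketchCoverageIndex` is a hypothetical stand-in, not the real `CoverageIndex`; it shows only an exact lookup followed by a per-file scan that uses component-wise suffix matching, and leaves out path normalization and the other details of the actual implementation.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical index: file path -> function name -> coverage fraction.
struct SketchCoverageIndex {
    by_file: HashMap<String, HashMap<String, f64>>,
}

impl SketchCoverageIndex {
    fn new() -> Self {
        Self { by_file: HashMap::new() }
    }

    fn insert(&mut self, file: &str, function: &str, coverage: f64) {
        self.by_file
            .entry(file.to_string())
            .or_default()
            .insert(function.to_string(), coverage);
    }

    fn lookup(&self, file: &str, function: &str) -> Option<f64> {
        // Exact lookup: two hash probes.
        if let Some(coverage) = self.by_file.get(file).and_then(|f| f.get(function)) {
            return Some(*coverage);
        }
        // Fallback: scan files, accepting a path-suffix match in either direction.
        let query = Path::new(file);
        self.by_file
            .iter()
            .filter(|(indexed, _)| {
                query.ends_with(indexed.as_str()) || Path::new(indexed.as_str()).ends_with(query)
            })
            .find_map(|(_, functions)| functions.get(function).copied())
    }
}
```

`Path::ends_with` compares whole path components, so `src/xanalyzer.rs` does not match `analyzer.rs`. Wrapping such an index in `Arc` once it is built would allow shared read-only access from worker threads.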
CLI Options Reference
Primary Coverage Options
# Provide LCOV coverage file
debtmap analyze . --coverage-file path/to/lcov.info
# Shorthand alias
debtmap analyze . --lcov path/to/lcov.info
Context Providers
Coverage can be combined with other context providers for nuanced risk assessment:
# Enable all context providers (includes coverage propagation)
debtmap analyze . --lcov coverage.info --enable-context
# Specify specific providers
debtmap analyze . --lcov coverage.info \
--context-providers critical_path,dependency,git_history
# Disable specific providers
debtmap analyze . --lcov coverage.info \
--disable-context git_history
Available Context Providers:
- critical_path: Identifies functions on critical execution paths
- dependency: Analyzes dependency relationships and impact
- git_history: Uses change frequency from version control
See Scoring Strategies for details on how these combine.
Validate Command Support
The validate command also supports coverage integration for risk-based quality gates:
# Fail CI builds if untested complex code exceeds thresholds
debtmap validate . --lcov coverage.info --max-debt-density 50
See CLI Reference for complete validation options.
Troubleshooting Coverage Integration
Coverage Not Correlating with Functions
Symptoms:
- Debtmap shows 0% coverage for all functions
- Warning: “No coverage data correlated with analyzed functions”
Solutions:
- Verify LCOV Format:
head coverage.info
# Should show: SF:, FN:, DA: lines
- Check Path Matching: Coverage file paths must be relative to project root:
# Good: SF:src/analyzer.rs
# Bad: SF:/home/user/project/src/analyzer.rs
- Enable Verbose Logging:
debtmap analyze . --lcov coverage.info -vv
This shows coverage lookup details for each function.
- Verify Coverage Tool Output:
# Ensure your coverage tool generated line data (DA: records)
grep "^DA:" coverage.info | head
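The path-matching fix in the list above (relative rather than absolute SF: paths) amounts to stripping the project root from each SF: record. A minimal sketch follows, assuming a hypothetical helper relativize_sf_line that you would apply to each line of the LCOV file before passing it to Debtmap; it is not part of Debtmap itself.

```rust
/// Rewrite an LCOV `SF:` record so its path is relative to `project_root`.
/// Lines that are not `SF:` records, or whose path does not sit under the
/// root, are returned unchanged.
pub fn relativize_sf_line(line: &str, project_root: &str) -> String {
    let root = project_root.trim_end_matches('/');
    match line.strip_prefix("SF:") {
        Some(path) => match path.strip_prefix(root) {
            // Only strip when the root is followed by a path separator.
            Some(rest) if rest.starts_with('/') => format!("SF:{}", &rest[1..]),
            _ => line.to_string(),
        },
        None => line.to_string(),
    }
}
```

Records such as DA: and FN: pass through untouched, so the function can be mapped over every line of the file.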
Functions Still Show Up Despite 100% Coverage
This is expected behavior when:
- Function has high complexity (cyclomatic > 10)
- Function has other debt issues (duplication, nesting, etc.)
- You’re viewing function-level output (coverage dampens but doesn’t eliminate)
Coverage reduces priority but doesn’t hide issues. Use filters to focus:
# Show only critical and high priority items
debtmap analyze . --lcov coverage.info --min-priority high
# Show top 10 most urgent items
debtmap analyze . --lcov coverage.info --top 10
Coverage File Path Issues
Problem: Can’t find coverage file
Solutions:
# Use absolute path
debtmap analyze . --lcov /absolute/path/to/coverage.info
# Or ensure relative path is from project root
debtmap analyze . --lcov ./target/coverage/lcov.info
LCOV Format Errors
Problem: “Invalid LCOV format” error
Causes:
- Non-LCOV format (Cobertura XML, JaCoCo, etc.)
- Corrupted file
- Wrong file encoding
Solutions:
- Verify your coverage tool is configured for LCOV output
- Check for binary/encoding issues:
file coverage.info
- Regenerate coverage with explicit LCOV format flag
See Troubleshooting for more debugging tips.
Best Practices
Analysis Workflow
- Generate Coverage Before Analysis:
# Rust example
cargo tarpaulin --out lcov --output-dir target/coverage
debtmap analyze . --lcov target/coverage/lcov.info
- Use Coverage for Sprint Planning:
# Focus on untested complex code
debtmap analyze . --lcov coverage.info --top 20
- Combine with Tiered Prioritization: Coverage automatically feeds into Tiered Prioritization:
  - Tier 1: Architectural issues (less affected by coverage)
  - Tier 2: Complex untested code (coverage < 50%, complexity > 15)
  - Tier 3: Testing gaps (coverage < 80%, complexity 10-15)
- Validate Refactoring Impact:
# Before refactoring
debtmap analyze . --lcov coverage-before.info -o before.json
# After refactoring
debtmap analyze . --lcov coverage-after.info -o after.json
# Compare
debtmap compare --before before.json --after after.json
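The Tier 2 and Tier 3 thresholds listed under Tiered Prioritization above can be written as a small classifier. This is a sketch of the stated thresholds only: the enum CoverageTier and the function coverage_tier are hypothetical names, coverage is assumed to be a fraction between 0.0 and 1.0, and Tier 1 is left out because architectural issues are not derived from coverage and complexity alone.

```rust
#[derive(Debug, PartialEq)]
pub enum CoverageTier {
    /// Coverage below 50% and complexity above 15.
    Tier2ComplexUntested,
    /// Coverage below 80% and complexity between 10 and 15 inclusive.
    Tier3TestingGap,
}

/// Returns the coverage-driven tier for a function, or `None` when the
/// function matches neither threshold pair.
pub fn coverage_tier(coverage: f64, complexity: u32) -> Option<CoverageTier> {
    if coverage < 0.50 && complexity > 15 {
        Some(CoverageTier::Tier2ComplexUntested)
    } else if coverage < 0.80 && (10..=15).contains(&complexity) {
        Some(CoverageTier::Tier3TestingGap)
    } else {
        None
    }
}
```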
Testing Strategy
Prioritize testing based on risk:
- High Complexity + Low Coverage = Highest Priority:
debtmap analyze . --lcov coverage.info \
  --filter Risk --min-priority high
- Focus on Business Logic: Entry points and infrastructure code have natural coverage patterns. Focus unit tests on business logic functions.
- Use Dependency Analysis:
debtmap analyze . --lcov coverage.info \
  --context-providers dependency -vv
Test high-dependency functions first; they have the most impact.
- Don’t Over-Test Entry Points: Entry points (main, handlers) are better tested with integration tests, not unit tests. Debtmap applies role multipliers through its semantic classification system (FunctionRole) to adjust scoring for different function types. See src/priority/unified_scorer.rs:149 and src/priority/scoring/classification.rs for the classification system.
Configuration
In .debtmap.toml:
[scoring]
# Default weights for scoring WITHOUT coverage data
# When coverage data IS provided, it acts as a dampening multiplier instead
coverage = 0.50 # Default: 50% (only used when no LCOV provided)
complexity = 0.35 # Default: 35%
dependency = 0.15 # Default: 15%
[thresholds]
# Set minimum risk score to filter low-priority items
minimum_risk_score = 15.0
# Skip simple functions even if uncovered
minimum_cyclomatic_complexity = 5
Important: These weights are from the deprecated additive scoring model. The current implementation (spec 122) calculates a base score from complexity (50%) and dependency (25%) factors, then applies coverage as a dampening multiplier: Final Score = Base Score × (1.0 - coverage_pct). These weights only apply when coverage data is not available. See src/priority/scoring/calculation.rs:68-82 for the coverage dampening calculation and src/priority/scoring/calculation.rs:119-129 for the fallback weighted sum mode.
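The dampening formula quoted above, Final Score = Base Score × (1.0 - coverage_pct), can be sketched directly. The function name dampened_score is hypothetical, and clamping coverage to the 0.0..=1.0 range is an assumption added for safety rather than something the formula states.

```rust
/// Apply coverage as a dampening multiplier to a base score:
/// final = base * (1.0 - coverage), with coverage given as a fraction.
pub fn dampened_score(base_score: f64, coverage: f64) -> f64 {
    // Clamp so malformed coverage values cannot flip the sign of the score.
    base_score * (1.0 - coverage.clamp(0.0, 1.0))
}
```

For example, a base score of 8.0 stays at 8.0 with no coverage and drops to 2.0 at 75% coverage.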
See Configuration for complete options.
CI Integration
Example GitHub Actions Workflow:
- name: Generate Coverage
  run: cargo tarpaulin --out lcov --output-dir target/coverage
- name: Analyze with Debtmap
  run: |
    debtmap analyze . \
      --lcov target/coverage/lcov.info \
      --format json \
      --output debtmap-report.json
- name: Validate Quality Gates
  run: |
    debtmap validate . \
      --lcov target/coverage/lcov.info \
      --max-debt-density 50
Quality Gate Strategy:
- Fail builds on new critical debt (Tier 1 architectural issues)
- Warn on new high-priority untested code (Tier 2)
- Track coverage trends over time with the compare command
Complete Language Examples
Rust End-to-End
# 1. Generate coverage
cargo tarpaulin --out lcov --output-dir target/coverage
# 2. Verify LCOV output
head target/coverage/lcov.info
# 3. Run Debtmap with coverage
debtmap analyze . --lcov target/coverage/lcov.info
# 4. Interpret results (look for [UNTESTED] markers on high-complexity functions)
JavaScript/TypeScript End-to-End
# 1. Configure Jest for LCOV (in package.json or jest.config.js)
# "coverageReporters": ["lcov", "text"]
# 2. Generate coverage
npm test -- --coverage
# 3. Verify LCOV output
head coverage/lcov.info
# 4. Run Debtmap
debtmap analyze . --lcov coverage/lcov.info --languages javascript,typescript
Python End-to-End
# 1. Install pytest-cov
pip install pytest-cov
# 2. Generate LCOV coverage
pytest --cov=src --cov-report=lcov
# 3. Verify output
head coverage.lcov
# 4. Run Debtmap
debtmap analyze . --lcov coverage.lcov --languages python
Go End-to-End
# 1. Generate native coverage
go test -coverprofile=coverage.out ./...
# 2. Convert to LCOV with a Go-profile-to-LCOV converter such as gcov2lcov
# (gocover-cobertura produces Cobertura XML, which is not LCOV)
# Note: This step is tool-dependent
# 3. Run Debtmap
debtmap analyze . --lcov coverage.lcov --languages go
FAQ
Why does my 100% covered function still show up?
Coverage dampens debt scores but doesn’t eliminate debt. A function with cyclomatic complexity 25 and 100% coverage still represents technical debt—it’s just lower priority than untested complex code.
Use filters to focus on high-priority items:
debtmap analyze . --lcov coverage.info --top 10
What’s the difference between direct and transitive coverage?
- Direct coverage: Lines executed directly by tests
- Transitive coverage: Coverage considering call graph (functions called by well-tested code)
Transitive coverage reduces urgency for functions only reachable through well-tested paths.
Should I test everything to 100% coverage?
No. Use Debtmap’s risk scores to prioritize:
- Test high-complexity, low-coverage functions first
- Entry points are better tested with integration tests
- Simple utility functions (complexity < 5) may not need dedicated unit tests
Debtmap helps you achieve optimal coverage, not maximal coverage.
How do I debug coverage correlation issues?
Use verbose logging:
debtmap analyze . --lcov coverage.info -vv
This shows:
- Coverage file parsing details
- Function-to-coverage correlation attempts
- Path matching diagnostics
Can I use coverage with the validate command?
Yes! The validate command supports --lcov for risk-based quality gates:
debtmap validate . --lcov coverage.info --max-debt-density 50
See CLI Reference for details.
Further Reading
- Scoring Strategies - Deep dive into how coverage affects unified scoring
- Tiered Prioritization - How coverage fits into tiered priority levels
- CLI Reference - Complete coverage option documentation
- Configuration - Customizing coverage scoring weights
- Troubleshooting - More debugging tips for coverage issues
Dead Code Analysis
Debtmap’s Python dead code detection system uses advanced static analysis to identify unused functions with high accuracy and low false positive rates. The analyzer integrates multiple detection systems to provide confidence-scored results that help you make informed decisions about code removal.
Overview
The dead code analyzer combines several detection techniques:
- Static call graph analysis - Tracks which functions call each other across your codebase
- Framework pattern detection - Recognizes entry points from Flask, Django, FastAPI, Click, pytest, and more
- Test detection - Identifies test functions and test files to avoid false positives
- Callback tracking - Detects functions registered as callbacks or event handlers
- Import analysis - Tracks which functions are imported and exported by other modules
- Coverage integration - Uses test coverage data when available to identify live code
- Public API detection - Uses heuristics to identify external API functions
This multi-layered approach significantly reduces false positives compared to naive call graph analysis, targeting a false positive rate below 10% (see Spec 116 for confidence scoring validation).
Confidence Scoring
Every analysis result includes a confidence score to help you prioritize code removal:
High Confidence (0.8-1.0)
Safe to remove - These functions are very likely dead code.
Characteristics:
- No static callers found in the codebase
- Not a framework entry point (route, command, view, etc.)
- Not a test function or in a test file
- Not registered as a callback or event handler
- Not exported in __all__ or used in public API patterns
- Often private functions (starting with _)
Example output:
Function: _old_helper
Confidence: High (0.95)
Suggestion: High confidence this function is dead code and can be safely removed.
Medium Confidence (0.5-0.8)
Review recommended - These functions might be dead code but require manual verification.
Characteristics:
- No static callers but function is public
- In a test file but not called by any tests
- Might be used dynamically (via getattr, plugins, etc.)
- Public API that might be used by external code
Example output:
Function: legacy_api_method
Confidence: Medium (0.65)
Suggestion: Medium confidence this function is dead code. Manual verification recommended.
Risks:
- Function is public and may be used by external code.
Low Confidence (0.0-0.5)
Likely in use - These functions are probably not dead code.
Characteristics:
- Has static callers in the codebase
- Framework entry point (Flask route, Django view, Click command)
- Test function (starts with test_, in test file)
- Callback target or event handler
- Magic method (__init__, __str__, etc.)
- Property accessor or descriptor
Example output:
Function: index
Confidence: Low (0.15)
Result: LIVE
Reasons:
- Framework entry point (Flask route)
- Function is public
Public API Detection
Debtmap uses advanced heuristics to identify functions that are likely part of your project’s external API (introduced in Spec 113). This prevents false positives when analyzing library code.
Detection Heuristics
The public API detector considers:
- Public visibility - Function doesn’t start with _
- File location patterns - Functions in api/, public/, or top-level __init__.py files
- Naming conventions - Functions following API naming patterns
- Export declarations - Functions listed in __all__
- Explicit configuration - Functions marked as API in .debtmap.toml
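To make the heuristics above concrete, the sketch below collects which signals fire for a given function. It is an illustration under stated assumptions, not Debtmap's detector: the function public_api_signals is hypothetical, the "top-level __init__.py" rule is approximated by checking the file name only, and the naming-convention and explicit-configuration signals are omitted because the text does not specify them.

```rust
/// Return the public-API signals that match a function, given its name,
/// its file path, and the names exported through `__all__` in that module.
pub fn public_api_signals(name: &str, file: &str, exported: &[&str]) -> Vec<&'static str> {
    let mut signals = Vec::new();

    // Public visibility: no leading underscore.
    if !name.starts_with('_') {
        signals.push("public visibility");
    }

    // File location: an `api` or `public` directory component, or an
    // `__init__.py` file (simplified: any directory level is accepted here).
    let in_api_dir = file.split('/').any(|part| part == "api" || part == "public");
    if in_api_dir || file.ends_with("__init__.py") {
        signals.push("file location");
    }

    // Export declaration: listed in `__all__`.
    if exported.contains(&name) {
        signals.push("listed in __all__");
    }

    signals
}
```

A detector built on such signals would presumably lower the dead-code confidence as more of them match; the weighting is left open here.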
Configuration
Configure public API detection in .debtmap.toml:
[external_api]
# Enable/disable automatic public API detection (default: true)
detect_external_api = true
# Explicitly mark specific functions as external APIs
api_functions = [
"calculate_score", # Just function name
"mylib.api::process_data", # Module-qualified name
"public_handler", # Any function matching this name
]
# Mark entire files as containing external APIs (supports glob patterns)
api_files = [
"src/api/**/*.py", # All Python files in api directory recursively
"src/lib.rs", # Rust library entry point (all public functions)
"src/public_interface.py", # Specific Python file
"**/__init__.py", # All __init__.py files in any directory
"**/public_*.py", # Any file starting with 'public_'
"myapp/api.py", # Specific API module
]
Functions identified as public APIs receive lower dead code confidence scores, even if they have no internal callers.
Framework Support
The analyzer recognizes entry points from popular Python frameworks to avoid false positives:
Web Frameworks
- Flask: @app.route, @app.before_request, @app.after_request, @app.errorhandler
- Django: View functions, admin actions, signal handlers, middleware methods
- FastAPI: @app.get, @app.post, @app.put, @app.delete, route decorators
CLI Frameworks
- Click: @click.command, @click.group, subcommand handlers
- argparse: Functions registered as subcommand handlers
Testing Frameworks
- pytest: Functions starting with test_, @pytest.fixture, parametrized tests
- unittest: TestCase methods, setUp, tearDown, setUpClass, tearDownClass
Event Systems
- Qt/PyQt: Signal connections, slot decorators (@pyqtSlot)
- Tkinter: Event bindings, button command callbacks, widget event handlers
Framework Detection Matrix
| Framework | Pattern | Decorator/Keyword | Detection Method | Example |
|---|---|---|---|---|
| Flask | Routes | @app.route | Decorator analysis | @app.route('/') |
| Flask | Before request | @app.before_request | Decorator analysis | Handler hooks |
| Flask | Error handlers | @app.errorhandler | Decorator analysis | Custom error pages |
| Django | Views | Function-based views | Module structure | def my_view(request): |
| Django | Admin actions | @admin.action | Decorator analysis | Admin panel actions |
| Django | Signals | @receiver | Decorator analysis | Signal handlers |
| FastAPI | Routes | @app.get, @app.post | Decorator analysis | REST endpoints |
| FastAPI | Dependencies | Depends() | Call graph analysis | Dependency injection |
| Click | Commands | @click.command | Decorator analysis | CLI commands |
| Click | Groups | @click.group | Decorator analysis | Command groups |
| pytest | Tests | test_* prefix | Naming convention | def test_foo(): |
| pytest | Fixtures | @pytest.fixture | Decorator analysis | Test fixtures |
| unittest | Tests | TestCase methods | Class hierarchy | class TestFoo(TestCase): |
| unittest | Setup/Teardown | setUp, tearDown | Method naming | Lifecycle methods |
| Qt/PyQt | Slots | @pyqtSlot | Decorator analysis | Signal handlers |
| Qt | Connections | .connect() calls | Call graph analysis | Event wiring |
| Tkinter | Callbacks | command=func | Assignment tracking | Button callbacks |
See Framework Patterns documentation for comprehensive framework support details and language-specific patterns.
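The "Decorator analysis" and "Naming convention" rows of the matrix above boil down to matching a function's decorators and name against known patterns. A minimal sketch follows, assuming a hypothetical function is_framework_entry_point and a decorator list copied from the matrix; rows that need module structure, class hierarchy, or call graph information are deliberately not covered.

```rust
/// Decorator prefixes taken from the detection matrix (decorator rows only).
const ENTRY_POINT_DECORATORS: &[&str] = &[
    "@app.route",
    "@app.before_request",
    "@app.errorhandler",
    "@admin.action",
    "@receiver",
    "@app.get",
    "@app.post",
    "@click.command",
    "@click.group",
    "@pytest.fixture",
    "@pyqtSlot",
];

/// True when a function looks like a framework entry point, either through
/// one of its decorators or through the pytest `test_` naming convention.
pub fn is_framework_entry_point(name: &str, decorators: &[&str]) -> bool {
    let by_decorator = decorators.iter().any(|decorator| {
        ENTRY_POINT_DECORATORS
            .iter()
            .any(|known| decorator.starts_with(*known))
    });
    by_decorator || name.starts_with("test_")
}
```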
Confidence Thresholds
You can customize confidence thresholds based on your project’s tolerance for false positives vs. false negatives:
use debtmap::analysis::python_dead_code_enhanced::AnalysisConfig;

let config = AnalysisConfig {
    high_confidence_threshold: 0.8,     // Default: 0.8
    medium_confidence_threshold: 0.5,   // Default: 0.5
    respect_suppression_comments: true, // Default: true
    include_private_api: true,          // Default: true
    enable_public_api_detection: true,  // Default: true (Spec 113)
    ..Default::default()
};
Tuning recommendations:
- Conservative projects (libraries, public APIs): Raise thresholds to 0.9/0.7 to reduce false positives
- Aggressive cleanup (internal tools): Lower thresholds to 0.7/0.4 to catch more dead code
- Balanced approach (most projects): Use defaults of 0.8/0.5
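The effect of the tuning recommendations above is easiest to see by classifying the same score under different threshold pairs. The sketch below is hypothetical (the enum ConfidenceLevel and the function classify are not Debtmap's API) and assumes that a score equal to a threshold belongs to the higher level.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ConfidenceLevel {
    High,
    Medium,
    Low,
}

/// Map a dead-code confidence score to a level using the configured
/// high and medium thresholds (both treated as inclusive lower bounds).
pub fn classify(score: f64, high_threshold: f64, medium_threshold: f64) -> ConfidenceLevel {
    if score >= high_threshold {
        ConfidenceLevel::High
    } else if score >= medium_threshold {
        ConfidenceLevel::Medium
    } else {
        ConfidenceLevel::Low
    }
}
```

With the defaults (0.8/0.5) a score of 0.85 is High; under the conservative pair (0.9/0.7) the same score is only Medium, so it would be reviewed rather than removed outright.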
Suppressing False Positives
Mark functions as intentionally unused with suppression comments:
# debtmap: not-dead
def future_api_endpoint():
    """Will be activated in v2.0"""
    pass

def compatibility_shim():  # noqa: dead-code
    """Kept for backwards compatibility"""
    pass
Supported Comment Formats
All of these formats are recognized:
- # debtmap: not-dead (recommended)
- # debtmap:not-dead
- # noqa: dead-code
- # noqa:dead-code
Comment Placement
Suppression comments can appear:
- Above the function (most common):
# debtmap: not-dead
def my_function():
    pass
- Same line as function definition:
def my_function():  # debtmap: not-dead
    pass
- Below the function definition (less common):
def my_function():
    # debtmap: not-dead
    pass
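The four comment formats and three placements described above can be checked with a simple line scan. The following is a minimal sketch, not Debtmap's implementation: the function is_suppressed and the MARKERS constant are hypothetical, and "below" is assumed to mean the line immediately after the def line.

```rust
/// The four suppression markers listed under Supported Comment Formats.
const MARKERS: [&str; 4] = [
    "# debtmap: not-dead",
    "# debtmap:not-dead",
    "# noqa: dead-code",
    "# noqa:dead-code",
];

/// Check whether the function defined on `def_line` (0-based index into
/// `lines`) carries a suppression comment on the line above, the same
/// line, or the line directly below.
pub fn is_suppressed(lines: &[&str], def_line: usize) -> bool {
    let has_marker = |idx: usize| {
        lines
            .get(idx)
            .map_or(false, |line| MARKERS.iter().any(|marker| line.contains(*marker)))
    };
    // `checked_sub` avoids underflow when the function starts on line 0.
    let above = def_line.checked_sub(1).map_or(false, |idx| has_marker(idx));
    above || has_marker(def_line) || has_marker(def_line + 1)
}
```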
Coverage Integration
When test coverage data is available, the analyzer uses it to dramatically improve accuracy by marking covered functions as live:
Generating Coverage Data
# With pytest and pytest-cov
pytest --cov=myapp --cov-report=json
# With coverage.py directly
coverage run -m pytest
coverage json
# Debtmap automatically detects and uses coverage.json
debtmap analyze myapp/
How It Works
Functions that appear in coverage data are considered live, even if:
- No static callers are found
- They’re private functions
- They’re not framework entry points
This catches functions called:
- Dynamically via getattr() or exec()
- Through plugin systems
- By external libraries or C extensions
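The rule above (a function that appears in coverage data is live) can be sketched as a range check over executed line numbers. The data layout is an assumption: a map from file path to the set of executed lines, roughly what a coverage.json report provides per file. The function is_covered is a hypothetical name.

```rust
use std::collections::{BTreeSet, HashMap};

/// A function is treated as live when at least one line inside its
/// `start_line..=end_line` span was executed according to coverage data.
pub fn is_covered(
    executed: &HashMap<String, BTreeSet<u32>>,
    file: &str,
    start_line: u32,
    end_line: u32,
) -> bool {
    // Guard the range: `BTreeSet::range` panics when start > end.
    start_line <= end_line
        && executed.get(file).map_or(false, |lines| {
            lines.range(start_line..=end_line).next().is_some()
        })
}
```

Because the check only needs one executed line, it also marks functions reached through getattr() or plugin loading as live, which a static call graph cannot see.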
Programmatic Coverage Usage
In Rust code, you can provide coverage data programmatically:
use debtmap::analysis::python_dead_code_enhanced::{EnhancedDeadCodeAnalyzer, CoverageData};

// Load coverage from coverage.json file
let coverage = CoverageData::from_coverage_json("coverage.json")?;

// Create analyzer with coverage data
let analyzer = EnhancedDeadCodeAnalyzer::new()
    .with_coverage(coverage);

// Analyze functions - covered functions will have higher "live" confidence
let result = analyzer.analyze_function(&func, &call_graph);
Accuracy Improvement
Coverage integration substantially improves accuracy by:
- Significantly reducing false positives - Eliminates most false positives in complex codebases
- High accuracy for covered functions - Functions with test coverage are correctly identified as live
- Clear removal candidates - Uncovered code with no static callers is more confidently dead
- Dynamic call detection - Catches functions called via getattr(), plugins, or other dynamic mechanisms that static analysis misses
Coverage data format: Debtmap uses the standard coverage.json format produced by coverage.py and pytest-cov. The file should be in your project root and contain executed line numbers for each source file.
Configuration Reference
TOML Configuration
Complete dead code analysis configuration in .debtmap.toml:
# Language-specific dead code detection
[languages.python]
detect_dead_code = true # Enable Python dead code analysis (default: true)
# External API detection (Spec 113)
[external_api]
detect_external_api = true # Enable automatic public API detection (default: true)
api_functions = [
"public_function_name", # Function name only
"module::qualified_name", # Module-qualified format
]
api_files = [
"src/api/**/*.py", # Glob patterns supported
"src/public_interface.py", # Exact file paths
"**/__init__.py", # All package entry points
]
Programmatic Configuration (Rust API)
Note: Confidence thresholds and analysis behavior are configured programmatically via the Rust API. These settings are not available in .debtmap.toml - only the detect_external_api and API detection settings can be configured via TOML (see above).
For Rust API users, you can customize thresholds:
use debtmap::analysis::python_dead_code_enhanced::{AnalysisConfig, EnhancedDeadCodeAnalyzer};

let config = AnalysisConfig {
    // Confidence thresholds (Rust API only - not in .debtmap.toml)
    high_confidence_threshold: 0.8,     // Default: 0.8 (80%)
    medium_confidence_threshold: 0.5,   // Default: 0.5 (50%)
    // Analysis options (Rust API only)
    respect_suppression_comments: true, // Honor # debtmap: not-dead (default: true)
    include_private_api: true,          // Analyze private functions (default: true)
    enable_public_api_detection: true,  // Use public API heuristics (default: true)
    // Public API detector configuration (optional)
    public_api_config: None,            // Use default PublicApiConfig
};

let analyzer = EnhancedDeadCodeAnalyzer::new()
    .with_config(config);
Configuration Tuning by Project Type
Libraries and Public APIs (conservative):
AnalysisConfig {
    high_confidence_threshold: 0.9, // Very strict
    medium_confidence_threshold: 0.7,
    enable_public_api_detection: true, // Critical for libraries
    ..Default::default()
}
Internal Applications (aggressive):
AnalysisConfig {
    high_confidence_threshold: 0.7, // More lenient
    medium_confidence_threshold: 0.4,
    include_private_api: true, // Analyze everything
    ..Default::default()
}
Balanced Approach (recommended default):
AnalysisConfig::default() // Uses 0.8/0.5 thresholds
Understanding Results
Interpreting Output
When you run dead code analysis, you’ll see results like:
Dead code analysis for 'calculate_total':
Result: LIVE
Confidence: Low (0.2)
Reasons it's LIVE:
- HasStaticCallers (called by 3 functions)
- PublicApi
Suggestion:
Function appears to be in use or is a framework/test entry point.
Decision Guide
| Result | Confidence | Action |
|---|---|---|
| is_dead: true | High (0.8-1.0) | Safe to remove - Very likely unused |
| is_dead: true | Medium (0.5-0.8) | Review manually - Might be dead, verify first |
| is_dead: true | Low (0.0-0.5) | Keep - Likely used dynamically |
| is_dead: false | Any | Keep - Function is in use |
Decision Tree for Confidence Interpretation
Use this decision tree to determine what action to take:
Is the function flagged as dead?
│
├─ NO → Keep the function (it's in use)
│
└─ YES → What is the confidence level?
│
├─ HIGH (0.8-1.0)
│ ├─ Is it a public API function? → Review, add suppression comment if keeping
│ └─ Is it private (_prefix)? → **SAFE TO REMOVE**
│
├─ MEDIUM (0.5-0.8)
│ ├─ Check git history: recently added? → Keep for now, review in next sprint
│ ├─ Has coverage data been generated? → Run with coverage first
│ ├─ Is it used dynamically (getattr, plugins)? → Add suppression comment
│ └─ No clear reason to keep? → **REVIEW MANUALLY, likely safe to remove**
│
└─ LOW (0.0-0.5)
├─ Review "Reasons it's LIVE" → If reasons are valid, keep it
├─ Function is public and might be external API? → Keep it
└─ Truly unused but marked live incorrectly? → Report issue or use suppression
Confidence Level Quick Reference
When to act without review:
- is_dead: true + confidence: HIGH + private function (_ prefix) → Remove immediately
- is_dead: true + confidence: HIGH + in test file + not test function → Remove immediately
When to review before acting:
- is_dead: true + confidence: MEDIUM → Manual review required
- is_dead: true + confidence: HIGH + public function → Check git history, verify external usage
When to keep:
- is_dead: false → Always keep (function is live)
- is_dead: true + confidence: LOW → Keep (too uncertain to remove)
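The quick reference above can be condensed into one decision function. This sketch is hypothetical (the names Action and recommend are not part of Debtmap), uses the default 0.8/0.5 thresholds, and leaves out the "in test file but not a test function" case for brevity.

```rust
#[derive(Debug, PartialEq)]
pub enum Action {
    Remove,
    Review,
    Keep,
}

/// Recommend an action from the analyzer verdict, the confidence score,
/// and whether the function is private (name starts with an underscore).
pub fn recommend(is_dead: bool, confidence: f64, is_private: bool) -> Action {
    if !is_dead {
        return Action::Keep; // Live functions are always kept.
    }
    if confidence >= 0.8 {
        // High confidence: private helpers can go, public ones need a look.
        if is_private { Action::Remove } else { Action::Review }
    } else if confidence >= 0.5 {
        Action::Review
    } else {
        Action::Keep // Too uncertain to remove.
    }
}
```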
Filtering Results by Confidence
To filter dead code results by confidence level, you can process the JSON output:
# Analyze and output JSON
debtmap analyze . --format=json > results.json
# Filter for high confidence dead code using jq
jq '.dead_code | map(select(.confidence >= 0.8))' results.json
# Filter for high and medium confidence
jq '.dead_code | map(select(.confidence >= 0.5))' results.json
Note: CLI filtering by confidence threshold (e.g., --min-confidence) is planned for a future release (see Spec 116). Currently, filtering must be done via JSON post-processing.
See CLI Reference for complete command options.
Common Patterns
False Positives (and How to Handle Them)
Public API methods
class Calculator:
    def add(self, a, b):  # Might be used by external code
        return a + b
Solution: Add to api_functions in .debtmap.toml or use suppression comment
Dynamic imports
# Module loaded dynamically via importlib
def handle_command(cmd):  # Called via getattr()
    pass
Solution: Add # debtmap: not-dead suppression comment
Plugin registration
@registry.register
def handler():  # Registered at import time
    pass
Solution: Should be detected by callback tracker; if not, add suppression comment
True Positives (Safe to Remove)
Old implementations
def _old_calculate(x):  # Replaced but not removed
    return x * 2
Action: Safe to remove (high confidence)
Unused helper functions
def _format_date(date):  # Was used but caller was removed
    return date.strftime("%Y-%m-%d")
Action: Safe to remove (high confidence)
Commented-out code alternatives
def process_v1(data):  # Old version, v2 is now used
    pass
Action: Safe to remove (high confidence)
Best Practices
Workflow Recommendations
- Start with high confidence items - Remove functions with 0.8+ confidence first to build trust in the tool
- Run with coverage data - Generate coverage.json to dramatically improve accuracy:
pytest --cov=myapp --cov-report=json
debtmap analyze myapp/
- Review medium confidence items - These often find real dead code but need manual verification
- Use suppression comments liberally - Better to mark something as intentionally unused than to have noise in results
- Check git history - Before removing, verify the function wasn’t recently added:
git log -p -- path/to/file.py | grep -A5 "def function_name"
- Remove incrementally - Remove a few functions, run tests, commit. Don’t remove everything at once:
# Remove 3-5 high confidence functions
pytest # Verify tests still pass
git commit -m "Remove dead code: _old_helper, _unused_formatter"
- Look for patterns - If multiple related functions are flagged, they might all be part of an abandoned feature
CI/CD Integration
Prevent dead code from accumulating by integrating into your CI pipeline:
# .github/workflows/dead-code.yml
- name: Check for dead code
  run: |
    debtmap analyze . --format=json > dead-code.json
    # Fail if high-confidence dead code is found.
    # Filtering is done with jq because a --min-confidence CLI flag is not yet available.
    if [ "$(jq '[.dead_code[] | select(.confidence >= 0.8)] | length' dead-code.json)" -gt 0 ]; then
      echo "High-confidence dead code detected!"
      jq -r '.dead_code[] | select(.confidence >= 0.8) | "\(.file):\(.line) - \(.function)"' dead-code.json
      exit 1
    fi
Limitations
What the Analyzer CAN Detect
- ✅ Static function calls across modules
- ✅ Framework entry points via decorators
- ✅ Test functions in test files
- ✅ Callback registrations and event handlers
- ✅ Functions in __all__ exports
- ✅ Property decorators and descriptors
- ✅ Magic methods (__init__, __str__, etc.)
- ✅ Functions covered by test coverage data
What the Analyzer CANNOT Detect
- ❌ eval() or exec() usage - arbitrary code execution
- ❌ getattr() with dynamic string names - runtime attribute lookup
- ❌ Reflection-based calls - inspect module usage
- ❌ Functions called from C extensions
- ❌ Plugin systems using string-based loading - dynamic imports
Mitigation Strategies
For functions the analyzer cannot detect, use suppression comments:
# Called dynamically via getattr in plugin system
# debtmap: not-dead
def handle_dynamic_command():
    pass

# Loaded via string-based plugin system
# debtmap: not-dead
def plugin_entrypoint():
    pass
Troubleshooting
“Function marked as dead but it’s actually used”
Possible causes and solutions:
- Dynamic call via getattr() or exec()
  - Solution: Add # debtmap: not-dead suppression comment
  - Example: Plugin systems, command dispatchers
- Called from external code or C extension
  - Solution: Add function to api_functions in .debtmap.toml
  - Example: Public library APIs
- Framework pattern not recognized
  - Solution: Report issue on GitHub with framework details
  - Workaround: Add suppression comment
- Callback registration not detected
  - Solution: Check if decorator is supported; add suppression if not
  - Example: Custom registration decorators
“Too many false positives in my codebase”
Solutions to try (in order):
- Run with coverage data - Biggest impact on accuracy:
pytest --cov=myapp --cov-report=json
debtmap analyze myapp/
- Configure public API detection - Mark your external APIs:
[external_api]
api_files = ["src/api/**/*.py", "src/public/**/*.py"]
- Add framework patterns - Report unrecognized frameworks on GitHub
- Add suppression comments - Mark intentionally unused functions
- Adjust confidence thresholds - Raise to 0.9/0.7 for conservative analysis
“Low confidence on obviously dead code”
This is working as intended - the analyzer is conservative to avoid false positives.
What to do:
- Review the “Reasons it’s LIVE” - Understand why confidence is low
- Check if function is truly unused - Verify no dynamic calls
- Run with coverage - Coverage data will increase confidence for truly dead code
- Accept medium/low confidence - Manual review is valuable for complex cases
Examples
Example 1: Flask Application
from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():  # ✅ LIVE - Framework entry point
    return helper()

def helper():  # ✅ LIVE - Called by index()
    return format_response("Hello")

def format_response(msg):  # ✅ LIVE - Called by helper()
    return f"<html>{msg}</html>"

def _old_route():  # ❌ DEAD - No callers, not a route (High: 0.95)
    return "Unused"
Analysis results:
- index: LIVE (Low: 0.15) - Flask route decorator detected
- helper: LIVE (Low: 0.25) - Has static caller (index)
- format_response: LIVE (Low: 0.30) - Has static caller (helper)
- _old_route: DEAD (High: 0.95) - No callers, private function
Example 2: Test File
import pytest

def test_addition():  # ✅ LIVE - Test function
    assert add(1, 2) == 3

def add(a, b):  # ✅ LIVE - Called by test
    return a + b

@pytest.fixture
def sample_data():  # ✅ LIVE - pytest fixture
    return [1, 2, 3]

def _unused_helper():  # ❌ DEAD - No callers (High: 0.90)
    return 42

def _old_test_helper():  # ❌ DEAD - Was used, now orphaned (High: 0.92)
    return "test data"
Analysis results:
- test_addition: LIVE (Low: 0.10) - Test function pattern
- add: LIVE (Low: 0.20) - Called by test
- sample_data: LIVE (Low: 0.15) - pytest fixture decorator
- _unused_helper: DEAD (High: 0.90) - No callers in test file
- _old_test_helper: DEAD (High: 0.92) - Orphaned helper
Example 3: Public API with Configuration
# src/api/calculator.py
__all__ = ['calculate', 'format_result']

def calculate(x):  # ✅ LIVE - Exported in __all__
    return _internal_multiply(x, 2)

def format_result(x):  # ✅ LIVE - Exported in __all__
    return f"Result: {x}"

def _internal_multiply(a, b):  # ✅ LIVE - Called by calculate
    return a * b

def _internal_helper():  # ❌ DEAD - Not exported, no callers (High: 0.88)
    return None

# Public API but not in __all__
def legacy_api():  # ⚠️ MEDIUM - Public but no callers (Medium: 0.65)
    """Kept for backwards compatibility"""
    pass
.debtmap.toml configuration:
[external_api]
api_files = ["src/api/**/*.py"]
# Explicitly mark legacy API
api_functions = ["legacy_api"]
Analysis results:
- calculate: LIVE (Low: 0.20) - In __all__, has callers
- format_result: LIVE (Low: 0.25) - In __all__
- _internal_multiply: LIVE (Low: 0.30) - Called by calculate
- _internal_helper: DEAD (High: 0.88) - Private, no callers
- legacy_api: LIVE (Low: 0.35) - Marked as API in config
Getting Help
- Documentation: See Troubleshooting Guide for common issues
- Report issues: https://github.com/anthropics/debtmap/issues
- Examples: Check the Examples chapter for more scenarios
- Related topics:
- Coverage Integration - Detailed coverage setup
- Suppression Patterns - Advanced suppression techniques
- Configuration - Complete configuration reference
Design Pattern Detection
Debtmap automatically detects common design patterns in your codebase to provide better architectural insights and reduce false positives in complexity analysis. When recognized design patterns are detected, Debtmap applies appropriate complexity adjustments to avoid penalizing idiomatic code.
Overview
Debtmap detects 7 design patterns across Python, JavaScript, TypeScript, and Rust:
| Pattern | Primary Language | Detection Confidence |
|---|---|---|
| Observer | Python, Rust | High (0.8-0.9) |
| Singleton | Python | High (0.85-0.95) |
| Factory | Python | Medium-High (0.7-0.85) |
| Strategy | Python | Medium (0.7-0.8) |
| Callback | Python, JavaScript | High (0.8-0.9) |
| Template Method | Python | Medium (0.7-0.8) |
| Dependency Injection | Python | Medium (0.65-0.75) |
Pattern detection serves multiple purposes:
- Reduces false positives: Avoids flagging idiomatic pattern implementations as overly complex
- Documents architecture: Automatically identifies architectural patterns in your codebase
- Validates consistency: Helps ensure patterns are used correctly and completely
- Guides refactoring: Identifies incomplete pattern implementations
Pattern Detection Details
Observer Pattern
The Observer pattern is detected in Python and Rust by identifying abstract base classes with concrete implementations.
Detection Criteria (Python):
- Abstract base class with ABC, Protocol, or Interface markers
- Abstract methods decorated with @abstractmethod
- Concrete implementations inheriting from the interface
- Methods prefixed with on_, handle_, or notify_
- Registration methods like add_observer, register, or subscribe
- Notification methods like notify, notify_all, trigger, emit
Detection Criteria (Rust):
- Trait definitions with callback-style methods
- Multiple implementations of the same trait
- Trait registry tracking for cross-module detection
Example (Python):
from abc import ABC, abstractmethod

class EventObserver(ABC):
    @abstractmethod
    def on_event(self, data):
        """Handle event notification"""
        pass

class LoggingObserver(EventObserver):
    def on_event(self, data):
        print(f"Event occurred: {data}")

class EmailObserver(EventObserver):
    def on_event(self, data):
        send_email(f"Alert: {data}")

class EventManager:
    def __init__(self):
        self.observers = []

    def add_observer(self, observer: EventObserver):
        self.observers.append(observer)

    def notify_all(self, data):
        for observer in self.observers:
            observer.on_event(data)
Confidence: High (0.8-0.9) when abstract base class, implementations, and registration/notification methods are present. Lower confidence (0.5-0.7) for partial implementations.
Singleton Pattern
Singleton pattern detection identifies three common Python implementations: module-level singletons, __new__ override, and decorator-based patterns.
Detection Criteria:
- Module-level variable assignments (e.g., instance = MyClass())
- Classes overriding __new__ to enforce single instance
- Classes decorated with @singleton or similar decorators
- Presence of instance caching logic
Example (Module-level):
# config.py
class Config:
    def __init__(self):
        self.settings = {}

    def load(self, path):
        # Load configuration
        pass

# Single instance created at module level
config = Config()
Example (__new__ override):
class DatabaseConnection:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on every DatabaseConnection() call, so guard the setup
        if not hasattr(self, 'initialized'):
            self.initialized = True
            self.connect()
Example (Decorator-based):
def singleton(cls):
    instances = {}

    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance

@singleton
class Logger:
    def __init__(self):
        self.log_file = open('app.log', 'a')
Confidence: Very High (0.9-0.95) for __new__ override and decorator patterns. High (0.85) for module-level singletons with clear naming.
Factory Pattern
Factory pattern detection identifies factory functions, factory classes, and factory registries based on naming conventions and structural patterns.
Detection Criteria:
- Functions with names containing create_, make_, build_, or _factory
- Factory registry patterns (dictionaries mapping types to constructors)
- Functions that return instances of different types based on parameters
- Classes with factory methods
Example (Factory Function):
def create_logger(log_type: str):
    if log_type == "file":
        return FileLogger()
    elif log_type == "console":
        return ConsoleLogger()
    elif log_type == "network":
        return NetworkLogger()
    else:
        raise ValueError(f"Unknown logger type: {log_type}")
Example (Registry-based Factory):
# Parser registry
PARSERS = {
    'json': JSONParser,
    'xml': XMLParser,
    'yaml': YAMLParser,
}

def create_parser(format: str):
    parser_class = PARSERS.get(format)
    if parser_class is None:
        raise ValueError(f"No parser for format: {format}")
    return parser_class()
Example (Factory Method):
class DocumentFactory:
    @staticmethod
    def create_document(doc_type: str):
        if doc_type == "pdf":
            return PDFDocument()
        elif doc_type == "word":
            return WordDocument()
        else:
            return PlainTextDocument()
Confidence: Medium-High (0.75-0.85) for functions with factory naming patterns. Lower confidence (0.6-0.7) for registry patterns without factory names.
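The naming and registry criteria above can be illustrated with a small sketch. This is not Debtmap's detector: the helper names looks_like_factory and is_constructor_registry are hypothetical, and only the four name markers come from the criteria listed in this section.

```python
# Name markers taken from the detection criteria listed in this section.
FACTORY_MARKERS = ("create_", "make_", "build_", "_factory")

def looks_like_factory(function_name: str) -> bool:
    """Return True when the name contains any factory naming marker (hypothetical helper)."""
    return any(marker in function_name for marker in FACTORY_MARKERS)

def is_constructor_registry(mapping: dict) -> bool:
    """Return True when a non-empty dict maps keys to classes, as in a parser registry."""
    return bool(mapping) and all(isinstance(value, type) for value in mapping.values())

print(looks_like_factory("create_logger"))   # True
print(looks_like_factory("recreate_index"))  # True: a naming collision, not a real factory
print(looks_like_factory("parse_args"))      # False
print(is_constructor_registry({"int": int, "text": str}))  # True
```

A purely name-based check accepts recreate_index, which is the kind of naming collision that structural evidence and a confidence threshold are meant to filter out.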
Strategy Pattern
Strategy pattern detection identifies interfaces with multiple implementations representing interchangeable algorithms.
Detection Criteria:
- Abstract base class or Protocol defining strategy interface
- Multiple concrete implementations
- Strategy interface typically has 1-2 core methods
- Used via composition (strategy object passed to context)
Example:
import gzip
import lzma
import zlib
from abc import ABC, abstractmethod

class CompressionStrategy(ABC):
    @abstractmethod
    def compress(self, data: bytes) -> bytes:
        pass

class ZipCompression(CompressionStrategy):
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

class GzipCompression(CompressionStrategy):
    def compress(self, data: bytes) -> bytes:
        return gzip.compress(data)

class LzmaCompression(CompressionStrategy):
    def compress(self, data: bytes) -> bytes:
        return lzma.compress(data)

class FileCompressor:
    def __init__(self, strategy: CompressionStrategy):
        self.strategy = strategy

    def compress_file(self, path):
        data = read_file(path)
        return self.strategy.compress(data)
Confidence: Medium (0.7-0.8) based on interface structure and implementation count.
Callback Pattern
Callback pattern detection identifies decorator-based callbacks commonly used in web frameworks and event handlers.
Detection Criteria:
- Decorators with patterns like @route, @handler, @app., @on, @callback
- Framework-specific decorators (Flask routes, FastAPI endpoints, event handlers)
- Functions registered as callbacks for events or hooks
Example (Flask Routes):
from flask import Flask

app = Flask(__name__)

@app.route('/api/users')
def get_users():
    return {"users": []}

@app.route('/api/users/<id>')
def get_user(id):
    return {"user": find_user(id)}
Example (Event Handler):
class EventBus:
    def __init__(self):
        self.handlers = {}

    def on(self, event_name):
        def decorator(func):
            self.handlers[event_name] = func
            return func
        return decorator

bus = EventBus()

@bus.on('user.created')
def handle_user_created(user):
    send_welcome_email(user)

@bus.on('order.placed')
def handle_order_placed(order):
    process_payment(order)
Confidence: High (0.8-0.9) for framework decorator patterns. Medium (0.6-0.7) for custom callback implementations.
Template Method Pattern
Template method pattern detection identifies base classes with template methods that call abstract hook methods.
Detection Criteria:
- Base class with concrete methods (template methods)
- Abstract methods intended to be overridden (hook methods)
- Template method calls hook methods in a defined sequence
- Subclasses override hook methods but not template method
Example:
from abc import ABC, abstractmethod

class DataProcessor(ABC):
    def process(self, data):
        """Template method defining the algorithm skeleton"""
        raw = self.load_data(data)
        validated = self.validate(raw)
        transformed = self.transform(validated)
        self.save(transformed)

    @abstractmethod
    def load_data(self, source):
        """Hook: Load data from source"""
        pass

    @abstractmethod
    def validate(self, data):
        """Hook: Validate data"""
        pass

    def transform(self, data):
        """Hook: Transform data (optional override)"""
        return data

    @abstractmethod
    def save(self, data):
        """Hook: Save processed data"""
        pass

class CSVProcessor(DataProcessor):
    def load_data(self, source):
        return read_csv(source)

    def validate(self, data):
        return [row for row in data if row]

    def save(self, data):
        write_csv('output.csv', data)
Confidence: Medium (0.7-0.8) based on combination of abstract and concrete methods in base class.
Dependency Injection Pattern
Dependency injection pattern detection identifies classes that receive dependencies through constructors or setters rather than creating them internally.
Detection Criteria:
- Constructor parameters accepting interface/protocol types
- Setter methods for injecting dependencies
- Optional dependencies with default values
- Absence of hard-coded object instantiation inside the class
Example (Constructor Injection):
class UserService:
    def __init__(self,
                 user_repository: UserRepository,
                 email_service: EmailService,
                 logger: Logger):
        self.user_repo = user_repository
        self.email_service = email_service
        self.logger = logger

    def create_user(self, username, email):
        user = self.user_repo.create(username, email)
        self.email_service.send_welcome(email)
        self.logger.info(f"Created user: {username}")
        return user
Example (Setter Injection):
class ReportGenerator:
    def __init__(self):
        self.data_source = None
        self.formatter = None

    def set_data_source(self, source):
        self.data_source = source

    def set_formatter(self, formatter):
        self.formatter = formatter

    def generate(self):
        data = self.data_source.fetch()
        return self.formatter.format(data)
Confidence: Medium (0.65-0.75) based on constructor signatures and absence of direct instantiation.
Internal Pattern Detection
Debtmap also detects certain patterns internally for analysis purposes, but these are not exposed as user-facing design pattern detection features. These internal patterns help improve the accuracy of other analyses like god object detection and complexity calculations.
Builder Pattern (Internal Use Only)
The Builder pattern is detected internally during god object detection to avoid false positives. Classes that follow the builder pattern are given adjusted scores in god object analysis since builder classes naturally have many methods and fields.
Note: Builder pattern detection is not available via the --patterns CLI flag. It’s used only internally for scoring adjustments.
Internal Detection Criteria:
- Struct with builder suffix or builder-related naming
- Methods returning Self for chaining
- Final build() method returning the constructed type
- Type-state pattern usage (optional)
Example (Internal Detection):
pub struct HttpClientBuilder {
    base_url: Option<String>,
    timeout: Duration,
    headers: HashMap<String, String>,
}

impl HttpClientBuilder {
    pub fn new() -> Self { /* ... */ }

    // Chaining methods detected internally
    pub fn base_url(mut self, url: impl Into<String>) -> Self { /* ... */ }
    pub fn timeout(mut self, timeout: Duration) -> Self { /* ... */ }
    pub fn header(mut self, key: String, value: String) -> Self { /* ... */ }
    pub fn build(self) -> Result<HttpClient> { /* ... */ }
}
Why Internal Only: Builder patterns are a legitimate design choice for complex object construction. Debtmap detects them to prevent flagging builder classes as god objects, but doesn’t report them as design patterns since they don’t require complexity adjustments like other patterns.
Source: src/organization/builder_pattern.rs - Used for god object detection score adjustment
Visitor Pattern (Internal Use Only)
The Visitor pattern is detected internally for complexity analysis normalization. When exhaustive pattern matching is detected (typical of visitor patterns), Debtmap applies logarithmic complexity scaling instead of linear scaling to avoid penalizing idiomatic exhaustive match expressions.
Note: Visitor pattern detection is not available via the --patterns CLI flag. It’s used only internally for complexity scaling adjustments.
Internal Detection Criteria:
- Trait with visit methods for different types
- Implementations providing behavior for each visited type
- Exhaustive pattern matching across enum variants
- Used primarily for AST traversal or data structure processing
Example (Internal Detection):
trait Visitor {
    fn visit_function(&mut self, func: &Function);
    fn visit_class(&mut self, class: &Class);
    fn visit_module(&mut self, module: &Module);
}

impl Visitor for ComplexityVisitor {
    fn visit_function(&mut self, func: &Function) {
        // Exhaustive matching detected for complexity scaling
        match &func.body {
            FunctionBody::Simple => { /* ... */ }
            FunctionBody::Complex(statements) => { /* ... */ }
        }
    }
}
Why Internal Only: Visitor patterns often involve exhaustive pattern matching which can appear complex by traditional metrics. Debtmap detects these patterns to apply logarithmic scaling (log2(match_arms) * avg_complexity) instead of linear, preventing false positives in complexity analysis. This is a complexity adjustment mechanism, not a user-visible pattern detection feature.
Source: src/complexity/visitor_detector.rs - Used for complexity analysis, not pattern reporting
Configuration
CLI Options
Enable or configure pattern detection using command-line flags:
# Disable all pattern detection
debtmap analyze --no-pattern-detection
# Enable only specific patterns (all 7 available patterns shown)
debtmap analyze --patterns observer,singleton,factory,strategy,callback,template_method,dependency_injection
# Enable a subset of patterns
debtmap analyze --patterns observer,singleton,factory
# Set confidence threshold (0.0-1.0)
debtmap analyze --pattern-threshold 0.8
# Show warnings for uncertain pattern detections
debtmap analyze --show-pattern-warnings
Available Patterns for --patterns Flag:
- observer - Observer pattern detection
- singleton - Singleton pattern detection
- factory - Factory pattern detection
- strategy - Strategy pattern detection
- callback - Callback pattern detection
- template_method - Template method pattern detection
- dependency_injection - Dependency injection detection
Note: Builder and Visitor patterns are detected internally but are not available via the --patterns flag. See Internal Pattern Detection for details.
Pattern Detection Output
Pattern detection results are integrated into debtmap’s output in different formats:
Terminal Format: Detected patterns are shown in a dedicated section of the analysis output:
Design Patterns Detected:
Observer Pattern (confidence: 0.88)
Interface: EventListener (event_system.py:4)
Implementations: AuditLogger, SessionManager
JSON Format: Pattern results are included in the pattern_instances field:
{
  "pattern_instances": [
    {
      "pattern_type": "Observer",
      "confidence": 0.88,
      "location": "event_system.py:4",
      "implementations": ["AuditLogger", "SessionManager"]
    }
  ]
}
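For downstream tooling, the pattern_instances field can be read with any JSON parser. The sketch below is a minimal illustration that assumes the report has exactly the shape shown above; the function name patterns_at_or_above is hypothetical, and a real report may nest this field among other keys.

```python
import json

# Sample report text mirroring the JSON excerpt shown above.
SAMPLE_REPORT = """
{
  "pattern_instances": [
    {
      "pattern_type": "Observer",
      "confidence": 0.88,
      "location": "event_system.py:4",
      "implementations": ["AuditLogger", "SessionManager"]
    }
  ]
}
"""

def patterns_at_or_above(report_text: str, threshold: float = 0.7) -> list:
    """Return the pattern instances whose confidence meets the threshold."""
    report = json.loads(report_text)
    instances = report.get("pattern_instances", [])
    return [item for item in instances if item["confidence"] >= threshold]

for item in patterns_at_or_above(SAMPLE_REPORT):
    print(item["pattern_type"], item["confidence"], item["location"])
```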
Markdown Format: Patterns are documented in a dedicated section with cross-references to source files.
Using --show-pattern-warnings: This flag reveals low-confidence detections (below the threshold) that might indicate:
- Incomplete pattern implementations
- Patterns in development
- False positives to review
Use this flag during initial analysis to understand what patterns debtmap sees:
debtmap analyze --show-pattern-warnings --pattern-threshold 0.7
Confidence Scoring
Pattern detection uses a confidence scoring system (0.0-1.0) to indicate match quality:
- 0.9-1.0: Very High - Strong structural match with all key elements present
- 0.8-0.9: High - Clear pattern with most elements present
- 0.7-0.8: Medium-High - Pattern present with some uncertainty
- 0.6-0.7: Medium - Possible pattern with limited evidence
- 0.5-0.6: Low - Weak match, may be false positive
Default Threshold: 0.7 - Only patterns with 70% or higher confidence are reported by default.
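The bands and the default threshold can be expressed as a short lookup. This is only a sketch: the listed ranges share their endpoints, so it assumes each band includes its lower bound, and the label returned for scores under 0.5 is a placeholder that the documentation does not define.

```python
# (lower bound, label) pairs from the list above, highest band first.
BANDS = [
    (0.9, "Very High"),
    (0.8, "High"),
    (0.7, "Medium-High"),
    (0.6, "Medium"),
    (0.5, "Low"),
]

def confidence_band(score: float) -> str:
    """Map a confidence score to its band, treating lower bounds as inclusive (assumption)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Below listed bands"  # placeholder label, not a Debtmap term

def is_reported(score: float, threshold: float = 0.7) -> bool:
    """A detection is reported when its confidence meets the threshold."""
    return score >= threshold

print(confidence_band(0.88), is_reported(0.88))  # High True
print(confidence_band(0.65), is_reported(0.65))  # Medium False
```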
Adjusting Thresholds:
# More strict (fewer patterns, higher confidence)
debtmap analyze --pattern-threshold 0.85
# More lenient (more patterns, lower confidence)
debtmap analyze --pattern-threshold 0.6 --show-pattern-warnings
How Confidence is Calculated:
Each pattern detector calculates confidence holistically based on multiple factors:
- Structural completeness: Are all expected elements present?
- Naming conventions: Do names match expected patterns?
- Implementation count: Are there enough implementations to confirm the pattern?
- Cross-validation: Do different detection heuristics agree?
For example, Observer pattern confidence is calculated holistically based on:
- Presence of abstract base class with appropriate markers (ABC, Protocol, etc.)
- Number of concrete implementations found
- Detection of registration methods (add_observer, register, subscribe)
- Detection of notification methods (notify, notify_all, trigger, emit)
- Naming conventions matching observer patterns
Higher confidence requires more structural elements to be present. The calculation is not a simple sum of individual weights but rather a holistic assessment of pattern completeness.
Cross-File Pattern Detection
Debtmap can detect patterns that span multiple files, particularly for the Observer pattern where interfaces and implementations may be in separate modules.
How Cross-File Detection Works:
- Import Tracking: Debtmap tracks imports to understand module dependencies
- Interface Registry: Abstract base classes are registered globally
- Implementation Matching: Implementations in other files are matched to registered interfaces
- Cross-Module Context: A shared context links related files
Example:
# interfaces/observer.py
from abc import ABC, abstractmethod

class EventObserver(ABC):
    @abstractmethod
    def on_event(self, data):
        pass

# observers/logging_observer.py
from interfaces.observer import EventObserver

class LoggingObserver(EventObserver):
    def on_event(self, data):
        log(data)

# observers/email_observer.py
from interfaces.observer import EventObserver

class EmailObserver(EventObserver):
    def on_event(self, data):
        send_email(data)
Debtmap detects this as a single Observer pattern with cross-file implementations.
Limitations:
- Only works for explicitly imported interfaces
- Requires static import analysis (dynamic imports may not be tracked)
- Most effective within a single project (not across external dependencies)
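The registry-then-match flow described in this section can be sketched with Python's standard ast module. This is an illustration of the idea, not Debtmap's implementation (which is written in Rust): the function names are hypothetical, only classes that list ABC as a base are treated as interfaces, and only from-imports are tracked, which mirrors the explicit-import limitation above.

```python
import ast

def _base_names(class_node: ast.ClassDef) -> list:
    """Collect base-class names written as `Name` or as `module.Name`."""
    names = []
    for base in class_node.bases:
        if isinstance(base, ast.Name):
            names.append(base.id)
        elif isinstance(base, ast.Attribute):
            names.append(base.attr)
    return names

def link_implementations(sources: dict) -> dict:
    """Map each ABC-derived interface to the (file, class) pairs that import and subclass it."""
    trees = {path: ast.parse(text) for path, text in sources.items()}

    # Pass 1: interface registry - every class that lists ABC as a base.
    registry = {}
    for tree in trees.values():
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef) and "ABC" in _base_names(node):
                registry[node.name] = []

    # Pass 2: implementation matching - only explicitly imported interfaces count.
    for path, tree in trees.items():
        imported = {
            alias.asname or alias.name
            for node in ast.walk(tree)
            if isinstance(node, ast.ImportFrom)
            for alias in node.names
        }
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                for base in _base_names(node):
                    if base in registry and base in imported:
                        registry[base].append((path, node.name))
    return registry

# The three files from the example above, held in memory as strings.
SOURCES = {
    "interfaces/observer.py": (
        "from abc import ABC, abstractmethod\n"
        "class EventObserver(ABC):\n"
        "    @abstractmethod\n"
        "    def on_event(self, data):\n"
        "        pass\n"
    ),
    "observers/logging_observer.py": (
        "from interfaces.observer import EventObserver\n"
        "class LoggingObserver(EventObserver):\n"
        "    def on_event(self, data):\n"
        "        log(data)\n"
    ),
    "observers/email_observer.py": (
        "from interfaces.observer import EventObserver\n"
        "class EmailObserver(EventObserver):\n"
        "    def on_event(self, data):\n"
        "        send_email(data)\n"
    ),
}

links = link_implementations(SOURCES)
print(links)
```

Because the sources are only parsed and never executed, undefined helpers such as log and send_email do not matter here.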
Rust-Specific Pattern Detection
Trait-Based Patterns
Rust pattern detection leverages the trait system for identifying patterns:
Trait Registry: Tracks trait definitions and implementations across modules
// Trait registered for pattern detection
pub trait EventHandler {
    fn handle(&self, event: &Event);
}

// Multiple implementations tracked
impl EventHandler for LogHandler { /* ... */ }
impl EventHandler for MetricsHandler { /* ... */ }
impl EventHandler for AlertHandler { /* ... */ }
Observer Pattern via Traits:
pub trait Observable {
    fn subscribe(&mut self, observer: Box<dyn Observer>);
    fn notify(&self, event: &Event);
}

pub trait Observer {
    fn on_event(&self, event: &Event);
}
Differences from Python Detection:
- Traits are more explicit than Python’s ABC
- Type system ensures implementation correctness
- No runtime reflection needed for detection
- Pattern matching exhaustiveness helps identify Visitor pattern
Integration with Complexity Analysis
Debtmap has two separate but complementary systems for patterns:
1. Design Pattern Detection (This Feature)
The 7 user-facing design patterns documented in this chapter (Observer, Singleton, Factory, Strategy, Callback, Template Method, Dependency Injection) are detected and reported to users. These patterns appear in the output to document architectural choices but do not directly adjust complexity scores.
Purpose: Architectural documentation and pattern identification
Output: Pattern instances with confidence scores in terminal, JSON, and markdown formats
2. Complexity Pattern Adjustments (Internal System)
Debtmap has a separate internal system in src/complexity/python_pattern_adjustments.rs that detects specific complexity patterns and applies multipliers. These are different patterns from the user-facing design patterns:
Internal complexity patterns include:
- Dictionary Dispatch (0.5x multiplier)
- Strategy Pattern detection via conditionals (0.6x multiplier)
- Comprehension patterns (0.8x multiplier)
- Other Python-specific complexity patterns
Purpose: Adjust complexity scores to avoid penalizing idiomatic code
Output: Applied automatically during complexity calculation, not reported separately
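As a worked illustration of the multipliers listed above, the sketch below scales a raw complexity value by the multiplier for a detected pattern. The dictionary keys and the function name are hypothetical labels chosen for this example, and applying the multiplier as a single product is an assumption; the actual logic lives in src/complexity/python_pattern_adjustments.rs.

```python
# Multipliers from the list above; the keys are illustrative labels, not Debtmap identifiers.
PATTERN_MULTIPLIERS = {
    "dictionary_dispatch": 0.5,
    "strategy_conditional": 0.6,
    "comprehension": 0.8,
}

def adjusted_complexity(raw_complexity: float, pattern: str = "") -> float:
    """Scale raw complexity by the pattern's multiplier; unknown patterns are left unchanged."""
    return raw_complexity * PATTERN_MULTIPLIERS.get(pattern, 1.0)

print(adjusted_complexity(20, "dictionary_dispatch"))  # 10.0
print(adjusted_complexity(20))                         # 20.0
```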
Relationship Between the Systems
Currently, these are independent systems:
- Design pattern detection focuses on architectural patterns
- Complexity adjustments focus on implementation patterns
The design pattern detection results are primarily for documentation and architectural insights. The complexity scoring uses its own pattern recognition to apply appropriate adjustments.
Visitor Pattern Special Case
The Visitor pattern (internal-only) is used for complexity analysis. When exhaustive pattern matching is detected, debtmap applies logarithmic scaling:
visitor_complexity = log2(match_arms) * average_arm_complexity
This prevents exhaustive pattern matching from being flagged as overly complex. See Visitor Pattern (Internal Use Only) for more details.
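To see what the logarithmic formula does in practice, the sketch below evaluates it next to a plain linear product. The function name mirrors the formula above, but the code is only an arithmetic illustration; the linear baseline of match_arms * average_arm_complexity is an assumption used for comparison.

```python
import math

def visitor_complexity(match_arms: int, average_arm_complexity: float) -> float:
    """Evaluate log2(match_arms) * average_arm_complexity from the formula above."""
    if match_arms < 1:
        raise ValueError("match_arms must be at least 1")
    return math.log2(match_arms) * average_arm_complexity

arms, per_arm = 16, 2.0
print(visitor_complexity(arms, per_arm))  # 8.0
print(arms * per_arm)                     # 32.0 with linear scaling (assumed baseline)
```

Doubling the number of arms adds one average arm's worth of complexity under this formula, rather than doubling the total.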
See Also:
- Complexity Analysis - How complexity is calculated
- Scoring Strategies - Complexity adjustments and multipliers
Practical Examples
Example 1: Analyzing a Web Framework
Analyzing a Flask application with callback patterns:
debtmap analyze --patterns callback --show-pattern-warnings myapp/
Output excerpt:
Design Patterns Detected:
Callback Pattern (15 instances, confidence: 0.85-0.92)
- @app.route decorators: 12
- @app.before_request decorators: 2
- @app.errorhandler decorators: 1
Complexity Adjustments:
- Route handlers: -40% complexity (pattern boilerplate)
- Error handlers: -50% complexity (expected pattern)
Example 2: Detecting Observer Pattern
Analyzing a codebase with event-driven architecture:
debtmap analyze --patterns observer --pattern-threshold 0.75
Code:
# event_system.py
from abc import ABC, abstractmethod

class EventListener(ABC):
    @abstractmethod
    def on_user_login(self, user):
        pass

class AuditLogger(EventListener):
    def on_user_login(self, user):
        audit_log.write(f"User {user.id} logged in")

class SessionManager(EventListener):
    def on_user_login(self, user):
        create_session(user)

class EventDispatcher:
    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def notify_login(self, user):
        for listener in self.listeners:
            listener.on_user_login(user)
Output:
Design Patterns:
Observer Pattern (confidence: 0.88)
Interface: EventListener (event_system.py:4)
Implementations:
- AuditLogger (event_system.py:9)
- SessionManager (event_system.py:13)
Registration: add_listener (event_system.py:21)
Notification: notify_login (event_system.py:24)
Use Cases
1. False Positive Reduction
Problem: Complex factory functions flagged as too complex
Solution: Enable factory pattern detection to apply appropriate complexity adjustments
debtmap analyze --patterns factory --pattern-threshold 0.7
2. Architecture Documentation
Problem: Undocumented design patterns in legacy codebase
Solution: Run pattern detection to automatically identify architectural patterns
debtmap analyze --show-pattern-warnings > architecture-report.txt
3. Pattern Consistency Validation
Problem: Inconsistent Observer implementations across the codebase
Solution: Use pattern detection to identify all Observer instances and compare their structure
debtmap analyze --patterns observer --output-format json > observers.json
4. Refactoring Guidance
Problem: Code smells that might be incomplete pattern implementations
Solution: Detect partial patterns with lower confidence thresholds
debtmap analyze --pattern-threshold 0.5 --show-pattern-warnings
Troubleshooting
Pattern Not Detected
Symptoms: Expected pattern not appearing in output
Possible Causes:
- Confidence below threshold
  - Solution: Lower --pattern-threshold or use --show-pattern-warnings
- Pattern disabled
  - Solution: Check --patterns flag and .debtmap.toml config
- Implementation doesn’t match detection criteria
  - Solution: Review pattern-specific criteria above or add custom rule
Builder or Visitor Pattern Not Available via CLI
Symptoms: Using --patterns builder or --patterns visitor has no effect
Explanation: Builder and Visitor patterns are detected internally only and are not available as user-facing pattern detection features:
- Builder: Used internally during god object detection to adjust scores for builder classes
- Visitor: Used internally for complexity analysis to apply logarithmic scaling to exhaustive match expressions
Solution: These patterns are detected automatically when needed for internal analyses. They don’t require manual enablement and won’t appear in pattern detection output. See Internal Pattern Detection for details.
Available user-facing patterns: observer, singleton, factory, strategy, callback, template_method, dependency_injection
False Positive Detection
Symptoms: Pattern detected incorrectly
Possible Causes:
- Naming collision (e.g., create_ function that isn’t a factory)
  - Solution: Increase --pattern-threshold to require stronger evidence
- Coincidental structural match
  - Solution: Add exclusion rules in configuration (if supported)
Incomplete Cross-File Detection
Symptoms: Pattern implementations in other files not linked to interface
Possible Causes:
- Dynamic imports not tracked
- Solution: Use static imports where possible
- Interface not explicitly imported
- Solution: Add explicit import even if not type-checking
Best Practices
- Start with defaults: The default 0.7 threshold works well for most projects
- Use --show-pattern-warnings during initial analysis to see borderline detections
- Configure per-pattern: Adjust detection criteria for patterns most relevant to your project
- Define custom rules: Add project-specific patterns to reduce false positives
- Combine with complexity analysis: Use pattern detection to understand complexity adjustments
- Review low-confidence detections: They may indicate incomplete implementations worth refactoring
Summary
Debtmap’s design pattern detection provides:
- 7 user-facing patterns covering common OOP and functional patterns (Observer, Singleton, Factory, Strategy, Callback, Template Method, Dependency Injection)
- 2 internal patterns (Builder, Visitor) used for god object detection and complexity normalization
- Configurable confidence thresholds for precision vs. recall tradeoff
- Custom pattern rules for project-specific patterns
- Cross-file detection for patterns spanning multiple modules
- Rust trait support for idiomatic Rust pattern detection
- Complexity integration to reduce false positives in analysis
Pattern detection improves the accuracy of technical debt analysis by recognizing idiomatic code patterns and applying appropriate complexity adjustments. Internal pattern detection helps prevent false positives in god object and complexity analyses without exposing implementation details to users.
Entropy Analysis
Entropy analysis is Debtmap’s unique approach to distinguishing genuinely complex code from repetitive pattern-based code. This reduces false positives by 60-75% compared to traditional cyclomatic complexity metrics.
Overview
Traditional static analysis tools flag code as “complex” based purely on cyclomatic complexity or lines of code. However, not all complexity is equal:
- Repetitive patterns (validation functions, dispatchers) have high cyclomatic complexity but low cognitive load
- Diverse logic (state machines, business rules) may have moderate cyclomatic complexity but high cognitive load
Entropy analysis uses information theory to distinguish between these cases.
How It Works
Debtmap’s entropy analysis is language-agnostic, working across Rust, Python, JavaScript, and TypeScript codebases using a universal token classification approach. This ensures consistent complexity assessment regardless of the programming language used.
Language-Agnostic Analysis
The same entropy concepts apply consistently across all supported languages. Here’s how a validation function would be analyzed in different languages:
Rust:
fn validate_config(config: &Config) -> Result<()> {
    if config.output_dir.is_none() { return Err(anyhow!("output_dir required")); }
    if config.max_workers.is_none() { return Err(anyhow!("max_workers required")); }
    if config.timeout_secs.is_none() { return Err(anyhow!("timeout_secs required")); }
    Ok(())
}
// Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
Python:
def validate_config(config: Config) -> None:
    if config.output_dir is None: raise ValueError("output_dir required")
    if config.max_workers is None: raise ValueError("max_workers required")
    if config.timeout_secs is None: raise ValueError("timeout_secs required")
# Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
JavaScript/TypeScript:
function validateConfig(config: Config): void {
    if (!config.outputDir) throw new Error("outputDir required");
    if (!config.maxWorkers) throw new Error("maxWorkers required");
    if (!config.timeoutSecs) throw new Error("timeoutSecs required");
}
// Entropy: ~0.3, Pattern Repetition: 0.9, Effective Complexity: ~5
All three receive similar entropy scores because they share the same repetitive validation pattern, demonstrating how Debtmap’s analysis transcends language syntax to identify underlying code structure patterns.
Shannon Entropy
Shannon entropy measures the variety and unpredictability of code patterns:
H(X) = -Σ p(x) × log₂(p(x))
Where:
- p(x) = probability of each token type
- High entropy (0.8-1.0) = many different patterns
- Low entropy (0.0-0.3) = repetitive patterns
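The formula can be tried on a list of tokens with a few lines of code. This is a sketch, not Debtmap's tokenizer or scoring: raw Shannon entropy is measured in bits and can exceed 1.0, so the example assumes a normalization by log₂ of the token count to land in the 0.0-1.0 range used above. That normalization is an assumption made for illustration.

```python
import math
from collections import Counter

def shannon_entropy(tokens: list) -> float:
    """H(X) = -sum(p(x) * log2(p(x))) over the distinct tokens, in bits."""
    total = len(tokens)
    if total == 0:
        return 0.0
    counts = Counter(tokens)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def normalized_entropy(tokens: list) -> float:
    """Scale entropy to 0.0-1.0 by dividing by log2(len(tokens)) (assumed normalization)."""
    if len(tokens) < 2:
        return 0.0
    return shannon_entropy(tokens) / math.log2(len(tokens))

repetitive = ["if", "if", "if", "if"]
mixed = ["if", "if", "return", "return"]
diverse = ["if", "match", "await", "return"]

print(normalized_entropy(repetitive))  # lowest: a single repeated token
print(normalized_entropy(mixed))       # in between
print(normalized_entropy(diverse))     # highest: every token differs
```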
Token Classification
Debtmap can classify tokens by importance to give more weight to semantically significant tokens in entropy calculations. This is controlled by the use_classification configuration option.
When enabled (use_classification = true; the option defaults to false for backward compatibility), tokens are weighted by importance:
High importance (weight: 1.0):
- Control flow keywords (if, match, for, while)
- Error handling (try, catch, ?, unwrap)
- Async keywords (async, await)
Medium importance (weight: 0.7):
- Function calls
- Method invocations
- Operators
Low importance (weight: 0.3):
- Identifiers (variable names)
- Literals (strings, numbers)
- Punctuation
When disabled (use_classification = false), all tokens are treated equally, which may be useful for debugging or when you want unweighted entropy scores.
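One way to picture the effect of these weights is to let each token contribute its weight, rather than a count of 1, to the probability mass before computing entropy. The sketch below does exactly that; how Debtmap actually folds weights into its entropy score is not specified here, so the weighting scheme, the category labels, and the function name are all assumptions.

```python
import math
from collections import Counter

# Weights from the three importance levels listed above.
WEIGHTS = {"high": 1.0, "medium": 0.7, "low": 0.3}

def weighted_entropy(classified_tokens: list) -> float:
    """Entropy in bits where each (token, category) pair adds its category weight as mass."""
    mass = Counter()
    for token, category in classified_tokens:
        mass[token] += WEIGHTS[category]
    total = sum(mass.values())
    if total == 0:
        return 0.0
    return -sum((m / total) * math.log2(m / total) for m in mass.values())

# Two control-flow keywords carry equal weight, so they split the mass evenly.
print(weighted_entropy([("if", "high"), ("for", "high")]))
# A keyword next to a plain identifier: the identifier contributes less mass.
print(weighted_entropy([("if", "high"), ("x", "low")]))
```

With equal weights for every category, the result reduces to the unweighted entropy, which corresponds to the use_classification = false behavior described above.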
Pattern Repetition Detection
Detects repetitive structures in the AST:
// High pattern repetition (0.9) - all branches share the same structure
if a.is_none() { return Err(...) }
if b.is_none() { return Err(...) }
if c.is_none() { return Err(...) }

// Low pattern repetition (0.2) - diverse branches
match state {
    Active => transition_to_standby(),
    Standby => transition_to_active(),
    Maintenance => schedule_restart(),
}
Branch Similarity Analysis
Analyzes similarity between conditional branches:
// High branch similarity (0.9) - branches are nearly identical
if condition_a {
    log("A happened");
    process_a();
}
if condition_b {
    log("B happened");
    process_b();
}

// Low branch similarity (0.2) - branches are very different
if needs_auth {
    authenticate_user()?;
    load_profile()?;
} else {
    show_guest_ui();
}
Effective Complexity Adjustment
Debtmap uses a multi-factor dampening approach that analyzes three dimensions of code repetitiveness:
- Pattern Repetition - Detects repetitive AST structures
- Token Entropy - Measures variety in token usage
- Branch Similarity - Compares similarity between conditional branches
These factors are combined multiplicatively with a minimum floor of 0.7 (preserving at least 70% of original complexity):
dampening_factor = (repetition_factor × entropy_factor × branch_factor).max(0.7)
effective_complexity = raw_complexity × dampening_factor
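The two formulas above translate directly into code. The following is a minimal sketch: the function names are chosen here for illustration, and the per-factor multipliers are taken as inputs rather than derived, since their exact calculation lives in Debtmap's source.

```rust
/// Multiply the three per-factor multipliers and apply the floor, so the
/// result never drops below `floor` (0.7 by default in the text above).
fn dampening_factor(
    repetition_factor: f64,
    entropy_factor: f64,
    branch_factor: f64,
    floor: f64,
) -> f64 {
    (repetition_factor * entropy_factor * branch_factor).max(floor)
}

/// Scale the raw complexity by the combined dampening factor.
fn effective_complexity(raw_complexity: u32, dampening_factor: f64) -> f64 {
    f64::from(raw_complexity) * dampening_factor
}
```

For example, multipliers of 0.85, 0.90 and 0.80 multiply to 0.612, which the 0.7 floor raises back to 0.7, so a raw complexity of 20 becomes 14. Multipliers of 1.0 leave the raw complexity unchanged.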
Historical Note: Spec 68
Spec 68: Graduated Entropy Dampening was the original simple algorithm that only considered entropy < 0.2:
dampening_factor = 0.5 + 0.5 × (entropy / 0.2) [when entropy < 0.2]
The current implementation uses a more sophisticated graduated dampening approach that considers all three factors (repetition, entropy, branch similarity) with separate thresholds and ranges for each. The test suite references Spec 68 to verify backward compatibility with the original behavior.
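For reference, the Spec 68 formula can be written as a small function. This is a sketch based only on the formula quoted above; the function name is hypothetical, and returning 1.0 (no dampening) at or above the 0.2 threshold is an assumption, since the original formula is only stated for entropy below 0.2.

```rust
/// Spec 68 dampening: scales linearly from 0.5 (entropy 0.0) up to 1.0
/// (entropy 0.2). Assumption: no dampening at or above the threshold.
fn spec68_dampening(entropy: f64) -> f64 {
    const THRESHOLD: f64 = 0.2;
    if entropy < THRESHOLD {
        0.5 + 0.5 * (entropy / THRESHOLD)
    } else {
        1.0
    }
}
```

An entropy of 0.1 sits halfway to the threshold and therefore yields a factor of 0.75, so the original algorithm could remove up to half of the raw complexity, compared with the 30% cap of the current approach.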
When Dampening Applies
Dampening is applied based on multiple thresholds:
- Pattern Repetition: Values approaching 1.0 trigger dampening (high repetition detected)
- Token Entropy: Values below 0.4 trigger graduated dampening (low variety)
- Branch Similarity: Values above 0.8 trigger dampening (similar branches)
Graduated Dampening Formula
Each factor is dampened individually using a graduated calculation:
#![allow(unused)]
fn main() {
// Conceptual pseudocode showing the three-factor approach
// Actual implementation in src/complexity/entropy.rs:185-195 and :429-439
fn calculate_dampening_factor(
repetition: f64, // 0.0-1.0
entropy: f64, // 0.0-1.0
branch_similarity: f64 // 0.0-1.0
) -> f64 {
// Each factor uses calculate_graduated_dampening with its own threshold/range
let repetition_factor = graduated_dampening(repetition, /* threshold */ 1.0, /* max_reduction */ 0.20);
let entropy_factor = graduated_dampening(entropy, /* threshold */ 0.4, /* max_reduction */ 0.15);
let branch_factor = graduated_dampening(branch_similarity, /* threshold */ 0.8, /* max_reduction */ 0.25);
(repetition_factor * entropy_factor * branch_factor).max(0.7) // Never reduce below 70%
}
}
Key Parameters:
- Repetition: Threshold 1.0, max 20% reduction (configurable via max_repetition_reduction)
- Entropy: Threshold 0.4 (hardcoded), max 15% reduction (configurable via max_entropy_reduction)
- Branch Similarity: Threshold 0.8 (configurable via branch_threshold), max 25% reduction (configurable via max_branch_reduction)
- Combined Floor: Minimum 70% of original complexity preserved (configurable via max_combined_reduction)
Example: Repetitive Validation Function
Raw Complexity: 20
Pattern Repetition: 0.95 (very high)
Token Entropy: 0.3 (low variety)
Branch Similarity: 0.9 (very similar branches)
repetition_factor ≈ 0.85 (15% reduction)
entropy_factor ≈ 0.90 (10% reduction)
branch_factor ≈ 0.80 (20% reduction)
dampening_factor = (0.85 × 0.90 × 0.80) = 0.612
dampening_factor = max(0.612, 0.7) = 0.7 // Floor applied
Effective Complexity = 20 × 0.7 = 14
Result: 30% reduction (maximum allowed)
Example: Diverse State Machine
Raw Complexity: 20
Pattern Repetition: 0.2 (low - not repetitive)
Token Entropy: 0.8 (high variety)
Branch Similarity: 0.3 (diverse branches)
repetition_factor ≈ 1.0 (no reduction)
entropy_factor ≈ 1.0 (no reduction)
branch_factor ≈ 1.0 (no reduction)
dampening_factor = (1.0 × 1.0 × 1.0) = 1.0
Effective Complexity = 20 × 1.0 = 20
Result: 0% reduction (complexity preserved)
Real-World Examples
Example 1: Validation Function
#![allow(unused)]
fn main() {
fn validate_config(config: &Config) -> Result<()> {
if config.output_dir.is_none() {
return Err(anyhow!("output_dir required"));
}
if config.max_workers.is_none() {
return Err(anyhow!("max_workers required"));
}
if config.timeout_secs.is_none() {
return Err(anyhow!("timeout_secs required"));
}
// ... 17 more similar checks
Ok(())
}
}
Traditional analysis:
- Cyclomatic Complexity: 20
- Assessment: CRITICAL
Entropy analysis:
- Shannon Entropy: 0.3 (low variety)
- Pattern Repetition: 0.9 (highly repetitive)
- Branch Similarity: 0.95 (nearly identical)
- Effective Complexity: 14 (30% reduction, the maximum under the default 0.7 floor)
- Assessment: LOW PRIORITY
Example 2: State Machine Logic
#![allow(unused)]
fn main() {
fn reconcile_state(current: &State, desired: &State) -> Vec<Action> {
let mut actions = vec![];
match (current.mode, desired.mode) {
(Mode::Active, Mode::Standby) => {
if current.has_active_connections() {
actions.push(Action::DrainConnections);
actions.push(Action::WaitForDrain);
}
actions.push(Action::TransitionToStandby);
}
(Mode::Standby, Mode::Active) => {
if desired.requires_warmup() {
actions.push(Action::Warmup);
}
actions.push(Action::TransitionToActive);
}
// ... more diverse state transitions
_ => {}
}
actions
}
}
Traditional analysis:
- Cyclomatic Complexity: 8
- Assessment: MODERATE
Entropy analysis:
- Shannon Entropy: 0.85 (high variety)
- Pattern Repetition: 0.2 (not repetitive)
- Branch Similarity: 0.3 (diverse branches)
- Effective Complexity: 8 (no dampening applied)
- Assessment: HIGH PRIORITY
Configuration
Configure entropy analysis in .debtmap.toml or disable via the --semantic-off CLI flag.
[entropy]
# Enable entropy analysis (default: true)
enabled = true
# Weight of entropy in overall complexity scoring (0.0-1.0, default: 1.0)
# Note: This affects scoring, not dampening thresholds
weight = 1.0
# Minimum tokens required for entropy calculation (default: 20)
min_tokens = 20
# Pattern similarity threshold for repetition detection (0.0-1.0, default: 0.7)
pattern_threshold = 0.7
# Enable advanced token classification (default: false for backward compatibility)
# When true, weights tokens by semantic importance (control flow > operators > identifiers)
use_classification = false
# Branch similarity threshold (0.0-1.0, default: 0.8)
# Branches with similarity above this threshold contribute to dampening
branch_threshold = 0.8
# Maximum reduction limits (these are configurable)
max_repetition_reduction = 0.20 # Max 20% reduction from pattern repetition
max_entropy_reduction = 0.15 # Max 15% reduction from low token entropy
max_branch_reduction = 0.25 # Max 25% reduction from branch similarity
max_combined_reduction = 0.30 # Overall cap at 30% reduction (minimum 70% preserved)
Important Notes:
- Dampening thresholds - Some are configurable, some are hardcoded (src/complexity/entropy.rs:185-195):
  - Entropy factor threshold: 0.4 - Hardcoded internally (not configurable)
  - Branch threshold: 0.8 - Configurable via branch_threshold in config file
  - Pattern threshold: 0.7/1.0 - Configurable via pattern_threshold in config file
- The weight parameter affects how entropy scores contribute to overall complexity scoring, but does not change the dampening thresholds or reductions.
- Token classification defaults to false (disabled) for backward compatibility, even though it provides more accurate entropy analysis when enabled.
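To make the defaults and their relationship concrete, here is a sketch of how the [entropy] options could be modeled in code. The `EntropyConfig` struct, its `validate` method, and `dampening_floor` are hypothetical names for illustration and are not taken from Debtmap's source; only the field names and default values come from the configuration shown above.

```rust
/// Hypothetical in-memory mirror of the [entropy] table.
#[derive(Debug, Clone, PartialEq)]
struct EntropyConfig {
    enabled: bool,
    weight: f64,
    min_tokens: usize,
    pattern_threshold: f64,
    use_classification: bool,
    branch_threshold: f64,
    max_repetition_reduction: f64,
    max_entropy_reduction: f64,
    max_branch_reduction: f64,
    max_combined_reduction: f64,
}

impl Default for EntropyConfig {
    /// Defaults as documented in the configuration listing.
    fn default() -> Self {
        EntropyConfig {
            enabled: true,
            weight: 1.0,
            min_tokens: 20,
            pattern_threshold: 0.7,
            use_classification: false,
            branch_threshold: 0.8,
            max_repetition_reduction: 0.20,
            max_entropy_reduction: 0.15,
            max_branch_reduction: 0.25,
            max_combined_reduction: 0.30,
        }
    }
}

impl EntropyConfig {
    /// Smallest dampening factor this configuration allows
    /// (0.30 combined reduction means at least 70% is preserved).
    fn dampening_floor(&self) -> f64 {
        1.0 - self.max_combined_reduction
    }

    /// Assumed sanity check: every ratio-style option must lie in 0.0-1.0.
    fn validate(&self) -> Result<(), String> {
        let unit_fields = [
            ("weight", self.weight),
            ("pattern_threshold", self.pattern_threshold),
            ("branch_threshold", self.branch_threshold),
            ("max_repetition_reduction", self.max_repetition_reduction),
            ("max_entropy_reduction", self.max_entropy_reduction),
            ("max_branch_reduction", self.max_branch_reduction),
            ("max_combined_reduction", self.max_combined_reduction),
        ];
        for (name, value) in unit_fields.iter() {
            if !(0.0..=1.0).contains(value) {
                return Err(format!("{} must be within 0.0-1.0, got {}", name, value));
            }
        }
        Ok(())
    }
}
```

The `dampening_floor` helper makes explicit that max_combined_reduction and the "minimum 70% preserved" floor are two views of the same setting.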
Tuning for Your Project
Enable token classification for better accuracy:
[entropy]
enabled = true
use_classification = true # Weight control flow keywords more heavily
Strict mode (fewer reductions, flag more code):
[entropy]
enabled = true
max_repetition_reduction = 0.10 # Reduce from default 0.20
max_entropy_reduction = 0.08 # Reduce from default 0.15
max_branch_reduction = 0.12 # Reduce from default 0.25
max_combined_reduction = 0.20 # Reduce from default 0.30 (preserve 80%)
Lenient mode (more aggressive reduction):
[entropy]
enabled = true
max_repetition_reduction = 0.30 # Increase from default 0.20
max_entropy_reduction = 0.25 # Increase from default 0.15
max_branch_reduction = 0.35 # Increase from default 0.25
max_combined_reduction = 0.50 # Increase from default 0.30 (preserve 50%)
Disable entropy dampening entirely:
[entropy]
enabled = false
Or via CLI (disables entropy-based complexity adjustments):
# Disables semantic analysis features including entropy dampening
debtmap analyze . --semantic-off
Note: The --semantic-off flag disables all semantic analysis features, including entropy-based complexity adjustments. This is useful when you want raw cyclomatic complexity without any dampening.
Interpreting Entropy-Adjusted Output
When entropy analysis detects repetitive patterns, debtmap displays both the original and adjusted complexity values to help you understand the adjustment. This transparency allows you to verify the analysis and understand why certain code receives lower priority.
Output Format
When viewing detailed output (verbosity level 2 with -vv), entropy-adjusted complexity is shown in the COMPLEXITY section:
COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
And in the Entropy Impact scoring section:
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
Understanding the Values
- cyclomatic=20: Original cyclomatic complexity before adjustment
- dampened: 14: Adjusted complexity after entropy analysis (20 × 0.70 = 14)
- factor: 0.70: The dampening factor applied (0.70 = 30% reduction)
- entropy=0.30: Shannon entropy score (0.0-1.0, lower = more repetitive)
- repetition: 95%: Pattern repetition score (higher = more repetitive)
Reconstructing the Calculation
You can verify the adjustment by multiplying:
original_complexity × dampening_factor = adjusted_complexity
20 × 0.70 = 14
The dampening percentage shown in the Entropy Impact section is:
dampening_percentage = (1.0 - dampening_factor) × 100%
(1.0 - 0.70) × 100% = 30%
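The same check can be scripted when reviewing many entries. The helpers below are a sketch with hypothetical names; rounding the adjusted value to the nearest integer is an assumption based on the whole-number `dampened` value in the sample output.

```rust
/// original_complexity × dampening_factor, rounded to a whole number
/// (assumption: the report shows the nearest integer).
fn adjusted_complexity(original: u32, dampening_factor: f64) -> u32 {
    (f64::from(original) * dampening_factor).round() as u32
}

/// (1.0 - dampening_factor) × 100%
fn dampening_percentage(dampening_factor: f64) -> f64 {
    (1.0 - dampening_factor) * 100.0
}

/// True when a reported (original, dampened, factor) triple is consistent.
fn is_consistent(original: u32, dampened: u32, dampening_factor: f64) -> bool {
    adjusted_complexity(original, dampening_factor) == dampened
}
```

Applied to the sample line, `is_consistent(20, 14, 0.70)` holds and the dampening percentage evaluates to 30 (up to floating-point rounding).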
When Entropy Data is Unavailable
If a function is too small for entropy analysis (fewer than min_tokens tokens, 20 by default) or entropy is disabled, the output shows complexity without dampening:
COMPLEXITY: cyclomatic=5, est_branches=10, cognitive=8, nesting=2
No “dampened” or “factor” values are shown, indicating the raw complexity is used for scoring.
Example Output Comparison
Before entropy-adjustment:
#1 SCORE: 95.5 [CRITICAL]
├─ COMPLEXITY: cyclomatic=20, est_branches=40, cognitive=25, nesting=3
After entropy-adjustment:
#15 SCORE: 68.2 [HIGH]
├─ COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
The item dropped from rank #1 to #15 because entropy analysis detected the high complexity was primarily due to repetitive validation patterns rather than genuine cognitive complexity.
Understanding the Impact
Measuring False Positive Reduction
Run analysis with and without entropy:
# Without entropy
debtmap analyze . --semantic-off --top 20 > without_entropy.txt
# With entropy (default)
debtmap analyze . --top 20 > with_entropy.txt
# Compare
diff without_entropy.txt with_entropy.txt
Expected results:
- 60-75% reduction in flagged validation functions
- 40-50% reduction in flagged dispatcher functions
- 20-30% reduction in flagged configuration parsers
- No reduction in genuinely complex state machines or business logic
Verifying Correctness
Entropy analysis should:
- Reduce flags on repetitive code (validators, dispatchers)
- Preserve flags on genuinely complex code (state machines, business logic)
If entropy analysis incorrectly reduces flags on genuinely complex code, adjust configuration:
[entropy]
max_combined_reduction = 0.20 # Reduce from default 0.30 (preserve 80%)
max_repetition_reduction = 0.10 # Reduce individual factors
max_entropy_reduction = 0.08
max_branch_reduction = 0.12
Best Practices
- Use default settings - They work well for most projects
- Verify results - Spot-check top-priority items to ensure correctness
- Tune conservatively - Start with default settings, adjust if needed
- Disable for debugging - Use --semantic-off if entropy seems incorrect
- Report issues - If entropy incorrectly flags code, report it
Limitations
Entropy analysis works best for:
- Functions with cyclomatic complexity 10-50
- Code with clear repetitive patterns
- Validation, dispatch, and configuration functions
Entropy analysis is less effective for:
- Very simple functions (complexity < 5)
- Very complex functions (complexity > 100)
- Obfuscated or generated code
Comparison with Other Approaches
| Approach | False Positive Rate | Complexity | Speed |
|---|---|---|---|
| Raw Cyclomatic Complexity | High (many false positives) | Low | Fast |
| Cognitive Complexity | Medium | Medium | Medium |
| Entropy Analysis (Debtmap) | Low | High | Fast |
| Manual Code Review | Very Low | Very High | Very Slow |
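Among the automated approaches compared here, Debtmap’s entropy analysis combines a low false positive rate with fast analysis; manual code review remains more accurate but is far slower.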
See Also
- Why Debtmap? - Real-world examples of entropy analysis
- Analysis Guide - General analysis concepts
- Configuration - Complete configuration reference
Error Handling Analysis
Debtmap provides comprehensive error handling analysis across all supported languages (Rust, Python, JavaScript, TypeScript), detecting anti-patterns that lead to silent failures, production panics, and difficult-to-debug issues.
Overview
Error handling issues are classified as ErrorSwallowing debt with Major severity (weight 4), reflecting their significant impact on code reliability and debuggability. Debtmap detects:
- Error swallowing: Exception handlers that silently catch errors without logging or re-raising
- Panic patterns: Rust code that can panic in production (unwrap, expect, panic!)
- Error propagation issues: Missing error context in Result chains
- Async error handling: Unhandled promise rejections, dropped futures, missing await
- Python-specific patterns: Bare except clauses, silent exception handling
Detection is context-aware: patterns found in test code (e.g., #[cfg(test)] modules, functions with a test_ prefix) receive lower priority or are excluded entirely.
Rust Error Handling Analysis
Panic Pattern Detection
Debtmap identifies Rust code that can panic at runtime instead of returning Result:
Detected patterns:
#![allow(unused)]
fn main() {
// ❌ CRITICAL: Direct panic in production code
fn process_data(value: Option<i32>) -> i32 {
panic!("not implemented"); // Detected: PanicInNonTest
}
// ❌ HIGH: Unwrap on Result
fn read_config(path: &Path) -> Config {
let content = fs::read_to_string(path).unwrap(); // Detected: UnwrapOnResult
parse_config(&content)
}
// ❌ HIGH: Unwrap on Option
fn get_user(id: u32) -> User {
users.get(&id).unwrap() // Detected: UnwrapOnOption
}
// ❌ MEDIUM: Expect with generic message
fn parse_value(s: &str) -> i32 {
s.parse().expect("parse failed") // Detected: ExpectWithGenericMessage
}
// ❌ MEDIUM: TODO in production
fn calculate_tax(amount: f64) -> f64 {
todo!("implement tax calculation") // Detected: TodoInProduction
}
}
Recommended alternatives:
#![allow(unused)]
fn main() {
// ✅ GOOD: Propagate errors with ?
fn read_config(path: &Path) -> Result<Config> {
let content = fs::read_to_string(path)?;
parse_config(&content)
}
// ✅ GOOD: Handle Option explicitly
fn get_user(id: u32) -> Result<User> {
users.get(&id)
.ok_or_else(|| anyhow!("User {} not found", id))
}
// ✅ GOOD: Add meaningful context
fn parse_value(s: &str) -> Result<i32> {
s.parse()
.with_context(|| format!("Failed to parse '{}' as integer", s))
}
}
Test code exceptions:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
#[test]
fn test_parsing() {
let result = "42".parse::<i32>().unwrap(); // ✅ OK in tests (LOW priority)
assert_eq!(result, 42);
}
}
}
Debtmap detects #[cfg(test)] attributes and test function contexts, automatically assigning Low priority to panic patterns in test code.
Error Propagation Analysis
Debtmap detects missing error context in Result chains:
#![allow(unused)]
fn main() {
// ❌ Missing context - which file failed? What was the error?
fn load_multiple_configs(paths: &[PathBuf]) -> Result<Vec<Config>> {
paths.iter()
.map(|p| fs::read_to_string(p)) // Error loses file path information
.collect::<Result<Vec<_>>>()?
.into_iter()
.map(|c| parse_config(&c)) // Error loses which config failed
.collect()
}
// ✅ GOOD: Preserve context through the chain
fn load_multiple_configs(paths: &[PathBuf]) -> Result<Vec<Config>> {
paths.iter()
.map(|p| {
fs::read_to_string(p)
.with_context(|| format!("Failed to read config from {}", p.display()))
})
.collect::<Result<Vec<_>>>()?
.into_iter()
.enumerate()
.map(|(i, content)| {
parse_config(&content)
.with_context(|| format!("Failed to parse config #{}", i))
})
.collect()
}
}
Best practices:
- Use .context() or .with_context() from anyhow (thiserror defines error types but does not provide these methods)
- Include relevant values in error messages (file paths, indices, input values)
- Maintain error context at each transformation in the chain
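The same practices apply when anyhow is not available. The sketch below uses only the standard library and `map_err` to attach context; the function names and the use of `String` as the error type are simplifications chosen for illustration, not a recommendation for production error types.

```rust
use std::fs;
use std::path::Path;

/// Read a file and attach the path to any I/O error.
fn read_config_text(path: &Path) -> Result<String, String> {
    fs::read_to_string(path)
        .map_err(|e| format!("Failed to read config from {}: {}", path.display(), e))
}

/// Parse a port number and keep both the offending input and its source.
fn parse_port(text: &str, source: &Path) -> Result<u16, String> {
    let trimmed = text.trim();
    trimmed.parse::<u16>().map_err(|e| {
        format!(
            "Failed to parse port {:?} from {}: {}",
            trimmed,
            source.display(),
            e
        )
    })
}
```

Each transformation adds the value that failed (the path, the raw input) before the error moves up the chain, which is the property Debtmap's context loss detection looks for.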
Error Swallowing in Rust
Debtmap detects seven distinct patterns of error swallowing in Rust, where errors are silently ignored without logging or propagation:
1. IfLetOkNoElse - Missing else branch
#![allow(unused)]
fn main() {
// ❌ Detected: if let Ok without else branch
fn try_update(value: &str) {
if let Ok(parsed) = value.parse::<i32>() {
update_value(parsed);
}
// Error case silently ignored - no logging or handling
}
// ✅ GOOD: Handle both cases
fn try_update(value: &str) -> Result<()> {
if let Ok(parsed) = value.parse::<i32>() {
update_value(parsed);
Ok(())
} else {
Err(anyhow!("Failed to parse value: {}", value))
}
}
}
2. IfLetOkEmptyElse - Empty else branch
#![allow(unused)]
fn main() {
// ❌ Detected: if let Ok with empty else
fn process_result(result: Result<Data, Error>) {
if let Ok(data) = result {
process(data);
} else {
// Empty else - error silently swallowed
}
}
// ✅ GOOD: Log the error
fn process_result(result: Result<Data, Error>) {
if let Ok(data) = result {
process(data);
} else {
log::error!("Failed to process: {:?}", result);
}
}
}
3. LetUnderscoreResult - Discarding Result with let _
#![allow(unused)]
fn main() {
// ❌ Detected: Result discarded with let _
fn save_data(data: &Data) {
let _ = fs::write("data.json", serde_json::to_string(data).unwrap());
// Write failure silently ignored
}
// ✅ GOOD: Handle or propagate the error
fn save_data(data: &Data) -> Result<()> {
fs::write("data.json", serde_json::to_string(data)?)
.context("Failed to save data")?;
Ok(())
}
}
4. OkMethodDiscard - Calling .ok() and discarding
#![allow(unused)]
fn main() {
// ❌ Detected: .ok() called but result discarded
fn try_parse(s: &str) -> Option<i32> {
s.parse::<i32>().ok(); // Result immediately discarded
None
}
// ✅ GOOD: Use the Ok value or log the error
fn try_parse(s: &str) -> Option<i32> {
match s.parse::<i32>() {
Ok(v) => Some(v),
Err(e) => {
log::warn!("Failed to parse '{}': {}", s, e);
None
}
}
}
}
5. MatchIgnoredErr - Match with ignored error variant
#![allow(unused)]
fn main() {
// ❌ Detected: match with _ in Err branch
fn try_load(path: &Path) -> Option<String> {
match fs::read_to_string(path) {
Ok(content) => Some(content),
Err(_) => None, // Error details ignored
}
}
// ✅ GOOD: Log the error with context
fn try_load(path: &Path) -> Option<String> {
match fs::read_to_string(path) {
Ok(content) => Some(content),
Err(e) => {
log::error!("Failed to read {}: {}", path.display(), e);
None
}
}
}
}
6. UnwrapOrNoLog - .unwrap_or() without logging
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap_or without logging
fn get_config_value(key: &str) -> String {
load_config()
.and_then(|c| c.get(key))
.unwrap_or("default".to_string())
// Error silently replaced with default
}
// ✅ GOOD: Log before falling back to default
fn get_config_value(key: &str) -> String {
match load_config().and_then(|c| c.get(key)) {
Ok(value) => value,
Err(e) => {
log::warn!("Config key '{}' not found: {}. Using default.", key, e);
"default".to_string()
}
}
}
}
7. UnwrapOrDefaultNoLog - .unwrap_or_default() without logging
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap_or_default without logging
fn load_settings() -> Settings {
read_settings_file().unwrap_or_default()
// Error silently replaced with default settings
}
// ✅ GOOD: Log the fallback to defaults
fn load_settings() -> Settings {
match read_settings_file() {
Ok(settings) => settings,
Err(e) => {
log::warn!("Failed to load settings: {}. Using defaults.", e);
Settings::default()
}
}
}
}
Summary of Error Swallowing Patterns:
| Pattern | Description | Common Cause |
|---|---|---|
| IfLetOkNoElse | if let Ok(..) without else | Quick prototyping, forgotten error path |
| IfLetOkEmptyElse | if let Ok(..) with empty else | Incomplete implementation |
| LetUnderscoreResult | let _ = result | Intentional ignore without thought |
| OkMethodDiscard | .ok() result not used | Misunderstanding of .ok() semantics |
| MatchIgnoredErr | Err(_) => ... with no logging | Generic error handling |
| UnwrapOrNoLog | .unwrap_or() without logging | Convenience over observability |
| UnwrapOrDefaultNoLog | .unwrap_or_default() without logging | Default fallback without visibility |
All these patterns are detected at Medium to High priority depending on context, as they represent lost error information that makes debugging difficult.
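Several of the fixes above share one shape: record the error, then fall back. A small helper can make that explicit. This is a sketch with hypothetical names (`ok_or_log`, `parse_or_default`); it writes to stderr with `eprintln!` as a stand-in for a logging crate so that the example has no dependencies.

```rust
use std::fmt::Display;

/// Convert a Result into an Option, reporting the error instead of
/// discarding it silently.
fn ok_or_log<T, E: Display>(result: Result<T, E>, context: &str) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(e) => {
            // Stand-in for log::warn!; the error detail is preserved.
            eprintln!("warning: {}: {}", context, e);
            None
        }
    }
}

/// Fall back to a default only after the failure has been reported.
fn parse_or_default(input: &str, default: i32) -> i32 {
    ok_or_log(input.parse::<i32>(), "failed to parse integer").unwrap_or(default)
}
```

Whether Debtmap treats a helper like this as "logged" depends on its detection rules, which are not described here; the point of the sketch is only that the error information reaches an observable channel before the default is used.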
Python Error Handling Analysis
Bare Except Clause Detection
Python’s bare except: catches all exceptions, including system exits and keyboard interrupts:
# ❌ CRITICAL: Bare except catches everything
def process_file(path):
try:
with open(path) as f:
return f.read()
except: # Detected: BareExceptClause
return None # Catches SystemExit, KeyboardInterrupt, etc.
# ❌ HIGH: Catching Exception is too broad
def load_config(path):
try:
return yaml.load(open(path))
except Exception: # Detected: OverlyBroadException
return {} # Silent failure loses error information
# ✅ GOOD: Specific exception types
def process_file(path):
try:
with open(path) as f:
return f.read()
except FileNotFoundError:
log.error(f"File not found: {path}")
return None
except PermissionError:
log.error(f"Permission denied: {path}")
return None
Why bare except is dangerous:
- Catches SystemExit (prevents clean shutdown)
- Catches KeyboardInterrupt (prevents Ctrl+C)
- Catches GeneratorExit (breaks generator protocol)
- Masks programming errors like NameError, AttributeError
Best practices:
- Always specify exception types: except ValueError, except (TypeError, KeyError)
- Use except Exception only when truly catching all application errors
- Never use bare except: in production code
- Log exceptions with full context before suppressing
Silent Exception Handling
# ❌ Silent exception handling
def get_user_age(user_id):
try:
user = db.get_user(user_id)
return user.age
except: # Detected: SilentException (no logging, no re-raise)
pass
# ✅ GOOD: Log and provide meaningful default
def get_user_age(user_id):
try:
user = db.get_user(user_id)
return user.age
except UserNotFound:
logger.warning(f"User {user_id} not found")
return None
except DatabaseError as e:
logger.error(f"Database error fetching user {user_id}: {e}")
raise # Re-raise for caller to handle
Contextlib Suppress Detection
Python’s contextlib.suppress() intentionally silences exceptions, which can hide errors:
from contextlib import suppress
# ❌ MEDIUM: contextlib.suppress hides errors
def cleanup_temp_files(paths):
for path in paths:
with suppress(FileNotFoundError, PermissionError):
os.remove(path) # Detected: ContextlibSuppress
# Errors silently suppressed - no visibility into failures
# ✅ GOOD: Log suppressed errors
def cleanup_temp_files(paths):
for path in paths:
try:
os.remove(path)
except FileNotFoundError:
logger.debug(f"File already deleted: {path}")
except PermissionError as e:
logger.warning(f"Permission denied removing {path}: {e}")
except Exception as e:
logger.error(f"Unexpected error removing {path}: {e}")
# ✅ ACCEPTABLE: Use suppress only for truly ignorable cases
def best_effort_cleanup(paths):
"""Best-effort cleanup - failures are expected and acceptable."""
for path in paths:
with suppress(OSError): # OK if documented and intentional
os.remove(path)
When contextlib.suppress is acceptable:
- Cleanup operations where failures are genuinely unimportant
- Operations explicitly documented as “best effort”
- Code where logging would create noise without value
When to avoid contextlib.suppress:
- Production code where error visibility matters
- Operations where partial failure should be noticed
- Any case where debugging might be needed later
Exception Flow Analysis
Debtmap tracks exception propagation through Python codebases to identify functions that can raise exceptions without proper handling. This analysis helps ensure that exceptions are either caught at appropriate levels or documented in the function’s interface.
# Potential issue: Exceptions may propagate unhandled
def process_batch(items):
for item in items:
validate_item(item) # Can raise ValueError
transform_item(item) # Can raise TransformError
save_item(item) # Can raise DatabaseError
# ✅ GOOD: Handle exceptions appropriately
def process_batch(items):
results = {"success": 0, "failed": 0}
for item in items:
try:
validate_item(item)
transform_item(item)
save_item(item)
results["success"] += 1
except ValueError as e:
logger.warning(f"Invalid item {item.id}: {e}")
results["failed"] += 1
except (TransformError, DatabaseError) as e:
logger.error(f"Failed to process item {item.id}: {e}")
results["failed"] += 1
# Optionally re-raise critical errors
if isinstance(e, DatabaseError):
raise
return results
Async Error Handling
Unhandled Promise Rejections (JavaScript/TypeScript)
Note: JavaScript and TypeScript support in debtmap currently focuses on complexity analysis and basic error patterns. Advanced async error handling detection (unhandled promise rejections, missing await) is primarily implemented for Rust async code.
Current Language Support Comparison:
| Feature | Rust | Python | JavaScript/TypeScript |
|---|---|---|---|
| Basic error swallowing | ✅ Full | ✅ Full | ✅ Basic |
| Panic/exception patterns | ✅ Full | ✅ Full | ⚠️ Limited |
| Async error detection | ✅ Full | N/A | ⚠️ Limited |
| Error propagation analysis | ✅ Full | ✅ Basic | ❌ Not yet |
| Context loss detection | ✅ Full | ⚠️ Limited | ❌ Not yet |
Current JavaScript/TypeScript detection includes:
- Basic try/catch without error handling
- Some promise rejection patterns
- Complexity analysis
Enhanced JavaScript/TypeScript async error detection is planned for future releases.
// ❌ CRITICAL: Unhandled promise rejection
async function loadUserData(userId) {
const response = await fetch(`/api/users/${userId}`);
// If fetch rejects, promise is unhandled
return response.json();
}
loadUserData(123); // Detected: UnhandledPromiseRejection
// ✅ GOOD: Handle rejections
async function loadUserData(userId) {
try {
const response = await fetch(`/api/users/${userId}`);
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
return await response.json();
} catch (error) {
console.error(`Failed to load user ${userId}:`, error);
throw error; // Re-throw or return default
}
}
loadUserData(123).catch(err => {
console.error("Top-level error handler:", err);
});
Missing Await Detection
// ❌ HIGH: Missing await - promise dropped
async function saveAndNotify(data) {
await saveToDatabase(data);
sendNotification(data.userId); // Detected: MissingAwait
// Function returns before notification completes
}
// ✅ GOOD: Await all async operations
async function saveAndNotify(data) {
await saveToDatabase(data);
await sendNotification(data.userId);
}
Async Rust Error Handling
Debtmap detects five async-specific error handling patterns in Rust:
1. DroppedFuture - Future dropped without awaiting
#![allow(unused)]
fn main() {
// ❌ HIGH: Dropped future without error handling
async fn process_requests(requests: Vec<Request>) {
for req in requests {
tokio::spawn(async move {
handle_request(req).await // Detected: DroppedFuture
// Errors silently dropped
});
}
}
// ✅ GOOD: Join handles and propagate errors
async fn process_requests(requests: Vec<Request>) -> Result<()> {
let handles: Vec<_> = requests.into_iter()
.map(|req| {
tokio::spawn(async move {
handle_request(req).await
})
})
.collect();
for handle in handles {
handle.await??; // Propagate both JoinError and handler errors
}
Ok(())
}
}
2. UnhandledJoinHandle - Spawned task without join
#![allow(unused)]
fn main() {
// ❌ HIGH: Task spawned but handle never checked
async fn background_sync() {
tokio::spawn(async {
sync_to_database().await // Detected: UnhandledJoinHandle
});
// Handle dropped - can't detect if task panicked or failed
}
// ✅ GOOD: Store and check join handle
async fn background_sync() -> Result<()> {
let handle = tokio::spawn(async {
sync_to_database().await
});
handle.await? // Wait for completion and check for panic
}
}
3. SilentTaskPanic - Task panic without monitoring
#![allow(unused)]
fn main() {
// ❌ HIGH: Task panic silently ignored
tokio::spawn(async {
panic!("task failed"); // Detected: SilentTaskPanic
});
// ✅ GOOD: Handle task panics
let handle = tokio::spawn(async {
critical_operation().await
});
match handle.await {
Ok(Ok(result)) => println!("Success: {:?}", result),
Ok(Err(e)) => eprintln!("Task failed: {}", e),
Err(e) => eprintln!("Task panicked: {}", e),
}
}
4. SpawnWithoutJoin - Spawning without storing handle
#![allow(unused)]
fn main() {
// ❌ MEDIUM: Spawn without storing handle
async fn fire_and_forget_tasks(items: Vec<Item>) {
for item in items {
tokio::spawn(process_item(item)); // Detected: SpawnWithoutJoin
// No way to check task completion or errors
}
}
// ✅ GOOD: Collect handles for later checking
async fn process_tasks_with_monitoring(items: Vec<Item>) -> Result<()> {
let handles: Vec<_> = items.into_iter()
.map(|item| tokio::spawn(process_item(item)))
.collect();
for handle in handles {
handle.await??;
}
Ok(())
}
}
5. SelectBranchIgnored - Select branch without error handling
#![allow(unused)]
fn main() {
// ❌ MEDIUM: tokio::select! branch error ignored
async fn process_with_timeout(data: Data) {
tokio::select! {
result = process_data(data) => {
// Detected: SelectBranchIgnored
// result could be Err but not checked
}
_ = tokio::time::sleep(Duration::from_secs(5)) => {
println!("Timeout");
}
}
}
// ✅ GOOD: Handle errors in select branches
async fn process_with_timeout(data: Data) -> Result<()> {
tokio::select! {
result = process_data(data) => {
result?; // Propagate error
Ok(())
}
_ = tokio::time::sleep(Duration::from_secs(5)) => {
Err(anyhow!("Processing timeout after 5s"))
}
}
}
}
Async Error Pattern Summary:
| Pattern | Severity | Description | Common in |
|---|---|---|---|
| DroppedFuture | High | Future result ignored | Fire-and-forget spawns |
| UnhandledJoinHandle | High | JoinHandle never checked | Background tasks |
| SilentTaskPanic | High | Task panic not monitored | Unmonitored spawns |
| SpawnWithoutJoin | Medium | Handle not stored | Quick prototypes |
| SelectBranchIgnored | Medium | select! branch error ignored | Concurrent operations |
All async error patterns emphasize the importance of properly handling errors in concurrent Rust code, where failures can easily go unnoticed.
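The join-and-inspect idea behind these fixes can be tried without an async runtime. The sketch below uses `std::thread` as an analogy for `tokio::spawn`: a thread's `join()` returns `Err` if the thread panicked, much as awaiting a Tokio `JoinHandle` returns a `JoinError`. The `work` function and `TaskOutcome` enum are hypothetical and exist only for this example.

```rust
use std::thread;

#[derive(Debug, PartialEq)]
enum TaskOutcome {
    Completed(u32),
    Failed(String),
    Panicked,
}

/// Hypothetical unit of work: errors on negative input, panics on zero.
fn work(n: i32) -> Result<u32, String> {
    if n < 0 {
        return Err(format!("negative input: {}", n));
    }
    if n == 0 {
        panic!("zero is not supported");
    }
    Ok(n as u32 * 2)
}

/// Spawn one thread per input, keep every handle, and classify each result.
fn run_and_collect(inputs: Vec<i32>) -> Vec<TaskOutcome> {
    let handles: Vec<thread::JoinHandle<Result<u32, String>>> = inputs
        .into_iter()
        .map(|n| thread::spawn(move || work(n)))
        .collect();

    handles
        .into_iter()
        .map(|handle| match handle.join() {
            Ok(Ok(value)) => TaskOutcome::Completed(value),
            Ok(Err(message)) => TaskOutcome::Failed(message),
            Err(_) => TaskOutcome::Panicked, // the thread panicked
        })
        .collect()
}
```

Because every handle is stored and joined, all three outcomes (success, returned error, panic) are visible to the caller, which is exactly what the dropped-handle patterns above lose.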
Severity Levels and Prioritization
Error handling issues are assigned severity based on their impact:
| Pattern | Severity | Weight | Priority | Rationale |
|---|---|---|---|---|
| Panic in production | CRITICAL | 4 | Critical | Crashes the process |
| Bare except clause | CRITICAL | 4 | Critical | Masks system signals |
| Silent task panic | CRITICAL | 4 | Critical | Hidden failures |
| Unwrap on Result/Option | HIGH | 4 | High | Likely to panic |
| Dropped future | HIGH | 4 | High | Lost error information |
| Unhandled promise rejection | HIGH | 4 | High | Silently fails |
| Error swallowing | MEDIUM | 4 | Medium | Loses debugging context |
| Missing error context | MEDIUM | 4 | Medium | Hard to debug |
| Expect with generic message | MEDIUM | 4 | Medium | Uninformative errors |
| TODO in production | MEDIUM | 4 | Medium | Incomplete implementation |
All ErrorSwallowing debt has weight 4 (Major severity), but individual patterns receive different priorities based on production impact.
Integration with Risk Scoring
Error handling issues contribute to the debt_factor in Debtmap’s risk scoring formula:
risk_score = (complexity_factor * 0.4) + (debt_factor * 0.3) + (coverage_factor * 0.3)
where debt_factor includes:
- ErrorSwallowing count * weight (4)
- Combined with other debt types
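As a rough illustration of the formula, the weights can be applied as follows. This is a sketch: the function names are hypothetical, and how Debtmap scales or normalizes each factor before weighting is not specified here, so the inputs are treated as already comparable numbers.

```rust
/// Weight of the ErrorSwallowing debt category (Major severity).
const ERROR_SWALLOWING_WEIGHT: f64 = 4.0;

/// Contribution of error handling issues to the debt factor:
/// issue count multiplied by the category weight.
fn error_swallowing_debt(count: u32) -> f64 {
    f64::from(count) * ERROR_SWALLOWING_WEIGHT
}

/// risk_score = complexity × 0.4 + debt × 0.3 + coverage × 0.3
fn risk_score(complexity_factor: f64, debt_factor: f64, coverage_factor: f64) -> f64 {
    complexity_factor * 0.4 + debt_factor * 0.3 + coverage_factor * 0.3
}
```

With three error handling issues the debt contribution is 12, and inputs of 10, 12 and 5 give 4.0 + 3.6 + 1.5 = 9.1, showing that a few swallowed errors can add almost as much to the score as the complexity term.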
Compound risk example:
#![allow(unused)]
fn main() {
// HIGH RISK: High complexity + error swallowing + low coverage
fn process_transaction(tx: Transaction) -> bool { // Cyclomatic: 12, Cognitive: 18
if tx.amount > 1000 {
if tx.verified {
if validate_funds(&tx).unwrap() { // ❌ Panic pattern
if tx.user_type == "premium" {
match apply_premium_discount(&tx) {
Ok(_) => {},
Err(_) => return false, // ❌ Error swallowed
}
}
charge_account(&tx).unwrap(); // ❌ Another panic
return true;
}
}
}
false
}
// Coverage: 45% (untested error paths)
// Risk Score: Very High (complexity + error handling + coverage gaps)
}
This function would be flagged as Priority 1 in Debtmap’s output due to:
- High cyclomatic complexity (12)
- Multiple panic patterns (unwrap calls)
- Error swallowing (ignored Result)
- Coverage gaps in error handling paths
Configuration
Error Handling Configuration Options
By default, all error handling detection is fully enabled. The configuration options below are primarily used to selectively disable specific patterns during gradual adoption or for specific project needs.
Configure error handling analysis in .debtmap.toml:
[error_handling]
# All detection patterns are enabled by default (all default to true)
detect_panic_patterns = true # Rust unwrap/expect/panic detection
detect_swallowing = true # Silent exception handling
detect_async_errors = true # Unhandled promises, dropped futures
detect_context_loss = true # Error propagation without context
detect_propagation = true # Error propagation analysis
# Disable specific patterns for gradual adoption
# detect_async_errors = false
Every detected pattern is reported under the ErrorSwallowing debt category (weight 4), so disabling a detector removes its findings from that category.
Detection Examples
What Gets Detected vs. Not Detected
Rust examples:
#![allow(unused)]
fn main() {
// ❌ Detected: unwrap() in production code
pub fn get_config() -> Config {
load_config().unwrap()
}
// ✅ Not detected: ? operator (proper error propagation)
pub fn get_config() -> Result<Config> {
Ok(load_config()?) // `?` unwraps the value, so wrap it back in Ok
}
// ✅ Not detected: unwrap() in test
#[test]
fn test_config() {
let config = load_config().unwrap(); // OK in tests
assert_eq!(config.port, 8080);
}
// ❌ Detected: expect() with generic message
let value = map.get("key").expect("missing");
// ✅ Not detected: expect() with descriptive context
let value = map.get("key")
.expect("Configuration must contain 'key' field");
}
Python examples:
# ❌ Detected: bare except
try:
    risky_operation()
except:
    pass
# ✅ Not detected: specific exception
try:
    risky_operation()
except ValueError:
    handle_value_error()
# ❌ Detected: silent exception (no logging/re-raise)
try:
    db.save(record)
except DatabaseError:
    pass  # Silent failure
# ✅ Not detected: logged exception
try:
    db.save(record)
except DatabaseError as e:
    logger.error(f"Failed to save record: {e}")
    raise
Suppression Patterns
For cases where error handling patterns are intentional, use suppression comments:
Rust:
#![allow(unused)]
fn main() {
// debtmap: ignore - Unwrap is safe here due to prior validation
let value = validated_map.get("key").unwrap();
}
Python:
try:
    experimental_feature()
except:  # debtmap: ignore - Intentional catch-all during migration
    use_fallback()
See Suppression Patterns for complete syntax and usage.
Best Practices
Rust Error Handling
- Prefer the ? operator over unwrap/expect
#![allow(unused)]
fn main() {
    // Instead of:
    fs::read_to_string(path).unwrap()
    // Use:
    fs::read_to_string(path)?
}
- Use anyhow for application errors, thiserror for libraries
#![allow(unused)]
fn main() {
    use anyhow::{Context, Result};
    fn load_data(path: &Path) -> Result<Data> {
        let content = fs::read_to_string(path)
            .with_context(|| format!("Failed to read {}", path.display()))?;
        parse_data(&content)
            .context("Invalid data format")
    }
}
- Add context at each error boundary
#![allow(unused)]
fn main() {
    .with_context(|| format!("meaningful message with {}", value))
}
- Handle Option explicitly
#![allow(unused)]
fn main() {
    map.get(key).ok_or_else(|| anyhow!("Missing key: {}", key))?
}
Python Error Handling
- Always use specific exception types
except (ValueError, KeyError) as e:
- Log before suppressing
except DatabaseError as e:
    logger.error(f"Database operation failed: {e}", exc_info=True)
    # Then decide: re-raise, return default, or handle
- Avoid bare except completely
# If you must catch everything:
except Exception as e:  # Not a bare except
    logger.exception("Unexpected error")
    raise
- Use context managers for resource cleanup
with open(path) as f:  # Ensures cleanup even on exception
    process(f)
JavaScript/TypeScript Error Handling
- Always handle promise rejections
fetchData().catch(err => console.error(err));
// Or use try/catch with async/await
- Use async/await consistently
async function process() {
    try {
        const data = await fetchData();
        await saveData(data);
    } catch (error) {
        console.error("Failed:", error);
        throw error;
    }
}
- Don’t forget await
await asyncOperation(); // Don't drop promises
Improving Error Handling Based on Debtmap Reports
Workflow
- Run analysis with error focus
debtmap analyze --filter-categories ErrorSwallowing
- Review priority issues first
  - Address CRITICAL (panic in production, bare except) immediately
  - Schedule HIGH (unwrap, dropped futures) for next sprint
  - Plan MEDIUM (missing context) for gradual improvement
- Fix systematically
  - One file or module at a time
  - Add tests as you improve error handling
  - Run debtmap after each fix to verify
- Validate improvements
# Before fixes
debtmap analyze --output before.json
# After fixes
debtmap analyze --output after.json
# Compare
debtmap compare before.json after.json
Migration Strategy for Legacy Code
# .debtmap.toml - Gradual adoption
[error_handling]
# Start with just critical panic patterns
detect_panic_patterns = true
detect_swallowing = false # Add later
detect_async_errors = false # Add later
detect_context_loss = false # Add later
# After fixing panic patterns, enable error swallowing detection
# detect_swallowing = true
# Eventually enable all patterns
# detect_swallowing = true
# detect_async_errors = true
# detect_context_loss = true
# detect_propagation = true
Track progress over time:
# Weekly error handling health check
debtmap analyze --filter-categories ErrorSwallowing | tee weekly-error-health.txt
Troubleshooting
Too Many False Positives in Test Code
Problem: Debtmap flagging unwrap() in test functions
Solution: Debtmap should automatically detect test code via:
- #[cfg(test)] modules in Rust
- #[test] attributes
- test_ function name prefix in Python
- *.test.ts, *.spec.js file patterns
If false positives persist:
#![allow(unused)]
fn main() {
// Use suppression comment
let value = result.unwrap(); // debtmap: ignore - Test assertion
}
Error Patterns Not Being Detected
Problem: Known error patterns not appearing in report
Causes and solutions:
- Language support not enabled
debtmap analyze --languages rust,python,javascript
- Pattern disabled in config
[error_handling]
detect_panic_patterns = true
detect_swallowing = true
detect_async_errors = true
# Ensure relevant detectors are enabled
- Suppression comment present
  - Check for debtmap: ignore comments
  - Review .debtmap.toml ignore patterns
Disagreement with Severity Levels
Problem: Severity feels too high/low for your codebase
Solution: Customize in .debtmap.toml:
[debt_categories.ErrorSwallowing]
weight = 2 # Reduce from default 4 to Warning level
severity = "Warning"
# Or increase for stricter enforcement
# weight = 5
# severity = "Critical"
Can’t Find Which Line Has the Issue
Problem: Debtmap reports error at wrong line number
Causes:
- Source code changed since analysis
- Parser approximation for line numbers
Solutions:
- Re-run analysis: debtmap analyze
- Search for pattern: rg "\.unwrap\(\)" src/
- Enable debug logging: debtmap analyze --log-level debug
Validating Error Handling Improvements
Problem: Unsure if fixes actually improved code quality
Solution: Use compare workflow:
# Baseline before fixes
git checkout main
debtmap analyze --output baseline.json
# After fixes
git checkout feature/improve-errors
debtmap analyze --output improved.json
# Compare reports
debtmap compare baseline.json improved.json
Look for:
- Reduced ErrorSwallowing debt count
- Lower risk scores for affected functions
- Improved coverage of error paths (if running with coverage)
Related Topics
- Configuration - Complete .debtmap.toml reference
- Suppression Patterns - Suppress false positives
- Scoring Strategies - How error handling affects risk scores
- Coverage Integration - Detect untested error paths
- CLI Reference - Command-line options for error analysis
- Troubleshooting - General debugging guide
Functional Composition Analysis
Debtmap provides deep AST-based analysis to detect and evaluate functional programming patterns in Rust code. This feature helps you understand how effectively your codebase uses functional composition patterns like iterator pipelines and identify opportunities to refactor imperative code into a functional style. It also rewards pure, side-effect-free functions in complexity scoring.
Overview
Functional analysis examines your code at the AST level to detect:
- Iterator pipelines - Chains like .iter().map().filter().collect()
- Purity analysis - Functions with no mutable state or side effects
- Composition quality metrics - Overall functional programming quality scores
- Side effect classification - Categorization of Pure, Benign, and Impure side effects
This analysis integrates with debtmap’s scoring system, providing score bonuses for high-quality functional code and reducing god object warnings for codebases with many small pure helper functions.
Specification: This feature implements Specification 111: AST-Based Functional Pattern Detection with accuracy targets of precision ≥90%, recall ≥85%, F1 ≥0.87, and performance overhead <10%.
Configuration Profiles
Debtmap provides three pre-configured analysis profiles to match different codebases:
| Profile | Use Case | Min Pipeline Depth | Max Closure Complexity | Purity Threshold | Quality Threshold |
|---|---|---|---|---|---|
| Strict | Functional-first codebases | 3 | 3 | 0.9 | 0.7 |
| Balanced (default) | Typical Rust projects | 2 | 5 | 0.8 | 0.6 |
| Lenient | Imperative-heavy legacy code | 2 | 10 | 0.5 | 0.4 |
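One way to read the table is as a lookup from profile to thresholds. The sketch below encodes the three rows as data. Apart from `max_closure_complexity`, which this chapter names, the type, field, and function names are assumptions chosen for illustration and may not match Debtmap's configuration structs.

```rust
/// Thresholds from the profile table (field names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
struct ProfileThresholds {
    min_pipeline_depth: usize,
    max_closure_complexity: u32,
    purity_threshold: f64,
    quality_threshold: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Profile {
    Strict,
    Balanced,
    Lenient,
}

impl Profile {
    /// Returns the row of the table that belongs to this profile.
    fn thresholds(self) -> ProfileThresholds {
        let (depth, closure, purity, quality) = match self {
            Profile::Strict => (3, 3, 0.9, 0.7),
            Profile::Balanced => (2, 5, 0.8, 0.6),
            Profile::Lenient => (2, 10, 0.5, 0.4),
        };
        ProfileThresholds {
            min_pipeline_depth: depth,
            max_closure_complexity: closure,
            purity_threshold: purity,
            quality_threshold: quality,
        }
    }
}

/// A chain counts as a pipeline once it reaches the profile's minimum depth.
fn counts_as_pipeline(profile: Profile, stages: usize) -> bool {
    stages >= profile.thresholds().min_pipeline_depth
}
```

Read this way, a two-stage chain counts as a pipeline under Balanced and Lenient but not under Strict.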
Choosing a Profile
Use Strict when:
- Your codebase emphasizes functional programming patterns
- You want to enforce high purity standards
- You’re building a new project with functional-first principles
- You want only longer pipelines (3+ stages) to count as functional composition
Use Balanced (default) when:
- You have a typical Rust codebase mixing functional and imperative styles
- You want reasonable detection without being overly strict
- You’re working on a mature project with mixed patterns
- You want to reward functional patterns without penalizing pragmatic imperative code
Use Lenient when:
- You’re analyzing legacy code with heavy imperative patterns
- You want to identify only the most obviously functional code
- You’re migrating from an imperative codebase and want gradual improvement
- You have complex closures that are still fundamentally functional
CLI Usage
Enable functional analysis with the --ast-functional-analysis flag and select a profile with --functional-analysis-profile:
# Enable with balanced profile (default)
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced
# Use strict profile for functional-first codebases
debtmap analyze . --ast-functional-analysis --functional-analysis-profile strict
# Use lenient profile for legacy code
debtmap analyze . --ast-functional-analysis --functional-analysis-profile lenient
Note: The --ast-functional-analysis flag enables the feature, while --functional-analysis-profile selects the configuration profile (strict/balanced/lenient).
Pure Function Detection
A function is considered pure when it:
- Returns same output for same input (deterministic)
- Has no observable side effects
- Doesn’t mutate external state
- Doesn’t perform I/O
Examples
#![allow(unused)]
fn main() {
// Pure function
fn add(a: i32, b: i32) -> i32 {
a + b
}
// Pure function with internal iteration
fn factorial(n: u32) -> u32 {
(1..=n).product() // Pure despite internal iteration
}
// Not pure: I/O side effect
fn log_and_add(a: i32, b: i32) -> i32 {
println!("Adding {} and {}", a, b); // Side effect!
a + b
}
// Not pure: mutates external state
fn increment_counter(counter: &mut i32) -> i32 {
*counter += 1; // Side effect!
*counter
}
}
Pipeline Detection
Debtmap detects functional pipelines through deep AST analysis, identifying iterator chains and their transformations.
Pipeline Stages
The analyzer recognizes these pipeline stage types:
1. Iterator Initialization
Methods that start an iterator chain:
- .iter() - Immutable iteration
- .into_iter() - Consuming iteration
- .iter_mut() - Mutable iteration
#![allow(unused)]
fn main() {
// Detected iterator initialization
let results = collection.iter()
.map(|x| x * 2)
.collect();
}
2. Map Transformations
Applies a transformation function to each element:
#![allow(unused)]
fn main() {
// Detected Map stage
items.iter()
.map(|x| x * 2) // Simple closure (low complexity)
.map(|x| { // Complex closure (higher complexity)
let doubled = x * 2;
doubled + 1
})
.collect()
}
The analyzer tracks closure complexity for each map operation. Complex closures may indicate code smells and affect quality scoring based on your max_closure_complexity threshold.
3. Filter Predicates
Selects elements based on a predicate:
#![allow(unused)]
fn main() {
// Detected Filter stage
items.iter()
.filter(|x| **x > 0) // Simple predicate (filter passes a reference to each item)
.filter(|x| { // Complex predicate
x.is_positive() && **x < 100
})
.collect()
}
4. Fold/Reduce Aggregation
Combines elements into a single value:
#![allow(unused)]
fn main() {
// Detected Fold stage
items.iter()
.fold(0, |acc, x| acc + x)
// Or using reduce (copied() yields values, so the closure can return a plain integer)
items.iter()
.copied()
.reduce(|a, b| a + b)
}
5. FlatMap Transformations
Maps and flattens nested structures:
#![allow(unused)]
fn main() {
// Detected FlatMap stage
items.iter()
.flat_map(|x| vec![*x, *x * 2])
.collect()
}
6. Inspect (Side-Effect Aware)
Performs side effects while passing through values:
#![allow(unused)]
fn main() {
// Detected Inspect stage (affects purity scoring)
items.iter()
.inspect(|x| println!("Processing: {}", x))
.map(|x| x * 2)
.collect()
}
7. Result/Option Chaining
Specialized stages for error handling:
#![allow(unused)]
fn main() {
// and_then and map_err are methods on Result/Option, not on Iterator;
// `result` here stands for a single Result value
// Detected AndThen stage
result
.and_then(|x| try_process(x))
// Detected MapErr stage
result
.map_err(|e| format!("Error: {}", e))
}
Terminal Operations
Pipelines typically end with a terminal operation that consumes the iterator:
- collect() - Gather elements into a collection
- sum() - Sum numeric values
- count() - Count elements
- any() - Check if any element matches
- all() - Check if all elements match
- find() - Find first matching element
- reduce() - Reduce to single value
- for_each() - Execute side effects for each element
#![allow(unused)]
fn main() {
// Complete pipeline with terminal operation
let total: i32 = items.iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.sum(); // Terminal operation: sum
}
Nested Pipelines
Debtmap detects pipelines nested within closures, indicating highly functional code patterns:
#![allow(unused)]
fn main() {
// Nested pipeline detected
let results: Vec<Vec<_>> = outer_items.iter()
.map(|item| {
// Inner pipeline (nesting_level = 1)
item.values.iter()
.filter(|v| **v > 0)
.collect::<Vec<_>>() // collect needs a target type
})
.collect();
}
Nesting level tracking helps identify sophisticated functional composition patterns.
Parallel Pipelines
Parallel iteration using Rayon is automatically detected:
#![allow(unused)]
fn main() {
use rayon::prelude::*;
// Detected as parallel pipeline (is_parallel = true)
let results: Vec<_> = items.par_iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.collect();
}
Parallel pipelines indicate high-performance functional patterns and receive positive quality scoring.
Builder Pattern Filtering
To avoid false positives, debtmap distinguishes builder patterns from functional pipelines:
#![allow(unused)]
fn main() {
// This is a builder pattern, NOT counted as a functional pipeline
let config = ConfigBuilder::new()
.with_host("localhost")
.with_port(8080)
.build();
// This IS a functional pipeline
let values = items.iter()
.map(|x| x * 2)
.collect();
}
Builder patterns are filtered out to ensure accurate functional composition metrics.
Purity Analysis
Debtmap analyzes functions to determine their purity level - whether they have side effects and mutable state.
Purity Levels
Functions are classified into three purity levels for god object weighting (defined in src/organization/purity_analyzer.rs):
Note: Debtmap has two purity analysis systems serving different purposes:
- PurityLevel (three levels) - Used for god object scoring with weight multipliers (this section)
- PurityLevel (four levels) - Used in src/analysis/purity_analysis.rs for detailed responsibility classification (Strictly Pure, Locally Pure, Read-Only, Impure)
This chapter focuses on the three-level system for god object integration.
Pure (Weight 0.3)
Guaranteed no side effects:
- No mutable parameters (&mut, mut self)
- No I/O operations
- No global mutations
- No unsafe blocks
- Only immutable bindings
#![allow(unused)]
fn main() {
// Pure function
fn calculate_total(items: &[i32]) -> i32 {
items.iter().sum()
}
// Pure function with immutable bindings
fn process_value(x: i32) -> i32 {
let doubled = x * 2; // Immutable binding
let result = doubled + 10;
result
}
}
Probably Pure (Weight 0.5)
Likely no side effects:
- Static functions (fn items, not methods)
- Associated functions (no self)
- No obvious side effects detected
#![allow(unused)]
fn main() {
// Probably pure - static function
fn transform(value: i32) -> i32 {
value * 2
}
// Probably pure - associated function
impl MyType {
fn create_default() -> Self {
MyType { value: 0 }
}
}
}
Impure (Weight 1.0)
Has side effects:
- Uses mutable references (&mut, mut self)
- Performs I/O operations (println!, file I/O, network)
- Uses async (potential side effects)
- Mutates global state
- Uses unsafe
#![allow(unused)]
fn main() {
// Impure - mutable reference
fn increment(value: &mut i32) {
*value += 1;
}
// Impure - I/O operation
fn log_value(value: i32) {
println!("Value: {}", value);
}
// Impure - mutation
fn process_items(items: &mut Vec<i32>) {
items.push(42);
}
}
Purity Weight Multipliers
Purity levels affect god object detection through weight multipliers (implemented in src/organization/purity_analyzer.rs:29-39). Pure functions contribute less to god object scores, rewarding codebases with many small pure helper functions:
- Pure (0.3): A pure function counts as 30% of a regular function in god object method count calculations
- Probably Pure (0.5): Counts as 50%
- Impure (1.0): Full weight
The purity_score dampens god object scores via the weight_multiplier calculation: each method contributes its purity weight, rather than a flat 1.0, to the effective method count.
Example: A module with 20 pure helper functions (20 × 0.3 = 6.0 effective) is less likely to trigger god object warnings than a module with 10 impure functions (10 × 1.0 = 10.0 effective).
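The arithmetic in this example can be reproduced with a short sketch. The weights are the ones listed above. The enum and function below are written for illustration and are not a copy of the code in purity_analyzer.rs.

```rust
/// Three-level purity classification with the weights listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PurityLevel {
    Pure,
    ProbablyPure,
    Impure,
}

impl PurityLevel {
    /// Weight multiplier applied when counting methods for god object detection.
    fn weight(&self) -> f64 {
        match self {
            PurityLevel::Pure => 0.3,
            PurityLevel::ProbablyPure => 0.5,
            PurityLevel::Impure => 1.0,
        }
    }
}

/// Sums the weights, so pure helpers count for less than impure methods.
fn effective_method_count(methods: &[PurityLevel]) -> f64 {
    methods.iter().map(PurityLevel::weight).sum()
}
```

With these weights, twenty pure helpers yield an effective count of about 6.0, while ten impure functions yield 10.0, which matches the comparison above.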
Side Effect Detection
Detected Side Effects
I/O Operations:
- File reading/writing
- Network calls
- Console output
- Database queries
State Mutation:
- Mutable global variables
- Shared mutable state
- Reference mutations
Randomness:
- Random number generation
- Time-dependent behavior
System Interaction:
- Environment variable access
- System calls
- Thread spawning
Rust-Specific Detection
#![allow(unused)]
fn main() {
// Interior mutability detection
use std::cell::RefCell;
fn has_side_effect() {
let data = RefCell::new(vec![]);
data.borrow_mut().push(1); // Detected as mutation
}
// Unsafe code detection
fn unsafe_side_effect() {
unsafe {
// Automatically flagged as potentially impure
}
}
}
Side Effect Classification
Side effects are categorized by severity:
Pure - No Side Effects
No mutations, I/O, or global state changes:
#![allow(unused)]
fn main() {
// Pure - only computation
fn fibonacci(n: u32) -> u32 {
match n {
0 => 0,
1 => 1,
_ => fibonacci(n - 1) + fibonacci(n - 2),
}
}
}
Benign - Small Penalty
Only logging, tracing, or metrics:
#![allow(unused)]
fn main() {
use tracing::debug;
// Benign - logging side effect
fn process(value: i32) -> i32 {
debug!("Processing value: {}", value);
value * 2
}
}
Benign side effects receive a small penalty in purity scoring. Logging and observability are recognized as practical necessities.
Impure - Large Penalty
I/O, mutations, network operations:
#![allow(unused)]
fn main() {
// Impure - file I/O
fn save_to_file(data: &str) -> std::io::Result<()> {
std::fs::write("output.txt", data)
}
// Impure - network operation
async fn fetch_data(url: &str) -> Result<String, reqwest::Error> {
reqwest::get(url).await?.text().await
}
}
Impure side effects receive a large penalty in purity scoring.
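The three categories can be pictured as a simple decision rule: no effects means Pure, only observability effects means Benign, and anything else means Impure. The sketch below assumes a hypothetical `Effect` enum listing what was observed in a function body. It deliberately leaves out numeric penalties, since their exact values are not stated here.

```rust
/// Hypothetical list of effects an analyzer might observe in a function body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effect {
    Logging,
    Tracing,
    Metrics,
    FileIo,
    Network,
    Mutation,
}

/// The three categories described in this section.
#[derive(Debug, PartialEq, Eq)]
enum SideEffectKind {
    Pure,
    Benign,
    Impure,
}

/// Logging, tracing, and metrics are treated as benign observability effects.
fn is_benign(effect: Effect) -> bool {
    matches!(effect, Effect::Logging | Effect::Tracing | Effect::Metrics)
}

/// Classifies a function from the effects observed in it.
fn classify(effects: &[Effect]) -> SideEffectKind {
    if effects.is_empty() {
        SideEffectKind::Pure
    } else if effects.iter().all(|e| is_benign(*e)) {
        SideEffectKind::Benign
    } else {
        SideEffectKind::Impure
    }
}
```

A single impure effect is enough to classify the whole function as Impure, even if every other effect is benign.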
Purity Metrics
For each function, debtmap calculates purity metrics through the functional composition analysis (src/analysis/functional_composition.rs). These metrics are computed by analyze_composition() and returned in CompositionMetrics and PurityMetrics:
- has_mutable_state - Whether the function uses mutable bindings
- has_side_effects - Whether I/O or global mutations are detected
- immutability_ratio - Ratio of immutable to total bindings (0.0-1.0)
- is_const_fn - Whether declared as const fn
- side_effect_kind - Classification: Pure, Benign, or Impure
- purity_score - Overall purity score (0.0 impure to 1.0 pure)
Immutability Ratio
The immutability ratio measures how much of a function’s local state is immutable:
#![allow(unused)]
fn main() {
fn example() {
let x = 10; // Immutable
let y = 20; // Immutable
let mut z = 30; // Mutable
z += 1;
// immutability_ratio = 2/3 = 0.67
}
}
Higher immutability ratios contribute to better purity scores.
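As a rough sketch, the ratio is the number of immutable bindings divided by the total number of bindings. The function name and the treatment of functions with no bindings are assumptions. The value 1.0 for the no-bindings case follows the `add` example later in this chapter, which reports a ratio of 1.0 with no bindings.

```rust
/// Ratio of immutable bindings to all bindings in a function body.
/// Assumption: a function with no bindings at all is treated as fully immutable.
fn immutability_ratio(immutable_bindings: usize, mutable_bindings: usize) -> f64 {
    let total = immutable_bindings + mutable_bindings;
    if total == 0 {
        1.0
    } else {
        immutable_bindings as f64 / total as f64
    }
}
```

For the `example` function above, two immutable bindings and one mutable binding give 2/3, or roughly 0.67.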
Composition Pattern Recognition
Function Composition
#![allow(unused)]
fn main() {
// Detected composition pattern
fn process_data(input: String) -> Result<Output> {
input
.parse()
.map(validate)
.and_then(transform)
.map(normalize)
}
}
Higher-Order Functions
#![allow(unused)]
fn main() {
// Detected HOF pattern
fn apply_twice<F>(f: F, x: i32) -> i32
where
F: Fn(i32) -> i32,
{
f(f(x))
}
}
Map/Filter/Fold Chains
#![allow(unused)]
fn main() {
// Detected functional pipeline
let result = items
.iter()
.filter(|x| x.is_valid())
.map(|x| x.transform())
.fold(0, |acc, x| acc + x);
}
Composition Quality Scoring
Debtmap combines pipeline metrics and purity analysis into an overall composition quality score (0.0-1.0).
Scoring Factors
The composition quality score considers:
- Pipeline depth - Longer pipelines indicate more functional composition
- Purity score - Higher purity means better functional programming
- Immutability ratio - More immutable bindings improve the score
- Closure complexity - Simpler closures score better
- Parallel execution - Parallel pipelines receive bonuses
- Nested pipelines - Sophisticated composition patterns score higher
Quality Thresholds
Based on your configuration profile, functions with composition quality above the threshold receive score boosts in debtmap’s overall analysis:
- Strict: Quality ≥ 0.7 required for boost
- Balanced: Quality ≥ 0.6 required for boost
- Lenient: Quality ≥ 0.4 required for boost
High-quality functional code can offset complexity in other areas of your codebase.
Purity Scoring
Distribution Analysis
Debtmap calculates purity distribution:
- Pure functions: 0 side effects detected
- Mostly pure: Minor side effects (e.g., logging)
- Impure: Multiple side effects
- Highly impure: Extensive state mutation and I/O
Scoring Formula
Purity Score = (pure_functions / total_functions) × 100
Side Effect Density = total_side_effects / total_functions
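Both formulas are simple ratios over the analyzed functions. The sketch below assumes a hypothetical `FunctionSummary` record holding a per-function side effect count, and it returns 0.0 for an empty input to avoid dividing by zero. Neither choice is taken from Debtmap's source.

```rust
/// Hypothetical per-function record: how many side effects were detected.
struct FunctionSummary {
    side_effects: usize,
}

/// Purity Score = (pure_functions / total_functions) * 100
fn purity_score(functions: &[FunctionSummary]) -> f64 {
    if functions.is_empty() {
        return 0.0; // assumption: no functions means no score
    }
    let pure = functions.iter().filter(|f| f.side_effects == 0).count();
    pure as f64 / functions.len() as f64 * 100.0
}

/// Side Effect Density = total_side_effects / total_functions
fn side_effect_density(functions: &[FunctionSummary]) -> f64 {
    if functions.is_empty() {
        return 0.0; // assumption: avoid dividing by zero
    }
    let total: usize = functions.iter().map(|f| f.side_effects).sum();
    total as f64 / functions.len() as f64
}
```

For four functions with 0, 0, 1, and 3 detected side effects, the purity score is 50 and the side effect density is 1.0.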
Codebase Health Metrics
Target Purity Levels:
- Core business logic: 80%+ pure
- Utilities: 70%+ pure
- I/O layer: 20-30% pure (expected)
- Overall: 50%+ pure
Integration with Risk Scoring
Functional composition quality integrates with debtmap’s risk scoring system and multi-signal aggregation framework:
- High composition quality → Lower risk scores (functions with quality above threshold receive score boosts)
- Pure functions → Reduced god object penalties (via weight multipliers in purity_analyzer.rs)
- Deep pipelines → Bonus for functional patterns
- Impure side effects → Risk penalties applied
Multi-Signal Integration: Functional composition analysis is one of several signals aggregated in the unified analysis system (src/builders/unified_analysis.rs and src/analysis/multi_signal_aggregation.rs) alongside complexity metrics, god object detection, and risk assessment. This ensures that functional programming quality contributes to the comprehensive technical debt assessment across multiple dimensions.
This integration ensures that well-written functional code is properly rewarded in the overall technical debt assessment.
Practical Examples
Example 1: Detecting Imperative vs Functional Code
Imperative style (lower composition quality):
#![allow(unused)]
fn main() {
fn process_items_imperative(items: Vec<i32>) -> Vec<i32> {
let mut results = Vec::new();
for item in items {
if item > 0 {
results.push(item * 2);
}
}
results
}
// Detected: No pipelines, mutable state, lower purity score
}
Functional style (higher composition quality):
#![allow(unused)]
fn main() {
fn process_items_functional(items: Vec<i32>) -> Vec<i32> {
items.iter()
.filter(|x| **x > 0)
.map(|x| x * 2)
.collect()
}
// Detected: Pipeline depth 3, pure function, high composition quality
}
Example 2: Identifying Refactoring Opportunities
When debtmap detects low composition quality, it suggests refactoring:
#![allow(unused)]
fn main() {
// Original: Imperative with mutations
fn calculate_statistics(data: &[f64]) -> (f64, f64, f64) {
let mut sum = 0.0;
let mut min = f64::MAX;
let mut max = f64::MIN;
for &value in data {
sum += value;
if value < min { min = value; }
if value > max { max = value; }
}
(sum / data.len() as f64, min, max)
}
// Refactored: Functional style
fn calculate_statistics_functional(data: &[f64]) -> (f64, f64, f64) {
let sum: f64 = data.iter().sum();
// Folding with f64::min / f64::max avoids the unwrap() calls that partial_cmp would need
let min = data.iter().copied().fold(f64::INFINITY, f64::min);
let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
(sum / data.len() as f64, min, max)
}
// Higher purity score, multiple pipelines detected
}
Example 3: Using Profiles for Different Codebases
Strict profile - Catches subtle functional patterns:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile strict src/
# Detects pipelines with 3+ stages
# Requires purity ≥ 0.9 for "pure" classification
# Flags closures with complexity > 3
Balanced profile - Default for most projects:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile balanced src/
# Detects pipelines with 2+ stages
# Requires purity ≥ 0.8 for "pure" classification
# Flags closures with complexity > 5
Lenient profile - For legacy code:
$ debtmap analyze --ast-functional-analysis --functional-analysis-profile lenient src/
# Detects pipelines with 2+ stages
# Requires purity ≥ 0.5 for "pure" classification
# Flags closures with complexity > 10
Example 4: Interpreting Purity Scores
Pure function (score: 1.0):
#![allow(unused)]
fn main() {
fn add(a: i32, b: i32) -> i32 {
a + b
}
// Purity: 1.0 (perfect)
// Immutability ratio: 1.0 (no bindings)
// Side effects: None
}
Mostly pure (score: 0.8):
#![allow(unused)]
fn main() {
fn process(values: &[i32]) -> i32 {
let doubled: Vec<_> = values.iter().map(|x| x * 2).collect();
let sum: i32 = doubled.iter().sum();
sum
}
// Purity: 0.8 (high)
// Immutability ratio: 1.0 (both bindings immutable)
// Side effects: None
// Pipelines: 2 detected
}
Impure function (score: 0.2):
#![allow(unused)]
fn main() {
fn log_and_process(values: &mut Vec<i32>) {
println!("Processing {} items", values.len());
values.iter_mut().for_each(|x| *x *= 2);
}
// Purity: 0.2 (low)
// Immutability ratio: 0.0 (mutable parameter)
// Side effects: I/O (println), mutation
}
Best Practices
Writing Functional Rust Code
To achieve high composition quality scores:
- Prefer iterator chains over manual loops
#![allow(unused)]
fn main() {
    // Good
    let evens: Vec<_> = items.iter().filter(|x| *x % 2 == 0).collect();
    // Avoid
    let mut evens = Vec::new();
    for item in &items {
        if item % 2 == 0 {
            evens.push(item);
        }
    }
}
- Minimize mutable state
#![allow(unused)]
fn main() {
    // Good
    let result = calculate(input);
    // Avoid
    let mut result = 0;
    result = calculate(input);
}
- Separate pure logic from side effects
#![allow(unused)]
fn main() {
    // Good - pure computation
    fn calculate_price(quantity: u32, unit_price: f64) -> f64 {
        quantity as f64 * unit_price
    }
    // Good - I/O at the boundary
    fn display_price(price: f64) {
        println!("Total: ${:.2}", price);
    }
}
- Keep closures simple
#![allow(unused)]
fn main() {
    // Good - simple closure
    items.into_iter().map(|x| x * 2)
    // Consider extracting - complex closure
    items.into_iter().map(|x| {
        let temp = expensive_operation(x);
        transform(temp)
    })
    // Better
    fn transform_item(x: i32) -> i32 {
        let temp = expensive_operation(x);
        transform(temp)
    }
    items.into_iter().map(transform_item)
}
- Use parallel iteration for CPU-intensive work
#![allow(unused)]
fn main() {
    use rayon::prelude::*;
    let results: Vec<_> = large_dataset.par_iter()
        .map(|item| expensive_computation(item))
        .collect();
}
Code Organization
Separate pure from impure:
- Keep pure logic in core modules
- Isolate I/O at boundaries
- Use dependency injection for testability
Maximize purity in:
- Business logic
- Calculations and transformations
- Validation functions
- Data structure operations
Accept impurity in:
- I/O layers
- Logging and monitoring
- External system integration
- Application boundaries
Refactoring strategy:
- Identify impure functions
- Extract pure logic
- Push side effects to boundaries
- Test pure functions exhaustively
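The refactoring strategy above can be illustrated with a small before-and-after shape: the calculation lives in a pure function, and the only side effect sits in a thin wrapper that accepts any writer. The function names and the receipt scenario are invented for this sketch.

```rust
use std::io::{self, Write};

/// Pure core: the result depends only on the arguments, so it is easy to test.
fn order_total(prices: &[f64], tax_rate: f64) -> f64 {
    let subtotal: f64 = prices.iter().sum();
    subtotal * (1.0 + tax_rate)
}

/// Impure boundary: the only side effect is writing to the provided sink.
/// Taking a generic writer keeps the I/O injectable for tests.
fn write_receipt<W: Write>(out: &mut W, prices: &[f64], tax_rate: f64) -> io::Result<()> {
    writeln!(out, "Total: {:.2}", order_total(prices, tax_rate))
}
```

In production the writer could be standard output, while a test can pass an in-memory buffer and assert on its contents. The pure `order_total` needs no setup at all.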
Migration Guide
To enable functional analysis on existing projects:
- Start with lenient profile to understand current state:
debtmap analyze --ast-functional-analysis --functional-analysis-profile lenient .
- Identify quick wins - functions that are almost functional:
  - Look for loops that can become iterator chains
  - Find mutable variables that can be immutable
  - Spot side effects that can be extracted
- Gradually refactor to functional patterns:
  - Convert one function at a time
  - Run tests after each change
  - Measure improvements with debtmap
- Tighten profile as codebase improves:
# After refactoring
debtmap analyze --ast-functional-analysis --functional-analysis-profile balanced .
# For new modules
debtmap analyze --ast-functional-analysis --functional-analysis-profile strict src/new_module/
- Monitor composition quality trends over time
Use Cases
Code Quality Audit
# Assess functional purity
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced --format markdown
Refactoring Targets
# Find impure functions in core logic
debtmap analyze src/core/ --ast-functional-analysis --functional-analysis-profile strict
Onboarding Guide
# Show functional patterns in codebase
debtmap analyze . --ast-functional-analysis --functional-analysis-profile balanced --summary
Troubleshooting
“No pipelines detected” but I have iterator chains
- Check pipeline depth: Your chains may be too short for the profile
- Strict requires 3+ stages
- Balanced/Lenient require 2+ stages
- Check for builder patterns: Method chaining for construction is filtered out
- Verify terminal operation: Ensure the chain ends with collect(), sum(), etc.
“Low purity score” for seemingly pure functions
- Check for hidden side effects:
  - println! or logging statements
  - Calls to impure helper functions
  - unsafe blocks
- Review immutability ratio: Unnecessary mut bindings lower the score
- Verify no I/O operations: File access, network calls affect purity
“High complexity closures flagged”
- Extract complex closures into named functions:
#![allow(unused)]
fn main() {
    // Instead of
    items.into_iter().map(|x| { /* 10 lines */ })
    // Use
    fn process_item(x: Item) -> Result { /* 10 lines */ }
    items.into_iter().map(process_item)
}
- Adjust max_closure_complexity: Consider lenient profile if needed
- Refactor closure logic: Break down complex operations
Too Many False Positives
Issue: Pure functions flagged as impure
Solution:
- Use lenient profile
- Suppress known patterns
- Review detection criteria
- Report false positives
Missing Side Effects
Issue: Known impure functions not detected
Solution:
- Use strict profile
- Check for exotic side effect patterns
- Enable comprehensive analysis
Performance impact concerns
- Spec 111 targets <10% overhead: Performance impact should be minimal
- Disable for hot paths: Analyze functional patterns in separate runs if needed
- Use parallel processing: Leverage multi-core parallelism for faster analysis
Related Chapters
- Analysis Guide - Understanding analysis types
- Complexity Analysis - How functional patterns affect complexity metrics
- Scoring Strategies - Integration with overall technical debt scoring
- God Object Detection - How purity weights reduce false positives
- Configuration - Advanced functional analysis configuration options
- Refactoring - Extracting pure functions
Summary
Functional composition analysis helps you:
- Identify functional patterns in your Rust codebase through AST-based pipeline detection
- Measure purity with side effect detection and immutability analysis
- Improve code quality by refactoring imperative code to functional style
- Get scoring benefits for high-quality functional programming patterns
- Choose appropriate profiles (strict/balanced/lenient) for different codebases
Enable it with --ast-functional-analysis and pick a profile with --functional-analysis-profile, as in the examples above, to start benefiting from functional programming insights in your technical debt analysis.
God Object Detection
Overview
Debtmap includes sophisticated god object detection that identifies files and types that have grown too large and taken on too many responsibilities. God objects (also called “god classes” or “god modules”) are a significant source of technical debt as they:
- Violate the Single Responsibility Principle
- Become difficult to maintain and test
- Create bottlenecks in development
- Increase the risk of bugs due to high coupling
- Have high coupling with many other modules
- Are hard to test effectively
This chapter explains how Debtmap identifies god objects, calculates their scores, and provides actionable refactoring recommendations.
Detection Criteria
Debtmap uses two distinct detection strategies depending on the file structure:
God Class Criteria
A struct/class is classified as a god class when it violates multiple thresholds:
- Method Count - Number of impl methods on the struct
- Field Count - Number of struct/class fields
- Responsibility Count - Distinct responsibilities inferred from method names (max_traits in config)
- Lines of Code - Estimated lines for the struct and its impl blocks
- Complexity Sum - Combined cyclomatic complexity of struct methods
Note: All five criteria are evaluated by the determine_confidence function to calculate confidence levels. Each criterion that exceeds its threshold contributes to the violation count.
God Module Criteria
A file is classified as a god module when it has excessive standalone functions:
- Standalone Function Count - Total standalone functions (not in impl blocks)
- Responsibility Count - Distinct responsibilities across all functions
- Lines of Code - Total lines in the file
- Complexity Sum - Combined cyclomatic complexity (estimated as function_count × 5)
Key Difference: God class detection focuses on a single struct’s methods, while god module detection counts standalone functions across the entire file.
Language-Specific Thresholds
Rust
- Max Methods: 20 (includes both impl methods and standalone functions)
- Max Fields: 15
- Max Responsibilities: 5
- Max Lines: 1000
- Max Complexity: 200
Python
- Max Methods: 15
- Max Fields: 10
- Max Responsibilities: 3
- Max Lines: 500
- Max Complexity: 150
JavaScript/TypeScript
- Max Methods: 15
- Max Fields: 20
- Max Responsibilities: 3
- Max Lines: 500
- Max Complexity: 150
Note: TypeScript uses the same thresholds as JavaScript since both languages have similar structural patterns. The implementation treats them identically for god object detection purposes.
These thresholds can be customized per-language in your .debtmap.toml configuration file.
God Class vs God Module Detection
Debtmap distinguishes between two distinct types of god objects:
God Class Detection
A god class is a single struct/class with excessive methods and fields. Debtmap analyzes the largest type in a file using:
- Find the largest type (struct/class) by method_count + field_count × 2
- Count only the impl methods for that struct
- Check against thresholds:
  - Rust: >20 methods, >15 fields
  - Python: >15 methods, >10 fields
  - JavaScript/TypeScript: >15 methods, >20 fields
Example: A struct with 25 methods and 18 fields would be flagged as a god class.
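The selection rule above can be sketched in a few lines of Rust. This is only an illustration, not Debtmap's implementation: TypeInfo and largest_type are hypothetical names, and the only part taken from the text is the weighting method_count + field_count × 2.

```rust
/// Hypothetical summary of one struct/class found in a file.
#[derive(Debug, PartialEq)]
pub struct TypeInfo {
    pub name: &'static str,
    pub method_count: usize,
    pub field_count: usize,
}

/// Pick the type with the largest `method_count + field_count * 2`.
/// Returns `None` when the file contains no struct/class at all.
pub fn largest_type(types: &[TypeInfo]) -> Option<&TypeInfo> {
    types
        .iter()
        .max_by_key(|t| t.method_count + t.field_count * 2)
}
```

Because fields count double, a type with 5 methods and 6 fields (weight 17) would be selected over one with 10 methods and 2 fields (weight 14).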
God Module Detection
A god module is a file with excessive standalone functions (no dominant struct). Debtmap counts standalone functions when:
- No struct/class is found, OR
- The file has many standalone functions outside of any impl blocks
Implementation Detail: Debtmap uses the DetectionType enum with three variants:
- GodClass - Single struct with excessive methods/fields
- GodFile - File with excessive functions or lines of code
- GodModule - Alias for GodFile (both represent the same detection type)
The GodModule variant is provided for clarity when discussing files with many standalone functions, but internally it’s the same as GodFile. Both terms exist to help distinguish between “file is large” (GodFile) and “file has many functions” (GodModule) conceptually in documentation and error messages, even though they’re implemented identically in the detection logic.
Example: A file like rust_call_graph.rs with 270 standalone functions would be flagged as a god module (using the GodFile/GodModule detection type).
Why Separate Analysis?
Previously, Debtmap combined standalone functions with struct methods, causing false positives for functional/procedural modules. The current implementation analyzes them separately to:
- Avoid penalizing pure functional modules
- Distinguish between architectural issues (god class) and organizational issues (god module)
- Provide more accurate refactoring recommendations
Key Distinction: A file containing a struct with 15 methods plus 20 standalone functions is analyzed as:
- God Class: No (15 methods < 20 threshold)
- God Module: Possibly (20 standalone functions, approaching threshold)
See src/organization/god_object_detector.rs:449-505 for implementation details.
Confidence Levels
Debtmap assigns confidence levels based solely on the number of thresholds violated:
- Definite (5 violations) - All five metrics exceed thresholds - clear god object requiring immediate refactoring
- Probable (3-4 violations) - Most metrics exceed thresholds - likely god object that should be refactored
- Possible (1-2 violations) - Some metrics exceed thresholds - potential god object worth reviewing
- NotGodObject (0 violations) - All metrics within acceptable limits
Note: The confidence level is determined by violation count alone. The god object score (calculated separately) is used for prioritization and ranking, but does not affect the confidence classification.
Example: Consider two files both with violation_count=2 (Possible confidence):
- File A: 21 methods, 16 fields (just over the threshold)
- File B: 100 methods, 50 fields (severely over the threshold)
Both receive the same “Possible” confidence level, but File B will have a much higher god object score for prioritization purposes. This separation ensures consistent confidence classification while still allowing scores to reflect severity.
See src/organization/god_object_analysis.rs:236-268 for the determine_confidence function.
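A minimal sketch of the mapping from violation count to confidence level, assuming the five criteria listed earlier. The Metrics struct, the Confidence enum, and both function names are hypothetical and chosen for illustration; the real determine_confidence signature may differ.

```rust
/// Hypothetical container used both for measured values and for thresholds.
pub struct Metrics {
    pub methods: usize,
    pub fields: usize,
    pub responsibilities: usize,
    pub lines: usize,
    pub complexity: usize,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Confidence {
    Definite,
    Probable,
    Possible,
    NotGodObject,
}

/// Count how many of the five metrics exceed their threshold.
pub fn violation_count(measured: &Metrics, limits: &Metrics) -> usize {
    [
        measured.methods > limits.methods,
        measured.fields > limits.fields,
        measured.responsibilities > limits.responsibilities,
        measured.lines > limits.lines,
        measured.complexity > limits.complexity,
    ]
    .iter()
    .filter(|&&violated| violated)
    .count()
}

/// Map a violation count to the confidence levels described above.
pub fn confidence_for(violations: usize) -> Confidence {
    match violations {
        0 => Confidence::NotGodObject,
        1..=2 => Confidence::Possible,
        3..=4 => Confidence::Probable,
        _ => Confidence::Definite,
    }
}
```

With the Rust thresholds, File A (21 methods, 16 fields) and File B (100 methods, 50 fields) both produce two violations in this sketch and therefore the same Possible level, which is the behavior the example describes.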
Scoring Algorithms
Debtmap provides three scoring algorithms to accommodate different analysis needs.
Simple Scoring
The base scoring algorithm calculates god object score using four factors:
method_factor = min(method_count / max_methods, 3.0)
field_factor = min(field_count / max_fields, 3.0)
responsibility_factor = min(responsibility_count / 3, 3.0)
size_factor = min(lines_of_code / max_lines, 3.0)
base_score = method_factor × field_factor × responsibility_factor × size_factor
Score Enforcement:
- If violation_count > 0: final_score = max(base_score × 50 × violation_count, 100)
- Else: final_score = base_score × 10
The minimum score of 100 ensures that any god object receives sufficient priority in the technical debt analysis.
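The formulas above can be transcribed literally into Rust as follows. This is a sketch under stated assumptions rather than the actual implementation: Thresholds, capped_ratio, and simple_score are hypothetical names, and the code does nothing beyond what the four factor formulas and the score enforcement rule state.

```rust
/// Hypothetical per-language limits (for example the Rust defaults 20 / 15 / 1000).
pub struct Thresholds {
    pub max_methods: f64,
    pub max_fields: f64,
    pub max_lines: f64,
}

/// Ratio of a metric to its limit, capped at 3.0 as in the factor formulas.
fn capped_ratio(value: f64, limit: f64) -> f64 {
    (value / limit).min(3.0)
}

/// Literal transcription of the simple scoring formulas.
pub fn simple_score(
    method_count: usize,
    field_count: usize,
    responsibility_count: usize,
    lines_of_code: usize,
    violation_count: usize,
    limits: &Thresholds,
) -> f64 {
    let base_score = capped_ratio(method_count as f64, limits.max_methods)
        * capped_ratio(field_count as f64, limits.max_fields)
        * capped_ratio(responsibility_count as f64, 3.0)
        * capped_ratio(lines_of_code as f64, limits.max_lines);

    if violation_count > 0 {
        // Any violation guarantees a score of at least 100.
        (base_score * 50.0 * violation_count as f64).max(100.0)
    } else {
        base_score * 10.0
    }
}
```

As a worked example with the Rust defaults: 40 methods, 30 fields, 6 responsibilities, and 2000 lines give four factors of 2.0, so base_score = 16. Assuming all five thresholds are violated, the final score is 16 × 50 × 5 = 4000.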
Complexity-Weighted Scoring
Unlike raw method counting, this algorithm weights each method by its cyclomatic complexity. This ensures that 100 simple functions (complexity 1-3) score better than 10 highly complex functions (complexity 17+).
The formula is similar to simple scoring, but uses weighted_method_count (sum of complexity weights) instead of raw counts:
method_factor = min(weighted_method_count / max_methods, 3.0)
Additionally, a complexity factor is applied:
- Average complexity < 3.0: 0.7 (reward simple functions)
- Average complexity > 10.0: 1.5 (penalize complex functions)
- Otherwise: 1.0
The final score becomes:
final_score = max(base_score × 50 × complexity_factor × violation_count, 100)
This approach better reflects the true maintainability burden of a large module.
See src/organization/god_object_analysis.rs:142-209.
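The complexity factor and the resulting final score can be sketched as below. The function names are hypothetical, and the sketch assumes at least one threshold is violated, since the text gives no separate formula for the zero-violation case in this variant.

```rust
/// Factor derived from the average cyclomatic complexity of the methods.
pub fn complexity_factor(average_complexity: f64) -> f64 {
    if average_complexity < 3.0 {
        0.7 // reward simple functions
    } else if average_complexity > 10.0 {
        1.5 // penalize complex functions
    } else {
        1.0
    }
}

/// final_score = max(base_score × 50 × complexity_factor × violation_count, 100)
pub fn complexity_weighted_score(
    base_score: f64,
    average_complexity: f64,
    violation_count: usize,
) -> f64 {
    (base_score * 50.0 * complexity_factor(average_complexity) * violation_count as f64)
        .max(100.0)
}
```

For instance, a base score of 2.0 with average complexity 12 and two violations yields 2 × 50 × 1.5 × 2 = 300, while the same module with average complexity below 3 would yield 2 × 50 × 0.7 × 2 = 140.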
Purity-Weighted Scoring (Advanced)
Available for Rust only (requires syn::ItemFn analysis)
This advanced scoring variant combines both complexity weighting and purity analysis, building on top of complexity-weighted scoring to further reduce the impact of pure functions. This prevents pure functional modules from being unfairly penalized. The algorithm:
- Analyzes each function for purity using three levels:
  - Pure (no side effects): Functions with read-only operations, no I/O, no mutation
    - Weight multiplier: 0.3
    - Examples: calculate_sum(), format_string(), is_valid()
  - Probably Pure (likely no side effects): Functions that appear pure but may have hidden side effects
    - Weight multiplier: 0.5
    - Examples: Functions using trait methods (could have side effects), generic operations
  - Impure (has side effects): Functions with clear side effects like I/O, mutation, external calls
    - Weight multiplier: 1.0
    - Examples: save_to_file(), update_state(), send_request()
- Purity Detection Heuristics:
  - Pure indicators: No mut references, no I/O operations, no external function calls
  - Impure indicators: File/network operations, mutable state, database access, logging
  - Probably Pure: Generic functions, trait method calls, or ambiguous patterns
- Combines complexity and purity weights to calculate the total contribution:
  total_weight = complexity_weight × purity_multiplier
  This means pure functions get both the complexity-based weight AND the purity multiplier applied together.
  Example: A pure function with complexity 5 contributes only 5 × 0.3 = 1.5 to the weighted count (compared to 5.0 for an impure function of the same complexity).
- Tracks the PurityDistribution:
  - pure_count, probably_pure_count, impure_count
  - pure_weight_contribution, probably_pure_weight_contribution, impure_weight_contribution
Impact: A file with 100 pure helper functions (total complexity 150) might have a weighted method count of only 150 × 0.3 = 45, avoiding false positives while still catching stateful god objects with many impure methods.
See src/organization/god_object_detector.rs:196-258 and src/organization/purity_analyzer.rs.
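To make the weighting concrete, here is a small sketch of how a purity-weighted method count could be accumulated. The Purity enum and weighted_method_count are hypothetical names; only the three multipliers and the product total_weight = complexity_weight × purity_multiplier are taken from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Purity {
    Pure,
    ProbablyPure,
    Impure,
}

impl Purity {
    /// Weight multipliers as listed in the purity levels above.
    pub fn multiplier(self) -> f64 {
        match self {
            Purity::Pure => 0.3,
            Purity::ProbablyPure => 0.5,
            Purity::Impure => 1.0,
        }
    }
}

/// Sum of `complexity_weight * purity_multiplier` over all functions.
/// Each entry pairs a function's complexity weight with its purity level.
pub fn weighted_method_count(functions: &[(f64, Purity)]) -> f64 {
    functions
        .iter()
        .map(|(complexity_weight, purity)| complexity_weight * purity.multiplier())
        .sum()
}
```

A module made only of pure functions with total complexity weight 150 would come out at roughly 45 in this sketch, matching the impact figure quoted above.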
Responsibility Detection
Responsibilities are inferred from method names using common prefixes. Debtmap recognizes the following categories:
| Prefix(es) | Responsibility Category |
|---|---|
| format, render, write, print | Formatting & Output |
| parse, read, extract | Parsing & Input |
| filter, select, find | Filtering & Selection |
| transform, convert, map, apply | Transformation |
| get, set | Data Access |
| validate, check, verify, is | Validation |
| calculate, compute | Computation |
| create, build, new | Construction |
| save, load, store | Persistence |
| process, handle | Processing |
| send, receive | Communication |
| (no prefix match) | Utilities |
Note: Utilities serves as both a category in the responsibility list and the fallback when no prefix matches. In the implementation, Utilities is included in RESPONSIBILITY_CATEGORIES with an empty prefixes array (prefixes: &[]), making it the catch-all category returned by infer_responsibility_from_method when no other category matches.
Distinct Responsibility Counting: Debtmap counts the number of unique responsibility categories used by a struct/module’s methods. A high responsibility count (e.g., >5) indicates the module is handling too many different concerns, violating the Single Responsibility Principle.
Responsibility count directly affects:
- God object scoring (via responsibility_factor)
- Refactoring recommendations (methods grouped by responsibility for suggested splits)
- Detection confidence (counted as one of the five violation criteria)
See src/organization/god_object_analysis.rs:318-388 for the infer_responsibility_from_method function.
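The prefix table can be turned into a lookup along the following lines. This is a simplified sketch, not the code of infer_responsibility_from_method: the table constant, the function names, and the choice of case-insensitive starts_with matching are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Prefix table mirroring the categories listed above (hypothetical constant name).
const PREFIX_TABLE: &[(&str, &[&str])] = &[
    ("Formatting & Output", &["format", "render", "write", "print"]),
    ("Parsing & Input", &["parse", "read", "extract"]),
    ("Filtering & Selection", &["filter", "select", "find"]),
    ("Transformation", &["transform", "convert", "map", "apply"]),
    ("Data Access", &["get", "set"]),
    ("Validation", &["validate", "check", "verify", "is"]),
    ("Computation", &["calculate", "compute"]),
    ("Construction", &["create", "build", "new"]),
    ("Persistence", &["save", "load", "store"]),
    ("Processing", &["process", "handle"]),
    ("Communication", &["send", "receive"]),
];

/// Return the first category whose prefix matches, or "Utilities" as the fallback.
pub fn infer_responsibility(method_name: &str) -> &'static str {
    let lower = method_name.to_lowercase();
    for (category, prefixes) in PREFIX_TABLE {
        if prefixes.iter().any(|prefix| lower.starts_with(*prefix)) {
            return *category;
        }
    }
    "Utilities"
}

/// Number of distinct responsibility categories across a set of method names.
pub fn count_responsibilities(method_names: &[&str]) -> usize {
    method_names
        .iter()
        .map(|name| infer_responsibility(name))
        .collect::<BTreeSet<_>>()
        .len()
}
```

With this sketch, the methods get_name, set_name, save_user, and validate_email fall into three categories (Data Access, Persistence, Validation). Plain prefix matching is deliberately naive: a name such as issue_ticket would match the is prefix, so a production implementation likely needs stricter word-boundary handling.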
Examples and Case Studies
Example 1: Large Rust Module
File: rust_call_graph.rs with 270 standalone functions
Detection:
- Is God Object: Yes
- Method Count: 270
- Field Count: 0 (no struct)
- Responsibilities: 8
- Confidence: Definite
- Score: >1000 (severe violation)
Recommendation: Break into multiple focused modules:
- CallGraphBuilder (construction methods)
- CallGraphAnalyzer (analysis methods)
- CallGraphFormatter (output methods)
Example 2: Complex Python Class
File: data_manager.py with class containing 25 methods and 12 fields
Detection:
- Is God Object: Yes
- Method Count: 25
- Field Count: 12
- Responsibilities: 6 (Data Access, Validation, Persistence, etc.)
- Confidence: Probable
- Score: ~150-200
Recommendation: Split by responsibility:
- DataAccessLayer (get/set methods)
- DataValidator (validate/check methods)
- DataPersistence (save/load methods)
Example 3: Mixed Paradigm File (God Module)
File: utils.rs with small struct (5 methods, 3 fields) + 60 standalone functions
Detection:
- God Class (struct): No (5 methods < 20 threshold, 3 fields < 15 threshold)
- God Module (file): Yes (60 standalone functions > 50 threshold)
- Confidence: Probable
- Score: ~120
Analysis: The struct and standalone functions are analyzed separately. The struct is not a god class, but the file is a god module due to the excessive standalone functions. This indicates an overgrown utility module that should be split into smaller, focused modules.
Recommendation: Split standalone functions into focused utility modules:
- StringUtils (formatting, parsing)
- FileUtils (file operations)
- MathUtils (calculations)
Refactoring Recommendations
When is_god_object = true, Debtmap generates recommended module splits using the recommend_module_splits function. This feature:
- Groups methods by their inferred responsibilities
- Creates a ModuleSplit for each responsibility group containing:
  - suggested_name (e.g., “DataAccessManager”, “ValidationManager”)
  - methods_to_move (list of method names)
  - responsibility (category name)
  - estimated_lines (approximate LOC for the new module)
- Orders splits by cohesion (most focused responsibility groups first)
Example output:
Recommended Splits:
1. DataAccessManager (12 methods, ~150 lines)
2. ValidationManager (8 methods, ~100 lines)
3. PersistenceManager (5 methods, ~75 lines)
This provides an actionable roadmap for breaking down god objects into focused, single-responsibility modules.
See src/organization/god_object_detector.rs:165-177 and src/organization/god_object_analysis.rs:40-45.
Code Examples
Split by Responsibility
// Before: UserManager (god object)
struct UserManager { /* ... */ }
// After: Split into focused modules
struct AuthService { /* ... */ }
struct ProfileService { /* ... */ }
struct PermissionService { /* ... */ }
struct NotificationService { /* ... */ }
Extract Common Functionality
// Extract shared dependencies
struct ServiceContext {
    db: Database,
    cache: Cache,
    logger: Logger,
}
// Each service gets a reference
struct AuthService<'a> {
    context: &'a ServiceContext,
}
Use Composition
// Compose services instead of inheriting
struct UserFacade {
    auth: AuthService,
    profile: ProfileService,
    permissions: PermissionService,
}
impl UserFacade {
    fn login(&mut self, credentials: Credentials) -> Result<Session> {
        self.auth.login(credentials)
    }
}
Configuration
TOML Configuration
Add a [god_object_detection] section to your .debtmap.toml:
[god_object_detection]
enabled = true
[god_object_detection.rust]
max_methods = 20
max_fields = 15
max_traits = 5 # max_traits = max responsibilities
max_lines = 1000
max_complexity = 200
# Note: The configuration field is named 'max_traits' for historical reasons,
# but it controls the maximum number of responsibilities/concerns, not Rust traits.
# This is a legacy naming issue from early development.
[god_object_detection.python]
max_methods = 15
max_fields = 10
max_traits = 3
max_lines = 500
max_complexity = 150
[god_object_detection.javascript]
max_methods = 15
max_fields = 20
max_traits = 3
max_lines = 500
max_complexity = 150
Note: enabled defaults to true. Set to false to disable god object detection entirely (equivalent to --no-god-object CLI flag).
See src/config.rs:500-582.
Tuning for Your Project
Strict mode (smaller modules):
[god_object_detection.rust]
max_methods = 15
max_fields = 10
max_traits = 3
Lenient mode (larger modules acceptable):
[god_object_detection.rust]
max_methods = 30
max_fields = 20
max_traits = 7
CLI Options
Debtmap provides several CLI flags to control god object detection behavior:
--no-god-object
Disables god object detection entirely.
debtmap analyze . --no-god-object
Use case: When you only want function-level complexity analysis without file-level aggregation.
--aggregate-only
Shows only file-level god object scores, hiding individual function details.
debtmap analyze . --aggregate-only
Use case: High-level overview of which files are god objects without function-by-function breakdowns.
--no-aggregation
Disables file-level aggregation, showing only individual function metrics.
debtmap analyze . --no-aggregation
Use case: Detailed function-level analysis without combining into file scores.
--aggregation-method <METHOD>
Chooses how to combine function scores into file-level scores:
- sum - Add all function scores
- weighted_sum - Weight by complexity (default)
- logarithmic_sum - Logarithmic scaling for large files
- max_plus_average - Max score + average of others
debtmap analyze . --aggregation-method logarithmic_sum
--min-problematic <N>
Sets minimum number of problematic functions required for file-level aggregation.
debtmap analyze . --min-problematic 3
Use case: Avoid flagging files with only 1-2 complex functions as god objects.
See features.json:65-71 and features.json:507-512.
Output Display
File-Level Display
When a god object is detected, Debtmap displays:
⚠️ God Object: 270 methods, 0 fields, 8 responsibilities
Score: 1350 (Confidence: Definite)
Function-Level Display
Within a god object file, individual functions show:
├─ ⚠️ God Object: 45 methods, 20 fields, 5 responsibilities
│ Score: 250 (Confidence: Probable)
The ⚠️ God Object indicator makes it immediately clear which files need architectural refactoring.
Integration with File-Level Scoring
God object detection affects the overall technical debt prioritization through a god object multiplier:
god_object_multiplier = 2.0 + normalized_god_object_score
Normalization
The normalized_god_object_score is scaled to the 0-1 range using:
normalized_score = min(god_object_score / max_expected_score, 1.0)
Where max_expected_score is typically based on the maximum score in the analysis (e.g., 1000 for severe violations).
Impact on Prioritization
This multiplier means:
- Non-god objects (score = 0): multiplier = 2.0 (baseline)
- Moderate god objects (score = 200): multiplier ≈ 2.2-2.5
- Severe god objects (score = 1000+): multiplier ≈ 3.0 (maximum)
Result: God objects receive 2-3× higher priority in debt rankings, ensuring that:
- Functions within god objects inherit elevated scores due to architectural concerns
- God objects surface in the “top 10 most problematic” lists
- Architectural debt is weighted appropriately alongside function-level complexity
See file-level scoring documentation for complete details on how this multiplier integrates into the overall debt calculation.
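The normalization and multiplier formulas amount to the following sketch. The function name is hypothetical, max_expected_score is passed in explicitly because the text only says it is typically derived from the analysis, and the guard for a non-positive maximum is an added assumption.

```rust
/// god_object_multiplier = 2.0 + min(god_object_score / max_expected_score, 1.0)
pub fn god_object_multiplier(god_object_score: f64, max_expected_score: f64) -> f64 {
    // Assumption: treat a non-positive maximum as "no god object signal".
    if max_expected_score <= 0.0 {
        return 2.0;
    }
    let normalized = (god_object_score / max_expected_score).min(1.0);
    2.0 + normalized
}
```

Taking max_expected_score = 1000, a score of 0 gives 2.0, a score of 200 gives about 2.2, and any score of 1000 or more saturates at 3.0, in line with the ranges listed above.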
Metrics Tracking (Advanced)
For teams tracking god object evolution over time, Debtmap provides GodObjectMetrics with:
- Snapshots - Historical god object data per file
- Trends - Improving/Stable/Worsening classification (based on ±10 point score changes)
- New God Objects - Files that crossed the threshold
- Resolved God Objects - Files that were refactored below thresholds
This enables longitudinal analysis: “Are we reducing god objects sprint-over-sprint?”
See src/organization/god_object_metrics.rs:1-228.
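A trend classification based on a ±10 point change could look like the sketch below. The Trend enum and classify_trend are hypothetical names, and treating a change of exactly 10 points as Stable is an assumption, since the boundary behavior is not specified here.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Trend {
    Improving,
    Stable,
    Worsening,
}

/// Compare two god object scores for the same file from consecutive snapshots.
pub fn classify_trend(previous_score: f64, current_score: f64) -> Trend {
    let delta = current_score - previous_score;
    if delta < -10.0 {
        Trend::Improving // score dropped by more than 10 points
    } else if delta > 10.0 {
        Trend::Worsening // score rose by more than 10 points
    } else {
        Trend::Stable
    }
}
```

A file going from 250 to 200 would count as Improving, while a move from 200 to 205 stays Stable.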
Troubleshooting
“Why is my functional module flagged as a god object?”
Answer: Debtmap now analyzes god classes (structs) separately from god modules (standalone functions). If your functional module with 100 pure helper functions is flagged, it’s being detected as a god module (not a god class), which indicates the file has grown too large and should be split for better organization.
Solutions:
- Accept the finding: 100+ functions in one file is difficult to navigate and maintain, even if each function is simple
- Split by responsibility: Organize functions into smaller, focused modules (e.g., string_utils.rs, file_utils.rs, math_utils.rs)
- Use purity-weighted scoring (Rust only): Pure functions contribute only 0.3× weight, dramatically reducing scores for functional modules
- Adjust thresholds: Increase max_methods in .debtmap.toml if your project standards allow larger modules
“My god object score seems too high”
Answer: The scoring algorithm uses multiplicative scaling (base_score × 50 × violation_count), so the score grows quickly as more thresholds are violated. This ensures god objects are prioritized.
Solutions:
- Check the violation count - 5 violations means severe issues
- Review each metric - are method count, field count, responsibilities, LOC, and complexity all high?
- Consider if the score accurately reflects maintainability burden
“Why does my test file show as a god object?”
Answer: Test files often have many test functions, which can trigger god module detection. However, this is usually expected for comprehensive test suites.
Solutions:
- Accept the finding: Large test files (100+ test functions) can be difficult to navigate and maintain
- Split by feature: Organize tests into smaller files grouped by the feature they test (e.g., user_auth_tests.rs, user_profile_tests.rs)
- Adjust thresholds: If your project standards accept large test files, increase max_methods in .debtmap.toml
- Use test organization: Group related tests in modules within the test file for better structure
Note: Debtmap does not automatically exclude test files from god object detection. Consider the trade-offs between comprehensive test coverage in one file versus better organization across multiple test files.
“Can I disable god object detection for specific files?”
Answer: Currently, god object detection is global. However, you can:
- Use --no-god-object to disable entirely
- Use --no-aggregation to skip file-level analysis
- Adjust thresholds in .debtmap.toml to be more lenient
Best Practices
To avoid god objects:
- Follow Single Responsibility Principle - Each module should have one clear purpose
- Regular Refactoring - Split modules before they reach thresholds
- Monitor Growth - Track method and field counts as modules evolve
- Use Composition - Prefer smaller, composable units over large monoliths
- Clear Boundaries - Define clear module interfaces and responsibilities
- Leverage Purity - Keep pure functions separate from stateful logic (reduces scores in Rust)
- Set Project Thresholds - Customize .debtmap.toml to match your team’s standards
Configuration Tradeoffs
Strict Thresholds (e.g., Rust: 10 methods):
- ✅ Catch problems early
- ✅ Enforce strong modularity
- ❌ May flag legitimate large modules
- ❌ More noise in reports
Lenient Thresholds (e.g., Rust: 50 methods):
- ✅ Reduce false positives
- ✅ Focus on egregious violations
- ❌ Miss real god objects
- ❌ Allow technical debt to grow
Recommended: Start with defaults, then adjust based on your codebase’s characteristics. Use metrics tracking to monitor trends over time.
Related Documentation
- File-Level Scoring - How god objects affect overall file scores
- Configuration - Complete .debtmap.toml reference
- CLI Reference - All command-line options
- Tiered Prioritization - How god objects are prioritized
Summary
God object detection is a powerful architectural analysis feature that:
- Identifies files/types violating single responsibility principle
- Provides multiple scoring algorithms (simple, complexity-weighted, purity-weighted)
- Generates actionable refactoring recommendations
- Integrates with file-level scoring for holistic debt prioritization
- Supports customization via TOML config and CLI flags
By combining quantitative metrics (method count, LOC, complexity) with qualitative analysis (responsibility detection, purity), Debtmap helps teams systematically address architectural debt.
Multi-Pass Analysis
Multi-pass analysis is an advanced feature that performs two separate complexity analyses on your code to distinguish between genuine logical complexity and complexity artifacts introduced by code formatting. By comparing raw and normalized versions of your code, debtmap can attribute complexity to specific sources and provide actionable insights for refactoring.
Overview
Traditional complexity analysis treats all code as-is, which means formatting choices like multiline expressions, whitespace, and indentation can artificially inflate complexity metrics. Multi-pass analysis solves this problem by:
- Raw Analysis - Measures complexity of code exactly as written
- Normalized Analysis - Measures complexity after removing formatting artifacts
- Attribution - Compares the two analyses to identify complexity sources
The difference between raw and normalized complexity reveals how much “complexity” comes from formatting versus genuine logical complexity from control flow, branching, and nesting.
How It Works
Two-Pass Analysis Process
┌─────────────┐
│  Raw Code   │
└──────┬──────┘
       │
       ├──────────────────────┐
       │                      │
       ▼                      ▼
┌──────────────┐   ┌──────────────────────┐
│ Raw Analysis │   │ Normalize Formatting │
└──────┬───────┘   └──────────┬───────────┘
       │                      │
       │                      ▼
       │           ┌──────────────────────┐
       │           │ Normalized Analysis  │
       │           └──────────┬───────────┘
       │                      │
       └──────────┬───────────┘
                  ▼
         ┌──────────────────┐
         │   Attribution    │
         │     Engine       │
         └────────┬─────────┘
                  │
         ┌────────┴───────────┐
         │                    │
         ▼                    ▼
┌─────────────────┐  ┌─────────────────┐
│    Insights     │  │ Recommendations │
└─────────────────┘  └─────────────────┘
Raw Analysis examines your code as-is, capturing all complexity including:
- Logical control flow (if, loops, match, try/catch)
- Function calls and closures
- Formatting artifacts (multiline expressions, whitespace, indentation)
Normalized Analysis processes semantically equivalent code with standardized formatting:
- Removes excessive whitespace
- Normalizes multiline expressions to single lines where appropriate
- Standardizes indentation
- Preserves logical structure
Attribution Engine compares the results to categorize complexity sources:
- Logical Complexity - From control flow and branching (normalized result)
- Formatting Artifacts - From code formatting choices (difference between raw and normalized)
- Pattern Complexity - From recognized code patterns (error handling, validation, etc.)
Note: Pattern complexity analysis is part of the standard multi-pass analysis. No additional configuration is required to enable pattern detection.
CLI Usage
Enable multi-pass analysis with the --multi-pass flag:
# Basic multi-pass analysis
debtmap analyze . --multi-pass
# Multi-pass with detailed attribution breakdown
debtmap analyze . --multi-pass --attribution
# Control detail level
debtmap analyze . --multi-pass --attribution --detail-level comprehensive
# Output as JSON for tooling integration
debtmap analyze . --multi-pass --attribution --json
Available Flags
| Flag | Description |
|---|---|
| --multi-pass | Enable two-pass analysis (raw + normalized) |
| --attribution | Show detailed complexity attribution breakdown (requires --multi-pass) |
| --detail-level <level> | Set output detail: summary, standard, comprehensive, debug |
| --json | Output results in JSON format |
Note: The --attribution flag requires --multi-pass to be enabled, as attribution depends on comparing raw and normalized analyses.
Attribution Engine
The attribution engine breaks down complexity into three main categories, each with detailed tracking and suggestions.
Logical Complexity
Represents inherent complexity from your code’s control flow and structure:
- Function complexity - Cyclomatic and cognitive complexity per function
- Control flow - If statements, loops, match expressions
- Error handling - Try/catch blocks, Result/Option handling
- Closures and callbacks - Anonymous functions and callbacks
- Nesting levels - Depth of nested control structures
Each logical complexity component includes:
- Contribution - Complexity points from this construct
- Location - File, line, column, and span information
- Suggestions - Specific refactoring recommendations
Example:
// Function with high logical complexity
fn process_data(items: Vec<Item>) -> Result<Vec<Output>> {
    let mut results = Vec::new();
    for item in items { // +1 (loop)
        if item.is_valid() { // +1 (if)
            match item.category { // +1 (match)
                Category::A => {
                    if item.value > 100 { // +2 (nested if)
                        results.push(transform_a(&item)?);
                    }
                }
                Category::B => {
                    results.push(transform_b(&item)?);
                }
                _ => continue, // +1 (match arm)
            }
        }
    }
    Ok(results)
}
// Logical complexity: ~7 points
Formatting Artifacts
Identifies complexity introduced by code formatting choices:
- Multiline expressions - Long expressions split across multiple lines
- Excessive whitespace - Blank lines within code blocks
- Inconsistent indentation - Mixed tabs/spaces or irregular indentation
- Line breaks in chains - Method chains split across many lines
Formatting artifacts are categorized by severity:
- Low - Minor formatting inconsistencies (<10% impact)
- Medium - Noticeable formatting impact (10-25% impact)
- High - Significant complexity inflation (>25% impact)
Example:
// Same function with formatting that inflates complexity
fn process_data(
    items: Vec<Item>
) -> Result<Vec<Output>> {
    let mut results =
        Vec::new();
    for item in
        items
    {
        if item
            .is_valid()
        {
            match item
                .category
            {
                Category::A =>
                {
                    if item
                        .value
                        > 100
                    {
                        results
                            .push(
                                transform_a(
                                    &item
                                )?
                            );
                    }
                }
                Category::B =>
                {
                    results
                        .push(
                            transform_b(
                                &item
                            )?
                        );
                }
                _ => continue,
            }
        }
    }
    Ok(results)
}
// Raw complexity: ~12 points (formatting adds ~5 points)
// Normalized complexity: ~7 points (true logical complexity)
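The relationship between raw complexity, normalized complexity, and the severity bands can be sketched as follows. The names impact_percent, severity, and Severity are hypothetical, and the sketch assumes impact is measured as the artifact's share of raw complexity; only the band limits (<10%, 10-25%, >25%) come from the list above.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Severity {
    Low,
    Medium,
    High,
}

/// Share of raw complexity attributed to formatting, in percent.
/// Returns 0.0 when there is no raw complexity to compare against.
pub fn impact_percent(artifact_points: u32, raw_complexity: u32) -> f64 {
    if raw_complexity == 0 {
        return 0.0;
    }
    f64::from(artifact_points) / f64::from(raw_complexity) * 100.0
}

/// Map an impact percentage onto the severity bands listed above.
pub fn severity(impact: f64) -> Severity {
    if impact < 10.0 {
        Severity::Low
    } else if impact <= 25.0 {
        Severity::Medium
    } else {
        Severity::High
    }
}
```

For the example above, raw complexity of about 12 and normalized complexity of about 7 leave 5 points of formatting artifacts, which is roughly 41.7% of the raw value and would fall into the High band under these assumptions.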
Pattern Complexity
Recognizes common code patterns and their complexity characteristics:
- Error handling patterns - Result/Option propagation, error conversion
- Validation patterns - Input validation, constraint checking
- Data transformation - Map/filter/fold chains, data conversions
- Builder patterns - Fluent interfaces and builders
- State machines - Explicit state management
Each pattern includes:
- Confidence score (0.0-1.0) - How certain the pattern recognition is
- Opportunities - Suggestions for pattern extraction or improvement
Example:
// Imports assume the anyhow crate, which provides the .context(...) calls below
use anyhow::{Context, Result};
use std::{fs, path::Path};
// Error handling pattern (confidence: 0.85)
fn load_config(path: &Path) -> Result<Config> {
    let contents = fs::read_to_string(path)
        .context("Failed to read config file")?;
    let config: Config = serde_json::from_str(&contents)
        .context("Failed to parse config JSON")?;
    config.validate()
        .context("Config validation failed")?;
    Ok(config)
}
// Pattern complexity: moderate error handling overhead
// Suggestion: Consider error enum for better type safety
Understanding Attribution Output
When you run with --attribution, you’ll see a detailed breakdown:
$ debtmap analyze src/main.rs --multi-pass --attribution --detail-level comprehensive
Sample Output
Multi-Pass Analysis Results
============================
File: src/main.rs
Raw Complexity: 45
Normalized Complexity: 32
Formatting Impact: 28.9%
Attribution Breakdown
---------------------
Logical Complexity: 32 points
├─ Function 'main' (line 10): 8 points
│ ├─ Control flow: 5 points (2 if, 1 match, 2 loops)
│ ├─ Nesting: 3 points (max depth: 3)
│ └─ Suggestions:
│ - Break down into smaller functions
│ - Extract complex conditions into named variables
│
├─ Function 'process_request' (line 45): 12 points
│ ├─ Control flow: 8 points (4 if, 1 match, 3 early returns)
│ ├─ Nesting: 4 points (max depth: 4)
│ └─ Suggestions:
│ - Consider using early returns to reduce nesting
│ - Extract validation logic into separate function
│
└─ Function 'handle_error' (line 89): 12 points
├─ Control flow: 9 points (5 match arms, 4 if conditions)
├─ Pattern: Error handling (confidence: 0.90)
└─ Suggestions:
- Consider error enum instead of multiple match arms
Formatting Artifacts: 13 points (28.9% of raw complexity)
├─ Multiline expressions: 8 points (Medium severity)
│ └─ Locations: lines 23, 45, 67, 89
├─ Excessive whitespace: 3 points (Low severity)
│ └─ Locations: lines 12-14, 56-58
└─ Inconsistent indentation: 2 points (Low severity)
└─ Locations: lines 34, 78
Pattern Complexity: 3 recognized patterns
├─ Error handling (confidence: 0.85): 8 occurrences
│ └─ Opportunity: Consider centralizing error handling
├─ Validation (confidence: 0.72): 5 occurrences
│ └─ Opportunity: Extract validation to separate module
└─ Data transformation (confidence: 0.68): 3 occurrences
└─ Opportunity: Review for functional composition
Interpreting the Results
Logical Complexity Breakdown
- Each function is listed with its complexity contribution
- Control flow elements are itemized (if, loops, match, etc.)
- Nesting depth shows how deeply structures are nested
- Suggestions are specific to that function’s complexity patterns
Formatting Artifacts
- Shows percentage of “false” complexity from formatting
- Severity indicates impact on metrics
- Locations help you find the formatting issues
- High formatting impact (>25%) suggests inconsistent style
Pattern Analysis
- Confidence score shows pattern recognition certainty
- High confidence (>0.7) means reliable pattern detection
- Low confidence (<0.5) suggests unique code structure
- Opportunities highlight potential refactoring
Insights and Recommendations
Multi-pass analysis automatically generates insights and recommendations based on the attribution results.
Insight Types
FormattingImpact
- Triggered when formatting contributes >20% of measured complexity
- Suggests using automated formatting tools
- Recommends standardizing team coding style
PatternOpportunity
- Triggered when pattern confidence is low (<0.5)
- Suggests extracting common patterns
- Recommends reviewing for code duplication
RefactoringCandidate
- Triggered when logical complexity exceeds threshold (>20)
- Identifies functions needing breakdown
- Provides specific refactoring strategies
ComplexityHotspot
- Identifies areas of concentrated complexity
- Highlights files or modules needing attention
- Suggests architectural improvements
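The documented triggers can be read as simple threshold checks. The following is a minimal sketch under that reading; the `Insight` enum, the `AttributionSummary` struct, and `insights_for` are hypothetical names for this example, and the real analyzer may combine these signals differently. ComplexityHotspot is left out because no numeric trigger is given for it.

```rust
/// Hypothetical mirror of the insight types that have a documented threshold.
#[derive(Debug, PartialEq)]
pub enum Insight {
    FormattingImpact,
    PatternOpportunity,
    RefactoringCandidate,
}

/// Inputs assumed for this sketch; Debtmap's own data model may differ.
pub struct AttributionSummary {
    pub formatting_impact_percent: f64,
    pub pattern_confidence: f64,
    pub logical_complexity: u32,
}

/// Applies the three documented triggers independently of each other.
pub fn insights_for(summary: &AttributionSummary) -> Vec<Insight> {
    let mut insights = Vec::new();
    if summary.formatting_impact_percent > 20.0 {
        insights.push(Insight::FormattingImpact);
    }
    if summary.pattern_confidence < 0.5 {
        insights.push(Insight::PatternOpportunity);
    }
    if summary.logical_complexity > 20 {
        insights.push(Insight::RefactoringCandidate);
    }
    insights
}
```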
Recommendation Structure
Each recommendation includes:
- Priority: Low, Medium, High
- Category: Refactoring, Pattern, Formatting, General
- Title: Brief description of the issue
- Description: Detailed explanation
- Estimated Impact: Expected complexity reduction (in points)
- Suggested Actions: Specific steps to take
Example Recommendations
{
"recommendations": [
{
"priority": "High",
"category": "Refactoring",
"title": "Simplify control flow in 'process_request'",
"description": "This function contributes 12 complexity points with deeply nested conditions",
"estimated_impact": 6,
"suggested_actions": [
"Extract validation logic into separate function",
"Use early returns to reduce nesting depth",
"Consider state pattern for complex branching"
]
},
{
"priority": "Medium",
"category": "Formatting",
"title": "Formatting contributes 29% of measured complexity",
"description": "Code formatting choices are inflating complexity metrics",
"estimated_impact": 13,
"suggested_actions": [
"Use automated formatting tools (rustfmt, prettier)",
"Standardize code formatting across the team",
"Configure editor to format on save"
]
},
{
"priority": "Low",
"category": "Pattern",
"title": "Low pattern recognition suggests unique code structure",
"description": "Pattern confidence score of 0.45 indicates non-standard patterns",
"estimated_impact": 3,
"suggested_actions": [
"Consider extracting common patterns into utilities",
"Review for code duplication opportunities",
"Document unique patterns for team understanding"
]
}
]
}
Performance Considerations
Multi-pass analysis adds overhead compared to single-pass analysis, but debtmap monitors and limits this overhead.
Performance Metrics
When performance tracking is enabled, you’ll see:
Performance Metrics
-------------------
Raw analysis: 145ms
Normalized analysis: 132ms
Attribution: 45ms
Total time: 322ms
Memory used: 12.3 MB
Overhead: 122.1% vs single-pass (145ms baseline)
⚠️ Warning: Overhead exceeds 25% target
Note: Memory usage values are estimates based on parallelism level, not precise heap measurements.
Tracked Metrics:
- Raw analysis time - Time to analyze original code
- Normalized analysis time - Time to analyze normalized code
- Attribution time - Time to compute attribution breakdown
- Total time - Complete multi-pass analysis duration
- Memory used - Estimated additional memory for two-pass analysis
Performance Overhead
Target Overhead: ≤25% compared to single-pass analysis
Multi-pass analysis aims to add no more than 25% overhead versus standard single-pass analysis. If overhead exceeds this threshold, a warning is issued.
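One plausible way to compute the overhead figure is to compare total multi-pass time against the single-pass baseline. The sketch below assumes that definition; the constant and function names are illustrative and not Debtmap's internals.

```rust
/// Target from the text: at most 25% overhead versus single-pass analysis.
pub const OVERHEAD_TARGET_PERCENT: f64 = 25.0;

/// Extra time spent by multi-pass analysis, as a percentage of the baseline.
/// Returns None when the baseline is not positive, since the ratio is undefined.
pub fn overhead_percent(total_ms: f64, baseline_ms: f64) -> Option<f64> {
    if baseline_ms <= 0.0 {
        return None;
    }
    Some((total_ms - baseline_ms) / baseline_ms * 100.0)
}

/// True when the measured overhead is above the 25% target.
pub fn exceeds_target(total_ms: f64, baseline_ms: f64) -> bool {
    overhead_percent(total_ms, baseline_ms)
        .map(|percent| percent > OVERHEAD_TARGET_PERCENT)
        .unwrap_or(false)
}
```

Under this definition, a 322ms total against a 145ms baseline works out to roughly 122% overhead, which would trigger the warning.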
Typical Overhead:
- Attribution adds ~10-15% on average
- Normalization adds ~5-10% on average
- Total overhead usually 15-25%
Factors Affecting Performance:
- File size - Larger files take proportionally longer
- Complexity - More complex code requires more analysis time
- Language - Some languages (TypeScript) are slower to parse
- Parallel processing - Overhead is incurred per file, so analyzing files in parallel reduces its impact on wall-clock time
Optimization Tips
Disable Performance Tracking in Production
#![allow(unused)]
fn main() {
MultiPassOptions {
performance_tracking: false, // Reduces overhead slightly
..Default::default()
}
}
Use Parallel Processing
# Parallel analysis amortizes overhead across cores
# Note: --jobs is a general debtmap flag controlling parallelism for all analysis
debtmap analyze . --multi-pass --jobs 8
Target Specific Files
# Analyze only files that need detailed attribution
debtmap analyze src/complex_module.rs --multi-pass --attribution
Comparative Analysis
Multi-pass analysis supports comparing code changes to validate refactoring efforts.
Basic Comparison
The compare_complexity function is a standalone convenience function that performs complete multi-pass analysis on both code versions and returns the computed differences:
#![allow(unused)]
fn main() {
use debtmap::analysis::multi_pass::compare_complexity;
use debtmap::core::Language;
let before_code = r#"
fn process(items: Vec<i32>) -> i32 {
let mut sum = 0;
for item in items {
if item > 0 {
if item % 2 == 0 {
sum += item * 2;
} else {
sum += item;
}
}
}
sum
}
"#;
let after_code = r#"
fn process(items: Vec<i32>) -> i32 {
items
.into_iter()
.filter(|&item| item > 0)
.map(|item| if item % 2 == 0 { item * 2 } else { item })
.sum()
}
"#;
let comparison = compare_complexity(before_code, after_code, Language::Rust)?;
println!("Complexity change: {}", comparison.complexity_change);
println!("Cognitive complexity change: {}", comparison.cognitive_change);
println!("Formatting impact change: {}", comparison.formatting_impact_change);
}
Comparison Results
The ComparativeAnalysis struct contains the computed differences between before and after analyses:
#![allow(unused)]
fn main() {
pub struct ComparativeAnalysis {
pub complexity_change: i32, // Negative = improvement
pub cognitive_change: i32, // Negative = improvement
pub formatting_impact_change: f32, // Negative = less formatting noise
pub improvements: Vec<String>,
pub regressions: Vec<String>,
}
}
Note: The compare_complexity function performs both analyses internally and returns only the change metrics. To access the full before/after results, perform separate analyses using MultiPassAnalyzer.
Interpreting Changes:
- Negative complexity change - Refactoring reduced complexity ✓
- Positive complexity change - Refactoring increased complexity ✗
- Improvements - List of detected improvements (reduced nesting, extracted functions, etc.)
- Regressions - List of detected regressions (increased complexity, new anti-patterns, etc.)
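These rules could be folded into a single verdict, for example to gate a refactoring pull request. The sketch below uses a local stand-in struct with the same fields as ComparativeAnalysis so that it compiles on its own; the `Verdict` enum and `verdict` function are assumptions for illustration, not part of Debtmap.

```rust
/// Local stand-in with the same fields as the ComparativeAnalysis struct above.
pub struct ComparativeAnalysis {
    pub complexity_change: i32,
    pub cognitive_change: i32,
    pub formatting_impact_change: f32,
    pub improvements: Vec<String>,
    pub regressions: Vec<String>,
}

/// Hypothetical summary of a comparison.
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Improved,
    Regressed,
    Mixed,
    Unchanged,
}

/// Negative changes count as improvements; positive changes or any listed
/// regression count against the refactoring.
pub fn verdict(analysis: &ComparativeAnalysis) -> Verdict {
    let got_better = analysis.complexity_change < 0 || analysis.cognitive_change < 0;
    let got_worse = analysis.complexity_change > 0
        || analysis.cognitive_change > 0
        || !analysis.regressions.is_empty();
    match (got_better, got_worse) {
        (true, false) => Verdict::Improved,
        (false, true) => Verdict::Regressed,
        (true, true) => Verdict::Mixed,
        (false, false) => Verdict::Unchanged,
    }
}
```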
Example Output
Comparative Analysis
====================
Complexity Changes:
├─ Cyclomatic: 8 → 4 (-4, -50%)
├─ Cognitive: 12 → 5 (-7, -58.3%)
└─ Formatting Impact: 25% → 10% (-15%, -60%)
Improvements Detected:
✓ Reduced nesting depth (3 → 1)
✓ Eliminated mutable state
✓ Replaced imperative loop with functional chain
✓ Improved formatting consistency
No regressions detected.
Verdict: Refactoring reduced complexity by 50% and improved code clarity.
Configuration Options
Configure multi-pass analysis programmatically:
#![allow(unused)]
fn main() {
use debtmap::analysis::multi_pass::{MultiPassAnalyzer, MultiPassOptions};
use debtmap::analysis::diagnostics::{DetailLevel, OutputFormat};
use debtmap::core::Language;
let options = MultiPassOptions {
language: Language::Rust,
detail_level: DetailLevel::Comprehensive,
enable_recommendations: true,
track_source_locations: true,
generate_insights: true,
output_format: OutputFormat::Json, // Also available: Yaml, Markdown, Html, Text
performance_tracking: true,
};
let analyzer = MultiPassAnalyzer::new(options);
}
Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| language | Language | Rust | Target programming language |
| detail_level | DetailLevel | Standard | Output detail: Summary, Standard, Comprehensive, Debug (CLI uses lowercase: --detail-level standard) |
| enable_recommendations | bool | true | Generate actionable recommendations |
| track_source_locations | bool | true | Include file/line/column in attribution |
| generate_insights | bool | true | Automatically generate insights |
| output_format | OutputFormat | Json | Output format: Json, Yaml, Markdown, Html, Text |
| performance_tracking | bool | false | Track and report performance metrics |
Use Cases
When to Use Multi-Pass Analysis
Refactoring Validation
- Compare before/after complexity to validate refactoring
- Ensure complexity actually decreased
- Identify unintended complexity increases
Formatting Impact Assessment
- Determine how much formatting affects your metrics
- Justify automated formatting tool adoption
- Identify formatting inconsistencies
Targeted Refactoring
- Use attribution to find highest-impact refactoring targets
- Focus on logical complexity, not formatting artifacts
- Prioritize functions with actionable suggestions
Code Review
- Provide objective complexity data in pull requests
- Identify genuine complexity increases vs formatting changes
- Guide refactoring discussions with data
Codebase Health Monitoring
- Track logical complexity trends over time
- Separate signal (logic) from noise (formatting)
- Identify complexity hotspots for architectural review
When to Use Standard Analysis
Quick Feedback
- Fast complexity checks during development
- CI/CD gates that need speed
- Large codebases where overhead matters
Sufficient Metrics
- When overall complexity trends are enough
- No need for detailed attribution
- Formatting is already standardized
Future Enhancements
Spec 84: Detailed AST-Based Source Mapping
The current implementation uses estimated complexity locations based on function metrics. Spec 84 will enhance attribution with precise AST-based source mapping:
Planned Improvements:
- Exact AST node locations - Precise line, column, and span for each complexity point
- 100% accurate mapping - No estimation, direct AST-to-source mapping
- IDE integration - Jump from complexity reports directly to source code
- Inline visualization - Show complexity heat maps in your editor
- Statement-level tracking - Complexity attribution at statement granularity
Current vs Future:
Current (estimated):
#![allow(unused)]
fn main() {
ComplexityComponent {
location: CodeLocation {
line: 45, // Function start line
column: 0, // Estimated
span: None, // Not available
},
description: "Function: process_request",
}
}
Future (precise):
#![allow(unused)]
fn main() {
ComplexityComponent {
location: SourceLocation {
line: 47, // Exact if statement line
column: 8, // Exact column
span: Some((47, 52)), // Exact span of construct
ast_path: "fn::process_request::body::if[0]",
},
description: "If condition: item.is_valid()",
}
}
This will enable:
- Click-to-navigate from reports to exact code locations
- Visual Studio Code / IntelliJ integration for inline complexity display
- More precise refactoring suggestions
- Better complexity trend tracking at fine granularity
Summary
Multi-pass analysis provides deep insights into your code’s complexity by:
- Separating signal from noise - Distinguishing logical complexity from formatting artifacts
- Attributing complexity sources - Identifying what contributes to complexity and why
- Generating actionable insights - Providing specific refactoring recommendations
- Validating refactoring - Comparing before/after to prove complexity reduction
- Monitoring performance - Ensuring overhead stays within acceptable bounds
Use --multi-pass --attribution when you need detailed complexity analysis and targeted refactoring guidance. The overhead (typically 15-25%) is worthwhile when you need to understand why code is complex and how to improve it.
For quick complexity checks and CI/CD integration, standard single-pass analysis is usually sufficient. Save multi-pass analysis for deep dives, refactoring validation, and complexity investigations.
See Also:
- Analysis Guide - General analysis capabilities
- Scoring Strategies - How complexity affects debt scores
- Coverage Integration - Combining complexity with coverage
- Examples - Real-world multi-pass analysis examples
Parallel Processing
Debtmap uses Rust's parallel processing ecosystem to analyze large codebases efficiently. It is built on Rayon for data parallelism and DashMap for sharded concurrent data structures, and the project reports 10-100x faster analysis than Java/Python-based competitors.
Overview
Debtmap’s parallel processing architecture uses a three-phase approach:
- Parallel File Parsing - Parse source files concurrently across all available CPU cores
- Parallel Multi-File Extraction - Extract call graphs from parsed files in parallel
- Enhanced Analysis - Analyze trait dispatch, function pointers, and framework patterns (this phase currently runs sequentially; see Phase 3 below)
This parallel pipeline is controlled by CLI flags that let you tune performance for your environment.
Performance Characteristics
Typical analysis times:
- Small project (1k-5k LOC): <1 second
- Medium project (10k-50k LOC): 2-8 seconds
- Large project (100k-500k LOC): 10-45 seconds
Comparison with other tools (medium-sized Rust project, ~50k LOC):
- SonarQube: 3-4 minutes
- CodeClimate: 2-3 minutes
- Debtmap: 5-8 seconds
CLI Flags for Parallelization
Debtmap provides two flags to control parallel processing behavior:
--jobs / -j
Control the number of worker threads for parallel processing:
# Use all available CPU cores (default)
debtmap analyze --jobs 0
# Limit to 4 threads
debtmap analyze --jobs 4
debtmap analyze -j 4
Behavior:
- --jobs 0 (default): Auto-detects available CPU cores using std::thread::available_parallelism(). Falls back to 4 threads if detection fails.
- --jobs N: Explicitly sets the thread pool to N threads.
When to use:
- Use --jobs 0 for maximum performance on developer workstations
- Use --jobs 1-4 in memory-constrained environments like CI/CD
- Use --jobs 1 for deterministic analysis order during debugging
Environment Variables:
You can also set the default via environment variables:
DEBTMAP_JOBS - Set the default thread count:
export DEBTMAP_JOBS=4
debtmap analyze # Uses 4 threads
DEBTMAP_PARALLEL - Enable/disable parallel processing programmatically:
export DEBTMAP_PARALLEL=true
debtmap analyze # Parallel processing enabled
export DEBTMAP_PARALLEL=1
debtmap analyze # Parallel processing enabled (also accepts '1')
The DEBTMAP_PARALLEL variable accepts true or 1 to enable parallel processing. This is useful for programmatic control in scripts or CI environments.
The CLI flags (--jobs, --no-parallel) take precedence over environment variables.
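The precedence rule can be sketched as a small resolver. This is an illustration under stated assumptions: `resolve_jobs` is a hypothetical helper, `cli_jobs` is taken to be `None` when `--jobs` was not passed, and an unparsable `DEBTMAP_JOBS` value is treated as unset. Only `std::thread::available_parallelism()` and the fallback of 4 come from the text.

```rust
use std::thread;

/// Resolves the worker count. `cli_jobs` is None when --jobs was not passed
/// (an assumption of this sketch); `env_jobs` is the raw DEBTMAP_JOBS value.
pub fn resolve_jobs(cli_jobs: Option<usize>, env_jobs: Option<&str>) -> usize {
    // The CLI flag wins; the environment variable is only a default.
    let requested = cli_jobs
        .or_else(|| env_jobs.and_then(|raw| raw.trim().parse::<usize>().ok()))
        .unwrap_or(0);

    if requested == 0 {
        // 0 means auto-detect, with the documented fallback of 4 threads.
        thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(4)
    } else {
        requested
    }
}
```

A caller would typically pass the parsed flag together with `std::env::var("DEBTMAP_JOBS").ok().as_deref()`.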
--no-parallel
Disable parallel call graph construction entirely:
debtmap analyze --no-parallel
When to use:
- Debugging concurrency issues: Isolate whether a problem is parallelism-related
- Memory-constrained environments: Parallel processing increases memory usage
- Deterministic analysis: Ensures consistent ordering for reproducibility
Performance Impact:
Disabling parallelization significantly increases analysis time:
- Small projects (< 100 files): 2-3x slower
- Medium projects (100-1000 files): 5-10x slower
- Large projects (> 1000 files): 10-50x slower
For more details on both flags, see the CLI Reference.
Rayon Parallel Iterators
Debtmap uses Rayon, a data parallelism library for Rust, to parallelize file processing operations.
Thread Pool Configuration
The global Rayon thread pool is configured at startup based on the --jobs parameter:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:48-53
if self.config.num_threads > 0 {
rayon::ThreadPoolBuilder::new()
.num_threads(self.config.num_threads)
.build_global()
.ok(); // Ignore if already configured
}
}
This configures Rayon to use a specific number of worker threads for all parallel operations throughout the analysis.
Worker Thread Selection
The get_worker_count() function determines how many threads to use:
#![allow(unused)]
fn main() {
// From src/main.rs:828-836
fn get_worker_count(jobs: usize) -> usize {
if jobs == 0 {
std::thread::available_parallelism()
.map(|n| n.get())
.unwrap_or(4) // Fallback if detection fails
} else {
jobs // Use explicit value
}
}
}
Auto-detection behavior:
- Queries the OS for available parallelism (CPU cores)
- Respects cgroup limits in containers (Docker, Kubernetes)
- Falls back to 4 threads if detection fails (rare)
Manual configuration:
- Useful in shared environments (CI/CD, shared build servers)
- Prevents resource contention with other processes
- Enables reproducible benchmarking
Parallel File Processing
Phase 1: Parallel File Parsing
Files are parsed concurrently using Rayon’s parallel iterators:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs (parallel_parse_files method)
let parsed_files: Vec<_> = rust_files
.par_iter() // Convert to parallel iterator
.filter_map(|file_path| {
let content = io::read_file(file_path).ok()?;
// Update progress atomically
parallel_graph.stats().increment_files();
Some((file_path.clone(), content))
})
.collect();
}
Key features:
- .par_iter() iterates over the file list in parallel
- Each file is read independently on a worker thread
- Progress tracking uses atomic counters (see Parallel Call Graph Statistics)
Phase 2: Parallel Multi-File Extraction
Files are grouped into chunks for optimal parallelization:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs (parallel_multi_file_extraction method)
let chunk_size = std::cmp::max(10, parsed_files.len() / rayon::current_num_threads());
parsed_files.par_chunks(chunk_size).for_each(|chunk| {
// Parse syn files within each chunk
let parsed_chunk: Vec<_> = chunk
.iter()
.filter_map(|(path, content)| {
syn::parse_file(content)
.ok()
.map(|parsed| (parsed, path.clone()))
})
.collect();
if !parsed_chunk.is_empty() {
// Extract call graph for this chunk
let chunk_graph = extract_call_graph_multi_file(&parsed_chunk);
// Merge into main graph concurrently
parallel_graph.merge_concurrent(chunk_graph);
}
});
}
This chunking strategy balances parallelism with overhead:
- Minimum chunk size of 10 files prevents excessive overhead
- Dynamic chunk sizing based on available threads
- Each chunk produces a local call graph that’s merged concurrently
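To see how the sizing rule plays out, the standard-library sketch below applies the same `max(10, files / threads)` formula and reports the chunk lengths that `slice::chunks` yields. The helper names are made up for this example, and it runs sequentially; it only illustrates the arithmetic, not the Rayon pipeline.

```rust
/// Chunk size rule from the snippet above: at least 10 files per chunk.
pub fn chunk_size(file_count: usize, worker_threads: usize) -> usize {
    // max(1) avoids a division by zero if the thread count is reported as 0.
    std::cmp::max(10, file_count / worker_threads.max(1))
}

/// Lengths of the chunks that `slice::chunks` produces for `file_count` items.
pub fn chunk_lengths(file_count: usize, worker_threads: usize) -> Vec<usize> {
    let files: Vec<usize> = (0..file_count).collect();
    files
        .chunks(chunk_size(file_count, worker_threads))
        .map(|chunk| chunk.len())
        .collect()
}
```

For 404 files on 8 threads this gives a chunk size of 50 and nine chunks (eight of 50 files and one of 4), while 40 files on 8 threads fall back to the minimum size of 10 and produce four chunks.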
AST Parsing Optimization (Spec 132)
Prior to spec 132, files were parsed twice during call graph construction:
- Phase 1: Read files and store content as strings
- Phase 2: Re-parse the same content to extract call graphs
This redundant parsing was eliminated by parsing each file exactly once and reusing the parsed syn::File AST:
#![allow(unused)]
fn main() {
// Optimized: Parse once in Phase 1
let parsed_files: Vec<(PathBuf, syn::File)> = rust_files
.par_iter()
.filter_map(|file_path| {
let content = io::read_file(file_path).ok()?;
let parsed = syn::parse_file(&content).ok()?; // Parse ONCE
Some((file_path.clone(), parsed))
})
.collect();
// Phase 2: Reuse parsed ASTs (no re-parsing)
for chunk in parsed_files.chunks(chunk_size) {
let chunk_for_extraction: Vec<_> = chunk
.iter()
.map(|(path, parsed)| (parsed.clone(), path.clone())) // Clone AST
.collect();
// Extract call graph...
}
}
Performance Impact:
- Before: 2N parse operations (404 files × 2 = 808 parses)
- After: N parse operations (404 files × 1 = 404 parses)
- Speedup: Cloning a parsed AST is 44% faster than re-parsing
- Time saved: ~432ms per analysis run on 400-file projects
- Memory overhead: <100MB for parsed AST storage
Why Clone Instead of Borrow?
- syn::File is not Send + Sync (cannot be shared across threads)
- Call graph extraction requires owned AST values
- Cloning is still significantly faster than re-parsing (1.33ms vs 2.40ms per file)
See docs/spec-132-benchmark-results.md for detailed benchmarks validating these improvements.
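The quoted savings follow from the two per-file timings. The short sketch below reproduces that arithmetic; the constants are the figures quoted above, and the function names are illustrative only.

```rust
/// Per-file timings quoted above, in milliseconds.
pub const REPARSE_MS: f64 = 2.40;
pub const CLONE_MS: f64 = 1.33;

/// Time saved per run when each file is cloned instead of re-parsed.
pub fn time_saved_ms(file_count: u32) -> f64 {
    f64::from(file_count) * (REPARSE_MS - CLONE_MS)
}

/// How much faster a clone is than a re-parse, as a percentage.
pub fn clone_speedup_percent() -> f64 {
    (REPARSE_MS - CLONE_MS) / REPARSE_MS * 100.0
}
```

For 404 files this gives about 432ms saved, and a clone comes out roughly 44.6% faster than a re-parse, in line with the figures listed.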
Phase 3: Enhanced Analysis
The third phase analyzes trait dispatch, function pointers, and framework patterns. This phase is currently sequential due to complex shared state requirements, but benefits from the parallel foundation built in phases 1-2.
Parallel Architecture
Debtmap processes files in parallel using Rayon’s parallel iterators:
#![allow(unused)]
fn main() {
files.par_iter()
.map(|file| analyze_file(file))
.collect()
}
Each file is:
- Parsed independently
- Analyzed for complexity
- Scored and prioritized
DashMap for Lock-Free Concurrency
Debtmap uses DashMap, a concurrent hash map implementation, to share data structures across threads with low contention during parallel call graph construction.
Why DashMap?
Traditional approaches to concurrent hash maps use a single Mutex<HashMap>, which creates contention:
#![allow(unused)]
fn main() {
// ❌ Traditional approach - serializes all access
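let map: Arc<Mutex<HashMap<K, V>>> = Arc::new(Mutex::new(HashMap::new()));
// Thread 1 blocks Thread 2, even for reads
// Clone the value out so it does not borrow from the temporary lock guard
let val = map.lock().unwrap().get(&key).cloned();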
}
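DashMap shards its storage internally and guards each shard with its own lock, so reads and writes that touch different shards do not block each other:
#![allow(unused)]
fn main() {
// ✅ DashMap approach - concurrent reads, fine-grained writes
let map: Arc<DashMap<K, V>> = Arc::new(DashMap::new());
// Multiple threads can read concurrently without blocking each other
// Clone the value so the shard guard is released before the insert below
let val = map.get(&key).map(|entry| entry.value().clone());
// Writes only lock the specific shard, not the whole map
map.insert(key, value);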
}
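The idea behind sharding can be shown with the standard library alone. The `ShardedMap` type below is a teaching sketch written for this page, not DashMap's actual implementation: it hashes each key to one of several `RwLock<HashMap>` shards, so operations on different shards never wait on each other.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

/// Hypothetical illustration of sharding; not DashMap's real internals.
pub struct ShardedMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V: Clone> ShardedMap<K, V> {
    pub fn new(shard_count: usize) -> Self {
        let shards = (0..shard_count.max(1))
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        Self { shards }
    }

    /// Picks the shard responsible for `key` from its hash.
    fn shard_for(&self, key: &K) -> &RwLock<HashMap<K, V>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        let index = (hasher.finish() as usize) % self.shards.len();
        &self.shards[index]
    }

    /// Write-locks only the shard that owns `key`.
    pub fn insert(&self, key: K, value: V) -> Option<V> {
        self.shard_for(&key).write().unwrap().insert(key, value)
    }

    /// Read-locks one shard and returns a clone, so no guard escapes.
    pub fn get(&self, key: &K) -> Option<V> {
        self.shard_for(key).read().unwrap().get(key).cloned()
    }

    /// Total number of entries across all shards.
    pub fn len(&self) -> usize {
        self.shards.iter().map(|s| s.read().unwrap().len()).sum()
    }
}
```

Returning clones from `get` keeps lock guards from leaking to callers, which avoids the situation where a held read guard blocks a later write to the same shard.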
ParallelCallGraph Implementation
The ParallelCallGraph uses DashMap for all concurrent data structures:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:50-56
pub struct ParallelCallGraph {
nodes: Arc<DashMap<FunctionId, NodeInfo>>, // Functions
edges: Arc<DashSet<FunctionCall>>, // Calls
caller_index: Arc<DashMap<FunctionId, DashSet<FunctionId>>>, // Who calls this?
callee_index: Arc<DashMap<FunctionId, DashSet<FunctionId>>>, // Who does this call?
stats: Arc<ParallelStats>, // Atomic counters
}
}
Key components:
- nodes: Maps function identifiers to metadata (complexity, lines, flags)
- edges: Set of all function calls (deduplicated automatically)
- caller_index: Reverse index for “who calls this function?”
- callee_index: Forward index for “what does this function call?”
- stats: Atomic counters for progress tracking
Concurrent Operations
Adding Functions Concurrently
Multiple analyzer threads can add functions simultaneously:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:79-96
pub fn add_function(
&self,
id: FunctionId,
is_entry_point: bool,
is_test: bool,
complexity: u32,
lines: usize,
) {
let node_info = NodeInfo {
id: id.clone(),
is_entry_point,
is_test,
complexity,
lines,
};
self.nodes.insert(id, node_info);
self.stats.add_nodes(1); // Atomic increment
}
}
Atomicity guarantees:
- DashMap::insert() is atomic - no data races
- AtomicUsize counters can be incremented from multiple threads safely
- Callers need no explicit locks to read existing nodes
Adding Calls Concurrently
Function calls are added with automatic deduplication:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:99-117
pub fn add_call(&self, caller: FunctionId, callee: FunctionId, call_type: CallType) {
let call = FunctionCall {
caller: caller.clone(),
callee: callee.clone(),
call_type,
};
if self.edges.insert(call) { // DashSet deduplicates automatically
// Update indices concurrently
self.caller_index
.entry(caller.clone())
.or_default()
.insert(callee.clone());
self.callee_index.entry(callee).or_default().insert(caller);
self.stats.add_edges(1); // Only increment if actually inserted
}
}
}
Deduplication:
- DashSet::insert() returns true only for new items
- Duplicate calls from multiple threads are safely ignored
- Indices are updated atomically using the entry() API
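The "count only on first insert" behavior can be reproduced with standard-library types. The sketch below swaps DashSet for a `Mutex<HashSet>` purely to stay dependency-free; `EdgeSet` and its methods are hypothetical names, and edges are simplified to `(caller, callee)` string pairs.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Simplified stand-in for the edge set: (caller, callee) pairs.
pub struct EdgeSet {
    edges: Mutex<HashSet<(String, String)>>,
    edge_count: AtomicUsize,
}

impl EdgeSet {
    pub fn new() -> Self {
        Self {
            edges: Mutex::new(HashSet::new()),
            edge_count: AtomicUsize::new(0),
        }
    }

    /// Records a call and returns true only if it was not seen before.
    pub fn add_call(&self, caller: &str, callee: &str) -> bool {
        let inserted = self
            .edges
            .lock()
            .unwrap()
            .insert((caller.to_string(), callee.to_string()));
        if inserted {
            // Count the edge only when the set actually grew.
            self.edge_count.fetch_add(1, Ordering::Relaxed);
        }
        inserted
    }

    pub fn edge_count(&self) -> usize {
        self.edge_count.load(Ordering::Relaxed)
    }
}
```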
Shared Read-Only Data
Analysis configuration and indexes are shared across threads:
#![allow(unused)]
fn main() {
let coverage_index = Arc::new(build_coverage_index());
// All threads share the same index
files.par_iter()
.map(|file| analyze_with_coverage(file, &coverage_index))
}
Memory Overhead
DashMap uses internal sharding for parallelism, which has a memory overhead:
- DashMap overhead: ~2x the memory of a regular HashMap due to sharding
- DashSet overhead: Similar to DashMap
- Benefit: Enables concurrent access without contention
- Trade-off: Debtmap prioritizes speed over memory for large codebases
For memory-constrained environments, use --jobs 2-4 or --no-parallel to reduce parallel overhead.
Parallel Call Graph Statistics
Debtmap tracks parallel processing progress using atomic counters that can be safely updated from multiple threads.
ParallelStats Structure
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:7-14
pub struct ParallelStats {
pub total_nodes: AtomicUsize, // Functions processed
pub total_edges: AtomicUsize, // Calls discovered
pub files_processed: AtomicUsize, // Files completed
pub total_files: AtomicUsize, // Total files to process
}
}
Atomic operations:
- fetch_add() - Atomically increment counters from any thread
- load() - Read current value without blocking
- Ordering::Relaxed - Sufficient for statistics (no synchronization needed)
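A small standard-library sketch shows these operations working together. The `Progress` struct and `simulate` function are hypothetical and only mimic two of the counters; worker threads bump a shared `AtomicUsize` with `Ordering::Relaxed`, and the totals are read once the threads have joined.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical two-counter version of the statistics struct.
#[derive(Default)]
pub struct Progress {
    pub files_processed: AtomicUsize,
    pub total_files: AtomicUsize,
}

/// Simulates `workers` threads that each finish `files_per_worker` files.
pub fn simulate(workers: usize, files_per_worker: usize) -> Progress {
    let progress = Progress::default();
    progress
        .total_files
        .store(workers * files_per_worker, Ordering::Relaxed);

    // Scoped threads may borrow `progress`; all are joined when the scope ends.
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| {
                for _ in 0..files_per_worker {
                    progress.files_processed.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    progress
}
```

Relaxed ordering is enough here because each increment is itself atomic and joining the threads makes the final value visible to the reader.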
Progress Tracking
Progress ratio calculation for long-running analysis:
#![allow(unused)]
fn main() {
// From src/priority/parallel_call_graph.rs:38-46
pub fn progress_ratio(&self) -> f64 {
let processed = self.files_processed.load(Ordering::Relaxed) as f64;
let total = self.total_files.load(Ordering::Relaxed) as f64;
if total > 0.0 {
processed / total
} else {
0.0
}
}
}
This enables progress callbacks during analysis:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:110-121
parallel_graph.stats().increment_files();
if let Some(ref callback) = self.config.progress_callback {
let processed = parallel_graph
.stats()
.files_processed
.load(std::sync::atomic::Ordering::Relaxed);
let total = parallel_graph
.stats()
.total_files
.load(std::sync::atomic::Ordering::Relaxed);
callback(processed, total);
}
}
Log Output Format
After analysis completes, debtmap reports final statistics:
#![allow(unused)]
fn main() {
// From src/builders/parallel_call_graph.rs:85-93
log::info!(
"Parallel call graph complete: {} nodes, {} edges, {} files processed",
stats.total_nodes.load(std::sync::atomic::Ordering::Relaxed),
stats.total_edges.load(std::sync::atomic::Ordering::Relaxed),
stats
.files_processed
.load(std::sync::atomic::Ordering::Relaxed),
);
}
Example output:
INFO - Processing 1247 Rust files in parallel
INFO - Progress: 100/1247 files processed
INFO - Progress: 500/1247 files processed
INFO - Progress: 1000/1247 files processed
INFO - Parallel call graph complete: 8942 nodes, 23451 edges, 1247 files processed
Cross-File Call Resolution
Debtmap uses a two-phase parallel resolution approach for resolving cross-file function calls, achieving 10-15% faster call graph construction on multi-core systems.
Two-Phase Architecture
Phase 1: Parallel Resolution (Read-Only)
The first phase processes unresolved calls concurrently using Rayon’s parallel iterators:
#![allow(unused)]
fn main() {
// From src/priority/call_graph/cross_file.rs
let resolutions: Vec<(FunctionCall, FunctionId)> = calls_to_resolve
.par_iter() // Parallel iteration
.filter_map(|call| {
// Pure function - safe for parallel execution
Self::resolve_call_with_advanced_matching(
&all_functions,
&call.callee.name,
&call.caller.file,
).map(|resolved_callee| {
(call.clone(), resolved_callee)
})
})
.collect();
}
Key benefits:
- Pure functional resolution: No side effects, safe for concurrent execution
- Immutable data: All inputs are read-only during the parallel phase
- Independent operations: Each call resolution is independent of others
- Parallel efficiency: Utilizes all available CPU cores
Phase 2: Sequential Updates (Mutation)
The second phase applies all resolutions to the graph sequentially:
#![allow(unused)]
fn main() {
// Apply resolutions to graph in sequence
for (original_call, resolved_callee) in resolutions {
self.apply_call_resolution(&original_call, &resolved_callee);
}
}
Key benefits:
- Batch updates: All resolutions processed together
- Data consistency: Sequential updates maintain index synchronization
- Deterministic: Same results regardless of parallel execution order
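The resolve-then-apply pattern can be sketched with scoped threads from the standard library. Everything in this example is a simplification chosen for illustration: `resolve_two_phase` is a hypothetical function, functions are identified by a name-to-id map, and exact name lookup stands in for the advanced matching that Debtmap performs.

```rust
use std::collections::HashMap;
use std::thread;

/// Phase 1 resolves callee names against a read-only index in parallel.
/// Phase 2 applies the collected resolutions to the graph sequentially.
pub fn resolve_two_phase(
    unresolved: &[(String, String)],        // (caller, callee name)
    known_functions: &HashMap<String, u32>, // name -> function id
    workers: usize,
) -> HashMap<String, Vec<u32>> {
    let workers = workers.max(1);
    let chunk_len = ((unresolved.len() + workers - 1) / workers).max(1);

    // Phase 1: read-only lookups, one thread per chunk.
    let resolutions: Vec<(String, u32)> = thread::scope(|scope| {
        let handles: Vec<_> = unresolved
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .filter_map(|(caller, callee)| {
                            known_functions.get(callee).map(|id| (caller.clone(), *id))
                        })
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps the result order deterministic.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("resolver thread panicked"))
            .collect()
    });

    // Phase 2: single-threaded mutation, so no synchronization is needed.
    let mut graph: HashMap<String, Vec<u32>> = HashMap::new();
    for (caller, callee_id) in resolutions {
        graph.entry(caller).or_default().push(callee_id);
    }
    graph
}
```

Because the mutation happens in one place and in a fixed order, the resulting graph does not depend on how the parallel phase was scheduled.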
Performance Impact
The two-phase approach provides a modest but measurable speedup on multi-core systems:
| CPU Cores | Speedup | Example Time (1500 calls) |
|---|---|---|
| 1 | 0% | 100ms (baseline) |
| 2 | ~8% | 92ms |
| 4 | ~12% | 88ms |
| 8 | ~15% | 85ms |
Performance characteristics:
- Best case: 10-15% reduction in call graph construction time
- Scaling: Diminishing returns beyond 8 cores due to batching overhead
- Memory overhead: <10MB for resolutions vector, even for large projects
Thread Safety
The parallel resolution phase is thread-safe without locks because:
- Pure resolution logic: resolve_call_with_advanced_matching() is a static method with no side effects
- Immutable inputs: All function data is read-only during parallel phase
- Independent resolutions: No dependencies between different call resolutions
- Safe collection: Rayon handles thread synchronization for result collection
The sequential update phase requires no synchronization since it runs single-threaded.
Memory Efficiency
Resolutions vector overhead:
- Per-resolution size: ~200 bytes (FunctionCall + FunctionId)
- For 1000 resolutions: ~200KB
- For 2000 resolutions: ~400KB
- Maximum overhead: <10MB even for very large projects
Total memory footprint:
Total Memory = Base Graph + Resolutions Vector
≈ 5-10MB + 0.2-0.4MB
≈ 5-10MB (negligible overhead)
Integration with Call Graph Construction
The two-phase resolution integrates seamlessly into the existing call graph construction pipeline:
File Parsing (Parallel)
↓
Function Extraction (Parallel)
↓
Build Initial Call Graph
↓
[NEW] Parallel Cross-File Resolution
├─ Phase 1: Parallel resolution → collect resolutions
└─ Phase 2: Sequential updates → apply to graph
↓
Call Graph Complete
Configuration
Cross-file resolution respects the --jobs flag for thread pool sizing:
# Use all cores for maximum speedup
debtmap analyze --jobs 0
# Limit to 4 threads
debtmap analyze --jobs 4
# Disable parallelism (debugging)
debtmap analyze --no-parallel
The --no-parallel flag disables parallel call graph construction entirely, including cross-file resolution parallelization.
Debugging
To verify parallel resolution is working:
# Enable verbose logging
debtmap analyze -vv
# Look for messages like:
# "Resolving 1523 cross-file calls in parallel"
# "Parallel resolution complete: 1423 resolved in 87ms"
To compare parallel vs sequential performance:
# Parallel (default)
time debtmap analyze .
# Sequential (for comparison)
time debtmap analyze . --no-parallel
Expected difference: 10-15% faster with parallel resolution on 4-8 core systems.
Concurrent Merging
The merge_concurrent() method combines call graphs from different analysis phases using parallel iteration.
Implementation
// From src/priority/parallel_call_graph.rs:120-138
pub fn merge_concurrent(&self, other: CallGraph) {
    // Parallelize node merging
    let nodes_vec: Vec<_> = other.get_all_functions().collect();
    nodes_vec.par_iter().for_each(|func_id| {
        if let Some((is_entry, is_test, complexity, lines)) = other.get_function_info(func_id) {
            self.add_function((*func_id).clone(), is_entry, is_test, complexity, lines);
        }
    });
    // Parallelize edge merging
    let calls_vec: Vec<_> = other.get_all_calls();
    calls_vec.par_iter().for_each(|call| {
        self.add_call(
            call.caller.clone(),
            call.callee.clone(),
            call.call_type.clone(),
        );
    });
}
How it works:
- Extract all nodes and edges from the source CallGraph
- Use par_iter() to merge nodes in parallel
- Use par_iter() to merge edges in parallel
- DashMap/DashSet automatically handle concurrent insertions
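To make the last point concrete, here is a rough sketch of why concurrent insertions into a sharded set are safe. It imitates the general idea behind DashSet using only the standard library. ShardedSet and this free-standing merge_concurrent function are hypothetical; they are not DashMap's implementation or Debtmap's method.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;
use std::thread;

/// A tiny sharded set: each shard has its own lock, so inserts that hash
/// to different shards do not block each other.
struct ShardedSet {
    shards: Vec<Mutex<HashSet<String>>>,
}

impl ShardedSet {
    fn new(shard_count: usize) -> Self {
        let shards = (0..shard_count.max(1))
            .map(|_| Mutex::new(HashSet::new()))
            .collect();
        Self { shards }
    }

    /// Returns true if the value was not present before.
    fn insert(&self, value: String) -> bool {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let index = (hasher.finish() as usize) % self.shards.len();
        self.shards[index].lock().unwrap().insert(value)
    }

    fn len(&self) -> usize {
        self.shards.iter().map(|s| s.lock().unwrap().len()).sum()
    }
}

/// Merge `incoming` into `target` from several threads at once.
fn merge_concurrent(target: &ShardedSet, incoming: &[String], workers: usize) {
    let workers = workers.max(1);
    let chunk_size = ((incoming.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        for chunk in incoming.chunks(chunk_size) {
            scope.spawn(move || {
                for item in chunk {
                    // Duplicate inserts are harmless: the set keeps one copy.
                    target.insert(item.clone());
                }
            });
        }
    });
}
```

Since set insertion is idempotent, the merged contents do not depend on which thread inserts a given node or edge first.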
Converting Between Representations
Debtmap uses two call graph representations:
- ParallelCallGraph: Concurrent data structures (DashMap/DashSet) for parallel construction
- CallGraph: Sequential data structures (HashMap/HashSet) for analysis algorithms
Conversion happens at phase boundaries:
// From src/priority/parallel_call_graph.rs:141-162
pub fn to_call_graph(&self) -> CallGraph {
    let mut call_graph = CallGraph::new();
    // Add all nodes
    for entry in self.nodes.iter() {
        let node = entry.value();
        call_graph.add_function(
            node.id.clone(),
            node.is_entry_point,
            node.is_test,
            node.complexity,
            node.lines,
        );
    }
    // Add all edges
    for call in self.edges.iter() {
        call_graph.add_call(call.clone());
    }
    call_graph
}
Why two representations?
- ParallelCallGraph: Optimized for concurrent writes during construction
- CallGraph: Optimized for graph algorithms (PageRank, connectivity, transitive reduction)
- Conversion overhead is negligible compared to analysis time
Coverage Index Optimization
Debtmap uses an optimized nested HashMap structure for coverage data lookups, providing significant performance improvements for coverage-enabled analysis.
Nested HashMap Architecture
The CoverageIndex structure uses a two-level nested HashMap instead of a flat structure:
use std::collections::{BTreeMap, HashMap};
use std::path::PathBuf;

// Optimized structure (nested)
pub struct CoverageIndex {
    /// Outer map: file path → inner map of functions
    by_file: HashMap<PathBuf, HashMap<String, FunctionCoverage>>,
    /// Line-based index for range queries
    by_line: HashMap<PathBuf, BTreeMap<usize, FunctionCoverage>>,
    /// Pre-computed file paths for efficient iteration
    file_paths: Vec<PathBuf>,
}
// OLD structure (flat) - no longer used:
// HashMap<(PathBuf, String), FunctionCoverage>
Performance Characteristics
The nested structure provides dramatic performance improvements:
Lookup Complexity:
- Exact match: O(1) file hash + O(1) function hash
- Path strategies: O(files) instead of O(functions)
- Line-based: O(log functions_in_file) binary search
Real-World Performance:
- Exact match lookups: ~100 nanoseconds
- Path matching fallback: ~10 microseconds (375 file checks vs 1,500 function checks)
- Overall speedup: 50-100x faster coverage lookups
Why This Matters
When analyzing a typical Rust project with coverage enabled:
- Function count: ~1,500 functions (after demangling)
- File count: ~375 files
- Lookups per analysis: ~19,600
- Average functions per file: ~4
OLD flat structure (O(n) scans):
- 19,600 lookups × 4,500 comparisons = 88 million operations
- Estimated time: ~1 minute
NEW nested structure (O(1) lookups):
- 19,600 lookups × 1-3 operations = ~60,000 operations
- Estimated time: ~3 seconds
Speedup: ~20x faster just from index structure optimization
Combined with Function Demangling
This optimization works synergistically with LLVM coverage function name demangling (Spec 134):
Original (no demangling, flat structure):
- 18,631 mangled functions
- O(n) linear scans
- Total time: 10+ minutes
After demangling (Spec 134):
- 1,500 demangled functions
- O(n) linear scans (still)
- Total time: ~1 minute
After nested structure (Spec 135):
- 1,500 demangled functions
- O(1) hash lookups
- Total time: ~3 seconds
Combined speedup: ~200x or more (10+ minutes → ~3 seconds)
Implementation Details
Exact Match Lookup (O(1)):
pub fn get_function_coverage(&self, file: &Path, function_name: &str) -> Option<f64> {
    // Two O(1) hash lookups
    if let Some(file_functions) = self.by_file.get(file) {
        if let Some(coverage) = file_functions.get(function_name) {
            return Some(coverage.coverage_percentage / 100.0);
        }
    }
    // Fallback to path strategies (rare)
    self.find_by_path_strategies(file, function_name)
}
Path Strategy Fallback (O(files)):
fn find_by_path_strategies(&self, query_path: &Path, function_name: &str) -> Option<f64> {
    // Iterate over FILES not FUNCTIONS (375 vs 1,500 = 4x fewer checks)
    for file_path in &self.file_paths {
        if query_path.ends_with(file_path) {
            // O(1) lookup once we find the right file
            if let Some(file_functions) = self.by_file.get(file_path) {
                if let Some(coverage) = file_functions.get(function_name) {
                    return Some(coverage.coverage_percentage / 100.0);
                }
            }
        }
    }
    None
}
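The line-based lookup (the O(log functions_in_file) case listed earlier) is not shown above. A minimal sketch of how a BTreeMap keyed by start line could serve such a query follows. The LineIndex type, the start_line and end_line fields, and the coverage_at_line method are assumptions for illustration, not Debtmap's actual definitions.

```rust
use std::collections::{BTreeMap, HashMap};
use std::path::{Path, PathBuf};

/// Hypothetical, trimmed-down coverage record.
#[derive(Debug, Clone)]
struct FunctionCoverage {
    start_line: usize,
    end_line: usize, // assumed field: last line of the function
    coverage_percentage: f64,
}

/// Per-file map from a function's start line to its coverage record.
struct LineIndex {
    by_line: HashMap<PathBuf, BTreeMap<usize, FunctionCoverage>>,
}

impl LineIndex {
    fn new(functions: Vec<(PathBuf, FunctionCoverage)>) -> Self {
        let mut by_line: HashMap<PathBuf, BTreeMap<usize, FunctionCoverage>> = HashMap::new();
        for (file, func) in functions {
            by_line.entry(file).or_default().insert(func.start_line, func);
        }
        Self { by_line }
    }

    /// Find the function whose line range contains `line`, if any.
    fn coverage_at_line(&self, file: &Path, line: usize) -> Option<f64> {
        let functions = self.by_line.get(file)?;
        // Last function starting at or before `line`: an O(log n) tree search.
        let (_, func) = functions.range(..=line).next_back()?;
        (line <= func.end_line).then(|| func.coverage_percentage / 100.0)
    }
}
```

The hash lookup selects the file, and the ordered map then narrows the search to a single candidate function without scanning the others.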
Memory Overhead
The nested structure has minimal memory overhead:
Flat structure:
- 1,500 entries × ~200 bytes = 300KB
Nested structure:
- Outer HashMap: 375 entries × ~50 bytes = 18.75KB
- Inner HashMaps: 375 × ~4 functions × ~200 bytes = 300KB
- File paths vector: 375 × ~100 bytes = 37.5KB
- Total: ~356KB
Memory increase: ~56KB (18%) - negligible cost for 50-100x speedup
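The arithmetic above can be reproduced with a few lines of Rust. The per-entry byte sizes are the rough estimates quoted in this section, not measured values, and the type and function names are made up for this sketch.

```rust
/// Rough per-entry sizes taken from the estimates in this section.
const FUNCTION_ENTRY_BYTES: usize = 200;
const OUTER_ENTRY_BYTES: usize = 50;
const FILE_PATH_BYTES: usize = 100;

/// Hypothetical description of a project's coverage data.
struct IndexShape {
    files: usize,
    functions_per_file: usize,
}

/// Flat map: one entry per function.
fn flat_bytes(shape: &IndexShape) -> usize {
    shape.files * shape.functions_per_file * FUNCTION_ENTRY_BYTES
}

/// Nested map: outer entries + inner function entries + file path vector.
fn nested_bytes(shape: &IndexShape) -> usize {
    shape.files * OUTER_ENTRY_BYTES
        + shape.files * shape.functions_per_file * FUNCTION_ENTRY_BYTES
        + shape.files * FILE_PATH_BYTES
}

/// Relative increase of the nested layout over the flat one, in percent.
fn overhead_percent(shape: &IndexShape) -> f64 {
    let flat = flat_bytes(shape) as f64;
    (nested_bytes(shape) as f64 - flat) / flat * 100.0
}
```

For 375 files with about 4 functions each, this gives 300,000 bytes flat and 356,250 bytes nested, an increase of just under 19%, matching the figures above.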
Benchmarking Coverage Performance
Debtmap includes benchmarks to validate coverage index performance:
# Run coverage performance benchmarks
cargo bench --bench coverage_performance
# Compare old flat structure vs new nested structure
# Expected results:
# old_flat_structure: 450ms
# new_nested_structure: 8ms
# Speedup: ~56x
The flat_vs_nested_comparison benchmark simulates the old O(n) scan behavior and compares it with the new nested structure, demonstrating the 50-100x improvement.
Impact on Analysis Time
Coverage lookups are now negligible overhead:
Without coverage optimization:
- Analysis overhead from coverage: ~1 minute
- Percentage of total time: 60-80%
With coverage optimization:
- Analysis overhead from coverage: ~3 seconds
- Percentage of total time: 5-10%
This makes coverage-enabled analysis practical for CI/CD pipelines and real-time feedback during development.
Performance Tuning
Optimal Thread Count
General rule: Use physical core count, not logical cores.
# Check physical core count (Linux; multiply by the socket count on multi-socket machines)
lscpu | grep "Core(s) per socket"
# macOS
sysctl hw.physicalcpu
Recommended settings:
| System | Cores | Recommended --jobs |
|---|---|---|
| Laptop | 4 | Default or 4 |
| Desktop | 8 | Default |
| Workstation | 16+ | Default |
| CI/CD | Varies | 2-4 (shared resources) |
Memory Considerations
Each thread requires memory for:
- AST parsing (~1-5 MB per file)
- Analysis state (~500 KB per file)
- Temporary buffers
Memory usage estimate:
Total Memory ≈ (Thread Count) × (Average File Size) × 2-3
Example (50 files, average 10 KB each, 8 threads):
Memory ≈ 8 × 10 KB × 3 = 240 KB (negligible)
For very large files (>1 MB), consider reducing thread count.
Memory vs Speed Tradeoffs
Parallel processing uses more memory:
| Configuration | Memory Overhead | Speed Benefit |
|---|---|---|
| --no-parallel | Baseline | Baseline |
| --jobs 1 | +10% (data structures) | 1x |
| --jobs 4 | +30% (+ worker buffers) | 4-6x |
| --jobs 8 | +50% (+ worker buffers) | 6-10x |
| --jobs 16 | +80% (+ worker buffers) | 10-15x |
Memory overhead sources:
- DashMap internal sharding (~2x HashMap)
- Per-worker thread stacks and buffers
- Parallel iterator intermediates
I/O Bound vs CPU Bound
CPU-bound analysis (default):
- Complexity calculations
- Pattern detection
- Risk scoring
Parallel processing provides 4-8x speedup.
I/O-bound operations:
- Reading files from disk
- Loading coverage data
Limited speedup from parallelism (1.5-2x).
If analysis is I/O-bound:
- Use SSD storage
- Reduce thread count (less I/O contention)
- Use --max-files to limit scope
Scaling Strategies
Small Projects (<10k LOC)
# Default settings are fine
debtmap analyze .
Parallel overhead may exceed benefits. Consider --no-parallel if analysis is <1 second.
Medium Projects (10k-100k LOC)
# Use all cores
debtmap analyze .
Optimal parallel efficiency. Expect 4-8x speedup from parallelism.
Large Projects (>100k LOC)
# Use all cores
debtmap analyze . --jobs 0 # 0 = all cores
Maximize parallel processing for large codebases.
CI/CD Environments
# Limit threads to avoid resource contention
debtmap analyze . --jobs 2
CI environments often limit CPU cores per job.
Scaling Behavior
Debtmap’s parallel processing scales with CPU core count:
Strong Scaling (Fixed Problem Size):
| CPU Cores | Speedup | Efficiency |
|---|---|---|
| 1 | 1x | 100% |
| 2 | 1.8x | 90% |
| 4 | 3.4x | 85% |
| 8 | 6.2x | 78% |
| 16 | 10.5x | 66% |
| 32 | 16.8x | 53% |
Efficiency decreases at higher core counts due to:
- Synchronization overhead (atomic operations, DashMap locking)
- Memory bandwidth saturation
- Diminishing returns from Amdahl’s law (sequential portions)
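The efficiency column in the table is simply speedup divided by core count, and Amdahl's law bounds the achievable speedup when part of the work stays sequential. The sketch below shows both calculations. The function names are illustrative, and the parallel fraction used in the note afterwards is an arbitrary example, not a measured property of Debtmap.

```rust
/// Parallel efficiency: the fraction of ideal linear speedup achieved.
fn efficiency(speedup: f64, cores: u32) -> f64 {
    speedup / f64::from(cores)
}

/// Amdahl's law: upper bound on speedup when only `parallel_fraction`
/// of the work (a value between 0.0 and 1.0) can run in parallel.
fn amdahl_speedup(parallel_fraction: f64, cores: u32) -> f64 {
    1.0 / ((1.0 - parallel_fraction) + parallel_fraction / f64::from(cores))
}
```

For example, efficiency(3.4, 4) is 0.85, matching the 4-core row. As a purely illustrative case, a workload that is 95% parallel would be capped at roughly 5.9x on 8 cores by amdahl_speedup(0.95, 8), which shows how even a small sequential portion limits scaling.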
Weak Scaling (Problem Size Grows with Cores):
Debtmap maintains high efficiency when problem size scales with core count, making it ideal for analyzing larger codebases on more powerful machines.
Tuning Guidelines
Development Workstations:
# Use all cores for maximum speed
debtmap analyze --jobs 0
CI/CD Environments:
# Limit threads to avoid resource contention
debtmap analyze --jobs 2
# Or disable parallelism on very constrained runners
debtmap analyze --no-parallel
Containers:
# Auto-detection respects cgroup limits
debtmap analyze --jobs 0
# Or explicitly match container CPU allocation
debtmap analyze --jobs 4
Benchmarking:
# Use fixed thread count for reproducible results
debtmap analyze --jobs 8
Profiling and Debugging
Measure Analysis Time
time debtmap analyze .
Disable Parallelism for Debugging
debtmap analyze . --no-parallel -vv
Single-threaded mode with verbose output for debugging.
Profile Thread Usage
Use system tools to monitor thread usage:
# Linux
htop
# macOS
# Activity Monitor (View > CPU Usage > Show Threads)
Look for:
- All cores at ~100% utilization (optimal)
- Some cores idle (I/O bound or insufficient work)
- Excessive context switching (too many threads)
Finding Optimal Settings
Time several configurations on your own codebase and keep the fastest:
# Benchmark different configurations
time debtmap analyze --jobs 0 # Auto
time debtmap analyze --jobs 4 # 4 threads
time debtmap analyze --jobs 8 # 8 threads
time debtmap analyze --no-parallel # Sequential
Monitor memory usage during analysis:
# Monitor peak memory usage
/usr/bin/time -v debtmap analyze --jobs 8
Best Practices
- Use default settings - Debtmap auto-detects optimal thread count
- Limit threads in CI - Use --jobs 2 or --jobs 4 in shared environments
- Profile before tuning - Measure actual performance impact
- Consider I/O - If using slow storage, reduce thread count
Troubleshooting
Analysis is Slow Despite Parallelism
Possible causes:
- I/O bottleneck (slow disk)
- Memory pressure (swapping)
- Thread contention
Solutions:
- Use faster storage (SSD)
- Reduce thread count to avoid memory pressure
- Limit analysis scope with --max-files
Slow Analysis Performance
If analysis is slower than expected:
- Check thread count:
# Ensure you're using all cores
debtmap analyze --jobs 0 -vv | grep "threads"
- Check I/O bottleneck:
# Use iotop or similar to check disk saturation
# SSD storage significantly improves performance
- Check memory pressure:
# Monitor memory usage during analysis
top -p $(pgrep debtmap)
- Try different thread counts:
# Sometimes fewer threads = less contention
debtmap analyze --jobs 4
High CPU Usage But No Progress
Possible cause: Analyzing very complex files (large ASTs)
Solution:
# Reduce thread count to avoid memory thrashing
debtmap analyze . --jobs 2
High Memory Usage
If debtmap uses too much memory:
- Reduce parallelism:
debtmap analyze --jobs 2
- Disable parallel call graph:
debtmap analyze --no-parallel
- Analyze subdirectories separately:
# Process codebase in chunks
debtmap analyze src/module1
debtmap analyze src/module2
Inconsistent Results Between Runs
Possible cause: Non-deterministic parallel aggregation (rare)
Solution:
# Use single-threaded mode
debtmap analyze . --no-parallel
If results differ, report as a bug.
Debugging Concurrency Issues
If you suspect a concurrency bug:
- Run sequentially to isolate:
debtmap analyze --no-parallel
- Use deterministic mode:
# Single-threaded = deterministic order
debtmap analyze --jobs 1
- Enable verbose logging:
debtmap analyze -vvv --no-parallel > debug.log 2>&1
- Report the issue: If behavior differs between --no-parallel and parallel mode, please report it with:
  - Command used
  - Platform (OS, CPU core count)
  - Debtmap version
  - Minimal reproduction case
Thread Contention Warning
If you see warnings about thread contention:
WARN - High contention detected on parallel call graph
This indicates too many threads competing for locks. Try:
# Reduce thread count
debtmap analyze --jobs 4
See Also
- CLI Reference - Performance & Caching - Complete flag documentation
- Configuration - Project-specific settings
- Troubleshooting - General troubleshooting guide
- Troubleshooting - Slow Analysis - Performance debugging guide
- Troubleshooting - High Memory Usage - Memory optimization tips
- FAQ - Reducing Parallelism - Common questions about parallel processing
- Architecture - High-level system design
Summary
Debtmap’s parallel processing architecture provides:
- 10-100x speedup over sequential analysis using Rayon parallel iterators
- Low-contention concurrency with DashMap (sharded locking rather than a single global lock)
- Flexible configuration via --jobs and --no-parallel flags
- Automatic thread pool tuning that respects system resources
- Production-grade reliability with atomic progress tracking and concurrent merging
The three-phase parallel pipeline (parse → extract → analyze) maximizes parallelism while maintaining correctness through carefully designed concurrent data structures.
Prodigy Integration
Debtmap integrates with Prodigy to provide fully automated technical debt reduction through AI-driven workflows. This chapter explains how to set up and use Prodigy workflows to automatically refactor code, add tests, and improve codebase quality.
Prerequisites Checklist
Before using Prodigy with Debtmap, ensure you have:
- Rust 1.70 or later installed
- Debtmap installed (cargo install debtmap)
- Prodigy installed (cargo install --git https://github.com/iepathos/prodigy prodigy)
- Anthropic API key for Claude access
- Git (for worktree management)
- Optional: just (a command runner like make), or use direct cargo commands as alternatives
What is Prodigy?
Note: Prodigy is a separate open-source tool (https://github.com/iepathos/prodigy). You need to install both Debtmap and Prodigy to use this integration.
Prodigy is an AI-powered workflow automation system that uses Claude to execute complex multi-step tasks. When integrated with Debtmap, it can:
- Automatically refactor high-complexity functions identified by Debtmap
- Add unit tests for untested code
- Fix code duplication by extracting shared logic
- Improve code organization by addressing architectural issues
- Validate improvements with automated testing
All changes are made in isolated git worktrees, validated with tests and linting, and only committed if all checks pass.
Benefits
Automated Debt Reduction
Instead of manually addressing each technical debt item, Prodigy can:
- Analyze Debtmap’s output
- Select high-priority items
- Generate refactoring plans
- Execute refactorings automatically
- Validate with tests
- Commit clean changes
Iterative Improvement
Prodigy supports iterative workflows:
- Run analysis → fix top items → re-analyze → fix more
- Configurable iteration count (default: 5 iterations)
- Each iteration focuses on highest-priority remaining items
Safe Experimentation
All changes happen in isolated git worktrees:
- Original branch remains untouched
- Failed attempts don’t affect main codebase
- Easy to review before merging
- Automatic cleanup after workflow
Prerequisites
Install Prodigy
# Install Prodigy from GitHub repository
cargo install --git https://github.com/iepathos/prodigy prodigy
# Verify installation
prodigy --version
Note: Currently, Prodigy must be installed from GitHub. Check the Prodigy repository for the latest installation instructions.
Requirements:
- Rust 1.70 or later
- Git (for worktree management)
- Anthropic API key for Claude access
Configure Claude API
# Set Claude API key
export ANTHROPIC_API_KEY="your-api-key-here"
# Or in ~/.prodigy/config.toml:
[api]
anthropic_key = "your-api-key-here"
Ensure Debtmap is Installed
# Install Debtmap
cargo install debtmap
# Verify installation
debtmap --version
Quick Start
1. Initialize Workflow
Create a workflow file workflows/debtmap.yml:
# Sequential workflow. Fix top technical debt item
# Phase 1: Generate coverage data
- shell: "just coverage-lcov" # or: cargo tarpaulin --out lcov --output-dir target/coverage
# Phase 2: Analyze tech debt and capture baseline
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Phase 3: Create implementation plan (PLANNING PHASE)
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
  capture_output: true
  validate:
    commands:
      - claude: "/prodigy-validate-debtmap-plan --before .prodigy/debtmap-before.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/plan-validation.json"
    result_file: ".prodigy/plan-validation.json"
    threshold: 75
    on_incomplete:
      commands:
        - claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
      max_attempts: 3
      fail_workflow: false
# Phase 4: Execute the plan (IMPLEMENTATION PHASE)
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
  validate:
    commands:
      - shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
      - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
      - shell: "debtmap validate-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
    result_file: ".prodigy/debtmap-validation.json"
    threshold: 75
    on_incomplete:
      commands:
        - claude: "/prodigy-complete-debtmap-fix --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
          commit_required: true
        - shell: "just coverage-lcov"
        - shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
        - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
      max_attempts: 5
      fail_workflow: true
# Phase 5: Run tests with automatic fixing
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
    fail_workflow: true
# Phase 6: Run linting and formatting
- shell: "just fmt-check && just lint"
  on_failure:
    claude: "/prodigy-lint ${shell.output}"
    max_attempts: 5
    fail_workflow: true
Note about just: This example uses just (a command runner like make). If you don't have a justfile, replace just coverage-lcov with cargo tarpaulin --out lcov --output-dir target/coverage, just test with cargo test, and just fmt-check && just lint with cargo fmt --check && cargo clippy -- -D warnings.
2. Run Workflow
# Run with auto-confirm, 5 iterations
prodigy run workflows/debtmap.yml -yn 5
# Run with custom iteration count
prodigy run workflows/debtmap.yml -yn 10
# Run single iteration for testing
prodigy run workflows/debtmap.yml -yn 1
Command Flags:
- -y (--yes) - Auto-confirm workflow steps (skip prompts)
- -n 5 (--max-iterations 5) - Run workflow for up to 5 iterations
Note: Worktrees are managed separately via the prodigy worktree command. In MapReduce mode, Prodigy automatically creates isolated worktrees for each parallel agent.
3. Review Results
Prodigy creates a detailed report:
📊 WORKFLOW SUMMARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Iterations: 5
Items Fixed: 12
Tests Added: 8
Complexity Reduced: 145 → 78 (-46%)
Coverage Improved: 45% → 72% (+27%)
✅ All validations passed
Useful Prodigy Commands
Beyond prodigy run, several commands help manage workflows and sessions:
Resume Interrupted Workflows
# Resume an interrupted sequential workflow
prodigy resume <SESSION_ID>
# Resume an interrupted MapReduce job
prodigy resume-job <JOB_ID>
# List all sessions to find the SESSION_ID
prodigy sessions
When to use: If a workflow is interrupted (Ctrl-C, system crash, network issues), you can resume from the last checkpoint rather than starting over.
View Checkpoints
# List all available checkpoints
prodigy checkpoints
# List checkpoints for specific session
prodigy checkpoints --session <SESSION_ID>
When to use: To see available restore points for interrupted workflows.
Manage Worktrees
# List all Prodigy worktrees
prodigy worktree list
# Clean up old worktrees
prodigy worktree clean
# Remove specific worktree
prodigy worktree remove <SESSION_ID>
When to use: MapReduce workflows create many worktrees. Clean them up periodically to save disk space.
Monitor MapReduce Progress
# View progress of running MapReduce job
prodigy progress <JOB_ID>
# View events and logs from MapReduce job
prodigy events <JOB_ID>
# Filter events by type
prodigy events <JOB_ID> --type agent_started
prodigy events <JOB_ID> --type agent_completed
prodigy events <JOB_ID> --type agent_failed
When to use: Monitor long-running MapReduce jobs to see how many agents have completed, which are still running, and which have failed.
Manage Dead Letter Queue
# View failed MapReduce items in DLQ
prodigy dlq list <JOB_ID>
# Retry failed items from DLQ
prodigy dlq retry <JOB_ID> <ITEM_ID>
# Remove items from DLQ
prodigy dlq remove <JOB_ID> <ITEM_ID>
When to use: When some MapReduce agents fail, their items go to the Dead Letter Queue. You can retry them individually or investigate why they failed.
Session Management
# List all workflow sessions
prodigy sessions
# Clean up old sessions
prodigy clean
When to use: View history of workflow runs and clean up old data.
Workflow Configuration
Prodigy workflows are defined as YAML lists of steps. Each step can be either a shell command or a claude slash command.
Workflow Step Types
Shell Commands
Execute shell commands directly:
# Simple shell command
- shell: "cargo test"
# With timeout (in seconds)
- shell: "just coverage-lcov"
  timeout: 900 # 15 minutes
# With error handling
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
    fail_workflow: true
Shell Command Fields:
- shell: Command to execute (string)
- timeout: Maximum execution time in seconds (optional)
- on_failure: Error handler configuration (optional)
  - claude: Slash command to run on failure
  - max_attempts: Maximum retry attempts
  - fail_workflow: If true, fail entire workflow after max attempts
Claude Commands
Execute Claude Code slash commands:
# Simple Claude command
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
# With output capture (makes command output available in ${shell.output})
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
  capture_output: true
# With commit requirement (workflow fails if no git commit made)
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
# With timeout and validation
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
  timeout: 1800 # 30 minutes
  validate:
    commands:
      - shell: "cargo test"
    result_file: ".prodigy/validation.json"
    threshold: 75
Claude Command Fields:
- claude: Slash command to execute (string)
- capture_output: If true, command output is available in the ${shell.output} variable (optional)
- commit_required: If true, workflow fails if command doesn't create a git commit (optional)
- timeout: Maximum execution time in seconds (optional)
- validate: Validation configuration (optional, see Step-Level Validation below)
Step-Level Validation
Steps can include validation that must pass:
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
  validate:
    commands:
      - shell: "cargo test"
      - shell: "cargo clippy -- -D warnings"
    result_file: ".prodigy/validation.json"
    threshold: 75
    on_incomplete:
      commands:
        - claude: "/prodigy-complete-debtmap-fix --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
          commit_required: true
      max_attempts: 5
      fail_workflow: true
Validation Options:
- commands: List of commands to run for validation
- result_file: JSON file containing validation results
- threshold: Minimum score (0-100) required to pass
- on_incomplete: Actions to take if validation score < threshold
  - max_attempts: Maximum retry attempts
  - fail_workflow: Whether to fail entire workflow if validation never passes
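One plausible reading of how threshold, max_attempts, and fail_workflow interact is sketched below. This is an interpretation of the options described above, not Prodigy's implementation: the Outcome enum, the run_validation function, and the assumption that the initial validation does not count toward max_attempts are all hypothetical.

```rust
/// Hypothetical result of a validated step.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Score reached the threshold after this many on_incomplete attempts.
    Passed { attempts: u32 },
    /// Attempts exhausted, fail_workflow is false: move on to the next step.
    ContinuedIncomplete,
    /// Attempts exhausted, fail_workflow is true: stop the workflow.
    WorkflowFailed,
}

/// `score_after_attempt(n)` stands in for running the validation commands
/// and reading the score from result_file after n on_incomplete attempts.
fn run_validation(
    threshold: u8,
    max_attempts: u32,
    fail_workflow: bool,
    mut score_after_attempt: impl FnMut(u32) -> u8,
) -> Outcome {
    // Attempt 0 is the initial validation; 1..=max_attempts follow a retry.
    for attempt in 0..=max_attempts {
        if score_after_attempt(attempt) >= threshold {
            return Outcome::Passed { attempts: attempt };
        }
    }
    if fail_workflow {
        Outcome::WorkflowFailed
    } else {
        Outcome::ContinuedIncomplete
    }
}
```

Under this reading, a step with threshold 75 whose score moves from 60 to 70 to 80 passes after two retries, while a step that never reaches 75 either fails the workflow or continues, depending on fail_workflow.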
Error Handling
Use on_failure to handle command failures:
- shell: "just fmt-check && just lint"
  on_failure:
    claude: "/prodigy-lint ${shell.output}"
    max_attempts: 5
    fail_workflow: true
Error Handling Options:
- claude: Slash command to fix the failure
- max_attempts: Maximum fix attempts
- fail_workflow: If true, workflow fails after max_attempts; if false, continues to next step
Coverage Integration
Generate and use coverage data in workflows. See Coverage Integration for details on generating LCOV files and understanding coverage metrics.
# Generate coverage
- shell: "just coverage-lcov"
# Use coverage in analysis
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
Claude Slash Commands
Important: The slash commands documented below are custom commands provided in Debtmap's .claude/commands/ directory. They are included in the Debtmap repository as working examples. You can use them as-is or create your own based on these patterns.
Note on workflow styles: The sequential workflow (workflows/debtmap.yml) uses shell: commands directly, while the MapReduce workflow (workflows/debtmap-reduce.yml) uses claude: wrapper commands for some operations like validate-improvement. Both approaches are valid - use whichever fits your workflow style.
Prodigy workflows use Claude Code slash commands to perform analysis, planning, and implementation. The key commands used in the debtmap workflow are:
Planning Commands
/prodigy-debtmap-plan
Creates an implementation plan for the top priority debt item.
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
  capture_output: true
Parameters:
- --before: Path to debtmap analysis JSON file
- --output: Path to write implementation plan
/prodigy-validate-debtmap-plan
Validates that the implementation plan is complete and addresses the debt item.
- claude: "/prodigy-validate-debtmap-plan --before .prodigy/debtmap-before.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/plan-validation.json"
Parameters:
- --before: Original debtmap analysis
- --plan: Implementation plan to validate
- --output: Validation results JSON (with score 0-100)
/prodigy-revise-debtmap-plan
Revises an incomplete plan based on validation gaps.
- claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
Parameters:
- --gaps: List of missing items from validation
- --plan: Plan file to update
Implementation Commands
/prodigy-debtmap-implement
Executes the implementation plan.
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
Parameters:
- --plan: Path to implementation plan
/prodigy-validate-debtmap-improvement
Validates that the implementation successfully addressed the debt item.
Note: This slash command now wraps the debtmap validate-improvement subcommand. You can use either approach:
- Claude slash command: /prodigy-validate-debtmap-improvement (recommended for Prodigy workflows)
- Direct subcommand: debtmap validate-improvement (recommended for shell scripts)
# Using Claude slash command (recommended for workflows)
- claude: "/prodigy-validate-debtmap-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
# Or using shell command directly
- shell: "debtmap validate-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
Parameters:
- --comparison: Debtmap comparison results (before vs after)
- --output: Validation results JSON (with score 0-100)
- --previous-validation: (Optional) Previous validation result for trend tracking
- --threshold: (Optional) Improvement threshold percentage (default: 75.0)
/prodigy-complete-debtmap-fix
Completes a partial fix based on validation gaps.
- claude: "/prodigy-complete-debtmap-fix --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
Parameters:
- --gaps: Validation gaps to address
- --plan: Original implementation plan
Testing and Quality Commands
/prodigy-debug-test-failure
Automatically fixes failing tests.
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
Parameters:
- --output: Test failure output from shell command
/prodigy-lint
Fixes linting and formatting issues.
- shell: "just fmt-check && just lint"
  on_failure:
    claude: "/prodigy-lint ${shell.output}"
    max_attempts: 5
Parameters:
- Shell output with linting errors
Target Selection
Target selection happens through the debtmap analysis and slash commands, not through workflow configuration:
How Targets Are Selected
- Debtmap analyzes the codebase and scores all items by complexity, coverage, and risk
- Planning command (/prodigy-debtmap-plan) selects the highest priority item
- Implementation command (/prodigy-debtmap-implement) fixes that specific item
- Next iteration re-analyzes and selects the next highest priority item
Factors in Prioritization
- Complexity score: Functions with cyclomatic complexity > 10
- Coverage percentage: Lower coverage increases priority
- Risk score: Complexity × (100 - coverage%)
- Debt type: Complexity, TestGap, Duplication, GodObject, DeepNesting
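As a rough illustration of the risk formula above, the sketch below scores items with Complexity × (100 - coverage%) and picks the highest. The DebtItem struct and both functions are hypothetical, and treating "cyclomatic complexity > 10" as a hard filter is an assumption of this sketch. Debtmap's real scoring combines more factors than this.

```rust
/// Hypothetical, simplified debt item.
#[derive(Debug, Clone, PartialEq)]
struct DebtItem {
    name: String,
    complexity: u32,       // cyclomatic complexity
    coverage_percent: f64, // 0.0 to 100.0
}

/// Risk score as given in the list above: complexity × (100 - coverage%).
fn risk_score(item: &DebtItem) -> f64 {
    f64::from(item.complexity) * (100.0 - item.coverage_percent)
}

/// Pick the riskiest item among those with complexity above 10
/// (assumption: the threshold acts as a filter).
fn select_target(items: &[DebtItem]) -> Option<&DebtItem> {
    items
        .iter()
        .filter(|item| item.complexity > 10)
        .max_by(|a, b| risk_score(a).total_cmp(&risk_score(b)))
}
```

With this formula, a function of complexity 15 at 20% coverage scores 1,200, far above a function of complexity 12 at 90% coverage, which scores 120. This is the "high complexity + low coverage" ordering the workflow relies on.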
Customizing Target Selection
To focus on specific debt types or modules, modify the slash commands or create custom commands in .claude/commands/
MapReduce Workflows
Prodigy supports MapReduce workflows for processing multiple items in parallel. This is powerful for large-scale refactoring where you want to fix many debt items simultaneously.
When to Use MapReduce
- Processing multiple independent debt items simultaneously (e.g., refactor 10 high-complexity functions in parallel)
- Applying the same fix pattern across many files
- Large-scale codebase cleanup tasks
- Situations where sequential iteration would be too slow
MapReduce vs Sequential Workflows
Sequential Workflow (-n 5):
- Runs entire workflow N times in sequence
- Fixes one item per iteration
- Each iteration re-analyzes the codebase
- Total time: N × workflow_duration
MapReduce Workflow:
- Processes multiple items in parallel in a single run
- Setup phase runs once
- Map phase spawns N parallel agents (each in isolated worktree)
- Reduce phase aggregates results
- Total time: setup + max(map_agent_durations) + reduce
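The two total-time formulas above can be compared with a small calculation. The sketch below is illustrative only: the helper names and durations are invented, and it assumes every map agent runs concurrently (that is, the item count does not exceed max_parallel).

```rust
use std::time::Duration;

/// Sequential mode: N × workflow_duration.
fn sequential_total(iterations: u32, workflow: Duration) -> Duration {
    workflow * iterations
}

/// MapReduce mode: setup + max(map_agent_durations) + reduce.
/// Assumes all agents run at the same time.
fn mapreduce_total(setup: Duration, agents: &[Duration], reduce: Duration) -> Duration {
    let slowest = agents.iter().copied().max().unwrap_or(Duration::ZERO);
    setup + slowest + reduce
}

/// Convenience constructor for the example values.
fn minutes(m: u64) -> Duration {
    Duration::from_secs(m * 60)
}

fn main() {
    // Made-up per-agent durations for five debt items.
    let agents = [minutes(20), minutes(25), minutes(18), minutes(22), minutes(30)];
    println!("sequential: {:?}", sequential_total(5, minutes(35)));
    println!("mapreduce:  {:?}", mapreduce_total(minutes(15), &agents, minutes(5)));
}
```

When there are more items than max_parallel, agents run in waves and the map phase takes longer than the single slowest agent, so treat the MapReduce figure as a lower bound.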
Complete MapReduce Example
Create workflows/debtmap-reduce.yml:
name: debtmap-parallel-elimination
mode: mapreduce
# Setup phase: Analyze the codebase and generate debt items
setup:
  timeout: 900 # 15 minutes for coverage generation
  commands:
    # Generate coverage data with tarpaulin
    - shell: "just coverage-lcov"
    # Run debtmap with coverage data to establish baseline
    - shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Map phase: Process each debt item in parallel with planning and validation
map:
  # Input configuration - debtmap-before.json contains items array
  input: .prodigy/debtmap-before.json
  json_path: "$.items[*]"
  # Commands to execute for each debt item
  agent_template:
    # Phase 1: Create implementation plan
    - claude: "/prodigy-debtmap-plan --item '${item}' --output .prodigy/plan-${item_id}.md"
      capture_output: true
      validate:
        commands:
          - claude: "/prodigy-validate-debtmap-plan --item '${item}' --plan .prodigy/plan-${item_id}.md --output .prodigy/validation-${item_id}.json"
        result_file: ".prodigy/validation-${item_id}.json"
        threshold: 75
        on_incomplete:
          commands:
            - claude: "/prodigy-revise-debtmap-plan --gaps ${validation.gaps} --plan .prodigy/plan-${item_id}.md"
          max_attempts: 3
          fail_workflow: false
    # Phase 2: Execute the plan
    - claude: "/prodigy-debtmap-implement --plan .prodigy/plan-${item_id}.md"
      commit_required: true
      validate:
        commands:
          - shell: "just coverage-lcov"
          - shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after-${item_id}.json --format json"
          - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after-${item_id}.json --plan .prodigy/plan-${item_id}.md --output .prodigy/comparison-${item_id}.json --format json"
          - shell: "debtmap validate-improvement --comparison .prodigy/comparison-${item_id}.json --output .prodigy/debtmap-validation-${item_id}.json"
        result_file: ".prodigy/debtmap-validation-${item_id}.json"
        threshold: 75
        on_incomplete:
          commands:
            - claude: "/prodigy-complete-debtmap-fix --plan .prodigy/plan-${item_id}.md --validation .prodigy/debtmap-validation-${item_id}.json --attempt ${validation.attempt_number}"
              commit_required: true
            - shell: "just coverage-lcov"
            - shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after-${item_id}.json --format json"
            - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after-${item_id}.json --plan .prodigy/plan-${item_id}.md --output .prodigy/comparison-${item_id}.json --format json"
          max_attempts: 5
          fail_workflow: true
    # Phase 3: Verify tests pass
    - shell: "just test"
      on_failure:
        claude: "/prodigy-debug-test-failure --output ${shell.output}"
        max_attempts: 5
        fail_workflow: true
    # Phase 4: Check formatting and linting
    - shell: "just fmt-check && just lint"
      on_failure:
        claude: "/prodigy-lint ${shell.output}"
        max_attempts: 5
        fail_workflow: true
  # Parallelization settings
  max_parallel: 5 # Run up to 5 agents in parallel
  # Filter and sort items
  filter: "File.score >= 10 OR Function.unified_score.final_score >= 10"
  sort_by: "File.score DESC, Function.unified_score.final_score DESC"
  max_items: 10 # Limit to 10 items per run
# Reduce phase: Aggregate results and verify overall improvements
reduce:
  # Phase 1: Run final tests across all changes
  - shell: "just test"
    on_failure:
      claude: "/prodigy-debug-test-failure --output ${shell.output}"
      max_attempts: 5
      fail_workflow: true
  # Phase 2: Check formatting and linting
  - shell: "just fmt-check && just lint"
    on_failure:
      claude: "/prodigy-lint ${shell.output}"
      max_attempts: 5
      fail_workflow: true
  # Phase 3: Re-run debtmap to measure cumulative improvements
  - shell: "just coverage-lcov"
  - shell: "debtmap analyze src --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
  # Phase 4: Create final commit with summary
  - write_file:
      path: ".prodigy/map-results.json"
      content: "${map.results}"
      format: json
      create_dirs: true
  - claude: |
      /prodigy-compare-debt-results \
        --before .prodigy/debtmap-before.json \
        --after .prodigy/debtmap-after.json \
        --map-results-file .prodigy/map-results.json \
        --successful ${map.successful} \
        --failed ${map.failed} \
        --total ${map.total}
    commit_required: true
Running MapReduce Workflows
# Run MapReduce workflow (single execution processes multiple items in parallel)
prodigy run workflows/debtmap-reduce.yml
# Run with auto-confirm
prodigy run workflows/debtmap-reduce.yml -y
Note: MapReduce workflows don’t typically use -n for iterations. Instead, they process multiple items in a single run through parallel map agents.
MapReduce Configuration Options
Top-Level Fields
- name: Workflow name (string)
- mode: mapreduce: Enables MapReduce mode (required)
- setup: Commands to run once before map phase
- map: Map phase configuration
- reduce: Commands to run after all map agents complete
Setup Phase Fields
- timeout: Maximum time in seconds for setup phase
- commands: List of shell or claude commands to run
Map Phase Fields
- input: Path to JSON file containing items to process
- json_path: JSONPath expression to extract items array (e.g., $.items[*])
- agent_template: List of commands to run for each item (each item gets its own agent in an isolated worktree)
- max_parallel: Maximum number of agents to run concurrently
- filter: Expression to filter which items to process (e.g., "score >= 10")
- sort_by: Expression to sort items (e.g., "score DESC")
- max_items: Limit total items processed
MapReduce-Specific Variables
| Variable | Available In | Type | Description |
|---|---|---|---|
| ${item} | map phase | JSON | The full JSON object for current item |
| ${item_id} | map phase | string | Unique ID for current item (auto-generated) |
| ${validation.gaps} | map phase | array | List of validation gaps from failed validation |
| ${validation.attempt_number} | map phase | number | Current retry attempt number (1, 2, 3, etc.) |
| ${shell.output} | both phases | string | Output from previous shell command |
| ${map.results} | reduce phase | array | All map agent results as JSON |
| ${map.successful} | reduce phase | number | Count of successful map agents |
| ${map.failed} | reduce phase | number | Count of failed map agents |
| ${map.total} | reduce phase | number | Total number of map agents |
MapReduce Architecture
┌─────────────────────────────────────────────────────────┐
│ Setup Phase (main worktree)                             │
│ - Generate coverage data                                │
│ - Run debtmap analysis                                  │
│ - Output: .prodigy/debtmap-before.json                  │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│ Map Phase (parallel worktrees)                          │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ Agent 1      │  │ Agent 2      │  │ Agent 3      │   │
│  │ Item #1      │  │ Item #2      │  │ Item #3      │   │
│  │ Worktree A   │  │ Worktree B   │  │ Worktree C   │   │
│  │              │  │              │  │              │   │
│  │ Plan → Fix   │  │ Plan → Fix   │  │ Plan → Fix   │   │
│  │ → Validate   │  │ → Validate   │  │ → Validate   │   │
│  │ → Test       │  │ → Test       │  │ → Test       │   │
│  │ → Commit     │  │ → Commit     │  │ → Commit     │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐                     │
│  │ Agent 4      │  │ Agent 5      │                     │
│  │ Item #4      │  │ Item #5      │                     │
│  │ Worktree D   │  │ Worktree E   │                     │
│  │              │  │              │                     │
│  │ Plan → Fix   │  │ Plan → Fix   │                     │
│  │ → Validate   │  │ → Validate   │                     │
│  │ → Test       │  │ → Test       │                     │
│  │ → Commit     │  │ → Commit     │                     │
│  └──────────────┘  └──────────────┘                     │
└─────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────┐
│ Reduce Phase (main worktree)                            │
│ - Merge all agent worktrees                             │
│ - Run final tests on merged code                        │
│ - Run final linting                                     │
│ - Re-analyze with debtmap                               │
│ - Generate summary commit                               │
└─────────────────────────────────────────────────────────┘
Key Concepts:
- Isolation: Each map agent works in its own git worktree
- Parallelism: Multiple agents process different items simultaneously
- Validation: Each agent validates its changes independently
- Merging: Reduce phase merges all successful agent worktrees
- Final Validation: Reduce phase ensures merged code passes all tests
Iteration Strategy
How Iterations Work
When you run prodigy run workflows/debtmap.yml -yn 5, the workflow executes up to 5 times:
- Iteration 1:
  - Analyze codebase with debtmap
  - Select highest priority item
  - Create implementation plan
  - Execute plan and validate
  - Run tests and linting
- Iteration 2:
  - Re-analyze codebase (scores updated based on Iteration 1 changes)
  - Select next highest priority item
  - Repeat plan/implement/validate cycle
- Continue until iteration limit reached or workflow completes without finding issues
Controlling Iterations
Iterations are controlled via the -n flag:
# Single iteration (testing)
prodigy run workflows/debtmap.yml -yn 1
# Standard run (5 iterations)
prodigy run workflows/debtmap.yml -yn 5
# Deep cleanup (10+ iterations)
prodigy run workflows/debtmap.yml -yn 20
What Happens Each Iteration
Each iteration runs the entire workflow from start to finish:
- Generate coverage data
- Analyze technical debt
- Create implementation plan
- Execute plan
- Validate improvement
- Run tests (with auto-fixing)
- Run linting (with auto-fixing)
The workflow continues to the next iteration automatically if all steps succeed.
Example Output
Iteration 1:
- Fixed: parse_expression() (9.2 → 5.1)
- Fixed: calculate_score() (8.8 → 4.2)
- Fixed: apply_weights() (8.5 → 5.8)
✓ Tests pass
Iteration 2:
- Fixed: normalize_results() (7.5 → 3.9)
- Fixed: aggregate_data() (7.2 → 4.1)
✓ Tests pass
Iteration 3:
- No items above threshold (6.0)
✓ Early stop
Final Results:
Items fixed: 5
Average complexity: 15.2 → 8.6
Validation
Prodigy validates changes at the workflow step level, not as a standalone configuration.
Step-Level Validation
Validation is attached to specific workflow steps:
- claude: "/prodigy-debtmap-implement --plan .prodigy/IMPLEMENTATION_PLAN.md"
  commit_required: true
  validate:
    commands:
      - shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
      - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
      - claude: "/prodigy-validate-debtmap-improvement --comparison .prodigy/comparison.json --output .prodigy/debtmap-validation.json"
    result_file: ".prodigy/debtmap-validation.json"
    threshold: 75
    on_incomplete:
      commands:
        - claude: "/prodigy-complete-debtmap-fix --gaps ${validation.gaps} --plan .prodigy/IMPLEMENTATION_PLAN.md"
          commit_required: true
        - shell: "just coverage-lcov"
        - shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-after.json --format json"
        - shell: "debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json --plan .prodigy/IMPLEMENTATION_PLAN.md --output .prodigy/comparison.json --format json"
      max_attempts: 5
      fail_workflow: true
Validation Process
- Commands run: Execute validation commands (shell or claude)
- Check result file: Read JSON file specified in result_file
- Compare to threshold: Score must be >= threshold (0-100 scale)
- On incomplete: If score < threshold, run on_incomplete commands
- Retry: Repeat up to max_attempts times
- Fail or continue: If fail_workflow: true, stop workflow; otherwise continue
Validation Result Format
The result_file JSON should contain:
{
"score": 85,
"passed": true,
"gaps": [],
"details": "All debt improvement criteria met"
}
Test Validation with Auto-Fix
Tests are validated with automatic fixing on failure:
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
    fail_workflow: true
If tests fail, Prodigy automatically attempts to fix them up to 5 times before failing the workflow.
Output and Metrics
Workflow Report
{
"workflow": "debtmap-debt-reduction",
"iterations": 5,
"items_processed": 12,
"items_fixed": 10,
"items_failed": 2,
"metrics": {
"complexity_before": 145,
"complexity_after": 78,
"complexity_reduction": -46.2,
"coverage_before": 45.3,
"coverage_after": 72.1,
"coverage_improvement": 26.8
},
"changes": [
{
"file": "src/parser.rs",
"function": "parse_expression",
"before_score": 9.2,
"after_score": 5.1,
"improvements": ["Reduced complexity", "Added tests"]
}
]
}
Commit Messages
Prodigy generates descriptive commit messages:
refactor(parser): reduce complexity in parse_expression
- Extract nested conditionals to helper functions
- Add unit tests for edge cases
- Coverage: 0% → 85%
- Complexity: 22 → 8
Generated by Prodigy workflow: debtmap-debt-reduction
Iteration: 1/5
Integration with CI/CD
GitHub Actions
name: Prodigy Debt Reduction
on:
  schedule:
    - cron: '0 0 * * 0' # Weekly on Sunday
  workflow_dispatch:
jobs:
  reduce-debt:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Rust
        uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
      - name: Install Prodigy
        run: cargo install prodigy
      - name: Install dependencies
        run: |
          cargo install debtmap
          cargo install just
      - name: Run Prodigy workflow
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: prodigy run workflows/debtmap.yml -yn 5
      - name: Create PR
        uses: peter-evans/create-pull-request@v5
        with:
          title: "chore: automated debt reduction via Prodigy"
          body: |
            Automated technical debt reduction using Prodigy workflow.
            This PR was generated by the weekly debt reduction workflow.
            Review changes carefully before merging.
          branch: prodigy-debt-reduction
GitLab CI
prodigy-debt-reduction:
  stage: quality
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    - cargo install prodigy
    - cargo install debtmap
    - cargo install just
    - prodigy run workflows/debtmap.yml -yn 5
  artifacts:
    paths:
      - .prodigy/debtmap-*.json
      - .prodigy/comparison.json
Important CI Considerations
- API Keys: Store ANTHROPIC_API_KEY as a secret
- Worktrees: MapReduce mode creates isolated worktrees automatically for parallel processing
- Dependencies: Install prodigy, debtmap, and just (or your build tool)
- Timeout: CI jobs may need extended timeout for multiple iterations
- Review: Always create a PR for human review before merging automated changes
Best Practices
1. Start Small
Begin with low iteration counts:
# First run: 1 iteration to test workflow
prodigy run workflows/debtmap.yml -yn 1
# Standard run: 3-5 iterations
prodigy run workflows/debtmap.yml -yn 5
2. Focus on High-Priority Items
The debtmap analysis automatically prioritizes by:
- Complexity score (cyclomatic complexity)
- Coverage percentage (lower coverage = higher priority)
- Risk score (complexity × (100 - coverage%))
To focus on specific areas, create custom slash commands in .claude/commands/ that filter by:
- Module/file patterns
- Specific debt types (Complexity, TestGap, Duplication)
- Score thresholds
3. Validate Thoroughly
Use comprehensive validation in your workflow:
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
    fail_workflow: true
- shell: "just fmt-check && just lint"
  on_failure:
    claude: "/prodigy-lint ${shell.output}"
    max_attempts: 5
    fail_workflow: true
4. Review Before Merging
Always review Prodigy’s changes:
# Find your worktree
ls ~/.prodigy/worktrees/
# Check changes
cd ~/.prodigy/worktrees/session-xxx
git diff main
# Review commit history
git log --oneline
# Run full test suite
cargo test --all-features
5. Monitor Progress
Track debt reduction over iterations:
# Compare before and after
debtmap compare --before .prodigy/debtmap-before.json --after .prodigy/debtmap-after.json
# View detailed metrics
cat .prodigy/comparison.json | jq
Troubleshooting
Workflow Fails to Start
Issue: “Prodigy not found” or “API key missing”
Solution:
# Install Prodigy
cargo install prodigy
# Set API key
export ANTHROPIC_API_KEY="your-key"
# Verify installation
prodigy --version
Validation Failures
Issue: Validation score below threshold
Solution: Check validation results:
# View validation details
cat .prodigy/debtmap-validation.json
# Check what gaps remain
cat .prodigy/debtmap-validation.json | jq '.gaps'
# Review comparison results
cat .prodigy/comparison.json
The workflow will automatically retry up to max_attempts times with /prodigy-complete-debtmap-fix.
Test Failures
Issue: Tests fail after implementation
Solution: The workflow includes automatic test fixing:
- shell: "just test"
  on_failure:
    claude: "/prodigy-debug-test-failure --output ${shell.output}"
    max_attempts: 5
    fail_workflow: true
If tests still fail after 5 attempts, review manually:
# Check test output
just test
# Review recent changes
git diff HEAD~1
No Items Processed
Issue: Workflow completes but doesn’t find debt to fix
Possible Causes:
- Codebase has very low debt scores (below selection threshold)
- Coverage data not generated properly
- Debtmap analysis found no high-priority items
Solution:
# Check debtmap analysis results
cat .prodigy/debtmap-before.json | jq '.items | sort_by(-.unified_score.final_score) | .[0:5]'
# Verify coverage was generated
ls -lh target/coverage/lcov.info
# Run debtmap manually to see what's detected
debtmap analyze . --lcov target/coverage/lcov.info
Workflow Hangs or Times Out
Issue: Workflow takes too long or appears stuck
Possible Causes:
- Large codebase with many files
- Complex refactoring requiring extensive analysis
- Network issues with Claude API
Solution:
- Reduce iteration count for testing (-n 1)
- Check Claude API connectivity
- Monitor worktree for progress: cd ~/.prodigy/worktrees/session-xxx && git log
MapReduce-Specific Troubleshooting
Resuming Failed MapReduce Jobs
Issue: MapReduce job was interrupted or failed
Solution:
# Find the job ID from recent sessions
prodigy sessions
# Resume the MapReduce job from checkpoint
prodigy resume-job <JOB_ID>
The job will resume from where it left off, skipping already-completed items.
Checking MapReduce Progress
Issue: Want to monitor long-running MapReduce job
Solution:
# View overall progress
prodigy progress <JOB_ID>
# View detailed events
prodigy events <JOB_ID>
# Filter for specific event types
prodigy events <JOB_ID> --type agent_completed
prodigy events <JOB_ID> --type agent_failed
Output example:
MapReduce Job: job-abc123
Status: running
Progress: 7/10 items (70%)
- Completed: 5
- Running: 2
- Failed: 3
Managing Failed MapReduce Items
Issue: Some agents failed, items in Dead Letter Queue
Solution:
# View failed items
prodigy dlq list <JOB_ID>
# Review why an item failed (check events)
prodigy events <JOB_ID> --item <ITEM_ID>
# Retry specific failed item
prodigy dlq retry <JOB_ID> <ITEM_ID>
# Remove unfixable items from DLQ
prodigy dlq remove <JOB_ID> <ITEM_ID>
Common failure reasons:
- Validation threshold not met after max_attempts
- Tests fail and can’t be fixed automatically
- Merge conflicts with other agents’ changes
- Timeout exceeded for complex refactoring
Cleaning Up MapReduce Worktrees
Issue: Disk space consumed by many MapReduce worktrees
Solution:
# List all worktrees
prodigy worktree list
# Clean up completed job worktrees
prodigy worktree clean
# Remove specific session's worktrees
prodigy worktree remove <SESSION_ID>
# Manual cleanup (if Prodigy commands don't work)
rm -rf ~/.prodigy/worktrees/session-xxx
When to clean:
- After successful job completion and merge
- When disk space is low
- After abandoned or failed jobs
MapReduce Merge Conflicts
Issue: Reduce phase fails due to merge conflicts between agent worktrees
Possible Causes:
- Multiple agents modified overlapping code
- Agents made conflicting architectural changes
- Shared dependencies updated differently
Solution:
# Review which agents succeeded
prodigy events <JOB_ID> --type agent_completed
# Check merge conflicts
cd ~/.prodigy/worktrees/session-xxx
git status
# Manually resolve conflicts
# Edit conflicting files
git add .
git commit -m "Resolve MapReduce merge conflicts"
# Resume the job
prodigy resume-job <JOB_ID>
Prevention:
- Use filter to ensure agents work on independent items
- Reduce max_parallel to minimize conflicts
- Design debt items to be truly independent
Understanding MapReduce Variables
If you’re debugging workflow files, these variables are available:
In map phase (agent_template):
- ${item}: Full JSON of current item being processed
- ${item_id}: Unique ID for current item
- ${validation.gaps}: Validation gaps from validation result
- ${validation.attempt_number}: Current retry attempt (1, 2, 3…)
- ${shell.output}: Output from previous shell command
In reduce phase:
- ${map.results}: All map agent results as JSON
- ${map.successful}: Count of successful agents
- ${map.failed}: Count of failed agents
- ${map.total}: Total number of agents
Example debug command:
# In agent_template, log the item being processed
- shell: "echo 'Processing item: ${item_id}' >> .prodigy/debug.log"
Example Workflows
Full Repository Cleanup
For comprehensive debt reduction, use a higher iteration count:
# Run 10 iterations for deeper cleanup
prodigy run workflows/debtmap.yml -yn 10
# Run 20 iterations for major refactoring
prodigy run workflows/debtmap.yml -yn 20
The workflow automatically:
- Selects highest priority items each iteration
- Addresses different debt types (Complexity, TestGap, Duplication)
- Validates all changes with tests and linting
- Commits only successful improvements
Custom Workflow for Specific Focus
Create a custom workflow file for focused improvements:
workflows/add-tests.yml - Focus on test coverage:
# Generate coverage
- shell: "just coverage-lcov"
# Analyze with focus on test gaps
- shell: "debtmap analyze . --lcov target/coverage/lcov.info --output .prodigy/debtmap-before.json --format json"
# Create plan (slash command will prioritize TestGap items)
- claude: "/prodigy-debtmap-plan --before .prodigy/debtmap-before.json --output .prodigy/IMPLEMENTATION_PLAN.md"
# ... rest of standard workflow steps
Run with:
prodigy run workflows/add-tests.yml -yn 5
Targeted Module Cleanup
Create a custom slash command to focus on specific modules:
.claude/commands/refactor-module.md:
# /refactor-module
Refactor the highest complexity item in the specified module.
Arguments: --module <module_name>
... implementation details ...
Then create a workflow using this command for targeted refactoring.
See Also
- Debtmap CLI Reference - All Debtmap command options including analyze, compare, and validate
- Coverage Integration - Generating and using LCOV coverage data with Debtmap
- Configuration - Debtmap configuration file options
- Tiered Prioritization - Understanding how Debtmap scores and prioritizes debt items
- Prodigy Documentation - Full Prodigy reference and advanced features
Responsibility Analysis
Overview
Responsibility analysis is a core feature of Debtmap that helps identify violations of the Single Responsibility Principle (SRP), one of the fundamental SOLID design principles. By analyzing function and method names, Debtmap automatically infers the distinct functional responsibilities within a code unit and detects when a single module, struct, or class has taken on too many concerns.
This chapter provides an in-depth look at how Debtmap determines responsibilities, categorizes them, and uses this information to guide refactoring decisions.
What Are Responsibilities?
In the context of Debtmap, a responsibility is a distinct functional domain or concern that a code unit handles. Examples include:
- Data Access - Getting and setting values from data structures
- Validation - Checking inputs, verifying constraints, ensuring correctness
- Persistence - Saving and loading data to/from storage
- Computation - Performing calculations and transformations
- Communication - Sending and receiving messages or events
According to the Single Responsibility Principle, each module should have one and only one reason to change. When a module handles multiple unrelated responsibilities (e.g., validation, persistence, AND computation), it becomes:
- Harder to understand - Developers must mentally juggle multiple concerns
- More fragile - Changes to one responsibility can break others
- Difficult to test - Testing requires complex setup across multiple domains
- Prone to coupling - Dependencies from different domains become entangled
Debtmap’s responsibility analysis automatically identifies these violations and provides concrete recommendations for splitting modules along responsibility boundaries.
How Responsibilities Are Detected
Pattern-Based Inference
Debtmap uses prefix-based pattern matching to infer responsibilities from function and method names. This approach is both simple and effective because well-named functions naturally express their intent through conventional prefixes.
Implementation Location: src/organization/god_object_analysis.rs:316-386
The infer_responsibility_from_method() function performs case-insensitive prefix matching:
pub fn infer_responsibility_from_method(method_name: &str) -> String {
    let lower_name = method_name.to_lowercase();
    if lower_name.starts_with("format_") || lower_name.starts_with("render_") {
        return "Formatting & Output".to_string();
    }
    if lower_name.starts_with("parse_") || lower_name.starts_with("read_") {
        return "Parsing & Input".to_string();
    }
    // ... additional patterns
    // Names that match no known prefix fall back to the generic category
    "Utilities".to_string()
}
This approach works across languages (Rust, Python, JavaScript/TypeScript) because naming conventions are relatively consistent in modern codebases.
Responsibility Categories
Debtmap recognizes 11 built-in responsibility categories plus a generic “Utilities” fallback:
| Category | Prefixes | Examples |
|---|---|---|
| Formatting & Output | format_, render_, write_, print_ | format_json(), render_table(), write_report() |
| Parsing & Input | parse_, read_, extract_ | parse_config(), read_file(), extract_fields() |
| Filtering & Selection | filter_, select_, find_ | filter_results(), select_top(), find_item() |
| Transformation | transform_, convert_, map_, apply_ | transform_data(), convert_type(), map_fields() |
| Data Access | get_, set_ | get_value(), set_name() |
| Validation | validate_, check_, verify_, is_* | validate_input(), check_bounds(), is_valid() |
| Computation | calculate_, compute_ | calculate_score(), compute_sum() |
| Construction | create_, build_, new_*, make_ | create_instance(), build_config(), new_user() |
| Persistence | save_, load_, store_ | save_data(), load_cache(), store_result() |
| Processing | process_, handle_ | process_request(), handle_error() |
| Communication | send_, receive_ | send_message(), receive_data() |
| Utilities | (all others) | helper(), do_work(), utility_fn() |
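The category table above can also be expressed as data rather than as a chain of if statements. The sketch below is one way to do that and is not the code in god_object_analysis.rs: the `CATEGORIES` constant and the `infer_responsibility` name are invented here, and the table's is_* and new_* entries are treated as the plain prefixes is_ and new_.

```rust
/// (category, prefixes) pairs transcribed from the table above.
/// Hypothetical constant, not taken from the Debtmap source.
const CATEGORIES: &[(&str, &[&str])] = &[
    ("Formatting & Output", &["format_", "render_", "write_", "print_"]),
    ("Parsing & Input", &["parse_", "read_", "extract_"]),
    ("Filtering & Selection", &["filter_", "select_", "find_"]),
    ("Transformation", &["transform_", "convert_", "map_", "apply_"]),
    ("Data Access", &["get_", "set_"]),
    ("Validation", &["validate_", "check_", "verify_", "is_"]),
    ("Computation", &["calculate_", "compute_"]),
    ("Construction", &["create_", "build_", "new_", "make_"]),
    ("Persistence", &["save_", "load_", "store_"]),
    ("Processing", &["process_", "handle_"]),
    ("Communication", &["send_", "receive_"]),
];

/// Case-insensitive prefix lookup with "Utilities" as the fallback.
fn infer_responsibility(method_name: &str) -> &'static str {
    let lower = method_name.to_lowercase();
    CATEGORIES
        .iter()
        .find(|(_, prefixes)| prefixes.iter().any(|p| lower.starts_with(*p)))
        .map(|(category, _)| *category)
        .unwrap_or("Utilities")
}

fn main() {
    for name in ["format_json", "is_valid", "Get_Value", "do_work"] {
        println!("{} -> {}", name, infer_responsibility(name));
    }
}
```

Because the first matching row wins, the order of the rows matters whenever one prefix could shadow another; none of the prefixes in the table overlap, so the order here is only cosmetic.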
Grouping Methods by Responsibility
Once individual methods are categorized, Debtmap groups them using group_methods_by_responsibility():
Implementation Location: src/organization/god_object_analysis.rs:268-280
use std::collections::HashMap;

pub fn group_methods_by_responsibility(methods: &[String]) -> HashMap<String, Vec<String>> {
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    for method in methods {
        // Classify the method by its name prefix, then append it to that group
        let responsibility = infer_responsibility_from_method(method);
        groups.entry(responsibility).or_default().push(method.clone());
    }
    groups
}
Output Structure:
- Keys: Responsibility category names (e.g., “Data Access”, “Validation”)
- Values: Lists of method names belonging to each category
The responsibility count is simply the number of unique keys in this HashMap.
Example Analysis
Consider a Rust struct with these methods:
impl UserManager {
    fn get_user(&self, id: UserId) -> Option<User> { }
    fn set_password(&mut self, id: UserId, password: &str) { }
    fn validate_email(&self, email: &str) -> bool { }
    fn validate_password(&self, password: &str) -> bool { }
    fn save_user(&self, user: &User) -> Result<()> { }
    fn load_user(&self, id: UserId) -> Result<User> { }
    fn send_notification(&self, user_id: UserId, msg: &str) { }
    fn format_user_profile(&self, user: &User) -> String { }
}
Debtmap’s Analysis:
| Method | Inferred Responsibility |
|---|---|
| get_user | Data Access |
| set_password | Data Access |
| validate_email | Validation |
| validate_password | Validation |
| save_user | Persistence |
| load_user | Persistence |
| send_notification | Communication |
| format_user_profile | Formatting & Output |
Result:
- Responsibility Count: 5 (Data Access, Validation, Persistence, Communication, Formatting)
- Assessment: This violates SRP - UserManager has too many distinct concerns
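The result above can be reproduced with a reduced, standalone version of the grouping step. This is an illustrative sketch, not Debtmap's code: `infer` covers only the prefixes needed for this example, and a BTreeMap replaces the HashMap so the printed order is stable.

```rust
use std::collections::BTreeMap;

fn has_prefix(name: &str, prefixes: &[&str]) -> bool {
    prefixes.iter().any(|p| name.starts_with(*p))
}

/// Reduced classifier: only the categories that occur in the UserManager example.
fn infer(method: &str) -> &'static str {
    let lower = method.to_lowercase();
    if has_prefix(&lower, &["get_", "set_"]) {
        "Data Access"
    } else if has_prefix(&lower, &["validate_", "check_", "verify_", "is_"]) {
        "Validation"
    } else if has_prefix(&lower, &["save_", "load_", "store_"]) {
        "Persistence"
    } else if has_prefix(&lower, &["send_", "receive_"]) {
        "Communication"
    } else if has_prefix(&lower, &["format_", "render_", "write_", "print_"]) {
        "Formatting & Output"
    } else {
        "Utilities"
    }
}

/// Group method names by inferred responsibility.
fn group(methods: &[&str]) -> BTreeMap<&'static str, Vec<String>> {
    let mut groups: BTreeMap<&'static str, Vec<String>> = BTreeMap::new();
    for m in methods {
        groups.entry(infer(m)).or_default().push(m.to_string());
    }
    groups
}

fn main() {
    let methods = [
        "get_user", "set_password", "validate_email", "validate_password",
        "save_user", "load_user", "send_notification", "format_user_profile",
    ];
    let groups = group(&methods);
    // The responsibility count is the number of distinct keys.
    println!("responsibility count: {}", groups.len());
    for (responsibility, names) in &groups {
        println!("{}: {}", responsibility, names.join(", "));
    }
}
```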
Responsibility Scoring
Integration with God Object Detection
Responsibility count is a critical factor in God Object Detection. The scoring algorithm includes:
responsibility_factor = min(responsibility_count / 3.0, 3.0)
god_object_score = method_factor × field_factor × responsibility_factor × size_factor
Why divide by 3.0?
- 1-3 responsibilities: Normal, well-scoped module
- 4-6 responsibilities: Warning signs, approaching problematic territory
- 7+ responsibilities: Severe violation, likely a god object
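A few concrete values make the scaling easier to see. The sketch below evaluates only the responsibility_factor formula given above; the other factors in god_object_score are not defined in this section, so they are left out rather than guessed.

```rust
/// responsibility_factor = min(responsibility_count / 3.0, 3.0)
fn responsibility_factor(responsibility_count: usize) -> f64 {
    (responsibility_count as f64 / 3.0).min(3.0)
}

fn main() {
    for count in [1, 3, 5, 6, 9, 12] {
        println!(
            "{:>2} responsibilities -> factor {:.2}",
            count,
            responsibility_factor(count)
        );
    }
}
```

Three responsibilities give a neutral factor of 1.0, six double the score, and the cap of 3.0 is reached at nine, so additional responsibilities beyond that point no longer increase this factor.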
Language-Specific Thresholds
Different languages have different expectations for responsibility counts:
| Language | Max Responsibilities | Rationale |
|---|---|---|
| Rust | 5 | Strong module system encourages tight boundaries |
| Python | 3 | Duck typing makes mixing concerns more dangerous |
| JavaScript/TypeScript | 3 | Prototype-based, benefits from focused classes |
These thresholds can be customized in .debtmap.toml:
[god_object_detection.rust]
max_traits = 5 # max_traits = max responsibilities
[god_object_detection.python]
max_traits = 3
Confidence Determination
Responsibility count contributes to overall confidence levels:
Implementation Location: src/organization/god_object_analysis.rs:234-266
pub fn determine_confidence(
    method_count: usize,
    field_count: usize,
    responsibility_count: usize,
    lines_of_code: usize,
    complexity_sum: u32,
    thresholds: &GodObjectThresholds,
) -> GodObjectConfidence {
    let mut violations = 0;
    if responsibility_count > thresholds.max_traits {
        violations += 1;
    }
    // ... check other metrics
    match violations {
        5 => GodObjectConfidence::Definite,
        3..=4 => GodObjectConfidence::Probable,
        1..=2 => GodObjectConfidence::Possible,
        _ => GodObjectConfidence::NotGodObject,
    }
}
Advanced Responsibility Detection
Module-Level Analysis
For large modules without a single dominant struct, Debtmap performs module-level responsibility detection:
Implementation Location: src/organization/god_object_detector.rs:682-697
The classify_responsibility() function provides extended categorization:
fn classify_responsibility(prefix: &str) -> String {
    match prefix {
        "get" | "set" => "Data Access".to_string(),
        "calculate" | "compute" => "Computation".to_string(),
        "validate" | "check" | "verify" | "ensure" => "Validation".to_string(),
        "save" | "load" | "store" | "retrieve" | "fetch" => "Persistence".to_string(),
        "create" | "build" | "new" | "make" | "init" => "Construction".to_string(),
        "send" | "receive" | "handle" | "manage" => "Communication".to_string(),
        "update" | "modify" | "change" | "edit" => "Modification".to_string(),
        "delete" | "remove" | "clear" | "reset" => "Deletion".to_string(),
        "is" | "has" | "can" | "should" | "will" => "State Query".to_string(),
        "process" | "transform" => "Processing".to_string(),
        // Unknown prefixes get a generated category such as "Sync Operations"
        _ => format!("{} Operations", capitalize_first(prefix)),
    }
}
This extended mapping covers 10 core categories plus dynamic fallback for custom prefixes.
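The dynamic fallback depends on two pieces the listing does not show: how the prefix is taken from a method name and what capitalize_first does. The sketch below fills both in under stated assumptions. The prefix is assumed to be the text before the first underscore, `method_prefix` and `fallback_label` are hypothetical names, and this `capitalize_first` is a guess at the helper's behavior rather than the Debtmap implementation.

```rust
/// Upper-case the first character and keep the rest unchanged (assumed behavior).
fn capitalize_first(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    }
}

/// Assumption: the prefix is everything before the first underscore.
fn method_prefix(method_name: &str) -> &str {
    method_name.split('_').next().unwrap_or(method_name)
}

/// Label produced by the `_ =>` arm for a prefix with no built-in category.
fn fallback_label(method_name: &str) -> String {
    format!("{} Operations", capitalize_first(method_prefix(method_name)))
}

fn main() {
    for name in ["sync_remote_state", "export_csv"] {
        println!("{} -> {}", name, fallback_label(name));
    }
}
```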
Responsibility Groups
The ResponsibilityGroup data structure tracks detailed information about each responsibility:
Implementation Location: src/organization/mod.rs:156-161
#[derive(Debug, Clone, PartialEq)]
pub struct ResponsibilityGroup {
    pub name: String,           // e.g., "DataAccessManager"
    pub methods: Vec<String>,   // Methods in this group
    pub fields: Vec<String>,    // Associated fields
    pub responsibility: String, // e.g., "Data Access"
}
This structure enables:
- Refactoring recommendations - Suggest splitting by responsibility group
- Cohesion analysis - Measure how tightly methods are related
- Field-method correlation - Identify which fields belong to which responsibilities
Refactoring Based on Responsibilities
Recommended Module Splits
When Debtmap detects a module with multiple responsibilities, it generates actionable refactoring recommendations using recommend_module_splits():
Implementation Location: src/organization/god_object_detector.rs:165-177
Process:
- Group all methods by their inferred responsibilities
- Create a ModuleSplit for each responsibility group
- Suggest module names (e.g., “DataAccessManager”, “ValidationManager”)
- Estimate lines of code for each new module
- Order by cohesion (most focused groups first)
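The process above can be sketched as follows. This is not the body of recommend_module_splits(); it is a simplified illustration that assumes a fixed per-method line estimate (`lines_per_method`) and uses group size as a stand-in for the cohesion ordering, since the cohesion metric itself is not shown here. The `ModuleSplit` fields match the structure documented in this chapter.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct ModuleSplit {
    pub suggested_name: String,
    pub methods_to_move: Vec<String>,
    pub responsibility: String,
    pub estimated_lines: usize,
}

/// Builds one split per responsibility group.
/// Assumption: every method is roughly `lines_per_method` lines long.
pub fn sketch_module_splits(
    groups: &BTreeMap<String, Vec<String>>,
    lines_per_method: usize,
) -> Vec<ModuleSplit> {
    let mut splits: Vec<ModuleSplit> = groups
        .iter()
        .map(|(responsibility, methods)| ModuleSplit {
            // "Data Access" becomes "DataAccessManager".
            suggested_name: format!("{}Manager", responsibility.replace(' ', "")),
            methods_to_move: methods.clone(),
            responsibility: responsibility.clone(),
            estimated_lines: methods.len() * lines_per_method,
        })
        .collect();

    // Stand-in for cohesion ordering: larger groups first (stable sort).
    splits.sort_by(|a, b| b.methods_to_move.len().cmp(&a.methods_to_move.len()));
    splits
}
```

A real implementation would likely estimate lines from the actual method bodies rather than a constant; the constant keeps the sketch short.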
Example Output:
Recommended Splits for UserManager:
1. DataAccessManager (5 methods, ~80 lines)
- get_user, set_password, get_email, set_email, update_profile
2. ValidationManager (4 methods, ~60 lines)
- validate_email, validate_password, check_permissions, verify_token
3. PersistenceManager (3 methods, ~50 lines)
- save_user, load_user, delete_user
4. NotificationManager (2 methods, ~30 lines)
- send_notification, send_bulk_notifications
Practical Refactoring Patterns
Pattern 1: Extract Service Classes
Before (God Object):
struct UserManager {
    db: Database,
    cache: Cache,
    notifier: Notifier,
}
impl UserManager {
    // Bodies are elided; todo!() keeps the signatures compilable.
    fn get_user(&self, id: UserId) -> Option<User> { todo!() }
    fn validate_email(&self, email: &str) -> bool { todo!() }
    fn save_user(&self, user: &User) -> Result<()> { todo!() }
    fn send_notification(&self, id: UserId, msg: &str) { todo!() }
}
After (Split by Responsibility):
// Data Access
struct UserRepository {
    db: Database,
    cache: Cache,
}
// Validation
struct UserValidator;
// Persistence
struct UserPersistence {
    db: Database,
}
// Communication
struct NotificationService {
    notifier: Notifier,
}
Pattern 2: Use Facade for Composition
After splitting, create a facade to coordinate:
struct UserFacade {
    repository: UserRepository,
    validator: UserValidator,
    persistence: UserPersistence,
    notifier: NotificationService,
}
impl UserFacade {
    // Sketch: assumes each service method returns Result<()> so that `?` can
    // propagate failures, and that NotificationService exposes send_welcome.
    fn register_user(&mut self, user: User) -> Result<()> {
        self.validator.validate_email(&user.email)?;
        self.persistence.save_user(&user)?;
        self.notifier.send_welcome(&user.id)?;
        Ok(())
    }
}
Pattern 3: Trait-Based Separation (Rust)
Use traits to define responsibility boundaries:
trait DataAccess {
    fn get_user(&self, id: UserId) -> Option<User>;
}
trait Validation {
    fn validate_email(&self, email: &str) -> bool;
}
trait Persistence {
    fn save_user(&self, user: &User) -> Result<()>;
}
// Implement only the needed traits per struct (method bodies elided)
impl DataAccess for UserRepository { /* fn get_user ... */ }
impl Validation for UserValidator { /* fn validate_email ... */ }
impl Persistence for UserPersistence { /* fn save_user ... */ }
Data Structures
GodObjectAnalysis
The main result structure includes responsibility information:
Implementation Location: src/organization/god_object_analysis.rs:5-18
pub struct GodObjectAnalysis {
    pub is_god_object: bool,
    pub method_count: usize,
    pub field_count: usize,
    pub responsibility_count: usize, // Number of distinct responsibilities
    pub lines_of_code: usize,
    pub complexity_sum: u32,
    pub god_object_score: f64,
    pub recommended_splits: Vec<ModuleSplit>,
    pub confidence: GodObjectConfidence,
    pub responsibilities: Vec<String>, // List of responsibility names
    pub purity_distribution: Option<PurityDistribution>,
}
ModuleSplit
Recommendations for splitting modules:
Implementation Location: src/organization/god_object_analysis.rs:40-45
pub struct ModuleSplit {
    pub suggested_name: String,       // e.g., "ValidationManager"
    pub methods_to_move: Vec<String>, // Methods for this module
    pub responsibility: String,       // Responsibility category
    pub estimated_lines: usize,       // Approximate LOC
}
Testing Responsibility Detection
Debtmap includes comprehensive tests for responsibility detection:
Implementation Location: src/organization/god_object_analysis.rs:623-838
Test Coverage
Key test cases:
- Prefix recognition - Each of the 11 categories is tested individually
- Case insensitivity - Format_Output and format_output both map to “Formatting & Output”
- Multiple responsibilities - Grouping diverse methods correctly
- Empty input handling - Graceful handling of empty method lists
- Edge cases - Methods without recognized prefixes default to “Utilities”
Example Test:
#[test]
fn test_multiple_responsibility_groups() {
    let methods = vec![
        "format_output".to_string(),
        "parse_input".to_string(),
        "get_value".to_string(),
        "validate_data".to_string(),
    ];
    let groups = group_methods_by_responsibility(&methods);
    assert_eq!(groups.len(), 4); // 4 distinct responsibilities
    assert!(groups.contains_key("Formatting & Output"));
    assert!(groups.contains_key("Parsing & Input"));
    assert!(groups.contains_key("Data Access"));
    assert!(groups.contains_key("Validation"));
}
Configuration
TOML Configuration
Customize responsibility thresholds in .debtmap.toml:
[god_object_detection]
enabled = true
[god_object_detection.rust]
max_traits = 5 # Max responsibilities for Rust
[god_object_detection.python]
max_traits = 3 # Max responsibilities for Python
[god_object_detection.javascript]
max_traits = 3 # Max responsibilities for JavaScript/TypeScript
Tuning Guidelines
Strict SRP Enforcement:
[god_object_detection.rust]
max_traits = 3
- Enforces very tight single responsibility
- Suitable for greenfield projects or strict refactoring efforts
Balanced Approach (Default):
[god_object_detection.rust]
max_traits = 5
- Allows some flexibility while catching major violations
- Works well for most projects
Lenient Mode:
[god_object_detection.rust]
max_traits = 7
- Only flags severe SRP violations
- Useful for large legacy codebases during initial assessment
Output and Reporting
Console Output
When analyzing a file with multiple responsibilities:
src/services/user_manager.rs
⚠️ God Object: 18 methods, 8 fields, 5 responsibilities
Score: 185 (Confidence: Probable)
Responsibilities:
- Data Access (5 methods)
- Validation (4 methods)
- Persistence (3 methods)
- Communication (3 methods)
- Formatting & Output (3 methods)
Recommended Splits:
1. DataAccessManager (5 methods, ~75 lines)
2. ValidationManager (4 methods, ~60 lines)
3. PersistenceManager (3 methods, ~45 lines)
JSON Output
For programmatic analysis, use --format json:
{
"file": "src/services/user_manager.rs",
"is_god_object": true,
"responsibility_count": 5,
"responsibilities": [
"Data Access",
"Validation",
"Persistence",
"Communication",
"Formatting & Output"
],
"recommended_splits": [
{
"suggested_name": "DataAccessManager",
"methods_to_move": ["get_user", "set_password", "get_email"],
"responsibility": "Data Access",
"estimated_lines": 75
}
]
}
Best Practices
Writing SRP-Compliant Code
- Name functions descriptively - Use standard prefixes (get_, validate_, etc.)
- Group related functions - Keep similar responsibilities together
- Limit responsibility count - Aim for 1-3 responsibilities per module
- Review regularly - Run Debtmap periodically to catch responsibility creep
- Refactor early - Split modules before they hit thresholds
Code Review Guidelines
When reviewing responsibility analysis results:
- Check responsibility boundaries - Are they logically distinct?
- Validate groupings - Do the recommended splits make sense?
- Consider dependencies - Will splitting introduce more coupling?
- Estimate refactoring cost - Is the improvement worth the effort?
- Prioritize by score - Focus on high-scoring god objects first
Team Adoption
Phase 1: Assessment
- Run Debtmap on codebase
- Review responsibility violations
- Identify top 10 problematic modules
Phase 2: Education
- Share responsibility analysis results with team
- Discuss SRP and its benefits
- Agree on responsibility threshold standards
Phase 3: Incremental Refactoring
- Start with highest-scoring modules
- Apply recommended splits
- Measure improvement with follow-up analysis
Phase 4: Continuous Monitoring
- Integrate Debtmap into CI/CD
- Track responsibility counts over time
- Prevent new SRP violations from merging
Limitations and Edge Cases
False Positives
Scenario 1: Utilities Module
// utilities.rs - 15 helper functions with different prefixes
fn format_date() { }
fn parse_config() { }
fn validate_email() { }
// ... 12 more diverse utilities
Issue: Flagged as having multiple responsibilities, but it’s intentionally a utility collection.
Solution: Either accept the flagging (utilities should perhaps be split) or increase max_traits threshold.
False Negatives
Scenario 2: Poor Naming
impl DataProcessor {
    fn process_data(&mut self) { /* does everything */ }
    fn handle_stuff(&mut self) { /* also does everything */ }
    fn do_work(&mut self) { /* yet more mixed concerns */ }
}
Issue: All methods map to “Processing” or “Utilities”, so responsibility count is low despite clear SRP violations.
Solution: Encourage better naming conventions in your team. Debtmap relies on descriptive function names.
Language-Specific Challenges
Rust: Trait implementations may group methods by trait rather than responsibility, artificially inflating counts.
Python: Dynamic typing and duck typing make responsibility boundaries less clear from signatures alone.
JavaScript: Prototype methods and closures may not follow conventional naming patterns.
Integration with Other Features
God Object Detection
Responsibility analysis is a core component of God Object Detection. The responsibility count contributes to:
- God object scoring
- Confidence level determination
- Refactoring recommendations
Tiered Prioritization
High responsibility counts increase priority in Tiered Prioritization through the god object multiplier.
Risk Assessment
Modules with multiple responsibilities receive higher risk scores in risk assessment, as they are more prone to bugs and harder to maintain.
Related Documentation
- God Object Detection - Full god object analysis including responsibility detection
- Configuration - TOML configuration reference
- Metrics Reference - All metrics including responsibility count
- Architecture - High-level design including analysis pipelines
Summary
Responsibility analysis in Debtmap:
- Automatically detects SRP violations through pattern-based method name analysis
- Categorizes methods into 11 built-in responsibility types
- Provides actionable refactoring recommendations with suggested module splits
- Integrates with god object detection for holistic architectural analysis
- Supports language-specific thresholds for Rust, Python, and JavaScript/TypeScript
- Is fully configurable via .debtmap.toml and CLI flags
By surfacing responsibility violations early and suggesting concrete refactoring paths, Debtmap helps teams maintain clean, modular architectures that follow the Single Responsibility Principle.
Scoring Strategies
Debtmap provides two complementary scoring approaches: file-level and function-level. Understanding when to use each approach helps you make better refactoring decisions and prioritize work effectively.
Overview
Different refactoring scenarios require different levels of granularity:
- File-level scoring: Identifies architectural issues and supports planning of major refactoring initiatives
- Function-level scoring: Pinpoints specific hot spots for targeted improvements
This chapter explains both approaches, when to use each, and how to interpret the results.
File-Level Scoring
File-level scoring aggregates metrics across all functions in a file to identify architectural problems and module-level refactoring opportunities.
Formula
File Score = Size × Complexity × Coverage Factor × Density × GodObject × FunctionScores
Note: This is a conceptual formula showing the multiplicative relationship between factors. The actual implementation in src/priority/file_metrics.rs includes additional normalization steps and conditional adjustments. See source code for exact calculation details.
Where each factor is calculated as:
- Size = sqrt(total_lines / 100)
- Complexity = (avg_complexity / 5.0) × sqrt(total_complexity / 50.0)
- Coverage Factor = ((1.0 - coverage_percent) × 2.0) + 1.0
- Density = 1.0 + ((function_count - 50) × 0.02) if function_count > 50, else 1.0
- GodObject = 2.0 + god_object_score if detected
- FunctionScores = sum(function_scores) / 10
Factors
Size Factor: sqrt(total_lines / 100)
- Larger files have higher impact
- Square root dampens the effect to avoid over-penalizing large files
- Rationale: Refactoring a 1000-line file affects more code than a 100-line file
Complexity Factor: (avg_complexity / 5.0) × sqrt(total_complexity / 50.0)
- Combines average and total complexity
- Balances per-function and aggregate complexity
- Rationale: Both concentrated complexity and spread-out complexity matter
Coverage Factor: (coverage_gap × 2.0) + 1.0 where coverage_gap = 1.0 - coverage_percent
- Lower coverage increases score multiplicatively
- Range: 1.0 (100% coverage) to 3.0 (0% coverage)
- Formula expands to: ((1.0 - coverage_percent) × 2.0) + 1.0
- Example: 50% coverage → gap=0.5 → factor=(0.5×2.0)+1.0 = 2.0x
- Rationale: Untested files amplify existing complexity and risk through a multiplicative factor greater than 1.0
- Note: Earlier versions used 1.0 - coverage_percent (range 0-1); current implementation uses expanded range 1-3 for stronger emphasis
Density Factor: Penalizes files with excessive function count
- Triggers when function count > 50
- Formula: 1.0 + ((function_count - 50) * 0.02) if function_count > 50, else 1.0
- Creates a gradual linear increase: 51 functions = 1.02x, 75 functions = 1.50x, 100 functions = 2.0x
- Example: A file with 75 functions gets 1.0 + ((75 - 50) * 0.02) = 1.0 + 0.50 = 1.50x multiplier
- Rationale: Files with many functions likely violate single responsibility
God Object Multiplier: 2.0 + god_object_score when detected
- Applies when god object detection flags the file
- Range: 2.0 (borderline) to 3.0 (severe god object)
- Rationale: God objects need immediate architectural attention
Function Scores: sum(all_function_scores) / 10
- Normalized sum of individual function debt scores
- Provides baseline before modifiers
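The factors can be combined in a few lines of Rust. This sketch follows the conceptual formula only: it omits the normalization steps and conditional adjustments mentioned in the note above, it uses the complexity expression from the formula list, and it assumes the god object multiplier is 1.0 when no god object is detected. The `FileMetrics` struct and function names are illustrative, not Debtmap's API.

```rust
/// Illustrative inputs for the conceptual file-score formula.
pub struct FileMetrics {
    pub total_lines: usize,
    pub avg_complexity: f64,
    pub total_complexity: f64,
    pub coverage_percent: f64,         // 0.0..=1.0
    pub function_count: usize,
    pub god_object_score: Option<f64>, // Some(score) when a god object is detected
    pub function_scores: Vec<f64>,
}

pub fn size_factor(total_lines: usize) -> f64 {
    (total_lines as f64 / 100.0).sqrt()
}

pub fn complexity_factor(avg: f64, total: f64) -> f64 {
    (avg / 5.0) * (total / 50.0).sqrt()
}

pub fn coverage_factor(coverage_percent: f64) -> f64 {
    (1.0 - coverage_percent) * 2.0 + 1.0
}

pub fn density_factor(function_count: usize) -> f64 {
    if function_count > 50 {
        1.0 + (function_count - 50) as f64 * 0.02
    } else {
        1.0
    }
}

pub fn god_object_multiplier(score: Option<f64>) -> f64 {
    match score {
        Some(s) => 2.0 + s,
        None => 1.0, // Assumption: neutral multiplier when nothing is detected.
    }
}

pub fn file_score(m: &FileMetrics) -> f64 {
    let function_scores = m.function_scores.iter().sum::<f64>() / 10.0;
    size_factor(m.total_lines)
        * complexity_factor(m.avg_complexity, m.total_complexity)
        * coverage_factor(m.coverage_percent)
        * density_factor(m.function_count)
        * god_object_multiplier(m.god_object_score)
        * function_scores
}
```

For example, a 400-line file with average complexity 5, total complexity 50, 50% coverage, 75 functions, no god object, and function scores summing to 20 gives 2.0 × 1.0 × 2.0 × 1.5 × 1.0 × 2.0 = 12.0 under these assumptions.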
Use Cases
1. Planning Major Refactoring Initiatives
# Show top 10 files needing architectural refactoring
debtmap analyze . --aggregate-only --top 10
Use when:
- Planning sprint or quarterly refactoring work
- Deciding which modules to split
- Prioritizing architectural improvements
- Allocating team resources
Note: File-level scoring is enabled with the --aggregate-only flag (a boolean flag—no value needed), which changes output to show only file-level metrics instead of function-level details.
2. Identifying Architectural Issues
File-level scoring excels at finding:
- God objects with too many responsibilities
- Files with poor cohesion
- Modules that should be split
- Files with too many functions
# Focus on architectural problems
debtmap analyze . --aggregate-only --filter Architecture
3. Breaking Up Monolithic Modules
# Find files with excessive function counts
debtmap analyze . --aggregate-only --min-problematic 50
4. Evaluating Overall Codebase Health
# Generate file-level report for executive summary
debtmap analyze . --aggregate-only --format markdown -o report.md
Aggregation Methods
Debtmap supports multiple aggregation methods for file-level scores, configurable via CLI or configuration file.
Weighted Sum (Default)
Formula: Σ(function_score × complexity_weight × coverage_weight)
debtmap analyze . --aggregation-method weighted_sum
Or via configuration:
[aggregation]
method = "weighted_sum"
Characteristics:
- Weights functions by their complexity and coverage gaps
- Emphasizes high-impact functions over trivial ones
- Best for most use cases where you want to focus on significant issues
Best for: Standard codebases where you want proportional emphasis on complex, untested code
Simple Sum
Formula: Σ(function_scores)
[aggregation]
method = "sum"
Characteristics:
- Adds all function scores directly without weighting
- Treats all functions equally regardless of complexity
- Useful for broad overview and trend analysis
Best for: Getting a raw count-based view of technical debt across all functions
Logarithmic Sum
Formula: log(1 + Σ(function_scores))
[aggregation]
method = "logarithmic_sum"
Characteristics:
- Dampens impact of many small issues to prevent score explosion
- Prevents files with hundreds of minor issues from dominating
- Creates more balanced comparisons across files of different sizes
Best for: Legacy codebases with many small issues where you want to avoid extreme scores
Max Plus Average
Formula: max_score × 0.6 + avg_score × 0.4
[aggregation]
method = "max_plus_average"
Characteristics:
- Considers worst function (60%) plus average of all functions (40%)
- Balances worst-case and typical-case scenarios
- Highlights files with both a critical hot spot and general issues
Best for: Identifying files with concentrated complexity alongside general code quality concerns
Choosing an Aggregation Method
| Codebase Type | Recommended Method | Rationale |
|---|---|---|
| New/Modern | weighted_sum | Proportional emphasis on real issues |
| Legacy with many small issues | logarithmic_sum | Prevents score explosion |
| Mixed quality | max_plus_average | Balances hot spots with overall quality |
| Trend analysis | sum | Simple, consistent metric over time |
Performance Note: All aggregation methods have O(n) complexity where n = number of functions. Performance differences are negligible for typical codebases (<100k functions). Choose based on prioritization strategy, not performance concerns.
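The four aggregation formulas can be compared side by side with a small sketch. The enum and struct names here are hypothetical, the logarithm is assumed to be the natural log, and an empty function list is assumed to aggregate to 0.0.

```rust
#[derive(Clone, Copy)]
pub enum AggregationMethod {
    WeightedSum,
    Sum,
    LogarithmicSum,
    MaxPlusAverage,
}

/// One function's score plus the weights used by `WeightedSum`.
pub struct FunctionScore {
    pub score: f64,
    pub complexity_weight: f64,
    pub coverage_weight: f64,
}

pub fn aggregate(method: AggregationMethod, functions: &[FunctionScore]) -> f64 {
    if functions.is_empty() {
        return 0.0; // Assumption: no functions means no file-level debt.
    }
    let sum: f64 = functions.iter().map(|f| f.score).sum();
    match method {
        AggregationMethod::WeightedSum => functions
            .iter()
            .map(|f| f.score * f.complexity_weight * f.coverage_weight)
            .sum::<f64>(),
        AggregationMethod::Sum => sum,
        // Assumption: `log` in the formula means the natural logarithm.
        AggregationMethod::LogarithmicSum => (1.0 + sum).ln(),
        AggregationMethod::MaxPlusAverage => {
            let max = functions.iter().map(|f| f.score).fold(f64::MIN, f64::max);
            let avg = sum / functions.len() as f64;
            max * 0.6 + avg * 0.4
        }
    }
}
```

For scores of 10, 2, and 3 with all weights set to 1.0, the simple and weighted sums are both 15, while max-plus-average gives 10 × 0.6 + 5 × 0.4 = 8.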
Configuration
IMPORTANT: The configuration file must be named .debtmap.toml (not debtmap.yml or other variants) and placed in your project root directory.
[aggregation]
method = "weighted_sum"
min_problematic = 3 # Need 3+ problematic functions for file-level score
[god_object_detection]
enabled = true
max_methods = 20
max_fields = 15
max_responsibilities = 5
Function-Level Scoring
Function-level scoring identifies specific functions needing attention for targeted improvements.
Formula
Base Score = (Complexity Factor × 10 × 0.50) + (Dependency Factor × 10 × 0.25)
Coverage Multiplier = 1.0 - coverage_percent
Final Score = Base Score × Coverage Multiplier × Role Multiplier
Formula Breakdown:
- Complexity Factor: Raw complexity / 2.0, clamped to 0-10 range (complexity of 20+ maps to 10.0)
- Dependency Factor: Upstream dependency count / 2.0, capped at 10.0 (20+ dependencies map to 10.0)
- Base Score: (Complexity Factor × 10 × 0.50) + (Dependency Factor × 10 × 0.25)
- 50% weight on complexity, 25% weight on dependencies
- Coverage Multiplier: 1.0 - coverage_percent (0% coverage = 1.0, 100% coverage = 0.0)
- Final Score: Base Score × Coverage Multiplier × Role Multiplier
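The breakdown above translates directly into code. The following is a minimal sketch of the arithmetic only, with hypothetical function names; it takes coverage as a fraction between 0.0 and 1.0 and the role multiplier as a plain input.

```rust
/// Raw complexity / 2.0, clamped to the 0-10 range.
pub fn complexity_factor(raw_complexity: u32) -> f64 {
    (raw_complexity as f64 / 2.0).clamp(0.0, 10.0)
}

/// Upstream dependency count / 2.0, capped at 10.0.
pub fn dependency_factor(upstream_dependencies: u32) -> f64 {
    (upstream_dependencies as f64 / 2.0).min(10.0)
}

/// 50% weight on complexity, 25% weight on dependencies.
pub fn base_score(raw_complexity: u32, upstream_dependencies: u32) -> f64 {
    complexity_factor(raw_complexity) * 10.0 * 0.50
        + dependency_factor(upstream_dependencies) * 10.0 * 0.25
}

/// Base score scaled by the coverage gap and the role multiplier.
pub fn final_score(
    raw_complexity: u32,
    upstream_dependencies: u32,
    coverage_percent: f64,
    role_multiplier: f64,
) -> f64 {
    let coverage_multiplier = 1.0 - coverage_percent;
    base_score(raw_complexity, upstream_dependencies) * coverage_multiplier * role_multiplier
}
```

For a function with complexity 10, 4 upstream dependencies, 50% coverage, and a 1.2 role multiplier, the base score is (5 × 10 × 0.50) + (2 × 10 × 0.25) = 30, and the final score is 30 × 0.5 × 1.2 = 18. A fully covered function scores 0 regardless of its complexity.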
Why Hard-Coded Weights? The base weights (0.50 for complexity, 0.25 for dependencies) are intentionally not configurable to:
- Ensure consistency: Scores remain comparable across projects and teams
- Prevent instability: Avoid extreme configurations that break prioritization
- Simplify configuration: Reduce cognitive load for users
- Maintain calibration: Weights are empirically tuned based on analysis of real codebases
You can still customize prioritization significantly through configurable role_multipliers, coverage_weights, and normalization settings.
Note: Coverage acts as a dampening multiplier rather than an additive factor. Lower coverage (higher multiplier) increases the final score, making untested complex code a higher priority. Role multipliers and coverage weights remain configurable to allow customization while maintaining stable base calculations.
Migration Note: Earlier versions used an additive model with weights (Complexity × 0.35) + (Coverage × 0.50) + (Dependency × 0.15). The current model (spec 122) uses coverage as a multiplicative dampener, which better reflects that testing gaps amplify existing complexity rather than adding to it.
Metrics
Cyclomatic Complexity
- Counts decision points (if, match, loops)
- Guides test case count
Cognitive Complexity
- Measures understanding difficulty
- Accounts for nesting depth
Coverage Percentage
- Direct line coverage from LCOV
- 0% coverage = maximum urgency
Dependency Count
- Upstream callers + downstream callees
- Higher dependencies = higher impact
Role Multiplier
Functions are classified by role, and each role receives a multiplier based on its architectural importance:
| Role | Multiplier | Description |
|---|---|---|
| Pure logic | 1.2x | Core business rules and algorithms |
| Unknown | 1.0x | Functions without clear classification |
| Entry point | 0.9x | Public APIs, main functions, HTTP handlers |
| Orchestrator | 0.8x | Functions that coordinate other functions |
| IO wrapper | 0.7x | Simple file/network I/O wrappers |
| Pattern match | 0.6x | Functions primarily doing pattern matching |
Note: Role multipliers are configurable via the [role_multipliers] section in .debtmap.toml. The multipliers have been rebalanced to be less extreme than earlier versions - pure logic was reduced from 1.5x to 1.2x, while orchestrator and IO wrapper were increased to better reflect their importance in modern codebases.
Constructor Detection
Debtmap includes intelligent constructor detection to prevent false positives where trivial initialization functions are misclassified as critical business logic.
Problem: Simple constructors like new(), default(), or from_config() often have low complexity but were being flagged as high-priority pure logic functions.
Solution: Constructor detection automatically identifies and classifies these functions as IOWrapper (low priority) instead of PureLogic (high priority).
Detection Criteria:
A function is considered a simple constructor if it meets ALL of the following:
- Name matches a constructor pattern (configurable):
  - Exact match: new, default, empty, zero, any
  - Prefix match: from_*, with_*, create_*, make_*, build_*, of_*
- Low cyclomatic complexity (≤ 2 by default)
- Short length (< 15 lines by default)
- Minimal nesting (≤ 1 level by default)
- Low cognitive complexity (≤ 3 by default)
Example:
// Simple constructor - detected and classified as IOWrapper
fn new() -> Self {
    Self {
        field1: 0,
        field2: String::new(),
    }
}
// Complex factory - NOT detected as constructor, remains PureLogic
fn create_with_validation(data: Data) -> Result<Self> {
    validate(&data)?;
    // ... 30 lines of logic
    Ok(Self { ... })
}
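The detection criteria listed above amount to a conjunction of a name check and four threshold checks. The sketch below shows that predicate in isolation. `ConstructorRules`, `FunctionShape`, and `is_simple_constructor` are hypothetical names, and the real detector may also use AST information that is not modeled here.

```rust
/// Hypothetical rule set mirroring the documented defaults.
pub struct ConstructorRules {
    pub exact_names: Vec<&'static str>,
    pub prefixes: Vec<&'static str>,
    pub max_cyclomatic: u32,
    pub max_cognitive: u32,
    pub max_length: usize,
    pub max_nesting: u32,
}

impl Default for ConstructorRules {
    fn default() -> Self {
        Self {
            exact_names: vec!["new", "default", "empty", "zero", "any"],
            prefixes: vec!["from_", "with_", "create_", "make_", "build_", "of_"],
            max_cyclomatic: 2,
            max_cognitive: 3,
            max_length: 15,
            max_nesting: 1,
        }
    }
}

/// The per-function measurements the predicate needs.
pub struct FunctionShape<'a> {
    pub name: &'a str,
    pub cyclomatic: u32,
    pub cognitive: u32,
    pub length: usize,
    pub nesting: u32,
}

/// True only when the name matches AND every threshold is satisfied.
pub fn is_simple_constructor(f: &FunctionShape, rules: &ConstructorRules) -> bool {
    let name_matches = rules.exact_names.iter().any(|n| *n == f.name)
        || rules.prefixes.iter().any(|p| f.name.starts_with(*p));

    name_matches
        && f.cyclomatic <= rules.max_cyclomatic
        && f.cognitive <= rules.max_cognitive
        && f.length < rules.max_length // "< 15 lines" per the criteria above
        && f.nesting <= rules.max_nesting
}
```

With the default rules, a six-line `new` with no branching qualifies, while a 35-line `create_with_validation` fails the length and complexity checks even though its name matches the `create_` prefix.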
Configuration:
Constructor detection is fully configurable in .debtmap.toml:
[classification.constructors]
# Enable AST-based constructor detection (default: true)
# When enabled, uses Abstract Syntax Tree analysis for accurate detection
# Disable only if experiencing performance issues with very large codebases
ast_detection = true
# Constructor name patterns
patterns = [
"new",
"default",
"from_",
"with_",
"create_",
"make_",
"build_",
"of_",
"empty",
"zero",
"any",
]
# Complexity thresholds
max_cyclomatic = 2 # Maximum cyclomatic complexity
max_cognitive = 3 # Maximum cognitive complexity
max_length = 15 # Maximum lines
max_nesting = 1 # Maximum nesting depth
Customization Example:
To add custom constructor patterns or adjust thresholds:
[classification.constructors]
ast_detection = true # Keep AST detection enabled (recommended)
patterns = [
"new",
"default",
"from_",
"with_",
"init_", # Add custom pattern
"setup_", # Add custom pattern
]
max_cyclomatic = 3 # Allow slightly more complex constructors
max_length = 20 # Allow longer constructors
To disable AST-based detection (if experiencing performance issues):
[classification.constructors]
ast_detection = false # Fall back to pattern-only matching
# Note: May reduce detection accuracy but improves performance
Performance and Disabling:
Constructor detection is always enabled and cannot be fully disabled, as it’s integral to accurate priority scoring. However, you can:
- Disable AST analysis (shown above): Falls back to pattern-only matching, reducing accuracy but improving performance for very large codebases (100k+ functions)
- Adjust thresholds: Make detection more lenient by increasing max_cyclomatic, max_cognitive, or max_length
- Remove patterns: Delete specific patterns from the patterns list to exclude them from detection
Performance Impact:
- AST-based detection: Negligible impact (<5% overhead) for typical codebases
- Pattern-only detection: Near-zero performance impact
- Recommendation: Keep ast_detection = true unless profiling shows it’s a bottleneck
Accuracy Trade-offs:
- With AST: 95%+ accuracy in identifying simple constructors
- Without AST: ~70% accuracy, more false negatives
This feature is part of spec 117 and helps reduce false positives in priority scoring.
Role-Based Adjustments
Debtmap uses a two-stage role adjustment mechanism so that scores reflect both the testing strategy appropriate for each function type and the architectural importance of different roles.
Why Role-Based Adjustments?
Problem: Traditional scoring treats all functions equally, leading to false positives:
- Entry points (CLI handlers, HTTP routes, main functions) typically use integration tests rather than unit tests
  - Flagging them for “low unit test coverage” misses that they’re tested differently
  - They orchestrate other code but contain minimal business logic
- Pure business logic functions should have comprehensive unit tests
  - Easy to test in isolation with deterministic inputs/outputs
  - Core value of the application lives here
- I/O wrappers are often tested implicitly through integration tests
  - Thin abstractions over file system, network, or database operations
  - Unit testing them provides limited value compared to integration testing
Solution: Debtmap applies role-based adjustments in two stages to address both coverage expectations and architectural importance.
Stage 1: Role-Based Coverage Weighting
The first stage adjusts coverage penalty expectations based on function role. This prevents functions that use different testing strategies from unfairly dominating the priority list.
How It Works:
For each function, Debtmap:
- Detects the function’s role (entry point, pure logic, I/O wrapper, etc.)
- Applies a coverage weight multiplier based on that role
- Reduces or increases the coverage penalty accordingly
Default Coverage Weights (configurable in .debtmap.toml):
| Function Role | Coverage Weight | Impact on Scoring |
|---|---|---|
| Pure Logic | 1.2 | Higher coverage penalty (should have unit tests) |
| Unknown | 1.0 | Standard penalty |
| Pattern Match | 1.0 | Standard penalty |
| Orchestrator | 0.8 | Reduced penalty (partially integration tested) |
| I/O Wrapper | 0.7 | Reduced penalty (often integration tested) |
| Entry Point | 0.6 | Significantly reduced penalty (integration tested) |
Example Score Changes:
Before role-based coverage adjustment:
Function: handle_request (Entry Point)
Complexity: 5
Coverage: 0%
Raw Coverage Penalty: 1.0 (full penalty)
Score: 8.5 (flagged as high priority)
After role-based coverage adjustment:
Function: handle_request (Entry Point)
Complexity: 5
Coverage: 0%
Adjusted Coverage Penalty: 0.6 (40% reduction via 0.6 weight)
Score: 4.2 (medium priority - more realistic)
Rationale: Entry points are integration tested, not unit tested.
This function is likely tested via API/CLI integration tests.
Comparison with Pure Logic:
Function: calculate_discount (Pure Logic)
Complexity: 5
Coverage: 0%
Adjusted Coverage Penalty: 1.2 (20% increase via 1.2 weight)
Score: 9.8 (critical priority)
Rationale: Pure logic should have unit tests.
This function needs immediate test coverage.
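A minimal sketch of Stage 1, assuming the role weight is applied as a straight multiplier on the raw coverage penalty (1.0 - coverage). The `Role` enum and function names are illustrative, and the weights are the defaults from the table above.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Role {
    PureLogic,
    Unknown,
    PatternMatch,
    Orchestrator,
    IoWrapper,
    EntryPoint,
}

/// Default coverage weights from the table above.
pub fn coverage_weight(role: Role) -> f64 {
    match role {
        Role::PureLogic => 1.2,
        Role::Unknown => 1.0,
        Role::PatternMatch => 1.0,
        Role::Orchestrator => 0.8,
        Role::IoWrapper => 0.7,
        Role::EntryPoint => 0.6,
    }
}

/// Assumption: adjusted penalty = raw coverage gap × role weight.
pub fn adjusted_coverage_penalty(coverage_percent: f64, role: Role) -> f64 {
    (1.0 - coverage_percent) * coverage_weight(role)
}
```

Under this assumption, an uncovered entry point carries a penalty of 0.6 while an uncovered pure-logic function carries 1.2, so the pure-logic function is penalized twice as heavily for the same coverage gap.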
Stage 2: Role Multiplier
The second stage applies a final role-based multiplier to reflect architectural importance. This multiplier is clamped by default to prevent extreme score swings.
Configuration (.debtmap.toml under [scoring.role_multiplier]):
[scoring.role_multiplier]
clamp_min = 0.3 # Minimum multiplier (default: 0.3)
clamp_max = 1.8 # Maximum multiplier (default: 1.8)
enable_clamping = true # Enable clamping (default: true)
Clamp Range Rationale:
- Default [0.3, 1.8]: Balances differentiation with stability
- Lower bound (0.3): I/O wrappers still contribute 30% of base score (not invisible)
- Upper bound (1.8): Critical entry points don’t overwhelm other issues (max 180%)
- Configurable: Adjust based on project priorities
Example with Clamping:
Function: process_data (Complex Pure Logic)
Base Score: 45.0
Unclamped Role Multiplier: 2.5
Clamped Multiplier: 1.8 (clamp_max)
Final Score: 45.0 × 1.8 = 81.0
Effect: Prevents one complex function from dominating entire priority list
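The clamping step itself is small. Here is a sketch using the documented defaults, with a hypothetical `RoleMultiplierConfig` struct that mirrors the three TOML keys above.

```rust
/// Mirrors the [scoring.role_multiplier] keys; the struct itself is hypothetical.
pub struct RoleMultiplierConfig {
    pub clamp_min: f64,
    pub clamp_max: f64,
    pub enable_clamping: bool,
}

impl Default for RoleMultiplierConfig {
    fn default() -> Self {
        Self { clamp_min: 0.3, clamp_max: 1.8, enable_clamping: true }
    }
}

/// Restricts the raw multiplier to [clamp_min, clamp_max] when clamping is on.
pub fn effective_multiplier(raw: f64, cfg: &RoleMultiplierConfig) -> f64 {
    if cfg.enable_clamping {
        raw.clamp(cfg.clamp_min, cfg.clamp_max)
    } else {
        raw
    }
}

/// Applies the (possibly clamped) multiplier to a base score.
pub fn apply_role_multiplier(base_score: f64, raw: f64, cfg: &RoleMultiplierConfig) -> f64 {
    base_score * effective_multiplier(raw, cfg)
}
```

With the defaults, `apply_role_multiplier(45.0, 2.5, &RoleMultiplierConfig::default())` reproduces the example above: the multiplier is clamped to 1.8 and the final score is 81.0.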
Why Two Stages?
The separation of coverage weight adjustment and role multiplier ensures they work together without interfering:
Stage 1 (Coverage Weight): Adjusts testing expectations
- Question: “How much should we penalize missing unit tests for this type of function?”
- Example: Entry points get 60% of normal coverage penalty (they’re integration tested)
Stage 2 (Role Multiplier): Adjusts architectural importance
- Question: “How important is this function relative to others with similar complexity?”
- Example: Critical entry points might get a 1.2x multiplier, while simple I/O wrappers get 0.5x (both fall within the default clamp range, so neither is altered by clamping)
Independent Contributions:
1. Calculate base score from complexity + dependencies
2. Apply coverage weight by role → adjusted coverage penalty
3. Combine into preliminary score
4. Apply clamped role multiplier → final score
This approach ensures:
- Coverage adjustments don’t interfere with role multiplier
- Both mechanisms contribute independently
- Clamping prevents instability from extreme multipliers
How This Reduces False Positives
False Positive #1: Entry Points Flagged for Low Coverage
Before:
Top Priority Items:
1. main() - Score: 9.2 (0% unit test coverage)
2. handle_cli_command() - Score: 8.8 (5% unit test coverage)
3. run_server() - Score: 8.5 (0% unit test coverage)
After:
Top Priority Items:
1. calculate_tax() - Score: 9.8 (0% coverage, Pure Logic)
2. validate_payment() - Score: 9.2 (10% coverage, Pure Logic)
3. main() - Score: 4.2 (0% coverage, Entry Point - integration tested)
Result: Business logic functions that actually need unit tests rise to the top.
False Positive #2: I/O Wrappers Over-Prioritized
Before:
Function: read_config_file
Complexity: 3
Coverage: 0%
Score: 7.5 (high priority)
Issue: This is a thin wrapper over std::fs::read_to_string.
Unit testing it provides minimal value vs integration tests.
After:
Function: read_config_file
Complexity: 3
Coverage: 0%
Adjusted Coverage Weight: 0.7
Score: 3.2 (low priority)
Rationale: I/O wrappers are integration tested.
Focus on business logic instead.
Configuration Examples
Emphasize Pure Logic Testing:
[scoring.role_coverage_weights]
pure_logic = 1.5 # Strong penalty for untested pure logic
entry_point = 0.5 # Minimal penalty for untested entry points
io_wrapper = 0.5 # Minimal penalty for untested I/O wrappers
Conservative Approach (Smaller Adjustments):
[scoring.role_coverage_weights]
pure_logic = 1.1 # Slight increase
entry_point = 0.9 # Slight decrease
io_wrapper = 0.9 # Slight decrease
Disable Multiplier Clamping (not recommended for production):
[scoring.role_multiplier]
enable_clamping = false # Allow unclamped multipliers
# Warning: May cause unstable prioritization
Verification
To see how role-based adjustments affect your codebase:
# Show detailed scoring breakdown
debtmap analyze . --verbose
# Compare with role adjustments disabled
debtmap analyze . --config minimal.toml
Sample verbose output:
Function: src/handlers/request.rs:handle_request
Role: Entry Point
Complexity: 5
Coverage: 0%
Coverage Weight: 0.6 (Entry Point adjustment)
Adjusted Coverage Penalty: 0.6 (reduced from 1.0)
Base Score: 15.0
Role Multiplier: 1.2 (clamped from 1.5)
Final Score: 18.0
Interpretation:
- Entry point gets 60% coverage penalty instead of 100%
- Likely tested via integration tests
- Still flagged due to complexity, but not over-penalized for coverage
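The clamping step in the sample can be sketched as follows. This is a minimal illustration, not Debtmap's implementation: the function name is hypothetical, and the clamp range of 0.8 to 1.2 is an assumption chosen so that a raw multiplier of 1.5 is clamped to 1.2, as in the sample output.

```rust
/// Hypothetical helper: clamp the role multiplier to [min, max],
/// then apply it to the base score.
fn apply_role_multiplier(base_score: f64, multiplier: f64, min: f64, max: f64) -> f64 {
    base_score * multiplier.clamp(min, max)
}

fn main() {
    // Assumed clamp range; the sample output only implies an upper bound of 1.2.
    let final_score = apply_role_multiplier(15.0, 1.5, 0.8, 1.2);
    println!("Final Score: {:.1}", final_score);
}
```

Because the multiplier is clamped before it is applied, an extreme configured value changes the ranking by at most the clamp bound.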
Benefits Summary
- Fewer false positives: Entry points and I/O wrappers no longer dominate priority lists
- Better resource allocation: Testing efforts focus on pure logic where unit tests provide most value
- Recognition of testing strategies: Integration tests are valued equally with unit tests
- Stable prioritization: Clamping prevents extreme multipliers from causing volatile rankings
- Configurable: Adjust weights and clamp ranges to match your project’s testing philosophy
Use Cases
1. Identifying Specific Hot Spots
# Show top 20 functions needing attention
debtmap analyze . --top 20
Use when:
- Planning individual developer tasks
- Assigning specific refactoring work
- Identifying functions to test first
- Code review focus
2. Sprint Planning for Developers
# Get function-level tasks for this sprint
debtmap analyze . --top 10 --format json -o sprint-tasks.json
3. Writing Unit Tests
# Find untested complex functions
debtmap analyze . --lcov coverage.lcov --filter Testing --top 15
4. Targeted Performance Optimization
# Find complex hot paths
debtmap analyze . --filter Performance --context --top 10
Configuration
Complete configuration file example showing all scoring-related sections.
File name: .debtmap.toml (must be placed in your project root)
# .debtmap.toml - Complete scoring configuration
# Role multipliers (applied to final score after coverage multiplier)
[role_multipliers]
pure_logic = 1.2 # Core business rules and algorithms
unknown = 1.0 # Functions without clear classification
entry_point = 0.9 # Public APIs, main functions, HTTP handlers
orchestrator = 0.8 # Functions that coordinate other functions
io_wrapper = 0.7 # File/network I/O wrappers
pattern_match = 0.6 # Functions primarily doing pattern matching
# Aggregation settings (for file-level scoring)
[aggregation]
method = "weighted_sum" # Options: weighted_sum, sum, logarithmic_sum, max_plus_average
min_problematic = 3 # Minimum number of problematic functions to report file
# Normalization settings (for advanced multi-phase normalization)
[normalization]
linear_threshold = 10.0 # Scores below this use linear scaling (1:1 mapping)
logarithmic_threshold = 100.0 # Scores above this use logarithmic dampening
sqrt_multiplier = 3.33 # Applied to scores between linear and log thresholds
log_multiplier = 10.0 # Applied to scores above logarithmic threshold
show_raw_scores = true # Display both normalized (0-100) and raw scores in output
Note on Scoring Weights: The base complexity and dependency weights are hard-coded for consistency across environments. However, you can customize prioritization significantly through configurable options:
What’s Configurable:
- role_multipliers - Adjust importance of different function types (pure logic, entry points, I/O wrappers)
- coverage_weights - Role-specific coverage penalty adjustments
- normalization settings - Control score scaling and range
- aggregation.method - Choose how function scores combine into file scores
What’s Hard-Coded:
- Base complexity weight (50%) and dependency weight (25%)
- Coverage multiplier formula: 1.0 - coverage_percent
Impact: While base weights are fixed, the configurable multipliers and weights provide significant control over final rankings and priorities. A function with role_multiplier = 1.5 and coverage_weight = 1.2 can have 80% higher priority than the same function with default settings.
Note: The configuration file must be named .debtmap.toml (not debtmap.yml or other variants) and placed in your project root directory.
When to Use Each Approach
Use File-Level Scoring When:
✅ Planning architectural refactoring
✅ Quarterly or annual planning
✅ Deciding which modules to split
✅ Executive summaries and high-level reports
✅ Team capacity planning
✅ Identifying god objects
✅ Module reorganization
Command:
debtmap analyze . --aggregate-only
Use Function-Level Scoring When:
✅ Sprint planning
✅ Individual developer task assignment
✅ Writing specific unit tests
✅ Code review preparation
✅ Pair programming sessions
✅ Daily or weekly development work
✅ Targeted hot spot fixes
Command:
debtmap analyze . --top 20
Use Both Together:
Many workflows benefit from both views:
# Step 1: Identify problematic files
debtmap analyze . --aggregate-only --top 5 --format json -o files.json
# Step 2: Drill into specific file
debtmap analyze src/problematic/module.rs --format terminal
Comparison Examples
Example 1: God Object Detection
Command:
debtmap analyze src/services/user_service.rs --aggregate-only
File-Level View:
src/services/user_service.rs - Score: 245.8
- 850 lines, 45 methods
- God Object: 78% score
- Action: Split into UserAuth, UserProfile, UserNotifications
Command:
debtmap analyze src/services/user_service.rs --top 5
Function-Level View:
src/services/user_service.rs:142 - authenticate_user() - Score: 8.5
src/services/user_service.rs:298 - update_profile() - Score: 7.2
src/services/user_service.rs:456 - send_notification() - Score: 6.8
Decision: File-level score (245.8) correctly identifies architectural issue. Individual functions aren’t exceptionally complex, but the file has too many responsibilities. Solution: Split the file.
Example 2: Targeted Function Fix
Command:
debtmap analyze src/parsers/expression.rs --aggregate-only
File-Level View:
src/parsers/expression.rs - Score: 45.2
- 320 lines, 12 functions
- No god object detected
Command:
debtmap analyze src/parsers/expression.rs --top 5
Function-Level View:
src/parsers/expression.rs:89 - parse_complex_expression() - Score: 9.1
- Cyclomatic: 22, Cognitive: 35
- Coverage: 0%
- Action: Add tests and refactor
Decision: File as a whole is acceptable, but one function needs attention. Solution: Focus on that specific function.
Example 3: Balanced Refactoring
Command:
debtmap analyze src/analysis/scoring.rs --aggregate-only --coverage-file coverage.lcov
File-Level View:
src/analysis/scoring.rs - Score: 125.6
- 580 lines, 18 functions
- High complexity, low coverage
Command:
debtmap analyze src/analysis/scoring.rs --coverage-file coverage.lcov --top 5
Function-Level View:
calculate_score() - Score: 8.8 (15% coverage)
apply_weights() - Score: 8.2 (10% coverage)
normalize_results() - Score: 7.5 (0% coverage)
Decision: Both file and functions need work. Solution: Add tests first (function-level), then consider splitting if complexity persists (file-level).
Score Normalization
Both scoring approaches normalize to a 0-100 scale for consistency.
Normalization Strategies
Default: Linear Clamping
The default normalization uses simple linear clamping to the 0-100 range:
- Formula: Score is clamped between 0.0 and 100.0
- Behavior: No transformation, just boundary enforcement
- Usage: Production output uses this method
This ensures scores stay within the expected range without additional transformations.
Advanced: Multi-Phase Normalization
For more sophisticated normalization, debtmap provides multi-phase scaling with different formulas for different score ranges:
Phase 1 - Linear (scores < 10):
- Formula: normalized = raw_score
- Behavior: 1:1 mapping, no scaling
- Rationale: Preserve low score distinctions
Phase 2 - Square Root (scores 10-100):
- Formula: normalized = 10.0 + sqrt(raw_score - 10.0) × 3.33
- Behavior: Moderate dampening
- Rationale: Balance between linear and logarithmic
Phase 3 - Logarithmic (scores > 100):
- Formula: normalized = 41.59 + ln(raw_score / 100.0) × 10.0
- Behavior: Strong dampening of extreme values
- Rationale: Prevent outliers from dominating
This multi-phase approach dampens extreme values while preserving distinctions in the normal range. Configure via [normalization] section in .debtmap.toml.
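The three phases can be written as one piecewise function. The sketch below uses the documented default thresholds and multipliers; the function name and the handling of the exact boundary values (a score of exactly 10 or 100 falls into the square-root phase) are assumptions. The Phase 3 offset is computed from the Phase 2 formula at the threshold, which is where the constant 41.59 comes from.

```rust
/// Sketch of multi-phase normalization with the documented defaults.
fn normalize(raw_score: f64) -> f64 {
    const LINEAR_THRESHOLD: f64 = 10.0;
    const LOGARITHMIC_THRESHOLD: f64 = 100.0;
    const SQRT_MULTIPLIER: f64 = 3.33;
    const LOG_MULTIPLIER: f64 = 10.0;

    if raw_score < LINEAR_THRESHOLD {
        // Phase 1: 1:1 mapping
        raw_score
    } else if raw_score <= LOGARITHMIC_THRESHOLD {
        // Phase 2: square-root dampening
        LINEAR_THRESHOLD + (raw_score - LINEAR_THRESHOLD).sqrt() * SQRT_MULTIPLIER
    } else {
        // Phase 3: logarithmic dampening, continuing from the Phase 2 value
        // at the threshold (10 + sqrt(90) × 3.33, about 41.59)
        let offset = LINEAR_THRESHOLD
            + (LOGARITHMIC_THRESHOLD - LINEAR_THRESHOLD).sqrt() * SQRT_MULTIPLIER;
        offset + (raw_score / LOGARITHMIC_THRESHOLD).ln() * LOG_MULTIPLIER
    }
}

fn main() {
    for raw in [5.0_f64, 10.0, 55.0, 100.0, 1000.0] {
        println!("raw {:>7.1} -> normalized {:.2}", raw, normalize(raw));
    }
}
```

Deriving the offset instead of hard-coding it keeps the curve continuous if the thresholds or multipliers are changed.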
Configuration
[normalization]
linear_threshold = 10.0 # Scores below this use linear scaling (1:1 mapping)
logarithmic_threshold = 100.0 # Scores above this use logarithmic dampening
sqrt_multiplier = 3.33 # Applied to scores between linear and log thresholds
log_multiplier = 10.0 # Applied to scores above logarithmic threshold
show_raw_scores = true # Display both normalized (0-100) and raw scores in output
Explanation:
- linear_threshold: Scores below this value are mapped 1:1 (no scaling)
- logarithmic_threshold: Scores above this value are dampened logarithmically to prevent extreme values
- sqrt_multiplier: Square root scaling applied to mid-range scores (between linear and logarithmic thresholds)
- log_multiplier: Logarithmic dampening factor for very high scores
- show_raw_scores: When enabled, output includes both the normalized 0-100 score and the raw calculated score
Best Practices
Workflow Integration
Week 1: File-Level Assessment
# Identify architectural problems
debtmap analyze . --aggregate-only --top 10
Week 2-4: Function-Level Work
# Work through specific functions
debtmap analyze src/target/module.rs
Monthly: Compare Progress
debtmap compare --before baseline.json --after current.json
Team Collaboration
- Architects: Use file-level scores for strategic planning
- Tech Leads: Use both for sprint planning
- Developers: Use function-level for daily work
- QA: Use function-level for test prioritization
CI/CD Integration
# Gate: No new file-level regressions
debtmap analyze . --aggregate-only --format json -o file-scores.json
# Gate: No new critical function-level issues
debtmap analyze . --min-priority critical --format json -o critical-items.json
Troubleshooting
Issue: File-level scores seem too high
Solution: Check aggregation method:
[aggregation]
method = "logarithmic_sum" # Dampen scores
Issue: Function-level scores all similar
Solution: Adjust role multipliers to create more differentiation:
[role_multipliers]
pure_logic = 1.5 # Emphasize business logic more
io_wrapper = 0.5 # De-emphasize I/O wrappers more
Note: Base scoring weights (complexity 50%, dependency 25%) are hard-coded and cannot be configured.
Issue: Too many low-priority items
Solution: Use minimum thresholds:
[thresholds]
minimum_debt_score = 3.0
Rebalanced Debt Scoring (Spec 136)
Debtmap now includes an advanced rebalanced scoring algorithm that prioritizes actual code quality issues (complexity, coverage gaps, and structural problems) over pure file size concerns.
Enabling Rebalanced Scoring
IMPORTANT: Rebalanced scoring is enabled through your .debtmap.toml configuration file, not via CLI flags. Add the [scoring_rebalanced] section to activate it.
Default Behavior: By default, debtmap uses the standard scoring algorithm described earlier in this chapter. To use rebalanced scoring, add the [scoring_rebalanced] section to your config:
# .debtmap.toml
[scoring_rebalanced]
preset = "balanced" # Activates rebalanced scoring with balanced preset
Relationship to Standard Scoring:
- Rebalanced scoring supplements standard scoring, providing an alternative prioritization strategy
- Both algorithms can coexist - choose which to use based on your needs
- File-level and function-level scoring both work with rebalanced scoring
- Output format remains the same, only score calculations differ
Migration Path:
- Test first: Add the [scoring_rebalanced] section to a test config file
- Compare: Run analysis with both standard and rebalanced scoring on the same codebase
- Evaluate: Review how priorities change (large simple files rank lower, complex untested code ranks higher)
- Adopt: Once satisfied, switch your primary config to use rebalanced scoring
- Tune: Adjust preset or custom weights based on your team’s priorities
Quick Start:
# Create test config with rebalanced scoring
cat > .debtmap-rebalanced.toml <<EOF
[scoring_rebalanced]
preset = "balanced"
EOF
# Compare results
debtmap analyze . --format terminal # Standard scoring
debtmap analyze . --config .debtmap-rebalanced.toml --format terminal # Rebalanced scoring
Philosophy
Traditional scoring often over-emphasizes file size, causing large but simple files to rank higher than complex, untested code. The rebalanced algorithm fixes this by:
- De-emphasizing size: Reduces size weight from ~1.5 to 0.3 (80% reduction)
- Emphasizing quality: Increases weights for complexity (1.0) and coverage gaps (1.0)
- Additive bonuses: Provides +20 bonus for complex + untested code (not multiplicative)
- Context-aware thresholds: Integrates with file type classification from Spec 135
Multi-Dimensional Scoring
The rebalanced algorithm computes five scoring components:
| Component | Weight | Range | Description |
|---|---|---|---|
| Complexity | 1.0 | 0-100 | Cyclomatic + cognitive complexity |
| Coverage Gap | 1.0 | 0-80 | Testing coverage deficit with complexity bonus |
| Structural | 0.8 | 0-60 | God objects and architectural issues |
| Size | 0.3 | 0-30 | File size (reduced from previous ~1.5) |
| Code Smells | 0.6 | 0-40 | Long functions, deep nesting, impure logic |
Weighted Total Formula:
weighted_total = (complexity × 1.0) + (coverage × 1.0) + (structural × 0.8)
+ (size × 0.3) + (smells × 0.6)
normalized_score = (weighted_total / 237.0) × 200.0 // Normalize to 0-200 range
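As a worked illustration of the formula, the sketch below computes the weighted total and normalized score for one set of component scores using the balanced weights. The struct and function names are hypothetical and the input values are made up for the example; only the weights and the 237.0 / 200.0 constants come from the formula above.

```rust
/// Component scores, one per row of the table above (names are illustrative).
struct Components {
    complexity: f64,
    coverage_gap: f64,
    structural: f64,
    size: f64,
    smells: f64,
}

/// Per-component weights.
struct Weights {
    complexity: f64,
    coverage: f64,
    structural: f64,
    size: f64,
    smells: f64,
}

impl Weights {
    /// Weights of the documented "balanced" preset.
    fn balanced() -> Self {
        Weights { complexity: 1.0, coverage: 1.0, structural: 0.8, size: 0.3, smells: 0.6 }
    }
}

fn weighted_total(c: &Components, w: &Weights) -> f64 {
    c.complexity * w.complexity
        + c.coverage_gap * w.coverage
        + c.structural * w.structural
        + c.size * w.size
        + c.smells * w.smells
}

fn normalized_score(weighted_total: f64) -> f64 {
    weighted_total / 237.0 * 200.0
}

fn main() {
    // Made-up component scores for illustration only.
    let c = Components { complexity: 50.0, coverage_gap: 40.0, structural: 10.0, size: 20.0, smells: 10.0 };
    let total = weighted_total(&c, &Weights::balanced());
    // 50 + 40 + 8 + 6 + 6 = 110
    println!("weighted_total = {:.1}, normalized = {:.1}", total, normalized_score(total));
}
```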
Scoring Presets
Debtmap provides four presets for different prioritization strategies:
Balanced (Default)
[scoring_rebalanced]
preset = "balanced"
Weights:
- Complexity: 1.0, Coverage: 1.0, Structural: 0.8, Size: 0.3, Smells: 0.6
Use when: Standard development with focus on actual code quality
Quality-Focused
[scoring_rebalanced]
preset = "quality-focused"
Weights:
- Complexity: 1.2, Coverage: 1.1, Structural: 0.9, Size: 0.2, Smells: 0.7
Use when: Maximum emphasis on code quality, minimal concern for file size
Test-Coverage-Focused
[scoring_rebalanced]
preset = "test-coverage"
Weights:
- Complexity: 0.8, Coverage: 1.3, Structural: 0.6, Size: 0.2, Smells: 0.5
Use when: Prioritizing test coverage improvements
Size-Focused (Legacy)
[scoring_rebalanced]
preset = "size-focused"
Weights:
- Complexity: 0.5, Coverage: 0.4, Structural: 0.6, Size: 1.5, Smells: 0.3
Use when: Maintaining legacy scoring behavior, file size is primary concern
Custom Weights
You can define custom weights in .debtmap.toml:
[scoring_rebalanced]
complexity_weight = 1.2
coverage_weight = 1.0
structural_weight = 0.8
size_weight = 0.2
smell_weight = 0.7
Severity Levels
The rebalanced algorithm assigns severity based on normalized score and risk factors:
| Severity | Criteria | Description |
|---|---|---|
| CRITICAL | Score > 120 OR (complexity > 60 AND coverage gap > 40) | Requires immediate attention |
| HIGH | Score > 80 OR (complexity > 40 AND coverage gap > 20) OR structural > 50 | High priority for next sprint |
| MEDIUM | Score > 40 OR single moderate issue | Plan for future sprint |
| LOW | Everything else | Minor concerns, size-only issues |
Evaluation Logic: Severity levels are checked from CRITICAL downward, and the first level whose criteria match is assigned. Within a level, the conditions are combined with logical OR, so an item needs to satisfy only ONE condition to qualify. For example, a function with score=90 is HIGH severity even if complexity and coverage are both low, because it meets the “Score > 80” condition.
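The first-match evaluation can be sketched as a chain of checks from most to least severe. This is an illustration under stated assumptions, not Debtmap's code: the names are hypothetical, "coverage" in the table is taken to mean the Coverage Gap component score, and the "single moderate issue" condition for MEDIUM is not modeled because the table does not define it precisely.

```rust
#[derive(Debug, PartialEq)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
}

/// Check levels from most to least severe; the first match wins.
fn classify(score: f64, complexity: f64, coverage_gap: f64, structural: f64) -> Severity {
    if score > 120.0 || (complexity > 60.0 && coverage_gap > 40.0) {
        Severity::Critical
    } else if score > 80.0 || (complexity > 40.0 && coverage_gap > 20.0) || structural > 50.0 {
        Severity::High
    } else if score > 40.0 {
        // The "single moderate issue" condition is omitted in this sketch.
        Severity::Medium
    } else {
        Severity::Low
    }
}

fn main() {
    // Score alone qualifies for HIGH, as in the example in the text.
    println!("{:?}", classify(90.0, 0.0, 0.0, 0.0));
    // Component scores alone can qualify for CRITICAL even when the score is below 120.
    println!("{:?}", classify(95.3, 100.0, 57.2, 0.0));
}
```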
Example Prioritization
Complex Untested Function (HIGH priority):
fn process_payment(cart: &Cart, user: &User) -> Result<Receipt> {
    // 150 lines, cyclomatic: 42, cognitive: 77
    // Coverage: 38%
    // Rebalanced Score:
    // - Complexity: 100.0 (very high)
    // - Coverage: 57.2 (gap × 0.6 + 20 bonus for complex+untested)
    // - Structural: 0.0
    // - Size: 0.0 (function-level scoring)
    // - Smells: 25.0 (long function)
    // Total: 95.3 → CRITICAL severity
}
Large Simple Function (LOW priority):
fn format_report(data: &ReportData) -> String {
    // 2000 lines, cyclomatic: 3, cognitive: 5
    // Coverage: 100%
    // Rebalanced Score:
    // - Complexity: 0.0 (trivial)
    // - Coverage: 0.0 (well tested)
    // - Structural: 0.0
    // - Size: 0.0 (function-level scoring)
    // - Smells: 15.0 (long but simple)
    // Total: 3.2 → LOW severity
}
Result: Complex untested code ranks 30× higher than large simple code.
Integration with File Classification (Spec 135)
The rebalanced scoring integrates with context-aware file size thresholds:
use debtmap::organization::file_classifier::{classify_file, get_threshold};

let file_type = classify_file(source, path);
let threshold = get_threshold(&file_type, function_count, lines);
// Apply context-aware scoring:
// - Generated code: 0.1× size multiplier
// - Test code: Lenient thresholds (650 lines)
// - Business logic: Strict thresholds (400 lines)
Generated Code Detection
The rebalanced scoring automatically detects and reduces scores for generated code:
Detection Markers (first 20 lines):
- “DO NOT EDIT”
- “automatically generated”
- “AUTO-GENERATED”
- “@generated”
- “Code generated by”
Generated Code Score Adjustment:
if is_generated_code(source) {
    size_score *= 0.1; // 90% reduction
}
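A marker scan of this kind could look like the sketch below. It is not Debtmap's implementation: the function bodies are assumptions based on the marker list above, matching is assumed to be case-sensitive, and `adjusted_size_score` is a hypothetical helper that wraps the adjustment shown above.

```rust
/// Markers taken from the list above.
const MARKERS: [&str; 5] = [
    "DO NOT EDIT",
    "automatically generated",
    "AUTO-GENERATED",
    "@generated",
    "Code generated by",
];

/// Returns true if any marker appears in the first 20 lines.
/// Case-sensitive matching is an assumption of this sketch.
fn is_generated_code(source: &str) -> bool {
    source
        .lines()
        .take(20)
        .any(|line| MARKERS.iter().any(|marker| line.contains(*marker)))
}

/// Hypothetical helper: apply the 0.1× size multiplier to generated code.
fn adjusted_size_score(source: &str, size_score: f64) -> f64 {
    if is_generated_code(source) {
        size_score * 0.1
    } else {
        size_score
    }
}

fn main() {
    let generated = "// Code generated by protoc. DO NOT EDIT.\nfn x() {}\n";
    let handwritten = "fn main() {}\n";
    println!("{}", adjusted_size_score(generated, 30.0));
    println!("{}", adjusted_size_score(handwritten, 30.0));
}
```

Limiting the scan to the first 20 lines keeps detection cheap and avoids matching marker strings that appear deep inside ordinary source files.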
Scoring Rationale
Each debt item includes a detailed rationale explaining the score:
Debt Item: src/payment/processor.rs:142 - process_payment()
Score: 95.3 (CRITICAL)
Primary factors:
- High cyclomatic complexity (+100.0)
- Significant coverage gap (+57.2)
Bonuses:
- Complex + untested: +20 bonus applied
- Code smells detected (+25.0)
Context adjustments:
- Size de-emphasized (weight: 0.3)
Migration from Legacy Scoring
Breaking Changes:
- Scores will change significantly for all debt items
- Large files with low complexity will rank lower
- Complex untested code will rank higher
- Size-based prioritization reduced by 80%
Restoring Legacy Behavior:
[scoring_rebalanced]
preset = "size-focused"
Gradual Migration:
- Run analysis with both algorithms: debtmap analyze . --legacy-scoring
- Compare results to understand impact
- Adjust team priorities based on new rankings
- Switch to rebalanced scoring after validation
See Migration Guide for detailed migration instructions.
Configuration Reference
Complete configuration example:
# .debtmap.toml
[scoring_rebalanced]
# Use a preset (balanced, quality-focused, test-coverage, size-focused)
preset = "balanced"
# Or define custom weights
complexity_weight = 1.0
coverage_weight = 1.0
structural_weight = 0.8
size_weight = 0.3
smell_weight = 0.6
When to Use Rebalanced Scoring
✅ Use rebalanced scoring when:
- You want to prioritize code quality over file size
- Complex untested code is a concern
- You’re building new features and need quality focus
- Your team values testability and maintainability
❌ Use legacy/size-focused when:
- You’re managing a legacy codebase with large files
- File size reduction is the primary concern
- You need compatibility with existing workflows
- Your team’s priority is file splitting over quality
Performance
The rebalanced scoring algorithm has minimal performance impact:
- Same O(n) complexity as legacy scoring
- No additional file I/O required
- Parallel processing compatible
- Adds ~5% to analysis time for rationale generation
Score-Based Prioritization with Exponential Scaling (Spec 171)
Debtmap uses exponential scaling and risk boosting to amplify high-severity technical debt items, ensuring critical issues stand out clearly in priority lists. This section explains how these mechanisms work and how to configure them for your project.
Why Exponential Scaling?
Traditional linear multipliers create uniform gaps between scores:
- Linear 2x multiplier: Score 50 → 100, Score 100 → 200 (every score doubles, so the ratio between items never changes)
Exponential scaling creates growing gaps that make critical issues impossible to miss:
- Exponential scaling (^1.4): Score 50 → 239, Score 100 → 631 (amplification grows from 4.8x to 6.3x)
Key Benefits:
- Visual Separation: Critical items have dramatically higher scores than medium items
- Natural Clustering: Similar-severity items cluster together in ranked lists
- Actionable Ordering: Work through the list from top to bottom with confidence
- No Arbitrary Thresholds: Pure score-based ranking eliminates debates about tier boundaries
How Exponential Scaling Works
After calculating the base score (complexity + coverage + dependencies), Debtmap applies pattern-specific exponential scaling:
Formula:
scaled_score = base_score ^ exponent
Pattern-Specific Exponents (configurable in .debtmap.toml):
| Pattern Type | Default Exponent | Rationale |
|---|---|---|
| God Objects | 1.4 | Highest amplification - architectural issues deserve top priority |
| Long Functions | 1.3 | High amplification - major refactoring candidates |
| Complex Functions | 1.2 | Moderate amplification - complexity issues |
| Primitive Obsession | 1.1 | Light amplification - design smell but lower urgency |
Example: God Object Scaling (exponent = 1.4)
Comparing three God Objects with different base scores:
| Base Score | Calculation | Scaled Score | Amplification |
|---|---|---|---|
| 10 | 10^1.4 | 25.1 | 2.5x |
| 50 | 50^1.4 | 239.1 | 4.8x |
| 100 | 100^1.4 | 631.0 | 6.3x |
Result: The highest-severity God Object (score 100) gets 6.3x amplification, while a minor issue (score 10) only gets 2.5x. This creates clear visual separation in your priority list.
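The table rows can be recomputed with a few lines of Rust. This is a minimal sketch of the formula `scaled_score = base_score ^ exponent`; the function name is hypothetical.

```rust
/// Apply pattern-specific exponential scaling to a base score.
fn scale(base_score: f64, exponent: f64) -> f64 {
    base_score.powf(exponent)
}

fn main() {
    // God Object exponent from the table of default exponents.
    let exponent = 1.4;
    for base in [10.0_f64, 50.0, 100.0] {
        let scaled = scale(base, exponent);
        println!("{:>5.1} -> {:>7.1} ({:.1}x)", base, scaled, scaled / base);
    }
}
```

The amplification factor equals `base_score ^ (exponent - 1)`, which is why it grows with the base score instead of staying constant as it would with a linear multiplier.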
Risk Boosting
After exponential scaling, Debtmap applies additional risk multipliers based on architectural position:
Risk Multipliers (applied multiplicatively):
final_score = scaled_score × risk_multiplier
| Risk Factor | Multiplier | Rationale |
|---|---|---|
| High dependency count (10+ callers) | 1.2x | Harder to refactor safely, affects more code |
| Entry point (main, CLI handlers, routes) | 1.15x | Failures cascade to all downstream code |
| Low test coverage (<30%) | 1.1x | Riskier to modify without tests |
Example:
Function: process_payment (God Object)
Base Score: 85.0
Exponentially Scaled: 85^1.4 = 502.6
Risk Factors:
- Entry point: ×1.15
- Low coverage (15%): ×1.1
Final Score: 502.6 × 1.15 × 1.1 ≈ 635.7
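The combined risk multiplier can be sketched as a product of the factors that apply to an item. The struct and function names are hypothetical; the thresholds (10+ callers, coverage below 30%) and multiplier values come from the table above.

```rust
/// Inputs that determine which risk factors apply (illustrative names).
struct RiskProfile {
    caller_count: usize,
    is_entry_point: bool,
    /// Test coverage as a fraction between 0.0 and 1.0.
    coverage: f64,
}

/// Multiply together every risk factor that applies.
fn risk_multiplier(profile: &RiskProfile) -> f64 {
    let mut multiplier = 1.0;
    if profile.caller_count >= 10 {
        multiplier *= 1.2; // high dependency count
    }
    if profile.is_entry_point {
        multiplier *= 1.15; // entry point
    }
    if profile.coverage < 0.30 {
        multiplier *= 1.1; // low test coverage
    }
    multiplier
}

fn main() {
    // Entry point with 15% coverage and few callers, as in the example above.
    let profile = RiskProfile { caller_count: 3, is_entry_point: true, coverage: 0.15 };
    let scaled_score = 85.0_f64.powf(1.4);
    println!("risk multiplier: {:.3}", risk_multiplier(&profile));
    println!("final score: {:.1}", scaled_score * risk_multiplier(&profile));
}
```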
Complete Scoring Pipeline
Debtmap processes scores through multiple stages:
1. Base Score Calculation
↓
Weighted sum of:
- Coverage factor (40% weight)
- Complexity factor (40% weight)
- Dependency factor (20% weight)
2. Exponential Scaling
↓
Pattern-specific exponent applied:
- God Objects: ^1.4
- Long Functions: ^1.3
- etc.
3. Risk Boosting
↓
Architectural position multipliers:
- High dependencies: ×1.2
- Entry points: ×1.15
- Low coverage: ×1.1
4. Final Score
↓
Used for ranking (no tier bucketing)
5. Output
↓
Sorted descending by final score
Configuration
You can customize exponential scaling parameters in .debtmap.toml:
[priority.scaling.god_object]
exponent = 1.5 # Increase amplification for God Objects
min_threshold = 30.0 # Only scale scores above 30
max_threshold = 500.0 # Cap scaled scores at 500
[priority.scaling.long_function]
exponent = 1.3 # Default amplification
min_threshold = 0.0 # No minimum threshold
max_threshold = 1000.0 # High cap for extreme cases
[priority.scaling.complex_function]
exponent = 1.2 # Moderate amplification
min_threshold = 20.0 # Scale scores above 20
max_threshold = 800.0 # Cap at 800
Configuration Parameters:
- exponent: The exponential scaling factor (higher = more amplification)
- min_threshold: Minimum base score to apply scaling (prevents amplifying trivial issues)
- max_threshold: Maximum scaled score (prevents extreme outliers)
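One plausible reading of how the three parameters interact is sketched below. The struct and function are hypothetical, and two behaviors are assumptions rather than documented facts: scores below `min_threshold` pass through unchanged, and the cap is applied to the scaled score.

```rust
/// Mirrors one [priority.scaling.*] section (illustrative type).
struct ScalingConfig {
    exponent: f64,
    min_threshold: f64,
    max_threshold: f64,
}

/// Scale a base score, skipping trivial scores and capping extreme ones.
fn apply_scaling(base_score: f64, config: &ScalingConfig) -> f64 {
    if base_score < config.min_threshold {
        // Assumption: scores below the minimum are left unscaled.
        return base_score;
    }
    base_score.powf(config.exponent).min(config.max_threshold)
}

fn main() {
    // Values from the god_object example above.
    let god_object = ScalingConfig { exponent: 1.5, min_threshold: 30.0, max_threshold: 500.0 };
    for base in [20.0_f64, 50.0, 100.0] {
        println!("{:>5.1} -> {:.1}", base, apply_scaling(base, &god_object));
    }
}
```

With these settings a base score of 100 would scale to 1000 but is capped at 500, which shows how `max_threshold` bounds the effect of a high exponent.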
Tuning Guidelines
Increase amplification when:
- Critical issues aren’t standing out enough in your priority list
- Team needs stronger signal about what to tackle first
- You have many medium-severity items obscuring high-severity ones
Decrease amplification when:
- Priority list feels too top-heavy (too many “critical” items)
- Scores are getting too large (e.g., thousands)
- You want more gradual transitions between severity levels
Example: More Aggressive God Object Detection
[priority.scaling.god_object]
exponent = 1.6 # Higher amplification
min_threshold = 20.0 # Start scaling earlier
max_threshold = 2000.0 # Allow higher caps
Comparing With vs Without Exponential Scaling
Without Exponential Scaling (Linear Multipliers):
Priority List:
1. God Object (base: 85) → final: 170 (2x multiplier)
2. Long Function (base: 80) → final: 160 (2x multiplier)
3. Complex Function (base: 75) → final: 150 (2x multiplier)
4. Medium Issue (base: 70) → final: 140 (2x multiplier)
Problem: Gaps are uniform (10 points). Hard to distinguish critical from medium issues.
With Exponential Scaling:
Priority List:
1. God Object (base: 85) → scaled (^1.4): 503 → with risk: 636
2. Long Function (base: 80) → scaled (^1.3): 298 → with risk: 328
3. Complex Function (base: 75) → scaled (^1.2): 178 → with risk: 178
4. Medium Issue (base: 70) → scaled (^1.1): 107 → with risk: 107
Result: Clear separation. God Object stands out as roughly 6x higher than the medium issue.
Score-Based Ranking vs Tier-Based Ranking
Debtmap uses pure score-based ranking (not tier-based) for finer granularity:
Traditional Tier-Based Ranking:
Critical: Items with score ≥ 200
High: Items with score 100-199
Medium: Items with score 50-99
Low: Items with score < 50
Problem: All “Critical” items look equally important, even if one has score 201 and another has score 1000.
Score-Based Ranking:
1. process_payment - Score: 1247.3
2. UserService.authenticate - Score: 891.2
3. calculate_tax - Score: 654.1
...
Benefits:
- Every item has a unique priority position
- Natural ordering - work from highest to lowest
- No arbitrary boundaries or threshold debates
- Finer-grained decision making
Compatibility Note: For tools expecting Priority enums, scores can be mapped to tiers:
- Score ≥ 200: Critical
- Score ≥ 100: High
- Score ≥ 50: Medium
- Score < 50: Low
However, the primary output uses raw scores for maximum granularity.
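For consumers that need the enum, the mapping is a simple threshold cascade. The enum and function names below are hypothetical; the cut-offs are the ones listed in the compatibility note.

```rust
#[derive(Debug, PartialEq)]
enum Priority {
    Critical,
    High,
    Medium,
    Low,
}

/// Map a final score to a tier using the compatibility thresholds.
fn priority_for(score: f64) -> Priority {
    if score >= 200.0 {
        Priority::Critical
    } else if score >= 100.0 {
        Priority::High
    } else if score >= 50.0 {
        Priority::Medium
    } else {
        Priority::Low
    }
}

fn main() {
    for score in [1247.3_f64, 150.0, 75.0, 12.0] {
        println!("{:>7.1} -> {:?}", score, priority_for(score));
    }
}
```

The mapping is lossy by design: scores of 201 and 1000 both become Critical, so sorting should still use the raw score.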
Practical Examples
Example 1: Identifying Architectural Hot Spots
debtmap analyze . --top 10
Output:
Top 10 Technical Debt Items (Sorted by Score)
1. src/services/user_service.rs:45 - UserService::authenticate
Score: 1247.3 | Pattern: God Object | Coverage: 12%
→ 45 methods, 892 lines, high complexity
→ Risk factors: Entry point (×1.15), High dependencies (×1.2)
2. src/payment/processor.rs:142 - process_payment
Score: 891.2 | Pattern: Complex Function | Coverage: 8%
→ Cyclomatic: 42, Cognitive: 77
→ Risk factors: Entry point (×1.15), Low coverage (×1.1)
3. src/reporting/generator.rs:234 - generate_monthly_report
Score: 654.1 | Pattern: Long Function | Coverage: 45%
→ 287 lines, moderate complexity
→ Risk factors: High dependencies (×1.2)
Action: Focus on top 3 items first - they have dramatically higher scores than items 4-10.
Example 2: Monitoring Exponential Scaling Impact
# Analyze with verbose output to see scaling details
debtmap analyze . --verbose --top 5
Verbose Output:
Function: src/services/user_service.rs:45 - UserService::authenticate
Base Score: 85.0
Pattern: God Object
Exponential Scaling (^1.4): 85.0^1.4 = 502.6
Risk Boosting:
- Entry point: ×1.15 → 577.9
- High dependencies (15 callers): ×1.2 → 693.5
- Low coverage (12%): ×1.1 → 762.9
Final Score: 762.9
Insight: Base score of 85 amplified to 763 through exponential scaling and risk boosting - a 9.0x total amplification.
When to Use Exponential Scaling
✅ Use exponential scaling when:
- You need clear visual separation between critical and medium issues
- Your priority list has too many “high priority” items
- You want top issues to stand out dramatically
- You prefer score-based ranking over tier-based bucketing
✅ Adjust exponents when:
- Default amplification doesn’t match your team’s priorities
- Certain patterns (e.g., God Objects) deserve more/less emphasis
- You’re tuning the balance between different debt types
✅ Tune thresholds when:
- Scores are getting too large (lower max_threshold)
- Trivial issues are being amplified (increase min_threshold)
- You want to cap extreme outliers (adjust max_threshold)
Performance Impact
Exponential scaling has negligible performance impact:
- Computation: Simple powf() operation per item
- Overhead: <1% additional analysis time
- Scalability: Works with parallel processing (no synchronization needed)
- Memory: No additional data structures required
See Also
- Tiered Prioritization - Understanding tier-based classification
- Configuration - Scoring and aggregation configuration
- Analysis Guide - Detailed metric explanations
- File Classification - Context-aware file size thresholds (Spec 135)
- ARCHITECTURE.md - Technical details of exponential scaling implementation
Tiered Prioritization
Debtmap uses a sophisticated tiered prioritization system to surface critical architectural issues above simple testing gaps. This chapter explains the tier strategy, how to interpret tier classifications, and how to customize tier thresholds for your project.
Overview
The tiered prioritization system organizes technical debt into four distinct tiers based on impact, urgency, and architectural significance. This prevents “walls of similar-scored items” and ensures critical issues don’t get lost among minor problems.
Two Tier Systems: Debtmap uses two complementary tier systems:
- RecommendationTier (T1-T4): Used internally to classify items based on architectural significance and testing needs
- Display Tier (Critical/High/Moderate/Low): Score-based tiers shown in terminal output, derived from final calculated scores
The configuration examples below control the RecommendationTier classification logic, which influences scoring through tier weights. The final display uses score-based tiers for consistency across all output formats.
The Four Tiers
Tier 1: Critical Architecture
Description: God Objects, God Modules, excessive complexity requiring immediate architectural attention
Priority: Must address before adding new features
Weight: 1.5x (highest priority multiplier)
Impact: High impact on maintainability and team velocity
Examples:
- Files with 15+ responsibilities
- Modules with 50+ methods
- ComplexityHotspot debt items with cyclomatic complexity > 50 (extreme complexity requiring architectural redesign)
- God objects flagged by detection algorithms
- Circular dependencies affecting core modules
When to Address: Immediately, before sprint work begins. These issues compound over time and block progress.
# Focus on Tier 1 items
debtmap analyze . --min-priority high --top 5
Tier 2: Complex Untested
Description: Untested code with high complexity or critical dependencies. Items qualify for Tier 2 if they meet ANY of: cyclomatic complexity ≥ 15, total dependencies ≥ 10, or are entry point functions with any coverage gap.
Priority: Risk of bugs in critical paths
Weight: 1.0x (standard multiplier)
Action: Should be tested before refactoring to prevent regressions
Examples:
- Functions with cyclomatic complexity ≥ 15 and 0% coverage
- Functions with 10+ dependencies and low test coverage
- Business logic entry points without tests
- Complex error handling without validation
When to Address: Within current sprint. Add tests before making changes.
# See Tier 2 testing gaps
debtmap analyze . --lcov coverage.lcov --min-priority high
Tier 3: Testing Gaps
Description: Untested code with moderate complexity
Priority: Improve coverage to prevent future issues
Weight: 0.7x (reduced multiplier)
Action: Add tests opportunistically or during related changes
Examples:
- Functions with cyclomatic complexity 10-15 and low coverage
- Utility functions without edge case tests
- Moderate complexity with partial coverage
When to Address: Next sprint or when touching related code.
Tier 4: Maintenance
Description: Low-complexity issues and code quality improvements
Priority: Address opportunistically during other work
Weight: 0.3x (lowest multiplier)
Action: Fix when convenient, low urgency
Examples:
- Simple functions with minor code quality issues
- TODO markers in well-tested code
- Minor duplication in test code
When to Address: During cleanup sprints or when refactoring nearby code.
Configuration
Tier configuration is optional in .debtmap.toml. If not specified, Debtmap uses the balanced defaults shown below.
Default Tier Thresholds
[tiers]
# Tier 2 thresholds (Complex Untested)
t2_complexity_threshold = 15 # Cyclomatic complexity cutoff
t2_dependency_threshold = 10 # Dependency count cutoff
# Tier 3 thresholds (Testing Gaps)
t3_complexity_threshold = 10 # Lower complexity threshold
# Display options
show_t4_in_main_report = false # Hide Tier 4 from main output (default: false)
# Tier weights (multipliers applied to base scores)
t1_weight = 1.5 # Critical architecture
t2_weight = 1.0 # Complex untested
t3_weight = 0.7 # Testing gaps
t4_weight = 0.3 # Maintenance
To use tier-based prioritization with custom settings, add the [tiers] section to your .debtmap.toml configuration file:
# Analyze with custom tier configuration
debtmap analyze . --config .debtmap.toml
Tier Preset Configurations
Debtmap provides three built-in tier presets for different project needs:
Balanced (Default)
[tiers]
t2_complexity_threshold = 15
t2_dependency_threshold = 10
t3_complexity_threshold = 10
Suitable for most projects. Balances detection sensitivity with manageable issue counts.
Strict
[tiers]
t2_complexity_threshold = 10
t2_dependency_threshold = 7
t3_complexity_threshold = 7
For high-quality codebases or teams with strict quality standards. Flags more items as requiring attention.
Lenient
[tiers]
t2_complexity_threshold = 20
t2_dependency_threshold = 15
t3_complexity_threshold = 15
For legacy codebases or gradual technical debt reduction. Focuses on the most critical issues first.
Programmatic Access: These presets are also available as methods when using Debtmap as a library:
- TierConfig::balanced() - Equivalent to the balanced preset above
- TierConfig::strict() - Equivalent to the strict preset above
- TierConfig::lenient() - Equivalent to the lenient preset above
These methods can be used in Rust code to configure tier settings programmatically without manual TOML configuration.
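To make the relationship between the three presets concrete, here is a small stand-in type that mirrors the threshold values listed above. `TierThresholds` is a hypothetical struct written for this example; it is not Debtmap's `TierConfig`, whose actual fields and signatures should be checked in the library documentation.

```rust
/// Stand-in for the threshold part of a tier configuration (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TierThresholds {
    t2_complexity_threshold: u32,
    t2_dependency_threshold: u32,
    t3_complexity_threshold: u32,
}

impl TierThresholds {
    /// Values of the balanced (default) preset.
    fn balanced() -> Self {
        Self { t2_complexity_threshold: 15, t2_dependency_threshold: 10, t3_complexity_threshold: 10 }
    }

    /// Values of the strict preset: lower cutoffs, so more items are flagged.
    fn strict() -> Self {
        Self { t2_complexity_threshold: 10, t2_dependency_threshold: 7, t3_complexity_threshold: 7 }
    }

    /// Values of the lenient preset: higher cutoffs, so fewer items are flagged.
    fn lenient() -> Self {
        Self { t2_complexity_threshold: 20, t2_dependency_threshold: 15, t3_complexity_threshold: 15 }
    }
}
```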
Customizing Tier Thresholds
You can also create custom threshold configurations tailored to your project:
# Custom thresholds for specific project needs
[tiers]
t2_complexity_threshold = 12
t2_dependency_threshold = 8
t3_complexity_threshold = 8
Tier Weight Customization
Tier weights are multipliers applied to base debt scores during prioritization. A weight of 1.5 means items in that tier will score 50% higher than equivalent items in a tier with weight 1.0, pushing them higher in priority rankings.
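As a rough illustration of how a multiplier changes ranking, the sketch below applies the default weights to a base score. The `Tier` enum, the `TierWeights` struct, and the `weighted_score` function are hypothetical names for this example; only the weight values come from the configuration shown earlier.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tier {
    T1,
    T2,
    T3,
    T4,
}

/// Mirrors the t1_weight..t4_weight keys of the [tiers] section.
#[derive(Debug, Clone, Copy)]
struct TierWeights {
    t1_weight: f64,
    t2_weight: f64,
    t3_weight: f64,
    t4_weight: f64,
}

impl Default for TierWeights {
    fn default() -> Self {
        Self { t1_weight: 1.5, t2_weight: 1.0, t3_weight: 0.7, t4_weight: 0.3 }
    }
}

/// Multiplies a base debt score by the weight of the item's tier.
fn weighted_score(base_score: f64, tier: Tier, weights: &TierWeights) -> f64 {
    let multiplier = match tier {
        Tier::T1 => weights.t1_weight,
        Tier::T2 => weights.t2_weight,
        Tier::T3 => weights.t3_weight,
        Tier::T4 => weights.t4_weight,
    };
    base_score * multiplier
}
```

With the default weights, a base score of 60.0 becomes 90.0 in Tier 1 and stays at 60.0 in Tier 2, so the Tier 1 item ranks first even though the underlying scores are equal.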
Adjust weights based on your priorities:
# Emphasize testing over architecture
[tiers]
t1_weight = 1.2 # Reduce architecture weight
t2_weight = 1.3 # Increase testing weight
t3_weight = 0.8
t4_weight = 0.3
# Focus on architecture first
[tiers]
t1_weight = 2.0 # Maximize architecture weight
t2_weight = 1.0
t3_weight = 0.5
t4_weight = 0.2
Use Cases
Sprint Planning
Use tiered prioritization to allocate work:
# See Tier 1 items for architectural planning
debtmap analyze . --min-priority high --top 5
# See Tier 2/3 for testing sprint work
debtmap analyze . --lcov coverage.lcov --min-priority medium
Code Review Focus
Prioritize review attention based on tiers:
- Tier 1: Architectural review required, senior dev attention
- Tier 2: Test coverage validation critical
- Tier 3: Standard review process
- Tier 4: Quick review or automated checks
Refactoring Strategy
# Phase 1: Address Tier 1 architectural issues
debtmap analyze . --min-priority high
# Phase 2: Add tests for Tier 2 complex code
debtmap analyze . --lcov coverage.lcov --min-priority high
# Phase 3: Improve Tier 3 coverage
debtmap analyze . --lcov coverage.lcov --min-priority medium
Best Practices
- Always address Tier 1 before feature work - Architectural issues compound
- Test Tier 2 items before refactoring - Avoid regressions
- Batch Tier 3 items - Address multiple in one sprint
- Defer Tier 4 items - Only fix during cleanup or when convenient
- Track tier distribution over time - Aim to reduce Tier 1/2 counts
Interpreting Tier Output
Terminal Output
Terminal output displays items grouped by score-based tiers:
TECHNICAL DEBT ANALYSIS - PRIORITY TIERS
Critical (score >= 90)
src/services.rs - God Object (score: 127.5)
src/core/engine.rs - Circular dependency (score: 95.2)
High (score 70-89.9)
src/processing/transform.rs:145 - UntestableComplexity (score: 85.0)
src/api/handlers.rs - God Module (score: 78.3)
...
Moderate (score 50-69.9)
src/utils/parser.rs:220 - TestingGap (score: 62.1)
...
Low (score < 50)
[Items with score < 50 appear here]
Note: The scores shown reflect tier weight multipliers applied during classification. Items classified as Tier 1 (Critical Architecture) receive a 1.5x weight boost, which often elevates them into the Critical or High score ranges.
JSON Output
JSON output uses the same score-based priority levels as terminal output:
{
  "summary": {
    "score_distribution": {
      "critical": 2,
      "high": 5,
      "medium": 12,
      "low": 45
    }
  },
  "items": [
    {
      "type": "File",
      "score": 127.5,
      "priority": "critical",
      "location": {
        "file": "src/services.rs"
      },
      "debt_type": "GodObject"
    },
    {
      "type": "Function",
      "score": 85.0,
      "priority": "high",
      "location": {
        "file": "src/processing/transform.rs",
        "line": 145,
        "function": "process_data"
      },
      "debt_type": "UntestableComplexity"
    }
  ]
}
The priority field is derived from the score field using these thresholds:
- critical: score >= 100.0
- high: score >= 50.0
- medium: score >= 20.0
- low: score < 20.0
Note: While RecommendationTier (T1-T4) classifications exist internally for applying tier weights, they are not included in JSON output. The output shows final calculated scores and their corresponding priority levels.
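The score-to-priority mapping for JSON output can be sketched as a simple threshold ladder. The `Priority` enum and `priority_for` function are hypothetical names; the cutoffs are the ones listed for the JSON priority field above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Priority {
    Critical,
    High,
    Medium,
    Low,
}

/// Maps a final (already weighted) score to the JSON priority level.
fn priority_for(score: f64) -> Priority {
    if score >= 100.0 {
        Priority::Critical
    } else if score >= 50.0 {
        Priority::High
    } else if score >= 20.0 {
        Priority::Medium
    } else {
        Priority::Low
    }
}
```

Applied to the sample items, a score of 127.5 maps to critical and 85.0 maps to high, which matches the priority values in the JSON example.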
Troubleshooting
Issue: Too many Tier 1 items
Solution: Lower tier weights or increase thresholds temporarily:
[tiers]
t1_weight = 1.2 # Reduce from 1.5
Issue: Not enough items in Tier 1
Solution: Check if god object detection is enabled:
[god_object_detection]
enabled = true
Issue: All items in Tier 4
Solution: Lower minimum thresholds:
[thresholds]
minimum_debt_score = 1.0
minimum_cyclomatic_complexity = 2
See Also
- Scoring Strategies - Understanding file-level vs function-level scoring
- Configuration - Complete configuration reference
- Analysis Guide - Detailed metric explanations
Validation and Quality Gates
The validate command enforces quality gates in your development workflow, making it ideal for CI/CD integration. Unlike the analyze command, which focuses on exploration and reporting, validate checks your codebase against configured thresholds and returns an exit code that automated workflows can act on.
Table of Contents
- Validate vs Analyze
- Quick Start
- Understanding Density-Based Validation
- Configuration Setup
- Validation Metrics
- Exit Codes and CI Integration
- Coverage Integration
- Context-Aware Validation
- CI/CD Examples
- Migrating from Deprecated Thresholds
- Troubleshooting
- Best Practices
Validate vs Analyze
Understanding when to use each command is crucial:
| Aspect | validate | analyze |
|---|---|---|
| Purpose | Enforce quality gates | Explore and understand debt |
| Exit Codes | Returns non-zero on failure | Always returns 0 (unless error) |
| Thresholds | From .debtmap.toml config | Command-line flags |
| Use Case | CI/CD pipelines, pre-commit hooks | Interactive analysis, reports |
| Output Focus | Pass/fail with violation details | Comprehensive metrics and insights |
| Configuration | Requires .debtmap.toml | Works without config file |
Rule of thumb: Use validate for automation and analyze for investigation.
Quick Start
1. Initialize configuration:
debtmap init
2. Edit .debtmap.toml to set thresholds:
[thresholds.validation]
max_debt_density = 50.0 # Debt items per 1000 LOC
max_average_complexity = 10.0 # Average cyclomatic complexity
max_codebase_risk_score = 7.0 # Overall risk level (1-10)
3. Run validation:
debtmap validate .
4. Check exit code:
echo $? # 0 = pass, non-zero = fail
Understanding Density-Based Validation
Debtmap uses density-based metrics as the primary quality measure. This approach provides several advantages over traditional absolute count metrics.
Why Density Matters
Traditional metrics like “maximum 50 high-complexity functions” fail as your codebase grows:
Scenario: Your team adds 10,000 LOC of high-quality code
- Old metric: "max 50 complex functions" → FAILS (now 55 total)
- Density metric: "max 50 per 1000 LOC" → PASSES (density improved)
Scale-dependent metrics (absolute counts):
- Grow linearly with codebase size
- Require constant threshold adjustments
- Punish healthy growth
- Don’t reflect actual code quality
Density metrics (per 1000 LOC):
- Remain stable as codebase grows
- Measure true quality ratios
- No adjustment needed for growth
- Directly comparable across projects
Calculating Debt Density
Debt Density = (Total Debt Items / Total LOC) × 1000
Example:
- 25 debt items in 5,000 LOC project
- Density = (25 / 5000) × 1000 = 5.0 debt items per 1000 LOC
This density remains meaningful whether your codebase is 5,000 or 500,000 LOC.
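The density formula translates directly into code. This is a minimal sketch with hypothetical function names; returning 0.0 for an empty codebase and treating a value equal to the limit as passing are assumptions made for the example, not documented Debtmap behavior.

```rust
/// Debt items per 1000 lines of code.
fn debt_density(debt_items: usize, total_loc: usize) -> f64 {
    // Assumption: an empty codebase has zero density rather than an error.
    if total_loc == 0 {
        return 0.0;
    }
    debt_items as f64 * 1000.0 / total_loc as f64
}

/// True when the density does not exceed the configured maximum.
fn within_density_limit(debt_items: usize, total_loc: usize, max_debt_density: f64) -> bool {
    debt_density(debt_items, total_loc) <= max_debt_density
}
```

For the example above, `debt_density(25, 5000)` evaluates to 5.0. A codebase one hundred times larger with one hundred times as many debt items yields the same value, which is why the threshold does not need adjusting as the project grows.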
Recommended Density Thresholds
| Project Type | max_debt_density | Rationale |
|---|---|---|
| New/Greenfield | 20.0 | High quality bar for new code |
| Active Development | 50.0 | Balanced quality/velocity (default) |
| Legacy Modernization | 100.0 | Prevent regression during refactoring |
| Mature/Critical | 30.0 | Maintain quality in stable systems |
Configuration Setup
Creating Configuration File
The debtmap init command generates a .debtmap.toml with sensible defaults:
debtmap init
This creates:
[thresholds.validation]
# Primary quality metrics (scale-independent)
max_average_complexity = 10.0
max_debt_density = 50.0
max_codebase_risk_score = 7.0
# Optional metrics
min_coverage_percentage = 0.0 # Disabled by default
# Safety net (high ceiling for extreme cases)
max_total_debt_score = 10000
Editing Thresholds
Edit the [thresholds.validation] section to match your quality requirements:
[thresholds.validation]
# Enforce stricter quality for new project
max_debt_density = 30.0 # Tighter density requirement
max_average_complexity = 8.0 # Lower complexity tolerance
max_codebase_risk_score = 6.0 # Reduced risk threshold
min_coverage_percentage = 80.0 # Require 80% test coverage
Override via Command Line
You can override the density threshold from the command line:
# Temporarily use stricter threshold
debtmap validate . --max-debt-density 40.0
Validation Metrics
Debtmap organizes validation metrics into three categories:
Primary Metrics (Scale-Independent)
These are the core quality measures that every project should monitor:
- max_average_complexity (default: 10.0)
  - Average cyclomatic complexity per function
  - Measures typical function complexity across codebase
  - Lower values indicate simpler, more maintainable code
  max_average_complexity = 10.0
- max_debt_density (default: 50.0) - PRIMARY METRIC
  - Debt items per 1000 lines of code
  - Scale-independent quality measure
  - Remains stable as codebase grows
  max_debt_density = 50.0
- max_codebase_risk_score (default: 7.0)
  - Overall risk level combining complexity, coverage, and criticality
  - Score ranges from 1 (low risk) to 10 (high risk)
  - Considers context-aware analysis when enabled
  max_codebase_risk_score = 7.0
Optional Metrics
Configure these when you want additional quality enforcement:
- min_coverage_percentage (default: 0.0 - disabled)
  - Minimum required test coverage percentage
  - Only enforced when coverage data is provided via --coverage-file
  - Set to 0.0 to disable coverage requirements
  min_coverage_percentage = 75.0 # Require 75% coverage
Safety Net Metrics
High ceilings to catch extreme cases:
- max_total_debt_score (default: 10000)
  - Absolute ceiling on total technical debt
  - Prevents runaway growth even if density stays low
  - Rarely triggers in normal operation
  max_total_debt_score = 10000
Metric Priority
Validation uses AND logic: every configured threshold must pass for validation to succeed. If any check fails, the entire validation fails with a non-zero exit code.
When validation fails, fix issues in this order:
1. Critical: max_debt_density violations (core quality metric)
2. High: max_average_complexity violations (function-level quality)
3. High: max_codebase_risk_score violations (overall risk)
4. Medium: min_coverage_percentage violations (test coverage)
5. Low: max_total_debt_score violations (extreme cases only)
The priority list above is for remediation order when validation fails, not for which checks are enforced. All configured thresholds are enforced equally.
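The AND logic can be sketched as a function that collects every failed check, listed in the remediation order given above. The `Metrics` and `Thresholds` structs and both functions are hypothetical; the default values come from this chapter, while the strict comparisons (a value equal to its limit passes) are an assumption.

```rust
/// Measured values for one validation run (hypothetical struct).
struct Metrics {
    average_complexity: f64,
    debt_density: f64,
    codebase_risk_score: f64,
    total_debt_score: f64,
    /// None when no coverage file was supplied.
    coverage_percentage: Option<f64>,
}

/// Mirrors the keys of [thresholds.validation].
struct Thresholds {
    max_average_complexity: f64,
    max_debt_density: f64,
    max_codebase_risk_score: f64,
    max_total_debt_score: f64,
    min_coverage_percentage: f64,
}

impl Default for Thresholds {
    fn default() -> Self {
        Self {
            max_average_complexity: 10.0,
            max_debt_density: 50.0,
            max_codebase_risk_score: 7.0,
            max_total_debt_score: 10000.0,
            min_coverage_percentage: 0.0, // 0.0 disables the coverage check
        }
    }
}

/// Returns the names of all failed thresholds, in remediation order.
fn failed_checks(m: &Metrics, t: &Thresholds) -> Vec<&'static str> {
    let mut failed = Vec::new();
    if m.debt_density > t.max_debt_density {
        failed.push("max_debt_density");
    }
    if m.average_complexity > t.max_average_complexity {
        failed.push("max_average_complexity");
    }
    if m.codebase_risk_score > t.max_codebase_risk_score {
        failed.push("max_codebase_risk_score");
    }
    // Coverage is only checked when data was supplied and a minimum is set.
    if let Some(coverage) = m.coverage_percentage {
        if t.min_coverage_percentage > 0.0 && coverage < t.min_coverage_percentage {
            failed.push("min_coverage_percentage");
        }
    }
    if m.total_debt_score > t.max_total_debt_score {
        failed.push("max_total_debt_score");
    }
    failed
}

/// AND logic: validation passes only when no check failed.
fn validation_passes(m: &Metrics, t: &Thresholds) -> bool {
    failed_checks(m, t).is_empty()
}
```

Collecting all failures instead of stopping at the first one lets a report show every exceeded threshold in a single run.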
Exit Codes and CI Integration
The validate command uses exit codes to signal success or failure:
Exit Code Behavior
debtmap validate .
echo $?
Exit codes:
- 0 - Success: All thresholds passed
- Non-zero - Failure: One or more thresholds exceeded or errors occurred
Using Exit Codes in CI
Exit codes integrate naturally with CI/CD systems:
GitHub Actions:
- name: Validate code quality
  run: debtmap validate .
  # Step fails automatically if exit code is non-zero
GitLab CI:
script:
  - debtmap validate .
  # Pipeline fails if exit code is non-zero
Shell scripts:
#!/bin/bash
if debtmap validate .; then
  echo "✅ Validation passed"
else
  echo "❌ Validation failed"
  exit 1
fi
Understanding Validation Output
Success output:
✅ Validation PASSED
Metrics:
Average Complexity: 7.2 / 10.0 ✓
Debt Density: 32.5 / 50.0 ✓
Codebase Risk: 5.8 / 7.0 ✓
Total Debt Score: 1250 / 10000 ✓
Failure output:
❌ Validation FAILED
Metrics:
Average Complexity: 12.3 / 10.0 ✗ EXCEEDED
Debt Density: 65.8 / 50.0 ✗ EXCEEDED
Codebase Risk: 5.2 / 7.0 ✓
Total Debt Score: 2100 / 10000 ✓
Failed checks: 2
Summary Output Format
For compact output suitable for CI logs, use the --summary or -s flag:
debtmap validate . --summary
# or
debtmap validate . -s
Summary format output:
✅ Validation PASSED
Priority Tiers:
P0 (Critical): 2 items
P1 (High): 8 items
P2 (Medium): 15 items
P3 (Low): 23 items
Top Issues:
1. Complex authentication logic (complexity: 28)
2. Database connection pool (risk: 9.2)
3. Untested error handler (coverage: 0%)
The summary format provides:
- Tiered priority counts instead of individual item details
- Top violating functions for quick triage
- Compact format ideal for CI/CD logs
- Same pass/fail determination as standard format
Coverage Integration
Integrate test coverage data to enable risk-based validation:
Generating Coverage Data
For Rust projects with cargo-tarpaulin:
cargo tarpaulin --out Lcov --output-dir target/coverage
For Python projects with pytest-cov:
pytest --cov --cov-report=lcov:coverage/lcov.info
For JavaScript projects with Jest:
jest --coverage --coverageReporters=lcov
Running Validation with Coverage
debtmap validate . --coverage-file target/coverage/lcov.info
Benefits of Coverage Integration
With coverage data, validation gains additional insights:
- Risk-based prioritization - Identifies untested complex code
- Coverage threshold enforcement - via min_coverage_percentage
- Enhanced risk scoring - Combines complexity + coverage + context
- Better failure diagnostics - Shows which untested areas need attention
Coverage-Enhanced Output
debtmap validate . --coverage-file coverage/lcov.info -vv
Output includes:
- Overall coverage percentage
- High-risk uncovered functions
- Coverage-adjusted risk scores
- Prioritized remediation recommendations
Context-Aware Validation
Enable context-aware analysis for deeper risk insights:
Available Context Providers
- critical_path - Analyzes call graph to find execution bottlenecks
- dependency - Identifies highly-coupled modules
- git_history - Detects frequently-changed code (churn)
Enabling Context Providers
Enable all providers:
debtmap validate . --enable-context
Select specific providers:
debtmap validate . --enable-context --context-providers critical_path,git_history
Disable specific providers:
debtmap validate . --enable-context --disable-context dependency
Context-Aware Configuration
Add context settings to .debtmap.toml:
[analysis]
enable_context = true
context_providers = ["critical_path", "git_history"]
Then run validation:
debtmap validate . # Uses config settings
Context Benefits for Validation
Context-aware analysis improves risk scoring by:
- Prioritizing frequently-called functions
- Weighting high-churn code more heavily
- Identifying architectural bottlenecks
- Surfacing critical code paths
CI/CD Examples
GitHub Actions
Complete workflow with coverage generation and validation:
name: Code Quality Validation
on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
env:
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1
jobs:
  validate:
    name: Technical Debt Validation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5
        with:
          fetch-depth: 0 # Full history for git context
      - name: Setup Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          components: rustfmt, clippy
      - name: Cache cargo dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-
      - name: Install cargo-tarpaulin
        run: |
          if ! command -v cargo-tarpaulin &> /dev/null; then
            cargo install cargo-tarpaulin
          fi
      - name: Build debtmap
        run: cargo build --release
      - name: Generate coverage data
        run: cargo tarpaulin --out Lcov --output-dir target/coverage --timeout 300
      - name: Run debtmap validation with coverage
        run: |
          if [ -f "target/coverage/lcov.info" ]; then
            ./target/release/debtmap validate . \
              --coverage-file target/coverage/lcov.info \
              --enable-context \
              --format json \
              --output debtmap-report.json
          else
            echo "Warning: LCOV file not found, running without coverage"
            ./target/release/debtmap validate . \
              --format json \
              --output debtmap-report.json
          fi
      - name: Upload debtmap report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: debtmap-analysis-artifacts
          path: |
            debtmap-report.json
            target/coverage/lcov.info
          retention-days: 7
GitLab CI
stages:
  - test
  - quality
variables:
  CARGO_HOME: $CI_PROJECT_DIR/.cargo
debtmap:
  stage: quality
  image: rust:latest
  cache:
    paths:
      - .cargo/
      - target/
  before_script:
    # Install debtmap and coverage tools
    - cargo install debtmap
    - cargo install cargo-tarpaulin
  script:
    # Generate coverage: LCOV for debtmap, Cobertura XML for the GitLab coverage report
    - cargo tarpaulin --out Lcov --out Xml --output-dir coverage
    # Validate with debtmap
    - debtmap validate . --coverage-file coverage/lcov.info -v
  artifacts:
    when: always
    paths:
      - coverage/
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura.xml
CircleCI
version: 2.1
jobs:
  validate:
    docker:
      - image: cimg/rust:1.75
    steps:
      - checkout
      - restore_cache:
          keys:
            - cargo-{{ checksum "Cargo.lock" }}
      - run:
          name: Install tools
          command: |
            cargo install debtmap
            cargo install cargo-tarpaulin
      - run:
          name: Generate coverage
          command: cargo tarpaulin --out Lcov
      - run:
          name: Validate code quality
          command: debtmap validate . --coverage-file lcov.info
      - save_cache:
          key: cargo-{{ checksum "Cargo.lock" }}
          paths:
            - ~/.cargo
            - target
workflows:
  version: 2
  quality:
    jobs:
      - validate
Migrating from Deprecated Thresholds
Debtmap version 0.3.0 deprecated scale-dependent absolute count metrics in favor of density-based metrics.
Deprecated Metrics
The following metrics will be removed in v1.0:
| Deprecated Metric | Migration Path |
|---|---|
| max_high_complexity_count | Use max_debt_density |
| max_debt_items | Use max_debt_density |
| max_high_risk_functions | Use max_debt_density + max_codebase_risk_score |
Migration Example
Old configuration (deprecated):
[thresholds.validation]
max_high_complexity_count = 50 # ❌ Scale-dependent
max_debt_items = 100 # ❌ Scale-dependent
max_high_risk_functions = 20 # ❌ Scale-dependent
New configuration (recommended):
[thresholds.validation]
max_debt_density = 50.0 # ✅ Scale-independent
max_average_complexity = 10.0 # ✅ Quality ratio
max_codebase_risk_score = 7.0 # ✅ Risk level
Calculating Equivalent Density Threshold
Convert your old absolute thresholds to density:
Old: max_debt_items = 100 in 10,000 LOC codebase
New: max_debt_density = (100 / 10000) × 1000 = 10.0
Deprecation Warnings
When you run validate with deprecated metrics, you’ll see:
⚠️ DEPRECATION WARNING:
The following validation thresholds are deprecated:
- max_high_complexity_count
- max_debt_items
These scale-dependent metrics will be removed in v1.0.
Please migrate to density-based validation:
- Use 'max_debt_density' instead of absolute counts
- Density metrics remain stable as your codebase grows
Migration Timeline
- v0.3.0 - Density metrics introduced, old metrics deprecated
- v0.4.0 - v0.9.x - Deprecation warnings shown
- v1.0.0 - Deprecated metrics removed
Troubleshooting
Debugging Validation Failures
Use verbosity flags to understand why validation failed:
Level 1: Basic details (-v)
debtmap validate . -v
Shows which thresholds failed, by how much, and timing breakdown:
- Call graph building time
- Trait resolution time
- Coverage loading time
- Individual analysis phase durations
Level 2: Detailed breakdown (-vv)
debtmap validate . -vv
Shows everything from -v plus:
- Score calculation factors and weights
- Top violating functions with metrics
- Detailed phase timing information
- Risk score component breakdown
Level 3: Full diagnostic output (-vvv)
debtmap validate . -vvv
Shows complete debug information:
- All debt items with full details
- Complete risk calculations for each function
- All timing information including sub-phases
- File-level and function-level analysis data
- Context provider outputs (if enabled)
Common Issues
Issue: Validation fails but output unclear
# Solution: Increase verbosity
debtmap validate . -vv
Issue: Want to see only the worst problems
# Solution: Use --top flag
debtmap validate . --top 10 -v
Issue: Output is too verbose for CI logs
# Solution: Use --summary flag for compact tiered output
debtmap validate . --summary
# or
debtmap validate . -s
This provides a condensed view focused on priority tiers rather than individual items.
Issue: Validation passes locally but fails in CI
# Possible causes:
# 1. Different code (stale local branch)
# 2. Different config file (check .debtmap.toml in CI)
# 3. Missing coverage data (check LCOV generation in CI)
# Debug in CI:
debtmap validate . -vvv # Maximum verbosity
Issue: Coverage threshold fails unexpectedly
# Check if coverage file is being read
debtmap validate . --coverage-file coverage/lcov.info -v
# Verify coverage file exists and is valid
ls -lh coverage/lcov.info
Issue: Context providers causing performance issues
# Disable expensive providers
debtmap validate . --enable-context --disable-context git_history
Issue: Semantic analysis causing errors or unexpected behavior
Debtmap uses semantic analysis by default, powered by tree-sitter for deep AST (Abstract Syntax Tree) analysis. This provides accurate understanding of code structure, control flow, and complexity patterns.
However, semantic analysis may encounter issues with:
- Unsupported or experimental language features
- Malformed or incomplete syntax
- Complex macro expansions
- Very large files that timeout during parsing
# Solution: Disable semantic analysis with fallback mode
debtmap validate . --semantic-off
When semantic analysis is disabled with --semantic-off, debtmap falls back to basic syntax analysis, which is faster but less accurate for complexity calculations. Use this flag if:
- Encountering parsing errors or timeouts
- Working with bleeding-edge language features
- Need faster validation at the cost of precision
Validation Report Generation
Generate detailed reports for debugging:
JSON format for programmatic analysis:
debtmap validate . --format json --output validation-report.json
cat validation-report.json | jq '.validation_details'
Markdown format for documentation:
debtmap validate . --format markdown --output validation-report.md
Terminal format with filtering:
debtmap validate . --format terminal --top 20 -vv
Best Practices
Setting Initial Thresholds
1. Establish baseline:
# Run analysis to see current metrics
debtmap analyze . --format json > baseline.json
cat baseline.json | jq '.unified_analysis.debt_density'
2. Set pragmatic thresholds:
[thresholds.validation]
# Start slightly above current values to prevent regression
max_debt_density = 60.0 # Current: 55.0
max_average_complexity = 12.0 # Current: 10.5
3. Gradually tighten:
# After 1 month of cleanup
max_debt_density = 50.0
max_average_complexity = 10.0
Progressive Threshold Tightening
Month 1-2: Prevent regression
max_debt_density = 60.0 # Above current baseline
Month 3-4: Incremental improvement
max_debt_density = 50.0 # Industry standard
Month 5-6: Quality leadership
max_debt_density = 30.0 # Best-in-class
Project-Specific Recommendations
Greenfield projects:
# Start with high quality bar
max_debt_density = 20.0
max_average_complexity = 8.0
min_coverage_percentage = 80.0
Active development:
# Balanced quality/velocity
max_debt_density = 50.0
max_average_complexity = 10.0
min_coverage_percentage = 70.0
Legacy modernization:
# Prevent regression during refactoring
max_debt_density = 100.0
max_average_complexity = 15.0
min_coverage_percentage = 50.0
Pre-Commit Hook Integration
Add validation as a pre-commit hook:
#!/bin/bash
# .git/hooks/pre-commit (the shebang must be the first line of the file)
echo "Running debtmap validation..."
if debtmap validate . -v; then
  echo "✅ Validation passed"
  exit 0
else
  echo "❌ Validation failed - commit blocked"
  exit 1
fi
Make it executable:
chmod +x .git/hooks/pre-commit
Performance Optimization
Enable parallel processing: Validation uses parallel processing by default for fast execution on multi-core systems.
Disable for resource-constrained environments:
# Limit parallelism
debtmap validate . --jobs 2
# Disable completely
debtmap validate . --no-parallel
Performance characteristics:
- Parallel call graph construction
- Multi-threaded file analysis
- Same performance as analyze command
Monitoring Trends
Track validation metrics over time:
# Generate timestamped reports
debtmap validate . --format json --output "reports/validation-$(date +%Y%m%d).json"
# Compare trends
jq -s 'map(.unified_analysis.debt_density)' reports/validation-*.json
Documentation
Document your threshold decisions:
# .debtmap.toml
[thresholds.validation]
# Rationale: Team agreed 50.0 density balances quality and velocity
# Review: Quarterly (next: 2025-04-01)
max_debt_density = 50.0
# Rationale: Enforces single-responsibility principle
# Review: After 3 months of metrics
max_average_complexity = 10.0
Summary
The validate command provides automated quality gates for CI/CD integration:
- Use density-based metrics for scale-independent quality measurement
- Configure in .debtmap.toml for consistent, version-controlled thresholds
- Integrate with CI/CD using exit codes for automated enforcement
- Enable coverage and context for risk-based validation
- Migrate from deprecated metrics to density-based approach
- Debug with verbosity flags when validation fails unexpectedly
- Tighten thresholds progressively as code quality improves
Next steps:
- Review Configuration Reference for detailed threshold options
- See Examples for more CI/CD integration patterns
- Check CLI Reference for complete command documentation
Metrics Reference
Comprehensive guide to all metrics calculated by Debtmap and how to interpret them.
Metric Categories (Spec 118)
Debtmap distinguishes between two fundamental categories of metrics:
Measured Metrics
Definition: Metrics directly computed from the Abstract Syntax Tree (AST) through precise analysis.
These metrics are:
- Deterministic: Same code always produces the same metric value
- Precise: Exact counts from syntax analysis, not estimates
- Suitable for thresholds: Reliable for CI/CD quality gates
- Language-specific: Computed using language parsers (syn for Rust, tree-sitter for others)
Measured metrics include:
| Metric | Description | Example |
|---|---|---|
| cyclomatic_complexity | Count of decision points (if, match, while, for, etc.) | Function with 3 if statements = complexity 4 |
| cognitive_complexity | Weighted measure of code understandability | Nested loops increase cognitive load |
| nesting_depth | Maximum levels of nested control structures | 3 nested if statements = depth 3 |
| loc | Lines of code in the function | Physical line count |
| parameter_count | Number of function parameters | fn foo(a: i32, b: String) = 2 |
Estimated Metrics
Definition: Heuristic approximations calculated using formulas, not direct AST measurements.
These metrics are:
- Heuristic: Based on mathematical formulas and assumptions
- Approximate: Close estimates, not exact counts
- Useful for prioritization: Help estimate effort and risk
- Not suitable for hard thresholds: Use for relative comparisons, not absolute gates
Estimated metrics include:
| Metric | Formula | Purpose | Example |
|---|---|---|---|
| est_branches | max(nesting, 1) × cyclomatic ÷ 3 | Estimate test cases needed for branch coverage | Complexity 12, nesting 3 → ~12 branches |
Important: The est_branches metric was previously called branches. It was renamed in Spec 118 to make it explicit that this is an estimate, not a precise count from the AST.
Why the Distinction Matters
For Code Quality Gates
# GOOD: Use measured metrics for CI/CD thresholds
debtmap validate . --threshold-complexity 15
# AVOID: Don't use estimated metrics for hard gates
# (est_branches is not exposed as a threshold flag)
Rationale: Measured metrics are deterministic and precise, making them suitable for build-breaking quality gates.
For Prioritization
# GOOD: Use est_branches for prioritization
debtmap analyze . --top 10 # Sorts by est_branches (among other factors)
# GOOD: Estimated metrics help understand testing effort
debtmap analyze . --lcov coverage.info --verbose
Rationale: Estimated metrics provide useful heuristics for understanding where to focus testing and refactoring efforts.
For Comparison Across Codebases
- Measured metrics: Comparable across projects (cyclomatic 10 means the same everywhere)
- Estimated metrics: Project-specific heuristics (est_branches depends on nesting patterns)
Detailed Metric Descriptions
Cyclomatic Complexity (Measured)
What it measures: The number of linearly independent paths through a function’s control flow.
How it’s calculated:
- Start with a base of 1
- Add 1 for each decision point:
  - if, else if
  - match arms
  - while, for, loop
  - &&, || in conditions
  - ? operator (early return)
Example:
fn example(x: i32, y: i32) -> bool {
    if x > 0 {            // +1
        if y > 0 {        // +1
            true
        } else {          // implicit in if/else
            false
        }
    } else if x < 0 {     // +1
        false
    } else {
        y == 0            // no additional branches
    }
}
// Cyclomatic complexity = 1 + 3 = 4
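The counting rule can be sketched over a deliberately tiny control-flow tree. This is an illustration only: the `Node` enum and both functions are hypothetical and far simpler than a real AST visitor, and `match` arms are omitted to keep the sketch short.

```rust
/// A simplified control-flow tree; real analyzers walk a full AST.
enum Node {
    /// `if` or `else if`, with the statements of each branch.
    If { then_branch: Vec<Node>, else_branch: Vec<Node> },
    /// `while`, `for`, or `loop`, with the loop body.
    Loop(Vec<Node>),
    /// One `&&`, `||`, or `?` occurrence.
    Operator,
    /// Any statement that does not branch.
    Plain,
}

/// Counts decision points recursively through nested bodies.
fn decision_points(nodes: &[Node]) -> u32 {
    nodes
        .iter()
        .map(|node| match node {
            Node::If { then_branch, else_branch } => {
                1 + decision_points(then_branch) + decision_points(else_branch)
            }
            Node::Loop(body) => 1 + decision_points(body),
            Node::Operator => 1,
            Node::Plain => 0,
        })
        .sum()
}

/// Base of 1 plus one per decision point.
fn cyclomatic(body: &[Node]) -> u32 {
    1 + decision_points(body)
}
```

Modeling the example function as an outer `If` whose then-branch holds a nested `If` and whose else-branch holds the `else if` gives three decision points, so `cyclomatic` returns 4.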
Thresholds:
- 1-5: Simple, easy to test
- 6-10: Moderate, manageable complexity
- 11-20: Complex, consider refactoring
- 21+: Very complex, high maintenance cost
Cognitive Complexity (Measured)
What it measures: How difficult the code is for humans to understand.
How it differs from cyclomatic:
- Weights nested structures more heavily (nested if is worse than sequential if)
- Ignores shorthand structures (early returns, guard clauses)
- Focuses on readability, not just logic paths
Example:
// Status is defined here so the example compiles on its own.
enum Status {
    Active,
    Pending,
    Closed,
    Error,
}

fn cyclomatic_low_cognitive_low(status: Status) -> bool {
    match status { // Cyclomatic: 4, Cognitive: 1
        Status::Active => true,
        Status::Pending => false,
        Status::Closed => false,
        Status::Error => false,
    }
}

fn cyclomatic_low_cognitive_high(x: i32, y: i32, z: i32) -> bool {
    if x > 0 {
        if y > 0 {         // Nested: +2 cognitive penalty
            if z > 0 {     // Deeply nested: +3 cognitive penalty
                return true;
            }
        }
    }
    false
}
// Cyclomatic: 4, Cognitive: 7 (nesting penalty applied)
Thresholds:
- 1-5: Easy to understand
- 6-10: Moderate mental load
- 11-15: Difficult to follow
- 16+: Refactor recommended
Estimated Branches (Estimated)
What it estimates: Approximate number of execution paths that would need test coverage.
Formula:
est_branches = max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
Why this formula:
- Nesting multiplier: Deeper nesting creates more combinations
- Cyclomatic base: Higher complexity → more paths
- ÷ 3 adjustment: Empirical factor to align with typical branch coverage needs
Example scenarios:
| Cyclomatic | Nesting | est_branches | Interpretation |
|---|---|---|---|
| 3 | 1 | 1 | Simple linear code |
| 12 | 1 | 4 | Multiple sequential branches |
| 12 | 3 | 12 | Nested conditions, many paths |
| 20 | 5 | 33 | Complex nested logic |
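The table rows can be reproduced with a few lines of Rust. This is a sketch under one assumption: the division truncates toward zero, which is inferred from the last row (20 × 5 ÷ 3 shown as 33) rather than confirmed by the source. The function name `est_branches` is chosen to match the displayed metric and is not necessarily the name used in Debtmap's code.

```rust
/// Hypothetical helper mirroring the documented formula:
///   est_branches = max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
/// Integer (truncating) division is an assumption inferred from the table.
pub fn est_branches(cyclomatic: u32, nesting_depth: u32) -> u32 {
    // max(nesting_depth, 1) keeps flat code (depth 0) from zeroing the estimate.
    nesting_depth.max(1) * cyclomatic / 3
}
```

Calling `est_branches(12, 3)` gives 12 and `est_branches(20, 5)` gives 33, matching the table.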
Use cases:
- Estimating test case requirements
- Prioritizing untested complex code
- Understanding coverage gaps
Limitations:
- Not a precise count: This is a heuristic approximation
- Don’t use for coverage percentage calculation: Use actual coverage tools
- Varies by code style: Heavily nested code scores higher
Terminology Change (Spec 118)
Before: branches
Previously, this metric was displayed as branches=X, which was confusing because:
- Users thought it was a precise count from AST analysis
- It was mistaken for cyclomatic complexity (actual branch count)
- The estimation nature was not obvious
After: est_branches
Now displayed as est_branches=X to:
- Make estimation explicit: “est_” prefix indicates this is approximate
- Avoid confusion: Clearly different from cyclomatic complexity
- Set correct expectations: Users know this is a heuristic, not a measurement
Migration Guide
Terminal Output:
- Old: COMPLEXITY: cyclomatic=12, branches=8, cognitive=15
- New: COMPLEXITY: cyclomatic=12, est_branches=8, cognitive=15
Code:
- Internal variable names updated from branches to est_branches
- Comments added explaining the estimation formula
JSON Output:
- No change: The ComplexityMetrics struct does not include this field
- est_branches is calculated on-demand for display purposes only
Practical Usage Examples
Example 1: Code Quality Gate
# Fail build if any function exceeds cyclomatic complexity 15
debtmap validate . --threshold-complexity 15 --max-high 0
# Why: Cyclomatic is measured, precise, and repeatable
Example 2: Prioritize Testing Effort
# Show top 10 functions by risk (uses est_branches in scoring)
debtmap analyze . --lcov coverage.info --top 10
# Functions with high est_branches and low coverage appear first
Example 3: Understanding Test Requirements
# Verbose output shows est_branches for each function
debtmap analyze . --verbose
# Output:
# └─ COMPLEXITY: cyclomatic=12, est_branches=8, cognitive=15, nesting=2
#
# Interpretation: ~8 test cases likely needed for good branch coverage
Example 4: Explaining Metrics to Team
# Display comprehensive metric definitions
debtmap analyze --explain-metrics
# Shows:
# - Measured vs Estimated categories
# - Formulas and thresholds
# - When to use each metric
Metric Selection Guide
When to Use Cyclomatic Complexity
✅ Use for:
- CI/CD quality gates
- Code review guidelines
- Consistent cross-project comparison
- Identifying refactoring candidates
❌ Don’t use for:
- Estimating test effort (use est_branches)
- Readability assessment (use cognitive complexity)
When to Use Cognitive Complexity
✅ Use for:
- Readability reviews
- Identifying hard-to-maintain code
- Onboarding difficulty assessment
❌ Don’t use for:
- Test coverage planning
- Strict quality gates (more subjective than cyclomatic)
When to Use est_branches
✅ Use for:
- Estimating test case requirements
- Prioritizing test coverage work
- Understanding coverage gaps
❌ Don’t use for:
- CI/CD quality gates (it’s an estimate)
- Calculating coverage percentages (use actual coverage data)
- Cross-project comparison (formula is heuristic)
Combining Metrics for Insights
High Complexity, Low Coverage
cyclomatic=18, est_branches=12, coverage=0%
Interpretation: High-risk code needing ~12 test cases for adequate coverage.
Action: Prioritize writing tests, consider refactoring.
High Cyclomatic, Low Cognitive
cyclomatic=15, cognitive=5
Interpretation: Many branches, but simple linear logic (e.g., validation checks).
Action: Acceptable pattern, tests should be straightforward.
Low Cyclomatic, High Cognitive
cyclomatic=8, cognitive=18
Interpretation: Deeply nested logic, hard to understand despite fewer branches.
Action: Refactor to reduce nesting, extract functions.
High est_branches, Low Cyclomatic
cyclomatic=9, nesting=5, est_branches=15
Interpretation: Deep nesting creates many path combinations.
Action: Flatten nesting, use early returns, extract nested logic.
Frequently Asked Questions
Q: Why is est_branches different from cyclomatic complexity?
A: Cyclomatic is the measured count of decision points. est_branches is an estimated number of execution paths, calculated using nesting depth to account for path combinations.
Q: Can I use est_branches in CI/CD thresholds?
A: No. Use measured metrics (cyclomatic_complexity, cognitive_complexity) for quality gates. est_branches is a heuristic for prioritization, not a precise measurement.
Q: Why did the metric name change from “branches” to “est_branches”?
A: To make it explicit that this is an estimate, not a measured value. Users were confused, thinking it was a precise count from the AST.
Q: How accurate is est_branches for estimating test cases?
A: It’s a rough approximation. Actual test case requirements depend on:
- Business logic complexity
- Edge cases
- Error handling paths
- Integration points
Use est_branches as a starting point, not an exact requirement.
Q: Should I refactor code with high est_branches?
A: Not necessarily. High est_branches indicates complex logic that may need thorough testing. If the logic is unavoidable (e.g., state machines, complex business rules), focus on comprehensive test coverage rather than refactoring.
Further Reading
- Why Debtmap? - Entropy Analysis
- Configuration - Complexity Thresholds
- Coverage Integration
- Scoring Strategies
Examples
This chapter provides practical, real-world examples of using Debtmap across different project types and workflows. All examples use current CLI syntax verified against the source code.
Quick Start: New to Debtmap? Start with Basic Rust Analysis for the simplest introduction, then explore Coverage Integration for risk-based prioritization.
Quick Navigation: For detailed explanations of all CLI options, see the CLI Reference chapter.
Overview
This chapter demonstrates:
- Language-specific analysis: Rust, Python, JavaScript/TypeScript with their respective testing tools
- CI/CD integration: GitHub Actions, GitLab CI, CircleCI with validation gates
- Output formats: Terminal, JSON, and Markdown with interpretation guidance
- Advanced features: Context-aware analysis, multi-pass processing
- Configuration patterns: Tailored settings for different project types
- Progress tracking: Using the compare command to validate refactoring improvements
All examples are copy-paste ready and tested against the current Debtmap implementation.
Table of Contents
- Analyzing Rust Projects
- Python Analysis
- JavaScript/TypeScript
- CI Integration
- Output Formats
- Advanced Usage
- Configuration Examples
- Compare Command
Analyzing Rust Projects
Basic Rust Analysis
Start with a simple analysis of your Rust project:
# Analyze current directory (path defaults to '.' - both commands are identical)
debtmap analyze
# Same as above with explicit path
debtmap analyze .
# Analyze specific directory
debtmap analyze ./src
# Analyze with custom complexity threshold
debtmap analyze ./src --threshold-complexity 15
Coverage Integration with cargo-tarpaulin
Combine complexity analysis with test coverage for risk-based prioritization:
# Generate LCOV coverage data
cargo tarpaulin --out Lcov --output-dir target/coverage
# Analyze with coverage data
debtmap analyze . --lcov target/coverage/lcov.info
# Or use the equivalent long-form flag
debtmap analyze . --coverage-file target/coverage/lcov.info
Note: --lcov is an alias for --coverage-file; both work identically.
What this does:
- Functions with 0% coverage and high complexity get marked as [CRITICAL]
- Well-tested functions (>80% coverage) are deprioritized
- Shows risk reduction potential for each untested function
Custom Thresholds
Configure thresholds to match your project standards:
# Set both complexity and duplication thresholds
debtmap analyze . \
--threshold-complexity 15 \
--threshold-duplication 50
# Use preset configurations for quick setup
debtmap analyze . --threshold-preset strict # Strict standards
debtmap analyze . --threshold-preset balanced # Default balanced
debtmap analyze . --threshold-preset lenient # Lenient for legacy code
Preset configurations:
- Strict: Lower thresholds for high quality standards (good for new projects)
- Balanced: Default thresholds suitable for typical projects
- Lenient: Higher thresholds designed for legacy codebases with existing technical debt
Preset Threshold Values:
| Preset | Complexity | Duplication | Max Function Lines | Max Nesting Depth |
|---|---|---|---|---|
| Strict | 8 | 30 | 30 | 3 |
| Balanced | 10 | 50 | 50 | 4 |
| Lenient | 15 | 75 | 80 | 5 |
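For scripts that need the preset values outside Debtmap, the table can be encoded as a lookup. This is an illustrative sketch only: `PresetThresholds` and `preset_thresholds` are hypothetical names, and the numbers are copied from the table above rather than read from Debtmap itself.

```rust
/// Threshold values for one preset, as listed in the table above.
/// Hypothetical type, not part of Debtmap's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PresetThresholds {
    pub complexity: u32,
    pub duplication: u32,
    pub max_function_lines: u32,
    pub max_nesting_depth: u32,
}

/// Look up a preset by the name passed to --threshold-preset.
/// Returns None for names that are not in the table.
pub fn preset_thresholds(name: &str) -> Option<PresetThresholds> {
    let (complexity, duplication, max_function_lines, max_nesting_depth) = match name {
        "strict" => (8, 30, 30, 3),
        "balanced" => (10, 50, 50, 4),
        "lenient" => (15, 75, 80, 5),
        _ => return None,
    };
    Some(PresetThresholds {
        complexity,
        duplication,
        max_function_lines,
        max_nesting_depth,
    })
}
```

Returning `Option` rather than a default keeps a typo in the preset name from silently selecting the balanced values.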
God Object Detection
Identify classes and modules with too many responsibilities:
# Standard analysis includes god object detection
debtmap analyze .
# Disable god object detection for specific run
debtmap analyze . --no-god-object
God objects are flagged with detailed metrics:
- Number of methods and fields
- Responsibility count (grouped by naming patterns)
- God object score (0-100%)
- Recommendations for splitting
Purity-Weighted God Object Scoring
Debtmap uses purity analysis to distinguish functional programming patterns from actual god objects. Enable verbose mode to see purity distribution:
# See purity distribution in god object analysis
debtmap analyze . -v
Example Output:
GOD OBJECT ANALYSIS: src/core/processor.rs
Total functions: 107
PURITY DISTRIBUTION:
Pure: 70 functions (65%) → complexity weight: 6.3
Impure: 37 functions (35%) → complexity weight: 14.0
Total weighted complexity: 20.3
God object score: 12.0 (threshold: 70.0)
Status: ✓ Not a god object (functional design)
This shows:
- Pure functions (no side effects, immutable) receive 0.3× weight
- Impure functions (I/O, mutations, side effects) receive 1.0× weight
- Functional modules with many pure helpers avoid false positives
- Focus shifts to modules with excessive stateful code
Why This Matters:
Without purity weighting:
Module with 100 pure helpers → Flagged as god object ❌
With purity weighting:
Module with 100 pure helpers → Normal (functional design) ✅
Module with 100 impure functions → God object detected ✅
Compare Two Modules:
Functional module (70 pure, 30 impure):
Pure: 70 × 0.3 = 21.0
Impure: 30 × 1.0 = 30.0
Score: 35.0 → Not a god object ✓
Procedural module (100 impure):
Impure: 100 × 1.0 = 100.0
Score: 125.0 → God object detected ✗
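The weighted terms in the comparison above can be sketched in Rust. The sketch assumes only the two weights quoted in this section (0.3 and 1.0); `weighted_function_count` is a hypothetical helper. The final Score values shown above (35.0 and 125.0) come from further scoring steps that this section does not describe, so the sketch stops at the weighted terms.

```rust
/// Weights quoted in this section; the constant names are illustrative.
pub const PURE_WEIGHT: f64 = 0.3;
pub const IMPURE_WEIGHT: f64 = 1.0;

/// Sum of purity-weighted function counts.
/// This reproduces the "Pure: 70 × 0.3" and "Impure: 30 × 1.0" terms only,
/// not the final god object score.
pub fn weighted_function_count(pure: u32, impure: u32) -> f64 {
    f64::from(pure) * PURE_WEIGHT + f64::from(impure) * IMPURE_WEIGHT
}
```

The functional module (70 pure, 30 impure) contributes 21.0 + 30.0, while the procedural module (100 impure) contributes 100.0, so the pure helpers weigh far less per function.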
Filtering and Focusing
# Analyze only Rust files
debtmap analyze . --languages rust
# Focus on architecture issues (god objects, complexity)
debtmap analyze . --filter Architecture
# Focus on testing gaps
debtmap analyze . --filter Testing
# Filter by multiple categories
debtmap analyze . --filter Architecture,Testing
# Show only top 10 issues
debtmap analyze . --top 10
# Show only high-priority items
debtmap analyze . --min-priority high
Valid filter categories:
- Architecture - God objects, high complexity, structural issues
- Testing - Test coverage gaps, untested critical code
- Duplication - Code duplication and similar patterns
- Maintainability - Long functions, deep nesting, readability issues
Output Formats
# JSON output for CI integration
debtmap analyze . --format json --output report.json
# Markdown report
debtmap analyze . --format markdown --output DEBT_REPORT.md
# Terminal output (default) - prettified
debtmap analyze .
Multi-Pass Analysis
For deeper analysis with context awareness:
# Enable context-aware analysis with multiple providers
debtmap analyze . \
--context \
--context-providers critical_path,dependency,git_history
# Multi-pass analysis with attribution
debtmap analyze . --multi-pass --attribution
Complete CI Example
This is from Debtmap’s own .github/workflows/debtmap.yml:
# 1. Install cargo-tarpaulin
cargo install cargo-tarpaulin
# 2. Build debtmap
cargo build --release
# 3. Generate coverage
cargo tarpaulin --config .tarpaulin.toml --out Lcov --timeout 300
# 4. Run validation with coverage
./target/release/debtmap validate . \
--coverage-file target/coverage/lcov.info \
--format json \
--output debtmap-report.json
Python Analysis
Basic Python Analysis
# Analyze Python files only
debtmap analyze . --languages python
# Analyze specific Python directory
debtmap analyze src --languages python
Coverage Integration with pytest
Generate coverage and analyze risk:
# Generate LCOV coverage with pytest
pytest --cov --cov-report=lcov
# Analyze with coverage data
debtmap analyze . \
--languages python \
--lcov coverage.lcov
Python-Specific Patterns
# Focus on testing gaps in Python code
debtmap analyze . \
--languages python \
--filter Testing
# Find god objects in Python modules
debtmap analyze . \
--languages python \
--filter Architecture
Example Configuration for Python Projects
Create .debtmap.toml:
[languages]
enabled = ["python"]
[thresholds]
complexity = 12
max_function_lines = 40
[ignore]
patterns = [
"**/*_test.py",
"tests/**",
".venv/**",
"**/__pycache__/**",
]
[god_object]
enabled = true
max_methods = 15
max_responsibilities = 4
JavaScript/TypeScript
Analyzing JS/TS Projects
# Analyze JavaScript and TypeScript
debtmap analyze . --languages javascript,typescript
# TypeScript only
debtmap analyze . --languages typescript
Coverage Integration with Jest
# Generate LCOV with Jest
jest --coverage --coverageReporters=lcov
# Analyze with coverage
debtmap analyze . \
--languages javascript,typescript \
--lcov coverage/lcov.info
Node.js Project Patterns
# Exclude node_modules and focus on source
debtmap analyze src --languages javascript,typescript
# With custom complexity thresholds for JS
debtmap analyze . \
--languages javascript,typescript \
--threshold-complexity 10
TypeScript Configuration Example
Create .debtmap.toml:
[languages]
enabled = ["typescript", "javascript"]
[thresholds]
complexity = 10
max_function_lines = 50
[ignore]
patterns = [
"node_modules/**",
"**/*.test.ts",
"**/*.spec.ts",
"dist/**",
"build/**",
"**/*.d.ts",
]
Monorepo Analysis
# Analyze specific package
debtmap analyze packages/api --languages typescript
# Analyze all packages, grouped by category
debtmap analyze packages \
--languages typescript \
--group-by-category
CI Integration
GitHub Actions
Complete workflow example (from .github/workflows/debtmap.yml):
name: Debtmap
on:
  push:
    branches: [ main, master ]
  pull_request:
    branches: [ main, master ]
  workflow_dispatch:
env:
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1
jobs:
  validate:
    name: Technical Debt Validation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5
        with:
          fetch-depth: 0
      - name: Setup Rust
        uses: dtolnay/rust-toolchain@stable
        with:
          components: rustfmt, clippy
      - name: Cache cargo dependencies
        uses: actions/cache@v4
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-
      - name: Install cargo-tarpaulin
        run: |
          if ! command -v cargo-tarpaulin &> /dev/null; then
            cargo install cargo-tarpaulin
          else
            echo "cargo-tarpaulin already installed"
          fi
      - name: Build debtmap
        run: cargo build --release
      - name: Generate coverage data
        run: cargo tarpaulin --config .tarpaulin.toml --out Lcov --timeout 300
      - name: Run debtmap validation with coverage
        run: |
          if [ -f "target/coverage/lcov.info" ]; then
            ./target/release/debtmap validate . --coverage-file target/coverage/lcov.info --format json --output debtmap-report.json
          else
            echo "Warning: LCOV file not found, running validation without coverage data"
            ./target/release/debtmap validate . --format json --output debtmap-report.json
          fi
      - name: Upload debtmap report and coverage
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: debtmap-analysis-artifacts
          path: |
            debtmap-report.json
            target/coverage/lcov.info
          retention-days: 7
GitLab CI
debtmap:
  stage: quality
  image: rust:latest
  script:
    # Install debtmap
    - cargo install debtmap
    # Run tests with coverage (generates LCOV format)
    - cargo install cargo-tarpaulin
    - cargo tarpaulin --out Lcov
    # Validate with debtmap (using LCOV format)
    - debtmap validate .
      --coverage-file lcov.info
      --format json
      --output debtmap-report.json
  artifacts:
    paths:
      - lcov.info
      - debtmap-report.json
    expire_in: 1 week
CircleCI
version: 2.1
jobs:
  debtmap:
    docker:
      - image: cimg/rust:1.75
    steps:
      - checkout
      - run:
          name: Install debtmap
          command: cargo install debtmap
      - run:
          name: Generate coverage
          command: |
            cargo install cargo-tarpaulin
            cargo tarpaulin --out Lcov
      - run:
          name: Run debtmap
          command: |
            debtmap validate . \
              --coverage-file lcov.info \
              --format json \
              --output debtmap.json
      - store_artifacts:
          path: debtmap.json
workflows:
  version: 2
  build:
    jobs:
      - debtmap
Using debtmap validate for PR Gates
# Fail build if thresholds are exceeded
debtmap validate . --coverage-file lcov.info
# With custom thresholds
debtmap validate . \
--coverage-file lcov.info \
--threshold-complexity 15
# Exit code 0 if passing, 1 if failing
Compare Command in CI
Track technical debt trends over time:
# Generate baseline (on main branch)
debtmap analyze . --format json --output baseline.json
# After PR changes
debtmap analyze . --format json --output current.json
# Compare and fail if regressions detected
debtmap compare \
--before baseline.json \
--after current.json \
--format json
Output Formats
Terminal Output (Default)
The default terminal output is prettified with colors and priorities:
debtmap analyze . --lcov coverage.lcov --top 3
Example output:
════════════════════════════════════════════
PRIORITY TECHNICAL DEBT FIXES
════════════════════════════════════════════
🎯 TOP 3 RECOMMENDATIONS (by unified priority)
#1 SCORE: 8.9 [CRITICAL]
├─ TEST GAP: ./src/analyzers/rust.rs:38 parse_function()
├─ ACTION: Add 6 unit tests for full coverage
├─ IMPACT: Full test coverage, -3.7 risk
├─ COMPLEXITY: cyclomatic=6, cognitive=8, nesting=2, lines=32
├─ DEPENDENCIES: 0 upstream, 11 downstream
└─ WHY: Business logic with 0% coverage, manageable complexity
📊 TOTAL DEBT SCORE: 4907
📈 OVERALL COVERAGE: 67.12%
JSON Output
Machine-readable format for CI/CD integration:
debtmap analyze . --format json --output report.json
Using JSON output programmatically:
# Extract total debt score
debtmap analyze . --format json | jq '.total_debt_score'
# Count critical items
debtmap analyze . --format json | jq '[.items[] | select(.unified_score.final_score >= 8)] | length'
# Get top 5 functions by score
debtmap analyze . --format json | jq '.items | sort_by(-.unified_score.final_score) | .[0:5] | .[].location'
# Extract all test gap items
debtmap analyze . --format json | jq '[.items[] | select(.debt_type == "TestGap")]'
Structure:
{
  "items": [
    {
      "location": {
        "file": "src/main.rs",
        "function": "process_data",
        "line": 42
      },
      "debt_type": "TestGap",
      "unified_score": {
        "complexity_factor": 3.2,
        "coverage_factor": 10.0,
        "dependency_factor": 2.5,
        "role_multiplier": 1.2,
        "final_score": 9.4
      },
      "function_role": "BusinessLogic",
      "recommendation": {
        "action": "Add unit tests",
        "details": "Add 6 unit tests for full coverage",
        "effort_estimate": "2-3 hours"
      },
      "expected_impact": {
        "risk_reduction": 3.9,
        "complexity_reduction": 0,
        "coverage_improvement": 100
      }
    }
  ],
  "overall_coverage": 67.12,
  "total_debt_score": 4907
}
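The jq filters shown earlier (count critical items, top N by score, select test gaps) can also be expressed in Rust once the report has been loaded. The sketch below is illustrative and makes several assumptions: `DebtItem` is a hypothetical, flattened stand-in for one entry of `items` that carries only the fields the filters use, the values are built by hand rather than deserialized, and the `>= 8` cutoff for "critical" is taken from the jq example rather than from Debtmap's source.

```rust
/// Hypothetical, flattened view of one entry in the report's "items" array.
/// Field names follow the JSON structure above.
#[derive(Debug, Clone)]
pub struct DebtItem {
    pub file: String,
    pub function: String,
    pub line: u32,
    pub debt_type: String,
    pub final_score: f64, // unified_score.final_score
}

/// Mirrors: [.items[] | select(.unified_score.final_score >= 8)] | length
pub fn count_critical(items: &[DebtItem]) -> usize {
    items.iter().filter(|item| item.final_score >= 8.0).count()
}

/// Mirrors: .items | sort_by(-.unified_score.final_score) | .[0:n]
pub fn top_n(items: &[DebtItem], n: usize) -> Vec<&DebtItem> {
    let mut sorted: Vec<&DebtItem> = items.iter().collect();
    // Descending order by score; total_cmp gives a total order over f64.
    sorted.sort_by(|a, b| b.final_score.total_cmp(&a.final_score));
    sorted.truncate(n);
    sorted
}

/// Mirrors: [.items[] | select(.debt_type == "TestGap")]
pub fn test_gaps(items: &[DebtItem]) -> Vec<&DebtItem> {
    items.iter().filter(|item| item.debt_type == "TestGap").collect()
}
```

In a real integration the items would come from a JSON parser; the filter logic stays the same.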
Markdown Report
# Standard markdown report
debtmap analyze . --format markdown --output DEBT_REPORT.md
# Summary level for executives (minimal detail)
debtmap analyze . --format markdown --detail-level summary --output SUMMARY.md
# Standard level for team review (default)
debtmap analyze . --format markdown --detail-level standard --output DEBT.md
# Comprehensive level for deep analysis
debtmap analyze . --format markdown --detail-level comprehensive --output DETAILED.md
# Debug level for troubleshooting
debtmap analyze . --format markdown --detail-level debug --output DEBUG.md
Detail levels:
- summary: Executive summary with key metrics and top issues only
- standard: Balanced detail suitable for team reviews (default)
- comprehensive: Full details including all debt items and analysis
- debug: Maximum detail including AST information and parser internals
Use cases:
- Summary: Management reports, PR comments
- Standard: Regular team reviews
- Comprehensive: Deep dives, refactoring planning
- Debug: Troubleshooting debtmap behavior
Great for documentation or PR comments.
Understanding Output Formats
Note: --format selects the output type (json/markdown/terminal), while --output-format selects the JSON structure variant (unified/legacy). They serve different purposes and can be used together.
# JSON output (default is legacy format)
debtmap analyze . --format json
# Unified JSON format (alternative to legacy)
debtmap analyze . --format json --output-format unified
# Legacy JSON format (default, for backward compatibility)
debtmap analyze . --format json --output-format legacy
# Output format options: terminal, json, markdown
debtmap analyze . --format terminal
Unified vs Legacy JSON Formats:
The unified format provides a consistent structure with a type field to distinguish between different item types, replacing the File/Function wrapper objects used in legacy format.
Legacy format (default):
{
  "items": [
    {"File": {"path": "src/main.rs", "score": 45.2}},
    {"Function": {"name": "process", "complexity": 12}}
  ]
}
Unified format:
{
  "items": [
    {"type": "file", "path": "src/main.rs", "score": 45.2},
    {"type": "function", "name": "process", "complexity": 12}
  ]
}
Key differences:
- Unified format: Consistent structure with type discriminator field, easier to parse programmatically, better for new integrations
- Legacy format: Uses wrapper objects for backward compatibility with existing tooling and scripts
Use unified format for new integrations and tools. Use legacy format when working with existing debtmap analysis pipelines.
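The difference between the two shapes is easiest to see when the same item is rendered both ways. The sketch below is illustrative only: `Item`, `to_legacy_json`, and `to_unified_json` are hypothetical names, only the fields from the two examples above are modelled, and strings are interpolated without JSON escaping, so it is not a substitute for a real serializer.

```rust
/// Hypothetical model of the two item kinds shown in the examples above.
#[derive(Debug, Clone, PartialEq)]
pub enum Item {
    File { path: String, score: f64 },
    Function { name: String, complexity: u32 },
}

/// Legacy shape: the item kind is the key of a wrapper object.
pub fn to_legacy_json(item: &Item) -> String {
    match item {
        Item::File { path, score } => {
            format!(r#"{{"File": {{"path": "{path}", "score": {score}}}}}"#)
        }
        Item::Function { name, complexity } => {
            format!(r#"{{"Function": {{"name": "{name}", "complexity": {complexity}}}}}"#)
        }
    }
}

/// Unified shape: a flat object with a "type" discriminator field.
pub fn to_unified_json(item: &Item) -> String {
    match item {
        Item::File { path, score } => {
            format!(r#"{{"type": "file", "path": "{path}", "score": {score}}}"#)
        }
        Item::Function { name, complexity } => {
            format!(r#"{{"type": "function", "name": "{name}", "complexity": {complexity}}}"#)
        }
    }
}
```

A consumer of the unified shape can branch on a single `type` field, whereas a consumer of the legacy shape has to inspect which wrapper key is present.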
Advanced Usage
Design Pattern Detection
Detect design patterns in your codebase to understand architectural choices and identify potential overuse of certain patterns:
# Detect specific design patterns
debtmap analyze . --patterns observer,singleton,factory
# Adjust pattern confidence threshold (0.0-1.0, default: 0.7)
debtmap analyze . --pattern-threshold 0.8
# Show uncertain pattern matches with warnings
debtmap analyze . --show-pattern-warnings
# Disable pattern detection entirely
debtmap analyze . --no-pattern-detection
Available patterns:
- observer - Event listener registrations, callback patterns
- singleton - Static instance management
- factory - Object creation methods
- strategy - Algorithm selection via traits/interfaces
- callback - Function passing and invocation
- template_method - Abstract methods with concrete implementations
Use cases:
- Identify architectural patterns in unfamiliar codebases
- Detect potential overuse of certain patterns (e.g., too many singletons)
- Understand code organization and design decisions
Dead Code and Public API Analysis
Control how Debtmap detects public APIs to prevent false positives when analyzing libraries vs CLI tools:
# Analyze with public API awareness (default for libraries)
debtmap analyze . --context
# Disable public API heuristics (useful for CLI tools)
debtmap analyze . --no-public-api-detection
# Adjust public API confidence threshold (default: 0.7)
debtmap analyze . --public-api-threshold 0.8
When to use:
- Libraries: Keep default public API detection to avoid flagging exported functions as unused
- CLI tools: Use --no-public-api-detection since there’s no public API
- Mixed projects: Adjust threshold based on false positive rate
What it detects:
- Functions exported in lib.rs or api.rs
- Public trait implementations
- Functions matching API naming patterns
- Prevents false positives for “unused” library functions
Context-Aware Analysis
Enable advanced context providers for more accurate prioritization:
# Enable all context providers for comprehensive analysis
debtmap analyze . \
--context \
--context-providers critical_path,dependency,git_history
Context Providers:
critical_path - Identifies functions on critical execution paths
- Analyzes call graph to find frequently-called functions
- Prioritizes functions that affect many code paths
- Use for: Understanding impact of potential failures
dependency - Analyzes dependency impact and cascade effects
- Tracks caller/callee relationships
- Calculates cascade impact of changes
- Use for: Understanding change propagation and refactoring risk
git_history - Tracks change frequency and churn
- Analyzes git blame and commit history
- Identifies frequently-changed functions
- Use for: Finding volatile code that needs stabilization
Example workflows:
# Find volatile high-complexity code
debtmap analyze . --context --context-providers git_history
# Understand refactoring impact
debtmap analyze . --context --context-providers dependency
# Disable slow provider for faster analysis
debtmap analyze . --context --disable-context git_history
Multi-Pass Analysis
# Multi-pass with attribution tracking
debtmap analyze . --multi-pass --attribution
# Shows which functions contribute to which patterns
Aggregation Methods
# Use logarithmic sum for aggregation
debtmap analyze . --aggregation-method logarithmic_sum
# Standard sum (default)
debtmap analyze . --aggregation-method sum
Filtering and Grouping
# Group results by debt category
debtmap analyze . --group-by-category
# Filter specific categories
debtmap analyze . --filter Architecture,Testing
# Show only high-priority items
debtmap analyze . --min-priority high --top 10
Call Graph Debugging
Debug and validate call graph construction for accurate dependency analysis:
# Enable call graph debugging output
debtmap analyze . --debug-call-graph
# Trace specific function resolution
debtmap analyze . --trace-function my_function --trace-function another_fn
# Validate call graph structure (detect orphans and cycles)
debtmap analyze . --validate-call-graph
# Show detailed caller/callee relationships
debtmap analyze . --show-dependencies
Use cases:
Troubleshooting resolution failures:
# When a function isn't being analyzed correctly
debtmap analyze . --debug-call-graph --trace-function problematic_function
Understanding function relationships:
# See who calls what
debtmap analyze . --show-dependencies --top 10
Validating call graph integrity:
# Detect cycles and orphaned nodes
debtmap analyze . --validate-call-graph
Output includes:
- Resolution statistics (success/failure rates)
- DFS cycle detection results
- Orphan node detection
- Cross-module resolution details
Verbosity Levels
Control the level of diagnostic output for debugging and understanding analysis decisions:
# Level 1: Show main score factors
debtmap analyze . -v
# Level 2: Show detailed calculations
debtmap analyze . -vv
# Level 3: Show all debug information
debtmap analyze . -vvv
# Long form also available
debtmap analyze . --verbose
# Show macro expansion details (Rust)
debtmap analyze . --verbose-macro-warnings --show-macro-stats
What each level shows:
- -v: Score factor breakdowns and purity distribution in god object analysis
- -vv: Detailed calculations, coverage lookups, and metric computations
- -vvv: Full debug information including AST details and parser internals
Understanding Metrics
Learn how Debtmap calculates complexity metrics and scores:
# Show metric definitions and formulas
debtmap analyze . --explain-metrics
What it explains:
Measured metrics (counted from AST):
- cyclomatic_complexity - Decision points (if, match, while, for, etc.)
- cognitive_complexity - Weighted readability measure
- nesting_depth - Maximum nested control structure levels
- loc - Lines of code in function
- parameter_count - Number of function parameters
Estimated metrics (formula-based approximations):
- est_branches - Estimated execution paths
  - Formula: max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
  - Purpose: Estimate test cases needed for branch coverage
  - Note: This is an ESTIMATE, not a count from the AST
Scoring formulas:
- Complexity factor calculation
- Coverage factor weight
- Dependency factor impact
- Role multiplier application
- Final score aggregation
Use --explain-metrics when:
- First learning debtmap
- Questioning why something is flagged
- Understanding score differences
- Teaching team members about technical debt metrics
AST Functional Analysis
Enable AST-based functional composition analysis to detect functional programming patterns:
# Enable AST-based functional composition analysis
debtmap analyze . --ast-functional-analysis
# Combine with verbose mode to see purity analysis
debtmap analyze . --ast-functional-analysis -v
What it detects:
- Pure functions (no side effects, immutable)
- Impure functions (I/O, mutations, side effects)
- Function composition patterns
- Immutability patterns
Benefits:
- Distinguishes functional patterns from god objects (see purity weighting in God Object Detection section)
- Identifies opportunities for better testability
- Highlights side effect boundaries
- Supports functional programming code reviews
Example output with -v:
PURITY DISTRIBUTION:
Pure: 70 functions (65%) → complexity weight: 6.3
Impure: 37 functions (35%) → complexity weight: 14.0
Total weighted complexity: 20.3
Parallel Processing Control
Control thread count for CPU-bound systems or to limit resource usage in CI environments. By default, Debtmap uses all available cores for optimal performance.
# Use 8 parallel jobs
debtmap analyze . --jobs 8
# Disable parallel processing
debtmap analyze . --no-parallel
When to adjust:
- CI environments: Limit thread count to avoid resource contention with other jobs
- CPU-bound systems: Reduce threads if machine is under load
- Large codebases: Default parallelism provides best performance
- Debugging: Use --no-parallel to run sequentially, which makes troubleshooting simpler
Configuration Examples
Basic Configuration
Create .debtmap.toml:
[thresholds]
complexity = 15
duplication = 25
max_function_lines = 50
max_nesting_depth = 4
[languages]
enabled = ["rust", "python"]
[ignore]
patterns = [
"tests/**/*",
"**/*.test.rs",
"target/**",
]
Entropy-Based Complexity
[entropy]
enabled = true
weight = 0.5
use_classification = true
pattern_threshold = 0.7
entropy_threshold = 0.4
branch_threshold = 0.8
max_combined_reduction = 0.3
This reduces false positives for repetitive code patterns.
Understanding entropy-adjusted output:
When entropy analysis detects repetitive patterns, detailed output (-vv) shows both original and adjusted complexity:
debtmap analyze . -vv --top 5
Example output for repetitive validation function:
#15 SCORE: 68.2 [HIGH]
├─ COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
Interpreting the adjustment:
- cyclomatic=20: Original complexity before entropy adjustment
- dampened: 14: Adjusted complexity (20 × 0.70 = 14)
- factor: 0.70: Dampening factor (30% reduction applied)
- entropy: 0.30: Low entropy indicates repetitive patterns
- repetition: 95%: High pattern repetition detected
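The arithmetic behind the dampened value can be sketched as follows. The helper names are hypothetical, and rounding to the nearest integer is an assumption; the source only shows a case (20 × 0.70 = 14) where the product is already a whole number.

```rust
/// Apply a dampening factor to a cyclomatic complexity value.
/// Rounding to the nearest integer is an assumption, not confirmed by the source.
pub fn dampened_complexity(cyclomatic: u32, factor: f64) -> u32 {
    (f64::from(cyclomatic) * factor).round() as u32
}

/// Percentage reduction implied by a dampening factor,
/// e.g. a factor of 0.70 corresponds to roughly 30% dampening.
pub fn dampening_percent(factor: f64) -> f64 {
    (1.0 - factor) * 100.0
}
```

With the values from the example output, `dampened_complexity(20, 0.70)` gives 14, and a factor of 1.0 leaves the complexity unchanged, which corresponds to the undampened case.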
When no dampening is applied:
#5 SCORE: 85.5 [CRITICAL]
├─ COMPLEXITY: cyclomatic=15, est_branches=30, cognitive=22, nesting=4
No “dampened” indicator means the function has diverse logic without repetitive patterns, so the full complexity is used for scoring.
See Entropy Analysis for more details.
Custom Scoring Weights
[scoring]
coverage = 0.40 # Test coverage gaps
complexity = 0.40 # Code complexity
dependency = 0.20 # Dependency criticality
God Object Detection Tuning
[god_object]
enabled = true
# Purity-based scoring reduces false positives for functional code
# Pure functions (no side effects) get lower weight in god object scoring
purity_weight_pure = 0.3 # Pure function complexity weight (default: 0.3)
purity_weight_impure = 1.0 # Impure function complexity weight (default: 1.0)
# Rust-specific thresholds
[god_object.rust]
max_methods = 25
max_fields = 15
max_traits = 5
max_lines = 400
max_complexity = 50
# Python-specific thresholds
[god_object.python]
max_methods = 20
max_fields = 12
max_lines = 350
max_complexity = 45
# JavaScript/TypeScript-specific thresholds
[god_object.javascript]
max_methods = 20
max_fields = 12
max_lines = 300
max_complexity = 40
Why purity weighting matters: See the Purity-Weighted God Object Scoring section for detailed explanation. In short:
- Modules with many pure helper functions avoid false god object flags
- Focus shifts to modules with excessive stateful/impure code
- Functional programming patterns are properly recognized
Example:
- Module with 100 pure functions → Normal (functional design) ✅
- Module with 100 impure functions → God object detected ✅
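The effect of the two purity weights can be sketched as a weighted complexity total. The code below is a simplified model, not Debtmap's implementation: the Function record and weighted_complexity helper are hypothetical, and how the weighted total feeds into the final god object decision is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    complexity: int
    is_pure: bool


def weighted_complexity(functions, pure_weight=0.3, impure_weight=1.0):
    """Sum complexity, counting pure functions at a reduced weight."""
    return sum(
        f.complexity * (pure_weight if f.is_pure else impure_weight)
        for f in functions
    )
```

With the default weights, a module of pure helpers contributes roughly 30% of the complexity that the same module would contribute if every function were impure.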
External API Configuration
For libraries (not CLI tools):
[external_api]
detect_external_api = true
api_functions = [
"parse",
"Parser::new",
"client::connect",
]
api_files = [
"src/lib.rs",
"src/api.rs",
"src/public/*.rs",
]
Complete Multi-Language Configuration
[thresholds]
complexity = 12
duplication = 30
max_file_lines = 400
max_function_lines = 40
minimum_debt_score = 1.0
minimum_cyclomatic_complexity = 2
[entropy]
enabled = true
weight = 0.5
[scoring]
coverage = 0.40
complexity = 0.40
dependency = 0.20
[languages]
enabled = ["rust", "python", "javascript", "typescript"]
[ignore]
patterns = [
# Tests
"tests/**/*",
"**/*.test.*",
"**/*_test.*",
# Build artifacts
"target/**",
"dist/**",
"build/**",
"node_modules/**",
# Python
".venv/**",
"**/__pycache__/**",
# Generated code
"*.generated.*",
"*.pb.*",
]
[god_object]
enabled = true
max_methods = 18
max_fields = 12
Compare Command
The compare command helps validate that refactoring achieved its goals.
Basic Comparison Workflow
# 1. Generate baseline before refactoring
debtmap analyze . --format json --output before.json
# 2. Make your code improvements
# ... refactor, add tests, etc ...
# 3. Generate new analysis
debtmap analyze . --format json --output after.json
# 4. Compare and verify improvements
debtmap compare --before before.json --after after.json
Target-Specific Comparison
Focus on whether a specific function improved:
# Target format: file:function:line
debtmap compare \
--before before.json \
--after after.json \
--target-location src/main.rs:process_data:100
Using with Implementation Plans
Extract target automatically from plan files:
# If IMPLEMENTATION_PLAN.md contains:
# **Target**: src/parser.rs:parse_expression:45
debtmap compare \
--before before.json \
--after after.json \
--plan IMPLEMENTATION_PLAN.md
Output Formats
# JSON output (default)
debtmap compare --before before.json --after after.json
# Terminal output (explicit)
debtmap compare \
--before before.json \
--after after.json \
--format terminal
# JSON for CI integration (explicit output file)
debtmap compare \
--before before.json \
--after after.json \
--format json \
--output comparison.json
# Markdown report
debtmap compare \
--before before.json \
--after after.json \
--format markdown \
--output COMPARISON.md
Interpreting Results
Target Status:
- Resolved: Function no longer appears (complexity reduced below threshold)
- Improved: Metrics improved (complexity down, coverage up)
- Unchanged: No significant change
- Regressed: Metrics got worse
- Not Found: Target not found in baseline
Overall Trend:
- Improving: More items resolved/improved than regressed
- Stable: No significant changes
- Regressing: New critical debt introduced
Example Output:
Target Status: Resolved ✅
- src/parser.rs:parse_expression:45 reduced from complexity 22 to 8
- Coverage improved from 0% to 85%
Overall Trend: Improving
- 3 items resolved
- 2 items improved
- 0 regressions
- Total debt score: 450 → 285 (-37%)
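The percentage on the last line is a plain relative change. A minimal sketch of the arithmetic, using a hypothetical helper name:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    if before == 0:
        raise ValueError("baseline score is zero; relative change is undefined")
    return (after - before) / before * 100.0
```

For the example above, (285 - 450) / 450 is about -36.7%, which is displayed rounded as -37%.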
CI Integration
Use in pull request validation:
# In CI script
debtmap compare \
--before baseline.json \
--after current.json \
--format json | jq -e '.overall_trend == "Improving"'
# Exit code 0 if improving, 1 otherwise
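Where jq is not available, the same gate can be written in a few lines of Python. This sketch assumes only what the jq expression above assumes, namely that the comparison JSON has a top-level overall_trend field; the function names are hypothetical.

```python
import json


def is_improving(report):
    """True when the comparison report says the overall trend is Improving."""
    return report.get("overall_trend") == "Improving"


def gate(path):
    """Return a process exit code: 0 if improving, 1 otherwise."""
    with open(path, encoding="utf-8") as handle:
        report = json.load(handle)
    return 0 if is_improving(report) else 1
```

A CI step could write the report with --output comparison.json and then call sys.exit(gate("comparison.json")) from a small script.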
Tips and Best Practices
- Start Simple: Begin with basic analysis, add coverage later
- Use Filters: Focus on one category at a time (Architecture, Testing)
- Iterate: Run analysis, fix top items, repeat
- CI Integration: Automate validation in your build pipeline
- Track Progress: Use the compare command to validate improvements
- Configure Thresholds: Adjust to match your team’s standards
- Leverage Coverage: Always include coverage data for accurate risk assessment
Next Steps
- CLI Reference - Complete CLI documentation
- Analysis Guide - Understanding analysis results
- Configuration - Advanced configuration options
Frequently Asked Questions
Common questions about debtmap’s features, usage, and comparison with other tools.
Features & Capabilities
What’s the difference between measured and estimated metrics?
Debtmap distinguishes between two types of metrics (Spec 118):
Measured Metrics - Precise values from AST analysis:
- cyclomatic_complexity: Exact count of decision points
- cognitive_complexity: Weighted readability measure
- nesting_depth: Maximum nesting levels
- loc: Lines of code
- These are suitable for CI/CD quality gates and thresholds
Estimated Metrics - Heuristic approximations:
- est_branches: Estimated execution paths (formula-based)
  - Formula: max(nesting, 1) × cyclomatic ÷ 3
  - Use for: Estimating test cases needed
  - Don’t use for: Hard quality gates
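The est_branches formula is simple enough to reproduce by hand. The sketch below implements it directly. Returning an unrounded float is an assumption, since the text does not say how Debtmap rounds the displayed value.

```python
def est_branches(nesting, cyclomatic):
    """Estimated execution paths: max(nesting, 1) x cyclomatic / 3."""
    return max(nesting, 1) * cyclomatic / 3
```

The max(nesting, 1) term keeps the estimate from collapsing to zero for flat functions: a function with no nesting and cyclomatic complexity 9 still gets an estimate of 3.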
Why it matters:
- Use measured metrics for thresholds and gates (precise, repeatable)
- Use estimated metrics for prioritization and effort estimation (heuristic, approximate)
Example:
# GOOD: Use measured metric for quality gate
debtmap validate . --threshold-complexity 15
# GOOD: Use estimated metric for test prioritization
debtmap analyze . --top 10 # Considers est_branches for ranking
See Metrics Reference for complete details.
What is entropy-based complexity analysis?
Entropy analysis uses information theory to distinguish between genuinely complex code and repetitive pattern-based code. Traditional cyclomatic complexity counts branches, but not all branches are equal in cognitive load.
For example, a function with 20 identical if/return validation checks has the same cyclomatic complexity as a function with 20 diverse conditional branches handling different business logic. Entropy analysis gives the validation function a much lower effective complexity score because it follows a simple, repetitive pattern.
Result: 60-75% reduction in false positives compared to traditional complexity metrics.
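The underlying idea can be illustrated with Shannon entropy over a token stream. This is a deliberately simplified sketch: Debtmap's real analysis classifies and weights tokens, and its reported entropy scale is not reproduced here. The function name and the whitespace-level token lists are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy, in bits per token, of a sequence of tokens."""
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Twenty copies of the same validation check: few distinct tokens.
repetitive = ["if", "field", "is", "None", "return", "error"] * 20
# The same number of tokens, all different: many distinct tokens.
diverse = [f"token_{i}" for i in range(120)]
```

Both lists have 120 tokens, but the repetitive one carries far less information per token, which is the signal entropy analysis uses to discount repeated branches.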
How do I interpret entropy-adjusted complexity output?
When entropy analysis detects repetitive patterns, debtmap shows both the original and adjusted complexity values for transparency. Look for the “dampened” indicator in detailed output (-vv):
Example output:
COMPLEXITY: cyclomatic=20 (dampened: 14, factor: 0.70), est_branches=40, cognitive=25, nesting=3, entropy=0.30
- Entropy Impact: 30% dampening (entropy: 0.30, repetition: 95%)
Understanding the values:
- cyclomatic=20: Original complexity before adjustment
- dampened: 14: Adjusted complexity after entropy analysis
- factor: 0.70: Dampening factor (0.70 = 30% reduction)
- entropy: 0.30: Shannon entropy score (lower = more repetitive)
- repetition: 95%: Pattern repetition percentage (higher = more repetitive)
Verify the calculation:
original × factor = adjusted
20 × 0.70 = 14
If you don’t see dampening information, one of the following applies:
- The function is too small for entropy analysis (< 20 tokens)
- Entropy analysis didn’t detect repetitive patterns
- Entropy analysis is disabled (--semantic-off)
Read full guide in Entropy Analysis
How does coverage integration work?
Debtmap reads LCOV format coverage data (generated by tools like cargo-tarpaulin, pytest-cov, or jest) and maps it to specific functions and branches. It then combines coverage percentages with complexity metrics to calculate risk scores.
Key insight: A complex function with good test coverage is lower risk than a moderately complex function with no tests.
Example workflow:
# Generate coverage data
cargo tarpaulin --out lcov --output-dir target/coverage
# Analyze with coverage integration
debtmap analyze . --lcov target/coverage/lcov.info
See examples in Analysis Guide
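For readers unfamiliar with LCOV, the sketch below shows how per-file line coverage can be derived from the SF, DA, and end_of_record records of an LCOV tracefile. It is a simplified illustration, not Debtmap's parser: Debtmap maps coverage to individual functions, which this file-level example does not attempt. The function name is hypothetical.

```python
def line_coverage(lcov_text):
    """Return {source_file: percent_of_lines_hit} from LCOV tracefile text."""
    coverage = {}
    current = None
    found = hit = 0
    for raw in lcov_text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            # Start of a record for one source file.
            current = line[3:]
            found = hit = 0
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            fields = line[3:].split(",")
            found += 1
            if int(fields[1]) > 0:
                hit += 1
        elif line == "end_of_record" and current is not None:
            coverage[current] = 100.0 * hit / found if found else 0.0
            current = None
    return coverage
```

Running it over a tracefile is also a quick way to confirm that the SF paths match the paths Debtmap sees in your source tree.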
What languages are supported?
Full support:
- Rust (via syn crate - complete AST analysis)
- Python (via rustpython - full Python 3.x support)
Partial support:
- JavaScript (via tree-sitter - ES6+, JSX)
- TypeScript (via tree-sitter - basic support)
Planned:
- Go (target: Q2 2025)
- Java (target: Q3 2025)
- C/C++ (target: Q4 2025)
Language support means: AST parsing, metric extraction, complexity calculation, and pattern detection.
Why was “branches” renamed to “est_branches”?
The metric was renamed in Spec 118 to make it clear that this is an estimated value, not a precise measurement.
Problem with old name (“branches”):
- Users thought it was a direct count from AST analysis (it’s not)
- Caused confusion with cyclomatic complexity (which counts actual branches)
- Unclear that the value was formula-based
Benefits of new name (“est_branches”):
- The “est_” prefix makes the estimation explicit
- Clearly distinguishes it from measured metrics
- Sets correct user expectations
What changed:
- Terminal output: branches=8 → est_branches=8
- Internal variable names updated for clarity
- Documentation updated to explain the distinction
What didn’t change:
- The formula remains the same: max(nesting, 1) × cyclomatic ÷ 3
- JSON output (this field was never serialized to JSON)
- Scoring and prioritization logic
See Metrics Reference for more details.
Can I customize the complexity thresholds?
Yes! Configure thresholds in .debtmap.toml:
[thresholds]
cyclomatic_complexity = 10 # Flag functions above this
nesting_depth = 3 # Maximum nesting levels
loc = 200 # Maximum lines per function
parameter_count = 4 # Maximum parameters
[scoring]
critical_threshold = 8.0 # Risk score for Critical tier
high_threshold = 5.0 # Risk score for High tier
moderate_threshold = 2.0 # Risk score for Moderate tier
See Configuration for all available options.
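The three tier thresholds partition risk scores into bands. A minimal sketch of that mapping follows. It assumes that a score equal to a threshold falls into the higher tier and that scores below moderate_threshold are labeled "Low"; neither detail is stated above, and the function name is hypothetical.

```python
def risk_tier(score, critical=8.0, high=5.0, moderate=2.0):
    """Map a risk score to a tier name using the [scoring] thresholds."""
    if score >= critical:
        return "Critical"
    if score >= high:
        return "High"
    if score >= moderate:
        return "Moderate"
    return "Low"  # assumed label for scores under the moderate threshold
```

Raising critical_threshold in .debtmap.toml therefore moves borderline items from Critical down to High without changing their scores.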
Does debtmap integrate with CI/CD?
Yes! Use the validate command to enforce quality gates:
# Fail build if critical or high-tier debt detected
debtmap validate . --max-critical 0 --max-high 5
# Exit codes:
# 0 = validation passed
# 1 = validation failed (debt exceeds thresholds)
# 2 = analysis error
GitHub Actions example:
- name: Check technical debt
run: |
cargo tarpaulin --out lcov --output-dir target/coverage
debtmap validate . --lcov target/coverage/lcov.info \
--max-critical 0 --max-high 10 \
--format json --output debt-report.json
- name: Comment on PR
uses: actions/github-script@v6
with:
script: |
const report = require('./debt-report.json');
// Post report as PR comment
See Prodigy Integration for more CI/CD patterns.
Comparison with Other Tools
How is debtmap different from SonarQube?
| Aspect | Debtmap | SonarQube |
|---|---|---|
| Speed | 10-100x faster (Rust) | Slower (JVM overhead) |
| Coverage Integration | ✅ Built-in LCOV | ⚠️ Enterprise only |
| Entropy Analysis | ✅ Unique feature | ❌ No |
| Language Support | Rust, Python, JS/TS | 25+ languages |
| Setup | Single binary | JVM + server setup |
| Cost | Free, open-source | Free (basic) / Paid (advanced) |
| Use Case | Fast local analysis | Enterprise dashboards |
When to use SonarQube: Multi-language monorepos, enterprise compliance, centralized quality dashboards.
When to use debtmap: Rust-focused projects, local development workflow, coverage-driven prioritization.
How is debtmap different from CodeClimate?
| Aspect | Debtmap | CodeClimate |
|---|---|---|
| Deployment | Local binary | Cloud service |
| Coverage | Built-in integration | Separate tool |
| Entropy | ✅ Yes | ❌ No |
| Speed | Seconds | Minutes (uploads code) |
| Privacy | Code stays local | Code uploaded to cloud |
| Cost | Free | Free (open source) / Paid |
When to use CodeClimate: Multi-language projects, prefer SaaS solutions, want maintainability ratings.
When to use debtmap: Rust projects, privacy-sensitive code, fast local analysis, entropy-based scoring.
Should I replace clippy with debtmap?
No—use both! They serve different purposes:
clippy:
- Focuses on idiomatic Rust patterns
- Catches common mistakes (e.g., unnecessary clones, inefficient iterators)
- Suggests Rust-specific best practices
- Runs in milliseconds
debtmap:
- Focuses on technical debt prioritization
- Identifies untested complex code
- Combines complexity with test coverage
- Provides quantified recommendations
Recommended workflow:
# Fix clippy issues first (quick wins)
cargo clippy --all-targets --all-features -- -D warnings
# Then prioritize debt with debtmap
debtmap analyze . --lcov coverage/lcov.info --top 10
Should I replace cargo-audit with debtmap?
No—different focus. cargo-audit scans for security vulnerabilities in dependencies. Debtmap analyzes code complexity and test coverage.
Use both:
- cargo-audit - Security vulnerabilities in dependencies
- cargo-geiger - Unsafe code detection
- debtmap - Technical debt and test gaps
How does debtmap compare to traditional code coverage tools?
Debtmap doesn’t replace coverage tools—it augments them.
Coverage tools (tarpaulin, pytest-cov, jest):
- Measure what % of code is executed by tests
- Tell you “you have 75% coverage”
Debtmap:
- Reads coverage data from these tools
- Prioritizes gaps based on code complexity
- Tells you “function X has 0% coverage and complexity 12—fix this first”
Value: Debtmap answers “which 25% should I test first?” instead of just “75% is tested.”
Usage & Configuration
Why don’t entry points need 100% coverage?
Entry points (main functions, CLI handlers, framework integration code) are typically tested via integration tests, not unit tests. Unit testing them would mean mocking the entire runtime environment, which is brittle and low-value.
Debtmap recognizes common entry point patterns and lowers their priority for unit test coverage:
// Entry point - integration test coverage expected
fn main() {
// Debtmap: LOW priority for unit tests
}
// HTTP handler - integration test coverage expected
async fn handle_request(req: Request) -> Response {
// Debtmap: LOW priority for unit tests
}
// Core business logic - unit test coverage expected
fn calculate_discount(cart: &Cart) -> Discount {
// Debtmap: HIGH priority for unit tests if uncovered
}
You can configure entry point detection in .debtmap.toml:
[analysis]
entry_point_patterns = [
"main",
"handle_*",
"run_*",
"*_handler",
]
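To see which function names a pattern list like this would catch, the sketch below applies shell-style wildcard matching with Python's fnmatch module. It assumes the patterns are matched as case-sensitive globs against the bare function name; Debtmap's own matching rules may differ, and is_entry_point is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Patterns copied from the entry_point_patterns example above.
ENTRY_POINT_PATTERNS = ["main", "handle_*", "run_*", "*_handler"]


def is_entry_point(function_name, patterns=ENTRY_POINT_PATTERNS):
    """True if the function name matches any entry point pattern."""
    return any(fnmatchcase(function_name, pattern) for pattern in patterns)
```

Under these assumptions handle_request and request_handler are treated as entry points, while calculate_discount is not and stays a high-priority unit test target.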
How do I exclude test files from analysis?
By default, debtmap excludes common test directories. To customize:
.debtmap.toml:
[analysis]
exclude_patterns = [
"**/tests/**",
"**/*_test.rs",
"**/test_*.py",
"**/*.test.ts",
"**/target/**",
"**/node_modules/**",
]
Command line:
debtmap analyze . --exclude '**/tests/**' --exclude '**/*_test.rs'
Can I analyze only specific files or directories?
Yes! Use the --include flag:
# Analyze only src/ directory
debtmap analyze . --include 'src/**'
# Analyze specific files
debtmap analyze . --include 'src/main.rs' --include 'src/lib.rs'
# Combine include and exclude
debtmap analyze . --include 'src/**' --exclude 'src/generated/**'
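The interaction between include and exclude patterns can be sketched as follows. This is an illustrative model rather than Debtmap's implementation: it assumes a file is analyzed when it matches at least one include pattern (or no include patterns are given) and matches no exclude pattern, and it uses a small hand-written glob translator. Both helper names are hypothetical.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob with *, ? and ** into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # anything, including path separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_selected(path, includes, excludes):
    """Apply include patterns first, then let exclude patterns veto."""
    included = not includes or any(
        glob_to_regex(p).fullmatch(path) for p in includes
    )
    excluded = any(glob_to_regex(p).fullmatch(path) for p in excludes)
    return included and not excluded
```

With includes of src/** and excludes of src/generated/**, src/main.rs is selected and src/generated/api.rs is not.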
How do I configure ignore patterns for generated code?
Add to .debtmap.toml:
[analysis]
exclude_patterns = [
"**/generated/**",
"**/*.g.rs", # Generated Rust
"**/*_pb.py", # Protobuf generated Python
"**/*.generated.ts", # Generated TypeScript
]
Or use comments in source files:
// debtmap:ignore-file - entire file ignored
fn complex_function() {
// debtmap:ignore-start
// ... complex generated code ...
// debtmap:ignore-end
}
What if debtmap reports false positives?
1. Verify entropy analysis is enabled (default in v0.2.8+):
[analysis]
enable_entropy_analysis = true
2. Adjust thresholds for your project’s needs:
[thresholds]
cyclomatic_complexity = 15 # Increase if you have many validation functions
3. Use ignore comments for specific functions:
// debtmap:ignore - explanation for why this is acceptable
fn complex_but_acceptable() {
// ...
}
4. Report false positives: If you believe debtmap’s analysis is incorrect, please open an issue with a code example. This helps improve the tool!
How accurate is the risk scoring?
Risk scores are relative prioritization metrics, not absolute measures. They help you answer “which code should I focus on first?” rather than “exactly how risky is this code?”
Factors affecting accuracy:
- Coverage data quality: Accurate if your tests exercise realistic scenarios
- Entropy analysis: Effective for common patterns; may miss domain-specific patterns
- Call graph: More accurate within single files than across modules
- Context: Cannot account for business criticality (you know your domain best)
Best practice: Use risk scores for prioritization, but apply your domain knowledge when deciding what to actually refactor or test.
Can I run debtmap on a CI server?
Yes! Debtmap is designed for CI/CD pipelines:
Performance:
- Statically linked binary (no runtime dependencies)
- Fast analysis (seconds, not minutes)
- Low memory footprint
Exit codes:
- 0 - Analysis succeeded, validation passed
- 1 - Analysis succeeded, validation failed (debt thresholds exceeded)
- 2 - Analysis error (parse failure, invalid config, etc.)
Example CI configuration:
# .github/workflows/debt-check.yml
name: Technical Debt Check
on: [pull_request]
jobs:
debt-analysis:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install debtmap
run: cargo install debtmap
- name: Generate coverage
run: cargo tarpaulin --out lcov
- name: Analyze debt
run: debtmap validate . --lcov lcov.info --max-critical 0
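When debtmap validate is driven from a script rather than directly from a workflow step, the three exit codes listed above can be turned into readable messages. The sketch below is a hypothetical wrapper; it relies only on the exit codes documented in this section and passes any other arguments through unchanged.

```python
import subprocess

# Meanings taken from the exit code list in this section.
EXIT_MEANINGS = {
    0: "validation passed",
    1: "validation failed (debt thresholds exceeded)",
    2: "analysis error (parse failure, invalid config, etc.)",
}


def describe_exit_code(code):
    """Translate a debtmap validate exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_validate(args):
    """Run `debtmap validate` with extra arguments and describe the outcome."""
    result = subprocess.run(["debtmap", "validate", *args])
    return result.returncode, describe_exit_code(result.returncode)
```

Distinguishing code 1 from code 2 matters in CI: the first means the gate did its job, while the second usually points to a configuration or parsing problem that should be fixed rather than overridden.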
Troubleshooting
Analysis is slow on my large codebase
Optimization strategies:
1. Exclude unnecessary files:
[analysis]
exclude_patterns = [
"**/target/**",
"**/node_modules/**",
"**/vendor/**",
"**/.git/**",
]
2. Analyze specific directories:
# Only analyze src/, skip examples and benches
debtmap analyze src/
3. Reduce parallelism if memory-constrained:
debtmap analyze . --jobs 4
Expected performance:
- 50k LOC: 5-15 seconds
- 200k LOC: 30-90 seconds
- 1M+ LOC: 3-8 minutes
If analysis is significantly slower, please report a performance issue.
Debtmap crashes with “stack overflow”
This typically happens with extremely deep call stacks or heavily nested code.
Solutions:
1. Increase stack size:
# Linux/macOS
RUST_MIN_STACK=8388608 debtmap analyze .
# Windows PowerShell
$env:RUST_MIN_STACK=8388608; debtmap analyze .
2. Exclude problematic files:
debtmap analyze . --exclude 'path/to/deeply/nested/file.rs'
3. Report the issue: If you encounter stack overflows, please report with a minimal reproducible example.
Coverage data isn’t being applied
Check:
1. LCOV file path is correct:
debtmap analyze . --lcov target/coverage/lcov.info
2. LCOV file contains data:
grep -c "^SF:" target/coverage/lcov.info # Should be > 0
3. Source paths match: LCOV file paths must match your source file paths. If you generate coverage in a different directory:
[coverage]
source_root = "/path/to/project" # Rewrite LCOV paths
4. Enable debug logging:
RUST_LOG=debug debtmap analyze . --lcov lcov.info 2>&1 | grep -i coverage
Debtmap reports “No functions found”
Common causes:
1. Wrong language detection:
# Verify file extensions are recognized
debtmap analyze . --verbose
2. Syntax errors preventing parsing:
# Check for parse errors
RUST_LOG=warn debtmap analyze .
3. All files excluded by ignore patterns:
# List files being analyzed
debtmap analyze . --dry-run
4. Unsupported language features: Some cutting-edge syntax may not parse correctly. Report parsing issues with code examples.
How do I report a bug or request a feature?
Bug reports:
- Check existing issues
- Provide minimal reproducible example
- Include debtmap version: debtmap --version
- Include OS and Rust version: rustc --version
Feature requests:
- Describe the use case (what problem does it solve?)
- Provide example of desired behavior
- Explain why existing features don’t address the need
Contributions: Debtmap is open-source and welcomes contributions! See CONTRIBUTING.md for guidelines.
Advanced Topics
Can I extend debtmap with custom analyzers?
Not yet, but planned for v0.3.0. You’ll be able to implement the Analyzer trait for custom language support or domain-specific pattern detection.
Roadmap:
- v0.3.0: Plugin API for custom analyzers
- v0.4.0: Plugin API for custom scoring strategies
- v0.5.0: Plugin API for custom output formatters
Track progress in issue #42.
How does debtmap handle monorepos?
Workspace support: Debtmap analyzes each workspace member independently by default:
# Analyze entire workspace
debtmap analyze .
# Analyze specific member
debtmap analyze packages/api
# Combined report for all members
debtmap analyze . --workspace-mode combined
Configuration:
[workspace]
members = ["packages/*", "services/*"]
exclude = ["examples/*"]
Can I compare debt between branches or commits?
Yes! Use the compare command:
# Compare current branch with main
debtmap compare main
# Compare two specific commits
debtmap compare abc123..def456
# Show only new debt introduced
debtmap compare main --show-new-only
Output shows:
- New debt items (introduced since base)
- Resolved debt items (fixed since base)
- Changed debt items (score increased/decreased)
See Examples - Comparing Branches for details.
How do I integrate debtmap with my editor?
VS Code:
- Install the “Debtmap” extension (planned for Q2 2025)
- Inline warnings in editor for high-risk code
- Quick fixes to generate test stubs
Vim/Neovim:
- Use ALE or vim-lsp with debtmap’s LSP mode (planned)
IntelliJ/RustRover:
- Use external tools integration:
- Settings → Tools → External Tools
- Add debtmap command
- Configure keyboard shortcut
Track editor integration progress in issue #38.
Need More Help?
- Documentation: debtmap.dev
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions
- Examples: See Examples for real-world use cases
Troubleshooting
Common issues and solutions for using debtmap effectively.
Quick Fixes for Common Issues
If you’re experiencing problems, try these first:
- Analysis is slow: Adjust threads with --jobs or use --semantic-off for faster fallback mode
- Parse errors: Use --semantic-off for faster fallback mode or exclude problematic files
- No output: Increase verbosity with -v or lower --min-priority
- Inconsistent results: Check if coverage file changed or context providers are enabled
Common Issues
Parse Errors
Problem: Encountering “Parse error in file:line:column” messages
Causes:
- Unsupported language syntax or version
- Complex macro expansions (Rust)
- Type inference edge cases (Python, TypeScript)
Solutions:
# Try fallback mode without semantic analysis
debtmap --semantic-off
# For Rust macro issues, see detailed warnings
debtmap --verbose-macro-warnings --show-macro-stats
# Exclude specific problematic files
# Add to .debtmap/config.toml:
# exclude = ["path/to/problematic/file.rs"]
Out of Memory Errors
Problem: Analysis crashes or runs out of memory on large codebases
Solutions:
# Limit parallel processing
debtmap --jobs 2
# Disable parallel processing entirely
debtmap --no-parallel
# Test with limited files first
debtmap --max-files 100
# Analyze subdirectories separately
debtmap path/to/subset
Performance Issues
Problem: Analysis takes too long to complete
Solutions:
# Use all available CPU cores
debtmap --jobs 0
# Try faster fallback mode (less accurate)
debtmap --semantic-off
# Use plain output for faster terminal rendering
debtmap --plain
See Performance Tips for detailed optimization strategies.
File Permission Errors
Problem: “File system error” when accessing files
Solutions:
- Ensure you have read permissions for all source files
- Check that the project directory is accessible
Git History Errors
Problem: Errors when using git_history context provider
Causes:
- Not running in a git repository
- Git history not available for files
- Insufficient git permissions
Solutions:
# Disable git_history context provider
debtmap --context --disable-context git_history
# Disable all context providers
debtmap --no-context-aware
# Check if in git repository
git status
Coverage File Issues
Problem: Coverage file not being processed or causing errors
Causes:
- Non-LCOV format coverage file
- Malformed coverage data
- Path mismatches between coverage and source files
Solutions:
# Verify coverage file format (must be LCOV)
head coverage.info
# Check coverage file path
debtmap --coverage-file path/to/coverage.info -v
# Ensure paths in coverage file match source paths
# Coverage paths are relative to project root
Threshold and Preset Confusion
Problem: Unexpected filtering or priority levels
Solutions:
# Check what threshold preset does
debtmap --threshold-preset strict --help
# Override specific thresholds
debtmap --min-priority 3
# See all items regardless of thresholds
debtmap --min-priority 0
# Use category filters instead
debtmap --filter "complexity,debt"
JSON Format Issues
Problem: JSON output parsing errors or unexpected structure
Understanding the Two Formats:
Legacy format wraps items in variant-specific objects:
{"File": {"path": "src/main.rs", "score": 7.5, ...}}
{"Function": {"name": "parse", "score": 8.2, ...}}
Unified format uses consistent structure with type field:
{"type": "File", "path": "src/main.rs", "score": 7.5, ...}
{"type": "Function", "name": "parse", "score": 8.2, ...}
The unified format is recommended for parsing and tool integration as it provides a consistent structure across all item types.
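If a tool has to consume both formats, legacy items can be normalized into the unified shape before processing. The sketch below is based only on the two example shapes shown above; the helper name to_unified is hypothetical, and real output may contain variants not covered here.

```python
def to_unified(item):
    """Convert a legacy {"Variant": {...}} item into {"type": "Variant", ...}."""
    if "type" in item:
        # Already in unified form; return a copy unchanged.
        return dict(item)
    if len(item) != 1:
        raise ValueError("legacy item must have exactly one variant key")
    (kind, fields), = item.items()
    return {"type": kind, **fields}
```

After normalization, downstream code can branch on the type field instead of probing for variant-specific keys.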
Solutions:
# Use unified JSON format (consistent structure, recommended)
debtmap --format json --output-format unified
# Legacy format (default, uses {File: {...}} structure)
debtmap --format json --output-format legacy
# Validate JSON output
debtmap --format json | jq .
# Write to file for easier inspection
debtmap --format json --output results.json
See the Configuration/Output Formats chapter for detailed JSON structure documentation.
Context Provider Errors
Problem: Errors with critical_path, dependency, or git_history providers
Solutions:
# Enable specific providers only
debtmap --context --context-providers critical_path,dependency
# Disable problematic provider
debtmap --context --disable-context git_history
# Disable context-aware filtering
debtmap --no-context-aware
# Check context provider details
debtmap --context -vvv
See Context Provider Troubleshooting for details.
Debug Mode
Use verbosity flags to diagnose issues and understand analysis behavior.
Verbosity Levels
# Level 1: Show main score factors
debtmap -v
# Level 2: Show detailed calculations
debtmap -vv
# Level 3: Show all debug information
debtmap -vvv
What each level shows:
- -v: Score breakdowns, main contributing factors
- -vv: Detailed metric calculations, file processing
- -vvv: Full debug output, context provider details
Diagnostic Options
# Show macro parsing warnings (Rust)
debtmap --verbose-macro-warnings
# Show macro expansion statistics (Rust)
debtmap --show-macro-stats
# Disable semantic analysis (fallback mode)
debtmap --semantic-off
# Validate LOC consistency
debtmap --validate-loc
Note: The --explain-score flag is deprecated and hidden. Use -v, -vv, or -vvv for verbosity levels instead to see score breakdowns.
Debugging Score Calculations
# Use verbosity levels to see score breakdown
debtmap -v # Shows score factors
# See how coverage affects scores
debtmap --coverage-file coverage.info -v
# See how context affects scores
debtmap --context --context-providers critical_path -v
Example Debug Session
# Step 1: Run with verbosity to see what's happening
debtmap -vv
# Step 2: Try without semantic analysis
debtmap --semantic-off -v
# Step 3: Check specific file
debtmap path/to/file.rs -vvv
# Step 4: Validate results
debtmap --validate-loc
Performance Tips
Optimize debtmap analysis speed and resource usage.
Parallel Processing
# Use all CPU cores (default)
debtmap --jobs 0
# Limit to 4 threads
debtmap --jobs 4
# Disable parallel processing (debugging)
# Note: --no-parallel is equivalent to --jobs 1 (single-threaded)
debtmap --no-parallel
When to adjust parallelism:
- Use --jobs 0 (default): Maximum performance on dedicated machine
- Use --jobs N: Limit resource usage while other tasks run
- Use --no-parallel: Debugging concurrency issues
Analysis Optimizations
# Fast mode: disable semantic analysis
debtmap --semantic-off
# Plain output: faster terminal rendering
debtmap --plain
# Limit files for testing
debtmap --max-files 100
# Analyze subdirectory only
debtmap src/specific/module
# Reduce output with filters
debtmap --min-priority 4 --top 20
Performance Comparison
| Configuration | Speed | Accuracy |
|---|---|---|
| Default | Fast | High |
| --semantic-off | Fastest | Medium |
| --no-parallel | Slowest | High |
| --jobs 4 | Medium | High |
Monitoring Performance
# Time analysis
time debtmap
# Profile with verbosity
debtmap -vv 2>&1 | grep "processed in"
Environment Variables
Debtmap supports various environment variables for configuring behavior without command-line flags.
Analysis Feature Flags
# Enable context-aware analysis by default
export DEBTMAP_CONTEXT_AWARE=true
# Enable functional analysis by default
export DEBTMAP_FUNCTIONAL_ANALYSIS=true
Automation and CI/CD Variables
# Enable automation-friendly output (used by Prodigy)
export PRODIGY_AUTOMATION=true
# Enable validation mode (stricter checks)
export PRODIGY_VALIDATION=true
Output Customization
# Disable emoji in output
export NO_EMOJI=1
# Force plain text output (no colors)
export NO_COLOR=1
Usage Examples
# Enable context-aware analysis by default
echo 'export DEBTMAP_CONTEXT_AWARE=true' >> ~/.bashrc
# CI/CD environment setup
export NO_EMOJI=1
export NO_COLOR=1
export PRODIGY_AUTOMATION=true
# Run analysis with environment settings
debtmap
# Override environment with flags
DEBTMAP_CONTEXT_AWARE=false debtmap --context # Flag takes precedence
Precedence Rules
When both environment variables and CLI flags are present:
- CLI flags take precedence over environment variables
- Environment variables override config file defaults
- Config file settings override built-in defaults
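The precedence chain amounts to a first-match lookup across four layers. The sketch below models it with plain dictionaries. The function and setting names are hypothetical, and how Debtmap represents settings internally is not shown here.

```python
def resolve_setting(name, cli, env, config, defaults):
    """Return the value from the highest-precedence layer that sets `name`."""
    # Order mirrors the rules above: CLI > environment > config file > defaults.
    for layer in (cli, env, config, defaults):
        value = layer.get(name)
        if value is not None:
            return value
    raise KeyError(name)
```

For example, with context_aware set to False in the environment layer and True in the CLI layer, the lookup returns True, which matches the "flag takes precedence" example earlier in this section.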
Troubleshooting Environment Variables
# Test with specific environment
env DEBTMAP_CONTEXT_AWARE=true debtmap -v
# See all debtmap-related environment variables
env | grep -i debtmap
env | grep -i prodigy
Context Provider Troubleshooting
Diagnose and fix issues with context providers (critical_path, dependency, git_history).
Enable Context Analysis
# Enable with default providers
debtmap --context
# Or use explicit flag
debtmap --enable-context
# Specify specific providers
debtmap --context --context-providers critical_path,dependency,git_history
Disable Specific Providers
# Disable git_history only
debtmap --context --disable-context git_history
# Disable multiple providers
debtmap --context --disable-context git_history,dependency
# Disable context-aware filtering
debtmap --no-context-aware
Git History Provider Issues
Problem: “Git history error” when running analysis
Causes:
- Not in a git repository
- No git history for files
- Git not installed or accessible
Solutions:
# Verify in git repository
git status
# Disable git_history provider
debtmap --context --disable-context git_history
# Initialize git repo if needed
git init
Dependency Provider Issues
Problem: “Dependency error” or incomplete dependency graph
Causes:
- Complex import structures
- Circular dependencies
- Unsupported dependency patterns
Solutions:
# Disable dependency provider
debtmap --context --disable-context dependency
# Try with verbosity to see details
debtmap --context -vvv
# Use without context
debtmap
Critical Path Provider Issues
Problem: Critical path analysis fails or produces unexpected results
Causes:
- Invalid call graph
- Missing function definitions
- Complex control flow
Solutions:
# Disable critical_path provider
debtmap --context --disable-context critical_path
# Try with semantic analysis disabled
debtmap --context --semantic-off
# Debug with verbosity
debtmap --context --context-providers critical_path -vvv
Context Impact on Scoring
Context providers add additional risk factors to scoring:
# See context contribution to scores
debtmap --context -v
# Compare with and without context
debtmap --format json --output baseline.json
debtmap --context --format json --output with_context.json
debtmap compare --before baseline.json --after with_context.json
Performance Impact
Context analysis adds processing overhead:
# Faster: no context
debtmap
# Slower: with all context providers
debtmap --context --context-providers critical_path,dependency,git_history
# Medium: selective providers
debtmap --context --context-providers dependency
Debug Context Providers
# See detailed context provider output
debtmap --context -vvv
# Check which providers are active
debtmap --context -v | grep "context provider"
Advanced Analysis Troubleshooting
Advanced CLI flags for specialized analysis scenarios.
Multi-Pass Analysis
Flag: --multi-pass
Multi-pass analysis performs multiple iterations to refine results.
# Enable multi-pass analysis
debtmap --multi-pass
# Useful for complex projects with intricate dependencies
# May increase analysis time but improve accuracy
When to use:
- Complex dependency graphs
- Large codebases with deep nesting
- When single-pass results seem incomplete
Attribution Output
Flag: --show-attribution
Shows attribution information for detected issues.
# Enable attribution output
debtmap --show-attribution
# Combine with verbosity for details
debtmap --show-attribution -v
Troubleshooting:
- Requires git history provider for author information
- May slow down analysis
- Use --disable-context git_history if it causes errors
Aggregation Methods
Flag: --aggregation-method <method>
Controls how results are aggregated across files.
# Available aggregation methods:
debtmap --aggregation-method weighted_sum # (default)
debtmap --aggregation-method sum
debtmap --aggregation-method logarithmic_sum
debtmap --aggregation-method max_plus_average
Common issues:
- Different methods produce different result structures
- Choose method based on your reporting needs
- Use consistent method for comparison over time
Minimum Problematic Threshold
Flag: --min-problematic <number>
Sets the minimum score for an item to be considered problematic.
# Default threshold
debtmap --min-problematic 3
# More strict (show more issues)
debtmap --min-problematic 1
# Less strict (show only serious issues)
debtmap --min-problematic 5
Relationship to other filters:
- Works alongside --min-priority
- Filters at analysis level vs display level
- Lower values = more issues shown
God Object Detection
Flag: --no-god-object
Disables god object (large class/module) detection.
God Object Types:
Debtmap uses three god object type names, two of which refer to the same analysis:
- god_class: Files with excessive complexity excluding test functions
  - Focuses on production code complexity
  - Ignores test helper functions and test cases
  - Best indicator of production code quality issues
- god_file: Files with excessive complexity including all functions
  - Considers both production and test code
  - Useful for understanding total file complexity
  - Alias: god_module (same as god_file)
- god_module: Alias for god_file
  - Module-level view of complexity
  - Includes all functions regardless of purpose
Responsibility Analysis Metrics (Spec 140):
Modern god object detection includes domain responsibility analysis:
# See detailed god object metrics
debtmap -vv 2>&1 | grep "god_object\|domain"
Additional Metrics:
- domain_count: Number of distinct responsibility domains in file
- domain_diversity: Measure of how varied the responsibilities are (0.0-1.0)
- struct_ratio: Ratio of structs to total file size
- cross_domain_severity: How badly domains are mixed (0.0-1.0)
- module_splits: Suggested number of modules to split into
Configuration:
# In .debtmap.toml
[god_object]
# Thresholds for god object detection
complexity_threshold = 100
loc_threshold = 500
function_count_threshold = 20
# Responsibility analysis thresholds
domain_diversity_threshold = 0.7 # High diversity = mixed responsibilities
cross_domain_threshold = 0.6 # High value = poor separation
Usage:
# Disable god object detection entirely
debtmap --no-god-object
# See god object analysis with responsibility metrics
debtmap -vv
# Check specific file for god object patterns
debtmap path/to/large/file.rs -vv
When to use:
- False positives on framework files
- Intentional large aggregator classes
- Reducing noise in results
- Files that are legitimately large due to generated code
Understanding the Metrics:
# Example output interpretation:
# domain_count = 5 → File handles 5 different concerns
# domain_diversity = 0.8 → Very mixed responsibilities (bad)
# cross_domain_severity = 0.7 → Poor separation of concerns
# module_splits = 3 → Suggest splitting into 3 modules
# High domain_diversity + high cross_domain_severity = strong god object
# Recommended: refactor into separate modules per domain
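The interpretation rule above ("high domain_diversity + high cross_domain_severity = strong god object") can be sketched as a small check. This is not Debtmap's implementation: the class and function names are hypothetical, the default thresholds are taken from the sample [god_object] configuration (0.7 and 0.6), and the use of `>=` at the boundary is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ResponsibilityMetrics:
    """Hypothetical container for the metrics described above."""
    domain_count: int
    domain_diversity: float       # 0.0-1.0
    cross_domain_severity: float  # 0.0-1.0
    module_splits: int


def is_strong_god_object(
    metrics: ResponsibilityMetrics,
    domain_diversity_threshold: float = 0.7,
    cross_domain_threshold: float = 0.6,
) -> bool:
    """True when both responsibility metrics reach their thresholds (assumed inclusive)."""
    return (
        metrics.domain_diversity >= domain_diversity_threshold
        and metrics.cross_domain_severity >= cross_domain_threshold
    )


# The example values from the interpretation above.
example = ResponsibilityMetrics(
    domain_count=5, domain_diversity=0.8, cross_domain_severity=0.7, module_splits=3
)
print(is_strong_god_object(example))
```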
Detail Level Control
Flag: --detail-level <level>
Controls the level of detail in analysis output.
# Available detail levels:
debtmap --detail-level summary # High-level overview only
debtmap --detail-level standard # (default) Balanced detail
debtmap --detail-level comprehensive # Detailed analysis
debtmap --detail-level debug # Full debug information
When to use:
- summary: Quick overview for large codebases
- standard: Default, appropriate for most use cases
- comprehensive: Deep dive into specific issues
- debug: Troubleshooting analysis behavior
Aggregation Control
Flags: --aggregate-only, --no-aggregation
Control file-level score aggregation.
# Show only aggregated file-level scores
debtmap --aggregate-only
# Disable file-level aggregation entirely
debtmap --no-aggregation
# Default: show both individual items and file aggregates
debtmap
Use cases:
- --aggregate-only: Focus on file-level technical debt
- --no-aggregation: See individual functions/classes only
- Default: Full picture with both levels
Call Graph Debugging
Overview: Debug call graph generation and analysis for dependency tracking.
Available Flags:
# Enable call graph debug output
debtmap --debug-call-graph
# Trace specific functions through call graph
debtmap --trace-functions "function_name,another_function"
# Show only call graph statistics (no detailed graph)
debtmap --call-graph-stats-only
# Control debug output format (text or json)
debtmap --debug-call-graph --debug-format text
debtmap --debug-call-graph --debug-format json
# Validate call graph consistency
debtmap --validate-call-graph
Dependency Control Flags:
# Show dependency information in results
debtmap --show-dependencies
# Hide dependency information (default in some contexts)
debtmap --no-dependencies
# Limit number of callers shown per function
debtmap --max-callers 10
# Limit number of callees shown per function
debtmap --max-callees 10
# Include external crate calls in call graph
debtmap --show-external
# Include standard library calls in call graph
debtmap --show-std-lib
Common Issues:
Q: Call graph shows incomplete or missing relationships?
A: Try these debugging steps:
# Enable debug output to see graph construction
debtmap --debug-call-graph -vv
# Validate the call graph consistency
debtmap --validate-call-graph
# Include external dependencies if relevant
debtmap --show-external --show-std-lib
# Trace specific functions to see their relationships
debtmap --trace-functions "my_function" -vv
Q: Call graph output is overwhelming?
A: Use filtering options:
# Show only statistics, not the full graph
debtmap --call-graph-stats-only
# Limit callers and callees shown
debtmap --max-callers 5 --max-callees 5
# Hide dependencies from main output
debtmap --no-dependencies
# Export to JSON for external processing
debtmap --debug-call-graph --debug-format json --output call-graph.json
When to use call graph debugging:
- Investigating missing critical path detection
- Understanding dependency relationships
- Debugging context provider issues
- Analyzing architectural coupling
- Validating function relationship detection
Tiered Prioritization Issues
Overview: Debtmap uses a 4-tier system to classify and sort technical debt items by architectural importance. Tiers affect result ordering but do not multiply scores.
Tier Classification:
- Tier 1 (Critical Architecture): High complexity, low coverage, high dependencies, entry points, or file-level architectural debt
- Tier 2 (Complex Untested): Significant complexity or coverage gaps
- Tier 3 (Testing Gaps): Moderate issues that need attention
- Tier 4 (Maintenance): Low-priority items, routine maintenance
Result Ordering: Results are sorted first by tier (T1 > T2 > T3 > T4), then by score within each tier. This ensures architecturally critical items appear at the top regardless of their absolute score.
Note: Tier weights (1.5×, 1.0×, 0.7×, 0.3×) exist in the configuration but are currently not applied as score multipliers. Tiers control sort order instead.
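The ordering rule described above (tier first, then score within a tier, with no score multiplier) can be sketched as a two-part sort key. The `DebtItem` type and `order_results` function are hypothetical names used only for this illustration.

```python
from typing import List, NamedTuple


class DebtItem(NamedTuple):
    name: str
    tier: int     # 1 = Critical Architecture ... 4 = Maintenance
    score: float


def order_results(items: List[DebtItem]) -> List[DebtItem]:
    """Sort by tier ascending (T1 first), then by score descending within each tier."""
    return sorted(items, key=lambda item: (item.tier, -item.score))


items = [
    DebtItem("parse_config", tier=2, score=9.5),
    DebtItem("main", tier=1, score=3.0),
    DebtItem("dispatch", tier=1, score=7.0),
    DebtItem("format_date", tier=4, score=10.0),
]
for item in order_results(items):
    print(f"T{item.tier} {item.score:>5.1f} {item.name}")
```

Note that the Tier 1 item with a score of 3.0 sorts ahead of the Tier 2 item with a score of 9.5, which is the behavior the paragraph above describes.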
Configuration:
# In .debtmap.toml
[tiers]
# Tier 2 requires EITHER high complexity OR high dependencies
t2_complexity_threshold = 15
t2_dependency_threshold = 10
# Tier 3 requires moderate complexity
t3_complexity_threshold = 8
# Control Tier 4 visibility in main report
show_t4_in_main_report = false
Common Issues:
Q: Why is my item in Tier 3 instead of Tier 2?
A: Check if it meets Tier 2 thresholds:
# See tier classification with verbosity
debtmap -v
# Check current thresholds
cat .debtmap.toml | grep -A 5 "\[tiers\]"
# Lower thresholds to promote more items to Tier 2
# In .debtmap.toml:
# t2_complexity_threshold = 10 (default: 15)
# t2_dependency_threshold = 5 (default: 10)
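To see why lowering the thresholds promotes an item, here is a rough sketch of the Tier 2 / Tier 3 decision based only on the configuration comments above. It is an assumption-laden illustration, not Debtmap's classifier: Tier 1 criteria are not modeled, the comparisons are assumed to be inclusive, and anything below the Tier 3 threshold is simply treated as Tier 4. The function name is hypothetical.

```python
def classify_lower_tier(
    complexity: int,
    dependencies: int,
    t2_complexity_threshold: int = 15,
    t2_dependency_threshold: int = 10,
    t3_complexity_threshold: int = 8,
) -> int:
    """Return 2, 3, or 4 for an item that did not qualify for Tier 1."""
    # Tier 2: EITHER high complexity OR high dependencies.
    if complexity >= t2_complexity_threshold or dependencies >= t2_dependency_threshold:
        return 2
    # Tier 3: moderate complexity.
    if complexity >= t3_complexity_threshold:
        return 3
    # Everything else is treated as Tier 4 in this sketch.
    return 4


# An item with complexity 12 and 6 dependencies under the default thresholds...
print(classify_lower_tier(12, 6))
# ...and under the lowered thresholds suggested above.
print(classify_lower_tier(12, 6, t2_complexity_threshold=10, t2_dependency_threshold=5))
```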
Q: How do I hide Tier 4 items from the main report?
A: Use the show_t4_in_main_report configuration:
# In .debtmap.toml
[tiers]
show_t4_in_main_report = false
Tier 4 items will still appear in detailed output but won’t clutter the main summary.
File-Level Scoring Issues
Overview: Debtmap aggregates function/class scores into file-level scores using configurable aggregation methods.
Note: The exact aggregation formula depends on the selected method (see --aggregation-method flag). File-level scores combine individual item scores with file-level characteristics.
Aggregation Methods:
# Weighted sum (default) - considers complexity weights
debtmap --aggregation-method weighted_sum
# Simple sum - adds all function scores
debtmap --aggregation-method sum
# Logarithmic sum - dampens very high scores
debtmap --aggregation-method logarithmic_sum
# Max plus average - highlights worst function + context
debtmap --aggregation-method max_plus_average
When to use each method:
- weighted_sum: Default, balances individual and collective impact
- sum: When you want raw cumulative debt
- logarithmic_sum: For very large files to prevent score explosion
- max_plus_average: Focus on worst offender while considering overall file health
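The qualitative differences between the methods can be illustrated with toy formulas. These are deliberately simple stand-ins chosen to match the one-line descriptions above; they are not Debtmap's actual formulas, and the `weights` parameter is a hypothetical placeholder for whatever complexity weights the real weighted_sum uses.

```python
import math
from typing import Optional, Sequence


def aggregate(
    scores: Sequence[float],
    method: str = "weighted_sum",
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Toy aggregation of function scores into one file-level number."""
    if not scores:
        return 0.0
    if method == "sum":
        # Raw cumulative debt.
        return float(sum(scores))
    if method == "weighted_sum":
        # Assumed form: each score scaled by a caller-supplied weight (default 1.0).
        if weights is None:
            weights = [1.0] * len(scores)
        if len(weights) != len(scores):
            raise ValueError("weights and scores must have the same length")
        return float(sum(w * s for w, s in zip(weights, scores)))
    if method == "logarithmic_sum":
        # Assumed form: log-dampen each score so large values grow slowly.
        return sum(math.log1p(s) for s in scores)
    if method == "max_plus_average":
        # Worst offender plus the mean as file-wide context.
        return max(scores) + sum(scores) / len(scores)
    raise ValueError(f"unknown aggregation method: {method}")


scores = [10.0, 20.0, 30.0]
for method in ("sum", "weighted_sum", "logarithmic_sum", "max_plus_average"):
    print(method, round(aggregate(scores, method), 2))
```

Whatever the real formulas are, the practical advice above still holds: pick one method and keep it fixed when comparing results over time.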
Aggregation Control Flags:
# Show only aggregated file-level scores
debtmap --aggregate-only
# Disable file-level aggregation entirely
debtmap --no-aggregation
# Default: show both individual items and file aggregates
debtmap
Troubleshooting High File Scores:
Q: Why does this file have such a high score?
A: Check contributing factors with verbosity:
# See file-level score breakdown
debtmap path/to/file.rs -vv
# Look for:
# - High function count (density_factor kicks in at 50+)
# - God object detection (1.5× multiplier)
# - Low coverage (high coverage_factor)
# - Large file size (size_factor)
# - Multiple high-complexity functions
# Disable god object detection if false positive
debtmap --no-god-object path/to/file.rs
Functional Analysis Issues
Overview: Functional analysis detects violations of functional programming principles like impure functions, excessive mutation, and side effects.
Enable Functional Analysis:
# Enable AST-based functional analysis
debtmap --ast-functional-analysis
# Use different strictness profiles
debtmap --ast-functional-analysis --functional-analysis-profile strict
debtmap --ast-functional-analysis --functional-analysis-profile balanced # (default)
debtmap --ast-functional-analysis --functional-analysis-profile lenient
Analysis Profiles:
- strict: Flag most functional violations, enforce pure functions
- balanced: Default, reasonable middle ground for mixed codebases
- lenient: Allow more pragmatic deviations from pure functional style
Common Issues:
Q: Too many false positives for legitimate imperative code?
A: Adjust the profile or disable for specific areas:
# Use lenient profile for pragmatic codebases
debtmap --ast-functional-analysis --functional-analysis-profile lenient
# Disable functional analysis if not using FP style
debtmap # (functional analysis is opt-in via --ast-functional-analysis)
Q: What violations does functional analysis detect?
A: Functional analysis flags:
- Mutation of variables (reassignment)
- Side effects in functions (I/O, global state)
- Impure functions (non-deterministic behavior)
- Excessive mutable state
- Missing const/immutability annotations
# See detailed functional analysis results
debtmap --ast-functional-analysis -vv
# Focus on functional purity issues
debtmap --ast-functional-analysis --filter "functional"
When to use functional analysis:
- Projects following functional programming principles
- Codebases using immutable data structures
- When refactoring to reduce side effects
- For detecting hidden mutation bugs
- In functional-first languages (Rust with functional style)
When to disable:
- Imperative codebases where mutation is expected
- Performance-critical code requiring in-place updates
- When false positives overwhelm actual issues
Pattern Detection Issues
Overview: Pattern detection identifies repetitive code structures, anti-patterns, and common debt patterns.
Control Pattern Detection:
# Disable pattern detection entirely
debtmap --no-pattern-detection
# Specify specific patterns to detect
debtmap --patterns "god_object,long_function,complex_conditional"
# Adjust pattern detection sensitivity
debtmap --pattern-threshold 0.8 # Higher = stricter matching (0.0-1.0)
# Show pattern detection warnings
debtmap --show-pattern-warnings
Common Issues:
Q: Pattern detection causes too many false positives?
A: Adjust threshold or disable specific patterns:
# Increase threshold for stricter matching (fewer false positives)
debtmap --pattern-threshold 0.9
# Disable pattern detection for exploratory analysis
debtmap --no-pattern-detection
# See which patterns are triggering with warnings
debtmap --show-pattern-warnings -v
Q: Missing patterns I expect to see?
A: Lower threshold or check pattern names:
# Lower threshold to catch more patterns
debtmap --pattern-threshold 0.6
# Specify patterns explicitly
debtmap --patterns "god_object,long_function,deep_nesting"
# Use verbosity to see pattern detection process
debtmap --show-pattern-warnings -vv
Detected Patterns:
- god_object: Classes/modules with too many responsibilities
- long_function: Functions exceeding length thresholds
- complex_conditional: Nested or complex branching logic
- deep_nesting: Excessive indentation depth
- parameter_overload: Too many function parameters
- duplicate_code: Repetitive code structures
When to adjust pattern threshold:
- Higher (0.8-1.0): Reduce noise, only flag clear violations
- Lower (0.5-0.7): Catch subtle patterns, more comprehensive detection
- Default (0.7): Balanced detection for most codebases
Public API Detection Issues
Overview: Public API detection identifies functions and types that form your crate’s public interface, affecting scoring and priority.
Control Public API Detection:
# Disable public API detection
debtmap --no-public-api-detection
# Adjust public API detection threshold
debtmap --public-api-threshold 0.5 # Lower = more items marked as public (0.0-1.0)
Common Issues:
Q: Private functions being marked as public API?
A: Increase the threshold for stricter detection:
# Higher threshold = only clearly public items
debtmap --public-api-threshold 0.8
# Disable public API detection if not useful
debtmap --no-public-api-detection
# See what's being detected as public
debtmap -vv 2>&1 | grep "public API"
Q: Public functions not being detected?
A: Lower the threshold or check visibility:
# Lower threshold to detect more public items
debtmap --public-api-threshold 0.3
# Verify function is actually public (pub keyword in Rust)
debtmap path/to/file.rs -vv
How Public API Detection Works:
- Checks for pub visibility in Rust
- Identifies exported functions in Python (__all__)
- Detects exported symbols in JavaScript/TypeScript
- Considers call graph entry points
- Factors in documentation presence
Impact on Scoring:
- Public API items get higher priority scores (1.1× multiplier)
- Entry point detection uses public API information
- Critical path analysis considers public boundaries
When to disable:
- Internal tools or scripts (no public API)
- When API detection causes confusion
- Libraries where everything is intentionally public
Combining Advanced Flags
# Comprehensive analysis with all features
debtmap --multi-pass --attribution --context -vv
# Minimal filtering for exploration
debtmap --min-problematic 1 --min-priority 0 --no-god-object
# Performance-focused advanced analysis
debtmap --multi-pass --jobs 8
# Summary view with aggregated scores
debtmap --detail-level summary --aggregate-only
Error Messages Reference
Understanding common error messages and how to resolve them.
File System Errors
Message: File system error: Permission denied
Meaning: Cannot read file or directory due to permissions
Solutions:
- Check file permissions: ls -la <file>
- Ensure user has read access
Message: File system error: No such file or directory
Meaning: File or directory does not exist
Solutions:
- Verify path is correct
- Check current working directory: pwd
- Use absolute paths if needed
- Ensure files weren’t moved or deleted
Parse Errors
Message: Parse error in file.rs:line:column: unexpected token
Meaning: Syntax debtmap cannot parse
Solutions:
# Try fallback mode
debtmap --semantic-off
# For Rust macros
debtmap --verbose-macro-warnings --show-macro-stats
# Exclude problematic file
# In .debtmap/config.toml:
# exclude = ["path/to/file.rs"]
Analysis Errors
Message: Analysis error: internal analysis failure
Meaning: Internal error during analysis phase
Solutions:
# Try fallback mode
debtmap --semantic-off
# Report with debug info
debtmap -vvv 2>&1 | tee error.log
# Isolate problem file
debtmap --max-files 1 path/to/suspected/file
Configuration Errors
Message: Configuration error: invalid config value
Meaning: Invalid configuration in .debtmap/config.toml or CLI
Solutions:
- Check .debtmap/config.toml syntax
- Validate TOML format: cat .debtmap/config.toml
- Review CLI flag values
- Check for typos in flag names
Validation Errors
Message: Validation error: threshold validation failed
Meaning: Threshold configuration is invalid
Solutions:
- Check threshold values in config
- Ensure --min-priority is in valid range (0-10)
- Verify threshold preset exists
- Use --threshold-preset with valid preset name
Dependency Errors
Message: Dependency error: cannot resolve dependency graph
Meaning: Cannot build dependency relationships
Solutions:
# Disable dependency provider
debtmap --context --disable-context dependency
# Try without context
debtmap
# Debug with verbosity
debtmap -vvv
Concurrency Errors
Message: Concurrency error: parallel processing failure
Meaning: Error during parallel execution
Solutions:
# Disable parallel processing
debtmap --no-parallel
# Reduce thread count
debtmap --jobs 1
# Report issue with debug output
debtmap -vvv 2>&1 | tee error.log
Unsupported Errors
Message: Unsupported: feature not available for <language>
Meaning: Language or construct not supported
Solutions:
- Use supported languages: Rust, Python, JavaScript, TypeScript
- Check if language is enabled in config
- Some advanced features may not be available for all languages
- Try --semantic-off for basic analysis
Pattern Errors
Message: Pattern error: invalid glob pattern
Meaning: Invalid glob pattern in configuration or CLI
Solutions:
- Check glob pattern syntax
- Escape special characters if needed
- Test pattern with shell glob: ls <pattern>
- Use simpler patterns or path prefixes
Language-Specific Issues
Rust
Macro Expansion Issues
# See macro warnings
debtmap --verbose-macro-warnings
# Show macro statistics
debtmap --show-macro-stats
# Common issue: Complex macros may not expand correctly
# Solution: Use --semantic-off for faster fallback
Trait and Generic Complexity
Complex trait bounds and generic constraints may affect analysis accuracy:
# Full semantic analysis (default)
debtmap
# Fallback mode for edge cases
debtmap --semantic-off
Python
Type Inference Limitations
Dynamic typing makes some analysis challenging:
# Best effort type inference (default)
debtmap
# Fallback mode if issues
debtmap --semantic-off
Import Resolution
Complex import structures may not resolve fully:
- Relative imports usually work
- Dynamic imports may not be detected
- __init__.py packages are supported
JavaScript/TypeScript
JSX/TSX Parsing
Ensure files have correct extensions:
- .jsx for JavaScript + JSX
- .tsx for TypeScript + JSX
- Configure extensions in .debtmap/config.toml if needed
Type Resolution
TypeScript type resolution in complex projects:
# Full type checking (default for .ts files)
debtmap
# Fallback if type issues
debtmap --semantic-off
Mixed Language Projects
# Analyze all supported languages (default)
debtmap
# Filter specific languages
# In .debtmap/config.toml:
# languages = ["rust", "python"]
Unsupported Language Constructs
Some advanced language features may show as “Unsupported”:
- Rust: Some macro patterns, const generics edge cases
- Python: Some metaclass patterns, dynamic code generation
- JavaScript: Some advanced AST manipulation
Solutions:
- Use --semantic-off for basic analysis
- Exclude problematic files if needed
- Report unsupported patterns as feature requests
Boilerplate Detection Issues
Overview: Boilerplate detection identifies repetitive code patterns that are necessary but contribute to complexity scores, such as trait implementations, error handling, and validation logic.
How Boilerplate Detection Works:
Debtmap automatically detects common boilerplate patterns:
- Trait implementations: Standard trait method implementations (Debug, Display, From, etc.)
- Error handling: Repetitive error conversion and propagation code
- Validation functions: Similar validation logic across multiple functions
- Macro-generated code: Repetitive patterns from macro expansions
- Builder patterns: Setter methods and builder implementations
Impact on Scoring:
Detected boilerplate receives dampened complexity scores to avoid inflating technical debt for necessary repetitive code.
Common Issues:
Q: Legitimate complex code being marked as boilerplate?
A: Boilerplate detection uses pattern similarity thresholds. If unique logic is being incorrectly dampened:
# See what's being detected as boilerplate
debtmap -vv 2>&1 | grep "boilerplate"
# Check entropy analysis settings (used for boilerplate detection)
# In .debtmap.toml:
# [entropy]
# pattern_threshold = 0.8 # Increase for stricter matching
Q: Boilerplate code still showing high scores?
A: Some boilerplate patterns may not be recognized. Common cases:
# Trait implementations should be automatically detected
# If not dampened, check that code follows standard patterns
# For custom validation patterns, ensure similarity is high enough
# In .debtmap.toml:
# [entropy]
# pattern_threshold = 0.7 # Lower to catch more patterns
# enabled = true
Q: How to identify what debtmap considers boilerplate?
A: Use verbose output:
# See boilerplate detection in action
debtmap -vv 2>&1 | grep -i "boilerplate\|pattern\|entropy"
# Check specific file
debtmap path/to/file.rs -vv
Boilerplate Reduction Strategies:
# In .debtmap.toml
[entropy]
enabled = true # Enable pattern-based dampening
pattern_threshold = 0.7 # Similarity threshold (0.0-1.0)
weight = 0.3 # Impact on complexity adjustment
min_tokens = 50 # Minimum size for pattern analysis
When boilerplate detection helps:
- Codebases with many trait implementations
- Projects with extensive validation logic
- Macro-heavy code (derives, procedural macros)
- Builder pattern implementations
- Error handling boilerplate
When to adjust thresholds:
- Increase pattern_threshold (0.8-0.9): If unique code is being dampened
- Decrease pattern_threshold (0.5-0.6): If obvious boilerplate isn’t being detected
- Disable entropy (enabled = false): If it causes too much false dampening
False Positives
Reduce false positives for validation functions and repetitive code patterns using entropy analysis:
Enable and Configure Entropy Analysis:
# In .debtmap.toml
[entropy]
enabled = true # Enable entropy-based dampening
weight = 0.3 # Weight in complexity adjustment (0.0-1.0)
min_tokens = 50 # Minimum tokens for entropy calculation
pattern_threshold = 0.7 # Pattern similarity threshold (0.0-1.0)
use_classification = true # Enable advanced token classification
entropy_threshold = 0.5 # Entropy level for dampening (0.0-1.0)
branch_threshold = 0.8 # Branch similarity threshold (0.0-1.0)
max_combined_reduction = 0.5 # Max reduction percentage (0.0-1.0)
When to Adjust Parameters:
- Increase pattern_threshold (e.g., 0.8-0.9): Be more strict, reduce dampening for truly unique code
- Decrease entropy_threshold (e.g., 0.3-0.4): Apply dampening more broadly to catch more repetitive patterns
- Increase weight (e.g., 0.4-0.5): Make entropy have stronger impact on final scores
- Increase min_tokens (e.g., 100): Only apply entropy analysis to larger functions
- Increase branch_threshold (e.g., 0.9): Be more strict about branching pattern similarity
Entropy analysis can reduce false positives by up to 70% for validation functions, error handling, and other repetitive patterns.
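Since most [entropy] settings are documented above with a 0.0-1.0 range, a quick sanity check before running an analysis can catch typos such as 7 instead of 0.7. The sketch below works on an already-parsed [entropy] table (a plain dict); the function name is hypothetical, and it checks only the keys whose range is stated in the comments above.

```python
from typing import Any, List, Mapping

# Keys documented above with a (0.0-1.0) range.
UNIT_RANGE_KEYS = (
    "weight",
    "pattern_threshold",
    "entropy_threshold",
    "branch_threshold",
    "max_combined_reduction",
)


def out_of_range_entropy_keys(entropy: Mapping[str, Any]) -> List[str]:
    """Return the names of present keys whose values are not numbers in [0.0, 1.0]."""
    problems = []
    for key in UNIT_RANGE_KEYS:
        if key not in entropy:
            continue  # absent keys fall back to defaults; nothing to check
        value = entropy[key]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(key)
    return problems


# A typo in pattern_threshold is reported; the valid weight is not.
print(out_of_range_entropy_keys({"weight": 0.3, "pattern_threshold": 7}))
```

On Python 3.11 or newer, the dict can be obtained with the standard-library `tomllib` module by loading the configuration file and reading its "entropy" table.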
Other False Positive Reduction Strategies:
# Use context-aware analysis
debtmap --context
# Adjust thresholds
debtmap --threshold-preset lenient
# Disable context-aware filtering if too aggressive
debtmap --no-context-aware
Missing Detections
# Ensure semantic analysis is enabled
debtmap # (default, semantic ON)
# Increase verbosity to see what's detected
debtmap -vv
# Check if files are being analyzed
debtmap -v 2>&1 | grep "Processing"
Output Formatting Issues
Choose Output Format
# Terminal format (default, human-readable)
debtmap
# JSON format
debtmap --format json
# Markdown format
debtmap --format markdown
JSON Format Options
# Legacy format (default): {File: {...}}
debtmap --format json --output-format legacy
# Unified format: consistent structure with 'type' field
debtmap --format json --output-format unified
# Validate JSON
debtmap --format json | jq .
# Write to file
debtmap --format json --output results.json
Plain Output Mode
For environments without color/emoji support:
# ASCII-only, no colors, no emoji
debtmap --plain
# Or set environment variable
export NO_EMOJI=1
debtmap
Terminal Color Issues
Problem: Colors not rendering or showing escape codes
Solutions:
# Use plain mode
debtmap --plain
# Check TERM environment variable
echo $TERM
# Set appropriate TERM
export TERM=xterm-256color
Emoji Issues
Problem: Emojis showing as boxes or ??
Solutions:
# Disable emojis
debtmap --plain
# Or environment variable
export NO_EMOJI=1
debtmap
Markdown Rendering
Ensure viewer supports GitHub-flavored markdown:
- Tables
- Code blocks with syntax highlighting
- Task lists
Write Output to File
# JSON to file
debtmap --format json --output results.json
# Markdown to file
debtmap --format markdown --output report.md
# Terminal format to file (preserves colors)
debtmap --output results.txt
# Plain format to file
debtmap --plain --output results.txt
Summary vs Full Output
# Summary mode (compact)
debtmap --summary
debtmap -s
# Full output (default)
debtmap
# Limit number of items
debtmap --top 10 # Top 10 by priority
debtmap --tail 10 # Bottom 10 by priority
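Conceptually, --top and --tail select from opposite ends of the priority-sorted list. A minimal sketch, assuming items are (name, priority) pairs and that both views keep highest-priority-first order (the function names and the ordering of the tail view are assumptions, not confirmed Debtmap behavior):

```python
from typing import List, Tuple

Item = Tuple[str, float]  # (name, priority)


def top_items(items: List[Item], n: int) -> List[Item]:
    """The n highest-priority items, highest first."""
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    return ranked[:n] if n > 0 else []


def tail_items(items: List[Item], n: int) -> List[Item]:
    """The n lowest-priority items, still listed highest first."""
    ranked = sorted(items, key=lambda item: item[1], reverse=True)
    return ranked[-n:] if n > 0 else []


items = [("render", 1.0), ("parse", 5.0), ("validate", 3.0)]
print(top_items(items, 2))
print(tail_items(items, 1))
```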
Filtering Output
# Minimum priority level
debtmap --min-priority 5
# Category filters
debtmap --filter "complexity,debt"
# Combine filters
debtmap --min-priority 3 --top 20 --filter complexity
Compare Command Issues
The compare command helps track changes in technical debt over time.
Basic Usage
Note: The compare command defaults to JSON output format (unlike analyze which defaults to terminal). Use --format terminal or --format markdown if you need different output.
# Save baseline results
debtmap --format json --output before.json
# Make code changes...
# Save new results
debtmap --format json --output after.json
# Compare results (outputs JSON by default)
debtmap compare --before before.json --after after.json
# Compare with terminal output
debtmap compare --before before.json --after after.json --format terminal
Targeted Comparison
Use --plan and --target-location for focused debt analysis:
# Compare based on implementation plan
debtmap compare --before before.json --after after.json --plan implementation-plan.json
# Compare specific code location
debtmap compare --before before.json --after after.json \
--target-location src/main.rs:calculate_score:42
# Combine both for precise tracking
debtmap compare --before before.json --after after.json \
--plan implementation-plan.json \
--target-location src/analyzers/complexity.rs:analyze_function:128
Use cases:
- --plan: Track debt changes for planned refactoring tasks
- --target-location: Focus on specific function or code location
- Combine for granular technical debt tracking
Incompatible Format Errors
Problem: “Incompatible formats” error when comparing files
Causes:
- Mixing legacy and unified JSON formats
- Files from different debtmap versions
- Corrupted JSON files
Solutions:
# Ensure both files use same output format
debtmap --format json --output-format unified --output before.json
# ... make changes ...
debtmap --format json --output-format unified --output after.json
debtmap compare --before before.json --after after.json
# Validate JSON files are well-formed
jq . before.json > /dev/null
jq . after.json > /dev/null
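If jq is not available, a similar pre-flight check can be sketched in Python: parse both files (which fails loudly on corrupted JSON) and compare their top-level structure. This is only a heuristic for spotting a legacy/unified mix-up; it does not know Debtmap's actual schemas, the function names are hypothetical, and the keys in the demo are invented.

```python
import json
import tempfile
from pathlib import Path


def top_level_shape(path):
    """Parse a JSON file and summarize its top-level structure.

    Raises json.JSONDecodeError if the file is not well-formed JSON.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return ("object", tuple(sorted(data)))
    if isinstance(data, list):
        return ("array", ())
    return (type(data).__name__, ())


def same_top_level_shape(before_path, after_path):
    """True when both files parse and share the same top-level structure."""
    return top_level_shape(before_path) == top_level_shape(after_path)


# Demo with throwaway files; the keys are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    before = Path(tmp) / "before.json"
    after = Path(tmp) / "after.json"
    before.write_text(json.dumps({"items": [], "metadata": {}}), encoding="utf-8")
    after.write_text(json.dumps({"src/main.rs": {}}), encoding="utf-8")
    print(same_top_level_shape(before, after))
```

A mismatch here is a hint to regenerate both files with the same --output-format, as shown above; a match does not guarantee that compare will accept them.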
Comparing Across Branches
# Save baseline on main branch
git checkout main
debtmap --format json --output main.json
# Switch to feature branch
git checkout feature-branch
debtmap --format json --output feature.json
# Compare branches
debtmap compare --before main.json --after feature.json
Missing Files Error
Problem: “File not found” when running compare
Solutions:
- Verify file paths are correct (use absolute paths if needed)
- Ensure JSON files weren’t moved or deleted
- Check current working directory with pwd
Format Mismatch Issues
Problem: Compare shows unexpected differences or errors
Solutions:
# Regenerate both files with same debtmap version
debtmap --format json --output before.json
# ... make changes ...
debtmap --format json --output after.json
# Use same output format for both
debtmap --format json --output-format unified --output before.json
debtmap --format json --output-format unified --output after.json
Validate Command Issues
The validate command checks if a codebase meets specified quality thresholds, useful for CI/CD pipelines.
Basic Validation
# Validate codebase passes default thresholds
debtmap validate /path/to/project
# Exit code 0 if passes, non-zero if validation fails
Debt Density Validation
Flag: --max-debt-density <number>
Sets the maximum acceptable technical debt per 1000 lines of code.
# Set maximum acceptable debt density (per 1000 LOC)
debtmap validate /path/to/project --max-debt-density 10.0
# Stricter threshold for critical projects
debtmap validate /path/to/project --max-debt-density 5.0
# Lenient threshold for legacy code
debtmap validate /path/to/project --max-debt-density 20.0
Troubleshooting validation failures:
# See which files exceed threshold with details
debtmap validate /path/to/project --max-debt-density 10.0 -v
# Get detailed breakdown of debt density calculations
debtmap validate /path/to/project --max-debt-density 10.0 -vv
# Analyze specific files that failed validation
debtmap /path/to/problematic/file.rs -v
# Understand debt density metric
# Debt density = (total_debt_score / total_lines_of_code) × 1000
# Example: 150 debt points across 10,000 LOC = 15.0 debt density
Interpreting debt density values:
- < 5.0: Excellent code quality
- 5.0 - 10.0: Good, manageable technical debt
- 10.0 - 20.0: Moderate debt, consider cleanup
- > 20.0: High debt, refactoring recommended
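The debt density formula and the interpretation bands above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in this guide, not debtmap's implementation. The function names are hypothetical, and the handling of boundary values (a value of exactly 10.0 or 20.0 is placed in the lower band) is an assumption, since the bands as listed overlap at their edges.

```python
def debt_density(total_debt_score: float, total_lines_of_code: int) -> float:
    """Debt points per 1000 lines of code, following the formula in this guide."""
    if total_lines_of_code <= 0:
        raise ValueError("total_lines_of_code must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return total_debt_score * 1000 / total_lines_of_code


def interpret_density(density: float) -> str:
    """Map a density to the bands listed above.

    Assumption: exact boundary values (10.0, 20.0) fall into the lower band.
    """
    if density < 5.0:
        return "excellent"
    if density <= 10.0:
        return "good"
    if density <= 20.0:
        return "moderate"
    return "high"


# Worked example from the guide: 150 debt points across 10,000 LOC.
density = debt_density(150, 10_000)
print(density, interpret_density(density))
```

With the guide's example values this yields a density of 15.0, which falls in the moderate band and would fail a --max-debt-density 10.0 gate.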
CI/CD Integration
# In CI pipeline (fails build if validation fails)
debtmap validate . --max-debt-density 10.0 || exit 1
# With verbose output for debugging
debtmap validate . --max-debt-density 10.0 -v
# Save validation report
debtmap validate . --max-debt-density 10.0 --format json --output validation.json
Use cases:
- Enforce quality gates in CI/CD pipelines
- Prevent accumulation of technical debt over time
- Track debt density trends across releases
- Set different thresholds for different parts of codebase
Validate-Improvement Command Issues
The validate-improvement command verifies that code changes actually reduced technical debt, useful for validating refactoring efforts.
Basic Usage
# Validate that changes improved the codebase
debtmap validate-improvement \
--comparison comparison.json \
--output improvement-report.json
# Set minimum acceptable improvement threshold
debtmap validate-improvement \
--comparison comparison.json \
--threshold 5.0 \
--output improvement-report.json
Command Flags
# Specify comparison file from 'debtmap compare' output
debtmap validate-improvement --comparison comparison.json
# Set output file for validation results
debtmap validate-improvement \
--comparison comparison.json \
--output improvement-report.json
# Use previous validation for trend analysis
debtmap validate-improvement \
--comparison comparison.json \
--previous-validation previous-report.json
# Set minimum improvement threshold (percentage)
debtmap validate-improvement \
--comparison comparison.json \
--threshold 10.0 # Require 10% improvement
# Control output format (json, text, markdown)
debtmap validate-improvement \
--comparison comparison.json \
--format json
# Quiet mode (exit code only, no output)
debtmap validate-improvement \
--comparison comparison.json \
--quiet
Typical Workflow
# Step 1: Save baseline before refactoring
debtmap --format json --output before.json
# Step 2: Make code changes...
# Step 3: Analyze after changes
debtmap --format json --output after.json
# Step 4: Compare results
debtmap compare --before before.json --after after.json \
--format json --output comparison.json
# Step 5: Validate improvement
debtmap validate-improvement \
--comparison comparison.json \
--threshold 5.0 \
--output validation.json
# Exit code 0 if improvement meets threshold, non-zero otherwise
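For teams that drive this workflow from a script rather than a shell, the post-change steps (3 to 5) could be wrapped roughly as follows. This is a sketch under assumptions: the helper names post_change_commands and run_all are hypothetical, the file names are the ones used in the steps above, and only flags already shown in this guide are used. It assumes before.json was saved earlier (step 1) and that debtmap is on the PATH.

```python
import subprocess
from typing import List


def post_change_commands(threshold: float) -> List[List[str]]:
    """Build argv lists for steps 3-5 of the workflow above (hypothetical helper)."""
    return [
        # Step 3: analyze after changes
        ["debtmap", "--format", "json", "--output", "after.json"],
        # Step 4: compare against the saved baseline
        ["debtmap", "compare", "--before", "before.json", "--after", "after.json",
         "--format", "json", "--output", "comparison.json"],
        # Step 5: validate the improvement against the threshold
        ["debtmap", "validate-improvement", "--comparison", "comparison.json",
         "--threshold", str(threshold), "--output", "validation.json"],
    ]


def run_all(commands: List[List[str]]) -> int:
    """Run commands in order; stop and return the first non-zero exit code."""
    for argv in commands:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling run_all(post_change_commands(5.0)) returns 0 only if every step succeeds, which mirrors the exit-code behavior described above.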
Common Issues
Q: Validation fails but I fixed issues - why?
A: Check what the validation is measuring:
# See detailed validation results (without --quiet)
debtmap validate-improvement \
--comparison comparison.json \
--format text
# Common reasons for failure:
# - Added new complexity elsewhere while fixing issues
# - Threshold too strict for the changes made
# - Comparison file doesn't reflect latest changes
# - File-level scores increased despite function improvements
Q: How is improvement calculated?
A: Improvement is measured as percentage reduction in total debt score:
# Formula: improvement = ((before_score - after_score) / before_score) × 100
#
# Example:
# - Before: total score = 100
# - After: total score = 80
# - Improvement: ((100 - 80) / 100) × 100 = 20%
# See detailed breakdown
debtmap validate-improvement \
--comparison comparison.json \
--format text -v
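The improvement formula above is simple enough to check by hand or in a short script. The following Python sketch reproduces the arithmetic only. The function names are hypothetical, and treating the threshold comparison as "greater than or equal" is an assumption, since this guide does not state whether the boundary is inclusive.

```python
def improvement_percent(before_score: float, after_score: float) -> float:
    """Percentage reduction in total debt score, per the formula above."""
    if before_score <= 0:
        raise ValueError("before_score must be positive")
    # Multiply before dividing so round numbers stay exact in floating point.
    return (before_score - after_score) * 100 / before_score


def meets_threshold(before_score: float, after_score: float,
                    threshold_percent: float) -> bool:
    """Assumption: an improvement exactly equal to the threshold passes."""
    return improvement_percent(before_score, after_score) >= threshold_percent


# Example from the guide: 100 -> 80 is a 20% improvement.
print(improvement_percent(100, 80))
# A regression shows up as a negative improvement.
print(improvement_percent(100, 110))
```

A negative result means the total score went up, which is one reason a validation can fail even though individual issues were fixed.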
Q: Can I track improvement over multiple refactorings?
A: Yes, use --previous-validation for trend analysis:
# First validation
debtmap validate-improvement \
--comparison refactor1-comparison.json \
--output validation1.json
# Second validation references first
debtmap validate-improvement \
--comparison refactor2-comparison.json \
--previous-validation validation1.json \
--output validation2.json
# Shows cumulative improvement trend
CI/CD Integration
# In CI pipeline: enforce minimum improvement for refactoring PRs
debtmap validate-improvement \
--comparison comparison.json \
--threshold 5.0 \
--quiet || exit 1
# With output for CI reporting
debtmap validate-improvement \
--comparison comparison.json \
--threshold 5.0 \
--format json \
--output improvement-report.json
# Archive validation reports for tracking
Use cases:
- Verify refactoring PRs actually reduce debt
- Enforce improvement thresholds in code review
- Track debt reduction trends over time
- Validate that tech debt fixes are effective
- Generate improvement metrics for reporting
Troubleshooting Validation Failures
# Check the comparison file is valid
jq . comparison.json
# Verify before/after files were generated correctly
debtmap --format json --output before.json -v
# ... make changes ...
debtmap --format json --output after.json -v
# Lower the threshold if it is too strict for the change
debtmap validate-improvement \
--comparison comparison.json \
--threshold 1.0 # Require only a 1% improvement
# See detailed improvement breakdown
debtmap validate-improvement \
--comparison comparison.json \
--format markdown \
--output improvement.md
FAQ
General Questions
Q: Why is my analysis slow?
A: Check several factors:
# Use all CPU cores
debtmap --jobs 0
# Try faster fallback mode
debtmap --semantic-off
# Check for large files or complex macros
debtmap -vv
Q: What does ‘Parse error’ mean?
A: File contains syntax debtmap cannot parse. Solutions:
- Try --semantic-off for fallback mode
- Use --verbose-macro-warnings for Rust macros
- Exclude problematic files in .debtmap/config.toml
- Report parse errors as potential bugs
Q: Why do scores differ between runs?
A: Several factors affect scores:
- Coverage file changed (use --coverage-file)
- Context providers enabled/disabled (--context)
- Code changes (intended behavior)
- Different threshold settings
Q: How do I reduce noise in results?
A: Use filtering options:
# Increase minimum priority
debtmap --min-priority 5
# Use threshold preset
debtmap --threshold-preset strict
# Filter categories
debtmap --filter "complexity,debt"
# Limit output
debtmap --top 20
Format and Output
Q: What’s the difference between legacy and unified JSON?
A: Two JSON output formats:
- Legacy: {File: {...}} - nested file-based structure
- Unified: Consistent structure with type field for each item
# Legacy (default)
debtmap --format json --output-format legacy
# Unified (recommended for parsing)
debtmap --format json --output-format unified
Q: Can I analyze partial codebases?
A: Yes, several approaches:
# Limit file count
debtmap --max-files 100
# Analyze specific directory
debtmap src/specific/module
# Use filters in config
# .debtmap/config.toml:
# include = ["src/**/*.rs"]
Q: How is the 0-10 priority score calculated?
A: Debtmap uses a multiplicative risk-based scoring formula to compute priority scores:
Core Formula:
Final Score = base_risk × debt_factor × complexity_factor ×
coverage_penalty × coverage_factor
Base Risk Calculation:
complexity_component = (cyclomatic × 0.3 + cognitive × 0.45) / 50.0
coverage_component = (100 - coverage_percentage) / 100.0 × 0.5
base_risk = (complexity_component + coverage_component) × 5.0
Coverage Penalty (tiered based on test coverage):
- < 20% coverage: 3.0× penalty (critical)
- 20-40% coverage: 2.0× penalty (high risk)
- 40-60% coverage: 1.5× penalty (moderate risk)
- 60-80% coverage: 1.2× penalty (low risk)
- ≥ 80% coverage: 0.8× penalty (well tested - reduction)
Coverage Factor (additional reduction for well-tested code):
- ≥ 90% coverage: 0.8 (20% score reduction)
- 70-90% coverage: 0.9 (10% score reduction)
- < 70% coverage: 1.0 (no reduction)
Role-Based Adjustments (Evidence-Based Calculator), applied as an additional multiplier on the score:
- Pure logic: 1.2× (testable, maintainable code)
- Entry points: 1.1× (public API boundaries)
- I/O wrappers: 0.7× (thin delegation layers)
Default Weights:
- Coverage weight: 0.5
- Cyclomatic complexity weight: 0.3
- Cognitive complexity weight: 0.45
- Debt factor weight: 0.2
Example:
- Function: cyclomatic=15, cognitive=20, coverage=10%, role=entry_point
- Complexity component: (15 × 0.3 + 20 × 0.45) / 50 = 0.27
- Coverage component: (100 - 10) / 100 × 0.5 = 0.45
- Base risk: (0.27 + 0.45) × 5.0 = 3.6
- Coverage penalty: 3.0 (< 20% coverage)
- Coverage factor: 1.0 (< 70% coverage)
- Debt factor: ~1.2 (moderate debt patterns)
- Complexity factor: ~1.3 (pattern-adjusted)
- Final score: 3.6 × 1.2 × 1.3 × 3.0 × 1.0 × 1.1 (role) ≈ 18.5, which is clamped to 10.0 on the 0-10 scale
# See score breakdown with verbosity
debtmap -v
# See detailed factor calculations including all multipliers
debtmap -vv
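To make the formula above concrete, here is a minimal Python sketch that reproduces the worked example. It follows the formula as written in this guide and is not debtmap's actual code. The function names are hypothetical, debt_factor and complexity_factor are passed in as plain numbers because this guide only gives approximate values for them (~1.2 and ~1.3), and treating each tier's lower bound as inclusive is an assumption.

```python
def coverage_penalty(coverage_pct: float) -> float:
    """Tiered penalty; assumes each tier's lower bound is inclusive."""
    if coverage_pct < 20:
        return 3.0
    if coverage_pct < 40:
        return 2.0
    if coverage_pct < 60:
        return 1.5
    if coverage_pct < 80:
        return 1.2
    return 0.8


def coverage_factor(coverage_pct: float) -> float:
    """Additional reduction for well-tested code."""
    if coverage_pct >= 90:
        return 0.8
    if coverage_pct >= 70:
        return 0.9
    return 1.0


def base_risk(cyclomatic: int, cognitive: int, coverage_pct: float) -> float:
    complexity_component = (cyclomatic * 0.3 + cognitive * 0.45) / 50.0
    coverage_component = (100 - coverage_pct) / 100.0 * 0.5
    return (complexity_component + coverage_component) * 5.0


def priority_score(cyclomatic, cognitive, coverage_pct,
                   debt_factor, complexity_factor, role_multiplier=1.0):
    """Return (raw, clamped) scores; the clamp to 10.0 follows the example above."""
    raw = (base_risk(cyclomatic, cognitive, coverage_pct)
           * debt_factor * complexity_factor
           * coverage_penalty(coverage_pct) * coverage_factor(coverage_pct)
           * role_multiplier)
    return raw, min(raw, 10.0)


# Worked example: cyclomatic=15, cognitive=20, coverage=10%, entry point (1.1x).
raw, clamped = priority_score(15, 20, 10, debt_factor=1.2,
                              complexity_factor=1.3, role_multiplier=1.1)
print(f"raw={raw:.2f} clamped={clamped}")
```

Because the multipliers compound, an untested function of moderate complexity already exceeds the top of the scale before clamping, which is why such functions tend to cluster at 10.0.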
Coverage and Testing
Q: How does coverage affect scores?
A: Coverage affects scores through two multiplicative factors in the risk calculation, plus a weighted term in the base risk:
1. Coverage Penalty (tiered multiplier based on test coverage):
- < 20% coverage: 3.0× penalty (untested code gets highest priority)
- 20-40% coverage: 2.0× penalty
- 40-60% coverage: 1.5× penalty
- 60-80% coverage: 1.2× penalty
- ≥ 80% coverage: 0.8× reduction (well-tested code deprioritized)
2. Coverage Factor (additional reduction for well-tested code):
- ≥ 90% coverage: 0.8 (20% score reduction)
- 70-90% coverage: 0.9 (10% score reduction)
- < 70% coverage: 1.0 (no additional reduction)
3. Base Risk Component (coverage weight: 0.5):
- coverage_component = (100 - coverage_percentage) / 100.0 × 0.5
- Integrated into base risk calculation
Combined Effect: Untested complex code (0% coverage) receives the maximum 3.0× coverage penalty, while well-tested code (≥90% coverage) receives both the 0.8× coverage penalty and the 0.8× coverage factor, for a combined 0.64× multiplier (a 36% reduction). This ensures untested code rises to the top of the priority list.
# Use coverage file
debtmap --coverage-file coverage.info
# See coverage impact on scoring
debtmap --coverage-file coverage.info -v
# See detailed coverage penalty and factor breakdown
debtmap --coverage-file coverage.info -vv
See the FAQ entry “How is the 0-10 priority score calculated?” for complete scoring formula details.
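The combined effect of the two coverage multipliers can be tabulated with a short Python sketch. It uses only the tiers listed in this answer. The names are hypothetical, and treating each tier's lower bound as inclusive (for example, exactly 80% counts as "≥ 80%") is an assumption.

```python
# (exclusive upper bound in percent, multiplier), taken from the tiers above
PENALTY_TIERS = [(20, 3.0), (40, 2.0), (60, 1.5), (80, 1.2)]


def coverage_penalty(coverage_pct: float) -> float:
    for upper, multiplier in PENALTY_TIERS:
        if coverage_pct < upper:
            return multiplier
    return 0.8  # >= 80% coverage


def coverage_factor(coverage_pct: float) -> float:
    if coverage_pct >= 90:
        return 0.8
    if coverage_pct >= 70:
        return 0.9
    return 1.0


def combined_coverage_multiplier(coverage_pct: float) -> float:
    """Product of both multipliers, rounded to 2 decimals for display."""
    return round(coverage_penalty(coverage_pct) * coverage_factor(coverage_pct), 2)


for pct in (0, 30, 50, 65, 75, 85, 95):
    print(f"{pct:>3}% coverage -> {combined_coverage_multiplier(pct)}x")
```

Under these tiers the combined multiplier only drops below 1.0 once coverage reaches 80%. Between 70% and 80% the 0.9 factor partly offsets the 1.2 penalty, leaving a net 1.08×.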
Q: What’s the difference between measured and estimated metrics?
A: Debtmap provides both directly measured metrics and formula-based estimates:
Measured Metrics (from AST analysis):
- cyclomatic_complexity: Actual count of decision points in code
- cognitive_complexity: Weighted measure of code understandability
- nesting_depth: Maximum level of nested blocks
- loc (lines of code): Actual line count
- parameters: Number of function parameters
- return_points: Number of return statements
Estimated Metrics (formula-based):
- est_branches: Estimated branch count for testing effort
  - Formula: max(nesting_depth, 1) × cyclomatic_complexity ÷ 3
  - Not an actual count of branches in the AST
  - Represents estimated testing complexity/effort
  - Useful for understanding test coverage needs
# See all metrics including estimates
debtmap -vv
# Example output:
# cyclomatic_complexity: 15 (measured from AST)
# cognitive_complexity: 20 (measured from AST)
# nesting_depth: 4 (measured from AST)
# est_branches: 20 (estimated: max(4,1) × 15 ÷ 3 = 20)
When to trust estimated metrics:
- Comparing relative complexity between functions
- Estimating testing effort
- Understanding potential branching scenarios
When to rely on measured metrics:
- Precise complexity analysis
- Setting hard thresholds
- Exact cyclomatic/cognitive complexity values
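The est_branches estimate described above is a one-line calculation. The Python sketch below reproduces it for illustration. The function name is hypothetical, and the result is returned unrounded because this guide does not say how debtmap rounds the value.

```python
def est_branches(nesting_depth: int, cyclomatic_complexity: int) -> float:
    """Estimated branch count: max(nesting_depth, 1) x cyclomatic_complexity / 3.

    Assumption: no rounding is applied; the guide does not specify any.
    """
    return max(nesting_depth, 1) * cyclomatic_complexity / 3


# Example from the guide: nesting_depth=4, cyclomatic_complexity=15.
print(est_branches(4, 15))
# A flat function (nesting_depth=0) is treated as depth 1.
print(est_branches(0, 6))
```

The max(nesting_depth, 1) term keeps the estimate from collapsing to zero for functions without nested blocks.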
Context and Analysis
Q: What are context providers?
A: Additional analysis for prioritization:
- critical_path: Call graph analysis, entry point distance
- dependency: Dependency relationships and coupling
- git_history: Change frequency and authorship
# Enable all
debtmap --context
# Specific providers
debtmap --context --context-providers critical_path,dependency
# See context impact
debtmap --context -v
Results and Comparison
Q: Why no output?
A: Check verbosity and filtering:
# Increase verbosity
debtmap -v
# Lower priority threshold
debtmap --min-priority 0
# Check if files were analyzed
debtmap -vv 2>&1 | grep "Processed"
# Ensure not using strict threshold
debtmap --threshold-preset lenient
Q: How to compare results over time?
A: Use the compare command:
# Save baseline
debtmap --format json --output before.json
# Make changes...
# Analyze again
debtmap --format json --output after.json
# Compare
debtmap compare --before before.json --after after.json
Q: Why does compare fail with ‘incompatible formats’?
A: The JSON files must use the same output format:
# Use unified format for both
debtmap --format json --output-format unified --output before.json
# ... make changes ...
debtmap --format json --output-format unified --output after.json
debtmap compare --before before.json --after after.json
# Or use legacy format for both (but unified is recommended)
debtmap --format json --output-format legacy --output before.json
debtmap --format json --output-format legacy --output after.json
Q: How do I compare results from different branches?
A: Generate JSON output on each branch and compare:
# On main branch
git checkout main
debtmap --format json --output main.json
# On feature branch
git checkout feature-branch
debtmap --format json --output feature.json
# Compare (from either branch)
debtmap compare --before main.json --after feature.json
Q: Can I compare legacy and unified JSON formats?
A: No, both files must use the same format. Regenerate with matching formats:
# Regenerate both files in unified format
debtmap --format json --output-format unified --output before.json
debtmap --format json --output-format unified --output after.json
debtmap compare --before before.json --after after.json
Performance and Optimization
Q: How many threads should I use?
A: Depends on your machine:
# Use all cores (default, recommended)
debtmap --jobs 0
# Limit to 4 threads (if other work running)
debtmap --jobs 4
# Single threaded (debugging only)
debtmap --no-parallel
When to File Bug Reports
File a bug report when:
✅ These are bugs:
- Parse errors on valid syntax
- Crashes or panics
- Incorrect complexity calculations
- Concurrency errors
- Incorrect error messages
❌ These are not bugs:
- Unsupported language constructs (file feature request)
- Disagreement with complexity scores (subjective)
- Performance on very large codebases (optimization request)
- Missing documentation (docs issue, not code bug)
How to Report Issues
- Reproduce with minimal example
- Include debug output: debtmap -vvv 2>&1 | tee error.log
- Include version: debtmap --version
- Include platform: OS, Rust version if relevant
- Include configuration: .debtmap/config.toml if used
- Expected vs actual behavior
Before Filing
- Check this troubleshooting guide
- Try --semantic-off fallback mode
- Update to latest version
- Search existing issues on GitHub
Related Documentation
- Configuration Guide: Configure debtmap behavior
- CLI Reference: Complete CLI flag documentation
- Analysis Guide: Understanding analysis results
- Examples: Practical usage examples
- API Documentation: Rust API documentation
Troubleshooting Checklist
When debugging issues, work through this checklist:
- Run with -vv to see detailed output
- Try --semantic-off to use fallback mode
- Check file permissions and paths
- Verify configuration in .debtmap/config.toml
- Test with --max-files 10 to isolate issues
- Try --no-parallel to rule out concurrency
- Check debtmap --version for updates
- Review error messages in this guide
- Search GitHub issues for similar problems
- Create minimal reproduction case
- File bug report with debug output