Interpreting Results

Understanding Output Formats

Debtmap provides three output formats:

Terminal (default): Human-readable with colors and tables

debtmap analyze .

JSON: Machine-readable for CI/CD integration

debtmap analyze . --format json --output report.json

Markdown: Documentation-friendly

debtmap analyze . --format markdown --output report.md

JSON Structure

Debtmap uses a unified JSON format (spec 108) that provides consistent structure for all debt items. The output is generated by converting the internal UnifiedAnalysis to UnifiedOutput format (src/output/unified.rs).

Top-level JSON structure:

{
  "format_version": "1.0",
  "metadata": {
    "debtmap_version": "0.2.0",
    "generated_at": "2025-12-04T12:00:00Z",
    "project_root": "/path/to/project",
    "analysis_type": "full"
  },
  "summary": {
    "total_items": 45,
    "total_debt_score": 2847.3,
    "debt_density": 113.89,
    "total_loc": 25000,
    "by_type": {
      "File": 3,
      "Function": 42
    },
    "by_category": {
      "Architecture": 5,
      "Testing": 23,
      "Performance": 7,
      "CodeQuality": 10
    },
    "score_distribution": {
      "critical": 2,
      "high": 8,
      "medium": 18,
      "low": 17
    }
  },
  "items": [ /* array of UnifiedDebtItemOutput */ ]
}

Individual debt item structure (function-level):

{
  "type": "Function",
  "score": 87.3,
  "category": "Testing",
  "priority": "high",
  "location": {
    "file": "src/main.rs",
    "line": 42,
    "function": "process_data"
  },
  "metrics": {
    "cyclomatic_complexity": 15,
    "cognitive_complexity": 22,
    "length": 68,
    "nesting_depth": 4,
    "coverage": 0.0,
    "uncovered_lines": [42, 43, 44, 45, 46],
    "entropy_score": 0.65
  },
  "debt_type": {
    "ComplexityHotspot": {
      "cyclomatic": 15,
      "cognitive": 22,
      "adjusted_cyclomatic": null
    }
  },
  "function_role": "BusinessLogic",
  "purity_analysis": {
    "is_pure": false,
    "confidence": 0.8,
    "side_effects": ["file_io", "network"]
  },
  "dependencies": {
    "upstream_count": 2,
    "downstream_count": 3,
    "upstream_callers": ["main", "process_request"],
    "downstream_callees": ["validate", "save", "notify"]
  },
  "recommendation": {
    "action": "Add test coverage for complex function",
    "priority": "High",
    "implementation_steps": [
      "Write unit tests covering all 15 branches",
      "Consider extracting validation logic to reduce complexity"
    ]
  },
  "impact": {
    "coverage_improvement": 0.85,
    "complexity_reduction": 12.0,
    "risk_reduction": 3.7
  },
  "adjusted_complexity": {
    "dampened_cyclomatic": 12.5,
    "dampening_factor": 0.83
  },
  "complexity_pattern": "validation",
  "pattern_type": "state_machine",
  "pattern_confidence": 0.72
}

Source: JSON structure defined in src/output/unified.rs:18-24 (UnifiedOutput), lines 66-71 (UnifiedDebtItemOutput enum), lines 156-183 (FunctionDebtItemOutput)

Note: The "type" field uses a tagged enum format where the value is either "Function" or "File". All items have consistent top-level fields (score, category, priority, location) regardless of type, simplifying filtering and sorting across mixed results.
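
Because every item carries the same top-level fields, a single jq filter works across mixed results. A sketch using only the documented fields:

# File-level items only
debtmap analyze . --format json | jq '.items[] | select(.type == "File")'

# Top 5 items by score, regardless of type
debtmap analyze . --format json | jq '.items | sort_by(.score) | reverse | .[0:5]'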

Reading Function Metrics

Key fields in metrics object:

  • cyclomatic_complexity: Decision points - guides test case count (src/output/unified.rs:195)
  • cognitive_complexity: Understanding difficulty - guides refactoring priority
  • length: Lines of code - signals SRP violations
  • nesting_depth: Indentation depth - signals need for extraction
  • coverage: Test coverage percentage (0.0-1.0, optional if no coverage data)
  • uncovered_lines: Array of line numbers not covered by tests (optional)
  • entropy_score: Pattern analysis score for false positive reduction (0.0-1.0, optional)
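
These fields combine naturally in ad-hoc queries. For example, a sketch that flags untested functions above the common cyclomatic threshold of 10, using only the documented metrics fields:

# Untested functions with high cyclomatic complexity
debtmap analyze . --format json | \
  jq '.items[] | select(.metrics.coverage == 0 and .metrics.cyclomatic_complexity > 10) | .location'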

Fields at item level:

  • function_role: Function importance ("EntryPoint", "BusinessLogic", "Utility", "TestHelper") - affects score multiplier (src/priority/mod.rs:231)
  • purity_analysis: Whether function has side effects (optional) - affects testability assessment
    • is_pure: Boolean indicating no side effects
    • confidence: How certain (0.0-1.0)
    • side_effects: Array of detected side effect types (e.g., "file_io", "network", "mutation")
  • dependencies: Call graph relationships
    • upstream_count: Number of functions that call this one (impact radius)
    • downstream_count: Number of functions this calls (complexity)
    • upstream_callers: Names of calling functions (optional, included if count > 0)
    • downstream_callees: Names of called functions (optional, included if count > 0)
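
Because upstream_count approximates impact radius, sorting by it surfaces the functions whose defects propagate furthest. A sketch against the documented dependencies object:

# Function items with the widest impact radius first
debtmap analyze . --format json | \
  jq '[.items[] | select(.type == "Function")] | sort_by(.dependencies.upstream_count) | reverse | .[0:10]'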

Complexity adjustment fields:

  • adjusted_complexity: Entropy-based dampening applied (optional, included when dampening factor < 1.0)
    • dampened_cyclomatic: Reduced complexity after pattern analysis
    • dampening_factor: Multiplier applied (e.g., 0.83 = 17% reduction)
  • complexity_pattern: Detected pattern name (e.g., "validation", "dispatch", "error_handling")
  • pattern_type: Pattern category (e.g., "state_machine", "coordinator") - high-level classification
  • pattern_confidence: Confidence in pattern detection (0.0-1.0, shown if ≥ 0.5)

Pattern detection and complexity adjustment:

Debtmap detects common code patterns that appear complex but are actually maintainable (src/complexity/entropy_analysis.rs). When detected with high confidence (≥ 0.5), complexity metrics are dampened:

Validation patterns - Repetitive input checking:

// Appears complex (cyclomatic = 15) but is repetitive
if field1.is_empty() { return Err(...) }
if field2.is_empty() { return Err(...) }
// ... repeated for many fields

Dampening: ~20-40% reduction if similarity is high

State machine patterns - Structured state transitions:

match state {
    State::Init => { /* transition logic */ }
    State::Running => { /* transition logic */ }
    // ... many similar cases
}

Dampening: ~30-50% reduction for consistent transition logic

Error handling patterns - Systematic error checking:

let x = operation1().map_err(|e| /* wrap error */)?;
let y = operation2().map_err(|e| /* wrap error */)?;
// ... repeated error wrapping

Dampening: ~15-30% reduction for consistent error propagation

When to trust adjusted_complexity:

  • pattern_confidence ≥ 0.7: High confidence, use dampened_cyclomatic for priority decisions
  • pattern_confidence 0.5-0.7: Moderate confidence, consider both original and dampened values
  • pattern_confidence < 0.5 or missing: Use original cyclomatic_complexity
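
This decision rule is mechanical enough to script. A sketch, assuming the optional fields are null when absent:

# Use the dampened value only when pattern confidence is high
debtmap analyze . --format json | jq '.items[] |
  {location, effective_complexity: (if (.pattern_confidence // 0) >= 0.7
    then .adjusted_complexity.dampened_cyclomatic
    else .metrics.cyclomatic_complexity end)}'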

Entropy score interpretation:

  • entropy_score ≥ 0.7: High entropy, genuinely complex code - prioritize refactoring
  • entropy_score 0.4-0.7: Moderate entropy, some repetition - review manually
  • entropy_score < 0.4: Low entropy, highly repetitive pattern - likely false positive if flagged complex
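
To surface likely false positives for manual review, one possible query combining these thresholds (treating a missing entropy_score as inconclusive):

# High reported complexity but highly repetitive structure
debtmap analyze . --format json | \
  jq '.items[] | select(.metrics.cyclomatic_complexity > 10 and (.metrics.entropy_score // 1) < 0.4)'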

Prioritizing Work

Debtmap provides multiple prioritization strategies, with unified scoring (0-10 scale) as the recommended default for most workflows:

1. By Unified Score (default - recommended)

debtmap analyze . --top 10

Shows top 10 items by combined complexity, coverage, and dependency factors, weighted and adjusted by function role.

Why use unified scoring:

  • Balances complexity (40%), coverage (40%), and dependency impact (20%)
  • Adjusts for function importance (entry points prioritized over utilities)
  • Normalized 0-10 scale is intuitive and consistent
  • Reduces false positives through coverage propagation
  • Best for sprint planning and function-level refactoring decisions
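
A hedged sketch of how these weights compose, using hypothetical normalized factors (the exact normalization and role multipliers are internal to debtmap):

unified_score = 10 × (0.4 × complexity_factor
                    + 0.4 × coverage_factor
                    + 0.2 × dependency_factor) × role_multiplier

Example with hypothetical factors:
  complexity_factor = 0.8, coverage_factor = 1.0, dependency_factor = 0.5
  base = 0.4 × 0.8 + 0.4 × 1.0 + 0.2 × 0.5 = 0.82
  unified_score ≈ 10 × 0.82 × 1.0 = 8.2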

Example:

# Show top 20 critical items
debtmap analyze . --min-priority 7.0 --top 20

# Focus on high-impact functions (score >= 7.0)
debtmap analyze . --format json | jq '.items[] | select(.unified_score >= 7.0)'

2. By Risk Category (legacy compatibility)

debtmap analyze . --min-priority high

Shows only HIGH and CRITICAL priority items using legacy risk scoring.

Note: Legacy risk scoring uses additive formulas and unbounded scales. Prefer unified scoring for new workflows.

3. By Debt Type

debtmap analyze . --filter Architecture,Testing

Focuses on specific categories:

  • Architecture: God objects, complexity, dead code
  • Testing: Coverage gaps, test quality
  • Performance: Resource leaks, inefficiencies
  • CodeQuality: Code smells, maintainability

4. By ROI (with coverage)

debtmap analyze . --lcov coverage.lcov --top 20

Prioritizes by return on investment for testing/refactoring. Combines unified scoring with test effort estimates to identify high-value work.

Choosing the right strategy:

  • Sprint planning for developers: Use unified scoring (--top N)
  • Architectural review: Use tiered prioritization (--summary)
  • Category-focused work: Use debt type filtering (--filter)
  • Testing priorities: Use ROI analysis with coverage data (--lcov)
  • Historical comparisons: Use legacy risk scoring (for consistency with old reports)

Tiered Prioritization

Note: Tiered prioritization uses traditional debt scoring (additive, higher = worse) and is complementary to the unified scoring system (0-10 scale). Both systems can be used together:

  • Unified scoring (0-10 scale): Best for function-level prioritization and sprint planning
  • Tiered prioritization (debt tiers): Best for architectural focus and strategic debt planning

Use --summary for tiered view focusing on architectural issues, or default output for function-level unified scores.

Debtmap uses a tier-based system to map debt scores to actionable priority levels. Each tier includes effort estimates and strategic guidance for efficient debt remediation.

Tier Levels

The Tier enum defines four priority levels based on score thresholds:

pub enum Tier {
    Critical,  // Score ≥ 90
    High,      // Score 70-89.9
    Moderate,  // Score 50-69.9
    Low,       // Score < 50
}

Score-to-Tier Mapping:

  • Critical (≥ 90): Immediate action required - blocks progress
  • High (70-89.9): Should be addressed this sprint
  • Moderate (50-69.9): Plan for next sprint
  • Low (< 50): Background maintenance work
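
A minimal sketch of that mapping as a standalone function (thresholds come from the enum above; the function name is illustrative, not debtmap's API):

fn tier_for(score: f64) -> Tier {
    match score {
        s if s >= 90.0 => Tier::Critical,
        s if s >= 70.0 => Tier::High,
        s if s >= 50.0 => Tier::Moderate,
        _ => Tier::Low,
    }
}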

Effort Estimates Per Tier

Each tier includes estimated effort based on typical remediation patterns:

Tier     | Estimated Effort | Typical Work
Critical | 1-2 days         | Major refactoring, comprehensive testing, architectural changes
High     | 2-4 hours        | Extract functions, add test coverage, fix resource leaks
Moderate | 1-2 hours        | Simplify logic, reduce duplication, improve error handling
Low      | 30 minutes       | Address TODOs, minor cleanup, documentation

Effort calculation considers:

  • Complexity metrics (cyclomatic, cognitive)
  • Test coverage gaps
  • Number of dependencies (upstream/downstream)
  • Debt category (Architecture debt takes longer than CodeQuality)

Tiered Display Grouping

Note: TieredDisplay is an internal structure (src/priority/mod.rs:515) used for terminal output formatting and is not serialized to JSON. JSON output includes individual items with their priority field (critical, high, medium, low) based on score thresholds (src/output/unified.rs:74-81).

The internal TieredDisplay structure groups similar debt items for batch action recommendations in terminal output:

Grouping strategy:

  • Groups items by tier (Critical/High/Moderate/Low) and similarity pattern
  • Prevents grouping of god objects (always show individually)
  • Prevents grouping of Critical items (each needs individual attention)
  • Suggests batch actions for similar Low/Moderate items in terminal output

To view tiered grouping in JSON, use priority filtering:

# Get all Critical items (priority: "critical")
debtmap analyze . --format json | jq '.items[] | select(.priority == "critical")'

# Get High priority items
debtmap analyze . --format json | jq '.items[] | select(.priority == "high")'

# Count items by priority
debtmap analyze . --format json | jq '.summary.score_distribution'

The score_distribution in the summary provides counts for each priority tier.

Using Tiered Prioritization

1. Start with Critical tier:

debtmap analyze . --min-priority critical

Focus on items with score ≥ 90. These typically represent:

  • Complex functions with 0% coverage
  • God objects blocking feature development
  • Critical resource leaks or security issues

2. Plan High tier work:

debtmap analyze . --min-priority high --format json > sprint-plan.json

Schedule 2-4 hours per item for this sprint. Look for:

  • Functions approaching complexity thresholds
  • Moderate coverage gaps on important code paths
  • Performance bottlenecks with clear solutions

3. Batch Moderate tier items:

debtmap analyze . --min-priority moderate

Review batch recommendations. Examples:

  • “10 validation functions detected - extract common pattern”
  • “5 similar test files with duplication - create shared fixtures”
  • “8 functions with magic values - create constants module”

4. Schedule Low tier background work: Address during slack time or as warm-up tasks for new contributors.

Strategic Guidance by Tier

Critical Tier Strategy:

  • Block new features until addressed
  • Pair programming recommended for complex items
  • Architectural review before major refactoring
  • Comprehensive testing after changes

High Tier Strategy:

  • Sprint planning priority
  • Impact analysis before changes
  • Code review from senior developers
  • Integration testing after changes

Moderate Tier Strategy:

  • Batch similar items for efficiency
  • Extract patterns across multiple files
  • Incremental improvement over multiple PRs
  • Regression testing for affected areas

Low Tier Strategy:

  • Good first issues for new contributors
  • Documentation improvements
  • Code cleanup during refactoring nearby code
  • Technical debt gardening sessions

Categorized Debt Analysis

Note: CategorizedDebt is an internal analysis structure (src/priority/mod.rs:419) used for query operations and markdown formatting. It is not serialized to JSON output. The JSON output uses by_category in the summary section instead (see JSON Structure above).

Debtmap’s internal CategorizedDebt analysis groups debt items by category and identifies cross-category dependencies. This analysis powers the markdown output and internal query methods but is not directly exposed in JSON format.

CategorySummary

Each category gets a summary with metrics for planning:

pub struct CategorySummary {
    pub category: DebtCategory,
    pub total_score: f64,
    pub item_count: usize,
    pub estimated_effort_hours: f64,
    pub average_severity: f64,
    pub top_items: Vec<DebtItem>,  // Up to 5 highest priority
}

Effort estimation formulas:

  • Architecture debt: complexity_score / 10 × 2 hours (structural changes take longer)
  • Testing debt: complexity_score / 10 × 1.5 hours (writing tests)
  • Performance debt: complexity_score / 10 × 1.8 hours (profiling + optimization)
  • CodeQuality debt: complexity_score / 10 × 1.2 hours (refactoring)
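
Expressed as code, the formulas above might look like this sketch (the multipliers come from the list; the function name is hypothetical, and it assumes DebtCategory has exactly the four variants described):

fn estimated_effort_hours(category: &DebtCategory, complexity_score: f64) -> f64 {
    let multiplier = match category {
        DebtCategory::Architecture => 2.0, // structural changes take longer
        DebtCategory::Testing => 1.5,      // writing tests
        DebtCategory::Performance => 1.8,  // profiling + optimization
        DebtCategory::CodeQuality => 1.2,  // refactoring
    };
    complexity_score / 10.0 * multiplier
}

This matches the example below: an Architecture category score of 487.5 yields 487.5 / 10 × 2 = 97.5 hours.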

Example category summary:

{
  "category": "Architecture",
  "total_score": 487.5,
  "item_count": 15,
  "estimated_effort_hours": 97.5,
  "average_severity": 32.5,
  "top_items": [
    {
      "debt_type": "GodObject",
      "file": "src/services/user_service.rs",
      "score": 95.0,
      "estimated_effort_hours": 16.0
    },
    {
      "debt_type": "ComplexityHotspot",
      "file": "src/payments/processor.rs",
      "score": 87.3,
      "estimated_effort_hours": 14.0
    }
  ]
}

Cross-Category Dependencies

CrossCategoryDependency identifies blocking relationships between different debt categories:

pub struct CrossCategoryDependency {
    pub from_category: DebtCategory,
    pub to_category: DebtCategory,
    pub blocking_items: Vec<(DebtItem, DebtItem)>,
    pub impact_level: ImpactLevel,  // Critical, High, Medium, Low
    pub recommendation: String,
}

Common dependency patterns:

1. Architecture blocks Testing:

  • Pattern: God objects are too complex to test effectively
  • Example: UserService has 50+ functions, making comprehensive testing impractical
  • Impact: Critical - cannot improve test coverage without refactoring
  • Recommendation: “Split god object into 4-5 focused modules before adding tests”

2. Async issues require Architecture changes:

  • Pattern: Blocking I/O in async contexts requires architectural redesign
  • Example: Sync database calls in async handlers
  • Impact: High - performance problems require design changes
  • Recommendation: “Introduce async database layer before optimizing handlers”

3. Complexity affects Testability:

  • Pattern: High cyclomatic complexity makes thorough testing difficult
  • Example: Function with 22 branches needs 22+ test cases
  • Impact: High - testing effort grows exponentially with complexity
  • Recommendation: “Reduce complexity to < 10 before writing comprehensive tests”

4. Performance requires Architecture:

  • Pattern: O(n²) nested loops need different data structures
  • Example: Linear search in loops should use HashMap
  • Impact: Medium - optimization requires structural changes
  • Recommendation: “Refactor data structure before micro-optimizations”

Example cross-category dependency:

{
  "from_category": "Architecture",
  "to_category": "Testing",
  "impact_level": "Critical",
  "blocking_items": [
    {
      "blocker": {
        "debt_type": "GodObject",
        "file": "src/services/user_service.rs",
        "functions": 52,
        "score": 95.0
      },
      "blocked": {
        "debt_type": "TestingGap",
        "file": "src/services/user_service.rs",
        "coverage": 15,
        "score": 78.0
      }
    }
  ],
  "recommendation": "Split UserService into focused modules (auth, profile, settings, notifications) before attempting to improve test coverage. Current structure makes comprehensive testing impractical.",
  "estimated_unblock_effort_hours": 16.0
}

Using Category-Based Analysis

View category distribution in JSON:

debtmap analyze . --format json | jq '.summary.by_category'

This shows item counts per category (Architecture, Testing, Performance, CodeQuality).

Filter items by category:

debtmap analyze . --format json | jq '.items[] | select(.category == "Architecture")'

Focus on specific category with CLI:

debtmap analyze . --filter Architecture --top 10

Generate markdown for detailed category analysis:

debtmap analyze . --format markdown --output report.md

The markdown format includes full CategorySummary details with effort estimates and cross-category dependency analysis.

Strategic planning workflow:

  1. Review category summaries:

    • Identify which category has highest total score
    • Check estimated effort hours per category
    • Note average severity to gauge urgency
  2. Check cross-category dependencies:

    • Find Critical and High impact blockers
    • Prioritize blockers before blocked items
    • Plan architectural changes before optimization
  3. Plan remediation order:

    Example decision tree:
    - Architecture score > 400? → Address god objects first
    - Testing gap with low complexity? → Quick wins, add tests
    - Performance issues + architecture debt? → Refactor structure first
    - High code quality debt but good architecture? → Incremental cleanup
    
  4. Use category-specific strategies:

    • Architecture: Pair programming, design reviews, incremental refactoring
    • Testing: TDD for new code, characterization tests for legacy
    • Performance: Profiling first, optimize hot paths, avoid premature optimization
    • CodeQuality: Code review focus, linting rules, consistent patterns

Note on output formats: CategorySummary and CrossCategoryDependency details are available in markdown format only. The JSON output provides category counts in summary.by_category; individual items can be filtered by their category field.

Debt Density Metric

Debt density normalizes technical debt scores across projects of different sizes, providing a per-1000-lines-of-code metric for fair comparison.

Formula

debt_density = (total_debt_score / total_lines_of_code) × 1000

Example calculation:

Project A:
  - Total debt score: 1,250
  - Total lines of code: 25,000
  - Debt density: (1,250 / 25,000) × 1000 = 50

Project B:
  - Total debt score: 2,500
  - Total lines of code: 50,000
  - Debt density: (2,500 / 50,000) × 1000 = 50

Projects A and B have equal debt density (50) despite B having twice the absolute debt, because B is also twice as large. They have proportionally similar technical debt.
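
The same figure can be recomputed from the JSON summary as a sanity check:

# Recompute density from summary fields (should match summary.debt_density)
debtmap analyze . --format json | jq '.summary | .total_debt_score / .total_loc * 1000'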

Interpretation Guidelines

Use these thresholds to assess codebase health:

Debt Density | Assessment | Description
0-50         | Clean      | Well-maintained codebase, minimal debt
51-100       | Moderate   | Typical technical debt, manageable
101-150      | High       | Significant debt, prioritize remediation
150+         | Critical   | Severe debt burden, may impede development

Context matters:

  • Early-stage projects: Often have higher density (rapid iteration)
  • Mature projects: Should trend toward lower density over time
  • Legacy systems: May have high density, track trend over time
  • Greenfield rewrites: Aim for density < 50

Using Debt Density

1. Compare projects fairly:

# Small microservice (5,000 LOC, debt = 250)
# Debt density: 50

# Large monolith (100,000 LOC, debt = 5,000)
# Debt density: 50

# Equal health despite size difference

2. Track improvement over time:

Sprint 1: 50,000 LOC, debt = 7,500, density = 150 (High)
Sprint 5: 52,000 LOC, debt = 6,500, density = 125 (Improving)
Sprint 10: 54,000 LOC, debt = 4,860, density = 90 (Moderate)

3. Set team goals:

Current density: 120
Target density: < 80 (by Q4)
Reduction needed: 40 points

Strategy:
- Fix 2-3 Critical items per sprint
- Prevent new debt (enforce thresholds)
- Refactor before adding features in high-debt modules

4. Benchmark across teams/projects:

{
  "team_metrics": [
    {
      "project": "auth-service",
      "debt_density": 45,
      "assessment": "Clean",
      "trend": "stable"
    },
    {
      "project": "billing-service",
      "debt_density": 95,
      "assessment": "Moderate",
      "trend": "improving"
    },
    {
      "project": "legacy-api",
      "debt_density": 165,
      "assessment": "Critical",
      "trend": "worsening"
    }
  ]
}

Limitations

Debt density doesn’t account for:

  • Code importance: 100 LOC in payment logic ≠ 100 LOC in logging utils
  • Complexity distribution: One 1000-line god object vs. 1000 simple functions
  • Test coverage: 50% coverage on critical paths vs. low-priority features
  • Team familiarity: New codebase vs. well-understood legacy system

Best practices:

  • Use density as one metric among many
  • Combine with category analysis and tiered prioritization
  • Focus on trend (improving/stable/worsening) over absolute number
  • Consider debt per module for more granular insights

Debt Density in CI/CD

Track density over time:

# Generate report with density
debtmap analyze . --format json --output debt-report.json

# Extract density for trending
DENSITY=$(jq '.summary.debt_density' debt-report.json)

# Store in metrics database
echo "debtmap.density:${DENSITY}|g" | nc -u -w0 statsd 8125

Set threshold gates:

# .github/workflows/debt-check.yml
- name: Check debt density
  run: |
    DENSITY=$(debtmap analyze . --format json | jq '.summary.debt_density')
    if (( $(echo "$DENSITY > 150" | bc -l) )); then
      echo "❌ Debt density too high: $DENSITY (limit: 150)"
      exit 1
    fi
    echo "✅ Debt density acceptable: $DENSITY"

Actionable Insights

Each recommendation includes:

ACTION: What to do

  • “Add 6 unit tests for full coverage”
  • “Refactor into 3 smaller functions”
  • “Extract validation to separate function”

IMPACT: Expected improvement

  • “Full test coverage, -3.7 risk”
  • “Reduce complexity from 22 to 8”
  • “Eliminate 120 lines of duplication”

WHY: Rationale

  • “Business logic with 0% coverage, manageable complexity”
  • “High complexity with low coverage threatens stability”
  • “Repeated validation pattern across 5 files”

Example workflow:

  1. Run analysis with coverage: debtmap analyze . --lcov coverage.lcov
  2. Filter to CRITICAL items: --min-priority critical
  3. Review top 5 recommendations
  4. Start with highest ROI items
  5. Rerun analysis to track progress
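
Steps 1-4 can be combined into a small script; a sketch using only the flags and fields documented above:

# 1. Analyze with coverage, keep machine-readable output
debtmap analyze . --lcov coverage.lcov --format json --output report.json

# 2-4. Review the top 5 critical items and their recommendations
jq '[.items[] | select(.priority == "critical")] | sort_by(.score) | reverse
    | .[0:5] | .[] | {location, score, recommendation}' report.json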

Common Patterns to Recognize

Pattern 1: High Complexity, Well Tested

Complexity: 25, Coverage: 95%, Risk: LOW

This is actually good! Complex but thoroughly tested code. Learn from this approach.

Pattern 2: Moderate Complexity, No Tests

Complexity: 12, Coverage: 0%, Risk: CRITICAL

Highest priority - manageable complexity, should be easy to test.

Pattern 3: Low Complexity, No Tests

Complexity: 3, Coverage: 0%, Risk: LOW

Low priority - simple code, less risky without tests.

Pattern 4: Repetitive High Complexity (Dampened)

Cyclomatic: 20, Effective: 7 (65% dampened), Risk: LOW

Validation or dispatch pattern - looks complex but is repetitive. Lower priority.

Pattern 5: God Object

File: services.rs, Functions: 50+, Responsibilities: 15+

Architectural issue - split before adding features.