Observability and Logging¶
Prodigy provides comprehensive execution monitoring and debugging through event tracking, Claude execution logs, and configurable verbosity levels.
Overview¶
Observability features: - Event tracking: JSONL event streams for all operations - Claude observability: Detailed Claude execution logs with tool invocations - Verbosity control: Granular output control from clean to trace-level - Log analysis: Tools for inspecting execution history - Performance metrics: Token usage and timing information
graph TD
Workflow[Workflow Execution] --> Events[Event System]
Workflow --> Claude[Claude Commands]
Workflow --> Verbosity[Verbosity Control]
Events --> JSONL[JSONL Event Files<br/>~/.prodigy/events/]
Events --> Types[Event Types<br/>AgentStarted, Completed, Failed]
Claude --> JSONLog[JSON Logs<br/>~/.local/state/claude/logs/]
Claude --> Tools[Tool Invocations]
Claude --> Tokens[Token Usage]
Verbosity --> Clean[Default: Clean Output]
Verbosity --> Verbose["-v: Show Streaming"]
Verbosity --> Debug["-vv: Debug Logs"]
Verbosity --> Trace["-vvv: Trace Details"]
JSONL --> Analysis[Log Analysis]
JSONLog --> Analysis
Analysis --> Debugging[Debugging & Monitoring]
style Events fill:#e1f5ff
style Claude fill:#fff3e0
style Verbosity fill:#f3e5f5
style Analysis fill:#e8f5e9
Figure: Prodigy's observability architecture showing event tracking, Claude logs, and verbosity control.
Event Tracking¶
All workflow operations are logged to JSONL event files:
Event Storage Best Practice
Events are stored globally in ~/.prodigy/events/ to enable cross-worktree aggregation. Multiple worktrees working on the same job share the same event log, making it easy to monitor parallel execution.
Event Types¶
AgentStarted - Agent execution begins:
{
"type": "AgentStarted",
"job_id": "mapreduce-123",
"agent_id": "agent-1",
"item_id": "item-1",
"timestamp": "2025-01-11T12:00:00Z"
}
AgentCompleted - Agent finishes successfully:
{
"type": "AgentCompleted", // (1)!
"job_id": "mapreduce-123", // (2)!
"agent_id": "agent-1", // (3)!
"duration": {"secs": 30, "nanos": 0}, // (4)!
"commits": ["abc123", "def456"], // (5)!
"json_log_location": "/path/to/logs/session-xyz.json" // (6)!
}
- Event type indicating successful completion
- MapReduce job identifier
- Unique agent identifier for this work item
- Total execution time for the agent
- Git commits created during execution
- Path to Claude's detailed JSON log for debugging
AgentFailed - Agent encounters errors:
{
"type": "AgentFailed",
"job_id": "mapreduce-123",
"agent_id": "agent-1",
"error": "Timeout after 300 seconds",
"json_log_location": "/path/to/logs/session-xyz.json"
}
WorkItemProcessed - Item completion:
{
"type": "WorkItemProcessed",
"job_id": "mapreduce-123",
"item_id": "item-1",
"status": "completed",
"result": {...}
}
CheckpointSaved - State persistence:
{
"type": "CheckpointSaved",
"job_id": "mapreduce-123",
"phase": "map",
"checkpoint_path": "/path/to/checkpoint.json",
"timestamp": "2025-01-11T12:05:00Z"
}
ClaudeMessage - Claude interaction messages:
// Source: src/cook/execution/events/event_types.rs:164-169
{
"type": "ClaudeMessage",
"agent_id": "agent-1",
"content": "Analyzing file structure...",
"message_type": "assistant",
"json_log_location": "/path/to/logs/session-xyz.json"
}
Event Organization¶
Events are organized by repository and job:
~/.prodigy/events/
└── prodigy/ # (1)!
├── mapreduce-123/ # (2)!
│ └── events-20250111.jsonl # (3)!
└── mapreduce-456/
└── events-20250111.jsonl
- Repository name for multi-repo support
- Job ID groups all events for this MapReduce run
- JSONL file with one event per line (append-only)
Claude Observability¶
Detailed Claude execution logs capture complete interactions:
JSON Log Location¶
Every Claude command creates a JSON log file:
Log Contents¶
Complete conversation history: - User messages and prompts - Claude responses - Tool invocations with parameters - Tool results - Token usage statistics - Error details and stack traces
Accessing JSON Logs¶
Via Verbose Output (-v flag):
Output includes log location:
Executing: claude /my-command
Claude JSON log: /Users/user/.local/state/claude/logs/session-abc123.json
✓ Command completed
In MapReduce Events:
{
"type": "AgentCompleted",
"agent_id": "agent-1",
"json_log_location": "/path/to/logs/session-xyz.json"
}
In DLQ Items:
{
"item_id": "item-1",
"failure_history": [{
"error": "Command failed",
"json_log_location": "/path/to/logs/session-xyz.json"
}]
}
Analyzing JSON Logs¶
Common Log Analysis Tasks
The examples below show how to extract specific information from Claude JSON logs using jq. These patterns are useful for debugging agent failures, tracking token usage, and understanding Claude's decision-making process.
View complete conversation:
Check tool invocations:
cat ~/.local/state/claude/logs/session-abc123.json | \
jq '.messages[].content[] | select(.type == "tool_use")'
Analyze token usage:
Extract errors:
cat ~/.local/state/claude/logs/session-abc123.json | \
jq '.messages[] | select(.role == "assistant") | .content[] | select(.type == "error")'
Verbosity Control¶
Granular output control with verbosity flags:
Choosing the Right Verbosity Level
Start with default output for production workflows. Use -v when debugging Claude interactions or when you need to see streaming output. Reserve -vv and -vvv for deep troubleshooting of Prodigy internals.
Levels¶
Default (verbosity = 0): - Clean, minimal output - Progress indicators - Results only
Verbose (-v, verbosity = 1): - Claude streaming JSON output - Command execution details - Log file locations
Debug (-vv, verbosity = 2): - Internal debug logs - Execution traces - State transitions
Trace (-vvv, verbosity = 3): - Trace-level internal logging - Full execution details - Performance metrics
Usage¶
# Default: clean output
prodigy run workflow.yml
# Verbose: show Claude streaming
prodigy run workflow.yml -v
# Debug: internal logs
prodigy run workflow.yml -vv
# Trace: maximum detail
prodigy run workflow.yml -vvv
Environment Override¶
Force streaming output regardless of verbosity:
Debugging MapReduce Failures¶
Using JSON Logs¶
When a MapReduce agent fails, use this debugging workflow:
flowchart TD
Start[Agent Fails] --> DLQ[Check DLQ Item]
DLQ --> GetLog{json_log_location<br/>present?}
GetLog -->|Yes| InspectLog[Inspect Claude JSON Log]
GetLog -->|No| CheckEvents[Check Event Stream]
InspectLog --> FindError[Find Failing Tool/Message]
FindError --> Context[Analyze Context]
Context --> Messages[Review Message History]
Context --> Tools[Check Tool Invocations]
Context --> Tokens[Examine Token Usage]
Context --> Errors[Extract Error Details]
Messages --> Root[Identify Root Cause]
Tools --> Root
Tokens --> Root
Errors --> Root
Root --> Fix[Apply Fix]
Fix --> Retry[Retry via DLQ]
CheckEvents --> EventLog[Parse Event JSONL]
EventLog --> Root
style Start fill:#ffebee
style GetLog fill:#fff3e0
style Root fill:#e8f5e9
style Fix fill:#e1f5ff
Figure: MapReduce debugging workflow showing how to trace failures using JSON logs and events.
When a MapReduce agent fails:
-
Check DLQ for json_log_location:
-
Inspect the Claude JSON log:
-
Identify failing tool:
-
Understand context:
- Review full conversation history
- Check tool invocations and results
- Examine token usage for context issues
- Look for error messages
Performance Metrics¶
Token Usage¶
Track token consumption:
{
"usage": {
"input_tokens": 1234,
"output_tokens": 567,
"cache_read_tokens": 89,
"cache_creation_tokens": 0
}
}
Execution Timing¶
Monitor performance:
{
"timings": {
"step1": {"secs": 10, "nanos": 500000000},
"step2": {"secs": 25, "nanos": 0},
"total": {"secs": 35, "nanos": 500000000}
}
}
Event Query Examples¶
Correlation IDs¶
Events include optional correlation IDs for tracing related operations across multiple agents:
// Source: src/storage/types.rs:75
{
"type": "AgentStarted",
"job_id": "mapreduce-123", // (1)!
"agent_id": "agent-1", // (2)!
"correlation_id": "trace-abc-123", // (3)!
"timestamp": "2025-01-11T12:00:00Z"
}
- Job identifier - groups all agents in this MapReduce run
- Agent identifier - unique to this work item
- Correlation ID - traces related operations across agents (optional)
Filter events by correlation ID:
# Source: src/cook/execution/events/filter.rs:63
# Find all events related to a specific workflow trace
cat ~/.prodigy/events/prodigy/mapreduce-123/events-*.jsonl | \
jq -c 'select(.correlation_id == "trace-abc-123")'
Track an agent workflow end-to-end:
# Get correlation ID from initial event
CORRELATION_ID=$(cat events.jsonl | jq -r 'select(.type == "AgentStarted") | .correlation_id' | head -1)
# Find all related events
cat events.jsonl | jq -c "select(.correlation_id == \"$CORRELATION_ID\")"
Find Failed Agents¶
cat ~/.prodigy/events/prodigy/mapreduce-123/events-*.jsonl | \
jq -c 'select(.type == "AgentFailed")'
Calculate Success Rate¶
# Count completed
completed=$(cat events.jsonl | jq 'select(.type == "AgentCompleted")' | wc -l)
# Count failed
failed=$(cat events.jsonl | jq 'select(.type == "AgentFailed")' | wc -l)
# Calculate rate
echo "Success rate: $(($completed * 100 / ($completed + $failed)))%"
Find Slowest Agents¶
cat events.jsonl | \
jq -c 'select(.type == "AgentCompleted") | {agent_id, duration: .duration.secs}' | \
sort -k2 -n -r | \
head -10
Log Management¶
Log Locations¶
Log Storage Considerations
Claude JSON logs can grow large with extensive tool usage. Monitor disk space when running many MapReduce agents. Consider setting up automated cleanup for logs older than 30 days in production environments.
Cleanup¶
# Clean old event logs (older than 30 days)
find ~/.prodigy/events -name "*.jsonl" -mtime +30 -delete
# Clean old Claude logs
find ~/.local/state/claude/logs -name "*.json" -mtime +30 -delete
# Clean completed sessions
prodigy sessions clean --completed
Examples¶
Debug Workflow Failure¶
# Run with verbose output
prodigy run workflow.yml -v
# Check event log for errors
cat ~/.prodigy/events/prodigy/latest/events-*.jsonl | \
jq -c 'select(.type == "AgentFailed")'
# Inspect Claude log
cat $(jq -r '.json_log_location' dlq-item.json) | jq '.messages[-5:]'
Monitor MapReduce Progress¶
# Run in verbose mode
prodigy run mapreduce.yml -v &
# Watch event stream
tail -f ~/.prodigy/events/prodigy/mapreduce-123/events-*.jsonl | \
jq -c 'select(.type == "AgentCompleted")'