Improving Our Design & Planning Process
What We’ve Done Well ✅
1. Example-Driven Design
- ✅ Wrote examples before implementation
- ✅ Tested ergonomics with realistic code
- ✅ Identified pain points early
2. Comprehensive Analysis
- ✅ Compared alternatives (tuples vs HList, read vs query)
- ✅ Documented trade-offs explicitly
- ✅ Evaluated pain points systematically
3. Clear Philosophy
- ✅ “Pure core, imperative shell” is well-defined
- ✅ Design principles documented
- ✅ Anti-patterns identified (no heavy macros, etc.)
4. Thorough Documentation
- ✅ DESIGN.md captures API
- ✅ PHILOSOPHY.md explains “why”
- ✅ Examples show real usage
Critical Gaps 🚨
Gap 1: No Validation Through Code
Problem: All our examples are fictional - they don’t compile!
Impact:
- Assumptions might be wrong
- API might not work as expected
- Hidden complexity not discovered
Solution:
#![allow(unused)]
fn main() {
// Create a minimal proof-of-concept:
// stillwater/prototypes/validation_poc.rs
// Actually implement just Validation<T, E>
// Write REAL tests that COMPILE and RUN
// Discover what breaks, what's awkward
#[test]
fn test_validation_accumulation() {
let result = Validation::all((
validate_email("test@example.com"),
validate_age(25),
));
// Does this actually work?
// Is the syntax actually ergonomic?
// What error messages does the compiler give?
}
}
Action Items:
- Create
prototypes/directory - Implement minimal Validation type
- Write 10 real test cases
- Document surprises/learnings
Gap 2: No User Validation
Problem: We’re designing in a vacuum - no external feedback.
Impact:
- Solving wrong problems
- Missing critical use cases
- API might not resonate with real users
Solution:
A. Define User Personas
## Persona 1: Backend Developer (Primary)
- Building REST APIs with Axum/Actix
- Uses PostgreSQL/SQLx
- Pain: Testing business logic mixed with DB
- Wants: Testable code, clear error messages
## Persona 2: CLI Tool Author (Secondary)
- Building command-line tools
- Reads configs, processes files
- Pain: Error context is lost
- Wants: Great error messages, validation
## Persona 3: Data Engineer (Tertiary)
- ETL pipelines, CSV processing
- Needs bulk validation
- Pain: Want all errors, not first one
- Wants: Performance, parallelism
B. Create User Stories
As a backend developer,
I want to validate API inputs and get all errors,
So that users can fix their entire request at once.
As a CLI tool author,
I want clear error context showing what failed,
So that users can debug issues without my help.
As a data engineer,
I want to validate thousands of records in parallel,
So that pipelines complete faster.
C. Early User Interviews
- Share design docs on r/rust
- Get feedback from 5-10 Rust developers
- Ask: “Would you use this? Why/why not?”
- Document objections and address them
Gap 3: No Performance Validation
Problem: We assume async wrapping is cheap, but haven’t measured.
Impact:
- Performance might be worse than expected
- Might not be zero-cost in practice
- Could be a deal-breaker for some users
Solution:
Benchmark Critical Paths
#![allow(unused)]
fn main() {
// benchmarks/effect_overhead.rs
#[bench]
fn hand_written_sync(b: &mut Bencher) {
b.iter(|| {
let user = fetch_user_direct(42);
let validated = validate_user_direct(user);
save_user_direct(validated)
});
}
#[bench]
fn stillwater_sync(b: &mut Bencher) {
b.iter(|| {
Effect::from_fn(|_| fetch_user(42))
.and_then(|user| validate_user(user))
.and_then(|user| save_user(user))
.run(&())
});
}
// Measure:
// - Boxing overhead
// - Future wrapping cost
// - Comparison to hand-written
// - Memory allocations
}
Acceptance Criteria:
- Effect overhead < 5% vs hand-written
- Memory allocations reasonable
- Document in README if overhead exists
Action Items:
- Create benchmark suite
- Run on realistic workloads
- Profile with
cargo flamegraph - Document results in PERFORMANCE.md
Gap 4: No Competitive Analysis
Problem: Haven’t deeply compared to alternatives.
Impact:
- Missing features others have
- Repeating mistakes
- Can’t articulate our advantages
Solution:
Deep Dive Comparison
## vs. anyhow/eyre (Error Handling)
| Feature | anyhow | stillwater |
|---------|--------|------------|
| Error context | ✅ Yes | ✅ Yes |
| Validation accumulation | ❌ No | ✅ Yes |
| Effect composition | ❌ No | ✅ Yes |
| Pure/effect separation | ❌ No | ✅ Yes |
**When to use anyhow:** Simple apps, don't need validation
**When to use stillwater:** Need validation, testability, effect composition
## vs. frunk (Validation)
| Feature | frunk | stillwater |
|---------|-------|------------|
| Validation | ✅ Yes | ✅ Yes |
| HList | ✅ Yes | ❌ No (not needed) |
| Effect composition | ❌ No | ✅ Yes |
| Documentation | ⚠️ Sparse | ✅ Comprehensive |
| Learning curve | ⚠️ Steep | ✅ Gentle |
**When to use frunk:** Type-level programming, Generic derives
**When to use stillwater:** Practical validation, clear APIs
## vs. Hand-rolling
| Aspect | Hand-rolled | stillwater |
|--------|-------------|------------|
| Boilerplate | ❌ High | ✅ Low |
| Consistency | ⚠️ Varies | ✅ Enforced |
| Testing | ⚠️ Manual | ✅ Patterns built-in |
| Onboarding | ⚠️ Team-specific | ✅ Documented |
**When to hand-roll:** Very simple apps, unique requirements
**When to use stillwater:** Team projects, maintainability matters
Action Items:
- Try building same feature with alternatives
- Measure LOC, compile time, ergonomics
- Document in COMPARISON.md
- Use in marketing/README
Gap 5: Missing Implementation Experiments
Problem: Designing without building reveals hidden complexity late.
Impact:
- Lifetime issues we haven’t anticipated
- Trait bounds that don’t work
- Type inference failures
Solution:
Spike/Prototype Critical Parts
#![allow(unused)]
fn main() {
// prototypes/effect_lifetimes.rs
// Experiment: Can we avoid boxing?
struct EffectNoBox<T, E, Env, F>
where
F: FnOnce(&Env) -> BoxFuture<'_, Result<T, E>>,
{
run_fn: F,
}
// Try implementing and_then without boxing
// See what breaks, what lifetime errors occur
// Document findings
// Results:
// - [ ] Boxing necessary? Why/why not?
// - [ ] Can we use impl Trait instead?
// - [ ] What's the actual cost?
}
Experiments to Run:
- Effect without boxing (is it possible?)
- Validation with Iterator instead of tuples
- Context without String allocation
- Try trait integration (can we make ? work?)
- Environment extraction (trait vs direct access)
Gap 6: No Migration/Adoption Story
Problem: How does someone actually start using this?
Impact:
- Adoption friction
- Unclear path from current code
- All-or-nothing approach
Solution:
Progressive Adoption Guide
## Migration Path
### Stage 1: Validation Only (Week 1)
Start with just validation in new API endpoints:
```rust
// Before
fn create_user(input: UserInput) -> Result<User, Error> {
if !validate_email(&input.email) {
return Err(Error::InvalidEmail);
}
// ... continue with first-error-only
}
// After (just add validation)
fn create_user(input: UserInput) -> Result<User, Vec<ValidationError>> {
Validation::all((
validate_email(&input.email),
validate_age(input.age),
))
.into_result()
}
Benefits: Immediate value, low risk, no refactoring needed
Stage 2: Effect Separation (Week 2-3)
Extract pure business logic in critical paths:
#![allow(unused)]
fn main() {
// Pure functions (new)
fn calculate_discount(customer: &Customer) -> Money { ... }
fn apply_discount(order: Order, discount: Money) -> Order { ... }
// Keep existing I/O code (not refactored yet)
async fn process_order(id: OrderId) -> Result<Invoice, Error> {
let order = db.fetch_order(id).await?;
let discount = calculate_discount(&order.customer); // Pure!
let final_order = apply_discount(order, discount); // Pure!
db.save_invoice(final_order).await
}
}
Benefits: Better testability immediately, incremental change
Stage 3: Full Effects (Month 2+)
Gradually wrap I/O in Effects for new features:
#![allow(unused)]
fn main() {
fn process_order_v2(id: OrderId) -> Effect<Invoice, Error, AppEnv> {
// Full stillwater style
}
}
Benefits: New code uses best practices, old code still works
**Action Items:**
- [ ] Write migration guide
- [ ] Create "starter" templates
- [ ] Document integration with popular frameworks
- [ ] Show how to use with existing codebases
---
### Gap 7: No Clear Success Metrics
**Problem:** "100+ stars" is vague. How do we know we succeeded?
**Impact:**
- Can't measure progress
- Don't know when to pivot
- Unclear what "good" looks like
**Solution:**
#### Define Concrete Metrics
**Technical Metrics:**
- [ ] Compiles with zero warnings
- [ ] 100% documented (rustdoc)
- [ ] <5% overhead vs hand-written (benchmark)
- [ ] <2s additional compile time for simple project
- [ ] All examples compile and run
**Adoption Metrics (6 months):**
- [ ] 3+ production users (verified via contact)
- [ ] 10+ GitHub issues filed (engagement)
- [ ] 100+ downloads/week on crates.io
- [ ] Featured in "This Week in Rust" or similar
**Quality Metrics:**
- [ ] Positive HN/Reddit feedback (>70% upvote)
- [ ] 0 critical bugs reported
- [ ] <24hr response time to issues
- [ ] 5+ external contributors
**Educational Metrics:**
- [ ] Blog post written about it
- [ ] Conference talk accepted
- [ ] 3+ community examples/tutorials
**Leading Indicators (Month 1):**
- [ ] 5 people try it and give feedback
- [ ] 2 people say "I'd use this"
- [ ] 0 people say "This solves nothing"
---
### Gap 8: Documentation Organization
**Problem:** Design docs scattered across many files.
**Impact:**
- Hard to find information
- Redundancy/conflicts
- No clear entry point
**Solution:**
#### Documentation Structure
stillwater/ ├── README.md # Quick intro, examples ├── docs/ │ ├── guide/ │ │ ├── 01-getting-started.md │ │ ├── 02-validation.md │ │ ├── 03-effects.md │ │ ├── 04-testing.md │ │ └── 05-async.md │ ├── design/ │ │ ├── philosophy.md # Why we made these choices │ │ ├── architecture.md # How it works │ │ ├── decisions/ │ │ │ ├── 001-tuples-for-validation.md │ │ │ ├── 002-read-write-not-query-execute.md │ │ │ ├── 003-async-first.md │ │ │ └── template.md │ │ └── alternatives.md # vs frunk, anyhow, etc. │ ├── examples/ │ │ ├── web-api-validation.md │ │ ├── cli-tool-errors.md │ │ ├── data-pipeline.md │ │ └── testing-patterns.md │ └── contributing/ │ ├── development.md │ ├── testing.md │ └── releasing.md ├── examples/ # Runnable code ├── prototypes/ # Experiments └── benchmarks/ # Performance tests
**Action Items:**
- [ ] Reorganize current docs into structure
- [ ] Create templates for decision records
- [ ] Add navigation/ToC to each doc
- [ ] Cross-reference related docs
---
### Gap 9: No "Why Not" Section
**Problem:** Don't address objections head-on.
**Impact:**
- Users have unanswered concerns
- Seems like we're hiding weaknesses
- Can't learn from critics
**Solution:**
#### Address Objections Explicitly
```markdown
## Why NOT Use Stillwater?
### "I don't need validation accumulation"
**Then use:** anyhow/eyre for simple error handling
**Stillwater adds:** Unnecessary complexity if you don't validate forms/data
### "This adds too much abstraction"
**Valid concern:** Yes, it's more abstract than hand-written code
**Trade-off:** Abstraction buys you testability and consistency
**Decision:** If your team values simplicity > testability, skip this
### "Async-first means I can't use it in sync code"
**Clarification:** You CAN use it in sync code (wraps in ready Future)
**But:** You do need an async runtime (tokio)
**Alternative:** If you're building pure sync CLI, the overhead might not be worth it
### "I don't like the philosophy"
**That's fine:** If "pure core, imperative shell" doesn't resonate, this isn't for you
**Alternative:** Many roads to good code - this is one path
### "The API is too verbose"
**Valid in some cases:** `Effect<T, ContextError<E>, Env>` is long
**Mitigation:** Type aliases reduce this: `type AppEffect<T> = ...`
**Decision:** We chose explicit over magic
Action Items:
- List all objections we can think of
- Get feedback from critics
- Address honestly in FAQ
- Don’t be defensive - acknowledge trade-offs
Gap 10: No Failure Scenarios Considered
Problem: Only designed for success case.
Impact:
- What if compilation is slow?
- What if error messages are cryptic?
- What if adoption is zero?
Solution:
Plan for Failure
Scenario 1: Compile Times Are Terrible
- Detection: >10s for small project
- Response: Profile with
-Z self-profile, identify hot spots - Mitigation: Reduce generic instantiations, use trait objects
- Pivot: If unfixable, document clearly and target specific use cases
Scenario 2: Error Messages Are Cryptic
- Detection: User feedback: “I don’t understand this error”
- Response: Collect examples of bad errors
- Mitigation: Add trait bounds diagnostics, custom error messages
- Pivot: Simplify type system if needed
Scenario 3: No Adoption After 6 Months
- Detection: <10 downloads/week, no GitHub activity
- Response: User interviews - why didn’t it resonate?
- Pivot Options:
- Simplify to just validation (drop effects)
- Target specific niche (e.g., just data pipelines)
- Merge into existing library
- Archive project and document learnings
Scenario 4: Competing Library Emerges
- Detection: New library with similar goals gets traction
- Response: Compare features, identify gaps
- Options:
- Collaborate/merge
- Differentiate clearly
- Concede if theirs is better
Immediate Action Plan
This Week
1. Build Minimal Prototype
- Implement just Validation<T, E> (200 LOC)
- Write 10 real test cases
- Document surprises
2. Get External Feedback
- Share design docs on r/rust
- Ask 3 Rust developers to review
- Collect objections
3. Benchmark Assumptions
- Measure boxing overhead
- Compare to hand-written code
- Document results
Next Week
4. Competitive Analysis
- Build same feature with frunk
- Build same feature with anyhow
- Compare LOC, ergonomics
5. Define Success Metrics
- Technical goals (compile time, overhead)
- Adoption goals (users, downloads)
- Quality goals (bugs, response time)
6. Reorganize Documentation
- Create docs/ structure
- Move existing docs
- Add navigation
Month 1
7. Implement Core MVP
- Validation type (complete)
- Effect type (basic)
- Context errors
- IO helpers
8. Write Real Examples
- Convert fictional examples to real
- All examples compile and run
- Add to CI
9. Gather User Feedback
- 5 developers try it
- Collect feedback
- Iterate on API
Process Improvements
Add to Workflow
Before Any Design Decision:
- ✅ Write example code showing usage
- ✅ Compare 2-3 alternatives
- ✅ Document trade-offs
- ➕ Prototype if unclear (NEW)
- ➕ Benchmark if performance-sensitive (NEW)
Before Finalizing API:
- ✅ Examples compile and run
- ➕ Get feedback from 3+ external developers (NEW)
- ➕ Ensure migration path exists (NEW)
Before Calling It “Done”:
- ✅ All tests pass
- ✅ Documentation complete
- ➕ Success metrics defined and measured (NEW)
- ➕ Performance validated (NEW)
- ➕ “Why not” section written (NEW)
Key Insight
We’ve been designing in a vacuum.
Good:
- ✅ Thorough analysis
- ✅ Clear philosophy
- ✅ Example-driven
Missing:
- ❌ No real code validation
- ❌ No user feedback
- ❌ No performance data
- ❌ No competitive validation
- ❌ No clear success criteria
Fix: Build small, validate often, talk to users.
Recommended Next Steps
Priority 1: Validate Core Assumptions
- Build minimal Validation prototype
- Write real tests that compile
- Measure performance
- Get 3 people to try it
Priority 2: External Validation
- Share on r/rust
- User interviews
- Competitive analysis
- Document objections
Priority 3: Organize for Success
- Define clear metrics
- Reorganize documentation
- Create migration guide
- Plan for failure scenarios
Great design emerges from iteration with reality, not just thought experiments.