You're about to make a big technical decision.
Most teams wing it.
ChatGPT gives you a paragraph. wheat gives you typed claims with conflicts highlighted. You research, prototype, stress-test — then a compiler catches contradictions and blocks output until they are resolved. A build system for decisions.
Works with Claude Code, Cursor, Copilot, or standalone. Node.js 20+.
npx @grainulation/wheat init "Should we migrate from microservices to a modular monolith?"
Runs inside Claude Code. Zero npm dependencies.
What is wheat?
wheat is a structured way to answer technical questions. You start with a question — "Should we migrate from microservices back to a modular monolith?" — and use slash commands to research, prototype, and challenge your findings. Every finding becomes a typed claim (factual, risk, estimate, constraint, recommendation) with an evidence tier from "someone said it" to "measured in production."
Every finding in a wheat sprint is a claim — a single typed statement with an evidence grade. "Redis handles 100k ops/sec" is a factual claim at the "documented" tier. "We should use Redis" is a recommendation at the "stated" tier. The compiler treats them differently.
When you're ready, a compiler validates everything: catches contradictions, flags weak evidence, and blocks output until issues are resolved. The compiler is JavaScript code — not prompts, not an LLM call. Same claims in, same result out, every time. The output is a decision brief you can send to your team, with a git audit trail showing how every claim was collected and challenged.
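Concretely, a claim can be pictured as a small typed record. The sketch below is illustrative: the field names, the record shape, and the validate helper are assumptions for this page, not wheat's actual claims.json schema.

```javascript
// Hypothetical claim record and validator. Field names follow the prose
// above but are illustrative, not wheat's actual claims.json schema.
const TYPES = ["factual", "risk", "estimate", "constraint", "recommendation"];
const TIERS = ["stated", "documented", "tested", "production"]; // weakest to strongest

const claim = {
  id: "c-001",
  type: "factual",
  statement: "Redis handles 100k ops/sec",
  evidence: "documented",
  source: "vendor benchmark docs",
};

// A minimal schema check of the kind a first compiler pass might perform.
function validate(c) {
  return (
    typeof c.statement === "string" &&
    TYPES.includes(c.type) &&
    TIERS.includes(c.evidence)
  );
}
```

Because the check is plain code over plain data, the same record always validates (or fails) the same way, which is the property the compiler relies on.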
Without wheat
"We should probably roll our microservices back into a monolith because Prime Video did it and Jake said it worked well at his last company."
With wheat
14 typed claims. 3 risks flagged. 1 contradiction caught between a team-size assumption and a documented constraint. Recommendation: consolidate the auth service back into the monolith; keep payments split where async boundaries are load-bearing. Every claim traceable to a source.
Ask your question
One command. Zero prompts. Sprint ready in under 3 seconds.
$ npx @grainulation/wheat "Should we migrate from microservices to a modular monolith?"
Gather evidence
Use slash commands to research, prototype, and challenge. Every finding is tracked with a type and an evidence grade.
wheat> /research "service consolidation cost"
Ship the brief
The compiler validates everything, resolves conflicts, and produces a decision document you can share.
wheat> /brief
"Should we consolidate the auth service back into our monolith?"
A team needs to decide whether to invest 3-5 weeks in consolidating their standalone auth service back into the monolith. Instead of debating in Slack, they use wheat to research it properly.
Define the question
The team lead types the question: "Should we consolidate the auth service back into our monolith?" wheat asks for audience (platform team), constraints (must not break existing SSO clients during the rollback), and scaffolds the investigation.
Gather evidence
wheat reads your codebase, searches the web, and records what it finds. Each finding gets a type (factual, risk, estimate) and an evidence grade — from "stated" (someone said it) to "tested" (prototype-validated).
Build and measure
wheat builds a working proof-of-concept and benchmarks it. Findings from prototypes get the "tested" evidence grade — real measurements, not blog posts.
Stress-test the findings
The adversarial review catches two weak points: the latency-win estimate assumes auth's in-process path will stay fast at monolith scale, but the last 90 days of incidents show the monolith already suffers slow-path contention; and the reverse-strangler migration re-creates the schema coupling that made auth painful to host in the monolith in the first place.
Ship the decision
The compiler resolves conflicts (higher evidence grades win), validates everything, and produces a self-contained HTML decision document. If there are contradictions, it tells you exactly what to fix first.
The outcome: Instead of a 45-minute meeting where the loudest voice wins, the team has a compiled brief with 14 validated findings, resolved conflicts, and a clear recommendation. Anyone can run git log claims.json to see how the decision was made.
Numbers in this example are illustrative — the pattern is what matters. Evidence tiers (stated, documented, tested, production) flag which claims would need what level of verification in your own sprint.
A decision-making framework built for engineers
Everything you need to go from question to architecture decision. Nothing you don't.
Typed, Evidence-Graded Claims
Every finding has a type (factual, risk, estimate, constraint, recommendation) and an evidence grade from "stated" (unverified) to "production" (measured in prod). Weak evidence gets flagged. silo stores proven claims as reusable packs.
Compiler Validation
A 7-pass compiler runs before any output is produced. It validates schemas, checks type distribution, sorts by evidence tier, detects conflicts, auto-resolves when evidence tiers differ, analyzes coverage, and checks readiness. Pure JavaScript — no LLM calls. You cannot ship a brief built on contradictions.
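The auto-resolve rule ("higher evidence grades win") reduces to a simple comparison. Here is a sketch in plain JavaScript; the tier names come from the prose above, but the function and the numeric ranking are assumptions for illustration, not wheat's actual compiler code.

```javascript
// Illustrative sketch of the auto-resolve pass: when two claims conflict,
// the one on the strictly higher evidence tier wins. Not wheat's actual code.
const TIER_RANK = { stated: 0, documented: 1, tested: 2, production: 3 };

function resolveConflict(a, b) {
  const diff = TIER_RANK[a.evidence] - TIER_RANK[b.evidence];
  if (diff === 0) return null; // same tier: output stays blocked, a human decides
  return diff > 0 ? a : b;     // higher tier wins
}

const winner = resolveConflict(
  { statement: "The team can absorb auth ownership", evidence: "stated" },
  { statement: "The auth team is at capacity through Q3", evidence: "documented" },
);
```

Here the "documented" claim beats the "stated" one; two claims on the same tier return null, which is exactly the case the compiler refuses to resolve for you.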
20 Slash Commands
/init, /research, /prototype, /challenge, /witness, /blind-spot, /brief, /present, /status, /feedback, /handoff, /merge, /replay, /calibrate, /resolve, /evaluate, /connect, /next, /sync, /pull.
Git Audit Trail
Every finding auto-commits. git log --oneline claims.json is the complete history of how you reached your decision. farmer streams tool calls to your phone for real-time approval.
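Assuming the sprint directory is a git repository with one commit per recorded finding (as described above), the audit trail is inspectable with standard git commands — nothing wheat-specific is needed to read it:

```shell
# Inspect how a decision was assembled (standard git, no wheat tooling needed).
git log --oneline claims.json    # one commit per recorded finding
git show HEAD -- claims.json     # the diff the most recent claim introduced
git log -p --follow claims.json  # full claim-by-claim history with diffs
```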
Shareable Decision Briefs
Briefs, presentations, and dashboards are single HTML files with inline CSS/JS. Send them to anyone — no hosting needed. Use mill to export to PDF, CSV, or static sites.
Works in Any Tech Stack
Any repo, any stack, any language. If Claude Code can read it, wheat can research it. Evaluate your Scala migration, Python monorepo, or Flutter rewrite.
wheat itself requires Node 20 or later. But your project can use any language or framework — wheat works in any repo regardless of stack.
Yes. wheat runs as a set of slash commands inside Claude Code. It uses Claude's ability to read your codebase, search the web, and reason about evidence.
Asking Claude gives you a paragraph of plausible-sounding advice with no way to verify it. wheat gives you 10-30 typed claims, each with a named evidence tier, run through a compiler that catches contradictions and flags weak evidence. The output is a decision brief with a git audit trail — git log claims.json shows exactly how you got there. Every claim is traceable. Every conflict is surfaced. Not a prompt wrapper — a build system for decisions.
Most planning tools generate a big plan upfront, then you execute it. wheat validates continuously: every finding is checked as it comes in, conflicts are caught immediately, and the compiler blocks output if your evidence doesn't hold up. Your understanding evolves as you learn, not before.
No. A wheat sprint is a single investigation — one question, a set of findings, and a compiled output. It takes 15 minutes to an hour, not two weeks. Think make, not Jira.
A claim is a single finding from your investigation. Each one has a type — factual ("the API returns paginated results"), risk ("connection pooling may bottleneck"), constraint ("must support Postgres 14+"), estimate ("migration: 2-4 weeks"), or recommendation. Each claim also has an evidence grade so you know how much to trust it. The compiler validates them all before you can produce output.
A simple question can be answered in 10-15 minutes. Bigger decisions with prototyping and multiple rounds of challenge might take an hour or spread across a few sessions. orchard tracks dependencies when you're running multiple sprints in parallel.
The ecosystem
wheat is the core research engine. Add tools as you need them.