Experiment Proposals Get Research Guardrails
AI-generated experiment proposals now have to draw on the matching playbook and current benchmark research before they reach approval. Today's work also introduces a mocked Idea Snapshot page that shows the shape of a concise founder-facing validation report.
This is a quality-control pass for AI-generated validation work. The agent can still move quickly, but it now has to pick the right experiment type, load the matching guidance, and check current metrics before it proposes a concrete action.
That should make experiment suggestions feel less like generic startup advice. The stored research context also gives us a trail we can surface later when a founder asks why a metric or experiment was recommended.
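As a rough illustration of that gate, here is a minimal TypeScript sketch. It assumes a synchronous benchmark lookup and a simple playbook map. Every name in it (`proposeExperiment`, `Playbook`, `MetricResearch`, the experiment type strings, and the field names) is hypothetical and chosen for the example, not taken from the codebase.

```typescript
// Hypothetical sketch: all type, function, and field names here are assumptions.
type ExperimentType =
  | "market_research"
  | "reddit_scaffolding"
  | "social_media_marketing"
  | "direct_outreach"
  | "customer_interview";

interface Playbook {
  skillName: string;
  sourceUrl: string;
}

interface MetricResearch {
  query: string;
  summary: string;
  sources: string[];
}

// Mirrors the metadata the entry says is stored with each proposal.
interface ProposedExperiment {
  type: ExperimentType;
  skillName: string;
  sourceUrl: string;
  metricResearchQuery: string;
  researchSummary: string;
  supportingSources: string[];
}

type ProposalResult =
  | { ok: true; experiment: ProposedExperiment }
  | { ok: false; reason: string };

function proposeExperiment(
  type: ExperimentType,
  playbooks: Partial<Record<ExperimentType, Playbook>>,
  lookupBenchmarks: (type: ExperimentType) => MetricResearch | null,
): ProposalResult {
  // Step 1: load the guidance that matches the chosen experiment type.
  const playbook = playbooks[type];
  if (!playbook) {
    return { ok: false, reason: `no playbook for ${type}` };
  }

  // Step 2: run a single benchmark lookup before proposing anything.
  const research = lookupBenchmarks(type);
  if (research === null) {
    // No research, no experiment: the proposal is refused rather than guessed.
    return { ok: false, reason: "metric research could not be completed" };
  }

  // Step 3: keep the research trail alongside the proposal for later review.
  return {
    ok: true,
    experiment: {
      type,
      skillName: playbook.skillName,
      sourceUrl: playbook.sourceUrl,
      metricResearchQuery: research.query,
      researchSummary: research.summary,
      supportingSources: research.sources,
    },
  };
}
```

Returning a result object instead of throwing keeps the "research failed" case explicit, so a caller such as the chat flow can explain why no experiment was created.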
The mocked Idea Snapshot is the other half of the day's work. It turns the product's validation data into a compact report format, which gives us a target for how project outputs should eventually feel: focused, structured, and useful enough to share.
- Experiment playbooks: each AI-proposed experiment is now tied to a matching guidance set for customer research, social content, cold email, or customer interviews.
- Metric research step: the agent runs one current benchmark lookup before proposing an experiment, so success metrics are grounded in external context instead of guesswork.
- Experiment metadata: proposed experiments now store the skill name, source URL, metric research query, research summary, and supporting sources for later review.
- Idea Snapshot preview: a new `/idea-snapshot` route shows a mocked validation report with quick facts, ICP, market size, hypotheses, competitor comparison, and next-step CTA.
- Project chat experiment flow: the agent now proposes one experiment at a time, uses project context before acting, and avoids creating experiments when metric research cannot be completed.
- Experiment types expanded: validation work now covers market research, Reddit scaffolding, social media marketing, direct outreach, and customer interviews.
- Tool-call visibility made explicit: project-chat tool cards are hidden unless an environment flag enables them, keeping the default chat experience cleaner.
- Third-party notices updated: new external skill sources and research dependencies are documented in the app notices.
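
The tool-call visibility item above amounts to an opt-in flag check. A minimal sketch, assuming a hypothetical variable name `SHOW_TOOL_CARDS` and treating only `"true"` or `"1"` as enabled (the real flag name and accepted values are not stated in this entry):

```typescript
// Hypothetical flag name; the actual environment variable is not named here.
const TOOL_CARDS_FLAG = "SHOW_TOOL_CARDS";

// Returns true only when the flag is explicitly switched on,
// so tool cards stay hidden by default.
function toolCardsEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env[TOOL_CARDS_FLAG];
  return raw === "true" || raw === "1";
}
```

Passing the environment in as an argument, rather than reading it globally, makes the default-hidden behavior easy to check in isolation.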