Experiment Proposals Get Research Guardrails

AI-generated experiment proposals must now draw on the right playbook and on current benchmark research before they reach approval. The day also introduces a mocked Idea Snapshot page that shows the shape of a concise, founder-facing validation report.

This is a quality-control pass for AI-generated validation work. The agent can still move quickly, but it now has to pick the right experiment type, load the matching guidance, and check current metrics before it proposes a concrete action.
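That gating flow can be sketched roughly as follows. This is a minimal illustration, not the app's actual implementation: the type names, the playbook strings, and the `proposeExperiment` helper are all assumptions made for the example.

```typescript
// Hypothetical sketch of the experiment-proposal guardrails.
// All names here are illustrative, not the app's real API.

type ExperimentType =
  | "market_research"
  | "social_media_marketing"
  | "cold_email"
  | "customer_interviews";

interface MetricResearch {
  query: string;    // the benchmark lookup that was run
  summary: string;  // what the lookup found
  sources: string[];
}

interface Proposal {
  type: ExperimentType;
  playbook: string;
  research: MetricResearch;
}

// Each experiment type maps to a matching guidance set (playbook).
const playbooks: Record<ExperimentType, string> = {
  market_research: "guidance for customer research",
  social_media_marketing: "guidance for social content",
  cold_email: "guidance for cold outreach",
  customer_interviews: "guidance for interview scripts",
};

// Propose one experiment at a time; abort if metric research fails.
function proposeExperiment(
  type: ExperimentType,
  researchMetrics: (type: ExperimentType) => MetricResearch | null,
): Proposal | null {
  const playbook = playbooks[type];       // load the matching guidance
  const research = researchMetrics(type); // one current benchmark lookup
  if (research === null) {
    // Guardrail: no experiment is created without grounded metrics.
    return null;
  }
  return { type, playbook, research };
}
```

The point of the shape is the early return: a failed metric lookup short-circuits the whole proposal rather than falling back to a guessed success metric.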

That should make experiment suggestions feel less like generic startup advice. The stored research context also gives us a trail we can surface later when a founder asks why a metric or experiment was recommended.

The mocked Idea Snapshot is the other half of the day. It turns the product's validation data into a compact report format, which gives us a target for how project outputs should eventually feel: focused, structured, and useful enough to share.

  • Experiment playbooks: each AI-proposed experiment is now tied to a matching guidance set for customer research, social content, cold email, or customer interviews.
  • Metric research step: the agent runs one current benchmark lookup before proposing an experiment, so success metrics are grounded in external context instead of guesswork.
  • Experiment metadata: proposed experiments now store the skill name, source URL, metric research query, research summary, and supporting sources for later review.
  • Idea Snapshot preview: a new `/idea-snapshot` route shows a mocked validation report with quick facts, ICP, market size, hypotheses, competitor comparison, and next-step CTA.
  • Project chat experiment flow: the agent now proposes one experiment at a time, uses project context before acting, and avoids creating experiments when metric research cannot be completed.
  • Experiment types expanded: validation work now covers market research, Reddit scaffolding, social media marketing, direct outreach, and customer interviews.
  • Tool-call visibility made explicit: project-chat tool cards are hidden unless an environment flag enables them, keeping the default chat experience cleaner.
  • Third-party notices updated: new external skill sources and research dependencies are documented in the app notices.
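The stored metadata described above could look roughly like the record below. The field names are assumptions inferred from the bullet list, not the real schema, and `explainRecommendation` is a hypothetical helper showing how the trail could back a later "why was this recommended?" view.

```typescript
// Illustrative shape for stored experiment metadata; field names are
// assumptions based on the description, not the actual schema.
interface ExperimentMetadata {
  skillName: string;           // which playbook/skill produced the proposal
  sourceUrl: string;           // where the guidance set came from
  metricResearchQuery: string; // the benchmark lookup that was run
  researchSummary: string;     // what the lookup found
  supportingSources: string[]; // URLs backing the recommended metrics
}

// A later "why was this recommended?" answer only needs the stored trail.
function explainRecommendation(meta: ExperimentMetadata): string {
  return [
    `Proposed via ${meta.skillName} (${meta.sourceUrl}).`,
    `Benchmarks researched with: "${meta.metricResearchQuery}".`,
    meta.researchSummary,
    `Sources: ${meta.supportingSources.join(", ")}`,
  ].join("\n");
}
```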
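The tool-call visibility change amounts to a default-off feature flag. A minimal sketch, assuming a made-up environment variable name (the real flag may differ):

```typescript
// Hypothetical feature flag for showing tool-call cards in project chat.
// The variable name is invented for illustration.
function showToolCards(env: Record<string, string | undefined>): boolean {
  // Hidden by default; only an explicit opt-in reveals the cards.
  return env.SHOW_TOOL_CALL_CARDS === "true";
}
```

Because the check requires an exact `"true"`, an unset or misspelled value keeps the default chat experience clean.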