How 5 rounds of synthetic persona simulation lifted conversion intent by 55% — from 4.3 to 6.65 — without a single real user, in a single day.
Talk Stories is an AI content tool that lives in Slack. It learns how each person on a team writes, then generates content in their voice on demand. The product is strong. The landing page was not converting.
Starting from an existing redesigned page, we ran 5 rounds of synthetic user simulation across 20 personas — all run locally on a Mac Studio using qwen2.5:7b as the persona model and llama3.3:70b as an independent judge. No human users. No survey panels. No waiting.
In a single day, average conversion intent moved from 4.3/10 to 6.65/10 — a 55% lift. Privacy objection rate dropped from 35% to 25%. Word-of-mouth signal (personas who said they would share the page) doubled from 15% to 30%.
The deliverable was a production-ready HTML landing page incorporating findings from all 5 simulation rounds.
The best landing page is the one that converts the right people and honestly disqualifies the wrong ones. We optimized for qualified intent, not surface metrics. A 6.65 from 17 on-target personas matters more than an 8.0 that includes people who would churn in week 2.
The Talk Stories landing page existed. It had a hero, social proof, feature list, pricing, and a CTA. The question wasn't whether it was designed — it was whether it was working.
The hardest constraint wasn't technical — it was scope control. Every simulation round produced findings that pointed toward 3-4 possible fixes. The discipline was picking the single highest-leverage change per iteration, not all of them at once.
Talk Stories is built for B2B Slack teams at 20-200 person companies. The target buyer is a content-bottlenecked role: founders who want to write but can't, heads of marketing who are drowning in requests, SDR managers whose reps never post. Pricing: ~$20-30/seat/month, free during early access.
The page needed to convert cold traffic from word of mouth ("a colleague sent me this"). Not SEO. Not ads. A human recommendation, followed by a first-impression read.
Every round used the same core setup. Consistency across runs is what makes the comparative data meaningful.
20 synthetic personas, held constant across all 5 simulation rounds. Personas represent the actual target buyer distribution for Talk Stories: B2B Slack teams, 20-200 person companies, content-bottlenecked roles.
| Role | Count | Company Size Range | Slack User | Content Pain |
|---|---|---|---|---|
| CEO / Founder | 5 | 8–50 people | 4 of 5 | Mixed |
| Head of Marketing / CMO | 4 | 35–150 people | 4 of 4 | High |
| VP Sales / SDR Manager | 2 | 70–80 people | 2 of 2 | High |
| Head of Content / Content Lead | 2 | 60–95 people | 2 of 2 | High |
| VP Product / VP Marketing | 2 | 110–150 people | 2 of 2 | Low–High |
| COO / Chief of Staff | 2 | 55–65 people | 2 of 2 | Medium |
| Head of Growth / Head of Comms | 2 | 200–800 people | 2 of 2 | Medium–High |
| Founder (non-Slack) | 1 | 12 people | No | Low |
Real user testing with 20 people across 5 iteration rounds would take weeks and cost thousands of dollars. Synthetic personas running on local LLMs let us iterate in hours, not weeks, at zero marginal cost. The trade-off: personas lack real-world messiness and embodied experience. We treat findings as directional signal, not ground truth. Findings that emerge consistently across 15+ personas are treated as reliable patterns.
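Holding the panel constant across rounds is easiest when it lives in code rather than prose. A minimal sketch of one way to represent it; the field names and `content_pain` values are illustrative, not the actual simulation schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Persona:
    """One synthetic buyer, held constant across all simulation rounds."""
    name: str
    role: str
    company_size: int   # headcount
    uses_slack: bool
    content_pain: str   # "low" | "medium" | "high"

# A few members of the 20-persona panel (details from the results tables;
# Aisha's pain level is an assumed example)
PANEL = [
    Persona("Aisha", "CEO", 30, True, "medium"),
    Persona("Priya", "Head of Marketing", 45, True, "high"),
    Persona("Sofia", "Founder", 12, False, "low"),
]

# The panel deliberately skews toward the target buyer:
# Slack-using, content-bottlenecked roles
on_target = [p for p in PANEL if p.uses_slack and p.content_pain != "low"]
```

Freezing the dataclass makes it harder to accidentally mutate a persona between rounds, which would silently break round-over-round comparability.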
Two models, two roles, chosen specifically to avoid self-grading bias: qwen2.5:7b played all 20 personas, and llama3.3:70b served as the independent judge, so no model ever graded its own output.
The task prompt was intentionally neutral to avoid leading the witness:
"A colleague just sent you this link. You've never seen the product before. Take a look."
Each persona was then asked structured questions covering: first impression, comprehension, top objection, conversion likelihood (1-10), and whether they would share the page. Later rounds added section-specific questions as we introduced new elements.
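Assembling each persona's turn is mechanical: identity, the neutral task prompt, the page text, then the structured questions. A sketch under the assumption that prompts were plain concatenated text; the exact question wording here is illustrative:

```python
TASK_PROMPT = (
    "A colleague just sent you this link. You've never seen the product "
    "before. Take a look."
)

# Structured questions asked after the cold read (wording illustrative)
QUESTIONS = [
    ("first_impression", "What is your first impression of this page?"),
    ("comprehension", "In one sentence, what does this product do?"),
    ("top_objection", "What is your single biggest objection?"),
    ("intent", "How likely are you to sign up, on a scale of 1-10?"),
    ("would_share", "Would you send this page to a colleague? Yes or no."),
]

def build_prompt(persona: dict, page_text: str) -> str:
    """Assemble one persona's full prompt: who they are, the neutral task,
    the page contents, then the structured questions."""
    lines = [
        f"You are {persona['name']}, {persona['role']} at a "
        f"{persona['company_size']}-person company.",
        TASK_PROMPT,
        "--- PAGE ---",
        page_text,
        "--- QUESTIONS ---",
    ]
    lines += [f"{i + 1}. {q}" for i, (_, q) in enumerate(QUESTIONS)]
    return "\n".join(lines)
```

Keeping the question keys stable across rounds is what lets later rounds add section-specific questions without breaking the round-over-round metrics.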
The judge (llama3.3:70b) synthesized each round after all 20 persona responses were collected: identifying the top objections, tracking scores against earlier rounds, and recommending the single highest-leverage change for the next iteration.
A checkpoint file was saved after each persona response, making runs resumable: if a run crashed midway, it picked up where it left off without re-running completed personas.
Five rounds, five sets of changes, a 55% improvement in conversion intent. Here is every step.
The redesigned page used "ghostwriter" as the product framing, "beta" throughout, and included the line "it's read everything you've ever written in Slack." Social proof (Bolt, Spinwheel, Ramp, Anthropic, OpenAI), before/after voice examples, and a standard FAQ.
Top findings: 60% raised "AI can't capture our unique voice." 35% raised data privacy. 30% confused by what "beta" meant for pricing. Intent split: 40% unlikely (1-3), 40% maybe (4-6), 20% likely (7-10). The page had good bones but wasn't closing the deal.
Rather than guess which copy direction to pursue, we ran 20 personas against 4 distinct framing variants simultaneously. Full results in Section 05 below.
Winner: "Voice Engine + Early Access" scored 7.35/10 — highest of any variant. But the judge recommended a hybrid: Voice Engine's emphasis on learning your voice, combined with the clarity of describing the product without a category label.
Changes: "Beta" replaced with "early access" throughout. "Ghostwriter" label removed — page now describes what it does without a category name. "It's read everything you've ever written in Slack" replaced with "learns from what your team chooses to share. You control what it knows." Bottom CTA changed from "Get your team's ghostwriter" to "Your team's voice. In Slack."
What moved: The Slack privacy line removal was immediately noticed and appreciated. "Early access" felt more premium and intentional than "beta." Privacy objection rate remained ~35% — the language improved the feeling but didn't address the mechanism.
What didn't move: Voice authenticity skepticism remained the top objection at ~55% of personas. Privacy still unresolved without specific details on data handling.
Changes: New dedicated "Your data is yours. Full stop." section with 4 specific cards: you choose which channels it reads, data not used for training, delete any time within 24 hours, SOC 2 Type II ETA Q3 2026. Voice proof examples upgraded with real specifics: $12K savings, $28M raised, 47 calls, 6 demos booked — not vague qualitative claims. Copy proofread for stiff phrasing; natural contractions throughout.
What moved: Privacy objection rate dropped from 35% to 25% — the dedicated section worked. The specific numbers in voice examples were cited as more convincing. Intent climb continued steadily.
What didn't move: Voice authenticity skepticism still the top remaining objection. "Will it really sound like me?" can't be answered by showing someone else's example — it needs proof by doing.
Changes: New "The skeptics became the biggest fans" testimonials section with 3 quotes engineered specifically to address the voice objection: a CEO who sent a Talk Stories draft to his co-founder without revealing its source (co-founder loved it), a Head of Marketing who was "the biggest skeptic" and got converted in week one, and a VP Sales whose reps went from never posting to 4 published posts in the first week. New "What the first week looks like" timeline (Day 1 / Days 2-3 / Days 4-5 / Week 2+) to address the workflow disruption concern. Slack demo section moved to dark background for visual contrast.
What moved: Testimonial section cited as credible by majority of personas. "Would you share this?" rate doubled from ~15% to 30%. Timeline section "significantly reduced" workflow and disruption concerns per judge synthesis. Steady intent climb continued.
The ceiling: The judge rated v5 "Nearly There." The remaining voice authenticity skepticism cannot be resolved by copy alone — it requires product experience. The page has gone as far as static copy can take it.
| Version | Avg Intent | Change | Privacy Objection | Share Rate | Key Change |
|---|---|---|---|---|---|
| v1 | 4.3/10 | Baseline | 35% | ~15% | Original page |
| v3 | 6.05/10 | +1.75 | ~35% | ~15% | Beta out, ghostwriter out, scary Slack line out |
| v4 | 6.35/10 | +0.30 | 25% | ~15% | Dedicated security section, grounded voice examples |
| v5 | 6.65/10 | +0.30 | ~22% | 30% | Testimonials, first-week timeline |
The biggest single jump was v1 to v3 (+1.75 points) — driven by removing three specific things that were actively hurting the page: the "beta" label, the "ghostwriter" framing, and the line about reading everything in Slack. Subtraction outperformed addition in round one. Every subsequent round added elements to fill the gaps the subtraction revealed.
Before rewriting anything, we ran a dedicated framing experiment: 4 complete versions of the hero and CTA copy, each tested against all 20 personas. 80 total runs. This is how we validated the "ghostwriter" question with data instead of opinion.
| Variant | Product Framing | CTA | Avg Intent | Result |
|---|---|---|---|---|
| C | "A Voice Engine that learns how everyone on your team writes, then writes like them" | Get early access | 7.35/10 | Winner |
| A | "An AI ghostwriter that lives in your Slack" | Get beta access | 7.0/10 | Runner-up |
| B | "A Story Engineer that learns your team's voice, writes content on demand" | Get early access | 6.9/10 | 3rd |
| D | No product label — description only | Join the waitlist | 6.8/10 | 4th |
"Voice Engine" scored highest because it puts the emphasis on your voice, not the AI doing something mysterious. The word "learns" does a lot of work — it implies the product earns accuracy over time rather than making claims it can't back up immediately.
"Ghostwriter" still works — 7.0/10 is strong — but it carries specific baggage. Two distinct failure modes: (1) personas who had been burned by AI writing tools associated "ghostwriter" with the generic outputs they already hated, and (2) the word implies authorship deception, which felt off for teams publishing authentic thought leadership.
"Story Engineer" underperformed despite seeming clever. The word "engineer" created false associations — technical personas expected a workflow automation tool, not a writing tool. Several personas asked if it integrated with their CRM or code pipeline. A label that requires disambiguation is a bad label.
"Waitlist" was the weakest CTA by far: it implies the product isn't ready. "Early access" implies exclusivity. "Beta" implies it might break. "Get early access" performed best, suggesting something real you can use today, with the benefit of a lower price lock-in.
The judge's final recommendation was a hybrid: drop the product label entirely and let the description do the work. "Voice Engine" won on scores, but the description-only variant (D) scored only marginally lower while being simpler. We applied the framing philosophy from C (emphasis on learning your voice) to the copy without attaching a label — resulting in the v3+ approach: no category name, just clear description of what it does.
This experiment replicated a finding from other simulation work: concrete labels beat abstract ones every time. "Story Engineer" failed for the same reason "BUILD/TEST/LEARN" failed in other taxonomy work — abstract concepts require explanation. "Voice Engine" succeeded because both words are immediately graspable. "Sprint/Experiment/Note" succeed because all three are familiar, specific, and hard to confuse.
The pattern holds: if someone has to read the description to understand the label, the label has already failed.
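Mechanically, the framing experiment is just every variant against every persona, scored on the same 1-10 intent question. A minimal sketch of the grid and its aggregation; function names are illustrative, and `ask` stands in for the persona call:

```python
from statistics import mean

def run_framing_experiment(variants, personas, ask):
    """variants: {label: hero/CTA copy}. Every variant is shown to every
    persona: len(variants) * len(personas) total runs (4 x 20 = 80 here).
    `ask(copy, persona)` returns that persona's 1-10 conversion intent."""
    scores = {v: [ask(copy, p) for p in personas] for v, copy in variants.items()}
    # Rank by mean intent; the shared panel keeps the ranking comparable
    return sorted(((mean(s), v) for v, s in scores.items()), reverse=True)
```

Because the panel is identical for every variant, the relative ranking is meaningful even when the absolute scores drift between rounds.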
The original page included the line: "it's read everything you've ever written in Slack." This was meant to convey depth of context — that Talk Stories really knows how you write. In testing, it read as surveillance.
Multiple personas used words like "scary," "invasive," and "creepy." Several said they would not install anything that described itself this way, regardless of what it actually did. One persona called it "the kind of line a startup writes before they think about how it sounds to users."
Fix: Replaced with "learns from what your team chooses to share. You control what it knows." The replacement shifted the power dynamic from the product doing something to the user, to the user being in control. Privacy objection rate began declining in v3.
Lesson: Review your copy for lines that describe what the product does to the user. If it would sound bad in a headline, it should not be in your hero copy.
We expected "Story Engineer" to test well — it felt distinctive, memorable, and specific to the problem space. It scored 6.9/10. That sounds decent until you see that "Voice Engine" scored 7.35 and even plain description scored 6.8.
The problem: "engineer" is a loaded word in B2B SaaS. Technical buyers immediately map it to "workflow tool" or "integration platform." Several personas asked about API access and Zapier compatibility. One CMO said "that sounds like an IT purchase, not a marketing purchase."
Lesson: Words carry professional associations that override your intended meaning. Test job title words carefully. "Engineer" skews technical. "Manager" skews middle-management. "Studio" skews creative agency. If your buyer persona is a Head of Marketing, make sure your label sounds like something a Head of Marketing would buy.
Adding a dedicated security section ("Your data is yours. Full stop.") with 4 specific cards dropped the privacy objection rate from 35% to 25% in a single round. We expected some improvement — we didn't expect it to be that direct.
The insight: people aren't afraid of privacy in the abstract. They're afraid because nobody told them the specifics. "We take security seriously" is a red flag. "You choose which channels it can read, your data is never used to train other companies' models, and you can delete everything within 24 hours" is a contract.
Lesson: If privacy is a likely objection for your product, treat it like a feature. Give it its own section, its own headline, and specific mechanics — not reassurances.
Generic testimonials ("This tool is amazing! Our content quality improved so much!") do almost nothing for conversion. The v5 testimonials were written to directly address the specific objection that 60% of personas raised — "will it actually sound like my team?"
Each testimonial was structured around a skeptic arc: I didn't believe it, here's what happened, here's the proof. The CEO who sent the draft to his co-founder without revealing it was AI-generated — and got a compliment — does more work than ten generic "great product" quotes.
Lesson: Write testimonials to the objection, not to the product. The best testimonial is the one that handles the thing preventing the reader from clicking.
The judge rated v5 "Nearly There — not yet ready to ship without further refinement." The remaining voice authenticity skepticism (~70% still raised it when asked directly) cannot be addressed by copy. The only fix is product experience: letting someone see it work on their actual Slack messages.
This is not a failure of the process — it's the process doing its job. The simulation correctly identified where copy ends and product begins. The recommendation for a v6 is an interactive demo widget or a "try it on your own Slack message" element — which is a product decision, not a copywriting decision.
Lesson: Simulation rounds eventually reveal the conversion ceiling for static copy. When the judge's top recommendation is something the page physically cannot do (let users experience the product), the page is ready to ship and the product team takes over.
| File | Description | Status |
|---|---|---|
| talkstories_v5.html | Final production-ready landing page. All v1-v5 findings applied. Zero em dashes, proofread, tested. | Ship-ready |
| talkstories_v4.html | v4 page — security section + grounded examples. Useful for A/B testing v4 vs v5. | Reference |
| talkstories_v2.html | Backup of v3 page pre-v4 changes. | Archive |
| File | Dimensions | Notes |
|---|---|---|
| talkstories_fullpage.png | 1440 × 6772px | v3 page, full width, sent as Telegram document |
| talkstories_v4_fullpage.png | 1440 × 6772px | v4 page after security section addition |
| talkstories_v5_fullpage.png | 1440 × 9025px | v5 final, includes testimonials + timeline sections |
| File | Contents |
|---|---|
| sims/talkstories_20260307_094541.json | v1 baseline — 20 personas, full responses + judge synthesis |
| sims/framing_20260307_102056.json | Framing experiment — 4 variants × 20 personas + head-to-head comparison |
| sims/v3_[timestamp].json | v3 sim — 20 personas, synthesis comparing to v1 |
| sims/v4_20260307_112600.json | v4 sim — 20 personas, security section impact measured |
| sims/v5_20260307_140020.json | v5 sim — 20 personas, testimonial + timeline impact, final synthesis |
How to apply this process to any landing page — or any copy that needs to convert.
Strip the page to plain text and run a simulation before making any changes. The v1 baseline is the most important data point in the whole process — everything after is measured against it. Don't skip this step even if you're confident the page has problems. You need to know which problems are real and which ones just feel bad.
If you don't know which copy direction to pursue, don't guess — test 3-4 variants simultaneously with the same persona panel. It's cheaper than rewriting and guessing wrong. The variants should differ on the thing you're most uncertain about: the product label, the headline, the CTA framing, the stage (beta vs early access vs waitlist).
Run all variants on the same day with the same 20 personas. The relative ranking matters more than the absolute scores.
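Because every variant is scored by the same panel, the results can also be read as paired comparisons: for each persona, which variant won. That framing is more robust to individual personas who score everything high or low than comparing raw averages. A small sketch, not from the original scripts:

```python
from collections import Counter

def head_to_head(scores):
    """scores: {variant: [intent scores, same persona order for every variant]}.
    Returns how many personas rated each variant highest."""
    variants = list(scores)
    n_personas = len(next(iter(scores.values())))
    wins = Counter()
    for i in range(n_personas):
        # The variant this persona scored highest gets one "win"
        wins[max(variants, key=lambda v: scores[v][i])] += 1
    return wins
```

If the win counts and the mean-intent ranking disagree, a few outlier personas are probably dragging the averages, which is worth inspecting before picking a winner.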
The biggest jump in this project came from removing three things (scary Slack line, "ghostwriter" label, "beta" framing) — not from adding sections. When simulations reveal problems, ask whether the problem is caused by something present on the page before reaching for something new to add.
Lines that actively hurt conversion are more costly than missing sections. Fix the active harm first.
"We take your privacy seriously" does nothing. "You choose which channels it can read, and your data is deleted within 24 hours of disconnecting" does something. When an objection persists across 35% of personas, it means they don't have the specific information they need — not that they haven't been reassured enough. Give them the mechanism.
Find your most common objection (from simulation or from sales calls). Write the testimonial that directly addresses it. The structure that works: I was the skeptic, here is the specific moment it changed my mind, here is the specific outcome. Vague enthusiasm is decoration. Specific conversion stories are evidence.
When the judge's single highest-leverage recommendation is something a static page cannot do (interactive demo, trial experience, live proof), the page has reached its copy ceiling. Ship it. Hand the remaining conversion problem to the product team. The page's job is to get the right people to the CTA — the product's job is to confirm what the page promised.
Start with data, not opinions. Subtract before you add. Address mechanisms, not feelings. Engineer proof to the objection. Recognize the copy ceiling when you hit it.
The final page feels obvious. The process was anything but.
| Round | Type | Personas | Avg Intent | Key Metric |
|---|---|---|---|---|
| v1 Baseline | Full page sim | 20 | 4.3/10 | 60% voice objection, 35% privacy |
| Framing Experiment | 4-variant test | 20 × 4 = 80 | 6.8–7.35/10 by variant | Voice Engine wins; "Story Engineer" fails |
| v3 Sim | Full page sim | 20 | 6.05/10 | +1.75 from v1; privacy still ~35% |
| v4 Sim | Full page sim | 20 | 6.35/10 | Privacy drops to 25% |
| v5 Sim | Full page sim | 20 | 6.65/10 | Share rate doubles to 30% |
| Variant | Label | CTA Stage | Avg Intent | Judge Verdict |
|---|---|---|---|---|
| C | Voice Engine | Early access | 7.35/10 | Winner; emphasis on learning your voice, not AI doing something to you |
| A | Ghostwriter | Beta | 7.0/10 | Runner-up; carries baggage for AI-burned personas |
| B | Story Engineer | Early access | 6.9/10 | Failed — "engineer" skews technical, wrong buyer associations |
| D | No label | Waitlist | 6.8/10 | Weakest CTA; "waitlist" implies product not ready |
| Persona | Role | Company Size | v5 Score | v1 Score (est.) |
|---|---|---|---|---|
| Aisha | CEO | 30p fintech | 7 | 7 |
| Marcus | CEO | 22p B2B SaaS | 7 | 5 |
| Priya | Head of Marketing | 45p HR tech | 6 | 5 |
| David | CMO | 120p enterprise software | 8 | 5 |
| Dana | VP Sales | 80p SaaS | 7 | 5 |
| Kevin | Head of Growth | 800p enterprise | 6 | 4 |
| Sofia | Founder | 12p consumer | 6 | 4 |
| Alex | COO | 65p proptech | 6 | 3 |
| Tanya | Marketing Manager | 38p edtech | 7 | 5 |
| Bernard | CEO | 50p professional services | 6 | 2 |
| Kenji | Head of Content | 95p martech | 7 | 5 |
| Morgan | VP Marketing | 150p logistics tech | 6 | 4 |
| Nilufar | Chief of Staff | 55p climate tech | 6 | 4 |
| Jamie | SDR Manager | 70p sales tech | 7 | 5 |
| Isabelle | Head of Comms | 200p healthtech | 6 | 4 |
| Ryan | Founder | 8p AI tools | 7 | 6 |
| Chen | VP Product | 110p devtools | 6 | 3 |
| Fatima | Head of Marketing | 35p legaltech | 7 | 5 |
| Omar | CEO | 25p recruitment tech | 7 | 4 |
| Laura | Content Lead | 60p fintech | 7 | 5 |
| Component | Spec | Role |
|---|---|---|
| Mac Studio | 256GB unified RAM, Apple Silicon | All inference, local only |
| Ollama | v0.x, port 11434 | Model serving backend |
| qwen2.5:7b | 4-bit quantized, ~5GB VRAM | Persona model (T2 tier) |
| llama3.3:70b | 4-bit quantized, ~40GB VRAM | External judge — never self-judges |
| Python 3.14 | requests, json, checkpoint files | Simulation scripts |
| shot-scraper | Playwright-based CLI | Full-page screenshots at 1440px |
| peekaboo | macOS UI automation CLI | Safari window capture for quick previews |
| Version | Change | Reason | Impact |
|---|---|---|---|
| v3 | "beta" → "early access" throughout | Framing experiment data | +premium feel, less "unfinished" |
| v3 | Removed "ghostwriter" label | Framing experiment data | Removed deception connotation |
| v3 | Removed "read everything you've ever written in Slack" | v1: "scary," "invasive" | Privacy objection began declining |
| v3 | Bottom CTA: "Get your team's ghostwriter" → "Your team's voice. In Slack." | Ghostwriter label removal | Cleaner, no category confusion |
| v4 | Added 4-card security section | 35% privacy objection in v1/v3 | Privacy objection: 35% → 25% |
| v4 | Grounded voice examples with real numbers | Vague examples not convincing | Examples cited as more credible |
| v4 | Natural contractions throughout ("it's" not "it is") | Copy felt stiff in proofread | Brand voice more human |
| v5 | Added testimonials section (3 skeptic-to-convert quotes) | Voice authenticity top objection | Share rate: 15% → 30% |
| v5 | Added "first week" timeline | Workflow disruption objection #2 in v4 | Disruption concerns "significantly reduced" |
| v5 | Slack demo on dark background | Visual contrast / visual break | Aesthetic — no intent impact measured |
| All | Zero em dashes enforced | House style requirement | Consistency |
The actual landing pages built from this work — open each one to see the result.
| Version | Description | Link |
|---|---|---|
| v5 — Final | Production-ready. All 5 sim rounds applied. Zero em dashes, proofread, ship-ready. | Open ↗ |
| v4 | Security section + grounded voice examples. Before testimonials. | Open ↗ |
| v3 | Post-framing experiment. Early access + no ghostwriter label. | Open ↗ |