Case Study · FPGATek

Mobius Forge vs. expert prompting: a controlled Meta ad experiment.

What happened when we ran the engine against off-the-shelf AI generation and a refined AI workflow on a live Meta campaign.

Headline Summary

$1.84 cost per lead (Mobius Forge engine, blended)

4.3× cheaper vs. off-the-shelf AI generation

84% of CBO budget auto-allocated to the engine

The Question

What does the engine actually add?

Most AI ad tools produce convergent output. A skilled human using a frontier model with iterative prompting can match or beat them on hook quality, copy variation, and visual range. This is not a controversial claim. We can show it in 60 seconds.

The real question for any AI ad engine is narrower. Does it produce concepts that a skilled prompter cannot reach, and does Meta's delivery algorithm reward those concepts when they exist?

We tested this on FPGATek, an FPGA hardware education brand. We chose a category outside our target customer base on purpose. Mobius Forge's claim is about structural buyer-barrier diversity, not category-specific tactics. If the engine outperforms AI baselines with hardware education buyers, that is stronger evidence that the framework transfers across categories than a test inside our own ideal customer profile (ICP) would be. A direct replication on a consumer DTC brand is in progress.

Methodology

Three conditions. Two phases. Identical inputs.

Three conditions tested

Condition A: Off-the-shelf AI
Single-pass generation from a strong meta-prompt. The naive baseline most teams encounter when they try a generic AI ad tool.
Condition B: Refined AI workflow
Frontier model with a multi-step prompting workflow including critique and refinement. The smart baseline most in-house growth teams build internally.
Condition C: Mobius Forge engine
Standard engine output, no special tuning for this experiment.

Two phases

Phase 1 (CBO)
Campaign Budget Optimization. All three ad sets in one campaign. Meta's algorithm freely allocates budget across sets in real time based on performance signals. This phase measured how Meta's delivery system responded to each creative set.
Phase 2 (ABO)
Migrated to Ad Set Budget Optimization with equal budgets across all three conditions. This isolates pure creative performance from Meta's allocation behavior.

Controls

Identical broad audience across all three conditions
Same product set
Same creative brief inputs given to all three conditions
Same campaign objective (Meta lead-gen) and bidding strategy
Same time window (Apr 5-28, 2026, 23 days)

Reported parameters

Total spend: ~$518 across all three conditions
Total leads: 222
Duration: 23 days
Leads per condition: 162 (engine), 51 (refined AI), 9 (off-the-shelf AI)

Primary metric

Cost per lead (CPL).

Secondary metrics

Click-through rate (CTR) and Meta's algorithmic budget allocation in the CBO phase.

Results

The cost-per-lead gap is real and persistent.

Condition	Leads	CPL	Share of leads	Cost ratio vs. engine
A: Off-the-shelf AI	9	$7.94	4%	4.3x more expensive
B: Refined AI workflow	51	$2.92	23%	1.6x more expensive
C: Mobius Forge engine	162	$1.84	73%	baseline
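
The derived columns in the table can be re-checked from the two reported inputs, leads and CPL. The Python sketch below is illustrative only. Per-condition spend is not reported in this summary, so it is back-computed as leads × CPL from the rounded figures, and the names `REPORTED` and `derive_rows` are hypothetical.

```python
# Reported figures from the results table: (leads, cost per lead in USD).
REPORTED = {
    "A: Off-the-shelf AI": (9, 7.94),
    "B: Refined AI workflow": (51, 2.92),
    "C: Mobius Forge engine": (162, 1.84),
}
BASELINE = "C: Mobius Forge engine"


def derive_rows(reported, baseline):
    """Recompute implied spend, share of leads, and CPL ratio per condition."""
    total_leads = sum(leads for leads, _ in reported.values())
    baseline_cpl = reported[baseline][1]
    rows = {}
    for name, (leads, cpl) in reported.items():
        rows[name] = {
            # Assumption: spend is back-computed from rounded CPLs.
            "implied_spend": leads * cpl,
            "lead_share": leads / total_leads,
            "cpl_ratio": cpl / baseline_cpl,
        }
    return rows


rows = derive_rows(REPORTED, BASELINE)
total_leads = sum(leads for leads, _ in REPORTED.values())
total_spend = sum(r["implied_spend"] for r in rows.values())

for name, r in rows.items():
    print(
        f"{name}: implied spend ~${r['implied_spend']:.2f}, "
        f"share of leads {r['lead_share']:.0%}, "
        f"CPL ratio {r['cpl_ratio']:.1f}x"
    )
print(f"Total leads: {total_leads}, implied total spend: ~${total_spend:.2f}")
```

Because the inputs are rounded, the implied total only needs to land near the reported ~$518, not match it to the cent.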

Phase 1: CBO Allocation

Meta's algorithm directed 84% of total CBO spend to the Mobius Forge ad set within days. Spend was pulled off the off-the-shelf AI variant almost immediately.

Phase 2: The Counterintuitive ABO Finding

Higher click rates produced costlier leads.

In the ABO campaign, the refined AI variant had a higher link CTR than the engine (2.30% vs. 1.64%), but cost about 46% more per lead ($3.06 vs. $2.09). Higher click rates on cleverer-sounding copy. Costlier leads behind those clicks. Our reading is that the engine's clicks are pre-qualified by the angle they respond to.

ABO · CTR vs. cost per lead

Condition	Link CTR	Cost per lead
Refined AI workflow	2.30%	$3.06
Mobius Forge engine	1.64%	$2.09
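
The ABO comparison reduces to two relative differences, which the short Python sketch below recomputes from the published figures. The dictionary `ABO` and the helper `relative_change` are hypothetical names used for illustration. With the rounded figures as published, the cost premium works out to roughly 46%; unrounded campaign data could shift that by a point.

```python
# Reported ABO-phase figures: link CTR (as a fraction) and cost per lead (USD).
ABO = {
    "refined_ai": {"ctr": 0.0230, "cpl": 3.06},
    "engine": {"ctr": 0.0164, "cpl": 2.09},
}


def relative_change(value, reference):
    """Return how far `value` sits above `reference`, as a fraction of it."""
    return (value - reference) / reference


# Positive values mean the refined AI variant is higher than the engine.
ctr_lift = relative_change(ABO["refined_ai"]["ctr"], ABO["engine"]["ctr"])
cpl_premium = relative_change(ABO["refined_ai"]["cpl"], ABO["engine"]["cpl"])

print(f"Refined AI link CTR vs. engine: {ctr_lift:+.0%}")
print(f"Refined AI cost per lead vs. engine: {cpl_premium:+.0%}")
```

Both values come out positive: the refined AI variant drew more clicks per impression and still paid more for each lead.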

What both phases tell us together

The Phase 2 ABO comparison measures pure creative performance with budget held constant. The Phase 1 CBO allocation measures how Meta's delivery system perceives each creative set. Both metrics moved in the same direction at meaningful magnitude.

The CBO result is the harder result to argue with. We did not pick which ad set Meta favored. Meta's delivery system did, based on signals we have no access to. This is third-party validation that does not depend on our own judgment of the creative.

Next Step

If you want to see the engine on your specific brand and category, the $100 diagnosis is the right starting point. The diagnosis is delivered against your live ads and your buyer, not against a category abstraction.

About this study

N=1 advertiser, two campaigns (ABO + CBO), 23 days, 222 leads, ~$518 total spend. Meta lead-gen objective. All variants ran on the same Meta account against the same audience definition. This is directional evidence supporting our hypothesis, not a universal performance guarantee.

Full methodology, raw numbers, campaign screenshots, and the prompts used in conditions A and B are available as a downloadable PDF.

Common Questions

What buyers ask before the first diagnosis.

How is this different from a generic AI ad tool?

AI tools generate variations. We diagnose coverage. A generator gives you twelve ads that look different and sell the same way to the same person. We read the ads you're already running, map them against the specific reasons your buyers don't purchase, and show you which of those reasons your creative never addresses. The output isn't more concepts; it's knowing which concepts are worth making.

Does this work in my category?

Mobius Forge maps buyer barriers and persuasion mechanisms, not category-specific tactics. The same buyer barriers (skepticism, urgency, identification, switching cost, claim fatigue) show up in skincare, supplements, software, and hardware. Different products, same underlying buyer psychology. The FPGATek experiment was run outside our target DTC category on purpose. It is early evidence that the mechanism is not limited to one consumer vertical. The $100 diagnosis shows exactly how it applies to your brand.

What stops a competitor from copying this?

The framing is portable. The system that produces the framing is not. We have published which buyer barriers and persuasion mechanisms the engine maps to, because the value is not in the taxonomy. The value is the judgment to read your live portfolio, see which barriers it leaves open, and target them for your specific brand, against your specific existing ads. That requires more than a prompt.

Why does this perform differently than other AI tools?

The FPGATek experiment is the cleanest answer. Same brief, same audience, same Meta account. The engine produced 1.6x cheaper leads than a frontier model with a multi-step prompting workflow built by an experienced practitioner. The output is structurally different in a way that affects performance, and Meta's own delivery algorithm allocated 84% of CBO budget to the engine without being asked.