AI-Native Studio · Hiten Shah × Rak

We don't guess.
We measure.

Rak is an AI-native studio. We build, evaluate, and iterate on products using synthetic simulation, local LLM infrastructure, and evidence at every step. No strategy decks. No opinions without data. Just work.

$0

Cloud API cost

+55%

Intent lift, Talk Stories

5.8x

Throughput gain found

24

Models benchmarked

Point of View

Most AI work is expensive guessing.
We replaced guessing with infrastructure.

We run a local LLM farm on a single Mac Studio — 24 models, zero cloud cost, unlimited iteration. When we want to know if a landing page will convert, we don't A/B test it live. We simulate 20 personas against it in an hour and get the answer before a single real user sees it.

When we want to know which model to route a query to, we don't benchmark by feel. We run a six-dimensional eval suite across every model and build a router backed by the data. When we find a 5.8x throughput advantage hiding in a different backend, we measure it twice before we ship it.
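As a minimal sketch of what "measure it twice" could look like in practice: time the same generation call over repeated runs per backend and compare median tokens per second. The names `tokens_per_second`, `speedup`, and the `generate` callable are hypothetical; the studio's actual harness is not shown here.

```python
import statistics
import time
from typing import Callable


def tokens_per_second(
    generate: Callable[[str], int],
    prompt: str,
    runs: int = 2,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Median tokens/sec over `runs` calls.

    `generate` is an assumed wrapper around a backend (for example an
    Ollama or MLX call) that returns the number of tokens it produced.
    """
    rates = []
    for _ in range(runs):
        start = clock()
        n_tokens = generate(prompt)
        elapsed = clock() - start
        rates.append(n_tokens / elapsed)
    return statistics.median(rates)


def speedup(candidate_tps: float, baseline_tps: float) -> float:
    """Throughput of the candidate backend relative to the baseline."""
    return candidate_tps / baseline_tps
```

Passing the clock as a parameter keeps the measurement testable without a live model; in real use the default `time.perf_counter` applies.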

The method is simple: define the question, build the measurement, run it, read the answer, change something, repeat. The output is work that compounds — each run makes the next one better.

What We Do

Synthetic User Simulation

20-persona simulations on landing pages, copy variants, and product decisions. Results in hours, not weeks. No recruiting, no scheduling, no bias from leading questions.
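A rough sketch of how a persona simulation might be aggregated, assuming each persona returns a 0-10 intent score for a copy variant. `simulate`, `percent_lift`, and the `score_intent` callable (which would wrap a local LLM prompt) are illustrative names, not the studio's actual code.

```python
from statistics import mean
from typing import Callable, Sequence


def simulate(
    page_copy: str,
    personas: Sequence[str],
    score_intent: Callable[[str, str], float],
) -> float:
    """Mean intent score across all personas for one copy variant."""
    return mean(score_intent(persona, page_copy) for persona in personas)


def percent_lift(baseline: float, variant: float) -> float:
    """Relative change of the variant score over the baseline, in percent."""
    return (variant - baseline) / baseline * 100.0
```

Comparing two variants then reduces to calling `simulate` once per variant with the same persona list and passing both means to `percent_lift`.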

LLM Evaluation & Routing

Six-dimensional eval suites across your model candidates: quality, multi-turn, domain, throughput, RAG, and think vs. no-think. The result is a routing system backed by data, not vibes.
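One simple way to turn eval results into a router, sketched under the assumption that every model has a numeric score per dimension: route each dimension to its top scorer. `build_router` and the score layout are hypothetical, and the numbers in the usage note are placeholders, not measured results.

```python
from typing import Dict

# model name -> eval dimension -> score (higher is better); layout is assumed
EvalScores = Dict[str, Dict[str, float]]


def build_router(scores: EvalScores) -> Dict[str, str]:
    """Map each eval dimension to the model that scored highest on it."""
    dimensions = {dim for per_model in scores.values() for dim in per_model}
    return {
        # A model with no score for a dimension can never win that dimension.
        dim: max(scores, key=lambda m: scores[m].get(dim, float("-inf")))
        for dim in dimensions
    }
```

For example, with placeholder scores `{"model-a": {"quality": 0.9, "throughput": 0.2}, "model-b": {"quality": 0.6, "throughput": 0.8}}`, quality queries route to `model-a` and throughput-sensitive ones to `model-b`.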

AI Infrastructure

Local LLM farms, cascade routers, model councils, speculative serving. Built on Apple Silicon with Ollama and MLX. Zero per-token cost. Unlimited iteration.
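A cascade router can be sketched as: try the cheapest model first and escalate only when its confidence falls below a threshold. This is a generic illustration; the `Tier` type, the confidence signal, and the `0.8` threshold are all assumptions rather than the studio's implementation.

```python
from typing import Callable, List, Tuple

# Each tier returns (answer, confidence in [0, 1]); how confidence is
# estimated is left to the caller and is an assumption of this sketch.
Tier = Callable[[str], Tuple[str, float]]


def cascade(query: str, tiers: List[Tier], threshold: float = 0.8) -> str:
    """Try tiers from cheapest to most capable; stop at the first confident answer."""
    if not tiers:
        raise ValueError("at least one tier is required")
    answer = ""
    for tier in tiers:
        answer, confidence = tier(query)
        if confidence >= threshold:
            break
    # If no tier was confident, this is the last (most capable) tier's answer.
    return answer
```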

Copy & Positioning Iteration

Data-driven landing page iteration. Framing experiments, objection analysis, copy ceiling diagnosis. Every change tied to a simulation result, not a creative opinion.

Recent Work

Landing Page Simulation · March 7, 2026

Talk Stories: Landing Page to 6.65/10

5 simulation rounds, 100 persona evaluations, 4 copy variants tested head-to-head

+55%

Intent lift

AI Infrastructure · March 6–7, 2026

Local LLM Eval Farm: 24 Models, Zero Cloud

Six eval dimensions, MLX vs Ollama benchmark, cascade router, Model Council

5.8x

Throughput gain


About

Rak is the working name for the AI studio built by Hiten Shah and an AI collaborator. Every piece of work documented here was built and shipped in real sessions — the case studies are the actual record of what happened, including the failures, the wrong answer keys, and the times the obvious fix made things worse.

The stack: Mac Studio, 256GB unified RAM, Ollama, MLX, Python. No cloud APIs required. All models run locally. All evals are checkpointed and resumable.
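A checkpointed, resumable eval could look something like the following sketch, which appends each result to a JSONL file and skips finished cases on rerun. The file format, the `id` and `score` fields, and the `run_eval` name are assumptions for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable


def run_eval(
    cases: Iterable[Dict],
    evaluate: Callable[[Dict], float],
    checkpoint: Path,
) -> Dict[str, float]:
    """Evaluate cases, writing each result to a JSONL checkpoint as it finishes.

    Cases whose "id" already appears in the checkpoint are skipped, so an
    interrupted run can be restarted without repeating completed work.
    """
    done: Dict[str, float] = {}
    if checkpoint.exists():
        for line in checkpoint.read_text().splitlines():
            if line.strip():
                record = json.loads(line)
                done[record["id"]] = record["score"]
    with checkpoint.open("a") as fh:
        for case in cases:
            if case["id"] in done:
                continue
            score = evaluate(case)
            fh.write(json.dumps({"id": case["id"], "score": score}) + "\n")
            fh.flush()  # persist each result before starting the next case
            done[case["id"]] = score
    return done
```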

This site grows with the work. New case studies are added as new projects are completed. The methods improve with each run.
