AI-Native Studio  ·  Hiten Shah × Rak

We don't guess.
We measure.

Rak is an AI-native studio. We build, evaluate, and iterate on products using synthetic simulation, local LLM infrastructure, and evidence at every step. No strategy decks. No opinions without data. Just work.

$0 · Cloud API cost
+55% · Intent lift, Talk Stories
5.8x · Throughput gain found
24 · Models benchmarked

Point of View

Most AI work is expensive guessing.
We replaced guessing with infrastructure.

We run a local LLM farm on a single Mac Studio — 24 models, zero cloud cost, unlimited iteration. When we want to know if a landing page will convert, we don't A/B test it live. We simulate 20 personas against it in an hour and get the answer before a single real user sees it.

When we want to know which model to route a query to, we don't benchmark by feel. We run a six-dimensional eval suite across every model and build a router backed by the data. When we find a 5.8x throughput advantage hiding in a different backend, we measure it twice before we ship it.

The method is simple: define the question, build the measurement, run it, read the answer, change something, repeat. The output is work that compounds — each run makes the next one better.

What We Do

01
Synthetic User Simulation
20-persona simulations on landing pages, copy variants, and product decisions. Results in hours, not weeks. No recruiting, no scheduling, no bias from leading questions.
02
LLM Evaluation & Routing
Six-dimensional eval suites across your model candidates: quality, multi-turn, domain, throughput, RAG, and think vs. no-think. A routing system backed by data, not vibes.
03
AI Infrastructure
Local LLM farms, cascade routers, model councils, speculative serving. Built on Apple Silicon with Ollama and MLX. Zero per-token cost. Unlimited iteration.
04
Copy & Positioning Iteration
Data-driven landing page iteration. Framing experiments, objection analysis, copy ceiling diagnosis. Every change tied to a simulation result, not a creative opinion.
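
To make the routing idea in item 02 concrete, here is a minimal sketch of picking a model from eval scores. It assumes each candidate has a normalized score in [0, 1] on each of the six dimensions and that each query type carries a weight per dimension. The names `ModelScores`, `route`, `model-a`, and `model-b` are hypothetical, and the numbers are placeholders, not measured results.

```python
from dataclasses import dataclass

# The six dimensions named in the text; the identifier spellings are our own.
DIMENSIONS = ("quality", "multi_turn", "domain", "throughput", "rag", "think")


@dataclass(frozen=True)
class ModelScores:
    """Eval scores for one candidate model (hypothetical structure)."""
    name: str
    scores: dict  # dimension name -> normalized score in [0, 1]


def route(candidates, weights):
    """Return the name of the candidate with the highest weighted score.

    `weights` maps dimension name -> importance for this query type.
    Dimensions missing from either mapping count as 0.
    """
    if not candidates:
        raise ValueError("no candidates to route to")

    def weighted(model):
        return sum(
            weights.get(dim, 0.0) * model.scores.get(dim, 0.0)
            for dim in DIMENSIONS
        )

    return max(candidates, key=weighted).name


# Placeholder scores for illustration only; these are not benchmark results.
candidates = [
    ModelScores("model-a", {"quality": 0.9, "multi_turn": 0.8, "domain": 0.7,
                            "throughput": 0.2, "rag": 0.6, "think": 0.9}),
    ModelScores("model-b", {"quality": 0.6, "multi_turn": 0.5, "domain": 0.5,
                            "throughput": 0.9, "rag": 0.5, "think": 0.3}),
]

# A latency-sensitive query type weights throughput; a hard one weights quality.
fast_pick = route(candidates, {"throughput": 1.0})
careful_pick = route(candidates, {"quality": 1.0})
```

A weighted sum is only one possible policy. A cascade router, for example, would try a cheaper model first and escalate on low confidence.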

About

Rak is the working name for the AI studio built by Hiten Shah and an AI collaborator. Every piece of work documented here was built and shipped in real sessions — the case studies are the actual record of what happened, including the failures, the wrong answer keys, and the times the obvious fix made things worse.

The stack: Mac Studio with 256 GB of unified memory, Ollama, MLX, Python. No cloud APIs required. All models run locally. All evals are checkpointed and resumable.
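
As an illustration of what "checkpointed and resumable" can mean in practice, here is a minimal sketch, assuming results are kept in a single JSON file keyed by job ID. The function name `run_resumable`, the file layout, and the `evaluate` callback are hypothetical and are not a description of the studio's actual harness.

```python
import json
from pathlib import Path


def run_resumable(jobs, evaluate, checkpoint_path):
    """Run evaluate(job_id) for each job, skipping jobs already on disk.

    Results are written after every job, so an interrupted run loses at
    most the job that was in flight. Results must be JSON-serializable.
    """
    path = Path(checkpoint_path)
    # Load earlier results if a checkpoint exists; otherwise start empty.
    results = json.loads(path.read_text()) if path.exists() else {}

    for job_id in jobs:
        if job_id in results:
            continue  # finished in a previous run
        results[job_id] = evaluate(job_id)

        # Write to a temp file, then rename over the checkpoint, so a crash
        # mid-write does not leave a truncated checkpoint behind.
        tmp = path.with_suffix(path.suffix + ".tmp")
        tmp.write_text(json.dumps(results))
        tmp.replace(path)

    return results
```

Calling `run_resumable` again with the same checkpoint path and a longer job list evaluates only the jobs that are not yet recorded.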

This site grows with the work. New case studies are added as new projects are completed. The methods improve with each run.