Rak is an AI-native studio. We build, evaluate, and iterate on products using synthetic simulation, local LLM infrastructure, and evidence at every step. No strategy decks. No opinions without data. Just work.
We run a local LLM farm on a single Mac Studio — 24 models, zero cloud cost, unlimited iteration. When we want to know if a landing page will convert, we don't A/B test it live. We simulate 20 personas against it in an hour and get the answer before a single real user sees it.
When we want to know which model to route a query to, we don't benchmark by feel. We run a six-dimensional eval suite across every model and build a router backed by the data. When we find a 5.8x throughput advantage hiding in a different backend, we measure it twice before we ship it.
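As a rough illustration of what a router backed by eval data can look like, here is a minimal sketch. It assumes each model has a score per eval dimension and each query type has a weight per dimension. The model names, dimension names, and numbers are placeholders invented for the example, not the studio's actual models, dimensions, or results.

```python
from typing import Dict

# Hypothetical eval scores in [0, 1]. Every name and number here is a
# placeholder for illustration, not a measured result.
SCORES: Dict[str, Dict[str, float]] = {
    "model-a": {"reasoning": 0.9, "speed": 0.3},
    "model-b": {"reasoning": 0.6, "speed": 0.8},
}


def route(weights: Dict[str, float],
          scores: Dict[str, Dict[str, float]] = SCORES) -> str:
    """Return the model with the highest weighted score for a query type."""
    def weighted(model: str) -> float:
        # Dimensions a model was never scored on count as 0.
        return sum(w * scores[model].get(dim, 0.0) for dim, w in weights.items())

    # Sort names first so ties resolve the same way on every run.
    return max(sorted(scores), key=weighted)


if __name__ == "__main__":
    # A query type that cares only about reasoning quality.
    print(route({"reasoning": 1.0}))
    # A query type that cares mostly about speed.
    print(route({"reasoning": 0.2, "speed": 0.8}))
```

The point of the design is that the routing decision is a pure function of the score table, so rerunning the evals updates the router without touching its logic.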
The method is simple: define the question, build the measurement, run it, read the answer, change something, repeat. The output is work that compounds — each run makes the next one better.
Rak is the working name for the AI studio built by Hiten Shah and an AI collaborator. Every piece of work documented here was built and shipped in real sessions — the case studies are the actual record of what happened, including the failures, the wrong answer keys, and the times the obvious fix made things worse.
The stack: Mac Studio, 256 GB of unified memory, Ollama, MLX, Python. No cloud APIs required. All models run locally. All evals are checkpointed and resumable.
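One common way to make an eval run checkpointed and resumable is to append each result to a file as soon as it is computed and skip anything already recorded on restart. The sketch below assumes that approach. The function name `run_eval`, the JSONL layout, and the `model::case` key format are all hypothetical, and the `score` callable stands in for whatever actually runs a model on a case.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def run_eval(cases: Iterable[Tuple[str, str]],
             score: Callable[[str, str], float],
             checkpoint: Path) -> Dict[str, float]:
    """Score every (model, case_id) pair, skipping pairs already on disk."""
    done: Dict[str, float] = {}

    # Resume: load any results written by an earlier, possibly interrupted run.
    if checkpoint.exists():
        with checkpoint.open() as f:
            for line in f:
                if line.strip():
                    record = json.loads(line)
                    done[record["key"]] = record["score"]

    # Append mode keeps earlier results intact.
    with checkpoint.open("a") as f:
        for model, case_id in cases:
            key = f"{model}::{case_id}"
            if key in done:
                continue  # already scored in a previous run
            result = score(model, case_id)
            # Write and flush one line per result so a crash loses at most
            # the case that was in progress.
            f.write(json.dumps({"key": key, "score": result}) + "\n")
            f.flush()
            done[key] = result

    return done
```

Calling `run_eval` a second time with the same checkpoint path only scores the pairs that are missing, so an interrupted run can be restarted without repeating finished work.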
This site grows with the work. New case studies are added as new projects are completed. The methods improve with each run.