See if self-hosting an LLM actually beats the API bill.
Web tool. Type a model, your daily token volume, batch size. Get hosted API cost on OpenAI, Anthropic, Together, Fireworks beside vLLM on H100, A100, L40S, with the breakeven volume printed at the top.
Bring a friend who'd use this. Each signup pulls the launch closer.
$4.99 lifetime · or $19 / month SaaS · no charge today
Built by someone who already shipped 30+ tools
For a sense of what shipped looks like: relly.permissionlabs.com
Decide self-host vs API in four moves.
Built for these decisions.
If any of these are your question, this is the tool.
At my volume, would self-hosting Llama 3.3 70B actually be cheaper?
→ Daily tokens in. Side-by-side cost: OpenAI vs Anthropic vs vLLM on H100. Breakeven volume printed at the top.
If we ship to 1,000 paying users, what is our LLM line item?
→ Pick a model class and a vendor. Pricing scales with assumed tokens per user. Hosted vs self-host cross-over surfaced.
Where is the cheapest H100 hour today?
→ Live RunPod vs Vast.ai vs Lambda table. Filter by region, spot vs on-demand, vRAM. Sorted by per-token cost at your batch size.
Does the breakeven hold if we double the request volume?
→ Move the volume slider. Watch the lines cross. Save the scenario, share the URL, defend the number in the planning meeting.
Hosted vs self-host LLM cost lives in 12 tabs and a stale spreadsheet.
You want to know whether running Llama 3.3 70B on a rented GPU is cheaper than paying OpenAI or Anthropic per million tokens. Every answer lives in a different tab. RunPod and Vast.ai pricing pages list 30 GPU SKUs but nothing on throughput. Hosted API trackers like artificialanalysis.ai compare only hosted vendors. The vLLM and SGLang docs show batching wins on paper, but the math never lands in the same view as your daily token volume. You end up in a spreadsheet at 1am, guessing at idle hours and egress, and the breakeven number changes every time you blink.
Four pieces, one tool.
Each piece ships in the first build for waitlist members. SaaS upgrades layer on top.
Hosted vs self-host breakeven calculator
Type your model, your daily token volume, and your batch size. We print monthly cost on OpenAI, Anthropic, Together, and Fireworks beside the same workload on vLLM running on the cheapest matching GPU rental. The breakeven volume, where self-host overtakes hosted, sits at the top of the page in one bold number.
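As a rough sketch of the arithmetic behind that one bold number: hosted cost grows with token volume, while a rented GPU costs the same every hour whether it is busy or not. The function names and the prices below are illustrative placeholders, not quotes from any vendor, and the sketch assumes a single always-on GPU with enough throughput to serve the whole workload.

```python
def hosted_monthly_cost(daily_tokens: float, price_per_million: float, days: int = 30) -> float:
    """Monthly bill on a per-token hosted API (blended input/output price assumed)."""
    return daily_tokens * days * price_per_million / 1_000_000


def self_host_monthly_cost(gpu_hourly: float, gpus: int = 1, days: int = 30) -> float:
    """Monthly bill for always-on rented GPUs; flat regardless of token volume."""
    return gpu_hourly * 24 * days * gpus


def breakeven_daily_tokens(gpu_hourly: float, price_per_million: float,
                           gpus: int = 1, days: int = 30) -> float:
    """Daily token volume at which the hosted bill equals the self-host bill."""
    if price_per_million <= 0:
        raise ValueError("price_per_million must be positive")
    monthly_gpu = self_host_monthly_cost(gpu_hourly, gpus, days)
    return monthly_gpu * 1_000_000 / (days * price_per_million)


# Placeholder inputs: a $2.50/hour GPU versus a $1.00 per million token API.
GPU_HOURLY = 2.50
API_PRICE = 1.00
breakeven = breakeven_daily_tokens(GPU_HOURLY, API_PRICE)
print(f"Breakeven: {breakeven:,.0f} tokens/day")
```

With these placeholder inputs the GPU costs $1,800 a month, so the breakeven lands at 60 million tokens per day. Below that volume the hosted API is cheaper under this simplified model.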
Batching multipliers from real benchmarks
Throughput numbers come from published vLLM and SGLang benchmark releases per model and GPU pair, not vendor brochures. We multiply by your concurrency assumption, so the per-token cost reflects the throughput you would actually see at batch 8, not the marketing peak at batch 256.
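To illustrate why the batch size matters so much, here is a minimal sketch that turns a GPU hourly price and a sustained throughput into a cost per million tokens. The throughput figures are made-up placeholders keyed by batch size, not published vLLM or SGLang results, and `cost_per_million_tokens` is a hypothetical helper name.

```python
def cost_per_million_tokens(gpu_hourly: float, tokens_per_second: float) -> float:
    """Self-host cost per million tokens at a given sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly / tokens_per_hour * 1_000_000


# Placeholder throughput (tokens/second) by batch size; swap in benchmark numbers.
ASSUMED_THROUGHPUT = {8: 1_000.0, 64: 4_000.0, 256: 8_000.0}
GPU_HOURLY = 2.50  # placeholder rental price in USD

for batch, tps in ASSUMED_THROUGHPUT.items():
    cost = cost_per_million_tokens(GPU_HOURLY, tps)
    print(f"batch {batch:>3}: ${cost:.4f} per million tokens")
```

The same GPU hour is spread over more tokens as throughput rises, so quoting the peak-batch figure understates the per-token cost of a workload that actually runs at batch 8.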
Hidden cost line items
Idle GPU hours when traffic is low, cold start surcharges on serverless GPU plans, model weight storage per month, egress per million tokens out. Included in the rolled-up monthly number, never buried in a footnote.
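One way to picture the roll-up is a simple sum over named line items. The field names and every number below are assumptions for illustration only; real plans bill these items in different units.

```python
from dataclasses import dataclass


@dataclass
class MonthlyWorkload:
    gpu_hourly: float              # USD per GPU hour
    busy_hours: float              # hours the GPU is serving traffic
    idle_hours: float              # hours paid for while traffic is low
    cold_starts: int               # serverless cold starts in the month
    cold_start_fee: float          # USD surcharge per cold start
    weights_gb: float              # size of stored model weights
    storage_per_gb_month: float    # USD per GB per month
    output_tokens_millions: float  # millions of tokens sent out
    egress_per_million: float      # USD per million tokens out


def line_items(w: MonthlyWorkload) -> dict:
    """Break the monthly bill into visible line items."""
    return {
        "compute": w.gpu_hourly * w.busy_hours,
        "idle": w.gpu_hourly * w.idle_hours,
        "cold_start": w.cold_starts * w.cold_start_fee,
        "storage": w.weights_gb * w.storage_per_gb_month,
        "egress": w.output_tokens_millions * w.egress_per_million,
    }


def rolled_up_total(w: MonthlyWorkload) -> float:
    """Single monthly number with every hidden cost included."""
    return sum(line_items(w).values())


# Placeholder workload, not a real quote.
example = MonthlyWorkload(2.0, 300, 420, 100, 0.05, 140, 0.10, 500, 0.01)
for name, usd in line_items(example).items():
    print(f"{name:<10} ${usd:,.2f}")
print(f"{'total':<10} ${rolled_up_total(example):,.2f}")
```

In this placeholder workload the idle line ($840) is larger than the busy compute line ($600), which is the kind of gap a compute-only estimate hides.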
Live GPU rental price feed
Daily refresh of RunPod, Vast.ai, and Lambda spot and on-demand prices. Filter by GPU model, region, vRAM, interruptible vs reserved. The cheapest H100-SXM hour today is the one we plug into your calculation.
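The selection step can be sketched as a filter followed by a minimum. The `Offer` record, the provider labels, and the prices are hypothetical stand-ins for whatever the feed actually returns.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    provider: str
    gpu: str
    region: str
    vram_gb: int
    interruptible: bool  # True for spot-style offers
    hourly_usd: float


def cheapest_offer(offers: Iterable[Offer], gpu: str, min_vram_gb: int = 0,
                   region: Optional[str] = None,
                   allow_interruptible: bool = True) -> Optional[Offer]:
    """Return the lowest-priced offer matching the filters, or None if none match."""
    matches = [
        o for o in offers
        if o.gpu == gpu
        and o.vram_gb >= min_vram_gb
        and (region is None or o.region == region)
        and (allow_interruptible or not o.interruptible)
    ]
    return min(matches, key=lambda o: o.hourly_usd, default=None)


# Placeholder feed rows; provider names and prices are invented for the example.
FEED = [
    Offer("provider-a", "H100-SXM", "us-east", 80, True, 1.90),
    Offer("provider-b", "H100-SXM", "us-east", 80, False, 2.60),
    Offer("provider-c", "H100-SXM", "eu-west", 80, False, 2.40),
    Offer("provider-a", "A100", "us-east", 80, False, 1.40),
]

best = cheapest_offer(FEED, "H100-SXM", min_vram_gb=80, allow_interruptible=False)
print(best)
```

Excluding interruptible offers changes the answer here: the cheapest H100-SXM row overall is a spot offer, but the cheapest on-demand row is the one a steady production workload would be priced against.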
Today's tools don't do this.
A wrong breakeven call ships a $1,500 monthly GPU bill that should have stayed a $400 API bill, or the reverse. That is six months of overpaying before someone redoes the spreadsheet. Setting up vLLM takes at least two days of engineering time. Picking the wrong GPU SKU adds a week.
First build: inputs for model, daily token volume, batch size, and concurrency. Side-by-side monthly cost on OpenAI, Anthropic, Together, Fireworks, and vLLM on the cheapest matching GPU rental from RunPod, Vast.ai, or Lambda. Breakeven volume printed at the top. Hidden costs (idle, cold start, storage, egress) rolled into every line.
SaaS upgrades: alerts when a saved scenario crosses the breakeven line. Custom batch size simulator. SGLang and TensorRT-LLM throughput modes. Team tier with shared scenarios and a weekly GPU price digest. The $19/month team tier locks in current pricing.
Not included: GPU provisioning, vLLM deployment scripts, model weight downloads, fine-tuning cost modeling, RAG pipeline costs. We answer the cost question. You ship the stack.
Which price would get you on the waitlist?
No charge today. The click tells me which tier is real demand. Early access price ≠ launch price.
Want this built?
Drop your email. No charge, no spam. You're saying "yes, I'd actually use this." That's the signal I'm looking for.
Hi, I'm Hyunyoung.
Solo builder · Choppy Toast
This page is a quick vibe-coded probe to test demand and gather feature requests. The actual product, when it ships, will be a polished, hand-built tool, not this scaffold.
For a sense of what "polished and shipped" looks like, here's another product I built: relly.permissionlabs.com.
Honest answers.
Then I wait. The waitlist stays live, no expiry, no archive. The faster you tell one dev who is staring at an OpenAI bill this month, the faster it ships. That is the whole engine.
More upcoming tools.
Extension Listing Kit
SaaS
Upload each screenshot, promo tile, and description once. We resize and crop every image to the exact spec the Chrome Web Store, Firefox Add-ons, and Edge Add-ons each ask for, hold one copy of your name, summary, and description per store and per language with the character limit checked as you type, and run a pre-submit pass that names every missing or out-of-spec asset before you hit publish.
Video Bloat Audit
Web tool
Paste a URL. We find every autoplay and background video on the rendered page, show the real transfer weight and how much each one adds to your largest paint, estimate the monthly bandwidth bill at your actual traffic, and hand you a ranked plan to swap each clip for a poster, a lazy load, a smaller encode, or a scripted animation.
Decompound Calc
Web tool
See exactly when your retirement money runs out.
AI Stack Cost
SaaS dashboard
One dashboard for every AI subscription you forgot about.
Vibe Check
Web tool
Find every AI-generated tell on your landing page.
Account Stack 2026
Web tool
Tell us your income. We tell you exactly where to put each dollar across 401k, Roth IRA, HSA, and backdoor.
Ad Precheck
Web tool
Paste your ad copy, image, and landing URL. Get a per-platform rejection score for Meta, Google Ads, and AdSense before you submit.
Silent AI Audit
Mac app
Find every AI model silently installed on your Mac. See the size, last access, and how to remove each one.
EV Power Bill
Web tool
See your real EV charging bill before you buy the car.
Before You Install
Web tool
Paste a package. See the supply-chain risk before you run install.
App Store Precheck
Web tool
Paste your app metadata, screenshots, and Info.plist. Get a per-guideline rejection score before you hit Submit for Review.
UK Stamp Duty Surcharge
Web tool
Stack the non-resident surcharge, additional property, and first-time buyer relief in one calculator. See your real SDLT before you exchange.
Vibecode Audit
Web tool
Find the 12 security holes Cursor, Lovable, v0, and Bolt leave open by default. Paste your URL, get a report your investor will not flag.
Spice Graveyard
iOS app
Scan the spice rack once. Get told what to cook tonight with the bottles you already own, and stop buying duplicates.
Home Addition Cost
Web tool
Paste a contractor bid for your home addition. Get a line-by-line read on what is fair, what is padded, and what scope cut drops the total without losing the room.
Accutane Tracker
iOS app
Log your Accutane course like the dermatologist would. Daily lip dryness, side-effect severity, dose ladder, blood-draw reminders, and a photo timeline that shows where week 12 actually got you.
PIH Fade Plan
Web tool
Tell us your skin type, breakout history, and how much post-acne brown is left. Get a 16-week active routine that switches between azelaic, niacinamide, vitamin C, and tretinoin based on how your skin actually reacts in week 2, 4, 8, and 12.
Tailwind Exit Plan
Web tool
Paste a Tailwind component or a repo URL. Get a structured CSS migration plan that pulls out reusable classes, scaffolds CSS modules with design tokens, and hands the team a 4-week refactor schedule with the file to open on Monday.
VMware Exit Plan
Web tool
Paste your VMware renewal quote, host counts, and license SKUs. Get a per-hypervisor cost split across Proxmox, Hyper-V, OpenShift, and Nutanix, the migration hour estimate, and a 90-day cutover schedule that names the cluster to drain first.
Child Investment Planner
Web tool
Type your kid's age, your monthly budget, and your tax bracket. Get a side-by-side projection for 529, UTMA, and Roth IRA at age 18, the contribution that fits your budget, and the one-page memo that names the account to open first.
RMM Escape Plan
Web tool
Paste your NinjaOne invoice, endpoint count, and add-on list. Get your real all-in per-endpoint cost, a dated cancellation letter that respects the 60-day notice clock, and a migration matrix scored to your size across Action1, Level.io, Endpoint Central, and Syncro.
Bank Freeze Exit Plan
Web tool
Type your monthly cash flow, your balance range, and your current business bank. Get a freeze-risk score for every provider, the morning-of runbook if your account locks, a named backup account at a second institution, and a one-page memo for your bookkeeper.
Host Lock-In Escape
Web tool
Paste your host, plan tier, and what you deploy. Get a lock-in risk score across Netlify, Vercel, Render, Fly.io, and Cloudflare Pages, a redeploy config built from your own env vars and redirects, and a one-page runbook for the morning your account gets suspended with the site still live.
After-Death Money Checklist
Web tool
Tell us the state and the rough size of the estate. Get whether you can skip full probate, which bills you actually have to pay and which die with the person, the order to notify the banks and the agencies, and a one-page memo for the family.
Heat Safety Planner
Web tool
Tell us who is going outside, what they are doing, and your zip code. Get a clear go, modify, or cancel call for today's heat, the work-rest and water schedule for those conditions, the early heat-illness signs to watch in that specific person, and the safe hours to move it to.
5 more on the waitlist and I build this.
No charge today. Drop your email, lock in the early-access price, and you hear first when it ships.
Built by a real person. No silent vaporware.
