Waitlist open · 0%

See if self-hosting an LLM actually beats the API bill.

Web tool. Type a model, your daily token volume, batch size. Get hosted API cost on OpenAI, Anthropic, Together, Fireworks beside vLLM on H100, A100, L40S, with the breakeven volume printed at the top.

0 of 5 on waitlist

Bring a friend who'd use this. Each signup pulls the launch closer.

See it live

$4.99 lifetime · or $19 / month SaaS · no charge today

🌐 Web tool · dev tools

Built by someone who already shipped 30+ tools

30+ tools shipped
2,300 weekly Google impressions
35d day-by-day streak
12 ranked in top-10

For a sense of what shipped looks like: relly.permissionlabs.com

What it does

Decide self-host vs API in four moves.


Who it's for

Built for these decisions.

If any of these are your question, this is the tool.

Indie dev hitting OpenAI rate limits

At my volume, would self-hosting Llama 3.3 70B actually be cheaper?

→ Daily tokens in. Side-by-side cost: OpenAI vs Anthropic vs vLLM on H100. Breakeven volume printed at the top.

Startup deciding stack for a new product

If we ship to 1,000 paying users, what is our LLM line item?

→ Pick a model class and a vendor. Pricing scales with assumed tokens per user. Hosted vs self-host cross-over surfaced.

Founder chasing margin

Where is the cheapest H100 hour today?

→ Live RunPod vs Vast.ai vs Lambda table. Filter by region, spot vs on-demand, vRAM. Sorted by per-token cost at your batch size.

Team ops planning Q3 budget

Does the breakeven hold if we double the request volume?

→ Move the volume slider. Watch the lines cross. Save the scenario, share the URL, defend the number in the planning meeting.

The problem

Hosted vs self-host LLM cost lives in 12 tabs and a stale spreadsheet.

You want to know whether running Llama 3.3 70B on a rented GPU is cheaper than paying OpenAI or Anthropic per million tokens. Every answer lives in a different tab. RunPod and Vast.ai pricing pages list 30 GPU SKUs but nothing on throughput. Hosted API trackers like artificialanalysis.ai compare only hosted vendors. The vLLM and SGLang docs show batching wins on paper, but the math never lands in the same view as your daily token volume. You end up in a spreadsheet at 1am, guessing at idle hours and egress, and the breakeven number changes every time you blink.

What you'd get

Four pieces, one tool.

Each piece ships in the first build for waitlist members. SaaS upgrades layer on top.

01 · feature

Hosted vs self-host breakeven calculator

Type your model, your daily token volume, and your batch size. We print monthly cost on OpenAI, Anthropic, Together, and Fireworks beside the same workload on vLLM running on the cheapest matching GPU rental. The breakeven volume, where self-host overtakes hosted, sits at the top of the page in one bold number.
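The arithmetic behind that bold number can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the function names are made up for this example, the prices are placeholders rather than real vendor quotes, and it assumes one always-on GPU has enough throughput for the whole workload.

```python
def hosted_monthly_cost(daily_tokens: float, price_per_million: float, days: int = 30) -> float:
    """Hosted API bill: pay per token, so cost grows linearly with volume."""
    return daily_tokens * days / 1_000_000 * price_per_million


def self_host_monthly_cost(gpu_hourly: float, hours_per_day: float = 24.0, days: int = 30) -> float:
    """Rented GPU bill: pay per hour, whether or not tokens flow."""
    return gpu_hourly * hours_per_day * days


def breakeven_daily_tokens(gpu_hourly: float, price_per_million: float,
                           hours_per_day: float = 24.0) -> float:
    """Daily token volume at which the two monthly bills are equal."""
    daily_gpu_cost = gpu_hourly * hours_per_day
    return daily_gpu_cost / price_per_million * 1_000_000


if __name__ == "__main__":
    # Placeholder inputs, NOT real quotes: $2.50 per GPU hour, $1.00 per million tokens.
    volume = breakeven_daily_tokens(gpu_hourly=2.50, price_per_million=1.00)
    print(f"{volume:,.0f} tokens/day")  # 60,000,000 tokens/day
```

Below that volume the hosted API is cheaper. Above it the rented GPU wins, as long as the GPU can actually keep up with the traffic.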

02 · feature

Batching multipliers from real benchmarks

Throughput numbers come from published vLLM and SGLang benchmark releases per model and GPU pair, not vendor brochures. We multiply by your concurrency assumption, so the per-token cost reflects the throughput you would actually see at batch 8, not the marketing peak at batch 256.
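Turning an hourly rental price into a per-token price is one division once you have a throughput figure. A rough sketch follows. The throughput values are invented stand-ins for benchmark numbers, the GPU price is a placeholder, and the names are hypothetical.

```python
def cost_per_million_tokens(gpu_hourly: float, tokens_per_second: float) -> float:
    """Convert $/hour into $/million tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hourly / tokens_per_hour * 1_000_000


# Hypothetical throughput (tokens/second) keyed by batch size.
# Real values would come from published benchmarks for a model and GPU pair.
THROUGHPUT_TOK_S = {8: 500.0, 256: 4000.0}

if __name__ == "__main__":
    gpu_hourly = 2.50  # placeholder price, not a real quote
    for batch, tok_s in THROUGHPUT_TOK_S.items():
        cost = cost_per_million_tokens(gpu_hourly, tok_s)
        print(f"batch {batch}: ${cost:.2f} per million tokens")
```

With these made-up inputs the batch 8 price comes out several times higher than the batch 256 price, which is the gap between the throughput you would see and the marketing peak.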

03 · feature

Hidden cost line items

Idle GPU hours when traffic is low, cold start surcharges on serverless GPU plans, model weight storage per month, egress per million tokens out. Included in the rolled-up monthly number, never buried in a footnote.
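One way to picture the roll-up is a per-line breakdown that sums to the headline number. The sketch below is an assumption about how such a roll-up could look, not the tool's implementation. Field names and every figure are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SelfHostScenario:
    gpu_hourly: float                 # $ per billed GPU hour
    billed_hours_per_day: float       # hours the GPU is rented
    busy_hours_per_day: float         # hours it is actually serving traffic
    cold_starts_per_day: float        # serverless cold starts
    cold_start_fee: float             # $ surcharge per cold start
    weights_gb: float                 # model weight storage
    storage_per_gb_month: float       # $ per GB per month
    output_millions_per_day: float    # millions of tokens sent out
    egress_per_million_out: float     # $ per million tokens out


def monthly_breakdown(s: SelfHostScenario, days: int = 30) -> dict:
    """Return each cost line plus a 'total' that includes all of them."""
    idle_hours = max(s.billed_hours_per_day - s.busy_hours_per_day, 0.0)
    lines = {
        "compute_busy": s.gpu_hourly * s.busy_hours_per_day * days,
        "compute_idle": s.gpu_hourly * idle_hours * days,
        "cold_start": s.cold_start_fee * s.cold_starts_per_day * days,
        "storage": s.weights_gb * s.storage_per_gb_month,
        "egress": s.egress_per_million_out * s.output_millions_per_day * days,
    }
    lines["total"] = sum(lines.values())
    return lines


if __name__ == "__main__":
    # Placeholder scenario: always-on GPU that is busy 6 hours a day.
    demo = SelfHostScenario(2.0, 24.0, 6.0, 0.0, 0.0, 140.0, 0.10, 5.0, 0.01)
    for name, usd in monthly_breakdown(demo).items():
        print(f"{name:>12}: ${usd:,.2f}")
```

In this made-up scenario the idle line is three times the busy line, which is exactly the kind of cost a quick spreadsheet tends to leave out.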

04 · feature

Live GPU rental price feed

Daily refresh of RunPod, Vast.ai, and Lambda spot and on-demand prices. Filter by GPU model, region, vRAM, interruptible vs reserved. The cheapest H100-SXM hour today is the one we plug into your calculation.
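Picking the price to plug in is a filter followed by a minimum. A small sketch of that step, assuming the feed has already been fetched into a list. The `Offer` shape, the vendor labels, and the prices are all invented for illustration and do not reflect any vendor's API or real rates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    vendor: str
    gpu: str
    region: str
    vram_gb: int
    interruptible: bool
    hourly_usd: float


def cheapest_offer(offers, gpu: str, min_vram_gb: int = 0,
                   region: Optional[str] = None,
                   allow_interruptible: bool = True) -> Optional[Offer]:
    """Apply the filters, then return the lowest hourly price (None if no match)."""
    matches = [
        o for o in offers
        if o.gpu == gpu
        and o.vram_gb >= min_vram_gb
        and (region is None or o.region == region)
        and (allow_interruptible or not o.interruptible)
    ]
    return min(matches, key=lambda o: o.hourly_usd, default=None)


# Placeholder data, not real listings.
SAMPLE_OFFERS = [
    Offer("vendor-a", "H100-SXM", "us-east", 80, True, 1.90),
    Offer("vendor-b", "H100-SXM", "us-east", 80, False, 2.40),
    Offer("vendor-c", "H100-SXM", "eu-west", 80, False, 2.10),
    Offer("vendor-a", "A100", "us-east", 40, False, 1.20),
]

if __name__ == "__main__":
    print(cheapest_offer(SAMPLE_OFFERS, "H100-SXM"))
    print(cheapest_offer(SAMPLE_OFFERS, "H100-SXM", allow_interruptible=False))
```

Excluding interruptible capacity changes which offer wins, so the spot vs on-demand filter feeds straight into the breakeven number.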

Why paid

Today's tools don't do this.

A wrong breakeven call ships a $1,500 monthly GPU bill that should have stayed a $400 API bill, or the reverse. Six months of overpay before someone redoes the spreadsheet. Engineering time to set up vLLM is two days at minimum. Picking the wrong GPU SKU adds a week.

✗ Compares hosted API vendors against each other. Self-host on a rented GPU is not in the model. The whole question of whether self-hosting beats the API is outside the scope.
No charge
✗ Tracks per-token API prices across hosted providers. No GPU rental data, no batching math, no breakeven view. Useful for picking a vendor, useless for the build vs buy call.
No charge
✗ Lists 30 GPU SKUs and an hourly rate. No throughput per model, no batching, no comparison to hosted APIs. The math from $/hour to $/million tokens is left to you.
Per-hour rental
DIY GPU spreadsheet
✗ Falls behind the moment a vendor changes pricing or a new model lands. Batching multipliers and idle costs are usually omitted because they are tedious. The number you defend in a meeting is already wrong.
Your weekend
LLM Self-host Cost
✓ Breakeven baked in, not optional · Batching multipliers from published benchmarks · Hidden costs in the headline number · Live GPU rental pricing across three vendors
$4.99
Price anchor
Compared to: DevOps consultant, 1 hr
Their price: $150–300 / hour
What you get: The same call a senior DevOps review would make on whether to self-host, for $4.99 once, rerun every time you re-evaluate. Or the $19/month team tier with saved scenarios and weekly GPU price reports.
✓ What you get now

Inputs for model, daily token volume, batch size, and concurrency. Side-by-side monthly cost on OpenAI, Anthropic, Together, Fireworks, and vLLM on the cheapest matching GPU rental from RunPod, Vast.ai, Lambda. Breakeven volume printed at the top. Hidden costs (idle, cold start, storage, egress) rolled into every line.

↗ What's coming

Alerts when a saved scenario crosses the breakeven line. Custom batch size simulator. SGLang and TensorRT-LLM throughput modes added. Team tier with shared scenarios and weekly GPU price digest. $19/month team tier locks in current pricing.

· What's not included

GPU provisioning, vLLM deployment scripts, model weight downloads, fine-tuning cost modeling, RAG pipeline costs. We answer the cost question. You ship the stack.

Pricing

Which price would get you on the waitlist?

No charge today. The click tells me which tier is real demand. Early access price ≠ launch price.

Join the waitlist

Want this built?

Drop your email. No charge, no spam. You're saying "yes, I'd actually use this." That's the signal I'm looking for.

Hi, I'm Hyunyoung.

Solo builder · Choppy Toast

This page is a quick vibe-coded probe to test demand and gather feature requests. The actual product, when it ships, will be a polished, hand-built tool, not this scaffold.

For a sense of what "polished and shipped" looks like, here's another product I built: relly.permissionlabs.com.

FAQ

Honest answers.

Then I wait. The waitlist stays live, no expiry, no archive. The faster you tell one dev who is staring at an OpenAI bill this month, the faster it ships. That is the whole engine.

More

More upcoming tools.

🧩

Extension Listing Kit

SaaS

Upload each screenshot, promo tile, and description once. We resize and crop every image to the exact spec the Chrome Web Store, Firefox Add-ons, and Edge Add-ons each ask for, hold one copy of your name, summary, and description per store and per language with the character limit checked as you type, and run a pre-submit pass that names every missing or out-of-spec asset before you hit publish.

Waitlist 0% · Waitlist open
🎬

Video Bloat Audit

Web tool

Paste a URL. We find every autoplay and background video on the rendered page, show the real transfer weight and how much each one adds to your largest paint, estimate the monthly bandwidth bill at your actual traffic, and hand you a ranked plan to swap each clip for a poster, a lazy load, a smaller encode, or a scripted animation.

Waitlist 0% · Waitlist open
📉

Decompound Calc

Web tool

See exactly when your retirement money runs out.

Waitlist 0% · Waitlist open
💸

AI Stack Cost

SaaS dashboard

One dashboard for every AI subscription you forgot about.

Waitlist 0% · Waitlist open
🔍

Vibe Check

Web tool

Find every AI-generated tell on your landing page.

Waitlist 0% · Waitlist open
🏦

Account Stack 2026

Web tool

Tell us your income. We tell you exactly where to put each dollar across 401k, Roth IRA, HSA, and backdoor.

Waitlist 0% · Waitlist open
🛡️

Ad Precheck

Web tool

Paste your ad copy, image, and landing URL. Get a per-platform rejection score for Meta, Google Ads, and AdSense before you submit.

Waitlist 0% · Waitlist open
🔬

Silent AI Audit

Mac app

Find every AI model silently installed on your Mac. See the size, last access, and how to remove each one.

Waitlist 0% · Waitlist open
⚡

EV Power Bill

Web tool

See your real EV charging bill before you buy the car.

Waitlist 0% · Waitlist open
🪤

Before You Install

Web tool

Paste a package. See the supply-chain risk before you run install.

Waitlist 0% · Waitlist open
🍎

App Store Precheck

Web tool

Paste your app metadata, screenshots, and Info.plist. Get a per-guideline rejection score before you hit Submit for Review.

Waitlist 0% · Waitlist open
🏛️

UK Stamp Duty Surcharge

Web tool

Stack the non-resident surcharge, additional property, and first-time buyer relief in one calculator. See your real SDLT before you exchange.

Waitlist 0% · Waitlist open
🛡️

Vibecode Audit

Web tool

Find the 12 security holes Cursor, Lovable, v0, and Bolt leave open by default. Paste your URL, get a report your investor will not flag.

Waitlist 0% · Waitlist open
🌶️

Spice Graveyard

iOS app

Scan the spice rack once. Get told what to cook tonight with the bottles you already own, and stop buying duplicates.

Waitlist 0% · Waitlist open
🏠

Home Addition Cost

Web tool

Paste a contractor bid for your home addition. Get a line-by-line read on what is fair, what is padded, and what scope cut drops the total without losing the room.

Waitlist 0% · Waitlist open
💊

Accutane Tracker

iOS app

Log your Accutane course like the dermatologist would. Daily lip dryness, side-effect severity, dose ladder, blood-draw reminders, and a photo timeline that shows where week 12 actually got you.

Waitlist 0% · Waitlist open
🧴

PIH Fade Plan

Web tool

Tell us your skin type, breakout history, and how much post-acne brown is left. Get a 16-week active routine that switches between azelaic, niacinamide, vitamin C, and tretinoin based on how your skin actually reacts in week 2, 4, 8, and 12.

Waitlist 0% · Waitlist open
🧹

Tailwind Exit Plan

Web tool

Paste a Tailwind component or a repo URL. Get a structured CSS migration plan that pulls out reusable classes, scaffolds CSS modules with design tokens, and hands the team a 4-week refactor schedule with the file to open on Monday.

Waitlist 0% · Waitlist open
🖥️

VMware Exit Plan

Web tool

Paste your VMware renewal quote, host counts, and license SKUs. Get a per-hypervisor cost split across Proxmox, Hyper-V, OpenShift, and Nutanix, the migration hour estimate, and a 90-day cutover schedule that names the cluster to drain first.

Waitlist 0% · Waitlist open
👶

Child Investment Planner

Web tool

Type your kid's age, your monthly budget, and your tax bracket. Get a side-by-side projection for 529, UTMA, and Roth IRA at age 18, the contribution that fits your budget, and the one-page memo that names the account to open first.

Waitlist 0% · Waitlist open
🚪

RMM Escape Plan

Web tool

Paste your NinjaOne invoice, endpoint count, and add-on list. Get your real all-in per-endpoint cost, a dated cancellation letter that respects the 60-day notice clock, and a migration matrix scored to your size across Action1, Level.io, Endpoint Central, and Syncro.

Waitlist 0% · Waitlist open
🏦

Bank Freeze Exit Plan

Web tool

Type your monthly cash flow, your balance range, and your current business bank. Get a freeze-risk score for every provider, the morning-of runbook if your account locks, a named backup account at a second institution, and a one-page memo for your bookkeeper.

Waitlist 0% · Waitlist open
🔌

Host Lock-In Escape

Web tool

Paste your host, plan tier, and what you deploy. Get a lock-in risk score across Netlify, Vercel, Render, Fly.io, and Cloudflare Pages, a redeploy config built from your own env vars and redirects, and a one-page runbook for the morning your account gets suspended with the site still live.

Waitlist 0% · Waitlist open
📋

After-Death Money Checklist

Web tool

Tell us the state and the rough size of the estate. Get whether you can skip full probate, which bills you actually have to pay and which die with the person, the order to notify the banks and the agencies, and a one-page memo for the family.

Waitlist 0% · Waitlist open
🌡️

Heat Safety Planner

Web tool

Tell us who is going outside, what they are doing, and your zip code. Get a clear go, modify, or cancel call for today's heat, the work-rest and water schedule for those conditions, the early heat-illness signs to watch in that specific person, and the safe hours to move it to.

Waitlist 0% · Waitlist open

5 more on the waitlist and I build this.

No charge today. Drop your email, lock in the early-access price, and you hear first when it ships.

Email me directly

Built by a real person. No silent vaporware.