What you’ll build
A production AI backend that handles:- Chat completions for customer-facing AI assistants
- Structured output for data extraction and classification
- Image generation for content creation features
- Per-request billing that maps to your own pricing
Why open-source models for SaaS
| OpenAI / Anthropic | Runcrate (open-source models) | |
|---|---|---|
| Pricing | $3–15 per 1M output tokens | $0.20–2.00 per 1M output tokens |
| Vendor lock-in | Locked to one provider | Switch models freely |
| Data privacy | Data sent to third party | Open-source models, your choice |
| Rate limits | Strict per-org limits | 100 req/min default, higher on request |
| Model choice | 3–5 models | 140+ models across 8 categories |
Next.js API routes (Vercel AI SDK)
Chat endpoint for your product
Structured data extraction endpoint
Turn unstructured user input into structured data your app can store:Image generation endpoint
Let users generate images from your app:Python backend (FastAPI)
Content moderation middleware
Add a moderation layer before displaying AI-generated content:Cost estimation
At DeepSeek-V3 rates, a typical SaaS workload:| Use case | Tokens/request | Cost/1K requests |
|---|---|---|
| Short chat responses (200 tokens out) | ~300 total | ~$0.06 |
| Data extraction (100 tokens out) | ~200 total | ~$0.04 |
| Long-form content (1000 tokens out) | ~1200 total | ~$0.24 |
| Image generation | 1 image | ~$0.03/image |