Internal Document
Infrastructure Plan
SvelteKit on Fly.io, Neon Postgres, R2, Resend, and Sentry. Zero to scale for the long tail of motorcycle part compatibility.
1
Infrastructure Overview
Request path from user to data and every external integration.
All services communicate over HTTPS. Neon is accessed via the serverless driver (@neondatabase/serverless) from within Fly.io VMs only. No direct public DB access is permitted.
2
Hosting: Fly.io
Global Anycast edge, per-machine billing, sub-second deploys.
Launch VM
shared-cpu-1x
256 MB RAM. Free tier. Sufficient for early traffic.
Scale Trigger
>100 concurrent
Sustained load adds a 2nd machine automatically.
Auto-scaling
1 – 3 machines
Min 1 always-on, max 3 at peak. Cost caps cleanly.
Primary Region
iad (US East)
Closest to Neon default region. Low latency roundtrips.
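The scale trigger and machine limits above can be read as a simple clamp. The sketch below is a hypothetical model of that policy, not Fly.io's actual scheduler; the `ScalingPolicy` type, the `desiredMachines` helper, and the per-machine limit of 100 concurrent requests are assumptions derived from the ">100 concurrent" trigger.

```typescript
// Hypothetical model of the scaling policy described above. It only captures the
// arithmetic the plan implies; Fly.io's real behaviour is configured in fly.toml.
interface ScalingPolicy {
  minMachines: number;           // "Min 1 always-on"
  maxMachines: number;           // "max 3 at peak"
  concurrencyPerMachine: number; // assumed from the ">100 concurrent" trigger
}

const launchPolicy: ScalingPolicy = {
  minMachines: 1,
  maxMachines: 3,
  concurrencyPerMachine: 100,
};

// Number of machines the policy would want for a sustained concurrency level.
function desiredMachines(
  concurrentRequests: number,
  policy: ScalingPolicy = launchPolicy
): number {
  const needed = Math.ceil(concurrentRequests / policy.concurrencyPerMachine);
  return Math.min(policy.maxMachines, Math.max(policy.minMachines, needed));
}
```

Under these assumptions, 100 concurrent requests stay on one machine, 101 ask for a second, and anything above 200 is capped at three, which is what keeps the cost ceiling predictable.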
- Docker image: Node 20 Alpine, pnpm, @sveltejs/adapter-node
- Deploy command: fly deploy (rolling, zero-downtime by default)
- Rollback: fly deploy --image [previous-image-digest] (under 2 minutes)
- Static assets: Served by SvelteKit from Fly.io edge; consider R2 + CDN for media at scale
- SSL: Auto-provisioned Let's Encrypt certificate via Fly.io
- Secrets: All via fly secrets set KEY=VALUE; never committed to repo
Free tier machines are stopped when idle. Use the min_machines_running = 1 setting in fly.toml to prevent cold starts on production once traffic is consistent.
3
Database: Neon Postgres
Serverless Postgres with branch-per-PR and built-in PITR.
Launch Tier
Free
0.5 GB storage, 190 compute hours/mo. Enough for MVP.
Upgrade Timing
Month 3–4
Pro at $19/mo for more branching, more storage, and the longer 30-day PITR window.
Free PITR Window
7 days
30 days on Pro. Point-in-time recovery for any incident.
ORM / Migrations
Drizzle ORM
Timestamp-based migration files, type-safe schema.
- Connection pooling: Neon's built-in serverless driver; no PgBouncer needed
- Dev branches: 1 branch per feature, auto-created from main snapshot
- PR branches: Preview branch auto-created per pull request by GitHub Actions
- Backup strategy: Neon PITR (built-in). No additional backup service needed at launch.
- Direct connections: IP allowlist in Neon console; Fly.io IP ranges only
- Migration command: drizzle-kit migrate, run as part of the deploy step in CI/CD
4
CI/CD Pipeline
GitHub Actions — four stages from push to production.
1
Lint, Typecheck & Unit Tests
Trigger: any push to any branch / ~2 min
ESLint, Prettier check, svelte-check TypeScript validation, Vitest unit suite. Must pass before any deploy step begins.
2
Preview Deploy
Trigger: pull request opened or updated
Creates a Neon branch database from main, runs drizzle-kit migrate, deploys a named Fly.io preview app, posts preview URL as a PR comment.
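As an illustration of how the per-PR resources in this stage might be named, here is a small sketch. The `previewTargets` helper and the `pr-<number>` naming convention are assumptions for illustration; the plan does not specify the actual names. The `<app>.fly.dev` hostname is Fly.io's default app domain.

```typescript
// Hypothetical naming helper for per-PR preview resources. The "pr-<number>"
// convention is an assumption; only the app name "motopartpicker" comes from the plan.
interface PreviewTargets {
  neonBranch: string; // Neon branch database created from main
  flyApp: string;     // per-PR Fly.io preview app
  url: string;        // URL posted back as a PR comment
}

function previewTargets(prNumber: number, appPrefix = "motopartpicker"): PreviewTargets {
  // Reject anything that is not a positive whole number.
  if (Math.floor(prNumber) !== prNumber || prNumber <= 0) {
    throw new Error("Invalid pull request number: " + prNumber);
  }
  const flyApp = appPrefix + "-pr-" + prNumber;
  return {
    neonBranch: "preview/pr-" + prNumber,
    flyApp: flyApp,
    url: "https://" + flyApp + ".fly.dev",
  };
}
```

Deriving every name from the PR number keeps the Neon branch, the Fly.io app, and the posted URL in step, and makes cleanup on PR close a matter of recomputing the same names.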
3
Production Deploy
Trigger: merge to main
Runs migrations against the Neon main branch, then runs fly deploy as a rolling, zero-downtime deploy. The previous image digest is stored as an artifact for rollback.
4
Release & E2E Tests
Trigger: git tag (e.g. v1.2.0)
Creates a GitHub Release, runs Playwright E2E suite against production URL, posts test report to release notes. Notifies Slack on failure.
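The four triggers above amount to a small decision table. The sketch below models it in TypeScript purely for illustration: real GitHub Actions triggers are declared in workflow YAML, and the `PipelineEvent` shape, the stage identifiers, and the assumption that checks run ahead of each deploy are hypothetical.

```typescript
// Simplified, hypothetical event shape; this only models which stages run for
// which trigger, following the four stages described above.
type PipelineEvent =
  | { kind: "push"; branch: string }
  | { kind: "pull_request"; action: "opened" | "synchronize" }
  | { kind: "tag"; name: string };

type Stage =
  | "lint-typecheck-test"
  | "preview-deploy"
  | "production-deploy"
  | "release-e2e";

function stagesFor(event: PipelineEvent): Stage[] {
  if (event.kind === "tag") {
    // Stage 4: release and E2E tests run on version tags such as v1.2.0.
    return ["release-e2e"];
  }
  if (event.kind === "pull_request") {
    // Stage 1 must pass before the Stage 2 preview deploy begins.
    return ["lint-typecheck-test", "preview-deploy"];
  }
  // Remaining case: a push to a branch. Only main continues to production.
  return event.branch === "main"
    ? ["lint-typecheck-test", "production-deploy"]
    : ["lint-typecheck-test"];
}
```

For example, a push to a feature branch runs only the checks, while a push to main runs the checks followed by the production deploy.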
Rollback in under 2 minutes: fly deploy --image registry.fly.io/motopartpicker:[previous-digest]. Store the image digest in a GitHub Actions artifact after every production deploy.
5
Environment Management
Local, preview, and production — each isolated with its own DB branch.
Local
- Docker Compose with Postgres 16 (Neon-compatible)
- .env.local, never committed to git
- Run with pnpm dev + docker compose up
- Drizzle Studio for DB inspection
Preview
- Fly.io preview app per PR (auto-created)
- Neon branch DB per PR (auto-created)
- Secrets injected by GitHub Actions
- Preview URL posted to PR comment
Production
- Fly.io production app: main branch deploys
- Neon main branch database
- All secrets via fly secrets set
- Migrations gated by CI green status
Secrets rule: never store credentials in .env files that are committed. Never use a .env.production file on disk. Production secrets live only in Fly.io's encrypted secret store.
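Because secrets arrive only as environment variables, one way to catch a missing secret early is to validate them at startup. The following is a minimal sketch under stated assumptions: DATABASE_URL appears in this plan, while SENTRY_DSN, RESEND_API_KEY, and the helper names are hypothetical.

```typescript
// Sketch: fail fast at startup if a required secret is missing.
// DATABASE_URL comes from this plan; SENTRY_DSN and RESEND_API_KEY are assumed names.
type Env = { [key: string]: string | undefined };

interface AppSecrets {
  databaseUrl: string;
  sentryDsn: string;
  resendApiKey: string;
}

// Returns the value or throws a descriptive error naming the missing key.
function requireSecret(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error("Missing required secret: " + name);
  }
  return value;
}

function loadSecrets(env: Env): AppSecrets {
  return {
    databaseUrl: requireSecret(env, "DATABASE_URL"),
    sentryDsn: requireSecret(env, "SENTRY_DSN"),
    resendApiKey: requireSecret(env, "RESEND_API_KEY"),
  };
}
```

In the app this would typically be called once with `process.env`, which is populated from .env.local during development and from fly secrets set in production. Passing the environment in as a parameter keeps the function easy to test.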
6
Monitoring & Observability
Errors, uptime, performance, and custom business metrics.
Sentry
Free — 5K events/mo
Error tracking and performance monitoring. Alerts routed to Slack. Source maps uploaded during deploy for readable stack traces.
UptimeRobot
Free forever
Ping every 5 minutes. Alert via email and Slack on downtime. Public status page available for incident communication.
Fly.io Metrics
Included
CPU utilization, RAM, request count, and latency histograms. Built-in dashboard. No setup required.
Neon Console
Included
Query performance insights, active connection count, storage usage. Identify slow queries before they reach users.
Custom Metrics
Postgres
Affiliate click-through rate, part verification rate, and search success rate stored in Postgres, displayed in the admin dashboard.
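The three custom metrics are all simple ratios over counts that can be queried from Postgres. The plan names the metrics but not their exact definitions, so the numerators, denominators, and field names in this sketch are assumptions.

```typescript
// Hypothetical counters, e.g. produced by COUNT(*) queries for a reporting period.
interface MetricCounts {
  affiliateClicks: number;
  partPageViews: number;
  verifiedParts: number;
  totalParts: number;
  searchesWithResultClick: number;
  totalSearches: number;
}

// Ratio in [0, 1]; returns 0 when there is no data yet to avoid dividing by zero.
function rate(numerator: number, denominator: number): number {
  return denominator > 0 ? numerator / denominator : 0;
}

function businessMetrics(c: MetricCounts) {
  return {
    affiliateClickThroughRate: rate(c.affiliateClicks, c.partPageViews),
    partVerificationRate: rate(c.verifiedParts, c.totalParts),
    searchSuccessRate: rate(c.searchesWithResultClick, c.totalSearches),
  };
}
```

Keeping the raw counts in Postgres and deriving the rates at read time means the definitions can change later without rewriting stored data.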
Alert Routing
- Sentry errors: New issue or regression → Slack #alerts-errors
- UptimeRobot: Downtime detected → email (immediate) + Slack #alerts-infra
- Fly.io: Machine crash or OOM → email from Fly.io dashboard alerts
- Neon: Storage > 80% → email from Neon console alert threshold
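The routing rules above can also be kept as data, for example to render them in the admin dashboard or to check them in a test. This is only a sketch: the `AlertSource` keys and the `slack:#channel` notation are assumptions, and the actual routing is configured in each vendor's own console.

```typescript
type AlertSource = "sentry" | "uptimerobot" | "fly" | "neon";

interface Route {
  trigger: string;    // condition that raises the alert
  channels: string[]; // where the alert is delivered
}

// Mirrors the alert routing list; the "slack:" prefix is an illustrative convention.
const alertRoutes: { [S in AlertSource]: Route } = {
  sentry: { trigger: "New issue or regression", channels: ["slack:#alerts-errors"] },
  uptimerobot: { trigger: "Downtime detected", channels: ["email", "slack:#alerts-infra"] },
  fly: { trigger: "Machine crash or OOM", channels: ["email"] },
  neon: { trigger: "Storage > 80%", channels: ["email"] },
};

function channelsFor(source: AlertSource): string[] {
  return alertRoutes[source].channels;
}
```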
7
Cost Projections
$0 to launch. Under $175/mo at full Year 2 scale.
Launch (Mo 1–3)
~$1/mo
Domain only. All services on free tier.
Growth (Mo 4–12)
$45–75/mo
Neon Pro kicks in, Fly.io paid, email volume scales.
Scale (Year 2)
$150–170/mo
Full stack operational. Easily covered by affiliate revenue.
| Service | Launch (Mo 1–3) | Growth (Mo 4–12) | Scale (Year 2) |
|---|---|---|---|
| Fly.io | $0 | $5–10/mo | $30–50/mo |
| Neon Postgres | $0 | $19/mo | $69/mo |
| Cloudflare R2 | $0 | $0–1/mo | $5/mo |
| Resend | $0 | $0–20/mo | $20/mo |
| Sentry | $0 | $0–26/mo | $26/mo |
| UptimeRobot | $0 | $0 | $0 |
| Domain (.com) | $12/year | $12/year | $12/year |
| TOTAL | ~$1/mo | $45–75/mo | $150–170/mo |
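As a cross-check on the TOTAL row, the per-service ranges can be summed directly. The sketch below copies the figures from the table and treats the $12/year domain as $1/month; the type and function names are illustrative only.

```typescript
// Monthly cost ranges in USD, copied from the table above.
// The domain is $12/year, treated here as $1/month.
type Range = [number, number]; // [low, high]
type Phase = "launch" | "growth" | "scale";

interface ServiceCost {
  service: string;
  launch: Range;
  growth: Range;
  scale: Range;
}

const costs: ServiceCost[] = [
  { service: "Fly.io",        launch: [0, 0], growth: [5, 10],  scale: [30, 50] },
  { service: "Neon Postgres", launch: [0, 0], growth: [19, 19], scale: [69, 69] },
  { service: "Cloudflare R2", launch: [0, 0], growth: [0, 1],   scale: [5, 5] },
  { service: "Resend",        launch: [0, 0], growth: [0, 20],  scale: [20, 20] },
  { service: "Sentry",        launch: [0, 0], growth: [0, 26],  scale: [26, 26] },
  { service: "UptimeRobot",   launch: [0, 0], growth: [0, 0],   scale: [0, 0] },
  { service: "Domain (.com)", launch: [1, 1], growth: [1, 1],   scale: [1, 1] },
];

// Sums the low and high ends of every service's range for one phase.
function totalFor(phase: Phase): Range {
  let low = 0;
  let high = 0;
  for (const c of costs) {
    low += c[phase][0];
    high += c[phase][1];
  }
  return [low, high];
}
```

With these inputs, `totalFor("launch")` is [1, 1] and `totalFor("scale")` is [151, 171], in line with the ~$1/mo and $150–170/mo figures. `totalFor("growth")` gives [25, 77], a wider band than the $45–75/mo planning range, which presumably reflects paid tiers phasing in during the period rather than every service sitting at its minimum at once.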
8
Security Infrastructure
Defense in depth from the edge to the database.
SSL / TLS
Fly.io auto-provisions Let's Encrypt certificates. HTTPS enforced. HSTS header with 1-year max-age.
DDoS Protection
Fly.io Anycast routing distributes traffic globally. Basic L3/L4 DDoS mitigation included on all plans.
Security Headers
CSP, HSTS, X-Content-Type-Options: nosniff, X-Frame-Options: DENY. Set in SvelteKit hooks.
Database Access
Neon IP allowlist permits only Fly.io egress IPs for direct connections. Application connects via connection string only.
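The security headers listed above could be applied from a SvelteKit server hook roughly as follows. This is a sketch, not the project's actual hook: the CSP value is a placeholder assumption, and `HeaderSink` and `applySecurityHeaders` are hypothetical names. The helper accepts anything with a `set` method, which includes the standard `Headers` object on a `Response`.

```typescript
// Minimal interface satisfied by the standard Fetch API Headers object.
interface HeaderSink {
  set(name: string, value: string): void;
}

// HSTS with the 1-year max-age stated above (365 * 24 * 60 * 60 seconds).
const ONE_YEAR_SECONDS = 31536000;

// The CSP value is a placeholder assumption; a real policy depends on the app's assets.
const SECURITY_HEADERS: [string, string][] = [
  ["Content-Security-Policy", "default-src 'self'"],
  ["Strict-Transport-Security", "max-age=" + ONE_YEAR_SECONDS],
  ["X-Content-Type-Options", "nosniff"],
  ["X-Frame-Options", "DENY"],
];

function applySecurityHeaders(headers: HeaderSink): void {
  for (const [name, value] of SECURITY_HEADERS) {
    headers.set(name, value);
  }
}

// In src/hooks.server.ts this would be called from the handle hook, roughly:
//   const response = await resolve(event);
//   applySecurityHeaders(response.headers);
//   return response;
```

Keeping the header list in one place makes it straightforward to assert in a unit test that every response carries the full set.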
Deploy Access
GitHub team-level permissions gate CI/CD. Fly.io RBAC controls who can run fly deploy or read secrets.
9
Disaster Recovery
RTO 15 minutes. RPO 1 hour. Runbook for every failure mode.
RTO
15 min
Recovery Time Objective
RPO
1 hour
Recovery Point Objective
PITR Window
7 days
Free / 30 days on Pro
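Before starting a point-in-time restore it helps to confirm that the target timestamp is still inside the retention window. The sketch below uses the 7-day and 30-day windows stated in this plan; the `canRestoreTo` helper and the tier names are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Retention windows as stated in this plan (days of history available for restore).
const PITR_WINDOW_DAYS = { free: 7, pro: 30 };

type Tier = "free" | "pro";

// True if `target` is not in the future and still inside the tier's restore window.
function canRestoreTo(target: Date, now: Date, tier: Tier): boolean {
  const ageMs = now.getTime() - target.getTime();
  return ageMs >= 0 && ageMs <= PITR_WINDOW_DAYS[tier] * DAY_MS;
}
```

For example, an incident discovered 10 days after the bad write would be recoverable under the Pro window but not under the free one, which is one argument for upgrading before the data becomes hard to rebuild.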
Incident Runbook
DB Down
Check Neon status page (neon.tech/status). If Neon outage, activate read-only mode (serve cached data). If data corruption, restore from PITR via Neon Console: select timestamp, clone branch, update DATABASE_URL secret.
App Down
Check fly status. If bad deploy, run fly deploy --image registry.fly.io/motopartpicker:[previous-digest]. Should be live within 2 minutes. If Fly.io outage, check flyio.statuspage.io.
External API Down
Graceful degradation: show cached affiliate prices with stale timestamp. Sentry alert will fire on elevated error rate. RevZilla and Rocky Mountain ATV APIs degrade independently — part data still searchable without live pricing.
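The degradation rule above comes down to a three-way decision: show the live price, fall back to a cached price marked as stale, or show the part without pricing. The sketch below isolates that decision as a pure function; the retailer API responses are not specified in this plan, so all type and field names are hypothetical.

```typescript
// Hypothetical shapes for a cached price and the outcome of a live API call.
interface CachedPrice {
  amountCents: number;
  fetchedAt: Date;
}

type LiveResult =
  | { status: "ok"; amountCents: number }
  | { status: "error" };

interface PriceDisplay {
  amountCents: number | null; // null: no price available, part data is still shown
  stale: boolean;             // true when a cached price is being shown
  asOf: Date | null;          // timestamp to display next to the price
}

function resolvePrice(
  live: LiveResult,
  cached: CachedPrice | undefined,
  now: Date
): PriceDisplay {
  if (live.status === "ok") {
    return { amountCents: live.amountCents, stale: false, asOf: now };
  }
  if (cached !== undefined) {
    // Live call failed: show the cached price with its original timestamp.
    return { amountCents: cached.amountCents, stale: true, asOf: cached.fetchedAt };
  }
  // No live price and nothing cached: the part stays searchable without pricing.
  return { amountCents: null, stale: false, asOf: null };
}
```

Calling this once per retailer keeps the two integrations independent, so an outage at one does not hide prices from the other.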
Data Corruption
Identify the approximate timestamp of corruption from Sentry traces. In Neon Console: Branch → Restore to Point in Time. Create restore branch, validate data, swap DATABASE_URL secret in Fly.io, redeploy.
10
Launch Checklist
Every box checked before going public.
Infrastructure
Monitoring
Integrations
Security & SEO