Technical Architecture Document

MotoPartPicker — System Architecture

Full-stack architecture for a motorcycle parts compatibility platform. Designed for simplicity at launch, with clear scale checkpoints to 300K MAU. Every technology choice is optimized for a two-person team.

Stack: SvelteKit · Neon · Fly.io
Auth: BetterAuth · OAuth
Billing: Stripe Webhooks
Storage / Email: R2 · Resend
Scale Target: 300K MAU (Year 3)
Version: v1.0 · April 2026
00 · Architecture Principles

Every technical decision in this document flows from five governing principles. When tradeoffs arise, these serve as the tiebreaker — in this order.

SOLID at service level
Separation of concerns
Defense in depth
12-Factor App
YAGNI, designed to scale

SOLID at Service Level

Each SvelteKit route module has a single responsibility. Business logic lives in service files, not load functions. Interfaces are preferred over direct implementation calls.
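A minimal sketch of the pattern, under assumed names (`CompatibilityRepo` and `listCompatibleParts` are illustrations, not the real modules): the business rule lives in a plain function behind an interface, so it can be unit-tested without SvelteKit or a database.

```typescript
// Hypothetical service-layer sketch. The route's load function would call
// listCompatibleParts; it never touches the database driver directly.
export interface CompatibilityRepo {
  findByBike(bikeId: string): Promise<{ partId: string; status: string }[]>;
}

export async function listCompatibleParts(
  repo: CompatibilityRepo,
  bikeId: string
): Promise<string[]> {
  const records = await repo.findByBike(bikeId);
  // Business rule: only verified or community-confirmed fits are shown.
  return records
    .filter((r) => r.status === "verified" || r.status === "community")
    .map((r) => r.partId);
}
```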

Separation of Concerns

Presentation, business logic, and data access are never mixed. SSR handles public data; client-side handles interactivity. API routes are the only data boundary.

Defense in Depth

Auth is enforced at the route level, the service level, and the database level. No single layer is trusted alone. Secrets never touch source code.

12-Factor App

Config via environment variables. Stateless processes. Port binding. Dev/prod parity. Logs as event streams via structured pino output to stdout.
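One way the "config via environment variables" rule can be enforced is a single boot-time check, so a missing secret fails fast instead of surfacing mid-request. A hedged sketch — the variable names here are assumptions, not the project's actual list:

```typescript
// Validate required env vars once at startup (12-factor: config in the
// environment, never in source). Names below are illustrative.
const REQUIRED = ["DATABASE_URL", "STRIPE_WEBHOOK_SECRET", "RESEND_API_KEY"] as const;

export function loadConfig(env: Record<string, string | undefined>) {
  const missing = REQUIRED.filter((k) => !env[k]);
  if (missing.length > 0) {
    // Fail fast at boot rather than erroring on the first request.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return {
    databaseUrl: env.DATABASE_URL!,
    stripeWebhookSecret: env.STRIPE_WEBHOOK_SECRET!,
    resendApiKey: env.RESEND_API_KEY!,
  };
}
```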

YAGNI, Designed to Scale

No Redis, no Elasticsearch, no microservices at launch. Postgres handles search, queuing, and sessions. Scale gates are documented and infrastructure-tested.

01 · System Overview

MotoPartPicker is a single SvelteKit application deployed on Fly.io that handles both SSR and API routing. All persistent state lives in Neon Postgres. External services (auth, billing, storage, email) are integrated via official SDKs and webhooks.

Core tables: bikes, parts, compatibility_records, verifications, users, builds, retailers, part_prices, affiliate_clicks, retailer_subscriptions.
02 · Frontend Architecture

SvelteKit provides SSR and client-side hydration in a single framework. SSR is the default for all public pages — critical because "[bike] [part] compatible" queries are high-volume organic search traffic. Interactive features (build planner, part filters, comparison) hydrate on the client after initial load.

SEO is the primary growth channel. Every public bike/part page is fully rendered HTML on first request. No client-side-only rendering for indexable content. Server load functions handle all data fetching before the response is sent.
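A sketch of what such a server load function could look like for a bike page. The service calls (`getBike`, `getCompatibleParts`) are stand-ins for the real service layer, stubbed here so the shape is self-contained:

```typescript
// Hypothetical src/routes/bikes/[bikeId]/+page.server.ts shape.
// SvelteKit runs this on the server before rendering, so the
// compatibility page ships as complete HTML on first request.
type Bike = { id: string; name: string };

async function getBike(id: string): Promise<Bike> {
  return { id, name: "placeholder" }; // real version queries Postgres
}
async function getCompatibleParts(bikeId: string): Promise<string[]> {
  return []; // real version joins compatibility_records
}

export async function load({ params }: { params: { bikeId: string } }) {
  // Fetch in parallel; everything resolves before the response is sent.
  const [bike, parts] = await Promise.all([
    getBike(params.bikeId),
    getCompatibleParts(params.bikeId),
  ]);
  return { bike, parts };
}
```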

Rendering Strategy

SSR — SEO Pages

Bike listing, part detail, compatibility result, build showcase, marketing landing. All rendered server-side with server load functions.

CSR — Interactive

Build planner, live part filters, comparison tool, user dashboard. Hydrated after SSR shell; state managed in Svelte stores.

Feature-Sliced Route Structure

```
// src/routes/ — feature-sliced by domain
routes/
├── (marketing)/            — landing, about, pricing
├── bikes/
│   ├── +page.svelte        — bike selector (year/make/model)
│   └── [bikeId]/
│       └── +page.svelte    — compatible parts for bike
├── parts/
│   └── [partId]/
│       └── +page.svelte    — part detail + prices
├── builds/
│   ├── +page.svelte        — user build list (CSR)
│   └── [buildId]/
│       └── +page.svelte    — build detail (SSR for public)
└── api/                    — all +server.ts endpoints
```

Performance Budget

• LCP (Largest Contentful Paint): <2s
• FID (First Input Delay): <100ms
• CLS (Cumulative Layout Shift): <0.1
03 · Backend Architecture

The backend is a set of SvelteKit +server.ts API routes. Auth is enforced via BetterAuth middleware. Rate limiting uses an in-memory sliding window at launch, graduating to a Postgres-backed counter at 25K MAU.

Auth strategy: BetterAuth with session cookies. Google and GitHub OAuth. Sessions stored in Postgres via BetterAuth's session adapter. No JWTs in local storage. Rate limits: 100 req/min unauthenticated, 300 req/min authenticated.
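The in-memory sliding window can be sketched in a few lines — per-key timestamps pruned on each check. This is an illustration of the approach, not the production module; the limits are the document's (100/min unauthenticated, 300/min authenticated):

```typescript
// Minimal in-memory sliding-window rate limiter, keyed by IP or user id.
const WINDOW_MS = 60_000;
const hits = new Map<string, number[]>();

export function allowRequest(key: string, limit: number, now = Date.now()): boolean {
  // Drop timestamps that have slid out of the window.
  const recent = (hits.get(key) ?? []).filter((t) => now - t < WINDOW_MS);
  if (recent.length >= limit) {
    hits.set(key, recent);
    return false; // caller responds 429
  }
  recent.push(now);
  hits.set(key, recent);
  return true;
}
```

Because the state is a plain `Map`, it resets on deploy and is not shared across machines — which is exactly why the document graduates to a Postgres-backed counter at 25K MAU.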

API Surface

Method · Path · Auth · Description
GET · /api/bikes · none · List bikes, filterable by year/make/model
GET · /api/bikes/[id]/parts · none · Compatible parts for a specific bike
GET · /api/parts/[id] · none · Part detail including prices across retailers
GET · /api/parts/[id]/verifications · none · Community verifications for a part + bike combo
POST · /api/verifications · user · Submit a fit verification for a part on a bike
GET · /api/builds · user · List the authenticated user's builds
POST · /api/builds · user · Create a new build for the authenticated user
PUT · /api/builds/[id] · owner · Update a build (ownership verified server-side)
GET · /api/prices/[partId] · none · Current prices across all retailers for a part
POST · /api/affiliate/click · none · Record an affiliate click for attribution tracking
POST · /api/webhooks/stripe · stripe · Handle billing events: invoice.paid, subscription.updated, etc.
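The "owner" auth level on PUT /api/builds/[id] can be made explicit as a small guard. A sketch with assumed types (the real session and build shapes come from BetterAuth and the schema); returning 404 rather than 403 for a foreign build avoids confirming which build ids exist:

```typescript
// Hypothetical ownership guard for PUT /api/builds/[id].
type Session = { userId: string } | null;
type Build = { id: string; ownerId: string };

export function authorizeBuildUpdate(session: Session, build: Build | undefined): number {
  if (!session) return 401;                          // no valid session cookie
  if (!build) return 404;                            // build does not exist
  if (build.ownerId !== session.userId) return 404;  // hide existence from non-owners
  return 200;                                        // proceed with the update
}
```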
04 · Data Architecture

Everything lives in Postgres. No secondary data stores at launch. Neon's serverless driver handles connection pooling transparently — no need for PgBouncer or a separate pooler.

Neon Serverless Driver

HTTP-based Postgres connection that works in edge runtimes. Built-in connection pooling eliminates the need for PgBouncer at this scale.

Full-Text Search via tsvector

Parts search uses Postgres tsvector + GIN index. No Elasticsearch needed at 300K MAU given read-heavy, low-write search patterns.
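A sketch of the search query this implies. The `to_tsvector` expression must match the GIN index expression exactly for Postgres to use the index; `websearch_to_tsquery` is a Postgres built-in that parses user-style queries safely. The helper below just builds a parameterized statement — the actual driver call is left out:

```typescript
// Build the parts search as a parameterized query (user input never
// concatenated into SQL). Expression mirrors the GIN index definition.
export function partsSearchQuery(term: string): { text: string; values: string[] } {
  return {
    text: `SELECT id, name
           FROM parts
           WHERE to_tsvector('english', name || ' ' || coalesce(description, ''))
                 @@ websearch_to_tsquery('english', $1)
           LIMIT 20`,
    values: [term],
  };
}
```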

Partitioned Affiliate Clicks

The affiliate_clicks table is range-partitioned by month. Old partitions can be archived without downtime.

Neon Branching

Each dev environment and preview deploy gets a Neon branch: instant copy-on-write clone of production schema. Zero cost for idle branches.

```sql
-- Core compatibility relationship
CREATE TABLE compatibility_records (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  bike_id uuid REFERENCES bikes(id),
  part_id uuid REFERENCES parts(id),
  status text NOT NULL CHECK (status IN ('verified', 'community', 'no_data')),
  created_at timestamptz DEFAULT now(),
  UNIQUE (bike_id, part_id)
);

-- Full-text search index on parts
CREATE INDEX parts_search_idx ON parts
  USING gin (to_tsvector('english', name || ' ' || coalesce(description, '')));

-- Affiliate clicks: partitioned by month
CREATE TABLE affiliate_clicks (
  id bigserial,
  part_id uuid,
  retailer_id uuid,
  clicked_at timestamptz DEFAULT now(),
  user_id uuid  -- nullable: anonymous clicks are tracked too
) PARTITION BY RANGE (clicked_at);
```
05 · Integration Architecture

All external integrations are treated as unreliable. Every outbound call uses exponential backoff with a dead-letter audit log for permanent failures.

RevZilla / Amazon

Hourly price scrape via pg-boss background job. Results written to part_prices. Stale prices (>24h) marked as such in the UI.

Stripe Webhooks

Events: invoice.paid, subscription.updated, subscription.deleted. Signature verified with STRIPE_WEBHOOK_SECRET. Idempotent processing via event ID.
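The idempotency step can be isolated from the Stripe SDK entirely. Signature verification itself is done with Stripe's library (`stripe.webhooks.constructEvent`); the sketch below shows only the event-id dedup, with an injected `Set` standing in for what would be a unique-indexed processed_events table in Postgres:

```typescript
// Process a webhook event at most once, keyed by Stripe's event id.
// `seen` is a stand-in for durable storage with a unique constraint.
export async function processOnce(
  eventId: string,
  seen: Set<string>,
  handler: () => Promise<void>
): Promise<boolean> {
  if (seen.has(eventId)) return false; // a retry from Stripe — skip silently
  await handler();
  seen.add(eventId);
  return true;
}
```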

Resend (Email)

Transactional only: fit verification confirmation, build share notifications, weekly digest (opt-in). Templates are React Email components compiled server-side.

Google / GitHub OAuth

Managed entirely by BetterAuth. Zero custom OAuth code. Callback URLs registered per environment. PKCE enforced.

Retry strategy: exponential backoff with dead-letter logging. 3 retries at 5s, 25s, 125s intervals. After the 3rd failure the job is moved to an integration_errors audit table and an alert fires. No silent failures.
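The 5s / 25s / 125s schedule is a base-5 exponential; a small helper makes the policy explicit rather than scattering magic numbers:

```typescript
// Retry delay for the document's backoff schedule:
// attempt 1 → 5_000 ms, attempt 2 → 25_000 ms, attempt 3 → 125_000 ms.
export function retryDelayMs(attempt: number): number {
  return 5_000 * 5 ** (attempt - 1);
}

// After the final retry, the job goes to integration_errors and alerts.
export const MAX_RETRIES = 3;
```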
06 · Background Processing

Background jobs run via pg-boss, a Postgres-based job queue using the SKIP LOCKED pattern. No Redis, no separate worker process — the same Fly.io machine that serves HTTP also processes jobs. At 100K MAU this becomes a dedicated machine.

  • price-update (hourly): scrape RevZilla + Amazon for part prices
  • send-email (on event): transactional notifications via Resend
  • data-quality-check (daily): flag stale prices, orphaned records, low-confidence verifications
  • sitemap-generation (daily): regenerate XML sitemap and ping Google Search Console
Failure handling. 3 retries with exponential backoff. On the 3rd failure pg-boss marks the job as failed, writes to the audit log, and triggers a Sentry alert. The price-update job is designed to be fully idempotent — rerunning never duplicates records.
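One way to keep the schedule auditable is a single job registry that drives both pg-boss scheduling (`boss.schedule(name, cron)`) and worker registration (`boss.work(name, handler)`). The cron strings below are assumptions matching the cadences listed above, not values from the source:

```typescript
// Hypothetical job registry; one table of truth for scheduled jobs.
export const JOBS = [
  { name: "price-update",       cron: "0 * * * *" },  // hourly, on the hour
  { name: "data-quality-check", cron: "0 3 * * *" },  // daily, 03:00 UTC
  { name: "sitemap-generation", cron: "30 3 * * *" }, // daily, 03:30 UTC
] as const;

// send-email is event-driven rather than scheduled: it is enqueued
// on demand (e.g. boss.send("send-email", payload)) by the app code.
```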
07 · Security Architecture

Security is layered: HTTPS at the edge, session auth at the route level, ownership checks at the service level, and column-level encryption at the database. No single layer is trusted alone.

Transport Security

HTTPS everywhere via Fly.io SSL termination. HSTS enforced. X-Frame-Options: DENY. X-Content-Type-Options: nosniff.

CSP Headers

Content Security Policy restricts script sources to self + Google Fonts. CORS: api.motopartpicker.com only. No wildcard origins.
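The headers above can be defined in one place (in SvelteKit, a `hooks.server.ts` handle would apply them to every response). A sketch — the exact CSP directive values are assumptions extrapolated from "self + Google Fonts", not the project's real policy:

```typescript
// Security headers applied to every response. Values are illustrative.
export function securityHeaders(): Record<string, string> {
  return {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains", // HSTS
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    // Scripts restricted to self; Google Fonts allowed for styles/fonts only.
    "Content-Security-Policy":
      "default-src 'self'; script-src 'self'; " +
      "style-src 'self' https://fonts.googleapis.com; " +
      "font-src 'self' https://fonts.gstatic.com",
  };
}
```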

PII Handling

users.email is encrypted at rest (Neon transparent encryption). display_name is public. No PII in logs.

FTC Compliance

Affiliate disclosure visible on every page containing purchase links. Disclosure text: "We earn a commission from purchases. This doesn't affect our compatibility data."

Secrets Management

All secrets stored in Fly.io secrets (fly secrets set). Exposed as environment variables. Never committed to source. Rotated on team member offboarding.

08 · Scaling Strategy

Three defined scale checkpoints. Each checkpoint has specific infrastructure triggers and a cost estimate. Nothing is provisioned until the trigger is hit.

• Launch (0–5K MAU). Trigger: initial deploy. Fly.io: 1 × shared-cpu-1x (256 MB). Neon: Free tier. Cache: SvelteKit in-memory (5 min TTL).
• Year 1 (5K–25K MAU). Trigger: p95 latency >400ms. Fly.io: 2 × shared-cpu-2x (512 MB). Neon: Pro (~$19/mo). Cache: SvelteKit + CDN for static assets.
• Year 2 (25K–100K MAU). Trigger: DB CPU >60% sustained. Fly.io: 3 × dedicated-cpu-1x (1 GB). Neon: Scale + read replica. Cache: CDN + server-side cache per route.
• Year 3 (100K–300K MAU). Trigger: multiple regions requested. Fly.io: multi-region machines. Neon: Scale (multi-region). Cache: Redis for hot data (re-evaluate).
Cache strategy: server-side first. SvelteKit's server load functions cache bike and parts data for 5 minutes using a simple Map-based LRU store. At Year 2, Cloudflare Pages / R2 CDN caches public HTML responses. Redis is deferred until Postgres read replica can no longer handle read traffic.
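The Map-based store with a 5-minute TTL fits in a few lines. A minimal sketch of the TTL half; a real LRU would also cap entry count and evict the oldest key, which is omitted here:

```typescript
// In-process TTL cache for bike/parts data in server load functions.
const TTL_MS = 5 * 60_000; // the document's 5-minute TTL
const cache = new Map<string, { value: unknown; expires: number }>();

export function cached<T>(key: string, compute: () => T, now = Date.now()): T {
  const hit = cache.get(key);
  if (hit && hit.expires > now) return hit.value as T; // fresh — serve cached
  const value = compute();                             // stale or missing — recompute
  cache.set(key, { value, expires: now + TTL_MS });
  return value;
}
```

Like the rate limiter, this state is per-machine and vanishes on deploy — acceptable for a cache, which is exactly why Redis is deferred.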
09 · Observability

Observability stack is minimal at launch: Sentry for errors, UptimeRobot for uptime checks, pino for structured logs. Metrics are event-driven, not time-series — Postgres query counts and pino log aggregation are sufficient to identify bottlenecks at Year 1 scale.

Sentry (Errors)

Error tracking with source maps. All unhandled exceptions in server load functions and API routes. Release tracking tied to Fly.io deployments.

UptimeRobot

60-second HTTP checks on /api/bikes and the homepage. PagerDuty integration. Alert after 2 consecutive failed checks.

Pino (Structured Logs)

JSON logs to stdout. Fields: level, msg, route, userId, duration_ms, status. No PII in log fields. Shipped to Fly.io log aggregation.

Key Metrics and Alert Thresholds

• api.latency.p95: API response time at 95th percentile. Alert: >500ms sustained for 5 min.
• api.error_rate: ratio of 5xx responses to total requests. Alert: >5% over any 10-min window.
• affiliate.click_through_rate: affiliate clicks per part detail page view. Monitor: drop >30% week-over-week.
• verification.submission_rate: verifications submitted per active user. Monitor: weekly cohort trend.
• uptime.availability: % of time the app is reachable. Alert: <99.5% in any 24h window.
• jobs.failure_rate: background job failure percentage. Alert: any job failing its 3rd retry.
10 · Architecture Decision Records

Five decisions that meaningfully shaped the architecture. Each is considered final for the Year 1 scope with explicit conditions that would trigger reconsideration.

ADR-001

SvelteKit over Next.js

Accepted

Decision: Use SvelteKit as the full-stack framework instead of Next.js or Remix.

Rationale: SvelteKit produces significantly smaller client bundles (no Virtual DOM overhead), has first-class SSR with minimal configuration, and ships one deployment target (Node adapter on Fly.io) with no edge-function complexity. For a 2-person team building a data-rich but interaction-light app, the simpler mental model and faster build times outweigh Next.js's ecosystem size. Reconsider if we hire React engineers who cannot ramp on Svelte.

ADR-002

Neon over Supabase

Accepted

Decision: Use Neon as the managed Postgres provider instead of Supabase or PlanetScale.

Rationale: Neon is pure Postgres — no custom extensions, no proprietary realtime layer, no RLS magic to reason about. Database branching enables true prod-parity dev and PR preview environments at zero cost. Supabase's Auth and Realtime features are compelling but add vendor lock-in we don't need (BetterAuth handles auth, no realtime requirement). Reconsider if pg-boss bottlenecks at high job throughput and we need a Redis-backed queue — at that point Supabase's full platform may make sense.

ADR-003

BetterAuth over Clerk

Accepted

Decision: Use BetterAuth (library) instead of Clerk or Auth0 (managed services).

Rationale: Auth is a core trust feature for a community-driven platform. Self-hosted auth means sessions live in our Postgres database, user data never touches a third-party auth vendor, and there is no per-MAU pricing cliff. BetterAuth is a library, not a service — it compiles into the SvelteKit app with zero cold-start overhead. Clerk's UI components are excellent but the vendor dependency and pricing model are incompatible with our bootstrapped constraints. Reconsider if compliance requirements (SOC 2, HIPAA) mandate a certified auth vendor.

ADR-004

Postgres Full-Text Search over Elasticsearch

Accepted

Decision: Implement parts search using Postgres tsvector + GIN index instead of Elasticsearch or Typesense.

Rationale: The parts catalog is read-heavy and write-sparse. Full-text search on a <500K row table is well within Postgres's capabilities with a GIN-indexed tsvector column. Eliminating a second data store removes an entire failure domain, infrastructure cost (~$50-150/mo), and operational complexity. Search quality at this scale (no ML ranking, no synonyms) does not require Elasticsearch. Reconsider at 1M+ parts records or if search quality scores fall below acceptable in user testing.

ADR-005

pg-boss over BullMQ / Redis

Accepted

Decision: Use pg-boss for background job processing instead of BullMQ (which requires Redis).

Rationale: Adding Redis introduces a second stateful service: separate deployment, separate connection string, separate failure domain, separate monitoring. pg-boss uses the SKIP LOCKED Postgres primitive to provide reliable at-least-once job delivery from the same database we already operate. For 4 job types running at hourly or daily frequency, this is entirely sufficient. Job throughput at launch is <100 jobs/hour — trivially within pg-boss's capacity. Reconsider if job volume exceeds 10,000/hour, which is a Year 3 problem at the earliest.