Pastebin

The same shape as URL shortener — small write API, hot read path — but the storage choice flips. Pastes are large blobs (10 KB to 2 MB), so the architecture moves from hot KV to object storage with edge caching. Walk this and you've seen both ends of the read-heavy spectrum, plus the retention and abuse-handling problems that come with user-generated content.

1 · Clarifying questions

What's the max paste size?	2 MB. Hard cap for free tier; 10 MB for paid. The cap shapes everything from the schema to the upload path.
How long do pastes live?	Configurable per paste — 10 min, 1 hour, 1 day, 1 month, 1 year, forever. Forever costs real money in long-tail storage.
Privacy model?	Public (indexed, listable), unlisted (link-only, not listable), private (auth required). Three different access patterns, three different code paths.
What's the read/write ratio?	~10:1 across all pastes; ~100:1 for the handful of viral pastes at the head of the distribution. Most pastes get one or two reads; a few get millions.
Syntax highlighting?	Client-side only (Prism / highlight.js). Never store rendered HTML — that doubles storage and bakes in security risks.
Abuse handling?	Required. Malware scanning, CSAM detection, takedown workflow, regulator notice. Skipping this is what fails the interview round.
Latency budget?	P99 read ≤ 200 ms (most served from edge). P99 create ≤ 500 ms (it's an upload).
Multi-region?	Yes for reads (anycast edge). Single-region writes initially.

2 · Capacity math, on a napkin

Number	Calculation	Result
Writes / day	given	10M
Reads / day	10× writes	100M
Write QPS (avg / peak)	10M / 86,400 × 3	~120 / ~350
Read QPS (avg / peak)	100M / 86,400 × 3	~1.2K / ~3.5K (all reads; ~20% reach origin)
Average paste size	given	~50 KB
Storage / day	10M × 50 KB	~500 GB
Storage / year	×365	~180 TB
Storage / 10 yr (no expiry)	×10	~1.8 PB
Storage / 10 yr (50% expire)	halved	~900 TB
Hot working set	~5% of pastes account for 90% of reads	~9 TB
Egress / day	100M × 50 KB × 0.2 (CDN miss)	~1 TB / day from origin

The takeaway: this is a storage problem, not a compute problem. Object storage (S3 Standard) at $0.023/GB-month holds 1.8 PB for ~$40K/month before any tiering. Most read bytes leave through the CDN, which is the next biggest cost line.
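The table is easy to re-derive. Here is a short sketch of the arithmetic, using only the "given" figures above; the 3× peak factor and decimal units (1 KB = 1,000 bytes) are assumptions read off the calculations column.

```python
# Napkin capacity math, reproduced from the table's "given" inputs.
SECONDS_PER_DAY = 86_400
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15  # decimal units

writes_per_day = 10_000_000
read_write_ratio = 10
avg_paste_bytes = 50 * KB
peak_factor = 3                  # assumed peak-to-average multiplier
cdn_miss_rate = 0.2              # share of reads that reach origin
s3_standard_usd_per_gb_month = 0.023

reads_per_day = writes_per_day * read_write_ratio
write_qps_avg = writes_per_day / SECONDS_PER_DAY
read_qps_avg = reads_per_day / SECONDS_PER_DAY

storage_per_day = writes_per_day * avg_paste_bytes
storage_per_year = storage_per_day * 365
storage_10y = storage_per_year * 10
origin_egress_per_day = reads_per_day * avg_paste_bytes * cdn_miss_rate
s3_monthly_usd = storage_10y / GB * s3_standard_usd_per_gb_month

print(f"write QPS avg/peak: {write_qps_avg:.0f} / {write_qps_avg * peak_factor:.0f}")
print(f"read QPS avg/peak:  {read_qps_avg:.0f} / {read_qps_avg * peak_factor:.0f}")
print(f"storage per day:    {storage_per_day / GB:.0f} GB")
print(f"storage per year:   {storage_per_year / TB:.1f} TB")
print(f"storage, 10 years:  {storage_10y / PB:.2f} PB")
print(f"origin egress/day:  {origin_egress_per_day / TB:.1f} TB")
print(f"S3 Standard, 10 y:  ${s3_monthly_usd:,.0f} / month")
```

Changing `cdn_miss_rate` or `avg_paste_bytes` shows quickly which assumptions the egress and storage lines are most sensitive to.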

3 · API and data model

Endpoints

POST /v1/pastes # create
{
 "content": "...", # required, ≤ 2 MB
 "language": "python", # optional, hint for client-side highlighter
 "privacy": "unlisted", # public | unlisted | private
 "expires": "1d" # 10m | 1h | 1d | 1m | 1y | never
}
→ 201 {"id":"aB3x9Q2", "url":"https://pb.sh/aB3x9Q2", "expires_at":"2026-05-10T..."}

GET /:id # read (the hot path)
→ 200 text/plain + the paste content
 Cache-Control: public, max-age=300, s-maxage=86400

GET /:id/raw # read raw, for clients/tools
→ 200 text/plain

POST /:id/report # abuse / takedown
→ 202

DELETE /:id # owner only
→ 204
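A minimal sketch of the validation the create endpoint implies. The names `validate_create` and `NewPaste` are hypothetical, and the defaults (unlisted, never) and the 30-day month are assumptions; the API above does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_FREE_BYTES = 2 * 1000 * 1000  # 2 MB free-tier cap (decimal MB assumed)
PRIVACY_LEVELS = {"public", "unlisted", "private"}

# An explicit table instead of a suffix parser: "10m" is minutes, "1m" is a month.
EXPIRY_OPTIONS = {
    "10m": timedelta(minutes=10),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "1m": timedelta(days=30),   # assumption: one month is treated as 30 days
    "1y": timedelta(days=365),
    "never": None,
}


@dataclass(frozen=True)
class NewPaste:
    content: bytes
    language: Optional[str]
    privacy: str
    expires_at: Optional[datetime]


def validate_create(body: dict, now: datetime) -> NewPaste:
    """Validate a POST /v1/pastes body and resolve its expiry timestamp."""
    content = body.get("content")
    if not isinstance(content, str) or not content:
        raise ValueError("content is required")
    raw = content.encode("utf-8")
    if len(raw) > MAX_FREE_BYTES:
        raise ValueError("content exceeds the 2 MB cap")

    privacy = body.get("privacy", "unlisted")       # assumed default
    if privacy not in PRIVACY_LEVELS:
        raise ValueError(f"unknown privacy level: {privacy!r}")

    expires = body.get("expires", "never")          # assumed default
    if expires not in EXPIRY_OPTIONS:
        raise ValueError(f"unknown expiry option: {expires!r}")
    ttl = EXPIRY_OPTIONS[expires]
    expires_at = now + ttl if ttl is not None else None

    return NewPaste(raw, body.get("language"), privacy, expires_at)


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
paste = validate_create({"content": "print(1)", "expires": "1d"}, now)
print(paste.privacy, paste.expires_at)
```

Measuring the cap on encoded bytes rather than characters matters here, because a 2 MB limit on a UTF-8 string can be exceeded well before two million characters.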

Schema

One table, plus the blob in object storage. The Postgres row is intentionally tiny: the paste body lives in S3, with the row pointing at the object key.

-- pastes: metadata only, ~300 B / row
CREATE TABLE pastes (
  id            VARCHAR(8)   PRIMARY KEY,            -- base62 ID
  blob_key      VARCHAR(128) NOT NULL,               -- pastes/<first-2-hex>/<sha256>, ~74 chars
  size_bytes    INTEGER      NOT NULL,
  language      VARCHAR(32)  NULL,
  privacy       VARCHAR(16)  NOT NULL,               -- public | unlisted | private
  owner_id      BIGINT       NULL,
  content_hash  CHAR(64)     NOT NULL,               -- sha256 hex, used for dedupe
  created_at    TIMESTAMP    NOT NULL,
  expires_at    TIMESTAMP    NULL,
  deleted_at    TIMESTAMP    NULL,                   -- soft delete; reaper hard-deletes after grace
  flagged       BOOLEAN      NOT NULL DEFAULT FALSE  -- abuse review
);

CREATE INDEX ON pastes (expires_at) WHERE expires_at IS NOT NULL;  -- expiry sweep
CREATE INDEX ON pastes (content_hash);                             -- dedupe lookup
CREATE INDEX ON pastes (owner_id, created_at);                     -- "my pastes"
CREATE INDEX ON pastes (created_at) WHERE privacy = 'public';      -- public listing

paste_blobs -- S3 bucket, content-addressed
 Bucket layout: pastes/<first-2-hex>/<sha256>
 Storage class: Standard for hot, Infrequent Access after 30d, Glacier after 1y
 Server-side encryption: SSE-S3 (free) or SSE-KMS (compliance)
 Lifecycle policy: tier-down on age, hard-delete on expiry

Content-addressed storage by SHA-256 means duplicate pastes — and there are many — share a blob. The pastes table holds N rows pointing to one S3 object. ~30% storage savings in practice.
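To make the dedupe concrete, here is a small sketch with an in-memory dictionary standing in for the S3 bucket and another for the pastes table. `BlobStore`, `blob_key_for`, and `create_paste` are illustrative names, not part of any real client library.

```python
import hashlib


def blob_key_for(content: bytes) -> tuple[str, str]:
    """Return (content_hash, object key) using the pastes/<first-2-hex>/<sha256> layout."""
    digest = hashlib.sha256(content).hexdigest()
    return digest, f"pastes/{digest[:2]}/{digest}"


class BlobStore:
    """In-memory stand-in for the S3 bucket."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.puts = 0

    def put_if_absent(self, key: str, content: bytes) -> bool:
        """Store the blob unless the key already exists; report whether a write happened."""
        if key in self.objects:
            return False
        self.objects[key] = content
        self.puts += 1
        return True


def create_paste(store: BlobStore, rows: dict, paste_id: str, content: bytes) -> str:
    """Write one metadata row per paste; identical content shares a single blob."""
    content_hash, key = blob_key_for(content)
    store.put_if_absent(key, content)
    rows[paste_id] = {
        "blob_key": key,
        "content_hash": content_hash,
        "size_bytes": len(content),
    }
    return key


store, rows = BlobStore(), {}
snippet = b"*/5 * * * * /usr/bin/backup.sh\n"
create_paste(store, rows, "aB3x9Q2", snippet)
create_paste(store, rows, "Zk81mPq", snippet)   # same bytes, second paste
print(len(rows), "rows ->", store.puts, "blob")
```

One consequence worth keeping in mind: because several rows can share a blob, deleting a paste can only remove the blob once no other live row still points at it.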

4 · High-level architecture

The shape is read-side-heavy. The edge does most of the work; the origin handles creates and the long tail of cache misses.

The hot path on read is: edge → read svc → Redis (for the metadata) → S3 (for the blob). On a CDN hit, nothing behind the edge fires. The create path is: create svc → scan svc (async, can fail-open or fail-closed depending on policy) → S3 + Postgres.
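The origin half of the read path can be sketched in a few lines. Everything here is a stand-in: `TTLCache` plays the role of Redis, `db_lookup` the Postgres query, and `https://blobs.example/` a hypothetical signed-URL host. Returning 404 for an expired paste is an assumption.

```python
import time
from typing import Callable, Optional, Tuple


class TTLCache:
    """Tiny in-process stand-in for the Redis metadata cache."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # paste_id -> (stored_at, metadata)

    def get(self, key: str) -> Optional[dict]:
        hit = self._items.get(key)
        if hit is None:
            return None
        stored_at, value = hit
        if self.clock() - stored_at >= self.ttl:
            del self._items[key]  # entry outlived its TTL
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._items[key] = (self.clock(), value)


def handle_origin_read(
    paste_id: str,
    cache: TTLCache,
    db_lookup: Callable[[str], Optional[dict]],
    now: float,
) -> Tuple[int, Optional[str]]:
    """Runs only on a CDN miss. Returns (HTTP status, redirect target)."""
    meta = cache.get(paste_id)
    if meta is None:
        meta = db_lookup(paste_id)       # fall back to the metadata table
        if meta is None:
            return 404, None
        cache.set(paste_id, meta)
    if meta["flagged"]:
        return 451, None                 # taken down after abuse review
    expires_at = meta["expires_at"]
    if expires_at is not None and expires_at <= now:
        return 404, None                 # assumption: expired looks like missing
    # The blob bytes never pass through this service; the client is redirected.
    return 302, "https://blobs.example/" + meta["blob_key"]
```

Checking `expires_at` on the read path means an expired paste stops resolving at origin even if the sweeper has not reached it yet.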

5 · The hard part — storage tiering and the retention sweep

At 1.8 PB and growing, storage is the dominant cost. Three patterns earn their keep:

Pattern	How	Savings
Lifecycle tiering	S3 lifecycle policy: Standard for 30 d → Infrequent Access for 1 y → Glacier Deep Archive after.	~75% on the long tail. Glacier Deep Archive is $0.00099/GB-month vs $0.023 for Standard.
Content-addressed dedupe	Object key is SHA-256 of content. New paste with same content reuses the existing blob.	~30% based on real-world dupes (cron snippets, error stacks, the same Stack Overflow answer pasted 8 times).
Compression at write	Server-side Zstd before S3 PUT. Decompress on read; CDN caches the compressed bytes.	~60% on text. CPU cost is small at 120 RPS write.

The expiry sweep

Pastes with expires_at need to disappear on time. Three ways to do it, from worst to best:

Scan the whole table at midnight. Hot, single-threaded, blocks the database. Don't.
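Index on expires_at, sweep every 15 minutes. Pull rows with expires_at < NOW(), soft-delete the row, and delete the S3 blob only when no other live row shares its content_hash (blobs are deduplicated, so one object can back many pastes). The standard answer.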
Time-bucketed expiry queue. Route each paste to a Redis sorted set keyed by hour. The sweeper just pops the expired bucket. No range scans on the relational store. The next-tier answer when expiry is on the hot path.
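The time-bucketed option can be sketched without Redis at all. Below, a dictionary of per-hour buckets stands in for the sorted sets (in Redis this would map to something like ZADD with expires_at as the score); `ExpiryBuckets` is a hypothetical name and timestamps are plain epoch seconds.

```python
from collections import defaultdict

BUCKET_SECONDS = 3600  # one bucket per hour, as described above


class ExpiryBuckets:
    """In-memory stand-in for per-hour expiry sets keyed by bucket number."""

    def __init__(self) -> None:
        self._buckets = defaultdict(dict)  # bucket -> {paste_id: expires_at}

    @staticmethod
    def bucket_of(timestamp: float) -> int:
        return int(timestamp // BUCKET_SECONDS)

    def schedule(self, paste_id: str, expires_at: float) -> None:
        self._buckets[self.bucket_of(expires_at)][paste_id] = expires_at

    def pop_due(self, now: float) -> list:
        """Remove and return every paste ID whose expires_at is at or before `now`."""
        current = self.bucket_of(now)
        due = []
        # Only buckets up to the current hour can contain expired entries.
        for bucket in sorted(b for b in self._buckets if b <= current):
            entries = self._buckets[bucket]
            expired = sorted(pid for pid, at in entries.items() if at <= now)
            for pid in expired:
                del entries[pid]
            if not entries:
                del self._buckets[bucket]
            due.extend(expired)
        return due


queue = ExpiryBuckets()
queue.schedule("aB3x9Q2", expires_at=3_660.0)   # 01:01
queue.schedule("Zk81mPq", expires_at=5_400.0)   # 01:30
print(queue.pop_due(now=3_720.0))               # sweep at 01:02
```

The sweeper touches at most the buckets up to the current hour, so its cost tracks the number of pastes expiring now rather than the size of the table.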

The deletion-isn't-deletion trap. With versioning enabled, an S3 DELETE only adds a delete marker; the earlier versions, and the bytes in them, stay in the bucket. For real privacy/compliance, add a lifecycle rule that expires noncurrent versions after the grace period. Test this end-to-end before claiming GDPR compliance; most teams don't.
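As a rough illustration, the tiering schedule and the version clean-up could be expressed as one lifecycle configuration. The dictionary below follows the shape boto3 accepts for `put_bucket_lifecycle_configuration`; the rule IDs, the 30-day grace period, and reading "Infrequent Access for 1 y" as day 395 are assumptions.

```python
import json


def paste_bucket_lifecycle(grace_days: int = 30) -> dict:
    """Build a lifecycle configuration for the paste bucket (a sketch, not a tested policy)."""
    return {
        "Rules": [
            {
                "ID": "tier-down-by-age",           # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": "pastes/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # 30 days in Standard + 1 year in IA, then Deep Archive.
                    {"Days": 395, "StorageClass": "DEEP_ARCHIVE"},
                ],
            },
            {
                "ID": "purge-old-versions",         # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": "pastes/"},
                # Permanently remove overwritten or deleted versions after the grace period.
                "NoncurrentVersionExpiration": {"NoncurrentDays": grace_days},
                # Clean up delete markers that no longer hide any version.
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            },
        ]
    }


# With boto3 this would be passed as LifecycleConfiguration=... for the bucket.
print(json.dumps(paste_bucket_lifecycle(), indent=2))
```

Two caveats apply to a policy like this. Standard-IA bills a 128 KB minimum per object, so the tiering savings on ~50 KB pastes may be smaller than the per-GB prices suggest. Deep Archive retrievals also take hours, which does not fit a paste that can still be read on demand.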

6 · Caching strategy

Three layers, the same as URL shortener but with a different ratio of work — the edge does much more here because the payload is bigger.

Edge cache (CDN). 24-hour TTL on read responses. Catches ~85% of all reads. The viral-paste burst (someone tweets a paste link) lands here, not on origin.
Redis (metadata only). 5-minute TTL on the id → blob_key, expiry, privacy tuple. ~99% hit rate; saves a Postgres lookup on every cache miss at the edge.
S3 acts as its own cache. The If-None-Match / If-Modified-Since dance lets the read-svc revalidate cheaply. The blob bytes don't go through the read-svc — it returns a signed URL or a 302, depending on privacy.

Public vs. private at the edge. Public pastes: fully CDN-cacheable. Unlisted: cacheable at the user agent only, via Cache-Control: private, max-age=300. Private: never cached at the edge; no-store and an auth-checked stream from origin. Mixing these up is the most common security bug in designs like this.
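The three policies fit in one small function. The sketch below also caps each TTL at the paste's remaining lifetime, which is an assumption added here so that a 10-minute paste is not held at the edge for 24 hours; `cache_control` is a hypothetical helper name.

```python
from typing import Optional

EDGE_TTL = 86_400    # s-maxage from the GET /:id example
BROWSER_TTL = 300    # max-age from the GET /:id example


def cache_control(privacy: str, seconds_until_expiry: Optional[int]) -> str:
    """Pick a Cache-Control header from the privacy level and remaining lifetime.

    seconds_until_expiry is None for pastes that never expire.
    """
    if privacy == "private":
        return "no-store"                      # never cached anywhere
    remaining = seconds_until_expiry
    if remaining is not None and remaining <= 0:
        return "no-store"                      # already expired; do not cache the response
    browser = BROWSER_TTL if remaining is None else min(BROWSER_TTL, remaining)
    if privacy == "unlisted":
        return f"private, max-age={browser}"   # user agent only, never the CDN
    if privacy == "public":
        edge = EDGE_TTL if remaining is None else min(EDGE_TTL, remaining)
        return f"public, max-age={browser}, s-maxage={edge}"
    raise ValueError(f"unknown privacy level: {privacy!r}")


for level, remaining in [("public", None), ("public", 600), ("unlisted", 120), ("private", None)]:
    print(level, remaining, "->", cache_control(level, remaining))
```

Routing every response through a single function like this makes the "mixed up the privacy levels" bug a one-place review instead of a per-handler one.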

7 · Abuse, scanning, takedowns

User-generated content invariably attracts abuse. This design treats abuse handling as first-class infrastructure rather than an afterthought.

Concern	Tooling	Where it runs
Malware	ClamAV, VirusTotal API, Cloudflare Workers	Async after PUT, before publishing to CDN
CSAM / known-bad hash	PhotoDNA, NCMEC hash lists	Sync on create — fail-closed; legal requirement
Spam / phishing	URL classifier, domain reputation, ML model	Async, with confidence-thresholded auto-takedown
Copyright / DMCA	Manual review queue, takedown API	Human-in-loop, 24-hour SLA
Rate limit	Token bucket per IP + per user	API gateway, before reaching create svc

A flagged paste is soft-deleted (paste returns 451 Unavailable for Legal Reasons), the blob is moved to a quarantine bucket with restricted IAM, and the event lands in a SIEM. Hard delete only after the appeal window (typically 30 days).
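The rate-limit row in the table is the simplest of these controls to sketch. Below is a minimal token bucket keyed by an arbitrary string such as an IP or user ID; the rate and capacity values are placeholders, and `CreateLimiter` is a hypothetical name.

```python
import time
from typing import Callable


class TokenBucket:
    """Classic token bucket: refills at `rate_per_sec`, holds at most `capacity` tokens."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start full so short bursts are allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class CreateLimiter:
    """One bucket per key, e.g. "ip:203.0.113.7" or "user:42"."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self._buckets = {}

    def allow(self, key: str) -> bool:
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(self.rate, self.capacity, self.clock)
        return self._buckets[key].allow()


limiter = CreateLimiter(rate_per_sec=1.0, capacity=5)   # placeholder limits
print([limiter.allow("ip:203.0.113.7") for _ in range(6)])
```

In a gateway the buckets would live in shared storage rather than process memory, and a request would need to pass both its per-IP and its per-user bucket.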

8 · Failure modes & runbook

Failure	Symptom	Mitigation
S3 region outage	All reads on cache miss → 5xx	Cross-region replication; failover URL signing to the replica region (~5 min).
Postgres primary down	Create svc fails; reads survive (Redis + S3)	Read replica auto-promote (~30 s). Create svc returns 503 with Retry-After.
Redis cluster unhealthy	Postgres load 10×	Local in-process LRU absorbs ~50%; Postgres has read-replicas to soak the rest; circuit breaker drops to direct-DB read.
Scan svc down	Pastes pile up in pending state	Backlog the queue; the create svc returns 202 Accepted with the paste in "scanning" status. Fail-open after 1-hour timeout for low-risk content; fail-closed for known-bad indicators.
Viral paste DDoS	Edge holds, but cache misses for the hot paste ID saturate origin	Edge cache absorbs > 95% of requests. For the remainder, rate-limit per source IP and serve the cached value with `stale-if-error`.
Expiry sweep fell behind	Expired pastes accessible past their TTL	The CDN's TTL bounds the leak (max 24 h). Sweep dual-pass: hourly in-region + daily cross-region reconciliation.
Storage cost runaway	S3 bill 2× last month	Lifecycle policy alarms on missing transitions; weekly aged-paste audit; automatic Glacier transition for > 1-year unlisted.

9 · Cost & SLOs

Line	Estimate	Note
S3 storage (1.8 PB blended tiers)	~$18K / month	40% Standard / 40% IA / 20% Glacier
S3 requests + transfer	~$2K / month	Mostly origin egress to CDN
CDN egress	~$8K / month	~150 TB / month at $0.005-$0.02 / GB blended
Postgres (managed, 2 TB)	~$1.5K / month	1 primary + 2 replicas
Redis (50 GB cluster)	~$1K / month	3-node managed
Compute (read 100 + create 30 + scan 20 pods)	~$3K / month	Managed K8s
Scanning (PhotoDNA + ClamAV + VT API)	~$2K / month	Per-scan cost is low at 120 RPS write
Total	~$36K / month	~$0.36 / 1K pastes lifetime

SLOs

Read availability: 99.99%. Edge + cross-region replication → ~13 min/quarter budget.
Create availability: 99.9%. Looser budget; ~2 hours/quarter. Postgres failover dominates.
Read P99: 200 ms (cache hit) / 500 ms (origin). Track separately; the cache miss rate is the main lever.
Expiry SLA: 99% within 30 minutes of expires_at. Sweep cadence + CDN TTL bounds.
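The error budgets behind the availability targets are one line of arithmetic. This sketch assumes a 90-day quarter; it gives roughly 13 minutes at 99.99% and a little over two hours at 99.9%.

```python
def error_budget_minutes(slo: float, window_days: int = 90) -> float:
    """Minutes of allowed downtime in the window for an availability target such as 0.9999."""
    return (1.0 - slo) * window_days * 24 * 60


for name, slo in [("read", 0.9999), ("create", 0.999)]:
    print(f"{name}: {error_budget_minutes(slo):.1f} min per quarter")
```

A 30-second Postgres failover spends well under 1% of the create budget per incident, so repeated or slow failovers, not single ones, are what would exhaust it.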

10 · Trade-offs & "what would you change at 10×"

If…	Then…
Writes 10× (100M / day)	Stronger compression (Zstd → Zstd-19); shard Postgres metadata; pre-compute base62 IDs in batch.
Reads 10× (1B / day)	Already mostly absorbed by edge; the lever is CDN coverage and longer TTLs for public pastes (24h → 7d).
Strict end-to-end encryption	Client-side encrypt, server stores ciphertext blob + non-secret nonce. Loses dedupe, loses scanning — make this a paid-tier opt-in only.
Global writes (multi-region active-active)	Region-prefixed IDs (us-aB3x9, eu-...) avoid a coordination dance. Cross-region conflicts are impossible because each region mints IDs only under its own prefix.
Versioned pastes	S3 versioning has no feature fee, but every stored version is billed as storage; index versions in Postgres with a `version` column. List endpoint surfaces history.
"What would a more senior answer add?"	The legal/compliance pipeline: DMCA queue at scale, regulatory reporting (NCMEC, GDPR Article 17 deletes), the audit trail. Most candidates skip this; it's the difference between "designs systems" and "owns systems".

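For the multi-region row above, a region-prefixed ID generator might look like the sketch below. The region list, the random-suffix approach, and the helper names are assumptions; the five-character suffix simply mirrors the us-aB3x9 example.

```python
import secrets
import string

BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase
REGIONS = {"us", "eu"}          # hypothetical region codes


def new_paste_id(region: str, suffix_length: int = 5) -> str:
    """Mint an ID inside one region's namespace, e.g. "us-aB3x9"."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    suffix = "".join(secrets.choice(BASE62) for _ in range(suffix_length))
    return f"{region}-{suffix}"


def region_of(paste_id: str) -> str:
    """Recover the home region from the ID so reads can be routed without a lookup."""
    return paste_id.split("-", 1)[0]


paste_id = new_paste_id("eu")
print(paste_id, "->", region_of(paste_id))
```

The prefix only rules out collisions between regions. Within a region, random suffixes can still collide, so the primary-key constraint plus a retry remains necessary. Five base62 characters give 62^5 ≈ 916 million IDs, about three months of writes at 10M per day, so a real deployment would need a longer suffix.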
Further reading

Distributed key-value store