
Pastebin

The same shape as URL shortener — small write API, hot read path — but the storage choice flips. Pastes are large blobs (10 KB to 2 MB), so the architecture moves from hot KV to object storage with edge caching. Walk this and you've seen both ends of the read-heavy spectrum, plus the retention and abuse-handling problems that come with user-generated content.


1 · Clarifying questions

What's the max paste size? 2 MB. Hard cap for free tier; 10 MB for paid. The cap shapes everything from the schema to the upload path.
How long do pastes live? Configurable per paste: 10 min, 1 hour, 1 day, 1 month, 1 year, forever. Forever costs real money in long-tail storage.
Privacy model? Public (indexed, listable), unlisted (link-only, not listable), private (auth required). Three different access patterns, three different code paths.
What's the read/write ratio? ~10:1 across all pastes; ~100:1 or more for the handful of viral pastes. Most pastes get one or two reads; a few get millions.
Syntax highlighting? Client-side only (Prism / highlight.js). Never store rendered HTML: it doubles storage and bakes in security risks.
Abuse handling? Required. Malware scanning, CSAM detection, takedown workflow, regulator notice. Skipping this is what fails the interview round.
Latency budget? P99 read ≤ 200 ms (most served from edge). P99 create ≤ 500 ms (it's an upload).
Multi-region? Yes for reads (anycast edge). Single-region writes initially.

2 · Capacity math, on a napkin

| Number | Calculation | Result |
|---|---|---|
| Writes / day | given | 10M |
| Reads / day | 10× writes | 100M |
| Write QPS (avg / peak) | 10M / 86,400 (× 3 for peak) | ~120 / ~350 |
| Read QPS (avg / peak) | 100M / 86,400 (× 3 for peak) | ~1.2K / ~3.5K (origin) |
| Average paste size | given | ~50 KB |
| Storage / day | 10M × 50 KB | ~500 GB |
| Storage / year | ×365 | ~180 TB |
| Storage / 10 yr (no expiry) | ×10 | ~1.8 PB |
| Storage / 10 yr (50% expire) | halved | ~900 TB |
| Hot working set | ~5% of pastes account for 90% of reads | ~9 TB |
| Egress / day | 100M × 50 KB × 0.2 (CDN miss) | ~1 TB / day from origin |

The takeaway: this is a storage problem, not a compute problem. Object storage (S3) at $0.023/GB-month holds 1.8 PB for ~$40K/month before any tiering, and most read traffic flows out via the CDN, which is the next biggest cost line.
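The napkin numbers above can be re-derived in a few lines. This is a sketch that only restates the table's inputs; the variable names are illustrative, and decimal units (1 KB = 1,000 bytes) are assumed.

```python
# Back-of-envelope capacity math; inputs are the "given" rows of the table.
SECONDS_PER_DAY = 86_400
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15  # decimal units (assumption)

writes_per_day = 10_000_000
reads_per_day = 10 * writes_per_day
avg_paste_bytes = 50 * KB
peak_factor = 3                      # peak = 3x average, as in the table
cdn_miss_ratio = 0.2                 # share of reads that reach origin
s3_standard_usd_per_gb_month = 0.023

write_qps_avg = writes_per_day / SECONDS_PER_DAY
read_qps_avg = reads_per_day / SECONDS_PER_DAY
storage_per_day = writes_per_day * avg_paste_bytes
storage_per_year = storage_per_day * 365
storage_10y = storage_per_year * 10
origin_egress_per_day = reads_per_day * avg_paste_bytes * cdn_miss_ratio
monthly_cost_10y = storage_10y / GB * s3_standard_usd_per_gb_month

print(f"write QPS avg/peak: {write_qps_avg:.0f} / {write_qps_avg * peak_factor:.0f}")
print(f"read QPS avg/peak:  {read_qps_avg:.0f} / {read_qps_avg * peak_factor:.0f}")
print(f"storage/day:  {storage_per_day / GB:.0f} GB")
print(f"storage/year: {storage_per_year / TB:.1f} TB")
print(f"storage/10y:  {storage_10y / PB:.3f} PB")
print(f"origin egress/day: {origin_egress_per_day / TB:.1f} TB")
print(f"S3 Standard, 10y of data: ${monthly_cost_10y:,.0f} / month")
```

The exact figures land slightly off the rounded table values (about 116 write QPS rather than ~120), which is the expected precision for a napkin estimate.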

3 · API and data model

Endpoints

POST /v1/pastes # create
{
 "content": "...", # required, ≤ 2 MB
 "language": "python", # optional, hint for client-side highlighter
 "privacy": "unlisted", # public | unlisted | private
 "expires": "1d" # 10m | 1h | 1d | 1m | 1y | never ("10m" = 10 minutes, "1m" = 1 month)
}
→ 201 {"id":"aB3x9Q2", "url":"https://pb.sh/aB3x9Q2", "expires_at":"2026-05-10T..."}

GET /:id # read (the hot path)
→ 200 text/plain + the paste content
 Cache-Control: public, max-age=300, s-maxage=86400

GET /:id/raw # read raw, for clients/tools
→ 200 text/plain

POST /:id/report # abuse / takedown
→ 202

DELETE /:id # owner only
→ 204
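
As an illustration of the create contract, here is a minimal validation sketch. The function name `validate_create`, the defaults for omitted fields, the 2 MiB reading of "2 MB", and the 30-day month / 365-day year are all assumptions, not part of the API above. The expiry values are handled as a fixed lookup rather than a unit parser, because "10m" means minutes while "1m" means one month.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_PASTE_BYTES = 2 * 1024 * 1024  # free-tier cap; binary MB is an assumption

# Fixed lookup: the token set is closed, so no unit parsing is needed.
EXPIRY_TOKENS = {
    "10m": timedelta(minutes=10),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "1m": timedelta(days=30),     # one month, approximated as 30 days (assumption)
    "1y": timedelta(days=365),    # one year, approximated as 365 days (assumption)
    "never": None,
}
PRIVACY_LEVELS = {"public", "unlisted", "private"}


def validate_create(body: dict, now: datetime) -> dict:
    """Validate a POST /v1/pastes body and derive the metadata fields."""
    content = body.get("content")
    if not isinstance(content, str) or not content:
        raise ValueError("content is required")
    size = len(content.encode("utf-8"))
    if size > MAX_PASTE_BYTES:
        raise ValueError("content exceeds the size cap")

    privacy = body.get("privacy", "unlisted")   # default is an assumption
    if privacy not in PRIVACY_LEVELS:
        raise ValueError(f"unknown privacy level: {privacy!r}")

    expires = body.get("expires", "never")      # default is an assumption
    if expires not in EXPIRY_TOKENS:
        raise ValueError(f"unknown expires value: {expires!r}")
    ttl = EXPIRY_TOKENS[expires]
    expires_at: Optional[datetime] = now + ttl if ttl is not None else None

    return {
        "size_bytes": size,
        "language": body.get("language"),
        "privacy": privacy,
        "expires_at": expires_at,
    }


if __name__ == "__main__":
    now = datetime(2026, 5, 9, 12, 0, tzinfo=timezone.utc)
    print(validate_create({"content": "print(1)", "expires": "1d"}, now))
```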

Schema

One Postgres table, plus a content-addressed bucket in object storage. The Postgres row is intentionally tiny: the paste body lives in S3 with the row pointing at the object key.

-- pastes: metadata only, ~300 B / row
CREATE TABLE pastes (
  id           VARCHAR(8)   PRIMARY KEY,            -- base62 ID
  blob_key     VARCHAR(128) NOT NULL,               -- s3://pastes/<sha256-prefix>/<sha256> (longer than 64 chars)
  size_bytes   INTEGER      NOT NULL,
  language     VARCHAR(32)  NULL,
  privacy      VARCHAR(16)  NOT NULL,               -- public | unlisted | private
  owner_id     BIGINT       NULL,
  content_hash CHAR(64)     NOT NULL,               -- sha256 hex, used for dedupe
  created_at   TIMESTAMP    NOT NULL,
  expires_at   TIMESTAMP    NULL,
  deleted_at   TIMESTAMP    NULL,                   -- soft delete; reaper hard-deletes after grace
  flagged      BOOLEAN      NOT NULL DEFAULT FALSE  -- abuse review
);

CREATE INDEX pastes_expires_idx ON pastes (expires_at) WHERE expires_at IS NOT NULL;  -- expiry sweep
CREATE INDEX pastes_hash_idx    ON pastes (content_hash);                             -- dedupe lookup
CREATE INDEX pastes_owner_idx   ON pastes (owner_id, created_at);                     -- "my pastes"
CREATE INDEX pastes_public_idx  ON pastes (created_at) WHERE privacy = 'public';      -- public listing

paste_blobs -- S3 bucket, content-addressed
 Bucket layout: pastes/<first-2-hex>/<sha256>
 Storage class: Standard for hot, Infrequent Access after 30d, Glacier after 1y
 Server-side encryption: SSE-S3 (free) or SSE-KMS (compliance)
 Lifecycle policy: tier-down on age, hard-delete on expiry

Content-addressed storage by SHA-256 means duplicate pastes — and there are many — share a blob. The pastes table holds N rows pointing to one S3 object. ~30% storage savings in practice.
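
A minimal sketch of the content-addressed write, assuming the `pastes/<first-2-hex>/<sha256>` layout above. `BlobStore` is a hypothetical in-memory stand-in for the S3 bucket, and `create_paste` and the `rows` dict are illustrative names rather than part of the design.

```python
import hashlib


def blob_key_for(content: bytes) -> tuple[str, str]:
    """Return (content_hash, object key) using the pastes/<first-2-hex>/<sha256> layout."""
    digest = hashlib.sha256(content).hexdigest()
    return digest, f"pastes/{digest[:2]}/{digest}"


class BlobStore:
    """In-memory stand-in for the S3 bucket (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.puts = 0

    def put_if_absent(self, key: str, content: bytes) -> bool:
        """Store the blob unless the key already exists; return True if written."""
        if key in self.objects:
            return False
        self.objects[key] = content
        self.puts += 1
        return True


def create_paste(store: BlobStore, rows: dict, paste_id: str, content: bytes) -> dict:
    """Write the blob once, then add a metadata row pointing at it."""
    content_hash, key = blob_key_for(content)
    store.put_if_absent(key, content)  # a duplicate body costs no extra storage
    rows[paste_id] = {
        "blob_key": key,
        "content_hash": content_hash,
        "size_bytes": len(content),
    }
    return rows[paste_id]


if __name__ == "__main__":
    store, rows = BlobStore(), {}
    create_paste(store, rows, "aB3x9Q2", b"0 * * * * /usr/bin/backup.sh\n")
    create_paste(store, rows, "Zk81mPq", b"0 * * * * /usr/bin/backup.sh\n")
    print(len(rows), "rows,", len(store.objects), "blob")
```

Two rows end up sharing one object, which is why any delete path has to check for other live references before removing a blob.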

4 · High-level architecture

The shape is read-side-heavy. The edge does most of the work; the origin handles creates and the long tail of cache misses.

The hot path on read is: edge → read svc → Redis (for the metadata) → S3 (for the blob). On a CDN hit, none of this fires. The create path is: create svc → scan svc (async, can fail-open or fail-closed depending on policy) → S3 + Postgres.

5 · The hard part — storage tiering and the retention sweep

At 1.8 PB and growing, storage is the dominant cost. Three patterns earn their keep:

| Pattern | How | Savings |
|---|---|---|
| Lifecycle tiering | S3 lifecycle policy: Standard for 30 d → Infrequent Access for 1 y → Glacier Deep Archive after. | ~75% on the long tail. Glacier Deep Archive is $0.00099/GB-month vs $0.023 for Standard. |
| Content-addressed dedupe | Object key is SHA-256 of content. New paste with same content reuses the existing blob. | ~30% based on real-world dupes (cron snippets, error stacks, the same Stack Overflow answer pasted 8 times). |
| Compression at write | Server-side Zstd before S3 PUT. Decompress on read; CDN caches the compressed bytes. | ~60% on text. CPU cost is small at 120 RPS write. |
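
The lifecycle-tiering row could be expressed as an S3 lifecycle configuration along these lines. This is a sketch: the rule ID and the `pastes/` prefix are assumptions, and the dict is only built here, not applied. With boto3 it would be passed as `LifecycleConfiguration` to `put_bucket_lifecycle_configuration`.

```python
def lifecycle_config(ia_after_days: int = 30, ia_duration_days: int = 365) -> dict:
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier Deep Archive.

    Days count from object creation, so the Deep Archive transition happens at
    ia_after_days + ia_duration_days (30 d in Standard, then 1 y in IA).
    """
    return {
        "Rules": [
            {
                "ID": "paste-tier-down",            # illustrative rule name
                "Status": "Enabled",
                "Filter": {"Prefix": "pastes/"},    # assumed key prefix
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {
                        "Days": ia_after_days + ia_duration_days,
                        "StorageClass": "DEEP_ARCHIVE",
                    },
                ],
            }
        ]
    }


if __name__ == "__main__":
    import json

    print(json.dumps(lifecycle_config(), indent=2))
    # Applying it (requires boto3 and credentials; bucket name is a placeholder):
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="<bucket>", LifecycleConfiguration=lifecycle_config())
```

One thing worth checking against the S3 documentation before relying on the savings figure: Standard-IA bills a 128 KB minimum object size, which matters when the average paste is ~50 KB.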

The expiry sweep

Pastes with expires_at need to disappear on time. Three ways to do it, from worst to best:

  1. Scan the whole table at midnight. Hot, single-threaded, blocks the database. Don't.
  2. Index on expires_at, sweep every 15 minutes. Pull rows with expires_at < NOW(), soft-delete the row, and delete the S3 blob only when no other live row shares its content_hash (blobs are deduplicated, so one paste expiring must not remove another paste's content). The standard answer.
  3. Time-bucketed expiry queue. Route each paste to a Redis sorted set keyed by hour. The sweeper just pops the expired bucket. No range scans on the relational store. The next-tier answer when expiry is on the hot path.

The deletion-isn't-deletion trap. With versioning enabled, an S3 DELETE only adds a delete marker; the earlier versions stay in the bucket. For real privacy/compliance, add a lifecycle rule that permanently removes noncurrent versions after the grace period. Test this end-to-end before claiming GDPR compliance; most teams don't.
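
The index-based sweep (option 2) can be sketched as follows. SQLite stands in for Postgres and a plain dict stands in for the S3 bucket so the example runs anywhere; `make_db` and `sweep_expired` are hypothetical names, and timestamps are Unix seconds for brevity. The reference check before the blob delete is needed because blobs are content-addressed and may be shared.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with just the columns the sweep needs."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pastes ("
        " id TEXT PRIMARY KEY, blob_key TEXT NOT NULL, content_hash TEXT NOT NULL,"
        " expires_at INTEGER, deleted_at INTEGER)"
    )
    # Partial index so the sweep never touches rows that do not expire.
    db.execute(
        "CREATE INDEX pastes_expires_idx ON pastes (expires_at)"
        " WHERE expires_at IS NOT NULL"
    )
    return db


def sweep_expired(db: sqlite3.Connection, blobs: dict, now: int, batch: int = 1000) -> int:
    """Run one sweep pass; return the number of pastes soft-deleted."""
    rows = db.execute(
        "SELECT id, blob_key, content_hash FROM pastes"
        " WHERE expires_at IS NOT NULL AND expires_at < ? AND deleted_at IS NULL"
        " ORDER BY expires_at LIMIT ?",
        (now, batch),
    ).fetchall()
    for paste_id, blob_key, content_hash in rows:
        db.execute("UPDATE pastes SET deleted_at = ? WHERE id = ?", (now, paste_id))
        live = db.execute(
            "SELECT COUNT(*) FROM pastes WHERE content_hash = ? AND deleted_at IS NULL",
            (content_hash,),
        ).fetchone()[0]
        if live == 0:
            blobs.pop(blob_key, None)  # last live reference gone: drop the blob
    db.commit()
    return len(rows)


if __name__ == "__main__":
    db = make_db()
    db.executemany(
        "INSERT INTO pastes (id, blob_key, content_hash, expires_at) VALUES (?, ?, ?, ?)",
        [("a", "k1", "h1", 50), ("b", "k1", "h1", None), ("c", "k2", "h2", 60)],
    )
    blobs = {"k1": b"shared", "k2": b"unique"}
    print(sweep_expired(db, blobs, now=100), sorted(blobs))
```

In a real deployment the batch would be run repeatedly until it returns fewer rows than the limit, and the blob delete would go through the grace period described in the schema.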

6 · Caching strategy

Three layers, the same as URL shortener but with a different ratio of work — the edge does much more here because the payload is bigger.

  • Edge cache (CDN). 24-hour TTL on read responses. Catches ~85% of all reads. The viral-paste burst (someone tweets a paste link) lands here, not on origin.
  • Redis (metadata only). 5-minute TTL on the id → blob_key, expiry, privacy tuple. ~99% hit rate; saves a Postgres lookup on every cache miss at the edge.
  • S3 acts as its own cache. The If-None-Match / If-Modified-Since dance lets the read-svc revalidate cheaply. The blob bytes don't go through the read-svc — it returns a signed URL or a 302, depending on privacy.

Public vs. private at the edge. Public pastes are fully CDN-cacheable. Unlisted pastes are cacheable at the user agent only, with Cache-Control: private, max-age=300. Private pastes are never cached at the edge: no-store and an auth-checked stream from origin. Mixing these up is the most common security bug in designs like this.
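
The per-privacy caching rules can be captured in one small function. This is a sketch: `cache_control` is a hypothetical helper, and capping the TTLs at the time remaining until `expires_at` is an added assumption (not stated above) intended to keep an expired paste from lingering in caches.

```python
from typing import Optional

EDGE_TTL_SECONDS = 86_400    # 24-hour CDN TTL
BROWSER_TTL_SECONDS = 300    # 5-minute user-agent TTL


def _cap(ttl: int, seconds_until_expiry: Optional[int]) -> int:
    """Never cache past the paste's own expiry (assumption, see lead-in)."""
    if seconds_until_expiry is None:
        return ttl
    return min(ttl, seconds_until_expiry)


def cache_control(privacy: str, seconds_until_expiry: Optional[int] = None) -> str:
    """Return the Cache-Control header value for a read response."""
    if privacy not in ("public", "unlisted", "private"):
        raise ValueError(f"unknown privacy level: {privacy!r}")
    if privacy == "private":
        return "no-store"                      # never cached anywhere
    if seconds_until_expiry is not None and seconds_until_expiry <= 0:
        return "no-store"                      # already expired
    browser = _cap(BROWSER_TTL_SECONDS, seconds_until_expiry)
    if privacy == "unlisted":
        return f"private, max-age={browser}"   # user agent only, not the CDN
    edge = _cap(EDGE_TTL_SECONDS, seconds_until_expiry)
    return f"public, max-age={browser}, s-maxage={edge}"


if __name__ == "__main__":
    print(cache_control("public"))
    print(cache_control("public", seconds_until_expiry=600))
    print(cache_control("unlisted"))
    print(cache_control("private"))
```

Keeping the decision in a single function with an explicit failure on unknown values makes the "mixed up privacy levels" bug harder to introduce by accident.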

7 · Abuse, scanning, takedowns

User-generated content invariably attracts abuse. This design treats abuse handling as first-class infrastructure rather than an afterthought.

| Concern | Tooling | Where it runs |
|---|---|---|
| Malware | ClamAV, VirusTotal API, Cloudflare Workers | Async after PUT, before publishing to CDN |
| CSAM / known-bad hash | PhotoDNA, NCMEC hash lists | Sync on create; fail-closed; legal requirement |
| Spam / phishing | URL classifier, domain reputation, ML model | Async, with confidence-thresholded auto-takedown |
| Copyright / DMCA | Manual review queue, takedown API | Human-in-loop, 24-hour SLA |
| Rate limit | Token bucket per IP + per user | API gateway, before reaching create svc |

A flagged paste is soft-deleted (paste returns 451 Unavailable for Legal Reasons), the blob is moved to a quarantine bucket with restricted IAM, and the event lands in a SIEM. Hard delete only after the appeal window (typically 30 days).
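
The rate-limit row names a token bucket per IP and per user. A minimal single-process sketch looks like this; the class names and the rate and capacity values are illustrative, and a gateway deployment would keep the bucket state in shared storage rather than in memory.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`; each request spends one."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per key, e.g. 'ip:203.0.113.7' or 'user:42'."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate, self.capacity, self.clock = rate, capacity, clock
        self.buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = self.buckets[key] = TokenBucket(self.rate, self.capacity, self.clock)
        return bucket.allow()


if __name__ == "__main__":
    limiter = RateLimiter(rate=1.0, capacity=5.0)   # illustrative limits
    print([limiter.allow("ip:203.0.113.7") for _ in range(7)])
```

The injectable `clock` exists so the refill behaviour can be tested without sleeping.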

8 · Failure modes & runbook

| Failure | Symptom | Mitigation |
|---|---|---|
| S3 region outage | All reads on cache miss → 5xx | Cross-region replication; failover URL signing to the replica region (~5 min). |
| Postgres primary down | Create svc fails; reads survive (Redis + S3) | Read replica auto-promote (~30 s). Create svc returns 503 with Retry-After. |
| Redis cluster unhealthy | Postgres load 10× | Local in-process LRU absorbs ~50%; Postgres has read-replicas to soak the rest; circuit breaker drops to direct-DB read. |
| Scan svc down | Pastes pile up in pending state | Backlog the queue; the create svc returns 202 Accepted with the paste in "scanning" status. Fail-open after 1-hour timeout for low-risk content; fail-closed for known-bad indicators. |
| Viral paste DDoS | Edge is OK; the paste id resolves to a saturating origin | Edge cache absorbs > 95%. Above that, rate-limit per source IP and serve cached value with stale-if-error. |
| Expiry sweep fell behind | Expired pastes accessible past their TTL | The CDN's TTL bounds the leak (max 24 h). Sweep dual-pass: hourly in-region + daily cross-region reconciliation. |
| Storage cost runaway | S3 bill 2× last month | Lifecycle policy alarms on missing transitions; weekly aged-paste audit; automatic Glacier transition for > 1-year unlisted. |
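
The "Scan svc down" row mixes fail-open and fail-closed behaviour, which is easy to get wrong in code. One way to write the decision is shown below as a sketch; the function name, the string states, and the boolean risk inputs are all hypothetical, and how "low-risk" and "known-bad indicator" are determined is outside its scope.

```python
from typing import Optional

SCAN_TIMEOUT_SECONDS = 3600  # the 1-hour timeout from the table


def scan_decision(
    scan_result: Optional[str],
    waited_seconds: float,
    low_risk: bool,
    known_bad_indicator: bool,
    timeout: float = SCAN_TIMEOUT_SECONDS,
) -> str:
    """Decide what to do with a paste in "scanning" status.

    scan_result is "clean", "malicious", or None when the scanner has not answered.
    Returns "publish", "quarantine", or "hold".
    """
    if scan_result == "malicious":
        return "quarantine"
    if scan_result == "clean":
        return "publish"
    if scan_result is not None:
        raise ValueError(f"unknown scan result: {scan_result!r}")
    # Scanner has not answered yet.
    if known_bad_indicator:
        return "hold"                       # fail-closed, regardless of wait time
    if low_risk and waited_seconds >= timeout:
        return "publish"                    # fail-open after the timeout
    return "hold"


if __name__ == "__main__":
    print(scan_decision(None, 4000, low_risk=True, known_bad_indicator=False))
    print(scan_decision(None, 4000, low_risk=True, known_bad_indicator=True))
```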

9 · Cost & SLOs

| Line | Estimate | Note |
|---|---|---|
| S3 storage (1.8 PB blended tiers) | ~$18K / month | 40% Standard / 40% IA / 20% Glacier |
| S3 requests + transfer | ~$2K / month | Mostly origin egress to CDN |
| CDN egress | ~$8K / month | ~150 TB / month at $0.005-$0.02 / GB blended |
| Postgres (managed, 2 TB) | ~$1.5K / month | 1 primary + 2 replicas |
| Redis (50 GB cluster) | ~$1K / month | 3-node managed |
| Compute (read 100 + create 30 + scan 20 pods) | ~$3K / month | Managed K8s |
| Scanning (PhotoDNA + ClamAV + VT API) | ~$2K / month | Per-scan cost is low at 120 RPS write |
| Total | ~$36K / month | ~$0.36 / 1K pastes lifetime |

SLOs

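  • Read availability: 99.99%. Edge + cross-region replication → ~13 min/quarter budget.
  • Create availability: 99.9%. Looser budget; ~2 hours/quarter. Postgres failover dominates.
  • Read P99: 200 ms (cache hit) / 500 ms (origin). Track separately; the cache miss rate is the main lever.
  • Expiry SLA: 99% within 30 minutes of expires_at. Bounded by sweep cadence plus edge TTL, so the edge TTL has to be capped at the time remaining until expires_at (or the URL purged on expiry); a flat 24-hour s-maxage would let an expired paste be served for up to a day.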
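
The error budgets follow directly from the availability targets. A short worked calculation, assuming a 90-day quarter (the function name is illustrative):

```python
def error_budget_minutes(slo: float, days: int = 90) -> float:
    """Allowed downtime in minutes for an availability SLO over `days` days."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction between 0 and 1")
    return (1.0 - slo) * days * 24 * 60


if __name__ == "__main__":
    read_budget = error_budget_minutes(0.9999)
    create_budget = error_budget_minutes(0.999)
    print(f"read   99.99%: {read_budget:.2f} min/quarter")
    print(f"create 99.9%:  {create_budget:.1f} min/quarter ({create_budget / 60:.2f} h)")
```

A 90-day quarter has 129,600 minutes, so 99.99% allows about 13 minutes and 99.9% allows about 130 minutes, roughly two hours.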

10 · Trade-offs & "what would you change at 10×"

| If… | Then… |
|---|---|
| Writes 10× (100M / day) | Stronger compression (default Zstd level → level 19); shard Postgres metadata; pre-compute base62 IDs in batch. |
| Reads 10× (1B / day) | Already mostly absorbed by edge; the lever is CDN coverage and longer TTLs for public pastes (24h → 7d). |
| Strict end-to-end encryption | Client-side encrypt, server stores ciphertext blob + non-secret nonce. Loses dedupe, loses scanning; make this a paid-tier opt-in only. |
| Global writes (multi-region active-active) | Region-prefixed IDs (us-aB3x9, eu-...) avoid a coordination dance. Cross-region conflicts are impossible because each region mints IDs only under its own prefix. |
| Versioned pastes | S3 versioning is free to enable (each stored version is still billed); index versions in Postgres with a version column. List endpoint surfaces history. |
| "What would a more senior answer add?" | The legal/compliance pipeline: DMCA queue at scale, regulatory reporting (NCMEC, GDPR Article 17 deletes), the audit trail. Most candidates skip this; it's the difference between "designs systems" and "owns systems". |

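The region-prefixed IDs from the global-writes row can be sketched as below. The function names, the 7-character body, and the alphabet ordering are assumptions; the only property the design relies on is that the base62 body never contains the `-` separator, so the region can always be recovered from the ID.

```python
import secrets
import string

# base62 alphabet; contains no "-", so the region separator is unambiguous
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def new_paste_id(region: str, length: int = 7) -> str:
    """Mint a random ID under this region's prefix, e.g. 'eu-aB3x9Q2'."""
    if not region or "-" in region:
        raise ValueError("region must be non-empty and must not contain '-'")
    body = "".join(secrets.choice(BASE62) for _ in range(length))
    return f"{region}-{body}"


def region_of(paste_id: str) -> str:
    """Recover the minting region, e.g. to route a write to the paste's home region."""
    region, sep, _body = paste_id.partition("-")
    if not sep:
        raise ValueError("paste id has no region prefix")
    return region


if __name__ == "__main__":
    pid = new_paste_id("eu")
    print(pid, "->", region_of(pid))
```

Two caveats for this sketch: random bodies can still collide within a region, so the insert should retry on a primary-key conflict, and a prefixed ID of this length no longer fits the VARCHAR(8) id column in the schema.
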
Further reading

  • AWS — "Amazon S3 storage classes & lifecycle". The reference doc on tiered storage costs. Memorise the four standard tiers.
  • Cloudflare — "Workers, R2 and the egress-cost model". Useful counter-design — R2 has no egress fees, which changes the optimal CDN architecture for pastebin-shaped workloads.
  • Microsoft — "PhotoDNA Cloud Service". The standard CSAM-detection plumbing for any user-generated-content service.
  • Backblaze — "How long do hard drives last". Tangential, but useful to internalise that "object storage is cheap" is not magic — it's deduplication, erasure coding, and the bathtub curve.
  • Adjacent: URL shortener. Same shape, different storage choice. Read both back-to-back.
  • Adjacent: CDNs. The edge layer that makes pastebin economically possible.
  • Adjacent: Object storage. The S3-shaped substrate underneath.