Design a URL Shortener

A URL shortener converts long URLs into shorter aliases that redirect to the original. Services like Bit.ly and TinyURL have made this a classic system design interview question—it's accessible enough for junior candidates yet offers depth for senior discussions.

This walkthrough follows the Interview Framework and focuses on what you'd actually present in a 45-60 minute interview.

Phase 1: Requirements (~5 minutes)

Before touching the whiteboard, clarify exactly what you're building. A URL shortener sounds simple, but the requirements significantly impact the design.

Functional Requirements

Frame these as user capabilities:

Create short URL — Users submit a long URL and receive a shortened version
Redirect — Users visit a short URL and are redirected to the original
Custom aliases (optional) — Users can specify their own short code (e.g., short.ly/my-link)
Expiration (optional) — URLs expire after a set time

Keep it to 3-4 core features. Analytics, user accounts, and link editing are "below the line"—acknowledge them but defer unless the interviewer specifically asks.

Non-Functional Requirements

These define how well the system performs:

Requirement	Target	Rationale
Scale	100M new URLs/month	Typical for a major service
Read:Write Ratio	100:1	URLs are created once, clicked many times
Latency	< 100ms redirects	Users expect instant redirects
Availability	99.9%+	Broken short links damage trust
Uniqueness	Guaranteed	Each short code maps to exactly one URL

Ask clarifying questions:

"Should we prioritize availability over consistency?" — Is eventual consistency acceptable? A newly created URL might take a moment to propagate to read replicas. (Usually yes—prioritize uptime over perfect read-after-write consistency)
"Do we need the short URLs to be unpredictable?" — Can someone guess other short codes by incrementing? (Security concern if links point to private content)

Capacity Estimation

Do a quick back-of-envelope calculation to inform your design decisions:

Given: 100M new URLs/month, 100:1 read/write ratio

Writes (URL creation):

100M / (30 days × 24 hours × 3600 sec) ≈ 38.6, rounded to 40 URLs/second

Reads (redirects):

40 × 100 = 4,000 redirects/second

Storage (5-year retention):

100M URLs/month × 12 months × 5 years = 6 billion URLs
Average record size: ~500 bytes (short code + long URL + metadata)
Total: 6B × 500 bytes ≈ 3 TB

At roughly 3 TB before indexes and replication, the data fits on a single modern database server with no sharding needed. And 4,000 reads/second is well within PostgreSQL's capability with proper indexing; add a Redis cache and you'll handle 10x that easily.
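
If you want to sanity-check the arithmetic above, a few lines of Python reproduce it. This is only a sketch: the variable names are illustrative, and the 500-byte record size is the same rough assumption used in the estimate.

```python
# Back-of-envelope capacity estimate using the figures stated above.
SECONDS_PER_MONTH = 30 * 24 * 3600           # 2,592,000 seconds

new_urls_per_month = 100_000_000
read_write_ratio = 100
record_bytes = 500                           # assumed average record size
retention_years = 5

writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH
reads_per_sec = writes_per_sec * read_write_ratio
total_urls = new_urls_per_month * 12 * retention_years
storage_tb = total_urls * record_bytes / 1e12   # decimal terabytes

print(f"writes/sec ~ {writes_per_sec:.1f}")
print(f"reads/sec  ~ {reads_per_sec:.0f}")
print(f"total URLs = {total_urls:,}")
print(f"storage    ~ {storage_tb:.1f} TB")
```

The exact write rate comes out slightly under 40 per second; rounding up is fine for an interview because only the order of magnitude drives the design.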

Short code length calculation:

Using Base62 (a-z, A-Z, 0-9 = 62 characters):

6 characters: 62⁶ ≈ 57 billion combinations
7 characters: 62⁷ ≈ 3.5 trillion combinations

7 characters is more than sufficient for our scale and leaves room for growth.
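
The code-length reasoning can be checked the same way. The helper name `min_code_length` is hypothetical; the sketch simply compares the Base62 code space with the 6 billion URLs estimated earlier.

```python
ALPHABET_SIZE = 62
TOTAL_URLS = 6_000_000_000   # 5-year estimate from the capacity section


def min_code_length(n_items: int, base: int = ALPHABET_SIZE) -> int:
    """Smallest length L such that base**L >= n_items."""
    length = 1
    while base ** length < n_items:
        length += 1
    return length


for length in (6, 7):
    space = ALPHABET_SIZE ** length
    used = TOTAL_URLS / space
    print(f"{length} chars: {space:,} codes, {used:.2%} used after 5 years")

print("minimum length for 6B URLs:", min_code_length(TOTAL_URLS))
```

Six characters would technically cover 6 billion sequential IDs, so choosing seven is about headroom: it keeps the code space sparse, which matters for random or hashed codes where collisions grow as the space fills up.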

Phase 2: Data Model (~5 minutes)

Identify the key entities before jumping into APIs. This establishes shared vocabulary with your interviewer.

Core Entities

URL
├── short_code (string, 7 chars) - Primary key for lookups
├── long_url (string, 2048 chars) - Original URL
├── created_at (timestamp)
├── expires_at (timestamp, optional)
└── user_id (string, optional) - If tracking creators

User (optional)
├── id
├── email
└── api_key

Keep the data model minimal initially. You can add fields like click_count or last_accessed later during the scaling discussion. In an interview, saying "I'll start simple and add analytics fields if we have time" shows good prioritization.

Phase 3: API Design (~5 minutes)

Define the contract between clients and your system. REST is the right choice here—it maps naturally to our CRUD operations.

Create Short URL

POST /api/v1/urls
Content-Type: application/json

{
  "long_url": "https://example.com/very/long/path/to/resource",
  "custom_alias": "my-link",     // optional
  "expires_in": 86400            // optional, seconds
}

Response: 201 Created
{
  "short_url": "https://short.ly/abc1234",
  "expires_at": "2024-12-31T23:59:59Z"
}

Redirect (The Critical Path)

GET /{short_code}

Response: 302 Found
Location: https://example.com/very/long/path/to/resource

301 vs 302: Which redirect?

When your server returns a redirect, the HTTP status code tells the browser how to handle future visits:

301 (Moved Permanently):

User visits short.ly/abc1234 → Server returns 301 → Browser caches this
Next visit: Browser skips your server entirely, goes straight to destination

The browser remembers the redirect and never asks your server again (until cache expires or is cleared).

302 (Found, a temporary redirect):

User visits short.ly/abc1234 → Server returns 302 → Browser does not cache it by default
Next visit: Browser asks your server again → Server redirects again

Every click goes through your server first.

Code	Browser Caches?	Server Sees Every Click?	Best For
301	Yes	No	Static redirects where performance matters most
302	Not by default	Yes	Dynamic links, analytics, expiring URLs

Use 302 for a URL shortener. It lets us:

Track click analytics (every click hits our server)
Update the destination URL later
Expire or disable links
Block malicious destinations if needed

Phase 4: High-Level Design (~15-25 minutes)

This is the core of your interview. Start with a working design, then evolve it.

Initial Architecture

Start with the simplest design that satisfies requirements:

Components:

Load Balancer — Distributes traffic across API servers
API Servers — Stateless servers handling business logic
Database — Stores URL mappings (PostgreSQL or DynamoDB)
Cache — Redis for frequently accessed URLs

URL Shortening Flow

Walk through the data flow as you draw:

"When a user submits a long URL, the request hits our load balancer, routes to an API server. The server generates a unique short code, stores the mapping in our database, and returns the short URL."

1. Client sends POST /api/v1/urls with long_url
2. API Server validates the URL format
3. Generate unique short_code (we'll discuss how shortly)
4. Store mapping: short_code → long_url in database
5. Return short URL to client

URL Redirection Flow

"For redirects—the hot path—user visits short.ly/abc1234. We first check the cache. If it's a hit, we redirect immediately. On cache miss, we query the database, cache the result, then redirect."

1. Client sends GET /abc1234
2. Check cache for short_code
   - Cache hit: Return 302 redirect immediately
   - Cache miss: Continue to step 3
3. Query database for long_url
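4. If found and not expired:
   - Cache the mapping (with a TTL no longer than the time left until expires_at)
   - Return 302 redirect
5. If not found or expired: Return 404 (or 410 Gone for expired links)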
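
A minimal sketch of this lookup path, with plain dicts standing in for Redis and the database. The `resolve` function and the `Record` shape are hypothetical, and returning 410 Gone for an expired link is an assumption of this sketch; returning 404 would be equally defensible.

```python
import time
from typing import Optional

# (long_url, expires_at as epoch seconds, or None for links that never expire)
Record = tuple[str, Optional[float]]

cache: dict[str, Record] = {}         # stand-in for Redis
database: dict[str, Record] = {}      # stand-in for the URL table


def resolve(short_code: str, now: Optional[float] = None) -> tuple[int, Optional[str]]:
    """Return (HTTP status, Location header value or None)."""
    now = time.time() if now is None else now

    record = cache.get(short_code)               # step 2: cache lookup
    from_cache = record is not None
    if record is None:
        record = database.get(short_code)        # step 3: fall back to the database
        if record is None:
            return 404, None                     # step 5: unknown code

    long_url, expires_at = record
    if expires_at is not None and expires_at <= now:
        cache.pop(short_code, None)              # stop serving an expired link
        return 410, None                         # assumption: 410 for expired links
    if not from_cache:
        cache[short_code] = record               # step 4: populate the cache
    return 302, long_url
```

Note that the expiry check also runs on cache hits, so a link that expires while cached stops redirecting without waiting for the cache entry to age out.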

Narrating the data flow as you draw keeps the interviewer engaged and demonstrates you understand how the pieces connect—not just what they are.

Handling Custom Aliases

If the user provides a custom alias (e.g., my-link), skip the generation step entirely:

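Check if the alias already exists in the database
If available, store the custom alias as the short_code (let the primary-key constraint reject the insert if a concurrent request claims the alias first)
If taken, return an error such as 409 Conflict (don't auto-generate a fallback; the user wanted that specific alias)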

This works seamlessly with any ID generation approach since custom aliases bypass the generator entirely.
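
One way to make the alias check safe under concurrency is to let the database enforce uniqueness instead of relying on a separate check followed by an insert. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the real database; `claim_alias` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (short_code TEXT PRIMARY KEY, long_url TEXT NOT NULL)"
)


def claim_alias(alias: str, long_url: str) -> bool:
    """Try to reserve a custom alias; return False if it is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
                (alias, long_url),
            )
        return True
    except sqlite3.IntegrityError:
        # Primary-key violation: someone else owns this alias.
        return False
```

The first `claim_alias("my-link", ...)` call succeeds and any later call with the same alias returns False, which the API layer could translate into an error response.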

The Key Problem: Generating Unique Short Codes

This is the most interesting design decision. There are three main approaches:

Approach 1: Hash the Long URL

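Take a hash (MD5/SHA256) of the long URL, encode it in Base62, and use the first 7 characters. (A 7-character prefix of the usual hex digest offers only 16⁷ ≈ 268 million values, far fewer than Base62's 3.5 trillion.)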

hash("https://example.com/long/path") → "a1b2c3d..."
short_code = "a1b2c3d"

Pros:

Same long URL always produces same short code (deduplication)
No coordination needed between servers

Cons:

Collisions: Different URLs might hash to same prefix
Must check database and retry with different hash on collision
Collision resolution adds latency and complexity

Collision handling: If short_code exists, append a counter to the original URL and rehash:

hash("https://example.com/long/path" + "1") → new hash

You can use a Bloom filter to quickly check if a code might exist before hitting the database—reducing collision-check latency.
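
Here is a rough sketch of the hash approach with the counter-suffix retry described above. A dict stands in for the database, and the function names are hypothetical. One deliberate variation: instead of slicing the first 7 characters of a digest string, the sketch reduces the SHA-256 digest modulo 62⁷ and encodes the result as 7 Base62 characters, which uses the full code space.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
CODE_LENGTH = 7


def hash_code(text: str) -> str:
    """Derive a 7-character Base62 code from the SHA-256 digest of text."""
    n = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")
    n %= 62 ** CODE_LENGTH
    chars = []
    for _ in range(CODE_LENGTH):
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def shorten(long_url: str, store: dict[str, str], max_attempts: int = 10) -> str:
    """Insert long_url into store, rehashing with a counter suffix on collision."""
    for attempt in range(max_attempts):
        salted = long_url if attempt == 0 else f"{long_url}{attempt}"
        candidate = hash_code(salted)
        existing = store.get(candidate)
        if existing is None:
            store[candidate] = long_url
            return candidate
        if existing == long_url:
            return candidate           # same URL seen before: deduplicate
    raise RuntimeError("could not find a free short code")
```

In a real system the "is it free, then insert" step would need to be atomic (for example via a unique constraint), since two servers can hash different URLs to the same code at the same moment.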

Approach 2: Counter + Base62 Encoding (Recommended)

Use a global counter to generate unique numeric IDs, then convert to Base62.

counter = 1000000001
base62_encode(1000000001) → "15FTGh"   (using the alphabet 0-9, a-z, A-Z)

Pros:

Guaranteed uniqueness — No collision handling needed
Simple and predictable
Compact codes (6-7 characters for billions of URLs)

Cons:

Sequential IDs are predictable (security concern)
Counter is a single point of failure
Distributed coordination needed at scale

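Making it unpredictable: If security matters, shuffle the bits or XOR with a secret before encoding. Both are one-to-one mappings over a fixed bit width, so uniqueness is preserved while the sequence is obscured. Treat this as obfuscation rather than strong security: a fixed XOR mask can be inferred from a handful of consecutive codes, so use a keyed permutation or random codes if links guard private content.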
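
A small sketch of counter-based generation with Base62 encoding and the XOR idea mentioned above. The alphabet order (0-9, a-z, A-Z), the 41-bit ID width, and the mask value are all assumptions made for this example; 41 bits is chosen because 2⁴¹ is smaller than 62⁷, so every masked ID still fits in 7 characters.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)               # 62
ID_BITS = 41                       # 2**41 < 62**7, so masked IDs fit in 7 chars
SECRET_MASK = 0x1A2B3C4D5E6        # hypothetical secret; must be < 2**ID_BITS


def base62_encode(n: int) -> str:
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


def obfuscate(counter: int) -> int:
    """XOR is its own inverse, so distinct counters map to distinct outputs."""
    if not 0 <= counter < 2 ** ID_BITS:
        raise ValueError("counter out of range")
    return counter ^ SECRET_MASK


def short_code_for(counter: int) -> str:
    """Fixed-width 7-character code for a counter value."""
    return base62_encode(obfuscate(counter)).rjust(7, ALPHABET[0])
```

Because the mapping is reversible, `obfuscate(base62_decode(code))` recovers the original counter, which is handy for debugging. It also shows why this is obfuscation only: anyone who learns the mask can do the same.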

Approach 3: Pre-generated Key Pool

Generate random short codes in advance and store in a "key pool" database. When creating a URL, grab a key from the pool.

Pros:

No runtime generation overhead
Naturally unpredictable

Cons:

Need to manage key pool (ensure it doesn't run out)
Additional database for keys
Coordination for distributed key distribution
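
To make the key-pool idea concrete, here is an in-memory sketch. The `KeyPool` class, its watermark, and its refill size are hypothetical; a real pool would live in its own datastore and hand out keys with an atomic operation rather than a process-local lock.

```python
import secrets
import threading

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


class KeyPool:
    """In-memory stand-in for a key-pool service."""

    def __init__(self, low_watermark: int = 100, refill_size: int = 1000):
        self._unused = set()       # generated but not yet handed out
        self._used = set()         # handed out; never generated again
        self._lock = threading.Lock()
        self._low = low_watermark
        self._refill_size = refill_size
        self._refill()

    def _refill(self) -> None:
        """Add refill_size fresh random codes to the pool."""
        target = len(self._unused) + self._refill_size
        while len(self._unused) < target:
            code = "".join(secrets.choice(ALPHABET) for _ in range(7))
            if code not in self._used:
                self._unused.add(code)

    def take(self) -> str:
        """Hand out one unused code, refilling when the pool runs low."""
        with self._lock:
            if len(self._unused) <= self._low:
                self._refill()
            code = self._unused.pop()
            self._used.add(code)
            return code
```

The low watermark is what addresses the "ensure it doesn't run out" concern: the pool tops itself up before it is empty, so `take()` never has to wait on generation in the common case.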

Which Approach to Choose?

Approach	Best For	Avoid When
Hash + Collision	Deduplication matters, distributed generation	High volume (collision rate increases)
Counter + Base62	Simplicity, guaranteed uniqueness	Need unpredictable URLs
Pre-generated Pool	High security requirements	Simpler systems

For most interviews, Counter + Base62 is the sweet spot. It's simple to explain, guarantees uniqueness, and you can address predictability concerns with bit shuffling if asked.

Database Choice

SQL (PostgreSQL) works well here:

Simple schema with one main table
Strong consistency for URL creation
B-tree index on short_code for fast lookups
Mature, well-understood

NoSQL (DynamoDB) also works:

short_code as partition key
Built-in horizontal scaling
Good for read-heavy workloads

Either is fine—pick based on your experience and justify it.

CREATE TABLE urls (
    short_code VARCHAR(10) PRIMARY KEY,
    long_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count BIGINT DEFAULT 0  -- optional analytics field; omit to keep the initial model minimal
);

-- Optional: index for deduplication (only if same URL should return same short code)
-- CREATE INDEX idx_long_url ON urls(long_url);

Phase 5: Scaling & Trade-offs (~15-20 minutes)

With a working design in place, address the non-functional requirements and potential bottlenecks.

Scaling Reads with Caching

With a read-heavy workload (100:1 ratio), caching dramatically improves latency and reduces database load.

Cache Strategy:

Cache Key: short_code
Cache Value: long_url
TTL: 24 hours, capped at the time remaining until expires_at
Eviction: LRU

Use the cache-aside pattern:

Check cache first
On miss, query database
Populate cache with result
Return to client
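
The cache-aside steps, together with the TTL and LRU settings above, can be sketched in a few lines. `LruTtlCache` is a hypothetical in-process stand-in for Redis, and `load_from_db` represents whatever database query the API server would run on a miss.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class LruTtlCache:
    """Tiny in-process stand-in for Redis with LRU eviction and per-key TTL."""

    def __init__(self, capacity: int, ttl_seconds: float):
        self._items = OrderedDict()    # short_code -> (long_url, cache expiry time)
        self._capacity = capacity
        self._ttl = ttl_seconds

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires = entry
        if expires <= now:
            del self._items[key]       # TTL elapsed: treat as a miss
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key: str, value: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._items[key] = (value, now + self._ttl)
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # evict the least recently used key


def lookup(code: str, cache: LruTtlCache,
           load_from_db: Callable[[str], Optional[str]]) -> Optional[str]:
    """Cache-aside read: cache first, then database, then populate the cache."""
    url = cache.get(code)
    if url is None:
        url = load_from_db(code)
        if url is not None:
            cache.put(code, url)
    return url
```

Unknown codes are not cached in this sketch, so repeated lookups for a nonexistent code always reach the database; whether to cache negative results is a trade-off worth mentioning if the interviewer asks about enumeration attacks.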

Cache sizing:

Top 20% of URLs likely get 80% of traffic
6B URLs × 20% × 500 bytes = 600 GB

A Redis cluster can easily handle this. For even better performance, consider:

Application-level caching — In-memory cache on each API server for ultra-hot URLs
CDN caching — For globally popular links

Scaling Writes with Distributed ID Generation

If the counter becomes a bottleneck or single point of failure:

Range-based distribution: Assign ID ranges to each server.

Server 1: IDs 1-1,000,000
Server 2: IDs 1,000,001-2,000,000
Server 3: IDs 2,000,001-3,000,000

When a server exhausts its range, it requests a new range from a coordinator.

Counter batching: Each server requests batches of IDs (e.g., 1000 at a time) from a centralized Redis counter using atomic increment.

INCRBY url_counter 1000  // Returns the new counter value N; this server now owns IDs N-999 through N

Server uses the batch locally without network calls per URL.

If a server crashes with unused IDs in its batch, those IDs are "lost" but that's acceptable—we have trillions of combinations. Trade-off: efficiency vs perfect utilization.
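
The batching scheme can be sketched as follows. `CentralCounter` is an in-memory stand-in that mimics the semantics of Redis INCRBY (atomically add, return the new value); `BatchIdAllocator` is a hypothetical per-server component that only contacts the counter when its local batch is exhausted.

```python
import threading


class CentralCounter:
    """In-memory stand-in for a Redis key updated with INCRBY."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def incrby(self, amount: int) -> int:
        with self._lock:
            self._value += amount
            return self._value         # like INCRBY: the value after the increment


class BatchIdAllocator:
    """Hands out IDs locally and contacts the counter once per batch."""

    def __init__(self, counter: CentralCounter, batch_size: int = 1000):
        self._counter = counter
        self._batch_size = batch_size
        self._next = 0
        self._end = 0                  # exclusive upper bound of the current batch
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._end:
                new_top = self._counter.incrby(self._batch_size)
                self._next = new_top - self._batch_size + 1
                self._end = new_top + 1
            value = self._next
            self._next += 1
            return value
```

With a batch size of 1000, the first allocator to start receives IDs 1 through 1000 and the next receives 1001 through 2000. IDs are unique across servers but no longer strictly ordered by creation time, which is usually acceptable for short codes.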

Database Scaling

Read replicas: Route redirect queries to replicas, writes to primary. With async replication, a newly created URL might not be immediately readable—but the creator can be routed to primary for read-your-writes consistency.

Sharding (if needed): Partition by short_code hash. Since each redirect only needs one lookup by short_code, queries don't span shards.
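
As a rough illustration of hash-based partitioning, the function below maps a short code to a shard index. The shard count and the function name are hypothetical, and SHA-256 is used only because it gives a hash that is stable across processes (Python's built-in `hash()` is randomized per process for strings).

```python
import hashlib

NUM_SHARDS = 8   # hypothetical shard count


def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index in [0, num_shards)."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Every redirect for a given code lands on the same shard, so a lookup touches exactly one database. The cost of plain modulo placement is that changing `num_shards` moves most keys, which is worth naming as a trade-off if the discussion reaches resharding.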

High Availability

Component	Strategy
Load Balancer	Multiple LBs with DNS failover
API Servers	Stateless, auto-scaling group
Database	Primary-replica with automatic failover
Cache	Redis Cluster with replication
ID Generator	Redis with persistence, or range-based

Rate Limiting

Protect against abuse:

Limit URL creation per IP/API key (e.g., 100/minute)
Limit redirects per IP to prevent bot enumeration and server overload
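
A per-client limit like the 100/minute example can be sketched with a sliding-window log. This is an in-process illustration with a hypothetical class name; in a multi-server deployment the counters would need to live in a shared store such as Redis.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window_seconds`."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0):
        self._limit = limit
        self._window = window_seconds
        self._hits = defaultdict(deque)   # client key -> timestamps of recent requests

    def allow(self, client: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client]
        while hits and hits[0] <= now - self._window:
            hits.popleft()                # drop requests that left the window
        if len(hits) >= self._limit:
            return False                  # caller would respond with 429 Too Many Requests
        hits.append(now)
        return True
```

The client key can be an IP address for anonymous traffic or an API key for authenticated callers; rejected requests are not recorded, so a blocked client regains capacity as soon as its older requests age out of the window.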

Monitoring & Observability

Mention these proactively:

Latency percentiles (p50, p99 for redirects)
Cache hit rate (target > 90%)
Error rates by endpoint
ID range depletion alerts (if using range-based distribution)

Common Pitfalls

Over-engineering early — Don't propose sharding before you've established the scale requires it. Our capacity estimation showed 3 TB fits on one machine. Start simple.

Ignoring collision handling — If you choose the hash approach, you must explain how you handle collisions. "We'll just retry with a different hash" is insufficient—explain how you generate a different hash.

Forgetting the hot path — Redirects are 100x more frequent than creates. Optimize for reads first. Your caching strategy matters more than your database choice.

Not discussing trade-offs — Every decision has trade-offs. 301 vs 302, SQL vs NoSQL, hash vs counter—state what you're gaining and giving up.

Interview Checklist

Before wrapping up, verify you've covered:

Requirements Phase

3-4 functional requirements identified
Scale, latency, availability clarified
Quick capacity estimation completed

Data Model

Key entities identified with basic attributes
Primary key strategy established (short_code)

API Design

Each requirement maps to an endpoint
Request/response formats defined
Redirect status code choice justified (302)

High-Level Design

Architecture diagram with data flow
Short code generation approach explained
Database choice justified

Scaling & Trade-offs

Caching strategy for read-heavy workload
ID generation scaling addressed
At least one trade-off discussed explicitly

Summary

Aspect	Recommendation	Rationale
Short code length	7 characters, Base62	3.5 trillion combinations
ID generation	Counter + Base62	Simple, guaranteed unique
Database	PostgreSQL or DynamoDB	Either works for this scale
Caching	Redis with cache-aside	Essential for read-heavy workload
Redirect code	302 Temporary	Enables analytics and updates
Availability	Replicated DB + Redis cluster	No single points of failure

The URL shortener is an excellent interview question because it appears simple but reveals depth in ID generation, caching strategy, and scalability reasoning. Focus on explaining why you make each decision, not just what you're building.