Database Replication - Achieving Availability and Scalability | Concepts

What is Replication

Replication is a mechanism that copies and synchronizes database data across multiple servers. It’s used for improving availability, enhancing read performance, and disaster recovery.

Why it’s needed: A single database server becomes a Single Point of Failure (SPOF). Replication allows service continuity during failures and distributes read queries.

Basic Replication Configurations

Master-Slave (Primary-Replica)

flowchart TB
    Writes["Writes"] --> Master["Master<br/>(Primary)"]
    Master -->|Replication| Slave1["Slave 1<br/>(Replica)"]
    Master -->|Replication| Slave2["Slave 2<br/>(Replica)"]
    Master -->|Replication| Slave3["Slave 3<br/>(Replica)"]
    Slave1 --> Reads["Reads"]
    Slave2 --> Reads
    Slave3 --> Reads

Master (Primary): Main server accepting writes
Slave (Replica): Copies master data, handles reads

Multi-Master

flowchart TB
    Master1["Master 1"] <-->|Sync| Master2["Master 2"]
    Master1 --> RW1["Read/Write"]
    Master2 --> RW2["Read/Write"]

Multiple masters accept writes. Conflict resolution becomes a challenge.

Synchronous vs Asynchronous Replication

Synchronous Replication

flowchart LR
    Client --> Master --> Slave["Slave<br/>(Wait for ACK)"] --> Response["Response to Client"]

Advantages	Disadvantages
No data loss	Increased latency
Strong consistency	Writes stop on slave failure

Asynchronous Replication

flowchart LR
    Client --> Master --> Response["Response to Client"]
    Master -.->|Applied later| Slave

Advantages	Disadvantages
Low latency	Replication lag
No impact from slave failure	Potential data loss

Semi-synchronous Replication

Commit is confirmed once at least one slave receives the data.

flowchart LR
    Client --> Master --> Slave1["Slave 1<br/>(ACK)"] --> Response["Response to Client"]
    Master -.->|Applied later| Slave2["Slave 2"]

Replication Methods

Statement-based

Replicates SQL statements themselves.

-- Executed on master
INSERT INTO users (name, created_at) VALUES ('Alice', NOW());

-- Same SQL executed on slave
-- Problem: NOW() might return different results

Row-based

Replicates the actual row data that changed.

Before	After
`{id: 1, name: 'Alice'}`	`{id: 1, name: 'Bob'}`

This diff is applied to slave.

Mixed Mode

Uses statement-based normally, row-based for non-deterministic functions.

Replication Lag

The delay until a slave catches up with the master.

Server	Transactions	Status
Master	1, 2, 3, 4, 5	✓ Complete
Slave	1, 2, 3	✓ Lag: 2 transactions

Cases Where Lag Causes Problems

// Read immediately after write
await db.master.query('UPDATE users SET name = ? WHERE id = ?', ['Bob', 1]);

// Read from slave → May return old data
const user = await db.slave.query('SELECT * FROM users WHERE id = ?', [1]);
// user.name might still be 'Alice'!

Solutions

Read Your Writes: Read from master immediately after writing
Causal Consistency: Track write timestamps
Synchronous Replication: For critical data only

Failover

When the master fails, a slave is promoted to become the new master.

Automatic Failover

flowchart TB
    Step1["1. Detect master failure<br/>Monitoring system detects no response from master"]
    Step2["2. Select slave<br/>Choose the slave with most recent replication progress"]
    Step3["3. Promotion<br/>Promote selected slave to master"]
    Step4["4. Switch connections<br/>Update application connections to new master"]
    Step1 --> Step2 --> Step3 --> Step4

Failover Considerations

Challenge	Solution
Split brain	Fencing (force stop old master)
Data loss	Unapplied transactions in async replication
Connection switching	Virtual IP, DNS update, proxy

Common Implementations

MySQL

-- Master configuration
CHANGE MASTER TO
  MASTER_HOST='master.example.com',
  MASTER_USER='repl_user',
  MASTER_PASSWORD='password',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;

START SLAVE;

PostgreSQL

# postgresql.conf (master)
wal_level = replica
max_wal_senders = 3

# recovery.conf (slave)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432'

Managed Services

Service	Replica Features
Amazon RDS	Read Replicas, Multi-AZ
Cloud SQL	Read Replicas, High Availability
Azure SQL	geo-replication

Read Scaling Pattern

flowchart LR
    App --> Master["Master<br/>(Writes)"]
    App --> LB["Load Balancer"]
    LB --> R1["Replica 1"]
    LB --> R2["Replica 2"]
    LB --> R3["Replica 3"]
    R1 --> Reads["Reads"]
    R2 --> Reads
    R3 --> Reads

Load balancer distributes read queries across replicas.

Summary

Database replication is a foundational technology for achieving high availability and read scalability. By properly designing synchronous/asynchronous choices, addressing replication lag, and planning failover strategies, you can build robust database systems.

← Back to list