ClickHouse vs. Postgres: When to Move Your Logs Out of a Relational DB

Reading part 10 of Deep Dive

Postgres is the most reliable tool in my stack. It handles users, configurations, and complex relations without breaking a sweat. But databases, like any physical system, have a “design limit”. For Postgres, that limit usually appears when you try to use it as a dumping ground for high-velocity logs and metrics.

When your ANALYZE commands start taking minutes and your indexes consume more RAM than your data, you aren’t facing a Postgres bug; you’re facing an architectural mismatch.

The Write-Heavy Wall

Relational databases are designed for OLTP (Online Transactional Processing). They prioritize consistency, ACID compliance, and point-lookups. Logs, however, are OLAP (Online Analytical Processing) data. They are immutable, sequential, and usually queried in broad sweeps rather than by individual IDs.

In Postgres, every new log entry must be written to the Write-Ahead Log (WAL) and then inserted into a B-Tree index. As the table grows, the cost of maintaining those B-Trees during high-concurrency writes creates a “latency floor” that you simply cannot drop below.

Deep Dive // B-Tree vs. LSM-Tree

B-Trees (Postgres): Optimized for reading small amounts of data quickly. Every write requires finding the correct spot in a tree, which can lead to random disk I/O as the tree grows larger than memory.

LSM-Trees (ClickHouse): Optimized for high-volume writes. It appends data to sorted “parts” in memory and flushes them to disk sequentially. Background “merges” combine these parts later. This turns random I/O into sequential I/O, which is where modern SSDs (and even HDDs) truly shine.

The “Write Amplification” Problem

In Postgres, if you have a table with 5 indexes, every single INSERT must update those 5 separate B-Trees. This is Write Amplification. For a log-heavy system processing 5,000 events/sec, your disk is working 5x harder just to keep the indexes “ready”, even if you only query those logs once a week.

Why ClickHouse?

Moving logs to ClickHouse isn’t just about “faster writes”. It’s about changing how you think about data storage.

1. Columnar Compression

In Postgres, data is stored in rows. If you have 50 columns but only query 2 (timestamp and error_code), Postgres still has to read the whole row from disk. ClickHouse stores each column in its own file.

Tip

Because similar data is stored together in columns (e.g., thousands of “200 OK” strings), ClickHouse can use specialized algorithms like ZSTD or Delta encoding to achieve 10x to 100x compression ratios.

2. The Power of Materialized Views

One of ClickHouse’s “superpowers” is the ability to aggregate data on ingestion. Instead of running a heavy COUNT(*) query every time you refresh a dashboard, ClickHouse can maintain a running total in a background table as the data arrives using the SummingMergeTree or AggregatingMergeTree engines.

sql

CREATE MATERIALIZED VIEW hourly_errors_mv
ENGINE = SummingMergeTree
AS SELECT
toStartOfHour(timestamp) as hour,
count() as error_count
FROM logs
WHERE status >= 400
GROUP BY hour;

I don’t want to make this sound like a free lunch. Moving to ClickHouse introduces “Operational Tax”:

No Real Updates: ClickHouse is for immutable data. UPDATE and DELETE commands exist (as ALTER TABLE ... UPDATE), but they are heavy, asynchronous background operations. If you need to frequently edit your data, stay with Postgres.
Join Performance: While ClickHouse can do joins, it isn’t its primary strength. It prefers large, “denormalized” tables.
Consistency Models: Postgres gives you strict ACID. ClickHouse is “eventually consistent” during background merges. For logs, this is fine. For a bank balance, it is a disaster.

If your logs table is under 5 million rows and your write rate is low, stay with Postgres. The operational simplicity of having one database outweighs the performance gains of ClickHouse.

However, if you’re seeing any of the following, it’s time to move:

VACUUM processes are struggling to keep up with dead tuples (bloat).
Your log indexes are larger than the actual log data.
Simple aggregations (like “average response time per hour”) are timing out.
You are spending more on RDS storage than on actual compute.

The goal is to use the right tool for the job. Postgres remains my “Source of Truth” for application state: users, sessions, and relationships. But for the firehose of observability data, ClickHouse has reclaimed our performance headroom and slashed our storage bills.

The Write-Heavy Wall

The “Write Amplification” Problem

Why ClickHouse?

1. Columnar Compression

2. The Power of Materialized Views

The Internet is Breaking.

Welcome to the Fold.

Welcome to the Fold.