Serverless data systems

Scaling models and multi-tenant data systems - ASDS Chapter 6

Scaling models and multi-tenant data systems - ASDS Chapter 6

What is scaling in large-scale multi-tenant data systems, and how does that compare to single-tenant data systems? How does per-tenant scaling relate to system-wide scaling? How do scale-to-zero and cold starts come into play? Answering these questions is chapter 6 of The Architecture of Serverless Data Systems.

Rather than looking at a new system, this chapter focuses on some patterns that emerge after analyzing the first five systems. So far, I’ve included storage-oriented systems (Amazon DynamoDB, Confluent Cloud’s Kora engine), OLTP databases (Neon, CockroachDB serverless), and an OLAP database (ClickHouse Cloud). The workloads across these systems diverge a lot, so their architectural decisions also diverge, but some patterns emerge.

Serverless ClickHouse Cloud - ASDS Chapter 5 (part 2)

Serverless ClickHouse Cloud - ASDS Chapter 5 (part 2)

In part 1 we looked at the open-source ClickHouse architecture, in part 2 we dive into the serverless architecture of ClickHouse Cloud.

The serverless ClickHouse architecture

ClickHouse Cloud is an entirely serverless offering of ClickHouse that uses a new MergeTree engine called the SharedMergeTree. Where a traditional on-premise ClickHouse cluster is composed of a set of monolithic servers that do both query processing and storage, CH Cloud servers separate storage and compute, with the servers being stateless compute instances and storage served solely by cloud object storage. The SharedMergeEngine is at the core of this serverless architecture.

Serverless ClickHouse Cloud - ASDS Chapter 5 (part 1)

Serverless ClickHouse Cloud - ASDS Chapter 5 (part 1)

See The Architecture of Serverless Data Systems introduction chapter to find other serverless data systems. This chapter is the first system of group 3 - the analytics database group.

ClickHouse is an open-source, column-oriented, distributed (real-time) OLAP database management system. ClickHouse Cloud offers serverless Clickhouse clusters, billed according to the compute and storage resources consumed. 

Before we jump into the architecture of ClickHouse Cloud, it’s worth understanding the architecture of a standard ClickHouse cluster first. The storage architecture of ClickHouse plays a critical role in how the serverless version was designed. Part 1 of this chapter will focus on the standard, fully open-source ClickHouse, and part 2 will focus on the serverless architecture of ClickHouse Cloud.

Serverless CockroachDB - ASDS Chapter 4 (part 3)

Serverless CockroachDB - ASDS Chapter 4 (part 3)

In part 3, the focus is on heat management (the mitigation of hot spots in the storage layer) and autoscaling in the compute layer. In part 2 we looked at the Admission Control sub-system of CRDB and how it helps node overload and noisy neighbors. CRDB uses the combination of sharding, shard movement, and lease distribution to avoid hot spots in its shared storage layer.

Serverless CockroachDB - ASDS Chapter 4 (part 1)

Serverless CockroachDB - ASDS Chapter 4 (part 1)

CockroachDB is a distributed SQL database that aims to be Postgres-compatible. Over the years, the Postgres wire protocol has become a standard of sorts with many database products implementing its wire protocol (much like the Apache Kafka protocol has become a de facto standard in the streaming space).

While it may be Postgres-compatible, there is almost nothing about the serverless CockroachDB architecture that is shared with Neon (serverless Postgres covered in the previous chapter). What they do share, like all the serverless multi-tenant systems in this series, is the separation of storage and compute; the rest is completely different.

Neon - Serverless PostgreSQL - ASDS Chapter 3

Neon - Serverless PostgreSQL - ASDS Chapter 3

Neon is a serverless Postgres service based on an architecture similar to Amazon Aurora. It separates the Postgres monolith into disaggregated storage and compute. The motivation behind this architecture is four-fold:

  • Aim to deliver the best price-performance Postgres service in the world.

  • Use modern replication techniques to provide high availability and high durability to Postgres.

  • Simplify the life of developers by bringing the serverless consumption model to Postgres.

  • Do all this while keeping the majority of Postgres unchanged. Rather than building a new Postgres-compatible implementation, simply leverage the pluggable storage layer to provide all the above benefits while keeping Postgres Postgres.

Kora - Serverless Kafka - ASDS Chapter 2

Kora - Serverless Kafka - ASDS Chapter 2

This is the second instalment of the Architecture of Serverless Data Systems. In the first post, I covered DynamoDB, and I will refer back to it where comparison and contrast are interesting.

Kora is the multi-tenant serverless Kafka engine inside Confluent Cloud. It was designed to offer virtual Kafka clusters on top of shared physical clusters, based on a heavily modified version of Apache Kafka. Today, as little as 20% of the code is shared with the open-source version as the demands of large-scale multi-tenant systems diverge from the needs of single-tenant clusters.

The goals of Kora were to avoid stamping cookie-cutter single-tenant Kafka clusters which would miss out on the economic and reliability benefits of large-scale multi-tenancy. This architecture is evolving fast, and a year or two from now, this description will likely be stale as we continue to disaggregate the architecture from the original Kafka monolith.

Amazon DynamoDB - ASDS Chapter 1

Amazon DynamoDB - ASDS Chapter 1

DynamoDB is a serverless, distributed, multi-tenant NoSQL KV store that was designed and implemented from day one as a disaggregated cloud-native data system.

The goals were to build a multi-tenant system that had the following properties:

  • Consistent performance with low single-digit latency.

  • Obtain high resource utilization through multi-tenancy in order to reduce costs which could be passed on to customers.

  • Unbounded size of tables where the size does not affect the performance.

  • Support for ACID transactions across multiple operations and tables.