November 21, 2023

Serverless CockroachDB - ASDS Chapter 4 (part 3)

In part 3, the focus is on heat management (the mitigation of hot spots in the storage layer) and autoscaling in the compute layer. In part 2 we looked at the Admission Control sub-system of CRDB and how it helps mitigate node overload and noisy neighbors. CRDB uses a combination of sharding, shard movement, and lease distribution to avoid hot spots in its shared storage layer.

Jack Vanlightly

November 21, 2023

Serverless data systems

Serverless CockroachDB - ASDS Chapter 4 (part 2)

In part 1 we covered the basics of the CockroachDB (CRDB) single-tenant architecture and the high-level changes for building the multi-tenant serverless architecture. In parts 2 and 3, I’ll start focusing more narrowly on the tenant isolation and scaling mechanisms in multi-tenant serverless CRDB.

Jack Vanlightly

November 21, 2023

Serverless data systems

Serverless CockroachDB - ASDS Chapter 4 (part 1)

CockroachDB is a distributed SQL database that aims to be Postgres-compatible. Over the years, the Postgres wire protocol has become a standard of sorts, with many database products implementing it (much like the Apache Kafka protocol has become a de facto standard in the streaming space).

While it may be Postgres-compatible, there is almost nothing about the serverless CockroachDB architecture that is shared with Neon (serverless Postgres covered in the previous chapter). What they do share, like all the serverless multi-tenant systems in this series, is the separation of storage and compute; the rest is completely different.

Jack Vanlightly

November 15, 2023

Serverless data systems

Neon - Serverless PostgreSQL - ASDS Chapter 3

Neon is a serverless Postgres service based on an architecture similar to Amazon Aurora. It separates the Postgres monolith into disaggregated storage and compute. The motivation behind this architecture is four-fold:

Aim to deliver the best price-performance Postgres service in the world.
Use modern replication techniques to provide high availability and high durability to Postgres.
Simplify the life of developers by bringing the serverless consumption model to Postgres.
Do all this while keeping the majority of Postgres unchanged. Rather than building a new Postgres-compatible implementation, simply leverage the pluggable storage layer to provide all the above benefits while keeping Postgres Postgres.

Jack Vanlightly

November 14, 2023

Serverless data systems

Kora - Serverless Kafka - ASDS Chapter 2

This is the second instalment of the Architecture of Serverless Data Systems. In the first post, I covered DynamoDB, and I will refer back to it where comparison and contrast are interesting.

Kora is the multi-tenant serverless Kafka engine inside Confluent Cloud. It was designed to offer virtual Kafka clusters on top of shared physical clusters, based on a heavily modified version of Apache Kafka. Today, as little as 20% of the code is shared with the open-source version as the demands of large-scale multi-tenant systems diverge from the needs of single-tenant clusters.

The goal of Kora was to avoid stamping out cookie-cutter single-tenant Kafka clusters, which would miss out on the economic and reliability benefits of large-scale multi-tenancy. This architecture is evolving fast, and a year or two from now, this description will likely be stale as we continue to disaggregate the architecture from the original Kafka monolith.

Jack Vanlightly

November 14, 2023

Serverless data systems

Amazon DynamoDB - ASDS Chapter 1

DynamoDB is a serverless, distributed, multi-tenant NoSQL KV store that was designed and implemented from day one as a disaggregated cloud-native data system.

The goals were to build a multi-tenant system that had the following properties:

Consistent performance with low single-digit millisecond latency.
High resource utilization through multi-tenancy, in order to reduce costs, with the savings passed on to customers.
Unbounded size of tables where the size does not affect the performance.
Support for ACID transactions across multiple operations and tables.