Serverless Data Systems

Tableflow: the stream/table, Kafka/Iceberg duality

Confluent just announced Tableflow, the seamless materialization of Apache Kafka topics as Apache Iceberg tables. It has to be the most impactful announcement I’ve witnessed while at Confluent. This post is about why Iceberg tables aren’t just another destination to sync data to; they fundamentally change the world of streaming. It’s also about the macro trends that have led us to this point and why Iceberg (and the other table formats) are so important to the future of streaming.

The Architecture of Serverless Data Systems

I recently blogged about why I believe the future of cloud data services is large-scale and multi-tenant, citing, among others, S3. 

“Top tier SaaS services like S3 are able to deliver amazing simplicity, reliability, durability, scalability, and low price because their technologies are structurally oriented to deliver those things. Serving customers over large resource pools provides unparalleled efficiency and reliability at scale.” So I said in that post.

To further explore this topic, I am surveying real-world serverless, multi-tenant data architectures to understand how different types of systems, such as OLTP databases, real-time OLAP, cloud data warehouses, event streaming systems, and more, implement serverless multi-tenancy.