This is a data system internals blog post. So if you enjoyed my table formats internals blog posts, or writing on Apache Kafka internals or Apache BookKeeper internals, you might enjoy this one. But beware, it’s long and detailed. Also note that I work for Confluent, which also runs Apache Flink but does not run nor contributes to Apache Fluss. However, this post aims to be a faithful and objective description of Fluss.
Apache Fluss is a table storage engine for Flink being developed by Alibaba in collaboration with Ververica. To write this blog post, I reverse engineered a high level architecture by reading the Fluss code from the main branch (and running tests), in August 2025. This follows my same approach to my writing about Kafka, Pulsar, BookKeeper, and the table formats (Iceberg, Delta, Hudi and Paimon) as the code is always the true source of information. Unlike the rest, I have not had time to formally verify Fluss in TLA+ or Fizzbee, though I did not notice any obvious issues that are not already logged in a GitHub issue.
Let’s get started. We’ll start with some high level discussion in the Fluss Overview section, then get into the internals in the Fluss Cluster Core Architecture and Fluss Lakehouse Architecture sections.