S3 Express One Zone, not quite what I hoped for — Jack Vanlightly

S3 Express One Zone, not quite what I hoped for

AWS just announced a new lower-latency S3 storage class and for those of us in the data infrastructure business, this is big news. It’s not a secret that a low-latency object storage primitive has the potential to change how we build cloud data systems forever. So has this new world arrived with S3 Express One Zone?

The answer is no, but this is a good time to talk about cloud object storage, its role in modern cloud data systems and the role it could take in the future.

Please note that when I say S3, I’m talking about cloud object storage in general, unless I specifically mention S3 Express One Zone (Express 1Z).

The holy grail of cloud storage

The holy grail of storage is simply:

  1. Cheap.

  2. Durable and consistent.

  3. Low-latency and high throughput.

S3 Standard can be cheap and most definitely is highly durable. Its Achilles’ heel is the high, unpredictable latency. Cheap, durable storage makes it the best place to store large volumes of data, and many systems today already do that. However, the high latency is a problem and, depending on the workload, data system builders must jump through many hoops to integrate S3 into the architecture, benefiting from the economical storage while dodging the latency bullet.

I wrote about this topic in my opening post on The Architecture of Serverless Data Systems. Quoting myself:

Engineers can choose to include object storage in their low-latency system but counter the latency issues of object storage by placing a durable, fault-tolerant write-cache and predictive read-cache that sits in front of the slower object storage. This durable write-cache is essentially a cluster of servers that implement a replication protocol and write data to block storage. In the background, the cluster uploads data asynchronously to object storage obeying the economic pattern of writing fewer, larger files.
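To make the "fewer, larger files" pattern concrete, here is a minimal sketch in Python. It is not how Kora, Neon or any other real system does it: `InMemoryObjectStore` and `BatchingWriteCache` are hypothetical names, the object store is a plain dictionary standing in for S3, and the replication protocol and block storage are reduced to a comment. The flush also runs inline for simplicity, where a real cluster would upload in the background.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage client (a dict, not S3)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}
        self.put_count = 0

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.put_count += 1


@dataclass
class BatchingWriteCache:
    """Buffers small writes and uploads them as fewer, larger objects."""

    store: InMemoryObjectStore
    flush_threshold_bytes: int = 1024
    _buffer: list[bytes] = field(default_factory=list)
    _buffered_bytes: int = 0
    _next_segment: int = 0

    def write(self, record: bytes) -> None:
        # Assumption: in a real system the record would be replicated to
        # other servers and persisted to block storage before it is
        # acknowledged. That step is omitted in this sketch.
        self._buffer.append(record)
        self._buffered_bytes += len(record)
        if self._buffered_bytes >= self.flush_threshold_bytes:
            self.flush()

    def flush(self) -> None:
        # Upload everything buffered so far as a single object.
        if not self._buffer:
            return
        key = f"segment-{self._next_segment:08d}"
        self.store.put(key, b"".join(self._buffer))
        self._next_segment += 1
        self._buffer.clear()
        self._buffered_bytes = 0
```

With a threshold of 100 bytes, 25 writes of 10 bytes each result in two uploads of 100 bytes, plus a third upload for the remainder once `flush()` is called, rather than 25 separate PUT requests.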

Fig 1. The data systems from chapters 1-5 of Architecture of Serverless Data Systems, and their choices for integrating cloud object storage.

Building a replication layer in front of S3 is, frankly, a lot of work, but many systems take this approach because they want all three properties from their storage layer. This is actually what Kora (chapter 2) and Neon (chapter 3) chose to implement: a highly available, fault-tolerant, write-through, read-through cache in front of S3. In the case of Kora, this replicated cache is formed by the Kora brokers (Kafka replication protocol); in the case of Neon, it is the Safekeepers (Multi-Paxos) and Pageservers.

There are downsides to this approach:

  1. Implementing Paxos, Raft or the Kafka replication protocol is a tremendous amount of work.

  2. Replicating data across availability zones is expensive.

But there is a major upside: these replicated, fault-tolerant caches provide predictable, low-latency operations. They make some workloads possible that would not be possible with S3 Standard alone. This upside is the difference between having a viable infrastructure business and not.

Of course, some systems are able to speak only to S3, such as ClickHouse Cloud (chapter 5), but CHC has much looser latency and data consistency requirements. For CHC, S3 Standard makes an excellent primary (and only) data store. Those of us who build systems with tighter constraints are still waiting for the holy grail: an S3 storage class that is cheap, durable AND low-latency.

Fig 2. The holy grail is still out there.

Express One Zone does not tick the “cheap” checkbox, so the builders of replicated, fault-tolerant caches can continue to sleep easy, knowing their implementations are still relevant and highly valuable.

Which data systems will build on S3 Express One Zone?

As of today, S3 Express One Zone is not so compelling to data systems that offer low-latency workloads using replicated caches. The cost of Express 1Z is very similar to that of the existing cross-AZ charges, and yet most data system vendors can apply significant discounts to their cross-AZ charges. Some choose to deploy to a single zone, avoiding the cross-AZ charges completely, which makes the cost imbalance between the replicated cache and this S3 storage class even greater.

So it doesn’t make much sense to replace replication with S3, as it would end up far more expensive. Then there’s the cost of the storage itself, which is 8 times more expensive than S3 Standard.

However, a low-latency S3, while expensive, might be attractive to a new start-up. Not having to build the replication layer is a significant time saver, and if the start-up can eat the higher costs of this storage class, then maybe it makes sense.
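The shape of that comparison can be sketched as a few lines of Python. The only number taken from this post is the 8x storage multiplier. Every price and discount passed into these functions is a placeholder in arbitrary units, not a real AWS price, and the function names are made up for illustration.

```python
EXPRESS_STORAGE_MULTIPLIER = 8  # from this post: 8x the S3 Standard storage price


def replicated_cache_transfer_cost(
    gb_written: float,
    cross_az_cost_per_gb: float,
    discount: float = 0.0,
    single_zone: bool = False,
) -> float:
    """Cross-AZ cost of replicating writes (placeholder units)."""
    if single_zone:
        # A single-zone deployment avoids cross-AZ charges completely.
        return 0.0
    return gb_written * cross_az_cost_per_gb * (1.0 - discount)


def express_write_cost(gb_written: float, express_cost_per_gb: float) -> float:
    """Cost of writing the same data to Express 1Z (placeholder units)."""
    return gb_written * express_cost_per_gb


def express_storage_cost(gb_stored: float, standard_price_per_gb: float) -> float:
    """Storage cost on Express 1Z, given a placeholder S3 Standard price."""
    return gb_stored * standard_price_per_gb * EXPRESS_STORAGE_MULTIPLIER
```

If the two per-GB list prices are similar, as described above, any discount on the cross-AZ charges tips the balance towards replication, and a single-zone deployment tips it further. For example, with both placeholder prices set to 1.0 and a hypothetical 50% discount, `replicated_cache_transfer_cost(100, 1.0, discount=0.5)` comes to half of `express_write_cost(100, 1.0)`.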

Fig 3. System builders have a 4th choice for integrating cloud object storage.

My feeling on S3 Express One Zone is that it’s the right technology at the right time, with the wrong price. The holy grail of cloud storage is still out there; the only question is how long we’ll have to wait.
