Originally published on July 8, 2020
Yesterday I read an analyst report projecting that the serverless architecture market will reach $21B by 2025. I also recently met with Alex DeBrie, author of The DynamoDB Book, and enjoyed learning about his serverless philosophy. He wrote a great post about the key factors for choosing serverless databases here, and we had a fascinating conversation about serverless indexing systems that complement them. Last week Bob Muglia, newly appointed executive chairman of FaunaDB, wrote an equally interesting article about how client-serverless is essentially the fourth generation of the application model.
From Datacenter to Cloud
During my early days at VMware, we spent a lot of time thinking about admin controls. Why? Because enterprise IT teams always asked us for better ways to manage datacenter infrastructure and control spend. Along came AWS, turning the model on its head, unlocking developer agility through self-service. The recent era of cloud agility saw companies migrate their existing stacks to the cloud, but there is only so much you can do when you migrate software built for data centers into the cloud. Lift-and-shift cloud migration is a bad idea – you end up bringing along all the existing datacenter complexity and trying to force fit your software stack to function in an entirely new environment.
From Cloud to Serverless
If you think about it, what is the point of manually sizing clusters, provisioning servers and managing cloud infrastructure when your software is the best judge of exactly what resources it needs at any given point? Manually configuring software in the cloud adds a lot of ops overhead, rests on a bunch of sizing assumptions, leaves compute and storage over-provisioned, and still causes operational fires when things start to scale.
This is why the world is moving from cloud-hosted architecture to serverless architecture – it’s the next generation of cloud infrastructure services, one that automates capacity planning, deployment and scaling. The result is that your software is easier to build and maintain, and far more cost-efficient. No wonder the JAM stack and GraphQL are all the rage today. But what is the ideal data stack for serverless architectures?
Serverless Data Stack for Low Ops, High Velocity Teams
A data management system is serverless if one can load data, persist data, and run queries simply by using a data API – without ever having to think about servers. Some of the key aspects of a serverless data management system are:
- No database connections – Users shouldn’t have to manage database connections. It should be accessible via data APIs.
- No provisioning – Users shouldn’t have to choose what type of hardware to provision for their datastore.
- No capacity planning – Users shouldn’t need to plan cluster capacity at any point during the lifetime of the application.
- No scaling limits – Users shouldn’t have to worry about hitting a wall as their data footprint grows. It should feel like it’s infinitely scalable and limitless.
- No server maintenance – Users shouldn’t have to think about security patching, upgrading dependent modules, or monitoring servers—all the tasks required to support 24 x 7 server uptime.
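To make the "no database connections" idea a bit more concrete, here is a minimal sketch of what talking to a data API over plain HTTPS could look like. The base URL, the API key, the `collections/.../docs` and `queries` paths, and the payload shapes are all hypothetical and do not belong to any particular product; the point is only that each call is a stateless request, with no driver, connection pool or cluster endpoint to manage.

```python
import json
import urllib.request

# Hypothetical endpoint and key, for illustration only.
BASE_URL = "https://data-api.example.com/v1"
API_KEY = "YOUR_API_KEY"


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a stateless HTTPS request; no connection pool or driver is involved."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def load_documents(collection: str, documents: list) -> urllib.request.Request:
    """Prepare a request that loads JSON documents into a (hypothetical) collection."""
    return build_request(f"collections/{collection}/docs", {"data": documents})


def run_query(sql: str) -> urllib.request.Request:
    """Prepare a request that runs a query through the same (hypothetical) API."""
    return build_request("queries", {"sql": sql})


if __name__ == "__main__":
    req = load_documents("orders", [{"id": 1, "total": 42.5}])
    print(req.get_method(), req.full_url)
    # Sending it would be urllib.request.urlopen(req) against a real service.
```

The sketch only builds the requests rather than sending them, since the endpoint is made up. With a real service, the application code would look much the same whether it handles ten requests a day or ten million, which is the whole appeal.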
When you’re thinking about your transactional database, there are some popular serverless options you should consider, including DynamoDB, Aurora Serverless and FaunaDB. But what about your entire data architecture – what about the other data stores you need, and how do you serve your BI and apps? Your data stack is 10x more streamlined when you combine the low-ops approach of serverless with the flexibility of a NoSQL data model. This type of modern data stack in the cloud uses a serverless transactional database for OLTP, a data stream for events, a data lake with a query engine for BI and a real-time indexing layer for serving applications. Here is a reference architecture – notice that DynamoDB, Kafka, S3, Athena and Rockset are all JSON-compatible serverless data stores, so you have a flexible-schema, low-ops data stack that unleashes developer agility like never before.
What parts of your current application stack are serverless? How much of your data architecture has gone serverless? In case you’re new to serverless, here is a curated list of awesome serverless events happening around you.