In this series we'll build a set of APIs with ASP.NET Core 2.0 in a cloud native, microservices architecture that can satisfy a long list of non-functional requirements.
So what does cloud native actually mean? It means that you don't know where your applications are going to run anymore. Your applications need to be able to work out how they are configured and integrate themselves into the wider system of services and start working. Why is that a desirable thing? Because it means that we can stand them up and down at will, without need for special system configurations. That means we can create copies of them for scaling out and we gain resiliency because when one copy of an application dies we just start up another one.
The only exceptions are the services that manage our state: our Redis clusters, SQL Server, MongoDB etc. We still need to treat those as our beloved pets and look after them well, but the rest is just cattle.
There are many advantages and dangers of microservices. But much has been learned over the last few years and the tools and platforms that are available today make microservices more and more attractive.
So what are the benefits of microservices?
Speed of change - multiple teams can work on the system without treading on each other's toes. Teams are decoupled and can operate on different release cadences without affecting each other, as long as each service continues to honour its contract.
Freedom to choose different technologies. Perhaps you're a C# shop but you have a top priority project that fits Python better and you happen to have a few top quality Python developers available? When services interact over HTTP and messaging platforms you can do that. You can forget about that with your monolith.
Scaling. Each service can scale independently according to needs. What's more, you can get better parallelization of work as you fire off requests to multiple different microservices hosted on other servers.
Sounds great, but it's not all unicorns and rainbows. Now you've got 20 services and a few copies of each one, so you've got 100 things to run. What's worse, you've got some .NET, a couple of Python apps and a Go app to run. So what runs where? Do you need to start installing .NET, Python and Go on all your servers? This is where containers and container orchestration engines come in.
First of all, your containers take everything your applications need to run (including runtimes) and package them up nicely and efficiently in a standardized image. You can run those containers on any bare server and they'll run just fine. No need to install anything except Docker. You give your orchestration engine a set of servers and it will choose where to put each of your containerized services for you. If one dies then your orchestrator will spin up another to replace it. You don't care where they are, as long as the right number are running and they can be reached.
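As a sketch of what that packaging looks like, here is a minimal multi-stage Dockerfile for a hypothetical ASP.NET Core 2.0 service (the project and assembly names are illustrative, not from this series):

```dockerfile
# Build stage: restore, compile and publish using the SDK image
FROM microsoft/aspnetcore-build:2.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app

# Runtime stage: only the runtime and the published output ship in the final image
FROM microsoft/aspnetcore:2.0
WORKDIR /app
COPY --from=build /app .
ENTRYPOINT ["dotnet", "Catalog.Api.dll"]
```

The resulting image carries the .NET Core runtime with it, so the host needs nothing installed except Docker.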
How do you reach your microservices? You don't even know where they are running! Again, the orchestration engine solves this for you. Docker in Swarm Mode has its own service discovery mechanism; Kubernetes uses etcd.
Suddenly managing deployment of microservices doesn't sound so bad. But there are other problems we need to take into account:
Latency (the in-memory function call you used to make has a thousand times less overhead than an HTTP call).
Networks are unreliable. Service calls can fail and do.
Service modelling. How do you partition your business logic cleanly into multiple services? Mistakes produce hard-to-use APIs: APIs that need to be called many times to get the data you need, or that return too much data, introducing latency and network congestion.
Deep synchronous call graphs. Service A calls service B which calls service C which calls service D. Without taking time to consider dependencies, we can get added latency and unreliability by creating large call graphs that involve a myriad of services.
Development environments. How do you work locally and debug when the dependency graph of your microservice includes multiple other services and a messaging system to boot?
Incident analysis. Failures in production can be difficult to diagnose because one logged error does not contain all the information. In a monolith we see the whole call stack in the stack trace, but in a microservice we see only one small part of a distributed call stack.
We'll be covering all those points in this series, including the containerization of our apps and the use of an orchestration engine.
A Cautionary Note About Microservices
Microservices are the current hype, but they are an architectural style that should be evaluated on merit rather than fashion. Personally, I feel we should focus on building cloud native, right-sized services. You can make a monolith cloud native. You can scale out your monolith horizontally as long as it's stateless.
So take microservices with a pinch of salt and think about right-sized services.
Making a Platform with Services
When I read Jeff Bezos's services mandate a while back, it put a massive smile on my face because I knew he was right; I'd lived the shared-memory model for too long already. So I'll share the mandate below.
All teams will henceforth expose their data and functionality through service interfaces.
Teams must communicate with each other through these interfaces.
There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
It doesn’t matter what technology they use.
All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
He's talking about building a platform around decoupled teams and services. So what's so special about building a platform? It depends on the size of your organization. If you have a large and complex business, the chances are you need to think in terms of building and operating a platform. If you don't, you'll end up with a swath of applications cobbled together over the past decade that no one really understands. When you have millions of lines of code, you had better start thinking about a platform, or no one will be able to tell you all the applications involved in the "name your super important business process here" process.
So what is a platform? It's the core of your business. At the core of your business will lie a small number of core areas around which the rest of the business orbits. Turn those areas into a well designed coherent set of services upon which you can build the rest of your software systems. Protect the hell out of those core services, with extreme prejudice. Keep them clean, well modelled and aligned with the business.
So this is the final challenge and honestly makes the rest of the technology based work look like a walk in the park.
In this series we'll build a small cloud native, microservices architecture using:
ASP.NET Core 2.0 as the application stack
SQL Server as the primary data store
Docker Swarm (Swarm Mode) as our container orchestrator
Kafka for event sourcing
IdentityServer4 for OAuth2.0
Elastic stack for monitoring, but we'll also take a look at Prometheus and Sec.
Neo4J for architecture mapping
The objective is not to add technologies but to achieve certain capabilities. Let's look at the capabilities we'll add in this series, in addition to the cloud native qualities we already talked about.
We need a real-time view of what is going on in the system. That means both applications and infrastructure. We need to know what servers we have, what is deployed on each server, server and network metrics and application log analytics. We also need visibility of Docker.
We'll be using the Elastic stack: Elasticsearch, Kibana, Logstash and Beats as the workhorse here. Although Prometheus is the new hotness, I prefer a push-based system for metrics/logs, and the Elastic stack covers most of our needs. But we'll take a peek at Prometheus too.
The usual suspects:
Secure communications - HTTPS
Secrets management (backend credentials, API keys etc) - we'll be using Docker Secrets.
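As a sketch (the service and secret names are hypothetical), a stack file can declare a secret that Docker mounts into the container as an in-memory file, rather than exposing it as an environment variable:

```yaml
version: "3.1"
services:
  catalog-api:
    image: myregistry/catalog-api:1.0
    secrets:
      - sql_password        # mounted at /run/secrets/sql_password inside the container
secrets:
  sql_password:
    external: true          # created beforehand with: docker secret create sql_password -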
Authentication and authorization - we'll be using OAuth2/OpenID Connect provided by IdentityServer4. We'll also be looking at API Managers later on; they often include this capability too.
Modern Build and Release Pipeline (CI/CD)
We need an automated build and release pipeline that can cope with multiple environments, publish NuGet packages, build Docker images, deploy Docker containers and more. We also want automated tests to be run and code analysis reports to be generated.
We'll be using VSTS for our source control (Git) and our build/release pipeline. We'll store our Docker images in Docker Cloud. For automated tests we'll use xUnit, and SonarQube for code analysis. We'll also take a look at NDepend and its integration with SonarQube.
We'll be using Swagger and Swagger UI to create self documenting APIs. DocFx for source code documentation generation. We'll also look at a change log.
We'll look at different options for configuration. Local configuration files, environment variables and the new Docker configuration.
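To make that concrete, here is a hypothetical compose snippet showing two of those options side by side: environment variables (using ASP.NET Core's double-underscore convention for hierarchical keys) and a Docker config mounted as a settings file. All names and values are illustrative:

```yaml
version: "3.3"
services:
  catalog-api:
    image: myregistry/catalog-api:1.0
    environment:
      - ASPNETCORE_ENVIRONMENT=Production
      # Double underscore maps to the hierarchical key ConnectionStrings:Default
      - ConnectionStrings__Default=Server=sql;Database=Catalog;User=catalog;Password=changeme
    configs:
      - source: catalog_settings
        target: /app/appsettings.Production.json
configs:
  catalog_settings:
    file: ./appsettings.Production.json
```

Real credentials belong in Docker Secrets, of course, not in plain environment variables like this.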
Services need to be able to communicate with each other, which means they need to know where each other is. We'll be hosting in Docker, using the built-in Swarm Mode. Service discovery inside a swarm is handled for us, but what about inter-swarm communications, and communications with services not hosted in Docker? We'll take a look at Consul.
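Within a swarm, discovery amounts to using the service name as a hostname; Docker's internal DNS and routing mesh do the rest. A hypothetical sketch (service and variable names are illustrative):

```yaml
version: "3"
services:
  orders-api:
    image: myregistry/orders-api:1.0
    environment:
      # The swarm's internal DNS resolves "identity" to a virtual IP
      # that load balances across that service's replicas
      - Identity__Authority=http://identity
  identity:
    image: myregistry/identity:1.0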
We'll look at how an API Manager can be used to protect your APIs via throttling, track API usage and even expose SOAP services at REST endpoints.
We'll ensure that our architecture design is described in a declarative format. We'll then use that declarative format to build a graph in Neo4J that represents our entire architecture from servers, to applications, use of the messaging system and HTTP calls.
We'll be using Docker-Compose.yml files that describe our Swarms but this might not be enough, so we'll find a way of covering everything and exporting that to Neo4J. From there we can explore our architecture visually or programmatically to gain insights and awareness of our system.
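As a sketch of the declarative starting point, a stack file already names the services, images, replica counts and networks, which is exactly the kind of data we can load into the graph (names are illustrative):

```yaml
version: "3"
services:
  catalog-api:
    image: myregistry/catalog-api:1.0
    networks:
      - backend
    deploy:
      replicas: 3
      restart_policy:
        condition: on-failure
networks:
  backend: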
Large enterprise systems that create an API ecosystem with a mix of internal and publicly facing APIs need a governance system to maintain some kind of control. New services need to be reviewed in terms of the wider architecture. Do they duplicate an existing service? Do they follow all policies regarding our non-functional requirements? Do they require a security review?
We also need a central registry of services with information about which team owns the service, what the service does, documentation, roadmap, changelog etc.
We'll look at how to do load testing of our services and how to interpret the results. We'll look at concepts such as tail latency that need to be taken into account and measured.
We'll also look at some performance optimizations and how the addition of a caching layer with Redis can be used to reduce latencies.
That's a lot to write about but I'll be working my way through it over the next few months (sorry kids - got a job and a family too!). All the code will be put in GitHub.
The first topic we'll cover is Swagger UI as it greatly increases developer productivity from day one, making it trivial to test your APIs. I will return to update this first post with the links to all the subsequent posts of the series for easy navigation.