Making a new web app from top to bottom today is filled with choices we didn’t have to make even a few years ago. Cluster computing has gone from being a specialty case for things like Hadoop map-reduce to a viable environment for an HTTP API service that we’d normally just throw on a bare metal server or VM, either on-premises or externally hosted, and call it a day. Containers are everywhere. The NoSQL vs SQL debates rage on after almost a decade of Mongo, Redis and other popular stores. HTTP API design continues to be debated, even though simple CRUD was supposed to have been automated away by services like Parse in the past and by technologies like GraphQL today. The very notion of using a framework to run an always-on HTTP server has been called into question by the Serverless approach of on-demand HTTP request handlers that come up and down instantly thanks to the power of containers. Single Page Application (SPA) frameworks like Angular, React and Vue have come to the forefront to help build pages with complex state transitions. The eternal debate of monoliths versus microservices continues. WebAssembly threatens to finally realize the dream of running near-native code in the browser. Newer languages like Crystal, Elixir, Go, Rust and Scala challenge the dominance of mainstays like Java and Python. And running through this wide and seemingly endless playing field is a depth of nuance between the related and competing approaches to handling everything.
In this post I’ll attempt to cover the trends of the last few years as I’ve seen them, the tradeoffs of different approaches, and lightly touch on what I think some of the coming trends might be. It would be difficult to cover everything in depth, but hopefully this serves as a good high-level state of web development and infrastructure, and a starting point for discussion and thought. I apologize in advance for any details I got wrong; corrections are welcome. I’m also sorry if I don’t mention the specific software you use. There really are too many to name. I’m sure that one day 5 years from now we’ll all look back and laugh at how much has changed (and how much really hasn’t).
History’ish (very, very loosely)
Hosting a web server has changed quite a bit. Static files were once hosted on a relatively static server like Apache and served straight-up. Then dynamic handling of requests became a thing with Perl’s CGI, Python’s WSGI, Java Servlets and so forth. Once server-side languages became a thing, library and external service dependencies began to complicate consistent deployments. Over the years, tools like Puppet, Chef and Ansible came out to assist in recreating environments. Bare metal servers began to be replaced by virtual machines.
Amazon Web Services (AWS) became a go-to place for setting up web apps that required databases, queues and other common needs. Heroku, powered by AWS, introduced the 12-factor app methodology to the general public. Other services like Google’s Cloud Platform (GCP) arose to compete with AWS, and Microsoft came in a bit later in the game with their Azure service. Now, setting up a database and deploying a web app has to be handled in similar but incompatible ways depending on the exact platform you’re running on and the situation you’re in. AWS provided ways to deal with autoscaling, but these were AWS-specific. There have been numerous ways of dealing with deployment, using things like Elastic Beanstalk, CloudFormation and so forth. And while these all work, it often feels like we’re repeating the same things in slightly different ways.
Ok, so, clusters?
So where do clusters fit in all this? Well, technically any set of two or more computers that are set up similarly to do the same task, which are connected to each other, and that can be treated as a single system is a computer cluster. By this definition, AWS has been running clusters since its inception, and its users, by extension, have been doing cluster computing (congratulations!). However, from the user perspective, they must define the specific set of things (such as EC2 instances) that must be autoscaled, which parts of their services need to be autoscaled and how, and also permission groups, allowed ports, and so forth.
As containers and web services continued to take off, some very intelligent people realized that scheduling programs for execution in a cluster wasn’t as efficient as it could be. They proposed a new way called Dominant Resource Fairness (DRF) which worked to maximize the smallest dominant share in the system (e.g. for CPU, I/O, RAM). This ensured the best use of resources, and encouraged users of a cluster not to lie about their needs. And thus Mesos was born, an open source cousin of Google’s Borg. Twitter made great use of Mesos to scale up their systems, which brought it into the public eye. They built Aurora on top of Mesos for scheduling long-running jobs.
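To make the "maximize the smallest dominant share" idea a little more concrete, here is a minimal sketch in Python. It is not how Mesos implements it; the function name `drf_allocate` and the two-resource setup are assumptions made for illustration. A user's dominant share is the largest fraction of any one resource they hold, and the next task always goes to whoever currently has the smallest one.

```python
from typing import Dict, List


def drf_allocate(capacity: List[float], demands: Dict[str, List[float]]) -> Dict[str, int]:
    """Greedy DRF sketch: hand out one task at a time to the user
    with the smallest dominant share, until nothing more fits."""
    used = [0.0] * len(capacity)
    tasks = {user: 0 for user in demands}
    active = set(demands)

    def dominant_share(user: str) -> float:
        # Largest fraction of any single resource this user currently holds.
        return max(tasks[user] * d / c for d, c in zip(demands[user], capacity))

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical).
        user = min(sorted(active), key=dominant_share)
        need = demands[user]
        if all(u + n <= c for u, n, c in zip(used, need, capacity)):
            used = [u + n for u, n in zip(used, need)]
            tasks[user] += 1
        else:
            # Simplification: once a user's next task no longer fits, stop serving them.
            active.discard(user)
    return tasks


# 9 CPUs and 18 GB of RAM; A's tasks need <1 CPU, 4 GB>, B's need <3 CPU, 1 GB>.
print(drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]}))  # {'A': 3, 'B': 2}
```

In this run both users end up with a dominant share of two thirds (A in RAM, B in CPU), which is the sense in which the allocation is fair even though their task counts differ.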
Google, likely wanting to bring people onto their own GCP service, released Kubernetes, another Borg-like open source project. Initially it had a high barrier to entry, but over time a lot of tooling has been built out, including a nice dashboard, the Helm charts packaging system, the Ingress load balancer, and several authentication schemes important to corporate workflows. Combined with the Google brand and the strong marketing, it appears that it has now handily won the mindshare battle for cluster computing, although Mesos is of course still a powerhouse for those needing to scale up to astronomical numbers of nodes.
Right, but, why would I want this for a simple API?
The truth is, you don’t need Kubernetes. If you have an existing system for building, deploying, and running your apps, you’re set. These are the main things to consider for going this route, benefits first, with the costs in the last two items:
- Simple, standard definition of an app, its pieces and dependencies
- Scaling by setting number of replicas
- Self recovery and rescheduling
- Controlled, but simple exposing of apps externally, in-cluster, in-pod
- Load balancing
- No lock-in (although running on GCP is the best experience, of course)
- Cluster setup and administration required (or paid external hosting required)
- Steep initial learning curve
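The "scaling by setting number of replicas" and "self recovery" points both come down to one idea: you declare how many copies you want, and something keeps nudging reality toward that number. A toy sketch of one such pass, in Python, might look like the following. This is not Kubernetes code; `reconcile` and the pod naming scheme are made up for illustration.

```python
from typing import List


def reconcile(desired_replicas: int, running: List[str], prefix: str = "api") -> List[str]:
    """Return the pod names that should be running after one reconcile pass."""
    pods = list(running)
    counter = 0
    # Too few pods (first deploy, or some crashed): start replacements.
    while len(pods) < desired_replicas:
        name = f"{prefix}-{counter}"
        if name not in pods:
            pods.append(name)
        counter += 1
    # Too many pods (scaled down): drop the surplus from the end.
    return pods[:desired_replicas]


print(reconcile(3, []))                  # ['api-0', 'api-1', 'api-2']
print(reconcile(3, ["api-0", "api-2"]))  # ['api-0', 'api-2', 'api-1']
print(reconcile(1, ["api-0", "api-1"]))  # ['api-0']
```

The same function covers initial rollout, recovery after a crash, and scaling down, which is a large part of why the declarative style is attractive.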
For most apps, a simpler approach is fine, but I have a feeling that as time goes on, the expectations of apps will increase and the clustered approach will continue to gain steam. Mesos and Kubernetes are not going away, but for the moment, I still see them as nice-to-haves for most apps, and incredibly helpful for highly available, highly resilient apps.
Then there’s Docker. OS-level virtualization has been around since at least the year 2000 with FreeBSD jails, but Linux containers didn’t come around until 2008. As Linux was the predominant choice for web servers, this brought containers into view, but it wasn’t really until Docker came around in 2013 that they became simple enough for people to use. The Docker image was the biggest change. Until then, if on AWS, a common scenario was to write a script to recreate an environment based on an existing Amazon Machine Image (AMI), produce a new one, and use this as the template going forward. However, developing these images was a costly endeavor, because if anything failed while running a script, it was difficult to go back to the point of failure and fix the issue. Instead, it was safer to re-run the entire build up to the point of failure to ensure reproducibility. Also, when creating an image similar to a previous one, reusing pre-existing images wasn’t a thing unless you had carefully created a new image for each step of your build (Docker’s layered images made this easy and efficient). Similar issues surrounded creating virtual machines from scripts, as was the case with Chef and Vagrant. Running a container based on a Docker image was also made easy, allowing for the dynamic choice of which port to expose at runtime. Creating a central image repository, while also a great business model, helped to solve the issue of how to pull the images to the right place when needed. Oh, and none of this required running on a specific hosting provider.
Fine, but why containers over VMs or bare metal?
This has been answered widely all over the web, so I’ll just give my own subjective personal take on it. When deploying a new version of server software, if isolation and reproducibility are desired, then a VM is usually instantiated, then set up from a script, before being made available. For Docker, an image is created based on a script (Dockerfile), but only the steps affected by a change, typically the ones that add the updated source code of the server software, are re-run, which is relatively quick. The startup time for a container is faster than that of a VM as well. In the event of a failure along the way, shutting down a container is also generally faster. If you’re paying for hosting of all of this, the compute time will be less, which should translate to less money.
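The "only the changed parts are run" behavior can be sketched as a chain of cached layers, where each layer's identity depends on its parent and on its own instruction. The Python below is a simplified model, not Docker's actual implementation (for one thing, real builds key file-copy steps on the contents of the files, which is faked here by putting a version in the step text). The `build` function and the example steps are assumptions for illustration.

```python
import hashlib
from typing import Dict, List, Tuple


def build(steps: List[str], cache: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return (final layer id, steps that actually had to run)."""
    parent = ""
    executed = []
    for step in steps:
        # A layer's identity depends on its parent layer and its own instruction,
        # so changing one step invalidates every step after it.
        layer_id = hashlib.sha256(f"{parent}\n{step}".encode()).hexdigest()
        if layer_id not in cache:
            executed.append(step)  # cache miss: this step is re-run
            cache[layer_id] = step
        parent = layer_id
    return parent, executed


cache: Dict[str, str] = {}
v1 = ["FROM python:3", "RUN pip install flask", "COPY app_v1 /app"]
v2 = ["FROM python:3", "RUN pip install flask", "COPY app_v2 /app"]

print(build(v1, cache)[1])  # all three steps run on the first build
print(build(v2, cache)[1])  # ['COPY app_v2 /app']
```

This is also why the order of steps matters: putting the frequently changing source code last keeps the slow dependency installation cached.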
How about for development on a local machine? Recreating a multi-machine multi-service environment locally using VMs can be a costly endeavor in terms of RAM and CPU, as there is much duplication in running each virtualized OS. Containers have relatively less overhead, allowing for the recreation of complex environments on a modest laptop.
Bare metal is desirable for cases where relatively constant, high throughput is desired, such as with a database like PostgreSQL. Setting things like caches to be writethrough (WT) instead of writeback (WB) in the kernel to avoid data loss of the Write Ahead Log (WAL) is not typically possible in a container, and the reason is obvious: multitenancy. If caches are set to WT for the DB, other apps depending on WB caches for performance that are running on the same physical machine will suffer. This is a breaking of isolation. Docker does support setting some kernel parameters but not all, so if your use case is not covered, then a VM or bare metal may be the way to go. By definition, VMs and containers can never beat the performance of bare metal, or its configurability. If security is a concern, and one is paranoid of a container or VM breaking isolation somehow, this is a way to allay those fears as well (though of course the box itself may be hacked).
NoSQL vs SQL
It says a lot about the prevalence of SQL-using RDBMS’s that the term “NoSQL” has been used to amalgamate such disparate databases as Mongo and Redis. How much time has been spent in writing SQL queries, DDL for schemas, and analyzing performance of queries? How much time has been spent fine-tuning these databases? Scaling them? Sharding? Using an RDBMS correctly is a difficult and never-ending endeavor, growing and changing with the needs of the database and the applications using it. Many people have been driven insane by all of this, so the dream of never having to deal with these things again in some magical world with no SQL is alluring and tantalizing.
Well, in my limited experience, I have to say that… RDBMS’s are fantastic. It turns out that much of the world’s data is relational in nature, and the structure of tabular data is a great fit for it. Having a set of software that is optimized and battle-tested for dealing with the similar set of problems that everyone with relational data faces is a great and wonderful thing. It allows experts in information software to pour their expertise into and polish a highly-reusable system for everyone to reap the benefits.
It’s also true that not all data is relational in nature. The document-based nature of full-text search and the inverted index used to speed up lookup time in engines such as Lucene is but one example. Analytics data coming in from apps as a hash of non-uniform information in giant quantities quickly also lends itself better to a document-based database. Use the right tool for the job.
I’ve seen many cases where relational data is forced into a non-relational system, and the consequences are disastrous. Many NoSQL databases offer no ACID guarantees, instead offering “eventual consistency” which may or may not happen. Great databases such as Redis can handle high throughput and atomicity of pipelined operations, but not semantic consistency of data (this is up to the app developer). One could create something like a SQL column index by using a Sorted Set, for instance, but this must be codified in the app, as Redis has no schemas. Understanding such an app with effective schema logic in the app code rather than the database is a challenge. Similarly, Redis data structures can be used to model relations to perform “JOINs” but this is inadvisable.
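To show what "codified in the app" means for the Sorted Set index, here is a small Python sketch. It uses an in-memory sorted list as a stand-in for Redis, so nothing here is a Redis client call; the class `UserStore` and its fields are hypothetical. Against a real server, the three index operations would correspond roughly to the ZREM, ZADD and ZRANGEBYSCORE commands.

```python
import bisect
from typing import Dict, List, Tuple


class UserStore:
    """App-maintained secondary index on 'age', in the spirit of a Redis Sorted Set."""

    def __init__(self) -> None:
        self.users: Dict[str, dict] = {}            # stand-in for one hash per user
        self.age_index: List[Tuple[int, str]] = []  # stand-in for a sorted set of (score, member)

    def save(self, user_id: str, name: str, age: int) -> None:
        old = self.users.get(user_id)
        if old is not None:
            # The app, not the database, has to remember to drop the stale entry.
            self.age_index.remove((old["age"], user_id))
        self.users[user_id] = {"name": name, "age": age}
        bisect.insort(self.age_index, (age, user_id))

    def ids_with_age_between(self, low: int, high: int) -> List[str]:
        # (low,) sorts before every (low, id); (high + 1,) sorts after every (high, id).
        start = bisect.bisect_left(self.age_index, (low,))
        end = bisect.bisect_left(self.age_index, (high + 1,))
        return [user_id for _, user_id in self.age_index[start:end]]


store = UserStore()
store.save("u1", "Ann", 30)
store.save("u2", "Bob", 25)
store.save("u2", "Bob", 35)  # update: the index must be fixed up by hand
print(store.ids_with_age_between(25, 30))  # ['u1']
```

Forget the removal in `save` and the index silently returns stale results, which is exactly the kind of schema logic that a relational database would otherwise own.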
Redis is a fantastic high-performance cache, allowing one to take data structures from an app and store and retrieve them rapidly. It is also highly extensible, with projects such as RediSearch for full-text search. But it is not a relational database, and simply hearing “Redis is fast!” does not make it the right tool for every job. In short, do not shy away from RDBMS’s because they’ve been around for a while, or out of some perceived lack of performance. Oh, and if you have some unstructured data in your structured data, consider something like PostgreSQL’s JSONB column.
A few years ago, a Backend as a Service (BaaS) company called Parse came about, offering the exciting notion that building a simple API for CRUD operations could be more or less automated or abstracted away. This paved the way for frontend developers who didn’t have the time or expertise to build a CRUD API to power their apps. The company was bought by Facebook in 2013, and its service was shut down in 2017.
GraphQL came about in 2015 from, you guessed it, Facebook. It offered a way to set up a one-size-fits-all DB-based backend API in which the frontend decides what data to query using GraphQL’s query language. In a way, this brought even more power to frontend developers, although it also meant that some of the app system logic sits in the frontend. Unlike REST APIs which map resources onto URLs, GraphQL exposes a single HTTP endpoint that can be queried for whatever data the backend exposes. To avoid several roundtrips, a client library such as Relay can be used to combine queries into a single HTTP request. Further, the server can implement response caching to compensate for the loss of GET request caching that happens with most REST APIs. While GraphQL has only limited support for high-depth recursive queries, most apps will not require this.
As with RDBMS’s covering the majority case of relational data, GraphQL appears to cover the similarly common case of CRUD on said data. Further, it allows apps to define the data they need, so for a given set of apps all talking to the same API, they can tailor queries to their own needs (e.g. mobile vs. web).
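The "apps define the data they need" idea can be illustrated without any GraphQL machinery at all. The Python below is a toy field selector, not a GraphQL implementation: a real server parses the query language and runs resolvers, whereas here the selection is just a nested dict. The function `select` and the sample record are made up for illustration.

```python
from typing import Any, Dict


def select(data: Any, selection: Dict[str, Any]) -> Any:
    """Keep only the fields named in `selection`, recursing into nested selections."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        # A leaf field maps to None; a nested field maps to its own selection.
        result[field] = select(value, sub_selection) if sub_selection else value
    return result


record = {
    "id": 1,
    "name": "Ann",
    "email": "ann@example.com",
    "posts": [{"id": 10, "title": "Hi", "body": "A long post body"}],
}

mobile_query = {"name": None}
web_query = {"name": None, "posts": {"title": None}}

print(select(record, mobile_query))  # {'name': 'Ann'}
print(select(record, web_query))     # {'name': 'Ann', 'posts': [{'title': 'Hi'}]}
```

Both clients hit the same backend data, but each one only pays for the fields it asked for.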
Why not GraphQL?
For a simple API, or an API with non-graph data (such as that in a document-based store), not a lot is gained given the initial overhead of setup. In addition, things like file uploads or video streaming don’t have a place here (then again, neither do they in REST).
I recommend reading this excellent summary of the various use cases for GraphQL. I believe we will see more GraphQL-based APIs going forward, but as this article hints at, there’s nothing preventing having a RESTful API and a GraphQL API running side by side.
Serverless, a trend kicked off by AWS Lambda, is one that I’m still not fully sold on, having not had the chance to use it in production yet. The idea is simple: write a function to do a defined task, and all the wrapper code for bringing up a server and exposing the function as an endpoint is left up to the serverless server (they really should have picked a better name for this). Typically, when the function is triggered somehow (e.g. via API call), a container will be started on-demand to fulfill the request, and will either shut down immediately after, or stay up until a determined period of idleness has elapsed.
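As a rough idea of what "write a function" looks like in practice, here is a minimal Python sketch in the shape AWS Lambda uses for Python handlers: a function taking `event` and `context`. The event and response fields assume an HTTP trigger in the API Gateway proxy style, and the greeting logic is invented for illustration. Other providers use different shapes.

```python
import json


def handler(event, context):
    """Entry point the platform calls once per request; there is no server loop here."""
    # Query string parameters may be missing entirely, so fall back to an empty dict.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Everything that is absent from this file (listening on a port, routing, process management) is what the provider takes over.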
From the standpoint of Don’t Repeat Yourself (DRY), not repeating repetitive and well-understood code, the advantage is obvious here. What’s less obvious is what is being lost. I can think of at least:
- Control of the stack running the code (if you disagree with what the provider is using)
- Portability, due to vendor lock-in (yeah, different vendors will have different “serverless” libs)
- Testing (all testing is now integration testing, because you must test on a real serverless server)
I hope to get a chance to play around with this in the future. I can see the potential for one-off tasks that need to be run from time to time, where setting up an entire server is a hassle, but for now it seems like a neat thing that is not a necessity.
Single Page Applications (SPA)
Many websites, it turned out, were applications disguised as a series of web pages, but the web was originally built to be an interlinked set of documents, not an application on-par with desktop apps. Attempts to use libraries like jQuery for this, no matter how clean, always hit the same problems. Tracking and modifying state, both in and outside of the DOM (e.g. shadow DOM), was a repetitive and error-prone chore. Eventually, libraries like Angular, Knockout, Ember and others came to address this by effectively replicating what server-side rendering was already doing with models but on the client side. These approaches saw a degree of success, but the more complex a page became, the harder it became to maintain state, and models seemed to not be the right abstraction needed for rendering a tree of DOM elements.
In 2013 Facebook released React, which took the learnings of desktop app development into the web, such as unidirectional data flow in a state machine. This time, the abstraction was a component, which consisted of internal state and a method to render the component in JSX, a language resembling the HTML that the component would render to. Now instead of awkwardly trying to map data to models, then models to HTML, there would be a more readily-understood hierarchy of DOM-like elements. Further, rather than force each component to hold all related state and code to update its state, state from high up the hierarchy could be pushed down the hierarchy as a set of properties, further aiding in decoupling code and data flow. Rather than keep track of every possible combination of states, now each component only cares about two things: what’s being sent into it, and what it sends to components further down. A similar approach was applied to application state, and libraries like Flux and Redux came about, eventually being used together with React.
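The two ideas in this section, properties flowing down a component hierarchy and application state changing only through explicit actions, can be sketched in a few lines. The code below is Python standing in for the JavaScript and JSX these libraries actually use, and every name in it (`reducer`, `todo_item`, `todo_list`, the action types) is hypothetical.

```python
from typing import Dict, List


def reducer(state: Dict, action: Dict) -> Dict:
    """Return a new state for an action; the old state is never mutated."""
    if action["type"] == "add_todo":
        return {**state, "todos": state["todos"] + [action["text"]]}
    if action["type"] == "clear":
        return {**state, "todos": []}
    return state


def todo_item(text: str) -> str:
    # Leaf component: renders only from the property passed down to it.
    return f"<li>{text}</li>"


def todo_list(todos: List[str]) -> str:
    # Parent component: pushes each piece of state down as a property.
    return "<ul>" + "".join(todo_item(text) for text in todos) + "</ul>"


state = {"todos": []}
state = reducer(state, {"type": "add_todo", "text": "write post"})
print(todo_list(state["todos"]))  # <ul><li>write post</li></ul>
```

Data only ever moves in one direction here: actions produce a new state, and the state is rendered from the top down.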
Monolith vs Microservices
Monoliths are pervasive, and it’s no wonder why. Having a single structure reduces communication and middleware overhead, and provides a (hopefully) obvious single place to look for all things. A well-structured and well-maintained monolith can grow to titan proportions and still be readable and able to be developed, but the larger and more varied a monolith’s development team becomes, the higher the chance that coding styles and quality will vary wildly, making it unwieldy to work with.
In much the same way that code implementing common functionality is extracted into libraries, HTTP monoliths with common functionality can be extracted into services. Some teams take this to an extreme and create what they call a microservice, which is seen to be a small service, for somebody’s definition of small. In my experience this is often taken too far, and the overhead added by breaking a perfectly good service into several microservices is not worth it.
Many times a service is broken off of a monolith to allow a particular team to develop it without interference from potentially breaking changes. In the end, software is built by teams of humans, and it’s perhaps inevitable that it will be broken down along team lines somehow. The fastest-running software in the world that never gets built will never be fast, and some sacrifices to account for human behavior which allow software development to continue are not always a bad thing.
WebAssembly is perhaps the biggest unknown quantity out in the wild right now. In 2011 we got WebGL, which promised to let us build on the web what was once the sole domain of native code: fast, beautiful 3D graphics. Since then this has been taken advantage of, but by and large people still play games as native apps on phones, consoles or computers.
In the beginning there was binary, and eventually assembly languages in the late 1940s, until the landmark creation of the Fortran, Lisp and Cobol languages in the late 1950s, allowing for the abstraction of machine-specific code into general-purpose logic. Then in the early 1970s the C programming language was developed alongside Unix, which was itself soon rewritten in C. Since then, most popular programming languages have been largely written in or influenced by C.
Recently, however, there’s been a move to more performant, static-typed and functional languages. Riding on the outstanding performance of the JVM, Scala arose as a kind of happy medium between a pure functional language like Haskell and the Object-Oriented Java, with the Play web framework. The Go language, built and heavily backed by Google, offers a performant static-typed imperative language meant to replace C as a systems programming language, but it’s also finding a big following in web servers. Rust, also a systems programming language, has been growing as well, with server libraries like Hyper and Iron. Elixir, riding Erlang’s BEAM VM, has recently come to the fore, with its Rails-like Phoenix framework. Crystal, another Ruby-inspired language, is also making waves with the Rails-like Amber web framework. Finally, Swift, originally from Apple and used on iOS, has begun to see some use in web backends as well, with Vapor, Kitura and Perfect.
So many languages, so many choices, where to begin? Well, it largely depends on your situation. Language choice comes down to many of the following factors:
Whatever language is picked, there need to be developers capable of programming in it and doing it well, or the ability to hire such people. Building a library that only one person can maintain is a risky proposition.
An app that has relatively lower performance requirements can live with a performance hit of several orders of magnitude, and a slower language may be the right choice if more developers on the team or on the market are available to maintain it. The truth is, most language choices do scale, but the associated cost factors will vary, namely server cost and engineering cost. As an example, Scala on the JVM properly tuned may be vastly cheaper in server cost than MRI Ruby, but you’re much more likely to find great Ruby developers than great Scala developers on the market. Conversely, choosing not to invest time into setting up a full-text search stack like Elastic Stack when the server cost outweighs the engineering cost is folly.
Performance as a raw metric is nigh meaningless without the context. Every tech stack decision has its tradeoffs, and simply saying “this is more performant” does not make this truth magically disappear. The parable of having a hammer and seeing nails everywhere is real.
While all libraries could technically be written in any general purpose language, in practice different industries tend to congregate around one or two languages. Ruby on Rails has a fantastic ecosystem for building monolithic web apps with a supportive community, which makes it a great choice for building web apps. Python is heavily supported by the data science, machine learning and system administration communities, and has a great number of libraries to support those use cases such as NumPy and TensorFlow. Scala has great tools such as Play and Slick for building web apps, and Apache Spark and others for big data. Java has a great many libraries for almost anything under the sun, which Scala can take advantage of while running on the JVM. Go is performant and backed by Google, with an ever-growing ecosystem. The Rust, Elixir and Crystal ecosystems are still relatively young, but are already used in production and will grow quickly enough.
It is more likely that the choice of language is dictated by the desire to use an existing set of libraries, rather than the merits of the language itself. How well documented the language or its libraries are, and how open the community is to newcomers, is also a part of the ecosystem. The best language on earth with bad documentation and nobody to help will serve few people.
Functional programming and more advanced type theory concepts are making their way into many modern languages, but the more of these are used, the harder it becomes to onboard developers. Languages like Scala and Haskell have a higher barrier to entry because of this. Rust’s borrow checker and Go’s goroutines and channels also take some effort to adjust to. Of course, once learned, these powerful tools help to write equally powerful and robust code, but the learning is not free. Like it or not, onboarding people takes time, and in some cases, people are not willing or able to learn a new language or concept. Humans cannot be removed from the equation of language choice.
Web development is crazy
“But wait, you didn’t cover everything!” Yeah… it’s quite difficult to do so, but I hope this was a good overview of things as food for thought, if a bit haphazard. The world of web, and programming at large, is an ever-evolving beast being pulled in many directions by many forces. I sometimes think that programming has not essentially changed in many years, and that we’re just finding ever more creative ways to push strings around. On the other hand, our string-pushing game is getting pretty strong, and I can’t wait to see what great new methods and technologies we use next.