Making multi-cloud deployment a reality at Netflix with Spinnaker

Learn how Netflix leveraged automation and Spinnaker to perform 4,000 multi-cloud deployments per day.

Enterprise architects need to think about how to streamline their application delivery cycle to the cloud in order to stay abreast of competitors. "Moving fast leads to competitive advantage," said Andrew Glover, engineering manager at Netflix, at the DevOps Enterprise Summit in San Francisco. "Netflix believes strongly in moving faster because if we don't move fast enough, our competitors will."

One of the key practices for achieving agility lies in setting up the infrastructure and practices that enable continuous delivery. "You can take small changes, move them into environments, experiment, and back out of problems quickly," said Glover.

Enterprises need to think about tooling that can streamline the deployment process. The right tooling makes it easier to provision across different cloud platforms while still taking advantage of cloud-specific features. It's also important to build telemetry for gathering data about app performance into new code. This makes it easier to treat new code as an experiment and to identify improvements or defects faster.

Streamline deployments with tooling

Netflix has leveraged these principles to facilitate over 4,000 deployments into test and production environments daily. Its release cadence doubled after implementing its own custom deployment tool chain called Spinnaker, a cloud deployment orchestration engine. "You can think of Spinnaker as a paved road where best practices are codified in a single platform," said Glover.

Netflix has been running Spinnaker internally for two years and recently open sourced it, both to get feedback from other enterprises and as a recruiting tool. The project is being used at leading enterprises such as Target and has received support from Google, AWS, and Microsoft.

Make developers responsible for success and failures

There are no ops teams at Netflix. Developers push code out when they feel it is ready, and when an app dies, the developer is the one who has to fix the problem. Spinnaker makes it possible for a code check-in to start the process of moving into production. It also makes it easier to set up the telemetry for tracking the benefits or defects of new code.

This deployment pipeline makes it easier to leverage parameters that enforce strong opinions about how new code is to be released, configured, and managed. These pipelines simplify the process of running tests and scaling new code. Developers are prohibited from manually coding configurations to make an app work. Otherwise, the next time the app needs to be deployed, the developer who made those manual changes might be on vacation, leaving no one who knows how to reproduce them.
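To make the idea of an opinionated pipeline concrete, here is a minimal sketch in Python. It is not Spinnaker's actual API or pipeline format; the stage names, the `Pipeline` and `Stage` classes, and the rule that a deploy must follow a test stage are all assumptions chosen for illustration. The point is that the pipeline is declared as data and checked against codified rules, so nothing depends on a configuration step that lives only in one developer's head.

```python
from dataclasses import dataclass, field

# Hypothetical stage names for illustration; not Spinnaker's real stage types.
ALLOWED_STAGES = ("bake", "test", "deploy", "scale")


@dataclass(frozen=True)
class Stage:
    kind: str
    params: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Pipeline:
    app: str
    stages: tuple

    def validate(self):
        """Return a list of problems; an empty list means the pipeline is acceptable."""
        problems = []
        kinds = [stage.kind for stage in self.stages]
        for kind in kinds:
            if kind not in ALLOWED_STAGES:
                problems.append(f"unknown stage: {kind}")
        # Assumed opinion: nothing is deployed unless a test stage ran earlier.
        if "deploy" in kinds and "test" not in kinds[:kinds.index("deploy")]:
            problems.append("deploy requires an earlier test stage")
        return problems


def run(pipeline, trigger):
    """Refuse invalid pipelines; otherwise return the ordered steps a trigger starts."""
    problems = pipeline.validate()
    if problems:
        raise ValueError("; ".join(problems))
    return [f"{trigger}:{pipeline.app}:{stage.kind}" for stage in pipeline.stages]
```

In this sketch a code check-in would call `run` with a pipeline such as `Pipeline("api", (Stage("bake"), Stage("test"), Stage("deploy")))`. A pipeline that tries to deploy without testing is rejected before anything runs.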

Netflix has a small core team of about a dozen engineers on call 24/7 to quickly assess new problems and then page the appropriate developer. "This drives home the notion you do high quality services and there is a vested interest in making sure they run because you don't want to be the one that gets the calls," said Glover.

Use cloud-specific artifacts to improve performance

Netflix currently does the majority of its deployments to AWS. At the same time, the company wanted to give developers the ability to take advantage of services and features specific to different cloud environments. Spinnaker allows developers to provision new code as AMIs on Amazon or as Docker-like containers based on Netflix's Titus orchestration technology. Titus makes it possible to provision, decommission, and scale apps automatically, much like Kubernetes does for Docker containers.

One of the key lessons Netflix learned from previous continuous deployment pipelines was to keep the architecture modular so that it could adapt to new use cases and cloud-specific architectures. In previous pipeline projects, developers ended up forking the code to support each new case, which reduced reusability. The modular, pluggable approach implemented in Spinnaker means that support for additional clouds can be added without forking the core.
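The difference between a pluggable design and a forked one can be sketched in a few lines of Python. This is an illustration of the general plugin-registry pattern, not Spinnaker's actual architecture; the `CloudProvider` interface, the `register` decorator, and the two provider classes are hypothetical names, and the provider methods only return descriptive strings rather than calling any cloud API.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical interface every cloud plugin must implement."""

    @abstractmethod
    def deploy(self, app, version):
        ...


# Registry mapping a target name to its provider class.
PROVIDERS = {}


def register(name):
    """Class decorator that adds a provider to the registry under the given name."""
    def decorator(cls):
        PROVIDERS[name] = cls
        return cls
    return decorator


@register("aws")
class AwsProvider(CloudProvider):
    def deploy(self, app, version):
        # Placeholder for an AMI-based rollout; no real AWS call is made.
        return f"launch AMI for {app}-{version}"


@register("titus")
class TitusProvider(CloudProvider):
    def deploy(self, app, version):
        # Placeholder for a container-based rollout; no real Titus call is made.
        return f"schedule container for {app}-{version}"


def deploy(target, app, version):
    """Core deployment entry point; it never needs to change when a cloud is added."""
    try:
        provider = PROVIDERS[target]()
    except KeyError:
        raise ValueError(f"no provider registered for {target!r}") from None
    return provider.deploy(app, version)
```

Supporting another cloud in this sketch means adding one new registered class. The core `deploy` function stays untouched, which is what keeps teams from forking it.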

 
