The NetflixOSS Stack

A high-level overview of Netflix's Open Source projects

Presented by Joe Hohertz / @joehohertz

Slides @ http://jhohertz.github.io/netflixoss-slides

Welcome

Audience

  1. Developers
  2. Operations
  3. Project managers
  4. Other interested parties

Let's get started!

What is NetflixOSS?

The view from 10,000 feet

What does NetflixOSS consist of?

  1. Cloud Native Architecture Components
  2. Anti-Fragile Patterns
  3. Other Cloud Tools
  4. Other Miscellaneous Projects

Cloud Native

What is this?

  • Master copies of data are cloud-resident
  • Everything is dynamically provisioned
  • All services are ephemeral

Anti-Fragile Patterns

What is this?

  • Micro-services
  • Highly available systems
  • Ephemeral/Immutable components
  • Circuit breakers
  • Chaos engines

Micro-services

  • Split everything into series of micro services
  • Clients assume failure will occur, seek alternates
  • Calls between services done through circuit breaker pattern

Chaos Engines

  • Injected failures
  • Expose weaknesses
  • Ensure resiliency

Deep use of Metrics

  • Most libraries include rich metrics
  • Not just for reporting
  • Used internally for load management by:
    • Hystrix (breaks overloaded circuits)
    • Eureka (service selection)
  • Excellent real-time visualization of metrics

Deployment

What is deployment like?

  • Depends on Amazon's Cloud
  • One service per node, combination discouraged
  • Base AMI provisions a Java web container (Tomcat typical)
  • Specific AMIs have WAR deployed onto base
  • If WAR is a co-process, the service captured is also deployed to AMI image.
  • Each AMI launches into it's own ASG (usually)

Java?!

  • Everything runs in a JVM
  • Uses JMX for some core functions
  • Supports many JSR-233 languages
  • Not everything need be in Java-proper
  • Many components using Groovy, Scala, etc.

Image management

  • Some pre-baked AMIs around...
  • Not good for much more than some evaluation
  • Bootstrapping your own necessary for real use
  • Some level of config baked into AMI, some region-specific (AMI per service per region typical)
  • Will build on available Ansible playbooks

Cluster Tools

The management elements of NetflixOSS

Aminator / BUILD

  • AMI Build tool
  • Works with Ansible playbooks
  • Handles laying a specific service into a container AMI
  • Generates a configured, deployable in ASG, service AMI

Asgard

  • Application Deployment Manager
  • Manages majority of AWS cloud
  • Allows Red/Black rollouts
  • Replaces use of EC2 console for most tasks
  • Enforces deployment conventions / rules
  • Does NOT cross production/test boundaries. (Each gets it's own Asgard)

Ice

  • Cost analysis / optimization
  • Imports from billing API @ Amazon
  • Allows querying/reporting on data
  • Aware of reserved instances

Edda

  • Cluster state analysis/query
  • Imports from config APIs @ Amazon
  • Allows querying/reporting on data
  • Display differences in environment over time

Denominator

  • Library/CLI for manipulating hosted DNS config
  • Like jclouds for DNS (same author)
  • A future state component, not intrinsic to current stack
  • Netflix positions this as part of a multi-region failover strategy

Base Cluster Services

The lowest level of the stack

Exhibitor/Curator (Zookeeper)

  • Zookeeper = Cluster Co-ordination services
  • Exhibitor = Zookeeper Co-process
  • Curator = Client wrapper for Zookeeper
  • Leveraged for configuration management of stack / application on a cluster-wide basis

Eureka

  • Application services registry
  • How parts of the application find provisioned services
  • Central to scaling/HA
  • Clients decide how to reach/retry a service given a registry list queried from Eureka (client-side load spreading)

Turbine / Hystrix Dashboard

  • Turbine = metrics aggregation
  • Hystrix Dashboard = visualization
  • Provides real-time insights into all areas of application performance

Suro

  • Distributed Data Pipeline
  • Collection/Aggregation/Dispatch
  • Integration w/ external pipelines (IE: Kafka)
  • Supports data logging for Hadoop processing

Frameworks / Libraries

Application programming environment

Archaius

  • Provides dynamic/composed config properties
  • Can compose config from multiple sources
  • Can handle environment/zone specific details
  • Stores in zookeeper via curator
  • Notifications on property changes available

Servo

  • Application metrics
  • Publish to JMX is primary channel
  • Other publications available (IE: Graphite, Turbine, Cloudwatch)

Blitz4J

  • Enhancements to Log4J
  • Decouples logging from main thread
  • Logging is async
  • Can handle high log volumes
  • Mitigates loss during log storms via summaries.

Ribbon

  • Model for interprocess communication
  • Core of how services communicate with other services
  • Where client-side load balancers are implemented
  • Provides REST-based model for service exposure
  • Various serialization schemes supported (JSON primarily)

Governator

  • Based on Google Guice, a dependency injection framework
  • Field validation
  • Config to field mapping
  • Classpath management
  • Lifecycle management
  • More...

RxJava

  • Implementation of "Rx Observables"
  • Rx = Reactive Extensions
  • A port of Rx.NET
  • for composing asynchronous and event-based programs using observable sequences

Hystrix

  • Latency and Fault Tolerance helper
  • Prevents hung services from taking down app
  • A primary source of application metrics
  • Manages queues of requests to like sources
  • Collapses requests into batches where possible

Commons

  • Common utilities
  • Implementation of an event bus
  • Guice -> Jersey utilities
  • Infix operator enhancements
  • Helpers for adding custom metrics

Karyon

  • Base container for applications/services in this stack
  • Ties most of the other libraries together
  • Gives a built-in admin console w/ insights/statistics
  • Companion library (Pytheas) provides web development aspects

Pytheas

  • Data source integration
  • Provides common UI components
  • Supports server-side events
  • What front end development ultimately targets
  • Edge services will be provisioned in terms of this and Karyon

Zuul (The Gatekeeper)

  • An even higher level edge framework.
  • Sort of like a front end load balancer on steriods
  • If used, rendering to users pushed down to a middle tier service
  • Routes/filters requests to the appropriate service
  • Allows powerful debugging, route specific requests to serperate API clusters
  • Can route cross-region.

Glisten

  • Groovy library for building application with Amazon Simple Workflow
  • Used by Asgard for co-ordinating deployment activities

Data Services

Persistence, Caching, Query

Priam

  • Co-process for Cassandra
  • Manages configuration, tokens, seed discovery
  • AWS aware
  • Backup/Recovery (Complete/Incremental)
  • Exposes Cassandra as a framework service

Astyanax

  • High level client side for Cassandra/Priam
  • Provides recipes for common Cassandra use-cases
  • Handles discovery of new nodes, marking of downed ones.
  • Completely encapsulates client access
  • Connection pooling
  • In-progress: Rebasing to Datastax driver

EVCache

  • Service for Ephemeral, Volatile Cache
  • Distributed cache
  • AWS Zone aware
  • Allows replication of cached data
  • Uses memcache
  • Is a co-process like Priam

Genie

  • Implements Hadoop as a service
  • Runs Hadoop/Hive/Pig jobs
  • Schedules jobs to available resources (including launch of transients)
  • Callers can do so without directly invoking a Hadoop client.
  • Provides monitoring of jobs as well

Aegisthus

  • Bulk data pipeline from Cassandra
  • Reads SSTables directly
  • Emits compacted snapshot of column family
  • Imports data into Apache Pig

PigPen

  • Map/Reduce for Clojure
  • supports distributed Clojure
  • Compiles to Apache Pig
  • Hides details of Pig, allows unit testing of map/reduce programs

Lipstick

  • Visualization of Pig jobs
  • Graphical depiction of job flow
  • Can reflect status of job in progress
  • Complements Genie, providing status of jobs it manages

Zeno

  • In-memory data service for data sets with low latency tolerance
  • Creates compact serialized representation of Java objects (with de-duplication)
  • Creates smallest change set possible for propagation
  • Used @ Netflix for video metadata that drives user experience
  • Care taken to be as GC friendly as possible

Netflix Graph

  • Compact in-memory directed graphs
  • Predecessor to Zeno?
  • Claims an order-of-magnitude reduction of memory footprint vs. native java objects

S3mper

  • Extension for Hadoop's S3-based filesystem
  • Uses DynamoDB to provide added consistency checking

STAASH

  • A REST-based exposure of data services
  • Exposure to external languages/stacks
  • Still very much a proof-of-concept, limited use within Netflix

Other Components

Things that don't fit in other areas

GCViz

  • Visualize JVM garbage collector activity
  • Requires a HotSpot JVM (IE: Oracle)
  • Requires a specific set of JVM options to emit log

Simian Army

  • Chaos Monkey - Failure injection for resilience testing
  • Janitor Monkey - Flag/Deactivate unused cloud resources
  • Conformity Monkey - Scans cloud resources for non-conformance of best practices

Our additions

Things we've been working on

Buri

  • Builds Ubuntu AMIs for all machine types
  • A collection of Ansible recipes
  • Works similar to Aminator
  • But it includes foundation/base building
  • Hosted on GitHub: https://github.com/viafoura/buri

Buri Roles (completed)

  • Exhibitor/Zookeeper (clustered)
  • Eureka (clustered)
  • Ice

Buri Roles (in progress)

  • Priam/Cassandra
  • Asgard
  • Turbine/Hystrix Console

NetflixOSS Resources:

NetflixOSS Reference Implementations

THE END