One of the key competitive advantages of Walmart Logistics over the decades has been keeping its distribution centers running as efficiently and autonomously as possible. That autonomy extends to the very hardware and software infrastructure running within these facilities, more than 200 worldwide, as each location contains what could be viewed as a mini-datacenter.

Managing this vast distributed landscape is a complicated logistics problem, presenting substantial challenges for everything from minor upgrades to the delivery of new software and solutions. To meet these challenges, Walmart Technology is undertaking an effort to enable cloud-like flexibility across its distribution centers. The initiative’s top-level goals include abstracting applications from their underlying infrastructure, delivering the ability to rapidly deploy new business solutions, and providing seamless, on-demand scaling based on application workloads. Each business application should function autonomously, like the distribution centers themselves.

But bringing a cloud-like environment to a disparate network of on-premises logistics systems is no easy challenge. To meet these goals, Walmart Technology is bringing together a handful of individually impressive open source software components: OneOps, Kubernetes, Jenkins and Nexus.

OneOps, a @WalmartLabs-led open source project, enables a consistent infrastructure provisioning and delivery mechanism in a cloud-agnostic fashion. And while Walmart Technology has employed OneOps to help manage more than 1,000 deployments daily across its many data center-based clouds, distribution centers have not been able to leverage the same convenience. This initiative aims to change that, standing up each distribution center as its own cloud, with OneOps provisioning the underlying virtual machines, as well as all other key technology components required for a Kubernetes cluster (see Figure 1).

Figure 1: OneOps VMware distribution center interactions

In the above diagram, integration between OneOps and VMware’s vCenter occurs via Fog adapters. This key integration allows distribution centers to use their existing VMware investments with minimal changes to DC infrastructure. Once this integration stands up the backing virtual machines, OneOps takes over by laying down the Kubernetes ecosystem directly onto these newly provisioned VMs. This convenient approach means that whether applications are in the cloud or on-premises, OneOps can facilitate a consistent infrastructure management experience across the enterprise.

Predecessors to Kubernetes in Walmart’s distribution centers are open source successes in their own right (Nagios and Apache HTTP Server, to name a few). And while they have been relatively effective in meeting the current needs of the applications they serve, these disparate products require customized configurations and install scripts to provide a cohesive application deployment platform. As applications transition toward a microservice-style architecture, the overhead of managing these disparate components has made that transition increasingly difficult.

Kubernetes has proven not only to meet the same needs as its open source predecessors, but to do so in a much more efficient, stable and easier-to-manage fashion. With its roots grounded in years of Google cloud-computing experience, Kubernetes also brings with it powerful Docker container orchestration features to help enable a microservice architecture. These include:

Applications provisioned within Kubernetes are assured to always be running and making the most efficient use of on-premises resources. This is accomplished with relatively little additional overhead: just three small VMs are dedicated as redundant master nodes. The complete provisioned cluster provides everything needed for an application execution ecosystem (see Figure 2):

Figure 2: Kubernetes VM layout for a distribution center

Master nodes ensure the cluster is running smoothly, tracking the status of each component and handling API requests for deployments and updates; worker nodes remain busy executing application workloads; NFS servers meet locally shared file-system-based requirements; and load balancer nodes intercept and route external, non-Kubernetes-based traffic into the cluster.
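The node roles described above can be written down as a small data structure. The following is only an illustrative sketch: the three master nodes come from the text, while the counts for workers, NFS servers and load balancers are placeholders, and the helper assumes the masters coordinate by strict majority, which the article does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    role: str     # role name as used in the article
    count: int    # number of VMs in the group
    purpose: str  # what the group does, paraphrased from the text


# Only the master count (three) is taken from the article.
# All other counts are hypothetical placeholders.
DC_CLUSTER = [
    NodeGroup("master", 3, "track component status and serve API requests"),
    NodeGroup("worker", 6, "execute application workloads"),
    NodeGroup("nfs", 1, "provide locally shared file systems"),
    NodeGroup("load-balancer", 2, "route external traffic into the cluster"),
]


def tolerated_master_failures(master_count: int) -> int:
    """Masters that may fail while a strict majority survives.

    Assumes majority-based coordination among masters (an assumption,
    not something the article confirms).
    """
    return (master_count - 1) // 2


if __name__ == "__main__":
    masters = next(g.count for g in DC_CLUSTER if g.role == "master")
    print(f"{masters} masters tolerate {tolerated_master_failures(masters)} failure(s)")
```

Under that majority assumption, three masters is the smallest group that can lose a node and keep operating, which would be consistent with the "redundant" description above.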

Once the Kubernetes clusters are provisioned by OneOps, each freestanding cluster is hooked into Jenkins-based continuous delivery pipelines. These pipelines ensure applications are delivered to each distribution center in a timely and efficient fashion (see Figure 3).

Figure 3: Jenkins / CD pipeline interactions with Kubernetes clusters

With each application treated as its own isolated microservice component, Jenkins has proven fast and effective not only as a tool for building software, but also as a foundational platform for Walmart’s continuous delivery pipelines.
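To make the idea of one pipeline serving many freestanding clusters concrete, here is a minimal sketch of a fan-out step. It is not Walmart's pipeline code: the function names, the cluster identifiers and the image reference are all hypothetical, and the choice to let one failing cluster leave the others unaffected is an assumption that follows from the autonomy theme rather than from any stated implementation detail.

```python
from typing import Callable, Dict, Iterable


def roll_out(image: str,
             clusters: Iterable[str],
             deploy: Callable[[str, str], None]) -> Dict[str, str]:
    """Deploy one image to every cluster and record a per-cluster outcome."""
    results: Dict[str, str] = {}
    for cluster in clusters:
        try:
            deploy(cluster, image)
            results[cluster] = "ok"
        except Exception as err:
            # A failure in one distribution center is recorded, not propagated,
            # so the remaining clusters still receive the update.
            results[cluster] = f"failed: {err}"
    return results


def fake_deploy(cluster: str, image: str) -> None:
    """Stand-in for a real deployment step; fails for one hypothetical DC."""
    if cluster == "dc-0002":
        raise RuntimeError("cluster unreachable")


if __name__ == "__main__":
    outcome = roll_out("inventory:1.4.2", ["dc-0001", "dc-0002", "dc-0003"], fake_deploy)
    for name, status in outcome.items():
        print(name, status)
```

In a real pipeline the `deploy` callable would wrap whatever mechanism talks to each cluster's API; it is passed in here so the sketch stays independent of any particular tool.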

Jenkins alone isn’t sufficient as a deployment mechanism. Any Docker-based solution requires a solid image registry: a single “source of truth” for Docker binaries. Having already invested in Nexus for Java, .NET, Linux RPMs and, more recently, npm-based artifacts, the obvious solution for Walmart Technology was to expand Nexus’ already impressive feature list to include Docker registry capabilities. Furthermore, given a distribution center’s autonomous nature, a local registry within each DC becomes a must, both to ensure clusters are able to operate independently and to buffer against excessive external network traffic. Once again, the same Nexus solution, but running as a more optimized Docker container image, is deployed within each DC’s Kubernetes cluster, serving mostly as a proxy and cache to the enterprise-wide registry (see Figure 4).

Figure 4: Nexus (Docker registry) Topology

As shown in the above diagram, Jenkins pipelines deliver application updates directly to the enterprise Nexus instance in the form of new Docker image versions. Each DC Kubernetes cluster pulls images from the central registry on demand and caches them within its local file system. The resulting approach forms a complete federated image management infrastructure, powered by Nexus, Jenkins and Kubernetes.
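The proxy-and-cache behavior described above follows the general pull-through cache pattern, which can be sketched in a few lines. This is a toy model, not Nexus code: the class names are hypothetical, images are reduced to byte strings, and the cache assumes that a given image version never changes once published (a mutable tag would need revalidation against the upstream).

```python
class UpstreamUnavailable(Exception):
    """Raised when the enterprise registry cannot be reached."""


class EnterpriseRegistry:
    """Stand-in for the central registry that pipelines push to."""

    def __init__(self) -> None:
        self.images: dict[str, bytes] = {}
        self.online = True
        self.fetch_count = 0  # how many requests actually reached upstream

    def push(self, ref: str, blob: bytes) -> None:
        self.images[ref] = blob

    def fetch(self, ref: str) -> bytes:
        if not self.online:
            raise UpstreamUnavailable(ref)
        self.fetch_count += 1
        return self.images[ref]


class LocalProxyRegistry:
    """Per-DC registry: serve from the local cache, go upstream only on a miss."""

    def __init__(self, upstream: EnterpriseRegistry) -> None:
        self.upstream = upstream
        self.cache: dict[str, bytes] = {}

    def pull(self, ref: str) -> bytes:
        if ref not in self.cache:
            # Cache miss: fetch once from the central registry and keep a copy.
            self.cache[ref] = self.upstream.fetch(ref)
        return self.cache[ref]


if __name__ == "__main__":
    central = EnterpriseRegistry()
    central.push("inventory:1.4.2", b"image-layers")
    local = LocalProxyRegistry(central)
    local.pull("inventory:1.4.2")   # first pull goes upstream
    local.pull("inventory:1.4.2")   # second pull is served locally
    central.online = False
    local.pull("inventory:1.4.2")   # still works while upstream is unreachable
    print("upstream fetches:", central.fetch_count)
```

The two properties the article asks of the local registry show up directly: repeated pulls generate a single upstream request, and already-cached images remain available when the link to the enterprise registry is down.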

By combining these key open source products into a cohesive solution, Walmart Technology is able to deliver the level of flexibility and isolation its modern logistics systems require. Development teams are empowered to make the right technology choices for the solutions their logistics partners need. Walmart Logistics end users and management are happy because they can use existing distribution center assets and see improvements in the speed of delivery, systems uptime and asset utilization. And finally, Walmart customers ultimately benefit from a system that ensures their favorite items are in stock at Everyday Low Prices.

Comments

  1. Sounds very similar to what we plan on doing. For us, it’d be Bamboo for CI/CD, Nexus for images and source of truth, and OpenShift on Kubernetes.

    1. Sounds like a great stack, Cassius! We really liked what we saw with OpenShift, too, and had some great discussions with Red Hat on possible solutions. For us it made sense to eventually go with our own internal tools such as OneOps, adding Kubernetes cluster provisioning features, enabling us to unify our provisioning and monitoring mechanisms across the enterprise.

  2. We are building a very similar stack and I’ve been attempting to bring flanneld into vSphere with only partial success. (cbr0 is still taking over and wanting to run the place). I’d be interested in talking with one of your engineers on how they were able to configure the overlay and if at all possible for them to share their work.

  3. Hello Dollan/Barve,

    It’s a very nice article, easy to read and very well explained. I would like to know which Nexus version you are currently working with for the Docker registry.

    1. Hi Alexander, thanks for your question. We are using stock Nexus but added the Docker registry features on top of it (we’ve got some great internal talent working on Maven and Nexus!). The binaries for this version are being posted to Maven Central as well as Docker Hub. Look for the source code to be released on GitHub soon!

  4. This is a really great case study! Thanks for posting. Please, submit a talk for KubeCon Europe or North America 2017. 🙂
