Kubernetes cluster configuration and compliance with Jetstack Preflight

Jetstack was formed to help companies get value out of Kubernetes. Since the early days of the project, we have learned plenty about what it takes to run Kubernetes successfully, sometimes the hard way!

It’s this valuable experience that we bring to our customers, giving them the confidence to take services to production and scale their platforms.

As Kubernetes permeates businesses, we see customers run clusters in a huge variety of ways. These can span projects, products, regions and even cloud providers. One thing we see customers consistently struggle with is ensuring that their clusters conform to organisational policies, especially in regulated environments. For that reason, we have created Preflight.

Introducing Jetstack Preflight

We’re pleased to announce a new member of our open source family. Meet Preflight, a tool to verify Kubernetes cluster configuration and ensure security and compliance. Preflight enables cluster operators to perform checks on clusters based on a set of predefined rules. You can also craft these rules yourself, and this post shows how they work.

How Preflight works

When it comes to policy enforcement, there is one project that has been in the Kubernetes ecosystem for some time now and is gaining adoption: Open Policy Agent (OPA). Preflight builds on OPA’s policy engine.

Policies are encoded into what we call Preflight Packages. A Preflight Package is a set of OPA Rego files and a YAML file, the Policy Manifest, that adds some metadata to describe the policy.

[Figure: Package]

Let’s assume your organisation allows people to create their own GKE clusters, provided they follow a policy that requires those clusters not to expose services to the public Internet.

Naturally, your organisation wants people to self-serve, so they are not blocked by internal bureaucracy. However, this approach has a drawback: it is all too easy for someone to misconfigure their cluster and expose an internal service by accident.

So, let’s write a package that encodes the policy of “clusters must be private”. Preflight can fetch data from different APIs and evaluate that data against predefined policies. Those sources of information are called Data Gatherers. The Google Cloud Platform API is one of the supported Data Gatherers; specifically, the GKE Data Gatherer collects information about one specific GKE cluster. The Preflight Package in this example depends on this GKE Data Gatherer.

Now let’s write our policy in the Policy Manifest file, in human-readable form:

# ./preflight-packages/mycompany.io/gke/policy-manifest.yaml
id: "gke"
namespace: "mycompany.io"
data-gatherers:
- gke
# `root-query` selects the Rego package: `data.<repo_package>`.
# In this case, it corresponds to the package in ./gke.rego.
root-query: "data.gke"
name: GKE checks
description: >
  This package contains company-wide policies for GKE.
sections:
- id: networking
  name: Networking
  rules:
  - id: "private_cluster"
    name: Private cluster enabled
    description: >
      Enabling private cluster means Nodes do not have public IP addresses, and
      are therefore isolated from the public internet. They can then only
      communicate with the internet using configured gateways, such as Cloud
      NAT, thus limiting the possible exposure of workloads.
    remediation: >
      Changing this setting requires re-creation of the cluster. The private
      cluster option must be selected or specified when creating a GKE cluster.
    links:
    - "https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters"

We are done with the metadata of our policy. Now we need to write the actual policy in OPA’s Rego, and to do that, we need to know how the input data is structured. Below is a Rego file encoding the policy:

# ./preflight-packages/mycompany.io/gke/gke.rego
package gke

# See https://github.com/jetstack/preflight/blob/master/docs/datagatherers/gke.md for more details
import input.gke.Cluster as gke

# Rule 'private_cluster'
default preflight_private_cluster = false
preflight_private_cluster {
        gke.privateClusterConfig.enablePrivateNodes == true
}
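
To make the rule's semantics concrete, here is a rough Python analogue of what the Rego above computes. It is only an illustration, not how Preflight or OPA actually evaluates policies. The function name `private_cluster_enabled` and the sample inputs are hypothetical, and the shape of the input is assumed from the `import input.gke.Cluster as gke` line. Note how a missing field yields `False`, mirroring the `default ... = false` line.

```python
from typing import Any, Mapping


def private_cluster_enabled(input_doc: Mapping[str, Any]) -> bool:
    """Python analogue of the `preflight_private_cluster` Rego rule.

    Assumes the gathered data looks like:
    {"gke": {"Cluster": {"privateClusterConfig": {"enablePrivateNodes": ...}}}}
    """
    node: Any = input_doc
    for key in ("gke", "Cluster", "privateClusterConfig", "enablePrivateNodes"):
        if not isinstance(node, Mapping) or key not in node:
            # In Rego, an undefined reference makes the rule body fail,
            # so the `default` value (false) applies.
            return False
        node = node[key]
    # The rule only passes when the value is exactly `true`.
    return node is True


# Hypothetical inputs, for illustration only.
private = {"gke": {"Cluster": {"privateClusterConfig": {"enablePrivateNodes": True}}}}
public = {"gke": {"Cluster": {"privateClusterConfig": {"enablePrivateNodes": False}}}}
unset = {"gke": {"Cluster": {}}}

for name, doc in (("private", private), ("public", public), ("unset", unset)):
    print(name, private_cluster_enabled(doc))
```

The `default` line matters: without it, the Rego rule would be undefined rather than false for a cluster that has no `privateClusterConfig` at all.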

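The example suggests a naming convention: the manifest rule `private_cluster` is implemented by a Rego rule called `preflight_private_cluster`. Assuming that convention holds in general (it is inferred from this example, so check the Preflight documentation), a quick consistency check between the two files could be sketched as follows. `rules_without_rego` is a hypothetical helper, not part of Preflight, `network_policy` is a made-up rule ID, and the regular expression is a rough heuristic rather than a Rego parser.

```python
import re
from typing import Iterable, List

# Matches rule heads such as `preflight_foo {` or `default preflight_foo = false`
# at the start of a line (assumed naming convention: preflight_<rule id>).
RULE_HEAD = re.compile(r"^(?:default\s+)?preflight_(\w+)", re.MULTILINE)


def rules_without_rego(manifest_rule_ids: Iterable[str], rego_source: str) -> List[str]:
    """Return the manifest rule IDs that have no matching Rego rule."""
    defined = set(RULE_HEAD.findall(rego_source))
    return [rule_id for rule_id in manifest_rule_ids if rule_id not in defined]


# The Rego source from this post, inlined so the sketch is self-contained.
REGO = """\
package gke

import input.gke.Cluster as gke

default preflight_private_cluster = false
preflight_private_cluster {
        gke.privateClusterConfig.enablePrivateNodes == true
}
"""

print(rules_without_rego(["private_cluster"], REGO))
print(rules_without_rego(["private_cluster", "network_policy"], REGO))
```

The rule IDs are passed in as a plain list here to keep the sketch free of third-party YAML dependencies; in practice they would be read from `policy-manifest.yaml`.
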
Tell Preflight what to do

Now we write a configuration file that tells Preflight which information it needs to gather, which packages it needs to evaluate, and how the report should be presented.

# ./preflight.yaml
cluster-name: my-cluster

data-gatherers:
  gke:
    project: my-gcp-project
    zone: us-central1-a
    cluster: my-gke-cluster

package-sources:
- type: local
  dir: preflight-packages/

enabled-packages:
  - "mycompany.io/gke"

outputs:
- type: local
  path: ./output
  format: json
- type: cli
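
In this example, the entry in `enabled-packages` lines up with a directory under the local package source: `mycompany.io/gke` corresponds to `preflight-packages/mycompany.io/gke/`. A small sanity check to run before invoking the tool might look like the sketch below. It assumes that directory layout, which is inferred from the file paths used in this post rather than taken from Preflight's documentation, and `missing_package_files` is a hypothetical helper, not part of Preflight.

```python
from pathlib import Path
from typing import Dict, List


def missing_package_files(source_dir: str, enabled: List[str]) -> Dict[str, List[str]]:
    """Return, per enabled package, the expected files that are absent.

    Assumed layout (inferred from the examples in this post):
      <source_dir>/<namespace>/<id>/policy-manifest.yaml
      <source_dir>/<namespace>/<id>/*.rego
    """
    problems: Dict[str, List[str]] = {}
    for package in enabled:
        package_dir = Path(source_dir) / package
        missing = []
        if not (package_dir / "policy-manifest.yaml").is_file():
            missing.append("policy-manifest.yaml")
        # Guard with is_dir() so a missing directory is reported, not raised.
        if not (package_dir.is_dir() and any(package_dir.glob("*.rego"))):
            missing.append("*.rego")
        if missing:
            problems[package] = missing
    return problems


if __name__ == "__main__":
    # Values copied from the preflight.yaml example above.
    print(missing_package_files("preflight-packages/", ["mycompany.io/gke"]))
```

An empty dictionary means every enabled package has a manifest and at least one Rego file in the assumed location.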

Executing Preflight

Once both the Preflight Package(s) and the configuration file are ready, we can run the tool.

[Figure: Run Preflight]

Visualising the results

Humans weren’t born to read JSON. That’s why we have put together a web interface where you can simply upload your JSON-formatted report and visualise the results in your browser.

[Figure: Report]

We have some report samples (gke.json, pods.json) in the repo. You can load them in preflight.jetstack.io to get a feel for the tool.

Jetstack Subscription

We have encoded our best practices into ready-made policies, updated and maintained by Jetstack, which you can obtain with a Jetstack Subscription. Together with remediation advice for the checks and online training on putting this into practice, you’re in safe hands as you head into production and beyond.

As the project progresses, we plan to add new Data Gatherers, as well as some other interesting features. Stay tuned!

We invite you to visit the GitHub project, give it a try, and contribute. Any feedback is welcome.