doomholderz Security Blog

one network policy after another

Multi-tenancy in Kubernetes involves multiple tenants operating within a single cluster.

A tenant may be a specific customer, an internal software service, or an engineering team, for whom we deploy workloads and resources that we want isolated to that tenant.

There is no native Tenant resource in Kubernetes to help us manage multi-tenancy - the onus is on us as cluster operators to construct our own logical tenant isolation boundary.
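Namespaces are the natural building block here. A common convention (and the one assumed in the examples below) is one or more labelled Namespaces per tenant - the tenant label key here is purely illustrative, but some consistent label is what gives later policies a handle on "everything belonging to this tenant":

```yaml
# hypothetical tenant namespace - the 'tenant' label key is illustrative,
# but a consistent label is what tenant-scoped policies can select on
apiVersion: v1
kind: Namespace
metadata:
  name: acme-prod
  labels:
    tenant: acme
```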


defining the tenant boundary

breaking down the boundary

Firstly we'll break down the aims of a tenant boundary into 3 distinct goals: isolating each tenant's control plane access, its compute, and its network traffic.

This blog focusses on the last of these - how we build scalable tenant network isolation in a multi-tenant cluster.

traffic patterns within the boundary

To build secure tenant network isolation, we first need to understand the types of network traffic that pass through a tenant:

- ingress (north-south) traffic from the cluster's ingress controller into tenant workloads
- east-west traffic between Pods within the tenant
- egress traffic from Pods out to external (internet) services
- vertical traffic from Pods to services running on the Node itself (e.g. the kubelet)
- egress traffic from the Nodes themselves out to external services


a layered policy approach

types of policy

To provide tenant isolation across these traffic patterns, we will layer a series of network policies to create a strict default baseline from which we will explicitly allow necessary traffic:

- a baseline policy: a one-time, cluster-wide default-deny
- ingress policies: ingress controller to workload
- east-west policies: pod to pod
- pod egress policies: pod to external service
- vertical policies: pod to node
- node egress policies: node to external service

the policies in practice

These policies assume the presence of Cilium as the cluster CNI of choice. Mileage with other CNIs may vary.

Check out this repo for the following policies and associated resources to get this setup in a Kind cluster.
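For reference, the Kind side of that setup generally needs the default CNI disabled so Cilium can take over the datapath - something along these lines (cluster name and node counts are illustrative):

```yaml
# illustrative Kind config - disable the default CNI so Cilium (installed
# separately, e.g. via helm) owns networking; add kubeProxyMode: "none"
# here too if running Cilium's kube-proxy replacement
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: multi-tenant-demo
networking:
  disableDefaultCNI: true
nodes:
- role: control-plane
- role: worker
- role: worker
```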


default-deny.yaml (baseline policy)

For our foundation policy, we want a one-time deploy of a cluster-wide 'default-deny' policy. This blocks all traffic (aside from egress to kube-dns) and enforces that all subsequent traffic flows must be explicitly declared.

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "default-deny"
spec:
  description: "block all the traffic (except egress to CoreDNS) by default"
  egress:
  - toEndpoints:
    - matchLabels:
        io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: '53'
        protocol: UDP
      rules:
        dns:
        - matchPattern: '*'
  endpointSelector:
    matchExpressions:
    - key: io.kubernetes.pod.namespace
      operator: NotIn
      values:
      - kube-system

ingress-to-server.yaml (ingress policy)

Next we want to allow traffic from our ingress controller Pods to workload services. We must define both an ingress and an egress policy, allowing traffic from specific Pods in our ingress namespace to specific Pods in our destination service's namespace.

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-nginx-ingress
  namespace: server-ns
spec:
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - fromEndpoints:
    - matchLabels:
        app.kubernetes.io/name: ingress-nginx
        io.kubernetes.pod.namespace: ingress-nginx
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: nginx-egress-to-server
  namespace: ingress-nginx
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  egress:
  - toEndpoints:
    - matchLabels:
        app: server
        io.kubernetes.pod.namespace: server-ns
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP

pod-to-pod.yaml (east-west policy)

Any pod-to-pod communication a tenant requires will be facilitated by an ingress/egress policy pair. These policies may begin as L3/4, and can be matured with L7 filters (e.g. allowing only access to specific endpoints of the server).

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-to-web-server
  namespace: client-ns
spec:
  endpointSelector:
    matchLabels:
      app: network-client
  egress:
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": server-ns
        "k8s:app": server
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
---
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-ingress-from-client
  namespace: server-ns
spec:
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - fromEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": client-ns
        "k8s:app": network-client
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
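As an example of that L7 maturing, the ingress half above could be tightened to specific HTTP endpoints - the /api/v1/ path here is purely illustrative:

```yaml
# illustrative L7 variant of allow-ingress-from-client:
# same L3/4 scope, but the client may now only GET under /api/v1/
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-ingress-from-client-l7
  namespace: server-ns
spec:
  endpointSelector:
    matchLabels:
      app: server
  ingress:
  - fromEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": client-ns
        "k8s:app": network-client
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/.*"
```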

pod-egress.yaml (pod egress policy)

Pods needing to communicate with internet services require pod egress policies that specify the FQDN of the host.
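At L3/4 this is straightforward - a toFQDNs rule paired with a DNS rule so Cilium's DNS proxy can observe lookups and learn which IPs the name resolves to (example.com is a stand-in for whatever host the Pod actually needs):

```yaml
# illustrative L3/4 FQDN egress - example.com is a stand-in hostname.
# the kube-dns rule gives Cilium's DNS proxy visibility of lookups,
# which the toFQDNs rule relies on to map the name to allowed IPs
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-to-example
  namespace: client-ns
spec:
  endpointSelector:
    matchLabels:
      app: network-client
  egress:
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP
      rules:
        dns:
        - matchName: "example.com"
  - toFQDNs:
    - matchName: "example.com"
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
```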

If we want to restrict pod egress to specific endpoints (or apply any other L7 filter) we need to set up TLS termination/re-origination within the cluster. A guide for TLS inspection within Cilium exists here; it relies on an Envoy proxy terminating TLS traffic using internally-generated certificates for the FQDNs we wish to apply L7 policies to, then re-originating TLS after policy enforcement.

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-pod-egress
  namespace: client-ns
spec:
  endpointSelector:
    matchLabels:
      app: network-client
  egress:
  - toFQDNs:
    - matchName: ""
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP
      terminatingTLS:
        secret:
          namespace: "kube-system"
          name: "-tls-data"
      originatingTLS:
        secret:
          namespace: "kube-system"
          name: "tls-orig-data"
      rules:
        http:
        - path: "/v1/data(/.*)?$"
          method: "GET"

pod-to-node.yaml (vertical policy)

Pods needing network access to Host services require a cluster-wide policy targeting the respective host endpoints. This allows us to configure an ingress rule on the Host, available only to specific endpoints and on specific ports.

Quick tip: node-level policies can (and often will) brick your cluster if you're not careful. You can configure your host's endpoint via cilium-dbg endpoint config $HOST_EP_ID PolicyAuditMode=Enabled to turn on audit-only mode for these policies, and view the intended actions of the policy via cilium-dbg monitor -t policy-verdict --related-to $HOST_EP_ID.

You can get the value of $HOST_EP_ID via cilium-dbg endpoint list and search for the Cilium endpoint ID of process ID 1.

apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "kubelet-host-policy"
spec:
  description: ""
  nodeSelector:
    matchLabels:
      node-access: kubelet
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: network-client-priv2
    toPorts:
    - ports:
      - port: "10250"
        protocol: TCP
  - fromEntities:
    - kube-apiserver
    toPorts:
    - ports:
      - port: "10250"
        protocol: TCP

node-egress.yaml (node egress policy)

Finally, in cases where a Node requires access to specific internet services, we provide access via our node egress policy. Similar to vertical policies, this relies on us selecting Nodes via labels and configuring egress policies for each traffic flow. There are several egress rules the Node requires to function effectively (e.g. access to the API server and to Cilium agents on other Nodes).

apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "worker-node-egress"
spec:
  description: "allow only essential observed egress traffic from worker nodes labeled node-access=kubelet."
  nodeSelector:
    matchLabels:
      node-access: kubelet
  egress:

  # rule 1: allow communication to the cluster API server
  - toEntities:
    - kube-apiserver
    toPorts:
    - ports:
      - port: "6443"
        protocol: TCP
      - port: "10250"
        protocol: TCP

  # rule 2: allow DNS lookups to coreDNS pods
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "53"
        protocol: ANY

  # rule 3: allow health checks to coreDNS pods
  - toEndpoints:
    - matchLabels:
        "k8s:io.kubernetes.pod.namespace": kube-system
        "k8s:k8s-app": kube-dns
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      - port: "8181"
        protocol: TCP

  # rule 4: allow cilium agent node-to-node communication for health checks
  - toEntities:
    - remote-node
    - cluster
    toPorts:
    - ports:
      - port: "4240" # default cilium agent health port
        protocol: TCP
  - toEntities:
    - remote-node
    toPorts:
    - ports:
      - port: "8472"
        protocol: UDP

  # rule 5: allow node to reach external DNS server
  # use `rules` to get cilium DNS proxy to inspect external DNS responses
  # this allows us to use FQDN-based policies on externally resolved domains
  - toEntities:
    - world
    toPorts:
    - ports:
      - port: "53"
        protocol: UDP
      rules: 
        dns:
          - matchPattern: "*"

  # rule 6: allow ICMP for cilium health checks 
  - toEntities:
    - cluster
    - remote-node
    - host
    icmps:
    - fields:
      - type: EchoRequest
        family: IPv4

  # rule 7: allow FQDN-based request to https://google.com
  - toFQDNs:
    - matchName: "google.com"
    toPorts:
    - ports:
      - port: "443"
        protocol: TCP

troubleshooting through policy woes

hubble observe

Network policies are a pain in the arse to manage, and you'll often find traffic being blocked for what feels like no good reason, or even more frustratingly find traffic riding past your carefully constructed policies untroubled.

Cilium embeds Hubble in each Cilium agent Pod to provide a means of troubleshooting these issues, and this should be the first port of call for viewing the effect (or lack thereof) of your deployed policies.

There's a great cheat sheet Isovalent provides for using the Hubble CLI to view network traffic and policies, and for multi-node clusters it's a good shout to use Hubble Relay to gain this single-pane-of-glass view across the whole cluster.