
OpenShift Architecture Deep Dive: Control Plane, Operator and Network Design [2026]

13 min read
#OpenShift #Architecture Design #Kubernetes #Operator #OVN-Kubernetes


To use OpenShift well, you first need to understand its architecture. Otherwise when problems occur, you won't even know where to look.

OpenShift's architecture layers quite a bit on top of native Kubernetes, but the core logic isn't hard to understand. This article starts from the high-level architecture diagram and breaks the platform down layer by layer to build a complete technical picture. If you're not yet familiar with OpenShift, we recommend first reading the OpenShift Complete Guide.


OpenShift Architecture Overview

High-Level Architecture Diagram

OpenShift architecture can be divided into four major layers:

┌─────────────────────────────────────────────────────────┐
│                    Application Layer                     │
│         Pods │ Deployments │ Services │ Routes          │
├─────────────────────────────────────────────────────────┤
│                  OpenShift Platform Services             │
│   Console │ Monitoring │ Logging │ Registry │ Pipelines │
├─────────────────────────────────────────────────────────┤
│                    Kubernetes Layer                      │
│        API Server │ Scheduler │ Controllers             │
├─────────────────────────────────────────────────────────┤
│                   Operating System Layer                 │
│              Red Hat CoreOS (RHCOS)                      │
├─────────────────────────────────────────────────────────┤
│                   Infrastructure Layer                   │
│          Cloud │ Virtualization │ Bare Metal │ Edge     │
└─────────────────────────────────────────────────────────┘

Differences from Native Kubernetes

OpenShift isn't just "Kubernetes + some tools"—it has several fundamental architectural differences:

| Aspect | Kubernetes | OpenShift |
| --- | --- | --- |
| Operating System | Any Linux | RHCOS (immutable) |
| Installation Method | kubeadm and others | Unified installer |
| Component Management | Manual or Helm | Unified Operator management |
| Default Security | More permissive | Strict SCC restrictions |
| Network Plugin | Pluggable | OVN-Kubernetes |
| Upgrade Mechanism | Manual coordination | Operator-automated |

Design Philosophy

Red Hat's OpenShift 4.x design has several core philosophies:

1. Immutable Infrastructure

The node operating system (RHCOS) is read-only; configuration changes are applied uniformly through the Machine Config Operator. Benefits:

  • All node configurations are consistent
  • Changes are trackable and rollback-able
  • Reduces human operational errors
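As a sketch of how this works in practice, a MachineConfig like the following (the file path and contents are purely illustrative) declares a file that the Machine Config Operator rolls out to every worker node, node by node, with the change recorded and rollback-able:

```yaml
# Illustrative MachineConfig: adds a config file to all worker nodes.
# The role label tells the MCO which MachineConfigPool this applies to.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-custom-file
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
      - path: /etc/example/custom.conf   # hypothetical file
        mode: 0644
        contents:
          source: data:,example%20setting
```

Applying this triggers a coordinated rollout: the MCO cordons, updates, and reboots nodes one at a time, which is why node configuration is consistent and trackable.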

2. Everything is an Operator

All OpenShift components are managed with Operators. Operators are "self-managing applications" that know how to install, upgrade, and repair themselves.

3. Secure by Default

Security isn't added afterwards—it's built into the architecture. SCC (Security Context Constraints), network isolation, and image scanning are enabled by default.


Control Plane Components

The Control Plane is the cluster's brain, responsible for receiving requests, making decisions, and maintaining state.

API Server

The API Server is the cluster's single entry point.

All operations, whether kubectl commands, Web Console clicks, or Operator requests, must go through the API Server.

kubectl ─────┐
Web Console ─┼───→ API Server ───→ etcd
Operator ────┘         │
                       ↓
              Controller Manager
                  Scheduler

API Server responsibilities:

  • Authenticate request identity (Authentication)
  • Check request permissions (Authorization)
  • Validate and, where configured, mutate requests (Admission)
  • Write state to etcd
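You can observe this single entry point from the client side. A few real `oc` commands that talk straight to the API Server (they require an authenticated session against a running cluster):

```shell
# Print the API server URL the current context talks to
oc whoami --show-server

# Query the API server's readiness and version endpoints directly;
# every other oc subcommand ultimately goes through these same APIs
oc get --raw /readyz
oc get --raw /version
```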

etcd

etcd is the cluster's memory.

All cluster state is stored in etcd: Pod definitions, Service configurations, ConfigMaps, Secrets... everything.

etcd characteristics:

  • Distributed key-value store
  • Strong consistency (Raft consensus algorithm)
  • OpenShift defaults to 3 nodes for high availability

etcd health directly affects the entire cluster. If etcd goes down, the cluster cannot accept any changes.
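A quick way to check etcd health from inside the cluster (pod names vary per cluster; `<etcd-pod-name>` is a placeholder):

```shell
# List the etcd member pods on the control plane
oc get pods -n openshift-etcd -l app=etcd

# Ask one member for the health of all cluster endpoints
oc rsh -n openshift-etcd <etcd-pod-name> etcdctl endpoint health --cluster
```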

Controller Manager

Controller Manager is responsible for "making reality match expectations."

You say "I want 3 Pods," Controller Manager ensures there are always 3 Pods running. If there are fewer, it adds; if more, it removes.

OpenShift has multiple Controllers:

  • Deployment Controller: Manages Deployment's ReplicaSets
  • ReplicaSet Controller: Ensures correct Pod count
  • Node Controller: Monitors node health
  • Service Controller: Manages cloud load balancers

Scheduler

The Scheduler decides which Node a Pod runs on.

When you create a Pod, the Scheduler will:

  1. Filter Nodes that don't meet conditions (insufficient resources, label mismatch, taint rejection)
  2. Score remaining Nodes
  3. Select the highest-scoring Node

Scoring factors include:

  • Resource utilization (prefers less busy Nodes)
  • Pod spreading (same Deployment Pods distributed apart)
  • Node affinity (preference based on labels)
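These scoring preferences can be nudged from the Pod spec. A minimal sketch (labels and image are illustrative) that asks the Scheduler to prefer, but not require, nodes carrying the worker role:

```yaml
# Illustrative Pod using preferred node affinity: a soft scoring hint,
# not a hard filter, so scheduling still succeeds if no node matches
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: node-role.kubernetes.io/worker
            operator: Exists
  containers:
  - name: app
    image: registry.example.com/app:latest   # hypothetical image
```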

OAuth Server

The OAuth Server is OpenShift-specific and handles authentication.

Native Kubernetes has only basic authentication mechanisms. OpenShift adds a complete OAuth 2.0 implementation:

  • Supports multiple identity providers (LDAP, AD, GitHub, Google)
  • Issues OAuth Access Tokens
  • Manages users and groups

# OAuth configuration example
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: ldap
    type: LDAP
    ldap:
      url: "ldap://ldap.example.com/ou=users,dc=example,dc=com?uid"

Worker Node Architecture

Worker Nodes are where applications actually run.

Node Components

Each Worker Node runs:

ComponentFunction
KubeletNode agent, manages Pod lifecycle
CRI-OContainer runtime, executes containers
OVN-KubernetesNetwork plugin, handles networking
Node Problem DetectorDetects node issues

CRI-O Container Runtime

OpenShift uses CRI-O instead of Docker:

  • Lightweight runtime designed specifically for Kubernetes
  • Complies with OCI (Open Container Initiative) standard
  • Leaner and more secure than Docker

CRI-O does one thing: run containers. Docker, by contrast, also bundles image building, network management, and other features.
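Because there is no Docker daemon on the node, container inspection goes through the CRI tooling instead. A sketch via `oc debug` (the node name is a placeholder):

```shell
# List running containers on a node through the CRI interface, not Docker
oc debug node/<worker-node> -- chroot /host crictl ps
```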

Kubelet Configuration

OpenShift's Kubelet settings are unified through Machine Config Operator:

# KubeletConfig example
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
  kubeletConfig:
    maxPods: 250
    podPidsLimit: 4096

Node Scaling

OpenShift manages nodes through Machine API:

  • Machine: Represents one node instance
  • MachineSet: Manages a group of identically configured Machines
  • MachineAutoScaler: Automatically scales MachineSets

# MachineSet example
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: worker-us-east-1a
spec:
  replicas: 3
  selector:
    matchLabels:
      machine.openshift.io/cluster-api-machineset: worker-us-east-1a

Operator Mechanism Deep Dive

Operators are OpenShift 4.x's most important architectural innovation.

What is an Operator?

An Operator is a design pattern: it encodes an application's operational knowledge into code.

Traditional approach:

  1. Person reads documentation to learn installation
  2. Person manually executes installation steps
  3. Person handles upgrades, backups, failures

Operator approach:

  1. Operator knows how to install (coded)
  2. Operator automatically executes installation
  3. Operator automatically handles upgrades, backups, failures

Operator Lifecycle Manager (OLM)

OLM is the "Operator that manages Operators."

OLM responsibilities:

  • Install Operators from OperatorHub
  • Manage Operator upgrades
  • Handle dependencies between Operators

# Subscription that installs an Operator
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: elasticsearch-operator
  namespace: openshift-operators
spec:
  channel: stable
  name: elasticsearch-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Built-in OpenShift Operators

All OpenShift's own components are Operators:

| Operator | Manages |
| --- | --- |
| Cluster Version Operator | Cluster version, upgrades |
| Machine Config Operator | Node configuration |
| Console Operator | Web Console |
| Ingress Operator | Router, Ingress |
| Monitoring Operator | Prometheus, Alertmanager |
| Logging Operator | Log collection |

View all ClusterOperators:

oc get clusteroperators

Example output:

NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED
authentication                             4.16.0    True        False         False
console                                    4.16.0    True        False         False
ingress                                    4.16.0    True        False         False
monitoring                                 4.16.0    True        False         False

Custom Operator Development

Enterprises can also develop their own Operators. Common tools:

  • Operator SDK: Development framework from Red Hat
  • Kubebuilder: Framework from Kubernetes SIG
  • KUDO: Declarative framework from D2iQ

Operator development steps:

  1. Define Custom Resource Definition (CRD)
  2. Implement Controller logic
  3. Package as container image
  4. Publish to OperatorHub
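Step 1 above, defining a CRD, can look like this minimal sketch (the `Backup` resource and `example.com` group are purely illustrative, not a real Operator's API):

```yaml
# Illustrative CRD a custom Operator might define; the Controller would
# then watch Backup objects and reconcile them against real state
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              schedule:
                type: string   # e.g. a cron expression the Controller interprets
```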

Network Architecture

OpenShift networking is the part many find complex, but the core concepts are actually clear.

OVN-Kubernetes

OpenShift 4.x uses OVN-Kubernetes as the default network plugin:

  • Based on Open Virtual Network (OVN)
  • Supports Kubernetes NetworkPolicy
  • Provides advanced features (Egress IP, multi-networking)

OVN-Kubernetes advantages:

  • Better performance than earlier OpenShift SDN
  • Native IPv6 dual-stack support
  • Better observability

Pod Network

Each Pod has its own IP address, and Pods within the cluster can communicate directly.

Pod A (10.128.0.5) ←──→ Pod B (10.128.1.10)
        │                      │
        └──────── OVN ─────────┘

Pod network design:

  • Default Pod CIDR: 10.128.0.0/14
  • Each Node gets a subnet
  • Pods use overlay network
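The actual CIDRs for a given cluster can be read from the cluster network configuration (values differ per installation):

```shell
# Inspect the cluster-wide Pod and Service CIDRs
oc get network.config/cluster -o jsonpath='{.spec.clusterNetwork}'
oc get network.config/cluster -o jsonpath='{.spec.serviceNetwork}'
```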

Service and Ingress

Service provides stable access endpoints:

| Service Type | Description |
| --- | --- |
| ClusterIP | Internal cluster IP (default) |
| NodePort | Opens a port on every Node |
| LoadBalancer | Cloud load balancer |
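As a minimal sketch, a ClusterIP Service fronting an application (names and ports are illustrative; the selector must match the Pods' labels):

```yaml
# Illustrative ClusterIP Service: a stable virtual IP and DNS name
# in front of whatever Pods currently match the selector
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80         # port the Service exposes
    targetPort: 8080 # port the container listens on
```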

Route is OpenShift's own Ingress implementation:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-app
spec:
  host: myapp.apps.cluster.example.com
  to:
    kind: Service
    name: my-app
  tls:
    termination: edge

Route appeared before Kubernetes Ingress and has more complete features (like TLS re-encryption, blue-green deployment weights).

Network Policy

Network Policy controls network access between Pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

By default, all Pods can communicate with each other. Once a NetworkPolicy selects a Pod, only explicitly allowed traffic can reach it.
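A deny-all policy is usually paired with explicit allow rules. An illustrative companion policy that re-admits traffic between Pods in the same namespace:

```yaml
# Illustrative allow rule: with deny-all in place, this re-permits
# ingress from any Pod in the same namespace (empty podSelector = all Pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
```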

Network architecture design affects overall performance and security. Book an architecture consultation and let us help review your design.


Storage Architecture

Storage Concepts

OpenShift uses Kubernetes standard storage abstractions:

| Concept | Description |
| --- | --- |
| PersistentVolume (PV) | Cluster-level storage resource |
| PersistentVolumeClaim (PVC) | User's request for storage |
| StorageClass | Storage "specification" that defines how PVs are dynamically created |

Dynamic Provisioning

Most situations use Dynamic Provisioning:

  1. User creates PVC, specifying StorageClass
  2. StorageClass's Provisioner automatically creates PV
  3. PV and PVC bind
  4. Pod mounts PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp3-csi

OpenShift Data Foundation (ODF)

ODF is Red Hat's software-defined storage solution:

  • Based on Ceph and Rook
  • Provides block, file, and object storage
  • Unified storage across clouds and regions

ODF suits:

  • Need ReadWriteMany (RWX) access mode
  • Want storage and compute integration
  • Need consistent storage experience across clouds

CSI Drivers

OpenShift supports various CSI (Container Storage Interface) drivers:

  • AWS EBS CSI
  • Azure Disk CSI
  • GCP PD CSI
  • VMware vSphere CSI
  • NetApp Trident
  • Pure Storage

Security Architecture

OpenShift's security is "strict by default," very different from native Kubernetes's "permissive by default."

RBAC (Role-Based Access Control)

RBAC controls "who can do what":

| Resource | Description |
| --- | --- |
| Role | Namespace-level permissions |
| ClusterRole | Cluster-level permissions |
| RoleBinding | Binds a Role to users |
| ClusterRoleBinding | Binds a ClusterRole to users |

# Give user developer edit permission in my-project
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-edit
  namespace: my-project
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: User
  name: developer

SCC (Security Context Constraints)

SCC is an OpenShift-specific security mechanism that controls what Pods are allowed to do.

Default SCCs:

| SCC | Restriction Level | Description |
| --- | --- | --- |
| restricted | Strictest | Former default; prohibits privileged operations |
| restricted-v2 | Strict | New default; complies with Pod Security Standards |
| nonroot | Medium | Allows non-root users |
| anyuid | Permissive | Allows any UID |
| privileged | Most permissive | Allows privileged containers |

Many applications migrating from Docker or native K8s run into permission issues, usually because OpenShift defaults to the restricted SCC.
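When an application legitimately needs a looser SCC, the usual pattern is a dedicated ServiceAccount rather than loosening the namespace default. A sketch (`my-app` and `my-project` are placeholders):

```shell
# Create a ServiceAccount, grant it the anyuid SCC,
# then point the workload's Pods at that ServiceAccount
oc create serviceaccount my-app -n my-project
oc adm policy add-scc-to-user anyuid -z my-app -n my-project
oc set serviceaccount deployment/my-app my-app -n my-project
```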

Image Security

OpenShift has built-in image security mechanisms:

  • Image Signature Verification: Ensures image source is trusted
  • Image Scanning: Integrates Clair, ACS for vulnerability scanning
  • Image Policy: Restricts usable Registries

# Restrict image pulls to specific registries
apiVersion: config.openshift.io/v1
kind: Image
metadata:
  name: cluster
spec:
  registrySources:
    allowedRegistries:
    - quay.io
    - registry.redhat.io
    - image-registry.openshift-image-registry.svc:5000

High Availability Design

Control Plane HA

Production OpenShift environments should run 3 Control Plane nodes:

  • API Server: Runs on every Control Plane node, with a load balancer in front
  • etcd: 3-node cluster, Raft consensus ensures consistency
  • Controller/Scheduler: Leader Election, only one active

                    Load Balancer
                         │
         ┌───────────────┼───────────────┐
         │               │               │
    ┌────▼────┐     ┌────▼────┐     ┌────▼────┐
    │ Master 1│     │ Master 2│     │ Master 3│
    │ API     │     │ API     │     │ API     │
    │ etcd    │◄───►│ etcd    │◄───►│ etcd    │
    │ Ctrl    │     │ Ctrl    │     │ Ctrl    │
    └─────────┘     └─────────┘     └─────────┘

Multi-Zone Deployment

In cloud environments, we recommend deploying across 3 Availability Zones (AZs):

  • At least 1 Master, multiple Workers per AZ
  • Use Pod Anti-Affinity to spread applications
  • Storage also cross-AZ (or use ODF)

Disaster Recovery

Disaster recovery strategies:

| Strategy | RPO | RTO | Complexity |
| --- | --- | --- | --- |
| etcd Backup Restore | Hours | Hours | Low |
| Active-Passive Cluster | Minutes | Minutes | Medium |
| Active-Active Cluster | Near real-time | Near real-time | High |

Regular etcd backup is fundamental:

# Back up etcd; OpenShift 4.x documents a cluster-backup.sh script
# on the control plane nodes (<master-node> is a placeholder)
oc debug node/<master-node> -- chroot /host \
  /usr/local/bin/cluster-backup.sh /home/core/assets/backup

FAQ

Q1: How is OpenShift architecture different from Kubernetes?

Main differences in three areas: (1) the operating system is immutable RHCOS, with configuration managed by the Machine Config Operator; (2) all components follow the Operator pattern, including OpenShift's own; (3) default security is stricter, for example SCC restricting container permissions. The underlying platform is still Kubernetes, but the packaging on top is completely different.

Q2: What's the difference between Operator and Helm?

Helm is a packaging tool: it bundles a set of YAML into a Chart and expands it at install time. An Operator is a running management program that continuously monitors application state and automatically handles upgrades, backups, and failure recovery. Helm's job ends after installation; an Operator's work begins after installation. The two can be used together.

Q3: Why does OpenShift use CRI-O instead of Docker?

Docker carries many features Kubernetes doesn't need; Kubernetes only needs to "run containers." CRI-O is designed specifically for Kubernetes and implements only the CRI (Container Runtime Interface), making it lighter, more secure, and more stable. Also, Kubernetes 1.24 removed dockershim, so Docker is no longer an officially supported runtime.

Q4: What if etcd breaks?

If only one node fails, the etcd cluster handles it automatically (3 nodes tolerate 1 failure). If more than half the members fail, the cluster cannot operate and must be restored from backup. So back up etcd regularly, and store the backups outside the cluster. OpenShift 4.x can automate etcd backups, but you still need to confirm the backups are actually running.

Q5: SCC keeps blocking my application, what to do?

First confirm whether the application really needs those permissions. Many applications don't actually need root; the Dockerfile is just poorly written. If the permissions are genuinely needed: (1) adjust the Deployment's securityContext; (2) create a ServiceAccount and bind it to an appropriate SCC; (3) consider anyuid or privileged only as a last resort.


Need a Second Opinion on OpenShift Architecture?

Good architecture can cut operational costs many times over. From Control Plane configuration to network design, every decision affects subsequent stability and scalability.

Book an architecture consultation and let's review your container platform planning together.

