
OpenShift Logging Complete Guide: Log Collection, Analysis, and Monitoring [2026]

11 min read
#OpenShift#Logging#Loki#Vector#Observability


When something goes wrong with your system, logs are your best friend. But in Kubernetes environments, Pods can be deleted and recreated at any time, and logs disappear with them.

OpenShift Logging solves this problem by centrally collecting, storing, and querying all logs. No matter where Pods run or how long they live, logs won't be lost.

This article provides a complete introduction to OpenShift Logging architecture and configuration to help you build a reliable log management system. If you're not familiar with OpenShift yet, we recommend first reading the OpenShift Complete Guide.


Introduction to OpenShift Logging

Importance of Logs

In container environments, logs are:

  • First-hand data for troubleshooting: Application errors, performance issues
  • Key for security audits: Who did what, when
  • Compliance requirements: Many regulations require log retention
  • Source for business analysis: User behavior, usage patterns

Operating without a logging system is like driving without a dashboard.

Architecture Overview

OpenShift Logging consists of three major parts:

┌────────────────────────────────────────────────────┐
│                    Log Sources                     │
│ ┌───────────┐  ┌──────────────┐  ┌───────┐         │
│ │Application│  │Infrastructure│  │ Audit │         │
│ │   Logs    │  │     Logs     │  │ Logs  │         │
│ └─────┬─────┘  └──────┬───────┘  └───┬───┘         │
│       └───────────────┼──────────────┘             │
│                       ▼                            │
│                  ┌─────────┐                       │
│                  │ Vector  │  ← Log Collector      │
│                  └────┬────┘                       │
│                       ▼                            │
│                  ┌─────────┐                       │
│                  │  Loki   │  ← Log Storage        │
│                  └────┬────┘                       │
│                       ▼                            │
│                  ┌─────────┐                       │
│                  │ Console │  ← Log Query          │
│                  └─────────┘                       │
└────────────────────────────────────────────────────┘

5.x vs 6.x Differences

OpenShift Logging had major architecture changes in version 6.x:

| Aspect | 5.x | 6.x |
| --- | --- | --- |
| Log Collector | Fluentd | Vector |
| Log Storage | Elasticsearch | LokiStack (recommended) |
| Configuration | ClusterLogging CR | ClusterLogForwarder primary |
| Architecture Complexity | More complex | Simplified |
| Performance | Average | Improved |

6.x is the currently recommended version; this article focuses on 6.x.


Logging 6.x New Features

Vector Log Collector

Vector replaces Fluentd as the default collector.

Vector's advantages:

  • Better performance: Written in Rust, more memory and CPU efficient
  • Simpler configuration: More intuitive syntax
  • More complete features: Built-in transformation, routing, filtering

Vector deploys as a DaemonSet, with one Pod per node:

oc get pods -n openshift-logging -l app.kubernetes.io/component=collector

LokiStack

LokiStack is the recommended log storage solution, replacing Elasticsearch.

Loki's design philosophy differs from Elasticsearch:

  • Only indexes metadata (timestamps, labels), not log content
  • Lower storage costs
  • Fast queries (when labels are known)
  • Easier horizontal scaling

LokiStack architecture:

┌─────────────────────────────────────┐
│             LokiStack               │
│  ┌─────────┐  ┌─────────┐           │
│  │Ingester │  │ Querier │           │
│  └────┬────┘  └────┬────┘           │
│       │            │                │
│  ┌────▼────────────▼────┐           │
│  │   Object Storage     │           │
│  │   (S3 / ODF)         │           │
│  └──────────────────────┘           │
└─────────────────────────────────────┘

Simplified Architecture

6.x architecture is more streamlined:

Old Architecture (5.x): Fluentd → Kafka (optional) → Elasticsearch → Kibana

New Architecture (6.x): Vector → LokiStack → OpenShift Console

No separate Kibana is needed; you query logs directly in the OpenShift Console.


Installation and Configuration

Install Operators

Step 1: Install Loki Operator

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: loki-operator
  namespace: openshift-operators-redhat
spec:
  channel: stable-6.0
  name: loki-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Step 2: Install Cluster Logging Operator

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: cluster-logging
  namespace: openshift-logging
spec:
  channel: stable-6.0
  name: cluster-logging
  source: redhat-operators
  sourceNamespace: openshift-marketplace

Configure LokiStack

LokiStack requires object storage. We'll use AWS S3 as an example:

Step 1: Create Secret (S3 Credentials)

apiVersion: v1
kind: Secret
metadata:
  name: logging-loki-s3
  namespace: openshift-logging
stringData:
  access_key_id: "<AWS_ACCESS_KEY>"
  access_key_secret: "<AWS_SECRET_KEY>"
  bucketnames: "openshift-logging-loki"
  endpoint: "https://s3.ap-northeast-1.amazonaws.com"
  region: "ap-northeast-1"

Step 2: Create LokiStack

apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: logging-loki
  namespace: openshift-logging
spec:
  size: 1x.small
  storage:
    schemas:
    - version: v13
      effectiveDate: "2024-10-01"
    secret:
      name: logging-loki-s3
      type: s3
  storageClassName: gp3-csi
  tenants:
    mode: openshift-logging

Configure ClusterLogForwarder

ClusterLogForwarder defines how logs are collected and forwarded:

apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  serviceAccount:
    name: cluster-logging-operator
  outputs:
  - name: default-lokistack
    type: lokiStack
    lokiStack:
      target:
        name: logging-loki
        namespace: openshift-logging
      authentication:
        token:
          from: serviceAccount
  pipelines:
  - name: default-logstore
    inputRefs:
    - application
    - infrastructure
    outputRefs:
    - default-lokistack

Storage Configuration Recommendations

| Cluster Size | LokiStack Size | Storage Needed/Day | Recommended Retention |
| --- | --- | --- | --- |
| Small (<50 Pods) | 1x.extra-small | ~10 GB | 7 days |
| Medium (50-200 Pods) | 1x.small | ~50 GB | 14 days |
| Large (200+ Pods) | 1x.medium | ~200 GB | 30 days |
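The sizing table above can be sanity-checked with a quick back-of-envelope calculation. Here is a minimal Python sketch; the replication and compression figures are illustrative assumptions, not measured values:

```python
def storage_needed_gb(gb_per_day: float, retention_days: int,
                      replication_factor: int = 1,
                      compression_ratio: float = 1.0) -> float:
    """Estimate the object-storage footprint (GB) for a retention window."""
    return gb_per_day * retention_days * replication_factor / compression_ratio

# Medium cluster from the table: ~50 GB/day ingested, 14-day retention
print(storage_needed_gb(50, 14))  # 700.0

# Same cluster with 3x replication and an assumed ~5x compression ratio
print(storage_needed_gb(50, 14, replication_factor=3, compression_ratio=5))  # 420.0
```

Treat the result as a rough planning figure; actual usage also depends on index size and compaction behavior.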

Loki Integration

LokiStack Architecture Details

LokiStack includes multiple components:

| Component | Function |
| --- | --- |
| Distributor | Receives logs, distributes them to Ingesters |
| Ingester | Temporarily stores logs, writes them to storage |
| Querier | Handles query requests |
| Query Frontend | Query caching and splitting |
| Compactor | Compacts old data |

LogQL Query Syntax

Loki uses the LogQL query language, whose syntax resembles PromQL:

Basic Queries:

# Query logs for specific namespace
{kubernetes_namespace_name="my-app"}

# Query specific Pod
{kubernetes_pod_name="my-pod-abc123"}

# Multiple conditions
{kubernetes_namespace_name="my-app", kubernetes_container_name="web"}

Filter Log Content:

# Logs containing "error"
{kubernetes_namespace_name="my-app"} |= "error"

# Logs not containing "debug"
{kubernetes_namespace_name="my-app"} != "debug"

# Regular expression
{kubernetes_namespace_name="my-app"} |~ "error|exception"

JSON Parsing:

# Parse JSON logs, filter level=error
{kubernetes_namespace_name="my-app"} | json | level="error"

# Extract specific fields
{kubernetes_namespace_name="my-app"} | json | line_format "{{.message}}"

Statistical Queries:

# Error count in past 5 minutes
count_over_time({kubernetes_namespace_name="my-app"} |= "error" [5m])

# Per-second log rate, averaged over the past minute
rate({kubernetes_namespace_name="my-app"} [1m])

Query in Console

OpenShift Console has built-in log query interface:

  1. Go to Observe → Logs
  2. Select log type (Application / Infrastructure / Audit)
  3. Enter LogQL query
  4. View results

You can also query from the CLI:

# Install logcli
# Query logs
logcli query '{kubernetes_namespace_name="my-app"}' --limit=100

Log Types

OpenShift categorizes logs into three major types:

Application Logs

From user-deployed applications (namespaces not prefixed with openshift- or kube-):

  • Container stdout/stderr
  • Log files written by applications

Label example:

{
  log_type: "application",
  kubernetes_namespace_name: "my-app",
  kubernetes_pod_name: "web-abc123",
  kubernetes_container_name: "web"
}

Infrastructure Logs

From OpenShift system components:

  • Pods in openshift-* namespaces
  • Pods in kube-* namespaces
  • Node system logs (journald)

Label example:

{
  log_type: "infrastructure",
  kubernetes_namespace_name: "openshift-apiserver"
}

Audit Logs

System operation audit records:

  • Kubernetes API Audit: Who did what to the API
  • OpenShift API Audit: OAuth, Route operations, etc.
  • OVN Audit: Network operations
  • Node Audit: Linux auditd logs

Audit logs are critical for compliance.


Log Forwarding

Besides storing to LokiStack, logs can be forwarded to external systems.

Forward to Kafka

apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
  - name: kafka-receiver
    type: kafka
    kafka:
      brokers:
      - kafka.example.com:9092
      topic: openshift-logs
  pipelines:
  - name: to-kafka
    inputRefs:
    - application
    outputRefs:
    - kafka-receiver

Forward to Splunk

outputs:
- name: splunk-receiver
  type: splunk
  splunk:
    url: https://splunk.example.com:8088
    token:
      secretName: splunk-token
      key: token

Forward to Cloud Services

AWS CloudWatch:

outputs:
- name: cloudwatch
  type: cloudwatch
  cloudwatch:
    region: ap-northeast-1
    groupBy: namespaceName
    secret:
      name: cloudwatch-credentials

Azure Monitor:

outputs:
- name: azure-monitor
  type: azureMonitor
  azureMonitor:
    customerId: "<workspace-id>"
    logType: openshift
    secret:
      name: azure-monitor-secret
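These outputs can be combined in a single ClusterLogForwarder. The sketch below (reusing the output names from the examples above) sends application logs to both the in-cluster LokiStack and an external Splunk:

```yaml
apiVersion: observability.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  serviceAccount:
    name: cluster-logging-operator
  outputs:
  - name: default-lokistack
    type: lokiStack
    lokiStack:
      target:
        name: logging-loki
        namespace: openshift-logging
      authentication:
        token:
          from: serviceAccount
  - name: splunk-receiver
    type: splunk
    splunk:
      url: https://splunk.example.com:8088
      token:
        secretName: splunk-token
        key: token
  pipelines:
  - name: to-both
    inputRefs:
    - application
    outputRefs:
    - default-lokistack
    - splunk-receiver
```

One pipeline can fan out to several outputs; alternatively, separate pipelines can route different log types to different destinations.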

Log forwarding architecture needs to consider latency, reliability, and cost. Book an architecture consultation and let us help you design the best solution.


Audit Log Management

Kubernetes API Audit

Records all operations to Kubernetes API:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods",
  "verb": "create",
  "user": {
    "username": "admin",
    "groups": ["system:authenticated"]
  },
  "responseStatus": {
    "code": 201
  }
}

Audit Policy

OpenShift's default audit policy balances completeness and storage. To adjust:

apiVersion: config.openshift.io/v1
kind: APIServer
metadata:
  name: cluster
spec:
  audit:
    profile: Default  # or WriteRequestBodies, AllRequestBodies

| Profile | Description | Log Volume |
| --- | --- | --- |
| Default | Records metadata only | Medium |
| WriteRequestBodies | Records request bodies of write operations | Large |
| AllRequestBodies | Records request bodies of all operations | Very Large |

Compliance Requirements

Common compliance standards' log requirements:

| Standard | Requirements |
| --- | --- |
| PCI DSS | Retain at least 1 year, with 3 months immediately accessible |
| SOC 2 | System access logs, change logs |
| HIPAA | Access auditing, integrity protection |
| ISO 27001 | Log protection, regular review |

Set log retention policy:

apiVersion: loki.grafana.com/v1
kind: LokiStack
spec:
  limits:
    global:
      retention:
        days: 30
        streams:
        - selector: '{log_type="audit"}'
          priority: 1
          days: 365  # Audit logs retained 1 year

Best Practices

Log Retention Policy

Tiered Retention:

| Log Type | Recommended Retention | Notes |
| --- | --- | --- |
| Application Logs | 7-30 days | Based on troubleshooting needs |
| Infrastructure Logs | 14-30 days | System troubleshooting |
| Audit Logs | 90-365 days | Compliance requirements |

Resource Planning

Vector (Collector):

  • ~200-500 MB RAM per node
  • CPU depends on log volume

LokiStack:

  • Ingesters need more memory
  • Choose a size appropriate to your log volume

# Production recommendation
spec:
  size: 1x.medium  # or larger
  replication:
    factor: 3  # Replica count

Performance Tuning

1. Use Label Filtering

# Good: Filter by labels first, reduce scan volume
{kubernetes_namespace_name="my-app"} |= "error"

# Bad: Overly broad selector forces Loki to scan all application logs
{log_type="application"} |= "error"

2. Limit Time Range

# Good: Explicit time range
{kubernetes_namespace_name="my-app"} |= "error"
# Select "Past 1 hour" when querying

# Bad: Querying "Past 30 days" scans a massive amount of data

3. Appropriate Log Levels

Applications should:

  • Use INFO or WARN level in production
  • Avoid excessive DEBUG logs
  • Structured logs (JSON) are easier to query
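Structured logs are easier to filter with LogQL's `json` parser. A minimal Python sketch of JSON log output (the field names are illustrative, not a required schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line, ready for `| json` in LogQL."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname.lower(),
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()          # container stdout/stderr
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("my-app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)              # INFO in production; avoid DEBUG noise

logger.info("order created")
# emits: {"level": "info", "logger": "my-app", "message": "order created"}
```

Each line can then be filtered with a query like `{kubernetes_namespace_name="my-app"} | json | level="error"`.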

Security Considerations

1. Access Control

Limit who can see which logs:

# RBAC: Only allow viewing own namespace logs
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: log-viewer
  namespace: my-app
subjects:
- kind: User
  name: developer
roleRef:
  kind: ClusterRole
  name: cluster-logging-application-view

2. Sensitive Data

Avoid logging sensitive information:

  • Passwords, API Keys
  • Personally Identifiable Information (PII)
  • Credit card numbers

You can use a ClusterLogForwarder prune filter to drop sensitive fields before forwarding (note that prune removes whole fields; it does not mask values in place):

spec:
  filters:
  - name: drop-sensitive
    type: prune
    prune:
      in:          # field paths to remove from each record
      - .password
      - .secret
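Masking can also happen application-side, before logs are ever emitted. A minimal Python sketch (the key names are illustrative):

```python
import json

SENSITIVE_KEYS = {"password", "secret", "api_key"}  # illustrative list

def mask_record(record: dict) -> dict:
    """Replace values of sensitive keys with a placeholder, recursively."""
    masked = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_KEYS:
            masked[key] = "***"
        elif isinstance(value, dict):
            masked[key] = mask_record(value)
        else:
            masked[key] = value
    return masked

event = {"user": "alice", "password": "hunter2", "ctx": {"api_key": "abc123"}}
print(json.dumps(mask_record(event)))
# prints: {"user": "alice", "password": "***", "ctx": {"api_key": "***"}}
```

Masking at the source is the safest option: sensitive values never reach the collector, the log store, or any forwarded destination.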

Common Troubleshooting

Missing Logs

Symptom: Some Pod logs don't appear

Troubleshoot:

# Check Vector Pod status
oc get pods -n openshift-logging -l app.kubernetes.io/component=collector

# Check Vector logs
oc logs -n openshift-logging -l app.kubernetes.io/component=collector

# Check ClusterLogForwarder status
oc get clusterlogforwarder -n openshift-logging -o yaml

Common causes:

  • Pod logs are written to a non-standard path instead of stdout/stderr
  • Vector has insufficient resources
  • The target storage has issues

Performance Issues

Symptom: Queries very slow or timeout

Solutions:

  1. Narrow query scope (time, namespace)
  2. Increase LokiStack resources
  3. Check object storage latency

Insufficient Storage Space

Symptom: Logs stop being written

Solutions:

  1. Check storage usage
  2. Adjust retention policy
  3. Expand storage

# Check PVC usage
oc get pvc -n openshift-logging

FAQ

Q1: Why did 6.x switch to Loki instead of Elasticsearch?

Loki's design is a better fit for logs: (1) it indexes only labels, not content, so storage costs are far lower; (2) its architecture is simpler, reducing operational burden; (3) it integrates well with the Prometheus ecosystem. Elasticsearch is powerful but over-engineered for plain log storage, and resource-hungry.

Q2: Can old Elasticsearch logs be migrated to Loki?

Technically possible, but usually not recommended. A more pragmatic approach: (1) keep the old Elasticsearch readable for historical logs; (2) send new logs to LokiStack; (3) let the old logs age out naturally after their retention period.

Q3: Log volume is huge, how to control costs?

A few levers: (1) collect only the logs you need, filtering with ClusterLogForwarder; (2) shorten the retention period; (3) use cheaper object storage (e.g., S3 Standard-IA); (4) increase the compression ratio; (5) avoid DEBUG-level logging in production.

Q4: Can I send to both Loki and Splunk simultaneously?

Yes. ClusterLogForwarder supports multiple outputs, so the same logs can be sent to several destinations at once. Keep an eye on network bandwidth and Splunk licensing costs, though.

Q5: How do I know if logs are being lost?

A few checkpoints: (1) compare the number of running Pods against the number of Pods with logs; (2) monitor Vector's dropped-event metrics; (3) spot-check specific Pods' logs for completeness; (4) alert when log volume drops suddenly.


A Logging System Is Operations' Eyes

A poorly designed one leaves you blind to problems. From log architecture to retention policy, every decision affects how efficiently you can troubleshoot later.

Book an architecture consultation and let us help you optimize your logging architecture.

