KEDA (Kubernetes Event-driven Autoscaling) revolutionizes how we handle application scaling in Kubernetes by enabling event-driven autoscaling based on external metrics. In this comprehensive guide, we’ll explore advanced KEDA implementation on Amazon EKS with real-world examples covering multiple scaling scenarios, enterprise patterns, and production-ready configurations.
Table of Contents
- Why KEDA?
- Architecture Deep Dive
- Prerequisites and Setup
- KEDA Installation on EKS
- Advanced Scaling Use Cases
- Enterprise Patterns
- Performance Optimization
- Security and Compliance
- Monitoring and Observability
- Troubleshooting and Debugging
- Production Deployment
- Cost Optimization Strategies
Why KEDA?
KEDA builds on the built-in Horizontal Pod Autoscaler (HPA) rather than replacing it, and addresses its limitations by providing:
- Event-Driven Scaling: Scale based on external events, not just CPU/memory
- Zero-to-N Scaling: Scale from 0 to N pods based on actual demand
- Rich Metrics Support: 50+ scalers for various data sources
- Cost Optimization: Scale to zero when no events are present
- Production Ready: Used by major enterprises worldwide
- Cloud Agnostic: Works across AWS, Azure, GCP, on-premises, and hybrid environments
- Intelligent Scaling: Advanced algorithms for smooth scaling decisions
KEDA vs Traditional HPA
Feature | KEDA | HPA |
---|---|---|
Scaling Triggers | External events, custom metrics | CPU, memory, custom metrics |
Zero Scaling | ✅ Yes | ❌ No |
Scaler Types | 50+ built-in scalers | Resource metrics (custom/external metrics require separate adapters) |
Cost Optimization | Excellent | Limited |
Event Sources | Queues, databases, APIs | Resource utilization |
Learning Curve | Moderate | Easy |
Architecture Deep Dive
KEDA Components
KEDA consists of several key components working together:
graph TB
subgraph "KEDA Architecture"
A[KEDA Operator] --> B[Metrics Server]
A --> C[Webhooks]
A --> D[ScaledObject Controller]
A --> E[ScaledJob Controller]
F[External Data Sources] --> A
G[Kubernetes API] --> A
H[Prometheus] --> A
I[Cloud APIs] --> A
A --> J[Deployment/StatefulSet]
A --> K[Job/CronJob]
L[Monitoring Stack] --> B
M[Grafana] --> L
N[AlertManager] --> L
end
Core Components Explained
1. KEDA Operator
- Purpose: Main controller managing ScaledObjects and ScaledJobs
- Responsibilities:
- Monitoring external metrics
- Making scaling decisions
- Updating Kubernetes resources
- Managing authentication
2. Metrics Server
- Purpose: Exposes external metrics to Kubernetes HPA
- Function: Translates KEDA metrics to HPA-compatible format
- Integration: Works with Kubernetes metrics API
3. Webhooks
- Purpose: Validates ScaledObject and ScaledJob configurations
- Function: Ensures proper configuration before scaling decisions
- Security: Prevents invalid scaling configurations
4. ScaledObject Controller
- Purpose: Manages scaling of Deployments and StatefulSets
- Features:
- Zero-to-N scaling
- Multiple trigger support
- Advanced scaling algorithms
5. ScaledJob Controller
- Purpose: Creates and manages Kubernetes Jobs in response to events (see the sketch after this list)
- Features:
- Job-based scaling
- Batch processing optimization
- Resource cleanup
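Every example later in this guide uses ScaledObject, so here is a minimal ScaledJob sketch for contrast. It is illustrative only: the queue URL, the worker-image:latest image, and the keda-aws-credentials TriggerAuthentication are assumptions borrowed from the SQS examples further down.
# Minimal ScaledJob sketch: spawn Jobs to drain a queue (illustrative values)
kubectl apply -f - << 'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: batch-processor
  namespace: default
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: worker
          image: worker-image:latest   # assumed image
        restartPolicy: Never
  pollingInterval: 30
  maxReplicaCount: 10
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/batch-queue
      queueLength: "5"
      awsRegion: us-west-2
    authenticationRef:
      name: keda-aws-credentials
EOF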
Scaling Decision Flow
sequenceDiagram
participant KEDA as KEDA Operator
participant MS as Metrics Server
participant HPA as HPA Controller
participant K8s as Kubernetes API
participant App as Application Pods
KEDA->>MS: Poll external metrics
MS-->>KEDA: Return metric values
KEDA->>KEDA: Calculate desired replicas
KEDA->>HPA: Update HPA with metrics
HPA->>K8s: Scale deployment
K8s->>App: Create/destroy pods
App-->>KEDA: Report processing status
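You can observe this flow on a live cluster: KEDA creates and owns an HPA for each ScaledObject (named keda-hpa-<scaledobject-name>), and that HPA reads the external metrics API registered by KEDA's metrics server. A quick sketch, assuming the sqs-basic-scaler ScaledObject from later in this guide exists:
# Inspect the HPA that KEDA manages for a ScaledObject
kubectl get hpa -n default
kubectl describe hpa keda-hpa-sqs-basic-scaler -n default

# Confirm the external metrics API is served by KEDA
kubectl get apiservice v1beta1.external.metrics.k8s.io
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"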
Prerequisites and Setup
System Requirements
Before we begin, ensure you have:
- EKS Cluster: Version 1.21+ (recommended 1.25+)
- Node Groups: At least 2 nodes with 4+ vCPUs and 8GB+ RAM
- kubectl: Version 1.21+ configured
- AWS CLI: Version 2.x configured
- Helm: Version 3.x installed
- Terraform: Optional, for infrastructure management
- Go: Version 1.19+ (for custom scalers)
EKS Cluster Configuration
# Create EKS cluster with proper configuration
eksctl create cluster \
--name keda-demo-cluster \
--version 1.25 \
--region us-west-2 \
--nodegroup-name workers \
--node-type m5.large \
--nodes 3 \
--nodes-min 2 \
--nodes-max 10 \
--managed \
--with-oidc \
--ssh-access \
--ssh-public-key ~/.ssh/id_rsa.pub \
--enable-ssm
# Verify cluster status
kubectl get nodes
kubectl get pods -A
Required IAM Permissions
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"sqs:GetQueueAttributes",
"sqs:ListQueues",
"s3:GetObject",
"s3:ListBucket",
"rds:DescribeDBInstances",
"elasticache:DescribeCacheClusters",
"kafka:ListClusters",
"kafka:DescribeCluster"
],
"Resource": "*"
}
]
}
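Rather than relying on broad AWS-managed policies, you can register this document as a customer-managed policy and attach it to the role KEDA assumes. A sketch (the policy name, file name, and account ID are placeholders; keda-operator-role is created later in this guide):
# Save the JSON above as keda-scaler-policy.json, then:
aws iam create-policy \
  --policy-name keda-scaler-policy \
  --policy-document file://keda-scaler-policy.json

# Attach it to the KEDA operator role once the role exists
aws iam attach-role-policy \
  --role-name keda-operator-role \
  --policy-arn arn:aws:iam::YOUR_ACCOUNT_ID:policy/keda-scaler-policy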
Environment Setup Script
#!/bin/bash
# setup-keda-environment.sh
set -e
# Variables
CLUSTER_NAME="keda-demo-cluster"
REGION="us-west-2"
NAMESPACE="keda"
echo "Setting up KEDA environment..."
# Install required tools
echo "Installing required tools..."
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Install eksctl
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin
# Verify installations
kubectl version --client
helm version
eksctl version
echo "Environment setup complete!"
Network Configuration
# vpc-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: keda-network-config
namespace: keda
data:
# Enable pod-to-pod communication
pod-to-pod: "enabled"
# Configure service mesh if using Istio
service-mesh: "disabled"
# Network policies
network-policies: "enabled"
Storage Requirements
# storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: keda-storage
provisioner: ebs.csi.aws.com
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
KEDA Installation on EKS
Installation Methods
KEDA can be installed using multiple methods:
- Helm (Recommended): Easy management and upgrades
- YAML Manifests: Direct Kubernetes resources
- OperatorHub (OLM): Using the KEDA Operator via the Operator Lifecycle Manager
- Terraform: Infrastructure as Code
Method 1: Helm Installation (Production Ready)
Basic Installation
# Add KEDA Helm repository
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
# Create namespace for KEDA
kubectl create namespace keda
# Install KEDA with basic configuration
helm install keda kedacore/keda \
--namespace keda \
--version 2.12.0 \
--set image.keda.tag=2.12.0 \
--set image.metricsApiServer.tag=2.12.0 \
  --set image.webhooks.tag=2.12.0
# Verify installation
kubectl get pods -n keda
kubectl get crd | grep keda
Advanced Production Installation
# Create values file for production
cat > keda-values.yaml << EOF
# KEDA Production Configuration
operator:
image:
repository: ghcr.io/kedacore/keda
tag: "2.12.0"
pullPolicy: IfNotPresent
# Resource limits
resources:
limits:
cpu: 1000m
memory: 1000Mi
requests:
cpu: 100m
memory: 100Mi
# Security context
securityContext:
runAsNonRoot: true
runAsUser: 1000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
# Metrics API Server configuration
metricsApiServer:
image:
repository: ghcr.io/kedacore/keda-metrics-apiserver
tag: "2.12.0"
pullPolicy: IfNotPresent
# Resource limits
resources:
limits:
cpu: 1000m
memory: 1000Mi
requests:
cpu: 100m
memory: 100Mi
# Security context
securityContext:
runAsNonRoot: true
runAsUser: 1000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
# Webhooks configuration
webhooks:
image:
repository: ghcr.io/kedacore/keda-admission-webhooks
tag: "2.12.0"
pullPolicy: IfNotPresent
# Resource limits
resources:
limits:
cpu: 500m
memory: 500Mi
requests:
cpu: 100m
memory: 100Mi
# Service configuration
service:
type: ClusterIP
port: 80
targetPort: 8080
# Monitoring configuration
prometheus:
metricServer:
enabled: true
port: 8080
path: /metrics
operator:
enabled: true
port: 8080
path: /metrics
# Logging configuration
logging:
operator:
level: info
format: json
metricServer:
level: info
format: json
# Feature flags
features:
- "advanced-scaling"
- "multi-trigger"
- "fallback-scaling"
# Node selection
nodeSelector: {}
tolerations: []
affinity: {}
# Pod disruption budget
podDisruptionBudget:
enabled: true
minAvailable: 1
EOF
# Install with production configuration
helm install keda kedacore/keda \
--namespace keda \
--values keda-values.yaml \
--create-namespace \
--wait \
--timeout=10m
High Availability Installation
# HA KEDA configuration
cat > keda-ha-values.yaml << EOF
# High Availability Configuration
operator:
replicaCount: 3
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 200m
memory: 200Mi
metricsApiServer:
replicaCount: 3
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 200m
memory: 200Mi
# Pod Anti-Affinity
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app.kubernetes.io/name
operator: In
values:
- keda-operator
topologyKey: kubernetes.io/hostname
# Pod Disruption Budget
podDisruptionBudget:
enabled: true
minAvailable: 2
# Horizontal Pod Autoscaler
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
EOF
# Install HA KEDA
helm install keda kedacore/keda \
--namespace keda \
--values keda-ha-values.yaml \
--create-namespace \
--wait \
--timeout=15m
Method 2: YAML Manifests Installation
# Download KEDA manifests
curl -L https://github.com/kedacore/keda/releases/download/v2.12.0/keda-2.12.0.yaml -o keda-manifests.yaml
# Apply manifests
kubectl apply -f keda-manifests.yaml
# Verify installation
kubectl get pods -n keda
Method 3: Terraform Installation
# keda.tf
resource "helm_release" "keda" {
name = "keda"
repository = "https://kedacore.github.io/charts"
chart = "keda"
version = "2.12.0"
namespace = "keda"
create_namespace = true
values = [
file("${path.module}/keda-values.yaml")
]
depends_on = [
aws_eks_cluster.main,
aws_eks_node_group.workers
]
}
# IAM role for KEDA
resource "aws_iam_role" "keda_role" {
name = "keda-operator-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRoleWithWebIdentity"
Effect = "Allow"
Principal = {
Federated = aws_iam_openid_connect_provider.eks.arn
}
Condition = {
StringEquals = {
"${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub" = "system:serviceaccount:keda:keda-operator"
"${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:aud" = "sts.amazonaws.com"
}
}
}
]
})
}
# Attach policies
resource "aws_iam_role_policy_attachment" "keda_sqs" {
role = aws_iam_role.keda_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSQSReadOnlyAccess"
}
resource "aws_iam_role_policy_attachment" "keda_s3" {
role = aws_iam_role.keda_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}
Installation Verification
#!/bin/bash
# verify-keda-installation.sh
echo "Verifying KEDA installation..."
# Check KEDA pods
echo "Checking KEDA pods..."
kubectl get pods -n keda
# Check CRDs
echo "Checking KEDA CRDs..."
kubectl get crd | grep keda
# Check services
echo "Checking KEDA services..."
kubectl get svc -n keda
# Check metrics API
echo "Checking metrics API..."
kubectl get apiservice v1beta1.external.metrics.k8s.io
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
# Test scaling
echo "Testing scaling functionality..."
kubectl apply -f - << EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: test-scaler
namespace: default
spec:
scaleTargetRef:
name: test-deployment
  minReplicaCount: 1  # CPU/memory-only triggers cannot scale to zero
maxReplicaCount: 10
triggers:
- type: cpu
metadata:
type: Utilization
value: "50"
EOF
echo "KEDA installation verified successfully!"
Configure IAM Roles for Service Accounts (IRSA)
# Create IAM role for KEDA
cat > keda-trust-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::YOUR_ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/YOUR_CLUSTER_ID"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"oidc.eks.REGION.amazonaws.com/id/YOUR_CLUSTER_ID:sub": "system:serviceaccount:keda:keda-operator",
"oidc.eks.REGION.amazonaws.com/id/YOUR_CLUSTER_ID:aud": "sts.amazonaws.com"
}
}
}
]
}
EOF
# Create IAM role
aws iam create-role \
--role-name keda-operator-role \
--assume-role-policy-document file://keda-trust-policy.json
# Attach necessary policies
aws iam attach-role-policy \
--role-name keda-operator-role \
--policy-arn arn:aws:iam::aws:policy/AmazonSQSReadOnlyAccess
aws iam attach-role-policy \
--role-name keda-operator-role \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
# Annotate service account
kubectl annotate serviceaccount keda-operator \
-n keda \
eks.amazonaws.com/role-arn=arn:aws:iam::YOUR_ACCOUNT_ID:role/keda-operator-role
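After annotating the service account, restart the operator so new pods pick up the IRSA web identity token, then confirm the annotation and the injected AWS variables (the keda-operator names match the default Helm install):
# Restart KEDA so new pods mount the IRSA web identity token
kubectl rollout restart deployment keda-operator -n keda
kubectl rollout status deployment keda-operator -n keda

# Verify the annotation and the injected AWS_* variables
kubectl describe serviceaccount keda-operator -n keda | grep role-arn
kubectl describe pod -n keda -l app=keda-operator | grep AWS_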
Advanced Scaling Use Cases
1. Amazon SQS Queue Scaling
Basic SQS Scaling
# sqs-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: sqs-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: worker-app
minReplicaCount: 0
maxReplicaCount: 10
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/my-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: keda-aws-credentials
namespace: default
spec:
podIdentity:
provider: aws-eks
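To see the scaler react, push a batch of messages into the queue and watch the worker Deployment scale out from zero (the queue URL and Deployment name match the example above):
# Send test messages to the queue
for i in $(seq 1 20); do
  aws sqs send-message \
    --queue-url https://sqs.us-west-2.amazonaws.com/123456789012/my-queue \
    --message-body "test-message-$i"
done

# Watch KEDA scale the worker Deployment (0 -> N)
kubectl get deployment worker-app -w

# Inspect scaling decisions and trigger status
kubectl describe scaledobject sqs-basic-scaler -n default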
Advanced SQS Scaling with Multiple Queues
# sqs-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: sqs-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-queue-processor
minReplicaCount: 1
maxReplicaCount: 50
pollingInterval: 15
cooldownPeriod: 300
idleReplicaCount: 0
fallback:
failureThreshold: 3
replicas: 2
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/high-priority-queue
queueLength: "2"
awsRegion: us-west-2
identityOwner: operator
scaleOnInFlight: "false"
activationQueueLength: "1"
authenticationRef:
name: keda-aws-credentials
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/normal-priority-queue
queueLength: "10"
awsRegion: us-west-2
identityOwner: operator
scaleOnInFlight: "false"
activationQueueLength: "5"
authenticationRef:
name: keda-aws-credentials
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/batch-queue
queueLength: "20"
awsRegion: us-west-2
identityOwner: operator
scaleOnInFlight: "false"
activationQueueLength: "10"
authenticationRef:
name: keda-aws-credentials
SQS FIFO Queue Scaling
# sqs-fifo-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: sqs-fifo-scaler
namespace: default
spec:
scaleTargetRef:
name: fifo-processor
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/orders.fifo
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
scaleOnInFlight: "true"
maxInFlight: "10"
authenticationRef:
name: keda-aws-credentials
2. Amazon SNS Topic Scaling
KEDA has no dedicated SNS scaler, so the standard pattern is SNS-to-SQS fan-out: subscribe an SQS queue to the topic and scale the subscriber Deployment on that queue's depth with the aws-sqs-queue scaler.
# sns-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sns-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: sns-subscriber
  minReplicaCount: 0
  maxReplicaCount: 15
  pollingInterval: 30
  cooldownPeriod: 300
  triggers:
  - type: aws-sqs-queue
    metadata:
      # SQS queue subscribed to arn:aws:sns:us-west-2:123456789012:my-topic
      queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/my-topic-subscriber-queue
      queueLength: "5"
      awsRegion: us-west-2
      identityOwner: operator
    authenticationRef:
      name: keda-aws-credentials
3. Amazon Kinesis Stream Scaling
# kinesis-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: kinesis-scaler
namespace: default
spec:
scaleTargetRef:
name: kinesis-consumer
minReplicaCount: 1
maxReplicaCount: 25
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-kinesis-stream
metadata:
streamName: my-kinesis-stream
shardCount: "2"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
4. Amazon DynamoDB Scaling
# dynamodb-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: dynamodb-scaler
namespace: default
spec:
scaleTargetRef:
name: dynamodb-processor
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-dynamodb
metadata:
tableName: my-table
      keyConditionExpression: "#pk = :pk"
      expressionAttributeNames: '{ "#pk": "partition_key" }'
      expressionAttributeValues: '{ ":pk": { "S": "my-partition-key" } }'
targetValue: "100"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
5. Amazon CloudWatch Metrics Scaling
# cloudwatch-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cloudwatch-scaler
namespace: default
spec:
scaleTargetRef:
name: cloudwatch-processor
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-cloudwatch
metadata:
namespace: AWS/ApplicationELB
metricName: RequestCount
      dimensionName: LoadBalancer
      dimensionValue: app/my-load-balancer/50dc6c495c0c9188
      targetMetricValue: "100"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
6. Redis Stream Scaling
Basic Redis Stream Scaling
# redis-stream-basic.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: redis-stream-basic
namespace: default
spec:
scaleTargetRef:
name: stream-processor
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 15
cooldownPeriod: 60
triggers:
- type: redis-streams
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
stream: my-stream
consumerGroup: my-consumer-group
streamLength: "10"
enableTLS: "false"
authenticationRef:
name: redis-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: redis-auth
namespace: default
spec:
secretTargetRef:
- parameter: password
name: redis-secret
key: password
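The redis-auth TriggerAuthentication expects a Secret named redis-secret. A quick sketch to create it and push test entries onto the stream (the password value and the redis:7 client image are placeholders):
# Create the Secret referenced by the TriggerAuthentication
kubectl create secret generic redis-secret \
  --from-literal=password='<your-redis-password>'

# Push a test entry onto the stream with a throwaway redis-cli pod
kubectl run redis-cli --rm -it --image=redis:7 --restart=Never -- \
  redis-cli -h redis-cluster.redis.svc.cluster.local -a '<your-redis-password>' \
  XADD my-stream '*' payload test-1

# Watch the consumer scale out
kubectl get deployment stream-processor -w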
Advanced Redis Stream Scaling with Multiple Streams
# redis-stream-advanced.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: redis-stream-advanced
namespace: default
spec:
scaleTargetRef:
name: multi-stream-processor
minReplicaCount: 1
maxReplicaCount: 50
pollingInterval: 10
cooldownPeriod: 60
triggers:
- type: redis-streams
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
stream: high-priority-stream
consumerGroup: high-priority-group
streamLength: "5"
enableTLS: "false"
authenticationRef:
name: redis-auth
- type: redis-streams
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
stream: normal-priority-stream
consumerGroup: normal-priority-group
streamLength: "20"
enableTLS: "false"
authenticationRef:
name: redis-auth
- type: redis-streams
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
stream: batch-stream
consumerGroup: batch-group
streamLength: "50"
enableTLS: "false"
authenticationRef:
name: redis-auth
7. Redis List Scaling
# redis-list-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: redis-list-scaler
namespace: default
spec:
scaleTargetRef:
name: list-processor
minReplicaCount: 0
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 60
triggers:
- type: redis
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
listName: my-list
listLength: "10"
enableTLS: "false"
authenticationRef:
name: redis-auth
8. PostgreSQL Scaling
Basic PostgreSQL Scaling
# postgres-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: postgres-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: data-processor
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 120
triggers:
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/mydb
query: "SELECT COUNT(*) FROM pending_jobs WHERE status = 'pending'"
targetQueryValue: "5"
activationTargetQueryValue: "1"
authenticationRef:
name: postgres-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: postgres-auth
namespace: default
spec:
secretTargetRef:
- parameter: password
name: postgres-secret
key: password
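The postgres-auth TriggerAuthentication references a Secret named postgres-secret. The sketch below creates it and inserts rows matching the query above so you can watch the scaler react (credentials, the postgres:15 client image, and the pending_jobs schema are assumptions taken from the example):
# Create the Secret referenced by the TriggerAuthentication
kubectl create secret generic postgres-secret \
  --from-literal=password='<your-db-password>'

# Insert rows that match the scaler query, using a throwaway psql pod
kubectl run psql-client --rm -it --image=postgres:15 --restart=Never -- \
  psql "postgresql://user:<your-db-password>@postgres.default.svc.cluster.local:5432/mydb" \
  -c "INSERT INTO pending_jobs (status) SELECT 'pending' FROM generate_series(1, 20);"

# Watch the data-processor Deployment scale
kubectl get deployment data-processor -w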
Advanced PostgreSQL Scaling with Multiple Queries
# postgres-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: postgres-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-query-processor
minReplicaCount: 1
maxReplicaCount: 30
pollingInterval: 15
cooldownPeriod: 120
triggers:
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/mydb
query: "SELECT COUNT(*) FROM urgent_jobs WHERE status = 'pending' AND priority = 'high'"
targetQueryValue: "2"
activationTargetQueryValue: "1"
authenticationRef:
name: postgres-auth
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/mydb
query: "SELECT COUNT(*) FROM normal_jobs WHERE status = 'pending' AND priority = 'normal'"
targetQueryValue: "10"
activationTargetQueryValue: "5"
authenticationRef:
name: postgres-auth
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/mydb
query: "SELECT COUNT(*) FROM batch_jobs WHERE status = 'pending' AND priority = 'low'"
targetQueryValue: "25"
activationTargetQueryValue: "10"
authenticationRef:
name: postgres-auth
9. MySQL Scaling
# mysql-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: mysql-scaler
namespace: default
spec:
scaleTargetRef:
name: mysql-processor
minReplicaCount: 1
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 120
triggers:
- type: mysql
metadata:
connection: mysql://user:password@mysql.default.svc.cluster.local:3306/mydb
query: "SELECT COUNT(*) FROM pending_tasks WHERE status = 'pending'"
targetQueryValue: "5"
activationTargetQueryValue: "1"
authenticationRef:
name: mysql-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: mysql-auth
namespace: default
spec:
secretTargetRef:
- parameter: password
name: mysql-secret
key: password
10. MongoDB Scaling
# mongodb-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: mongodb-scaler
namespace: default
spec:
scaleTargetRef:
name: mongodb-processor
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 120
triggers:
- type: mongodb
metadata:
connectionString: mongodb://user:password@mongodb.default.svc.cluster.local:27017/mydb
database: mydb
collection: pending_jobs
query: '{"status": "pending"}'
targetQueryValue: "10"
activationTargetQueryValue: "5"
authenticationRef:
name: mongodb-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: mongodb-auth
namespace: default
spec:
secretTargetRef:
- parameter: password
name: mongodb-secret
key: password
11. Kafka Topic Scaling
Basic Kafka Scaling
# kafka-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: kafka-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: kafka-consumer
minReplicaCount: 0
maxReplicaCount: 25
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: my-consumer-group
topic: my-topic
lagThreshold: "10"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: kafka-auth
namespace: default
spec:
secretTargetRef:
- parameter: password
name: kafka-secret
key: password
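To exercise the scaler, produce a burst of messages to the topic and check the consumer-group lag that drives scaling (the bitnami/kafka client image and a plaintext listener are assumptions; adjust for your cluster's security settings):
# Produce test messages with a throwaway client pod
kubectl run kafka-producer --rm -it --image=bitnami/kafka:latest --restart=Never -- \
  bash -c 'for i in $(seq 1 100); do echo "msg-$i"; done | \
    kafka-console-producer.sh \
      --bootstrap-server kafka-cluster.kafka.svc.cluster.local:9092 \
      --topic my-topic'

# Check consumer-group lag, which is what the lagThreshold compares against
kubectl run kafka-lag --rm -it --image=bitnami/kafka:latest --restart=Never -- \
  kafka-consumer-groups.sh \
    --bootstrap-server kafka-cluster.kafka.svc.cluster.local:9092 \
    --describe --group my-consumer-group

# Watch KEDA scale the consumer
kubectl get deployment kafka-consumer -w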
Advanced Kafka Scaling with Multiple Topics
# kafka-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: kafka-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-topic-consumer
minReplicaCount: 1
maxReplicaCount: 50
pollingInterval: 15
cooldownPeriod: 300
triggers:
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: high-priority-group
topic: high-priority-topic
lagThreshold: "5"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: normal-priority-group
topic: normal-priority-topic
lagThreshold: "20"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: batch-group
topic: batch-topic
lagThreshold: "50"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
12. Apache Pulsar Scaling
# pulsar-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: pulsar-scaler
namespace: default
spec:
scaleTargetRef:
name: pulsar-consumer
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: apache-pulsar
metadata:
broker: pulsar://pulsar-broker.pulsar.svc.cluster.local:6650
topic: persistent://public/default/my-topic
subscription: my-subscription
subscriptionType: Shared
targetMessageCount: "10"
authenticationRef:
name: pulsar-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: pulsar-auth
namespace: default
spec:
secretTargetRef:
- parameter: token
name: pulsar-secret
key: token
13. Prometheus Metrics Scaling
Basic Prometheus Scaling
# prometheus-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: prometheus-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: custom-app
minReplicaCount: 1
maxReplicaCount: 10
pollingInterval: 30
cooldownPeriod: 60
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: custom_processing_queue_size
threshold: '10'
query: sum(rate(custom_processing_queue_size[1m]))
authenticationRef:
name: prometheus-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: prometheus-auth
namespace: default
spec:
secretTargetRef:
- parameter: username
name: prometheus-secret
key: username
- parameter: password
name: prometheus-secret
key: password
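Before wiring a PromQL expression into a trigger, it helps to confirm it returns a value directly from Prometheus. A sketch using the HTTP query API with the same server address and query as the trigger above (the curlimages/curl image is a placeholder):
# Sanity-check the PromQL the trigger will evaluate
kubectl run promql-check --rm -it --image=curlimages/curl --restart=Never --command -- \
  curl -sG http://prometheus.monitoring.svc.cluster.local:9090/api/v1/query \
    --data-urlencode 'query=sum(rate(custom_processing_queue_size[1m]))'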
Advanced Prometheus Scaling with Multiple Metrics
# prometheus-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: prometheus-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-metric-app
minReplicaCount: 1
maxReplicaCount: 30
pollingInterval: 15
cooldownPeriod: 60
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: high_priority_queue_size
threshold: '5'
query: sum(rate(high_priority_queue_size[1m]))
authenticationRef:
name: prometheus-auth
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: normal_priority_queue_size
threshold: '20'
query: sum(rate(normal_priority_queue_size[1m]))
authenticationRef:
name: prometheus-auth
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: batch_queue_size
threshold: '50'
query: sum(rate(batch_queue_size[1m]))
authenticationRef:
name: prometheus-auth
14. Azure Service Bus Scaling
Basic Azure Service Bus Scaling
# azure-servicebus-basic.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: azure-servicebus-basic
namespace: default
spec:
scaleTargetRef:
name: azure-worker
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: azure-servicebus
metadata:
connectionFromEnv: AZURE_SERVICEBUS_CONNECTION_STRING
queueName: my-queue
messageCount: "5"
authenticationRef:
name: azure-servicebus-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: azure-servicebus-auth
namespace: default
spec:
secretTargetRef:
- parameter: connection
name: azure-servicebus-secret
key: connection-string
Advanced Azure Service Bus Scaling with Topics
# azure-servicebus-advanced.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: azure-servicebus-advanced
namespace: default
spec:
scaleTargetRef:
name: azure-topic-worker
minReplicaCount: 1
maxReplicaCount: 30
pollingInterval: 15
cooldownPeriod: 300
triggers:
- type: azure-servicebus
metadata:
connectionFromEnv: AZURE_SERVICEBUS_CONNECTION_STRING
topicName: high-priority-topic
subscriptionName: high-priority-subscription
messageCount: "5"
authenticationRef:
name: azure-servicebus-auth
- type: azure-servicebus
metadata:
connectionFromEnv: AZURE_SERVICEBUS_CONNECTION_STRING
topicName: normal-priority-topic
subscriptionName: normal-priority-subscription
messageCount: "20"
authenticationRef:
name: azure-servicebus-auth
- type: azure-servicebus
metadata:
connectionFromEnv: AZURE_SERVICEBUS_CONNECTION_STRING
topicName: batch-topic
subscriptionName: batch-subscription
messageCount: "50"
authenticationRef:
name: azure-servicebus-auth
15. Azure Event Hubs Scaling
# azure-eventhubs-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: azure-eventhubs-scaler
namespace: default
spec:
scaleTargetRef:
name: eventhubs-consumer
minReplicaCount: 0
maxReplicaCount: 25
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: azure-eventhubs
metadata:
connectionFromEnv: AZURE_EVENTHUB_CONNECTION_STRING
eventHubName: my-eventhub
consumerGroup: my-consumer-group
messageCount: "10"
authenticationRef:
name: azure-eventhubs-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: azure-eventhubs-auth
namespace: default
spec:
secretTargetRef:
- parameter: connection
name: azure-eventhubs-secret
key: connection-string
16. Azure Storage Queue Scaling
# azure-storage-queue-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: azure-storage-queue-scaler
namespace: default
spec:
scaleTargetRef:
name: storage-queue-worker
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: azure-queue
metadata:
connectionFromEnv: AZURE_STORAGE_CONNECTION_STRING
queueName: my-queue
messageCount: "10"
authenticationRef:
name: azure-storage-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: azure-storage-auth
namespace: default
spec:
secretTargetRef:
- parameter: connection
name: azure-storage-secret
key: connection-string
17. Google Cloud Pub/Sub Scaling
# gcp-pubsub-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: gcp-pubsub-scaler
namespace: default
spec:
scaleTargetRef:
name: pubsub-subscriber
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: gcp-pubsub
metadata:
subscriptionName: my-subscription
mode: subscription
value: "10"
authenticationRef:
name: gcp-pubsub-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: gcp-pubsub-auth
namespace: default
spec:
secretTargetRef:
- parameter: credentials
name: gcp-pubsub-secret
key: credentials.json
18. Google Cloud Storage Scaling
# gcp-storage-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: gcp-storage-scaler
namespace: default
spec:
scaleTargetRef:
name: storage-processor
minReplicaCount: 0
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: gcp-storage
metadata:
bucketName: my-bucket
targetObjectCount: "10"
authenticationRef:
name: gcp-storage-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: gcp-storage-auth
namespace: default
spec:
secretTargetRef:
- parameter: credentials
name: gcp-storage-secret
key: credentials.json
19. Cron-based Scaling
Basic Cron Scaling
# cron-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cron-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: scheduled-job
minReplicaCount: 0
maxReplicaCount: 5
pollingInterval: 30
cooldownPeriod: 60
triggers:
- type: cron
metadata:
timezone: UTC
start: "0 9 * * 1-5" # 9 AM weekdays
end: "0 17 * * 1-5" # 5 PM weekdays
desiredReplicas: "3"
Advanced Cron Scaling with Multiple Schedules
# cron-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cron-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-schedule-job
minReplicaCount: 0
maxReplicaCount: 10
pollingInterval: 30
cooldownPeriod: 60
triggers:
- type: cron
metadata:
timezone: UTC
start: "0 9 * * 1-5" # 9 AM weekdays
end: "0 17 * * 1-5" # 5 PM weekdays
desiredReplicas: "5"
- type: cron
metadata:
timezone: UTC
start: "0 18 * * 1-5" # 6 PM weekdays
end: "0 22 * * 1-5" # 10 PM weekdays
desiredReplicas: "3"
- type: cron
metadata:
timezone: UTC
start: "0 10 * * 6,7" # 10 AM weekends
end: "0 16 * * 6,7" # 4 PM weekends
desiredReplicas: "2"
20. External Scaler (Custom Metrics)
Basic External Scaler
# external-basic-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: external-basic-scaler
namespace: default
spec:
scaleTargetRef:
name: external-api-consumer
minReplicaCount: 0
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 120
triggers:
- type: external
metadata:
scalerAddress: external-scaler-service.default.svc.cluster.local:8080
metricName: custom_metric
threshold: "10"
Advanced External Scaler with Multiple Metrics
# external-advanced-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: external-advanced-scaler
namespace: default
spec:
scaleTargetRef:
name: multi-metric-consumer
minReplicaCount: 1
maxReplicaCount: 30
pollingInterval: 15
cooldownPeriod: 120
triggers:
- type: external
metadata:
scalerAddress: external-scaler-service.default.svc.cluster.local:8080
metricName: high_priority_metric
threshold: "5"
- type: external
metadata:
scalerAddress: external-scaler-service.default.svc.cluster.local:8080
metricName: normal_priority_metric
threshold: "20"
- type: external
metadata:
scalerAddress: external-scaler-service.default.svc.cluster.local:8080
metricName: batch_metric
threshold: "50"
21. InfluxDB Scaling
# influxdb-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: influxdb-scaler
namespace: default
spec:
scaleTargetRef:
name: influxdb-processor
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 120
triggers:
- type: influxdb
metadata:
serverURL: http://influxdb.influxdb.svc.cluster.local:8086
organizationName: my-org
bucketName: my-bucket
query: 'from(bucket: "my-bucket") |> range(start: -1h) |> filter(fn: (r) => r._measurement == "pending_jobs") |> count()'
threshold: "10"
authenticationRef:
name: influxdb-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: influxdb-auth
namespace: default
spec:
secretTargetRef:
- parameter: token
name: influxdb-secret
key: token
22. Apache Kafka Streams Scaling
# kafka-streams-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: kafka-streams-scaler
namespace: default
spec:
scaleTargetRef:
name: kafka-streams-app
minReplicaCount: 1
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: kafka-streams-group
topic: input-topic
lagThreshold: "10"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
23. RabbitMQ Scaling
# rabbitmq-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: rabbitmq-scaler
namespace: default
spec:
scaleTargetRef:
name: rabbitmq-consumer
minReplicaCount: 0
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: rabbitmq
metadata:
queueName: my-queue
host: amqp://rabbitmq.rabbitmq.svc.cluster.local:5672
queueLength: "10"
authenticationRef:
name: rabbitmq-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: rabbitmq-auth
namespace: default
spec:
secretTargetRef:
- parameter: username
name: rabbitmq-secret
key: username
- parameter: password
name: rabbitmq-secret
key: password
24. Apache ActiveMQ Artemis Scaling
# activemq-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: activemq-scaler
namespace: default
spec:
scaleTargetRef:
name: activemq-consumer
minReplicaCount: 0
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: artemis-queue
metadata:
managementEndpoint: http://activemq.activemq.svc.cluster.local:8161
queueName: my-queue
queueLength: "10"
authenticationRef:
name: activemq-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: activemq-auth
namespace: default
spec:
secretTargetRef:
- parameter: username
name: activemq-secret
key: username
- parameter: password
name: activemq-secret
key: password
Enterprise Patterns
1. Multi-Tenant Scaling Architecture
# multi-tenant-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: multi-tenant-scaler
namespace: tenant-a
labels:
tenant: tenant-a
environment: production
spec:
scaleTargetRef:
name: tenant-a-processor
minReplicaCount: 1
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/tenant-a-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: tenant-a-aws-credentials
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: multi-tenant-scaler
namespace: tenant-b
labels:
tenant: tenant-b
environment: production
spec:
scaleTargetRef:
name: tenant-b-processor
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/tenant-b-queue
queueLength: "10"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: tenant-b-aws-credentials
2. Circuit Breaker Pattern with KEDA
# circuit-breaker-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: circuit-breaker-scaler
namespace: default
spec:
scaleTargetRef:
name: resilient-processor
minReplicaCount: 1
maxReplicaCount: 10
pollingInterval: 30
cooldownPeriod: 300
fallback:
failureThreshold: 5
replicas: 1
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: circuit_breaker_state
threshold: '1'
query: sum(rate(circuit_breaker_state[1m]))
authenticationRef:
name: prometheus-auth
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/fallback-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
3. Blue-Green Deployment with KEDA
# blue-green-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: blue-green-scaler
namespace: default
spec:
scaleTargetRef:
name: blue-green-app
minReplicaCount: 2
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: blue_green_health
threshold: '0.8'
query: sum(rate(blue_green_health[1m]))
authenticationRef:
name: prometheus-auth
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: blue-green-app
namespace: default
spec:
replicas: 2
strategy:
blueGreen:
activeService: blue-green-app-active
previewService: blue-green-app-preview
autoPromotionEnabled: false
scaleDownDelaySeconds: 30
prePromotionAnalysis:
templates:
- templateName: success-rate
args:
- name: service-name
value: blue-green-app-preview
postPromotionAnalysis:
templates:
- templateName: success-rate
args:
- name: service-name
value: blue-green-app-active
selector:
matchLabels:
app: blue-green-app
template:
metadata:
labels:
app: blue-green-app
spec:
containers:
- name: app
image: myapp:latest
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
4. Canary Deployment with KEDA
# canary-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: canary-scaler
namespace: default
spec:
scaleTargetRef:
name: canary-app
minReplicaCount: 1
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: canary_success_rate
threshold: '0.95'
query: sum(rate(canary_success_rate[1m]))
authenticationRef:
name: prometheus-auth
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: canary-app
namespace: default
spec:
replicas: 5
strategy:
canary:
steps:
- setWeight: 20
- pause: {duration: 10m}
- setWeight: 40
- pause: {duration: 10m}
- setWeight: 60
- pause: {duration: 10m}
- setWeight: 80
- pause: {duration: 10m}
analysis:
templates:
- templateName: success-rate
args:
- name: service-name
value: canary-app
startingStep: 2
interval: 5m
selector:
matchLabels:
app: canary-app
template:
metadata:
labels:
app: canary-app
spec:
containers:
- name: app
image: myapp:latest
ports:
- containerPort: 8080
resources:
requests:
memory: "256Mi"
cpu: "100m"
limits:
memory: "512Mi"
cpu: "500m"
5. Event Sourcing with KEDA
# event-sourcing-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: event-sourcing-scaler
namespace: default
spec:
scaleTargetRef:
name: event-processor
minReplicaCount: 1
maxReplicaCount: 25
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: event-sourcing-group
topic: events
lagThreshold: "10"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/events
query: "SELECT COUNT(*) FROM event_store WHERE processed = false"
targetQueryValue: "100"
activationTargetQueryValue: "10"
authenticationRef:
name: postgres-auth
6. CQRS Pattern with KEDA
# cqrs-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cqrs-command-scaler
namespace: default
spec:
scaleTargetRef:
name: command-processor
minReplicaCount: 1
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/command-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cqrs-query-scaler
namespace: default
spec:
scaleTargetRef:
name: query-processor
minReplicaCount: 2
maxReplicaCount: 15
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: query_request_rate
threshold: '100'
query: sum(rate(query_request_rate[1m]))
authenticationRef:
name: prometheus-auth
7. Saga Pattern with KEDA
# saga-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: saga-scaler
namespace: default
spec:
scaleTargetRef:
name: saga-processor
minReplicaCount: 1
maxReplicaCount: 30
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/saga
query: "SELECT COUNT(*) FROM saga_instances WHERE status = 'running'"
targetQueryValue: "10"
activationTargetQueryValue: "1"
authenticationRef:
name: postgres-auth
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/saga
query: "SELECT COUNT(*) FROM saga_instances WHERE status = 'compensating'"
targetQueryValue: "5"
activationTargetQueryValue: "1"
authenticationRef:
name: postgres-auth
8. Outbox Pattern with KEDA
# outbox-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: outbox-scaler
namespace: default
spec:
scaleTargetRef:
name: outbox-processor
minReplicaCount: 1
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: postgresql
metadata:
connection: postgresql://user:password@postgres.default.svc.cluster.local:5432/outbox
query: "SELECT COUNT(*) FROM outbox_events WHERE processed = false"
targetQueryValue: "10"
activationTargetQueryValue: "1"
authenticationRef:
name: postgres-auth
- type: kafka
metadata:
bootstrapServers: kafka-cluster.kafka.svc.cluster.local:9092
consumerGroup: outbox-group
topic: outbox-events
lagThreshold: "5"
offsetResetPolicy: earliest
authenticationRef:
name: kafka-auth
Performance Optimization
1. Scaling Algorithm Tuning
# optimized-scaling-algorithm.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: optimized-scaler
namespace: default
spec:
scaleTargetRef:
name: optimized-app
minReplicaCount: 1
maxReplicaCount: 100
pollingInterval: 15
cooldownPeriod: 300
idleReplicaCount: 0
# Advanced scaling configuration
fallback:
failureThreshold: 3
replicas: 2
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/optimized-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
# Advanced SQS configuration
scaleOnInFlight: "false"
activationQueueLength: "1"
maxInFlight: "10"
authenticationRef:
name: keda-aws-credentials
2. Resource Optimization
# resource-optimized-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: resource-optimized-app
namespace: default
spec:
replicas: 0
selector:
matchLabels:
app: resource-optimized-app
template:
metadata:
labels:
app: resource-optimized-app
spec:
containers:
- name: app
image: myapp:latest
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "200m"
# Resource optimization
env:
- name: JAVA_OPTS
value: "-Xms128m -Xmx256m -XX:+UseG1GC"
- name: NODE_OPTIONS
value: "--max-old-space-size=256"
# Health checks
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 3
failureThreshold: 3
# Graceful shutdown
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 15"]
3. Network Optimization
# network-optimized-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: network-optimized-scaler
namespace: default
spec:
scaleTargetRef:
name: network-optimized-app
minReplicaCount: 1
maxReplicaCount: 50
pollingInterval: 10
cooldownPeriod: 60
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: network_throughput
threshold: '1000'
query: sum(rate(network_throughput[1m]))
authenticationRef:
name: prometheus-auth
---
apiVersion: v1
kind: Service
metadata:
name: network-optimized-app
namespace: default
spec:
selector:
app: network-optimized-app
ports:
- port: 80
targetPort: 8080
protocol: TCP
type: ClusterIP
# Network optimization
sessionAffinity: None
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: network-optimized-policy
namespace: default
spec:
podSelector:
matchLabels:
app: network-optimized-app
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 8080
egress:
- to:
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 9090
4. Caching Strategy with KEDA
# cache-optimized-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cache-optimized-scaler
namespace: default
spec:
scaleTargetRef:
name: cache-optimized-app
minReplicaCount: 2
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: redis
metadata:
address: redis-cluster.redis.svc.cluster.local:6379
listName: cache-miss-queue
listLength: "10"
enableTLS: "false"
authenticationRef:
name: redis-auth
- type: prometheus
metadata:
serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
metricName: cache_hit_ratio
threshold: '0.8'
query: sum(rate(cache_hit_ratio[1m]))
authenticationRef:
name: prometheus-auth
---
apiVersion: v1
kind: ConfigMap
metadata:
name: cache-config
namespace: default
data:
cache.properties: |
# Redis configuration
redis.host=redis-cluster.redis.svc.cluster.local
redis.port=6379
redis.timeout=2000
redis.pool.max-active=20
redis.pool.max-idle=10
redis.pool.min-idle=5
# Cache configuration
cache.ttl=3600
cache.max-size=10000
cache.eviction-policy=LRU
Security and Compliance
1. RBAC Configuration
# keda-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: keda-operator
namespace: keda
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: keda-operator
rules:
- apiGroups: [""]
resources: ["pods", "services", "endpoints", "persistentvolumeclaims", "events", "configmaps", "secrets"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
resources: ["deployments", "daemonsets", "replicasets", "statefulsets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["keda.sh"]
resources: ["scaledobjects", "scaledjobs", "triggerauthentications", "clustertriggerauthentications"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["autoscaling"]
resources: ["horizontalpodautoscalers"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: keda-operator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: keda-operator
subjects:
- kind: ServiceAccount
name: keda-operator
namespace: keda
2. Pod Security Standards
Note: PodSecurityPolicy was removed in Kubernetes 1.25, so the PSP below applies only to clusters still on 1.24 or earlier; on 1.25+ (including the EKS version used in this guide), use Pod Security Admission namespace labels instead (see the sketch after this example).
# pod-security-policy.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: keda-psp
spec:
privileged: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
volumes:
- 'configMap'
- 'emptyDir'
- 'projected'
- 'secret'
- 'downwardAPI'
- 'persistentVolumeClaim'
runAsUser:
rule: 'MustRunAsNonRoot'
seLinux:
rule: 'RunAsAny'
fsGroup:
rule: 'RunAsAny'
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: keda-operator
namespace: keda
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: keda-psp-user
rules:
- apiGroups: ['policy']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames:
- keda-psp
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: keda-psp-user
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: keda-psp-user
subjects:
- kind: ServiceAccount
name: keda-operator
namespace: keda
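On Kubernetes 1.25+ the equivalent guardrail is Pod Security Admission, enforced with namespace labels. A minimal sketch for the keda namespace (the restricted profile is an assumption; verify KEDA's pods satisfy it in your setup before enforcing):
# Enforce the restricted Pod Security Standard on the keda namespace (PSA, Kubernetes 1.25+)
kubectl label namespace keda \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=latest \
  pod-security.kubernetes.io/warn=restricted \
  pod-security.kubernetes.io/audit=restricted --overwrite

# Server-side dry run: surfaces warnings if existing workloads would violate the profile
kubectl label --dry-run=server --overwrite namespace keda \
  pod-security.kubernetes.io/enforce=restricted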
3. Network Security
# network-security.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: keda-network-policy
namespace: keda
spec:
podSelector:
matchLabels:
app: keda-operator
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: default
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 8080
egress:
- to: []
ports:
- protocol: TCP
port: 443
- protocol: TCP
port: 9090
- protocol: TCP
port: 6379
- protocol: TCP
port: 5432
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: keda-metrics-network-policy
namespace: keda
spec:
podSelector:
matchLabels:
app: keda-metrics-apiserver
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
name: monitoring
ports:
- protocol: TCP
port: 8080
egress:
- to: []
ports:
- protocol: TCP
port: 443
4. Secret Management
# secret-management.yaml
apiVersion: v1
kind: Secret
metadata:
name: keda-secrets
namespace: keda
type: Opaque
data:
aws-access-key: <base64-encoded-key>
aws-secret-key: <base64-encoded-secret>
redis-password: <base64-encoded-password>
postgres-password: <base64-encoded-password>
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: keda-secrets-auth
namespace: keda
spec:
secretTargetRef:
- parameter: awsAccessKeyID
name: keda-secrets
key: aws-access-key
- parameter: awsSecretAccessKey
name: keda-secrets
key: aws-secret-key
- parameter: password
name: keda-secrets
key: redis-password
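Hand-encoding base64 values is error-prone; the same Secret can be created (or regenerated) from literals, which also keeps plaintext out of YAML committed to Git (all values below are placeholders):
# Create the Secret referenced by keda-secrets-auth without hand-encoding base64
kubectl create secret generic keda-secrets -n keda \
  --from-literal=aws-access-key='<ACCESS_KEY_ID>' \
  --from-literal=aws-secret-key='<SECRET_ACCESS_KEY>' \
  --from-literal=redis-password='<REDIS_PASSWORD>' \
  --from-literal=postgres-password='<POSTGRES_PASSWORD>' \
  --dry-run=client -o yaml | kubectl apply -f -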
5. Compliance and Auditing
# compliance-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: keda-compliance-config
namespace: keda
data:
audit.yaml: |
# Audit configuration
audit:
enabled: true
level: "metadata"
logFormat: "json"
logPath: "/var/log/audit/audit.log"
maxAge: 30
maxBackups: 10
maxSize: 100
# Compliance settings
compliance:
gdpr: true
sox: true
pci: false
hipaa: false
# Data retention
retention:
logs: "30d"
metrics: "90d"
events: "7d"
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: keda-audit
namespace: keda
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: keda-audit
rules:
- apiGroups: [""]
resources: ["events", "pods", "services"]
verbs: ["get", "list", "watch"]
- apiGroups: ["keda.sh"]
resources: ["scaledobjects", "scaledjobs"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: keda-audit
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: keda-audit
subjects:
- kind: ServiceAccount
name: keda-audit
namespace: keda
Monitoring and Observability
1. KEDA Metrics Collection
# keda-monitoring.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: keda-metrics
namespace: keda
labels:
app: keda-operator
release: prometheus
spec:
selector:
matchLabels:
app: keda-operator
endpoints:
- port: http
path: /metrics
interval: 30s
scrapeTimeout: 10s
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: keda-metrics-apiserver
namespace: keda
labels:
app: keda-metrics-apiserver
release: prometheus
spec:
selector:
matchLabels:
app: keda-metrics-apiserver
endpoints:
- port: https
path: /metrics
scheme: https
interval: 30s
scrapeTimeout: 10s
tlsConfig:
insecureSkipVerify: true
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: keda-webhooks
namespace: keda
labels:
app: keda-webhooks
release: prometheus
spec:
selector:
matchLabels:
app: keda-webhooks
endpoints:
- port: https
path: /metrics
scheme: https
interval: 30s
scrapeTimeout: 10s
tlsConfig:
insecureSkipVerify: true
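To confirm Prometheus can actually scrape these endpoints, port-forward the operator and check that KEDA metrics are exposed (port 8080 matches the metrics port used in the ServiceMonitors above):
# Check that the operator exposes Prometheus metrics on the scraped port
kubectl port-forward -n keda deployment/keda-operator 8080:8080 &
PF_PID=$!
sleep 2
curl -s http://localhost:8080/metrics | grep -m 5 '^keda'
kill $PF_PID

# Confirm the ServiceMonitors were created and labeled for your Prometheus release
kubectl get servicemonitors -n keda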
2. Comprehensive Grafana Dashboard
{
"dashboard": {
"title": "KEDA Comprehensive Dashboard",
"tags": ["keda", "kubernetes", "autoscaling"],
"timezone": "browser",
"panels": [
{
"title": "KEDA Operator Health",
"type": "stat",
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
"targets": [
{
"expr": "up{job=\"keda-operator\"}",
"legendFormat": "Operator Status"
}
],
"fieldConfig": {
"defaults": {
"color": {"mode": "thresholds"},
"thresholds": {
"steps": [
{"color": "red", "value": 0},
{"color": "green", "value": 1}
]
}
}
}
},
{
"title": "ScaledObjects Status",
"type": "stat",
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
"targets": [
{
"expr": "keda_scaled_object_ready",
"legendFormat": "Ready"
},
{
"expr": "keda_scaled_object_paused",
"legendFormat": "Paused"
}
]
},
{
"title": "Current Replicas by ScaledObject",
"type": "graph",
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 8},
"targets": [
{
"expr": "keda_scaled_object_replicas",
"legendFormat": " ()"
}
],
"yAxes": [
{
"label": "Replicas",
"min": 0
}
]
},
{
"title": "Scale Events Rate",
"type": "graph",
"gridPos": {"h": 8, "w": 12, "x": 0, "y": 16},
"targets": [
{
"expr": "rate(keda_scaled_object_scale_events_total[5m])",
"legendFormat": " - "
}
],
"yAxes": [
{
"label": "Events/sec",
"min": 0
}
]
},
{
"title": "External Metrics",
"type": "graph",
"gridPos": {"h": 8, "w": 12, "x": 12, "y": 16},
"targets": [
{
"expr": "keda_scaled_object_ready * on(scaledObject) group_left keda_scaled_object_ready",
"legendFormat": ""
}
]
},
{
"title": "KEDA Operator Resource Usage",
"type": "graph",
"gridPos": {"h": 8, "w": 24, "x": 0, "y": 24},
"targets": [
{
"expr": "rate(container_cpu_usage_seconds_total{pod=~\"keda-operator-.*\"}[5m])",
"legendFormat": "CPU Usage"
},
{
"expr": "container_memory_usage_bytes{pod=~\"keda-operator-.*\"}",
"legendFormat": "Memory Usage"
}
],
"yAxes": [
{
"label": "CPU (cores)"
},
{
"label": "Memory (bytes)",
"logBase": 2
}
]
}
],
"time": {
"from": "now-1h",
"to": "now"
},
"refresh": "30s"
}
}
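One way to ship this dashboard alongside KEDA is the Grafana sidecar convention used by kube-prometheus-stack, where any ConfigMap labeled grafana_dashboard is discovered and loaded automatically. A minimal sketch, assuming that sidecar is enabled, Grafana runs in a monitoring namespace, and the JSON above is saved as keda-dashboard.json:
# Package the dashboard JSON as a ConfigMap the Grafana sidecar can discover
kubectl create configmap keda-grafana-dashboard -n monitoring \
  --from-file=keda-dashboard.json
kubectl label configmap keda-grafana-dashboard -n monitoring grafana_dashboard=1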
3. Alerting Rules
# keda-alerts.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: keda-alerts
namespace: keda
labels:
app: keda
release: prometheus
spec:
groups:
- name: keda.rules
rules:
- alert: KEDAOperatorDown
expr: up{job="keda-operator"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "KEDA Operator is down"
description: "KEDA Operator has been down for more than 1 minute"
- alert: KEDAMetricsServerDown
expr: up{job="keda-metrics-apiserver"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "KEDA Metrics Server is down"
description: "KEDA Metrics Server has been down for more than 1 minute"
- alert: ScaledObjectNotReady
expr: keda_scaled_object_ready == 0
for: 2m
labels:
severity: warning
annotations:
summary: "ScaledObject is not ready"
description: "ScaledObject {{ $labels.scaledObject }} in namespace {{ $labels.namespace }} is not ready"
- alert: HighScaleEventRate
expr: rate(keda_scaled_object_scale_events_total[5m]) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "High scale event rate"
description: "ScaledObject {{ $labels.scaledObject }} is scaling frequently"
- alert: KEDAOperatorHighCPU
expr: rate(container_cpu_usage_seconds_total{pod=~"keda-operator-.*"}[5m]) > 0.8
for: 5m
labels:
severity: warning
annotations:
summary: "KEDA Operator high CPU usage"
description: "KEDA Operator CPU usage is above 80%"
- alert: KEDAOperatorHighMemory
expr: container_memory_usage_bytes{pod=~"keda-operator-.*"} > 1000000000
for: 5m
labels:
severity: warning
annotations:
summary: "KEDA Operator high memory usage"
description: "KEDA Operator memory usage is above 1GB"
4. Custom Metrics Collection
# custom-metrics-collector.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: keda-metrics-collector
namespace: keda
spec:
replicas: 1
selector:
matchLabels:
app: keda-metrics-collector
template:
metadata:
labels:
app: keda-metrics-collector
spec:
containers:
- name: collector
image: prom/node-exporter:latest
ports:
- containerPort: 9100
resources:
requests:
memory: "64Mi"
cpu: "50m"
limits:
memory: "128Mi"
cpu: "100m"
volumeMounts:
- name: proc
mountPath: /host/proc
readOnly: true
- name: sys
mountPath: /host/sys
readOnly: true
volumes:
- name: proc
hostPath:
path: /proc
- name: sys
hostPath:
path: /sys
hostNetwork: true
hostPID: true
---
apiVersion: v1
kind: Service
metadata:
name: keda-metrics-collector
namespace: keda
spec:
selector:
app: keda-metrics-collector
ports:
- port: 9100
targetPort: 9100
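Once these node-level metrics are scraped into Prometheus, they can feed back into KEDA through the standard prometheus scaler. A minimal sketch of that loop; the Prometheus address, query, threshold, and target deployment name are illustrative assumptions, not values from this guide:
# node-pressure-scaler.yaml (illustrative sketch)
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: node-pressure-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: my-app                      # hypothetical deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-operated.monitoring.svc:9090
      # Average busy CPU fraction across nodes, derived from node-exporter metrics
      query: 'avg(1 - rate(node_cpu_seconds_total{mode="idle"}[5m]))'
      threshold: "0.7"                # scale out when average node CPU usage exceeds 70%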
Troubleshooting and Debugging
1. Comprehensive Debugging Script
#!/bin/bash
# keda-debug.sh
set -uo pipefail  # keep going past individual failures so every check runs
NAMESPACE=${1:-default}
SCALED_OBJECT=${2:-""}
echo "=== KEDA Debugging Script ==="
echo "Namespace: $NAMESPACE"
echo "ScaledObject: $SCALED_OBJECT"
echo ""
# Check KEDA installation
echo "1. Checking KEDA Installation..."
kubectl get pods -n keda
kubectl get crd | grep keda
echo ""
# Check ScaledObjects
echo "2. Checking ScaledObjects..."
if [ -n "$SCALED_OBJECT" ]; then
kubectl describe scaledobject $SCALED_OBJECT -n $NAMESPACE
else
kubectl get scaledobjects -n $NAMESPACE
fi
echo ""
# Check TriggerAuthentications
echo "3. Checking TriggerAuthentications..."
kubectl get triggerauthentications -n $NAMESPACE
echo ""
# Check KEDA operator logs
echo "4. Checking KEDA Operator Logs..."
kubectl logs -n keda deployment/keda-operator --tail=50
echo ""
# Check metrics server logs
echo "5. Checking Metrics Server Logs..."
kubectl logs -n keda deployment/keda-metrics-apiserver --tail=50
echo ""
# Check HPA
echo "6. Checking HPA..."
kubectl get hpa -n $NAMESPACE
echo ""
# Check external metrics
echo "7. Checking External Metrics..."
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" 2>/dev/null || echo "External metrics API not available"
echo ""
# Check events
echo "8. Checking Events..."
kubectl get events -n $NAMESPACE --sort-by=.metadata.creationTimestamp | tail -20
echo ""
# Check resource usage
echo "9. Checking Resource Usage..."
kubectl top pods -n keda
echo ""
# Check network connectivity
echo "10. Checking Network Connectivity..."
kubectl run debug-pod --image=busybox --rm -i --restart=Never -- nslookup keda-operator.keda.svc.cluster.local
echo ""
echo "=== Debug Complete ==="
2. Common Issues and Solutions
Issue 1: ScaledObject Not Scaling
# Check ScaledObject status
kubectl describe scaledobject <scaled-object-name> -n <namespace>
# Check authentication
kubectl describe triggerauthentication <auth-name> -n <namespace>
# Check external metrics
kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/keda-scaler-<scaled-object-name>
# Check KEDA operator logs
kubectl logs -n keda deployment/keda-operator | grep <scaled-object-name>
Issue 2: Authentication Failures
# Test AWS credentials
kubectl run test-pod --image=amazon/aws-cli --rm -it --restart=Never -- \
aws sts get-caller-identity
# Check IAM role binding
kubectl describe serviceaccount keda-operator -n keda
# Verify OIDC provider
aws iam get-open-id-connect-provider --open-id-connect-provider-arn arn:aws:iam::ACCOUNT:oidc-provider/oidc.eks.REGION.amazonaws.com/id/CLUSTER_ID
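If the identity check fails, the most common root cause is a missing or misconfigured IRSA binding rather than the scaler itself. A TriggerAuthentication that delegates to pod identity looks roughly like the sketch below; the provider value depends on your KEDA version (aws on 2.10+, aws-eks on older releases), and the name matches the keda-aws-credentials reference used by the scalers in this guide:
# irsa-trigger-auth.yaml (sketch)
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-aws-credentials
  namespace: default
spec:
  podIdentity:
    provider: aws   # use aws-eks on KEDA versions prior to 2.10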
Issue 3: Performance Issues
# Check KEDA metrics
kubectl top pods -n keda
# Monitor scaling events
kubectl get events --sort-by=.metadata.creationTimestamp
# Check resource usage
kubectl describe nodes
# Check for resource constraints
kubectl describe pods -n keda
3. Advanced Debugging Tools
# keda-debug-tools.yaml
apiVersion: v1
kind: Pod
metadata:
name: keda-debug-tools
namespace: keda
spec:
containers:
- name: debug-tools
image: bitnami/kubectl:latest
command: ["sleep", "3600"]
resources:
requests:
memory: "64Mi"
cpu: "50m"
limits:
memory: "128Mi"
cpu: "100m"
        # kubectl reads in-cluster credentials from the mounted service account token; no KUBECONFIG needed
serviceAccountName: keda-operator
restartPolicy: Never
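Because the pod runs under the keda-operator service account, you can exec into it and query KEDA resources and the external metrics API with the operator's own permissions:
kubectl exec -it keda-debug-tools -n keda -- kubectl get scaledobjects --all-namespaces
kubectl exec -it keda-debug-tools -n keda -- kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"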
Production Deployment
1. Production-Ready KEDA Configuration
# production-keda-values.yaml
operator:
replicaCount: 3
image:
repository: ghcr.io/kedacore/keda
tag: "2.12.0"
pullPolicy: IfNotPresent
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 200m
memory: 200Mi
securityContext:
runAsNonRoot: true
runAsUser: 1000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
metricsApiServer:
replicaCount: 3
image:
repository: ghcr.io/kedacore/keda-metrics-apiserver
tag: "2.12.0"
pullPolicy: IfNotPresent
resources:
limits:
cpu: 2000m
memory: 2Gi
requests:
cpu: 200m
memory: 200Mi
securityContext:
runAsNonRoot: true
runAsUser: 1000
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities:
drop:
- ALL
webhooks:
replicaCount: 3
image:
repository: ghcr.io/kedacore/keda-admission-webhooks
tag: "2.12.0"
pullPolicy: IfNotPresent
resources:
limits:
cpu: 1000m
memory: 1Gi
requests:
cpu: 100m
memory: 100Mi
# High Availability Configuration
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: app.kubernetes.io/name
operator: In
values:
- keda-operator
topologyKey: kubernetes.io/hostname
# Pod Disruption Budget
podDisruptionBudget:
enabled: true
minAvailable: 2
# Horizontal Pod Autoscaler
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
# Monitoring
prometheus:
metricServer:
enabled: true
port: 8080
path: /metrics
operator:
enabled: true
port: 8080
path: /metrics
# Logging
logging:
operator:
level: info
format: json
metricServer:
level: info
format: json
2. Production Deployment Script
#!/bin/bash
# deploy-keda-production.sh
set -e
CLUSTER_NAME=${1:-"production-cluster"}
REGION=${2:-"us-west-2"}
NAMESPACE="keda"
echo "Deploying KEDA to production cluster: $CLUSTER_NAME"
# Update kubeconfig
aws eks update-kubeconfig --region $REGION --name $CLUSTER_NAME
# Create namespace
kubectl create namespace $NAMESPACE --dry-run=client -o yaml | kubectl apply -f -
# Ensure the KEDA Helm repository is available
helm repo add kedacore https://kedacore.github.io/charts 2>/dev/null || true
helm repo update
# Install KEDA with production configuration
helm upgrade --install keda kedacore/keda \
--namespace $NAMESPACE \
--values production-keda-values.yaml \
--wait \
--timeout=15m
# Verify installation
kubectl get pods -n $NAMESPACE
kubectl get crd | grep keda
# Deploy monitoring
kubectl apply -f keda-monitoring.yaml
kubectl apply -f keda-alerts.yaml
# Deploy security policies
kubectl apply -f keda-rbac.yaml
kubectl apply -f network-security.yaml
echo "KEDA production deployment completed successfully!"
3. Disaster Recovery
# disaster-recovery.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: keda-backup-script
namespace: keda
data:
backup.sh: |
#!/bin/bash
# KEDA Disaster Recovery Backup Script
BACKUP_DIR="/backup/keda"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
mkdir -p $BACKUP_DIR/$TIMESTAMP
# Backup ScaledObjects
kubectl get scaledobjects --all-namespaces -o yaml > $BACKUP_DIR/$TIMESTAMP/scaledobjects.yaml
# Backup ScaledJobs
kubectl get scaledjobs --all-namespaces -o yaml > $BACKUP_DIR/$TIMESTAMP/scaledjobs.yaml
# Backup TriggerAuthentications
kubectl get triggerauthentications --all-namespaces -o yaml > $BACKUP_DIR/$TIMESTAMP/triggerauthentications.yaml
# Backup ClusterTriggerAuthentications
kubectl get clustertriggerauthentications -o yaml > $BACKUP_DIR/$TIMESTAMP/clustertriggerauthentications.yaml
# Backup KEDA configuration
kubectl get configmap -n keda -o yaml > $BACKUP_DIR/$TIMESTAMP/keda-config.yaml
# Backup secrets
kubectl get secrets -n keda -o yaml > $BACKUP_DIR/$TIMESTAMP/keda-secrets.yaml
echo "Backup completed: $BACKUP_DIR/$TIMESTAMP"
---
apiVersion: batch/v1
kind: CronJob
metadata:
name: keda-backup
namespace: keda
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: backup
image: bitnami/kubectl:latest
command: ["/bin/bash", "/scripts/backup.sh"]
volumeMounts:
- name: backup-script
mountPath: /scripts
- name: backup-storage
mountPath: /backup
volumes:
- name: backup-script
configMap:
name: keda-backup-script
defaultMode: 0755
- name: backup-storage
persistentVolumeClaim:
claimName: keda-backup-pvc
restartPolicy: OnFailure
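Restoring is the mirror image of the backup: re-apply the captured manifests in dependency order, secrets and authentications before the ScaledObjects that reference them. A minimal sketch, assuming the backup volume is mounted at /backup/keda and the most recent timestamped directory should be used:
#!/bin/bash
# keda-restore.sh (sketch) - restore the most recent backup
BACKUP_DIR="/backup/keda"
LATEST=$(ls -1 $BACKUP_DIR | sort | tail -1)
kubectl apply -f $BACKUP_DIR/$LATEST/keda-secrets.yaml
kubectl apply -f $BACKUP_DIR/$LATEST/keda-config.yaml
kubectl apply -f $BACKUP_DIR/$LATEST/triggerauthentications.yaml
kubectl apply -f $BACKUP_DIR/$LATEST/clustertriggerauthentications.yaml
kubectl apply -f $BACKUP_DIR/$LATEST/scaledobjects.yaml
kubectl apply -f $BACKUP_DIR/$LATEST/scaledjobs.yaml
echo "Restore completed from $BACKUP_DIR/$LATEST"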
Cost Optimization Strategies
1. Resource Right-Sizing
# cost-optimized-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: cost-optimized-scaler
namespace: default
spec:
scaleTargetRef:
name: cost-optimized-app
minReplicaCount: 0 # Scale to zero when idle
maxReplicaCount: 20
pollingInterval: 30
cooldownPeriod: 300
idleReplicaCount: 0
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/cost-optimized-queue
queueLength: "5"
awsRegion: us-west-2
identityOwner: operator
# Cost optimization settings
scaleOnInFlight: "false"
activationQueueLength: "1"
authenticationRef:
name: keda-aws-credentials
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: cost-optimized-app
namespace: default
spec:
replicas: 0
selector:
matchLabels:
app: cost-optimized-app
template:
metadata:
labels:
app: cost-optimized-app
spec:
containers:
- name: app
image: myapp:latest
resources:
requests:
memory: "64Mi" # Minimal requests
cpu: "50m"
limits:
memory: "128Mi" # Reasonable limits
cpu: "100m"
# Cost optimization
env:
- name: JAVA_OPTS
value: "-Xms64m -Xmx128m -XX:+UseG1GC"
- name: NODE_OPTIONS
value: "--max-old-space-size=128"
2. Spot Instance Integration
# spot-instance-scaler.yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: spot-instance-scaler
namespace: default
spec:
scaleTargetRef:
name: spot-instance-app
minReplicaCount: 0
maxReplicaCount: 50
pollingInterval: 30
cooldownPeriod: 300
triggers:
- type: aws-sqs-queue
metadata:
queueURL: https://sqs.us-west-2.amazonaws.com/123456789012/spot-queue
queueLength: "10"
awsRegion: us-west-2
identityOwner: operator
authenticationRef:
name: keda-aws-credentials
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: spot-instance-app
namespace: default
spec:
replicas: 0
selector:
matchLabels:
app: spot-instance-app
template:
metadata:
labels:
app: spot-instance-app
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
spec:
nodeSelector:
        eks.amazonaws.com/capacityType: SPOT  # label applied by EKS managed node groups running Spot capacity
tolerations:
- key: "spot"
operator: "Equal"
value: "true"
effect: "NoSchedule"
containers:
- name: app
image: myapp:latest
resources:
requests:
memory: "128Mi"
cpu: "100m"
limits:
memory: "256Mi"
cpu: "200m"
# Spot instance optimization
env:
- name: SPOT_INSTANCE
value: "true"
- name: GRACEFUL_SHUTDOWN
value: "true"
3. Cost Monitoring Dashboard
{
"dashboard": {
"title": "KEDA Cost Optimization Dashboard",
"panels": [
{
"title": "Pod Cost by Namespace",
"type": "graph",
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total[5m]) * on(pod) group_left() kube_pod_info) by (namespace)",
"legendFormat": ""
}
]
},
{
"title": "Scaling Efficiency",
"type": "graph",
"targets": [
{
"expr": "keda_scaled_object_replicas / keda_scaled_object_ready",
"legendFormat": ""
}
]
},
{
"title": "Idle Time Percentage",
"type": "stat",
"targets": [
{
"expr": "avg(rate(keda_scaled_object_replicas[1h]) == 0) * 100",
"legendFormat": "Idle %"
}
]
}
]
}
}
Conclusion
This comprehensive guide has covered advanced KEDA implementation on Amazon EKS with:
Key Highlights:
- 25+ Scaling Use Cases: From basic SQS to advanced multi-cloud scenarios
- Enterprise Patterns: Circuit breakers, blue-green deployments, CQRS, and more
- Production-Ready Configurations: High availability, security, and monitoring
- Performance Optimization: Resource tuning, network optimization, and caching
- Security & Compliance: RBAC, network policies, and audit configurations
- Cost Optimization: Spot instances, right-sizing, and efficiency monitoring
- Comprehensive Monitoring: Grafana dashboards, alerting, and debugging tools
Next Steps:
- Start Simple: Begin with basic SQS or Redis scaling
- Add Monitoring: Implement comprehensive observability
- Scale Gradually: Add more complex patterns as needed
- Optimize Costs: Implement cost optimization strategies
- Enterprise Features: Add security and compliance controls
This guide provides enterprise-grade patterns for implementing KEDA on EKS. Always test in non-production environments first and adapt the examples to your specific requirements.