AWS_China_EKS_Backup
This is a detailed, EKS-specific implementation guide covering all the clarifications and concrete steps for your EKS setup. Here are the key points:
Key EKS-Specific Clarifications:
1. EKS Log Sources Explained:
Node-level logs: kubelet, kube-proxy, container runtime, system logs
Container logs: Individual pod container logs via Fluent Bit
System logs: Bootstrap, security, and system messages
2. Two-Tier Logging Approach:
CloudWatch Agent DaemonSet: Collects node-level system logs
Fluent Bit DaemonSet: Collects container and application logs from pods
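Both collectors run as DaemonSets in the amazon-cloudwatch namespace, so once the manifests later in this guide are applied, a quick sanity check (a sketch using the resource names from this guide) is:
# DESIRED and READY for both DaemonSets should equal the node count
kubectl get daemonset -n amazon-cloudwatch
kubectl get nodes --no-headers | wc -l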
3. Complete Kubernetes Deployments:
Proper RBAC (ServiceAccount, ClusterRole, ClusterRoleBinding)
DaemonSets that run on all nodes
ConfigMaps for log configuration
Namespace isolation (amazon-cloudwatch)
4. EKS-Specific Configuration:
# Environment variables for easy setup
export EKS_CLUSTER_NAME="your-cluster-name"
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
5. Real-Time Verification Steps:
Check pod deployment across all nodes
Verify log groups are receiving data
Confirm S3 delivery in Ningxia region
Monitor Firehose stream health
What Makes This EKS-Specific:
Kubernetes-Native Deployment: Uses DaemonSets instead of installing agents on individual EC2 instances
Container Log Collection: Captures logs from all pods and containers
Cluster-Wide Coverage: Automatically deploys to new nodes as they join
EKS Control Plane Integration: Separate from AWS-managed control plane logs
Implementation Order:
Start with Prerequisites (S3 buckets, IAM roles)
Deploy CloudWatch Agent for node logs
Deploy Fluent Bit for container logs
Configure RDS logs if you have databases
Set up monitoring and verify delivery
Key Benefits:
Automatic scaling: New EKS nodes automatically get logging agents
Real-time streaming: Logs flow immediately to Ningxia region
Cost optimized: 7-day CloudWatch retention + S3 lifecycle policies
Container awareness: Captures Kubernetes metadata with logs
The guide now includes actual kubectl commands, complete YAML manifests, and specific verification steps for your EKS environment. Each step includes the exact commands you need to run with proper AWS China endpoints.
EKS-Focused AWS China Log Backup: Complete Implementation Guide
Prerequisites Setup (Detailed)
1. Gather Your Information First
Before starting, collect these details:
# Your AWS Account ID
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
echo "Account ID: $AWS_ACCOUNT_ID"
# Your EKS cluster name
export EKS_CLUSTER_NAME="your-eks-cluster-name"
# Your EKS node group name
export EKS_NODEGROUP_NAME="your-nodegroup-name"
# List your EKS nodes
kubectl get nodes -o wide
2. Create S3 Buckets with Proper Configuration
# Set variables
export REGION_BEIJING="cn-north-1"
export REGION_NINGXIA="cn-northwest-1"
export S3_BUCKET_EKS="eks-logs-backup-ningxia"
export S3_BUCKET_RDS="rds-logs-backup-ningxia"
# Create S3 bucket for EKS logs with versioning
aws s3api create-bucket \
--bucket $S3_BUCKET_EKS \
--region $REGION_NINGXIA \
--create-bucket-configuration LocationConstraint=$REGION_NINGXIA \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
# Enable versioning
aws s3api put-bucket-versioning \
--bucket $S3_BUCKET_EKS \
--versioning-configuration Status=Enabled \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
# Create bucket policy for cross-region access
cat > s3-bucket-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "firehose.amazonaws.com"
},
"Action": [
"s3:AbortMultipartUpload",
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListBucket",
"s3:ListBucketMultipartUploads",
"s3:PutObject"
],
"Resource": [
"arn:aws-cn:s3:::$S3_BUCKET_EKS",
"arn:aws-cn:s3:::$S3_BUCKET_EKS/*"
],
"Condition": {
"StringEquals": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
}
]
}
EOF
# Apply bucket policy
aws s3api put-bucket-policy \
--bucket $S3_BUCKET_EKS \
--policy file://s3-bucket-policy.json \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
3. Create IAM Roles with Complete Permissions
3.1 Create Kinesis Firehose Service Role:
# Trust policy for Firehose
cat > firehose-trust-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "firehose.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
# Comprehensive permissions policy
cat > firehose-permissions.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:AbortMultipartUpload",
"s3:GetBucketLocation",
"s3:GetObject",
"s3:ListBucket",
"s3:ListBucketMultipartUploads",
"s3:PutObject",
"s3:PutObjectAcl"
],
"Resource": [
"arn:aws-cn:s3:::$S3_BUCKET_EKS",
"arn:aws-cn:s3:::$S3_BUCKET_EKS/*",
"arn:aws-cn:s3:::$S3_BUCKET_RDS",
"arn:aws-cn:s3:::$S3_BUCKET_RDS/*"
]
},
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents",
"logs:CreateLogGroup",
"logs:CreateLogStream"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"kinesis:DescribeStream",
"kinesis:GetShardIterator",
"kinesis:GetRecords"
],
"Resource": "*"
}
]
}
EOF
# Create role and policy
aws iam create-role \
--role-name EKSFirehoseDeliveryRole \
--assume-role-policy-document file://firehose-trust-policy.json \
--endpoint-url https://iam.$REGION_BEIJING.amazonaws.com.cn
aws iam create-policy \
--policy-name EKSFirehoseDeliveryRolePolicy \
--policy-document file://firehose-permissions.json \
--endpoint-url https://iam.$REGION_BEIJING.amazonaws.com.cn
aws iam attach-role-policy \
--role-name EKSFirehoseDeliveryRole \
--policy-arn arn:aws-cn:iam::$AWS_ACCOUNT_ID:policy/EKSFirehoseDeliveryRolePolicy \
--endpoint-url https://iam.$REGION_BEIJING.amazonaws.com.cn
3.2 Update EKS Node Group IAM Role:
# Get your existing EKS node group role
export EKS_NODE_ROLE=$(aws eks describe-nodegroup \
--cluster-name $EKS_CLUSTER_NAME \
--nodegroup-name $EKS_NODEGROUP_NAME \
--region $REGION_BEIJING \
--query 'nodegroup.nodeRole' --output text | cut -d'/' -f2)
echo "EKS Node Role: $EKS_NODE_ROLE"
# Create additional policy for CloudWatch agent
cat > eks-cloudwatch-policy.json << EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:PutLogEvents",
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:DescribeLogStreams",
"logs:DescribeLogGroups",
"logs:PutRetentionPolicy"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"cloudwatch:PutMetricData",
"ec2:DescribeVolumes",
"ec2:DescribeTags",
"logs:PutLogEvents",
"logs:CreateLogGroup",
"logs:CreateLogStream"
],
"Resource": "*"
}
]
}
EOF
# Create and attach policy to EKS node role
aws iam create-policy \
--policy-name EKSCloudWatchAgentPolicy \
--policy-document file://eks-cloudwatch-policy.json
aws iam attach-role-policy \
--role-name $EKS_NODE_ROLE \
--policy-arn arn:aws-cn:iam::$AWS_ACCOUNT_ID:policy/EKSCloudWatchAgentPolicy
EKS-Specific Log Backup Implementation
Step 1: Understand EKS Log Sources
EKS generates multiple types of logs:
EKS Control Plane Logs (managed by AWS, separate configuration)
Node-level logs (what we're backing up):
/var/log/kubelet.log - Kubelet service logs
/var/log/kube-proxy.log - Kube-proxy logs
/var/log/docker or /var/log/containerd - Container runtime logs
/var/log/pods/*/*/*.log - Individual pod container logs
/var/log/messages - System logs
/var/log/secure - Security logs
/var/log/cloud-init.log - Bootstrap logs
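Exactly which of these files exist varies by AMI and container runtime (for example, containerd-based nodes have no /var/log/docker). A quick way to check a node before configuring the agent is an ephemeral debug container, sketched below with a placeholder node name; kubectl debug mounts the node's root filesystem at /host:
kubectl get nodes
kubectl debug node/<your-node-name> -it --image=busybox -- ls -la /host/var/log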
Step 2: Deploy CloudWatch Agent to EKS Nodes
2.1 Create CloudWatch Agent Configuration for EKS:
# Create namespace for CloudWatch agent
kubectl create namespace amazon-cloudwatch
# Create ConfigMap with EKS-specific configuration
cat > cwagentconfig.yaml << EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: cwagentconfig
namespace: amazon-cloudwatch
data:
cwagentconfig.json: |
{
"agent": {
"region": "$REGION_BEIJING",
"metrics_collection_interval": 60,
"run_as_user": "cwagent"
},
"logs": {
"metrics_collected": {
"kubernetes": {
"cluster_name": "$EKS_CLUSTER_NAME",
"metrics_collection_interval": 60
}
},
"logs_collected": {
"files": {
"collect_list": [
{
"file_path": "/var/log/kubelet.log",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/kubelet",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
},
{
"file_path": "/var/log/kube-proxy.log",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/kube-proxy",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
},
{
"file_path": "/var/log/docker",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/docker",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
},
{
"file_path": "/var/log/messages",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/system",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
},
{
"file_path": "/var/log/secure",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/secure",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
},
{
"file_path": "/var/log/cloud-init.log",
"log_group_name": "/aws/eks/$EKS_CLUSTER_NAME/cloud-init",
"log_stream_name": "{instance_id}",
"timezone": "UTC"
}
]
}
}
}
}
EOF
kubectl apply -f cwagentconfig.yaml
2.2 Deploy CloudWatch Agent as DaemonSet:
cat > cwagent-daemonset.yaml << EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: cloudwatch-agent
namespace: amazon-cloudwatch
spec:
selector:
matchLabels:
name: cloudwatch-agent
template:
metadata:
labels:
name: cloudwatch-agent
spec:
serviceAccountName: cloudwatch-agent
containers:
- name: cloudwatch-agent
image: public.ecr.aws/cloudwatch-agent/cloudwatch-agent:1.247348.0b251302
resources:
limits:
cpu: 200m
memory: 200Mi
requests:
cpu: 200m
memory: 200Mi
env:
- name: AWS_REGION
value: $REGION_BEIJING
- name: K8S_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: HOST_IP
valueFrom:
fieldRef:
fieldPath: status.hostIP
- name: HOST_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: K8S_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: cwagentconfig
mountPath: /etc/cwagentconfig
- name: rootfs
mountPath: /rootfs
readOnly: true
- name: dockersock
mountPath: /var/run/docker.sock
readOnly: true
- name: varlibdocker
mountPath: /var/lib/docker
readOnly: true
- name: varlog
mountPath: /var/log
readOnly: true
- name: sys
mountPath: /sys
readOnly: true
- name: devdisk
mountPath: /dev/disk
readOnly: true
volumes:
- name: cwagentconfig
configMap:
name: cwagentconfig
- name: rootfs
hostPath:
path: /
- name: dockersock
hostPath:
path: /var/run/docker.sock
- name: varlibdocker
hostPath:
path: /var/lib/docker
- name: varlog
hostPath:
path: /var/log
- name: sys
hostPath:
path: /sys
- name: devdisk
hostPath:
path: /dev/disk/
terminationGracePeriodSeconds: 60
hostNetwork: true
EOF
# Create service account and role binding
cat > cwagent-serviceaccount.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: cloudwatch-agent
namespace: amazon-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cloudwatch-agent-role
rules:
- apiGroups: [""]
resources: ["pods", "nodes", "endpoints"]
verbs: ["list", "watch"]
- apiGroups: ["apps"]
resources: ["replicasets"]
verbs: ["list", "watch"]
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["list", "watch"]
- apiGroups: [""]
resources: ["nodes/proxy"]
verbs: ["get"]
- apiGroups: [""]
resources: ["nodes/stats", "configmaps", "events"]
verbs: ["create", "get"]
- apiGroups: [""]
resources: ["configmaps"]
verbs: ["update"]
- nonResourceURLs: ["/metrics"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cloudwatch-agent-role-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cloudwatch-agent-role
subjects:
- kind: ServiceAccount
name: cloudwatch-agent
namespace: amazon-cloudwatch
EOF
# Deploy CloudWatch agent
kubectl apply -f cwagent-serviceaccount.yaml
kubectl apply -f cwagent-daemonset.yaml
# Verify deployment
kubectl get pods -n amazon-cloudwatch
kubectl logs -n amazon-cloudwatch -l name=cloudwatch-agent
Step 3: Create CloudWatch Log Groups for EKS
# Create all EKS-related log groups
declare -a LOG_GROUPS=(
"/aws/eks/$EKS_CLUSTER_NAME/kubelet"
"/aws/eks/$EKS_CLUSTER_NAME/kube-proxy"
"/aws/eks/$EKS_CLUSTER_NAME/docker"
"/aws/eks/$EKS_CLUSTER_NAME/system"
"/aws/eks/$EKS_CLUSTER_NAME/secure"
"/aws/eks/$EKS_CLUSTER_NAME/cloud-init"
)
# Create each log group
for LOG_GROUP in "${LOG_GROUPS[@]}"
do
echo "Creating log group: $LOG_GROUP"
aws logs create-log-group \
--log-group-name "$LOG_GROUP" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
# Set retention policy (7 days for cost optimization)
aws logs put-retention-policy \
--log-group-name "$LOG_GROUP" \
--retention-in-days 7 \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
done
Step 4: Create Kinesis Firehose Delivery Stream for EKS Logs
# Create Firehose delivery stream with EKS-specific configuration
aws firehose create-delivery-stream \
--delivery-stream-name eks-logs-to-ningxia \
--delivery-stream-type DirectPut \
--s3-destination-configuration '{
"RoleARN": "arn:aws-cn:iam::'$AWS_ACCOUNT_ID':role/EKSFirehoseDeliveryRole",
"BucketARN": "arn:aws-cn:s3:::'$S3_BUCKET_EKS'",
"Prefix": "eks-logs/cluster='$EKS_CLUSTER_NAME'/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/hour=!{timestamp:HH}/",
"ErrorOutputPrefix": "errors/",
"BufferingHints": {
"SizeInMBs": 5,
"IntervalInSeconds": 60
},
"CompressionFormat": "GZIP",
"CloudWatchLoggingOptions": {
"Enabled": true,
"LogGroupName": "/aws/kinesisfirehose/eks-logs-to-ningxia"
}
}' \
--region $REGION_BEIJING \
--endpoint-url https://firehose.$REGION_BEIJING.amazonaws.com.cn
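Optionally, you can push a single test record straight into the stream to confirm the S3 delivery path end to end (a sketch assuming AWS CLI v2, which expects blob data base64-encoded; the payload is illustrative):
# The record should land under the eks-logs/ prefix within a couple of minutes
aws firehose put-record \
  --delivery-stream-name eks-logs-to-ningxia \
  --record Data=$(echo '{"message":"firehose smoke test"}' | base64) \
  --region $REGION_BEIJING \
  --endpoint-url https://firehose.$REGION_BEIJING.amazonaws.com.cn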
# Verify Firehose stream creation
aws firehose describe-delivery-stream \
--delivery-stream-name eks-logs-to-ningxia \
--region $REGION_BEIJING \
--endpoint-url https://firehose.$REGION_BEIJING.amazonaws.com.cn
Step 5: Create Subscription Filters for Each EKS Log Group
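A subscription filter whose destination is a Firehose stream must also pass a role ARN that CloudWatch Logs assumes to write into the stream; without it, put-subscription-filter fails. A minimal sketch of creating that role first (the role name CWLtoFirehoseRole is an assumption introduced here, and the logs service principal shown is the China-partition form, worth verifying for your account):
# Trust policy letting CloudWatch Logs in Beijing assume the role
cat > cwl-trust-policy.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "logs.$REGION_BEIJING.amazonaws.com.cn"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
# Allow the role to put records into any delivery stream in this account
cat > cwl-permissions.json << EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["firehose:PutRecord", "firehose:PutRecordBatch"],
      "Resource": "arn:aws-cn:firehose:$REGION_BEIJING:$AWS_ACCOUNT_ID:deliverystream/*"
    }
  ]
}
EOF
aws iam create-role \
  --role-name CWLtoFirehoseRole \
  --assume-role-policy-document file://cwl-trust-policy.json \
  --endpoint-url https://iam.$REGION_BEIJING.amazonaws.com.cn
aws iam put-role-policy \
  --role-name CWLtoFirehoseRole \
  --policy-name CWLtoFirehosePolicy \
  --policy-document file://cwl-permissions.json \
  --endpoint-url https://iam.$REGION_BEIJING.amazonaws.com.cn
export CWL_ROLE_ARN="arn:aws-cn:iam::$AWS_ACCOUNT_ID:role/CWLtoFirehoseRole"
The helper function below passes this role via --role-arn.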
# Function to create subscription filter
create_subscription_filter() {
local LOG_GROUP_NAME=$1
local FILTER_NAME=$2
echo "Creating subscription filter for $LOG_GROUP_NAME"
aws logs put-subscription-filter \
--log-group-name "$LOG_GROUP_NAME" \
--filter-name "$FILTER_NAME" \
--filter-pattern "" \
--destination-arn "arn:aws-cn:firehose:$REGION_BEIJING:$AWS_ACCOUNT_ID:deliverystream/eks-logs-to-ningxia" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
}
# Create subscription filters for all log groups
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/kubelet" "EKSKubeletLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/kube-proxy" "EKSKubeProxyLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/docker" "EKSDockerLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/system" "EKSSystemLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/secure" "EKSSecureLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/cloud-init" "EKSCloudInitLogsToFirehose"Pod-Level Log Collection with Fluent Bit (Advanced)
Step 6: Deploy Fluent Bit for Container Logs
6.1 Create Fluent Bit Configuration:
cat > fluent-bit-config.yaml << EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: fluent-bit-config
namespace: amazon-cloudwatch
data:
fluent-bit.conf: |
[SERVICE]
Flush 5
Grace 30
Log_Level info
Daemon off
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_Port 2020
storage.path /var/fluent-bit/state/flb-storage/
storage.sync normal
storage.checksum off
storage.backlog.mem_limit 5M
[INPUT]
Name tail
Tag application.*
Exclude_Path /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*
Path /var/log/containers/*.log
Parser cri
DB /var/fluent-bit/state/flb_container.db
Mem_Buf_Limit 50MB
Skip_Long_Lines On
Refresh_Interval 10
storage.type filesystem
Read_from_Head Off
[INPUT]
Name tail
Tag dataplane.systemd.*
Path /var/log/journal
Parser systemd
DB /var/fluent-bit/state/flb_journal.db
Mem_Buf_Limit 50MB
Skip_Long_Lines On
Refresh_Interval 10
storage.type filesystem
Read_from_Head Off
[FILTER]
Name kubernetes
Match application.*
Kube_URL https://kubernetes.default.svc:443
Kube_Tag_Prefix application.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
K8S-Logging.Parser On
K8S-Logging.Exclude Off
Labels Off
Annotations Off
[OUTPUT]
Name cloudwatch_logs
Match application.*
region $REGION_BEIJING
log_group_name /aws/eks/$EKS_CLUSTER_NAME/application
log_stream_prefix eks-
auto_create_group On
extra_user_agent container-insights
[OUTPUT]
Name cloudwatch_logs
Match dataplane.systemd.*
region $REGION_BEIJING
log_group_name /aws/eks/$EKS_CLUSTER_NAME/dataplane
log_stream_prefix eks-
auto_create_group On
extra_user_agent container-insights
parsers.conf: |
[PARSER]
Name cri
Format regex
Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L%z
[PARSER]
Name systemd
Format regex
Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
Time_Key time
Time_Format %b %d %H:%M:%S
EOF
kubectl apply -f fluent-bit-config.yaml
6.2 Deploy Fluent Bit DaemonSet:
cat > fluent-bit-daemonset.yaml << EOF
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluent-bit
namespace: amazon-cloudwatch
labels:
k8s-app: fluent-bit
spec:
selector:
matchLabels:
k8s-app: fluent-bit
template:
metadata:
labels:
k8s-app: fluent-bit
spec:
serviceAccountName: fluent-bit
containers:
- name: fluent-bit
image: public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
imagePullPolicy: Always
env:
- name: AWS_REGION
value: $REGION_BEIJING
- name: CLUSTER_NAME
value: $EKS_CLUSTER_NAME
- name: HTTP_SERVER
value: "On"
- name: HTTP_PORT
value: "2020"
- name: READ_FROM_HEAD
value: "Off"
- name: READ_FROM_TAIL
value: "On"
- name: HOST_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: HOSTNAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: CI_VERSION
value: "k8s/1.3.9"
resources:
limits:
memory: 200Mi
requests:
cpu: 500m
memory: 100Mi
volumeMounts:
- name: fluentbitstate
mountPath: /var/fluent-bit/state
- name: varlog
mountPath: /var/log
readOnly: true
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: fluent-bit-config
mountPath: /fluent-bit/etc/
- name: runlogjournal
mountPath: /run/log/journal
readOnly: true
- name: dmesg
mountPath: /var/log/dmesg
readOnly: true
terminationGracePeriodSeconds: 10
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
volumes:
- name: fluentbitstate
hostPath:
path: /var/fluent-bit/state
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: fluent-bit-config
configMap:
name: fluent-bit-config
- name: runlogjournal
hostPath:
path: /run/log/journal
- name: dmesg
hostPath:
path: /var/log/dmesg
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluent-bit
namespace: amazon-cloudwatch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluent-bit-role
rules:
- nonResourceURLs:
- /metrics
verbs:
- get
- apiGroups: [""]
resources:
- namespaces
- pods
- pods/logs
- nodes
- nodes/proxy
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: fluent-bit-role-binding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: fluent-bit-role
subjects:
- kind: ServiceAccount
name: fluent-bit
namespace: amazon-cloudwatch
EOF
kubectl apply -f fluent-bit-daemonset.yaml
# Verify Fluent Bit deployment
kubectl get pods -n amazon-cloudwatch -l k8s-app=fluent-bit
Step 7: Create Additional Log Groups and Subscription Filters for Pod Logs
# Create log groups for application and dataplane logs
aws logs create-log-group \
--log-group-name "/aws/eks/$EKS_CLUSTER_NAME/application" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
aws logs create-log-group \
--log-group-name "/aws/eks/$EKS_CLUSTER_NAME/dataplane" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
# Set retention policies
aws logs put-retention-policy \
--log-group-name "/aws/eks/$EKS_CLUSTER_NAME/application" \
--retention-in-days 7 \
--region $REGION_BEIJING
aws logs put-retention-policy \
--log-group-name "/aws/eks/$EKS_CLUSTER_NAME/dataplane" \
--retention-in-days 7 \
--region $REGION_BEIJING
# Create subscription filters for pod logs
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/application" "EKSApplicationLogsToFirehose"
create_subscription_filter "/aws/eks/$EKS_CLUSTER_NAME/dataplane" "EKSDataplaneLogsToFirehose"Database Logs Backup (RDS) Implementation
Step 8: Enable and Configure RDS Log Publishing
# List your RDS instances to get the correct identifier
aws rds describe-db-instances \
--region $REGION_BEIJING \
--endpoint-url https://rds.$REGION_BEIJING.amazonaws.com.cn \
--query 'DBInstances[*].[DBInstanceIdentifier,Engine,DBInstanceStatus]' \
--output table
# Set your RDS instance identifier
export RDS_INSTANCE_ID="your-rds-instance-id"
# For MySQL/MariaDB - Enable error, general, and slow query logs
aws rds modify-db-instance \
--db-instance-identifier $RDS_INSTANCE_ID \
--cloudwatch-logs-export-configuration LogTypesToEnable=error,general,slowquery \
--apply-immediately \
--region $REGION_BEIJING \
--endpoint-url https://rds.$REGION_BEIJING.amazonaws.com.cn
# For PostgreSQL - Enable postgresql logs
# aws rds modify-db-instance \
# --db-instance-identifier $RDS_INSTANCE_ID \
# --cloudwatch-logs-export-configuration LogTypesToEnable=postgresql \
# --apply-immediately \
# --region $REGION_BEIJING
# Wait for modification to complete and verify
aws rds describe-db-instances \
--db-instance-identifier $RDS_INSTANCE_ID \
--region $REGION_BEIJING \
--endpoint-url https://rds.$REGION_BEIJING.amazonaws.com.cn \
--query 'DBInstances[0].EnabledCloudwatchLogsExports'
Step 9: Create S3 Bucket and Firehose for RDS Logs
# Create S3 bucket for RDS logs
aws s3api create-bucket \
--bucket $S3_BUCKET_RDS \
--region $REGION_NINGXIA \
--create-bucket-configuration LocationConstraint=$REGION_NINGXIA \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
# Create Firehose delivery stream for RDS logs
aws firehose create-delivery-stream \
--delivery-stream-name rds-logs-to-ningxia \
--delivery-stream-type DirectPut \
--s3-destination-configuration '{
"RoleARN": "arn:aws-cn:iam::'$AWS_ACCOUNT_ID':role/EKSFirehoseDeliveryRole",
"BucketARN": "arn:aws-cn:s3:::'$S3_BUCKET_RDS'",
"Prefix": "rds-logs/instance='$RDS_INSTANCE_ID'/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/",
"ErrorOutputPrefix": "rds-errors/",
"BufferingHints": {
"SizeInMBs": 5,
"IntervalInSeconds": 300
},
"CompressionFormat": "GZIP"
}' \
--region $REGION_BEIJING \
--endpoint-url https://firehose.$REGION_BEIJING.amazonaws.com.cn
Step 10: Create Subscription Filters for RDS Logs
# Wait for RDS log groups to be created (this might take a few minutes)
echo "Waiting for RDS log groups to be created..."
sleep 300
# List available RDS log groups to confirm they exist
aws logs describe-log-groups \
--log-group-name-prefix "/aws/rds/instance/$RDS_INSTANCE_ID" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
# Create subscription filters for each RDS log type
declare -a RDS_LOG_TYPES=("error" "general" "slowquery")
for LOG_TYPE in "${RDS_LOG_TYPES[@]}"
do
LOG_GROUP_NAME="/aws/rds/instance/$RDS_INSTANCE_ID/$LOG_TYPE"
FILTER_NAME="RDS${LOG_TYPE^}LogsToFirehose"
echo "Creating subscription filter for $LOG_GROUP_NAME"
# Set retention policy first
aws logs put-retention-policy \
--log-group-name "$LOG_GROUP_NAME" \
--retention-in-days 7 \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
# Create subscription filter
aws logs put-subscription-filter \
--log-group-name "$LOG_GROUP_NAME" \
--filter-name "$FILTER_NAME" \
--filter-pattern "" \
--destination-arn "arn:aws-cn:firehose:$REGION_BEIJING:$AWS_ACCOUNT_ID:deliverystream/rds-logs-to-ningxia" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn
done
Monitoring and Verification
Step 11: Create Comprehensive Monitoring
# Create CloudWatch alarms for EKS log delivery
aws cloudwatch put-metric-alarm \
--alarm-name "EKS-Firehose-Delivery-Failures" \
--alarm-description "Monitor EKS log delivery failures to Ningxia" \
--metric-name "DeliveryToS3.Records" \
--namespace "AWS/Kinesis/Firehose" \
--statistic Sum \
--period 300 \
--threshold 1 \
--comparison-operator LessThanThreshold \
--treat-missing-data breaching \
--dimensions Name=DeliveryStreamName,Value=eks-logs-to-ningxia \
--alarm-actions arn:aws-cn:sns:$REGION_BEIJING:$AWS_ACCOUNT_ID:your-sns-topic \
--region $REGION_BEIJING \
--endpoint-url https://monitoring.$REGION_BEIJING.amazonaws.com.cn
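The --alarm-actions flag above points at a placeholder topic. If you don't have one yet, a minimal sketch of creating an SNS topic and subscribing an endpoint (the topic name log-backup-alerts and the email address are illustrative; if email subscriptions aren't available in your region, use an https or sqs endpoint instead):
aws sns create-topic \
  --name log-backup-alerts \
  --region $REGION_BEIJING \
  --endpoint-url https://sns.$REGION_BEIJING.amazonaws.com.cn
aws sns subscribe \
  --topic-arn arn:aws-cn:sns:$REGION_BEIJING:$AWS_ACCOUNT_ID:log-backup-alerts \
  --protocol email \
  --notification-endpoint ops@example.com \
  --region $REGION_BEIJING \
  --endpoint-url https://sns.$REGION_BEIJING.amazonaws.com.cn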
# Create alarm for RDS log delivery
aws cloudwatch put-metric-alarm \
--alarm-name "RDS-Firehose-Delivery-Failures" \
--alarm-description "Monitor RDS log delivery failures to Ningxia" \
--metric-name "DeliveryToS3.Records" \
--namespace "AWS/Kinesis/Firehose" \
--statistic Sum \
--period 300 \
--threshold 1 \
--comparison-operator LessThanThreshold \
--treat-missing-data breaching \
--dimensions Name=DeliveryStreamName,Value=rds-logs-to-ningxia \
--region $REGION_BEIJING
Step 12: Verification Steps
# 1. Check CloudWatch Agent is running on all nodes
kubectl get pods -n amazon-cloudwatch -l name=cloudwatch-agent -o wide
# 2. Check Fluent Bit is running on all nodes
kubectl get pods -n amazon-cloudwatch -l k8s-app=fluent-bit -o wide
# 3. Verify log groups have data
echo "Checking EKS log groups for data..."
for LOG_GROUP in "${LOG_GROUPS[@]}"
do
echo "Checking: $LOG_GROUP"
aws logs describe-log-streams \
--log-group-name "$LOG_GROUP" \
--region $REGION_BEIJING \
--endpoint-url https://logs.$REGION_BEIJING.amazonaws.com.cn \
--query 'logStreams[0].lastEventTime'
done
# 4. Check Firehose stream status
aws firehose describe-delivery-stream \
--delivery-stream-name eks-logs-to-ningxia \
--region $REGION_BEIJING \
--endpoint-url https://firehose.$REGION_BEIJING.amazonaws.com.cn \
--query 'DeliveryStreamDescription.DeliveryStreamStatus'
# 5. Check S3 buckets for log files (wait 5-10 minutes after setup)
echo "Checking S3 buckets for log files..."
aws s3 ls s3://$S3_BUCKET_EKS/eks-logs/ --recursive \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
aws s3 ls s3://$S3_BUCKET_RDS/rds-logs/ --recursive \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
# 6. Test log generation
echo "Generating test log entries..."
kubectl run test-pod --image=busybox --rm -it --restart=Never -- sh -c 'echo "Test log entry from EKS pod" && sleep 10'
Cost Optimization and Lifecycle Management
Step 13: Configure S3 Lifecycle Policies
# Create lifecycle policy for EKS logs
cat > eks-s3-lifecycle.json << EOF
{
"Rules": [
{
"ID": "EKSLogArchivingRule",
"Status": "Enabled",
"Filter": {
"Prefix": "eks-logs/"
},
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER"
},
{
"Days": 365,
"StorageClass": "DEEP_ARCHIVE"
}
],
"Expiration": {
"Days": 2555
}
}
]
}
EOF
# Apply lifecycle policy
aws s3api put-bucket-lifecycle-configuration \
--bucket $S3_BUCKET_EKS \
--lifecycle-configuration file://eks-s3-lifecycle.json \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
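To confirm the rules took effect, you can read the configuration back:
aws s3api get-bucket-lifecycle-configuration \
  --bucket $S3_BUCKET_EKS \
  --endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn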
# Create similar policy for RDS logs
cat > rds-s3-lifecycle.json << EOF
{
"Rules": [
{
"ID": "RDSLogArchivingRule",
"Status": "Enabled",
"Filter": {
"Prefix": "rds-logs/"
},
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER"
}
],
"Expiration": {
"Days": 1095
}
}
]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
--bucket $S3_BUCKET_RDS \
--lifecycle-configuration file://rds-s3-lifecycle.json \
--endpoint-url https://s3.$REGION_NINGXIA.amazonaws.com.cn
Troubleshooting Common Issues
Issue 1: CloudWatch Agent Not Starting
# Check agent status on nodes
kubectl logs -n amazon-cloudwatch -l name=cloudwatch-agent --tail=100
# Check IAM permissions
aws iam list-attached-role-policies --role-name $EKS_NODE_ROLE
# Restart agent
kubectl delete pods -n amazon-cloudwatch -l name=cloudwatch-agent
Issue 2: No Logs in CloudWatch
# Check if log files exist on nodes
kubectl exec -n amazon-cloudwatch $(kubectl get pods -n amazon-cloudwatch -l name=cloudwatch-agent -o jsonpath='{.items[0].metadata.name}') -- ls -la /var/log/
# Verify agent configuration
kubectl describe configmap cwagentconfig -n amazon-cloudwatch
Issue 3: Firehose Delivery Failures
# Check Firehose metrics
aws cloudwatch get-metric-statistics \
--namespace AWS/Firehose \
--metric-name DeliveryToS3.Success \
--dimensions Name=DeliveryStreamName,Value=eks-logs-to-ningxia \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
--period 300 \
--statistics Sum \
--region $REGION_BEIJING
# Check error logs
aws logs filter-log-events \
--log-group-name /aws/kinesisfirehose/eks-logs-to-ningxia \
--start-time $(date -d '1 hour ago' +%s)000 \
--region $REGION_BEIJING
This guide provides all the specific steps needed to implement EKS log backup from the Beijing region to the Ningxia region in AWS China, covering node-level logs, container logs, and database logs with proper monitoring and cost optimization.