CloudWatch

Introduction

  • Monitors operational and performance metrics for your AWS cloud resources and applications.

Features

  • Dashboard

    • Display metrics and alarms for AWS resources (Not for configuration of AWS Organizations)

    • Can show metrics from multiple regions

  • Metrics:

    • Provided by many AWS services

      • Can create custom metrics: standard resolution: 1 minute, high resolution: 1 second

    • EC2

      • Basic monitoring: 5 minutes, Detailed monitoring: 1 minute

      • RAM is not a built-in metric (Use CloudWatch Unified Agent)

    • RI utilization is not tracked by CloudWatch.

      • Create a new reservation budget in AWS Budgets service. Set the reservation budget type to be RI Utilization and configure the utilization threshold.
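
The custom metric resolutions listed under Metrics map to the StorageResolution field of the PutMetricData API (1 for high resolution, 60 for standard). The following is a minimal sketch that only builds the request parameters; the namespace "MyApp/Custom" and the metric name "MemoryUsedPercent" are hypothetical, and the boto3 call is left as a comment because it needs credentials.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="None", high_resolution=False):
    """Build one MetricData entry for the PutMetricData API."""
    return {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": datetime.now(timezone.utc),
        # 1 = high resolution (1 second), 60 = standard resolution (1 minute)
        "StorageResolution": 1 if high_resolution else 60,
    }


# Hypothetical namespace and metric name, for illustration only.
params = {
    "Namespace": "MyApp/Custom",
    "MetricData": [
        build_metric_datum("MemoryUsedPercent", 71.5, "Percent", high_resolution=True)
    ],
}

# With boto3 installed and credentials configured, this could be sent with:
# boto3.client("cloudwatch").put_metric_data(**params)
```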

  • Alarms:

    • Can trigger actions: EC2 actions (reboot, stop, terminate, recover), Auto Scaling, SNS

    • Alarm events can be intercepted by CloudWatch Events
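
As an illustration of an alarm with an EC2 action, the sketch below builds parameters for the PutMetricAlarm API that would recover an instance when its system status check fails. The alarm name, period, and evaluation settings are assumptions chosen for the example, not recommended values.

```python
def build_recover_alarm(instance_id, region, sns_topic_arn=None):
    """Build PutMetricAlarm parameters for an EC2 recover action."""
    # EC2 alarm actions are referenced by special "automate" ARNs.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    if sns_topic_arn:
        # Optionally notify an SNS topic as well.
        actions.append(sns_topic_arn)
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds; assumed for illustration
        "EvaluationPeriods": 2,  # assumed for illustration
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": actions,
    }


alarm = build_recover_alarm("i-0123456789abcdef0", "us-east-1")

# With boto3 installed and credentials configured:
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```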

  • Events

    • Can intercept events from AWS services (e.g., EC2 instance start, CodeBuild failure, CloudWatch Alarms, Trusted Advisor, CloudTrail API calls, etc.)

    • Can trigger:

      • Compute: Lambda, Batch, ECS task

      • Orchestration: Step Functions, CodePipeline, CodeBuild

      • Integration: SQS, SNS, Kinesis Data Streams, Kinesis Firehose

      • Maintenance: SSM, EC2 actions

    • Schedule (cron) expression for rules

      • Six fields: cron(minutes hours day-of-month month day-of-week year)
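
A small sketch of assembling the six-field schedule expression. The helper name cron_expression is hypothetical. The check that exactly one of day-of-month and day-of-week is "?" is a conservative reading of the rule that the two fields cannot both be specified in one expression.

```python
def cron_expression(minutes="0", hours="*", day_of_month="*",
                    month="*", day_of_week="?", year="*"):
    """Assemble a six-field cron(...) schedule expression."""
    # Day-of-month and day-of-week cannot both carry a value:
    # one of them must be the "?" placeholder.
    if (day_of_month == "?") == (day_of_week == "?"):
        raise ValueError("exactly one of day_of_month and day_of_week must be '?'")
    fields = [minutes, hours, day_of_month, month, day_of_week, year]
    return "cron(" + " ".join(str(f) for f in fields) + ")"


# Every day at 12:00 UTC
daily_noon = cron_expression(minutes="0", hours="12")

# 18:00 UTC, Monday through Friday
weekdays = cron_expression(minutes="0", hours="18",
                           day_of_month="?", day_of_week="MON-FRI")
```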

  • Logs

    • Sources:

      • SDK, CloudWatch Logs Agent, CloudWatch Unified Agent

      • Elastic Beanstalk: collection of logs from application

      • ECS: collection from containers

      • AWS Lambda: collection from function logs

      • VPC Flow Logs: VPC specific logs

      • API Gateway

      • CloudTrail based on filter

      • CloudWatch Logs Agent (e.g., on EC2 instances)

      • Route53: log DNS queries

    • Destination:

      • S3 (export)

        • Historically the bucket had to use SSE-S3; AWS documentation now lists SSE-KMS buckets as supported as well, but not DSSE-KMS.

        • Log data can take up to 12 hours to become available for export

          • Use CloudWatch Logs Insights or Logs Subscriptions for real time processing

      • Kinesis Data Stream / Firehose

      • Lambda

      • Amazon OpenSearch Service (formerly Elasticsearch)
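
For the S3 export destination, the sketch below builds parameters for the CloudWatch Logs CreateExportTask API, which takes its time range in epoch milliseconds. The log group "/my-app/prod", the bucket name, the prefix, and the task naming scheme are all hypothetical.

```python
from datetime import datetime, timezone


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def build_export_task(log_group, bucket, start, end, prefix="app-logs"):
    """Build CreateExportTask parameters for exporting a log group to S3."""
    if end <= start:
        raise ValueError("end must be after start")
    return {
        "taskName": "export-" + log_group.strip("/").replace("/", "-"),
        "logGroupName": log_group,
        "fromTime": to_epoch_ms(start),
        "to": to_epoch_ms(end),
        "destination": bucket,         # S3 bucket name
        "destinationPrefix": prefix,   # key prefix inside the bucket
    }


task = build_export_task(
    "/my-app/prod",
    "my-log-archive-bucket",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)

# With boto3 installed and credentials configured:
# boto3.client("logs").create_export_task(**task)
```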

    • Log groups: arbitrary name, usually representing an application

    • Log streams: instances within an application / log files / containers

    • Can define log retention (expiration) policies (never expire / 30 days, etc.)

    • Optional KMS encryption

    • CloudWatch Logs Insights can be used to query logs and add queries to CloudWatch Dashboards.

    • Logs Subscription

      • Uses a subscription filter to select which log events are delivered to the destination

      • Send logs out with:

        • Lambda (real time)

        • Kinesis Data Streams (real time)

        • Kinesis Firehose (near real time)
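
When a subscription delivers to Lambda, the function receives the log events as a base64-encoded, gzip-compressed JSON document under event["awslogs"]["data"]. A minimal sketch of decoding it follows; the sample payload is constructed locally for illustration, and its log group, stream, and filter names are made up.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Decode the payload CloudWatch Logs delivers to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda-style entry point: return the raw log messages."""
    payload = decode_subscription_event(event)
    return [log_event["message"] for log_event in payload["logEvents"]]


# Locally built sample event (all names are hypothetical).
sample_payload = {
    "messageType": "DATA_MESSAGE",
    "logGroup": "/my-app/prod",
    "logStream": "instance-1",
    "subscriptionFilters": ["errors-only"],
    "logEvents": [
        {"id": "1", "timestamp": 1704067200000, "message": "ERROR something failed"}
    ],
}
sample_event = {
    "awslogs": {
        "data": base64.b64encode(
            gzip.compress(json.dumps(sample_payload).encode("utf-8"))
        ).decode("ascii")
    }
}

messages = handler(sample_event, None)
```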

    • Logs aggregation with multi-account & multi-region

      • In each account, set up a subscription filter that sends logs to a central Kinesis Data Stream, then use Kinesis Firehose to write the logs to S3.

    • Logs Agent & Unified Agent

      • For virtual servers (EC2 instances, on-premises servers, etc.)

      • CloudWatch Logs Agent

        • Old version of the agent

        • Can only send to CloudWatch Logs

      • CloudWatch Unified Agent

        • Collect additional system-level metrics such as RAM, processes, etc.

        • Collect logs to send to CloudWatch Logs

        • Centralized configuration using SSM Parameter Store

      • Batch sending settings (CloudWatch Logs Agent):

        • batch_count: maximum number of log events in a batch (default / max: 10,000, minimum: 1)

        • buffer_duration: time span over which log events are batched (default / minimum: 5000 ms)

        • batch_size: maximum size of log events in a batch (default / max: 1 MB)

      • Neither agent can send logs directly to Kinesis
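
To make the batch count and size limits above concrete, here is an illustrative sketch of splitting messages into batches. It is not the agent's actual implementation and it ignores the time-based setting. The 26-byte per-event overhead is an assumption borrowed from how the PutLogEvents API counts batch size.

```python
def batch_events(messages, batch_count=10_000, batch_size=1_048_576, overhead=26):
    """Split messages into batches that respect a count and a byte limit."""
    batches = []
    current, current_bytes = [], 0
    for msg in messages:
        # Size of one event: UTF-8 bytes plus an assumed fixed overhead.
        size = len(msg.encode("utf-8")) + overhead
        full = len(current) >= batch_count or current_bytes + size > batch_size
        if current and full:
            # Flush the current batch before adding this event.
            batches.append(current)
            current, current_bytes = [], 0
        current.append(msg)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# Five short events with at most two per batch.
by_count = batch_events(["a"] * 5, batch_count=2)

# Three 10-byte events (36 bytes each with overhead) under an 80-byte limit.
by_size = batch_events(["x" * 10] * 3, batch_size=80)
```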
