Batch
Introduction
Run batch jobs as Docker images
Dynamic provisioning of the instances (EC2, Spot Fleet, ECS) in VPC
Batch picks the optimal quantity and type of instances based on the volume and requirements of the submitted jobs
No need to manage clusters, fully serverless; you pay only for the underlying EC2 instances
Use case:
batch processing of images
running thousands of concurrent jobs
Schedule Batch jobs using CloudWatch Events (now Amazon EventBridge)
Orchestrate Batch jobs using AWS Step Functions
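The two integration points above can be sketched as plain request payloads. This is a minimal sketch, not a definitive setup: the ARNs, role name, job names, and the cron expression are hypothetical placeholders, and the dictionaries follow the shape of the EventBridge `PutRule` / `PutTargets` calls and of an Amazon States Language task as I understand them.

```python
import json

# Hypothetical identifiers, used for illustration only.
JOB_QUEUE_ARN = "arn:aws:batch:us-east-1:123456789012:job-queue/example-queue"
JOB_DEFINITION = "example-job-definition"
EVENTS_ROLE_ARN = "arn:aws:iam::123456789012:role/example-events-to-batch"


def build_schedule_rule(name: str, expression: str) -> dict:
    """Arguments for the EventBridge PutRule call (a scheduled rule)."""
    return {"Name": name, "ScheduleExpression": expression, "State": "ENABLED"}


def build_batch_target(rule_name: str) -> dict:
    """Arguments for PutTargets: the target ARN is the Batch job queue."""
    return {
        "Rule": rule_name,
        "Targets": [
            {
                "Id": "nightly-batch-job",
                "Arn": JOB_QUEUE_ARN,
                "RoleArn": EVENTS_ROLE_ARN,
                "BatchParameters": {
                    "JobDefinition": JOB_DEFINITION,
                    "JobName": "nightly-image-processing",
                },
            }
        ],
    }


def build_state_machine() -> dict:
    """Step Functions definition with a single Batch task state."""
    return {
        "StartAt": "ProcessImages",
        "States": {
            "ProcessImages": {
                "Type": "Task",
                # The ".sync" suffix makes the state wait for the job to finish.
                "Resource": "arn:aws:states:::batch:submitJob.sync",
                "Parameters": {
                    "JobName": "process-images",
                    "JobDefinition": JOB_DEFINITION,
                    "JobQueue": JOB_QUEUE_ARN,
                },
                "End": True,
            }
        },
    }


rule = build_schedule_rule("nightly-batch", "cron(0 2 * * ? *)")
targets = build_batch_target(rule["Name"])
definition = json.dumps(build_state_machine(), indent=2)

# With boto3 installed and credentials configured, these could be sent as:
#   boto3.client("events").put_rule(**rule)
#   boto3.client("events").put_targets(**targets)
```

Building the payloads separately from the API calls keeps the sketch runnable without AWS credentials and makes the structure easy to inspect.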
Features
Compute Environment
Managed compute environment
AWS Batch manages the capacity and instance types within the environment. (You don't need to configure an Auto Scaling group yourself.)
Can choose On-Demand / Spot Instances / Spot Fleet
Can set a maximum price for Spot Instances
Launched within your own VPC
If you launch into a private subnet, make sure the instances can reach the ECS service.
Use either a NAT Gateway or VPC Endpoints for ECS.
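A managed environment with a Spot price cap might be described as below. This is a sketch under assumptions: the helper name, the default values, and every resource ID are hypothetical, and the field names follow the Batch `CreateComputeEnvironment` request as I understand it. The maximum Spot price is expressed through `bidPercentage`, a percentage of the On-Demand price.

```python
def build_managed_spot_environment(
    name: str,
    subnets: list,
    security_group_ids: list,
    instance_role: str,
    max_vcpus: int = 64,
    bid_percentage: int = 60,
) -> dict:
    """Build a CreateComputeEnvironment payload for a managed Spot environment."""
    if not 0 < bid_percentage <= 100:
        raise ValueError("bid_percentage must be between 1 and 100")
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",  # Batch handles capacity and instance types
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",
            "minvCpus": 0,  # scale to zero when the queue is empty
            "maxvCpus": max_vcpus,
            "instanceTypes": ["optimal"],  # let Batch choose instance families
            "bidPercentage": bid_percentage,  # cap relative to On-Demand price
            "subnets": subnets,  # subnets in your own VPC
            "securityGroupIds": security_group_ids,
            "instanceRole": instance_role,
        },
    }


# Hypothetical IDs; private subnets need a NAT Gateway or ECS VPC endpoints.
environment = build_managed_spot_environment(
    name="example-spot-env",
    subnets=["subnet-0example1", "subnet-0example2"],
    security_group_ids=["sg-0example"],
    instance_role="example-ecs-instance-profile",
)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("batch").create_compute_environment(**environment)
```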
Unmanaged compute environment
You control and manage instance configuration, provisioning and scaling.
Multi Node Mode (for large scale, high performance computing)
1 main node, many child nodes.
Leverage multiple EC2 / ECS instances at the same time
Doesn't work with Spot Instances
Good for tightly coupled workloads
Represents a single job, and specifies how many nodes to create for the job
Works better if the EC2 instances launch in a "cluster" placement group
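The "single job, many nodes" idea can be sketched as a multi-node job definition payload. This is an illustration under assumptions: the helper name, image URI, and resource sizes are hypothetical, and the field names follow the Batch `RegisterJobDefinition` request as I understand it.

```python
def build_multinode_job_definition(
    name: str, image: str, num_nodes: int, vcpus: int = 4, memory_mib: int = 8192
) -> dict:
    """Build a RegisterJobDefinition payload for a multi-node parallel job."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return {
        "jobDefinitionName": name,
        "type": "multinode",
        "nodeProperties": {
            "numNodes": num_nodes,  # how many nodes the single job creates
            "mainNode": 0,  # index of the main node; the others are children
            "nodeRangeProperties": [
                {
                    "targetNodes": "0:",  # apply to every node, index 0 onward
                    "container": {
                        "image": image,
                        "resourceRequirements": [
                            {"type": "VCPU", "value": str(vcpus)},
                            {"type": "MEMORY", "value": str(memory_mib)},
                        ],
                    },
                }
            ],
        },
    }


# Hypothetical image URI, for illustration only.
job_definition = build_multinode_job_definition(
    name="example-hpc-job",
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-hpc:latest",
    num_nodes=8,
)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("batch").register_job_definition(**job_definition)
```

The compute environment backing such a job would use On-Demand instances, since multi-node mode does not work with Spot Instances.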
Lambda vs Batch
Lambda
Time limit (15 minutes maximum per invocation)
Limited runtimes
Limited temporary disk space
Serverless
Batch
No time limit
Any runtime as long as it's packaged as a Docker Image
Rely on EBS / instance store for disk space
Relies on EC2 (the instances can be managed by AWS)
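The comparison above can be read as a simple rule of thumb. The helper below is a hypothetical sketch, not an official decision procedure: the function name is invented, and the two constants reflect Lambda's limits as I understand them (15 minutes of runtime, up to 10,240 MB of configurable ephemeral storage), so check the current quotas before relying on them.

```python
# Assumed Lambda limits; verify against the current service quotas.
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # 15 minutes
LAMBDA_MAX_EPHEMERAL_STORAGE_MB = 10240


def suggest_service(expected_runtime_seconds: int, scratch_space_mb: int) -> str:
    """Suggest "Lambda" or "Batch" from runtime and temporary disk needs."""
    if expected_runtime_seconds > LAMBDA_MAX_TIMEOUT_SECONDS:
        return "Batch"  # the job would exceed Lambda's time limit
    if scratch_space_mb > LAMBDA_MAX_EPHEMERAL_STORAGE_MB:
        return "Batch"  # the job needs EBS / instance store sized disk
    return "Lambda"


short_job = suggest_service(expected_runtime_seconds=60, scratch_space_mb=100)
long_job = suggest_service(expected_runtime_seconds=3600, scratch_space_mb=100)
```

Runtime support is left out of the helper on purpose: the note above only says Batch accepts any runtime packaged as a Docker image, which is a qualitative difference rather than a threshold.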
Scenario
Batch architecture example