DynamoDB

Introduction

NoSQL database, fully managed, massive scale (1,000,000 rps), single-digit millisecond latency.
Similar to Apache Cassandra (can migrate to DynamoDB). DynamoDB is made of tables.
Stored on SSD and replicated across 3 geographically distinct data centers (you cannot choose the AZs).
Backups available, point in time recovery.

Features

Capacity mode:
- Provisioned (R/W Capacity Unit & auto scaling, default, free-tier eligible)
  - If capacity is running out, indexes are already well used, and you don't want to increase the cost, consider exporting / archiving old data.
  - Can purchase reserved capacity in advance to lower the costs
- On-demand
Read type:
- Eventually consistent reads (default)
- Strongly consistent reads
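A rough sketch of how provisioned capacity relates to the read type. It assumes the standard DynamoDB unit sizes (one read unit covers up to 4 KB, one write unit covers up to 1 KB, and an eventually consistent read costs half a strongly consistent one). The function names are illustrative, not part of any AWS SDK.

```python
import math

READ_BLOCK_BYTES = 4 * 1024   # one read unit covers up to 4 KB
WRITE_BLOCK_BYTES = 1 * 1024  # one write unit covers up to 1 KB


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=False):
    """Estimate the RCUs needed to read items of a given size at a given rate."""
    # Item size is rounded up to the next 4 KB block.
    blocks = max(1, math.ceil(item_size_bytes / READ_BLOCK_BYTES))
    units = blocks * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        units = units / 2
    return math.ceil(units)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate the WCUs needed to write items of a given size at a given rate."""
    # Item size is rounded up to the next 1 KB block.
    blocks = max(1, math.ceil(item_size_bytes / WRITE_BLOCK_BYTES))
    return blocks * writes_per_second
```

For example, reading 6 KB items 10 times per second needs 2 blocks x 10 = 20 RCU with strong consistency, and half of that with eventual consistency.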
Supports ACID transactions across multiple tables.
Integrated with IAM for security.
Data types:
- Scalar types: string, number, binary, boolean, null.
- Document types: list, map
- Set types: string set, number set, binary set.
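To make the data types concrete, here is a minimal sketch that maps Python values to DynamoDB's typed JSON form (`S`, `N`, `B`, `BOOL`, `NULL`, `L`, `M`, `SS`, `NS`, `BS`). The helper `to_attribute_value` is hypothetical and written only for these notes. boto3 ships its own serializer, which is stricter (for instance, about floats).

```python
from decimal import Decimal


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float, Decimal)) and not isinstance(value, bool)


def to_attribute_value(value):
    """Convert a Python value to DynamoDB's typed JSON representation."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):
        return {"BOOL": value}
    if _is_number(value):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (bytes, bytearray)):
        return {"B": bytes(value)}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        # Sets must be non-empty and hold a single scalar type.
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(_is_number(v) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"Unsupported value for DynamoDB: {value!r}")
```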
Primary Key (must be decided at creation time, and must be unique.)
- Partition Key (Hash attribute)
- Partition Key (Hash attribute) + Sort Key (Range Attribute)
  - Data is grouped by Partition Key.
  - Timestamp is a good Sort Key candidate.
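A sketch of what the two primary key shapes look like as CreateTable parameters. The helper `table_definition` and the table / attribute names are made up for illustration. The dictionary keys (`KeySchema`, `AttributeDefinitions`, `KeyType` of `HASH` / `RANGE`, `BillingMode`) follow the CreateTable API.

```python
def table_definition(table_name, partition_key, sort_key=None):
    """Build keyword arguments for a CreateTable call.

    partition_key and sort_key are (attribute_name, attribute_type) tuples,
    where attribute_type is "S" (string), "N" (number) or "B" (binary).
    """
    key_schema = [{"AttributeName": partition_key[0], "KeyType": "HASH"}]
    attribute_definitions = [
        {"AttributeName": partition_key[0], "AttributeType": partition_key[1]}
    ]
    if sort_key is not None:
        # Composite primary key: partition key + sort key.
        key_schema.append({"AttributeName": sort_key[0], "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key[0], "AttributeType": sort_key[1]}
        )
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand; assumption for this sketch
    }


# Hypothetical table: readings grouped by sensor, ordered by timestamp.
params = table_definition("SensorReadings", ("sensor_id", "S"), ("recorded_at", "N"))
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("dynamodb").create_table(**params)
```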
Working With Indexes
- Can only query by Partition Key (+ Sort Key) on the main table / indexes (cannot query ad hoc by an arbitrary attribute).
- Local Secondary Index (to select an alternative Sort Key)
  - Contains identical partition key of base table.
  - The key of a Local Secondary Index must be a composite key. Its hash attribute must be the same as the hash attribute of the base table.
  - Item collection size (all items sharing one partition key value, in the base table and its Local Secondary Indexes) <= 10 GB.
  - Must be created together with the base table, and cannot be deleted while the table exists.
  - Supports both eventual / strong read consistency.
  - Action of read / write consumes capacity units from base table.
  - Best Practices
    Use Indexes sparingly (Avoid indexing write-heavy tables. Don't add indexes that are not used.)
    Choose Projections carefully.
    Optimize Projections to avoid fetches from the base table.
    Take advantage of sparse Indexes (Ex. Add an attribute to an item so it gets indexed, and remove the attribute when the item no longer needs to be indexed.)
    Watch for expanding item collections.
- Global Secondary Index (in case you need another key to work like the Primary Key)
  - It contains a full mapping to all items (rows) with the specified attribute in the base table.
  - The identifier of the Global Secondary Index can be simple partition key or a composite key. The key can be any attribute.
  - No size restrictions.
  - Can be created together with the base table, and can also be created / deleted at any time.
  - Supports eventually consistent reads only.
  - Action of read / write consumes capacity units from the index.
  - Best Practices
    Choose a key that will provide uniform workloads.
    Take advantage of sparse (the attribute appears infrequently among all items) Indexes.
    Use a Global Secondary Index for quick lookups. (Ex. Select sub-sets of attributes.)
    Create an Eventually Consistent Read Replica.
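The query and consistency rules above can be sketched as a small parameter builder. `build_query` is a hypothetical helper (not an SDK function), it assumes a string partition key and a numeric sort key, and the returned dictionary uses the Query API parameter names (`KeyConditionExpression`, `IndexName`, `ConsistentRead`).

```python
def build_query(table_name, partition_key, partition_value,
                sort_key_range=None, index_name=None,
                index_is_global=False, consistent_read=False):
    """Build keyword arguments for a Query call on a table or one of its indexes."""
    if index_name and index_is_global and consistent_read:
        # Global Secondary Indexes only offer eventually consistent reads.
        raise ValueError("strongly consistent reads are not supported on a GSI")

    condition = "#pk = :pk"  # the partition key must always be matched exactly
    names = {"#pk": partition_key}
    values = {":pk": {"S": partition_value}}

    if sort_key_range is not None:
        # Optional range condition on the sort key: (attribute, low, high).
        sort_key, low, high = sort_key_range
        condition += " AND #sk BETWEEN :low AND :high"
        names["#sk"] = sort_key
        values[":low"] = {"N": str(low)}
        values[":high"] = {"N": str(high)}

    params = {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ConsistentRead": consistent_read,
    }
    if index_name:
        params["IndexName"] = index_name
    return params
```

A Local Secondary Index query can pass `consistent_read=True`. The same request against a Global Secondary Index is rejected by this sketch before it would reach DynamoDB.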
Allows for the storage of large text and binary objects with a limit of 400 KB per item (row).
- If an object is over 400 KB, store it to S3, then save reference.
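A minimal sketch of the "store in S3, save a reference" pattern. Everything named here is an assumption for illustration: the `upload` callable stands in for an S3 upload, the bucket name and attribute names are invented, and the inline threshold is set below 400 KB to leave room for attribute names and other attributes, which also count toward the item size.

```python
MAX_ITEM_BYTES = 400 * 1024
# Assumed safety margin; the real limit applies to the whole item, not just the payload.
INLINE_THRESHOLD_BYTES = MAX_ITEM_BYTES - 16 * 1024


def build_item(item_id, payload, upload, bucket="example-bucket"):
    """Return a DynamoDB item, offloading large payloads to object storage.

    upload is a hypothetical callable taking (bucket, key, data).
    """
    if len(payload) <= INLINE_THRESHOLD_BYTES:
        # Small enough: keep the bytes directly in the item.
        return {"id": {"S": item_id}, "payload": {"B": payload}}

    key = f"payloads/{item_id}"
    upload(bucket, key, payload)
    # Too large: store only a pointer to the object.
    return {
        "id": {"S": item_id},
        "payload_s3_bucket": {"S": bucket},
        "payload_s3_key": {"S": key},
        "payload_size": {"N": str(len(payload))},
    }
```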
TTL: automatically deletes items (rows) after a specified epoch timestamp, without consuming WCU / RCU.
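A small sketch of setting a TTL value on an item. DynamoDB expects the TTL attribute to be a Number holding epoch seconds. The attribute name `expires_at` and both helper names are arbitrary choices for this example, and the attribute must separately be enabled as the table's TTL attribute.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def expiry_epoch(days, now=None):
    """Return the epoch-seconds timestamp `days` days from now."""
    now = time.time() if now is None else now
    return int(now) + days * SECONDS_PER_DAY


def with_ttl(item, days, ttl_attribute="expires_at", now=None):
    """Return a copy of a typed DynamoDB item with a TTL attribute added."""
    updated = dict(item)
    updated[ttl_attribute] = {"N": str(expiry_epoch(days, now))}
    return updated
```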
DynamoDB Streams:
- React to changes to DynamoDB tables in real time
- Can be read by AWS Lambda, EC2, etc., then forwarded to Elasticsearch, Kinesis, etc.
- 24 hours retention of data
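A sketch of a Lambda handler that reacts to stream records, for example to keep a search index in sync. The record fields used (`eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) follow the DynamoDB Streams event format. `NewImage` is only present when the stream view type includes new images, which is assumed here. What to do with the result (send to a search index, Kinesis, etc.) is left out.

```python
def handler(event, context=None):
    """Split stream records into items to upsert and keys to delete."""
    upserts = []
    deletes = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        if record["eventName"] in ("INSERT", "MODIFY"):
            # New or changed item: forward its latest image.
            upserts.append(change["NewImage"])
        elif record["eventName"] == "REMOVE":
            # Deleted item: only the key is needed downstream.
            deletes.append(change["Keys"])
    return {"upserts": upserts, "deletes": deletes}
```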
Global Tables (cross region replication)
- Active-Active replication to many regions
- Must enable DynamoDB Streams
- Useful for low latency, DR purposes
DAX (DynamoDB Accelerator)
- Seamless cache for DynamoDB, no application re-write.
- Writes go through DAX to DynamoDB
- Microsecond latency for cached reads
- Solves the Hot Key Problem (too many reads)
- 5 minutes TTL cache by default
- Up to 10 nodes in the cluster
- Multi-AZ (3 nodes minimum recommended for production)
- Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail, etc.)
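A conceptual model of the caching behavior described above: reads are served from the cache until the entry's TTL expires, and writes go through the cache to the table. This is plain Python with a dict standing in for the DynamoDB table. It is not the DAX client, and the class name is invented for this sketch.

```python
import time


class WriteThroughCache:
    """Toy read-through / write-through cache with a per-entry TTL."""

    def __init__(self, table, ttl_seconds=300, clock=time.monotonic):
        self._table = table        # dict standing in for the DynamoDB table
        self._ttl = ttl_seconds    # 5 minutes, matching the default noted above
        self._clock = clock
        self._entries = {}         # key -> (value, expires_at)
        self.table_reads = 0       # how often the backing table was hit

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # cache hit: the table is not touched
        # Cache miss or expired entry: read from the table and cache the result.
        self.table_reads += 1
        value = self._table.get(key)
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def put(self, key, value):
        # Write-through: update the table first, then the cache.
        self._table[key] = value
        self._entries[key] = (value, self._clock() + self._ttl)
```

Repeated reads of a hot key hit the table only once per TTL window, which is the effect the Hot Key bullet refers to.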
Scenario:
- S3 indexing with DynamoDB
  - Create a DynamoDB table and indexes for later retrieval through an API.
  - An S3 event triggers a Lambda function, which inserts the object metadata into DynamoDB.
- DAX vs ElastiCache
  - DAX: cache for individual objects and query results
  - ElastiCache: store aggregated results
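For the S3 indexing scenario, the Lambda step could look roughly like this. The event fields read here (`s3.bucket.name`, `s3.object.key`, `s3.object.size`, `eventTime`) follow the S3 event notification format. The output attribute names are invented for this sketch, and the actual write to DynamoDB is left out.

```python
from urllib.parse import unquote_plus


def s3_event_to_items(event):
    """Turn an S3 event notification into DynamoDB items describing each object."""
    items = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        obj = record["s3"]["object"]
        # Object keys arrive URL-encoded in S3 events.
        key = unquote_plus(obj["key"])
        items.append({
            "bucket": {"S": bucket},
            "object_key": {"S": key},
            "size_bytes": {"N": str(obj.get("size", 0))},
            "event_time": {"S": record["eventTime"]},
        })
    return items
```

With the items indexed this way, an API can look up objects by bucket and key (or through a secondary index, for example on the event time) without listing the bucket.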


Last updated 4 years ago
