DynamoDB

Introduction

  • NoSQL database, fully managed, massive scale (1,000,000 rps), single-digit millisecond latency.

  • Similar to Apache Cassandra (can migrate to DynamoDB). DynamoDB is made of tables.

  • Stored on SSD and replicated across 3 geographically distinct data centers (you cannot choose the AZ).

  • Backups available, point in time recovery.

Feature

  • Capacity mode:

    • Provisioned (R/W Capacity Unit & auto scaling, default, free-tier eligible)

      • If capacity is running out, the indexes are already well used, and you don't want to increase the cost, consider exporting / archiving old data.

      • Can purchase reserved capacity in advance to lower the costs

    • On-demand

  • Read type:

    • Eventually consistent reads (default)

    • Strongly consistent reads
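The capacity mode and read type together determine how much provisioned capacity a workload needs. The sketch below is a rough calculator and assumes the standard unit definitions, which these notes do not spell out: one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. The function names are illustrative only.

```python
import math


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=False):
    """Estimate RCUs: item size is rounded up to the next 4 KB block per read."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    total = units_per_read * reads_per_second
    # An eventually consistent read costs half of a strongly consistent one.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate WCUs: item size is rounded up to the next 1 KB block per write."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return units_per_write * writes_per_second


if __name__ == "__main__":
    print(read_capacity_units(6 * 1024, 10, strongly_consistent=True))
    print(read_capacity_units(6 * 1024, 10))
    print(write_capacity_units(2560, 5))
```

Under these assumptions, switching a read path from strongly to eventually consistent halves its RCU requirement.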

  • Supports ACID transactions across multiple tables.

  • Integrated with IAM for security.

  • Data types:

    • Scalar types: string, number, binary, boolean, null.

    • Document types: list, map

    • Set types: string set, number set, binary set.

  • Primary Key (must be decided at creation time, and must be unique.)

    • Partition Key (Hash attribute)

    • Partition Key (Hash attribute) + Sort Key (Range Attribute)

      • Data is grouped by Partition Key.

      • A timestamp is a good Sort Key candidate.
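As a minimal sketch of the two primary key shapes above, the helper below builds the keyword arguments for a CreateTable request. The table and attribute names (`DeviceEvents`, `device_id`, `event_ts`) are hypothetical, and the choice of a string partition key with a numeric (epoch timestamp) sort key is an assumption made for illustration.

```python
def build_table_params(table_name, partition_key, sort_key=None):
    """Build CreateTable keyword arguments for a simple or composite primary key."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # The sort key is the RANGE attribute; here it holds an epoch timestamp.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "N"})
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
    }


params = build_table_params("DeviceEvents", "device_id", "event_ts")
# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("dynamodb").create_table(**params)
```

Only the key attributes are declared up front; all other attributes are schemaless and can vary per item.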

  • Working With Indexes

    • A query can only use the Partition Key (+ Sort Key) of the main table or of an index (you cannot query by an arbitrary attribute that is not part of a key).

    • Local Secondary Index (to select an alternative Sort Key)

      • Uses the same partition key as the base table.

      • The key of a Local Secondary Index must be a composite key. Its hash attribute must be the same as the hash attribute of the base table.

      • Item collection size (all items sharing one partition key value, in the table and its local indexes) <= 10 GB.

      • Must be created at the same time as the base table, and cannot be deleted while the table exists.

      • Supports both eventually consistent and strongly consistent reads.

      • Reads / writes consume capacity units from the base table.

      • Best Practices

        • Use indexes sparingly (avoid indexing write-heavy tables, and don't add indexes that are not used).

        • Choose Projections carefully.

        • Optimize Projection to avoid fetches.

        • Take advantage of sparse Indexes (Ex. Create an attribute of item for indexing, remove the attribute when the item is no longer needed.)

        • Watch for expanding item collections.

    • Global Secondary Index (in case you need another key to work like the Primary Key)

      • It contains a full mapping to all items (rows) with the specified attribute in the base table.

      • The key of a Global Secondary Index can be a simple partition key or a composite key. The key can be any attribute.

      • No size restrictions.

      • Can be created at the same time as the base table, and can also be created / deleted at any time afterwards.

      • Supports eventually consistent reads only.

      • Reads / writes consume capacity units from the index.

      • Best Practices

        • Choose a key that will provide uniform workloads.

        • Take advantage of sparse (the attribute appears infrequently among all items) Indexes.

        • Use a Global Secondary Index for quick lookups. (Ex. Select sub-sets of attributes.)

        • Create an Eventually Consistent Read Replica.
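To make the difference in read consistency between the two index types concrete, here is a small sketch that builds Query parameters and refuses a strongly consistent read against a Global Secondary Index. The function, table, and index names are hypothetical, and the caller is assumed to know whether the index is global.

```python
def build_query(table_name, key_name, key_value, index_name=None,
                index_is_global=False, consistent_read=False):
    """Build Query keyword arguments for a table or one of its indexes."""
    if index_is_global and consistent_read:
        # A GSI is updated asynchronously, so only eventual consistency is offered.
        raise ValueError("Global Secondary Indexes support eventually consistent reads only")
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
        "ConsistentRead": consistent_read,
    }
    if index_name is not None:
        params["IndexName"] = index_name
    return params


gsi_query = build_query("Orders", "customer_id", "c-1",
                        index_name="by_customer", index_is_global=True)
# With boto3 available: boto3.client("dynamodb").query(**gsi_query)
```

The same helper with `index_is_global=False` and `consistent_read=True` would describe a strongly consistent read against the base table or a Local Secondary Index.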

  • Allows for the storage of large text and binary objects with a limit of 400 KB per item (row).

    • If an object is over 400 KB, store it in S3 and save a reference to it in the DynamoDB item.
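The S3 pointer pattern for oversized objects can be sketched as follows. The bucket name, key prefix, and attribute names are hypothetical, and the inline threshold is deliberately set below 400 KB on the assumption that attribute names and other attributes also count toward the item size.

```python
INLINE_LIMIT_BYTES = 350 * 1024  # conservative headroom under the 400 KB item limit


def plan_storage(object_id, payload, bucket="example-bucket"):
    """Return (dynamodb_item, s3_upload); s3_upload is None for small payloads."""
    if len(payload) <= INLINE_LIMIT_BYTES:
        return {"id": {"S": object_id}, "payload": {"B": payload}}, None
    key = f"objects/{object_id}"
    # Large payload: keep only a pointer in DynamoDB and put the bytes in S3.
    item = {"id": {"S": object_id}, "s3_bucket": {"S": bucket}, "s3_key": {"S": key}}
    upload = {"Bucket": bucket, "Key": key, "Body": payload}
    return item, upload


small_item, small_upload = plan_storage("a1", b"x" * 1024)
large_item, large_upload = plan_storage("b2", b"x" * (500 * 1024))
# With boto3 available, large_upload could be passed to
#   boto3.client("s3").put_object(**large_upload)
```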

  • TTL: automatically purges an item (row) after a specified epoch timestamp, without consuming WCU / RCU.
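A TTL value is stored on each item as a Number attribute holding the expiry time in epoch seconds. The helper below is a minimal sketch of stamping an item with such a value; the attribute name `expires_at` is an assumption and would have to match whatever attribute TTL is enabled on for the table.

```python
import time

SECONDS_PER_DAY = 86400


def with_ttl(item, days, attribute="expires_at", now=None):
    """Return a copy of the item with an expiry timestamp `days` from now."""
    now = int(time.time()) if now is None else int(now)
    expiring = dict(item)
    # DynamoDB's low-level format encodes numbers as strings under the "N" type.
    expiring[attribute] = {"N": str(now + days * SECONDS_PER_DAY)}
    return expiring


session = with_ttl({"session_id": {"S": "abc"}}, days=1, now=1_700_000_000)
```

Deletion is not instantaneous once the timestamp passes, so readers that must never see expired items should also filter on the attribute.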

  • DynamoDB Streams:

    • React to changes to DynamoDB tables in real time

    • Can be read by AWS Lambda, EC2, etc., and then forwarded to Elasticsearch, Kinesis, etc.

    • 24 hours retention of data
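As an illustration of reacting to table changes, the sketch below shows a Lambda-style handler that walks the records of a DynamoDB Streams event. The sample event is hand-written for this example, and whether `NewImage` is present depends on the stream view type configured on the table.

```python
def handler(event, context=None):
    """Summarize each change record delivered by a DynamoDB stream."""
    changes = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        changes.append({
            "action": record["eventName"],         # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],
            "new_image": change.get("NewImage"),   # absent for REMOVE events
        })
    return changes


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"id": {"S": "42"}},
                "NewImage": {"id": {"S": "42"}, "status": {"S": "new"}},
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"id": {"S": "7"}}}},
    ]
}

result = handler(sample_event)
```

In a real pipeline the summarized changes would be forwarded to the downstream system instead of being returned.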

  • Global Tables (cross region replication)

    • Active-Active replication to many regions

    • Must enable DynamoDB Streams

    • Useful for low latency, DR purposes

  • DAX (DynamoDB Accelerator)

    • Seamless cache for DynamoDB, no application re-write.

    • Writes go through DAX to DynamoDB

    • Microsecond latency for cached reads

    • Solves the Hot Key Problem (too many reads)

    • 5 minutes TTL cache by default

    • Up to 10 nodes in the cluster

    • Multi-AZ (3 nodes minimum recommended for production)

    • Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail, etc.)
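The write-through and TTL behaviour listed above can be pictured with a toy in-memory model. This is not the DAX client or its API, only a sketch of the caching idea: the class name is hypothetical, a plain dict stands in for the DynamoDB table, and the clock is injectable so the 5-minute expiry can be exercised without waiting.

```python
import time


class WriteThroughCache:
    """Toy write-through cache with a per-entry TTL (300 s by default)."""

    def __init__(self, store, ttl_seconds=300, clock=time.monotonic):
        self.store = store          # stands in for the backing table
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}          # key -> (value, expiry time)

    def put(self, key, value):
        # Write-through: the backing store is updated together with the cache.
        self.store[key] = value
        self._entries[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]         # cache hit, the store is not touched
        value = self.store.get(key)
        if value is not None:
            self._entries[key] = (value, self.clock() + self.ttl)
        return value


now = [0.0]
cache = WriteThroughCache({}, clock=lambda: now[0])
cache.put("user#1", "v1")
cache.store["user#1"] = "v2"        # a write that bypasses the cache
stale = cache.get("user#1")         # still served from the cache
now[0] = 301.0                      # move past the 5-minute TTL
fresh = cache.get("user#1")         # reloaded from the backing store
```

The stale read in the middle is the trade-off of a cache like this: writes that bypass it are not visible until the cached entry expires.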

  • Scenario:

    • S3 indexing with DynamoDB

      • Create DynamoDB and indexes for later API retrieval.

      • S3 event to trigger Lambda, to insert data to DynamoDB.
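The S3 indexing scenario can be sketched as a Lambda-style handler that turns each S3 event record into a DynamoDB item. The attribute names are hypothetical and the sample event is hand-written; the actual write is left as a comment because it needs boto3 and AWS credentials.

```python
from urllib.parse import unquote_plus


def s3_record_to_item(record):
    """Map one S3 event record to an item in DynamoDB's low-level format."""
    s3 = record["s3"]
    return {
        "bucket": {"S": s3["bucket"]["name"]},
        # Object keys arrive URL-encoded in S3 event notifications.
        "object_key": {"S": unquote_plus(s3["object"]["key"])},
        "size_bytes": {"N": str(s3["object"].get("size", 0))},
        "event_time": {"S": record["eventTime"]},
    }


def handler(event, context=None):
    items = [s3_record_to_item(r) for r in event.get("Records", [])]
    # With boto3 available, each item could be written with
    #   boto3.client("dynamodb").put_item(TableName="S3Index", Item=item)
    # where "S3Index" is a hypothetical table name.
    return items


sample_event = {
    "Records": [
        {
            "eventTime": "2024-01-01T00:00:00.000Z",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "reports/2024+q1.csv", "size": 2048},
            },
        }
    ]
}

indexed = handler(sample_event)
```

Indexes on the resulting table then allow lookups (for example by date or size) that listing the bucket alone cannot answer efficiently.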

    • DAX vs ElastiCache

      • DAX: caches individual objects and query results

      • ElastiCache: stores aggregated results


Last updated 4 years ago
