AI Notes

Bellman equation

  • Objective function (a mathematical function that describes the goal)

  • V value

  • deterministic process vs non-deterministic process
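
To make the V value and the deterministic vs non-deterministic distinction concrete, here is a minimal sketch of value iteration with the Bellman equation. The corridor world, the reward of 1 at the goal, the discount of 0.9, and the `slip` probability are all assumptions made up for illustration; they do not come from the notes.

```python
GAMMA = 0.9        # discount factor (assumed value)
N_STATES = 4       # corridor cells 0..3; cell 3 is the terminal goal (assumed world)
ACTIONS = (-1, 1)  # step left or step right


def transitions(state, action, slip):
    """Return [(probability, next_state, reward)] for one state-action pair.

    With slip == 0 the process is deterministic. With slip > 0 the agent
    stays where it is with probability `slip` (non-deterministic process).
    """
    target = min(max(state + action, 0), N_STATES - 1)
    outcomes = [(1.0 - slip, target)]
    if slip > 0.0:
        outcomes.append((slip, state))
    goal = N_STATES - 1
    # Reward 1 for entering the goal cell, 0 otherwise.
    return [(p, s2, 1.0 if s2 == goal else 0.0) for p, s2 in outcomes]


def value_iteration(slip=0.0, sweeps=200):
    """Apply V(s) = max_a sum_s' P(s'|s,a) * (R + GAMMA * V(s')) repeatedly."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES - 1):  # the terminal goal keeps V = 0
            values[s] = max(
                sum(p * (r + GAMMA * values[s2])
                    for p, s2, r in transitions(s, a, slip))
                for a in ACTIONS
            )
    return values
```

Calling `value_iteration(0.0)` gives the deterministic values and `value_iteration(0.2)` the non-deterministic ones; states far from the goal end up with lower values when moves can fail.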

Markov process property

Policy vs Plan

Living penalty

Q-learning intuition

  • Action-utility function

  • Q(s, a) = R(s, a) + discount * sum over s' of P(s' | s, a) * max over a' of Q(s', a') // probabilistic (non-deterministic) form

  • Q(s, a) = R(s, a) + discount * max over a' of Q(s', a') // simpler, deterministic form

  • Derived from the V value: V(s) = max over a of Q(s, a)
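
The Q formulas above can be sketched as a single one-step backup. The function names and the numbers in the usage note are hypothetical; the deterministic case is treated as the probabilistic case with one outcome of probability 1.

```python
def q_value(reward, discount, outcomes):
    """One-step Q backup.

    `outcomes` is a list of (probability, next_q_values) pairs, where
    next_q_values holds Q(s', a') for every action a' available in s'.
    """
    return reward + discount * sum(p * max(next_qs) for p, next_qs in outcomes)


def v_from_q(q_values):
    """V(s) is the best Q(s, a) over all actions a."""
    return max(q_values)
```

For example, `q_value(0.0, 0.9, [(1.0, [0.5, 1.0])])` is the deterministic form, while `q_value(0.0, 0.9, [(0.8, [0.5, 1.0]), (0.2, [0.0, 0.2])])` weights two possible next states by their probabilities.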

Temporal difference

  • TD(a, s) = (R(s, a) + discount * max over a' of Q(s', a')) - Q(s, a) // Q after the action minus Q before the action; difference form

  • Q(s, a) after action = Q(s, a) before action + alpha * TD(a, s) // alpha is the learning rate; incremental form
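
A minimal sketch of the temporal-difference update, assuming a single scalar Q(s, a) and a list of Q values for the next state. The function and parameter names are hypothetical.

```python
def td_error(q_before, reward, discount, next_qs):
    """TD = (reward + discount * best next Q) - Q before the action."""
    target = reward + discount * max(next_qs)
    return target - q_before


def td_update(q_before, reward, discount, next_qs, alpha):
    """Move Q(s, a) a fraction alpha (the learning rate) toward the target."""
    return q_before + alpha * td_error(q_before, reward, discount, next_qs)
```

Repeating `td_update` on the same transition moves Q a step closer to the target each time, so a smaller alpha means slower but smoother learning.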

Deep Q-learning

  • Q-learning

  • Neural network

    • Activation function

      • Threshold function - value = 1 if x >= 0, else value = 0.

      • Sigmoid function - value = 1 / (1 + e^-x). A smooth curve between 0 and 1.

      • Rectifier function - value = max(x, 0).

      • Hyperbolic tangent (tanh) - value = (1 - e^-2x) / (1 + e^-2x). A smooth curve between -1 and 1.

    • Learning - Experience replay

      • Cost function = 1/2 * (output value - actual value)^2 // Minimizing the cost yields the weight of each input.

    • Acting - Action selection policies

      • exploration

        • epsilon-greedy

        • Softmax

        • epsilon-greedy VDBE (value-difference based exploration)

      • exploitation
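
The building blocks listed above can be sketched in plain Python. This covers the four activation functions, the squared-error cost, and two of the action selection policies (epsilon-greedy and softmax). The function names, the `temperature` parameter, and the `rng` argument are assumptions for illustration; the neural network, experience replay, and VDBE are not shown, and the code is not hardened against overflow for very large inputs.

```python
import math
import random


def threshold(x):
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rectifier(x):
    return max(x, 0.0)


def tanh(x):
    return (1.0 - math.exp(-2.0 * x)) / (1.0 + math.exp(-2.0 * x))


def cost(output_value, actual_value):
    """Squared-error cost for a single output."""
    return 0.5 * (output_value - actual_value) ** 2


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature=1.0):
    """Turn Q values into action probabilities; higher Q means higher chance."""
    scaled = [q / temperature for q in q_values]
    shift = max(scaled)  # subtract the max so exp() stays in a safe range
    exps = [math.exp(v - shift) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

With `epsilon = 0` the epsilon-greedy policy is pure exploitation. Softmax explores more gracefully: every action keeps some probability, and a lower temperature pushes the choice toward the best Q value.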

Last updated 5 years ago