# Database

**Basics**

* [ACID](https://en.wikipedia.org/wiki/ACID) - DBMS transaction properties
* [Transactions Isolation levels](https://docs.microsoft.com/en-us/sql/odbc/reference/develop-app/transaction-isolation-levels?view=sql-server-2017)
* [Read lock, Write lock](http://www.mysqltutorial.org/mysql-table-locking/)
* [Optimistic lock & Pessimistic lock](https://stackoverflow.com/questions/129329/optimistic-vs-pessimistic-locking)
  * JOOQ[ Optimistic lock](https://www.jooq.org/doc/latest/manual/sql-execution/crud-with-updatablerecords/optimistic-locking/),[ Pessimistic lock](https://www.jooq.org/doc/3.0/manual/sql-building/sql-statements/select-statement/for-update-clause/)
  * [JPA Optimistic Locking](https://www.baeldung.com/jpa-optimistic-locking)

**Relational DB**

* PostgreSQL
  * Default isolation level: Read Committed
  * Replication: Master-slave
  * Partitioning method: partitioning by range, list and (since PostgreSQL 11) by hash
  * Index
    * How do indexes work in a relational DB? Types and scenarios?
    * [Efficient Use of PostgreSQL Indexes](https://devcenter.heroku.com/articles/postgresql-indexes)
    * Index types:
      * B tree
      * Hash Index
      * GIN(Gernalized inverted index)
      * [GIST](https://stackoverflow.com/questions/28975517/difference-between-gist-and-gin-index)
  * [PostgreSQL 锁解密](https://www.oschina.net/translate/postgresql-locking-revealed)
* MySQL
  * Default isolation level: Repeatable Read
  * Replication: master-master, master-slave
  * Partitioning method: horizontal partitioning, sharding with MySQL Cluster or MySQL Fabric
  * [B+ Tree Index](https://ithelp.ithome.com.tw/articles/10221111)
  * [Designing Schemaless, Uber Engineering’s Scalable Datastore Using MySQL](https://eng.uber.com/schemaless-part-one-mysql-datastore/)
  * [The Architecture of Schemaless, Uber Engineering’s Trip Datastore Using MySQL](https://eng.uber.com/schemaless-part-two-architecture/)
* Oracle
  * Commercial

    Replication:

    * master-master
      * All nodes can do read / write
        * Support fail-over
        * Consistency:
          * Most multi-master replication systems are only loosely consistent, i.e. lazy and asynchronous, violating ACID properties.
          * Performance: Eager replication systems are complex and increase communication latency.
        * Integrity: Issues such as conflict resolution can become intractable as the number of nodes involved rises and latency increases.
      * master-slave
        * Only master node can modify data. Slave nodes waiting for master node for high consistency
        * No automatic fail over
      * Cluster
        * No single point of failure
        * Very high throughput
        * 99.99% uptime
        * Auto-Sharding
        * Real-Time Responsiveness
        * On-Line Operations(Schema changes etc)
        * Distributed writes
        * See known [limitations](https://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-limitations.html)
    * Partitioning method: horizontal partitioning

**NoSQL** [#NoSQLFor?](http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html)

* Column-oriented
  * Cassandra
    * Strength:
      * ability to scale while still being reliable
      * high availability (high velocity writes, and lower velocity reads)
    * Structured / non-structured data. Flexible wide column.
    * A cluster of equal-privileged nodes in a ring structure.
* Graph
* Document-oriented
  * Mongo
    * JSON document based
* Key-value
  * Properties:
    * Can provide a good look up for sharded RDB
  * ready to talk about:
    * Cache Hit Ratio
    * Eviction**s**
    * Expired objects
  * Redis
    * Support persistence
    * Support key-objects (string, hash, list, set, sorted set)
    * Can selectively outdate data&#x20;
    * Support single threaded only
  * Memcached
    * volatile caching
    * When value is smaller than 1 MB, very fast.
    * Support multi-thread
    * [#MemcachedWorkWithMysql](http://highscalability.com/bunch-great-strategies-using-memcached-and-mysql-better-together)
  * Advanced data structures

**Trouble Shooting**

* How do you know that your database queries are taking time? How will you solve it?
  * [Oracle v$session view](https://oolamaru.wordpress.com/2017/05/30/oracle-tuning-%E6%96%B9%E6%B3%95%E7%9B%A3%E6%8E%A7/)

\
**Implementation:**&#x200C;

* Designing a friend list
  * Relational DB
    * Instead of just insert 1 row and do bi-directional search. Insert 2 rows at the same time to describe from both user's perspectives would help keep search SQL simple and make sharding possible.
  * NoSQL
    * Graph DB
      * ​[TAO](https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920/) (Facebook's solution), Open Graph, Neo4j‌

**Reference:**&#x200C;

* ​[Quora: How does Facebook maintain a list of friends for each user? Does it maintain a separate table for each user?](https://www.quora.com/How-does-Facebook-maintain-a-list-of-friends-for-each-user-Does-it-maintain-a-separate-table-for-each-user)​

**Reference**

* Second Level Cache -[ English](https://www.baeldung.com/hibernate-second-level-cache)[ Chinese](https://blog.csdn.net/xlgen157387/article/details/40071651)
* Uber replace Postgres with Mysql [#1](https://eng.uber.com/postgres-to-mysql-migration/)
