MongoDB

MongoDB is an open-source NoSQL document database. Because it stores data as JSON-like documents, MongoDB has three advantages: flexibility, easy scaling, and high performance. Unlike traditional relational databases, MongoDB does not require users to define the data structure in advance, and fields (key/value pairs) can be added to documents freely.

Basic concepts

SQL vs MongoDB

Database vs Database
Table vs Collection
Row vs Document
Column vs Field
Index vs Index
Primary key vs _id
View vs View
Table Joins vs $lookup

Query Syntax Example

SQL vs MongoDB

a = 1 vs {a: 1}
a <> 1 vs {a: {$ne: 1}}
a > 1 vs {a: {$gt: 1}}
a >= 1 vs {a: {$gte: 1}}
a < 1 vs {a: {$lt: 1}}
a <= 1 vs {a: {$lte: 1}}
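
The filter documents in the table above are plain nested key/value structures, so their meaning can be illustrated without a server. The following is a minimal sketch in Python (where keys must be quoted), assuming every document contains the queried field. `matches` and `find` are hypothetical helper names for illustration, not MongoDB or driver APIs, and real MongoDB applies further rules (for example for missing fields and arrays) that this toy ignores.

```python
import operator

# Comparison operators from the table above, mapped to Python functions.
OPS = {
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in query (toy model)."""
    for field, cond in query.items():
        value = doc[field]  # assumption: the field exists in every document
        if isinstance(cond, dict):
            # Operator form, e.g. {"a": {"$gt": 1}}; all operators must hold.
            if not all(OPS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            # Plain form, e.g. {"a": 1}, means equality.
            return False
    return True


def find(docs, query):
    """Hypothetical stand-in for a collection lookup over an in-memory list."""
    return [d for d in docs if matches(d, query)]


docs = [{"a": 0}, {"a": 1}, {"a": 2}]
equal_one = find(docs, {"a": 1})
not_one = find(docs, {"a": {"$ne": 1}})
at_least_one = find(docs, {"a": {"$gte": 1}})
```

Several operators can be combined in one condition, for example `{"a": {"$gt": 0, "$lt": 2}}`, which corresponds to `a > 0 AND a < 2` in SQL.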


Aggregation Pipeline Stages

Description: MongoDB vs SQL

Filter criteria: $match vs where
Projection: $project vs select ... as
Left outer join: $lookup vs left outer join
Sort: $sort vs order by
Group: $group vs group by
Pagination: $skip/$limit vs limit 0,10
Expand array: $unwind
Graph search: $graphLookup
Faceted search: $facet/$bucket
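
A pipeline is an ordered list of stage documents, and each stage transforms the output of the previous one. The sketch below illustrates that flow with a toy in-memory executor. `run_pipeline` and the `orders` data are hypothetical and written only for this example, and the executor supports just equality `$match`, `$group` with `$sum`, single-key `$sort`, `$skip`, and `$limit`. It is not how MongoDB executes pipelines.

```python
def run_pipeline(docs, pipeline):
    """Apply each stage in order to a list of dicts (toy model)."""
    for stage in pipeline:
        (name, spec), = stage.items()  # each stage has exactly one key
        if name == "$match":  # WHERE (equality only in this sketch)
            docs = [d for d in docs if all(d.get(k) == v for k, v in spec.items())]
        elif name == "$group":  # GROUP BY with {"$sum": "$field"} accumulators
            key_field = spec["_id"][1:]  # "$customer" -> "customer"
            groups = {}
            for d in docs:
                group = groups.setdefault(d[key_field], {"_id": d[key_field]})
                for out, acc in spec.items():
                    if out != "_id":
                        group[out] = group.get(out, 0) + d[acc["$sum"][1:]]
            docs = list(groups.values())
        elif name == "$sort":  # ORDER BY (single key; 1 ascending, -1 descending)
            (field, direction), = spec.items()
            docs = sorted(docs, key=lambda d: d[field], reverse=direction < 0)
        elif name == "$skip":  # OFFSET
            docs = docs[spec:]
        elif name == "$limit":  # LIMIT
            docs = docs[:spec]
        else:
            raise ValueError(f"stage not supported in this sketch: {name}")
    return docs


# Hypothetical sample data.
orders = [
    {"customer": "ann", "status": "paid", "amount": 30},
    {"customer": "bob", "status": "paid", "amount": 50},
    {"customer": "ann", "status": "paid", "amount": 40},
    {"customer": "cat", "status": "open", "amount": 99},
]

# Roughly: SELECT customer AS _id, SUM(amount) AS total FROM orders
#          WHERE status = 'paid' GROUP BY customer
#          ORDER BY total DESC LIMIT 0, 10
pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
    {"$skip": 0},
    {"$limit": 10},
]

result = run_pipeline(orders, pipeline)
```

With this sample data, `result` holds one document per paying customer, ordered by total amount from highest to lowest.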

Index

Single Field Index
Compound Index
Multikey Index
Hashed Index
Geospatial Index
Text Index
Wildcard Index

Index properties

Unique Index
Partial Index
Sparse Index
TTL Index
Hidden Index

Explain

Like MySQL's EXPLAIN, MongoDB provides an explain function that shows the plan chosen for a query, so we can check how the query is executed and improve its efficiency.
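
When reading explain output, a common first check is whether the winning plan scans the whole collection (stage `COLLSCAN`) or uses an index (stage `IXSCAN`). The sketch below walks a plan tree to make that check. The two sample documents are hand-written and heavily trimmed for illustration, not real server output, and `plan_stages` and `uses_collection_scan` are hypothetical helpers. The exact shape of explain output varies between MongoDB versions, so treat the field layout here as an assumption.

```python
def plan_stages(plan):
    """Collect stage names from a plan by following 'inputStage' links."""
    stages = []
    while plan is not None:
        stages.append(plan["stage"])
        plan = plan.get("inputStage")
    return stages


def uses_collection_scan(explain_doc):
    """True if the winning plan contains a full collection scan."""
    winning_plan = explain_doc["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning_plan)


# Hand-written, trimmed examples (assumed shape, not real output).
indexed = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "a_1"},
        }
    }
}
unindexed = {"queryPlanner": {"winningPlan": {"stage": "COLLSCAN"}}}

indexed_stages = plan_stages(indexed["queryPlanner"]["winningPlan"])
```

A query that reports a collection scan on a large collection is usually a candidate for a new index.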

Cluster Types

Replica Set

MongoDB replication is the process of synchronizing data across multiple servers.

Replication keeps redundant copies of the data on multiple servers, which improves data availability and protects against data loss.

Replication also allows you to recover data from hardware failures and service outages.

MongoDB replication requires at least two nodes. One of them is the primary (master) node, which handles client requests; the rest are secondary (slave) nodes, which replicate the data on the primary.

Sharding

Sharding is the process of splitting a database and spreading it across different machines, so that it can store more data and handle larger loads without requiring more powerful servers. A collection is cut into smaller chunks, and these chunks are distributed across several shards. Each shard holds only part of the total data, and requests go through the mongos routing process, which knows which shard holds which data.

Components

Shard

Stores the data. Each shard is usually deployed as a replica set rather than as a single node.

Config Server

Stores all the metadata of the cluster, including information about the shards and about how requests are routed to them.

Mongos

The entry point of the cluster. It routes each request to the right shard.
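
The routing role of mongos can be sketched as a lookup in a table of chunk ranges. The example below is a simplified illustration, assuming range-based sharding on a numeric shard key called `user_id`. The chunk boundaries, shard names, and the helpers `route` and `shards_for` are all hypothetical, and the real router works from metadata held by the config servers.

```python
import bisect

# Hypothetical chunk table: chunk i covers [lower bound i, lower bound i+1).
CHUNK_LOWER_BOUNDS = [float("-inf"), 100, 200]
CHUNK_OWNERS = ["shardA", "shardB", "shardC"]


def route(shard_key_value):
    """Return the shard that owns the chunk containing this key value."""
    idx = bisect.bisect_right(CHUNK_LOWER_BOUNDS, shard_key_value) - 1
    return CHUNK_OWNERS[idx]


def shards_for(query, shard_key="user_id"):
    """Shards a query must visit (equality on the shard key only, toy model).

    A query that includes the shard key can be sent to one shard;
    otherwise it has to be sent to every shard.
    """
    if shard_key in query:
        return [route(query[shard_key])]
    return sorted(set(CHUNK_OWNERS))


targeted = shards_for({"user_id": 150})
broadcast = shards_for({"name": "ann"})
```

This is why the choice of shard key matters: queries that filter on it touch a single shard, while other queries fan out to all of them.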

Strategy

WriteConcern

Write concern describes the level of acknowledgment requested from MongoDB for write operations to a standalone mongod, replica sets, or sharded clusters. In sharded clusters, mongos instances will pass the write concern on to the shards.
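
A write concern is typically written as a small document such as `{"w": "majority", "j": True, "wtimeout": 5000}`, where `w` is the number of members that must acknowledge the write (or `"majority"`). The sketch below shows the arithmetic behind `w` under a simplifying assumption: every member is a data-bearing voting member. `required_acks` and `is_acknowledged` are hypothetical helpers, not driver APIs, and the server's real majority calculation has more cases (for example arbiters).

```python
def required_acks(w, voting_members):
    """Number of acknowledgments needed for a given w value (simplified)."""
    if w == "majority":
        return voting_members // 2 + 1
    return int(w)


def is_acknowledged(w, voting_members, acks_received):
    """True once enough members have confirmed the write."""
    return acks_received >= required_acks(w, voting_members)


# Example write concern document, written here as a Python dict.
write_concern = {"w": "majority", "j": True, "wtimeout": 5000}

needed_in_three_node_set = required_acks(write_concern["w"], 3)
```

A higher `w` gives stronger durability guarantees at the cost of write latency, because the client waits for more members to confirm.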

ReadPreference

Read preference describes how MongoDB clients route read operations to the members of a replica set.
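
The read preference modes are `primary`, `primaryPreferred`, `secondary`, `secondaryPreferred`, and `nearest`. The following sketch shows how each mode narrows down the candidate members. It is a simplified model: `pick_member` and the member list are hypothetical, and the toy always picks the lowest-latency candidate, whereas real drivers apply more elaborate selection rules.

```python
def pick_member(mode, members):
    """Choose a member name for a read, or None if no member is eligible."""
    primaries = [m for m in members if m["role"] == "primary"]
    secondaries = [m for m in members if m["role"] == "secondary"]
    # Candidate groups in order of preference for each mode.
    preference = {
        "primary": [primaries],
        "primaryPreferred": [primaries, secondaries],
        "secondary": [secondaries],
        "secondaryPreferred": [secondaries, primaries],
        "nearest": [members],
    }[mode]
    for group in preference:
        if group:
            # Simplification: take the candidate with the lowest latency.
            return min(group, key=lambda m: m["ping_ms"])["name"]
    return None


# Hypothetical replica set with measured round-trip times.
members = [
    {"name": "n1", "role": "primary", "ping_ms": 20},
    {"name": "n2", "role": "secondary", "ping_ms": 5},
    {"name": "n3", "role": "secondary", "ping_ms": 12},
]

default_read = pick_member("primary", members)
offloaded_read = pick_member("secondaryPreferred", members)
```

Reads sent to secondaries can return stale data, since secondaries replicate from the primary with some delay.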

ReadConcern

The readConcern option allows you to control the consistency and isolation properties of the data read from replica sets and sharded clusters.

Monitor

mongostat

The mongostat utility provides a quick overview of the status of a currently running mongod or mongos instance. Use mongostat to help identify system bottlenecks.

mongotop

mongotop tracks the amount of time a MongoDB instance (mongod) spends reading and writing data. It provides statistics at the per-collection level and, by default, returns values every second.

profiler

The database profiler collects detailed information about Database Commands executed against a running mongod instance. This includes CRUD operations as well as configuration and administration commands.

The profiler writes all the data it collects to a system.profile collection, a capped collection in each profiled database. 

db.currentOp()

Returns a document that contains information on in-progress operations for the database instance.

Raft

https://en.wikipedia.org/wiki/Raft_(algorithm)
