Distributed transactions

Basic theory 

CAP

The CAP theorem states that any distributed data store can provide only two of the following three guarantees:

Consistency

Every read receives the most recent write or an error.

Availability

Every request receives a (non-error) response, without the guarantee that it contains the most recent write.

Partition tolerance

The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

Typical architecture of distributed systems

When a network partition occurs, the system must choose one of the following:

CP: cancel the operation, and thus decrease availability but ensure consistency.
AP: proceed with the operation, and thus provide availability but risk inconsistency.

BASE

Basically-available, soft-state, eventual consistency.
BASE is the practical application of the CAP theorem: given that partitions and replicas exist, the system design gives up strong consistency in exchange for basic availability. This is the trade-off most distributed systems choose.

Basically available

Reading and writing operations are available as much as possible (using all nodes of a database cluster), but might not be consistent (the write might not persist after conflicts are reconciled, and the read might not return the latest write).

Soft-state

Without consistency guarantees, after some amount of time we only have some probability of knowing the state, since it might not yet have converged.

Eventually Consistent

If we execute some writes and then the system functions long enough, we eventually know the state of the data; any further reads of that data item will return the same value.

Distributed transaction model

Strong consistency model

DTP model



The X/Open Distributed Transaction Processing (DTP) model includes a number of interrelated components that control how distributed transactions are processed.

Application program (AP):
The application program (AP) defines the transaction boundaries and specifies the application-specific actions that make up the transaction.

Resource managers (RM):
An RM is a resource such as a database (e.g., MySQL or Oracle), its database driver, a file system, or a print server; it provides the interface for accessing that resource.

Transaction manager (TM):
The TM is the coordinator of distributed transactions. It is responsible for assigning transaction IDs to distributed transactions, monitoring their execution, and handling transaction completion and failure recovery.
A distributed transaction managed by the TM can span multiple RMs; the TM also drives the 2PC protocol to coordinate the commit/rollback decision of the distributed transaction.

X/Open XA



The goal of XA is to guarantee atomicity in "global transactions" that are executed across heterogeneous components. A transaction is a unit of work such as transferring money from one person to another. Distributed transactions update multiple data stores (such as databases, application servers, message queues, transactional caches, etc.). To guarantee integrity, XA uses a two-phase commit (2PC) to ensure that all of a transaction's changes either take effect (commit) or none do (rollback), i.e., atomically.
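For concreteness, a minimal sketch of how an AP might drive such a global transaction through JTA is shown below. The JNDI names, table names, and SQL are hypothetical, and a JTA transaction manager plus two XA-capable DataSources are assumed to be configured in the environment; this illustrates the idea rather than being a drop-in implementation.

import javax.naming.InitialContext;
import javax.sql.DataSource;
import javax.transaction.UserTransaction;
import java.sql.Connection;
import java.sql.PreparedStatement;

// Hypothetical money transfer spanning two XA-capable databases.
// A JTA transaction manager plays the TM role of the DTP model and
// the two XA DataSources play the RM role.
public class XaTransferSketch {

    public void transfer(long fromId, long toId, long amount) throws Exception {
        InitialContext ctx = new InitialContext();
        UserTransaction utx = (UserTransaction) ctx.lookup("java:comp/UserTransaction");
        DataSource accountsDb = (DataSource) ctx.lookup("java:comp/env/jdbc/accountsXA");
        DataSource ledgerDb = (DataSource) ctx.lookup("java:comp/env/jdbc/ledgerXA");

        utx.begin();                       // the AP marks the global transaction boundary
        try {
            try (Connection a = accountsDb.getConnection();
                 PreparedStatement debit = a.prepareStatement(
                         "UPDATE account SET balance = balance - ? WHERE id = ?")) {
                debit.setLong(1, amount);
                debit.setLong(2, fromId);
                debit.executeUpdate();
            }
            try (Connection l = ledgerDb.getConnection();
                 PreparedStatement log = l.prepareStatement(
                         "INSERT INTO transfer_log(from_id, to_id, amount) VALUES (?, ?, ?)")) {
                log.setLong(1, fromId);
                log.setLong(2, toId);
                log.setLong(3, amount);
                log.executeUpdate();
            }
            utx.commit();                  // the TM drives 2PC across both RMs
        } catch (Exception e) {
            try { utx.rollback(); } catch (Exception ignored) { }
            throw e;
        }
    }
}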

2PC (Two-phase commit protocol)



Phase 1:
The TM (Transaction Manager) notifies each RM (Resource Manager) to prepare to commit its transaction branch. If an RM judges that the work it has done can be committed, it persists the work and gives the TM a positive reply; otherwise it gives the TM a negative reply.

Phase 2:
The TM decides whether to commit or roll back the transaction based on each RM's result from phase 1. If all RMs prepared successfully, the TM notifies all RMs to commit; if any RM failed to prepare, the TM notifies all RMs to roll back their transaction branches.
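
The control flow of 2PC can be sketched with a hypothetical coordinator. The Participant interface and its prepare/commit/rollback methods below are illustrative names, not any particular library's API; a real coordinator also needs timeouts, durable logging of its decision, and recovery handling.

import java.util.List;

// Minimal 2PC coordinator sketch. Participant is a hypothetical stand-in
// for an RM's transaction branch.
interface Participant {
    boolean prepare();   // phase 1: persist the work and vote yes/no
    void commit();       // phase 2: make the prepared work durable/visible
    void rollback();     // phase 2: undo the prepared work
}

class TwoPhaseCommitCoordinator {
    boolean execute(List<Participant> participants) {
        // Phase 1: ask every participant to prepare.
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) { allPrepared = false; break; }
        }
        // Phase 2: commit only if every vote was yes; otherwise roll back all.
        if (allPrepared) {
            participants.forEach(Participant::commit);
        } else {
            participants.forEach(Participant::rollback);
        }
        return allPrepared;
    }
}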

3PC (Three-phase commit protocol)



Three-Phase Commit is an extension of the Two-Phase Commit (2PC) protocol, with an additional phase to improve fault tolerance. It is designed to be non-blocking and has three main steps:

Can-Commit: The coordinator sends a Can-Commit message, asking each participant if they're ready to commit the transaction. Participants reply with either Yes or No.

Pre-Commit: If all participants send a Yes, the coordinator broadcasts a Pre-Commit message. Participants acknowledge the message and prepare to commit the transaction.

Do-Commit: Once the coordinator receives all acknowledgements, it sends a Do-Commit message, instructing participants to finalize the transaction.

Three-Phase Commit helps avoid blocking in case of a coordinator failure by allowing participants to reach a decision independently, reducing the chances of a global deadlock.
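
A minimal sketch of the extra round is shown below; the ThreePhaseParticipant interface is hypothetical, and in practice participants rely on timeouts to commit or abort on their own when the coordinator fails.

import java.util.List;

// Illustrative 3PC coordinator. Compared with 2PC there is an extra
// Can-Commit round, and participants that reach the pre-committed state
// can time out and decide on their own if the coordinator disappears.
interface ThreePhaseParticipant {
    boolean canCommit();   // phase 1: vote without doing any work yet
    void preCommit();      // phase 2: prepare and acknowledge
    void doCommit();       // phase 3: finalize the transaction
    void abort();
}

class ThreePhaseCommitCoordinator {
    boolean execute(List<ThreePhaseParticipant> participants) {
        // Phase 1: Can-Commit
        for (ThreePhaseParticipant p : participants) {
            if (!p.canCommit()) {
                participants.forEach(ThreePhaseParticipant::abort);
                return false;
            }
        }
        // Phase 2: Pre-Commit (participants prepare and acknowledge)
        participants.forEach(ThreePhaseParticipant::preCommit);
        // Phase 3: Do-Commit
        participants.forEach(ThreePhaseParticipant::doCommit);
        return true;
    }
}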

Eventual consistency model

TCC (Try-Confirm-Cancel)



Try-Confirm-Cancel (TCC) mode is one of the classic distributed transaction solutions; it executes a distributed transaction in two phases. In the try phase, resources are reserved for each branch transaction. If every branch succeeds in reserving its resources, the global transaction is committed in the confirm phase; if any branch fails to reserve resources, the global transaction is rolled back in the cancel phase.
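
In code, each branch transaction typically exposes three business-level operations. The interface and coordinator below are a hypothetical sketch, not Seata's or any specific framework's API; note that confirm and cancel must be idempotent because they may be retried.

import java.util.List;

// Hypothetical TCC branch, e.g. reserving inventory. tryReserve reserves
// the resource, confirm consumes the reservation, cancel releases it.
interface TccBranch {
    boolean tryReserve(String txId);
    void confirm(String txId);
    void cancel(String txId);
}

class TccCoordinator {
    boolean execute(String txId, List<TccBranch> branches) {
        // Try phase: reserve resources on every branch.
        for (TccBranch b : branches) {
            if (!b.tryReserve(txId)) {
                // Any failure cancels all branches; branches that never ran
                // their try must tolerate this "empty rollback".
                branches.forEach(x -> x.cancel(txId));
                return false;
            }
        }
        // Confirm phase: all reservations succeeded, finish the business action.
        branches.forEach(x -> x.confirm(txId));
        return true;
    }
}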

TCC vs XA

XA is a resource-level distributed transaction with strong consistency; resource locks are held for the entire two-phase commit. TCC is a business-level distributed transaction with eventual consistency; it does not hold resource locks throughout.

Reliable message eventual consistency

Local message table

The local message table is a pattern coupled to the business logic. The message producer builds an additional transaction-message table and records each message's sending status; the message consumer processes the message and completes its own business logic. In addition, an asynchronous job periodically scans for outstanding messages and re-sends them to ensure eventual consistency.
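
Since this blog's examples are Spring Boot based, a rough sketch of the pattern with JdbcTemplate might look like the following. The local_message table, its columns, the orders table, and the MessageQueueClient abstraction are all hypothetical, and @Scheduled assumes scheduling is enabled in the application.

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import java.util.List;
import java.util.Map;

// Sketch of the local message table (outbox) pattern.
@Service
public class OrderService {

    private final JdbcTemplate jdbc;
    private final MessageQueueClient messageQueue; // hypothetical MQ client

    public OrderService(JdbcTemplate jdbc, MessageQueueClient messageQueue) {
        this.jdbc = jdbc;
        this.messageQueue = messageQueue;
    }

    @Transactional
    public void createOrder(long orderId, String payload) {
        // The business write and the message record share one local DB transaction.
        jdbc.update("INSERT INTO orders(id, payload) VALUES (?, ?)", orderId, payload);
        jdbc.update("INSERT INTO local_message(order_id, body, status) VALUES (?, ?, 'PENDING')",
                orderId, payload);
    }

    // Asynchronous job: periodically scan and (re)send unfinished messages.
    @Scheduled(fixedDelay = 5000)
    public void relayPendingMessages() {
        List<Map<String, Object>> pending =
                jdbc.queryForList("SELECT id, body FROM local_message WHERE status = 'PENDING'");
        for (Map<String, Object> row : pending) {
            messageQueue.send("order-topic", (String) row.get("body"));
            jdbc.update("UPDATE local_message SET status = 'SENT' WHERE id = ?", row.get("id"));
        }
    }

    // Hypothetical MQ client abstraction used above.
    public interface MessageQueueClient {
        void send(String topic, String body);
    }
}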

Transactional messages (RocketMQ)



Normal messages cannot be kept consistent with the order's local transaction, because a normal message cannot be committed, rolled back, and coordinated the way a stand-alone database transaction can.

The distributed transactional message feature of Message Queue for Apache RocketMQ supports a two-phase commit built on top of normal messages: the two-phase message commit is bound to a local transaction so that both reach a consistent global commit result.

The solution of transactional messages in Message Queue for Apache RocketMQ provides the advantages of high performance, scalability, and simple business development.
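
A rough sketch of the producer side with RocketMQ's Java client (TransactionMQProducer and TransactionListener) is shown below. The group name, topic, name-server address, and the body of the local transaction are placeholders.

import org.apache.rocketmq.client.producer.LocalTransactionState;
import org.apache.rocketmq.client.producer.TransactionListener;
import org.apache.rocketmq.client.producer.TransactionMQProducer;
import org.apache.rocketmq.common.message.Message;
import org.apache.rocketmq.common.message.MessageExt;
import java.nio.charset.StandardCharsets;

// Sketch of a RocketMQ transactional-message producer.
public class TransactionalProducerSketch {
    public static void main(String[] args) throws Exception {
        TransactionMQProducer producer = new TransactionMQProducer("order-tx-group");
        producer.setNamesrvAddr("127.0.0.1:9876");
        producer.setTransactionListener(new TransactionListener() {
            @Override
            public LocalTransactionState executeLocalTransaction(Message msg, Object arg) {
                try {
                    // Run the local DB transaction (e.g., create the order) here.
                    return LocalTransactionState.COMMIT_MESSAGE;
                } catch (Exception e) {
                    return LocalTransactionState.ROLLBACK_MESSAGE;
                }
            }
            @Override
            public LocalTransactionState checkLocalTransaction(MessageExt msg) {
                // The broker calls back when the half message's outcome is unknown;
                // look up the local transaction result and answer accordingly.
                return LocalTransactionState.COMMIT_MESSAGE;
            }
        });
        producer.start();

        Message msg = new Message("order-topic", "create",
                "order-1001".getBytes(StandardCharsets.UTF_8));
        producer.sendMessageInTransaction(msg, null);
    }
}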

Data consistency algorithm

Paxos



Paxos is an algorithm that is used to achieve consensus among a distributed set of computers that communicate via an asynchronous network. One or more clients propose a value to Paxos and we have a consensus when a majority of systems running Paxos agree on one of the proposed values. Paxos is widely used and is legendary in computer science since it is the first consensus algorithm that has been rigorously proved to be correct.

Basic Paxos

In most deployments of Paxos, each participating process acts in all three roles: Proposer, Acceptor, and Learner. This reduces the message complexity significantly without sacrificing correctness.

Paxos is a two-phase protocol, meaning that the proposers interact with the acceptors twice. At a high level:

Phase 1:
A proposer sends a prepare request to the acceptors, asking whether any of them has already accepted a proposal. If none has, the proposer proposes its own value; otherwise it must adopt the value of the highest-numbered proposal already accepted.

Phase 2:
The proposer then asks the acceptors to accept that value; if a majority of acceptors accept it, that value is the consensus.
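
A simplified, single-value sketch of the acceptor's bookkeeping for the two phases might look like this; the class names are illustrative, networking is omitted, and a real acceptor must persist its promised and accepted state before replying.

// Simplified single-value Paxos acceptor.
class PaxosAcceptor {
    private long promisedId = -1;   // highest proposal number promised
    private long acceptedId = -1;   // highest proposal number accepted
    private Object acceptedValue;   // value of that accepted proposal

    // Phase 1: prepare(n). Promise not to accept proposals numbered below n,
    // and report any value already accepted so the proposer must adopt it.
    synchronized PrepareReply prepare(long proposalId) {
        if (proposalId > promisedId) {
            promisedId = proposalId;
            return new PrepareReply(true, acceptedId, acceptedValue);
        }
        return new PrepareReply(false, acceptedId, acceptedValue);
    }

    // Phase 2: accept(n, v). Accept unless a higher-numbered promise was made.
    synchronized boolean accept(long proposalId, Object value) {
        if (proposalId >= promisedId) {
            promisedId = proposalId;
            acceptedId = proposalId;
            acceptedValue = value;
            return true;   // a value is chosen once a majority of acceptors do this
        }
        return false;
    }

    record PrepareReply(boolean promised, long acceptedId, Object acceptedValue) {}
}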

Multi-Paxos



A typical deployment of Paxos requires a continuous stream of agreed values acting as commands to a distributed state machine. If each command is the result of a single instance of the Basic Paxos protocol, a significant amount of overhead would result.

If the leader is relatively stable, phase 1 becomes unnecessary. Thus, it is possible to skip phase 1 for future instances of the protocol with the same leader.

To achieve this, a round number is included along with each value and is incremented in each round by the same leader. Multi-Paxos reduces the failure-free message delay (from proposal to learning) from four delays to two.

ZAB (ZooKeeper Atomic Broadcast)



The ZAB protocol ensures that ZooKeeper replication is done in order, and it is also responsible for electing the leader node and recovering any failed nodes. In a ZooKeeper ensemble, the leader node is the heart of everything: every cluster has one leader and the rest of the nodes are followers. All state-changing client requests are forwarded to the leader, which is responsible for replicating them across all its followers (and itself). Read requests can be served locally by whichever server the client is connected to.

RAFT



Raft is based on Lamport's Multi-Paxos idea, with some simplifications and restrictions: for example, the log must be continuous, and a node can only be in one of three states: leader, follower, or candidate. The algorithm is relatively easy to understand and implement, and Raft is now the consensus algorithm of choice for distributed system development.
Raft essentially reaches consensus on a series of values, and keeps each node's log consistent, through a strong-leader approach.
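
As a tiny illustration of the three states, a node's reaction to an election timeout and to a message carrying a higher term might be sketched as follows; the class and method names are illustrative, not a real Raft implementation.

// Minimal sketch of Raft role transitions driven by terms and timeouts.
class RaftNode {
    enum Role { FOLLOWER, CANDIDATE, LEADER }

    Role role = Role.FOLLOWER;
    long currentTerm = 0;
    String votedFor = null;

    // No heartbeat from a leader within the election timeout: start an election.
    void onElectionTimeout() {
        role = Role.CANDIDATE;
        currentTerm++;
        votedFor = "self";
        // send RequestVote RPCs to all peers; become LEADER on a majority of votes
    }

    // Any message carrying a higher term demotes this node to follower.
    void onHigherTerm(long term) {
        if (term > currentTerm) {
            currentTerm = term;
            role = Role.FOLLOWER;
            votedFor = null;
        }
    }
}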

GOSSIP



A gossip protocol or epidemic protocol is a procedure or process of computer peer-to-peer communication that is based on the way epidemics spread. Some distributed systems use peer-to-peer gossip to ensure that data is disseminated to all members of a group. Some ad-hoc networks have no central registry and the only way to spread common data is to rely on each member to pass it along to their neighbors.
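
A toy push-pull gossip round might look like the following sketch; the node IDs, the version map, and the fan-out are illustrative, and real implementations add failure detection and randomized scheduling.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy gossip round: each node periodically picks a few random peers and
// merges state maps, keeping the entry with the higher version per key.
class GossipNode {
    final String id;
    final Map<String, Long> versions = new HashMap<>(); // key -> version/heartbeat
    private final Random random = new Random();

    GossipNode(String id) { this.id = id; }

    // peers should not contain this node itself.
    void gossipOnce(List<GossipNode> peers, int fanout) {
        for (int i = 0; i < fanout && !peers.isEmpty(); i++) {
            GossipNode peer = peers.get(random.nextInt(peers.size()));
            peer.merge(versions);      // push our view
            this.merge(peer.versions); // pull theirs (push-pull gossip)
        }
    }

    synchronized void merge(Map<String, Long> remote) {
        remote.forEach((key, version) -> versions.merge(key, version, Math::max));
    }
}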

