How to Understand Galera
Alexey Yurchenko
Codership Oy

Agenda
● The difference between traditional (e.g. MySQL) replication and Galera.
● General Galera principles.

Galera Difference

Traditional Replication Approach
Server-centric: “One server streams data to another”
Server 1 (“master”) → replication stream → Server 2 (“slave”)

Traditional Replication Approach
Cool topologies!
[Figure: a replication topology of numbered servers]

Traditional Replication Approach
But there are questions:
● If node C crashes, do we still have a cluster?
● If node B crashes and clients fail over to C, how does B rejoin?
● Which node has data X?
● How do we back up the cluster?
[Figure: nodes A, B and C]

Galera (wsrep API) Approach
Data-centric:
There is some Data (a dataset), and it is synchronized between one or more servers:
Server 1, Server 2, Server 3, ... Server N

Galera (wsrep API) Approach
Data-centric:
Data does not belong to a Node.
Node belongs to Data.

Galera (wsrep API) Approach
Data-centric:
There is some Data (a dataset).
The dataset needs an ID: 00295a79-9c48-11e2-bdf0-9a916cbb9294

Galera (wsrep API) Approach
Data-centric:
Dataset ID == Cluster ID
[Figure: Data / Cluster containing four Servers, labeled 00295a79-9c48-11e2-bdf0-9a916cbb9294]

Galera (wsrep API) Approach
Data-centric:
Cluster A (9b0e4ad8-a42b-11e2-bdf0-9851bb738edb)
Cluster B (e1eedd79-a5d6-11e2-a801-aae16e1f5226)
[Figure: nodes A, C and B shown with the two clusters]

Global Transaction ID (GTID)
Dataset + sequence of atomic changes:
... 64198 64199 64200 64201 64202 64203 64204 64205 64206 64207 ...
= 00295a79-9c48-11e2-bdf0-9a916cbb9294:64201

Global Transaction ID (GTID)
00295a79-9c48-11e2-bdf0-9a916cbb9294:64201
[Figure labels: “Global Transaction ID”, “Dataset State ID”]

Global Transaction ID (GTID)
00295a79-9c48-11e2-bdf0-9a916cbb9294:0 - initial data
00295a79-9c48-11e2-bdf0-9a916cbb9294:1 - first change/transaction
00000000-0000-0000-0000-000000000000:-1 - undefined GTID

Global Transaction ID (GTID)
Galera GTID: 00295a79-9c48-11e2-bdf0-9a916cbb9294:64201
● 00295a79-9c48-11e2-bdf0-9a916cbb9294 is the Cluster ID
● 64201 is a data change in the cluster
MySQL 5.6 GTID: 8182213e-7c1e-11e2-a6e2-080027635ef5:12345
● 8182213e-7c1e-11e2-a6e2-080027635ef5 is the Server ID
● 12345 is a transaction processed by the server

Global Transaction ID (GTID)
What we see in MySQL 5.6:
8182213e-7c1e-11e2-a6e2-080027635ef5:12345
8182213e-7c1e-11e2-a6e2-080027635ef5:12346
8182213e-7c1e-11e2-a6e2-080027635ef5:12347
← new master promoted →
f4e3bf7a-a91f-11e2-4e02-3f8dbcffaed8:1
f4e3bf7a-a91f-11e2-4e02-3f8dbcffaed8:2
f4e3bf7a-a91f-11e2-4e02-3f8dbcffaed8:3

Global Transaction ID (GTID)
What we see in Galera:
00295a79-9c48-11e2-bdf0-9a916cbb9294:64201
00295a79-9c48-11e2-bdf0-9a916cbb9294:64202
00295a79-9c48-11e2-bdf0-9a916cbb9294:64203
← new master promoted →
00295a79-9c48-11e2-bdf0-9a916cbb9294:64204
00295a79-9c48-11e2-bdf0-9a916cbb9294:64205
00295a79-9c48-11e2-bdf0-9a916cbb9294:64206

Global Transaction ID (GTID)
1) Galera nodes are ANONYMOUS => all equal.
2) Galera cluster == one big distributed “master”.
3) Asynchronous replication to/from a Galera cluster is now a piece of cake.
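The GTID format on these slides (dataset UUID, a colon, then a sequence number) can be illustrated with a small parser. This is only a sketch: the `Gtid` class and its method names are hypothetical and are not part of Galera or the wsrep API. It assumes that two GTIDs are comparable only when they carry the same dataset UUID.

```python
from dataclasses import dataclass
from uuid import UUID

# The all-zero UUID with seqno -1 is the "undefined GTID" from the slides.
UNDEFINED_UUID = UUID(int=0)


@dataclass(frozen=True)
class Gtid:
    """Hypothetical helper: a dataset (cluster) UUID plus a sequence number."""
    dataset_id: UUID
    seqno: int

    @classmethod
    def parse(cls, text: str) -> "Gtid":
        # Split on the LAST colon so that a negative seqno such as -1 works.
        uuid_part, _, seqno_part = text.rpartition(":")
        return cls(UUID(uuid_part), int(seqno_part))

    def is_undefined(self) -> bool:
        return self.dataset_id == UNDEFINED_UUID and self.seqno == -1

    def changes_behind(self, other: "Gtid") -> int:
        """How many atomic changes separate self from other in one dataset."""
        if self.dataset_id != other.dataset_id:
            raise ValueError("GTIDs belong to different datasets")
        return other.seqno - self.seqno


if __name__ == "__main__":
    before = Gtid.parse("00295a79-9c48-11e2-bdf0-9a916cbb9294:64203")
    after = Gtid.parse("00295a79-9c48-11e2-bdf0-9a916cbb9294:64206")
    print(before.changes_behind(after))
```

Because the UUID names the dataset and not a server, the comparison keeps working after a different node starts taking writes, which is the point of the “new master promoted” slide.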
Global Transaction ID (GTID)
ASYNCHRONOUS
[Figure: two clusters, UUID1 and UUID2, linked by asynchronous replication; position labels UUID1:123, UUID2:539 and UUID1:123, UUID2:543]

Global Transaction ID (GTID)
SYNCHRONOUS / ASYNCHRONOUS
<==> SINGLE DATABASE / INDEPENDENT DATABASES
<==> CONSISTENCY / INCONSISTENCY

Master / Slave ?
[Figure: client1 and client2 connect to different nodes of one cluster. The node serving client 1 is labeled “master for client 1”, the node serving client 2 is labeled “master for client 2”; the remaining nodes are labeled as slaves.]

Master / Slave ?
1. Not a node role/function.
2. Is a relation between a node and a client.

Galera Cluster as a Meeting
00295a79-9c48-11e2-bdf0-9a916cbb9294

Galera Cluster as a Meeting
00295a79-9c48-11e2-0800-9a916cbb9294

Galera Cluster as a Meeting
???
00295a79-9c48-11e2-0800-9a916cbb9294

Galera Cluster as a Meeting
New meeting!
e1eedd79-a5d6-11e2-0800-a8e16e1f5226

wsrep_cluster_address
[Figure: nodes 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4 and 10.0.0.6, and node 10.0.0.5 marked “?”]

wsrep_cluster_address
[Figure: node 10.0.0.5 is given gcomm://10.0.0.2 and performs a handshake with 10.0.0.2]

wsrep_cluster_address
[Figure: all six nodes, 10.0.0.1 to 10.0.0.6]

wsrep_cluster_address
[Figure: node 10.0.0.5 alone, marked “?”: “Nothing here”]

wsrep_cluster_address
wsrep_cluster_address = gcomm://node1,node2
=> try to connect to members: node1, node2
wsrep_cluster_address = gcomm://
=> no members in the cluster, you are the first one. Start a new cluster.
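The two cases on the last wsrep_cluster_address slide can be sketched as a small decision function. This is an illustration, not Galera code: the function names are hypothetical, and the sketch assumes a plain comma-separated host list with no ports or URL options after the gcomm:// scheme.

```python
from typing import List

SCHEME = "gcomm://"


def cluster_members(address: str) -> List[str]:
    """Return the member list named in a gcomm:// address (may be empty)."""
    if not address.startswith(SCHEME):
        raise ValueError("expected an address starting with " + SCHEME)
    hosts = address[len(SCHEME):]
    # An empty host part yields an empty list: nobody to connect to.
    return [h.strip() for h in hosts.split(",") if h.strip()]


def startup_action(address: str) -> str:
    """Mirror the slide: join listed members, or start a new cluster."""
    members = cluster_members(address)
    if members:
        return "join: try to connect to " + ", ".join(members)
    return "bootstrap: no members listed, start a new cluster"


if __name__ == "__main__":
    print(startup_action("gcomm://node1,node2"))
    print(startup_action("gcomm://"))
```

The address only says where to look for an existing cluster; as in the handshake slide, one reachable member is enough for the new node to learn about the rest.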
Node Synchronization (State Transfer)
[Figure: cluster of nodes 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5 and 10.0.0.6]

Node Synchronization (State Transfer)
[Figure: nodes labeled with their states: JOINED, SYNCED, SYNCED, UNDEFINED, DESYNC, SYNCED]

Node Synchronization (State Transfer)
New node / Cluster / Old node
1. New node (UNDEFINED) to cluster: “Here's what I've got, give me what I'm missing.” The old node is SYNCED.
2. Cluster to new node: “Found a donor for you, please hold.” The new node becomes JOINER.
3. Cluster to old node: “You be the donor.” The old node becomes DONOR.
4. Old node to new node (through private channel): “Here's your stuff.”
5. Both nodes become JOINED and catch up.
6. Both nodes become SYNCED.

Primary Component
PRIMARY

Primary Component
PRIMARY
NON-PRIMARY

Primary Component
PRIMARY keeps on working
NON-PRIMARY tries to reconnect

Primary Component
PRIMARY

Primary Component
SPLIT-BRAIN!
NON-PRIMARY
NON-PRIMARY

2-node Cluster and Split Brain
[Figure: clients connect through a VIP]
wsrep_provider_options="pc.ignore_sb"

2-node Cluster and Split Brain
Galera replication can be used in every manner that traditional asynchronous master-slave replication is.
It implements a SUPERSET of traditional replication functionality.

How Synchronous is Galera?
1. Synchronous penalty.
2. Slave lag.

Galera Synchronous Penalty?
The only thing Galera does synchronously is copy the data buffer to all cluster members when the client issues COMMIT.
=> ~1 RTT added latency

Galera Synchronous Penalty?
~1 RTT added latency
=> Connection throughput = 1/RTT trx/sec
=> Total throughput = 1/RTT trx/sec ✕ #connections

Galera Synchronous Penalty in WAN (EC2)
[Figure: 95% latencies for a sysbench client at us-east and a sysbench client at eu-west, in two setups]
● Both clients connect to a standalone server in the us-east region.
● 2-node Galera cluster: the us-east client connects to the us-east node, the eu-west client connects to the eu-west node.

Galera Synchronous Penalty in WAN (EC2)
[Figure: throughput for a sysbench client at us-east and a sysbench client at eu-west, in two setups]
● Both clients connect to a standalone server in the us-east region.
● 2-node Galera cluster: the us-east client connects to the us-east node, the eu-west client connects to the eu-west node.

Galera Synchronous Penalty?
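Still:
CALLAGHAN'S LAW: A given row can't be modified more often than 1/RTT times a second (discovered by Mark Callaghan).

Slave lag in Galera?
Client / Master node / Slave node
1. Client: START TRANSACTION. The master node processes the transaction.
2. Client: COMMIT. The master node replicates the changeset to the slave node.
3. The master node commits and returns OK to the client, while the slave node is still applying the changeset.
4. Client: a SELECT on the slave node can return stale data until the slave node commits.

Questions?
Thank you for listening!
Happy Clustering :-)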
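As a worked example of the 1/RTT estimates from the synchronous-penalty slides and Callaghan's Law, the arithmetic can be written out as below. The function names, the 100 ms round-trip time and the 32 connections are made-up illustration values, not measurements from the talk.

```python
def trx_per_sec_per_connection(rtt_ms: float) -> float:
    """Upper bound for one connection: one commit per round trip."""
    return 1000.0 / rtt_ms


def total_trx_per_sec(rtt_ms: float, connections: int) -> float:
    """Connections commit independently, so the bounds add up."""
    return connections * 1000.0 / rtt_ms


def max_updates_per_row_per_sec(rtt_ms: float) -> float:
    """Callaghan's Law: a single row cannot change faster than 1/RTT."""
    return 1000.0 / rtt_ms


if __name__ == "__main__":
    rtt_ms = 100.0      # hypothetical WAN round-trip time
    connections = 32    # hypothetical number of client connections
    print(trx_per_sec_per_connection(rtt_ms))      # 10.0 trx/sec
    print(total_trx_per_sec(rtt_ms, connections))  # 320.0 trx/sec
    print(max_updates_per_row_per_sec(rtt_ms))     # 10.0 updates/sec
```

Adding connections raises total throughput, but it does not raise the per-row limit: updates to the same row still have to be ordered one round trip at a time.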