OK, let me start by saying that a table without a primary key shouldn't be something a DBA ever stumbles into. Sure, InnoDB will secretly add one for us without telling us - but that hidden key can't be used at the application or administration level; it's just there to keep the engine happy.

So let's suppose you find some tables that lack a primary key - of course you need to do something about it, right? Now put Galera Cluster into the mix: Galera does not support tables without a primary key and will, again secretly, let such a table become inconsistent at the cluster level.
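Before anything else you need to find the offenders. A quick way to list tables that lack a primary key is to query information_schema (a generic query, not taken from the original post):

```sql
-- List base tables that have no PRIMARY KEY constraint,
-- skipping the system schemas.
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON  c.table_schema    = t.table_schema
       AND c.table_name      = t.table_name
       AND c.constraint_type = 'PRIMARY KEY'
WHERE c.constraint_name IS NULL
  AND t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys');
```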

You need to fix the damn table(s)! And this is where the fun begins... you can't afford downtime for the operation, so you need to resort to an online schema change of some kind.

Galera Cluster is a very nice HA solution and personally I have embraced it for a few years now. I really love it, especially its ability to self-heal when something happens (and when you're in the cloud, there's always something happening). Galera is very different from classic MySQL replication, and the biggest difference is that it replicates synchronously instead of asynchronously. I won't get into the details of how Galera works here - there are plenty of blog posts and very good documentation online. But there is something important you need to be aware of when running Galera: in the standard operating mode of the cluster, any DDL you execute will freeze the entire cluster for the whole duration of the statement. This is called "Total Order Isolation" (TOI) and it is the default setting for the cluster.
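You can check which mode a node is in with a one-liner (a generic statement, not from the original post):

```sql
-- TOI is the default: every DDL is replicated in total order and
-- blocks the whole cluster while it runs.
SHOW GLOBAL VARIABLES LIKE 'wsrep_OSU_method';
```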

Guess what. Months after migrating one of our clusters from a classic master-master topology to Galera Cluster, I suddenly found myself needing to clean up some old and barely used schemas, which contained quite a few tables lacking a primary key (or any index, for that matter). My bad - I had spent a lot of time checking the main schemas and tables and making sure they were optimized for Galera, but I left the unimportant ones behind and never went back to check on them.

So here I am with my apparently simple (although boring) task: add a primary key to a set of tables, without downtime and without blocking the cluster as a whole in the process.

Preliminary checks with the devs confirmed that all INSERTs going into the tables I needed to fix used an explicit field list, so the change was compatible (it is important to remember to check this, in order to avoid application failures).
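To see why this matters: an INSERT with an explicit field list keeps working after a column is added, while a positional INSERT breaks as soon as the column count changes (hypothetical table and column names):

```sql
-- Safe: unaffected by a new auto-populated column.
INSERT INTO myschema.mytable (code, name) VALUES ('ABC', 'test');

-- Unsafe: fails once the table gains an extra column,
-- because the value count no longer matches the column count.
-- INSERT INTO myschema.mytable VALUES ('ABC', 'test');
```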

So, the first thing I tried was simply adding the PK using the great pt-online-schema-change tool from the Percona Toolkit, which supports Galera Cluster just fine. Unfortunately, an error message quickly reminded me that the tool cannot operate on a table that lacks a primary key or unique index, unless the column you want as the primary key is already present in the table - which was not my case. D'oh!

Failing that, I thought about running the ALTER statement directly, without an online schema change tool. Of course, this would have to be done on all nodes, one node at a time, after changing the cluster's wsrep_OSU_method global variable to 'RSU'. RSU stands for "Rolling Schema Upgrade" - more info here. In this mode, any DDL you execute applies only to the current node and is not propagated to the others; the node desynchronizes from the cluster for the duration of the operation and catches up afterwards, but again, the DDL does NOT replicate.

So you have to run it on each node to complete the change. Boring, and not very practical, right? It may be fine for adding an index to a write-heavy critical table, but certainly not for changing a table definition... especially if you have many tables to change! But I was prepared to do it at any cost (I felt guilty for not checking this before the migration), so I went to one of the nodes, changed the OSU method to RSU, and executed the following DDL:
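The statement itself is missing from this copy of the post. Judging from the error log quoted further down (column 0 converted to int(11) unsigned) and from the later note about the FIRST clause, it was along these lines - table and column names are hypothetical:

```sql
-- Run on one node only, in RSU mode (not replicated to the cluster).
SET GLOBAL wsrep_OSU_method = 'RSU';

-- Hypothetical reconstruction of the original DDL: add an
-- auto-increment PK as the FIRST column of the table.
ALTER TABLE myschema.mytable
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```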


The operation completed OK, but shortly thereafter I lost the connection to the node (the infamous MySQL has gone away...). WTF?!? I checked, and the mysqld process was dead. While losing one node is not a big deal in our 3-node setup, I did not expect the node to die as a consequence of a DDL. But a quick look at the error log and I immediately realized what the problem was:

2017-01-10 13:53:06 16667 [ERROR] Slave SQL: Column 0 of table 'myschema.mytable' cannot be converted from type 'char(3)' to type 'int(11) unsigned', Error_code: 1677
2017-01-10 13:53:06 16667 [ERROR] WSREP: Failed to apply trx 760719957 4 times
2017-01-10 13:53:06 16667 [ERROR] WSREP: Node consistency compromized, aborting...

What had just happened? Galera replicates between nodes using ROW based replication - the row data gets replicated, instead of the statement that inserted or updated it. This is done for many good reasons (compatibility with non-deterministic statements, for example), but in my situation it was biting me hard: a row inserted by the application on one of the other cluster nodes reached my node, where the table structure had been altered; the local node found that the first field of the row had a different type, and the apply failed as a result. To preserve consistency, after a few unsuccessful retries, Galera aborted the MySQL server on the node.

Epic fail....

But I still needed to fix the damn tables. While the node was being (automatically) rebuilt, I spent some time thinking about a way to complete the task. Replication had failed because the first column of the modified table had a different type than the first column of the table on the other nodes. Hmmm, what about moving the new PK column to the end of the table instead of the top, I thought? Ugly for sure, but I was curious to see whether replication would work correctly in this case. In theory, columns are matched by position in RBR, so replication should not complain after all.

Pronto, as soon as the node came back into service I set it to RSU mode once again and executed my DDL again:
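Again, the statement is missing from this copy; with the FIRST clause dropped it would look roughly like this (same hypothetical names as before):

```sql
-- Hypothetical reconstruction: without FIRST, the new column lands
-- at the end of the table, so the positions of the existing columns
-- stay unchanged for row-based replication.
ALTER TABLE myschema.mytable
  ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
```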


Note that the only difference is that this time I omitted the FIRST clause. This causes the new column to be added at the bottom of the table, as the last column.

After executing the DDL I anxiously watched the table, waiting for a replicated row to arrive... and I was really happy to see that rows were now coming in without problems. Yay!

But wait, how could I now repeat these steps on the other nodes in a consistent way? The new shiny primary key was an auto_increment one (it had to be, because the application couldn't populate it for me - it didn't even know the column existed), but if I repeated the above DDL on the other nodes, each node would backfill different PK values into the existing rows - Galera Cluster assigns each node a distinct auto-increment offset precisely so that concurrently generated values never collide.
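This behaviour comes from Galera's automatic management of the auto-increment settings, which you can inspect on each node (generic statements, not from the original post):

```sql
-- With wsrep_auto_increment_control = ON (the default), Galera sets
-- auto_increment_increment to the cluster size and gives every node
-- a distinct auto_increment_offset - so running the same ALTER
-- independently on each node generates different id values per node.
SHOW GLOBAL VARIABLES LIKE 'wsrep_auto_increment_control';
SHOW GLOBAL VARIABLES LIKE 'auto_increment%';
```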


Article title: Galera Cluster: adding a primary key to a MySQL table which lacks it... without ...
