September 28th, 2010
01:42 pm - A better MySQL CHECKSUM TABLE, fixing Bug#39474
The current MySQL table checksum is very simple: it's basically the same as CRC32(CONCAT(all data in the table)).
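To make the CRC32(CONCAT(...)) description concrete, here is a minimal Python sketch of that simplified model. It is not MySQL's actual implementation: the function name concat_checksum and the sample rows are made up for illustration. It shows one consequence of plain concatenation, namely that field boundaries are lost, so two different rows can yield the same checksum.

```python
import zlib

def concat_checksum(rows):
    """CRC32 over every field of every row, fed in order.

    Feeding the fields one by one with a running CRC gives the same
    result as CRC32 over their concatenation.
    """
    crc = 0
    for row in rows:
        for field in row:
            crc = zlib.crc32(field, crc)
    return crc

# Two hypothetical single-row tables with different column contents.
table_a = [(b"ab", b"c")]
table_b = [(b"a", b"bc")]

# Both concatenate to b"abc", so the checksums cannot tell them apart.
boundary_blind = concat_checksum(table_a) == concat_checksum(table_b)
print(boundary_blind)
```

Prefixing each field with its length before hashing is one common way to avoid this particular ambiguity.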
Customers have complained about the algorithm for a long time, but one doesn't change such things every day.
Now the time has come to make the change.
The only question is how much the checksum formula should change. Is it sufficient to just fix Bug#39474, or should we take the opportunity to do more?
Your input is much appreciated!
- Is CRC32 a good enough function for a checksum? Should we start using some other hash function?
- Should the checksum change when table metadata changes, for example when you change the data type of a column? What about changing the table comment, or the order of columns in the table?
- Any other issues we should address along the way?
Mental note: make sure the algorithm is consistent with CHECKSUM TABLE QUICK, which is currently only available for MyISAM.
Is it OK if different CHECKSUM variants produce different results? Perhaps it is.
Another mental note: check whether we need to take pieces of Monty's implementation as described in Bug#37007.
pz suggests using a 64-bit checksum.
mats says no immediate issues with replication (rpl) are known.
Date: September 29th, 2010 12:17 pm (UTC)
Replication issues and future
Changing the checksum algorithm will not per se cause any replication problems, but it would be good if we could distinguish between a checksum for the definition and one for the data. Changing the name of a field, for example, might be important for some applications but not for others.
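A rough Python sketch of what separate definition and data checksums could look like follows. Everything in it is an assumption made for illustration: the function names, the (name, type) representation of a table definition, and the length-prefixed CRC32 encoding are not taken from MySQL.

```python
import zlib

def _feed(crc, data):
    """Add one length-prefixed byte string to a running CRC32."""
    crc = zlib.crc32(len(data).to_bytes(4, "big"), crc)
    return zlib.crc32(data, crc)

def definition_checksum(columns):
    """Checksum of the table definition: column names and types only."""
    crc = 0
    for name, sql_type in columns:
        crc = _feed(crc, name.encode("utf-8"))
        crc = _feed(crc, sql_type.encode("utf-8"))
    return crc

def data_checksum(rows):
    """Checksum of the row contents only, ignoring column names."""
    crc = 0
    for row in rows:
        for field in row:
            crc = _feed(crc, field)
    return crc

# Hypothetical table before and after renaming the second column.
columns_before = [("id", "INT"), ("name", "VARCHAR(20)")]
columns_after = [("id", "INT"), ("nick", "VARCHAR(20)")]
rows = [(b"1", b"alice"), (b"2", b"bob")]

definition_changed = (
    definition_checksum(columns_before) != definition_checksum(columns_after)
)
data_unchanged = data_checksum(rows) == data_checksum(list(rows))
print(definition_changed, data_unchanged)
```

An application that cares about field names would compare both values, while one that only cares about content would compare the data checksum alone.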
Also, it would be good if a checksum algorithm could be maintained incrementally: that would allow quick checking of consistency between tables on master and slave, even if they are huge. This basically requires a linear checksum function.
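One way to read "linear" here is that the table checksum is a sum of independent per-row hashes, so a row change can be applied to the checksum without rescanning the table. The Python sketch below assumes that reading. The class and function names, the 64-bit accumulator, and the use of CRC32 as the per-row hash are all illustrative choices, not a description of MySQL.

```python
import zlib

MASK64 = (1 << 64) - 1  # assumed 64-bit accumulator

def row_hash(row):
    """Per-row hash: CRC32 over length-prefixed fields."""
    crc = 0
    for field in row:
        crc = zlib.crc32(len(field).to_bytes(4, "big"), crc)
        crc = zlib.crc32(field, crc)
    return crc

class IncrementalChecksum:
    """Table checksum kept as the sum of row hashes modulo 2**64."""

    def __init__(self):
        self.value = 0

    def insert(self, row):
        self.value = (self.value + row_hash(row)) & MASK64

    def delete(self, row):
        self.value = (self.value - row_hash(row)) & MASK64

    def update(self, old_row, new_row):
        self.delete(old_row)
        self.insert(new_row)

def full_scan_checksum(rows):
    """Recompute the same checksum from scratch, in any row order."""
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & MASK64
    return total

# Maintain the checksum while the table changes.
maintained = IncrementalChecksum()
maintained.insert((b"1", b"alice"))
maintained.insert((b"2", b"bob"))
maintained.insert((b"3", b"carol"))
maintained.update((b"2", b"bob"), (b"2", b"robert"))
maintained.delete((b"1", b"alice"))

# A replica that scans its final rows in a different order agrees.
replica_rows = [(b"3", b"carol"), (b"2", b"robert")]
in_sync = maintained.value == full_scan_checksum(replica_rows)
print(in_sync)
```

The price of this design is that the sum does not depend on row order, and two identical rows cancel out if XOR is used instead of addition, so the choice of combining operation matters.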