We’ve just finished verifying and fixing the problem that aborted our recent upgrade attempt. We’ll be rescheduling the upgrade soon–stay tuned.
What we have done about the problem to date:
- We identified the failures as being collation botches (more details below)
- We identified the correct collations, through a combination of documentation, expert advice, and experiment
- We tried to identify a technique to pick up the migration where we’d left off, but couldn’t (there are details of the collation configuration we have to set at table creation time)
- We modified our tools to do the collation configuration correctly
- We test-migrated the entire database, both to ensure the process now works and to have a better understanding of how long it will take
With this testing, we’re confident we’ve nailed the problems. We’re also surprised at how long it’s going to take (the estimate is 14 hours). We’re working on scheduling that now.
For the curious:
The problem that bit us relates to the encoding of non-English languages. This upgrade includes an upgrade of our database, MySQL, from 4.0 to 5.0. There are lots of good things in the new version. One is greatly expanded capabilities in the encoding arena. The problem was that we didn’t fully understand how to drive these new features.
For the pathologically curious:
The issue is not merely “encoding,” but actually “collation”: how does the database decide whether one string (or character) comes before or after another, or if perhaps they’re actually the same character. In MySQL 4.0, there were well-defined “character sets,” which included some notion of collation, but not a very complete one. In MySQL 5.0, there are major improvements in collation handling, but at the cost of some pretty significant changes in configuration procedures and settings.
This upgrade involves dumping the old database and loading it into the new. Loading includes checking table constraints, like primary keys and uniqueness and indexing. If you get the collation wrong, this can produce the wrong result. In fact, if you get the collation wrong, it can tell you silly things like “You can’t add a user named ülf, we already have a user named ulf, and those two names are the same by the collation scheme in use.” Which is basically what happened.