Converting the charset for an entire schema

By now, we’re all aware that a utf8 charset and collation is the “right” thing to do, with most folks opting for utf8_general_ci or utf8_unicode_ci. However, lots of installations still default to some form of latin1, which is unfortunate. There are multiple ways to do this, all which functionally do the same thing. Be aware there is a big potential gotcha that you should be aware of.

Here’s the “safe” way:

First, the schema itself:

ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;

Then…

SELECT CONCAT("ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CHARACTER SET utf8 COLLATE utf8_general_ci;   ",
    "ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ") 
    AS alter_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = dbname ;

Capture this output to a file and then execute it.

If you’re feeling brave, you can do the second bit with a one-liner:

DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"