Converting the charset for an entire schema

By now, we’re all aware that a utf8 charset and collation is the “right” thing to do, with most folks opting for utf8_general_ci or utf8_unicode_ci. However, lots of installations still default to some form of latin1, which is unfortunate. There are multiple ways to do this, all which functionally do the same thing. Be aware there is a big potential gotcha that you should be aware of.

Here’s the “safe” way:

First, the schema itself:

ALTER DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;

Then…

SELECT CONCAT("ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CHARACTER SET utf8 COLLATE utf8_general_ci;   ",
    "ALTER TABLE ",TABLE_SCHEMA,".",TABLE_NAME," CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;  ") 
    AS alter_sql
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = dbname ;

Capture this output to a file and then execute it.

If you’re feeling brave, you can do the second bit with a one-liner:

DB="dbname"; ( echo 'ALTER DATABASE `'"$DB"'` CHARACTER SET utf8 COLLATE utf8_general_ci;'; mysql "$DB" -e "SHOW TABLES" --batch --skip-column-names | xargs -I{} echo 'ALTER TABLE `'{}'` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;' ) | mysql "$DB"

Key length limitations in MySQL

Suppose I try to create a table with a primary key that’s varchar(500), and MySQL complains the key length is longer than the maximum of 1000 bytes. 500 is less than 1000. What’s happening? Plus, a tasty (yet apparently harmless) bug in MySQL.

Here’s a statement that will fail on most servers:

CREATE TABLE test(c VARCHAR(250), d VARCHAR(250), PRIMARY KEY(c,d));
ERROR 1071 (42000): Specified key was too long; max key LENGTH IS 1000 bytes

Why does it fail? Simple; my default character set is multi-byte:

show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

While most characters will fit in one or two bytes, the utf8 encoding of Unicode, as implemented by MySQL can require up to 3 bytes per character, so MySQL must be pessimistic and assume the worst-case scenario of every character requiring 3 bytes. It’s easy to see this by trying to create a table with a single VARCHAR(334) primary key. It will fail, but VARCHAR(333) will succeed, because 3 * 333 is less than 1000 bytes.

Here’s a fun bug (bug #18927):

mysql> CREATE TABLE test(c VARCHAR(250), d VARCHAR(250),PRIMARY KEY(c,d));
ERROR 1071 (42000): Specified key was too long; max key LENGTH IS 1000 bytes
mysql> CREATE TABLE test(c VARCHAR(334), d VARCHAR(334), PRIMARY KEY(c,d));
ERROR 1071 (42000): Specified key was too long; max key LENGTH IS 999 bytes

Sometimes it says 999, sometimes 1000. I have no idea why. Rounding, maybe?

It may be a good idea to check the default character set to get the best performance out of a database. If there’s no need for multi-byte encodings, switching to a single-byte encoding might be significantly more efficient. It’s especially important to keep indexes as small as possible.

Rename a schema (sort of)

To rename a schema (emulating the now-dropped RENAME DATABASE functionality):

 SELECT concat('RENAME TABLE oldSchema.',table_name, ' TO newSchema.',table_name, ';')
 FROM information_schema.TABLES
 WHERE table_schema='oldSchema';

Execute the resultant sql, then drop the now-empty old schema.

 

Tags

Related Posts

Share This

MySQL’s damned TIMESTAMPs

MySQL is SO irritating in its timestamp limitations, in that you can only have one default timestamp column. (Update 6/12/2013: in version 5.6.x, this restriction will be lifted, so I hear)

However, you *can* trick MySQL like this.

You can use only one of the definitions in one table. Create both timestamp columns like so:

create table test_table( 
  id integer not null auto_increment primary key, 
  stamp_created timestamp default '0000-00-00 00:00:00', 
  stamp_updated timestamp default now() on update now() 
); 

Note that it is necessary to enter null into both columns during insert:

mysql> insert into test_table(stamp_created, stamp_updated) values(null, null); 
Query OK, 1 row affected (0.06 sec)

mysql> select * from t5; 
+----+---------------------+---------------------+ 
| id | stamp_created       | stamp_updated       |
+----+---------------------+---------------------+
|  2 | 2009-04-30 09:44:35 | 2009-04-30 09:44:35 |
+----+---------------------+---------------------+
2 rows in set (0.00 sec)  

mysql> update test_table set id = 3 where id = 2; 
Query OK, 1 row affected (0.05 sec) Rows matched: 1  Changed: 1  Warnings: 0  

mysql> select * from test_table;
+----+---------------------+---------------------+
| id | stamp_created       | stamp_updated       | 
+----+---------------------+---------------------+ 
|  3 | 2009-04-30 09:44:35 | 2009-04-30 09:46:59 | 
+----+---------------------+---------------------+ 
2 rows in set (0.00 sec)  

Tags

Related Posts

Share This

Sizing and report scripts

Various ways to get sizing, counts, etc.

-- Total rows and size for an entire server, excluding system schemata, ordered by size DESC 
SELECT table_schema,table_name,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 * 1024),2), 'G') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 * 1024), 2),'G') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 * 1024), 2),'G') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
 WHERE table_schema NOT IN ('mysql','performance_schema','information_schema') 
 GROUP BY table_schema,table_name
 ORDER BY SUM(data_length + index_length) DESC;
-- Total rows and size for an entire server, excluding system schemata, ordered by schema and table name 
 SELECT table_schema,table_name,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 * 1024),2), 'G') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 * 1024), 2),'G') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 * 1024), 2),'G') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
 WHERE table_schema NOT IN ('mysql','performance_schema','information_schema') 
 GROUP BY table_schema,table_name
 ORDER BY table_schema,table_name;
-- Total rows and size
 SELECT COUNT(*) TABLES,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 * 1024),2), 'G') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 * 1024), 2),'G') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 * 1024), 2),'G') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
-- Database wise report
 SELECT COUNT(*) TABLES, table_schema,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 * 1024),2), 'G') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 * 1024), 2),'G') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 * 1024), 2),'G') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
 GROUP BY table_schema
 ORDER BY SUM(data_length + index_length) DESC
-- Database wise report in MB
 SELECT COUNT(*) TABLES, table_schema,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 ),2), 'M') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 ), 2),'M') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 ), 2),'M') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
 GROUP BY table_schema
 ORDER BY SUM(data_length + index_length) DESC
-- Engine wise breakup
 SELECT ENGINE, COUNT(*) TABLES,
 CONCAT(ROUND(SUM(table_rows) / 1000000,2),'M') rows,
 CONCAT(ROUND(SUM(data_length) / (1024 * 1024 * 1024),2), 'G') DATA,
 CONCAT(ROUND(SUM(index_length) / (1024 * 1024 * 1024), 2),'G') idx,
 CONCAT(ROUND(SUM(data_length + index_length) / (1024 * 1024 * 1024), 2),'G') total_size,
 ROUND(SUM(index_length) / SUM(data_length),2) idxfrac
 FROM information_schema.TABLES
 GROUP BY engine ORDER BY sum(data_length+index_length) DESC
-- Top 30 tables
 SELECT table_schema, table_name, engine, table_rows,
 CONCAT(ROUND(data_length / (1024 * 1024),2),'MB') AS DATA,
 CONCAT(ROUND((data_length + index_length) / (1024 * 1024), 2),'MB') AS total_size
 FROM INFORMATION_SCHEMA.TABLES
 ORDER BY data_length DESC LIMIT 30
-- List of Full Text Indexes
 SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY, NULLABLE
 FROM statistics
 WHERE index_type LIKE 'FULLTEXT%' ORDER BY TABLE_SCHEMA, TABLE_NAME
-- The number of columns for each datatype
 SELECT DATA_TYPE, COUNT(*) AS mycount
 FROM `COLUMNS`
 WHERE TABLE_SCHEMA NOT IN ('information_schema','mysql')
 GROUP BY DATA_TYPE ORDER BY mycount DESC
-- The number of columns for each datatype with size (replace my_database)
 SELECT TABLE_SCHEMA, COLUMN_TYPE, count(*) FROM `COLUMNS`
 GROUP BY COLUMN_TYPE HAVING TABLE_SCHEMA = 'my_database'
-- Blob, Float and Double Data types details
 SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION, COLUMN_DEFAULT,
 IS_NULLABLE, NUMERIC_PRECISION, COLUMN_TYPE, COLUMN_KEY
 FROM COLUMNS
 WHERE DATA_TYPE IN ('FLOAT','BLOB','DOUBLE')
 AND TABLE_SCHEMA != 'mysql'
 ORDER BY DATA_TYPE, COLUMN_TYPE
--All tables with no primary key
use INFORMATION_SCHEMA;
 select CONCAT(t.table_schema,".",t.table_name) as tbl
 from INFORMATION_SCHEMA.TABLES AS t
 LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS c
 ON (t.TABLE_NAME=c.TABLE_NAME
 AND c.CONSTRAINT_SCHEMA=t.TABLE_SCHEMA
 AND constraint_name='PRIMARY')
 WHERE t.table_schema!="information_schema"
 AND constraint_name IS NULL;
-- All tables and their primary keys, if exist:
use INFORMATION_SCHEMA;
 select CONCAT(t.table_schema,".",t.table_name) as tbl,
 c.column_name,c.constraint_name
 from INFORMATION_SCHEMA.TABLES AS t
 LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS c
 ON (t.TABLE_NAME=c.TABLE_NAME
 AND c.CONSTRAINT_SCHEMA=t.TABLE_SCHEMA
 AND constraint_name='PRIMARY')
 WHERE t.table_schema!="information_schema"
 order by constraint_name;
-- Foreign Key constraints for Database dbName
SELECT A.TABLE_SCHEMA AS FKTABLE_SCHEM, A.TABLE_NAME AS FKTABLE_NAME, A.COLUMN_NAME AS FKCOLUMN_NAME,
 A.REFERENCED_TABLE_SCHEMA AS PKTABLE_SCHEM, A.REFERENCED_TABLE_NAME AS PKTABLE_NAME,
 A.REFERENCED_COLUMN_NAME AS PKCOLUMN_NAME, A.CONSTRAINT_NAME AS FK_NAME
 FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE A, INFORMATION_SCHEMA.TABLE_CONSTRAINTS B
 WHERE A.TABLE_SCHEMA = B.TABLE_SCHEMA AND A.TABLE_NAME = B.TABLE_NAME
 AND A.CONSTRAINT_NAME = B.CONSTRAINT_NAME AND B.CONSTRAINT_TYPE IS NOT NULL
 HAVING PKTABLE_SCHEM IS NOT NULL
 and A.TABLE_SCHEMA = 'dbName'
 ORDER BY A.TABLE_SCHEMA, A.TABLE_NAME, A.ORDINAL_POSITION limit 1000

 

 

Tags

Related Posts

Share This

Disabling foreign key checks

It sucks, but sometimes ya gotta.

You can’t DROP multiple tables, a parent table or a child table until you disable foreign key checks four your current database. Or for that matter, TRUNC the table, either.

To disable/enable foreign key checks, use the following syntax. (Don’t ever disable @@GLOBAL, unless you’re really irritated and have lost your mind)

 SET @@SESSION.FOREIGN_KEY_CHECKS = 'OFF';
 SET @@SESSION.FOREIGN_KEY_CHECKS = 'ON';

 

Tags

Related Posts

Share This