Sources: mostly here and not so much here.
Something about MySQL being swedish and having a default character set of latin1 and a collation of latin1_swedish_ci which is no good if you need to save utf8 data.
What is Collation?
The collation determines how results are sorted and ordered. It may be that an application requires a certain collation. You must ensure your database has the correct collation for the application it will be used with.
Collation in MySQL can be complicated because you can have a separate collation set at:
- The database level
- The table level
- The column level
Additionally, the information inside a column may be encoded incorrectly as well – causing the data in that column to be displayed incorrectly.
Dealing with Foreign Key constraints
It may be necessary to ignore foreign key constraints when making changes to a large number of columns. You can use the SET FOREIGN_KEY_CHECKS
command to ignore foreign key constraints while you update the database.
SET FOREIGN_KEY_CHECKS=0;
-- Insert your other SQL Queries here...
SET FOREIGN_KEY_CHECKS=1;
Changing the database collation
In the example below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To change the database collation:
ALTER DATABASE <yourDB> CHARACTER SET <charset> COLLATE <collation>
Changing table collation
The following query will produce a series of ALTER TABLE
statements, which you must then run against your database.
In the example below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To change table collation:
SELECT CONCAT('ALTER TABLE `', table_name, '` CHARACTER SET <charset> COLLATE <collation>;')
FROM information_schema.TABLES AS T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` AS C
WHERE C.collation_name = T.table_collation
AND T.table_schema = '<yourDB>'
AND
(
C.CHARACTER_SET_NAME != '<charset>'
OR
C.COLLATION_NAME != '<collation>'
);
Changing column collation
The following queries (one for varchar
columns, and one for non-varchar
columns) will produce a series of ALTER TABLE
statements, which you must then run against your database.
In the examples below, change:
<yourDB>
to your actual database name<charset>
to eitherutf8
orutf8mb4
<collation>
to eitherutf8_bin
orutf8mb4_bin
To change column collation for varchar
columns:
SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, '(', CHARACTER_MAXIMUM_LENGTH, ') CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE = 'varchar'
AND
(
CHARACTER_SET_NAME != '<charset>'
OR
COLLATION_NAME != '<collation>'
);
To change column collation for non-varchar
columns:
SELECT CONCAT('ALTER TABLE `', table_name, '` MODIFY `', column_name, '` ', DATA_TYPE, ' CHARACTER SET <charset> COLLATE <collation>', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = '<yourDB>'
AND DATA_TYPE != 'varchar'
AND
(
CHARACTER_SET_NAME != '<charset>'
OR
COLLATION_NAME != '<collation>'
);
Warning by My Brain:
When you need to actually do this, try this on a test database first!
I had about 100 queries to run on a 1Gb database and that took my development server about 6 hours…. Next step for me is figuring out how to roll this out on the live server.
My queries were like these:
SET FOREIGN_KEY_CHECKS=0;
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `tablename` CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `tablename` MODIFY `fieldnamevarchar` VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE `tablename` MODIFY `fieldnametext` TEXT CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;
SET FOREIGN_KEY_CHECKS=1;
And the solution by My Brain!
Instead of changing one field per query I merged this into one query per table which enormously improved the speed. Like so:
SET FOREIGN_KEY_CHECKS=0;
ALTER TABLE `tablename`
MODIFY `fieldnamevarchar` varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci,
MODIFY `fieldnametext` text CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
CHARSET=utf8, COLLATE=utf8_general_ci;
SET FOREIGN_KEY_CHECKS=1;
This took about 6 minutes on the still-ever-so-slow-dev-server and just under 1 minute 45 seconds on the live server …. a more acceptable down-time for maintenance.
Have a nice day!