When attempting to load a database dumped with mysqldump into an older MySQL database, I got the error message ERROR 1115 (42000) : Unknown character set: ‘utf8mb4’ in mysql.
What’s utf8mb4?
utf8mb4 is a utf8 character set, which was added to MySQL in version 5.3.3, that fully supports unicode. Read this post by Mathias Bynens for more information about the difference between to the two character sets with MySQL.
How do I solve the problem?
If you dumped a table/database that supports utf8mb4 then you need to load it back into a MySQL database that is at least version 5.3.3. If you try to do it in an older version, you’re going to get the error message I gave in the opening paragraph.
A quick and dirty solution, not recommended
There is a way to workaround this, but I strongly advise not to, as you may cause data loss and/or other unexpected consequences. In my case, I was simply trying to benchmark database queries running across different servers with the same database, so it didn’t really matter too much if some utf8 characters got saved incorrectly or other content modified.
Let’s say you dumped the database like this:
mysqldump -u [username] -p [database] > database.sql
You can then run “sed” against it to change all the instances of utf8mb4 to utf8 like this:
sed s/utf8mb4/utf8/g database.sql > database.sed.sql
It’ll run pretty fast; doing this on a 500MB file only took maybe 10 seconds on the server I was doing it on. Note that it will replace all instances of utf8mb4 with utf8. So, for example, if I’d dumped the table containing this post then all instances in the content would also have changed.
You can probably modify the sed expression to work out all the various ways the character set is set in the file and run those, avoiding modifying anything else, but you still may have issues with character encoding being broken – just a final warning 🙂
A better solution?
If you have a better solution, feel free to add it in the comments.