Converting a glFusion site to UTF-8
I recently converted a site from ISO-8859-1 to UTF-8. Here is how I did it; it seems to work just fine.
Why would I need UTF-8?
There are a few reasons you might need UTF-8:
- If you want to support multiple languages that use different character sets on your site. For instance if you want to support both Russian and Turkish, you will need a character set that supports both. UTF-8 is then a logical choice.
- If you need better search results or improved sorting. In some cases searching and sorting by the database can be improved by choosing UTF-8 as your character set.
- Full support for Unicode characters, including many non-English languages.
Converting a Site to UTF-8
If your site is very large, the database dump may be too big to re-upload using phpMyAdmin. Check the largest file size phpMyAdmin will accept before you start: choose Import from the phpMyAdmin menus and it will show the limit. If your database dump is larger than that, you will have a problem re-importing it, and I would suggest contacting your hosting provider's technical support for assistance.
First, make sure your MySQL database version is at least 5.1, then do the following:
- Dump your database using mysqldump or phpMyAdmin.
- Use some tool to do a global search / replace of the SQL dump file. You want to replace all CHARSET=latin1 with CHARSET=utf8. I used the replace command which is included in the MySQL distribution, but there are several Win32 programs that will do this as well, such as Rpl for Windows.
- Make sure the dump contains only valid UTF-8 characters. I used the iconv program to validate / fix the data (iconv is available under Unix and also Win32). The -c option discards any characters that cannot be converted. I used the following command: iconv -c -f utf-8 -t utf-8 < input.sql > output.sql
- Change the collation for the actual database using phpMyAdmin to utf8_general_ci.
- Drop all the old tables with phpMyAdmin (do not drop the database).
- Import the newly created output.sql into the database.
- Double check all the tables and fields are utf8 in the database structure.
- Change glFusion's config.php to use the utf-8 charset.
- Change glFusion's config.php to use english_utf-8 as the language.
- Remove all the non-utf-8 language files.
- Using phpMyAdmin, look at the gl_users table to see how many users have selected a language preference.
- Run a few SQL queries to fix the language preferences for each of the languages used on your site. For example:
UPDATE gl_users SET language="english_utf-8" WHERE language="english";
UPDATE gl_users SET language="danish_utf-8" WHERE language="danish";
- Double check a few stories / forum posts to make sure all is well.
- Check all of your Content Syndication feeds to make sure they are set to UTF-8 as the charset.
- Edit siteconfig.php and, near the end, change the default character set and db_charset. It should look like this:
$_CONF['default_charset'] = 'utf-8';
$_CONF['db_charset'] = 'utf8';
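If your site uses more than a couple of languages, writing the language-preference UPDATE queries from the steps above by hand gets tedious. Here is a minimal Python sketch that generates them. The function name language_updates is my own invention, and it assumes every language on your site has a matching "_utf-8" language file named the same way as english_utf-8; check the language directory before running the generated queries.

```python
import re


def language_updates(languages, table="gl_users"):
    """Build one UPDATE statement per old (non-UTF-8) language name."""
    statements = []
    for name in languages:
        # Only accept simple names such as "english" or "chinese_simplified",
        # so nothing unexpected ends up inside the generated SQL.
        if not re.fullmatch(r"[a-z_]+", name):
            raise ValueError(f"unexpected language name: {name!r}")
        statements.append(
            f'UPDATE {table} SET language="{name}_utf-8" WHERE language="{name}";'
        )
    return statements


# Print the statements so they can be pasted into phpMyAdmin's SQL tab.
for statement in language_updates(["english", "danish"]):
    print(statement)
```

Replace the list with the languages you actually found in the gl_users table.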
That should do it!
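If you do not have the replace or iconv tools at hand, the search / replace and clean-up steps on the dump file can be approximated in Python. This is only a sketch of the same idea, not what I actually ran: the function names convert_dump and convert_file are made up for illustration, and it reads the whole dump into memory, so it is only suitable for dumps that fit comfortably in RAM.

```python
def convert_dump(raw: bytes) -> bytes:
    """Drop invalid UTF-8 bytes and switch latin1 table charsets to utf8."""
    # Decode as UTF-8 and silently skip byte sequences that are not valid,
    # which is roughly what "iconv -c -f utf-8 -t utf-8" does.
    text = raw.decode("utf-8", errors="ignore")
    # Same substitution as the global search / replace step.
    text = text.replace("CHARSET=latin1", "CHARSET=utf8")
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a dump from src_path and write the converted dump to dst_path."""
    with open(src_path, "rb") as src:
        converted = convert_dump(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(converted)
```

Calling convert_file("input.sql", "output.sql") would produce the file to import. As with iconv -c, anything that is not valid UTF-8 is thrown away without warning, so keep the original dump as a backup.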