Ok, I’ve got a real problem here. I’ve set up a local LAMP and wrote a PHP application which uses MySQL to store a large wordlist. This wordlist includes accented characters, such as ñ and á. I’ve also started storing other languages, and there are no problems. The cool thing is that I can go from input, through PHP, and into the database, and everything is still human readable. I mean, I can look and edit the same text through phpMyAdmin or even the CLI and still see ñ and á.
When I put it on Dreamhost, however, it stops working. PHP reports that the serialised strings are not of the right length and such. So if I store the wordlist from Dreamhost, it still works, sort of. The text I see when I pull words out of the database and put through PHP still end up as ñ and á, but when I look at them or try to edit them through phpMyAdmin or the CLI, I just get mojibake (like Ã± and Ã a).
Now I’ve never really bothered with it before, other than it being a pain that I can’t directly fix any mistakes through phpMyAdmin, but today I realised where the problem might be. You see, I downloaded a backup copy of my database from Dreamhost and loaded it on my local machine (actually, a WAMP this time). I found that I couldn’t read anything out of it. But when I stored some accented characters, I could see them perfectly well in phpMyAdmin, and they came back out ok through PHP. So that means my database, tables, and columns must have the correct collation settings, utf_general_ci, because none of that changed when importing to my local machine. I went through every setting I could find comparing my setup with DH and I found a difference. It seems that only some of the default collations and character sets are set as utf-8 whereas everything on mine is:
character set client utf8 (Global value) latin1 character set connection utf8 (Global value) latin1 character set database latin1 character set filesystem binary character set results utf8 (Global value) latin1 character set server latin1 character set system utf8 character sets dir /data/mysql/koble/share/mysql/charsets/ collation connection utf8_unicode_ci (Global value) latin1_swedish_ci collation database latin1_swedish_ci collation server latin1_swedish_ci
character set client utf8 character set connection utf8 character set database utf8 character set filesystem binary character set results utf8 character set server utf8 character set system utf8 character sets dir C:\www\UniServer\usr\local\mysql\share\charsets\ collation connection utf8_general_ci collation database utf8_general_ci collation server utf8_general_ci
Is it possible that this is causing non-Latin characters to be encoded as mojibake when using phpMyAdmin / CLI on DH but not on my local machine? Is there any way to change the settings to all be utf8? It would make my life SO much easier if I could read my data directly in phpMyAdmin instead of guessing at what the mojibake might be…