The, now previous, hosting company of my wife’s blog had a major data corruption and completely lost a years worth of database entries and files. There was no communication before we found the problem ourselves, so we were very unhappy and decided to reconstruct the host on another site.
Luckily I had set up WordPress to send a complete database dump weekly as tar.gz balls, so no database entries were lost. All uploaded images and such was permanently lost, but reconstructing this is much easier than reconstructing posts and comments.
Charset problems moving the site to another webhotel
After creating a backup of the files left on the old host I made a local copy to my computer and another copy to the new webhotel. After the DNS changes had gone through and had imported the database dump on the new hosts the only thing left was to edit wp-config.php with the new database settings… or so I thought. It turned out that the all the tables of the database were in charset latin1_swedish_ci, but some of the posts contained utf8 characters as well. The result was the all Danish letters and many special characters in english looked garbled on the blog.
After searching the web for hours through variations over simple search and replace, which I did not find feasible, I finally found the holy grail – the ‘replace’ command as part of the mysql (now mariadb) project. The following command corrected all entries in the sql file from the mix of different charsets to a consistent utf8 output that rendered beautifully on the website:
replace "CHARSET=latin1" "CHARSET=utf8" "SET NAMES latin1" "SET NAMES utf8" < database.sql > database_uft8.sql