Tag Archives: utf8

Unicode Backlash

Yesterday, I stumbled upon a blog that made me laugh. Truth be told, I have a snarky side: my evil twin, if you will. I keep it in check. Mostly. Some of you will understand. Some of you won’t. It’s a Gemini thing. My twin is fun but good luck putting it back in […]

Verdana Hates Pinyin

I stumbled across an article on lostlaowai.com (www.lostlaowai.com/survival-chinese), which led me to poke around the site a bit. At the above URL, I noticed that some of the combining diacritical marks (tone marks) used in writing pinyin were not rendering properly. I had not seen this problem before. It didn’t make sense. Things that don’t […]
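For readers unfamiliar with combining marks, here is a minimal Python sketch of the two ways a pinyin tone mark can be stored: as a base letter followed by a combining diacritic, or as a single precomposed character. The excerpt does not say which form the site used or what fixed the rendering, so this only illustrates the distinction. The sample syllable is an assumption chosen for the example.

```python
import unicodedata

# "ma" with first tone, written as base letter + combining mark.
# U+0304 is COMBINING MACRON; the sample syllable is an assumption.
decomposed = "ma\u0304"

# NFC normalization folds the base letter and the combining mark
# into one precomposed code point where Unicode defines one.
composed = unicodedata.normalize("NFC", decomposed)

# Same visible text, different code point sequences.
print(len(decomposed))          # three code points: m, a, U+0304
print(len(composed))            # two code points: m, U+0101
print(composed == "m\u0101")    # U+0101 is LATIN SMALL LETTER A WITH MACRON
```

Fonts differ in how well they position combining marks, so the decomposed form is the one more likely to render poorly.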

MacRoman encoding creeps into Maven

You’d think in this day and age that modern operating systems, especially OS X, would be set up to handle UTF8 by default. Not so. My previous post, centos l10n problem, showed that CentOS defaults its LANG locale to POSIX rather than UTF8. Mac takes the lunacy one step further. Or should I say […]

UTF8 JDBC on Tomcat

I’ve had the opportunity to once again visit the UTF8 chain of failure and thought I’d write about it. If for no other reason, it’s easier for me to find my notes when I shove them into a blog entry. I previously wrote about UTF8 on Tomcat. I pointed out that I needed to add an […]

grep and UTF-8

I needed to look up the various strings Apple uses to name the iTunes Library. First I tried to get the name from the iTunes resource bundle:

echo "this won't work…"
echo "so don't even try it"
cd /Applications/iTunes.app/Contents/Resources/English.lproj
cat Localizable.strings | grep 'PrimaryPlaylistName'

But I quickly learned that grep doesn’t work on the strings file. […]
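The excerpt cuts off before explaining the failure. One common cause, assuming the strings file is stored as UTF-16 (the excerpt does not say so), is that a byte-wise search for an ASCII pattern cannot match text in which every character occupies two bytes. A minimal Python sketch of that effect follows. The key name comes from the excerpt; the value and both helper functions are hypothetical.

```python
# Hypothetical sample line: the key is from the excerpt,
# the value is a placeholder, not Apple's actual string.
sample = '"PrimaryPlaylistName" = "placeholder";\n'

# Assumption: the file is UTF-16. Encoding adds a byte order mark
# and stores each ASCII character as two bytes.
utf16_bytes = sample.encode("utf-16")


def bytes_contain(data, needle):
    """Byte-wise search, roughly what a non-UTF-16-aware grep does."""
    return needle.encode("ascii") in data


def decoded_lines_matching(data, needle, encoding="utf-16"):
    """Decode first, then search line by line."""
    text = data.decode(encoding)
    return [line for line in text.splitlines() if needle in line]


# The raw bytes do not contain the ASCII pattern as a contiguous run.
print(bytes_contain(utf16_bytes, "PrimaryPlaylistName"))

# After decoding, the line is found.
print(decoded_lines_matching(utf16_bytes, "PrimaryPlaylistName"))
```

Under the same assumption, the shell equivalent would be to convert the file to UTF-8 before handing it to grep.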

centos l10n problem

Just about the time I believe the UTF-8 beast is in the cage, it escapes and runs amok. This morning, I started to deploy an update to the webapp on EC2. It seems that some of the static strings in the app contained UTF-8 encoded non-ASCII characters. The Java compiler barfed. “The heck?” I thought. I […]