MacRoman encoding creeps into Maven
You’d think that in this day and age modern operating systems, especially OS X, would default to UTF-8. Not so. My previous post, centos l10n problem, showed that CentOS defaults its LANG locale to POSIX rather than UTF-8.
Mac takes the lunacy one step further. Or should I say one step backwards in time.
I use maven2 as my build manager. Normally, I ignore the stream of info at the beginning of a build. Either it succeeds (yeah) or it fails. Either way, I’ve been more interested in seeing the end result: you know, those last few lines rather than the first few.
One day, I started tracking down all the warnings and errors that popped up during Maven builds and Tomcat startups. I noticed this one.
$ mvn -Pdevelopment clean compile package war:inplace
[INFO] Scanning for projects...
<!-- snip -->
[WARNING] Using platform encoding (MacRoman actually) to copy filtered resources, i.e. build is platform dependent!
<!-- snip -->
[INFO] ----------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ----------------------------------------------------------------
[INFO] Total time: 8 seconds
[INFO] Finished at: Tue Apr 07 23:40:06 PDT 2009
[INFO] Final Memory: 26M/63M
[INFO] ----------------------------------------------------------------
If you’ve ever had to track down all the UTF-8 failure points in a system, then you know this maxim: “Suffer not a UTF8 Failure to Live.” Once you have a failure point, Latin1 (or worse, as in this case, MacRoman) will leak into your database and rot your data like a cancer.
I really should hunt down the BSD system configuration equivalents to Linux, but here’s a solution that is quick and easy: add project.build.sourceEncoding and project.reporting.outputEncoding properties to the properties section of your pom.xml.
<project
    xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>mywebapp</artifactId>
  <packaging>war</packaging>
  <version>1.21</version>
  <name>mywebapp</name>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
  </properties>
  <!-- snip -->
</project>
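If a plugin doesn’t pick up these properties (I suspect some older plugin versions don’t, but I haven’t checked which, so treat that as an assumption), the encoding can also be set on the plugin itself. Here’s a minimal sketch, assuming the standard maven-resources-plugin and its encoding parameter; it would go inside the same pom.xml:

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-resources-plugin</artifactId>
      <configuration>
        <!-- Encoding used when copying filtered resources -->
        <encoding>UTF-8</encoding>
      </configuration>
    </plugin>
  </plugins>
</build>
```

The properties are still what I’d reach for first, since they are set once and are not tied to a single plugin.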
Run Maven again to verify the fix.
$ mvn -Pdevelopment clean compile package war:inplace
[INFO] Scanning for projects...
<!-- snip -->
[INFO] Using 'UTF-8' encoding to copy filtered resources.
<!-- snip -->
[INFO] ----------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ----------------------------------------------------------------
[INFO] Total time: 7 seconds
[INFO] Finished at: Tue Apr 07 23:48:17 PDT 2009
[INFO] Final Memory: 25M/60M
[INFO] ----------------------------------------------------------------
I really do want to understand the vagaries of OS X (relative to Linux) but I’m eternally short on time. I suspect that is our lot, all of us.
Whose woods these are I think I know.
His house is in the village though;
He will not see me stopping here
To watch his woods fill up with snow.
My little horse must think it queer
To stop without a farmhouse near
Between the woods and frozen lake
The darkest evening of the year.
He gives his harness bells a shake
To ask if there is some mistake.
The only other sound's the sweep
Of easy wind and downy flake.
The woods are lovely, dark and deep.
But I have promises to keep,
And miles to go before I sleep,
And miles to go before I sleep.
--Robert Frost
8 Comments:
Hi man, thanks for your article. It’s good.
=)
camus
08:39
Thank you for the helpful information.
Aaron S
13:09
Perfect! This kind of stuff needs to make its way into a maven FAQ somewhere
Thanks!
Steph Meslin-Weber
03:28
There are a few resources already describing that. I have collected a few of them here:
http://www.martinahrer.at/blog/2007/06/01/maven2-site-encoding-problems/
Martin Ahrer
01:53
@Martin Ahrer
Great info (especially for anyone trying to get their IDE-to-Maven encoding set) at http://www.martinahrer.at/blog/2007/06/01/maven2-site-encoding-problems/ and references to even more material.
kelly
06:27
Alternatively, you could use Ant to build projects.
dan
10:39
You can put the config inside a <profile> in your ~/.m2/settings.xml that is active by default. This way the defaults are applied to all projects being built with Maven, and you don’t have to add it to all your pom.xml’s.
Pavel
01:57
Pavel, I hadn’t considered putting the config in the settings.xml file. Thanks!
kelly
06:21
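For anyone who wants to try Pavel’s suggestion, here is a rough sketch of what the ~/.m2/settings.xml could look like. The profile id utf8-defaults is a name I made up for illustration. I’ve listed the profile under activeProfiles, which is an assumption on my part about the most reliable way to keep it switched on; an activation block with activeByDefault inside the profile is the other option.

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <profiles>
    <profile>
      <!-- Hypothetical id; any unique name will do -->
      <id>utf8-defaults</id>
      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
      </properties>
    </profile>
  </profiles>
  <!-- Profiles listed here are active for every build run by this user -->
  <activeProfiles>
    <activeProfile>utf8-defaults</activeProfile>
  </activeProfiles>
</settings>
```

One trade-off to keep in mind: this only fixes builds on machines that have the settings file, so a teammate or a build server without it would still fall back to the platform encoding. Keeping the properties in pom.xml travels with the project.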