
Oozie launching "jobs" with ASCII default charset, not UTF-8?

We are running Oozie 4.0.0 (via CDH 5.3.2 with YARN), and we are seeing
something odd. When we run workflows, the default character set appears
to change, and we are not sure why. We ran a simple Java app containing
the line below:
System.out.println(Charset.defaultCharset());
From our test code, that line logs:
2015-03-05 19:01:05,623 INFO [main] com.test.encoding.Test: US-ASCII
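For anyone trying to reproduce this, a slightly fuller check might look like the sketch below. It is not our actual test code: the class name EncodingReport and the choice of which properties and environment variables to print are illustrative assumptions. It only uses standard JDK calls (Charset.defaultCharset, System.getProperty, System.getenv) to show what the JVM inside the launched task actually sees.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class; the name and report layout are illustrative.
public class EncodingReport {

    // Collects the charset-related settings visible to the current JVM.
    public static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();

        // The charset the JVM falls back to when none is given explicitly.
        report.put("defaultCharset", Charset.defaultCharset().name());

        // The system property commonly associated with the default charset.
        report.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));

        // Locale-related environment variables as seen by this process.
        for (String name : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            String value = System.getenv(name);
            report.put("env." + name, value == null ? "<unset>" : value);
        }
        return report;
    }

    public static void main(String[] args) {
        // One key=value pair per line, so it is easy to spot in the task logs.
        collect().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Running the same class once from a normal shell on a node and once as an Oozie Java action should make it easier to compare the two environments side by side.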
Running a shell script whose only command is "locale" also returns
POSIX:

Oozie Launcher, capturing output data:
  =======================
  LANG=
  LC_CTYPE="POSIX"
  LC_NUMERIC="POSIX"
  LC_TIME="POSIX"
  LC_COLLATE="POSIX"
  LC_MONETARY="POSIX"
  LC_MESSAGES="POSIX"
  LC_PAPER="POSIX"
  LC_NAME="POSIX"
  LC_ADDRESS="POSIX"
  LC_TELEPHONE="POSIX"
  LC_MEASUREMENT="POSIX"
  LC_IDENTIFICATION="POSIX"
  LC_ALL=

This happens even though running locale in a bash shell on the nodes
reports UTF-8:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=


But when we look at the various settings on the box (JVM, locale, etc.),
they all point to UTF-8. In oozie-env.sh we set LC_ALL=en_US.UTF-8,
LANG=en_US.UTF-8 and LANGUAGE=en_US.UTF-8 just to make sure things get
set up, but with no success. Basically, we can't figure out how to make
Oozie use UTF-8 rather than ASCII/POSIX. We are backed by a MySQL DB
whose default character set is also UTF-8.

Any thoughts, suggestions, or places to read/look?
Thanks in advance!
Cheers,
Aaron