Created on 03-05-2015 11:37 AM - edited 09-16-2022 02:23 AM
We are running Oozie 4.0.0 (via CDH 5.3.2 with YARN), and we have a weird thing going on. When we run workflows, they appear to change the default character set, and we are not sure why.

Our test code is a simple Java app with the line below:

System.out.println(Charset.defaultCharset());

and we get:

2015-03-05 19:01:05,623 INFO [main] com.test.encoding.Test: US-ASCII

Running a shell script with "locale" as the only command also returns POSIX:

Oozie Launcher, capturing output data:
=======================
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

even though running locale in a bash shell on the nodes shows UTF-8:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

When we look at the various settings on the box (JVM, locale, etc.), they all point to UTF-8. In oozie-env.sh we set:

setting LC_ALL=en_US.UTF-8
setting LANG=en_US.UTF-8
setting LANGUAGE=en_US.UTF-8

just to make sure things get set up, but with no success.

Basically, we can't figure out how to make Oozie use UTF-8 rather than ASCII/POSIX. We are backed by a MySQL DB, with the default character set set to UTF-8 as well.

Any thoughts, suggestions, or places to read/look?

Thanks in advance!

Cheers,
Aaron
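For anyone trying to reproduce this, a slightly fuller version of the one-line test may help. This is only a sketch: the class name LocaleCheck and the choice of variables to print are my own, not from the original test code. It prints the JVM default charset next to the locale environment variables the process actually received. On Java versions before 18, the default charset is derived from that environment at JVM startup, so an empty LANG in the container would be consistent with the US-ASCII result above.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic class; run it as an Oozie java action and from a
// normal shell on the same node, then compare the two outputs.
public class LocaleCheck {

    // Collects the JVM charset settings and the locale-related environment
    // variables visible to this process.
    static Map<String, String> report() {
        Map<String, String> out = new LinkedHashMap<>();
        out.put("defaultCharset", Charset.defaultCharset().name());
        out.put("file.encoding", String.valueOf(System.getProperty("file.encoding")));
        for (String name : new String[] {"LANG", "LC_ALL", "LC_CTYPE"}) {
            String value = System.getenv(name);
            out.put(name, value == null ? "(unset)" : value);
        }
        return out;
    }

    public static void main(String[] args) {
        report().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If LANG and LC_ALL show as unset or empty inside the workflow but not in the shell, the environment is being lost somewhere between the node and the YARN container rather than inside the application.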
Created 03-09-2021 10:47 PM
Hi,
I am able to replicate this in my cluster, although I tested on CDH 6.
Shell output:
[root@host-10-17-102-176 hive]# locale
LANG=en_US.UTF-8
LC_CTYPE=UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Oozie Launcher, capturing output data:
=======================
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
To fix this, please make the configuration change below.
In Cloudera Manager (CM), navigate to YARN Configuration > Containers Environment Variable (yarn.nodemanager.admin-env) and append "LC_ALL=en_US.UTF-8,LANG=en_US.UTF-8" to the existing value. Restart the affected services so that the change takes effect.
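To make "append" concrete: the property is a comma-separated list of NAME=VALUE entries, so the new entries go after whatever is already there instead of replacing it. The sketch below only illustrates that string format. The AdminEnv class and its append method are hypothetical helpers, not part of YARN or CM, and the starting value "MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX" is an assumed example, so check what your cluster currently has.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that shows the resulting value format only.
// Assumes no entry value itself contains a comma.
public class AdminEnv {

    // Appends NAME=VALUE entries to a comma-separated list, replacing any
    // existing entry with the same NAME so each variable appears once.
    static String append(String current, String... additions) {
        List<String> entries = new ArrayList<>();
        if (current != null && !current.trim().isEmpty()) {
            for (String entry : current.split(",")) {
                entries.add(entry.trim());
            }
        }
        for (String addition : additions) {
            int eq = addition.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected NAME=VALUE: " + addition);
            }
            String prefix = addition.substring(0, eq + 1); // e.g. "LANG="
            entries.removeIf(entry -> entry.startsWith(prefix));
            entries.add(addition);
        }
        return String.join(",", entries);
    }

    public static void main(String[] args) {
        // Assumed existing value; yours may differ.
        String existing = "MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX";
        System.out.println(append(existing, "LC_ALL=en_US.UTF-8", "LANG=en_US.UTF-8"));
    }
}
```

With the assumed starting value, this prints MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX,LC_ALL=en_US.UTF-8,LANG=en_US.UTF-8, which is the kind of string that would end up in the field.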
After that, please rerun the Oozie job and check the output. In my cluster it looks like this after making the change:
Oozie Launcher, capturing output data:
=======================
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
Nitish
Created 03-05-2021 02:44 PM
Have you found a solution to this?
Created 03-09-2021 10:19 PM
Hi,
Which CDH version are you currently using where you see this issue?
Can you share the workflow.xml and the script that you are running?
Please also share the Oozie launcher logs.
Regards
Nitish
Created on 03-10-2021 02:44 PM - edited 03-10-2021 02:45 PM
Issue resolved with your solution, thanks.
CDH 6.3.3