Member since 07-26-2016
Posts: 24
Kudos Received: 7
Solutions: 5
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2177 | 03-07-2018 11:16 PM |
| | 16754 | 08-20-2017 12:08 AM |
| | 599 | 06-01-2017 08:26 PM |
| | 786 | 06-01-2017 05:59 PM |
| | 1288 | 05-20-2017 12:30 AM |
03-07-2018
11:16 PM
2 Kudos
According to https://issues.apache.org/jira/browse/SPARK-15348, Spark does not currently support transactional Hive tables.
08-20-2017
12:08 AM
The spark.driver.extraClassPath and spark.executor.extraClassPath properties are easy to understand: if you want every Spark job to load a particular dependency jar on the driver and executors, specify it in those properties. Use --jars when you want to add a dependency jar to a single Spark job.
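A minimal sketch of the two approaches; the application class, jar names, and paths are placeholders:
# per-job: ship dependencies with this one application
spark-submit --class com.example.App \
  --jars /tmp/dep1.jar,/tmp/dep2.jar \
  app.jar
# cluster-wide: every job picks the jar up from the extra classpath,
# e.g. in spark-defaults.conf:
#   spark.driver.extraClassPath /opt/libs/dep1.jar
#   spark.executor.extraClassPath /opt/libs/dep1.jar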
07-28-2017
11:30 PM
You first need to make your local machine able to obtain a Kerberos ticket from the same KDC server. Then configure your browser to work with SPNEGO.
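A minimal sketch of how to verify this from the command line, assuming a hypothetical realm and web UI host:
kinit user@EXAMPLE.COM                                    # obtain a ticket from the same KDC the cluster uses
klist                                                     # confirm the ticket is in the cache
curl --negotiate -u : http://namenode.example.com:50070/  # test SPNEGO before touching the browser
# in Firefox, set network.negotiate-auth.trusted-uris to the cluster domain in about:config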
07-21-2017
08:35 PM
Could you try adding the four properties below to the corresponding Java action in your workflow?
<property>
<name>mapreduce.job.user.classpath.first</name>
<value>true</value>
</property>
<property>
<name>mapreduce.task.classpath.user.precedence</name>
<value>true</value>
</property>
<property>
<name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
<value>true</value>
</property>
<property>
<name>oozie.launcher.mapreduce.job.user.classpath.first</name>
<value>true</value>
</property>
07-17-2017
05:27 PM
A single workflow can contain several actions, but you need to make sure all dependencies are prepared for those actions.
06-20-2017
05:51 AM
Yes, you can use the Oozie ssh action to log in to the desired host and then download the data from HDFS.
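A minimal sketch of the command such an ssh action could run on the target host; the paths are placeholders:
hdfs dfs -get /data/exports/part-00000 /home/user/downloads/   # copy from HDFS to the local filesystem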
06-20-2017
05:49 AM
If the start time is in the past, Oozie will compensate for the missed workflow runs. For example, if the coordinator is scheduled once per day and the start time is two days in the past, you will find two extra workflow runs, whether they succeeded or failed.
06-19-2017
06:11 PM
Can you try including xmlns="uri:oozie:coordinator:0.4" after timezone="United_kingdom/London"?
06-09-2017
07:54 PM
I suggest you develop and test all the jobs locally first, such as the Sqoop and Hive scripts. For the Spark application, make sure you compile your jar with the Hortonworks dependencies to prevent any dependency issues. For troubleshooting an Oozie workflow, the first thing to do is check the Oozie log, searching by workflow ID. Then check the YARN application logs for the Oozie launcher and the child job.
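A minimal sketch of that troubleshooting flow; the Oozie URL, workflow ID, and application ID are placeholders:
oozie job -oozie http://oozie-host:11000/oozie -info 0000001-170609000000000-oozie-oozi-W   # action statuses and external IDs
oozie job -oozie http://oozie-host:11000/oozie -log  0000001-170609000000000-oozie-oozi-W   # Oozie-side log for the workflow
yarn logs -applicationId application_1496900000000_0001                                     # launcher or child job log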
06-01-2017
08:26 PM
1 Kudo
@dsun Yes, you can. First, you need to ensure those service accounts are created in AD, that all cluster hosts can connect to AD, and that those users are valid. Second, set ignore_groupsusers_create=true in cluster-env.xml, then start the HDP installation.
06-01-2017
05:59 PM
Actually, you cannot force an Oozie action to run on a certain node unless you are using the ssh action. Once Oozie submits the action to YARN, YARN will run the job on one of the NodeManager hosts.
06-01-2017
05:56 PM
What error do you get from Ambari when you start Oozie?
05-20-2017
12:30 AM
You can upload your keytab file to the workflow lib folder so that the keytab is copied into the container folder no matter which NodeManager the job runs on. Then you can specify --keytab your-keytab --principal your-principal in your spark-submit command. Note that you have to upload an updated keytab to the workflow lib folder every time you change the password.
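A minimal sketch; the keytab name, principal, workflow path, and application class are placeholders:
hdfs dfs -put -f my.keytab /user/me/app/lib/   # workflow lib folder; re-upload after each password change
spark-submit --master yarn \
  --keytab my.keytab \
  --principal myuser@EXAMPLE.COM \
  --class com.example.App app.jar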
05-15-2017
06:14 PM
@Meryem Moumen Can you post the YARN application logs for the Oozie launcher and for the application launched for the Spark job?
05-11-2017
06:31 PM
You can use fork and join to run the shell script in parallel against different Hive tables. You need ten shell actions that run the same shell script but pass different parameters. You can refer to this link for fork and join: https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.1.5_Fork_and_Join_Control_Nodes
05-10-2017
09:55 PM
Inside your shell script, you may want to check whether any of your Hive tables fails and, if so, make the script exit with a non-zero code. The action will then fail and transition to email-error.
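A minimal sketch of that pattern; the table names and the per-table Hive statement are placeholders:
#!/bin/bash
# exit non-zero on the first failed table so the Oozie shell action fails
for table in db.table_a db.table_b; do
  if ! hive -e "ANALYZE TABLE ${table} COMPUTE STATISTICS;"; then
    echo "processing ${table} failed" >&2
    exit 1   # non-zero exit fails the action, which then routes to email-error
  fi
done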
05-10-2017
09:31 PM
Can you post the Oozie launcher application log? It should show why the job failed.
05-10-2017
09:01 PM
I think you can consider using a password file. Here is an example command that specifies the password file:
sqoop import --connect jdbc:mysql://database.example.com/employees \
  --username venkatesh --password-file ${user.home}/.password
You can store the password file either locally or in HDFS, and its permissions should be 400. For your case, I suggest you store it in HDFS; then no matter which NodeManager host runs your job, it will be able to access the password file.
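A minimal sketch of preparing the file in HDFS; the user and paths are placeholders:
echo -n "secret" > .password                      # -n avoids a trailing newline in the password
hdfs dfs -put .password /user/venkatesh/.password
hdfs dfs -chmod 400 /user/venkatesh/.password     # readable only by the owner
sqoop import --connect jdbc:mysql://database.example.com/employees \
  --username venkatesh --password-file /user/venkatesh/.password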
03-31-2017
09:25 PM
2 Kudos
To clear the local file cache and user cache for YARN, perform the following:
Find the cache location by checking the value of the yarn.nodemanager.local-dirs property:
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/hadoop/yarn/local</value>
</property>
Remove the filecache and usercache folders located inside each directory specified in yarn.nodemanager.local-dirs:
[yarn@node2 ~]$ cd /hadoop/yarn/local/
[yarn@node2 local]$ ls
filecache nmPrivate spark_shuffle usercache
[yarn@node2 local]$ rm -rf filecache/ usercache/
If more than one directory is configured, clean them one by one. Then restart the YARN service.
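If several directories are configured in yarn.nodemanager.local-dirs, a loop like this sketch (the directory names are hypothetical) cleans each one:
for d in /hadoop/yarn/local /data1/yarn/local; do
  rm -rf "$d/filecache" "$d/usercache"   # leave nmPrivate and spark_shuffle in place
done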
03-31-2017
05:57 PM
Perform the following steps to enable verbose logging for the Oozie launcher.
Step 1. Add the property below to the action's configuration section in the workflow file:
<configuration>
.....
<property>
<name>oozie.launcher.mapreduce.map.java.opts</name>
<value>-verbose</value>
</property>
</configuration>
Step 2. Upload the updated workflow file to the workflow folder defined by oozie.wf.application.path in the job properties file.
Step 3. Submit the workflow. You should now see verbose output for the Oozie launcher, such as class-loading information:
[Loaded java.lang.ExceptionInInitializerError from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded org.apache.commons.logging.impl.LogFactoryImpl$2 from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar]
[Loaded org.apache.commons.logging.impl.LogFactoryImpl$1 from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar]
[Loaded org.apache.commons.logging.Log from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar]
[Loaded org.apache.commons.logging.impl.Log4JLogger from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/common/lib/commons-logging-1.1.3.jar]
[Loaded org.apache.log4j.spi.AppenderAttachable from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded org.apache.log4j.Category from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded org.apache.log4j.Logger from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded org.apache.log4j.Priority from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded org.apache.log4j.Level from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded java.lang.InstantiationError from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded sun.reflect.UnsafeFieldAccessorFactory from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded sun.reflect.UnsafeQualifiedStaticFieldAccessorImpl from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded sun.reflect.UnsafeQualifiedStaticObjectFieldAccessorImpl from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded java.util.HashMap$EntrySet from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded java.util.HashMap$HashIterator from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded java.util.HashMap$EntryIterator from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded java.util.MissingResourceException from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
[Loaded org.apache.log4j.LogManager from file:/hadoop/yarn/local/filecache/11/mapreduce.tar.gz/hadoop/share/hadoop/mapreduce/lib/log4j-1.2.17.jar]
[Loaded java.net.MalformedURLException from /usr/jdk64/jdk1.8.0_60/jre/lib/rt.jar]
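A minimal sketch of steps 2 and 3 from the command line; the Oozie URL and HDFS path are placeholders:
hdfs dfs -put -f workflow.xml /user/me/app/   # oozie.wf.application.path=/user/me/app
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run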
03-24-2017
11:09 PM
PROBLEM DESCRIPTION
The Oozie service check fails and the following error message is displayed in Ambari:
stderr: /var/lib/ambari-agent/data/errors-12523.txt
Python script has been killed due to timeout after waiting 300 secs
There is no error in stdout. The service check is terminated because the timeout (300 secs by default) is reached.
CAUSE
This issue occurs when the time Ambari takes to upload the jar and workflow files to HDFS for the Oozie service check exceeds the timeout in the server settings. For example, nodes in a cluster configured behind an IPv4 proxy can suffer network slowness among nodes, and depending on network performance, the time required to upload these files can exceed the 300-second timeout.
WORKAROUND
Increase the timeout by editing the value set in /var/lib/ambari-server/resources/common-services/OOZIE/your_version_number/metainfo.xml.
RESOLUTION
Improve the network performance so that the Oozie service check can finish within the 300-second timeout.
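A minimal sketch of the workaround, assuming the value lives in a <timeout> element of that file (the version path is a placeholder):
grep -n "<timeout>" /var/lib/ambari-server/resources/common-services/OOZIE/your_version_number/metainfo.xml
# edit the value, then restart Ambari server so the change is picked up
ambari-server restart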
03-24-2017
10:43 PM
To configure an Oozie workflow after enabling ResourceManager HA, do the following:
Step 1. Find the YARN ResourceManager cluster ID: go to Ambari > YARN > Configs and search for the property yarn.resourcemanager.cluster-id.
Step 2. In the job properties file, set the jobTracker to the ResourceManager cluster ID, for example: jobTracker=yarn-ha
Step 3. Submit the Oozie job using the modified job properties file.
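A minimal sketch of those steps from the command line; the Oozie URL and cluster ID are placeholders:
grep -A1 yarn.resourcemanager.cluster-id /etc/hadoop/conf/yarn-site.xml   # or read it from Ambari
echo "jobTracker=yarn-ha" >> job.properties
oozie job -oozie http://oozie-host:11000/oozie -config job.properties -run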