03-12-2019 05:03 AM
Hi, thank you for your response! I've tried to set HADOOP_CONF_DIR in bash before calling pyspark, but it did not help. pyspark itself sources the spark-env.sh script, which overrides the HADOOP_CONF_DIR variable (see below):

HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/etc/hive/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
  HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR

As a result, HADOOP_CONF_DIR is assigned a colon-separated string combining the two directories:

>>> import os
>>> os.getenv('HADOOP_CONF_DIR')
'/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/spark/conf/yarn-conf:/etc/hive/conf'

But when I set the value manually to point to a single directory (either of the two above), the subprocess routine starts working:

>>> os.environ['HADOOP_CONF_DIR'] = "/opt/cloudera/parcels/CDH-6.1.0-1.cdh6.1.0.p0.770702/lib/spark/conf/yarn-conf"
>>> subprocess.call(["hadoop", "fs", "-ls"])
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
Found 2 items
drwxr-x---   - spark spark          0 2019-03-12 15:46 .sparkStaging
drwxrwxrwt   - spark spark          0 2019-03-12 15:46 applicationHistory

So I assume the problem comes from the lines where HIVE_CONF_DIR is appended to HADOOP_CONF_DIR, presumably because the hadoop CLI expects HADOOP_CONF_DIR to hold a single directory rather than a colon-separated list. Can you please check whether your deployment has these lines in its spark-env.sh script?
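In the meantime, I'm working around it inside the script by giving subprocess an environment where HADOOP_CONF_DIR is trimmed back to a single directory. A minimal sketch based on the behaviour above (the split-on-':' heuristic is my own assumption, not an official fix):

import os
import subprocess

# spark-env.sh joined two directories with ':'; the hadoop CLI seems to
# expect a single directory, so keep only the first entry.
env = dict(os.environ)
env["HADOOP_CONF_DIR"] = env.get("HADOOP_CONF_DIR", "").split(":")[0]

# Run the HDFS command with the corrected environment; the original
# process environment is left untouched.
subprocess.call(["hadoop", "fs", "-ls"], env=env)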
03-11-2019 09:35 AM
After upgrading (a fresh installation) to Cloudera CDH 6.1, all our ETLs (pyspark scripts) fail. Within the scripts we use subprocess.call([]) to work with HDFS directories; this worked on CDH 5.13 but fails to execute on the current release, throwing the following error:

RuntimeException: core-site.xml not found

See the details below:

$ sudo -u spark pyspark --master yarn --deploy-mode client
Python 2.7.5 (default, Oct 30 2018, 23:45:53)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-36)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/03/11 20:24:42 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
19/03/11 20:24:43 WARN lineage.LineageWriter: Lineage directory /var/log/spark/lineage doesn't exist or is not writable. Lineage for this application will be disabled.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.0-cdh6.1.0
      /_/
Using Python version 2.7.5 (default, Oct 30 2018 23:45:53)
SparkSession available as 'spark'.
>>> import subprocess
>>> subprocess.call(["hadoop", "fs", "-ls"])
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
Exception in thread "main" java.lang.RuntimeException: core-site.xml not found
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2891)
at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2839)
at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2716)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1353)
at org.apache.hadoop.conf.Configuration.set(Configuration.java:1325)
at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1666)
at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:569)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:389)
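For context, the failing calls in our ETLs look roughly like this (an illustrative sketch; hdfs_ls is a made-up name, the real helpers just wrap subprocess.call the same way):

import subprocess

def hdfs_ls(path):
    # Shell out to the hadoop CLI, as the ETLs have done since CDH 5.13;
    # subprocess.call returns the command's exit status.
    return subprocess.call(["hadoop", "fs", "-ls", path])

# Example usage; the path is illustrative.
hdfs_ls("/user/spark")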
Labels:
- Apache Spark
- HDFS
- Manual Installation