Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1978 | 07-09-2019 12:53 AM |
| | 11921 | 06-23-2019 08:37 PM |
| | 9178 | 06-18-2019 11:28 PM |
| | 10174 | 05-23-2019 08:46 PM |
| | 4600 | 05-20-2019 01:14 AM |
12-15-2015
05:51 PM
CDH5 is compiled with JDK7 and will not run on an older JDK (such as JDK6, which is no longer supported). Please install JDK7 from http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm, or use JDK8 from http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html. Please also remove JDK6, to avoid any ambiguity over which JVM the daemons run with.
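On a RHEL/CentOS 6 host, that swap could look like the below (a sketch only; the JDK6 package name is an assumption to adjust for your system):

```bash
# Remove JDK6 so no daemon can accidentally pick it up (package name may differ)
sudo yum remove jdk

# Install the Cloudera-packaged Oracle JDK7 RPM linked above
wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/RPMS/x86_64/oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
sudo yum localinstall oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm
```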
12-15-2015
12:45 AM
On non-kerberized (insecure) clusters, you can do the below:

```bash
export HADOOP_USER_NAME=username
[command]
```

For example:

```bash
export HADOOP_USER_NAME=hdfs
yarn logs -applicationId $application_id
```
12-12-2015
12:29 PM
What version of CM are you using, and have you attempted recently to redeploy the Spark gateway client configs? The below is what I have out of the box in CM 5.5:

```bash
#!/usr/bin/env bash
##
# Generated by Cloudera Manager and should not be modified directly
##

SELF="$(cd $(dirname $BASH_SOURCE) && pwd)"
if [ -z "$SPARK_CONF_DIR" ]; then
  export SPARK_CONF_DIR="$SELF"
fi

export SPARK_HOME=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/spark
export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hadoop

### Path of Spark assembly jar in HDFS
export SPARK_JAR_HDFS_PATH=${SPARK_JAR_HDFS_PATH:-''}

### Some definitions needed by older versions of CDH.
export SPARK_LAUNCH_WITH_SCALA=0
export SPARK_LIBRARY_PATH=${SPARK_HOME}/lib
export SCALA_LIBRARY_PATH=${SPARK_HOME}/lib

SPARK_PYTHON_PATH=""
if [ -n "$SPARK_PYTHON_PATH" ]; then
  export PYTHONPATH="$PYTHONPATH:$SPARK_PYTHON_PATH"
fi

export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}
if [ -n "$HADOOP_HOME" ]; then
  LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi

SPARK_EXTRA_LIB_PATH="/opt/cloudera/parcels/GPLEXTRAS-5.5.0-1.cdh5.5.0.p0.7/lib/hadoop/lib/native"
if [ -n "$SPARK_EXTRA_LIB_PATH" ]; then
  LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SPARK_EXTRA_LIB_PATH
fi
export LD_LIBRARY_PATH

HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/etc/hive/conf}
if [ -d "$HIVE_CONF_DIR" ]; then
  HADOOP_CONF_DIR="$HADOOP_CONF_DIR:$HIVE_CONF_DIR"
fi
export HADOOP_CONF_DIR

PYLIB="$SPARK_HOME/python/lib"
if [ -f "$PYLIB/pyspark.zip" ]; then
  PYSPARK_ARCHIVES_PATH=
  for lib in "$PYLIB"/*.zip; do
    if [ -n "$PYSPARK_ARCHIVES_PATH" ]; then
      PYSPARK_ARCHIVES_PATH="$PYSPARK_ARCHIVES_PATH,local:$lib"
    else
      PYSPARK_ARCHIVES_PATH="local:$lib"
    fi
  done
  export PYSPARK_ARCHIVES_PATH
fi

# Set distribution classpath. This is only used in CDH 5.3 and later.
export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt")
```
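If your generated file diverges badly from the above, redeploying the client configuration should regenerate it. That can be done from the CM UI's cluster actions, or via the CM API; a sketch with curl (host, port, credentials, API version, and cluster name are all assumptions to adapt):

```bash
# Trigger a cluster-wide client configuration redeploy
curl -u admin:admin -X POST \
  'http://cm-host:7180/api/v10/clusters/cluster1/commands/deployClientConfig'
```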
12-12-2015
11:31 AM
Thank you for trying it out. Could you also post your /etc/spark/conf/spark-env.sh contents here, please? P.S. Pro tip: when using full paths to a file under the parcel, use its symlinks to stay upgrade-compatible, i.e. instead of /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/, simply use /opt/cloudera/parcels/CDH/.
12-12-2015
10:45 AM
What is your NodeManager's yarn.nodemanager.resource.memory-mb value set to? It's possible that YARN is unable to allocate a container for the executors because that value is too low, in which case things could hang this way. You could raise that config by another 1 GB and restart the cluster, then re-run the shell to see if that resolves the issue. You can also check the Spark AM's log (visit your RM Web UI, click through to the RUNNING Spark application, and click the "logs" link for its ApplicationMaster). It may show what it is stuck on: whether it has yet to spawn an executor, or whether it's something else.
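One quick way to confirm the effective value on the NodeManager host (a sketch; the process directory layout assumed here is specific to CM-managed clusters):

```bash
# Print the configured NodeManager container memory from its live process config
sudo grep -A1 'yarn.nodemanager.resource.memory-mb' \
  /var/run/cloudera-scm-agent/process/*-NODEMANAGER/yarn-site.xml
```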
12-12-2015
10:29 AM
1 Kudo
> I guess that if I use --files I use the same log4j.properties for driver and executor.

Where are you expecting your logs to be visible, BTW? At the driver, or within the executors? Since you are using yarn-client mode, the custom logger passed via --files will be applied only to the executors. If you'd like it applied to the driver as well via just the use of --files, you will need to use yarn-cluster mode, as so:

```bash
spark-submit --name "CentralLog" \
  --master yarn-cluster \
  --class example.spark.CentralLog \
  --files /opt/centralLogs/conf/log4j.properties#log4j.properties \
  --jars $SPARK_CLASSPATH \
  --executor-memory 2g \
  /opt/centralLogs/libProject/produban-paas.jar
```

Otherwise, additionally pass an explicit -Dlog4j.configuration=file:/opt/centralLogs/conf/log4j.properties through spark.driver.extraJavaOptions to make it work, as so:

```bash
spark-submit --name "CentralLog" \
  --master yarn-client \
  --class example.spark.CentralLog \
  --files /opt/centralLogs/conf/log4j.properties#log4j.properties \
  --conf spark.driver.extraJavaOptions='-Dlog4j.configuration=file:/opt/centralLogs/conf/log4j.properties' \
  --jars $SPARK_CLASSPATH \
  --executor-memory 2g \
  /opt/centralLogs/libProject/produban-paas.jar
```
12-12-2015
08:21 AM
On a parcel installation, your PySpark should already be set up for use with spark-submit. Is there a reason you're looking to set the SPARK_HOME and PYTHONPATH variables manually? These are auto-handled for you by CM, via your /etc/spark/conf/spark-env.sh. Does "spark-submit TestPyEnv.py" in a clean default environment throw an error?
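To try it in a clean default environment, one option is to drop any manual overrides just for that invocation (a sketch; TestPyEnv.py is your own script):

```bash
# Run spark-submit with manually set SPARK_HOME/PYTHONPATH removed from the environment
env -u SPARK_HOME -u PYTHONPATH spark-submit TestPyEnv.py
```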
12-09-2015
07:32 AM
The role-level APIs carry the state, but you're querying the service level. Use the role IDs from the service-level response to then query the roles directly.
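A minimal sketch with curl (host, port, credentials, API version, and the cluster/service/role names are all assumptions to adapt):

```bash
# List a service's roles; each entry includes its roleState and its name (ID)
curl -u admin:admin \
  'http://cm-host:7180/api/v10/clusters/cluster1/services/hdfs1/roles'

# Then query one role directly by the name found above to read its state
curl -u admin:admin \
  'http://cm-host:7180/api/v10/clusters/cluster1/services/hdfs1/roles/hdfs1-NAMENODE-1'
```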
12-08-2015
10:00 PM
This would happen if any of your Java-based daemons (including the CM server) are running on JDK6. Forcing the daemons to run on JDK7 will resolve the issue: you can remove JDK6 to enforce this, or set JAVA_HOME explicitly in /etc/default/cloudera-scm-server to point to JDK7.
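For the CM server, that override could look like the below (a sketch; the JDK install path is an assumption, so point it at wherever your JDK7 actually lives):

```bash
# In /etc/default/cloudera-scm-server (sourced by the CM server init script)
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera  # assumed JDK7 location
```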
12-08-2015
09:50 PM
> As commands in shell scripts are only able to recognize hdfs directories

This is an incorrect assumption. The shell action merely executes the given script file (as it would normally be executed by any process), and does not care about what is within it. Does your script fail with an error? If so, please post the error.
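To illustrate, a hypothetical script such as the below runs fine as a shell action; it executes on whichever NodeManager host Oozie schedules it on:

```bash
#!/bin/bash
# Plain local-filesystem commands work inside a shell action...
ls /tmp
# ...and so do HDFS commands; the action just runs the script as a process.
hdfs dfs -ls /user
```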