Member since: 06-23-2016
Posts: 136
Kudos Received: 8
Solutions: 8

My Accepted Solutions
Views | Posted
---|---
2705 | 11-24-2017 08:17 PM
3201 | 07-28-2017 06:40 AM
1235 | 07-05-2017 04:32 PM
1383 | 05-11-2017 03:07 PM
5525 | 02-08-2017 02:49 PM
11-24-2017
08:17 PM
The answer is that I am an idiot: only s3 had a DataNode and NodeManager installed, so it was the only node YARN could schedule containers on. Hopefully this helps someone.
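A quick way to confirm this kind of thing, using the standard HDFS and YARN CLIs (run from any node with the clients installed; the HADOOP_USER_NAME trick assumes a non-secured cluster):

yarn node -list                                 # should list s1, s2 and s3 as RUNNING NodeManagers
HADOOP_USER_NAME=hdfs hdfs dfsadmin -report     # should report all three DataNodes as live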
11-24-2017
11:59 AM
Hi. I am running Spark2 from Zeppelin (0.7 in HDP 2.6) and doing an IDF transformation which crashes after many hours. It runs on a cluster with a master and 3 datanodes (s1, s2 and s3). All nodes have a Spark2 client, and each has 8 cores and 16 GB RAM. I just noticed the job is only running on one node, s3, with 5 executors.

In zeppelin-env.sh I have set zeppelin.executor.instances to 32 and zeppelin.executor.mem to 12g, and it has the line:

export MASTER=yarn-client

I have set yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler, and I also set spark.executor.instances to 32 in the Spark2 interpreter. Does anyone have any ideas what else I can try to get the other nodes doing their share?
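For reference, one way to see where the executors actually landed while the job is running, using the standard YARN CLI (the ids below are placeholders, not values from this cluster):

yarn application -list -appStates RUNNING       # note the application id of the Zeppelin/Spark job
yarn applicationattempt -list <application-id>
yarn container -list <app-attempt-id>           # shows which host each container is on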
07-28-2017
06:40 AM
It was a setting in tez.lib.uris. I changed it to: /hdp/apps/${hdp.version}/tez/tez.tar.gz,hdfs://master.royble.co.uk:8020/jars/json-serde-1.3.7-jar-with-dependencies.jar (Note: no space after the comma, and the second entry is a full hdfs:// path.)
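To sanity-check a value like this, both entries should resolve in HDFS; the paths below are taken from the setting above, with ${hdp.version} expanded to the 2.6.0.3-8 build used elsewhere in this thread:

hdfs dfs -ls /hdp/apps/2.6.0.3-8/tez/tez.tar.gz
hdfs dfs -ls hdfs://master.royble.co.uk:8020/jars/json-serde-1.3.7-jar-with-dependencies.jar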
07-28-2017
05:38 AM
Thanks Deepesh. It is:

HIVE_AUX_JARS_PATH=/usr/hdp/2.6.0.3-8/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar

# keep the custom path only if it points to an existing file;
# otherwise fall back to the stock hive-hcatalog-core.jar
if [ "${HIVE_AUX_JARS_PATH}" != "" ]; then
  if [ -f "${HIVE_AUX_JARS_PATH}" ]; then
    export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}
  elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
    export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar
  fi
elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then
  export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar
fi
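Note that the custom path only survives the fallback logic above if the jar actually exists on the host running Hive, so it is worth checking there (path taken from the setting above):

ls -l /usr/hdp/2.6.0.3-8/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar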
07-26-2017
04:13 PM
I am trying to run Hive from the CLI:

HADOOP_USER_NAME=hdfs hive -hiveconf hive.cli.print.header=true -hiveconf hive.support.sql11.reserved.keywords=false -hiveconf hive.aux.jars.path=/usr/hdp/2.6.0.3-8/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar -hiveconf hive.root.logger=DEBUG,console

but I get this error:

java.lang.RuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://master.royble.co.uk:8020/user/hdfs/ /home/ed/Downloads/serde/json-serde-1.3.7-jar-with-dependencies.jar

I have had so many problems with that jar, which I originally used to create a Hive table. Normally I would do an 'add jar', but I cannot start Hive to do that. I have tried adding the jar to hive-env, to /usr/hdp/<version>/hive/auxlib (on the Hive machine) and to hive.aux.jars.path, but nothing works. Any idea why Hive is looking for that odd path, or in fact why it is looking for it at all?

FYI: master is not the machine with Hive on it, but it is where I run Ambari. The path /home/ed/Downloads/serde is one I have used in the past, but I cannot remember when. Using HDP-2.6.0.3. Any help is much appreciated as this is driving me mad!
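One thing worth trying, purely as a guess rather than a confirmed fix: pass hive.aux.jars.path with an explicit file:// scheme so Hive does not resolve the value against the default HDFS filesystem:

HADOOP_USER_NAME=hdfs hive \
  -hiveconf hive.aux.jars.path=file:///usr/hdp/2.6.0.3-8/hive/lib/json-serde-1.3.7-jar-with-dependencies.jar \
  -hiveconf hive.cli.print.header=true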
07-25-2017
10:33 AM
1 Kudo
In RStudio I do:

library(sparklyr)
library(dplyr)
Sys.setenv(SPARK_HOME="/usr/hdp/current/spark2-client") # got from the Ambari Spark2 configs
config <- spark_config()
sc <- spark_connect(master = "yarn-client", config = config, version = '2.1.0')

which gives:

Failed during initialize_connection: org.apache.hadoop.security.AccessControlException: Permission denied: user=ed, access=WRITE, inode="/user/ed/.sparkStaging/application_1500959138473_0003":admin:hadoop:drwxr-xr-x
Normally I fix this sort of problem with:

HADOOP_USER_NAME=hdfs hadoop fs -put

but I do not know how to do this in R. I thought maybe change ed's user and group to hdfs:

ed@master:~$ hdfs dfs -ls /user
Found 11 items
drwx------ - accumulo hdfs 0 2017-05-14 15:38 /user/accumulo
drwxr-xr-x - admin hadoop 0 2017-06-27 06:52 /user/admin
drwxrwx--- - ambari-qa hdfs 0 2017-06-02 10:46 /user/ambari-qa
drwxr-xr-x - admin hadoop 0 2017-06-02 11:00 /user/ed
drwxr-xr-x - hbase hdfs 0 2017-05-14 15:35 /user/hbase
drwxr-xr-x - hcat hdfs 0 2017-05-14 15:44 /user/hcat
drwxr-xr-x - hdfs hdfs 0 2017-06-20 12:43 /user/hdfs
drwxr-xr-x - hive hdfs 0 2017-05-14 15:44 /user/hive
drwxrwxr-x - oozie hdfs 0 2017-05-14 15:46 /user/oozie
drwxrwxr-x - spark hdfs 0 2017-05-14 15:40 /user/spark
drwxr-xr-x - zeppelin hdfs 0 2017-07-24 09:29 /user/zeppelin

but I am worried, as /user/ed is currently owned by admin:hadoop and admin is how I log into Ambari, so I do not want to mess up other stuff. Any help is much appreciated!
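In R, the equivalent of the HADOOP_USER_NAME=hdfs trick would be calling Sys.setenv(HADOOP_USER_NAME = "hdfs") before spark_connect(), but a less disruptive fix, assuming /user/ed really should belong to ed, is to change the ownership of just that one directory as the HDFS superuser:

HADOOP_USER_NAME=hdfs hdfs dfs -chown -R ed:hadoop /user/ed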
07-05-2017
04:32 PM
Here is how you do it. Got its 'name' from here. Spark 2.1 needs the Scala 2.11 version, so the name is databricks:spark-corenlp:0.2.0-s_2.11. Edit the Spark2 interpreter, add the name, save it and allow it to restart. Then, in Zeppelin:

%spark.dep
z.reset()
z.load("databricks:spark-corenlp:0.2.0-s_2.11")