Member since
12-12-2015
20
Posts
20
Kudos Received
1
Solution
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 3642 | 02-23-2016 05:58 PM |
02-23-2016
05:58 PM
I was able to fix this issue by setting /home/centos permissions to "775" (adding the execute bit) @Ian Roberts
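For anyone hitting the same error=13: the fix above is just a chmod so that every directory on the path to the interpreter is traversable by the yarn user. A minimal runnable sketch, using a temporary directory in place of /home/centos so it runs anywhere:

```shell
# Sketch of the fix: error=13 usually means a directory on the path to the
# interpreter is missing the execute (traverse) bit for the calling user.
# A temp dir stands in for /home/centos here.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/apps/pyspark/venv/bin"

# Group and other need 'x' on every parent directory, hence 775.
chmod 775 "$demo_home"

stat -c '%a' "$demo_home"   # prints 775
```

On the cluster the same chmod would be applied to /home/centos itself on every NodeManager host.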
02-23-2016
03:49 PM
here are the permissions for the python binary @Ian Roberts: -rwxrwxrwx 1 centos centos 7136 Feb 23 13:22 python
02-23-2016
02:10 PM
hi @Ian Roberts drwxrwxrwx 4 centos centos 57 Feb 23 13:22 pyspark
These are the permissions for the pyspark venv ("777"), and I installed the pyspark venv on every NodeManager with the same permissions.
02-23-2016
02:03 PM
2 Kudos
Hi, I got the error below:
Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times,
most recent failure: Lost task 0.3 in stage 0.0 (TID 3,
ip-internal): java.io.IOException:
Cannot run program
"/home/centos/apps/pyspark/venv/bin/python": error=13,
Permission denied
while running:
$ export HADOOP_USER_NAME=centos
$ export PYSPARK_PYTHON=/home/centos/apps/pyspark/venv/bin/python
$ pyspark --master yarn-client
I added the user centos to the hdfs group in Hadoop. I'm using the latest HDP 2.3.4 and CentOS 7.2.
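A quick way to diagnose this kind of error=13 is to walk every component of the interpreter path and check the mode bits; namei does this in one command. A runnable sketch, demonstrated on a path that exists on any Linux box (on the cluster you would point it at the venv interpreter path instead):

```shell
# namei prints the mode of each path component; a directory without 'x'
# for the yarn user explains error=13 even when the file itself is 777.
namei -m /usr/bin/env

# On a NodeManager host you would run it against the interpreter instead:
#   namei -m /home/centos/apps/pyspark/venv/bin/python
```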
02-17-2016
02:09 PM
Thanks for the information @Artem Ervits. For now I can run Spark 1.6 from /usr/hdp/2.3.4.1-10/spark/bin.
02-17-2016
01:01 PM
1 Kudo
Do you know the estimated release date for HDP 2.4? @Artem Ervits
02-17-2016
12:52 PM
I think I already installed the Spark client @Artem Ervits. I installed Spark 1.5 from Ambari first, and after that I upgraded Spark from 1.5 to 1.6. When I try to run pyspark I get this error. Is it possible to upgrade Spark on the latest HDP version?
02-17-2016
12:40 PM
1 Kudo
hi @Artem Ervits, yes, I upgraded HDP to 2.3.4, but I hit a little problem while trying to upgrade Spark 1.5 to 1.6; here is my question about that.
02-17-2016
12:37 PM
hi @Neeraj Sabharwal, this is the output of rpm -qa | grep -i spark: spark_2_3_4_0_3485-master-1.5.2.2.3.4.0-3485.el6.noarch
spark_2_3_4_1_10-1.6.0.2.3.4.1-10.el6.noarch
spark_2_3_4_0_3485-python-1.5.2.2.3.4.0-3485.el6.noarch
spark_2_3_4_1_10-master-1.6.0.2.3.4.1-10.el6.noarch
spark_2_3_4_0_3485-worker-1.5.2.2.3.4.0-3485.el6.noarch
spark_2_3_4_1_10-python-1.6.0.2.3.4.1-10.el6.noarch
spark_2_3_4_0_3485-1.5.2.2.3.4.0-3485.el6.noarch
Yes, I ran yum install spark_2_3_4_1_10-master -y as in the tutorial from the link.
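Since the rpm output shows both package sets installed side by side (the 1.5.2 build 2.3.4.0-3485 and the 1.6.0 build 2.3.4.1-10), the active version on an HDP node is controlled by the hdp-select symlinks under /usr/hdp/current. A hedged sketch, with the version string taken from the package names above (verify it with the first command before setting anything):

```shell
# List the HDP stack versions installed on this node.
hdp-select versions

# Point the spark-client symlinks under /usr/hdp/current at the 1.6 build.
hdp-select set spark-client 2.3.4.1-10
```

These commands only exist on HDP nodes, so this is an on-cluster admin fragment rather than something runnable locally.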
02-17-2016
12:11 PM
3 Kudos
Hi all, I just upgraded Spark 1.5 to 1.6 on HDP 2.3. I'm following the tutorial here, and I get this error when trying to run pyspark from the terminal: /usr/bin/pyspark: line 22: /usr/bin/load-spark-env.sh: No such file or directory
/usr/bin/spark-class: line 23: /usr/bin/load-spark-env.sh: No such file or directory
ls: cannot access /usr/assembly/target/scala-: No such file or directory
Failed to find Spark assembly in /usr/assembly/target/scala-.
You need to build Spark before running this program.
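The /usr/bin/pyspark wrapper failing to find load-spark-env.sh usually means the wrapper (or the /usr/hdp/current symlink it relies on) still points at the old layout. One hedged workaround, consistent with the later reply in this thread that running from /usr/hdp/2.3.4.1-10/spark/bin works, is to bypass the wrapper and use the versioned install directly:

```shell
# Run the 1.6 client from its versioned install dir instead of the broken
# /usr/bin wrapper; the path is the HDP 2.3.4.1-10 layout from this thread.
export SPARK_HOME=/usr/hdp/2.3.4.1-10/spark
"$SPARK_HOME/bin/pyspark" --master yarn-client
```

This is a cluster-side fragment; it assumes the 2.3.4.1-10 Spark package is installed at that path.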
02-16-2016
06:36 AM
2 Kudos
hi @Predrag Minovic, thanks for the answer. Do you mean I just need to follow the tutorial? I didn't see an example for Ubuntu; is it done the same way as on CentOS?
02-16-2016
03:57 AM
2 Kudos
Hi, what is the best practice for upgrading Spark 1.4 to Spark 1.5 on HDP-2.3.2.0-2950 when using Ubuntu?
02-05-2016
05:45 PM
2 Kudos
hi mclark, thanks, but what should I do if I want to connect to a remote HDFS? Do I need to install Hadoop on my HDF server to get the Hadoop file system configuration?
02-05-2016
04:28 PM
2 Kudos
Hi all, can anybody share the best practice for installing HDF: should it be installed inside HDP or on a standalone server? If HDF is installed on a standalone server, how does it connect to HDFS on the HDP cluster?
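For reference, NiFi's HDFS processors do not need a full Hadoop install on the HDF node; they typically only need the cluster's client config files. A hedged sketch of wiring a standalone HDF node to a remote HDFS (the hostname and destination directory are placeholders, not from this thread):

```shell
# Copy the client configs from an HDP node to the standalone HDF node;
# 'hdp-master' and /etc/hdf/hadoop are placeholder names.
mkdir -p /etc/hdf/hadoop
scp hdp-master:/etc/hadoop/conf/core-site.xml /etc/hdf/hadoop/
scp hdp-master:/etc/hadoop/conf/hdfs-site.xml /etc/hdf/hadoop/

# In NiFi, set the PutHDFS/GetHDFS processor property
# "Hadoop Configuration Resources" to:
#   /etc/hdf/hadoop/core-site.xml,/etc/hdf/hadoop/hdfs-site.xml
```

This is a configuration fragment; the NiFi property name is the one the HDFS processors expose, but the file locations are assumptions.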
12-22-2015
02:15 PM
1 Kudo
Hi @Neeraj Sabharwal, thanks for the response. I tried:
hadoop dfsadmin -report => Decommission Status : Normal
hadoop fsck /data/catalogs/visitor -files -blocks => filesystem under path '/data/catalogs/visitor' is HEALTHY
I also increased the open-file ulimit to 1 million. I think this error happens because pyspark reads too many small files in my HDFS. Do you know the best practice for merging small files like in the picture below into one file, so that pyspark does not open too many files while running the modeling? Thanks
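As an illustration of one common way to compact a directory of small part files (the HDFS paths are the ones from this thread; getmerge pulls everything to local disk first, so it only fits data that fits on one machine):

```shell
# Concatenate all part files under the HDFS directory into one local file,
# then push the merged file back to HDFS under a new directory.
hadoop fs -getmerge /data/catalogs/visitor/1450770445 /tmp/visitor_merged
hadoop fs -mkdir -p /data/catalogs/visitor_merged
hadoop fs -put /tmp/visitor_merged /data/catalogs/visitor_merged/part-00000
```

A cluster-side fragment; for larger data, repartitioning to fewer files from within the Spark job itself avoids the local-disk round trip.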
12-22-2015
10:45 AM
2 Kudos
Hi, I run pyspark on my Hadoop cluster using spark-submit:
spark-submit --master yarn-client --driver-memory 4g --executor-memory 6g --total-executor-cores 10 --num-executors 5 --conf spark.yarn.queue=alpha --conf spark.executor.instances=5 usr_recommendation.py
I got this error:
java.io.FileNotFoundException: /hadoop/yarn/local/usercache/hdfs/appcache/application_1450771823865_0008/blockmgr-16947187-1ea7-4e42-a652-52559363c4d7/1f/temp_shuffle_9f5ed80b-eabc-4eb5-892a-ce5a7c0c0d0e (Too many open files)
and also this error:
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1784711150-172.16.200.242-1447830226283:blk_1074661844_921035 file=/data/catalogs/visitor/1450770445/part-04752
Is this a configuration issue? These are my settings for HDFS:
hdfs_user_nproc_limit = 100000
hdfs_user_nofile_limit = 1000000
and these are my settings for YARN:
yarn_user_nofile_limit = 100000
yarn_user_nproc_limit = 1000000
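Note that nofile/nproc limits set via Ambari only apply to processes started after the change, so it is worth confirming what a shell (and, on the cluster, the NodeManager process itself) actually sees. A runnable check:

```shell
# Soft limit on open file descriptors for the current shell.
ulimit -Sn

# Hard limit (the ceiling the soft limit can be raised to).
ulimit -Hn
```

If these print the old values on the worker nodes, the "Too many open files" error will persist regardless of the Ambari settings until the services are restarted.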
12-14-2015
02:07 PM
@Neeraj Sabharwal thank you, I will vote for the JIRA.
12-14-2015
12:06 PM
Hi Neeraj, thanks for the link. So there is no solution yet for this problem?
12-14-2015
11:56 AM
1 Kudo
I have a problem while trying to run spark-submit in yarn-cluster mode.
Below is my spark-submit command: spark-submit --master yarn-cluster --name spark_ml user_recommendation.py
I get the error below: java.io.FileNotFoundException: File does not exist: hdfs://name-node:8020/user/spark/.sparkStaging/application_1450092198211_0007/pyspark.zip Is this a configuration issue? Thanks, Coktra
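The .sparkStaging error in yarn-cluster mode often comes down to the submitting user not owning (or not being able to write to) its HDFS home directory, where Spark uploads pyspark.zip for the cluster-side driver. A hedged set of checks, with paths taken from the error message above:

```shell
# Confirm the staging dir exists and who owns it; in yarn-cluster mode the
# driver runs on the cluster and reads pyspark.zip from here.
hdfs dfs -ls /user/spark/.sparkStaging

# Make sure the submitting user ('spark' in the error message; substitute
# your own) owns a writable home directory in HDFS.
hdfs dfs -ls -d /user/spark
```

A cluster-side fragment; if the home directory is missing or owned by hdfs, creating it and chown-ing it to the submitting user is the usual first step.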