Member since: 12-23-2016
Posts: 7
Kudos Received: 0
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 15939 | 01-23-2017 01:19 AM
 | 2472 | 01-20-2017 07:59 AM
02-03-2017 07:22 AM
Thanks! This also worked for me.
02-02-2017 09:31 AM
Same problem here! I can also run the job successfully with spark-submit. I noticed additional information in the YARN log when I try to run it via Oozie and Hue:

2017-02-02 18:06:45,276 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: hdfs is accessing unchecked http://vms4580.saipm.com:43355/ws/v1/mapreduce/jobs/job_1486053773758_0003?user.name=hue&doAs=hdfs which is the app master GUI of application_1486053773758_0003 owned by hdfs

I also cannot track the application when I run it via Oozie, so it seems it can't connect correctly to the web tracking service. I have a multihomed cluster, maybe the problem is linked to that? My cluster is also small, only 2 nodes. I also read that on a small cluster each queue assigns a fixed amount of memory (2048 MB) for a single MapReduce job, so if more than one MapReduce job runs in the same queue it hits a deadlock. However, it is still not working after increasing the memory size and the Java heap size.
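For anyone debugging the same thing, a diagnostic sketch (standard YARN CLI, nothing specific to this cluster): list what YARN is running while the workflow is stuck. If the Oozie launcher job shows as RUNNING and the Spark/MapReduce job it launched stays in ACCEPTED, the queue cannot fit a second ApplicationMaster, which matches the small-cluster deadlock described above.

# List applications that are running or still waiting for resources.
yarn application -list -appStates RUNNING,ACCEPTED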
01-24-2017 06:50 AM
Hi, I'm trying to run a simple Python script on Oozie using Hue. I have the Anaconda parcel installed, so I've also added the following to the Spark configuration in Cloudera Manager (Spark Service Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh):

if [ -z "${PYSPARK_PYTHON}" ]; then
  export PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python
fi

When running the job, I get a Python error, ImportError: No module named pandas.io.json, meaning that PYSPARK_PYTHON doesn't seem to pick up the Anaconda interpreter. I've tried adding an argument with PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python on the Spark action in Hue, but that doesn't seem to work either. If I run the script via the CLI with spark-submit, it works. If I run other Python scripts on Oozie via Hue (without Anaconda packages), it works. What am I missing? 😕
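One thing worth testing from the command line first, as a sketch rather than a confirmed fix: if the Oozie launcher does not source spark-env.sh, the interpreter can instead be passed through standard Spark-on-YARN properties. The script name my_script.py below is a placeholder; the parcel path is the one quoted above. The same two --conf entries can also be added as options on the Spark action in Hue.

# Force both the YARN application master and the executors to use the Anaconda parcel's Python.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python \
  --conf spark.executorEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python \
  my_script.py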
Labels:
- Apache Oozie
- Apache Spark
- Cloudera Hue
01-23-2017 01:19 AM
Yeah, I already had a look over there. I solved this by adding /opt/cloudera/parcels/Anaconda/bin:$PATH to my PATH. After that I was able to use the conda command and pip install, BUT when installing a new package it doesn't install across the cluster, so I need to install it manually on each machine...
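For reference, a minimal sketch of what "install manually on each machine" can look like, assuming passwordless SSH from one node and purely illustrative hostnames (node1, node2); pandas is used as the example package since it is the one missing in the post above.

# Install the same package into the Anaconda parcel on every cluster node.
for host in node1 node2; do
  ssh "$host" "sudo /opt/cloudera/parcels/Anaconda/bin/pip install pandas"
done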
01-20-2017 08:28 AM
Hi there, I've successfully installed the Anaconda parcel, but how do I install new packages on this distribution? I mean, how do I use conda? Should I add conda to my PATH, and if so, where is it by default? The documentation is not clear 😞 Thanks for the help, Mehdi
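(Resolution sketch, based on the 01-23-2017 reply above: the parcel's binaries live under its default install location, so adding that directory to PATH makes conda and the parcel's pip available on that node.)

# Put the Anaconda parcel's binaries on the PATH for the current shell.
export PATH=/opt/cloudera/parcels/Anaconda/bin:$PATH
conda --version   # should now resolve to the parcel's conda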
Labels:
- Cloudera Manager
01-20-2017 07:59 AM
The problem was solved by re-adding the JAR file (the jars were maybe corrupted) and restarting the cluster.
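(For anyone hitting the same thing, a sketch of what "re-adding the JAR file" can look like on a CDH install, assuming the Sybase jConnect driver jconn4.jar and the standard Sqoop client lib directory /var/lib/sqoop; both names are illustrative, adjust to your setup.)

# Re-copy the JDBC driver where the Sqoop client looks for it, with readable permissions.
sudo cp jconn4.jar /var/lib/sqoop/
sudo chmod 644 /var/lib/sqoop/jconn4.jar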
12-23-2016 06:30 AM
Hi, I'm trying to do a basic Sqoop import from a Sybase DB:

sqoop import --connect jdbc:sybase:Tds:syserver:5000/mydb?ServiceName=master --username user --password password --driver com.sybase.jdbc4.jdbc.SybDriver --query "SELECT * FROM point_vente WHERE \$CONDITIONS" --split-by id --target-dir /user/sybase/test.csv

I've correctly added the JAR file required for Sybase, I'm able to use sqoop list-tables and sqoop list-databases, and I can run the same query above with sqoop eval and get correct results. BUT when I launch the job above, it gets stuck; the job is blocked on booting. I only have this for debugging the deadlock:

2016-12-23 13:21:08,222 DEBUG driver.JobManager [org.apache.sqoop.driver.JobManager$UpdateThread.run(JobManager.java:671)] Updating running submissions
2016-12-23 13:21:08,223 DEBUG repository.JdbcRepositoryTransaction [org.apache.sqoop.repository.JdbcRepositoryTransaction.begin(JdbcRepositoryTransaction.java:74)] Tx count-begin: 1, rollback: false
2016-12-23 13:21:08,223 DEBUG repository.JdbcRepositoryTransaction [org.apache.sqoop.repository.JdbcRepositoryTransaction.commit(JdbcRepositoryTransaction.java:85)] Tx count-commit: 1, rollback: false
2016-12-23 13:21:08,223 DEBUG repository.JdbcRepositoryTransaction [org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:104)] Tx count-close: 0, rollback: false
2016-12-23 13:21:08,224 INFO repository.JdbcRepositoryTransaction [org.apache.sqoop.repository.JdbcRepositoryTransaction.close(JdbcRepositoryTransaction.java:112)] Attempting transaction commit

I've tried with Sqoop2 using Hue, same problem. Note: I'm running on the Cloudera quickstart Docker image.
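One way to narrow this down, as a sketch using the same connection settings as above: run the import with a single mapper (-m 1 is a standard Sqoop option), which takes the --split-by parallelism and extra map tasks out of the picture on the single-node quickstart container.

# Same import, limited to one mapper so only a single map task needs to be scheduled.
sqoop import \
  --connect "jdbc:sybase:Tds:syserver:5000/mydb?ServiceName=master" \
  --username user \
  --password password \
  --driver com.sybase.jdbc4.jdbc.SybDriver \
  --query "SELECT * FROM point_vente WHERE \$CONDITIONS" \
  -m 1 \
  --target-dir /user/sybase/test.csv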
Labels:
- Apache Sqoop
- Quickstart VM