Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3557 | 05-03-2017 05:13 PM |
| | 2933 | 05-02-2017 08:38 AM |
| | 3183 | 05-02-2017 08:13 AM |
| | 3145 | 04-10-2017 10:51 PM |
| | 1621 | 03-28-2017 02:27 AM |
07-08-2016
02:06 PM
1 Kudo
I have a similar problem. Running:

hadoop jar HBaseBulkLoader.jar HBaseBulkLoadDriver ../flume/data/MY_SCHEMA.TAB_BL_10C op1
WARNING: Use "yarn jar" to launch YARN applications.
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at HBaseBulkLoadDriver.main(HBaseBulkLoadDriver.java:31)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

It says HBaseConfiguration.class was not found. I have also tried setting the following environment variables:

# echo $LIBJARS
/usr/hdp/2.4.0.0-169/hadoop/lib/*.jar,/usr/hdp/2.4.0.0-169/hbase/lib/*.jar
# echo $HADOOP_TASKTRACKER_OPTS
/usr/hdp/2.4.0.0-169/hadoop/lib:/usr/hdp/2.4.0.0-169/hbase/lib
# echo $HADOOP_CLASSPATH
/usr/hdp/2.4.0.0-169/hadoop/lib:/usr/hdp/2.4.0.0-169/hbase/lib
# echo $CLASSPATH
/usr/hdp/2.4.0.0-169/flume/lib:/usr/hdp/2.4.0.0-169/hbase/lib

I have also tried the following:

hadoop jar HBaseBulkLoader.jar HBaseBulkLoadDriver -D mapred.child.env="/usr/hdp/2.4.0.0-169/hbase/lib/" ../flume/data/MY_SCHEMA.TAB_BL_10C op1
hadoop jar HBaseBulkLoader.jar HBaseBulkLoadDriver -libjars /usr/hdp/2.4.0.0-169/hbase/lib/hbase-common.jar ../flume/data/MY_SCHEMA.TAB_BL_10C op1.txt

It is all the same error. Can anybody help with what is wrong with my approach?
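A common fix for this class of error (a sketch, not confirmed from this thread) is to put the actual HBase jars on the classpath that hadoop jar uses. Plain directory entries such as /usr/hdp/2.4.0.0-169/hbase/lib only pick up .class files on a Java classpath, not the jars inside them, whereas the hbase classpath command expands to the full jar list:

# Sketch: prepend the full HBase classpath before launching the job.
# Directory-only entries (as in $HADOOP_CLASSPATH above) do not match jars.
export HADOOP_CLASSPATH="$(hbase classpath):$HADOOP_CLASSPATH"
hadoop jar HBaseBulkLoader.jar HBaseBulkLoadDriver ../flume/data/MY_SCHEMA.TAB_BL_10C op1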
03-22-2016
04:29 AM
Thank you. I got it working by manually downloading hdp-select and hdfs-client.
12-09-2018
07:56 AM
It was a firewall issue. I added the firewall rule on the Ambari server, which fixed the passwordless-login issue and let the host register.
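For reference, a sketch of the two fixes involved (the port numbers and host name are assumptions, not from the original post; 8440/8441 are Ambari's default agent registration and communication ports):

# On the Ambari server (assuming firewalld on CentOS/RHEL 7):
firewall-cmd --permanent --add-port=8440/tcp
firewall-cmd --permanent --add-port=8441/tcp
firewall-cmd --reload
# Passwordless SSH from the Ambari server to the host being registered:
ssh-keygen -t rsa                  # skip if a key already exists
ssh-copy-id root@new-host.example  # hypothetical host name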
03-17-2016
06:00 PM
1 Kudo
@Artem Ervits, thanks for the replies; you have provided valuable information. For this project I have to stick with the requirement of using Amazon technologies, or choose another option for my school project that does not involve any of these technologies (that's the reason I want to choose this option using MapReduce, clouds, etc.). I am planning and looking to work in the field of data mining. I have been checking and noticed, as you mentioned, that there are companies using Apache Spark and the like. On the language side, I also noticed that Java is the way to go. When I previously asked my professor what the "top" language is when working with data mining, he answered as follows: "Depending on a professional's background, top data mining languages may vary. For a professional with a computer science background, Java/SQL or Python is favored. For a professional with a statistics background, R is favored. For a professional with an engineering background, Matlab is favored. Keep in mind that data mining is everywhere and people with diverse backgrounds are working in this hot field." Thanks again for your valuable information.
07-19-2016
11:54 PM
As @Artem Ervits mentioned, Oozie Spark Action is not yet supported. Instead, you can follow the alternative from the tech note below: https://community.hortonworks.com/content/kbentry/51582/how-to-use-oozie-shell-action-to-run-a-spark-job-i-1.html
--------------------
Begin Tech Note
--------------------

Spark action in Oozie is not supported in HDP 2.3.x and HDP 2.4.0, and there is no direct support especially in a Kerberos environment. As a workaround, we can use either a Java action or a shell action to launch a Spark job from an Oozie workflow. In this article, we will discuss how to use an Oozie shell action to run a Spark job in a Kerberos environment.

Prerequisites:

1. The Spark client is installed on every host where a NodeManager is running, because we have no control over which node the shell action will run on.
2. Optionally, if the Spark job needs to interact with an HBase cluster, the HBase client needs to be installed on every host as well.

Steps:

1. Create a shell script with the spark-submit command. For example, in script.sh:

/usr/hdp/current/spark-client/bin/spark-submit --keytab keytab --principal ambari-qa-falconJ@FALCONJSECURE.COM --class org.apache.spark.examples.SparkPi --master yarn-client --driver-memory 500m --num-executors 1 --executor-memory 500m --executor-cores 1 spark-examples.jar 3

2. Prepare the Kerberos keytab which will be used by the Spark job. For example, we use the Ambari smoke test user; its keytab is already generated by Ambari in /etc/security/keytabs/smokeuser.headless.keytab.

3. Create the Oozie workflow with a shell action which will execute the script created above. For example, in workflow.xml:
<workflow-app name="WorkFlowForShellAction" xmlns="uri:oozie:workflow:0.4">
<start to="shellAction"/>
<action name="shellAction">
<shell xmlns="uri:oozie:shell-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<exec>script.sh</exec>
<file>/user/oozie/shell/script.sh#script.sh</file>
<file>/user/oozie/shell/smokeuser.headless.keytab#keytab</file>
<file>/user/oozie/shell/spark-examples.jar#spark-examples.jar</file>
<capture-output/>
</shell>
<ok to="end"/>
<error to="killAction"/>
</action>
<kill name="killAction">
<message>"Killed job due to error"</message>
</kill>
<end name="end"/>
</workflow-app>
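Since the workflow ships the keytab under the symlink name keytab, a variant of script.sh that authenticates explicitly before calling spark-submit might look like the sketch below. The explicit kinit is an assumption on my part, useful mainly when the script needs to touch HDFS outside of spark-submit; passing --keytab/--principal as in step 1 already covers the Spark job itself:

#!/bin/bash
# Sketch: obtain a Kerberos ticket from the distributed keytab, then submit.
kinit -kt keytab ambari-qa-falconJ@FALCONJSECURE.COM
/usr/hdp/current/spark-client/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  --driver-memory 500m --num-executors 1 \
  --executor-memory 500m --executor-cores 1 \
  spark-examples.jar 3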
4. Create the Oozie job properties file. For example, in job.properties:
nameNode=hdfs://falconJ1.sec.support.com:8020
jobTracker=falconJ2.sec.support.com:8050
queueName=default
oozie.wf.application.path=${nameNode}/user/oozie/shell
oozie.use.system.libpath=true

5. Upload the following files created above to the Oozie workflow application path in HDFS (in this example: /user/oozie/shell):
- workflow.xml
- smokeuser.headless.keytab
- script.sh
- Spark uber jar (in this example: /usr/hdp/current/spark-client/lib/spark-examples*.jar)
- any other configuration file mentioned in the workflow (optional)

6. Execute the oozie command to run this workflow. For example:

oozie job -oozie http://<oozie-server>:11000/oozie -config job.properties -run

--------------------
End Tech Note
--------------------
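As a usage note, the -run command prints a job ID that can be fed back to the same Oozie CLI to track the workflow (the job ID below is a placeholder):

oozie job -oozie http://<oozie-server>:11000/oozie -info <job-id>   # status of each action
oozie job -oozie http://<oozie-server>:11000/oozie -log <job-id>    # consolidated log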
03-16-2016
10:13 PM
The NCM in a NiFi cluster typically needs more heap memory. The number of components (processors, input ports, output ports, and relationships) on the graph multiplied by the number of nodes in the NiFi cluster drives how much memory your NCM will need. For roughly 300-400 components and a 3-4 node cluster, the NCM does well with 8GB of heap. If you still encounter heap issues, increase the heap size and/or reduce the stats buffer size and/or snapshot frequency in the nifi.properties files (NCM and nodes):

nifi.components.status.repository.buffer.size=360 (default is 1440)
nifi.components.status.snapshot.frequency=5 min (default is 1 min)

This information is accurate as of NiFi 0.5.1 and HDF 1.1.2.
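For reference, the heap itself is set in conf/bootstrap.conf; a minimal sketch matching the 8GB figure above (the java.arg.2/java.arg.3 indices follow NiFi's stock bootstrap.conf and may differ in a customized file):

# conf/bootstrap.conf
java.arg.2=-Xms8g
java.arg.3=-Xmx8g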
03-23-2017
02:28 AM
Look for "Hortonworks Sandbox Archive" under https://hortonworks.com/downloads/#sandbox and click "Expand" to find older versions of the sandbox.
03-12-2016
03:22 PM
1 Kudo
Thanks, @Artem Ervits.