Member since: 02-16-2016
Posts: 45
Kudos Received: 24
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 6209 | 07-28-2016 03:37 PM |
|  | 8990 | 02-20-2016 11:34 PM |
03-10-2019
08:36 AM
@hoda moradi Any updates?
07-28-2016
03:37 PM
I figured it out. I'm posting the answer for others with the same issue. The problem was missing Atlas jar files. Try copying all the jar files from the /usr/hdp/2.3.2.0-2950/atlas/hook/hive/ directory into the lib folder at the job.properties level.
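For anyone landing here later, the copy looks roughly like this from a shell, assuming your Oozie workflow application directory on HDFS is a hypothetical /user/oozie/hive-workflow (adjust to your own path):

```
# Create the lib/ folder in the workflow app directory
# (HDFS path is an assumption; use your own application directory)
hdfs dfs -mkdir -p /user/oozie/hive-workflow/lib

# Copy the Atlas hive-hook jars from the local filesystem into it
hdfs dfs -put /usr/hdp/2.3.2.0-2950/atlas/hook/hive/*.jar /user/oozie/hive-workflow/lib/
```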
07-28-2016
03:48 PM
I solved it by copying all the jar files from the /usr/hdp/2.3.2.0-2950/atlas/hook/hive/ directory into the lib folder at the job.properties level.
06-30-2016
08:23 PM
@mqureshi Thank you for your response. Adding hadoop.proxyuser.hive.groups=* solved that error. However, now I am getting a new one. I posted the new error and log here: https://community.hortonworks.com/questions/42720/main-class-orgapacheoozieactionhadoophivemain-exit.html Can you take a look and let me know if you can fix it?
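For anyone else hitting this: the property goes in core-site.xml (on HDP it can be set through Ambari, followed by an HDFS restart). A minimal sketch, with the companion hosts property that usually goes alongside it:

```xml
<!-- Allow the hive user to impersonate members of any group (and from any host). -->
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
```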
04-07-2016
09:20 PM
@hoda moradi If I understand your question correctly, you could try a state-management function with updateStateByKey (http://spark.apache.org/docs/latest/streaming-programming-guide.html#transformations-on-dstreams), where the key is the schema type field (I am assuming this is a String). Create a global map with the schema type field as the key and the corresponding data frame as the value. The function itself would look up the data frame object in the map you created earlier and then operate on it; the data you want to save should also be passed to the function. The stateful function is typically used to keep a running aggregate. However, because it partitions the DStream by the key you provide (I believe by creating separate DStreams), it should allow you to write generic logic where you look up the specifics (like target table and columns) at run time. Let me know if that makes sense; I can post some code if not.
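In the meantime, here is a rough sketch in Java of what I mean, assuming Spark 2.x (where Optional is org.apache.spark.api.java.Optional) and a made-up input format of "<schemaType>,<payload>" per line. It only keeps a running count per schema type, but the update function is where you would look up the matching data frame or target table:

```java
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.Optional;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

import scala.Tuple2;

public class SchemaTypeState {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("SchemaTypeState").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, new Duration(3000));
        ssc.checkpoint("/tmp/spark-checkpoint"); // updateStateByKey requires a checkpoint dir

        // Hypothetical source: each line is "<schemaType>,<payload>"
        JavaDStream<String> lines = ssc.socketTextStream("localhost", 9999);

        // Key every record by its schema type field (first comma-separated token)
        JavaPairDStream<String, String> byType = lines.mapToPair(line -> {
            String[] parts = line.split(",", 2);
            return new Tuple2<>(parts[0], parts.length > 1 ? parts[1] : "");
        });

        // Running count per schema type; in a real job this update function is
        // where you would look up the matching data frame / target table in your
        // global map and write the new values out.
        JavaPairDStream<String, Long> counts = byType.updateStateByKey(
            (List<String> newValues, Optional<Long> state) ->
                Optional.of(state.orElse(0L) + newValues.size()));

        counts.print();
        ssc.start();
        ssc.awaitTermination();
    }
}
```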
12-22-2016
06:14 PM
Hi, I am running a spark-submit command from an Oozie workflow, but I'm getting the error: Main class [org.apache.oozie.action.hadoop], exit code [1]. I just wanted to confirm: do I need to give the HDFS paths of the jar and keytab in spark-submit? Thanks in advance!
04-02-2016
09:23 AM
1 Kudo
@hoda moradi Can you please check whether permissions for the /tmp directory on HDFS are set to 777, with owner and group set to hdfs? Also, please double-check permissions for /mr-history/; it should be set to 777 and owned by mapred, with hadoop as the group.
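Roughly, the checks and fixes from a shell would look like this (assuming you can run them as the hdfs superuser):

```
# Inspect current permissions and ownership of the two directories
hdfs dfs -ls -d /tmp /mr-history

# Set them as described above
hdfs dfs -chmod 777 /tmp /mr-history
hdfs dfs -chown hdfs:hdfs /tmp
hdfs dfs -chown mapred:hadoop /mr-history
```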
03-24-2016
11:12 AM
Hello Hoda, so yes, you would do basically the same. But there are functions on the DStream that do that for you already: saveAsTextFiles and saveAsObjectFiles. As said, they essentially do the same thing you did before, i.e. a save on each RDD using a timestamp in the filename. @hoda moradi https://spark.apache.org/docs/1.1.1/api/java/org/apache/spark/streaming/dstream/DStream.html
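For completeness, the manual per-RDD version might look roughly like this in Java (assuming Spark 1.6+ for the two-argument foreachRDD, a JavaDStream<String> named lines, and a made-up output path):

```java
// Save each micro-batch under its own timestamped directory, which is
// essentially what DStream.saveAsTextFiles does for you.
lines.foreachRDD((rdd, time) -> {
    if (!rdd.isEmpty()) {                 // skip empty batches
        rdd.saveAsTextFile("/output/batch-" + time.milliseconds());
    }
});
```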
10-13-2016
10:42 AM
Hello Hoda, when I run this program I'm getting this error:

```
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/10/13 16:07:48 INFO Remoting: Starting remoting
16/10/13 16:07:49 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.17.81:40334]
Exception in thread "main" java.lang.IncompatibleClassChangeError: class org.apache.spark.streaming.scheduler.StreamingListenerBus has interface org.apache.spark.scheduler.SparkListener as super class
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.spark.streaming.scheduler.JobScheduler.<init>(JobScheduler.scala:54)
	at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:183)
	at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:84)
	at org.apache.spark.streaming.api.java.JavaStreamingContext.<init>(JavaStreamingContext.scala:138)
	at SparkTest.main(SparkTest.java:29)
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 4.181s
Finished at: Thu Oct 13 16:07:50 IST 2016
Final Memory: 15M/212M
------------------------------------------------------------------------
Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (default-cli) on project SparkPractise: Command execution failed. Process exited with an error: 1 (Exit value: 1) -> [Help 1]
To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.
For more information about the errors and possible solutions, please read the following articles:
[Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
```
and line 29 of SparkTest.java is:

```java
JavaStreamingContext ssc = new JavaStreamingContext(conf, new Duration(3000));
```

Please help me with this.
02-20-2016
11:34 PM
1 Kudo
I solved that error by adding this dependency to my project:

```xml
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.17</version>
</dependency>
```