Member since
11-08-2016
32
Posts
7
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
589 | 11-09-2016 08:58 AM |
06-19-2017
11:51 AM
Unfortunately it does not change it. The tests are still not run. 😞
... View more
06-13-2017
01:36 PM
Hi guys, I have read the article about testing and I would like to test spark-testing-base with spark. Unfortunately I am not an expert with Maven so I don't get the tests to run. My project looks like this: pom.xml
src/main/scala/com/test/spark/mycode.scala
src/test/scala/com/test/spark/test.scala I can run 'mvn package' without problems. But it says [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ mycode ---
[INFO] Nothing to compile - all classes are up to date
Why does my test not run? I have added the dependency as explained on the git for scala-testing-base and used the first example from its wiki. My pom.xml looks like this: <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.test.spark</groupId>
<artifactId>mycode</artifactId>
<version>0.0.1</version>
<name>${project.artifactId}</name>
<description>Simple wtest</description>
<inceptionYear>2017</inceptionYear>
<!-- change from 1.6 to 1.7 depending on Java version -->
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.11.5</scala.version>
<scala.compat.version>2.11</scala.compat.version>
<spark.version>1.6.1</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- Spark dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Spark sql dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Spark hive dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- spark-testing-base dependency -->
<dependency>
<groupId>com.holdenkarau</groupId>
<artifactId>spark-testing-base_2.11</artifactId>
<version>${spark.version}_0.6.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalactic</groupId>
<artifactId>scalactic_2.11</artifactId>
<version>3.0.1</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.11</artifactId>
<version>3.2.0-SNAP5</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<!-- Create JAR with all dependencies -->
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.0.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<scalaCompatVersion>${scala.compat.version}</scalaCompatVersion>
</configuration>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<!-- for testing scala code -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.20</version>
<configuration>
<argLine>-Xmx2048m -XX:MaxPermSize=2048m</argLine>
</configuration>
</plugin>
</plugins>
</build>
</project>
Do you have any advise what I need to change in order to have my test run?? Best, Ken
... View more
Labels:
- Labels:
-
Apache Spark
04-20-2017
05:51 AM
Hi @Jay Zhou Can you be a bit more specific what you have changed? What did you exactly do with this line? val hiveSqlContext = new org.apache.spark.sql.hive.HiveContext(sc) I have a similar problem where I get an error WARN Hive: Failed to access metastore. This class should not accessed in runtime. but this is only when I run the job via Oozie. When I use spark submit the code works so I guess the dependencies are right. Do you have any idea what can cause this?
... View more
03-23-2017
06:39 AM
Hi @Vinod Bonthu, your answer really helped but unfortunately it wasn't the complete solution.
As statd in my error log I had to add some jars via the spark-submit --jars
command and I also added my hive site with spark-submit --files /usr/hdp/current/spark-client/conf/hive-site.xml
With those two changes and removing .setMaster("local[2]")
it worked! Thanks for your help!
... View more
03-22-2017
11:21 AM
Yes I did. The reult is -rwxr-xr-x 3 USER USER 12252 2017-03-17 10:49 /path-to-jar/my.jar And the ouput when I run the spark-commit command is stated above.
... View more
03-22-2017
10:33 AM
Hi @Aditya Deshpande, 17/03/22 11:27:53 INFO HiveContext: Initializing execution hive, version 1.2.1
17/03/22 11:27:53 INFO ClientWrapper: Inspected Hadoop version: 2.7.3.2.5.0.0-1245
17/03/22 11:27:53 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.3.2.5.0.0-1245
17/03/22 11:27:53 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/03/22 11:27:53 INFO ObjectStore: ObjectStore, initialize called
17/03/22 11:27:53 WARN HiveMetaStore: Retrying creating default database after error: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
javax.jdo.JDOFatalUserException: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1175)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
...
...
That is the first error I find in the logs in stderr. Another one is 17/03/22 11:27:53 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
17/03/22 11:27:53 INFO ObjectStore: ObjectStore, initialize called
17/03/22 11:27:53 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
...
...
Can you make use of this?
... View more
03-22-2017
06:41 AM
Hi @Vinod Bonthu, Thanks gor the quick guide but I have a follow up question. So it seems my pom is fine since I can package it with maven. Although when I run it via "spark-submit" I get the error stated in my original post. But when I run one of the spark examples by executing below code works. So why is that? I have the feeling that still something is not right and as a forst step I just want to run an example code packaged by myself via spark-submit. Afterwards I will try an oozie workflow. Thanks again! spark-submit org.apache.spark.exaples.SOMEEXAMPlE --master yarn-cluster hdfs://path-to-spark-examples.jar
... View more
03-21-2017
05:20 PM
Hi pbarna, Oh I am sorry. Yes I could run mvn package without any errors (at the beginninh some dependencies were missing but I fixed that). The error in my original post is when I try to run my paavked jar on HDP. So I do not need any cluster or Hortonworks specific things in my pom, right?
... View more
03-21-2017
02:59 PM
Hi folks, I would like to make a minimal example packed with Maven based on Scala code (like a hello world) and run it on a HDP2.5 sandbox. What do I need to specify in my pom.xml? So far I have this: <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.test.spark</groupId>
<artifactId>Test</artifactId>
<version>0.0.1</version>
<name>${project.artifactId}</name>
<description>Simple test app</description>
<inceptionYear>2017</inceptionYear>
<!-- change from 1.6 to 1.7 depending on Java version -->
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.11.5</scala.version>
<scala.compat.version>2.11</scala.compat.version>
<spark.version>1.6.1</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- Spark dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Spark sql dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Spark hive dependency -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.compat.version}</artifactId>
<version>${spark.version}</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<!-- Create JAR with all dependencies -->
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.0.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<scalaCompatVersion>${scala.compat.version}</scalaCompatVersion>
</configuration>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
And my scala code is this: package com.test.spark
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql
import org.apache.commons.lang
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.rdd.RDD
object Test {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("Test") .setMaster("local[2]")
val spark = new SparkContext(conf)
println( "Hello World!" )
}
}
I run the code with spark-submit --class com.test.spark.Test --master yarn --deploy-mode cluster hdfs://HDP25/test.jar
Unfortunately it does not run. 😞 See the image attached. Am I missing something? Can you please help me to get a minimal example running? Thanks and kind regards
... View more
Labels:
- Labels:
-
Apache Spark
02-27-2017
10:15 AM
Hi Sunile, Thanks for your answer. We think we will store our initial model and then all alter scripts. But all alter scripts will be included in the initial model in cas a complete re-deployment is wanted. To view a logical model a tool will be used which can reverse engineer the ddl. We try to establish a workflow in that fashion and hope that works.
... View more