Member since: 10-09-2015
Posts: 76
Kudos Received: 33
Solutions: 11
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4924 | 03-09-2017 09:08 PM |
 | 5261 | 02-23-2017 08:01 AM |
 | 1696 | 02-21-2017 03:04 AM |
 | 2049 | 02-16-2017 08:00 AM |
 | 1080 | 01-26-2017 06:32 PM |
12-24-2016
07:36 PM
2 Kudos
Are you running the Spark job via YARN? If so, go to the Resource Manager (RM) UI, which runs on your RM machine on port 8088. From there, follow the Applications link, which lists all running applications, and navigate to the page for your application. There you will find an Application Master link that connects you to the running application master. If the job has finished, the link will instead read History and will take you to the Spark History Server, which shows the same UI for the completed app. In an HDP cluster, the Spark History Server is always running if the Spark service was installed via Ambari. Note that once a Spark job is running, you cannot manually change its number of executors or its memory.
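As a quick sketch from the command line (the RM host and application ID below are placeholders), the YARN CLI exposes the same information:

```bash
# List running YARN applications to find your Spark job's application ID
yarn application -list -appStates RUNNING

# Show the status of one application, including its tracking URL
# (the same Application Master / History link shown in the RM UI)
yarn application -status application_1482600000000_0001

# The RM UI itself is served at http://<rm-host>:8088/cluster/apps
```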
12-24-2016
07:24 PM
Nice! BTW, HDP 2.5 has Livy built in. It can be found under the Spark service in Ambari.
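For reference, a minimal sketch of talking to Livy over its REST API (assuming Livy's default port 8998; the host name and code snippet are placeholders):

```bash
# Create an interactive Spark session
curl -X POST -H "Content-Type: application/json" \
  -d '{"kind": "spark"}' \
  http://livy-host:8998/sessions

# Run a statement in session 0 (use the id returned by the call above)
curl -X POST -H "Content-Type: application/json" \
  -d '{"code": "sc.parallelize(1 to 10).sum()"}' \
  http://livy-host:8998/sessions/0/statements
```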
12-22-2016
10:16 PM
Without the full exception stack trace it's difficult to know what happened. If you are instantiating Hive (e.g. a HiveContext), then you may need to add hive-site.xml and the datanucleus jars to the job, e.g.:

--jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar --files /usr/hdp/current/spark-client/conf/hive-site.xml
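Put together, a full submit command might look like the following sketch (the application class and jar are placeholders; the datanucleus and hive-site.xml paths are the HDP defaults from above):

```bash
# Class and jar names below are placeholders
spark-submit \
  --master yarn \
  --class com.example.MyHiveApp \
  --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
  --files /usr/hdp/current/spark-client/conf/hive-site.xml \
  my-hive-app.jar
```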
12-19-2016
01:58 AM
I am not sure whether the HBase filters that SHC provides would help here, or whether this points to more feature work needed in SHC. Could you please elaborate with some code samples?
12-15-2016
08:02 PM
As mentioned in the answer, the command line to add the package to your job is:

$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-xml_2.10:0.4.1

Of course, to write your project code you will also need to add this package as a Maven dependency in your project's pom (see the sketch below). If you build an uber jar for your project that includes this package, then you don't need to change your command line for submission. There are many packages for Spark that you can browse at spark-packages.org.
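A minimal sketch of the corresponding pom entry, using the exact coordinates from the command line above:

```xml
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.4.1</version>
</dependency>
```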
12-15-2016
02:54 AM
We do not recommend writing data into HBase with SHC using SHC's default internal custom format. That format is not well defined, is mainly used for testing, and can change without remaining compatible. For storing data in HBase via SHC, please use a standard, robust format like Avro. Currently SHC supports Avro, and we plan to support others such as Phoenix types.
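For contrast, here is a rough Scala sketch of an Avro-backed write via SHC, following the conventions in the SHC README; the catalog layout, option keys, and the table/column names below are illustrative and may vary between SHC versions:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Catalog mapping the DataFrame to an HBase table; the "avro" attribute
// marks col1 as Avro-serialized (all names here are hypothetical)
val avroCatalog = s"""{
  "table": {"namespace": "default", "name": "avrotable"},
  "rowkey": "key",
  "columns": {
    "col0": {"cf": "rowkey", "col": "key", "type": "string"},
    "col1": {"cf": "cf1", "col": "col1", "avro": "avroSchema"}
  }
}"""

// Avro schema for the serialized column
val avroSchema = """{"type": "record", "name": "User",
  "fields": [{"name": "name", "type": "string"}]}"""

// df: an existing DataFrame whose schema matches the catalog above
def writeAvro(df: DataFrame): Unit = {
  df.write
    .options(Map(
      "avroSchema" -> avroSchema,
      HBaseTableCatalog.tableCatalog -> avroCatalog,
      HBaseTableCatalog.newTable -> "5"))
    .format("org.apache.spark.sql.execution.datasources.hbase")
    .save()
}
```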
12-13-2016
07:24 PM
1 Kudo
If the AM timed out, then in the AM log you will find "Session timed out". If the AM crashed, you will find an exception in the AM log or some error in the AM stderr/stdout.
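To inspect the AM log from the command line (the application ID below is a placeholder), the YARN CLI can fetch the aggregated logs:

```bash
# Prints the aggregated container logs, including the AM's stderr/stdout,
# for a finished application
yarn logs -applicationId application_1482600000000_0001
```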
12-12-2016
10:43 PM
Sorry about the bad builds. We are working through the automation process that builds different versions of SHC. My comment was mainly about the configuration section in the SHC README for secure clusters. That part is independent of SHC; it is just instructions on how to set up Spark to access HBase tokens.
12-12-2016
10:14 PM
1 Kudo
Please see the steps outlined here for accessing HBase securely via Spark. No code change should be needed in your app for typical use cases.
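As a rough sketch of what a secure submission often looks like on a kerberized HDP cluster (the principal, keytab path, class, and jar below are placeholders, and the exact steps should come from the guide referenced above):

```bash
# Placeholders: principal, keytab path, class, and jar name
spark-submit \
  --master yarn \
  --principal spark-user@EXAMPLE.COM \
  --keytab /etc/security/keytabs/spark-user.keytab \
  --files /etc/hbase/conf/hbase-site.xml \
  --class com.example.MyHBaseApp \
  my-hbase-app.jar
```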