Member since: 09-25-2015
Posts: 230
Kudos Received: 276
Solutions: 39
12-15-2015
02:56 AM
1 Kudo
Great article Wes!
12-11-2015
11:45 PM
Awesome!
11-24-2015
05:09 AM
1 Kudo
@Kuldeep Kulkarni an Ambari restart is not needed; I tested it and the updated configs appear immediately. You are right about the HDFS restart, which is also mentioned as step #2.
11-24-2015
12:49 AM
8 Kudos
Spark SQL comes with a nice feature called "JDBC to Other Databases", but in practice it is a JDBC federation feature: it can be used to create data frames from JDBC databases using Scala/Python, and it also works directly with the Spark SQL Thrift Server, allowing us to query external JDBC tables seamlessly, like other Hive/Spark tables. The example below uses sandbox 2.3.2 and the Spark 1.5.1 TP (https://hortonworks.com/hadoop-tutorial/apache-spark-1-5-1-technical-preview-with-hdp-2-3/). This feature works with spark-submit, spark-shell, Zeppelin, the spark-sql client, and the Spark SQL Thrift Server. This post shows two examples: #1 using the Spark SQL Thrift Server, #2 using spark-shell.

Example #1 using the Spark SQL Thrift Server

1- Run the Spark SQL Thrift Server with the mysql jdbc driver:

sudo -u spark /usr/hdp/2.3.2.1-12/spark/sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10010 --jars "/usr/share/java/mysql-connector-java.jar"

2- Open beeline and connect to the Spark SQL Thrift Server:

beeline -u "jdbc:hive2://localhost:10010/default" -n admin

3- Create a JDBC federated table pointing to the existing mysql database, using beeline:

CREATE TABLE mysql_federated_sample
USING org.apache.spark.sql.jdbc
OPTIONS (
  driver "com.mysql.jdbc.Driver",
  url "jdbc:mysql://localhost/hive?user=hive&password=hive",
  dbtable "TBLS"
);

4- Query it like any other table:

describe mysql_federated_sample;
select * from mysql_federated_sample;
select count(1) from mysql_federated_sample;
Example #2 using spark-shell, Scala code, and data frames

1- Open spark-shell with the mysql jdbc driver:

spark-shell --jars "/usr/share/java/mysql-connector-java.jar"

2- Create a data frame pointing to the mysql table:

val jdbcDF = sqlContext.read.format("jdbc").options(
  Map(
    "driver" -> "com.mysql.jdbc.Driver",
    "url" -> "jdbc:mysql://localhost/hive?user=hive&password=hive",
    "dbtable" -> "TBLS"
  )
).load()

3- Query it:

jdbcDF.show
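Once loaded, the data frame can also be registered as a temporary table so it can be queried with SQL alongside regular Hive/Spark tables. A minimal sketch using the Spark 1.5 API (the table name mysql_tbls is my own choice for this example):

// Sketch: expose the JDBC-backed data frame to Spark SQL (Spark 1.5 API).
// "mysql_tbls" is an arbitrary name chosen for this example.
jdbcDF.registerTempTable("mysql_tbls")

// It can now be queried with SQL, and even joined against Hive tables:
sqlContext.sql("select count(*) from mysql_tbls").show()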
See other spark jdbc examples / troubleshooting here: https://community.hortonworks.com/questions/1942/spark-to-phoenix.html
11-23-2015
07:30 PM
2 Kudos
@Kuldeep Kulkarni another workaround:

1- Execute the commands below:

/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AMBARI-PASSWORD delete localhost CLUSTER-NAME hdfs-site "dfs.namenode.rpc-address"
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AMBARI-PASSWORD delete localhost CLUSTER-NAME hdfs-site "dfs.namenode.http-address"
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p AMBARI-PASSWORD delete localhost CLUSTER-NAME hdfs-site "dfs.namenode.https-address"

2- Restart HDFS
11-23-2015
04:36 PM
1 Kudo
I also found this jira: https://issues.apache.org/jira/browse/AMBARI-13946
11-23-2015
11:26 AM
1 Kudo
Thank you @Kuldeep Kulkarni, I have the same issue with a prospect. The same happens with hdfs mover.
11-11-2015
02:40 PM
1 Kudo
4- Execute the SQL, using the sql interpreter:

%sql
select geohash_encode(1.11,1.11,3) from sample_07 limit 10

It fails with the sql interpreter + Zeppelin:

java.lang.ClassNotFoundException: com.github.gbraccialli.hive.udf.UDFGeohashEncode
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
...
11-11-2015
02:40 PM
Second, the same with Zeppelin:

1- Restart the interpreter

2- Load the dependencies:

%dep
z.reset()
z.load("com.github.gbraccialli:HiveUtils:1.0-SNAPSHOT")

3- Execute the SQL, using the same Scala code:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc);
sqlContext.sql("""create temporary function geohash_encode as 'com.github.gbraccialli.hive.udf.UDFGeohashEncode'""");
sqlContext.sql("""select geohash_encode(1.11,1.11,3) from sample_07 limit 10""").collect().foreach(println);

It worked with Scala code + Zeppelin!
11-11-2015
02:39 PM
2- Run spark-shell with the dependency:

spark-shell --master yarn-client --packages "com.github.gbraccialli:HiveUtils:1.0-SNAPSHOT"

3- Run the Spark code:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc);
sqlContext.sql("""create temporary function geohash_encode as 'com.github.gbraccialli.hive.udf.UDFGeohashEncode'""");
sqlContext.sql("""select geohash_encode(1.11,1.11,3) from sample_07 limit 10""").collect().foreach(println);

spark-shell worked fine!
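As a side note, when the only problem is making a UDF jar visible to the interpreter, registering a plain Scala function directly on the SQLContext avoids the jar distribution issue entirely. A hedged sketch on the Spark 1.5 API; the function below is a placeholder of my own, not the real geohash implementation:

// Sketch of an alternative: register a native Scala UDF on the SQLContext
// (Spark 1.5 API). prefix3 is a placeholder function, not a geohash encoder.
sqlContext.udf.register("prefix3", (s: String) => s.take(3))
sqlContext.sql("select prefix3(description) from sample_07 limit 10").show()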