Member since: 09-25-2015
Posts: 230
Kudos Received: 276
Solutions: 39
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 24896 | 07-05-2016 01:19 PM
 | 8291 | 04-01-2016 02:16 PM
 | 2075 | 02-17-2016 11:54 AM
 | 5575 | 02-17-2016 11:50 AM
 | 12533 | 02-16-2016 02:08 AM
11-23-2015
04:36 PM
1 Kudo
I also found this JIRA: https://issues.apache.org/jira/browse/AMBARI-13946
11-23-2015
11:26 AM
1 Kudo
Thank you @Kuldeep Kulkarni, I have the same issue with a prospect. The same happens with hdfs mover.
11-23-2015
10:19 AM
1 Kudo
@Ali Bajwa same here with Sandbox 2.3.2 when I try to run the Spark SQL Thrift server. @vshukla can you help? I think we also need to update our blog with instructions to solve this issue.
11-19-2015
02:43 PM
1 Kudo
@Neeraj Sabharwal and @Andrew Watson if you are using HDFS snapshots, it is important to back up the Hive Metastore database as well: an HDFS snapshot alone won't be enough to restore tables/partitions information if a user accidentally executes DROP TABLE or DROP DATABASE.
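For example, a minimal sketch (assuming a MySQL-backed metastore and that the warehouse directory is snapshottable; paths and names are illustrative):

    # allow snapshots on the warehouse directory (run once, as HDFS admin)
    hdfs dfsadmin -allowSnapshot /apps/hive/warehouse
    # take a point-in-time snapshot of the data
    hdfs dfs -createSnapshot /apps/hive/warehouse backup-20151119
    # back up the Hive Metastore database at the same point in time
    mysqldump -u hive -p hive > hive_metastore-20151119.sql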
11-19-2015
12:25 AM
@Andrew Watson my understanding is that Hortonworks is going to support 1.5.1 in December, not 1.5.2; that would be the reason to use 1.5.1 instead of 1.5.2.
11-17-2015
05:36 PM
1 Kudo
Which components and folders should be backed up (metadata only, not data), and what would the commands be? A tentative sketch of commands follows this list.
- Namenode
- Ambari database
- Hive Metastore database
- Oozie database
- Hue database
- Ranger database
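Here is what I believe the commands could look like (assuming the default PostgreSQL database for Ambari, MySQL for Hive/Oozie/Ranger, and SQLite for Hue; users, database names, and paths are illustrative, please correct me):

    # Namenode metadata: fetch the latest fsimage to a local backup directory
    hdfs dfsadmin -fetchImage /backup/namenode
    # Ambari database (PostgreSQL by default)
    pg_dump -U ambari ambari > /backup/ambari.sql
    # Hive Metastore, Oozie, and Ranger databases (assuming MySQL)
    mysqldump -u hive -p hive > /backup/hive_metastore.sql
    mysqldump -u oozie -p oozie > /backup/oozie.sql
    mysqldump -u ranger -p ranger > /backup/ranger.sql
    # Hue database (SQLite by default)
    cp /var/lib/hue/desktop.db /backup/hue_desktop.db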
11-16-2015
06:46 PM
@Neeraj it should be the same: add the updated HDP repo, run yum install, and copy hive-site.xml.
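Roughly like this (a sketch based on the tech-preview steps; the repo URL and package name are illustrative, check the blog post for the exact values):

    # add the updated HDP repo (URL illustrative)
    wget -nv http://REPO_URL/hdp-spark.repo -O /etc/yum.repos.d/hdp-spark.repo
    # install the Spark tech-preview package (name illustrative)
    yum install spark
    # reuse the existing Hive configuration so Spark can reach the metastore
    cp /usr/hdp/current/hive-client/conf/hive-site.xml /usr/hdp/current/spark-client/conf/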
11-16-2015
05:35 PM
1 Kudo
I used the steps from our blog and it worked: https://hortonworks.com/hadoop-tutorial/apache-spark-1-5-1-technical-preview-with-hdp-2-3/
11-16-2015
05:23 PM
7 Kudos
@Vedant Jain the example below works with Sandbox 2.3.2. Note that I haven't changed the classpath, I only used the --jars option. From the shell:

    spark-shell --master yarn-client --jars /usr/hdp/current/phoenix-client/phoenix-client.jar
Inside spark-shell:

    // option 1: read a table through the JDBC data source
    val jdbcDF = sqlContext.read.format("jdbc").options(
      Map(
        "driver" -> "org.apache.phoenix.jdbc.PhoenixDriver",
        "url" -> "jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure",
        "dbtable" -> "TABLE1")).load()
    jdbcDF.show
    // option 2: run a custom query through JdbcRDD
    import java.sql.{Connection, DriverManager, DatabaseMetaData, ResultSet}
    import org.apache.spark.rdd.JdbcRDD

    // helper that opens a JDBC connection to Phoenix
    def getConn(driverClass: => String, connStr: => String, user: => String, pass: => String): Connection = {
      var conn: Connection = null
      try {
        Class.forName(driverClass)
        conn = DriverManager.getConnection(connStr, user, pass)
      } catch { case e: Exception => e.printStackTrace }
      conn
    }

    // aggregate query, partitioned on the id range 1..10 into 2 partitions
    val myRDD = new JdbcRDD(sc, () => getConn("org.apache.phoenix.jdbc.PhoenixDriver", "jdbc:phoenix:localhost:2181:/hbase-unsecure", "", ""),
      "select sum(10) from TABLE1 where ? <= id and id <= ?",
      1, 10, 2)
    myRDD.take(10)

    // same pattern with a column projection
    val myRDD2 = new JdbcRDD(sc, () => getConn("org.apache.phoenix.jdbc.PhoenixDriver", "jdbc:phoenix:localhost:2181:/hbase-unsecure", "", ""),
      "select col1 from TABLE1 where ? <= id and id <= ?",
      1, 10, 2)
    myRDD2.take(10)
Also note that the Phoenix team recommends using phoenix-spark instead of JDBC directly: http://phoenix.apache.org/phoenix_spark.html Here is an example with the phoenix-spark package. From the shell:

    spark-shell --master yarn-client --jars /usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark-4.4.0.2.3.2.0-2950.jar --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar"

Inside spark-shell:

    import org.apache.phoenix.spark._
    // load TABLE1 as a DataFrame through the phoenix-spark data source
    val df = sqlContext.load(
      "org.apache.phoenix.spark",
      Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181:/hbase-unsecure")
    )
    df.show
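The same data source can also save a DataFrame back to Phoenix. A minimal sketch (OUTPUT_TABLE is illustrative and must already exist in Phoenix with matching columns):

    // write back through the phoenix-spark data source (sketch)
    import org.apache.spark.sql.SaveMode
    df.save("org.apache.phoenix.spark", SaveMode.Overwrite,
      Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "localhost:2181:/hbase-unsecure"))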
And here is a sample project that can be built and executed through spark-submit: https://github.com/gbraccialli/SparkUtils

    git clone https://github.com/gbraccialli/SparkUtils
    cd SparkUtils/
    mvn clean package
    spark-submit --class com.github.gbraccialli.spark.PhoenixSparkSample target/SparkUtils-1.0.0-SNAPSHOT.jar

Also check @Randy Gelhausen's project, which uses phoenix-spark to automatically load data from Hive to Phoenix: https://github.com/randerzander/HiveToPhoenix (I copied my pom.xml from Randy's project).
11-16-2015
12:39 PM
2 Kudos
@Laurence Da Luz check these links:
- https://spark.apache.org/docs/latest/tuning.html
- http://www.slideshare.net/SparkSummit/deep-dive-into-project-tungsten-josh-rosen
- http://www.slideshare.net/cfregly/advanced-apache-spark-meetup-project-tungsten-nov-12-2015
- http://www.slideshare.net/SparkSummit/building-debugging-and-tuning-spark-machine-leaning-pipelinesjoseph-bradley
- http://www.slideshare.net/SparkSummit/04-huang-duan-1
- http://www.slideshare.net/SparkSummit/making-sense-of-spark-performancekay-ousterhout
- http://www.slideshare.net/SparkSummit/data-storage-tips-for-optimal-spark-performancevida-ha-databricks