Member since: 02-01-2019
Posts: 650
Kudos Received: 143
Solutions: 117

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3508 | 04-01-2019 09:53 AM |
| | 1814 | 04-01-2019 09:34 AM |
| | 8925 | 01-28-2019 03:50 PM |
| | 1972 | 11-08-2018 09:26 AM |
| | 4486 | 11-08-2018 08:55 AM |
03-17-2018
07:29 AM
@Ranjan Raut Glad that it helped you. Would you mind accepting this answer so that this thread is marked as answered?
03-17-2018
05:58 AM
2 Kudos
Following are the steps to connect to Phoenix tables using Spark2.

1) Create a symlink of hbase-site.xml in the spark2 conf directory:

ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:

spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"

3) Create a Phoenix connection and query the tables:

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3
scala> val df = sqlContext.load("org.apache.phoenix.spark",Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]
scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
Note: Spark2 and Phoenix integration was introduced in HDP 2.6.2.
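As a side note, sqlContext.load is deprecated in Spark 2.x. Below is a minimal sketch of the same read done through the DataFrameReader API, assuming the same TABLE1 table and ZooKeeper quorum as above:

// Non-deprecated equivalent of sqlContext.load for the phoenix-spark data source
val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "TABLE1")           // Phoenix table to read
  .option("zkUrl", "localhost:2181")   // ZooKeeper quorum of the HBase cluster
  .load()

// Filters and column projections are pushed down to Phoenix where possible
df.filter(df("COL1") === "test_row_1").select("ID", "COL2").show()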
03-17-2018
05:54 AM
1 Kudo
@Ranjan Raut Below are the steps to connect Spark 2.2 with Phoenix in HDP 2.6.3.

1) Create a symlink of hbase-site.xml in the spark2 conf directory:

ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:

spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"

3) Create a Phoenix connection and query the tables:

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3
scala> val df = sqlContext.load("org.apache.phoenix.spark",Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]
scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
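For completeness, the same phoenix-spark data source can also write a DataFrame back to Phoenix. Here is a rough sketch, assuming a Phoenix table named OUTPUT_TABLE (a hypothetical name) has already been created with columns matching the DataFrame schema:

// Upsert the DataFrame rows into an existing Phoenix table.
// OUTPUT_TABLE is a placeholder; it must already exist in Phoenix
// with ID, COL1 and COL2 columns matching the DataFrame schema.
df.write
  .format("org.apache.phoenix.spark")
  .mode("overwrite")                   // phoenix-spark expects SaveMode.Overwrite; rows are upserted
  .option("table", "OUTPUT_TABLE")
  .option("zkUrl", "localhost:2181")
  .save()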
03-16-2018
08:37 AM
1 Kudo
@Shota Akhalaia, You can re-install/recover Ambari for an existing cluster only if you have a backup of the Ambari database. Otherwise there is no easy way to point a fresh Ambari instance at the existing HDP cluster; the only option is to install a fresh Ambari along with a fresh HDP.
03-16-2018
05:18 AM
@priyal patel, Atlas currently doesn't provide lineage for Pig scripts. The supported components are listed here: https://hortonworks.com/apache/atlas/#section_1
03-16-2018
05:03 AM
@Balachandra Pai You may want to look into the YARN application logs (for example, via yarn logs -applicationId <application_id>) to find the actual cause.
03-16-2018
05:01 AM
@Royce Whetstine Spark currently doesn't fully support Hive's transactional tables. Here are the reference JIRAs: SPARK-16996 and SPARK-15348.
03-16-2018
03:45 AM
@Yung Song ssh localhost -p 2222 should take you to the Docker container where HDP is installed, and from there you can use the regular paths like /var/log/ambari-server/ambari-server.log. Not sure if you have already tried this.
03-03-2018
09:31 PM
'*' was used just to list the files in HDFS; you can use filename.00`date +%Y%m%d` to specify the date format and then move them accordingly.
03-03-2018
03:24 AM
@Santhosh Reddy You can use the date command as below to select the files.

[spark@node-1 ~]$ touch filename.0020180303
[spark@node-1 ~]$ ll filename.00`date +%Y%m%d`
-rw-r--r--. 1 spark hadoop 0 Mar 3 03:20 filename.0020180303
[spark@node-1 ~]$ hadoop fs -put filename.00`date +%Y%m%d` /tmp/
[spark@node-1 ~]$ hadoop fs -ls /tmp/filename*
-rw-r--r-- 3 spark hdfs 0 2018-03-03 03:21 /tmp/filename.0020180303