Member since: 02-01-2019
Posts: 650
Kudos Received: 143
Solutions: 117

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3508 | 04-01-2019 09:53 AM |
| | 1814 | 04-01-2019 09:34 AM |
| | 8925 | 01-28-2019 03:50 PM |
| | 1972 | 11-08-2018 09:26 AM |
| | 4486 | 11-08-2018 08:55 AM |
03-17-2018
07:29 AM
@Ranjan Raut Glad that it helped you. Would you mind accepting this answer so that this thread is marked as answered?
03-17-2018
05:58 AM
2 Kudos
Following are the steps to connect to Phoenix tables using Spark2.

1) Create a symlink of hbase-site.xml in the spark2 conf directory:

ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:

spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"

3) Create a Phoenix connection and query the tables:

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3
scala> val df = sqlContext.load("org.apache.phoenix.spark",Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]
scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
Note: Spark2 and Phoenix integration was introduced in HDP 2.6.2.
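As a side note, sqlContext.load is deprecated in Spark 2.x. Below is a minimal sketch of the same read done through the DataFrameReader API, assuming the same TABLE1 table and ZooKeeper quorum as above:

// Non-deprecated equivalent of sqlContext.load for the phoenix-spark data source
val df = sqlContext.read
  .format("org.apache.phoenix.spark")
  .option("table", "TABLE1")           // Phoenix table to read
  .option("zkUrl", "localhost:2181")   // ZooKeeper quorum of the HBase cluster
  .load()

// Filters and column projections are pushed down to Phoenix where possible
df.filter(df("COL1") === "test_row_1").select("ID", "COL2").show()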
03-17-2018
05:54 AM
1 Kudo
@Ranjan Raut Below are the steps to connect Spark 2.2 with Phoenix in HDP 2.6.3.

1) Create a symlink of hbase-site.xml in the spark2 conf directory:

ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:

spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"

3) Create a Phoenix connection and query the tables:

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext
scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3
scala> val df = sqlContext.load("org.apache.phoenix.spark",Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]
scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
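For completeness, the same phoenix-spark data source can also write a DataFrame back to Phoenix. Here is a rough sketch, assuming a Phoenix table named OUTPUT_TABLE (a hypothetical name) has already been created with columns matching the DataFrame schema:

// Upsert the DataFrame rows into an existing Phoenix table.
// OUTPUT_TABLE is a placeholder; it must already exist in Phoenix
// with ID, COL1 and COL2 columns matching the DataFrame schema.
df.write
  .format("org.apache.phoenix.spark")
  .mode("overwrite")                   // phoenix-spark expects SaveMode.Overwrite; rows are upserted
  .option("table", "OUTPUT_TABLE")
  .option("zkUrl", "localhost:2181")
  .save()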
03-16-2018
08:37 AM
1 Kudo
@Shota Akhalaia, You can re-install/recover Ambari for an existing cluster only if you have a backup of the Ambari database. Otherwise there is no easy way to point a fresh Ambari instance at the existing HDP cluster; the only option is to install a fresh Ambari along with a fresh HDP.
03-16-2018
05:18 AM
@priyal patel, Atlas currently doesn't provide lineage for Pig scripts. The supported components are listed here: https://hortonworks.com/apache/atlas/#section_1
03-16-2018
05:03 AM
@Balachandra Pai You may want to look into the YARN application logs (for example, via yarn logs -applicationId <application_id>) to find the actual cause.
03-16-2018
05:01 AM
@Royce Whetstine Spark currently doesn't fully support Hive's transactional tables. Here are the reference JIRAs: SPARK-16996 and SPARK-15348.
03-16-2018
03:45 AM
@Yung Song ssh localhost -p 2222 should take you to the Docker container where HDP is installed, and from there you can use the regular paths like /var/log/ambari-server/ambari-server.log. Not sure if you have already tried this.
03-03-2018
09:31 PM
'*' was used just to list the files in HDFS; you can use filename.00`date +%Y%m%d` to specify the date format and then move them accordingly.
03-03-2018
03:24 AM
@Santhosh Reddy You can use the date command as below to select the files.

[spark@node-1 ~]$ touch filename.0020180303
[spark@node-1 ~]$ ll filename.00`date +%Y%m%d`
-rw-r--r--. 1 spark hadoop 0 Mar 3 03:20 filename.0020180303
[spark@node-1 ~]$ hadoop fs -put filename.00`date +%Y%m%d` /tmp/
[spark@node-1 ~]$ hadoop fs -ls /tmp/filename*
-rw-r--r-- 3 spark hdfs 0 2018-03-03 03:21 /tmp/filename.0020180303