Member since
06-29-2023
08-14-2023
06:40 AM
Hi,

I want to read only a finite sample from an Apache Phoenix table into a Spark DataFrame, without using primary keys in the WHERE condition. I tried a LIMIT clause in the query, as shown below:

    Map<String, String> map = new HashMap<>();
    map.put("url", url);
    map.put("driver", "org.apache.phoenix.jdbc.PhoenixDriver");
    map.put("query", "select * from table1 limit 100");
    Dataset<Row> df = spark.read().format("jdbc").options(map).load();

This throws the following exception:

    java.sql.SQLFeatureNotSupportedException: Wildcard in subqueries not supported.
        at org.apache.phoenix.compile.FromCompiler

I then tried the limit() method of the DataFrame instead:

    map.put("query", "select * from table1");
    Dataset<Row> df = spark.read().format("jdbc").options(map).load().limit(100);

In this case, Spark first reads all the data into the DataFrame and only then filters it. table1 has millions of rows, so I get a timeout exception:

    org.apache.phoenix.exception.PhoenixIOException: callTimeout

So I want to read a sample of a few records from a Phoenix table in Apache Spark such that the filtering happens on the server side. Can anyone please help with this?
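One thing that may be worth trying, given the exception above: Spark's JDBC reader wraps whatever is passed in the "query" option as a subquery, so the `select *` ends up inside a derived table, which is what Phoenix appears to reject. The sketch below builds the same options map but with an explicit column list instead of the wildcard, keeping the LIMIT in the SQL sent to Phoenix. The class name `PhoenixSampleOptions`, the method `build`, and the column names `ID` and `NAME` are hypothetical, and it is an untested assumption that Phoenix accepts the resulting subquery.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds Spark JDBC reader options for a limited Phoenix read. */
public class PhoenixSampleOptions {

    /**
     * Returns the "url", "driver" and "query" options, with the query naming
     * its columns explicitly (no wildcard) and carrying the LIMIT clause.
     */
    public static Map<String, String> build(String url, String table,
                                            List<String> columns, int limit) {
        if (columns == null || columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }

        // Explicit column list in place of "*"; the column names are the caller's.
        String query = "select " + String.join(", ", columns)
                + " from " + table
                + " limit " + limit;

        Map<String, String> options = new HashMap<>();
        options.put("url", url);
        options.put("driver", "org.apache.phoenix.jdbc.PhoenixDriver");
        options.put("query", query);
        return options;
    }
}
```

The map would then be passed to the reader exactly as before, for example `spark.read().format("jdbc").options(PhoenixSampleOptions.build(url, "table1", List.of("ID", "NAME"), 100)).load()`.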
Labels: Apache HBase, Apache Phoenix, Apache Spark
07-04-2023
04:10 AM
Thanks @smruti for the quick response. This is working.
07-03-2023
06:24 AM
Hi,

I am running a query on Hive from my Spark application using HiveWarehouseConnector. I want the Tez job launched by HiveWarehouseConnector to use a particular YARN queue (a custom queue configured at the application level). I have tried the following two approaches:

1. Setting spark.hive.tez.queue.name = <queue name> in the Spark conf.
2. Setting the tez.queue.name parameter in the HiveServer2 URL, as suggested in this thread: https://community.cloudera.com/t5/Support-Questions/Setting-yarn-queue-for-hive-with-beeline/td-p/161499

I am able to set the queue for beeline using the URL option. However, neither option works for HiveWarehouseConnector. Can anyone please help with this?
Labels: