Member since: 03-01-2022
Posts: 15
Kudos Received: 0
Solutions: 0
08-02-2022
07:24 AM
Hi @jagadeesan, I am trying to connect to Hive from Spark 3 via the JDBC Hive driver (HiveJDBC42), and I am getting the error below:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Spark - Hive")
  .config("spark.sql.warehouse.dir", "/warehouse/tablespace/managed/hive")
  .enableHiveSupport()
  .getOrCreate()

val table_users = spark.read.format("jdbc")
  .option("url", "hive")
  .option("url", "jdbc:hive2://127.0.0.1:2181:2181;password=****;principal=hive/_HOST@Example.com;serviceDiscoveryMode=zooKeeper;ssl=1;user=user1;zooKeeperNamespace=hiveserver2")
  .option("driver", "com.cloudera.hive.jdbc.HS2Driver")
  .option("query", "select * from test_db.users LIMIT 1")
  .option("fetchsize", "20")
  .load()

java.sql.SQLException: [Cloudera][JDBC](11380) Null pointer exception.
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.setZookeeperServiceDiscovery(Unknown Source)
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
at com.cloudera.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProvider$.create(ConnectionProvider.scala:77)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:62)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:57)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:239)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
... 54 elided
Caused by: java.lang.NullPointerException
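The NPE is thrown from setZookeeperServiceDiscovery while the driver is parsing the connection settings, so the URL itself is the first thing to check. Two details in the options above look suspect: "url" is set twice (the first value, "hive", is not a valid JDBC URL), and the host part "127.0.0.1:2181:2181" repeats the port, whereas in serviceDiscoveryMode=zooKeeper the driver expects a comma-separated host:port quorum. An untested sketch with hypothetical ZooKeeper hostnames (zk1/zk2/zk3 are placeholders, not from the original post):

```scala
// Untested sketch: a single "url" option whose host section is a
// comma-separated ZooKeeper quorum of host:port pairs, as the Cloudera
// driver expects when serviceDiscoveryMode=zooKeeper.
val url = "jdbc:hive2://zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181;" +
  "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;" +
  "ssl=1;principal=hive/_HOST@Example.com"

val table_users = spark.read.format("jdbc")
  .option("url", url)                       // set exactly once
  .option("driver", "com.cloudera.hive.jdbc.HS2Driver")
  .option("user", "user1")
  .option("password", "****")
  .option("query", "select * from test_db.users LIMIT 1")
  .option("fetchsize", "20")
  .load()
```

This is a connection-configuration sketch only; it still requires a reachable, Kerberized HiveServer2 behind the quorum to actually run.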
07-30-2022
10:36 AM
Thank you @jagadeesan. But is it possible to connect to Hive via JDBC from Spark 3.x?
07-30-2022
03:09 AM
Thank you @jagadeesan for your reply. As far as I know, HWC does not support INSERT/UPDATE on Hive ACID tables; correct me if I'm wrong. Also, is there currently any way to connect to Hive ACID tables from Spark 3 other than HWC? Thank you!
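For reference, the Hive Warehouse Connector is the supported bridge between Spark and Hive managed (ACID) tables, and its session API can also push DML statements down to Hive. A minimal, untested sketch, assuming the HWC jar is on the classpath and the HWC configs (e.g. spark.sql.hive.hiveserver2.jdbc.url) point at your HiveServer2; test_db.users and test_db.users_staging are hypothetical tables:

```scala
// Untested sketch of the HWC session API against a configured cluster.
import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()

// Reads go through HWC, which can see ACID (managed) tables:
val users = hive.executeQuery("SELECT * FROM test_db.users LIMIT 1")

// DML can be pushed down to Hive as a statement:
hive.executeUpdate("INSERT INTO test_db.users SELECT * FROM test_db.users_staging")
```

Whether this covers a given INSERT/UPDATE workload depends on the CDP/HWC version in use, which the thread does not specify.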
07-26-2022
09:32 AM
Hi guys, I have a data lake (Hive managed tables) and I would like to load it incrementally into the warehouse (also Hive managed tables) using Spark 3.2, but I ran into an issue connecting to Hive managed tables from Spark 3. How can I connect to Hive ACID tables? Via JDBC, and if so, how? Or are there other ways? Thank you!
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark
03-07-2022
03:52 AM
{
"_id": "620e6275034f4fe64f1ce2ef",
"patientorderitems": [
{
"_id": "620e6275034f4fe64f1ce2f0",
"patientorderlogs": [
{
"_id": "620e6275034f4fe64f1ce2f1",
"useruid": "6031edd256afd66888232d6e",
"departmentuid": "602f6a3494ce862c04aa49d2"
},
{
"_id": "621efc35da15edd34560da80",
"useruid": "6032021359f2cf686ae807ba",
"departmentuid": "602f6a3494ce862c04aa49d5"
},
{
"_id": "6220a702061f33f4abe8a2a6",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a70f65ca50f522598a85",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a717145139f53cfb6143",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
}
]
}
]
}

Dear @araujo, can I have a Jolt specification for this JSON in the same format as the last JSON?
03-02-2022
12:03 AM
I want to transform this JSON:

{
  "_id": "6218e53465793fa20ea11524",
  "patientorderitems": [
    {
      "poi_id": "6218e53465793fa20ea1152a",
      "patientorderlogs": [
        {
          "pol_id": "6218e53465793fa20ea1152e",
          "useruid": "61ee4995f16eebb6b7e1c644",
          "modifiedat": "2022-02-25T17:18:28Z"
        }
      ]
    },
    {
      "poi_id": "6218e53465793fa20ea11525",
      "patientorderlogs": [
        {
          "pol_id": "6218e53465793fa20ea11529",
          "useruid": "61ee4995f16eebb6b7e1c644",
          "modifiedat": "2022-02-25T17:18:28Z"
        }
      ]
    }
  ]
}

into this JSON:

[
  {
    "_id": "6218e53465793fa20ea11524",
    "poi_id": "6218e53465793fa20ea1152a",
    "pol_id": "6218e53465793fa20ea1152e",
    "useruid": "61ee4995f16eebb6b7e1c644",
    "modifiedat": "2022-02-25T17:18:28Z"
  },
  {
    "_id": "6218e53465793fa20ea11524",
    "poi_id": "6218e53465793fa20ea11525",
    "pol_id": "6218e53465793fa20ea11529",
    "useruid": "61ee4995f16eebb6b7e1c644",
    "modifiedat": "2022-02-25T17:18:28Z"
  }
]

Is there any Jolt spec or script that can do this?
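Independent of NiFi/Jolt, the intended transform is a flatten: one output record per patientorderlogs entry, with the root _id and the enclosing item's poi_id copied down. A small plain-Scala sketch of that logic (the Maps stand in for parsed JSON; a Jolt shift spec would have to express the same carry-down):

```scala
// Plain-Scala sketch of the flattening: nested Maps/Lists stand in for the
// parsed JSON document from the question.
val doc: Map[String, Any] = Map(
  "_id" -> "6218e53465793fa20ea11524",
  "patientorderitems" -> List(
    Map(
      "poi_id" -> "6218e53465793fa20ea1152a",
      "patientorderlogs" -> List(
        Map("pol_id" -> "6218e53465793fa20ea1152e",
            "useruid" -> "61ee4995f16eebb6b7e1c644",
            "modifiedat" -> "2022-02-25T17:18:28Z"))),
    Map(
      "poi_id" -> "6218e53465793fa20ea11525",
      "patientorderlogs" -> List(
        Map("pol_id" -> "6218e53465793fa20ea11529",
            "useruid" -> "61ee4995f16eebb6b7e1c644",
            "modifiedat" -> "2022-02-25T17:18:28Z")))))

// One flat record per log entry, carrying down _id and poi_id.
val flat: List[Map[String, Any]] = for {
  item <- doc("patientorderitems").asInstanceOf[List[Map[String, Any]]]
  log  <- item("patientorderlogs").asInstanceOf[List[Map[String, Any]]]
} yield log + ("_id" -> doc("_id")) + ("poi_id" -> item("poi_id"))
```

Here `flat` holds two records, each with _id, poi_id, pol_id, useruid, and modifiedat, matching the desired output shape above.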
Labels:
- Apache NiFi