Member since: 03-01-2022
Posts: 15
Kudos Received: 0
Solutions: 0
09-10-2022
10:03 AM
Hi everyone, I want to build an ETL from an RDBMS to Apache Hive, and I am using the approach below: Source Data --> Hive Staging Table --> Hive Table. First I load data into the Hive staging table incrementally using a date column (the max date value is stored in a metadata table in Hive), and then I use a MERGE statement between my target table and the staging table. Is there any other recommended approach?
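For reference, a rough sketch of the merge step described above as a Hive MERGE statement; the database, table, and column names (target_db.users, staging_db.users_stg, id, name, updated_at) are placeholders, and the target is assumed to be a transactional (ACID) table:

MERGE INTO target_db.users AS t
USING staging_db.users_stg AS s
ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET name = s.name, updated_at = s.updated_at
WHEN NOT MATCHED THEN
  INSERT VALUES (s.id, s.name, s.updated_at);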
Labels:
- Apache Hive
08-10-2022
06:59 AM
Hi everyone, I want to ask whether it is possible to access HWC (the Hive Warehouse Connector) with dbt using Spark.
Labels:
- Apache Hive
- Apache Spark
08-02-2022
07:24 AM
Hi @jagadeesan, I am trying to connect to Hive with Spark 3 via the Hive JDBC driver (HiveJDBC42), and I am getting the below error:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Spark - Hive")
  .config("spark.sql.warehouse.dir", "/warehouse/tablespace/managed/hive")
  .enableHiveSupport()
  .getOrCreate()

val table_users = spark.read.format("jdbc").
  option("url", "hive").
  option("url", "jdbc:hive2://127.0.0.1:2181:2181;password=****;principal=hive/_HOST@Example.com;serviceDiscoveryMode=zooKeeper;ssl=1;user=user1;zooKeeperNamespace=hiveserver2").
  option("driver", "com.cloudera.hive.jdbc.HS2Driver").
  option("query", "select * from test_db.users LIMIT 1").
  option("fetchsize", "20").
  load()

java.sql.SQLException: [Cloudera][JDBC](11380) Null pointer exception.
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.setZookeeperServiceDiscovery(Unknown Source)
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
at com.cloudera.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProvider$.create(ConnectionProvider.scala:77)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:62)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:57)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:239)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
... 54 elided
Caused by: java.lang.NullPointerException
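For comparison, a minimal sketch of the same read pointed at a single HiveServer2 host instead of ZooKeeper service discovery, which may help isolate whether the failure comes from the service-discovery path; the host, port, and credentials are placeholders:

val table_users = spark.read.format("jdbc").
  option("url", "jdbc:hive2://hs2-host.example.com:10000;ssl=1;principal=hive/_HOST@Example.com;user=user1;password=****").
  option("driver", "com.cloudera.hive.jdbc.HS2Driver").
  option("query", "select * from test_db.users LIMIT 1").
  option("fetchsize", "20").
  load()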
07-30-2022
10:36 AM
Thank you @jagadeesan. But is it possible to connect to Hive via JDBC from Spark 3.x?
07-30-2022
03:09 AM
Thank you @jagadeesan for your reply. As far as I know, HWC does not support INSERT/UPDATE on Hive ACID tables; correct me if I'm wrong. Also, is there currently any way to connect to Hive ACID tables from Spark 3 other than HWC? Thank you!
07-26-2022
09:32 AM
Hi guys, I have a data lake (Hive managed tables) and I would like to load it incrementally into the warehouse (also Hive managed tables) using Spark 3.2, but I ran into an issue connecting to Hive managed tables from Spark 3. I would like to know how to connect to Hive ACID tables. Can it be done over JDBC, and if so, how? Or are there other ways? Thank you!
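For reference, a minimal sketch of the HWC route for reading a Hive managed (ACID) table from Spark; it assumes the HWC assembly jar and the usual settings (for example spark.sql.hive.hiveserver2.jdbc.url and spark.datasource.hive.warehouse.metastoreUri) are configured, and test_db.users is a placeholder table:

import com.hortonworks.hwc.HiveWarehouseSession

// Build an HWC session on top of the existing SparkSession.
val hive = HiveWarehouseSession.session(spark).build()

// Read from a Hive managed (ACID) table through the connector.
val users = hive.executeQuery("SELECT * FROM test_db.users LIMIT 10")
users.show()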
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark
07-17-2022
11:11 PM
Hi everyone, I am using the "ListenSyslog" processor and I faced the below error: ListenSyslog Attempted to set Socket Buffer Size to 314572800 bytes but could only set to 212992 bytes. You may want to consider changing the Operating System's maximum receive buffer
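For reference, the warning points at the operating system's maximum receive buffer rather than at NiFi itself; a minimal sketch of raising it on Linux (the value and the use of /etc/sysctl.conf are assumptions, adjust for your environment):

# Raise the kernel's maximum socket receive buffer to match the processor setting.
sudo sysctl -w net.core.rmem_max=314572800
# Persist the setting across reboots.
echo "net.core.rmem_max=314572800" | sudo tee -a /etc/sysctl.conf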
Labels:
- Apache NiFi
03-07-2022
03:52 AM
{
"_id": "620e6275034f4fe64f1ce2ef",
"patientorderitems": [
{
"_id": "620e6275034f4fe64f1ce2f0",
"patientorderlogs": [
{
"_id": "620e6275034f4fe64f1ce2f1",
"useruid": "6031edd256afd66888232d6e",
"departmentuid": "602f6a3494ce862c04aa49d2"
},
{
"_id": "621efc35da15edd34560da80",
"useruid": "6032021359f2cf686ae807ba",
"departmentuid": "602f6a3494ce862c04aa49d5"
},
{
"_id": "6220a702061f33f4abe8a2a6",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a70f65ca50f522598a85",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a717145139f53cfb6143",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
}
]
}
]
}

Dear araujo, can I have a Jolt specification for this JSON in the same format as the last JSON?
03-03-2022
12:19 AM
Which version of NiFi do you have?: NiFi version 1.15.2.
Have you tried the change that I suggested?: There is no failure FlowFile, so I don't think that is what is causing this issue. (Sorry, I didn't mention that the NiFi cluster has two nodes, and the secondary node gets disconnected at some point during the flow run, not always; the issue causes the sync between the two nodes to be lost. Could this be caused by thread consumption?)
Can you share the attributes of one of your flow files?: