Member since: 03-01-2022
Posts: 15
Kudos Received: 0
Solutions: 0
09-10-2022
10:03 AM
Hi everyone, I want to build an ETL from an RDBMS to Apache Hive, and I am using the approach below: Source Data --> Hive Staging Table --> Hive Table. First I load data into the Hive staging table incrementally using a date column (the max date value is stored in a metadata table in Hive), and then I use a MERGE statement between my target table and the staging table. Is there any other recommended approach?
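For reference, a rough sketch of the merge step described above as a Hive MERGE statement; the database, table, and column names (target_db.users, staging_db.users_stg, id, name, updated_at) are placeholders, and the target is assumed to be a transactional (ACID) table:

MERGE INTO target_db.users AS t
USING staging_db.users_stg AS s
ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET name = s.name, updated_at = s.updated_at
WHEN NOT MATCHED THEN
  INSERT VALUES (s.id, s.name, s.updated_at);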
Labels:
- Apache Hive
08-10-2022
06:59 AM
Hi everyone, I want to ask whether it is possible to access HWC (the Hive Warehouse Connector) with dbt using Spark.
Labels:
- Apache Hive
- Apache Spark
08-02-2022
07:24 AM
Hi @jagadeesan, I am trying to connect to Hive with Spark 3 via the Hive JDBC driver (HiveJDBC42), and I am getting the below error:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Spark - Hive")
  .config("spark.sql.warehouse.dir", "/warehouse/tablespace/managed/hive")
  .enableHiveSupport()
  .getOrCreate()

val table_users = spark.read.format("jdbc").
  option("url", "hive").
  option("url", "jdbc:hive2://127.0.0.1:2181:2181;password=****;principal=hive/_HOST@Example.com;serviceDiscoveryMode=zooKeeper;ssl=1;user=user1;zooKeeperNamespace=hiveserver2").
  option("driver", "com.cloudera.hive.jdbc.HS2Driver").
  option("query", "select * from test_db.users LIMIT 1").
  option("fetchsize", "20").
  load()

java.sql.SQLException: [Cloudera][JDBC](11380) Null pointer exception.
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.setZookeeperServiceDiscovery(Unknown Source)
at com.cloudera.hiveserver2.hive.core.HiveJDBCConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.readServiceDiscoverySettings(Unknown Source)
at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
at com.cloudera.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.connection.BasicConnectionProvider.getConnection(BasicConnectionProvider.scala:49)
at org.apache.spark.sql.execution.datasources.jdbc.connection.ConnectionProvider$.create(ConnectionProvider.scala:77)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$createConnectionFactory$1(JdbcUtils.scala:64)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.getQueryOutputSchema(JDBCRDD.scala:62)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:57)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:239)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:36)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:174)
... 54 elided
Caused by: java.lang.NullPointerException
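For comparison, a minimal sketch of the same read pointed at a single HiveServer2 host instead of ZooKeeper service discovery, which may help isolate whether the failure comes from the service-discovery path; the host, port, and credentials are placeholders:

val table_users = spark.read.format("jdbc").
  option("url", "jdbc:hive2://hs2-host.example.com:10000;ssl=1;principal=hive/_HOST@Example.com;user=user1;password=****").
  option("driver", "com.cloudera.hive.jdbc.HS2Driver").
  option("query", "select * from test_db.users LIMIT 1").
  option("fetchsize", "20").
  load()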
07-30-2022
10:36 AM
Thank you @jagadeesan. But is it possible to connect to Hive via JDBC from Spark 3.x?
07-30-2022
03:09 AM
Thank you @jagadeesan for your reply. As far as I know, HWC does not support INSERT/UPDATE on Hive ACID tables; correct me if I'm wrong. Also, is there currently any way to connect to Hive ACID tables from Spark 3 other than HWC? Thank you!
07-26-2022
09:32 AM
Hi guys, I have a data lake (Hive managed tables) and I would like to load it incrementally into the warehouse (also Hive managed tables) using Spark 3.2, but I ran into an issue connecting to Hive managed tables from Spark 3. I would like to know how to connect to Hive ACID tables. Can it be done over JDBC, and if so, how? Or are there other ways? Thank you!
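For reference, a minimal sketch of the HWC route for reading a Hive managed (ACID) table from Spark; it assumes the HWC assembly jar and the usual settings (for example spark.sql.hive.hiveserver2.jdbc.url and spark.datasource.hive.warehouse.metastoreUri) are configured, and test_db.users is a placeholder table:

import com.hortonworks.hwc.HiveWarehouseSession

// Build an HWC session on top of the existing SparkSession.
val hive = HiveWarehouseSession.session(spark).build()

// Read from a Hive managed (ACID) table through the connector.
val users = hive.executeQuery("SELECT * FROM test_db.users LIMIT 10")
users.show()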
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark
07-17-2022
11:11 PM
Hi everyone, I am using the "ListenSyslog" processor and I faced the below error: ListenSyslog Attempted to set Socket Buffer Size to 314572800 bytes but could only set to 212992 bytes. You may want to consider changing the Operating System's maximum receive buffer
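For reference, the warning points at the operating system's maximum receive buffer rather than at NiFi itself; a minimal sketch of raising it on Linux (the value and the use of /etc/sysctl.conf are assumptions, adjust for your environment):

# Raise the kernel's maximum socket receive buffer to match the processor setting.
sudo sysctl -w net.core.rmem_max=314572800
# Persist the setting across reboots.
echo "net.core.rmem_max=314572800" | sudo tee -a /etc/sysctl.conf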
Labels:
- Apache NiFi
03-07-2022
03:52 AM
{
"_id": "620e6275034f4fe64f1ce2ef",
"patientorderitems": [
{
"_id": "620e6275034f4fe64f1ce2f0",
"patientorderlogs": [
{
"_id": "620e6275034f4fe64f1ce2f1",
"useruid": "6031edd256afd66888232d6e",
"departmentuid": "602f6a3494ce862c04aa49d2"
},
{
"_id": "621efc35da15edd34560da80",
"useruid": "6032021359f2cf686ae807ba",
"departmentuid": "602f6a3494ce862c04aa49d5"
},
{
"_id": "6220a702061f33f4abe8a2a6",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a70f65ca50f522598a85",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
},
{
"_id": "6220a717145139f53cfb6143",
"useruid": "604f3cb743027274a8c565de",
"departmentuid": "604c5393864aa9012d79e986"
}
]
}
]
}

Dear araujo, can I have a Jolt specification for this JSON in the same format as the last JSON?
03-03-2022
12:19 AM
Which version of NiFi do you have?: NiFi version 1.15.2.
Have you tried the change that I suggested?: There is no failure FlowFile, so I don't think that is what is causing this issue. (Sorry, I didn't mention that the NiFi cluster has two nodes, and the secondary node gets disconnected at some point during the flow run, not always; the issue causes the sync between the two nodes to be lost. Could this be caused by thread consumption?)
Can you share the attributes of one of your flow files?: