Member since: 11-17-2021
Posts: 1149
Kudos Received: 258
Solutions: 30
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 182 | 04-23-2026 02:02 PM |
| | 557 | 03-17-2026 05:26 PM |
| | 5243 | 11-05-2025 10:13 AM |
| | 876 | 10-16-2025 02:45 PM |
| | 1462 | 10-06-2025 01:01 PM |
05-12-2023
11:58 AM
Thank you for replying. I have tried many different ports, and I am even trying an external ZooKeeper; none of my NiFi nodes can connect to any port I provide them. It's almost as if something is wrong with the code. I have installed ZooKeeper on one of my actual NiFi servers and it starts up immediately on port 2181, which is what leads me to think it's something in the code. The other odd thing I keep seeing is an org.apache.zookeeper.ClientCnxnSocketNetty error: "future isn't success".
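To separate a network/port problem from a NiFi-side problem, it can help to probe the ZooKeeper port directly from the NiFi host. A minimal sketch using ZooKeeper's `ruok` four-letter command (note: in ZooKeeper 3.5+ this command must be enabled via the `4lw.commands.whitelist` setting, so a missing reply does not always mean the server is down):

```python
import socket

def zk_ruok(host: str, port: int = 2181, timeout: float = 5.0) -> bool:
    """Send ZooKeeper's 'ruok' four-letter command over a raw TCP socket.

    A healthy server (with the command whitelisted) replies 'imok'.
    Raises OSError if the port is unreachable, which already tells you
    whether the problem is connectivity or something inside NiFi.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"ruok")
        return sock.recv(4).decode("ascii", errors="replace") == "imok"
```

If this raises a connection error from the NiFi host but works from the ZooKeeper host itself, the issue is network/firewall rather than the NiFi code.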
05-11-2023
12:43 AM
Hi @cotopaul, thanks for your reply. I am familiar with Jolt (a JSON-to-JSON transformation library). I have been thinking of adding the required padding using the functions in Jolt and then using the FreeFormTextRecordSetWriter controller service. This service takes the name of a key in the JSON and produces a file containing only the value; it also keeps the padding added in the previous Jolt transform. I think using 10 UpdateAttribute processors would be tough, and I have multiple fields that need the required padding/empty spaces. Thank you for your answers!
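Before encoding the padding in a Jolt spec, the intended transformation can be sketched in plain Python (the field names and widths below are hypothetical examples, not from the actual flow):

```python
def pad_fields(record: dict, widths: dict, fillchar: str = " ") -> dict:
    """Left-justify each named field to a fixed width by right-padding.

    `widths` maps field name -> target width. Values are stringified
    first, so numeric fields are handled too. Fields absent from the
    record are emitted as pure padding of the target width.
    """
    out = dict(record)
    for field, width in widths.items():
        out[field] = str(out.get(field, "")).ljust(width, fillchar)
    return out

# Hypothetical example: pad 'name' to 10 chars and 'code' to 5 chars.
padded = pad_fields({"name": "abc", "code": 7}, {"name": 10, "code": 5})
```

Whatever produces the padding (Jolt or a script), FreeFormTextRecordSetWriter then only needs to emit the padded values verbatim.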
05-04-2023
10:55 AM
Hi @SAMSAL, thanks for your response. You are right, changing the evaluation mode to "Entire Text" worked. I was in a hurry and didn't try changing it; I assumed that "Always Replace" would do the job. Thank you!
05-04-2023
12:33 AM
1 Kudo
@danielhg1285, while the solution provided by @SAMSAL is better suited for you and more production-ready, you could also try the steps below. They should work if you use a stable statement all the time and you do not need the exact INSERT statement, but only the values that failed to insert.

- Right after RetryFlowFile, add an AttributesToJSON processor and manually list all the columns you want to insert in the Attributes List property. Make sure you use the attribute names from your FlowFile (sql.args.N.value) in the correct order, and set Destination = flowfile-content. This generates a JSON file with all the columns and values that you tried, but failed, to insert.
- After AttributesToJSON, keep your PutFile to save the file locally on your machine, so you can open it whenever and wherever you want 🙂

PS: This may not be the best solution, for the following reasons, but it will get you started:

- You need to know how many columns you insert, and each time a new column is added you will have to modify the AttributesToJSON processor.
- You will not get the exact SQL INSERT/UPDATE statement, but a JSON file containing the column-value pairs, which anybody can easily analyze.
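To make the AttributesToJSON step concrete, here is a sketch of the mapping it would perform, expressed in plain Python. The column names are hypothetical; the sql.args.N.value attribute naming matches what PutSQL-style processors place on the FlowFile:

```python
import json

def failed_insert_to_json(attributes: dict, columns: list) -> str:
    """Mimic the AttributesToJSON step for a failed INSERT.

    Pairs each column name (in INSERT order) with the corresponding
    sql.args.N.value FlowFile attribute, N starting at 1. Missing
    attributes come through as null, which makes gaps easy to spot.
    """
    payload = {
        col: attributes.get(f"sql.args.{i}.value")
        for i, col in enumerate(columns, start=1)
    }
    return json.dumps(payload)
```

The resulting file contains column-value pairs rather than the raw SQL, which is exactly the trade-off described above.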
05-03-2023
10:39 AM
@Abhay_Kumar Welcome to the Cloudera Community! To help you get the best possible solution, I have tagged our Spark experts @Gopinath and @smdas who may be able to assist you further. Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
05-02-2023
10:29 AM
@acasta Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
05-02-2023
10:06 AM
@Amit_barnwal Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
04-24-2023
07:25 AM
Sorry for the late response. I use Oozie to submit a Spark job.
04-21-2023
03:00 AM
I fixed this issue by running the command on the destination cluster. I think it was caused by the original version being too old to support EC (erasure coding), since it is Hadoop 2.7.5.
04-20-2023
01:10 AM
Thank you @mszurap for your response. I tried the suggested workaround already, and the issue still persists. I agree the table has a lot of partitions, but I am pretty sure the code times out before 5 minutes. I have also tried enforcing the hive-site.xml with the updated timeout, which did not help much either. The only thing that worked was adding spark.catalog.recoverPartitions("orders") before issuing the drop-partition command. I am really not sure why recovering the partitions in the catalog eliminated the metastore warning. Below is the updated code, which works without any warning:

spark.catalog.recoverPartitions("orders")
spark.sql("alter table orders drop if exists partition(year=2023)")
data.write.mode('Overwrite').parquet(hdfsPath)

Any help in understanding the problem would be much appreciated.