Member since 02-04-2016

189 Posts
70 Kudos Received
9 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 4601 | 07-12-2018 01:58 PM |
|  | 9784 | 03-08-2018 10:44 AM |
|  | 4922 | 06-24-2017 11:18 AM |
|  | 25608 | 02-10-2017 04:54 PM |
|  | 2782 | 01-19-2017 01:41 PM |
12-05-2022 08:41 AM

@hargav Please create a new community question for your queries around the MergeRecord processor. That is the best way to get attention, and it is best for the community to have a separate thread for each specific query.

I am not clear on your use case for "cron driven" scheduling with MergeRecord; that would not be a common thing to do. It is best to explain your use case in a new community thread, along with sharing your MergeRecord processor configuration. Feel free to tag @MattWho in the new community post to notify me.

Thanks,
Matt
06-09-2021 06:42 AM

You can try setting the parameters below:

set hive.vectorized.execution.reduce.enabled=false;

and

set hive.vectorized.execution.enabled=true;
01-21-2021 01:16 AM

Hello, can you please help me with a similar script for batch renaming Hadoop files? Thanks!
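By way of illustration, a minimal Scala sketch of one way to batch-rename files in HDFS with the Hadoop FileSystem API; the /data/in directory and the tmp_/final_ prefixes are hypothetical placeholders, not values from this thread:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object BatchRename {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.get(new Configuration())
    val dir = new Path("/data/in")                  // directory to scan (placeholder)
    val (oldPrefix, newPrefix) = ("tmp_", "final_") // rename rule (placeholder)

    // Rename every file whose name starts with oldPrefix so that it starts
    // with newPrefix instead; FileSystem.rename returns false if a move fails.
    fs.listStatus(dir)
      .filter(s => s.isFile && s.getPath.getName.startsWith(oldPrefix))
      .foreach { s =>
        val target = new Path(dir, newPrefix + s.getPath.getName.stripPrefix(oldPrefix))
        if (!fs.rename(s.getPath, target))
          println(s"Failed to rename ${s.getPath}")
      }
  }
}
```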
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
07-29-2020 04:00 PM

Did you get a solution to this? I am also getting a communication error. My NiFi instance and MySQL are on the same Linux server.
05-28-2019 06:08 PM

It took me a while to look in /var/log/messages, but I found a ton of ntpd errors. It turns out that our nodes were having issues reaching the servers they were configured to use for time sync. I switched all the configurations to use a local on-premises server and restarted everything. I'm hoping that will be the full solution to our issue.
07-12-2018 01:58 PM

I was able to get this to work by using the insertInto() function, rather than the saveAsTable() function.
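For context, a hedged sketch of the switch described above; the table name db.events and the DataFrame are stand-ins, not details from this thread. insertInto() appends into an existing Hive table by column position, while saveAsTable() creates and manages the table definition itself:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("insertInto-sketch")
  .enableHiveSupport()
  .getOrCreate()

val df = spark.range(10).toDF("id")   // stand-in for the real DataFrame

// Appends into the already-existing table, matching columns by position.
df.write.mode("append").insertInto("db.events")

// The call that was replaced; it derives and owns the table definition:
// df.write.mode("append").saveAsTable("db.events")
```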
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
05-24-2018 12:04 PM

Thanks Matt,

My issue was firewall related. I'm all set now. Thanks for your help!
04-10-2018 08:31 PM

Here's what I ended up with:

import org.apache.spark.sql.functions.{callUDF, input_file_name}

// Register a UDF that keeps only the file name from a full path.
spark.udf.register("getOnlyFileName", (fullPath: String) => fullPath.split("/").last)

// Add a column holding just the source file's name.
val df2 = df1.withColumn("source_file_name2", callUDF("getOnlyFileName", input_file_name()))
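As a follow-up note, the same column can be derived without registering a UDF at all, using Spark's built-in substring_index (a count of -1 keeps everything after the last delimiter); df1 here stands for the same DataFrame as in the snippet above:

```scala
import org.apache.spark.sql.functions.{input_file_name, substring_index}

// Keep only the part of the path after the last "/", i.e. the file name.
val df2 = df1.withColumn("source_file_name2", substring_index(input_file_name(), "/", -1))
```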
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-01-2017 11:35 AM

Using your sed approach, this should replace all NULL values with the empty string:

sed 's/[\t]/,/g; s/NULL//g' > myfile.csv

If there is a chance that NULL is a substring of a real value, you will need the following instead, where ^ is the beginning of line, $ is the end of line, and , is your field delimiter:

sed 's/[\t]/,/g; s/^NULL,/,/g; s/,NULL,/,,/g; s/,NULL$/,/g' > myfile.csv

Note that if your result set is large, it is probably best to use Pig on HDFS rather than sed (to leverage the parallel processing of Hadoop and save yourself a lot of time).

Note also: to have Hive treat the empty string as NULL in the actual table, use the following in the DDL:

TBLPROPERTIES('serialization.null.format'='');