About henrikolsen

henrikolsen · ‎01-09-2019

Any progress on getting MoveHDFS to accept attributes in Output Directory? It seems difficult not be able to have a dynamic solution as mentioned in this thread.

henrikolsen · ‎12-03-2018

Looking for this info also. Sorry to bump the the thread, but any news on this wish?

henrikolsen · ‎11-15-2018

Follow-up regarding Records and schemas. I use InferAvroSchema. It seems to miss the possibility of null in some of my CSV data. I've set it, as test, to analyse 1.000.000 records to ensure it sees all, but no luck. On some columns it adds possible null to field type, on others not. Is there a built in (invisible) upper limit to have many records are analysed? And could it be considered to add an option in the processor to always allow null values?

henrikolsen · ‎11-13-2018

I'm trying to gain experience with Records, specifically ValidateRecord. All FlowFiles out of ValidateRecord seem to be converted to the format set by the Record Writer property. Is there any way of not having the data parsed on output / maintaining the original input? I have a case with CSV input as example, where I'd like to log and report invalid lines as-is. Could be for passing back to data supplier, where I'd rather show original input, than something transformed by the Writer. I'd appreciate inspiration also to when you'd want the schema to validate against not be the one used by the Record Reader. Using NiFi 1.5.0.3.

henrikolsen · ‎10-12-2018

I see GetHDFSFileInfo from 1.7 might be relevant. Running 1.5 currently though. Suggestions on that platform?

henrikolsen · ‎10-12-2018

In PutHDFS I have set Conflict Resolution Strategy to Fail as I don't want to overwrite existing files. But for error handling and logging/notification, I need to differentiate file-exist fails from other types of fails from this processor. How is that possible? In bulletin board I can see a text message from the processor indicating when file exists, but how do I get that info in the flow itself? Is the message / fail type available for flow control handling somehow? Suggestions?

henrikolsen · ‎07-09-2018

@Pierre Villard: "A common approach is something like GenerateTableFetch on the primary node and QueryDatabaseTable on all nodes. The first processor will generate SQL queries to fetch the data by "page" of specified size, and the second will actually get the data. This way, all nodes of your NiFi cluster can be used to get the data from the database.": Will I need to make a (local) RPG after the GenerateTableFetch to get them running in parallel? Any experience on performance for making full RDBMS table dumps using this method vs Sqoop?

henrikolsen · ‎07-09-2018

Running HDF 3.1.0. I'm quite often getting this error in the NiFi web UI: "An unexpected error has occured: javax.ws.rs.ProcessingException: java.net.SocketTimeoutException: Read timed out" Would appreciate a tip on how to avoid this. Thanks.

henrikolsen · ‎06-07-2018

When using the DistributeMapCache, how do I know how it behaves? Do I have control over time-to-live, eviction strategy etc? Can I see what it contains, or only do single specific key lookups?

henrikolsen · ‎06-07-2018

Thanks a lot, Matt. Good solution and example. I'm currently restricted to an old NiFi (version 1.1), and wasn't aware of the notify/wait pattern available in newer versions. From which version is that available, and where in general can I lookup from which version a particular feature/processor is available? Will soon be working on a newer HDF 3.1, so hopefully that will unlock new needed features. - I still didn't quite get whether FetchSFTP (used with ListSFTP) will reuse connection across multiple file fetches, or it open/closes for every flowfile processed. Regards and thanks Henrik

Online	Offline
Last Visited	‎06-11-2019 07:37 AM

Member Since	‎11-27-2017 09:10 AM
Last Visited	‎06-11-2019 07:37 AM
Posts	52
Kudos received	3

Cloudera Community

Re: Using Attribute for Output Directory in MoveH...

Re: Processor failures: how to get the error on th...

Re: NiFi, ValidateRecord

NiFi, ValidateRecord

Re: NiFi, HDFS: Knowing and handling if file exist...

NiFi, HDFS: Knowing and handling if file exists be...

Re: Execute sqoop on NiFi

NiFi web UI timeouts

Re: FetchSFTP and reuse of connection

Re: FetchSFTP and reuse of connection