Member since: 02-01-2022
Posts: 274
Kudos Received: 97
Solutions: 60
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 402 | 05-15-2025 05:45 AM |
| | 3396 | 06-12-2024 06:43 AM |
| | 5926 | 04-12-2024 06:05 AM |
| | 4065 | 12-07-2023 04:50 AM |
| | 2184 | 12-05-2023 06:22 AM |
10-31-2022
09:33 AM
@steven-matison I have been trying to get ifElse working for me, but the below gives me an empty string: "". And this gives me null as a string: "null". Is there a way to return an actual null, not the string "null"?
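For reference, the general shape of an ifElse expression in NiFi Expression Language is sketched below; the attribute name myattr is hypothetical, since the original expressions did not survive in this post:

```
# ifElse(valueIfTrue, valueIfFalse) requires a boolean subject,
# e.g. isEmpty() or equals():
${myattr:isEmpty():ifElse('EMPTY', ${myattr})}
```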
10-28-2022
01:06 PM
@D5ha Not all processors write to the content repository, nor is the content of a FlowFile ever modified after it is created. Once a FlowFile is created in NiFi, it exists as-is until it is terminated. A NiFi FlowFile consists of two parts: FlowFile attributes (metadata about the FlowFile, which includes details about the location of the FlowFile's content in the content_repository) and the FlowFile content itself. When a downstream processor modifies the content of a FlowFile, what really happens is that new content is written to a new content claim in the content_repository; the original content remains unchanged.

From what you shared, you appear to have just one content_repository. Within that single content_repository, NiFi creates a number of sub-directories. NiFi does this for better indexing and seeking, because of the massive number of content claims a user's dataflow(s) may hold.

It is also very important to understand that a content claim in the content_repository can hold the content for one or more FlowFiles; it is not always one content claim per FlowFile's content. It is also very possible to have multiple queued FlowFiles pointing to the exact same content claim and offset (the exact same content). This happens when your dataflow clones a FlowFile (for example, routing the same outbound relationship from a processor multiple times). So you should never manually delete claims from any content repository, as you may delete content belonging to multiple FlowFiles.

That being said, you can use data provenance to locate the content repository (Container), subdirectory (Section), content claim filename (Identifier), the byte where the content begins in that claim (Offset), and the number of bytes from the offset to the end of the content in the claim (Size).

Right-click on a processor and select "View data provenance" from the displayed context menu. This will list all FlowFiles processed by this processor for which provenance still holds index data. Click the Show Lineage icon (looks like 3 connected circles) to the far right of a FlowFile. You can right-click on "clone" and "join" events to find/expand any parent FlowFiles in the lineage (the event dot created for the processor on which you selected data provenance will be colored red in the lineage graph). Each white circle is a different FlowFile; clicking on a white circle will highlight the dataflow path for that FlowFile. Right-clicking on an event like "create" and selecting "View details" will tell you everything that is known about that FlowFile, including a tab about the "content":

- Container corresponds to the following property in the nifi.properties file: nifi.content.repository.directory.default=
- Section corresponds to the subdirectory within the above content repository path.
- Identifier is the content claim filename.
- Offset is the byte at which this FlowFile's content begins within that Identifier.
- Size is the number of bytes from the Offset to the end of this FlowFile's content in the Identifier.

I also created an article on how to index the Content Identifier. Indexing that field allows you to locate a content claim and then search for it in your data provenance to find all FlowFile(s) that pointed at it. You can then view the details of all those FlowFile(s) to see the full content claim details as above: https://community.cloudera.com/t5/Community-Articles/How-to-determine-which-FlowFiles-are-associated-to-the-same/ta-p/249185

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click "Accept as Solution" below each response that helped. Thank you, Matt
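As an illustrative sketch (the repository root and claim values below are hypothetical, not taken from this thread), those provenance fields map onto the filesystem like this:

```
# nifi.properties -- Container maps to the repository root:
nifi.content.repository.directory.default=/opt/nifi/content_repository

# A FlowFile whose provenance content tab shows:
#   Container: default    Section: 432    Identifier: 1667913567234-1024
#   Offset: 2048          Size: 512
# has its content inside the claim file:
#   /opt/nifi/content_repository/432/1667913567234-1024
# starting 2048 bytes into that file and running for 512 bytes.
```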
10-28-2022
06:56 AM
Yes, the code is the same; for small tables it works fine. Here I need to query around ~200 GB of data.
10-28-2022
05:44 AM
1 Kudo
@sathish3389 Define a parameter context and a parameter ("parameter_password") for your flow with your password string, mark it as a sensitive value, then use the parameter in the processor property value: ${http.headers.Authorization:equals(#{parameter_password})} This will hide the password and make it easy to update by simply changing the parameter.
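A minimal sketch of how this could look in a RouteOnAttribute processor (the route name "authorized" is an illustrative placeholder):

```
# RouteOnAttribute -- add a dynamic property; its name becomes a relationship:
#   authorized = ${http.headers.Authorization:equals(#{parameter_password})}
# FlowFiles whose Authorization header matches the sensitive parameter are
# routed to "authorized"; everything else goes to "unmatched".
```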
10-28-2022
05:24 AM
@Ekodar You will need to use a driver to connect PHP to Impala. A quick search turned up this promising guide: https://docs.cloudera.com/documentation/other/connectors/impala-jdbc/latest/Cloudera-JDBC-Driver-for-Impala-Install-Guide.pdf Here is another example with more details showing actual PHP code: https://www.cdata.com/kb/tech/impala-odbc-php.rst
10-25-2022
12:53 PM
@ryu CDP Public Cloud Azure, or CDP Private Cloud on Azure VMs? To link a NiFi outside of the cluster, you will need to provide that NiFi with the configuration files from the CDP cluster, for example core-site.xml and hdfs-site.xml. Beyond that configuration, you will need to do some networking to allow access between the systems, and then, last but not least, deal with access/auth and Kerberos. If you are already working on some of these areas, be sure to include screenshots of processors, controller services, configs, etc.
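As a sketch of where those files plug in (all paths and the principal below are hypothetical), the HDFS processors such as PutHDFS and ListHDFS take them via the Hadoop Configuration Resources property; depending on your NiFi version, Kerberos is set either directly on the processor or through a Kerberos credentials controller service:

```
# PutHDFS / ListHDFS processor properties (illustrative values):
#   Hadoop Configuration Resources = /opt/nifi/conf/core-site.xml,/opt/nifi/conf/hdfs-site.xml
#   Kerberos Principal             = nifi@EXAMPLE.COM
#   Kerberos Keytab                = /opt/nifi/conf/nifi.keytab
```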
10-24-2022
06:27 AM
@yoiun Going to go out on a limb here: it seems like the sqoop command and the Hue/Sqoop command are executed on different hosts. Does the new host have permissions to MySQL? This error: Access denied for user 'demo'@'152.30.119.754' leads me to believe it does not.
10-24-2022
06:18 AM
@i_am_dba This is a very difficult one to explain. I think the issue is the string schema, or the removal of the Avro schema you mentioned. My first suggestion would be to try to specify the schema explicitly, which should help get the data into the right formats. An alternate solution is to try to do that manually with ReplaceText/regex, etc., but that is not the ideal solution. That said, a higher-level suggestion is to update the upstream data source to permanently resolve the instability between '' (blank string), 'null' (string), and actual NULL (not a blank, '', or a string at all).
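As an illustrative sketch (the record and field names are hypothetical), an explicit Avro schema pasted into a record reader's Schema Text property can declare the unstable field as a nullable union, so an actual NULL survives instead of being coerced into a string:

```
{
  "type": "record",
  "name": "example_record",
  "fields": [
    { "name": "id",    "type": "string" },
    { "name": "value", "type": ["null", "string"], "default": null }
  ]
}
```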
10-24-2022
06:09 AM
@MaarufB Please make a new post with as much detail as you can around your question and use case. This is an old topic and will not get a good response in the comments. Feel free to @ tag me in the new post.
10-13-2022
08:05 AM
@sathish3389 It's not entirely clear what you are asking here, but I will give it a go. ListenHTTP is used to listen on an HTTP port with limited POST capabilities. If you are looking to post data to NiFi as more of a REST API, you may want to check out HandleHttpRequest and HandleHttpResponse, as they have a bit more capability, including SSL client authentication options. They also allow you to program authentication logic before returning the response. To do that, you would build your dataflow (after HandleHttpRequest) to look for an authentication header (user, password, key, etc.) and validate it; if it is valid, continue to HandleHttpResponse with 200 (success). An invalid authentication header would then go to HandleHttpResponse with 500 (error). An invalid request (wrong path, missing info, etc.) could be routed to HandleHttpResponse with 404 (not found). A sketch of that routing follows.
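A minimal sketch of that flow, assuming a hypothetical sensitive parameter #{api_password} and relying on HandleHttpRequest exposing request headers as FlowFile attributes prefixed with http.headers.:

```
# HandleHttpRequest -> RouteOnAttribute -> HandleHttpResponse
# RouteOnAttribute dynamic property (route name "valid_auth" is illustrative):
#   valid_auth = ${http.headers.Authorization:equals(#{api_password})}
#
# Connections:
#   "valid_auth" -> HandleHttpResponse with HTTP Status Code = 200
#   "unmatched"  -> HandleHttpResponse with HTTP Status Code = 500
```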