Member since: 09-23-2016
Posts: 35
Kudos Received: 20
Solutions: 12
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 497 | 06-01-2017 11:21 AM |
| | 1289 | 05-15-2017 12:20 PM |
| | 2099 | 05-03-2017 08:53 AM |
| | 3473 | 05-03-2017 07:53 AM |
| | 1597 | 02-21-2017 08:27 AM |
06-01-2017
11:21 AM
1 Kudo
Simran, you can merge single JSON objects into a larger file before you put it into HDFS. There is a dedicated processor for this: MergeContent (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.2.0/org.apache.nifi.processors.standard.MergeContent/index.html). The processor also lets you configure how many JSON objects should be merged into one single file via the 'Minimum Number of Entries' property. As a side note: once a processor is on your canvas, you can right-click it and choose 'Usage' to display its documentation. Hope that helps.
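A minimal property sketch, assuming you want the merged output to be one valid JSON array; the property names are MergeContent's own, while the entry count of 1000 is just an example value:

```
Merge Strategy            : Bin-Packing Algorithm
Merge Format              : Binary Concatenation
Minimum Number of Entries : 1000
Delimiter Strategy        : Text
Header                    : [
Footer                    : ]
Demarcator                : ,
```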
05-19-2017
11:51 AM
Actually, you do not need to hard-code the values. You can pass the file name and path dynamically to the next processors. Please check the documentation at https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html#flowfile For example, ${filename} will return the value of the "filename" attribute. Other attributes available in this context are:

- Filename ("filename"): The filename of the FlowFile. The filename should not contain any directory structure.
- UUID ("uuid"): A universally unique identifier (UUID) assigned to this FlowFile.
- Path ("path"): The FlowFile's path indicates the relative directory to which a FlowFile belongs and does not contain the filename.
- Absolute Path ("absolute.path"): The FlowFile's absolute path indicates the absolute directory to which a FlowFile belongs and does not contain the filename.
05-19-2017
10:01 AM
Hi, yes, this works as intended. GetFile is a flow-starting processor; you cannot connect to it from other processors - think of it like a process-instance trigger. Please use the FetchFile processor instead: GetFile -> PutFile -> FetchFile -> PutHDFS. Hope that helps.
05-18-2017
07:22 AM
Have you tried out: testkey=(\w+)
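For illustration (the input line is made up), the capture group grabs everything after the equals sign up to the next non-word character:

```
Input   : testkey=abc123
Pattern : testkey=(\w+)
Group 1 : abc123
```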
05-15-2017
03:59 PM
1 Kudo
Yes, Wing Lo. Switch the LDAP/Kerberos authentication off and you will be able to migrate and download the existing flows. This is an acceptable workaround here. As said, this is fixed in NiFi 1.1 and above. BR
05-15-2017
02:46 PM
2 Kudos
Unfortunately, there was a bug in NiFi 1.0 that prevented downloading (templates, FlowFile content, etc.) when authenticated via LDAP or Kerberos. Please upgrade to NiFi 1.1 or 1.2; that will solve your problem.
05-15-2017
12:20 PM
3 Kudos
Hi, as of today you cannot mix different NiFi versions within the same NiFi cluster (managed by ZooKeeper). However, you can set up separate clusters, since both NiFi and ZooKeeper can run multiple times on the same servers: you only have to copy them into different folders, separate the config files (nifi.properties, zoo.cfg, etc.), and set different data dirs, provenance dirs, and so on. I would recommend starting with two separate NiFi instances, /opt/nifi1 and /opt/nifi2, each with its own paths and ports configured in its copy of nifi.properties. Especially take care of the paths for:

- content_repository
- database_repository
- flowfile_repository
- provenance_repository
- work directory
- logs directory

These are often forgotten when copying NiFi instance 1 as the base for setting up a second one. Just have a look at the parameters of the properties file; a sketch follows below. Hope that helps.
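A minimal sketch of the second instance's nifi.properties - the property names are the standard ones, while the port number and directory paths are example values that just must not clash with the first instance:

```
# /opt/nifi2/conf/nifi.properties
nifi.web.http.port=8081
nifi.flowfile.repository.directory=/opt/nifi2/flowfile_repository
nifi.content.repository.directory.default=/opt/nifi2/content_repository
nifi.database.directory=/opt/nifi2/database_repository
nifi.provenance.repository.directory.default=/opt/nifi2/provenance_repository
```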
05-10-2017
08:48 AM
And please be aware that the WebUI/canvas is the IDE for the NiFi instance/cluster itself. You are working directly on the NiFi instance; it is not a remote modelling approach where you deploy to the server after finishing your development. Everything you model is applied directly to the NiFi instance.
05-10-2017
08:41 AM
4 Kudos
Yes. Although there is only one canvas when you open the WebUI, that canvas can host many logical dataflows. Typically you organize each logical dataflow into a process group, and then start and stop the whole process group. You can also grant or limit permissions on the process groups for different users/groups, allowing multiple teams to share the same NiFi instance/cluster. Here is the link to the docs: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#process_group_anatomy There is a very good article about how to use process groups as reusable components: https://community.hortonworks.com/articles/16461/nifi-understanding-how-to-use-process-groups-and-r.html
05-09-2017
07:35 AM
@Amol Kulkarni - does that answer your question? Solved?
05-03-2017
08:53 AM
That is always a nightmare in Java-based tools. Hive relies on Java (plus SQL), so it follows the IEEE 754 standard for floating-point semantics. That means in particular that NaN (not a number) values in float columns are a tricky thing.
First of all: have you tested what is returned for the '#N/A' columns when you do a SELECT? I guess it is 'NaN' rather than '#N/A'.
After checking the return value, I would suggest testing two approaches. Either use cast(): cast(dollar as string) <> 'NaN' (because all possible NaN values are displayed as "NaN", even if they are not strictly "equal" in the arithmetical sense), or use the old trick of testing whether the column value survives a mathematical comparison, e.g. dollar + 1.0 > dollar (which is true for any real number but false for NaN).
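As a sketch with a made-up table name (sales) and the dollar column from above - both filters keep only the non-NaN rows:

```sql
-- string-compare approach: every NaN value renders as the literal 'NaN'
SELECT * FROM sales WHERE cast(dollar AS string) <> 'NaN';

-- arithmetic approach: NaN fails every comparison, so NaN rows drop out
SELECT * FROM sales WHERE dollar + 1.0 > dollar;
```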
05-03-2017
07:53 AM
1 Kudo
It is not supposed to generate unique values. The hash() function works with ranges: it is supposed to index different ranges with integer values. Think of grouping similar ranges of values in a large data set into smaller subsets and having an index to find the respective subset. A good explanation can be found here: http://preshing.com/20110504/hash-collision-probabilities/ If you want to generate unique values, have a look at the reflect UDF: reflect("java.util.UUID", "randomUUID")
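A minimal sketch with a hypothetical table name (my_table), using the built-in reflect UDF mentioned above to produce one random UUID per row:

```sql
SELECT reflect("java.util.UUID", "randomUUID") AS row_id,
       t.*
FROM my_table t;
```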
02-21-2017
10:34 AM
Glad that it helped. Could you please click to accept the above answer, so that others see that this is the solution? Thanks! 🙂
02-21-2017
08:27 AM
1 Kudo
Hi,
the PutEmail processor supports the NiFi Expression Language for the Subject parameter. That means you can access all the attributes of your FlowFile as well as any custom attributes or variables that you defined within the flow.
To have a custom subject in your PutEmail processor for the error-handling case, connect the failure relationship of the PutHDFS (or GetFile, or both) processor to the PutEmail processor and configure PutEmail accordingly.
An example of a custom Subject could be: Hello from ${hostname()}, the file ${filename} caused an error at ${now()}. More examples and guidelines for the NiFi Expression Language are listed here: https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
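A sketch of the full Subject property value, assuming you also want a formatted timestamp (format() is a standard Expression Language function on dates):

```
Subject : Error on ${hostname()}: file ${filename} failed at ${now():format('yyyy-MM-dd HH:mm:ss')}
```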
02-06-2017
09:54 AM
The two values were just examples. Try changing them to something that fits your system environment: either go below 512 (that might do the job) or increase the RAM assigned to the container:

1. Increase the VirtualBox memory from (I guess) 4096 to (e.g.) 8192.
2. Log into Ambari at http://my.local.host:8080.
3. Change the values of yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb from the defaults to 4096.
4. Save and restart (at least YARN, Oozie, Spark).
02-06-2017
08:41 AM
Ah, sorry :) Yes, here you can't specify driver-related parameters using <spark-opts>--driver-memory 10g</spark-opts>, because your driver (the Oozie launcher job) is already launched before that point. It's the Oozie launcher (a MapReduce job) that launches your actual Spark job, so spark-opts is not relevant there. But the Oozie Spark action doc says: "The configuration element, if present, contains configuration properties that are passed to the Spark job." This shouldn't be Spark configuration; it should be MapReduce configuration for the launcher job. So, please try to add the following:

  <configuration>
    <property>
      <name>oozie.launcher.mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>oozie.launcher.mapreduce.map.java.opts</name>
      <value>-Xmx3072m</value>
    </property>
  </configuration>
02-06-2017
08:12 AM
1 Kudo
It seems your Spark driver is running with a very small heap size. Please try increasing the driver memory and see if it helps. Use this parameter (e.g.) when submitting the job: --driver-memory 1g
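For illustration, with a made-up application class and jar name:

```
spark-submit --driver-memory 1g --class com.example.MyApp my-app.jar
```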
02-06-2017
07:57 AM
1 Kudo
Do you have another application that reads from the Event Hub using EventProcessorHost? EventProcessorHost sets an epoch on the receiver to ensure that only one reader is active for a given consumer group and Event Hub partition. You can try it with a different consumer group. Another scenario where this could happen is if you turn on checkpointing in EventProcessorHost. Here are some guidelines from MS on how to use the epoch settings for asynchronous receivers: https://blogs.msdn.microsoft.com/gyan/2014/09/02/event-hubs-receiver-epoch/
01-03-2017
08:59 AM
This error message indicates that Hive cannot find the file under the given path. I assume you are using the Sandbox, right? So there should be no permission issues for user admin in Hive and HDFS. Can you please check that the path you entered does not contain leading or trailing whitespace: '/tmp/data/geolocation.csv'
11-30-2016
09:53 AM
Very good, glad to help. I converted the comment that finally helped into an answer and would be happy if you accept it 😉 Thanks.
11-30-2016
09:13 AM
You can also use the commandline: https://community.hortonworks.com/questions/49338/using-the-nifi-rest-api-to-execute-a-data-flow-cre.html
11-30-2016
09:08 AM
Thanks Bhanu, the error comes from Hive. Can you please also have a look at the Hive metastore log file and share any related error messages?
11-30-2016
09:08 AM
1 Kudo
Ah, Hive has reached the maximum number of open transactions. The parameter hive.max.open.txns limits this number (compare https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions). The default value is 100000 - this should be high enough, and I guess you did not change it, but maybe you want to check. Not sure what is going on in your environment, but you may want to look at Chapter 6 of http://hortonworks.com/hadoop-tutorial/using-hive-acid-transactions-insert-update-delete-data/ to see how to manage your open transactions. Hope this helps.
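To inspect what is currently open, two standard Hive ACID commands can help:

```sql
-- list all currently open or aborted transactions
SHOW TRANSACTIONS;

-- list locks, which often points at the sessions holding transactions open
SHOW LOCKS;
```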
11-30-2016
08:35 AM
Avijeet, yes, this is possible. You will have to use the "Update Processor" API of NiFi. There is a great description by Andrew Grande about how to update NiFi flows on the fly: https://community.hortonworks.com/articles/3160/update-nifi-flow-on-the-fly-via-api.html
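A sketch of the REST calls involved; the host, port, and processor id are placeholders, and the JSON body you PUT must carry the processor's current revision:

```
# fetch the current state (including the revision) of a processor
curl http://nifi-host:8080/nifi-api/processors/<processor-id>

# push the modified configuration back
curl -X PUT -H 'Content-Type: application/json' \
     -d @processor.json \
     http://nifi-host:8080/nifi-api/processors/<processor-id>
```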
11-30-2016
08:06 AM
2 Kudos
Actually, the NiFi canvas can be used by multiple sessions in parallel - just test it in your environment: open it once in web browser A and then in B (these should be different browsers, e.g. Firefox and Chrome).
However, you are right - the features still need some improvement to better support this kind of management.
The good news is that this is already considered important; have a look at https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows Right now, a good approach is to group the different dataflows into process groups for management and version control (via templates).
11-30-2016
07:36 AM
Bhanu, can you please share the nifi-app.log entry for the error, so we can see what exactly happened?
11-30-2016
07:19 AM
Hi Mud, well, this is not so simple. The reason is that MS SQL does not support UPSERT out of the box. If it did, you could simply create an INSERT statement in NiFi and replace the INSERT with UPSERT using the ReplaceText processor. (This approach is used when interacting with Apache Phoenix, which supports the UPSERT SQL verb.) However, there are two options. First, you could work with triggers and simply catch your INSERT to check whether it should be changed into an UPDATE.
This costs performance on the DB side, and it really depends on a) the number of inserts and b) who else is interacting with your table. The second option is to use the ExecuteProcess / ExecuteScript processor to invoke a shell or Groovy script that does the UPSERT for you. This approach also costs some performance / additional I/O, but it will do the job. It also allows you to do the magic either on the DB layer (e.g. with a stored procedure) or in the script itself. Here is an example of the stored-procedure approach: http://www.sergeyv.com/blog/archive/2010/09/10/sql-server-upsert-equivalent.aspx There is also an entry here in the community where calling a stored procedure is explained: https://community.hortonworks.com/questions/26170/does-executesql-processor-allow-to-execute-stored.html HTH. PS: The same approach works for MERGE, and is even simpler since MERGE is completely MS SQL Server based - you first populate a source table in MS SQL Server and then invoke the MERGE script, as sketched below.
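A minimal T-SQL sketch of that MERGE step, with hypothetical table and column names (staging holds the rows NiFi just loaded; target is the table to upsert into):

```sql
MERGE target AS t
USING staging AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.val = s.val
WHEN NOT MATCHED THEN
    INSERT (id, val) VALUES (s.id, s.val);
```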
11-29-2016
08:37 AM
Great. Then please drop the table and create it again using the STORED AS TEXTFILE clause, or use the procedure I described above to import the data via a temp table in between, if you really need it stored as ORC.
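As a sketch, with a made-up table name and schema:

```sql
DROP TABLE my_table;

CREATE TABLE my_table (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```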
11-29-2016
08:24 AM
2 Kudos
To shortcut the procedure, you can also set the other Java version for the whole server: /usr/sbin/alternatives --config java Note that this changes the Java version for the whole system, not just for Ambari and the HDP components.