Member since: 10-24-2016 | 47 Posts | 15 Kudos Received | 0 Solutions
09-30-2017
04:24 AM
@Jorge moyano I tried the code below as you suggested, but I am still facing the same problem:

File fileObj = new File(pathname);
FileBody fileBody = new FileBody(fileObj, ContentType.MULTIPART_FORM_DATA);
HttpEntity multiPart = MultipartEntityBuilder.create().addPart("template", fileBody).build();
HttpPost httpPost = new HttpPost("http://localhost:8080/nifi-api/process-groups/root/templates/upload");
httpPost.addHeader("Content-type", "multipart/form-data");
httpPost.setEntity(multiPart);
HttpClient httpClient = HttpClientBuilder.create().build();
HttpResponse response = httpClient.execute(httpPost);

Response:

HttpResponseProxy{HTTP/1.1 500 Internal Server Error [Date: Sat, 30 Sep 2017 04:24:37 GMT, X-Frame-Options: SAMEORIGIN, Content-Type: text/plain, Transfer-Encoding: chunked, Server: Jetty(9.4.3.v20170317)] ResponseEntityProxy{[Content-Type: text/plain,Chunked: true]}}
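A likely culprit here (my assumption, not confirmed in the thread): calling `httpPost.addHeader("Content-type", "multipart/form-data")` overrides the Content-Type header that `MultipartEntityBuilder` generates via `setEntity`, and that generated header carries the `boundary` parameter the server needs to parse the body. Dropping the `addHeader` line is worth trying first. To illustrate why the boundary matters, here is a minimal JDK-only sketch of what a single-part multipart/form-data body looks like (the class and method names are mine; only the `template` field name comes from NiFi's upload endpoint):

```java
import java.nio.charset.StandardCharsets;

class TemplateUpload {

    /** Builds a single-part multipart/form-data body for the given file content. */
    static byte[] buildBody(String boundary, String fieldName,
                            String fileName, byte[] fileContent) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/xml\r\n\r\n";
        String tail = "\r\n--" + boundary + "--\r\n";
        byte[] h = head.getBytes(StandardCharsets.UTF_8);
        byte[] t = tail.getBytes(StandardCharsets.UTF_8);
        byte[] body = new byte[h.length + fileContent.length + t.length];
        System.arraycopy(h, 0, body, 0, h.length);
        System.arraycopy(fileContent, 0, body, h.length, fileContent.length);
        System.arraycopy(t, 0, body, h.length + fileContent.length, t.length);
        return body;
    }

    /** The Content-Type header must carry the same boundary string as the body. */
    static String contentType(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }
}
```

With Apache HttpClient, none of this is needed by hand: simply delete the `addHeader("Content-type", ...)` call, because a bare `multipart/form-data` header without a boundary makes the body unparseable.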
09-29-2017
01:09 PM
@Andrew Grande, can you please help with this?
09-28-2017
10:06 AM
I am trying to make a REST API call to import a template into my NiFi instance and then instantiate it.
Below is the code I tried:

String pathname = "D:\\Users\\bramasam\\Downloads\\BalaBackUp.xml";
String restString = "http://localhost:8080/nifi-api/process-groups/f80896d4-c71f-3395-d527-8c6bd69f44d0/templates/upload";
HttpPost httpPost = new HttpPost(restString);
File fileObj = new File(pathname);
httpPost.addHeader("Content-type", "multipart/form-data");
FileEntity fileEntity = new FileEntity(fileObj, ContentType.MULTIPART_FORM_DATA);
httpPost.setEntity(fileEntity);
HttpClient httpClient = HttpClientBuilder.create().build();
HttpResponse response = httpClient.execute(httpPost);
StatusLine status = response.getStatusLine();
System.out.println(status.getStatusCode());

I am getting a response code of 500 with the below response:

HttpResponseProxy{HTTP/1.1 500 Internal Server Error [Date: Thu, 28 Sep 2017 09:43:28 GMT, X-Frame-Options: SAMEORIGIN, Content-Type: text/plain, Transfer-Encoding: chunked, Server: Jetty(9.4.3.v20170317)] ResponseEntityProxy{[Content-Type: text/plain,Chunked: true]}}

Below is the {id} from the BalaBackUp.xml file which I am trying to import:

<?xml version="1.0" ?>
<template encoding-version="1.1">
<description></description>
<groupId>bd5dba8b-015d-1000-1fd5-450ede38b7a5</groupId>
<name>BalaBackUp</name>
<snippet>
<processGroups>
<id>f80896d4-c71f-3395-0000-000000000000</id>
<parentGroupId>29a5776d-9728-3fee-0000-000000000000</parentGroupId>
<position>
<x>0.0</x>
<y>0.0</y>
</position>
<comments></comments>
<contents>
<connections>
<id>c0d0e26d-5ee2-3d60-0000-000000000000</id>
<parentGroupId>f80896d4-c71f-3395-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
<backPressureObjectThreshold>10000</backPressureObjectThreshold>
<destination>
<groupId>f80896d4-c71f-3395-0000-000000000000</groupId>
<id>1f9e926a-71fc-356f-0000-000000000000</id>
<type>PROCESSOR</type>

Can you please help me with what I am missing?
09-15-2017
10:28 AM
I am trying to automate the deployment of a NiFi flow. We searched and found a link (given below): https://github.com/aperepel/nifi-api-deploy It contains Groovy code, but we are looking for a REST API approach in Java. The link also says that the REST APIs and concepts have changed significantly from NiFi 0.x to NiFi 1.x (the link covers NiFi 0.x). Since we are using NiFi 1.3.0, I thought it would be best to ask for help.
07-31-2017
12:37 PM
@sri chaturvedi Please try converting both values to String before comparison. Sample:

${d_last_updt:toDate('yyyy-MM-dd'):format('yyyyMMdd'):toString():gt(constant value:toString())}

Hope this helps
07-31-2017
12:29 PM
@Gayathri, I am not sure about your complete requirement, but the MergeContent processor is used only when you have to merge file contents that you split upstream using the SplitContent processor. I can suggest some workarounds:
1. You may use a common filename for files which follow a specific schema
2. Write the file into a local file system using PutFile, or into Hadoop using the PutHDFS processor
3. Set the 'Conflict Resolution Strategy' property to 'Append' when you write the file to disk
Hope this helps.
07-15-2017
07:16 AM
Hi, I have a file in which the first part of the data is JSON content and the second part is binary. Is there any standard NiFi processor which can help me separate the JSON and binary parts?
03-17-2017
06:23 AM
Hi @Rooster Raul. Your properties look good. Can you please check whether the folder you are trying to access has the necessary permissions for the user credentials? Because when it doesn't, you won't see any error during connection, and everything stays idle as if there were no files to be listed.
03-16-2017
06:39 AM
1 Kudo
@Rooster Raul Can you please share a screenshot of how your properties look? Against which property was the directory "/amazon/jungle/" configured? Was it the 'Remote Path' property? Anyway, here is a quick property checklist:
1. Remote Path: /amazon/jungle/
2. File Filter Regex: here, you need to put in a regex that matches your file name. You can use this link to check that your regex exactly matches the file name: http://regexr.com/
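As another quick sanity check outside NiFi, Java's own `Pattern` class uses the same regex syntax, so a filter expression can be tried locally before pasting it into the property (the file names and pattern below are invented examples):

```java
import java.util.regex.Pattern;

class FilterCheck {

    /** Mirrors a whole-file-name match of a filter regex against a candidate name. */
    static boolean matchesFilter(String regex, String fileName) {
        // Pattern.matches requires the regex to cover the entire file name
        return Pattern.matches(regex, fileName);
    }
}
```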
02-07-2017
12:56 PM
Can anyone please help me with this?
01-31-2017
11:28 AM
1 Kudo
@Raj B Please check the 'Writes Attributes' section in the 'usage' documentation of the processor. It describes the attributes each processor writes in case of failure. Below is the usage of 'ExecuteStreamCommand'; it details 4 different error attributes it writes during execution.
01-27-2017
07:58 AM
1 Kudo
The NiFi CSVtoAvroConverter processor does not support the decimal data type, so we have replaced decimal with double instead.
When decimal is replaced with double in the outgoing schema option, we are facing the below issue:
a number with precision=25, scale=3 (e.g. 1234567890123456789012345.123) is stored as 1.2345678901234568E24 in Hive.
If some other process later reads this value in Hive as a string, the meaning of the field completely changes. Can you please provide a solution for this?
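The loss here is inherent to IEEE-754 doubles, which carry only about 15-17 significant decimal digits, so a precision-25 value cannot survive the round trip; keeping the value as a string (or BigDecimal) preserves it exactly. A small stdlib demonstration using the number from the post (class and method names are mine):

```java
import java.math.BigDecimal;

class DecimalLoss {

    /** Shows how a precision-25, scale-3 decimal degrades when forced into a double. */
    static String asDouble(String decimalLiteral) {
        double d = new BigDecimal(decimalLiteral).doubleValue();
        return Double.toString(d);  // collapses to ~17 significant digits, e.g. 1.23...68E24
    }

    /** Keeping the value as a BigDecimal/string preserves every digit. */
    static String asString(String decimalLiteral) {
        return new BigDecimal(decimalLiteral).toPlainString();
    }
}
```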
01-27-2017
06:16 AM
1 Kudo
@Michal R Below is the code for CSVtoAvroConverter.
You need to add a relationship to your custom processor:

private static final Relationship INVALID = new Relationship.Builder()
        .name("invalid")
        .description("failed records with error appended").build();

final Set<Relationship> relationship = new HashSet<>();
relationship.add(INVALID);

// The current code already captures the error count as
long errors = failures.count();

// Here is a sample code snippet (I have not implemented it - just to give you an idea)
if (written.get() > 0L && errors == 0L) {
    session.remove(badRecords);
    session.transfer(outgoingAvro, SUCCESS);
} else {
    session.remove(outgoingAvro);
    /* This is the code which adds an attribute to the flow file. You can build this
       up as a string and append it to your records instead of having it as an attribute */
    if (errors > 0L) {
        getLogger().warn("Failed to convert {}/{} records from CSV to Avro",
                new Object[] { errors, errors });
        badRecords = session.putAttribute(badRecords, "errors", failures.summary());
    } else {
        badRecords = session.putAttribute(badRecords, "errors", "No incoming records");
    }
    // Here, you write the contents into the flow file
    WriteContentsToFlowFile.writeContentsToFlowFile(session, badRecords,
            failurereason, INVALID); // You can add something like a failure reason
}

Hope this helps
01-25-2017
11:46 AM
@Michal R
The type validation is a record-level operation. It's not necessary to use the same 'incompatible' relationship. You can create a new relationship, say 'invalid', plug in logic to send only those records which failed, and transfer the flow file to 'invalid' only when the error count is greater than 0.
01-25-2017
07:47 AM
@Michal R. There are two parts to the CSVtoAvro converter: first is type validation and next is the conversion. The error you have mentioned comes under type validation. ConvertCSVToAvro transfers a clone of the incoming CSV file to the 'incompatible' relationship, which contains the whole CSV record. You can add custom code there to append the error message at the end of each incompatible record (this is possible since those records failed type validation and were hence never converted to Avro). Hope this helps.
01-25-2017
07:46 AM
@Raj B You can do this in several ways:
1. You can have an UpdateAttribute processor at the beginning of every process group to specify the module name; here, you keep updating the same attribute throughout the flow. This cannot, however, be replicated at the processor level.
2. You can have 'metadata' (a table or a flat file) that logs all the attributes of the flow file. Please check the AttributesToJSON processor: it can write flow attributes either as an attribute or as content. Also note that every NiFi processor adds an additional 'error' attribute in case of any error, which AttributesToJSON can capture.
Hope this helps
01-21-2017
09:34 AM
Can the password in HiveConnectionPool be passed via a configuration or property file?
01-20-2017
03:57 PM
Thanks @Matt. Also, we see that the Hive connection pool service prompts for a username and password. The password is visible when keyed in. How can it be masked while the password is being entered?
01-20-2017
02:06 PM
Whenever we deploy a new template in the NiFi UI, the PutHiveQL processor seems to be in a disabled state. Can you please let me know how it can be enabled without manual intervention, or how I can automate it?
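One avenue for automating this (an assumption on my part, based on the NiFi 1.x REST API rather than anything confirmed in this thread) is to script a PUT against `/nifi-api/processors/{id}` with the desired component state, after first GETting the processor to learn its current revision version. The entity for that PUT might be built like this; the processor id and revision version below are placeholders:

```java
class RunStatusPayload {

    /**
     * Builds the JSON entity for changing a processor's state
     * (e.g. from DISABLED to STOPPED, then to RUNNING).
     * revisionVersion must match the version returned by a prior GET.
     */
    static String payload(long revisionVersion, String processorId, String state) {
        return String.format(
                "{\"revision\":{\"version\":%d},"
                + "\"component\":{\"id\":\"%s\",\"state\":\"%s\"}}",
                revisionVersion, processorId, state);
    }
}
```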
01-20-2017
01:54 PM
Thanks @Pierre Villard. I have a question here: NiFi seems to remember the state of the libraries from when the processor was first started as NiFi came up. Intermediate changes to this library path are not picked up unless NiFi is restarted. Can you please let me know if this is the way NiFi works?
01-19-2017
06:44 AM
Hi, one of the reasons we upgraded from HDF 2.0 to HDF 2.1 is to implement wandisco.fs.client. HDF 2.1 has an 'Additional Classpath Resources' property for processors involving HDFS (PutHDFS, GetHDFS). We have pointed it at the WanDisco Fusion client's library directory containing the necessary jars, shown below:

bcprov-jdk15on-1.52.jar
guava-11.0.2.jar
fusion-adk-common-2.9.jar
fusion-common-2.9-hdp-2.4.0.jar
fusion-adk-security-2.9.jar
netty-all-4.0.23.Final.jar
fusion-adk-netty-2.9.jar
fusion-adk-client-2.9-hdp-2.4.0.jar
fusion-client-common-2.9-hdp-2.4.0.jar

We have a customized processor, DelimitedFileToAvro, which mimics CSVtoAvroConvertor but accepts a regular expression as a property. While converting the file to Avro and fetching the schema from HDFS, we are facing the below error:

'DelimitedFileToAvro[id=11361386-1e19-16fa-5518-516b26f65df7] failed to process session due to java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.wandisco.fs.client.FusionHdfs not found'

Can you please help us with a solution to move forward?
01-11-2017
05:50 AM
1 Kudo
We have recently upgraded HDF from 2.0 to 2.1, importing the same NiFi flow template that worked in 2.0. Can you please point me to some links covering the updates made in HDF 2.1 over 2.0, specifically around performance, load handling, queue back pressure, etc.?
01-10-2017
06:49 AM
@Matt @Bryan Bende Thanks for your clarifications. I am using ListFile -> FetchHDFS -> FetchFile. It's the FetchHDFS processor that has updated the filename, and the FetchFile functionality has remained the same. We will use a workaround to carry the filename from ListFile into FetchFile.
01-10-2017
06:44 AM
The dataflow was templated and moved from HDF 2.0 to HDF 2.1.
01-09-2017
02:56 PM
@Matt Here is the complete scenario:
1. You are right, we use ListFile.
2. Our requirement: we use ListFile to extract the inbound file name and parse it. The parsed attribute is used to fetch a schema file from HDFS.
3. After we read the schema, we read the data that was listed, using FetchFile (in this case, instead of using the default "absolute.path" and "filename", I save the filename with its complete path in an attribute following ListFile). This FetchFile had the filename attribute updated with HDF 2.0, and it's not the same with HDF 2.1 (it retains the filename read from HDFS in the previous step).
4. FetchFile reads the desired content correctly; only the filename attribute is not updated.
01-09-2017
09:49 AM
I have just updated to HDF 2.1. In my flow, I do a GetFile, and downstream I do a FetchFile. What I see is that after the FetchFile processor executes, the filename attribute retains the filename read by the previous processor and is not updated with the file that FetchFile fetched. This was not the case in HDF 2.0.
01-04-2017
09:27 AM
I see an option 'Concurrent Tasks' under Scheduling in NiFi version 2.0. Here are my questions regarding concurrency:
1. I see this only at the individual processor level. Can it be set at the 'Process Group' level? In short, can I parameterize it?
2. How can I arrive at this number? What are all the factors that decide concurrency, e.g. memory, average data load, other applications in the same box like Spark?
12-20-2016
01:11 PM
1 Kudo
Thanks @Andrew Grande. We are now planning to use the Livy job server (http://livy.io/). Can anyone please guide me through this? I tried searching for documentation but failed to find a useful one.
12-08-2016
07:19 AM
2 Kudos
We have a two-node cluster: HDF is installed on one node and Spark on the other. In a single-node cluster I was able to trigger Spark from NiFi using the 'ExecuteStreamCommand' processor, with a spark-submit command placed in a shell script. Can you please let me know the guidelines for triggering Spark from NiFi in a multi-node cluster for the above scenario?
11-15-2016
10:26 AM
I have a CSV file with 9 columns. How can I remove duplicates among columns 4 through 9? What we tried:
1. Split columns 1-4 into a file
2. Split columns 4-9 -> deduplicate records
Then I tried using 'ReplaceTextWithMapping' to merge the files on the 4th column (common to both files). But I am not sure if my approach is right. Is there any other way to achieve this?
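Outside NiFi, the deduplication step itself amounts to keeping the first row seen for each distinct combination of columns 4-9. A sketch of that core logic is below; the class name, the naive comma split (real CSV with quoted fields needs a proper parser), and the first-row-wins policy are all my assumptions:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CsvDedup {

    /** Keeps the first row for each distinct value of columns 4..9 (1-based, inclusive). */
    static List<String> dedupOnColumns4to9(List<String> rows) {
        Set<String> seen = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            String[] cols = row.split(",", -1);      // -1 keeps trailing empty columns
            StringBuilder key = new StringBuilder();
            for (int i = 3; i < Math.min(9, cols.length); i++) {
                key.append(cols[i]).append('\u0001'); // separator unlikely to occur in data
            }
            if (seen.add(key.toString())) {           // add() is false for a repeated key
                out.add(row);
            }
        }
        return out;
    }
}
```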