02:40 PM
2 Kudos
@Rajesh AJ Use GetFile (or List/FetchFile) processors to fetch the file, then use:
1. SplitText processor (if you have one URL per line) with Line Split Count set to 1. If your file has 4 lines, SplitText will produce a separate flowfile for each line. Example: the input is 1 file (4 lines) and the output is 4 flowfiles (one line each).
2. ExtractText processor to extract the URL into a flowfile attribute. Depending on the size of your file content you may need to change the Maximum Buffer Size property. Here I extract the whole content of the flowfile into the url attribute by using the regex (.*) in a property named url:
url
(.*) //extract the whole content of the flowfile and add it to the flowfile's url attribute
3. InvokeHTTP processor with ${url}. InvokeHTTP uses the url attribute extracted by ExtractText, so the value changes dynamically according to each flowfile's content.
Flow:
1. GetFile
2. SplitText
3. ExtractText
4. InvokeHTTP
If the answer helped to resolve your issue, click the Accept button below to accept it. That helps Community users find solutions quickly for these kinds of errors.
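For readers who want to see the logic of the flow outside NiFi, here is a minimal Java sketch of the same three steps. It is not NiFi code and does not use any NiFi API: the class name UrlFlowSketch, its method names, and the example.com URLs are all made up for illustration, and it only builds the request targets instead of sending HTTP requests. Skipping blank lines is a simplification of this sketch, not a statement about how SplitText behaves.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the SplitText -> ExtractText -> InvokeHTTP idea.
final class UrlFlowSketch {
    // Same regex as the "url" property above: capture the whole line.
    private static final Pattern URL_PATTERN = Pattern.compile("(.*)");

    // Step 1 (SplitText, Line Split Count = 1): one entry per line.
    // Blank lines are skipped here as a simplification.
    static List<String> splitLines(String content) {
        List<String> lines = new ArrayList<>();
        for (String line : content.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    // Step 2 (ExtractText): the captured group becomes the "url" value.
    static String extractUrl(String line) {
        Matcher m = URL_PATTERN.matcher(line);
        return m.matches() ? m.group(1) : null;
    }

    // Step 3 (InvokeHTTP with ${url}): one request target per line.
    // No request is sent; the sketch only builds the URIs.
    static List<URI> targets(String content) {
        List<URI> result = new ArrayList<>();
        for (String line : splitLines(content)) {
            result.add(URI.create(extractUrl(line)));
        }
        return result;
    }

    public static void main(String[] args) {
        String file = "http://example.com/a\nhttp://example.com/b\n";
        for (URI target : targets(file)) {
            System.out.println("GET " + target);
        }
    }
}
```

Each line of the input ends up as its own request target, which is the same effect the url attribute has per flowfile.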
05:47 AM
Thanks, it works for me. I also installed NiFi 1.1.0, which has the PutElasticsearch5 processor; that also works well with the transport client.
08:18 AM
5 Kudos
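This looks like a timestamp, not a date. You can store it as a string in the Hive table and retrieve it using the to_date() function, or you can run a date transformation before inserting into the Hive table. It looks like you have an RFC822-style timestamp, which you can convert into a format Hive understands like this. I am using a Java program to do it:
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
public class RFC822TimeStampConverter {
public static void main(String[] args) {
String rfcDate = "Tue, Dec 20 10:04:31 2016";
// dd = day of month, yyyy = calendar year (DD is day of year, YYYY is week year)
String pattern = "EEE, MMM dd HH:mm:ss yyyy";
// Locale.ENGLISH so "Tue" and "Dec" parse regardless of the default locale
SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
try {
Date javaDate = format.parse(rfcDate);
} catch (ParseException e) {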
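Since the program above is cut off, here is a small self-contained sketch of the same conversion using java.time. It is my own illustration, not the rest of the original program: the class name HiveTimestampSketch and the method toHiveTimestamp are made up, and the output pattern yyyy-MM-dd HH:mm:ss is chosen because it is the text form Hive accepts for TIMESTAMP values. It assumes the input always looks like the sample string, with a two-digit day of month.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: converts "Tue, Dec 20 10:04:31 2016" style strings
// into "yyyy-MM-dd HH:mm:ss".
final class HiveTimestampSketch {
    // Input pattern matching the sample string; the English locale is needed
    // for the day and month names.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("EEE, MMM dd HH:mm:ss yyyy", Locale.ENGLISH);

    // Output pattern in the Hive timestamp text form.
    private static final DateTimeFormatter HIVE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Throws java.time.format.DateTimeParseException if the input does not match.
    static String toHiveTimestamp(String raw) {
        return LocalDateTime.parse(raw, INPUT).format(HIVE);
    }

    public static void main(String[] args) {
        // prints 2016-12-20 10:04:31
        System.out.println(toHiveTimestamp("Tue, Dec 20 10:04:31 2016"));
    }
}
```

If your data can contain single-digit days without a leading zero, the day part of the input pattern would need to be d instead of dd.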
12:50 PM
1 Kudo
The most direct way is to transform the date to the correct format in NiFi. Alternatively, you could land it in a Hive table and CTAS to a new table while transforming to the correct format. See this for Hive timestamp format to be used in either case:
NiFi: Before putting to HDFS or Hive, use a ReplaceText processor. You will use regex to find the timestamp pattern from the original Twitter JSON and replace it with the timestamp pattern needed in Hive/Kibana. This article should help you out:
Hive alternative: Here you either use a SerDe to transform the timestamp or you use regex. In both cases, you land the data in a Hive table, then CTAS (Create Table as Select) to a final table. This should help you out for this approach:
To me, the NiFi approach is superior (unless you must store the original with the untransformed date in Hadoop).
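To make the find-and-replace idea concrete, here is a rough Java sketch that rewrites the timestamp inside a JSON string. It is not a ReplaceText configuration, only an illustration of the regex step. Assumptions on my side: the field is called created_at and its value looks like "Tue Dec 20 10:04:31 +0000 2016" (the usual Twitter shape), the target is yyyy-MM-dd HH:mm:ss, and the class name CreatedAtRewriteSketch and method rewrite are made up.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: find "created_at" values with a regex and
// replace each with a Hive-style timestamp.
final class CreatedAtRewriteSketch {
    // Group 1: the key and opening quote, group 2: the timestamp, group 3: the closing quote.
    private static final Pattern CREATED_AT =
            Pattern.compile("(\"created_at\"\\s*:\\s*\")([^\"]+)(\")");

    // Assumed input shape, e.g. "Tue Dec 20 10:04:31 +0000 2016".
    private static final DateTimeFormatter TWITTER =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss Z yyyy", Locale.ENGLISH);

    private static final DateTimeFormatter HIVE =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String rewrite(String json) {
        Matcher m = CREATED_AT.matcher(json);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the date and time as written in the input; the offset is dropped.
            String converted = OffsetDateTime.parse(m.group(2), TWITTER).format(HIVE);
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + converted + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String json = "{\"created_at\":\"Tue Dec 20 10:04:31 +0000 2016\",\"id\":1}";
        // prints {"created_at":"2016-12-20 10:04:31","id":1}
        System.out.println(rewrite(json));
    }
}
```

The rest of the JSON is left untouched; only the matched timestamp value changes.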
07:18 AM
3 Kudos
@Rajesh AJ Please follow the sample application to ingest the data into MongoDB.
07:40 AM
add jar /usr/hdp/current/hive-client/lib/commons-httpclient-3.0.1.jar
When I add the above jar it works fine, but it does not work when I try it for a new table. After I restart the Hive session, the jar works fine again. How can I permanently add that jar in Hive?
05:00 AM
2016-12-19 04:43:41,466 ERROR [main]: exec.Task ( - Failed to execute tez graph.
org.apache.tez.dag.api.TezException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1481976368061_0017 to YARN : Application application_1481976368061_0017 submitted by user root to unknown queue: agent
at org.apache.tez.client.TezClient.start(
at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(
at org.apache.hadoop.hive.ql.exec.Task.executeTask(
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
at org.apache.hadoop.hive.ql.Driver.launchTask(
at org.apache.hadoop.hive.ql.Driver.execute(
at org.apache.hadoop.hive.ql.Driver.runInternal(
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(
at org.apache.hadoop.hive.cli.CliDriver.processCmd(
at org.apache.hadoop.hive.cli.CliDriver.processLine(
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(
at org.apache.hadoop.hive.cli.CliDriver.main(
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(
at sun.reflect.DelegatingMethodAccessorImpl.invoke(
at java.lang.reflect.Method.invoke(
at org.apache.hadoop.util.RunJar.main(
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1481976368061_0017 to YARN : Application application_1481976368061_0017 submitted by user root to unknown queue: agent
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(
at org.apache.tez.client.TezYarnClient.submitApplication(
at org.apache.tez.client.TezClient.start(
... 22 more
2016-12-19 04:43:41,466 INFO [main]: hooks.ATSHook (<init>(90)) - Created ATS Hook
2016-12-19 04:43:41,466 INFO [main]: log.PerfLogger ( - <PERFLOG from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,470 INFO [main]: log.PerfLogger ( - </PERFLOG start=1482122621466 end=1482122621470 duration=4 from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,471 ERROR [main]: ql.Driver ( - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
2016-12-19 04:43:41,471 INFO [main]: ql.Driver ( - Resetting the caller context to
2016-12-19 04:43:41,471 INFO [main]: log.PerfLogger ( - </PERFLOG method=Driver.execute start=1482122620489 end=1482122621471 duration=982 from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,471 INFO [main]: log.PerfLogger ( - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,471 INFO [ATS Logger 0]: hooks.ATSHook ( - Received post-hook notification for :root_20161219044340_094281c1-c0f2-4cc9-83a5-962b067937f7
2016-12-19 04:43:41,488 INFO [main]: log.PerfLogger ( - </PERFLOG method=releaseLocks start=1482122621471 end=1482122621488 duration=17 from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,505 INFO [main]: log.PerfLogger ( - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2016-12-19 04:43:41,505 INFO [main]: log.PerfLogger ( - </PERFLOG method=releaseLocks start=1482122621505 end=1482122621505 duration=0 from=org.apache.hadoop.hive.ql.Driver>
07:54 AM
It resolved my problem, but whenever I use init 6 the session is
closed and I am unable to enable sandbox.service without rebooting my
05:55 AM
What is the difference between entering (elasticsearch, solr) in Terms to Filter On in the GetTwitter processor and following this article for creating a process group?