Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11227 | 04-15-2020 05:01 PM |
| | 7131 | 10-15-2019 08:12 PM |
| | 3114 | 10-12-2019 08:29 PM |
| | 11497 | 09-21-2019 10:04 AM |
| | 4343 | 09-19-2019 07:11 AM |
12-13-2018
01:19 PM
@Praveenesh Kumar It's possible with the YARN REST API (the MapReduce JobHistory server's jobs API). The response includes submitTime and startTime for each job; the difference between these two Unix timestamps (in milliseconds) is the wait time. API: GET http://<history server http address:port>/ws/v1/history/mapreduce/jobs Refer to this link for more details regarding the API documentation.
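As a quick, hedged sketch of computing the wait time from that endpoint (the host/port and the jq filter are assumptions; field names can vary slightly between Hadoop versions):
# List jobs from the JobHistory server and compute wait = startTime - submitTime (epoch milliseconds)
curl -s "http://historyserver:19888/ws/v1/history/mapreduce/jobs" \
  | jq '.jobs.job[] | {id: .id, waitMillis: (.startTime - .submitTime)}'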
12-13-2018
02:54 AM
@Lior Sela If you use the "MergeRecord" processor, then define the RecordReader/Writer controller services with an Avro schema that has 2 columns (i.e. col1, col2). Avro Schema: {
"type": "record",
"name": "path_sch",
"fields":
[
{ "name": "col1", "type": ["null","string"]},
{ "name": "col2", "type": ["null","string"]}
]
} I tried the same case as mentioned in the question and it worked as expected. Please find the attached template here: mergerecord-avro.xml
12-13-2018
12:51 AM
@Julio Gazeta Weird, I'm able to get the state value if I keep Store State set to "Store state locally". Regarding the GetMongo processor, the flowfile attributes issue was resolved in NiFi 1.8 (NIFI-5334 addresses this issue). As a workaround to get the required attribute, refer to this link.
12-12-2018
02:44 AM
@Julio Gazeta The reason you are not getting the state value is that in the UpdateAttribute processor you have set the Store State property to "Do not store state"; in that case the processor doesn't keep any state. To resolve this issue, change the UpdateAttribute processor's Store State property to
Store state locally Then auto-terminate (or connect) the "set state fail" relationship to get notified in case any failures happen. Run it again and check whether you are able to get the state value. - If the answer helped to resolve your issue, click on the Accept button below to accept the answer; that would be a great help to community users trying to find solutions quickly for these kinds of issues.
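For reference, once Store State is set to "Store state locally", UpdateAttribute exposes the getStateValue expression language function, so a stateful counter can be built like this (a sketch; the property name theCounter is an assumption):
theCounter ${getStateValue('theCounter'):plus(1)} //reads the previously stored value and adds 1 for each flowfile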
12-11-2018
12:33 AM
2 Kudos
@Saurav Ranjit
If your table is in text format then the table won't have any delete/update capabilities. The workaround for this case is as follows. If your table is partitioned:
1. Select the partition that you want to delete rows from and make sure no new data is being written into this partition.
2. Copy the specific partition's data into a temp table:
hive> create table <db_name>.<temp_table_name> as select * from <db_name>.<partition_table_name> where <partition_field_name>="<desired_partition_value>";
3. Overwrite the same partition, excluding the unwanted rows:
hive> insert overwrite table <db_name>.<partition_table_name> partition(<partition_field_name>) select * from <db_name>.<temp_table_name> where <field_name> not in (<values_to_exclude>);
4. Once you have made sure that the data is correct, drop the temp table:
hive> drop table <db_name>.<temp_table_name>;
These are the steps to follow for deleting specific rows from a non-transactional table; a concrete worked example follows below. In addition, if you have a non-partitioned table, you need to take a full dump of the existing (target) table into a temp table and overwrite the target table excluding the unwanted rows from the temp table; most importantly, make sure you are not writing any new data into the target table until this process is finished. - Hive does support selecting from and overwriting the same table at the same time, but any wrong query will lose the data completely, so it's better to use a temp table and drop it once you are sure the data is correct. Example:
insert overwrite table <db_name>.<partition_table_name> partition(<partition_field_name>) select * from <db_name>.<partition_table_name> where <field_name> not in (<values_to_exclude>);
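As that concrete illustration (a minimal sketch with hypothetical names: database sales_db, table orders partitioned by dt, deleting rows whose order_id is 101 or 102 from the 2018-12-01 partition):
hive> create table sales_db.orders_tmp as select * from sales_db.orders where dt='2018-12-01'; //copy the affected partition into a temp table
hive> set hive.exec.dynamic.partition.mode=nonstrict; //allow the dynamic-partition overwrite below
hive> insert overwrite table sales_db.orders partition(dt) select * from sales_db.orders_tmp where order_id not in (101,102); //rewrite the partition without the unwanted rows
hive> drop table sales_db.orders_tmp; //clean up after verifying the data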
12-01-2018
08:38 PM
@Hemanth Vakacharla I think for this case we need to split the records to one line each by using a SplitRecord/SplitText processor. Then, using a MergeContent processor, we can build 500 MB bundles; this way we are not going to split a record in the middle. Flow: 1.SplitRecord/SplitText //split the flowfile to 1 line each
2.MergeRecord/MergeContent //to get a 500 MB file size To force-merge flowfiles, use the Max Bin Age property (e.g. 30 mins); a configuration sketch follows below. In case you are using record-oriented processors, you need to define a Record Writer/Reader with an Avro schema to read/write the flowfile. Refer to this link for more details regarding the MergeContent processor.
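For reference, a hedged MergeContent configuration sketch for ~500 MB bundles (the property names exist in MergeContent; the values themselves are assumptions):
Merge Strategy Bin-Packing Algorithm
Minimum Group Size 500 MB //don't merge a bin until at least 500 MB is queued
Maximum Group Size 550 MB //upper bound so bins don't grow unbounded
Max Bin Age 30 min //force out a partial bin after 30 minutes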
12-01-2018
03:01 PM
@n c To define a variable in Hive we need to use hivevar; hiveconf is used to set Hive configuration properties. Please follow the steps below:
hive> set hivevar:id=1; //define id variable with value 1
hive> create view testview as select * from test1 where id = ${hivevar:id}; //create the view
hive> select * from testview; //select from the view
For more details regarding hivevar vs hiveconf, refer to this link.
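For contrast, a quick hedged sketch of the two namespaces side by side (hive.exec.parallel is a standard Hive configuration property; the variable name tbl is an assumption):
hive> set hiveconf:hive.exec.parallel=true; //hiveconf sets Hive/Hadoop configuration properties
hive> set hivevar:tbl=test1; //hivevar defines substitution variables for use inside queries
hive> select * from ${hivevar:tbl} limit 10; //the variable is substituted before the query runs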
11-27-2018
03:37 AM
@Julio Gazeta I don't think NiFi stores the reference once we clear the state in the processor. With only one file in your "d:\\tmp\\input" directory, clear all state in the ListFile processor, then: 1. start the processor once and then stop it, and 2. start the ListFile processor again; then you are going to list the file from the directory. - If the answer helped to resolve your issue, click on the Accept button below to accept the answer; that would be a great help to community users trying to find solutions quickly for these kinds of issues.
11-27-2018
03:25 AM
@Nawnath Hande You can use the split (or) regexp_extract Hive functions for this case. 1. regexp_extract function: hive> select trim(regexp_extract('string("Room no 601, Sayali Nivas , MG Road Delhi")', ',(.*?)(Nivas)', 1));
+---------+--+
| _c0 |
+---------+--+
| Sayali |
+---------+--+
2. split function: hive> select trim(split(split(string("Room no 601, Sayali Nivas , MG Road Delhi"),",")[1],"Nivas")[0]);
+---------+--+
| _c0 |
+---------+--+
| Sayali |
+---------+--+
- If the answer helped to resolve your issue, click on the Accept button below to accept the answer; that would be a great help to community users trying to find solutions quickly for these kinds of issues.
11-25-2018
07:14 PM
1 Kudo
@Sudhakar Reddy
Thanks for updating all the details regarding the flow :). Configure the ReplaceText processor as:
Search Value (?s)(^.*$)
Replacement Value Insert into test_schema.table1(topic, partition, offset, key, value) values ('${topic}','${partition}','${offset}','${key}','${'$1':replace("'","\"")}')
Character Set UTF-8
Maximum Buffer Size 10 MB //change this value as per your flowfile size
Replacement Strategy Regex Replace
Evaluation Mode Entire text
In the Replacement Value we replace all single quotes (') with double quotes (") in the captured group $1, so they don't break the generated SQL string; an example follows below.
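As an illustration (a hedged sketch; the attribute values and flowfile content are assumptions): with topic=t1, partition=0, offset=5, key=k1 and flowfile content it's raw, the processor would emit:
Insert into test_schema.table1(topic, partition, offset, key, value) values ('t1','0','5','k1','it"s raw')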