About Shu_ashu

Shu_ashu · ‎06-19-2019

@Veera Pavan This job will work fine in Hive, but in Spark follow these steps: write the data to a temporary table first, then select from the temporary table and insert overwrite into the final table. Check this similar thread regarding a similar case.
- If the answer is helpful to resolve the issue, log in and click the Accept button below to close this thread. This will help other community users find answers quickly 🙂
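The two steps above can be sketched as a small Python helper that issues the SQL statements in order. This is only an illustration under assumptions: the table names and the SELECT text are hypothetical, and `run_sql` stands for any callable that executes a SQL string (with PySpark that would typically be `spark.sql`). The workaround is presumably needed because Spark refuses to overwrite a table that the same statement is reading from.

```python
from typing import Callable, List


def overwrite_via_staging(
    run_sql: Callable[[str], object],
    final_table: str,
    staging_table: str,
    select_sql: str,
) -> List[str]:
    """Write the query result to a staging table, then overwrite the final table from it."""
    statements = [
        # Start from a clean staging table.
        f"DROP TABLE IF EXISTS {staging_table}",
        # Step 1: write the data to the temporary (staging) table.
        f"CREATE TABLE {staging_table} AS {select_sql}",
        # Step 2: select from the staging table and overwrite the final table.
        f"INSERT OVERWRITE TABLE {final_table} SELECT * FROM {staging_table}",
        # Clean up the staging table afterwards.
        f"DROP TABLE IF EXISTS {staging_table}",
    ]
    for statement in statements:
        run_sql(statement)
    return statements


# Demo with a recorder instead of a real Spark session (all names are hypothetical).
executed: List[str] = []
overwrite_via_staging(
    executed.append,
    final_table="db.final_table",
    staging_table="db.final_table_staging",
    select_sql="SELECT * FROM db.final_table WHERE id > 0",
)
for statement in executed:
    print(statement)
```

With a live session, passing `spark.sql` as `run_sql` would run the same four statements against the cluster.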

Shu_ashu · ‎06-19-2019

@Sampath Kumar The Hive timestamp type accepts the format yyyy-MM-dd HH:mm:ss[.SSS]
    hive> select timestamp("2019-06-15 15:43:12");
    2019-06-15 15:43:12
    hive> select timestamp("2019-06-15 15:43:12.988");
    2019-06-15 15:43:12.988
    hive> select timestamp("2019-06-15T15:43:12");
    NULL
If you want a timestamp type rather than text format tables, use the from_unixtime and unix_timestamp functions to remove the "T" from the data; then you can use the timestamp type with all table formats.
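To make the "T" removal concrete, here is a minimal Python sketch of the same parse-then-reformat idea that from_unixtime and unix_timestamp perform inside Hive. It is an illustration only, not Hive code; the function name `to_hive_timestamp` is hypothetical.

```python
from datetime import datetime

# Layout of the incoming data (with the "T") and the layout Hive's timestamp type accepts.
ISO_T_FORMAT = "%Y-%m-%dT%H:%M:%S"
HIVE_TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_hive_timestamp(value: str) -> str:
    """Parse a 'yyyy-MM-ddTHH:mm:ss' string and re-emit it as 'yyyy-MM-dd HH:mm:ss'."""
    parsed = datetime.strptime(value, ISO_T_FORMAT)
    return parsed.strftime(HIVE_TS_FORMAT)


print(to_hive_timestamp("2019-06-15T15:43:12"))  # prints 2019-06-15 15:43:12
```

In Hive itself the equivalent parse step would pass the pattern "yyyy-MM-dd'T'HH:mm:ss" to unix_timestamp and wrap the result in from_unixtime.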

Shu_ashu · ‎06-18-2019

@Carlo Bustos Could you confirm that DistributedMapCacheClientService is defined and enabled? Please refer to this and this link for defining the distributed map cache service.

Shu_ashu · ‎06-18-2019

@Carlos Try with below spec:
    [{
      "operation": "shift",
      "spec": {
        "*": "data.&",
        "ID": ["ID", "data.ID"]
      }
    }, {
      "operation": "default",
      "spec": {
        "dataset": "${dataset:toLower()}",
        "date": "${date}"
      }
    }]
Output:
    {
      "ID" : "123",
      "data" : {
        "ID" : "123",
        "Text1" : "aaa",
        "Text2" : "aaa",
        "Text3" : "aaa"
      },
      "date" : "${date}",
      "dataset" : "${dataset:toLower()}"
    }

Shu_ashu · ‎06-16-2019

@Sampath Kumar
    ALTER TABLE table SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss");
This works only for text format and CSV format tables. If you have a table in another format such as ORC, setting the serde properties is not going to work.
- Tested by creating a text format table:
Data:
    1,2019-06-15T15:43:12
    2,2019-06-15T15:43:19
    create table i(id int,ts timestamp) row format delimited fields terminated by ',' stored as textfile;
    ALTER TABLE i SET SERDEPROPERTIES ("timestamp.formats"="yyyy-MM-dd'T'HH:mm:ss");
    select * from i;
    1 2019-06-15 15:43:12
    2 2019-06-15 15:43:19
- In case we have an ORC file with the 2019-06-15T15:43:12 format, altering the serde properties still results in null for the timestamp field.

Shu_ashu · ‎06-15-2019

@Jayashree S ListS3 is a stateful processor: once the processor runs, it stores its state and then runs incrementally. If no new files have been added to the S3 directory, the processor won't list any files.
How to clear state: Stop the ListS3 processor, right-click on it, select View state, and clear the state that is saved in the processor. Then start the ListS3 processor; it will now list all the files in the S3 directory.

Shu_ashu · ‎06-13-2019

@Veerendra Nath Jasthi In UpdateAttribute, add a new attribute named ts with the value
    ${now():format("yyyy_MM_dd_HH_mm_ss_SSS")}
Example: the ts attribute will have a value such as 2019_06_12_20_42_26_762
Then in the PutHDFS processor, configure the directory as /<path>/${ts}
(or)
You can skip the UpdateAttribute processor and directly use the directory name /<path>/${now():format("yyyy_MM_dd_HH_mm_ss_SSS")} in the PutHDFS processor.
This will create a directory in HDFS with the current timestamp value. You can change the format of the timestamp using NiFi expression language.
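To preview what a given pattern produces before wiring it into the flow, the same layout can be reproduced in a few lines of Python. This is a sketch for illustration, not NiFi code; the function name `timestamp_dir_name` is hypothetical. Python's strftime has no millisecond directive, so the three-digit SSS part is derived from the microsecond field.

```python
from datetime import datetime


def timestamp_dir_name(moment: datetime) -> str:
    """Format a datetime like the NiFi pattern yyyy_MM_dd_HH_mm_ss_SSS."""
    millis = moment.microsecond // 1000  # SSS = milliseconds, zero-padded to 3 digits
    return moment.strftime("%Y_%m_%d_%H_%M_%S") + f"_{millis:03d}"


# Fixed instant matching the example value in the post.
print(timestamp_dir_name(datetime(2019, 6, 12, 20, 42, 26, 762000)))
# prints 2019_06_12_20_42_26_762
```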

Shu_ashu · ‎06-12-2019

@chauvin christophe You can use the NiFi REST API call /processors/{id}/state/clear-requests to clear the state stored in the processor. The REST API call can be made with an InvokeHTTP processor (or) a Groovy script, so it can be triggered from within NiFi.
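As a rough sketch of what that call looks like from outside NiFi, the Python below builds a POST request for the clear-requests endpoint. The base URL and processor id are placeholders (assumptions, not values from this thread), an unsecured instance is assumed, and the processor generally has to be stopped before its state can be cleared. A secured NiFi would additionally need authentication, which is not shown.

```python
import urllib.request


def build_clear_state_request(base_url: str, processor_id: str) -> urllib.request.Request:
    """Build a POST request for <base_url>/processors/{id}/state/clear-requests."""
    url = f"{base_url.rstrip('/')}/processors/{processor_id}/state/clear-requests"
    # An empty body is sent; the endpoint is addressed purely by the processor id.
    return urllib.request.Request(url, data=b"", method="POST")


def clear_processor_state(base_url: str, processor_id: str) -> int:
    """Send the request and return the HTTP status code (needs a reachable NiFi)."""
    request = build_clear_state_request(base_url, processor_id)
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical host and processor id, used only to show the resulting URL.
example = build_clear_state_request("http://localhost:8080/nifi-api", "PROCESSOR-ID")
print(example.get_method(), example.full_url)
```

The same URL and method are what an InvokeHTTP processor would be configured with when the call is triggered from a flow.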

Shu_ashu · ‎06-12-2019

@James Willson Try with below spec:
    [{
      "operation": "shift",
      "spec": {
        "rows": {
          "*": {
            "value": "[&1].date",
            "data": {
              "*": "[#3].data"
            }
          }
        }
      }
    }]
Gives output that you are looking for:
    [ {
      "date" : "00:00 2019-06-03",
      "data" : 120
    }, {
      "date" : "05:00 2019-06-08",
      "data" : 98
    }, {
      "date" : "23:00 2019-06-09",
      "data" : 172
    } ]

Shu_ashu · ‎06-11-2019

@vinay p Try the CountText and CalculateRecordStats processors; they will give what you are looking for. To get one summary flowfile from 3 flowfiles, first merge the 3 flowfiles into one by using the MergeRecord/MergeContent processor, then use the CountText and CalculateRecordStats processors to get the summary of the flowfile. Then use the PutEmail processor to send the flowfile as an attachment.

Member Since	‎06-08-2017 08:15 PM
Last Visited	‎04-04-2021 06:38 PM
Posts	1,049
Kudos received	516

Cloudera Community

Re: Get column values in comma separated value

Re: nifi Json data using routeonattributeto to spl...

Re: HIVE MANAGED TABLE

Re: CSV file with Duplicate Headers

Re: NIFI - SQL Server Lookup

Re: Insert overwrite with in the same table in sp...

Re: timestamp not supported in HIVE

Re: DetectDuplicate error "unable to communicate ...

Re: JoltTransformJSON - Extract attribute witouth ...

Re: timestamp not supported in HIVE

Re: No progress in nifi flow

Re: How do I create the folder name with current t...

Re: Nifi clear state

Re: Jolt - Nearly perfect, just need to remove squ...

Re: Generate a summary from multiple flow files