Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11986 | 04-15-2020 05:01 PM |
| | 7952 | 10-15-2019 08:12 PM |
| | 3593 | 10-12-2019 08:29 PM |
| | 12979 | 09-21-2019 10:04 AM |
| | 4845 | 09-19-2019 07:11 AM |
05-24-2019
04:36 AM
@Yasuhiro Shindo A container-killed exit code is most often due to memory overhead. If you haven't specified spark.yarn.driver.memoryOverhead or spark.yarn.executor.memoryOverhead in your spark-submit, then add these params; if you have already specified them, increase the configured values. Please refer to this link to decide the overhead value.
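For concreteness, a sketch of what that spark-submit might look like. The overhead values below are illustrative assumptions, not recommendations; a common starting point is roughly 10% of the corresponding driver/executor memory:

```shell
# Illustrative values only -- tune to your own workload and cluster
spark-submit \
  --driver-memory 8g \
  --executor-memory 16g \
  --conf spark.yarn.driver.memoryOverhead=1024 \
  --conf spark.yarn.executor.memoryOverhead=2048 \
  ...
```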
05-24-2019
03:49 AM
1 Kudo
@Alvarez Rafa The QueryDatabaseTable processor stores state the first time it runs, based on the Maximum-value Column (idMovil). On each subsequent run the processor pulls only the changes from the table based on the idMovil column. You can check the last state value by right-clicking the processor -> View State. To clear the state, stop the processor, then right-click it -> View State and clear the state. Once the state is cleared, the processor pulls all records from the table again. --- If this isn't your issue, please attach a screenshot of your flow and the scheduling configured on the QueryDatabaseTable processor.
05-08-2019
02:39 AM
@HanYan Tan Could you look into ZEPPELIN-4140? That JIRA is reported for the same issue. Check the comments on the JIRA for more details; as mentioned there, with the JDBC interpreter in "Per User" Scoped or Isolated binding mode, the temporary table is dropped. Try the above setting and check whether the issue is resolved or not 🙂
05-02-2019
12:27 AM
1 Kudo
@Raj Negi Use the NiFi Expression Language replace function to replace "[ and ]" with [ and ]: ${'$1':unescapeJson():replace('"[','['):replace(']"',']')}
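To illustrate what that expression does, here is a rough Python stand-in (not the actual NiFi implementation; the simple backslash replacement below only approximates unescapeJson() for this case):

```python
def fix_escaped_array(text: str) -> str:
    # Rough stand-in for ${'$1':unescapeJson():replace('"[','['):replace(']"',']')}
    unescaped = text.replace('\\"', '"')  # approximates unescapeJson()
    # Strip the stray quotes wrapping the JSON array
    return unescaped.replace('"[', '[').replace(']"', ']')

raw = '{"data":"[{\\"a\\":1}]"}'   # JSON array escaped and stored as a string
print(fix_escaped_array(raw))      # {"data":[{"a":1}]}
```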
05-01-2019
12:06 AM
1 Kudo
@Raj Negi After the FetchHBaseRow processor, use a ReplaceText processor with the configs below:
- Search Value: (?s)(^.*$)
- Replacement Value: ${'$1':unescapeJson()} (capture all the data and apply the NiFi Expression Language unescapeJson function)
- Character Set: UTF-8
- Maximum Buffer Size: 1 MB (change as per your flowfile size)
- Replacement Strategy: Regex Replace
- Evaluation Mode: Entire text

Flow:
1. FetchHBaseRow
2. ReplaceText
3. ...other processors
04-09-2019
03:16 AM
1 Kudo
@Kevin Lahey Not sure whether you are using a NiFi cluster or not, but could you try running the ListS3 processor on the Primary Node only? Per the documentation, this processor is intended to run only on the primary node.
03-22-2019
01:09 AM
@sri chaturvedi Instead of using the UpdateAttribute processor's state, use a DistributedMapCache so you can fetch the stored value across the cluster. Use a PutDistributedMapCache processor to store the most recently assigned value, then use a FetchDistributedMapCache processor to fetch the stored value, apply your logic (increment, etc.) to assign the new value, and overwrite the already stored value in the DistributedMapCache using the PutDistributedMapCache processor. Use this and this links as references for configuring the DistributedMapCache processors/controller services.
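To make the fetch-increment-overwrite pattern concrete, here is a minimal in-memory sketch. The MapCache class is a hypothetical stand-in for the DistributedMapCache server, and its method names only loosely mirror the Put/Fetch processors:

```python
class MapCache:
    """Hypothetical in-memory stand-in for NiFi's DistributedMapCache server."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        # Analogous to PutDistributedMapCache: store/overwrite a value.
        self._store[key] = value

    def fetch(self, key, default=None):
        # Analogous to FetchDistributedMapCache: read the stored value.
        return self._store.get(key, default)


cache = MapCache()
cache.put("last_assigned_id", 100)            # value assigned most recently

current = cache.fetch("last_assigned_id", 0)  # fetch the stored value
new_value = current + 1                       # apply your logic (increment, etc.)
cache.put("last_assigned_id", new_value)      # overwrite the stored value

print(cache.fetch("last_assigned_id"))  # 101
```

In the real flow the increment would be done with an UpdateAttribute (or similar) processor between the Fetch and Put steps, and the cache lives in a DistributedMapCacheServer controller service.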
01-23-2019
01:52 AM
@john y We can directly access the file size with ${fileSize}; this attribute expression will return the actual file size of the flowfile.
01-18-2019
02:43 AM
@Manish Parab
Sure. In NiFi, processors that trigger the flow (scheduled to run via cron) should run on the primary node only; running them on all nodes means triggering the same processor n times, once on each node. Each NiFi node works only with the data it receives, so the GetMongo processor (which triggers the flow in this case), when running on all nodes, will pull the same data on every node. - Run GetMongo (the source processor) on the primary node only, then distribute the load across the cluster using Remote Process Groups or connection load balancing.
01-18-2019
12:32 AM
@Manish Parab Your GetMongo processor is running on All Nodes, which means the same data is pulled on every node. If you set the EvaluateJsonPath processor to run on the Primary Node only, the flowfiles on all the other nodes will be left queued before the EvaluateJsonPath processor, because you are not processing the flowfiles pulled on any node other than the primary node. Run the GetMongo processor only on the primary node and keep the EvaluateJsonPath processor running on all nodes. The reason to keep EvaluateJsonPath on all nodes: if the NiFi primary node changes, EvaluateJsonPath would otherwise not process the flowfiles still queued on the old primary node.