
DetectDuplicate processor not eliminating duplicates


Hi,

This is a continuation of the conversation from a different question on removing duplicates from FetchSFTP. I wanted to document it as a separate question to keep things cleaner.

 

We are using the DetectDuplicate processor to eliminate duplicates, but it does not work as expected when FlowFiles with two different filenames pass through the flow. Is this expected behaviour? How should we handle this scenario so that we get a unique list of files when multiple files arrive over a period of time?

The first GenerateFlowFile sets filename to file1.txt and the second GenerateFlowFile sets filename to file2.txt.

 

[Screenshots: Capture.PNG, Capture2.PNG]

 

When I disable one of the GenerateFlowFile processors, duplicates are filtered as expected. But when both GenerateFlowFile processors are running, we get duplicates.
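
To make the expected behaviour concrete, here is a rough sketch in Python of what we assume should happen. It is only a simplified model, not NiFi's actual implementation: `DedupCache`, `is_duplicate` and `simulate` are hypothetical names made up for illustration. The 60 sec age-off and the 10 sec schedule are taken from the DetectDuplicate and GenerateFlowFile settings in the template, and the cache key stands in for the hashed filename.

```python
class DedupCache:
    """Toy stand-in for a keyed duplicate cache with age-off (hypothetical, not a NiFi API)."""

    def __init__(self, age_off_seconds):
        self.age_off_seconds = age_off_seconds
        self.first_seen = {}  # cache key -> time the entry was stored

    def is_duplicate(self, key, now):
        stored_at = self.first_seen.get(key)
        if stored_at is not None and now < stored_at + self.age_off_seconds:
            return True  # key is cached and has not aged off yet
        # First sighting, or the old entry aged off: store it and let it through.
        self.first_seen[key] = now
        return False


def simulate(filenames, period=10, duration=120, age_off=60):
    """Emit every filename once per period and return the (time, filename) pairs that pass."""
    cache = DedupCache(age_off)
    passed = []
    for now in range(0, duration, period):
        for name in filenames:
            if not cache.is_duplicate(name, now):
                passed.append((now, name))
    return passed


if __name__ == "__main__":
    for now, name in simulate(["file1.txt", "file2.txt"]):
        print(now, name)
```

Under this model each filename is tracked independently, so with both generators running we would expect each file to reach non-duplicate once per age-off window and to be routed to duplicate the rest of the time.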

 

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<template encoding-version="1.2">
    <description></description>
    <groupId>062a354e-e215-1830-b3b6-9829ae4b581e</groupId>
    <name>DetectDuplicate Template</name>
    <snippet>
        <controllerServices>
            <id>230506fa-0b7f-3532-0000-000000000000</id>
            <parentGroupId>59d8b9c3-c35d-322b-0000-000000000000</parentGroupId>
            <bundle>
                <artifact>nifi-distributed-cache-services-nar</artifact>
                <group>org.apache.nifi</group>
                <version>1.8.0.3.3.1.0-10</version>
            </bundle>
            <comments></comments>
            <descriptors>
                <entry>
                    <key>Server Hostname</key>
                    <value>
                        <name>Server Hostname</name>
                    </value>
                </entry>
                <entry>
                    <key>Server Port</key>
                    <value>
                        <name>Server Port</name>
                    </value>
                </entry>
                <entry>
                    <key>SSL Context Service</key>
                    <value>
                        <identifiesControllerService>org.apache.nifi.ssl.SSLContextService</identifiesControllerService>
                        <name>SSL Context Service</name>
                    </value>
                </entry>
                <entry>
                    <key>Communications Timeout</key>
                    <value>
                        <name>Communications Timeout</name>
                    </value>
                </entry>
            </descriptors>
            <name>DistributedMapCacheClientService</name>
            <persistsState>false</persistsState>
            <properties>
                <entry>
                    <key>Server Hostname</key>
                    <value>localhost</value>
                </entry>
                <entry>
                    <key>Server Port</key>
                </entry>
                <entry>
                    <key>SSL Context Service</key>
                </entry>
                <entry>
                    <key>Communications Timeout</key>
                </entry>
            </properties>
            <state>ENABLED</state>
            <type>org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService</type>
        </controllerServices>
        <processGroups>
            <id>db622404-a657-39a3-0000-000000000000</id>
            <parentGroupId>59d8b9c3-c35d-322b-0000-000000000000</parentGroupId>
            <position>
                <x>0.0</x>
                <y>0.0</y>
            </position>
            <comments></comments>
            <contents>
                <connections>
                    <id>0f6643d6-5956-3618-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <bends>
                        <x>2024.3634542998968</x>
                        <y>-56.45883764826101</y>
                    </bends>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>57cabd60-1ee1-3721-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <selectedRelationships>duplicate</selectedRelationships>
                    <selectedRelationships>failure</selectedRelationships>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>80bd1f2b-80df-33f6-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <connections>
                    <id>107d2610-9be6-3c41-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>515042a8-aca9-3c80-0000-000000000000</id>
                        <type>OUTPUT_PORT</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <selectedRelationships>non-duplicate</selectedRelationships>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>80bd1f2b-80df-33f6-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <connections>
                    <id>c69a292d-eaf7-3374-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>3625ea5e-73ac-3244-0000-000000000000</id>
                        <type>FUNNEL</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <selectedRelationships>success</selectedRelationships>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>057595e3-d1bc-3eb0-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <connections>
                    <id>e3551aa2-3058-3360-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>3625ea5e-73ac-3244-0000-000000000000</id>
                        <type>FUNNEL</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <selectedRelationships>success</selectedRelationships>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>e2956645-4a34-38dc-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <connections>
                    <id>ecc2450b-0b8d-3f97-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>4ba95e34-ed3b-3cd1-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>3625ea5e-73ac-3244-0000-000000000000</id>
                        <type>FUNNEL</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <connections>
                    <id>ed46be4e-6a40-3912-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
                    <backPressureObjectThreshold>10000</backPressureObjectThreshold>
                    <destination>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>80bd1f2b-80df-33f6-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </destination>
                    <flowFileExpiration>0 sec</flowFileExpiration>
                    <labelIndex>1</labelIndex>
                    <loadBalanceCompression>DO_NOT_COMPRESS</loadBalanceCompression>
                    <loadBalancePartitionAttribute></loadBalancePartitionAttribute>
                    <loadBalanceStatus>LOAD_BALANCE_NOT_CONFIGURED</loadBalanceStatus>
                    <loadBalanceStrategy>DO_NOT_LOAD_BALANCE</loadBalanceStrategy>
                    <name></name>
                    <selectedRelationships>success</selectedRelationships>
                    <source>
                        <groupId>db622404-a657-39a3-0000-000000000000</groupId>
                        <id>4ba95e34-ed3b-3cd1-0000-000000000000</id>
                        <type>PROCESSOR</type>
                    </source>
                    <zIndex>0</zIndex>
                </connections>
                <funnels>
                    <id>3625ea5e-73ac-3244-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1450.0685993426005</x>
                        <y>-201.13133307565113</y>
                    </position>
                </funnels>
                <outputPorts>
                    <id>515042a8-aca9-3c80-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1516.7028766863505</x>
                        <y>91.16929558645825</y>
                    </position>
                    <comments></comments>
                    <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                    <name>test data output</name>
                    <state>DISABLED</state>
                    <type>OUTPUT_PORT</type>
                </outputPorts>
                <processors>
                    <id>057595e3-d1bc-3eb0-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1572.582782985138</x>
                        <y>-454.86057325764887</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>File Size</key>
<value>
    <name>File Size</name>
</value>
                            </entry>
                            <entry>
<key>Batch Size</key>
<value>
    <name>Batch Size</name>
</value>
                            </entry>
                            <entry>
<key>Data Format</key>
<value>
    <name>Data Format</name>
</value>
                            </entry>
                            <entry>
<key>Unique FlowFiles</key>
<value>
    <name>Unique FlowFiles</name>
</value>
                            </entry>
                            <entry>
<key>generate-ff-custom-text</key>
<value>
    <name>generate-ff-custom-text</name>
</value>
                            </entry>
                            <entry>
<key>character-set</key>
<value>
    <name>character-set</name>
</value>
                            </entry>
                            <entry>
<key>file.lastModifiedTime</key>
<value>
    <name>file.lastModifiedTime</name>
</value>
                            </entry>
                            <entry>
<key>file.path</key>
<value>
    <name>file.path</name>
</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>
    <name>filename</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>PRIMARY</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>File Size</key>
<value>0B</value>
                            </entry>
                            <entry>
<key>Batch Size</key>
<value>1</value>
                            </entry>
                            <entry>
<key>Data Format</key>
<value>Text</value>
                            </entry>
                            <entry>
<key>Unique FlowFiles</key>
<value>false</value>
                            </entry>
                            <entry>
<key>generate-ff-custom-text</key>
                            </entry>
                            <entry>
<key>character-set</key>
<value>UTF-8</value>
                            </entry>
                            <entry>
<key>file.lastModifiedTime</key>
<value>${now()}</value>
                            </entry>
                            <entry>
<key>file.path</key>
<value>.</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>file2.dat</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>10 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>5 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>GenerateFlowFile</name>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>STOPPED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.GenerateFlowFile</type>
                </processors>
                <processors>
                    <id>1fac344c-02c6-3973-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>944.5213489173134</x>
                        <y>-89.34061421542867</y>
                    </position>
                    <bundle>
                        <artifact>nifi-scripting-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>Script Engine</key>
<value>
    <name>Script Engine</name>
</value>
                            </entry>
                            <entry>
<key>Script File</key>
<value>
    <name>Script File</name>
</value>
                            </entry>
                            <entry>
<key>Script Body</key>
<value>
    <name>Script Body</name>
</value>
                            </entry>
                            <entry>
<key>Module Directory</key>
<value>
    <name>Module Directory</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>ALL</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>1 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>Script Engine</key>
<value>Groovy</value>
                            </entry>
                            <entry>
<key>Script File</key>
                            </entry>
                            <entry>
<key>Script Body</key>
<value>FlowFile flowFile = session.get()

if(flowFile == null) {
    return;
}
session.transfer(session.penalize(flowFile),REL_SUCCESS)</value>
                            </entry>
                            <entry>
<key>Module Directory</key>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>0 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>ExecuteScript</name>
                    <relationships>
                        <autoTerminate>true</autoTerminate>
                        <name>failure</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>DISABLED</state>
                    <style/>
                    <type>org.apache.nifi.processors.script.ExecuteScript</type>
                </processors>
                <processors>
                    <id>4ba95e34-ed3b-3cd1-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1369.7560827382913</x>
                        <y>-113.31758254707097</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>Hash Value Attribute Key</key>
<value>
    <name>Hash Value Attribute Key</name>
</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>
    <name>filename</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>ALL</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>Hash Value Attribute Key</key>
<value>filenameHash</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>(?s)(^.*$)</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>0 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>HashAttribute</name>
                    <relationships>
                        <autoTerminate>true</autoTerminate>
                        <name>failure</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>STOPPED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.HashAttribute</type>
                </processors>
                <processors>
                    <id>57cabd60-1ee1-3721-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1918.4774979184494</x>
                        <y>22.21475904166789</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>Log Level</key>
<value>
    <name>Log Level</name>
</value>
                            </entry>
                            <entry>
<key>Log Payload</key>
<value>
    <name>Log Payload</name>
</value>
                            </entry>
                            <entry>
<key>Attributes to Log</key>
<value>
    <name>Attributes to Log</name>
</value>
                            </entry>
                            <entry>
<key>attributes-to-log-regex</key>
<value>
    <name>attributes-to-log-regex</name>
</value>
                            </entry>
                            <entry>
<key>Attributes to Ignore</key>
<value>
    <name>Attributes to Ignore</name>
</value>
                            </entry>
                            <entry>
<key>attributes-to-ignore-regex</key>
<value>
    <name>attributes-to-ignore-regex</name>
</value>
                            </entry>
                            <entry>
<key>Log prefix</key>
<value>
    <name>Log prefix</name>
</value>
                            </entry>
                            <entry>
<key>character-set</key>
<value>
    <name>character-set</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>ALL</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>Log Level</key>
<value>info</value>
                            </entry>
                            <entry>
<key>Log Payload</key>
<value>false</value>
                            </entry>
                            <entry>
<key>Attributes to Log</key>
                            </entry>
                            <entry>
<key>attributes-to-log-regex</key>
<value>.*</value>
                            </entry>
                            <entry>
<key>Attributes to Ignore</key>
                            </entry>
                            <entry>
<key>attributes-to-ignore-regex</key>
                            </entry>
                            <entry>
<key>Log prefix</key>
                            </entry>
                            <entry>
<key>character-set</key>
<value>UTF-8</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>0 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>LogAttribute</name>
                    <relationships>
                        <autoTerminate>true</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>DISABLED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.LogAttribute</type>
                </processors>
                <processors>
                    <id>80bd1f2b-80df-33f6-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1878.1062222047817</x>
                        <y>-263.90041753635575</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>Cache Entry Identifier</key>
<value>
    <name>Cache Entry Identifier</name>
</value>
                            </entry>
                            <entry>
<key>FlowFile Description</key>
<value>
    <name>FlowFile Description</name>
</value>
                            </entry>
                            <entry>
<key>Age Off Duration</key>
<value>
    <name>Age Off Duration</name>
</value>
                            </entry>
                            <entry>
<key>Distributed Cache Service</key>
<value>
    <identifiesControllerService>org.apache.nifi.distributed.cache.client.DistributedMapCacheClient</identifiesControllerService>
    <name>Distributed Cache Service</name>
</value>
                            </entry>
                            <entry>
<key>Cache The Entry Identifier</key>
<value>
    <name>Cache The Entry Identifier</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>ALL</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>Cache Entry Identifier</key>
<value>${filenameHash}</value>
                            </entry>
                            <entry>
<key>FlowFile Description</key>
<value>duplicateFileName</value>
                            </entry>
                            <entry>
<key>Age Off Duration</key>
<value>60 sec</value>
                            </entry>
                            <entry>
<key>Distributed Cache Service</key>
<value>230506fa-0b7f-3532-0000-000000000000</value>
                            </entry>
                            <entry>
<key>Cache The Entry Identifier</key>
<value>true</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>0 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>DetectDuplicate</name>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>duplicate</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>failure</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>non-duplicate</name>
                    </relationships>
                    <state>STOPPED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.DetectDuplicate</type>
                </processors>
                <processors>
                    <id>915dc8f3-bb4f-3485-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>2027.576981312814</x>
                        <y>-457.38524117496</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>Cache Entry Identifier</key>
<value>
    <name>Cache Entry Identifier</name>
</value>
                            </entry>
                            <entry>
<key>Distributed Cache Service</key>
<value>
    <identifiesControllerService>org.apache.nifi.distributed.cache.client.DistributedMapCacheClient</identifiesControllerService>
    <name>Distributed Cache Service</name>
</value>
                            </entry>
                            <entry>
<key>Put Cache Value In Attribute</key>
<value>
    <name>Put Cache Value In Attribute</name>
</value>
                            </entry>
                            <entry>
<key>Max Length To Put In Attribute</key>
<value>
    <name>Max Length To Put In Attribute</name>
</value>
                            </entry>
                            <entry>
<key>Character Set</key>
<value>
    <name>Character Set</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>ALL</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>Cache Entry Identifier</key>
<value>${filename}</value>
                            </entry>
                            <entry>
<key>Distributed Cache Service</key>
<value>230506fa-0b7f-3532-0000-000000000000</value>
                            </entry>
                            <entry>
<key>Put Cache Value In Attribute</key>
<value>filenameCache</value>
                            </entry>
                            <entry>
<key>Max Length To Put In Attribute</key>
<value>256</value>
                            </entry>
                            <entry>
<key>Character Set</key>
<value>UTF-8</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>0 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>FetchDistributedMapCache</name>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>failure</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>not-found</name>
                    </relationships>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>STOPPED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.FetchDistributedMapCache</type>
                </processors>
                <processors>
                    <id>e2956645-4a34-38dc-0000-000000000000</id>
                    <parentGroupId>db622404-a657-39a3-0000-000000000000</parentGroupId>
                    <position>
                        <x>1002.0536868371707</x>
                        <y>-455.5365483185842</y>
                    </position>
                    <bundle>
                        <artifact>nifi-standard-nar</artifact>
                        <group>org.apache.nifi</group>
                        <version>1.8.0.3.3.1.0-10</version>
                    </bundle>
                    <config>
                        <bulletinLevel>WARN</bulletinLevel>
                        <comments></comments>
                        <concurrentlySchedulableTaskCount>1</concurrentlySchedulableTaskCount>
                        <descriptors>
                            <entry>
<key>File Size</key>
<value>
    <name>File Size</name>
</value>
                            </entry>
                            <entry>
<key>Batch Size</key>
<value>
    <name>Batch Size</name>
</value>
                            </entry>
                            <entry>
<key>Data Format</key>
<value>
    <name>Data Format</name>
</value>
                            </entry>
                            <entry>
<key>Unique FlowFiles</key>
<value>
    <name>Unique FlowFiles</name>
</value>
                            </entry>
                            <entry>
<key>generate-ff-custom-text</key>
<value>
    <name>generate-ff-custom-text</name>
</value>
                            </entry>
                            <entry>
<key>character-set</key>
<value>
    <name>character-set</name>
</value>
                            </entry>
                            <entry>
<key>file.lastModifiedTime</key>
<value>
    <name>file.lastModifiedTime</name>
</value>
                            </entry>
                            <entry>
<key>file.path</key>
<value>
    <name>file.path</name>
</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>
    <name>filename</name>
</value>
                            </entry>
                        </descriptors>
                        <executionNode>PRIMARY</executionNode>
                        <lossTolerant>false</lossTolerant>
                        <penaltyDuration>30 sec</penaltyDuration>
                        <properties>
                            <entry>
<key>File Size</key>
<value>0B</value>
                            </entry>
                            <entry>
<key>Batch Size</key>
<value>1</value>
                            </entry>
                            <entry>
<key>Data Format</key>
<value>Text</value>
                            </entry>
                            <entry>
<key>Unique FlowFiles</key>
<value>false</value>
                            </entry>
                            <entry>
<key>generate-ff-custom-text</key>
                            </entry>
                            <entry>
<key>character-set</key>
<value>UTF-8</value>
                            </entry>
                            <entry>
<key>file.lastModifiedTime</key>
<value>${now()}</value>
                            </entry>
                            <entry>
<key>file.path</key>
<value>.</value>
                            </entry>
                            <entry>
<key>filename</key>
<value>file1.dat</value>
                            </entry>
                        </properties>
                        <runDurationMillis>0</runDurationMillis>
                        <schedulingPeriod>10 sec</schedulingPeriod>
                        <schedulingStrategy>TIMER_DRIVEN</schedulingStrategy>
                        <yieldDuration>1 sec</yieldDuration>
                    </config>
                    <executionNodeRestricted>false</executionNodeRestricted>
                    <name>GenerateFlowFile</name>
                    <relationships>
                        <autoTerminate>false</autoTerminate>
                        <name>success</name>
                    </relationships>
                    <state>STOPPED</state>
                    <style/>
                    <type>org.apache.nifi.processors.standard.GenerateFlowFile</type>
                </processors>
            </contents>
            <name>Generate Data</name>
            <variables/>
        </processGroups>
    </snippet>
    <timestamp>11/19/2019 16:40:37 UTC</timestamp>
</template>

Could you please help?

Thank you
