<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Question about multiple columns lookup and chain of lookup in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382810#M244697</link>
    <description>&lt;P&gt;Dear&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I also have the same concern with&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/108730"&gt;@BigJames&lt;/a&gt;&amp;nbsp;, &lt;SPAN&gt;Is concatenation the most logical approach in this context? Additionally, how should we handle NULL values? I assume that concatenating NULL values may not be feasible.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thank you so much in advance!&lt;/P&gt;</description>
    <pubDate>Mon, 29 Jan 2024 03:28:44 GMT</pubDate>
    <dc:creator>jarviszzzz</dc:creator>
    <dc:date>2024-01-29T03:28:44Z</dc:date>
    <item>
      <title>Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382727#M244665</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;&lt;P&gt;A bit background about this:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I receive fresh data from a data vendor on a weekly basis and aim to either insert or update my existing database. If the lookup identifies matching values (such as company ID, fiscal year, etc.) in a record, we perform an update on specific fields (e.g., announcement date); otherwise, we insert a&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;new record.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;My chain of lookups:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_0-1706165136495.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39571i30626CEEAFD7791E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_0-1706165136495.png" alt="jarviszzzz_0-1706165136495.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;My lookup processor setup:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_5-1706165054644.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39567i7133CA3CBF275A52/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_5-1706165054644.png" alt="jarviszzzz_5-1706165054644.png" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_6-1706165066759.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39568iA183BD82F0DE6DBB/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_6-1706165066759.png" alt="jarviszzzz_6-1706165066759.png" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_7-1706165075830.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39569iCCEC7C1BC79E14CA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_7-1706165075830.png" alt="jarviszzzz_7-1706165075830.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_1-1706165176755.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39572i4E81DBC25079402F/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_1-1706165176755.png" alt="jarviszzzz_1-1706165176755.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;My data:&lt;/P&gt;&lt;P&gt;Before lookup:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_1-1706164544571.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39562i534C8B55E3A1D323/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_1-1706164544571.png" alt="jarviszzzz_1-1706164544571.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;After first lookup (Company ID) (It return me fs_ID):&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_2-1706164571166.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39563i095BB5E8B127951A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_2-1706164571166.png" alt="jarviszzzz_2-1706164571166.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;After second lookup (Fiscal_Year)&amp;nbsp;(It return me another fs_ID):&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_3-1706164616218.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39564i007EE0BC1015AEBC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_3-1706164616218.png" alt="jarviszzzz_3-1706164616218.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Note: fs_ID is retrived value from lookup tables&lt;/P&gt;&lt;P&gt;However, I check my lookup table:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jarviszzzz_4-1706164700257.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/39565i15DF9D175316F4FF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jarviszzzz_4-1706164700257.png" alt="jarviszzzz_4-1706164700257.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;It does not match the "Company_ID", it only match the "Fiscal_year", It seems like those lookup processor worked independently. I expected this output: myincomingdata.Company_ID = ent_fundamentals.Company_ID AND myincomingdata.Fiscal_year =&amp;nbsp;ent_fundamentals.Fiscal_year..etc., (I can update this record base on those columns if they all match)&lt;/P&gt;&lt;P&gt;Can anyone please help me this issue or is there any other way in doing lookup inNIFI?&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jan 2024 06:46:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382727#M244665</guid>
      <dc:creator>jarviszzzz</dc:creator>
      <dc:date>2024-01-25T06:46:27Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382745#M244676</link>
      <description>&lt;P&gt;Dear &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32119"&gt;@VidyaSargur&lt;/a&gt;, does anyone can help this issue? Thank you so much, much appreciated!!&lt;/P&gt;</description>
      <pubDate>Fri, 26 Jan 2024 02:26:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382745#M244676</guid>
      <dc:creator>jarviszzzz</dc:creator>
      <dc:date>2024-01-26T02:26:30Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382757#M244685</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/108694"&gt;@jarviszzzz&lt;/a&gt; ,&lt;/P&gt;&lt;P&gt;The lookupRecord processor is not suppose to work in a chain&amp;nbsp; where the next lookup will filter based on the result of the previous one. Each LookupRecord is independent call to the database and it will return a result from the whole dataset based on the specified key. If you want to lookup record by multiple keys then you probably need to create new column&amp;nbsp; that join all those key columns values together and use that in your lookupRecord, For example I would create a column called CompanyID_FiscalYear to join both values and use this column as my lookup key. In the dynamic property key of the LookupRecord you can use the concat function to concatenate values from different path:&lt;/P&gt;&lt;P&gt;&lt;A href="https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html#concat" target="_blank"&gt;https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html#concat&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If that helps please &lt;STRONG&gt;accept&lt;/STRONG&gt; solution.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 26 Jan 2024 14:43:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382757#M244685</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2024-01-26T14:43:17Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382792#M244693</link>
      <description>&lt;P&gt;I also had the same issue here and I’m new to Apache NiFi. Just curious does this way of concatenation a well accepted method for multi-column lookup in NiFi community? I’m asking because because it does not sounds efficient from database perspective where you use more storage space. And then you need to index this concatenated column for lookup speed.&lt;/P&gt;&lt;P&gt;Another consideration is that should we use stored procedure in database (MySQL for example) to do this kind of multi-column lookup? In this way there is no additional column created. But in this way I don’t know how to send these results from stored procedure back to my NiFi flow for subsequent processing.&lt;/P&gt;&lt;P&gt;Thanks and looking forward to your reply!&lt;/P&gt;</description>
      <pubDate>Sun, 28 Jan 2024 08:21:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382792#M244693</guid>
      <dc:creator>BigJames</dc:creator>
      <dc:date>2024-01-28T08:21:06Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382810#M244697</link>
      <description>&lt;P&gt;Dear&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;I also have the same concern with&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/108730"&gt;@BigJames&lt;/a&gt;&amp;nbsp;, &lt;SPAN&gt;Is concatenation the most logical approach in this context? Additionally, how should we handle NULL values? I assume that concatenating NULL values may not be feasible.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Thank you so much in advance!&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jan 2024 03:28:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382810#M244697</guid>
      <dc:creator>jarviszzzz</dc:creator>
      <dc:date>2024-01-29T03:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382831#M244712</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/108694"&gt;@jarviszzzz&lt;/a&gt;, &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/108730"&gt;@BigJames&lt;/a&gt; ,&lt;/P&gt;&lt;P&gt;I understand your concern specially when you have too many columns to check against and large amount of records where indexing can make big difference in response time. It seems that the LookupRecord is intended for simpler cases where you are checking against one key that is not&amp;nbsp; nullable. For your case I would suggest two options:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1- You can use ExecuteSQL\ExecuteSQLRecord instead of LookupRecord. To use this option you need to extract all data used to filter, update or insert into flowfile attributes since the content will change after ExecuteSQL. Using this processor&amp;nbsp; you can specify the select statement ( in the SQL Select Query property ) with where clause that checks against all fields. The select property allows Expression Language which means you can use flowfile attributes to populate values dynamically. Using SQL and EL also you can handle null values accordingly. Now to check if there is a match or not , you can use simple RouteOnAttribute processor with one condition (dynamic property) :&lt;/P&gt;&lt;P&gt;IsRecordFound: ${fileSize:gt(0)}&lt;/P&gt;&lt;P&gt;the "fileSize" is a system attribute that gives you the size of the flowfile content, If it's 0 then there is no content return which mean no match , otherwise there is a match.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2- Defer checking for match (update) and no match (insert) to SQL . I like this approach because you are doing one trip to sql instead of two. This approach also requires you to extract all data into flowfile attributes. Basically you create stored proc where you pass record data as parameters ( or json if your sql can parse json). You can run this stored proc using PutSQL processor. The SQL statement property allows for Expression Language which mean you can pass stored proc parameter values dynamically. Inside the stored proc you can check if the record exist or not and do the proper action. One thing you have to be careful with using this approach is using asynchronous calls to the store proc with multi threading or on a cluster as this might end up in sql in sql deadlock exception which you can retry within nifi or the stored proc itself.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm looping other experts&amp;nbsp; like &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35454"&gt;@MattWho&lt;/a&gt; , &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/103151"&gt;@cotopaul&lt;/a&gt;&amp;nbsp; to see if they can provide other input as well.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 29 Jan 2024 16:11:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/382831#M244712</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2024-01-29T16:11:30Z</dc:date>
    </item>
    <item>
      <title>Re: Question about multiple columns lookup and chain of lookup</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/383848#M245105</link>
      <description>&lt;P&gt;Dear&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Thank you so much for helping, we tried different ways, and we decided to use the solution that you mention above by concatenating those columns.&lt;/P&gt;</description>
      <pubDate>Mon, 26 Feb 2024 03:13:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Question-about-multiple-columns-lookup-and-chain-of-lookup/m-p/383848#M245105</guid>
      <dc:creator>jarviszzzz</dc:creator>
      <dc:date>2024-02-26T03:13:57Z</dc:date>
    </item>
  </channel>
</rss>

