<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Querying Data Provenance using FlowFile Attribute or Content in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387276#M246285</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;You can add additional attributes that you want indexed with provenance, which you can then use in your provenance searches.&lt;BR /&gt;&lt;BR /&gt;Take a look at the following properties available with the &lt;A href="https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#write-ahead-provenance-repository-properties" target="_blank"&gt;Write Ahead Provenance Repository&lt;/A&gt;:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MattWho_0-1714138509350.png" style="width: 713px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/40452i12B39E980CA1B1E9/image-dimensions/713x180?v=v2" width="713" height="180" role="button" title="MattWho_0-1714138509350.png" alt="MattWho_0-1714138509350.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Since you want to be able to search on a FlowFile attribute, you would add it to the "&lt;SPAN&gt;nifi.provenance.repository.indexed.attributes" property.&amp;nbsp; Keep in mind that indexing additional attributes or fields will increase the disk usage of your provenance_repository.&lt;BR /&gt;&lt;BR /&gt;Added attributes or fields will start being indexed after you restart NiFi.&amp;nbsp; NiFi cannot go back and reindex already processed FlowFiles, but this should help you going forward.&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Please help our community thrive. 
If you found&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;any&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to log in and click "&lt;SPAN&gt;&lt;EM&gt;&lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;Accept as Solution&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/EM&gt;" on&amp;nbsp;&lt;STRONG&gt;one or more&lt;/STRONG&gt;&amp;nbsp;of them that helped.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you,&lt;BR /&gt;Matt&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 26 Apr 2024 13:40:20 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2024-04-26T13:40:20Z</dc:date>
    <item>
      <title>Querying Data Provenance using FlowFile Attribute or Content</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387274#M246284</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm not sure if this has been asked before, but I'm finding it strange that there is not much info or discussion about it. Basically, I have a scenario where at some point in time I believe I was getting some corrupted data from an API call. When I went to verify that by executing the API call a few hours after the error, I didn't see the corrupted data. How do I prove or disprove this? If I could search the data provenance for that particular response FlowFile, I would be able to see what I got at that time after the call. The problem is that the out-of-the-box provenance search criteria don't provide a way to search against the content or the FlowFile's custom attributes; they only allow searching against system field attributes that I don't store. Is there a way to perform such a search, even by creating a dataflow using certain processors in NiFi or using a scripting language?&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35454"&gt;@MattWho&lt;/a&gt; or anybody who can help, I would really appreciate it.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Apr 2024 13:29:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387274#M246284</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2024-04-26T13:29:53Z</dc:date>
    </item>
    <item>
      <title>Re: Querying Data Provenance using FlowFile Attribute or Content</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387276#M246285</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;You can add additional attributes that you want indexed with provenance, which you can then use in your provenance searches.&lt;BR /&gt;&lt;BR /&gt;Take a look at the following properties available with the &lt;A href="https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#write-ahead-provenance-repository-properties" target="_blank"&gt;Write Ahead Provenance Repository&lt;/A&gt;:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="MattWho_0-1714138509350.png" style="width: 713px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/40452i12B39E980CA1B1E9/image-dimensions/713x180?v=v2" width="713" height="180" role="button" title="MattWho_0-1714138509350.png" alt="MattWho_0-1714138509350.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Since you want to be able to search on a FlowFile attribute, you would add it to the "&lt;SPAN&gt;nifi.provenance.repository.indexed.attributes" property.&amp;nbsp; Keep in mind that indexing additional attributes or fields will increase the disk usage of your provenance_repository.&lt;BR /&gt;&lt;BR /&gt;Added attributes or fields will start being indexed after you restart NiFi.&amp;nbsp; NiFi cannot go back and reindex already processed FlowFiles, but this should help you going forward.&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Please help our community thrive. 
If you found&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;any&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to log in and click "&lt;SPAN&gt;&lt;EM&gt;&lt;STRONG&gt;&lt;FONT color="#FF0000"&gt;Accept as Solution&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/EM&gt;" on&amp;nbsp;&lt;STRONG&gt;one or more&lt;/STRONG&gt;&amp;nbsp;of them that helped.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you,&lt;BR /&gt;Matt&lt;/SPAN&gt;&lt;/P&gt;</description>
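      <!--
      The suggestion in this reply could look roughly like the following in nifi.properties.
      This is only a sketch: the property name comes from the reply itself, while the
      attribute names "api.request.id" and "api.response.status" are hypothetical
      placeholders for whatever custom attributes your flow actually sets. The value is
      assumed to be a comma-separated list of attribute names.

      ```properties
      # Hypothetical custom FlowFile attributes to index so they become searchable
      # in provenance (placeholder names, replace with your own attributes).
      nifi.provenance.repository.indexed.attributes=api.request.id,api.response.status
      ```

      As noted in the reply, the change should only take effect after NiFi is restarted,
      applies only to provenance events recorded from then on, and adds to the
      provenance_repository disk usage.
      -->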
      <pubDate>Fri, 26 Apr 2024 13:40:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387276#M246285</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2024-04-26T13:40:20Z</dc:date>
    </item>
    <item>
      <title>Re: Querying Data Provenance using FlowFile Attribute or Content</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387280#M246289</link>
      <description>&lt;P&gt;Awesome &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35454"&gt;@MattWho&lt;/a&gt;. That is great information. However, as you said, this is going to help moving forward, but what about past information? Is there still a way to search the provenance data outside the search feature, which doesn't provide the capability to search by custom attribute or content? My guess is no, based on your answer, but I just wanted to confirm.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Apr 2024 15:41:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387280#M246289</guid>
      <dc:creator>SAMSAL</dc:creator>
      <dc:date>2024-04-26T15:41:08Z</dc:date>
    </item>
    <item>
      <title>Re: Querying Data Provenance using FlowFile Attribute or Content</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387331#M246298</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80381"&gt;@SAMSAL&lt;/a&gt;&amp;nbsp;Without the attributes having been indexed, I can't think of any other way to parse the provenance data.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Apr 2024 12:05:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Querying-Data-Provenance-using-FlowFile-Attribute-or-Content/m-p/387331#M246298</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2024-04-29T12:05:17Z</dc:date>
    </item>
  </channel>
</rss>

