Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How to use SQL within a QueryRecord Processor to query an Avro Schema

Solved Go to solution
Highlighted

How to use SQL within a QueryRecord Processor to query an Avro Schema

Explorer

I am trying to reference an Avro Schema that already works from a CEF file and take out a part of that file (time as rt) by referencing the Schema, but I am a bit rusty with my SQL so it's not going to plan rn - does anyone have any examples of doing this in NiFi?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: How to use SQL within a QueryRecord Processor to query an Avro Schema

@robnew666   yes you can do that, you just need to experiment with SQL statements that satisfy your requirements.  You should work on some tests and inspect what you get in the flowfile after a select *.  This will depend on your reader, etc.   You can also have multiples QueryRecord Processors but dont forget it is possible to have multiple queries in a single processor.  The key you use to hold the query will become a route out of the processor downstream.  For example you could select all results that match a certain condition, and separate those from the rest.

 


 


If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.  


 


Thanks,



Steven

View solution in original post

4 REPLIES 4
Highlighted

Re: How to use SQL within a QueryRecord Processor to query an Avro Schema

@robnew666   You should provide more information in order for community members to respond with specific details.  Without seeing schema, sample data, what you have, etc, we have little to go off.

 

That said below is an example of query record with a query.  The important theory here is to use + to add new queries and use SELECT * FROM FLOWFILE.  You can add WHERE [your query logic here] to operate agains schema columns just like SQL.  You can have more than 1 query and route them separately down stream.  So for testing start with your first query1 (result) like below.  Next begin to add WHERE and operate against your requirements.  As you further advance your query, make new query2, query3, working methodically towards what works.  When done, delete the test queries you do not need and then route the final query downstream.

 

Screen Shot 2020-10-28 at 8.48.45 AM.png

 

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.

 

Thanks,

Steven

 


 


If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.  


 


Thanks,



Steven

Highlighted

Re: How to use SQL within a QueryRecord Processor to query an Avro Schema

Explorer

Thanks,

 

I cant put any data up here, but I think we are getting close.  I can pull all data through which then marries it up with the schema, so it is formatted as data associated with the schema. I am then wondering how I can then add another QueryRecord Processor below which I can then add individual sql lines which can pull out coloumns from this data, ie SELECT *, FROM FLOWFILE WHERE table_name = 'rt'

 

Re: How to use SQL within a QueryRecord Processor to query an Avro Schema

@robnew666   yes you can do that, you just need to experiment with SQL statements that satisfy your requirements.  You should work on some tests and inspect what you get in the flowfile after a select *.  This will depend on your reader, etc.   You can also have multiples QueryRecord Processors but dont forget it is possible to have multiple queries in a single processor.  The key you use to hold the query will become a route out of the processor downstream.  For example you could select all results that match a certain condition, and separate those from the rest.

 


 


If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.  


 


Thanks,



Steven

View solution in original post

Highlighted

Re: How to use SQL within a QueryRecord Processor to query an Avro Schema

Explorer

Thanks,

I am trying some stuff now to parse data using the JoltSpec/JoltTransformJSON processor that could help me with this issue, but thanks for this help, hopefully can get things running more smoothly soon.

Don't have an account?
Coming from Hortonworks? Activate your account here