How to execute Google BigQuery queries from NiFi?

Super Collaborator

Hi,

I do not see any processors for connecting to Google BigQuery and executing queries.

Is there a way to get results from Google BigQuery using NiFi?

10 REPLIES

Super Guru

@Saikrishna Tarapareddy

Unfortunately, there is no dedicated processor for connecting to Google BigQuery and executing queries. There have been some discussions about a set of new processors to support various Google Cloud services, but those processors have not yet been planned into a release.

Until then, you can use the ExecuteScript processor.

Here is an example of how to write a script using Python: https://cloud.google.com/bigquery/create-simple-app-api#bigquery-simple-app-print-result-python . At https://cloud.google.com/bigquery/create-simple-app-api you can find examples in other languages that the ExecuteScript processor also supports.

Of course, you can always develop your own processor, building on the Java example in the Google documentation. Here is an example of how to build a custom NiFi processor:
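As a rough illustration of what such a script could look like, here is a minimal sketch that calls the BigQuery REST method `jobs.query` using only the Python 3 standard library. It is not the example from the Google page linked above. The function names are made up for illustration, and the access token is assumed to be an OAuth 2.0 bearer token obtained elsewhere (for example from a service account). The python engine in ExecuteScript is Jython, so this sketch would probably need adapting before it runs inside NiFi.

```python
import json
import urllib.request

BIGQUERY_API = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, access_token):
    """Build the HTTP request for the jobs.query REST method (hypothetical helper)."""
    body = json.dumps({"query": sql, "useLegacySql": False}).encode("utf-8")
    return urllib.request.Request(
        url="{}/projects/{}/queries".format(BIGQUERY_API, project_id),
        data=body,
        method="POST",
        headers={
            # Assumption: the caller already holds a valid OAuth 2.0 token.
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


def rows_to_dicts(response):
    """Turn the schema/rows layout of a query response into a list of dicts."""
    names = [field["name"] for field in response["schema"]["fields"]]
    return [
        dict(zip(names, (cell["v"] for cell in row["f"])))
        for row in response.get("rows", [])
    ]


def run_query(project_id, sql, access_token):
    """Send the query and return its first page of rows (makes a network call)."""
    request = build_query_request(project_id, sql, access_token)
    with urllib.request.urlopen(request) as reply:
        return rows_to_dicts(json.load(reply))
```

The sketch ignores paging and jobs that have not finished when the call returns, so it only suits small result sets. Cell values are left as the API returns them, without type conversion.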

https://community.hortonworks.com/articles/4318/build-custom-nifi-processor.html

If this response reasonably addressed your question, please vote and accept the answer.

Super Guru

@Saikrishna Tarapareddy

The community processor mentioned by Tim is a good example of how to write a custom processor. It is limited to the Put action and is quite old, so you would have to rebuild it using more up-to-date libraries.

Community processors are not supported by Hortonworks.

Master Guru

There is an open source community processor you can try:

https://github.com/theShadow89/nifi-bigquery-bundle

Super Collaborator

Hi @Constantin Stanca

Can we use their REST APIs to download BigQuery tables?

Hi @Timothy Spann, we cannot use processors that are not supported.

Regards,

Sai

Super Guru

@Saikrishna Tarapareddy

You can use their API to download the data from those tables. Those examples show how to select the data. However, you may be dealing with a lot of data. In that case, you may want to export it from BigQuery to a Google Cloud Storage (GCS) bucket and connect NiFi to GCS, which is well supported by the GCS processors for listing, fetching, putting, and deleting objects. That is the most efficient way.

Look at this reference to see how to extract the data: https://cloud.google.com/bigquery/docs/exporting-data

You can schedule a job to extract the data and put it in a GCS bucket, and NiFi will just pick it up.
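To make the export step more concrete, here is a minimal sketch that submits an extract job through the BigQuery REST method `jobs.insert`, using only the Python 3 standard library. The helper names, the access token handling, and the project, dataset, table, and bucket values are placeholders rather than anything from this thread. Scheduling the job and waiting for it to finish are left out.

```python
import json
import urllib.request

BIGQUERY_API = "https://bigquery.googleapis.com/bigquery/v2"


def build_extract_job(project_id, dataset_id, table_id, destination_uri):
    """Return a jobs.insert body that exports one table to GCS as JSON lines."""
    return {
        "configuration": {
            "extract": {
                "sourceTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # A "*" in the URI lets BigQuery split a large export into shards.
                "destinationUris": [destination_uri],
                "destinationFormat": "NEWLINE_DELIMITED_JSON",
            }
        }
    }


def submit_extract_job(project_id, job_body, access_token):
    """POST the job to BigQuery and return the parsed reply (makes a network call)."""
    request = urllib.request.Request(
        url="{}/projects/{}/jobs".format(BIGQUERY_API, project_id),
        data=json.dumps(job_body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: the caller already holds a valid OAuth 2.0 token.
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

A call might look like `submit_extract_job("my-project", build_extract_job("my-project", "my_dataset", "my_table", "gs://my-bucket/export-*.json"), token)`, where every name is hypothetical. The job runs asynchronously, so the files may appear in the bucket some time after the call returns.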

Super Collaborator

@Constantin Stanca,

My file names will change every day, like datafile_yyyymmdd.json, so do we still need to create a scheduled job using NiFi and their REST API to move the file to GCS? Or is there a way to separate this?

Regards,

Sai
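One possible way to handle the changing name is to compute it from the run date when the export is scheduled. The sketch below assumes the `datafile_yyyymmdd.json` pattern from the post above. The function name and the bucket are hypothetical.

```python
from datetime import date


def daily_export_uri(bucket, run_date, prefix="datafile"):
    """Build a GCS URI such as gs://<bucket>/datafile_yyyymmdd.json for one run date."""
    return "gs://{}/{}_{}.json".format(bucket, prefix, run_date.strftime("%Y%m%d"))


# Example: the URI for today's export into a hypothetical bucket.
todays_uri = daily_export_uri("my-bucket", date.today())
```

With a scheme like this, the NiFi side would not need to know the exact name in advance. It could list the bucket and pick up whatever objects start with `datafile_`.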


I had the same issue and needed to create a processor on my own. Have a look at it:

https://datamater.io/2018/06/02/nifi-openaq-get-bigquery-processor/

New Contributor

Are these processors public so others can benefit from them?

New Contributor

@Pawel Leszczynski

Can't find it 😕 Where is the source code or the NAR file, please?

Thanks for your response.
