How can I stream data from the Google Analytics Realtime API to HDFS? So far we have been using R and putting the files in HDFS for bulk imports, but we would like to know the best possible approaches for the GA Realtime API.
Use NiFi to make the API call to Google Analytics and push the data into HDFS, preferably in files close to the HDFS block size to avoid the small-files problem.
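Outside of NiFi, the same idea (poll the API on a schedule, buffer the results, and write one file only when the buffer is close to the block size) can be sketched roughly as below. This is a minimal illustration, not a Google Analytics or HDFS client: `fetch` and `write_file` are hypothetical callables you would supply yourself (one that calls the API and returns records as strings, one that writes bytes to a new HDFS file), and the 128 MB figure is only a common default, so check `dfs.blocksize` on your cluster. In NiFi, a scheduled InvokeHTTP feeding MergeContent and then PutHDFS would play roughly the same roles.

```python
import time
from typing import Callable, Iterable, List

# Assumption: 128 MB is a common HDFS default; verify dfs.blocksize on your cluster.
BLOCK_SIZE = 128 * 1024 * 1024


class BlockSizedBatcher:
    """Buffer newline-delimited records and flush before exceeding a target size."""

    def __init__(self, write_file: Callable[[bytes], None],
                 target_bytes: int = BLOCK_SIZE) -> None:
        self._write_file = write_file  # hypothetical: writes one new HDFS file
        self._target = target_bytes
        self._buf: List[bytes] = []
        self._size = 0

    def add(self, record: str) -> None:
        data = (record + "\n").encode("utf-8")
        # Flush first if this record would push the file past the target size.
        if self._buf and self._size + len(data) > self._target:
            self.flush()
        self._buf.append(data)
        self._size += len(data)

    def flush(self) -> None:
        if not self._buf:
            return
        self._write_file(b"".join(self._buf))
        self._buf = []
        self._size = 0


def poll(fetch: Callable[[], Iterable[str]], batcher: BlockSizedBatcher,
         interval_s: float, iterations: int,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Call a pull API repeatedly; 'streaming' here is just scheduled polling."""
    for i in range(iterations):
        for record in fetch():  # hypothetical: one API call returning records
            batcher.add(record)
        if i < iterations - 1:
            sleep(interval_s)
    batcher.flush()  # write whatever is left when polling stops
```

With realtime data you will probably also want a maximum wait time before flushing, since waiting for a full block may delay the data far longer than is acceptable; that trade-off is not handled in this sketch.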
Sure. But could you please explain how to use it with the Google Analytics API? How do I stream data from a pull API? Also, how would the authentication work? I have used NiFi once, liked it, and would love to use it again. It's an amazing tool, but the very limited support available for NiFi makes it more convenient to pick other tools over it.