We have click feed data which is pulled from a tool to a ftp server on daily basis. Currently we run a python script to take this data and write it to data warehouse. We want to switch to HDFS and make it queryable with Hive.
What is the best way or tool to automate this ingestion process from ftp > hdfs > hive? Is there a Hadoop tool with which O can pull the data from ftp?