08-13-2018 08:27 AM - edited 08-13-2018 09:13 AM
I have transactional data in CSV format (one file per month, about 3 GB per file), covering 4 years.
I need to search this data by customerID and date range and return only the records that match the search terms.
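To make the requirement concrete, here is a minimal sketch of the lookup over a single file in plain Python. It assumes the search is by customer ID and an inclusive date range. The column names `customer_id`, `txn_date`, and `amount`, and the sample rows, are hypothetical and only for illustration; the real files may be laid out differently.

```python
import csv
import io
from datetime import date


def search_records(csv_file, customer_id, start, end):
    """Yield rows for one customer whose txn_date falls in [start, end].

    Column names (customer_id, txn_date) are assumptions, not the real schema.
    """
    for row in csv.DictReader(csv_file):
        if row["customer_id"] != customer_id:
            continue
        # Dates are assumed to be ISO formatted (YYYY-MM-DD).
        if start <= date.fromisoformat(row["txn_date"]) <= end:
            yield row


# Made-up sample standing in for one monthly file.
sample = io.StringIO(
    "customer_id,txn_date,amount\n"
    "C001,2018-01-15,25.00\n"
    "C002,2018-01-20,40.00\n"
    "C001,2018-03-02,12.50\n"
)

matches = list(search_records(sample, "C001", date(2018, 1, 1), date(2018, 1, 31)))
for row in matches:
    print(row["customer_id"], row["txn_date"], row["amount"])
```

A full scan like this reads every row of every file, which is the cost I am hoping a better storage layout or toolset can avoid.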
Query response time on an RDBMS is not good.
I need advice on which combination of big-data tools would be optimal for this kind of project.
I have a 3-node CDH cluster set up.