Created 01-23-2017 06:55 AM
We have data files ingested via Ab Initio onto Hadoop, stored in block compressed sequential file (.bcsf) format.
Currently they can only be read by the Ab Initio component, through its queryit engine / ODBC connection. This is a limitation for us.
Can these files be read by Hadoop components such as Hive, or interpreted by a MapReduce program? If so, could someone give me some steps or clarification on how to do it?
It has been quite a long time and we are still searching for a solution. Reading from Hive would be fine. Currently, a single table has many .bcsf files (ingested by Ab Initio) that need to be read from HDFS and written to Hive for further processing. I understand that we need to know the table's metadata schema to create the table in Hive, and then import all the .bcsf files from HDFS as input. My doubts are:
1. Can Hive read that format from HDFS? (If I understood your reply correctly, it can.)
2. If yes, how will it identify all these .bcsf history files and combine them into a single Hive table?
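To make doubt 2 concrete: as far as I understand, a Hive external table treats every file under its LOCATION directory as part of the same table, so the files would not need to be joined explicitly. Below is a minimal Python sketch that only builds the DDL string for such a table. Everything specific in it is an assumption for illustration: the table name, columns and HDFS path are made up, and the classes com.example.BcsfSerDe and com.example.BcsfInputFormat are hypothetical placeholders, since I do not know of an existing Hive reader for .bcsf. Please correct me if this is the wrong approach.

```python
def build_external_table_ddl(table, columns, location, serde_class, input_format_class):
    """Build a Hive CREATE EXTERNAL TABLE statement as a string.

    columns is a list of (name, hive_type) pairs. The statement points the
    table at a whole HDFS directory, so every file under it is read as one table.
    """
    cols = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n  {cols}\n)\n"
        f"ROW FORMAT SERDE '{serde_class}'\n"
        f"STORED AS INPUTFORMAT '{input_format_class}'\n"
        # Standard Hive text output format; only relevant if Hive writes to the table.
        f"OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table, schema, path and reader classes (none of these exist as-is).
ddl = build_external_table_ddl(
    table="customer_history",
    columns=[("customer_id", "BIGINT"), ("name", "STRING"), ("load_date", "STRING")],
    location="/data/abinitio/customer_history",
    serde_class="com.example.BcsfSerDe",
    input_format_class="com.example.BcsfInputFormat",
)
print(ddl)
```

The open part is still doubt 1: without a real SerDe / InputFormat that understands .bcsf, this DDL would create the table but Hive could not decode the files.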
I would appreciate your reply.