Can Hadoop components or jobs (Hive or MapReduce) read the block compressed sequential file (.bcsf) format?

Hi All,

We have data files ingested into Hadoop via Ab Initio, stored in block compressed sequential file (.bcsf) format.

Currently they can be read only through the Ab Initio component, via its queryit engine or an ODBC connection, which is a limitation.

Can these files be read by Hadoop components such as Hive, or interpreted by a MapReduce program? If so, could someone share the steps or clarify how?

Thank you.

@Muthukumar S

Yes, you can read the files from Hive by creating a table stored as a SequenceFile, with the matching SerDe and compression codec. Please refer to the link for more details.
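Whether this approach applies depends on whether the .bcsf files really are Hadoop SequenceFiles, which the thread does not confirm. A minimal sketch of a check, assuming one file has been copied locally: a Hadoop SequenceFile starts with the bytes `SEQ` and a version byte, followed by the key and value class names, two compression flags, and (if compressed) the codec class name. The function name `inspect_header` is hypothetical, and the sketch only handles class names shorter than 128 bytes.

```python
import io


def _read_short_string(stream):
    """Read a string as written by Hadoop's Text.writeString.

    The format is a variable-length int followed by UTF-8 bytes. Lengths
    0..127 fit in a single byte; this sketch handles only that case.
    """
    raw = stream.read(1)
    if len(raw) != 1:
        raise ValueError("truncated header")
    length = raw[0]
    if length > 127:
        raise ValueError("multi-byte length not handled in this sketch")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("truncated header")
    return data.decode("utf-8")


def inspect_header(stream):
    """Return basic facts from the start of a binary stream.

    If the stream does not begin with the SequenceFile magic bytes, only
    {"is_sequence_file": False} is returned.
    """
    magic = stream.read(4)
    if len(magic) < 4 or magic[:3] != b"SEQ":
        return {"is_sequence_file": False}

    info = {"is_sequence_file": True, "version": magic[3]}
    # Versions 5 and later store the codec class name after the flags.
    if info["version"] >= 5:
        info["key_class"] = _read_short_string(stream)
        info["value_class"] = _read_short_string(stream)
        flags = stream.read(2)
        if len(flags) != 2:
            raise ValueError("truncated header")
        info["compressed"] = flags[0] != 0
        info["block_compressed"] = flags[1] != 0
        if info["compressed"]:
            info["codec"] = _read_short_string(stream)
    return info
```

To try it, open a local copy in binary mode (for example `open("sample.bcsf", "rb")`, where the file name is a placeholder) and pass the file object to `inspect_header`. If `is_sequence_file` comes back `False`, the files use a different layout and Hive's SequenceFile support would not read them directly.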

@Sindhu Will try and let you know. Thank you for the information.

@Sindhu

It has been quite a long time, and we are still searching for a solution. Reading from Hive is fine; our situation is that a single table has many .bcsf files (ingested by Ab Initio) that need to be read from HDFS and written to Hive for further processing. I understand that we need to know the table's schema to create the table in Hive and then import all the .bcsf files from HDFS as input. My doubts are:

1. Can Hive read that format from HDFS? (If I understand your reply correctly, it can.)

2. If yes, how will it identify all these historical .bcsf files and combine them into a single Hive table?

Appreciate your reply.
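
On the second question, a Hive table treats every file under its LOCATION directory as part of the table, so placing all the .bcsf files for one table in a single HDFS directory is enough for Hive to read them together. The sketch below builds a `CREATE EXTERNAL TABLE ... STORED AS SEQUENCEFILE` statement in Python. It assumes the files are genuine Hadoop SequenceFiles; the helper name `build_external_table_ddl`, the table name, the columns, and the path are all hypothetical placeholders, not details from this thread.

```python
def build_external_table_ddl(table, columns, location):
    """Build a Hive DDL statement for an external SequenceFile table.

    table:    table name (hypothetical placeholder)
    columns:  list of (column_name, hive_type) pairs from the known schema
    location: HDFS directory holding all the files for this table
    """
    cols = ",\n".join(f"  `{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE `{table}` (\n{cols}\n)\n"
        "STORED AS SEQUENCEFILE\n"
        f"LOCATION '{location}'"
    )


# Example with made-up names; replace them with the real schema and path.
ddl = build_external_table_ddl(
    "customer_history",
    [("customer_id", "BIGINT"), ("customer_name", "STRING")],
    "/data/abinitio/customer_history",
)
print(ddl)
```

Note that Hive ignores the key of each SequenceFile record and, with its default SerDe, expects the value to be delimited text. If Ab Initio writes the values in some other encoding, a custom SerDe or a MapReduce conversion step would probably be needed.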