Archives of Support Questions (Read Only)


Read an Avro file stored in HDFS

Expert Contributor

Hi,

I want to read the metadata of an Avro file stored in HDFS using the Avro API ( https://avro.apache.org/docs/1.4.1/api/java/org/apache/avro/file/DataFileReader.html ).

As far as I can tell, the Avro DataFileReader only accepts File objects. Is it possible to read from a file stored in HDFS instead of one on the local file system?

Thank you

1 ACCEPTED SOLUTION

Expert Contributor

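I put together some sample code, and it works fine. It uses DataFileStream, which reads from any InputStream, so the file can be opened through the Hadoop FileSystem API: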

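import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileStream;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Path as posted; a full HDFS URI normally has the form hdfs://<namenode>:<port>/<path>.
String inputF = "hdfs://CustomerData-20160128-1501807.avro";
Path inPath = new Path(inputF);
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020");
// Open the HDFS file as a stream and let DataFileStream parse the Avro header.
// try-with-resources closes the reader and the stream, and avoids using a null stream if open() fails.
try (InputStream inStream = new BufferedInputStream(FileSystem.get(URI.create(inputF), conf).open(inPath));
     DataFileStream<GenericRecord> reader =
             new DataFileStream<>(inStream, new GenericDatumReader<GenericRecord>())) {
    // The writer schema is stored in the file header.
    Schema schema = reader.getSchema();
    System.out.println(schema.toString());
} catch (IOException e) {
    e.printStackTrace();
}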
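
For anyone wondering why a plain stream is enough here: the metadata of an Avro container file sits in a header at the very start of the file, so it can be read sequentially without seeking. The sketch below illustrates this with the JDK only, following the header layout in the Avro specification (magic bytes `Obj` plus 1, then a map of string keys to byte values). The class `AvroHeaderSketch` and its methods are hypothetical names made up for illustration, not part of Avro, and decoding every value as UTF-8 is an assumption that fits `avro.schema` and `avro.codec` but not necessarily user-defined binary metadata.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Avro: reads the header metadata of an Avro container file. */
final class AvroHeaderSketch {

    /** Decodes one zigzag-encoded variable-length long, as used by Avro's binary encoding. */
    static long readLong(InputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("stream ended inside a varint");
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;          // high bit clear: last byte of the varint
            shift += 7;
            if (shift > 63) throw new IOException("varint too long");
        }
        return (raw >>> 1) ^ -(raw & 1);         // undo zigzag encoding
    }

    /** Reads a length-prefixed byte sequence (Avro "bytes" or "string"). */
    static byte[] readBytes(DataInputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length: " + len);
        byte[] buf = new byte[(int) len];
        in.readFully(buf);
        return buf;
    }

    /** Returns the header metadata map, with values decoded as UTF-8 text (an assumption). */
    static Map<String, String> readMetadata(InputStream stream) throws IOException {
        DataInputStream in = new DataInputStream(stream);
        byte[] magic = new byte[4];
        in.readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("not an Avro container file");
        }
        Map<String, String> meta = new LinkedHashMap<>();
        // The map is written as blocks of key/value pairs, ended by a block with count 0.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {                     // negative count: a block byte size follows
                count = -count;
                readLong(in);                    // the size is not needed here, so discard it
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                String value = new String(readBytes(in), StandardCharsets.UTF_8);
                meta.put(key, value);
            }
        }
        return meta;
    }
}
```

With the HDFS stream from the answer, this would be called as `AvroHeaderSketch.readMetadata(fs.open(inPath))`. In real code there is no need to hand-parse the header: `DataFileStream` already exposes the same information through `getMetaKeys()` and `getMetaString(key)`, alongside `getSchema()`.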


11 REPLIES

Master Mentor

@John Smith, please use the "code" button when pasting code.

Master Mentor

@John Smith I edited the answer to format the code.