Cannot read sequence file which was created by NiFi CreateHadoopSequenceFile processor.

Hi!

I have a dataflow in which I create a sequence file from multiple files and load it into HDFS.

60522-nifi-hdfs.png

Unfortunately, I cannot read the generated file correctly in Spark.

For example, I generate 5 txt files:

1.txt:
    1
2.txt:
    2
    22
3.txt:
    3
    33
    333
4.txt:
    4
    44
    444
    4444
5.txt:
    5
    55
    555
    5555
    55555

and create a new sequence file from those files.

After that I try to read the resulting file:

60523-nifi-hdfs2.png

As you can see, the output contains corrupted or garbage characters (they are zero bytes).

How can I get rid of those unnecessary bytes?

Some additional screenshots are attached.

60521-nifi-hdfs5.png

60520-nifi-hdfs4.png

60519-nifi-hdfs3.png

60518-nifi-hdfs2.png

1 ACCEPTED SOLUTION

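It was entirely my fault! I used the wrong method (getBytes) to get the bytes from the BytesWritable object. getBytes returns the whole backing buffer, which can be longer than the actual data, so only the first getLength() bytes are valid. The copyBytes method exists for exactly this purpose: it returns a copy that is exactly the length of the data.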
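For anyone hitting the same thing, here is a minimal sketch of where the zero bytes come from. It does not use Hadoop at all: PaddedBytes is a hypothetical stand-in that only mimics a reusable buffer which may be larger than the data it holds, and the 1.5x growth policy is an assumption made for illustration. The helper names rawBuffer and trimmed are likewise made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PaddedBufferDemo {

    // Hypothetical stand-in for a reusable byte container such as BytesWritable.
    // This is NOT the Hadoop class; it only mimics "buffer may be longer than the data".
    static final class PaddedBytes {
        private byte[] buffer = new byte[0];
        private int length = 0;

        void set(byte[] data) {
            if (data.length > buffer.length) {
                // Assumed growth policy: over-allocate by 1.5x so later records fit.
                buffer = new byte[data.length * 3 / 2];
            }
            System.arraycopy(data, 0, buffer, 0, data.length);
            length = data.length;
        }

        // Whole backing buffer, padding included.
        byte[] getBytes() { return buffer; }

        // Number of valid bytes at the start of the buffer.
        int getLength() { return length; }

        // Copy trimmed to the valid length.
        byte[] copyBytes() { return Arrays.copyOf(buffer, length); }
    }

    // Reads the value the wrong way: returns the padded backing buffer.
    static byte[] rawBuffer(String text) {
        PaddedBytes value = new PaddedBytes();
        value.set(text.getBytes(StandardCharsets.UTF_8));
        return value.getBytes();
    }

    // Reads the value the right way: returns only the valid bytes.
    static byte[] trimmed(String text) {
        PaddedBytes value = new PaddedBytes();
        value.set(text.getBytes(StandardCharsets.UTF_8));
        return value.copyBytes();
    }

    public static void main(String[] args) {
        byte[] raw = rawBuffer("55555");
        byte[] exact = trimmed("55555");
        System.out.println(raw.length);   // 7: five data bytes plus two zero bytes
        System.out.println(exact.length); // 5
        System.out.println(new String(exact, StandardCharsets.UTF_8)); // 55555
    }
}
```

With the real BytesWritable, the equivalent fix should be either value.copyBytes() or Arrays.copyOf(value.getBytes(), value.getLength()); both give an array without the trailing padding.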
