Cannot read sequence file which was created by NiFi CreateHadoopSequenceFile processor.
Created on 02-13-2018 01:33 PM - edited 08-17-2019 08:35 PM
Hi!
I have a dataflow in which I create a sequence file from multiple files and load it into HDFS.
Unfortunately, I cannot read the generated file correctly in Spark.
For example, I generate 5 txt files:
- 1.txt: 1
- 2.txt: 2 22
- 3.txt: 3 33 333
- 4.txt: 4 44 444 4444
- 5.txt: 5 55 555 5555 55555
and create a new sequence file from those files.
After that, I try to read the resulting file:
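A minimal sketch of how the file might be read in spark-shell (the actual read code is only shown in the attached screenshots, so the path and the Text/BytesWritable key/value types of the CreateHadoopSequenceFile output are assumptions here):

```scala
import org.apache.hadoop.io.{BytesWritable, Text}

// Hypothetical path; CreateHadoopSequenceFile is assumed to write (Text, BytesWritable) pairs.
val rdd = sc.sequenceFile("/path/to/sequence_file", classOf[Text], classOf[BytesWritable])

// getBytes() returns the full backing array, which can be larger than the
// actual payload, so the unused capacity shows up as trailing zero bytes.
rdd.map { case (name, bytes) => (name.toString, new String(bytes.getBytes, "UTF-8")) }
   .collect()
   .foreach(println)
```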
We can see there are garbage characters in the output (they are zero bytes).
How can I get rid of those unnecessary bytes?
Some additional screenshots are attached.
Created 02-16-2018 06:18 AM
It was totally my fault! I used the wrong method (getBytes) to get the bytes from the BytesWritable object. The copyBytes method is the one intended for that purpose.
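A sketch of the corrected read, under the same assumptions as above: copyBytes() returns a copy of the data trimmed to getLength(), so no padding from the backing array leaks into the output.

```scala
// copyBytes() copies only the valid bytes, unlike getBytes(), which exposes
// the whole (possibly oversized, zero-padded) internal buffer.
rdd.map { case (name, bytes) => (name.toString, new String(bytes.copyBytes(), "UTF-8")) }
   .collect()
   .foreach(println)
```

An equivalent alternative is to keep getBytes() but slice it to getLength() yourself; copyBytes() simply does that for you.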
