I followed the instructions at the link below to load CSV data into a Phoenix table using MapReduce.
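For context, my invocation looks roughly like this (the table name, input path, and ZooKeeper quorum are placeholders, not my actual values):

```shell
# Run the Phoenix CSV bulk load MapReduce job.
# EXAMPLE_TABLE, /data/input.csv, and zk-host are placeholders.
hadoop jar phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE \
    --input /data/input.csv \
    --zookeeper zk-host:2181
```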
The loading was successful with a CSV file of 4k rows. When I tried a file with millions of records, I kept getting errors like "Illegal data", "The data exceeds the max capacity for the data type", or "CSV record does not have enough values (has 13, but needs 14)". At the end of the load, I got a summary like:
Job Counters
    Failed map tasks=4
    Killed reduce tasks=1
    Launched map tasks=8
    Launched reduce tasks=1
Not a single record was loaded, even though I know most rows in the file have the correct layout. Is there a way to ignore the layout errors, or to export the rows with an invalid layout?
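As a workaround I was considering pre-filtering the file myself before running the job, splitting rows by field count. A minimal sketch with awk, assuming plain comma delimiters with no quoted commas and the 14-column layout from the error message (file names are placeholders):

```shell
# Rows with exactly 14 fields go to good.csv; everything else to bad.csv
# for inspection. -F',' sets the field separator; NF is the field count.
awk -F',' 'NF == 14 { print > "good.csv"; next } { print > "bad.csv" }' input.csv
```

This would at least let me load good.csv cleanly and review bad.csv by hand, though it obviously cannot catch type errors like "data exceeds the max capacity for the data type".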
Also, where can I find full documentation for this tool?
Update: I was able to bypass the errors using the "--ignore-errors" option, and the file loaded with a smaller record count, so some records were not loaded into the table. Where can I find those missing records? I read somewhere that Phoenix or HBase logs heavily. Were those errors logged somewhere, or is there some Phoenix configuration I need to set to enable that logging?
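My guess is that the skipped records only show up in the MapReduce task logs rather than in any Phoenix-side file. If that is right, on a YARN cluster something like this should pull the aggregated logs for the job (the application ID is a placeholder; the real one is printed when the job is submitted and is also visible in the ResourceManager UI):

```shell
# Fetch aggregated container logs for a finished MapReduce job and
# search them for the parser errors reported during the load.
yarn logs -applicationId application_1234567890123_0001 | grep -i "record"
```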