Support Questions

Illegal Data error when loading a CSV file into a Phoenix table - Error Handling

Expert Contributor

I followed the instructions at the link below to load CSV data into a Phoenix table using MapReduce.

https://phoenix.apache.org/bulk_dataload.html
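For reference, my invocation looked roughly like the example on that page. A sketch with placeholder values: EXAMPLE_TABLE, /data/input.csv, and the ZooKeeper quorum stand in for my actual table name, input path, and quorum, and the client jar path will vary by install.

    hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table EXAMPLE_TABLE \
        --input /data/input.csv \
        --zookeeper zk1,zk2,zk3:2181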

The load was successful with a CSV file of 4k rows. When I tried a file with millions of records, I kept getting errors such as "Illegal data", "The data exceeds the max capacity for the data type", or "CSV record does not have enough values (has 13, but needs 14)". At the end of the load, I got a summary like:

 Job Counters
                Failed map tasks=4
                Killed reduce tasks=1
                Launched map tasks=8
                Launched reduce tasks=1


Not a single record was loaded. I know most of the rows in the file have the correct layout. Is there a way to ignore the layout errors, or to export the rows with an invalid layout?
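One workaround I am considering is to pre-filter the file by field count before loading. A sketch, assuming a plain comma-delimited file with no quoted fields that themselves contain commas; good.csv and bad.csv are just illustrative output names:

    # Keep rows with exactly 14 fields; divert everything else for inspection.
    awk -F',' 'NF == 14 { print > "good.csv"; next } { print > "bad.csv" }' input.csv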


2 Replies

Expert Contributor

Where can I find full documentation for this tool?

org.apache.phoenix.mapreduce.CsvBulkLoadTool 
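The options are listed on the bulk_dataload.html page linked above. I believe the tool also prints its usage summary when run without the required arguments, e.g. (the jar path below is a placeholder for your install):

    # Running with no arguments should print the option list.
    hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool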

Expert Contributor

I was able to bypass the error using the "--ignore-errors" option and loaded the file, but with a smaller record count, so some records were not loaded into the table. Where can I find those missing records? I have read that Phoenix or HBase logs heavily; I wonder whether those errors were logged, or whether there is some Phoenix configuration I need to set to enable that logging.
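Since CsvBulkLoadTool runs as an ordinary MapReduce job, I would expect the per-record errors to land in the map task logs rather than anywhere Phoenix-specific. A sketch of how I plan to check, assuming YARN log aggregation is enabled; the application ID is a placeholder copied from the job's console output:

    # Pull the aggregated task logs for the load job and search for the errors.
    yarn logs -applicationId application_1234567890123_0042 | grep -i "illegal data"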
