Illegal Data error when loading a csv file to Phoenix table - Error Handling

I followed the instructions at the link below to load CSV data into a Phoenix table using MapReduce.

https://phoenix.apache.org/bulk_dataload.html

The load succeeded with a CSV file of 4k rows. When I tried a file with millions of records, I kept getting errors such as "illegal data", "The data exceeds the max capacity for the data type", or "CSV record does not have enough values (has 13, but needs 14)". At the end of the load, I got a summary like this:

 Job Counters
                Failed map tasks=4
                Killed reduce tasks=1
                Launched map tasks=8
                Launched reduce tasks=1


Not a single record was loaded. I know that most of the rows in the file have the correct layout. Is there a way to ignore the layout errors, or to export the rows with an invalid layout?
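One way to export the rows with an invalid layout is to pre-check the file before running the bulk load. The following is only a minimal sketch done outside Phoenix, not a feature of the bulk load tool: it assumes a comma-delimited file with no header row and takes the expected field count (14, going by the error message above) as a parameter. The function name `split_by_field_count` is made up for illustration. It only catches rows with the wrong number of values; it does not check data types or column lengths, so "illegal data" and "max capacity" errors would still need a separate check against the table schema.

```python
import csv
import io


def split_by_field_count(lines, expected_fields, delimiter=","):
    """Split CSV rows into (good_rows, bad_rows) by number of fields.

    `lines` is any iterable of text lines (for example an open file).
    bad_rows holds (line_number, row) pairs so rejected rows can be
    traced back to the source file. Blank lines are treated as bad.
    """
    good, bad = [], []
    reader = csv.reader(lines, delimiter=delimiter)
    for row in reader:
        if len(row) == expected_fields:
            good.append(row)
        else:
            # reader.line_num is the number of source lines read so far
            bad.append((reader.line_num, row))
    return good, bad


if __name__ == "__main__":
    # Tiny in-memory sample; the real file would need expected_fields=14
    # (an assumption taken from the error message in the question).
    sample = io.StringIO("1,a,x\n2,b\n3,c,z\n")
    good, bad = split_by_field_count(sample, expected_fields=3)
    print(len(good), "valid rows;", len(bad), "rejected")
    for line_no, row in bad:
        print("line", line_no, row)
```

For a real file, passing `open(path, newline="")` as `lines` and writing the two lists out with `csv.writer` would give one file to bulk load and one file of rejected rows to inspect. For a file with millions of records, writing each row out as it is read, instead of collecting lists, would keep memory use flat.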


Re: Illegal Data error when loading a csv file to Phoenix table - Error Handling

Where can I find full documentation for this tool?

org.apache.phoenix.mapreduce.CsvBulkLoadTool 

Re: Illegal Data error when loading a csv file to Phoenix table - Error Handling

I was able to bypass the errors with the "--ignore-errors" option, and the file loaded, but with a smaller record count, so some records were not loaded into the table. Where can I find those missing records? I read somewhere that Phoenix and HBase log heavily. Were those errors logged somewhere, or is there some Phoenix configuration I need to set to enable that logging?
