
Using Phoenix CsvBulkUploadTool with columns that contain "," characters

Super Collaborator

I am trying to use the CsvBulkUploadTool to get data from Hive to Phoenix/HBase.

As I typically do, I created a Hive table with a copy of the data I care about, using the properties:

row format delimited fields terminated by '|' null defined as 'null' stored as textfile location 'my location'

This correctly outputs a series of CSV files in HDFS with rows that look like this:

96|9|116|183|[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]|13|Some_String|3180|1474517022732|0|150

The 5th column needs to be stored as a string.
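As a sanity check (this is just an illustration in Python, not part of the Phoenix tooling), splitting that row on '|' shows the embedded commas in the 5th column don't affect the field count:

```python
import csv
import io

row = ("96|9|116|183|[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, "
       "-73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]"
       "|13|Some_String|3180|1474517022732|0|150")

# Parse with '|' as the delimiter; the commas inside the bracketed
# 5th column are ordinary characters and do not split the field.
fields = next(csv.reader(io.StringIO(row), delimiter="|"))
print(len(fields))   # 11 columns
print(fields[4])     # the bracketed list, intact as one string
```

So a pipe delimiter should, in principle, carry the comma-laden column through untouched.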

I can manually enter this into my HBase table:

upsert into my_table values (96,9,116,183,'[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]',13,'Some_String',3180,1474517022732,0,150)

However, the CsvBulkUploadTool fails.

I'm passing the -d '|' parameter (and I've also tried it with double quotes), but I still get errors like the one below.

Can anyone tell me how to accomplish my objective here?

16/10/07 14:33:16 INFO mapreduce.Job: Task Id : attempt_1475193681552_0605_m_000376_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: Error on record, java.sql.SQLException: ERROR 201 (22000): Illegal data., record =[96|9|116|183|[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]|13|Some_String|3180|1474517022732|0|150]
	at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
	at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
1 ACCEPTED SOLUTION

Super Collaborator

I think I figured this out.

The issue isn't the comma; it was my null token. With null defined as 'null' in the Hive table properties, NULLs come out as the literal string "null", which can't be parsed for the non-string columns.

It would be nice if the error logging here was just a bit clearer.
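A minimal sketch (a hypothetical parser, not Phoenix's actual code) of how a mismatched null token produces this kind of "Illegal data" failure: the writer emits the literal string "null" for NULLs, but a loader that only treats empty fields as NULL tries to parse "null" as a number.

```python
def parse_int(field, null_token=""):
    """Convert one delimited field to an int, treating null_token as NULL."""
    if field == null_token:
        return None
    return int(field)

# Loader expecting empty-string NULLs handles an empty field fine...
print(parse_int("", null_token=""))    # None

# ...but chokes on the literal "null" that Hive was told to write.
try:
    parse_int("null", null_token="")
except ValueError:
    print("Illegal data: 'null' is not a valid INTEGER")

# Agreeing on the null token makes the same field parse cleanly.
print(parse_int("null", null_token="null"))  # None
```

The fix is to make the null representation the Hive export writes match the one the bulk loader expects.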
