
Using Phoenix CsvBulkUploadTool with columns that contain "," characters

Super Collaborator

I am trying to use the CsvBulkUploadTool to get data from Hive to Phoenix/HBase.

As I typically do, I created a Hive table with a copy of the data I care about, using the properties:

row format delimited fields terminated by '|' null defined as 'null' stored as textfile location 'my location'
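For context, the full statement looks roughly like the following. This is a hypothetical reconstruction for illustration; the table name, column names, types, and location are placeholders matching the row shape shown below.

CREATE EXTERNAL TABLE my_export (
  c1 INT, c2 INT, c3 INT, c4 INT,
  c5 STRING,   -- the bracketed byte list; this is the column containing "," characters
  c6 INT, c7 STRING, c8 INT,
  c9 BIGINT,   -- the epoch-millis timestamp
  c10 INT, c11 INT
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  NULL DEFINED AS 'null'
STORED AS TEXTFILE
LOCATION '/my/location';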

This correctly outputs a series of CSV files in HDFS with rows that look like this:

96|9|116|183|[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]|13|Some_String|3180|1474517022732|0|150

The 5th column needs to be stored as a string.

I can manually enter this into my HBase table:

upsert into my_table values (96,9,116,183,'[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]',13,'Some_String',3180,1474517022732,0,150)
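For reference, the target table's shape is roughly the following (a hypothetical sketch; the column names, exact types, and primary key are assumptions, the relevant point being that the 5th column is a VARCHAR):

CREATE TABLE my_table (
  c1 INTEGER NOT NULL,
  c2 INTEGER NOT NULL,
  c3 INTEGER NOT NULL,
  c4 INTEGER NOT NULL,
  c5 VARCHAR,   -- the bracketed byte list, stored as a string
  c6 INTEGER,
  c7 VARCHAR,
  c8 INTEGER,
  c9 BIGINT,
  c10 INTEGER,
  c11 INTEGER
  CONSTRAINT pk PRIMARY KEY (c1, c2, c3, c4)
);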

However, the CsvBulkUploadTool fails.

I'm passing the -d '|' parameter (I've also tried double quotes), but I still get errors like the one below.
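For reference, the invocation is along these lines, assuming the standard Phoenix bulk loader entry point (the jar path, table name, input path, and ZooKeeper quorum are placeholders):

HADOOP_CLASSPATH=$(hbase mapredcp) hadoop jar phoenix-<version>-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table MY_TABLE \
    --input /path/to/exported/csvs \
    -d '|' \
    --zookeeper zk-host:2181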

Can anyone tell me how to accomplish my objective here?

16/10/07 14:33:16 INFO mapreduce.Job: Task Id : attempt_1475193681552_0605_m_000376_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: Error on record, java.sql.SQLException: ERROR 201 (22000): Illegal data., record=[96|9|116|183|[-6, -81, 96, 43, 108, 12, 0, 116, 30, 88, -29, 87, -73, -106, 0, 9, 27, 1, 71, 3, 0, 2, 13, 118, 119]|13|Some_String|3180|1474517022732|0|150]
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
1 ACCEPTED SOLUTION

Super Collaborator

I think I figured this out.

The issue wasn't the comma; it was my null character.

It would be nice if the error logging here were a bit clearer.
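In case it helps someone else: if the failure is the literal string "null" that the export writes into numeric fields (per the NULL DEFINED AS 'null' clause above), one plausible fix is to emit empty fields instead, which the CSV loader can then interpret as SQL NULL. A sketch of the revised clause, everything else unchanged:

ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
  NULL DEFINED AS ''   -- empty field instead of the literal string 'null'
STORED AS TEXTFILE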
