Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Phoenix Bulk Load on Ctrl-A delimiter (error code 143)


Getting the following error when trying to bulk load HDFS data into Phoenix. The data is separated by the Ctrl-A delimiter.

Command used is as follows:

hadoop jar /usr/hdp/2.3.4.0-3485/phoenix/phoenix-4.4.0.2.3.4.0-3485-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table INTM.TEST_DATA --input /data/test_data/20160329145829/HDFS_TEST_DATA.csv --zookeeper localhost:2181:/hbase

Error message:

16/04/04 08:00:46 INFO mapreduce.Job: Task Id : attempt_1459451088217_0193_m_000008_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: Error on record, CSV record does not have enough values (has 1, but needs 14), record =[2808976522139491A0301939984852009-08-22 08:49:46.961000UMEMCVSNRAIL2009-08-22 08:49:46.961000UMEMCVSNRAILNative\etlload2016-03-29 14:58:31.763751]
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: Error on record, CSV record does not have enough values (has 1, but needs 14), record =[2808976522139491A0301939984852009-08-22 08:49:46.961000UMEMCVSNRAIL2009-08-22 08:49:46.961000UMEMCVSNRAILNative\etlload2016-03-29 14:58:31.763751]
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$MapperUpsertListener.errorOnRecord(CsvToKeyValueMapper.java:261)
    at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:168)
    at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:136)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:157)
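The "has 1, but needs 14" message indicates the parser never split the record, because the tool was parsing on its default comma delimiter while the data uses Ctrl-A. A quick sketch (hypothetical sample data, not the actual file) shows the effect:

```shell
# A 3-field record delimited by the Ctrl-A byte (0x01):
printf 'a\001b\001c\n' > /tmp/ctrl_a_demo.txt

# Parsed on commas (the CsvBulkLoadTool default), everything is one field:
awk -F',' '{print NF}' /tmp/ctrl_a_demo.txt      # prints 1

# Parsed on the Ctrl-A byte, the three fields reappear:
awk -F$'\001' '{print NF}' /tmp/ctrl_a_demo.txt  # prints 3
```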

1 ACCEPTED SOLUTION


It was a permission issue in Ranger. The issue has been resolved. Thank you very much for your help.


6 REPLIES

Master Guru

Hi @Ram Veer, CsvBulkLoadTool already supports a custom delimiter via the '-d' option. To set Ctrl-A, add this at the end of your command:

-d '^A' ... inside the quotes, press Ctrl-V followed by Ctrl-A; the literal '^A' character will appear.
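If typing Ctrl-V Ctrl-A interactively is awkward (e.g. inside a script), Bash's ANSI-C quoting should produce the same literal 0x01 byte; this is a sketch of the idea, not the exact invocation from the thread:

```shell
# $'\001' expands to the literal Ctrl-A (SOH, 0x01) byte in Bash:
DELIM=$'\001'

# Verify it is a single 0x01 byte before using it:
printf '%s' "$DELIM" | od -An -tx1   # prints " 01"

# Hypothetical use with the bulk-load command from the question:
# hadoop jar /usr/hdp/2.3.4.0-3485/phoenix/phoenix-4.4.0.2.3.4.0-3485-client.jar \
#   org.apache.phoenix.mapreduce.CsvBulkLoadTool --table INTM.TEST_DATA \
#   --input /data/test_data/20160329145829/HDFS_TEST_DATA.csv \
#   -d "$DELIM" --zookeeper localhost:2181:/hbase
```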


Thank you for your response. I am able to parse the data based on your suggestion. However, the loader fails to load it into the Phoenix table.

16/04/05 10:16:15 INFO mapreduce.CsvBulkLoadTool: Loading HFiles from /tmp/c0fdbfa0-383d-4f7a-bb0a-d41c58f3742b/INTM.EQUIP_KEY
16/04/05 10:16:15 WARN mapreduce.LoadIncrementalHFiles: managed connection cannot be used for bulkload. Creating unmanaged connection.
16/04/05 10:16:15 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x30022bf5 connecting to ZooKeeper ensemble=ucschdpdev01.railinc.com:2181
16/04/05 10:16:15 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=ucschdpdev01.railinc.com:2181 sessionTimeout=90000 watcher=hconnection-0x30022bf50x0, quorum=ucschdpdev01.railinc.com:2181, baseZNode=/hbase
16/04/05 10:16:15 INFO zookeeper.ClientCnxn: Opening socket connection to server ucschdpdev01.railinc.com/10.160.230.141:2181. Will not attempt to authenticate using SASL (unknown error)
16/04/05 10:16:15 INFO zookeeper.ClientCnxn: Socket connection established to ucschdpdev01.railinc.com/10.160.230.141:2181, initiating session
16/04/05 10:16:15 INFO zookeeper.ClientCnxn: Session establishment complete on server ucschdpdev01.railinc.com/10.160.230.141:2181, sessionid = 0x153c6f93c7f1f5a, negotiated timeout = 40000
16/04/05 10:16:15 WARN mapreduce.LoadIncrementalHFiles: Skipping non-directory hdfs://ucschdpdev01.railinc.com:8020/tmp/c0fdbfa0-383d-4f7a-bb0a-d41c58f3742b/INTM.EQUIP_KEY/_SUCCESS
16/04/05 10:16:15 INFO hfile.CacheConfig: CacheConfig:disabled
16/04/05 10:16:16 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://ucschdpdev01.railinc.com:8020/tmp/c0fdbfa0-383d-4f7a-bb0a-d41c58f3742b/INTM.EQUIP_KEY/0/344df66eb2644474b3ac5da7ecbe767c first=1 last=999999
16/04/05 10:27:29 INFO client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=673701 ms ago, cancelled=false, msg=row '' on table 'INTM.EQUIP_KEY' at region=INTM.EQUIP_KEY,,1459834472681.60a4b8f4fad454e419242d08a51660ad., hostname=ucschdpdev03.railinc.com,16020,1459451121171, seqNum=2

Master Guru

It failed after 11 minutes, so there may be a permission issue on the HFiles (see the details on the CsvBulkLoadTool page). Can you try adding this to your command and retrying:

-Dfs.permissions.umask-mode=000

or if possible run the command as hbase user.
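The umask suggestion works because files a process creates are normally stripped of group/other permissions, so the hbase user may be unable to read HFiles written by the submitting user; umask 000 leaves the full requested mode in place. A minimal local sketch of the effect (file name is arbitrary, GNU stat assumed):

```shell
# With a typical umask of 022 a new file gets mode 644; with umask 000
# it keeps the full requested mode for files, 666 (rw-rw-rw-).
tmpfile=$(mktemp)
rm -f "$tmpfile"                 # recreate it under the relaxed umask
(umask 000; touch "$tmpfile")
stat -c '%a' "$tmpfile"          # prints 666
rm -f "$tmpfile"
```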


It was a permission issue in Ranger. The issue has been resolved. Thank you very much for your help.

Master Guru

Hi @Ram Veer, great news! Please consider accepting/up-voting my answer above. Thanks!

Master Guru

Well, I thought my answer of Apr. 5 ... 🙂