Bulk loading via Map Reduce for Phoenix

Hi,

I am getting the following error when I try to load a Phoenix table through MapReduce:

Not a valid JAR: /usr/hdp/2.6.0.3-8/phoenix/bin/phoenix-4.7.0.2.6.0.3-8-client.jar

I am running the following command:

hadoop jar phoenix-4.7.0.2.6.0.3-8-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table POC.DSO_375168_352817_MAP --input /export/home/KBM_HOU/pkumar/DSO_375168_352817.csv

While debugging, I read somewhere that there is an issue with Phoenix 4 and above, and the recommendation was to use the following command instead:

HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar phoenix-2.6.0.3-8-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table POC.DSO_375168_352817_MAP --input /export/home/KBM_HOU/pkumar/DSO_375168_352817.csv

But I am getting the same error.

2 REPLIES

/usr/hdp/2.6.0.3-8/phoenix/bin/phoenix-4.7.0.2.6.0.3-8-client.jar

This is not the installation location of the phoenix-client jar. Unless you copied the jar to this location, you are providing the wrong path. Please reference the JAR via the absolute path after verifying that it exists from the node on which you are executing the command.
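A minimal sketch of that check, in case it helps. As far as I know, `hadoop jar` in this Hadoop line prints "Not a valid JAR" when the path does not exist or is not a regular file, so testing for exactly that before launching the job narrows things down. `check_jar` is a hypothetical helper name used only for illustration; it is not part of Hadoop or Phoenix.

```shell
# check_jar PATH: report whether PATH is an existing, readable regular file.
# Hypothetical helper for illustration; returns 0 if usable, 1 otherwise.
check_jar() {
  jar_path="$1"
  if [ ! -e "$jar_path" ]; then
    echo "missing: $jar_path"
    return 1
  fi
  if [ ! -f "$jar_path" ]; then
    echo "not a regular file: $jar_path"
    return 1
  fi
  if [ ! -r "$jar_path" ]; then
    echo "not readable: $jar_path"
    return 1
  fi
  echo "ok: $jar_path"
}
```

Run it on the same node and as the same user that submits the job, passing the absolute path you intend to give to `hadoop jar`. Anything other than "ok" means the path needs fixing before the bulk load tool is even started.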

Thanks for your response, Josh. I found the right directory for the JAR file and ran the command from there, but I am still getting an error.

I followed all the steps on this help page:

https://phoenix.apache.org/bulk_dataload.html

Here is the location of the JAR file:

/usr/hdp/2.6.0.3-8/phoenix

First I ran this command to do the bulk load via MapReduce:

hadoop jar phoenix-4.7.0.2.6.0.3-8-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table POC.DSO_375168_352817_MAP --input /poc/Raw_Zone/DSO_375168_352817.csv

Then I tried:

HADOOP_CLASSPATH=/path/to/hbase-protocol.jar:/path/to/hbase/conf hadoop jar phoenix-4.7.0.2.6.0.3-8-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table POC.DSO_375168_352817_MAP --input /poc/Raw_Zone/DSO_375168_352817.csv

I even tried this command:

hadoop jar phoenix-4.7.0.2.6.0.3-8-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool -Dfs.permissions.umask-mode=000 --table POC.DSO_375168_352817_MAP --input /poc/Raw_Zone/DSO_375168_352817.csv

I keep getting the same error:

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the locations
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:312)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:327)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:302)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:167)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:162)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:794)
    at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
    at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:366)
    at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:411)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2369)
    ... 21 more
18/02/12 10:24:43 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
18/02/12 10:24:43 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.phoenix.shaded.org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
18/02/12 10:24:43 INFO zookeeper.ClientCnxn: Opening socket connection to server 30.127.0.0/30.127.0.0:2181. Will not attempt to authenticate using SASL (unknown error)

Thanks once again for all your help @Josh Elser
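
A note on the log: the ZooKeeper client is trying localhost/127.0.0.1:2181 and being refused, which usually suggests the job is not picking up the cluster's hbase-site.xml. The /path/to/hbase-protocol.jar and /path/to/hbase/conf entries in the HADOOP_CLASSPATH command are placeholders from the documentation and need to be replaced with real paths. The bulk load page also lists a -z/--zookeeper option for passing the quorum explicitly. Below is a rough sketch that assembles such a command and prints it for review rather than running it. Every value in it is an assumption: /etc/hbase/conf as the HBase configuration directory, zk1.example.com:2181 as the quorum, and the helper name `bulkload_cmd` are all made up for illustration and must be checked against your cluster.

```shell
# All values below are placeholders or assumptions; replace them with the
# real paths and hosts from your cluster before running anything.
HBASE_PROTOCOL_JAR="${HBASE_PROTOCOL_JAR:-/path/to/hbase-protocol.jar}"  # placeholder, as in the docs
HBASE_CONF_DIR="${HBASE_CONF_DIR:-/etc/hbase/conf}"   # assumed directory containing hbase-site.xml
ZK_QUORUM="${ZK_QUORUM:-zk1.example.com:2181}"        # hypothetical ZooKeeper host:port
PHOENIX_JAR="${PHOENIX_JAR:-/usr/hdp/2.6.0.3-8/phoenix/phoenix-4.7.0.2.6.0.3-8-client.jar}"

# bulkload_cmd TABLE INPUT: print the bulk load command instead of running it,
# so the classpath and quorum can be reviewed first. Hypothetical helper.
bulkload_cmd() {
  table="$1"
  input="$2"
  printf 'HADOOP_CLASSPATH=%s:%s hadoop jar %s org.apache.phoenix.mapreduce.CsvBulkLoadTool --zookeeper %s --table %s --input %s\n' \
    "$HBASE_PROTOCOL_JAR" "$HBASE_CONF_DIR" "$PHOENIX_JAR" "$ZK_QUORUM" "$table" "$input"
}
```

Calling `bulkload_cmd POC.DSO_375168_352817_MAP /poc/Raw_Zone/DSO_375168_352817.csv` prints the full command line. If the cluster uses a non-default HBase znode parent, getting hbase-site.xml onto the classpath is likely to matter more than the -z value alone.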