BULK LOAD IN Phoenix through Remote Server

We have been trying to load a Phoenix table remotely and wanted to check whether there is any option to run the load process from outside the cluster.

Currently we are able to load the Phoenix table only when we run the load process on the cluster.

Here are the commands we run to load the Phoenix table in YARN mode:


export HADOOP_CLASSPATH=/usr/hdp/current/hbase-client/lib:/usr/hdp/current/hbase-client/conf

hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -Dfs.permissions.umask-mode=000 \
    --table POC.CIC_BULKTEST_20180212 \
    --input /poc/Raw_Zone/DSO_375168_352817.csv

Command to load the Phoenix table in standalone mode:

./ -t POC.DSO_22334808_25257_NEW /export/home/KBM_HOU/pkumar/new_test_file.csv

We run both of these commands from within the cluster, but we need to run the load process from outside it. How can the Phoenix table be exposed outside the cluster? We found that we can connect through JDBC, but there is no bulk load process over a JDBC connection. We are only able to upsert data row by row in a loop:

Connection con = DriverManager.getConnection("jdbc:phoenix:[zookeeper]");
Statement stmt = con.createStatement();
stmt.executeUpdate("create table test (mykey integer not null primary key, mycolumn varchar)");
stmt.executeUpdate("upsert into test values (1,'Hello')");
stmt.executeUpdate("upsert into test values (2,'World!')");
con.commit(); // Phoenix connections do not auto-commit by default
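If the load does have to go over JDBC, the row-by-row upserts above can at least be grouped so that each commit sends many rows at once. This is not a replacement for CsvBulkLoadTool, only a rough sketch of batched upserts. It assumes auto-commit is turned off so that Phoenix buffers the upserts on the client until commit() is called. The class name, method names and batch size are made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper, not part of Phoenix: upserts rows in committed batches.
class BatchedUpsertSketch {

    // Builds "UPSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column.
    static String buildUpsertSql(String table, int columnCount) {
        StringBuilder sql = new StringBuilder("UPSERT INTO ").append(table).append(" VALUES (");
        for (int i = 0; i < columnCount; i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        return sql.append(")").toString();
    }

    // Upserts every row through an already-open connection and commits
    // after each batchSize rows, plus once at the end for the remainder.
    // Returns the number of rows written.
    static int upsertAll(Connection con, String table, List<Object[]> rows, int batchSize)
            throws SQLException {
        if (rows.isEmpty()) {
            return 0;
        }
        con.setAutoCommit(false); // assumption: rows are buffered until commit()
        int written = 0;
        String sql = buildUpsertSql(table, rows.get(0).length);
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.executeUpdate();
                written++;
                if (written % batchSize == 0) {
                    con.commit(); // flush one full batch
                }
            }
            con.commit(); // flush whatever is left
        }
        return written;
    }
}
```

A call might look like BatchedUpsertSketch.upsertAll(con, "test", rows, 1000), where con is the connection opened above and 1000 is an arbitrary batch size that would need tuning for the row width and the cluster.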



Re: BULK LOAD IN Phoenix through Remote Server

@Prashant Kumar You just need to pass the HBase configs to the command, and then it can be run from anywhere, as long as the machine can connect to the HBase cluster. However, you also need YARN available on the source machine so that the MR job can be launched.