Large Number of regions in the region server

New Contributor

Hi Team, we recently noticed that the region count on our region servers is increasing rapidly. We have 3 region servers, and each one hosts around 3000 regions on average. The region count should be around 200-250 per server at most. We are trying to find the root cause of this issue and how to avoid it in the future. One possible reason I can think of is that we create a separate Phoenix table for each file we receive. This is a requirement, so we cannot avoid it. We need to find out whether multiple tables can share one region, or whether there is a better way to fix this issue. We are using the following code to load the Phoenix tables:

CsvBulkLoadTool csvBulkLoadTool = new CsvBulkLoadTool();
conf = HBaseConfiguration.create();
ConnectionData dt = PhoenixConnectionManager.getConnectionDataFromJNDI(phoenixJndi);
zkQuorum = dt.getQuorum(); // "rdalhdpmastd001.kbm1.loc,rdalhdpmastd002.kbm1.loc,rdalhdpmastd003.kbm1.loc";

conf = new Configuration();

conf.addResource(new Path(dt.getConfigDir() + "core-site.xml"));
conf.addResource(new Path(dt.getConfigDir() + "hbase-site.xml"));
conf.addResource(new Path(dt.getConfigDir() + "hdfs-site.xml"));
conf.addResource(new Path(dt.getConfigDir() + "yarn-site.xml"));
conf.addResource(new Path(dt.getConfigDir() + "mapred-site.xml"));

conf.set("hadoop.security.authentication", "Kerberos");
conf.set("mapreduce.framework.name", "yarn");
// conf.set("phoenix.mapreduce.import.fielddelimiter",delimiter);
UserGroupInformation.setConfiguration(conf);
UserGroupInformation.loginUserFromKeytab(dt.getPrincipal(), dt.getConfigDir() + dt.getTabFile());

conf.set("hadoop.security.authentication", "Kerberos");
conf.set("hadoop.home.dir", "/opt/payara41");
conf.set("mapreduce.framework.name", "yarn");
if (delimiter != null && !delimiter.isEmpty()) {
    conf.set("phoenix.mapreduce.import.fielddelimiter", new String(delimiter.getBytes("UTF8"), "UTF8"));
}

args.add("--input");
args.add(inputFileName);
// args.add("--delimiter"); args.add(delimiter);
args.add("--table");
args.add(parseTableNameForJustTable(targetTableName));
args.add("--schema");
args.add(parseTableNameForSchema(targetTableName));
args.add("--import-columns");
args.add(colNames);
args.add("--zookeeper");
args.add(zkQuorum);

if (filePermLine != null && !filePermLine.isEmpty()) {
    // not sure yet
}

URL[] urls = ((URLClassLoader) (conf.getClassLoader())).getURLs();
for (URL pth : urls) {
    System.out.println(pth.getPath());
}

csvBulkLoadTool.setConf(conf);
exitCode = -1;

exitCode = csvBulkLoadTool.run(args.toArray(new String[args.size()]));

Re: Large Number of regions in the region server

Expert Contributor

Unfortunately, each table in HBase requires at least one region backing it, so creating a table per file makes the region count grow with every file you load. To resolve your issue, you will have to merge the tables before loading into HBase. Hive is another component that would probably fit your use case better. There are no regions in Hive; the data is stored as files in the file system, so lots of small tables are possible.
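
As a rough illustration of what merging the tables before loading could look like: one option, assuming the incoming files share a column layout, is to load them all into a single Phoenix table whose primary key starts with a file identifier. The sketch below only covers the preparation step, tagging every CSV row with the ID of its source file before it is handed to the bulk load tool. The class and method names (SharedTableCsv, tagRows, importColumns) and the FILE_ID column are hypothetical and do not come from the original post, and the row tagging is naive: it assumes one record per line, with no quoted fields spanning lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: tags each CSV row with the ID of the file it came from,
// so that rows from many files can be loaded into one shared table.
class SharedTableCsv {

    // Prepends fileId as a new first column on every non-empty line.
    static List<String> tagRows(String fileId, List<String> csvLines, String delimiter) {
        if (fileId.contains(delimiter)) {
            throw new IllegalArgumentException("fileId must not contain the delimiter");
        }
        List<String> tagged = new ArrayList<>();
        for (String line : csvLines) {
            if (line.isEmpty()) {
                continue; // skip blank lines rather than emit a row with only an ID
            }
            tagged.add(fileId + delimiter + line);
        }
        return tagged;
    }

    // Builds the value for --import-columns, with the assumed FILE_ID column first.
    static String importColumns(String colNames) {
        return "FILE_ID," + colNames;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("1,alice", "2,bob");
        for (String row : tagRows("file_0001", rows, ",")) {
            System.out.println(row);
        }
        System.out.println(importColumns("ID,NAME"));
    }
}
```

With this approach the tagged file would be passed as --input, and the shared table name as --table, in the same argument list as in the question. Queries for a single file then filter on the leading FILE_ID column instead of selecting from a per-file table.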