In HDP 2.5, Phoenix Spark plugin cannot find table with a Namespace prefix

Contributor

I am testing code that uses the Phoenix Spark plugin to read a Phoenix table whose name has a namespace prefix (the table was created as a Phoenix table, not an HBase table), but it throws a TableNotFoundException.

The table is obviously there, because I can query it with plain Phoenix SQL through SQuirreL. In addition, querying it through Spark SQL works without any problem.

I am running on HDP 2.5, with Phoenix 4.7.0.2.5.0.0-1245.

The problem does not exist when I run the same code on an HDP 2.4 cluster with Phoenix 4.4.

Nor does the problem occur on HDP 2.5 when I query a table without a namespace prefix in its name.

The log is in the attached file: tableNoFound.txt

My testing code is listed below:

The weird thing is: if I run testSpark alone, it throws the above exception, but if I run testJdbc first, followed by testSpark, both of them work.

import java.util.HashMap;
import java.util.Map;

import org.apache.log4j.Logger;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.DataFrameReader;
import org.apache.spark.sql.SQLContext;

public class Application {
    private static final Logger log = Logger.getLogger(Application.class);

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("NMS Tuning Engine");
        JavaSparkContext sc = new JavaSparkContext(conf);

        //testJdbc(sc);
        testSpark(sc);
    }

    // Read the table through the phoenix-spark plugin.
    public static void testSpark(JavaSparkContext sc) {
        // One JVM can only have one Spark context at a time
        Map<String, String> options = new HashMap<String, String>();
        SQLContext sqlContext = new SQLContext(sc);

        String tableStr = "\"ACME:ENDPOINT_STATUS\"";
        String dataSrcUrl = "jdbc:phoenix:luna-sdp-nms-01.davis.sensus.lab:2181:/hbase-unsecure";
        options.put("zkUrl", dataSrcUrl);
        options.put("table", tableStr);
        log.info("Phoenix DB URL: " + dataSrcUrl + " tableStr: " + tableStr);

        DataFrame df = null;
        try {
            df = sqlContext.read().format("org.apache.phoenix.spark").options(options).load();
            df.explain(true);
        } catch (Exception ex) {
            log.error("sql error: ", ex);
        }

        try {
            log.info("Count by phoenix-spark plugin: " + df.count());
        } catch (Exception ex) {
            log.error("dataframe error: ", ex);
        }
    }

    // Read the same table through the generic Spark JDBC data source.
    public static void testJdbc(JavaSparkContext sc) {
        Map<String, String> options = new HashMap<String, String>();
        SQLContext sqlContext = new SQLContext(sc);

        String qry2 = "(Select ENDPOINT_ID, CITY from \"ACME:ENDPOINT_STATUS\" Where city = 'ACME City')";
        String dataSrcUrl = "jdbc:phoenix:luna-sdp-nms-01.davis.sensus.lab:2181:/hbase-unsecure";
        options.put("url", dataSrcUrl);
        options.put("dbtable", qry2);
        log.info("Phoenix DB URL: " + dataSrcUrl + "\nquery: " + qry2);

        DataFrame df = null;
        try {
            DataFrameReader dfRd = sqlContext.read().format("jdbc").options(options);
            df = dfRd.load();
            df.explain(true);
        } catch (Exception ex) {
            log.error("sql error: ", ex);
        }

        try {
            log.info("Count by JDBC: " + df.count());
        } catch (Exception ex) {
            log.error("dataframe error: ", ex);
        }
    }
}

By the way, here is what HBase shows when I list the tables:

hbase(main):031:0* list
TABLE
ACME:ENDPOINT_CONFIG
ACME:ENDPOINT_STATUS
LONG:ENDPOINTS
LONG:RADIOCHANNELS
LONG:REGIONINFORMATION
LONG:TGBSTATISTICS
SENSUS1:ENDPOINTS
SENSUS1:RADIOCHANNELS
SENSUS1:REGIONINFORMATION
SENSUS1:TGBSTATISTICS
SENSUS2:ENDPOINTS
SENSUS2:RADIOCHANNELS
SENSUS2:REGIONINFORMATION
SENSUS2:TGBSTATISTICS
SENSUS:ENDPOINTS
SENSUS:RADIOCHANNELS
SENSUS:REGIONINFORMATION
SENSUS:TGBSTATISTICS
SYSTEM.CATALOG
SYSTEM:CATALOG
SYSTEM:FUNCTION
SYSTEM:SEQUENCE
SYSTEM:STATS
TENANT
24 row(s) in 0.0090 seconds

=> ["ACME:ENDPOINT_CONFIG", "ACME:ENDPOINT_STATUS", "LONG:ENDPOINTS", "LONG:RADIOCHANNELS", "LONG:REGIONINFORMATION", "LONG:TGBSTATISTICS", "SENSUS1:ENDPOINTS", "SENSUS1:RADIOCHANNELS", "SENSUS1:REGIONINFORMATION", "SENSUS1:TGBSTATISTICS", "SENSUS2:ENDPOINTS", "SENSUS2:RADIOCHANNELS", "SENSUS2:REGIONINFORMATION", "SENSUS2:TGBSTATISTICS", "SENSUS:ENDPOINTS", "SENSUS:RADIOCHANNELS", "SENSUS:REGIONINFORMATION", "SENSUS:TGBSTATISTICS", "SYSTEM.CATALOG", "SYSTEM:CATALOG", "SYSTEM:FUNCTION", "SYSTEM:SEQUENCE", "SYSTEM:STATS", "TENANT"]

1 ACCEPTED SOLUTION

Super Collaborator

Looks like you have enabled namespace mapping. Check that the HBase conf dir is on the classpath, and use the ACME.xxxxx format for table names.
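A minimal sketch of what that change looks like in the asker's options map (the class name is hypothetical; the zkUrl is the one from the question):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the corrected phoenix-spark options: the table name uses a dot as
// the schema separator instead of the HBase colon, and has no embedded quotes.
public class PhoenixSparkOptions {
    public static Map<String, String> corrected() {
        Map<String, String> options = new HashMap<String, String>();
        options.put("table", "ACME.ENDPOINT_STATUS"); // was "\"ACME:ENDPOINT_STATUS\""
        options.put("zkUrl", "jdbc:phoenix:luna-sdp-nms-01.davis.sensus.lab:2181:/hbase-unsecure");
        return options;
    }

    public static void main(String[] args) {
        // These options would then be passed to
        // sqlContext.read().format("org.apache.phoenix.spark").options(...).load()
        System.out.println(corrected().get("table"));
    }
}
```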


6 REPLIES

Super Collaborator

Looks like you have enabled namespace mapping. Check that the HBase conf dir is on the classpath, and use the ACME.xxxxx format for table names.

Contributor

Yes, namespace mapping is enabled, and the HBase conf dir is on the classpath.

The ACME.xxxxx format does not work either; in fact, even testJdbc throws a TableNotFoundException when the ACME.xxxxx format is used for the table name.

Super Collaborator

Try

String tableStr = "ACME.ENDPOINT_STATUS";

No double quotes, and a dot between the schema name and the table name. Hopefully that will work.

One more note: if you create a table using double quotes, put the schema and the table name in quotes separately, like

create table "ACME"."ENDPOINT_STATUS"

In that case you will need to quote the table name as you do in your sample. There seems to be a bug in the Phoenix code: if you quote them together, Phoenix creates a table with an empty schema and "schema.table" as the table name. You can check this by querying the SYSTEM.CATALOG table.
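The difference between the two quoting styles can be checked directly in SYSTEM.CATALOG. A sketch of such a check (the column names are the standard Phoenix catalog columns; the commented results are what the bug described above would produce):

```sql
-- Separate quotes: schema and table name are split correctly.
CREATE TABLE "ACME"."ENDPOINT_STATUS" (ID BIGINT PRIMARY KEY);

-- One quoted string: the whole string becomes the table name.
-- CREATE TABLE "ACME:ENDPOINT_STATUS" (ID BIGINT PRIMARY KEY);

-- Inspect how Phoenix recorded each table:
SELECT TABLE_SCHEM, TABLE_NAME
FROM SYSTEM.CATALOG
WHERE TABLE_NAME LIKE '%ENDPOINT_STATUS%';
-- First form:  TABLE_SCHEM = 'ACME', TABLE_NAME = 'ENDPOINT_STATUS'
-- Second form: TABLE_SCHEM is empty, TABLE_NAME = 'ACME:ENDPOINT_STATUS'
```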

Contributor

Thank you for the reply. After changing to create the table with

create table ACME.ENDPOINT_STATUS

the phoenix-spark plugin seems to work. I also found some weird behavior.

If I do both of the following

create table ACME.ENDPOINT_STATUS
create table "ACME:ENDPOINT_STATUS"

both tables show up in Phoenix: the first with schema ACME and table name ENDPOINT_STATUS, but the latter with no schema and table name ACME:ENDPOINT_STATUS.

However, in HBase I see only one table, ACME:ENDPOINT_STATUS. In addition, an upsert into ACME.ENDPOINT_STATUS shows up in the other table, and vice versa.

Namespace mapping was introduced in Phoenix 4.8; did Hortonworks backport the feature into HDP 2.5 and Phoenix 4.7?

Super Collaborator

The only way Phoenix supports defining a table with a schema is a dot between the two names. Colons are used internally by HBase, and Phoenix performs that transformation on its own. As I mentioned earlier, Phoenix has a bug: the schema and the table name must each have their own double quotes, otherwise the table is created incorrectly, with an empty schema. Moreover, at the HBase layer this causes the ambiguous physical table mapping you are seeing in your example. So my recommendation is to quote the schema and the table name separately, or not to use quotes at all.

And you are right, the feature was backported from Phoenix 4.8.

Contributor

It is very clear, thanks.