Created 03-22-2018 01:45 PM
Hello Everyone
This is my first post (and first question) to the worldwide Hortonworks Community, so please be indulgent if something is not clear or if some technical details are missing. I work in France for Natixis, which decided three years ago to implement a data lake solution based on the HDP distribution.
We are now on HDP 2.6.4. We develop only on Spark 2.2 and want to leverage Phoenix to read HBase data, so we create our tables from the Phoenix CLI and let Phoenix drive the creation in HBase.
Our main requirement is to use namespaces in HBase, and consequently schemas in Phoenix, with HBase Ranger security policies based on the namespace and not only on the table name. We have already taken into account the fact that the Phoenix driver transforms every identifier to uppercase (except when we use the double-quote trick, "[name]"). Our HBase security policies in Ranger are declared in uppercase and are set as "LAK:*", where LAK is our target namespace; the corresponding schema in Phoenix is therefore "LAK".
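For readers unfamiliar with this behavior, the identifier rule described above can be sketched as a tiny function (this is only an illustration of the case-folding rule, not Phoenix code):

```python
def normalize_identifier(name: str) -> str:
    """Mimic Phoenix identifier handling: unquoted identifiers are
    folded to uppercase; double-quoted identifiers keep their exact case."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        return name[1:-1]  # quoted: case preserved, quotes stripped
    return name.upper()    # unquoted: folded to uppercase

print(normalize_identifier("lak.my_table"))    # LAK.MY_TABLE
print(normalize_identifier('"lak.my_table"'))  # lak.my_table
```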
=> Now that all of that has been said, here is our problem 🙂 ...
First, both namespace-mapping options were activated in the Ambari configuration.
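The original post does not list the options by name; presumably they are the two standard Phoenix namespace-mapping properties, which must be set identically on both the client and the server side. A sketch of the corresponding hbase-site.xml entries:

```xml
<!-- hbase-site.xml (client AND server side) -->
<property>
  <name>phoenix.schema.isNamespaceMappingEnabled</name>
  <value>true</value>
</property>
<property>
  <name>phoenix.schema.mapSystemTablesToNamespace</name>
  <value>true</value>
</property>
```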
Then, our test was made inside the Spark 2.2 shell, with the following classpath updates (to be able to use the new phoenix-spark2 driver). Be careful about the order: it is really important to put phoenix-spark2 before phoenix-client, otherwise the dependencies will not be resolved correctly:
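The exact jar list was not included in the post; on an HDP cluster the launch command presumably looks something like the following (the paths are assumptions for a default HDP 2.6.4 install, so adjust them to your cluster; note phoenix-spark2 listed before phoenix-client):

```shell
# phoenix-spark2 MUST come before phoenix-client on every classpath
PHOENIX_JARS=/usr/hdp/current/phoenix-client/phoenix-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar

spark-shell \
  --jars /usr/hdp/current/phoenix-client/phoenix-spark2.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar \
  --conf "spark.driver.extraClassPath=${PHOENIX_JARS}" \
  --conf "spark.executor.extraClassPath=${PHOENIX_JARS}"
```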
With a simple call:
spark.sqlContext.load("org.apache.phoenix.spark", Map("table" -> "schema.table", "zkUrl" -> "zkHosts:2181")).show
We got the following error:
Caused by: org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=SCHEMA.TABLE
But if we query a table without a namespace (created in the default Phoenix schema):
spark.sqlContext.load("org.apache.phoenix.spark", Map("table" -> "table", "zkUrl" -> "zkHosts:2181")).show
We get the correct result and can query the resulting DataFrame in Spark.
=> My question is:
Has anyone succeeded in reading, from Spark 2.2, a Phoenix table created in a specific schema (as SCHEMA.TABLE), with active HBase Ranger security policies based on the namespace (the equivalent of the Phoenix schema)?
All the posts and blogs on Phoenix (https://phoenix.apache.org/phoenix_spark.html) are written for Spark 1.6; it seems that phoenix-spark2 is only compiled to work on Spark 2 but does not correctly take the namespace/schema problem into account.
I do not know if my hypothesis is correct or if we did something the wrong way. If someone has workarounds, or just some ideas to get through this and solve our problem, do not hesitate 🙂 !
Thx,
Julien
Created 07-09-2018 10:03 AM
Hi Julien,
We recently upgraded from HDP 2.5.3.0 to HDP 2.6.5.0, saw this feature on the Phoenix page, and decided to enable it. After that I could not access Phoenix via the shell or via SQuirreL; I am getting the following error:

Error: ERROR 2006 (INT08): Incompatible jars detected between client and server. Ensure that phoenix.jar is put on the classpath of HBase in every region server: SYSTEM.CATALOG is disabled. (state=INT08,code=2006)

I tried to dig deep but couldn't find any help, and from what I can see the jar files are in order. I also tried to enable the table in HBase and ran the sqlline.py script, which gave this other error:

Error: org.apache.hadoop.hbase.snapshot.SnapshotExistsException: Snapshot '_UPGRADING_TABLE_SYSTEM.CATALOG' already stored on the filesystem.

But when I list snapshots in HBase, I can't find the snapshot it is talking about. Is there anything I am doing wrong, maybe?
Created 10-15-2018 11:35 AM
Hi Chavillon Julien,
Please use the expression below instead of "schema.table":
"\"schema.table\""
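In other words, the table name has to reach Phoenix as a double-quoted identifier so that it is not folded to uppercase. A sketch of the original Spark call with the escaped quotes (table and ZooKeeper names are the placeholders from the original post; this needs a live cluster to actually run):

```scala
// The embedded \" make Phoenix treat schema.table as a quoted
// identifier, so its case is preserved instead of being upper-cased.
val df = spark.sqlContext.load(
  "org.apache.phoenix.spark",
  Map("table" -> "\"schema.table\"", "zkUrl" -> "zkHosts:2181"))
df.show()
```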
Let me know if you have any doubts.
Thanks,
Tarak.