Created 11-04-2017 09:05 PM
I am a Hadoop newbie who has been fiddling with the Sandbox, and I have many questions. Starting with the most basic one:
I have seen that from the CLI/shell one can view `/usr/hdp/current/spark2-thriftserver/conf/hive-site.xml` or `/usr/hdp/current/spark2-client/conf/hive-site.xml` and find the Thrift Server port (10016) listed under the port property. Is this the efficient/preferred way to do it?
Further, I am doing this in order to set up an ODBC Spark SQL connection to a visualization tool, Spotfire.
I have successfully connected to the Hive data tables in HiveServer2 from Spotfire on my laptop at port 10000 by downloading the Apache Hive connector. Now I am hoping to do the same with the Spark ODBC driver. Any hints or advice?
I am a newbie to HDP and am just trying to learn to work with data in the Hadoop file system, but frankly I don't know what the reason is to use one connector over the other, other than that I'd like to be able to connect with the different methods. (I am an R user and succeeded in getting the Hive tables into R as well with ODBC connectors; anything I can do in R running on my laptop I can also use with Spotfire, which is what I am currently using for analytics.) A discussion of or answer to this point would be much appreciated.
Then there are some more challenging things I'd like to do. I understand that I can install R on the HDP sandbox and carry out computations there, and I have seen the SparkR tutorial on predicting airline delays. But if I can connect to the data in HDFS from outside the HDP sandbox, I can start leveraging R's power through the Spotfire client's built-in R engine with data from the Hadoop file system. (Apparently Spotfire Server has many more data access/connectivity options, but I don't have access to Spotfire Server.) With that in mind, some of the things I am trying to get to are:
Thanks. I don't know how naive my questions are, but bear with me; any clarification or attempt at one will be really appreciated.
Best
Created 11-04-2017 09:37 PM
The easiest way is to find the port using the Ambari UI:
Log in to Ambari UI --> Spark2 --> Configs (tab) --> Advanced (sub tab) --> Advanced spark2-hive-site-override
(OR)
Log in to Ambari UI --> Spark --> Configs (tab) --> Advanced (sub tab) --> Advanced spark-hive-site-override
.
The default Spark Thrift Server port is 10015 (10016 for Spark2). To specify a different port, navigate to the hive.server2.thrift.port setting in the "Advanced spark-hive-site-override" category of the Spark configuration section ("Advanced spark2-hive-site-override" for Spark2) and update the setting with your preferred port number.
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_spark-component-guide/content/config-sts...
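If you would rather stay on the command line, the value you read by hand from hive-site.xml can also be pulled out with a small shell function. This is only a sketch: `get_hive_site_property` is a hypothetical helper name (not something shipped with HDP), and it assumes each property keeps its name and value elements in that order, as in the usual Hadoop-style site files.

```shell
#!/bin/sh
# Hypothetical helper (not part of HDP): print the <value> that follows
# <name>KEY</name> in a Hadoop-style *-site.xml file.
# Usage: get_hive_site_property FILE KEY
get_hive_site_property() {
    awk -v key="$2" '
        # Remember that we have seen the property name we are looking for.
        index($0, "<name>" key "</name>") { found = 1 }
        # The first <value> element at or after that line is the answer.
        found && match($0, /<value>[^<]*<\/value>/) {
            # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example call, using the path from the question:
# get_hive_site_property /usr/hdp/current/spark2-thriftserver/conf/hive-site.xml hive.server2.thrift.port
```

The function prints nothing if the property is not present in the file, so an empty result means the default port is probably in effect.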
.
You can also use the Ambari API to find the port with a curl call, as follows:
# curl -u admin:admin -i -H 'X-Requested-By: ambari' -X GET "http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark2-hive-site-override"
(OR)
# curl -u admin:admin -i -H 'X-Requested-By: ambari' -X GET "http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark-hive-site-override"
The above command lists the available tags. Take the latest tag ID (like "tag=version1509830820763") and run the command again with that tag ID, as follows:
# curl -u admin:admin -H 'X-Requested-By: ambari' -X GET "http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark2-hive-site-override&tag=version1509830820763"
{
  "href" : "http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark2-hive-site-override&tag=version1509830820763",
  "items" : [
    {
      "href" : "http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark2-hive-site-override&tag=version1509830820763",
      "tag" : "version1509830820763",
      "type" : "spark2-hive-site-override",
      "version" : 2,
      "Config" : {
        "cluster_name" : "Sandbox",
        "stack_id" : "HDP-2.6"
      },
      "properties" : {
        "hive.metastore.client.connect.retry.delay" : "5",
        "hive.metastore.client.socket.timeout" : "1800",
        "hive.server2.enable.doAs" : "false",
        "hive.server2.thrift.port" : "10016",
        "hive.server2.transport.mode" : "binary"
      }
    }
  ]
}
.
NOTE: Please make sure that you put the whole URL inside quotation marks, because it contains the & symbol, which the shell would otherwise interpret itself.
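To avoid copying the tag and the port by hand, the output of the two curl calls can be post-processed with standard tools. The following is a rough sketch, not an Ambari feature: `latest_version_tag` and `extract_thrift_port` are hypothetical helper names, and the first one assumes that tags have the form "version" followed by a timestamp, so that the highest number is the latest one (a tag such as INITIAL is ignored).

```shell
#!/bin/sh
# Hypothetical helper: read the tag listing (JSON) on stdin and print the
# "versionNNN" tag with the highest number.
# Assumption: a higher number means a newer configuration version.
latest_version_tag() {
    # Put every JSON member on its own line, keep the digits of each
    # "tag" : "versionNNN" member, and pick the numerically largest one.
    tr ',{}' '\n\n\n' |
        sed -n 's/.*"tag"[[:space:]]*:[[:space:]]*"version\([0-9][0-9]*\)".*/\1/p' |
        sort -n | tail -n 1 | sed 's/^/version/'
}

# Hypothetical helper: read a configuration (JSON) on stdin and print the
# value of hive.server2.thrift.port.
extract_thrift_port() {
    sed -n 's/.*"hive\.server2\.thrift\.port"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)".*/\1/p' |
        head -n 1
}

# Example against a sandbox (left commented out because it needs a running
# Ambari server; host, cluster name and credentials are the ones used above):
# BASE='http://localhost:8080/api/v1/clusters/Sandbox/configurations?type=spark2-hive-site-override'
# TAG=$(curl -s -u admin:admin -H 'X-Requested-By: ambari' "$BASE" | latest_version_tag)
# curl -s -u admin:admin -H 'X-Requested-By: ambari' "${BASE}&tag=${TAG}" | extract_thrift_port
```

Both helpers are plain text filters, so they can be tried on a saved copy of the curl output before pointing them at a live server.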
Another option is to use the configs.sh script. You can find the port by running the command below from the Ambari server host:
For Spark2:
# /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin get localhost Sandbox spark2-hive-site-override
OUTPUT
--------
USERID=admin
PASSWORD=admin
########## Performing 'GET' on (Site:spark2-hive-site-override, Tag:version1509830820763)
"properties" : {
"hive.metastore.client.connect.retry.delay" : "5",
"hive.metastore.client.socket.timeout" : "1800",
"hive.server2.enable.doAs" : "false",
"hive.server2.thrift.port" : "10017",
"hive.server2.transport.mode" : "binary"
}
.
For old Spark:
# /var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin get localhost Sandbox spark-hive-site-override
OUTPUT
--------
USERID=admin
PASSWORD=admin
########## Performing 'GET' on (Site:spark-hive-site-override, Tag:INITIAL)
"properties" : {
"hive.metastore.client.connect.retry.delay" : "5",
"hive.metastore.client.socket.timeout" : "1800",
"hive.server2.enable.doAs" : "false",
"hive.server2.thrift.port" : "10015",
"hive.server2.transport.mode" : "binary"
}
.
NOTE: In the above commands, please replace "Sandbox" with your HDP cluster name and "localhost" with your Ambari server hostname.
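Once you know the port, it may help to confirm that the Spark Thrift Server answers on it before configuring the ODBC driver in Spotfire. A minimal sketch, assuming binary transport mode (as in the output above), the "default" database, and that beeline is available on the sandbox; `thrift_jdbc_url` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: build a HiveServer2-style JDBC URL for binary transport.
# Usage: thrift_jdbc_url HOST PORT [DATABASE]
thrift_jdbc_url() {
    # The database falls back to "default" when no third argument is given.
    printf 'jdbc:hive2://%s:%s/%s\n' "$1" "$2" "${3:-default}"
}

# Example (needs a running Spark2 Thrift Server, so it is left commented out):
# beeline -u "$(thrift_jdbc_url localhost 10016)" -e 'show tables;'
```

If beeline can list the tables on that port, the same host and port should be the ones to enter in the ODBC driver settings.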
.