About elserj

elserj · ‎01-26-2016

What version of HDP are you using? It seems like there is a mismatch between your client and server libraries.

elserj · ‎01-18-2016

@vijaya inturi Try stopping the service via curl first. curl -u admin:admin -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop ACCUMULO via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://`hostname -f`:8081/api/v1/clusters/cl1/services/ACCUMULO

elserj · ‎01-18-2016

I'm not sure. Maybe try creating a new question and tagging it specifically as Ambari.

elserj · ‎01-14-2016

The default Accumulo instance name is "hdp-accumulo-instance" IIRC. Try that.

elserj · ‎01-14-2016

Double check that your Accumulo instance name is actually "accumulo" as you provided. You can check the Accumulo Monitor page for this information (should be displayed at the top of the web page).

elserj · ‎01-11-2016

Awesome! I've been meaning to get in touch with you and start sending you some code for that. The work you did already looks super cool.

elserj · ‎01-11-2016

For example, take a look at the recently updated FetchResponse docs. The missingStatement attribute signifies that the Statement in the server's memory is missing. Likewise, the missingResults attribute signifies that the ResultSet in the server's memory is missing. It is up to the client to hold on to the necessary information and recover from these cases.

elserj · ‎01-11-2016

The big pieces needed to recover the state for a query automatically are the connection and statement missing in the server's memory (as you point out). In 1.5 (I believe), I added the ability to pass back specific context on the relevant RPC messages stating when this necessary state was missing in the server. This allows the client to send the necessary RPC to recreate the state and try to fetch the results again.

elserj · ‎01-11-2016

Introduction Apache Phoenix is a relational database "layer" built on top of Apache HBase, a NoSQL key-value data store. Until recently, the only provided entry-point for users to Phoenix was by their JDBC driver. This implicitly limited client applications to JVM-based languages. The Phoenix QueryServer was built using a sub-project of the Apache Calcite project named Avatica. Avatica is a JDBC-over-HTTP library. It includes both an HTTP server and a JDBC driver. The HTTP server has its own wire API which can be leveraged by clients in any language communicate with the backend JDBC server. Deployment One of the goals of the Phoenix QueryServer was to follow typical deployment strategies of HTTP servers. The primary adopted goal is that the QueryServer aims to be stateless. While this is not entirely the case as QueryServer does maintain internal state, the server and Java client are written in such a way that any missing state in the PQS instance can be automatically recovered by the client. Because PQS is designed to be stateless, it also scales horizontally. The general deployment recommendation is that PQS is installed on every node running an Apache HBase RegionServer. This enables users to control Phoenix throughput by increasing or decreasing the number of nodes which PQS is deployed on, just as one would do with HBase. The resource consumption of the PQS is minor when idle, so deploying it everywhere will have little impact. There are, however, considerations which might impact placement of PQS on every RegionServer node. For example, not all RegionServer nodes in a cluster may be accessible to external clients, partially or fully, for security reasons. It's also possible that clients are running outside of the cluster's perimeter. In this case, specific nodes might be designated as externally facing, again for security reasons, and the PQS instances would have to be limited to those nodes. High Availability Like deployments of HTTP servers, the use of an HTTP load balancer can also be used to achieve high availability. There are two main approaches to take with load balancers, each with their own pros and cons. A generic load-balancer As mentioned earlier, the provided Java client to Avatica is capable of recovering from any missing server-side state. This is the basis which allows PQS to be used behind any generic HTTP load balancer, such as HAProxy, Nginx, or the Apache HTTP Server (httpd). To efficiently implement this approach, it is strongly recommended to enable the appropriate configuration for the load balancer which will always route a client to a given server as long as the distribution of servers does not change, commonly known as "sticky sessions". This configuration will ensure that clients do not spend an excessive amount of time recreating server-side state. HAProxy example PQS has been tested using HAProxy 1.5.14. The following is an example configuration file for HAProxy. global maxconn 256 defaults mode http timeout connect 5000ms timeout client 60000ms timeout server 60000ms frontend http-in bind *:8888 default_backend servers backend servers balance source server queryserver1 server1:8765 check server queryserver2 server2:8765 check Load-balancers with client-driven routing Another potential strategy for using PQS behind a load balancer is by implementing client-driven routing. This is presently only a theoretical approach but still a sound one. This approach implies the creation of a custom client to PQS by implementing Avatica's protocol which passes an identifier to the load balancer to control how the request is routed to the backend QueryServers. Avatica's wire API includes an attribute on each message with the address of the backend server. Clients can provide this attribute to their load balancer to influence the routing decision of their subsequent requests This also requires that the client has the ability to recover when a backend server dies. General concerns Both of the listed approaches suffer from an issue that when, given a query over a static dataset, the ordering of the results may differ between executions. Both the provided Java implementation and the hypothetical client outlined in the client-driven routing example rely on a regular ordering of a result set between executions. This reliance is because the natural means to implement the client recovery during a query is based on the offset into the results. If the offset for a query's results varies, it is possible that clients will miss rows or receive duplicate rows. Phoenix provides the ability to force the required ordering via the property `phoenix.query.force.rowkeyorder` in hbase-site.xml. When this property is set to `true`, it will ensure that all queries, even those without an ORDER BY clause, will generate results in the same order.

elserj · ‎01-04-2016

@Arvind K Jajoo This code looks reasonable. Can you please elaborate what does not work with 0.98.4?

Online	Offline
Last Visited	‎07-01-2022 02:44 PM

Member Since	‎07-17-2019 08:58 AM
Last Visited	‎07-01-2022 02:44 PM
Posts	738
Kudos received	429

Cloudera Community

Re: Why can't Object Stores like Amazon S3 be used...

Re: Not a host:port pair: PBUF, how to resolve?

Re: versioning question in hbase

Re: Phoenix query call from java on larger data se...

Re: Revoke permissions to a superuser on Hbase

Re: Insert multiple rows with one request using ph...

Re: Anyone familiar with accumulo word count examp...

Re: Anyone familiar with accumulo word count examp...

Re: Anyone familiar with accumulo word count examp...

Re: Anyone familiar with accumulo word count examp...

Re: Deploying the Phoenix Query Server in producti...

Re: Deploying the Phoenix Query Server in producti...

Re: Deploying the Phoenix Query Server in producti...

Deploying the Phoenix Query Server in production e...

Re: What is correct strategy for hbase Kerberos Lo...