Member since 06-13-2016
20 Posts
0 Kudos Received
0 Solutions
11-07-2017
12:38 AM
Thanks. Glad to know that it helped.
10-25-2017
07:32 AM
1 Kudo
1 data node, 8 CPUs, 30 GB RAM. Some assumptions: you have 8 containers in your cluster. 1. Even if you have only 2 GB of data, all 8 containers will be consumed by the job. 2. If two jobs run in parallel, they will slow down significantly if preemption happens. You should tune the queue, but at least add one more node to get any significant benefit from parallelism.
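The container count above can be sanity-checked with a little arithmetic. This is only a rough sketch: the per-container sizes (1 vcore, 3 GB) are assumptions chosen for illustration, not values from the cluster in question, and in practice YARN is given less than the node's full RAM.

```python
def max_containers(node_vcores, node_mem_gb,
                   container_vcores=1, container_mem_gb=3.0, nodes=1):
    """Upper bound on concurrent containers, limited by the scarcer resource.

    container_vcores and container_mem_gb are assumed sizes, not cluster settings.
    """
    by_cpu = node_vcores // container_vcores
    by_mem = int(node_mem_gb // container_mem_gb)
    return nodes * min(by_cpu, by_mem)


one_node = max_containers(8, 30)            # CPU-bound: 8 containers
two_nodes = max_containers(8, 30, nodes=2)  # a second node doubles the slots
per_job_if_shared = one_node // 2           # two equal jobs on one node

print(one_node, two_nodes, per_job_if_shared)  # prints: 8 16 4
```

With one node, two jobs sharing the queue equally get about 4 containers each, which is why a second node matters more than queue tuning alone.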
10-13-2017
10:20 PM
1 Kudo
Hello @Saravanan Ramaraj In response to an earlier question, I would say "yes" to converting your JSON log data to ORC, provided the logs are not complex data structures that can vary. Unlike @Bala Vignesh N V, I have had positive experiences comparing an ORC file of > 1 billion rows with its equivalent RDBMS (Teradata) version. ORC's response time of 20-30 seconds was judged to be "competitive". LLAP would make this even better. I'm a big fan of ORC.
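One way to check the "not complex data structures which can vary" condition before converting is to scan a sample of the logs. The sketch below is one strict interpretation of that condition (flat records, identical fields and types); the field names and sample lines are made up for illustration.

```python
import json


def schema_of(record):
    """Map each top-level field to the name of its Python value type."""
    return {key: type(value).__name__ for key, value in record.items()}


def has_stable_flat_schema(lines):
    """True if every JSON log line has the same fields with the same scalar types."""
    first = None
    for line in lines:
        schema = schema_of(json.loads(line))
        if any(t in ("dict", "list") for t in schema.values()):
            return False  # nested structure: treated as "complex" in this sketch
        if first is None:
            first = schema
        elif schema != first:
            return False  # fields or types vary between records
    return first is not None


# Hypothetical sample log lines, not taken from the original question.
sample_logs = [
    '{"ts": "2017-10-13T22:20:00", "level": "INFO", "latency_ms": 12}',
    '{"ts": "2017-10-13T22:20:01", "level": "WARN", "latency_ms": 48}',
]
print(has_stable_flat_schema(sample_logs))  # prints: True
```

If the check fails, the logs may still be convertible, but the table definition will need more thought.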
07-14-2016
03:12 AM
@Saravanan Ramaraj have you looked into Apache Knox? The Apache Knox Gateway is a REST API gateway for interacting with Apache Hadoop clusters. It is designed as a reverse proxy, with pluggability in mind both for policy enforcement (through providers) and for the backend services it proxies requests to. Knox provides a single access point for all REST interactions with Apache Hadoop clusters, and in that capacity it aids the control, integration, monitoring and automation of critical administrative and analytical needs of the enterprise.
Knox covers: Authentication (LDAP and Active Directory Authentication Provider), Federation/SSO (HTTP Header Based Identity Federation), Authorization (Service Level Authorization), and Auditing.
For authorization you can then use Apache Ranger, which offers a centralized security framework to manage fine-grained access control over Hadoop data access components. Coupled with Kerberos, your cluster will be secured: connections are authenticated using Kerberos, and Ranger provides authorization for which services each user can access. Finally, Knox will be your perimeter security.
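To make the "single access point" idea concrete, here is a minimal sketch of how a client might address WebHDFS through Knox instead of talking to the NameNode directly. The host name, topology name and credentials are hypothetical; port 8443 and the "gateway" path are assumed defaults, so check your own gateway configuration. The request is only built, not sent.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def knox_webhdfs_request(host, topology, hdfs_path, user, password,
                         port=8443, gateway_path="gateway", op="LISTSTATUS"):
    """Build (but do not send) a WebHDFS request routed through a Knox gateway.

    port and gateway_path are assumed defaults; host/topology are placeholders.
    """
    url = (f"https://{host}:{port}/{gateway_path}/{topology}"
           f"/webhdfs/v1{quote(hdfs_path)}?{urlencode({'op': op})}")
    # HTTP Basic credentials, which Knox can validate against LDAP/AD.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


req = knox_webhdfs_request("knox.example.com", "default", "/tmp",
                           "guest", "guest-password")
print(req.full_url)
# prints: https://knox.example.com:8443/gateway/default/webhdfs/v1/tmp?op=LISTSTATUS
```

The client only ever sees the gateway URL; the cluster's internal host names and Kerberos setup stay behind Knox.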
06-14-2016
09:25 AM
Thanks Deepesh for your immense response.