Member since 09-24-2015
38 Posts
41 Kudos Received
6 Solutions
My Accepted Solutions
Views | Posted |
---|---|
4458 | 10-18-2017 01:27 PM |
28935 | 04-22-2016 09:23 PM |
1642 | 12-22-2015 02:41 PM |
2274 | 12-22-2015 12:54 PM |
4352 | 12-08-2015 03:44 PM |
12-08-2015
03:44 PM
The solution was to remove the offending jars from the /extlib-daemon folder. It turns out these jars are in there for Ranger and not for Storm itself. Because we are not using Ranger with Storm, removing them was not a problem.
12-07-2015
06:15 PM
Hi guys, using Maven is not allowed by the Legal team here; we can only use Gradle, and only what Gradle packages out of the box, no add-ons. As a result, we cannot relocate package names to prevent collisions the way we could with Maven Shade. Yes @schintalapani, we are using Storm 0.10. This Jira looks like EXACTLY what we need: https://issues.apache.org/jira/browse/STORM-129
12-07-2015
02:05 PM
1 Kudo
Hi all, we have a Storm topology with a bolt that is required to go through the proxy. To do so we are using httpcore and httpclient, but the versions we are using are newer than the versions Storm ships with. The latest version has some new methods that we use and that the old version packaged with Storm does not have. This in turn causes our bolt to fail repeatedly with a 'NoSuchMethod' error. We believe this happens because the bolt picks up the older version on the classpath rather than the newer version that we packaged into the fat jar with the topology. In MapReduce we can set the job to respect the user classpath first. Is there any such feature in Storm that we can use to get around this, other than implementing our own ClassLoader?
Labels: Apache Storm
10-26-2015
04:55 PM
2 Kudos
This will depend on how fast they wish to have this information. You can get historical info, so you could do this once a day, once an hour, or once every 5 minutes, depending on your use for it. It's possible to pass parameters to the REST call you're making that specify a time range, so you could, for example, get all the logs from the last hour. So you could do something like:
http://RMHOST:8088/ws/v1/cluster/apps?startedTimeBegin=1445351681496&startedTimeEnd=1445878421
See the docs: https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API
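As a rough sketch of the "last hour" case, here is one way to build that URL from a script. The function name `apps_url` and its arguments are just illustrative, not part of any YARN client library, and `RMHOST` is a placeholder for your ResourceManager host. Note that the linked docs describe `startedTimeBegin` and `startedTimeEnd` as milliseconds since epoch, so the sketch works in milliseconds for both.

```python
import time
from urllib.parse import urlencode


def apps_url(rm_host, window_seconds, now_ms=None, port=8088):
    """Build a Cluster Applications API URL for apps started in the last window.

    rm_host and port are placeholders for your ResourceManager address.
    Both time bounds are expressed in milliseconds since epoch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    params = {
        "startedTimeBegin": now_ms - window_seconds * 1000,
        "startedTimeEnd": now_ms,
    }
    return f"http://{rm_host}:{port}/ws/v1/cluster/apps?{urlencode(params)}"


if __name__ == "__main__":
    # Applications started during the last hour.
    print(apps_url("RMHOST", 3600))
```

You could run something like this from cron at whatever interval fits (daily, hourly, every 5 minutes) and fetch the resulting URL with the HTTP client of your choice.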
10-02-2015
03:56 PM
Just a note that YARN may need to execute things that are placed into its local cache on the NMs; it's not purely data storage. This is why you can't have YARN-related directories mounted as NOEXEC in /etc/fstab.
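If you want to check a node for this, a minimal sketch along these lines may help. It parses fstab-style text and reports whether the mount that covers a given directory carries the `noexec` option. The helper names and the sample paths (`/hadoop`, `/hadoop/yarn/local`) are made up for illustration; substitute whatever your `yarn.nodemanager.local-dirs` actually points at.

```python
import os


def parse_fstab(text):
    """Map each mount point in fstab-style text to its list of mount options."""
    mounts = {}
    for raw in text.splitlines():
        # Drop comments, then split into whitespace-separated fields.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 4:
            mounts[fields[1]] = fields[3].split(",")
    return mounts


def covering_mount(path, mounts):
    """Return the longest mount point that contains path, or None."""
    path = os.path.normpath(path)
    best = None
    for mount_point in mounts:
        norm = os.path.normpath(mount_point)
        if path == norm or path.startswith(norm.rstrip("/") + "/"):
            if best is None or len(norm) > len(os.path.normpath(best)):
                best = mount_point
    return best


def is_noexec(path, mounts):
    """True if the mount covering path has the noexec option set."""
    mount_point = covering_mount(path, mounts)
    return mount_point is not None and "noexec" in mounts[mount_point]


if __name__ == "__main__":
    with open("/etc/fstab") as handle:
        table = parse_fstab(handle.read())
    # Hypothetical YARN local dir; replace with your own.
    print(is_noexec("/hadoop/yarn/local", table))
```

This only looks at /etc/fstab, so it will not see options applied by a manual mount; treat it as a quick sanity check rather than a definitive answer.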
10-02-2015
12:54 PM
I think we are going to go the route of having 2 HS2 instances, one with Ranger integration and one without. No matter what we do we will end up with 1000s of policies, as there are potentially 1000s of different data sources being onboarded to the system. Most of these don't require anything more than HDFS security, so having to enter them into Hive for access is a management nightmare. By having a 'Ranger HS2 with doAs=false' and a 'Non-Ranger HS2 with doAs=true', we can at least decide which tables require Ranger benefits, then force ownership of that data to Hive and create the policies as a one-off, rather than creating policies for every new data source that gets onboarded. Today, after a simple migration from ACLs, we had 17+ pages of policies in Ranger, which we all thought was crazy when most of them don't need Ranger benefits. Just to note, this is already using wildcards for resources and multiple users and groups per policy.
10-02-2015
12:39 PM
1 Kudo
Can't we just overload the HDFS policies? For example, at a client we are using doAs=false so we can use column security via Hive, but the 'application' that loads the data also has an HDFS policy, so it can directly run MR jobs and the like to get the data loaded for end Hive users.
10-01-2015
11:07 PM
2 Kudos
So what we have done, for example, is make some assumptions about who will access the data and how. We break this down into 2 groups of users: Analysts and Power Users. Analysts (90% of users) ONLY access the data via Hive; they never go through HDFS or use any other tools. Analysts also need column-level security in place to ensure they only access data related to their clearance, i.e. public, pii, spii. Power Users (10% of users) can access the datasets with any tool, from Hive or HDFS, and have no restrictions on the columns they can see. The service application also counts as a Power User, as it deals with the ingest and prepping of the data. To facilitate this we did a few things:
- Hive owns all the data on HDFS
  - We no longer have to manage policies per user at an HDFS level (due to the below)
- Hive doAs=false (no impersonation)
  - This enables Ranger to decide if users can access tables or not
  - Jobs now run as Hive; with Hive owning the data there is no issue
- Power Users get Ranger HDFS policies (think overloaded)
  - This lets them access anything from HDFS
  - We also give Power Users Select * policies in Hive so they can query any column

Hope this helps
10-01-2015
06:09 PM
As per the title. Is there a way I can have Ranger set up against Hive but, if a policy does not exist, pass through to the HDFS permissions rather than immediately deny? We had been considering setting up two different HiveServers, one using storage-based auth and another using Ranger, but we are not sure if that's immediately possible due to the way Ambari now makes this a toggle switch. We only have a handful of tables that really require Ranger for column authorization (and of course this data on HDFS will be owned by Hive). All the other tables don't require column authorization and make extensive use of HDFS extended ACLs, so there are many, many users already added to these storage policies. The ops team does not really want to migrate these unless absolutely required: they had started writing a script to do this in dev, and we ended up with 1000s of policies that just make a total mess of everything. Thoughts are welcome. @Hive @Security @Ranger
Labels: Apache Hive, Apache Ranger
09-26-2015
01:18 PM
There is also a list of all the contrib views on GitHub: https://github.com/apache/ambari/tree/trunk/contrib/views