About kbadani

kbadani · ‎05-11-2017

@Jon Page Can you please refer to my answer here: https://community.hortonworks.com/questions/92000/error-running-zeppelin-pyspark-interpreter-with-py.html#answer-92029

kbadani · ‎05-02-2017

@Ekantheshwara Basappa Can you try adding this config in your [urls] section and then restart Zeppelin?

/api/notebook/** = authc

kbadani · ‎04-28-2017

@Matt Cable Can you please accept the answer if it helped? Thanks in advance 🙂

kbadani · ‎04-24-2017

@Colton Rodgers This issue was filed as an Apache JIRA, https://issues.apache.org/jira/browse/ZEPPELIN-1657, and it was fixed in the Zeppelin 0.7 release. Hence this feature is automatically supported in HDP-2.6. When you upgrade, you will see that notebooks are automatically set to the following permissions when Shiro authentication is enabled:

owner: user1 (creator of the notebook)
writer: ''
reader: ''

I don't think there is a way to do this in HDP-2.5 without backporting the relevant fixes.

kbadani · ‎04-06-2017

@Predrag Minovic Can you please try the above steps and accept the answer if it works for you? Thanks!

kbadani · ‎04-06-2017

@Matt Cable One way to do this is to configure multiple Livy interpreter instances (e.g. %livy_beginner, %livy_expert, etc.) in the same Zeppelin instance (follow the steps here: https://zeppelin.apache.org/docs/latest/manual/interpreters.html) and configure a separate YARN queue for each instance of the Livy interpreter. Since Livy also supports user impersonation, this would also let you restrict access to certain queues to certain users.
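To make the idea concrete, here is a minimal sketch of the property sets two such interpreter instances might carry. It assumes that the Livy interpreter forwards properties prefixed with livy.spark. to Spark, so that livy.spark.yarn.queue ends up as spark.yarn.queue; please verify this against your Zeppelin version. The interpreter names, queue names and Livy URL are hypothetical placeholders, and the code only builds the settings, it does not call Zeppelin.

```python
def livy_interpreter_properties(livy_url, yarn_queue):
    """Build the property map for one Livy interpreter instance.

    Assumption: 'livy.spark.yarn.queue' is passed through to Spark as
    'spark.yarn.queue'. Check this against your Zeppelin version.
    """
    return {
        "zeppelin.livy.url": livy_url,
        "livy.spark.yarn.queue": yarn_queue,
    }


# Hypothetical interpreter names mapped to hypothetical YARN queues.
LIVY_URL = "http://livy-host:8998"  # placeholder host and port
interpreters = {
    "livy_beginner": livy_interpreter_properties(LIVY_URL, "beginner"),
    "livy_expert": livy_interpreter_properties(LIVY_URL, "expert"),
}

if __name__ == "__main__":
    for name, props in interpreters.items():
        print(f"%{name} -> YARN queue '{props['livy.spark.yarn.queue']}'")
```

Each property map would be entered on the Zeppelin interpreter page for the corresponding instance. Which users may submit to which queue is then controlled on the YARN side.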

kbadani · ‎04-06-2017

SETUP:

Kerberized cluster with Ranger installed. This article uses the latest HDP-2.6 cluster, installed using Ambari 2.5.

Ranger-based authorization is enabled for Hive.

Zeppelin's authentication is enabled. You can use LDAP authentication, but for the purpose of this demonstration I am using a simple authentication method, and I am going to configure 2 additional users, 'hive' and 'hrt_1', in Zeppelin's shiro.ini.

[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
admin = admin, admin
hive = hive, admin
hrt_1 = hrt_1, admin

# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
### A sample for configuring Active Directory Realm
#activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = userNameA
#use either systemPassword or hadoopSecurityCredentialPath, more details in http://zeppelin.apache.org/docs/latest/security/shiroauthentication.html
#activeDirectoryRealm.systemPassword = passwordA
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://file/user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=SOME_GROUP,DC=COMPANY,DC=COM
#activeDirectoryRealm.url = ldap://ldap.test.com:389
#activeDirectoryRealm.groupRolesMap = "CN=admin,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"admin","CN=finance,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"finance","CN=hr,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"hr"
#activeDirectoryRealm.authorizationCachingEnabled = false
### A sample for configuring LDAP Directory Realm
#ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
#ldapRealm.contextFactory.environment[ldap.searchBase] = dc=COMPANY,dc=COM
#ldapRealm.contextFactory.url = ldap://ldap.test.com:389
#ldapRealm.userDnTemplate = uid={0},ou=Users,dc=COMPANY,dc=COM
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE
### A sample PAM configuration
#pamRealm=org.apache.zeppelin.realm.PamRealm
#pamRealm.service=sshd
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
### If caching of user is required then uncomment below lines
cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
securityManager.cacheManager = $cacheManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hour
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login

[roles]
role1 = *
role2 = *
role3 = *
admin = *

[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls. Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authc means Form based Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
#/api/interpreter/** = authc, roles[admin]
#/api/configurations/** = authc, roles[admin]
#/api/credential/** = authc, roles[admin]
#/** = anon
/** = authc

Make sure only the 'hive' user has access to all databases, tables and columns as follows

Make sure Zeppelin's jdbc interpreter is configured as follows:

hive.user: hive
hive.password:
hive.url: Make sure to configure the correct hive.url using the instructions on https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html
hive.driver: org.apache.hive.jdbc.HiveDriver
zeppelin.jdbc.auth.type: KERBEROS
zeppelin.jdbc.keytab.location: zeppelin server keytab location
zeppelin.jdbc.principal: your zeppelin principal name from the zeppelin server keytab

Download the test data from here, unzip it, copy the timesheet.csv file into the HDFS /tmp directory and change its permission to '777'.

DEMO:

1) Log in to Zeppelin as user 'hive' (the password is also configured as 'hive').

2) Create a notebook 'jdbc(hive) demo' and run the following 2 paragraphs to create the table and load the data:

%livy.sql
CREATE TABLE IF NOT EXISTS timesheet_livy_hive(driverId INT, week INT, hours_logged INT, miles_logged INT) row format delimited fields terminated by ',' lines terminated by '\n' stored as TEXTFILE location '/apps/hive/warehouse/timesheet' tblproperties('skip.header.line.count'='1')

%livy.sql
LOAD DATA INPATH '/tmp/timesheet.csv' INTO TABLE timesheet_livy_hive

The %livy interpreter supports impersonation and will execute the SQL statements as the 'hive' user. The RM UI will show the corresponding YARN apps running as the 'hive' user. Since the 'hive' user also has permission to create tables in the Ranger policies, these two paragraphs will run successfully.

3) Stay logged in as the 'hive' user and run a 'SELECT' query using the jdbc(hive) interpreter. Since the 'hive' user has permission to run a SELECT query in the Ranger policies, this paragraph will run successfully as well.

%jdbc(hive)
select count(*) from timesheet_livy_hive

4) Now log out as the 'hive' user, log in as the 'hrt_1' user, open the notebook 'jdbc(hive) demo' and re-run the SELECT query paragraph. The %jdbc interpreter supports impersonation. It will now run the SELECT query as the 'hrt_1' user and the query will fail, since the 'hrt_1' user does not have sufficient permissions in the Ranger policies to query any of the hive tables.

5) Remain logged in as the 'hrt_1' user and try to grant access to itself for the hive table.

%jdbc(hive)
grant select on timesheet_livy_hive to user hrt_1

This paragraph will fail again, as the impersonated user 'hrt_1' does not have permission to grant access.

6) Now log out, log in as the 'hive' user again and try to grant access to the 'hrt_1' user for the hive table again. This time, the last paragraph will succeed, as the 'hive' user has permission to grant access to 'hrt_1' as per the policies defined in Ranger. When the last paragraph succeeds, you will see an extra 'grant' policy created in Ranger.

7) Now log out, log back in as the 'hrt_1' user and try to run the 'SELECT' query again (wait about 30 seconds for the new Ranger policy to take effect). This paragraph will succeed now, since a new Ranger policy has been created that allows the 'hrt_1' user to run a select query on the hive table.

8) Stay logged in as the 'hrt_1' user and try dropping the table using the jdbc(hive) interpreter. This will not succeed, as user 'hrt_1' does not have permission to drop a hive table.

%jdbc(hive)
drop table timesheet_livy_hive

9) Now log out as the 'hrt_1' user, log back in as the 'hive' user and try to drop the table. This will succeed now, as only the 'hive' user has permission to drop the table.
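The outcome of steps 4 to 9 can be traced with a small toy model. This is only a sketch for following the reasoning: ToyPolicyStore is a hypothetical stand-in for the Ranger policies and is not the Ranger API. It assumes, as in the setup, that 'hive' starts with full access and 'hrt_1' starts with none.

```python
class ToyPolicyStore:
    """Toy stand-in for the Ranger policies in the demo (not the Ranger API)."""

    def __init__(self):
        # Assumption from the setup: only 'hive' has access at the start.
        self.grants = {"hive": {"select", "create", "drop", "grant"}}

    def allowed(self, user, action):
        """Return True if the (impersonated) user may perform the action."""
        return action in self.grants.get(user, set())

    def grant(self, actor, user, action):
        """Let 'actor' grant 'action' to 'user', if 'actor' may grant at all."""
        if not self.allowed(actor, "grant"):
            raise PermissionError(f"{actor} may not grant access")
        self.grants.setdefault(user, set()).add(action)


def run_demo():
    """Replay steps 4 to 9 and record whether each one succeeds."""
    store = ToyPolicyStore()
    outcomes = {}
    outcomes["step4_hrt_1_select"] = store.allowed("hrt_1", "select")
    try:
        store.grant("hrt_1", "hrt_1", "select")
        outcomes["step5_hrt_1_self_grant"] = True
    except PermissionError:
        outcomes["step5_hrt_1_self_grant"] = False
    store.grant("hive", "hrt_1", "select")  # step 6: 'hive' grants select
    outcomes["step7_hrt_1_select"] = store.allowed("hrt_1", "select")
    outcomes["step8_hrt_1_drop"] = store.allowed("hrt_1", "drop")
    outcomes["step9_hive_drop"] = store.allowed("hive", "drop")
    return outcomes


if __name__ == "__main__":
    for step, ok in run_demo().items():
        print(step, "succeeds" if ok else "fails")
```

The key point the model captures is that every check is made against the logged-in Zeppelin user, because the interpreter impersonates that user rather than running everything as one service account.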

kbadani · ‎04-04-2017

@Sree Kupp The part you are interested in starts after 1:20:00 and runs up to 2:40:00.

kbadani · ‎04-04-2017

@Sree Kupp This is a great video that explains everything in detail. It's a 6-hour training, but you can skip ahead and listen to the parts that you are interested in. https://www.youtube.com/watch?v=7ooZ4S7Ay6Y

kbadani · ‎04-03-2017

@tuxnet I see. One option you may try is to submit Spark jobs remotely via Livy. It does not require you to have spark-client on your local machine. You need to have the Livy server installed and configured properly on your cluster, and then you can submit your jobs via its REST API. https://github.com/cloudera/livy#post-sessions
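As a rough illustration of what submitting through the REST API can look like, here is a minimal sketch using only the Python standard library. It assumes Livy's POST /sessions and POST /sessions/{id}/statements endpoints; the host name and port in LIVY_URL are placeholders, the helper names are made up for this example, and authentication (for example on a Kerberized cluster) is not handled.

```python
import json
import urllib.request

LIVY_URL = "http://livy-host:8998"  # placeholder: replace with your Livy server


def build_post(path, payload, base_url=LIVY_URL):
    """Build (but do not send) a JSON POST request for the Livy REST API."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind="pyspark"):
    """Request that asks Livy to start an interactive session."""
    return build_post("/sessions", {"kind": kind})


def run_statement_request(session_id, code):
    """Request that runs a code snippet inside an existing session."""
    return build_post(f"/sessions/{session_id}/statements", {"code": code})


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical flow would be session = send(create_session_request()), then waiting until the session reports the state "idle", and then send(run_statement_request(session["id"], "sc.parallelize(range(100)).count()")). Nothing is sent over the network until send is called.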

Last Visited	‎06-11-2021 08:41 PM

Member Since	‎08-13-2019 02:46 PM
Posts	84
Kudos received	233
