Member since: 08-13-2019
Posts: 84
Kudos Received: 233
Solutions: 15
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2257 | 02-28-2018 09:27 PM
 | 3339 | 01-25-2018 09:44 PM
 | 6531 | 09-21-2017 08:17 PM
 | 3733 | 09-11-2017 05:21 PM
 | 3351 | 07-13-2017 04:56 PM
03-14-2018
05:15 PM
@Kshitij Badani How do we get full write access to LLAP in HDP 2.6.3? I'm happy to do the work to make this work; otherwise I'll have to tell my client to downgrade back to 2.6.2, which I'd prefer not to do.
01-25-2018
09:44 PM
1 Kudo
@Sridhar Reddy Since the Spark2 interpreter is in globally shared mode, there is only one Spark2 session (i.e., one Spark2 context) shared between all users and all notebooks in Zeppelin. A variable defined in one paragraph of one notebook may be accessed freely in other paragraphs of the same notebook, and for that matter in paragraphs of other notebooks as well. Attaching screenshots: screen-shot-2018-01-25-at-14317-pm.png, screen-shot-2018-01-25-at-14344-pm.png
10-12-2017
09:47 PM
3 Kudos
Thanks @dbalasundaran for pointing to the article. This works for me. There is one caveat, though: if your cluster is Kerberos-enabled, then one more step is required before installing the service in the last step. Send a POST request to "/credentials/kdc.admin.credential" with data '{ "Credential" : { "principal" : "user@EXAMPLE.COM", "key" : "password", "type" : "temporary" } }'
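For reference, that POST can be sent with curl against the Ambari REST API. The Ambari host, cluster name, and admin login below are placeholders (not from the original post), and the command is printed rather than executed so you can review it first:

```shell
#!/bin/sh
# Sketch: registering a temporary KDC admin credential with Ambari on a
# Kerberized cluster. Host, cluster name, and login are placeholders.
AMBARI_URL="http://ambari.example.com:8080"   # hypothetical Ambari server
CLUSTER="mycluster"                           # hypothetical cluster name
PAYLOAD='{ "Credential" : { "principal" : "user@EXAMPLE.COM", "key" : "password", "type" : "temporary" } }'

# The X-Requested-By header is required by the Ambari REST API.
CURL_CMD="curl -u admin:admin -H 'X-Requested-By: ambari' -X POST -d '${PAYLOAD}' ${AMBARI_URL}/api/v1/clusters/${CLUSTER}/credentials/kdc.admin.credential"

# Printed instead of executed here, since the endpoint is hypothetical:
echo "$CURL_CMD"
```

After this credential is registered, the service-install step from the article should proceed without the KDC admin password prompt.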
07-13-2017
04:56 PM
6 Kudos
@shivanand khobanna Are you defining those variables with the %spark interpreter? If so, note that the default mode of the %spark interpreter is 'Globally Shared'. In shared mode, a single JVM process and a single interpreter group serve all notes, so variables defined in one note are available to all users and all notebooks. The behavior you are seeing is by design. You can change interpreter modes through the interpreters page, but it is better to use the 'livy' interpreter, which uses 'Per user scoped' mode by default on HDP-installed Zeppelin. That means you will see a different YARN application for each user who uses the %livy interpreter, and hence a different Spark context per user, which prevents variables defined by one user from being visible to another. Please check out this article for more info on the various Zeppelin interpreter modes and what each mode means: https://medium.com/@leemoonsoo/apache-zeppelin-interpreter-mode-explained-bae0525d0555
03-14-2018
05:36 PM
This doesn't work for HDP 2.6.3.
05-10-2018
03:18 PM
Hi @Kshitij Badani I got the same error as Ramon, so maybe my screens can help. I've got HDP 2.6.3, Kerberized, using Microsoft AD, and I want to impersonate users so they can run Spark 1/2 jobs. So far I'm trying to run Livy with Spark 1.6.3, but after logging in with an AD user and running a note I'm getting:

INFO [2018-05-10 16:49:41,905] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:131) - Job paragraph_1525958424236_42692352 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session1635594872
INFO [2018-05-10 16:49:41,906] ({pool-2-thread-2} Paragraph.java[jobRun]:366) - run paragraph 20180510-152024_1120525270 using livy org.apache.zeppelin.interpreter.LazyOpenInterpreter@5b439305
INFO [2018-05-10 16:49:41,918] ({pool-2-thread-2} RemoteInterpreterManagedProcess.java[start]:132) - Run interpreter process [/usr/hdp/current/zeppelin-server/bin/interpreter.sh, -d, /usr/hdp/current/zeppelin-server/interpreter/livy, -p, 35361, -u, mvince, -l, /usr/hdp/current/zeppelin-server/local-repo/2CKX6DGQZ, -g, livy]
INFO [2018-05-10 16:49:42,473] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivySparkInterpreter
INFO [2018-05-10 16:49:42,963] ({pool-2-thread-2} RemoteInterpreter.java[pushAngularObjectRegistryToRemote]:578) - Push local angular object registry from ZeppelinServer to remote interpreter group 2CKX6DGQZ:mvince:
INFO [2018-05-10 16:49:42,981] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivySparkSQLInterpreter
INFO [2018-05-10 16:49:42,986] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivyPySparkInterpreter
INFO [2018-05-10 16:49:42,992] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivyPySpark3Interpreter
INFO [2018-05-10 16:49:42,997] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivySparkRInterpreter
INFO [2018-05-10 16:49:43,005] ({pool-2-thread-2} RemoteInterpreter.java[init]:246) - Create remote interpreter org.apache.zeppelin.livy.LivySharedInterpreter
WARN [2018-05-10 16:49:43,107] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:2067) - Job 20180510-152024_1120525270 is finished, status: ERROR, exception: null, result: %text
javax.security.auth.login.LoginException: Unable to obtain password from user
    at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:897)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760)
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at org.springframework.security.kerberos.client.KerberosRestTemplate.doExecute(KerberosRestTemplate.java:185)
    at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:580)
    at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:498)
    at org.apache.zeppelin.livy.BaseLivyInterpreter.callRestAPI(BaseLivyInterpreter.java:619)
    at org.apache.zeppelin.livy.BaseLivyInterpreter.callRestAPI(BaseLivyInterpreter.java:599)
    at org.apache.zeppelin.livy.BaseLivyInterpreter.getLivyVersion(BaseLivyInterpreter.java:395)
    at org.apache.zeppelin.livy.LivySharedInterpreter.open(LivySharedInterpreter.java:47)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.livy.BaseLivyInterpreter.getLivySharedInterpreter(BaseLivyInterpreter.java:165)
    at org.apache.zeppelin.livy.BaseLivyInterpreter.open(BaseLivyInterpreter.java:139)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Any idea what I'm doing wrong?
06-23-2017
06:12 PM
1 Kudo
@suyash soni Not that I am aware of. You can try running this Hive query in Beeline and/or the Ambari Hive View and see if it works for you. If it works there and not via Zeppelin, then it's a potential bug.
06-08-2017
11:01 PM
7 Kudos
This article describes how to enable Knox proxying for the Zeppelin Notebook in:
1. Wire-encrypted environments (i.e., SSL is enabled for Knox and Zeppelin)
2. Non-wire-encrypted environments
Configuring Knox Proxying for Zeppelin Notebook in Wire-Encrypted Environments
If you have already configured SSL for Zeppelin, please proceed to Section 2. If not, please read through Section 1.
Section 1 : Configuring SSL for Zeppelin
Note: The steps in Section 1 are for example purposes only; a production setup may differ. The steps also assume no client-side authentication. For client-side authentication, please follow the Zeppelin Component Guide in the HDP release documents
(HDP 2.6.1 Zeppelin Component Guide: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_zeppelin-component-guide/content/config-ssl-zepp.html)
Create the keystore file, truststore file, and certificate for each host on the cluster by following these steps:
1. Navigate to the directory where you want to create the Zeppelin keystore, certificate, and truststore files.
2. Create a keystore file on the Zeppelin server host:
keytool -genkey -alias $ZeppelinHostFqdn -keyalg RSA -keysize 1024 -dname CN=$ZeppelinHostFqdn,OU=$OrganizationalUnit,O=$OrganizationName,L=$City,ST=$State,C=$Country -keypass $KeyPassword -keystore $KeyStoreFile -storepass $KeyStorePassword
3. Create a certificate file on the Zeppelin server host by exporting the key info from the keystore file:
keytool -export -alias $ZeppelinHostFqdn -keystore $KeyStoreFile -rfc -file $CertificateFile -storepass $KeyStorePassword
4. Create a truststore file on the Zeppelin server host:
keytool -import -noprompt -alias $ZeppelinHostFqdn -file $CertificateFile -keystore $TrustStoreFile -storepass $TrustStorePassword
Change the permissions of the keystore and truststore files to 444, and change their owner to the 'zeppelin' user.
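Collected into one script, the keystore/certificate/truststore steps above might look like the sketch below. Every value (FQDN, paths, passwords, DN fields) is a placeholder for illustration, and the commands are collected and printed rather than executed so you can review them before running on your Zeppelin host:

```shell
#!/bin/sh
# Sketch of the Section 1 keytool steps. All values below are placeholders;
# substitute your own FQDN, paths, passwords, and DN fields before running.
FQDN="zeppelin.example.com"                # hypothetical Zeppelin host FQDN
KEYSTORE="/etc/zeppelin/conf/keystore.jks"
TRUSTSTORE="/etc/zeppelin/conf/truststore.jks"
CERT="/etc/zeppelin/conf/zeppelin.crt"
STOREPASS="changeit"                       # placeholder store password
KEYPASS="changeit"                         # placeholder key password

OUT=""   # dry run: collect commands instead of executing them
run() { OUT="${OUT}+ $*
"; }

# 1) Generate the key pair in the Zeppelin keystore
run keytool -genkey -alias "$FQDN" -keyalg RSA -keysize 1024 \
    -dname "CN=${FQDN},OU=Eng,O=Example,L=City,ST=State,C=US" \
    -keypass "$KEYPASS" -keystore "$KEYSTORE" -storepass "$STOREPASS"

# 2) Export the certificate from the keystore
run keytool -export -alias "$FQDN" -keystore "$KEYSTORE" -rfc \
    -file "$CERT" -storepass "$STOREPASS"

# 3) Import the certificate into the truststore
run keytool -import -noprompt -alias "$FQDN" -file "$CERT" \
    -keystore "$TRUSTSTORE" -storepass "$STOREPASS"

# 4) Restrict permissions and hand ownership to the zeppelin user
run chmod 444 "$KEYSTORE" "$TRUSTSTORE"
run chown zeppelin: "$KEYSTORE" "$TRUSTSTORE"

printf '%s' "$OUT"
```

To actually apply the steps, change run() to execute its arguments ("$@") instead of collecting them.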
Now configure the following through Ambari in the zeppelin-config section:
zeppelin.ssl : true
zeppelin.server.ssl.port : $ZeppelinSSLPort
zeppelin.ssl.client.auth: false (true in case of client-side authentication enabled)
zeppelin.ssl.truststore.type : JKS
zeppelin.ssl.truststore.path : $TrustStoreFile
zeppelin.ssl.truststore.password : $TrustStorePassword
zeppelin.ssl.keystore.path : $KeyStoreFile
zeppelin.ssl.keystore.password : $KeyStorePassword
zeppelin.ssl.key.manager.password : $KeyPassword
Section 2: Configuring Knox Proxying
Copy the zeppelin certificate file onto Knox gateway host
Add the Zeppelin certificate file into the Java cacerts store on the Knox gateway host using the following command (this is how the Knox gateway is configured to trust the Zeppelin server's certificate):
keytool -import -file $CertificateFile -alias $ZeppelinHostFqdn -keystore $JavaCacertPath
where $JavaCacertPath is typically <path to your Java installation dir>/jre/lib/security/cacerts.
You will get a prompt asking for the keystore password (i.e., the Java cacerts store password); the default value is 'changeit'.
Create a topology file ui.xml in the $KnoxConfDir/topologies directory on the Knox gateway host, and configure the Zeppelin UI using the following snippet:
<service>
<role>ZEPPELIN</role>
<url>https://$ZeppelinHostFqdn:$ZeppelinSSLPort</url>
</service>
<service>
<role>ZEPPELINUI</role>
<url>https://$ZeppelinHostFqdn:$ZeppelinSSLPort</url>
</service>
<service>
<role>ZEPPELINWS</role>
<url>wss://$ZeppelinHostFqdn:$ZeppelinSSLPort/ws</url>
</service>
Note: make sure to use the FQDN of the Zeppelin host, as that is the 'key' (or alias) we used in Section 1 to create the Zeppelin certificate.
There is no need to restart either the Knox gateway or Zeppelin.
Configuring Knox Proxying for Zeppelin Notebook in Non-Wire-Encrypted Environments
Create a topology file ui.xml in the $KnoxConfDir/topologies directory on the Knox gateway host, and configure the Zeppelin UI using the following snippet:
<service>
<role>ZEPPELIN</role>
<url>http://$ZeppelinHostFqdn:$ZeppelinPort</url>
</service>
<service>
<role>ZEPPELINUI</role>
<url>http://$ZeppelinHostFqdn:$ZeppelinPort</url>
</service>
There is no need to restart either the Knox gateway or Zeppelin.
Using Knox Proxying for Zeppelin, and Currently Known Issues with the HDP 2.6.1 Release
Once the configuration is finished, you can access the Zeppelin UI via the Knox proxy using the following URL:
https://<knox gateway host>:8443/gateway/ui/zeppelin/
(Note: please don't forget to append the trailing '/' to the URL; a fix for this bug is in progress.)
The other bug still in progress is that if a user logs out of Zeppelin while using Knox's proxy URL, they no longer land on Zeppelin's login page (https://issues.apache.org/jira/browse/ZEPPELIN-2601). The user needs to retype the URL in the browser to get back to the login page.
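As a quick smoke test once the topology is in place, you can request the proxied URL with curl. The gateway host below is a placeholder, and admin/admin-password is only the Knox demo-LDAP default, so substitute your own; the command is printed rather than executed here:

```shell
#!/bin/sh
# Sketch: checking the Knox-proxied Zeppelin UI. Host and credentials are
# placeholders; -k skips certificate verification (first check only, not
# for production use).
KNOX_HOST="knox.example.com"   # hypothetical Knox gateway host
URL="https://${KNOX_HOST}:8443/gateway/ui/zeppelin/"   # trailing '/' is required

# Expect an HTTP 200 (or a redirect to Zeppelin's login page) when the
# topology is correct. Printed instead of executed, since the host is
# hypothetical:
CMD="curl -k -u admin:admin-password -o /dev/null -w '%{http_code}' '${URL}'"
echo "$CMD"
```

If the request hangs or returns 404, re-check the ZEPPELINUI/ZEPPELINWS service URLs in the topology and the trailing slash.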
05-14-2017
10:45 PM
The issue got resolved. I had to take out the line "securityManager.realms = $activeDirectoryRealm" from my config, and that resolved it. I don't see anything wrong with the line I took out; however, I believe it is an optional config.
04-06-2017
01:19 AM
11 Kudos
SETUP: A Kerberized cluster with Ranger installed. This article uses a latest HDP 2.6 cluster installed using Ambari 2.5. Ranger-based authorization is enabled for Hive, and Zeppelin's authentication is enabled. You can use LDAP authentication, but for the purpose of this demonstration I am using the simple authentication method and configuring two additional users, 'hive' and 'hrt_1', in Zeppelin's shiro.ini:
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
admin = admin, admin
hive = hive, admin
hrt_1 = hrt_1, admin
# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
### A sample for configuring Active Directory Realm
#activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = userNameA
#use either systemPassword or hadoopSecurityCredentialPath, more details in http://zeppelin.apache.org/docs/latest/security/shiroauthentication.html
#activeDirectoryRealm.systemPassword = passwordA
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://file/user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=SOME_GROUP,DC=COMPANY,DC=COM
#activeDirectoryRealm.url = ldap://ldap.test.com:389
#activeDirectoryRealm.groupRolesMap = "CN=admin,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"admin","CN=finance,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"finance","CN=hr,OU=groups,DC=SOME_GROUP,DC=COMPANY,DC=COM":"hr"
#activeDirectoryRealm.authorizationCachingEnabled = false
### A sample for configuring LDAP Directory Realm
#ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
#ldapRealm.contextFactory.environment[ldap.searchBase] = dc=COMPANY,dc=COM
#ldapRealm.contextFactory.url = ldap://ldap.test.com:389
#ldapRealm.userDnTemplate = uid={0},ou=Users,dc=COMPANY,dc=COM
#ldapRealm.contextFactory.authenticationMechanism = SIMPLE
### A sample PAM configuration
#pamRealm=org.apache.zeppelin.realm.PamRealm
#pamRealm.service=sshd
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
### If caching of user is required then uncomment below lines
cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
securityManager.cacheManager = $cacheManager
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hours
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login
[roles]
role1 = *
role2 = *
role3 = *
admin = *
[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls. Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authc means Form based Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
#/api/interpreter/** = authc, roles[admin]
#/api/configurations/** = authc, roles[admin]
#/api/credential/** = authc, roles[admin]
#/** = anon
/** = authc
Make sure only the 'hive' user has access to all databases, tables, and columns. Then make sure Zeppelin's jdbc interpreter is configured as follows:
hive.user : hive
hive.password : hive
hive.url : (configure the correct hive.url using the instructions at https://community.hortonworks.com/articles/4103/hiveserver2-jdbc-connection-url-examples.html)
hive.driver : org.apache.hive.jdbc.HiveDriver
zeppelin.jdbc.auth.type : KERBEROS
zeppelin.jdbc.keytab.location : <zeppelin server keytab location>
zeppelin.jdbc.principal : <your zeppelin principal name from the zeppelin server keytab>
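Before wiring the URL into the interpreter, it can help to verify it with beeline outside Zeppelin. The HiveServer2 host and Kerberos realm below are placeholders (you would kinit as a valid user first); the command is printed rather than executed here:

```shell
#!/bin/sh
# Sketch: sanity-checking a Kerberized HiveServer2 JDBC URL with beeline.
# Host and realm are placeholders for your environment.
HS2_HOST="hiveserver2.example.com"   # hypothetical HiveServer2 host
REALM="EXAMPLE.COM"                  # hypothetical Kerberos realm
JDBC_URL="jdbc:hive2://${HS2_HOST}:10000/default;principal=hive/_HOST@${REALM}"

# After kinit, this should list the databases visible to the caller.
# Printed instead of executed, since the host is hypothetical:
CMD="beeline -u '${JDBC_URL}' -e 'show databases'"
echo "$CMD"
```

If beeline connects but Zeppelin's jdbc interpreter does not, re-check zeppelin.jdbc.keytab.location and zeppelin.jdbc.principal.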
Download the test data from here, unzip it, copy the timesheet.csv file into the HDFS /tmp directory, and change its permissions to '777'.
DEMO:
1) Log in to Zeppelin as user 'hive' (the password is also configured as 'hive').
2) Create a notebook 'jdbc(hive) demo' and run the following two paragraphs to create the table and load the data:
%livy.sql
CREATE TABLE IF NOT EXISTS timesheet_livy_hive(driverId INT, week INT, hours_logged INT, miles_logged INT) row format delimited fields terminated by ',' lines terminated by '\n' stored as TEXTFILE location '/apps/hive/warehouse/timesheet' tblproperties('skip.header.line.count'='1') %livy.sql
LOAD DATA INPATH '/tmp/timesheet.csv' INTO TABLE timesheet_livy_hive
The %livy interpreter supports impersonation, so it will execute the SQL statements as the 'hive' user, and the RM UI will show the corresponding YARN applications running as 'hive'. Since the 'hive' user also has permission to create tables under the Ranger policies, these two paragraphs will run successfully.
3) Stay logged in as the 'hive' user and run a 'SELECT' query using the jdbc(hive) interpreter. Since the 'hive' user has permission to run a SELECT query under the Ranger policies, this paragraph will run successfully as well:
%jdbc(hive)
select count(*) from timesheet_livy_hive
4) Now log out as the 'hive' user, log in as the 'hrt_1' user, open the notebook 'jdbc(hive) demo', and re-run the SELECT query paragraph. The %jdbc interpreter supports impersonation: it will now run the SELECT query as the 'hrt_1' user and fail, since 'hrt_1' does not have sufficient permissions in the Ranger policies to query any of the Hive tables.
5) Remain logged in as the 'hrt_1' user and try to grant access to itself for the Hive table:
%jdbc(hive)
grant select on timesheet_livy_hive to user hrt_1
This paragraph will fail again, as the impersonated user 'hrt_1' does not have permission to grant access.
6) Now log out, log in as the 'hive' user again, and try to grant access to the 'hrt_1' user for the Hive table. This time the paragraph will succeed, as the 'hive' user has permission to grant access to 'hrt_1' per the policies defined in Ranger. When it succeeds, you will see an extra 'grant' policy created in Ranger.
7) Now log out, log back in as the 'hrt_1' user, and try to run the 'SELECT' query again (wait about 30 seconds for the new Ranger policy to take effect). The paragraph will now succeed, since a new Ranger policy has been created allowing the 'hrt_1' user to run a SELECT query on the Hive table.
8) Stay logged in as the 'hrt_1' user and try dropping the table using the jdbc(hive) interpreter. This will not succeed, as user 'hrt_1' does not have permission to drop a Hive table:
%jdbc(hive)
drop table timesheet_livy_hive
9) Now log out as the 'hrt_1' user, log back in as the 'hive' user, and try to drop the table. This will now succeed, as only the 'hive' user has permission to drop the table.