Member since: 05-25-2016
Posts: 26
Kudos Received: 4
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1004 | 11-09-2016 11:11 AM
11-28-2017
08:58 AM
Hi all, I have the same problem on HDP 2.5 with Ranger: policies only work when applied to users, not to groups. Users and groups are managed with AD and SSSD on the Linux side. Although all users and groups are correctly mapped in Ranger and on Linux, and group permissions even work fine with Ranger encryption, they do not work with the policies. I tried all the suggestions, such as the lowercase conversion, but it is still not working for me. Any other ideas? Thanks in advance.
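For reference, this is roughly how I compared group resolution on the OS side and on the Hadoop side (the user name below is just a placeholder); both return the same AD groups in my case:

```bash
# Groups as resolved by Linux / SSSD
id someaduser

# Groups as resolved by Hadoop (what group-based Ranger policies rely on)
hdfs groups someaduser
```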
06-20-2017
04:59 AM
Hi @Colton Rodgers, I have the same problem as you. Please let me know if you find a solution. Thanks.
04-20-2017
05:57 PM
Hi all, I have a kerberized cluster with HDP 2.5. I would like to use Zeppelin 0.6 with Spark2, but I have seen that there are many restrictions and problems, so for now I would at least like to get Zeppelin 0.6 working with Spark 1.6. I followed the instructions and configured Zeppelin with my AD. I would also like to use impersonation; I think it is essential that a job runs as the actual user rather than as a shared zeppelin user (especially for reading from and writing to HDFS). I have followed many other threads, but it is not working and nothing is clear. Is anyone here running HDP 2.5 with Zeppelin, Spark and Livy (for impersonation)? In my case, when I try the following in Zeppelin: %livy.pyspark
sc.version
I obtain: Interpreter died:
Traceback (most recent call last):
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/tmp/7818688309791970952", line 469, in <module>
sys.exit(main())
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/tmp/7818688309791970952", line 394, in main
exec 'from pyspark.shell import sc' in global_dict
File "<string>", line 1, in <module>
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/pyspark.zip/pyspark/shell.py", line 43, in <module>
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/pyspark.zip/pyspark/context.py", line 115, in __init__
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/pyspark.zip/pyspark/context.py", line 172, in _do_init
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.io.FileNotFoundException: Added file file:/usr/hdp/current/spark-client/conf/hive-site.xml does not exist.
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1388)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1364)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
traceback:
{}
I could upgrade from HDP 2.5 to HDP 2.6, but most probably that will not help and the problem will only get worse (and Zeppelin may well still not work). Thanks in advance.
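For what it is worth, the FileNotFoundException suggests that /usr/hdp/current/spark-client/conf/hive-site.xml is missing on the node where the Livy session's driver runs. This is only an assumption on my part, but the check I would try first is to copy the Hive client configuration into the Spark conf directory on that node:

```bash
# Assumption: the driver node is missing hive-site.xml under the Spark client conf dir.
# Copy it from the Hive client configuration and make it readable:
cp /etc/hive/conf/hive-site.xml /usr/hdp/current/spark-client/conf/hive-site.xml
chmod 644 /usr/hdp/current/spark-client/conf/hive-site.xml
```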
Labels:
- Apache Spark
- Apache Zeppelin
04-20-2017
02:44 PM
Cool! @Dharanidhar Ch, could you share how you configured the Livy interpreter? Many thanks in advance.
04-20-2017
02:07 PM
Hi @Dharanidhar Ch. OK, are you using Zeppelin 0.7 with Spark (1 or 2) and Livy? Is it working for you? Thanks.
04-20-2017
12:55 PM
Hi @Dharanidhar Ch, I would also like to upgrade Zeppelin on HDP 2.5. Did you try that?
02-01-2017
02:22 PM
Hi all, I have two kerberized clusters, both connected to the same AD: one with HDP 2.4 and the other with HDP 2.5. Now I would like to move all the data from one cluster to the other. I have been reading a lot about it, for example:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_Sys_Admin_Guides/content/ref-c8ffaa14-eaf8-48a6-9791-307283d5d29d.1.html
https://community.hortonworks.com/articles/18686/kerberos-cross-realm-trust-for-distcp.html
What I am doing is the following: as the hdfs user on cluster 1, I can list all the files, but I can only copy the files for which the hdfs user has explicit permissions. For example, a file with permissions 770 owned by user1 and group hdfs can be copied, but a file with permissions 700 owned by user1 (whether the group is hdfs or another group) cannot. Also, the second cluster is configured in HA, but I cannot use the HA nameservice name; I have to point directly to the active NameNode (which can be different each time). I use the following command:
hadoop distcp hdfs://master01/projects/folder hdfs://manager01/projects/
If I don't have permissions as hdfs, I get the following error:
17/02/01 12:04:56 INFO mapreduce.Job: Task Id : attempt_1485252670123_0029_m_000003_1, Status : FAILED
Error: java.io.IOException: File copy failed: hdfs://cluster01/projects/folder --> hdfs://cluster02/projects/folder org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:285)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
What should I do to copy all the files? Should I first change all the permissions to 777? Thanks in advance.
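For illustration, this is the kind of command I have in mind. It is only a sketch and assumes that the hdfs principal is recognized as the HDFS superuser on both clusters (so file permissions do not block reads on the source or writes on the destination) and that the destination HA nameservice (the placeholder cluster02ns) is defined in the client's hdfs-site.xml:

```bash
# Authenticate as the hdfs superuser (keytab path and principal name are placeholders).
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-cluster01

# -pugp preserves user, group and permissions on the destination; -update skips files
# that were already copied. "cluster02ns" stands for the destination HA nameservice.
hadoop distcp -pugp -update hdfs://master01/projects/folder hdfs://cluster02ns/projects/
```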
Labels:
- Apache Hadoop
12-13-2016
08:19 AM
Hi @bikas, OK, I understand. So I should not worry about it. Thanks.
12-12-2016
11:54 AM
Hi @Kuldeep Kulkarni, yes, ResourceManager HA is configured, and both are working fine; rm1 is just in standby mode and rm2 is active.
12-12-2016
10:28 AM
Hi all, I am using HDP 2.5. When I try to run a Spark job or create a context (from a Jupyter notebook or the pyspark shell), I always get the following warning:
WARN Client: Failed to connect to server: mycluster.at/111.11.11.11:8032: retries get failed due to exceeded maximum allowed retries number: 0
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
at org.apache.hadoop.ipc.Client.call(Client.java:1449)
at org.apache.hadoop.ipc.Client.call(Client.java:1396)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy15.getNewApplication(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:221)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy16.getNewApplication(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:225)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:233)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:157)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:240)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:211)
at java.lang.Thread.run(Thread.java:745)
The job then runs fine, but the warning is always there. I have another cluster with HDP 2.4 and I don't see that warning there. Any ideas? Thanks in advance.
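In case it helps to narrow this down: if the warning simply corresponds to the client first trying the standby ResourceManager (rm1) before failing over to the active one (rm2), it should be harmless. A quick check (rm1 and rm2 are the IDs from yarn.resourcemanager.ha.rm-ids in yarn-site.xml):

```bash
# Show which ResourceManager is currently active and which is standby.
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```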
11-09-2016
11:11 AM
I solved the problem following this guide: https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.admin.doc/doc/admin_kerb_activedir.html
11-09-2016
10:13 AM
Hi @Sagar Shimpi, yes, the test connection was successful. I also have the krb5.conf file in /etc/.
11-09-2016
09:06 AM
1 Kudo
Hi all, I recently configured a cluster with HDP 2.5 and Ambari 2.4.1. Now I am trying to configure Kerberos using an existing AD that is still used by another cluster with HDP 2.4 (I want to have both clusters running at the same time). I am following this guide: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.0/bk_Ambari_Security_Guide/content/_launching_the_kerberos_wizard_automated_setup.html But I always get the following error when installing:
Error message: Failed to connect to KDC - Failed to communicate with the Active Directory at ldap://192.168.0.2: simple bind failed: 192.168.0.2:389
Update the KDC settings in krb5-conf and kerberos-env configurations to correct this issue.
Any ideas? Thanks in advance.
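In case anyone else hits this: my understanding (an assumption on my part, not something the wizard reports) is that for an existing Active Directory the automated setup needs an LDAPS URL (port 636) with the AD CA certificate trusted by the Ambari server JVM, rather than plain ldap:// on port 389. A sketch of the checks, with the certificate file, truststore path and alias as placeholders:

```bash
# Verify that the AD domain controller answers over LDAPS and inspect its certificate chain.
openssl s_client -connect 192.168.0.2:636 -showcerts < /dev/null

# Import the AD CA certificate into the JVM truststore used by the Ambari server
# (ad-ca.crt, the truststore path and the alias are placeholders), then point the
# wizard's AD/LDAP URL at ldaps://192.168.0.2:636.
keytool -importcert -alias ad-ca -file ad-ca.crt \
  -keystore /etc/pki/java/cacerts -storepass changeit
```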
11-02-2016
09:27 AM
Hi @dbaev, I would like to have the same scenario: two clusters using the same AD, also with Kerberos. How was your experience? Did you run into any problems? Thanks.