Member since 04-03-2019
97 Posts
7 Kudos Received
6 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1928 | 01-13-2025 11:17 AM |
| | 8519 | 01-21-2022 04:31 PM |
| | 7801 | 02-25-2020 10:02 AM |
| | 5772 | 02-19-2020 01:29 PM |
| | 4038 | 09-17-2019 06:33 AM |
01-21-2022
02:32 PM
@Scharan Thanks for the reply. I followed your recommendation and got the same permission error. I think the disconnect is this: I successfully added a user called admin, but the configuration /api/interpreter/** = authc, roles[admin] applies to a role called admin. The link between a user and a role seems to be defined inside shiro.ini, which I have no idea how to access. I used Zeppelin in HDP, where shiro.ini is exposed via the Zeppelin configuration inside Ambari. In CDP I cannot find a similar configuration inside Cloudera Manager.
01-20-2022
07:02 PM
I am using CDP 7.1.7, and the cluster does not have Kerberos enabled yet. Ranger is not enabled either. I followed the steps in this post https://community.cloudera.com/t5/Support-Questions/CDP-7-1-3-Zepplin-not-able-to-login-with-default-username/td-p/303717 to be able to log in as admin. But this "admin" account has no permission to access the configuration or interpreter pages. According to the CDP documentation, https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/configuring-zeppelin/topics/enabling_access_control_for_interpreter__configuration__and_credential_settings.html, to configure shiro.ini for Zeppelin security I have to go through the Zeppelin web UI. What should I do? Regards,
Labels: Apache Zeppelin
11-18-2021
01:30 PM
rbiswas1, I tried your code but pssh returned a timeout error. It was waiting for the password but I never got the prompt to enter the password. Could you elaborate more about your method? Thanks.
09-15-2021
10:32 PM
@RangaReddy The link is exactly what I need. Thanks for your help.
09-09-2021
01:18 AM
I am trying to parse a nested JSON document using an RDD rather than a DataFrame. I cannot use a DataFrame (the typical spark.read.json approach) because the document structure is very complicated: the schema detected by the reader is useless because child nodes at the same level have different schemas. So I tried the script below.
import json
s='{"key1":{"myid": "123","myname":"test"}}'
rdd=sc.parallelize(s).map(json.loads)
My next step is to use a map transformation to parse the JSON string, but I do not know where to start. I tried the script below but it failed.
rdd2=rdd.map(lambda j: (j[x]) for x in j)
I would appreciate any resources on using RDD transformations to parse JSON.
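For reference, one possible way to approach this is sketched below. It is not a confirmed answer from the thread. Two things are assumed: that sc.parallelize should receive a list of JSON strings (passing a bare string distributes it one character at a time), and that flattening every leaf into a (path, value) pair is an acceptable output shape. The helper name flatten and the dotted-path format are hypothetical choices for illustration. The flattening logic is plain Python, so it can be tried without a cluster; the Spark wiring is shown only as comments and assumes an existing SparkContext named sc.

```python
import json


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON value.

    Hypothetical helper: dict keys are joined with '.', list items get '[i]'.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from flatten(child, f"{prefix}[{index}]")
    else:
        # Leaf value (string, number, bool, or None)
        yield prefix, node


s = '{"key1":{"myid": "123","myname":"test"}}'
pairs = list(flatten(json.loads(s)))
print(pairs)  # [('key1.myid', '123'), ('key1.myname', 'test')]

# With Spark (assumes an existing SparkContext `sc`):
# rdd = sc.parallelize([s]).map(json.loads)   # note the list around s
# rdd2 = rdd.flatMap(flatten)                 # one record per leaf
```

Because the function walks whatever structure it finds, sibling nodes with different schemas do not need to be declared up front; each record simply produces its own set of paths.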
Labels: Apache Spark
09-03-2021
05:08 PM
Vidya, thanks for your reply. Could you help me clarify the issue further? Does Spark (or another MapReduce tool) create the container using the local host as its template, at least to some degree?
08-26-2021
02:58 PM
I will use Spark2 in CDP and need to install Python3. Do I need to install Python3 on every node in the CDP cluster, or only on one particular node? A Spark2 job is executed in JVM containers that could be created on any worker node. I wonder whether the container is created from a template. If so, how is the template created and where is it? Thanks.
Labels: Apache Spark
02-25-2020
10:02 AM
I got the following responses from Cloudera Certification. Regarding Question #1, the FAQ page has the most up-to-date information, so for now I had better hold off on purchasing the exam until DE575 is relaunched. Regarding Question #2, the "Spark and Hadoop Developer" training course is the one I should take to prepare for DE575. Regarding Question #3, the environment for the exam is fixed and only available on CDH. Candidates do not have the option to take the exam in an HDP environment. The skills tested are applicable to HDP development as well; the exam is in the developer track, so it should have nothing to do with the environment it runs in. It is primarily concerned with transforming data that sits on the cluster.
02-19-2020
01:29 PM
1 Kudo
Finally, I figured out what is going on. The root cause is that I only set up testuser on the edge nodes, not on the NameNode. I looked into this page, https://hadoop.apache.org/docs/r3.1.1/hadoop-project-dist/hadoop-common/GroupsMapping.html, which states: "For HDFS, the mapping of users to groups is performed on the NameNode. Thus, the host system configuration of the NameNode determines the group mappings for the users." After I created the user on the NameNode and ran the command hdfs dfsadmin -refreshUserToGroupsMappings, the copy succeeded with no permission-denied error.
02-10-2020
11:51 AM
@GangWar Here it is.
$ id -Gn testuser
hadoop wheel hdfs