Member since: 09-11-2015
Posts: 115
Kudos Received: 126
Solutions: 15
My Accepted Solutions
Views | Posted |
---|---|
1984 | 08-15-2016 05:48 PM |
1673 | 05-31-2016 06:19 PM |
1411 | 05-11-2016 03:10 PM |
1138 | 05-10-2016 07:06 PM |
3386 | 05-02-2016 06:25 PM |
04-04-2017
04:26 PM
The second link no longer works. It would be nice to have a comprehensive comparison, rather than "jupyter is good for running python locally, zeppelin is better for cluster workloads"
03-10-2017
06:11 AM
This is very helpful, thank you. Can you please advise on where to find the Jetty logs, in case of issues with the web application itself?
08-15-2016
05:48 PM
There could be a problem with the certificate itself. I recommend regenerating it and trying again. You can follow instructions in the Apache Knox Users Guide to generate a self-signed certificate: http://knox.apache.org/books/knox-0-6-0/user-guide.html#Generating+a+self-signed+cert+for+use+in+testing+or+development+environments If you want to use a more legitimate certificate you can generate and sign it yourself with OpenSSL or from a CA, and follow the steps in the next section of the guide, Using a CA Signed Key Pair.
08-12-2016
05:03 PM
Can you provide more details about how you're attempting to connect, and with which client? If you're using curl, specify the exact command (masking the user password if you want), and the exact version of curl + OS
07-11-2016
10:17 PM
The only way to reliably accomplish this is to prevent users from logging into cluster nodes at all, and force them to use beeline to access HS2 in HTTP mode through Knox. Every solution recommending changes to hive-env.sh or hive.distro can be overridden by using a modified copy of those files. Those files could even be copied from elsewhere, because this is all open source.
06-13-2016
06:29 PM
Running each topology on its own Gateway instance is fine, but it's not necessary. You can use a single Knox Gateway instance and simply create a separate topology per AD. Say you have 2 topologies, ad1 and ad2, then you can connect using:
https://knox-host:8443/gateway/ad1/<service>/
https://knox-host:8443/gateway/ad2/<service>/
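To make the per-topology addressing concrete, here is a minimal sketch that builds the base URL for each topology. The function name knox_url and the webhdfs/v1 service path are illustrative assumptions; knox-host:8443 is the placeholder host from the post.

```shell
#!/bin/sh
# Build the Knox URL for a given topology and service path.
# knox-host:8443 is the placeholder host/port used in the post above.
knox_url() {
    topology=$1
    service_path=$2
    printf 'https://knox-host:8443/gateway/%s/%s\n' "$topology" "$service_path"
}

knox_url ad1 webhdfs/v1
knox_url ad2 webhdfs/v1

# Hypothetical request as a user from the first AD (curl prompts for the password):
#   curl -ik -u myuser "$(knox_url ad1 webhdfs/v1)/?op=LISTSTATUS"
```

Only the topology segment of the path changes between directories, which is why a single Gateway instance is enough.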
06-10-2016
06:18 PM
2 Kudos
This feature is tracked in YARN-2477 (DockerContainerExecutor must support secure mode). There is currently no fix version specified. Docker container support on YARN is still very new, so you might want to follow the umbrella JIRA (YARN-2466) to get an idea of when additional features might become available.
06-08-2016
06:19 PM
@Tim Veil you might find this post helpful as a reference, or to integrate into your project: https://community.hortonworks.com/articles/29203/automated-kerberos-installation-and-configuration.html
06-01-2016
03:04 PM
1 Kudo
Sagar's answer is the best solution if both clusters will use the same AD. If each cluster has its own AD with unique users and groups, then you should clarify what you are hoping to gain by duplicating the policies. Keeping in mind that you'll need to sync them on an ongoing basis, it seems like "updating" every policy for a new set of users/groups would be more work than manually adding the policies on each cluster.
06-01-2016
02:07 PM
It sounds like there are two conflicting goals you might want to achieve. Is the intention to migrate cluster-B/2 to use the same AD as cluster-A/1? Or do all users have accounts in both ADs, and you want to translate the policies from A to B but keep them on different ADs?
05-31-2016
06:19 PM
1 Kudo
Anonymization rules are covered in the SmartSense Admin Guide. You will need to use a regular expression-based rule to mask from a text file. Depending on what text file(s) may contain passwords, you can either specify the exact filename or use a regular expression here as well. It's best to define the path as specifically as possible to avoid accidentally masking values in unrelated files. The string to mask/replace is identified by a regular expression. Here's a very simple example that will replace a line that contains the string "password:" in my-credentials.txt:
{
  "name": "my_credentials",
  "path": "my-credentials.txt",
  "pattern": ".*password:.*",
  "value": "password: Hidden"
},
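To preview what a pattern like this would match before adding it to the ruleset, the substitution can be approximated locally. This is only a sketch: it uses sed as a stand-in, and assumes the SmartSense regex engine treats this simple pattern the same way. The function name mask_passwords and the sample input are made up for illustration.

```shell
#!/bin/sh
# Approximate the rule's effect with sed: any line matching the pattern
# ".*password:.*" is replaced wholesale by the value "password: Hidden".
mask_passwords() {
    sed -E 's/.*password:.*/password: Hidden/'
}

# Sample input; only the middle line should change.
printf 'user: alice\npassword: s3cret\nport: 8080\n' | mask_passwords
```

Because the pattern starts and ends with .*, the entire line is replaced, including anything that precedes "password:".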
05-26-2016
07:20 PM
This recently started happening in my scripts too and I hadn't figured out why. Thanks for the tip!
05-24-2016
03:52 PM
Could you provide the output of the following command while executing the curl command?
tail -f /var/log/knox/gateway*.log
Also let us know the exact HDP version you're using, and whether you are using Kerberos and/or NameNode HA.
05-19-2016
03:59 AM
10 Kudos
SmartSense is an excellent tool for keeping your cluster running at optimal efficiency while maintaining operational best practices. We’ve combined knowledge from the greatest minds in the industry, and use it to analyze metadata about your cluster from the bundles you submit.

Have you ever wondered exactly what data you’re sending to SmartSense? The SmartSense Admin Guide contains a high-level description (see What’s Included in a Bundle), but for the greatest understanding you should extract a bundle and explore it with your own eyes!

Obtain a Bundle

There are two types of bundles:
- Analysis Bundle: configs and metrics for all services on all hosts
- Troubleshooting Bundle: Analysis Bundle + logs for selected service(s)

To begin, let’s capture an Analysis Bundle and download an unencrypted copy to our local machine.

The bundle is a gzipped tar file that contains a gzipped tar file from each host running the HST Agent. In the following examples, notice the bundle variable excludes the .tgz extension.

Linux or OS X users can extract everything with a bash for-loop:

bundle=a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35
tar zxf $bundle.tgz && cd $bundle && for i in * ; do tar zxf "$i" ; rm "$i" ; done

Windows users can use a similar process with a utility like 7-Zip. Assuming 7z.exe is in your path (if you save these commands in a batch file, write %%i instead of %i):

setlocal
set bundle=a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35
7z x %bundle%.tgz && 7z x %bundle%.tar && del %bundle%.tar && cd %bundle%
for %i in (*.tgz) do 7z x %i && del %i
for %i in (*.tar) do 7z x %i && del %i
endlocal

Exploring Bundle Contents

NOTE: Example console output was obtained from a SmartSense 1.2.1 bundle and may differ in future versions. The output is also truncated for brevity. You’re encouraged to follow along with a bundle from your own cluster.

For a convenient overview of the bundle contents, use the tree command, limited to a depth of 3:

MyLaptop:a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35 myuser$ tree -L 3
.
├── meta
│   └── metadata.json
├── mgmt.zoeocuz.com-a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35
│   ├── os
│   │   ├── logs
│   │   └── reports
│   └── services
│       ├── AMBARI
│       ├── AMS
│       ├── HDFS
│       ├── HST
│       ├── MR
│       ├── TEZ
│       ├── YARN
│       └── ZK
├── node1.zoeocuz.com-a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35
│   ├── os
│   │   ├── logs
│   │   └── reports
│   └── services
│       ├── AMBARI
...
41 directories, 4 files

At the root of the bundle, we see a ‘meta’ folder, and a folder per host. The meta folder contains some bundle metadata. Note that domain names are anonymized (my cluster uses example.com). Let’s take a look inside the two subfolders (os & services) per host.

Bundle Contents: OS

The os folder contains a couple of system logs and a variety of reports. Here’s a sample from my cluster:

MyLaptop:node1.zoeocuz.com-a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35 myuser$ tree -I "blockdevices" os/
os/
├── logs
│   └── messages.log
└── reports
    ├── chkconfig.txt
    ├── cpu_info.txt
    ├── dns_lookup.txt
    ├── dstat.txt
    ├── error_dmesg.txt
    ├── file_max.txt
...
5 directories, 49 files

Most of the filenames here are self-explanatory. Reports generally contain output from system commands or the /proc filesystem. These system characteristics serve as valuable inputs for determining your cluster’s optimal configuration.

Bundle Contents: Services

Within each host folder, the services subfolder contains configurations and reports for every HDP service on that host. Here’s an example from my node1:

MyLaptop:node1.zoeocuz.com-a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35 myuser$ tree -L 3 services
services
├── AMBARI
│   ├── conf
│   │   ├── ambari-agent.ini
│   │   ├── ambari-agent.pid
│   │   └── logging.conf.sample
│   └── reports
│       ├── ambari_rpm.txt
│       ├── postgres_rpm.txt
│       ├── postmaster.txt
│       └── process_info.txt
├── AMS
│   ├── conf
│   │   ├── ams-env.sh
│   │   ├── metric_groups.conf
│   │   └── metric_monitor.ini
│   ├── metrics
│   │   └── ams
│   └── reports
│       └── ams_rpm.txt
...
32 directories, 157 files

The conf folders are copied from their respective locations under /etc/ (or /var/run for the .pid files). Reports contain JMX metrics and output from CLI commands, such as the YARN application list.

You can explore the contents using text processing commands like grep, sort, and uniq, which might be sufficient for your needs. Another option is to use a text editor with a file-tree view.

Text Editors

Here are three open source text editors that integrate a file-tree for easy navigation (see attachments at the bottom for full-size images):
- TextMate 2 (OS X)
- Notepad++ (Windows)
- Vim + NerdTree (Linux, OS X)

Anonymization Rules

The default set of anonymization rules will protect IP addresses, hostnames, and password fields in standard HDP configuration files. You can modify or add anonymization rules if desired. Watch for a future HCC article where we take a deep dive into anonymization.

After making any changes to the anonymization ruleset, it is wise to verify everything is still functioning as intended. This can be accomplished by downloading an unencrypted bundle and examining its contents using the methods described above.

Until Next Time...

Keeping in mind that we only looked within a single host folder, and that my demo cluster has the minimum number of components for a functioning HDP stack, we can see that every bundle is packed with useful information. Knowing exactly what’s included in a SmartSense bundle provides peace of mind, and the trust that your confidential data remains secure and private.
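The verification step described under Anonymization Rules can be sketched with grep. This is a minimal example, not part of SmartSense itself: the function name find_leaks is made up, and it assumes you already extracted the bundle into a directory and know a string (such as your real domain) that should never appear in it.

```shell
#!/bin/sh
# List files under an extracted bundle that still contain a sensitive string.
# Prints the matching file paths; prints nothing when the string was anonymized.
find_leaks() {
    bundle_dir=$1
    needle=$2
    # -r: recurse, -l: print file names only, -F: treat the needle as a literal
    grep -rlF -- "$needle" "$bundle_dir"
}

# Example: the cluster's real domain should not appear anywhere in the bundle.
#   find_leaks a-00000000-c-00000000_supportlab_0_2016-05-17_23-05-35 example.com
```

Empty output means the string was not found in any file; any path printed is a file worth opening in one of the editors above.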
05-18-2016
07:32 PM
Good find! Here's a copy of the workaround:
1. Replace /var/lib/knox/data/services/yarn-ui/2.7.1/rewrite.xml with the attached rewrite.xml (change ownership to knox:knox)
2. Restart Knox
Note that "data" might be version-specific (e.g. data-2.4.2.0-258), or you can use /usr/hdp/current/knox-server/data/ instead. The fixed rewrite.xml is attached.
05-12-2016
04:23 PM
@Benjamin R Does it work if you add a trailing slash?
05-11-2016
05:46 PM
1 Kudo
For quick reference, here's an example of adding Oozie UI to HDP 2.4 Sandbox:
1. Start Sandbox and make sure all non-maintenance services are running
2. Add service definition:
git clone https://git-wip-us.apache.org/repos/asf/knox.git
cp -R knox/gateway-service-definitions/src/main/resources/services/oozieui /var/lib/knox/data-2.4.0.0-169/services/
chown -R knox:knox /var/lib/knox/data-2.4.0.0-169/services/oozieui
3. Add OOZIEUI service to default.xml topology (Ambari > Knox > Configs > Advanced topology):
<service>
    <role>OOZIEUI</role>
    <url>http://{{oozie_server_host}}:{{oozie_server_port}}/oozie</url>
</service>
4. Start (or restart) Knox & Demo LDAP (using Ambari)
5. Visit https://localhost:8443/gateway/default/oozie/
05-11-2016
03:10 PM
1 Kudo
If your users belong to different branches of the LDAP directory you'll need to use Advanced LDAP Authentication in the Knox topology. Review the linked doc to understand the limitations of userDnTemplate, and refer to the "Example provider config" section to understand the additional properties available. There should be log messages in gateway.log corresponding to the 401. Those might provide more insight into the reason for the error, so please provide them if possible.
05-10-2016
07:06 PM
2 Kudos
Unfortunately an application that uses a credential store will always need at least one cleartext password so it can unlock that credential store. This can be hardcoded into the binary or stored in a file. The ranger-policymgr-ssl.xml files contain the passwords to unlock the keystore and truststore used by Ranger agents. Obviously this file should be secured with the minimal permissions necessary. Other passwords in Ranger config files are stored in a credential store (jceks file), so they don't show up in plaintext in the configs. The credential stores typically use the default keystore password, so the files themselves should still be protected by appropriate file permissions. (thanks to @lmccay for clarifying the last part for me)
05-02-2016
06:25 PM
8 Kudos
This error occurs because the md5 digest was deprecated in favor of sha256 in recent versions of Java. It is fixed in the next SmartSense HST release. The workaround is somewhat complicated, so we recommend you open a support case for assistance. If you wish to attempt it yourself, here is the process.

WORKAROUND: Change the default digest to “sha256” instead of “md5” and then regenerate all certificates. Follow these steps:
1. Use Ambari to stop the SmartSense service (all components)
2. Back up the old server keys on the HST Server host:
cp -rp /var/lib/smartsense/hst-server/keys /var/lib/smartsense/hst-server/keys.backup
3. On the HST Server host, clean out the old keys:
rm -f /var/lib/smartsense/hst-server/keys/ca.key
rm -f /var/lib/smartsense/hst-server/keys/*.csr
rm -f /var/lib/smartsense/hst-server/keys/*.crt
rm -rf /var/lib/smartsense/hst-server/keys/db/*
mkdir /var/lib/smartsense/hst-server/keys/db/newcerts
touch /var/lib/smartsense/hst-server/keys/db/index.txt
echo 01 > /var/lib/smartsense/hst-server/keys/db/serial
4. Edit file /var/lib/smartsense/hst-server/keys/ca.config and change line "default_md = md5" to "default_md = sha256"
5. On all HST Agent hosts, clean out the old keys:
rm -f /var/lib/smartsense/hst-agent/keys/*
6. If using the HST Gateway:
   a. Stop the gateway: hst gateway stop
   b. Repeat steps 3 & 4 for the files under /var/lib/smartsense/hst-gateway/keys/ on the HST Gateway host
   c. Repeat step 5 for the files under /var/lib/smartsense/hst-gateway-client/keys on all HST Server host(s)
   d. Start the gateway: hst gateway start
7. Use Ambari to start the SmartSense service (all components)
8. Verify that both the Ambari SmartSense service and the SmartSense view show the correct number of registered agents.

NOTE: Turning off two-way SSL is NOT recommended (the error message has been improved in newer versions of HST). The issue occurs on hosts with the following JDK versions or newer:

JDK Family | Versions |
---|---|
Oracle | 1.8.0_71 |
Oracle | 1.7.0_95 |
Oracle | 1.6.0_111 |
OpenJDK | 1.7.0_45 |
OpenJDK | 1.8.0_40 |
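If you would rather script the ca.config edit than make it by hand, something along these lines should work. It is a sketch only: the function name set_sha256_digest is made up, and it assumes the file contains a line of the form "default_md = md5" as quoted in the steps above.

```shell
#!/bin/sh
# Switch the default digest in the CA config from md5 to sha256.
# A copy of the original file is kept alongside it with a .bak suffix.
set_sha256_digest() {
    ca_config=$1
    sed -i.bak -E 's/^([[:space:]]*default_md[[:space:]]*=[[:space:]]*)md5/\1sha256/' "$ca_config"
}

# Usage on the HST Server host (path taken from the steps above):
#   set_sha256_digest /var/lib/smartsense/hst-server/keys/ca.config
#   grep default_md /var/lib/smartsense/hst-server/keys/ca.config
```

The same function can be pointed at the ca.config under the HST Gateway keys directory when repeating the edit there.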
04-05-2016
06:50 PM
1 Kudo
You mention Knox 0.6.0; however, the path shows 0.5.0. For Java it will also help to know whether you are using Oracle or OpenJDK. To address these questions, please also provide the output of:
hdp-select versions
hdp-select status knox-server
rpm -qa | grep knox
java -version
03-29-2016
06:36 PM
You may also need to add or modify some gateway properties, such as gateway.frontend.url (undefined by default), to accommodate the load balancer.
03-29-2016
06:30 PM
This occurs on hosts with the following JDK versions or newer:

JDK Family | Versions |
---|---|
Oracle | 1.8.0_71 |
Oracle | 1.7.0_95 |
Oracle | 1.6.0_111 |
OpenJDK | 1.7.0_45 |
OpenJDK | 1.8.0_40 |

It is also recommended to upgrade to SmartSense 1.2.1+ while applying these changes.
03-21-2016
02:26 PM
1 Kudo
Database and table metadata is stored in the Hive Metastore, not in HDFS, so a different approach is needed to restrict this info from being sent to HiveServer2 clients. This feature was added in Hive 1.2.0 by HIVE-9350. You may need to use Ranger to achieve this functionality, which was added in RANGER-238. Both of these features are included in HDP 2.3.0+
03-14-2016
06:19 PM
7 Kudos
Kerberos user principals have 2 parts (otherwise you'd be right... that would be a deployment nightmare!). Only host-based service principals have 3 parts (the extra part being the host where the service is running). In the beeline connect string you should always use the hive service principal for the HiveServer2 instance to which you are connecting. Another option is to use _HOST instead of the specific hostname, which will be expanded to the correct host. For example:
kinit myuser@COMPANY.COM
beeline> !connect jdbc:hive2://somehost.company.com:10000/default;principal=hive/_HOST@COMPANY.COM
02-24-2016
08:01 PM
1 Kudo
@Prakash Punj Can you verify whether you are still experiencing an issue after replacing the hyphen/dash/minus symbol with the correct character, as described by Benjamin below?
02-19-2016
06:30 PM
1 Kudo
Confirmed, the pasted character is not a minus sign (ASCII 45 in decimal). You can verify with the 'od' command.
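For anyone who wants to repeat the check, here is one way to do it with od. The helper name byte_values is made up for this sketch, and the en dash is only an assumed example of a look-alike character; the character in the original paste may have been a different one.

```shell
#!/bin/sh
# Print the decimal value of each byte in the argument, separated by spaces.
byte_values() {
    bytes=$(printf '%s' "$1" | od -An -tu1 | tr -s ' \n' '  ')
    # Trim the leading/trailing padding left over from od's column layout.
    bytes=${bytes# }
    bytes=${bytes% }
    printf '%s\n' "$bytes"
}

byte_values '-'                          # ASCII hyphen-minus: a single byte, 45
byte_values "$(printf '\342\200\223')"   # UTF-8 en dash: three bytes, 226 128 147
```

A real minus sign shows up as the single value 45; anything longer means the command line contains a typographic dash that the shell will not treat as an option prefix.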
02-10-2016
07:09 PM
1 Kudo
@Adi Jabkowsky I think it attempts to bind as the user being authenticated. Additional LDAP properties are available in Hive 1.3: https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration
02-09-2016
06:12 PM
1 Kudo
As described in the docs:
- If you're using AD you should also define a custom hive-site property hive.server2.authentication.ldap.Domain
- If you're using OpenLDAP you should also define a custom hive-site property hive.server2.authentication.ldap.baseDN
Also make sure to force HiveServer2 to restart in Ambari. Go to the host(s) running HS2, and use the drop-down next to HiveServer2 to 'Restart', which will push the new configs. There was an Ambari bug that would mark all other Hive components for restart, but NOT HS2, even when it's required, and "Restart All Affected" will NOT push new HS2 configs in that case.
02-04-2016
08:56 PM
2 Kudos
Since JCEKS is a superset of JKS, what is the reason for having two files? For example, couldn't we use gateway.jceks that stored both gateway-identity and gateway-identity-passphrase, instead of using a separate __gateway-credentials.jceks in addition to gateway.jks?