Member since: 02-21-2019
Posts: 69
Kudos Received: 44
Solutions: 11
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 694 | 06-06-2018 02:51 PM |
 | 2203 | 10-12-2017 02:48 PM |
 | 650 | 08-01-2017 08:58 PM |
 | 18777 | 06-12-2017 02:36 PM |
 | 2503 | 02-16-2017 04:58 PM |
01-08-2019
11:35 PM
<namenode> is the issue
06-06-2018
02:51 PM
1 Kudo
Hi @Thiago Charchar, you can use the HBase REST service that ships with the package by default; you only have to start it - the init script is located under /usr/hdp/current/hbase-master/etc/rc.d/hbase-rest. These are the endpoints it offers: https://hbase.apache.org/1.1/apidocs/org/apache/hadoop/hbase/rest/package-summary.html
You can start it on the HBase Master nodes (usually 2 of them), but if you need it to scale you can start it on as many nodes as required; it's just a Java app that serves the REST API and connects to HBase in the backend. You can also tune it a little, for example by setting the number of threads (in Custom hbase-site):
hbase.rest.threads.max=200
hbase.rest.threads.min=10
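If it helps, here's a quick way to check the service once it's started (a minimal sketch; I'm assuming the default REST port 8080, a placeholder host name hbase-master1 and a table t1 with row key row1 - adjust to your environment):
# REST server version
curl -i http://hbase-master1:8080/version
# list the tables, as JSON
curl -i -H "Accept: application/json" http://hbase-master1:8080/
# fetch a single row from table 't1' by row key
curl -i -H "Accept: application/json" http://hbase-master1:8080/t1/row1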
02-08-2018
11:31 PM
5 Kudos
Introduction
OpenID Connect (OIDC) is an authentication layer on top of OAuth 2.0, an authorization framework.
It uses simple JSON Web Tokens (JWT), an open standard for securely transmitting information as a JSON object. These objects are normally signed with an RSA key and contain information such as whether the user was authenticated, plus the user's id and email. More information and examples can be found here: https://auth0.com/docs/tokens/concepts/jwts
Knox, together with pac4j, a Java security engine, uses OpenID Connect (and also SAML, CAS, OAuth) to enable Single Sign-On with a variety of 3rd-party Identity Providers that support these protocols.
For example, Knox can be integrated with a service such as https://auth0.com to give users authenticated against auth0 access to Hadoop resources, without the need to enter their credentials again or even to integrate Knox with an LDAP.
The following is an example of how Knox can be integrated with auth0 using OpenID Connect.
Setup auth0
Sign up
The first step is to sign up to https://auth0.com and create an account to manage the identity provider.
Since this is a public SaaS-based service, it needs a unique identifier to distinguish between clients (it will form its own unique subdomain):
Add the first user
Once the account is created, add a new user in the auth0 internal database. This user will be used to log in to Knox. auth0 can integrate with a variety of sources, including AD/LDAP and social providers (Google, Facebook); however, for this test we define the user in the default auth0 internal database.
Go to the Users section and create your first user:
Define the client
Once we have a new user, the next step is to define a new client, in this case Knox, which is a Regular Web Application:
Client id and secret
Once the application is created, go to the Settings tab and note down the information that Knox will use to authenticate itself, as a client application, to auth0 (these will later be used in Knox's configuration):
Callback URLs
Then there is one setting that needs to be defined here - Allowed Callback URLs - which is the Knox URL that auth0 will redirect the user back to after successful authentication. This URL has the following format:
https://<KNOX>:8443/gateway/knoxsso/api/v1/websso?pac4jCallback=true&client_name=OidcClient
So for our example, we can use:
discoveryUri
Lastly, one other piece of configuration required by Knox is discoveryUri, which can be found at the end of the configuration page, in the Show Advanced Settings section -> Endpoints -> OpenID Configuration (typically in the format https://<IDP-FQDN>/.well-known/openid-configuration):
https://anghelknoxtest.eu.auth0.com/.well-known/openid-configuration
This process (creating an application, getting its ID and secret, configuring the allowed callback URL and finding the discoveryUri) is the same for any OpenID Connect identity provider; these steps have also been tested with NetIQ, for instance.
Knox Configuration
To enable any SSO type authentication in Knox (be it OpenID Connect, SAML or other pac4j IdP), the Knox default topology must be configured to use the SSOCookieProvider and then the knoxsso.xml topology must be configured to use the OpenID Connect provider.
default topology
Set the following in the default topology (Advanced topology in Ambari) so that it uses the SSOCookieProvider. Replace any other authentication providers (by default it's the ShiroProvider) with the ones below and set sso.authentication.provider.url to the correct Knox IP/FQDN:
<provider>
<role>webappsec</role>
<name>WebAppSec</name>
<enabled>true</enabled>
<param>
<name>cors.enabled</name>
<value>true</value>
</param>
</provider>
<provider>
<role>federation</role>
<name>SSOCookieProvider</name>
<enabled>true</enabled>
<param>
<name>sso.authentication.provider.url</name>
<value>https://00.000.000.000:8443/gateway/knoxsso/api/v1/websso</value>
</param>
</provider>
knoxsso topology
And then configure the pac4j provider for OpenID Connect in your knoxsso.xml topology (Advanced knoxsso-topology in Ambari) - as per documentation.
Set the following:
pac4j.callbackUrl should point to the Knox IP or FQDN in this format: https://<KNOX>:8443/gateway/knoxsso/api/v1/websso
clientName to OidcClient
oidc.id to the Client ID from the auth0 Client configuration
oidc.secret to the Client Secret from the auth0 Client configuration
oidc.discoveryUri to the OpenID Configuration URL (the discoveryUri) noted from the auth0 Client configuration
oidc.preferredJwsAlgorithm to RS256
knoxsso.redirect.whitelist.regex should include the IP or FQDN of Knox.
The following is the full topology definition for knoxsso.xml with the example values from the previous points:
<topology>
<gateway>
<provider>
<role>webappsec</role>
<name>WebAppSec</name>
<enabled>true</enabled>
<param><name>xframe.options.enabled</name><value>true</value></param>
</provider>
<provider>
<role>federation</role>
<name>pac4j</name>
<enabled>true</enabled>
<param>
<name>pac4j.callbackUrl</name>
<value>https://00.000.000.000:8443/gateway/knoxsso/api/v1/websso</value>
</param>
<param>
<name>clientName</name>
<value>OidcClient</value>
</param>
<param>
<name>oidc.id</name>
<value>8CD7789Nyl5QZd0Owuyamb7E0Qi29F9t</value>
</param>
<param>
<name>oidc.secret</name>
<value>CSIR3VtIdEdhak6LWYgPEv69P4J0P7ZcMOVnQovMoAnZGVOtCjcEEWyPOQoUxRh_</value>
</param>
<param>
<name>oidc.discoveryUri</name>
<value>https://anghelknoxtest.eu.auth0.com/.well-known/openid-configuration</value>
</param>
<param>
<name>oidc.preferredJwsAlgorithm</name>
<value>RS256</value>
</param>
</provider>
</gateway>
<application>
<name>knoxauth</name>
</application>
<service>
<role>KNOXSSO</role>
<param>
<name>knoxsso.cookie.secure.only</name>
<value>false</value>
</param>
<param>
<name>knoxsso.token.ttl</name>
<value>3600000</value>
</param>
<param>
<name>knoxsso.redirect.whitelist.regex</name>
<value>^https?:\/\/(localhost|00\.000\.000\.000|127\.0\.0\.1|0:0:0:0:0:0:0:1|::1):[0-9].*$</value>
</param>
</service>
</topology>
Test
Now restart Knox and test if it works.
To test, go to any Knox page, like https://<KNOX>:8443/gateway/default/templeton/v1/status or https://<KNOX>:8443/gateway/default/webhdfs/v1/tmp
For example, going to https://00.000.000.000:8443/gateway/default/templeton/v1/status should redirect to the auth0 authentication page:
And once you input the user details for the user that was previously created, it should redirect back to the knox page that was originally requested:
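You can also check the redirect from the command line before testing in a browser (a quick sketch; the IP is the same placeholder as above and -k only skips certificate validation for testing):
# an unauthenticated request should come back with a redirect (3xx) towards the knoxsso / auth0 login flow
curl -ik "https://00.000.000.000:8443/gateway/default/templeton/v1/status"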
The user id
There is one remaining issue: the user identifier retrieved by Knox, which is then used to communicate with Hadoop services and Ranger.
If we look in the gateway-audit.log file, we can see the following entry for the above request:
WEBHCAT|auth0|5a79dd769bf9bc6ee2539f67|||access|uri|/gateway/default/templeton/v1/status|success|Response status: 200
From the log, we can see that the user Knox actually "sees" is auth0|5a79dd769bf9bc6ee2539f67 which is the user id from auth0:
This is not exactly useful if we want to apply Ranger policies for example, or if we care about the user that Knox proxies to Hadoop services.
To understand better, if we enable DEBUG logging in Knox, we would see the following entry after the user authenticates:
DEBUG filter.Pac4jIdentityAdapter (Pac4jIdentityAdapter.java:doFilter(70)) - User authenticated as: <OidcProfile> | id: auth0|5a79dd769bf9bc6ee2539f67 |attributes: {sub=auth0|5a79dd769bf9bc6ee2539f67, email_verified=true, updated_at=2018-02-08T12:47:56.061Z, nickname=aanghel, name=aanghel@hortonworks.com, picture=https://s.gravatar.com/avatar/7baaabe6020925809d0650e9d4cefe9c?s=480&r=pg&d=https%3A%2F%2Fcdn.auth0.com%2Favatars%2Faa.png, email=aanghel@hortonworks.com} | roles: [] | permissions: [] | isRemembered: false |
From the above, we can see that the OpenID Connect implementation in Knox uses a profile, which has an id and many attributes. To be able to use any of these attributes, Knox 0.14 (or at least the KNOX-1119 patch) is required; it adds a new configuration parameter, pac4j.id_attribute, that allows us to pick the attribute we want from the profile above. We can define this configuration in the knoxsso topology, after pac4j.callbackUrl:
<param>
<name>pac4j.id_attribute</name>
<value>nickname</value>
</param>
With the above, Knox 0.14 will use the actual username when communicating with Hadoop or Ranger.
10-12-2017
04:07 PM
@balalaika ^^
10-12-2017
03:20 PM
1 Kudo
Glad it works, but I didn't say to use ls -l (which outputs the additional details), but ls -1 (as in the digit 1), which outputs only the filenames, one per line.
10-12-2017
02:48 PM
1 Kudo
Hi @balalaika It would be better to output the file listing with one filename per line and then have a SplitText processor, followed by a FetchFile. You can do this with the -1 parameter of ls:
ls -1 /home/user/test | grep ".zip"
SplitText will generate one flowfile for each file, which is then easily picked up by FetchFile. You will still need to add an ExtractText between them, but with a simple (.*) rule; this transfers the flowfile content (which is the actual filename) into an attribute that FetchFile can use as the filename.
10-11-2017
08:09 PM
Hi @Angel Mondragon This is a bit strange, I couldn't replicate it on a fresh HDP cluster. The warning is triggered by this line of code: egrep "WARNING.*yarn jar" -B1 /usr/hdp/current/hadoop-client/bin/hadoop.distro
if [[ -n "${YARN_OPTS}" ]] || [[ -n "${YARN_CLIENT_OPTS}" ]]; then
echo "WARNING: Use \"yarn jar\" to launch YARN applications." 1>&2
This means that you (or the scripts) set the YARN_OPTS or YARN_CLIENT_OPTS environment variables (maybe a previous command sources some environment file). Try echo $YARN_OPTS and echo $YARN_CLIENT_OPTS and, if either is set, unset it. You can also reproduce the warning in a local shell by running export YARN_CLIENT_OPTS=test before beeline.
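Put together, the check and the clean-up would look roughly like this (a sketch based on the steps above):
# see whether either variable is set in the current shell
echo "YARN_OPTS=${YARN_OPTS}"
echo "YARN_CLIENT_OPTS=${YARN_CLIENT_OPTS}"
# clear them if set, then re-run beeline
unset YARN_OPTS YARN_CLIENT_OPTS
# to reproduce the warning on purpose:
export YARN_CLIENT_OPTS=test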
10-11-2017
06:22 PM
1 Kudo
@Foivos A The output.stream relationship from ExecuteStreamCommand contains the stdout of the executed command. Unless you do cat <unzipped_file> at the end of your script, you won't see anything on that relationship, and that would only work if you have a single unzipped file, of course. The way I did this was to have the script echo the names of the local files at the end, one per line. That output goes to the output.stream relationship, and from there you can use SplitText to split the output by line, followed by FetchFile -> PutHDFS. If you're still interested I can share my flow and the scripts, but as Abdelkrim mentioned, UnpackContent should do the job even for very large files, since UnpackContent followed by PutHDFS is streamed and will not affect the NiFi heap.
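For reference, the wrapper script was roughly along these lines (just a sketch; the work directory and the use of unzip are assumptions, adapt to your case):
#!/bin/bash
# unzip the archive passed as the first argument into a work directory,
# then list the extracted files one per line so they end up on output.stream
ZIP_FILE="$1"
WORK_DIR="/tmp/unzipped/$(basename "$ZIP_FILE" .zip)"
mkdir -p "$WORK_DIR"
unzip -o -q "$ZIP_FILE" -d "$WORK_DIR"
ls -1 "$WORK_DIR"/*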
09-29-2017
07:15 PM
Hi @Jacqualin jasmin You have to use the API or the configs.sh helper script:
/var/lib/ambari-server/resources/scripts/configs.sh -port 8080 set localhost <CLUSTER_NAME> ranger-env ranger_admin_log_dir "/var/log/hdp/ranger/admin"
/var/lib/ambari-server/resources/scripts/configs.sh -port 8080 set localhost <CLUSTER_NAME> ranger-env ranger_usersync_log_dir "/var/log/hdp/ranger/usersync"
/var/lib/ambari-server/resources/scripts/configs.sh -port 8080 set localhost <CLUSTER_NAME> ranger-env ranger.tagsync.logdir "/var/log/hdp/ranger/tagsync"
08-03-2017
09:07 AM
Hi, yes, Knox is a component of the stack and needs to be added as a service to the cluster you're connecting to: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_security/content/perimeter_security_with_apache_knox.html It also needs to be configured for OAuth, as by default it uses an LDAP as the authentication backend. Maybe it would be easier to get access to a different endpoint without OAuth, like WebHDFS, or Ambari Views if a GUI is sufficient.
08-01-2017
09:09 PM
Hi @Nitish Rai In the Hortonworks stack only the Knox Gateway (https://hortonworks.com/apache/knox-gateway) supports OAuth, but this needs to be configured first: https://knox.apache.org/books/knox-0-12-0/user-guide.html#For+OAuth+support:
08-01-2017
08:58 PM
1 Kudo
Hi @Gerd Koenig NiFi needs two configuration passwords, nifi.security.encrypt.configuration.password and nifi.sensitive.props.key. You can pass those using the nifi-ambari-config in the blueprint. The passwords also need to be at least 12 characters in length. Here's a simple blueprint that normally works for me: {
"configurations" : [
{
"nifi-ambari-config" : {
"nifi.node.ssl.port": "9091",
"nifi.node.port": "9090",
"nifi.security.encrypt.configuration.password": "AsdQwe123456",
"nifi.sensitive.props.key": "AsdQwe123456"
}
},
{
"nifi-env" : {
"nifi_group" : "nifi",
"nifi_user" : "nifi"
}
}
],
"host_groups" : [
{
"name" : "mytestcluster-singlenode",
"configurations" : [ ],
"components" : [
{ "name" : "ZOOKEEPER_CLIENT" },
{ "name" : "INFRA_SOLR_CLIENT" },
{ "name" : "ZOOKEEPER_SERVER" },
{ "name" : "NIFI_MASTER" },
{ "name" : "AMBARI_SERVER" },
{ "name" : "INFRA_SOLR" },
{ "name" : "METRICS_COLLECTOR" },
{ "name" : "METRICS_GRAFANA" },
{ "name" : "METRICS_MONITOR" }
]
}
],
"Blueprints" : {
"stack_name" : "HDF",
"stack_version" : "3.0"
}
}
Regarding the folders, yes, NiFi will recreate them based on the configuration variables. These can also be set in the blueprint: {
"nifi-ambari-config" : {
"nifi.internal.dir" : "/var/lib/nifi",
"nifi.content.repository.dir.default" : "/var/lib/nifi/content_repository",
"nifi.state.dir" : "{nifi_internal_dir}/state/local",
"nifi.flow.config.dir" : "{nifi_internal_dir}/conf",
"nifi.config.dir" : "{nifi_install_dir}/conf",
"nifi.flowfile.repository.dir" : "/var/lib/nifi/flowfile_repository",
"nifi.provenance.repository.dir.default" : "/var/lib/nifi/provenance_repository",
"nifi.database.dir" : "/var/lib/nifi/database_repository"
}
},
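For completeness, a blueprint like the one above is registered and then instantiated through the Ambari REST API. A rough sketch, assuming Ambari on localhost:8080, admin:admin credentials, the blueprint saved as blueprint.json and a cluster creation template saved as cluster.json (the template references the blueprint name and maps the host group to a real host):
# register the blueprint under a name of your choice
curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d @blueprint.json http://localhost:8080/api/v1/blueprints/nifi-singlenode
# create the cluster from the blueprint using the cluster creation template
curl -u admin:admin -H "X-Requested-By: ambari" -X POST -d @cluster.json http://localhost:8080/api/v1/clusters/mytestcluster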
07-03-2017
07:32 AM
Hi @suresh krish this is now fixed in Ambari 2.5.1: https://issues.apache.org/jira/browse/AMBARI-20868
06-12-2017
02:36 PM
@Robin Dong In Linux, only iptables controls the kernel-based firewall. You might have firewalld on CentOS 7 or ufw on Ubuntu, but they're just abstraction layers on top of iptables. So if iptables -L doesn't show anything, then it's all good. The Ambari iptables check is rudimentary and doesn't know whether any existing rules still allow all the required traffic; it only checks service iptables status or systemctl status firewalld, not whether the filter tables are empty. But please be aware of the cloud firewall as well. For example, in AWS even instances in the same Security Group are not allowed to communicate with each other by default, and this must be enabled explicitly: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/security-group-rules-reference.html#sg-rules-other-instances
06-12-2017
02:04 PM
@Robin Dong Sounds like you're using a systemd-based OS, so I'm assuming that's CentOS/RHEL 7. In that case the firewall service is handled by firewalld rather than iptables:
systemctl stop firewalld
systemctl disable firewalld
Depending on the AMI used, this might already be disabled or not installed. And as long as iptables -L doesn't show anything, you should also be fine.
06-12-2017
01:57 PM
@Justin R. This sounds odd. In Linux, non-root users cannot listen on ports below 1024, so are you running NiFi as root? That wouldn't be advisable. What is the complete stack trace of the error? Maybe there is another NiFi processor already listening?
06-12-2017
01:46 PM
@Ivan Majnaric You have a typo in the spark-executor.memory=4g configuration variable, and it's incorrect anyway:
Warning:Ignoring non-spark config property: spark-executor.memory=4g
It should be --executor-memory 4g https://spark.apache.org/docs/latest/running-on-yarn.html#launching-spark-on-yarn
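For example (a sketch with placeholder class and jar names):
spark-submit --master yarn --deploy-mode cluster --executor-memory 4g --class com.example.MyApp my-app.jar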
06-12-2017
01:39 PM
@Meryem Moumen It could be for many reasons, you have to check the YARN logs for the application ID that failed: yarn logs -applicationId application_1495808813398_0018
02-17-2017
10:28 AM
Could be network, although I would have expected a different error. Maybe there are other errors in the logs pointing to a connectivity issue? I still don't see any dispatch in your log; that's the request Knox would make to the NameNode. Do you see it in your audit log like in my example? The second check, curl -i http://our_namenode_host:50070/static/bootstrap-3.0.2/js/bootstrap.min.js, did you run it from the Knox host? I've seen weird behaviours in the past when HTTP proxies are configured. If you have one on the Knox box, can you make sure the NameNode host is on the NO_PROXY / ignore list? With a proxy, your shell session / user / curl might be allowed to make the request correctly, but Knox would go via the proxy.
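One quick way to rule a proxy out when testing from the Knox box (the NameNode host is the same placeholder as in your check):
# force curl to bypass any configured HTTP proxy for this request
curl --noproxy '*' -i http://our_namenode_host:50070/static/bootstrap-3.0.2/js/bootstrap.min.js
# and see what proxy variables the shell has set
env | grep -i proxy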
02-16-2017
05:38 PM
It's really strange @Andreas Schild I've replicated your configuration on my cluster and it still works fine, in IE or Firefox. Let's try the following troubleshooting steps.
1) First make sure the topology file has actually been resolved with the correct values: cat /etc/knox/conf/topologies/default.xml
2) Get from it the URL for HDFSUI, append /static/bootstrap-3.0.2/js/bootstrap.min.js to it and try it out with curl; it should look like this: curl -i http://namenode:50070/static/bootstrap-3.0.2/js/bootstrap.min.js
3) I'd also want to see the full line from gateway-audit.log; mine shows additional information (like the service name) and also the dispatch:
17/02/16 17:15:53 ||c5f07e39-ab12-4a71-9283-f37a35128419|audit|HDFSUI|guest|||dispatch|uri|http://namenode:50070/static/bootstrap-3.0.2/js/bootstrap.min.js?doAs=guest|success|Response status: 200
17/02/16 17:15:53 ||c5f07e39-ab12-4a71-9283-f37a35128419|audit|HDFSUI|guest|||access|uri|/gateway/default/hdfs/static/bootstrap-3.0.2/js/bootstrap.min.js|success|Response status: 200
4) Check under /usr/hdp/current/knox-server/data/services/ that you have all the services; I have: ls -l /usr/hdp/current/knox-server/data/services/
total 0
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 ambari
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 ambariui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 falcon
drwxr-xr-x. 3 knox knox 19 Feb 7 09:41 hbase
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 hbaseui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 hdfsui
drwxr-xr-x. 3 knox knox 19 Feb 7 09:41 hive
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 jobhistoryui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 oozie
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 oozieui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 ranger
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 rangerui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 sparkhistoryui
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 storm
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 storm-logviewer
drwxr-xr-x. 3 knox knox 19 Feb 7 09:41 webhcat
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 webhdfs
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 yarn-rm
drwxr-xr-x. 3 knox knox 18 Feb 7 09:41 yarnui
5) Stop Knox and delete/move /usr/hdp/2.5.3.0-37/knox/bin/../data/deployments/default*
02-16-2017
04:58 PM
1 Kudo
@Bhavin Tandel I had a similar issue recently and while the Ambari YARN Queue Manager View doesn't offer a configurable URL, a workaround I found was to add the cluster as a Remote Cluster (using the alias URL when defining the Ambari Cluster URL) and then point the View to the remote cluster. This seemed to work just fine.
02-16-2017
10:10 AM
What Knox / HDP version do you have? It works fine on my HDP 2.5.3 Knox. Have you also defined WEBHDFS and RESOURCEMANAGER in your topology? Any errors in gateway.log and do you see the same 404s in gateway-audit.log?
02-16-2017
09:13 AM
You can try adding a trailing / in your browser when you make the request. Otherwise, look at the GitHub code and replace the XMLs for the services you have issues with using the ones on GitHub; for example, replace the files under /usr/hdp/current/knox-server/data/services/hdfsui/2.7.0/ with the ones from https://github.com/apache/knox/tree/v0.11.0/gateway-service-definitions/src/main/resources/services/hdfsui/2.7.0
12-21-2016
09:36 AM
That's great to hear @Narasimma varman ! Can you accept the answer please so we know this issue / thread is closed?
12-20-2016
09:45 PM
3 Kudos
Hi @Narasimma varman After reading your message again, it looks like you're trying to follow https://community.hortonworks.com/articles/7341/nifi-user-authentication-with-ldap.html which, on a closer look, uses the Demo LDAP that ships with Knox. The Knox Demo LDAP listens on port 33389, however it's not started automatically when you start Knox. Please make sure you go to Knox in Ambari and select Start Demo LDAP from the Service Actions, as per the screenshot from the link above: https://community.hortonworks.com/storage/attachments/956-1.jpg
You can verify that the Demo LDAP has started and is listening on port 33389 by running: netstat -tnlp|grep 33389
If you see a process listening, you can then configure ambari-server setup-ldap with the following options (use admin-password when asked for the Manager password):
# ambari-server setup-ldap
Using python /usr/bin/python
Setting up LDAP properties...
Primary URL* {host:port} (localhost:33389): localhost:33389
Secondary URL {host:port} :
Use SSL* [true/false] (false):
User object class* (person): person
User name attribute* (uid): uid
Group object class* (groupofnames): groupofnames
Group name attribute* (cn): cn
Group member attribute* (member): member
Distinguished name attribute* (dn): dn
Base DN* (dc=hadoop,dc=apache,dc=org): dc=hadoop,dc=apache,dc=org
Referral method [follow/ignore] (follow):
Bind anonymously* [true/false] (false): false
Manager DN* (uid=admin,ou=people,dc=hadoop,dc=apache,dc=org): uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
Enter Manager Password* :
Re-enter password:
====================
Review Settings
====================
authentication.ldap.managerDn: uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
authentication.ldap.managerPassword: *****
Save settings [y/n] (y)? y
Saving...done
Ambari Server 'setup-ldap' completed successfully.
You might also need to turn off pagination, as the Knox LDAP doesn't support it: echo "authentication.ldap.pagination.enabled=false" >> /etc/ambari-server/conf/ambari.properties
Now, don't forget to restart Ambari Server, and be aware that after running ambari-server sync-ldap --all the admin user password will change to admin-password.
Other users can be found by running: cat /etc/knox/conf/users.ldif|egrep "^uid|^userPassword"
And you can add new users by changing Advanced users-ldif under the Knox Config in Ambari. Good luck!
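If you want to double-check the Demo LDAP contents directly, something like this should work (assuming the OpenLDAP client tools are installed; the bind DN and admin-password are the demo defaults mentioned above):
ldapsearch -H ldap://localhost:33389 -D "uid=admin,ou=people,dc=hadoop,dc=apache,dc=org" -w admin-password -b "dc=hadoop,dc=apache,dc=org" "(objectclass=person)" uid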
12-20-2016
08:58 PM
Hmm, okay, but I expect this to be transient as we're talking about outgoing TCP connections; the connection that could have potentially caused the issue is not there any longer. So it's more important, when you start Zeppelin and get the "This address is already in use" error, to also check netstat at around the same time. The other error you got when you tried to install Zeppelin again is not from Zeppelin but from Ambari trying to create the zeppelin user's home folder in HDFS. It looks like HDFS (WebHDFS in this case) is not working, so please check that (lenu.dom.hdp on port 50070).
12-20-2016
09:27 AM
2 Kudos
Hi @Narasimma varman After running ambari-server setup-ldap did you restart the Ambari Server? The localhost:33389 error means Ambari Server hasn't been restarted and it's using the default configuration.
12-19-2016
03:42 PM
Hi @Hoang Le Use the following doc to configure CPU scheduling: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_yarn-resource-management/content/ch_cpu_scheduling.html It's also recommended to configure cgroups for this to be effective: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_yarn-resource-management/content/enabling_cgroups.html However, these can be a bit of a pain. Most of the time, increasing the container size (map, reduce and tez) will reduce the CPU load. Setting YARN container memory values is a balancing act between memory (RAM), processors (CPU cores), and disks, so that processing is not constrained by any one of these cluster resources. However, if the application usage pattern is known, the containers can be tuned to influence and maximize resource utilization.
For example, if CPU usage and load average are too high, an increase in container size reduces the number of containers allocated per node. A good rule of thumb is not to let the load average grow beyond 4 times the number of physical cores. A likely cause of high CPU usage is garbage collection, especially continuous garbage collection, which a larger container would help with. Increase the following by 50% and observe any changes:
tez.task.resource.memory.mb
hive.tez.container.size
mapreduce.map.memory.mb
mapreduce.reduce.memory.mb
12-19-2016
10:46 AM
Hi @Dmitry Otblesk What exactly did you check with netstat? If you've only checked which ports are used by listening services (netstat -l), I suggest checking all ports. I've seen cases where Hadoop services tried to listen on ports already used as the source port of other TCP connections: netstat -anp|grep 9995
12-19-2016
12:15 AM
2 Kudos
Hi @Connor O'Neal The main reason why swap is enabled in the first place is to prevent the Linux OOM (Out-Of-Memory) Killer from terminating processes when memory pressure is too high (memory usage without buffers is close to 100%). The general recommendation for worker nodes is to have swap disabled.
The logic is that in a distributed system it's preferable to have the OS terminate processes (which can easily recover) than to have 1 or 2 swapping processes (YARN containers) greatly degrade the performance of a distributed job running on the cluster.
If there's an internal policy that requires swap to be present, the least intrusive action is to set swappiness to 1, which reduces the likelihood of swapping as much as possible (only swap when absolutely necessary). The general recommendation for master nodes is to have swap enabled but to reduce the likelihood of swapping.
If master services are abruptly terminated by the OOM killer (similar to kill -9), cluster availability is affected (especially if there are no HA services) and the possibility of data corruption increases (as the services are not allowed to terminate gracefully). In conclusion, the recommendation is to set swappiness to 1 on all cluster nodes and to discuss with your systems administrator the possibility of setting swappiness to 0 (equivalent to disabling swap) on the worker nodes.
This can be achieved on a running system with the following command: echo 1 > /proc/sys/vm/swappiness
For a permanent setting, add vm.swappiness=1 to /etc/sysctl.conf. Also, a word of caution regarding CentOS/RHEL 7: a permanent setting via /etc/sysctl.conf might not always work there, because RHEL 7 introduces a new service called tuned which overwrites values set in /etc/sysctl.conf.
So if this service is active, create a file, for example /etc/tuned/hdp/tuned.conf, with the following content:
[main]
include=throughput-performance
[sysctl]
vm.swappiness=1
[vm]
transparent_hugepages=never
And run the following command:
tuned-adm profile hdp
The throughput-performance profile is already the default in RHEL7 so this only applies changes on top of it.
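To confirm the change took effect (a quick check):
# the active tuned profile should now be hdp
tuned-adm active
# and the kernel should report the new swappiness value
cat /proc/sys/vm/swappiness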