Created on 11-08-2017 10:39 PM - edited 09-16-2022 01:41 AM
Distributed system concepts are derived from concepts that already exist on a single operating system (machine). Understanding those aspects of an operating system therefore helps us understand the bigger and more complex architecture of a distributed system.
Main idea | OS concept | Distributed system concept |
resource | CPU, RAM, network | YARN |
filesystem | NTFS, ext3 | HDFS |
process | Java, Perl, Python process | Spark, MapReduce |
database | MySQL | NoSQL |
authentication | PAM module | Knox |
authorization | NSS module | Ranger |
Holistically Security is based on foundation of
Securely exchanging data during the above-mentioned processes relies on cryptography:
1. Symmetric-key
2. Asymmetric-key
3. Hashing
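As a small illustration of hashing, here is a minimal Python sketch that uses only the standard library (hashlib and hmac). The message and key values are made up for the example, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string (one-way, fixed length)."""
    return hashlib.sha256(data).hexdigest()


def sign(key: bytes, message: bytes) -> str:
    """Keyed hash (HMAC-SHA256): only holders of the shared key can produce it."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    msg = b"hello cluster"   # hypothetical message
    key = b"shared-secret"   # hypothetical shared (symmetric) key
    tag = sign(key, msg)
    print(sha256_hex(msg))
    print(verify(key, msg, tag))          # the untouched message verifies
    print(verify(key, b"tampered", tag))  # a modified message does not
```

The plain digest only detects accidental changes. The keyed variant also ties the digest to a shared secret, which is where hashing and symmetric keys meet.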
===============================================================================================
Let's focus on the systems and mechanisms that enable authentication and authorization on the OS and for services.
1. SSL
KeyStore (server side: private key + signed public key certificate) and TrustStore (client side: CA public key certificates)
The first and major difference between a trustStore and a keyStore is that the trustStore is used by the TrustManager class and the keyStore is used by the KeyManager class in Java. The two do different jobs: the TrustManager determines whether a remote connection should be trusted, i.e. whether the remote party is who it claims to be, while the KeyManager decides which authentication credentials should be sent to the remote host during the SSL handshake. If you are an SSL server, you use your private key during the key exchange and send the certificate corresponding to your public key to the client; this certificate comes from the keyStore. An SSL client written in Java uses the certificates stored in its trustStore to verify the identity of the server. SSL certificates most commonly come as .cer files, which are added to a keyStore or trustStore with a key management utility such as keytool. See the post "How to add certificates into trustStore" for a step-by-step guide on adding certificates to a keyStore or trustStore in Java.
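The same split of roles can be sketched outside Java. The following Python example uses the standard ssl module, where PEM files stand in for the Java keyStore and trustStore. The function names and any file paths passed to them are hypothetical; without paths, the sketch only builds the contexts and loads nothing.

```python
import ssl
from typing import Optional


def make_client_context(truststore_pem: Optional[str] = None) -> ssl.SSLContext:
    """Client side: decides which servers to trust (the trustStore role)."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if truststore_pem is not None:
        # Trust the CA certificates found in this PEM file (hypothetical path).
        ctx.load_verify_locations(cafile=truststore_pem)
    return ctx


def make_server_context(cert_pem: Optional[str] = None,
                        key_pem: Optional[str] = None) -> ssl.SSLContext:
    """Server side: presents its own certificate and private key (the keyStore role)."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    if cert_pem is not None:
        # Signed certificate plus matching private key (hypothetical paths).
        ctx.load_cert_chain(certfile=cert_pem, keyfile=key_pem)
    return ctx


if __name__ == "__main__":
    client = make_client_context()
    server = make_server_context()
    print(client.verify_mode, client.check_hostname)
    print(server.verify_mode)
```

By default the client context insists on verifying the server certificate, while the server context does not ask the client for one, which mirrors the one-way trust described above.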
2. SSH
1. The server has a private-public key pair. When a client connects, it fetches the public key from the server.
2. The client has to accept the server's public key, which is then saved in the known_hosts file.
3. The client and server agree on a symmetric key using the classic Diffie-Hellman algorithm.
4. Note that with this algorithm the symmetric key becomes known to both sides without ever being sent over the wire.
5. The client sends its password, encrypted with the symmetric key, for authentication.
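Steps 3 and 4 can be illustrated with a toy Diffie-Hellman exchange. This is a sketch only: the modulus and base below are assumptions chosen for readability, not the groups or curves that a real SSH implementation negotiates.

```python
import secrets

# Toy public parameters, known to everyone (assumptions for this sketch).
P = 2**127 - 1  # a known Mersenne prime; far too small for real use
G = 3           # public base


def make_keypair():
    """Pick a random private exponent and derive the public value G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Combine our private exponent with the peer's public value."""
    return pow(peer_public, own_private, P)


if __name__ == "__main__":
    client_private, client_public = make_keypair()
    server_private, server_public = make_keypair()
    # Only the public values cross the wire; both sides derive the same secret.
    client_view = shared_secret(client_private, server_public)
    server_view = shared_secret(server_private, client_public)
    print(client_view == server_view)
```

Both sides arrive at the same value because (G^a)^b and (G^b)^a are equal modulo P, while an eavesdropper only ever sees G^a and G^b.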
For passwordless authentication
1. The client generates a public-private key pair. The client's public key is manually placed in the authorized_keys file on the server.
2. The server generates a random number, encrypts it with the client's public key from authorized_keys, and sends it to the client.
3. The client decrypts it with its private key, re-encrypts it with the symmetric key, and sends it back to the server.
4. The server decrypts it with the symmetric key; if it matches the original number, passwordless authentication succeeds.
5. For testing use testLink
3. Kerberos
1. The KDC (Key Distribution Center) contains the AS (Authentication Server) and the TGS (Ticket Granting Server).
2. The client's password is manually saved in the KDC.
3. The client's password is never sent over the network.
4. The client's username is sent to initiate the interaction between the client and the Authentication Server.
5. The AS sends the client a symmetric key encrypted with a key derived from the client's password.
6. A Kerberos realm is a set of managed nodes that share the same Kerberos database.
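The first exchange with the Authentication Server (steps 2 to 5 above) can be sketched as follows. This is not the Kerberos wire protocol: the class and function names are hypothetical, and the XOR "cipher" is a deliberately simple stand-in for the real encryption types, used only to show that the password itself never travels.

```python
import hashlib
import secrets


def derive_key(password: str, salt: bytes) -> bytes:
    """Turn a password into a 32-byte key (both the KDC and the client can do this)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def _toy_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR up to 32 bytes of data with SHA-256(key). Illustration only."""
    stream = hashlib.sha256(key).digest()
    return bytes(a ^ b for a, b in zip(data, stream))


class AuthenticationServer:
    def __init__(self):
        self._users = {}  # username -> (salt, key derived from the password)

    def register(self, username: str, password: str) -> None:
        """Out-of-band step: the credential is stored in the KDC database."""
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, derive_key(password, salt))

    def request(self, username: str):
        """The client sends only its username; the reply is sealed with its key."""
        salt, user_key = self._users[username]
        session_key = secrets.token_bytes(32)
        self.last_session_key = session_key  # kept only so the demo can compare
        return salt, _toy_xor(user_key, session_key)


def client_open_reply(password: str, salt: bytes, encrypted: bytes) -> bytes:
    """Client: re-derive the key locally from the typed password and unseal."""
    return _toy_xor(derive_key(password, salt), encrypted)


if __name__ == "__main__":
    kdc = AuthenticationServer()
    kdc.register("alice", "s3cret")  # hypothetical user and password
    salt, reply = kdc.request("alice")
    print(client_open_reply("s3cret", salt, reply) == kdc.last_session_key)
```

A client that types the wrong password still receives a reply, but it cannot recover the session key from it, so it cannot proceed.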
4. LDAP
The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. Directory services play an important role in developing intranet and Internet applications by allowing the sharing of information about users, systems, networks, services, and applications throughout the network.
The most important terms are:
1. DN - Distinguished Name (unique path)
2. OU - Organizational Unit (e.g. department)
3. DC - Domain Component (not domain controller, for once), e.g. com, org
4. CN - Common Name (end of the path)
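To see how these terms fit together, the sketch below splits a distinguished name into its components. The DN is a made-up example, and the parser assumes a simple DN with no escaped commas or multi-valued components.

```python
def parse_dn(dn: str):
    """Split a DN into (attribute, value) pairs, most specific component first."""
    pairs = []
    for part in dn.split(","):
        attr, _, value = part.strip().partition("=")
        pairs.append((attr.upper(), value))
    return pairs


def domain_of(dn: str) -> str:
    """Join the DC components into a dotted domain name."""
    return ".".join(value for attr, value in parse_dn(dn) if attr == "DC")


if __name__ == "__main__":
    dn = "cn=jdoe,ou=engineering,dc=example,dc=com"  # hypothetical entry
    print(parse_dn(dn))
    print(domain_of(dn))
```

Reading the DN from right to left walks down the directory tree: the DC components name the domain, the OU narrows it to a department, and the CN identifies the entry itself.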
================================================================================================
Authentication and authorization in OS
1. PAM
PAM is a framework that assists applications in performing what I'll call "authentication-related activities". The core pieces of PAM are a library (libpam) and a collection of PAM modules, which are dynamically linked library (.so) files in the folder /lib/security. PAM configuration files are stored in the /etc/pam.d/ directory.
2. NSS
The Name Service Switch (NSS) is a facility in Unix-like operating systems that provides a variety of sources for common configuration databases and name resolution mechanisms. These sources include local operating system files (such as /etc/passwd, /etc/group, and /etc/hosts), the Domain Name System (DNS), the Network Information Service (NIS), and LDAP.
NSS depends on the group, passwd, and shadow files for authorization.
Group : https://www.cyberciti.biz/faq/understanding-etcgroup-file/
Shadow: https://www.cyberciti.biz/faq/understanding-etcshadow-file/
Passwd : https://www.cyberciti.biz/faq/understanding-etcpasswd-file-format/
Both PAM and NSS can be linked to LDAP. LDAP also has an independent ldap client which can also be used to access LDAP.
It is possible for a user to exist in LDAP without existing locally on an operating system.
It helps to break things down like this in your head:
NSS manages OS-level databases, including passwd, group, shadow (this is important to note), and hosts. UID lookups use the passwd database, and GID lookups use the group database.
PAM handles authentication for services, but logins still rely on the passwd and group databases of NSS (you always need UID/GID lookups).
The important difference is that PAM does nothing on its own. If an application does not link against the PAM library and make calls to it, PAM will never get used. NSS is core to the operating system, and the databases are fairly ubiquitous to normal operation of the OS.
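The UID/GID lookups mentioned above can be tried from Python. On Unix-like systems its pwd and grp modules call the C library's lookup functions, which on typical Linux systems consult the sources configured for NSS, so the same call may return a local user or one that exists only in LDAP. The helper names are hypothetical, and the sketch assumes a Unix-like host.

```python
import grp
import pwd


def lookup_user(name: str):
    """Look a user up in the passwd database; return None if no source knows it."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        return None
    return {
        "name": entry.pw_name,
        "uid": entry.pw_uid,
        "gid": entry.pw_gid,
        "home": entry.pw_dir,
        "shell": entry.pw_shell,
    }


def lookup_group(gid: int):
    """Look a GID up in the group database; return the group name or None."""
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        return None


if __name__ == "__main__":
    user = lookup_user("root")
    print(user)
    if user is not None:
        print(lookup_group(user["gid"]))
```

Note that nothing in the calling code says where the answer comes from. That decision lives entirely in the NSS configuration.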
Now that we have that out of the way, here's the curve ball: while pam_ldap is the popular way to authenticate against LDAP, it's not the only way.
If shadow is pointing at the ldap service within /etc/nsswitch.conf, any authentication that runs against the shadow database will succeed if the attributes for those shadow field mappings (particularly the encrypted password field) are present in LDAP and would permit login.
This means pam_unix.so can potentially result in authentication against LDAP, as it authenticates against the shadow database (which is managed by NSS, and may be pointing at LDAP).
If a PAM module talks to a daemon that in turn queries LDAP (for example pam_sss.so, which hooks sssd), it's possible that LDAP will be referenced.
The sssd daemon acts as the spider in the web, controlling the login process and more. The login program communicates with the configured pam and nss modules, which in this case are provided by the SSSD package. These modules communicate with the corresponding SSSD responders, which in turn talk to the SSSD Monitor. SSSD looks up the user in the LDAP directory, then contacts the Kerberos KDC for authentication and to acquire tickets.
(PAM and NSS can also talk to LDAP directly using pam_ldap and nss_ldap respectively. However, SSSD provides additional functionality.)
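Whether a lookup can end up in LDAP therefore depends on what /etc/nsswitch.conf lists for each database. The sketch below parses text in that format and reports which databases name the ldap or sss services. The sample content is hypothetical, and the function names are assumptions made for illustration.

```python
import re

# Hypothetical nsswitch.conf content, used instead of reading the real file.
SAMPLE = """\
# database: sources, tried left to right
passwd:     files sss
shadow:     files ldap
group:      files sss
hosts:      files [NOTFOUND=return] dns
"""


def parse_nsswitch(text: str) -> dict:
    """Map each database name to its ordered list of source services."""
    databases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        name, _, sources = line.partition(":")
        # Remove bracketed action items such as [NOTFOUND=return].
        sources = re.sub(r"\[[^\]]*\]", " ", sources)
        databases[name.strip()] = sources.split()
    return databases


def may_reach_ldap(databases: dict, database: str) -> bool:
    """True if the database lists a source that can query LDAP (ldap or sss)."""
    return any(src in ("ldap", "sss") for src in databases.get(database, []))


if __name__ == "__main__":
    config = parse_nsswitch(SAMPLE)
    for name in ("passwd", "shadow", "group", "hosts"):
        print(name, config[name], may_reach_ldap(config, name))
```

With the sample above, shadow lists ldap directly, which is the case where pam_unix.so can end up authenticating against LDAP without pam_ldap being involved.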
Of course, a lot of this depends on how SSSD has been configured; there are lots of different scenarios. For example, you can configure SSSD to authenticate directly against LDAP, or to authenticate via Kerberos.
The sssd daemon does not actually do much that cannot be done with a system that has been "assembled by hand", but it has the advantage of handling everything in a centralised place. Another important benefit of SSSD is that it caches credentials, which eases the load on servers and makes it possible to go offline and still log in. This way you don't need a local account on the machine for offline authentication.
In a nutshell, SSSD provides in a seamless way what nss_ldap, pam_ldap, pam_krb5, and nscd used to provide.
Please follow this link to start digging into how authentication, authorization, and audit are provided for a cluster.
Keep in mind that there are multiple ways to log on to the cluster, and hence all of these paths need to be secured.
1. Ambari views
2. ssh onto a node
3. Login to a node through OS UI
4. Knox
All of the components should talk to an LDAP server to maintain a predefined set of users, and provide authorization and authentication using Ranger and Knox.
PFA : sssd.pdf