Learn How To Secure A Hadoop Cluster Using Kerberos Part 2

Part 1 of this tutorial discussed the key terminology used in Kerberos authentication. We demonstrated how to set up and configure a KDC server that issues tickets to authenticate users, and how to install and configure a Kerberos client. The client can authenticate from the machine running the KDC server or from a remote machine; you just need to point it to the correct KDC server. We also demonstrated how to change the SSH and Hadoop configurations to enable Kerberos authentication. This tutorial does not review those concepts, so if you are not comfortable with them please refer to part 1. Here we focus on using Kerberos to authenticate users and services.

Each user who needs to access Hadoop requires their own Kerberos principal, so you will need to create as many principals as there are users. Before creating the principals, create an operating system account for each user; the accounts will belong to the hadoop user group. The first set of commands below creates user accounts for hdfs, mapred, yarn and learner. Principals are then created with the kadmin utility, where you specify the name of the principal and the realm it will be created in. The second command below creates a principal named learner in the LOCALHOST realm (replace this with the realm you created in part 1 of the tutorial). The name learner was chosen because it is a valid Ubuntu user account.

[Image: add users]
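
As a minimal sketch of the account-creation step on Ubuntu, the script below prints the commands rather than running them, so they can be reviewed first. The `groupadd` and `useradd` flags and the helper name `print_user_commands` are assumptions for illustration, not the exact commands from the original screenshot.

```shell
#!/bin/sh
# Sketch: print (not run) the commands that create the hadoop group and one
# operating system account per user. The flags are an assumption; adjust them
# to your local policy.
HADOOP_GROUP="hadoop"

print_user_commands() {
  # -f: exit successfully if the group already exists
  echo "sudo groupadd -f $HADOOP_GROUP"
  for user in "$@"; do
    # -m: create a home directory, -G: add the user to the hadoop group
    echo "sudo useradd -m -G $HADOOP_GROUP $user"
  done
}

print_user_commands hdfs mapred yarn learner
```

Once the printed commands look right, they can be run by hand or by piping the output to `sh`.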

[Image: create princ]

Using the same construct you can create every principal that is required. Let’s also create a principal for the hdfs service.

[Image: hdfs user]

We can also create a principal for the yarn service.

[Image: yarn user]

Also create a principal for mapred.

[Image: mapred user]

Create a principal for the HTTP service.

[Image: http princ]
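
The principal-creation steps can be sketched as follows. The script only prints `kadmin.local` invocations, using the `addprinc` command. Two things are assumptions here: service principals are written in the common Hadoop form `service/host@REALM`, and `node1.example.com` is a hypothetical host name. If your principals have no host part, drop the `/$HOST_FQDN` portion.

```shell
#!/bin/sh
# Sketch: print the kadmin.local invocations that create the principals.
REALM="LOCALHOST"               # realm from part 1; replace with your own
HOST_FQDN="node1.example.com"   # hypothetical host name (assumption)

print_principal_commands() {
  # User principal: kadmin prompts for a password when this command is run.
  echo "sudo kadmin.local -q \"addprinc learner@$REALM\""
  # Service principals: -randkey assigns a random key instead of a password,
  # which suits services that authenticate with a keytab.
  for service in hdfs yarn mapred HTTP; do
    echo "sudo kadmin.local -q \"addprinc -randkey $service/$HOST_FQDN@$REALM\""
  done
}

print_principal_commands
```

`kadmin.local` is the variant that runs directly on the KDC host; from a remote machine the same queries would go through `kadmin` with an administrative principal.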

In Kerberos terminology, Hadoop services such as yarn and hdfs are represented by service principals. For each service principal you create a keytab, a file that contains Kerberos principals and their encrypted keys. Keytabs allow passwordless communication and authentication, much as SSH keys do. The keytabs are distributed to every node in the Hadoop cluster. Because each keytab is tied to a specific fully qualified domain name (FQDN), each cluster node needs a keytab for every service principal. Access to keytabs must be restricted, because anyone who can read a keytab gains the rights and privileges of the principals it contains.

To create keytabs you use the kadmin utility, so all keytab creation commands are run from its shell. To create a keytab you specify the name of the file that will store it and the principal or principals it will contain. To create a keytab for the hdfs and HTTP principals, use the commands below.
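
A hedged sketch of keytab creation for all three services is given here; it prints the `kadmin.local` invocations instead of running them. The file names (`hdfs.keytab` and so on), the `service/host@REALM` principal form and the host name are assumptions. The `-norandkey` option of `ktadd` is only available in `kadmin.local`; it matters here because the HTTP principal goes into several keytabs, and without it each export would generate new keys and invalidate the earlier ones.

```shell
#!/bin/sh
# Sketch: print the kadmin.local invocations that export one keytab per
# service, each containing the service principal and the HTTP principal.
REALM="LOCALHOST"               # realm from part 1; replace with your own
HOST_FQDN="node1.example.com"   # hypothetical host name (assumption)

print_keytab_commands() {
  for service in hdfs mapred yarn; do
    # ktadd -k FILE PRINCIPAL...: write the principals' keys to FILE
    # -norandkey: keep the existing keys (kadmin.local only)
    echo "sudo kadmin.local -q \"ktadd -norandkey -k $service.keytab $service/$HOST_FQDN@$REALM HTTP/$HOST_FQDN@$REALM\""
  done
}

print_keytab_commands
```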

[Image: hdfs keytab]

We also need to create a keytab containing the mapred and HTTP principals. The command below will create a keytab named mapred containing the two principals.

[Image: mapred keytab]

Create a keytab named yarn that will contain the yarn and HTTP principals.

[Image: yarn keytab]

After the keytab files have been created, we can inspect them using the klist command to check that they were created correctly.
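
One way to inspect the keytabs, assuming they are named `hdfs.keytab`, `mapred.keytab` and `yarn.keytab` in the current directory (hypothetical names), is sketched below. The script prints one `klist` command per file.

```shell
#!/bin/sh
# Sketch: print one klist command per keytab file passed as an argument.
print_klist_commands() {
  for keytab in "$@"; do
    # -k: list the entries of a keytab, -t: show timestamps,
    # -e: show the encryption type of each key
    echo "klist -k -t -e $keytab"
  done
}

print_klist_commands hdfs.keytab mapred.keytab yarn.keytab
```

Each listing should show the service principal and the HTTP principal that were exported into that file.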

Once the keytab files have been created, we deploy them by moving them to a directory under /etc. The keytab files must be deployed on all nodes in the Hadoop cluster. When you are using MRv1 as your execution engine you need to deploy the hdfs and mapred keytabs. When copying to a remote server, use a secure method such as scp.

When you are using YARN as your execution engine you need to deploy the hdfs and yarn keytabs.
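
The choice of keytabs per execution engine can be sketched as a small helper that prints the `scp` commands for one node. The target directory `/etc/hadoop/conf`, the keytab file names and the node name are assumptions; the keytabs copied to a node should be the ones generated for that node's principals.

```shell
#!/bin/sh
# Sketch: print the scp commands that deploy the keytabs needed by the chosen
# execution engine (mrv1 or yarn) to a single node.
KEYTAB_DIR="/etc/hadoop/conf"   # assumed target directory under /etc

print_deploy_commands() {
  engine="$1"
  node="$2"
  case "$engine" in
    mrv1) keytabs="hdfs.keytab mapred.keytab" ;;
    yarn) keytabs="hdfs.keytab yarn.keytab" ;;
    *) echo "unknown engine: $engine" >&2; return 1 ;;
  esac
  for keytab in $keytabs; do
    echo "scp $keytab ${node}:${KEYTAB_DIR}/"
  done
}

print_deploy_commands yarn node1.example.com
```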

After the keytabs have been deployed, make each one readable only by its respective user by assigning ownership of the file to that user. This improves security, because anybody who can read a keytab has all the privileges of the principals it contains.
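
A minimal sketch of the ownership change, again printing the commands for review. The directory `/etc/hadoop/conf` and the `<user>.keytab` naming are assumptions. The `chmod 400` step goes slightly beyond the ownership change described above: it makes the file read-only for its owner and inaccessible to everyone else.

```shell
#!/bin/sh
# Sketch: print the commands that give each keytab to its service user and
# remove access for everyone else.
KEYTAB_DIR="/etc/hadoop/conf"   # assumed deployment directory

print_permission_commands() {
  for owner in "$@"; do
    echo "sudo chown $owner:hadoop $KEYTAB_DIR/$owner.keytab"
    # 400: read-only for the owner, no access for group or others
    echo "sudo chmod 400 $KEYTAB_DIR/$owner.keytab"
  done
}

print_permission_commands hdfs yarn
```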

To map a Kerberos principal to a specific operating system user, a rule specified in the auth_to_local setting of the krb5.conf configuration file is used. The default behaviour is to take the first part of the principal name as the operating system user, provided the principal is a member of the default realm specified in krb5.conf. For example, the principal hdfs@LOCALHOST is mapped to the hdfs user on the operating system if LOCALHOST has been specified as the default realm.
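
To make the default rule concrete, the function below mimics it as described in the paragraph above. It is an illustration only, not the code Kerberos or Hadoop actually runs, and `default_local_user` is a hypothetical name.

```shell
#!/bin/sh
# Sketch of the default mapping described above: if the principal belongs to
# the default realm, its first component becomes the local user name.
default_local_user() {
  principal="$1"
  default_realm="$2"
  realm="${principal##*@}"   # text after the last '@'
  name="${principal%@*}"     # text before the last '@'
  # Principals outside the default realm are not mapped by the default rule.
  [ "$realm" = "$default_realm" ] || return 1
  # Keep only the first component, e.g. "hdfs" from "hdfs/host".
  echo "${name%%/*}"
}

default_local_user "hdfs@LOCALHOST" "LOCALHOST"   # prints: hdfs
```

A principal from another realm, such as `learner@OTHER`, makes the function return a non-zero status and print nothing, which corresponds to the case where an explicit auth_to_local rule would be needed.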

A user authenticates interactively by running the kinit command and entering a password. To request a ticket without human interaction, you pass kinit the keytab and the principal requesting the ticket.
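
A keytab-based ticket request could look like the sketch below, which prints the `kinit` command for a given keytab and principal. The keytab path and the principal shown in the call are hypothetical values.

```shell
#!/bin/sh
# Sketch: print the kinit command that requests a ticket using a keytab.
print_kinit_command() {
  keytab="$1"
  principal="$2"
  # -k: take the key from a keytab instead of prompting for a password
  # -t: path of the keytab file to use
  echo "kinit -k -t $keytab $principal"
}

print_kinit_command /etc/hadoop/conf/hdfs.keytab hdfs/node1.example.com@LOCALHOST
```

After the printed command has been run successfully, `klist` with no arguments lists the ticket that was granted.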

This tutorial built on the concepts covered in part 1. It demonstrated how to create operating system users and Kerberos principals, how to create keytab files that allow Kerberos authentication without a password, and how to deploy those keytabs. It also showed how a Kerberos principal is mapped to an operating system user to facilitate authentication.

1 COMMENT

  1. Hi,
    Thanks for this information.
    I have a query, in your blog you mentioned how passwordless kerberos security is setup.
    if suppose admin setup password proof security then how we can access tools like hive/pig or hdfs.
    Thanks in advance
