
Learn How To Secure A Hadoop Cluster Using Kerberos Part 2

In part 1 of this tutorial, the key terminology used in Kerberos authentication was discussed. We demonstrated how to set up and configure a KDC server that issues tickets to authenticate users, and how to install and configure a Kerberos client. The client can authenticate from the machine running the KDC server or from a remote machine; you just need to point it to the correct KDC server. We also demonstrated how to change the SSH and Hadoop configurations to enable Kerberos authentication. This tutorial does not review those concepts, so if you are not comfortable with them please refer to part 1. Here we focus on using Kerberos to authenticate users and services.

Each user that needs to access Hadoop requires their own Kerberos principal, so you will need to create as many principals as there are users. Principals are created with the kadmin utility, where you specify the name of the principal and the realm it will be created in. In this tutorial we create a principal named learner in the LOCALHOST realm (replace this with the realm you created in part 1 of the tutorial). The name learner is also a valid Ubuntu user account. Before creating principals, create an operating system account for each user and add it to the hadoop user group. The commands below create user accounts for hdfs, mapred, yarn and learner.

sudo adduser hdfs
sudo adduser hdfs hadoop
sudo adduser mapred
sudo adduser mapred hadoop
sudo adduser yarn
sudo adduser yarn hadoop
sudo adduser learner
sudo adduser learner hadoop
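
The same account setup can be expressed as a loop. The sketch below only prints the commands so they can be reviewed before being run; `print_user_setup` and `HADOOP_USERS` are hypothetical names introduced for illustration, and the account list is the one used in this tutorial.

```shell
# Accounts used in this tutorial; all of them join the hadoop group.
HADOOP_USERS="hdfs mapred yarn learner"

# Hypothetical helper: print (do not run) the adduser commands for each account.
print_user_setup() {
  for user in $HADOOP_USERS; do
    echo "sudo adduser $user"
    echo "sudo adduser $user hadoop"
  done
}

print_user_setup
```

Printing first makes it easy to spot a missing account before anything is changed on the system.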


sudo kadmin.local
addprinc learner@LOCALHOST

Using the construct above you can create every principal that is required. Let's also create a principal for the hdfs service.

addprinc hdfs@LOCALHOST

We can also create a principal for the yarn service.

addprinc yarn@LOCALHOST

Also create a principal for mapred.

addprinc mapred@LOCALHOST

Create a principal for the HTTP service.

addprinc HTTP@LOCALHOST
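
Instead of typing each addprinc at the kadmin prompt, the principals can be created with one kadmin.local invocation each. The sketch below prints those invocations for review; it assumes the principals take the form name@REALM with the LOCALHOST realm from this tutorial, and `print_addprinc_cmds`, `REALM` and `PRINCIPALS` are hypothetical names.

```shell
# Realm from this tutorial; replace with the realm created in part 1.
REALM="LOCALHOST"
PRINCIPALS="learner hdfs yarn mapred HTTP"

# Hypothetical helper: print one kadmin.local invocation per principal.
# The -q option passes a single query to kadmin.local.
print_addprinc_cmds() {
  for name in $PRINCIPALS; do
    echo "sudo kadmin.local -q \"addprinc ${name}@${REALM}\""
  done
}

print_addprinc_cmds
```

Once the principals exist, running `sudo kadmin.local -q "listprincs"` lists them so you can confirm that none is missing.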


In Kerberos terminology, the principals used by Hadoop services such as yarn and hdfs are referred to as service principals. For each service principal you create a keytab, a file that contains Kerberos principals and their encrypted keys. Keytabs allow services to authenticate without a password, in much the same way that SSH keys are used, and they must be distributed to every node in the Hadoop cluster. Each keytab points to a specific fully qualified domain name (FQDN), therefore each cluster node needs a keytab for every service principal. Access to keytabs needs to be restricted because anyone who can read a keytab gains the rights and privileges of the principals it contains.

Keytabs are created with the kadmin utility, so all keytab creation commands are run from its shell. To create a keytab you specify the name of the file that will store it and the principal or principals it will contain. The -norandkey option, which is only available in kadmin.local, exports the existing keys instead of generating new random ones. To create a keytab for the hdfs and HTTP principals use the commands below.

sudo kadmin.local
xst -norandkey -k hdfs.keytab hdfs@LOCALHOST HTTP@LOCALHOST

We also need to create a keytab containing the mapred and HTTP principals. The command below creates a keytab named mapred.keytab containing the two principals.

xst -norandkey -k mapred.keytab mapred@LOCALHOST HTTP@LOCALHOST

Create a keytab named yarn.keytab that will contain the yarn and HTTP principals.

xst -norandkey -k yarn.keytab yarn@LOCALHOST HTTP@LOCALHOST

After the keytab files have been created, we can inspect them with the klist command to check that they have been created correctly.

klist -e -k -t hdfs.keytab
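
Reading the klist output by eye works for one file, but the check can also be scripted. The sketch below is one possible way to do it, assuming the principals take the form name@LOCALHOST; `keytab_lists` is a hypothetical helper that reads a klist listing on standard input and succeeds when one of the whitespace-separated fields equals the given principal.

```shell
# Hypothetical helper: succeed if the keytab listing on stdin contains the
# given principal as a whole field (not just as a substring).
keytab_lists() {
  awk -v p="$1" '
    { for (i = 1; i <= NF; i++) if ($i == p) found = 1 }
    END { exit !found }
  '
}
```

It could be used as `klist -k hdfs.keytab | keytab_lists hdfs@LOCALHOST && echo "principal present"`, repeated for each principal the keytab is expected to hold.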

Once the keytab files have been created, we deploy them by moving them to a directory under /etc, here /etc/hadoop/conf/. The keytab files must be deployed on all nodes in the Hadoop cluster. When copying to a remote server it is advisable to use a secure method such as scp. When you are using MRv1 as your execution engine you need to deploy the hdfs and mapred keytabs, which the command below does.

sudo mv hdfs.keytab mapred.keytab /etc/hadoop/conf/

When you are using YARN as your execution engine you need to deploy the hdfs, mapred and yarn keytabs.

sudo mv hdfs.keytab mapred.keytab yarn.keytab /etc/hadoop/conf/
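
The mv command only covers the local machine. For the remaining nodes, the scp step mentioned above might look like the sketch below, which prints one scp command per node for review. The host names are placeholders, and `print_scp_cmds`, `NODES` and `KEYTABS` are hypothetical names.

```shell
# Hypothetical node names; replace with the hosts in your cluster.
NODES="node1.example.com node2.example.com"
KEYTABS="hdfs.keytab mapred.keytab yarn.keytab"

# Print one scp command per node. The files land in the remote home
# directory and still have to be moved to /etc/hadoop/conf/ with sudo there.
print_scp_cmds() {
  for node in $NODES; do
    echo "scp $KEYTABS $node:"
  done
}

print_scp_cmds
```

Copying to the home directory first avoids needing root access over scp; the move into /etc/hadoop/conf/ is then done on each node as shown earlier.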

After the keytabs have been deployed, we make each one readable only by its respective user by assigning ownership to the correct user and restricting the file mode. This improves security because anybody who can read a keytab has all the privileges belonging to its principals. The commands below change file ownership and permissions.

sudo chown hdfs:hadoop /etc/hadoop/conf/hdfs.keytab
sudo chown mapred:hadoop /etc/hadoop/conf/mapred.keytab
# only needed when yarn.keytab was deployed
sudo chown yarn:hadoop /etc/hadoop/conf/yarn.keytab
sudo chmod 400 /etc/hadoop/conf/*.keytab
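
To confirm that the permissions were applied, the file mode can be checked from a script. The sketch below assumes GNU stat, as shipped with Ubuntu; `is_mode_400` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file mode is exactly 400.
# Assumes GNU stat, where -c '%a' prints the mode in octal.
is_mode_400() {
  [ "$(stat -c '%a' "$1")" = "400" ]
}
```

For example, `is_mode_400 /etc/hadoop/conf/hdfs.keytab || echo "hdfs.keytab is too permissive"` flags a keytab that was missed by the chmod.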

To map a Kerberos principal to a specific operating system user, a rule specified in the auth_to_local setting of the krb5.conf configuration file is used. The default behaviour is to take the first part of the principal name as the operating system user if the principal is a member of the default realm specified in krb5.conf. For example, the principal hdfs@LOCALHOST is mapped to the hdfs user on the operating system if LOCALHOST has been specified as the default realm.
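
The default rule described above can be illustrated with a few lines of shell. This is only a sketch of the rule as stated in this tutorial, not the code Kerberos or Hadoop actually runs; `default_local_name` is a hypothetical helper that takes a principal and the default realm.

```shell
# Hypothetical helper: print the first part of the principal name when the
# principal belongs to the default realm; fail otherwise.
default_local_name() {
  principal=$1
  default_realm=$2
  realm=${principal##*@}   # text after the last @
  name=${principal%@*}     # text before the last @
  # No @ at all, or a realm other than the default: the rule does not apply.
  if [ "$name" = "$principal" ] || [ "$realm" != "$default_realm" ]; then
    return 1
  fi
  # Keep only the first component, dropping anything after a slash.
  printf '%s\n' "${name%%/*}"
}

default_local_name hdfs@LOCALHOST LOCALHOST   # prints: hdfs
```

A principal from a different realm makes the function return a non-zero status, mirroring the case where an explicit auth_to_local rule would be needed.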

To authenticate via Kerberos from the command line you use the kinit command to request tickets. When authenticating with a keytab, you specify the keytab and the principal requesting a ticket, and kinit does not prompt for a password. The construct for making such requests is shown below.

kinit -k -t hdfs.keytab hdfs
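
A ticket request is usually followed by a check that the ticket was actually issued. The sketch below wraps both steps; `login_with_keytab` is a hypothetical helper name, while kinit and klist are the standard Kerberos client commands.

```shell
# Hypothetical helper: obtain a ticket from a keytab, then list the ticket cache.
# Usage: login_with_keytab /etc/hadoop/conf/hdfs.keytab hdfs
login_with_keytab() {
  keytab=$1
  principal=$2
  # -k -t makes kinit read the key from the keytab instead of prompting.
  kinit -k -t "$keytab" "$principal" || return 1
  # klist without arguments prints the tickets in the current cache.
  klist
}
```

Because the deployed keytabs are readable only by their owners, the function has to run as the owning user, for example from a shell opened as hdfs. Running kdestroy afterwards discards the tickets when they are no longer needed.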

This tutorial briefly recapped what was covered in part 1 and is needed to follow along. We then demonstrated how to create operating system users and Kerberos principals, how to create keytab files that allow Kerberos authentication without a password, and how to deploy and secure the keytabs. Finally, we showed how a Kerberos principal is mapped to an operating system user to facilitate authentication, and how to request a ticket with kinit.


  1. Hi,
    Thanks for this information.
    I have a query: in your blog you mentioned how passwordless Kerberos security is set up.
    If the admin sets up password-based security instead, how can we access tools like hive/pig or hdfs?
    Thanks in advance

