Pitfalls of Hadoop Kerberos
顾亮亮
2015.11.03


Agenda


Kerberos (Baidu Baike)

The name Kerberos comes from Greek mythology: the three-headed dog that guards the gates of the underworld.


Kerberos (Wikipedia)

Kerberos /ˈkərbərəs/ is a computer network authentication protocol which works on the basis of ‘tickets’ to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner.


Authentication vs. Authorization

Authentication is about who somebody is.

Authorization is about what they’re allowed to do.


Step 1: Authentication Service - TGT Delivery

TGT: Ticket Granting Ticket

From: The MIT Kerberos Administrator’s How-to Guide


Step 2: Ticket Granting Service - TGS Delivery

TGS: Ticket Granting Service


Diagrams


Overall


Token Expiration Problem When Hadoop Meets Kerberos


Hadoop Authentication Flow

From: Hadoop Security Design


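HDFS Authentication (Case 1: Single JVM Application)

  1. Use the keytab to obtain a TGT from the KDC, then use the TGT to obtain the TGS for the NameNode
  2. Authenticate to the NameNode with the TGS, then obtain a Block Token
  3. Use the Block Token to read data from the DataNode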

Block Token


HDFS Authentication (Case 2: Yarn Application)

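How do Executors running on Yarn authenticate?

  1. The client uses the keytab to obtain a TGT from the KDC, then uses the TGT to obtain the TGS for the NameNode
  2. The client authenticates to the NameNode with the TGS, then obtains an HDFS Delegation Token
  3. The client sends the HDFS Delegation Token to each Executor
  4. The Executor uses the HDFS Delegation Token to obtain a Block Token from the NameNode
  5. The Executor uses the Block Token to read data from the DataNode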
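
The point of the five steps above is that only the client ever talks to the KDC; an Executor holds nothing but the delegation token. A toy sketch of that hand-off follows. Every class and method name in it is hypothetical and only mimics who holds which credential; none of it is Hadoop code.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the delegation-token hand-off; not a Hadoop API.
class DelegationFlowDemo {
    static final class ToyNameNode {
        private final Set<String> issued = new HashSet<>();
        private int counter = 0;

        // Step 2: only a Kerberos-authenticated caller gets a delegation token.
        String issueDelegationToken(boolean kerberosAuthenticated) {
            if (!kerberosAuthenticated) {
                throw new SecurityException("Kerberos authentication required");
            }
            String token = "delegation-token-" + (++counter);
            issued.add(token);
            return token;
        }

        // Step 4: a known delegation token is enough to get a block token.
        String issueBlockToken(String delegationToken) {
            if (!issued.contains(delegationToken)) {
                throw new SecurityException("unknown delegation token");
            }
            return "block-token-for-" + delegationToken;
        }
    }

    public static void main(String[] args) {
        ToyNameNode nameNode = new ToyNameNode();
        // Steps 1-2: the client has logged in from its keytab (modelled as `true`).
        String delegationToken = nameNode.issueDelegationToken(true);
        // Step 3: the client ships the token to an Executor (a plain variable here).
        String tokenOnExecutor = delegationToken;
        // Steps 4-5: the Executor trades it for a block token before reading data.
        System.out.println(nameNode.issueBlockToken(tokenOnExecutor));
    }
}
```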

HDFS Delegation Token


Token Expiration Problem When Hadoop Meets Kerberos

  1. keytab (long term)
  2. TGT (max life time = 3 days)
  3. TGS for NameNode (max life time = 3 days)
  4. HDFS Delegation Token (max life time = 7 days)
  5. Block Access Token (short lifetime)

**Problem: Applications on Yarn cannot run > 7 days**
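
The 7-day limit follows from which credential a process can still refresh. A minimal sketch of that reasoning, using the delegation token lifetime listed above; the class and method names are hypothetical and not part of Hadoop.

```java
import java.time.Duration;

// Hypothetical model of the lifetimes listed above; not a Hadoop API.
class CredentialLifetimes {
    static final Duration DELEGATION_TOKEN_MAX_LIFE = Duration.ofDays(7);

    // A process that holds a keytab can always log in again and fetch fresh
    // tokens. A process that only holds a delegation token loses HDFS access
    // once the token's maximum lifetime has passed.
    static boolean canAccessHdfs(Duration runningFor, boolean hasKeytab) {
        if (hasKeytab) {
            return true;
        }
        return runningFor.compareTo(DELEGATION_TOKEN_MAX_LIFE) < 0;
    }

    public static void main(String[] args) {
        System.out.println(canAccessHdfs(Duration.ofDays(6), false)); // true
        System.out.println(canAccessHdfs(Duration.ofDays(8), false)); // false
        System.out.println(canAccessHdfs(Duration.ofDays(8), true));  // true
    }
}
```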

How Spark Solves the Problem


Old Design

**Problem: cannot run > 7 days on Yarn**

New Design from Spark-1.4.0

From: Long Running Spark Apps on Secure YARN

**Upload the keytab to HDFS and log in from the keytab before the token expires**
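
One way to picture "before the token expires" is to schedule the next login at a fixed fraction of the time remaining until expiry. The sketch below is an illustration only: the 0.75 fraction and all names are assumptions, not values taken from the Spark source.

```java
// Hypothetical helper for picking the next login time; not Spark code.
class RenewalSchedule {
    // Returns the wall-clock time (in ms) at which to log in again, placed at
    // `fraction` of the interval between now and the token's expiry.
    static long nextRenewalMillis(long nowMillis, long expiryMillis, double fraction) {
        if (expiryMillis <= nowMillis) {
            return nowMillis; // already expired: renew immediately
        }
        return nowMillis + (long) ((expiryMillis - nowMillis) * fraction);
    }

    public static void main(String[] args) {
        long sevenDays = 7L * 24 * 60 * 60 * 1000;
        // With a 7-day token and fraction 0.75, renew after 5.25 days.
        System.out.println(nextRenewalMillis(0L, sevenDays, 0.75)); // 453600000
    }
}
```

Renewing well before expiry leaves room to retry if the KDC or NameNode is briefly unavailable.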

SPARK-5342 Allow long running Spark apps to run on secure YARN/HDFS


Log Aggregation Token Expiration Problem

Configuring YARN for Long-running Applications (Hadoop 2.6.0)


Hadoop Kerberos Programming API


UserGroupInformation (Hadoop)

!java
// Log a user in from a keytab file. Loads a user identity from a keytab
// file and logs them in. They become the currently logged-in user.
static void loginUserFromKeytab(String user, String path)

// Log a user in from a keytab file. Loads a user identity from a keytab
// file and logs them in. This new user does not affect the currently
// logged-in user.
static UserGroupInformation loginUserFromKeytabAndReturnUGI(String user,
                                                            String path)

// Return the current user, including any doAs in the current stack.
static UserGroupInformation getCurrentUser()

// Re-login a user from keytab if TGT is expired or is close to expiry.
void checkTGTAndReloginFromKeytab()

// Add the given Credentials to this user.
public void addCredentials(Credentials credentials)

public Credentials getCredentials()

// Run the given action as the user.
public <T> T doAs(PrivilegedAction<T> action)

UserGroupInformation Runtime

!scala
UserGroupInformation.loginUserFromKeytab(principal, keytab)
val ugi = UserGroupInformation.getCurrentUser


Subject (Java)

A Subject represents a grouping of related information for a single entity, such as a person. Such information includes the Subject’s identities as well as its security-related attributes (passwords and cryptographic keys, for example).


Subject Runtime

!scala
val ugi = UserGroupInformation.getCurrentUser


Subject.doAs

!java
/**
 * Perform work as a particular Subject.
 *
 * This method first retrieves the current Thread's
 * AccessControlContext via AccessController.getContext,
 * and then instantiates a new AccessControlContext
 * using the retrieved context along with a new
 * SubjectDomainCombiner (constructed using the provided Subject).
 * Finally, this method invokes AccessController.doPrivileged,
 * passing it the provided PrivilegedAction,
 * as well as the newly constructed AccessControlContext.
 *
 * @param subject the Subject that the specified
 *        action will run as.  This parameter may be null.
 */
public static <T> T doAs(final Subject subject,
                    final java.security.PrivilegedAction<T> action)
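
A minimal, JDK-only sketch of `Subject.doAs` in use. The `NamedPrincipal` class and the name "alice" are made up for illustration; a Kerberos login would populate the Subject with a Kerberos principal and tickets instead.

```java
import java.security.Principal;
import java.security.PrivilegedAction;
import javax.security.auth.Subject;

class SubjectDoAsDemo {
    // Minimal Principal implementation used only for this example.
    static final class NamedPrincipal implements Principal {
        private final String name;
        NamedPrincipal(String name) { this.name = name; }
        @Override public String getName() { return name; }
    }

    // Runs a small action as `subject` and returns the action's result.
    static String runAs(final Subject subject) {
        return Subject.doAs(subject, new PrivilegedAction<String>() {
            @Override public String run() {
                // Work placed here executes with `subject` associated with the call.
                Principal p = subject.getPrincipals().iterator().next();
                return "ran as " + p.getName();
            }
        });
    }

    public static void main(String[] args) {
        Subject subject = new Subject();
        subject.getPrincipals().add(new NamedPrincipal("alice"));
        System.out.println(runAs(subject)); // ran as alice
    }
}
```

On recent JDKs `Subject.doAs` is marked deprecated in favor of `Subject.callAs`; the example uses `doAs` to match the signature shown above.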

Use Cases


Kerberos Configuration

!scala
System.setProperty("java.security.krb5.realm", "HADOOP.QIYI.COM")
System.setProperty("java.security.krb5.kdc", "hadoop-kdc01")

Use Case 1: kinit by crontab

Run once every day in crontab

!bash
kinit -l 3d -k -t ~/test.keytab test@HADOOP.QIYI.COM

Right:
!scala
val fs = FileSystem.get(new Configuration())
while (true) {
  println(fs.listFiles(new Path("/user"), false))
  Thread.sleep(60 * 1000)
}

Kerberos re-login happens automatically when the token expires


Use Case 2: Login with Keytab File

Right:
!scala
UserGroupInformation.loginUserFromKeytab(principal, keytab)
val fs = FileSystem.get(new Configuration())
while (true) {
  println(fs.listFiles(new Path("/user"), false))
  Thread.sleep(60 * 1000)
}

Kerberos re-login happens automatically when the token expires

Wrong:
!scala
UserGroupInformation.loginUserFromKeytab(principal, keytab)
val fs = FileSystem.get(new Configuration())
while (true) {
  UserGroupInformation.loginUserFromKeytab(principal, keytab)
  println(fs.listFiles(new Path("/user"), false))
  Thread.sleep(60 * 1000)
}

Do NOT call UserGroupInformation.loginUserFromKeytab again


Use Case 3: Multi-User in Single JVM

Wrong:
!scala
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal,
                                                              keytab)
ugi.doAs(new PrivilegedExceptionAction[Void] {
  override def run(): Void = {
    val fs = FileSystem.get(new Configuration())
    while (true) {
      println(fs.listFiles(new Path("/user"), false))
      Thread.sleep(60 * 1000)
    }
    null
  }
})

Kerberos re-login does NOT happen automatically when the token expires


Use Case 3: Multi-User in Single JVM (CONT.)

Right:
!scala
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal,
                                                              keytab)
ugi.doAs(new PrivilegedExceptionAction[Void] {
  override def run(): Void = {
    val fs = FileSystem.get(new Configuration())
    while (true) {
      UserGroupInformation.getCurrentUser.reloginFromKeytab()
      println(fs.listFiles(new Path("/user"), false))
      Thread.sleep(60 * 1000)
    }
    null
  }
})

Call reloginFromKeytab before the token expires
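
An alternative to calling the relogin inside the work loop is to run it on a background timer. The sketch below uses only the JDK; the `relogin` callback is a placeholder that, in a real application, would wrap a call such as `ugi.checkTGTAndReloginFromKeytab()` (an assumption, since Hadoop is not on the classpath here). The class name and thread name are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PeriodicRelogin {
    // Runs `relogin` immediately and then at a fixed period on a daemon
    // thread, until the returned scheduler is shut down.
    static ScheduledExecutorService start(final Runnable relogin, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "kerberos-relogin");
                t.setDaemon(true); // do not keep the JVM alive just for relogin
                return t;
            });
        scheduler.scheduleAtFixedRate(() -> {
            try {
                relogin.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log and continue.
                System.err.println("relogin failed: " + e);
            }
        }, 0, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        final AtomicInteger attempts = new AtomicInteger();
        // Placeholder callback: counts attempts instead of contacting a KDC.
        ScheduledExecutorService scheduler =
            start(attempts::incrementAndGet, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        scheduler.shutdownNow();
        System.out.println("relogin attempts: " + attempts.get());
    }
}
```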


Use Case 4: Using HDFS Delegation Token (Yarn Application)

Right:
!scala
val creds = new org.apache.hadoop.security.Credentials()
val ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal,
                                                                keytab)
ugi.doAs(new PrivilegedExceptionAction[Void] {
  override def run(): Void = {
    val fs = FileSystem.get(new Configuration())
    fs.addDelegationTokens("test", creds)
    null
  }
})
UserGroupInformation.getCurrentUser.addCredentials(creds)

Update credentials periodically before the token expires


References


Related Patches

Spark-1.4.0

Submit Patches