After Kerberos authentication is enabled on two Hadoop clusters, the clusters can no longer access each other directly. To let a client of Hadoop cluster A access the services of Hadoop cluster B, cross-realm trust must be established between the two Kerberos realms (in essence, a ticket issued in Kerberos Realm A is used to access services in Realm B).
Prerequisites:
1) Both clusters (IDC.COM and HADOOP.COM) have Kerberos authentication enabled
2) The Kerberos REALMs are set to IDC.COM and HADOOP.COM respectively
The steps are as follows:

1 Configure cross-realm trust tickets between the KDCs

To establish cross-realm trust between IDC.COM and HADOOP.COM, for example so that a client in IDC.COM can access services in HADOOP.COM, both REALMs must hold a principal named krbtgt/HADOOP.COM@IDC.COM, and the two copies of this key must have the same password, key version number (kvno), and encryption types. Trust is one-way by default: for a HADOOP.COM client to also access services in IDC.COM, both REALMs additionally need the principal krbtgt/IDC.COM@HADOOP.COM.
Add the krbtgt principals to both clusters:

  # IDC CLUSTER
  kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/HADOOP.COM@IDC.COM
  kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/IDC.COM@HADOOP.COM
  # HADOOP CLUSTER
  kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/HADOOP.COM@IDC.COM
  kadmin.local: addprinc -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/IDC.COM@HADOOP.COM
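The keys only match if the same password is entered for each principal on both KDCs. A minimal sketch of a less error-prone variant, assuming you are willing to pass the shared password on the command line with -pw (the password value below is a placeholder, not from the original article):

  # Run the same two commands on both KDCs so the keys derive from identical passwords.
  kadmin.local: addprinc -pw SamePassword123 -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/HADOOP.COM@IDC.COM
  kadmin.local: addprinc -pw SamePassword123 -e "aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal" krbtgt/IDC.COM@HADOOP.COM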

To verify that the two entries have matching kvno and encryption types, inspect them with getprinc <principal_name>:
  kadmin.local: getprinc krbtgt/IDC.COM@HADOOP.COM
  Principal: krbtgt/IDC.COM@HADOOP.COM
  Expiration date: [never]
  Last password change: Wed Jul 05 14:18:11 CST 2017
  Password expiration date: [none]
  Maximum ticket life: 1 day 00:00:00
  Maximum renewable life: 30 days 00:00:00
  Last modified: Wed Jul 05 14:18:11 CST 2017 (admin/admin@IDC.COM)
  Last successful authentication: [never]
  Last failed authentication: [never]
  Failed password attempts: 0
  Number of keys: 7
  Key: vno 1, aes128-cts-hmac-sha1-96
  Key: vno 1, des3-cbc-sha1
  Key: vno 1, arcfour-hmac
  Key: vno 1, camellia256-cts-cmac
  Key: vno 1, camellia128-cts-cmac
  Key: vno 1, des-hmac-sha1
  Key: vno 1, des-cbc-md5
  MKey: vno 1
  Attributes:
  Policy: [none]
  kadmin.local: getprinc krbtgt/HADOOP.COM@IDC.COM
  Principal: krbtgt/HADOOP.COM@IDC.COM
  Expiration date: [never]
  Last password change: Wed Jul 05 14:17:47 CST 2017
  Password expiration date: [none]
  Maximum ticket life: 1 day 00:00:00
  Maximum renewable life: 30 days 00:00:00
  Last modified: Wed Jul 05 14:17:47 CST 2017 (admin/admin@IDC.COM)
  Last successful authentication: [never]
  Last failed authentication: [never]
  Failed password attempts: 0
  Number of keys: 7
  Key: vno 1, aes128-cts-hmac-sha1-96
  Key: vno 1, des3-cbc-sha1
  Key: vno 1, arcfour-hmac
  Key: vno 1, camellia256-cts-cmac
  Key: vno 1, camellia128-cts-cmac
  Key: vno 1, des-hmac-sha1
  Key: vno 1, des-cbc-md5
  MKey: vno 1
  Attributes:
  Policy: [none]

2 Configure the principal-to-user mapping RULES in core-site


Set the hadoop.security.auth_to_local parameter, which maps a Kerberos principal to a local user name. One point to note is that the SASL RPC client requires the remote server's Kerberos principal to match the principal in the client's own configuration, so the same principal name must be assigned to the corresponding service in both the source and the destination cluster. For example, if the Kerberos principal name of the NameNode in the source cluster is nn/h***@IDC.COM, the NameNode principal in the destination cluster must be nn/h***@HADOOP.COM (it cannot be nn2/h***@HADOOP.COM).

Add the following to core-site.xml on both the IDC cluster and the HADOOP cluster:

  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
    RULE:[1:$1@$0](^.*@HADOOP\.COM$)s/^(.*)@HADOOP\.COM$/$1/g
    RULE:[2:$1@$0](^.*@HADOOP\.COM$)s/^(.*)@HADOOP\.COM$/$1/g
    RULE:[1:$1@$0](^.*@IDC\.COM$)s/^(.*)@IDC\.COM$/$1/g
    RULE:[2:$1@$0](^.*@IDC\.COM$)s/^(.*)@IDC\.COM$/$1/g
    DEFAULT
    </value>
  </property>

Use hadoop org.apache.hadoop.security.HadoopKerberosName to verify the mapping, for example:

  [root@node1a141 ~]# hadoop org.apache.hadoop.security.HadoopKerberosName hdfs/nodea1a141@IDC.COM
  Name: hdfs/nodea1a141@IDC.COM to hdfs
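The same tool can be pointed at a remote-realm principal to confirm that the HADOOP.COM rules also resolve on the IDC client. A hedged example, assuming the remote NameNode principal is nn/node1a202@HADOOP.COM (a hypothetical name used only for illustration):

  # With the RULES above, this is expected to print: Name: nn/node1a202@HADOOP.COM to nn
  [root@node1a141 ~]# hadoop org.apache.hadoop.security.HadoopKerberosName nn/node1a202@HADOOP.COM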

3 Configure the trust relationship in krb5.conf

There are two ways to do this. The first is to configure a shared hierarchy of names, which is the default and simpler approach; the second is to define the paths in the [capaths] section of krb5.conf, which is more involved but more flexible. The second approach is used here.
In /etc/krb5.conf on the nodes of both clusters, configure the authentication path between the two realms. For example, in the IDC cluster configure:

  [capaths]
  IDC.COM = {
      HADOOP.COM = .
  }

In the HADOOP cluster configure:

  [capaths]
  HADOOP.COM = {
      IDC.COM = .
  }

Configuring the value as '.' means there are no intermediate realms.
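For contrast, a hedged sketch of what [capaths] would look like if the two realms could only reach each other through an intermediate realm (CORP.COM here is purely hypothetical):

  [capaths]
  IDC.COM = {
      HADOOP.COM = CORP.COM
  }
  HADOOP.COM = {
      IDC.COM = CORP.COM
  }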

So that IDC can reach HADOOP's KDC, the HADOOP KDC server must be added to the IDC cluster's configuration as shown below, and vice versa:

  [realms]
  IDC.COM = {
      kdc = {host}.IDC.COM:88
      admin_server = {host}.IDC.COM:749
      default_domain = IDC.COM
  }
  HADOOP.COM = {
      kdc = {host}.HADOOP.COM:88
      admin_server = {host}.HADOOP.COM:749
      default_domain = HADOOP.COM
  }

In [domain_realm], entries are normally written in the '.IDC.COM' and 'IDC.COM' forms; the '.' prefix ensures that Kerberos maps every host under the IDC.COM domain to the IDC.COM realm. If, however, the cluster's host names are not suffixed with IDC.COM, explicit host-to-realm mappings must be added to [domain_realm]. For example, to map the host IDC.nn.local to IDC.COM, add IDC.nn.local = IDC.COM.

  [domain_realm]
  .hadoop.com = HADOOP.COM
  hadoop.com = HADOOP.COM
  .IDC.com = IDC.COM
  IDC.com = IDC.COM
  node1a141 = IDC.COM
  node1a143 = IDC.COM
  node1a210 = HADOOP.COM
  node1a202 = HADOOP.COM
  node1a203 = HADOOP.COM

Then restart the Kerberos services.
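A minimal sketch, assuming the KDCs run MIT Kerberos on a systemd-based system (e.g. CentOS 7) with the stock service names krb5kdc and kadmin:

  # Run on each KDC host (node1a141 for IDC.COM, node1a198 for HADOOP.COM)
  systemctl restart krb5kdc
  systemctl restart kadmin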

In hdfs-site.xml, allow the trusted realms:
Set dfs.namenode.kerberos.principal.pattern to "*" in hdfs-site.xml.
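A hedged snippet of what this looks like in hdfs-site.xml on the clients of both clusters:

  <property>
    <name>dfs.namenode.kerberos.principal.pattern</name>
    <value>*</value>
  </property>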


This is a client-side matching rule that controls which realms are allowed for authentication. If the parameter is not set, an exception like the following occurs:

  java.io.IOException: Failed on local exception: java.io.IOException:
  java.lang.IllegalArgumentException:
  Server has invalid Kerberos principal: nn/ HADOOP.COM@ IDC.COM;
  Host Details : local host is: "host1.IDC.COM/10.181.22.130";
  destination host is: "host2.HADOOP.COM":8020;

4 Testing

1) Use hdfs commands to test data access between the IDC and HADOOP clusters.
For example, in the IDC cluster run kinit admin@IDC.COM, then use hdfs commands to list the HDFS directories of both the local cluster and the remote cluster.
If cross-realm trust has not been enabled, accessing the remote cluster's HDFS directory fails with an authentication error.

  [root@node1a141 ~]# kdestroy
  # Log in as the admin user on the local cluster client and authenticate via Kerberos
  [root@node1a141 ~]# kinit admin
  Password for admin@IDC.COM:
  # Access the local cluster's HDFS
  [root@node1a141 ~]# hdfs dfs -ls /
  Found 3 items
  drwxrwxrwx+ - hdfs supergroup 0 2017-06-13 15:13 /tmp
  drwxrwxr-x+ - hdfs supergroup 0 2017-06-22 15:55 /user
  drwxrwxr-x+ - hdfs supergroup 0 2017-06-14 14:11 /wa
  # Access the remote cluster's HDFS
  [root@node1a141 ~]# hdfs dfs -ls hdfs://node1a202:8020/
  Found 9 items
  drwxr-xr-x - root supergroup 0 2017-05-27 18:55 hdfs://node1a202:8020/cdtest
  drwx------ - hbase hbase 0 2017-05-22 18:51 hdfs://node1a202:8020/hbase
  drwx------ - hbase hbase 0 2017-07-05 19:16 hdfs://node1a202:8020/hbase1
  drwxr-xr-x - hbase hbase 0 2017-05-11 10:46 hdfs://node1a202:8020/hbase2
  drwxr-xr-x - root supergroup 0 2016-12-01 17:30 hdfs://node1a202:8020/home
  drwxr-xr-x - mdss supergroup 0 2016-12-13 18:30 hdfs://node1a202:8020/idfs
  drwxr-xr-x - hdfs supergroup 0 2017-05-22 18:51 hdfs://node1a202:8020/system
  drwxrwxrwt - hdfs supergroup 0 2017-05-31 17:37 hdfs://node1a202:8020/tmp
  drwxrwxr-x+ - hdfs supergroup 0 2017-05-04 15:48 hdfs://node1a202:8020/user


Perform the same checks from the HADOOP.COM side.
2) Run distcp to copy data from the IDC cluster to the HADOOP cluster:

  [root@node1a141 ~]# hadoop distcp hdfs://node1a141:8020/tmp/test.sh hdfs://node1a202:8020/tmp/
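As an additional sanity check at the Kerberos level, independent of Hadoop, the cross-realm TGT can be requested directly. A hedged sketch, assuming the standard MIT client tools kinit and kvno are installed on the IDC client:

  # If the trust is in place, kvno obtains the cross-realm ticket and prints its key version number.
  [root@node1a141 ~]# kinit admin
  [root@node1a141 ~]# kvno krbtgt/HADOOP.COM@IDC.COM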

5 Appendix

The complete /etc/krb5.conf used by the clusters (shown here from the IDC cluster) is as follows:

  [root@node1a141 IDC]# cat /etc/krb5.conf
  [logging]
   default = FILE:/var/log/krb5libs.log
   kdc = FILE:/var/log/krb5kdc.log
   admin_server = FILE:/var/log/kadmind.log

  [libdefaults]
   default_realm = IDC.COM
   dns_lookup_realm = false
   dns_lookup_kdc = false
   ticket_lifetime = 7d
   renew_lifetime = 30
   forwardable = true
   renewable = true
   #default_ccache_name = KEYRING:persistent:%{uid}

  [realms]
   HADOOP.COM = {
    kdc = node1a198
    admin_server = node1a198
    default_realm = HADOOP.COM
    supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
   }
   IDC.COM = {
    kdc = node1a141
    admin_server = node1a141
    default_realm = IDC.COM
    supported_enctypes = aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
   }

  [domain_realm]
   .hadoop.com = HADOOP.COM
   hadoop.com = HADOOP.COM
   .IDC.com = IDC.COM
   IDC.com = IDC.COM
   node1a141 = IDC.COM
   node1a143 = IDC.COM
   node1a210 = HADOOP.COM
   node1a202 = HADOOP.COM
   node1a203 = HADOOP.COM

  [capaths]
   IDC.COM = {
    HADOOP.COM = .
   }

Copyright notice: this is an original article by 彬在俊, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://www.cnblogs.com/erlou96/p/16878481.html