1) Configuring Hadoop to use SASL for the data transfer protocol
Follow section (2) of the previous post to configure Hadoop to authenticate users via Kerberos. We need to make the following changes to 'etc/hadoop/hdfs-site.xml':
- dfs.datanode.address: Change the port number here to be a non-privileged port.
- dfs.datanode.http.address: Change the port number here to be a non-privileged port.
- dfs.data.transfer.protection: integrity.
- dfs.http.policy: HTTPS_ONLY.
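The four properties above might look as follows in 'etc/hadoop/hdfs-site.xml'. Note that the port numbers 10004 and 10006 are just illustrative, any non-privileged ports (above 1023) will do:

```xml
<!-- hdfs-site.xml fragment: SASL data transfer protection.
     Ports 10004/10006 are example non-privileged ports. -->
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:10004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:10006</value>
</property>
<property>
  <name>dfs.data.transfer.protection</name>
  <value>integrity</value>
</property>
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
</property>
```

With 'dfs.data.transfer.protection' set, the DataNode no longer needs to bind to privileged ports, which is why the two address properties can be changed.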
The next step is to configure some SSL keys in 'etc/hadoop/ssl-server.xml'. For the purposes of this demo, we'll use some sample keys that are used to run the system tests in Apache CXF. Download cxf-ca.jks and bob.jks into 'etc/hadoop'. Now edit 'etc/hadoop/ssl-server.xml' and define the following properties:
- ssl.server.truststore.location: etc/hadoop/cxf-ca.jks
- ssl.server.truststore.password: password
- ssl.server.keystore.location: etc/hadoop/bob.jks
- ssl.server.keystore.password: password
- ssl.server.keystore.keypassword: password
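Expressed as a configuration fragment, the properties above would appear in 'etc/hadoop/ssl-server.xml' roughly as follows (the paths assume the keystores were downloaded into 'etc/hadoop' as described above):

```xml
<!-- ssl-server.xml fragment: trust store and key store for HTTPS -->
<property>
  <name>ssl.server.truststore.location</name>
  <value>etc/hadoop/cxf-ca.jks</value>
</property>
<property>
  <name>ssl.server.truststore.password</name>
  <value>password</value>
</property>
<property>
  <name>ssl.server.keystore.location</name>
  <value>etc/hadoop/bob.jks</value>
</property>
<property>
  <name>ssl.server.keystore.password</name>
  <value>password</value>
</property>
<property>
  <name>ssl.server.keystore.keypassword</name>
  <value>password</value>
</property>
```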
Now that we have hopefully configured everything correctly, it's time to launch the Kerby-based KDC and HDFS. Start Kerby by running the JUnit test as described in the first section of the previous article. Then point the Kerberos client at the Kerby configuration, obtain a ticket for 'alice' and read a file from HDFS via:
- export KRB5_CONFIG=/pathtokerby/target/krb5.conf
- kinit -k -t /pathtokerby/target/alice.keytab alice
- bin/hadoop fs -cat /data/LICENSE.txt