1) Set up Apache Hadoop
To start, we assume that an Apache Hadoop cluster is already running, with a file stored at "/data/LICENSE.txt" that we want to access. For instructions on setting up Apache Hadoop in this way, please refer to part 1 of this earlier post. Verify that you can download the LICENSE.txt file in a browser directly from Apache Hadoop via:
- http://localhost:9870/webhdfs/v1/data/LICENSE.txt?op=OPEN
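The direct WebHDFS access above can be sketched on the command line. The port (9870, the Apache Hadoop 3.0.0 default) and the file path are the assumptions from this post:

```shell
# Build the WebHDFS URL for a given HDFS path.
webhdfs_url() {
  echo "http://localhost:9870/webhdfs/v1$1?op=OPEN"
}

# The NameNode answers an OPEN request with a 307 redirect to a DataNode;
# curl -L follows the redirect and streams the file contents.
# Example (requires the running cluster):
#   curl -sL "$(webhdfs_url /data/LICENSE.txt)"
```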
2) Set up Apache Knox
Next we will see how to access this file via Apache Knox. Download and extract the Apache Knox Gateway Server binary archive (version 1.1.0 was used for this tutorial). First, create a master secret, which Knox uses to protect its credential stores:
- bin/knoxcli.sh create-master
Apache Knox ships with a demo LDAP server containing test user credentials, which we will authenticate against. Start it via:
- bin/ldap.sh start
Apache Knox stores its "topologies" configuration in the directory "conf/topologies". We will re-use the default "sandbox.xml" configuration for the purposes of this post, which maps to the URI "gateway/sandbox". It defines the authentication mechanism for the topology (HTTP basic authentication) and maps the received credentials to the LDAP backend we started above. It then lists the backend services supported by this topology. We are interested in the "WEBHDFS" service, which maps to "http://localhost:50070/webhdfs". Change this port to "9870" if you are using Apache Hadoop 3.0.0 as in the first section of this post. Then start the gateway via:
- bin/gateway.sh start
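For reference, after the port change the WEBHDFS entry in the "services" section of "conf/topologies/sandbox.xml" should look roughly like the following fragment (element names as shipped with Knox 1.1.0):

```xml
<service>
    <role>WEBHDFS</role>
    <url>http://localhost:9870/webhdfs</url>
</service>
```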
We can then access the file through the gateway in a browser (authenticating with the demo credentials "guest"/"guest-password") via:
- https://localhost:8443/gateway/sandbox/webhdfs/v1/data/LICENSE.txt?op=OPEN
Alternatively, download the file on the command line via:
- curl -u guest:guest-password -kL https://localhost:8443/gateway/sandbox/webhdfs/v1/data/LICENSE.txt?op=OPEN
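The curl invocation above can be wrapped in a small sketch, again assuming the demo defaults used in this post (the gateway at https://localhost:8443/gateway/sandbox and the guest/guest-password demo LDAP credentials):

```shell
# Build the Knox gateway URL for a given HDFS path.
knox_url() {
  echo "https://localhost:8443/gateway/sandbox/webhdfs/v1$1?op=OPEN"
}

# Fetch a file through the gateway:
#   -k skips TLS verification (Knox ships with a self-signed certificate),
#   -L follows the WebHDFS redirect,
#   -u sends HTTP basic credentials, checked against the demo LDAP backend.
knox_fetch() {
  curl -u guest:guest-password -ksL "$(knox_url "$1")"
}

# Example (requires the running gateway):
#   knox_fetch /data/LICENSE.txt
```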