+---
+layout: global
+title: HDFS charts
+---
+
+# HDFS charts
+
+Helm charts for launching HDFS daemons in a K8s cluster. The main entry-point
+chart is `hdfs-k8s`, which is a uber-chart that specifies other charts as
+dependency subcharts. This means you can launch all HDFS components using
+`hdfs-k8s`.
+
+Note that the HDFS charts are currently in pre-alpha quality. They are also
+being heavily revised and are subject to change.
+
+HDFS on K8s supports the following features:
+ - namenode high availability (HA): HDFS namenode daemons are in charge of
+ maintaining file system metadata concerning which directories have which
+ files and where are the file data. Namenode crash will cause service outage.
+ HDFS can run two namenodes in active/standby setup. HDFS on K8s supports HA.
+ - K8s persistent volumes (PV) for metadata: Namenode crash will cause service
+ outage. Losing namenode metadata can lead to loss of file system. HDFS on
+ K8s can store the metadata in remote K8s persistent volumes so that metdata
+ can remain intact even if both namenode daemons are lost or restarted.
+ - K8s HostPath volumes for file data: HDFS datanodes daemons store actual
+ file data. File data should also survive datanode crash or restart. HDFS on
+ K8s stores the file data on the local disks of the K8s cluster nodes using
+ K8s HostPath volumes. (We plan to switch to a better mechanism, K8s
+ persistent local volumes)
+ - Kerberos: Vanilla HDFS is not secure. Intruders can easily write custom
+ client code, put a fake user name in requests and steal data. Production
+ HDFS often secure itself using Kerberos. HDFS on K8s supports Kerberos.
+
+Here is the list of all charts.
+
+ - hdfs-k8s: main uber-chart. Launches other charts.
+ - hdfs-namenode-k8s: a statefulset and other K8s components for launching HDFS
+ namenode daemons, which maintains file system metadata. The chart supports
+ namenode high availability (HA).
+ - hdfs-datanode-k8s: a daemonset and other K8s components for launching HDFS
+ datanode daemons, which are responsible for storing file data.
+ - hdfs-config-k8s: a configmap containing Hadoop config files for HDFS.
+ - zookeeper: This chart is NOT in this repo. But hdfs-k8s pulls the zookeeper
+ chart in the incubator remote repo
+ (https://kubernetes-charts-incubator.storage.googleapis.com/)
+ as a dependency and launhces zookeeper daemons. Zookeeper makes sure
+ only one namenode is active in the HA setup, while the other namenode
+ becomes standby. By default, we will launch three zookeeper servers.
+ - hdfs-journalnode-k8s: a statefulset and other K8s components for launching
+ HDFS journalnode quorums, which ensures the file system metadata are
+ properly shared among the two namenode daemons in the HA setup.
+ By default, we will launch three journalnode servers.
+ - hdfs-client-k8s: a pod that is configured to run Hadoop client commands
+ for accessing HDFS.
+ - hdfs-krb5-k8s: a size-1 statefulset and other K8s components for launching
+ a Kerberos server, which can be used to secure HDFS. Disabled by default.
+ - hdfs-simple-namenode-k8s: Disabled by default. A simpler setup of the
+ namenode that launches only one namenode. i.e. This does not support HA. It
+ does not support Kerberos nor persistent volumes either. As it does not
+ support HA, we also don't need zookeeper nor journal nodes. You may prefer
+ this if you want the simplest possible setup.
+
+# Prerequisite
+
+Requires Kubernetes 1.6+ as the `namenode` and `datanodes` are using
+`ClusterFirstWithHostNet`, which was introduced in Kubernetes 1.6
+
+# Usage
+
+## Basic
+
+The HDFS daemons can be launched using the main `hdfs-k8s` chart. First, build
+the main chart using:
+
+```
+ $ helm repo add incubator \
+ https://kubernetes-charts-incubator.storage.googleapis.com/
+ $ helm dependency build charts/hdfs-k8s
+```
+
+Zookeeper, journalnodes and namenodes need persistent volumes for storing
+metadata. By default, the helm charts do not set the storage class name for
+dynamically provisioned volumes, nor does it use persistent volume selectors for
+static persistent volumes.
+
+This means it will rely on a provisioner for default storage volume class for
+dynamic volumes. Or if your cluster has statically provisioned volumes, the
+chart will match existing volumes entirely based on the size requirements. To
+override this default behavior, you can specify storage volume classes for
+dynamic volumes, or volume selectors for static volumes. See below for how to
+set these options.
+
+ - namenodes: Each of the two namenodes needs at least a 100 GB volume. i.e.
+ Yon need two 100 GB volumes. This can be overridden by the
+ `hdfs-namenode-k8s.persistence.size` option.
+ You can also override the storage class or the selector using
+ `hdfs-namenode-k8s.persistence.storageClass`, or
+ `hdfs-namenode-k8s.persistence.selector` respectively. For details, see the
+ values.yaml file inside `hdfs-namenode-k8s` chart dir.
+ - zookeeper: You need three > 5 GB volumes. i.e. Each of the two zookeeper
+ servers will need at least 5 GB in the volume. Can be overridden by
+ the `zookeeper.persistence.size` option. You can also override
+ the storage class using `zookeeper.persistence.storageClass`.
+ - journalnodes: Each of the three journalnodes will need at least 20 GB in
+ the volume. The size can be overridden by the
+ `hdfs-journalnode-k8s.persistence.size` option.
+ You can also override the storage class or the selector using
+ `hdfs-journalnode-k8s.persistence.storageClass`, or
+ `hdfs-journalnode-k8s.persistence.selector` respectively. For details, see the
+ values.yaml file inside `hdfs-journalnode-k8s` chart dir.
+ - kerberos: The single Kerberos server will need at least 20 GB in the volume.
+ The size can be overridden by the `hdfs-krb5-k8s.persistence.size` option.
+ You can also override the storage class or the selector using
+ `hdfs-krb5-k8s.persistence.storageClass`, or
+ `hdfs-krb5-k8s.persistence.selector` respectively. For details, see the
+ values.yaml file inside `hdfs-krb5-k8s` chart dir.
+
+Then launch the main chart. Specify the chart release name say "my-hdfs",
+which will be the prefix of the K8s resource names for the HDFS components.
+
+```
+ $ helm install -n my-hdfs charts/hdfs-k8s
+```
+
+Wait for all daemons to be ready. Note some daemons may restart themselves
+a few times before they become ready.
+
+```
+ $ kubectl get pod -l release=my-hdfs
+
+ NAME READY STATUS RESTARTS AGE
+ my-hdfs-client-c749d9f8f-d5pvk 1/1 Running 0 2m
+ my-hdfs-datanode-o7jia 1/1 Running 3 2m
+ my-hdfs-datanode-p5kch 1/1 Running 3 2m
+ my-hdfs-datanode-r3kjo 1/1 Running 3 2m
+ my-hdfs-journalnode-0 1/1 Running 0 2m
+ my-hdfs-journalnode-1 1/1 Running 0 2m
+ my-hdfs-journalnode-2 1/1 Running 0 1m
+ my-hdfs-namenode-0 1/1 Running 3 2m
+ my-hdfs-namenode-1 1/1 Running 3 2m
+ my-hdfs-zookeeper-0 1/1 Running 0 2m
+ my-hdfs-zookeeper-1 1/1 Running 0 2m
+ my-hdfs-zookeeper-2 1/1 Running 0 2m
+```
+
+Namenodes and datanodes are currently using the K8s `hostNetwork` so they can
+see physical IPs of each other. If they are not using `hostNetowrk`,
+overlay K8s network providers such as weave-net may mask the physical IPs,
+which will confuse the data locality later inside namenodes.
+
+Finally, test with the client pod:
+
+```
+ $ _CLIENT=$(kubectl get pods -l app=hdfs-client,release=my-hdfs -o name | \
+ cut -d/ -f 2)
+ $ kubectl exec $_CLIENT -- hdfs dfsadmin -report
+ $ kubectl exec $_CLIENT -- hdfs haadmin -getServiceState nn0
+ $ kubectl exec $_CLIENT -- hdfs haadmin -getServiceState nn1
+
+ $ kubectl exec $_CLIENT -- hadoop fs -rm -r -f /tmp
+ $ kubectl exec $_CLIENT -- hadoop fs -mkdir /tmp
+ $ kubectl exec $_CLIENT -- sh -c \
+ "(head -c 100M < /dev/urandom > /tmp/random-100M)"
+ $ kubectl exec $_CLIENT -- hadoop fs -copyFromLocal /tmp/random-100M /tmp
+```
+
+## Kerberos
+
+Kerberos can be enabled by setting a few related options:
+
+```
+ $ helm install -n my-hdfs charts/hdfs-k8s \
+ --set global.kerberosEnabled=true \
+ --set global.kerberosRealm=MYCOMPANY.COM \
+ --set tags.kerberos=true
+```
+
+This will launch all charts including the Kerberos server, which will become
+ready pretty soon. However, HDFS daemon charts will be blocked as the deamons
+require Kerberos service principals to be available. So we need to unblock
+them by creating those principals.
+
+First, create a configmap containing the common Kerberos config file:
+
+```
+ _MY_DIR=~/krb5
+ mkdir -p $_MY_DIR
+ _KDC=$(kubectl get pod -l app=hdfs-krb5,release=my-hdfs --no-headers \
+ -o name | cut -d/ -f2)
+ _run kubectl cp $_KDC:/etc/krb5.conf $_MY_DIR/tmp/krb5.conf
+ _run kubectl create configmap my-hdfs-krb5-config \
+ --from-file=$_MY_DIR/tmp/krb5.conf
+```
+
+Second, create the service principals and passwords. Kerberos requires service
+principals to be host specific. Some HDFS daemons are associated with your K8s
+cluster nodes' physical host names say kube-n1.mycompany.com, while others are
+associated with Kubernetes virtual service names, for instance
+my-hdfs-namenode-0.my-hdfs-namenode.default.svc.cluster.local. You can get
+the list of these host names like:
+
+```
+ $ _HOSTS=$(kubectl get nodes \
+ -o=jsonpath='{.items[*].status.addresses[?(@.type == "Hostname")].address}')
+
+ $ _HOSTS+=$(kubectl describe configmap my-hdfs-config | \
+ grep -A 1 -e dfs.namenode.rpc-address.hdfs-k8s \
+ -e dfs.namenode.shared.edits.dir |
+ grep "<value>" |
+ sed -e "s/<value>//" \
+ -e "s/<\/value>//" \
+ -e "s/:8020//" \
+ -e "s/qjournal:\/\///" \
+ -e "s/:8485;/ /g" \
+ -e "s/:8485\/hdfs-k8s//")
+```
+
+Then generate per-host principal accounts and password keytab files.
+
+```
+ $ _SECRET_CMD="kubectl create secret generic my-hdfs-krb5-keytabs"
+ $ for _HOST in $_HOSTS; do
+ kubectl exec $_KDC -- kadmin.local -q \
+ "addprinc -randkey hdfs/$_HOST@MYCOMPANY.COM"
+ kubectl exec $_KDC -- kadmin.local -q \
+ "addprinc -randkey HTTP/$_HOST@MYCOMPANY.COM"
+ kubectl exec $_KDC -- kadmin.local -q \
+ "ktadd -norandkey -k /tmp/$_HOST.keytab hdfs/$_HOST@MYCOMPANY.COM HTTP/$_HOST@MYCOMPANY.COM"
+ kubectl cp $_KDC:/tmp/$_HOST.keytab $_MY_DIR/tmp/$_HOST.keytab
+ _SECRET_CMD+=" --from-file=$_MY_DIR/tmp/$_HOST.keytab"
+ done
+```
+
+The above was building a command using a shell variable `SECRET_CMD` for
+creating a K8s secret that contains all keytab files. Run the command to create
+the secret.
+
+```
+ $ $_SECRET_CMD
+```
+
+This will unblock all HDFS daemon pods. Wait until they become ready.
+
+Finally, test the setup using the following commands:
+
+```
+ $ _NN0=$(kubectl get pods -l app=hdfs-namenode,release=my-hdfs -o name | \
+ head -1 | \
+ cut -d/ -f2)
+ $ kubectl exec $_NN0 -- sh -c "(apt install -y krb5-user > /dev/null)" \
+ || true
+ $ kubectl exec $_NN0 -- \
+ kinit -kt /etc/security/hdfs.keytab \
+ hdfs/my-hdfs-namenode-0.my-hdfs-namenode.default.svc.cluster.local@MYCOMPANY.COM
+ $ kubectl exec $_NN0 -- hdfs dfsadmin -report
+ $ kubectl exec $_NN0 -- hdfs haadmin -getServiceState nn0
+ $ kubectl exec $_NN0 -- hdfs haadmin -getServiceState nn1
+ $ kubectl exec $_NN0 -- hadoop fs -rm -r -f /tmp
+ $ kubectl exec $_NN0 -- hadoop fs -mkdir /tmp
+ $ kubectl exec $_NN0 -- hadoop fs -chmod 0777 /tmp
+ $ kubectl exec $_KDC -- kadmin.local -q \
+ "addprinc -randkey user1@MYCOMPANY.COM"
+ $ kubectl exec $_KDC -- kadmin.local -q \
+ "ktadd -norandkey -k /tmp/user1.keytab user1@MYCOMPANY.COM"
+ $ kubectl cp $_KDC:/tmp/user1.keytab $_MY_DIR/tmp/user1.keytab
+ $ kubectl cp $_MY_DIR/tmp/user1.keytab $_CLIENT:/tmp/user1.keytab
+
+ $ kubectl exec $_CLIENT -- sh -c "(apt install -y krb5-user > /dev/null)" \
+ || true
+
+ $ kubectl exec $_CLIENT -- kinit -kt /tmp/user1.keytab user1@MYCOMPANY.COM
+ $ kubectl exec $_CLIENT -- sh -c \
+ "(head -c 100M < /dev/urandom > /tmp/random-100M)"
+ $ kubectl exec $_CLIENT -- hadoop fs -ls /
+ $ kubectl exec $_CLIENT -- hadoop fs -copyFromLocal /tmp/random-100M /tmp
+```
+
+## Advanced options
+
+### Setting HostPath volume locations for datanodes
+
+HDFS on K8s stores the file data on the local disks of the K8s cluster nodes
+using K8s HostPath volumes. You may want to change the default locations. Set
+global.dataNodeHostPath to override the default value. Note the option
+takes a list in case you want to use multiple disks.
+
+```
+ $ helm install -n my-hdfs charts/hdfs-k8s \
+ --set "global.dataNodeHostPath={/mnt/sda1/hdfs-data0,/mnt/sda1/hdfs-data1}"
+```
+
+### Using an existing zookeeper quorum
+
+By default, HDFS on K8s pulls in the zookeeper chart in the incubator remote
+repo (https://kubernetes-charts-incubator.storage.googleapis.com/) as a
+dependency and launhces zookeeper daemons. But your K8s cluster may already
+have a zookeeper quorum.
+
+It is possible to use the existing zookeeper. We just need set a few options
+in the helm install command line. It should be something like:
+
+```
+ $helm install -n my-hdfs charts/hdfs-k8s \
+ --set condition.subchart.zookeeper=false \
+ --set global.zookeeperQuorumOverride=zk-0.zk-svc.default.svc.cluster.local:2181,zk-1.zk-svc.default.svc.cluster.local:2181,zk-2.zk-svc.default.svc.cluster.local:2181
+```
+
+Setting `condition.subchart.zookeeper` to false prevents the uber-chart from
+bringing in zookeeper as sub-chart. And the `global.zookeeperQuorumOverride`
+option specifies the custom address for a zookeeper quorum. Use your
+zookeeper address here.
+
+### Pinning namenodes to specific K8s cluster nodes
+
+Optionally, you can attach labels to some of your k8s cluster nodes so that
+namenodes will always run on those cluster nodes. This can allow your HDFS
+client outside the Kubernetes cluster to expect stable IP addresses. When used
+by those outside clients, Kerberos expects the namenode addresses to be stable.
+
+```
+ $ kubectl label nodes YOUR-HOST-1 hdfs-namenode-selector=hdfs-namenode
+ $ kubectl label nodes YOUR-HOST-2 hdfs-namenode-selector=hdfs-namenode
+```
+
+You should add the nodeSelector option to the helm chart command:
+
+```
+ $ helm install -n my-hdfs charts/hdfs-k8s \
+ --set hdfs-namenode-k8s.nodeSelector.hdfs-namenode-selector=hdfs-namenode \
+ ...
+```
+
+### Excluding datanodes from some K8s cluster nodes
+
+You may want to exclude some K8s cluster nodes from datanodes launch target.
+For instance, some K8s clusters may let the K8s cluster master node launch
+a datanode. To prevent this, label the cluster nodes with
+`hdfs-datanode-exclude`.
+
+```
+ $ kubectl label node YOUR-CLUSTER-NODE hdfs-datanode-exclude=yes
+```
+
+### Launching with a non-HA namenode
+
+You may want non-HA namenode since it is the simplest possible setup.
+Note this won't launch zookeepers nor journalnodes.
+
+The single namenode is supposed to be pinned to a cluster host using a node
+label. Attach a label to one of your K8s cluster node.
+
+```
+ $ kubectl label nodes YOUR-CLUSTER-NODE hdfs-namenode-selector=hdfs-namenode-0
+```
+
+The non-HA setup does not even use persistent vlumes. So you don't even
+need to prepare persistent volumes. Instead, it is using hostPath volume
+of the pinned cluster node. So, just launch the chart while
+setting options to turn off HA. You should add the nodeSelector option
+so that the single namenode would find the hostPath volume of the same cluster
+node when the pod restarts.
+
+```
+ $ helm install -n my-hdfs charts/hdfs-k8s \
+ --set tags.ha=false \
+ --set tags.simple=true \
+ --set global.namenodeHAEnabled=false \
+ --set hdfs-simple-namenode-k8s.nodeSelector.hdfs-namenode-selector=hdfs-namenode-0
+```
+
+# Security
+
+## K8s secret containing Kerberos keytab files
+
+The Kerberos setup creates a K8s secret containing all the keytab files of HDFS
+daemon service princialps. This will be mounted onto HDFS daemon pods. You may
+want to restrict access to this secret using k8s
+[RBAC](https://kubernetes.io/docs/admin/authorization/rbac/), to minimize
+exposure of the keytab files.
+
+## HostPath volumes
+`Datanode` daemons run on every cluster node. They also mount k8s `hostPath`
+local disk volumes. You may want to restrict access of `hostPath`
+using `pod security policy`.
+See [reference](https://github.com/kubernetes/examples/blob/master/staging/podsecuritypolicy/rbac/README.md))
+
+## Credits
+
+Many charts are using public Hadoop docker images hosted by
+[uhopper](https://hub.docker.com/u/uhopper/).