26.5. Scaling Up and Scaling Down

The supported volume configuration for Hadoop is a Distributed Replicated volume with a replica count of 2 or 3. Hence, you must add or remove servers from the trusted storage pool in multiples of the replica count. Red Hat recommends that you do not place more than one brick belonging to the same volume on the same server. Adding servers to a Red Hat Gluster Storage volume increases both the storage and the compute capacity of the trusted storage pool: the bricks on those servers add to the storage capacity of the volume, and their CPUs increase the number of Hadoop tasks that the Hadoop cluster on the volume can run.
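
The multiple-of-replica-count rule can be checked before any servers are added or removed. The following is a minimal sketch and is not part of the Red Hat tooling; the function name check_server_count is hypothetical, and it only performs the arithmetic described in this section.

```shell
#!/bin/bash
# Hypothetical helper (not part of rhs-hadoop-install): succeeds only when
# the number of servers to add or remove is a positive multiple of the
# volume's replica count.
check_server_count() {
    local servers=$1 replica=$2
    # Hadoop is supported only on replica 2 or replica 3 volumes.
    if [ "$replica" -ne 2 ] && [ "$replica" -ne 3 ]; then
        echo "unsupported replica count: $replica" >&2
        return 2
    fi
    if [ "$servers" -le 0 ] || [ $((servers % replica)) -ne 0 ]; then
        echo "server count $servers is not a multiple of $replica" >&2
        return 1
    fi
    return 0
}
```

For example, check_server_count 2 2 succeeds, while check_server_count 3 2 returns a non-zero status and prints a message to standard error.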

26.5.1. Scaling Up

The following is the procedure to add 2 new servers to an existing Hadoop on Red Hat Gluster Storage trusted storage pool.
  1. Ensure that the new servers meet all the prerequisites and have the appropriate channels and components installed. For information on prerequisites, see the Prerequisites section in the chapter Deploying the Hortonworks Data Platform on Red Hat Gluster Storage of the Red Hat Gluster Storage 3.1 Installation Guide. For information on adding servers to the trusted storage pool, see Chapter 5, Trusted Storage Pools.
  2. In the Ambari Console, click Stop All in the Services navigation panel. You must wait until all the services are completely stopped.
  3. Open the terminal window of the server designated to be the Ambari Management Server and navigate to the /usr/share/rhs-hadoop-install/ directory.
  4. Run the following command, replacing the example values with your own. The command assumes that the LVM partition on the server is /dev/vg1/lv1 and that you want it mounted as /mnt/brick1:
    # ./setup_cluster.sh --yarn-master <the-existing-yarn-master-node>  [--hadoop-mgmt-node <the-existing-mgmt-node>] new-node1.hdp:/mnt/brick1:/dev/vg1/lv1 new-node2.hdp
  5. Open the terminal of any Red Hat Gluster Storage server in the trusted storage pool and run the following command. This command assumes that you want to add the servers to a volume called HadoopVol:
    # gluster volume add-brick HadoopVol replica 2 new-node1:/mnt/brick1/HadoopVol new-node2:/mnt/brick1/HadoopVol
    For more information on expanding volumes, see Section 10.3, “Expanding Volumes”.
  6. Open the terminal of any Red Hat Gluster Storage Server in the cluster and rebalance the volume using the following command:
    # gluster volume rebalance HadoopVol start
    Rebalancing the volume distributes the data on the volume among the servers. To view the status of the rebalancing operation, run the gluster volume rebalance HadoopVol status command. The status is shown as completed when the rebalance has finished. For more information on rebalancing a volume, see Section 10.7, “Rebalancing Volumes”.
  7. On each of the new storage nodes, open a terminal, navigate to the /usr/share/rhs-hadoop-install/ directory, and run the following command:
    # ./setup_container_executor.sh
  8. Access the Ambari Management Interface in a browser (http://ambari-server-hostname:8080) and add the new nodes by selecting the HOSTS tab and then selecting add new host. Select the services you want to install on the new hosts and deploy them.
  9. Follow the instructions in the Configuring the Linux Container Executor section of the Red Hat Gluster Storage 3.1 Installation Guide.
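
When several servers are added at once, the brick list in step 5 is easy to mistype. The following sketch assembles the add-brick command from a list of hosts and prints it for review instead of running it. The function name print_add_brick_cmd is hypothetical, and the sketch assumes one brick per host in a directory named after the volume, as in the example above.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the add-brick command for a set of
# new hosts instead of running it, so the brick list can be reviewed first.
# Arguments: volume name, replica count, brick mount point, then the hosts.
print_add_brick_cmd() {
    local volume=$1 replica=$2 mount=$3
    shift 3
    # Hosts must be added in multiples of the replica count.
    if [ "$#" -eq 0 ] || [ $(( $# % replica )) -ne 0 ]; then
        echo "host count $# is not a multiple of replica count $replica" >&2
        return 1
    fi
    local bricks="" host
    for host in "$@"; do
        # One brick per host, in a directory named after the volume.
        bricks="$bricks $host:$mount/$volume"
    done
    echo "gluster volume add-brick $volume replica $replica$bricks"
}
```

Called as print_add_brick_cmd HadoopVol 2 /mnt/brick1 new-node1 new-node2, it prints the same command shown in step 5. Printing rather than executing keeps the final decision with the administrator.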

26.5.2. Scaling Down

If you remove servers from a Red Hat Gluster Storage trusted storage pool, it is recommended that you rebalance the data in the trusted storage pool. The following procedure removes 2 servers from an existing Hadoop on Red Hat Gluster Storage cluster:
  1. In the Ambari Console, click Stop All in the Services navigation panel. You must wait until all the services are completely stopped.
  2. Open the terminal of any Red Hat Gluster Storage server in the trusted storage pool and run the following command. This procedure assumes that you want to remove 2 servers, old-node1 and old-node2, from a volume called HadoopVol:
    # gluster volume remove-brick HadoopVol [replica count] old-node1:/mnt/brick2/HadoopVol old-node2:/mnt/brick2/HadoopVol start
    To view the status of the remove-brick operation, run the gluster volume remove-brick HadoopVol old-node1:/mnt/brick2/HadoopVol old-node2:/mnt/brick2/HadoopVol status command.
  3. When the status command shows the data migration as completed, run the following command to commit the brick removal:
    # gluster volume remove-brick HadoopVol old-node1:/mnt/brick2/HadoopVol old-node2:/mnt/brick2/HadoopVol commit
    After the bricks are removed, you can check the volume information using the gluster volume info HadoopVol command. For detailed information on removing bricks, see Section 10.4, “Shrinking Volumes”.
  4. Open the terminal of any Red Hat Gluster Storage server in the trusted storage pool and run the following commands to detach the removed servers:
    # gluster peer detach old-node1
    # gluster peer detach old-node2
  5. Open the terminal of any Red Hat Gluster Storage Server in the cluster and rebalance the volume using the following command:
    # gluster volume rebalance HadoopVol start
    Rebalancing the volume distributes the data on the volume among the servers. To view the status of the rebalancing operation, run the gluster volume rebalance HadoopVol status command. The status is shown as completed when the rebalance has finished. For more information on rebalancing a volume, see Section 10.7, “Rebalancing Volumes”.
  6. Remove the nodes from Ambari by accessing the Ambari Management Interface in a browser (http://ambari-server-hostname:8080) and selecting the HOSTS tab. Click the host (node) that you want to delete and select Host Actions on the right-hand side. Select Delete Host from the drop-down menu.
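
The start, status, and commit commands in steps 2 and 3 must all name exactly the same bricks. The following sketch prints the gluster commands of this procedure in order from a single host list, without running them. The function name print_scale_down_plan is hypothetical; the sketch omits the optional replica count argument and assumes one brick per host in a directory named after the volume. The commit line must still be run only after the status command reports the migration as completed.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the scale-down commands in the order
# used above, without running them. Arguments: volume name, brick mount
# point, then the hosts to remove.
print_scale_down_plan() {
    local volume=$1 mount=$2
    shift 2
    local bricks="" host action
    for host in "$@"; do
        bricks="$bricks $host:$mount/$volume"
    done
    # start, status, and commit must all name exactly the same bricks.
    for action in start status commit; do
        echo "gluster volume remove-brick $volume$bricks $action"
    done
    # Detach each server only after its brick removal has been committed.
    for host in "$@"; do
        echo "gluster peer detach $host"
    done
    echo "gluster volume rebalance $volume start"
}
```

For example, print_scale_down_plan HadoopVol /mnt/brick2 old-node1 old-node2 prints six lines: three remove-brick commands, two peer detach commands, and the final rebalance command.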