OpenStack Data Processing
Red Hat OpenStack Platform 15
Manually provisioning and scaling Hadoop clusters in Red Hat OpenStack Platform
The OpenStack Data Processing feature allows you to easily provision and scale Hadoop clusters to process large datasets. This guide walks you through the entire OpenStack Data Processing workflow, which includes registering the Data Processing requirements (image, input data, job binaries), configuring templates used to provision clusters, processing data on those clusters, and scaling those clusters as necessary.
This release of OpenStack Data Processing includes a tab with wizards that help you create the templates needed to launch clusters and run jobs on them. This guide aims to give a more in-depth look at the OpenStack Data Processing workflow, and therefore walks you through template creation and component registration without using that tab.
Using the OpenStack Data Processing feature requires basic knowledge of data processing within the Hadoop framework. Users also need to be familiar with the particulars of their chosen Hadoop plug-in.