This Ansible playbook helps you set up a Cloudera Hadoop cluster with ease.
Ansible is a radically simple IT automation system. It handles configuration management, application deployment, cloud provisioning, ad-hoc task execution, and multinode orchestration, including zero-downtime rolling updates with load balancers.
Read the documentation and more at https://ansible.com/
Installation instructions for a variety of platforms are available in the Ansible documentation. Most users should install a released version of Ansible from pip, a package manager, or the Ansible release repository. Officially supported builds of Ansible are also available. Some power users run directly from the development branch. Significant effort goes into keeping devel reasonably stable, but you are more likely to encounter breaking changes when running Ansible this way.
The first playbook configures all nodes with the necessary system settings, packages, and networking for a Cloudera CDH deployment.
---
- name: Prepare servers for Cloudera Hadoop installation
  hosts: all
  become: yes
  vars:
    java_package: java-1.x.0-openjdk-devel
    cloudera_user: cloudera-scm
  tasks:
    - name: Install required packages
      yum:
        name:
          - "{{ java_package }}"
          - ntp
          - wget
          - curl
          - bind-utils
        state: present

    # Assumes the inventory names match the desired hostnames.
    - name: Set hostname
      hostname:
        name: "{{ inventory_hostname }}"

    - name: Update /etc/hosts
      blockinfile:
        path: /etc/hosts
        block: |
          192.168.1.10 master01
          192.168.1.11 data01
          192.168.1.12 data02

    # Disabling SELinux only takes full effect after a reboot.
    - name: Disable SELinux
      selinux:
        state: disabled

    - name: Stop and disable firewalld
      service:
        name: firewalld
        state: stopped
        enabled: no

    - name: Create cloudera-scm user
      user:
        name: "{{ cloudera_user }}"
        shell: /bin/bash
        state: present

    - name: Set vm.swappiness to 1
      sysctl:
        name: vm.swappiness
        value: "1"
        state: present
        reload: yes

    - name: Ensure NTP is enabled and started
      service:
        name: ntpd
        state: started
        enabled: yes
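The /etc/hosts task above hard-codes three addresses. As a hedged alternative, the same block could be generated from the inventory. This is only a sketch. It assumes that facts are gathered for every host in the play (the default) and that each node's primary interface is the one the cluster should use. Neither assumption is stated in the original playbook.

```yaml
# Sketch: build /etc/hosts entries from gathered facts instead of fixed IPs.
# Assumes gather_facts is enabled and ansible_default_ipv4 is the right interface.
- name: Update /etc/hosts from inventory
  blockinfile:
    path: /etc/hosts
    block: |
      {% for host in groups['all'] %}
      {{ hostvars[host]['ansible_default_ipv4']['address'] }} {{ host }}
      {% endfor %}
```

With this variant, adding a data node to the inventory is enough to have it appear in /etc/hosts on the next run.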
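---
- name: Install Cloudera CDH cluster via Cloudera Manager API
  hosts: master
  become: yes
  vars:
    cm_host: "master01"
    cm_user: "admin"
    cm_password: "admin"
    cluster_name: "Test-CDH-Cluster"
    parcel_version: "CDH-version.x.0"
  tasks:
    # Upload the deployment template to Cloudera Manager.
    # The file path is an assumption; point it at your cluster_template.json.
    - name: Add host templates for roles
      uri:
        url: "http://{{ cm_host }}:7180/api/v14/cm/deployment"
        method: PUT
        user: "{{ cm_user }}"
        password: "{{ cm_password }}"
        body: "{{ lookup('file', '../templates/cluster_template.json') }}"
        body_format: json
        status_code: 200

    - name: Trigger installation
      uri:
        url: "http://{{ cm_host }}:7180/api/v14/cm/commands/installCluster"
        method: POST
        user: "{{ cm_user }}"
        password: "{{ cm_password }}"
        status_code: 200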
You must define a cluster_template.json file containing the Cloudera Manager deployment template (including hosts, roles, and services such as HDFS, YARN, Hive, and Impala).
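Because the installation playbook talks to Cloudera Manager over port 7180, it can help to confirm that the port is reachable before the first API call. The task below is a minimal sketch using Ansible's wait_for module. The delay and timeout values are illustrative assumptions, not values from the original playbook.

```yaml
# Sketch: pause until Cloudera Manager accepts TCP connections on port 7180.
# delay/timeout are illustrative; tune them for your environment.
- name: Wait for Cloudera Manager to accept connections
  wait_for:
    host: "{{ cm_host }}"
    port: 7180
    delay: 5
    timeout: 300
```

Placed as the first task of the installation play, this fails early with a clear timeout instead of a connection error from the uri module.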
cloudera-ansible/
├── inventory/
│   └── hosts.ini
├── playbooks/
│   ├── prepare-nodes.yml
│   └── install-cdh.yml
├── templates/
│   └── cluster_template.json
└── group_vars/
    └── all.yml
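The layout lists group_vars/all.yml without showing its contents. One plausible use, sketched here as an assumption rather than the author's actual file, is to hold the variables that the two playbooks currently declare inline:

```yaml
# group_vars/all.yml (hypothetical contents, copied from the playbooks' vars)
java_package: java-1.x.0-openjdk-devel
cloudera_user: cloudera-scm
cm_host: master01
cm_user: admin
cluster_name: Test-CDH-Cluster
parcel_version: CDH-version.x.0
# cm_password is deliberately omitted; consider storing it with ansible-vault.
```

Two caveats apply. Play-level vars take precedence over group_vars, so the inline vars blocks would need to be removed for this file to take effect. Ansible also looks for group_vars next to the inventory or the playbook, so with this layout the directory may need to live under inventory/ or playbooks/ to be picked up.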
Use ansible-pull for isolated environments without centralized control.
Use screen or tmux when running playbooks on jump boxes.

Big Data administrators and DevOps teams can use these playbooks to automate and accelerate Cloudera CDH cluster deployment, improving consistency, reducing errors, and saving operational time. They still serve as a foundation for on-prem automation strategies.