An Alibaba Big Data Expert Walks You Through CDH: Deploying a Data Visualization Platform Step by Step


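I. Overview
II. System Environment Setup
1. Network configuration (all nodes)
2. Passwordless SSH login
3. Disable the firewall
4. Disable SELinux
5. Install the JDK
6. Configure NTP
7. Install and configure MySQL
8. Download the dependency packages
III. Installing Cloudera Manager Server & Agent
1. Install Cloudera Manager Server & Agent
2. Create the cloudera-scm user (all nodes)
3. Configure the CM Agent
4. Configure the CM Server database
5. Create the Parcel directories
6. Start the CM Server & Agent services
IV. Installing CDH5
V. Scripts
1. MySQL create and drop database scripts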


I. Overview

Operating system: CentOS 6

JDK version: 1.7.0_80


Required packages and versions:

CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel

CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel.sha

manifest.json

cloudera-manager-el6-cm5.4.3_x86_64.tar.gz


Cloudera Manager download page

http://www.cloudera.com/downloads/manager/5-4-3.html


CDH download directory

http://archive.cloudera.com/cdh5/parcels/5.4.0/

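Place the CDH5 Parcel files in the /opt/cloudera/parcel-repo/ directory on the master node.

Rename the downloaded .sha1 file to .sha (here, CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel.sha1 becomes CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel.sha). This step matters: otherwise the system will download the .parcel file all over again.


This article uses the offline installation method; for online installation, see the official documentation.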


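II. System Environment Setup

1. Network configuration (all nodes)

Edit /etc/sysconfig/network (vi /etc/sysconfig/network) and set the hostname.


Restart the network service with service network restart so the change takes effect.


Edit /etc/hosts (vi /etc/hosts) and map each node's IP address to its hostname.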
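As a rough sketch of the /etc/hosts step, the helper below appends an IP-to-hostname mapping only when that hostname is not mapped yet, so it is safe to run more than once. The function name add_host_entry and the addresses in the example are made up for illustration; substitute the real values of your cluster.

```shell
# add_host_entry HOSTS_FILE IP HOSTNAME
# Hypothetical helper: appends "IP HOSTNAME" unless HOSTNAME already appears
# as a name (field 2 or later) on some line of HOSTS_FILE.
add_host_entry() {
    hosts_file=$1; ip=$2; name=$3
    if ! awk -v n="$name" \
        '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' \
        "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Example (as root, on every node; addresses and names are placeholders):
#   add_host_entry /etc/hosts 192.168.1.10 master
#   add_host_entry /etc/hosts 192.168.1.11 slave1
```

Running the same call twice leaves a single line in the file, which helps when the script is pushed to all nodes repeatedly.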


2. Passwordless SSH login

On the master node, run:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

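This generates a key pair with an empty passphrase.


Copy the public key to the other nodes, then on each of those nodes append it to the authorized keys: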

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Test: SSH from the master node to each of the other nodes...
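The append step can be made repeatable with a small helper like the one below. This is only a sketch: authorize_key is a hypothetical name, and it assumes the master's public key file has already been copied to the target node. It also tightens permissions, because sshd with StrictModes enabled (the default) refuses key files that other users can write to.

```shell
# authorize_key PUBKEY_FILE AUTH_KEYS_FILE
# Hypothetical helper: appends the public key once and restricts permissions.
authorize_key() {
    pubkey_file=$1; auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x: whole-line match, -F: fixed string, so the key is never added twice
    if ! grep -qxF -- "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Example (run on each target node after copying id_dsa.pub there):
#   authorize_key ~/id_dsa.pub ~/.ssh/authorized_keys
```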


3. Disable the firewall

Stop it for the current session:

service iptables stop

Disable it permanently (takes effect after a reboot):

chkconfig iptables off


4. Disable SELinux

Disable it for the current session:

setenforce 0

Edit the configuration file /etc/selinux/config (takes effect after a reboot):

Change SELINUX=enforcing to SELINUX=disabled
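If you would rather not edit the file by hand on every node, the change can be scripted. The sketch below uses a hypothetical helper name, set_selinux_disabled, and relies only on sed; it rewrites the SELINUX= line, leaves SELINUXTYPE= alone, and keeps a .bak copy of the original file.

```shell
# set_selinux_disabled CONFIG_FILE
# Hypothetical helper: rewrites the SELINUX= line in place, keeping a .bak copy.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is not touched.
set_selinux_disabled() {
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Example (as root): set_selinux_disabled /etc/selinux/config
```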


Check the SELinux status:

1. /usr/sbin/sestatus -v

SELinux status: enabled (enabled: on; disabled: off)

2. Run the command: getenforce


5. Install the JDK

From the official documentation:

The Oracle JDK installer is available both as an RPM-based installer for RPM-based systems, and as a binary installer for other systems.


CDH 5.4.x is supported with the versions shown in the following table:

Minimum Supported Version    Recommended Version       Exceptions
1.7.0_55                     1.7.0_67 or 1.7.0_75      None
1.8.0_40                     1.8.0_40 or higher        None


This article installs from the RPM package. Run:

rpm -ivh jdk-7u80-linux-x64.rpm


Configure the environment variables by editing /etc/profile:

export JAVA_HOME=/usr/java/jdk1.7.0_80

export PATH=$JAVA_HOME/bin:$PATH

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar


Apply the changes:

source /etc/profile


Check the version:

[root@slave6 cdh]# java -version

java version "1.7.0_80"

Java(TM) SE Runtime Environment (build 1.7.0_80-b15)

Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
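To check the installed version against the 1.7.0_55 minimum from the support table on every node, something like the following could be used. It is a sketch only: jdk_update and jdk17_supported are made-up helper names, and only the 1.7 line of the table is covered.

```shell
# jdk_update VERSION -> prints the update number of a version such as 1.7.0_80
jdk_update() {
    printf '%s\n' "${1##*_}"
}

# jdk17_supported VERSION -> succeeds if VERSION is a 1.7.0 build at update 55 or later
jdk17_supported() {
    case $1 in
        1.7.0_*) [ "$(jdk_update "$1")" -ge 55 ] ;;
        *) return 1 ;;
    esac
}

# Example: java -version prints to stderr, so redirect it before parsing
#   v=$(java -version 2>&1 | sed -n 's/.*version "\(.*\)".*/\1/p')
#   jdk17_supported "$v" && echo "JDK $v meets the 1.7 minimum"
```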


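6. Configure NTP

Install NTP on all nodes:

yum install ntp


Enable it at boot:

chkconfig ntpd on


Verify the setting:

chkconfig --list ntpd (successful if runlevels 2-5 show on)


Synchronize the clock:

ntpdate -u ntp.sjtu.edu.cn (choose a time server that suits your environment; this article uses 210.72.145.44, the IP address of the National Time Service Center server)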


7. Install and configure MySQL

Choosing a MySQL version, from the official documentation:

Supported Databases:

Component    | MySQL                      | SQLite  | PostgreSQL                 | Oracle     | Derby - see Note 4
Oozie        | 5.5, 5.6                   | -       | 8.4, 9.2, 9.3 (see Note 2) | 11gR2      | Default
Flume        | -                          | -       | -                          | -          | Default (for the JDBC Channel only)
Hue          | 5.1, 5.5, 5.6 (see Note 6) | Default | 8.4, 9.2, 9.3 (see Note 2) | 11gR2      | -
Hive/Impala  | 5.5, 5.6 (see Note 1)      | -       | 8.4, 9.2, 9.3 (see Note 2) | 11gR2      | Default
Sentry       | 5.5, 5.6 (see Note 1)      | -       | 8.4, 9.2, 9.3 (see Note 2) | 11gR2      | -
Sqoop 1      | See Note 3                 | -       | See Note 3                 | See Note 3 | -
Sqoop 2      | See Note 4                 | -       | See Note 4                 | See Note 4 | Default


Note:

1. MySQL 5.5 is supported on CDH 5.1. MySQL 5.6 is supported on CDH 5.1 and later. The InnoDB storage engine must be enabled in the MySQL server.

2. PostgreSQL 9.2 is supported on CDH 5.1 and later. PostgreSQL 9.3 is supported on CDH 5.2 and later.

3. For the purposes of transferring data only, Sqoop 1 supports MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, Teradata 13.10 and above, and Netezza TwinFin 5.0 and above. The Sqoop metastore works only with HSQLDB (1.8.0 and higher 1.x versions; the metastore does not work with any HSQLDB 2.x versions).

4. Sqoop 2 can transfer data to and from MySQL 5.0 and above, PostgreSQL 8.4 and above, Oracle 10.2 and above, and Microsoft SQL Server 2012 and above. The Sqoop 2 repository database is supported only on Derby and PostgreSQL.

5. Derby is supported as shown in the table, but not always recommended. See the pages for individual components in the Cloudera Installation and Upgrade guide for recommendations.

6. CDH 5 Hue requires the default MySQL version of the operating system on which it is being installed (which is usually MySQL 5.1, 5.5 or 5.6).


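The detailed installation procedure is omitted here; this article uses MySQL 5.5.


Which databases are required, from the official documentation: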

The Cloudera Manager Server, Oozie Server, Sqoop Server, Activity Monitor, Reports Manager, Hive Metastore Server, Sentry Server, Cloudera Navigator Audit Server, and Cloudera Navigator Metadata Server all require databases. The type of data contained in the databases and their estimated sizes are as follows:

Cloudera Manager - Contains all the information about services you have configured and their role assignments, all configuration history, commands, users, and running processes. This relatively small database (<100 MB) is the most important to back up.

Important: When processes restart, the configuration for each of the services is redeployed using information that is saved in the Cloudera Manager database. If this information is not available, your cluster will not start or function correctly. You must therefore schedule and maintain regular backups of the Cloudera Manager database in order to recover the cluster in the event of the loss of this database.

Oozie Server - Contains Oozie workflow, coordinator, and bundle data. Can grow very large.

Sqoop Server - Contains entities such as the connector, driver, links and jobs. Relatively small.

Activity Monitor - Contains information about past activities. In large clusters, this database can grow large. Configuring an Activity Monitor database is only necessary if a MapReduce service is deployed.

Reports Manager - Tracks disk utilization and processing activities over time. Medium-sized.

Hive Metastore Server - Contains Hive metadata. Relatively small.

Sentry Server - Contains authorization metadata. Relatively small.

Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow large.

Cloudera Navigator Metadata Server - Contains authorization, policies, and audit report metadata. Relatively small.


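For the database creation steps and scripts, see Section III (step 4) and Section V.


Installation steps:

First install the server: yum install mysql-server -y

Check the MySQL status: service mysqld status (start, stop, and restart are the other available actions)

If the output shows mysqld (pid 1496) is running, the service is up; otherwise use start or restart to bring it up.

Once it is running, connect with the mysql command. The default user is root with an empty password; to set a password, run the following inside mysql:

mysql> set password for root@localhost=password('123');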


8. Download the dependency packages

chkconfig

python (2.6 required for CDH 5)

bind-utils

psmisc

libxslt

zlib

sqlite

cyrus-sasl-plain

cyrus-sasl-gssapi

fuse

portmap

fuse-libs

redhat-lsb
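To see which of the packages listed above are still missing on a node, a loop like this one may help. It is a sketch that assumes an RPM-based system; pkg_installed and missing_packages are hypothetical helper names. The --whatprovides query is used so that a name supplied by a differently named package still counts as installed.

```shell
# pkg_installed NAME -> succeeds if some installed RPM provides NAME
pkg_installed() {
    rpm -q --whatprovides "$1" >/dev/null 2>&1
}

# missing_packages NAME... -> prints each package that is not installed, one per line
missing_packages() {
    for pkg in "$@"; do
        pkg_installed "$pkg" || printf '%s\n' "$pkg"
    done
}

# Example (as root): install whatever is missing from the list above
#   missing=$(missing_packages chkconfig python bind-utils psmisc libxslt zlib sqlite \
#       cyrus-sasl-plain cyrus-sasl-gssapi fuse portmap fuse-libs redhat-lsb)
#   [ -n "$missing" ] && yum install -y $missing
```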


III. Installing Cloudera Manager Server & Agent

1. Install Cloudera Manager Server & Agent


Copy cloudera-manager-el6-cm5.4.3_x86_64.tar.gz to all Server and Agent nodes.

Create the cm directory:

mkdir /opt/cloudera-manager

Extract the cm archive:

tar xvzf cloudera-manager*.tar.gz -C /opt/cloudera-manager


2. Create the cloudera-scm user (all nodes)

About the cloudera-scm user, from the official documentation:

Cloudera Manager Server and managed services are configured to use the user account cloudera-scm by default, creating a user with this name is the simplest approach. This created user, is used automatically after installation is complete.


Run:

useradd --system --home=/opt/cloudera-manager/cm-5.4.3/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm


3. Configure the CM Agent

Edit server_host and server_port in /opt/cloudera-manager/cm-5.4.3/etc/cloudera-scm-agent/config.ini.
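The two settings can also be changed with sed instead of an editor. The sketch below assumes config.ini contains plain server_host=... and server_port=... lines; set_agent_server is a hypothetical helper name, and the host name master in the example is a placeholder for your CM Server node.

```shell
# set_agent_server CONFIG_FILE HOST PORT
# Hypothetical helper: rewrites the server_host and server_port lines in place
# and keeps a .bak copy of the original file.
set_agent_server() {
    sed -i.bak \
        -e "s/^server_host=.*/server_host=$2/" \
        -e "s/^server_port=.*/server_port=$3/" \
        "$1"
}

# Example ("master" is a placeholder; 7182 is assumed to be the port the
# CM Server listens on for agents):
#   set_agent_server /opt/cloudera-manager/cm-5.4.3/etc/cloudera-scm-agent/config.ini master 7182
```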


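After configuration, sync the agent directory to the other nodes. Repeat the command for each Agent node, where <agent-host> is a placeholder for that node's hostname:

scp -r /opt/cloudera-manager/cm-5.4.3 root@<agent-host>:/opt/cloudera-manager/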

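4. Configure the CM Server database

Copy the JDBC driver into the directory below (the copied file must have exactly the name shown, otherwise an error is reported):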

cp mysql-connector-java-5.1.31/mysql-connector-java-5.1.31-bin.jar /usr/share/java/mysql-connector-java.jar


Run:

mysql> grant all on *.* to 'temp'@'%' identified by 'temp' with grant option;

cd /opt/cloudera-manager/cm-5.4.3/share/cmf/schema

./scm_prepare_database.sh mysql -h myhost1.sf.cloudera.com -utemp -ptemp --scm-host myhost2.sf.cloudera.com scm scm scm

For example:


./scm_prepare_database.sh mysql cm -hlocalhost -uroot -p123 --scm-host localhost scm scm scm

(The arguments correspond to: database type, database server, user name, password, the node hosting the CM Server, ...)


mysql> drop user 'temp'@'%';

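If the previous step fails or is interrupted, drop all the databases and start over.


If you install components such as Oozie, you may need to create their databases by hand, for example: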

create database ooziecm DEFAULT CHARACTER SET utf8;

grant all on ooziecm.* TO 'ooziecm'@'%' IDENTIFIED BY 'ooziecm';


For the other create and drop database scripts, see Section V.


5. Create the Parcel directories

On the Manager node, create the directory /opt/cloudera/parcel-repo. Run:

mkdir -p /opt/cloudera/parcel-repo

chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo

Copy the downloaded files (CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel, CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel.sha, and manifest.json) into this directory.
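Before starting the server it can be worth checking that the parcel matches its hash file, since a mismatch is one reason the parcel would be fetched again. The sketch below assumes the .sha file holds the SHA-1 hex digest as its first field and that sha1sum is available; parcel_hash_ok is a hypothetical helper name.

```shell
# parcel_hash_ok PARCEL_FILE
# Succeeds if the SHA-1 of PARCEL_FILE equals the first field of PARCEL_FILE.sha.
parcel_hash_ok() {
    expected=$(awk 'NR == 1 { print $1 }' "$1.sha")
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example:
#   parcel_hash_ok /opt/cloudera/parcel-repo/CDH-5.4.0-1.cdh5.4.0.p0.27-el6.parcel \
#       && echo "parcel hash matches"
```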


On the Agent nodes, create the directory /opt/cloudera/parcels. Run:

mkdir -p /opt/cloudera/parcels

chown cloudera-scm:cloudera-scm /opt/cloudera/parcels


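6. Start the CM Server & Agent services

Run:

Server: /opt/cloudera-manager/cm-5.4.3/etc/init.d/cloudera-scm-server start

Agents: /opt/cloudera-manager/cm-5.4.3/etc/init.d/cloudera-scm-agent start


Open http://ManagerHost:7180. If the page loads (user name and password are both admin), the installation succeeded.

The Manager needs some time to start, because it creates its tables in the database during startup.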
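Since the web UI only comes up after that wait, a small retry loop can take the place of refreshing the browser. This is a generic sketch: wait_until is a hypothetical helper name, and the curl probe in the example assumes curl is installed and that ManagerHost resolves to your CM Server.

```shell
# wait_until TRIES DELAY CMD [ARGS...]
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
    tries=$1; delay=$2; shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@"; then return 0; fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Example: probe the CM web UI every 5 seconds for up to 5 minutes
#   wait_until 60 5 curl -fsS -o /dev/null http://ManagerHost:7180/ \
#       && echo "Cloudera Manager is up"
```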


IV. Installing CDH5


Once the CM Server and Agents have started successfully, log in to the web UI to install and configure CDH.


The free edition of CM5 no longer has the 50-node limit.

Once every Agent node has started normally, the corresponding nodes appear in the list of currently managed hosts.

Select the nodes to install on and click Continue.

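Next, if the Parcel package name is listed, the local Parcel is configured correctly and you can simply click Continue.

Click Continue. If the local Parcel is configured correctly, the Downloaded stage should finish almost instantly; after that, just wait for the distribution to complete, which takes roughly ten-odd minutes depending on the speed of the internal network.

(If there is a problem with the local Parcel, recheck Section III, step 5.)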

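Next comes the host inspection, where you may run into the following warning:

Cloudera recommends setting /proc/sys/vm/swappiness to 0. The current setting is 60. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf to keep it after a reboot. You can continue with the installation, but you may run into problems, with Cloudera Manager reporting that your hosts are unhealthy because of swapping. The following hosts are affected:

Running echo 0 > /proc/sys/vm/swappiness resolves this for the running system.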
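The echo only lasts until the next reboot; the warning itself points to /etc/sysctl.conf for keeping the value. A minimal sketch of that second half follows, using a hypothetical helper named persist_sysctl that replaces an existing line for the key or appends a new one.

```shell
# persist_sysctl CONF_FILE KEY VALUE
# Hypothetical helper: replaces an existing "KEY = ..." line or appends a new one.
# KEY is used as a regular expression, so the dots in it match any character;
# that is harmless for names such as vm.swappiness.
persist_sysctl() {
    conf=$1; key=$2; value=$3
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$conf" 2>/dev/null; then
        sed -i.bak "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$conf"
    else
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Example (as root):
#   sysctl -w vm.swappiness=0                        # change the running kernel
#   persist_sysctl /etc/sysctl.conf vm.swappiness 0  # keep it after a reboot
```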

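Next, choose the services to install.

The test here used the default Hadoop selection; in practice, choose according to your working environment.

For service configuration, the defaults are generally fine (Cloudera Manager configures the services automatically based on the machine specs; adjust the settings yourself if you need special tuning).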

Next comes the database setup; once the check passes you can proceed to the next step.

After that is the review page for the cluster settings; I kept the defaults throughout.

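Finally we get to installing the individual services. Note: if Hive or another component that uses an external database reports an error during installation, check that the JDBC driver JAR was copied to the right location and renamed as described when configuring the CM Server database.

Installing the services takes about half an hour or less.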

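After the installation completes, you can open the cluster page to see the current state of the cluster.

You may see the error "Unable to issue query: the request to the Service Monitor timed out" here. If every component installed without problems, this is usually because the server is sluggish; refresh the page after a moment.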


V. Scripts

1. MySQL create and drop database scripts

##amon

create database amon DEFAULT CHARACTER SET utf8;

grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';


##hive

create database hive DEFAULT CHARACTER SET utf8;

grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';


##oozie

create database oozie DEFAULT CHARACTER SET utf8;

grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
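The statements above all follow one pattern, so they can be generated per database name. The sketch below does that and also prints matching drop statements; note that the drop side is my assumed counterpart, since only the create statements are listed here. create_db_sql and drop_db_sql are hypothetical helper names, and using the database name as user name and password simply mirrors the statements above (pick real passwords outside a test cluster).

```shell
# create_db_sql NAME -> prints the CREATE/GRANT pair in the same form as above
create_db_sql() {
    printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$1"
    printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$1" "$1"
}

# drop_db_sql NAME -> prints cleanup statements (assumed counterpart, see lead-in)
drop_db_sql() {
    printf "drop database if exists %s;\n" "$1"
    printf "drop user '%s'@'%%';\n" "$1"
}

# Example: create the three databases in one go
#   for db in amon hive oozie; do create_db_sql "$db"; done | mysql -uroot -p
```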
