OpenShift Container Platform 4.3.0 Deployment Walkthrough


This walkthrough follows Red Hat's official documentation for installing OpenShift 4.3 on bare metal. Since I only have one PC with 64 GB of RAM, running the free edition of VMware vSphere 6.7 for this test, I tried installing with the memory halved relative to the minimums required by the official OCP documentation. The process is recorded below.


I found that Toutiao does not support Markdown, which is unfriendly to programmers; all formatting of pasted code is lost. Jianshu handles this better.

https://www.jianshu.com/p/7c0c2affadb8

1. The OCP installation process

The installation flow described in the official Red Hat documentation is as follows:

- The bootstrap machine boots and prepares the resources the masters need.
- The masters fetch the required resources from the bootstrap machine and finish booting.
- The masters form an etcd cluster through the bootstrap machine.
- The bootstrap machine starts a temporary Kubernetes control plane backed by that etcd cluster.
- The temporary control plane starts the production control plane on the master nodes.
- The temporary control plane shuts down and hands control over to the production control plane.
- The bootstrap machine injects the OCP components into the production control plane.
- The installer shuts down the bootstrap machine.
- The control plane deploys the compute nodes.
- The control plane installs the remaining services via Operators.

2. Preparing server resources

The servers are planned as follows:

Three control plane nodes run etcd, the control plane components, and the infra components; because resources are tight, no dedicated DNS server is deployed for the nodes, and host names are resolved through hosts files (a sample /etc/hosts sketch follows the table below). Two compute nodes run the actual workloads. One bootstrap node drives the installation. One misc/lb node stages the installation resources, boots the bootstrap process, and acts as the load balancer.

Hostname   vCPU  RAM  HDD   IP               FQDN
misc/lb    4     8g   120g  192.168.128.30   misc.ocptest.ipincloud.com / lb.ocptest.ipincloud.com
bootstrap  4     8g   120g  192.168.128.31   bootstrap.ocptest.ipincloud.com
master1    4     8g   120g  192.168.128.32   master1.ocptest.ipincloud.com
master2    4     8g   120g  192.168.128.33   master2.ocptest.ipincloud.com
master3    4     8g   120g  192.168.128.34   master3.ocptest.ipincloud.com
worker1    2     4g   120g  192.168.128.35   worker1.ocptest.ipincloud.com
worker2    2     4g   120g  192.168.128.36   worker2.ocptest.ipincloud.com
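For hosts-file resolution, something like the following could be distributed to each machine. This is a hypothetical sketch derived from the plan above; the api/api-int/registry aliases on 192.168.128.30 and the etcd-N aliases on the masters are my assumptions based on the DNS records described in the next section:

<code>
# /etc/hosts sketch (adjust to your own network plan)
192.168.128.30 misc.ocptest.ipincloud.com lb.ocptest.ipincloud.com api.ocptest.ipincloud.com api-int.ocptest.ipincloud.com registry.ipincloud.com
192.168.128.31 bootstrap.ocptest.ipincloud.com
192.168.128.32 master1.ocptest.ipincloud.com etcd-0.ocptest.ipincloud.com
192.168.128.33 master2.ocptest.ipincloud.com etcd-1.ocptest.ipincloud.com
192.168.128.34 master3.ocptest.ipincloud.com etcd-2.ocptest.ipincloud.com
192.168.128.35 worker1.ocptest.ipincloud.com
192.168.128.36 worker2.ocptest.ipincloud.com
</code>

Note that the wildcard *.apps record and the etcd SRV records cannot be expressed in a hosts file; those still need the DNS service described below.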

3. Preparing network resources

The API server and ingress share one LB, the misc/lb node, and DNS records are configured accordingly. ocptest is the cluster name and ipincloud.com is the base domain. These settings need to be reflected in the corresponding templates under tasks/ in the ansible playbook. See https://github.com/scwang18/ocp4-upi-helpernode.git

DNS configuration

- Kubernetes API: api.ocptest.ipincloud.com. Points to the load balancer for the control plane nodes. Must be resolvable from outside the cluster and from all nodes within it.
- Kubernetes API: api-int.ocptest.ipincloud.com. Points to the load balancer for the control plane nodes. Must be resolvable from all nodes within the cluster.
- Routes: *.apps.ocptest.ipincloud.com. A wildcard record pointing to the ingress SLB. Must be resolvable from outside the cluster and from all nodes within it.
- etcd: etcd-<index>.ocptest.ipincloud.com. One record per etcd node; must be resolvable from all nodes in the cluster.
- etcd: _etcd-server-ssl._tcp.ocptest.ipincloud.com. Because etcd serves peers on port 2380, each etcd node needs a corresponding SRV record with priority 0, weight 10, and port 2380, as shown in the table below.

etcd SRV DNS records

# The following records are required; they allow the etcd cluster built during bootstrap to automatically resolve its members

<code>
# _service._proto.name.                  TTL   class SRV priority weight port target.
_etcd-server-ssl._tcp.<cluster>.<base>   86400 IN    SRV 0        10     2380 etcd-0.<cluster>.<base>.
_etcd-server-ssl._tcp.<cluster>.<base>   86400 IN    SRV 0        10     2380 etcd-1.<cluster>.<base>.
_etcd-server-ssl._tcp.<cluster>.<base>   86400 IN    SRV 0        10     2380 etcd-2.<cluster>.<base>.
</code>
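Once the helper's DNS is up (section 5 below), the SRV records can be verified with dig; a quick check, assuming named listens on 192.168.128.30 as configured later:

<code>
# Expect three answers of the form "0 10 2380 etcd-N.ocptest.ipincloud.com."
dig +short SRV _etcd-server-ssl._tcp.ocptest.ipincloud.com @192.168.128.30
# And each etcd alias should resolve to the matching master IP
dig +short etcd-0.ocptest.ipincloud.com @192.168.128.30
</code>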

Create an SSH key pair and add it to the ssh agent

With a passwordless SSH key you can log in to the master nodes as the core user, for installation debugging and disaster recovery on the cluster.

(1) Run the following command on the misc node to create the SSH key:

<code>
ssh-keygen -t rsa -b 4096 -N ''
</code>

The command above creates two files, id_rsa and id_rsa.pub, under ~/.ssh/.

(2) Start the ssh-agent process and add the passwordless private key to it:

<code>
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
</code>

In the next step, when installing OCP, the SSH public key must be provided in the installer's configuration file.

Because we are provisioning resources manually (UPI), the public key also needs to be placed on each cluster node so that this machine can log in to them without a password (ssh-copy-id can do the same in one step, as sketched below):

<code>
# Copy the id_rsa.pub generated above into the ~/.ssh directory of the cluster node you want to log in to
scp ~/.ssh/id_rsa.pub root@192.168.128.31:~/.ssh/
# Then, on the cluster node, append the public key to ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
</code>
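Equivalently, ssh-copy-id automates both steps above. A minimal sketch, looping over the node IPs from the plan:

<code>
# Push the key and append it to authorized_keys in one step per node
for ip in 31 32 33 34 35 36; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.128.$ip
done
</code>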

4. Getting the installer

You need to register a Red Hat account to download the evaluation installer; the download steps themselves are omitted. https://cloud.redhat.com/openshift/install/metal/user-provisioned

Download the installer and images:

<code>
rm -rf /data/pkg
mkdir -p /data/pkg
cd /data/pkg
# OCP installer
# wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux-4.3.0.tar.gz
# OCP client
# wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-4.3.0.tar.gz
# RHCOS installer ISO
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer.iso
# RHCOS BIOS raw image
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-metal.raw.gz
# If you install from the ISO, the following two files are not needed
# RHCOS installer kernel, for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-kernel
# RHCOS initramfs image, for iPXE installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-initramfs.img
</code>
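These artifacts are large; if a download is interrupted, wget -c resumes instead of starting over. A small convenience sketch:

<code>
# Resume any partially downloaded artifacts in /data/pkg
cd /data/pkg
for f in rhcos-4.3.0-x86_64-installer.iso rhcos-4.3.0-x86_64-metal.raw.gz; do
  wget -c https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/$f
done
</code>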

5. Preparing the misc helper machine

The helper setup below, adapted from Wang Zheng's scripts, conveniently brings up the LB, DHCP, PXE, DNS, and HTTP services on the helper machine; a quick post-run sanity check is sketched after step (4).

(1) Install ansible and git

<code>
yum -y install ansible git
</code>

(2) Clone the playbook from GitHub

<code>
cd /data/pkg
git clone https://github.com/scwang18/ocp4-upi-helpernode.git
</code>

(3) Edit the playbook's variable file according to your own network plan

<code>
[root@misc pkg]# cd /data/pkg/ocp4-upi-helpernode/
[root@misc ocp4-upi-helpernode]# cat vars-static.yaml
---
staticips: true
named: true
helper:
  name: "helper"
  ipaddr: "192.168.128.30"
  networkifacename: "ens192"
dns:
  domain: "ipincloud.com"
  clusterid: "ocptest"
  forwarder1: "192.168.128.30"
  forwarder2: "192.168.128.30"
registry:
  name: "registry"
  ipaddr: "192.168.128.30"
yum:
  name: "yum"
  ipaddr: "192.168.128.30"
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.128.31"
masters:
  - name: "master1"
    ipaddr: "192.168.128.32"
  - name: "master2"
    ipaddr: "192.168.128.33"
  - name: "master3"
    ipaddr: "192.168.128.34"
workers:
  - name: "worker1"
    ipaddr: "192.168.128.35"
  - name: "worker2"
    ipaddr: "192.168.128.36"
force_ocp_download: false
ocp_bios: "file:///data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz"
ocp_initramfs: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-initramfs.img"
ocp_install_kernel: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-kernel"
ocp_client: "file:///data/pkg/openshift-client-linux-4.3.0.tar.gz"
ocp_installer: "file:///data/pkg/openshift-install-linux-4.3.0.tar.gz"
ocp_filetranspiler: "file:///data/pkg/filetranspiler-master.zip"
registry_server: "registry.ipincloud.com:8443"
</code>

(4) Run the ansible playbook

<code>
ansible-playbook -e @vars-static.yaml tasks/main.yml
</code>
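After the playbook finishes, it is worth confirming the helper services actually came up. A quick check; the exact service names (named, haproxy, httpd, dhcpd) are assumptions based on the standard packages the playbook uses:

<code>
# Spot-check the helper services
for svc in named haproxy httpd dhcpd; do
  echo -n "$svc: "; systemctl is-active $svc
done
# DNS sanity check against the helper
dig +short api.ocptest.ipincloud.com @192.168.128.30
</code>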

6. Preparing the docker env (mirror registry)

<code>
# On a machine that can reach the Internet, pull and package the required images
# rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
# The upstream script below is not used here; a modified version follows
# wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.3/scripts/build.dist.sh
yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip
pip3 install yq
# https://blog.csdn.net/ffzhihua/article/details/85237411
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# To download the pull-secret.json, open the following link:
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"xxxxxxxxxxx"}}}
EOF
</code>

Create the build.dist.sh file:

<code>#!/usr/bin/env bashset -eset -xvar_date=$(date '+%Y-%m-%d')echo $var_date#以下不用每次都执行#cat << EOF >> /etc/hosts#127.0.0.1 registry.ipincloud.com#EOF#mkdir -p /etc/crts/#cd /etc/crts#openssl req \\# -newkey rsa:2048 -nodes -keyout ipincloud.com.key \\# -x509 -days 3650 -out ipincloud.com.crt -subj \\# "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ipincloud.com"#cp /etc/crts/ipincloud.com.crt /etc/pki/ca-trust/source/anchors/#update-ca-trust extractsystemctl stop docker-distributionrm -rf /data/registrymkdir -p /data/registrycat << EOF > /etc/docker-distribution/registry/config.ymlversion: 0.1log: fields: service: registrystorage: cache: layerinfo: inmemory filesystem: rootdirectory: /data/registry delete: enabled: truehttp: addr: :8443 tls: certificate: /etc/crts/ipincloud.com.crt key: /etc/crts/ipincloud.com.keyEOFsystemctl restart dockersystemctl enable docker-distributionsystemctl restart docker-distributionbuild_number_list=$(cat << EOF4.3.0EOF)mkdir -p /data/ocp4cd /data/ocp4install_build() { BUILDNUMBER=$1 echo ${BUILDNUMBER} mkdir -p /data/ocp4/${BUILDNUMBER} cd /data/ocp4/${BUILDNUMBER} #下载并安装openshift客户端和安装程序 第一次需要运行,工具机ansi初始化时,已经完成这些动作了 #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/release.txt #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz #解压安装程序和客户端到用户执行目录 第一次需要运行 #tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/ #tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/ export OCP_RELEASE=${BUILDNUMBER} export LOCAL_REG='registry.ipincloud.com:8443' export LOCAL_REPO='ocp4/openshift4' export UPSTREAM_REPO='openshift-release-dev' export LOCAL_SECRET_JSON="/data/pull-secret.json" export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} export RELEASE_NAME="ocp-release" oc adm release mirror -a ${LOCAL_SECRET_JSON} \\ --from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \\ --to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \\ --to=${LOCAL_REG}/${LOCAL_REPO}}while read -r line; do install_build $linedone <<< "$build_number_list"cd /data/ocp4#wget -O ocp4-upi-helpernode-master.zip https://github.com/wangzheng422/ocp4-upi-helpernode/archive/master.zip#以下注释,因为quay.io/wangzheng422这个仓库的registry版本是v1不能与v2共存#podman pull quay.io/wangzheng422/filetranspiler#podman save quay.io/wangzheng422/filetranspiler | pigz -c > filetranspiler.tgz#podman pull docker.io/library/registry:2#podman save docker.io/library/registry:2 | pigz -c > registry.tgzsystemctl start dockerdocker login -u wuliangye2019 -p Red@123! registry.redhat.iodocker login -u wuliangye2019 -p Red@123! registry.access.redhat.comdocker login -u wuliangye2019 -p Red@123! registry.connect.redhat.compodman login -u wuliangye2019 -p Red@123! registry.redhat.iopodman login -u wuliangye2019 -p Red@123! registry.access.redhat.compodman login -u wuliangye2019 -p Red@123! 
registry.connect.redhat.com# 以下命令要运行 2-3个小时,耐心等待。。。# build operator catalogpodman login registry.ipincloud.com:8443 -u root -p Scwang18oc adm catalog build \\ --appregistry-endpoint https://quay.io/cnr \\ --appregistry-org redhat-operators \\ --to=${LOCAL_REG}/ocp4-operator/redhat-operators:v1 oc adm catalog mirror \\ ${LOCAL_REG}/ocp4-operator/redhat-operators:v1 \\ ${LOCAL_REG}/operator#cd /data#tar cf - registry/ | pigz -c > registry.tgz#cd /data#tar cf - ocp4/ | pigz -c > ocp4.tgz/<code>

Run the build.dist.sh script

There is a huge pitfall here. The image pull from quay.io is more than 5 GB and usually does not complete in one go. Each time it failed, re-running build.dist.sh deleted the existing registry and started from scratch, wasting a lot of time. In fact there is no need to delete anything: when oc adm release mirror runs again, it automatically skips images that already exist in the local registry. A lesson paid for in blood and tears. (A quick way to inspect what has already been mirrored is sketched after the command below.)

<code>
bash build.dist.sh
</code>
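Before or after a re-run, the contents of the local mirror can be listed with the standard Docker registry v2 HTTP API; a small sketch:

<code>
# List repositories and tags already present in the local mirror registry
curl -sk https://registry.ipincloud.com:8443/v2/_catalog | jq .
curl -sk https://registry.ipincloud.com:8443/v2/ocp4/openshift4/tags/list | jq .
</code>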

After oc adm release mirror finishes, it will have populated the local mirror from the official registry and printed a summary. Record this output, especially the imageContentSources section, which must be added to install-config.yaml later.

<code>
Success
Update image: registry.ipincloud.com:8443/ocp4/openshift4:4.3.0
Mirror prefix: registry.ipincloud.com:8443/ocp4/openshift4

To use the new mirrored repository to install, add the following section to the install-config.yaml:

imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
</code>

The following commands do not need to be run; build.dist.sh has already executed them. They are kept here for reference and for manual retries:

<code>
oc adm release mirror -a /data/pull-secret.json \
  --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 \
  --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 \
  --to=registry.ipincloud.com:8443/ocp4/openshift4

podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
  --appregistry-endpoint https://quay.io/cnr \
  --appregistry-org redhat-operators \
  --to=registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
  registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1 \
  registry.ipincloud.com:8443/operator

# If oc adm catalog mirror fails, it generates a mapping.txt file. Remove the failed
# lines from a copy of that file and retry as follows:
oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/ocp4/openshift4/nfs-client-provisioner:latest
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/quay.io/external_storage/nfs-client-provisioner:latest

# Look up an image's sha digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}'
# Delete an image manifest by digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/sha256:022ea0b0d69834b652a4c53655d78642ae23f0324309097be874fb58d09d2919
# Reclaim registry space
podman exec -it mirror-registry /bin/registry garbage-collect /etc/docker/registry/config.yml
</code>

7. Creating the installer configuration file

(1) Create the install directory

<code>
rm -rf /data/install
mkdir -p /data/install
cd /data/install
</code>

(2) Customize the install-config.yaml file

Fill in the pullSecret:

<code>
[root@misc data]# cat /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"omitted"}}}
</code>

Add the sshKey (the content of the public key file created in section 3, step (1)):

<code>
cat ~/.ssh/id_rsa.pub
</code>

Add the additionalTrustBundle (the certificate generated when the mirror registry was created):

<code>
[root@misc crts]# cat /etc/crts/ipincloud.com.crt
-----BEGIN CERTIFICATE-----
xxx omitted
-----END CERTIFICATE-----
</code>

Add the proxy settings:

A production environment does not have to reach the Internet directly; a proxy can be configured for the cluster in install-config.yaml.

For this test, to speed up downloads from outside, I set up a v2ray server on AWS in advance, with the misc server acting as the v2ray client; the setup is described in another article.

Two lessons from repeated attempts. First, when retrying, you must rm -rf the whole directory containing install-config.yaml (here, install) rather than rm -rf install/*; the latter leaves behind the hidden file .openshift_install_state.json, which can cause: x509: certificate has expired or is not yet valid. Second, in the docs and blog examples the cidr in install-config.yaml is a 10.x network; without reading the docs carefully I took it to be the node network, which caused the most baffling error of the whole process: no matches for kind MachineConfig. The final file content is as follows:

<code>
[root@centos75 install]# vi install-config.yaml
apiVersion: v1
baseDomain: ipincloud.com
proxy:
  httpProxy: http://192.168.128.30:8001
  httpsProxy: http://192.168.128.30:8001
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocptest
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{ omitted }}'
sshKey: 'ssh-rsa omitted'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  omitted; note that every line here must be indented by two spaces
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
</code>

(3) Back up the customized install-config.yaml so it can be reused later

<code>
cd /data/install
cp install-config.yaml ../install-config.yaml.20200205
</code>

8. Creating the Kubernetes manifests and Ignition config files

(1) Generate the Kubernetes manifests

<code>
openshift-install create manifests --dir=/data/install
</code>

Note: when specifying the directory containing install-config.yaml, use an absolute path.

(2) Edit manifests/cluster-scheduler-02-config.yml to prevent pods from being scheduled on the control plane nodes

The official Red Hat installation docs note that Kubernetes does not support having the ingress load balancer reach pods on the control plane nodes.

a. Open manifests/cluster-scheduler-02-config.yml
b. Find the mastersSchedulable parameter and set it to false
c. Save and exit.

<code>
vi /data/install/manifests/cluster-scheduler-02-config.yml
</code>
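If you prefer not to edit by hand, a one-liner does the same thing; a sketch, assuming the key appears as "mastersSchedulable: true" in the generated file:

<code>
# Flip mastersSchedulable from true to false in place, then confirm
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' /data/install/manifests/cluster-scheduler-02-config.yml
grep mastersSchedulable /data/install/manifests/cluster-scheduler-02-config.yml
</code>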

(3) Create the Ignition config files

Note: creating the Ignition configs deletes install-config.yaml, so be sure to back that file up first.

<code>
openshift-install create ignition-configs --dir=/data/install
</code>

(4) Copy the Ignition files into the HTTP server directory, ready for installation

<code>
cd /data/install
\cp -f bootstrap.ign /var/www/html/ignition/bootstrap.ign
\cp -f master.ign /var/www/html/ignition/master1.ign
\cp -f master.ign /var/www/html/ignition/master2.ign
\cp -f master.ign /var/www/html/ignition/master3.ign
\cp -f worker.ign /var/www/html/ignition/worker1.ign
\cp -f worker.ign /var/www/html/ignition/worker2.ign
cd /var/www/html/ignition/
chmod 755 *.ign
</code>
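Before booting any node it is cheap to confirm the files are actually reachable over HTTP; a sketch, assuming the helper serves on port 8080 as used by the ISO boot parameters below:

<code>
# Each URL should return HTTP 200 (the JSON ignition config)
for n in bootstrap master1 master2 master3 worker1 worker2; do
  curl -s -o /dev/null -w "%{http_code} $n.ign\n" http://192.168.128.30:8080/ignition/$n.ign
done
</code>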

At this point all the required configuration files are in place, and we can move on to creating the nodes.

9. Customizing the RHCOS ISO

The boot parameters must be modified during installation, and typing them by hand on every machine is tedious and error-prone, so we use genisoimage to build a customized installation ISO for each machine.

<code>
# Install the image-building tools
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd

# Set environment variables
export NGINX_DIRECTORY=/data/pkg
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')
# Create a temporary directory for intermediate files
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}

# Extract the ISO content using guestfish (to avoid sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
  -m /dev/sda tar-out / - | tar xvf -

# Function that patches the boot config files
modify_cfg(){
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    # Append the proper image and ignition URLs
    sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
    # Shorten the boot menu wait time
    sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
  done
}

# Common ISO boot parameters: URL, gateway, netmask, DNS
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"

# Variables for the bootstrap node
NODE="bootstrap"
IP="192.168.128.31"
FQDN="bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master1 node
NODE="master1"
IP="192.168.128.32"
FQDN="master1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master2 node
NODE="master2"
IP="192.168.128.33"
FQDN="master2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the master3 node
NODE="master3"
IP="192.168.128.34"
FQDN="master3"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the worker1 node
NODE="worker1"
IP="192.168.128.35"
FQDN="worker1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Variables for the worker2 node
NODE="worker2"
IP="192.168.128.36"
FQDN="worker2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Build a separate installation ISO for each node
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in bootstrap master1 master2 master3 worker1 worker2; do
  # Put that node's grub.cfg and isolinux.cfg in place
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    /bin/cp -f $(pwd)/${node}_${file##*/} ${file}
  done
  # Create the ISO image
  genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
    -eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table \
    -eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
    -o ${NGINX_DIRECTORY}/${node}.iso .
done

# Clean up the intermediate files
cd
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}
</code>

10. Installing RHCOS on the nodes

(1) Copy the customized ISO files to the VMware ESXi host, ready to build the nodes

<code>
[root@misc pkg]# scp bootstrap.iso root@192.168.128.200:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp m*.iso root@192.168.128.200:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp w*.iso root@192.168.128.200:/vmfs/volumes/hdd/iso
</code>

(2) Create the master VMs per the plan and set them to boot from the ISO

- After the boot menu appears, just start the install; the system automatically downloads the BIOS image and config files and completes the installation.
- After installation finishes, eject the ISO so the machine does not boot back into the installer.
- The installation order is bootstrap, master1, master2, master3; once the masters are installed and running, install the workers.
- Progress can be watched through the proxy at http://registry.ipincloud.com:9000/
- Detailed bootstrap progress can also be watched on the misc node:

<code>
openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
</code>

Notes:

Make sure each ISO is matched with the right ignition file. During my install, master1 reported etcdmain: member ab84b6a6e4a3cc9a has already been bootstrapped, and it took a long time to analyze and fix. When master1 first finished installing, the etcd component was installed automatically and registered itself as a member. After I reinstalled master1 from the ISO, the automatic etcd registration detected that the cluster already contained this member and refused to register again, so etcd on that node could never start. The solution:

Manually edit the etcd yaml file on the master1 node, appending the --initial-cluster-state=existing parameter to the end of the exec etcd command, then delete the problem pod; the system automatically re-creates the etcd pod and recovers. Once everything is running normally, revert this change, otherwise machine-config will never complete.

<code>
#[root@master1 /]# vi /etc/kubernetes/manifests/etcd-member.yaml
exec etcd \
  --initial-advertise-peer-urls=https://${ETCD_IPV4_ADDRESS}:2380 \
  --cert-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.crt \
  --key-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.key \
  --trusted-ca-file=/etc/ssl/etcd/ca.crt \
  --client-cert-auth=true \
  --peer-cert-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.crt \
  --peer-key-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.key \
  --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt \
  --peer-client-cert-auth=true \
  --advertise-client-urls=https://${ETCD_IPV4_ADDRESS}:2379 \
  --listen-client-urls=https://0.0.0.0:2379 \
  --listen-peer-urls=https://0.0.0.0:2380 \
  --listen-metrics-urls=https://0.0.0.0:9978 \
  --initial-cluster-state=existing

[root@master1 /]# crictl pods
POD ID          CREATED          STATE   NAME                                        NAMESPACE        ATTEMPT
c4686dc3e5f4f   38 minutes ago   Ready   etcd-member-master1.ocptest.ipincloud.com   openshift-etcd   5
[root@master1 /]# crictl rmp xxx
</code>

Check whether the installation is complete:
If INFO It is now safe to remove the bootstrap resources appears, the master installation is complete and the control plane has moved to the master cluster.

<code>
[root@misc install]# openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocptest.ipincloud.com:6443...
INFO API v1.16.2 up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
[root@misc install]#
</code>

(3) Install the workers

- After the boot menu appears, just start the install; the system automatically downloads the BIOS image and config files and completes the installation.
- After installation finishes, eject the ISO to avoid booting back into the installer.
- Progress can be watched through the proxy at http://registry.ipincloud.com:9000/, and the detailed installation progress on the misc node:

<code>
[root@misc redhat-operators-manifests]# openshift-install --dir=/data/install wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the cluster at https://api.ocptest.ipincloud.com:6443 to initialize...
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocptest.ipincloud.com
INFO Login to the console with user: kubeadmin, password: pubmD-8Baaq-IX36r-WIWWf
</code>

The worker nodes' join requests (CSRs) must be approved before they become nodes.

Check the pending CSRs:

<code>
[root@misc ~]# oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-7lln5   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-d48xk   69m   system:node:master1.ocptest.ipincloud.com                                   Approved,Issued
csr-f2g7r   69m   system:node:master2.ocptest.ipincloud.com                                   Approved,Issued
csr-gbn2n   69m   system:node:master3.ocptest.ipincloud.com                                   Approved,Issued
csr-hwxwx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-ppgxx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-wg874   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zkp79   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
[root@misc ~]#
</code>

Approve them:

<code>
oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
</code>
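Each worker triggers two rounds of CSRs: after its node-bootstrapper request is approved, a second system:node request appears and must be approved as well. A small loop (a sketch) saves re-running the command by hand:

<code>
# Keep approving pending CSRs until both workers show up as nodes
for i in 1 2 3 4 5 6; do
  oc get csr -ojson | jq -r '.items[] | select(.status == {}) | .metadata.name' | xargs -r oc adm certificate approve
  oc get nodes
  sleep 30
done
</code>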

(4) Start the NFS provisioner on misc

<code>
bash /data/pkg/ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
# Check the status
oc get pods -n nfs-provisioner
</code>

(5) Use NFS as the storage backend for OCP's internal registry

<code>
oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"storage":{"pvc":{"claim":""}}}}' --type=merge
oc get clusteroperator image-registry
</code>
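After the patch, the image-registry operator should create and bind a PVC served by the NFS provisioner; a quick check (a sketch, assuming the operator creates its default claim in the openshift-image-registry namespace):

<code>
# The registry's PVC should reach Bound, and the registry pods should come up
oc get pvc -n openshift-image-registry
oc get pods -n openshift-image-registry
</code>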

11. Configuring login

(1) Create a regular administrator account

<code>
# Create the admin credentials (htpasswd file) on the misc machine
mkdir -p ~/auth
htpasswd -bBc ~/auth/admin-passwd admin scwang18
# Copy it to the local workstation
mkdir -p ~/auth
scp -P 20030 root@218.89.67.36:/root/auth/admin-passwd ~/auth/
# On the OAuth Details page, add an HTPasswd-type Identity Provider and upload the admin-passwd file:
# https://console-openshift-console.apps.ocptest.ipincloud.com
# Grant the new admin user the cluster administrator role
oc adm policy add-cluster-role-to-user cluster-admin admin
</code>
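Once the identity provider is active, the new account can be verified from the command line; a minimal sketch, using the password set in the htpasswd step above:

<code>
# Log in as the new admin user and confirm cluster-admin rights
oc login https://api.ocptest.ipincloud.com:6443 -u admin -p scwang18
oc whoami
oc get nodes
</code>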