OpenShift Container Platform 4.3.0 Deployment Notes


This walkthrough follows Red Hat's official documentation for installing OpenShift 4.3 on bare metal. Since only a single PC with 64 GB of RAM was available, running the free edition of VMware vSphere 6.7, the installation was attempted with half the memory the official OCP documentation lists as the minimum. The process is recorded below.


Toutiao turns out not to support Markdown, which is unfriendly to programmers: pasted code loses all its formatting. Jianshu handles this better:

https://www.jianshu.com/p/7c0c2affadb8

1. The OCP installation process

The installation flow as documented by Red Hat:

  1. The bootstrap machine boots and prepares the resources the masters need
  2. The masters fetch those resources from the bootstrap machine and finish booting
  3. The masters build the etcd cluster through the bootstrap machine
  4. The bootstrap machine starts a temporary Kubernetes control plane backed by that etcd cluster
  5. The temporary control plane starts the production control plane on the master nodes
  6. The temporary control plane shuts down and hands control to the production control plane
  7. The bootstrap machine injects the OCP components into the production control plane
  8. The installer shuts down the bootstrap machine
  9. The control plane deploys the compute nodes
  10. The control plane installs the remaining services as operators

2. Preparing the server resources

The server plan is as follows:

  • 3 control plane nodes, running etcd, the control plane components, and the infra components; resources being tight, no dedicated DNS server is deployed and names are resolved via hosts files;
  • 2 compute nodes, running the actual workloads;
  • 1 bootstrap node, which drives the installation;
  • 1 misc/lb node, used to stage the installation resources, boot the bootstrap node, and serve as the LB.

Hostname   vcpu  ram  hdd   ip              fqdn
misc/lb    4     8g   120g  192.168.128.30  misc.ocptest.ipincloud.com / lb.ocptest.ipincloud.com
bootstrap  4     8g   120g  192.168.128.31  bootstrap.ocptest.ipincloud.com
master1    4     8g   120g  192.168.128.32  master1.ocptest.ipincloud.com
master2    4     8g   120g  192.168.128.33  master2.ocptest.ipincloud.com
master3    4     8g   120g  192.168.128.34  master3.ocptest.ipincloud.com
worker1    2     4g   120g  192.168.128.35  worker1.ocptest.ipincloud.com
worker2    2     4g   120g  192.168.128.36  worker2.ocptest.ipincloud.com
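Since no dedicated DNS server is deployed and names are resolved via hosts files, the mapping above can be captured once and appended to each node's /etc/hosts. A minimal sketch (written to /tmp here; the file name is arbitrary):

```shell
# Host/IP mapping taken from the plan table above
cat > /tmp/ocptest-hosts <<'EOF'
192.168.128.30 misc.ocptest.ipincloud.com lb.ocptest.ipincloud.com
192.168.128.31 bootstrap.ocptest.ipincloud.com
192.168.128.32 master1.ocptest.ipincloud.com
192.168.128.33 master2.ocptest.ipincloud.com
192.168.128.34 master3.ocptest.ipincloud.com
192.168.128.35 worker1.ocptest.ipincloud.com
192.168.128.36 worker2.ocptest.ipincloud.com
EOF
# On each node, append it: cat /tmp/ocptest-hosts >> /etc/hosts
```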

3. Preparing the network resources

The API server and ingress share a single LB, the misc/lb node. The DNS records are configured as below, where ocptest is the cluster name and ipincloud.com is the base domain. These settings require modifying the corresponding templates under tasks/ in the ansible playbook; see https://github.com/scwang18/ocp4-upi-helpernode.git

  • DNS records

  • Kubernetes API — api.ocptest.ipincloud.com: points to the load balancer for the control plane nodes; must be resolvable from outside the cluster and from every node in it.
  • Kubernetes API — api-int.ocptest.ipincloud.com: points to the load balancer for the control plane nodes; must be resolvable from every node in the cluster.
  • Routes — *.apps.ocptest.ipincloud.com: wildcard record pointing to the ingress LB; must be resolvable from outside the cluster and from every node in it.
  • etcd — etcd-<index>.ocptest.ipincloud.com: one record per etcd node; must be resolvable from every node in the cluster.
  • etcd — _etcd-server-ssl._tcp.ocptest.ipincloud.com: because etcd serves its peers on port 2380, one SRV record per etcd node is required, with priority 0, weight 10, and port 2380, as in the table below.

  • etcd SRV DNS records

# The following records are required: they let the etcd servers created by bootstrap auto-configure etcd service discovery.

#_service._proto.name.                  TTL   class SRV priority weight port target.
_etcd-server-ssl._tcp.<cluster>.<base>  86400 IN SRV 0 10 2380 etcd-0.<cluster>.<base>.
_etcd-server-ssl._tcp.<cluster>.<base>  86400 IN SRV 0 10 2380 etcd-1.<cluster>.<base>.
_etcd-server-ssl._tcp.<cluster>.<base>  86400 IN SRV 0 10 2380 etcd-2.<cluster>.<base>.
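The field layout of these records can be sanity-checked with a quick parse. A minimal sketch using awk on the etcd-0 record, with <cluster>.<base> filled in for this environment (the one-line format shown is an assumption about how a resolver prints SRV records):

```shell
# name TTL class SRV priority weight port target
srv='_etcd-server-ssl._tcp.ocptest.ipincloud.com. 86400 IN SRV 0 10 2380 etcd-0.ocptest.ipincloud.com.'
port=$(echo "$srv" | awk '{print $7}')     # 7th field: port, must be 2380
target=$(echo "$srv" | awk '{print $8}')   # 8th field: the etcd peer FQDN
echo "etcd peer at $target port $port"
```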

  • Create an SSH key pair and add it to the ssh-agent

With a passwordless SSH private key, you can log in to the master nodes as the core user to debug the installation and perform disaster recovery on the cluster.

(1) Run the following command on the misc node to create the SSH key:

<code>ssh-keygen -t rsa -b 4096 -N ''</code>

The command above creates the files id_rsa and id_rsa.pub under ~/.ssh/.

(2) Start the ssh-agent process and add the passwordless private key to it:

<code>eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa</code>

In the next step, the SSH public key is supplied to the installer through its configuration file.

Because resources are prepared manually here, the public key must be pushed to every cluster node so that this machine can log in to the nodes without a password:

<code># Copy the id_rsa.pub just generated under ~/.ssh to ~/.ssh on each cluster node you want to log in to
scp ~/.ssh/id_rsa.pub [email protected]:~/.ssh/
# Then, on the cluster node, append the public key to ~/.ssh/authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys</code>

4. Obtaining the installer

A Red Hat account is required to download the installer; the details of registration and download are omitted. https://cloud.redhat.com/openshift/install/metal/user-provisioned

  • Download the installer and images
<code>rm -rf /data/pkg
mkdir -p /data/pkg
cd /data/pkg
# OCP installer
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux-4.3.0.tar.gz
# OCP client
#wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux-4.3.0.tar.gz
# RHCOS installer ISO
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer.iso
# RHCOS BIOS raw image
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-metal.raw.gz
# If installing from the ISO, the two files below are not needed
# RHCOS installer kernel, for iPXE-based installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-kernel
# RHCOS initramfs image, for iPXE-based installs
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/latest/rhcos-4.3.0-x86_64-installer-initramfs.img</code>

5. Preparing the misc utility host

This utility-host setup is adapted from Wang Zheng's scripts; it conveniently brings up the LB, DHCP, PXE, DNS, and HTTP services on the utility host.

(1) Install ansible and git

<code>yum -y install ansible git</code>

(2) Pull the playbook from GitHub

<code>cd /data/pkg
git clone https://github.com/scwang18/ocp4-upi-helpernode.git</code>

(3) Edit the playbook's variable file to match your own network plan

<code>[root@centos75 pkg]# cd /data/pkg/ocp4-upi-helpernode/
[root@centos75 ocp4-upi-helpernode]# cat vars-static.yaml
---
staticips: true
named: true
helper:
  name: "helper"
  ipaddr: "192.168.128.30"
  networkifacename: "ens192"
dns:
  domain: "ipincloud.com"
  clusterid: "ocptest"
  forwarder1: "192.168.128.30"
  forwarder2: "192.168.128.30"
  registry:
    name: "registry"
    ipaddr: "192.168.128.30"
  yum:
    name: "yum"
    ipaddr: "192.168.128.30"
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.128.31"
masters:
  - name: "master1"
    ipaddr: "192.168.128.32"
  - name: "master2"
    ipaddr: "192.168.128.33"
  - name: "master3"
    ipaddr: "192.168.128.34"
workers:
  - name: "worker1"
    ipaddr: "192.168.128.35"
  - name: "worker2"
    ipaddr: "192.168.128.36"
force_ocp_download: false
ocp_bios: "file:///data/pkg/rhcos-4.3.0-x86_64-metal.raw.gz"
ocp_initramfs: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-initramfs.img"
ocp_install_kernel: "file:///data/pkg/rhcos-4.3.0-x86_64-installer-kernel"
ocp_client: "file:///data/pkg/openshift-client-linux-4.3.0.tar.gz"
ocp_installer: "file:///data/pkg/openshift-install-linux-4.3.0.tar.gz"
ocp_filetranspiler: "file:///data/pkg/filetranspiler-master.zip"
registry_server: "registry.ipincloud.com:8443"
[root@misc pkg]#</code>

(4) Run the ansible playbook

<code>ansible-playbook -e @vars-static.yaml tasks/main.yml</code>

6. Preparing the docker env and mirror registry

<code># On a machine with unrestricted internet access, stage the required images
#rm -rf /data/ocp4
mkdir -p /data/ocp4
cd /data/ocp4
# This script is awkward to use; do not download it. A modified version is used below.
# wget https://raw.githubusercontent.com/wangzheng422/docker_env/dev/redhat/ocp4/4.3/scripts/build.dist.sh
yum -y install podman docker-distribution pigz skopeo docker buildah jq python3-pip
pip3 install yq
# https://blog.csdn.net/ffzhihua/article/details/85237411
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm2cpio python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm | cpio -iv --to-stdout ./etc/rhsm/ca/redhat-uep.pem | tee /etc/rhsm/ca/redhat-uep.pem
systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
# To download the pull-secret.json, open the following link:
# https://cloud.redhat.com/openshift/install/metal/user-provisioned
cat << 'EOF' > /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"xxxxxxxxxxx"}}}
EOF</code>

Create the build.dist.sh file:

<code>#!/usr/bin/env bash
set -e
set -x

var_date=$(date '+%Y-%m-%d')
echo $var_date

# The following does not need to run every time
#cat << EOF >> /etc/hosts
#127.0.0.1 registry.ipincloud.com
#EOF
#mkdir -p /etc/crts/
#cd /etc/crts
#openssl req \
#   -newkey rsa:2048 -nodes -keyout ipincloud.com.key \
#   -x509 -days 3650 -out ipincloud.com.crt -subj \
#   "/C=CN/ST=GD/L=SZ/O=Global Security/OU=IT Department/CN=*.ipincloud.com"
#cp /etc/crts/ipincloud.com.crt /etc/pki/ca-trust/source/anchors/
#update-ca-trust extract

systemctl stop docker-distribution
rm -rf /data/registry
mkdir -p /data/registry
cat << EOF > /etc/docker-distribution/registry/config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
    cache:
        layerinfo: inmemory
    filesystem:
        rootdirectory: /data/registry
    delete:
        enabled: true
http:
    addr: :8443
    tls:
       certificate: /etc/crts/ipincloud.com.crt
       key: /etc/crts/ipincloud.com.key
EOF
systemctl restart docker
systemctl enable docker-distribution
systemctl restart docker-distribution

build_number_list=$(cat << EOF
4.3.0
EOF
)

mkdir -p /data/ocp4
cd /data/ocp4

install_build() {
    BUILDNUMBER=$1
    echo ${BUILDNUMBER}
    mkdir -p /data/ocp4/${BUILDNUMBER}
    cd /data/ocp4/${BUILDNUMBER}
    # Download the OpenShift client and installer.
    # Only needed on the first run; the helper ansible play has already done this.
    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/release.txt
    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-client-linux-${BUILDNUMBER}.tar.gz
    #wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/${BUILDNUMBER}/openshift-install-linux-${BUILDNUMBER}.tar.gz
    # Unpack the installer and client onto the PATH (first run only)
    #tar -xzf openshift-client-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
    #tar -xzf openshift-install-linux-${BUILDNUMBER}.tar.gz -C /usr/local/bin/
    export OCP_RELEASE=${BUILDNUMBER}
    export LOCAL_REG='registry.ipincloud.com:8443'
    export LOCAL_REPO='ocp4/openshift4'
    export UPSTREAM_REPO='openshift-release-dev'
    export LOCAL_SECRET_JSON="/data/pull-secret.json"
    export OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE}
    export RELEASE_NAME="ocp-release"
    oc adm release mirror -a ${LOCAL_SECRET_JSON} \
    --from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \
    --to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \
    --to=${LOCAL_REG}/${LOCAL_REPO}
}

while read -r line; do
    install_build $line
done <<< "$build_number_list"

cd /data/ocp4
#wget -O ocp4-upi-helpernode-master.zip https://github.com/wangzheng422/ocp4-upi-helpernode/archive/master.zip
# Commented out: the registry behind quay.io/wangzheng422 is a v1 registry and cannot coexist with v2
#podman pull quay.io/wangzheng422/filetranspiler
#podman save quay.io/wangzheng422/filetranspiler | pigz -c > filetranspiler.tgz
#podman pull docker.io/library/registry:2
#podman save docker.io/library/registry:2 | pigz -c > registry.tgz

systemctl start docker
docker login -u wuliangye2019 -p Red@123! registry.redhat.io
docker login -u wuliangye2019 -p Red@123! registry.access.redhat.com
docker login -u wuliangye2019 -p Red@123! registry.connect.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.redhat.io
podman login -u wuliangye2019 -p Red@123! registry.access.redhat.com
podman login -u wuliangye2019 -p Red@123! registry.connect.redhat.com

# The following commands take 2-3 hours to run; be patient...
# Build the operator catalog
podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
    --appregistry-endpoint https://quay.io/cnr \
    --appregistry-org redhat-operators \
    --to=${LOCAL_REG}/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
    ${LOCAL_REG}/ocp4-operator/redhat-operators:v1 \
    ${LOCAL_REG}/operator

#cd /data
#tar cf - registry/ | pigz -c > registry.tgz
#cd /data
#tar cf - ocp4/ | pigz -c > ocp4.tgz</code>

Run the build.dist.sh script:

There is a huge pitfall here: mirroring the release images from quay.io means pulling more than 5 GB, which usually fails partway through. After each failure, re-running build.dist.sh wipes the existing registry and starts from scratch, wasting a great deal of time. In fact there is no need to delete anything: oc adm release mirror automatically skips images that already exist. A lesson learned the hard way.
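Given that pitfall, one workaround is to wrap just the mirror step in a retry loop instead of re-running the whole script. A sketch; the `retry` helper below is hypothetical and not part of build.dist.sh:

```shell
# Retry a command up to N times. Since already-mirrored layers are skipped
# on re-runs, each retry of "oc adm release mirror" makes forward progress.
retry() {
  max=$1; shift
  i=1
  until "$@"; do
    if [ "$i" -ge "$max" ]; then return 1; fi
    echo "attempt $i failed, retrying..." >&2
    i=$((i+1))
  done
}
# Hypothetical usage, with the variables from build.dist.sh:
# retry 5 oc adm release mirror -a ${LOCAL_SECRET_JSON} \
#   --from=quay.io/${UPSTREAM_REPO}/${RELEASE_NAME}:${OCP_RELEASE}-x86_64 \
#   --to-release-image=${LOCAL_REG}/${LOCAL_REPO}:${OCP_RELEASE} \
#   --to=${LOCAL_REG}/${LOCAL_REPO}
```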

<code>bash build.dist.sh</code>

After oc adm release mirror completes, a local mirror of the official image registry has been built. Record the information it prints, especially the imageContentSources block, which is configured into install-config.yaml later.

<code>Success
Update image:  registry.ipincloud.com:8443/ocp4/openshift4:4.3.0
Mirror prefix: registry.ipincloud.com:8443/ocp4/openshift4

To use the new mirrored repository to install, add the following section to the install-config.yaml:

imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

To use the new mirrored repository for upgrades, use the following to create an ImageContentSourcePolicy:

apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: example
spec:
  repositoryDigestMirrors:
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - registry.ipincloud.com:8443/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev</code>

The following commands do not need to be run; build.dist.sh has already executed them. They are listed for reference and troubleshooting:

<code>oc adm release mirror -a /data/pull-secret.json --from=quay.io/openshift-release-dev/ocp-release:4.3.0-x86_64 --to-release-image=registry.ipincloud.com:8443/ocp4/openshift4:4.3.0 --to=registry.ipincloud.com:8443/ocp4/openshift4

podman login registry.ipincloud.com:8443 -u root -p Scwang18
oc adm catalog build \
    --appregistry-endpoint https://quay.io/cnr \
    --appregistry-org redhat-operators \
    --to=registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1
oc adm catalog mirror \
    registry.ipincloud.com:8443/ocp4-operator/redhat-operators:v1 \
    registry.ipincloud.com:8443/operator

# If oc adm catalog mirror fails, it generates a mapping.txt file; remove the lines
# that failed and re-run the remainder like this:
oc image mirror -a /data/pull-secret.json -f /data/mapping-ok.txt
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/ocp4/openshift4/nfs-client-provisioner:latest
oc image mirror quay.io/external_storage/nfs-client-provisioner:latest registry.ipincloud.com:8443/quay.io/external_storage/nfs-client-provisioner:latest

# Look up an image's sha digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X GET https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/latest 2>&1 | grep Docker-Content-Digest | awk '{print ($3)}'
# Delete an image manifest by digest
curl -v --silent -H "Accept: application/vnd.docker.distribution.manifest.v2+json" -X DELETE https://registry.ipincloud.com:8443/v2/ocp4/openshift4/nfs-client-provisioner/manifests/sha256:022ea0b0d69834b652a4c53655d78642ae23f0324309097be874fb58d09d2919
# Reclaim registry disk space
podman exec -it mirror-registry /bin/registry garbage-collect /etc/docker/registry/config.yml</code>

7. Creating the installer configuration files

(1) Create the installer directory

<code>rm -rf /data/install
mkdir -p /data/install
cd /data/install</code>

(2) Customize the install-config.yaml file

  • Fill in the pullSecret
<code>[root@misc data]# cat /data/pull-secret.json
{"auths":{"cloud.openshift.com":{"auth":"omitted"}}}</code>
  • Add the sshKey (the contents of the public key created in section 3)
<code>cat ~/.ssh/id_rsa.pub</code>
  • additionalTrustBundle (the certificate generated when the mirror registry was created)
<code>[root@misc crts]# cat /etc/crts/ipincloud.com.crt
-----BEGIN CERTIFICATE-----
xxx omitted
-----END CERTIFICATE-----</code>
  • Add a proxy

In production, the cluster need not reach the internet directly; a proxy can be configured for the cluster in install-config.yaml.

For this test, to speed up external downloads, I set up a v2ray server on AWS beforehand, with the misc server acting as the v2ray client; that setup is described elsewhere.

  • When experimenting repeatedly, the directory holding install-config.yaml (e.g. install) must be removed with rm -rf install, not rm -rf install/*: the latter leaves behind the hidden file .openshift_install_state.json, which can later cause: x509: certificate has expired or is not yet valid.
  • In the docs and blog examples, the cidr in install-config.yaml is a 10.x network; having skimmed the docs, I mistook it for the node-machine subnet, which caused the most baffling error of the whole exercise: no matches for kind MachineConfig.
  • The final file contents:
<code>[root@centos75 install]# vi install-config.yaml
apiVersion: v1
baseDomain: ipincloud.com
proxy:
  httpProxy: http://192.168.128.30:8001
  httpsProxy: http://192.168.128.30:8001
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocptest
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: '{"auths":{"omitted'
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  omitted; note the two-space indent of the certificate body here
  -----END CERTIFICATE-----
imageContentSources:
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - registry.ipincloud.com:8443/ocp4/openshift4
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev</code>

(3) Back up the customized install-config.yaml so it can be reused later

<code>cd /data/install
cp install-config.yaml ../install-config.yaml.20200205</code>

8. Creating the Kubernetes manifests and Ignition config files

(1) Generate the Kubernetes manifests

<code>openshift-install create manifests --dir=/data/install</code>

Note: when specifying the directory containing install-config.yaml, use an absolute path.

(2) Modify manifests/cluster-scheduler-02-config.yml to prevent pods from being scheduled onto the control plane nodes

Per Red Hat's official installation docs, Kubernetes does not support routing ingress load balancer traffic to pods on the control-plane nodes.

<code>a. Open manifests/cluster-scheduler-02-config.yml
b. Locate the mastersSchedulable parameter and set it to false
c. Save and exit.
vi /data/install/manifests/cluster-scheduler-02-config.yml</code>
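The same edit can be made non-interactively with sed. A sketch, demonstrated on a mock copy of the file (the mock's contents only approximate the real manifest; the real file lives at /data/install/manifests/cluster-scheduler-02-config.yml):

```shell
# Mock stand-in for manifests/cluster-scheduler-02-config.yml
cat > /tmp/cluster-scheduler-02-config.yml <<'EOF'
apiVersion: config.openshift.io/v1
kind: Scheduler
spec:
  mastersSchedulable: true
EOF
# Flip the flag in place, then verify
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' /tmp/cluster-scheduler-02-config.yml
grep mastersSchedulable /tmp/cluster-scheduler-02-config.yml
```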

(3) Create the Ignition config files

Note: creating the Ignition config files deletes install-config.yaml, so be sure to back it up first.

<code>openshift-install create ignition-configs --dir=/data/install</code>

(4) Copy the Ignition config files into the HTTP server directory for use during installation

<code>cd /data/install
\cp -f bootstrap.ign /var/www/html/ignition/bootstrap.ign
\cp -f master.ign /var/www/html/ignition/master1.ign
\cp -f master.ign /var/www/html/ignition/master2.ign
\cp -f master.ign /var/www/html/ignition/master3.ign
\cp -f worker.ign /var/www/html/ignition/worker1.ign
\cp -f worker.ign /var/www/html/ignition/worker2.ign
cd /var/www/html/ignition/
chmod 755 *.ign</code>
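Before serving the .ign files, it is worth checking that each one is well-formed JSON, since a truncated copy only fails much later at node boot. A minimal sketch against a sample file (in real use, point the loop at /var/www/html/ignition/*.ign; the sample content below is a hypothetical stand-in, not a real ignition config):

```shell
# Stand-in for a real .ign file
echo '{"ignition":{"version":"2.2.0"}}' > /tmp/sample.ign
# json.tool exits non-zero on malformed JSON
for f in /tmp/sample.ign; do
  python3 -m json.tool "$f" > /dev/null && echo "$f OK"
done
```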

With the required configuration files in place, the next step is to create the nodes.

9. Customizing the RHCOS ISOs

The boot parameters must be customized at install time, and typing them in by hand on every machine is tedious and error-prone, so genisoimage is used here to build a customized install ISO for each node.

<code># Install the image-building tools
yum -y install genisoimage libguestfs-tools
systemctl start libvirtd

# Set environment variables
export NGINX_DIRECTORY=/data/pkg
export RHCOSVERSION=4.3.0
export VOLID=$(isoinfo -d -i ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso | awk '/Volume id/ { print $3 }')

# Create a temporary directory for intermediate files
TEMPDIR=$(mktemp -d)
echo $VOLID
echo $TEMPDIR
cd ${TEMPDIR}

# Extract the ISO content using guestfish (avoids needing sudo mount)
guestfish -a ${NGINX_DIRECTORY}/rhcos-${RHCOSVERSION}-x86_64-installer.iso \
  -m /dev/sda tar-out / - | tar xvf -

# Function that patches the boot config files for one node
modify_cfg(){
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    # Inject the proper image and ignition URLs
    sed -e '/coreos.inst=yes/s|$| coreos.inst.install_dev=sda coreos.inst.image_url='"${URL}"'\/install\/'"${BIOSMODE}"'.raw.gz coreos.inst.ignition_url='"${URL}"'\/ignition\/'"${NODE}"'.ign ip='"${IP}"'::'"${GATEWAY}"':'"${NETMASK}"':'"${FQDN}"':'"${NET_INTERFACE}"':none:'"${DNS}"' nameserver='"${DNS}"'|' ${file} > $(pwd)/${NODE}_${file##*/}
    # Shorten the boot menu wait time
    sed -i -e 's/default vesamenu.c32/default linux/g' -e 's/timeout 600/timeout 10/g' $(pwd)/${NODE}_${file##*/}
  done
}

# Common ISO boot parameters: HTTP URL, gateway, netmask, DNS
URL="http://192.168.128.30:8080"
GATEWAY="192.168.128.254"
NETMASK="255.255.255.0"
DNS="192.168.128.30"

# bootstrap node
NODE="bootstrap"
IP="192.168.128.31"
FQDN="bootstrap"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# master1 node
NODE="master1"
IP="192.168.128.32"
FQDN="master1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# master2 node
NODE="master2"
IP="192.168.128.33"
FQDN="master2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# master3 node
NODE="master3"
IP="192.168.128.34"
FQDN="master3"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# worker1 node
NODE="worker1"
IP="192.168.128.35"
FQDN="worker1"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# worker2 node
NODE="worker2"
IP="192.168.128.36"
FQDN="worker2"
BIOSMODE="bios"
NET_INTERFACE="ens192"
modify_cfg

# Build a separate install ISO for each node
# https://github.com/coreos/coreos-assembler/blob/master/src/cmd-buildextend-installer#L97-L103
for node in bootstrap master1 master2 master3 worker1 worker2; do
  # Put the node-specific grub.cfg and isolinux.cfg in place
  for file in "EFI/redhat/grub.cfg" "isolinux/isolinux.cfg"; do
    /bin/cp -f $(pwd)/${node}_${file##*/} ${file}
  done
  # Build the ISO
  genisoimage -verbose -rock -J -joliet-long -volset ${VOLID} \
    -eltorito-boot isolinux/isolinux.bin -eltorito-catalog isolinux/boot.cat \
    -no-emul-boot -boot-load-size 4 -boot-info-table \
    -eltorito-alt-boot -efi-boot images/efiboot.img -no-emul-boot \
    -o ${NGINX_DIRECTORY}/${node}.iso .
done

# Clean up the intermediate files
cd
rm -Rf ${TEMPDIR}
cd ${NGINX_DIRECTORY}</code>

10. Installing RHCOS on the node machines

(1) Copy the customized ISO files to the VMware ESXi host, ready to build the nodes

<code>[root@misc pkg]# scp bootstrap.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp m*.iso [email protected]:/vmfs/volumes/hdd/iso
[root@misc pkg]# scp w*.iso [email protected]:/vmfs/volumes/hdd/iso</code>

(2) Create the master VMs according to the plan and set them to boot from their ISOs

  • Once the boot menu appears, simply start the install; the system automatically downloads the BIOS raw image and the ignition config and completes the installation
  • After installation, eject the ISO so the machine does not boot back into the installer
  • Install in order: bootstrap, master1, master2, master3; only after the masters are installed and running should the workers be installed
  • Progress can be watched through the proxy status page at http://registry.ipincloud.com:9000/
  • Detailed bootstrap progress can be followed from the misc node:

<code>openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug</code>

Notes:

  • Make sure each ISO is matched with the correct ignition file
  • During my install, master1 reported etcdmain: member ab84b6a6e4a3cc9a has already been bootstrapped, which took a long time to diagnose and fix. When master1 first finished installing, its etcd component was installed and registered as a cluster member; after I reinstalled master1 from the ISO, etcd's automatic registration detected that the cluster already contained this member and refused to register it again, so etcd on that node could never start. The fix:

Manually edit the etcd yaml file on the master1 node, appending the --initial-cluster-state=existing flag to the end of the exec etcd command, then delete the broken pod; the system automatically re-creates the etcd pod and it recovers. Once it is running normally, revert the change, otherwise machine-config will never complete.

<code>#[root@master1 /]# vi /etc/kubernetes/manifests/etcd-member.yaml
      exec etcd \
        --initial-advertise-peer-urls=https://${ETCD_IPV4_ADDRESS}:2380 \
        --cert-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.crt \
        --key-file=/etc/ssl/etcd/system:etcd-server:${ETCD_DNS_NAME}.key \
        --trusted-ca-file=/etc/ssl/etcd/ca.crt \
        --client-cert-auth=true \
        --peer-cert-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.crt \
        --peer-key-file=/etc/ssl/etcd/system:etcd-peer:${ETCD_DNS_NAME}.key \
        --peer-trusted-ca-file=/etc/ssl/etcd/ca.crt \
        --peer-client-cert-auth=true \
        --advertise-client-urls=https://${ETCD_IPV4_ADDRESS}:2379 \
        --listen-client-urls=https://0.0.0.0:2379 \
        --listen-peer-urls=https://0.0.0.0:2380 \
        --listen-metrics-urls=https://0.0.0.0:9978 \
        --initial-cluster-state=existing

[root@master1 /]# crictl pods
POD ID              CREATED             STATE               NAME                                        NAMESPACE           ATTEMPT
c4686dc3e5f4f       38 minutes ago      Ready               etcd-member-master1.ocptest.ipincloud.com   openshift-etcd      5
[root@master1 /]# crictl rmp xxx</code>
  • Check whether installation is complete:
    if INFO It is now safe to remove the bootstrap resources appears, the master nodes are installed and the control plane has moved to the master cluster.
<code>[root@misc install]# openshift-install --dir=/data/install wait-for bootstrap-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the Kubernetes API at https://api.ocptest.ipincloud.com:6443...
INFO API v1.16.2 up
INFO Waiting up to 30m0s for bootstrapping to complete...
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
[root@misc install]#</code>

(3) Install the workers

  • Once the boot menu appears, simply start the install; the system automatically downloads the BIOS raw image and the ignition config and completes the installation
  • After installation, eject the ISO so the machine does not boot back into the installer
  • Install order remains bootstrap, master1, master2, master3, then the workers once the masters are up
  • Progress can be watched through the proxy status page at http://registry.ipincloud.com:9000/
  • Detailed installation progress can also be checked from the misc node:
<code>[root@misc redhat-operators-manifests]# openshift-install --dir=/data/install wait-for install-complete --log-level debug
DEBUG OpenShift Installer v4.3.0
DEBUG Built from commit 2055609f95b19322ee6cfdd0bea73399297c4a3e
INFO Waiting up to 30m0s for the cluster at https://api.ocptest.ipincloud.com:6443 to initialize...
DEBUG Cluster is initialized
INFO Waiting up to 10m0s for the openshift-console route to be created...
DEBUG Route found in openshift-console namespace: console
DEBUG Route found in openshift-console namespace: downloads
DEBUG OpenShift console route is created
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/data/install/auth/kubeconfig'
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.ocptest.ipincloud.com
INFO Login to the console with user: kubeadmin, password: pubmD-8Baaq-IX36r-WIWWf</code>
  • The worker nodes' join requests (CSRs) must be approved

View the pending CSRs:

<code>[root@misc ~]# oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-7lln5   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-d48xk   69m   system:node:master1.ocptest.ipincloud.com                                   Approved,Issued
csr-f2g7r   69m   system:node:master2.ocptest.ipincloud.com                                   Approved,Issued
csr-gbn2n   69m   system:node:master3.ocptest.ipincloud.com                                   Approved,Issued
csr-hwxwx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-ppgxx   13m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-wg874   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-zkp79   70m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
[root@misc ~]#</code>

Approve them:

<code>oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve</code>
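The jq filter in that one-liner selects CSRs whose status object is empty, i.e. not yet approved. Its effect can be seen on a tiny mock CSR list (python3 is used here as a self-contained stand-in for the jq filter; the CSR names are hypothetical):

```shell
# Mock CSR list: one pending (empty status), one already approved
cat > /tmp/csr.json <<'EOF'
{"items":[{"metadata":{"name":"csr-pending"},"status":{}},{"metadata":{"name":"csr-ok"},"status":{"certificate":"x"}}]}
EOF
# Same selection the jq expression performs: names of CSRs with an empty status
python3 -c 'import json; d=json.load(open("/tmp/csr.json")); print("\n".join(i["metadata"]["name"] for i in d["items"] if i["status"]=={}))'
```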

(4) Start NFS on the misc node

<code>bash /data/pkg/ocp4-upi-helpernode/files/nfs-provisioner-setup.sh
# Check the status
oc get pods -n nfs-provisioner</code>

(5) Use NFS as the backing storage for the internal OCP registry

<code>oc patch configs.imageregistry.operator.openshift.io cluster -p '{"spec":{"storage":{"pvc":{"claim":""}}}}' --type=merge
oc get clusteroperator image-registry</code>

11. Configuring login

(1) Create a regular administrator account

<code># Create the admin credentials on the misc machine
mkdir -p ~/auth
htpasswd -bBc ~/auth/admin-passwd admin scwang18
# Copy the file to the local machine
mkdir -p ~/auth
scp -P 20030 [email protected]:/root/auth/admin-passwd ~/auth/
# On the OAuth Details page, add an HTPasswd-type Identity Provider and upload the admin-passwd file.
# https://console-openshift-console.apps.ocptest.ipincloud.com
# Grant the new admin user cluster-admin rights
oc adm policy add-cluster-role-to-user cluster-admin admin</code>

