OKD Install v45
Revision as of 13:08, 31 August 2022 by Chris (Chris moved page Protected OKD Install v45 to OKD Install v45)
1 General
These are my notes about setting up an OpenShift v4.5 LAB environment.
It is based on products that are as free as possible.
Many thanks to Red Hat for the great product and the hard work.
I have worked in the enterprise Linux business since 2002 (RHCE) and have often been excited about the improvements they invent ... as now! :-)
2 Overview
Hypervisor: CentOS 8 on Hetzner EX42-NVMe (CPU: i7 6700 4Core, RAM: 64GB, DISK: 2x 512GB NVMe)
  - libvirtd, firewalld, VM console access
  - 1st WAN IP on bridge WAN (PHY NIC is bridged)

Isolated virtual network PROD: 192.168.100.0/24, DOMAIN: lab.bitbull.ch, OpenShift cluster name: lab
  - VM gate (OS: Alpine): on bridge WAN (2nd WAN IP) and LAN 192.168.100.1
      services: DHCPd, DNS (internal *.lab.bitbull.ch), TFTPd, Lighttpd, HAProxy, GeoIP, LetsEncrypt
      public DNS *.lab.bitbull.ch resolves to the 2nd WAN IP
  - VM bootstrap:    IP 192.168.100.111,     MAC 52:52:10:00:01:01
  - VM master0{1..3}: IP 192.168.100.12{1..3}, MAC 52:52:10:00:02:0{1..3}
  - VM worker0{1..4}: IP 192.168.100.13{1..4}, MAC 52:52:10:00:03:0{1..4}
  - VM nfs:          IP 192.168.100.254,     MAC 52:52:10:00:04:01
3 Links
- https://www.hetzner.com/dedicated-rootserver/matrix-ex
- https://docs.openshift.com/container-platform/4.5/welcome/index.html
- https://cloud.redhat.com/openshift/
- https://github.com/joe-speedboat/shell.scripts
- http://www.voleg.info/redhat-openshift-installation.html
- https://origin-release.apps.ci.l2s4.p1.openshiftapps.com/
4 Known Issues
5 Global Vars
GV=think_first
VERS=4
DOMAIN=lab.bitbull.ch
VDIR=/srv
NFS_DIR=/srv/nfs
NET_NAME=PROD
MEMORY=8192
CPU=2
DISK=50G
DISK_2=100G
PASSWORD='redhat...'
VIRTHOST_WAN_BRIDGE_NAME=WAN
GATEWAY_WAN_IP=95.216.97.199
GATEWAY_WAN_MASK=255.255.255.192
GATEWAY_WAN_GW=95.216.97.233   # must be the WAN gateway IP (original had a netmask-style typo here)
GATEWAY_LAN_IP=192.168.100.1
GATEWAY_LAN_MASK=255.255.255.0
ALPINE_IMG_URL="https://github.com/joe-speedboat/cloud-images/raw/master/alpine-edge-virthardened-2020-04-10.qcow2.gz"
CLUSTER=$(echo $DOMAIN | cut -d. -f1)
COS_URL='https://cloud.centos.org/centos/8/x86_64/images'
COS_IMG="$(wget -q "$COS_URL" -O - | grep GenericCloud | sed 's/.*href="//;s/qcow2.*/qcow2/')"
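A quick sanity check before continuing: the derived variables can be verified in isolation, without touching the hypervisor (values as defined above):

```shell
# the cluster name is derived from the first DNS label of the domain
DOMAIN=lab.bitbull.ch
CLUSTER=$(echo $DOMAIN | cut -d. -f1)
echo "$CLUSTER"   # -> lab
```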
6 Hypervisor
- Install CentOS8
- Specs:
- Memory: 64 GB
- Disk: 1TB NVMe with LVM
- CPU: 8 Cores
6.1 basic setup
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
yum groupinstall "Virtualization Host" -y
yum -y install epel-release
yum install git rsync firewalld fail2ban tmux virt-install libguestfs-tools bash-completion wget -y
systemctl enable --now libvirtd firewalld fail2ban
6.2 fail2ban
echo '# bitbull wiki setup
[DEFAULT]
bantime = 86400
findtime = 3600
maxretry = 3
#ignoreip = 127.0.0.1/8 ::1 103.1.2.3
banaction = iptables-multiport
[sshd]
enabled = true
' > /etc/fail2ban/jail.local
systemctl restart fail2ban
fail2ban-client status sshd
6.3 storage pools
mkdir $VDIR/{isos,images,backup,bin}
for d in $VDIR/{isos,images}
do
  virsh pool-define-as $(basename $d) dir - - - - "$d"
  virsh pool-build $(basename $d)
  virsh pool-start $(basename $d)
  virsh pool-autostart $(basename $d)
done
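The loop above names each storage pool after the last component of its directory path; that naming can be checked without libvirt:

```shell
# pool names are derived from the directory basename
for d in /srv/isos /srv/images; do
  basename "$d"
done
# -> isos
# -> images
```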
6.4 network setup
echo "<network connections='11'>
  <name>$NET_NAME</name>
  <uuid>$(uuidgen)</uuid>
  <bridge name='virbr1' stp='on' delay='0'/>
  <domain name='$DOMAIN'/>
</network>" > net-$NET_NAME.xml
virsh net-define net-$NET_NAME.xml
virsh net-start $NET_NAME
virsh net-autostart $NET_NAME
6.5 virt tools
# virsh list --all
curl https://raw.githubusercontent.com/joe-speedboat/shell.scripts/master/virsh-list-all.sh > $VDIR/bin/vla
# virsh vm handler
curl https://raw.githubusercontent.com/joe-speedboat/shell.scripts/master/virsh-qcow-backup.sh > $VDIR/bin/vmh
# virsh config exporter backup
curl https://raw.githubusercontent.com/joe-speedboat/shell.scripts/master/virsh-config-backup.sh > $VDIR/bin/virsh-config-backup.sh
chmod 700 $VDIR/bin/*
7 Gateway VM
This is the only VM that is not based on a Red Hat product, for several reasons:
- small footprint
- xtables GeoIP support is missing in RHEL/CentOS
7.1 install vm
- create vm on hypervisor
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
MEMORY=512
VM=gate2
wget "$ALPINE_IMG_URL" -O $VDIR/images/${VM}.qcow2.gz
gunzip $VDIR/images/${VM}.qcow2.gz
chown qemu.qemu $VDIR/images/*
virt-install -n $VM --description "$VM Machine for Openshift $CLUSTER Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEMORY --vcpus=$CPU --disk path=$VDIR/images/$VM.qcow2,bus=virtio --graphics vnc --import --network bridge=$VIRTHOST_WAN_BRIDGE_NAME --network network=$NET_NAME --boot useserial=on --rng /dev/random --noreboot
sleep 1
virsh start $VM --console
# PW is: BBfree123 (use console)
7.2 configure networking
- run this inside gateway vm
ip a
echo "# BB Gateway
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
	address $GATEWAY_WAN_IP
	netmask $GATEWAY_WAN_MASK
	gateway $GATEWAY_WAN_GW

auto eth1
iface eth1 inet static
	address $GATEWAY_LAN_IP
	netmask $GATEWAY_LAN_MASK
" > /etc/network/interfaces
ln -s networking /etc/init.d/net.eth1
rc-update add net.eth1 default
service networking restart
ip a
reboot
7.3 setup geoip xtables
- install iptables with geoip
apk update
apk add iptables iptables-openrc perl-net-cidr-lite xtables-addons perl perl-doc perl-text-csv_xs unzip
rc-update add iptables default
sed -i 's/IPFORWARD=.*/IPFORWARD="yes"/' /etc/conf.d/iptables
cat /etc/conf.d/iptables
curl https://raw.githubusercontent.com/joe-speedboat/shell.scripts/master/alpine-xtables-geoip-update.sh > /usr/local/sbin/alpine-xtables-geoip-update.sh
chmod 700 /usr/local/sbin/alpine-xtables-geoip-update.sh
/usr/local/sbin/alpine-xtables-geoip-update.sh
crontab -e -u root
------
1 1 1 * * /usr/local/sbin/alpine-xtables-geoip-update.sh >/dev/null
------
- vi /etc/iptables/rules-save
adjust these IPs to your needs
############################################################
# HOST: GATE : HETZNER LAB FIREWALL / GATEWAY
# WAN : eth0 : 95.216.97.199/26 (virsh bridge: WAN )
# LAN : eth1 : 192.168.100.1/24 (virsh bridge: PROD )
# GATEWAY: 95.216.97.239 (virthost ip, NATed)
# net.ipv4.conf.all.forwarding = 1
# RULES:
#   WAN:22/tcp  > 192.168.100.5:22/tcp > forward
#   WAN:222/tcp > accept
#   WAN:443/tcp > accept
#
############################################################
*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
COMMIT
############################################################
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT DROP [0:0]
:bad_packets - [0:0]
:geoip_block - [0:0]
############################################################
-A INPUT -i lo -j ACCEPT
-A INPUT -j bad_packets
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -j geoip_block
# accept tftp and dhcpd from internal interface
-I INPUT -i eth1 -p udp --dport 67:69 -j ACCEPT
# define WAN access
-A INPUT -i eth0 -p tcp -m tcp --dport 222 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 6443 -j ACCEPT
# define LAN access
-A INPUT -i eth1 -d 192.168.100.255/32 -j ACCEPT
-A INPUT -i eth1 -s 192.168.100.0/24 -j ACCEPT
-A INPUT -p icmp -f -j LOG --log-prefix "ICMP Fragment: "
-A INPUT -p icmp -f -j DROP
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 11 -j ACCEPT
-A INPUT -p icmp -j DROP
-A INPUT -m limit --limit 3/min --limit-burst 3 -j LOG --log-prefix "INPUT packet died: "
############################################################
-A FORWARD -j bad_packets
-A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -j geoip_block
# allow port forwarding control-ssh
-A FORWARD -i eth0 -o eth1 -d 192.168.100.5/32 -p tcp -m tcp --dport 22 -j ACCEPT
-A FORWARD -m limit --limit 3/min --limit-burst 3 -j LOG --log-prefix "FORWARD packet died: "
############################################################
-A OUTPUT -j bad_packets
-A OUTPUT -o lo -j ACCEPT
-A OUTPUT -o eth1 -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT
-A OUTPUT -m limit --limit 3/min --limit-burst 3 -j LOG --log-prefix "OUTPUT packet died: "
############################################################
-A geoip_block -s 192.168.0.0/16 -j ACCEPT
-A geoip_block -s 172.16.0.0/12 -j ACCEPT
-A geoip_block -s 10.0.0.0/8 -j ACCEPT
# only allow these countries to access WAN resources
-A geoip_block -m geoip ! --source-country CH -j DROP
############################################################
-A bad_packets -d 224.0.0.1/32 -j DROP
-A bad_packets -m pkttype --pkt-type broadcast -j DROP
-A bad_packets -i eth0 -s 192.168.0.0/16 -j LOG --log-prefix "Illegal source: "
-A bad_packets -i eth0 -s 192.168.0.0/16 -j DROP
-A bad_packets -i eth0 -s 172.16.0.0/12 -j LOG --log-prefix "Illegal source: "
-A bad_packets -i eth0 -s 172.16.0.0/12 -j DROP
-A bad_packets -i eth0 -s 10.0.0.0/8 -j LOG --log-prefix "Illegal source: "
-A bad_packets -i eth0 -s 10.0.0.0/8 -j DROP
-A bad_packets -m state --state INVALID -j LOG --log-prefix "Invalid packet: "
-A bad_packets -m state --state INVALID -j DROP
-A bad_packets -p tcp -m tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m state --state NEW -j LOG --log-prefix "New not syn: "
-A bad_packets -p tcp -m tcp ! --tcp-flags FIN,SYN,RST,ACK SYN -m state --state NEW -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG NONE -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG NONE -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,SYN,RST,PSH,ACK,URG -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,SYN,RST,PSH,ACK,URG -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,PSH,URG -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,PSH,URG -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,SYN,RST,ACK,URG -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN,RST,PSH,ACK,URG FIN,SYN,RST,ACK,URG -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags SYN,RST SYN,RST -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags SYN,RST SYN,RST -j DROP
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN FIN,SYN -j LOG --log-prefix "Stealth scan: "
-A bad_packets -p tcp -m tcp --tcp-flags FIN,SYN FIN,SYN -j DROP
-A bad_packets -p tcp -j RETURN
############################################################
COMMIT
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
############################################################
-A PREROUTING -m limit --limit 3/min --limit-burst 3 -j LOG --log-prefix "PREROUTING: "
# forward traffic to control-ssh
-A PREROUTING -d 95.216.97.199/32 -i eth0 -p tcp -m tcp --dport 22 -j DNAT --to-destination 192.168.100.5:22
############################################################
-A POSTROUTING -m limit --limit 3/min --limit-burst 3 -j LOG --log-prefix "POSTROUTING: "
# nat outgoing internet access from LAN
-A POSTROUTING -o eth0 -j SNAT --to-source 95.216.97.199
COMMIT
service iptables restart
iptables-save | grep country
7.4 setup haproxy
apk update
apk add haproxy
rc-update add haproxy default
- vi /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local2 info
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    daemon
    tune.ssl.default-dh-param 2048
    log /dev/log local0 info
    log /dev/log local1 info

defaults
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend kubernetes_api
    bind 0.0.0.0:6443
    default_backend kubernetes_api

backend kubernetes_api
    balance roundrobin
    option ssl-hello-chk
    server bootstrap bootstrap.lab.bitbull.ch:6443 check
    server master01 master01.lab.bitbull.ch:6443 check
    server master02 master02.lab.bitbull.ch:6443 check
    server master03 master03.lab.bitbull.ch:6443 check

frontend machine_config
    bind 0.0.0.0:22623
    default_backend machine_config

backend machine_config
    balance roundrobin
    option ssl-hello-chk
    server bootstrap bootstrap.lab.bitbull.ch:22623 check
    server master01 master01.lab.bitbull.ch:22623 check
    server master02 master02.lab.bitbull.ch:22623 check
    server master03 master03.lab.bitbull.ch:22623 check

frontend router_https_wan
    bind 95.216.97.199:443 ssl crt /etc/certs/bitbull.pem no-sslv3
    default_backend router_https_wan

backend router_https_wan
    mode http
    balance roundrobin
    http-check expect ! rstatus ^5
    server master01 master01.lab.bitbull.ch:443 check ssl verify none sni req.hdr(host)
    server master02 master02.lab.bitbull.ch:443 check ssl verify none sni req.hdr(host)
    server master03 master03.lab.bitbull.ch:443 check ssl verify none sni req.hdr(host)
    server worker01 worker01.lab.bitbull.ch:443 check ssl verify none sni req.hdr(host)
    server worker02 worker02.lab.bitbull.ch:443 check ssl verify none sni req.hdr(host)
    option forwardfor
    http-request set-header X-Forwarded-Port %[dst_port]
    http-request add-header X-Forwarded-Proto https if { ssl_fc }
    http-request set-header X-Client-IP %[src]

frontend router_https_lan
    bind 192.168.100.1:443
    default_backend router_https_lan

backend router_https_lan
    balance roundrobin
    option ssl-hello-chk
    server master01 master01.lab.bitbull.ch:443 check
    server master02 master02.lab.bitbull.ch:443 check
    server master03 master03.lab.bitbull.ch:443 check
    server worker01 worker01.lab.bitbull.ch:443 check
    server worker02 worker02.lab.bitbull.ch:443 check

frontend router_http_lan
    mode http
    bind 192.168.100.1:80
    default_backend router_http_lan

backend router_http_lan
    mode http
    balance roundrobin
    server master01 master01.lab.bitbull.ch:80 check
    server master02 master02.lab.bitbull.ch:80 check
    server master03 master03.lab.bitbull.ch:80 check
    server worker01 worker01.lab.bitbull.ch:80 check
    server worker02 worker02.lab.bitbull.ch:80 check
7.5 setup dnsmasq
apk update
apk add dnsmasq syslinux
rc-update add dnsmasq default
- vi /etc/dnsmasq.conf
filterwin2k
clear-on-reload
# log-queries
log-dhcp
dhcp-authoritative
bogus-priv
domain-needed
#domain=lab.bitbull.ch
local=/lab.bitbull.ch/
expand-hosts
listen-address=127.0.0.1
listen-address=192.168.100.1
interface=eth1
except-interface=eth0
dhcp-option=eth1,3,192.168.100.1
dhcp-option=eth1,6,192.168.100.1
dhcp-range=interface:eth1,192.168.100.100,192.168.100.199,255.255.255.0
server=8.8.4.4
server=8.8.8.8
dhcp-host=52:52:10:00:00:01,192.168.100.211,bootstrap
dhcp-host=52:52:10:00:01:01,192.168.100.221,master01
dhcp-host=52:52:10:00:01:02,192.168.100.222,master02
dhcp-host=52:52:10:00:01:03,192.168.100.223,master03
dhcp-host=52:52:10:00:02:01,192.168.100.231,worker01
dhcp-host=52:52:10:00:02:02,192.168.100.232,worker02
dhcp-host=52:52:10:00:02:03,192.168.100.233,worker03
dhcp-host=52:52:10:00:02:04,192.168.100.234,worker04
dhcp-host=52:52:10:00:03:01,192.168.100.254,nfs
cname=etcd-0.lab.bitbull.ch,master01.lab.bitbull.ch
cname=etcd-1.lab.bitbull.ch,master02.lab.bitbull.ch
cname=etcd-2.lab.bitbull.ch,master03.lab.bitbull.ch
cname=api.lab.bitbull.ch,gate.lab.bitbull.ch
cname=api-int.lab.bitbull.ch,gate.lab.bitbull.ch
address=/.apps.lab.bitbull.ch/192.168.100.1
address=/.lab.bitbull.ch/192.168.100.1
srv-host=_etcd-server-ssl._tcp.lab.bitbull.ch,etcd-0.lab.bitbull.ch,2380,0,10
srv-host=_etcd-server-ssl._tcp.lab.bitbull.ch,etcd-1.lab.bitbull.ch,2380,0,10
srv-host=_etcd-server-ssl._tcp.lab.bitbull.ch,etcd-2.lab.bitbull.ch,2380,0,10
enable-tftp
tftp-root=/var/lib/tftpboot
dhcp-boot=pxelinux.0
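The dhcp-host entries above and the /etc/hosts entries below must stay in sync (same MAC/IP/name triplets). A small sketch of the mapping, using one sample line in the same `MAC,IP,name` format as the config:

```shell
# derive a hosts(5) line from a dnsmasq dhcp-host entry (fields: MAC,IP,name)
printf 'dhcp-host=52:52:10:00:01:01,192.168.100.221,master01\n' |
  sed 's/^dhcp-host=//' |
  awk -F, '{print $2" "$3".lab.bitbull.ch "$3}'
# -> 192.168.100.221 master01.lab.bitbull.ch master01
```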
- vi /etc/hosts
127.0.0.1       gate.my.domain gate localhost.localdomain localhost
::1             localhost localhost.localdomain
192.168.100.1   gate.lab.bitbull.ch gate
192.168.100.5   control.lab.bitbull.ch control
95.216.97.239   saturn.bitbull.ch saturn
192.168.100.211 bootstrap.lab.bitbull.ch
192.168.100.221 master01.lab.bitbull.ch
192.168.100.222 master02.lab.bitbull.ch
192.168.100.223 master03.lab.bitbull.ch
192.168.100.231 worker01.lab.bitbull.ch
192.168.100.232 worker02.lab.bitbull.ch
192.168.100.233 worker03.lab.bitbull.ch
192.168.100.234 worker04.lab.bitbull.ch
192.168.100.254 nfs.lab.bitbull.ch
cp -av /usr/share/syslinux/* /var/lib/tftpboot/
mkdir -p /var/lib/tftpboot/pxelinux.cfg
- vi /var/lib/tftpboot/pxelinux.cfg/default
DEFAULT menu.c32
TIMEOUT 300
ONTIMEOUT local
MENU TITLE **** Hetzner LAB Boot Menu ****
MENU COLOR screen 0
MENU COLOR border 32
MENU COLOR title 32
MENU COLOR hotsel 32
MENU COLOR sel 30;42
MENU COLOR unsel 32
MENU COLOR hotkey 0
MENU COLOR tabmsg 30
MENU COLOR timeout_msg 32
MENU COLOR timeout 32
MENU COLOR disabled 0
MENU COLOR cmdmark 0
MENU COLOR cmdline 0

LABEL local
  MENU LABEL Boot local hard drive
  LOCALBOOT 0x80

MENU BEGIN
MENU LABEL Install OKD 4
MENU TITLE Install OKD 4

LABEL Install Fedora CoreOS Bootstrap Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/bootstrap.ign
  IPAPPEND 2

LABEL Install Fedora CoreOS Master Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/master.ign
  IPAPPEND 2

LABEL Install Fedora CoreOS Worker Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/worker.ign
  IPAPPEND 2

MENU END
- vi /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-00-01
default menu.c32
prompt 0
timeout 3
menu title **** OKD 4 PXE Boot Menu ****

LABEL Install Fedora CoreOS Bootstrap Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/bootstrap.ign
  IPAPPEND 2
- vi /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-01-01
default menu.c32
prompt 0
timeout 3
menu title **** OKD 4 PXE Boot Menu ****

LABEL Install Fedora CoreOS Master Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/master.ign
  IPAPPEND 2
cp -alv /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-01-01 /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-01-02
cp -alv /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-01-01 /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-01-03
- vi /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-01
default menu.c32
prompt 0
timeout 3
menu title **** OKD 4 PXE Boot Menu ****

LABEL Install Fedora CoreOS Worker Node
  KERNEL http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
  APPEND ip=dhcp rd.neednet=1 initrd=http://192.168.100.1:8888/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/vda coreos.inst.stream=stable coreos.inst.ignition_url=http://192.168.100.1:8888/worker.ign
  IPAPPEND 2
cp -alv /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-01 /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-02
cp -alv /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-01 /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-03
cp -alv /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-01 /var/lib/tftpboot/pxelinux.cfg/01-52-52-10-00-02-04
service dnsmasq restart   # Alpine uses OpenRC, not systemd
7.6 setup lighttpd
apk update
apk add lighttpd
rc-update add lighttpd default
- vi /etc/lighttpd/lighttpd.conf
var.basedir  = "/var/www/localhost"
var.logdir   = "/var/log/lighttpd"
var.statedir = "/var/lib/lighttpd"
server.modules = ( "mod_access", "mod_accesslog" )
include "mime-types.conf"
server.username       = "lighttpd"
server.groupname      = "lighttpd"
server.document-root  = var.basedir + "/htdocs"
server.pid-file       = "/run/lighttpd.pid"
server.errorlog       = var.logdir + "/error.log"
server.indexfiles     = ("index.php", "index.html", "index.htm", "default.htm")
server.follow-symlink = "enable"
server.port = 8888
static-file.exclude-extensions = (".php", ".pl", ".cgi", ".fcgi")
accesslog.filename    = var.logdir + "/access.log"
url.access-deny = ("~", ".inc")
cd /var/www/localhost/htdocs
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/32.20200715.3.0/x86_64/fedora-coreos-32.20200715.3.0-live-kernel-x86_64
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/32.20200715.3.0/x86_64/fedora-coreos-32.20200715.3.0-live-initramfs.x86_64.img
service lighttpd restart   # Alpine uses OpenRC, not systemd
7.7 NFS Storage VM
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
MEMORY=4096
VM=nfs
MAC="52:52:10:00:04:01"
wget $COS_URL/$COS_IMG -O $VDIR/images/$VM.qcow2
chown qemu.qemu $VDIR/images/*
qemu-img create -f qcow2 $VDIR/images/${VM}_2.qcow2 $DISK_2
virt-customize -m $MEMORY -a $VDIR/images/$VM.qcow2 \
  --root-password password:"$PASSWORD" \
  --install git,nfs-utils,tmux,vim,wget,rsync,epel-release,lvm2,container-selinux,firewalld \
  --run-command \
  "echo $VM > /etc/hostname ;\
  sed -i 's/^PasswordAuthentication .*/PasswordAuthentication yes/' /etc/ssh/sshd_config ;\
  yum -y remove cloud-init* cockpit* ;\
  rm -rf /etc/cloud" \
  --selinux-relabel
virt-install -n $VM --description "$VM Machine for Openshift $CLUSTER Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEMORY --vcpus=$CPU --disk path=$VDIR/images/$VM.qcow2,bus=virtio --disk path=$VDIR/images/${VM}_2.qcow2,bus=virtio --graphics vnc --import --network network=$NET_NAME,mac=$MAC --boot useserial=on --rng /dev/random --noreboot
sleep 1
virsh start $VM --console
ssh -lroot nfs # or console login
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
pvcreate /dev/vdb
vgcreate nfs /dev/vdb
lvcreate -n share1 -l90%FREE nfs
mkfs.xfs -L share1 /dev/nfs/share1
echo "LABEL=share1 $NFS_DIR xfs defaults 0 0" >> /etc/fstab
mkdir -p $NFS_DIR
mount -a
systemctl enable --now nfs-server rpcbind
firewall-cmd --add-service=nfs --permanent
firewall-cmd --add-service={nfs3,mountd,rpc-bind} --permanent
firewall-cmd --reload
setsebool -P nfs_export_all_rw 1
setsebool -P virt_use_nfs 1
# https://docs.okd.io/latest/storage/persistent_storage/persistent-storage-nfs.html
echo '#!/bin/bash
echo Y > /sys/module/nfsd/parameters/nfs4_disable_idmapping
touch /var/lock/subsys/local
' > /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local
chown -R nobody.nobody /srv/nfs
chmod -R 777 /srv/nfs
exportfs -rav
showmount -e localhost
yum -y upgrade
reboot
8 Setup Openshift
8.1 Prepare Openshift Installation
This can be done on the NFS node; if you have a jump node, it can be done there as well.
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
cd
test -d bin || mkdir bin ; cd bin
rm -f oc kubectl openshift-install
wget https://github.com/openshift/okd/releases/download/4.5.0-0.okd-2020-07-29-070316/openshift-install-linux-4.5.0-0.okd-2020-07-29-070316.tar.gz
wget https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/linux/oc.tar.gz
tar xvfz openshift-install-linux-4.5.0-0.okd-2020-07-29-070316.tar.gz
tar xvfz oc.tar.gz
rm -f openshift-*.tar.gz
test -f $HOME/.ssh/id_rsa || ssh-keygen
cd
openshift-install version
- vi okd-$VERS-config-base.yaml
- replace settings to your needs
- baseDomain
- name
- pullSecret
- sshKey
apiVersion: v1
baseDomain: bitbull.ch
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: lab
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: 'cloud.redhat.com > Red Hat OpenShift Cluster Manager > Create Cluster > Run on Baremetal > Copy pull secret > paste here'
sshKey: 'copy ssh-public-key mentioned above into this var'
8.2 create manifest and ignition files
mkdir -p okd-$VERS
cd okd-$VERS
cp ../okd-$VERS-config-base.yaml install-config.yaml
openshift-install create manifests
find . # check manifest files
# prevent masters from being used as workers if needed
sed -i 's/true/false/' manifests/cluster-scheduler-02-config.yml
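The sed above flips the mastersSchedulable flag in cluster-scheduler-02-config.yml; the effect on the relevant line can be previewed in isolation (sample line, same format as the generated manifest):

```shell
# flip the scheduler flag so masters are not used as workers
echo 'mastersSchedulable: true' | sed 's/true/false/'
# -> mastersSchedulable: false
```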
openshift-install create ignition-configs
find . # check ignition files
chmod 644 *.ign
- copy ignition files to tftp server
scp *.ign root@gate:/var/www/localhost/htdocs/
9 Install Openshift Nodes
Now set up and install the OpenShift cluster nodes.
[ x == "x$GV" ] && (echo 'IDIOT, GET GLOBAL VARS FIRST!!!' ; sleep 1d)
# map to the global vars defined above
VM_PATH=$VDIR/images ; MEM=$MEMORY ; VM_NET=$NET_NAME

MAC="52:52:10:00:00:01" VM=bootstrap
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:01:01" VM=master01
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:01:02" VM=master02
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:01:03" VM=master03
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:02:01" VM=worker01
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:02:02" VM=worker02
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:02:03" VM=worker03
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC

MAC="52:52:10:00:02:04" VM=worker04
qemu-img create -f qcow2 $VM_PATH/$VM.qcow2 $DISK
virt-install -n $VM --description "$VM Machine for Openshift 4 Cluster" --os-type=Linux --os-variant=rhel7 --ram=$MEM --vcpus=$CPU --noreboot --disk path=$VM_PATH/$VM.qcow2,bus=virtio --graphics vnc --pxe --network network=$VM_NET,mac=$MAC
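The eight blocks above differ only in VM name and MAC, so they can also be driven from a table; a minimal sketch (echo stands in for the qemu-img + virt-install calls, and only three of the eight pairs are shown):

```shell
# name=MAC table for the cluster nodes, as used above
for pair in bootstrap=52:52:10:00:00:01 master01=52:52:10:00:01:01 worker01=52:52:10:00:02:01; do
  VM=${pair%%=*}    # part before the first '='
  MAC=${pair#*=}    # part after the first '='
  echo "$VM $MAC"   # replace with the qemu-img and virt-install commands
done
```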
- log into bootstrap node and watch installation progress
ssh -lcore bootstrap
journalctl -b -f -u bootkube.service
- watch installer progress
openshift-install --dir=okd-$VERS wait-for bootstrap-complete --log-level=info
9.1 Check installation
$ export KUBECONFIG=$HOME/okd-$VERS/auth/kubeconfig
$ oc whoami
system:admin
$ watch -n5 "oc get clusteroperators.config.openshift.io ; oc get csr"
$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0     True        False         False      4m30s
cloud-credential                           4.5.0     True        False         False      17m
cluster-autoscaler                         4.5.0     True        False         False      9m56s
console                                    4.5.0     True        False         False      5m26s
dns                                        4.5.0     True        False         False      13m
image-registry                             4.5.0     True        False         False      10m
ingress                                    4.5.0     True        False         False      9m48s
insights                                   4.5.0     True        False         False      15m
kube-apiserver                             4.5.0     True        False         False      12m
kube-controller-manager                    4.5.0     True        False         False      12m
kube-scheduler                             4.5.0     True        False         False      12m
machine-api                                4.5.0     True        False         False      14m
machine-config                             4.5.0     True        False         False      13m
marketplace                                4.5.0     True        False         False      10m
monitoring                                 4.5.0     True        False         False      7m49s
network                                    4.5.0     True        False         False      15m
node-tuning                                4.5.0     True        False         False      11m
openshift-apiserver                        4.5.0     True        False         False      10m
openshift-controller-manager               4.5.0     True        False         False      13m
openshift-samples                          4.5.0     True        False         False      8m45s
operator-lifecycle-manager                 4.5.0     True        False         False      14m
operator-lifecycle-manager-catalog         4.5.0     True        False         False      14m
operator-lifecycle-manager-packageserver   4.5.0     True        False         False      10m
service-ca                                 4.5.0     True        False         False      15m
service-catalog-apiserver                  4.5.0     True        False         False      11m
service-catalog-controller-manager         4.5.0     True        False         False      10m
storage                                    4.5.0     True        False         False      11m
9.2 Finish installation
$ oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-2q4wg   19m   system:node:master02                                                        Pending
csr-45rns   19m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-b625h   19m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-bmc9q   19m   system:node:master01                                                        Pending
csr-dvjt8   19m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-f48sl   19m   system:node:worker01                                                        Pending
csr-jqtwt   19m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-kzjxj   19m   system:node:worker02                                                        Pending
csr-msktj   19m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-p9plk   19m   system:node:master03                                                        Pending
$ oc get csr -o name | xargs oc adm certificate approve
certificatesigningrequest.certificates.k8s.io/csr-2q4wg approved
certificatesigningrequest.certificates.k8s.io/csr-45rns approved
certificatesigningrequest.certificates.k8s.io/csr-b625h approved
certificatesigningrequest.certificates.k8s.io/csr-bmc9q approved
certificatesigningrequest.certificates.k8s.io/csr-dvjt8 approved
certificatesigningrequest.certificates.k8s.io/csr-f48sl approved
certificatesigningrequest.certificates.k8s.io/csr-jqtwt approved
certificatesigningrequest.certificates.k8s.io/csr-kzjxj approved
certificatesigningrequest.certificates.k8s.io/csr-msktj approved
certificatesigningrequest.certificates.k8s.io/csr-p9plk approved
$ oc get csr
NAME        AGE   REQUESTOR                                                                   CONDITION
csr-2q4wg   20m   system:node:master02                                                        Approved,Issued
csr-45rns   21m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-b625h   21m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-bmc9q   20m   system:node:master01                                                        Approved,Issued
csr-dvjt8   21m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-f48sl   20m   system:node:worker01                                                        Approved,Issued
csr-jqtwt   21m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-kzjxj   20m   system:node:worker02                                                        Approved,Issued
csr-msktj   21m   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Approved,Issued
csr-p9plk   20m   system:node:master03                                                        Approved,Issued
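New CSRs keep arriving in batches during an install (the node-bootstrapper requests first, then the per-node client certificates), so a single approve pass can miss late ones. A hedged sketch of a helper that approves only Pending requests and can safely be re-run (assuming `oc` is on the PATH and KUBECONFIG is exported; it skips cleanly when `oc` is not available):

```shell
# approve_pending_csrs: approve only CSRs whose CONDITION is Pending.
# Safer than "oc get csr -o name | xargs ..." because already-issued
# certificates are left alone. No-op when oc is missing or unreachable.
approve_pending_csrs() {
  command -v oc >/dev/null 2>&1 || { echo "oc not found, skipping"; return 0; }
  oc get csr --no-headers 2>/dev/null \
    | awk '$NF == "Pending" {print $1}' \
    | xargs -r oc adm certificate approve
}
approve_pending_csrs
```

Re-run it a few times (or under `watch -n30`) until no Pending requests remain.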
10 Cluster Configuration
10.1 Authentication
10.1.1 htpasswd
sudo yum install httpd-tools -y
VERS=4
cd $HOME/okd-$VERS
export KUBECONFIG=$HOME/okd-$VERS/auth/kubeconfig
oc login

# create password file
htpasswd -c -B users.htpasswd chris

# add more users
htpasswd -B users.htpasswd admin

# create password provider
oc create secret generic htpass-secret --from-file=htpasswd=users.htpasswd -n openshift-config
- vi $HOME/okd-$VERS/htpasswd_cr.yaml
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: lab_htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpass-secret
oc apply -f $HOME/okd-$VERS/htpasswd_cr.yaml
oc adm policy add-cluster-role-to-user cluster-admin admin
oc adm policy add-cluster-role-to-user cluster-admin chris
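Before testing a cluster login, the password file can be verified locally. A minimal sketch, assuming httpd-tools is installed and `users.htpasswd` is in the current directory (the user and password values here are placeholders, not from this setup); it degrades to a no-op when the tool or file is missing:

```shell
# check_htpasswd_user: verify a password against the local htpasswd file
# before pointing oc login at the cluster. -v (verify) needs Apache 2.4+.
check_htpasswd_user() {
  local file="$1" user="$2" pass="$3"
  command -v htpasswd >/dev/null 2>&1 || { echo "htpasswd not found, skipping"; return 0; }
  [ -f "$file" ] || { echo "$file not found, skipping"; return 0; }
  htpasswd -vb "$file" "$user" "$pass"
}
check_htpasswd_user users.htpasswd chris 'changeme'   # placeholder password
```

Once the OAuth pods have restarted, `oc login -u chris` should accept the same credentials against the cluster.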
10.2 Deploy Persistent Storage (PV)
Run this on an admin host or on the NFS server itself. The host needs SSH public-key access to the NFS server and OpenShift admin rights.
10.2.1 Setup Storage Helpers
- vi $HOME/bin/create-nfs-pv.sh
#!/bin/bash
set -o pipefail
VERS="4"
NFS_DIR="/srv/nfs"
test -d $HOME/okd-$VERS/pv || mkdir -p $HOME/okd-$VERS/pv
cd $HOME/okd-$VERS/pv
pvname="$1"
pvsize="$2"   ; [ "${pvsize}x" == "x" ]   && pvsize="5Gi"
pvaccess="$3" ; [ "${pvaccess}x" == "x" ] && pvaccess="ReadWriteMany"
pvrcp="$4"    ; [ "${pvrcp}x" == "x" ]    && pvrcp="Retain"
export_dir="${NFS_DIR}/${pvname}"
export_file="/etc/exports.d/${pvname}.exports"
nfs_hostname="nfs.lab.bitbull.ch"
NSSH="ssh -lroot $nfs_hostname"

if [ $# -lt 1 -o "$1" = '-help' -o "$1" = '-h' ]
then
  echo
  echo Usage:
  echo "  $(basename $0) <pvname> [Size of PV|5Gi] [PV access type|ReadWriteMany] [PV reclaim policy|Retain]"
  exit 0
fi

oc whoami | egrep -q '^system:admin'
if [ $? -ne 0 ]; then
  echo "User must be logged in as system:admin with OC"
  exit 1
fi

echo "CREATE NEW EXPORT ON ${nfs_hostname}"
$NSSH "test -d ${export_dir}"
if [ $? -eq 0 ]; then
  echo "Export directory ${export_dir} already exists."
else
  $NSSH "mkdir -p ${export_dir}"
  $NSSH "chown nobody:nobody ${export_dir}"
  $NSSH "chmod 777 ${export_dir}"
  echo "Export directory ${export_dir} created."
fi

$NSSH "test -f ${export_file}"
if [ $? -eq 0 ]; then
  echo "Export file ${export_file} already exists."
else
  $NSSH "echo ${export_dir} *\(rw,sync,no_wdelay,no_root_squash,insecure\) > ${export_file}"
  $NSSH "exportfs -ar"
  $NSSH "showmount -e | grep ${export_dir}"
fi

echo "CREATE PV CONFIGURATION FILE: $PWD/${pvname}.yml"
# use '>' (not '>>') so a re-run overwrites instead of appending a duplicate
cat << EOF > ${pvname}.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pvname}
spec:
  capacity:
    storage: ${pvsize}
  accessModes:
    - ${pvaccess}
  nfs:
    path: ${export_dir}
    server: ${nfs_hostname}
  persistentVolumeReclaimPolicy: ${pvrcp}
EOF

echo "CREATE NEW PV IN OPENSHIFT"
oc create -f $PWD/${pvname}.yml
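The `[ "${pvsize}x" == "x" ] && pvsize=...` lines in the script apply defaults to positional parameters; bash's `${parameter:-default}` expansion does the same in one step. A small illustration (the `pv_args` function is mine, just for demonstration):

```shell
# Defaulting positional parameters with ${n:-default} instead of
# the [ "${var}x" == "x" ] test used in create-nfs-pv.sh.
pv_args() {
  local pvname="$1"
  local pvsize="${2:-5Gi}"              # default PV size
  local pvaccess="${3:-ReadWriteMany}"  # default access mode
  local pvrcp="${4:-Retain}"            # default reclaim policy
  echo "$pvname $pvsize $pvaccess $pvrcp"
}
pv_args pv01                    # -> pv01 5Gi ReadWriteMany Retain
pv_args pv02 10Gi ReadWriteOnce # -> pv02 10Gi ReadWriteOnce Retain
```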
chmod 700 $HOME/bin/create-nfs-pv.sh
for i in {01..20}; do $HOME/bin/create-nfs-pv.sh pv$i 5Gi ReadWriteOnce Retain ; done
for i in {21..40}; do $HOME/bin/create-nfs-pv.sh pv$i 5Gi ReadWriteMany Retain ; done
- vi $HOME/bin/cleanup-pv.sh
#!/bin/bash
set -o pipefail
NFS_DIR="/srv/nfs"
nfs_hostname="nfs.lab.bitbull.ch"
NSSH="ssh -lroot $nfs_hostname"

oc whoami | egrep -q '^system:admin'
if [ $? -ne 0 ]; then
  echo "User must be logged in as system:admin with OC"
  exit 1
fi

echo "--- searching for unused PVs..."
oc get pv | egrep -v "NAME|Available|Bound"
oc get pv | egrep -v "NAME|Available|Bound" | awk '{print $1}' | while read pv
do
  echo "reset PV: $pv..."
  oc patch pv/$pv --patch='{ "spec": { "claimRef": null}}'
  $NSSH -n "rm -frv $NFS_DIR/$pv/*"
  $NSSH -n "rm -frv $NFS_DIR/$pv/.???*"
done
chmod 700 $HOME/bin/cleanup-pv.sh
$HOME/bin/cleanup-pv.sh
10.2.2 Persistent Registry Storage
create-nfs-pv.sh registry-pv 100Gi ReadWriteMany Retain
mkdir -p $HOME/okd-$VERS/pvc
cd $HOME/okd-$VERS/pvc
- vi registry-pvc.yaml
apiVersion: "v1"
kind: "PersistentVolumeClaim"
metadata:
  name: "registry-pvc"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  volumeMode: Filesystem
oc create -f registry-pvc.yaml -n openshift-image-registry
# one pod must be running
oc get pod -n openshift-image-registry

# bind storage to pv
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState":"Managed"}}'
oc patch configs.imageregistry.operator.openshift.io cluster --type json --patch '[{ "op": "remove", "path": "/spec/storage" }]'
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"pvc":{"claim": "registry-pvc"}}}}'
# wait until operator has finished its work
oc get clusteroperator image-registry
# check if pv got bound
oc set volume pod --all -n openshift-image-registry
oc set volume pod --all -n openshift-image-registry | grep -A1 -B1 registry-pvc
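The claim state can also be read directly from the PVC object; a hedged one-liner sketch (same `oc`/KUBECONFIG assumptions as above, no-op when `oc` is absent or the cluster is unreachable):

```shell
# check_registry_pvc: report whether registry-pvc has reached the Bound phase.
check_registry_pvc() {
  command -v oc >/dev/null 2>&1 || { echo "oc not found, skipping"; return 0; }
  local phase
  phase=$(oc get pvc registry-pvc -n openshift-image-registry \
            -o jsonpath='{.status.phase}' 2>/dev/null)
  echo "registry-pvc phase: ${phase:-unknown}"
}
check_registry_pvc
```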
11 Tag Infra Nodes
for i in 1 2 3; do oc label nodes master0${i} node-role.kubernetes.io/infra=""; done
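To confirm the labels took effect, the nodes can be listed by the new role; a minimal check (same `oc`/KUBECONFIG assumption, degrades gracefully without `oc` or a reachable cluster):

```shell
# show_infra_nodes: list all nodes carrying the infra role label.
show_infra_nodes() {
  command -v oc >/dev/null 2>&1 || { echo "oc not found, skipping"; return 0; }
  oc get nodes -l node-role.kubernetes.io/infra 2>/dev/null \
    || echo "cluster not reachable"
}
show_infra_nodes
```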