Tanzu Kubernetes Grid 1.4 Installation in Internet-Restricted Environment

An air-gapped (internet-restricted) installation is used when the TKG environment (bootstrapper and cluster nodes) cannot connect to the internet to download the installation binaries from the public VMware registry during TKG installation or upgrades.

Internet-restricted environments can use an internal private registry in place of the VMware public registry. A commonly used registry solution is Harbor.

This blog post covers how to install TKGm using a private registry configured with a self-signed certificate.

Prerequisites for an Internet-Restricted Environment

Before you can deploy TKG management and workload clusters in an Internet-restricted environment, you must have:

  • An Internet-connected Linux jumphost machine that has:
    • A minimum of 2 GB RAM, 2 vCPU, and 30 GB hard disk space.
    • Docker client installed.
    • Tanzu CLI installed. 
    • Carvel Tools installed.
    • yq version 4.9.2 or later installed.
  • An internet-restricted Linux machine with Harbor installed.
  • A way for TKG cluster VMs to access images in the private registry.
  • A base image template, containing the OS and Kubernetes versions used to deploy the management and workload clusters, imported into vSphere. Instructions for the same are here
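The jumphost prerequisites above can be checked with a short script before you start. This is a minimal sketch, not part of the official procedure: the helper names version_ge and check_tools are made up for illustration, the Carvel binary names (ytt, kapp, kbld, imgpkg) are assumed to be the ones installed on your jumphost, and the version comparison relies on GNU sort -V.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Relies on GNU `sort -V` (version sort), which is available on CentOS 7.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: reports every tool from the argument list that is
# not on the PATH and returns non-zero if at least one is missing.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage on the jumphost:
# check_tools docker tanzu yq ytt kapp kbld imgpkg
# yq_version="$(yq --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
# version_ge "$yq_version" 4.9.2 || echo "yq is older than 4.9.2" >&2
```

Comparing with sort -V rather than a plain string comparison matters here, because a string comparison would wrongly rank 4.10.0 below 4.9.2.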

Network Layout

The diagram below shows how the components are connected in my lab.

The networks used for the TKG deployment (management and workload clusters) are not connected to the internet. All binaries required for the TKG deployment are pushed to the internal Harbor registry, and during deployment you point your cluster configuration at that registry. I placed the Harbor registry on the TKG management network to keep the architecture simple.

The required firewall rules for communication are created on the infrastructure’s firewall device, which is beyond the scope of this article.

It’s time to dive into the lab and see things in action.

The steps below show the procedure for deploying TKG 1.4 in an internet-restricted environment. They assume that you have already configured your vSphere environment as per the TKG requirements.

Step 1: Deploy and Configure NSX Advanced Load Balancer

Please see this article for the steps to deploy and configure NSX ALB for TKG.

Step 2: Deploy and Configure Harbor Registry

I already have a post on this topic, so I am not repeating the steps. 

Step 3: Deploy & Configure Linux Jumphost

In my lab, I have deployed a CentOS 7 VM which is acting as a Linux jumphost and installed Tanzu CLI, Carvel Tools, Docker client, etc. The steps are here

When the jumphost configuration is ready, execute the tanzu init and tanzu management-cluster create commands. When these commands are executed for the first time, they install the necessary Tanzu Kubernetes Grid configuration files in the ~/.config/tanzu/tkg folder on the jumphost.

The script that you create and run in the subsequent steps requires TKG Bill of Materials (BoM) YAML files to be present on the jumphost. The scripts in this procedure use the BoM files to identify the correct versions of the different Tanzu Kubernetes Grid component images to pull.

Step 4: If your environment has a DNS server, ensure that you have created a DNS entry for the Harbor registry.

Step 5: Set environment variables as shown below:

5a: Set the IP address or FQDN of your local registry.

# export TKG_CUSTOM_IMAGE_REPOSITORY="PRIVATE-REGISTRY"

Where PRIVATE-REGISTRY is the IP address or FQDN of your private registry followed by the name of the project, for example registry.example.com/library.

5b: Set the repository from which to fetch Bill of Materials (BoM) YAML files.

# export TKG_IMAGE_REPO="projects.registry.vmware.com/tkg"

5c: If your private registry uses a self-signed certificate, provide the CA certificate of the registry in base64-encoded format. You can generate the value by running base64 -w 0 your-ca.crt

# export TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE=LS0t[…]tLS0tLQ==

This CA certificate is automatically injected into all Tanzu Kubernetes clusters that you create in this Tanzu Kubernetes Grid instance.
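Steps 5a to 5c can be combined so that the encoded certificate is never pasted by hand. The following is a minimal sketch: encode_ca_cert is a hypothetical helper name, and the registry name and certificate path in the usage lines are placeholders. It pipes through tr instead of using base64 -w 0, which produces the same single-line output on a Linux jumphost.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints a PEM CA certificate as single-line base64.
encode_ca_cert() {
  local cert_file="$1"
  # Refuse files that do not look like a PEM certificate.
  if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$cert_file"; then
    echo "error: $cert_file does not look like a PEM certificate" >&2
    return 1
  fi
  # Strip the line breaks so the value fits on one line
  # (equivalent to GNU `base64 -w 0`).
  base64 < "$cert_file" | tr -d '\n'
}

# Example usage (placeholder registry name and certificate path):
# export TKG_CUSTOM_IMAGE_REPOSITORY="registry.example.com/library"
# export TKG_IMAGE_REPO="projects.registry.vmware.com/tkg"
# export TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE="$(encode_ca_cert ./your-ca.crt)"
```

The PEM check is there to catch the common mistake of encoding the wrong file, for example a private key or a DER-encoded certificate.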

Step 6: Generate the publish-images Script

Step 7: Run the publish-images Script

When the script finishes, verify that the Harbor registry contains the TKG installation binaries.

Turn off the internet connection on the Linux jumphost once you've confirmed that the TKG installation binaries are available in the registry.

Step 8: Deploy TKG Management Cluster

To deploy the TKG Management cluster, start the installer interface by running the below command:

The installer interface launches in a browser and takes you through steps to configure the management cluster.

Once you’ve entered all of the necessary information and reached the Review Configuration screen, don’t click the Deploy Management Cluster button; instead, make a note of the location of the cluster config yaml file.

Append the entries for your private registry to the cluster configuration yaml file. An example yaml file is shown below for reference. 
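As a rough sketch of that append step, the function below adds the private registry entries to the end of a cluster configuration file. The function name append_registry_settings is made up for illustration, and the registry value and file path in the usage lines are placeholders. It assumes the registry presents a certificate signed by the CA you encoded in Step 5c, which is why TLS verification is left enabled.

```shell
#!/usr/bin/env bash

# Hypothetical helper: appends the private registry settings to a
# cluster configuration file, unless they are already present.
#   $1 = path to the cluster config YAML file
#   $2 = registry and project, e.g. registry.example.com/library
#   $3 = base64-encoded CA certificate (single line)
append_registry_settings() {
  local config_file="$1" registry="$2" ca_b64="$3"

  # Avoid duplicate keys if the function is run twice on the same file.
  if grep -q '^TKG_CUSTOM_IMAGE_REPOSITORY:' "$config_file"; then
    echo "registry settings already present in $config_file" >&2
    return 0
  fi

  cat >> "$config_file" <<EOF
TKG_CUSTOM_IMAGE_REPOSITORY: "$registry"
TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: false
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: "$ca_b64"
EOF
}

# Example usage (placeholder file name; reuses the variables from Step 5):
# append_registry_settings ./cluster-config.yaml \
#   "$TKG_CUSTOM_IMAGE_REPOSITORY" \
#   "$TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE"
```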

To deploy the management cluster, execute command: tanzu management-cluster create -f <cluster-config-file> -v 6

Once the management cluster is deployed, verify the health of the cluster by running the command: tanzu management-cluster get

Issues and Troubleshooting

1: The jumphost is unable to push the TKG binaries to Harbor because of the self-signed certificate.

On the jumphost VM, navigate to the /etc/docker directory and execute the command: mkdir -p certs.d/<harbor-ip or fqdn>. In my lab, I've set up Harbor to use an FQDN, so the directory structure looks like this.

Copy the <cert>.crt file from the Harbor registry, place it in the Harbor directory you just created, and restart the Docker service.
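The fix for the first issue can be scripted roughly as follows. This is a sketch under a few assumptions: install_registry_ca is a hypothetical helper name, the certificate is stored as ca.crt (Docker treats any .crt file in this directory as a trusted CA for that registry), and the Docker configuration lives in /etc/docker unless you pass a different directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper: copies a registry CA certificate into Docker's
# per-registry trust directory and prints the resulting path.
#   $1 = registry IP or FQDN (append :port if the registry uses a non-default port)
#   $2 = path to the CA certificate copied from Harbor
#   $3 = Docker config directory (optional, defaults to /etc/docker)
install_registry_ca() {
  local registry="$1" cert_file="$2" docker_dir="${3:-/etc/docker}"
  local target_dir="$docker_dir/certs.d/$registry"

  mkdir -p "$target_dir"
  cp "$cert_file" "$target_dir/ca.crt"
  echo "$target_dir/ca.crt"
}

# Example usage as root (placeholder FQDN and certificate path):
# install_registry_ca harbor.example.com ./ca.crt
# systemctl restart docker
```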

2: cert-manager or any other pod is stuck in the ImagePullBackOff state, and you see errors similar to those shown below:

Create a vsphere-overlay.yaml file to manually add the Harbor IP and FQDN to the /etc/hosts file of the control plane and worker nodes of the TKG clusters. Save the YAML file in the directory "~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt".

A sample yaml is shown below for reference.
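One way such an overlay might look is sketched below; treat it as an assumption-laden starting point rather than the exact file from my lab. The function name write_hosts_overlay is made up for illustration, and the IP address and FQDN in the usage line are placeholders. The overlay assumes that the generated cluster manifests contain one KubeadmControlPlane and one KubeadmConfigTemplate object, each with an existing preKubeadmCommands list to append to. Any existing vsphere-overlay.yaml is copied to a .bak file first.

```shell
#!/usr/bin/env bash

# Hypothetical helper: writes a ytt overlay that appends a Harbor entry
# to /etc/hosts on control plane and worker nodes.
#   $1 = ytt directory of the vSphere provider
#   $2 = Harbor IP address
#   $3 = Harbor FQDN
write_hosts_overlay() {
  local ytt_dir="$1" harbor_ip="$2" harbor_fqdn="$3"
  local target="$ytt_dir/vsphere-overlay.yaml"

  mkdir -p "$ytt_dir"
  # Keep a copy of any existing overlay before replacing it.
  if [ -f "$target" ]; then
    cp "$target" "$target.bak"
  fi

  cat > "$target" <<EOF
#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #@overlay/append
    - echo "$harbor_ip $harbor_fqdn" >> /etc/hosts

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      preKubeadmCommands:
      #@overlay/append
      - echo "$harbor_ip $harbor_fqdn" >> /etc/hosts
EOF
}

# Example usage (placeholder IP and FQDN):
# write_hosts_overlay ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt \
#   10.0.0.5 harbor.example.com
```

Because the entry is added through preKubeadmCommands, it is written on each node before kubeadm starts pulling images, which is the point at which the registry name has to resolve.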

Redeploy the TKG cluster and it should go through. 

That’s it for this post. I hope you enjoyed reading this post. Feel free to share this on social media if it is worth sharing.
