Layer 7 Ingress in vSphere with Tanzu using NSX ALB

Introduction

vSphere with Tanzu currently doesn’t provide the AKO orchestration feature out of the box. By this I mean you can’t automate the deployment of AKO pods based on cluster labels. No AkoDeploymentConfig is created when you enable Workload Management on a vSphere cluster, so nothing runs in your supervisor cluster to watch the cluster labels and trigger automated AKO installation in the workload clusters.

However, this does not preclude you from using NSX ALB to provide layer-7 ingress for your workload clusters. In a vSphere with Tanzu environment, AKO is installed via Helm charts and is a completely self-managed solution: you are in charge of maintaining the AKO lifecycle.
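For reference, a minimal Helm-based AKO install looks something like the sketch below. The chart repository and flag names come from the public AKO chart; the controller FQDN and credentials are placeholders for your environment.

```bash
# Add the public AKO chart repo and create the namespace AKO runs in
helm repo add ako https://projects.registry.vmware.com/chartrepo/ako
helm repo update
kubectl create ns avi-system

# Pull the default values for the chart version used in this post,
# edit the file for your environment, then install
helm show values ako/ako --version 1.6.2 > values.yaml
helm install ako/ako --generate-name --version 1.6.2 \
  -f values.yaml \
  --namespace avi-system \
  --set ControllerSettings.controllerHost=<avi-controller-fqdn> \
  --set avicredentials.username=<avi-username> \
  --set avicredentials.password=<avi-password>
```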

My Lab Setup

My lab’s bill of materials is shown below.

Component               Version
NSX ALB (Enterprise)    20.1.7
AKO                     1.6.2
vSphere                 7.0 U3c
Helm                    3.7.4

The current setup of the NSX ALB is shown in the table below.

Configuring L7 Ingress with NSX Advanced Load Balancer

NSX Advanced Load Balancer provides L4 and L7 load balancing through a Kubernetes operator (AKO) that integrates with the Kubernetes API to manage the lifecycle of load balancing and ingress resources for workloads. AKO runs as a pod in Tanzu Kubernetes clusters and provides ingress controller and load balancing functionality. AKO stays in sync with the required Kubernetes objects and calls the NSX ALB Controller APIs to deploy the Ingresses and Services and place them on the Service Engines.

In this post, I will discuss implementing ingress control for a sample application and see NSX ALB in action.

What is Kubernetes Ingress?

As per Kubernetes documentation:

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
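As a quick illustration, here is a minimal Ingress that AKO would reconcile. The Service name and hostname are hypothetical, and avi-lb is the IngressClass AKO creates by default; adjust if yours differs.

```bash
# Minimal Ingress for a hypothetical Service "web-svc" listening on port 80
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: avi-lb
  rules:
  - host: web.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc
            port:
              number: 80
EOF
```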

How do I implement NSX ALB as an ingress controller?

If you have deployed AKO via Helm, the parameters below in the values.yaml file control how L7 ingress is handled.
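As a sketch, the L7-related knobs can be set via an override file. The key names are taken from the AKO 1.6 chart's values.yaml; verify them against your chart version.

```bash
# Hypothetical L7-related overrides for the AKO chart
cat > l7-values.yaml <<'EOF'
L7Settings:
  defaultIngController: true   # make AKO the cluster's default ingress controller
  serviceType: NodePort        # ClusterIP | NodePort | NodePortLocal
AKOSettings:
  layer7Only: false            # set true to have AKO handle only L7 (ingress) objects
EOF

# Apply the overrides to an existing release
helm upgrade <ako-release-name> ako/ako --version 1.6.2 \
  -f values.yaml -f l7-values.yaml -n avi-system
```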

How to make NSX ALB 21.1.3 work with TKGm 1.5.1

To test TKGm 1.5.1 against the latest version of NSX ALB, I upgraded my ALB deployment to 21.1.3. The deployment of the TKG management and workload clusters went smoothly.

However, when I deployed a sample load balancer application that uses a dedicated SEG and VIP network, the service remained stuck waiting for an external IP assignment.
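A couple of quick checks help narrow this down. The namespace and pod name below assume AKO's default avi-system deployment; adjust for your setup.

```bash
# EXTERNAL-IP stays <pending> until the Avi controller allocates a VIP
kubectl get svc -n <app-namespace>

# Check the AKO logs for VIP allocation errors; this assumes the default
# avi-system namespace and the ako-0 pod name from the AKO StatefulSet
kubectl logs ako-0 -n avi-system | tail -n 50
```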


NSX ALB Signed Certificates and TKGm Installation Gotcha

The Problem

I recently replaced the self-signed NSX ALB certificates with a CA-signed (Microsoft CA) certificate, which caused a new, unanticipated issue with TKGm deployment.

The TKGm installer wizard complained about the certificate validity. I knew nothing was wrong with the certificate validity on NSX ALB because it had been replaced just a few hours earlier. Nonetheless, I double-checked the certificate expiration date, which is set to 2024.

After some digging, I investigated the CLI terminal on the bootstrap machine, where I had issued the tanzu management-cluster create command, and spotted the main problem right away.

This is the error shown in the CLI.

Since the certificate is not signed by a public CA, the bootstrap machine has no idea about the CA that signed this cert.
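One way to handle this, assuming your TKGm version supports the AVI_CA_DATA_B64 configuration variable, is to hand the bootstrap machine the signing CA as part of the management cluster configuration:

```bash
# The bootstrap machine must trust the CA that signed the controller cert.
# TKGm accepts the CA bundle base64-encoded via AVI_CA_DATA_B64 in the
# management cluster config (verify the variable name for your TKG version).
# Note: base64 -w0 is GNU coreutils; on macOS use base64 without -w.
echo "AVI_CA_DATA_B64: $(base64 -w0 < ca.crt)" >> mgmt-cluster-config.yaml

tanzu management-cluster create --file mgmt-cluster-config.yaml
```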

Replacing NSX ALB Certificates with Signed Certificates

In this post, I will walk through the steps of replacing the NSX ALB self-signed certificates with a CA-signed certificate. For this demonstration, I am using Active Directory Certificate Services in my lab: a Windows Server 2019 machine with the AD-integrated Certificate Services role configured.

Follow the procedure below to replace the NSX ALB certificates.

Step 1: Generate Certificate Signing Request (CSR)

A CSR includes information such as the domain name, organization name, locality, and country. The request also contains the public key that will be embedded in the issued certificate; the matching private key stays with the requester. A CSR can be generated directly from the NSX ALB portal, or externally using the OpenSSL utility.

To generate a CSR via the NSX ALB portal, go to Templates > Security > SSL/TLS Certificates, click the Create button, then select Controller Certificate from the drop-down menu.
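If you prefer the OpenSSL route instead, a CSR can be generated along these lines. The subject and SAN values are placeholders for your controller's FQDN and IP.

```bash
# Generate a private key and CSR in one shot; subject and SANs are
# placeholders. The -addext flag requires OpenSSL 1.1.1 or later.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout alb-controller.key \
  -out alb-controller.csr \
  -subj "/C=US/ST=CA/L=PaloAlto/O=Lab/CN=alb.lab.local" \
  -addext "subjectAltName=DNS:alb.lab.local,IP:172.16.10.10"

# Inspect the CSR before submitting it to the CA
openssl req -in alb-controller.csr -noout -text
```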

Backup and Restore TKG Clusters using Velero Standalone

In my last post, I discussed how Tanzu Mission Control Data Protection can be used to back up and restore stateful Kubernetes applications. In this tutorial, I'll show you how to back up and restore applications using Velero standalone.

Why Use Velero Standalone?

You might wonder why you would use Velero standalone when TMC Data Protection exists to simplify backing up and restoring K8s applications. The answer is pretty simple: currently, TMC does not have the ability to restore data across clusters. Backup and restore are limited to a single cluster, which is not ideal from a business continuity point of view.

You can circumvent this limitation with Velero standalone. In this approach, Velero is installed separately in both the source and target clusters, and both clusters have access to the S3 bucket where backups are kept. If your source cluster is completely lost in a disaster, you can redeploy the applications by downloading the backup from S3 and then restoring it.
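The flow looks roughly like this; the bucket name, endpoint, plugin version, and credentials file are placeholders for your own S3 setup:

```bash
# Run the same install in BOTH the source and target clusters so they
# share one backup location
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.3.0 \
  --bucket tkg-backups \
  --secret-file ./s3-credentials \
  --backup-location-config region=minio,s3ForcePathStyle=true,s3Url=https://s3.lab.local:9000

# Source cluster: back up an application namespace
velero backup create app1-backup --include-namespaces app1

# Target cluster: the shared bucket makes the backup visible here too
velero restore create --from-backup app1-backup
```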

Backing Up Stateful Applications using TMC Data Protection

Introduction

Kubernetes is frequently thought of as a platform for stateless workloads because the majority of its resources are ephemeral. However, as Kubernetes grows in popularity, enterprises are deploying more and more stateful apps. Because stateful workloads require persistent storage for application data, you can no longer simply redeploy them in the event of a disaster.

As businesses invest extensively in Kubernetes and deploy more and more containerized applications across multi-clouds, providing adequate data protection in a distributed environment becomes a challenge that must be addressed.

Data Protection in Tanzu Mission Control (TMC) is provided by Velero, an open-source project. Velero backups typically include application and cluster data such as ConfigMaps, custom resource definitions, and Secrets, which are re-applied to a cluster during restoration. Resources that use a persistent volume are backed up using Restic.
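With Restic, pod volumes are typically opted in via an annotation; the namespace, pod, and volume names below are hypothetical.

```bash
# Opt a pod's volume into Restic backups by listing the volume names
# to back up in the annotation value
kubectl -n app1 annotate pod/mysql-0 \
  backup.velero.io/backup-volumes=mysql-data
```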

In this post, I’ll show how to back up and recover a stateful application running in a Tanzu Kubernetes cluster.

Using Custom S3 Storage (MinIO) with TMC Data Protection

Introduction

Data protection in TMC is provided by Velero, an open-source project that came to VMware with the Heptio acquisition.

When data protection is enabled on a Kubernetes cluster, backups are stored externally to TMC. TMC supports both AWS S3 and custom S3-compatible storage locations for backups. Configuring an AWS S3 endpoint is pretty simple, as TMC provides a CloudFormation script that handles the backend tasks such as creating S3 buckets and assigning permissions.

AWS S3 might not be a suitable solution in some use cases; for instance, a customer may have already invested heavily in an S3 solution (MinIO, Cloudian, etc.). TMC allows customers to bring their own self-provisioned AWS S3 bucket or S3-compatible on-prem storage locations for their Kubernetes clusters.
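If you are using MinIO, creating the target bucket is quick with the mc client; the alias, endpoint, and credentials below are placeholders.

```bash
# Register the on-prem MinIO endpoint and create a bucket for TMC backups
mc alias set minio https://minio.lab.local:9000 <access-key> <secret-key>
mc mb minio/tmc-backups
mc ls minio
```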

In this post, I will talk about how you can use on-prem S3 storage for Kubernetes backups taken with TMC Data Protection.

Tanzu Kubernetes Grid 1.4 Installation in Internet-Restricted Environment

An air-gapped (aka internet-restricted) installation method is used when the TKG environment (bootstrapper and cluster nodes) is unable to connect to the internet to download the installation binaries from the public VMware registry during TKG installation or upgrades.

Internet-restricted environments can use an internal private registry in place of the VMware public registry. Harbor is an example of a commonly used registry solution.

This blog post covers how to install TKGm using a private registry configured with a self-signed certificate.
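The heart of the configuration is pointing the Tanzu CLI at the internal registry. The variable names below are from the TKG documentation; the Harbor FQDN is a placeholder.

```bash
# Point the Tanzu CLI at the internal registry instead of the public one
export TKG_CUSTOM_IMAGE_REPOSITORY="harbor.lab.local/tkg"
export TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY=false
# The registry uses a self-signed certificate, so supply its CA bundle
# (base64 -w0 is GNU coreutils; on macOS use base64 without -w)
export TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE=$(base64 -w0 < harbor-ca.crt)
```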

Pre-requisites of Internet-Restricted Environment

Before you can deploy TKG management and workload clusters in an Internet-restricted environment, you must have:

  • An Internet-connected Linux jumphost machine that has:
    • A minimum of 2 GB RAM, 2 vCPU, and 30 GB hard disk space.
    • Docker client installed.
    • Tanzu CLI installed. 
    • Carvel Tools installed.
    • yq version 4.9.2 or later installed.
  • An internet-restricted Linux machine with Harbor installed.

Resizing TKGm Cluster in VCD

This blog post explains how to resize (horizontally scale) a CSE-provisioned TKGm cluster in VCD.

In my lab, I deployed a TKGm cluster with one control plane and one worker node. 

To resize the cluster through the VCD UI, go to the Kubernetes Container Clusters page and select the TKGm cluster to resize. Click on the Resize option.

Select the number of worker nodes you want in your TKGm cluster and click the Resize button.
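Once the resize completes, a quick check from kubectl (assuming you have the cluster's kubeconfig) confirms the new worker registered:

```bash
# Confirm the new worker node joined the cluster and is Ready
kubectl get nodes -o wide
```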