Welcome to yet another troubleshooting post for tmc self-managed operation in VCD. In the last post, I discussed the tmc self-managed deployment issue and how I fixed it. In this post, I will discuss another issue that I encountered with the solution.
After successfully deploying TMC Self-Managed, you must publish the solution to tenants so that they can attach their TKG clusters to TMC. When the publishing operation is performed, the TMC Self-Managed Add-On solution creates a temporary VM known as the solution agent vm, which is subsequently destroyed once the task is complete.
In my lab, the publishing task was completed for a couple of tenants and later when I tried publishing it to another tenant, the task got stuck (VCD was acting cranky at that time).
This behavior is encountered when the solutions process in the VCD cell gets killed or there is network interruption between the cell and the VCD public address during the operation execution.
My previous blog post discussed the VCD Extension for Tanzu Mission Control and covered the end-to-end deployment steps. In this post, I will cover how to troubleshoot a stuck TMC self-managed deployment in VCD.
I was deploying TMC self-managed in a new environment, and during configuration, I made a mistake by passing an incorrect value for the DNS zone, leading to a stuck deployment that did not terminate automatically. I waited for a couple of hours for the task to fail, but the task kept on running, thus preventing me from installing it with the correct configuration.
The deployment was stalled in the Creating phase and did not fail.
On checking the pods in the tmc-local namespace, a lot of them were stuck in either ‘CreateContainerConfigError” or “CrashLoopBackOff” states.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
root@jumpbox:~# kubectl get po -n tmc-local | grep CreateContainerConfigError
In VCD, when I checked the failed task ‘Execute global ‘post-create’ action,” I found the installer was complaining that the tmc package installation reconciliation failed.… Read More
VMware Tanzu Mission Control is a centralized hub for simplified, multi-cloud, multi-cluster Kubernetes management. It helps platform teams take control of their Kubernetes clusters with visibility across environments by allowing users to group clusters and perform operations, such as applying policies, on these groupings.
VMware launched Tanzu Mission Control Self-Managed last year for customers running their Kubernetes (Tanzu) platform in an air-gapped environment. TMC Self-Managed is designed to support organizations that prefer to maintain complete control over their multi-cluster management hub for Kubernetes to take full advantage of advanced capabilities for cluster configuration, policy management, data protection, etc.
In the first part of this series of blog posts, I talked about how VMware Container Networking with Antrea addresses current business problems associated with a Kubernetes CNI deployment. I also discussed the benefits that VMware NSX offers when Antrea is integrated with NSX.
In this post, I will discuss how to enable the integration between Antrea and NSX.
Antrea-NSX Integration Architecture
The below diagram is taken from VMware blogs and shows the high-level architecture of Antrea and NSX integration.
The following excerpt from vmware blogs summarizes the above architecture pretty well.
Antrea NSX Adapter is a new component introduced to the standard Antrea cluster to make the integration possible. This component communicates with K8s API and Antrea Controller and connects to the NSX-T APIs. When a NSX-T admin defines a new policy via NSX APIs or UI, the policies are replicated to all the clusters as applicable. These policies will be received by the adapter which in turn will create appropriate CRDs using K8s APIs.
Container Network Interface (CNI) is a framework for dynamically configuring networking resources in a Kubernetes cluster. CNI can integrate smoothly with the kubelet to enable the use of an overlay or underlay network to automatically configure the network between pods. Kubernetes uses CNI as an interface between network providers and Kubernetes pod networking.
There exists a wide variety of CNIs (Antrea, Calico, etc.) that can be used in a Kubernetes cluster. For more information on the supported CNIs, please read this article.
Business Challenges with Current K8s Networking Solutions
The top business challenges associated with current CNI solutions can be categorized as below:
Community support lacks predefined SLAs: Enterprises benefit from collaborative engineering and receive the latest innovations from open-source projects. However, it is a challenge for any enterprise to rely solely on community support to run its operations because community support is a best effort and cannot provide a predefined service-level agreement (SLA).
In a vSphere with Tanzu environment, when you enable Workload Management, the Supervisor cluster that gets deployed operates as the management cluster. After supervisor cluster is deployed, you can provision two types of workload clusters
Tanzu Kubernetes clusters.
Clusters based on a ClusterClass (aka Classy Cluster).
TKG on vSphere 8 provides different sets of APIs to deploy a TKC or a Classy cluster:
When you deploy a cluster using v1beta1 API, you get a Classy Cluster or TKG 2.0 cluster which is based on a default ClusterClass definition.
By default workload cluster don’t trust any self-signed certificates. Prior to TKG 2.0 clusters, the easiest way to make Tanzu Kubernetes Clusters (TKCs) to trust any self-signed CA certificate was to edit tkgserviceconfigurations and define your Trusted CAs there. This setting was then enforced on any newly deployed TKCs.
For TKG 2.0 clusters, the process has changed a bit and in this post I will walk through configuring the same.… Read More
Welcome to the Tanzu Mission Control Self-Managed series. So far in this series, I have covered the installation prerequisites and how to configure them. After that, I demonstrated the TMC-SM installation procedure on the TKGm platform. if you are not following along, you can read the earlier post of this series from the below links:
The installation procedure for TMC Self-Managed on a vSphere with Tanzu (aka TKGS) Kubernetes platform is a bit different and this post is focused on covering the required steps. Let’s get started.
I have used the following BOM in my lab
Software Components
Version
vSphere Namespace
1.24.9
VMware vSphere ESXi
7.0 U3n
VMware vCenter (VCSA)
7.0 U3n
VMware vSAN
7.0 U3n
NSX ALB
22.1.3
Make sure the following are already configured in your environment before attempting the installation:
This is the sixth blog post of the TMC Self-Managed blog series. In the previous post of this series, I showed how to configure the final prerequisite (Harbour registry) of the installation. If you are following along with me, you are now ready for the installation.
If you have landed on this post directly by mistake, I encourage you to read the previous blog posts of this series using the below links:
This blog post is focused on installing TMC Self-Managed on Tanzu Kubernetes Grid multi-cloud (TKGm). I will cover the installation procedure for TKGS (vSphere with Tanzu) in a separate post.
I have used the following BOM in my lab
Software Components
Version
Tanzu Kubernetes Grid
2.1.0
VMware vSphere ESXi
7.0 U3n
VMware vCenter (VCSA)
7.0 U3n
VMware vSAN
7.0 U3n
NSX Advanced LB
22.1.3
Step 1: Connect to the workload cluster where TMC Self-managed will be installed.… Read More
Here we are at the fifth post in our blog series. In this post, I’ll discuss how to download and prepare TMC Self-Managed artifacts for installation in a harbor registry.
If you have landed on this post directly by mistake, I encourage you to read the previous blog posts of this series using the below links:
Tanzu/Kubernetes supports a wide variety of image registry solutions including JFrog, Docker Hub, Amazon Elastic Container Registry, VMware Harbor, etc. for storing the application images that you deploy on the workload clusters. However, TMC Self-Managed only supports the Harbor image registry at the time of writing this post. The harbor registry must meet the following requirements:
A minimum storage of 20 GB is recommended for Harbor.
Welcome to Tanzu Mission Control Self-Managed Part 4 of the series. I’ll show you how to use cluster issuer and cert-manager for automatic certificate issuing in this post.
If you have landed on this post directly by mistake, I encourage you to read the previous blog posts of this series using the below links:
For its certificates, TMC Self-Managed uses cert-manager. You can use the cert-manager and cluster issuer to create a self-signed certificate for the installation in a lab or POC environment. On the workload cluster where TMC Self-Managed will be installed in my lab, I have installed cert-manager as a Tanzu package.