Tanzu Kubernetes Grid Ingress With NSX Advanced Load Balancer

NSX ALB delivers scalable, enterprise-class ingress for containerized workloads running in Kubernetes clusters. The biggest advantage of using NSX ALB in a Kubernetes environment is that it is agnostic to the underlying Kubernetes cluster implementation. The NSX ALB controller integrates with the Kubernetes ecosystem via REST API, so it can be used as an ingress and L4-L7 load balancing solution for a wide variety of Kubernetes implementations, including VMware Tanzu Kubernetes Grid.

NSX ALB provides ingress and load balancing functionality for TKG using AKO (Avi Kubernetes Operator), a Kubernetes operator that runs as a pod in the Tanzu Kubernetes clusters. AKO translates the required Kubernetes objects into Avi objects and automates the implementation of ingresses, routes, and services on the Service Engines (SE) via the NSX ALB Controller.

The diagram below shows a high-level architecture of AKO interaction with NSX ALB.

AKO interacts with the Controller and Service Engines via API to automate the provisioning of Virtual Services, VIPs, and related objects.

Quick Tip: Disable vSAN Precheck During Workload Domain Upgrade in VCF

Before an upgrade bundle can be applied to a workload domain (Mgmt or VI), SDDC Manager triggers a precheck on the domain to identify and flag any underlying issues, so that they can be remediated before the upgrade bundle is applied. In lab environments, one of the common precheck failures concerns vSAN HCL compatibility.

Labs often run VCF on unsupported hardware that is not listed in the vSAN HCL.

During the upgrade precheck on the workload domain, the vSAN HCL status shows as red, and SDDC Manager won’t let you upgrade the domain until the issue is fixed.

You can force SDDC Manager to ignore the vSAN precheck by editing the vSAN-related entries in the applications-prod.properties file. The file is located in the directory “/opt/vmware/vcf/lcm/lcm-app/conf”.

Change the vSAN health check related values from true to false.
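
The true-to-false change described above can be sketched in Python as a small helper that rewrites selected keys in properties-style text. This is only an illustration: the key name used below is a hypothetical placeholder, not an actual SDDC Manager property, and the helper works on a string rather than on the real file.

```python
def disable_flags(properties_text, keys):
    """Return properties_text with each listed key changed from true to false."""
    wanted = set(keys)
    out = []
    for line in properties_text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and lines without '=' untouched.
        if stripped and not stripped.startswith("#") and "=" in stripped:
            key, _, value = stripped.partition("=")
            if key.strip() in wanted and value.strip().lower() == "true":
                line = f"{key.strip()}=false"
        out.append(line)
    return "\n".join(out)


# Hypothetical key name, for illustration only.
sample = "example.vsan.check.enabled=true\nother.setting=true\n# comment"
print(disable_flags(sample, ["example.vsan.check.enabled"]))
```

Only the keys passed in are touched, so unrelated settings that happen to be true stay as they are. Keeping a backup copy of the original file before editing is a sensible precaution.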

Monitor Tanzu Kubernetes Cluster with Prometheus & Grafana

Introduction

Monitoring is one of the most important parts of any infrastructure. Day-2 operations depend heavily on monitoring, alerting, and logging. Containerized applications are now part of almost every environment, and monitoring Kubernetes clusters eases the management of containerized infrastructure by tracking the utilization of cluster resources.

As a Kubernetes operator, you would want to receive alerts if the desired number of pods is not running, if resource utilization is approaching critical limits, or when failures or misconfigurations cause pods or nodes to become unable to participate in the cluster.
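
As a rough illustration of the alert conditions listed above, here is a minimal Python sketch. The function name, its parameters, and the 90% threshold are assumptions made for this example. In a Prometheus-based setup, this logic would normally be expressed as alerting rules rather than application code.

```python
def pod_alerts(desired, ready, cpu_used, cpu_limit, threshold=0.9):
    """Return a list of alert messages for one workload (illustrative only)."""
    alerts = []
    # Fewer ready pods than the desired replica count.
    if ready < desired:
        alerts.append(f"only {ready}/{desired} pods ready")
    # Utilization at or above the chosen fraction of the limit.
    if cpu_limit > 0 and cpu_used / cpu_limit >= threshold:
        alerts.append(f"CPU at {cpu_used / cpu_limit:.0%} of limit")
    return alerts


print(pod_alerts(desired=3, ready=2, cpu_used=950, cpu_limit=1000))
print(pod_alerts(desired=3, ready=3, cpu_used=100, cpu_limit=1000))
```

The first call reports both conditions, while the second returns an empty list because the workload is healthy.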

Why is Kubernetes monitoring a challenge?

Kubernetes abstracts away a lot of complexity to speed up application deployment, but in the process it leaves you blind to what is actually happening behind the scenes, what resources are being utilized, and even the cost implications of the actions being taken. In a Kubernetes world, the number of components is typically higher than in traditional infrastructure, which makes root cause analysis more difficult when things go wrong.

Centralized Logging For TKG using Fluentbit and vRealize Log Insight

Monitoring is one of the most important aspects of a production deployment. Logs are the savior when things go haywire in the environment, so capturing event logs from the infrastructure components is critical. Day-2 operations become easier if you have a comprehensive logging and alerting mechanism in place, as it allows for a quick response to failures in the infrastructure.

With the increasing footprint of K8 workloads in the datacenter, centralized monitoring for K8 is a must-have. Application developers who are focused on developing and deploying containerized applications are usually not well versed in the backend infrastructure.

So if a developer finds errors in the application logs, they might not realize that the issue is caused by an infrastructure event in the backend, because centralized logging is not in place and infrastructure logs are stored in a different location than the application logs.

The application and infrastructure logs should be aggregated so that it’s easier to identify the real problem that’s affecting the application.
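
To illustrate why aggregation helps, the following Python sketch merges two already-sorted log streams into a single timeline. The log entries are invented sample data, and this is not how Fluent Bit or vRealize Log Insight work internally. It only shows the idea of correlating application and infrastructure events by timestamp.

```python
import heapq


def merge_logs(app_logs, infra_logs):
    """Merge two timestamp-sorted lists of (timestamp, source, message) tuples."""
    # ISO-8601 timestamps in the same format sort correctly as plain strings.
    return list(heapq.merge(app_logs, infra_logs, key=lambda entry: entry[0]))


# Invented sample data for illustration.
app = [
    ("2021-06-01T10:00:05", "app", "HTTP 503 from backend"),
    ("2021-06-01T10:00:09", "app", "request retried"),
]
infra = [
    ("2021-06-01T10:00:03", "infra", "node not ready"),
]

for entry in merge_logs(app, infra):
    print(*entry)
```

In the merged view, the infrastructure event appears immediately before the application errors, which makes the likely cause much easier to spot.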

NSX ALB Upgrade Breaking AKO Integration

Recently I upgraded NSX ALB from 20.1.4 to 20.1.5 in my lab and observed odd behavior whenever I attempted to deploy or delete any Kubernetes workload of type LoadBalancer.
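
For context, a workload of type LoadBalancer is exposed through a Kubernetes Service whose spec.type is LoadBalancer. The Python sketch below builds such a manifest as a dictionary and prints it as JSON. The service name, label, and ports are hypothetical values chosen for this example, not the ones from my lab.

```python
import json


def loadbalancer_service(name, app_label, port, target_port):
    """Build a minimal Service manifest of type LoadBalancer."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Pods carrying this label receive the traffic.
            "selector": {"app": app_label},
            "ports": [
                {"port": port, "targetPort": target_port, "protocol": "TCP"}
            ],
        },
    }


# Hypothetical values for illustration.
manifest = loadbalancer_service("web-lb", "web", 80, 8080)
print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with kubectl, which accepts JSON manifests as well as YAML.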

The Issue

On deploying a new K8 application, AKO was unable to create a load balancer for the application. In the NSX ALB UI, I could see that a pool had been created and a VIP assigned, but no VS was present. I also verified that the ‘ako-essential’ role has the necessary permission “PERMISSION_VIRTUALSERVICE” to create a new VS.

On attempting to delete a K8 application, the application got deleted from the TKG side, but it left lingering items (VS, pools, etc.) in the ALB UI. To investigate further, I tried manually deleting the server pool and captured the output using the browser’s network inspector.

As expected, the delete operation failed with an error stating that the object being deleted is associated with an ‘L4PolicySet’.

But the L4PolicySet was empty.


Quick Tip – Restricting SSH Access to NSX ALB Service Engines

By default, a user can connect directly to a Service Engine via SSH using the system’s admin credentials. If there is a security requirement to restrict SSH connections, this access can be disabled using the following CLI configuration:

1: Connect to the NSX ALB controller and gain shell access

2: Run the following commands to disable admin SSH access to Service Engine.

Is restricting SSH enough from a security point of view?

Protecting TKG Workloads with Tanzu Mission Control Data Protection

Welcome to Part 3 of the Getting Started with Tanzu Mission Control series. In this post, I will discuss how you can leverage Tanzu Mission Control to protect Kubernetes workloads that are deployed on a Tanzu Kubernetes Grid cluster.

If you are new to Tanzu Mission Control, I would encourage you to read the previous articles in this series before diving into data protection for K8 workloads.

1: Tanzu Mission Control – Introduction & Architecture

2: Managing Tanzu Kubernetes Clusters with TMC

Tanzu Mission Control & Data Protection

Data protection in TMC is provided by Velero, an open-source project that came to VMware with the Heptio acquisition.

When data protection is enabled on a Kubernetes cluster, the backup data is stored outside TMC. TMC uses AWS S3 to store the backups.

Note: Data protection is not enabled on a Kubernetes cluster by default. In this post, I will demonstrate the steps for enabling data protection and the process of backing up and restoring K8 data.

Integrating Custom Registries with Tanzu Kubernetes Grid 1.3

Introduction

Tanzu Kubernetes Grid can be configured with a private registry for the rapid deployment of K8 workloads. Although there are a variety of container and artifact registries out there, Harbor has drawn attention because of its accessibility, ease of use, and rich feature set.

Although public registries are available on the internet, they might not contain everything you are looking for. In that case, you can create a custom Harbor registry to host custom K8 images for use within your organization. A standalone Harbor registry is also a perfect fit for an air-gapped TKG deployment.

In my last post, I documented the steps for deploying a private Harbor registry for TKG. This post will show how you can leverage the registry to push and pull images for your K8 deployment.

I have created a new project (named manish) in Harbor, and I will be pushing images to that custom project.
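
Images pushed to a Harbor project are addressed as registry/project/repository:tag. The Python sketch below composes such a reference and the matching docker tag and docker push commands. The registry hostname and image names are hypothetical placeholders, and only the project name manish comes from this post.

```python
def harbor_image_ref(registry, project, repository, tag="latest"):
    """Compose a Harbor image reference: registry/project/repository:tag."""
    return f"{registry}/{project}/{repository}:{tag}"


def push_commands(local_image, registry, project, repository, tag="latest"):
    """Return the docker commands that retag a local image and push it."""
    target = harbor_image_ref(registry, project, repository, tag)
    return [f"docker tag {local_image} {target}", f"docker push {target}"]


# harbor.example.com and nginx:1.21 are placeholder values.
for cmd in push_commands("nginx:1.21", "harbor.example.com", "manish", "nginx", "1.21"):
    print(cmd)
```

A docker login against the registry is typically required before the push will succeed.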

Deploying Harbor Registry for Tanzu Kubernetes Grid

Introduction

Harbor is an open-source registry used to store container images that are consumed by Docker and Kubernetes platforms. The images stored in the Harbor registry are secured using policies and role-based access control. Harbor delivers compliance, performance, and interoperability to help you consistently and securely manage artifacts across cloud-native compute platforms like Kubernetes and Docker.

Why Harbor?

Harbor not only provides a container registry but can also perform vulnerability scanning and trust signing of your Docker images. It also has a polished web interface for tasks such as RBAC, project creation, and user management.

Harbor supports the replication of images between registries and also offers advanced security features such as user management, access control, and activity auditing. 

Harbor Deployment Model

Harbor can be deployed either as a regular workload or as a K8 instance. Deploying it as a K8 instance is very handy if you already have a Kubernetes management cluster.

Tanzu Mission Control-Part 2-Manage Kubernetes Clusters From TMC

In the first post of this blog series, I talked about the Tanzu Mission Control solution and the benefits of using it. I also talked about the architecture and components of TMC. Now it’s time to see TMC in action. 

One of the core features of TMC is K8 cluster lifecycle management, and in this post I will walk through the steps of creating and managing a Kubernetes cluster from the TMC portal. Let’s get started.

TMC Login

To use the TMC solution, you must have a subscription to the Tanzu Mission Control cloud service. You can access the TMC portal by logging into your VMware Cloud Service portal and clicking on the VMware Tanzu Mission Control service tile. 

By default, you will land on the Clusters view, from where you can create a new Kubernetes cluster or attach an existing one to the TMC portal.

Note: For the purpose of this demonstration, I will be talking only about TKGm and TKGS clusters in this blog post.