Monitor Tanzu Kubernetes Cluster with Prometheus & Grafana

Introduction

Monitoring is the most important part of any infrastructure. Day-2 operations are heavily dependent on the monitoring/alerting/logging aspects. Containerized applications are now part of almost every environment and monitoring Kubernetes cluster eases the management of containerized infrastructure by tracking utilization of cluster resources.

As a Kubernetes operator, you would want to receive alerts if the desired number of pods are not running, if the resource utilization is approaching critical limits, or when failures or misconfiguration cause pods or nodes to become unable to participate in the cluster.

Why Kubernetes monitoring is a challenge?

Kubernetes abstracts away a lot of complexity to speed up application deployment; but in the process, it leaves you blind as to what is actually happening behind the scenes, what resources are being utilized, and even the cost implications of the actions being taken. In a Kubernetes world, the number of components is typically more than traditional infrastructure, which makes root cause analysis more difficult when things go wrong.Read More

Centralized Logging For TKG using Fluentbit and vRealize Log Insight

Monitoring is one of the most important aspects of a production deployment. Logs are the savior when things go haywire in the environment, so capturing event logs from the infrastructure pieces is very critical. Day-2 operations become easy if you have comprehensive logging and alerting mechanism in place as it allows for a quick response to failures in infrastructure. 

With the increasing footprint of K8 workloads in the datacenter, centralized monitoring for K8 is a must configure thing. The application developers who are focused on developing and deploying containerized applications are usually not well versed with backend infrastructure.

So if a developer finds any errors in the application logs, they might not find out that the issue is causing because of an infrastructure event in the backend, because centralized logging is not in place and infrastructure logs are stored in a different location than the application logs.

The application and infrastructure logs should be aggregated so that it’s easier to identify the real problem that’s affecting the application. Read More

NSX ALB Upgrade Breaking AKO Integration

Recently I upgraded NSX ALB from 20.1.4 to 20.1.5 in my lab and observed weird things whenever I attempted to deploy/delete any Kubernetes workload of type LoadBalancer.

The Issue

On deploying a new K8 application, AKO was unable to create a load balancer for the application. In NSX ALB UI, I can see that a pool has been created and a VIP assigned but no VS is present. I have also verified that the ‘ako-essential’ role has the necessary permission “PERMISSION_VIRTUALSERIVCE”  to create any new VS.

On attempting to delete a K8 application, the application got deleted from the TKG side, but it left lingering items (VS, Pools, etc) in the ALB UI. To investigate more on the issue, I manually tried deleting the server pool and captured the output using the browser network inspect option. 

As expected the delete operation failed with the error that the object that you are trying to delete is associated with ‘L4PolicySet’

But the l4policyset was empty

Read More

Quick Tip – Restricting SSH Access to NSX ALB Service Engines

By default, the user can connect directly to a Service Engine via SSH using the system’s admin credentials. If there is a security requirement to restrict SSH connection, it is possible to disable this access using the following CLI configuration:

1: Connect to the NSX ALB controller and gain shell access

2: Run the following commands to disable admin SSH access to Service Engine.

Is restricting SSH enough from the security point of view? Read More

Protecting TKG Workloads with Tanzu Mission Control Data Protection

Welcome to Part-3 of the getting started with Tanzu Mission Control. In this post, I will discuss how you can leverage Tanzu Mission Control to protect your Kubernetes workloads that are deployed on the Tanzu Kubernetes Grid cluster. 

If you are new to Tanzu Mission Control, I would encourage you to read previous articles of this series before diving into data protection for K8 workloads.

1: Tanzu Mission Control – Introduction & Architecture

2: Managing Tanzu Kubernetes Clusters with TMC

Tanzu Mission Control & Data Protection

Data protection in TMC is provided by Velero which is an open-source project that came with the Heptio acquisition.

When data protection is enabled on a Kubernetes cluster, the data backup is stored external to the TMC. TMC leverages AWS S3 functionality to store the backups. 

Note: Data protection is not enabled on the Kubernetes cluster by default. In this post, I will demonstrate the steps of enabling data protection and the process of backup and restoration of K8 data. Read More

Integrating Custom Registries with Tanzu Kubernetes Grid 1.3

Introduction

Tanzu Kubernetes Grid can be configured with a private registry for the rapid deployment of K8 workloads. Although there are a variety of container and artifact registries out there, Harbor has drawn attention because of its accessibility and ease of use, and rich feature set.

Although public registries are out there on the internet, they might contain everything you are looking for. In that case, you can create a custom Harbor registry to push custom K8 images to be used within your organization. A standalone Harbor registry is a perfect use case for an air-gapped TKG deployment.

In my last post, I have documented the steps of deploying a private Harbor registry for TKG. This post will show how you can leverage the registry to push/pull images for your K8 deployment. 

I have created a new project (named manish) in Harbor and I will be pushing images in that custom project.Read More

Deploying Harbor Registry for Tanzu Kubernetes Grid

Introduction

Harbor is an open-source registry that is used to store the containerized images that will be consumed by the Docker/Kubernetes platform. The images stored in the Harbor registry are secured using policies and role-based access control. Harbor, delivers compliance, performance, and interoperability to help you consistently and securely manage artifacts across cloud-native compute platforms like Kubernetes and Docker.

Why harbor

Harbor not only provides a container registry but also can do vulnerability scanning and trust signing of your docker images. It also has a really smooth web interface that allows you to do things like RBAC, project creation, user management, and more.

Harbor supports the replication of images between registries and also offers advanced security features such as user management, access control, and activity auditing. 

Harbor Deployment Model

Harbor can be deployed both as a regular workload or as a K8 instance. Deploying as a K8 instance is very handy if you already have a Kubernetes management cluster.Read More

Tanzu Mission Control-Part 2-Manage Kubernetes Clusters From TMC

In the first post of this blog series, I talked about the Tanzu Mission Control solution and the benefits of using it. I also talked about the architecture and components of TMC. Now it’s time to see TMC in action. 

One of the core features of TMC is K8 cluster lifecycle management and in this post, I will walk through the steps of creating and managing the Kubernetes cluster from the TMC portal. Let’s get started.

TMC Login

To use the TMC solution, you must have a subscription to the Tanzu Mission Control cloud service. You can access the TMC portal by logging into your VMware Cloud Service portal and clicking on the VMware Tanzu Mission Control service tile. 

By default, you will land to the Clusters view from where you can create/attach the existing Kubernetes cluster with the TMC portal.

Note: For the purpose of this demonstration, I will be talking only about TKGm and TKGS cluster in this blog post. Read More

Tanzu Mission Control-Part 1-Introduction & Architecture

VMware Tanzu is a portfolio of products and services that enables customers to build modern applications on the Kubernetes platform and manage them from a single control point. The Tanzu portfolio is pretty vast and includes products and services like:

1: Tanzu Kubernetes Grid

2: vSphere with Tanzu

3: Tanzu Mission Control

In this blog post, I will be talking about what is Tanzu Mission Control and why it is important for you.

What is Tanzu Mission Control (TMC) ?

Tanzu Mission Control is a SaaS offering available through VMware Cloud Services and provides:

  • A centralized platform to deploy and manage Kubernetes clusters across multiple clouds.
  • Attach existing Kubernetes Clusters in the TMC portal for centralized operations and management.
  • A Policy Engine that automates Access control and security policies across a fleet of clusters.
  • Manage security across multiple clusters.
  • Centralize authentication and authorization, with federated identity from multiple sources.

Why you need Tanzu Mission Control?

Read More

Tanzu Kubernetes Grid 1.3 Deployment with NSX ALB in VMC

Tanzu Kubernetes Grid 1.3 brought many enhancements with it and one of them was the support for NSX Advanced Load Balancer for load balancing the Kubernetes based workloads. TKG with NSX ALB is fully supported in VMC on AWS. In this post, I will talk about the deployment of TKG v1.3 in VMC. 

In this post, I will not cover the steps of NSX ALB deployment as I have already documented it here

Prerequisites

Before starting the TKG deployment in VMC, make sure you have met the following prerequisites:

  • SDDC is deployed in VMC and outbound access to vCenter is configured. 
  • Segments for NSX ALB (Mgmt & VIP) are created.
  • NSX ALB Controllers and Service Engines are deployed and controllers’ initial configuration is completed. 

Deployment Steps

Create Logical Segments & Configure DHCP

Create 2 DHCP enabled logical segments, (one for the TKG Management and one for the TKG Workload) in your SDDC by navigating to Networking & Security > Network > Segments.Read More