Centralized Logging for TKG Using Fluent Bit and vRealize Log Insight

Monitoring is one of the most important aspects of a production deployment. Logs are the savior when things go haywire in the environment, so capturing event logs from the infrastructure components is critical. Day-2 operations become easier when a comprehensive logging and alerting mechanism is in place, as it allows a quick response to infrastructure failures.

With the growing footprint of Kubernetes workloads in the datacenter, centralized monitoring for Kubernetes is a must. Application developers who focus on building and deploying containerized applications are usually not well versed in the backend infrastructure.

So when a developer finds errors in the application logs, they might not realize that the issue is caused by an infrastructure event in the backend, because centralized logging is not in place and the infrastructure logs are stored in a different location than the application logs.

The application and infrastructure logs should be aggregated so that it’s easier to identify the real problem that’s affecting the application. 

Log Forwarding for Tanzu Kubernetes Grid

Log processing and forwarding in Tanzu Kubernetes Grid is provided by Fluent Bit, which is available as a TKG extension. It gathers logs from TKG management and workload clusters and forwards them to a supported destination, such as Elasticsearch, Kafka, Splunk, or an HTTP endpoint.

This post focuses on demonstrating log forwarding from TKG to vRealize Log Insight Cloud.

It’s time to jump into the lab and see things in action.

Step 1: Configure vRealize Log Insight Cloud

Before you can configure your TKG instance to send logs to vRealize Log Insight Cloud, you need to create an API Key which the log forwarders will use to authenticate against the vRLI Cloud instance. 

To generate a new API key, log in to the vRLI Cloud instance through the VMware Cloud Console portal, navigate to Configuration > API Keys, and click New API Key.

Provide a name for the API key and click the Create button.

Once the API key is generated, a URL and the API key are displayed on the screen. Make a note of both, as you will need them when configuring Fluent Bit in later steps.

Step 2: Install Carvel Tools

Before installing the Fluent Bit extension, ensure that you have met the following prerequisites:

  • A TKG workload cluster is deployed.
  • ytt is installed.
  • kapp is installed.
  • cert-manager is installed on the workload cluster.

2.1: Download TKG Extensions from Here

2.2: Upload the TKG Extensions tar file to the machine from which you manage your TKG clusters.

2.3: Extract the extension file using tar or a similar extraction tool.
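As a minimal sketch of step 2.3, assuming the downloaded bundle is a gzip-compressed tar archive. The file name below is hypothetical, so substitute the name of the file you actually downloaded:

```shell
# Hypothetical file name: replace with the bundle you downloaded.
BUNDLE="${BUNDLE:-tkg-extensions-manifests.tar.gz}"

if [ -f "$BUNDLE" ]; then
  # -x extract, -z decompress gzip, -f read from the named archive
  tar -xzf "$BUNDLE"
else
  echo "Bundle not found: $BUNDLE"
fi
```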

Note: Instructions for installing ytt and kapp are documented here.

2.4: Install Cert Manager on Workload cluster

Switch to the TKG workload cluster context and run the below commands to install the cert-manager extension. 
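The commands for this step would look roughly like the sketch below. The context name is hypothetical, and the `cert-manager` directory is assumed to sit at the top of the extracted extensions bundle, so check both against your environment:

```shell
# Hypothetical context name: list yours with `kubectl config get-contexts`.
WORKLOAD_CONTEXT="${WORKLOAD_CONTEXT:-my-workload-admin@my-workload}"
# Assumed location of the cert-manager manifests inside the extracted bundle.
CERT_MANAGER_DIR="cert-manager"

if command -v kubectl >/dev/null 2>&1 && [ -d "$CERT_MANAGER_DIR" ]; then
  kubectl config use-context "$WORKLOAD_CONTEXT" \
    && kubectl apply -f "$CERT_MANAGER_DIR/" \
    || echo "cert-manager install failed; check the context and manifests"
else
  echo "Skipped: kubectl or the $CERT_MANAGER_DIR directory is not available"
fi
```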

Step 3: Deploy fluent-bit extension on Workload cluster

3.1: Create fluent-bit namespace
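A sketch of the command for this step, assuming you are in the Fluent Bit directory of the extracted bundle and that the manifest is named `namespace-role.yaml` (verify the file name in your copy of the extensions):

```shell
# Assumed manifest name from the TKG extensions bundle.
NS_MANIFEST="namespace-role.yaml"

if command -v kubectl >/dev/null 2>&1 && [ -f "$NS_MANIFEST" ]; then
  kubectl apply -f "$NS_MANIFEST" \
    || echo "Applying $NS_MANIFEST failed"
else
  echo "Skipped: kubectl or $NS_MANIFEST is not available"
fi
```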

The above command creates the Fluent Bit namespace, a service account, and the necessary role bindings.

3.2: Prepare the yaml file for fluent bit deployment

Copy ‘<LOG_BACKEND>/fluent-bit-data-values.yaml.example’ to ‘<LOG_BACKEND>/fluent-bit-data-values.yaml’

Note: vRLI is configured as HTTP endpoint in the fluent bit configuration, so the corresponding command is:
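With `http` substituted for `<LOG_BACKEND>`, the copy would be along these lines (run from the Fluent Bit directory of the extracted bundle, which is an assumption about where you are working):

```shell
SRC="http/fluent-bit-data-values.yaml.example"
DST="http/fluent-bit-data-values.yaml"

if [ -f "$SRC" ]; then
  # Keep the .example file untouched and edit the copy.
  cp "$SRC" "$DST"
else
  echo "Example file not found: $SRC"
fi
```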

3.3: Configure fluent-bit data values

Modify the http/fluent-bit-data-values.yaml file as shown below.

Here, the instance name is the name of the management cluster, and the cluster name is the workload cluster on which you are installing the Fluent Bit extension.

Authorization Bearer is the API key that you have generated from the vRLI Cloud instance. Point the host entry to ‘data.mgmt.cloud.vmware.com’
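To make the values described above concrete, the sketch below writes them to a scratch file that you can compare against your copy of the data values file. The key names reflect my recollection of the example file's layout and should be checked against the `.example` file in your bundle. The port, format, and all angle-bracket placeholders are assumptions to replace with your own values:

```shell
# Scratch file for illustration only: merge these values into
# http/fluent-bit-data-values.yaml rather than replacing that file.
SNIPPET="fluent-bit-vrli-values-snippet.yaml"

cat > "$SNIPPET" <<'EOF'
tkg:
  instance_name: "<MANAGEMENT_CLUSTER_NAME>"
  cluster_name: "<WORKLOAD_CLUSTER_NAME>"
fluent_bit:
  output_plugin: "http"
  http:
    host: "data.mgmt.cloud.vmware.com"
    port: "443"        # assumption: the endpoint is served over HTTPS
    uri: "<PATH_FROM_THE_URL_SHOWN_WITH_THE_API_KEY>"
    header_key_value: "Authorization Bearer <API_KEY>"
    format: "json"     # assumption
EOF

echo "Wrote $SNIPPET"
```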

3.4: Create a secret for your log backend
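A sketch of the secret creation, assuming the extension reads its values from a secret named `fluent-bit-data-values` in the `tanzu-system-logging` namespace. Both names are assumptions based on the TKG extensions layout, so confirm them in the extension manifest:

```shell
VALUES_FILE="http/fluent-bit-data-values.yaml"
NAMESPACE="tanzu-system-logging"      # assumed namespace
SECRET_NAME="fluent-bit-data-values"  # assumed secret name

if command -v kubectl >/dev/null 2>&1 && [ -f "$VALUES_FILE" ]; then
  # Store the data values file under the key values.yaml.
  kubectl create secret generic "$SECRET_NAME" \
    --from-file=values.yaml="$VALUES_FILE" \
    -n "$NAMESPACE" \
    || echo "Creating secret $SECRET_NAME failed"
else
  echo "Skipped: kubectl or $VALUES_FILE is not available"
fi
```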

3.5: Deploy fluent-bit extension
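The deployment itself is a single apply. The manifest name `fluent-bit-extension.yaml` is an assumption about the bundle contents, so adjust it if your copy differs:

```shell
# Assumed manifest name from the TKG extensions bundle.
EXTENSION_MANIFEST="fluent-bit-extension.yaml"

if command -v kubectl >/dev/null 2>&1 && [ -f "$EXTENSION_MANIFEST" ]; then
  kubectl apply -f "$EXTENSION_MANIFEST" \
    || echo "Applying $EXTENSION_MANIFEST failed"
else
  echo "Skipped: kubectl or $EXTENSION_MANIFEST is not available"
fi
```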

3.6: Retrieve the status of fluent-bit extension
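A sketch of the status check, assuming the extension is deployed as a kapp-controller App named `fluent-bit` in the `tanzu-system-logging` namespace (both names are assumptions):

```shell
APP_NAME="fluent-bit"             # assumed App name
NAMESPACE="tanzu-system-logging"  # assumed namespace

if command -v kubectl >/dev/null 2>&1; then
  # The description column reports the reconcile state of the app.
  kubectl get app "$APP_NAME" -n "$NAMESPACE" \
    || echo "Could not read the status of app $APP_NAME"
else
  echo "Skipped: kubectl is not available"
fi
```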


Note: The Fluent Bit app status should change to ‘Reconcile Succeeded’ once Fluent Bit is deployed successfully.

Step 4: Verify that the vRLI Cloud instance is receiving logs from the workload cluster

You can run queries against the logs, save the search queries, and build dashboards on top of them.

And that’s it for this post. I hope you enjoyed reading this post. Feel free to share this on social media if it is worth sharing.
