Troubleshooting TMC Self-Managed Stuck Deployment in VCD

My previous blog post discussed the VCD Extension for Tanzu Mission Control and covered the end-to-end deployment steps. In this post, I will cover how to troubleshoot a stuck TMC self-managed deployment in VCD.

I was deploying TMC self-managed in a new environment, and during configuration, I made a mistake by passing an incorrect value for the DNS zone, leading to a stuck deployment that did not terminate automatically. I waited for a couple of hours for the task to fail, but the task kept on running, thus preventing me from installing it with the correct configuration.

The deployment was stalled in the Creating phase and did not fail.

On checking the pods in the tmc-local namespace, I found that many of them were stuck in either the “CreateContainerConfigError” or the “CrashLoopBackOff” state.
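A quick way to list only the stuck pods is to filter the pod list by container waiting reason. The snippet below is a minimal sketch, not part of the original procedure: it assumes `kubectl` is on the PATH and pointed at the cluster hosting TMC self-managed, and the helper names `stuck_pods` and `kubectl_pods_json` are made up for illustration.

```python
import json
import subprocess

# The two states seen in the tmc-local namespace during the stuck deployment.
STUCK_REASONS = {"CreateContainerConfigError", "CrashLoopBackOff"}


def stuck_pods(pod_list: dict) -> list[tuple[str, str]]:
    """Return (pod name, waiting reason) pairs for containers in a stuck state.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    Only regular containers are checked; init containers are ignored.
    """
    found = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            reason = status.get("state", {}).get("waiting", {}).get("reason")
            if reason in STUCK_REASONS:
                found.append((name, reason))
    return found


def kubectl_pods_json(namespace: str = "tmc-local") -> dict:
    """Fetch the pod list for a namespace by shelling out to kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Usage (needs access to the cluster):
#   for name, reason in stuck_pods(kubectl_pods_json()):
#       print(name, reason)
```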

In VCD, when I checked the failed task “Execute global ‘post-create’ action”, I found the installer was complaining that the reconciliation of the tmc package installation had failed.

Because the product is new, the documentation provides little guidance on how to troubleshoot issues like this. After discussing the problem with the Engineering team, I was able to resolve it. Once again, the VCD API saved the day. Here is the conclusion of the discussion:

This is a known issue with the solution add-on agent. The subtask timed out after 2 hours, but the task status was not updated because VCD killed its agent; VCD uses a fixed timeout of 2 hours for the add-on agent. It is better to set a smaller timeout value when creating the TMC instance, for example 5400s.

Here are the steps I followed:

Disclaimer: Before running the commands below in a production environment, consult the GSS team.

1: (Optional) Export Environment variables
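The exact variables exported in this step are not shown here, so the following is only a sketch of the idea: keep the connection details in the environment and read them once. The variable names `VCD_HOST`, `VCD_USER`, `VCD_PASSWORD`, and `VCD_API_VERSION`, as well as the default API version `37.0`, are assumptions; pick a version that matches your VCD release.

```python
import os


def load_vcd_settings(env=None) -> dict:
    """Read VCD connection settings from environment variables.

    All variable names here are hypothetical; rename them to whatever
    you exported in your shell.
    """
    env = os.environ if env is None else env
    required = ("VCD_HOST", "VCD_USER", "VCD_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["VCD_HOST"],
        "user": env["VCD_USER"],
        "password": env["VCD_PASSWORD"],
        # Assumed default; adjust to the API version your VCD supports.
        "api_version": env.get("VCD_API_VERSION", "37.0"),
    }
```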

2: Generate VCD Auth Token
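As a rough illustration of this step, the sketch below builds a provider session request against the VCD CloudAPI and reads the bearer token from the `X-VMWARE-VCLOUD-ACCESS-TOKEN` response header. It is not the exact command used in this post; the host, credentials, and API version are placeholders, and the function names are made up for illustration.

```python
import base64
import urllib.request

TOKEN_HEADER = "X-VMWARE-VCLOUD-ACCESS-TOKEN"


def build_session_request(host, user, password, org="System", api_version="37.0"):
    """Build the POST request that opens a provider (System org) session."""
    credentials = f"{user}@{org}:{password}".encode()
    basic = base64.b64encode(credentials).decode()
    return urllib.request.Request(
        url=f"https://{host}/cloudapi/1.0.0/sessions/provider",
        method="POST",
        headers={
            "Accept": f"application/json;version={api_version}",
            "Authorization": f"Basic {basic}",
        },
    )


def fetch_token(request, context=None):
    """Send the session request and return the bearer token (None if absent).

    Pass an ssl.SSLContext as `context` if VCD uses a certificate that the
    default trust store does not know about.
    """
    with urllib.request.urlopen(request, context=context) as response:
        return response.headers[TOKEN_HEADER]


# Usage (placeholders, needs network access to VCD):
#   token = fetch_token(build_session_request("vcd.example.com", "administrator", "secret"))
```

The request builder is kept separate from the network call so that the URL and headers can be inspected before anything is sent.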

3: Retrieve the TMC-SM RDE

4: Mark the TMC-SM instance as failed
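Retrieving the RDE and marking the instance as failed amount to a read-modify-write against the VCD defined entities API: GET the entity, change its status, and PUT the whole document back. The sketch below shows that pattern only. The text does not say which field of the TMC-SM RDE holds the status, so the caller must supply `status_path` and the new value after inspecting the retrieved entity; the RDE ID, the token, and all function names are placeholders.

```python
import copy
import json
import urllib.request


def _headers(token, api_version):
    return {
        "Accept": f"application/json;version={api_version}",
        "Authorization": f"Bearer {token}",
    }


def build_get_rde_request(host, token, rde_id, api_version="37.0"):
    """Build the GET request that fetches one RDE by its URN."""
    return urllib.request.Request(
        url=f"https://{host}/cloudapi/1.0.0/entities/{rde_id}",
        method="GET",
        headers=_headers(token, api_version),
    )


def with_status(rde, status_path, value):
    """Return a copy of the RDE with one existing field under `entity` replaced.

    `status_path` is a sequence of keys below rde["entity"]. It is NOT known
    from this post; look at the retrieved RDE to find the right path.
    A KeyError is raised if the path does not exist, so a wrong guess
    cannot silently add a new field.
    """
    updated = copy.deepcopy(rde)
    node = updated["entity"]
    for key in status_path[:-1]:
        node = node[key]
    if status_path[-1] not in node:
        raise KeyError(status_path[-1])
    node[status_path[-1]] = value
    return updated


def build_put_rde_request(host, token, rde, api_version="37.0"):
    """Build the PUT request that writes the modified RDE back."""
    headers = _headers(token, api_version)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"https://{host}/cloudapi/1.0.0/entities/{rde['id']}",
        data=json.dumps(rde).encode(),
        method="PUT",
        headers=headers,
    )
```

Each request can be sent with `urllib.request.urlopen(...)` and the GET response parsed with `json.load(...)`. Keeping the modification in a pure function makes it easy to review the changed document before sending the PUT.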

After forcefully failing the TMC-SM instance, the deletion went through without issues and the instance was cleaned up.

And that’s it for this post. In the next post, I will discuss one more troubleshooting scenario that I encountered in my lab. Stay tuned!!!

I hope you enjoyed reading this post. Feel free to share this on social media if it is worth sharing.
