In a shared storage environment, when multiple hosts access the same VMFS datastore, specific locking mechanisms are used. These locking mechanism prevent multiple hosts from concurrently writing to the metadata and ensure that no data corruption occurs. This type of locking is called distributed locking.
To counter the situation of Esxi host crashing, distributed locks are implemented as lease based. An Esxi host that holds lock on datastore has to renew the lease for the lock time and again and via this Esxi host lets the storage know that he is alive and kicking. If an Esxi host has not renewed the locking lease for some time, then it means that host is probably dead.
If the current owner of the lock do not renews the lease for a certain period of time, another host can break that lock. If an Esxi host wants to access a file that was locked by some other host, it looks for the time-stamp of that file and if it finds that the time-stamp has not been increased for quite a bit of time, it can remove the stale locks and can place its own lock.… Read More