Storage I/O Control (SIOC) was first introduced in vSphere 4.1 and has been improving with every release of vSphere since. It is one of those features that easily escapes the eye of a vSphere administrator while architecting or configuring an environment.
Storage is typically the slowest resource compared to its counterparts such as CPU and memory, and when a storage bottleneck occurs in an environment, virtual machines can suffer serious performance degradation.
The introduction of Storage DRS (SDRS) in vSphere 5.0 made the life of a vSphere administrator a bit easier, as SDRS balances datastores when I/O imbalances start to occur in the environment. Although this sounds great, there is one caveat: SDRS cannot prevent a virtual machine from monopolizing I/O consumption. In other words, SDRS cannot ensure fair distribution of I/O among virtual machines when contention occurs, and as a result a few virtual machines may suffer performance impacts.
So what is Storage IO Control?
Storage I/O Control is an I/O queue-throttling mechanism that is enabled at the datastore level and allows prioritization of storage resources during periods of contention across the cluster. Since storage is shared among virtual machines, application performance can at times be impacted when virtual machines contend for I/O resources.
Storage I/O Control provides much needed control of storage I/O and should be used to ensure that the performance of your critical VMs is not affected by VMs running on other hosts when there is contention for I/O resources.
Note that SIOC is disabled by default and an administrator has to manually enable it on each datastore.
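If you want a quick inventory of which datastores already have SIOC enabled, a short PowerCLI query like the one below can help. This is only a sketch; the vCenter name is a placeholder and it assumes a reasonably recent PowerCLI version.

    # Connect to vCenter (server name is a placeholder)
    Connect-VIServer -Server vcenter.lab.local

    # List all datastores with their SIOC status and congestion threshold (in milliseconds)
    Get-Datastore |
        Select-Object Name, StorageIOControlEnabled, CongestionThresholdMillisecond |
        Format-Table -AutoSize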
When SIOC is enabled on a datastore, each ESXi host starts to monitor that datastore and records the device latency it observes while communicating with it.
When device latency exceeds a configured threshold, the datastore is marked as congested and each VM on that datastore is allocated I/O resources in proportion to its shares. By default all VM shares are set to Normal (1000). Shares are set per VMDK and can be configured as High, Normal or Low, or given a custom value.
The image below, from VMware, shows how SIOC prioritizes Tier-1 VMs over others.
I have turned on SIOC, what's next?
It is very important to understand that merely enabling SIOC does not mean that your critical business applications will not be impacted during I/O contention. Enabling SIOC on its own only guarantees that each VMDK gets equal access to the datastore.
Configuring shares on virtual machines dictates how IOPS are distributed among competing virtual machines and VMDKs. A virtual machine with High shares gets more IOPS than a virtual machine configured with Low or Normal shares; for example, during contention a VMDK with 2000 shares is entitled to roughly twice the I/O of a VMDK with 1000 shares on the same datastore.
How does Storage I/O Control work?
When SIOC is enabled on a datastore, a file called .iormstats.sf is created on that datastore. It is a shared file accessed by all hosts connected to the datastore. Each ESXi host periodically writes its average latency and the number of I/Os for that datastore into the file, which enables all hosts to read the file and compute the datastore-wide latency.
The datastore-wide disk scheduler prioritizes some virtual machines over others depending on the number of shares assigned to their respective disks. It does this by calculating the I/O slot entitlement, but only once the configurable latency threshold has been exceeded.
The datastore-wide disk scheduler sums up the disk shares of all the VMDKs of the virtual machines on the datastore, calculates each host's I/O slot entitlement based on those shares, and throttles the device queue accordingly. Once SIOC is engaged, it assigns fewer I/O queue slots to virtual machines with lower shares and more I/O queue slots to virtual machines with higher shares.
Note: The maximum number of I/O queue slots that can be used by the virtual machines on a given host cannot exceed the maximum device queue depth of that ESXi host. Duncan Epping has written an excellent article explaining SIOC and queue depth with an example.
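To make the math concrete, here is a hypothetical example (the queue depth and share values below are assumptions for illustration, not measurements):

- Device queue depth on the host: 64 slots.
- VMDK-A has 2000 shares, VMDK-B and VMDK-C have 1000 shares each (4000 shares in total, all on the same host and datastore).
- When SIOC throttles, VMDK-A is entitled to roughly 64 x 2000/4000 = 32 queue slots, while VMDK-B and VMDK-C each get roughly 64 x 1000/4000 = 16 slots.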
How does SIOC calculate storage device IOPS capabilities?
In vSphere 5.0 the concept of the injector was introduced to determine the performance characteristics of a datastore. To ensure that the Storage DRS I/O-balancing placement logic is not based solely on a threshold value and observed latency, SIOC also characterizes datastores by injecting random I/O.
The injector issues read-only I/O to characterize datastore performance, and it does so only when a datastore is idle. If the injector has detected an idle datastore and started injecting I/O, but then detects I/O from other sources, it stops immediately and waits for the next idle period before retrying.
To characterize the device, different numbers of outstanding I/Os are used and the resulting latency is monitored. In other words, random read I/O is injected, each time with a different number of outstanding I/Os. The outcome can be plotted on a graph, and the slope of that graph indicates the performance of the datastore.
What's new with SIOC in vSphere 6?
In vSphere 5.5 a new and improved scheduler called mClock was introduced, which has the capability to support I/O reservations. In vSphere 6 VMware added the ability to set those reservations at the VMDK level. This feature is not present in the vSphere UI and is only exposed through the VIM API. William Lam has developed a PowerCLI script for it, which can be downloaded from here
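As a rough illustration of what such a script does, the minimal Get-View sketch below edits a VMDK's StorageIOAllocation through the VIM API. The VM name and the 500 IOPS value are made up for the example; use William Lam's script for anything real.

    # Hypothetical example: set a 500 IOPS reservation on the first virtual disk of VM "App01"
    $vmView = Get-VM -Name "App01" | Get-View
    $disk   = $vmView.Config.Hardware.Device |
              Where-Object { $_ -is [VMware.Vim.VirtualDisk] } |
              Select-Object -First 1

    # Build a reconfigure spec that edits the disk's StorageIOAllocation
    $devSpec = New-Object VMware.Vim.VirtualDeviceConfigSpec
    $devSpec.Operation = "edit"
    $devSpec.Device = $disk
    if (-not $devSpec.Device.StorageIOAllocation) {
        $devSpec.Device.StorageIOAllocation = New-Object VMware.Vim.StorageIOAllocationInfo
    }
    $devSpec.Device.StorageIOAllocation.Reservation = 500   # IOPS reservation (assumed value)

    $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $spec.DeviceChange = @($devSpec)
    $vmView.ReconfigVM_Task($spec)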
Requirements for SIOC?
This article from VMware lists all requirements for SIOC.
Configure/Manage Storage I/O Control
To configure SIOC, log in to the vSphere Web Client and from the home page navigate to the Storage view.
Select a datastore, then go to Manage > Settings > General and click the Edit button.
The default congestion threshold setting in vSphere 6 is a percentage of peak throughput, which is generally the recommended option. You can also set a custom latency value if you want; the appropriate value depends on the type of disks backing the datastore. The storage-layer SLA requirements of the environment and the capabilities of the physical storage array are contributing factors in how you design the SIOC threshold values.
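The same settings can also be applied from PowerCLI. Below is a minimal sketch that enables SIOC and sets a custom 30 ms congestion threshold on a datastore; the datastore name and the 30 ms value are just examples.

    # Enable SIOC and set a custom congestion threshold of 30 ms (example values)
    Get-Datastore -Name "Datastore01" |
        Set-Datastore -StorageIOControlEnabled $true -CongestionThresholdMillisecond 30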
Next, define shares for the business-critical virtual machines. These shares are defined at the disk level. Open the virtual machine properties, expand the hard disk and locate Shares. You can select Low, Normal or High, or set a custom share value.
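If you prefer PowerCLI for this step, disk shares can also be set with Set-VMResourceConfiguration. The sketch below gives every disk of a hypothetical VM named App01 High shares:

    # Give all hard disks of VM "App01" High disk shares (VM name is a placeholder)
    $vm = Get-VM -Name "App01"
    $vm | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -Disk (Get-HardDisk -VM $vm) -DiskSharesLevel High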
Monitor Storage I/O Control
We can monitor how Storage I/O Control handles the I/O workloads of the virtual machines accessing a datastore based on their shares. The datastore performance charts allow monitoring of:
- Average latency and aggregated IOPS on the datastore.
- Latency among hosts.
- Queue depth among hosts.
- Read/write IOPS among hosts.
- Read/write latency among virtual machine disks.
- Read/write IOPS among virtual machine disks.
To view the datastore performance charts, log in to the vSphere Web Client and navigate to Datastore > Monitor > Performance. Select Performance from the View drop-down menu and choose the time range for which you want to view the charts.
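Outside the charts, similar counters can be pulled with Get-Stat for scripting or alerting. The sketch below reads the last hour of per-datastore read/write latency as seen by one host; the host name is a placeholder, and availability of these metrics depends on your statistics collection level.

    # Per-datastore read/write latency observed by host esx01 over the last hour (instances are datastore UUIDs)
    Get-Stat -Entity (Get-VMHost -Name "esx01.lab.local") `
             -Stat "datastore.totalReadLatency.average","datastore.totalWriteLatency.average" `
             -Start (Get-Date).AddHours(-1)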
SIOC logging
By default, SIOC logging is disabled. To enable logging:
1. Click Host Advanced Settings.
2. In the Misc section, select the Misc.SIOControlLogLevel parameter. Set the value to 7 for complete logging. (Min value: 0 (no logging), Max value: 7)
3. SIOC needs to be restarted for the log level change to take effect. To stop and start SIOC manually, use /etc/init.d/storageRM {start|stop|status|restart}.
4. After changing the log level, you will see the log level change logged in the /var/log/vmkernel logs.
Note: SIOC log files are saved in /var/log/vmkernel.
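The same advanced setting can also be changed from PowerCLI if you need to do it on several hosts; a minimal sketch (the host name is a placeholder) looks like this:

    # Raise SIOC logging to the maximum level (7) on one host, then restart storageRM on that host to apply it
    Get-AdvancedSetting -Entity (Get-VMHost -Name "esx01.lab.local") -Name "Misc.SIOControlLogLevel" |
        Set-AdvancedSetting -Value 7 -Confirm:$false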
Sources and Inspirations
Clustering Deep Dive book by Duncan Epping and Frank Denneman.
vSphere Storage Interoperability Series: SIOC
Storage I/O Control (SIOC) Overview
Debunking Storage I/O Control Myths
Additional Reading
SIOC: I/O Distribution with Reservations & Limits – Part 1
SIOC: I/O Distribution with Reservations & Limits – Part 2
How SIOC works with Storage IO reservation
IOPS reservations in SIOC 6 , what’s the deal?
I hope you find this post informative. Feel free to share it on social media if you think it is worth sharing. Be sociable 🙂