Virtualization is difficult to manage given its many moving parts, from storage to networking to hardware. When you have a dynamic VMware environment with Distributed Resource Scheduler (DRS) and High Availability (HA) enabled, virtual machines (VMs) in the environment can transition through multiple hosts and clusters and can potentially become unregistered VMs. This can lead a VMware administrator to lose visibility of these VMs. In addition, each VM in a datacenter could cost from a couple hundred dollars into the thousands (http://roitco.vmware.com), depending on your environment and infrastructure costs.
In this blog post I will cover three types of VMs that can exist in your VMware infrastructure and require additional attention. The definitions of these VMs vary, but I’m sure you will be able to recognize them regardless of the name I give them.
Zombie VM: A virtual machine that uses less than a certain amount of CPU for a period of time. (Example: a VM using less than 5% CPU for over a thirty-day period.) Since zombie VMs run at very low CPU usage, they could be repurposed to run other applications when needed.
Chatty VM (opposite of zombie): A virtual machine that uses more than a certain amount of CPU for a period of time. (Example: a VM using more than 80% CPU over a week.) Chatty VMs are the ones most likely to be moved from one ESXi host to another by vMotion based on utilization.
Orphan VM: There are multiple definitions for this type of VM. Here are just some examples of what an orphan VM can look like:
On many occasions, actively running orphan VMs are a security concern, since they are not visible to vCenter Server and thus the VMware administrator is unaware of them as well. These VMs will not be patched and can go undetected by compliance and operational audits.
Orphan VMs occur for reasons such as the following:
In order to gather information from a complex environment like VMware, we will need to collect performance, log and configuration data from vCenter Server and ESXi hosts.
Splunk App for VMware provides deep operational visibility into granular performance metrics, logs, tasks and events and topology from hosts, virtual machines and virtual centers.
Splunk App for VMware provides:
Going back to the basics of core Splunk, we can create our own searches, reports, alerts and dashboards on top of any Splunk app. With these additional dashboards we can identify, validate and repurpose the VMs mentioned above.
Let's go ahead and identify zombie, chatty and orphan VMs with a custom search.
(sourcetype=vmware:perf:cpu source=VMPerf:VirtualMachine) OR (sourcetype=vmware:inv:vm changeSet.name=*) | eval detect = if(p_average_cpu_usage_percent < 5.00, "zombie", if(p_average_cpu_usage_percent > 80.00, "chatty", "normal")) | stats first(detect) as "CPU Status" by moid
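To make the threshold logic in the search easier to follow, here is a minimal Python sketch of the same classification outside of Splunk. The 5% and 80% thresholds come from the examples above. The function names, the sample VM ids and the CPU readings are hypothetical. Note one simplification: this sketch averages all samples per VM, whereas the search takes the first classified value per moid.

```python
from statistics import mean

ZOMBIE_MAX_CPU = 5.0   # percent; zombie example threshold from this post
CHATTY_MIN_CPU = 80.0  # percent; chatty example threshold from this post


def classify_vm(avg_cpu_percent):
    """Return 'zombie', 'chatty' or 'normal' for an average CPU percentage."""
    if avg_cpu_percent < ZOMBIE_MAX_CPU:
        return "zombie"
    if avg_cpu_percent > CHATTY_MIN_CPU:
        return "chatty"
    return "normal"


def classify_all(samples):
    """Map each VM id to a status based on the mean of its CPU samples."""
    return {moid: classify_vm(mean(values)) for moid, values in samples.items()}


# Hypothetical sample data: VM id -> CPU usage percentages over the period
samples = {
    "vm-101": [1.2, 0.8, 2.5],
    "vm-102": [91.0, 85.5, 88.0],
    "vm-103": [35.0, 42.0, 28.5],
}

statuses = classify_all(samples)
for moid, status in statuses.items():
    print(moid, status)
```

A VM sitting exactly at 5% or 80% is treated as normal here, which matches the strict comparisons in the search.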
We can put together a very cool dashboard to show all the Zombie, Orphan and Chatty VMs.
Since zombie and orphan VMs could be repurposed for other uses, we can calculate the total cost tied up in the troubled VMs that could be removed or repurposed.
This could help you show your management how much money you saved the business!
(sourcetype=vmware:perf:cpu source=VMPerf:VirtualMachine) OR (sourcetype=vmware:inv:vm changeSet.name=*) | stats first(detect) as "CPU Status" first(changeSet.name) as "VM Name" first(p_average_cpu_usage_percent) as "Avg CPU Usage" by moid | stats count(moid) as moid, count("VM Name") as vms | eval cost = (moid vms)*$price$ | table cost
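The arithmetic behind the savings figure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a translation of the search above: it assumes a flat price per VM (playing the role of the $price$ dashboard token) and counts every zombie or orphan VM as reclaimable. The function name, the VM ids and the 500-dollar price are hypothetical.

```python
def reclaimable_cost(statuses, price_per_vm, reclaimable=("zombie", "orphan")):
    """Return the number of reclaimable VMs multiplied by a flat per-VM price."""
    count = sum(1 for status in statuses.values() if status in reclaimable)
    return count * price_per_vm


# Hypothetical classification results: VM id -> status
statuses = {
    "vm-101": "zombie",
    "vm-102": "chatty",
    "vm-103": "normal",
    "vm-104": "orphan",
}

# Assumed flat price per VM in dollars; replace with your own figure
total = reclaimable_cost(statuses, price_per_vm=500.0)
print(total)  # 1000.0 (two reclaimable VMs at 500.0 each)
```

Chatty VMs are left out of the total on purpose, since they are candidates for moving to a properly sized host rather than for removal.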
Splunk can help your organization repurpose zombie and orphan VMs so that you get full value from your virtualization effort and keep it secure. Splunk can also help identify chatty VMs and move them to properly sized ESXi hosts.
Happy Splunking.
This blog post was jointly written by Tolga Tohumcu and Kam Amir…
----------------------------------------------------
Thanks!
Tolga Tohumcu