DevOps, Analytics and Mental Health: Notes from DevOpsDays Vancouver
Going back to Canada is always a pleasant experience for me. And when you visit Vancouver in April, it is easy to be mesmerized by the city’s majestic beauty: snow-covered mountain peaks, cherry trees in full bloom and crisp, clean air. Sponsoring DevOpsDays Vancouver is what brought us to this beautiful place.
Mining Machine Data for App Delivery
In my Ignite talk, I shared how using analytics for real-time insights into app delivery could help organizations have a measurable business impact. Mining machine data can help DevOps practitioners improve the velocity and quality of their applications across the entire build pipeline.
Gender in Organizations
In her captivating talk, Professor Jennifer Berdahl from the Sauder School of Business discussed different ways of looking at gender in technology organizations. She also discussed several approaches to how enterprises are dealing with gender issues and dispelled classic stereotypes.
One of the most exciting elements of DevOpsDays is the open spaces, where anyone can suggest a topic. Attendees then vote, and the topics with the most interest are selected for discussion. As I reviewed the ideas, DevOps metrics, anomaly detection and DevSecOps were some of the expected suggestions. One seemingly odd one, “Mental Health of Techie on Call,” caught my eye. Amused, I decided to check it out.
Mental Health of On-call IT Personnel
As the crowd of 30 people started the discussion, it turned out that this was a very heated topic. People shared real struggles of what it is like to be the person on call for IT incidents and to have to solve issues in a short time, or get blamed if you don’t. Isolation, stress, depression, disturbed sleep cycles and PTSD-like symptoms were just some of the problems that surfaced. Often, an IT ops person on duty for critical issues had to take calls several times a night, without clear escalation paths or support from other IT teams. They felt the pressure of being the ones who needed to fix problems immediately or face the outage of a critical service. Sometimes the abundance of alerts was overwhelming, and false positives left people afraid of a ringing phone.
Some organizations recognized that on-call stress led to high turnover, which ultimately impacted their business. These enterprises decided to help their IT personnel by limiting total on-call hours per person, assigning more resources or forming dedicated site reliability teams. Others added developers to the on-call rotation and assigned pagers to them. As these devs came to understand the impact outages had on their colleagues, they showed more cross-team empathy, which resulted in better code quality.
As I was flying back home, I felt genuine compassion for on-call IT employees. I believe that adopting analytics and machine-learning-enabled anomaly detection can help reduce noise and escalations, so that IT teams can triage problems before they become that dreaded midnight phone call.
Thank you, DevOpsDays Vancouver organizers, for this great event.