How DevOps Monitors and Diagnoses Problems
At Care.com, Coddington’s team uses a variety of observability tools, including Splunk APM and Splunk Cloud. He says the visibility has been “invaluable” to the DevOps and engineering teams for monitoring and diagnosing problems.
“Once an application is instrumented, it starts to send both traces and metrics to the Splunk platform,” he says. “Our engineering teams use the metrics to monitor for problems and then utilize a combination of metrics, traces and logs to debug those issues. Splunk allows for alerts to be set up against those metrics so that teams are notified when thresholds are exceeded.”
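The threshold alerting Coddington describes can be illustrated with a minimal Python sketch. This is not Splunk's API or Care.com's configuration; the `ThresholdAlert` class, the `check_alert` function, the metric name and the sample values are all hypothetical, chosen only to show the idea of notifying a team when a metric's recent average crosses a limit.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ThresholdAlert:
    """Hypothetical alert rule: fire when the recent average exceeds a limit."""
    metric: str       # illustrative metric name, not a real Splunk identifier
    threshold: float  # limit the rolling average must stay under
    window: int       # number of most recent samples to average


def check_alert(rule, samples):
    """Return a notification string if the rule is breached, otherwise None."""
    recent = samples[-rule.window:]
    if not recent:
        return None
    avg = mean(recent)
    if avg > rule.threshold:
        return (
            f"ALERT {rule.metric}: avg {avg:.1f} over last "
            f"{len(recent)} samples exceeds {rule.threshold:.1f}"
        )
    return None


# Made-up latency samples in milliseconds, oldest first.
rule = ThresholdAlert(metric="checkout.latency_ms", threshold=500.0, window=3)
latency_ms = [120.0, 180.0, 450.0, 620.0, 700.0]
message = check_alert(rule, latency_ms)
print(message)
```

Averaging over a short window rather than reacting to a single sample is one common way to avoid paging a team for a momentary spike; a production platform would typically offer more options than this sketch does.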
Having a dependable observability platform also helped Care.com decide to adopt a microservices architecture.
“We wouldn’t have embarked on our migration to the more complex microservices architecture without the distributed tracing and APM metrics available to us,” Coddington says. “An APM solution becomes a lot more critical to an engineering organization when microservice architectures are introduced, given the added complexity of troubleshooting.”
Seamless Solutions in a Competitive Market
Like Care.com, Charter Communications depends on its monitoring solutions to find and resolve issues quickly. The cable and broadband company serves 32 million customers through its Spectrum brand. In a competitive market, delivering a flawless product is imperative.
“Charter develops and maintains many applications to provide our customers and employees with a great experience,” says Jeff Gutterman, group vice president of IT enterprise infrastructure at Charter. “For customers, this needs to be from the time of purchase through the consistent delivery of our products and services.”
Charter uses Cisco’s AppDynamics to monitor nearly every layer of its application environment, from hardware to the applications themselves. Within these systems, AppDynamics tracks metrics such as transaction times, response times, load times and throughput.
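To make those metrics concrete, here is a rough Python sketch of how response time and throughput could be derived from raw transaction timestamps. It does not reflect how AppDynamics computes anything; the `Transaction` class, the `summarize` function and the sample data are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Transaction:
    """Hypothetical record of one request, with timestamps in seconds."""
    start_s: float
    end_s: float


def summarize(transactions):
    """Compute simple response-time and throughput figures for a batch."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    durations = [t.end_s - t.start_s for t in transactions]
    # Observation window: first start to last finish.
    window_s = max(t.end_s for t in transactions) - min(
        t.start_s for t in transactions
    )
    return {
        "count": len(durations),
        "avg_response_s": mean(durations),
        "max_response_s": max(durations),
        # Completed transactions per second over the observed window.
        "throughput_per_s": len(durations) / window_s if window_s > 0 else 0.0,
    }


# Made-up sample: three requests observed over four seconds.
sample = [Transaction(0.0, 0.5), Transaction(1.0, 1.25), Transaction(2.0, 4.0)]
summary = summarize(sample)
print(summary)
```

Even in this toy form, the summary shows why teams watch several metrics together: throughput can look healthy while a single slow transaction pulls the maximum response time well above the average.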