Troubleshooting Wireless Networks
In an effort to ensure the success of the mobile workforce, scientists from the University of California, San Diego (UCSD) have developed an automated troubleshooting system for wireless-access networks.
The enterprise-scale troubleshooting system, which initially was designed for the UCSD computer science building, is capable of detecting and diagnosing various problems often encountered when using a Wi-Fi connection.
The two most common problems related to wireless networks are slow performance and an unreliable connection. The prime cause of both is either a weak signal or a channel "clash." A wall or other physical obstacle between the device and the network's wireless access point (WAP) weakens the transmitted signal, which makes it more likely that neighboring connections will interfere with each other.
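To make the idea of a channel clash concrete, here is a minimal sketch that checks which neighboring networks overlap a given channel in the 2.4 GHz band. It relies on the standard channel plan (channel 1 centered at 2412 MHz, channels spaced 5 MHz apart) and assumes a channel width of about 22 MHz, as in 802.11b. The function names are made up for illustration and have nothing to do with the UCSD system.

```python
# Illustrative only: which 2.4 GHz Wi-Fi channels overlap each other.
# Assumption: each channel occupies roughly 22 MHz (802.11b); OFDM-based
# modes use about 20 MHz, which gives the same answer for channels 1-13.
CHANNEL_WIDTH_MHZ = 22


def center_mhz(channel: int) -> int:
    """Center frequency of a 2.4 GHz channel numbered 1 through 13."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2412 + 5 * (channel - 1)


def channels_overlap(a: int, b: int) -> bool:
    """True when the frequency bands of the two channels overlap."""
    return abs(center_mhz(a) - center_mhz(b)) < CHANNEL_WIDTH_MHZ


def clashing_neighbors(own_channel: int, neighbor_channels: list[int]) -> list[int]:
    """Neighboring channels that can interfere with our own channel."""
    return [ch for ch in neighbor_channels if channels_overlap(own_channel, ch)]


# An access point on channel 6 surrounded by networks on other channels.
print(clashing_neighbors(6, [1, 3, 6, 9, 11]))  # prints [3, 6, 9]
```

This is why channels 1, 6 and 11 are the usual choices when several access points share the same space: their bands do not overlap.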
Channel clash damages reception on both sides. However, when it comes to wireless systems, there are many other possible sources of networking faults, such as unplugged cables, switched-off wireless network adapters, dysfunctional repeaters (intermediate devices that receive and regenerate the signal, broadcasting it further to extend the wireless network's range), driver incompatibility and wrongly configured network settings.
“Few organizations have the expertise, data or tools to decompose the underlying problems and interactions responsible for transient outages or performance degradations,” claim UCSD researchers who presented a paper on the troubleshooting system at ACM SIGCOMM, one of the world’s premier networking conferences, in 2007.
Stefan Savage, UCSD associate professor of Computer Science and head of the research project, says that even though people expect Wi-Fi to work, there is a general understanding that it’s not reliable. “If you have a wireless problem in our building, our system automatically analyzes the behavior of your connection — each wireless protocol, each wired network service and the many interactions between them,” Savage explains. “In the end, we can say it’s because of this that your wireless is slow or has stopped working, and we can tell you immediately.”
The Heart of the Problem
Wireless-access networks are complicated by such issues as shared spectrum, user mobility, authentication management, and the interaction between wired and wireless networks. Diagnosing problems in these complex networks often requires a huge amount of data, knowledge and time.
Because a wireless network comprises many pieces, it is very hard to pinpoint the problematic component when a disruption occurs. Usually, it is necessary to sift through huge amounts of data. “Wireless networks are hooked onto the wired part of the Internet with a bunch of Scotch tape and baling wire, protocols that really weren’t designed for Wi-Fi,” Savage says. “If one of these components has a glitch, you may not be able to use the Internet even though the network itself is working fine.
"For example, someone using the microwave oven two rooms away may cause enough interference to disrupt your connection," Savage explains.
Yu-Chung Cheng, who as a doctoral student in computer science at UCSD was lead author on the paper, points out that network problems today are not consistent and may occur for a number of reasons. Many aren’t detected even by network administrators. But, says Cheng, now at Google, “We’ve created a virtual wireless expert who is always at work.”
The scientists presented a set of modeling techniques for automatically characterizing the source of wireless networking problems. In their research, they focused primarily on three areas: data-transfer delays unique to IEEE 802.11, the family of standards that defines wireless local area networks (WLANs); media access dynamics; and mobility management latency.
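As a rough illustration of what characterizing the source of a delay can mean, the sketch below splits one packet's delay into named components and reports the largest one. The component names, the sample numbers and the `dominant_cause` helper are all hypothetical. They are not taken from the UCSD system, which builds its models from monitored traffic.

```python
from dataclasses import dataclass


@dataclass
class DelayBreakdown:
    """Hypothetical per-packet delay components, in milliseconds."""
    queueing_ms: float      # waiting in the access point's queue
    media_access_ms: float  # contending for the shared wireless channel
    retransmit_ms: float    # resending frames that were lost or corrupted
    handoff_ms: float       # mobility management, e.g. roaming between access points

    def total_ms(self) -> float:
        return (self.queueing_ms + self.media_access_ms
                + self.retransmit_ms + self.handoff_ms)


def dominant_cause(d: DelayBreakdown) -> tuple[str, float]:
    """Return the largest delay component and its share of the total."""
    parts = {
        "queueing": d.queueing_ms,
        "media_access": d.media_access_ms,
        "retransmit": d.retransmit_ms,
        "handoff": d.handoff_ms,
    }
    name = max(parts, key=parts.get)
    total = d.total_ms()
    share = parts[name] / total if total > 0 else 0.0
    return name, share


# Made-up measurements for a single slow packet.
sample = DelayBreakdown(queueing_ms=2.0, media_access_ms=12.0,
                        retransmit_ms=5.0, handoff_ms=1.0)
name, share = dominant_cause(sample)
print(f"{name}: {share:.0%} of {sample.total_ms():.1f} ms")
# prints: media_access: 60% of 20.0 ms
```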
After two years of data collection and analysis, the UCSD automated help-desk system was implemented within the Computer Science building, where it has been up and running for close to a year, 24 hours a day. Today, all wireless help-desk issues go through the new automated system, which constantly monitors data relevant to the faculty's wireless network and catches transient problems. One of the interesting things the troubleshooting system has revealed, according to researchers, is that there is no single issue that affects wireless network performance, but rather many little things that interact and go wrong in ways one might not anticipate.
According to Savage, the research team is working with Wi-Fi-based Voice over IP (VoIP) phones. “Our system is the ultimate laboratory for testing new wireless gadgets and new approaches to building wireless systems,” he says.
Savage believes that future enterprise wireless networks will have sophisticated diagnostics and repair capabilities built into them. “I look at [our work] as an engineering effort,” he says. “How much the [future wireless networks] will draw from our work is hard to tell today. You never know the impact you are going to have when you do the work. We learn something new every week.”
Utilizing Mesh Networks
Troubleshooting wireless networks is a hot topic among networking experts. A team of scientists from the University of Texas, UC-Berkeley and Microsoft Research has developed a system for detecting wireless connectivity problems in mesh networks, where the components can connect to each other through multiple hops.
Their research outlines a novel fault-diagnosis process in wireless mesh networks. It also details a method for employing trace-driven simulations to detect faults and to perform root-cause analyses. This approach was used to diagnose performance problems caused by packet dropping, link congestion, external noise and media access control (MAC) misbehavior, and researchers noted that “In a 25-node mesh network, we are able to diagnose over 10 simultaneous faults of multiple types with more than 80 percent coverage.”
Their troubleshooting framework integrates a network simulator. Traces collected from the operational network are fed into the simulator, which uses them to recreate the events that took place inside the real network. The simulator can also be applied to network-management tasks such as performance tuning and “what-if” analysis for route simulation. According to the researchers, this technique can be applied to a large class of networks operating under different environments.
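The general idea of trace-driven diagnosis can be sketched in a few lines: predict what the network should have delivered, compare that with what was observed, then look for the set of faults that best explains the gap. Everything below is a toy built on assumptions. The one-line "simulator," the link names, the fixed drop rate and the greedy search are stand-ins chosen for brevity, not the researchers' simulator or their search procedure.

```python
# Toy trace-driven fault diagnosis (all names and numbers are hypothetical).
DROP_RATE = 0.5  # assumed fraction of packets a faulty link drops


def simulate(offered: dict[str, int], faults: set[str]) -> dict[str, float]:
    """Stand-in simulator: predict packets delivered on each link,
    given the offered load and a set of links assumed to be faulty."""
    return {
        link: load * (1 - DROP_RATE) if link in faults else float(load)
        for link, load in offered.items()
    }


def mismatch(predicted: dict[str, float], observed: dict[str, float]) -> float:
    """Total absolute gap between predicted and observed deliveries."""
    return sum(abs(predicted[link] - observed[link]) for link in observed)


def diagnose(offered: dict[str, int], observed: dict[str, float],
             tolerance: float = 1.0) -> set[str]:
    """Greedily add the fault that best closes the gap between the
    simulated and the observed behavior, until nothing helps."""
    faults: set[str] = set()
    best = mismatch(simulate(offered, faults), observed)
    while best > tolerance:
        candidates = [link for link in offered if link not in faults]
        if not candidates:
            break
        score, link = min(
            (mismatch(simulate(offered, faults | {c}), observed), c)
            for c in candidates
        )
        if score >= best:  # no remaining fault improves the match
            break
        faults.add(link)
        best = score
    return faults


# Offered load comes from traces; observed deliveries from the live network.
offered = {"A-B": 100, "B-C": 100, "C-D": 100}
observed = {"A-B": 100.0, "B-C": 50.0, "C-D": 52.0}
print(sorted(diagnose(offered, observed)))  # prints ['B-C', 'C-D']
```

A real simulator would also model congestion, noise and MAC behavior, so that faults on one link change what is predicted for the others.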
A Microsoft white paper offers more pointers on troubleshooting wireless mesh networks.