IT Troubleshooting 201: Ask the Right Questions
Effective troubleshooting is a multifaceted exercise in diagnosis and deliberation, analysis and action. Here's a walkthrough you can use to make sure you've covered every angle -- no matter the problem.
There are two rules that always apply, whether you're troubleshooting hardware or software.
- Troubleshooting is a process of elimination.
- The most important assumption you can make, no matter how much you know about the technology, is that you could be wrong.
If that first rule seems obvious, consider this: Troubleshooting -- like any problem-solving process -- is clearly a process of elimination, but knowing that doesn't make it simple.
Your success or failure lies in what you choose to eliminate, and more importantly, why. It's a game of Pick Up Sticks where you evaluate, reason, then remove any obstacles that get you closer to resolving the problem without breaking anything else. How you make those choices depends entirely on the questions you ask and how you interpret the answers.
As for the second point, the assumptions you make lead to the questions you ask and the way you interpret responses -- whether you're asking a person, a document, a piece of hardware, a software package or a network infrastructure. When you assume you could be wrong, no matter what your level of experience, you keep an open mind that helps you see simple solutions you may never have expected.
These are some of the common pitfalls you can run into while troubleshooting technological problems, as well as tips for asking questions that can lead you to the simple, effective solution every time.
If You Don't Know Why It Works, It Isn't Fixed
While teaching a document troubleshooting training course, I asked the class if they were familiar with the Microsoft Word bug by which Word randomly changes the type of section break in long documents for no reason at all. They excitedly replied that they had been plagued by this bug, but one person in the class had found the solution.
As you may already know, I had set them up. There's no such bug. The section start type is often misunderstood. It does what it does for good reason and not at all randomly. So as you might expect, their solution was not ideal.
After telling me they didn't know why the fix worked, but that it did work most of the time, they explained their solution: add several next page section breaks before and after the break that changes, then remove them one at a time (undoing your actions when the result is undesirable) until you're left with the break type you want.
Whether you're familiar with this feature of Word or not, a troubleshooter should know this isn't a viable solution, and here's why:
- If you don't know why a fix works, it probably doesn't. It may appear to work by coincidence, but a workaround is not a fix.
- If the fix doesn't work consistently, it most likely doesn't work at all.
- Whether you're working with software or hardware, computer technology is rooted in logic. If a fix seems unruly or overcomplicated, like the challenges in reality TV shows, there's probably a better way. In this case, there's a simple, consistent solution. You just have to change one setting in a dialog box (you'll find the details of this particular fix in "Sensible Solutions").
The path to effective problem solving broke down here when the troubleshooters assumed the behavior was a bug because they didn't understand it. They looked for any possible workaround rather than a simple, logical solution.
It's common and understandable for users to blame the software or hardware when something frustrating happens that they don't understand. For a troubleshooter to do the same, however, is an almost certain setup for failure.
The job of troubleshooting begins when you don't already know the answer. You can't fix something if you don't know why it's broken. So how do you get to the "why" when you don't know the "how"? You start by gathering information, and that means asking questions of the user and of the technology itself.
Ask, Narrow and Verify
This three-tiered approach to troubleshooting is both simple and effective. Here's one example: a networking troubleshooter in a large corporation was speaking with a user who couldn't log in to one internal application. The user had contacted the help desk to request login credentials. He learned that anyone in the organization should be able to log in.
Ask: Through a series of basic questions, the troubleshooter determined that the user worked remotely. However, the user was able to access both internal and external sites, as well as other internal applications. Everything appeared to be working normally, and the user had never had connectivity issues before.
Narrow: The troubleshooter was not an expert in that particular application, so she started from what she knew -- networking. Based on the fact that the user could log in to other applications and was working remotely, the troubleshooter hypothesized there had to be something about his connection that was a problem for this particular application. She researched the system requirements for the application and then connected remotely to his computer.
Verify: When connected remotely, the troubleshooter saw a network setting she believed might be causing the issue. She changed the setting and the user was able to log in, but she didn't leave it there. The troubleshooter had the user verify other connectivity and found that the change prevented him from accessing certain Web sites. She tried a different change to the same setting that let the user log in without disrupting other connections.
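The Verify step in this example amounts to a before-and-after comparison: record what works, make the change, then confirm nothing that worked has stopped working. Here is a minimal Python sketch of that idea. The function names (probe_tcp, snapshot, regressions) and the host names in the usage note are hypothetical and chosen only for illustration. The sketch assumes a plain TCP connection on port 443 is a good enough stand-in for "the user can reach this site," which may not hold in every environment.

```python
import socket
from typing import Callable, Dict, Iterable, List


def probe_tcp(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (assumed reachability test)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def snapshot(targets: Iterable[str],
             probe: Callable[[str], bool] = probe_tcp) -> Dict[str, bool]:
    """Record which targets are reachable right now."""
    return {target: probe(target) for target in targets}


def regressions(before: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """List targets that worked before the change but fail after it."""
    return sorted(t for t, ok in before.items() if ok and not after.get(t, False))
```

In practice you might call snapshot() on a short list of sites the user depends on (for example, a hypothetical intranet.example and www.example.com), change the setting, call snapshot() again and pass both results to regressions(). An empty list suggests the change did no collateral damage to those targets. It says nothing about targets you didn't think to include.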
If you're working with a user, listen to what he has to say and value the information he gives you. Use any related knowledge you have to interpret the answers. In the case of the networking issue, the user assumed that because he never had a connection problem before, his connection couldn't be an issue. The way he answered the question was exactly what made the troubleshooter believe she should, in fact, check his connection.
A good troubleshooter takes information she's given and applies to it what she knows, always confirming the validity of a hunch before taking action. For example, this troubleshooter researched the system requirements of the app in question before connecting to the user's computer. Similarly, if you're troubleshooting the first scenario from this article and you aren't familiar with section breaks in Word, use the help functionality in the program to find out what they are and why they're used. That way, you can begin to understand the behavior.
Get your hands dirty. If you're not an expert in the specific problem, approach it with the same logic you would apply to a technology you know well. This might mean connecting to a user's machine and interacting with the technology in a way that's familiar to you. Be specific, start simply and look for concrete information that can help you narrow the possibilities.
Measure twice, cut once. When you think you have the answer, test it. Make sure that it fixes the issue without doing other harm. Test it to confirm that it solves the problem consistently. And most importantly, be sure that you understand why the fix worked, or you can't be sure that you've fixed it at all.
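The first two tests described above (does the fix work every time, and does it leave everything else alone) can be written down as a simple acceptance check. The Python sketch below is one way to do that. FixReport, verify_fix and their parameters are hypothetical names invented for this illustration, and the caller supplies the actual checks as functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FixReport:
    failures: int   # attempts in which the original problem still appeared
    attempts: int   # total number of attempts made
    broken_checks: List[str] = field(default_factory=list)  # other things that no longer work

    @property
    def accepted(self) -> bool:
        # "Works most of the time" is not accepted: any failure rejects the fix.
        return self.failures == 0 and not self.broken_checks


def verify_fix(problem_occurs: Callable[[], bool],
               other_checks: Dict[str, Callable[[], bool]],
               attempts: int = 5) -> FixReport:
    """Retest the original problem several times, then run each unrelated check once."""
    failures = sum(1 for _ in range(attempts) if problem_occurs())
    broken = sorted(name for name, check in other_checks.items() if not check())
    return FixReport(failures=failures, attempts=attempts, broken_checks=broken)
```

A report with accepted set to True covers only what can be automated. The last test, understanding why the fix worked, still has to be done by the troubleshooter.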
Open Minds, Simple Solutions
Consider one more example of troubleshooting a Microsoft Word document: A troubleshooter received a document that was crashing frequently. He began by opening the document using the Open and Repair feature in Word.
Open and Repair indicated there was a corrupt shape in the document. However, he saw no embedded graphics. Was Open and Repair wrong? No, it was absolutely right. Closer examination revealed shapes off the page, as well as in a header that was currently turned off. (See "Sensible Solutions" for more information on troubleshooting this particular issue.)
Whether you ask a question directly of a user or, as in this case, of the technology itself, trust the information you receive and interpret it based on everything else that you know.
Good troubleshooting skills are a necessity that's entirely separate from technical knowledge. Good troubleshooting means applying logic so you can take concrete steps to effectively narrow the possibilities. It also means keeping an open mind and calling on any related knowledge (including how to find help and research the problem) to help you reach the simple solution.
Before you tell a user that he needs new hardware, needs to reinstall software or needs to recreate a document from scratch, consider the most likely possibility: If the answer appears to be that complicated, you may not have asked the right questions.