Troubleshooting Cheat Sheet: Layers 1-3

Any time you encounter a user complaint, whether regarding slow Internet access, application errors, or other issues that impact productivity, it is important to begin with a thorough understanding of the user’s experience.

Not sure where to begin? User complaints usually fall into three categories: slow network, inability to access network resources, and application-specific issues.

Based on the complaint being presented, you need to understand the symptoms and then isolate the issue to the correct layer of the Open Systems Interconnection (OSI) model.

The following Troubleshooting Cheat Sheet shows the questions to ask with a typical slow network complaint.

What to Ask | What it Means
------------|--------------
What type of application is being used? Is it web-based? Is it commercial, or a homegrown application? | Determines whether the person is accessing local or external resources.
How long does it take the user to copy a file from the desktop to the mapped network drive and back? | Verifies they can send data across the network to a server, and allows you to evaluate the speed and response of the DNS server.
How long does it take to ping the server of interest? | Validates they can ping the server and obtain the response time.
If the time is slow for a local server, how many hops are needed to reach the server? | Confirms the number of hops taking place. Look at switch and server port connections, speed to the client, and any errors.
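
The file-copy and ping questions above can be turned into repeatable measurements rather than user estimates. The following is a minimal Python sketch, not a definitive tool. The function names time_round_trip_copy and parse_ping_avg_ms are hypothetical, the mapped drive is assumed to be reachable as an ordinary directory path, and the ping parser assumes the "min/avg/max" summary line printed by common Linux and macOS ping utilities.

```python
import re
import shutil
import tempfile
import time
from pathlib import Path


def time_round_trip_copy(local_file, remote_dir):
    """Copy local_file to remote_dir and back; return (seconds, MB/s).

    remote_dir is assumed to be the mapped network drive, mounted as a
    normal directory. Both names are illustrative.
    """
    local_file = Path(local_file)
    remote_copy = Path(remote_dir) / local_file.name
    size_bytes = local_file.stat().st_size

    with tempfile.TemporaryDirectory() as scratch:
        start = time.perf_counter()
        shutil.copyfile(local_file, remote_copy)  # desktop -> share
        shutil.copyfile(remote_copy, Path(scratch) / local_file.name)  # share -> desktop
        elapsed = time.perf_counter() - start

    remote_copy.unlink()  # remove the test file from the share

    # The file crossed the network twice: once out, once back.
    megabytes = 2 * size_bytes / 1_000_000
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate


def parse_ping_avg_ms(ping_output):
    """Return the average round-trip time in ms, or None if no summary is found.

    Assumes a summary line such as:
        rtt min/avg/max/mdev = 0.045/0.062/0.081/0.015 ms
    """
    match = re.search(r"= [\d.]+/([\d.]+)/[\d.]+", ping_output)
    return float(match.group(1)) if match else None
```

Running the copy test with the same file from a known-good workstation gives a baseline to compare against. Keep in mind that client-side caching can make the return copy look faster than the network really is, so a file larger than a few megabytes is a safer choice.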

Quick OSI Layer Review

With these questions answered, you can work through the OSI model systematically. How each layer functions and delivers data determines how you troubleshoot it.

Physical Layer

Data Link Layer

Network Layer

Assessing the Physical Layer

Generally speaking, Physical Layer symptoms fall into two groups: outage issues and performance issues. In most cases, outage issues are the easiest place to begin, as it’s a matter of confirming that the link light is out or that a box is not functioning. Validating equipment failure is then a matter of replacing the cable or switch and confirming everything works.

Physical Layer issues are often overlooked by people who ping or look at NetFlow for the problem, when in reality it’s a Layer 1 issue caused by a cable, jack, or connector.

The next step in investigating Physical Layer issues is delving into performance problems. These problems are more complex, and diagnosing degraded performance also requires the correct tools. Essential tools in your toolbox for testing physical issues are a cable tester for cabling problems, and a network analyzer or SNMP poller for other problems.
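
As an illustration of what an SNMP poller does with interface counters, the sketch below computes an inbound error rate from two successive polls. It is a simplified example under stated assumptions: the samples are plain dictionaries keyed by the standard IF-MIB object names ifInErrors and ifInUcastPkts (both 32-bit counters), the function names are hypothetical, and multicast, broadcast, and discarded packets are ignored. Collecting the samples from a device is left out.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInErrors and ifInUcastPkts are 32-bit counters


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Difference between two counter readings, allowing for one wrap to zero."""
    return (current - previous) % modulus


def input_error_rate(prev_sample, curr_sample):
    """Fraction of inbound packets that had errors between two polls.

    Each sample is a dict holding 'ifInErrors' and 'ifInUcastPkts' readings
    for the same interface (an assumed layout for this sketch).
    """
    errors = counter_delta(prev_sample["ifInErrors"], curr_sample["ifInErrors"])
    delivered = counter_delta(prev_sample["ifInUcastPkts"], curr_sample["ifInUcastPkts"])
    total = errors + delivered
    return errors / total if total else 0.0


# Example: 5 errored packets out of 1,000 received between polls.
earlier = {"ifInErrors": 10, "ifInUcastPkts": 1000}
later = {"ifInErrors": 15, "ifInUcastPkts": 1995}
rate = input_error_rate(earlier, later)
```

An error rate that climbs between polls on a single port points toward that port's cable, jack, or connector, whereas a rate that stays at zero suggests looking further up the stack.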

Assessing Physical Performance Errors

When diagnosing performance issues with a network analyzer, you’ll notice common error patterns that usually indicate what’s causing the Physical Layer problem. These can be divided into intelligent and non-intelligent errors.

Intelligent Errors: An intelligent host is smashing into your network signal and corrupting the data.

Example: Overloaded WiFi network or a busy channel.

Non-Intelligent Errors: An outside entity is causing noise that interferes with the signal or flow of data across the network.

Example: A microwave interfering with a WiFi signal.

Climbing Further up the Stack

Confirming performance problems, taking a systematic approach to troubleshooting, and understanding how communication occurs across the layers of the OSI model are key to slashing troubleshooting times and improving resolution accuracy.