CHAPTER-9-CS12Fr-Disc2v4.1-07/09-Tues/Fri-MK
9.1.1.1 OSI Model & Troubleshooting
Good network troubleshooters are always in high demand.
Knowledge of the features, functions, and devices of each OSI layer, and how each layer relates to those around it, helps a network technician troubleshoot better.
Upper layers (5-7): deal with specific app functionality and are implemented only in software.
Problems here are caused by end-system software config errors on clients/servers.
Lower layers (1-4): = data-transport issues.
Network (3) and Transport Layer (4): = only software.
Software config errors on routers and firewalls happen here.
IP addressing and routing errors happen at Layer 3.
Physical (1) and Data Link Layer (2): = hardware and software.
Hardware problems and incompatibilities cause Layer 1/Layer 2 problems.
9.1.2.1 Troubleshooting Methodologies
There are three main troubleshooting approaches when using network models:
Top-down
Bottom-up
Divide-and-conquer
Top-down: Starts at the Application Layer and works down. Views the problem from the user and application perspective.
Bottom-up: Starts at the Physical Layer and works up. Focuses on hardware and wire connections.
Divide-and-conquer: Begins at one of the middle layers and works up or down from there.
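As a rough illustration of the three approaches, the sketch below lists the order in which the OSI layers would be checked. The function name `check_order` and the default starting layer of 3 for divide-and-conquer are assumptions made for this example, not part of the course material.

```python
OSI_LAYERS = {1: "Physical", 2: "Data Link", 3: "Network", 4: "Transport",
              5: "Session", 6: "Presentation", 7: "Application"}

def check_order(approach, start=3, direction="up"):
    """Return the layer numbers in the order a technician would check them."""
    if approach == "top-down":
        return list(range(7, 0, -1))
    if approach == "bottom-up":
        return list(range(1, 8))
    if approach == "divide-and-conquer":
        # Start at a middle layer, then move up or down from there.
        if direction == "up":
            return list(range(start, 8))
        return list(range(start, 0, -1))
    raise ValueError(f"unknown approach: {approach}")

for layer in check_order("divide-and-conquer", start=3, direction="down"):
    print(layer, OSI_LAYERS[layer])
```

In practice the direction for divide-and-conquer is chosen after the first test: if Layer 3 works, move up; if not, move down.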
9.1.3.1 Troubleshooting Tools
It is difficult to troubleshoot a network connectivity issue without a network diagram.
Logical and physical topologies are extremely useful in troubleshooting.
Physical Network Topologies:
Shows physical layout of devices connected to network.
Physical network topologies include:
Device types
Models and manufacturers of devices
Locations
Operating system versions
Cable types and identifiers
Cabling endpoints
Logical Network Topologies:
Shows how data is transferred on network.
Symbols represent network elements: routers, servers, hubs, hosts, and security devices.
Logical network topologies include:
Device identifiers
IP addresses and subnet masks
Interface identifiers
Routing protocols
Static and default routes
Data-link protocols
WAN technologies
Network Documentation and Baseline Tools:
Available for Windows, Linux, and UNIX operating systems.
CiscoWorks can be used to draw network diagrams, keep documentation up to date, and measure baseline network bandwidth use.
Tools provide monitoring and reporting functions for finding network baseline.
Network Management System Tools:
NMS tools monitor network performance.
Graphically display physical view of network devices.
Locate the source of a failure and help determine the possible cause.
E.g.: CiscoView, HP OpenView, SolarWinds and WhatsUp Gold.
Knowledge Bases:
Have become good sources of info.
Network admin has access to a vast pool of experience-based info.
Protocol Analysers:
Decode the protocol layers in a recorded frame and present the info in an easy-to-use format.
Capture network traffic for analysis.
Output is filtered to view specific traffic/types of traffic based on criteria. E.g.: to/from a certain device.
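The filtering idea can be sketched in a few lines. The frame records below are invented for illustration (a real analyser works on captured packets, not dictionaries), and `frames_for_device` is a hypothetical helper name.

```python
def frames_for_device(frames, device):
    """Keep only frames sent to or from the given address."""
    return [f for f in frames if device in (f["src"], f["dst"])]

# Made-up capture summary used only to demonstrate the filter.
capture = [
    {"src": "10.0.0.5", "dst": "10.0.0.1", "proto": "HTTP"},
    {"src": "10.0.0.7", "dst": "10.0.0.1", "proto": "DNS"},
    {"src": "10.0.0.1", "dst": "10.0.0.5", "proto": "HTTP"},
]

for frame in frames_for_device(capture, "10.0.0.5"):
    print(frame["src"], "->", frame["dst"], frame["proto"])
```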
Hardware troubleshooting tools:
Cable Testers:
Specialized, handheld devices designed for testing comms cabling.
Used to detect opens, shorts and wire map errors.
Time-domain reflectometers (TDRs) can pinpoint the distance to a break in a cable.
Can determine length of a cable.
Digital Multimeters:
Test instruments that measure voltage, current, and resistance.
In network troubleshooting, tests involve checking PSU voltage and verifying that network devices are receiving power.
Portable Network Analysers:
Engineers use these to see average and peak utilization of a network segment.
Used to identify devices producing most network traffic, analyse network traffic protocols and view interface details.
Useful for problems caused by malware or DoS attacks.
9.2.1.1 Layer 1 & 2 Problems
Network problems at Layer 1 can cause loss of network connectivity or network performance to degrade.
Types of problems that occur at Layer 1 are related to type of technology used.
Ethernet is multi-access technology. (Layer 2)
Ethernet = Carrier Sense Multiple Access with Collision Detection (CSMA/CD).
Excessive collisions cause network performance to degrade.
Layer 2 specifies how data is formatted for transmission over network media.
Layer 2 regulates how access to network is granted.
Layer 2 links Network Layer software functions to Layer 1 hardware for LAN and WAN apps.
Network analysers can locate source of a Layer 2 issue.
9.2.2.1 Troubleshooting Device Hardware & Boot Errors
The boot-up process:
1. Performing POST and loading bootstrap program.
2. Locating and loading Cisco IOS software.
3. Locating and loading the startup config file or entering setup mode.
When Cisco IOS software is loaded, technician verifies that hardware/software are fully operational.
‘show version’ displays the OS version and whether interface hardware is recognized.
‘show flash’ shows the contents of Flash memory (including the Cisco IOS image file), Flash memory used, and memory available.
‘show ip interface brief’ shows the status of device interfaces and the IP addresses assigned.
‘show running-config’ and ‘show startup-config’ verify whether config commands were recognized during reload.
If device fails to boot and creates network outage, replace device with known good.
When service to users is restored, troubleshoot and repair failed device.
If the router boots OK, green LED indicators are displayed.
Device Fails POST:
No output appears on console screen and system LEDs change color or blink.
If POST fails, turn off power, unplug device, remove all interface modules and reboot device.
If POST still fails then device needs service or brick it!
If POST ok without modules installed, then dodgy module.
Reinstall each module individually, rebooting each time to find Mr. Dodgy!
Cisco IOS Image in Flash is Corrupt:
If image file in flash is corrupt/missing, boot-loader can’t find Cisco IOS file to load.
Boothelper is an image with limited functionality that runs if no image exists.
If Boothelper cannot bring device back into operation then device enters ROMmon mode.
Use ROMmon to reload Cisco IOS image from TFTP server.
Memory is not Recognized or Fails:
If insufficient memory to decompress image, device scrolls error messages rapidly or constantly reboots.
Boot device into ROMmon mode by using ‘Ctrl-Break’ command during startup.
Determine the status of the memory in ROMmon mode.
Interface Modules are not Recognized:
Interface modules not recognized during POST/IOS load.
‘show version’ does not match installed modules.
If module is new, check if module is supported by Cisco IOS version installed and enough memory to support module.
Power down device, disconnect the power, and reseat the module to see if it is a hardware problem.
If module still dodgy, replace with known good.
Configuration File is Corrupt or Missing:
If startup config file cannot be found, some devices execute autoinstall.
Utility broadcasts TFTP request for config file.
Some devices enter initial config dialog (setup utility/setup mode).
Devices that use autoinstall enter setup mode if no TFTP server responds after five attempts.
9.2.3.1 Troubleshooting Cable & Device Port Errors
Router interface errors are often the first sign of Layer 1 and Layer 2 cabling/connectivity errors.
Examine statistics recorded on the suspected interface with ‘show interfaces’ and the status of interfaces with ‘show ip interface brief’.
Up/up status = normal operation and media/Layer 2 protocols are functional.
Down/down status = connectivity/media problem exists.
Up/down status = media is connected properly, but Layer 2 protocol is not ok.
Layer 1 Issues that cause down/down output:
Loose cable or tight cable = circuit down.
If pins cannot make good connection = circuit down.
Incorrect termination: ensure the correct standard is followed and all pins are correctly terminated in the connector.
Pins on interface connection are bent/missing.
Dodgy cable – interface cannot sense correct signals.
Layer 2 issues that cause an up/down output:
Encapsulation not configured correctly.
No keepalives are received on interface.
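The status combinations above can be turned into a small lookup. This is a minimal sketch that assumes each row of ‘show ip interface brief’ ends with the Status and Protocol columns; the `diagnose` helper and its wording are made up for illustration.

```python
import re

DIAGNOSIS = {
    ("up", "up"): "normal operation",
    ("down", "down"): "Layer 1: connectivity or media problem",
    ("up", "down"): "Layer 2: check encapsulation and keepalives",
    ("administratively down", "down"): "interface is shut down in the config",
}

def diagnose(line):
    """Classify one row of 'show ip interface brief' by its last two columns."""
    match = re.search(r"(administratively down|up|down)\s+(up|down)\s*$", line)
    if not match:
        return "unrecognised line"
    return DIAGNOSIS.get(match.groups(), "unexpected combination")

# Illustrative row, not captured from a real device.
print(diagnose("FastEthernet0/0  192.168.1.1  YES manual up  down"))
```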
‘Show interfaces’ shows extra info to help identify media errors.
Output for show interfaces:
Excessive noise – Many CRC errors but few collisions indicates noise. CRC errors indicate a media or cable error caused by EMI, bad connections, or incorrect cabling.
Excessive collisions – Occur on half-duplex/shared-media Ethernet connections. Damaged cables cause excessive collisions.
Excessive runt frames – Malfunctioning NICs cause runt frames, but can also be caused by same issues as excessive collisions.
Late collisions – Caused by excessive cable lengths and duplex mismatches.
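The counter patterns above can be expressed as a simple rule set. This is only a sketch: the 1% threshold for "excessive" is an assumption chosen for the example (the notes give no figure), and `classify_errors` is a hypothetical helper, not a Cisco tool.

```python
def classify_errors(crc, collisions, runts, late_collisions, packets):
    """Suggest likely causes from 'show interfaces' counters (illustrative thresholds)."""
    findings = []
    if packets == 0:
        return findings
    threshold = 0.01 * packets  # assumed: more than 1% of packets counts as excessive
    if crc > threshold and collisions <= threshold:
        findings.append("excessive noise: check for EMI, bad connections, incorrect cabling")
    if collisions > threshold:
        findings.append("excessive collisions: check half-duplex links and damaged cables")
    if runts > threshold:
        findings.append("excessive runts: check for a malfunctioning NIC")
    if late_collisions > 0:
        findings.append("late collisions: check cable length and duplex settings")
    return findings

for finding in classify_errors(crc=500, collisions=2, runts=0, late_collisions=0, packets=10000):
    print(finding)
```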
9.2.4.1 Troubleshooting LAN Connectivity Issues
Each port on a switch has an LED indicator that provides info.
Verify switch port connected to user is active and appropriate LED indicators are lit.
Error condition = red or orange.
Check to see that both sides of connection have link.
If no link light is present, ensure physical connectivity and correct port is used.
Ensure devices are powered up with no boot errors.
Change suspected patch cables with known good and verify terminations are correct for desired connectivity.
If still no link light, check port is not administratively shut down.
Use ‘show running-config interface’ to show config of switch port.
Switch# show run interface fastEthernet 1/1
interface FastEthernet 1/1
shutdown
duplex full
speed 100
end
Use ‘show interface port counters errors’ if link light is present, but cable is suspected damaged.
Duplex mismatches are common on switches and may occur if one device is configured to auto-negotiate and the other is manually configured, which leads to collisions and dropped packets.
Use ‘show interface port status’ to view speed, duplex and auto-neg settings on a port.
If Cisco Discovery Protocol (CDP) enabled, CDP error messages show on console/logging buffer.
CDP is useful to detect errors, port and system stats on nearby Cisco devices.
9.2.5.1 Troubleshooting WAN Connectivity Issues
WAN connectivity relies on equipment owned/managed by telecommunications service provider (TSP).
Correct serial interface and line problems using info from ‘show interfaces serial’.
Packet errors, config errors, or mismatches in encapsulation and timing can plague serial connections.
Consider CSU/DSU or modems when troubleshooting serial lines.
Know type of modem or CSU/DSU installed and how to place device in loopback state for testing.
‘Show interfaces serial’ displays problem states:
Serial x is down, line protocol is down (DTE mode) – When interface cannot detect signal on line, it reports line and Layer 2 protocol down.
Serial x is up, line protocol is down (DTE mode) – When interface does not receive keepalives or there is an encapsulation error, Layer 2 protocol is reported down.
Serial x is up, line protocol is down (DCE mode) – When router is providing clock signal and DCE cable is attached with no clock rate configured, Layer 2 protocol is reported down.
Serial x is up, line protocol is up (looped) – When serial interface receives own signals back on circuit, it reports line as looped. (Common practice to place circuit in loopback condition to test connectivity).
Serial x is up, line protocol is down (disabled) – High error rates cause protocol disabled mode (hardware related).
Serial x is administratively down, line protocol is down – Interface configured with ‘shutdown’. Enter ‘no shutdown’ on interface to fix. If interface does not come up, check for duplicate IP address. If duplicate IP address exists, correct it and use ‘no shutdown’ command again.
Serial x is up, line protocol is up – interface is hunky-dory!
9.3.1.1 Layer 3 Functionality & IP Addressing
Layer 1 networks created by interconnecting devices over physical media.
Layer 2 network protocols are hardware dependent. (Ethernet/serial)
Layer 3 protocols not bound to type of media or Layer 2 framing protocol.
Same Layer 3 protocols can use Ethernet, wireless, serial, or others.
Layer 3 networks can have hosts connected using different Layer 1 and 2 technologies.
Layer 3 networks = logical networks created in software.
Most networks use TCP/IP protocols to exchange info between hosts.
9.3.2.1 IP Design & Config Issues
An overlapping subnet occurs when the address ranges of two separate subnets include some of the same host or broadcast addresses.
Overlapping is caused by poor network docs or by entering an incorrect subnet mask/network prefix.
A poorly configured subnet mask causes some hosts on a network to lose access to services.
Subnet mask config errors can also display variety of symptoms not easily identified.
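Overlaps are easy to miss by eye, so a quick check can help when reviewing an addressing plan. The sketch below uses Python's standard `ipaddress` module; the subnets listed are made-up examples and `find_overlaps` is a hypothetical helper name.

```python
import ipaddress

def find_overlaps(subnets):
    """Return every pair of subnets whose address ranges overlap."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    return [(str(a), str(b))
            for i, a in enumerate(nets)
            for b in nets[i + 1:]
            if a.overlaps(b)]

# Example plan: the /24 fully contains the /25, so hosts in the upper half clash.
print(find_overlaps(["192.168.1.0/24", "192.168.1.128/25", "192.168.2.0/24"]))
```

An empty list means no two subnets in the plan share addresses.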
9.3.3.1 IP Address Planning
When a Windows host does not receive an address from a DHCP server, it automatically assigns itself an address on the 169.254.0.0 network.
Use ‘show ip dhcp binding’ to check if DHCP server has available addresses.
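A host address in the 169.254.0.0 range is therefore a quick hint that DHCP failed. A minimal sketch of that check, assuming the range is treated as 169.254.0.0/16 (the helper name `is_self_assigned` is invented for this example):

```python
import ipaddress

SELF_ASSIGNED = ipaddress.ip_network("169.254.0.0/16")

def is_self_assigned(address):
    """True when the address is in 169.254.0.0/16, i.e. no DHCP lease was obtained."""
    return ipaddress.ip_address(address) in SELF_ASSIGNED

print(is_self_assigned("169.254.12.7"))
print(is_self_assigned("192.168.1.20"))
```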
9.3.4.1 DHCP & NAT Issues
Verify that IP addressing is assigned using ‘ipconfig /all’.
If host not receiving IP address, then troubleshoot DHCP config.
First step in troubleshooting = check physical connectivity.
Next check DHCP server is correctly configured and has IP addresses to lease.
Check for any address conflicts. (static address contained in range of DHCP pool).
Use ‘show ip dhcp conflict’ to show address conflicts in DHCP server.
If problem still exists, configure static IP address info on host and if unable to reach network resources then problem is not DHCP.
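The address conflict case mentioned above (a static address that sits inside the DHCP pool) can be checked offline against the documentation. This is a sketch under the assumption that the pool is a simple start-to-end range; the addresses and the `statics_in_pool` helper are illustrative only.

```python
import ipaddress

def statics_in_pool(pool_start, pool_end, static_addresses):
    """Return the static addresses that fall inside the DHCP pool range."""
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    return [a for a in static_addresses
            if start <= ipaddress.ip_address(a) <= end]

# Hypothetical pool and statically configured devices.
print(statics_in_pool("192.168.1.100", "192.168.1.200",
                      ["192.168.1.10", "192.168.1.150"]))
```

Any address returned should either be moved out of the pool or excluded from it on the DHCP server.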
Router can forward broadcast packets (incl. DHCP) to a server using ‘ip helper-address’.
Router(config-if)# ip helper-address x.x.x.x
First indication of NAT problem is users cannot reach internet sites.
Incorrect Designation of Inside and Outside Interface
Inside interface connects to local network, which uses private IP address space.
Outside interface connects to public network (ISP).
Use ‘show running-config interface’ to verify which interfaces are designated inside and outside.
Incorrect Assignment of Interface IP Address or Pool Addresses
IP address pool and static NAT translations must use addresses on the same local IP network as the outside interface. Otherwise addresses are translated but no route to the translated addresses is found.
Check config to verify translated addresses are reachable.
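One way to sanity-check this rule on paper is to test each pool address against the outside interface's network. The sketch below is an illustration only: the addresses come from a documentation range and `pool_outside_network` is a hypothetical helper.

```python
import ipaddress

def pool_outside_network(outside_interface, pool_addresses):
    """Return pool addresses that are NOT on the outside interface's network."""
    network = ipaddress.ip_interface(outside_interface).network
    return [a for a in pool_addresses
            if ipaddress.ip_address(a) not in network]

# 203.0.113.1/28 covers 203.0.113.0 to 203.0.113.15, so .20 is unreachable.
print(pool_outside_network("203.0.113.1/28", ["203.0.113.5", "203.0.113.20"]))
```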
When PAT is configured to translate to the outside interface address, ensure the interface is on the correct network with the correct subnet mask.
If dynamic NAT/PAT is enabled and external users cannot connect to static internal devices, then check static translations are configured.
Verify NAT is operational by using ‘show ip nat translations’.
After viewing, use ‘clear ip nat translation *’ (may disrupt user services).
Use ‘show ip nat translations’ again and if new translations appear, problem is elsewhere.
Use ‘traceroute’ to find path translated packets are taking.
9.5.1.1 Layer 4 Traffic Filtering Errors
Engineers who are unsure which transport protocol an app uses sometimes deny the port number for both TCP and UDP traffic.
This practice can deny traffic that should be allowed.
Firewalls are often configured to deny everything except apps specified in permit statements, so filtering problems occur when traffic that should be allowed has no permit statement.
Layer 4 problem = users reporting video or audio web services are not reachable.
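The "deny everything except what is permitted" behaviour can be modelled with a first-match rule list. This is a simplified sketch, not real ACL syntax: rules match only on protocol and port, and the rule set and port 5004 for streaming are assumptions for the example.

```python
def permitted(rules, protocol, port):
    """First matching rule wins; anything unmatched is denied (implicit deny)."""
    for action, rule_proto, rule_port in rules:
        if rule_proto == protocol and rule_port == port:
            return action == "permit"
    return False

# Hypothetical filter: web and DNS are listed, streaming was never added.
rules = [("permit", "tcp", 80), ("permit", "tcp", 443), ("permit", "udp", 53)]

print(permitted(rules, "tcp", 80))    # web traffic has a permit statement
print(permitted(rules, "udp", 5004))  # streaming traffic falls through to the deny
```

This matches the symptom above: ordinary web pages load, but audio or video services are reported as unreachable.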
9.5.2.1 Upper Layer Problems
TCP/IP Application Layer protocols:
Telnet – establishes terminal session connections with remote hosts.
HTTP – exchanges text, graphic images, sound, video, and other multimedia files on the web.
FTP – interactive file transfers between hosts, using TCP.
TFTP – basic interactive file transfers between hosts and networking devices (UDP) .
SMTP – basic email message delivery services.
POP3 – connects to mail servers and downloads email to a client application.
IMAP4 – lets email clients retrieve messages and store email on servers.
SNMP – collects info from managed devices.
NTP – provides updated time to hosts and network devices.
DNS – maps IP addresses to names assigned to hosts.
SSL – encryption and security for HTTP transactions.
SSH – secure remote terminal access to servers and networking devices.
Use “divide and conquer” to verify Layer 3 connectivity:
Step 1. Ping host default gateway.
Step 2. Verify end-to-end connectivity.
Step 3. Verify routing configuration.
Step 4. Ensure that NAT is working correctly.
Step 5. Check for firewall filter rules.
Check with ISP to ensure network connection is up and operational.
If verified that the end-to-end connectivity is not issue, but end device is still not operating then problem has been isolated to upper layers.
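The five steps above can be treated as an ordered checklist where the first failure shows where to look. A minimal sketch, assuming the technician records a pass/fail result for each step by hand (the `first_failure` helper is invented for this example and does not run any network tests itself):

```python
STEPS = [
    "Ping host default gateway",
    "Verify end-to-end connectivity",
    "Verify routing configuration",
    "Ensure NAT is working correctly",
    "Check firewall filter rules",
]

def first_failure(results):
    """results holds one boolean per step, in order. Return the first failed step, or None."""
    for step, ok in zip(STEPS, results):
        if not ok:
            return step
    return None

failed = first_failure([True, True, True, True, True])
if failed is None:
    print("Layer 3 checks pass: problem is isolated to the upper layers")
else:
    print("Investigate:", failed)
```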
It is possible to have full network connectivity, but app cannot provide data.
Misconfigured client applications account for the majority of upper layer network problems.
Use ‘nslookup’ to verify DNS is working ok.
If DNS server is ok and reachable, check for DNS zone config errors.
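A name lookup can also be scripted when many hosts need checking. The sketch below uses Python's standard `socket.gethostbyname` as the default resolver; the `resolves` wrapper and its `resolver` parameter are assumptions added so the function can be tried without a live DNS server.

```python
import socket

def resolves(hostname, resolver=socket.gethostbyname):
    """Return the resolved IPv4 address, or None when the lookup fails."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None

def fake_resolver(name):
    """Stand-in resolver with one made-up record, for offline demonstration."""
    if name == "www.example.com":
        return "192.0.2.10"
    raise socket.gaierror("name not found")

print(resolves("www.example.com", fake_resolver))
print(resolves("missing.example.com", fake_resolver))
```

Called with only a hostname, `resolves` uses the DNS settings of the host it runs on, much like ‘nslookup’ with no server specified.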
Browser plug-in programs must be kept updated for web pages to display correctly.
Failure to use the correct protocol to request data can cause a web page to be unreachable.
Specify https:// or http:// for desired protocol.
9.5.3.1 Using Telnet to check upper layer connectivity
Successful use of Telnet indicates that lower layer connectivity exists between devices.
Cisco IOS devices include an SSH client that can be used for SSH sessions with other devices.
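The same idea as the Telnet test (if a TCP session opens, the lower layers are working) can be sketched with a plain TCP connection attempt. This is an illustration using Python's standard `socket` module; `tcp_port_open` is a hypothetical helper, and it only proves that the port accepts connections, not that the application behind it is healthy.

```python
import socket

def tcp_port_open(host, port, timeout=2.0):
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and unreachable hosts
        return False
```

For example, `tcp_port_open(host, 23)` would test Telnet and `tcp_port_open(host, 22)` would test SSH on a device, where `host` is the address of the device being checked.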