Common BGP Issues

Learning Objectives

  • Identify common BGP problems and symptoms
  • Understand BGP troubleshooting methodology
  • Recognize configuration errors and their impacts
  • Diagnose routing loops and convergence issues
  • Implement preventive measures

BGP Troubleshooting Methodology

A systematic approach to BGP troubleshooting helps identify and resolve issues quickly and efficiently.

Troubleshooting Process

  1. Gather Information: Collect symptoms, error messages, and network topology
  2. Verify Physical Layer: Check physical connections and interface status
  3. Check BGP Sessions: Verify neighbor relationships and session states
  4. Examine Routing Tables: Analyze BGP and IP routing tables
  5. Verify Configuration: Check BGP configuration for errors
  6. Test Connectivity: Perform connectivity tests and trace routes
  7. Implement Solution: Apply fixes and verify resolution
  8. Monitor: Observe system behavior after changes

BGP Session State Issues

BGP sessions can fail to establish, or fail to stay up, because of configuration errors and network problems.

BGP Session States and Common Issues

State       | Description                                  | Common Issues                                 | Resolution
Idle        | Not attempting a connection; awaiting start  | Configuration errors, interface down          | Check BGP configuration and interfaces
Connect     | Waiting for the TCP connection to complete   | Connectivity issues, firewall blocking        | Verify reachability and ACLs
Active      | TCP attempt failed; listening and retrying   | Wrong neighbor IP address, AS number mismatch | Verify neighbor IP and AS configuration
OpenSent    | OPEN sent; waiting for the peer's OPEN       | Parameter negotiation failure                 | Check capabilities and timers
OpenConfirm | OPEN accepted; waiting for a KEEPALIVE       | Keepalive timer issues                        | Verify timer configuration
Established | Session is up; UPDATEs are exchanged         | Route flapping, memory issues                 | Monitor stability and resources
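
The state column can also be checked programmatically. The following is a minimal Python sketch, assuming the summary output has the ten-column layout used in this lesson; the helper name neighbors_not_established and the second neighbor row are made up for illustration.

```python
# Illustrative sample: the second row is invented to show an Established peer.
SUMMARY = """\
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
203.0.113.1     4 65002       0       0        1    0    0 never    Active
198.51.100.1    4 65003    1235    1291       15    0    0 1d02h    42
"""

def neighbors_not_established(summary_text):
    """Return {neighbor_ip: state} for every neighbor that is not Established."""
    problems = {}
    for line in summary_text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a full neighbor row.
        if len(fields) < 10 or fields[0] == "Neighbor":
            continue
        # The last column holds a prefix count when Established, else a state name.
        state = " ".join(fields[9:])
        if not state.isdigit():
            problems[fields[0]] = state
    return problems

print(neighbors_not_established(SUMMARY))
```

A numeric last column means the session is Established and shows the number of prefixes received; any state name there marks a session that needs attention.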

Configuration Errors

The following configuration mistakes are common causes of BGP problems.

Typical Configuration Issues

Wrong AS Number

Problem: Neighbor configured with incorrect AS number

# Incorrect configuration
router bgp 65001
 neighbor 203.0.113.1 remote-as 65003  # Should be 65002

# Correct configuration
router bgp 65001
 neighbor 203.0.113.1 remote-as 65002

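Symptom: Session never reaches Established; it cycles between Idle and Active because each OPEN is rejected with a Bad Peer AS notification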

Solution: Verify and correct AS numbers
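
A small script can catch this mistake before deployment. The sketch below assumes a hypothetical source of truth (EXPECTED_PEER_AS) that maps each neighbor to the AS number it should have; the function name is illustrative and not part of any Cisco tooling.

```python
import re

# Hypothetical inventory: neighbor IP -> AS number agreed with the peer.
EXPECTED_PEER_AS = {"203.0.113.1": 65002}

CONFIG = """\
router bgp 65001
 neighbor 203.0.113.1 remote-as 65003
"""

def find_as_mismatches(config_text, expected):
    """Return (neighbor, configured_as, expected_as) for each mismatch."""
    mismatches = []
    for ip, asn in re.findall(r"neighbor (\S+) remote-as (\d+)", config_text):
        want = expected.get(ip)
        if want is not None and int(asn) != want:
            mismatches.append((ip, int(asn), want))
    return mismatches

print(find_as_mismatches(CONFIG, EXPECTED_PEER_AS))
```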

Incorrect Neighbor IP

Problem: Wrong IP address in neighbor statement

# Incorrect configuration
router bgp 65001
 neighbor 203.0.113.2 remote-as 65002  # Should be 203.0.113.1

# Correct configuration
router bgp 65001
 neighbor 203.0.113.1 remote-as 65002

Symptom: Connection timeouts, Active state

Solution: Verify IP addresses and connectivity

Missing Update-Source

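Problem: iBGP session between loopback addresses configured without update-source, so the TCP connection is sourced from the outgoing physical interface address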

# Problematic configuration
router bgp 65001
 neighbor 10.1.1.2 remote-as 65001
 # Missing update-source

# Correct configuration
router bgp 65001
 neighbor 10.1.1.2 remote-as 65001
 neighbor 10.1.1.2 update-source loopback 0

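Symptom: Session does not establish when the peer expects the loopback address as the source; sessions tied to a physical interface address also fail when that interface goes down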

Solution: Configure update-source to loopback
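
The check can be automated. This is a rough Python sketch, assuming a single router bgp block in the configuration text; the function name is hypothetical. Not every iBGP session needs update-source (directly connected peerings do not), so treat the result as a list of candidates to review.

```python
import re

def ibgp_neighbors_without_update_source(config_text):
    """List iBGP neighbors (remote-as equals local AS) that lack update-source."""
    local = re.search(r"^router bgp (\d+)", config_text, re.MULTILINE)
    if local is None:
        return []
    local_as = local.group(1)
    remote_as = dict(re.findall(r"neighbor (\S+) remote-as (\d+)", config_text))
    sourced = set(re.findall(r"neighbor (\S+) update-source", config_text))
    return sorted(ip for ip, asn in remote_as.items()
                  if asn == local_as and ip not in sourced)

PROBLEMATIC = """\
router bgp 65001
 neighbor 10.1.1.2 remote-as 65001
"""
print(ibgp_neighbors_without_update_source(PROBLEMATIC))
```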

Route Advertisement Problems

Issues with routes not being advertised or received properly.

Common Route Advertisement Issues

Network Statement Issues

Problem: Network not in routing table

# Check routing table
show ip route 192.168.1.0
% Network not in table

# BGP configuration
router bgp 65001
 network 192.168.1.0 mask 255.255.255.0  # Route not in table

# Solution: Add route to routing table
ip route 192.168.1.0 255.255.255.0 null0
# OR
interface loopback 1
 ip address 192.168.1.1 255.255.255.0
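
The network statement only takes effect when the routing table holds a route with exactly the same prefix and mask; a covering route is not enough. The following Python sketch illustrates that rule with the standard ipaddress module. The function name and the sample routing table are assumptions for illustration.

```python
import ipaddress

def network_statement_will_advertise(prefix, mask, rib_prefixes):
    """True if the routing table holds an exact match for prefix/mask."""
    wanted = ipaddress.ip_network(f"{prefix}/{mask}")
    return wanted in {ipaddress.ip_network(p) for p in rib_prefixes}

# A covering /16 does not satisfy a /24 network statement.
rib = ["192.168.0.0/16", "10.0.0.0/8"]
print(network_statement_will_advertise("192.168.1.0", "255.255.255.0", rib))

# After adding the static route to null0 (or the loopback), the match is exact.
rib.append("192.168.1.0/24")
print(network_statement_will_advertise("192.168.1.0", "255.255.255.0", rib))
```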

Route Filtering Issues

Problem: Routes blocked by filters

# Check applied filters
show ip bgp neighbors 203.0.113.1 | include filter
 Inbound route map configured is FILTER-IN
 Outbound route map configured is FILTER-OUT

# Verify filter configuration
show route-map FILTER-OUT
route-map FILTER-OUT, permit, sequence 10
  Match clauses:
    ip address prefix-list: ALLOWED-ROUTES
  Set clauses:
    
# Check prefix-list
show ip prefix-list ALLOWED-ROUTES
ip prefix-list ALLOWED-ROUTES: 1 entries
   seq 5 deny 192.168.1.0/24  # Blocking the route

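# Solution: Replace the deny entry with a permit, then resend routes to the neighbor
no ip prefix-list ALLOWED-ROUTES seq 5 deny 192.168.1.0/24
ip prefix-list ALLOWED-ROUTES seq 5 permit 192.168.1.0/24
clear ip bgp 203.0.113.1 soft out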
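
Prefix lists are evaluated in sequence order, the first matching entry decides, and a prefix that matches no entry is denied. The sketch below models that behavior in Python under a simplifying assumption: entries match on exact prefix and length only (no ge/le ranges). The function name is hypothetical.

```python
import ipaddress

def prefix_list_permits(entries, prefix):
    """entries: list of (seq, action, prefix_str). First match wins."""
    target = ipaddress.ip_network(prefix)
    for seq, action, entry in sorted(entries):
        if ipaddress.ip_network(entry) == target:
            return action == "permit"
    return False  # implicit deny when nothing matches

blocking = [(5, "deny", "192.168.1.0/24")]
fixed = [(5, "permit", "192.168.1.0/24")]
print(prefix_list_permits(blocking, "192.168.1.0/24"))  # False
print(prefix_list_permits(fixed, "192.168.1.0/24"))     # True
print(prefix_list_permits(fixed, "10.0.0.0/8"))         # False
```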

Next-Hop Reachability Issues

Problems with BGP next-hop addresses not being reachable.

Next-Hop Problems

iBGP Next-Hop Issues

# Check BGP table
show ip bgp 192.168.1.0/24
BGP routing table entry for 192.168.1.0/24, version 5
Paths: (1 available, no best path)
  Not advertised to any peer
  Refresh Epoch 1
  65002
    203.0.113.1 (inaccessible) from 10.1.1.2 (10.1.1.2)
      Origin IGP, metric 0, localpref 100, valid, internal

# Next-hop is not reachable
show ip route 203.0.113.1
% Network not in table

# Solution 1: Configure next-hop-self on advertising router
router bgp 65001
 neighbor 10.1.1.3 next-hop-self

# Solution 2: Add route to next-hop
ip route 203.0.113.1 255.255.255.255 10.1.1.2
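
A BGP path is only usable when its next hop resolves through the routing table. The Python sketch below performs a longest-prefix-match lookup to show the difference before and after Solution 2. The function name and the sample routing tables are illustrative assumptions.

```python
import ipaddress

def resolve_next_hop(next_hop, rib_prefixes):
    """Return the most specific routing-table prefix covering next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [net for net in map(ipaddress.ip_network, rib_prefixes) if addr in net]
    if not matches:
        return None  # next hop is inaccessible
    return str(max(matches, key=lambda net: net.prefixlen))

# Before the fix: only the internal subnet is known.
print(resolve_next_hop("203.0.113.1", ["10.1.1.0/24"]))
# After adding the host route from Solution 2.
print(resolve_next_hop("203.0.113.1", ["10.1.1.0/24", "203.0.113.1/32"]))
```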

BGP Flapping Issues

Route flapping can cause network instability and performance problems.

Route Flapping Problems

Identifying and Resolving Flapping

# Check for flapping routes
show ip bgp flap-statistics
BGP flap statistics
   Network          From            Flaps Duration Reuse    Path
   192.168.1.0/24   203.0.113.1        15 00:05:23          65002

# Check dampening configuration
show ip bgp dampening parameters
 Half-life time      : 15 mins       Decay array size        : 4096
 Reuse penalty       : 750            Reuse array size        : 256
 Suppress penalty    : 2000           Max suppress time       : 60 mins
 Max suppress penalty: 12000

# Configure dampening
router bgp 65001
 bgp dampening 15 750 2000 60

# Check dampened routes
show ip bgp dampening dampened-paths
BGP table version is 15, local router ID is 10.1.1.1
Status codes: s suppressed, d damped, h history, * valid, > best
Origin codes: i - IGP, e - EGP, ? - incomplete
   Network          From             Reuse    Path
d  192.168.1.0/24   203.0.113.1      00:12:30 65002 i
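
The dampening parameters interact through exponential decay: the penalty halves every half-life, a route is suppressed above the suppress limit, and it is released once the penalty decays to the reuse limit. The Python sketch below works through the arithmetic for the values shown above. The per-flap penalty of 1000 is an assumption (a commonly cited default), and real implementations decay the penalty in discrete steps, so treat the results as approximations.

```python
import math

FLAP_PENALTY = 1000  # assumed penalty added per flap

def penalty_after(penalty, minutes, half_life=15):
    """Penalty remaining after `minutes` of exponential decay."""
    return penalty * 0.5 ** (minutes / half_life)

def minutes_until_reuse(penalty, reuse=750, half_life=15, max_suppress=60):
    """Minutes until the penalty decays to the reuse limit, capped at max_suppress."""
    if penalty <= reuse:
        return 0.0
    return min(half_life * math.log2(penalty / reuse), max_suppress)

penalty = 3 * FLAP_PENALTY            # three flaps in quick succession
print(penalty > 2000)                 # True: above the suppress limit
print(penalty_after(penalty, 15))     # 1500.0 after one half-life
print(minutes_until_reuse(penalty))   # 30.0 minutes to reach 750
```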

Troubleshooting Quick Reference

Common Commands for Quick Diagnosis

Issue                | Command                                          | What to Look For
Session down         | show ip bgp summary                              | Neighbor state, uptime
No routes            | show ip bgp neighbors X.X.X.X routes             | Received routes from neighbor
Route not advertised | show ip bgp neighbors X.X.X.X advertised-routes  | Routes sent to neighbor
Route filtering      | show ip bgp neighbors X.X.X.X | include filter   | Applied filters
Next-hop issue       | show ip bgp X.X.X.X/Y                            | Next-hop accessibility
Memory issues        | show ip bgp summary | include memory             | Memory usage

Practice Exercise

BGP Troubleshooting Lab

Scenario: BGP neighbor 203.0.113.1 is in Active state and routes are not being received. The peering agreement lists the neighbor's AS as 65002. Identify the problem.

Given Information:
Router# show ip bgp summary
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
203.0.113.1     4 65003       0       0        1    0    0 never    Active

Router# show ip bgp neighbors 203.0.113.1 | include remote
BGP neighbor is 203.0.113.1,  remote AS 65003, external link

Your Diagnosis:

BGP Session Problems

Learning Objectives

  • Diagnose BGP session establishment issues
  • Troubleshoot BGP finite state machine problems
  • Resolve connectivity and reachability issues
  • Fix authentication and parameter negotiation problems
  • Monitor BGP session health and stability

BGP Session Establishment Process

Understanding the BGP session establishment process is crucial for diagnosing connection issues.

BGP Session Establishment Steps

1. TCP Connection (Port 179)

BGP establishes a TCP connection on port 179

  • One router initiates connection
  • Other router listens on port 179
  • TCP three-way handshake completes

2. OPEN Message Exchange

Routers exchange OPEN messages

  • BGP version negotiation
  • AS number verification
  • Router ID exchange
  • Capability negotiation

3. KEEPALIVE Exchange

Routers send KEEPALIVE messages

  • Confirms OPEN message acceptance
  • Starts the hold timer using the hold time negotiated in the OPEN exchange (the lower of the two advertised values)
  • Moves to Established state

4. UPDATE Exchange

Route information is exchanged

  • Initial route table exchange
  • Incremental updates
  • Periodic KEEPALIVEs
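
The four steps above map onto the BGP finite state machine. The following Python sketch is a deliberately simplified model: the event names are made up for illustration, timers and connection collision handling are omitted, and any unexpected event simply returns the session to Idle.

```python
# (current state, event) -> next state; a simplified subset of the BGP FSM.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",      # OPEN is sent
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("OpenSent", "open_received"): "OpenConfirm",    # KEEPALIVE is sent
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}

def run_fsm(events, state="Idle"):
    """Apply events in order and return the final state."""
    for event in events:
        # Simplification: errors and unexpected events drop back to Idle.
        state = TRANSITIONS.get((state, event), "Idle")
    return state

print(run_fsm(["start", "tcp_established", "open_received", "keepalive_received"]))
print(run_fsm(["start", "tcp_failed"]))
print(run_fsm(["start", "tcp_established", "notification_received"]))
```

The three calls end in Established, Active, and Idle respectively, which matches the symptoms described for a healthy session, a TCP failure, and a rejected OPEN.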

TCP Connection Issues

BGP sessions can fail at the TCP layer due to various connectivity problems.

Common TCP Connection Problems

Port 179 Blocked

Symptom: Session stuck in Connect or Active state

Cause: Firewall or ACL blocking TCP port 179

# Check for blocking ACL
show access-lists
Extended IP access list 100
    10 deny tcp any any eq 179  # Blocking BGP
    20 permit ip any any

# Solution: Permit BGP ahead of the deny entry
# (permits appended after entry 10 would never be matched)
ip access-list extended 100
 no 10
 10 permit tcp any any eq 179
 15 permit tcp any eq 179 any
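
Entry order matters because an ACL is evaluated top-down and the first matching entry decides. The Python sketch below models this with a reduced entry format (sequence, action, protocol, destination port), which is an assumption for illustration; real ACL entries also match on addresses and source ports.

```python
def acl_permits(entries, protocol, dst_port):
    """entries: list of (seq, action, protocol, dst_port or None). First match wins."""
    for seq, action, proto, port in sorted(entries, key=lambda e: e[0]):
        proto_matches = proto in ("ip", protocol)       # "ip" matches any protocol
        port_matches = port is None or port == dst_port
        if proto_matches and port_matches:
            return action == "permit"
    return False  # implicit deny at the end of every ACL

blocking = [(10, "deny", "tcp", 179), (20, "permit", "ip", None)]
appended = blocking + [(30, "permit", "tcp", 179)]   # permit added at the end
inserted = blocking + [(5, "permit", "tcp", 179)]    # permit placed before the deny

print(acl_permits(blocking, "tcp", 179))  # False
print(acl_permits(appended, "tcp", 179))  # False: entry 10 still matches first
print(acl_permits(inserted, "tcp", 179))  # True
```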

Routing Issues

Symptom: Connection timeouts

Cause: No route to BGP neighbor

# Test connectivity
ping 203.0.113.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 203.0.113.1, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

# Check routing table
show ip route 203.0.113.1
% Network not in table

# Solution: Add route or fix routing protocol
ip route 203.0.113.1 255.255.255.255 10.1.1.1

Authentication Problems

BGP authentication issues can prevent session establishment or cause frequent resets.

Authentication Issues

MD5 Password Mismatch

Symptom: TCP connection resets, Active state

Cause: Different MD5 passwords on each end

# Check authentication configuration
show running-config | section neighbor 203.0.113.1
 neighbor 203.0.113.1 remote-as 65002
 neighbor 203.0.113.1 password MySecret123

# Check system logs
show logging | include BADAUTH
*Mar 15 10:15:23.456: %TCP-6-BADAUTH: Invalid MD5 digest from 203.0.113.1

# Solution: Verify passwords match on both ends
router bgp 65001
 neighbor 203.0.113.1 password CorrectPassword
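
When many peers are configured, counting authentication failures per neighbor helps to find the mismatched session. The Python sketch below scans saved log text for the BADAUTH message shown above. The second and third log lines are invented sample data, and the function name is hypothetical.

```python
import re
from collections import Counter

BADAUTH = re.compile(r"%TCP-6-BADAUTH: .*? from (\d+\.\d+\.\d+\.\d+)")

# The first line is the message from this lesson; the others are illustrative.
LOG = """\
*Mar 15 10:15:23.456: %TCP-6-BADAUTH: Invalid MD5 digest from 203.0.113.1
*Mar 15 10:15:53.460: %TCP-6-BADAUTH: Invalid MD5 digest from 203.0.113.1
*Mar 15 10:16:01.000: %LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to up
"""

def badauth_counts(log_text):
    """Count BADAUTH log lines per source address."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BADAUTH.search(line)
        if match:
            counts[match.group(1)] += 1
    return dict(counts)

print(badauth_counts(LOG))
```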

Session Monitoring and Health Checks

Proactive monitoring helps identify session problems before they impact network operation.

Session Health Monitoring

BGP Session Statistics

# Monitor session statistics
show ip bgp neighbors 203.0.113.1 | begin Message statistics
  Message statistics:
    InQ depth is 0
    OutQ depth is 0
                         Sent       Rcvd
    Opens:                  1          1
    Notifications:          0          0
    Updates:               45         67
    Keepalives:          1245       1167
    Route Refresh:          0          0
    Total:               1291       1235

# Check for notification messages
show ip bgp neighbors 203.0.113.1 | include Notification
  Last reset 00:15:23, due to BGP Notification sent, hold time expired

Session Problem Resolution Workflow

Systematic Session Troubleshooting

  1. Verify Physical Layer: Check interface status and connectivity
  2. Test IP Connectivity: Ping neighbor IP address
  3. Check BGP Configuration: Verify AS numbers, IP addresses, passwords
  4. Examine BGP State: Check neighbor state and error messages
  5. Review Authentication: Verify MD5 passwords match
  6. Check Timers: Ensure compatible timer values
  7. Monitor Session Health: Look for patterns in session behavior
  8. Apply Fixes: Implement corrections systematically
  9. Verify Resolution: Confirm session stability
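
For step 6, it helps to know how the timers are derived. Each router advertises a hold time in its OPEN message and both sides use the lower value; a hold time of 0 disables keepalives, and values of 1 or 2 seconds are not allowed. The Python sketch below follows those rules; the function name is hypothetical, and using one third of the hold time as the keepalive interval is a common convention rather than a requirement.

```python
def negotiate_timers(local_hold, remote_hold):
    """Return (hold_time, keepalive_interval) in seconds for two advertised hold times."""
    for value in (local_hold, remote_hold):
        # A hold time must be zero or at least three seconds.
        if value < 0 or value in (1, 2):
            raise ValueError(f"unacceptable hold time: {value}")
    hold = min(local_hold, remote_hold)
    keepalive = hold // 3  # conventional interval; 0 means keepalives are disabled
    return hold, keepalive

print(negotiate_timers(180, 90))  # (90, 30)
print(negotiate_timers(180, 0))   # (0, 0)
```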

BGP Routing Issues

Learning Objectives

  • Diagnose BGP route advertisement problems
  • Troubleshoot path selection issues
  • Resolve routing loops and black holes
  • Fix attribute manipulation problems
  • Optimize BGP routing performance

Route Advertisement Issues

Problems with routes not being properly advertised or received can cause connectivity issues.

Common Route Advertisement Problems

# Check if route is being advertised
show ip bgp neighbors 203.0.113.1 advertised-routes | include 192.168.1.0
# No output means route is not being advertised

# Check if route is in BGP table
show ip bgp 192.168.1.0/24
% Network not in table

# Check network statement
show running-config | section router bgp
router bgp 65001
 network 192.168.1.0 mask 255.255.255.0

# Verify route exists in routing table
show ip route 192.168.1.0
% Network not in table

# Solution: Add route to routing table
ip route 192.168.1.0 255.255.255.0 null0

Path Selection Problems

BGP may select suboptimal paths due to attribute configuration or network conditions.

Path Selection Troubleshooting

# Check all available paths
show ip bgp 192.168.1.0/24
BGP routing table entry for 192.168.1.0/24, version 5
Paths: (2 available, best #2, table default)
  Not advertised to any peer
  65002
    203.0.113.1 from 203.0.113.1 (203.0.113.1)
      Origin IGP, metric 0, localpref 100, valid, external
  65003
    198.51.100.1 from 198.51.100.1 (198.51.100.1)
      Origin IGP, metric 0, localpref 200, valid, external, best

# Path 2 is selected due to higher local preference (200 > 100)
# To change selection, modify local preference
route-map PREFER-PATH1 permit 10
 set local-preference 300

router bgp 65001
 neighbor 203.0.113.1 route-map PREFER-PATH1 in

# Apply the new inbound policy without resetting the session
clear ip bgp 203.0.113.1 soft in
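
The effect of the route map can be reasoned about with a reduced model of best-path selection. The Python sketch below compares only local preference and AS-path length; the real decision process has more steps (for example Cisco weight, origin, MED, and router ID), which are omitted here. The Path class and function name are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Path:
    next_hop: str
    as_path: tuple
    local_pref: int = 100

def best_path(paths):
    """Prefer the highest local preference, then the shortest AS path."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))

path1 = Path("203.0.113.1", (65002,), local_pref=100)
path2 = Path("198.51.100.1", (65003,), local_pref=200)
print(best_path([path1, path2]).next_hop)   # 198.51.100.1

# Effect of the PREFER-PATH1 route map on routes from 203.0.113.1
path1.local_pref = 300
print(best_path([path1, path2]).next_hop)   # 203.0.113.1
```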

BGP Debug Commands

Learning Objectives

  • Use BGP debug commands effectively
  • Interpret debug output correctly
  • Apply debugging best practices
  • Avoid debug command pitfalls
  • Create comprehensive debug strategies

Essential BGP Debug Commands

Debug commands provide detailed information about BGP operations but should be used carefully in production environments.

Common BGP Debug Commands

# Debug BGP events (relatively safe)
debug ip bgp events
debug ip bgp dampening

# Debug BGP keepalives (low overhead)
debug ip bgp keepalives

# Debug BGP updates (HIGH OVERHEAD - use with caution!)
debug ip bgp updates
# Limit update debugging to one prefix with an access list
access-list 10 permit 192.168.1.0 0.0.0.255
debug ip bgp updates 10

# Debug specific neighbor
debug ip bgp 203.0.113.1 events
debug ip bgp 203.0.113.1 updates

# Debug BGP FSM (Finite State Machine)
debug ip bgp fsm

# Debug BGP notifications
debug ip bgp notifications

# Turn off all debugging (either command works)
undebug all
no debug all

Debug Command Warning

CAUTION: Debug commands, especially debug ip bgp updates, can generate massive amounts of output and severely impact router performance. Always use with specific filters and turn off when done.

Debug Output Analysis

Understanding debug output is crucial for effective troubleshooting.

Sample Debug Output

# debug ip bgp events output
*Mar 15 10:15:23.456: BGP: 203.0.113.1 open active, delay 13108ms
*Mar 15 10:15:36.564: BGP: 203.0.113.1 open active delayed 13108ms (35000ms max, 60% jitter)
*Mar 15 10:15:36.564: BGP: 203.0.113.1 open failed: Connection refused by remote host
*Mar 15 10:15:36.564: BGP: 203.0.113.1 Active open failed - tcb is not available, open active delayed 25000ms

# This indicates TCP connection is being refused by remote host
# Check: 1) Remote BGP configuration 2) Firewall/ACLs 3) Interface status
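
Captured debug output can be summarized offline instead of being read line by line on the router. The Python sketch below extracts the neighbor and failure reason from "open failed" lines, assuming the message format shown in the sample above; the function name is hypothetical.

```python
import re

OPEN_FAILED = re.compile(r"BGP: (\S+) open failed: (.+)$")

DEBUG_OUTPUT = """\
*Mar 15 10:15:23.456: BGP: 203.0.113.1 open active, delay 13108ms
*Mar 15 10:15:36.564: BGP: 203.0.113.1 open active delayed 13108ms (35000ms max, 60% jitter)
*Mar 15 10:15:36.564: BGP: 203.0.113.1 open failed: Connection refused by remote host
*Mar 15 10:15:36.564: BGP: 203.0.113.1 Active open failed - tcb is not available, open active delayed 25000ms
"""

def open_failures(debug_text):
    """Return (neighbor, reason) for each 'open failed: <reason>' line."""
    failures = []
    for line in debug_text.splitlines():
        match = OPEN_FAILED.search(line)
        if match:
            failures.append((match.group(1), match.group(2).strip()))
    return failures

print(open_failures(DEBUG_OUTPUT))
```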

Debug Best Practices

Debugging Best Practices

  • Use Specific Filters: Always debug specific neighbors or prefixes
  • Monitor CPU: Watch CPU utilization during debugging
  • Time Limits: Set time limits for debug sessions
  • Log Output: Capture debug output to files for analysis
  • Test Environment: Use lab environments when possible
  • Turn Off Debugging: Always disable debugging when finished