Network Admin Stuff: October 2010

Saturday, October 30, 2010

Lesson 22 - Spanning-Tree Cisco Enhancements

My previous two posts hopefully shed some light on the IEEE 802.1d protocol (yes, that is STP). There are two more things I would like to add to that picture. The first deals with situations in which the topology changes and how that affects STP convergence time. Convergence here means the time it takes to recompute the STP tree so that the paths remain loop free after a failure. The second is how Cisco enhanced STP operation to decrease the convergence time compared to the industry-standard STP.

Before we delve into the details though, I need to explain something about BPDU frames first. It is true that the root bridge originates those frames and sends them out of its designated ports (downstream, every 2 seconds by default). It is also true that all other switches (non-root bridges) propagate them downstream out of their designated ports. This way all switches learn which switch is the root bridge in the network and whether it is still functional.

However, what I withheld in previous posts was the types of BPDU messages. There are three of them:

Configuration - the type of BPDU which the root bridge sends every 2 seconds, and which other switches propagate out of their Designated Ports (downstream).
Topology Change Notification (TCN) - the type of BPDU that a switch sends if it detects a topology change (a port going down, or a TCN received). This BPDU is sent out of the Root Port (upstream) towards the root bridge, informing it that the topology has changed.
Topology Change Acknowledgement (TCA) - the message sent back to the sender of a TCN BPDU, acknowledging that the notification was received. Strictly speaking, the TCA is not a separate frame type but a flag set in a Configuration BPDU.

How do those BPDUs fit into the grand scheme of things?

The default time for which entries are kept in the MAC address table is 300 seconds (5 minutes). This means that if a host connected to a switch port does not speak for at least five minutes, its MAC address is removed from the CAM table. That is far too long for the switch to wait before re-learning the hosts' MAC addresses if the STP topology changes.
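
To make the aging behaviour concrete, here is a minimal Python sketch of a MAC address table with the 300-second default. The class name `MacTable` and its methods are made up for illustration; real switches do this in hardware, so treat it as a model of the idea only.

```python
AGING_TIME = 300  # default MAC aging time in seconds, as stated above


class MacTable:
    """Toy CAM table: maps a MAC address to (port, time last heard)."""

    def __init__(self, aging_time=AGING_TIME):
        self.aging_time = aging_time
        self.entries = {}

    def learn(self, mac, port, now):
        # Every frame from a host refreshes (or moves) its entry.
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        # Return the port for a MAC, or None if unknown or aged out.
        entry = self.entries.get(mac)
        if entry is None:
            return None
        port, last_heard = entry
        if now - last_heard >= self.aging_time:
            del self.entries[mac]  # silent for too long: forget it
            return None
        return port


table = MacTable()
table.learn("0000.1111.1111", "F0/3", now=0)
print(table.lookup("0000.1111.1111", now=299))  # F0/3
print(table.lookup("0000.1111.1111", now=300))  # None
```

An entry that points at the wrong port after a topology change would survive for up to `aging_time` seconds unless something shortens that timer, which is the problem the rest of this post deals with.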

But why do those MAC entries have to change?

Please consider Pic. 1 below. By now, you should be able to tell which ports of the switches are going to learn the PC1 and PC2 MAC addresses. Go ahead, look at Pic. 1 and put down on a piece of paper the switch names and the ports that learn the MAC addresses of PC1 and PC2. That is going to be a good refresher on how switches learn MAC addresses dynamically.

Pic 1 - STP Topology.

Icons designed by: Andrzej Szoblik - http://www.newo.pl

If your answers match mine below, that means that you have mastered the lessons on bridging/switching and STP.

SW1 CAM:
F0/1 - 0000.1111.1111
F0/2 - 0000.2222.2222

SW2 CAM:
F0/1 - 0000.2222.2222
F0/2 - no mac addresses learned since the port is NDP
F0/3 - 0000.1111.1111

SW3 CAM:
F0/1 - 0000.1111.1111
F0/2 - no mac addresses learned as PC1 communicates using SW1
F0/3 - 0000.2222.2222

SW4 CAM:
F0/1 - no mac addresses learned as SW2's port F0/2 is NDP
F0/2 - 0000.1111.1111
F0/2 - 0000.2222.2222

Now, let's create a problem that causes a topology change in our network. Consider Pic. 2, which shows why some ports must re-learn the MAC addresses of PC1 and PC2.

Pic. 2 - STP Network Problem

Given the situation, STP needs to recalculate the topology since we have lost the active connection between SW1 and SW2. If it were not for the STP operation in such circumstances, it would take 5 minutes (300 seconds) for the switches to re-learn the MAC addresses according to the situation presented in Pic. 3. The resulting topology diagram is depicted below.

Try to put down on paper which MAC addresses should be learned on which ports of the respective switches after the failure (Pic. 3).

Pic. 3 - Topology after losing the connection between SW1 and SW2.

SW1 CAM:
F0/1 - down
F0/2 - 0000.1111.1111
F0/2 - 0000.2222.2222

SW2 CAM:
F0/1 - down
F0/2 - 0000.2222.2222
F0/3 - 0000.1111.1111

SW3 CAM:
F0/1 - no MAC addresses learned
F0/2 - 0000.1111.1111
F0/3 - 0000.2222.2222

SW4 CAM:
F0/1 - 0000.1111.1111
F0/2 - 0000.2222.2222

In order to decrease the time it takes to re-learn MAC addresses, upon failure SW1 sends a TCN BPDU out of its Root Port. Normally, Configuration BPDUs are sent out of Designated Ports, NOT the Root Port. But this failure prompts the switches to notify the root bridge about the topology change, which is why they send the TCN BPDU out of their Root Ports. Every switch in the path of this TCN BPDU must send a TCA (acknowledgement) back to the sender and forward the TCN BPDU towards the root bridge. As soon as the root bridge has been notified about the topology change, it begins to send Configuration BPDUs with the Topology Change (TC) flag set out of its Designated Ports, so the other switches in the network are notified as well. They then shorten the MAC address aging time from 300 seconds to the forward delay (15 seconds by default), so stale entries age out quickly and the MAC addresses are re-learned according to the new topology (Pic. 3). This reduces the time of convergence from 5 minutes to about 30-50 seconds, depending on the nature of the change.

You might question that and say that the default timers used here (a 30-50 second delay) are still inappropriate for today's networks carrying voice, video and data. And you are quite right. The mechanism is still not good enough. But remember that those timers were designed as SAFE values (not causing loops) given a maximum network diameter of seven switches (hops) between the root bridge and the bottom switches. Also, remember that STP was designed when no multimedia traffic was being sent across the switches. Is there a solution to those timers? Of course. You may change them manually, but DO NOT DO THAT unless you are very experienced with STP operation. Another option is to use some proprietary features implemented in Cisco switches.
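
The 30-50 second range follows from the default timers described in Lesson 20 (max age of 20 seconds, forward delay of 15 seconds for each of the listening and learning states). Below is a small Python sketch of that arithmetic. The function name is hypothetical, and the direct/indirect labels in the comments are my own shorthand for whether max age has to expire first.

```python
MAX_AGE = 20        # seconds a port may stay blocked waiting for BPDUs
FORWARD_DELAY = 15  # seconds spent in each of listening and learning


def convergence_time(must_wait_max_age):
    """Seconds until a blocked port forwards, using the default timers."""
    wait = MAX_AGE if must_wait_max_age else 0
    return wait + FORWARD_DELAY + FORWARD_DELAY  # listening + learning


print(convergence_time(False))  # 30 (direct failure, own Root Port down)
print(convergence_time(True))   # 50 (indirect failure, max age expires first)
```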

Cisco's STP enhancements decrease this 30-50 second delay even further, allowing video, voice and data to co-exist in our layer 2 networks. Keep in mind that these enhancements are Cisco proprietary STP add-ons:

STP Portfast (now part of standard implementation as well).
STP Uplinkfast.
STP Backbonefast (this one is beyond the scope of this tutorial).

Let us see how the first two can change the behavior of our sample topology.

The STP Portfast feature should be configured on all EDGE ports, i.e. the ones that connect devices that do not send BPDU frames and cannot create loops. These would be your computers, servers, printers etc. What STP Portfast does is simply skip the LISTEN and LEARN states, going directly to the FORWARD state when a TCN has been announced or the port in question has just been brought up. Think about it. It makes no sense to flush the MAC addresses on the ports that connect computers directly, since the topology change is not going to affect them. In the topology presented in this tutorial (Pic. 1, 2, and 3), the topology change did not affect port F0/3 on either SW2 or SW3, where PC1 and PC2 are connected respectively. They are still connected where they were before the topology change, and their addresses are mapped to the same ports as before the change. So, there is no point in flushing the MAC address table entries on SW2 port F0/3 and SW3 port F0/3. These ports are the candidates for STP Portfast. Because STP Portfast-enabled ports go to FORWARD almost immediately, it is highly recommended to use this feature on ports connected to computers in order to avoid problems with obtaining an IP address via DHCP.

There are two ways of enabling STP Portfast feature.

Method 1
In the global configuration mode, type in this command:

SW1(config)#spanning-tree portfast default

All ports that are discovered as EDGE ports (more on that in my next post about Rapid STP), will have STP portfast enabled by default. You can check that using a detailed STP output regarding a port (here F0/1):

SW1#show spanning-tree interface f0/1 detail

The output shows that STP portfast has been enabled on this port (look at BPDU received = 0, candidate for portfast):

Pic. 4 - STP F0/1 Detailed Output.

Method 2
Another method is to type in the following command directly on the chosen port:

SW1(config)#interface f0/1

SW1(config-if)#spanning-tree portfast 

This way, we turn on STP Portfast unconditionally (whether port does or does not receive BPDUs).

The second STP enhancement is STP Uplinkfast. This one should be configured on all ACCESS switches (the leaf switches in our topology, NOT the distribution ones). The feature, which is enabled in global config mode, shortens the time it takes a non-designated port (NDP) to take over the Root Port (RP) role upon losing the current Root Port.

In our topology, consider SW2, which has lost its Root Port (F0/1, Pic. 2). Normally, that is, without STP Uplinkfast enabled, it would take 30 seconds for the F0/2 port to transition to the RP role. Keep in mind that F0/2 does not have to go to the blocking state since it keeps receiving superior BPDUs with the Root Bridge ID. Thus, only 30 seconds are required by default (LISTEN + LEARN states). With STP Uplinkfast enabled, Cisco guarantees that the transition of F0/2 to the forwarding state (RP role) happens in under 5 seconds.

The configuration of STP Uplinkfast is done in the global config mode as shown below:

SW1(config)#spanning-tree uplinkfast

Similar, in functionality, is STP Backbonefast that could be implemented on distribution switches. However, the details of this feature are beyond the scope of this tutorial.

In my next post, I'm going to briefly present Rapid Spanning-Tree Protocol (IEEE 802.1w) and how it differs from a regular STP (IEEE 802.1d).

If you want to see the enhancements in action, please watch the video below.
More videos are available at:

http://youtube.com/jrComputerLabs

Monday, October 25, 2010

Lesson 21 - Spanning-Tree Protocol in Practice

The previous post was designed to present STP operation in a nutshell. However, without some practice it's just academic knowledge. I think it is a good idea to look at the same concepts using real equipment. Here goes...

The topology below (Pic. 1) uses redundant links, which create loops.

Pic. 1 - Network Topology

If there's one thing the administrator should do with such a design, it is configuring the root bridge. Typically, the most powerful switch in the center of the network plays that role. You do not want some access switch to be relaying frames between other switches. Access switches are designed to connect your computers to the network, not to handle the majority of the traffic between the switches, which is what the root bridge must deal with.

If you do not configure the root bridge yourself, the switch with the lowest MAC address becomes the root, since the priority is identical on all of them by default. We do not want to leave it to chance, do we? For simplicity, I have chosen to make SW1 my root bridge. There are at least two ways to configure this.

Method 1

I can manually lower the priority on SW1 and leave the default value on the other switches. I want to make SW1 my root bridge for all the VLANs I use in my network (remember, Cisco uses PVST+). The lowest priority value allowed is zero, and if a higher one needs to be used, it must be a multiple of 4096. If you type in a value that is not allowed, the system will present you with the list of values you can use.

Step 1
Check the VLANs configured.

Step 2
Make SW1 the root bridge for all the VLANs configured in the network by decrementing the default value. Here, I will use the value of '0'.

A quick verification if the command took effect is below:

The above output confirms that SW1 has been elected as the root bridge:

 This bridge is the root

Familiarize yourself with the output of this command. All active ports of the switch are in designated role (forwarding state) as it is the root.

Also, notice that the Bridge ID and the Root ID have the same value. I assigned a priority of 0, but the extended system ID (PVST+) adds the VLAN number to the priority assigned. Thus, priority 0 + VLAN ID 500 = 500.

Priority: 500

MAC: 000b.5ff7.a080
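
The priority arithmetic above can be sketched in a few lines of Python. The function name is made up for illustration. The upper bound of 61440 is not stated in this post; it is the largest multiple of 4096 that the switch accepts and is included here as an assumption.

```python
STEP = 4096           # configured priorities must be multiples of 4096
MAX_PRIORITY = 61440  # assumed upper bound (15 * 4096), not from the post


def bridge_priority(configured, vlan_id):
    """Priority shown in the Bridge ID: configured value plus VLAN ID."""
    if configured % STEP != 0 or not 0 <= configured <= MAX_PRIORITY:
        allowed = list(range(0, MAX_PRIORITY + 1, STEP))
        raise ValueError(f"priority must be one of {allowed}")
    return configured + vlan_id


print(bridge_priority(0, 500))      # 500   (SW1 after the change)
print(bridge_priority(32768, 500))  # 33268 (a switch left at the default)
```

Passing a value such as 100 raises a `ValueError` that lists the allowed values, which mirrors how the switch responds to a priority it does not accept.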

As mentioned before, if the priority value you type in is not one of the allowed values, the system shows the numbers you can use:

Method 2

I can use the spanning-tree vlan <vlan-id> root primary macro command, which lowers the priority value using Cisco best practices.

Step 1
Check the VLANs configured like before.

Step 2
Make SW1 the root bridge for all the VLANs configured in the network by using the macro command.

And now comes the interesting bit. Having elected SW1 as the root bridge, I can predict the rest of the process. Lesson 20 provides us with all the knowledge we need to possess to tell which ports will become root ports on SW2 and SW3, as well as which ports will be designated and which will be non-designated in our topology.

Can you do that on your own?

The base MAC addresses on SW2 and SW3 are as follows (priority is default):

SW2 MAC: 000E:83DA:7580

SW3 MAC: 000D:28BF:FD40

If you want to check the base MAC address of your switch, type in:

SW#show version | include Base

At least give it a shot before you look at Pic. 2 below to check your answers. If you cannot do it yet, do not worry. I will guide you through the process using some powerful 'show' commands.

Pic. 2 - Spanning-Tree Topology Computed

There are two loops in my network. One between SW1 and SW2 using ports F0/13 and F0/14. The other loop is formed between SW3 connections to SW1 and SW2 (ports F0/15 and F0/16).

Let's look at SW2 first and see how the knowledge from lesson 20 applies here.

SW2 receives BPDU frames from SW1 on F0/13 and F0/14 ports and from SW3 on its F0/16 port. A closer look at the following output can be very informative.

The above output clearly shows which machine is the root bridge (000B:5FF7:A080). SW2 chose F0/13 as its Root Port. As you recall, the first thing to check when determining the best path towards the root bridge (the root port) is the accumulative cost towards the root. SW2 has three outgoing ports towards the root bridge, as shown in the next output:

The accumulative cost is calculated by adding two values:

Port path cost + designated path cost.

Port path cost - arbitrarily set values by IEEE (the speed-to-cost table is shown in the previous lesson).
Designated path Cost - the cost towards the root bridge advertised by the neighboring switch.

Port F0/16 can be ruled out immediately since 'port path cost' (19) + 'designated path cost' (19) amounts to: 38.

As for the two remaining candidates to become a root port (F0/13 and F0/14), the total path cost is 19 in both cases (19+0). We need to resort to the second test in our algorithm to break the tie: the lowest bridge id of the BPDU sender. Unfortunately, both ports receive BPDU frames from the same switch: SW1 (look at the previous output).

Designated Bridge has priority 500, address 000B:5FF7:A080

The next step is checking the port priority of the sender. But both ports F0/13 and F0/14 receive the same port priority (the priority part of the port id):

Designated port id is 128

The port number is not factored in at this step, only the priority value shown above.

There is only one more thing that can help us determine which of these two ports should be the root port: the lowest port number of the sender (SW1). F0/13 is lower in value than F0/14, so the former becomes the root port.
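
The elimination SW2 goes through can be sketched in Python as follows. This is an illustrative model, not switch code: the dictionary layout and the function name are made up, and I am assuming that SW2's F0/13 and F0/14 connect to the same-numbered ports on SW1 and that F0/16 connects to F0/16 on SW3, as the text suggests.

```python
SW1 = (500, "000b.5ff7.a080")    # root bridge ID for VLAN 500
SW3 = (33268, "000d.28bf.fd40")  # default priority plus VLAN ID

# One entry per SW2 port: what the BPDU received on that port tells us.
candidates = {
    "F0/13": {"cost": 19 + 0, "bridge": SW1, "prio": 128, "port": 13},
    "F0/14": {"cost": 19 + 0, "bridge": SW1, "prio": 128, "port": 14},
    "F0/16": {"cost": 19 + 19, "bridge": SW3, "prio": 128, "port": 16},
}

STEPS = [
    ("lowest cost to the root", "cost"),
    ("lowest sender bridge ID", "bridge"),
    ("lowest sender port priority", "prio"),
    ("lowest sender port number", "port"),
]


def pick_root_port(ports):
    """Apply the tie-breakers in order; return (winner, deciding step)."""
    remaining = dict(ports)
    for description, key in STEPS:
        best = min(p[key] for p in remaining.values())
        remaining = {n: p for n, p in remaining.items() if p[key] == best}
        if len(remaining) == 1:
            return next(iter(remaining)), description
    raise ValueError("candidates are identical")


print(pick_root_port(candidates))
# ('F0/13', 'lowest sender port number')
```

Changing the `prio` of F0/14 to 64 in this model makes F0/14 win at the port priority step instead.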

In the same way, SW3 chooses F0/15 as its root port since the accumulative cost using it is 19, as opposed to port F0/16, whose total cost towards the root bridge is 38.

Port F0/14 on SW2 becomes a non-designated port (NDP) because the root bridge (SW1) must have all its ports in the designated role, which means they cannot be blocked.

The last step in computing the STP active paths is to select the designated port between SW2 (F0/16) and SW3 (F0/16). Again, the same formula solves the issue. As both SW2 and SW3 advertise the same cost (19), the tie breaker is going to be the lowest bridge id of the sender. In this contest, SW2 has the higher bridge id (less preferred), which is: priority 33268, address 000E:83DA:7580

SW3's bridge id is lower (the priority is the same, but its MAC address is lower), so SW3 wins. SW3's bridge id for the same VLAN 500 is shown below:

priority 33268, address 000D:28BF:FD40

STP selects the layer 2 paths between the switches. In Pic. 2 I also showed you that all the ports connected to PC1, PC2 and R1 are in the designated role. This is because those ports do NOT receive BPDUs. They automatically become designated (forwarding state).

As the last thing in this lesson, I'd like to ask you two questions.

Assuming that SW1 is the root bridge:

Question 1
What would you need to reconfigure in our topology (pic. 1), for SW2 to choose F0/14 as the root port for VLAN 500?

Question 2
What would you need to reconfigure in our topology (pic. 1) for SW3 to choose F0/16 as the root port for VLAN 500?

NOTICE!
The method of choosing root port/designated port in the previous lesson holds the answers to these questions. Remember about the order of operation.

The answer to question 1
Since the cost is the same towards SW1 (root), we could modify it on SW2 with the following command:

SW2(config)#interface f0/13

SW2(config-if)#spanning-tree vlan 500 cost 20

This way I have increased the cost on this port to 20, and F0/14 cost now is lower (19).

Another method could be to change the port priority on the SW1 preferring port F0/14. This is how you could do it:

SW1(config)#interface f0/14

SW1(config-if)#spanning-tree vlan 500 port-priority 64

Since the path costs towards the root are identical on both ports and the sender's bridge id is the same (SW1), the third thing that influences which port to use is the port priority assigned by the BPDU sender (here SW1). This is shown in the following picture taken from SW2 (show spanning-tree vlan 500 detail):

Now, the priority imposed by SW1 on SW2's F0/14 is lower: 64 compared to port F0/13 which is 128. Port F0/14 becomes the root port.

Answer to question 2
The only way to change the root port on SW3 is to increase the cost to reach the root bridge on F0/15. For instance, you could configure the following:

SW3(config)#interface f0/15

SW3(config-if)#spanning-tree vlan 500 cost 39

Since the total cost towards SW1 (root) using port F0/15 is 39 now, and using port F0/16 the cost used equals 38, this configuration will do the job.

Did you have fun? I sure did ;)

Saturday, October 23, 2010

Lesson 20 - Spanning-Tree Protocol Operation

In my previous post I tried to stress the need for redundant connections between the switches. Multiple paths help us avoid a single point of failure in our designs. However, adding new connections inevitably creates loops, causing multiple problems. The last section of lesson 19 presented the solution: Spanning-Tree Protocol. It's time we learned a bit more about Spanning-Tree Protocol terminology and scrutinized its operation. So hold on to your hats as we begin the ride ;)

In order to understand the nuts and bolts of Spanning-Tree Protocol (STP), we need to get familiar with its terminology first.

Spanning-Tree Protocol Terminology
The ports participating in STP play different roles and those roles use different states of operation.

Spanning-Tree Port Roles

Root Port (RP) - It is a port on a non-root switch, which is the shortest (the best) path towards the root bridge. Root bridge does NOT have any root ports. (no shortest path to itself ;-))
Designated Port (DP) - It is a port that is in the forwarding state. All ports of the root bridge are designated ports (they are never in a blocking state). BPDU frames are sent out of this port.
Non-Designated Port (NDP) - It is a port that is in a blocking state in the STP topology.

Spanning-Tree Port States

Disabled - The port in this state does not participate in the STP operation (it is shut down).
Blocking - The port does NOT forward any Ethernet frames, does NOT accept any Ethernet frames (discards arriving frames), does NOT learn any MAC addresses. However, the port DOES process BPDU frames received from neighbor switches. If the port transitions to this state (blocking), it can stay blocked for 20 seconds by default (max_age).
Listening - The port in this state CAN send and receive the BPDU frames. However, the port in this state does NOT learn any MAC addresses, and does NOT forward or process incoming frames either. All Ethernet frames are being discarded. The computation of loop free topology takes place in this state. If the port transitions to this state (listening), it can stay in this state for 15 seconds by default (forward_delay).
Learning - The port in this state already knows its role (root port or designated port) in the STP domain. However, the port will not forward any Ethernet frames yet. It will be learning MAC addresses from the frames arriving at the port in order to populate the MAC address table. This helps avoid too much flooding when the port transitions to the forwarding state. If the port transitions to this state (learning), it can stay in this state for 15 seconds by default (forward_delay).
Forwarding - The port in this state will forward all Ethernet frames as per switch operation. Also, the port will process all incoming Ethernet frames and will actively learn MAC addresses from the arriving traffic.

NOTICE!
Bridges and switches are functionally the same devices. I will use both terms interchangeably.

As soon as you familiarize yourself with STP port roles and port states, it is time to explain how Spanning-Tree Protocol works.

Pic. 1 - STP Port Terminology

STP (IEEE 802.1d) Principles of Operation
STP will use three stages to compute loop free topology (pic. 2):

Single root bridge election.
Each non-root switch to select a single best port towards the root (root port).
Each non-root switch to select a single forwarding port per segment (designated port).

Pic. 2 - STP Overview

Bridge Protocol Data Unit (BPDU)
All switches communicate with one another using special frames called BPDU. Those frames contain multiple parameters that switches are going to process in order to create and maintain loop free topology.

Root Bridge
Root bridge is the switch that has all ports working in the designated role. It will be the reference point from which the loop free topology is computed. Root bridge will impose the timers that other switches will use such as:

hello time - how often BPDUs are going to be sent/relayed (default timer=2 seconds),
max age - how long the configuration is valid (default timer=20 seconds),
forward delay - how long a port should be in listening/learning state (default timer=15 seconds).

Root bridge will be announcing its presence by sending BPDU frames. Other switches will relay those frames out their designated port given the hello time. Also, the root bridge has all its ports in the designated role (forwarding).

1. Root Bridge Election

Only one switch in the layer 2 network becomes the root bridge. This is how the standard was defined, and it is known as the Common Spanning-Tree (CST) approach. Cisco changed that paradigm and introduced the Per-VLAN Spanning-Tree approach (PVST+). Cisco switches elect a root switch per VLAN, so in theory each VLAN could have its own root bridge.

Root election is based on a single parameter that is found in the BPDU frame called: Bridge ID. The switch with the lowest Bridge ID becomes the root. Bridge ID has the following format:

priority.base-mac-address

Priority is a configurable parameter that lets you influence the election so that the device you want becomes the root. The default value is 32768. The lower the value, the more likely a switch is to become the root.

Base Mac Address is the unique mac address every switch has been given by the manufacturer. It is a tie breaker in case the priority on all switches is identical.

If you've understood everything so far, you're ready to look at the election process in more detail.

Pic. 3 - Root Bridge Election.

Imagine that we've just wired our topology in Pic. 3. Now, we start up all the switches, and as soon as their ports transition to the LISTENING state, they begin to send BPDU frames out of all active ports. In those frames both the Bridge ID and Root ID parameters point to their own priority.base-mac-address value. In other words, each switch thinks it is the root bridge. It is like each switch is saying: "Hi there! This is my name (Bridge ID) and by the way I'm the root (Root ID the same as the Bridge ID value)." Since they are processing the incoming BPDUs from their neighbors, SW2 and SW3 realize that SW1's Bridge ID is lower than theirs. From that point onwards, they begin to relay BPDU frames saying that SW1 is the root bridge.

In our example, SW3 upon receiving the BPDU from SW1, SW2 and SW4 compares their Bridge ID with its own and the conclusion is that SW1's Bridge ID has the lowest value (base-mac-address breaks the tie). From this point onwards, it relays the BPDU frame out of all its active ports with the following parameters:

Bridge ID = 32768.0000.3333.3333

Root ID = 32768.0000.1111.1111

Similarly, all the switches agree that SW1 is the root (their own Bridge ID is higher).
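
The election boils down to picking the lowest (priority, MAC) pair. Here is a minimal Python sketch using the Bridge IDs from this lesson. The function name is made up, and SW4's MAC address is not given in the text, so the value below is an assumption.

```python
# Bridge ID = (priority, base MAC). Tuples compare field by field,
# so the MAC only matters when the priorities are equal.
switches = {
    "SW1": (32768, "0000.1111.1111"),
    "SW2": (32768, "0000.2222.2222"),
    "SW3": (32768, "0000.3333.3333"),
    "SW4": (32768, "0000.4444.4444"),  # assumed MAC, not given in the text
}


def elect_root(bridge_ids):
    """Return the name of the switch with the lowest Bridge ID."""
    return min(bridge_ids, key=bridge_ids.get)


print(elect_root(switches))  # SW1
```

With equal priorities the lowest MAC wins. Lowering the priority of any other switch below 32768 would make that switch the root regardless of its MAC address.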

2. Root Port Selection

As soon as the root has been elected, all non-root switches begin to calculate which port is the best (the least cost) towards the root bridge. This port will be called the root port.

Pic. 4 - Root Port Selection

SW2, SW3 and SW4 receive BPDUs from different directions. For instance, SW2 will receive them on its ports F0/1 and F0/2 (look at Pic. 4). The accumulative cost (the sum of the costs along the path towards the root) is taken into consideration. The port with the lowest cost to reach the root becomes the root port.

How is the path cost calculated?

Each speed has its arbitrarily assigned cost which is configurable. A few examples are below:

10 Mbps = 100
100 Mbps = 19
1 Gbps = 4
10 Gbps = 2

The root bridge (here SW1) is sending its BPDU frame every 2 seconds. It uses the parameter called: Root Path Cost in BPDU to advertise the cost to the root. It puts the value of '0' in it, as it is the root bridge and has no cost to itself. The frame is sent out its port F0/1 towards SW3 and F0/2 towards SW2. SW2, upon receiving it, adds the cost used to reach the sender of BPDU based on the predefined speed-to-cost value (all ports in our topology are FastEthernet=19).

Root Path Cost = 0 + 19 = 19 via F0/2

SW2 is going to advertise its best (as of now) cost out of F0/1 port towards SW3. SW3 will receive BPDU from SW1 with the Root Path Cost=0 on its F0/1 port. It will also receive BPDU from SW2 on its F0/2 interface with the Root Path Cost=19. As both ports have the cost of 19 towards those BPDU senders, the following math is done to choose the least cost path towards the root bridge:

Root Path Cost = 0 + 19 = 19 via F0/1
Root Path Cost = 19 + 19 = 38 via F0/2

It is clear that the direct connection towards root bridge via F0/1 is going to be selected as the root port.
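
The same arithmetic as a short Python sketch, using the speed-to-cost values listed above. The names `PORT_COST` and `root_path_cost` are made up for illustration.

```python
# Link speed in Mbps -> default port cost, from the table above.
PORT_COST = {10: 100, 100: 19, 1000: 4, 10000: 2}


def root_path_cost(advertised_cost, link_speed_mbps):
    """Cost advertised by the neighbor plus the cost of the local port."""
    return advertised_cost + PORT_COST[link_speed_mbps]


# SW3 hears the root directly on F0/1 and via SW2 on F0/2.
sw3_paths = {
    "F0/1": root_path_cost(0, 100),   # BPDU straight from SW1
    "F0/2": root_path_cost(19, 100),  # BPDU relayed by SW2
}
print(sw3_paths)                          # {'F0/1': 19, 'F0/2': 38}
print(min(sw3_paths, key=sw3_paths.get))  # F0/1
```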

SW3's least cost towards the root equals 19 (via its F0/1 port). This cost is what it advertises as the Root Path Cost when it sends BPDUs out of F0/2, F0/3 and F0/4. Of course, SW2 also chooses its F0/2 port as the root port since the cost is smaller.

What if the Root Path Cost is identical?

We run into that situation on SW4. It receives BPDUs on its ports F0/1 and F0/2 with the following parameters:

Bridge ID = 32768.0000.3333.3333
Root ID = 32768.0000.1111.1111
Root Path Cost = 19

The cost clearly does not help to choose a single root port as both ports have the same cost:
19 + 19 = 38.

The following algorithm is used to determine the root port or designated port (in order):

Prefer the lowest Root Path Cost.
In case of the same Root Path Cost, prefer the lowest Bridge ID of the designated switch (the neighbor that sends BPDUs).
In case of receiving BPDUs on multiple ports from the same designated switch (BPDU sender), prefer the lowest port priority of the sender (the priority part of the Port ID). That parameter has a default value of 128 and is configurable.
If none of the above resolves the tie, prefer the lowest port number of the BPDU sender.

Equipped with that knowledge let us consider SW4 now.

SW4 receives BPDUs on port F0/1 and F0/2. The Root Path cost is the same: 19 + 19 = 38 on both ports.
The designated switch (SW3), is the same switch i.e. the same Bridge ID (32768.0000.3333.3333).
The designated switch (SW3) sends BPDUs out of its F0/3 and F0/4 ports with the same priority = 128 (Port ID).
The tie breaker is the lowest port number on the sender (SW3) from which the BPDU frames arrive on SW4. Port F0/1 becomes the root port since F0/3 is lower than F0/4 on SW3.
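
Because the tie-breakers are applied strictly in order, the whole comparison can be modelled as an ordinary tuple comparison. Below is a compact Python sketch for SW4. The `Bpdu` type and its field names are made up for illustration, and I am assuming SW4's F0/1 connects to F0/3 on SW3 and F0/2 to F0/4, as the walk-through implies.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    # Field order matches the order of the tie-breakers.
    root_path_cost: int
    sender_bridge_id: tuple
    sender_port_priority: int
    sender_port_number: int


sw3 = (32768, "0000.3333.3333")

# What SW4 knows per local port after adding its own port cost (19).
received = {
    "F0/1": Bpdu(19 + 19, sw3, 128, 3),  # sent from SW3 port F0/3
    "F0/2": Bpdu(19 + 19, sw3, 128, 4),  # sent from SW3 port F0/4
}

root_port = min(received, key=received.get)
print(root_port)  # F0/1
```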

The root ports have been selected on all non-root switches (pic. 5). STP will select a single designated port (forwarding) per segment to block the redundant path towards the root bridge. This way the loop does not exist. Should any of root ports fail, it will take around 30-50 seconds to put the blocking port into forwarding state.

3. Designated Port Selection.
This procedure follows exactly the same algorithm used for root port selection.

Pic. 5 - Designated Port Selection

Since root port is the best port towards the root bridge it is going to be in the forwarding state (look at the beginning of this lesson). What is left to do, is to choose one of the ports between SW2 and SW3 as designated (forwarding) and the other as non-designated (blocked). The same applies between SW3 and SW4. Either SW3 will block its F0/4, or SW4 should block its F0/2 port.

SW3 will block its F0/2 (non-designated) and SW2 will make its F0/1 port designated (forwarding). The process will look as follows:

Root Path Cost advertised by SW2 is 19 and so is the cost advertised by SW3.
SW2 has lower Bridge ID (32768.0000.2222.2222) than SW3 (32768.0000.3333.3333). SW3 must block its F0/2.

And last selection is going to happen between SW3 (port F0/4) and SW4 (port F0/2).

The Root Path Cost advertised by SW3 is 19, but SW4 advertises its cost as 38 (two hops via F0/1). SW4 blocks its port F0/2 (non-designated), and SW3 promotes its port F0/4 to the designated role (forwarding).
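
Both segment decisions can be checked with a few lines of Python. This is only a sketch: the function name and the tuple layout are made up, and SW4's MAC address is assumed since the text does not give it.

```python
def designated_switch(a, b):
    """Each argument is (name, advertised root path cost, bridge ID).

    The switch advertising the lower cost owns the designated port;
    on a tie, the lower bridge ID wins.
    """
    return min(a, b, key=lambda s: (s[1], s[2]))[0]


sw2 = ("SW2", 19, (32768, "0000.2222.2222"))
sw3 = ("SW3", 19, (32768, "0000.3333.3333"))
sw4 = ("SW4", 38, (32768, "0000.4444.4444"))  # MAC assumed

print(designated_switch(sw2, sw3))  # SW2, so SW3 blocks its F0/2
print(designated_switch(sw3, sw4))  # SW3, so SW4 blocks its F0/2
```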

Pic. 6 - Spanning-Tree Topology Computed

This process happens in the LISTENING state of all ports. Since the topology has been computed and does not have loops (blocking appropriate ports), it is safe to move to next states: learning and finally forwarding.

In the next post, we will look at this process one more time using command line interface and real equipment.

Sunday, October 17, 2010

Lesson 19 - Spanning-Tree Protocol Overview

The VLANs described in the previous posts are very important elements of building modern networks. An equally important piece of technology is IEEE 802.1D, commonly known as the Spanning-Tree Protocol. In the following few posts, I will focus on its application and basic operation.

If your network consists of layer 2 switches that allow computers to connect and exchange data, you will need to consider a design that can withstand some types of failure.

Redundant Connections

Consider the following layer 2 design. Imagine that the SW1, SW2 and SW3 switches connect many devices and there is only a single connection between the switches, as depicted in Pic. 1.

Pic. 1 - Switch Topology Without Redundancy

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Should either of the links between the switches break, the communication between many devices fails. Such a design creates a single point of failure. We could easily tweak this simple design to make it more resilient by adding an extra path between SW2 and SW3. The picture below shows this modified design.

Pic. 2 - Redundant Paths

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Unfortunately, creating the extra path here comes at a cost. The redundant connection (Pic. 2) between SW2 and SW3 creates a loop. The loop, in turn, will create three serious problems. The last one in the list will eventually render our system unavailable. Let's see what these problems are.

Duplicate Frame Delivery

Pic. 3 - Problem 1 - Duplicate Frame Delivery

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Look at pic. 3 and imagine SW2 and SW3 do not have the MAC address of PC3 (0000.3333.3333) in their databases (CAM). This can happen if PC3 doesn't speak for more than five minutes, which is the default time a MAC address is kept in the database without being refreshed. Then, PC1 sends a frame towards PC3. As you recall, SW2 will flood the frame out of its active ports if it does not know where PC3 is located (unknown destination MAC address). The frame travels out SW2's port F0/13 towards SW1 and out the port F0/12 towards SW3. SW1 will deliver the frame to PC3. Since SW3 floods the frame out as well, it will be sent towards SW1 out of its port F0/14. Then, SW1 obediently delivers the same copy of the frame to PC3 again.

MAC Address Table Instability
Another issue caused by the loop we have created is that switches keep changing their MAC address table entries depending on the port where they last heard the sender. Consider pic. 4 below.

Pic. 4 - Problem 2 - MAC address table instability

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Again, let us assume that none of the switches in the picture knows where PC3 is connected. This means they have not learned its MAC address yet. In our scenario, PC1 sends the frame to PC3 (destination MAC: 0000.3333.3333). SW2 floods the frame out F0/12 and F0/13 ports.

Now, SW3 receives this frame sourced with 0000.1111.1111 MAC address (PC1). It learns the source MAC address and maps it to its F0/12 port where it arrived. Since SW1 does not know where PC3 is connected (at least right now) it will flood this frame out all active ports. This way, the frame is sent out SW1's port F0/14 towards SW3. SW3, upon receiving the frame on its F0/14 port, reads the source MAC address (0000.1111.1111) and maps it to port F0/14 this time. This causes a little confusion as SW3 learned it earlier on and it was port F0/12 before. Previous mapping is removed and F0/14 becomes the outbound port for 0000.1111.1111 now.
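To make the flip-flop easier to see, here is a minimal Python sketch of SW3's learning step. It is a simplification under my own assumptions: the table is a plain dictionary and the function name `learn` is made up for this illustration.

```python
mac_table = {}  # SW3's table: source MAC -> port where it was last seen

def learn(table, src_mac, port):
    """Record the port for src_mac; return the old port if the entry moved."""
    previous = table.get(src_mac)
    table[src_mac] = port
    return previous if previous not in (None, port) else None

PC1 = "0000.1111.1111"
first = learn(mac_table, PC1, "F0/12")   # copy arriving directly from SW2
second = learn(mac_table, PC1, "F0/14")  # the same frame arriving via SW1

print(first)           # None, nothing was overwritten the first time
print(second)          # F0/12, the mapping that has just been replaced
print(mac_table[PC1])  # F0/14
```

Every further copy of the frame that circulates in the loop repeats this overwrite, so the entry for PC1 never settles.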

Broadcast Storm
The last problem is really severe. It can bring our traffic to a halt. Take a look at pic. 5 below.

Pic. 5 - Problem 3 - Broadcast Storm

Icons designed by: Andrzej Szoblik - http://www.newo.pl

In this scenario, PC1 sends a broadcast frame. SW2, upon receiving it, floods it out all its active ports. SW1 receives it on port F0/13 and floods it out of its other ports. SW3 receives the broadcast frame on its F0/12 port and floods it. Then, a tad later, it receives this same broadcast frame from SW1 and again floods it out all active ports except the port it arrived on. You can write the rest of the story on your own. This broadcast is running in the loop in both directions endlessly. Well, not exactly endlessly. It is true that there is no mechanism to stop it, but all three switches in the topology will be so busy sending out this broadcast that eventually all their resources are consumed and they stop sending anything at all. If you look at switches that experience a broadcast storm, you will notice that all their LEDs are flashing amber like a Christmas tree. In a few seconds the switches become unresponsive. An attempt to access them remotely using SSH/telnet will fail. Even the console connection may refuse to accept your commands. The only way to bring the switches back into operation is to break the loop by pulling one of those cables.
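The rest of the story can also be written as a small Python simulation. It is a rough sketch under my own assumptions: only the switch-to-switch links are modelled, one round means every copy moves one hop, and the names `LOOP`, `TREE` and `flood` are made up for the example.

```python
# Switch-to-switch links only; hosts are left out of this model.
LOOP = {"SW1": ["SW2", "SW3"], "SW2": ["SW1", "SW3"], "SW3": ["SW1", "SW2"]}
# The same topology with the SW2-SW3 link taken out of service.
TREE = {"SW1": ["SW2", "SW3"], "SW2": ["SW1"], "SW3": ["SW1"]}

def flood(links, origin, rounds):
    """Count copies of one broadcast still travelling between switches."""
    # Each entry is (switch receiving the copy, neighbour it came from).
    in_flight = [(n, origin) for n in links[origin]]
    history = []
    for _ in range(rounds):
        history.append(len(in_flight))
        # Every switch floods the copy out all links except the arrival one.
        in_flight = [(n, sw) for sw, came_from in in_flight
                     for n in links[sw] if n != came_from]
    return history

print(flood(LOOP, "SW2", 5))  # [2, 2, 2, 2, 2]
print(flood(TREE, "SW2", 5))  # [1, 1, 0, 0, 0]
```

In the loop, the two copies of a single broadcast never die out, and every new broadcast sent by any host adds two more. Without the redundant link, the frame is gone after two hops.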

So, can we not have redundancy in our layer 2 topology? Of course, we can.

We will run Spanning-Tree Protocol (turned on by default), which will dynamically block redundant connections creating a loop free topology. Should the primary link fail, the one that is in the blocking state will start forwarding the traffic in about 30 seconds by default. Of course, we will need something much faster than 30 seconds, but I will show you that as soon as we know how STP works.

Here I am going to give you just an overview of its operation. But the devil is in the details which we will scrutinize in my next post.

Spanning-Tree Protocol Overview
STP is a layer 2 loop prevention mechanism. Switches running this protocol use special frames called Bridge Protocol Data Units (BPDUs). These frames contain enough information to allow the switches to create a loop free topology. This magic is accomplished in three distinct phases:

Elect a single switch to be the root bridge, which is the central device in the layer 2 network. This switch will have all its ports in the forwarding state (designated port role).
All other switches (non-root switches) will select a single path towards the root bridge. The port on that path is called the 'root port' and will forward the traffic that leaves the switch towards the root bridge. This path is the least cost (best) path towards the root.
On every remaining segment, the switches will select a single forwarding port in order to stop the loop. The port that is forwarding traffic is called the designated port. The port that is blocking traffic to stop the loop is called the non-designated port.
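The three phases can be tried out on the triangle topology with the Python sketch below. Treat it as a simplified model and not as the real protocol: it assumes every link has the same cost of 19, at most one link between any two switches, and no port ID tie-breakers. The bridge ID of SW1 and the function name `spanning_tree` are hypothetical; the lowest bridge ID winning the root election is covered in detail in the next lesson.

```python
from collections import deque

bridges = {"SW1": (32768, "0000.0000.1111"),  # hypothetical ID for SW1
           "SW2": (32768, "0000.2222.2222"),
           "SW3": (32768, "0000.3333.3333")}
links = [("SW1", "SW2"), ("SW1", "SW3"), ("SW2", "SW3")]

def spanning_tree(bridges, links, cost=19):
    # Phase 1: the switch with the lowest bridge ID becomes the root bridge.
    root = min(bridges, key=bridges.get)
    neighbours = {b: set() for b in bridges}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Root path cost of every switch (breadth-first, equal link costs).
    path_cost = {root: 0}
    queue = deque([root])
    while queue:
        sw = queue.popleft()
        for n in neighbours[sw]:
            if n not in path_cost:
                path_cost[n] = path_cost[sw] + cost
                queue.append(n)
    # Phase 2: each non-root switch picks its root port, i.e. the link
    # towards the neighbour offering the best (cost, bridge ID) pair.
    root_port = {sw: min(neighbours[sw],
                         key=lambda n: (path_cost[n], bridges[n]))
                 for sw in bridges if sw != root}
    # Phase 3: on every other link the worse end blocks its port.
    blocked = []
    for a, b in links:
        if root_port.get(a) == b or root_port.get(b) == a:
            continue
        loser = max(a, b, key=lambda s: (path_cost[s], bridges[s]))
        blocked.append((loser, a if loser == b else b))
    return root, root_port, blocked

root, root_port, blocked = spanning_tree(bridges, links)
print(root)       # SW1
print(root_port)  # {'SW2': 'SW1', 'SW3': 'SW1'}
print(blocked)    # [('SW3', 'SW2')]: SW3 blocks its port towards SW2
```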

I will explain all the terms and the above process in detail in my next post. Meanwhile, check pic. 6 first.

Pic. 6 - Spanning-Tree Protocol

Icons designed by: Andrzej Szoblik - http://www.newo.pl

In the above picture, SW1 has been elected as the root bridge. SW2 uses port F0/13 as its root port (the best, or the least cost path towards the root). SW3 uses its port F0/14 as the root port. SW3 blocks the port F0/12 to stop the loop. SW2 keeps sending BPDU frames originated by the root bridge (SW1) out its F0/12 port towards SW3.

Now, what is really fascinating is that a loop free structure like the one above is built automatically (although you want to, and will, affect how it works), and that if the communication between SW2 and SW1, or SW3 and SW1, is broken, the SW3 port F0/12 will be put in the forwarding state.

If you are interested in the details how STP works please read my next post (lesson 20).

Thursday, October 7, 2010

Lesson 18 - VTP and VLAN Quiz

The last lesson presented the gory details behind inter-VLAN routing. Now, I would like you to play a little game with me. A simple quiz will check your understanding of both access and trunk ports. Have fun!

Study the first topology carefully (Pic. 1) and answer the question 1.

Question1: When PC1 is sending broadcast frames (destination MAC address: FFFF.FFFF.FFFF), which computers are going to receive them?

NOTICE!
All switch-to-switch connections in Pic.1 are ACCESS ports.

Pic. 1 - Switches are connected using ACCESS mode (NOT a trunk mode).

Icons designed by: Andrzej Szoblik - http://www.newo.pl

The answer to question 1 can be found at the bottom of this post. But try not to cheat. Check the answer after you have provided yours;).

Study the second topology carefully (Pic. 2) and answer the question 2.

Pic. 2 - Switches are connected using TRUNKING mode.

Icons designed by: Andrzej Szoblik - http://www.newo.pl

Question2: When PC1 is sending broadcast frames (destination MAC address: FFFF.FFFF.FFFF), which computers are going to receive them?

The answer to question 2 can be found at the bottom of this post. But try not to cheat. Check the answer after you have provided yours;).

Now, I can start our last lesson related to VLANs. It's about Cisco Vlan Trunking Protocol.

Vlan Trunking Protocol (VTP)
Cisco created this protocol to facilitate automatic VLAN distribution between switches that use trunking connections. There is a similar protocol called GVRP, which is the industry standard solution. At first glance, it looks like it works like VTP, but it has some significant differences. For more details, use Google.

What Does VTP Do?
VTP is turned on by default but there are some things that need to be configured for this to work. The idea is very simple: instead of typing in all the VLANs end-to-end, (on all switches individually), you can configure VLANs on one switch only, and the same configuration will be propagated to all switches in the network (VTP domain to be more accurate). By 'the same configuration', I mean that VLAN database is synchronized (exchanged) between all the switches. In other words, VLAN numbers and their names are exchanged. The port-to-VLAN assignments are NOT exchanged. Consider this example:

SW1(config)#vlan 4

SW1(config-vlan)#name IT_Dept

SW1(config-vlan)#end

SW1#

The above configuration creates 'VLAN 4' and assigns the name of 'IT_Dept' to it. In a split second, the same VLAN 4 named IT_Dept is populated into the database of all switches in the network (VTP domain).

How Does VTP Work?
As you recall from the lesson 16, VLAN configuration typically involves three steps:

Configuring VLAN numbers in the 'global config' mode. Optionally, you can also give those VLAN unique names.
Assigning interfaces to VLANs (access mode). Optionally, you can map MAC addresses to VLANs (access dynamic mode), but this method requires a VMPS server.
Configuring trunking connections between the switches (if the same VLANs are applied on all switches - aka end-to-end VLANs).

Even though VTP is turned on by default, a few things must be configured for VLANs to be distributed among switches.

Switches must belong to the same VTP domain (the same domain name must be configured on the switches to synchronize their VLAN databases).
If, optionally, switches use domain password, this password must be identical on all switches in the VTP domain.
The connections between switches must be in the 'trunking mode' (it is Vlan TRUNKING Protocol after all).
The VTP version must be the same on all switches (there are VTP versions 1, 2 and 3).

VTP Modes
A Cisco switch can be configured in one of the three VTP modes:

VTP Server (default mode) - this mode allows you to add, remove and modify VLANs in the database. Everything is saved in NVRAM (Non-Volatile RAM memory - the one that does not lose its content on power-down).
VTP Client - in this mode you CANNOT create VLANs in the local database. The VTP client learns VLANs from a VTP server: it sends an advertisement request, and the server responds by sending information about the VLANs and their names used in the domain (subset advertisement).
VTP Transparent - is similar to the server mode of operation. The major difference is that a switch in the transparent mode does NOT participate in the VTP domain. This means that it does NOT synchronize its database with any other switch (it keeps a local database of VLANs), and it does NOT learn VLANs propagated by a VTP server. A transparent switch WILL forward VTP messages between other switches over trunk ports.

The VTP server sends a special VTP frame every 5 minutes out of all trunking ports. This message is the summary advertisement. Among other pieces of information, it includes:

VTP domain name
MD5 digest (if password is used in VTP domain)
Revision number

If the VLAN database changes (VLAN added, removed, name modified etc.), the VTP server sends a new summary advertisement IMMEDIATELY with the revision number incremented. All other switches, upon receiving this message, will compare their own VTP domain name, protocol version, MD5 digest (if used), and revision number with the ones in the message. If the 'revision number' in the incoming message is HIGHER than the last seen, they send an advertisement request message towards the server. The VTP server responds with one or more subset advertisements describing all the VLANs found in its database. This new information is going to replace the old one on all other VTP client or server switches.
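The decision a switch makes when a summary advertisement arrives can be sketched in Python as follows. This is a simplified model of the comparison described above and not the real frame format: the class `VtpSummary`, the function `needs_update` and the digest string are my own placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VtpSummary:
    domain: str
    version: int
    digest: str    # stand-in for the MD5 digest, not a real one
    revision: int

def needs_update(local: VtpSummary, received: VtpSummary) -> bool:
    """True if the switch should send an advertisement request."""
    same_domain = ((received.domain, received.version, received.digest)
                   == (local.domain, local.version, local.digest))
    # Only a HIGHER revision number from the same domain triggers a request.
    return same_domain and received.revision > local.revision

local = VtpSummary("CCNA", 1, "abc123", 4)
print(needs_update(local, VtpSummary("CCNA", 1, "abc123", 5)))  # True
print(needs_update(local, VtpSummary("CCNA", 1, "abc123", 4)))  # False
print(needs_update(local, VtpSummary("LAB", 1, "abc123", 9)))   # False
```

Note that the comparison only looks at the revision number, not at which database is the "right" one. That is why a switch plugged in with a higher revision number can overwrite the VLANs of the whole domain.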

VTP Pruning
Vlan Trunking Protocol offers one more interesting feature called: PRUNING. It allows the switches to communicate over trunks which VLAN traffic should not be sent down from the upstream switch. Consider the Pic. 3 below:

Pic. 3 - VTP VLAN Pruning Example.

Icons designed by: Andrzej Szoblik - http://www.newo.pl

If SW1 is sending broadcast from VLAN 10, the frames will be flooded out of all active ports in VLAN 10 as well as the trunking ports. Recall, that the ports in the trunking mode are multi VLAN ports allowing ALL of them by default (VLANs 1-4094). SW2 receives the broadcast from VLAN 10, on its interface F0/13, but then realizes, that currently there are NO members of VLAN 10 connected to any of its ports.

When VTP Pruning is enabled, SW2 will inform SW1 (pic. 3) that it does not want to receive traffic from VLAN 10. Should you connect at least one host in VLAN 10 to SW2 though, it will send another request, so that VLAN 10 is no longer pruned on the SW1 trunk port F0/13. It happens automatically without any further configuration.
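The effect of pruning on a single trunk can be approximated with a set difference, as in the Python sketch below. It is only an illustration: the function name and the example VLAN numbers (other than VLAN 10) are my own, and 'needed downstream' here stands for every VLAN that has members on SW2 or on switches behind it. The sketch also assumes the usual default that only VLANs 2-1001 are pruning-eligible, so VLAN 1 is never pruned.

```python
PRUNING_ELIGIBLE = range(2, 1002)  # assumed default: VLANs 2-1001

def vlans_to_prune(trunk_vlans, needed_downstream):
    """VLANs whose flooded traffic need not be sent down this trunk."""
    needed = set(needed_downstream)
    return sorted(v for v in set(trunk_vlans)
                  if v in PRUNING_ELIGIBLE and v not in needed)

# SW2 has no ports in VLAN 10, so SW1 prunes VLAN 10 on its trunk to SW2.
print(vlans_to_prune([1, 10, 20], needed_downstream=[20]))      # [10]
# After a host in VLAN 10 is connected to SW2, nothing is pruned.
print(vlans_to_prune([1, 10, 20], needed_downstream=[10, 20]))  # []
```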

Of course, if you do not want to use VTP, you do not have to. You can configure VLANs manually on all your switches. Also, you can prune the traffic on trunk ports manually. The commands are shown below:

SW1#conf t

SW1(config)#interface f0/13

SW1(config-if)#switchport encapsulation dot1q

SW1(config-if)#switchport mode trunk

SW1(config-if)#switchport trunk allowed vlan 10,15,22

SW1(config-if)#

The above configuration will allow only VLANs 10, 15 and 22 to cross the trunk f0/13 (this is done by the 'switchport trunk allowed vlan' command).
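If you want to reason about such an allowed list outside of the switch, for example in a small audit script, the list can be expanded and checked as in the Python sketch below. The function names are my own. Besides the comma-separated form used above, the sketch also accepts ranges such as 10-12.

```python
def parse_allowed(spec: str) -> set:
    """Expand an allowed-VLAN string such as '10,15,22' or '10-12,22'."""
    vlans = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        vlans.update(range(int(low), int(high or low) + 1))
    return vlans

def crosses_trunk(vlan_id: int, spec: str) -> bool:
    """True if frames from vlan_id are allowed on a trunk with this list."""
    return vlan_id in parse_allowed(spec)

print(sorted(parse_allowed("10,15,22")))  # [10, 15, 22]
print(crosses_trunk(15, "10,15,22"))      # True
print(crosses_trunk(20, "10,15,22"))      # False
```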

VTP Configuration
In order to illustrate configuration steps, I am going to use the same topology as in a few previous posts.

Pic. 4 - Topology Diagram

Icons designed by: Andrzej Szoblik - http://www.newo.pl

The default configuration looks like the output below: 'show vtp status'.

Pic. 5 - Default VTP settings.

Well, in my output, the only setting that is not the default is the 'Number of existing VLANs'. I have one VLAN configured (VLAN 500), which is my management VLAN allowing me to access the switch remotely.

As you can notice, the VTP mode is server, and the domain name is empty (no domain name configured). So, the switch allows you to configure VLANs, but the database is not going to be propagated to other switches. Below are the steps of introducing the VTP protocol for the first time.

NOTICE!

The command: vtp mode transparent in the 'global config' mode will clear the revision number back to '0'.

In my topology I am going to use the secure way of introducing VTP protocol FOR THE FIRST TIME!!!

Step 1
Clear the revision number on all the switches by typing the following in the 'global config' mode:

switch(config)#vtp mode transparent

switch(config)#

Step 2
Initially, I am going to use SW1 as the VTP server, SW2 and SW3 as the VTP clients.

SW1(config)#vtp mode server

SW1(config)#

SW2(config)#vtp mode client

SW2(config)#

SW3(config)#vtp mode client

SW3(config)#

Step 3
Configure VTP domain on SW1 (here domain name = CCNA).

SW1(config)#vtp domain CCNA

NOTICE!
SW1 is going to propagate the domain name (CCNA) to all other switches. They will learn it on their trunk ports. Trunk ports were configured in my previous lab.

Step 4
Apply the same password (MD5 algorithm is used), on all switches, so if somebody plugs in a new switch, that new switch without this password is not going to change the VLANs configured so far. Here the password used is: Secret123

SW1(config)#vtp password Secret123

SW1(config)#

SW2(config)#vtp password Secret123

SW2(config)#

SW3(config)#vtp password Secret123

SW3(config)#

Step 5 (Optional)
Enable VTP pruning to save bandwidth by not transmitting the broadcast traffic towards switches that have no members of VLANs defined in their databases. If you configure this on SW1 (server), this will enable pruning on all switches in our VTP domain.

SW1(config)#vtp pruning

If you want to make other switches servers, you can change their mode of operation now.

This way we have introduced the VTP domain, and now you can add, remove and modify VLANs on one switch (VTP server), and all these changes will be propagated to all switches (VTP servers or clients) in your domain CCNA.

I will have two videos recorded soon. The first one will provide you with explanation to the answers of my quiz presented above. The second one will show you how VTP can cause serious problems in your network if you do not take the right precautions.

And now, let me give you the answers to the quiz questions.

The answer to question 1 (pic. 1):
The broadcast frames sent by PC1 will be sent to PC2 and PC5.

Explanation:

1. If switch connections are not using trunks, the sending switch will flood the broadcast out of all ports, except the one it was received on, as long as they belong to the same VLAN. The sending port does not include a VLAN tag (number), since only TRUNK ports attach the extra 4 bytes with the sending VLAN id (tag).

2. The receiving switchport is going to accept the frame without the TAG because it is an access port (not a trunk). It assumes that the frame belongs to the VLAN number it was configured to use locally (it makes no difference whether a computer, a printer, a router or another switch is connected to the port). It follows what the first switch did. It interprets the incoming frame as a frame that belongs to the VLAN it was configured with (switchport access vlan #). Since the frame destination address is broadcast, it floods it out of all ports that belong to the same VLAN number as the receiving port.

3. The next switch receiving this frame will follow step 2.
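Steps 1 and 2 can be reproduced with a tiny Python model. Be aware that the port numbers and VLAN assignments below are hypothetical and do NOT come from Pic. 1; they only show how an untagged frame 'changes' its VLAN when two access ports in different VLANs are cabled together.

```python
def receivers(switch_ports, ingress_port):
    """Ports that get a broadcast which arrived untagged on ingress_port."""
    # The VLAN is taken from the receiving access port, not from the frame.
    vlan = switch_ports[ingress_port]
    return sorted(p for p, v in switch_ports.items()
                  if v == vlan and p != ingress_port)

# Hypothetical port -> access VLAN assignments.
sw_a = {"F0/1": 10, "F0/2": 10, "F0/13": 10}  # F0/13 is cabled to SW-B
sw_b = {"F0/13": 20, "F0/3": 20, "F0/4": 10}  # its F0/13 sits in VLAN 20

print(receivers(sw_a, "F0/1"))   # ['F0/13', 'F0/2']
print(receivers(sw_b, "F0/13"))  # ['F0/3']
```

The host on F0/3 (VLAN 20 on the second switch) receives the broadcast that started in VLAN 10, while the host on F0/4, which is in VLAN 10, does not.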

The best way is to experiment: two switches and three computers (1 sender on SW1 and 2 receivers on SW2) with Wireshark enabled. ;)

The answer to question 2 (pic. 2):
The broadcast frames sent by PC1 will be sent to PC3 and PC6.

In my next post I will talk about Spanning-Tree Protocol.