Network devices added to a bridge are delayed 30 seconds before forwarding traffic

I’m building a new R9 machine and using VLANs (e.g. VLAN10 is my LAN). Because this machine will also be a VM host, I want the virtual machines to be able to access this network too.

So what I do is create a bridge br-lan and then add the VLAN10 interface to that.

e.g.

nmcli connection modify enp1s0 ipv4.method disabled ipv6.method disabled

nmcli connection add type bridge con-name br-lan ifname br-lan ipv6.addr-gen-mode eui64 ip4 10.0.0.XX/24 gw4 10.0.0.1 ipv4.dns "10.0.0.1 10.0.0.2" ipv4.dns-search "example.org"

nmcli connection add type vlan con-name enp1s0.10 ifname enp1s0.10 vlan.parent enp1s0 vlan.id 10 slave-type bridge master br-lan

Now this works nicely.

% brctl show br-lan
bridge name     bridge id               STP enabled     interfaces
br-lan          8000.b0416f0e52ab       yes             enp1s0.10

% ip -4 addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
7: br-lan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    inet 10.0.0.XX/24 brd 10.0.0.255 scope global noprefixroute br-lan
       valid_lft forever preferred_lft forever

I can see the network, I can ssh into the machine. Great!

Except… on bootup it seems to take 30 seconds for the interface to settle down…

% dmesg | grep br-lan
[    6.432899] br-lan: port 1(enp1s0.10) entered blocking state
[    6.432902] br-lan: port 1(enp1s0.10) entered disabled state
[    6.433246] br-lan: port 1(enp1s0.10) entered blocking state
[    6.433248] br-lan: port 1(enp1s0.10) entered listening state
[   21.541388] br-lan: port 1(enp1s0.10) entered learning state
[   36.901295] br-lan: port 1(enp1s0.10) entered forwarding state
[   36.901317] br-lan: topology change detected, propagating
[   36.901492] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready

Now this delay also occurs if I boot up a VM. e.g. if the VM has this network config…

    <interface type='bridge'>
      <mac address='52:54:00:95:45:3b'/>
      <source bridge='br-lan'/>
      <target dev='v-newtest'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>

Now on virsh start I can see the interface get created and added to the bridge:

% brctl show br-lan
bridge name     bridge id               STP enabled     interfaces
br-lan          8000.b0416f0e52ab       yes             enp1s0.10
                                                        v-newtest

And it works… after 30 seconds! For the first 30 seconds there is no traffic flowing to the VM. After that it flows properly.

And from the host kernel logs we can see a similar pattern of delay

% dmesg | grep v-newtest
[ 3646.417091] br-lan: port 2(v-newtest) entered blocking state
[ 3646.417093] br-lan: port 2(v-newtest) entered disabled state
[ 3646.417139] device v-newtest entered promiscuous mode
[ 3646.417274] br-lan: port 2(v-newtest) entered blocking state
[ 3646.417276] br-lan: port 2(v-newtest) entered listening state
[ 3661.861297] br-lan: port 2(v-newtest) entered learning state
[ 3677.221284] br-lan: port 2(v-newtest) entered forwarding state

In both cases it appears that the new device spends 15 seconds in “listening state” and then another 15 seconds in “learning state” before it finally transitions to “forwarding state”.
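To put numbers on that rather than eyeballing the timestamps, here is a rough sketch of a helper that reads kernel log lines and prints how long a port sat in the listening and learning states. The function name stp_state_times is made up for this post, and it assumes dmesg timestamps are enabled so that the first number on each line is the timestamp.

```shell
# Sketch: report how long a bridge port spent in each STP waiting state.
# Usage: dmesg | stp_state_times enp1s0.10
# Assumes each line starts with a "[   6.432899]" style timestamp.
stp_state_times() {
  awk -v port="$1" '
    # Only look at state-change lines for the requested port.
    index($0, "(" port ") entered ") && match($0, /[0-9]+\.[0-9]+/) {
      ts = substr($0, RSTART, RLENGTH)   # leftmost number is the timestamp
      state = $(NF - 1)                  # e.g. "listening" in "entered listening state"
      # When the port leaves listening or learning, print the time spent there.
      if (prev == "listening" || prev == "learning")
        printf "%s: %.1fs\n", prev, ts - prev_ts
      prev = state
      prev_ts = ts
    }'
}
```

Fed the boot log above for enp1s0.10, this should print "listening: 15.1s" and "learning: 15.4s", i.e. roughly 15 seconds in each state.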

VMs (even the host) can now boot up quicker than this; the VM host starts with “localhost” as a name because DHCP doesn’t complete in time, waiting on the network!

Is there any way to speed this up?

Oh… it’s a function of STP… If I turn that off then the network comes alive immediately!
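For anyone wanting to confirm this on their own bridge before changing anything, the kernel exposes the relevant settings under sysfs. The sketch below reads stp_state (0 means off) and forward_delay, which I'm assuming is reported in hundredths of a second as usual, so 1500 means 15 seconds. The function name bridge_stp_info and the SYSFS_ROOT override are hypothetical, the latter only there so the function can be tried against a fake directory tree.

```shell
# Sketch: show whether STP is on for a bridge and the wait that implies.
# Usage: bridge_stp_info br-lan
# SYSFS_ROOT is a made-up override for testing; it defaults to /sys.
bridge_stp_info() {
  br_dir="${SYSFS_ROOT:-/sys}/class/net/$1/bridge"
  stp_state=$(cat "$br_dir/stp_state") || return 1
  delay_cs=$(cat "$br_dir/forward_delay") || return 1   # hundredths of a second
  if [ "$stp_state" -eq 0 ]; then
    echo "$1: STP off, ports forward immediately"
  else
    # A new port waits one forward delay in listening, then one in learning.
    echo "$1: STP on, forward delay $((delay_cs / 100))s, expect ~$((2 * delay_cs / 100))s before forwarding"
  fi
}
```

With a 15 second forward delay and STP on, that accounts for the 30 seconds seen in the logs: two waits of one forward delay each.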

Bootup

[    6.459881] br-lan: port 1(enp1s0.10) entered blocking state
[    6.459884] br-lan: port 1(enp1s0.10) entered disabled state
[    6.460168] br-lan: port 1(enp1s0.10) entered blocking state
[    6.460169] br-lan: port 1(enp1s0.10) entered forwarding state
[    6.460207] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready

and VM start

[   33.821006] br-lan: port 2(v-newtest) entered blocking state
[   33.821009] br-lan: port 2(v-newtest) entered disabled state
[   33.821189] br-lan: port 2(v-newtest) entered blocking state
[   33.821191] br-lan: port 2(v-newtest) entered forwarding state

So if I modify the bridge definition to include bridge.stp no, everything comes up so much faster.

Since I don’t have any loops in my network (it’s just a home network with VLAN aware switches) I don’t need STP at all.
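For an existing bridge, the change can be applied to the connection profile with nmcli. This is a minimal sketch wrapped in a function (disable_bridge_stp is just a name I picked), assuming the profile is called br-lan as above and that the network really has no loops for STP to protect against.

```shell
# Sketch: turn STP off on an existing NetworkManager bridge profile.
# Usage: disable_bridge_stp br-lan
disable_bridge_stp() {
  con="${1:-br-lan}"
  nmcli connection modify "$con" bridge.stp no || return 1
  # Re-activate the profile so the new setting is applied (a reboot
  # would also do it). Expect a brief interruption on the bridge.
  nmcli connection up "$con"
}
```

When creating the bridge from scratch, adding bridge.stp no to the nmcli connection add line for br-lan should have the same effect.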

(Sometimes just writing out the problem helps you solve it!!)
