Replacing iptables (CentOS 7) with firewalld (Rocky 9)

So I have a pretty complicated network setup. My current router is C7 but, of course, this is EOL; so I’m looking at replicating with R9.

I guess I have 4 zones which correspond to 4 network bridges
WAN (br-wan)
LAN (br-lan)
GUEST (br-guest)
IOT (br-iot)

Some of the basic rules are
LAN goes everywhere
GUEST/IOT can see WAN and specific ports on LAN
Some port forwarding from WAN to LAN

The standard complication is “reflection”; if I have a web server exposed to the internet then guest/iot should also be able to see it, so with iptables we need reflection rules.

My current setup is described at Building a home router · Ramblings of a Unix Geek

Now I know iptables still works in R9 (“deprecated”) but I’d like to learn how to do it in the modern world; I’ve worked out the nmcli commands to build the bridges.

#!/bin/sh

WAN=enp1s0
LAN=enp2s0

# Clean up any residual configurations
nmcli connection delete $WAN $LAN.10 $LAN.11 $LAN.12 br-lan br-guest br-iot br-wan
nmcli device delete $LAN.10 $LAN.11 $LAN.12

# Configure internet (WAN)

nmcli device set $WAN autoconnect yes

nmcli connection add type bridge con-name br-wan ifname br-wan bridge.stp no ipv4.method auto ipv4.dns "10.0.0.1 10.0.0.5" ipv4.dns-search "spuddy.org" ipv6.method disabled
nmcli connection add type bridge-slave con-name $WAN ifname $WAN master br-wan # ipv4.method disabled ipv6.method disabled

# Configure the separate VLANs

nmcli device set $LAN autoconnect no

## nmcli connection modify $LAN ipv4.method disabled ipv6.method disabled

nmcli connection add type bridge con-name br-lan ifname br-lan bridge.stp no ip4 10.0.0.1/24 ipv6.method disabled
nmcli connection add type vlan con-name $LAN.10 ifname $LAN.10 vlan.parent $LAN vlan.id 10 slave-type bridge master br-lan
nmcli device set $LAN.10 autoconnect yes

nmcli connection add type bridge con-name br-guest ifname br-guest bridge.stp no ip4 10.100.100.1/24 ipv6.method disabled
nmcli connection add type vlan con-name $LAN.11 ifname $LAN.11 vlan.parent $LAN vlan.id 11 slave-type bridge master br-guest
nmcli device set $LAN.11 autoconnect yes

nmcli connection add type bridge con-name br-iot ifname br-iot bridge.stp no ip4 10.100.200.1/24 ipv6.method disabled
nmcli connection add type vlan con-name $LAN.12 ifname $LAN.12 vlan.parent $LAN vlan.id 12 slave-type bridge master br-iot
nmcli device set $LAN.12 autoconnect yes
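
The three VLAN stanzas above differ only in VLAN id, bridge name and address, so one option is to generate them from a small table. This is a minimal sketch, assuming the same interface name and addressing as the script above; vlan_bridge_cmds is a made-up helper name, and it only prints the nmcli commands so they can be reviewed before being piped to sh.

```shell
#!/bin/sh
# Sketch only: print (do not run) the nmcli commands for one VLAN-backed bridge.
# vlan_bridge_cmds is a hypothetical helper; arguments: parent-nic vlan-id bridge cidr
vlan_bridge_cmds() {
  parent=$1 vid=$2 bridge=$3 cidr=$4
  echo "nmcli connection add type bridge con-name $bridge ifname $bridge bridge.stp no ip4 $cidr ipv6.method disabled"
  echo "nmcli connection add type vlan con-name $parent.$vid ifname $parent.$vid vlan.parent $parent vlan.id $vid slave-type bridge master $bridge"
  echo "nmcli device set $parent.$vid autoconnect yes"
}

LAN=${LAN:-enp2s0}

# One line per network: VLAN id, bridge name, router address on that bridge.
while read -r vid bridge cidr
do
  vlan_bridge_cmds "$LAN" "$vid" "$bridge" "$cidr"
done <<EOF
10 br-lan 10.0.0.1/24
11 br-guest 10.100.100.1/24
12 br-iot 10.100.200.1/24
EOF
```

Adding a fourth network would then be one more line in the table rather than three more commands.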

But the firewall configuration is a lot harder; all the google-able documents are simple LAN/WAN stuff, without additional networks!

Any pointers?

If you don’t want to use firewalld, you can use straight nftables. You can convert your current ruleset to nftables equivalents as a starting point.

iptables-restore-translate
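
A rough sketch of how that translation could be run. It assumes the old rules were dumped with iptables-save and copied to the R9 box (the translate tool may not exist on the C7 side); the file paths and the translate_rules wrapper are made up for illustration.

```shell
#!/bin/sh
# Sketch only: convert an iptables-save dump into nft syntax for hand editing.
# translate_rules is a hypothetical wrapper; arguments: input-dump output-file
translate_rules() {
  src=$1 dst=$2
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
  command -v iptables-restore-translate >/dev/null 2>&1 || {
    echo "iptables-restore-translate not installed" >&2; return 1; }
  # -f names the saved ruleset; the nft equivalent is written to stdout.
  iptables-restore-translate -f "$src" > "$dst"
}

# Example (paths are assumptions):
#   iptables-save > /root/c7-rules.txt            # on the old router
#   translate_rules /root/c7-rules.txt /root/c7-rules.nft
#   nft -c -f /root/c7-rules.nft                  # -c checks the file without loading it
```

The output is a starting point to clean up by hand, not something to load blindly.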

Otherwise if you want to do it in firewalld, you just need to use the right zones and right rules. For example, the gateway should be in external. The other LAN stuff can remain in say, internal.

Only other thing is you’ll need to setup ingress and egress rules. I would check out the page below.
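
To give an idea of what those ingress and egress rules look like as firewalld policy objects, here is a minimal sketch. It assumes a firewalld new enough to have policies (the one in EL9 does); the policy name lan-to-wan and the policy_cmds helper are invented for the example, and the script only prints the commands for review.

```shell
#!/bin/sh
# Sketch only: print firewall-cmd commands for one zone-to-zone policy.
# policy_cmds is a hypothetical helper; arguments: policy-name ingress-zone egress-zone target
policy_cmds() {
  name=$1 ingress=$2 egress=$3 target=$4
  echo "firewall-cmd --permanent --new-policy $name"
  echo "firewall-cmd --permanent --policy $name --add-ingress-zone $ingress"
  echo "firewall-cmd --permanent --policy $name --add-egress-zone $egress"
  echo "firewall-cmd --permanent --policy $name --set-target $target"
}

# Bind the bridges to zones through NetworkManager (zone choice as suggested above).
echo "nmcli connection modify br-wan connection.zone external"
echo "nmcli connection modify br-lan connection.zone internal"

# LAN may reach the internet; the policy name is made up for illustration.
policy_cmds lan-to-wan internal external ACCEPT
echo "firewall-cmd --reload"
```

GUEST and IOT would need their own zones and more restrictive policies towards LAN; that part is not shown here.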

I thought about nftables but I have to wonder; since the RedHat default is to push firewalld (which is meant to be the abstraction layer), is nftables the right approach?

On the other hand, I guess R9 is gonna be supported for 8 more years, so I can just punt the question and it’s possible firewalld will disappear in RedHat10 or 11 :wink:

The “gotcha” for both firewalld and nftables is still going to be reflection, unless they do magic iptables doesn’t!

I personally use nftables by hand rather than firewalld. The option is there. You don’t have to have firewalld if you don’t want it. But there are users who prefer firewalld. As for me, I have a set of rules that cannot reasonably be put into firewalld and I did not want to go down using the --direct route. That’s why I just use nftables, just as I did everything by hand with iptables in the past.

Whether you go and use nftables or firewalld, you may or may not have to do additional manual work to get what you’re after. My suggestion is to look into how --direct works and how rich rules work for firewalld. For nftables, if you want to see close to equivalent syntax, send your iptables rules into the iptables-restore-translate command to see what it gives you.

This will still require work on your part.

Red Hat writes in Chapter 2. Getting started with nftables | Red Hat Product Documentation

  • firewalld: Use the firewalld utility for simple firewall use cases. The utility is easy to use and covers the typical use cases for these scenarios.
  • nftables: Use the nftables utility to set up complex and performance-critical firewalls, such as for a whole network.

So the answer is a ‘yes’ (even by RH).


There is one thing that FirewallD has built-in: integration with NetworkManager. When a new interface – connection – comes up dynamically, NM tells that to FirewallD and FirewallD then adds rules for name-of-interface. I had CentOS 7 VM’s that used to get ethN names from libvirt/KVM and I had NM connections bound to MAC address. Since the names were unpredictable, this NM-FirewallD integration was a bliss.

However, “everyone” should now have access to persistent device names and a router like yours does not get new network devices on the fly. Therefore, the more static nftables.service should be ok.


FirewallD (in EL) did receive proper support for routing – those policy objects that @iwalker describes – very recently. I tried to set up a port forward on an (AlmaLinux) 8 system with it and failed. An nftables ruleset for that system was quite easy to write manually.

As said, there is a translation utility to generate an nftables ruleset file from an iptables file. However, I have mostly started by dumping the output of nft list ruleset to a file on a system that has the default FirewallD config. Then clean up (a lot) and add your custom rules.

Even the iptables command in EL8 and EL9 is merely a wrapper/translator – it writes nftables rules to the kernel just like the FirewallD.

Red Hat tends to have: Technology Preview → In use → Deprecated → no more in RHEL.

Someone from Red Hat wrote:

Just a note about how deprecations work in RHEL, we declare something ‘deprecated’ but continue to enable it for the life of a major release and reserve the right to remove functionality in a future major release.

There is still time for a component of RHEL 9 (FirewallD) to get “deprecated” status, but it could nevertheless remain in 10 and 11.

The CentOS Stream 10 is already somewhere. That is a preview to RHEL 10 and seems to have firewalld-2.1.1-1.el10.noarch, which is not “just” a firewalld-1.x that RHEL 9 has.

I did mine on Rocky 9, so it could well be that something in firewalld differs on Rocky 8 and that is why it didn’t work for you. Although TBH I would have expected it to fail more on EL7 due to the huge differences between firewalld in EL7 and EL8. I don’t remember if I tried on Rocky 8 or not or whether it worked or not. I would have expected it to.

But I agree, for simple inbound stuff and simple rules between zones, firewalld should be fine for that. It’s when it comes to trying to use the rich rules where I had issues in that it doesn’t always work as it should have done. In which case, this is where most likely nftables would pick up and replace using firewalld.

I do expect firewalld to get better though, and perhaps in EL10, a lot of the advanced rich rule functionality may be simplified or actually work as intended.

There is what and how. When we write iptables or nftables rules directly, we have to know the syntax, the how, that will give us the what we need.

The firewalld offers us concepts, for example port forward and Samba service, that require multiple rules. They are the what, and firewalld knows how. (In EL7 it generated iptables rules into netfilter, in EL[89] it injects nftables rules into kernel.) In principle a firewalld config from CentOS 7 would work in Rocky 9 too.

Naturally, we can’t get away without the “how” entirely. For one, it is good to check what the kernel actually has, and if nftables syntax is greek to you, then you are lost. More importantly, one has to know how to configure firewalld.


A topic already touched is that kernel can do things that the firewalld can’t – has no support for. Yet. At least no convenient ways. The ‘direct rules’ … man firewalld.direct does say in EL9:

The direct interface has been deprecated. It will be removed in a future release. It is superseded by policies, see firewalld.policies(5).

When firewalld does not know how to write special rules, it is of no use.


There is more though. Red Hat promotes System Roles. Another layer of abstraction. It does save us from needing to know (the details of) how to configure firewalld (or its alternatives). We just have to know how to use System Role and it will do the loops. Automating firewall configuration with RHEL System Roles and Chapter 10. Configuring firewalld by using the RHEL system role | Red Hat Product Documentation

Alas, that has the same issue: a system role might not (yet) support all the tricks that the managed component can, so even if firewalld could do something, the system roles might not know how to tell firewalld to do those things. Luckily, one can usually write additional roles to fill those gaps (once one learns how).

Example that opens HTTP and HTTPS ports on hosts (assuming the zone ‘public’ is in use):

- hosts: all
  tasks:
  - ansible.builtin.include_role:
      name: rhel-system-roles.firewall
  vars:
    firewall:
    - zone: public
      service: [http, https]
      state: enabled

Hmm. nftables definitely seems an easy enough approach, once the new syntax is learned. It’s possible this isn’t the most efficient way to lay out the rules, but it seems to work!

Reflection may also be easier; just create a second nat prerouting hook and then dynamically add rules to that.

At least it seems to work in a test setup :slight_smile:

If there was a way to say “this machine” in NAT rules then I wouldn’t even need to do that.

Something like add rule ip nat PREROUTING <this machine> tcp dport 80 dnat to 10.20.30.40:80

But the only way I can see of doing that is to have ip daddr <my.ip.address> and that can change if the ISP decides to change things on me.

Thanks!

A packet has entered your machine. Either it was destined to your public IP address or someone did forward it to you expecting you to route it. You want to block the latter case?

You do know the subnet that ISP uses? How often does that change? What if you dNAT only packets to that subnet, rather than your specific address in it? How much trouble will that cause?


You could have a named set, rather than specific address. Then run some service script that updates the value(s) in the set. (Not sure if one could hook the NM to launch that.)

I’m with Verizon FIOS. IP addresses don’t change frequently but when they do they can be massively different.

Heh, going back 8 years and I see

74.102.78.0/24
98.109.207.0/24
108.35.8.0/24
108.5.17.0/24
141.150.58.0/24
173.54.58.0/24
173.63.168.0/24
173.63.68.0/24
173.70.65.0/24
204.194.141.0/24

(I assume they’re /24s; that’s what dhclient is setting).

So I can’t really assume anything about the external IP address :frowning:

So the use case is that I have a current rule:

add rule ip nat PREROUTING iifname "br-wan" tcp dport 80 counter dnat to 10.10.10.2:80

This works great for people on the internet to access my web server.

The problem is that people in my house (eg people on guestnet) can’t access this because the packet isn’t arriving on the WAN interface but on br-guest (for example). It’s silly that people in my own house can’t access resources everyone else on the internet can :slight_smile:

Now I can’t just add another iifname entry because that would intercept all port 80 traffic (www.google.com would go to my machine!)

With CentOS 7 I created a set of “reflection” rules (named so, because that’s how OpenWRT solved the problem) that basically added the equivalent of ip daddr <my.ip> and had a dhclient.d hook script that flushed/rewrote that table if the IP address changed.

I’ve sorted out a similar script for nftables, but need to work out how to hook that into NM.

#!/bin/sh

new_ip_address=$(ip -4 address show dev br-wan | sed -n 's/ *inet \([0-9.][0-9.]*\)\/.*/\1/p')
chain="reflection"

cmd="flush chain ip nat $chain"

adds=$(nft list chain nat PREROUTING | sed -n 's/.*dport \([0-9][0-9]*\).* to \(.*\):\(.*\)/\1 \2 \3/p' | sort -n | while read e_port ip i_port
do
  echo "add rule ip nat $chain ip saddr 10.0.0.0/8 ip daddr $new_ip_address/32 tcp dport $e_port dnat to $ip:$i_port"
done
)

echo "$cmd
$adds"

I guess the named set would be slightly easier, but would still need to be hooked into NM

But this would be so much simpler if there was a target for “this machine” :slight_smile:

First the nftables sets.

Given a set:

table inet xyz {
    set mememe {
        type ipv4_addr
   }
}

and used in rules (within same table):

ip daddr @mememe something

The man nft says that

nft flush set inet xyz mememe
nft add element inet xyz mememe { 10.20.30.40 }

ought to be the focused change that makes the rule match (only) daddr 10.20.30.40.


When FirewallD creates rules, it puts into root input and forward chains:

		ct state established,related accept # handle 1
		ct status dnat accept # handle 2
		iifname "lo" accept # handle 3
		# zones, handle 4
		ct state invalid drop # handle 5
		reject with icmpx admin-prohibited # handle 6

Rules 1, 3, 5, and 6 allow existing connections, local traffic, and reject by default, just like with iptables.
Rule 2 does interestingly allow what we did dNAT in prerouting; no need to add rule for each port forward.


Another thing that I think I saw somewhere was marks. Might have been with some VPN.
Incoming packet gets first a mark (in mangle?). Later filter rules match on mark, rather than something that dNAT may have masked.

If you can’t give different set of rules for traffic that comes from br-guest – the “zone” concept that FirewallD uses – you can probably at least mark them.


EDIT: How do the clients in br-guest know to contact your “server”? Surely they don’t know the IP address given by FIOS. By name? Name resolved by DNS? Which DNS server gives them the reply? You can’t give them the IP address of that br-guest interface?

I don’t really want to play around with split-horizon DNS. I’m also not sure what would happen to devices that switched networks (eg a phone going from 4G to wifi); would it cache the external value?

But I think, with the use of sets and maps, I can get this down to one rule!

add map nat fport { type inet_service : interval ipv4_addr . inet_service ; flags interval; }
add set ip nat this_host { type ipv4_addr; }

Now we can define port forwards in the map;
eg

add element nat fport { 80 : 10.20.30.40 . 80 }   # http
add element nat fport { 443 : 10.20.30.40 . 443 } # https
add element nat fport { 22 : 10.20.30.41 . 22 }   # ssh
add element nat fport { 6500-6700 : 10.20.30.50 . 6500-6700 } # ISO torrents

And finally

add rule ip nat PREROUTING ip daddr @this_host ip protocol tcp dnat ip  addr . port to tcp dport map @fport

With the right IP address in the @this_host set both external and internal devices get access to the same services!

At least it works in testing :wink:
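
Since the port forwards now live in one map, they could be kept in a plain table and turned into map elements by a few lines of shell. A minimal sketch, assuming the nat table and fport map defined above; fport_elements is a hypothetical helper and the three-column input format is my own invention.

```shell
#!/bin/sh
# Sketch only: turn "ext-port internal-ip internal-port" lines into fport map elements.
# fport_elements is a hypothetical helper that reads stdin and prints nft commands.
fport_elements() {
  while read -r e_port ip i_port rest
  do
    case $e_port in ''|'#'*) continue ;; esac   # skip blank lines and comments
    echo "add element nat fport { $e_port : $ip . $i_port }"
  done
}

fport_elements <<EOF
# ext-port  internal-ip   int-port
80          10.20.30.40   80
443         10.20.30.40   443
6500-6700   10.20.30.50   6500-6700
EOF
```

The printed lines can be reviewed and then fed to nft -f - in one go.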

And a possible NM dispatch script. Place it in /etc/NetworkManager/dispatcher.d/ and it will be called at various times. Important one to catch will be when it’s called with $1 being the interface name (in my case br-wan) and $2 being dhcp4-change. Now the IP address is in $DHCP4_IP_ADDRESS so this can be put in the set.

OK, on reboot the dhcp4-change didn’t have an IP address, but it did on “up”, so the script is very simple:

% cat /etc/NetworkManager/dispatcher.d/nft_host
#!/bin/sh

if [ "$1" = "br-wan" ] && [ -n "$DHCP4_IP_ADDRESS" ]
then
  nft "flush set nat this_host;add element nat this_host { $DHCP4_IP_ADDRESS }"
  echo Put $DHCP4_IP_ADDRESS into set
fi

On reboot I see a nm-dispatch entry in /var/log/messages confirming the IP address was added

% grep nm-dispatch /var/log/messages 
Jun 26 14:54:30 router9 nm-dispatcher[1029]: Put XXXXXX into set

(IP redacted since it’s not important)
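
The dispatcher script above trusts whatever is in $DHCP4_IP_ADDRESS. A slightly more defensive variant could check that the value looks like a single IPv4 address before flushing the set. This is only a sketch: is_ipv4 and set_update are made-up helper names, and the set name matches the one used above.

```shell
#!/bin/sh
# Sketch only: sanity-check the address before it goes into the nft set.
# is_ipv4 and set_update are hypothetical helpers.
is_ipv4() {
  case $1 in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, stray characters, misplaced dots
  esac
  old_ifs=$IFS; IFS=.
  set -- $1                                # split on dots into $1..$n
  IFS=$old_ifs
  [ $# -eq 4 ] || return 1
  for octet in "$@"
  do
    [ ${#octet} -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Print the flush+add pair only when the address passed validation.
set_update() {
  is_ipv4 "$1" || { echo "refusing to load '$1' into this_host" >&2; return 1; }
  echo "flush set nat this_host; add element nat this_host { $1 }"
}

# In the dispatcher script this could be used as:
#   batch=$(set_update "$DHCP4_IP_ADDRESS") && nft "$batch"
```

If validation fails the old set contents are left alone, which seems safer than flushing and ending up with an empty set.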

Oh, hmm. Nope. Once I put it into production, it failed.

The problem is that the traffic isn’t being masqueraded. I can see this specifically for machines on LAN trying to talk to the external interface which directs to another server on the LAN.

What’s happening is

Client sends SYN to external IP address
Router forwards packet to web server
web server sees the original IP address and does a SYNACK to that
client sees SYNACK from unexpected address and does a RST.

eg
10.10.10.10 sends SYN to external.ip, which is forwarded to 10.10.10.20
10.10.10.20 sends SYNACK to 10.10.10.10
10.10.10.10 sends RST to 10.10.10.20

Back to the drawing board! Everything else is working just fine, just not this bit.

EDIT: I realised why my tests weren’t sufficient; I’d been sending the traffic to the real web server which, in my test lab, was on the WAN side of the setup, so it was being masqueraded. Doh.

ALSO: I just put this new machine in to replace the old CentOS 7 machine… and speed tests came in 15% slower. This means the whole plan is on hold while I try to figure something else out!

I’ve learned a lot, but I can’t currently use it. ANNOYED!

Yes. The clients that talk to router from outside and the clients that talk from inside are two different groups, which require different rules. FirewallD would say that they are in different zones.

Both groups should get dNATted and allowed “through” the router, but the packets originating from inside should also receive sNAT on their way out. The point where snat happens, postrouting, no longer remembers what the iif was.

That is where the marks could step in. If you mark and dnat prerouting, then you can snat the marked packets postrouting. (The packets from outside do not need the snat, only the ones that you “reflect”.)

I came across marks a few years back, in the iptables mangle table. We had a hardware device that sat between multiple servers (with private IP addresses) and the outside world. It presented itself as a single IP address externally. It was confusing when it went wrong; an external packet would go through the hardware device to the correct server, but the server was unable to reply.

OK, I think this does it.

Data structures needed:

add map nat fport { type inet_service : interval ipv4_addr . inet_service ; flags interval;}
add set ip nat reflect { type inet_service; }
add set ip nat this_host { type ipv4_addr; }

The fport map can have entries like

add element nat fport { 80 : 10.20.30.40 . 80 }   # http
add element nat fport { 443 : 10.20.30.40 . 443 } # https

The reflect set lists the ports I want to see from the inside

add element nat reflect { 80, 443, 22 }

The this_host set will hold the external IP address and can be set from a NetworkManager dispatcher.d hook

% ls -l /etc/NetworkManager/dispatcher.d/nft_host                 
-rwx------ 1 root root 195 Jun 26 14:53 /etc/NetworkManager/dispatcher.d/nft_host*

% cat /etc/NetworkManager/dispatcher.d/nft_host
#!/bin/sh

if [ "$1" = "br-wan" ] && [ -n "$DHCP4_IP_ADDRESS" ]
then
  nft "flush set nat this_host;add element nat this_host { $DHCP4_IP_ADDRESS }"
  echo "$(date): Put $DHCP4_IP_ADDRESS into set"
fi

And now these rules appear to do the necessary

# Tag internal reflection traffic
add rule ip nat PREROUTING ip saddr 10.0.0.0/8 ip daddr @this_host tcp dport @reflect mark set 100

# DNAT incoming traffic from the internet (or internal traffic sent to the br-wan address)
add rule ip nat PREROUTING ip daddr @this_host ip protocol tcp dnat ip  addr . port to tcp dport map @fport

# Do NAT on egress traffic to the internet
add rule ip nat POSTROUTING oifname "br-wan" counter masquerade

# Also NAT marked traffic from internal to external IP
add rule ip nat POSTROUTING meta mark 100 counter masquerade

Could that use iifname "br-wan" rather than ip daddr @this_host ?
Or is the iifname match too “catch all”?

If I used br-wan then this won’t DNAT the internal traffic aimed at the external IP address and so I would need two rules

add rule ip nat PREROUTING iifname br-wan ip protocol tcp dnat ip  addr . port to tcp dport map @fport
add rule ip nat PREROUTING ip saddr 10.0.0.0/8 ip daddr @this_host ip protocol tcp dnat ip  addr . port to tcp dport map @fport

(or perhaps that second rule could use the mark as the condition).

I had thought about this. It is potentially more resilient; external to internal would still work if this_host isn’t populated. But with the dispatcher.d hook and a cron job that I will also run every 5 minutes (I had a similar one for iptables) the one rule should be fine…
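
For what it's worth, the cron safety net could look roughly like the sketch below. It assumes br-wan and the this_host set from the rules above; first_ipv4 and refresh_this_host are made-up names, as is the script path in the crontab line. The parsing relies on the one-line output format of ip -o.

```shell
#!/bin/sh
# Sketch only: periodic safety net that re-syncs this_host with the br-wan address.
# first_ipv4 is a hypothetical helper; it reads `ip -4 -o address show` output on stdin.
first_ipv4() {
  awk '$3 == "inet" { sub(/\/.*/, "", $4); print $4; exit }'
}

refresh_this_host() {
  addr=$(ip -4 -o address show dev br-wan | first_ipv4)
  [ -n "$addr" ] || return 0          # no address yet: leave the old value alone
  nft "flush set nat this_host; add element nat this_host { $addr }"
}

# Only act when invoked under the (assumed) script name, e.g. from cron:
#   */5 * * * * /usr/local/sbin/refresh_this_host.sh
case $0 in
  *refresh_this_host.sh) refresh_this_host ;;
esac
```

With both the dispatcher hook and this in place, a missed DHCP event would be corrected within five minutes.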