lwAFTR ARP and Encapsulation limit #1441

Open
Jaspils opened this issue Sep 11, 2019 · 11 comments

@Jaspils

Jaspils commented Sep 11, 2019

I'm trying to get a demo of an lwAFTR running. So far I seem to have it working at a basic level. I have an lwB4 manually configured on an Ubuntu system, with iptables for NAT and an ipip6 tunnel to send the traffic through. But along the way I ran into two problems:

ARP for shared IPs.
The lwAFTR runs with a configuration containing softwires for a few lwB4s. In my demo I have a single IPv4 address shared between a few lwB4 clients, but this should work the same for multiple IPv4 addresses. The problem is that the lwAFTR doesn't publish the shared IP through ARP, so the gateway that the lwAFTR is connected to doesn't know where to send returning packets.

With a manual ARP entry in my gateway, I got the system working. But on large scale, that doesn't sound ideal. Is there a way to set the lwAFTR to publish the shared IPs that it 'manages'?
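For illustration, a minimal sketch of what such manual entries could look like, assuming the gateway is a Linux box with iproute2. The MAC address, interface name, and IP list are placeholders, not values from this setup. The script only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# All values below are placeholders (assumptions), not taken from the thread.
LWAFTR_MAC=02:00:00:00:00:01       # MAC of the lwAFTR's IPv4-facing port
GW_DEV=eth0                        # gateway interface facing the lwAFTR
SHARED_IPS="192.0.2.1 192.0.2.2"   # IPv4 addresses shared by the softwires

# Print one permanent neighbour (ARP) entry per shared address.
static_arp_cmds() {
    for ip in $SHARED_IPS; do
        echo "ip neigh replace $ip lladdr $LWAFTR_MAC dev $GW_DEV nud permanent"
    done
}

static_arp_cmds
```

One entry is needed per shared address, which is why this approach scales poorly.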

Encapsulation limit
By default some systems seem to set up their tunnels with an encaplimit; in tcpdump this shows up as the "DSTOPT" flag on packets. It seems that the lwAFTR drops packets with this flag. Is there a way for the lwAFTR to accept these packets with an encaplimit, or should all tunnels be configured without one?

lwB4
Another question that I'm not sure where to ask: are there any open-source lwB4 functions available to deploy or test my system with? I can only find a few mentions of OpenWRT supporting it, but that works terribly in my experience. I can't get it to work properly, since it doesn't follow its own configuration. The only working lwB4 I have so far, as mentioned, is a manually configured Ubuntu client.

@Jaspils
Author

Jaspils commented Sep 16, 2019

Additional info:
Regarding the encapsulation limit, here is an example of an OpenWRT system (left) functioning as lwB4, which by default adds the Destination Options header (DSTOPT) with an encapsulation limit, making the lwAFTR drop the packet.

An Ubuntu client with manually added tunnel (right) had the same issue at first, but when specifying 'encaplimit none' at tunnel creation, the Destination Option Header (DSTOPT) isn't passed, and everything works.
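A minimal sketch of such a tunnel definition, assuming iproute2 on the client. The tunnel name and both addresses are placeholders from the documentation prefix, not values from this setup. The command is only printed here; it needs root to actually run.

```shell
#!/bin/sh
# Placeholder addresses (assumptions, not from the setup above).
B4V6ADDR=2001:db8::1
AFTRV6ADDR=2001:db8:2::2

# 'encaplimit none' tells the kernel not to add the Destination Options
# header that carries the Tunnel Encapsulation Limit option.
TUNNEL_CMD="ip -6 tunnel add lw4o6-tun mode ipip6 remote $AFTRV6ADDR local $B4V6ADDR encaplimit none"

# Print for review; run it as root once the addresses are correct.
echo "$TUNNEL_CMD"
```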

[Image: DSTOPT-fail]

I'm not entirely sure whether this is an issue that needs to be resolved on the lwAFTR side or on the lwB4 side.

@ameen-mcmxc

Hello,

May I ask you something about your manually configured lwB4?
Which iptables rule have you used to implement NAT on the incoming interface, so that the source IP is changed before the encapsulation process starts?
As far as I know, iptables doesn't support SNAT in "PREROUTING".

Thanks
Ameen

@iffy50

iffy50 commented Jul 4, 2022

Hi,

The lwB4 performs NAT on outbound traffic in the POSTROUTING chain (i.e., all traffic going out via the 4 in 6 tunnel interface gets NAT'd). Here's a set of rules that I've used for configuring a lwB4:

```
B4V6ADDR=2001:db8::1/64
B4V4ADDR=10.0.0.15
B4INT=eth1
AFTRV6ADDR=2001:db8:2::2
V6NEXTHOP=2001:db8::2

ip link set $B4INT up
ip -6 addr add $B4V6ADDR dev $B4INT
ip -6 tunnel add lw4o6-tun remote $AFTRV6ADDR local $B4V6ADDR mode ipip6 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000
ip link set dev lw4o6-tun mtu 1400 up
ip addr add $B4V4ADDR/32 dev lw4o6-tun nodad
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR
ip route add 192.0.2.0/24 dev lw4o6-tun proto static
ip -6 route add 2001:db8:2::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR
```

@ameen-mcmxc

Hi
Many thanks for your reply.

Please have a look at my topology.

[Image: topology diagram]

I have adjusted your configuration, but I think I am missing something here, because it is not really working.

Below is my config: -

B4V6ADDR=2001:db8:0:1::2/64
B4V6ADDR2=2001:db8:0:1::2 # This is just same IP but without prefix
B4V4ADDR=192.168.1.10
B4INT=ens35
AFTRV6ADDR=2001:db8:1::1
V6NEXTHOP=2001:db8::1

ip link set $B4INT up
ip -6 addr add $B4V6ADDR dev $B4INT
ip -6 tunnel add lw4o6-tun remote $AFTRV6ADDR local $B4V6ADDR mode ipip6 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000
ip link set dev lw4o6-tun mtu 1400 up
ip addr add $B4V4ADDR/32 dev lw4o6-tun nodad
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR
ip route add 198.51.100.0/24 dev lw4o6-tun proto static
ip -6 route add 2001:db8:0::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2

Note: I am using a test environment, so I am not using the standard IP of 192.0.2.1.

All machines are centos-7 built using VMware workstation.

Regards
Ameen

@iffy50

iffy50 commented Jul 8, 2022

It looks to me like the AFTRV6 addressing is wrong. The AFTR needs to be configured with a v6 address for its interface, which is on the same prefix as the B4V6ADDR, so 2001:db8:1::1 is OK for this, but it should be used as the V6NEXTHOP, not the AFTRV6ADDR. 2001:db8:1::1 is used as the internal address within the AFTR's config.

When the AFTR is running, 2001:db8:1::1 should be pingable from the B4.

The AFTRV6ADDR is really a virtual address - it is the v6 tunnel endpoint address and the dst. for 4in6 packets from the B4, but it is not attached to any interface.

The address for this needs to be taken from a prefix which is not 2001:db8:0:1::/64 so it is routed through the AFTR using the B4's default route (V6NEXTHOP), so e.g., 2001:db8:0:2::2/64 would work.

So for the AFTR config you'd have:

      internal-interface
        ip 2001:db8:1::1
        mac 22:22:22:22:22:22
        next-hop
          ip 2001:db8:1::2

And a softwire config:

      ipv4 192.168.1.10
      b4-ipv6 2001:db8:0:1::2
      br-address 2001:db8:0:2::2

@ameen-mcmxc

I am a bit confused here :)

let me break down a bit what I have understood now: -

LwB4 machine internal hardcoded ens35 conf: -
IP: 2001:db8:0:2::2
The next-hop (2001:db8:1::1) for the tunnel should be configured through the script.

LwAFTR machine internal conf for ens34: -
IP 2001:db8:1::1
next hop 2001:db8:1::2

so the script on the LwB4 side will be like this: -

B4V6ADDR=2001:db8:0:1::2/64
B4V6ADDR2=2001:db8:0:1::2 # This is just same IP but without prefix
B4V4ADDR=198.51.100.10
B4INT=ens35
AFTRV6ADDR=2001:db8:0:1::1
V6NEXTHOP=2001:db8:1::1 # Same IP as the AFTR internal interface IP

ip link set $B4INT up
ip -6 addr add $B4V6ADDR dev $B4INT
ip -6 tunnel add lw4o6-tun remote $AFTRV6ADDR local $B4V6ADDR mode ipip6 encaplimit 4 hoplimit $
ip link set dev lw4o6-tun mtu 1400 up
ip addr add $B4V4ADDR/32 dev lw4o6-tun nodad
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR
ip route add 198.51.100.0/24 dev lw4o6-tun proto static
ip -6 route add 2001:db8:0::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2

One more question: the lwB4 is supposed to have a "port set" assignment method.
I have been reading about implementing it here: -
https://blogs.igalia.com/dpino/2018/02/15/the-b4-network-function/

I couldn't adjust it to work with my script, any idea?

@iffy50

iffy50 commented Jul 11, 2022

There are two problems with the above addressing:

  1. The v4 address for the B4 needs to be in a different IPv4 subnet from the IPv4 server.
  2. The AFTRV6ADDR and the V6NEXTHOP have to be taken from different IPv6 prefixes.

B4V6ADDR=2001:db8:0:1::2/64
B4V6ADDR2=2001:db8:0:1::2 # This is just same IP but without prefix
B4V4ADDR=192.168.1.1
B4INT=ens35
AFTRV6ADDR=2001:db8:2::2 
V6NEXTHOP=2001:db8:0:1::1

So this would give the following config:

ip link set $B4INT up
ip -6 addr add $B4V6ADDR dev $B4INT
ip -6 tunnel add lw4o6-tun remote $AFTRV6ADDR local $B4V6ADDR2 mode ipip6 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000
ip link set dev lw4o6-tun mtu 1400 up
ip addr add $B4V4ADDR/32 dev lw4o6-tun nodad
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR
ip route add 198.51.100.0/24 dev lw4o6-tun proto static
ip -6 route add 2001:db8:0:2::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2

The relevant AFTR config for the above client

softwire {
  ipv4 192.168.1.1;
  psid 0;
  b4-ipv6 2001:db8:0:1::2;
  br-address 2001:db8:0:2::2;
  port-set {
    psid-length 0;
  }
}

The IPv4 Server will also need to have an IPv4 route that sends traffic with a dst. of $B4V4ADDR and 10.0.0.0/24 via 198.51.100.1
[Image: annotated topology diagram]

In the above example, psid is 0 and psid-length is 0, which gives the client a 'full' IP address (64k ports/no address sharing). psid-length is the number of bits of the 16-bit L4 port that select the port set (valid values are 0-15); the remaining 16 - psid-length bits number the ports inside each set. E.g. psid-length 10 leaves 6 bits = 64 ports per client, psid-length 9 leaves 7 bits = 128 ports, etc.

The total port space is 16 bits = 65536 available ports. 65536/64 ports (psid-length 10) = 1024 possible clients sharing a single IPv4 address.

The psid value selects which block of 64 ports is used, e.g. (for the psid-length 10 example above) psid 0 = 0-63, psid 1 = 64-127, etc.

One consideration for address sharing is that the well-known ports (0-1023) are generally not assigned, as they can conflict with other services running on the device. Starting at psid 16 (for psid-length 10) would avoid this.
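The arithmetic can be sketched as a small shell function. This is an illustration only: it takes psid-length as the number of PSID bits, as in RFC 7597, and assumes a contiguous port set with no offset or reserved bits. The function name `psid_port_range` is made up for this example.

```shell
#!/bin/sh
# psid_port_range PSID PSID_LENGTH
# Prints "first-last" for the port set of PSID, assuming the PSID occupies
# the top PSID_LENGTH bits of the 16-bit port (no offset, no reserved bits).
psid_port_range() {
    psid=$1
    psid_length=$2
    ports_per_set=$(( 65536 >> psid_length ))   # 2^(16 - psid_length)
    first=$(( psid * ports_per_set ))
    last=$(( first + ports_per_set - 1 ))
    echo "$first-$last"
}

psid_port_range 0 0      # full address: 0-65535
psid_port_range 1 10     # 64 ports per set: 64-127
psid_port_range 16 10    # first set above the well-known ports: 1024-1087
```

With a non-zero offset the port set is no longer one contiguous block, and this shortcut does not apply.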

@ameen-mcmxc

Hello,
Many thanks for your cooperation 😊
I have changed the script on lwB4 side as you recommended: -

B4V6ADDR=2001:db8:0:1::2/64
B4V6ADDR2=2001:db8:0:1::2 # This is just same IP but without prefix
B4V4ADDR=192.168.1.1
B4INT=ens35
AFTRV6ADDR=2001:db8:2::2
V6NEXTHOP=2001:db8:0:1::1

ip link set $B4INT up
ip -6 addr add $B4V6ADDR dev $B4INT
ip -6 tunnel add lw4o6-tun remote $AFTRV6ADDR local $B4V6ADDR mode ipip6 encaplimit 4 hoplimit 64 tclass 0x00 flowlabel 0x00000
ip link set dev lw4o6-tun mtu 1400 up
ip addr add $B4V4ADDR/32 dev lw4o6-tun nodad
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR
ip route add 198.51.100.0/24 dev lw4o6-tun proto static
ip -6 route add 2001:db8:0:2::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2

The thing is, I see now that you added red-coloured IPs on the topology.
As my topology shows, the IPs in black colour: -
On lwB4 ens35, I configured 2001:db8:0:1::2/64
On lwAFTR ens34, I configured 2001:db8:0:1::1/64
What shall I do with the red IPs that you added?
Especially you drew on LwB4 machine: “v6 2001:db8:1::2/64”
I haven’t used this IP anywhere.
Perhaps I need to add it only when I want to configure the other end of the tunnel at lwAFTR?
Are you sure about this line below?
ip -6 route add 2001:db8:0:2::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2
Or it was meant to be “2001:db8:2::/64” ??

How to configure psid for lwB4 machine??
On which machine? lwB4 only?
As I understood, psid recommendation: psid 16? psid length = 6?
Like I mentioned earlier, I am just testing LwB4 now.
Do I need to implement a tunnel on LwAFTR side as well, so that this test tunnel functions??

@ameen-mcmxc

FYI, this is the tcpdump result on the lwB4 and lwAFTR when I pinged the IPv4 server from the IPv4 client.

[root@B4_lw ~]# tcpdump -n -i ens35
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens35, link-type EN10MB (Ethernet), capture size 262144 bytes
21:34:16.669971 IP6 2001:db8:0:1::2 > 2001:db8:2::2: IP 192.168.1.1 > 198.51.100.2: ICMP echo request, id 8954, seq 1, length 64
21:34:16.671045 IP6 2001:db8:0:1::1 > 2001:db8:0:1::2: ICMP6, destination unreachable, unreachable route 2001:db8:2::2, length 132

[root@AFTR ~]# tcpdump -n -i ens34
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens34, link-type EN10MB (Ethernet), capture size 262144 bytes
21:37:30.630111 IP6 2001:db8:0:1::2 > 2001:db8:2::2: IP 192.168.1.1 > 198.51.100.2: ICMP echo request, id 8979, seq 1, length 64
21:37:30.630260 IP6 2001:db8:0:1::1 > 2001:db8:0:1::2: ICMP6, destination unreachable, unreachable route 2001:db8:2::2, length 132

Does this mean that LwB4 is correctly configured and ready?
So it encapsulates, translates and sends 4in6 traffic to LwAFTR, which should do its own duties as well.

@iffy50

iffy50 commented Jul 12, 2022

Yes, you're correct about the v6 route - it should be:
ip -6 route add 2001:db8:2::/64 via $V6NEXTHOP dev $B4INT src $B4V6ADDR2

For getting the AFTR running, what hardware (or VM) are you running the Snabb AFTR on? From a previous test setup that I helped with, getting things running on a VM required creating a bridge with the 'physical' interface and a virtio interface to attach the snabb process to.

Regarding configuring the PSID on the lwB4, there is a source port command that can be added to the NAT rule - I'm pretty rusty on all of this stuff as I've never really used it. It's something like this (from memory):
iptables -A POSTROUTING -t nat -o lw4o6-tun -j SNAT --to-source $B4V4ADDR:1024-2047

You will need to manually calculate the port range so that it matches the one specified in the corresponding rule on the AFTR. If there is no 'offset' value being set, then this should be a single rule with one block of ports. For other values of offset, things get more complicated. It's defined in RFC 7597, Section 5.1, with examples in the appendix.
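A hedged sketch of generating such rules for a single contiguous port block. One detail to be aware of: iptables only accepts a port range in `--to-source` when the rule also matches a protocol such as tcp or udp, so the sketch emits one rule per protocol. The function name `snat_rules` and the example values are assumptions for illustration; the rules are printed rather than applied.

```shell
#!/bin/sh
# snat_rules ADDR FIRST_PORT LAST_PORT [TUNNEL_DEV]
# Prints SNAT rules that restrict source ports to FIRST_PORT-LAST_PORT.
# TUNNEL_DEV defaults to the tunnel name used earlier in this thread.
snat_rules() {
    addr=$1
    first=$2
    last=$3
    dev=${4:-lw4o6-tun}
    for proto in tcp udp; do
        echo "iptables -t nat -A POSTROUTING -o $dev -p $proto -j SNAT --to-source $addr:$first-$last"
    done
}

# Example values only; the range must match the softwire's port set on the AFTR.
snat_rules 192.168.1.1 1024 2047
```

ICMP is not covered by these two rules and would need separate handling.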

@eugeneia
Member

eugeneia commented Sep 19, 2022

We started with the setup described in the tutorial in #1483

This uses lwAFTR’s capability to use Linux interfaces (apps.socket.raw under the hood), which has been shown to work with Linux veth pairs at least. It’s probably very slow and doesn’t support RSS, but I figured it should be the easiest for evaluating functionality.

We’re now trying to adapt this setup to work with regular Linux interfaces in a VM. I am a bit worried about Linux intercepting ARP/ND/ICMP, but we’ll have to see.
