DHCP changed request scope #1269

Closed
tompaah opened this issue Oct 30, 2024 · 8 comments · Fixed by #1318 · May be fixed by #1304

@tompaah

tompaah commented Oct 30, 2024

Summary: A client in a vlan/subnet first requests and gets a dhcp lease from the correct scope, but eventually Gravity says the client "changed request scope" and assigns a dhcp lease from another scope.

Client MAC address: 00:50:56:81:be:50
Correct scope: "container-dev", CIDR 192.168.212.0/23
Incorrect scope: "container", CIDR 192.168.210.0/23

Before the first request, the client has a reservation in "container-dev". The entire chain of events is below.

The Ubuntu 22 DHCP client releases its lease and requests a new one (dhclient -r && dhclient -v). Gravity assigns a lease in the correct scope (yourIPAddr=192.168.212.89).

INF ts=1730316456.3392339 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=18b18d8e-6fff-4b2f-87cb-27ad16257dd7-0x7cb6fc70 deviceIdentifier=00:50:56:81:be:50 opCode=BootRequest hopCount=1 transactionID=0x7cb6fc70 flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier= messageType=DISCOVER client=192.168.212.1:67

INF ts=1730316456.3393786 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=18b18d8e-6fff-4b2f-87cb-27ad16257dd7-0x7cb6fc70 deviceIdentifier=00:50:56:81:be:50 opCode=BootReply hopCount=0 transactionID=0x7cb6fc70 flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=192.168.212.89 serverIPAddr=192.168.210.2 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier= messageType=OFFER

INF ts=1730316456.3406374 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=84fe3508-1d43-4619-9abe-d79eda1f7ceb-0x7cb6fc70 deviceIdentifier=00:50:56:81:be:50 opCode=BootRequest hopCount=1 transactionID=0x7cb6fc70 flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier= messageType=REQUEST client=192.168.212.1:67

INF ts=1730316456.35251 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=84fe3508-1d43-4619-9abe-d79eda1f7ceb-0x7cb6fc70 deviceIdentifier=00:50:56:81:be:50 opCode=BootReply hopCount=0 transactionID=0x7cb6fc70 flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=192.168.212.89 serverIPAddr=192.168.210.2 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier= messageType=ACK

After a few minutes, during which I did absolutely nothing on the client or in Gravity, this is logged (it looks like a lease renewal). The client still gets the same lease in the correct scope.

INF ts=1730317044.966854 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=e87109a9-0cd8-4cf3-8026-6add791a2130-0x3a2e2f6f deviceIdentifier=00:50:56:81:be:50 opCode=BootRequest hopCount=1 transactionID=0x3a2e2f6f flagsToString=Unicast clientIPAddr=192.168.212.84 yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier=ff9f6e852400020000ab11a4750df60700c351 messageType=REQUEST client=192.168.212.1:67

INF ts=1730317044.9784868 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=e87109a9-0cd8-4cf3-8026-6add791a2130-0x3a2e2f6f deviceIdentifier=00:50:56:81:be:50 opCode=BootReply hopCount=0 transactionID=0x3a2e2f6f flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=192.168.212.89 serverIPAddr=192.168.210.2 gatewayIPAddr=192.168.212.1 hostname=k8stestworker01 clientIdentifier=ff9f6e852400020000ab11a4750df60700c351 messageType=ACK

A couple of minutes later, the interesting part happens. I am still doing nothing on the client or in Gravity. The client appears to renew the lease, and this time Gravity switches to the incorrect scope and leases out yourIPAddr=192.168.210.86.

INF ts=1730317182.8058558 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=50656f99-aa69-4258-a93c-41037e6e3f07-0xd00c552b deviceIdentifier=00:50:56:81:be:50 opCode=BootRequest hopCount=0 transactionID=0xd00c552b flagsToString=Unicast clientIPAddr=192.168.212.89 yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 gatewayIPAddr=0.0.0.0 hostname=k8stestworker01 clientIdentifier= messageType=REQUEST client=192.168.212.89:68

INF ts=1730317182.8063526 logger=role.dhcp msg=Re-assigning address for lease due to changed request scope instance=gravity-ipm01 version=0.13.3-c083d3e3 identifier=00:50:56:81:be:50 newIP=192.168.210.86

INF ts=1730317182.8250089 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.13.3-c083d3e3 request=50656f99-aa69-4258-a93c-41037e6e3f07-0xd00c552b deviceIdentifier=00:50:56:81:be:50 opCode=BootReply hopCount=0 transactionID=0xd00c552b flagsToString=Unicast clientIPAddr=0.0.0.0 yourIPAddr=192.168.210.86 serverIPAddr=192.168.210.2 gatewayIPAddr=0.0.0.0 hostname=k8stestworker01 clientIdentifier= messageType=ACK

After refreshing the Scope views in Gravity, the reservation for 00:50:56:81:be:50 has moved from the correct scope "container-dev" to the incorrect one "container". Interesting to see that all of this happens automatically; I'm just watching the show.
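
The scope move can also be spotted directly in the logs. A minimal sketch, assuming the log lines are whitespace-separated key=value pairs as in the excerpts above and using the two scopes from this report. The helper names (parse_fields, scope_of, scope_changes) are made up for illustration and are not part of Gravity:

```python
import ipaddress

# Scopes as described in this report.
SCOPES = {
    "container-dev": ipaddress.ip_network("192.168.212.0/23"),
    "container": ipaddress.ip_network("192.168.210.0/23"),
}


def parse_fields(line):
    """Collect key=value tokens from a log line; words without '=' are skipped."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def scope_of(ip):
    """Return the name of the scope containing ip, or None if no scope matches."""
    addr = ipaddress.ip_address(ip)
    for name, net in SCOPES.items():
        if addr in net:
            return name
    return None


def scope_changes(lines):
    """Report (device, old_scope, new_scope) whenever an ACK moves a device to another scope."""
    last = {}
    changes = []
    for line in lines:
        fields = parse_fields(line)
        if fields.get("messageType") != "ACK":
            continue
        device = fields.get("deviceIdentifier")
        scope = scope_of(fields.get("yourIPAddr", "0.0.0.0"))
        if device in last and last[device] != scope:
            changes.append((device, last[device], scope))
        last[device] = scope
    return changes
```

Feeding it the ACK lines above would flag 00:50:56:81:be:50 as moving from container-dev to container.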

@BeryJu
Owner

BeryJu commented Oct 30, 2024

I've seen this happen before but haven't been able to reproduce it. Does it happen reliably for you? Could you set the log level to debug?

Could you also run gravity cli export and either paste the result here or send it to [email protected]? (You can filter the output by keys that have the prefix /gravity/dhcp; there's nothing sensitive in there aside from MAC addresses and the DNS domain.)

@tompaah
Author

tompaah commented Oct 31, 2024

Sent some logs with LOG_LEVEL=debug to your e-mail. Right now it's operating (fresh install) in single-node mode.

The DHCP Scope TTL is deliberately set ridiculously low so that I don't have to wait all day for this to happen.

@tompaah
Author

tompaah commented Oct 31, 2024

I realise you probably meant that DEBUG=true should be set, so I recreated the situation and sent new logs.

@BeryJu
Owner

BeryJu commented Nov 2, 2024

@tompaah I believe #1269 should fix the issue, however I'm building a better e2e testing suite for this right now. Could you send me a tcpdump running on the gravity server when the issue happens? Just tcpdump -i any udp port 67 -w dump.pcap and send me that pcap file?

@tompaah
Author

tompaah commented Nov 4, 2024

Saw that v0.14.0 was out so upgraded to it.

Sent pcap.dump from the process of

  1. getting an IP from a Reservation in the correct Scope.
  2. renewing after 300 seconds, getting an IP address from the incorrect Scope.

Thanks for investigating the issue.

@tompaah
Author

tompaah commented Nov 15, 2024

Been experimenting a bit and I see a pattern.

If the client has no IP address yet, the DHCP request is sent as a broadcast, which a DHCP forwarder in that subnet forwards to Gravity.
Note
clientIPAddr=0.0.0.0 gatewayIPAddr=192.168.212.1 client=192.168.212.1:67

INF ts=1731660104.3620853 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.14.0-ac87319b request=198dd488-50e2-48db-937d-90c7241ef95c-0x3e51db8a deviceIdentifier=00:50:56:81:35:d4 opCode=BootRequest hopCount=1 transactionID=0x3e51db8a flagsToString=Unicast **clientIPAddr=0.0.0.0** yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 **gatewayIPAddr=192.168.212.1** hostname=ipmtest01 clientIdentifier=ff9f6e852400020000ab11f4d8d732b055a24a messageType=REQUEST client=192.168.212.1:67

INF ts=1731660104.36852 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.14.0-ac87319b request=198dd488-50e2-48db-937d-90c7241ef95c-0x3e51db8a deviceIdentifier=00:50:56:81:35:d4 opCode=BootReply hopCount=0 transactionID=0x3e51db8a flagsToString=Unicast **clientIPAddr=0.0.0.0** yourIPAddr=192.168.212.99 serverIPAddr=192.168.210.2 **gatewayIPAddr=192.168.212.1** hostname=ipmtest01 clientIdentifier=ff9f6e852400020000ab11f4d8d732b055a24a messageType=ACK

But if the client already has an address and is just trying to renew the lease, the request is sent directly to Gravity and not via broadcast/DHCP forwarder. This is where Gravity gives out an address from the wrong scope, as if it were unable to determine which subnet the client is on.
Note
clientIPAddr=192.168.212.99 gatewayIPAddr=0.0.0.0 client=192.168.212.99:68

INF ts=1731660404.249668 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.14.0-ac87319b request=4b759bbc-7765-491e-80c2-6c016759f5a6-0x3e51db8a deviceIdentifier=00:50:56:81:35:d4 opCode=BootRequest hopCount=0 transactionID=0x3e51db8a flagsToString=Unicast **clientIPAddr=192.168.212.99** yourIPAddr=0.0.0.0 serverIPAddr=0.0.0.0 **gatewayIPAddr=0.0.0.0** hostname=ipmtest01 clientIdentifier=ff9f6e852400020000ab11f4d8d732b055a24a messageType=REQUEST client=192.168.212.99:68

INF ts=1731660404.2497654 logger=role.dhcp msg=Re-assigning address for lease due to changed request scope instance=gravity-ipm01 version=0.14.0-ac87319b identifier=00:50:56:81:35:d4 newIP=192.168.210.60

INF ts=1731660404.260131 logger=role.dhcp msg=DHCP packet instance=gravity-ipm01 version=0.14.0-ac87319b request=4b759bbc-7765-491e-80c2-6c016759f5a6-0x3e51db8a deviceIdentifier=00:50:56:81:35:d4 opCode=BootReply hopCount=0 transactionID=0x3e51db8a flagsToString=Unicast **clientIPAddr=0.0.0.0** yourIPAddr=192.168.210.60 serverIPAddr=192.168.210.2 **gatewayIPAddr=0.0.0.0** hostname=ipmtest01 clientIdentifier=ff9f6e852400020000ab11f4d8d732b055a24a messageType=ACK

Judging from the logs, gatewayIPAddr is what decides which subnet an address is assigned from. When it is set to the DHCP forwarder's address, the correct scope is used; when it is 0.0.0.0, the address comes from the incorrect scope.
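
To illustrate the observation, here is a rough sketch of a scope lookup that does not depend on gatewayIPAddr alone. This is not Gravity's code and not necessarily how any fix works; select_scope and its parameters are hypothetical. It tries the relay address first, then the client's own address (clientIPAddr, which a renewing client fills in), then the source address of the packet:

```python
import ipaddress

# Scopes as described in this report.
SCOPES = {
    "container-dev": ipaddress.ip_network("192.168.212.0/23"),
    "container": ipaddress.ip_network("192.168.210.0/23"),
}

UNSPECIFIED = ipaddress.ip_address("0.0.0.0")


def select_scope(scopes, giaddr, ciaddr, peer_ip):
    """Pick a scope by trying giaddr, then ciaddr, then the packet source address.

    Unset (0.0.0.0) candidates are skipped. A candidate that matches no scope
    also falls through to the next one. Returns None if nothing matches.
    """
    for candidate in (giaddr, ciaddr, peer_ip):
        addr = ipaddress.ip_address(candidate)
        if addr == UNSPECIFIED:
            continue
        for name, net in scopes.items():
            if addr in net:
                return name
    return None


# Relayed request: the forwarder's address identifies the subnet.
relayed = select_scope(SCOPES, "192.168.212.1", "0.0.0.0", "192.168.212.1")

# Direct renewal: giaddr is unset, so the client's current address is used.
renewal = select_scope(SCOPES, "0.0.0.0", "192.168.212.99", "192.168.212.99")
```

With the values from the renewal above (gatewayIPAddr=0.0.0.0, clientIPAddr=192.168.212.99), this lookup lands in container-dev in both cases.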

@BeryJu
Owner

BeryJu commented Nov 25, 2024

@tompaah thanks for bearing with me, this should be fixed in 0.17 (disregard the whole big refactor I started in #1304, turns out this was a pretty easy fix 🙃)

@tompaah
Author

tompaah commented Nov 26, 2024

Nice, thanks and no worries. I can confirm that 0.17.1 gives out correct addresses to clients in different subnets.
