[Windows] Make internode pod communication work in the Windows environment #9295

Open
Tracked by #9313
caroline-suse-rancher opened this issue Jan 25, 2024 · 5 comments

Comments

@caroline-suse-rancher
Contributor

caroline-suse-rancher commented Jan 25, 2024

As a follow-on from the initial work done in this issue, we've identified that internode pod communication may not be working for k3s on Windows. Please investigate this and provide a fix.

The expectation is that the solution honors what a CNI plugin should fulfill in a k8s cluster:
1 - Pods on different nodes can communicate with each other
2 - Node-pod communication works regardless of where the pod is running

At this initial step, only VXLAN encapsulation is required to work.

@ValeriiVozniuk

Hi, could this be the reason that a pod on Windows is unable to reach CoreDNS and cannot resolve either internal or external DNS names?

@ValeriiVozniuk

I see that Linux nodes are annotated with
flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"aa:f6:23:e5:46:7a"}
and Windows nodes with
flannel.alpha.coreos.com/backend-data: {"VNI":4096,"VtepMAC":"00:15:5d:d5:db:a5"}
but I don't see a way to change the VNI in k3s:
https://docs.k3s.io/networking/basic-network-options
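
The mismatch between the two annotations can be checked mechanically. This is a minimal sketch, assuming the annotation values have been copied out by hand (for example from the node YAML); the node names and the helpers `vni_by_node` and `vnis_match` are hypothetical and not part of k3s or flannel.

```python
import json

# Annotation values copied from the comment above.
# The node names are made up for illustration.
BACKEND_DATA = {
    "linux-node": '{"VNI":1,"VtepMAC":"aa:f6:23:e5:46:7a"}',
    "windows-node": '{"VNI":4096,"VtepMAC":"00:15:5d:d5:db:a5"}',
}


def vni_by_node(backend_data):
    """Parse each node's flannel backend-data annotation and return its VNI."""
    return {node: json.loads(raw)["VNI"] for node, raw in backend_data.items()}


def vnis_match(backend_data):
    """Return True when every node advertises the same VXLAN network identifier."""
    return len(set(vni_by_node(backend_data).values())) == 1


if __name__ == "__main__":
    print(vni_by_node(BACKEND_DATA))
    print("VNIs match:", vnis_match(BACKEND_DATA))
```

With the two values above, the check reports a mismatch, which is consistent with pods on the Windows node not being reachable over the overlay.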

@brandond
Member

brandond commented Feb 11, 2025

Yeah, I think the VNI mismatch is currently a blocker for mixed Linux/Windows nodes. I think Windows only supports 4096?

cc @manuelbuil

Ref: https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/guides/flannel.md

> Note: The VNI must be set to 4096 and port 4789 for Flannel on Linux to interoperate with Flannel on Windows.
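
For reference, the requirement in that note maps onto flannel's `net-conf.json` roughly as follows. This is a sketch rather than the k3s implementation: the helper `flannel_net_conf` is hypothetical, and the network CIDR is an assumption (10.42.0.0/16 is the k3s default cluster CIDR; substitute your own).

```python
import json


def flannel_net_conf(network_cidr, vni=4096, port=4789):
    """Build a flannel net-conf.json body for the vxlan backend.

    The defaults follow the sig-windows-tools note quoted above:
    VNI 4096 and UDP port 4789.
    """
    return {
        "Network": network_cidr,
        "Backend": {"Type": "vxlan", "VNI": vni, "Port": port},
    }


if __name__ == "__main__":
    # Assumption: the cluster uses the k3s default pod CIDR.
    print(json.dumps(flannel_net_conf("10.42.0.0/16"), indent=2))
```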

@brandond added this to the 2025-03 Release Cycle milestone on Feb 11, 2025
@ValeriiVozniuk

> Yeah, I think the VNI mismatch is currently a blocker for mixed Linux/Windows nodes. I think Windows only supports 4096?

That's right, and it also wants the name "vxlan0" set for flannel (see https://github.com/microsoft/SDN/tree/master/Kubernetes/flannel/overlay). I ended up writing "override" configs to the /var/lib/rancher/k3s/agent/etc/flannel and /var/lib/rancher/k3s/agent/etc/cni/net.d/ folders, and pointing k3s at them via the --flannel-conf and --flannel-cni-conf parameters.
Also, network policies must be disabled on Linux nodes.
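
A rough sketch of the CNI side of such an override, assuming the name in question is the network name in the CNI conflist that `--flannel-cni-conf` points at. The helper `with_network_name` is hypothetical, and the example conflist (including its plugin list and `cniVersion`) is illustrative only, not a copy of what k3s generates.

```python
import copy
import json


def with_network_name(conflist, name="vxlan0"):
    """Return a copy of a CNI conflist with its network name replaced."""
    updated = copy.deepcopy(conflist)
    updated["name"] = name
    return updated


# Illustrative Linux-side conflist. The contents are an assumption;
# start from the file your own cluster actually uses.
EXAMPLE_CONFLIST = {
    "name": "cbr0",
    "cniVersion": "1.0.0",
    "plugins": [
        {"type": "flannel", "delegate": {"hairpinMode": True, "isDefaultGateway": True}},
        {"type": "portmap", "capabilities": {"portMappings": True}},
    ],
}

if __name__ == "__main__":
    print(json.dumps(with_network_name(EXAMPLE_CONFLIST), indent=2))
```

Only the name changes; the input is left untouched, so the original file contents can be kept for comparison.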

@brandond
Member

brandond commented Feb 12, 2025

Ref: https://learn.microsoft.com/en-us/windows-server/networking/sdn/technologies/hyper-v-network-virtualization/hyperv-network-virtualization-technical-details-windows-server

> Each virtual subnet belongs to a single virtual network (RDID), and it is assigned a unique Virtual Subnet ID (VSID) using either the TNI or VNI key in the encapsulated packet header. The VSID must be unique within the datacenter and is in the range 4096 to 2^24-2.

I guess everything below 4096 is assumed to be a traditional 802.1q VLAN tag?
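
The range quoted above can be expressed as a small check. This is only a sketch of the arithmetic; `is_valid_windows_vni` is a hypothetical helper, not an existing k3s or flannel function.

```python
# Bounds taken from the Microsoft documentation quoted above.
HNV_VSID_MIN = 4096
HNV_VSID_MAX = 2**24 - 2  # 16777214


def is_valid_windows_vni(vni):
    """Return True if vni falls inside the VSID range Windows accepts."""
    return HNV_VSID_MIN <= vni <= HNV_VSID_MAX


if __name__ == "__main__":
    for candidate in (1, 4096, HNV_VSID_MAX, HNV_VSID_MAX + 1):
        print(candidate, is_valid_windows_vni(candidate))
```

The VNI of 1 seen in the Linux node annotation falls outside this range, while 4096 is the lowest value inside it.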

The easiest way to handle this would probably be to wire up a new flannel option: --flannel-opt=vni=X, but I guess we decided against adding that flag.

Status: Accepted