Does this still work? #35
Comments
To get my one working, I ended up making the changes as per nelg@e4a0b33 |
Hi nelg, if I just want to set up an Amazon Linux 2 NAT instance, without using IaC to provision all the other infrastructure, which commands should I run to get a working Amazon Linux 2 NAT? Thanks in advance. |
Here is the PR #37 |
This issue and your fix solved 5+ hours of debugging work for me. Thank you and I hope it gets merged soon. |
It seems NAT connection is lost after the NAT instance is rebooted.
I noticed the route table is broken after reboot as follows:
## When an instance is created
ssm-user@ip-172-18-138-43 bin]$ ip ro
default via 172.18.128.1 dev eth1 metric 10001
169.254.169.254 dev eth0
172.18.128.0/20 dev eth0 proto kernel scope link src 172.18.138.43
172.18.128.0/20 dev eth1 proto kernel scope link src 172.18.132.145
ssm-user@ip-172-18-138-43 bin]$ sudo reboot
## After reboot
ssm-user@ip-172-18-138-43 bin]$ ip ro
default via 172.18.128.1 dev eth0
default via 172.18.128.1 dev eth1 metric 10001
169.254.169.254 dev eth0
172.18.128.0/20 dev eth0 proto kernel scope link src 172.18.138.43
172.18.128.0/20 dev eth1 proto kernel scope link src 172.18.132.145
Finally, I was able to fix this problem by removing the config of eth0:
sudo rm /etc/sysconfig/network-scripts/ifcfg-eth0
I will add it to the script. |
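The before/after route tables above differ only in the extra default route via eth0. A minimal sketch of how that state could be detected from `ip route` output follows; the function names `default_route_devs` and `has_eth0_default` are made up for illustration and are not part of the module.

```shell
# Print the device of every default route found in `ip route` output on stdin.
# (Hypothetical helper, not taken from the module.)
default_route_devs() {
  awk '$1 == "default" { for (i = 2; i < NF; i++) if ($i == "dev") print $(i + 1) }'
}

# Exit 0 only if a default route via eth0 is present, which is the
# post-reboot state reported in this thread.
has_eth0_default() {
  default_route_devs | grep -qx 'eth0'
}
```

On an instance this might be used as `ip route | has_eth0_default && echo "eth0 still has a default route"`. It only reports the condition; whether to remove the eth0 config is the judgment call discussed in this thread.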
I think #42 resolved the issue. Please let me know if the issue still occurs. |
I have tested the 2.0.1 release from the Terraform registry, and it doesn't work. It still has eth0 as the default route, so the instance can't send traffic to the internet. Which version should I test? |
I'm quite keen to get a version of this published on the registry that works. Rather than me publishing a copy of yours, can we work together to get it working, if you have time sometime in the next couple of weeks? My solution is working for us, but it's not perfect: it ends up with two default routes and two interfaces in the same subnet. |
Yeah, this latest fix is bogus. I built this module from the example in README.md and this is my NAT instance's networking details after a reboot:
Long story short, it appears this module is broken. I tried downgrading to 2.0.0, but after that I couldn't even connect to the EC2 instance via SSM to debug this. |
When you tried using it, did you have an EIP assigned to the NAT instance?
It has to be created externally to the module, then passed in.
I had problems when I didn't assign an EIP.
…On Sat, 23 Jul 2022, 2:53 AM Julian Calaby, ***@***.***> wrote:
Yeah, this latest fix is bogus.
I built this module from the example in README.md and this is my NAT
instance's networking details *after a reboot*:
sh-4.2$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 06:e8:a4:c9:de:f6 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 06:33:c7:03:41:92 brd ff:ff:ff:ff:ff:ff
inet 10.0.128.88/24 brd 10.0.128.255 scope global dynamic eth1
valid_lft 3401sec preferred_lft 3401sec
inet6 fe80::433:c7ff:fe03:4192/64 scope link
valid_lft forever preferred_lft forever
sh-4.2$ ip route
default via 10.0.128.1 dev eth1 metric 10001
10.0.128.0/24 dev eth1 proto kernel scope link src 10.0.128.88
sh-4.2$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- anywhere anywhere
Long story short, it appears this module is broken, I tried downgrading to
2.0.0, but after that I couldn't even connect to the EC2 instance via SSM
to debug this.
|
Yep, had to reorganise stuff so I could, but I did have an EIP on the NAT instance when I did my first round of testing with version 2.0.1. My initial testing of this module failed to produce a working internet connection on the NAT instance or an instance on a private subnet, so it looks like something's misconfigured or missing. For the record, it's possible that the "something missing" is entirely my fault.

My understanding is that NAT gateways work like this: private host -> network interface -> NAT -> route table -> internet. So how we get to the internet shouldn't matter, which makes deleting the eth0 configuration script, and therefore leaving that interface unconfigured after a reboot, seem bogus. That said, all my previous hacking has used separate interfaces for the input and output sides of the NAT gateway, so it's quite possible it'll all work on one interface and leaving eth0 unconfigured is correct.

I suspect that I've made a mistake somewhere here, but I also know that the NAT gateway should have had internet access in my testing, and the fact that it doesn't is concerning. I'm going to try a couple of other options then maybe return to this depending on the outcome. fck-nat seems promising if I can figure out a simple way to Terraformise its setup.

(Another thing that stood out is that the ENI handling needs to be smarter: we should be able to detect whether it's already connected or somehow still in use (e.g. after an instance is terminated) and respond appropriately.) |
I've been thinking about this over the past couple of days and worked out why deconfiguring Essentially the bit I was missing here is that we need to have a public IP address so we can send stuff through an internet gateway and that the floating ENI ( This makes sense with the current use cases:
The reason why it wasn't working for me initially is because if the EIP isn't available before the EC2 instance starts, it doesn't get the routes it needs and is therefore cut off from the internet. I'd really like this module to work without an EIP, so I'm going to hack together a patch to always use |
Ok:
@int128 these changes are probably overkill and I haven't tested DNAT, but they Work For Me so they should be mergeable. |
This module uses eth1 with the EIP to pin the source IP address. I think your change breaks the fixed IP feature. What do you think? |
This is what I have been using, which seems to be OK; at least, I haven't had enough problems with it to notice.
|
I guess it depends on your use case. If you need all your NATed traffic to come from a constant IP, then yeah, this breaks that, but this should be a pretty niche use-case and NAT instances should be pretty long-running and therefore have a relatively constant IP address, just not one known in advance. If DNAT port forwarding is enabled, it should still work as long as the services inside the private subnet aren't expecting to be able to tell remote services something like "hey, connect to whatever my IP is, but on port 1234", where port 1234 has previously been opened using DNAT. Again, this should be a pretty niche use-case and I think that most common services that do this, e.g. active FTP, already have special case handling in Linux. I guess that in my opinion, a constant source IP address isn't required for well over 90% of use cases, so this will be fine and removes the need for an EIP, reducing costs and resource usage. But yeah, we can't ignore those niche cases, so maybe this should be switchable then? No EIP required for the common use cases, and tell the module it'll have an EIP if you absolutely need certainty about the source IP. |
I think this case needs to be supported; it's not that uncommon to have a whitelisted external IP.
As per the AWS docs:
So I don't think having the Elastic IP adds any cost, because the NAT instance exists all of the time. |
I agree that there are situations where it's needed, so I'll make it configurable.
True, but you're limited to 5 of them without jumping through hoops with AWS support. I had to change how I was doing stuff in my VPC because I was using all 5 before I deployed this. So for people whose allocation is mostly used up by things they can't change, or who want more than 5 VPCs with NAT instances, it'd be nice to not require one. |
I am using this module, but the EIP is not attaching to the NAT instance and the snat service is failing. When I analyzed the repo, I found that this NAT module has a runonce.sh script and a snat.sh script. The runonce.sh script is responsible for attaching the ENI to the NAT instance and then starting the snat service, but this is not working as expected. |
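For anyone debugging the snat step by hand, here is a rough sketch of the two commands a minimal source-NAT setup generally needs, matching the MASQUERADE rule visible in the `iptables -t nat -L` output earlier in this thread. It is an assumption about what such a script typically does, not a copy of the module's snat.sh; the function name `print_snat_commands` and the default interface `eth1` are illustrative. The function only prints the commands so they can be reviewed before running.

```shell
# Print (rather than run) the commands a minimal SNAT setup would need for the
# given outbound interface. The function name and the default "eth1" are
# illustrative assumptions, not taken from the module's snat.sh.
print_snat_commands() {
  local iface="${1:-eth1}"
  # Let the kernel forward packets between interfaces.
  echo "sysctl -q -w net.ipv4.ip_forward=1"
  # Rewrite the source address of traffic leaving through the interface.
  echo "iptables -t nat -A POSTROUTING -o ${iface} -j MASQUERADE"
}
```

Running `print_snat_commands eth1` shows the commands; piping the output to `sudo sh` would apply them. This does not cover the ENI attachment that runonce.sh is described as doing.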
Hi,
I've had issues with this not working, although it used to work.
It seems that when it deletes the default route:
The NAT instance then loses all internet connectivity.
Does this still work for you?