Lack of per-plugin timeout makes transactions time out (w/patch) #294

Open
yitzhaq opened this issue Jul 19, 2020 · 1 comment

yitzhaq commented Jul 19, 2020

This seems to be an old and well-known issue (even acknowledged by @abh back in the day as a "pretty serious bug"), but I couldn't find a report for it here, so it seems worth raising. Googling "qpsmtpd timeout" turns up plenty of threads to read through.

AFAICT there is no mechanism for defining a timeout per plugin; only a general timeout appears to apply (a good 600s, judging from observation). This can have particularly nasty consequences when something third-party, called by a plugin, experiences locking issues or similar. Typical examples would be SpamAssassin, any virus scanner, DSPAM, etc.

The unfortunate common manifestation is that the other end, after waiting up to ten minutes for a response, drops the connection and assumes delivery has failed, even though delivery will eventually succeed once the timeout is reached. The sender then retries, producing a duplicate message whose content is the same but which is usually different enough to escape most duplicate detection. If the issue repeats itself, this leads to another duplicate message, rinse and repeat. On top of this, the stalled connections can cause qpsmtpd to run out of available connections, causing further delivery issues. Not a pretty sight, and in the worst case this can bring an MTA to its knees when third-party software experiences issues.

Back in 2008 @vetinari posted a proposed plugin to configure per-plugin timeouts, which I believe would be the way to go here, and a superior approach to adding timeout functionality to each and every plugin. A few core changes are however necessary to support this mechanism (also in the patch), so it's not quite just a drop-in fix. His proposal received no comments, so I don't know to what extent any of this has been sanity checked.
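To make the idea concrete, here is a minimal sketch of a per-plugin timeout wrapper. It is not @vetinari's patch and not qpsmtpd code (which is Perl); it only illustrates the general technique of arming an alarm around each hook call. The names `run_hook`, `PLUGIN_TIMEOUTS`, `DEFAULT_TIMEOUT` and the `"TEMPFAIL"` result are all hypothetical, and the sketch assumes a Unix system with hooks running in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class PluginTimeout(Exception):
    """Raised when a plugin hook exceeds its time budget."""


@contextmanager
def time_limit(seconds):
    """Interrupt the enclosed block with PluginTimeout after `seconds`."""
    def _on_alarm(signum, frame):
        raise PluginTimeout(f"hook exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always disarm the timer and restore the old handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


# Hypothetical per-plugin limits in seconds; plugin names are illustrative.
PLUGIN_TIMEOUTS = {"spamassassin": 60, "virus_scan": 30}
DEFAULT_TIMEOUT = 600  # the general timeout observed in this report


def run_hook(name, hook, *args, timeouts=PLUGIN_TIMEOUTS, default=DEFAULT_TIMEOUT):
    """Call `hook`, giving up with a temporary failure if it runs too long."""
    try:
        with time_limit(timeouts.get(name, default)):
            return hook(*args)
    except PluginTimeout:
        # Assumption: a timed-out plugin maps to a temporary failure,
        # so the sender retries later instead of waiting for ten minutes.
        return "TEMPFAIL"
```

For example, `run_hook("spamassassin", scan, message)` (with `scan` and `message` supplied by the caller) would return whatever `scan` returns, or `"TEMPFAIL"` if it has not finished within 60 seconds. Keeping the limits in one table rather than inside each plugin is the point of the proposal above.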

It would be wonderful if someone (which if we're being realistic at this point probably means @msimerson) would be willing to review Hanno's code, apply whatever polish feels necessary, and if it works well, merge it.

@msimerson
Member

Having seen this same issue in both Qpsmtpd and Haraka, I don't think that per-plugin timeouts are the correct answer. A more robust solution is for the MTA to verify that the remote is still connected immediately before calling the queue plugin(s). If there's nobody at the other end, we won't be able to inform them of queue success or failure and thus should discard the message.
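A rough sketch of such a pre-queue check, assuming a plain TCP socket and a Unix-like system. It is not taken from Qpsmtpd or Haraka; `peer_still_connected` is a hypothetical helper name, and the check is best-effort only.

```python
import select
import socket


def peer_still_connected(sock):
    """Best-effort check that the remote has not closed the connection."""
    # Poll without blocking: a closed or reset socket shows up as readable.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return True  # nothing pending, so no EOF or reset has been seen
    try:
        # Peek so that any real data stays in the buffer for later reads.
        data = sock.recv(1, socket.MSG_PEEK)
    except OSError:
        return False  # connection reset or otherwise unusable
    return data != b""  # an empty read means the peer closed (EOF)
```

The MTA would call this right before the queue hook and discard the message if it returns `False`. Two limitations are worth keeping in mind: unread bytes still in the receive buffer hide a close that follows them, and a peer that vanished without sending FIN or RST still looks connected.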
