Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix too greedy TrackLink regex replacement. #2355

Merged
merged 1 commit into from
Mar 25, 2025
Merged

Conversation

0xdeb
Copy link
Contributor

@0xdeb 0xdeb commented Mar 17, 2025

Hi there!

First, thank you for developing and maintaining listmonk - it's an awesome tool and very easy to use!

I implemented listmonk at my company a few weeks ago, and one recurring issue reported by the marketing team is that they sometimes cannot save tracked links.

Issue Description

After investigating, I found that the issue stems from the TrackLink regex replacement, which incorrectly matches more characters than intended. This results in template compilation errors like:

{"message":"Error compiling template: error compiling message: template: content:1: unexpected \"\u003e\" in operand"}

A minimal reproducible example is inserting two links in the same line:

<a href="https://google.de">Untracked Link</a><a href="https://google.de@TrackLink">Tracked Link</a>

This causes the campaign to fail rendering.
image

Root Cause

The issue occurs because the @TrackLink syntax regex matches too many characters, unintentionally breaking the DOM.
Screenshot_20250317_172446
As you can see here, the expression starts with the "https://" of the first link and ends with the second tracked link.
Different combinations of imgs and links in the same link can of course also trigger this issue.

Fix

I updated the regex to only match characters that are valid in a URL per RFC 3986, ensuring a clean stop after the URL.

Let me know if you'd like any adjustments. Thanks!

@knadh
Copy link
Owner

knadh commented Mar 18, 2025

Ah, that's a neat gotcha! Thanks for the well articulated PR.

The Regexp can be simplified as (https?://[\w\-\.~!#$&'()*+,/:;=?@\[\]]*)@TrackLink. Could you please amend the PR with this? This uses aliases for character classes and removes redundant escape chars.

PS: I tried to amend the PR with the GitHub CLI and messed it up somehow.

@0xdeb 0xdeb force-pushed the master branch 2 times, most recently from 05b51a9 to dab20f0 Compare March 18, 2025 23:15
@0xdeb
Copy link
Contributor Author

0xdeb commented Mar 18, 2025

Sure! Integrated the change and rebased on current master to get rid of the merge commit.

On second thought, what so you think about using \p{L} in the expression instead?
\w does not match accented letters and the rich text editor does not save the domain in IDN (xn--...). So if one were to enter a internationalized domain and enable tracking, that wouldn't work with \w.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants