Skip to content

feat(migrate): implement migration sort customization #1186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jeffreydwalter
Copy link
Contributor

Fixes #1185

@jeffreydwalter jeffreydwalter changed the title bug(migrations): Sort migrations as ints not strings bug(migrations): sort migrations as ints not strings Apr 17, 2025
@jeffreydwalter jeffreydwalter force-pushed the jeffreydwalter-fix-migration-sort branch from 754bfd1 to 22f9eb4 Compare April 17, 2025 22:13
@jeffreydwalter jeffreydwalter force-pushed the jeffreydwalter-fix-migration-sort branch from 22f9eb4 to 60f07c3 Compare April 17, 2025 22:16
@jeffreydwalter jeffreydwalter changed the title bug(migrations): sort migrations as ints not strings fix(migrations): sort migrations as ints not strings Apr 17, 2025
@Aoang
Copy link
Contributor

Aoang commented Apr 18, 2025

Thanks for raising this issue and submitting the pull request. You've identified a valid point: the current lexicographical sort on the migration prefix isn't ideal for purely numeric prefixes like 0_, 1_, 10_, where it can lead to incorrect ordering.

Your proposed change to perform a numeric sort when the prefix is identified as a number is a good improvement for handling such cases and standard timestamps correctly.

However, implementing this change directly as the default behavior would unfortunately introduce a breaking change. For example, users who might currently have files named 01_... and 001_... and rely on the specific lexicographical ordering (001, 01) would see the behavior change (they would be treated as numerically equal: 1, 1). A more impactful example is files like 10_ten.up.sql and 2_two.up.sql; lexicographical sort yields (10_ten, 2_two), while numeric sort would yield (2_two, 10_ten), changing the execution order.

To accommodate the improved sorting logic without disrupting existing users, we could introduce it as an optional behavior. We could add a new Option to the migrator configuration that enables numeric sorting. By default, this option would be disabled, preserving the current lexicographical sorting and ensuring backward compatibility.

Furthermore, thinking about long-term flexibility, we could potentially abstract the sorting logic further. Perhaps defining a sorter interface or allowing users to provide a custom comparison function would be beneficial. This would empower users to implement their own preferred sorting strategy, catering to diverse naming conventions like the 0001_user_add_index, or any other scheme they might devise.

What are your thoughts on introducing this numeric sort capability as an opt-in feature initially? This seems like a safe way to add the functionality while we consider if a more customizable sorting mechanism is needed down the line.

Thanks again for your contribution!

@jeffreydwalter
Copy link
Contributor Author

jeffreydwalter commented Apr 18, 2025

I don't really have an opinion about how you guys solve the problem. I think giving user the ability to implement their own sorting logic is fine. I personally think using timestamps is too cumbersome. Thinking about this a bit more though, what probably makes the most sense here is to do a human friendly "natural" sort... Doing so would handle all three of the use cases you identified as expected. Really, the only behavior change would be to filenames that start with ints.

@jeffreydwalter
Copy link
Contributor Author

Perhaps something like this:

import (
	"sort"
	"strconv"
)

func sortAsc(ms MigrationSlice) {
	sort.Slice(ms, func(i, j int) bool {
		return naturalLess(ms[i].Name, ms[j].Name)
	})
}

func sortDesc(ms MigrationSlice) {
	sort.Slice(ms, func(i, j int) bool {
		return naturalLess(ms[j].Name, ms[i].Name) // reverse
	})
}

// naturalLess:
//   1. If both names start with a valid integer, compare numerically.
//      • Same numeric value → shorter literal wins (01 < 001 < 0001).
//   2. Otherwise fall back to plain lexicographic comparison.
func naturalLess(a, b string) bool {
	ai, errA := strconv.ParseInt(a, 10, 64)
	bi, errB := strconv.ParseInt(b, 10, 64)

	if errA == nil && errB == nil {
		if ai != bi {
			return ai < bi
		}
		return len(a) < len(b) // same value, shorter numeric string first
	}

	return a < b // lexicographic when either prefix isn’t a proper integer
}

@jeffreydwalter
Copy link
Contributor Author

jeffreydwalter commented Apr 18, 2025

Or this if you're not adverse to using a library:

import (
	"sort"

	"golang.org/x/text/collate"
	"golang.org/x/text/language"
)

var coll = collate.New(language.Und, collate.Numeric)

func sortAsc(ms MigrationSlice) {
	sort.Slice(ms, func(i, j int) bool {
		return coll.CompareString(ms[i].Name, ms[j].Name) < 0
	})
}

func sortDesc(ms MigrationSlice) {
	sort.Slice(ms, func(i, j int) bool {
		return coll.CompareString(ms[i].Name, ms[j].Name) > 0
	})
}

@Aoang
Copy link
Contributor

Aoang commented Apr 18, 2025

You're right, it's challenging to definitively say which sorting method is universally "better," as different users have different needs and conventions (it's hard to please everyone!). That's precisely why empowering users with the choice seems like the most flexible path forward.

Our general philosophy is to avoid breaking changes whenever possible, unless they are necessary to fix a bug that prevents the library from functioning correctly. We try to be very careful about this because introducing breaking changes, even with good intentions, can cause significant disruption and unexpected work for existing users who rely on the current behavior.

Therefore, providing the sorting capability as an opt-in feature aligns best with this principle. It allows users who need it to benefit from the new logic without negatively impacting others.

@jeffreydwalter
Copy link
Contributor Author

Great, when can I expect a fix for this then? Personally, I think the natural sort is the better solution here. You guys are going to make a mess of bun by being overly flexible and accommodating.

@bevzzz
Copy link
Collaborator

bevzzz commented Apr 21, 2025

Personally, I think the natural sort is the better solution here. [...] Really, the only behavior change would be to filenames that start with ints.

As @Aoang pointed out above, b/c is one of the primary concerns that guides our decision-making. Other projects and productive systems rely on Bun and we won't be introducing a change that may negatively impact them.

That being said, sorting filename prefixes numerically is a sensible addition and we will consider providing that as a Migrator option.

You guys are going to make a mess of bun by being overly flexible and accommodating.

Bun's documentation is very clear on what the convention for naming migration files is: timestamp + comment.

We want the library to be flexible and useful to as many users as possible; still, this does not come at the cost of reliability or consistency. In contrast, this issue/PR demands that we introduce a potentially breaking change because I [you] personally think using timestamps is too cumbersome.

Great, when can I expect a fix for this then?

The new option may arrive in one of the next releases. We'll keep #1185 as a tracking issue for this, so keep an eye on it 👍

@Aoang
Copy link
Contributor

Aoang commented Apr 24, 2025

You're right, natural sort is indeed a very sensible solution, and arguably a better default in isolation. We agree it has strong merits.

However, introducing it directly as the default behavior in the current major version (v1) is something we're hesitant to do precisely because it constitutes a breaking change. We believe avoiding such changes is crucial for maintaining stability and predictability for our existing users, which is generally a good practice, wouldn't you agree? We anticipate revisiting default behaviors like this when we plan for a future major version (like v2), where breaking changes are more expected.

Regarding implementing the option for alternative sorting methods, we'd be very grateful if you were willing to contribute that! It would certainly help get the feature available sooner. Otherwise, it will be added when I, another team member, or a community contributor has the bandwidth to implement it.

@bevzzz
Copy link
Collaborator

bevzzz commented Apr 28, 2025

@Aoang I am planning to do some work on Bun this week, will take care of introducing this additional option too, np 👍

@jeffreydwalter jeffreydwalter changed the title fix(migrations): sort migrations as ints not strings feat(migrate): implement migration sort customization May 5, 2025
@jeffreydwalter jeffreydwalter force-pushed the jeffreydwalter-fix-migration-sort branch from 5d0ce02 to 80fc253 Compare May 5, 2025 18:12
@jeffreydwalter
Copy link
Contributor Author

@Aoang I updated the PR to include WithSort()

@jeffreydwalter jeffreydwalter force-pushed the jeffreydwalter-fix-migration-sort branch from 76e0077 to 3de66a6 Compare May 5, 2025 18:32
@bevzzz
Copy link
Collaborator

bevzzz commented May 6, 2025

Thank you for adding the updates @jeffreydwalter. Just one comment, looks good otherwise.

@@ -232,7 +231,7 @@ func (ms MigrationSlice) Applied() MigrationSlice {
applied = append(applied, ms[i])
}
}
sortDesc(applied)
SafeDescSort(applied)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that MigrationSlice.Applied is no longer used internally in Bun?

@@ -38,7 +38,7 @@ func NewMigrations(opts ...MigrationsOption) *Migrations {
func (m *Migrations) Sorted() MigrationSlice {
migrations := make(MigrationSlice, len(m.ms))
copy(migrations, m.ms)
sortAsc(migrations)
SafeAscSort(migrations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the only place where sorting needs to be applied, could we allow users to pass in a sorting function when calling it, making the parameter optional?

If we do that, we can store the sorting logic in the migrator, thereby avoiding the need to implement any locking-related logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a breaking change. You said you breaking changes weren't allowed...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@j2gg0s do you mean passing as an option to NewMigrator? Because I don't think users ever call ms.Sorted directly so something like ms.SortedFunc(fn sort.Less) probably won't be that convenient.

Personally, I think having a package-level sorting is fine -- seems like something you'd only want to set once.

Re: maintenance, the overhead of that locking is immaterial + we needn't configure it for AutoMigrator separately.

wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @j2gg0s

jeffreydwalter and others added 2 commits May 6, 2025 17:40
Co-authored-by: dyma solovei <[email protected]>
Update SetSort comment to match func name.
@bevzzz
Copy link
Collaborator

bevzzz commented May 14, 2025

@jeffreydwalter the lint job on you latest commits fails because Bun uses conventional commits -- could you please rewrite those to something like refactor: update migrate/migrator.go so it'd pass?

it's a minor detail, but it helps us have a consistent style. thanks!

@jeffreydwalter
Copy link
Contributor Author

@bevzzz unfortunately, I'm no longer able to make changes to that repo. I left that company. So, feel free to make any updates you want to address the issue.

@bevzzz
Copy link
Collaborator

bevzzz commented May 30, 2025

hey @jeffreydwalter! thanks for the heads-up 👍
I'll take care of merging this PR as soon as we reach an agreement on how the option should be specified (cc @j2gg0s).

Thank you for your contribution and good luck in your next position 💼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migration tool is doing lexicographical sort on timestamp part of the migration filenames
4 participants