Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PK for APIKey models should not contain hashed key #128

Open
florimondmanca opened this issue Apr 7, 2020 · 7 comments
Open

PK for APIKey models should not contain hashed key #128

florimondmanca opened this issue Apr 7, 2020 · 7 comments
Labels
enhancement Enhancement of an existing feature

Comments

@florimondmanca
Copy link
Owner

Is your feature request related to a problem? Please describe.
Currently the PK on APIKey model (or derived models) is a string in the form of {prefix}.{hash}. While not completely insecure, this string contains special characters so it's tricky to pass in URLs, making it hard to build frontend API key management functionality.

Describe the solution you'd like
Convert the APIKey PK to a standard autoincremented integer.

A migration and detailed upgrade instructions should be provided to make this change non-breaking (no API keys should be lost/have to be regenerated in the process).

Describe alternatives you've considered
/

Additional context
This is a long-discussed feature, and actually there has been discussion on this in the past:

So to solve this we should:

@florimondmanca florimondmanca added the enhancement Enhancement of an existing feature label Apr 7, 2020
@florimondmanca florimondmanca changed the title APIKey PK should be an integer APIKey PK should be an integer Apr 7, 2020
@florimondmanca florimondmanca changed the title APIKey PK should be an integer Integer PK for APIKey models Apr 7, 2020
@wijowa
Copy link

wijowa commented Nov 18, 2020

This seems like common sense. I don't see the purpose of having a string PK for the api keys and it causes more problems, as well as being very difficult to work with.

The string PK breaks pretty much everything in Django, Django Admin, and Django Rest Framework. I am trying to figure out how to deal with this right now.

@davidfischer
Copy link
Contributor

While implementing #244, I highlighted that upgrading the hash algorithm will cause the hashed key and the hashed key that is part of the PK to be out of sync. The solutions aren't great and I decided to upgrade the PK with some awkward code.

My team uses this package for short lived API keys that have both a short expiry time and we revoke them when we're done with them. We discovered that if a key has the hash updated and then somebody tries to update/revoke the key they will get an error since the PK has changed. This isn't really a big deal as this error will be caught but other people may hit this niche issue with old keys when they're updated and then immediately revoked. However, this issue wouldn't exist if the hash wasn't part of the PK.

You could build the change you propose to switch to an autoincremented ID. It might be a little complicated since you're changing the type of the PK and maybe you'd need multiple migrations with an intermediate field. A possibly simpler solution, since you already have 150 characters for the PK is to just store a stringified random UUID as the PK. You wouldn't need to change the type of the field and the migration would be to just loop over all keys and save a UUID over the PK. I didn't see any lookups by PK in the existing code so this should just work but could break other people's code in the wild if they're depending on the PK being in the hash format (which they shouldn't be).

For example

class BaseAPIKeyManager(models.Manager):
    key_generator = KeyGenerator()

    def assign_key(self, obj: "AbstractAPIKey") -> str:
        try:
            key, prefix, hashed_key = self.key_generator.generate()
        except ValueError:  # Compatibility with < 1.4
            generate = typing.cast(
                typing.Callable[[], typing.Tuple[str, str]], self.key_generator.generate
            )
            key, hashed_key = generate()
            prefix, hashed_key = split(hashed_key)

        obj.id = str(uuid.uuid4())   # <-------------------------
        obj.prefix = prefix
        obj.hashed_key = hashed_key

        return key

@florimondmanca florimondmanca changed the title Integer PK for APIKey models PK for APIKey models should not contain hashed key Sep 14, 2023
@florimondmanca
Copy link
Owner Author

florimondmanca commented Sep 14, 2023

We discovered that if a key has the hash updated and then somebody tries to update/revoke the key they will get an error since the PK has changed.

Would that be because they would have obtained the API key object by ID, instead of using the prefix?

Or do you mean due to a concurrency issue where they hold onto the API key with a hashed key in ID, then the migration occurs, and they've got a stale object now?

Would you reckon that this warrants being in the "Changed" section of the changelog in #248? For now it is marked as "Fixed". Also, how about a proper "Upgrade guide" in the docs? Like for 2.0... https://florimondmanca.github.io/djangorestframework-api-key/upgrade/2.0/

just store a stringified random UUID as the PK

Yes, that would certainly make for a simpler migration.

I also don't think the PK change would be a problem for us. And although you can never really know what people do in the wild, there's one typical situation we can expect to happen: the API key's ID stored as a foreign key on other models.

I think I saw this back then and had thought about ways or hints for people to upgrade. Essentially people would need to also update the foreign keys with the new UUIDs. If we keep VARCHAR instead of a proper UUID type (e.g. Postgres has that), then their migration would also be simpler: "just" replace the FK content with the new UUID. That would need some scripting though.

@beda42
Copy link

beda42 commented Sep 14, 2023

I wonder, is there really some need to update the PK as part of #244? I think that the only requirements for PK are stability and uniqueness, and the original PK should work OK in this respect.

Is there any part of the code that relies on the PK having the same content as the prefix and hashed_key attrs? If yes, it may be better to remove such reliance rather than breaking possible foreign keys to the model.

@davidfischer
Copy link
Contributor

Would that be because they would have obtained the API key object by ID, instead of using the prefix?

Or do you mean due to a concurrency issue where they hold onto the API key with a hashed key in ID, then the migration occurs, and they've got a stale object now?

It's the latter. The object is stale and the reference is to an object whose PK is no longer in the DB.

Would you reckon that this warrants being in the "Changed" section of the changelog in #248? For now it is marked as "Fixed". Also, how about a proper "Upgrade guide" in the docs?

I think it will have to have an upgrade docs especially since anybody linking to the PK with a FK will have problems.

I think I saw this back then and had thought about ways or hints for people to upgrade. Essentially people would need to also update the foreign keys with the new UUIDs. If we keep VARCHAR instead of a proper UUID type (e.g. Postgres has that), then their migration would also be simpler: "just" replace the FK content with the new UUID. That would need some scripting though.

It would but at least it is possible. Since the old FK will contain the prefix, it should be possible. However, it will cause the migration to upgrade PKs to fail since it can't leave dangling FKs.

I wonder, is there really some need to update the PK as part of #244? I think that the only requirements for PK are stability and uniqueness, and the original PK should work OK in this respect.

This is worth a consideration. Nothing in the code in this module requires that. We could just leave them out of sync until this issue is fixed up.

@beda42
Copy link

beda42 commented Sep 15, 2023

I wonder, is there really some need to update the PK as part of #244? I think that the only requirements for PK are stability and uniqueness, and the original PK should work OK in this respect.

This is worth a consideration. Nothing in the code in this module requires that. We could just leave them out of sync until this issue is fixed up.

If one thinks about the PK simply as a unique identifier, then this seems like a no-brainer - just leave it as it is. Expecting the PK to somehow reflect the internal content of the objects is something I have never seen anywhere else and does not really make sense.

@davidfischer
Copy link
Contributor

Expecting the PK to somehow reflect the internal content of the objects is something I have never seen anywhere else and does not really make sense.

I agree it doesn't but having an inconsistency isn't great either. However, if the plan is to just switch to a unique PK that doesn't reflect the hash (whether that's a string UUID or just an integer PK) soon, keeping it consistent seems unnecessary.

I've created a PR #251 that removes the code that updates the PK.

@florimondmanca florimondmanca added breaking This involves breaking API changes and removed breaking This involves breaking API changes labels Sep 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement of an existing feature
Projects
None yet
Development

No branches or pull requests

4 participants