Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compliance with vtypes #86

Open
smarie opened this issue Feb 20, 2020 · 11 comments
Open

Compliance with vtypes #86

smarie opened this issue Feb 20, 2020 · 11 comments

Comments

@smarie
Copy link

smarie commented Feb 20, 2020

Hi, I found out that pytypes was not able to validate a vtype:

from vtypes import VType
from pytypes import is_of_type

class PositiveInt(int, VType):
    __validators__ = {'should be positive': lambda x: x >= 0}

assert isinstance(1, PositiveInt)
assert is_of_type(1, PositiveInt)  # AssertionError !

I do not know if this can be fixed easily, but I thought you could be interested. Indeed is_of_type documentation states

Works like isinstance, but supports PEP 484 style types from typing module.

Since pytypes is one of the two type validation library supported by default in pyfields, I would like to make both behave the same way (typeguard seems to handle vtype ok)

Thanks!

@Stewori
Copy link
Owner

Stewori commented Feb 20, 2020

This looks like correct behavior to me:

from pytypes import is_of_type

class int2(int): pass

print (issubclass(int, int2))    # True
print (issubclass(int2, int))    # False

A = int2()
print(A)
print(is_of_type(1, int))    # True
print(is_of_type(A, int2))   # True
print(is_of_type(A, int))    # True
print(is_of_type(1, int2))   # False

1 is plainly an int, not a subclass of int, unless instantiated in a special way, i.e. as an instance of that subclass. In the example the subclass relation goes only in one direction: an int2 is an int but an int is not an int2. Same with your vtypes example.

Maybe you can achieve what you want if you use a subclasshook in PositiveInt. Looking at the vtypes code, it does not seem to apply subclasshooks.

from pytypes import is_of_type

class int2meta(type):
    def __subclasscheck__(cls, subclass):
        if subclass is int: return True
        return issubclass(cls, int2)

class int2(int, metaclass=int2meta): pass

print (issubclass(int, int2))    # True
print (issubclass(int2, int))    # True

A = int2()
print(A)
print(is_of_type(1, int))    # True
print(is_of_type(A, int2))   #True
print(is_of_type(A, int))    #True
print(is_of_type(1, int2))   #True

However, compliance with subclasshooks is not much tested in pytypes. Use with care! On the surface it seems to work, like in the example above. But in complex situations, I recommend careful testing. Please report back bugs with pytypes and subclasshook if they pop up.

Also note that there are literal types now, which have some overlap with what vtypes wants to achieve. However literal types are not yet supported in pytypes and I cannot tell when they will be.

@smarie
Copy link
Author

smarie commented Feb 20, 2020

The thing is, a vtype is not a standard type. I could have written:

PositiveInt = vtype('PositiveInt', int, {'should be positive': lambda x: x >= 0})

or

class PositiveInt(VType):
    __type__ = int
    __validators__ = {'should be positive': lambda x: x >= 0}

Both are equivalent to my initial example. VTypes override __instancecheck__ to determine if an object is an instance of them. And the implementation just checks that the object is an instance of all the __type__ and passes all the __validators__. I can not reach the same result using a subclass hook, since the validators check the value, not the type: 1 is an instance of PositiveInt while -1 is not.

So, the only way to get it right is to call isinstance... which I thought is_of_type would do (and that typeguard, as well as my own small type checker in pyfields do). That's why I was surprised. I understand that this is an implementation choice, however the doc is a bit misleading then..

You are right that there is one similarity with literal types, in the sense that the type validation is on the value, not on the type. We can see a literal type as a VType without base types and with a simple "equality" validator.

Thanks!

@Stewori
Copy link
Owner

Stewori commented Feb 20, 2020

Indeed, I overlooked that __instancecheck__ should be considered by is_of_type. However, if someone overrides isinstance and/or issubclass I would usually expect the promise of consistency to hold:
isinstance(A, typeA) == issubclass(type(A), typeA). Maybe pytypes should actually check this consistency; it's about checking consistencies afterall.
I am not sure how __instancecheck__ can or should be respected by is_of_type. I will think about it. Don't you think that vtypes should on the other hand respect the consistency promise mentioned above?

Unfortunately, the Python documentation and also the original PEP seem not to be specific about which check should be authoritive over the other:
by what right would we trust the result of isinstance(A, typeA) over the result of issubclass(type(A), typeA)? I would think a failure of any of them should be a reason to alert the user/reject the overall typecheck. Maybe is_of_type should have a fail-fast early check like
if not isinstance(A, typeA): return False.

Maybe the best for all usecases would be to make it configurable. However, it is non-trivial work to let is_of_type prefer isinstance(A, typeA) over the result of issubclass(type(A), typeA) and I doubt if this additional option would pay off.

@Stewori
Copy link
Owner

Stewori commented Feb 20, 2020

Okay, I had some thought and I think the best solution would be:

  • you add a __subclasscheck__ to VType (i.e. to the metaclass) such that at least for types like
    class SomeSubtypename(someType, VType):, issubclass(SomeSubtypename, someType) returns true. IMO this would be a desirable consistency for VTypes anyway.

  • I make sure that is_of_type fails fast regarding isinstance() That means, it would reject is_of_type(-1, PositiveInt). Correct and strict behavior for the negative case is also important and what you want, right? It's even the main idea I suppose.

  • I'll update the doc of is_of_type to make clear it differs from isinstance in the sense that issubclass(type(A), typeA) is checked in addition to isinstance(A, typeA), i.e. is significantly stricter than one is used from usual isinstance behavior.

These actions should cause vtypes and pytypes to be compatible, right? How this sounds?

@Stewori
Copy link
Owner

Stewori commented Feb 21, 2020

I realize it is more complicated to get this right if the vtype is not toplevel, but somewhere nested in tuples, etc.. However, similar work would be required for LiteralType support. So it should be done at some point, but I cannot tell when.

@smarie
Copy link
Author

smarie commented Feb 21, 2020

I am not sure there is a need to change the subclass behaviour. Current behaviour seems intuitive to me:

>>> issubclass(PositiveInt, int)  # all positive ints are ints
True
>>> issubclass(int, PositiveInt)  # not all ints are positive ints
False

Now I understand that more work is needed to integrate isinstance in the checking method especially when nesting in Union and Tuple etc. is present.
Thanks for the answer anyway !

@Stewori
Copy link
Owner

Stewori commented Feb 21, 2020

issubclass(int, PositiveInt) # not all ints are positive ints

Indeed, that should be this way.

Now I understand that more work is needed to integrate isinstance in the checking method

I have another idea, that would be rather easy to implement, but computationally not very efficient. It would require an opt-in, so it may be okay.
deep_type could be made aware of the validating types such that it would actually return PositiveInt as the tightest type for e.g. 1.
To support what you want, it would be inherently necessary to make pytypes aware of some special types that are "value types" and are allowed to break the consistency isinstance(A, typeA) == issubclass(type(A), typeA). I think like offering something like a function register_value_type. You would have to call something like pytypes.register_value_type(VType) in pyfields right after import pytypes. Then deep_type would use __subclasses__ on registered value type classes and once it has inferred e.g. an int it would apply all known validators that are subclasses of int to see which apply. A kind of intersection type would be needed to handle multiple positive results, e.g. deep_type(2) may result in Intersection[PositiveInt, EvenInt]. A general intersection type would be not trivial to get right in coherence with typing module, especially across versions. I rather think of a type only for internal use and not supporting to be used as a type annotation.
Finally a method is_subclass_intersection_value_types would need to be written, however I think a lot of complexity like in is_subclass_union could be avoided because that complexity arises from the possibility of recursions and type variables. These difficult cases wouldn't occur for a specialized intersection type that would possibly only permit value types for this particular internal use case.

This would let the nesting logic be done by deep_type where it is already implemented, yet on the cost of various unnecessary isinstance(blah, some_vtype) checks. It might become a problem if there are significantly many different subclasses of vtype in the runtime that scope the same supertype, e.g. int. However, it would be a first step and further steps would just be performance improvements then. Also, having deep_type support value types would be a value on its own. Maybe LiteralTypes could be supported in the same manner.

@smarie
Copy link
Author

smarie commented Feb 22, 2020

To support what you want, it would be inherently necessary to make pytypes aware of some special types that are "value types" and are allowed to break the consistency isinstance(A, typeA) == issubclass(type(A), typeA)

This is one solution, another one even simpler could be to rely on isinstance(A, typeA) everywhere pytypes relies on issubclass(type(A), typeA) - but you mentioned earlier that this means a major refactoring that may not pay off. Just out of curiosity, would that have any other drawback at all apart from refactoring ? If this has drawbacks, then I agree that something else and more complex should be done. I am not very fond of having to register classes, it seems a bit overkill. But I could definitely register the VType parent class with pytypes if that helps - using a methods such as register_instancechecking_class(). Unfortunately, that will add an extra cost everywhere (checking if target class is a subclass of all registered classes)... probably not something you want to do.

Maybe you should just reject this proposal for now - a VType is clearly a "strange type" after all - I would understand.

Also, maybe we should wait for Literal[...] to be handled first, that will probably pave the way ?

@Stewori
Copy link
Owner

Stewori commented Feb 23, 2020

Maybe you should just reject this proposal for now

The thing is, I wanted to have value types anyway. In the long run, to be able to work with numpy arrays. That's why this is not rejected. However, the priority is #40 and getting the basics right.

The value types I have in mind are somewhat different from vtypes but this is reason enough to consider preparing the necessary architecture (also for Literal types).

rely on isinstance(A, typeA) everywhere

A main motivation in pytypes is to have a proper PEP 484 aware issubclass equivalent. The typechecking is more a proof of concept on top of it. (I personally find typesafe @override much more important, also given there are alternatives for typechecking). That's a main reason why is_of_type works as a combination of deep_type and is_subtype. Also, in general I think the issubclass(type(A), typeA) is a justified check that shouldn't be dropped. And normally inconsistency between issubclass- and isinstance-based checks is a reason for alert on its own right. So I won't drop that.

What I might do is kind of (in pseudo code):

if [value available]:
    if not isinstance(value, [type]): return False
    # skip further tests if it is a value type
    if [type is registered as instancechecking]: return True

# ordinary types must fulfill this too:
return issubclass(type(value), [type])

But that requires to provide the value throughout various methods that are originally designed for subtypechecking. This requires more architecture change. I will consider this after #40. Also, don't expect a solution too soon. You may have to suspend pytypes support for a while.

@Stewori
Copy link
Owner

Stewori commented Feb 23, 2020

checking if target class is a subclass of all registered classes

not all registered classes. Only those that are also subclasses of the target class. They would be stored in a dict with the target class as the key, e.g. {int: [PositiveInt, EvenInt, etc], float: [PositiveFloat, etc], etc}. That should reduce it a bit. Not the final solution though.

The bet here is of course that one would normally not have thousands of different validators for the same type, but only a handful.

@smarie
Copy link
Author

smarie commented Feb 23, 2020

I see. Thanks for these thoughts ! I'll check for updates from time to time.

I added a note in pyfields documentation in the meantime: here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants