Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Implement __contains__ for Tuple, List, ListLiteral (almost) #2658

Open
laszlokindrat opened this issue May 15, 2024 · 18 comments
Open
Labels
enhancement New feature or request good first issue Good for newcomers mojo-repo Tag all issues with this label [stdlib] core-library-q2-2024

Comments

@laszlokindrat
Copy link
Contributor

Now that we have ComparableCollectionElement, we can try to implement __contains__ for some common collection types using a workaround similar to what was employed in #2190. It's possible that the compiler would be able to correctly pick up a __contains__ overload with @staticmethod, e.g. something like this might just work:

struct Bucket[T: CollectionElement]:
    @staticmethod
    fn __contains__[S: ComparableCollectionElement](self: Bucket[S], elem: s) -> Bool: ...

var b: Bool = MyStuff() in Bucket[MyStuff]()

Even if in doesn't work with this, the implementation shouldn't have to change much once we roll out conditional conformance.

@laszlokindrat laszlokindrat added enhancement New feature or request good first issue Good for newcomers mojo-repo Tag all issues with this label labels May 15, 2024
@ChristopherLR
Copy link

Tried something like this:

struct Bucket[T: CollectionElement]:
    var elements: List[T]

    fn __init__(inout self, elements: List[T]):
        self.elements = elements

    @staticmethod
    fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
        for i in range(self.elements.size):
            if elem.__eq__(self.elements[i]):
                return True
        return False

However, i'm getting:

error: special method may not be a static method
    fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
       ^
mojo: error: failed to parse the provided Mojo source module

If I remove @staticmethod:

error: 'self' argument must have type 'Bucket[T]', but actually has type 'Bucket[C]'
    fn __contains__[C: ComparableCollectionElement](self: Bucket[C], borrowed elem: C) -> Bool:
                                                    ^     ~~~~~~~~~
mojo: error: failed to parse the provided Mojo source module

@ChristopherLR
Copy link

Is there something similar to c++ enable_if that i'm missing? https://en.cppreference.com/w/cpp/types/enable_if

@rd4com
Copy link

rd4com commented May 15, 2024

@laszlokindrat , @ChristopherLR , if self can be a reference? this parametrization could do the job:

struct Bucket[T: CollectionElement]:
    var elements: List[T]

    fn __init__(inout self):
        self.elements = List[T]()

    fn __contains__[C: ComparableCollectionElement](
        self: Reference[Bucket[C]], elem: C
    ) -> Bool:
        for i in range(self[].elements.size):
            if elem.__eq__(self[].elements[i]):
                return True
        return False

fn main():
    var x = Bucket[Int]()
    x.elements.append(1)
    if x.__contains__(1):
         print("ok")
    #if 1 in x: #not yet

@ChristopherLR
Copy link

Or if we don't use a special method:

struct Bucket[T: CollectionElement]:
    var elements: List[T]

    fn __init__(inout self, elements: List[T]):
        self.elements = elements

    @staticmethod
    fn contains[C: ComparableCollectionElement](self: Bucket[C], value: C) -> Bool:
        for i in range(self.elements.size):
            if value.__eq__(self.elements[i]):
                return True
        return False


fn main():
    var b = Bucket[Int](List(1, 2, 3))
    if __type_of(b).contains(b, 2):
        print("2 is in the bucket")
    else:
        print("2 is not in the bucket")

@laszlokindrat
Copy link
Contributor Author

@laszlokindrat , @ChristopherLR , if self can be a reference? this parametrization could do the job:

struct Bucket[T: CollectionElement]:
    var elements: List[T]

    fn __init__(inout self):
        self.elements = List[T]()

    fn __contains__[C: ComparableCollectionElement](
        self: Reference[Bucket[C]], elem: C
    ) -> Bool:
        for i in range(self[].elements.size):
            if elem.__eq__(self[].elements[i]):
                return True
        return False

fn main():
    var x = Bucket[Int]()
    x.elements.append(1)
    if x.__contains__(1):
         print("ok")
    #if 1 in x: #not yet

@rd4com Very cool, and I am actually surprised that parameter inference fails for the 1 in x sugar. I will look into this.

modularbot pushed a commit that referenced this issue May 16, 2024
Following @rd4com's suggestion in
#2658, this method can take a
reference type as self, allowing the callsite to use a natural syntax.

This patch also adds a changelog entry for `List.index`.

MODULAR_ORIG_COMMIT_REV_ID: 291dd5f544d7d5b463188e7c986b2ee3c8416662
@laszlokindrat
Copy link
Contributor Author

@rd4com I used your trick on List.index and it worked like a charm: ca10f35. This basically means that we have conditional conformance in these methods, as long as you are willing to deal with the explicit dereferencing (i.e. self[]) in the method body, which I think is totally okay. This is pretty amazing.

Would you be interested in going ahead and implementing __contains__ for the structs above (and maybe others)? Please ping me on the PRs, if so.

Regarding the 1 in x not desugaring correctly, this seems like a parser bug. Not sure when the fix will be ready, but explicit calls to __contains__ are acceptable for the time being.

@rd4com
Copy link

rd4com commented May 16, 2024

@laszlokindrat Nice, List.__contains__ is there: #2667 ,

1 in x works on that one (rebind, _type_is_eq and constained) 🥳

I'll try to implement Tuple.__contains__, just don't know which way is prefered ?

Of course, you're welcome to make theses PR's @ChristopherLR aswel (and any other person too)!

lsh pushed a commit to lsh/mojo that referenced this issue May 17, 2024
Following @rd4com's suggestion in
modularml#2658, this method can take a
reference type as self, allowing the callsite to use a natural syntax.

This patch also adds a changelog entry for `List.index`.

MODULAR_ORIG_COMMIT_REV_ID: 291dd5f544d7d5b463188e7c986b2ee3c8416662

Signed-off-by: Lukas Hermann <[email protected]>
@ChristopherLR
Copy link

Hey @rd4com and @laszlokindrat, not quite Tuple or ListLiteral, I've picked InlineList just as a starting point for me to get used to the codebase :) If available would you mind checking out #2703

@soraros
Copy link
Contributor

soraros commented May 17, 2024

Should we even provide __contains__ for heterogeneous collections like Tuple?

@ChristopherLR
Copy link

ChristopherLR commented May 17, 2024

@rd4com I discovered that this test fails for List when testing InlineList:

def test_list_contains_str():
    var x = List[String]("a", "b", "c")
    assert_false("d" in x)
    assert_true(x.__contains__("a"))
    assert_false(x.__contains__("e"))
constraint failed: value type is not self.T

However this passes:

def test_list_contains_str():
   var x = List[String]("a", "b", "c")
   assert_true(x.__contains__("a"))
   assert_false(x.__contains__("e"))

Note that i've delete assert_false("d" in x)

@laszlokindrat
Copy link
Contributor Author

@rd4com I discovered that this test fails for List when testing InlineList:

def test_list_contains_str():
    var x = List[String]("a", "b", "c")
    assert_false("d" in x)
    assert_true(x.__contains__("a"))
    assert_false(x.__contains__("e"))
constraint failed: value type is not self.T

However this passes:

def test_list_contains_str():
   var x = List[String]("a", "b", "c")
   assert_true(x.__contains__("a"))
   assert_false(x.__contains__("e"))

Note that i've delete assert_false("d" in x)

Yeah, this is the parser bug I mentioned before. Don't worry about it, direct call to __contains__ is fine for now. Thanks for the patch, I will take a look!

@laszlokindrat
Copy link
Contributor Author

laszlokindrat commented May 17, 2024

Should we even provide __contains__ for heterogeneous collections like Tuple?

It would be nice if we could do if "foo" in ("foo", "bar", 1): ....

@soraros
Copy link
Contributor

soraros commented May 17, 2024

@laszlokindrat I agree that it's rather convenient. But shouldn't we write ["foo", "bar", ...] instead?
From the typing point of view, I find it difficult to justify having a __contains__ for a struct (Tuple is a case of this) which iterate over all the compatible fields to test for equity. That feels highly ad-hoc, and feels like a meta programming feature.

@laszlokindrat
Copy link
Contributor Author

laszlokindrat commented May 17, 2024

@laszlokindrat I agree that it's rather convenient. But shouldn't we write ["foo", "bar", ...] instead? From the typing point of view, I find it difficult to justify having a __contains__ for a struct (Tuple is a case of this) which iterate over all the compatible fields to test for equity. That feels highly ad-hoc, and feels like a meta programming feature.

Can you give an example on how this could be misused/abused? I feel pretty strongly that, from a "python superset" point of view, "foo" in ("foo", "bar", 1) has to work, at least for the simple cases.

@soraros
Copy link
Contributor

soraros commented May 17, 2024

The concern here isn't misuse, but rather whether we need to mimic Python in this particular case. Python's tuple serves multiple functions, effectively encapsulating three distinct types:

  1. Tuple: heterogeneous and of fixed length.
  2. InlineArray: homogeneous and of fixed length.
  3. List: homogeneous and of variable length.

In a typed language like Mojo, it's logical for the latter two types to implement this interface. Moreover, a Python tuple can be mapped to either InlineArray[PyObject] or List[PyObject]. Given that the default implementation for __contains__ is any(val == x for x in self), it seems unnecessary to introduce this ad-hoc method for non-sequence types. Similarly, we might want to reconsider the utility of using for v in (a, b, c): in Mojo.

@laszlokindrat
Copy link
Contributor Author

Syntactically, for v in (a, b, c): et al have to be valid. Considering how basic this feature is, and how quickly someone coming from python would try to do this (anecdotally, this was actually the first line of mojo I ever tried to write), we just cannot not have it.

Moreover, a Python tuple can be mapped to either InlineArray[PyObject] or List[PyObject].

Now this is a good point; maybe the problem is that (1, 2, 3) maps to Tuple while it should map to InlineArray? At the same time, Python's Tuple type is also heterogeneous, so our hands are a bit tied (admittedly, though, type checkers probably give up when they see someone iterate through a tuple). So my philosophy is this: if we can't see obvious ways to misuse/abuse, we should push for syntactic compatibility. If we discover that methods like this do pose unforeseen problems, we can 1) document sharp edges, 2) improve/fix them, 3) deprecate and then remove them.

@soraros
Copy link
Contributor

soraros commented May 18, 2024

Syntactically, for v in (a, b, c): et al have to be valid.

Totally, if we can enforce the same type requirement at compile time.

Now this is a good point; maybe the problem is that (1, 2, 3) maps to Tuple while it should map to InlineArray?

Again, agreed.

Question: should we allow True in (1, 2, "a")? Or equivalently, do we require T in Ts for where val: T and t: Tuple[*Ts]. It's tricky because it makes sense to not enforce it for heterogeneous tuple literal, but not so much for homogeneous ones.

modularbot pushed a commit that referenced this issue May 18, 2024
…tains__ to InlineList (#40189)

[External] [mojo-stdlib] Add variadic initialiser, __iter__ and
__contains__ to InlineList

This PR adds some features to InlineList ( related issue #2658 )

*Variadic initialiser*
```mojo
var x = InlineList[Int](1,2,3)
```

*iter*
```mojo
var x = InlineList[Int](1,2,3)
for i in x:
    print(i)
```

*contains*
```mojo
var x = InlineList[Int](1,2,3)
if 3 in x: print("ok")
```

Co-authored-by: Chris <[email protected]>
Closes #2703
MODULAR_ORIG_COMMIT_REV_ID: 6a9eb7b41905e62f8db0855159fd851b7ffb6174
msaelices pushed a commit to msaelices/mojo that referenced this issue May 18, 2024
Following @rd4com's suggestion in
modularml#2658, this method can take a
reference type as self, allowing the callsite to use a natural syntax.

This patch also adds a changelog entry for `List.index`.

MODULAR_ORIG_COMMIT_REV_ID: 291dd5f544d7d5b463188e7c986b2ee3c8416662
msaelices pushed a commit to msaelices/mojo that referenced this issue May 18, 2024
…tains__ to InlineList (#40189)

[External] [mojo-stdlib] Add variadic initialiser, __iter__ and
__contains__ to InlineList

This PR adds some features to InlineList ( related issue modularml#2658 )

*Variadic initialiser*
```mojo
var x = InlineList[Int](1,2,3)
```

*iter*
```mojo
var x = InlineList[Int](1,2,3)
for i in x:
    print(i)
```

*contains*
```mojo
var x = InlineList[Int](1,2,3)
if 3 in x: print("ok")
```

Co-authored-by: Chris <[email protected]>
Closes modularml#2703
MODULAR_ORIG_COMMIT_REV_ID: 6a9eb7b41905e62f8db0855159fd851b7ffb6174
@laszlokindrat
Copy link
Contributor Author

Question: should we allow True in (1, 2, "a")?

I don't see why not. A perfect implementation of __contains__ might not be possible today (we might need parametric traits or something to correctly model comparisons between arbitrary types), but I don't think that should stop us from having something.

martinvuyk pushed a commit to martinvuyk/mojo that referenced this issue May 24, 2024
Following @rd4com's suggestion in
modularml#2658, this method can take a
reference type as self, allowing the callsite to use a natural syntax.

This patch also adds a changelog entry for `List.index`.

MODULAR_ORIG_COMMIT_REV_ID: 291dd5f544d7d5b463188e7c986b2ee3c8416662
martinvuyk pushed a commit to martinvuyk/mojo that referenced this issue May 24, 2024
…tains__ to InlineList (#40189)

[External] [mojo-stdlib] Add variadic initialiser, __iter__ and
__contains__ to InlineList

This PR adds some features to InlineList ( related issue modularml#2658 )

*Variadic initialiser*
```mojo
var x = InlineList[Int](1,2,3)
```

*iter*
```mojo
var x = InlineList[Int](1,2,3)
for i in x:
    print(i)
```

*contains*
```mojo
var x = InlineList[Int](1,2,3)
if 3 in x: print("ok")
```

Co-authored-by: Chris <[email protected]>
Closes modularml#2703
MODULAR_ORIG_COMMIT_REV_ID: 6a9eb7b41905e62f8db0855159fd851b7ffb6174
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers mojo-repo Tag all issues with this label [stdlib] core-library-q2-2024
Projects
None yet
Development

No branches or pull requests

5 participants