Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrowly Scoped equality on SelectionSet models #3351

Open
AnthonyMDev opened this issue Mar 12, 2024 · 1 comment
Open

Narrowly Scoped equality on SelectionSet models #3351

AnthonyMDev opened this issue Mar 12, 2024 · 1 comment
Labels
codegen Issues related to or arising from code generation

Comments

@AnthonyMDev
Copy link
Contributor

The Problem

The generated SelectionSet models conform to Equatable comparing the entire DataDict of the two objects. In most cases, this is the intended behavior, but in some situations, particularly dealing with fragments, this is not the desired behavior.

Consider the following example:

fragment TaxonomyFragment on Animal {
  species
  genus
}

query AnimalQuery {
  animals {
    ...TaxonomyFragment
    ... on Pet {
        petName
    }
  }
}

Assuming a response (concatenated to only relevant data):

"animals" [
  {
    "__typename": "Dog",
    "petName": "Spot",
    "species": "Familiarus",
    "genus": "Canis"
  },
  {
    "__typename": "Dog",
    "petName": "Clifford",
    "species": "Familiarus",
    "genus": "Canis"
  }
]

When comparing two animals returned from this query, you would expect animals[0] == animals[1] to return false. However, when comparing just the TaxonomyFragment using animals[0]taxonomyFragment == animals[1]taxonomyFragment, the expected behavior is unclear.

You could argue that because these fragments are still representing entirely different entities that this should return false. However, if you are only comparing the TaxonomyFragment of each, those fields are equal, and you may expect this to return true. This becomes even more unclear in the (albeit contrived) situation where you are comparing the TaxonomyFragment of two response objects that actually do represent the same entity (same unique ID), but with differing other selected fields, due to being fetched from two different queries. And again, it is unclear how this should behave if TaxonomyFragment was instead an inline fragment (... on Taxonomical), since the generated fragment would then include a merged asPet field that would include petName (assuming you haven't disabled fragment field merging, which is not even implemented yet).

I think that it is more likely that the desired behavior is for animals[0].taxonomyFragment == animals[1].taxonomyFragment to return true in this situation. However, that is not how this not how this works today. Both animals[0] == animals[1] return false and animals[0].taxonomyFragment == animals[1].taxonomyFragment return false because we are comparing the entire underlying DataDict.

Complexities

Upon investigating this issue, we found that the expected behavior (and implementing it) becomes vastly more complex as you begin to consider nested inline fragment type cases (... as XXX); inline fragments with @include/@skip directives, and other complex yet valid queries.

Without giving multiple detailed examples, the summary is, you would need to keep track of each parent fragment of the selection set to be compared, as well as each possible child fragment; and compare all the relevant fields on the object. To be perfectly correct, this would require:

  • including all fields from the object compared
  • including all possible child fields from inline and named fragments on the object compared
  • When generated as a merged selection set spread into an operation:
    • including all direct parent fields
    • omitting all fields from other fragments spread into the parent
  • When generated as a selection set in an isolated named fragment:
    • omitting all fields from parents that are not directly selected by the entity in the fragment

Possible Solutions

While we have the capability to compute this list of fields during code generation, the inclusion of that list would noticeably increase binary size of the generated models for users with large operation sets or complex, deeply nested fragments. Other possible solutions we have considered would provide this information by listing the parent selection sets whose fields need to be compared. This would keep binary size growth to a minimum, but would require a "lightweight execution" that would traverse the selections and compare fields according to the algorithm above (at runtime instead of during code generation). This would likely decrease runtime performance of equality comparisons, especially in complex queries with many inline fragments. Lastly, we could explore using reflection to determine which fields to compare at run time. This would help with the binary size problem, but reflection in Swift is notoriously slow, and we don't feel it's appropriate to regress performance here with reflection in the ==, which should generally run as fast as is computationally possible.

We've determined that implementing this would cause performance and/or binary size regressions; that it is unclear when this is the desired behavior or not; and that in most circumstances, comparing the DataDicts does suffice. This would also be a breaking change of existing behavior. Due to this reasoning we don't believe that it is appropriate to implement this as the default handling of Equatable conformance at this time.

However we recognize that this is necessary behavior in some situations.

Suggested Solution

We propose allowing for the code generation to generate an equality function that compares only the data in the scope of the current fragment on an opt-in, per selection set basis. For the large majority of use cases where this is not necessary, the existing behavior will remain, and when you do need this behavior, you can force the code generation engine to generate it for you with a directive. Naming of this directive is still up for discussion, but something like @generateScopedEqualityFunction could be used.

Because we do not want to create ambiguity in the way that the == operator works, we don't intend for this directive to generate an override of ==. Instead, it should generate another function like func fragmentIsEqual(to:) -> Bool (name also TBD).

In addition, we are going to add some documentation to the existing == implementation to explain the existing semantic meaning behind the Equatable conformance of SelectionSet.

Current Workaround

This issue is being created to track this work, but we don't anticipate working on this new directive immediately. In the mean time, we recommend you work around this when necessary by creating custom implementations of a func fragmentIsEqual(to:) -> Bool manually. (We'd love any feedback on what you decide to name those functions!)

We've heard from a team that was generating == overloads for their entire set of generated models. We believe this is probably not the best approach, as it increases binary size for all of the == functions where this is not actually a problem. Instead we recommend you only create these functions in the specific cases where it is necessary.

@AnthonyMDev AnthonyMDev added this to the Minor Releases (1.x) milestone Mar 12, 2024
@AnthonyMDev AnthonyMDev added the codegen Issues related to or arising from code generation label Mar 12, 2024
@frehulfd
Copy link

frehulfd commented Mar 13, 2024

Thanks for the writeup. I think I have a differing opinion on the ambiguity you mentioned.

I think it's mostly a divergence in how we're thinking about these models. One perspective comes from a GraphQL perspective of "These fragments come from different underlying data, so they're not equal", but our perspective is to look at these things purely as Swift data structures. In our view the source of these structures is an implementation detail. It could be GraphQL or, as in our case at Stitch Fix, it could be fixtures constructed using initializers during a unit test. If we view your taxonomy fragment example in the abstract, it boils down to something like:

struct TaxonomyFragment: Equatable {
    var species: String { /* ... */ }
    var genus: String { /* ... */ }

    init(species: String, genus: String) {
        /* ...  */
    }
}

Purely from a Swift API contract perspective, there really is no ambiguity. Having two of these structures be unequal because of hidden influences breaks the spirit of the Equatable contract. The only way the other fields become an issue is if we let implementation details leak out of the abstraction, which is essentially what is happening today. The current Equatable conformance for fragments is closer to an identity check (===), which I think is separate from the question of whether two TaxonomyFragments are equivalent.

I suggest that a breaking change here is appropriate because I think the current behavior is unexpected. Requiring a separate custom "actually equivalent" function is inconvenient because it renders many built in standard Swift library functions subtly broken with no warning. Just to name a few:

  • Set
  • Publisher.removeDuplicates()
  • Array.contains(Self.Element) -> Bool
  • Array.firstIndex(of element: Self.Element) -> Int?
  • and many more

Our Plan

Since the SelectionSet protocol declares itself to be Equatable and the subtle unreliability of that check, we decided that this was going to be a source for new bugs and/or unit test difficulties. Our solution is going to be extending our catalog of post-processing Sourcery stencils we use to add to the generated code to add custom == operator overloads for all SelectionSets that strictly check just the logical fields at each level. This would result in something like the following for the taxonomy example above.

public func ==(lhs: TaxonomyFragment, rhs: TaxonomyFragment) -> Bool {
    lhs.species == rhs.species &&
    lhs.genus == rhs.genus
}

This is a bit inconvenient and time-consuming, and you're right that this will add to our build time and binary size, but we're prioritizing fewer subtle bugs and developer experience over conciseness in this case.

Update:

After benchmarking the compile-time difference with our auto-generated == overloads (a ~30-40s increase in build times) and after scoping the feasibility of doing equality checks at runtime using already-present schema metadata, we've decided to go the runtime route by traversing the DataDict structure itself and using each models __selections array as a guide for informing how to compare various values at each level. After a couple bumpy patches we have everything building and passing unit tests with some additional unit tests thrown in for our custom implementation.

It does seem very possible for this to be built-in to the standard Apollo library, and I encourage you to consider doing so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
codegen Issues related to or arising from code generation
Projects
None yet
Development

No branches or pull requests

2 participants