-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up slice fallback and remove bit64 slice hacks #1707
base: main
Are you sure you want to change the base?
Conversation
- Handles dim case - Always restores as needed - Which means R level shaped dispatch no longer proxy/restores - R level shaped dispatch also no longer unclasses, which defeated the purpose of dispatching! - Fixes `vec_is_restored()` to also ignore `dim` and `dimnames`
# Called when `x` has dimensions | ||
vec_slice_fallback <- function(x, i) { | ||
out <- unclass(vec_proxy(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention of this vec_slice_fallback()
path was to handle shaped fallback vectors when calling vec_slice()
. It was intended to give the [
methods for shaped vectors a chance to run, which we'd then restore on the backside if needed with vec_is_restored()
and vec_restore()
(because base [
methods drop the class).
However, we were unclassing the input before calling [
, which kind of defeats the whole point!
This was a problem with bit64, because the [.integer64
method is required to correctly handle NA_integer_
indices.
library(vctrs)
`[.myclass` <- function(x, ...) {
print("wow!")
NextMethod()
}
x <- matrix(1:6, nrow = 3)
class(x) <- "myclass"
# Going through current `vec_slice_fallback()`.
# Unclassing so we don't get to call our `[` method and we don't see "wow!".
vec_slice(x, 1)
#> [,1] [,2]
#> [1,] 1 4
#> attr(,"class")
#> [1] "myclass"
Created on 2022-09-30 by the reprex package (v2.0.1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, this function does not need to proxy/restore, because if a vec_proxy()
method was defined then we would not have gone through the fallback path to begin with!
I've trimmed it down to only calling [
, which feels more correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lastly, it was given a rather unfortunate name, because it made us think this was the fallback slice method that we should be using everywhere, when really it was just intended for shaped vectors.
It was actually being used in other places, like the fallback paths of list_unchop()
and vec_chop()
, when really we should have been using a more generic slice fallback function that actually handles that "maybe restore" part on the backside.
I'm fairly certain this was the reason we tried to make it unclass/proxy/restore all in the same function.
It now has the name bracket_shaped_dispatch()
to go along with the C level bracket_dispatch()
which calls x[i]
in the non-shaped case.
@@ -296,6 +296,10 @@ static SEXP chop_shaped(SEXP x, SEXP indices, struct vctrs_chop_info info) { | |||
} | |||
|
|||
static SEXP chop_fallback(SEXP x, SEXP indices, struct vctrs_chop_info info) { | |||
// TODO: Do we really care about micro performance that much here? Can we just | |||
// merge with `chop_shaped_fallback()` now that `vec_slice_fallback()` | |||
// handles shaped an unshaped vectors? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This chop_fallback()
only runs on S3/S4 objects that don't have a vec_proxy()
method defined for them (i.e. vec_requires_fallback()
is true
).
I can't imagine we care about the performance of those cases that much. I think I'd rather simplify this code by a lot by merging it with chop_fallback_shaped()
, which just allows vec_slice_fallback()
to handle everything related to setting up the environment and doing vec_is_restored()
. The less places we have that vec_is_restored()
check lying around, the better IMO.
static | ||
r_obj* bracket_dispatch(r_obj* x, r_obj* subscript) { | ||
return vctrs_dispatch2( | ||
syms_bracket, fns_bracket, | ||
syms_x, x, | ||
syms_i, subscript | ||
); | ||
} | ||
static | ||
r_obj* bracket_shaped_dispatch(r_obj* x, r_obj* subscript) { | ||
return vctrs_dispatch2( | ||
syms.bracket_shaped_dispatch, fns.bracket_shaped_dispatch, | ||
syms_x, x, | ||
syms_i, subscript | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have these nice two parallel functions. bracket_dispatch()
calls x[i]
and bracket_shaped_dispatch()
just calls x[i, , , drop = FALSE]
with as many ,
as needed.
syms_x, x, | ||
syms_i, subscript | ||
); | ||
} | ||
|
||
r_obj* vec_slice_fallback(r_obj* x, r_obj* subscript) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of vec_slice_fallback()
is now to completely handle the fallback slice case - which is basically how we have been trying to use it this whole time.
It first hands off to [
and then optionally calls vec_restore()
if it looks like we needed to. This allows us to localize the "maybe restore" logic into one place, so we don't forget to do it whenever we use vec_slice_fallback()
elsewhere.
if (r_node_tag(node) == r_syms.names) { | ||
r_obj* tag = r_node_tag(node); | ||
|
||
if (tag == r_syms.names || | ||
tag == r_syms.dim || | ||
tag == r_syms.dim_names) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vec_is_restored()
had this bug where it said "you don't need to restore" if the object had dim
or dimname
attributes, which definitely isn't true. Those are two of the attributes that base R [
methods will propagate, like names (but unlike the class).
# are requesting native vctrs slicing), then you need to declare a | ||
# `vec_proxy()` method as well to tell vctrs what it needs to be natively | ||
# slicing. | ||
|
||
local_methods( | ||
`[.vctrs_foobar` = function(x, i, ...) vec_slice(x, i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so lets talk about this. This infloop check is for:
- A class that has clearly opted into using vctrs, because they call
vec_slice()
- But isn't using
new_vctr()
, which has avec_proxy()
method defined - And isn't defining their own
vec_proxy()
method
In that case, vec_slice()
tries to fall back to the [
method, which causes the infloop.
I don't think that avoiding this infloop is our responsibility. The way we tried to do this was by calling unclass()
in that original vec_slice_fallback()
function, but as I showed earlier that defeats the whole purpose of the function.
It it also 100% possible to do this already with non-shaped vectors. i.e. this is with CRAN vctrs:
library(vctrs)
`[.vctrs_foobar` <- function(x, i, ...) vec_slice(x, i)
x <- structure(1:4, class = "vctrs_foobar")
# infloop
x[1]
I think that if you are telling vctrs that you want [
to forward to vec_slice()
, then you have to tell vctrs what it should be natively slicing by providing a vec_proxy()
method (if you aren't extending vctrs_vctr).
@@ -351,6 +363,7 @@ test_that("vec_restore() is called after slicing data frames", { | |||
|
|||
test_that("additional subscripts are forwarded to `[`", { | |||
local_methods( | |||
vec_proxy.vctrs_foobar = function(x, ...) x, | |||
`[.vctrs_foobar` = function(x, i, ...) vec_index(x, i, ...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise produces the infloop from before, but we are just trying to test vec_index()
properties here (used by vctrs_vctr) so providing a proxy method like vctrs_vctr to avoid the infloop seems reasonable.
I originally intended to just remove the old bit64 hacks around slicing with
NA_integer_
, but trying to do so exposed some strange bugs with the way our fallback slicing was working. I'll add some inline comments below.The first commit cleans up the fallback slicing approach. The second commit removes the bit64 hacks.