Skip to content

Add a no-op constructor when DataValue is called on itself. #48

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Add a no-op constructor when DataValue is called on itself. #48

wants to merge 1 commit into from

Conversation

quinnj
Copy link
Contributor

@quinnj quinnj commented Oct 4, 2018

@davidanthoff
Copy link
Member

I don't understand this. Why do we need this?

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

It's just a no-op constructor; so if someone does DataValue{String}(DataValue("hey")), then the argument itself is returned. I could define my own "helper" constructor in Tables.jl to get around, but I figured this would be uncontroversial. It's because of how we convert to DataValue here: https://github.com/JuliaData/Tables.jl/blob/master/src/datavalues.jl#L43.

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

If you don't want to define this for whatever reason, just let me know; I can define a work-around on Tables.jl side.

@nalimilan
Copy link
Contributor

FWIW we had to define the same constructor in CategoricalArrays for this reason.

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

I'm not aware of what a better solution would be: we want to ensure we have an instance of a type that we know beforehand. For Int, String, Float64, to get an instance of those scalar types, you call T(x). We could make a custom scalarconvert` function that defined the no-op case, but it just seems like something that might come up anyway for a "scalar" type.

@nalimilan
Copy link
Contributor

I think we are still used to constructors automatically falling back to convert, which is defined as a no-op when the argument is already of the right type. That's why we are surprised we need to define these explicitly now, but that sounds logical to me.

@davidanthoff
Copy link
Member

So shouldn't we just use convert(T, x) then in the code that @quinnj pointed to?

The semantics of the DataValue constructor are really "wrap the argument inside a DataValue". So if a user asks to wrap a DataValue{Int} inside a DataValue{Int}, that seems like a bug to me: if you call the constructor on DataValue{Int}, then you need to pass an Int. If you want to wrap a DataValue{Int}, then you need to call the constructor of DataValue{DataValue{Int}}.

On the other hand, the semantics of convert seem exactly what @quinnj is looking for in that code, right?

@nalimilan
Copy link
Contributor

Yeah, maybe. I don't understand that code TBH. Where is the constructor called?

@davidanthoff
Copy link
Member

Haha, I was also pretty lost in that code :)

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

Given a NamedTuple like NamedTuple{(:a, :b, :c), Tuple{Int64, Float64, Union{String, Missing}}, the DataValueRowIterator computes the corresponding DataValue-based NamedTuple: NamedTuple{(:a, :b, :c), Tuple{Int64, Float64, DataValue{String}}, then when it iterates, it makes a call like:

NamedTuple{(:a, :b, :c)}( (Int64(input.a), Float64(intput.b), DataValue{String}(input.c)) )

convert does seem like it would be the more natural call here, but then we have:

julia> DataValue{String}(missing)
DataValue{String}()

julia> convert(DataValue{String}, missing)
ERROR: MethodError: Cannot `convert` an object of type Missing to an object of type String
Closest candidates are:
  convert(::Type{T<:AbstractString}, ::T<:AbstractString) where T<:AbstractString at strings/basic.jl:207
  convert(::Type{T<:AbstractString}, ::AbstractString) where T<:AbstractString at strings/basic.jl:208
  convert(::Type{T}, ::T) where T at essentials.jl:154
Stacktrace:
 [1] convert(::Type{DataValue{String}}, ::Missing) at /Users/jacobquinn/.julia/packages/DataValues/SNSrX/src/scalar/core.jl:25
 [2] top-level scope at none:0

which is why I think I originally went with the constructor call. @davidanthoff , would you be ok defining convert(::Type{D}, missing) where {D <: DataValue} = D()?

@davidanthoff
Copy link
Member

would you be ok defining convert(::Type{D}, missing) where {D <: DataValue} = D()

Yes, that seems like the right way to do this.

Side question: why does this syntax actually work? It seems it does, but I would have expected that one needs to write this as convert(::Type{D}, ::Missing) where {D <: DataValue} = D()? Maybe using ::Missing for the second argument would be a more standard way of writing this?

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

Oh, does that syntax work? I wouldn't expect it to; just a typo by me!

@davidanthoff
Copy link
Member

Oh, does that syntax work?

Ah, no, it doesn't work. I somehow mixed up the order in which I tried things :)

@quinnj
Copy link
Contributor Author

quinnj commented Oct 4, 2018

Ok, updated this PR to define convert instead; if we're good w/ this, then once it's merged and we have a new release, I'll update Tables.jl.

@davidanthoff
Copy link
Member

There seems to be a method ambiguity problem. Maybe if you define it as

Base.convert(::Type{DataValue{T}}, ::Missing) where {T} = DataValue{T}()

that won't be the case? Not entirely clear to me why we are getting this error...

How are you going to handle this on the Tables.jl side? Given that this is loaded behind a @require clause and you can't put version bounds on these, you'll have to handle both the old and the new DataValues.jl versions in there... I did that throughout the IterableTables.jl story whenever this situation came up, but it was somewhat easier because the new version of a package that I @required always had some new symbol defined which allowed me to detect which version was actually being loaded, and then depending on that I loaded a different implementation. Is there some API with the new package manager maybe, that allows one to ask "which version of package X is currently loaded?"

@quinnj
Copy link
Contributor Author

quinnj commented Oct 5, 2018

Good points David. In lieu of dealing w/ the complications of Requires.jl versioning, I think the approach here should work for now. It's not the most elegant, but the alternatives don't seem to offer anything much cleaner.

Should we close this then? Or try to fix the ambiguity?

@davidanthoff
Copy link
Member

Yeah, I think your PR over at Tables.jl is probably the easiest path forward. I think we should probably still add the convert method in this PR, it seems quite sensible. But no rush with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants