Description
Related to #158
I'm doing some MCMC where there are too many parameters to save all samples. So instead I figured, let's use OnlineStats to store only the few statistics that I'm interested in (e.g., Mean, Variance, AutoCorrelation). However, it is unclear to me how to do this properly. Right now, I have a function like so:
function initialize_online_statistics(online_statistics, p::Integer, k::Integer)
ne_plus_diag = p * (p + 1) ÷ 2
ne = ne_plus_diag - p
# not sure about this structure
return (
a = [OnlineStats.Group([copy(online_statistics) for _ in 1:ne_plus_diag]) for _ in 1:k],
b = OnlineStats.Group([copy(online_statistics) for _ in 1:ne]),
c = copy(online_statistics)
)
end
The idea is that a user (just me for now) supplies a Series
which gets passed to online_statistics
. Internally, I run some MCMC algorith and after each iteration I update the statistics for a
, b
, and c
. This way, a user can specify which statistics to track themselves and they're not baked into the code.
Example usage:
initialize_online_statistics(OnlineStats.Series(Mean(), Variance(), AutoCov(5)), 5, 10)
works fine. However, for a larger size, this same fails with a StackOverflowError:
initialize_online_statistics(OnlineStats.Series(Mean(), Variance(), AutoCov(5)), 116, 100)
ERROR: StackOverflowError:
[1] promote_type(::Type, ::Type, ::Type, ::Vararg{Type}) (repeats 6161 times)
@ Base ./promotion.jl:293
[2] Group(stats::Vector{OnlineStatsBase.Series{Number, Tuple{Mean{Float64, EqualWeight}, Variance{Float64, Float64, EqualWeight}, AutoCov{Float64, Float64}}}})
@ OnlineStatsBase ~/.julia/packages/OnlineStatsBase/4TwKN/src/stats.jl:368
[3] #26
@ ./none:0 [inlined]
[4] iterate
@ ./generator.jl:47 [inlined]
[5] collect(itr::Base.Generator{UnitRange{Int64}, var"#26#29"{OnlineStatsBase.Series{Number, Tuple{Mean{Float64, EqualWeight}, Variance{Float64, Float64, EqualWeight}, AutoCov{Float64, Float64}}}, Int64}})
@ Base ./array.jl:782
[6] initialize_online_statistics(online_statistics::OnlineStatsBase.Series{Number, Tuple{Mean{Float64, EqualWeight}, Variance{Float64, Float64, EqualWeight}, AutoCov{Float64, Float64}}}, p::Int64, k::Int64)
[7] top-level scope
I think the problem is that Group
does not specialize when the contents
are all the same. For example Group(Mean(), Mean())
has type Group{Tuple{Mean{Float64, EqualWeight}, Mean{Float64, EqualWeight}}, Union{Tuple{Number, Number}, NamedTuple{names, R} where R<:Tuple{Number, Number}, AbstractVector{<:Number}} where names}
and Group(Mean(), Mean(), Mean())
has an additional , Mean{Float64, EqualWeight}
. Eventually, a StackOverflow is reached.
For now, I can use a Vector{T} Where {T<:OnlineStat}
, but I feel like I'm reinventing the idea behind Group
. Perhaps there should be a specialized type for this case, like MonoGroup{T, U<:Int} where {T<:Union{Series, OnlineStat}}
?