Penalized Composite Link Model for Efficient Estimation of Smooth Distributions from Coarsely Binned Data
This repository contains a versatile method for ungrouping histograms (binned count data), assuming that the counts are Poisson distributed and that the underlying sequence to be estimated on a fine grid is smooth. The method is based on the composite link model, and estimation is achieved by maximizing a penalized likelihood. In this way, smooth, detailed sequences of counts and rates are estimated from the binned counts. Ungrouping binned data can be desirable for many reasons: bins can be too coarse to allow for accurate analysis; comparisons can be hindered when different grouping approaches are used in different histograms; and the last interval is often wide and open-ended and thus hides much of the information in the tail area. Age-at-death distributions grouped in age classes and abridged life tables are examples of binned data. Because its assumptions are modest, the approach is suitable for many demographic and epidemiological applications. For a detailed description of the method and applications see Rizzi et al. (2015).
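To make the idea concrete, here is a minimal sketch of a penalized composite link fit in base R. It is not the package's implementation: the function name `pclm_sketch`, the identity basis, the fixed smoothing parameter `lambda`, and the toy counts are all assumptions chosen for illustration. The binned counts are modelled as `C %*% gamma`, where `C` sums fine-grid cells into bins and `log(gamma)` is kept smooth by a difference penalty.

```r
# Hypothetical helper (not part of ungroup): bare-bones penalized composite
# link model with an identity basis, fitted by iteratively reweighted
# least squares.
pclm_sketch <- function(y, C, lambda = 1, deg = 2, max_iter = 100, tol = 1e-8) {
  J <- ncol(C)                               # number of fine-grid cells
  D <- diff(diag(J), differences = deg)      # difference matrix for the penalty
  P <- lambda * crossprod(D)                 # penalty term lambda * D'D
  b <- rep(log(sum(y) / J), J)               # flat start on the log scale
  for (it in seq_len(max_iter)) {
    gam <- exp(b)                            # latent fine-grid expected counts
    mu  <- as.vector(C %*% gam)              # expected counts in the bins
    Q   <- sweep(C, 2, gam, "*")             # working design matrix C diag(gam)
    z   <- y - mu + as.vector(Q %*% b)       # working response
    w   <- 1 / mu                            # Poisson working weights
    b_new <- as.vector(solve(crossprod(Q, w * Q) + P, crossprod(Q, w * z)))
    converged <- max(abs(b_new - b)) < tol
    b <- b_new
    if (converged) break
  }
  exp(b)
}

# Toy data (made up for illustration): three bins of width 5
y <- c(50, 100, 60)
C <- kronecker(diag(3), matrix(1, nrow = 1, ncol = 5))  # 3 x 15 composition matrix
gamma_hat <- pclm_sketch(y, C, lambda = 10)
round(gamma_hat, 2)   # smooth counts on the 15-cell grid
```

The smoothing parameter is fixed here for brevity; in practice it would be chosen by a criterion such as AIC or BIC rather than by hand.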
- Make sure you have the most recent version of R.
- Run the following code in your R console:
  install.packages("ungroup")
You can track (and contribute to) the development of ungroup at https://github.com/mpascariu/ungroup. To install it:
- Install the release version of devtools from CRAN with install.packages("devtools").
- Make sure you have a working development environment.
  - Windows: Install Rtools.
  - Mac: Install Xcode from the Mac App Store.
  - Linux: Install a compiler and various development libraries (details vary across different flavours of Linux).
- Install the development version of ungroup:
  devtools::install_github("mpascariu/ungroup")
Get started with ungroup by checking the vignette:
browseVignettes(package = "ungroup")
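As a quick first try alongside the vignette, the sketch below calls the package's `pclm()` function on a small histogram. The counts are made up for illustration, and the call assumes the basic interface `pclm(x, y, nlast)`, where `x` holds the starting values of the bins, `y` the counts, and `nlast` the assumed length of the last interval; check `?pclm` for the authoritative argument list.

```r
# Toy histogram (made-up counts): five bins of width 5 starting at 0
x <- c(0, 5, 10, 15, 20)      # starting value of each bin
y <- c(40, 90, 120, 70, 30)   # observed count in each bin
nlast <- 5                    # assumed length of the last bin

# Run the fit only if ungroup is installed
if (requireNamespace("ungroup", quietly = TRUE)) {
  fit <- ungroup::pclm(x, y, nlast)
  print(fit)                  # summary of the fitted model
  plot(fit)                   # binned data with the ungrouped estimate
}
```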
This software is an academic project. We welcome any issues and pull requests.
- If ungroup is malfunctioning, please report the case by submitting an issue on GitHub.
- If you wish to contribute, please submit a pull request following the guidelines in CONTRIBUTING.md.
Rizzi S, Gampe J and Eilers PHC. 2015. Efficient Estimation of Smooth Distributions From Coarsely Grouped Data. American Journal of Epidemiology, Volume 182, Issue 2, Pages 138-147.
Eilers PHC. 2007. Ill-posed problems with counts, the composite link model and penalized likelihood. Statistical Modelling, Volume 7, Issue 3, Pages 239-254.