Title: | Confidence Intervals for the Current Status Model |
---|---|
Description: | Computes the maximum likelihood estimator, the smoothed maximum likelihood estimator and pointwise bootstrap confidence intervals for the distribution function under current status data. Groeneboom and Hendrickx (2017) <doi:10.1214/17-EJS1345>. |
Authors: | Piet Groeneboom [aut], Kim Hendrickx [cre] |
Maintainer: | Kim Hendrickx <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2025-03-10 05:35:47 UTC |
Source: | https://github.com/kimhendrickx/curstatci |
The function ComputeBW computes the bandwidth that minimizes the pointwise Mean Squared Error using the subsampling principle in combination with undersmoothing.
ComputeBW(data, x)
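data |
Dataframe with three variables: the observation points t, the number of observations at each point for which the indicator equals 1, and the total number of observations at each point (freq2). |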
x |
numeric vector containing the points where the confidence intervals are computed. |
bw | data-driven bandwidth vector of size length(x) containing the bandwidth value for each point in x. |
Groeneboom, P. and Hendrickx, K. (2017). The nonparametric bootstrap for the current status model. Electronic Journal of Statistics 11(2):3446-3848.
vignette("curstatCI")
    library(Rcpp)
    library(curstatCI)

    # sample size
    n <- 1000

    # truncated exponential distribution on (0,2)
    set.seed(100)
    t <- rep(NA, n)
    delta <- rep(NA, n)
    for (i in 1:n) {
      x <- runif(1)
      y <- -log(1 - (1 - exp(-2)) * x)
      t[i] <- 2 * runif(1)
      if (y <= t[i]) {
        delta[i] <- 1
      } else {
        delta[i] <- 0
      }
    }
    A <- cbind(t[order(t)], delta[order(t)], rep(1, n))

    # x vector
    grid <- seq(0.1, 1.9, by = 0.1)

    # data-driven bandwidth vector
    bw <- ComputeBW(data = A, x = grid)
    plot(grid, bw)
The function ComputeConfIntervals computes pointwise confidence intervals for the distribution function under current status data. The confidence intervals are based on the Smoothed Maximum Likelihood Estimator (SMLE) and are constructed using the nonparametric bootstrap.
ComputeConfIntervals(data, x, alpha, bw)
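data |
Dataframe with three variables: the observation points t, the number of observations at each point for which the indicator equals 1, and the total number of observations at each point (freq2). |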
x |
numeric vector containing the points where the confidence intervals are computed.
This vector needs to be contained within the observation interval, i.e. between the smallest and the largest observation point in data. |
alpha |
significance level; the pointwise confidence intervals have nominal confidence level 1-alpha. |
bw |
numeric vector of size length(x) containing the bandwidth value for each point in x. |
In the current status model, the variable of interest $X$ with distribution function $F$ is not observed directly. A censoring variable $T$ is observed instead, together with the indicator $\Delta = 1\{X \le T\}$.
ComputeConfIntervals computes the pointwise 1-alpha bootstrap confidence intervals around the SMLE of $F$ based on a sample of size n <- sum(data$freq2).
The bandwidth parameter vector that minimizes the pointwise Mean Squared Error using the subsampling principle in combination with undersmoothing is returned by the function ComputeBW.
The default method for constructing the confidence intervals in [Groeneboom & Hendrickx (2017)] is based on estimating the asymptotic variance of the SMLE. When the bandwidth is small for some point in x, the variance estimate of the SMLE at this point might not exist. If this happens the Non-Studentized confidence interval is returned for this particular point in x.
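As a minimal sketch of the data layout described above, the block below aggregates raw current status observations into a three-variable data frame with one row per distinct observation point, so that sum(freq2) recovers the sample size. The simulated ages and event times, and the column name freq1, are assumptions made for illustration; only freq2 is named in this documentation.

```r
# Sketch: build the aggregated three-column layout from raw observations.
# 'age', 'x' and the column name 'freq1' are illustrative assumptions.
set.seed(1)
n <- 50
age   <- sample(1:10, n, replace = TRUE)  # discrete observation times, so ties occur
x     <- rexp(n, rate = 0.2)              # unobserved event times
delta <- as.numeric(x <= age)             # current status indicator

# per distinct observation time: number of indicators equal to 1, and total count
freq1 <- tapply(delta, age, sum)
freq2 <- tapply(delta, age, length)

dat <- data.frame(
  t     = as.numeric(names(freq2)),  # distinct observation times, increasing
  freq1 = as.vector(freq1),
  freq2 = as.vector(freq2)
)

# the sample size is recovered from the last column
sum(dat$freq2) == n
```

With continuous observation times, as in the simulated examples on this page, every row has a total count of 1 and the second column is simply the indicator.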
List with 5 variables:
Maximum Likelihood Estimator. This is a matrix of dimension (m+1)x2 where m is the number of jump points of the MLE. The first column consists of the point zero and the jump locations of the MLE. The second column contains the value zero and the values of the MLE at the jump points.
Smoothed Maximum Likelihood Estimator. This is a vector of size length(x) containing the values of the SMLE for each point in the vector x.
pointwise confidence interval. This is a matrix of dimension length(x) x 2. The first and second columns contain the lower and upper limits of the confidence intervals for each point in x.
points in x for which Studentized nonparametric bootstrap confidence intervals are computed.
points in x for which classical nonparametric bootstrap confidence intervals are computed.
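To illustrate the idea behind a non-Studentized bootstrap interval, here is a rough base-R sketch at a single point. It is not the package's implementation: the helper smle_at, the triweight kernel, the absence of a boundary correction, the number of bootstrap samples B and the "basic" interval form (estimate minus quantiles of the bootstrap differences) are all assumptions chosen for illustration, and the Studentized variant is not reproduced.

```r
# Integrated triweight kernel on [-1, 1] (assumed kernel choice)
ik <- function(u) {
  u <- pmin(pmax(u, -1), 1)
  0.5 + (35 / 32) * (u - u^3 + 3 * u^5 / 5 - u^7 / 7)
}

# Hypothetical helper: isotonic MLE followed by kernel smoothing at x0
smle_at <- function(t, delta, x0, h) {
  ord  <- order(t, -delta)            # ties in t: indicators in decreasing order
  yf   <- isoreg(delta[ord])$yf       # isotonic regression of the indicators
  mass <- diff(c(0, yf))              # jump sizes of the step function
  sum(ik((x0 - t[ord]) / h) * mass)
}

set.seed(2)
n <- 500
y <- runif(n, 0, 2)
t <- runif(n, 0, 2)
delta <- as.numeric(y <= t)

x0 <- 1
h <- 2 * n^(-0.2)
alpha <- 0.05
est <- smle_at(t, delta, x0, h)

# resample (t, delta) pairs with replacement and recompute the estimate
B <- 200
boot <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)
  smle_at(t[idx], delta[idx], x0, h)
})

# basic bootstrap interval from the quantiles of (bootstrap estimate - estimate)
q <- quantile(boot - est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
ci <- c(lower = est - q[2], upper = est - q[1])
ci
```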
Groeneboom, P. and Hendrickx, K. (2017). The nonparametric bootstrap for the current status model. Electronic Journal of Statistics 11(2):3446-3848.
vignette("curstatCI")
    library(Rcpp)
    library(curstatCI)

    # sample size
    n <- 1000

    # Uniform data U(0,2)
    set.seed(2)
    y <- runif(n, 0, 2)
    t <- runif(n, 0, 2)
    delta <- as.numeric(y <= t)
    A <- cbind(t[order(t)], delta[order(t)], rep(1, n))

    # x vector
    grid <- seq(0.1, 1.9, by = 0.1)

    # data-driven bandwidth vector
    bw <- ComputeBW(data = A, x = grid)

    # pointwise confidence intervals at grid points:
    out <- ComputeConfIntervals(data = A, x = grid, alpha = 0.05, bw = bw)
    left <- out$CI[, 1]
    right <- out$CI[, 2]

    plot(grid, out$SMLE, type = 'l', ylim = c(0, 1), main = "", ylab = "", xlab = "", las = 1)
    points(grid, left, col = 4)
    points(grid, right, col = 4)
    segments(grid, left, grid, right)
The function ComputeMLE computes the Maximum Likelihood Estimator of the distribution function under current status data.
ComputeMLE(data)
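data |
Dataframe with three variables: the observation points t, the number of observations at each point for which the indicator equals 1, and the total number of observations at each point (freq2). |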
In the current status model, the variable of interest $X$ with distribution function $F$ is not observed directly. A censoring variable $T$ is observed instead, together with the indicator $\Delta = 1\{X \le T\}$.
ComputeMLE computes the MLE of $F$ based on a sample of size n <- sum(data$freq2).
Dataframe with two variables:
x |
jump locations of the MLE |
mle |
MLE evaluated at the jump locations |
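For intuition about what such an output looks like, the sketch below uses the well-known characterization of the current status MLE as the isotonic regression of the indicators ordered by observation time, computed with base R's isoreg. This illustrates the estimator only and is not necessarily how ComputeMLE is implemented; the name mle_sketch is hypothetical.

```r
# Sketch: current status MLE via isotonic regression (base R only)
set.seed(2)
n <- 200
y <- runif(n, 0, 2)            # unobserved variable of interest
t <- runif(n, 0, 2)            # observation times
delta <- as.numeric(y <= t)    # current status indicator

ord  <- order(t, -delta)       # sort by time; ties would be ordered by decreasing indicator
fit  <- isoreg(delta[ord])     # nondecreasing least-squares fit to the indicators
Fhat <- fit$yf                 # fitted step function at the sorted observation times

# keep only the points where the step function jumps
jump <- which(diff(c(0, Fhat)) > 0)
mle_sketch <- data.frame(x = t[ord][jump], mle = Fhat[jump])
head(mle_sketch)
```

The resulting values are strictly increasing across rows and lie in (0, 1], mirroring the two-variable layout described above.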
Groeneboom, P. and Hendrickx, K. (2017). The nonparametric bootstrap for the current status model. Electronic Journal of Statistics 11(2):3446-3848.
    library(Rcpp)
    library(curstatCI)

    # sample size
    n <- 1000

    # Uniform data U(0,2)
    set.seed(2)
    y <- runif(n, 0, 2)
    t <- runif(n, 0, 2)
    delta <- as.numeric(y <= t)
    A <- cbind(t[order(t)], delta[order(t)], rep(1, n))

    mle <- ComputeMLE(A)
    plot(mle$x, mle$mle, type = 's', ylim = c(0, 1), main = "", ylab = "", xlab = "", las = 1)
The function ComputeSMLE computes the Smoothed Maximum Likelihood Estimator of the distribution function under current status data.
ComputeSMLE(data, x, bw)
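data |
Dataframe with three variables: the observation points t, the number of observations at each point for which the indicator equals 1, and the total number of observations at each point (freq2). |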
x |
numeric vector containing the points where the SMLE is computed. |
bw |
numeric vector of size length(x) containing the bandwidth value for each point in x. |
In the current status model, the variable of interest $X$ with distribution function $F$ is not observed directly. A censoring variable $T$ is observed instead, together with the indicator $\Delta = 1\{X \le T\}$.
ComputeSMLE computes the SMLE of $F$ based on a sample of size n <- sum(data$freq2).
The bandwidth parameter vector that minimizes the pointwise Mean Squared Error using the subsampling principle in combination with undersmoothing is returned by the function ComputeBW.
SMLE(x) Smoothed Maximum Likelihood Estimator. This is a vector of size length(x) containing the values of the SMLE for each point in the vector x.
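The following sketch shows one way a step-function estimate can be smoothed with a point-specific bandwidth: each jump of the step function is spread out by an integrated kernel. The triweight kernel, the missing boundary correction, the helper names ik and smooth_step, and the toy jump locations are assumptions for illustration and are not taken from the package.

```r
# Integrated triweight kernel: 0 below -1, 1 above 1 (assumed kernel choice)
ik <- function(u) {
  u <- pmin(pmax(u, -1), 1)
  0.5 + (35 / 32) * (u - u^3 + 3 * u^5 / 5 - u^7 / 7)
}

# Hypothetical helper: smooth a step function with jumps at 'loc' and values
# 'val', evaluated at the points 'x' with one bandwidth h[i] per point x[i]
smooth_step <- function(x, loc, val, h) {
  mass <- diff(c(0, val))  # jump sizes of the step function
  vapply(seq_along(x),
         function(i) sum(ik((x[i] - loc) / h[i]) * mass),
         numeric(1))
}

# toy step function with three jumps (illustrative values only)
loc <- c(0.5, 1.0, 1.5)
val <- c(0.2, 0.6, 1.0)

grid <- seq(0.1, 1.9, by = 0.1)
h <- rep(0.4, length(grid))    # bandwidth vector of size length(grid)
smle_sketch <- smooth_step(grid, loc, val, h)
```

The result has one value per grid point, matching the shape described above, and is nondecreasing here because the bandwidth is constant across the grid.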
Groeneboom, P. and Hendrickx, K. (2017). The nonparametric bootstrap for the current status model. Electronic Journal of Statistics 11(2):3446-3848.
    library(Rcpp)
    library(curstatCI)

    # sample size
    n <- 1000

    # Uniform data U(0,2)
    set.seed(2)
    y <- runif(n, 0, 2)
    t <- runif(n, 0, 2)
    delta <- as.numeric(y <= t)
    A <- cbind(t[order(t)], delta[order(t)], rep(1, n))

    grid <- seq(0, 2, by = 0.01)

    # bandwidth vector
    h <- rep(2 * n^(-0.2), length(grid))

    smle <- ComputeSMLE(A, grid, h)
    plot(grid, smle, type = 'l', ylim = c(0, 1), main = "", ylab = "", xlab = "", las = 1)
A dataset on the prevalence of hepatitis A in individuals from Bulgaria with age ranging from 1 to 86 years. The data come from a cross-sectional survey conducted in 1964.
hepatitisA
A data frame with 83 rows and three variables:
Age of the individual
Number of individuals of age t that are seropositive for Hepatitis A
Total number of individuals of age t
Keiding, N. (1991). Age-specific incidence and prevalence: a statistical perspective. J. Roy. Statist. Soc. Ser. A, 154(3):371-412.
A dataset on the prevalence of rubella in 230 Austrian males older than three months for whom the exact date of birth was known. Each individual was tested at the Institute of Virology, Vienna during the period 1–25 March 1988 for immunization against Rubella.
rubella
A data frame with 225 rows and three variables:
Age of the individual at the time of testing for immunization
Number of individuals of age t that are immune for Rubella
Total number of individuals of age t
Keiding, N., Begtrup, K., Scheike, T., and Hasibeder, G. (1996). Estimation from current status data in continuous time. Lifetime Data Anal., 2:119-129.