| Title: | LIC for Distributed Skewed Regression |
|---|---|
| Description: | This comprehensive toolkit for skewed regression is designated as "SLIC" (The LIC for Distributed Skewed Regression Analysis). It is predicated on the assumption that the error term follows a skewed distribution, such as the Skew-Normal, Skew-t, or Skew-Laplace. The methodology and theoretical foundation of the package are described in Guo G. (2020) <doi:10.1080/02664763.2022.2053949>. |
| Authors: | Guangbao Guo [aut, cre] (ORCID: <https://orcid.org/0000-0002-4115-6218>), Hengxin Gao [aut] |
| Maintainer: | Guangbao Guo <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.3 |
| Built: | 2026-05-28 08:19:40 UTC |
| Source: | https://github.com/cran/SLIC |
Calculate the estimators of beta under the A-optimality and D-optimality criteria
beta_AD(K = K, nk = nk, alpha = alpha, X = X, y = y)

| Argument | Description |
|---|---|
| K | Number of subsets |
| nk | Length of subsets |
| alpha | Significance level |
| X | Observation matrix |
| y | Response vector |

A list containing:

| Component | Description |
|---|---|
| betaA | The estimator of beta on the A-opt. |
| betaD | The estimator of beta on the D-opt. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
    library(SLIC)
    # Simulate n observations from a linear model with p predictors
    p <- 6; n <- 1000; K <- 2; nk <- 200; alpha <- 0.05; sigma <- 1
    e <- rnorm(n, 0, sigma)
    beta <- sort(runif(p, 0, 1))
    X <- matrix(rnorm(n * p, 5, 10), ncol = p)
    y <- X %*% beta + e
    # Estimate beta under the A-opt and D-opt criteria
    beta_AD(K = K, nk = nk, alpha = alpha, X = X, y = y)
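For readers unfamiliar with the two criteria, the following base-R sketch illustrates the textbook idea behind A- and D-optimal subset selection. It is not the implementation used by `beta_AD`: the helper name `ad_opt_sketch`, the random drawing of subsets, and the use of ordinary least squares on the selected subset are all assumptions made for illustration, and the significance level `alpha` is ignored. Each candidate subset is scored by its information matrix X'X, with A-optimality preferring the smallest trace of the inverse and D-optimality preferring the largest determinant.

```r
# Hypothetical helper, NOT part of SLIC: choose subsets by the textbook
# A- and D-optimality criteria, then fit least squares on each choice.
ad_opt_sketch <- function(X, y, K, nk) {
  n <- nrow(X)
  # Assumption: candidate subsets are K random draws of nk row indices
  subsets <- lapply(seq_len(K), function(k) sample.int(n, nk))
  # Information matrix X_k' X_k of one subset
  info <- function(i) crossprod(X[i, , drop = FALSE])
  # A-criterion: trace of the inverse information matrix (smaller is better)
  a_val <- vapply(subsets, function(i) sum(diag(solve(info(i)))), numeric(1))
  # D-criterion: log-determinant of the information matrix (larger is better)
  d_val <- vapply(subsets, function(i) {
    as.numeric(determinant(info(i), logarithm = TRUE)$modulus)
  }, numeric(1))
  # Least-squares fit restricted to the rows in i
  fit <- function(i) drop(solve(info(i), crossprod(X[i, , drop = FALSE], y[i])))
  list(betaA = fit(subsets[[which.min(a_val)]]),
       betaD = fit(subsets[[which.max(d_val)]]))
}

# Usage on data simulated like the example above
set.seed(1)
p <- 6; n <- 1000
X <- matrix(rnorm(n * p, 5, 10), ncol = p)
beta <- sort(runif(p, 0, 1))
y <- X %*% beta + rnorm(n)
est <- ad_opt_sketch(X, y, K = 2, nk = 200)
```

The log-determinant is used instead of `det()` only to avoid overflow for large information matrices; the ranking of subsets is the same.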
Calculate the estimator of beta under the COR criterion
beta_cor(K = K, nk = nk, alpha = alpha, X = X, y = y)

| Argument | Description |
|---|---|
| K | Number of subsets |
| nk | Length of subsets |
| alpha | Significance level |
| X | Observation matrix |
| y | Response vector |

A list containing:

| Component | Description |
|---|---|
| betaC | The estimator of beta on the COR. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
    library(SLIC)
    # Simulate n observations from a linear model with p predictors
    p <- 6; n <- 1000; K <- 2; nk <- 200; alpha <- 0.05; sigma <- 1
    e <- rnorm(n, 0, sigma)
    beta <- sort(runif(p, 0, 1))
    X <- matrix(rnorm(n * p, 5, 10), ncol = p)
    y <- X %*% beta + e
    # Estimate beta under the COR criterion
    beta_cor(K = K, nk = nk, alpha = alpha, X = X, y = y)
Calculate the LIC estimator based on the A-optimal and D-optimal criteria
LICnew(X, Y, alpha, K, nk)

| Argument | Description |
|---|---|
| X | A matrix of observations (design matrix) with size n x p |
| Y | A vector of responses with length n |
| alpha | The significance level for confidence intervals |
| K | The number of subsets to consider |
| nk | The size of each subset |

A list containing:

| Component | Description |
|---|---|
| E5 | The LIC estimator based on A-optimal and D-optimal criterion. |
Guo, G., Song, H. & Zhu, L. The COR criterion for optimal subset selection in distributed estimation. Statistics and Computing, 34, 163 (2024). doi:10.1007/s11222-024-10471-z
    library(SLIC)
    # Simulate n observations from a linear model with p predictors
    p <- 6; n <- 1000; K <- 2; nk <- 200; alpha <- 0.05; sigma <- 1
    e <- rnorm(n, 0, sigma)
    beta <- sort(runif(p, 0, 1))
    X <- matrix(rnorm(n * p, 5, 10), ncol = p)
    Y <- X %*% beta + e
    # Compute the LIC estimator
    LICnew(X = X, Y = Y, alpha = alpha, K = K, nk = nk)
Generate data with skewed errors
serr(n, nr, p, dist_type, ...)

| Argument | Description |
|---|---|
| n | Number of total observations |
| nr | Number of observations with a different error distribution |
| p | Number of predictors |
| dist_type | Type of error distribution ("skew_normal", "skew_t", "skew_laplace") |
| ... | Additional parameters for the error distribution |
A list with X (design matrix), Y (response), and e (error)
    library(SLIC)
    set.seed(123)
    # 1000 observations, 200 with a different error distribution, 5 predictors
    data <- serr(1000, 200, 5, "skew_t")
    str(data)
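To make the notion of a skewed error term concrete, the sketch below draws skew-normal errors in base R using the standard stochastic representation delta * |U0| + sqrt(1 - delta^2) * U1, where U0 and U1 are independent standard normals and delta = alpha / sqrt(1 + alpha^2). The function name `rskewnorm_sketch` is hypothetical, and nothing here is claimed to match how `serr` generates its data internally; in particular, the role of `nr` is not reproduced.

```r
# Hypothetical helper, NOT part of SLIC: skew-normal draws with location xi,
# scale omega and slant alpha, via the standard stochastic representation.
rskewnorm_sketch <- function(n, xi = 0, omega = 1, alpha = 0) {
  delta <- alpha / sqrt(1 + alpha^2)
  u0 <- abs(rnorm(n))  # half-normal component carries the skewness
  u1 <- rnorm(n)       # symmetric normal component
  xi + omega * (delta * u0 + sqrt(1 - delta^2) * u1)
}

# Build a regression data set with right-skewed errors
set.seed(123)
n <- 1000; p <- 5
X <- matrix(rnorm(n * p), ncol = p)
beta <- runif(p, 1, 2)
e <- rskewnorm_sketch(n, alpha = 5)
Y <- X %*% beta + e
```

With a positive slant the errors have a positive mean of omega * delta * sqrt(2 / pi), so an intercept-free fit on such data will absorb that shift elsewhere unless the errors are centred first.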
The SLIC function extends the LIC method by assuming that the error term follows a skewed distribution (Skew-Normal, Skew-t, or Skew-Laplace), thereby improving the length and information optimisation criterion.
SLIC(X, Y, alpha = 0.05, K = 10, nk = NULL, dist_type = "skew_normal")

| Argument | Description |
|---|---|
| X | A design matrix |
| Y | A random response vector of observed values |
| alpha | The significance level |
| K | The number of subsets |
| nk | The sample size of subsets |
| dist_type | The type of skewed error distribution: "skew_normal", "skew_t", or "skew_laplace" |
Returns MUopt, Bopt, MAEMUopt, MSEMUopt, opt, and Yopt.
    library(SLIC)
    set.seed(123)
    n <- 1000
    p <- 5
    X <- matrix(rnorm(n * p), ncol = p)
    beta <- runif(p, 1, 2)
    # Skew-normal errors drawn with the sn package
    e <- sn::rsn(n = n, xi = 0, omega = 1, alpha = 5)
    Y <- X %*% beta + e
    SLIC(X, Y, alpha = 0.05, K = 10, dist_type = "skew_normal")