Package 'FPCdpca' reference manual

Title:	The FPCdpca Criterion on Distributed Principal Component Analysis
Description:	We consider optimal subset selection in the setting where only one data subset may be used to represent the whole data set with minimum information loss, and devise a novel intersection-based criterion for selecting the optimal subset, called the FPC criterion, to handle the optimal sub-estimator in distributed principal component analysis; that is, the FPCdpca. The philosophy of the package is described in Guo G. (2020) <doi:10.1007/s00180-020-00974-4>.
Authors:	Guangbao Guo [aut, cre, cph], Jiarui Li [ctb]
Maintainer:	Guangbao Guo <[email protected]>
License:	Apache License (== 2.0)
Version:	0.1.0
Built:	2025-03-24 04:38:20 UTC
Source:	https://github.com/cran/FPCdpca

Decentralized PCA

Description

Decentralized PCA is a technology that applies PCA to distributed computing environments in a decentralized way.

Usage

Depca(data,K,nk, eps,nit.max)

Arguments

`data`	is a sparse random projection matrix.
`K`	is the desired target rank.
`nk`	is the size of the subsets.
`eps`	is the noise.
`nit.max`	is the maximum number of repetitions.

Value

MSEXrp, MSEvrp, MSESrp, kopt

Examples

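set.seed(1234)  # set the seed before any random draws so the data are reproducible
K=20; nk=50; nr=10; p=8; k=4; n=K*nk;d=6
# (n-nr)*p standard normal values followed by nr*p Poisson(100) values
data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
eps=10^(-1);nit.max=1000
TXde=TSde=c(rep(0,5))
# run Depca five times and keep the first two returned values of each run
for (j in 1:5){
  depca=Depca(data=data,K=K, nk=nk,eps=eps,nit.max=nit.max)
  TXde[j]=as.numeric(depca)[1]
  TSde[j]=as.numeric(depca)[2]
}
# average over the five runs
mean(TXde)
mean(TSde)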

Distributed PCA

Description

Distributed PCA is a technology that applies PCA to distributed computing environments.

Usage

Dpca(data,K, nk)

Arguments

`data`	is the data matrix formed by the n random vectors.
`K`	is an index subset/sub-vector specifying.
`nk`	is the size of subsets.

Value

MSEXp, MSEvp, MSESp, kopt

Examples

K=20; nk=50; nr=10; p=8;n=K*nk;d=6
data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
Dpca(data,K,nk)
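
To make the idea of representing the whole data set by a single subset more concrete, here is a minimal base-R sketch. It is an assumption-laden illustration and not the implementation of Dpca() or of the FPC criterion: the contiguous block partition, the squared Frobenius loss, and the names block, S_full, S_sub, loss, j_best and v_best are all hypothetical choices made for this example.

```r
# Illustrative sketch only: NOT the Dpca() implementation or the FPC criterion.
set.seed(1)
K <- 20; nk <- 50; p <- 8; n <- K * nk
X <- matrix(rnorm(n * p), ncol = p)

# Assign each row to one of K blocks of size nk (hypothetical partition).
block <- rep(seq_len(K), each = nk)

# Covariance of the full data and of each block.
S_full <- cov(X)
S_sub <- lapply(seq_len(K), function(j) cov(X[block == j, , drop = FALSE]))

# Squared Frobenius distance between each block covariance and the full one.
loss <- vapply(S_sub, function(S) sum((S - S_full)^2), numeric(1))

# Block with the smallest loss, and its leading principal direction.
j_best <- which.min(loss)
v_best <- eigen(S_sub[[j_best]], symmetric = TRUE)$vectors[, 1]
```

Here j_best indexes the block whose sample covariance is closest to the full-data covariance under this particular loss; other loss functions would generally pick a different block.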

Distributed random projection

Description

Distributed random projection is a technology that applies random projection to distributed computing environments.

Usage

Drp(data,K, nk,d)

Arguments

`data`	is a sparse random projection matrix.
`K`	is the number of distributed nodes.
`nk`	is the size of the subsets.
`d`	is the number of dimensions.

Value

MSEXrp, MSEvrp, MSESrp, kopt

Examples

K=20; nk=50; nr=10; p=8; d=5; n=K*nk;
data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
data=matrix(rpois((n-nr)*p,1),ncol=p); rexp(nr*p,1); rchisq(10000, df = 5);
Drp(data=data,K=K, nk=nk,d=d)
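
For readers unfamiliar with sparse random projection, the following base-R sketch shows one common construction (an Achlioptas-style matrix with entries in {+sqrt(3), 0, -sqrt(3)}). It is offered only as background under stated assumptions: whether Drp() builds its projection this way is not documented here, and the names R and Xrp are hypothetical.

```r
# Illustrative sketch only: NOT the Drp() implementation.
set.seed(1)
n <- 1000; p <- 8; d <- 5
X <- matrix(rnorm(n * p), ncol = p)

# Sparse random projection matrix (Achlioptas-style): entries are
# +sqrt(3), 0, -sqrt(3) with probabilities 1/6, 2/3, 1/6.
R <- matrix(sample(c(sqrt(3), 0, -sqrt(3)), p * d, replace = TRUE,
                   prob = c(1/6, 2/3, 1/6)),
            nrow = p, ncol = d)

# Project the p-dimensional rows of X down to d dimensions.
Xrp <- X %*% R / sqrt(d)
dim(Xrp)
```

About two thirds of the entries of R are zero on average, which is what makes the projection cheap to apply.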

Distributed random PCA

Description

Distributed random PCA is a technology that applies random PCA to distributed computing environments.

Usage

Drpca(data,K, nk,d)

Arguments

`data`	is a sparse random projection matrix.
`K`	is the number of distributed nodes.
`nk`	is the size of the subsets.
`d`	is the number of dimensions.

Value

MSEXrp, MSEvrp, kSopt, kxopt

Examples

K=20; nk=50; nr=50; p=8;d=5; n=K*nk;
data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
Drpca(data,K, nk,d)

Distributed random SVD

Description

Distributed random SVD is a technology that applies random SVD to distributed computing environments.

Usage

Drsvd(data,K, nk,m,q,k)

Arguments

`data`	sparse random projection matrix.
`K`	the number of distributed nodes.
`nk`	the size of subsets.
`m`	the dimension of variables.
`q`	number of additional power iterations.
`k`	the desired target rank.

Value

`MSEXrsvd`	The MSE value of Xrsvd
`MSEvrsvd`	The MSE value of vrsvd
`MSESrsvd`	The MSE value of Srsvd
`kopt`	The size of optimal subset

Examples

K=20; nk=50; nr=10; p=8; m=5; q=5;k=4;n=K*nk;
data=X=matrix(rexp(n*p,0.8),ncol=p)
#data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
#data=X=matrix(rpois((n-nr)*p,1),ncol=p); rexp(nr*p,1); rchisq(10000, df = 5);
#data=X=matrix(rexp(n*p,0.8),ncol=p)
Drsvd(data=data,K=K,nk=nk,m=m,q=q,k=k)
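
The arguments q (additional power iterations) and k (target rank) match the usual ingredients of a randomized SVD. The base-R sketch below shows that general technique on a single matrix, without oversampling and without any distribution across nodes. It is an assumption about what such a computation looks like, not the Drsvd() implementation; the names Omega, Q, B, U and Xk are hypothetical.

```r
# Illustrative sketch only: NOT the Drsvd() implementation.
set.seed(1)
n <- 1000; p <- 8; k <- 4; q <- 5
X <- matrix(rexp(n * p, 0.8), ncol = p)

# Random test matrix and an initial orthonormal basis for the sketched range.
Omega <- matrix(rnorm(p * k), nrow = p)
Q <- qr.Q(qr(X %*% Omega))           # n x k

# q power iterations, re-orthonormalising at each step for stability.
for (i in seq_len(q)) {
  Q <- qr.Q(qr(crossprod(X, Q)))     # p x k
  Q <- qr.Q(qr(X %*% Q))             # n x k
}

# SVD of the small projected matrix, then map back to the original space.
B <- crossprod(Q, X)                 # k x p
s <- svd(B)
U <- Q %*% s$u
Xk <- U %*% diag(s$d) %*% t(s$v)     # rank-k approximation of X

# Mean squared error of the rank-k approximation.
mean((X - Xk)^2)
```

By the Eckart-Young theorem this error cannot be smaller than that of the exact rank-k truncated SVD; the power iterations are there to bring it close to that bound.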

Distributed SVD

Description

Distributed SVD is a technology that applies SVD to distributed computing environments.

Usage

Dsvd(data,K, nk,k)

Arguments

`data`	an independent variable.
`K`	the number of distributed nodes.
`nk`	the size of each block.
`k`	the desired target rank.

Value

`MSEXs`	the MSE of Xs
`MSEvsvd`	the MSE of vsvd
`MSESsvd`	the MSE of Ssvd
`kopt`	the size of optimal subset

Examples

#install.packages("matrixcalc")
library(matrixcalc)
K=20; nk=50; nr=10; p=8; k=4; n=K*nk;
data=matrix(c(rnorm((n-nr)*p,0,1),rpois(nr*p,100)),ncol=p)
Dsvd(data=data,K=K, nk=nk,k=k)
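
As background for the target rank k and the MSE values listed above, the following base-R sketch computes a rank-k truncated SVD on each of K blocks and records the reconstruction error per block. It is a hedged illustration only: the contiguous block partition, the helper rank_k_mse and the vector mse are hypothetical, and Dsvd() may define its MSE quantities differently.

```r
# Illustrative sketch only: NOT the Dsvd() implementation.
set.seed(1)
K <- 20; nk <- 50; p <- 8; k <- 4; n <- K * nk
X <- matrix(rnorm(n * p), ncol = p)
block <- rep(seq_len(K), each = nk)

# Rank-k reconstruction error of one block via a truncated SVD.
rank_k_mse <- function(Xb, k) {
  s <- svd(Xb, nu = k, nv = k)
  Xb_k <- s$u %*% diag(s$d[seq_len(k)], nrow = k) %*% t(s$v)
  mean((Xb - Xb_k)^2)
}

# One reconstruction error per block.
mse <- vapply(seq_len(K),
              function(j) rank_k_mse(X[block == j, , drop = FALSE], k),
              numeric(1))

# Index of the block that is best approximated at rank k.
which.min(mse)
```

With k equal to the number of columns the reconstruction is exact up to rounding, so the error only becomes informative when k is smaller than p.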

FPC

Description

FPC is a technology that applies the FPC criterion to distributed computing environments.

Usage

FPC(data,K,nk)

Arguments

`data`	is a data set matrix.
`K`	is the desired target rank.
`nk`	is the size of subsets.

Value

MSEv1, MSEv2, MSEvopt, MSESopt1, MSESopt2, MSESopt, MSEShat, MSESba, MSESw

Examples

K=20; nk=500; p=8; n=10000;m=50
data=matrix(c(rnorm((n-m)*p,0,1),rpois(m*p,100)),ncol=p)
FPC(data=data,K=K,nk=nk)
