Functional regression is a version of regression analysis in which the responses or covariates include functional data. Functional regression models can be classified into four types depending on whether the responses and covariates are functional or scalar: (i) scalar responses with functional covariates, (ii) functional responses with scalar covariates, (iii) functional responses with functional covariates, and (iv) scalar or functional responses with both functional and scalar covariates. In addition, functional regression models can be linear, partially linear, or nonlinear. In particular, functional polynomial models, functional single and multiple index models, and functional additive models are three special cases of functional nonlinear models.

Functional linear models (FLMs)

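Functional linear models (FLMs) are an extension of linear models (LMs). A linear model with scalar response $Y \in \mathbb{R}$ and scalar covariates $X \in \mathbb{R}^p$ can be written as

$$Y = \beta_0 + \langle X, \beta \rangle + \varepsilon, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product in Euclidean space, $\beta_0 \in \mathbb{R}$ and $\beta \in \mathbb{R}^p$ denote the regression coefficients, and $\varepsilon$ is a random error with mean zero and finite variance. FLMs can be divided into two types based on the responses.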

Functional linear models with scalar responses

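Functional linear models with scalar responses can be obtained by replacing the scalar covariates $X$ and the coefficient vector $\beta$ in model (1) by a centered functional covariate $X^c(t) = X(t) - \mathbb{E}(X(t))$ and a coefficient function $\beta = \beta(t)$ with domain $\mathcal{T}$, respectively, and replacing the inner product in Euclidean space by that in the Hilbert space $L^2$,

$$Y = \beta_0 + \langle X^c, \beta \rangle + \varepsilon = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt + \varepsilon, \tag{2}$$

where $\langle \cdot, \cdot \rangle$ here denotes the inner product in $L^2$. One approach to estimating $\beta_0$ and $\beta(t)$ is to expand the centered covariate $X^c(t)$ and the coefficient function $\beta(t)$ in the same functional basis, for example, a B-spline basis or the eigenbasis used in the Karhunen–Loève expansion. Suppose $\{\phi_k\}_{k=1}^\infty$ is an orthonormal basis of $L^2$. Expanding $X^c$ and $\beta$ in this basis, $X^c(t) = \sum_{k=1}^\infty x_k\phi_k(t)$, $\beta(t) = \sum_{k=1}^\infty \beta_k\phi_k(t)$, model (2) becomes

$$Y = \beta_0 + \sum_{k=1}^\infty \beta_k x_k + \varepsilon.$$

For implementation, regularization is needed and can be done through truncation, $L^2$ penalization or $L^1$ penalization.[1] In addition, a reproducing kernel Hilbert space (RKHS) approach can also be used to estimate $\beta_0$ and $\beta(t)$ in model (2).[2]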
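The basis-expansion approach with truncation can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it assumes fully observed curves on an equally spaced grid, uses the empirical eigenbasis of the sample covariance, approximates integrals by Riemann sums, and runs on simulated data. The function name `fit_scalar_on_function` and all data are hypothetical.

```python
import numpy as np

def fit_scalar_on_function(X, y, t, n_components):
    """Least squares on the leading empirical eigenbasis scores.

    X: (n, m) curves on the equally spaced grid t (length m); y: (n,) responses.
    Returns the intercept and the coefficient function evaluated on t.
    """
    dt = t[1] - t[0]
    Xc = X - X.mean(axis=0)                       # centered covariate
    cov = Xc.T @ Xc / len(y)                      # sample covariance on the grid
    eigvals, eigvecs = np.linalg.eigh(cov * dt)   # discretized covariance operator
    order = np.argsort(eigvals)[::-1][:n_components]
    phi = eigvecs[:, order] / np.sqrt(dt)         # scaled so that sum(phi_k**2) * dt = 1
    scores = Xc @ phi * dt                        # Riemann-sum inner products with phi_k
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], phi @ coef[1:]

# Simulated data (hypothetical): three sine components, known coefficient function.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, 4), t))
X = (rng.normal(size=(500, 3)) * np.array([2.0, 1.0, 0.5])) @ basis
beta_true = basis[0] - 0.5 * basis[1]
y = 1.0 + X @ beta_true * (t[1] - t[0]) + 0.1 * rng.normal(size=500)

beta0_hat, beta_hat = fit_scalar_on_function(X, y, t, n_components=3)
print(beta0_hat, np.max(np.abs(beta_hat - beta_true)))
```

The number of retained components plays the role of the regularization parameter here; in practice it would typically be chosen by cross-validation or a fraction-of-variance rule.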

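Adding multiple functional and scalar covariates, model (2) can be extended to

$$Y = \sum_{k=1}^q Z_k\alpha_k + \sum_{j=1}^p \int_{\mathcal{T}_j} X_j^c(t)\beta_j(t)\,dt + \varepsilon, \tag{3}$$

where $Z_1, \ldots, Z_q$ are scalar covariates with $Z_1 = 1$, $\alpha_1, \ldots, \alpha_q$ are regression coefficients for $Z_1, \ldots, Z_q$, respectively, $X_j^c$ is a centered functional covariate given by $X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))$, $\beta_j$ is the regression coefficient function for $X_j^c(\cdot)$, and $\mathcal{T}_j$ is the domain of $X_j$ and $\beta_j$, for $j = 1, \ldots, p$. However, due to the parametric component $\alpha$, the estimation methods for model (2) cannot be used in this case,[3] and alternative estimation methods for model (3) are available.[4][5]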

Functional linear models with functional responses

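For a functional response $Y(s)$ with domain $\mathcal{S}$ and a functional covariate $X(t)$ with domain $\mathcal{T}$, two FLMs regressing $Y(\cdot)$ on $X(\cdot)$ have been considered.[3][6] One of these two models is of the form

$$Y(s) = \beta_0(s) + \int_{\mathcal{T}} \beta(s,t) X^c(t)\,dt + \varepsilon(s), \tag{4}$$

where $X^c(t) = X(t) - \mathbb{E}(X(t))$ is still the centered functional covariate, $\beta_0(s)$ and $\beta(s,t)$ are coefficient functions, and $\varepsilon(s)$ is usually assumed to be a random process with mean zero and finite variance. In this case, at any given time $s \in \mathcal{S}$, the value of $Y$, i.e., $Y(s)$, depends on the entire trajectory of $X$. Model (4), for any given time $s$, is an extension of multivariate linear regression with the inner product in Euclidean space replaced by that in $L^2$. An estimating equation motivated by multivariate linear regression is

$$r_{XY} = R_{XX}\beta, \quad \text{for } \beta \in L^2(\mathcal{S} \times \mathcal{T}),$$

where $r_{XY}(s,t) = \operatorname{cov}(X(t), Y(s))$, $R_{XX}: L^2(\mathcal{S} \times \mathcal{T}) \to L^2(\mathcal{S} \times \mathcal{T})$ is defined as $(R_{XX}\beta)(s,t) = \int_{\mathcal{T}} r_{XX}(t,w)\beta(s,w)\,dw$ with $r_{XX}(t,w) = \operatorname{cov}(X(t), X(w))$ for $t, w \in \mathcal{T}$.[3] Regularization is needed and can be done through truncation, $L^2$ penalization or $L^1$ penalization.[1] Various estimation methods for model (4) are available.[7][8]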
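One simple way to regularize by truncation, sketched below under stated assumptions, is to project the centered covariate onto its leading empirical eigenfunctions and regress the response curve on the resulting scores separately at each point of the response grid. The sketch assumes fully observed curves on equally spaced grids and simulated data; the function name `fit_function_on_function` is hypothetical, and this is only one of several possible estimators.

```python
import numpy as np

def fit_function_on_function(X, Y, t, n_components):
    """Truncated-eigenbasis estimate of the intercept and coefficient surface.

    X: (n, m_t) covariate curves on grid t; Y: (n, m_s) response curves.
    Returns beta0 of shape (m_s,) and beta of shape (m_s, m_t).
    """
    dt = t[1] - t[0]                              # assumes an equally spaced grid
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(X) * dt)
    order = np.argsort(eigvals)[::-1][:n_components]
    phi = eigvecs[:, order] / np.sqrt(dt)         # (m_t, K), orthonormal in L^2
    scores = Xc @ phi * dt                        # (n, K)
    beta0 = Y.mean(axis=0)                        # centered covariate has mean zero
    B, *_ = np.linalg.lstsq(scores, Y - beta0, rcond=None)   # (K, m_s)
    beta = B.T @ phi.T                            # beta(s, t) = sum_k B[k, s] * phi_k(t)
    return beta0, beta

# Simulated data (hypothetical) with a known coefficient surface.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 51)                     # covariate grid
s = np.linspace(0.0, 1.0, 41)                     # response grid
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, 3), t))
X = (rng.normal(size=(300, 2)) * np.array([1.5, 1.0])) @ basis
beta_true = np.outer(1.0 + s, basis[0]) + np.outer(s ** 2, basis[1])
Y = np.sin(2 * np.pi * s) + X @ beta_true.T * (t[1] - t[0])
Y = Y + 0.1 * rng.normal(size=Y.shape)

beta0_hat, beta_hat = fit_function_on_function(X, Y, t, n_components=2)
print(np.max(np.abs(beta_hat - beta_true)))
```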
When $X$ and $Y$ are concurrently observed, i.e., $\mathcal{S} = \mathcal{T}$,[9] it is reasonable to consider a historical functional linear model, where the current value of $Y$ only depends on the history of $X$, i.e., $\beta(s,t) = 0$ for $s < t$ in model (4).[3][10] A simpler version of the historical functional linear model is the functional concurrent model (see below).
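Adding multiple functional covariates, model (4) can be extended to

$$Y(s) = \beta_0(s) + \sum_{j=1}^p \int_{\mathcal{T}_j} \beta_j(s,t) X_j^c(t)\,dt + \varepsilon(s), \tag{5}$$

where for $j = 1, \ldots, p$, $X_j^c(\cdot) = X_j(\cdot) - \mathbb{E}(X_j(\cdot))$ is a centered functional covariate with domain $\mathcal{T}_j$, and $\beta_j(\cdot,\cdot)$ is the corresponding coefficient function on $\mathcal{S} \times \mathcal{T}_j$.[3] In particular, taking $X_j(\cdot)$ as a constant function yields a special case of model (5)

$$Y(s) = \alpha_0(s) + \sum_{j=1}^p X_j\alpha_j(s) + \varepsilon(s), \quad \text{for } s \in \mathcal{S},$$

which is an FLM with functional responses and scalar covariates.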
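For the special case with scalar covariates, a simple baseline (a sketch, not a recommended final estimator) is ordinary least squares applied separately at each point of the response grid, which gives raw estimates of the coefficient functions that could afterwards be smoothed. The sketch assumes response curves observed on a common grid and uses simulated data; `fit_function_on_scalar` is a hypothetical name.

```python
import numpy as np

def fit_function_on_scalar(Z, Y):
    """Pointwise least squares for functional responses with scalar covariates.

    Z: (n, p) scalar covariates; Y: (n, m) response curves on a common grid.
    Returns the intercept function (m,) and the slope functions (p, m).
    """
    design = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)   # one fit per grid point
    return coef[0], coef[1:]

# Simulated data (hypothetical) with known coefficient functions.
rng = np.random.default_rng(2)
s = np.linspace(0.0, 1.0, 50)
Z = rng.normal(size=(200, 2))
alpha_true = np.vstack([np.sin(np.pi * s), s ** 2])
Y = np.cos(2 * np.pi * s) + Z @ alpha_true + 0.1 * rng.normal(size=(200, 50))

alpha0_hat, alpha_hat = fit_function_on_scalar(Z, Y)
print(np.max(np.abs(alpha_hat - alpha_true)))
```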

Functional concurrent models

Assuming that $\mathcal{S} = \mathcal{T}$, another model, known as the functional concurrent model, sometimes also referred to as the varying-coefficient model, is of the form

$$Y(s) = \alpha_0(s) + \alpha(s)X(s) + \varepsilon(s), \tag{6}$$

where $\alpha_0$ and $\alpha$ are coefficient functions. Note that model (6) assumes the value of $Y$ at time $s$, i.e., $Y(s)$, only depends on that of $X$ at the same time, i.e., $X(s)$. Various estimation methods can be applied to model (6).[11][12][13]
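Adding multiple functional covariates, model (6) can also be extended to

$$Y(s) = \alpha_0(s) + \sum_{j=1}^p \alpha_j(s)X_j(s) + \varepsilon(s),$$

where $X_1, \ldots, X_p$ are multiple functional covariates with domain $\mathcal{S}$ and $\alpha_0, \alpha_1, \ldots, \alpha_p$ are the coefficient functions with the same domain.[3]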
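Because the concurrent model relates the response and covariates at the same time point only, a rough estimate can be obtained by fitting a separate least squares regression at each grid point. The sketch below assumes densely observed curves on a shared grid and simulated data; the name `fit_concurrent` is hypothetical, and the raw pointwise estimates would normally be smoothed over time in a second step, which is omitted here.

```python
import numpy as np

def fit_concurrent(X, Y):
    """Pointwise least squares for the concurrent model.

    X: (n, p, m) covariate curves; Y: (n, m) responses on the same grid.
    Returns an array of shape (p + 1, m): intercept function, then slope functions.
    """
    n, p, m = X.shape
    coefs = np.empty((p + 1, m))
    for j in range(m):                            # separate regression at each time point
        design = np.column_stack([np.ones(n), X[:, :, j]])
        coefs[:, j] = np.linalg.lstsq(design, Y[:, j], rcond=None)[0]
    return coefs

# Simulated data (hypothetical) with two functional covariates.
rng = np.random.default_rng(3)
s = np.linspace(0.0, 1.0, 50)
n = 400
X1 = rng.normal(size=(n, 1)) + rng.normal(size=(n, 1)) * np.sin(2 * np.pi * s)
X2 = rng.normal(size=(n, 1)) + rng.normal(size=(n, 1)) * np.cos(2 * np.pi * s)
X = np.stack([X1, X2], axis=1)                    # shape (n, 2, m)
coef_true = np.vstack([np.cos(2 * np.pi * s), 1.0 + s ** 2, -s])
Y = coef_true[0] + coef_true[1] * X1 + coef_true[2] * X2
Y = Y + 0.1 * rng.normal(size=(n, 50))

coef_hat = fit_concurrent(X, Y)
print(np.max(np.abs(coef_hat - coef_true)))
```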

Functional nonlinear models

Functional polynomial models

Functional polynomial models are an extension of the FLMs with scalar responses, analogous to extending linear regression to polynomial regression. For a scalar response $Y$ and a functional covariate $X(\cdot)$ with domain $\mathcal{T}$, the simplest example of functional polynomial models is functional quadratic regression[14]

$$Y = \alpha + \int_{\mathcal{T}} \beta(t)X^c(t)\,dt + \int_{\mathcal{T}}\int_{\mathcal{T}} \gamma(s,t)X^c(s)X^c(t)\,ds\,dt + \varepsilon,$$

where $X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))$ is the centered functional covariate, $\alpha$ is a scalar coefficient, $\beta(\cdot)$ and $\gamma(\cdot,\cdot)$ are coefficient functions with domains $\mathcal{T}$ and $\mathcal{T} \times \mathcal{T}$, respectively, and $\varepsilon$ is a random error with mean zero and finite variance. By analogy to FLMs with scalar responses, estimation of functional polynomial models can be obtained through expanding both the centered covariate $X^c$ and the coefficient functions $\beta$ and $\gamma$ in an orthonormal basis.[14]
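After expansion in an orthonormal basis and truncation, functional quadratic regression reduces to an ordinary regression on the scores and their pairwise products. The following sketch illustrates this with the empirical eigenbasis, assuming fully observed curves on an equally spaced grid and simulated data; the helper names `fpc_scores` and `quadratic_design` are hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def fpc_scores(X, t, n_components):
    """Scores of the centered curves on the leading empirical eigenfunctions."""
    dt = t[1] - t[0]                              # assumes an equally spaced grid
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(X) * dt)
    order = np.argsort(eigvals)[::-1][:n_components]
    return Xc @ eigvecs[:, order] * np.sqrt(dt)   # Riemann-sum inner products

def quadratic_design(scores):
    """Columns: intercept, linear terms x_k, and products x_j * x_k for j <= k."""
    pairs = combinations_with_replacement(range(scores.shape[1]), 2)
    quad = [scores[:, j] * scores[:, k] for j, k in pairs]
    return np.column_stack([np.ones(len(scores)), scores] + quad)

# Simulated data (hypothetical): the response is linear in the first basis
# coefficient of X and quadratic in the second.
rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 101)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, 3), t))
xi = rng.normal(size=(400, 2)) * np.array([1.5, 1.0])
X = xi @ basis
y = 0.5 + xi[:, 0] + xi[:, 1] ** 2 + 0.1 * rng.normal(size=400)

design = quadratic_design(fpc_scores(X, t, n_components=2))
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid_quadratic = y - design @ coef
linear = design[:, :3]                            # intercept and linear terms only
resid_linear = y - linear @ np.linalg.lstsq(linear, y, rcond=None)[0]
print(resid_linear.std(), resid_quadratic.std())
```

Comparing the two residual standard deviations shows how much of the response the purely linear functional model leaves unexplained in this simulated setting.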

Functional single and multiple index models

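A functional multiple index model is given by

$$Y = g\left(\int_{\mathcal{T}} X^c(t)\beta_1(t)\,dt, \ldots, \int_{\mathcal{T}} X^c(t)\beta_p(t)\,dt\right) + \varepsilon.$$

Taking $p = 1$ yields a functional single index model. However, for $p > 1$, this model is problematic due to the curse of dimensionality. With $p > 1$ and relatively small sample sizes, the estimator given by this model often has large variance.[15] An alternative $p$-component functional multiple index model can be expressed as

$$Y = g_1\left(\int_{\mathcal{T}} X^c(t)\beta_1(t)\,dt\right) + \cdots + g_p\left(\int_{\mathcal{T}} X^c(t)\beta_p(t)\,dt\right) + \varepsilon.$$

Estimation methods for functional single and multiple index models are available.[15][16]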

Functional additive models (FAMs)

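Given an expansion of the centered functional covariate $X^c$ with domain $\mathcal{T}$ in an orthonormal basis $\{\phi_k\}_{k=1}^\infty$: $X^c(t) = \sum_{k=1}^\infty x_k\phi_k(t)$, a functional linear model with scalar responses shown in model (2) can be written as

$$\mathbb{E}(Y \mid X) = \mathbb{E}(Y) + \sum_{k=1}^\infty \beta_k x_k.$$

One form of FAMs is obtained by replacing the linear function of $x_k$, i.e., $\beta_k x_k$, by a general smooth function $f_k$,

$$\mathbb{E}(Y \mid X) = \mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),$$

where $f_k$ satisfies $\mathbb{E}(f_k(x_k)) = 0$ for $k \in \mathbb{N}$.[3][17] Another form of FAMs consists of a sequence of time-additive models:

$$\mathbb{E}(Y \mid X(t_1), \ldots, X(t_p)) = \sum_{j=1}^p f_j(X(t_j)),$$

where $\{t_1, \ldots, t_p\}$ is a dense grid on $\mathcal{T}$ with increasing size $p \in \mathbb{N}$, and $f_j(x) = g(t_j, x)$ with $g$ a smooth function, for $j = 1, \ldots, p$.[3][18]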
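For the first form of FAM, if the scores are independent across components, each additive component equals the conditional mean of the centered response given the corresponding score, so it can be estimated by a one-dimensional smoother. The sketch below makes that independence assumption, takes the scores as already computed, and uses a Nadaraya-Watson kernel smoother purely for brevity; it is not claimed to be the estimator of the cited papers. The function names and simulated data are hypothetical.

```python
import numpy as np

def kernel_smooth(x, y, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E(y | x) at x_eval with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w @ y) / w.sum(axis=1)

def fit_additive_components(scores, y, x_eval, bandwidth):
    """Smooth the centered response against each score separately.

    Returns an array of shape (K, len(x_eval)) with estimates of each component.
    """
    y_centered = y - y.mean()
    return np.array([kernel_smooth(scores[:, k], y_centered, x_eval, bandwidth)
                     for k in range(scores.shape[1])])

def f1(x):
    return 0.5 * (x ** 2 - 1.0)                   # mean zero for a standard normal score

def f2(x):
    return np.sin(x)                              # mean zero by symmetry

# Simulated, independent standard normal scores (hypothetical).
rng = np.random.default_rng(5)
scores = rng.normal(size=(5000, 2))
y = 1.0 + f1(scores[:, 0]) + f2(scores[:, 1]) + 0.1 * rng.normal(size=5000)
x_eval = np.linspace(-1.0, 1.0, 21)

f_hat = fit_additive_components(scores, y, x_eval, bandwidth=0.25)
print(np.max(np.abs(f_hat[0] - f1(x_eval))), np.max(np.abs(f_hat[1] - f2(x_eval))))
```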

Extensions

A direct extension of FLMs with scalar responses shown in model (2) is to add a link function to create a generalized functional linear model (GFLM), by analogy to extending linear regression to the generalized linear model (GLM). The three components of the GFLM are:

  1. Linear predictor $\eta = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt$;
  2. Variance function $\operatorname{Var}(Y \mid X) = V(\mu)$, where $\mu = \mathbb{E}(Y \mid X)$ is the conditional mean;
  3. Link function $g$ connecting the conditional mean $\mu$ and the linear predictor $\eta$ through $\mu = g(\eta)$.
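As an illustration of the three components, the sketch below fits a GFLM for a binary response: the linear predictor is built from truncated eigenbasis scores, the mean is obtained from the linear predictor through the logistic function, and the Bernoulli variance function supplies the weights in a Newton-Raphson (iteratively reweighted least squares) fit. It assumes fully observed curves on an equally spaced grid and simulated data; `fit_logistic` is a hypothetical helper, and the fixed iteration count is a simplification.

```python
import numpy as np

def fit_logistic(design, y, n_iter=25):
    """Newton-Raphson (IRLS) for a Bernoulli response with logistic mean function."""
    coef = np.zeros(design.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-design @ coef))    # conditional mean from linear predictor
        weights = mu * (1.0 - mu)                    # Bernoulli variance function
        hessian = design.T @ (design * weights[:, None])
        coef = coef + np.linalg.solve(hessian, design.T @ (y - mu))
    return coef

# Simulated binary responses (hypothetical) driven by a known coefficient function.
rng = np.random.default_rng(6)
t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, 3), t))
X = (rng.normal(size=(2000, 2)) * np.array([1.0, 0.7])) @ basis
beta_true = basis[0] - 0.5 * basis[1]
eta = 0.5 + X @ beta_true * dt
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-eta))).astype(float)

# Truncated eigenbasis scores of the centered curves (two components kept).
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(X) * dt)
phi = eigvecs[:, np.argsort(eigvals)[::-1][:2]] / np.sqrt(dt)
design = np.column_stack([np.ones(len(X)), Xc @ phi * dt])

coef = fit_logistic(design, y)
beta_hat = phi @ coef[1:]                            # estimated coefficient function on t
l2_error = np.sqrt(np.sum((beta_hat - beta_true) ** 2) * dt)
print(coef[0], l2_error)
```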

References

  1. Morris, Jeffrey S. (2015). "Functional Regression". Annual Review of Statistics and Its Application. 2 (1): 321–359. arXiv:1406.4068. Bibcode:2015AnRSA...2..321M. doi:10.1146/annurev-statistics-010814-020413. S2CID 18637009.
  2. Yuan and Cai (2010). "A reproducing kernel Hilbert space approach to functional linear regression". The Annals of Statistics. 38 (6): 3412–3444. doi:10.1214/09-AOS772.
  3. Wang, Jane-Ling; Chiou, Jeng-Min; Müller, Hans-Georg (2016). "Functional Data Analysis". Annual Review of Statistics and Its Application. 3 (1): 257–295. Bibcode:2016AnRSA...3..257W. doi:10.1146/annurev-statistics-041715-033624.
  4. Kong, Xue, Yao and Zhang (2016). "Partially functional linear regression in high dimensions". Biometrika. 103 (1): 147–159. doi:10.1093/biomet/asv062.
  5. Hu, Wang and Carroll (2004). "Profile-kernel versus backfitting in the partially linear models for longitudinal/clustered data". Biometrika. 91 (2): 251–262. doi:10.1093/biomet/91.2.251.
  6. Ramsay and Silverman (2005). Functional data analysis, 2nd ed. New York: Springer. ISBN 0-387-40080-X.
  7. Ramsay and Dalzell (1991). "Some tools for functional data analysis". Journal of the Royal Statistical Society. Series B (Methodological). 53 (3): 539–572. https://www.jstor.org/stable/2345586.
  8. Yao, Müller and Wang (2005). "Functional linear regression analysis for longitudinal data". The Annals of Statistics. 33 (6): 2873–2903. doi:10.1214/009053605000000660.
  9. Grenander (1950). "Stochastic processes and statistical inference". Arkiv för Matematik. 1 (3): 195–277. doi:10.1007/BF02590638.
  10. Malfait and Ramsay (2003). "The historical functional linear model". Canadian Journal of Statistics. 31 (2): 115–128. doi:10.2307/3316063.
  11. Fan and Zhang (1999). "Statistical estimation in varying coefficient models". The Annals of Statistics. 27 (5): 1491–1518. doi:10.1214/aos/1017939139.
  12. Huang, Wu and Zhou (2004). "Polynomial spline estimation and inference for varying coefficient models with longitudinal data". Statistica Sinica. 14 (3): 763–788. https://www.jstor.org/stable/24307415.
  13. Şentürk and Müller (2010). "Functional varying coefficient models for longitudinal data". Journal of the American Statistical Association. 105 (491): 1256–1264. doi:10.1198/jasa.2010.tm09228.
  14. Yao and Müller (2010). "Functional quadratic regression". Biometrika. 97 (1): 49–64. doi:10.1093/biomet/asp069.
  15. Chen, Hall and Müller (2011). "Single and multiple index functional regression models with nonparametric link". The Annals of Statistics. 39 (3): 1720–1747. doi:10.1214/11-AOS882.
  16. Jiang and Wang (2011). "Functional single index models for longitudinal data". The Annals of Statistics. 39 (1): 362–388. doi:10.1214/10-AOS845.
  17. Müller and Yao (2008). "Functional additive models". Journal of the American Statistical Association. 103 (484): 1534–1544. doi:10.1198/016214508000000751.
  18. Fan, James and Radchenko (2015). "Functional additive regression". The Annals of Statistics. 43 (5): 2296–2325. doi:10.1214/15-AOS1346.