Prof. Fu Wenjiang: Variable Selection via the Lasso, 2:00-3:00 p.m.
2006-12-26  Source: Center of Mathematical Sciences  Venue:
Event type: Academic lecture
Speaker: Fu Wenjiang
Time:
Content:
Center of Mathematical Sciences at Zhejiang University
Academic Lecture, 2006
Title: Variable Selection via the Lasso
Speaker: Fu Wenjiang (Department of Epidemiology, Michigan)
Abstract: Information technology (IT) has generated massive data sets in our work and life, such as financial data from stock markets, customer data in marketing and economics, medical imaging data, and genetic/genomic data in biological and biomedical research. These massive data sets not only provide opportunities for quantitative research, but also challenge us with unprecedented demands on model sophistication and computational power. Very often, these data sets involve hundreds or even thousands of variables, collected in order to understand the relationship between the occurrence of major events of interest (response variables) and the variables that may potentially explain those events (explanatory variables). Statisticians have contributed greatly to the modeling and analysis of massive data in the post-IT and post-genome era, working with other scientists in data mining, bioinformatics, and related fields. One of the major statistical tasks is dealing with a large number of variables. Although many variables are collected in such studies, not all of them contribute to the event under investigation. For example, microarray studies of breast cancer often collect expression data from thousands of genes/probes, yet it has been shown that as few as 70 genes may determine the prognosis of a patient's survival after surgery (van de Vijver et al. 2002). The statistical challenge is how to select these significant/important variables from the large or huge number of variables collected; this is the so-called variable selection problem. Traditional variable selection was achieved with forward and backward selection, adding significant variables to the statistical model and dropping insignificant ones. Such a procedure can be unstable due to its discrete nature. The lasso method proposed by Tibshirani (1996) provides an alternative.
It achieves variable selection in regression models in a continuous way, shrinking small parameters to zero while leaving large parameters in the model, and is thus more stable. Recent studies show that although the lasso may produce biased estimates, it can provide asymptotically unbiased estimation and variable selection when applied in an adaptive fashion. In this talk, I will discuss the properties of the lasso and its applications.
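The continuous shrinkage described in the abstract can be illustrated with a small coordinate-descent sketch of the lasso. This is not the speaker's code; the toy data, the penalty level `lam`, and the function names are made up for illustration, and the update used is the standard soft-thresholding step for the penalized form (1/2n)‖y − Xβ‖² + λΣ|βⱼ|.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator: shrinks z toward zero by lam, zeroing small values."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j |b_j|."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: response minus the fit from all coordinates except j.
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            denom = sum(X[i][j] ** 2 for i in range(n)) / n
            # Small correlations with the residual are thresholded exactly to zero.
            b[j] = soft_threshold(rho, lam) / denom
    return b

# Toy data (hypothetical): y depends only on the first two of four predictors.
rng = random.Random(0)
n, p = 30, 4
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] for row in X]

coef = lasso_cd(X, y, lam=0.1)
```

On this toy data the two informative coefficients stay large (slightly shrunk toward zero, which is the bias the abstract mentions), while the two noise coefficients are driven to zero, which is how the lasso performs selection continuously rather than by discrete add/drop steps.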