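"Mathematics and Physics Lecture Series" (数理讲堂), 2023, Session 11
Published: 2023-05-11 10:00:00   Published by: School of Mathematics, Physics and Statistics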

Topic: Optimal decorrelated score subsampling for generalized linear models with massive data

Time: May 11, 2023, 10:00-11:30

Venue: Tencent Meeting: 748-682-688

Host: Professor Jiang Rong (姜荣)

About the speaker:

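  Wang Lei (王磊) is an associate research fellow and doctoral supervisor at the School of Statistics and Data Science, Nankai University. Wang's research focuses on complex data analysis and statistical learning, with more than 50 papers published in statistics journals including Biometrika, SCIENCE CHINA Mathematics, Bernoulli, and Statistica Sinica. Wang has led three National Natural Science Foundation of China projects and one Tianjin Natural Science Foundation project.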

About the talk:

  In this paper, we consider unified optimal subsampling estimation and inference on a low-dimensional parameter of main interest in the presence of nuisance parameters for low/high-dimensional generalized linear models (GLMs) with massive data. We first present a general subsampling decorrelated score function to reduce the influence of the less accurate nuisance parameter estimation with its slow convergence rate. The consistency and asymptotic normality of the resultant subsample estimator from a general decorrelated score subsampling algorithm are established, and two optimal subsampling probabilities are derived under the A- and L-optimality criteria to downsize the data volume and reduce the computational burden. The proposed optimal subsampling probabilities provably improve the asymptotic efficiency upon the subsampling schemes in the low-dimensional GLMs and perform better than the uniform subsampling scheme in the high-dimensional GLMs. A two-step algorithm is further proposed for implementation, and the asymptotic properties of the corresponding estimators are also given. Simulations show satisfactory performance of the proposed estimators, and two applications to census income and Fashion-MNIST datasets also demonstrate their practical applicability.
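
  For readers unfamiliar with two-step optimal subsampling, the general idea can be sketched as follows. This is only an illustration under stated assumptions, not the speaker's decorrelated score method: it uses a plain low-dimensional logistic regression with no nuisance parameter, a uniform pilot subsample, and L-optimality-style probabilities proportional to |y_i - p_i| * ||x_i||, as in the classical subsampling literature. The function names `weighted_logistic_mle` and `two_step_subsample`, the simulated data, and all sizes are hypothetical.

```python
import numpy as np


def weighted_logistic_mle(X, y, w, n_iter=50, tol=1e-8):
    """Newton's method for a weighted logistic log-likelihood (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_step_subsample(X, y, r0, r, rng):
    """Two-step subsampling sketch: uniform pilot, then L-optimality-style sampling."""
    n = X.shape[0]
    # Step 1: pilot estimate from a uniform subsample of size r0.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta_pilot = weighted_logistic_mle(X[idx0], y[idx0], np.ones(r0))
    # Step 2: probabilities proportional to |y_i - p_i| * ||x_i|| at the pilot estimate.
    p = 1.0 / (1.0 + np.exp(-X @ beta_pilot))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    pi = score / score.sum()
    idx = rng.choice(n, size=r, replace=True, p=pi)
    # Inverse-probability weights, rescaled only for numerical stability.
    w = 1.0 / pi[idx]
    w = w / w.mean()
    beta_hat = weighted_logistic_mle(X[idx], y[idx], w)
    return beta_hat, pi


# Simulated data (an assumption for illustration; not the datasets from the talk).
rng = np.random.default_rng(0)
n, d = 20000, 5
beta_true = np.array([0.5, -0.5, 0.5, -0.5, 0.5])
X = rng.normal(size=(n, d))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

beta_hat, pi = two_step_subsample(X, y, r0=500, r=2000, rng=rng)
print("estimate from 2000 of", n, "rows:", np.round(beta_hat, 3))
```

  The second-stage fit touches only `r` rows, which is where the computational saving comes from; the inverse-probability weights correct for the non-uniform sampling.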


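Notes on earning "Second Classroom credits" for this event (online):

  ① Tencent Meeting: after joining the meeting, change your display name to student ID + name.

  ② After the lecture begins, staff will record attendee information at two arbitrary points during the session and cross-check it; attendees who are successfully matched will be awarded Second Classroom points.

  ③ Please attend the entire lecture and do not repeatedly leave and rejoin. While attending, make sure your display name follows the required format; otherwise the final review will fail and you will not receive Second Classroom points.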
