Colloquium: Fast approximation of U-statistics: a free lunch.

Event Type: 
Prof. Wei Zheng
Event Date: 
Tuesday, September 29, 2020, 3:30pm to 5:00pm
Zoom Meetings:
General Public, Faculty/Staff, Students, Alumni/Friends

Event Description: 

Title: Fast approximation of U-statistics: a free lunch.

Abstract: U-statistics are widely used in fields such as economics, machine learning, and statistics. However, while they enjoy desirable statistical properties, they have an obvious drawback in that the computation becomes impractical as the data size $n$ increases. Specifically, the number of combinations, say $m$, that a U-statistic of order $d$ has to evaluate is $O(n^d)$. Many efforts have been made to approximate the original U-statistic using a small subset of combinations since Blom (1976), who referred to such an approximation as an incomplete U-statistic. To the best of our knowledge, all existing methods require $m$ to grow at least faster than $n$, albeit more slowly than $n^d$, in order for the corresponding incomplete U-statistic to be asymptotically efficient in terms of the mean squared error. In this paper, we introduce a new type of incomplete U-statistic that can be asymptotically efficient, even when $m$ grows more slowly than $n$. In some cases, $m$ is only required to grow faster than $\sqrt{n}$. Our theoretical and empirical results both show significant improvements in the statistical efficiency of the new incomplete U-statistic.
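For readers unfamiliar with the terminology, the gap between a complete and an incomplete U-statistic can be illustrated with a small sketch. This is not the new method presented in the talk: it is only the classical idea of averaging the kernel over a random subset of $m$ combinations instead of all $O(n^d)$ of them. The function names, the choice of the variance kernel, and the sample sizes are assumptions made for illustration.

```python
import itertools
import math
import random
import statistics


def variance_kernel(x, y):
    """Order-2 kernel whose complete U-statistic is the unbiased sample variance."""
    return 0.5 * (x - y) ** 2


def complete_u_statistic(data, kernel, d=2):
    """Average the kernel over all C(n, d) subsets of size d."""
    total = 0.0
    count = 0
    for combo in itertools.combinations(data, d):
        total += kernel(*combo)
        count += 1
    return total / count


def incomplete_u_statistic(data, kernel, m, d=2, seed=0):
    """Average the kernel over m size-d subsets drawn uniformly at random.

    Illustrative assumption: subsets are drawn independently, so the same
    subset may be picked more than once.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        combo = rng.sample(data, d)  # d distinct observations
        total += kernel(*combo)
    return total / m


# Hypothetical example: n = 200 standard normal draws, order d = 2.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]

n_combinations = math.comb(len(data), 2)  # kernel evaluations for the complete version
full = complete_u_statistic(data, variance_kernel)
approx = incomplete_u_statistic(data, variance_kernel, m=2000)
```

Here the complete version evaluates the kernel on every pair, while the incomplete version uses only 2000 randomly chosen pairs and trades some accuracy for speed. How small $m$ can be made without losing asymptotic efficiency is the question the abstract addresses.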

About the speaker: Dr. Wei Zheng is currently an Associate Professor in the Department of Business Analytics and Statistics (BAS), Haslam College of Business (HCB), at the University of Tennessee, Knoxville (UTK). He was an Assistant Professor at Indiana University-Purdue University Indianapolis before joining UTK. He received his Ph.D. from the University of Illinois at Chicago. His main research interest is the optimal design of experiments, which studies the most cost-effective way of conducting experiments so as to maximize the information in the data to be collected. His early work on crossover designs and related studies has mostly appeared in top statistics journals such as the Annals of Statistics and the Journal of the American Statistical Association. Recently, he has been extending his research to the interface between design and computing in two ways. The obvious one is to develop fast algorithms for deriving optimal or efficient designs. The other is the opposite: using design ideas to improve the computational performance of methods in both statistics and machine learning. He also has a particular interest in new design problems inspired by real applications and welcomes any form of collaboration toward this mission. His research is supported by NSF grant DMS-1612978. He currently serves as an associate editor of Metrika.

Event Contact

Contact Name: Guoyi Zhang

Contact Email: