no code implementations • 22 Jan 2024 • T. Tony Cai, Hongming Pu
Transfer learning for nonparametric regression is considered.
no code implementations • 8 Jan 2024 • T. Tony Cai, Dong Xia, Mengyue Zha
Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics.
no code implementations • 13 Mar 2023 • T. Tony Cai, Yichen Wang, Linjun Zhang
The score attack method is based on the tracing attack concept in differential privacy and can be applied to any statistical model with a well-defined score statistic.
no code implementations • 3 Jan 2023 • T. Tony Cai, Zheng Tracy Ke, Paxton Turner
Motivated by applications in text mining and discrete distribution inference, we investigate testing the equality of the probability mass functions of $K$ groups of high-dimensional multinomial distributions.
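As a classical low-dimensional baseline (not the high-dimensional test studied in the paper, which is designed for settings where the chi-square approximation breaks down), the $K$-sample equality of multinomial pmfs can be checked with a chi-square test on the $K \times d$ count table:

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
d, K, n = 50, 3, 2000
pmf = np.full(d, 1.0 / d)                    # common probability mass function
counts = rng.multinomial(n, pmf, size=K)     # K groups drawn from the same pmf

# Classical chi-square test of homogeneity on the K x d contingency table
chi2, pval, dof, _ = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p-value = {pval:.3f}")
```

Under the null the statistic has approximately $(K-1)(d-1)$ degrees of freedom; this approximation requires expected cell counts to be large, which is exactly what fails in the high-dimensional regime the paper addresses.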
no code implementations • 22 Nov 2022 • Changxiao Cai, T. Tony Cai, Hongzhe Li
The results quantify the contribution of the data from the source domains for learning in the target domain in the context of nonparametric contextual multi-armed bandits.
1 code implementation • 22 Mar 2022 • Ziyi Liang, T. Tony Cai, Wenguang Sun, Yin Xia
Linkage analysis has provided valuable insights into genome-wide association studies (GWAS), particularly by revealing that SNPs in linkage disequilibrium (LD) can jointly influence disease phenotypes.
no code implementations • 17 Jan 2022 • T. Tony Cai, Rong Ma
Motivated by applications in single-cell biology and metagenomics, we investigate the problem of matrix reordering based on a noisy disordered monotone Toeplitz matrix model.
no code implementations • 1 Jul 2021 • T. Tony Cai, Hongji Wei
Distributed minimax estimation and distributed adaptive estimation under communication constraints are studied for the Gaussian sequence model and the white noise model.
no code implementations • 16 May 2021 • T. Tony Cai, Rong Ma
This paper investigates the theoretical foundations of the t-distributed stochastic neighbor embedding (t-SNE) algorithm, a popular nonlinear dimension reduction and data visualization method.
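The algorithm whose theory is analyzed here is widely available; a minimal usage sketch with scikit-learn's implementation (cluster locations and sizes below are illustrative choices):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Three well-separated Gaussian clusters in 50 dimensions, 50 points each
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 50))
               for c in (0.0, 5.0, 10.0)])

# Embed into 2D; perplexity controls the effective neighborhood size
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)
print(emb.shape)
```

The paper's results concern when and why such embeddings preserve cluster structure, a question the library call itself does not answer.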
no code implementations • 8 Nov 2020 • T. Tony Cai, Yichen Wang, Linjun Zhang
We propose differentially private algorithms for parameter estimation in both low-dimensional and high-dimensional sparse generalized linear models (GLMs) by constructing private versions of projected gradient descent.
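A minimal sketch of noisy projected gradient descent for the simplest GLM (linear regression), using the Gaussian mechanism; the noise scale `sigma`, clipping level, and projection radius here are illustrative stand-ins, not the paper's calibrated privacy accounting:

```python
import numpy as np

def dp_projected_gd(X, y, radius, steps=200, lr=0.1, clip=1.0, sigma=0.5, seed=0):
    """Noisy projected gradient descent sketch for least squares.

    Per-sample gradients are clipped to norm `clip`, Gaussian noise is
    added to the averaged gradient, and iterates are projected onto an
    L2 ball of the given radius. Calibrating `sigma` to a target
    (epsilon, delta) budget is omitted.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(steps):
        residual = X @ beta - y
        grads = residual[:, None] * X                        # per-sample gradients
        norms = np.maximum(np.linalg.norm(grads, axis=1) / clip, 1.0)
        grad = (grads / norms[:, None]).mean(axis=0)         # clipped mean gradient
        grad += rng.normal(0.0, sigma * clip / n, size=d)    # Gaussian mechanism noise
        beta -= lr * grad
        nrm = np.linalg.norm(beta)
        if nrm > radius:                                     # project onto L2 ball
            beta *= radius / nrm
    return beta

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
beta_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=500)
beta_hat = dp_projected_gd(X, y, radius=5.0)
print(np.round(beta_hat, 2))
```

The high-dimensional sparse variant in the paper additionally uses private iterative hard thresholding, which this sketch does not attempt.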
no code implementations • 6 Nov 2020 • Linjun Zhang, Rong Ma, T. Tony Cai, Hongzhe Li
Based on the iterative estimators, we further construct debiased estimators and establish their asymptotic normality.
1 code implementation • 21 Oct 2020 • Sai Li, T. Tony Cai, Hongzhe Li
Transfer learning for high-dimensional Gaussian graphical models (GGMs) is studied with the goal of estimating the target GGM by utilizing the data from similar and related auxiliary studies.
1 code implementation • 18 Jun 2020 • Sai Li, T. Tony Cai, Hongzhe Li
This paper considers the estimation and prediction of a high-dimensional linear regression in the setting of transfer learning, using samples from the target model as well as auxiliary samples from different but possibly related regression models.
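The two-step idea (pool informative auxiliary samples for a rough fit, then correct the contrast on the target samples) can be sketched with off-the-shelf Lasso solvers; sample sizes, the sparsity pattern, and the penalty levels below are illustrative, and the paper's aggregation step for unknown informative sets is omitted:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, n_src, n_tgt = 50, 500, 100
beta_tgt = np.zeros(p); beta_tgt[:5] = 1.0
beta_src = beta_tgt.copy(); beta_src[5] = 0.3     # small source-target contrast

Xs = rng.normal(size=(n_src, p)); ys = Xs @ beta_src + rng.normal(size=n_src)
Xt = rng.normal(size=(n_tgt, p)); yt = Xt @ beta_tgt + rng.normal(size=n_tgt)

# Step 1: rough estimate from the pooled source + target samples
w = Lasso(alpha=0.1).fit(np.vstack([Xs, Xt]),
                         np.concatenate([ys, yt])).coef_
# Step 2: correct the contrast using target samples only
delta = Lasso(alpha=0.1).fit(Xt, yt - Xt @ w).coef_
beta_hat = w + delta
print(np.round(beta_hat[:6], 2))
```

When the contrast is sparse and small, the pooled step borrows strength from the larger source sample while the correction step removes the transfer bias.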
no code implementations • 18 Feb 2020 • T. Tony Cai, Hongzhe Li, Rong Ma
Driven by a wide range of applications, many principal subspace estimation problems have been studied individually under different structural constraints.
no code implementations • 24 Jan 2020 • T. Tony Cai, Hongji Wei
Although optimal estimation of a Gaussian mean is relatively simple in the conventional setting, it is quite involved under the communication constraints, both in terms of the optimal procedure design and lower bound argument.
no code implementations • 26 Nov 2019 • Abhishek Chakrabortty, Jiarui Lu, T. Tony Cai, Hongzhe Li
Under mild tail assumptions and arbitrarily chosen (working) models for the propensity score (PS) and the outcome regression (OR) estimators, satisfying only some high-level conditions, we establish finite sample performance bounds for the DDR estimator showing its (optimal) $L_2$ error rate to be $\sqrt{s (\log d)/ n}$ when both models are correct, and its consistency and DR properties when only one of them is correct.
no code implementations • 21 Sep 2019 • T. Tony Cai, Anru R. Zhang, Yuchen Zhou
We study sparse group Lasso for high-dimensional double sparse linear regression, where the parameter of interest is simultaneously element-wise and group-wise sparse.
no code implementations • 7 Jun 2019 • T. Tony Cai, Hongji Wei
In this paper, we study transfer learning in the context of nonparametric classification based on observations from different distributions under the posterior drift model, a general framework that arises in many practical problems.
no code implementations • 12 Feb 2019 • T. Tony Cai, Yichen Wang, Linjun Zhang
By refining the "tracing adversary" technique for lower bounds in the theoretical computer science literature, we formulate a general lower bound argument for minimax risks with differential privacy constraints, and apply this argument to high-dimensional mean estimation and linear regression problems.
1 code implementation • 19 Oct 2018 • Anru R. Zhang, T. Tony Cai, Yihong Wu
A general framework for principal component analysis (PCA) in the presence of heteroskedastic noise is introduced.
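One way to handle heteroskedastic noise, in the spirit of diagonal-deletion methods, is to zero out the contaminated diagonal of the Gram matrix and iteratively re-impute it from a low-rank fit. The sketch below is a hedged illustration of that idea (dimensions, noise levels, and iteration count are arbitrary choices):

```python
import numpy as np

def hetero_pca(S, r, iters=50):
    """Estimate the rank-r principal subspace of a Gram matrix whose
    diagonal is corrupted by heteroskedastic noise variances, by
    iterative diagonal imputation."""
    M = S.copy()
    np.fill_diagonal(M, 0.0)                 # delete the contaminated diagonal
    for _ in range(iters):
        vals, vecs = np.linalg.eigh(M)
        idx = np.argsort(vals)[::-1][:r]
        Mr = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T   # rank-r fit
        np.fill_diagonal(M, np.diag(Mr))     # impute diagonal from the fit
    return vecs[:, idx]

rng = np.random.default_rng(0)
n, d, r = 2000, 30, 2
U = np.linalg.qr(rng.normal(size=(d, r)))[0]            # true subspace
F = rng.normal(size=(n, r)) @ (U * [3.0, 2.0]).T        # low-rank signal
noise = rng.normal(size=(n, d)) * rng.uniform(0.1, 2.0, size=d)
Y = F + noise                                           # heteroskedastic noise
U_hat = hetero_pca(Y.T @ Y / n, r=r)
dist = np.linalg.norm(U_hat @ U_hat.T - U @ U.T)        # projector distance
print(f"subspace distance: {dist:.3f}")
```

Off-diagonal entries of the Gram matrix are unaffected by independent noise variances, which is why deleting and re-imputing the diagonal removes the heteroskedastic bias that plain PCA suffers from.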
no code implementations • 12 Sep 2017 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin
We develop an optimally weighted message passing algorithm to reconstruct labels for SBM based on the minimum energy flow and the eigenvectors of a certain Markov transition matrix.
no code implementations • 23 Jun 2016 • Anru Zhang, Lawrence D. Brown, T. Tony Cai
Estimators are proposed along with corresponding confidence intervals for the population mean.
no code implementations • 21 Apr 2016 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin
In this paper, we study detection and fast reconstruction of the celebrated Watts-Strogatz (WS) small-world random graph model \citep{watts1998collective}, which aims to describe real-world complex networks that exhibit both high clustering and short average path lengths.
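The two defining properties are easy to see empirically; a quick sketch using NetworkX's generator (graph sizes and rewiring probabilities here are arbitrary illustrative choices):

```python
import networkx as nx

# Compare a ring lattice (p=0), a small-world graph (p=0.1),
# and a fully rewired random graph (p=1)
stats = {}
for p in (0.0, 0.1, 1.0):
    G = nx.connected_watts_strogatz_graph(n=500, k=10, p=p, seed=0)
    stats[p] = (nx.average_clustering(G),
                nx.average_shortest_path_length(G))
    print(f"p={p}: clustering={stats[p][0]:.3f}, "
          f"avg path length={stats[p][1]:.2f}")
```

At intermediate rewiring probability the clustering stays close to the lattice value while the average path length collapses toward the random-graph value, which is the small-world phenomenon the detection problem targets.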
no code implementations • 22 Mar 2016 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin
We study the community detection and recovery problem in partially-labeled stochastic block models (SBM).
no code implementations • 16 Feb 2016 • T. Tony Cai, Linjun Zhang
We discuss a clustering method for the Gaussian mixture model based on sparse principal component analysis (SPCA) and compare it with the IF-PCA method.
1 code implementation • 10 Jun 2015 • T. Tony Cai, Xiao-Dong Li, Zongming Ma
This paper considers the noisy sparse phase retrieval problem: recovering a sparse signal $x \in \mathbb{R}^p$ from noisy quadratic measurements $y_j = (a_j' x )^2 + \epsilon_j$, $j=1, \ldots, m$, with independent sub-exponential noise $\epsilon_j$.
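The measurement model can be simulated in a few lines; the spectral initialization below is a standard warm start in the phase retrieval literature, not necessarily the thresholded estimator analyzed in the paper (dimensions and noise scale are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, m, k = 100, 600, 5
x = np.zeros(p); x[:k] = rng.normal(size=k)          # k-sparse signal
A = rng.normal(size=(m, p))
eps = rng.laplace(scale=0.01, size=m)                # sub-exponential noise
y = (A @ x) ** 2 + eps                               # quadratic measurements

# Spectral initialization: top eigenvector of (1/m) sum_j y_j a_j a_j'
Y = (A * y[:, None]).T @ A / m
vals, vecs = np.linalg.eigh(Y)
v = vecs[:, -1]
corr = abs(v @ x) / np.linalg.norm(x)                # |cos angle|, sign ambiguity
print(f"alignment of spectral init with x: {corr:.2f}")
```

Since the sign of $x$ is unidentifiable from quadratic measurements, alignment is measured up to sign.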
no code implementations • 8 Apr 2015 • Tianxi Cai, T. Tony Cai, Anru Zhang
Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering.
no code implementations • 6 Feb 2015 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin
The second threshold, $\sf SNR_s$, captures the statistical boundary, below which no method can succeed with probability going to one in the minimax sense.
no code implementations • 10 Jul 2014 • T. Tony Cai, Ming Yuan
Motivated by a range of applications in engineering and genomics, we consider in this paper detection of very short signal segments in three settings: signals with known shape, arbitrary signals, and smooth signals.
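A simple scan statistic illustrates the detection problem for short segments of arbitrary shape; the window lengths, segment location, and elevation below are illustrative choices, not the paper's optimal procedure:

```python
import numpy as np

def scan_statistic(y, lengths):
    """Maximum over sliding windows of |window sum| / sqrt(window length)."""
    cs = np.concatenate([[0.0], np.cumsum(y)])
    best = 0.0
    for L in lengths:
        sums = cs[L:] - cs[:-L]                  # all window sums of length L
        best = max(best, np.max(np.abs(sums)) / np.sqrt(L))
    return best

rng = np.random.default_rng(0)
n = 5000
null = rng.normal(size=n)                        # pure noise
sig = null.copy()
sig[2000:2010] += 2.0                            # short elevated segment

s_null = scan_statistic(null, [5, 10, 20])
s_sig = scan_statistic(sig, [5, 10, 20])
print(f"null: {s_null:.2f}, with segment: {s_sig:.2f}")
```

Under the null the statistic grows only like $\sqrt{2\log n}$, so even a very short elevated segment pushes it noticeably above the noise level.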
no code implementations • 23 Apr 2014 • T. Tony Cai, Xiao-Dong Li
To the best of the authors' knowledge, this result is the first in the literature on clustering with a fast-growing number of communities under the GSBM in the presence of a portion of arbitrary outlier nodes.
no code implementations • 17 Apr 2014 • T. Tony Cai, Tengyuan Liang, Alexander Rakhlin
This paper presents a unified geometric framework for the statistical analysis of a general ill-posed linear inverse model which includes as special cases noisy compressed sensing, sign vector recovery, trace regression, orthogonal matrix estimation, and noisy matrix completion.
no code implementations • 22 Oct 2013 • T. Tony Cai, Anru Zhang
In this paper, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case.
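A key building block of nuclear norm minimization solvers is singular value thresholding, the proximal operator of the nuclear norm; a minimal denoising sketch (the full constrained estimator under rank-one projection measurements is omitted, and the threshold choice below is a common rule of thumb rather than the paper's tuning):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
L = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 50))    # rank-3 matrix
sigma = 0.5
noisy = L + sigma * rng.normal(size=(50, 50))

# Threshold at roughly the operator norm of the noise, ~ 2*sigma*sqrt(n)
denoised = svt(noisy, tau=2 * sigma * np.sqrt(50))
rel_err = np.linalg.norm(denoised - L) / np.linalg.norm(L)
print(f"relative recovery error: {rel_err:.2f}")
```

Thresholding at the noise's operator-norm scale kills the noise singular values while only shrinking the much larger signal singular values.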
no code implementations • 24 Sep 2013 • T. Tony Cai, Wen-Xin Zhou
The rate of convergence for the estimate is obtained.
no code implementations • 5 Jun 2013 • T. Tony Cai, Anru Zhang
It is shown that for any given constant $t\ge {4/3}$, in compressed sensing $\delta_{tk}^A < \sqrt{(t-1)/t}$ guarantees the exact recovery of all $k$ sparse signals in the noiseless case through the constrained $\ell_1$ minimization, and similarly in affine rank minimization $\delta_{tr}^\mathcal{M}< \sqrt{(t-1)/t}$ ensures the exact reconstruction of all matrices with rank at most $r$ in the noiseless case via the constrained nuclear norm minimization.
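The constrained $\ell_1$ minimization (basis pursuit) in the noiseless case is a linear program; a minimal sketch via `scipy.optimize.linprog`, splitting $x = u - v$ with $u, v \ge 0$ (dimensions and sparsity level are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
p, m, k = 60, 30, 3
x0 = np.zeros(p)
x0[rng.choice(p, k, replace=False)] = rng.normal(size=k)   # k-sparse signal
A = rng.normal(size=(m, p)) / np.sqrt(m)                   # Gaussian sensing matrix
b = A @ x0                                                 # noiseless measurements

# min ||x||_1  s.t.  Ax = b, as an LP in (u, v) with x = u - v, u, v >= 0
c = np.ones(2 * p)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
x_hat = res.x[:p] - res.x[p:]
print(f"recovery error: {np.linalg.norm(x_hat - x0):.2e}")
```

Gaussian matrices of this aspect ratio satisfy the restricted isometry condition with high probability, so the LP recovers the sparse signal exactly here.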
no code implementations • 2 Mar 2013 • T. Tony Cai, Wen-Xin Zhou
Matrix completion has been well studied under the uniform sampling model and the trace-norm regularized methods perform well both theoretically and numerically in such a setting.