no code implementations • 5 Feb 2024 • Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski
Recently, several adjustments to marginal permutation importance that utilize feature knockoffs have been proposed to address this issue, such as the variable importance measure known as conditional predictive impact (CPI).
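For context, marginal permutation importance scores a feature by the loss increase after shuffling its column; knockoff-based variants such as CPI instead substitute a knockoff copy that preserves the feature's dependence on the others. A minimal sketch of the marginal variant only (the `model`, data, and `loss` are placeholders, not the paper's setup):

```python
import numpy as np

def permutation_importance(model, X, y, loss, n_repeats=10, rng=None):
    """Marginal permutation importance: average loss increase after
    shuffling one feature column at a time."""
    rng = np.random.default_rng(rng)
    base = loss(y, model.predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            # CPI-style methods would replace this shuffle with a
            # knockoff column for feature j instead.
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += loss(y, model.predict(Xp)) - base
    return scores / n_repeats
```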
no code implementations • 1 Jan 2024 • Xin Chen, Jason M. Klusowski
This paper introduces an iterative algorithm for training additive models that enjoys favorable memory storage and computational requirements.
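The excerpt does not specify the algorithm, but a classical iterative scheme for additive models is backfitting: cycle over coordinates, fitting each component to the partial residual left by the others. A minimal sketch under that assumption, with the one-dimensional `smoother` left as a placeholder:

```python
import numpy as np

def backfit(X, y, smoother, n_iter=20):
    """Backfitting for y ~ alpha + sum_j f_j(X[:, j]).
    `smoother(x, r)` returns fitted values of a 1-D fit of r on x."""
    n, d = X.shape
    alpha = y.mean()
    F = np.zeros((n, d))                   # current component fits f_j
    for _ in range(n_iter):
        for j in range(d):
            r = y - alpha - F.sum(axis=1) + F[:, j]   # partial residual
            F[:, j] = smoother(X[:, j], r)
            F[:, j] -= F[:, j].mean()      # center for identifiability
    return alpha, F
```

Only one column of fits needs updating per step, which is where the favorable memory and computation profile of such schemes comes from.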
no code implementations • 15 Oct 2023 • Matias D. Cattaneo, Jason M. Klusowski, William G. Underwood
Random forests are popular methods for classification and regression, and many different variants have been proposed in recent years.
no code implementations • 6 Oct 2023 • Jianqing Fan, Cheng Gao, Jason M. Klusowski
This paper addresses challenges in robust transfer learning stemming from ambiguity in Bayes classifiers and weak transferable signals between the target and source distributions.
no code implementations • 18 Sep 2023 • Xin Chen, Jason M. Klusowski, Yan Shuo Tan
In this paper, we learn these weights analogously by minimizing an estimate of the population risk subject to a nonnegativity constraint.
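Read literally, minimizing an empirical risk over nonnegative weights is a nonnegatively constrained least-squares problem. A minimal sketch using SciPy's NNLS solver (the prediction matrix `P` is a placeholder for whatever is being weighted):

```python
import numpy as np
from scipy.optimize import nnls

def fit_nonnegative_weights(P, y):
    """Minimize ||P w - y||^2 over w >= 0, where column k of P holds
    the predictions of the k-th base estimator on the training data."""
    w, residual_norm = nnls(P, y)
    return w
```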
no code implementations • 31 Aug 2023 • Matias D. Cattaneo, Jason M. Klusowski, Boris Shigida
In previous literature, backward error analysis was used to find ordinary differential equations (ODEs) approximating the gradient descent trajectory.
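For reference, the standard first-order correction from backward error analysis (a known result, stated here as context rather than this paper's contribution): gradient descent with step size $ h $, $ \theta_{k+1} = \theta_k - h\nabla f(\theta_k) $, is tracked to higher order than by the plain gradient flow by the modified ODE

```latex
\dot\theta \;=\; -\nabla\Bigl( f(\theta) + \tfrac{h}{4}\,\bigl\|\nabla f(\theta)\bigr\|^2 \Bigr),
```

whose extra term penalizes the squared gradient norm along the trajectory.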
no code implementations • 15 Jul 2023 • Jason M. Klusowski, Jonathan W. Siegel
We study the fundamental limits of matching pursuit, or the pure greedy algorithm, for approximating a target function by a sparse linear combination of elements from a dictionary.
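A minimal sketch of matching pursuit under the usual setup (dictionary columns normalized to unit norm; names are illustrative):

```python
import numpy as np

def matching_pursuit(D, f, n_steps):
    """Pure greedy algorithm: at each step pick the dictionary atom
    most correlated with the residual and subtract its projection.
    D has unit-norm columns; f is the target vector."""
    r = f.copy()
    coef = np.zeros(D.shape[1])
    for _ in range(n_steps):
        k = np.argmax(np.abs(D.T @ r))   # best-matching atom
        c = D[:, k] @ r                  # projection coefficient
        coef[k] += c
        r -= c * D[:, k]
    return coef, r
```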
no code implementations • 19 Nov 2022 • Matias D. Cattaneo, Jason M. Klusowski, Peter M. Tian
Decision tree learning is increasingly being used for pointwise inference.
no code implementations • 28 Apr 2021 • Jason M. Klusowski, Peter M. Tian
This paper shows that decision trees constructed with Classification and Regression Trees (CART) and C4.5 methodology are consistent for regression and classification tasks, even when the number of predictor variables grows sub-exponentially with the sample size, under natural $ \ell^0 $ and $ \ell^1 $ sparsity constraints.
no code implementations • 5 Nov 2020 • Jason M. Klusowski, Peter M. Tian
Decision trees and their ensembles are endowed with a rich set of diagnostic tools for ranking and screening variables in a predictive model.
no code implementations • 22 Jun 2020 • Ryan Theisen, Jason M. Klusowski, Michael W. Mahoney
Inspired by the statistical mechanics approach to learning, we formally define and develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers from several model classes.
no code implementations • NeurIPS 2020 • Jason M. Klusowski
In doing so, we find that the training error is governed by the Pearson correlation between the optimal decision stump and response data in each node, which we bound by constructing a prior distribution on the split points and solving a nonlinear optimization problem.
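As a sketch of the sort of identity involved (standard least-squares algebra, not necessarily the paper's exact statement): letting $ R(t) $ denote the sum of squared errors in node $ t $ and $ \hat y $ the fitted values of the best decision stump there, the split leaves

```latex
R(t_L) + R(t_R) \;=\; \bigl(1 - \hat\rho^{\,2}(\hat y, Y)\bigr)\, R(t),
```

so the relative error reduction is exactly the squared Pearson correlation between the stump and the response within the node.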
no code implementations • 22 Oct 2019 • Ryan Theisen, Jason M. Klusowski, Huan Wang, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
Classical results on the statistical complexity of linear models have commonly identified the norm of the weights $\|w\|$ as a fundamental capacity measure.
no code implementations • 24 Jun 2019 • Jason M. Klusowski
For binary classification and regression models, this approach recursively divides the data into two near-homogeneous daughter nodes according to a split point that maximizes the reduction in the sum of squared errors (the impurity) along a particular variable.
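A minimal sketch of that split search for a single variable (an exhaustive scan over candidate thresholds; production implementations use incremental updates):

```python
import numpy as np

def best_split(x, y):
    """Scan split points on one variable, returning the threshold that
    maximizes the reduction in sum of squared errors (the impurity)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(ys)
    total_sse = np.sum((ys - ys.mean()) ** 2)
    best_gain, best_thresh = 0.0, None
    for i in range(1, n):
        if xs[i] == xs[i - 1]:
            continue                       # no valid threshold here
        left, right = ys[:i], ys[i:]
        sse = (np.sum((left - left.mean()) ** 2)
               + np.sum((right - right.mean()) ** 2))
        gain = total_sse - sse
        if gain > best_gain:
            best_gain = gain
            best_thresh = 0.5 * (xs[i] + xs[i - 1])
    return best_thresh, best_gain
```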
no code implementations • 2 Feb 2019 • Andrew R. Barron, Jason M. Klusowski
For any ReLU network there is a representation in which the sum of the absolute values of the weights into each node is exactly $1$, and the input layer variables are multiplied by a value $V$ coinciding with the total variation of the path weights.
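The normalization rests on the positive homogeneity of the ReLU, $ \phi(au) = a\,\phi(u) $ for $ a \ge 0 $. In the single-hidden-layer case, for instance,

```latex
\sum_{k} c_k\, \phi\bigl(w_k^\top x - b_k\bigr)
= \sum_{k} c_k \,\|w_k\|_1\, \phi\Bigl(\tfrac{w_k^\top x - b_k}{\|w_k\|_1}\Bigr),
\qquad
V = \sum_{k} |c_k|\,\|w_k\|_1,
```

so the inner weights can be taken to sum to one in absolute value while $ V $ collects the total variation of the path weights.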
no code implementations • 10 Sep 2018 • Andrew R. Barron, Jason M. Klusowski
It has been experimentally observed in recent years that multi-layer artificial neural networks have a surprising ability to generalize, even when trained with far more parameters than observations.
no code implementations • 7 May 2018 • Jason M. Klusowski
Random forests have become an important tool for improving accuracy in regression and classification problems since their inception by Leo Breiman in 2001.
no code implementations • 21 Feb 2018 • Jason M. Klusowski, Yihong Wu
Applied researchers often construct a network from a random sample of nodes in order to infer properties of the parent network.
no code implementations • 12 Jan 2018 • Jason M. Klusowski, Yihong Wu
Learning properties of large graphs from samples has been an important problem in statistical network analysis since the early work of Goodman (1949) and Frank (1978).
no code implementations • 29 Dec 2017 • W. D. Brinda, Jason M. Klusowski
The MDL two-part coding $ \textit{index of resolvability} $ provides a finite-sample upper bound on the statistical risk of penalized likelihood estimators over countable models.
no code implementations • 26 Apr 2017 • Jason M. Klusowski, Dana Yang, W. D. Brinda
We also show that the population EM operator for mixtures of two regressions is anti-contractive from the target parameter vector if the cosine of the angle between the input parameter vector (the current iterate) and the target parameter vector is too small, thereby establishing the necessity of our conic condition.
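For the symmetric two-component model $ y = R\,\langle \theta^{\star}, x\rangle + \varepsilon $ with $ R = \pm 1 $ equally likely, the sample EM operator has a closed form; a minimal sketch under that assumption (the noise scale `sigma` and the direct solve are illustrative choices):

```python
import numpy as np

def em_step(theta, X, y, sigma=1.0):
    """One EM update for a symmetric mixture of two linear regressions:
    soft-assign each point's sign with a tanh weight (E-step), then
    solve the weighted normal equations (M-step)."""
    w = np.tanh((X @ theta) * y / sigma**2)   # posterior label balance
    return np.linalg.solve(X.T @ X, X.T @ (w * y))
```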
no code implementations • 9 Feb 2017 • Jason M. Klusowski, Andrew R. Barron
Estimation of functions of $ d $ variables is considered using ridge combinations of the form $ \textstyle\sum_{k=1}^m c_{1, k} \phi(\textstyle\sum_{j=1}^d c_{0, j, k}x_j-b_k) $ where the activation function $ \phi $ has bounded value and bounded derivative.
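Such a ridge combination is a single-hidden-layer network with activation $ \phi $; a minimal sketch of its evaluation (shapes and names are illustrative, with `np.tanh` standing in for a bounded activation with bounded derivative):

```python
import numpy as np

def ridge_combination(x, C0, b, c1, phi=np.tanh):
    """Evaluate f(x) = sum_k c1[k] * phi(C0[:, k] @ x - b[k]).
    C0 is d-by-m (inner weights); b and c1 have length m."""
    return c1 @ phi(C0.T @ x - b)
```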
no code implementations • 7 Aug 2016 • Jason M. Klusowski, W. D. Brinda
In that method, the basin of attraction for valid initialization is required to be a ball around the truth.
no code implementations • 26 Jul 2016 • Jason M. Klusowski, Andrew R. Barron
We establish $ L^{\infty} $ and $ L^2 $ error bounds for functions of many variables that are approximated by linear combinations of ReLU (rectified linear unit) and squared ReLU ridge functions with $ \ell^1 $ and $ \ell^0 $ controls on their inner and outer parameters.
no code implementations • 5 Jul 2016 • Jason M. Klusowski, Andrew R. Barron
On the other hand, if the candidate fits are chosen from a discretization, we show that $ \mathbb{E}\|\hat{f} - f^{\star} \|^2 \leq \left(v^3_{f^{\star}}\frac{\log d}{n}\right)^{2/5} $.