no code implementations • NAACL (DADC) 2022 • Venelin Kovatchev, Trina Chatterjee, Venkata S Govindarajan, Jifan Chen, Eunsol Choi, Gabriella Chronis, Anubrata Das, Katrin Erk, Matthew Lease, Junyi Jessy Li, Yating Wu, Kyle Mahowald
Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability.
no code implementations • 31 Mar 2024 • Venelin Kovatchev, Matthew Lease
In this paper we present exploratory research on quantifying the impact that data distribution has on the performance and evaluation of NLP models.
no code implementations • 29 Jan 2024 • Terrence Neumann, Sooyong Lee, Maria De-Arteaga, Sina Fazelpour, Matthew Lease
We pose two central questions: (1) To what extent do prompts with explicit gender references reflect gender differences in opinion in the United States on topics of social relevance?
1 code implementation • 20 Dec 2023 • Alexander Braylan, Madalyn Marabella, Omar Alonso, Matthew Lease
Beyond investigating these research questions above, we discuss the foundational concept of annotation complexity, present a new aggregation model as a bridge between traditional models and our own, and contribute a new semi-supervised learning method for complex label aggregation that outperforms prior work.
no code implementations • 21 Nov 2023 • Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš, Ahmed Alaa, Adji Bousso Dieng, Natasha Noy, Vijay Janapa Reddi, James Zou, Praveen Paritosh, Mihaela van der Schaar, Kurt Bollacker, Lora Aroyo, Ce Zhang, Joaquin Vanschoren, Isabelle Guyon, Peter Mattson
Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science.
no code implementations • 15 Nov 2023 • Yiheng Su, Junyi Jessy Li, Matthew Lease
Can we preserve the accuracy of neural models while also providing faithful explanations?
no code implementations • 14 Aug 2023 • Houjiang Liu, Anubrata Das, Alexander Boltz, Didi Zhou, Daisy Pinaroc, Matthew Lease, Min Kyung Lee
While many Natural Language Processing (NLP) techniques have been proposed for fact-checking, both academic research and fact-checking organizations report limited adoption of such NLP work due to poor alignment with fact-checker practices, values, and needs.
1 code implementation • 31 May 2023 • Vijay Keswani, L. Elisa Celis, Krishnaram Kenthapadi, Matthew Lease
Instead, we find ourselves in a "closed" decision-making loop in which the same fallible human decisions we rely on in practice must also be used to guide task allocation.
1 code implementation • 14 Feb 2023 • Soumyajit Gupta, Sooyong Lee, Maria De-Arteaga, Matthew Lease
We propose framing toxicity detection as multi-task learning (MTL), allowing a model to specialize on the relationships that are relevant to each demographic group while also leveraging shared properties across groups.
no code implementations • 6 Feb 2023 • Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Wei Sun, Min Kyung Lee, Matthew Lease
We then extend our approach to leverage opportunities and mitigate risks that arise in important contexts in practice: 1) when a team is composed of multiple humans with differential and potentially complementary abilities, 2) when the observational data includes consistent deterministic actions, and 3) when the covariate distribution of future decisions differs from that in the historical data.
1 code implementation • 19 Jan 2023 • Mehmet Deniz Türkmen, Matthew Lease, Mucahid Kutlu
In addition, we show that our metrics achieve higher evaluation stability and discriminative power than the standard metrics we modify.
no code implementations • 8 Jan 2023 • Anubrata Das, Houjiang Liu, Venelin Kovatchev, Matthew Lease
We recommend that future research include collaboration with fact-checker stakeholders early on in NLP research, as well as incorporation of human-centered design practices in model development, in order to further guide technology development for human use and practical adoption.
1 code implementation • 15 Dec 2022 • Alexander Braylan, Omar Alonso, Matthew Lease
When annotators label data, a key metric for quality assurance is inter-annotator agreement (IAA): the extent to which annotators agree on their labels.
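A standard instance of IAA for two annotators assigning categorical labels is Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal illustrative implementation (a generic sketch, not tied to this paper's models):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Expected agreement if each annotator labeled at random
    # according to their own label frequencies.
    expected = sum(freq_a[c] * freq_b[c]
                   for c in set(labels_a) | set(labels_b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Two annotators labeling six items:
a = ["pos", "pos", "neg", "neg", "pos", "neg"]
b = ["pos", "neg", "neg", "neg", "pos", "neg"]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

Values near 1 indicate strong agreement; values near 0 indicate agreement no better than chance.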
no code implementations • 15 Apr 2022 • Venelin Kovatchev, Soumyajit Gupta, Anubrata Das, Matthew Lease
In this work, we first introduce a differentiable measure that enables direct optimization of group fairness (specifically, balancing accuracy across groups) in model training.
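To illustrate what a differentiable group-fairness objective can look like (a generic sketch, not the specific measure proposed in the paper), one can replace hard 0/1 accuracy with a smooth surrogate — the mean probability assigned to the gold class — compute it per group, and penalize the squared gap between groups:

```python
import numpy as np

def soft_accuracy(logits, labels):
    """Smooth surrogate for accuracy: mean softmax probability
    assigned to the gold class (differentiable, unlike 0/1 accuracy)."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs[np.arange(len(labels)), labels].mean()

def group_gap_penalty(logits, labels, groups):
    """Squared gap between the soft accuracies of two groups."""
    g0, g1 = (groups == 0), (groups == 1)
    return (soft_accuracy(logits[g0], labels[g0])
            - soft_accuracy(logits[g1], labels[g1])) ** 2

# Toy batch: group 0 gets confident, correct predictions;
# group 1 gets uninformative (uniform) predictions.
logits = np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
labels = np.array([0, 0, 0, 0])
groups = np.array([0, 0, 1, 1])
print(round(group_gap_penalty(logits, labels, groups), 3))
```

Because every operation here is smooth, the same computation expressed in an autodiff framework yields gradients and can be added directly to a training loss.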
1 code implementation • ACL 2022 • Anubrata Das, Chitrank Gupta, Venelin Kovatchev, Matthew Lease, Junyi Jessy Li
We present ProtoTEx, a novel white-box NLP classification architecture based on prototype networks.
1 code implementation • 17 Feb 2022 • Li Shi, Nilavra Bhattacharya, Anubrata Das, Matthew Lease, Jacek Gwizdka
We conducted a lab-based eye-tracking study to investigate how the interactivity of an AI-powered fact-checking system affects user interactions, such as dwell time, attention, and mental resources involved in using the system.
1 code implementation • 9 Feb 2022 • Vijay Keswani, Matthew Lease, Krishnaram Kenthapadi
Our key insight is that by exploiting weak prior information, we can match experts to input examples to ensure fairness and accuracy of the resulting deferral framework, even when imperfect and biased experts are used in place of ground truth labels.
no code implementations • 4 Dec 2021 • Vivek Krishna Pradhan, Mike Schaekermann, Matthew Lease
We propose a novel three-stage FIND-RESOLVE-LABEL workflow for crowdsourced annotation to reduce ambiguity in task instructions and thus improve annotation quality.
no code implementations • 19 Nov 2021 • Lora Aroyo, Matthew Lease, Praveen Paritosh, Mike Schaekermann
The efficacy of machine learning (ML) models depends on both algorithms and data.
no code implementations • 28 Oct 2021 • Soumyajit Gupta, Gurpreet Singh, Raghu Bollapragada, Matthew Lease
Multi-objective optimization (MOO) problems require balancing competing objectives, often under constraints.
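A classical baseline for such problems is weighted-sum scalarization: collapse the objectives into a single weighted objective and sweep the weights to trace out Pareto-optimal trade-offs. A minimal sketch with two toy convex objectives (illustrative only; the paper pursues a different, learning-based approach):

```python
import numpy as np

def scalarize(x, weights, objectives):
    """Combine several objectives into one weighted objective."""
    return sum(w * f(x) for w, f in zip(weights, objectives))

# Two competing objectives over a scalar decision variable:
# f1 prefers x = 1, f2 prefers x = -1.
f1 = lambda x: (x - 1.0) ** 2
f2 = lambda x: (x + 1.0) ** 2

# Sweep the weights and minimize on a grid to trace the trade-off.
pareto = []
for w in np.linspace(0.0, 1.0, 5):
    xs = np.linspace(-2, 2, 4001)
    vals = [scalarize(x, (w, 1 - w), (f1, f2)) for x in xs]
    x_star = xs[int(np.argmin(vals))]
    pareto.append((round(x_star, 2), round(f1(x_star), 2), round(f2(x_star), 2)))

for point in pareto:
    print(point)  # (x*, f1(x*), f2(x*)) moves smoothly between the two optima
```

A known limitation motivating alternative characterizations (such as optimality conditions) is that weighted sums can only recover the convex portion of a Pareto front.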
no code implementations • 29 Sep 2021 • Soumyajit Gupta, Gurpreet Singh, Matthew Lease
The Stage-1 neural network efficiently extracts the weak Pareto front, using Fritz-John Conditions (FJC) as the discriminator, with no assumptions of convexity on the objectives or constraints.
no code implementations • 20 Sep 2021 • Prakhar Singh, Anubrata Das, Junyi Jessy Li, Matthew Lease
Fact-checking is the process of evaluating the veracity of claims (i.e., purported facts).
1 code implementation • 17 Jun 2021 • Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, Matthew Lease
Our key insight is that the rarity and subjectivity of hate speech are akin to that of relevance in information retrieval (IR).
1 code implementation • 25 Feb 2021 • Vijay Keswani, Matthew Lease, Krishnaram Kenthapadi
Machine learning models are often implemented in concert with humans in the pipeline, with the model having an option to defer to a domain expert in cases where it has low confidence in its inference.
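The deferral mechanism described above can be sketched as a simple confidence-threshold router (the threshold value here is an arbitrary choice for illustration, not from the paper):

```python
def route(probs, threshold=0.8):
    """Defer to a human expert when the model's top-class
    probability falls below the confidence threshold."""
    confidence = max(probs)
    return "model" if confidence >= threshold else "expert"

print(route([0.95, 0.05]))  # confident prediction: the model handles it
print(route([0.55, 0.45]))  # uncertain prediction: defer to the expert
```

As the paper notes, this scheme inherits the expert's errors and biases whenever it defers, which is what motivates learning to whom to defer rather than deferring blindly.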
no code implementations • 27 Jan 2021 • Gurpreet Singh, Soumyajit Gupta, Matthew Lease, Clint Dawson
The first stage (neural network) efficiently extracts a weak Pareto front, using Fritz-John conditions as the discriminator, with no assumptions of convexity on the objectives or constraints.
no code implementations • 24 Dec 2020 • Md Mustafizur Rahman, Mucahid Kutlu, Matthew Lease
Research community evaluations in information retrieval, such as NIST's Text REtrieval Conference (TREC), build reusable test collections by pooling document rankings submitted by many teams.
no code implementations • 16 Dec 2020 • Prateek Chaudhry, Matthew Lease
Hate speech detection research has predominantly focused on purely content-based methods, without exploiting any additional context.
no code implementations • 27 Oct 2020 • Gurpreet Singh, Soumyajit Gupta, Matthew Lease, Clint Dawson
Although these methods are claimed to be applicable to scientific computations due to associated tail-energy error bounds, the approximation errors in the singular vectors and values are high when the aforementioned assumption does not hold.
no code implementations • 13 Sep 2020 • Gurpreet Singh, Soumyajit Gupta, Matthew Lease
However, such an approach is often restricted to a strict class of functions, deviation from which results in a sub-optimal solution to the original problem.
1 code implementation • 22 Jul 2019 • Anubrata Das, Matthew Lease
While search efficacy has been evaluated traditionally on the basis of result relevance, fairness of search has attracted recent attention.
1 code implementation • 8 Jul 2019 • Anubrata Das, Kunjan Mehta, Matthew Lease
The effect of user bias in fact-checking has not been explored extensively from a user-experience perspective.
no code implementations • 17 Jan 2018 • Md Mustafizur Rahman, Mucahid Kutlu, Tamer Elsayed, Matthew Lease
To create a new IR test collection at low cost, it is valuable to carefully select which documents merit human relevance judgments.
1 code implementation • ACL 2017 • An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease
Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text.
no code implementations • ACL 2017 • Ye Zhang, Matthew Lease, Byron C. Wallace
A fundamental advantage of neural models for NLP is their ability to learn representations from scratch.
no code implementations • 18 Nov 2016 • Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek Khetan, Tyler McDonnell, An Thanh Nguyen, Dan Xu, Byron C. Wallace, Matthew Lease
A recent "third wave" of Neural Network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing.
1 code implementation • 14 Jun 2016 • Ye Zhang, Matthew Lease, Byron C. Wallace
We also show that, as expected, the method quickly learns discriminative word embeddings.
1 code implementation • 20 Apr 2014 • Ethan Petuchowski, Matthew Lease
TurKontrol, an algorithm presented in (Dai et al., 2010), uses a POMDP to model and control an iterative workflow for crowdsourced work.