Search Results for author: Manuel R. Ciosici

Found 14 papers, 5 papers with code

One of these words is not like the other: a reproduction of outlier identification using non-contextual word representations

1 code implementation • EMNLP (Eval4NLP) 2020 • Jesper Brink Andersen, Mikkel Bak Bertelsen, Mikkel Hørby Schou, Manuel R. Ciosici, Ira Assent

The data set is expanded to contain semantic and syntactic tests and is multilingual (English, German, and Italian).

Word Embeddings

Remember what you did so you know what to do next

no code implementations • 30 Oct 2023 • Manuel R. Ciosici, Alex Hedges, Yash Kankanampati, Justin Martin, Marjorie Freedman, Ralph Weischedel

In work contemporaneous with ours, Lin et al. (2023) demonstrated a two-part approach (SwiftSage) that uses a small LLM (T5-large) complemented by OpenAI's massive LLMs to achieve outstanding results in ScienceWorld.

Language Modelling, Large Language Model +1

Efficient Methods for Natural Language Processing: A Survey

no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz

Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.

Information Retrieval, Open-Domain Question Answering

Training a T5 Using Lab-sized Resources

no code implementations • 25 Aug 2022 • Manuel R. Ciosici, Leon Derczynski

Training large neural language models on large datasets is resource- and time-intensive.

Language Modelling, Large Language Model

Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA

no code implementations • 4 Oct 2021 • Manuel R. Ciosici, Joe Cecil, Alex Hedges, Dong-Ho Lee, Marjorie Freedman, Ralph Weischedel

Our goal is to deliver a new task and leaderboard to stimulate research on question answering and pre-trained language models (PTLMs) to understand a significant instructional document, e.g., an introductory college textbook or a manual.

Question Answering

A reproduction of Apple's bi-directional LSTM models for language identification in short strings

1 code implementation • EACL 2021 • Mads Toftrup, Søren Asger Sørensen, Manuel R. Ciosici, Ira Assent

Language Identification is the task of identifying a document's language.

Ranked #1 on Language Identification on OpenSubtitles

Language Identification

Machine-Assisted Script Curation

1 code implementation • NAACL 2021 • Manuel R. Ciosici, Joseph Cummings, Mitchell DeHaven, Alex Hedges, Yash Kankanampati, Dong-Ho Lee, Ralph Weischedel, Marjorie Freedman

We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring.

The Danish Gigaword Project

no code implementations • 7 May 2020 • Leon Strømberg-Derczynski, Manuel R. Ciosici, Rebekah Baglini, Morten H. Christiansen, Jacob Aarup Dalsgaard, Riccardo Fusaroli, Peter Juel Henrichsen, Rasmus Hvingelby, Andreas Kirkedal, Alex Speed Kjeldsen, Claus Ladefoged, Finn Årup Nielsen, Malte Lau Petersen, Jonathan Hvithamar Rystrøm, Daniel Varab

Danish language technology has been hindered by a lack of broad-coverage corpora at the scale modern NLP prefers.

Accelerated High-Quality Mutual-Information Based Word Clustering

1 code implementation • LREC 2020 • Manuel R. Ciosici, Ira Assent, Leon Derczynski

We present efficient implementations of Brown clustering and the alternative Exchange clustering as well as a number of methods to accelerate the computation of both hierarchical and flat clusters.

Clustering, Vocal Bursts Intensity Prediction

CRAFT Shared Tasks 2019 Overview --- Integrated Structure, Semantics, and Coreference

no code implementations • WS 2019 • William Baumgartner, Michael Bada, Sampo Pyysalo, Manuel R. Ciosici, Negacy Hailu, Harrison Pielke-Lombardo, Michael Regan, Lawrence Hunter

As part of the BioNLP Open Shared Tasks 2019, the CRAFT Shared Tasks 2019 provides a platform to gauge the state of the art for three fundamental language processing tasks (dependency parse construction, coreference resolution, and ontology concept identification) over full-text biomedical articles.

coreference-resolution, Dependency Parsing +2

Quantifying the morphosyntactic content of Brown Clusters

no code implementations • NAACL 2019 • Manuel R. Ciosici, Leon Derczynski, Ira Assent

We show that increases in Average Mutual Information, the clustering algorithms' optimization goal, are highly correlated with improvements in encoding of morphosyntactic information.

Clustering

Abbreviation Explorer - an interactive system for pre-evaluation of Unsupervised Abbreviation Disambiguation

no code implementations • NAACL 2019 • Manuel R. Ciosici, Ira Assent

We present Abbreviation Explorer, a system that supports interactive exploration of abbreviations that are challenging for Unsupervised Abbreviation Disambiguation (UAD).

Abbreviation Expander - a Web-based System for Easy Reading of Technical Documents

no code implementations • COLING 2018 • Manuel R. Ciosici, Ira Assent

Abbreviations and acronyms are a part of textual communication in most domains.

Improving Quality of Hierarchical Clustering for Large Data Series

1 code implementation • 3 Aug 2016 • Manuel R. Ciosici

Because of its ability to produce high-quality, human-understandable clusters, Brown clustering has seen high uptake in the NLP research community, where it is used in the preprocessing and feature generation steps.

Clustering
