Search Results for author: Arseny Moskvichev

Found 5 papers, 3 papers with code

Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks

no code implementations • 14 Nov 2023 • Melanie Mitchell, Alessandro B. Palmarini, Arseny Moskvichev

We explore the abstract reasoning abilities of text-only and multimodal versions of GPT-4, using the ConceptARC benchmark [10], which is designed to evaluate robust understanding and reasoning with core-knowledge concepts.

NarrativeXL: A Large-scale Dataset For Long-Term Memory Models

1 code implementation • 23 May 2023 • Arseny Moskvichev, Ky-Vinh Mai

We show that our questions 1) adequately represent the source material, 2) can be used to diagnose a model's memory capacity, and 3) are not trivial for modern language models even when the memory demand does not exceed those models' context lengths.

Multiple-choice Reading Comprehension +1

The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain

1 code implementation • 11 May 2023 • Arseny Moskvichev, Victor Vikram Odouard, Melanie Mitchell

In this paper we describe an in-depth evaluation benchmark for the Abstraction and Reasoning Corpus (ARC), a collection of few-shot abstraction and analogy problems developed by Chollet [2019].

Updater-Extractor Architecture for Inductive World State Representations

no code implementations • 12 Apr 2021 • Arseny Moskvichev, James A. Liu

In this paper, we propose a novel transformer-based Updater-Extractor architecture and a training procedure that can work with sequences of arbitrary length and refine its knowledge about the world based on linguistic inputs.

LEMMA

Reinforcement Communication Learning in Different Social Network Structures

1 code implementation • ICML Workshop LaReL 2020 • Marina Dubova, Arseny Moskvichev, Robert Goldstone

We examined the effects of social network organization on the properties of communication systems emerging in decentralized, multi-agent reinforcement learning communities.

Multi-agent Reinforcement Learning
