1 code implementation • SIGDIAL (ACL) 2021 • Prasanna Parthasarathi, Joelle Pineau, Sarath Chandar
Predicting the next utterance in dialogue is contingent on encoding users' input text to generate an appropriate and relevant response in data-driven approaches.
1 code implementation • Findings (EMNLP) 2021 • Prasanna Parthasarathi, Koustuv Sinha, Joelle Pineau, Adina Williams
Rapid progress in Neural Machine Translation (NMT) systems over the last few years has focused primarily on improving translation quality and, as a secondary focus, on improving robustness to perturbations (e.g., spelling).
no code implementations • 27 Feb 2024 • Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan
To understand their risks of misuse, we design a risk assessment framework for analyzing their marginal risk.
1 code implementation • 23 Oct 2022 • Koustuv Sinha, Amirhossein Kazemnejad, Siva Reddy, Joelle Pineau, Dieuwke Hupkes, Adina Williams
Transformer language models encode the notion of word order using positional information.
1 code implementation • 21 Jun 2022 • Devendra Singh Sachan, Mike Lewis, Dani Yogatama, Luke Zettlemoyer, Joelle Pineau, Manzil Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
1 code implementation • 15 Apr 2022 • Devendra Singh Sachan, Mike Lewis, Mandar Joshi, Armen Aghajanyan, Wen-tau Yih, Joelle Pineau, Luke Zettlemoyer
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering.
3 code implementations • ICLR 2022 • Lucas Caccia, Rahaf Aljundi, Nader Asadi, Tinne Tuytelaars, Joelle Pineau, Eugene Belilovsky
In this work, we focus on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream, and new classes must be distinguished from previous ones.
no code implementations • 28 Feb 2022 • Martin Cousineau, Vedat Verter, Susan A. Murphy, Joelle Pineau
In the absence of randomized controlled and natural experiments, it is necessary to balance the distributions of (observable) covariates of the treated and control groups in order to obtain an unbiased estimate of a causal effect of interest; otherwise, a different effect size may be estimated, and incorrect recommendations may be given.
no code implementations • 14 Feb 2022 • Annie Xie, Shagun Sodhani, Chelsea Finn, Joelle Pineau, Amy Zhang
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
no code implementations • 5 Jan 2022 • Anthony GX-Chen, Veronica Chelu, Blake A. Richards, Joelle Pineau
We illustrate that incorporating predictive knowledge through an $\eta\gamma$-discounted SF model makes more efficient use of sampled experience, compared to either extreme, i.e., bootstrapping entirely on the value function estimate, or bootstrapping on the product of separately estimated successor features and instantaneous reward models.
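The successor-feature decomposition this entry bootstraps on can be sketched in a few lines; the tiny chain MDP, one-hot features, and fixed-point iteration below are illustrative assumptions, not the paper's setup.

```python
# Value via successor features: V(s) = psi(s) . w, where psi(s) accumulates
# discounted expected future features and w maps features to rewards.
# The 3-state chain and one-hot features are illustrative assumptions.

gamma = 0.9
nxt = {0: 1, 1: 2, 2: 2}  # deterministic chain; s2 is absorbing
phi = {s: [1.0 if i == s else 0.0 for i in range(3)] for s in range(3)}
w = [0.0, 0.0, 1.0]       # reward = phi(s) . w, i.e. 1 only in s2

# Successor features satisfy psi(s) = phi(s) + gamma * psi(next(s));
# solve by fixed-point iteration.
psi = {s: [0.0, 0.0, 0.0] for s in range(3)}
for _ in range(500):
    psi = {s: [phi[s][i] + gamma * psi[nxt[s]][i] for i in range(3)]
           for s in range(3)}

V_sf = [sum(p * wi for p, wi in zip(psi[s], w)) for s in range(3)]
print([round(v, 3) for v in V_sf])  # agrees with direct value iteration
```

Estimating psi and w separately, versus estimating V directly, are the two extremes the entry interpolates between.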
no code implementations • 13 Oct 2021 • Shagun Sodhani, Franziska Meier, Joelle Pineau, Amy Zhang
In this work, we propose to examine this continual reinforcement learning setting through the block contextual MDP (BC-MDP) framework, which enables us to relax the assumption of stationarity.
1 code implementation • 21 Jun 2021 • Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim
We consider the offline reinforcement learning (RL) setting where the agent aims to optimize the policy solely from the data without further environment interactions.
1 code implementation • SIGDIAL (ACL) 2021 • Prasanna Parthasarathi, Mohamed Abdelsalam, Joelle Pineau, Sarath Chandar
Neural models trained for next utterance generation in dialogue tasks learn to mimic the n-gram sequences in the training set with training objectives like negative log-likelihood (NLL) or cross-entropy.
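The NLL objective mentioned above can be sketched as follows; the toy three-word vocabulary and per-step probabilities are invented for illustration, not taken from the paper.

```python
import math

# Negative log-likelihood (equivalently, cross-entropy against one-hot
# targets) of a target token sequence under per-step next-token
# distributions. Vocabulary and probabilities are illustrative assumptions.

def sequence_nll(step_probs, target_ids):
    """Sum of -log p(target_t) over timesteps."""
    return sum(-math.log(probs[t]) for probs, t in zip(step_probs, target_ids))

step_probs = [
    [0.7, 0.2, 0.1],  # p(token | history) at step 1
    [0.1, 0.8, 0.1],  # step 2
    [0.2, 0.1, 0.7],  # step 3
]
target = [0, 1, 2]
nll = sequence_nll(step_probs, target)
print(round(nll, 4))  # -ln(0.7) - ln(0.8) - ln(0.7)
```

Minimizing this quantity is exactly what pushes the model to reproduce the training set's n-gram statistics.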
1 code implementation • 20 Jun 2021 • Prasanna Parthasarathi, Joelle Pineau, Sarath Chandar
Predicting the next utterance in dialogue is contingent on encoding users' input text to generate an appropriate and relevant response in data-driven approaches.
no code implementations • 16 Jun 2021 • Lucas Caccia, Joelle Pineau
This paper presents SPeCiaL: a method for unsupervised pretraining of representations tailored for continual learning.
1 code implementation • 7 Jun 2021 • Emmanuel Bengio, Joelle Pineau, Doina Precup
A common optimization tool used in deep reinforcement learning is momentum, which consists of accumulating and discounting past gradients and reapplying them at each iteration.
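As a rough numerical sketch of that momentum rule (not the paper's code; learning rate and discount are illustrative), consider minimizing f(x) = x^2:

```python
# Momentum as described above: accumulate an exponentially discounted sum
# of past gradients and reapply it each step. Constants are illustrative.

def momentum_step(theta, grad, velocity, lr=0.1, beta=0.9):
    """v <- beta * v + grad;  theta <- theta - lr * v."""
    velocity = beta * velocity + grad
    theta = theta - lr * velocity
    return theta, velocity

# Minimize f(x) = x^2 (gradient 2x) starting from x = 1.
theta, v = 1.0, 0.0
for _ in range(200):
    theta, v = momentum_step(theta, 2.0 * theta, v)

print(abs(theta) < 1e-3)  # the discounted gradient history drives x toward 0
```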
no code implementations • NeurIPS 2021 • Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche
We study the problem of Safe Policy Improvement (SPI) under constraints in the offline Reinforcement Learning (RL) setting.
no code implementations • 15 Apr 2021 • Prasanna Parthasarathi, Koustuv Sinha, Joelle Pineau, Adina Williams
Rapid progress in Neural Machine Translation (NMT) systems over the last few years has been driven primarily by improving translation quality and, as a secondary focus, by improving robustness to input perturbations (e.g., spelling and grammatical mistakes).
no code implementations • EMNLP 2021 • Koustuv Sinha, Robin Jia, Dieuwke Hupkes, Joelle Pineau, Adina Williams, Douwe Kiela
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in classical NLP pipelines.
3 code implementations • 11 Apr 2021 • Lucas Caccia, Rahaf Aljundi, Nader Asadi, Tinne Tuytelaars, Joelle Pineau, Eugene Belilovsky
In this work, we focus on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream, and new classes must be distinguished from previous ones.
no code implementations • 14 Mar 2021 • Kalesha Bullard, Douwe Kiela, Franziska Meier, Joelle Pineau, Jakob Foerster
In contrast, in this work, we present a novel problem setting and the Quasi-Equivalence Discovery (QED) algorithm that allows for zero-shot coordination (ZSC), i.e., discovering protocols that can generalize to independently trained agents.
no code implementations • ICLR Workshop SSL-RL 2021 • Clare Lyle, Amy Zhang, Minqi Jiang, Joelle Pineau, Yarin Gal
To address this, we present a robust exploration strategy which enables causal hypothesis-testing by interaction with the environment.
no code implementations • ICLR Workshop SSL-RL 2021 • Manan Tomar, Amy Zhang, Roberto Calandra, Matthew E. Taylor, Joelle Pineau
Unlike previous forms of state abstractions, a model-invariance state abstraction leverages causal sparsity over state variables.
no code implementations • 14 Feb 2021 • Bonnie Li, Vincent François-Lavet, Thang Doan, Joelle Pineau
We consider the problem of generalization in reinforcement learning where visual aspects of the observations might differ, e.g., when there are different backgrounds or changes in contrast, brightness, etc.
2 code implementations • 11 Feb 2021 • Shagun Sodhani, Amy Zhang, Joelle Pineau
We posit that an efficient approach to knowledge transfer is through the use of multiple context-dependent, composable representations shared across a family of tasks.
no code implementations • EACL 2021 • Dora Jambor, Komal Teru, Joelle Pineau, William L. Hamilton
Real-world knowledge graphs are often characterized by low-frequency relations -- a challenge that has prompted an increasing interest in few-shot link prediction methods.
1 code implementation • 13 Jan 2021 • Anuroop Sriram, Matthew Muckley, Koustuv Sinha, Farah Shamout, Joelle Pineau, Krzysztof J. Geras, Lea Azour, Yindalon Aphinyanaphongs, Nafissa Yakubova, William Moore
The first is deterioration prediction from a single image, where our model achieves an area under the receiver operating characteristic curve (AUC) of 0.742 for predicting an adverse event within 96 hours (compared to 0.703 with supervised pretraining) and an AUC of 0.765 for predicting oxygen requirements greater than 6 L a day at 24 hours (compared to 0.749 with supervised pretraining).
1 code implementation • 1 Jan 2021 • Koustuv Sinha, Shagun Sodhani, Joelle Pineau, William L. Hamilton
In this work, we study the logical generalization capabilities of GNNs by designing a benchmark suite grounded in first-order logic.
1 code implementation • ACL 2021 • Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
We provide novel evidence that complicates this claim: we find that state-of-the-art Natural Language Inference (NLI) models assign the same labels to permuted examples as they do to the original, i.e., they are largely invariant to random word-order permutations.
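The invariance being probed can be illustrated with a toy bag-of-words scorer, which by construction ignores word order; the labelling rule and sentence below are assumptions for illustration, not an NLI system.

```python
import random

# A purely bag-of-words "model" assigns the same label to every permutation
# of its input -- the word-order invariance reported above.

def bow_label(tokens):
    """Label from the token multiset only: negation flips the label."""
    return "contradiction" if "not" in tokens else "entailment"

sentence = "the cat is not on the mat".split()
random.seed(0)
labels = set()
for _ in range(10):
    perm = sentence[:]
    random.shuffle(perm)        # random word-order permutation
    labels.add(bow_label(perm))

print(labels)  # a single label survives every permutation
```

A model sensitive to syntax would, by contrast, change its prediction on at least some shuffles.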
1 code implementation • 3 Dec 2020 • Melissa Mozifian, Amy Zhang, Joelle Pineau, David Meger
The goal of this work is to address the recent success of domain randomization and data augmentation for the sim2real setting.
no code implementations • 29 Oct 2020 • Kalesha Bullard, Franziska Meier, Douwe Kiela, Joelle Pineau, Jakob Foerster
Indeed, emergent communication is now a vibrant field of research, with common settings involving discrete cheap-talk channels.
no code implementations • ICLR 2021 • Wonseok Jeon, Chen-Yang Su, Paul Barde, Thang Doan, Derek Nowrouzezahrai, Joelle Pineau
Inverse Reinforcement Learning (IRL) aims to facilitate a learner's ability to imitate expert behavior by acquiring reward functions that explain the expert's decisions.
1 code implementation • NeurIPS 2020 • Ruo Yu Tao, Vincent François-Lavet, Joelle Pineau
We then leverage these intrinsic rewards for sample-efficient exploration with planning routines in representational space for hard exploration tasks with sparse rewards.
no code implementations • ICML 2020 • Harsh Satija, Philip Amortila, Joelle Pineau
In standard RL, the agent is incentivized to explore any behavior as long as it maximizes rewards, but in the real world, undesired behavior can damage either the system or the agent in a way that breaks the learning process itself.
1 code implementation • 24 Aug 2020 • Prasanna Parthasarathi, Joelle Pineau, Sarath Chandar
To bridge this gap in evaluation, we propose designing a set of probing tasks to evaluate dialogue models.
2 code implementations • ICLR 2021 • Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau
Further, we provide transfer and generalization bounds based on task and state similarity, along with sample complexity bounds that depend on the aggregate number of samples across tasks, rather than the number of tasks, a significant improvement over prior work that uses the same environment assumptions.
no code implementations • 6 Jul 2020 • Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau
We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers.
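The bootstrap term referred to above appears in the plain tabular TD(0) update, sketched here on an assumed 3-state chain (the Jacobi preconditioner itself is omitted; constants are illustrative):

```python
# Tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).
# The bootstrap term gamma * V(s') is what the entry above accounts for.
# The 3-state chain and constants are illustrative assumptions.

gamma, alpha = 0.9, 0.1
V = [0.0, 0.0, 0.0]  # chain s0 -> s1 -> s2; reward 1 on entering terminal s2

for _ in range(2000):  # repeated episodes of the deterministic chain
    s = 0
    while s < 2:
        s_next = s + 1
        r = 1.0 if s_next == 2 else 0.0
        bootstrap = gamma * V[s_next] if s_next < 2 else 0.0  # 0 at terminal
        V[s] += alpha * (r + bootstrap - V[s])
        s = s_next

print([round(v, 3) for v in V])  # approaches the true values [0.9, 1.0, 0.0]
```

Because the target itself contains V, the effective loss curvature differs from supervised regression, which is what motivates preconditioning.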
no code implementations • 3 Jul 2020 • Deepak Sharma, Audrey Durand, Marc-André Legault, Louis-Philippe Lemieux Perreault, Audrey Lemaçon, Marie-Pierre Dubé, Joelle Pineau
Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases.
3 code implementations • NeurIPS 2020 • Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Christopher Pal, Derek Nowrouzezahrai
Adversarial Imitation Learning alternates between learning a discriminator -- which tells the expert's demonstrations apart from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator.
1 code implementation • 7 May 2020 • Ge Yang, Amy Zhang, Ari S. Morcos, Joelle Pineau, Pieter Abbeel, Roberto Calandra
In this paper we introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning.
no code implementations • 6 May 2020 • Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Dung D. Vu, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio
We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS).
no code implementations • 5 May 2020 • Ekaterina Kochmar, Dung Do Vu, Robert Belfer, Varun Gupta, Iulian Vlad Serban, Joelle Pineau
Our model is used in Korbit, a large-scale dialogue-based ITS with thousands of students launched in 2019, and we demonstrate that the personalized feedback leads to considerable improvement in student learning outcomes and in the subjective evaluation of the feedback.
1 code implementation • ACL 2020 • Koustuv Sinha, Prasanna Parthasarathi, Jasmine Wang, Ryan Lowe, William L. Hamilton, Joelle Pineau
Evaluating the quality of a dialogue interaction between two agents is a difficult task, especially in open-domain chit-chat style dialogue.
no code implementations • 27 Mar 2020 • Joelle Pineau, Philippe Vincent-Lamarre, Koustuv Sinha, Vincent Larivière, Alina Beygelzimer, Florence d'Alché-Buc, Emily Fox, Hugo Larochelle
Reproducibility, that is, obtaining results similar to those presented in a paper or talk using the same code and data (when available), is a necessary step in verifying the reliability of research findings.
1 code implementation • ICML Workshop LifelongML 2020 • Koustuv Sinha, Shagun Sodhani, Joelle Pineau, William L. Hamilton
Recent research has highlighted the role of relational inductive biases in building learning agents that can generalize and reason in a compositional manner.
no code implementations • ICML 2020 • Emmanuel Bengio, Joelle Pineau, Doina Precup
We study the link between generalization and interference in temporal-difference (TD) learning.
1 code implementation • ICML 2020 • Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges.
1 code implementation • 9 Mar 2020 • Ahmed Touati, Amy Zhang, Joelle Pineau, Pascal Vincent
Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) are among the most successful policy gradient approaches in deep reinforcement learning (RL).
no code implementations • 28 Feb 2020 • Benjamin Haibe-Kains, George Alexandru Adam, Ahmed Hosny, Farnoosh Khodakarami, MAQC Society Board, Levi Waldron, Bo Wang, Chris McIntosh, Anshul Kundaje, Casey S. Greene, Michael M. Hoffman, Jeffrey T. Leek, Wolfgang Huber, Alvis Brazma, Joelle Pineau, Robert Tibshirani, Trevor Hastie, John P. A. Ioannidis, John Quackenbush, Hugo J. W. L. Aerts
In their study, McKinney et al. showed the high potential of artificial intelligence for breast cancer screening.
no code implementations • 24 Feb 2020 • Wonseok Jeon, Paul Barde, Derek Nowrouzezahrai, Joelle Pineau
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems where we seek to recover both policies for our agents and reward functions that promote expert-like behavior.
no code implementations • 7 Feb 2020 • Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau
We propose a general framework for policy representation for reinforcement learning tasks.
1 code implementation • ICLR 2020 • Ryan Lowe, Abhinav Gupta, Jakob Foerster, Douwe Kiela, Joelle Pineau
A promising approach for teaching artificial agents to use natural language involves using human-in-the-loop training.
2 code implementations • 31 Jan 2020 • Peter Henderson, Jieru Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau
Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research.
no code implementations • NeurIPS 2019 • Philip Paquette, Yuchen Lu, Seton Steven Bocco, Max Smith, Satya O.-G., Jonathan K. Kummerfeld, Joelle Pineau, Satinder Singh, Aaron C. Courville
Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal.
1 code implementation • 20 Nov 2019 • Eric Crawford, Joelle Pineau
The ability to detect and track objects in the visual world is a crucial skill for any intelligent agent, as it is a necessary precursor to any object-level reasoning process.
1 code implementation • ICML 2020 • Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Joelle Pineau
We show how to use discrete auto-encoders to effectively address this challenge and introduce Adaptive Quantization Modules (AQM) to control variation in the compression ability of the module at any given stage of learning.
no code implementations • 16 Nov 2019 • Riashat Islam, Komal K. Teru, Deepak Sharma, Joelle Pineau
This data distribution shift between current and past samples can significantly impact the performance of most modern off-policy based policy optimization algorithms.
no code implementations • WS 2019 • Abhinav Gupta, Ryan Lowe, Jakob Foerster, Douwe Kiela, Joelle Pineau
Once the meta-learning agent is able to quickly adapt to each population of agents, it can be deployed in new populations, including populations speaking human language.
1 code implementation • 9 Oct 2019 • Viswanath Sivakumar, Olivier Delalleau, Tim Rocktäschel, Alexander H. Miller, Heinrich Küttler, Nantas Nardelli, Mike Rabbat, Joelle Pineau, Sebastian Riedel
This is largely an artifact of building on top of frameworks designed for RL in games (e.g., OpenAI Gym).
4 code implementations • 3 Oct 2019 • Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau
Widely-used deep reinforcement learning algorithms have been shown to fail in the batch setting -- learning from a fixed data set without interaction with the environment.
3 code implementations • 2 Oct 2019 • Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus
A promising approach is to learn a latent representation together with the control policy.
no code implementations • 25 Sep 2019 • Emmanuel Bengio, Doina Precup, Joelle Pineau
Current Deep Reinforcement Learning (DRL) methods can exhibit both data inefficiency and brittleness, which seem to indicate that they generalize poorly.
no code implementations • 25 Sep 2019 • Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Joelle Pineau
We first replace the episodic memory used in Experience Replay with SQM, leading to significant gains on standard continual learning benchmarks using a fixed memory budget.
no code implementations • 17 Sep 2019 • Thang Doan, Bogdan Mazoure, Moloud Abdar, Audrey Durand, Joelle Pineau, R. Devon Hjelm
Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions.
1 code implementation • 4 Sep 2019 • Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, Aaron Courville
Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal.
5 code implementations • IJCNLP 2019 • Koustuv Sinha, Shagun Sodhani, Jin Dong, Joelle Pineau, William L. Hamilton
The recent success of natural language understanding (NLU) systems has been troubled by results highlighting the failure of these models to generalize in a systematic and robust way.
no code implementations • 25 Jun 2019 • Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello
In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP).
1 code implementation • NeurIPS 2019 • Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat
We show that we can run several loosely coupled GALA agents in parallel on a single GPU and achieve significantly higher hardware utilization and frame-rates than vanilla A2C at comparable power draws.
no code implementations • 23 May 2019 • Pierre Thodoroff, Nishanth Anand, Lucas Caccia, Doina Precup, Joelle Pineau
Despite recent successes in Reinforcement Learning, value-based methods often suffer from high variance, hindering performance.
1 code implementation • 16 May 2019 • Bogdan Mazoure, Thang Doan, Audrey Durand, R. Devon Hjelm, Joelle Pineau
The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios.
1 code implementation • 12 Mar 2019 • Ryan Lowe, Jakob Foerster, Y-Lan Boureau, Joelle Pineau, Yann Dauphin
How do we know if communication is emerging in a multi-agent system?
1 code implementation • 5 Feb 2019 • Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier
In settings where this bias is unacceptable -- where the system must optimize for longer horizons at higher discounts -- the target of the value function approximator may increase in variance, leading to difficulties in learning.
2 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots.
1 code implementation • 4 Dec 2018 • Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau
In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map.
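Unravelling a scan into a 2D point map can be sketched as spherical binning of returns by azimuth and elevation; the grid resolution and sample points below are illustrative assumptions, not the paper's configuration.

```python
import math

# Unravelling a lidar scan into a 2D point map: bin each 3D return by
# azimuth (column) and elevation (row), storing its range as the pixel.

H, W = 4, 8  # elevation rows x azimuth columns (illustrative resolution)

def unravel(points):
    grid = [[0.0] * W for _ in range(H)]
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        az = math.atan2(y, x)                 # azimuth in [-pi, pi]
        el = math.atan2(z, math.hypot(x, y))  # elevation in [-pi/2, pi/2]
        col = min(W - 1, int((az + math.pi) / (2 * math.pi) * W))
        row = min(H - 1, int((el + math.pi / 2) / math.pi * H))
        grid[row][col] = rng
    return grid

grid = unravel([(1.0, 0.5, 0.0), (-1.0, -0.5, 1.0)])
```

Once flattened this way, the scan can be fed to ordinary 2D generative models.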
1 code implementation • NeurIPS 2018 • Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Several applications of Reinforcement Learning suffer from instability due to high variance.
3 code implementations • 30 Nov 2018 • Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning.
2 code implementations • 14 Nov 2018 • Amy Zhang, Yuxin Wu, Joelle Pineau
While current benchmark reinforcement learning (RL) tasks have been useful to drive progress in the field, they are in many ways poor substitutes for learning with real-world data.
2 code implementations • 7 Nov 2018 • Koustuv Sinha, Shagun Sodhani, William L. Hamilton, Joelle Pineau
Neural networks for natural language reasoning have largely focused on extractive, fact-based question-answering (QA) and common-sense inference.
1 code implementation • 7 Nov 2018 • Nicolas Gontier, Koustuv Sinha, Peter Henderson, Iulian Serban, Michael Noseworthy, Prasanna Parthasarathi, Joelle Pineau
This article presents in detail the RLLChatbot that participated in the 2017 ConvAI challenge.
1 code implementation • ICLR 2020 • Massimo Caccia, Lucas Caccia, William Fedus, Hugo Larochelle, Joelle Pineau, Laurent Charlin
Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks.
no code implementations • 4 Nov 2018 • Peter Henderson, Koustuv Sinha, Rosemary Nan Ke, Joelle Pineau
Adversarial examples can be defined as inputs to a model which induce a mistake -- where the model output differs from that of an oracle, perhaps in surprising or malicious ways.
2 code implementations • 1 Nov 2018 • Pierre Thodoroff, Audrey Durand, Joelle Pineau, Doina Precup
Several applications of Reinforcement Learning suffer from instability due to high variance.
no code implementations • ICLR 2019 • Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau
We propose a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially-observable environments.
1 code implementation • 5 Oct 2018 • Peter Henderson, Joshua Romoff, Joelle Pineau
We find that adaptive optimizers have a narrow window of effective learning rates, diverging in other cases, and that the effectiveness of momentum varies depending on the properties of the environment.
no code implementations • EMNLP 2018 • Prasanna Parthasarathi, Joelle Pineau
The use of connectionist approaches in conversational agents has been progressing rapidly due to the availability of large corpora.
no code implementations • ICLR 2018 • Eric Crawford, Guillaume Rabusseau, Joelle Pineau
Achieving machine intelligence requires a smooth integration of perception and reasoning, yet models developed to date tend to specialize in one or the other; sophisticated manipulation of symbols acquired from rich perceptual spaces has so far proved elusive.
1 code implementation • 12 Sep 2018 • Vincent François-Lavet, Yoshua Bengio, Doina Precup, Joelle Pineau
In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages.
3 code implementations • 31 Jul 2018 • Thang Doan, Joao Monteiro, Isabela Albuquerque, Bogdan Mazoure, Audrey Durand, Joelle Pineau, R. Devon Hjelm
We argue that less expressive discriminators are smoother and have a coarse-grained view of the mode map, which forces the generator to cover a wide portion of the data distribution support.
no code implementations • 12 Jul 2018 • Iulian Vlad Serban, Chinnadhurai Sankar, Michael Pieper, Joelle Pineau, Yoshua Bengio
Deep reinforcement learning has recently shown many impressive successes.
no code implementations • 20 Jun 2018 • Amy Zhang, Nicolas Ballas, Joelle Pineau
The risks and perils of overfitting in machine learning are well known.
no code implementations • ICML 2018 • Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal
We evaluate this method on several types of tasks with different attributes.
Ranked #3 on Open-Domain Question Answering on SearchQA (Unigram Acc metric)
1 code implementation • 6 Jun 2018 • Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent
In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.
1 code implementation • 9 May 2018 • Joshua Romoff, Peter Henderson, Alexandre Piché, Vincent Francois-Lavet, Joelle Pineau
However, introduction of corrupt or stochastic rewards can yield high variance in learning.
1 code implementation • 27 Apr 2018 • Amy Zhang, Harsh Satija, Joelle Pineau
Current reinforcement learning (RL) methods can successfully learn single tasks but often generalize poorly to modest perturbations in task domain or training procedure.
no code implementations • 26 Feb 2018 • Valentin Thomas, Emmanuel Bengio, William Fedus, Jules Pondard, Philippe Beaudoin, Hugo Larochelle, Joelle Pineau, Doina Precup, Yoshua Bengio
It has been postulated that a good representation is one that disentangles the underlying explanatory factors of variation.
no code implementations • 20 Jan 2018 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeswar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition.
no code implementations • ICLR 2018 • Matthew J. A. Smith, Herke van Hoof, Joelle Pineau
In this work we develop a novel policy gradient method for the automatic learning of policies with options.
no code implementations • NeurIPS 2017 • Guillaume Rabusseau, Borja Balle, Joelle Pineau
We first present a natural notion of relatedness between WFAs by considering the extent to which several WFAs can share a common underlying representation.
1 code implementation • 24 Nov 2017 • Peter Henderson, Koustuv Sinha, Nicolas Angelard-Gontier, Nan Rosemary Ke, Genevieve Fried, Ryan Lowe, Joelle Pineau
The use of dialogue systems as a medium for human-machine interaction is an increasingly prevalent paradigm.
no code implementations • 13 Nov 2017 • Anirudh Goyal, Nan Rosemary Ke, Alex Lamb, R. Devon Hjelm, Chris Pal, Joelle Pineau, Yoshua Bengio
This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function.
no code implementations • 22 Sep 2017 • Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, Damien Ernst, Raphael Fonteneau
This paper provides an analysis of the tradeoff between asymptotic bias (suboptimality with unlimited data) and overfitting (additional suboptimality due to limited data) in the context of reinforcement learning with partial observability.
1 code implementation • 20 Sep 2017 • Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup
Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations.
4 code implementations • 19 Sep 2017 • Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL).
no code implementations • 7 Sep 2017 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio
By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble.
1 code implementation • ACL 2017 • Ryan Lowe, Michael Noseworthy, Iulian V. Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Joelle Pineau
Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem.
no code implementations • 3 Aug 2017 • Valentin Thomas, Jules Pondard, Emmanuel Bengio, Marc Sarfati, Philippe Beaudoin, Marie-Jean Meurs, Joelle Pineau, Doina Precup, Yoshua Bengio
It has been postulated that a good representation is one that disentangles the underlying explanatory factors of variation.
no code implementations • 2 Aug 2017 • Audrey Durand, Odalric-Ambrym Maillard, Joelle Pineau
The variance of the noise is not assumed to be known.
no code implementations • WS 2017 • Michael Noseworthy, Jackie Chi Kit Cheung, Joelle Pineau
We then propose a turn-based hierarchical neural network model that can be used to predict success without requiring a structured goal definition.
1 code implementation • WS 2017 • Hoai Phuoc Truong, Prasanna Parthasarathi, Joelle Pineau
We propose a software architecture designed to ease the implementation of dialogue systems.
no code implementations • 22 Mar 2017 • Emmanuel Bengio, Valentin Thomas, Joelle Pineau, Doina Precup, Yoshua Bengio
Finding features that disentangle the different causes of variation in real data is a difficult task that has nonetheless received considerable attention in static domains like natural images.
no code implementations • 1 Jan 2017 • Ryan Lowe, Nissan Pow, Iulian Vlad Serban, Laurent Charlin, Chia-Wei Liu, Joelle Pineau
In this paper, we analyze neural network-based dialogue systems trained in an end-to-end manner using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words.
2 code implementations • EMNLP (ACL) 2017 • Iulian V. Serban, Alexander G. Ororbia II, Joelle Pineau, Aaron Courville
Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as variational autoencoders.
no code implementations • 18 Nov 2016 • Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau
Researchers have recently started investigating deep neural networks for dialogue applications.
no code implementations • 14 Sep 2016 • Mohammad Ghavamzadeh, Shie Mannor, Joelle Pineau, Aviv Tamar
The objective of the paper is to provide a comprehensive survey on Bayesian RL algorithms and their theoretical and empirical properties.
1 code implementation • 31 Jul 2016 • Pierre Thodoroff, Joelle Pineau, Andrew Lim
We present and evaluate the capacity of a deep neural network to learn robust features from EEG to automatically detect seizures.
3 code implementations • 24 Jul 2016 • Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio
We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).
Ranked #8 on Machine Translation on IWSLT2015 English-German
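As a toy illustration of the actor-critic signal used for generation (nothing here is the paper's sequence model: the single-token "vocabulary", the reward, and every hyperparameter are invented), a learned baseline (critic) turns raw rewards into advantages that update the generator (actor):

```python
import math
import random

random.seed(0)

# Hypothetical toy task: learn to emit the token "b" from a 3-token vocabulary.
vocab = ["a", "b", "c"]
target = "b"

logits = [0.0, 0.0, 0.0]    # actor: one logit per token
value = 0.0                 # critic: scalar estimate of expected reward
lr_actor, lr_critic = 0.5, 0.1

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(500):
    probs = softmax(logits)
    tok = random.choices(range(3), weights=probs)[0]
    reward = 1.0 if vocab[tok] == target else 0.0

    # Critic regresses toward the observed reward; the actor follows the
    # advantage (reward minus the critic's estimate), not the raw reward.
    advantage = reward - value
    value += lr_critic * advantage
    for i in range(3):
        grad = (1.0 if i == tok else 0.0) - probs[i]  # d log pi / d logit
        logits[i] += lr_actor * advantage * grad

probs = softmax(logits)
```

After training, the policy concentrates its probability mass on the rewarded token.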
9 code implementations • 19 May 2016 • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio
Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as those found between the utterances in a dialogue.
no code implementations • WS 2016 • Ryan Lowe, Iulian V. Serban, Mike Noseworthy, Laurent Charlin, Joelle Pineau
An open challenge in constructing dialogue systems is developing methods for automatically learning dialogue strategies from large amounts of unlabelled data.
2 code implementations • EMNLP 2016 • Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau
We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available.
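A minimal unigram-overlap score of the kind studied in such evaluations illustrates the problem (this is a crude stand-in for BLEU/ROUGE, and the example responses are invented):

```python
from collections import Counter

def overlap_f1(response, reference):
    # Unigram F1: harmonic mean of word-overlap precision and recall.
    r = Counter(reference.lower().split())
    g = Counter(response.lower().split())
    common = sum((r & g).values())
    if common == 0:
        return 0.0
    precision = common / sum(g.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)

reference = "i am doing well thanks"
good = "i am doing well thank you"
valid_but_different = "pretty good how about you"

# A perfectly reasonable reply can score zero overlap, which is one reason
# word-overlap metrics correlate poorly with human judgements of dialogue.
score_good = overlap_f1(good, reference)
score_valid = overlap_f1(valid_but_different, reference)
```

The near-copy scores highly while an equally valid reply scores zero, so the metric cannot distinguish quality from surface similarity.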
4 code implementations • 17 Dec 2015 • Iulian Vlad Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, Joelle Pineau
During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models.
1 code implementation • 19 Nov 2015 • Emmanuel Bengio, Pierre-Luc Bacon, Joelle Pineau, Doina Precup
In this paper, we use reinforcement learning as a tool to optimize conditional computation policies.
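A toy sketch of the idea (not the paper's architecture): a stochastic gate decides whether to activate an "expensive" unit, trading task reward against a compute penalty, and is trained with the expected REINFORCE gradient for a deterministic illustration; the rewards, cost, and learning rate below are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

cost = 0.3          # hypothetical compute penalty when the unit is active
gate_logit = 0.0    # parameter of the Bernoulli gating policy
lr = 0.5

for _ in range(300):
    p_on = sigmoid(gate_logit)
    # Hypothetical payoffs: with the unit on, every input succeeds but pays
    # `cost`; with it off, only the easy half of the inputs succeed.
    r_on, r_off = 1.0 - cost, 0.5
    # Expected REINFORCE gradient wrt the gate logit:
    # E[r * d log pi / d logit] over the two gate outcomes.
    grad = p_on * r_on * (1.0 - p_on) + (1.0 - p_on) * r_off * (-p_on)
    gate_logit += lr * grad

p_on = sigmoid(gate_logit)
```

Since the net payoff for activating the unit (0.7) exceeds the payoff for skipping it (0.5), the learned policy activates it most of the time; raising `cost` above 0.5 would flip that preference.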
7 code implementations • 17 Jul 2015 • Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, Joelle Pineau
We investigate the task of building open-domain conversational dialogue systems based on large dialogue corpora using generative models.
21 code implementations • WS 2015 • Ryan Lowe, Nissan Pow, Iulian Serban, Joelle Pineau
This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words.
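A sketch of the next-utterance-ranking evaluation used with such corpora: a model scores the true response against distractors, and Recall@k checks whether the truth lands in the top k. The scorer here is a hypothetical word-overlap baseline, not the paper's neural model, and the dialogue snippets are invented.

```python
def score(context, response):
    # Toy scorer: number of distinct words shared by context and candidate.
    return len(set(context.lower().split()) & set(response.lower().split()))

def recall_at_k(context, true_response, distractors, k):
    # Rank the true response among the distractors; hit if it is in the top k.
    candidates = [true_response] + distractors
    ranked = sorted(candidates, key=lambda c: score(context, c), reverse=True)
    return true_response in ranked[:k]

context = "my wifi driver fails after the kernel update"
truth = "which kernel version did the update install"
distractors = ["try rebooting the router", "i like the new wallpaper"]

hit = recall_at_k(context, truth, distractors, 1)
```

In the usual setup the model ranks 1 true response against 9 distractors and Recall@1, @2, and @5 are reported.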
no code implementations • 21 Jul 2014 • André M. S. Barreto, Doina Precup, Joelle Pineau
In this paper we introduce an algorithm that turns KBRL into a practical reinforcement learning tool.
no code implementations • 24 Feb 2014 • Ouais Alsharif, Philip Bachman, Joelle Pineau
Consider a Machine Learning Service Provider (MLSP) designed to rapidly create highly accurate learners for a never-ending stream of new tasks.
no code implementations • 16 Jan 2014 • Mahdi Milani Fard, Joelle Pineau
Although conventional reinforcement learning methods have proved useful in sequential decision-making problems, they cannot be applied in their current form to decision support systems, such as those in medical domains: the policies they suggest are often highly prescriptive and leave little room for the user's input.
no code implementations • 15 Jan 2014 • Stéphane Ross, Joelle Pineau, Sébastien Paquet, Brahim Chaib-Draa
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains.
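The central quantity such sequential decision-makers maintain is a belief state, updated by Bayes' rule after each action and observation. A minimal sketch, with a hypothetical two-state "listen"-style model:

```python
def belief_update(b, a, o, T, O):
    # b'(s') is proportional to O[a][s'][o] * sum_s T[a][s][s'] * b[s]
    n = len(b)
    new_b = [O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(n))
             for s2 in range(n)]
    z = sum(new_b)  # normalising constant (probability of observing o)
    return [x / z for x in new_b]

# Hypothetical model: two hidden states, one information-gathering action,
# two observations. Listening leaves the state unchanged and yields an
# observation that matches the true state 85% of the time.
T = {"listen": [[1.0, 0.0], [0.0, 1.0]]}
O = {"listen": [[0.85, 0.15], [0.15, 0.85]]}

b = [0.5, 0.5]
b = belief_update(b, "listen", 0, T, O)  # observe evidence for state 0
```

Starting from a uniform belief, one accurate observation shifts the belief to match the observation model's reliability.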
no code implementations • NeurIPS 2013 • Mahdi Milani Fard, Yuri Grinberg, Amir-Massoud Farahmand, Joelle Pineau, Doina Precup
This paper addresses the problem of automatic generation of features for value function approximation in reinforcement learning.
no code implementations • NeurIPS 2013 • Beomjoon Kim, Amir-Massoud Farahmand, Joelle Pineau, Doina Precup
We achieve this by integrating LfD in an approximate policy iteration algorithm.
no code implementations • 1 Dec 2013 • William L. Hamilton, Mahdi Milani Fard, Joelle Pineau
Predictive state representations (PSRs) offer an expressive framework for modelling partially observable systems.
no code implementations • 30 Oct 2013 • Boyu Wang, Joelle Pineau
While both cost-sensitive learning and online learning have been studied extensively, the effort in simultaneously dealing with these two issues is limited.
no code implementations • 7 Oct 2013 • Ouais Alsharif, Joelle Pineau
The problem of detecting and recognizing text in natural scenes has proved to be more challenging than its counterpart in documents, with most of the previous work focusing on a single part of the problem.
no code implementations • NeurIPS 2012 • Doina Precup, Joelle Pineau, Andre S. Barreto
The ability to learn a policy for a sequential decision problem with continuous state space using on-line data is a long-standing challenge.
no code implementations • 14 Feb 2012 • Mahdi Milani Fard, Joelle Pineau, Csaba Szepesvari
PAC-Bayesian methods overcome this problem by providing bounds that hold regardless of the correctness of the prior distribution.
no code implementations • NeurIPS 2011 • Andre S. Barreto, Doina Precup, Joelle Pineau
Kernel-based reinforcement learning (KBRL) is a method for learning a decision policy from a set of sample transitions, and it stands out for its strong theoretical guarantees.
no code implementations • NeurIPS 2010 • Mahdi M. Fard, Joelle Pineau
This paper introduces the first set of PAC-Bayesian bounds for the batch reinforcement learning problem in finite state spaces.
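For orientation, the classical supervised-learning PAC-Bayes bound that such analyses build on (McAllester's form; the bounds in this reinforcement-learning setting concern value functions rather than classification loss) reads:

```latex
% With probability at least 1-\delta over an i.i.d. sample of size n,
% simultaneously for every posterior \rho over hypotheses, given a prior \pi:
\mathbb{E}_{h\sim\rho}\!\left[L(h)\right] \;\le\;
\mathbb{E}_{h\sim\rho}\!\left[\hat{L}(h)\right]
+ \sqrt{\frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

The KL term penalises posteriors that stray far from the prior, which is how the bound remains valid even when the prior is wrong.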
no code implementations • NeurIPS 2009 • Keith Bush, Joelle Pineau
Interesting real-world datasets often exhibit nonlinear, noisy, continuous-valued states that are unexplorable, are poorly described by first principles, and are only partially observable.
no code implementations • NeurIPS 2008 • Mahdi M. Fard, Joelle Pineau
Markov Decision Processes (MDPs) have been extensively studied and used in the context of planning and decision-making, and many methods exist to find the optimal policy for problems modelled as MDPs.
no code implementations • NeurIPS 2007 • Stephane Ross, Joelle Pineau, Brahim Chaib-Draa
The algorithm uses search heuristics based on an error analysis of lookahead search, to guide the online search towards reachable beliefs with the most potential to reduce error.