no code implementations • 9 May 2024 • Owen Randall, Martin Müller, Ting-Han Wei, Ryan Hayward
We propose Expected Work Search (EWS), a new game-solving algorithm.
1 code implementation • 18 Dec 2023 • Farnaz Kohankhaki, Kiarash Aghakasiri, Hongming Zhang, Ting-Han Wei, Chao Gao, Martin Müller
We study and improve MCTS in the context where the environment model is given but imperfect.
1 code implementation • ICLR 2023 • Hongming Zhang, Chenjun Xiao, Han Wang, Jun Jin, Bo Xu, Martin Müller
In this work, we further exploit the information in the replay memory by treating it as an empirical Replay Memory MDP (RM-MDP).
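The idea of treating a replay memory as an empirical MDP can be illustrated by counting observed transitions and rewards. The sketch below is illustrative only (the function name and tuple layout are assumptions, not the authors' implementation): it builds empirical transition probabilities and mean rewards from (state, action, reward, next_state) tuples.

```python
from collections import Counter, defaultdict

def empirical_mdp(replay_memory):
    """Estimate an empirical MDP from a replay memory of
    (state, action, reward, next_state) tuples.

    Returns transition probabilities P(s' | s, a) and mean rewards
    R(s, a), both estimated purely by counting. Illustrative sketch,
    not the RM-MDP construction from the paper.
    """
    counts = defaultdict(Counter)      # (s, a) -> Counter over next states
    reward_sum = defaultdict(float)    # (s, a) -> summed observed reward
    for s, a, r, s_next in replay_memory:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    transitions, rewards = {}, {}
    for sa, next_counts in counts.items():
        n = sum(next_counts.values())
        transitions[sa] = {s_next: c / n for s_next, c in next_counts.items()}
        rewards[sa] = reward_sum[sa] / n
    return transitions, rewards
```

For example, a memory with three samples of ("s0", "a") where two of them land in "s1" yields an empirical transition probability of 2/3 for that successor.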
no code implementations • 3 Aug 2022 • Timo Bertram, Johannes Fürnkranz, Martin Müller
In this work, we adapt a training approach inspired by the original AlphaGo system to play the imperfect information game of Reconnaissance Blind Chess.
no code implementations • 20 Apr 2022 • Timo Bertram, Johannes Fürnkranz, Martin Müller
In this paper, we study learning in probabilistic domains where the learner may receive incorrect labels but can improve the reliability of labels by repeatedly sampling them.
1 code implementation • 7 Feb 2022 • Martin Müller, Florian Laurent
Scaling up the size and training of autoregressive language models has enabled novel ways of solving Natural Language Processing tasks using zero-shot and few-shot learning.
no code implementations • 19 Jan 2022 • Camille Gontier, Simone Carlo Surace, Igor Delvendahl, Martin Müller, Jean-Pascal Pfister
Bayesian Active Learning (BAL) is an efficient framework for learning the parameters of a model, in which input stimuli are selected to maximize the mutual information between the observations and the unknown parameters.
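The BAL selection rule above can be sketched for the simplest case: a binary observation and a discrete prior over the unknown parameter, where the mutual information decomposes as I(y; θ | x) = H(E_θ[p(y|θ,x)]) − E_θ[H(p(y|θ,x))]. All names below (`best_stimulus`, the sigmoid likelihood in the usage example) are illustrative assumptions, not the paper's model.

```python
import math

def entropy_bernoulli(p):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def best_stimulus(stimuli, thetas, prior, likelihood):
    """Pick the stimulus x maximizing the mutual information
    I(y; theta | x) between a binary observation y and the parameter,
    for a discrete prior over theta. Illustrative BAL sketch."""
    best_x, best_mi = None, -1.0
    for x in stimuli:
        ps = [likelihood(theta, x) for theta in thetas]
        marginal = sum(w * p for w, p in zip(prior, ps))
        mi = entropy_bernoulli(marginal) - sum(
            w * entropy_bernoulli(p) for w, p in zip(prior, ps))
        if mi > best_mi:
            best_x, best_mi = x, mi
    return best_x, best_mi
```

With a sigmoid likelihood p(y=1|θ,x) = σ(θx) and a symmetric prior over θ ∈ {−1, 1}, the stimulus x = 0 is uninformative (zero mutual information), while a nonzero stimulus separates the two hypotheses and is selected.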
no code implementations • 21 Sep 2021 • Hongming Zhang, Ke Sun, Bo Xu, Linglong Kong, Martin Müller
In this paper, we propose a simple yet effective anomaly detection framework for deep RL algorithms that simultaneously considers random, adversarial, and out-of-distribution (OOD) state outliers.
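As a toy illustration of flagging outlier states, one can fit simple per-dimension statistics on states seen during training and flag states whose z-score exceeds a threshold. This sketch is an assumption for illustration only; the paper's framework is more sophisticated than a z-score test.

```python
import math

class StateAnomalyDetector:
    """Toy per-dimension z-score detector for state vectors.

    Fit on states observed during training, then flag any state whose
    maximum absolute z-score exceeds a threshold. Illustrative only,
    not the detection framework proposed in the paper.
    """
    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mean = self.std = None

    def fit(self, states):
        dims, n = len(states[0]), len(states)
        self.mean = [sum(s[d] for s in states) / n for d in range(dims)]
        # Fall back to std = 1.0 for constant dimensions to avoid division by zero.
        self.std = [
            math.sqrt(sum((s[d] - self.mean[d]) ** 2 for s in states) / n) or 1.0
            for d in range(dims)
        ]

    def is_anomalous(self, state):
        z = max(abs((x - m) / sd)
                for x, m, sd in zip(state, self.mean, self.std))
        return z > self.threshold
```

A state far outside the training distribution produces a large z-score in at least one dimension and is flagged, while in-distribution states pass through.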
no code implementations • 29 Jul 2021 • Aurèle Goetz, Ali Riza Durmaz, Martin Müller, Akhil Thomas, Dominik Britz, Pierre Kerfriden, Chris Eberl
We show that a state-of-the-art UDA approach surpasses the naïve application of source-domain-trained models on the target domain (the generalization baseline) by a large margin.
no code implementations • 9 Jul 2021 • Timo Bertram, Johannes Fürnkranz, Martin Müller
We discuss and compare two different Siamese network architectures for this task: a twin network that compares the two sets resulting after the addition, and a triplet network that models the contribution of each candidate to the existing set.
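Triplet networks of the kind compared above are commonly trained with a margin loss that pulls the anchor embedding toward the positive example and away from the negative one. The sketch below shows only that standard loss on plain vectors; it is not the paper's architecture, and the function names are illustrative.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: push d(anchor, positive) below
    d(anchor, negative) by at least `margin`, with zero loss once the
    margin is satisfied."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)
```

When the positive is already much closer to the anchor than the negative, the loss is zero; when the ordering is violated, the loss grows with the violation plus the margin.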
1 code implementation • 25 May 2021 • Timo Bertram, Johannes Fürnkranz, Martin Müller
Drafting, i.e., the selection of a subset of items from a larger candidate set, is a key element of many games and related problems.
1 code implementation • 10 May 2021 • Md Solimul Chowdhury, Martin Müller, Jia-Huai You
We also show an important connection between consecutive clauses learned within the same mc decision, where one learned clause triggers the learning of the next, forming a chain of clauses.
1 code implementation • 3 Dec 2020 • Martin Müller, Marcel Salathé
We show that while vaccine sentiment has declined considerably during the COVID-19 pandemic in 2020, algorithms trained on pre-pandemic data would have largely missed this decline due to concept drift.
1 code implementation • 19 Aug 2020 • Kristina Gligorić, Manoel Horta Ribeiro, Martin Müller, Olesia Altunina, Maxime Peyrard, Marcel Salathé, Giovanni Colavizza, Robert West
Timely access to accurate information is crucial during the COVID-19 pandemic.
1 code implementation • 15 May 2020 • Martin Müller, Marcel Salathé, Per E Kummervold
In this work, we release COVID-Twitter-BERT (CT-BERT), a transformer-based model, pretrained on a large corpus of Twitter messages on the topic of COVID-19.
no code implementations • 24 Dec 2019 • Chenjun Xiao, Yifan Wu, Chen Ma, Dale Schuurmans, Martin Müller
Despite its potential to improve sample complexity versus model-free approaches, model-based reinforcement learning can fail catastrophically if the model is inaccurate.
no code implementations • NeurIPS 2019 • Chenjun Xiao, Ruitong Huang, Jincheng Mei, Dale Schuurmans, Martin Müller
We then extend this approach to general sequential decision making by developing a general MCTS algorithm, Maximum Entropy for Tree Search (MENTS).
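The core change MENTS makes to the tree-search backup can be sketched in isolation: the hard max over action values is replaced by a softmax (log-sum-exp) value, and actions are sampled from the corresponding Boltzmann distribution. The sketch below shows only these two operations (MENTS' full E2W sampling policy additionally mixes in exploration, which is omitted here):

```python
import math

def softmax_value(q_values, tau=1.0):
    """Soft (maximum-entropy) state value: tau * log sum_a exp(Q(a)/tau).
    Approaches max(Q) as tau -> 0. The max-subtraction stabilizes the
    log-sum-exp numerically."""
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))

def boltzmann_policy(q_values, tau=1.0):
    """Sampling distribution proportional to exp(Q(a)/tau)."""
    m = max(q_values)
    w = [math.exp((q - m) / tau) for q in q_values]
    z = sum(w)
    return [x / z for x in w]
```

At a small temperature the soft value is close to the best action value and the policy concentrates on the greedy action; at a large temperature the policy approaches uniform, which is what makes the backup smooth and exploration-friendly.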
no code implementations • 25 Apr 2019 • Md Solimul Chowdhury, Martin Müller, Jia-Huai You
We first show experimentally, by running the state-of-the-art CDCL SAT solver MapleLCMDist on benchmarks from the 2017 and 2018 SAT Competitions, that branching decisions with glue variables are categorically more inference- and conflict-efficient than those with nonglue variables.
no code implementations • 16 Jun 2017 • Ruitong Huang, Mohammad M. Ajallooeian, Csaba Szepesvári, Martin Müller
We study the problem of identifying the best action among a set of possible options, where the value of each action is given by a mapping from a number of noisy micro-observables, in the so-called fixed-confidence setting.