| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APT | Walker (mean normalized return) | 7.71±7.39 | #7 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APT | Quadruped (mean normalized return) | 21.22±5.14 | #7 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^5 frames) | APT | Jaco (mean normalized return) | 0.37±0.64 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APT | Walker (mean normalized return) | 7.68±6.50 | #8 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APT | Quadruped (mean normalized return) | 17.20±4.90 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 10^6 frames) | APT | Jaco (mean normalized return) | 1.02±1.27 | #7 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APT | Walker (mean normalized return) | 7.32±7.05 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APT | Quadruped (mean normalized return) | 18.18±5.22 | #10 |
| Unsupervised Reinforcement Learning | URLB (pixels, 2*10^6 frames) | APT | Jaco (mean normalized return) | 0.12±0.21 | #10 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APT | Walker (mean normalized return) | 7.41±7.01 | #8 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APT | Quadruped (mean normalized return) | 16.74±5.04 | #9 |
| Unsupervised Reinforcement Learning | URLB (pixels, 5*10^5 frames) | APT | Jaco (mean normalized return) | 0.38±0.41 | #8 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APT | Walker (mean normalized return) | 82.49±31.70 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APT | Quadruped (mean normalized return) | 34.24±17.32 | #4 |
| Unsupervised Reinforcement Learning | URLB (states, 10^5 frames) | APT | Jaco (mean normalized return) | 71.21±9.07 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APT | Walker (mean normalized return) | 76.89±32.64 | #4 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APT | Quadruped (mean normalized return) | 39.44±12.85 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 10^6 frames) | APT | Jaco (mean normalized return) | 63.31±6.73 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APT | Walker (mean normalized return) | 73.43±32.57 | #6 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APT | Quadruped (mean normalized return) | 52.00±10.56 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 2*10^6 frames) | APT | Jaco (mean normalized return) | 55.67±7.80 | #5 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APT | Walker (mean normalized return) | 80.70±30.47 | #3 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APT | Quadruped (mean normalized return) | 32.15±11.77 | #6 |
| Unsupervised Reinforcement Learning | URLB (states, 5*10^5 frames) | APT | Jaco (mean normalized return) | 67.57±9.44 | #2 |