
Commit d6b8e9c

Add trajectory transformer (#17141)
* Add trajectory transformer

  Fix model init
  Fix end of lines for .mdx files
  Add trajectory transformer model to toctree
  Add forward input docs
  Fix docs, remove prints, simplify prediction test
  Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <[email protected]>
  Apply suggestions from code review
  Co-authored-by: Lysandre Debut <[email protected]>
  Co-authored-by: Sylvain Gugger <[email protected]>
  Update docs, more descriptive comments
  Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <[email protected]>
  Update readme
  Small comment update and add conversion script
  Rebase and reformat
  Fix copies
  Fix rebase, remove duplicates

* Remove tapex
1 parent c352640 · commit d6b8e9c

19 files changed · +1297 −1 lines changed

.gitattributes

+2 −1

@@ -1,3 +1,4 @@
 *.py eol=lf
 *.rst eol=lf
-*.md eol=lf
+*.md eol=lf
+*.mdx eol=lf

README.md

+1
@@ -321,6 +321,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
 1. **[T5v1.1](https://huggingface.co/docs/transformers/model_doc/t5v1.1)** (from Google AI) released in the repository [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
 1. **[TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas)** (from Google AI) released with the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349) by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
 1. **[TAPEX](https://huggingface.co/docs/transformers/main/model_doc/tapex)** (from Microsoft Research) released with the paper [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
+1. **[Trajectory Transformer](https://huggingface.co/docs/transformers/main/model_doc/trajectory_transformer)** (from the University of California at Berkeley) released with the paper [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
 1. **[Transformer-XL](https://huggingface.co/docs/transformers/model_doc/transfo-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
 1. **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)** (from Microsoft), released together with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
 1. **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

README_ko.md

+1
@@ -300,6 +300,7 @@ installing these with conda from the Flax, PyTorch and TensorFlow installation pages
 1. **[T5v1.1](https://huggingface.co/docs/transformers/model_doc/t5v1.1)** (from Google AI) released in the repository [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
 1. **[TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas)** (from Google AI) released with the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349) by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
 1. **[TAPEX](https://huggingface.co/docs/transformers/main/model_doc/tapex)** (from Microsoft Research) released with the paper [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
+1. **[Trajectory Transformer](https://huggingface.co/docs/transformers/main/model_doc/trajectory_transformer)** (from the University of California at Berkeley) released with the paper [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
 1. **[Transformer-XL](https://huggingface.co/docs/transformers/model_doc/transfo-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
 1. **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)** (from Microsoft), released together with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
 1. **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

README_zh-hans.md

+1
@@ -324,6 +324,7 @@ conda install -c huggingface transformers
 1. **[T5v1.1](https://huggingface.co/docs/transformers/model_doc/t5v1.1)** (from Google AI) released in the repository [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
 1. **[TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas)** (from Google AI) released with the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349) by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
 1. **[TAPEX](https://huggingface.co/docs/transformers/main/model_doc/tapex)** (from Microsoft Research) released with the paper [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
+1. **[Trajectory Transformer](https://huggingface.co/docs/transformers/main/model_doc/trajectory_transformer)** (from the University of California at Berkeley) released with the paper [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
 1. **[Transformer-XL](https://huggingface.co/docs/transformers/model_doc/transfo-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
 1. **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)** (from Microsoft) released with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
 1. **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

README_zh-hant.md

+1
@@ -336,6 +336,7 @@ conda install -c huggingface transformers
 1. **[T5v1.1](https://huggingface.co/docs/transformers/model_doc/t5v1.1)** (from Google AI) released in the repository [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
 1. **[TAPAS](https://huggingface.co/docs/transformers/model_doc/tapas)** (from Google AI) released with the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349) by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
 1. **[TAPEX](https://huggingface.co/docs/transformers/main/model_doc/tapex)** (from Microsoft Research) released with the paper [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
+1. **[Trajectory Transformer](https://huggingface.co/docs/transformers/main/model_doc/trajectory_transformer)** (from the University of California at Berkeley) released with the paper [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
 1. **[Transformer-XL](https://huggingface.co/docs/transformers/model_doc/transfo-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
 1. **[TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr)** (from Microsoft) released with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
 1. **[UniSpeech](https://huggingface.co/docs/transformers/model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.

docs/source/en/_toctree.yml

+2
@@ -342,6 +342,8 @@
   title: TAPAS
 - local: model_doc/tapex
   title: TAPEX
+- local: model_doc/trajectory_transformer
+  title: Trajectory Transformer
 - local: model_doc/transfo-xl
   title: Transformer XL
 - local: model_doc/trocr

docs/source/en/index.mdx

+2
@@ -142,6 +142,7 @@ The library currently contains JAX, PyTorch and TensorFlow implementations, pret
 1. **[T5v1.1](model_doc/t5v1.1)** (from Google AI) released in the repository [google-research/text-to-text-transfer-transformer](https://github.com/google-research/text-to-text-transfer-transformer/blob/main/released_checkpoints.md#t511) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
 1. **[TAPAS](model_doc/tapas)** (from Google AI) released with the paper [TAPAS: Weakly Supervised Table Parsing via Pre-training](https://arxiv.org/abs/2004.02349) by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
 1. **[TAPEX](model_doc/tapex)** (from Microsoft Research) released with the paper [TAPEX: Table Pre-training via Learning a Neural SQL Executor](https://arxiv.org/abs/2107.07653) by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
+1. **[Trajectory Transformer](model_doc/trajectory_transformer)** (from the University of California at Berkeley) released with the paper [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
 1. **[Transformer-XL](model_doc/transfo-xl)** (from Google/CMU) released with the paper [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860) by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
 1. **[TrOCR](model_doc/trocr)** (from Microsoft), released together with the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
 1. **[UniSpeech](model_doc/unispeech)** (from Microsoft Research) released with the paper [UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data](https://arxiv.org/abs/2101.07597) by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.
@@ -259,6 +260,7 @@ Flax), PyTorch, and/or TensorFlow.
 | Swin | | | | | |
 | T5 | | | | | |
 | TAPAS | | | | | |
+| Trajectory Transformer | | | | | |
 | Transformer-XL | | | | | |
 | TrOCR | | | | | |
 | UniSpeech | | | | | |

docs/source/en/model_doc/trajectory_transformer.mdx

+49

@@ -0,0 +1,49 @@
+<!--Copyright 2022 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Trajectory Transformer
+
+## Overview
+
+The Trajectory Transformer model was proposed in [Offline Reinforcement Learning as One Big Sequence Modeling Problem](https://arxiv.org/abs/2106.02039) by Michael Janner, Qiyang Li, Sergey Levine.
+
+The abstract from the paper is the following:
+
+*Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models,
+leveraging the Markov property to factorize problems in time. However, we can also view RL as a generic sequence
+modeling problem, with the goal being to produce a sequence of actions that leads to a sequence of high rewards.
+Viewed in this way, it is tempting to consider whether high-capacity sequence prediction models that work well
+in other domains, such as natural-language processing, can also provide effective solutions to the RL problem.
+To this end, we explore how RL can be tackled with the tools of sequence modeling, using a Transformer architecture
+to model distributions over trajectories and repurposing beam search as a planning algorithm. Framing RL as a sequence
+modeling problem simplifies a range of design decisions, allowing us to dispense with many of the components common
+in offline RL algorithms. We demonstrate the flexibility of this approach across long-horizon dynamics prediction,
+imitation learning, goal-conditioned RL, and offline RL. Further, we show that this approach can be combined with
+existing model-free algorithms to yield a state-of-the-art planner in sparse-reward, long-horizon tasks.*
+
+Tips:
+
+This Transformer is used for deep reinforcement learning. To use it, you need to create sequences from
+actions, states and rewards from all previous timesteps. The model treats all these elements together
+as one big sequence (a trajectory).
+
+This model was contributed by [CarlCochet](https://huggingface.co/CarlCochet). The original code can be found [here](https://github.com/jannerm/trajectory-transformer).
+
+## TrajectoryTransformerConfig
+
+[[autodoc]] TrajectoryTransformerConfig
+
+## TrajectoryTransformerModel
+
+[[autodoc]] TrajectoryTransformerModel
+    - forward
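
The tips in the new document describe the expected input format: one flat token sequence interleaving discretized states, actions and rewards. A minimal sketch of that pattern using the classes added in this commit follows; the random token ids stand in for a properly discretized trajectory, and the toy batch and sequence sizes are assumptions, not part of the commit.

```python
import torch

from transformers import TrajectoryTransformerConfig, TrajectoryTransformerModel

# Randomly initialized model from the default config; a pretrained checkpoint
# could be loaded with TrajectoryTransformerModel.from_pretrained(...) instead.
config = TrajectoryTransformerConfig()
model = TrajectoryTransformerModel(config)
model.eval()

# The model consumes one flat token sequence in which discretized states,
# actions and rewards from all previous timesteps are interleaved. The random
# ids below are placeholders for a real discretized trajectory.
batch_size, seq_len = 1, 24  # assumed toy sizes
trajectories = torch.randint(0, config.vocab_size, (batch_size, seq_len))

with torch.no_grad():
    outputs = model(trajectories)

# Logits over the discretized vocabulary, one distribution per position,
# which a planner (e.g. beam search, as in the paper) would sample from.
print(outputs.logits.shape)
```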

src/transformers/__init__.py

+20
@@ -284,6 +284,10 @@
     "models.t5": ["T5_PRETRAINED_CONFIG_ARCHIVE_MAP", "T5Config"],
     "models.tapas": ["TAPAS_PRETRAINED_CONFIG_ARCHIVE_MAP", "TapasConfig", "TapasTokenizer"],
     "models.tapex": ["TapexTokenizer"],
+    "models.trajectory_transformer": [
+        "TRAJECTORY_TRANSFORMER_PRETRAINED_CONFIG_ARCHIVE_MAP",
+        "TrajectoryTransformerConfig",
+    ],
     "models.transfo_xl": [
         "TRANSFO_XL_PRETRAINED_CONFIG_ARCHIVE_MAP",
         "TransfoXLConfig",
@@ -1571,6 +1575,13 @@
         "load_tf_weights_in_t5",
     ]
 )
+_import_structure["models.trajectory_transformer"].extend(
+    [
+        "TRAJECTORY_TRANSFORMER_PRETRAINED_MODEL_ARCHIVE_LIST",
+        "TrajectoryTransformerModel",
+        "TrajectoryTransformerPreTrainedModel",
+    ]
+)
 _import_structure["models.transfo_xl"].extend(
     [
         "TRANSFO_XL_PRETRAINED_MODEL_ARCHIVE_LIST",
@@ -2788,6 +2799,10 @@
     from .models.t5 import T5_PRETRAINED_CONFIG_ARCHIVE_MAP, T5Config
     from .models.tapas import TAPAS_PRETRAINED_CONFIG_ARCHIVE_MAP, TapasConfig, TapasTokenizer
     from .models.tapex import TapexTokenizer
+    from .models.trajectory_transformer import (
+        TRAJECTORY_TRANSFORMER_PRETRAINED_CONFIG_ARCHIVE_MAP,
+        TrajectoryTransformerConfig,
+    )
     from .models.transfo_xl import (
         TRANSFO_XL_PRETRAINED_CONFIG_ARCHIVE_MAP,
         TransfoXLConfig,
@@ -3863,6 +3878,11 @@
         T5PreTrainedModel,
         load_tf_weights_in_t5,
     )
+    from .models.trajectory_transformer import (
+        TRAJECTORY_TRANSFORMER_PRETRAINED_MODEL_ARCHIVE_LIST,
+        TrajectoryTransformerModel,
+        TrajectoryTransformerPreTrainedModel,
+    )
     from .models.transfo_xl import (
         TRANSFO_XL_PRETRAINED_MODEL_ARCHIVE_LIST,
         AdaptiveEmbedding,
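
Taken together, the `_import_structure` entries and the mirrored `TYPE_CHECKING` imports above register the new symbols with the library's lazy-import machinery, so they resolve from the package root. A quick sanity-check sketch, assuming a build with this commit applied:

```python
# The symbols registered above are importable from the top-level package,
# like every other model's public classes.
from transformers import (
    TRAJECTORY_TRANSFORMER_PRETRAINED_MODEL_ARCHIVE_LIST,
    TrajectoryTransformerConfig,
    TrajectoryTransformerModel,
)

model = TrajectoryTransformerModel(TrajectoryTransformerConfig())
print(type(model).__name__)  # TrajectoryTransformerModel
```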

src/transformers/models/__init__.py

+1
@@ -116,6 +116,7 @@
     t5,
     tapas,
     tapex,
+    trajectory_transformer,
     transfo_xl,
     trocr,
     unispeech,

src/transformers/models/auto/configuration_auto.py

+2
@@ -113,6 +113,7 @@
         ("swin", "SwinConfig"),
         ("t5", "T5Config"),
         ("tapas", "TapasConfig"),
+        ("trajectory_transformer", "TrajectoryTransformerConfig"),
         ("transfo-xl", "TransfoXLConfig"),
         ("trocr", "TrOCRConfig"),
         ("unispeech", "UniSpeechConfig"),
@@ -338,6 +339,7 @@
         ("t5v1.1", "T5v1.1"),
         ("tapas", "TAPAS"),
         ("tapex", "TAPEX"),
+        ("trajectory_transformer", "Trajectory Transformer"),
         ("transfo-xl", "Transformer-XL"),
         ("trocr", "TrOCR"),
         ("unispeech", "UniSpeech"),

src/transformers/models/auto/modeling_auto.py

+1
@@ -108,6 +108,7 @@
         ("swin", "SwinModel"),
         ("t5", "T5Model"),
         ("tapas", "TapasModel"),
+        ("trajectory_transformer", "TrajectoryTransformerModel"),
         ("transfo-xl", "TransfoXLModel"),
         ("unispeech", "UniSpeechModel"),
         ("unispeech-sat", "UniSpeechSatModel"),
