This repository was archived by the owner on Jul 7, 2023. It is now read-only.
I want to use the Transformer to reproduce the result on the EN-DE 4.5M-sentence-pair dataset used in the paper "Attention Is All You Need", but I can't find any guideline.
What I need:
How do I run the Transformer?
There are some examples that use `t2t-trainer`, but the EN-DE 4.5M dataset is not in the problems list.
How do I feed the EN-DE 4.5M dataset into the model?
I just have 4.5M EN-DE sentence pairs. How do I produce `target_space_id` and the other features the model expects? How is the embedding matrix initialized?
The explanation of the Transformer's function inputs/outputs is not clear, e.g. `target_space_id` is described only as "A Tensor". Where can I find more detail about these inputs/outputs?
Is there any guideline for reproducing the paper's results, or at least one explaining how to use the Transformer to train on a dataset that contains only sentence pairs?
TensorFlow and tensor2tensor versions
TensorFlow 1.4, tensor2tensor 1.5.6
The guideline is here: https://github.com/tensorflow/tensor2tensor#walkthrough
You should get BLEU > 20 on a single GPU, depending on the batch size you can fit into GPU memory and how long you let it train (`--train_steps`).
To get better results (replicating the paper or beyond), you should use 8 GPUs and batch size 4096.
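For concreteness, the walkthrough boils down to a data-generation step followed by a training step. The paths, step count, and output directory below are placeholders for your own setup; adjust `batch_size` in `--hparams` if you run out of GPU memory:

```shell
# Download and preprocess the WMT'14 EN-DE data (~4.5M sentence
# pairs) and build the 32k-subword vocabulary.
t2t-datagen \
  --data_dir=$HOME/t2t_data \
  --tmp_dir=/tmp/t2t_datagen \
  --problem=translate_ende_wmt32k

# Train the base Transformer on a single GPU. Raise --train_steps
# (and batch_size, if memory allows) for better BLEU.
t2t-trainer \
  --data_dir=$HOME/t2t_data \
  --problems=translate_ende_wmt32k \
  --model=transformer \
  --hparams_set=transformer_base_single_gpu \
  --output_dir=$HOME/t2t_train/ende \
  --train_steps=250000
```

Note that in this t2t version `t2t-datagen` takes `--problem` while `t2t-trainer` takes `--problems`.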
For further discussion of the issues with replicating the EN-DE results, see #317 (and close this issue) and the recent Gitter discussion.
> but EN-DE 4.5M dataset is not in the problems list.
It is: `translate_ende_wmt32k`.
> How to feed the EN-DE 4.5M dataset into the model?
Just follow the Walkthrough.
The input and target space IDs were important for multi-task (multi-problem) training, which is currently broken, so you can ignore them.
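If you want to train on your own parallel corpus instead of the built-in WMT problem, the usual route is to register a custom problem and point the tools at it with `--t2t_usr_dir`. A minimal sketch, with hypothetical file and class names, and assuming the `Text2TextProblem` base class available in your t2t release (the exact API varies between versions, so check `data_generators/text_problems.py` in your installation):

```python
# my_problem.py -- hypothetical module; load it with
# t2t-datagen/t2t-trainer via --t2t_usr_dir=<dir containing this file>.
from tensor2tensor.data_generators import text_problems
from tensor2tensor.utils import registry


@registry.register_problem
class TranslateMyEnde(text_problems.Text2TextProblem):
  """EN-DE translation from a user-supplied parallel corpus."""

  @property
  def approx_vocab_size(self):
    return 2**15  # ~32k subwords, matching the paper's setup

  @property
  def is_generate_per_split(self):
    # Generate one stream of samples and let t2t carve out a dev split.
    return False

  def generate_samples(self, data_dir, tmp_dir, dataset_split):
    # Assumed file layout: two line-aligned files, one sentence per line.
    with open("train.en") as f_en, open("train.de") as f_de:
      for en, de in zip(f_en, f_de):
        yield {"inputs": en.strip(), "targets": de.strip()}
```

With this, `target_space_id` and the embedding matrix need no manual handling: the subword vocabulary is built during `t2t-datagen`, and the model initializes its own embeddings.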