This repository was archived by the owner on Jul 7, 2023. It is now read-only.

v1.3.1 #451

Merged
merged 11 commits into from
Dec 1, 2017

Conversation

@rsepassi (Contributor) commented Dec 1, 2017

No description provided.

T2T Team and others added 11 commits December 1, 2017 09:17
…PU training. Modify transformer to keep the packed-together examples from attending to one another.

PiperOrigin-RevId: 177481956
…int compatibility bug.

PiperOrigin-RevId: 177487398
PiperOrigin-RevId: 177505082
…mprovements/fixes

PiperOrigin-RevId: 177538074
PiperOrigin-RevId: 177540047
PiperOrigin-RevId: 177547599
PiperOrigin-RevId: 177554962
PiperOrigin-RevId: 177641254
@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@@ -24,7 +24,6 @@
     'tensor2tensor/bin/t2t-datagen',
     'tensor2tensor/bin/t2t-decoder',
     'tensor2tensor/bin/t2t-make-tf-configs',
-    'tensor2tensor/bin/t2t-bleu',
Contributor

Is the deletion of t2t-bleu intentional?
(I can understand it, if yes.)

Contributor Author

Thanks for noticing that. No, it was not intentional and likely due to a bad internal merge of the PR.

@lukaszkaiser
Contributor

Sorry guys, it was me, and it was halfway intentional. There was a Python 2 vs 3 problem, and I'm not sure we want to maintain another binary now that SacreBLEU exists. But we should certainly add the new BLEU as a metric to TensorBoard. In any case: can we get this in and redo the BLEU binary in another PR? Would that be OK, Martin? Sorry for the issue!

@martinpopel
Contributor

Yes, we can redo t2t-bleu.py in another PR; I don't want to slow down your work.
If really needed, I can look at the Python 2 compatibility (by the way, SacreBLEU is Python 3 only).
Note that t2t-bleu.py does not really overlap with SacreBLEU's functionality (actually, it could use SacreBLEU internally): it focuses on evaluating all checkpoints and storing the scores to a TensorBoard event file (see the #436 summary).
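The "evaluate all checkpoints" part of that workflow can be sketched roughly as follows. This is a hypothetical, minimal illustration, not the actual t2t-bleu.py code: the helper name `list_checkpoints` and the reliance on the `model.ckpt-<step>.index` filename pattern are assumptions about how one might enumerate TensorFlow checkpoints in a model directory.

```python
import os
import re

def list_checkpoints(model_dir):
    """Return checkpoint step numbers found in model_dir, sorted ascending.

    Hypothetical helper: it scans for files named like
    'model.ckpt-<step>.index' (one .index file per checkpoint) and
    extracts the global step from each name. A real script could then
    loop over these steps, decode with each checkpoint, compute BLEU,
    and append the score to a TensorBoard event file.
    """
    steps = set()
    for name in os.listdir(model_dir):
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            steps.add(int(match.group(1)))
    return sorted(steps)
```

Sorting by step matters here: the external evaluation job can remember the last step it scored and only process newer checkpoints, which is what lets it run on a separate CPU or older GPU without touching the training job.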

@lukaszkaiser
Contributor

Let's do that. The problem with Python 2 is that Google uses it internally, so if the code isn't compatible, all our internal tests start failing. So yes, let's get this in and then redo the BLEU. Could we have your BLEU score added to metrics, or maybe replace the current approx-BLEU entirely? It's much more useful, as it's much closer to the real BLEU than the current approximation. What do you think?

@lukaszkaiser merged commit 970dac9 into tensorflow:master Dec 1, 2017
@martinpopel
Contributor

Could we have your BLEU score added to metrics, or maybe replace the current approx-BLEU entirely?

"my" BLEU actually uses "your" bleu_hook.py compute_bleu(), i.e. the same code as is used for computing approx_bleu (I just added smoothing, but this is already merged).
The important difference is in the input: for approx_bleu, the reference_corpus and translation_corpus are tensors of subword IDs, whereas t2t-bleu passes lists of words (tokenized from plaintext using bleu_tokenize()). I am not sure whether it is wise to use this instead of approx_bleu: it may be too slow, and I am not sure the detokenization+tokenization can be converted to static graph ops. To get close to the real BLEU, we would also need eval_run_autoregressive and beam search. Currently, internal evaluation does not work for multi-GPU training, so we need an external script/job anyway. I like it this way because I can run it on a CPU (or an older GPU) and not slow down the training on GPUs.
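For readers unfamiliar with the metric being discussed, here is a self-contained sketch of a smoothed sentence-level BLEU over word-token lists (clipped n-gram precisions, add-one smoothing, brevity penalty). It is an illustration only, not the actual bleu_hook.compute_bleu, which operates on whole corpora and differs in detail; the function names `ngrams` and `smoothed_bleu` are invented for this example.

```python
import collections
import math

def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return collections.Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def smoothed_bleu(reference, translation, max_order=4):
    """Smoothed sentence-level BLEU between two token lists (sketch)."""
    precisions = []
    for n in range(1, max_order + 1):
        ref_counts = ngrams(reference, n)
        trans_counts = ngrams(translation, n)
        # Counter intersection gives clipped n-gram matches.
        overlap = sum((trans_counts & ref_counts).values())
        total = max(len(translation) - n + 1, 0)
        # Add-one smoothing keeps every precision nonzero.
        precisions.append((overlap + 1.0) / (total + 1.0))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    if not translation:
        return 0.0
    # Brevity penalty: punish translations shorter than the reference.
    ratio = len(reference) / len(translation)
    brevity_penalty = 1.0 if ratio < 1.0 else math.exp(1.0 - ratio)
    return geo_mean * brevity_penalty
```

The point of the discussion above is only which tokens are fed into such a function: subword IDs straight from the model (approx_bleu) versus words recovered from detokenized plaintext (t2t-bleu), with the latter tracking the "real" BLEU much more closely.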

@martinpopel
Contributor

What about the failing Travis build for this PR?
I've noticed the tests sometimes fail non-deterministically, e.g. here.

@rsepassi
Contributor Author

rsepassi commented Dec 2, 2017

Yes, that Algorithmic Algebra test is flaky. You can see that master builds fine (no change from this PR). We'll likely end up removing those problems as nobody is using them.
