Sequence-to-Sequence NMT #806


Merged
merged 4 commits into from
Dec 31, 2019

Conversation

Om-Pandey
Contributor

@seanpmorgan With reference to #335, I have created the tutorial. Please review and merge the files. Thanks!

@Om-Pandey Om-Pandey requested a review from a team as a code owner December 22, 2019 14:17

@seanpmorgan
Member

@Om-Pandey Thanks very much for the contribution. Looks very good, but would it be possible to utilize a dataset found in tfds?
https://www.tensorflow.org/datasets/catalog/overview

Because our tutorials will be tested for correctness #485, we need these to run end-to-end.
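For anyone following along, a hedged sketch of what switching to tfds might look like. `ted_hrlr_translate/pt_to_en` is just one translation pair from the linked catalog, and the stub list below stands in for the (source, target) pairs `tfds.load` would yield, so the sketch runs without the multi-gigabyte download:

```python
# The real notebook change would look roughly like this (needs network + tfds):
#   import tensorflow_datasets as tfds
#   ds = tfds.load("ted_hrlr_translate/pt_to_en", split="train", as_supervised=True)
#   pairs = [(s.numpy().decode(), t.numpy().decode()) for s, t in ds.take(2)]
# A stub stands in for the (source, target) pairs tfds would yield, so the
# shape of the downstream pipeline can be shown without the download:
pairs = [("olá mundo", "hello world"), ("obrigado", "thank you")]

for src, tgt in pairs:
    print(f"{src} -> {tgt}")
```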

@Om-Pandey
Contributor Author

@seanpmorgan Yeah, I saw the datasets. I realize I would have to change the entire data reading and cleaning process for them to handle tokenization, even if I keep the model constant. Secondly, the translation datasets at https://www.tensorflow.org/datasets/catalog/overview are too large, roughly 1.5 GB each, and I really don't have the compute for that. 😅😬

But I can assure you that the dataset I used is pretty authentic (link) and that my entire code has been tested on it, so testing for correctness shouldn't surface any errors. A link to the dataset is also given in the notebook for reference. I can run the cells again and re-upload with the results if you want, though that will make the notebook bigger.

Check 'em: [attached screenshots t1 and t2 of the tested notebook output]
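As context for the cleaning step mentioned above, here is a minimal sketch of the kind of preprocessing an NMT data pipeline typically does (the `preprocess` helper is hypothetical; the notebook's actual code may differ):

```python
import re
import unicodedata

def preprocess(sentence: str) -> str:
    """Hypothetical sketch of a typical NMT cleaning step (the notebook's
    actual pipeline may differ): lowercase, strip accents, separate
    punctuation into its own tokens, and add start/end markers."""
    s = unicodedata.normalize("NFD", sentence.lower().strip())
    s = "".join(c for c in s if unicodedata.category(c) != "Mn")  # drop accents
    s = re.sub(r"([?.!,¿])", r" \1 ", s)  # pad punctuation with spaces
    s = re.sub(r"\s+", " ", s).strip()    # collapse repeated whitespace
    return f"<start> {s} <end>"

print(preprocess("Va !"))   # -> <start> va ! <end>
print(preprocess("Déjà."))  # -> <start> deja . <end>
```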

Contributor Author

@Om-Pandey Om-Pandey left a comment


Removed a few errors in the display pipeline and removed unnecessary spaces.

@Om-Pandey Om-Pandey mentioned this pull request Dec 26, 2019
@seanpmorgan
Member

Removed a few errors in the display pipeline and removed unnecessary spaces.

Thanks @Om-Pandey! So yes, having the results within the notebook is ideal for users to read through. You can remove the output from any cells you don't think are necessary for understanding.

In order to run the code in Colab, you'll need to download the data. You can run commands on the colab host by prefixing with !. For example: !wget --quiet https://dataset.tar.gz
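To make the download-and-extract step concrete, a sketch of the two shell commands the notebook would run. The wget URL above is a placeholder, so a locally created archive stands in for the real download here:

```shell
# In a Colab cell, shell commands are prefixed with "!", e.g.:
#   !wget --quiet <dataset URL>
# The tar.gz handling can be demonstrated with a stand-in archive:
mkdir -p data && printf 'Go.\tVa !\n' > data/fra.txt
tar -czf dataset.tar.gz data            # stand-in for the downloaded archive
rm -rf data && tar -xzf dataset.tar.gz  # what the notebook does after wget
cat data/fra.txt
```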

If the full notebook can run end-to-end then we can include this as a sort of "test" that the tf-docs team will run at each release.

How long is the total runtime of this example on a Colab instance?

* Coded-in the downloading and unzipping of the data module.
* Subsequent changes made to support the feature
@Om-Pandey
Contributor Author

Om-Pandey commented Dec 27, 2019

@seanpmorgan Yeah, the final changes have been made. I have added a lot of comments, which I think are sufficient for anyone to understand the code properly. As you suggested, I also coded in the data downloading and unzipping steps on Colab in my recent commit.
I just checked: the notebook typically takes less than a minute to run, apart from the variable training time, which is about 22 seconds per epoch; multiply that by however many epochs you find necessary. Finally, the testing and final prediction take about a minute too.
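A back-of-the-envelope total from the figures above (the epoch count below is an assumption, not something the notebook fixes):

```python
# Rough runtime estimate from the figures quoted above: ~22 s per epoch,
# plus roughly a minute each for the non-training cells and for
# testing/final prediction. The epoch count is a hypothetical choice.
SECONDS_PER_EPOCH = 22
EPOCHS = 10          # hypothetical; use however many epochs you train for
OTHER_CELLS_S = 60   # everything before training, "less than a minute"
EVAL_S = 60          # testing + final prediction, "about a minute"

total_s = OTHER_CELLS_S + EPOCHS * SECONDS_PER_EPOCH + EVAL_S
print(f"~{total_s // 60} min {total_s % 60} s for {EPOCHS} epochs")  # -> ~5 min 40 s for 10 epochs
```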

Member

@seanpmorgan seanpmorgan left a comment


Thanks for the contribution! LGTM. We may need a small patch after the docs team tries testing this, but we can address that once we get the first test results.

@seanpmorgan seanpmorgan merged commit f663cb3 into tensorflow:master Dec 31, 2019
@Om-Pandey Om-Pandey deleted the tutor branch December 31, 2019 19:47
@user06039

@Om-Pandey Will you be implementing an attention mechanism, beam search, or a bidirectional encoder-decoder model? As it stands, your notebook is not using TensorFlow Addons at all.
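For anyone picking this up: additive (Bahdanau) attention, the piece TensorFlow Addons provides as `tfa.seq2seq.BahdanauAttention`, reduces to a small computation. Sketched here in plain NumPy rather than TFA, with arbitrary weights standing in for learned parameters:

```python
import numpy as np

def bahdanau_attention(query, values, W1, W2, v):
    """Additive (Bahdanau) attention, sketched in plain NumPy.
    query: (units,) decoder state; values: (T, units) encoder outputs;
    W1, W2: (units, units) weight matrices; v: (units,) score vector."""
    # score_t = v . tanh(W1 @ values_t + W2 @ query)
    scores = np.tanh(values @ W1.T + query @ W2.T) @ v   # (T,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                             # softmax over time steps
    context = weights @ values                           # (units,) weighted sum
    return context, weights

rng = np.random.default_rng(0)
units, T = 4, 5
context, weights = bahdanau_attention(
    rng.standard_normal(units),            # decoder query
    rng.standard_normal((T, units)),       # encoder outputs
    rng.standard_normal((units, units)),   # W1 (stand-in for learned weights)
    rng.standard_normal((units, units)),   # W2
    rng.standard_normal(units),            # v
)
print(round(weights.sum(), 6))  # attention weights sum to 1
```

In the real model these weights would be `tf.keras` layers trained jointly with the decoder; the arithmetic is the same.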
