Sequence-to-Sequence NMT #806
Conversation
Check out this pull request on ReviewNB. You'll be able to see the Jupyter notebook diff and discuss changes. Powered by ReviewNB.
@Om-Pandey Thanks very much for the contribution. It looks very good, but would it be possible to use a dataset found in tfds? Because our tutorials will be tested for correctness (#485), we need them to run end-to-end.
@seanpmorgan Yeah, I saw the datasets. I realize I would have to change the entire data reading and cleaning process to tokenize them, even if I keep the model constant. Secondly, the translation datasets on https://www.tensorflow.org/datasets/catalog/overview are too large, ≈ 1.5 GB each, and I really don't have the compute for that. 😅😬 But I can assure you that the dataset I used is pretty authentic (link) and that my entire code has been tested on it, so there shouldn't be any errors when testing for correctness. A link to the dataset is also given in the notebook for reference. I can re-run the cells and re-upload with the results if you want; it will just make the notebook bigger, though.
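For context, loading a translation dataset from tfds would look something like the sketch below; the dataset name here is illustrative, not the one used in this PR.

```python
import tensorflow_datasets as tfds

# Illustrative dataset name; any tfds translation dataset loads the same way.
# as_supervised=True yields (source, target) sentence pairs as tf.string tensors.
(train_ds, val_ds), info = tfds.load(
    "ted_hrlr_translate/pt_to_en",
    split=["train", "validation"],
    as_supervised=True,
    with_info=True,
)

for src, tgt in train_ds.take(1):
    print(src.numpy().decode("utf-8"), "->", tgt.numpy().decode("utf-8"))
```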
Removed a few errors in the display pipeline and removed unnecessary spaces
Thanks @Om-Pandey! So yes, having the results within the notebook is ideal for users reading through. You can remove the output from any cells you don't think are necessary for understanding. In order to run the code in Colab, you'll need to download the data; you can run commands on the Colab host by prefixing them with `!`. If the full notebook can run end-to-end, then we can include this as a sort of "test" that the tf-docs team will run at each release. How long is the total runtime of this example on a Colab instance?
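As a sketch of what that looks like in a notebook cell (the URL is a placeholder, not the actual dataset link from the notebook):

```python
# In Colab, "!" runs the rest of the line as a shell command on the host VM.
# The URL below is a placeholder for the dataset link given in the notebook.
!wget -q https://example.com/eng-fra.zip -O data.zip
!unzip -o data.zip -d data/
```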
* Coded in the downloading and unzipping of the data module.
* Subsequent changes made to support the feature.
@seanpmorgan Yeah, the final changes have been made. I have added a lot of comments, which I think are sufficient for anyone to understand the code properly. As you suggested, I also coded in the data downloading and unzipping steps on Colab in my recent commit.
Thanks for the contribution! LGTM. We may need a small patch after the docs team tries testing this, but we can address that once we get the first test results.
@Om-Pandey Will you be implementing an attention mechanism, beam search, or a bidirectional encoder-decoder model? It's just that your notebook is not using TensorFlow Addons at all.
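For reference, those pieces are what `tfa.seq2seq` provides; a minimal sketch of wiring them together might look like the following (layer and vocabulary sizes are illustrative only):

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Illustrative sizes only.
units, vocab_size, beam_width = 256, 8000, 5

# Decoder cell wrapped with Luong-style attention over encoder outputs.
decoder_cell = tf.keras.layers.LSTMCell(units)
attention = tfa.seq2seq.LuongAttention(units)  # memory set later via setup_memory()
attn_cell = tfa.seq2seq.AttentionWrapper(
    decoder_cell, attention, attention_layer_size=units
)
projection = tf.keras.layers.Dense(vocab_size)

# Beam search decoder for inference; encoder outputs must be tiled beam_width times.
decoder = tfa.seq2seq.BeamSearchDecoder(
    attn_cell, beam_width=beam_width, output_layer=projection
)
# Usage, given encoder_outputs from the encoder:
#   attention.setup_memory(tfa.seq2seq.tile_batch(encoder_outputs, beam_width))
```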
@seanpmorgan With reference to #335, I have created the tutorial; please check and merge the files. Thanks!