Implement generate_vbe_metadata cpu #3715
This pull request was exported from Phabricator. Differential Revision: D69162870
Summary:
X-link: facebookresearch/FBGEMM#796

This diff implements `generate_vbe_metadata` for CPU, so that the function returns the same output on CPU, CUDA, and MTIA.

To support VBE on CPU with the existing fixed-batch-size CPU kernels, the offsets must be recomputed, which was previously done in Python. This diff implements the offsets recomputation in C++, so that all of these manipulations are done in C++.

Note that reshaping `offsets` and `grad_input` to work with the existing fixed-batch-size CPU kernels is done in Autograd rather than in the wrapper, to avoid repeating the computation.

VBE CPU tests are in the next diff.

Differential Revision: D69162870
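To make the offsets recomputation mentioned in the summary more concrete, here is a minimal Python sketch of one way variable-batch-size offsets could be expanded so that a fixed-batch-size kernel can consume them. It is an illustration under stated assumptions, not the code in this diff: the function name `pad_vbe_offsets`, the plain-list inputs, and the choice to pad every feature up to the largest batch size with empty bags are all hypothetical.

```python
from itertools import accumulate


def pad_vbe_offsets(offsets, batch_sizes):
    """Expand variable-batch offsets to a fixed batch size per feature.

    Hypothetical helper for illustration only.

    offsets:     CSR-style bag offsets, length sum(batch_sizes) + 1.
    batch_sizes: number of samples (bags) for each feature.

    Every feature is padded to max(batch_sizes) bags. Padding bags are
    empty: they start and end at the same position, so a fixed-batch-size
    kernel would pool nothing for them.
    """
    if len(offsets) != sum(batch_sizes) + 1:
        raise ValueError("offsets must have sum(batch_sizes) + 1 entries")

    max_b = max(batch_sizes, default=0)
    # Index in `offsets` where each feature's bags begin.
    starts = [0] + list(accumulate(batch_sizes))

    padded = []
    for t, b in enumerate(batch_sizes):
        begin = starts[t]
        # Real bags: copy the start offset of each bag of this feature.
        padded.extend(offsets[begin:begin + b])
        # Padding bags: repeat the end of this feature's data (empty bags).
        padded.extend([offsets[begin + b]] * (max_b - b))
    # Closing offset marks the end of the last bag.
    padded.append(offsets[-1])
    return padded


# Two features with batch sizes 2 and 1; the second is padded to 2 bags.
print(pad_vbe_offsets([0, 2, 3, 5], [2, 1]))
```

With this layout the result always has `len(batch_sizes) * max(batch_sizes) + 1` entries, which is the shape a fixed-batch-size kernel expects. The outputs (and gradients) for the padding bags would then need to be dropped or reshaped afterwards, which is the step the summary says is handled in Autograd.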
This pull request has been merged in f0ff8bb.
X-link: pytorch#3715
Pull Request resolved: facebookresearch/FBGEMM#796
Reviewed By: sryap, nautsimon
Differential Revision: D69162870
fbshipit-source-id: 08c6e45b8f0d319b96371acaba0d9a27570a1bd7