Merge group related improvements for convolution operations #3439
base: develop
18ead08 to cf8a4d4
706ca76 to 6fe2eaa
include/ck/tensor_operation/operator_transform/transform_conv_bwd_data_to_gemm_v1.hpp
}
};

/**
Is there a way to avoid duplication with the struct above? AFAIK, these functions are only used in two places, so I would recommend always calling the MG variant and having a default value (and ignoring the GStep return value) to avoid duplicating the rest of the logic.
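To make the suggestion concrete, here is a minimal sketch of the pattern: a single merged-group (MG) helper with a defaulted argument, and a non-merged wrapper that discards the group step. All names here (`Desc`, `make_desc_mg`, `make_desc`, `num_groups_to_merge`) and the toy dimension arithmetic are hypothetical and are not taken from CK or from this PR; only the shape of the refactoring is the point.

```cpp
#include <utility>

// Hypothetical stand-in for a tensor descriptor; NOT a CK type.
struct Desc
{
    int rows;
    int cols;
};

// Hypothetical merged-group (MG) helper. It returns the descriptor together
// with a group step. The default of 1 corresponds to the non-merged case.
// The arithmetic is a toy placeholder, not CK's actual transform.
inline std::pair<Desc, int> make_desc_mg(int k, int c, int num_groups_to_merge = 1)
{
    Desc d{k * num_groups_to_merge, c * num_groups_to_merge};
    return {d, num_groups_to_merge};
}

// Non-merged caller: relies on the default value and ignores the group step,
// so the shared logic lives only in make_desc_mg.
inline Desc make_desc(int k, int c)
{
    return make_desc_mg(k, c).first;
}
```

With this shape, existing non-merged call sites would keep calling `make_desc(k, c)`, while merged-group call sites would call `make_desc_mg(k, c, n)` and use both returned values. Whether this carries over to the struct discussed here depends on the dimension mismatch raised in the reply.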
I unified the generator helper function, which was low-hanging fruit. For the struct, I am not sure it's a good idea. It can be generalized, but since the dimensions don't match, it is slightly more complicated. For the time being I would leave it as is, since I don't see the long-term strategy for it. If this is the new method, a more general infrastructure will be needed to support all convolution variants; if that is not the case, having this dirty variant seems fine to me.
@bartekxk do you have an opinion on this?
bdc37dd to 73ecc3a
73ecc3a to f7025f6
A collection of several improvements related to depthwise convolution.
Proposed changes
The list of to-dos came from an investigation of 2D convolution performance on fp32 input data. It turns out that CK has limited support for merged group convolutions. The purpose of this PR is to add some of the missing functionality.
Checklist
Please put an x into the boxes that apply. You can also fill these out after creating the PR. If you're not sure, please don't hesitate to ask.
clang-format on all changed files
Discussion
If this is a relatively large or complex change, feel free to start a discussion by explaining why you chose the solution you did and what alternatives you considered