CUTLASS
CUDA Templates for Linear Algebra Subroutines and Solvers

threadblock → warp Relation

File in include/cutlass/epilogue/threadblock    Includes file in include/cutlass/epilogue/warp
default_epilogue_complex_tensor_op.h            fragment_iterator_complex_tensor_op.h
default_epilogue_complex_tensor_op.h            tile_iterator_tensor_op.h
default_epilogue_simt.h                         fragment_iterator_simt.h
default_epilogue_simt.h                         tile_iterator_simt.h
default_epilogue_tensor_op.h                    fragment_iterator_tensor_op.h
default_epilogue_tensor_op.h                    tile_iterator_tensor_op.h
default_epilogue_volta_tensor_op.h              fragment_iterator_volta_tensor_op.h
default_epilogue_volta_tensor_op.h              tile_iterator_volta_tensor_op.h
default_epilogue_wmma_tensor_op.h               fragment_iterator_wmma_tensor_op.h
default_epilogue_wmma_tensor_op.h               tile_iterator_wmma_tensor_op.h