Notation for tensor and matrix dimensions is inconsistent #1344

In NCHW tensor notation, the last dimension (W) is the contiguous dimension. In column-major matrix notation, the first dimension is the contiguous dimension. We haven't needed to think much about this because our data samples are usually 1D or 3D, but transformers require batched matrix multiplication, so tensors and matrices now have to be described together. We should settle this question and commit to a consistent scheme to avoid confusion.
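To make the mismatch concrete, here is a minimal sketch in plain Python that computes element strides under both conventions. The helper names `strides` and `flat_index` and the batch shape are made up for illustration; they are not LBANN or DiHydrogen APIs.

```python
def strides(shape, order="C"):
    """Element strides for a dense array of the given shape.

    order="C": row-major, last dimension contiguous (NCHW-style).
    order="F": column-major, first dimension contiguous.
    """
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    dims = list(shape)
    if order == "C":
        dims.reverse()  # walk from the contiguous dimension outward
    result = []
    step = 1
    for d in dims:
        result.append(step)
        step *= d
    if order == "C":
        result.reverse()
    return tuple(result)


def flat_index(index, shape, order="C"):
    """Offset of a multi-index in the flattened buffer."""
    return sum(i * s for i, s in zip(index, strides(shape, order)))


# Hypothetical batch of 2 matrices, each 3 x 4.
shape = (2, 3, 4)
print(strides(shape, "C"))  # (12, 4, 1): last dimension is contiguous
print(strides(shape, "F"))  # (1, 2, 6): first dimension is contiguous
```

The same shape tuple describes two different memory layouts depending on the convention. A column-major m x n matrix occupies the same memory as a row-major n x m matrix, so any scheme we pick has to say which of the two readings a dimension list means.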

As much as it pains me coming from an applied math background, I think we should switch to C/row-major/NCHW notation. It matches PyTorch, TensorFlow, and NumPy and seems to be more natural for practitioners.

Whatever we decide, DiHydrogen should use the same scheme as LBANN. Pinging @benson31, @naoyam, @ndryden.
