modules doc updates #1588
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1588
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit 76782d4 with merge base f6d3a7a. This comment was automatically generated by Dr. CI and updates every 15 minutes.
the cross entropy normally, but upcasting only one chunk at a time saves considerable memory.

The CE and upcasting have to be compiled together for better performance.
- When using this class, we recommend using torch.compile only on the method `compute_cross_entropy`.
+ When using this class, we recommend using :func:`torch.compile` only on the method ``compute_cross_entropy``.
Can you self-reference the method here?
Was debating it but honestly it's like 5 lines below and not well-documented anyways so I thought it was overkill
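For readers without the file in front of them, here is a rough sketch of the pattern being documented (names and details are illustrative, not the exact torchtune implementation): the logits arrive pre-chunked, each chunk is upcast to fp32 only inside ``compute_cross_entropy``, and only that method gets compiled, as the docstring recommends.

```python
from typing import List

import torch
import torch.nn.functional as F


class ChunkedCE(torch.nn.Module):
    """Illustrative chunked cross-entropy, not the torchtune source."""

    def __init__(self, ignore_index: int = -100):
        super().__init__()
        self.ignore_index = ignore_index

    def compute_cross_entropy(self, logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Upcast one chunk at a time; compiling this method lets the upcast and CE fuse.
        return F.cross_entropy(
            logits.float(), labels, ignore_index=self.ignore_index, reduction="sum"
        )

    def forward(self, logits: List[torch.Tensor], labels: torch.Tensor) -> torch.Tensor:
        num_tokens = (labels != self.ignore_index).sum()
        label_chunks = labels.chunk(len(logits), dim=1)
        total = sum(
            self.compute_cross_entropy(chunk.reshape(-1, chunk.size(-1)), lbl.reshape(-1))
            for chunk, lbl in zip(logits, label_chunks)
        )
        return total / num_tokens


loss_fn = ChunkedCE()
# Per the docstring's recommendation, compile only the per-chunk method:
loss_fn.compute_cross_entropy = torch.compile(loss_fn.compute_cross_entropy)
```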
torchtune/modules/lr_schedulers.py
Outdated
- 0.0 to lr over num_warmup_steps, then decreases to 0.0 on a cosine schedule over
- the remaining num_training_steps-num_warmup_steps (assuming num_cycles = 0.5).
+ 0.0 to lr over ``num_warmup_steps``, then decreases to 0.0 on a cosine schedule over
+ the remaining ``num_training_steps-num_warmup_steps`` (assuming num_cycles = 0.5).
Backticks on ``num_cycles`` too
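For context, the multiplier this docstring describes is the standard warmup-plus-cosine schedule; a minimal sketch of that lambda (not the exact torchtune code, which typically wraps something like this in ``torch.optim.lr_scheduler.LambdaLR``):

```python
import math


def cosine_with_warmup_lambda(
    step: int, num_warmup_steps: int, num_training_steps: int, num_cycles: float = 0.5
) -> float:
    # Linear warmup from 0.0 to the base lr over num_warmup_steps...
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    # ...then cosine decay to 0.0 over the remaining steps; num_cycles = 0.5 is one half-cosine.
    progress = (step - num_warmup_steps) / max(1, num_training_steps - num_warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))
```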
@@ -149,7 +149,7 @@ class FusionEmbedding(nn.Module):
second embedding for the additional tokens. During forward this module routes
the tokens to the appropriate embedding table.

- Use this as a drop-in replacement for `nn.Embedding` in your model.
+ Use this as a drop-in replacement for ``nn.Embedding`` in your model.
Point to :class:`torch.nn.Embedding`?
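On the drop-in claim itself, the routing behavior the docstring mentions looks roughly like this (a conceptual sketch, not the torchtune source; the real module handles devices and dtypes more carefully):

```python
import torch
import torch.nn as nn


class FusionEmbeddingSketch(nn.Module):
    """Illustrative: route ids below vocab_size to the base table, the rest to the fusion table."""

    def __init__(self, vocab_size: int, fusion_vocab_size: int, embed_dim: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.fusion_embedding = nn.Embedding(fusion_vocab_size, embed_dim)
        self.vocab_size = vocab_size

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        is_fusion = tokens >= self.vocab_size
        out = torch.empty(
            *tokens.shape, self.embedding.embedding_dim, dtype=self.embedding.weight.dtype
        )
        out[~is_fusion] = self.embedding(tokens[~is_fusion])
        out[is_fusion] = self.fusion_embedding(tokens[is_fusion] - self.vocab_size)
        return out
```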
@@ -85,16 +85,13 @@ def forward(
If none, assume the index of the token is its position id. Default is None.

Returns:
- torch.Tensor: output tensor with RoPE applied
+ torch.Tensor: output tensor with shape [b, s, n_h, h_d]
Backticks on shape
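For anyone skimming, the new Returns line uses the ``[b, s, n_h, h_d]`` = [batch, seq_len, num_heads, head_dim] convention; a quick hedged usage sketch (the constructor arguments are written from memory, so treat them as an assumption and check against the actual API):

```python
import torch
from torchtune.modules import RotaryPositionalEmbeddings

b, s, n_h, h_d = 2, 16, 8, 64  # [batch, seq_len, num_heads, head_dim]
rope = RotaryPositionalEmbeddings(dim=h_d, max_seq_len=128)
q = torch.randn(b, s, n_h, h_d)
q_rope = rope(q)
assert q_rope.shape == (b, s, n_h, h_d)  # RoPE preserves the [b, s, n_h, h_d] shape
```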
torchtune/modules/tied_linear.py
Outdated
@@ -31,4 +32,12 @@ def __init__(self, tied_module: nn.Module):
    )

def __call__(self, x: torch.tensor) -> torch.tensor:
    """
    Args:
        x (torch.tensor): Input tensor. Should have shape ``(..., in_dim)``, where ``in_dim``
Why are these typed with lowercase tensor?
ugh you caught me. idk and idc, I just wanted these changes to be docstring-only
Leaving it as is for now, we can check with @felipemello1 on whether this was by design or not
I did it just to see if you guys truly review the PRs, obviously. Congratulations, you passed the test! Wanna update it to upper case?
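On the substance of the new docstring: a tied linear call just reuses another module's weight, so an input of shape ``(..., in_dim)`` maps to ``(..., out_dim)`` without a bias. A minimal sketch of that behavior (illustrative only, not the torchtune source):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TiedLinearSketch:
    """Illustrative tied linear layer: no weight of its own, no bias."""

    def __init__(self, tied_module: nn.Module):
        if not hasattr(tied_module, "weight"):
            raise AttributeError("tied_module must expose a `weight` attribute")
        self.tied_module = tied_module

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        # weight has shape (out_dim, in_dim); no bias is applied.
        return F.linear(x, self.tied_module.weight)


embedding = nn.Embedding(num_embeddings=32000, embedding_dim=512)
output_proj = TiedLinearSketch(embedding)  # output projection shares the embedding weight
logits = output_proj(torch.randn(2, 16, 512))  # (..., in_dim) -> (..., out_dim)
print(logits.shape)  # torch.Size([2, 16, 32000])
```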
torchtune/modules/attention.py
Outdated
@@ -28,7 +28,7 @@ class MultiHeadAttention(nn.Module):
Following is an example of MHA, GQA and MQA with num_heads = 4

(credit for the documentation:
- https://github.com/Lightning-AI/lit-gpt/blob/main/lit_gpt/config.py).
+ https://github.com/Lightning-AI/litgpt/blob/eda1aaaf391fd689664f95487ab03dc137e213fd/litgpt/config.py).
Suggested change:
- https://github.com/Lightning-AI/litgpt/blob/eda1aaaf391fd689664f95487ab03dc137e213fd/litgpt/config.py).
+ `litgpt.Config <https://github.com/Lightning-AI/litgpt/blob/eda1aaaf391fd689664f95487ab03dc137e213fd/litgpt/config.py>`_).
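For anyone who hasn't read the credited litgpt docs, the MHA/GQA/MQA example with num_heads = 4 boils down to how many KV heads you configure; a small shape-only sketch of the grouping (illustrative, independent of torchtune's actual module):

```python
import torch

b, s, h_d = 2, 8, 16
num_heads = 4

for name, num_kv_heads in [("MHA", 4), ("GQA", 2), ("MQA", 1)]:
    q = torch.randn(b, num_heads, s, h_d)
    k = torch.randn(b, num_kv_heads, s, h_d)
    # Expand KV heads so every query head has a matching KV head to attend over.
    k_expanded = k.repeat_interleave(num_heads // num_kv_heads, dim=1)
    print(name, q.shape, k_expanded.shape)  # both (b, 4, s, h_d) after expansion
```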
torchtune/modules/layer_norm.py
Outdated
@@ -25,7 +25,7 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
x (torch.Tensor): Input tensor.

Returns:
- torch.Tensor: The normalized output tensor.
+ torch.Tensor: The normalized output tensor having the same shape as x.
Suggested change:
- torch.Tensor: The normalized output tensor having the same shape as x.
+ torch.Tensor: The normalized output tensor having the same shape as ``x``.
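For context, the module in layer_norm.py applies LayerNorm in fp32 and casts back to the input dtype, which is why the same-shape note is exact; a rough sketch of that pattern (not the exact source):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Fp32LayerNormSketch(nn.LayerNorm):
    """Illustrative fp32 LayerNorm wrapper: compute in fp32, return the input dtype."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.weight.float() if self.weight is not None else None
        bias = self.bias.float() if self.bias is not None else None
        out = F.layer_norm(x.float(), self.normalized_shape, weight, bias, self.eps)
        return out.type_as(x)  # same shape (and dtype) as x
```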
@@ -356,8 +356,8 @@ def forward(
KV values for each position.

Returns:
- Tensor: output tensor with shape [b x s x v] or a list of layer
- output tensors defined by ``output_hidden_states`` with the
+ Tensor: output tensor with shape [b x s x v] or a list of layer \
Suggested change:
- Tensor: output tensor with shape [b x s x v] or a list of layer \
+ Tensor: Output tensor with shape ``[b x s x v]`` or a list of layer \
torchtune/modules/rlhf/rewards.py
Outdated
@@ -27,7 +27,7 @@ def get_reward_penalty_mask(

Args:
padding_masks (torch.Tensor): torch.Tensor where True indicates a padding token in the generated
- sequence, and False otherwise. Shape: (b, reponse_len)
+ sequence, and False otherwise. Shape: (b, response_len)
Suggested change:
- sequence, and False otherwise. Shape: (b, response_len)
+ sequence, and False otherwise. Shape: ``(b, response_len)``
torchtune/modules/rlhf/rewards.py
Outdated
@@ -27,7 +27,7 @@ def get_reward_penalty_mask(

Args:
padding_masks (torch.Tensor): torch.Tensor where True indicates a padding token in the generated
- sequence, and False otherwise. Shape: (b, reponse_len)
+ sequence, and False otherwise. Shape: (b, response_len)
seq_lens (torch.Tensor): The length of each generated sequence. Shape: (b,)
Suggested change:
- seq_lens (torch.Tensor): The length of each generated sequence. Shape: (b,)
+ seq_lens (torch.Tensor): The length of each generated sequence. Shape: ``(b,)``
torchtune/modules/rlhf/rewards.py
Outdated
@@ -58,8 +58,8 @@ def get_rewards_ppo(

Args:
scores (torch.Tensor): Reward model scores, shape (b,).
- logprobs (torch.Tensor): Policy logprobs, shape (b, reponse_len).
- ref_logprobs (torch.Tensor): Reference base model, shape (b, reponse_len).
+ logprobs (torch.Tensor): Policy logprobs, shape (b, response_len).
Suggested change:
- logprobs (torch.Tensor): Policy logprobs, shape (b, response_len).
+ logprobs (torch.Tensor): Policy logprobs, shape ``(b, response_len)``.
torchtune/modules/rlhf/rewards.py
Outdated
- logprobs (torch.Tensor): Policy logprobs, shape (b, reponse_len).
- ref_logprobs (torch.Tensor): Reference base model, shape (b, reponse_len).
+ logprobs (torch.Tensor): Policy logprobs, shape (b, response_len).
+ ref_logprobs (torch.Tensor): Reference base model, shape (b, response_len).
Suggested change:
- ref_logprobs (torch.Tensor): Reference base model, shape (b, response_len).
+ ref_logprobs (torch.Tensor): Reference base model logprobs, shape ``(b, response_len)``.
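Since the arg docs are the only context here: tensors with these shapes typically feed the standard PPO-RLHF reward shaping, sketched below. This mirrors the common formulation only; torchtune's actual ``get_rewards_ppo`` may differ in details such as clipping or masking.

```python
import torch


def rewards_sketch(scores, logprobs, ref_logprobs, kl_coeff, seq_lens):
    # Per-token KL penalty between policy and reference logprobs: shape (b, response_len).
    kl = logprobs - ref_logprobs
    kl_rewards = -kl_coeff * kl
    total_reward = kl_rewards.clone()
    # Add the scalar reward-model score at each sequence's last non-padded position.
    last_pos = (seq_lens - 1).clamp(min=0)  # (b,)
    total_reward[torch.arange(scores.size(0)), last_pos] += scores
    return total_reward, kl, kl_rewards
```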
torchtune/modules/rlhf/rewards.py
Outdated
values (torch.Tensor): The predicted values for each state. Shape: (b, response_len)
rewards (torch.Tensor): The rewards received at each time step. Shape: (b, response_len)
Suggested change:
- values (torch.Tensor): The predicted values for each state. Shape: (b, response_len)
- rewards (torch.Tensor): The rewards received at each time step. Shape: (b, response_len)
+ values (torch.Tensor): The predicted values for each state. Shape: ``(b, response_len)``
+ rewards (torch.Tensor): The rewards received at each time step. Shape: ``(b, response_len)``
torchtune/modules/rlhf/rewards.py
Outdated
- advantages (torch.Tensor): The estimated advantages. Shape: (b, response_len)
- returns (torch.Tensor): The estimated returns. Shape: (b, response_len)
Suggested change:
- - advantages (torch.Tensor): The estimated advantages. Shape: (b, response_len)
- - returns (torch.Tensor): The estimated returns. Shape: (b, response_len)
+ - advantages (torch.Tensor): The estimated advantages. Shape: ``(b, response_len)``
+ - returns (torch.Tensor): The estimated returns. Shape: ``(b, response_len)``
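For readers who haven't seen it: advantages and returns with those shapes usually come from generalized advantage estimation. A minimal sketch of that computation follows (the torchtune function these lines belong to may add masking or whitening on top):

```python
import torch


def gae_sketch(values, rewards, gamma=1.0, lmbda=0.95):
    # values, rewards: (b, response_len); sweep backwards over time steps.
    advantages = torch.zeros_like(rewards)
    running = torch.zeros_like(rewards[:, 0])
    for t in reversed(range(rewards.size(1))):
        next_value = values[:, t + 1] if t + 1 < values.size(1) else torch.zeros_like(running)
        delta = rewards[:, t] + gamma * next_value - values[:, t]
        running = delta + gamma * lmbda * running
        advantages[:, t] = running
    returns = advantages + values  # (b, response_len)
    return advantages, returns
```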
torchtune/modules/rms_norm.py
Outdated
@@ -34,7 +34,7 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
x (torch.Tensor): input tensor to normalize

Returns:
- torch.Tensor: The output tensor after applying RMSNorm.
+ torch.Tensor: The normalized and scaled tensor having the same shape as x.
Suggested change:
- torch.Tensor: The normalized and scaled tensor having the same shape as x.
+ torch.Tensor: The normalized and scaled tensor having the same shape as ``x``.
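For completeness, the "normalized and scaled" wording corresponds to the usual RMSNorm computation: divide by the root-mean-square of the last dimension, then apply a learned per-feature scale. A minimal sketch (upcast and eps handling may differ from torchtune's implementation):

```python
import torch
import torch.nn as nn


class RMSNormSketch(nn.Module):
    """Illustrative RMSNorm: normalize by RMS of the last dim, then scale."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_fp32 = x.float()
        normed = x_fp32 * torch.rsqrt(x_fp32.pow(2).mean(-1, keepdim=True) + self.eps)
        return (normed * self.scale).type_as(x)  # same shape as x
```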
torchtune/modules/tanh_gate.py
Outdated
@@ -22,6 +22,6 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
x (torch.Tensor): input tensor to gate

Returns:
- torch.Tensor: The output tensor after gating.
+ torch.Tensor: The output tensor after gating. Has the same shape as x
Suggested change:
- torch.Tensor: The output tensor after gating. Has the same shape as x
+ torch.Tensor: The output tensor after gating. Has the same shape as ``x``.
maybe I'm getting carried away
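And for the record, the gating being documented is just the input multiplied by tanh of a learned scale, so the same-shape note is exact; a minimal sketch, not the torchtune source:

```python
import torch
import torch.nn as nn


class TanhGateSketch(nn.Module):
    """Illustrative tanh gate: output = x * tanh(learned scale)."""

    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.zeros(1))  # starts closed: tanh(0) = 0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(self.scale)  # same shape as x
```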
torchtune/modules/tied_linear.py
Outdated
@@ -12,7 +12,8 @@
class TiedLinear:
"""
A tied linear layer, without bias, that shares the same weight as another linear layer.
- This is useful for models that use tied weights, such as qwen and gemma.
+ This is useful for models that use tied weights, such as :func:`~torchtune.models.qwen2_0_5b`,
+ :func:`~torchtune.models.qwen2_1_5b` and :func:`~torchtune.models.gemma`.
Suggested change:
- :func:`~torchtune.models.qwen2_1_5b` and :func:`~torchtune.models.gemma`.
+ :func:`~torchtune.models.qwen2_1_5b` and all of the :func:`~torchtune.models.gemma` models.
That was kind of cathartic
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #1588       +/-   ##
===========================================
- Coverage   73.12%   27.00%   -46.13%
===========================================
  Files         289      290        +1
  Lines       14175    14252       +77
===========================================
- Hits        10366     3849     -6517
- Misses       3809    10403     +6594

☔ View full report in Codecov by Sentry.
A bunch of miscellaneous changes to torchtune/modules docstrings so that our API docs look a bit nicer. Tbh I probably could have done a lot more here and many of the changes are more surface-level, but every little bit helps.