Thanks for your interest in contributing to Nemo-Skills!
These guidelines apply to both humans and AI agents! Make sure any agent you use has access to them when implementing or reviewing Nemo-Skills features.
Being overly defensive can lead to silent misbehavior in our code, which is a lot worse than a quick runtime failure. Here are some patterns to avoid:

- Don't use `.get` for accessing dictionary keys that the code expects to be present. If we replace a key that has to be there with an empty string via `data.get(key_name, "")`, we might silently corrupt the data and never notice that a previous step failed. A warning is not enough in this case; it's much better to just do `data[key_name]` and let the code fail with a clear error.
- Don't catch exceptions that we don't expect to be raised in normal operation. Same motivation as above: just let the code fail when something unexpected happens, so that users notice it and can fix the problem instead of the code silently misbehaving. We don't always read logs, so it's better to just crash the job.
- Avoid cases where user-passed parameters are unused! If a user specifies an argument that our code doesn't support, the code should fail (do not silently ignore such parameters)! If a user doesn't specify a required argument, the code should fail! Avoid defaults when there is no value that's reasonable for the majority of use-cases. This doesn't mean you need explicit checks for every parameter: it's best to use a dataclass or `**kwargs` syntax, which handles this automatically without complicating the code.
- Don't complicate code when security concerns are irrelevant. Nemo-Skills assumes that users have full access to the underlying system. They don't access it via an API; they can run any commands directly. So things like allowing arbitrary command execution from user input are totally normal and shouldn't be flagged (e.g. `subprocess` with `shell=True`, or directly executing a command passed via an argument). The only place where we should pay attention to security is when executing code generated by an LLM, and for that we should generally use our provided sandbox API.
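The dictionary-access and parameter-validation points above can be sketched as follows. This is a hypothetical example, not actual Nemo-Skills code; `EvalConfig` and `process_sample` are illustrative names:

```python
from dataclasses import dataclass


@dataclass
class EvalConfig:
    # A dataclass rejects unknown arguments automatically:
    # EvalConfig(model="m", typo_arg=1) raises TypeError.
    model: str                # required: no default, so omitting it fails loudly
    temperature: float = 0.0  # optional: a default that suits most use-cases


def process_sample(data: dict, key_name: str) -> str:
    # Bad: data.get(key_name, "") would silently turn a missing key into an
    # empty string and corrupt downstream results.
    # Good: indexing fails immediately with a clear KeyError.
    return data[key_name]
```

The point is that both the missing-key and unknown-argument failures surface at the call site with a clear traceback, with no extra validation code written by hand.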
The following things are required when adding new benchmarks:
- Add it to the corresponding place in the documentation. Make sure to add an example command for how to run the evaluation, and the expected results for any model you tested it with. Describe any details specific to this dataset: any special arguments to prepare the data? Any non-standard inference arguments? Anything else to pay attention to?
- Don't forget to run `mkdocs serve` and visually check that the documentation renders properly in the browser.
- Avoid data loss caused by evaluation mistakes. Our current design overrides the original generation files with new evaluation-specific keys. Make sure to do all computation before re-opening the files for writing, so that a bug that makes the code fail before writing is complete cannot cause accidental data loss.
- Run GPU tests in the CI (or locally). To run in CI, we need to set the "run GPU tests" label (toggle it off and back on if rerunning after changes). By default, all datasets will be prepared and evaluated on a few samples in the CI. You can explicitly exclude your dataset from the test if it requires very heavy data preparation or can't be used for another reason, but try to avoid that if possible!
- If you enabled a new modality or added new complicated evaluation/metrics logic, consider adding the dataset to the slurm tests. This is the most comprehensive test we can do: it runs a full evaluation on a cluster with an arbitrary model and checks that the results are as expected.
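The data-loss point above boils down to: read and compute everything first, then overwrite in one pass. A minimal sketch of that pattern, with illustrative file layout and key names (not the actual evaluation code):

```python
import json


def add_eval_results(jsonl_path: str, judge_fn) -> None:
    # Read and fully process every sample BEFORE reopening the file for
    # writing, so a bug in judge_fn cannot truncate the original data.
    with open(jsonl_path) as f:
        samples = [json.loads(line) for line in f]

    for sample in samples:  # any failure here leaves the file untouched
        sample["is_correct"] = judge_fn(sample)

    with open(jsonl_path, "w") as f:  # only now overwrite the original file
        for sample in samples:
            f.write(json.dumps(sample) + "\n")
```

If the loop raised partway through, the original generation file would still be intact; opening the file in `"w"` mode first and writing incrementally would not have that property.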
NeMo Skills is split into Core (inference, evaluation, tools, benchmarks) and Pipeline (CLI, cluster orchestration). The one-way rule:
- Pipeline can import from Core
- Core CANNOT import from Pipeline (no `nemo_run`, no `nemo_skills.pipeline`)
When adding dependencies: inference/evaluation/benchmark deps go in `core/requirements.txt`, orchestration deps go in `requirements/pipeline.txt`. This boundary is enforced by `tests/test_dependency_isolation.py`.
For full details (examples, common patterns, what to avoid), see Dependency Boundary Guide.
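One common way such a boundary gets enforced is to import the package under test and then check that none of the forbidden modules were pulled in as a side effect. This is a simplified sketch of the idea, not the contents of the actual `tests/test_dependency_isolation.py`:

```python
import importlib
import sys


def check_boundary(module_name: str, forbidden: tuple) -> None:
    """Import module_name and fail if any forbidden package was pulled in."""
    importlib.import_module(module_name)
    # Anything in sys.modules that matches a forbidden prefix was imported,
    # directly or transitively, which violates the one-way rule.
    leaked = [
        name for name in sys.modules
        if name in forbidden or any(name.startswith(f + ".") for f in forbidden)
    ]
    if leaked:
        raise AssertionError(f"{module_name} imported forbidden modules: {leaked}")


# Core must never depend on orchestration, e.g. (illustrative call):
# check_boundary("nemo_skills", ("nemo_run", "nemo_skills.pipeline"))
```

A real test would typically run this in a fresh subprocess so previously imported modules don't mask a violation.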
When adding new features, try to keep the code simple and elegant.
- Can you reuse / extend an existing functionality?
- Do you need to check too many conditions?
- Is there a way to write this simpler (maybe sacrificing some really rare edge-cases)?
- Do the comments / docstrings you add help? Or is the code self-explanatory?
- Avoid complicated types. Do use types for simple things like dict / list / int / float / existing classes, but don't define new type interfaces with unions or other complicated structures. They quickly go out-of-date and aren't helpful in most cases.
- If adding a new parameter / function / method, keep names consistent. If a thing is called X in one place, it shouldn't be called Y in another place. Follow existing conventions.
When in doubt, think about this:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Install the dependencies, including development dependencies, as:

```
pip install -e .[dev]
```

We use pre-commit to manage pre-commit hooks. To install, run:

```
pre-commit install
```

All subsequent commits will be checked according to the configuration in `.pre-commit-config.yaml`.
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or that you have the rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not signed off will not be accepted.
- To sign off on a commit, simply use the `--signoff` (or `-s`) option when committing your changes:

  ```
  $ git commit -s -m "Add cool feature."
  ```

  This will append the following to your commit message:

  ```
  Signed-off-by: Your Name <your@email.com>
  ```
- Full text of the DCO:

  ```
  Developer Certificate of Origin
  Version 1.1

  Copyright (C) 2004, 2006 The Linux Foundation and its contributors.

  Everyone is permitted to copy and distribute verbatim copies of this
  license document, but changing it is not allowed.

  Developer's Certificate of Origin 1.1

  By making a contribution to this project, I certify that:

  (a) The contribution was created in whole or in part by me and I
      have the right to submit it under the open source license
      indicated in the file; or

  (b) The contribution is based upon previous work that, to the best
      of my knowledge, is covered under an appropriate open source
      license and I have the right under that license to submit that
      work with modifications, whether created in whole or in part by
      me, under the same open source license (unless I am permitted to
      submit under a different license), as indicated in the file; or

  (c) The contribution was provided directly to me by some other
      person who certified (a), (b) or (c) and I have not modified it.

  (d) I understand and agree that this project and the contribution
      are public and that a record of the contribution (including all
      personal information I submit with it, including my sign-off) is
      maintained indefinitely and may be redistributed consistent with
      this project or the open source license(s) involved.
  ```
Check the existing GitHub Actions CI for CPU/GPU tests. Slurm tests documentation is here.
When running CPU tests, it's important to always add the `-s` flag to pytest, as otherwise all tests using nemo-run will fail.
More details TBD
TIP: Our CI depends on some secret variables only accessible to developers of the repository. To run the full suite of tests, please create pull requests from a branch instead of a fork whenever possible.