Thanks for your interest in contributing to Nemo-Skills!
These guidelines apply to both humans and AI agents! Make sure any agent you use has access to them when implementing or reviewing Nemo-Skills features.
Being overly defensive can lead to silent misbehavior in our code, which is a lot worse than a quick runtime failure. Here are some patterns to avoid:

- Don't use `.get` for accessing dictionary keys that the code expects to be present. If we replace a key that has to be there with an empty string via `data.get(key_name, "")`, we might silently corrupt the data and never notice that a previous step failed. A warning is not enough in this case; it's much better to just do `data[key_name]` and let the code fail with a clear error.
- Don't catch exceptions that we don't expect to be raised in normal operation. Same motivation as above: just let the code fail when something unexpected happens, so that users notice it and can fix the problem instead of the code silently misbehaving. We don't always read logs, so it's better to just crash the job.
- Avoid cases where user-passed parameters are unused! If a user specifies an argument that our code doesn't support, the code should fail (do not silently ignore such parameters)! If a user doesn't specify a required argument, the code should fail! Avoid defaults when there is no value that's reasonable for the majority of use-cases. This doesn't mean you need explicit checks for every parameter: it's best to use a dataclass or `**kwargs` syntax, which handles this automatically without complicating the code.
- Don't complicate code when security concerns are irrelevant. Nemo-Skills assumes that users have full access to the underlying system. They don't access it via an API; they can run any commands directly. So things like allowing arbitrary command execution from user input are totally normal and shouldn't be flagged (e.g. `subprocess` with `shell=True`, or directly executing a command passed via an argument). The only place where we should pay attention to security is when executing code generated by an LLM, and for that we should generally use our provided sandbox API.
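The dictionary-access and parameter-validation points above can be sketched as follows. This is a hypothetical example, not actual Nemo-Skills code; `EvalConfig` and `process_sample` are illustrative names:

```python
from dataclasses import dataclass


@dataclass
class EvalConfig:
    # A dataclass rejects unknown arguments automatically:
    # EvalConfig(model="m", typo_arg=1) raises TypeError.
    model: str                # required: no default, so omitting it fails loudly
    temperature: float = 0.0  # optional: a default that suits most use-cases


def process_sample(data: dict, key_name: str) -> str:
    # Bad: data.get(key_name, "") would silently turn a missing key into an
    # empty string and corrupt downstream results.
    # Good: indexing fails immediately with a clear KeyError.
    return data[key_name]
```

The point is that both the missing-key and unknown-argument failures surface at the call site with a clear traceback, with no extra validation code written by hand.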
The following things are required when adding new benchmarks:
- Add it to the corresponding place in the documentation. Make sure to add an example command for how to run the evaluation, and the expected results for any model you tested it with. Describe any details specific to this dataset: any special arguments to prepare the data? Any non-standard inference arguments? Anything else to pay attention to?
- Don't forget to run `mkdocs serve` and visually check that the documentation renders properly in the browser.
- Avoid data loss caused by evaluation mistakes. Our current design overrides the original generation files with new evaluation-specific keys. Make sure to do all computation before re-opening the files for writing, so that a bug that makes the code fail before writing is complete cannot cause accidental data loss.
- Run GPU tests in the CI (or locally). To run in CI, we need to set the "run GPU tests" label (toggle it off and back on if rerunning after changes). By default, all datasets will be prepared and evaluated on a few samples in the CI. You can explicitly exclude your dataset from the test if it requires very heavy data preparation or can't be used for another reason, but try to avoid that if possible!
- If you enabled a new modality or added new complicated evaluation/metrics logic, consider adding the dataset to the slurm tests. This is the most comprehensive test we can do: it runs a full evaluation on a cluster with an arbitrary model and checks that the results are as expected.
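The data-loss point above boils down to: read and compute everything first, then overwrite in one pass. A minimal sketch of that pattern, with illustrative file layout and key names (not the actual evaluation code):

```python
import json


def add_eval_results(jsonl_path: str, judge_fn) -> None:
    # Read and fully process every sample BEFORE reopening the file for
    # writing, so a bug in judge_fn cannot truncate the original data.
    with open(jsonl_path) as f:
        samples = [json.loads(line) for line in f]

    for sample in samples:  # any failure here leaves the file untouched
        sample["is_correct"] = judge_fn(sample)

    with open(jsonl_path, "w") as f:  # only now overwrite the original file
        for sample in samples:
            f.write(json.dumps(sample) + "\n")
```

If the loop raised partway through, the original generation file would still be intact; opening the file in `"w"` mode first and writing incrementally would not have that property.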
NeMo Skills is split into Core (inference, evaluation, tools, benchmarks) and Pipeline (CLI, cluster orchestration). The one-way rule:
- Pipeline can import from Core
- Core CANNOT import from Pipeline (no `nemo_run`, no `nemo_skills.pipeline`)
When adding dependencies: inference/evaluation/benchmark deps go in `core/requirements.txt`, orchestration deps go in `requirements/pipeline.txt`. This boundary is enforced by `tests/test_dependency_isolation.py`.
For full details (examples, common patterns, what to avoid), see Dependency Boundary Guide.
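One common way such a boundary gets enforced is to import the package under test and then check that none of the forbidden modules were pulled in as a side effect. This is a simplified sketch of the idea, not the contents of the actual `tests/test_dependency_isolation.py`:

```python
import importlib
import sys


def check_boundary(module_name: str, forbidden: tuple) -> None:
    """Import module_name and fail if any forbidden package was pulled in."""
    importlib.import_module(module_name)
    # Anything in sys.modules that matches a forbidden prefix was imported,
    # directly or transitively, which violates the one-way rule.
    leaked = [
        name for name in sys.modules
        if name in forbidden or any(name.startswith(f + ".") for f in forbidden)
    ]
    if leaked:
        raise AssertionError(f"{module_name} imported forbidden modules: {leaked}")


# Core must never depend on orchestration, e.g. (illustrative call):
# check_boundary("nemo_skills", ("nemo_run", "nemo_skills.pipeline"))
```

A real test would typically run this in a fresh subprocess so previously imported modules don't mask a violation.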
When adding new features, try to keep the code simple and elegant.
- Can you reuse / extend an existing functionality?
- Do you need to check too many conditions?
- Is there a way to write this simpler (maybe sacrificing some really rare edge-cases)?
- Do the comments / docstrings you add help? Or is the code self-explanatory?
- Avoid complicated types. Do use types for simple things like dict / list / int / float / existing classes, but don't define new type interfaces with unions or other complicated structures. They quickly go out-of-date and aren't helpful in most cases.
- If adding a new parameter / function / method, keep names consistent. If a thing is called X in one place, it shouldn't be called Y in another place. Follow existing conventions.
When in doubt, think about this:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Install the dependencies, including development dependencies, as:

```
pip install -e .[dev]
```

We use pre-commit to manage pre-commit hooks. To install, run:

```
pre-commit install
```

All subsequent commits will be checked according to the configuration in `.pre-commit-config.yaml`.
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or that you have the rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not signed off will not be accepted.
- To sign off on a commit, simply use the `--signoff` (or `-s`) option when committing your changes:

  ```
  $ git commit -s -m "Add cool feature."
  ```

  This will append the following to your commit message:

  ```
  Signed-off-by: Your Name <your@email.com>
  ```
- Full text of the DCO:

  ```
  Developer Certificate of Origin
  Version 1.1

  Copyright (C) 2004, 2006 The Linux Foundation and its contributors.

  Everyone is permitted to copy and distribute verbatim copies of this
  license document, but changing it is not allowed.

  Developer's Certificate of Origin 1.1

  By making a contribution to this project, I certify that:

  (a) The contribution was created in whole or in part by me and I
      have the right to submit it under the open source license
      indicated in the file; or

  (b) The contribution is based upon previous work that, to the best
      of my knowledge, is covered under an appropriate open source
      license and I have the right under that license to submit that
      work with modifications, whether created in whole or in part by
      me, under the same open source license (unless I am permitted to
      submit under a different license), as indicated in the file; or

  (c) The contribution was provided directly to me by some other
      person who certified (a), (b) or (c) and I have not modified it.

  (d) I understand and agree that this project and the contribution
      are public and that a record of the contribution (including all
      personal information I submit with it, including my sign-off) is
      maintained indefinitely and may be redistributed consistent with
      this project or the open source license(s) involved.
  ```
Check the existing GitHub Actions CI for CPU/GPU tests. Slurm tests documentation is here.
When running CPU tests, it's important to always add the `-s` flag to pytest, as otherwise all tests using nemo-run will fail.
More details TBD
TIP: Our CI depends on some secret variables only accessible to developers of the repository. To run the full suite of tests, please create pull requests from a branch instead of a fork whenever possible.