Update EFA Node exporter dockerfile to latest procfs and node exporte…#887

Merged
rpovelik merged 1 commit into main from efa_node_exporter
Oct 31, 2025

@rpovelik
Contributor

…r versions

Description of changes: Update the Dockerfile to the latest versions of the procfs and node exporter dependencies.
Checked on the team cluster with NCCL tests that the counters increase and are identical to those reported by the rdma tool.
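
The check described above (counters increase under load and match a reference reading) could be scripted roughly as follows. This is only a minimal sketch, not the procedure used in this PR: the metric name `example_efa_tx_bytes`, the prefix, and the sample values are hypothetical placeholders, and the parser assumes Prometheus text-format lines of the form `series value` with no trailing timestamp.

```python
def parse_counters(exposition_text, prefix):
    """Collect {series: value} for text-format lines whose series starts with prefix."""
    counters = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            counters[series] = float(value)
    return counters


def check_counters(before, after, reference):
    """List series that did not increase or that differ from the reference reading."""
    problems = []
    for name, new_value in after.items():
        if new_value <= before.get(name, 0.0):
            problems.append(name + ": did not increase")
        if name in reference and reference[name] != new_value:
            problems.append(name + ": differs from reference")
    return problems


# Hypothetical scrapes taken before and after a test run (placeholder data).
before_text = 'example_efa_tx_bytes{device="dev0"} 100\n'
after_text = (
    "# TYPE example_efa_tx_bytes counter\n"
    'example_efa_tx_bytes{device="dev0"} 250\n'
)
# Hypothetical values read from the reference tool at the time of the second scrape.
reference = {'example_efa_tx_bytes{device="dev0"}': 250.0}

before = parse_counters(before_text, "example_efa_")
after = parse_counters(after_text, "example_efa_")
problems = check_counters(before, after, reference)
```

An empty `problems` list would mean every scraped counter grew during the run and agrees with the reference values; in practice the reference reading has to be taken at the same moment as the second scrape for an exact match to be meaningful.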

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…r versions

Signed-off-by: rpovelik <rpovelik@amazon.co.uk>
@rpovelik rpovelik requested a review from nghtm October 31, 2025 11:38
@nghtm
Contributor

nghtm commented Oct 31, 2025

Please confirm whether the container build has been tested with the updated versions and whether metrics are populating.

Contributor

@nghtm nghtm left a comment

Awaiting confirmation of build testing and metrics

@rpovelik
Contributor Author

Hey @nghtm,

I think I already confirmed it in the PR's description:

Checked on team cluster with NCCL tests that the counters are increased and identical to rdma tool

Do you want to add a screenshot of metrics here?

Contributor

@nghtm nghtm left a comment

Sorry, missed that. So we have confirmed that this container image builds with the new procfs and node exporter versions. LGTM!

@nghtm
Contributor

nghtm commented Oct 31, 2025

please squash and merge

@nghtm
Contributor

nghtm commented Oct 31, 2025

[Image: Screenshot 2025-10-31 at 11 05 47 AM]

Confirming a successful build on the shared cluster and that metrics are flowing (note: did not run the NCCL test).

@rpovelik rpovelik merged commit 6dde96a into main Oct 31, 2025
4 checks passed
@rpovelik rpovelik deleted the efa_node_exporter branch October 31, 2025 16:37
KeitaW pushed a commit that referenced this pull request Feb 17, 2026
…r versions (#887)

Signed-off-by: rpovelik <rpovelik@amazon.co.uk>