Integrating PyTorch in Alpaka heterogeneous core #47984

lukaszmichalskii · 2025-04-30T09:20:09Z

PR description:

This PR integrates PyTorch with the Alpaka-based heterogeneous computing backend, so that inference workflows can use the PyTorch library directly with PortableCollections objects. It provides:

- Compatibility with Alpaka device/queue abstractions.
- Automatic conversion of optimized SoA to torch tensors, reusing the existing memory blobs.
- Support for both just-in-time (JIT) and ahead-of-time (AOT) model execution (AOT is in beta).
- Single-threading and CUDA stream management, handled by Guard objects specialized for each supported backend.
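
For readers unfamiliar with the guard idea in the last point, here is a minimal sketch of an RAII guard that forces single-threaded execution for its lifetime and restores the previous setting afterwards. It uses only the standard library: the `demo` namespace, `numThreads()` and `SingleThreadGuard` are hypothetical names standing in for a backend's thread setting, and are not the classes added by this PR.

```cpp
#include <cassert>

namespace demo {
  // Hypothetical stand-in for a backend's global thread-count setting.
  // A real backend would expose this through a library call.
  inline int& numThreads() {
    static int n = 8;
    return n;
  }

  // RAII guard: switches to one thread on construction and restores the
  // previous value on destruction, including on early return or exception.
  class SingleThreadGuard {
  public:
    SingleThreadGuard() : previous_(numThreads()) { numThreads() = 1; }
    ~SingleThreadGuard() { numThreads() = previous_; }

    // Non-copyable, so each guard restores the setting exactly once.
    SingleThreadGuard(const SingleThreadGuard&) = delete;
    SingleThreadGuard& operator=(const SingleThreadGuard&) = delete;

  private:
    int previous_;
  };

  // Returns the thread count observed while a guard is active.
  inline int threadsDuringInference() {
    SingleThreadGuard guard;
    return numThreads();
  }
}  // namespace demo
```

A backend-specific guard would presumably do the same for the active device stream: record the current one, install the queue's stream, and put the original back in the destructor.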

This implementation was presented and discussed at the Core Software Meeting: https://indico.cern.ch/event/1538634/

PR validation:

The plugins and test packages include demonstration code showing interoperability between SoA constructs, the PyTorch C++ API, and the CMSSW environment.
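
To make the SoA interoperability concrete, the sketch below shows how a column-wise (SoA) memory blob can be exposed as a 2D view through sizes and strides alone, without copying. It is a standard-library illustration under stated assumptions, not the conversion code from this PR: `StridedView` and `viewOfSoA` are hypothetical names, and the columns are assumed to be stored back to back with no alignment padding (a real SoA layout may pad each column, in which case the column stride is larger than the row count).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace demo {
  // Non-owning 2D view: element (row, col) lives at
  // data[row * rowStride + col * colStride]. Strides are in elements.
  struct StridedView {
    float* data;
    std::size_t rows, cols;
    std::size_t rowStride, colStride;

    float& at(std::size_t row, std::size_t col) const {
      return data[row * rowStride + col * colStride];
    }
  };

  // Wrap a blob that stores `cols` columns of `rows` floats, one column
  // after another. Only the pointer is stored; no data is copied.
  inline StridedView viewOfSoA(float* blob, std::size_t rows, std::size_t cols) {
    return StridedView{blob, rows, cols, /*rowStride=*/1, /*colStride=*/rows};
  }
}  // namespace demo
```

In the PyTorch C++ API, `torch::from_blob` accepts a data pointer together with sizes and strides in the same spirit, so a view like this maps onto a tensor that shares the original memory.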

PyTorch Ahead-of-time compilation

This pull request also investigates an AOT compilation strategy, but that part is a beta (proof of concept) and is not yet ready for production use.

GPU support

The CUDA backend is supported and tested; ROCm is not yet supported (see cms-sw/cmsdist#9786).

FYI @valsdav @ericcano @felicepantaleo @chrisizeh @leobeltra

cmsbuild · 2025-04-30T09:20:42Z

cms-bot internal usage

cmsbuild · 2025-04-30T09:23:46Z

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47984/44654

Found files with invalid states:
- DataFormats/PyTorch/src/classes.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- PhysicsTools/PyTorch/interface/SoAMetadata.h:
  - Added: 3ba3ba4
  - Deleted: 46072a6
- DataFormats/PyTorch/src/alpaka/classes_rocm_def.xml:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/interface/alpaka/Collections.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/src/classes_def.xml:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/src/alpaka/classes_rocm.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/BuildFile.xml:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/interface/Layout.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/interface/Host.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/src/alpaka/classes_cuda.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/interface/Device.h:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/src/classes.cc:
  - Added: 3ba3ba4
  - Deleted: 3e87013
- DataFormats/PyTorch/src/alpaka/classes_cuda_def.xml:
  - Added: 3ba3ba4
  - Deleted: 3e87013
There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/PyTorch/BuildFile.xml modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/BuildFile.xml modified in PR(s): Pytorch alpaka #45441, 15 0 1 scouting run3 ct #47951
- File PhysicsTools/PyTorch/test/create_simple_dnn.py modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testBase.h modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testBaseCUDA.h modified in PR(s): Pytorch alpaka #45441, 15 0 1 scouting run3 ct #47951
- File PhysicsTools/PyTorch/test/testTorchFromBufferPinnedMemory.cu modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testTorchSimpleDnn.cc modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testTorchSimpleDnnCUDA.cc modified in PR(s): Pytorch alpaka #45441

Code check has found code style and quality issues which could be resolved by applying following patch(s)

code-format:
https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47984/44654/code-format.patch
e.g. curl -k https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47984/44654/code-format.patch | patch -p1
You can also run scram build code-format to apply code format directly

cmsbuild · 2025-04-30T10:08:25Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47984/44655

There are other open Pull requests which might conflict with changes you have proposed:
- File PhysicsTools/PyTorch/BuildFile.xml modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/BuildFile.xml modified in PR(s): Pytorch alpaka #45441, 15 0 1 scouting run3 ct #47951
- File PhysicsTools/PyTorch/test/create_simple_dnn.py modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testBase.h modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testBaseCUDA.h modified in PR(s): Pytorch alpaka #45441, 15 0 1 scouting run3 ct #47951
- File PhysicsTools/PyTorch/test/testTorchFromBufferPinnedMemory.cu modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testTorchSimpleDnn.cc modified in PR(s): Pytorch alpaka #45441
- File PhysicsTools/PyTorch/test/testTorchSimpleDnnCUDA.cc modified in PR(s): Pytorch alpaka #45441

cmsbuild · 2025-04-30T10:08:52Z

A new Pull Request was created by @lukaszmichalskii for master.

It involves the following packages:

DataFormats/PyTorchTest (****)
PhysicsTools/PyTorch (ml)

The following packages do not have a category, yet:

DataFormats/PyTorchTest
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@cmsbuild, @valsdav, @y19y19 can you please review it and eventually sign? Thanks.
@missirol, @mmusich, @rovere this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

valsdav · 2025-05-01T08:25:12Z

enable gpu

valsdav · 2025-05-07T10:02:23Z

please test

cmsbuild · 2025-05-07T10:11:04Z

-1

Failed Tests: Build ClangBuild
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c68a5d/45909/summary.html
COMMIT: c01d07f
CMSSW: CMSSW_15_1_X_2025-05-06-2300/el8_amd64_gcc12
Additional Tests: CUDA,ROCM
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47984/45909/install.sh to create a dev area with all the needed externals and cmssw changes.

Build

I found compilation error when building:

>> Compiling  src/PhysicsTools/PyTorch/test/testModel.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DCMSSW_GIT_HASH='CMSSW_15_1_X_2025-05-06-2300' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_15_1_X_2025-05-06-2300' -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-05-06-2300/src -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/pytorch/2.6.0-a6d0e4413a9e766b40a2b79f83b4b176/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/pytorch/2.6.0-a6d0e4413a9e766b40a2b79f83b4b176/include/torch/csrc/api/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/cppunit/1.15.x-25a760f1303b0fca73df75b14e1358bc/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/cuda/12.8.1-f1c01abd08373a07ceeffab8d5f1930a/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/zlib/1.2.13-d217cdbdd8d586e845e05946de2796be/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare 
-Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testModel/testModel.cc.d src/PhysicsTools/PyTorch/test/testModel.cc -o tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testModel/testModel.cc.o
>> Compiling  src/PhysicsTools/PyTorch/test/testRunner.cc
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/c++ -c -DCMS_MICRO_ARCH='x86-64-v3' -DGNU_GCC -D_GNU_SOURCE -DCMSSW_GIT_HASH='CMSSW_15_1_X_2025-05-06-2300' -DPROJECT_NAME='CMSSW' -DPROJECT_VERSION='CMSSW_15_1_X_2025-05-06-2300' -Isrc -Ipoison -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/cms/cmssw/CMSSW_15_1_X_2025-05-06-2300/src -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/pytorch/2.6.0-a6d0e4413a9e766b40a2b79f83b4b176/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/pytorch/2.6.0-a6d0e4413a9e766b40a2b79f83b4b176/include/torch/csrc/api/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/cppunit/1.15.x-25a760f1303b0fca73df75b14e1358bc/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/cuda/12.8.1-f1c01abd08373a07ceeffab8d5f1930a/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/protobuf/3.21.9-1126508a53768c90e66f6bf1821ac03a/include -I/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02888/el8_amd64_gcc12/external/zlib/1.2.13-d217cdbdd8d586e845e05946de2796be/include -O3 -pthread -pipe -Werror=main -Werror=pointer-arith -Werror=overlength-strings -Wno-vla -Werror=overflow -std=c++20 -ftree-vectorize -Werror=array-bounds -Werror=format-contains-nul -Werror=type-limits -fvisibility-inlines-hidden -fno-math-errno --param vect-max-version-for-alias-checks=50 -Xassembler --compress-debug-sections -Wno-error=array-bounds -Warray-bounds -fuse-ld=bfd -march=x86-64-v3 -felide-constructors -fmessage-length=0 -Wall -Wno-non-template-friend -Wno-long-long -Wreturn-type -Wextra -Wpessimizing-move -Wclass-memaccess -Wno-cast-function-type -Wno-unused-but-set-parameter -Wno-ignored-qualifiers -Wno-unused-parameter -Wunused -Wparentheses -Werror=return-type -Werror=missing-braces -Werror=unused-value -Werror=unused-label -Werror=address -Werror=format -Werror=sign-compare 
-Werror=write-strings -Werror=delete-non-virtual-dtor -Werror=strict-aliasing -Werror=narrowing -Werror=unused-but-set-variable -Werror=reorder -Werror=unused-variable -Werror=conversion-null -Werror=return-local-addr -Wnon-virtual-dtor -Werror=switch -fdiagnostics-show-option -Wno-unused-local-typedefs -Wno-attributes -Wno-psabi -DBOOST_DISABLE_ASSERTS -flto=auto -fipa-icf -flto-odr-type-merging -fno-fat-lto-objects -Wodr -fPIC -MMD -MF tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testModel/testRunner.cc.d src/PhysicsTools/PyTorch/test/testRunner.cc -o tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testModel/testRunner.cc.o
In file included from src/PhysicsTools/PyTorch/test/testModel.cc:5:
src/PhysicsTools/PyTorch/test/testUtilities.h:4:10: fatal error: boost/filesystem.hpp: No such file or directory
    4 | #include <boost/filesystem.hpp>
      |          ^~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
gmake: *** [tmp/el8_amd64_gcc12/src/PhysicsTools/PyTorch/test/testModel/testModel.cc.o] Error 1
>> Building binary testModel
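
The error above means the Boost filesystem header was not on the include path for this test. The thread does not say how it was resolved. One possible way to avoid the dependency, assuming the test helper only needs portable path handling, is `std::filesystem` from the standard library (the build already uses `-std=c++20`). The sketch below is illustrative only: `modelPath` is a hypothetical helper, not a function from `testUtilities.h`.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace demo {
  // Hypothetical helper: build the path of a model file inside a data
  // directory using std::filesystem instead of boost::filesystem.
  inline std::string modelPath(const std::string& dataDir, const std::string& modelName) {
    std::filesystem::path p = std::filesystem::path(dataDir) / (modelName + ".pt");
    // generic_string() always uses '/' as the separator.
    return p.generic_string();
  }
}  // namespace demo
```

The alternative is to keep Boost and declare the dependency in the test's BuildFile.xml so that the header is found.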

Clang Build

I found compilation error while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' /usr/bin/time -v scram build -k -j 32 COMPILER='llvm compile'

>> Local Products Rules ..... done
>> Creating project symlinks
>> Entering Package PhysicsTools/PyTorch
>> Entering Package DataFormats/PyTorchTest
>> Compile sequence completed for CMSSW CMSSW_15_1_X_2025-05-06-2300
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 1
Command exited with non-zero status 1
	Command being timed: "scram build -k -j 32 COMPILER=llvm compile BUILD_LOG=yes"
	User time (seconds): 907.23
	System time (seconds): 91.81
	Percent of CPU this job got: 655%

cmsbuild · 2025-06-06T13:49:58Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-47984/45085

cmsbuild · 2025-06-06T13:50:24Z

Pull request #47984 was updated. @cmsbuild, @valsdav, @y19y19 can you please check and sign again.

valsdav · 2025-06-06T14:04:08Z

please test

valsdav · 2025-06-20T17:47:59Z

please abort

valsdav · 2025-06-20T17:49:24Z

please test

cmsbuild · 2025-06-20T19:55:49Z

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c68a5d/46856/summary.html
COMMIT: cd3feb9
CMSSW: CMSSW_15_1_X_2025-06-20-1100/el8_amd64_gcc12
Additional Tests: CUDA,ROCM
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/47984/46856/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially removed 1 lines from the logs
Reco comparison results: 6 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 4058157
DQMHistoTests: Total failures: 5
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4058132
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
Checked 215 log files, 184 edm output root files, 50 DQM output files
TriggerResults: no differences found

CUDA Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 0 differences found in the comparisons
DQMHistoTests: Total files compared: 7
DQMHistoTests: Total histograms compared: 53212
DQMHistoTests: Total failures: 28
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 53184
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 6 files compared)
Checked 24 log files, 30 edm output root files, 7 DQM output files
TriggerResults: no differences found

ROCM Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 47 differences found in the comparisons
DQMHistoTests: Total files compared: 7
DQMHistoTests: Total histograms compared: 53212
DQMHistoTests: Total failures: 4059
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 49153
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 6 files compared)
Checked 24 log files, 30 edm output root files, 7 DQM output files

valsdav · 2025-06-23T21:37:28Z

+ml

Signing for ML, but we would like to get some feedback about this from @cms-sw/core-team. Thanks in advance!

makortel · 2025-07-02T22:45:14Z

> but we would like to get some feedback about this from @cms-sw/core-team. Thanks in advance!

I'll take a look in the coming days. In the future please use @cms-sw/core-l2 team.

makortel · 2025-07-03T13:56:21Z

FYI @cms-sw/heterogeneous-l2

cmsbuild · 2025-07-03T14:53:06Z

Pull request has been put on hold by @fwyzard
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

fwyzard · 2025-07-03T14:53:40Z

I would like to give feedback from the @cms-sw/heterogeneous-l2 and alpaka point of view before this proceeds.

In fact, I would suggest presenting the work at one of the upcoming GPU development meetings.

fwyzard · 2025-07-03T14:53:53Z

assign heterogeneous

cmsbuild · 2025-07-03T14:55:32Z

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel

From first read through.

makortel · 2025-07-03T13:58:48Z

DataFormats/PyTorchTest/BuildFile.xml

The files in this new package do not seem to technically depend on PyTorch. The dependencies are thus the same as in DataFormats/PortableTestObjects (except PortableTestObjects depends also on eigen).

I'd suggest to consider moving the test data types to DataFormats/PortableTestObjects (I'm not against a separate package, but I'd want the rationale for a separate package to be documented in the PR discussion)

makortel · 2025-07-03T13:59:59Z

DataFormats/PyTorchTest/interface/Device.h

@@ -0,0 +1,20 @@
+#ifndef DATA_FORMATS__PYTORCH_TEST__INTERFACE__DEVICE_H_


Preferred format would be

Suggested change

-#ifndef DATA_FORMATS__PYTORCH_TEST__INTERFACE__DEVICE_H_
+#ifndef DataFormats_PyTorchTest_interface_Device_h

(see 4.1 in https://cms-sw.github.io/cms_coding_rules.html#4--technical-coding-rules-1)

(also in other files)
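
Since the same change is needed in several files, the pattern of the suggested guard can be expressed as a small function. This is a sketch that simply mirrors the suggestion above (the header path relative to `src/`, with `/` and `.` replaced by `_`); `includeGuard` is a hypothetical name, and the coding rules linked above remain the authoritative reference.

```cpp
#include <cassert>
#include <string>

namespace demo {
  // Derive an include-guard name from a header path relative to src/,
  // replacing '/' and '.' with '_' as in the suggested guard.
  inline std::string includeGuard(const std::string& headerPath) {
    std::string guard = headerPath;
    for (char& c : guard) {
      if (c == '/' || c == '.') {
        c = '_';
      }
    }
    return guard;
  }
}  // namespace demo
```

For example, `includeGuard("DataFormats/PyTorchTest/interface/Device.h")` yields `DataFormats_PyTorchTest_interface_Device_h`.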

makortel · 2025-07-03T15:36:36Z

DataFormats/PyTorchTest/interface/Device.h

+#include "DataFormats/Portable/interface/PortableDeviceCollection.h"
+#include "DataFormats/PyTorchTest/interface/Layout.h"
+
+namespace torchportable {


I'd suggest to include test in the namespace name

makortel · 2025-07-03T15:38:12Z

DataFormats/PyTorchTest/interface/Device.h

I find the file name Device.h confusing, because the file does not define or declare any device. I'd suggest renaming it along the lines of PyTorchTestDeviceCollections.h. The same applies to Host.h, Layout.h, and alpaka/Collections.h.

See also DataFormats/PortableTestObjects/interface (and DataFormats/TestObjects/interface) for established practice.

makortel · 2025-07-03T15:40:11Z

DataFormats/PyTorchTest/interface/alpaka/Collections.h

+  using ::torchportable::ClassificationCollectionDevice;
+  using ::torchportable::ClassificationCollectionHost;
+  using ::torchportable::ParticleCollectionDevice;
+  using ::torchportable::ParticleCollectionHost;
+  using ::torchportable::RegressionCollectionDevice;
+  using ::torchportable::RegressionCollectionHost;


These shouldn't be needed

Suggested change

-  using ::torchportable::ClassificationCollectionDevice;
-  using ::torchportable::ClassificationCollectionHost;
-  using ::torchportable::ParticleCollectionDevice;
-  using ::torchportable::ParticleCollectionHost;
-  using ::torchportable::RegressionCollectionDevice;
-  using ::torchportable::RegressionCollectionHost;

makortel · 2025-07-03T18:47:46Z

PhysicsTools/PyTorch/test/testPipeline.py

+process.schedule = cms.Schedule(
+    process.path
+)


This is not necessary. In absence of process.schedule the framework will run all Paths.

Suggested change

-process.schedule = cms.Schedule(
-    process.path
-)

makortel · 2025-07-03T18:49:00Z

PhysicsTools/PyTorch/test/testPipelineStandalone.sh

+
+function die { echo Failed $1: status $2 ; exit $2 ; }
+
+SCRIPT="PhysicsTools/PyTorch/test/testPipeline.py"


Is there some other reason to have this testPipelineStandalone.sh than avoiding the ${LOCALTOP}/src part in the testPipeline.sh?

makortel · 2025-07-03T18:52:49Z

PhysicsTools/PyTorch/test/BuildFile.xml

-<bin name="testTorch" file="testTorch.cc">
+<iftool name="cuda-gcc-support">
+<bin name="testTensorStride" file="testRunner.cc,testTensorStride.cu">
+  <use name="catch2"/>


catch2 doesn't seem to be used

(same for the other test executables below)

makortel · 2025-07-03T18:53:09Z

PhysicsTools/PyTorch/test/BuildFile.xml

+  <use name="FWCore/ParameterSet"/>
+  <use name="FWCore/ParameterSetReader"/>
+  <use name="FWCore/PluginManager"/>
+  <use name="FWCore/ServiceRegistry"/>
+  <use name="FWCore/Utilities"/>
+  <use name="HeterogeneousCore/CUDAServices"/>
+  <use name="HeterogeneousCore/CUDAUtilities"/>


Of these, only CUDAUtilities seems to be used

Suggested change

-  <use name="FWCore/ParameterSet"/>
-  <use name="FWCore/ParameterSetReader"/>
-  <use name="FWCore/PluginManager"/>
-  <use name="FWCore/ServiceRegistry"/>
-  <use name="FWCore/Utilities"/>
-  <use name="HeterogeneousCore/CUDAServices"/>
-  <use name="HeterogeneousCore/CUDAUtilities"/>
+  <use name="HeterogeneousCore/CUDAUtilities"/>

(same for the other test executables below)

makortel · 2025-07-03T18:54:01Z

PhysicsTools/PyTorch/test/BuildFile.xml

+<test name="TestPipelineCpu" command="testPipeline.sh cpu"/>
+<test name="TestPipelineCuda" command="testPipeline.sh cuda">


I'd suggest to have more specific names for these tests, e.g.

Suggested change

-<test name="TestPipelineCpu" command="testPipeline.sh cpu"/>
-<test name="TestPipelineCuda" command="testPipeline.sh cuda">
+<test name="TestPyTorchPipelineCpu" command="testPipeline.sh cpu"/>
+<test name="TestPyTorchPipelineCuda" command="testPipeline.sh cuda">

cmsbuild added this to the CMSSW_15_1_X milestone Apr 30, 2025

cmsbuild added pending-signatures tests-pending orp-pending new-package-pending code-checks-pending ml-pending changes-dataformats labels Apr 30, 2025

valsdav mentioned this pull request Apr 30, 2025

Pytorch alpaka #45441

Closed

cmsbuild added code-checks-rejected and removed code-checks-pending labels Apr 30, 2025

lukaszmichalskii force-pushed the torch-alpaka-pr branch from 46072a6 to c01d07f Compare April 30, 2025 10:06

cmsbuild added code-checks-pending and removed code-checks-rejected labels Apr 30, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Apr 30, 2025

lukaszmichalskii changed the title ~~Integrating PyTorch in Alpakac heterogeneous core~~ Integrating PyTorch in Alpaka heterogeneous core Apr 30, 2025

cmsbuild added tests-started and removed tests-pending labels May 7, 2025

cmsbuild added tests-rejected and removed tests-started tests-rejected code-checks-approved labels May 7, 2025

cmsbuild added the code-checks-pending label Jun 6, 2025

cmsbuild added code-checks-approved and removed code-checks-pending labels Jun 6, 2025

cmsbuild added tests-started and removed tests-pending labels Jun 6, 2025

cmsbuild added tests-pending and removed tests-started labels Jun 20, 2025

cmsbuild added tests-started and removed tests-pending labels Jun 20, 2025

cmsbuild added tests-approved and removed tests-started labels Jun 20, 2025

cmsbuild added ml-approved and removed ml-pending labels Jun 23, 2025

cmsbuild added the hold label Jul 3, 2025

cmsbuild removed the hold label Jul 3, 2025

cmsbuild added the heterogeneous-pending label Jul 3, 2025

makortel reviewed Jul 3, 2025
