@@ -1449,6 +1449,42 @@ The AMDGPU backend supports the following LLVM IR attributes.
the frame. This is an internal detail of how LDS variables are lowered,
language front ends should not set this attribute.
+ "amdgpu-gds-size" Bytes expected to be allocated at the start of GDS memory at entry.
+
+ "amdgpu-git-ptr-high" The hard-wired high half of the address of the global information table
+ for the AMDPAL OS type. 0xffffffff represents no hard-wired high half, since
+ current hardware only allows a 16-bit value.
+
+ "amdgpu-32bit-address-high-bits" Assumed high 32 bits for 32-bit address spaces that are really truncated
+ 64-bit addresses (i.e., addrspace(6)).
+
+ "amdgpu-color-export" Indicates that the shader exports color information if set to 1.
+ Defaults to 1 for :ref:`amdgpu_ps <amdgpu-cc>`, and 0 for other calling
+ conventions. Determines the necessity and type of null exports when a shader
+ terminates early by killing lanes.
+
+ "amdgpu-depth-export" Indicates that the shader exports depth information if set to 1. Determines the
+ necessity and type of null exports when a shader terminates early by killing
+ lanes. A depth-only shader will export to the depth channel when no null export
+ target is available (GFX11+).
+
+ "InitialPSInputAddr" Set the initial value of the ``spi_ps_input_addr`` register for
+ :ref:`amdgpu_ps <amdgpu-cc>` shaders. Any bits enabled by this value will
+ be enabled in the final register value.
+
+ "amdgpu-wave-priority-threshold" VALU instruction count threshold for adjusting wave priority. If exceeded,
+ the wave priority is temporarily raised from the start of the shader function
+ until its last VMEM instructions, so that younger waves can issue their VMEM
+ instructions as well.
+
+ "amdgpu-memory-bound" Set internally by the backend.
+
+ "amdgpu-wave-limiter" Set internally by the backend.
+
+ "amdgpu-unroll-threshold" Set the base cost threshold preference for loop unrolling within this function.
+ The default is 300. The actual threshold may be varied by per-loop metadata or
+ reduced by heuristics.
+
"amdgpu-max-num-workgroups"="x,y,z" Specify the maximum number of work groups for the kernel dispatch in the
X, Y, and Z dimensions. Generated by the ``amdgpu_max_num_work_groups``
CLANG attribute [CLANG-ATTR]_. Clang only emits this attribute when all