builtin: update inline spirv hlsl #739

Open
wants to merge 6 commits into
base: master

alichraghi
Contributor

Description

Follow up to Devsh-Graphics-Programming/SPIRV-Headers#3

Testing

Ran arithmetic test

@devshgraphicsprogrammingjenkins
Contributor

[CI]: Can one of the admins verify this patch?

Comment on lines 185 to 203
// TODO: redundant T
template<typename T>
struct bitfieldExtract<T, true, true>
{
static T __call( T val, uint32_t offsetBits, uint32_t numBits )
{
return spirv::bitFieldSExtract<T>( val, offsetBits, numBits );
return spirv::bitFieldExtract( val, offsetBits, numBits );
}
};

// TODO: redundant T
template<typename T>
struct bitfieldExtract<T, false, true>
{
static T __call( T val, uint32_t offsetBits, uint32_t numBits )
{
return spirv::bitFieldUExtract<T>( val, offsetBits, numBits );
return spirv::bitFieldExtract( val, offsetBits, numBits );
}
};

Inline SPIR-V functions should have exactly the same names as the Op enums, except without the Op prefix and with a lowercase leading letter
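
A rough sketch of that naming rule, written in plain C++ so it can be compiled off-GPU. `inlineSpirvName` is a made-up helper for illustration, not something the generator in this PR is known to contain:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: derive the inline SPIR-V wrapper name from an Op enum name
// by dropping the "Op" prefix and lowercasing the first remaining letter.
std::string inlineSpirvName(const std::string& opName)
{
    // Leave anything that is not of the form "Op<Something>" untouched.
    if (opName.size() < 3 || opName.compare(0, 2, "Op") != 0)
        return opName;

    std::string name = opName.substr(2);
    name[0] = static_cast<char>(std::tolower(static_cast<unsigned char>(name[0])));
    return name;
}
```

Under this rule `OpBitFieldSExtract` and `OpBitFieldUExtract` keep distinct wrapper names instead of collapsing into a single `bitFieldExtract`.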

also why do you think T is redundant?

P.S. now I realize the overload choice could have been done more nicely with enable_if_t


//! Std 450 Extended set operations
template<typename SquareMatrix>
[[vk::ext_instruction(GLSLstd450MatrixInverse)]]

they need an extra parameter, see microsoft/DirectXShaderCompiler#6751

template<typename T, typename U>
[[vk::ext_capability(spv::CapabilityPhysicalStorageBufferAddresses)]]
[[vk::ext_instruction(spv::OpBitcast)]]
enable_if_t<is_spirv_type_v<T> && is_spirv_type_v<U>, T> bitcast(U);

hmm you should probably have spirv::is_pointer_v for this

Comment on lines 52 to 54
template<class T, class U>
[[vk::ext_instruction(spv::OpBitcast)]]
T bitcast(U);

needs enable_if to check that T is fundamental or a builtin HLSL vector (read the SPIR-V spec) + sizeof(T)==sizeof(U)
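
A minimal sketch of such a constraint, in plain C++ rather than HLSL. It assumes `std::is_arithmetic_v` as a stand-in for "fundamental or builtin HLSL vector" (C++ has no builtin vectors) and uses `memcpy` where the real declaration would emit OpBitcast:

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Only participates in overload resolution when both types are scalar
// arithmetic types of identical size; anything else fails to compile.
template<class T, class U>
std::enable_if_t<std::is_arithmetic_v<T> && std::is_arithmetic_v<U> && sizeof(T) == sizeof(U), T>
bitcast(U value)
{
    T result;
    std::memcpy(&result, &value, sizeof(T)); // reinterpret the bits, no value conversion
    return result;
}
```

With this in place `bitcast<uint32_t>(1.0f)` compiles, while `bitcast<uint64_t>(1.0f)` is rejected because the sizes differ.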

Comment on lines 57 to 58
namespace builtin
{[[vk::ext_builtin_output(spv::BuiltInPosition)]]

crappy formatting, the { should have a line all to itself

Comment on lines 62 to 63
[[vk::ext_builtin_input(spv::BuiltInNumWorkgroups)]]
static const uint32_t3 NumWorkGroups;
// TODO: Doesn't work, find out why and file issue on DXC!
//[[vk::ext_builtin_input(spv::BuiltInWorkgroupSize)]]
//static const uint32_t3 WorkgroupSize;
static const uint32_t3 NumWorkgroups;

look at the comment

Contributor Author

you mean that WorkgroupSize should be enabled?

that you can't define that builtin because DXC shits its pants

Contributor Author

What can I do? Add that TODO back in the generator?

Comment on lines 96 to 102
//! Execution Modes
namespace execution_mode
{
void invocations()
{
vk::ext_execution_mode(spv::ExecutionModeInvocations);
}

have you tested this works with godbolt?

AFAIK this only works if you call directly from entry point

nvm we did it this way before, all is good

Comment on lines 588 to 598
template<typename T, typename P>
[[vk::ext_instruction(spv::OpLoad)]]
enable_if_t<is_spirv_type_v<P>, T> load(P pointer, [[vk::ext_literal]] uint32_t memoryAccess);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpLoad)]]
enable_if_t<is_spirv_type_v<P>, T> load(P pointer, [[vk::ext_literal]] uint32_t memoryAccess, [[vk::ext_literal]] uint32_t memoryAccessParam);

template<typename T, typename P, uint32_t alignment>
[[vk::ext_instruction(spv::OpLoad)]]
enable_if_t<is_spirv_type_v<P>, T> load(P pointer, [[vk::ext_literal]] uint32_t __aligned = /*Aligned*/0x00000002, [[vk::ext_literal]] uint32_t __alignment = alignment);

why do you gen these overloads?

Contributor Author

it's defined in the grammar. The MemoryAccess operand may have zero or more parameters.
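
To illustrate why the mask can drag extra operands along, here is a small C++ sketch. The bit values are as I recall them from the Memory Operands table of the SPIR-V spec (only Aligned = 0x2 is confirmed by the code above), so check them against the spec; `extraOperandCount` is a hypothetical helper:

```cpp
#include <cstdint>

// Memory Operands bits (values assumed from the SPIR-V spec table).
enum MemoryAccessMask : uint32_t
{
    Volatile             = 0x1,
    Aligned              = 0x2,  // followed by a literal alignment
    Nontemporal          = 0x4,
    MakePointerAvailable = 0x8,  // followed by a scope <id>
    MakePointerVisible   = 0x10, // followed by a scope <id>
    NonPrivatePointer    = 0x20
};

// Number of additional operands that must follow the mask itself.
constexpr uint32_t extraOperandCount(uint32_t mask)
{
    uint32_t count = 0;
    if (mask & Aligned)              ++count;
    if (mask & MakePointerAvailable) ++count;
    if (mask & MakePointerVisible)   ++count;
    return count;
}
```

A mask of `Volatile` needs no extra operand, which corresponds to the overload taking only `memoryAccess`; `Aligned` needs exactly one, which corresponds to the overload with `memoryAccessParam`.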

Comment on lines 604 to 618
template<typename T>
[[vk::ext_instruction(spv::OpLoad)]]
T load(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, [[vk::ext_literal]] uint32_t memoryAccess);

template<typename T>
[[vk::ext_instruction(spv::OpLoad)]]
T load(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, [[vk::ext_literal]] uint32_t memoryAccess, [[vk::ext_literal]] uint32_t memoryAccessParam);

template<typename T, uint32_t alignment>
[[vk::ext_instruction(spv::OpLoad)]]
T load(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, [[vk::ext_literal]] uint32_t __aligned = /*Aligned*/0x00000002, [[vk::ext_literal]] uint32_t __alignment = alignment);

template<typename T>
[[vk::ext_instruction(spv::OpLoad)]]
T load(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer);
Member

This is a BDA load, and as such:

  1. it will match the loads above because pointer_t is a spirv type, and never get called or worse, clash being ambiguous
  2. BDA load/store ALWAYS take aligned operands and require alignment to be specified (read the spec)
  3. BDA load overloads need to emit the PhysicalStorageBuffer capability

Contributor Author

it will match the loads above because pointer_t is a spirv type, and never get called or worse, clash being ambiguous

hmm do you have any idea besides renaming the BDA overload to something else?

enable_if
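
One way this could look, sketched in plain C++. Everything here is a stand-in: `pointer_t`, `is_pointer_type`, `is_bda_pointer` and the enumerator values are hypothetical and are not the `spv::` or Nabla definitions; the point is only that mutually exclusive `enable_if_t` conditions keep the generic and BDA overloads from ever competing:

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Stand-ins for the real declarations; enumerator values are NOT the spv:: ones.
enum StorageClass : uint32_t { StorageClassFunction, StorageClassPhysicalStorageBuffer };

template<StorageClass SC, typename T>
struct pointer_t { T* raw; };

template<typename P> struct is_pointer_type : std::false_type {};
template<StorageClass SC, typename T>
struct is_pointer_type<pointer_t<SC, T>> : std::true_type {};

template<typename P> struct is_bda_pointer : std::false_type {};
template<typename T>
struct is_bda_pointer<pointer_t<StorageClassPhysicalStorageBuffer, T>> : std::true_type {};

// Generic load: any pointer type EXCEPT a BDA pointer.
template<typename T, typename P>
std::enable_if_t<is_pointer_type<P>::value && !is_bda_pointer<P>::value, T> load(P pointer)
{
    return *pointer.raw;
}

// BDA load: only BDA pointers, and the alignment must be spelled out by the caller.
template<typename T, uint32_t alignment, typename P>
std::enable_if_t<is_bda_pointer<P>::value, T> load(P pointer)
{
    static_assert(alignment != 0, "sketch-only sanity check");
    return *pointer.raw;
}

// Detects whether load<int>(P), i.e. a load without an alignment, is callable.
template<typename P, typename = void>
struct has_unaligned_load : std::false_type {};
template<typename P>
struct has_unaligned_load<P, std::void_t<decltype(load<int>(std::declval<P>()))>> : std::true_type {};

static_assert(has_unaligned_load<pointer_t<StorageClassFunction, int>>::value, "generic load is available");
static_assert(!has_unaligned_load<pointer_t<StorageClassPhysicalStorageBuffer, int>>::value, "BDA load requires an alignment");
```

A BDA pointer can then only be loaded as `load<T, alignment>(ptr)`, which covers points 1 and 2 of the review comment; the capability from point 3 would be an attribute on the real HLSL declaration and is not modelled here.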

Comment on lines 620 to 650
template<typename T, typename P>
[[vk::ext_instruction(spv::OpStore)]]
enable_if_t<is_spirv_type_v<P>, void> store(P pointer, T object, [[vk::ext_literal]] uint32_t memoryAccess);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpStore)]]
enable_if_t<is_spirv_type_v<P>, void> store(P pointer, T object, [[vk::ext_literal]] uint32_t memoryAccess, [[vk::ext_literal]] uint32_t memoryAccessParam);

template<typename T, typename P, uint32_t alignment>
[[vk::ext_instruction(spv::OpStore)]]
enable_if_t<is_spirv_type_v<P>, void> store(P pointer, T object, [[vk::ext_literal]] uint32_t __aligned = /*Aligned*/0x00000002, [[vk::ext_literal]] uint32_t __alignment = alignment);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpStore)]]
enable_if_t<is_spirv_type_v<P>, void> store(P pointer, T object);

template<typename T>
[[vk::ext_instruction(spv::OpStore)]]
void store(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, T object, [[vk::ext_literal]] uint32_t memoryAccess);

template<typename T>
[[vk::ext_instruction(spv::OpStore)]]
void store(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, T object, [[vk::ext_literal]] uint32_t memoryAccess, [[vk::ext_literal]] uint32_t memoryAccessParam);

template<typename T, uint32_t alignment>
[[vk::ext_instruction(spv::OpStore)]]
void store(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, T object, [[vk::ext_literal]] uint32_t __aligned = /*Aligned*/0x00000002, [[vk::ext_literal]] uint32_t __alignment = alignment);

template<typename T>
[[vk::ext_instruction(spv::OpStore)]]
void store(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer, T object);

same comments as for load

Comment on lines 652 to 658
template<typename T, typename P>
[[vk::ext_instruction(spv::OpGenericPtrMemSemantics)]]
enable_if_t<is_spirv_type_v<P>, T> genericPtrMemSemantics(P pointer);

template<typename T>
[[vk::ext_instruction(spv::OpGenericPtrMemSemantics)]]
T genericPtrMemSemantics(pointer_t<spv::StorageClassPhysicalStorageBuffer, T> pointer);

don't emit stuff that requires OpenCL / kernel environment

Comment on lines 660 to 677
template<typename T>
[[vk::ext_capability(spv::CapabilityBitInstructions)]]
[[vk::ext_instruction(spv::OpBitFieldInsert)]]
T bitFieldInsert(T base, T insert, uint32_t offset, uint32_t count);

[[vk::ext_capability(spv::CapabilityBitInstructions)]]
[[vk::ext_instruction(spv::OpBitFieldSExtract)]]
int32_t bitFieldExtract(int32_t base, uint32_t offset, uint32_t count);

[[vk::ext_capability(spv::CapabilityBitInstructions)]]
[[vk::ext_instruction(spv::OpBitFieldSExtract)]]
int64_t bitFieldExtract(int64_t base, uint32_t offset, uint32_t count);

[[vk::ext_capability(spv::CapabilityBitInstructions)]]
[[vk::ext_instruction(spv::OpBitFieldUExtract)]]
uint32_t bitFieldExtract(uint32_t base, uint32_t offset, uint32_t count);

[[vk::ext_capability(spv::CapabilityBitInstructions)]]

you can probably skip emitting caps and extensions that DXC would emit anyway when targeting vulkan 1.3

there's a cleaner way to do OpBitFieldUExtract and OpBitFieldSExtract than having every single possible overload imaginable.

template<typename U>
enable_if_t<is_unsigned_v<U>,U>

and

template<typename S>
enable_if_t<is_signed_v<S>,S>
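
Fleshed out as a CPU-side C++ sketch (the bodies emulate the extract semantics with shifts and masks, where the HLSL declarations would carry `vk::ext_instruction` instead; the extra `is_integral_v` check is my addition so that floats are rejected too):

```cpp
#include <cstdint>
#include <type_traits>

// One template covers every unsigned width.
template<typename U>
std::enable_if_t<std::is_integral_v<U> && std::is_unsigned_v<U>, U>
bitFieldUExtract(U base, uint32_t offset, uint32_t count)
{
    constexpr uint32_t bits = sizeof(U) * 8;
    if (count == 0)
        return 0;
    const U mask = count >= bits ? static_cast<U>(~U(0)) : static_cast<U>((U(1) << count) - 1);
    return static_cast<U>((base >> offset) & mask);
}

// One template covers every signed width; the result is sign-extended.
template<typename S>
std::enable_if_t<std::is_integral_v<S> && std::is_signed_v<S>, S>
bitFieldSExtract(S base, uint32_t offset, uint32_t count)
{
    using U = std::make_unsigned_t<S>;
    constexpr uint32_t bits = sizeof(S) * 8;
    if (count == 0)
        return 0;
    const U raw = bitFieldUExtract(static_cast<U>(base), offset, count);
    if (count >= bits)
        return static_cast<S>(raw);
    // Flip and subtract the top bit of the field to sign-extend it.
    const S sign = static_cast<S>(S(1) << (count - 1));
    return static_cast<S>((static_cast<S>(raw) ^ sign) - sign);
}
```

The 64-bit types go through the same two templates, so the only per-width difference left would be whichever capability the real declaration has to request.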

Contributor Author

perhaps. note that we have to emit an overload for 64bit ints/floats anyways

Comment on lines 681 to 688
template<typename T>
[[vk::ext_capability(spv::CapabilityBitInstructions)]]
[[vk::ext_instruction(spv::OpBitReverse)]]
T bitReverse(T base);

template<typename T>
[[vk::ext_instruction(spv::OpBitCount)]]
T bitCount(T base);

you probably want to express type constraints with enable_if_t on any templated instruction
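
For example, a constrained `bitReverse` could look roughly like this in plain C++. This is a scalar-only sketch with a loop standing in for OpBitReverse; `accepts_bitReverse` is a hypothetical detection helper used only to show that wrong types are rejected at overload resolution:

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

// Constrained to integer scalars; floats and bool never match this template.
template<typename T>
std::enable_if_t<std::is_integral_v<T> && !std::is_same_v<T, bool>, T> bitReverse(T base)
{
    using U = std::make_unsigned_t<T>;
    U in = static_cast<U>(base);
    U out = 0;
    for (unsigned i = 0; i < sizeof(T) * 8; ++i)
    {
        out = static_cast<U>((out << 1) | (in & 1u)); // append the lowest remaining input bit
        in = static_cast<U>(in >> 1);
    }
    return static_cast<T>(out);
}

// Detection idiom: true when bitReverse(T) is a valid call.
template<typename T, typename = void>
struct accepts_bitReverse : std::false_type {};
template<typename T>
struct accepts_bitReverse<T, std::void_t<decltype(bitReverse(std::declval<T>()))>> : std::true_type {};

static_assert(accepts_bitReverse<uint32_t>::value, "integers are accepted");
static_assert(!accepts_bitReverse<float>::value, "floats are rejected");
```

Without the constraint, a call with a float would only fail later, inside the SPIR-V validator, rather than at the call site.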

Comment on lines 690 to 722
[[vk::ext_instruction(spv::OpControlBarrier)]]
void controlBarrier(uint32_t executionScope, uint32_t memoryScope, uint32_t semantics);

[[vk::ext_instruction(spv::OpMemoryBarrier)]]
void memoryBarrier(uint32_t memoryScope, uint32_t semantics);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicLoad)]]
T atomicLoad([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicLoad)]]
enable_if_t<is_spirv_type_v<P>, T> atomicLoad(P pointer, uint32_t memoryScope, uint32_t semantics);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicStore)]]
void atomicStore([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicStore)]]
enable_if_t<is_spirv_type_v<P>, void> atomicStore(P pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicExchange)]]
T atomicExchange([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicExchange)]]
enable_if_t<is_spirv_type_v<P>, T> atomicExchange(P pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T> // integers operate on 2s complement so same op for signed and unsigned
[[vk::ext_capability(spv::CapabilityInt64Atomics)]]
template<typename T>
[[vk::ext_instruction(spv::OpAtomicCompareExchange)]]
T atomicCompareExchange([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t equal, uint32_t unequal, T value, T comparator);

btw does the SPIR-V spec say that any of the memory operands need to be literals or not?

Contributor Author

No, just checked

Comment on lines 768 to 822
[[vk::ext_instruction(spv::OpAtomicSMin)]]
int32_t atomicMin([[vk::ext_reference]] int32_t pointer, uint32_t memoryScope, uint32_t semantics, int32_t value);

template<typename T, typename Ptr_T> // DXC Workaround
[[vk::ext_capability(spv::CapabilityInt64Atomics)]]
[[vk::ext_instruction(spv::OpAtomicISub)]]
enable_if_t<is_spirv_type_v<Ptr_T> && (is_same_v<T,uint64_t> || is_same_v<T,int64_t>), T> atomicISub(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
[[vk::ext_instruction(spv::OpAtomicSMin)]]
int64_t atomicMin([[vk::ext_reference]] int64_t pointer, uint32_t memoryScope, uint32_t semantics, int64_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicSMin)]]
enable_if_t<is_spirv_type_v<P>, int32_t> atomicMin(P pointer, uint32_t memoryScope, uint32_t semantics, int32_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicSMin)]]
enable_if_t<is_spirv_type_v<P>, int64_t> atomicMin(P pointer, uint32_t memoryScope, uint32_t semantics, int64_t value);

[[vk::ext_instruction(spv::OpAtomicUMin)]]
uint32_t atomicMin([[vk::ext_reference]] uint32_t pointer, uint32_t memoryScope, uint32_t semantics, uint32_t value);

[[vk::ext_instruction(spv::OpAtomicUMin)]]
uint64_t atomicMin([[vk::ext_reference]] uint64_t pointer, uint32_t memoryScope, uint32_t semantics, uint64_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicUMin)]]
enable_if_t<is_spirv_type_v<P>, uint32_t> atomicMin(P pointer, uint32_t memoryScope, uint32_t semantics, uint32_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicUMin)]]
enable_if_t<is_spirv_type_v<P>, uint64_t> atomicMin(P pointer, uint32_t memoryScope, uint32_t semantics, uint64_t value);

[[vk::ext_instruction(spv::OpAtomicSMax)]]
int32_t atomicMax([[vk::ext_reference]] int32_t pointer, uint32_t memoryScope, uint32_t semantics, int32_t value);

[[vk::ext_instruction(spv::OpAtomicSMax)]]
int64_t atomicMax([[vk::ext_reference]] int64_t pointer, uint32_t memoryScope, uint32_t semantics, int64_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicSMax)]]
enable_if_t<is_spirv_type_v<P>, int32_t> atomicMax(P pointer, uint32_t memoryScope, uint32_t semantics, int32_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicSMax)]]
enable_if_t<is_spirv_type_v<P>, int64_t> atomicMax(P pointer, uint32_t memoryScope, uint32_t semantics, int64_t value);

[[vk::ext_instruction(spv::OpAtomicUMax)]]
uint32_t atomicMax([[vk::ext_reference]] uint32_t pointer, uint32_t memoryScope, uint32_t semantics, uint32_t value);

[[vk::ext_instruction(spv::OpAtomicUMax)]]
uint64_t atomicMax([[vk::ext_reference]] uint64_t pointer, uint32_t memoryScope, uint32_t semantics, uint64_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicUMax)]]
enable_if_t<is_spirv_type_v<P>, uint32_t> atomicMax(P pointer, uint32_t memoryScope, uint32_t semantics, uint32_t value);

template<typename P>
[[vk::ext_instruction(spv::OpAtomicUMax)]]
enable_if_t<is_spirv_type_v<P>, uint64_t> atomicMax(P pointer, uint32_t memoryScope, uint32_t semantics, uint64_t value);

btw the 64bit atomics need a special capability (64bit atomics)

Comment on lines 724 to 735
template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicCompareExchange)]]
enable_if_t<is_spirv_type_v<P>, T> atomicCompareExchange(P pointer, uint32_t memoryScope, uint32_t equal, uint32_t unequal, T value, T comparator);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicCompareExchangeWeak)]]
T atomicCompareExchangeWeak([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t equal, uint32_t unequal, T value, T comparator);

template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicCompareExchangeWeak)]]
enable_if_t<is_spirv_type_v<P>, T> atomicCompareExchangeWeak(P pointer, uint32_t memoryScope, uint32_t equal, uint32_t unequal, T value, T comparator);

to do atomics of any other size than 4, you need special caps or extensions :(
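
A sketch of how a generator or a trait could encode that, in plain C++. `AtomicCap` and `requiredIntAtomicCap` are hypothetical names; only the Int64Atomics capability is taken from the code in this PR, and widths other than 32 and 64 bits are deliberately left unmodelled:

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical enum: which capability an integer atomic on T has to request.
enum class AtomicCap { None, Int64Atomics };

template<typename T>
constexpr AtomicCap requiredIntAtomicCap()
{
    static_assert(std::is_integral_v<T>, "integer atomics only in this sketch");
    static_assert(sizeof(T) == 4 || sizeof(T) == 8,
                  "other widths need caps/extensions that are not modelled here");
    return sizeof(T) == 8 ? AtomicCap::Int64Atomics : AtomicCap::None;
}
```

The same lookup could drive whether an atomic declaration gets a `vk::ext_capability` attribute at all.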

Comment on lines 824 to 846
template<typename T>
[[vk::ext_instruction(spv::OpAtomicAnd)]]
enable_if_t<is_same_v<T,uint32_t> || is_same_v<T,int32_t>, T> atomicAnd([[vk::ext_reference]] T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
T atomicAnd([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T, typename Ptr_T> // DXC Workaround
template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicAnd)]]
enable_if_t<is_spirv_type_v<Ptr_T> && (is_same_v<T,uint32_t> || is_same_v<T,int32_t>), T> atomicAnd(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
enable_if_t<is_spirv_type_v<P>, T> atomicAnd(P pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicOr)]]
enable_if_t<is_same_v<T,uint32_t> || is_same_v<T,int32_t>, T> atomicOr([[vk::ext_reference]] T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
T atomicOr([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T, typename Ptr_T> // DXC Workaround
template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicOr)]]
enable_if_t<is_spirv_type_v<Ptr_T> && (is_same_v<T,uint32_t> || is_same_v<T,int32_t>), T> atomicOr(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
enable_if_t<is_spirv_type_v<P>, T> atomicOr(P pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicXor)]]
enable_if_t<is_same_v<T,uint32_t> || is_same_v<T,int32_t>, T> atomicXor([[vk::ext_reference]] T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
T atomicXor([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics, T value);

template<typename T, typename Ptr_T> // DXC Workaround
template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicXor)]]
enable_if_t<is_spirv_type_v<Ptr_T> && (is_same_v<T,uint32_t> || is_same_v<T,int32_t>), T> atomicXor(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
enable_if_t<is_spirv_type_v<P>, T> atomicXor(P pointer, uint32_t memoryScope, uint32_t semantics, T value);

same as with the compare or swap atomics, caps/extensions for sizes other than 32bit

Comment on lines 848 to 862
template<typename T>
[[vk::ext_instruction(spv::OpAtomicFlagTestAndSet)]]
T atomicFlagTestAndSet([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics);

template<typename Signed, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicSMin)]]
enable_if_t<is_spirv_type_v<Ptr_T> && is_same_v<Signed,int32_t>, Signed> atomicSMin(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, Signed value);
template<typename T, typename P>
[[vk::ext_instruction(spv::OpAtomicFlagTestAndSet)]]
enable_if_t<is_spirv_type_v<P>, T> atomicFlagTestAndSet(P pointer, uint32_t memoryScope, uint32_t semantics);

template<typename Unsigned>
[[vk::ext_instruction( spv::OpAtomicUMin )]]
enable_if_t<is_same_v<Unsigned,uint32_t>, Unsigned> atomicUMin([[vk::ext_reference]] Unsigned ptr, uint32_t memoryScope, uint32_t memorySemantics, Unsigned value);
template<typename T>
[[vk::ext_instruction(spv::OpAtomicFlagClear)]]
void atomicFlagClear([[vk::ext_reference]] T pointer, uint32_t memoryScope, uint32_t semantics);

template<typename Unsigned, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicUMin)]]
enable_if_t<is_spirv_type_v<Ptr_T> && is_same_v<Unsigned,uint32_t>, Unsigned> atomicUMin(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, Unsigned value);
template<typename P>
[[vk::ext_instruction(spv::OpAtomicFlagClear)]]
enable_if_t<is_spirv_type_v<P>, void> atomicFlagClear(P pointer, uint32_t memoryScope, uint32_t semantics);

Does a non-OpenCL SPIR-V environment even allow flag ops?

also I'm sure there would be some type constraint on T

Comment on lines 864 to 932
[[vk::ext_capability(spv::CapabilityGroupNonUniform)]]
[[vk::ext_instruction(spv::OpGroupNonUniformElect)]]
bool groupNonUniformElect(uint32_t executionScope);

template<typename Signed, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicSMax)]]
enable_if_t<is_spirv_type_v<Ptr_T> && is_same_v<Signed,int32_t>, Signed> atomicSMax(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, Signed value);
[[vk::ext_capability(spv::CapabilityGroupNonUniformVote)]]
[[vk::ext_instruction(spv::OpGroupNonUniformAll)]]
bool groupNonUniformAll(uint32_t executionScope, bool predicate);

template<typename Unsigned>
[[vk::ext_instruction( spv::OpAtomicUMax )]]
enable_if_t<is_same_v<Unsigned,uint32_t>, Unsigned> atomicUMax([[vk::ext_reference]] uint32_t ptr, uint32_t memoryScope, uint32_t memorySemantics, Unsigned value);
[[vk::ext_capability(spv::CapabilityGroupNonUniformVote)]]
[[vk::ext_instruction(spv::OpGroupNonUniformAny)]]
bool groupNonUniformAny(uint32_t executionScope, bool predicate);

template<typename Unsigned, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicUMax)]]
enable_if_t<is_spirv_type_v<Ptr_T> && is_same_v<Unsigned,uint32_t>, Unsigned> atomicUMax(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, Unsigned value);
[[vk::ext_capability(spv::CapabilityGroupNonUniformVote)]]
[[vk::ext_instruction(spv::OpGroupNonUniformAllEqual)]]
bool groupNonUniformAllEqual(uint32_t executionScope, bool value);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicExchange)]]
T atomicExchange([[vk::ext_reference]] T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBroadcast)]]
T groupNonUniformBroadcast(uint32_t executionScope, T value, uint32_t id);

template<typename T, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicExchange)]]
enable_if_t<is_spirv_type_v<Ptr_T>, T> atomicExchange(Ptr_T ptr, uint32_t memoryScope, uint32_t memorySemantics, T value);
template<typename T>
[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBroadcastFirst)]]
T groupNonUniformBroadcastFirst(uint32_t executionScope, T value);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBallot)]]
uint32_t4 groupNonUniformBallot(uint32_t executionScope, bool predicate);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformInverseBallot)]]
bool groupNonUniformInverseBallot(uint32_t executionScope, uint32_t4 value);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBallotBitExtract)]]
bool groupNonUniformBallotBitExtract(uint32_t executionScope, uint32_t4 value, uint32_t index);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBallotBitCount)]]
uint32_t groupNonUniformBallotBitCount(uint32_t executionScope, [[vk::ext_literal]] uint32_t operation, uint32_t4 value);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBallotFindLSB)]]
uint32_t groupNonUniformBallotFindLSB(uint32_t executionScope, uint32_t4 value);

[[vk::ext_capability(spv::CapabilityGroupNonUniformBallot)]]
[[vk::ext_instruction(spv::OpGroupNonUniformBallotFindMSB)]]
uint32_t groupNonUniformBallotFindMSB(uint32_t executionScope, uint32_t4 value);

template<typename T>
[[vk::ext_instruction(spv::OpAtomicCompareExchange)]]
T atomicCompareExchange([[vk::ext_reference]] T ptr, uint32_t memoryScope, uint32_t memSemanticsEqual, uint32_t memSemanticsUnequal, T value, T comparator);
[[vk::ext_capability(spv::CapabilityGroupNonUniformShuffle)]]
[[vk::ext_instruction(spv::OpGroupNonUniformShuffle)]]
T groupNonUniformShuffle(uint32_t executionScope, T value, uint32_t id);

template<typename T, typename Ptr_T> // DXC Workaround
[[vk::ext_instruction(spv::OpAtomicCompareExchange)]]
enable_if_t<is_spirv_type_v<Ptr_T>, T> atomicCompareExchange(Ptr_T ptr, uint32_t memoryScope, uint32_t memSemanticsEqual, uint32_t memSemanticsUnequal, T value, T comparator);
template<typename T>
[[vk::ext_capability(spv::CapabilityGroupNonUniformShuffle)]]
[[vk::ext_instruction(spv::OpGroupNonUniformShuffleXor)]]
T groupNonUniformShuffleXor(uint32_t executionScope, T value, uint32_t mask);

template<typename T>
[[vk::ext_capability(spv::CapabilityGroupNonUniformShuffleRelative)]]
[[vk::ext_instruction(spv::OpGroupNonUniformShuffleUp)]]
T groupNonUniformShuffleUp(uint32_t executionScope, T value, uint32_t delta);

template<typename T, uint32_t alignment>
[[vk::ext_capability(spv::CapabilityPhysicalStorageBufferAddresses)]]
[[vk::ext_instruction(spv::OpLoad)]]
T load(pointer_t<spv::StorageClassPhysicalStorageBuffer,T> pointer, [[vk::ext_literal]] uint32_t __aligned = /*Aligned*/0x00000002, [[vk::ext_literal]] uint32_t __alignment = alignment);
template<typename T>
[[vk::ext_capability(spv::CapabilityGroupNonUniformShuffleRelative)]]
[[vk::ext_instruction(spv::OpGroupNonUniformShuffleDown)]]
T groupNonUniformShuffleDown(uint32_t executionScope, T value, uint32_t delta);

does execution scope need to be a literal?

Contributor Author

Nope, just checked

Comment on lines 1218 to 1228
[[vk::ext_capability(spv::CapabilityAtomicFloat16MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
float atomicMinEXT_AtomicFloat16MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat32MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
float atomicMinEXT_AtomicFloat32MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat64MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
float atomicMinEXT_AtomicFloat64MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);
Member

wrong types: the first should be float16_t, the second float32_t, and the last float64_t

Comment on lines 1230 to 1232
[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
float atomicMinEXT_AtomicFloat16VectorNV([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

this would be for a vector<float16_t,N> not float
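
Both type fixes could come from the capability name itself. A hedged C++ sketch of what a generator might do; `valueTypeForFloatAtomicCap` is a hypothetical helper and the `N` in the vector case is left as a placeholder, since the component count is not visible in this hunk:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical generator helper: pick the HLSL value type from the capability name,
// e.g. "AtomicFloat16MinMaxEXT" -> "float16_t". Returns "" when no width is found.
std::string valueTypeForFloatAtomicCap(const std::string& capability)
{
    const std::string key = "Float";
    const std::size_t pos = capability.find(key);
    if (pos == std::string::npos)
        return "";

    // Collect the digits that follow "Float" (16, 32 or 64 in the names above).
    std::size_t i = pos + key.size();
    std::string width;
    while (i < capability.size() && std::isdigit(static_cast<unsigned char>(capability[i])))
        width += capability[i++];
    if (width.empty())
        return "";

    const std::string scalar = "float" + width + "_t";
    if (capability.find("Vector", i) != std::string::npos)
        return "vector<" + scalar + ", N>"; // N left open on purpose
    return scalar;
}
```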

Comment on lines 1234 to 1252
template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMinEXT_AtomicFloat16MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat32MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMinEXT_AtomicFloat32MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat64MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMinEXT_AtomicFloat64MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFMinEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMinEXT_AtomicFloat16VectorNV(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

same issue as with the non template P overloads

Comment on lines 1254 to 1288
[[vk::ext_capability(spv::CapabilityAtomicFloat16MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
float atomicMaxEXT_AtomicFloat16MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat32MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
float atomicMaxEXT_AtomicFloat32MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat64MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
float atomicMaxEXT_AtomicFloat64MinMaxEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
float atomicMaxEXT_AtomicFloat16VectorNV([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMaxEXT_AtomicFloat16MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename Signed>
[[vk::ext_instruction( spv::OpBitFieldSExtract )]]
enable_if_t<is_signed_v<Signed>, Signed> bitFieldSExtract( Signed val, uint32_t offsetBits, uint32_t numBits );

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat32MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMaxEXT_AtomicFloat32MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename Integral>
[[vk::ext_instruction( spv::OpBitFieldInsert )]]
Integral bitFieldInsert( Integral base, Integral insert, uint32_t offset, uint32_t count );

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat64MinMaxEXT)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMaxEXT_AtomicFloat64MinMaxEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFMaxEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicMaxEXT_AtomicFloat16VectorNV(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

Same issues as with your min overloads.

Comment on lines 1290 to 1324
[[vk::ext_capability(spv::CapabilityAtomicFloat16AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float atomicAddEXT_AtomicFloat16AddEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat32AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float atomicAddEXT_AtomicFloat32AddEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat64AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float atomicAddEXT_AtomicFloat64AddEXT([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float atomicAddEXT_AtomicFloat16VectorNV([[vk::ext_reference]] float pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicAddEXT_AtomicFloat16AddEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat32AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicAddEXT_AtomicFloat32AddEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat64AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicAddEXT_AtomicFloat64AddEXT(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

template<typename P>
[[vk::ext_capability(spv::CapabilityAtomicFloat16VectorNV)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
enable_if_t<is_spirv_type_v<P>, float> atomicAddEXT_AtomicFloat16VectorNV(P pointer, uint32_t memoryScope, uint32_t semantics, float value);

Same float sizing issues as with the min/max overloads: the Float16 and Float64 variants still take and return float.
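
To make the float sizing concern concrete, here is a rough sketch of what width-correct declarations could look like. It is not the generator's actual output: it assumes the naming rule requested earlier in this review (Op enum name without `Op`, lowercase first letter), assumes one overload per float width, and assumes `float16_t` is enabled in the compile options. Only the attributes and enum names already used in this PR appear here.

```hlsl
// Hypothetical sketch, not the PR's codegen: the value type follows the capability
// instead of being hard-coded to float.

[[vk::ext_capability(spv::CapabilityAtomicFloat16AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float16_t atomicFAddEXT([[vk::ext_reference]] float16_t pointer, uint32_t memoryScope, uint32_t semantics, float16_t value);

[[vk::ext_capability(spv::CapabilityAtomicFloat32AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float32_t atomicFAddEXT([[vk::ext_reference]] float32_t pointer, uint32_t memoryScope, uint32_t semantics, float32_t value);

[[vk::ext_capability(spv::CapabilityAtomicFloat64AddEXT)]]
[[vk::ext_instruction(spv::OpAtomicFAddEXT)]]
float64_t atomicFAddEXT([[vk::ext_reference]] float64_t pointer, uint32_t memoryScope, uint32_t semantics, float64_t value);
```

With overloads like these, the operand width picks the capability, so the capability no longer needs to be spelled out in the function name. Whether the generator can emit this shape is an open question for the PR author.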

Comment on lines 1326 to 1332
[[vk::ext_capability(spv::CapabilitySplitBarrierINTEL)]]
[[vk::ext_instruction(spv::OpControlBarrierArriveINTEL)]]
void controlBarrierArriveINTEL(uint32_t executionScope, uint32_t memoryScope, uint32_t semantics);

[[vk::ext_capability(spv::CapabilitySplitBarrierINTEL)]]
[[vk::ext_instruction(spv::OpControlBarrierWaitINTEL)]]
void controlBarrierWaitINTEL(uint32_t executionScope, uint32_t memoryScope, uint32_t semantics);

these are OpenCL AFAIK

Comment on lines -24 to -32
[[vk::ext_capability(spv::CapabilityFragmentShaderPixelInterlockEXT)]]
[[vk::ext_extension("SPV_EXT_fragment_shader_interlock")]]
[[vk::ext_instruction(spv::OpBeginInvocationInterlockEXT)]]
void beginInvocationInterlockEXT();

[[vk::ext_capability(spv::CapabilityFragmentShaderPixelInterlockEXT)]]
[[vk::ext_extension("SPV_EXT_fragment_shader_interlock")]]
[[vk::ext_instruction(spv::OpEndInvocationInterlockEXT)]]
void endInvocationInterlockEXT();

These are missing from your new codegen. Please test with example 62 CAD as well as the arithmetic unit test.

Comment on lines -17 to -72
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformIAdd )]]
int32_t groupAdd(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, int32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformIAdd )]]
uint32_t groupAdd(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, uint32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformFAdd )]]
float32_t groupAdd(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, float32_t value);

[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformIMul )]]
int32_t groupMul(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, int32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformIMul )]]
uint32_t groupMul(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, uint32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformFMul )]]
float32_t groupMul(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, float32_t value);

template<typename T>
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformBitwiseAnd )]]
T groupBitwiseAnd(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, T value);

template<typename T>
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformBitwiseOr )]]
T groupBitwiseOr(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, T value);

template<typename T>
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformBitwiseXor )]]
T groupBitwiseXor(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, T value);

// The MIN and MAX operations in SPIR-V have different Ops for each arithmetic type
// so we implement them distinctly
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformSMin )]]
int32_t groupBitwiseMin(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, int32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformUMin )]]
uint32_t groupBitwiseMin(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, uint32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformFMin )]]
float32_t groupBitwiseMin(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, float32_t value);

[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformSMax )]]
int32_t groupBitwiseMax(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, int32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformUMax )]]
uint32_t groupBitwiseMax(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, uint32_t value);
[[vk::ext_capability( spv::CapabilityGroupNonUniformArithmetic )]]
[[vk::ext_instruction( spv::OpGroupNonUniformFMax )]]
float32_t groupBitwiseMax(uint32_t groupScope, [[vk::ext_literal]] uint32_t operation, float32_t value);

@kpentaris ogle this and check whether the new codegen looks sane

Signed-off-by: Ali Cheraghi <[email protected]>
Signed-off-by: Ali Cheraghi <[email protected]>
Signed-off-by: Ali Cheraghi <[email protected]>