LLVM and SPIRV-LLVM-Translator pulldown (WW38 2024) #15464

Draft: wants to merge 1,525 commits into base: sycl

@iclsrc iclsrc commented Sep 21, 2024

hekota and others added 30 commits August 29, 2024 21:42
Introducing `HLSLAttributedResourceType`, a new type that is similar to
`AttributedType` but carries additional data specific to HLSL resources.
`AttributedType` currently stores only an attribute kind and no
additional data from the type attribute parameters. This does not
work for HLSL resources, since their type attributes contain non-boolean
values that need to be retained as well.

For example:

```
template <typename T> class RWBuffer {
  __hlsl_resource_t  [[hlsl::resource_class(uav)]] [[hlsl::is_rov]] handle;
};
```

The data `HLSLAttributedResourceType` needs to eventually store are:
- resource class (SRV, UAV, CBuffer, Sampler)
- texture dimension (1-3)
- flags is_rov, is_array, is_feedback and is_multisample
- contained type

All of these values except the contained type will be stored in the
`HLSLAttributedResourceType::Attributes` struct and accessed
individually via its fields. There is also a `Data` alias that covers all
of these values as an `unsigned`, which is used for hashing and AST
type serialization.
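
As a rough illustration of the "fields plus a single `unsigned` view" idea described above, here is a minimal sketch. The names `ResourceAttrs`, `packAttrs` and `unpackAttrs`, as well as the bit layout, are hypothetical and chosen only for this example; they are not the layout Clang uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not Clang's definitions.
enum class ResourceClass : uint8_t { SRV = 0, UAV = 1, CBuffer = 2, Sampler = 3 };

struct ResourceAttrs {
  ResourceClass Class = ResourceClass::SRV;
  uint8_t Dimension = 0; // texture dimension 1-3 (0 = unset in this sketch)
  bool IsROV = false;
  bool IsArray = false;
  bool IsFeedback = false;
  bool IsMultisample = false;
};

// Pack every attribute into one unsigned value, e.g. for hashing or
// serialization. Assumed layout: bits 0-1 class, bits 2-3 dimension,
// bits 4-7 flags.
unsigned packAttrs(const ResourceAttrs &A) {
  assert(A.Dimension <= 3 && "dimension must fit in two bits");
  unsigned Data = static_cast<unsigned>(A.Class) & 0x3u;
  Data |= (static_cast<unsigned>(A.Dimension) & 0x3u) << 2;
  Data |= (A.IsROV ? 1u : 0u) << 4;
  Data |= (A.IsArray ? 1u : 0u) << 5;
  Data |= (A.IsFeedback ? 1u : 0u) << 6;
  Data |= (A.IsMultisample ? 1u : 0u) << 7;
  return Data;
}

// Inverse of packAttrs: recover the individual fields from the packed value.
ResourceAttrs unpackAttrs(unsigned Data) {
  ResourceAttrs A;
  A.Class = static_cast<ResourceClass>(Data & 0x3u);
  A.Dimension = static_cast<uint8_t>((Data >> 2) & 0x3u);
  A.IsROV = ((Data >> 4) & 1u) != 0;
  A.IsArray = ((Data >> 5) & 1u) != 0;
  A.IsFeedback = ((Data >> 6) & 1u) != 0;
  A.IsMultisample = ((Data >> 7) & 1u) != 0;
  return A;
}
```

Two attribute sets compare equal exactly when their packed values are equal, which is what makes a single integer convenient as a hash key in a sketch like this.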

During type attribute processing all HLSL type attributes will be
validated and collected by SemaHLSL (by
`SemaHLSL::handleResourceTypeAttr`) and in the end combined into a
single `HLSLAttributedResourceType` instance (in
`SemaHLSL::ProcessResourceTypeAttributes`). `SemaHLSL` will also need to
short-term store the `TypeLoc` information for the new type that will be
grabbed by `TypeSpecLocFiller` soon after the type is created.

Part 1/2 of #104861
ALLOCATE and DEALLOCATE statements can be inlined in device functions.
This patch updates the condition that determines whether to inline these
actions during lowering.

This avoids runtime calls in device function code and can speed up
execution.

Also move `isCudaDeviceContext` out of `Bridge.cpp` so it can be used
elsewhere.
Allow some interaction between the LLVM and FIR dialects by allowing
conversion between FIR memory types and the llvm.ptr type.
This is meant to help experimentation where the FIR and LLVM dialects
coexist, and is useful for cases where an LLVM type makes it
early into the MLIR produced by flang, such as when inserting the LLVM stack
intrinsic here:
https://github.com/llvm/llvm-project/blob/0a00d32c5f88fce89006dcde6e235bc77d7b495e/flang/lib/Optimizer/Transforms/StackReclaim.cpp#L57
The current pattern was failing OpenACC semantics in acc parse tree
canonicalization:

```
!acc loop
!dir vector aligned
do i=1,n
...
```

Fix it by moving the directive before the OpenACC construct node.

Note that I think it could make sense to propagate the $dir info to the
acc.loop: at least with classic flang, the $dir seems to make a
difference. This is not done here since few directives are supported
anyway.
…oDate function

This patch extracts the ModuleFile class from StandalonePrerequisiteModules
so that we can reuse it further. It also adds the
IsModuleFileUpToDate function, which is used to implement
StandalonePrerequisiteModules::CanReuse. Both changes aim to ease
future improvements to the support of modules in clangd, and both
should be NFC.
__getauxval is a libgcc function that doesn't exist on Android.
Also use getauxval on Linux, as it is already used elsewhere in compiler-rt.
…Date to reference

It is better to use references instead of pointers as the argument type
of IsModuleFileUpToDate, since the PrerequisiteModules is always
expected to exist.
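
The general trade-off behind this change can be sketched as follows. The type `Prerequisites` and both functions are hypothetical stand-ins for illustration, not the clangd code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an object that callers always have available.
struct Prerequisites {
  std::string ModuleName;
  bool Fresh;
};

// Pointer parameter: the signature permits nullptr, so the callee has to
// decide what a missing object means.
bool isUpToDatePtr(const Prerequisites *P) {
  return P != nullptr && P->Fresh;
}

// Reference parameter: the signature itself states that the object exists,
// so no null check is needed.
bool isUpToDateRef(const Prerequisites &P) { return P.Fresh; }
```

With the reference form, the "always expected to exist" requirement is expressed in the type rather than in a comment or a runtime check.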
…564)

shortloop is a non-standard OpenACC extension
(https://docs.nvidia.com/hpc-sdk/pgi-compilers/2015/pgirn157.pdf) that
can be found on loop directives.

The f18 parser was choking when seeing it. Since it can be found in existing
apps and is mainly an optimization hint, parse it on loop directives and
ignore it with a warning.

For the record, here is the meaning of shortloop according to the manual linked above:

"If the shortloop clause appears on a loop directive with the vector clause, it tells the compiler that the
loop trip count is less than or equal to the number of vector lanes created for that loop. This means the
value of the vector() clause on the loop directive in a kernels region, or the value of the
vector_length() clause on the parallel directive in a parallel region will be greater than or
equal to the loop trip count. This allows the compiler to generate more efficient code for the loop"
…(#106621)

Without these explicit includes, removing other headers that implicitly
include llvm-config.h may have non-trivial side effects.
Similarly to the existing range attribute inference, also infer the
nonnull attribute on function return values.

I think in practice FunctionAttrs will handle nearly all cases; the main
ones it doesn't handle are, I think, cases involving branch conditions. But as we
already have the information here, we may as well materialize it.
…intrinsic (#103037)

Adds support for wider-than-legal vector types for the histogram
intrinsic (llvm.experimental.vector.histogram.add) by splitting the
vector. Also adds integer promotion for the Inc operand.
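
Why splitting is sound can be seen with a scalar model of the operation: each active lane adds the increment to the bucket it addresses, so processing the low half and then the high half updates the same buckets as processing all lanes at once. The sketch below uses bucket indices instead of pointers, and the function names are invented for this example; it models the semantics only and is not the legalizer code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a masked histogram update over lanes [Begin, End):
// every active lane adds Inc to the bucket selected by its index.
void histogramAdd(std::vector<int> &Buckets, const std::vector<std::size_t> &Idx,
                  const std::vector<bool> &Mask, int Inc, std::size_t Begin,
                  std::size_t End) {
  assert(Idx.size() == Mask.size() && End <= Idx.size());
  for (std::size_t I = Begin; I < End; ++I)
    if (Mask[I])
      Buckets[Idx[I]] += Inc;
}

// "Split" version: handle the low half and the high half as two narrower
// operations, the way a wider-than-legal vector would be divided.
void histogramAddSplit(std::vector<int> &Buckets,
                       const std::vector<std::size_t> &Idx,
                       const std::vector<bool> &Mask, int Inc) {
  std::size_t Half = Idx.size() / 2;
  histogramAdd(Buckets, Idx, Mask, Inc, 0, Half);
  histogramAdd(Buckets, Idx, Mask, Inc, Half, Idx.size());
}
```

Because each lane's update is independent of which half it lands in, both functions leave `Buckets` in the same state.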
…-bundles`. NFC. (#106661)

With `-view-edge-bundles`, before the change, the dot file output is
kinda like
```dot
digraph {
        "%bb.0" [ shape=box ]
        0 -> "%bb.0"
        "%bb.0" -> 1
        "%bb.0" -> "%bb.1" [ color=lightgray ]
        "%bb.0" -> "%bb.6" [ color=lightgray ]
        "%bb.1" [ shape=box ]
        1 -> "%bb.1"
        "%bb.1" -> 1
        "%bb.1" -> "%bb.2" [ color=lightgray ]
        "%bb.1" -> "%bb.6" [ color=lightgray ]
        "%bb.2" [ shape=box ]
        1 -> "%bb.2"
        "%bb.2" -> 1
        "%bb.2" -> "%bb.3" [ color=lightgray ]
        "%bb.3" [ shape=box ]
        1 -> "%bb.3"
        "%bb.3" -> 2
        "%bb.3" -> "%bb.4" [ color=lightgray ]
        "%bb.4" [ shape=box ]
        2 -> "%bb.4"
        "%bb.4" -> 2
        "%bb.4" -> "%bb.4" [ color=lightgray ]
        "%bb.4" -> "%bb.5" [ color=lightgray ]
        "%bb.5" [ shape=box ]
        2 -> "%bb.5"
        "%bb.5" -> 1
        "%bb.5" -> "%bb.6" [ color=lightgray ]
        "%bb.5" -> "%bb.3" [ color=lightgray ]
        "%bb.6" [ shape=box ]
        1 -> "%bb.6"
        "%bb.6" -> 3
}
```
However, the graph output by graphviz is

![t](https://github.com/user-attachments/assets/24056c0a-3ba9-49c3-a5da-269f3140e619)
The node name corresponding to the MBB is incorrect.
After the change, the node name is consistent with MBB's name.

![s](https://github.com/user-attachments/assets/38c649d1-7222-4de1-971c-56f7721ab64c)
[D156118](https://reviews.llvm.org/D156118) states that this note is
always present, but it is better to check it explicitly, as otherwise
`lldb` may crash when trying to read registers.
This patch sets the timeout of the code formatting job to 30 minutes.
The job is currently failing in specific circumstances and needs to be
reworked, but as a temporary hack, change the timeout to 30 minutes so that
we can catch these jobs before they hit the GitHub Actions timeout limit
of six hours.

Somewhat (hackily) alleviates #79661.
…5617)

Follow-up on 8ac140f.

The test `SemaTemplate/default-parm-init.cpp` was introduced since the
fix #80288 and mainly did the following things:

- Ensure the default arguments are properly substituted inside both
the primary template and its explicit / out-of-line specializations.
- Ensure the strategy doesn't mess up the substitution of a lambda
expression as a default argument.

The 1st is for the bug in #68490, yet it does some redundant work: each
of the member functions is duplicated for the `sizeof` and
`alignof` operators, respectively, and the principles under the hood are
essentially the same. So this patch removes the duplication and reduces
the 8 functions to 4 functions that reveal the same thing.

The 2nd is presumably testing that the fix in #80288 doesn't impact a
complicated substitution. However, that seems unnecessary and unrelated to
the original issue. More importantly, we have never had any problems
with that. Hence, this patch removes that test.

The test for default arguments is merged into
`SemaTemplate/default-arguments.cpp` with a new namespace, and hopefully
this could reduce the entropy of our testing cases.
…S_D instruction

Reviewed By: heiher, SixWeining

Pull Request: llvm/llvm-project#106332
precommit f16 test for #87506 fp-int conversion
The PR llvm/llvm-project#105996 broke taking the
address of a vector:

**compound-literal.c**
```C
typedef int v4i32 __attribute((vector_size(16)));
v4i32 *y = &(v4i32){1,2,3,4};
```
That is because the current interpreter handles vector unary operators as a
fallback when the generic code path fails, but the new interpreter
does not. We need to handle `UO_AddrOf` in
`Compiler<Emitter>::VisitVectorUnaryOperator`.

Signed-off-by: yronglin <[email protected]>
CompilerInstance can re-use the same SourceManager across multiple
frontend actions. During this process it calls
`SourceManager::clearIDTables` to reset any caches based on FileIDs.

It didn't reset IncludeLocMap, resulting in wrong include locations for
workflows that triggered multiple frontend actions through the same
CompilerInstance.
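
The failure mode is the usual one for a memoization table keyed by an ID that gets recycled. The toy class below is a hypothetical model, not Clang's `SourceManager`; the `AlsoClearIncludeCache` flag exists only so the sketch can reproduce both the stale and the corrected behaviour.

```cpp
#include <cassert>
#include <map>

// Toy model for illustration: IDs restart from 1 after a reset, and include
// lines are memoized per ID.
class ToySourceManager {
  int NextID = 1;
  std::map<int, int> IncludeLineTable; // ID -> line of the #include
  std::map<int, int> IncludeLocCache;  // memoized lookups keyed by ID

public:
  int addFile(int IncludeLine) {
    int ID = NextID++;
    IncludeLineTable[ID] = IncludeLine;
    return ID;
  }

  int getIncludeLine(int ID) {
    auto It = IncludeLocCache.find(ID);
    if (It != IncludeLocCache.end())
      return It->second; // served from the cache
    int Line = IncludeLineTable.at(ID);
    IncludeLocCache[ID] = Line;
    return Line;
  }

  // Passing false models the bug: IDs are recycled but the ID-keyed cache
  // survives the reset.
  void clearIDTables(bool AlsoClearIncludeCache = true) {
    IncludeLineTable.clear();
    NextID = 1;
    if (AlsoClearIncludeCache)
      IncludeLocCache.clear();
  }
};
```

In this model, a second run that reuses ID 1 for a different file gets the first run's include line unless the cache is cleared together with the ID tables.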
This patch adds a check for the multiples of `tosa.tile`. The `multiples` in
`tosa.tile` indicate how many times the tensor should be replicated
along each dimension. Zero and negative values are invalid, except for
-1, which represents a dynamic value. Therefore, each element of
`multiples` should be a positive integer or -1. Fix #106167.
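
The rule stated above (each element is a positive integer or -1) can be written as a small predicate. This is a standalone sketch; `areMultiplesValid` and `kDynamic` are names made up for the example and are not the MLIR verifier API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sentinel for a dynamic multiple, as described in the commit message.
constexpr int64_t kDynamic = -1;

// Returns true when every element is either positive or the dynamic
// sentinel; zero and any other negative value are rejected.
bool areMultiplesValid(const std::vector<int64_t> &Multiples) {
  for (int64_t M : Multiples)
    if (M <= 0 && M != kDynamic)
      return false;
  return true;
}
```

For instance, `{2, -1, 3}` passes this check, while `{2, 0}` and `{-2}` do not.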
An optimizable cast can also be removed by VPlan simplifications. Remove
the restriction from planContainsAdditionalSimplifications, as this
causes it to miss relevant simplifications, triggering false positives
for the cost decision verification.

Also adds debug output for printing additional cost-precomputations.

Fixes llvm/llvm-project#106641.
…ectorsCombine. (#104774)

UZP2 requires both operands to match the result type, but the combine tries to replace a truncate by passing the pre-truncated operands directly to a UZP2 with the truncated result type. This patch nop-casts the operands to keep the DAG consistent. There should be no changes to the generated code, which is fine as is.

This patch also enables more target specific getNode() validation for fixed length vector types.
DavidSpickett and others added 19 commits September 2, 2024 09:06
llvm/llvm-project#106075 has removed the
last dependency on LoopInfo in InstCombine, so don't fetch the
analysis anymore and remove the use-loop-info pass option.
CONFLICT (content): Merge conflict in llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll
CONFLICT (content): Merge conflict in llvm/utils/git/requirements.txt
CONFLICT (content): Merge conflict in llvm/utils/git/requirements_formatting.txt
We already ignore llvm.trap, adding llvm.debugtrap as well.

Signed-off-by: Marcos Maronas <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@3e523388f13bcc7
also fix formatting issue from a recent PR.

Signed-off-by: Sidorov, Dmitry <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@9b0f29c488ada22
This allows `max_work_group_size` metadata to behave like
`reqd_work_group_size` and `work_group_size_hint` in that it may have
between 1 and 3 operands. Missing dimensions are filled in with 1s.
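
The fill-in rule can be sketched as below. `padWorkGroupSize` is a hypothetical helper written for this illustration and is not the translator's code; it only shows "1 to 3 operands, missing dimensions become 1".

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Expand 1 to 3 work-group-size operands to exactly three dimensions,
// filling every missing dimension with 1.
std::array<unsigned, 3> padWorkGroupSize(const std::vector<unsigned> &Operands) {
  assert(!Operands.empty() && Operands.size() <= 3 &&
         "expected between 1 and 3 operands");
  std::array<unsigned, 3> Dims = {1, 1, 1};
  for (std::size_t I = 0; I < Operands.size(); ++I)
    Dims[I] = Operands[I];
  return Dims;
}
```

With this helper, a single operand `{8}` becomes `{8, 1, 1}` and `{8, 4}` becomes `{8, 4, 1}`.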

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@c38378cd5d814ed
In LLVM, array allocations might have a constant size:

`%array = alloca i32, i64 4, align 4`

Represent this kind of allocation using OpVariable + OpBitcast.

Before this patch, the SPV_INTEL_variable_length_array extension was used.

Signed-off-by: Victor Perez <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@ea2fcc172f6861e
)

The refactoring is to simplify the vectorization of generated functions.

Signed-off-by: Cui, Dele <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@a5952614c12594c
Per cl_intel_subgroups_short V 1.1.0 short16 is allowed for these
builtins.

Signed-off-by: Sidorov, Dmitry <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@6895a2eb5d053d8
…kend target (#2716)

This adds a new command-line option, "spirv-use-llvm-backend-target", that translates LLVM IR to SPIR-V using the LLVM SPIRV Backend target. The cmake definitions are modified to search for the LLVM SPIRV Backend target while configuring the project. When LLVM is built with SPIRV Backend support, the interface exposed by the SPIRV BE may be used to translate a Module to SPIR-V code, but only if the user explicitly asks for this way of LLVM IR transformation.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@dfeb22b8696d2f5
Variadic functions are not supported in SPIR-V; the only known exception is printf.

Signed-off-by: Marcos Maronas <[email protected]>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@569972a61c86aa6
…ure (#2722)

Fix issue #2721: Incorrect translation of calls to a builtin that returns a structure: create just one load, and account for a special case when Translator prepared a well-known pattern (store/load) for itself.

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@dc1221cd83e67ef
@iclsrc iclsrc added the disable-lint Skip linter check step and proceed with build jobs label Sep 21, 2024