{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":511514284,"defaultBranch":"main","name":"llvm-project","ownerLogin":"sjoerdmeijer","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2022-07-07T12:14:15.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/54853335?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1725028613.0","currentOid":""},"activityList":{"items":[{"before":"aa4220e76da6e64c958234ff75ab408631543f00","after":null,"ref":"refs/heads/testsuite-benchmarks","pushedAt":"2024-08-30T14:36:53.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"}},{"before":"46d2ee2eb3664fe1bd405bb80be618ab2c2d3074","after":"aa4220e76da6e64c958234ff75ab408631543f00","ref":"refs/heads/testsuite-benchmarks","pushedAt":"2024-08-30T14:17:44.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[test-suite] Document the LLVM test-suite benchmark apps\n\nThere is no documentation or description of the different apps in the\nLLVM benchmark test-suite and this is a first attempt to document this.","shortMessageHtmlLink":"[test-suite] Document the LLVM test-suite benchmark apps"}},{"before":null,"after":"46d2ee2eb3664fe1bd405bb80be618ab2c2d3074","ref":"refs/heads/testsuite-benchmarks","pushedAt":"2024-08-23T15:10:30.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Document the LLVM test-suite benchmark apps\n\nThere is no documentation or description of the different apps in the\nLLVM benchmark test-suite and this is a first attempt to document this\nfor the MultiSource apps.","shortMessageHtmlLink":"Document the LLVM test-suite benchmark apps"}},{"before":"34e15adb5a725a5ecc7c0d5ef5571d307d751a93","after":"6a8f73803a32db75d22490d341bf8744722a9025","ref":"refs/heads/main","pushedAt":"2024-08-23T15:08:22.000Z","pushType":"push","commitsCount":438,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Revert \"Reland \"[asan] Remove debug tracing from `report_globals` (#104404)\" (#105601)\"\n\nthat change still breaks\n\n SanitizerCommon-asan-x86_64-Darwin :: Darwin/print-stack-trace-in-code-loaded-after-fork.cpp\n\n> This reverts commit 2704b804bec50c2b016bf678bd534c330ec655b6\n> and relands #104404.\n>\n> The Darwin should not fail after #105599.\n\nThis reverts commit 8c6f8c29e90666b747fc4b4612647554206a2be5.","shortMessageHtmlLink":"Revert \"Reland \"[asan] Remove debug tracing from report_globals (ll…"}},{"before":"f678232c16edd66156ea0df89d834fa65ccbedd5","after":null,"ref":"refs/heads/testsuite-benchmarks","pushedAt":"2024-08-23T15:08:00.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"}},{"before":null,"after":"f678232c16edd66156ea0df89d834fa65ccbedd5","ref":"refs/heads/testsuite-benchmarks","pushedAt":"2024-08-23T15:04:19.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[test-suite] Document the LLVM test-suite benchmark apps\n\nThere is no documentation or description of the different apps in the\nLLVM benchmark test-suite and this is a first attempt to document this\nfor the MultiSource apps.","shortMessageHtmlLink":"[test-suite] Document the LLVM test-suite benchmark apps"}},{"before":"db5df85f36664cfefa5b80d520415bc34d7d9c8f","after":null,"ref":"refs/heads/fix-assert-fixed-elem","pushedAt":"2024-08-21T14:27:27.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"}},{"before":"253c608932636db9ab88905d2e2a1c408fb15707","after":"db5df85f36664cfefa5b80d520415bc34d7d9c8f","ref":"refs/heads/fix-assert-fixed-elem","pushedAt":"2024-08-21T12:35:28.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[AArch64] Bail out for scalable vecs in areExtractShuffleVectors\n\nThe added test triggers the following assert in\n`areExtractShuffleVectors` that is called from `shouldSinkOperands`:\n\nAssertion `(!isScalable() || isZero()) && \"Request for a fixed element count on a scalable object\"' failed.\n\nI don't think scalable types can be extract shuffles, so bail early if\nthis is the case.","shortMessageHtmlLink":"[AArch64] Bail out for scalable vecs in areExtractShuffleVectors"}},{"before":null,"after":"253c608932636db9ab88905d2e2a1c408fb15707","ref":"refs/heads/fix-assert-fixed-elem","pushedAt":"2024-08-21T08:41:38.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[AArch64] Assert \"Request for a fixed element count on a scalable object\"\n\nThe added test triggers this assert in `areExtractShuffleVectors` that\nis called from `shouldSinkOperands`:\n\nAssertion `(!isScalable() || isZero()) && \"Request for a fixed element count on a scalable object\"' failed.\n\nI don't think scalable types can be extract shuffles, so bail early if\nthis is the case.","shortMessageHtmlLink":"[AArch64] Assert \"Request for a fixed element count on a scalable obj…"}},{"before":"f231d3dab3da9966621ed4e72847f1292db54ede","after":"34e15adb5a725a5ecc7c0d5ef5571d307d751a93","ref":"refs/heads/main","pushedAt":"2024-08-20T12:47:09.000Z","pushType":"push","commitsCount":1639,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[AArch64] Remove TargetParser CPU/Arch feature tests (#104587)\n\nThese are annoying to update, and are redundant since the tests in\r\nclang/test/Driver/print-enabled-extensions/ were added.","shortMessageHtmlLink":"[AArch64] Remove TargetParser CPU/Arch feature tests (llvm#104587)"}},{"before":"03e17da510963ce6b7a1d0ab4d67f753a6cc7495","after":"f231d3dab3da9966621ed4e72847f1292db54ede","ref":"refs/heads/main","pushedAt":"2024-08-05T09:50:47.000Z","pushType":"push","commitsCount":690,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Fix some X86 tests (#101944)\n\nextractelement-shuffle.ll: Test for bugfix in DAGCombiner, moved to\r\nGeneric.\r\n\r\n2010-07-06-DbgCrash.ll and 2006-10-02-BoolRetCrash.ll: Bugfixes in X86,\r\nrun tests with X86 backend.","shortMessageHtmlLink":"Fix some X86 tests (llvm#101944)"}},{"before":"6507e9e259cac8d3749f071c8a762a8c1a6d7072","after":"4800695ca223c659149ae11c9b417059dc83c579","ref":"refs/heads/ofast","pushedAt":"2024-07-29T18:53:54.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Update clang/docs/CommandGuide/clang.rst\n\nCo-authored-by: Vlad Serebrennikov ","shortMessageHtmlLink":"Update clang/docs/CommandGuide/clang.rst"}},{"before":"22f2e40c6d44c562e97162f04dee055a1be57460","after":"6507e9e259cac8d3749f071c8a762a8c1a6d7072","ref":"refs/heads/ofast","pushedAt":"2024-07-29T14:54:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Update clang/docs/CommandGuide/clang.rst\n\nCo-authored-by: Aaron Ballman ","shortMessageHtmlLink":"Update clang/docs/CommandGuide/clang.rst"}},{"before":"7357ef4d5b346d0c317ff09c6700fa944f6ae770","after":"22f2e40c6d44c562e97162f04dee055a1be57460","ref":"refs/heads/ofast","pushedAt":"2024-07-29T14:31:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Ofast documentation deprecation clarifications\n\nFollowing up on the RFC discussion, this is clarifying that the main purpose\nand effect of the -Ofast deprecation is to discourage its usage and that\neverything else is more or less open for discussion, e.g. there is no timeline\nyet for removal.","shortMessageHtmlLink":"Ofast documentation deprecation clarifications"}},{"before":null,"after":"7357ef4d5b346d0c317ff09c6700fa944f6ae770","ref":"refs/heads/ofast","pushedAt":"2024-07-29T12:39:17.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Ofast deprecation clarifications\n\nFollowing up on the RFC discussion, this is clarifying that the main purpose\nand effect of the -Ofast deprecation is to discourage its usage and that\neverything else is more or less open for discussion, e.g. there is no timeline\nyet for removal.","shortMessageHtmlLink":"Ofast deprecation clarifications"}},{"before":"e3a3397209fe05ec65d74e9096347fc7a76e919e","after":"03e17da510963ce6b7a1d0ab4d67f753a6cc7495","ref":"refs/heads/main","pushedAt":"2024-07-29T09:29:30.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[DWARF] Emit line 0 source locations for BB padding nops (#99496)\n\nThis patch makes LLVM emit line 0 source locations for nops emitted as\r\nbasic block padding.\r\n\r\n---------\r\n\r\nCo-authored-by: Orlando Cazalet-Hyams ","shortMessageHtmlLink":"[DWARF] Emit line 0 source locations for BB padding nops (llvm#99496)"}},{"before":"1feef92a775daf8818faf766e0b1332421b48c5f","after":"e3a3397209fe05ec65d74e9096347fc7a76e919e","ref":"refs/heads/main","pushedAt":"2024-07-29T09:20:52.000Z","pushType":"push","commitsCount":438,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Revert \"[Clang] Demote always_inline error to warning for mismatching SME attrs\" (#100991)\n\nReverts llvm/llvm-project#100740","shortMessageHtmlLink":"Revert \"[Clang] Demote always_inline error to warning for mismatching…"}},{"before":"d1b2fd0a6f42f4e7e2e7e3d730fb8e36cc358c4f","after":"0160e7ac7f28c7256f123fe360086a6fd802085a","ref":"refs/heads/interleave4","pushedAt":"2024-07-29T07:53:04.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[AArch64] Set MaxInterleaving to 4 for Neoverse V2 and V3\n\nThis helps loop based benchmarks quite a lot, SPEC INT is unaffected.","shortMessageHtmlLink":"[AArch64] Set MaxInterleaving to 4 for Neoverse V2 and V3"}},{"before":null,"after":"d1b2fd0a6f42f4e7e2e7e3d730fb8e36cc358c4f","ref":"refs/heads/interleave4","pushedAt":"2024-07-24T14:08:35.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[AArch64] Set MaxInterleaving to 4 for Neoverse V2\n\nThis helps loop based benchmarks quite a lot, SPEC INT is unaffected.","shortMessageHtmlLink":"[AArch64] Set MaxInterleaving to 4 for Neoverse V2"}},{"before":"9b9194af408003e7d484d621fb3ee61389bdd20e","after":"1feef92a775daf8818faf766e0b1332421b48c5f","ref":"refs/heads/main","pushedAt":"2024-07-24T14:06:29.000Z","pushType":"push","commitsCount":970,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"Fix lifetimebound for field access (#100197)\n\nFixes: https://github.com/llvm/llvm-project/issues/81589\r\n\r\nThere is no way to switch this off without `-Wno-dangling`.","shortMessageHtmlLink":"Fix lifetimebound for field access (llvm#100197)"}},{"before":"b037d0f0e5f6c7ab528fe3ed9d855f0d770c6709","after":"9b9194af408003e7d484d621fb3ee61389bdd20e","ref":"refs/heads/main","pushedAt":"2024-07-17T10:00:32.000Z","pushType":"push","commitsCount":311,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[gn build] Port e94e72a0c229","shortMessageHtmlLink":"[gn build] Port e94e72a"}},{"before":"2c50720ab53d3f02ec30f632e38c65ebaadc8613","after":null,"ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-07-17T09:59:58.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"}},{"before":"e65cb498422a6cbd9bdd2eb3bd25265ae60ecfa3","after":"2c50720ab53d3f02ec30f632e38c65ebaadc8613","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-07-15T08:43:57.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2 we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost to both for certain\nloops. This improves 7 kernels from TSVC-2 and several production kernels by\nabout 2x, and does not affect SPEC21017 INT and FP. This also adds a new TTI\nhook that can steer the loop vectorizater to preferring fixed width\nvectorization, which can be set per CPU. For now, this is only enabled for the\nNeoverse V2.\n\nThere are 3 reasons why preferring NEON might be better in the case the\ncost-model is a tie and the SVE vector size is the same as NEON (128-bit):\narchitectural reasons, micro-architecture reasons, and SVE codegen reasons. The\nlatter will be improved over time, so the more important reasons are the former\ntwo. I.e., (micro) architecture reason is the use of LPD/STP instructions which\nare not available in SVE2 and it avoids predication.\n\nFor what it is worth: this codegen strategy to generate more NEON is inline\nwith GCC's codegen strategy, which is actually even more aggressive in\ngenerating NEON when no predication is required. We could be smarter about the\ndecision making, but this seems to be a first good step in the right direction,\nand we can always revise this later (for example make the target hook more\ngeneral).","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"c3c1c6a63dfb2c43b59a884769defca12f5e4389","after":"e65cb498422a6cbd9bdd2eb3bd25265ae60ecfa3","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-07-15T08:35:59.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2 we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost to both for certain\nloops. This improves 7 kernels from TSVC-2 and several production kernels by\nabout 2x, and does not affect SPEC21017 INT and FP. This also adds a new TTI\nhook that can steer the loop vectorizater to preferring fixed width\nvectorization, which can be set per CPU. For now, this is only enabled for the\nNeoverse V2.\n\nThere are 3 reasons why preferring NEON might be better in the case the\ncost-model is a tie and the SVE vector size is the same as NEON (128-bit):\narchitectural reasons, micro-architecture reasons, and SVE codegen reasons. The\nlatter will be improved over time, so the more important reasons are the former\ntwo. I.e., (micro) architecture reason is the use of LPD/STP instructions which\nare not available in SVE2 and it avoids predication.\n\nFor what it is worth: this codegen strategy to generate more NEON is inline\nwith GCC's codegen strategy, which is actually even more aggressive in\ngenerating NEON when no predication is required. We could be smarter about the\ndecision making, but this seems to be a first good step in the right direction,\nand we can always revise this later (for example make the target hook more\ngeneral).","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"35ddc17f36282f24324275e0691fb57e270f113d","after":"b037d0f0e5f6c7ab528fe3ed9d855f0d770c6709","ref":"refs/heads/main","pushedAt":"2024-07-15T08:09:38.000Z","pushType":"push","commitsCount":3561,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[BOLT][docs] Expand Heatmaps.md (#98162)\n\nImprove documentation on heatmaps.\r\nAdd example for X axis labels.","shortMessageHtmlLink":"[BOLT][docs] Expand Heatmaps.md (llvm#98162)"}},{"before":"5a4b1b0cf15f558b40dd1a4a7e7db7e03512e247","after":"c3c1c6a63dfb2c43b59a884769defca12f5e4389","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-07-11T14:38:53.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2 we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost to both for certain\nloops. This improves 7 kernels from TSVC-2 and several production kernels by\nabout 2x, and does not affect SPEC21017 INT and FP. This also adds a new TTI\nhook that can steer the loop vectorizater to preferring fixed width\nvectorization, which can be set per CPU. For now, this is only enabled for the\nNeoverse V2.\n\nThere are 3 reasons why preferring NEON might be better in the case the\ncost-model is a tie and the SVE vector size is the same as NEON (128-bit):\narchitectural reasons, micro-architecture reasons, and SVE codegen reasons. The\nlatter will be improved over time, so the more important reasons are the former\ntwo. I.e., (micro) architecture reason is the use of LPD/STP instructions which\nare not available in SVE2 and it avoids predication.\n\nFor what it is worth: this codegen strategy to generate more NEON is inline\nwith GCC's codegen strategy, which is actually even more aggressive in\ngenerating NEON when no predication is required. We could be smarter about the\ndecision making, but this seems to be a first good step in the right direction,\nand we can always revise this later (for example make the target hook more\ngeneral).","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"6efcff18dfc42038bafa67091e990b9c1b839a71","after":"5a4b1b0cf15f558b40dd1a4a7e7db7e03512e247","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-07-09T13:51:45.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2 we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost to both for certain\nloops. This improves 7 kernels from TSVC-2 and several production kernels by\nabout 2x, and does not affect SPEC21017 INT and FP. This also adds a new TTI\nhook that can steer the loop vectorizater to preferring fixed width\nvectorization, which can be set per CPU. For now, this is only enabled for the\nNeoverse V2.\n\nThere are 3 reasons why preferring NEON might be better in the case the\ncost-model is a tie and the SVE vector size is the same as NEON (128-bit):\narchitectural reasons, micro-architecture reasons, and SVE codegen reasons. The\nlatter will be improved over time, so the more important reasons are the former\ntwo. I.e., (micro) architecture reason is the use of LPD/STP instructions which\nare not available in SVE2 and it avoids predication.\n\nFor what it is worth: this codegen strategy to generate more NEON is inline\nwith GCC's codegen strategy, which is actually even more aggressive in\ngenerating NEON when no predication is required. We could be smarter about the\ndecision making, but this seems to be a first good step in the right direction,\nand we can always revise this later (for example make the target hook more\ngeneral).","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"193457ab3be8ab0ba92b5ae91697b1a51b967cfe","after":"6efcff18dfc42038bafa67091e990b9c1b839a71","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-06-26T09:50:16.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2, we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost for certain loops. This\nimproves 7 kernels from TSVC-2 by about 2x, and does not affect SPEC21017 INT\nand FP. This also adds a new TTI new hook that can steer the loop vectoriser\nto preferring fixed width vectorization, which can be set per CPU. For now,\nthis is only enabled for the Neoverse V2.\n\nThis tends to benefit small kernels, like the ones in TSVC, for a\nnumber of reasons: processing the predicates does not come entirely\nfor free, NEON tends to generate slightly less code which can have a\nbig impact on these small kernels, and then there are second order\neffects that SVE codegen is slightly less optimal in some areas.\n\nThis codegen strategy to generate more NEON is inline with GCC's codegen\nstrategy, which is actually even more aggressive in generating NEON when\nno predication is required. We could be smarter and more aggressive too\nabout generating more NEON (and improve performance), but this seems to\nbe a first good and straight forward step.","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"58b8a8444f7a7c8eb0c85464d7d9d3d335fd4f2c","after":"193457ab3be8ab0ba92b5ae91697b1a51b967cfe","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-06-24T09:58:52.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2, we would like to prefer fixed width over scalable\nvectorisation if the cost-model assigns an equal cost to both for\ncertain loops. This improves 7 kernels from TSVC-2 by about 2x, and does\nnot affect SPEC21017 INT and FP. This also adds a new TTI new hook that\ncan steer the loop vectorizater to preferring fixed width vectorization,\nwhich can be set per CPU. For now, this is only enabled for the Neoverse\nV2.\n\nThis tends to benefit small kernels, like the ones in TSVC, for a\nnumber of reasons: processing the predicates does not come entirely\nfor free, NEON tends to generate slightly less code which can have a\nbig impact on these small kernels, and then there are second order\naffects that SVE codegen is slightly less optimal in some areas.\n\nThis codegen strategy to generate more NEON is inline with GCC's codegen\nstrategy, which is actually even more aggressive in generating NEON when\nno predication is required. We could be smarter and more aggressive too\nabout generating more NEON (and improve performance), but this seems to\nbe a first good and straight forward step.","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}},{"before":"6abec3a340c0b2d90ac5edd0286b42b34371a54e","after":"58b8a8444f7a7c8eb0c85464d7d9d3d335fd4f2c","ref":"refs/heads/lv-prefer-fixed","pushedAt":"2024-06-17T18:22:45.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"sjoerdmeijer","name":"Sjoerd Meijer","path":"/sjoerdmeijer","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/54853335?s=80&v=4"},"commit":{"message":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neoverse V2)\n\nFor the Neoverse V2, prefer fixed width vectorisation If the cost-model\nassigns an equal cost to fixed and scalable vectorisation. This improves\n7 kernels from TSVC-2 by about 2x, and does not affect SPEC21017 INT\nand FP.\n\nThis tends to benefit small kernels, like the ones in TSVC, for a\nnumber of reasons: processing the predicates does not come entirely\nfor free, NEON tends to generate slightly less code which can have a\nbig impact on these small kernels, and then there are second order\naffects that SVE codegen is slightly less optimal in some areas.\n\nThis codegen strategy to generate more NEON is inline with GCC's codegen\nstrategy, which is actually even more aggressive in generating NEON when\nno predication is required. We could be smarter and more aggressive too\nabout generating more NEON (and improve performance), but this seems to\nbe a first good and straight forward step.\n\nThis depends on #95818.","shortMessageHtmlLink":"[LV][AArch64] Prefer Fixed over Scalable if cost-model is equal (Neov…"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOC0zMFQxNDozNjo1My4wMDAwMDBazwAAAASow3y4","startCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wOC0zMFQxNDozNjo1My4wMDAwMDBazwAAAASow3y4","endCursor":"Y3Vyc29yOnYyOpK7MjAyNC0wNi0xN1QxODoyMjo0NS4wMDAwMDBazwAAAARnrMd1"}},"title":"Activity · sjoerdmeijer/llvm-project"}