Quantifying the benefits of DT_AARCH64_AUTH_RELR #252

Open
MaskRay opened this issue Mar 24, 2024 · 5 comments

@MaskRay
Contributor

MaskRay commented Mar 24, 2024

Recently, I have proposed a compact relocation format CREL at https://groups.google.com/g/generic-abi/c/yb0rjw56ORw/m/eiBcYxSfAQAJ (previously named RELLEB).
(https://github.com/MaskRay/llvm-project/tree/demo-crel adds CREL support to clang/lld/LLVM binary utilities. clang -fuse-ld=lld -mcrel a.c)

CREL shines for GLOB_DAT/JUMP_SLOT and absolute relocations, often requiring a mere 2 bytes per relocation (an offset adjustment and a symbol index update).
While it might not be as effective for relative relocations compared to RELR, I'm interested in a quantitative evaluation of DT_AARCH64_AUTH_RELR (a RELR variant).
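
For readers less familiar with RELR, here is a minimal Python sketch of how a loader expands RELR entries into relocated addresses, which may help explain why the format is so compact for relative relocations. It follows the standard ELF64 RELR encoding (an even entry is an address, an odd entry is a 63-bit bitmap covering the following words). The function name `decode_relr` is made up for illustration and is not taken from any rtld.

```python
def decode_relr(entries, word_size=8):
    """Expand RELR entries into the list of addresses they relocate."""
    addresses = []
    where = 0  # first word that the next bitmap entry describes
    bits_per_entry = word_size * 8 - 1  # 63 usable bits for ELF64
    for entry in entries:
        if entry & 1 == 0:
            # Even entry: an explicit address. Bitmaps continue right after it.
            addresses.append(entry)
            where = entry + word_size
        else:
            # Odd entry: after dropping the low tag bit, bit i set means
            # the word at where + i * word_size needs a relative relocation.
            bitmap = entry >> 1
            i = 0
            while bitmap:
                if bitmap & 1:
                    addresses.append(where + i * word_size)
                bitmap >>= 1
                i += 1
            where += bits_per_entry * word_size
    return addresses


# One address entry plus one bitmap entry (16 bytes) describe three relocations.
print([hex(a) for a in decode_relr([0x1000, 0b111])])
```

With RELA the same three relocations would take 3 * 24 = 72 bytes, which is the kind of gap the section sizes in this issue reflect.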

DT_AARCH64_AUTH_RELR requires the addend to fit in 32 bits, and linker support requires moving relocations whose addends are out of range to .rela.dyn, which complicates the design.

Are there figures on how many regular relative relocations are transformed into R_AARCH64_AUTH_RELATIVE under PAuthABI?
(Open-source rtld/libc implementations might favor a smaller set of supported relocation formats. CREL is tackling a very difficult problem, but it has the potential to phase out REL/RELA, decreasing complexity for future architectures.)


I am using a Release build of clang-16 as an example. RELR is much better than CREL in compacting relative relocations.

# file size without RELR/CREL: 175310712
% fld.lld @response.txt -o - | fllvm-readelf -S - | grep -E ' \.c?rel.?\.'
  [ 8] .rela.dyn         RELA            00000000005df318 5df318 c3a980 18   A  3   0  8
  [ 9] .rela.plt         RELA            0000000001219c98 1219c98 001f38 18  AI  3  26  8
% fld.lld @response.txt -z pack-relative-relocs -o - | fllvm-readelf -S - | grep -E ' \.c?rel.?\.'
  [ 8] .rela.dyn         RELA            00000000005df340 5df340 011088 18   A  3   0  8
  [ 9] .relr.dyn         RELR            00000000005f03c8 5f03c8 0259d0 08   A  0   0  8
  [10] .rela.plt         RELA            0000000000615d98 615d98 001f38 18  AI  3  27  8
% fld.lld @response.txt -z crel -o - | fllvm-readelf -S - | grep -E ' \.c?rel.?\.'  # similar to REL, addend_bit is 0
  [ 8] .crel.dyn         CREL            00000000005df318 5df318 082b86 00   A  3   0  8
  [ 9] .rel.plt          REL             0000000000661ea0 661ea0 0014d0 10  AI  3  26  8
% fld.lld @response.txt -z crel -z rela -o - | fllvm-readelf -S - | grep -E ' \.c?rel.?\.'  # similar to RELA, addend_bit is 1, in-relocation addends
  [ 8] .crel.dyn         CREL            00000000005df318 5df318 1c0be3 00   A  3   0  8
  [ 9] .rela.plt         RELA            000000000079ff00 79ff00 001f38 18  AI  3  26  8
% fld.lld @response.txt -z pack-relative-relocs -z crel -o - | fllvm-readelf -S - | grep -E ' \.c?rel.?\.'
  [ 8] .crel.dyn         CREL            00000000005df340 5df340 000fbc 00   A  3   0  8
  [ 9] .relr.dyn         RELR            00000000005e0300 5e0300 0259d0 08   A  0   0  8
  [10] .crel.plt         CREL            0000000000605cd0 605cd0 0002a0 00  AI  3  27  8
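
The hex size column above is easier to compare after converting it to bytes. The following Python sketch does that for each link configuration. The dictionary keys are just labels chosen here, and the final ratio rests on an assumption: that .crel.dyn in the `-z pack-relative-relocs -z crel` link holds only the non-relative relocations, so subtracting it from the plain `-z crel` size approximates what CREL spends on relative relocations.

```python
# Size column (hex) copied from the llvm-readelf -S output above.
SECTION_SIZES = {
    "default (RELA)": {".rela.dyn": "c3a980", ".rela.plt": "001f38"},
    "-z pack-relative-relocs": {
        ".rela.dyn": "011088", ".relr.dyn": "0259d0", ".rela.plt": "001f38"},
    "-z crel": {".crel.dyn": "082b86", ".rel.plt": "0014d0"},
    "-z crel -z rela": {".crel.dyn": "1c0be3", ".rela.plt": "001f38"},
    "-z pack-relative-relocs -z crel": {
        ".crel.dyn": "000fbc", ".relr.dyn": "0259d0", ".crel.plt": "0002a0"},
}


def section_bytes(config, section):
    """Size of one section in bytes for the given link configuration."""
    return int(SECTION_SIZES[config][section], 16)


def total_bytes(config):
    """Combined size of all dynamic relocation sections for a configuration."""
    return sum(int(size, 16) for size in SECTION_SIZES[config].values())


for config in SECTION_SIZES:
    print(f"{config:34} {total_bytes(config):>10} bytes")

# Assumption: the difference between the two .crel.dyn sizes is the cost of
# encoding relative relocations in CREL instead of RELR.
crel_relative = (section_bytes("-z crel", ".crel.dyn")
                 - section_bytes("-z pack-relative-relocs -z crel", ".crel.dyn"))
relr = section_bytes("-z pack-relative-relocs", ".relr.dyn")
ratio = crel_relative / relr
print(f"CREL vs RELR for relative relocations: about {ratio:.2f}x")
```

Under that assumption the ratio comes out a little under 3.5x, in line with the 3.5x figure used later in this thread.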

In a release build of Clang 16, using -z crel resulted in a .crel.dyn section that is only 1.0% of the file size. Notably, enabling implicit addends with -z crel -z rel reduced this further to just 0.3%. While DT_AARCH64_AUTH_RELR would achieve a noticeably smaller relocation size if most relative relocations were encoded with it, the advantage seems less significant given how compact CREL already is.

% ~/projects/bloaty/out/release/bloaty clang.crel  # CREL with implicit addends, not recommended
    FILE SIZE        VM SIZE
 --------------  --------------
  46.7%  77.5Mi  54.7%  77.5Mi    .text
  22.7%  37.7Mi  26.6%  37.7Mi    .rodata
  12.1%  20.1Mi   0.0%       0    .strtab
   5.5%  9.20Mi   6.5%  9.20Mi    .data.rel.ro
   5.2%  8.68Mi   6.1%  8.68Mi    .eh_frame
   2.8%  4.73Mi   0.0%       0    .symtab
   2.5%  4.09Mi   2.9%  4.09Mi    .dynstr
   0.8%  1.29Mi   0.9%  1.29Mi    .dynsym
   0.8%  1.28Mi   0.9%  1.28Mi    .eh_frame_hdr
   0.3%   523Ki   0.4%   523Ki    .crel.dyn
   0.0%       0   0.3%   492Ki    .bss
   0.2%   400Ki   0.3%   400Ki    .gnu.hash
   0.2%   334Ki   0.2%   334Ki    .data
   0.1%   109Ki   0.1%   109Ki    .gnu.version
   0.0%  5.22Ki   0.0%  5.22Ki    .plt
   0.0%  4.55Ki   0.0%  4.53Ki    .init_array
   0.0%  3.15Ki   0.0%  2.66Ki    [15 Others]
   0.0%  2.62Ki   0.0%  2.63Ki    .got.plt
   0.0%       0   0.0%  2.62Ki    .relro_padding
   0.0%  2.55Ki   0.0%  2.55Ki    .got
   0.0%  2.00Ki   0.0%       0    [ELF Section Headers]
 100.0%   165Mi 100.0%   141Mi    TOTAL

A CREL implementation with explicit addends unnecessarily encodes the addend of each relative relocation in the relocation itself. -z crel -z rel switches to implicit addends:

0000000008cd53f8  0000014300000006 R_X86_64_GLOB_DAT      0000000000000000 _ZTTSt14basic_ifstreamIcSt11char_traitsIcEE@GLIBCXX_3.4 + 0
0000000008cd5400  0000014400000006 R_X86_64_GLOB_DAT      0000000000000000 _ZTVSt13basic_filebufIcSt11char_traitsIcEE@GLIBCXX_3.4 + 0
0000000008cd5440  0000015200000006 R_X86_64_GLOB_DAT      0000000000000000 _ZTVSt12future_error@GLIBCXX_3.4.14 + 0
00000000083a0000  0000000000000008 R_X86_64_RELATIVE                 0
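
To make the difference between explicit and implicit addends concrete, here is a small Python sketch of how a loader could apply a relative relocation in each style. It is only an illustration on a little-endian 64-bit image held in a `bytearray`. The function names and the sample load bias are invented for this example.

```python
import struct

MASK64 = (1 << 64) - 1


def apply_relative_explicit(image, load_bias, offset, addend):
    """RELA style: the addend is carried in the relocation record."""
    struct.pack_into("<Q", image, offset, (load_bias + addend) & MASK64)


def apply_relative_implicit(image, load_bias, offset):
    """REL style: the addend is read from the location being relocated."""
    (addend,) = struct.unpack_from("<Q", image, offset)
    struct.pack_into("<Q", image, offset, (load_bias + addend) & MASK64)


load_bias = 0x7F0000000000  # hypothetical load address chosen by the loader

# Explicit: the place starts out as zero and the record supplies 0x2000.
explicit_image = bytearray(16)
apply_relative_explicit(explicit_image, load_bias, 8, 0x2000)

# Implicit: the linker already stored 0x2000 in the place, so the record
# only needs to identify the offset.
implicit_image = bytearray(16)
struct.pack_into("<Q", implicit_image, 8, 0x2000)
apply_relative_implicit(implicit_image, load_bias, 8)

print(explicit_image == implicit_image)
```

Both paths leave the same value in memory. The implicit form simply moves the addend out of the relocation section and into the data being relocated, which is why the relocation record shrinks.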

My blog post describes implicit addends in more detail at https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf#:~:text=DT_AARCH64_AUTH_RELR

@pcc

pcc commented Mar 25, 2024

Are there figures on how many regular relative relocations are transformed into R_AARCH64_AUTH_RELATIVE under PAuthABI?

The 32-bit limit was chosen on the basis that in the vast majority of cases (small code model, position-independent code) the addend is going to be within the 0..2^32 range. So I would say that in a typical userspace, pretty much every relocation is going to use this.
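
The range check a linker has to perform can be sketched as follows. This is an illustrative Python sketch, not lld's code: `split_auth_relative` is a made-up name, relocations are modelled as plain `(offset, addend)` tuples, and the accepted range is taken to be 0 <= addend < 2^32 as stated above (a real implementation may define the bound differently).

```python
ADDEND_LIMIT = 1 << 32  # assumed bound, taken from the 0..2^32 range above


def split_auth_relative(relocs, limit=ADDEND_LIMIT):
    """Partition (offset, addend) pairs by whether the addend fits.

    Returns (packed, fallback): `packed` can use the compact AUTH RELR
    encoding, `fallback` has to be emitted as ordinary RELA relocations.
    """
    packed, fallback = [], []
    for offset, addend in relocs:
        if 0 <= addend < limit:
            packed.append((offset, addend))
        else:
            fallback.append((offset, addend))
    return packed, fallback


relocs = [(0x1000, 0x2000), (0x1008, 0xFFFFFFFF), (0x1010, 1 << 32), (0x1018, -8)]
packed, fallback = split_auth_relative(relocs)
print(len(packed), "packed,", len(fallback), "moved to RELA")
```

The fallback list is what makes the linker side more involved: the sizes of both output sections depend on the result of this check.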

@MaskRay
Contributor Author

MaskRay commented Mar 25, 2024

I agree that in the majority of cases linkers don't need to move a relocation from .relr.auth.dyn to .rela.dyn.
The question is whether we need this RELR variant at all.

If we use CREL instead, supporting AUTH relative relocations would take very few lines in https://github.com/MaskRay/llvm-project/blob/demo-crel/lld/ELF/SyntheticSections.cpp around CrelSection<uint>::updateAllocSize, and the code would be centralized in one place. While size(.crel.dyn)/size(.relr.dyn) may be 3.5x, size(.crel.dyn)/size(.o) is just 0.3%.

@pcc

pcc commented Mar 26, 2024

Got it, I misread your earlier message and thought you were asking about .rela.dyn -> .relr.auth.dyn rather than R_AARCH64_ABS64 -> R_AARCH64_AUTH_ABS64. Let me see if I can collect those numbers for AOSP.

@pcc

pcc commented Mar 30, 2024

I built AOSP with PAuth ABI disabled and enabled for the Cuttlefish (i.e. emulator) arm64 target and ran the following commands from the symbols directory of the build directory, which contains unstripped versions of all built binaries shipped on the device.

$ llvm-readelf -rW (find -type f) | grep 000000000000e20| wc -l
524144
$ llvm-readelf -rW (find -type f) | grep R_AARCH64_RELATIVE| wc -l
725136

So around 40% of R_AARCH64_RELATIVE relocations become R_AARCH64_AUTH_RELATIVE.
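
The percentage can be checked with a few lines of Python. This assumes both counts come from the PAuth-enabled build, so that the number of relative relocations before the conversion is the sum of the two. That reading is an interpretation of the commands above rather than something they state.

```python
# Counts copied from the two llvm-readelf | grep | wc -l commands above.
auth_relative = 524144   # lines matching the 000000000000e20 info pattern
plain_relative = 725136  # lines matching R_AARCH64_RELATIVE

# Assumption: every AUTH relative relocation was a plain relative one before.
share = auth_relative / (auth_relative + plain_relative)
print(f"share converted to R_AARCH64_AUTH_RELATIVE: {share:.1%}")
```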

Also, the total size of all SHT_RELR sections in a tree of binaries built with PAuth ABI disabled is 460824 bytes and with PAuth ABI enabled is 307032 bytes. So we can estimate that the total DT_AARCH64_AUTH_RELR size, if it were implemented, would be around 150KB. Assuming a 3.5x ratio for CREL vs RELR, the same relocations would take about 525KB in CREL, so not having DT_AARCH64_AUTH_RELR would cost about 375KB in the shipped image (i.e. 0.04% of the total image size of 1.1GB). I would expect the actual ratio to be higher than for DT_RELR, though, since each relocation contains more information.
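
The estimate above can be reproduced step by step. In this Python sketch the 3.5x ratio and the 1.1GB image size are the figures from this thread, and the image size is treated as 1.1e9 bytes, which is an assumption about the unit.

```python
relr_pauth_disabled = 460824  # total SHT_RELR bytes, PAuth ABI disabled
relr_pauth_enabled = 307032   # total SHT_RELR bytes, PAuth ABI enabled

# Relocations that left SHT_RELR are the ones AUTH RELR would have packed.
auth_relr_estimate = relr_pauth_disabled - relr_pauth_enabled

crel_to_relr_ratio = 3.5      # ratio assumed in the discussion above
extra_bytes = auth_relr_estimate * crel_to_relr_ratio - auth_relr_estimate

image_bytes = 1.1e9           # assumed meaning of "1.1GB"
print(f"estimated AUTH RELR size: {auth_relr_estimate} bytes")
print(f"extra cost with CREL:     {extra_bytes:.0f} bytes")
print(f"share of image:           {extra_bytes / image_bytes:.3%}")
```

With the unrounded inputs the extra cost lands at roughly 375 KiB, a few hundredths of a percent of the image.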

@smithp35
Contributor

smithp35 commented Apr 2, 2024

Is there any need for any specification updates at the moment? My intent behind DT_AARCH64_AUTH_RELR was to match the existing ELF RELR as closely as possible on the grounds that it would be fairly simple to implement in existing loaders. Its use is not meant to be required by the base standard, although a platform that implements the spec may require it.

I can see that if CREL is standardised and accepted then it would make sense to support that too, although I'm reluctant to take out DT_AARCH64_AUTH_RELR unless all implementors are happy to switch to CREL.
