
Reuse the analysis in SpSV and SpSM #14

Merged (12 commits) on Nov 15, 2023

Conversation

@amontoison (Member Author) commented Nov 9, 2023

François and Michel, I updated the file backsolve.jl so that we can reuse the analysis for multiple triangular solves.
It works well in all cases as long as we don't update the sparse triangular matrices and only update the right-hand side(s).
If the sparse triangular matrices are also updated, which is the case here because we overwrite the matrix that contains the factors of the LU decomposition, we need to update the SpSV and SpSM descriptors.
NVIDIA provides a routine to do that, but only for the SpSV descriptor...
I will send an email to Yuki about it now.

Note that we now must allocate two buffers instead of one: one dedicated to the forward sweep and another to the backward sweep.
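For illustration, a minimal sketch of the two analysis calls, one per sweep, following the low-level call pattern used in backsolve.jl. The names transa, alpha, descL, descU, descX, algo, spsv_L, spsv_U, buffer_L, and buffer_U are assumptions standing in for the fields set up in the file, not the exact variables of this PR:

# Sketch only: each sweep gets its own analysis call and its own buffer, so both
# descriptors can be reused for subsequent triangular solves.
CUSPARSE.cusparseSpSV_analysis(
    CUSPARSE.handle(), transa, Ref{T}(alpha), descL, descX, descX, T, algo, spsv_L, buffer_L,
)
CUSPARSE.cusparseSpSV_analysis(
    CUSPARSE.handle(), transa, Ref{T}(alpha), descU, descX, descX, T, algo, spsv_U, buffer_U,
)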

@amontoison (Member Author):

JuliaGPU/CUDA.jl#2159

@amontoison amontoison changed the title from "Spsv spsm" to "Reuse the analysis in SpSV and SpSM" on Nov 9, 2023
@frapac (Member) commented Nov 9, 2023

Thank you for taking care of this.
The modifications look good to me. Are the tests passing locally on your side? I am facing the same issue as the one reported by the CI on my local computer: apparently, the results returned by RFLU are wrong if we update the values of the factorized matrix A internally. I guess this is expected, according to your comment.

A workaround might be to call the analysis phase again right after calling the function rf_refactor!. Do you think that would work?

@frapac (Member) commented Nov 9, 2023

I also did a quick benchmark on the instances we are interested in:

using LinearAlgebra
using CUDA
using CUDA.CUSPARSE

using MatrixMarket
using CUSOLVERRF

Gx = mmread("Gx_case9241pegase.mtx")
Gu = mmread("Gu_case9241pegase.mtx")
nx = size(Gx, 1)

Gd = CuSparseMatrixCSR(Gx)

nrhs = 256
X = zeros(nx, nrhs)
X .= Gu[:, 1:nrhs]

Xd = CuMatrix(X)
Yd = CUDA.zeros(Float64, nx, nrhs)

rf = CUSOLVERRF.RFLU(Gd; symbolic=:KLU, nrhs=nrhs)
CUDA.@time ldiv!(Yd, rf, Xd)
CUDA.@time ldiv!(Yd, rf', Xd)

Using CUDA runtime 11.8, we get:

  0.011515 seconds (121 CPU allocations: 3.078 KiB) (2 GPU allocations: 2.109 KiB, 0.18% memmgmt time)                                                                                                                                         
  0.013541 seconds (122 CPU allocations: 3.359 KiB) (2 GPU allocations: 2.109 KiB, 0.18% memmgmt time)   

Using CUDA runtime 12.3 with this PR, we have:

  0.034388 seconds (112 CPU allocations: 2.828 KiB) (2 GPU allocations: 2.109 KiB, 0.04% memmgmt time)
  0.036004 seconds (113 CPU allocations: 3.125 KiB) (2 GPU allocations: 2.109 KiB, 0.06% memmgmt time)

so we should expect a factor of 3 difference in performance with the new CUDA API.

@amontoison (Member Author):

If we call the analysis phase again after the function rf_refactor!, it will work.
Do you suggest that we should just update SpSV for now?

@amontoison (Member Author) commented Nov 9, 2023

> I also did a quick benchmark on the instances we are interested in: [...] so we should expect a factor of 3 difference in performance with the new CUDA API.

Did you permute the versions 11.8 and 12.3, François, or is it really 3 times slower with the new API?

@frapac (Member) commented Nov 9, 2023

> If we call the analysis phase again after the function rf_refactor!, it will work.
> Do you suggest that we should just update SpSV for now?

I would suggest doing that until NVIDIA provides a function to update the SpSM descriptor. Would you be OK with doing that?

> Did you permute the versions 11.8 and 12.3, François, or is it really 3 times slower with the new API?

I just permuted the versions 11.8 and 12.3; it has nothing to do with this PR. If I detail the timings, I observe that the new function cusparseSpSM_solve is indeed 3x slower than the previous one.

@amontoison (Member Author) commented Nov 9, 2023

> > If we call the analysis phase again after the function rf_refactor!, it will work.
> > Do you suggest that we should just update SpSV for now?
>
> I would suggest doing that until NVIDIA provides a function to update the SpSM descriptor. Would you be OK with doing that?

Yes, I'm on the train to Argonne right now, but I can do that after our meeting this afternoon.

> > Did you permute the versions 11.8 and 12.3, François, or is it really 3 times slower with the new API?
>
> I just permuted the versions 11.8 and 12.3; it has nothing to do with this PR. If I detail the timings, I observe that the new function cusparseSpSM_solve is indeed 3x slower than the previous one.

I highly suspect that this factor of 3 comes from the in-place version of the new API.
We can do more operations in parallel with an out-of-place version, and the new API was probably designed for that.

I suppose they added an unoptimized in-place version so as not to break code that was using the old API.
That would explain why it was not added before CUDA 12.0.

@frapac (Member) commented Nov 9, 2023

Sure, I don't think we are in a rush to solve this issue.

> I suppose they added an unoptimized in-place version so as not to break code that was using the old API. That would explain why it was not added before CUDA 12.0.

Interesting, that would make sense indeed. That would advocate for developing an out-of-place triangular solve in CUSOLVERRF.

@amontoison (Member Author):

@frapac
I checked tonight, and we get the same performance whether the operation is in-place or not with the new API.
I forgot that we allocate a buffer, so in-place or not makes no difference in that case: we can do the same number of operations in parallel.
I also tested on a bigger problem, and we don't see the factor of three that you observed.

using LinearAlgebra
using CUDA
using CUDA.CUSPARSE

using MatrixMarket
using CUSOLVERRF

Gx = mmread("Gx_case_ACTIVSg70k.mtx")
Gu = mmread("Gu_case_ACTIVSg70k.mtx")
nx = size(Gx, 1)

Gd = CuSparseMatrixCSR(Gx)

nrhs = 256
X = zeros(nx, nrhs)
X .= Gu[:, 1:nrhs]

Xd = CuMatrix(X)
Yd = CUDA.zeros(Float64, nx, nrhs)
CUDA.@profile ldiv!(Xd, UpperTriangular(Gd), Yd)

CUDA v11.8:

Profiler ran for 3.52 ms, capturing 130 events.

Host-side activity: calling CUDA APIs took 1.53 ms (43.34% of the trace)
┌──────────┬───────────┬───────┬───────────┬───────────┬───────────┬─────────────────────────┐
│ Time (%) │      Time │ Calls │  Avg time │  Min time │  Max time │ Name                    │
├──────────┼───────────┼───────┼───────────┼───────────┼───────────┼─────────────────────────┤
│   22.68% │ 798.94 µs │    11 │  72.63 µs │   6.68 µs │  418.9 µs │ cudaMemcpyAsync         │
│   13.12% │ 462.29 µs │     1 │ 462.29 µs │ 462.29 µs │ 462.29 µs │ cuMemcpyDtoDAsync       │
│    2.47% │  87.02 µs │    12 │   7.25 µs │   4.53 µs │  16.21 µs │ cudaLaunchKernel        │
│    1.94% │  68.43 µs │    11 │   6.22 µs │    3.1 µs │  24.08 µs │ cudaMemsetAsync         │
│    0.91% │  32.19 µs │     1 │  32.19 µs │  32.19 µs │  32.19 µs │ cuMemAllocFromPoolAsync │
│    0.54% │  19.07 µs │     8 │   2.38 µs │   1.91 µs │   5.48 µs │ cudaStreamSynchronize   │
│    0.44% │   15.5 µs │     3 │   5.17 µs │   1.91 µs │   7.63 µs │ cudaMallocAsync         │
│    0.30% │  10.73 µs │     1 │  10.73 µs │  10.73 µs │  10.73 µs │ cuMemFreeAsync          │
│    0.27% │   9.54 µs │     3 │   3.18 µs │   2.15 µs │   5.25 µs │ cudaFree                │
│    0.20% │   6.91 µs │     1 │   6.91 µs │   6.91 µs │   6.91 µs │ cudaDeviceSynchronize   │
│    0.17% │   5.96 µs │    16 │ 372.53 ns │    0.0 ns │   2.62 µs │ cudaGetLastError        │
│    0.01% │ 476.84 ns │     2 │ 238.42 ns │    0.0 ns │ 476.84 ns │ cuDeviceGet             │
└──────────┴───────────┴───────┴───────────┴───────────┴───────────┴─────────────────────────┘

Device-side activity: GPU was busy for 2.82 ms (80.08% of the trace)
┌──────────┬───────────┬───────┬───────────┬───────────┬───────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Time (%) │      Time │ Calls │  Avg time │  Min time │  Max time │ Name                                                                                                               │
├──────────┼───────────┼───────┼───────────┼───────────┼───────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│   36.41% │   1.28 ms │     1 │   1.28 ms │   1.28 ms │   1.28 ms │ _Z46csrsm2_solve_upper_nontranspose_byLevel_kernelIdLi7EEviiiPKT_PKiS4_PS0_iijPiS6_S2_S0_iS6_S6_i                  │
│   19.10% │ 672.82 µs │     2 │ 336.41 µs │ 333.55 µs │ 339.27 µs │ _Z45cusparse_transpose_readWrite_alignment_kernelIdLi1ELb0ELi6ELi4ELi4EEv23cusparseTransposeParamsIT_EPKS1_PS1_S4_ │
│    9.89% │ 348.57 µs │     2 │ 174.28 µs │   3.81 µs │ 344.75 µs │ [copy device to device memory]                                                                                     │
│    8.59% │ 302.55 µs │     6 │  50.43 µs │  41.72 µs │  54.84 µs │ _Z32stable_sort_by_key_domino_phase1ILi256ELi4EEviiiPiS0_S0_S0_S0_S0_                                              │
│    4.80% │ 169.28 µs │     1 │ 169.28 µs │ 169.28 µs │ 169.28 µs │ _Z41csrsm2_analysis_upper_nontranspose_kernelILi5ELi3EEviPKiS1_PiiS2_S2_S2_i                                       │
│    0.63% │  22.17 µs │    11 │   2.02 µs │   1.43 µs │   4.05 µs │ [set device memory]                                                                                                │
│    0.36% │  12.64 µs │     8 │   1.58 µs │   1.43 µs │   1.67 µs │ [copy device to pageable memory]                                                                                   │
│    0.11% │   3.81 µs │     1 │   3.81 µs │   3.81 µs │   3.81 µs │ _Z10set_identyILi128EEviPi                                                                                         │
│    0.11% │   3.81 µs │     1 │   3.81 µs │   3.81 µs │   3.81 µs │ _Z28stable_sort_by_key_stop_coreILi256ELi4EEviPiS0_                                                                │
│    0.08% │   2.86 µs │     2 │   1.43 µs │   1.43 µs │   1.43 µs │ [copy pageable to device memory]                                                                                   │
└──────────┴───────────┴───────┴───────────┴───────────┴───────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

CUDA v12.3:

Profiler ran for 3.39 ms, capturing 125 events.

Host-side activity: calling CUDA APIs took 715.26 µs (21.08% of the trace)
┌──────────┬───────────┬───────┬───────────┬───────────┬───────────┬────────────────────────────────────────────────────────┐
│ Time (%) │      Time │ Calls │  Avg time │  Min time │  Max time │ Name                                                   │
├──────────┼───────────┼───────┼───────────┼───────────┼───────────┼────────────────────────────────────────────────────────┤
│   10.70% │ 363.11 µs │     4 │  90.78 µs │   4.05 µs │ 347.61 µs │ cudaMemsetAsync                                        │
│    5.54% │ 187.87 µs │     1 │ 187.87 µs │ 187.87 µs │ 187.87 µs │ cudaMemcpyAsync                                        │
│    2.64% │  89.65 µs │    12 │   7.47 µs │   4.53 µs │  16.93 µs │ cudaLaunchKernel                                       │
│    1.03% │  34.81 µs │     1 │  34.81 µs │  34.81 µs │  34.81 µs │ cuMemAllocFromPoolAsync                                │
│    0.27% │    9.3 µs │     1 │    9.3 µs │    9.3 µs │    9.3 µs │ cudaLaunchCooperativeKernel                            │
│    0.22% │   7.63 µs │     1 │   7.63 µs │   7.63 µs │   7.63 µs │ cuMemFreeAsync                                         │
│    0.16% │   5.48 µs │    48 │ 114.24 ns │    0.0 ns │ 476.84 ns │ cudaGetLastError                                       │
│    0.12% │   4.05 µs │     2 │   2.03 µs │ 476.84 ns │   3.58 µs │ cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags │
│    0.11% │   3.81 µs │     8 │ 476.84 ns │ 238.42 ns │   2.15 µs │ cudaGetDevice                                          │
│    0.08% │   2.86 µs │     2 │   1.43 µs │ 238.42 ns │   2.62 µs │ cudaDeviceGetAttribute                                 │
│    0.06% │   1.91 µs │     8 │ 238.42 ns │    0.0 ns │ 476.84 ns │ cudaPeekAtLastError                                    │
└──────────┴───────────┴───────┴───────────┴───────────┴───────────┴────────────────────────────────────────────────────────┘

Device-side activity: GPU was busy for 2.83 ms (83.33% of the trace)
┌──────────┬───────────┬───────┬───────────┬───────────┬───────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ Time (%) │      Time │ Calls │  Avg time │  Min time │  Max time │ Name                                                                                                                                  ⋯
├──────────┼───────────┼───────┼───────────┼───────────┼───────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│   54.52% │   1.85 ms │     1 │   1.85 ms │   1.85 ms │   1.85 ms │ _ZN8cusparse14spsm_v1_kernelIidLi8ELi1ELi4ELi16ELi32EEEvT_S1_llliPT0_PKS2_PKS1_S7_S5_S5_PjS8_S8_NS_12KernelCoeffsIS2_EEijj            ⋯
│   20.81% │  706.2 µs │     2 │  353.1 µs │ 333.55 µs │ 372.65 µs │ _Z45cusparse_transpose_readWrite_alignment_kernelIdiLi1ELb0ELi6ELi4ELi4EEv23cusparseTransposeParamsIT_EPKS1_PS1_S4_                   ⋯
│    5.19% │ 175.95 µs │     1 │ 175.95 µs │ 175.95 µs │ 175.95 µs │ _ZN8cusparse18find_colors_kernelIiLi256ELi4ELb0EEEvT_PKS1_S3_PS1_S4_PjS5_S5_S5_S5_S5_i18cusparseFillMode_t                            ⋯
│    1.10% │  37.19 µs │     1 │  37.19 µs │  37.19 µs │  37.19 µs │ _ZN8cusparse21create_perm_nt_kernelIidLi128ELi16EEEvT_PKS1_S3_PKT0_PjPS1_S8_PS4_S9_i18cusparseFillMode_t18cusparseDiagType_tb         ⋯
│    0.47% │  15.97 µs │     1 │  15.97 µs │  15.97 µs │  15.97 µs │ _ZN8cusparse28do_triangular_pattern_kernelIiLi128ELi4EEEvT_PKS1_S3_PS1_S4_i18cusparseFillMode_t                                       ⋯
│    0.41% │  13.83 µs │     1 │  13.83 µs │  13.83 µs │  13.83 µs │ _ZN8cusparse21number_of_deps_kernelIiLi128ELb0ELi4EEEvT_PKS1_S3_PjPS1_i18cusparseFillMode_t                                           ⋯
│    0.29% │   9.78 µs │     2 │   4.89 µs │   4.77 µs │   5.01 µs │ _ZN3cub53CUB_200101_500_520_600_610_700_750_800_860_890_900_NS16DeviceScanKernelINS0_16DeviceScanPolicyIiE9Policy600EPKiPiNS0_13ScanT ⋯
│    0.20% │   6.91 µs │     4 │   1.73 µs │ 953.67 ns │   2.38 µs │ [set device memory]                                                                                                                   ⋯
│    0.11% │   3.81 µs │     1 │   3.81 µs │   3.81 µs │   3.81 µs │ _ZN8cusparse23gather_row_color_kernelIiLi128EEEvT_PjS2_PS1_                                                                           ⋯
│    0.10% │   3.34 µs │     2 │   1.67 µs │   1.67 µs │   1.67 µs │ _ZN3cub53CUB_200101_500_520_600_610_700_750_800_860_890_900_NS20DeviceScanInitKernelINS0_13ScanTileStateIiLb1EEEEEvT_i                ⋯
│    0.08% │   2.62 µs │     1 │   2.62 µs │   2.62 µs │   2.62 µs │ _ZN8cusparse16streamcpy_kernelIiLi128EEEvT_PS1_S2_S1_                                                                                 ⋯
│    0.06% │   1.91 µs │     1 │   1.91 µs │   1.91 µs │   1.91 µs │ [copy device to pageable memory]                                                                                                      ⋯
└──────────┴───────────┴───────┴───────────┴───────────┴───────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

@frapac (Member) left a comment:


This looks good to me. Before merging this, we need to discuss where to update the analysis for CuSparseSM. I am not sure if it is better to do that in src/backsolve.jl (as implemented right now) or directly in src/interface.jl when calling the function lu! (see comment below).

CUSPARSE.cusparseSpSV_analysis(
    CUSPARSE.handle(), transa, Ref{T}(alpha), descL, descX, descX, T, algo, spsv_L, buffer_L,
)
CUSPARSE.cusparseSpSV_analysis(
frapac (Member):

I am not sure I understand why we need two buffers. Are buffer_L and buffer_U used implicitly in cusparseSpSV_solve?

amontoison (Member Author):

Yes, they are used implicitly by cusparseSpSV_solve: cusparseSpSV_analysis stores the buffer pointer in the SpSV descriptor.
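For example, a subsequent solve can then be issued without passing the buffer again; the sketch below reuses the assumed names from the analysis snippet earlier in this thread (the buffer does not appear in the call because cusparseSpSV_solve retrieves the pointer recorded in the SpSV descriptor during the analysis):

# Sketch only: the buffer registered by cusparseSpSV_analysis is used implicitly here.
CUSPARSE.cusparseSpSV_solve(
    CUSPARSE.handle(), transa, Ref{T}(alpha), descL, descX, descX, T, algo, spsv_L,
)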

src/backsolve.jl Outdated
chktrans(transa)

# CUDA 12 is required for inplace computation in cusparseSpSM
@assert CUDA.runtime_version() ≥ v"12.0"
frapac (Member):

Maybe update this to v"12.3", since that version is required later?
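In code, the suggested change would simply bump the version check, keeping the existing assertion style (the code comment is adjusted accordingly and is only a sketch):

# CUDA 12.3 is required for the functionality used later in this file
@assert CUDA.runtime_version() ≥ v"12.3"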

amontoison (Member Author):

👍

src/backsolve.jl Outdated
[(s.descU, s.infoU), (s.descL, s.infoL)]
end

operations = s.transa == 'N' ? [(s.descL, s.infoL), (s.descU, s.infoU)] : [(s.descU, s.infoU), (s.descL, s.infoL)]
frapac (Member):

I guess the interface remains the same as before for CuSparseSM. How about using cusparseSpSM_solve in that function, and updating the analysis separately in the RFLU interface?
https://github.com/exanauts/CUSOLVERRF.jl/blob/master/src/interface.jl#L85

amontoison (Member Author) commented Nov 11, 2023:

Do you call SpSM multiple times without updating the rf structure?
In that case we can reuse the buffer, but otherwise we gain nothing by moving the computation of the SpSM analysis.

frapac (Member):

Yes, we call SpSM multiple times inside a for loop (on large instances we cannot allocate a buffer large enough).
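As an illustration of that usage pattern, here is a hedged sketch (the names B, Y, Xblk, and Yblk are hypothetical; it assumes B and Y are CuMatrix{Float64} of size (nx, m) with m a multiple of nrhs, matching the block width the RFLU object was created with):

# Sketch only: solve for the right-hand sides block by block, reusing the same
# factorization object `rf` (and hence the same analysis and buffers) at each iteration.
Xblk = CUDA.zeros(Float64, nx, nrhs)   # work buffers with the width rf expects
Yblk = CUDA.zeros(Float64, nx, nrhs)
for k in 1:div(size(B, 2), nrhs)
    cols = (k - 1) * nrhs + 1 : k * nrhs
    Xblk .= view(B, :, cols)           # load the current block of right-hand sides
    ldiv!(Yblk, rf, Xblk)              # triangular solves reuse the stored analysis
    view(Y, :, cols) .= Yblk           # store the corresponding solutions
end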

@amontoison (Member Author) commented Nov 14, 2023

I got an answer from NVIDIA:

> Yes. cusparseSpSV_updateMatrix was added to support the HPCG benchmark. We plan to support the "updateMatrix" routine for SpSM in future releases (probably CUDA 12.4.x).
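For reference, a hedged sketch of how the existing SpSV routine is used (this assumes CUDA.jl exposes the low-level wrapper; spsv_L and newvals_L are placeholder names for an SpSV descriptor and the updated nonzero values of the corresponding factor):

# Sketch only: refresh the values recorded in an SpSV descriptor after the matrix
# coefficients change, without redoing the analysis phase.
CUSPARSE.cusparseSpSV_updateMatrix(
    CUSPARSE.handle(), spsv_L, newvals_L, CUSPARSE.CUSPARSE_SPSV_UPDATE_GENERAL,
)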

@frapac (Member) commented Nov 14, 2023

This PR looks good to merge once we have addressed updating SpSM after the matrix has been refactorized following a change in the coefficients.

@amontoison (Member Author):

It should be good now, François.

@frapac (Member) left a comment:


Thank you!

@frapac frapac merged commit d406df3 into exanauts:master Nov 15, 2023
2 checks passed
@amontoison amontoison deleted the spsv_spsm branch November 30, 2023 09:14
@amontoison (Member Author):

@frapac

> Hi Alexis, we are glad to let you know that our CUSPARSE engineering team has implemented this request for cusparseSpSM_updateMatrix. I've verified it initially on our test framework. This will target the CUDA 12.4 release.
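Assuming the new routine mirrors its SpSV counterpart, a hedged guess at the announced CUDA 12.4 API (spsm_U and newvals_U are placeholder names for an SpSM descriptor and the updated nonzero values of the corresponding factor):

# Sketch only: an SpSM analogue of cusparseSpSV_updateMatrix, as announced above.
CUSPARSE.cusparseSpSM_updateMatrix(
    CUSPARSE.handle(), spsm_U, newvals_U, CUSPARSE.CUSPARSE_SPSM_UPDATE_GENERAL,
)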


Successfully merging this pull request may close these issues.

Performance regression from CUDA 11.8 to CUDA 12.1
2 participants