Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to complete PRs 979 and 980 #984

Merged
merged 16 commits into from
Sep 1, 2024

Conversation

valassi
Copy link
Member

@valassi valassi commented Aug 31, 2024

Hi @oliviermattelaer these are my improvements to complete PRs #979 and #980.

If you agree with this, I can merge the whole lot.

This depends on mg5amcnlo/mg5amcnlo#135. Please review that one first, or together with this.

Can you review please? THanks Andrea

…ll' instead of 'make cleanavx' to clean the build (these are identical)
… up P* cleanup; remove cleanavx which is identical to cleanall (launch_plugin.py has been changed accordingly)
…efile, split clean into cleanSource+clean and move cleanavxs at the end via %(additional_clean)
…k cleanSource to speed up P* cleanup

Also remove cleanavx which is identical to cleanall (launch_plugin.py has been changed accordingly)

Add some comments in Olivier's write_source_makefile
…l commit message later (regenerating the patch changes nothing)
…g changes: I just want to mark that Source/makefile is no longer there)

The only files that still need to be patched are
- 2 in patch.common: Source/genps.inc, SubProcesses/makefile
- 3 in patch.P1: auto_dsig1.f, driver.f, matrix1.f

./CODEGEN/generateAndCompare.sh gg_tt --mad --nopatch
git diff --no-ext-diff -R gg_tt.mad/Source/genps.inc gg_tt.mad/SubProcesses/makefile > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.common
git diff --no-ext-diff -R gg_tt.mad/SubProcesses/P1_gg_ttx/auto_dsig1.f gg_tt.mad/SubProcesses/P1_gg_ttx/driver.f gg_tt.mad/SubProcesses/P1_gg_ttx/matrix1.f > CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/MG5aMC_patches/PROD/patch.P1
git checkout gg_tt.mad
@oliviermattelaer
Copy link
Member

Hi Andrea,

I'm completely confused here. I know that I do not yet fully understand github and this is likely the issue here.
How am I suppose to review this?
What I need is to have a diff that compare your change to my latest change.
If I keep the branch as master this is obviously not what I need.
and If I put my branch as bases, it does not work either (why?)

Is their an issue in the way I created those branches (or in the way you handle those? --I guess the issue is on my side--)

I will try to create such diff manually but if you have a trick or an understanding why the diff in such PR are so bad? it could be usefull

@oliviermattelaer oliviermattelaer changed the base branch from testsuite_only_fixed to master August 31, 2024 19:49
@oliviermattelaer
Copy link
Member

Ok even with the "diff" that I want, looks like you are merging something else at the same time.
It seems that you have the CI fix for mac at the same time but also some change with the helicity, the timer,...
Is this intended? Are you sure that those are required for #979 and #980? (I hope not)
Could you try to create a new branch with only the required changed?
And maybe create another PR for the additional stuff?
(or ping me on some PR that I have to approve that does contain this)

Otherwise in launch_plugin.py, you replace
cleanavx by cleanall
Which means you force a recompilation of the full Source directory in top.
Is this on purpose? You never mention such need. We can do it by security, but this sounds an overkill...

So I would propose to have

  • cleansource
  • cleanavx
  • cleanall

Now i would say that cleanall can be

cleanall: cleansource cleanavx

but I might miss a point here, so maybe they are an advantage to keep your current definition
(even if adding cleanavx target)

@valassi
Copy link
Member Author

valassi commented Sep 1, 2024

Hi @oliviermattelaer I am sorry this is confusing

(This is exactly why I was saying that using submodules is nice but can be confusing).

There are two parts here, what I added on top of mg5amcnlo and what I added on top of madgraph4gpu.

(1) For mg5amcnlo, this is now into your mg5amcnlo/mg5amcnlo#132

(There are still issues to solve there, but hopefully not in my part)

(2) For madgraph4gpu, let me explain.

I think the problem is that your testsuite_only_fixed did not include the latest master. Now I have modified that by merging master into it (well I included it into testsuite_only, and then the latter into testsuite_only_fixed).

So now that you modified this PR to target testsuite_only_fixed you should be able to see the differences.

The commits are

[avalassi@itscrd90 gcc11/usr] /data/avalassi/GPU2023/madgraph4gpuX/epochX/cudacpp> git log --oneline upstream/testsuite_only_fixed..HEAD
93ef4ba17 (HEAD -> gpucpp, origin/gpucpp) [gpucpp] upgrade mg5amcnlo from 185cc4c9c to fa706c696 (latest gpucpp_fixci): fix my bug>
57c72a475 Merge remote-tracking branch 'upstream/testsuite_only_fixed' into gpucpp
807f48173 [gpucpp] regenerate all processes - this completes the add-ons to PR #979 and #980
7e69c82dc [gpucpp] regenerate CODEGEN patch from gg_tt.mad (but actually nothing changes: I just want to mark that Source/makefile>
c79843a22 [gpucpp] in CODEGEN dummy change to patch.common to allow a meaningful commit message later (regenerating the patch chan>
5aa2b4400 [gpucpp] regenerate gg_tt.mad - so that I can regenerate patches from it
326ce7e7d [gpucpp] regenerate gq_ttq.mad - heck that all is ok
6ff187893 [gpucpp] in CODEGEN backport Source/makefile from gq_ttq.mad: add back cleanSource to speed up P* cleanup
44e4f7dd9 [gpucpp] upgrade mg5amcnlo from a3ccfd2bf to 185cc4c9c: in Source makefile, split clean into cleanSource+clean and move >
aad68d552 [gpucpp] in gq_ttq.mad Source/makefile, add back cleanSource to speed up P* cleanup; remove cleanavx which is identical >
6da711506 [gpucpp] in gq_ttq.mad and CODEGEN launch_plugin.py, use 'make cleanall' instead of 'make cleanavx' to clean the build (>
d28f85e69 [gpucpp] regenerate gq_ttq.mad using the current proposal for PR #980
a8ff34265 Merge remote-tracking branch 'upstream/testsuite_only_fixed' into gpucpp

And the diffs are here, excluding regenerated code

git diff upstream/testsuite_only_fixed..HEAD --name-only | egrep -v '(.mad/|.sa/)'
MG5aMC/mg5amcnlo
epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/launch_plugin.py
epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/madgraph/iolibs/template_files/madevent_makefile_source_addon
epochX/cudacpp/CODEGEN/PLUGIN/CUDACPP_SA_OUTPUT/output.py

In practice:

  • In mg5amcnlo, I moved to what is described above, split clean and cleanSource, fix additional_clean
  • In CODEGEN output.py, I moved the clean targets at the end by using additional_clean instead of additional_dependencies
  • In CODEGEN makefile_addon and launch_plugin.py, I removed 'cleanavx' because this was doing the same as 'cleanall', so I only keep cleanall and use that

Otherwise this is essentially identical to your 980... Does this clarify?

If you agree, I would

NB: in mg5amcnlo/mg5amcnlo#132 I think there are still some acceptance tests failing, but these are in your code/tests and I have no idea how to look at them...

@valassi
Copy link
Member Author

valassi commented Sep 1, 2024

Otherwise in launch_plugin.py, you replace
cleanavx by cleanall
Which means you force a recompilation of the full Source directory in top.

Sorry, my mistake again... I did not understand what you meant, now I see it

Before I changed it, the two targets were (this is current master)


cleanavx:
	for i in `ls -d ../SubProcesses/P*`; do cd $$i; make cleanavxs; cd -; done;
cleanall: cleanSource # THIS IS THE ONE
	for i in `ls -d ../SubProcesses/P*`; do cd $$i; make cleanavxs; cd -; done;

I thought they are identical, they are not.

The difference is that calling 'cleanall' ALSO cleans Source, while cleanavx does not. I do not remember why 'cleanavx' got there, I thought I had added it and forgotten it by mistake, but maybe you added it for this reason.

Anyway: the part that is really slow, both in the cleanup and in the build, is the P* subdirectories. Deleting and recreating the objects in Source is very fast compared to that. For extra safety I would actually remove those and recreate them. Or you would prefer to avoid that?

Is this on purpose? You never mention such need. We can do it by security, but this sounds an overkill...

Anyway, to be clear. It was not on purpose in the sense that I made a mistake and I had not noticed.

That said, now that I did notice, I actually think that it is a small overkill and I would keep that.

Rephrasing: are you sure yourself that in Source there are things that do not depend on vector.inc and need no recompilation when vector.inc changes? I see for instance that you added a dependency of libmodel on vector.inc, so I guess it must be recompiled?

(By the way, I noticed a strange error that I will report with 'vector.inc missing for lepton pdf', which is exactly related to vector.inc and libmodel).

@oliviermattelaer oliviermattelaer changed the base branch from master to testsuite_only_fixed September 1, 2024 09:34
@valassi
Copy link
Member Author

valassi commented Sep 1, 2024

Hi @oliviermattelaer as discussed offline, I merge this into your branch for #980.

Then I will merge #979 and #980 into master.

Thanks Andrea

@valassi valassi merged commit c153482 into madgraph5:testsuite_only_fixed Sep 1, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Sep 1, 2024
…, madgraph5#980, madgraph5#984 patches for the new CI and Source/makefile) into june24

Fix conflicts:
- MG5aMC/mg5amcnlo (keep the current june24 version 4ef15cab1 i.e. current valassi_gpucpp_june24)
- epochX/cudacpp/gg_tt.mad/bin/internal/banner.py (keep a debug printout)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants