Add workflow support #23

Merged
merged 34 commits into from
Nov 10, 2023

@koparasy (Member) commented Nov 7, 2023

No description provided.

@koparasy (Member, Author) left a comment

@lpottier I will revert this fix. Although it runs correctly, it doesn't do what we need to support. We need AMS to use the Umpire "default" parent pools (HOST, DEVICE, PINNED, etc.).

koparasy and others added 27 commits November 9, 2023 09:41
* Add wrapper to start flux with proxy
* Add infrastructure of connecting to and receiving from RMQ
* Add initial daemon implementation
…messages to a RabbitMQ queue

Signed-off-by: Loic Pottier <[email protected]>
* Added initial script to bootstrap Flux on CORAL/IBM machine
* Added support for Slurm-based systems, tested with flux-core 0.49 on Ruby/Lassen
* Added script to launch AMS miniapp with Flux
* Reverted script to support older versions of Flux; the Lassen bootstrap only works with Flux <= 0.45 (tested with 0.45)
* Added scripts to add secrets on OC
* Added new scripts to launch the entire AMS workflow
* Upgraded all scripts; they are now fully functional (the main script communicates with the AMS daemon via RMQ)

---------

Signed-off-by: Loic Pottier <[email protected]>
* This commit addresses multiple problems in the broker part of AMS
- we are not sending input/output as encoded string anymore, we send binary blobs
- base64 has been removed
- a bug affecting (very) old libevent versions (<= 2.0.21-stable) has been fixed
- offloading inputs/outputs to the thread managing RMQ is now much faster

* Moved to ResourceManager, created AMSMessage structures, moved to smart pointers.

* Complete re-design of the RabbitMQ backend

* Removed EventBuffer, removed pthread and signals. Big cleanup of the code.

* Added documentation and new AMSMsgHeader class + moved from memcpy to ResourceManager::copy

@lpottier (Collaborator) left a comment

FIXME: RMQ support is broken in that version. A future PR will fix that.

@koparasy koparasy merged commit cd160ba into develop Nov 10, 2023
1 check passed
3 participants