Implement Mutex, Atom, Channel and Exhaustable Channel in stdlib/synch #1909

Open
wants to merge 52 commits into
base: master

glyh
Collaborator

@glyh glyh commented Apr 6, 2024

General Info

This PR implements several synchronization primitives, which are needed to build Channel and Exhaustable Channel.

Dependency relation

PipeMethods is an interface

 classDiagram
    FIFO ..> FileDescriptor
    Semaphore ..> FIFO
    Mutex ..> Semaphore
    Channel ..> Mutex
    Channel ..> FIFO
    Channel ..> PipeMethods
    Atom ..> flock
    ExhaustableChannel ..> Mutex
    ExhaustableChannel ..> FIFO
    ExhaustableChannel ..> PipeMethods
    ExhaustableChannel ..> Atom

    PipeMethods <|-- Netstring
    PipeMethods <|-- BlockedNetstring
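
The Mutex → Semaphore → FIFO chain in the diagram can be illustrated with a rough sketch. This is Python rather than YSH and is not the code in this PR: it assumes an anonymous pipe instead of a named FIFO, and the names `PipeSemaphore` and `PipeMutex` are hypothetical. The idea is that each byte sitting in the pipe is one permit, so a blocking read acts as "acquire" and a write acts as "release".

```python
import os


class PipeSemaphore:
    """Counting semaphore backed by a pipe: each byte in the pipe is one permit.

    Illustrative only; assumes an anonymous pipe, not the named FIFO used in the PR.
    """

    def __init__(self, permits: int) -> None:
        self.read_fd, self.write_fd = os.pipe()
        if permits > 0:
            os.write(self.write_fd, b"x" * permits)

    def acquire(self) -> None:
        # Blocks until at least one byte (permit) is available, then consumes it.
        os.read(self.read_fd, 1)

    def release(self) -> None:
        # Puts one permit back into the pipe.
        os.write(self.write_fd, b"x")

    def close(self) -> None:
        os.close(self.read_fd)
        os.close(self.write_fd)


class PipeMutex:
    """A mutex is a semaphore that holds exactly one permit."""

    def __init__(self) -> None:
        self._sem = PipeSemaphore(1)

    def __enter__(self) -> "PipeMutex":
        self._sem.acquire()
        return self

    def __exit__(self, *exc_info) -> None:
        self._sem.release()

    def close(self) -> None:
        self._sem.close()
```

Because the file descriptors are inherited across `fork`, the same object can coordinate several processes, which is roughly what makes a pipe usable as a building block here.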

Descriptions

  • Atom: A lock that follows flock semantics, with helper functions that make higher-level code easier to write. It is also backed by a file, which can be used for shared-memory-style communication.
  • Channel: A netstring-based channel. Users can send multiple buffers across it without having to handle buffering or delimiters themselves.
  • Exhaustable Channel: Same as Channel but exhaustable, meaning an interface is provided for the user to drain every piece of data in the channel.
    All of the above pass untyped data.
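
To make the "no need to consider buffering or delimiters" point concrete, here is a minimal sketch of netstring framing. It is Python rather than YSH, and the function names are hypothetical, not the PR's API. Each message is sent as `<decimal length>:<payload>,`, so payloads may contain any bytes, including `:` and `,`.

```python
def encode_netstring(payload: bytes) -> bytes:
    """Frame one message as b'<len>:<payload>,'."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstrings(data: bytes) -> list:
    """Split a byte string holding zero or more complete netstrings."""
    messages = []
    pos = 0
    while pos < len(data):
        colon = data.index(b":", pos)      # end of the length prefix
        length = int(data[pos:colon])      # payload size in bytes
        start = colon + 1
        end = start + length
        if data[end:end + 1] != b",":
            raise ValueError("malformed netstring: missing trailing comma")
        messages.append(data[start:end])
        pos = end + 1
    return messages
```

The reader never scans the payload for a delimiter; it reads the length, then exactly that many bytes, which is why multiple buffers can share one stream safely.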

Non-POSIX dependencies

| Dependency | Linux | FreeBSD | OpenBSD |
| --- | --- | --- | --- |
| flock | https://man7.org/linux/man-pages/man2/flock.2.html | https://man.freebsd.org/cgi/man.cgi?query=flock&sektion=2 | https://man.openbsd.org/flock.2 |
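
As a rough illustration of what a flock-backed Atom looks like, the sketch below does a read-modify-write on a file while holding an exclusive lock. It is Python rather than YSH and uses `fcntl.flock`, which wraps the `flock(2)` call linked above; the helper name `atom_swap` is hypothetical and is not the interface in this PR.

```python
import fcntl
import os


def atom_swap(path: str, update) -> str:
    """Atomically replace the file's content with update(old_content).

    Illustrative only: holds an exclusive flock for the whole read-modify-write.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other holder remains
        with os.fdopen(fd, "r+", closefd=False) as f:
            old = f.read()
            new = update(old)
            f.seek(0)
            f.truncate()                # drop the old value before writing
            f.write(new)
        return new
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A counter shared between processes could then be bumped with `atom_swap(path, lambda old: str(int(old or "0") + 1))`. Note that flock locks are advisory: they only protect against other code that also takes the lock.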

TODOs

  • Enforce styles
  • Ensure fds are never closed twice
  • Implement channels (closes "Delimit streams into records -- e.g. read --netstr" #1905)
  • Implement an actual working Atom, backed by flock
  • Implement exhaustable channels
  • Add test cases for Atoms
  • Add test cases for exhaustable channels
    • Fix broken exhaustable channels
  • Refactor: we may split this into 3 modules. At a bare minimum, descriptor, pipe and atom should be merged.
    • descriptor (for file descriptor management)
    • pipe (for communication over pipes; the netstring implementation can live there, along with a chunked pipe as an alternative to netstring)
    • sync (the original module)
  • Implement a blocked netstring pipe

Future Improvement

  • We may make flock an internal command, for the following reasons:
    • flock is slow
    • Bundling flock will improve our portability
  • Extensive tests on Atom
  • Some hacks are used when reading fdinfo because of "[FR] expose number base conversion to ysh" #1921

@glyh glyh marked this pull request as ready for review April 6, 2024 13:21
@glyh glyh marked this pull request as draft April 6, 2024 21:27
@glyh glyh changed the title Fix some minor issues in stdlib/synch. Some improvements on std/synch Apr 6, 2024
@glyh glyh self-assigned this Apr 6, 2024
@glyh glyh changed the title Some improvements on std/synch Improvements on stdlib/synch Apr 6, 2024
@glyh glyh marked this pull request as ready for review April 6, 2024 22:22
@glyh glyh requested a review from andychu April 6, 2024 22:22
@glyh glyh marked this pull request as draft April 6, 2024 22:23
@glyh glyh marked this pull request as ready for review April 6, 2024 23:29
@glyh glyh marked this pull request as draft April 7, 2024 00:12
@glyh glyh marked this pull request as ready for review April 7, 2024 02:01
@glyh glyh marked this pull request as draft April 7, 2024 02:24
@glyh glyh marked this pull request as ready for review April 7, 2024 21:07
@glyh
Collaborator Author

glyh commented Apr 7, 2024

Ping @andychu

@bar-g
Contributor

bar-g commented Apr 8, 2024

Hm, "synch" (as in the last pull request) sounded so unfamiliar to me that I didn't know what it referred to until I looked into it. If it had read "sync", it would have been clear to me.

Overall, maybe name it, e.g. draft-proc_sync.ysh?

@glyh
Collaborator Author

glyh commented Apr 8, 2024

The name is borrowed from Berkeley's Pintos' synch.h where a bunch of synchronization primitives are implemented.

https://inst.eecs.berkeley.edu/~cs162/su20/static/projects/proj0-intro.pdf

I'm open to a more conventional name, but sync looks like it relates to rsync. Naming is hard in shell.

@bar-g
Contributor

bar-g commented Apr 8, 2024

That's why I thought of proc_sync.ysh or, even easier to understand, process_sync.ysh

@glyh
Collaborator Author

glyh commented Apr 8, 2024

Makes sense. Also waiting for comments from andy.

@glyh glyh added the stdlib label Apr 8, 2024
@andychu
Contributor

andychu commented Apr 9, 2024

Thanks for doing this! I skimmed over the code, I think this is cool

I like that you are testing the language -- in particular I'm interested in netstrings

And the int(s, n) feedback is good, we should have that

This is exactly what we need to flesh out the language


Though I still question if it needs to be in the stdlib, i.e. what applications can be written with this. Part of me suspects this is more like Go's domain -- and Go is going to do it better

i.e. a shell can call Go programs, and JSON RPC programs, and HTTP programs like curl

It's a little more of the "control plane"

whereas this is kind of a "Data plane" solution in pure shell

(I even have a blog post draft about this - #blog-ideas > The Worst Amount of Shell is 0% or 100% -- i.e. I don't necessarily believe in the "pure bash bible", which shows you how to do everything in shell itself)


This is still marked "draft", but I wonder if it should be in contrib/ or something -- to show people what can be done with YSH

Or we can wait on that decision until there are some apps built with it (again I suggest the minimal things for xargs and make)


Our philosophy is a bit more like Rust than Go/Python. The stdlib should be minimal and other code should be outside. Whereas Go/Python build a lot of stuff in the stdlib.

@andychu
Contributor

andychu commented Apr 9, 2024

e.g. maybe to make it concrete -- would you use this in your YSH dotfiles ? I think you don't really need it for parallelism, and can use a higher level solution

but maybe I'm wrong

(I still need to take a closer look at how that works)


Also do you have a lobste.rs account? I can send you an invite if not

It would be cool to post some of this there, and on Reddit, to show people what can be done with the language

@glyh
Collaborator Author

glyh commented Apr 9, 2024

Rust has std::sync https://doc.rust-lang.org/std/sync/

}

# NOTE: I would love to optimize this a bit more, for example a netstring of size n
# now takes log_10(n) overhead. We can certainly do better with a byte encoding
Contributor


I think this is a non-issue. log base 10 is fine in practice :)

1MB message takes ~7 bytes

Netstrings are fast and efficient :)
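
To put numbers on that, here is a small Python calculation (the helper name is hypothetical). The framing cost is the number of decimal digits in the length plus two bytes for the `:` and `,` delimiters, so a 1,000,000-byte payload needs 7 digits and 9 bytes in total.

```python
def netstring_overhead(payload_len: int) -> int:
    """Bytes added by netstring framing: decimal length digits plus ':' and ','."""
    return len(str(payload_len)) + 2


for size in (1, 1_000, 1_000_000, 1_000_000_000):
    # Overhead grows by one byte per factor of ten in payload size.
    print(size, netstring_overhead(size))
```

Even at 1 GB the framing adds only 12 bytes, so a binary length encoding would save a handful of bytes per message at most.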

@glyh
Collaborator Author

glyh commented Apr 9, 2024

I don't use lobsters, what is it?

@glyh
Collaborator Author

glyh commented Apr 9, 2024

As for control vs. data: I would say sometimes you have to touch the data to do control. E.g., we may use bash for a simple router that routes data requests to different applications based on the first few bytes of the input.

For higher-level parallelism, we may take inspiration from Clojure, but none of that can be implemented without proper lambdas. Also, even Clojure provides some low-level primitives like atom. I think you may be assuming my solutions here are too low-level just because of the naming? They're actually pretty high-level. Some low-level primitives are needed to implement the higher-level ones, and yes, we don't recommend using them in application-level code.

And I think the line between shell and binary applications should be drawn by the end user rather than the library designer.

@glyh glyh changed the title Implement Mutex, RWLock, Channel and Exhaustable Channel in stdlib/synch Implement Mutex, Atom, Channel and Exhaustable Channel in stdlib/synch Apr 9, 2024

3 participants