
Asynchronous IO API #9111

Open · wants to merge 313 commits into base: development
Conversation

Member

@RealyUniqueName RealyUniqueName commented Jan 29, 2020

Lower-level async API. Take 1.
I tried to stay close to the C API.

Please review thoroughly, especially the socket-related API, because I don't have much experience with sockets.
Also, your thoughts on the TODOs are much appreciated.

While I tried to define the behavior in doc blocks, I expect things to change during the implementation journey for the sake of cross-platform consistency.

filesystem
net (TODO)
processes (TODO)

@RealyUniqueName RealyUniqueName added this to the Release 4.1 milestone Jan 29, 2020
std/asyncio/CErrNo.hx (outdated; review thread resolved)
std/asyncio/IoErrorType.hx (outdated; review thread resolved)
std/asyncio/IDuplex.hx (outdated; review thread resolved)
//Callbacks are expected to be one-time; this cleanup is meant to help
//spot multiple calls
var fn = this;
this = null;
Member

Does this line do anything? Surely other references/aliases to this function don't get set to null.

Member Author

Oh, right. That's a pity. Any ideas how to guarantee that?

Member

I don't think we can without wrapping the function in a closure or otherwise introducing state.

Member Author

Removed that for now.
What about wrapping the function in an additional closure, but only in -debug mode?
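A sketch of that debug-only guard idea (everything below is hypothetical and not part of the PR; `CallbackHandler` is simplified to `String` errors to stay self-contained): the stateful closure only exists under `-debug`, so release builds pay nothing.

```haxe
// Hypothetical sketch: a one-shot guard for callbacks, active only in -debug.
typedef CallbackHandler<T> = (error:Null<String>, result:T) -> Void;

abstract Callback<T>(CallbackHandler<T>) {
	public function new(fn:CallbackHandler<T>) {
		#if debug
		// Wrap in a stateful closure so a second invocation is caught early.
		var called = false;
		this = (error, result) -> {
			if (called)
				throw 'Callback invoked more than once';
			called = true;
			fn(error, result);
		};
		#else
		this = fn;
		#end
	}

	// Invoke the wrapped handler.
	public inline function invoke(error:Null<String>, result:T):Void
		this(error, result);
}
```

Because the guard is introduced per-wrapper, every alias of the wrapped callback shares the same `called` flag, which addresses the aliasing problem raised above.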

This is an attempt to design a cross-platform API for big byte buffers (more than 2GB)
without any unnecessary allocations.
**/
class BigBuffer {
Member

I guess this is not used in the APIs? Also endianness should be configurable.

Member Author

Yes, an intermediary Bytes instance has to be used to pass bytes between APIs and BigBuffer.
Added an endian property.

import haxe.Callback;
import haxe.errors.NotImplemented;

class Socket implements IDuplex {
Member

@nadako made a good point – TCP and IPC sockets should be separate, let this be a base class for TcpSocket and IpcSocket.

Member Author

I'm not sure about that anymore. They share the API and can be distinguished by connection addresses. What would be the benefit of separating them like that?

Member

You could probably simplify the constructors and/or things like connect.

std/asyncio/net/Server.hx (outdated; review thread resolved)
std/asyncio/net/SecureSocket.hx (outdated; review thread resolved)
@Aurel300
Member

The docs are a bit wonky in places, but this can be cleaned up later.

@@ -0,0 +1,62 @@
package haxe;

typedef CallbackHandler<T> = (error:Null<Error>, result:T) -> Void;
Member

Technically, result can also be null. As mentioned before, it would be nice if null safety could somehow deal with that.

Member Author

A nullable result would generate unnecessary allocations for basic types.

Member Author

As for null safety: yes, I want to make it aware of Callback<T> semantics. That's why the Callback API does not allow providing both an error and a result at the same time.
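The mutually exclusive error/result contract described above can be sketched like this (a simplified model: `String` stands in for the PR's error type, and the `success`/`fail` names are assumptions based on the discussion):

```haxe
// Simplified sketch of the error-XOR-result contract: a callback is completed
// through exactly one of `success` or `fail`, never with both values set.
typedef CallbackHandler<T> = (error:Null<String>, result:T) -> Void;

abstract Callback<T>(CallbackHandler<T>) from CallbackHandler<T> {
	// Complete with a result; error is always null on this path.
	public inline function success(result:T):Void
		this(null, result);

	// Complete with an error; result carries no meaningful value on this path.
	public inline function fail(error:String):Void
		this(error, cast null);
}
```

With this shape, null safety could in principle learn that `result` is meaningful exactly when `error` is null, which is the semantics the comment above refers to.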


/**
An interface to read bytes from a source of bytes.
**/
Member

Perhaps something like a (strictly advisory) prepare(size:Int) method would make sense?

Member Author

What do you mean? from Int to Int?

Member

@RealyUniqueName I assume that comment was a reply to something else?

Member Author

Perhaps something like a (strictly advisory) prepare(size:Int) method would make sense?

I don't know. Sounds like an implementation-specific concern.

@Aurel300 Aurel300 changed the title Asyncronous IO API Asynchronous IO API Jan 29, 2020
Error description.
**/
public function toString():String {
return '$message at ${pos()}';
Member

How about using Haxe's normal format (i.e. ${pos()}: $message)? It will work out of the box in VSCode \o/

Member Author

I'm not sure it's a good approach for run-time exceptions.
The VSCode argument only works for console apps or exceptions printed to the console.
In general you want more visibility for the error message than for the error position.
AFAIR all languages/runtimes provide the exception message first, then the position/stack info.

@ncannasse
Member

I don't much like asyncio as a toplevel package. I thought we would be using an asys toplevel package that would somewhat mimic the sys one, but with async variants and an upgraded API.

If you need a package specific to low-level classes, it should be something like asys.nativeio.

@skial skial mentioned this pull request Jan 29, 2020

The underlying function type is declared in `haxe.CallbackHandler`.
**/
abstract Callback<T>(CallbackHandler<T>) from CallbackHandler<T> {
Member

That seems quite specific to the package implementation; I'm not sure I would make it part of the haxe package. Also, you might want a notImplemented callback instead of doing .fail(new NotImplemented()) for every function missing an implementation ;)

Member Author

I expect it to be useful for hxnodejs and maybe other things once we implement coroutines.

@RealyUniqueName
Member Author

RealyUniqueName commented Jan 30, 2020

@ncannasse I'll move everything to asys.native once the API is settled.

@porfirioribeiro
Contributor

It's amazing to see Haxe implementing an async API!!

I just wonder: what would be the technical implications of using async/await terminology instead of callbacks?

@Aurel300
Member

Aurel300 commented Feb 8, 2020

I just wonder: what would be the technical implications of using async/await terminology instead of callbacks?

Haxe will be getting both asys (an API similar to this PR) and coroutines in the future. Coroutines are a big undertaking because they have to be integrated properly into the language. It is not just a case of adding async and await as keywords (which would also be a breaking change).

And, once we do have asys (with callback-style API) and coroutines, the two should be easy to join. It may be that our coroutines will be designed in a way that callback-style API is directly compatible with an asynchronous call.

@nadako nadako modified the milestones: Release 4.1, Release 4.2 Apr 5, 2020
@nadako
Member

nadako commented Apr 5, 2020

This goes to 4.2, according to our plans :)

@ntrf

ntrf commented May 10, 2020

Sorry for the intervention, but I'm worried you're all missing some important points. This API does not seem to account for multithreaded environments in any way. It's a straight copy of Node.js, which does not have that problem by design. Unfortunately, most of the "sys" platforms support threading. Therefore I have some questions:

  • What mechanism will be responsible for invoking callbacks in a long-lived application? Typically there has to be a message loop of some sort to perform this task.
  • If there is a message loop, how does one start and stop it on a background thread, so that async API calls return on that background thread?
  • Will there be a mechanism allowing callbacks to be processed periodically on a background thread, without waiting, so that the background thread can continue to perform an iterative task?
  • If all callbacks will be invoked strictly on a single "main" thread, can that thread be assigned manually? In some applications the startup thread is not suitable for such work. For example, it may be the only thread capable of performing 3D rendering reliably, so background threads are more suitable for loading, parsing, and properly structuring external data.

The EntryPoint/MainLoop classes seem to be a solution only for the "one thread to call them all" case. As far as I can tell, these classes are not designed for use with callbacks on background threads. It'd be nice to have an equivalent of SynchronizationContext from C# or Looper from the Android API.

@farteryhr

Just to mention that Deno 1.0 has been released:
https://deno.land/v1
The "design mistakes" in Node as stated by the author, and how the current version addresses them, might be worth checking.

@nadako nadako mentioned this pull request Jun 5, 2020
@RealyUniqueName
Member Author

Sorry for the intervention, but I'm worried you're all missing some important points. This API does not seem to account for multithreaded environments in any way. <...>

The idea is that callbacks get executed in the same thread they were created in. That's supposed to be the common denominator of all of our (very different) targets.
What happens behind the API is target implementation details, which can be specified later should the need arise.

@Aidan63
Contributor

Aidan63 commented Feb 28, 2024

Yes, I thought about that and saw that Apple had unfortunately moved away from the SChannel-style, transport-independent API they used to have (although I've seen one or two posts implying the network framer API might be able to do something similar). I would be inclined to say that Haxe Apple users have survived this long with mbedtls, so they could survive a bit longer if it means Windows and Linux users get a more flexible TLS API.
This willfully ignores iOS, which is a big and important OS to many, but given that libuv has no official iOS support, I'm still inclined to just shrug, as I'm not sure what asys on iOS would look like from hxcpp anyway. I'm not sure which Haxe targets people deploying to iOS mainly use (probably mostly hxcpp?), but I have no intention of adding any Apple-specific APIs to my hxcpp asys implementation at the moment.

@Aidan63
Contributor

Aidan63 commented Feb 28, 2024

Another thing I forgot to add was around tests. I've added tests for sockets now, but the current implementation of those tests seems a bit dodgy. I spin up a new thread to set up a listen server using the existing blocking API, with a lock that is released as the last bit of code on that thread, which the main test thread waits for before calling async.done. That thread also needs try/catches to make sure the lock is actually released, to prevent the test thread waiting forever if the server thread throws (oh for something like C#'s IDisposable / using in Haxe).
I needed this because I was running into situations where the next test would start before the previous test's server thread had exited, leading to intermittent failures when it tried to open the same IP and port combo for the server. I haven't noticed any issues with it since, but it feels filthy and is one of those bits of code I have zero confidence in.

I had a quick look around the haxe repo and there appear to be no tests for the existing socket API, so I've got nothing to go on from there. libuv tests its sockets using its own API, i.e. it sets up a libuv server for the libuv socket to connect to. This is probably a more reliable way of doing things, but it still feels wrong to test against ourselves rather than the existing synchronous API.
The other downside is that you very quickly find yourself up to your neck in callbacks with more involved test cases, but coroutines might solve this down the line.

Any thoughts on the best approaches to testing this sort of stuff are much appreciated (this probably extends to the issues of testing IPC, current process signals, current process stdio, etc.).

@Apprentice-Alchemist
Contributor

Yes, I thought about that and saw that Apple had unfortunately moved away from the SChannel-style, transport-independent API they used to have (although I've seen one or two posts implying the network framer API might be able to do something similar). I would be inclined to say that Haxe Apple users have survived this long with mbedtls, so they could survive a bit longer if it means Windows and Linux users get a more flexible TLS API.

The main reason why I'd like to get rid of mbedtls, at least for Windows and macOS, is because it is a pain to deal with from a packaging perspective.

Vendoring it doesn't work because then it's out of date real quick (hxcpp's version was 5 years out of date before I updated it, which reminds me that it probably needs to be updated again).
(And even if we keep it up to date on our side, it means developers using Haxe also need to keep things up to date and make new releases of their apps when there's an important mbedtls update)
Using eg Homebrew doesn't work either because then the generated executables aren't portable.

While a flexible TLS api could be nice (though I'm not sure how useful it would actually be in practice), I would very much prefer not to lock ourselves in to having to use mbedtls or another third-party library on macOS.

(And the framer idea doesn't work, that's for implementing a message based protocol on top of tcp/tls)

This willfully ignores iOS, which is a big and important OS to many, but given that libuv has no official iOS support, I'm still inclined to just shrug, as I'm not sure what asys on iOS would look like from hxcpp anyway. I'm not sure which Haxe targets people deploying to iOS mainly use (probably mostly hxcpp?), but I have no intention of adding any Apple-specific APIs to my hxcpp asys implementation at the moment.

Libuv should work fine on iOS. watchOS blocks BSD sockets though, so libuv's networking bits won't work there. On that platform you need to use high level apis like URLSession or (for specific usecases) Network.framework (https://developer.apple.com/documentation/technotes/tn3135-low-level-networking-on-watchos)


I was thinking a bit about the reading apis.

On Windows, the "standard" for async IO is to use completion ports. The API there ends up pretty much the same as the current asys API: you hand a buffer to the API and get notified when it's been filled.

Java's AsynchronousSocketChannel has a similar API.

On Linux with io_uring you can also do that, but for better memory efficiency it's recommended to use buffer pools, where the kernel selects a buffer from a buffer group by itself.

Apple's Network.framework is also completion-based, but instead of taking a buffer to write into, it hands you back a dispatch_data_t object on completion.

Python asyncio apis also return buffers instead of taking them.

So the question is: might an api that looks like function read(maxLength:Int, cb:(Bytes)->Void); be better?
(maybe even returning an array of buffers, since for example dispatch_data_t objects can contain multiple non-contiguous regions of data)

This would allow implementations to take advantage of eg IO uring buffer pools or use apis like Network.framework or python asyncio without incurring extra copies.
libuv based implementations could potentially benefit from this too.
The disadvantage is that some other implementations then require internal buffer pools.
And there would also need to be an api to return buffers to the pool / free them...
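The buffer-returning shape could be sketched as an interface like this (entirely hypothetical names, not part of the PR; `String` stands in for the error type):

```haxe
import haxe.io.Bytes;

// Hypothetical sketch of a completion API where the implementation owns the
// buffers, so io_uring buffer pools or dispatch_data_t regions could be
// handed back to the caller without an extra copy.
interface IPooledReadable {
	// Read up to maxLength bytes. The callback may receive several
	// non-contiguous implementation-owned buffers.
	function read(maxLength:Int, callback:(error:Null<String>, buffers:Array<Bytes>) -> Void):Void;

	// Hand a buffer back to the implementation's pool once it is consumed.
	function release(buffer:Bytes):Void;
}
```

The explicit `release` method is the cost of this design: callers become responsible for returning buffers to the pool, which is the "api to return buffers" concern mentioned above.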

@Aidan63
Contributor

Aidan63 commented Mar 7, 2024

I'm very much aware of how crap vendoring is as a dependency mechanism, but to date there has been no mention of Mac-specific TLS engines on any of the main repos' issue boards, so it doesn't seem like a massive issue to anyone.
So I'd prefer a more flexible API, which is possible with the underlying libraries used by targets today. If someone were to offer a SecureSocket backed by the Mac API, that would probably change things, but it doesn't seem to be happening anytime soon considering no one has been bothered by the lack of it so far.

Changing the read function seems like increasing API complexity for the sake of something that won't happen. My understanding is that targets are going to be using libuv to do the heavy lifting; even if someone were to provide an alternative async IO implementation for a target / OS, it probably makes sense not to accept it. Manpower is forever a problem, and the advantage of unifying around something like libuv is that implementation complexity is lowered and knowledge gained from working on one target's implementation can easily be transferred to another target.

@Apprentice-Alchemist
Contributor

I'm very much aware of how crap vendoring is as a dependency mechanism, but to date there has been no mention of Mac-specific TLS engines on any of the main repos' issue boards, so it doesn't seem like a massive issue to anyone. So I'd prefer a more flexible API, which is possible with the underlying libraries used by targets today. If someone were to offer a SecureSocket backed by the Mac API, that would probably change things, but it doesn't seem to be happening anytime soon considering no one has been bothered by the lack of it so far.

I'm bothered and when I get my hands on a Mac I would very much like to make an asys implementation backed by native Apple apis.

And using the native apis has more advantages than just TLS. You also get built-in Happy Eyeballs support, compatibility with VPN On Demand, better handling of connectivity issues, and better performance thanks to an improved underlying network stack.

Also, do you have a concrete example where TLS on top of not-TCP would be useful?
Securing IPC connections with TLS seems pointless, and I can't really think of network protocols other than TCP where TLS makes sense as a security layer.

I think this usecase is better addressed by external libraries, possibly in combination with mbedtls being exposed on platforms where the standard library uses it.

Changing the read function seems like increasing api complexity for the sake of something that won't happen.

I could argue that adding a flexible TLS abstraction seems like increasing api complexity for the sake of some use case that might not even exist :)

My understanding is that targets are going to be using libuv to do the heavy lifting; even if someone were to provide an alternative async IO implementation for a target / OS, it probably makes sense not to accept it.

Java and Python are not going to use libuv (I hope). For those two we should most definitely rely on their built-in async io capabilities, pulling in a native dependency would create needless complications.

Manpower is forever a problem, and the advantage of unifying around something like libuv is that implementation complexity is lowered and knowledge gained from working on one target's implementation can easily be transferred to another target.

libuv's far from perfect and we should not assume that we'll be using it forever.

That said, we could leave the api as is for now and add a second reading api at some point in the future that offers a fast path for eg IO uring, and apps that care about performance could then select which api to use.

@Aidan63
Contributor

Aidan63 commented Mar 8, 2024

Yep, it could certainly be seen as extra complexity. But considering pretty much all targets will be using TLS implementations which are independent of the transport layer (mbedtls, Node, JVM), it won't make any difference to the implementations whether the Haxe side is locked to a socket or not.
Time will tell whenever decisions start getting made on all this; in the meantime, here are some more notes I forgot about before I start moving on to the system namespace tests.

Access After Close

Are we defining behaviour for what should happen if a user attempts to call a function or read or write a property of a class after close has been called? I hadn't thought of this until the other week, so I'm not sure what the current behaviour is, but it's probably not great and is certainly untested. In C#, objects which implement IDisposable are recommended to throw ObjectDisposedException if users call functions or access properties after dispose has been called. So maybe we should specify that an ObjectClosedException will be thrown when accessing properties, or returned as the error in callback functions.
If we don't specify a behaviour, I can easily imagine this falling into the "target-specific behaviour" category, as some implementations might cache values that remain accessible after closing while others throw exceptions.

(Aidan63/hxcpp_asys#67)
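The callback-error variant could look like the following sketch (the class, the `String` error, and the `read` signature are illustrative only; the PR has not settled on an ObjectClosedException type):

```haxe
import haxe.io.Bytes;

// Illustrative sketch: after close(), every async operation completes its
// callback with a consistent "object closed" error instead of throwing.
class FakeFile {
	var closed = false;

	public function new() {}

	public function close():Void {
		closed = true;
		// ... release the native handle here ...
	}

	public function read(buffer:Bytes, callback:(error:Null<String>, bytesRead:Int) -> Void):Void {
		if (closed) {
			callback('Object already closed', 0);
			return;
		}
		// ... normal async read path ...
	}
}
```

Routing the error through the callback keeps the failure mode uniform across targets, which is the consistency concern raised above.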

uv_stream_write byte count

The IWritable write function callback says the number of bytes written is returned, but uv_stream_write doesn't provide this information. I've just been returning the number of bytes the user provided if the write succeeded; no idea if this is "correct" or not, but it seems to be the best that can be done.

(Aidan63/hxcpp_asys#68)

More read and write functions

I previously mentioned that I think there should be more read and write convenience functions to match the existing Input and Output classes (having to manually deal with buffers has been an exercise in tedium when setting up test cases), so here's a more solid suggestion: create IWritableExtensions and IReadableExtensions classes to hold a series of static extensions.

import haxe.io.BytesOutput;

class IWritableExtensions {
    // Write a single byte by staging it in a BytesOutput, then forwarding
    // the resulting buffer to the underlying writable.
    public static function writeInt8(writable:IWritable, v:Int, callback:Callback<Int>) {
        final output = new BytesOutput();
        output.writeByte(v);

        writable.write(output.getBytes(), 0, output.length, callback);
    }

    // Same staging approach for a 16-bit integer.
    public static function writeInt16(writable:IWritable, v:Int, callback:Callback<Int>) {
        final output = new BytesOutput();
        output.writeInt16(v);

        writable.write(output.getBytes(), 0, output.length, callback);
    }
}

With this static extension class approach I'm trying to solve two problems: allow targets to shadow this class and provide a more optimised implementation if they've got one, and have a single fallback implementation which doesn't need to be copied for each target that isn't shadowing it. This is an attempt to avoid a repeat of the ifdef nightmare in Bytes.hx.
I'm happy to go ahead and create these two classes with a bunch of static extensions and fallback implementations (and will then probably provide an optimised hxcpp version). One thing this doesn't deal with is that file input and output don't implement any of these writable, readable, or duplex interfaces, so they won't benefit, but I've brought that up before.

(Aidan63/hxcpp_asys#2)

Endian-ness

The Input and Output streams provide an endian toggle; if we go with the above extension class style, big and little endian versions could be added if we think it's worthwhile.

(Aidan63/hxcpp_asys#2)

Asys HTTP!

Everyone seems to love whinging about haxe.Http and no one has offered to do much about it (as is usually the case), but it is one of the main things which doesn't have an asys equivalent. I don't have any strong feelings about it either way (not sure I've ever used haxe.Http or really used HTTP clients in other languages), so I probably wouldn't be a good person to try and design an asys HTTP API. But I thought I could throw the hand grenade and let others bicker about solutions.

(Aidan63/hxcpp_asys#71)

@Apprentice-Alchemist
Contributor

Apprentice-Alchemist commented Mar 9, 2024

Yep, it could certainly be seen as extra complexity. But considering pretty much all targets will be using TLS implementations which are independent of the transport layer (mbedtls, Node, JVM), it won't make any difference to the implementations whether the Haxe side is locked to a socket or not. Time will tell whenever decisions start getting made on all this; in the meantime, here are some more notes I forgot about before I start moving on to the system namespace tests.

Compromise: initially implement regular SecureSockets, and put off the discussion of whether to expose a more flexible TLS API to another time.
I would also advocate keeping the initial secure socket API surface minimal (e.g. Certificate objects would be opaque), and we also need to consider how to hook up native OS trust management (the current sys.ssl API, for example, can't work on iOS because you can't get root certificates; if mbedtls is used you need to set a verification callback and use SecTrust objects).

Access After Close

Are we defining behaviour for what should happen if a user attempts to call a function or read or write a property of a class after close has been called? I hadn't thought of this until the other week, so I'm not sure what the current behaviour is, but it's probably not great and is certainly untested. In C#, objects which implement IDisposable are recommended to throw ObjectDisposedException if users call functions or access properties after dispose has been called. So maybe we should specify that an ObjectClosedException will be thrown when accessing properties, or returned as the error in callback functions. If we don't specify a behaviour, I can easily imagine this falling into the "target-specific behaviour" category, as some implementations might cache values that remain accessible after closing while others throw exceptions.

Returning a consistent error seems reasonable (though this should probably be passed via the callback instead of throwing).

uv_stream_write byte count

The IWritable write function callback says the number of bytes written is returned, but uv_stream_write doesn't provide this information. I've just been returning the number of bytes the user provided if the write succeeded; no idea if this is "correct" or not, but it seems to be the best that can be done.

It's probably correct; with a true async API, writing fewer bytes than provided can only happen if the stream is closed, which should give an EOF error.

More read and write functions

I previously mentioned that I think there should be more read and write convenience functions to match the existing Input and Output classes (having to manually deal with buffers has been an exercise in tedium when setting up test cases), so here's a more solid suggestion: create IWritableExtensions and IReadableExtensions classes to hold a series of static extensions.

class IWritableExtensions {
    public static function writeInt8(writable:IWritable, v:Int, callback:Callback<Int>) {
        final output = new BytesOutput();
        output.writeByte(v);

        writable.write(output.getBytes(), 0, output.length, callback);
    }

    public static function writeInt16(writable:IWritable, v:Int, callback:Callback<Int>) {
        final output = new BytesOutput();
        output.writeInt16(v);

        writable.write(output.getBytes(), 0, output.length, callback);
    }
}

With this static extension class approach I'm trying to solve two problems: allow targets to shadow this class and provide a more optimised implementation if they've got one, and have a single fallback implementation which doesn't need to be copied for each target that isn't shadowing it. This is an attempt to avoid a repeat of the ifdef nightmare in Bytes.hx. I'm happy to go ahead and create these two classes with a bunch of static extensions and fallback implementations (and will then probably provide an optimised hxcpp version). One thing this doesn't deal with is that file input and output don't implement any of these writable, readable, or duplex interfaces, so they won't benefit, but I've brought that up before.

It may be better to implement some kind of BufferedReader on top of IReadable, to avoid reading/writing a lot of very small chunks. Same for writing.
I don't think there's that much potential for target-specific optimization.

As for files not implementing IReadable and co, the solution there is probably to expose a separate file stream class?
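A BufReader along those lines might look like the following sketch (the `IReadable` shape is assumed from the discussion, and the class itself is hypothetical; `String` stands in for the error type):

```haxe
import haxe.io.Bytes;

// Assumed shape of the asys readable interface, per the discussion above.
interface IReadable {
	function read(buffer:Bytes, offset:Int, length:Int, callback:(error:Null<String>, bytesRead:Int) -> Void):Void;
}

// Hypothetical buffered wrapper: small reads are served from an internal
// buffer, so the raw stream stays unbuffered for callers that buffer themselves.
class BufReader {
	final source:IReadable;
	final buffer:Bytes;
	var pos = 0;
	var len = 0;

	public function new(source:IReadable, size:Int = 8192) {
		this.source = source;
		this.buffer = Bytes.alloc(size);
	}

	// Read a single byte, refilling from `source` only when the buffer is empty.
	public function readByte(callback:(error:Null<String>, byte:Int) -> Void):Void {
		if (pos < len) {
			callback(null, buffer.get(pos++));
			return;
		}
		source.read(buffer, 0, buffer.length, (error, bytesRead) -> {
			if (error != null) {
				callback(error, 0);
				return;
			}
			if (bytesRead == 0) {
				callback('EOF', 0); // simplistic end-of-stream handling for the sketch
				return;
			}
			pos = 0;
			len = bytesRead;
			callback(null, buffer.get(pos++));
		});
	}
}
```

Keeping the buffering in a wrapper rather than in the stream matches the point above: applications that already buffer can use the raw stream directly with no double-buffering cost.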

Endian-ness

The Input and Output streams provide an endian toggle; if we go with the above extension class style, big and little endian versions could be added if we think it's worthwhile.

Haven't really got an opinion here.

Asys HTTP!

Everyone seems to love whinging about haxe.Http and no one has offered to do much about it (as is usually the case)

People are doing things about it, just not in the standard library.
For example: https://github.com/core-haxe/http

but it is one of the main things which doesn't have an asys equivalent. I don't have any strong feelings about it either way (not sure I've ever used haxe.Http or really used HTTP clients in other languages), so I probably wouldn't be a good person to try and design an asys HTTP API. But I thought I could throw the hand grenade and let others bicker about solutions.

Imo HTTP should be a separate library, which could be pulled into the standard library at some point in the future, though I'm not sure whether we'd even want to have a full HTTP stack in the standard library, given that that would require HTTP/2 and 3 (and thus also QUIC) support.

@Aidan63
Contributor

Aidan63 commented Mar 9, 2024

Passing a closed exception through the callback was my intention; I probably worded it poorly in my initial comment.

My intention with the readable and writable extensions was that implementations could provide those functions to avoid needing to allocate Haxe bytes and copy data into them. E.g. hxcpp's writeString could pin the string and pass the pointer straight to libuv (depending on the encoding). My thinking was also that these extensions would be usable on any custom implementation of IWritable, such as a buffered one, but looking at it now I'm not sure how that would work with static extensions on the interface, so I'm not too sure what I was thinking of yesterday... I'll give it some more thought another time, as I think the principle is sound even if the implementation isn't...

@Apprentice-Alchemist
Contributor

My intention with the readable and writable extensions were that implementations could provide those functions to avoid needing to allocate haxe bytes and copy data into them. E.g. hxcpp writeString could pin the string and pass the pointer straight to libuv (depending on the encoding). My thinking was also that these extensions would be usable on any custom implementation of IWritable such as a buffered one, but looking at it now I'm not sure how that would work with static extensions on the interface, so I'm not too sure what I was thinking of yesterday...

Note that most if not all implementations of IWritable will not do any buffering by themselves.
(Libuv for example doesn't do any buffering, so every uv_write call results in a WSASend/writev call)
I also don't think having buffering directly in the network streams is a good idea as that would be counterproductive for applications that already do buffering themselves.

So I still stand by the idea of putting these utility functions on separate BufReader/BufWriter classes.
(I feel like this would only be ergonomic with coroutines though, otherwise you end up in true callback hell.)
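For illustration, a minimal sketch of what a separate buffering wrapper could look like, assuming the callback-style IWritable shape from this PR (the Callback alias and the interface are stubbed here so the snippet stands alone; none of these names are final):

```haxe
import haxe.io.Bytes;
import haxe.io.BytesBuffer;

// Stub shapes matching the general API in this PR; the real types live in asys.
typedef Callback<T> = (result:T, error:Null<haxe.Exception>) -> Void;

interface IWritable {
	function write(buffer:Bytes, offset:Int, length:Int, callback:Callback<Int>):Void;
}

// A buffering wrapper kept separate from the stream itself: callers that
// already batch their writes simply don't use it.
class BufWriter {
	final out:IWritable;
	final buffer = new BytesBuffer();

	public function new(out:IWritable) {
		this.out = out;
	}

	// Accumulate in memory; no write call is issued yet.
	public function writeString(s:String):Void {
		buffer.addString(s);
	}

	// Hand the whole accumulated buffer to the underlying stream in one call.
	// Note BytesBuffer cannot be reused after getBytes().
	public function flush(callback:Callback<Int>):Void {
		final bytes = buffer.getBytes();
		out.write(bytes, 0, bytes.length, callback);
	}
}
```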

I'll give it some more thought another time as I think the principle is sound, even if the implementation isn't...

@Aidan63
Contributor

Aidan63 commented Mar 31, 2024

Object Lifetimes

Going to take a quick detour to bring up this topic I've had brewing in the back of my mind for a few months at this point and keep putting off.

Take the following innocent looking code snippet where the user opens a file but doesn't explicitly close it; when should the libuv handle and any associated non GC resources be released?

FileSystem.openFile("test.txt", (_, file) -> {
	trace(file.path);
});

The naive approach might be to free stuff in whatever finalisation method the target provides, but there are some annoying details which prevent this. Below is a quick summary of libuv handle management.

  • For every uv_init_* call you must call uv_close to signal to libuv you're done with the handle.
  • It is not safe to delete handles or resources that handle has access to until the uv_close callback is invoked.
  • Like most of libuv, uv_close is not thread safe and must be called from the same thread the libuv loop that was passed into the init call is associated with.
  • Shutting down a libuv loop when a thread exits involves iterating over all remaining handles and closing them before deleting the loop.

It's that third point which makes the finalisation technique a non-starter, as finalisation traditionally makes no guarantees about which thread it will be called from. Furthermore, depending on what handle you're closing, the callbacks of queued work may be called with an error response, which could lead to haxe code being executed from the finaliser, something usually not allowed by targets which provide finalisation.

The annoying thing is that if close isn't called and the haxe object is collected, I don't think there's much we can do beyond leak the libuv handle until it is closed at loop closure / thread exit (which could well be the main thread, so it remains for the lifetime of the program). The catch with this is that it's not just native resources which would be leaked, but potentially GC objects. Many libuv handles contain callback functions for events; in order for these to call into haxe code the haxe callback object needs to be rooted by the GC and would only be unrooted when invoked or when the handle and its resources are deleted. This means you could end up with very large retention graphs depending on what these rooted callbacks capture.

There are maybe some things we could do to mitigate this. In the finaliser of objects which hold libuv handles, add a reference to that handle to some thread-safe structure of orphaned handles and have a libuv task which periodically checks that list and cleans up the handles and resources found within. But these things start to get quite complex quite quickly.

It was the realisation that calling close from the haxe side is very important to ensuring handles are deleted and objects do not remain rooted by the GC that led me to creating the auto close evolution proposal, as I don't think the language provides the tools necessary to ensure close is reliably called in the face of any function or property potentially raising exceptions. Expecting the user to manually create nested try catches and all the null checking that comes with that is entirely unreasonable (not to mention unrealistic based on existing code).

Hopefully I've just been thinking about this for too long and I can no longer see the forest for the trees; maybe there's something simple we can do to clean up objects eagerly in the case when close isn't called.

(Aidan63/hxcpp_asys#19)

@Apprentice-Alchemist
Contributor

Libuv handles should have a reference to the loop they were created with (https://docs.libuv.org/en/v1.x/handle.html#c.uv_handle_t.loop), so in the finalizer you could just queue a cleanup callback on that loop, right?

@Apprentice-Alchemist
Contributor

Some thoughts about the API design that I don't think have been raised yet:

  • Signals. The majority of the PosixSignalCode and some of the asys.native.system.Signal signals are completely meaningless on Windows and will yield errors if used.
    I'm not sure it's a good idea to even include signals and signal handlers in the core api, there's just too much platform specific stuff going on.
    APIs related to handling Ctrl+C and such should be provided as a "proper" console API (though I think that's out of scope for an initial MVP)

  • symlinks: on Windows there is a difference between directory and file symlinks, but the current api does not allow differentiating between those.
    I would propose splitting that function up into hardlink, symlinkFile and symlinkDir.

@Aidan63
Contributor

Aidan63 commented Apr 1, 2024

You could queue up the handle for cleaning from the finalisers, but you'd need to mostly DIY it and then use something like uv_async_t (the only thread-safe libuv api) to signal the loop to check that list and perform the cleanup. The haxe objects would also need to know if the handle they hold has been automatically cleaned by the loop teardown, for situations where the haxe object is finalised after the thread it was created on has exited. This also seems like the sort of solution which is very easy to get wrong, with all sorts of nasty race conditions lying in wait, which is why I'm a bit hesitant to attempt it unless there's no other option.

I've nearly finished the tests for the system package and the signal issues you mention were definitely on my list of things to bring up; removing a lot of the signal stuff is probably the easiest solution. I don't think the process api needs much more than a kill function, and the fact that half the libuv doc page for uv_signal_t is dedicated to all sorts of platform specific behaviour speaks to how difficult it is to have a meaningful cross platform signal api.
It would be nice to keep Ctrl+C interrupts in some form though. I use that on hxcppdbg with my libuv io library which is where this hxcpp implementation started from, but it wouldn't be too much effort to just keep some extra signals api in that project when I update it to asys.

Don't have much to say about the link stuff, apart from there are some OS specific test failures in that area which I need to get around to checking, to see whether that is due to a dodgy implementation on my part or OS specific behaviour.

@skial skial mentioned this pull request Apr 3, 2024
@Aidan63
Contributor

Aidan63 commented Apr 3, 2024

Another thing I've been thinking about these last few days which hasn't come up yet is cancellation.

There are many asynchronous operations which may never complete for one reason or another (reading from sockets / stdio / anything which could produce data, waiting for process exits, etc, etc) and there are many situations where you may want to attempt some operation only up until a certain point, at which point it isn't relevant / wanted any more (e.g. having a timeout on reading user input from stdin). You could probably emulate some of these situations by closing the object whenever your chosen cancellation criteria has been met, but then the entire object is unusable, as opposed to just stopping an operation of your choosing that is currently being performed or waiting to be performed.

Part of the answer to this might also be in the coroutine design; in Kotlin you can request the cancellation of a coroutine through a function on the object representing a launched coroutine. So if haxe coroutines end up having something similar that might be one approach for this.
Another approach could be the cancellation token technique used in dotnet. This approach is quite nice in that it's completely divorced from their async / tasks and can also be used in standard synchronous / traditional threading code. I'm also not sure if the eventual plan is for all the haxe asys functions to be manually changed to suspending functions, or if there is a desire to allow them to be used in their current callback style if the user is mad enough; that choice would also affect what cancellation might look like in the asys api.
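As a point of comparison, a dotnet-style token pair could be sketched in haxe roughly like this (all names here are illustrative; nothing in this snippet is part of the proposed API):

```haxe
// The source side cancels; the token side is handed to operations.
class CancellationSource {
	public final token:CancellationToken;

	public function new() {
		token = new CancellationToken();
	}

	public function cancel():Void {
		@:privateAccess token.trigger();
	}
}

class CancellationToken {
	public var cancelled(default, null):Bool = false;
	final callbacks = new Array<() -> Void>();

	public function new() {}

	// Operations register cleanup to run when cancellation is requested;
	// if already cancelled, the callback fires immediately.
	public function onCancelled(fn:() -> Void):Void {
		if (cancelled) {
			fn();
		} else {
			callbacks.push(fn);
		}
	}

	function trigger():Void {
		if (cancelled) return; // cancel() is idempotent
		cancelled = true;
		for (fn in callbacks) fn();
		callbacks.resize(0);
	}
}
```

An asys operation could then accept an optional token and, when it fires, complete its callback with a cancellation error while leaving the object itself usable.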

@Aidan63
Contributor

Aidan63 commented Apr 6, 2024

I've gone through and created issues in my repo for all the things I've mentioned along with giving them labels and updating my comments in this merge to point to them. I will continue to post everything here as well, I just wanted something easier to track, especially as half of my comments are tucked away and I have to keep clicking "show more comments" to find them.

The filesystem, net, and system labels are pretty self explanatory as they map onto the respective packages. implementation is for issues related to libuv or other external libraries doing a lot of the heavy lifting. design is for the "larger" questions around missing / alternative functionality as opposed to small tweaks to what we already have, and coroutine is given for issues which may want to consider coroutines in any solutions.

I went back and added a bunch of server and socket tests once I figured out a good approach. What I've done is create a bunch of very small haxe "script" files for hosting a server or running a socket using the existing synchronous API. I spawn a haxe process running one of these scripts on the interpreter for the test to interact with. This should mean it's portable across all other targets as well.

Before I start on my system package notes there are a few more network related ones which came about from writing these tests.

Tcp Socket Buffer Size is a Suggestion

On Windows if I set the send and receive buffers of a TCP socket to something like 7000 and then fetch the value it comes out as 7000, but on Linux fetching the value returns 16384. I'm guessing these buffer sizes are more of a suggestion to the OS and it's free to choose a different value. So, should our API recognise this? Might it be better to have these functions return the value which was set, similar to how write functions return how many bytes were actually sent vs what you offered?

(Aidan63/hxcpp_asys#74)
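A sketch of the suggested shape, where the callback reports the size the OS actually applied rather than assuming the request was honoured (method and parameter names are illustrative, not part of the current API):

```haxe
// Hypothetical socket-option surface: request a size, receive the
// effective size back, mirroring how write reports bytes accepted.
interface SocketBufferOptions {
	// `actual` may differ from `size`, e.g. 16384 on Linux when 7000 was requested.
	function setSendBufferSize(size:Int, callback:(actual:Int, error:Null<haxe.Exception>) -> Void):Void;
	function setRecvBufferSize(size:Int, callback:(actual:Int, error:Null<haxe.Exception>) -> Void):Void;
}
```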

Send and Receive Buffer Sizes Unsupported on Windows Pipes

Setting or getting the size of pipe buffers on Windows is not supported by libuv.

(Aidan63/hxcpp_asys#75)

Closing Server with Pending Accepts

If the user has called accept on a server and later closes it before a connection has been made should the accept callback be invoked with an error of some sort?

(Aidan63/hxcpp_asys#76)

System Package

Process PID not a property

The pid is stored as a field, not a property, on the process; this isn't a problem unless we want to go ahead and say that after closing, accessing any functions or properties will result in an error, which we can't do with a field.
Another question is should the PID still be accessible after the process has exited or is it only readable when the process is running?

(Aidan63/hxcpp_asys#77)

Null stdio streams if not piped

If the user specifies that the stdio streams should be, say, ignored, are the stdin, stdout, and stderr properties expected to return null values or throw some sort of exception? I'm guessing throw since they're not null annotated. But the base process class provides all streams as an array with the first items being the stdio streams, so should they be null here?

(Aidan63/hxcpp_asys#78)

Multiple exitCode calls

If the user makes multiple exitCode calls on a process which is still running, does the latest one take precedence or should they be queued up and all get invoked when the process finally exits?

(Aidan63/hxcpp_asys#79)
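The "queue them all" option could be sketched as follows: callbacks registered while the process is still running are collected, and all of them are invoked once the exit arrives (names here are illustrative, not part of the API):

```haxe
// Internal helper a process implementation might keep for exitCode callbacks.
class ExitWaiter {
	var exitCode:Null<Int> = null;
	final pending = new Array<(code:Int) -> Void>();

	public function new() {}

	public function onExit(callback:(code:Int) -> Void):Void {
		if (exitCode != null) {
			callback(exitCode); // already exited: complete immediately
		} else {
			pending.push(callback); // still running: queue it
		}
	}

	// Called once by the implementation when the process exit arrives.
	public function notify(code:Int):Void {
		exitCode = code;
		for (callback in pending) callback(code);
		pending.resize(0);
	}
}
```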

Closing spawned process stdio pipes

If the user pipes the stdio do they need to call close to ensure the resources are correctly cleaned up, or should closing the overall process ensure everything is cleaned up? If closing the process is enough then what purpose does the close function on those stdio pipes really serve? Yes, the user could close the pipe before the process exits, but does that really serve a purpose?

(Aidan63/hxcpp_asys#80)

Process cwd option

This option takes a string when it could probably take the new FilePath abstract instead.

(Aidan63/hxcpp_asys#81)

Process User and group ID on Windows

These libuv options are not supported on Windows but are exposed in the process options structure, which probably just adds more fuel to the fire that the current permissions design doesn't really work. But if we do want to keep it, what should happen if they're supplied on Windows? Silently ignore them? Return some sort of not-supported error?

(Aidan63/hxcpp_asys#82)

Non Existing CWD Causes Confusing Error Message

(nodejs/node#11520)

Much like the user in that issue I started questioning my sanity when I started to receive "file not found" errors when writing tests against a program which definitely existed. Turns out it was a bug in my FilePath implementation which resulted in a non-existing cwd path.
The reasons in that issue for not just checking if the file exists seem reasonable, so we might want to add to the doc comment that this error will be returned if the specified cwd doesn't exist either.

(Aidan63/hxcpp_asys#83)

Process Exit Code Not 64bit

libuv exit codes are represented as uint64_ts but they get truncated to ints on the haxe side.

(Aidan63/hxcpp_asys#84)

ProcessOptions user and group uses Ints not SystemUser and SystemGroup abstract

You can set the uid and gid on a spawned process, but these options are integers, not the SystemUser and SystemGroup abstracts defined and used for file system permissions.

(Aidan63/hxcpp_asys#85)

Should Spawned Processes Keep the Event Loop Alive

If the user spawns a long running process I think it makes sense that the event loop will continue to run (and therefore that thread not exit) until the process has exited. However I don't think this makes sense for the detached case, as the entire point of that option is that the process continues to run after the parent closes.

(Aidan63/hxcpp_asys#86)

Unclear Process Environment Variable Behaviour

Are we copying libuv's env behaviour? That is, if the user provides environment variables it does not inherit the environment of the parent process.

function test() {
    // FOO is assumed to already be set in the parent process environment.
    Sys.putEnv("FOO", "BAR");

    Process.open("program.exe", { env: [ "BAR" => "BAZ" ] }, (proc, error) -> {});
}

In the above example the spawned process has access to BAR but not FOO, it will only have access to FOO if env is null. The doc comment seems to imply this, but it might be worth making clear this is the case if intended.
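If a user wants inheritance plus overrides, they can merge explicitly; a hypothetical helper using the existing Sys API (the helper name is illustrative):

```haxe
class EnvTools {
	// Merge explicit overrides into a snapshot of the parent environment,
	// so the child sees both inherited and overridden variables.
	public static function inheritingEnv(overrides:Map<String, String>):Map<String, String> {
		final env = Sys.environment(); // copy of the parent environment
		for (key => value in overrides) {
			env[key] = value;
		}
		return env;
	}
}
```

Usage would then be `Process.open("program.exe", { env: EnvTools.inheritingEnv(["BAR" => "BAZ"]) }, ...)`, making the libuv-style "replace, don't inherit" default harmless.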

Closing Stdio Pipes

Closing stdio pipes isn't something you really want to do and has inconsistent behaviour in libuv.

libuv/libuv#2962

So, what should we do if the user calls close on one of the stdio streams in Process.current? Right now I just invoke the success callback and do nothing behind the scenes. Given that closing stdio is dangerous, and I'm not sure what the use case for this is on the haxe side either, maybe close shouldn't be exposed here?

(Aidan63/hxcpp_asys#88)

StdioConfig File Pipes

The StdioConfig type allows you to ignore or pipe output from a process, but there are also two extra ones to pipe straight to a file path or an already opened file. Why are files the ones allowed here? What if I want to pipe straight out of a socket, or straight into my own custom IWritable? It seems odd that file is the only special case here.

We could add extra config options for these other cases, but at that point it might be better to provide a more generic pipe mechanism, as currently the user can't, say, automatically send a file through a socket. This would also greatly reduce the complexity of the process implementation.

Alternatively, don't provide any fancy pipe stuff and let the user do it themselves.

(Aidan63/hxcpp_asys#89)

Process.execute Stdio

This function is designed to be a shortcut, but because it accepts the same process options as the full open function you could do things like set stdout to Ignore. In that case would you expect the stdout bytes in the execute callback to be null? You could also configure stdin but then have no way to interact with it.
This seems pretty useless, so I just ignore all stdio pipe options with execute; the process options typedef should probably be split up so the stdio settings are independent of the other config options, which execute can then make use of.

Process.execute("haxe.exe", { stdio: [ PipeWrite, Ignore, Ignore ] }, (error, result) -> {
	trace(result.stdout, result.stderr);
});

(Aidan63/hxcpp_asys#90)

@Aidan63
Contributor

Aidan63 commented Apr 6, 2024

Two more general points now

More close Considerations

Follow up from my comment about the state of objects after closing.

What is the state of objects after the user has issued a close call but before the callback has been invoked?

file.close((_, error) -> trace("closed"));
file.write(myBuffer, 0, myBuffer.length, (count, error) -> trace("???"));

In the above example a write is issued after close is called; should this also error, and with what? Reuse the proposed ObjectClosedException? Introduce an ObjectClosingException?

The next question is what is the user expected to do if they get a close error? Try again? Are we making guarantees that if closing fails the object is still perfectly usable, or is it in some weird unknowable state where maybe it's usable, maybe some of its resources have been cleaned up? From a libuv point of view uv_close will always succeed and cannot return an error (either in its callback or from the call itself). The only "closing" related call which might fail is uv_shutdown, which flushes any pending writes, but examples online seem to go straight to calling uv_close if this returns a failure.

I think the following would be best to remove any confusion.

  • The moment close is called the object is considered "closed" and any further access will result in ObjectClosedException results.
  • Change the close callback signature to not have an error; treat it as the special case it is with just a simple Void->Void callback (not sure how this would affect any coroutine integration plans).
  • Calling close multiple times could be defined as a special case where it's allowed: it doesn't result in any error and will just do nothing if already closed.

(Aidan63/hxcpp_asys#91)
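Those three points could be sketched roughly as follows; a plain haxe.Exception stands in for ObjectClosedException since it isn't defined in this snippet, and the class name is illustrative:

```haxe
// Sketch of the proposed close semantics: the object flips to "closed"
// the moment close() is called, later operations fail fast, and repeated
// close() calls are harmless no-ops.
class GuardedFile {
	var closed = false;

	public function new() {}

	public function write(buffer:haxe.io.Bytes, offset:Int, length:Int,
			callback:(count:Int, error:Null<haxe.Exception>) -> Void):Void {
		if (closed) {
			// Stand-in for the proposed ObjectClosedException.
			callback(0, new haxe.Exception("object is closed"));
			return;
		}
		// ...queue the real write on the event loop here...
	}

	public function close(callback:() -> Void):Void {
		if (closed) {
			callback(); // repeated close: allowed, does nothing
			return;
		}
		closed = true; // considered closed from this moment on
		// A real implementation would uv_close the handle and invoke the
		// callback from its completion; this stub completes immediately.
		callback();
	}
}
```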

Unclosable Streams

The IDuplex interface and friends all have a close function but the majority of the out of the box implementations of these interfaces have questionable use / implementations.

  • Closing file descriptors 0-2 (stdio) is bad for security reasons and libuv has inconsistent behaviour on different systems with this. Due to this my implementation always returns success when closing the current process stdio and doesn't call uv_close. The bigger question is why would the user actually want to close the stdout of the current process?
  • Closing the pipes of a spawned process does work, but would the user really want to do it? Assuming calling close on the ChildProcess object will automatically clean up those pipes what use cases are there for closing child process pipes early?
  • The only place close has a real purpose is sockets and servers where they close connections and stop the server respectively.
  • File operations make no use of IDuplex or related interfaces.

Considering 2 / 3 of the close implementations are questionable at best, it seems like this design is wrong. Obviously there is a use for closable streams and user defined ones might have need of them, but it does seem like we need something to represent non-closable streams as well.

(Aidan63/hxcpp_asys#92)

@Aidan63
Contributor

Aidan63 commented Apr 6, 2024

Here's an "executive summary" of the hxcpp asys implementation and test suite as it stands, since I've now implemented most of the API and have tests in most areas.

My hxcpp asys branch is here (https://github.com/Aidan63/hxcpp/tree/asys) and the haxe side is here (https://github.com/Aidan63/hxcpp_asys). The haxe side only requires a 4.3.x build and shadows the Thread implementation to drive the libuv loop. You can easily play around with this today by just adding a class path to the src folder in that repo and using my hxcpp branch. No custom haxe build required.

~95% of the API is implemented. It's possible I've missed some things, but these are the things I've explicitly left unimplemented.

  • UDP Sockets
  • SecureServer
  • SecureSocket on non Windows platforms

I have made some changes from that API in this merge which should be mostly uncontroversial.

  • There was a TODO on the Callback typedefs to combine them all together once default parameters were implemented; since they now are, I went ahead and made that change. I also swapped around the order of result and error in the callback: they are now (result, error), making the error type the rightmost parameter of the callback, which also matches the WIP coroutine branch's callback argument order.
  • I made the fields on the socket option typedef optional.
  • I added a verifyCertificate option to the SecureSocket option typedef to allow me to implement this type.

Test suite uses utest2 and should be easy to run on targets other than hxcpp. There is one set of filesystem tests which peek at the underlying BytesData implementation for hxcpp and the process tests need to be updated to work with the haxe spawning technique I used with socket tests.

IPC has zero tests right now as I'm not sure what to test against (there's no synchronous equivalent). Secure sockets also have no tests; need to figure out what to do with certificates and stuff (it would be nice to have tests which don't all disable certificate verification and go out to some external URL).

Socket keep alive stuff is untested, it's hard to do this without making the test suite take an age to run if we can't specify the keep alive time.

Many more error case tests could be added to the process api but I've left these for now as I've mentioned quite a few things around process api so will wait until those are cleared up.

Everything related to permissions and user / group ids is untested.

There are some areas (sockets and process io pipes) where I wanted to add more complex tests around reading and writing lots of data, but the callback style drains my will to test for these more complex ones. Migrating this suite over to coroutines once available will massively increase the maintainability of it all.

Does utest have any parametric test functionality? There are several places where I want to run the same test against different input and have basically had to do it manually with a loop. But this makes it a pain to see which test cases are actually failing.

As a final test metric this is what I've got so far, 1822 assertions across 415 tests.

assertations: 1822
successes: 1787
errors: 0
failures: 35
warnings: 0
execution time: 22.627

There are still some things to do in the hxcpp implementation (SecureSocket for non Windows, general cleanup, investigating some of the legitimate failing tests I've got), but I think the next thing to do is probably start making decisions on some of the issues mentioned.

@Apprentice-Alchemist
Contributor

Regarding the UID/GID stuff, I would just remove that from the API for the time being.
For permissions, the only portable attribute is the readonly bit, so that's the only thing that should be exposed imo.

(See https://github.com/Apprentice-Alchemist/haxe/tree/feature/refactor_std/std/sys/fs for my take on a (synchronous) filesystem api)

For now we should focus on exposing portable functionality, platform specific stuff is a problem for later.

@Aidan63
Contributor

Aidan63 commented Apr 6, 2024

I'd agree with removing the UID and GID stuff, but I don't think keeping readonly is much better either, as there's no easy way to make it consistent (and useful) across platforms.

Take Rust's readonly permission: on Windows it just checks the readonly file attribute, which is a holdover from the OS/2 days and not an actual access control mechanism. Because of this it can report that a file is writable, but trying to write can then fail if there is a deny ACL for the given user.
Mac is somewhat similar, it only checks the underlying posix bits which are used as a last check if there are no deny ACLs, so it can also report wrong values.

We could document all of that, but I'd imagine most users have very little knowledge of the underlying permission systems, and even if they did read it, it would be in one ear and out the other, leading to code which naively uses some readonly property / function to check if they can write to a file while ignoring any errors which might come about from actually trying to write.

Best thing to do if you want to see if you can actually write to a file is try and open it in a write mode, we should probably be encouraging that instead of providing questionable shortcuts.

@Apprentice-Alchemist
Contributor

Not exposing the readonly bit is fine by me.

@Aidan63
Contributor

Aidan63 commented May 30, 2024

@9Morello didn't realise you reacted to my response in the coro PR, but I'll move it here since it's probably a better place.

I don't think something like libxev would make things easier; the docs are very sparse and make no mention of its threading model. It seems heavily inspired by libuv so I'm guessing it's the same, which is a big pain point of libuv. Not to mention "Schrödinger's Windows support" (the readme mentions both that Windows is supported and that Windows support is planned)!

If something else were to be chosen I'd want it to not be a C library. Working with libuv has been a good reminder as to why I steer clear of these big C libraries. Macros to "fake" inheritance, passing around data as void pointers, having to deal with function pointers, etc, etc. What an absolute pain!

@Aidan63
Contributor

Aidan63 commented Jul 20, 2024

What's the relationship between asys and the thread event loop? I was debugging an issue from how I've integrated the two but then realised I've made the assumption that they're related. E.g. is the following expected to work?

Thread.create(() -> {
	asys.native.filesystem.FileSystem.openFile('foo.txt', Read, (_, _) -> {
		trace('complete');
	});
});

In my implementation this throws an exception as it piggybacks off the thread event loop, which Thread.create doesn't have. Is it expected that this should work? If so, how should the current thread event loop and the asys event loop work together?
If the above should work, then should there be a change in haxe 5 for threads to always have an event loop? I'm not sure if there was a specific need for a thread without an event loop, outside of precautions in case it would break things.

This sort of event loop integration was briefly mentioned in the coroutines PR, so there might be a lot of overlap here.

Aidan63/hxcpp_asys#95


On the topic of the process api I'm proposing the following changes / simplification.

  • Instead of exposing an array of streams for io, simply have three properties: stdin, stdout, and stderr. You lose the ability to do things like keep printing to stdout as well as piping to a file, but this greatly simplifies the implementation. It also doesn't raise questions such as why file piping has special support but not socket piping. If there is massive demand for this it can be added after the fact.
  • StdioConfig gets reduced to three enums: Ignore, Inherit, and Redirect, Redirect being the new one which allows you to access the specified stdio from the haxe api. The meaning of redirect is easily inferred from the stdio it's used against; you don't have to figure out which read or write enum to use based on which direction you're supposed to be "looking" in.
  • The stdio properties on the process class throw something like NotRedirectedException instead of returning null if they are not redirected. Trying to access the stdio of a stream which is not redirected seems like a programming error rather than a possible outcome so an exception might be more appropriate.
  • Multiple callbacks to exitCode should be allowed when the process is running and all should be notified. Thinking of possible coroutine usage, I can definitely imagine having multiple coroutines running which all want to be notified of when a process exits.

So overall the class looks like

enum abstract StdioConfig(Int) {
    var Redirect;
    var Inherit;
    var Ignore;
}

typedef ProcessOptions = {
    var ?args:Array<String>;
    var ?cwd:String;
    var ?env:Map<String, String>;
    var ?stdin:StdioConfig;
    var ?stdout:StdioConfig;
    var ?stderr:StdioConfig;
    var ?user:Int;
    var ?group:Int;
    var ?detached:Bool;
}

class Process {
    public var pid(get, never):Int;

    public var stdin(get, never):IWritable;

    public var stdout(get, never):IReadable;

    public var stderr(get, never):IReadable;

    static public function execute(command:String, ?options:ProcessOptions, callback:Callback<{?stdout:Bytes, ?stderr:Bytes, exitCode:Int}>);

    static public function open(command:String, ?options:ProcessOptions, callback:Callback<ChildProcess>);

    public function sendSignal(signal:Signal, callback:Callback<NoData>);
}

This does still leave questions about the execute function, should the stdio options do anything for that function? We could always split stdio options into their own typedef and not have that be user provided if that is the case.

@skial skial mentioned this pull request Jul 24, 2024
@Aidan63
Contributor

Aidan63 commented Aug 28, 2024

Three other things I noted down while doing some cleanup.

isFile and null

Realised that my isFile implementation and a few others return argument exceptions when provided null. Guessing it would be preferable for null in these cases to return false, as null is not a file.

(Aidan63/hxcpp_asys#96)

IOExceptions

The IOException type is rather awkward to use, especially if you look at it from a coroutine point of view. If, for example, I'm only interested in the file not being found and I'm fine with other exceptions being bubbled up I'd have to write something like this.

try {
	FileSystem.info(path).modificationTime;

	// do other stuff
} catch (exn:IOException) {
	if (exn.type.match(FileNotFound)) {
		return false;
	}

	// rethrow exceptions we're not handling...
	throw exn;
}

Usually I'm all for reducing inheritance, but with the built in type checking of catches, having exception subclasses for each error type makes for nicer code that is less likely to accidentally swallow errors.

try {
	FileSystem.info(path).modificationTime;

	// do other stuff
} catch (_:FileNotFoundException) {
	return false;
}

(Aidan63/hxcpp_asys#97)
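A subclass-per-error-type hierarchy could be sketched like this (the class name and the base-class constructor shape are assumptions, not the PR's actual API):

```haxe
// Hypothetical: one exception subclass per IoErrorType value, so catches
// can select on the static type instead of matching on `exn.type`.
class FileNotFoundException extends IoException {
    public function new(path:FilePath) {
        // assumes an IoException constructor taking the error type and a message
        super(FileNotFound, 'File not found: $path');
    }
}
```

With something like that in place, the second catch example above works as written: `catch (_:FileNotFoundException)` intercepts only the not-found case and lets every other error propagate.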

Calling close on alive processes

One I found while going over some libuv docs, you shouldn't call uv_close on a process before its exit callback has arrived, doing this doesn't result in an error but leaves the process in some sort of weird state... (libuv/libuv#1911)
While I'm sure there are good reasons why they implemented things that way, it is thoroughly unintuitive and seems like an accident waiting to happen.
I think it would make sense if we said that calling close on a process which is still alive will kill it. So we'd need to do some tracking: on a close call we'd check whether the process needs to be killed, and wait for its exit before doing cleanup and invoking the provided callback.

(Aidan63/hxcpp_asys#98)
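A sketch of that tracking (the `exited`, `pendingClose`, and `closeHandle` members are hypothetical bookkeeping, not part of the proposed API):

```haxe
// Sketch: close() on a live process kills it first and defers the actual
// uv_close until the exit callback has arrived.
public function close(callback:Callback<NoData>) {
    if (exited) {
        // exit callback already arrived, safe to release the libuv handle
        closeHandle(callback);
    } else {
        // remember the close request, then force the process down; the
        // exit callback later calls closeHandle(pendingClose)
        pendingClose = callback;
        sendSignal(Kill, (error, result) -> {});
    }
}
```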

Aidan63 (Contributor) commented Sep 5, 2024

Had an idea the other day which might solve several questions around threading, lifetimes, and some coroutine stuff.

I've been operating on a one-Haxe-thread, one-libuv-loop principle. But what if there was a dedicated background thread which just ran the libuv loop, and all asys API calls were serialised onto it? I think this approach solves several problems.

Cross-thread usage. You no longer have the limitation of asys objects only being usable on the thread they were created on. Passing asys objects around threads might seem odd, but there has been a fair bit of discussion about coroutine scheduling, and it might be possible for coroutines to resume on a thread pool; in that case asys objects being usable across threads is important, and this approach would allow that.

Resource management. A single libuv thread owned by the runtime makes object lifetimes much easier. If close hasn't been called by the time the asys object gets finalised, it's easy to schedule the close on the dedicated libuv thread. You no longer have to worry about tracking Haxe threads, which objects were created on them, whether those threads are still alive, etc.

Not tied to the thread event loop. This would also free asys from the thread event loop and would allow a much easier blocking start implementation in coroutines. Basing this on my recent coroutine experiments, and assuming the coroutine asys wrappers shuttle the callbacks through the coroutine scheduler, start could create its own EventLoop and pump that between checking for results. This way it's truly blocking, instead of the thread-event-loop pumping we were thinking of before, which could cause other code to be executed.

There are potentially some downsides though. This approach would probably work fine for hxcpp and hl, which have their own runtimes, but for a target which doesn't really implement its own runtime and wants to use libuv for its implementation, this might make things a bit more difficult.

This single thread could become a bottleneck, as work will have to be placed in some sort of thread-safe collection to be picked up and processed by the libuv thread. This might not end up being a problem under normal conditions though.

(Aidan63/hxcpp_asys#4)
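The serialisation part of that idea could be sketched with the standard `sys.thread` tools; here `uvRunOnce` is a placeholder standing in for whatever native call drives one iteration of the libuv loop:

```haxe
import sys.thread.Thread;
import sys.thread.Deque;

// Sketch of the "one global libuv thread" idea: every asys call is pushed
// onto a thread-safe queue and drained by a dedicated thread that also
// pumps the libuv loop.
class AsysThread {
    static final queue = new Deque<() -> Void>();

    static final thread = Thread.create(() -> {
        while (true) {
            // drain any work submitted from other threads
            while (true) {
                switch queue.pop(false) {
                    case null: break;
                    case work: work();
                }
            }
            uvRunOnce();
        }
    });

    // placeholder for the native uv_run(loop, UV_RUN_NOWAIT) call
    static function uvRunOnce():Void {}

    /** Serialise `work` onto the libuv thread. */
    public static function run(work:() -> Void):Void {
        queue.add(work);
    }
}
```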

Aidan63 (Contributor) commented Sep 17, 2024

This is probably more coroutine related than asys but it's a follow on from the above comment so I'm putting it here.

Played around with the global libuv loop idea and it seems to work well; I converted a very small subset of my asys code to use it (file open, write, and close). A coroutine implementation of openFile based on my coroutine branch would look like this.

@:coroutine public static function openFile<T>(path:FilePath, flag:FileOpenFlag<T>):T {
	if (path == null) {
		throw new ArgumentException('path', 'path was null');
	}

	return Coroutine.suspend(cont -> {
		cpp.asys.File.open(
			path,
			cast flag,
			file -> cont.resume(@:privateAccess new File(file), null),
			error -> cont.resume(null, new FsException(error, path)))
	});
}

The cpp.asys.File.open call queues up work on the libuv thread, and the callbacks are also performed on that thread. The call to the continuation's resume, which goes through the current scheduler, will move execution back onto whatever the scheduler is implemented to use: that could be an event loop, a thread pool, or something else.

This makes implementing a blocking coroutine start much more feasible. Previously we were thinking we'd need to pump the thread's main event loop, but with this setup we wouldn't need to. We can create a new sys.thread.EventLoop, set up a scheduler to execute functions on that event loop, and then pump just that event loop. Only work from the coroutine will be executed, not anything else which happens to also use the thread's event loop.

// possible `start` implementation.
final loop    = new EventLoop();
final blocker = new WaitingCompletion(loop, new EventLoopScheduler(loop));
final result  = switch myCoroutine(blocker) {
	case Suspended:
		// wait will pump the provided event loop until its `resume` is called indicating the coroutine has completed.
		blocker.wait();
	case Success(v):
		v;
	case Error(exn):
		throw exn;
}
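The WaitingCompletion used above could be sketched roughly like this (the field names are guesses; a real implementation would also need synchronisation, since resume arrives from the libuv thread while wait is pumping):

```haxe
import sys.thread.EventLoop;

// Sketch: blocks the calling thread by pumping ONLY the private event
// loop until the coroutine's continuation delivers a result.
class WaitingCompletion {
    final loop:EventLoop;
    var result:Null<Dynamic>;
    var error:Null<haxe.Exception>;
    var done = false;

    public function new(loop:EventLoop /*, scheduler */) {
        this.loop = loop;
    }

    // called by the coroutine machinery when the coroutine completes
    public function resume(result:Dynamic, error:Null<haxe.Exception>) {
        this.result = result;
        this.error  = error;
        done = true;
    }

    public function wait():Dynamic {
        // only this loop is pumped, so unrelated work queued on the
        // thread's main event loop never runs while we block
        while (!done) loop.progress();
        if (error != null) throw error;
        return result;
    }
}
```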

An enhancement / slight alternative to this global loop which might help libuv integration with other targets could be to have some sort of AsysContext type which is created by the initial coroutine and stored in the context. Suspending functions can then access that context which would manage a libuv loop / do whatever target specific stuff is needed without requiring a global thread.
You would start to reintroduce some unsupported operations though, e.g. not being able to use opened files across two different coroutine start calls as they would each have a separate context and therefore libuv loop. Although this might be an acceptable limitation.

I've created a new branch in both my hxcpp fork and hxcpp_asys repo with this small global loop test.

0b1kn00b commented
I've got quite a bit about coroutines in Haxe over here if it's any help: stx_coroutine
