Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary support for Openbsd #681

Open
wants to merge 18 commits into
base: main
Choose a base branch
from
Open

Conversation

bogwonch
Copy link

Hi

I've tried to get Maestral working for OpenBSD. It is completing an initial sync, and the tests can all run and pass (except for the autostart test: OpenBSD doesn't have a systemd/launchctl equivalent so I've skipped it).

It needed relatively few changes: make some updates to platform checks (mostly same as Linux) and an update to the fcntl structure format for OpenBSD (its the same as MacOSs for us).

That said, this isn't ready for merging: I'm getting an encoding bug. I'm unsure if this is an issue with my own Dropbox (the unicode symbol for a Florin isn't exactly a common mark but it was used in ancient pre-OSX Mac icon packs to denote folders); or whether its a more general encoding error. Any pointers or suggestions would be appreciated :-) Log message bellow.

Cheers,
Jo

2022-05-22 10:43:20 manager ERROR: Unexpected error
Traceback (most recent call last):
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/manager.py", line 843, in _handle_sync_thread_errors
    yield
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/manager.py", line 821, in startup_worker
    self.sync.download_sync_cycle(client)
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/sync.py", line 2900, in download_sync_cycle
    for changes, cursor in changes_iter:
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/sync.py", line 2966, in list_remote_changes_iterator
    sync_events = [SyncEvent.from_metadata(md, self) for md in changes.entries]
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/sync.py", line 2966, in <listcomp>
    sync_events = [SyncEvent.from_metadata(md, self) for md in changes.entries]
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/models.py", line 321, in from_metadata
    if sync_engine.get_local_rev(md.path_lower):
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/sync.py", line 794, in get_local_rev
    entry = self.get_index_entry(dbx_path_lower)
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/sync.py", line 763, in get_index_entry
    return self._index_table.get(dbx_path_lower)
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/database/orm.py", line 294, in get
    pk_sql = self.pk_column.py_to_sql(primary_key)
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/database/orm.py", line 158, in py_to_sql
    return self.type.py_to_sql(value)
  File "/home/joseph/.local/share/python/lib/python3.9/site-packages/maestral/database/types.py", line 96, in py_to_sql
    return os.fsencode(value)
  File "/usr/local/lib/python3.9/os.py", line 812, in fsencode
    return filename.encode(encoding, errors)
UnicodeEncodeError: 'ascii' codec can't encode character '\u0192' in position 35: ordinal not in range(128)
2022-05-22 10:43:20 manager INFO: Shutting down threads...
2022-05-22 10:43:20 sync INFO: Sync aborted
2022-05-22 10:43:59 manager INFO: Paused

@codecov
Copy link

codecov bot commented May 22, 2022

Codecov Report

Merging #681 (a208994) into main (0392c77) will decrease coverage by 0.74%.
The diff coverage is 78.57%.

❗ Current head a208994 differs from pull request most recent head 266d67c. Consider uploading reports for the commit 266d67c to get more accurate results

@@            Coverage Diff             @@
##             main     #681      +/-   ##
==========================================
- Coverage   78.21%   77.48%   -0.74%     
==========================================
  Files          42       44       +2     
  Lines        7465     7466       +1     
==========================================
- Hits         5839     5785      -54     
- Misses       1626     1681      +55     
Flag Coverage Δ
pytest 77.48% <78.57%> (-0.74%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/maestral/daemon.py 79.06% <40.00%> (-1.41%) ⬇️
src/maestral/constants.py 94.11% <100.00%> (-5.89%) ⬇️
src/maestral/utils/appdirs.py 89.70% <100.00%> (+0.15%) ⬆️

... and 36 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Owner

@samschott samschott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution and for running the test suite on OpenBSD, it is good to know that the tests pass when using the watchdog's kqueue backend. Though I am surprised about this, I would have at least expected some failures in the integration tests regarding symlink handling, based on how the watchdog kqueue backend is implemented. In any case, before merging this, I'd like to set up CI support for either some flavour of BSD to test kqueue event reporting on macOS.

This PR has some overlap with #680 and I am wondering if FreeBSD and OpenBSD would be sufficiently similar to only have a single flag IS_BSD. What do you think, does that make sense?

The encoding issue is strange. os.fsencode() seems to believe that your file system encoding is ASCII and therefore fails when given out of range unicode characters such as ƒ. Are you really using an Ascii file system encoding or could your environment variables LC_CTYPE, etc, just be badly configured?

tests/offline/test_cli.py Show resolved Hide resolved
@bogwonch
Copy link
Author

Yes it was surprisingly painless, which is always nice. And lovely when you fix the shutdown bug and suddenly see all the 6 failing tests go green! Haven’t done proper dev work in years :-D !

Re: #680 …aha… someone else has done it too. Yes the patches look essentially the same. I would imagine that having a shared IS_BSD path is entirely reasonable, but I don’t know enough to be sure—I’d ask the @christopherkobayashi person what they think. Given that the bulk of it seems to be the XDG config dirs (which is the same as linux) and the fcntl flags (which is the same as OSX) it seems unlikely that individual BSDs may go too wild here and is one less path to test for you.

Re: unicode: OpenBSD only supports UTF-8 (and LC_TYPE is set to utf-8), so its odd that Python is dropping to ASCII at all. I will investigate further. Heck, I’ll see what Maestral on Linux does with my weird folder names…

@christopherkobayashi
Copy link

Chiming in, as I was mentioned ...

... yes, our two pull requests are functionally the same. Setting IS_BSD conditionally on the platform being FreeBSD/OpenBSD looks to be the right thing to do.

I'm okay with withdrawing my pull request in favor of this one, if the FreeBSD bits are folded in.

Cheers, and thanks for doing this.

@samschott
Copy link
Owner

I've added a test run for kqueue on macOS now (GitHub actions don't have OpenBSD or FreeBSD test runners). Two things to flag:

  1. There is a bit of a performance hit. Integration tests take > 1 h to complete with kqueue rather than ~ 30 min. The duration of those tests is typically dominated by the long wait times between file changes (to make sure everything is synced and to account for connection issues). I'm therefore surprised that changing file system notification backends has an effect at all.
  2. There are a number of sync test failures which I'll need to investigate before declaring the kqueue backend as "supported". Not too many, fortunately, but fixing them might require changes to the upstream watchdog library.

OpenBSD only supports one value of LC_CTYPE: C.  For applications which really want to use UTF-8 it also supports en_US.UTF-8 as an acceptable value (and implicitly converts anything ending with .UTF-8 to en_US.UTF-8). See OpenBSD's man 1 locale.

When spawning the daemon maestral execs it in a custom environment with the value of LC_CTYPE set UTF-8 which isn't valid on this platform so the daemon fallsback to running in ASCII mode... which breaks everything.

Unfortunately the call to env.update in daemon.py still seems to break things even if the LC_CTYPE is set correctly, so for OpenBSD I've disabled it.  Things seem to run fine otherwise.
@bogwonch
Copy link
Author

bogwonch commented May 24, 2022

Right fixed the issue with unicode characters! That's 3 hours of my life I'll never get back judiciously inserting print(sys.getfilenameencoding()) everywhere until I figured it out ;-)

When you launch the daemon process in daemon.py Maestral mucks about with the environment taking values from ENV in constants.py... specifically setting LC_CTYPE to UTF-8.

Problem is OpenBSD only officially supports one value of LC_CTYPE: C; and sort of supports en_US.UTF-8 converting anything that ends with .UTF-8 to that. See locale(1). All other values are invalid so when Python gets the UTF-8 it says it's invalid and kicks it back to ascii breaking everything.

What fun!

Even if I set the LC_CTYPE to the correct value in the ENV structure, the env.update(ENV) still seems to break it... so I've removed it for OpenBSD. We don't need no stinkin' optimization!

Running a sync from clean now, but it is looking happy. We'll see how it is in the morning.

@samschott
Copy link
Owner

Does that mean only Ascii file names are supported with the default LC_CTYPE=C? From the linked man page:

the default of LC_CTYPE=C selects the US-ASCII character set and encoding, treating the bytes 0x80 to 0xff as non-printable characters of application-specific meaning

Since Python 3.7, Python will just default to assuming UTF-8 file system encoding if it sees LC_CTYPE=C (see https://docs.python.org/3/library/os.html#python-utf-8-mode), which would explain why it works when you don't set the environment variable. However, after reading the man page, I'm also confused as to why LC_CTYPE=C.UTF8 is not picked up properly. Again from the man page:

If the value of LC_CTYPE ends in ‘.UTF-8’, programs in the OpenBSD base system ignore the beginning of it, treating for example zh_CN.UTF-8 exactly like en_US.UTF-8.

@bogwonch
Copy link
Author

That would be my reading of the manual too.

The funny thing is if I explicitly set LC_CTYPE to en_US.UTF-8 in constants.py it still falls back to ASCII, and if I set ENV to {} it still goes back to ASCII. It's only if I comment out the env.update() call in daemon.py that it seems to work correctly.

Under an X session LC_CTYPE seems to get a default value of C.UTF-8 without explicitly setting anything. The behaviour seems to be the same though.

Whats really odd is that I'm fairly convinced it's something Maestral is doing. If I run the following script:

import os
import sys

print(f"Contents of os.environ:")
for k, v in os.environ.items():
    print(f"- {k}={v}")
print('')

env=os.environ.copy()
print(f"Contents of os.environ copy:")
for k, v in env.items():
    print(f"- {k}={v}")
print('')

new_env={'PYTHONOPTIMIZE': '2', 'LC_CTYPE': 'en_US.UTF-8'}
env.update(new_env)
print(f"Contents of updated os.environ copy:")
for k, v in env.items():
    print(f"- {k}={v}")
print('')

print("Spawned with os.environ:")
pid=os.spawnve(os.P_NOWAIT, "/home/joseph/envdump", ["/home/joseph/envdump"], os.environ)
os.waitpid(pid,0)
print('')

print("Spawned with updated os.environ:")
pid=os.spawnve(os.P_NOWAIT, "/home/joseph/envdump", ["/home/joseph/envdump"], env)
os.waitpid(pid,0)
print('')

print(f"sys.executable={sys.executable}")
print('')

print("fsencoding spawned with os.environ:")
pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"], os.environ)
os.waitpid(pid,0)
print('')

print("fsencoding spawned with updated os.environ:")
pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"], env)
os.waitpid(pid,0)
print('')

print("fsencode spawned with os.environ:")
pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import os; print(os.fsencode('⁇'))"], os.environ)
os.waitpid(pid,0)
print('')

print("fsencode spawned with updated os.environ:")
pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import os; print(os.fsencode('⁇'))"], env)
os.waitpid(pid,0)
print('')

Where envdump is the compiled form of:

#include <stdio.h>

int main(int argc, char *argv[], char *environ[]) {
  while (*environ != NULL)
    printf("- %s\n", *environ++);
  
  return 0;
}

You get the following:

Contents of os.environ:
- SHELL=/usr/local/bin/bash
- WINDOWID=48234874
- COLORTERM=truecolor
- MOZ_ACCELERATED=1
- KITTY_PID=64878
- EDITOR=emacsclient -t
- PWD=/home/joseph
- LOGNAME=joseph
- MANPATH=:/home/joseph/.local/share/man:/usr/local/opt/share/man:/usr/share/man:/usr/local/share/man
- HOSTDISPLAY=banana-slug.homenet.arpa:0
- FVWM_MODULEDIR=/usr/local/libexec/fvwm/2.6.9
- MOZ_WEBRENDER=1
- WINDOWPATH=5
- HOME=/home/joseph
- VIRTUAL_ENV=/home/joseph/.local/share/python
- KITTY_WINDOW_ID=2
- TERMINFO=/usr/local/lib/kitty/terminfo
- TERM=xterm-kitty
- AUTOCONF_VERSION=2.71
- USER=joseph
- FVWM_USERDIR=/home/joseph/.fvwm
- VISUAL=emacsclient -c -a emacs
- DISPLAY=:0
- SHLVL=1
- PAGER=less
- LC_CTYPE=C.UTF-8
- PS2=\[\e]133;k;start_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_kitty\a\] ⋯ 
- PS3= ⁇ 
- PS1=\[\e]133;k;start_kitty\a\]\[\e]133;A\a\]\[\e]133;k;end_kitty\a\]\n\[\e]133;k;start_secondary_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_secondary_kitty\a\] ⧣ \[\e]133;k;start_suffix_kitty\a\]\[\e[5 q\]\[\e]2;\w\a\]\[\e]133;k;end_suffix_kitty\a\]
- PS4= $0 +$LINENO ⧣ 
- BROWSER=firefox
- ALTERNATE_EDITOR=
- PATH=/home/joseph/.local/share/python/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/home/joseph/.local/bin:/usr/local/opt/bin:/usr/games:/usr/local/games
- FVWM_DATADIR=/usr/local/share/fvwm
- AUTOMAKE_VERSION=1.16
- KITTY_INSTALLATION_DIR=/usr/local/lib/kitty
- GOPATH=/home/joseph/.local/share/go
- _=/home/joseph/.local/share/python/bin/python

Contents of os.environ copy:
- SHELL=/usr/local/bin/bash
- WINDOWID=48234874
- COLORTERM=truecolor
- MOZ_ACCELERATED=1
- KITTY_PID=64878
- EDITOR=emacsclient -t
- PWD=/home/joseph
- LOGNAME=joseph
- MANPATH=:/home/joseph/.local/share/man:/usr/local/opt/share/man:/usr/share/man:/usr/local/share/man
- HOSTDISPLAY=banana-slug.homenet.arpa:0
- FVWM_MODULEDIR=/usr/local/libexec/fvwm/2.6.9
- MOZ_WEBRENDER=1
- WINDOWPATH=5
- HOME=/home/joseph
- VIRTUAL_ENV=/home/joseph/.local/share/python
- KITTY_WINDOW_ID=2
- TERMINFO=/usr/local/lib/kitty/terminfo
- TERM=xterm-kitty
- AUTOCONF_VERSION=2.71
- USER=joseph
- FVWM_USERDIR=/home/joseph/.fvwm
- VISUAL=emacsclient -c -a emacs
- DISPLAY=:0
- SHLVL=1
- PAGER=less
- LC_CTYPE=C.UTF-8
- PS2=\[\e]133;k;start_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_kitty\a\] ⋯ 
- PS3= ⁇ 
- PS1=\[\e]133;k;start_kitty\a\]\[\e]133;A\a\]\[\e]133;k;end_kitty\a\]\n\[\e]133;k;start_secondary_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_secondary_kitty\a\] ⧣ \[\e]133;k;start_suffix_kitty\a\]\[\e[5 q\]\[\e]2;\w\a\]\[\e]133;k;end_suffix_kitty\a\]
- PS4= $0 +$LINENO ⧣ 
- BROWSER=firefox
- ALTERNATE_EDITOR=
- PATH=/home/joseph/.local/share/python/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/home/joseph/.local/bin:/usr/local/opt/bin:/usr/games:/usr/local/games
- FVWM_DATADIR=/usr/local/share/fvwm
- AUTOMAKE_VERSION=1.16
- KITTY_INSTALLATION_DIR=/usr/local/lib/kitty
- GOPATH=/home/joseph/.local/share/go
- _=/home/joseph/.local/share/python/bin/python

Contents of updated os.environ copy:
- SHELL=/usr/local/bin/bash
- WINDOWID=48234874
- COLORTERM=truecolor
- MOZ_ACCELERATED=1
- KITTY_PID=64878
- EDITOR=emacsclient -t
- PWD=/home/joseph
- LOGNAME=joseph
- MANPATH=:/home/joseph/.local/share/man:/usr/local/opt/share/man:/usr/share/man:/usr/local/share/man
- HOSTDISPLAY=banana-slug.homenet.arpa:0
- FVWM_MODULEDIR=/usr/local/libexec/fvwm/2.6.9
- MOZ_WEBRENDER=1
- WINDOWPATH=5
- HOME=/home/joseph
- VIRTUAL_ENV=/home/joseph/.local/share/python
- KITTY_WINDOW_ID=2
- TERMINFO=/usr/local/lib/kitty/terminfo
- TERM=xterm-kitty
- AUTOCONF_VERSION=2.71
- USER=joseph
- FVWM_USERDIR=/home/joseph/.fvwm
- VISUAL=emacsclient -c -a emacs
- DISPLAY=:0
- SHLVL=1
- PAGER=less
- LC_CTYPE=en_US.UTF-8
- PS2=\[\e]133;k;start_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_kitty\a\] ⋯ 
- PS3= ⁇ 
- PS1=\[\e]133;k;start_kitty\a\]\[\e]133;A\a\]\[\e]133;k;end_kitty\a\]\n\[\e]133;k;start_secondary_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_secondary_kitty\a\] ⧣ \[\e]133;k;start_suffix_kitty\a\]\[\e[5 q\]\[\e]2;\w\a\]\[\e]133;k;end_suffix_kitty\a\]
- PS4= $0 +$LINENO ⧣ 
- BROWSER=firefox
- ALTERNATE_EDITOR=
- PATH=/home/joseph/.local/share/python/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/home/joseph/.local/bin:/usr/local/opt/bin:/usr/games:/usr/local/games
- FVWM_DATADIR=/usr/local/share/fvwm
- AUTOMAKE_VERSION=1.16
- KITTY_INSTALLATION_DIR=/usr/local/lib/kitty
- GOPATH=/home/joseph/.local/share/go
- _=/home/joseph/.local/share/python/bin/python
- PYTHONOPTIMIZE=2

Spawned with os.environ:
- SHELL=/usr/local/bin/bash
- WINDOWID=48234874
- COLORTERM=truecolor
- MOZ_ACCELERATED=1
- KITTY_PID=64878
- EDITOR=emacsclient -t
- PWD=/home/joseph
- LOGNAME=joseph
- MANPATH=:/home/joseph/.local/share/man:/usr/local/opt/share/man:/usr/share/man:/usr/local/share/man
- HOSTDISPLAY=banana-slug.homenet.arpa:0
- FVWM_MODULEDIR=/usr/local/libexec/fvwm/2.6.9
- MOZ_WEBRENDER=1
- WINDOWPATH=5
- HOME=/home/joseph
- VIRTUAL_ENV=/home/joseph/.local/share/python
- KITTY_WINDOW_ID=2
- TERMINFO=/usr/local/lib/kitty/terminfo
- TERM=xterm-kitty
- AUTOCONF_VERSION=2.71
- USER=joseph
- FVWM_USERDIR=/home/joseph/.fvwm
- VISUAL=emacsclient -c -a emacs
- DISPLAY=:0
- SHLVL=1
- PAGER=less
- LC_CTYPE=C.UTF-8
- PS2=\[\e]133;k;start_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_kitty\a\] ⋯ 
- PS3= ⁇ 
- PS1=\[\e]133;k;start_kitty\a\]\[\e]133;A\a\]\[\e]133;k;end_kitty\a\]\n\[\e]133;k;start_secondary_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_secondary_kitty\a\] ⧣ \[\e]133;k;start_suffix_kitty\a\]\[\e[5 q\]\[\e]2;\w\a\]\[\e]133;k;end_suffix_kitty\a\]
- PS4= $0 +$LINENO ⧣ 
- BROWSER=firefox
- ALTERNATE_EDITOR=
- PATH=/home/joseph/.local/share/python/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/home/joseph/.local/bin:/usr/local/opt/bin:/usr/games:/usr/local/games
- FVWM_DATADIR=/usr/local/share/fvwm
- AUTOMAKE_VERSION=1.16
- KITTY_INSTALLATION_DIR=/usr/local/lib/kitty
- GOPATH=/home/joseph/.local/share/go
- _=/home/joseph/.local/share/python/bin/python

Spawned with updated os.environ:
- SHELL=/usr/local/bin/bash
- WINDOWID=48234874
- COLORTERM=truecolor
- MOZ_ACCELERATED=1
- KITTY_PID=64878
- EDITOR=emacsclient -t
- PWD=/home/joseph
- LOGNAME=joseph
- MANPATH=:/home/joseph/.local/share/man:/usr/local/opt/share/man:/usr/share/man:/usr/local/share/man
- HOSTDISPLAY=banana-slug.homenet.arpa:0
- FVWM_MODULEDIR=/usr/local/libexec/fvwm/2.6.9
- MOZ_WEBRENDER=1
- WINDOWPATH=5
- HOME=/home/joseph
- VIRTUAL_ENV=/home/joseph/.local/share/python
- KITTY_WINDOW_ID=2
- TERMINFO=/usr/local/lib/kitty/terminfo
- TERM=xterm-kitty
- AUTOCONF_VERSION=2.71
- USER=joseph
- FVWM_USERDIR=/home/joseph/.fvwm
- VISUAL=emacsclient -c -a emacs
- DISPLAY=:0
- SHLVL=1
- PAGER=less
- LC_CTYPE=en_US.UTF-8
- PS2=\[\e]133;k;start_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_kitty\a\] ⋯ 
- PS3= ⁇ 
- PS1=\[\e]133;k;start_kitty\a\]\[\e]133;A\a\]\[\e]133;k;end_kitty\a\]\n\[\e]133;k;start_secondary_kitty\a\]\[\e]133;A;k=s\a\]\[\e]133;k;end_secondary_kitty\a\] ⧣ \[\e]133;k;start_suffix_kitty\a\]\[\e[5 q\]\[\e]2;\w\a\]\[\e]133;k;end_suffix_kitty\a\]
- PS4= $0 +$LINENO ⧣ 
- BROWSER=firefox
- ALTERNATE_EDITOR=
- PATH=/home/joseph/.local/share/python/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/X11R6/bin:/usr/local/bin:/usr/local/sbin:/home/joseph/.local/bin:/usr/local/opt/bin:/usr/games:/usr/local/games
- FVWM_DATADIR=/usr/local/share/fvwm
- AUTOMAKE_VERSION=1.16
- KITTY_INSTALLATION_DIR=/usr/local/lib/kitty
- GOPATH=/home/joseph/.local/share/go
- _=/home/joseph/.local/share/python/bin/python
- PYTHONOPTIMIZE=2

sys.executable=/home/joseph/.local/share/python/bin/python

fsencoding spawned with os.environ:
utf-8

fsencoding spawned with updated os.environ:
utf-8

fsencode spawned with os.environ:
b'\xe2\x81\x87'

fsencode spawned with updated os.environ:
b'\xe2\x81\x87'

Which looks right to me. In Maestral though something is happening to set the sys.getfilesystemencoding() to 'ansi' when spawning the daemon, and it appears to be something to do with the env.update.

Very weird...

@bogwonch
Copy link
Author

It gets weirder:

If I change the script in src/maestral/daemon.py: start_maestral_daemon_process to:

script = f'print(f"pre-import: {sys.getfilesystemencoding()}"); import maestral.daemon; print(f"post-import {sys.getfilesystemencoding()}"); maestral.daemon.start_maestral_daemon("{cc}"); print(f"post-start: {sys.getfilesystemencoding()}");'

And add prints, pre and post the spawn of the daemon:

    print(f"pre-spawn: {sys.getfilesystemencoding()}")
    pid = os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", script], env)
    print(f"post-spawn: {sys.getfilesystemencoding()}")

And at the start of maestral.daemon.start_maestral_daemon:


def start_maestral_daemon(
    config_name: str = "maestral", log_to_stderr: bool = False
) -> None:
    """
    Starts the Maestral daemon with event loop in the current thread.

    Startup is race free: there will never be more than one daemon running with the same
    config name. The daemon is a :class:`maestral.main.Maestral` instance which is
    exposed as Pyro daemon object and listens for requests on a unix domain socket. This
    call starts an asyncio event loop to process client requests and blocks until the
    event loop shuts down. On macOS, the event loop is integrated with Cocoa's
    CFRunLoop. This allows processing Cocoa events and callbacks, for instance for
    desktop notifications.

    :param config_name: The name of the Maestral configuration to use.
    :param log_to_stderr: If ``True``, write logs to stderr.
    :raises RuntimeError: if a daemon for the given ``config_name`` is already running.
    """
    print(f"start_maestral_daemon start: {sys.getfilesystemencoding()}")
    ...

With the call to env.update() I get:


 ⧣ maestral start
Starting Maestral...pre-spawn: utf-8
post-spawn: utf-8
pre-import: utf-8
post-import utf-8
start_maestral_daemon start: ascii
Starting Maestral...        [OK]

And then Unicode errors...
And when I stop:

 ⧣ maestral stop
Stopping Maestral...post-start: utf-8

And then without the env.update() I get:

 ⧣ maestral start
Starting Maestral...pre-spawn: utf-8
post-spawn: utf-8
pre-import: utf-8
post-import utf-8
start_maestral_daemon start: utf-8
Starting Maestral...        [OK]

 ⧣ maestral stop
Stopping Maestral...        [KILLED]

Very very weird. Something about that update to the environment is making Python on calling first function switch the file system encoding. I hate to say it (and would tell my students off for it...) but this feels like a Python bug(!?) I wonder if I can create a minimal example?

@bogwonch
Copy link
Author

No would be the answer:

odd.py:

import sys
import os

def my_function():
    print(f"In my_function: {sys.getfilesystemencoding()}")

if __name__=='__main__':
    print(f"At main: {sys.getfilesystemencoding()}")
    my_function()

    print("Spawing with env")
    pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import odd; odd.my_function()"], os.environ)
    os.waitpid(pid, 0)

    print("Spawing with updated env")
    env = os.environ.copy()
    env.update({'PYTHONOPTIMIZE': 2})
    pid=os.spawnve(os.P_NOWAIT, sys.executable, [sys.executable, "-c", "import odd; odd.my_function()"], os.environ)
    os.waitpid(pid, 0)

Gives:

 ⧣ python odd.py 
At main: utf-8
In my_function: utf-8
Spawing with env
In my_function: utf-8
Spawing with updated env
In my_function: utf-8

I think I'm going to have to leave this to someone with more Python experience than me.

Copy link
Owner

@samschott samschott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments in-line regarding the ENV settings. I do prefer having a controlled environment without needing to special-case certain platforms if not strictly required.

Other points:

  1. Have you checked enabling behaviour on BSDs in general?
  2. The kqueue file system events emitter appears to be much slower due to an implementation detail: kqueues can be used to monitor existing files (with a new "subscription" started for each file) but they don't notify about which new files are created. Watchdog therefore takes a full directory snapshot of the watched folder hierarchy on any change and diffs it with a previous snapshot. This leads to large memory usage and bad performance for large directories. I am not sure if this can be optimised but would like to discuss it with the watchdog maintainers first.
  3. There remain the sync test failures, which I have not yet investigated.

I won't consider 2 + 3 blockers for this PR, just want to flag them here since both will need to be addressed before BSDs can be considered supported. In the meantime, I'm happy to keep it as "experimental" with some known sync failures.

@@ -17,7 +17,12 @@
APP_NAME = "Maestral"
BUNDLE_ID = "com.samschott.maestral"
APP_ICON_PATH = path("maestral.resources", "maestral.png").__enter__()
ENV = {"PYTHONOPTIMIZE": "2", "LC_CTYPE": "UTF-8"}

if platform.system == "OpenBSD":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the missing brackets in the function call. I think you want platform.system() == "OpenBSD" here instead. This should clear up the strange behaviour.

#ENV = {"PYTHONOPTIMIZE": "2", "LC_CTYPE": "C.UTF-8"}
ENV = {}
else:
ENV = {"PYTHONOPTIMIZE": "2", "LC_CTYPE": "UTF-8"}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to OpenBSD man pages, switching to LC_CTYPE=C.UTF-8 should do the trick. This should also be compatible with other platforms so we won't need a special case for OpenBSD.

@samschott
Copy link
Owner

BTW, my reasons for forcing LC_CTYPE=UTF-8 are:

  1. Most systems actually use this encoding, even if environment variables indicate otherwise. Forcing this setting has spared me a lot of headache from stray bug reports from Linux users.
  2. When started in particular ways, the environment variables for Python to pick up the encoding might not be set at all. This includes macOS app bundles, but also service managers such as systemd or launchd under some circumstances.

So far, I've not run into issues with users actually having a non-UTF-8 file system encoding. The opposite occurs far more often: Python thinks they don't because the environment is misconfigured.

@bogwonch
Copy link
Author

Hi Sam,

Just continuing with this while I have a free moment. I have merged in the changes from your last release, and refactored things to use the IS_BSD as suggested. I have been testing on FreeBSD and it seems to work fine. On OpenBSD and NetBSD you have to increase the number of available files to a user via a sysctl/login.conf (OpenBSD is rather conservative, but is happy after, NetBSD seems to still moan... but worked fine with a small number of files). Couldn't get a DragonFlyBSD VM up and running but if you're covering Free, Open and Net; that is gonna be most of them.

I'll let them run overnight and then rerun the test suites and then kick it back to you.

Cheers,
Jo.

@bogwonch
Copy link
Author

(oh and stupid parenthesis in function calls... thankyou for your help... that's what I get for not programming in python for yonks).

@bogwonch
Copy link
Author

A few comments in-line regarding the ENV settings. I do prefer having a controlled environment without needing to special-case certain platforms if not strictly required.

Updated as requested.

Other points:

1. Have you checked enabling behaviour on BSDs in general?

I've been running maestral on VMs for FreeBSD and NetBSD. The tests are all passing and they seem happy syncing my own Dropbox for the last 2 days. NetBSD seems to complain about a limited number of available files, but seems to sync okay. OpenBSD has this issue too, but I know how to increase the number of available files there. In either case, this is not a Maestral issue. OpenBSD has been syncing my Dropbox happily for a couple of weeks now.

2. The kqueue file system events emitter appears to be much slower due to an implementation detail: kqueues can be used to monitor existing files (with a new "subscription" started for each file) but they don't notify about which new files are created. Watchdog therefore takes a full directory snapshot of the watched folder hierarchy on any change and diffs it with a previous snapshot. This leads to large memory usage and bad performance for large directories. I am not sure if this can be optimised but would like to discuss it with the watchdog maintainers first.

I'm afraid I don't know anything about this.

3. There remain the sync test failures, which I have not yet investigated.

I won't consider 2 + 3 blockers for this PR, just want to flag them here since both will need to be addressed before BSDs can be considered supported. In the meantime, I'm happy to keep it as "experimental" with some known sync failures.

Utterly reasonable.

This looks good from my end though.

@bogwonch bogwonch requested a review from samschott June 14, 2022 09:59
Copy link
Owner

@samschott samschott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few minor comments inline but in general this looks good to me.

A question regarding the tests: When you say that they consistently pass on your VMs, did you also run the integration tests in tests/linked/integration? I'm asking because some of those consistently fail on GitHub actions when using the kqueue backend, as I would expect from looking at how it's implemented.

@@ -69,6 +70,7 @@ class FileStatus(Enum):
# platform detection
IS_MACOS = platform.system() == "Darwin"
IS_LINUX = platform.system() == "Linux"
IS_BSD = platform.system() == "OpenBSD" or platform.system() == "FreeBSD" or platform.system() == "NetBSD"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A membership check would be more elegant here:

IS_BSD = platform.system() in {"OpenBSD", "FreeBSD", "NetBSD"}

elif IS_BSD:
fmt = "qqihh"
pid_index = 2
flock = struct.pack(fmt, 0, 0, 0, fcntl.F_WRLCK, 0)
Copy link
Owner

@samschott samschott Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something or this exactly the same as on macOS (which is BSD-based)? If yes, let's just have a single IS_MACOS or IS_BSD check.

@@ -538,6 +541,7 @@ def start_maestral_daemon_process(
cc = quote(config_name).strip("'")

script = f'import maestral.daemon; maestral.daemon.start_maestral_daemon("{cc}")'

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Remove empty line.

@@ -9,6 +9,8 @@
from os import path as osp
from typing import Optional

from maestral.constants import IS_BSD

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Remove empty line.

@@ -9,6 +9,8 @@
from os import path as osp
from typing import Optional

from maestral.constants import IS_BSD
Copy link
Owner

@samschott samschott Jun 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite remember why I did not import from maestral.constants in the first place here, maybe there was a circular dependency somewhere before this module was moved.

Could you adapt the other checks accordingly with IS_LINUX and IS_MACOS? Also, please change to a relative import from ..constants import ... for consistency and to make the life of static type checkers easier.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... well the reason seems to be that the test_*_dirs tests seem to break if you do that!

I shall put it all back to platform.system() as before.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. That test monkey patches platform.system() and would need to be adapted to monkey patch maestral.constants instead.

@samschott
Copy link
Owner

Also, apologies about the noise from the failing tests. There was a small bug in the lock mechanism that synchronises access to the test accounts used for integration tests. As a result, tests would fail immediately instead of waiting until they could acquire the lock. This is fixed now in the master branch.

@bogwonch
Copy link
Author

I've left a few minor comments inline but in general this looks good to me.

Cool, I'll fix up the nits when I have a free moment.

A question regarding the tests: When you say that they consistently pass on your VMs, did you also run the integration tests in tests/linked/integration? I'm asking because some of those consistently fail on GitHub actions when using the kqueue backend, as I would expect from looking at how it's implemented.

Aha... no I'm just running pytest from the main folder... are there more I need to run?

@samschott
Copy link
Owner

Aha... no I'm just running pytest from the main folder... are there more I need to run?

In principle, all tests should run when running the pytest tests command from the main folder. However, the tests in tests/linked will be skipped unless you have a DROPBOX_REFRESH_TOKEN or DROPBOX_ACCESS_TOKEN set in the environment for the tests to use. Those should be tokens for a Dropbox account with no data that you care about, its entire content will be wiped before and after each test case.

See https://github.com/samschott/maestral/blob/master/CONTRIBUTING.md#tests for more information.

@bogwonch
Copy link
Author

Aha I thought I'd been running those (with my actual Dropbox account... though it hasn't been being erased) evidently I'm using the wrong refresh token!

I thought I'd been following those instructions, but the way to get a token seems incorrect...

AttributeError: 'DropboxClient' object has no attribute 'auth'

Instead of:

print(m.client.auth.refresh_token)

I think it should be:

print(m.client._cred_storage.token)

That said, I don't think the tests had been running before?
Anyway I'm re-running the tests now and pytest is running the integration tests now if things pass I'll fix up issues and kick back.

Cheers for the help!

@samschott
Copy link
Owner

Yes, apologies, I've neglected to update those instructions for some time, the API has changes a bit. I've also added a warning now about using a separate account for testing!

Anyway I'm re-running the tests now and pytest is running the integration tests now if things pass I'll fix up issues and kick back.

I've had a brief look at the test failures on GitHub actions and they will be quite difficult to fix -- and fixes might need to be applied to watchdog instead of Maestral. That's probably best left for a separate PR.

@samschott samschott mentioned this pull request Jun 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants