
Use global lock instead of NewExclusivePool to allow distributed lock between multiple Gitea instances #31813

Merged
merged 18 commits into go-gitea:main from lunny/lock_abstract2 on Sep 6, 2024

Conversation

@lunny (Member) commented Aug 9, 2024

Replace #26486
Fix #19620

@lunny lunny added the type/feature Completely new functionality. Can only be merged if feature freeze is not active. label Aug 9, 2024
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Aug 9, 2024
@pull-request-size pull-request-size bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 9, 2024
@lunny lunny added the docs-update-needed The document needs to be updated synchronously label Aug 9, 2024
@github-actions github-actions bot added modifies/go Pull requests that update Go code modifies/dependencies labels Aug 9, 2024
@delvh (Member) commented Aug 12, 2024

What is needed to make this PR ready for review?

@lunny (Member, Author) commented Aug 13, 2024

> What is needed to make this PR ready for review?

It should at least pass the tests.

@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 18, 2024
@lunny lunny marked this pull request as ready for review August 19, 2024 03:44
@lunny lunny added this to the 1.23.0 milestone Aug 19, 2024
@wolfogre (Member) left a comment


TBH, the implementations of lock don't LGTM.

@pull-request-size pull-request-size bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 19, 2024
@lunny lunny marked this pull request as draft August 20, 2024 05:01
@lunny (Member, Author) commented Aug 23, 2024

Depend on #31908

wolfogre added a commit that referenced this pull request Aug 26, 2024
To help #31813 but not replace it, since this PR only introduces the
new module and leaves some work unfinished:

- New option in settings. `#31813` has done it.
- Use the locks in business logic. `#31813` has done it.

So I think the most efficient way is to merge this PR first (if it's
acceptable) and then finish #31813.

## Design principles

### Use spinlock even in memory implementation

In actual use cases, users may cancel requests. `sync.Mutex` blocks the
goroutine until the lock is acquired, even if the request has been
canceled. A spinlock suits this scenario better, since it makes it
possible to give up the acquisition.

Although the spinlock consumes more CPU resources, I think it's
acceptable in most cases.
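As an illustration, a context-aware acquisition loop can give up when the request is canceled, which `sync.Mutex.Lock` cannot. A minimal sketch, assuming a hypothetical `tryLock` callback that reports whether the in-memory lock was acquired (not the PR's actual code):

```go
package globallock

import (
	"context"
	"time"
)

// lockWithSpin polls tryLock until it succeeds or ctx is canceled,
// so a canceled request gives up instead of blocking forever the
// way sync.Mutex.Lock would.
func lockWithSpin(ctx context.Context, tryLock func() bool) error {
	ticker := time.NewTicker(10 * time.Millisecond) // assumed retry interval
	defer ticker.Stop()
	for {
		if tryLock() {
			return nil // acquired
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // request canceled: stop spinning
		case <-ticker.C:
			// retry
		}
	}
}
```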

### Do not expose the mutex to callers

If we expose the mutex to callers, it's possible for callers to reuse
the mutex, which causes more complexity.

For example:
```go
lock := GetLocker(key)
lock.Lock()
// ...
// even if the lock is unlocked, we cannot GC the lock,
// since the caller may still use it again.
lock.Unlock()
lock.Lock()
// ...
lock.Unlock()

// callers have to GC the lock manually.
RemoveLocker(key)
```

That's why
#31813 (comment)

In this PR, we only expose `ReleaseFunc` to callers. So callers just
need to call `ReleaseFunc` to release the lock, and do not need to care
about the lock's lifecycle.
```go
_, release, err := locker.Lock(ctx, key)
if err != nil {
    return err
}
// ...
release()

// if callers want to lock again, they have to re-acquire the lock.
_, release, err = locker.Lock(ctx, key)
// ...
```

In this way, it's also much easier for redis implementation to extend
the mutex automatically, so that callers do not need to care about the
lock's lifecycle. See also
#31813 (comment)

### Use "release" instead of "unlock"

For "unlock", it has the meaning of "unlock an acquired lock". So it's
not acceptable to call "unlock" when failed to acquire the lock, or call
"unlock" multiple times. It causes more complexity for callers to decide
whether to call "unlock" or not.

So we use "release" instead of "unlock" to make it clear. Whether the
lock is acquired or not, callers can always call "release", and it's
also safe to call "release" multiple times.

That said, the code does expect callers to call "release" after
acquiring the lock; forgetting to do so will leak resources. That is
exactly why "release" is always safe to call without extra checks: so
that callers can call it unconditionally and are less likely to forget
it.
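A minimal sketch of how a release callback can be made idempotent and safe to call unconditionally; the `sync.Once` approach and the names here are assumptions, not the PR's actual code:

```go
package globallock

import "sync"

// newRelease wraps the real unlock so that repeated calls run unlock
// exactly once, which is what makes "release" safe to call
// unconditionally.
func newRelease(unlock func()) func() {
	var once sync.Once
	return func() {
		once.Do(unlock)
	}
}

// noopRelease is handed out when acquisition fails, so callers can
// still call "release" after a failed Lock.
func noopRelease() {}
```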

### Acquired locks could be lost

Unlike `sync.Mutex` which will be locked forever once acquired until
calling `Unlock`, in the new module, the acquired lock could be lost.

For example, a caller acquires the lock and holds it for a long time,
since the redis implementation keeps auto-extending it. If the caller
then loses its connection to the redis server, it becomes impossible to
extend the lock anymore.

If the caller doesn't stop what it's doing, another instance that can
still reach the redis server could acquire the same lock and do the
same work, which could cause data inconsistency.

So the caller should know what happened, the solution is to return a new
context which will be canceled if the lock is lost or released:

```go
ctx, release, err := locker.Lock(ctx, key)
if err != nil {
    return err
}
defer release()
// ...
DoSomething(ctx)

// the lock is lost now, then ctx has been canceled.

// Failed, since ctx has been canceled.
DoSomethingElse(ctx)
```
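One way such a cancel-on-loss context could be wired up, as a sketch; the background extend loop, its interval, and the function names are illustrative, not the PR's actual redis implementation:

```go
package globallock

import (
	"context"
	"time"
)

// lockWatchdog returns a context that is canceled when the lock is
// released or when a periodic extension of the distributed lock
// fails, i.e. when the lock may have been lost.
func lockWatchdog(ctx context.Context, extend func() error) (context.Context, func()) {
	lockCtx, cancel := context.WithCancel(ctx)
	go func() {
		ticker := time.NewTicker(5 * time.Second) // assumed extend interval
		defer ticker.Stop()
		for {
			select {
			case <-lockCtx.Done():
				return // released or parent canceled
			case <-ticker.C:
				if err := extend(); err != nil {
					cancel() // lock lost: surface it to the caller via ctx
					return
				}
			}
		}
	}()
	return lockCtx, cancel // cancel doubles as the release func here
}
```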

### Multiple ways to use the lock

1. Regular way

```go
ctx, release, err := Lock(ctx, key)
if err != nil {
    return err
}
defer release()
// ...
```

2. Early release

```go
ctx, release, err := Lock(ctx, key)
if err != nil {
    return err
}
defer release()
// ...
// release the lock earlier and reset the context back
ctx = release()
// continue to do something else
// ...
```

3. Functional way

```go
if err := LockAndDo(ctx, key, func(ctx context.Context) error {
    // ...
    return nil
}); err != nil {
    return err
}
```
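The functional form can be a thin wrapper over the regular one; a minimal sketch, assuming `ReleaseFunc` and `Lock` have the shapes used in the examples above:

```go
package globallock

import "context"

// ReleaseFunc releases the lock and returns the original context,
// matching the "early release" example above.
type ReleaseFunc func() context.Context

// Lock is assumed to have the signature used in the examples above.
var Lock func(ctx context.Context, key string) (context.Context, ReleaseFunc, error)

// LockAndDo acquires the lock, runs f with the lock-aware context,
// and always releases afterwards, so callers cannot forget the
// release step.
func LockAndDo(ctx context.Context, key string, f func(context.Context) error) error {
	ctx, release, err := Lock(ctx, key)
	if err != nil {
		return err
	}
	defer release()
	return f(ctx)
}
```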
@lunny lunny changed the title Use an abstract lock layer to allow distributed lock between multiple Gitea instances Use global lock instead of NewExclusivePool to allow distributed lock between multiple Gitea instances Aug 26, 2024
@lunny lunny marked this pull request as ready for review August 26, 2024 17:39
@wolfogre wolfogre mentioned this pull request Aug 28, 2024
@wolfogre (Member) commented:

Please review #31933.

lunny pushed a commit that referenced this pull request Aug 29, 2024
Follows #31908. The main refactor is that it removes the context
returned by `Lock`.

In the old code, the context returned by `Lock` provided a way to let
callers know that they had lost the lock. But in most cases, callers
shouldn't cancel what they are doing even if the lock has been lost,
and the design would confuse developers and lead them to use it
incorrectly.

See the discussion history:
#31813 (comment) and
#31813 (comment)

It's a breaking change, but since the new module hasn't been used yet, I
think it's OK to not add the `pr/breaking` label.

## Design principles

It's almost copied from #31908, but with some changes.

### Use spinlock even in memory implementation (unchanged)

In actual use cases, users may cancel requests. `sync.Mutex` blocks the
goroutine until the lock is acquired, even if the request has been
canceled. A spinlock suits this scenario better, since it makes it
possible to give up the acquisition.

Although the spinlock consumes more CPU resources, I think it's
acceptable in most cases.

### Do not expose the mutex to callers (unchanged)

If we expose the mutex to callers, it's possible for callers to reuse
the mutex, which causes more complexity.

For example:
```go
lock := GetLocker(key)
lock.Lock()
// ...
// even if the lock is unlocked, we cannot GC the lock,
// since the caller may still use it again.
lock.Unlock()
lock.Lock()
// ...
lock.Unlock()

// callers have to GC the lock manually.
RemoveLocker(key)
```

That's why
#31813 (comment)

In this PR, we only expose `ReleaseFunc` to callers. So callers just
need to call `ReleaseFunc` to release the lock, and do not need to care
about the lock's lifecycle.
```go
release, err := locker.Lock(ctx, key)
if err != nil {
    return err
}
// ...
release()

// if callers want to lock again, they have to re-acquire the lock.
release, err = locker.Lock(ctx, key)
// ...
```

In this way, it's also much easier for redis implementation to extend
the mutex automatically, so that callers do not need to care about the
lock's lifecycle. See also
#31813 (comment)

### Use "release" instead of "unlock" (unchanged)

For "unlock", it has the meaning of "unlock an acquired lock". So it's
not acceptable to call "unlock" when failed to acquire the lock, or call
"unlock" multiple times. It causes more complexity for callers to decide
whether to call "unlock" or not.

So we use "release" instead of "unlock" to make it clear. Whether the
lock is acquired or not, callers can always call "release", and it's
also safe to call "release" multiple times.

That said, the code does expect callers to call "release" after
acquiring the lock; forgetting to do so will leak resources. That is
exactly why "release" is always safe to call without extra checks: so
that callers can call it unconditionally and are less likely to forget
it.

### Acquired locks could be lost, but the callers shouldn't stop

Unlike `sync.Mutex` which will be locked forever once acquired until
calling `Unlock`, for distributed lock, the acquired lock could be lost.

For example, a caller acquires the lock and holds it for a long time,
since the redis implementation keeps auto-extending it. If the caller
then loses its connection to the redis server, it becomes impossible to
extend the lock anymore.

In #31908, the lock cancels the context to stop the operation, but
that's not safe: many operations are not revertible, and if they are
interrupted midway, the instance ends up corrupted. So `Lock` no longer
returns a `ctx` in this PR.
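A sketch of the revised shape after this change; `TryLock` is an assumed companion method, and the names are illustrative rather than the exact merged code:

```go
package globallock

import "context"

// ReleaseFunc releases an acquired lock. It is safe to call multiple
// times, and safe to call even when acquisition failed.
type ReleaseFunc func()

// Locker is the revised API: Lock blocks until the lock for key is
// acquired or ctx is canceled, and returns only a release func. No
// derived context is returned, so a lost lock never interrupts
// in-flight work that cannot be reverted.
type Locker interface {
	Lock(ctx context.Context, key string) (ReleaseFunc, error)
	// TryLock returns immediately; the bool reports whether the
	// lock was acquired.
	TryLock(ctx context.Context, key string) (bool, ReleaseFunc, error)
}
```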

### Multiple ways to use the lock

1. Regular way

```go
release, err := Lock(ctx, key)
if err != nil {
    return err
}
defer release()
// ...
```

2. Early release

```go
release, err := Lock(ctx, key)
if err != nil {
    return err
}
defer release()
// ...
// release the lock earlier
release()
// continue to do something else
// ...
```

3. Functional way

```go
if err := LockAndDo(ctx, key, func(ctx context.Context) error {
    // ...
    return nil
}); err != nil {
    return err
}
```
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Aug 30, 2024
@lunny lunny mentioned this pull request Sep 3, 2024
8 tasks
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Sep 6, 2024
@lafriks lafriks enabled auto-merge (squash) September 6, 2024 09:45
@lafriks lafriks merged commit 2da2000 into go-gitea:main Sep 6, 2024
26 checks passed
@lunny lunny deleted the lunny/lock_abstract2 branch September 6, 2024 14:02
zjjhot added a commit to zjjhot/gitea that referenced this pull request Sep 9, 2024
* giteaofficial/main:
  [skip ci] Updated licenses and gitignores
  [skip ci] Updated translations via Crowdin
  Remove SHA1 for support for ssh rsa signing (go-gitea#31857)
  Upgrade cache to v0.2.1 (go-gitea#32003)
  Add automatic light/dark option for the colorblind theme (go-gitea#31997)
  [skip ci] Updated translations via Crowdin
  Use global lock instead of NewExclusivePool to allow distributed lock between multiple Gitea instances (go-gitea#31813)
  Use forum.gitea.com instead of old URL (go-gitea#31989)
  Distinguish official vs non-official reviews, add tool tips, and upgr… (go-gitea#31924)
lunny added a commit that referenced this pull request Sep 21, 2024
Use globallock for maven package uploads.

Thanks @tlusser for the test code.

Depends on ~#31813~
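For illustration, serializing an upload with the new module might look like this sketch; the key format and function name here are hypothetical, not the actual Gitea code:

```go
package maven

import (
	"context"
	"fmt"

	"code.gitea.io/gitea/modules/globallock"
)

// uploadPackageFile serializes concurrent uploads of the same package
// across all Gitea instances with a per-package global lock.
func uploadPackageFile(ctx context.Context, ownerID int64, pkg string) error {
	key := fmt.Sprintf("maven:%d:%s", ownerID, pkg) // hypothetical key format
	return globallock.LockAndDo(ctx, key, func(ctx context.Context) error {
		// ... store the file and update the package metadata here ...
		return nil
	})
}
```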