Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimizations and caching #698

Open
wants to merge 30 commits into
base: master
Choose a base branch
from

Conversation

oprypkhantc
Copy link
Contributor

@oprypkhantc oprypkhantc commented Aug 26, 2024

Closes #671

There a few things I focused in this PR. It might seem a bit overcomplicated for what it is, but as I said in the comment to the issue, it's harder than I anticipated. These were my primary goals for a development environment:

  1. make sure finding files in the codebase was only ever done once per "boot"
  2. make sure to never autoload/reflect classes which haven't changed since the last scan
  3. make sure cache is invalidated in parts, i.e. if ClassA changes, then only the cache related to that class is invalidated to avoid the "full scan"
  4. make sure adding a "wide" namespace (i.e. the entire project namespace) is only expensive for the first run and isn't for all consecutive runs

These may seem a bit too much, but unfortunately filesystems in containerized environments on non-Linux hosts are slow as hell. Even with recent improvements that Docker has done for MacOS, it's still slower than native Linux environments by orders of magnitude (2400ms versus 100ms on my Mac), which is why I feel that all of the above optimizations were necessary for a good DX.

Still, I attempted to hide as much of that complexity as possible behind a ClassFinder interface. All of it is contained within one namespace, and if this ever becomes irrelevant or too much to handle - it can just as easily be removed.

Namespaces and class finding

As proposed in the issue, addControllerNamespace() and addTypeNamespace() were combined into a single addNamespace(). Deprecations were added too. This allowed simplifying and combining a lot of the code that used to handle caching of different namespaces. Now there's a single ClassFinder, global for the entire SchemaFactory , and other classes don't need to think about multiple namespaces or caching.

devMode(), prodMode() and globTTL

There used to be a setting called globTTL, which controlled the expiration of glob() caches. This is no longer relevant and hence was removed:

  • devMode() enables filemtime() based caching, as well as setting a very small expiration on "computed" caches (that is - caches that can't be invalidated in parts based on file changes, only entirely)
  • prodMode() disables filemtime() based caching, as well as setting an infinite expiration on "computed" caches. filemtime() is very performant and even has built-in cache, but is still unnecessary on production

ClassFinderComputedCache

That's a new type of cache that I used in all places where ClassFinder used to be iterated directly previously. It has a single function:

public function compute(
    ClassFinder $classFinder,
    string $key,
    callable $map,
    callable $reduce,
): mixed;

The idea is that it's responsible for iterating over the ClassFinder instance you provide. Initially, when there's no cache, it will iterate over all found classes and call $map($classReflection) on every found reflection, and remember the results in the "entries" cache, keyed by the filename. This way we know which entries relate to which files, and we can invalidate them per file when only some files change.

Then, $reduce is called to combine the resulting entries map into a result - usually some kind of cache object. This also allows skipping the "invalidate them per file" part on production, where individual entries aren't stored, but rather the result of $reduce is cached as-is.

Result

The schema introspection time (from the ground up, without cache) reduced from 60s+ to 15s, and allowed adding/changing/removing types and controllers without resetting the whole cache - which now takes <5s.

# Conflicts:
#	tests/Mappers/Parameters/TypeMapperTest.php
# Conflicts:
#	src/FactoryContext.php
#	src/Mappers/ClassFinderTypeMapper.php
#	src/Mappers/GlobTypeMapper.php
#	src/Mappers/StaticClassListTypeMapper.php
#	src/Mappers/StaticClassListTypeMapperFactory.php
#	src/Reflection/CachedDocBlockFactory.php
#	src/SchemaFactory.php
#	tests/AbstractQueryProvider.php
#	tests/FieldsBuilderTest.php
#	tests/Mappers/GlobTypeMapperTest.php
#	tests/Mappers/Parameters/TypeMapperTest.php
#	tests/Mappers/RecursiveTypeMapperTest.php
#	tests/SchemaFactoryTest.php
#	website/docs/other-frameworks.mdx
@oojacoboo
Copy link
Collaborator

So, we're basically building a lot of complexity and cache clearing logic for development environments, that could be solved by just caching the autoload in dev and clearing when editing your vendor packages? Is that right? I'm just not sure I see why it's such an issue to write a command to clear your composer autoload when you add a new vendor package. What am I missing?

@oprypkhantc
Copy link
Contributor Author

oprypkhantc commented Aug 27, 2024

No, not really. This has nothing to do with vendor or autoload at all.

In small-medium sized codebases, you may have a dedicated GraphQL namespace that only contains types and controllers, that's small and quick to load. In our case, however, there's no such thing. As the application is modular, GraphQLite annotated files may be contained in multiple namespaces. So we have to supply the entire application namespace to SchemaFactory:

$schemaFactory->addNamespace('App')

Now, in a production environment GraphQLite just caches everything with a large TTL, so scanning the entire codebase isn't a concern. It's done once, and never thereafter.

In dev environments, however, this is a massive pain. GraphQLite does 5+ filesystem scans (in search of classes), each of which takes 100ms in native FS environments (Linux or non-Dockerized Mac/Windows) and 2500-3500ms in Docker for Mac in our case. So 10-15 secs are wasted just to scan the project for files.

Also, because of a short cache TTL, basically each request starts without cache, meaning GraphQLite has to process thousands of classes and extract type/controllers from them on every single request. This could, of course, be solved with a larger cache TTL, but then developers would have to be manually resetting the cache every time they make a change in the application's code.

So this PR aims to solve exactly those two last points:

  • make sure the filesystem is only ever scanned once per application lifetime, instead of 5+ times
  • drop the "low cache ttl for dev" concept, and instead start invalidating cache in parts. E.g. if one file FileA.php out of 1000 changes, then only parts of the cache generated from that file are invalidated

The first point addressed by combining "controller" and "type" namespaces together, and making sure all of GraphQLite uses ClassFinder, so it can be effectively cached internally. The second point is addressed by ClassFinderComputedCache, which internally handles the partial cache invalidations automatically. On production, no partial invalidations happen though; it works the same way it used to be.

There's generally no added complexity, aside from the Discovery\Cache namespace, but I'll make sure to have it covered with tests.

@codecov-commenter
Copy link

codecov-commenter commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 94.34524% with 19 lines in your changes missing coverage. Please review.

Project coverage is 95.16%. Comparing base (53f9d49) to head (2bc9ecf).
Report is 88 commits behind head on master.

Files with missing lines Patch % Lines
src/SchemaFactory.php 74.19% 16 Missing ⚠️
src/Cache/FilesSnapshot.php 96.66% 1 Missing ⚠️
src/Mappers/GlobAnnotationsCache.php 50.00% 1 Missing ⚠️
src/Mappers/GlobTypeMapperCache.php 83.33% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master     #698      +/-   ##
============================================
- Coverage     95.72%   95.16%   -0.56%     
- Complexity     1773     1818      +45     
============================================
  Files           154      174      +20     
  Lines          4586     4781     +195     
============================================
+ Hits           4390     4550     +160     
- Misses          196      231      +35     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@oprypkhantc oprypkhantc marked this pull request as ready for review September 2, 2024 20:00
@oprypkhantc
Copy link
Contributor Author

@oojacoboo I've added missing tests and simplified some things where possible. I've also attempted to extract parts of this PR as separate PRs, but it's all interconnected so it's a bit hard to do. I will if needed though.

Because the PR touches a lot of things, here's the list of changes:

Class finding:

  • addControllerNamespace and addTypeNamespace were combined into one addNamespace. This is both simpler for end users, faster in terms of performance and easier in terms of code support
  • some code was already using FinderInterface from kcs/class-finder, while other places relied on custom code from NS class. Both were meant for the same thing, so I removed the NS code and added an interface ClassFinder. I didn't want the whole GraphQLite to be dependent on kcs/class-finder, and it also allowed doing more simplifications
  • StaticClassListTypeMapper was removed, since it was essentially just GlobTypeMapper but used array of classes instead of NS. Now you can just do new ClassFinderTypeMapper(new StaticClassFinder([Class1::class]))
  • AbstractTypeMapper and GlobTypeMapper became just ClassFinderTypeMapper, as these were practically the same thing. It now simply accepts a ClassFinder as constructor argument, so you can use it with both a static class list (new StaticClassFinder($classes)) and with an automated class finder.
  • GlobControllerQueryProvider now also just accepts a ClassFinder

Class bound cache:

  • when the dependency on thecodingmachine/cache-utils was removed, it's code wasn't fully transferred here. Namely, the ClassBoundCache would no longer track file modification times as it used to. This meant that if a class changed in a dev environment, it's respective cache wouldn't be invalidated automatically. I fixed it by adding SnapshotClassBoundCache - it now invalidates caches if files changes as it used to

Docblocks:

  • caching was normalized - there were places where caching wasn't even used; now everything uses a DocBlockFactory interface from GraphQLite
  • caching was also greatly simplified, as it now just uses class bound cache to do the same thing it did manually
  • in production, it would still look at file modification times and try to invalidate the cache. This no longer happens

Class finder computed cache:

  • this replaced basically all caches in GraphQLite. In development (devMode()), instead of caching the final result of something (for example the GlobTypeMapperCache which holds mapping of all type names to all class names), GraphQLite now caches individual bits keyed by filename. When a file changes, only parts of the cache are invalidated, while the rest of the cache stays valid
  • in production, the final result is still cached exactly as it used to by HardClassFinderComputedCache. It doesn't track file changes nor do anything complicated here.

@Lappihuan
Copy link
Contributor

@oprypkhantc this is amazing!
Am i understanding this correctly, that this also means while debugging graphqlite there will be way less noise from the multiple rescans and what not?
Its been a while but it from what i remember it made it quite annoying to work with

@oprypkhantc
Copy link
Contributor Author

@oprypkhantc this is amazing! Am i understanding this correctly, that this also means while debugging graphqlite there will be way less noise from the multiple rescans and what not? Its been a while but it from what i remember it made it quite annoying to work with

Thank you 😄 And yes, it should be way less unnecessary stuff being scanned and parsed. Ideally none xd

@oojacoboo
Copy link
Collaborator

oojacoboo commented Sep 12, 2024

@oprypkhantc so, looks like we were setting the GlobTTL explicitly, which has been improving performance on our end in development. But, after testing this PR, I'm seeing some additional performance improvements, so that's great.

Can you try upgrading node for the docs? See if that fixes the tests without other issues. Also, this is going to be a BC break, so it looks like we'll target 7.1. The website/docs/CHANGELOG.md needs updating as well.

@Lappihuan
Copy link
Contributor

We just wanted to upgrade but were stopped by this.
Should this PR also solve those performance issues?

@oprypkhantc
Copy link
Contributor Author

@Lappihuan It would definitely improve on the situation a lot, since the filesystem scan would only ever be done once in a request's lifespan, which should (?) fix the issue, but I can't say for sure. There are things where kcs/class-finder could be improved further, but nothing major as far as my profiling went. And I don't think there's anything that would require further changes in graphqlite itself.

@oprypkhantc
Copy link
Contributor Author

Great @oojacoboo :) I've fixed the docs and added a changelog entry.

@Lappihuan
Copy link
Contributor

Is the only bc break the namespace configuration?
I'd like to move this forward as this is a awful situation the bundle is in right now.

@oprypkhantc
Copy link
Contributor Author

@Lappihuan Also the setGlobTTL() of those that are likely to affect users.

Namespaces aren't truly a breaking change though - I've marked both methods as deprecated, but they they still exist

@Lappihuan
Copy link
Contributor

Can you try upgrading node for the docs? See if that fixes the tests without other issues. Also, this is going to be a BC break, so it looks like we'll target 7.1. The website/docs/CHANGELOG.md needs updating as well.

So this is not the case and it doesnt need to wait for 7.1

@oprypkhantc
Copy link
Contributor Author

Well it's up to you. Technically there are still breaking changes, just that they're not likely to affect many users.

@Lappihuan
Copy link
Contributor

could setGlobTTL be kept with a deprecation notice?

@oprypkhantc
Copy link
Contributor Author

Sure thing, I added it back. It just sets either ->devMode() or ->prodMode() depending on the TTL.

@oojacoboo
Copy link
Collaborator

oojacoboo commented Sep 13, 2024

@Lappihuan what's the issue with targeting 7.1? This whole PR is a pretty sizable update with a higher likelihood of breaks.

@oojacoboo
Copy link
Collaborator

oojacoboo commented Sep 13, 2024

@oprypkhantc Can you please review the new ClassMapFinder that @alekitto added (SchemaFactory example), to see if makes sense to include/replace the existing finder logic you've added in this PR? I agree with @Lappihuan. It probably makes the most sense to merge these two and come up with the best solution as a combined PR.

@oprypkhantc
Copy link
Contributor Author

oprypkhantc commented Sep 13, 2024

@oojacoboo Sure, I took a look. In short, after my PR, alekitto's changes would not really help.

What alekitto did was add a cache of classes that exist in the project, with a TTL. That's only important in this scenario: devMode(), php-fpm, big namespace (>1-2k classes) added using addNamespace() and a slow filesystem (e.g. Docker Mac/Win). The problem is it wouldn't prevent filemtime() checks on every file from the project that devMode() does, which eliminates the possible performance improvement from cache :/

For that specific case, these are the options without sacrificing the developer experience:

  • use a long-running server like Swoole instead of php-fpm
  • use smaller, more specific namespaces with ->addNamespace()
  • use native filesystem (Docker Synchronized file shares; Mutagen; Docker for Linux; or just not use Docker at all)

If none of the options work, and a developer is willing to sacrifice the DX and most improvements from this PR, they can use prodMode() and provide a default cache with a TTL:

$cache = new Psr16Cache(new FilesystemAdapter(defaultLifetime: 15));

$factory = (new SchemaFactory($cache))->prodMode();

This is more or less equivalent to the current setGlobTTL(15), but with additional perf improvements from this PR. Of course this means that they'd have to wait for the cache to expire, the way they used to before this PR. But with or without alekitto's changes - this wouldn't change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a developer friendly caching for larger projects
4 participants