Add support for dependency caching #2383
base: main
Conversation
Looks good so far, some initial comments:
```typescript
interface CacheConfig {
  paths: string[];
  hash: string[];
```
Maybe `key` is more descriptive? Consider adding a short doc comment to explain what these fields do.
I have added docs to this, but I have left it as `hash` for now. `key` may be misleading, since the cache key is comprised of more than just the hash calculated from the files that match these patterns. Perhaps `hashFilePatterns` would be better?
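To illustrate the naming concern, the full cache key might be composed from several inputs beyond the file hash. This is a hypothetical sketch: the key format, prefix, and component names below are assumptions for illustration, not the action's actual code.

```typescript
// Hypothetical sketch of cache-key composition. Only the idea that the
// key contains more than the file hash comes from the discussion; the
// specific components here are illustrative assumptions.
function makeCacheKey(
  language: string,
  codeqlVersion: string,
  filesHash: string,
): string {
  return `codeql-dependencies-${language}-${codeqlVersion}-${filesHash}`;
}
```

Since the patterns in `hash` contribute only the final component of such a key, a name like `hashFilePatterns` would make the field's narrower role clearer.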
No strong feelings; I think with the doc, `hash` is fine.
```typescript
const cacheConfig = CODEQL_DEFAULT_CACHE_CONFIG[language];

if (cacheConfig === undefined) {
  logger.info(
```
These should probably be debug statements before we merge this PR.
```typescript
// with an empty string.
const globber = await makeGlobber(cacheConfig.hash);

if ((await globber.glob()).length === 0) {
```
Does this mean we list files twice (once here and once in `cacheKey`)? Might not be a performance problem in practice.
Yes, it does. I looked at this before your review but, unfortunately, the `hashFiles` implementation in `@actions/glob` isn't exposed in a way that lets us just throw an existing array of paths at it, so we'd have to copy the implementation (or a variant of it, depending on how much we care about the intricacies of theirs).
```typescript
  continue;
}

const size = await getTotalCacheSize(cacheConfig.paths, logger);
```
If the size is very small, should we skip the upload?
- Pro for skipping: for small sizes, caching may slow down the run
- Pro for uploading: this will let us deal better with registry outages or network issues
Whichever way we choose, let's document why.
Another consideration: if the size is very large, should we also skip? Larger caches are more likely to push out other cache entries. Another thing we could do is look at the existing cache usage and selectively upload based on that. This is something we can address in a followup PR.
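If we did decide to gate the upload on size, it could look like the following sketch. The threshold values are purely illustrative assumptions; the PR does not currently implement any size-based skipping.

```typescript
// Hypothetical thresholds for the trade-off discussed above.
const MIN_CACHE_SIZE_BYTES = 1024; // below this, caching overhead may outweigh the benefit
const MAX_CACHE_SIZE_BYTES = 5 * 1024 ** 3; // above this, we risk evicting other cache entries

// Decide whether a cache of the given size is worth uploading.
function shouldUploadCache(sizeBytes: number): boolean {
  return sizeBytes >= MIN_CACHE_SIZE_BYTES && sizeBytes <= MAX_CACHE_SIZE_BYTES;
}
```

Whatever thresholds (if any) are chosen, a comment next to them explaining the reasoning would address the "let's document why" point.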
I should have noted this in the description, but I have left this kind of thing out of the implementation for now: while we are in the early stages of testing, I want to stick to a minimal viable implementation. I agree that we should look at this before we merge or ship anything, and your points are good.
> If the size is very small, should we skip the upload?
My feeling is that we probably shouldn't skip the upload if the cache is small. As you say, the benefit of dealing with third-party outages is useful, and storing/restoring a small cache should be fairly quick.
> Another consideration: if the size is very large, should we also skip? Larger caches are more likely to push out other cache entries. Another thing we could do is look at the existing cache usage and selectively upload based on that. This is something we can address in a followup PR.
Agreed. I think I touched on this in the EDR as well.
04e457d to c07c5b0
Work-in-progress to add support for dependency caching to the `init` Action.

Merge / deployment checklist