[WIP] Reset chunks from cache if outdated #8449

Open
wants to merge 2 commits into
base: develop

Conversation

Member

@bsekachev bsekachev commented Sep 17, 2024

Motivation and context

How has this been tested?

Checklist

  • I submit my changes into the develop branch
  • I have created a changelog fragment
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • I have linked related issues (see GitHub docs)
  • I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

Summary by CodeRabbit

  • New Features

    • Enhanced caching mechanism for frame metadata, ensuring up-to-date information and improved memory management.
    • Introduced chunks_updated_date to track when frame chunks were last updated.
  • Bug Fixes

    • Improved consistency in accessing frame data by refreshing the cache when necessary.
  • Documentation

    • Updated documentation to reflect changes in metadata structure and caching behavior.

Contributor

coderabbitai bot commented Sep 17, 2024

Walkthrough

The changes introduce enhancements to the caching mechanism for frame metadata in the application. A new timestamp property tracks the last fetch time of metadata, and a function checks for outdated data, refreshing the cache as necessary. Additionally, modifications to the server proxy functions include returning a new property indicating when frame chunks were last updated. These updates aim to improve data consistency and management across various components of the application.

Changes

| Files | Change Summary |
| --- | --- |
| cvat-core/src/frames.ts | Introduced fetchTimestamp in frameDataCache, added chunksUpdatedDate in FramesMetaData, modified functions to refresh cache if outdated. |
| cvat-core/src/server-proxy.ts | Modified getMeta and saveMeta functions to return an object including chunks_updated_date, indicating the last update time for frame chunks. |
| cvat-core/src/server-response-types.ts | Added chunks_updated_date property to SerializedFramesMetaData interface to enhance metadata structure with a timestamp for frame chunk updates. |

Poem

🐰 In the meadow where frames do play,
A timestamp hops to save the day!
Chunks updated, fresh and bright,
Caching memories, oh what a sight!
With every fetch, our data sings,
In harmony, the rabbit brings! 🐇✨



sonarcloud bot commented Sep 17, 2024

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between e929a3d and 21f6f6d.

Files selected for processing (3)
  • cvat-core/src/frames.ts (9 hunks)
  • cvat-core/src/server-proxy.ts (2 hunks)
  • cvat-core/src/server-response-types.ts (1 hunks)
Additional comments not posted (10)
cvat-core/src/server-response-types.ts (1)

462-462: Addition of chunks_updated_date is appropriate and consistent

The property chunks_updated_date: string; is properly added to the SerializedFramesMetaData interface, following the existing naming conventions and type usage for date properties in this file.

cvat-core/src/frames.ts (7)

19-19: Initialization of fetchTimestamp property is appropriate

The addition of the fetchTimestamp property to frameDataCache will help track when the metadata was last fetched. This is essential for ensuring that outdated cache entries are refreshed timely.


59-59: Addition of chunksUpdatedDate to FramesMetaData

Including chunksUpdatedDate in the FramesMetaData class allows the system to detect changes in chunk definitions effectively.


74-74: Setting chunks_updated_date to undefined in initial data

Ensuring chunks_updated_date is initialized is important for accurate comparisons later in the code.


139-141: Definition of getter for chunksUpdatedDate

The getter method for chunksUpdatedDate correctly retrieves the value from data.chunks_updated_date.
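
As a rough illustration of the getter described above, the sketch below maps a snake_case server field to a camelCase accessor. Only chunks_updated_date and chunksUpdatedDate come from the PR; the class name, the interface name, and the size field are assumptions made for the example and do not mirror the real FramesMetaData class.

```typescript
// Simplified stand-in for the serialized metadata; `size` is illustrative only.
interface SerializedMetaSketch {
    chunks_updated_date: string;
    size: number;
}

// Hypothetical, trimmed-down counterpart of FramesMetaData.
class FramesMetaSketch {
    private readonly data: SerializedMetaSketch;

    constructor(initial: SerializedMetaSketch) {
        // Copy the input so later changes to `initial` do not leak into the instance
        this.data = { ...initial };
    }

    // Exposes the server-side field under a camelCase name
    get chunksUpdatedDate(): string {
        return this.data.chunks_updated_date;
    }
}
```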


507-507: Ensure fetchTimestamp is always initialized

In the calculation of isOutdated, if cached.fetchTimestamp is undefined, it could lead to unexpected results.

Please confirm that fetchTimestamp is always initialized before refreshJobCacheIfOutdated is called. If there is a possibility of it being undefined, consider initializing it with a default value.
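
To make the concern concrete: subtracting undefined from a number yields NaN, and any comparison with NaN is false, so an uninitialized timestamp would make the cache look permanently fresh instead of raising an error. A minimal sketch of a defensive check follows; isCacheOutdated and its parameters are hypothetical names, not code from this PR.

```typescript
// One hour, matching the reload period used in the PR
const META_DATA_RELOAD_PERIOD = 1 * 60 * 60 * 1000;

// Hypothetical helper: the name and signature are assumptions for illustration.
function isCacheOutdated(
    fetchTimestamp: number | undefined,
    now: number = Date.now(),
    period: number = META_DATA_RELOAD_PERIOD,
): boolean {
    // Treat a missing or non-finite timestamp as outdated rather than as fresh
    if (fetchTimestamp === undefined || !Number.isFinite(fetchTimestamp)) {
        return true;
    }
    return (now - fetchTimestamp) > period;
}
```

Whether a missing timestamp should force a refresh or be reported as an error is a design choice; the sketch picks the refresh.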


653-653: Initialization of fetchTimestamp property

fetchTimestamp is correctly initialized with the current timestamp when creating the cache for a job.


709-709: Type annotation for id parameter in getDeletedFrames

Specifying the type of id as number enhances type safety and clarity.

cvat-core/src/server-proxy.ts (2)

1484-1488: Ensure Type Consistency with SerializedFramesMetaData

The getMeta function now returns an object with an additional property chunks_updated_date. This property may not be defined in the SerializedFramesMetaData interface, which could lead to type inconsistencies or TypeScript compilation errors.

Consider updating the SerializedFramesMetaData interface to include the chunks_updated_date property, or adjust the return type to accurately reflect the returned object.

You can run the following script to verify if chunks_updated_date is included in the SerializedFramesMetaData type definition:

Verification successful

chunks_updated_date is Already Defined in SerializedFramesMetaData

The verification process has confirmed that the chunks_updated_date property is already included in the SerializedFramesMetaData interface definition. This property is defined as a string in the interface, which is consistent with the implementation in the getMeta function.

  • Location: cvat-core/src/server-response-types.ts, line 462

The concern about potential type inconsistencies or TypeScript compilation errors due to the addition of chunks_updated_date is not valid. The interface definition already accounts for this property, ensuring type consistency throughout the codebase.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if 'chunks_updated_date' is defined in 'SerializedFramesMetaData' interface.

# Test: Search for the interface definition including 'chunks_updated_date'.
ast-grep --lang typescript --pattern $'interface SerializedFramesMetaData {
  $$$
  chunks_updated_date: $_
  $$$
}'

Length of output: 1255


1505-1509: Ensure Type Consistency with SerializedFramesMetaData

The saveMeta function now returns an object with an additional property chunks_updated_date. Similar to the previous comment, this property may not be declared in the SerializedFramesMetaData interface, potentially causing type inconsistencies or TypeScript compilation errors.

Please update the SerializedFramesMetaData interface to include chunks_updated_date, or adjust the return type accordingly.

You can run the following script to verify if chunks_updated_date is included in the SerializedFramesMetaData type definition:

export function getContextImage(jobID: number, frame: number): Promise<Record<string, ImageBitmap>> {
    return new Promise<Record<string, ImageBitmap>>((resolve, reject) => {
        if (!(jobID in frameDataCache)) {
            reject(new Error(
                'Frame data was not initialized for this job. Try first requesting any frame.',
            ));
        }

Contributor

Consider updating cache refresh logic in getContextImage

As inconsistencies may arise, consider implementing cache refresh logic within getContextImage similar to getFrame.

Add cache validation to ensure context images correspond to the latest frame data.

// In this case (extremely rare, but nevertheless possible) user may get context images related to another frame
// - if cache gets outdated after getFrame() call
// - and before getContextImage() call
// - and both calls refer to the same fram that is refreshed honeypot frame

Contributor

Typographical error in comment

There is a typo in the comment at line 689.

Apply this diff to correct the typo:

-        // - and both calls refer to the same fram that is refreshed honeypot frame
+        // - and both calls refer to the same frame that is refreshed

Also, consider clarifying or removing "honeypot frame" if it's not relevant in this context.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        // - and both calls refer to the same fram that is refreshed honeypot frame
+        // - and both calls refer to the same frame that is refreshed

Comment on lines +677 to +692
// basically the following functions may be affected if job cache is outdated
// - getFrame
// - getContextImage
// - getCachedChunks
// And from this idea we should call refreshJobCacheIfOutdated from each one
// Hovewer, following from the order, these methods are usually called
// it may lead to even more confusing behaviour
//
// Usually user first receives frame, then user receives ranges and finally user receives context images
// In this case (extremely rare, but nevertheless possible) user may get context images related to another frame
// - if cache gets outdated after getFrame() call
// - and before getContextImage() call
// - and both calls refer to the same fram that is refreshed honeypot frame
// Thus, it is better to only call `refreshJobCacheIfOutdated` from getFrame()
await refreshJobCacheIfOutdated(jobID);

Contributor

Potential inconsistency between frame and context image data

The comments discuss the possibility of inconsistencies if refreshJobCacheIfOutdated is not called in getContextImage. To ensure data consistency, it might be better to call refreshJobCacheIfOutdated in both getFrame and getContextImage.

Consider adding await refreshJobCacheIfOutdated(jobID); at the beginning of the getContextImage function to prevent potential inconsistencies.

export function getContextImage(jobID: number, frame: number): Promise<Record<string, ImageBitmap>> {
    return new Promise<Record<string, ImageBitmap>>((resolve, reject) => {
+       await refreshJobCacheIfOutdated(jobID);
        if (!(jobID in frameDataCache)) {
            reject(new Error(
                'Frame data was not initialized for this job. Try first requesting any frame.',
            ));
        }

Committable suggestion was skipped due to low confidence.
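
Note that await is only valid inside an async function, so placing it directly in the non-async Promise executor, as in the diff above, would not compile. A minimal sketch of one way to restructure the call is shown below. Every name ending in "Sketch" is a hypothetical stand-in for cvat-core internals, and plain strings replace ImageBitmap so the example runs outside a browser.

```typescript
// Stand-ins for cvat-core internals; every name ending in "Sketch" is hypothetical.
type ContextImagesSketch = Record<string, string>; // the real code stores ImageBitmap values

interface JobCacheSketch {
    contextCache: Record<number, ContextImagesSketch>;
}

const frameDataCacheSketch: Record<number, JobCacheSketch> = {};
const refreshedJobs: number[] = [];

async function refreshJobCacheIfOutdatedSketch(jobID: number): Promise<void> {
    // The real function would re-fetch metadata and reset caches when needed
    refreshedJobs.push(jobID);
}

async function getContextImageSketch(jobID: number, frame: number): Promise<ContextImagesSketch> {
    const cached = frameDataCacheSketch[jobID];
    if (!cached) {
        throw new Error('Frame data was not initialized for this job. Try first requesting any frame.');
    }
    // `await` is legal here because the enclosing function is async
    await refreshJobCacheIfOutdatedSketch(jobID);
    return cached.contextCache[frame] ?? {};
}
```

The sketch checks for the cache before refreshing, because the refresh function in the PR throws when the cache is missing. Whether the refresh belongs in getContextImage at all is the open question raised by the code comments quoted above.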

Comment on lines +500 to +530
async function refreshJobCacheIfOutdated(jobID: number): Promise<void> {
    const cached = frameDataCache[jobID];
    if (!cached) {
        throw new Error('Frame data cache is abscent');
    }

    const META_DATA_RELOAD_PERIOD = 1 * 60 * 60 * 1000; // 1 hour
    const isOutdated = (Date.now() - cached.fetchTimestamp) > META_DATA_RELOAD_PERIOD;

    if (isOutdated) {
        // get metadata again if outdated
        const meta = await getFramesMeta('job', jobID, true);
        if (new Date(meta.chunksUpdatedDate) > new Date(cached.meta.chunksUpdatedDate)) {
            // chunks were re-defined. Existing data not relevant anymore
            // currently we only re-write meta, remove all cached frames from provider and clear cached context images
            // other parameters (e.g. chunkSize) are not supposed to be changed
            cached.meta = meta;
            cached.provider.cleanup(Number.MAX_SAFE_INTEGER);
            for (const frame of Object.keys(cached.contextCache)) {
                for (const image of Object.values(cached.contextCache[+frame].data)) {
                    // close images to immediate memory release
                    image.close();
                }
            }
            cached.contextCache = {};
        }

        cached.fetchTimestamp = Date.now();
    }
}
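
The context-image cleanup inside the function above can be read in isolation as "close every cached image, then start from an empty cache". The sketch below shows that step with a hypothetical Closable interface standing in for ImageBitmap; the helper name and types are assumptions, not part of the PR.

```typescript
// Anything with a close() method; ImageBitmap satisfies this shape in browsers.
interface Closable {
    close(): void;
}

// Assumed shape of the per-frame context cache, simplified for illustration.
type ContextCacheSketch = Record<number, { data: Record<string, Closable> }>;

// Hypothetical helper: releases every cached image and returns a fresh, empty cache.
function clearContextCacheSketch(cache: ContextCacheSketch): ContextCacheSketch {
    for (const entry of Object.values(cache)) {
        for (const image of Object.values(entry.data)) {
            // Closing releases the underlying memory without waiting for garbage collection
            image.close();
        }
    }
    return {};
}
```

A caller would assign the result back, for example `cached.contextCache = clearContextCacheSketch(cached.contextCache)`.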

Contributor

Typographical error in error message

There is a typo in the error message at line 503.

Apply this diff to correct the typo:

-    throw new Error('Frame data cache is abscent');
+    throw new Error('Frame data cache is absent');
Committable suggestion


Suggested change
async function refreshJobCacheIfOutdated(jobID: number): Promise<void> {
    const cached = frameDataCache[jobID];
    if (!cached) {
        throw new Error('Frame data cache is absent');
    }
    const META_DATA_RELOAD_PERIOD = 1 * 60 * 60 * 1000; // 1 hour
    const isOutdated = (Date.now() - cached.fetchTimestamp) > META_DATA_RELOAD_PERIOD;
    if (isOutdated) {
        // get metadata again if outdated
        const meta = await getFramesMeta('job', jobID, true);
        if (new Date(meta.chunksUpdatedDate) > new Date(cached.meta.chunksUpdatedDate)) {
            // chunks were re-defined. Existing data not relevant anymore
            // currently we only re-write meta, remove all cached frames from provider and clear cached context images
            // other parameters (e.g. chunkSize) are not supposed to be changed
            cached.meta = meta;
            cached.provider.cleanup(Number.MAX_SAFE_INTEGER);
            for (const frame of Object.keys(cached.contextCache)) {
                for (const image of Object.values(cached.contextCache[+frame].data)) {
                    // close images to immediate memory release
                    image.close();
                }
            }
            cached.contextCache = {};
        }
        cached.fetchTimestamp = Date.now();
    }
}

if (isOutdated) {
// get metadata again if outdated
const meta = await getFramesMeta('job', jobID, true);
if (new Date(meta.chunksUpdatedDate) > new Date(cached.meta.chunksUpdatedDate)) {

Contributor

Tip

Codebase Verification

Validate comparison of chunksUpdatedDate - Potential improvement needed

The comparison of meta.chunksUpdatedDate and cached.meta.chunksUpdatedDate is implemented correctly using new Date(). However, there's no explicit validation to ensure these are valid date strings. While the type is declared as string, which is appropriate, consider adding a safeguard against potential Invalid Date issues:

  • In the if statement at line 512, add a check to ensure both dates are valid before comparison:
if (!isNaN(new Date(meta.chunksUpdatedDate).getTime()) && 
    !isNaN(new Date(cached.meta.chunksUpdatedDate).getTime()) &&
    new Date(meta.chunksUpdatedDate) > new Date(cached.meta.chunksUpdatedDate)) {
    // existing code
}

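Note that comparing an Invalid Date with > does not throw: the comparison evaluates to false, so the cache reset is silently skipped. The explicit checks make that case visible and give a natural place to log or handle an invalid chunksUpdatedDate, improving the robustness of the code.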
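
The same validation can be factored into a small helper so that each date string is parsed only once. This is a sketch under assumptions: isNewerDateSketch is a hypothetical name, and treating a missing or unparsable date as "not newer" is one possible policy, not necessarily the one the PR should adopt.

```typescript
// Hypothetical helper, not part of the PR: returns true only when both
// strings parse to valid dates and `incoming` is strictly newer than `cached`.
function isNewerDateSketch(incoming: string | undefined, cached: string | undefined): boolean {
    // Date.parse returns NaN for strings it cannot parse (including the empty string)
    const incomingTime = Date.parse(incoming ?? '');
    const cachedTime = Date.parse(cached ?? '');
    if (Number.isNaN(incomingTime) || Number.isNaN(cachedTime)) {
        return false;
    }
    return incomingTime > cachedTime;
}
```

With such a helper, the condition in refreshJobCacheIfOutdated could read `isNewerDateSketch(meta.chunksUpdatedDate, cached.meta.chunksUpdatedDate)`.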

Analysis chain

Validate comparison of chunksUpdatedDate

Ensure that meta.chunksUpdatedDate and cached.meta.chunksUpdatedDate are valid date strings to avoid potential Invalid Date issues during comparison.

Run the following script to check if chunksUpdatedDate is a valid date string:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if chunksUpdatedDate is a valid date in the meta data

# Test: Extract chunksUpdatedDate and attempt to parse it as a date
# Expect: Both dates should be valid

jq '.chunksUpdatedDate' cvat-core/src/frames.ts | xargs -I {} date -d {}

Length of output: 133


Script:

#!/bin/bash
# Search for usage of chunksUpdatedDate in the TypeScript file
ast-grep --lang typescript --pattern 'new Date($chunksUpdatedDate)' cvat-core/src/frames.ts

# Search for any type assertions or validations related to chunksUpdatedDate
rg "chunksUpdatedDate" cvat-core/src/frames.ts -A 5 -B 5

Length of output: 1515
