Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorporate Recent Commits #192

Merged
merged 40 commits into from
Aug 6, 2024
Merged

Incorporate Recent Commits #192

merged 40 commits into from
Aug 6, 2024

Conversation

yangky11
Copy link
Member

No description provided.

@yangky11
Copy link
Member Author

Still a draft. Don't merge.

@RexWzh
Copy link
Contributor

RexWzh commented Aug 1, 2024

About the Cache

Here use relative paths in cache.get and cache.store, determining the cache path by LeanGitRepo.format_dirname rather than within the cache functions.

Original Method:

def get(self, url: str, commit: str) -> Optional[Path]:
    _, repo_name = _split_git_url(url)
    dirname = _format_dirname(url, commit)

def store(self, src: Path) -> Path:
    url, commit = get_repo_info(src)
    dirpath = self.cache_dir / _format_dirname(url, commit)
    _, repo_name = _split_git_url(url)

# Cache the traced repo
src_dir = tmp_dir / repo.name
path = cache.store(src_dir)

Modified Version:

def get(self, rel_cache_dir: Path) -> Optional[Path]:
    dirname = rel_cache_dir.parent
    dirpath = self.cache_dir / dirname
    cache_path = self.cache_dir / rel_cache_dir

def store(self, src: Path, rel_cache_dir: Path) -> Path:
    """Store a repo at path ``src``. Return its path in the cache.

    Args:
        src (Path): Path to the repo.
        rel_cache_dir (Path): The relative path of the stored repo in the cache.
    """
    dirpath = self.cache_dir / rel_cache_dir.parent
    cache_path = self.cache_dir / rel_cache_dir

# Cache the traced repo
src_dir = tmp_dir / repo.name
rel_cache_dir = repo.format_dirname / repo.name
path = cache.store(src_dir, rel_cache_dir)

The function get_repo_info in utils.py can be removed.

Moreover, cache.store is used for storing repo cache for non-github repo types. The LeanGitRepo.repo is a git.Repository object, with the working directory in CACHE_DIR/repos.

$HOME/.cache/lean_dojo
├── aesop-79fb157c6a5061190d169535f8e5cb007914a82e
│   └── aesop
├── batteries-e7897807913fafdab31b01b9f627550bcc96cff2
│   └── batteries
...
├── repos
│   ├── gitpython-aesop-79fb157c6a5061190d169535f8e5cb007914a82e
│   ├── gitpython-batteries-e7897807913fafdab31b01b9f627550bcc96cff2

For git repo, it uses a fixed prefix (gitpython) rather than the second-to-last directory name to avoid introducing temporary file names or duplicating the repo cache.


btw, the repo_type_of_url function in utils.py is duplicated and can therefore be deleted.

@RexWzh
Copy link
Contributor

RexWzh commented Aug 1, 2024

A note about the cache that might be helpful for the review process. Feel free to @ me if you get any questions😃



def cleanse_string(s: Union[str, Path]) -> str:
"""Replace : and / with _ in a string."""
return str(s).replace("/", "_").replace(":", "_")


@cache
def _to_commit_hash(repo: Repository, label: str) -> str:
def _to_commit_hash(repo: Union[Repository, Repo], label: str) -> str:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RexWzh The new version of _to_commit_hash no longer caches the results for GitHub repos?

Copy link
Contributor

@RexWzh RexWzh Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I omitted that line when updating the function. You can reintroduce it.

I guess this is why the latest tests take much time.

@yangky11 yangky11 merged commit b9d2115 into main Aug 6, 2024
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants