Make mining worker delete commit files that have no merge conflict #169

Open
zegabr opened this issue Apr 22, 2023 · 0 comments

zegabr commented Apr 22, 2023

While doing my research on CSDiff, I had to compare many versions of it, meaning I had to run miningframework multiple times.

For a given time interval, the tool downloads every non-fast-forward merge commit in full. Even if a commit has 1 file with a conflict and 100 files without conflicts (fast-forward merges), all 101 files are downloaded. This can easily take up about 30 GB of the device's storage if you run the tool on 10 projects for an interval of 1 month.

In my case, I needed only the files where the results of CSDiff and Diff3 differed. To get only the info I needed with the current implementation, I had to use some workarounds (see this branch).

In summary, I:

  1. created one CSV for each project here
  2. ran miningframework once for each project script
    2.1) deleted every unwanted file using this after each run
  3. created another CSV with the relevant data
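
The cleanup in step 2.1 can be sketched roughly as follows. This is not the script linked above, only a minimal illustration under assumptions: it supposes each merged file gets its own output directory holding both tool results, and the file names `csdiff.java` and `diff3.java` as well as the function name `prune_identical_results` are hypothetical.

```python
import filecmp
import shutil
from pathlib import Path


def prune_identical_results(output_root, first="csdiff.java", second="diff3.java"):
    """Delete every directory under output_root in which both result files
    exist and have identical contents. Returns the removed directories.

    The default file names are assumptions, not miningframework's real layout.
    """
    removed = []
    # sorted() materializes the matches so deletions do not disturb the walk.
    for first_path in sorted(Path(output_root).rglob(first)):
        file_dir = first_path.parent
        second_path = file_dir / second
        if not (first_path.is_file() and second_path.is_file()):
            continue  # incomplete or already removed output: skip it
        # shallow=False compares file contents rather than os.stat signatures.
        if filecmp.cmp(first_path, second_path, shallow=False):
            shutil.rmtree(file_dir)
            removed.append(file_dir)
    return removed
```

Running something like this after each project keeps only the directories where the two tools disagree, which is the data the final CSV was built from.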

I think this can be done directly in miningframework, probably around here. Since the tool has filters for commits, it could probably have filters for files too.
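
As a rough sketch of what a file-level filter might look like: a predicate decides per file whether it is worth keeping, in the same spirit as the commit filters. Nothing here is miningframework's actual API; `has_conflict_markers`, `filter_files`, and the idea of scanning a textual merge result for conflict markers are assumptions made for illustration.

```python
from typing import Callable, Dict


def has_conflict_markers(merged_text: str) -> bool:
    """Heuristic: True if the text contains "<<<<<<<", "=======" and
    ">>>>>>>" lines in that order, as a textual merge tool would emit."""
    state = 0  # 0: expect "<<<<<<<", 1: expect "=======", 2: expect ">>>>>>>"
    for line in merged_text.splitlines():
        if state == 0 and line.startswith("<<<<<<<"):
            state = 1
        elif state == 1 and line.startswith("======="):
            state = 2
        elif state == 2 and line.startswith(">>>>>>>"):
            return True
    return False


def filter_files(
    merged_files: Dict[str, str],
    keep: Callable[[str], bool] = has_conflict_markers,
) -> Dict[str, str]:
    """Return only the entries (path -> merged text) accepted by the predicate."""
    return {path: text for path, text in merged_files.items() if keep(text)}
```

If a filter like this ran before the files are written to disk, the 100 conflict-free files from the example above would never be stored. The predicate is pluggable, so a "CSDiff and Diff3 results differ" check could be passed in instead.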

This would make it possible to collect more data for future research while using less disk space.
