[Task / Epic]: HUGE repository cleanup ✨ #6820

Open
5 of 13 tasks
ColorfulRhino opened this issue Jun 28, 2024 · 7 comments
Labels: 08 Milestone: Third quarter release, Discussion, Help needed, Task/To-Do

@ColorfulRhino
Collaborator

ColorfulRhino commented Jun 28, 2024

Task description

The Problem

Over the years, some older parts were partially removed or fell out of use, but were never fully cleaned up. Those files and leftovers in the code still live in the repository, causing confusion (what is this? is this still used? can this be deleted?) and misleading search/grep results (e.g. for packages/extras-buildpkgs/htop or packages/extras-buildpkgs/hostapd, which included changelogs and therefore lots of unrelated text).

Besides that, many blobs were added to the build repo over time, and only some of them remain today. But even the deleted blobs are still in the repo, since Git keeps the full history: the history is huge, even though the packages/blobs folder is currently only 55MB. This leads to an unnecessarily bloated repository, increasing the download size for everybody.
I remember one person often having to visit a local library or university to download/update their Armbian repo, since it was too large for their slow or restricted internet connection at home.

For comparison:

I don't believe that Armbian/build is a bigger project than U-Boot, but it is almost triple the size in MB. Vastly reducing the repo size (TODO: calculate actual size before/after blob purge) will make contributing more inclusive overall and save time on many occasions.
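To put numbers on the TODO above, something like the following could be run in a full clone before and after a purge. It is only a sketch: the function names are made up for illustration, and it relies on nothing beyond stock Git plumbing (`git count-objects`, `git rev-list --objects`, `git cat-file --batch-check`).

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of armbian/build.

# Show how much disk space the local object database uses.
repo_object_size() {
    git count-objects -vH
}

# List the N largest blobs reachable from any ref, including blobs that were
# deleted from the working tree long ago. Output lines: "<size in bytes> <path>".
largest_blobs() {
    local count="${1:-20}"
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        sed -n 's/^blob [0-9a-f]* //p' |
        sort -rn |
        head -n "$count"
}
```

Running `largest_blobs 20` in a clone should show whether deleted files under packages/blobs really dominate the history, which is the assumption behind this task.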

The Solutions

Remove all known leftover code and move all blobs to a separate blob repository, as already done with the Rockchip blobs in the Armbian/rkbin repo. After this is done, purge all blobs from the build repository's history (original idea by @rpardini, I believe). The goal is a completely blobless Armbian/build repo, with blobs pulled only from other repositories.

Task List

Leftover code:

Blobs:

  • Create new repository for blobs (e.g. Armbian/blobs)
  • Copy all existing blobs to the new blob repo
  • Change all references in the code to point to the new repo for the blobs
  • Remove all existing blobs from the Armbian/build repo
  • Purge all blobs from the Armbian/build Git history by rewriting history (this is where the actual size reduction happens)
  • Add an Actions workflow that checks new PRs for blobs and, if any are detected, kindly auto-reminds the author to commit them to the blob repo instead of the build repo
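
The PR check in the last item could start from a script like the one below, called from a workflow step with the PR's base and head revisions. This is a rough sketch, not an existing Armbian workflow: the function names and the reminder text are hypothetical. It relies on the fact that `git diff --numstat` prints `-` for both line counts when Git considers a file binary. That detection is a heuristic, so a size limit or a path rule might be needed as well.

```shell
#!/usr/bin/env bash
# Sketch only: function names and message text are illustrative.

# Print the paths of added or modified files that Git treats as binary
# between the merge base of <base> and <head>, and <head> itself.
binary_files_in_range() {
    local base="$1" head="$2"
    git diff --numstat --diff-filter=AM "${base}...${head}" |
        awk -F '\t' '$1 == "-" && $2 == "-" { print $3 }'
}

# Return non-zero and print a reminder if the range introduces binary files.
check_pr_for_blobs() {
    local found
    found=$(binary_files_in_range "$1" "$2")
    if [ -n "$found" ]; then
        echo "Binary files detected; please commit them to the blob repo instead:"
        echo "$found"
        return 1
    fi
}
```

A workflow step could then call `check_pr_for_blobs "$BASE_SHA" "$HEAD_SHA"`, where both variable names are placeholders for whatever the workflow provides, and post the output as a PR comment.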

This task list will be extended with new findings. PRs solving specific tasks will be linked.

This task/story is open for ideas and discussions! 😄


Some statistics for fun and to compare the impact of this cleanup:

                Before       After   Difference
Lines of code   6 458 280    TBD
# of files      6503         TBD
Repo size       ~ 636 MB     TBD

Commands used:

  • Lines of code: git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904
  • # of files: git ls-files | wc -l
  • Repo size: Queried on GitHub
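
For anyone wondering about the long hash in the first command: it is the ID of Git's empty tree in SHA-1 repositories, so diffing against it counts every line in every tracked file. A small sketch that derives the hash instead of hard-coding it (the function names are made up for illustration):

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative.

# ID of the empty tree; in SHA-1 repositories this prints
# 4b825dc642cb6eb9a060e54bf8d69288fbee4904.
empty_tree_hash() {
    git hash-object -t tree /dev/null
}

# Total lines in tracked files, expressed as a diff against the empty tree.
count_lines() {
    git diff --shortstat "$(empty_tree_hash)"
}

# Number of tracked files.
count_files() {
    git ls-files | wc -l
}
```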

Jira ticket: AR-2391

@The-going
Contributor

The-going commented Jun 29, 2024

  • Remove unused packages/extras-buildpkgs/hostapd plus its Realtek part and its traces
  • Remove unused packages/extras-buildpkgs/htop and its traces
  • Remove unused packages/extras-buildpkgs/sunxi-tools and its traces

This code is not used by the build system and can be deleted.
You can see the last minutes in the life of the functions that did something with:
armbian/build> gitk -- lib/functions/extras

This functionality was designed to build packages in the native "chroot" environment
that required library dependencies from the environment.

I am currently still using this, and will be able to bring the code back if users want to use it.

@ColorfulRhino
Collaborator Author

> This code is not used by the build system and can be deleted.

Thanks for confirming this!

> You can see the last minutes in the life of the functions that did something with:
> armbian/build> gitk -- lib/functions/extras

My build host does not have a graphical user interface, but I get what you mean 😄

If you know of any other unused code, let me know and I'll add it to the list :)

@The-going
Contributor

> > You can see the last minutes in the life of the functions that did something with:
> > armbian/build> gitk -- lib/functions/extras
>
> My build host does not have a graphical user interface, but I get what you mean 😄

git log -p -- lib/functions/extras

> If you know of any other unused code, let me know and I'll add it to the list

Unfortunately, I'm still working in the old paradigm from 1.5 years ago.

ColorfulRhino added a commit to ColorfulRhino/build that referenced this issue Jun 29, 2024
ColorfulRhino changed the title from "[Task / Story]: HUGE repository cleanup ✨" to "[Task / Epic]: HUGE repository cleanup ✨" on Jun 29, 2024
rpardini pushed a commit to ColorfulRhino/build that referenced this issue Jun 30, 2024
@rpardini
Member

I fully agree with the cleanup, but keep in mind that deleting files leaves Git's history untouched, so the repo size will only ever grow, not shrink. We'd need to rebase things out of existence (rewrite history) and force-push to actually make it smaller. See git filter-branch and git-filter-repo for possible approaches, but it would have a very large impact.

@ColorfulRhino
Collaborator Author

ColorfulRhino commented Jun 30, 2024

> We'd need to rebase things out of existence (rewrite history) and force-push to actually make it smaller. See git filter-branch and git-filter-repo for possible approaches, but it would have a very large impact.

Yes, I think this is what you meant when you were talking about this a long while ago. Please correct me if I'm wrong 😅

My plan is to test this in a completely separate repository (and in a second stage in the main repo but a separate branch), trying to understand its impact and getting opinions of multiple people. This will definitely have to be approved by more than one or two people 😄

The plan is to use one of those tools for this:

The thing is, if we want to do this, the best time is sooner rather than in 3 or 5 years.
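
For the first test stage in a throwaway clone, the rewrite could be rehearsed roughly like this. It is a sketch under assumptions, not a decided procedure: the function name is made up, and it uses git filter-branch only because that ships with Git. Git's own documentation recommends git-filter-repo instead, where the equivalent would be `git filter-repo --path packages/blobs --invert-paths`.

```shell
#!/usr/bin/env bash
# Sketch only: run this in a disposable clone, never in a shared checkout.
# The function name is illustrative; the path must not contain single quotes.

purge_path_from_history() {
    local path="$1"

    # Rewrite every commit on every ref, dropping <path> from each tree and
    # discarding commits that become empty as a result.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
        --index-filter "git rm -r --cached --ignore-unmatch --quiet -- '$path'" \
        --prune-empty -- --all

    # filter-branch keeps backups under refs/original/; remove them so the
    # old objects become unreachable.
    git for-each-ref --format='%(refname)' refs/original/ |
        while read -r ref; do git update-ref -d "$ref"; done

    # Expire reflog entries and delete unreachable objects right away.
    git reflog expire --expire=now --all
    git gc --prune=now --quiet
}
```

After `purge_path_from_history packages/blobs`, comparing `git count-objects -vH` before and after gives the real saving. Every commit ID changes, so all existing clones, forks and open PRs would have to be re-created or rebased, which is why this needs wide agreement first.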

@rpardini
Member

Awesome. You've understood it fully.

ColorfulRhino added a commit that referenced this issue Jun 30, 2024
These extras are leftovers of a legacy from the past.
See #6820 (comment)
Dangku pushed a commit to Dangku/armbian-build that referenced this issue Jul 3, 2024