Tutorial: Safely Reducing the Size of a Git Repository

Before starting any of these procedures, make sure to back up your repository.

Prerequisites:
  • A local Git repository.
  • A backup of the repository, made before any changes.
  • Optional: a history-rewriting tool such as BFG Repo-Cleaner (requires Java).
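
A minimal sketch of the backup prerequisite, assuming a plain directory copy is acceptable for your repository. The helper name backup_repo and the timestamped destination name are illustrative and not part of Git.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a repository (working tree and .git) to a
# timestamped sibling directory and check that the copy is intact.
backup_repo() {
  local repo=${1%/}                                        # strip a trailing slash
  local dest="${repo}-backup-$(date +%Y%m%d-%H%M%S)"
  cp -a "$repo" "$dest" || return 1                        # -a preserves permissions and timestamps
  git -C "$dest" fsck --full >/dev/null 2>&1 || return 1   # verify object integrity in the copy
  printf '%s\n' "$dest"                                    # print where the backup lives
}
```

It could be called as backup_repo /path/to/your/repo before running any cleanup command.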

Step 1: Basic Repository Cleanup

1.1. Run Git Garbage Collection

Start by running Git's built-in garbage collection command, which removes unreachable objects and repacks the remaining ones more compactly.

bash
cd /path/to/your/repo
git gc --aggressive --prune=now
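
  • --aggressive: Repacks more thoroughly to save space, at the cost of a longer run time.
  • --prune=now: Removes unreachable loose objects immediately instead of waiting for the default two-week grace period.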
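
To see whether garbage collection actually helped, the size of the object store can be measured before and after. The following is a sketch only; the helper names repo_size_kib and gc_and_report are made up for illustration, and the sizes come from git count-objects -v, which reports them in KiB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the on-disk size of a repository's object
# store in KiB (loose objects plus packfiles), as reported by Git.
repo_size_kib() {
  git -C "$1" count-objects -v |
    awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}

# Hypothetical helper: run the cleanup and report the size before and after.
gc_and_report() {
  local repo=$1 before after
  before=$(repo_size_kib "$repo")
  git -C "$repo" gc --aggressive --prune=now --quiet
  after=$(repo_size_kib "$repo")
  printf 'before: %s KiB, after: %s KiB\n' "$before" "$after"
}
```

For example: gc_and_report /path/to/your/repo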

1.2. Clean Reflogs

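Reflogs record when the tips of branches and other references were updated in the repo. Objects referenced only by reflog entries still count as reachable, so garbage collection keeps them; in repositories with a lot of rewritten or deleted history they can consume significant space. After expiring the reflogs, rerun the garbage collection command from 1.1 to actually reclaim that space.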

bash
git reflog expire --expire=now --all
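
As a rough illustration of the effect, the number of reflog entries can be counted before and after expiry. The helper names reflog_entries and expire_reflogs_and_gc are hypothetical; the sketch also runs git gc afterwards, on the assumption that you want the newly unreachable objects removed straight away.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count reflog entries across all references.
reflog_entries() {
  git -C "$1" reflog show --all | wc -l | tr -d ' '
}

# Hypothetical helper: expire every reflog entry, then collect garbage so
# objects that were reachable only from the reflog can actually be removed.
expire_reflogs_and_gc() {
  local repo=$1
  echo "reflog entries before: $(reflog_entries "$repo")"
  git -C "$repo" reflog expire --expire=now --all
  git -C "$repo" gc --prune=now --quiet
  echo "reflog entries after: $(reflog_entries "$repo")"
}
```

For example: expire_reflogs_and_gc /path/to/your/repo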

Step 2: Identify and Remove Large Files

2.1. Find Large Files

Use the following pipeline to find the largest objects in your repository's history.

bash
git rev-list --objects --all |
git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
sort -n -k 3 |
tail -n 10

This pipeline prints the 10 largest objects reachable from any ref, sorted by uncompressed size in bytes, with the largest one last.
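
The listing above can also include commits and trees. A possible variation, sketched here under the assumption that only file contents (blobs) are of interest, filters on the object type and puts the largest entry first. The helper name largest_blobs is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the N largest blobs (file contents) in history,
# largest first, as "<size-in-bytes> <object-id> <path>".
largest_blobs() {
  local repo=$1 count=${2:-10}
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk '$1 == "blob" { sub(/^blob /, ""); print }' |   # keep blobs, drop the type column
    sort -rn |
    head -n "$count"
}
```

For example: largest_blobs /path/to/your/repo 10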

2.2. Remove Large Files Using BFG

If you find large files that should not be in the repository, use BFG Repo-Cleaner, which is faster and simpler than git filter-branch.

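First, download the BFG jar (saved here as bfg.jar) and run it against the repository. By default BFG leaves the files in your latest commit untouched, so delete any unwanted large files and commit that change before running it: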

bash
java -jar bfg.jar --strip-blobs-bigger-than 100M /path/to/your/repo
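
After the rewrite, it can be worth checking that no oversized blob is still reachable. The following is a sketch of such a check; the helper name blobs_over is hypothetical and the limit is given in bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every blob in history larger than a byte limit
# as "<size-in-bytes> <path>". Empty output means no blob above the limit
# is still reachable from any ref.
blobs_over() {
  local repo=$1 limit=$2
  git -C "$repo" rev-list --objects --all |
    git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v limit="$limit" '$1 == "blob" && $2 > limit { sub(/^blob /, ""); print }'
}
```

For example, blobs_over /path/to/your/repo $((100 * 1024 * 1024)) uses a limit of 100 MiB, which approximates the 100M threshold passed to BFG.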

2.3. Alternative: Use git filter-branch

If you prefer not to use BFG, you can manually remove large files with git filter-branch:

bash
git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch PATH_TO_LARGE_FILE" \
  --prune-empty --tag-name-filter cat -- --all

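Replace PATH_TO_LARGE_FILE with the path of the file you want to remove. git filter-branch keeps a backup of the original refs under refs/original/, so the removed file's objects stay reachable until those refs are deleted.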
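
A sketch of the follow-up, once you have verified the rewritten history: delete the backup refs that git filter-branch leaves under refs/original/, then confirm that no commit on any ref still touches the path. The helper names commits_touching and drop_filter_branch_backups are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count commits on any ref that still touch a path.
commits_touching() {
  git -C "$1" log --all --oneline -- "$2" | wc -l | tr -d ' '
}

# Hypothetical helper: delete the backup refs that git filter-branch
# stores under refs/original/, so the old history becomes unreachable.
drop_filter_branch_backups() {
  git -C "$1" for-each-ref --format='delete %(refname)' refs/original |
    git -C "$1" update-ref --stdin
}
```

For example, run drop_filter_branch_backups /path/to/your/repo first, then commits_touching /path/to/your/repo PATH_TO_LARGE_FILE. A result of 0 suggests the file is gone from every ref.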

Step 3: Clone the Repository Afresh

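After cleaning up the history, clone the repository afresh so that you start with a new, smaller .git directory containing only the objects that are still reachable. Pass --no-local when doing so: cloning from a plain local path copies or hard-links the entire object store, including the objects you just removed from history.

bash
cd ..
# --no-local forces the regular Git transport, so only reachable objects are copied
git clone --mirror --no-local /path/to/old/repo new-repo
cd new-repo
git reflog expire --expire=now --all
git gc --aggressive --prune=now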
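
Before relying on the fresh clone, a quick comparison against the old repository can catch a missing branch or tag. This is a sketch that assumes the clone was made with --mirror, so that every ref is expected to exist under the same name in both repositories; the helper name verify_mirror is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check that a mirror clone is intact and carries the
# same refs (names and object IDs) as the repository it was cloned from.
verify_mirror() {
  local old=$1 new=$2 old_refs new_refs
  git -C "$new" fsck --full >/dev/null 2>&1 || { echo "fsck failed in $new"; return 1; }
  old_refs=$(git -C "$old" for-each-ref --format='%(refname) %(objectname)')
  new_refs=$(git -C "$new" for-each-ref --format='%(refname) %(objectname)')
  if [ "$old_refs" = "$new_refs" ]; then
    echo "refs match"
  else
    echo "refs differ"
    return 1
  fi
}
```

For example: verify_mirror /path/to/old/repo new-repo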

Step 4: Replace Old Repository

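Once you are satisfied with the new repository's state, you can replace the old repository. Note that a --mirror clone is bare (it has no working tree), so clone a working copy from it afterwards if you need to edit files: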

bash
mv /path/to/old/repo /path/to/old/repo-old
mv new-repo /path/to/old/repo
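
The two mv commands can be wrapped with a couple of sanity checks so that an earlier "-old" copy is never overwritten by accident. This is only a sketch, and the helper name swap_repos is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move the old repository aside and put the new one in
# its place, refusing to overwrite an earlier "-old" copy.
swap_repos() {
  local old=${1%/} new=${2%/}
  [ -d "$new" ] || { echo "missing: $new" >&2; return 1; }
  [ ! -e "${old}-old" ] || { echo "already exists: ${old}-old" >&2; return 1; }
  mv "$old" "${old}-old" && mv "$new" "$old"
}
```

For example: swap_repos /path/to/old/repo new-repo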

Final Notes
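  • After performing these actions, especially if you changed the history, you will need to force-push to every remote and ask collaborators to re-clone the repository. A clone made before the rewrite still contains the old objects and will reintroduce them if it is pushed back.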
  • Always ensure you have backups and confirm that no critical data is lost during the cleanup.
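
A sketch of the force-push mentioned in the notes above, assuming a regular (non-mirror) remote; the helper name force_push_all and the default remote name origin are assumptions. The branches and tags are pushed in two separate commands because git push does not accept --all and --tags together in every Git version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: overwrite all branches and tags on a remote with the
# rewritten history. This discards the remote's old history.
force_push_all() {
  local repo=$1 remote=${2:-origin}
  git -C "$repo" push --force --all "$remote" &&
    git -C "$repo" push --force --tags "$remote"
}
```

For example: force_push_all /path/to/your/repo origin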

This tutorial has walked through reducing the size of a Git repository effectively and safely. Remember that removing files in Step 2 rewrites the repository's history, which can impact collaborative workflows.