At this point, there is very little dispute within the software industry about the benefits of using a Monorepo. There are obviously use cases where this approach doesn’t suit, but on balance, it will often make more sense to use a single repository than many. Some benefits include:
company’s code. Instead of needing to ask another team how something works internally or where to find code, an engineer can immediately look at the internals of another service and figure it out for themselves.
act with initiative to fix problems they find in other teams’ domains. With multiple repositories, the location of code is often hidden knowledge held by the owning team, which blocks this collaboration.
building a complete Kubernetes infrastructure, only one CI system needs to be set up via GitHub Actions, in place of a CI system for each repository.
engineers and teams know that their code is visible company-wide, they are more focused on quality and readability and stick more closely to company standards, in place of team-level norms.
facilitate allowing for styles and standards to be kept more congruent.
Before jumping in, it is worth addressing submodules, which come up quite often as an alternative when people are looking for a path to a Monorepo-style workflow. I have tried this previously, and in my experience it is not a good option.
Using submodules removes the benefits of visibility, as new changes are often only seen as an updated hash in the parent repo, and keeping many fast-moving submodules in sync requires extra tooling just to keep a local checkout up to date. It also makes any CI system significantly more brittle: all dependencies need to be available and have the proper authorizations set, and each commit to the parent repository must reference child hashes that have already been pushed. It is far too easy to commit a hash to the parent before the child repo has pushed that commit upstream.
Submodules definitely have a place in some contexts where you want to reduce visibility / noise, and where what is being committed is not a dependency for code in the core Monorepo, but in the context of a Monorepo approach I feel they should be left well alone.
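To illustrate why a parent repository gives so little visibility into a submodule, here is a minimal sketch using git plumbing directly. The directory names and the hash are made up for the example, and update-index stands in for what the submodule tooling would normally do. It shows that the parent stores nothing but a commit hash, and that git is happy to record a hash that exists nowhere.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the example does not depend on local git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

parent=$(mktemp -d)
git init -q "$parent"

# Hypothetical child commit hash: it exists in no repository at all.
fake_hash=1234567890123456789012345678901234567890

# Mode 160000 marks a "gitlink", which is how a submodule pointer is stored.
git -C "$parent" update-index --add --cacheinfo 160000,"$fake_hash",child
git -C "$parent" commit -q -m "Point child at a commit nobody has pushed"

# The parent tree records only the hash, none of the child's content.
entry=$(git -C "$parent" ls-tree HEAD child)
echo "$entry"
```

Nothing in the commit step checks that the referenced commit is reachable anywhere, which is the failure mode that tends to surface later in CI.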
Git natively supports the ability to merge multiple repositories, but it can be difficult and slow.
Some approaches include using
filter-branch, but these can break and can be
problematic when the previous merge history of a given repository is highly complex and not very
Instead of this, the git-filter-repo package provides
a much quicker and more trustworthy approach to merging multiple existing histories. It can easily be
installed with pip3 on a Mac via the following:
$ pip3 install git-filter-repo
Collecting git-filter-repo
  Downloading https://files.pythonhosted.org/packages/c7/a3/f5a470387c6b9c6d560b74d6cee21d56d595c970a439e5e702b595e520a8/git_filter_repo-2.28.0-py2.py3-none-any.whl (97kB)
    100% |████████████████████████████████| 102kB 619kB/s
Installing collected packages: git-filter-repo
Successfully installed git-filter-repo-2.28.0
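Git picks up any executable named git-<something> on the PATH as a subcommand, so a quick way to confirm the install worked is to look for the script itself. This is just a sketch of such a check; the variable name is my own.

```shell
# Check whether the git-filter-repo script is on PATH, which is what
# lets git run it as "git filter-repo".
if command -v git-filter-repo >/dev/null 2>&1; then
    filter_repo_status="available"
else
    filter_repo_status="missing"
fi
echo "git-filter-repo is $filter_repo_status"
```

If it reports missing, the directory pip3 installs scripts into is probably not on your PATH.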
As an example, I’m going to assume that my company owns two products,
Vue.js and React. Let’s say
both of these repositories contain a lot of code that I don’t want to import into the Monorepo,
because it is tooling and support files which are common to both projects and which I’ll recreate
in the Monorepo after importing both products. So, for this example, I really only want what is in the
scripts directory of each repo (using the same directory in each is also a good example of getting
around conflicting file names).
To start off, I create a new repository which I’ll call simply monorepo and add two commits.
$ mkdir monorepo
$ cd monorepo
$ git init
Initialized empty Git repository in /path/to/monorepo/.git/
$ echo "# Monorepos are great" > README.md
$ git add README.md
$ git commit -m "Add initial commit"
[master (root-commit) e89ea26] Add initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 README.md
$ echo "# Monorepos are practical in many contexts, not all" > README.md
$ git add README.md
$ git commit -m "Change README title to less embellished version"
[master d7d2d73] Change README title to less embellished version
 1 file changed, 1 insertion(+), 1 deletion(-)
$ git log --pretty=oneline
d7d2d737ff4c2aa7d543c37e3b1d92c38dcab55b (HEAD -> master) Change README title to less embellished version
e89ea269cded4eaa521cc922a5cec8aa9355cffd Add initial commit
$ cd ..
Next, I create a local copy of each of the repositories that I’m going to consolidate into the new
repository. A fresh clone should be made for this purpose, as git-filter-repo
mutates the repository it is working on, so you will effectively burn your local copy in the process.
While you’re likely to switch to the Monorepo and never use the old copy again, it’s still best
to make a separate clone so you can easily start over if need be.
$ git clone git@github.com:vuejs/vue.git
Cloning into 'vue'...
remote: Enumerating objects: 18, done.
remote: Counting objects: 100% (18/18), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 56508 (delta 3), reused 0 (delta 0), pack-reused 56490
Receiving objects: 100% (56508/56508), 26.95 MiB | 11.19 MiB/s, done.
Resolving deltas: 100% (39656/39656), done.
$ git clone git@github.com:facebook/react.git
Cloning into 'react'...
remote: Enumerating objects: 182979, done.
remote: Total 182979 (delta 0), reused 0 (delta 0), pack-reused 182979
Receiving objects: 100% (182979/182979), 154.14 MiB | 18.60 MiB/s, done.
Resolving deltas: 100% (127869/127869), done.
Now that all three repositories are siblings in the one directory, I can use the git-filter-repo
package via the git command to filter each repository’s history so that it only includes the files in
the scripts directory of that project. The following is for the vue project.
$ cd vue
$ git filter-repo --path scripts/ --path-rename 'scripts':'vue-scripts' --tag-rename '':'vue-'
Parsed 6289 commits
New history written in 0.85 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 20c5987b chore: remove unused build alias (#9525)
Enumerating objects: 129, done.
Counting objects: 100% (129/129), done.
Delta compression using up to 8 threads
Compressing objects: 100% (67/67), done.
Writing objects: 100% (129/129), done.
Total 129 (delta 54), reused 50 (delta 38)
Completely finished after 1.13 seconds.
$ ls -la
total 0
drwxr-xr-x 4 ianbelcher staff 128 May 5 12:29 .
drwxr-xr-x 5 ianbelcher staff 160 May 5 12:22 ..
drwxr-xr-x 14 ianbelcher staff 448 May 5 12:29 .git
drwxr-xr-x 12 ianbelcher staff 384 May 5 12:29 vue-scripts
$ cd ..
A few things to note here:
Passing --path scripts/ drops all files apart from those that were in the scripts directory.
By using the --path-rename switch and providing 'scripts':'vue-scripts', the filter actively renames the scripts directory to vue-scripts. This means that we can avoid conflicts when adding the history to the monorepo.
By using the --tag-rename switch and providing '':'vue-', tags which are attached to commits that were not removed will be prefixed with vue-. Again, this is another measure to keep as much history as possible without creating any conflicts.
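If you would like to try these switches before pointing them at a real clone, the following sketch runs the same filter against a tiny throwaway repository. The file names and the v1.0 tag are invented for the example, and --force is only there because a repository created with git init is not a fresh clone, which the tool otherwise refuses to rewrite. The script skips itself if git-filter-repo is not installed.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the example does not depend on local git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

if ! command -v git-filter-repo >/dev/null 2>&1; then
    echo "git-filter-repo not installed; skipping"
    filter_result="skipped"
else
    demo=$(mktemp -d)
    git init -q "$demo"
    mkdir "$demo/scripts" "$demo/src"
    echo "echo build" > "$demo/scripts/build.sh"
    echo "console.log('hi')" > "$demo/src/index.js"
    git -C "$demo" add .
    git -C "$demo" commit -q -m "Add scripts and src"
    git -C "$demo" tag v1.0

    # Same switches as the vue example above, plus --force for this demo repo.
    (cd "$demo" && git filter-repo --force \
        --path scripts/ \
        --path-rename 'scripts':'vue-scripts' \
        --tag-rename '':'vue-')

    # Remaining tracked files, followed by the remaining tags.
    filter_result="$(git -C "$demo" ls-files) $(git -C "$demo" tag)"
    echo "$filter_result"
fi
```

Listing the tracked files and tags afterwards is a cheap way to confirm that only the renamed directory and the prefixed tags survived.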
A similar process can then be applied to the React project.
$ cd react
$ git filter-repo --path scripts/ --path-rename 'scripts':'react-scripts' --tag-rename '':'react-'
Parsed 16803 commits
New history written in 2.80 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 88053ee1b Release script: allow preparing RC from npm
Enumerating objects: 9421, done.
Counting objects: 100% (9421/9421), done.
Delta compression using up to 8 threads
Compressing objects: 100% (2834/2834), done.
Writing objects: 100% (9421/9421), done.
Total 9421 (delta 5126), reused 9333 (delta 5063)
Completely finished after 4.28 seconds.
$ ls -al
total 0
drwxr-xr-x 4 ianbelcher staff 128 May 5 12:32 .
drwxr-xr-x 5 ianbelcher staff 160 May 5 12:22 ..
drwxr-xr-x 14 ianbelcher staff 448 May 5 12:32 .git
drwxr-xr-x 21 ianbelcher staff 672 May 5 12:32 react-scripts
$ cd ..
The next step is to consolidate these histories into the main repository. This can be done simply
by adding each repository as a remote, and using the
--allow-unrelated-histories flag when performing
the merge. In the following example, the
dev branch of vue is used, along with the
master branch of
react.
$ cd monorepo
$ git remote add vue ../vue/
$ git remote add react ../react/
$ git fetch vue
<REDACTED OUTPUT>
$ git fetch react
<REDACTED OUTPUT>
$ git branch vue remotes/vue/dev
Branch 'vue' set up to track remote branch 'dev' from 'vue'.
$ git merge vue -m "Merge vue history" --allow-unrelated-histories
<REDACTED OUTPUT>
$ git branch react remotes/react/master
Branch 'react' set up to track remote branch 'master' from 'react'.
$ git merge react -m "Merge react history" --allow-unrelated-histories
<REDACTED OUTPUT>
$ git remote remove vue
$ git remote remove react
$ git branch -d vue
$ git branch -d react
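The same remote-and-merge sequence can be rehearsed end to end without touching any real repositories. The sketch below builds three one-commit repositories in a temporary directory and merges two of them into the third. The make_repo helper, the file names and the main branch name are all invented for the example.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the example does not depend on local git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)

# make_repo NAME FILE: hypothetical helper that creates a one-commit repository.
make_repo() {
    git init -q "$work/$1"
    git -C "$work/$1" symbolic-ref HEAD refs/heads/main
    mkdir -p "$work/$1/$(dirname "$2")"
    echo "content of $2" > "$work/$1/$2"
    git -C "$work/$1" add .
    git -C "$work/$1" commit -q -m "Add $2"
}

make_repo monorepo README.md
make_repo vue vue-scripts/release.sh
make_repo react react-scripts/build.js

cd "$work/monorepo"
for project in vue react; do
    git remote add "$project" "../$project"
    git fetch -q "$project"
    # The histories share no common ancestor, hence the extra flag.
    git merge -q "$project/main" -m "Merge $project history" --allow-unrelated-histories
    git remote remove "$project"
done

merged_files=$(git ls-files | tr '\n' ' ')
commit_count=$(git rev-list --count HEAD)
echo "$merged_files"
echo "$commit_count commits"
```

Because the directories were given distinct names up front, both merges complete without conflicts and the original commits are all reachable from the monorepo branch.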
For the sake of the example, I then make a quick change to the vue-scripts/release.sh
file and commit it as well (simply changing the hashbang to /bin/zsh for some silly reason).
$ vi vue-scripts/release.sh
$ git add vue-scripts/release.sh
$ git commit -m "Change hashbang to /bin/zsh"
[master 48093a8] Change hashbang to /bin/zsh
 1 file changed, 1 insertion(+), 1 deletion(-)
At this point, the history tree looks like the following.
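One way to inspect a tree like this from the command line is git log with the --graph switch. The sketch below is a self-contained stand-in rather than the real vue and react histories: it fakes an imported project with an orphan branch (the branch name and file contents are made up), merges it, and then prints both the overall graph and the history of a single imported file.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the example does not depend on local git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
echo "# Monorepo" > README.md
git add README.md
git commit -q -m "Add initial commit"

# Stand-in for an imported project: an orphan branch with its own root commit.
git checkout -q --orphan imported
git rm -rqf .
mkdir vue-scripts
echo '#!/bin/sh' > vue-scripts/release.sh
git add vue-scripts
git commit -q -m "Add release script"
echo 'echo released' >> vue-scripts/release.sh
git commit -q -am "Print a message on release"

git checkout -q main
git merge -q imported -m "Merge vue history" --allow-unrelated-histories

# Whole tree, one line per commit, with the branch structure drawn in ASCII.
git log --graph --oneline

# History of a single file, reaching back into the imported commits.
file_history=$(git log --format=%s -- vue-scripts/release.sh)
echo "$file_history"
```

The per-file log lists only the commits that touched that file, including those made before the merge.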
Hurrah! Any engineer can now look at any file in the monorepo and see a complete history of its changes.
Moving forward, the monorepo can be used as the single source of truth for all code, and the individual repositories can be archived.
It is also worth mentioning that this method still works without the filter-repo step.
If a repository doesn’t have any conflicting filenames with what is already in the repository,
creating the remote and merging using the
--allow-unrelated-histories switch will still work.
You may be inclined to do this and then move the files as needed via a large commit. This is also
a viable option.
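As a rough sketch of that alternative, the following merges an unfiltered repository and then relocates its files in a single follow-up commit. The tool repository, its build.sh file and the main branch name are invented for the example.

```shell
#!/bin/sh
set -eu
# Throwaway identity so the example does not depend on local git config.
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

work=$(mktemp -d)

# Hypothetical existing project with a single file at its root.
git init -q "$work/tool"
git -C "$work/tool" symbolic-ref HEAD refs/heads/main
echo "echo build" > "$work/tool/build.sh"
git -C "$work/tool" add build.sh
git -C "$work/tool" commit -q -m "Add build script"

git init -q "$work/monorepo"
cd "$work/monorepo"
git symbolic-ref HEAD refs/heads/main
echo "# Monorepo" > README.md
git add README.md
git commit -q -m "Add initial commit"

# Merge the unfiltered history; no file names clash, so no rewrite is needed.
git remote add tool ../tool
git fetch -q tool
git merge -q tool/main -m "Merge tool history" --allow-unrelated-histories
git remote remove tool

# Relocate the imported files in one follow-up commit.
mkdir tool
git mv build.sh tool/build.sh
git commit -q -m "Move tool files into tool/"

# --follow traces the file back through the rename.
followed=$(git log --follow --format=%s -- tool/build.sh)
echo "$followed"
```

The trade-off is that plain git log on the new path stops at the move commit, so anyone digging through history needs to remember --follow, whereas the filter-repo approach rewrites the paths throughout the history.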
In summary, keeping the histories of multiple single projects when consolidating into a Monorepo is not that difficult a task, and it’s worth the time and effort if your company can gain the benefits.