nirvana and the target repository (the one I want to put the commits into) midgard. I’ll also assume that I’m interested in the master branch of both repositories.
Warning: Some of the commands I’m going to use have the potential to damage the history of your git repository. If you aren’t sure what any of the commands below do, you should make a backup of your repository before proceeding.
The process goes like this:
1. Clone midgard.
2. Fetch nirvana.
3. Create a rewrite branch based on nirvana/master.
4. Rewrite the rewrite branch commit messages and file structure.
5. Rebase the rewrite branch on midgard/master.
6. Push the rewrite branch upstream (to PR, review, or whatever).

I hate having to restore repositories from backup (especially when I don’t have backups and have to rely on my local clones) so I always make a throw-away clone when I’m rewriting history, etc.
git clone git@example.com:midgard.git import-nirvana
cd import-nirvana
git remote add nirvana git@example.com:nirvana.git
git fetch nirvana
These commands give me a new clone of the midgard repository, add a remote for the nirvana repository, and fetch its commits.
Once I have the commits it’s time to massage them gently into shape ready for import. In my case, I want to add Nirvana: to the front of every commit message (so that I can tell which sub-project a commit changes when I review the git log) and move everything in the nirvana repository into a nirvana sub-directory (to avoid clobbering build scripts, etc. which exist in both).
These changes involve rewriting history so I’ll do them on a new rewrite branch. If [when] I screw it up the first few times I’ll be able to discard the branch and try again without cloning and fetching both repositories from scratch.
git checkout -b rewrite nirvana/master
Rewriting the commit messages is conceptually easy but kind of annoying when using git-filter-branch. The --msg-filter argument takes a command which accepts the current commit message on its standard input and must produce the new message on its standard output. Using echo and cat to add “Nirvana:” to the start of the first line looks like this:
git filter-branch --msg-filter '/bin/echo -n "Nirvana: " && cat'
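Because a message filter is only a command that reads standard input and writes standard output, it can be tried on a sample message before any history is touched. The sketch below is a dry run under that assumption: the sample message is made up, and the sed variant is an alternative of my own (not something used above) for systems where echo -n is unreliable.

```shell
#!/bin/sh
# Dry run of a --msg-filter command: feed it a sample commit message on
# stdin and inspect what comes out on stdout. No repository is involved.

# A made-up commit message with a subject line and a body.
sample=$(printf 'Fix the frobnicator\n\nIt was broken on Tuesdays.\n')

# The filter used above: print the prefix without a newline, then copy stdin.
with_echo=$(printf '%s\n' "$sample" | sh -c '/bin/echo -n "Nirvana: " && cat')

# Alternative (an assumption, not from the text above): edit line 1 with sed.
with_sed=$(printf '%s\n' "$sample" | sed '1s/^/Nirvana: /')

# Show the rewritten message.
printf '%s\n' "$with_echo"
```

Both variables should hold the same text, with the prefix on the subject line only and the body left alone.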
Moving the files into a subdirectory called nirvana is essentially the same, but the command (passed to --tree-filter this time) is quite a bit more complex. Instead of echoing some text and then cating the current message we’ll mkdir a new directory and then mv the files into it. The -f flag is needed because the previous filter-branch run left a backup of the original refs in refs/original/, and git refuses to overwrite that backup without it.
git filter-branch -f --tree-filter 'mkdir -p nirvana &&
find . -mindepth 1 -maxdepth 1 ! -name nirvana -exec mv {} nirvana \;' HEAD
I’ve used mkdir -p nirvana to create the directory if it does not already exist. One thing you need to be careful of when moving your files into the new nirvana directory is that you don’t attempt to move the nirvana directory into itself. If you are using bash or some other “advanced” shell you can enable and use negative globbing but I prefer to use find to select only the files I want to move.
Here find . -mindepth 1 -maxdepth 1 ! -name nirvana looks at the current directory and finds every file (or directory) which is:

- directly inside the current directory (./); and
- not named nirvana.

Then each match is substituted for the {} in mv {} nirvana \; and the command is executed. This moves everything except nirvana/ into nirvana/.
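The same two commands can be tried in a scratch directory to see exactly what find selects. This is only a sketch: the file names are hypothetical, and it assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
#!/bin/sh
set -eu
# Try the tree filter's commands on a scratch directory, not on real history.
scratch=$(mktemp -d)
cd "$scratch"

# Hypothetical project contents: a file, a dotfile, and a directory.
touch Makefile .gitignore
mkdir src
touch src/main.c

# The same commands that are passed to --tree-filter.
mkdir -p nirvana &&
find . -mindepth 1 -maxdepth 1 ! -name nirvana -exec mv {} nirvana \;

# Record what is left at the top level and what ended up inside nirvana/.
top_level=$(find . -mindepth 1 -maxdepth 1 | sort)
moved=$(find nirvana -mindepth 1 | sort)
printf 'top level:\n%s\n\ninside nirvana:\n%s\n' "$top_level" "$moved"
```

Only ./nirvana should remain at the top level. Dotfiles such as .gitignore are moved too, which a plain mv * nirvana would miss.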
All that remains is to graft the newly prepared rewrite branch onto the existing midgard/master history so that they can be merged together. Because the tree of every commit in rewrite has been rewritten to have all files under the nirvana/ directory it should be trivial to rebase the rewrite branch on the midgard/master branch.
git rebase origin/master
This should go through without any intervention. If it does not, you probably already have a directory called nirvana/ with some of the same files as the rewrite branch. You’ll need to resolve any conflicts that git reports in the same way you usually do, but be sure to carry on with git rebase --continue (or give up with git rebase --abort) rather than committing by hand!
With that done the branch is ready to merge but I usually take this opportunity to git rebase -i and squash some commits, update build scripts to build the imported code, etc. Once this is done I generally push the branch into the origin repository (midgard):
git push -u origin rewrite:import-nirvana
And then open a pull request for my teammates to review.
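Before running any of this against real repositories, the whole sequence can be rehearsed on two tiny throw-away repositories. The script below is a sketch under several assumptions: the repositories, their single README commits, and the make_repo helper are all made up for the rehearsal, FILTER_BRANCH_SQUELCH_WARNING is set only to skip the pause that newer versions of git add before filter-branch runs, and the final push is left out.

```shell
#!/bin/sh
# Rehearse the import on two throw-away repositories in a temp directory.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Hypothetical helper: create a repository with one commit on master.
make_repo() {
    git init -q "$sandbox/$1"
    # Name the unborn branch master regardless of the local default.
    git -C "$sandbox/$1" symbolic-ref HEAD refs/heads/master
    echo "$1" > "$sandbox/$1/README"
    git -C "$sandbox/$1" add README
    git -C "$sandbox/$1" commit -q -m "Add $1 README"
}
make_repo midgard
make_repo nirvana

# Throw-away clone of the target, plus the source as a second remote.
git clone -q "$sandbox/midgard" "$sandbox/import-nirvana"
cd "$sandbox/import-nirvana"
git remote add nirvana "$sandbox/nirvana"
git fetch -q nirvana

# Rewrite messages and file layout on a separate branch.
git checkout -q -b rewrite nirvana/master
git filter-branch --msg-filter '/bin/echo -n "Nirvana: " && cat' >/dev/null
# -f overwrites the refs/original backup left by the previous run.
git filter-branch -f --tree-filter 'mkdir -p nirvana &&
find . -mindepth 1 -maxdepth 1 ! -name nirvana -exec mv {} nirvana \;' HEAD >/dev/null

# Graft the rewritten commits onto the target history.
git rebase -q origin/master

# Newest commit first: the imported commit, then the original midgard one.
subjects=$(git log --format=%s)
printf '%s\n' "$subjects"
```

If the rehearsal behaves, the working tree holds midgard's own README at the top level and nirvana's under nirvana/, and the log shows the prefixed commit sitting on top of the midgard history.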
user.name and user.email configurations and they need to be brought into line.
The first example is useful to “fix” commits which have been made with the wrong author name and email. I’ve sometimes seen this when people use git for deployment, make an “urgent” fix on a production server, and commit as “web-owner@web3.startup.com”.
Rewriting all commits that have email web-owner@web3.startup.com to have the name “Thomas Sutton” and e-mail “me@thomas-sutton.id.au” is simple:
git filter-branch --env-filter 'if [ "$GIT_AUTHOR_EMAIL" = "web-owner@web3.startup.com" ]; then GIT_AUTHOR_EMAIL=me@thomas-sutton.id.au; GIT_AUTHOR_NAME="Thomas Sutton"; fi; export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME'
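It helps to see which author identities actually appear in the history before and after a rewrite like this. The following is a minimal sketch: it builds a throw-away repository with one “good” and one “bad” commit (both made up for illustration), counts the identities, applies the same filter with the variable quoted so an empty value cannot break the test, and counts again. FILTER_BRANCH_SQUELCH_WARNING is set only to skip the pause newer versions of git add before filter-branch runs. Only the author fields are touched here; the committer has its own GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_COMMITTER_NAME="Thomas Sutton"
export GIT_COMMITTER_EMAIL=me@thomas-sutton.id.au

# Throw-away repository with two commits by different authors.
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo one > file
git add file
GIT_AUTHOR_NAME="Thomas Sutton" GIT_AUTHOR_EMAIL=me@thomas-sutton.id.au \
    git commit -q -m "Normal commit"

echo two > file
GIT_AUTHOR_NAME=web-owner GIT_AUTHOR_EMAIL=web-owner@web3.startup.com \
    git commit -q -a -m "Urgent fix on production"

# Count how many commits each author identity has.
authors() {
    git log --format='%an <%ae>' | sort | uniq -c
}

echo "before:"
authors

git filter-branch --env-filter '
if [ "$GIT_AUTHOR_EMAIL" = "web-owner@web3.startup.com" ]; then
    GIT_AUTHOR_EMAIL=me@thomas-sutton.id.au
    GIT_AUTHOR_NAME="Thomas Sutton"
fi
export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME' >/dev/null

echo "after:"
authors

# Distinct author emails left in the history after the rewrite.
emails_after=$(git log --format='%ae' | sort -u)
```

After the rewrite the only author email left in the history should be the corrected one.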
If this is a regular problem for you, you might want to rewrite all commits with an email address that does not match some pattern. If your pattern can be written as a glob you can use the built-in pattern matching functionality in bash:
git filter-branch --env-filter 'if [[ $GIT_AUTHOR_EMAIL != *@examplecorp.com ]]; then GIT_AUTHOR_EMAIL=tech@examplecorp.com; GIT_AUTHOR_NAME="Example Corp"; fi; export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME'
Here I’m rewriting every commit where the email address does not match *@examplecorp.com to have the name “Example Corp” and email address tech@examplecorp.com. Notice that this example uses double square brackets around the condition: [[ ]]. It’s this change that enables pattern matching as opposed to the single brackets and simple equality in the first example.
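One caveat: [[ ]] is a bash feature, and as far as I know git filter-branch is itself a shell script (normally run by /bin/sh) that evaluates the filter in its own shell. On systems where /bin/sh is not bash the double brackets may therefore fail. A sketch of the same glob test using the POSIX case statement follows; the try_filter helper and the “Alice” identity are hypothetical and exist only so the filter can be dry-run without a repository.

```shell
#!/bin/sh
# The filter as a string, so it can be dry-run here and later handed to
# git filter-branch --env-filter unchanged.
env_filter='
case "$GIT_AUTHOR_EMAIL" in
    *@examplecorp.com) ;;   # already a company address: leave it alone
    *) GIT_AUTHOR_EMAIL=tech@examplecorp.com
       GIT_AUTHOR_NAME="Example Corp" ;;
esac
export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME'

# Hypothetical helper: evaluate the filter for one name/email pair.
try_filter() {
    # Run in a subshell so the caller's environment is not modified.
    (
        GIT_AUTHOR_NAME=$1
        GIT_AUTHOR_EMAIL=$2
        eval "$env_filter"
        printf '%s <%s>\n' "$GIT_AUTHOR_NAME" "$GIT_AUTHOR_EMAIL"
    )
}

kept=$(try_filter "Alice" alice@examplecorp.com)
rewritten=$(try_filter "web-owner" web-owner@web3.startup.com)

printf 'kept:      %s\nrewritten: %s\n' "$kept" "$rewritten"
```

Once the dry run gives the expected identities, the same string can be applied for real with git filter-branch --env-filter "$env_filter".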
You need to remember, though, that techniques like this rewrite the complete history of the repository. This means that all other branches, all clones, etc. will need to be redone to match the new history. Do not do this unless you know what this warning means and how to resolve any issues you’ll have.