Matthias Schoettle

Lead Software Developer

Move directory from one repository to another, preserving history

I just moved one directory within a Git repository to a directory within another repository including its history. For example:

repositoryA/
.........../directoryToKeep
.........../otherDirectory
.........../someFile.ext

repositoryB/
.........../someStuff

The goal is to move directoryToKeep into repositoryB with its history, i.e., all commits that affect directoryToKeep. If you instead want to create a separate repository just for the contents of directoryToKeep, skip step 1.5 when preparing the source repository.

Here is how I did it, based on this blog post and StackOverflow topic:

1. Prepare the source repository

  1. Clone repositoryA (make a copy, don’t use your already existing one)
  2. cd to it
  3. Delete the link to the original repository to avoid accidentally making any remote changes
    git remote rm origin
  4. Using filter-branch, go through the complete history and keep only the commits that affect directoryToKeep, removing all others.
    git filter-branch --subdirectory-filter <directoryToKeep> -- --all

    From the git documentation:

    Only look at the history which touches the given subdirectory. The result will contain that directory (and only that) as its project root.

    You might need to add --prune-empty to avoid empty commits; in my case it was not necessary.
    The result is that repositoryA contains the contents of directoryToKeep directly in its root, and all remaining commits reflect this. If you want to create a separate repository just for directoryToKeep, skip the next step.
    If you instead want to move directoryToKeep into its own directory in repositoryB, you have two options. You can leave the commits as they are and create one additional commit that moves all files into a directory. Or, if you are a perfectionist like myself, you can run the following command, which rewrites all remaining commits so that the files are located in their own directory throughout the history.

  5. Replace directoryToKeep in the following command with your actual directory name and run it, this time using index-filter. The -f option is needed because the previous filter-branch run left a backup in refs/original/, which a second run otherwise refuses to overwrite:
    git filter-branch -f --index-filter '
        git ls-files -sz |
        perl -0pe "s{\t}{\tdirectoryToKeep/}" |
        GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
            git update-index --clear -z --index-info &&
            mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' HEAD
  6. There might be old untracked files. You can clean up the repository with the following commands:
    git reset --hard
    git gc --aggressive
    git prune
    git clean -df
  7. If you just want a new repository for directoryToKeep, you should be able to just push it. Otherwise, continue with part 2.
    It’s also good at this point to make sure that the result is correct, e.g., using git log.

2. Merge into target repository

  1. Clone repositoryB (make a copy, don’t use your already existing one)
  2. cd into it
  3. Add the prepared repositoryA as a remote in repositoryB. The placeholder <branch-name-repoA> is the name you give this remote.
    git remote add <branch-name-repoA> /path/to/repositoryA
  4. Pull from that remote (this assumes you performed the changes above on master)
    git pull --allow-unrelated-histories <branch-name-repoA> master

    Note: Because the two histories don’t have a common base, Git 2.9+ will refuse to merge them without the --allow-unrelated-histories option.

  5. It will create a merge commit to merge the current HEAD with your branch. The editor for the commit message should appear. Enter a meaningful commit message and proceed.
  6. Now you’re done and can push.
  7. Personally, I would just delete the cloned repositories from step 1 and go back to the actual repository.
  8. If everything works, remove directoryToKeep from repositoryA.
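
The merge can be rehearsed on throwaway repositories as well. The sketch below makes several assumptions: both repositories are created from scratch as stand-ins (repositoryA already in the state that part 1 produces), the remote name repoA, the file contents and the user identity are made up, and the branch is forced to master to match the assumption in step 2.4. It also passes --no-edit so that no editor opens in a script, and --no-rebase because recent Git versions ask how to reconcile diverging histories when pulling; neither option is part of the steps above.

```shell
#!/bin/sh
# Sketch only: rehearses part 2 on throwaway repositories.
set -eu

work=$(mktemp -d)   # hypothetical scratch location
cd "$work"

# Stand-in for the prepared repositoryA (the state after part 1).
git init -q repositoryA
cd repositoryA
git symbolic-ref HEAD refs/heads/master    # use master regardless of defaults
git config user.name "Example User"        # made-up identity
git config user.email "user@example.com"
mkdir directoryToKeep
echo "one" > directoryToKeep/a.txt
git add .
git commit -q -m "Add a.txt"
echo "two" >> directoryToKeep/a.txt
git commit -q -am "Change a.txt"
cd ..

# Stand-in for the fresh clone of repositoryB.
git init -q repositoryB
cd repositoryB
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo "stuff" > someStuff
git add .
git commit -q -m "Add someStuff"

# Step 2.3: add repositoryA as a remote (repoA is a made-up name).
git remote add repoA "$work/repositoryA"

# Step 2.4: pull its master branch and merge the unrelated histories.
git pull --no-rebase --no-edit --allow-unrelated-histories repoA master

# Inspect the result: the history of directoryToKeep is now part of repositoryB.
total=$(git rev-list --count HEAD)
echo "total commits: $total"
git log --oneline -- directoryToKeep
```

In a real run you would drop --no-edit and write a meaningful merge commit message, as described in step 2.5.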

Update 19.01.2017: Updated step 2.4 with additional option (Thanks, Paul!)

17 Comments

  1. git 2+ will require an additional flag for the final pull command:

    `git pull master --allow-unrelated-histories`

  2. You’re awesome! Saved us a tonne of work!

  3. Would it be possible to write a script for this?

  4. Dimitris Pantazopoulos

    December 10, 2017 at 09:42

    Excellent, thanks a lot.

    You can move the whole repo by leaving out steps 1.4, 1.5 and (optionally) 1.6.

    Thanks again.

  5. Excellent post, but this does not work when you are trying to move ‘repositoryA:/path/to/directoryToKeep’ to ‘repositoryB:/path/to/directoryToKeep’. Instead, ‘directoryToKeep’ is copied into the root of ‘repositoryB’ (e.g., ‘repositoryB:/directoryToKeep’) after I run `git pull --allow-unrelated-histories master`. What am I missing here? How do I make sure that git pull creates everything under ‘repositoryB:/path/to/directoryToKeep’?

    • Matthias Schoettle

      August 29, 2018 at 15:03

      Have you tried what I mentioned above step 1.5? In the source repository you could simply move the contents (which at this point will be in the root) into its own folder with one commit.

  6. If repositoryA previously had history of the contents of directoryToKeep being moved from someOtherDirectory->directoryToKeep, this will lose all history prior to that move occurring.
    This solution literally just looks for all instances of the files under a directoryToKeep folder in all commits in the history, and only keeps the commits/portions of the commits, that affect directoryToKeep.
    A more robust solution would likely need to recursively consider all files currently under directoryToKeep, inspect them to determine all their previous locations based on the history of possible moves, and take the sum-total bundled set of individual files and request that they all be kept.

    • Matthias Schoettle

      October 12, 2018 at 14:25

      That is correct.

      Do you know the name of the directory it was renamed from? If so, you could try what this post suggests.

      I am not sure if a general solution that follows renames exists.

  7. Awesome! Still a valid procedure.

  8. Is there a way to do this without changing the commit IDs? When I follow the procedure all is good, except that I have new commit IDs for all the commits.

    • Matthias Schoettle

      October 31, 2018 at 13:39

      No, using this technique it is not possible since the parent commit id (among other things) is used to determine the commit id (SHA-1 hash).

      Is there a specific reason you need to preserve the same commit ids?

  9. Thank you! This was an easy to use tutorial – after wrestling with this issue for over 3 hours, you helped me solve it in 5 mins!

  10. Whenever I try to run this command I get this error:

    ```
    Cannot create a new backup.
    A previous backup already exists in refs/original/
    Force overwriting the backup with -f
    ```


© 2018 Matthias Schoettle
