Encrypted blobs in Git and Mercurial
Git and Mercurial started their lives in the mid 2000 bringing the growing popularity DVCS concept to people. Both of them use similar techniques to gain the same results and offer very similar features. One of such features is bidirectional filtering applied to files once they get checked out and committed in (or staged to the index) from and to the repository respectively. The filters can be used in many ways, like SVN-like keyword expansion or auto-formatting.
And one of the most interesting filters is an ecrypting/decrypting filter making the repository a grey box data container. Unfortunately I still have no idea how to make it stop revealing the filenames, and not sure if it's possible to hide directory and filenames.
Git provides a special set of settings, named attributes, that can be used to fine-tune a particular repository.
Here is an example
*.* filter=crypt .gitattributes filter=
The two lines above are very simple:
*.* filter=crypt- apply the
cryptfilter to all files
.gitattributes filter=- but just don't apply the filter to the
.gitattributesfile, or the file would be encrypted hence blocking the filters
Now let's define the
crypt filter in local
[filter "crypt"] clean = openssl enc -e -aes256 -k PASSWORD smudge = openssl enc -d -aes256 -k PASSWORD
PASSWORD is your password in plain text.
Mercurial provides a very similar feature, however its use seems to be discouraged.
[encode] is an equivalent to Git
[decode] is equivalent to Git
[encode] ** = pipe: openssl enc -e -salt -aes256 -k PASSWORD **.* = pipe: openssl enc -e -salt -aes256 -k PASSWORD [decode] ** = pipe: openssl enc -d -salt -aes256 -k PASSWORD **.* = pipe: openssl enc -d -aes256 -k PASSWORD
Mercurial documentation recommends not to use the
pipe: mechanism for Windows:
The tempfile mechanism is recommended for Windows systems, where the standard shell I/O redirection operators often have strange effects. In particular, if you are doing line ending conversion on Windows using the popular
unix2dosprograms, you must use the tempfile mechanism, as using pipes will corrupt the contents of your files.
So we're going to use temporary files:
[encode] ** = tempfile: openssl enc -e -salt -aes256 -k PASSWORD -in INFILE -out OUTFILE **.* = tempfile: openssl enc -e -salt -aes256 -k PASSWORD -in INFILE -out OUTFILE [decode] ** = tempfile: openssl enc -d -aes256 -k PASSWORD -in INFILE -out OUTFILE **.* = tempfile: openssl enc -d -aes256 -k PASSWORD -in INFILE -out OUTFILE
**.* mean filenames with extensions and without extensions respectively.
I would not recommend to encrypt/decrypt files in Mercurial or Git repositories using filters, at least because of the following flaws.
Some files may have status reporting files as modified
For whatever reason, on Windows, probing the working copy status may report some files as changed regardless the real content change. This issue seems to be reproducible for both Git and Mercurial, and I don't know if there is a way to get rid of "spoiled" file statuses. Maybe this issue is caused by some file metadata, but even committing such files won't make the unchanged after the commit.
git mv and
hg mv to move files
Both Git and Mercurial seem not to track manually moved files when the files are modified dramatically using such filters. Still have to re-check it, but I have something in my memory reminding me file move issues for such repositories.
Since the almost any encrypted files look like true binary files, storing encrypted files cannot be recommended as a sufficient solution. Any file update would dramatically change the file binary representation, and this would mean that a file would be stored probably without delta. I'm not really sure about this exactly, but this is would probably work like that. However, if you prefer to encrypt a few short files only, I should not make a big impact.
Another flaw is that the password should be constant for all the repo life time.
If you change the password and update the working copy across different revisions with old passwords, the decrypted files will be broken in Mercurial because
.hgrc is not updated unlike Git
So such repositories are extremely password-sensitive anyway.
Accidental submitting an encrypted file requires history rewriting to encrypt it
Another annoying thing is when one submits a plain file to a repository with encrypted files but having no filters configured.
At least the history rewriting hell would begin if the accidentally committed file is supposed to be encrypted and matches filters filename patterns due to potential clashing.
git reset --soft HEAD~1 and
hg strip -r tip --keep would allow to make another commit and save the situation.
Git and Mercurial filtering may do great things, at least the ones described in Git online documentation, but encryption/decryption should be probably automated in other way, not with Git or Mercurial facilities.