Multiple staging areas

Git 2.5 introduced git worktree in July 2015. It allows one clone to have multiple working trees, in which you can isolate your various modifications.
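As a minimal sketch of that isolation (the repository, branch, and file names below are made up for the demo), a second working tree gets its own index, so staging in one tree leaves the other untouched:

```shell
# Throwaway repository so the demo touches nothing real.
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q main-tree
cd main-tree || exit 1
git -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One clone, a second working tree on its own new branch (hypothetical name).
git worktree add -b topic-a ../topic-a >/dev/null 2>&1 || exit 1

# Stage a change in the second working tree only.
echo "change for topic-a" > ../topic-a/a.txt
git -C ../topic-a add a.txt

# Each working tree reports its own staged files.
main_staged=$(git diff --cached --name-only)
topic_staged=$(git -C ../topic-a diff --cached --name-only)
echo "main tree staged: '$main_staged'"
echo "topic-a staged:   '$topic_staged'"
```

The main tree should report nothing staged, while topic-a reports a.txt.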

But as of Q4 2019, modifying the git-add--interactive.perl script will not remain an option for much longer, because with Git 2.25 (Q1 2020), "git add -i", which is being rewritten in C, has been extended to cover subcommands other than "patch".

(That rewrite was officially completed and made the default in Git 2.37, Q3 2022. See the last section.)

See commit 2e697ce, commit d763357, commit 8746e07, commit ab1e1cc, commit c54ef5e, commit a8c45be, commit f37c226, commit c08171d, commit 0c3944a (29 Nov 2019) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 3beff38, 16 Dec 2019)

built-in add -i: implement the patch command

Signed-off-by: Johannes Schindelin

Well, it is not a full implementation yet. In the interest of making this easy to review (and easy to keep bugs out), we still hand off to the Perl script to do the actual work.

The patch functionality actually makes up for more than half of the 1,800+ lines of git-add--interactive.perl. It will be ported from Perl to C incrementally, later.

Still in the context of rewriting git add in C: test coverage was extended in preparation for further work on "git add -i".

See commit b4bbbbd, commit 89c8559, commit e91162b, commit 0c3222c, commit 24be352, commit 8539b46, commit 0f0fba2 (06 Dec 2019) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 011fc2e, 16 Dec 2019)

git add -p: use non-zero exit code when the diff generation failed

Signed-off-by: Johannes Schindelin

The first thing git add -p does is to generate a diff. If this diff cannot be generated, git add -p should not continue as if nothing happened, but instead fail.

What we actually do here is much broader: we now verify for every run_cmd_pipe() call that the spawned process actually succeeded.

Note that we have to change two callers in this patch, as we need to store the spawned process' output in a local variable, which means that the callers can no longer decide whether to interpret the return <$fh> in array or in scalar context.

This bug was noticed while writing a test case for the diff.algorithm feature, and we let that test case double as a regression test for this fixed bug, too.


With Git 2.25 (Q1 2020), the effort to move "git-add--interactive" to C continues.

See commit 2e40831, commit 54d9d9b, commit ade246e, commit d6cf873, commit 9254bdf, commit bcdd297, commit b38dd9e, commit 11f2c0d, commit 510aeca, commit 0ecd9d2, commit 5906d5d, commit 47dc4fd, commit 80399ae, commit 7584dd3, commit 12c24cf, commit 25ea47a, commit e3bd11b, commit 1942ee4, commit f6aa7ec (13 Dec 2019) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 45b96a6, 25 Dec 2019)

built-in add -p: implement hunk editing

Signed-off-by: Johannes Schindelin

Just like git add --edit allows the user to edit the diff before it is being applied to the index, this feature allows the user to edit the diff hunk.

Naturally, it gets a bit more complicated here because the result has to play well with the remaining hunks of the overall diff. Therefore, we have to do a loop in which we let the user edit the hunk, then test whether the result would work, and if not, drop the edits and let the user decide whether to try editing the hunk again.

Note: in contrast to the Perl version, we use the same diff "coalescing" (i.e. merging overlapping hunks into a single one) also for the check after editing, and we introduce a new flag for that purpose that asks the reassemble_patch() function to pretend that all hunks were selected for use.

This allows us to continue to run git apply without the --allow-overlap option (unlike the Perl version), and it also fixes two known breakages in t3701-add-interactive.sh (which we cannot mark as resolved so far because the Perl script version is still the default and continues to have those breakages).

And:

built-in add -p: coalesce hunks after splitting them

Signed-off-by: Johannes Schindelin

This is considered "the right thing to do", according to 933e44d3a0 ("add -p": work-around an old laziness that does not coalesce hunks, 2011-04-06, Git v1.7.5.2).

Note: we cannot simply modify the hunks while merging them; Once we implement hunk editing, we will call reassemble_patch() whenever a hunk is edited, therefore we must not modify the hunks (because the user might e.g. hit K and change their mind whether to stage the previous hunk).

And:

built-in add -i: start implementing the patch functionality in C

Signed-off-by: Johannes Schindelin

In the previous steps, we re-implemented the main loop of git add -i in C, and most of the commands.

Notably, we left out the actual functionality of patch, as the relevant code makes up more than half of git-add--interactive.perl, and is actually pretty independent of the rest of the commands.

With this commit, we start to tackle that patch part. For better separation of concerns, we keep the code in a separate file, add-patch.c. The new code is still guarded behind the add.interactive.useBuiltin config setting, and for the moment, it can only be called via git add -p.

The actual functionality follows the original implementation of 5cde71d64aff ("git add --interactive", 2006-12-10, Git v1.5.0-rc0 -- merge), but not too closely (for example, we use string offsets rather than copying strings around, and after seeing whether the k and j commands are applicable, in the C version we remember which previous/next hunk was undecided, and use it rather than looking again when the user asked to jump).

As a further deviation from that commit, We also use a comma instead of a slash to separate the available commands in the prompt, as the current version of the Perl script does this, and we also add a line about the question mark ("print help") to the help text.

While it is tempting to use this conversion of git add -p as an excuse to work on apply_all_patches() so that it does not want to read a file from stdin or from a file, but accepts, say, an strbuf instead, we will refrain from this particular rabbit hole at this stage.

The conclusion of that rewriting effort is found with Git 2.29 (Q4 2020): the "add -i/-p" machinery has been written in C, but it is not used by default yet.
It is made the default for those who participate in the feature.experimental experiment.

See commit 2df2d81 (08 Sep 2020) by Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit e96b271, 18 Sep 2020)

add -i: use the built-in version when feature.experimental is set

Acked-by: Johannes Schindelin

We have had parallel implementations of "add -i/-p" since 2.25 and have been using them from various codepaths since 2.26 days, but never made the built-in version the default.

We have found and fixed a handful of corner case bugs in the built-in version, and it may be a good time to start switching over the user base from the scripted version to the built-in version.

Let's enable the built-in version for those who opt into the feature.experimental guinea-pig program to give wider exposure.
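For illustration, opting in looks roughly like this. The sketch sets the option in a throwaway repository; in practice you would more likely set it once with --global:

```shell
# Throwaway repository, only so the setting does not leak into real config.
repo=$(mktemp -d) || exit 1
git init -q "$repo"

# Opt in to the experimental feature set for this repository.
git -C "$repo" config feature.experimental true

# Read the value back to confirm it was recorded.
experimental=$(git -C "$repo" config --get feature.experimental)
echo "feature.experimental = $experimental"
```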

And, still with Git 2.29 (Q4 2020), an "add -i/-p" fix:

See commit 1c6ffb5, commit dc62641 (07 Sep 2020) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 694e517, 18 Sep 2020)

add-patch: fix inverted return code of repo_read_index()

Signed-off-by: Jeff King
Acked-by: Johannes Schindelin

After applying hunks to a file with "add -p", the C patch_update_file() function tries to refresh the index (just like the Perl version does).
We can only refresh the index if we're able to read it in, so we first check the return value of repo_read_index().
But unlike many functions, where "0" is success, that function is documented to return the number of entries in the index.
Hence we should be checking for success with a non-negative return value.

Neither the tests nor any users seem to have noticed this, probably due to a combination of:

  • this affects only the C version, which is not yet the default
  • following it up with any porcelain command like "git diff" or "git commit" would refresh the index automatically.

But you can see the problem by running the plumbing "git diff-files" immediately after "add -p" stages all hunks. Running the new test with GIT_TEST_ADD_I_USE_BUILTIN=1 fails without the matching code change.


With Git 2.37 (Q3 2022), "git add -i", which was rewritten in C some time ago and has been in testing, is now exposed to the general public by default.

See commit 0527ccb, commit ed922dc (30 Nov 2021) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 1fc1879, 30 May 2022)

add -i: default to the built-in implementation

Signed-off-by: Johannes Schindelin

In 9a5315e ("Merge branch 'js/patch-mode-in-others-in-c'", 2020-02-05, Git v2.26.0-rc0 -- merge listed in batch #3), Git acquired a built-in implementation of git add's interactive mode that could be turned on via the config option add.interactive.useBuiltin.

The first official Git version to support this knob was v2.26.0.

In 2df2d81 ("add -i: use the built-in version when feature.experimental is set", 2020-09-08, Git v2.29.0-rc0 -- merge listed in batch #15), this built-in implementation was also enabled via feature.experimental.
The first version with this change was v2.29.0.

More than a year (and very few bug reports) later, it is time to declare the built-in implementation mature and to turn it on by default.

We specifically leave the add.interactive.useBuiltin configuration in place, to give users an "escape hatch" in the unexpected case should they encounter a previously undetected bug in that implementation.

git config now includes in its man page:

Set to false to fall back to the original Perl implementation of the interactive version of git add instead of the built-in version.
Is true by default.
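A minimal sketch of using that escape hatch follows, again in a throwaway repository. This only matters on Git versions that still ship the Perl script; as far as I know, releases from 2.40 on dropped it and ignore the setting.

```shell
# Throwaway repository so the setting stays local to the demo.
repo=$(mktemp -d) || exit 1
git init -q "$repo"

# Fall back to the Perl implementation of "git add -i" / "git add -p".
git -C "$repo" config add.interactive.useBuiltin false

# Read the value back to confirm it was recorded.
use_builtin=$(git -C "$repo" config --get add.interactive.useBuiltin)
echo "add.interactive.useBuiltin = $use_builtin"
```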


Edit, 30 May 2020: In Git 2.15 or later I recommend using git worktree instead of trying to do the below. There are some restrictions on added work-trees that make them somewhat annoying for this kind of work-flow, but it can work, and is built in to modern Git.

Note that if you do do something like I describe below, git gc won't know to look in your alternate index files, and in fact, from its original introduction in Git 2.5 until it was fixed in Git 2.15, git gc forgot to check added work-trees and their index files!

See VonC's answer for more.


You can in fact have multiple different staging areas (more literally, multiple index files) in git. To achieve the effect you want you would have to write your own variant of git add -p anyway, so what I will do here is sketch an outline, as it were, of how to do this.

The default index file, the one git uses if you don't direct it to some other index file, lives in .git/index (or, more shell-correctly, $GIT_DIR/index, where $GIT_DIR is taken from the environment or, if not set there, from git rev-parse --git-dir).
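A quick way to check where the index lives is sketched below in a throwaway repository. git rev-parse --git-path index resolves the same location and, per its documentation, also takes relocation variables such as GIT_INDEX_FILE into account:

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q .

# The index file is created the first time something is staged.
echo "hello" > f.txt
git add f.txt

git_dir=$(git rev-parse --git-dir)
index_path=$(git rev-parse --git-path index)
echo "git dir:    $git_dir"
echo "index file: $index_path"
```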

If you set the environment variable GIT_INDEX_FILE, however, git will use that file instead as the index. Thus, you might begin your "scatter changes to four branches" process by doing something like this:

GIT_DIR=${GIT_DIR:-$(git rev-parse --git-dir)} || exit 1
index_tmp_dir=$(mktemp -d) || exit 1
trap 'rm -rf "$index_tmp_dir"' 0 1 2 3 15 # clean up on exit

# make four copies of initial staging area
for f in i1 i2 i3 i4; do
    cp "$GIT_DIR/index" "$index_tmp_dir/$f"
done

# THIS IS THE HARD PART:
# Now, using `git diff-files -p` or similar, get patches
# (diff hunks).
# Whenever you're ready to stage one, pick an index for it,
# then use:
GIT_INDEX_FILE="$index_tmp_dir/$which" git apply --cached < diffhunk

# Once done, commit each index file separately with some
# variation on:
for f in i1 i2 i3 i4; do
    GIT_INDEX_FILE="$index_tmp_dir/$f" git commit
done
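To see the mechanism in isolation, here is a small self-contained sketch that works at whole-file granularity rather than per hunk. The file names and the single alternate index are assumptions for the demo. It stages one change in a copied index and leaves the default index alone:

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q .
printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "base"

# Two unrelated modifications in the working tree.
printf 'two\n' >> a.txt
printf 'two\n' >> b.txt

# Copy the default index to an alternate location (absolute path).
git_dir=$(git rev-parse --git-dir)
alt_index=$(mktemp) || exit 1
cp "$git_dir/index" "$alt_index"

# Stage only the a.txt change, and only in the alternate index.
git diff-files -p -- a.txt | GIT_INDEX_FILE=$alt_index git apply --cached

default_staged=$(git diff --cached --name-only)
alt_staged=$(GIT_INDEX_FILE=$alt_index git diff --cached --name-only)
echo "default index staged:   '$default_staged'"
echo "alternate index staged: '$alt_staged'"
```

The default index should show nothing staged, and the alternate index should show only a.txt.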

For the part labeled "hard part", your best bet might well be to copy git's add-interactive perl script, found in $(git --exec-path)/git-add--interactive, then modify it to suit. To remove the "exactly four commits" restriction, make this modified interactive-add create a new index file dynamically (by copying the original, or perhaps creating an "empty" index equal to the HEAD commit or whatever; see git read-tree as well).
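Creating such an index on demand might look like the sketch below. The index name i5 and the demo repository are made up; git read-tree HEAD fills a brand-new index file with the contents of the HEAD commit:

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q .
printf 'one\n' > a.txt
git add a.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "base"

# Directory for extra index files, outside the repository.
index_dir=$(mktemp -d) || exit 1
new_index=$index_dir/i5    # hypothetical name for a fifth index

# Populate a fresh index from HEAD; the file does not exist beforehand.
GIT_INDEX_FILE=$new_index git read-tree HEAD || exit 1

# The new index lists exactly the files recorded in HEAD.
listed=$(GIT_INDEX_FILE=$new_index git ls-files)
echo "files in new index: $listed"
```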

Edit: the "some variation on" section should almost certainly use git write-tree and git commit-tree to make new branches out of each of these commits, using the parent of the current commit as their parent, rather than allowing git commit to string the commits together as a linear chain. That means one must also choose some naming scheme for these various newly-created branches.
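A hedged sketch of that last step for a single alternate index follows. The branch name topic-1 is made up, and the sketch uses HEAD as the parent for simplicity; substitute whichever commit should serve as the common base:

```shell
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
git init -q .
printf 'one\n' > a.txt
git add a.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "base"

# Stage a change in an alternate index only.
printf 'two\n' >> a.txt
alt_index=$(mktemp) || exit 1
cp "$(git rev-parse --git-dir)/index" "$alt_index"
GIT_INDEX_FILE=$alt_index git add a.txt

# Assumption: HEAD is the common base for every new branch.
parent=$(git rev-parse HEAD)

# Turn the alternate index into a tree, the tree into a commit,
# and point a new branch at it without moving HEAD.
tree=$(GIT_INDEX_FILE=$alt_index git write-tree) || exit 1
commit=$(git -c user.name=Demo -c user.email=demo@example.com \
    commit-tree "$tree" -p "$parent" -m "topic-1: change a.txt") || exit 1
git branch topic-1 "$commit"

changed=$(git diff --name-only "$parent" topic-1)
echo "topic-1 changes relative to base: $changed"
```

Repeating this for each index file, always with the same parent, yields sibling branches rather than a linear chain.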
