Conversation

derrickstolee commented Jan 17, 2026

My motivation for this feature is very similar to the bundle URI application. I can get around it by creating a tool that uses git rev-list --parents and then uses a hashset to collect the parent list and filter out any commits that ever appear as parents. It would be more efficient to use Git's native revision-walking feature.
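The workaround described above could look roughly like the following. This is a minimal sketch rather than the actual tool: the function names and the toy history are hypothetical, and the only Git behavior it relies on is that `git rev-list --parents` prints each commit followed by its parents on one line.

```python
import subprocess


def maximal_from_parents_output(lines):
    """Return commits that never appear as a parent of another listed commit.

    Each line has the `git rev-list --parents` shape:
    "<commit> <parent1> <parent2> ...".
    """
    commits = []            # commits in the order they were listed
    seen_as_parent = set()  # the "hashset" of every parent ID
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        commits.append(fields[0])
        seen_as_parent.update(fields[1:])
    return [c for c in commits if c not in seen_as_parent]


def maximal_commits(repo, *revs):
    """Hypothetical wrapper: run the walk in `repo` and filter its output."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--parents", *revs],
        check=True, capture_output=True, text=True,
    ).stdout
    return maximal_from_parents_output(out.splitlines())


# Toy history with shortened IDs: D merges B and C, E is a separate tip on A.
sample = ["D B C", "E A", "B A", "C A", "A"]
print(maximal_from_parents_output(sample))
```

The filter needs the full output before it can answer, which is part of why doing this inside the revision walk is attractive.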

This does fill the object struct's bitfield up to the 32-bit boundary: 28 flag bits, 3 type bits, and a parsed bit. Adding a new flag bit is my biggest concern with this update. I would understand if this feature is not worth running out of room for extensions there.

I considered looking through the earlier bit positions to see the impact of an overlap, but they certainly looked potentially risky to reuse.
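To make the bit budget concrete, here is a small sketch that models only the bitfield portion of the struct with `ctypes`. The class name `ObjectHeader` is made up for illustration, the constant names are borrowed from the description above, and the real struct carries more than these three fields.

```python
import ctypes

TYPE_BITS = 3
FLAG_BITS = 28


class ObjectHeader(ctypes.Structure):
    """Illustrative stand-in for the bitfield portion of the object struct."""
    _fields_ = [
        ("parsed", ctypes.c_uint, 1),
        ("type", ctypes.c_uint, TYPE_BITS),
        ("flags", ctypes.c_uint, FLAG_BITS),
    ]


# 1 + 3 + 28 bits pack into one 32-bit word, leaving no spare bits.
assert 1 + TYPE_BITS + FLAG_BITS == 32
print(ctypes.sizeof(ObjectHeader))  # size in bytes of the packed word

header = ObjectHeader()
header.flags = 1 << (FLAG_BITS - 1)  # the highest available flag bit
print(header.flags == 1 << 27, header.parsed, header.type)
```

Any further flag after this one would either have to reuse an existing position or grow the word past 32 bits.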

I wonder if anyone else has thought about this as a useful technique. For instance, it could be part of a strategy for choosing commits for reachability bitmaps.

Thanks,
-Stolee

cc: gitster@pobox.com
cc: Johannes Sixt <j6t@kdbg.org>

When inspecting a range of commits from some set of starting references, it
is sometimes useful to learn which commits are not reachable from any other
commits in the selected range.

One such application is in the creation of a sequence of bundles for the
bundle URI feature. Creating a stack of bundles representing different
slices of time includes defining which references to include. If all
references are used, then this may be overwhelming or redundant. Instead,
selecting commits that are maximal to the range could help define a
smaller reference set to use in the bundle header.

Add a new '--maximal' option to restrict the output of a revision range to
be only the commits that are not reachable from any other commit in the
range, based on the reachability definition of the walk.

This is accomplished by adding a new 28th bit flag, CHILD_VISITED, that is
set as we walk. This does extend the bit range in object.h, but using an
earlier bit may collide with another feature.
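The patch body is not shown in this conversation, so the following is only a guess at the shape of the mechanism, not the actual implementation. It assumes commits are visited children-first (the order `git rev-list --topo-order` guarantees) and that the new flag sits at bit 27; both the bit position and the function name are assumptions.

```python
CHILD_VISITED = 1 << 27  # assumed bit position for the new 28th flag


def walk_maximal(ordered, parents):
    """Yield commits whose CHILD_VISITED flag is still clear when visited.

    `ordered` must list every child before any of its parents.
    """
    flags = {}  # commit ID -> flag word, standing in for object flags
    for commit in ordered:
        if not (flags.get(commit, 0) & CHILD_VISITED):
            yield commit
        # Visiting a commit marks each parent as having a visited child.
        for parent in parents.get(commit, ()):
            flags[parent] = flags.get(parent, 0) | CHILD_VISITED


# Toy history: D merges B and C, E is a separate tip on A.
toy_parents = {"D": ["B", "C"], "E": ["A"], "B": ["A"], "C": ["A"], "A": []}
toy_order = ["E", "D", "C", "B", "A"]  # children before parents
print(list(walk_maximal(toy_order, toy_parents)))
```

Unlike a two-pass filter over the full output, this decides each commit as it is visited, at the cost of one flag bit per object.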

The tests demonstrate the behavior of the feature with a positive-only
range, ranges with negative references, and walk-modifying flags like
--first-parent and --exclude-first-parent-only.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
@derrickstolee

/submit

gitgitgadget bot commented Jan 18, 2026

Submitted as pull.2032.git.1768703645125.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2032/derrickstolee/maximal-v1

To fetch this version to local tag pr-2032/derrickstolee/maximal-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2032/derrickstolee/maximal-v1

gitgitgadget bot commented Jan 18, 2026

Johannes Sixt wrote on the Git mailing list (how to reply to this email):

Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
> diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
> index 453ec59057..f0d2ab32a9 100644
> --- a/Documentation/rev-list-options.adoc
> +++ b/Documentation/rev-list-options.adoc
> @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
>  	times; if so, a commit is included if it is any of the commits
>  	given or if it is an ancestor or descendant of one of them.
>  
> +`--maximal`::
> +	Restrict the output commits to be those that are not reachable
> +	from any other commits in the revision range.

I had to read this sentence three times to understand what it wants to
say, and that even though I had a rough idea what it was supposed to
mean. I tried to come up with a better wording, but found it to be
really hard.

	Restrict output to the commits at the tips of the
	revision range.

is all I could do, but this isn't a lot better, I am afraid.

The option name is too generic IMHO. How about "--starting-point",
"--topmost-only"?  Its function is somewhat parallel to --boundary, but
at the positive end of the revision range. Perhaps we can use that as
inspiration.

The option is listed among options that affect the way the
simplification is performed. But is this true? Isn't it just an option
that changes what output is produced?

-- Hannes

gitgitgadget bot commented Jan 18, 2026

User Johannes Sixt <j6t@kdbg.org> has been added to the cc: list.

gitgitgadget bot commented Jan 18, 2026

Derrick Stolee wrote on the Git mailing list (how to reply to this email):

On 1/18/26 4:05 AM, Johannes Sixt wrote:
> Am 18.01.26 um 03:34 schrieb Derrick Stolee via GitGitGadget:
>> diff --git a/Documentation/rev-list-options.adoc b/Documentation/rev-list-options.adoc
>> index 453ec59057..f0d2ab32a9 100644
>> --- a/Documentation/rev-list-options.adoc
>> +++ b/Documentation/rev-list-options.adoc
>> @@ -444,6 +444,10 @@ The following options affect the way the simplification is performed:
>>   	times; if so, a commit is included if it is any of the commits
>>   	given or if it is an ancestor or descendant of one of them.
>>
>> +`--maximal`::
>> +	Restrict the output commits to be those that are not reachable
>> +	from any other commits in the revision range.
>
> I had to read this sentence three times to understand what it wants to
> say, and that even though I had a rough idea what it was supposed to
> mean. I tried to come up with a better wording, but found it to be
> really hard.
>
> 	Restrict output to the commits at the tips of the
> 	revision range.
>
> is all I could do, but this isn't a lot better, I am afraid.
>
> The option name is too generic IMHO. How about "--starting-point",
> "--topmost-only"?  Its function is somewhat parallel to --boundary, but
> at the positive end of the revision range. Perhaps we can use that as
> inspiration.

My perspective is skewed, because "maximal" is a concrete term in the
world of partially-ordered sets (such as commit history ordered by
reachability across child-to-parent relationships). It's important to
distinguish from "starting points" because the inputs to the command
are a list of starting points, not all of which are maximal within the
set. In fact, if some positive starting points are reachable from the
negative starting points, then they are already excluded.
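A toy example may make the distinction concrete. The sketch below is only an illustration with a made-up five-commit history and hypothetical helper names; it models a range as "reachable from the positive starting points, minus reachable from the negative ones" and then picks the commits with no child inside that range.

```python
def reachable(starts, parents):
    """Return every commit reachable from `starts`, including the starts."""
    seen = set()
    stack = list(starts)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def revision_range(positive, negative, parents):
    """Commits reachable from `positive` but not from `negative`."""
    return reachable(positive, parents) - reachable(negative, parents)


def maximal(commits, parents):
    """Commits in the set that are not a parent of any other commit in it."""
    has_child = {p for c in commits for p in parents.get(c, ())}
    return sorted(commits - has_child)


# Made-up history: A <- B <- C and B <- D <- E.
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"]}

# Roughly `A C D E ^B`: four positive starting points, one negative.
selected = revision_range(["A", "C", "D", "E"], ["B"], history)
print(sorted(selected))
print(maximal(selected, history))
```

Here A is a positive starting point that drops out because it is reachable from the negative B, and D is a starting point that stays in the range but is not maximal because E reaches it. Only C and E remain.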

My familiarity with this term is skewed by my experience working with
such terms, so I'm very open to new names for this option.

Your comparison to --boundary is interesting, because --boundary _adds_
commits to the range by selecting the commits from the negative range
that are reachable from the output commits. --maximal as defined here
_restricts_ the output to a subset of the commits in the range. Its
interaction with --boundary is trivial because no boundary commits would
be included, as they are necessarily reachable from a maximal commit.

> The option is listed among options that affect the way the
> simplification is performed. But is this true? Isn't it just an option
> that changes what output is produced?

You're right that this is poorly placed. I'll put it in a better location
in v2.

Thanks,
-Stolee
