Skip to content

line-log: route -L output through the standard diff pipeline#2065

Open
mmontalbo wants to merge 4 commits intogitgitgadget:masterfrom
mmontalbo:spike-xdiff-line-range
Open

line-log: route -L output through the standard diff pipeline#2065
mmontalbo wants to merge 4 commits intogitgitgadget:masterfrom
mmontalbo:spike-xdiff-line-range

Conversation

@mmontalbo
Copy link

@mmontalbo mmontalbo commented Mar 6, 2026

git log -L has bypassed the standard diff pipeline since its
introduction, using dump_diff_hacky() to hand-roll diff output.
A NEEDSWORK comment has acknowledged this from the start. This
series removes dump_diff_hacky() and routes -L output through
builtin_diff() / fn_out_consume(), so that diff formatting
options like --word-diff, --color-moved, -w, and pickaxe options
(-S, -G) work with -L.

This replaces my earlier series "line-log: fix -L with pickaxe
options" [1]. Patch 1 is the crash fix from that series (unchanged).
Patch 2/2 from that series (rejecting -S/-G) is dropped because
this series makes those options work instead of rejecting them.

[1] https://lore.kernel.org/git/pull.2061.git.1772651484.gitgitgadget@gmail.com/

Patch 1 fixes a crash when combining -L with pickaxe options and
a rename.

Patch 2 is the core change: callback wrappers filter xdiff's output
to tracked line ranges, and line ranges are carried on diff_filepair
so each file's ranges travel with its filepair through the pipeline.
diffcore_std() runs at output time, so pickaxe, --orderfile, and
--diff-filter also work.

Patch 3 adds tests covering the newly-working options.

Patch 4 updates documentation.

User-visible output change: -L output now includes index lines,
new file mode headers, and funcname context in @@ headers that
were previously missing. Tools parsing -L output may need to handle
these additional lines.

Known limitations not addressed in this series:

  • line_log_print() still calls show_log() and diff_flush()
    directly, bypassing log_tree_diff_flush(). The early return in
    log_tree_commit() (and its associated NEEDSWORK about no_free
    not being restored) is pre-existing. Restructuring -L to flow
    through log_tree_diff_flush() is a larger change that would
    affect separator and header logic; it is left for a follow-up.

  • Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
    unimplemented for -L.

@mmontalbo mmontalbo force-pushed the spike-xdiff-line-range branch from 696d15d to 0d3e84a Compare March 6, 2026 23:23
@mmontalbo
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2026

Submitted as pull.2065.git.1772845338.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2065/mmontalbo/spike-xdiff-line-range-v1

To fetch this version to local tag pr-2065/mmontalbo/spike-xdiff-line-range-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2065/mmontalbo/spike-xdiff-line-range-v1

queue_diffs() passes the caller's diff_options, which may carry
user-specified pickaxe state, to diff_tree_oid() and diffcore_std()
when detecting renames for line-level history tracking.  When pickaxe
options are present on the command line (-G and -S to filter by text
pattern, --find-object to filter by object identity), diffcore_std()
also runs diffcore_pickaxe(), which may discard diff pairs that are
relevant for rename detection.  Losing those pairs breaks rename
following.

Before a2bb801 (line-log: avoid unnecessary full tree diffs,
2019-08-21), this silently truncated history at rename boundaries.
That commit moved filter_diffs_for_paths() inside the rename-
detection block, so it only runs when diff_might_be_rename() returns
true.  When pickaxe discards a rename pair, the rename goes
undetected, and a deletion pair at a subsequent commit passes
through uncleaned, reaching process_diff_filepair() with an invalid
filespec and triggering an assertion failure.

Fix this by building a private diff_options for the rename-detection
path inside queue_diffs(), following the same pattern used by blame's
find_rename().  This isolates the rename machinery from unrelated
user-specified options.

Reported-by: Matthew Hughes <matthewhughes934@gmail.com>
Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
`git log -L` has always bypassed the standard diff pipeline.
`dump_diff_hacky()` in line-log.c hand-rolls its own diff headers and
hunk output, which means most diff formatting options are silently
ignored.  A NEEDSWORK comment has acknowledged this since the feature
was introduced:

    /*
     * NEEDSWORK: manually building a diff here is not the Right
     * Thing(tm).  log -L should be built into the diff pipeline.
     */

Remove `dump_diff_hacky()` and its helpers and route -L output through
`builtin_diff()` / `fn_out_consume()`, the same path used by `git diff`
and `git log -p`.  The mechanism is a pair of callback wrappers that sit
between `xdi_diff_outf()` and `fn_out_consume()`, filtering xdiff's
output to only the tracked line ranges.  To ensure xdiff emits all lines
within each range as context, the context length is inflated to span the
largest range.

Wire up the `-L` implies `--patch` default in revision setup rather
than forcing it at output time, so `line_log_print()` is just
`diffcore_std()` + `diff_flush()` with no format save/restore.
Rename detection is a no-op since pairs are already resolved during
the history walk in `queue_diffs()`, but running `diffcore_std()`
means `-S`/`-G` (pickaxe), `--orderfile`, and `--diff-filter` now
work with `-L`, and `diff_resolve_rename_copy()` sets pair statuses
correctly without manual assignment.

Switch `diff_filepair_dup()` from `xmalloc` to `xcalloc` so that new
fields (including `line_ranges`) are zero-initialized by default.

As a result, diff formatting options that were previously silently
ignored (e.g. --word-diff, --no-prefix, -w, --color-moved) now work
with -L, and output gains `index` lines, `new file mode` headers, and
funcname context in `@@` headers.  This is a user-visible output change:
tools that parse -L output may need to handle the additional header
lines.

The context-length inflation means xdiff may process more output than
needed for very wide line ranges, but benchmarks on files up to 7800
lines show no measurable regression.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Now that -L output flows through the standard diff pipeline, verify
that previously-ignored diff options work: formatting (--word-diff,
--word-diff-regex, --no-prefix, --src/dst-prefix, --full-index,
--abbrev), whitespace handling (-w, -b), output indicators
(--output-indicator-new/old/context), direction reversal (-R),
--color-moved, and pickaxe options (-S, -G).

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Now that -L output flows through the standard diff pipeline,
document that patch formatting options like --word-diff,
--color-moved, --no-prefix, whitespace handling (-w, -b),
and pickaxe options (-S, -G) are supported.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo mmontalbo force-pushed the spike-xdiff-line-range branch from da43015 to d85b939 Compare March 7, 2026 01:46
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2026

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> git log -L has bypassed the standard diff pipeline since its introduction,
> using dump_diff_hacky() to hand-roll diff output. A NEEDSWORK comment has
> acknowledged this from the start. This series removes dump_diff_hacky() and
> routes -L output through builtin_diff() / fn_out_consume(), so that diff
> formatting options like --word-diff, --color-moved, -w, and pickaxe options
> (-S, -G) work with -L.

Exciting.

> User-visible output change: -L output now includes index lines, new file
> mode headers, and funcname context in @@ headers that were previously
> missing. Tools parsing -L output may need to handle these additional lines.
>
> Known limitations not addressed in this series:
>
>  * line_log_print() still calls show_log() and diff_flush() directly,
>    bypassing log_tree_diff_flush(). The early return in log_tree_commit()
>    (and its associated NEEDSWORK about no_free not being restored) is
>    pre-existing. Restructuring -L to flow through log_tree_diff_flush() is a
>    larger change that would affect separator and header logic; it is left
>    for a follow-up.

OK.  Previously all the output routines were hand-rolled, but this
reduces the extent of deviation---as long as we are moving in the
right direction, it is a good idea to find a good place to stop and
leave the rest for later.

>  * Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
>    unimplemented for -L.

It would not hurt if these are omitted.

> Michael Montalbo (4): line-log: fix crash when combined with pickaxe options
> line-log: route -L output through the standard diff pipeline t4211: add
> tests for -L with standard diff options doc: note that -L supports patch
> formatting and pickaxe options

I am not sure what this bloc is, but it looks like a reflowed
version of the list of commits below?

> Michael Montalbo (4):
>   line-log: fix crash when combined with pickaxe options
>   line-log: route -L output through the standard diff pipeline
>   t4211: add tests for -L with standard diff options
>   doc: note that -L supports patch formatting and pickaxe options

Let me throw in a handful of names found in the output of "git
shortlog --no-merges -s -n line-log.[ch]" on the Cc: line to solicit
help.
.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2026

Michael Montalbo wrote on the Git mailing list (how to reply to this email):

On Fri, Mar 6, 2026 at 5:29 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > git log -L has bypassed the standard diff pipeline since its introduction,
> > using dump_diff_hacky() to hand-roll diff output. A NEEDSWORK comment has
> > acknowledged this from the start. This series removes dump_diff_hacky() and
> > routes -L output through builtin_diff() / fn_out_consume(), so that diff
> > formatting options like --word-diff, --color-moved, -w, and pickaxe options
> > (-S, -G) work with -L.
>
> Exciting.

:)

> > User-visible output change: -L output now includes index lines, new file
> > mode headers, and funcname context in @@ headers that were previously
> > missing. Tools parsing -L output may need to handle these additional lines.
> >
> > Known limitations not addressed in this series:
> >
> >  * line_log_print() still calls show_log() and diff_flush() directly,
> >    bypassing log_tree_diff_flush(). The early return in log_tree_commit()
> >    (and its associated NEEDSWORK about no_free not being restored) is
> >    pre-existing. Restructuring -L to flow through log_tree_diff_flush() is a
> >    larger change that would affect separator and header logic; it is left
> >    for a follow-up.
>
> OK.  Previously all the output routines were hand-rolled, but this
> reduces the extent of deviation---as long as we are moving in the
> right direction, it is a good idea to find a good place to stop and
> leave the rest for later.
>
> >  * Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
> >    unimplemented for -L.
>
> It would not hurt if these are omitted.
>

Makes sense. I can omit in a follow-up.

> > Michael Montalbo (4): line-log: fix crash when combined with pickaxe options
> > line-log: route -L output through the standard diff pipeline t4211: add
> > tests for -L with standard diff options doc: note that -L supports patch
> > formatting and pickaxe options
>
> I am not sure what this bloc is, but it looks like a reflowed
> version of the list of commits below?
>

Yes, this was a mistake I made when crafting the cover letter.

> > Michael Montalbo (4):
> >   line-log: fix crash when combined with pickaxe options
> >   line-log: route -L output through the standard diff pipeline
> >   t4211: add tests for -L with standard diff options
> >   doc: note that -L supports patch formatting and pickaxe options
>
> Let me throw in a handful of names found in the output of "git
> shortlog --no-merges -s -n line-log.[ch]" on the Cc: line to solicit
> help.
> .
>

Great! Thank you for your help on the review!

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2026

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

Michael Montalbo <mmontalbo@gmail.com> writes:

>> >  * Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
>> >    unimplemented for -L.
>>
>> It would not hurt if these are omitted.
>>
>
> Makes sense. I can omit in a follow-up.

You are already omitting, no?  I took "remain unimplemented" to mean
exactly that.  Mentioning that we are not adding support for them,
like you did in the above sentence that I commented on, is a good
thing to do, I think.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 7, 2026

Michael Montalbo wrote on the Git mailing list (how to reply to this email):

On Fri, Mar 6, 2026 at 6:05 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Michael Montalbo <mmontalbo@gmail.com> writes:
>
> >> >  * Non-patch diff formats (--raw, --numstat, --stat, etc.) remain
> >> >    unimplemented for -L.
> >>
> >> It would not hurt if these are omitted.
> >>
> >
> > Makes sense. I can omit in a follow-up.
>
> You are already omitting, no?  I took "remain unimplemented" to mean
> exactly that.  Mentioning that we are not adding support for them,
> like you did in the above sentence that I commented on, is a good
> thing to do, I think.

Ah got it, I confused myself. I thought this meant updating docs/code
to more explicitly disallow these options from interacting but this is
already the case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant