fix: Fix Unicode highlight alignment in patches #357

KeqingMoe · 2025-12-23T20:54:53Z

Summary of Changes

This PR fixes a defect in the annotate-snippets library where patch highlighting becomes misaligned when handling Unicode characters. Currently, when a source line contains Unicode characters (either before the patch location or within the patch content itself), the color highlighting (e.g., for Removal) is incorrectly offset.

Root Cause

The issue originates in the emit_suggestion_default function (src/renderer/render.rs). This function incorrectly mixes different units when calling StyledBuffer::set_style_range:

The col_start and col_end parameters expect character counts
However, the function passes: padding (character count) + span_start/end_pos (display width) + line.len() (byte length)

This inconsistent usage of units (characters, display columns, bytes) causes the highlight ranges to be calculated incorrectly when multi-byte Unicode characters are present, leading to visual misalignment.

Fix

This PR corrects the unit calculations in emit_suggestion_default, ensuring that consistent character-based positioning is used throughout. The fix ensures that:

Highlight ranges are calculated based on proper character counts (or display widths where appropriate)
Unicode characters (including emoji, CJK characters, and combining characters) no longer cause highlight misalignment
The Removal color (and other patch styles) now correctly aligns with the intended text segments

Additional Context

The same emit_suggestion_default function is also used by draw_code_line and render_source_line in render.rs. While these functions may have similar issues, they have not been tested or modified in this PR. Further investigation and potential fixes for these functions may be needed in future work.
Regarding the .term.svg Output: In the attached .term.svg output, the underline markers ("^^^^^^^^ here") show a misalignment of approximately 1.5 units of display width. I suspect this is a rendering bug in snapbox. When tested in actual terminals, the underline markers align correctly.

Muscraft

Thanks for catching this and coming up with a fix!

Can you make the commit adding the new test, the first commit in this PR? It makes it much easier to review the change. The "tip" at the end of this section in the Cargo Contributor Guide, describes the reasoning better.

Muscraft · 2025-12-23T23:34:26Z

src/renderer/render.rs

-                        (padding as isize + span_start_pos as isize) as usize,
-                        (padding as isize + span_end_pos as isize) as usize,
+                        (padding as isize + span_start_char_pos as isize) as usize,
+                        (padding as isize + span_end_char_pos as isize) as usize,


Using char spans here (and above) will cause issues with \t as we normalize the whitespace, which replaces \t with four spaces. We currently take this into account for addition lines, doing the same here should make things work.

I'll try it. However, the library's code doesn't explicitly distinguish between character count, byte count, and display width, which makes it unclear which of the three usize values it refers to. If considering normalize_whitespace, we have four different units. I might need to investigate further.

Muscraft · 2025-12-23T23:34:34Z

tests/color/multiline_removal_suggestion.ascii.term.svg

    <tspan x="10px" y="334px"><tspan class="fg-bright-blue bold">20</tspan><tspan> </tspan><tspan class="fg-bright-red">-                     )</tspan>
 </tspan>
-    <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">-                 })</tspan><tspan>.flatten()</tspan>
+    <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">-       </tspan><tspan>          }).flatten()</tspan>


This is a regression; I suspect the cause is related to the tab characters in the test's source.

Muscraft · 2025-12-23T23:34:36Z

tests/color/multiline_removal_suggestion.unicode.term.svg

    <tspan x="10px" y="334px"><tspan class="fg-bright-blue bold">20</tspan><tspan> </tspan><tspan class="fg-bright-red">-                     )</tspan>
 </tspan>
-    <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">-                 })</tspan><tspan>.flatten()</tspan>
+    <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">-       </tspan><tspan>          }).flatten()</tspan>


This is a regression; I suspect the cause is related to the tab characters in the test's source.

KeqingMoe · 2025-12-24T10:39:16Z

Thank you for pointing out the tab character issue. After investigating, I found that simply copying the addition line's approach would cause complications because the logic there is intertwined with other formatting concerns.

Instead, I've implemented a cleaner solution: adding a separate normalized field to the Loc struct to track tab-normalized positions independently. This approach:

Preserves the original fix for Unicode characters – still using character-based counting for correct multi-byte character handling.
Properly handles tab expansion – by maintaining normalized position calculations alongside the original ones.
Avoids interfering with existing logic – the change is localized and doesn't break other assumptions in the codebase.

The key changes:

In Loc, added a normalized: usize field to track the position after tab expansion
Updated position calculations to use this normalized value when determining highlight ranges
This ensures that both Unicode characters and tab characters are handled correctly in the same rendering pass

All existing tests pass with this change, including the one with tab characters that was failing before. The solution is more maintainable than trying to retrofit the addition line's complex logic into the removal handling.

Let me know if you'd like me to add more comments or if there are other edge cases we should consider.

KeqingMoe changed the title ~~fix: Improve highlighting for patches con taining Unicode characters~~ fix: Improve highlighting for patches containing Unicode characters Dec 23, 2025

KeqingMoe force-pushed the fix-unicode-patch branch from 8c25895 to 1578f54 Compare December 23, 2025 21:33

KeqingMoe changed the title ~~fix: Improve highlighting for patches containing Unicode characters~~ fix: Fix Unicode highlight alignment in patches Dec 23, 2025

Muscraft reviewed Dec 23, 2025

View reviewed changes

KeqingMoe added 4 commits December 24, 2025 18:15

refactor: Rename byte_pos to display_pos

55a280b

test: Unicode highlight alignment in patches

76c94be

feat: Add Log::normalized

ae7e9d6

fix: Fix Unicode highlight alignment in patches

2fb93f5

KeqingMoe force-pushed the fix-unicode-patch branch from 1578f54 to 2fb93f5 Compare December 24, 2025 10:33

KeqingMoe requested a review from Muscraft December 24, 2025 10:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Fix Unicode highlight alignment in patches #357

fix: Fix Unicode highlight alignment in patches #357

Uh oh!

KeqingMoe commented Dec 23, 2025

Uh oh!

Muscraft left a comment

Uh oh!

Muscraft Dec 23, 2025

Uh oh!

KeqingMoe Dec 24, 2025

Uh oh!

Muscraft Dec 23, 2025

Uh oh!

Muscraft Dec 23, 2025

Uh oh!

KeqingMoe commented Dec 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fix: Fix Unicode highlight alignment in patches #357

Are you sure you want to change the base?

fix: Fix Unicode highlight alignment in patches #357

Uh oh!

Conversation

KeqingMoe commented Dec 23, 2025

Summary of Changes

Root Cause

Fix

Additional Context

Uh oh!

Muscraft left a comment

Choose a reason for hiding this comment

Uh oh!

Muscraft Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

KeqingMoe Dec 24, 2025

Choose a reason for hiding this comment

Uh oh!

Muscraft Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

Muscraft Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

KeqingMoe commented Dec 24, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants