-
Notifications
You must be signed in to change notification settings - Fork 53
fix: Fix Unicode highlight alignment in patches #357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
8c25895 to
1578f54
Compare
Muscraft
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this and coming up with a fix!
Can you make the commit adding the new test, the first commit in this PR? It makes it much easier to review the change. The "tip" at the end of this section in the Cargo Contributor Guide, describes the reasoning better.
src/renderer/render.rs
Outdated
| (padding as isize + span_start_pos as isize) as usize, | ||
| (padding as isize + span_end_pos as isize) as usize, | ||
| (padding as isize + span_start_char_pos as isize) as usize, | ||
| (padding as isize + span_end_char_pos as isize) as usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using char spans here (and above) will cause issues with \t as we normalize the whitespace, which replaces \t with four spaces. We currently take this into account for addition lines, doing the same here should make things work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try it. However, the library's code doesn't explicitly distinguish between character count, byte count, and display width, which makes it unclear which of the three usize values it refers to. If considering normalize_whitespace, we have four different units. I might need to investigate further.
| <tspan x="10px" y="334px"><tspan class="fg-bright-blue bold">20</tspan><tspan> </tspan><tspan class="fg-bright-red">- )</tspan> | ||
| </tspan> | ||
| <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">- })</tspan><tspan>.flatten()</tspan> | ||
| <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">- </tspan><tspan> }).flatten()</tspan> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a regression; I suspect the cause is related to the tab characters in the test's source.
| <tspan x="10px" y="334px"><tspan class="fg-bright-blue bold">20</tspan><tspan> </tspan><tspan class="fg-bright-red">- )</tspan> | ||
| </tspan> | ||
| <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">- })</tspan><tspan>.flatten()</tspan> | ||
| <tspan x="10px" y="352px"><tspan class="fg-bright-blue bold">21</tspan><tspan> </tspan><tspan class="fg-bright-red">- </tspan><tspan> }).flatten()</tspan> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a regression; I suspect the cause is related to the tab characters in the test's source.
1578f54 to
2fb93f5
Compare
|
Thank you for pointing out the tab character issue. After investigating, I found that simply copying the Instead, I've implemented a cleaner solution: adding a separate
The key changes:
All existing tests pass with this change, including the one with tab characters that was failing before. The solution is more maintainable than trying to retrofit the Let me know if you'd like me to add more comments or if there are other edge cases we should consider. |
Summary of Changes
This PR fixes a defect in the
annotate-snippetslibrary where patch highlighting becomes misaligned when handling Unicode characters. Currently, when a source line contains Unicode characters (either before the patch location or within the patch content itself), the color highlighting (e.g., forRemoval) is incorrectly offset.Root Cause
The issue originates in the
emit_suggestion_defaultfunction (src/renderer/render.rs). This function incorrectly mixes different units when callingStyledBuffer::set_style_range:col_startandcol_endparameters expect character countspadding(character count) +span_start/end_pos(display width) +line.len()(byte length)This inconsistent usage of units (characters, display columns, bytes) causes the highlight ranges to be calculated incorrectly when multi-byte Unicode characters are present, leading to visual misalignment.
Fix
This PR corrects the unit calculations in
emit_suggestion_default, ensuring that consistent character-based positioning is used throughout. The fix ensures that:Removalcolor (and other patch styles) now correctly aligns with the intended text segmentsAdditional Context
The same
emit_suggestion_defaultfunction is also used bydraw_code_lineandrender_source_lineinrender.rs. While these functions may have similar issues, they have not been tested or modified in this PR. Further investigation and potential fixes for these functions may be needed in future work.Regarding the .term.svg Output: In the attached .term.svg output, the underline markers (
"^^^^^^^^ here") show a misalignment of approximately 1.5 units of display width. I suspect this is a rendering bug in snapbox. When tested in actual terminals, the underline markers align correctly.