fix(wc): respect C/POSIX locale for character counting by naoNao89 · Pull Request #11006 · uutils/coreutils

naoNao89 · 2026-02-18T04:27:10Z

In the C/POSIX locale, `wc -m` now counts bytes rather than UTF-8 characters. This matches GNU coreutils, where `MB_CUR_MAX` is 1 in these locales, so every byte is treated as one character.
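To make the difference concrete, here is a minimal sketch of the two counting rules. It is not the code from this PR: `count_chars` and its `chars_are_bytes` flag are hypothetical names standing in for "the locale is C/POSIX".

```rust
/// Count "characters" under one of two interpretations.
/// `chars_are_bytes` is a hypothetical flag meaning "MB_CUR_MAX == 1",
/// i.e. the C/POSIX locale, where every byte is one character.
fn count_chars(input: &str, chars_are_bytes: bool) -> usize {
    if chars_are_bytes {
        // C/POSIX: one character per byte.
        input.len()
    } else {
        // UTF-8 locale: one character per Unicode scalar value.
        input.chars().count()
    }
}

fn main() {
    // "Tiếng Việt\n" written with escapes so the encoding is unambiguous:
    // U+1EBF and U+1EC7 each take 3 bytes in UTF-8.
    let text = "Ti\u{1ebf}ng Vi\u{1ec7}t\n";
    println!("C/POSIX locale: {}", count_chars(text, true));
    println!("UTF-8 locale:   {}", count_chars(text, false));
}
```

For this input the byte rule gives 15 and the UTF-8 rule gives 11, which is the kind of discrepancy the Vietnamese regression tests are meant to catch.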

github-actions · 2026-02-18T04:38:12Z

GNU testsuite comparison:

GNU test failed: tests/rm/isatty. tests/rm/isatty is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/tail/retry. tests/tail/retry is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/date/date-locale-hour is no longer failing!
Congrats! The gnu test tests/cp/link-heap is now passing!

codspeed-hq · 2026-02-18T04:50:59Z

Merging this PR will improve performance by ×2.2

⚡ 1 improved benchmark
✅ 287 untouched benchmarks
⏩ 38 skipped benchmarks¹

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | `wc_chars_large_line_count[100000]` | 1,022.9 µs | 455.3 µs | ×2.2 |

Comparing naoNao89:fix-wc-locale-chars (224212a) with main (289d701)

¹ 38 benchmarks were skipped, so the baseline results were used instead.

sylvestre · 2026-02-18T09:08:09Z

As no human would write such duplication of code, I guess it is LLM-generated...
Please review the changes before submitting them for review.

src/uu/wc/src/wc.rs

src/uu/wc/src/count_fast.rs

cakebaker · 2026-02-18T10:51:48Z

tests/by-util/test_wc.rs

In the newly added tests you always use -m or -cm. However, this means only your changes in count_fast.rs are tested due to the logic in word_count_from_reader in wc.rs. To test your changes in wc.rs you also have to provide -w or -L.
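The coverage gap described above can be modelled with a small sketch. This is a simplified, hypothetical model of the dispatch, not the actual logic of `word_count_from_reader`: the `Settings` struct, the `CountPath` enum, and `choose_path` are all names invented for illustration. The only behaviour taken from the comment is that `-m` and `-cm` stay on the `count_fast.rs` path while `-w` or `-L` reach the code in `wc.rs`.

```rust
/// Hypothetical mirror of the relevant wc flags.
#[derive(Clone, Copy, Default)]
struct Settings {
    show_bytes: bool,           // -c
    show_chars: bool,           // -m
    show_lines: bool,           // -l
    show_words: bool,           // -w
    show_max_line_length: bool, // -L
}

/// Which implementation ends up doing the counting (illustrative only).
#[derive(Debug, PartialEq)]
enum CountPath {
    Fast, // stands in for count_fast.rs
    Full, // stands in for the decoding loop in wc.rs
}

/// Assumption: only word counts and max line length force the full path.
fn choose_path(s: Settings) -> CountPath {
    if s.show_words || s.show_max_line_length {
        CountPath::Full
    } else {
        CountPath::Fast
    }
}

fn main() {
    let m_only = Settings { show_chars: true, ..Settings::default() };
    let m_and_w = Settings { show_chars: true, show_words: true, ..Settings::default() };
    println!("-m  -> {:?}", choose_path(m_only));
    println!("-mw -> {:?}", choose_path(m_and_w));
}
```

Under this model, a test suite that only passes `-m` or `-cm` never exercises the full path, which is why at least one test needs `-w` or `-L`.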

naoNao89 · 2026-02-18T12:44:25Z

Sorry, refactored 💀 ~~dup is_c_or_posix_locale()~~

sylvestre · 2026-02-18T12:54:49Z

src/uu/wc/src/wc.rs

        }
        if SHOW_CHARS {
-            total.chars += 1;
+            if chars_are_bytes {


Seriously?!
Please review your patches before submitting them...

I thought clippy had the ability to check for an empty `if` :v

github-actions · 2026-02-18T13:00:30Z

GNU testsuite comparison:

GNU test failed: tests/date/date-locale-hour. tests/date/date-locale-hour is passing on 'main'. Maybe you have to rebase?
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/cut/bounded-memory is no longer failing!

Modify `wc -m` to count bytes instead of UTF-8 characters when LC_ALL, LC_CTYPE, or LANG is set to C or POSIX. This matches GNU coreutils behavior where MB_CUR_MAX == 1 in these locales.

Changes:
- Add is_c_or_posix_locale() helper in count_fast.rs
- Export and reuse function in wc.rs to avoid duplication
- Update fast path and UTF-8 decoding path
- Add regression tests with Vietnamese text

Fixes uutils#9712, fixes uutils#5831.
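A locale check of this kind could look roughly like the sketch below. It is not the `is_c_or_posix_locale()` helper from this PR, whose body is not shown in the thread. The names `effective_ctype` and `locale_is_c_or_posix` are hypothetical, and two choices are assumptions: the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG, with empty values treated as unset), and treating "nothing set" as the C locale.

```rust
use std::env;

/// Resolve the value that governs LC_CTYPE, assuming POSIX precedence:
/// LC_ALL, then LC_CTYPE, then LANG. Empty values are treated as unset.
/// The lookup is passed in so the logic can be tested without touching
/// the real process environment.
fn effective_ctype(lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    ["LC_ALL", "LC_CTYPE", "LANG"]
        .iter()
        .filter_map(|name| lookup(*name))
        .find(|value| !value.is_empty())
}

/// True when the effective locale is exactly "C" or "POSIX".
/// Note that "C.UTF-8" deliberately does not match.
fn locale_is_c_or_posix(lookup: impl Fn(&str) -> Option<String>) -> bool {
    match effective_ctype(lookup) {
        Some(value) => value == "C" || value == "POSIX",
        // Assumption: with no locale variables set, fall back to "C".
        None => true,
    }
}

fn main() {
    // Query the real environment; invalid Unicode values are ignored here.
    let bytes_mode = locale_is_c_or_posix(|name| env::var(name).ok());
    println!("count bytes for -m: {}", bytes_mode);
}
```

Running the binary as `LC_ALL=C ./sketch` versus `LC_ALL=en_US.UTF-8 ./sketch` should flip the printed value, since LC_ALL takes precedence over the other two variables in this model.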

github-actions · 2026-02-18T13:11:07Z

GNU testsuite comparison:

GNU test failed: tests/rm/isatty. tests/rm/isatty is passing on 'main'. Maybe you have to rebase?
Note: The gnu test tests/rm/many-dir-entries-vs-OOM is now being skipped but was previously passing.

Add tests with -w flag to ensure both count_fast.rs and wc.rs paths are tested for locale-aware character counting.

github-actions · 2026-02-18T14:08:13Z

GNU testsuite comparison:

Congrats! The gnu test tests/cut/bounded-memory is no longer failing!

cakebaker reviewed Feb 18, 2026: src/uu/wc/src/wc.rs (outdated)

cakebaker reviewed Feb 18, 2026: src/uu/wc/src/count_fast.rs (outdated)

cakebaker reviewed Feb 18, 2026

naoNao89 force-pushed the fix-wc-locale-chars branch 3 times, most recently from d906c13 to 1ed5ccd (February 18, 2026 12:51)

sylvestre reviewed Feb 18, 2026

naoNao89 force-pushed the fix-wc-locale-chars branch from 1ed5ccd to bf04096 (February 18, 2026 13:02)

test(wc): add tests for wc.rs locale path

224212a


naoNao89 wants to merge 2 commits into uutils:main from naoNao89:fix-wc-locale-chars
