Skip to content

Conversation

@RenjiSann
Copy link
Collaborator

This MR introduces a new crate uu_checksum_common, which declare

It lies on top of #10141

Architecture explanation

Starting from now, each binary has its own crate in src/uu : cksum, but also standalone tools like md5sum, b2sum, etc...

All of them depend on uu_checksum_common, which lives under src/uu/checksum_common, which itself depends on uucore::checksum.

As for who does what:

  • Top-level binaries contain the program entrypoint. They process arguments that are not common to all binaries, for example
    • cksum handles --untagged, which doesn't exist for standalones
    • b2sum handles --length where md5sum doesn't allow it
  • Then, the top-level binaries call a common main function : uu_checksum_common::checksum_main
  • checksum_main handles the common argument processing (verbose flags, --zero, etc), and calls uucore::checksum::validation or uucore::checksum::compute accordingly.

Note that all common cli error messages live under uu_checksum_common, so a bit of piping was needed in the locale setup.

Reason behind this architecture

The main goal is really to avoid code duplication as much as possible.

The symlink approach which was looked into several times, didn't feel robust enough to account for CLI differences.

Since most of the flags and the clap code was very similar, I wanted to move it in uucore::checksum::cli at first.
However, I felt like having CLI handling living under uucore was not a very good idea, since it would mix it with logic, and I wanted uucore to stay free from clap.

So I took the same approach as what's happening in base32|64|nc, where all the common code is defined in base32 which is then re-imported in base64 and basenc, only here I created a dedicated crate.

@RenjiSann RenjiSann requested a review from sylvestre January 9, 2026 12:46
@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch from 28e41e6 to ca32bc7 Compare January 9, 2026 12:49
@oech3
Copy link
Contributor

oech3 commented Jan 9, 2026

Can we split this PR much more e.g. cli.rs?. diff would too big.

@RenjiSann
Copy link
Collaborator Author

Can we split this PR much more e.g. cli.rs?. diff would too big.

Some of the current diff is actually part of #10141,
I could split the introduction of uu_checksum_common, and the transition of the executables to it, but I think it would harm the review process to split them since they are very related

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch 2 times, most recently from b0e3843 to b640d07 Compare January 9, 2026 14:18
@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skip an intermittent issue tests/shuf/shuf-reservoir (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/sort/sort-stale-thread-mem (fails in this run but passes in the 'main' branch)

@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch from 9562744 to 8824943 Compare January 9, 2026 15:31
@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch 2 times, most recently from c449994 to e2a9486 Compare January 9, 2026 17:09
@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch from e2a9486 to a642bd6 Compare January 9, 2026 17:49
@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch from a642bd6 to 959640e Compare January 9, 2026 23:30
@github-actions
Copy link

github-actions bot commented Jan 9, 2026

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@oech3
Copy link
Contributor

oech3 commented Jan 10, 2026

When I measured binary size, I have
main make MULTICALL=y PROFILE=release-fast UTILS="cksum hashsum":
2577384 byte
This PR+main make MULTICALL=y PROFILE=release-fast UTILS="cksum md5sum ...":
2557320 byte
Symlinkable cksum: make MULTICALL=y PROFILE=release-fast UTILS="cksum":
2546184 byte

Also I don't see any reason to avoid having

#!/bin/sh
exec cksum --untagged -a ALGO "$@"

with --debug and --untagged extension.

@RenjiSann RenjiSann marked this pull request as ready for review January 10, 2026 13:45
@oech3
Copy link
Contributor

oech3 commented Jan 10, 2026

https://github.com/uutils/coreutils/actions/runs/20868639121/job/59976371446?pr=10142#step:13:1125 still assuming hashsum.
Maybe, we can discard hashsum after it was fixed?

new_ucmd!()
.arg("--tag")
.arg("--text")
.arg("--md5")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--md5 should not be existing at anywhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks

@oech3
Copy link
Contributor

oech3 commented Jan 10, 2026

#10141 was merged, but diff is still big...

@sylvestre
Copy link
Contributor

Starting from now, each binary has its own crate in src/uu

why ?

what is the impact on build time ?

@oech3
Copy link
Contributor

oech3 commented Jan 10, 2026

main: time make PROFILE=release MULTICALL=y UTILS="cksum hashsum": 1m 28s
patched: time make PROFILE=release MULTICALL=y UTILS="cksum md5sum ...": 1m 13s

main: time make PROFILE=release UTILS="cksum hashsum": 37.70s
patched: time make PROFILE=release UTILS="cksum md5sum ...": 2m 29s

@RenjiSann RenjiSann force-pushed the peron/checksum-common-cli-post-reverse branch from 959640e to fc4ea59 Compare January 11, 2026 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants