Conversation
## Issue

Authors:
- Ishika Roy (https://github.com/Iroy30)
- Ramakrishnap (https://github.com/rgsl888prabhu)

Approvers:
- Ramakrishnap (https://github.com/rgsl888prabhu)
- James Lamb (https://github.com/jameslamb)

URL: NVIDIA#556
This PR adds cuts to the MIP solver. This includes the following:
1. Add constraints in the form C*x <= d to an LP that has already been solved to optimality (and has basis information).
- The constraints must be violated at the current relaxation solution x^star. That is, C*x^star > d.
- The constraint matrix, rhs, basis, and basis factorization are all updated to include the additional constraints.
- Dual simplex is started in phase 2 from a dual feasible solution.
2. Remove constraints from an LP that has already been solved to optimality.
- The constraints must have slacks in the basis
- The basis is refactored from scratch
3. Add cut pass loop after solving the root relaxation
4. Add a cut pool to store cuts and select cuts
- We currently score cuts based on distance and orthogonality (standard definitions are sketched just after this list).
5. Add Mixed Integer Gomory Cuts
- These are computed via a MIR cut on a row of the simplex tableau
6. Add Mixed Integer Rounding (MIR) Cuts
- These are constructed by aggregating rows of the constraint matrix.
7. Add Strong Chvatal-Gomory Cuts
- These are constructed from a row of the tableau matrix and from rows of the constraint matrix.
8. Fixes to Handling of Steepest Edge Norms in Dual Simplex
- Ensure that all basic variables have positive steepest edge norms
9. Reduced Cost Fixing at Root Node
- This is applied after each cut pass and after strong branching, if a heuristic solution is available.
10. Fix issues in Crossover when solving the dual problem
- We were not correctly populating slack variables when solving the dual. This issue appeared on graph20-80-1rand.
11. Fix issue in Crossover when the basis became rank-deficient in the dual push
12. Fix issues across the code with handling and propagating concurrent halt.
13. New solver options: mip-cut-passes, mip-mixed-integer-gomory-cuts, mip-mir-cuts, mip-strong-chvatal-gomory-cuts, mip-knapsack-cuts, mip-cut-change-threshold, mip-cut-min-orthogonality.
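For context, the two scores mentioned in the cut pool item are, in their standard form (the exact scoring in this PR may weight or combine them differently): for a cut $c^\top x \le d$ with coefficient vector $c$ and a relaxation solution $x^{*}$,

$$
\mathrm{dist}(x^{*}) = \frac{\lvert c^\top x^{*} - d \rvert}{\lVert c \rVert_2},
\qquad
\mathrm{orth}(c_i, c_j) = 1 - \frac{\lvert c_i^\top c_j \rvert}{\lVert c_i \rVert_2\,\lVert c_j \rVert_2}.
$$

Deeper cuts (larger distance from $x^{*}$ to the cut hyperplane) and mutually near-orthogonal cuts (orthogonality close to 1) are preferred when selecting from the pool.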
Closes NVIDIA#698, NVIDIA#205
Results from a GH200 A/B test with 64 threads and a 300-second time limit. Further runs are needed with a larger time limit. Further work is needed to get the full benefit of cuts.
A: cuts PR with --mip-cut-passes=10
B: cuts PR with --mip-cut-passes=0
- Geomean MIP gap, A / B (B = baseline): 0.97
- Geomean time to optimal, A / B: 0.96
- Problems solved to optimality: A 45, B 37
- Problems with feasible solutions: A 225, B 224
A wins (time to optimal in seconds; 300.0 indicates the time limit was hit):

| Instance | A (s) | B (s) |
| --- | --- | --- |
| drayage-100-23 | 1.14 | 8.61 |
| gfd-schedulen180f7d50m30k18 | 87.85 | 300.0 |
| neos-827175 | 12.28 | 25.86 |
| neos-1171448 | 28.85 | 98.2 |
| n5-3 | 180.62 | 300.0 |
| neos859080 | 0.78 | 1.38 |
| seymour1 | 38.78 | 300.0 |
| neos-860300 | 71.98 | 90.91 |
| neos-3083819-nubu | 30.48 | 300.0 |
| neos-933966 | 248.26 | 300.0 |
| neos-957323 | 46.36 | 300.0 |
| neos-960392 | 50.12 | 115.68 |
| netdiversion | 121.59 | 300.0 |
| ns1208400 | 162.15 | 300.0 |
| nw04 | 39.89 | 45.98 |
| piperout-27 | 8.22 | 14.09 |
| supportcase7 | 30.48 | 300.0 |
| supportcase6 | 70.22 | 300.0 |
| uccase12 | 26.95 | 41.87 |

A wins: 19
A losses (time to optimal in seconds; 300.0 indicates the time limit was hit):

| Instance | A (s) | B (s) |
| --- | --- | --- |
| app1-1 | 2.3 | 0.74 |
| cbs-cta | 2.21 | 1.83 |
| irp | 17.59 | 12.94 |
| istanbul-no-cutoff | 18.26 | 13.56 |
| mas76 | 300.0 | 20.98 |
| neos-1122047 | 8.3 | 6.46 |
| neos-1445765 | 1.86 | 1.52 |
| neos-1582420 | 134.3 | 93.44 |
| neos-3004026-krka | 6.56 | 1.31 |
| neos8 | 0.88 | 0.68 |
| ns1952667 | 99.84 | 64.92 |
| piperout-08 | 10.75 | 8.23 |
| pk1 | 224.32 | 91.84 |
| qap10 | 134.76 | 62.82 |
| swath1 | 60.04 | 50.3 |
| trento1 | 300.0 | 172.28 |
| triptim1 | 246.65 | 130.92 |

A losses: 17
Authors:
- Chris Maes (https://github.com/chris-maes)
Approvers:
- Ramakrishnap (https://github.com/rgsl888prabhu)
- Akif ÇÖRDÜK (https://github.com/akifcorduk)
- Alice Boucher (https://github.com/aliceb-nv)
URL: NVIDIA#814
📝 Walkthrough

Adds a cuts subsystem (Gomory, MIR, Knapsack, Strong CG), integrates it into dual-simplex MIP flows, extends MIP/cut solver settings, threads basis/edge-norms and timing through branch-and-bound, standardizes concurrent-halt sentinels, updates sparse/CSR helpers, and adds regression benchmarking tooling and tests.

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cpp/src/dual_simplex/mip_node.hpp (2)
232-241: ⚠️ Potential issue | 🟠 Major

Carry `integer_infeasible` through `detach_copy()`.

The detached node should preserve the integer infeasibility flag; otherwise, downstream logic may observe a stale/default value.

🐛 Suggested fix

```diff
 copy.branch_var       = branch_var;
 copy.branch_dir       = branch_dir;
 copy.branch_var_lower = branch_var_lower;
 copy.branch_var_upper = branch_var_upper;
 copy.fractional_val   = fractional_val;
+copy.integer_infeasible = integer_infeasible;
 copy.objective_estimate = objective_estimate;
 copy.node_id            = node_id;
```
37-49: ⚠️ Potential issue | 🟠 Major

Initialize `integer_infeasible` in the default constructor.

The default constructor at lines 37-49 fails to initialize `integer_infeasible`, leaving it with an indeterminate value. This member is accessed during branching logic and reporting (e.g., `branch_and_bound.cpp` references `node->integer_infeasible`), which can cause nondeterministic behavior. Initialize it to `-1` to match the second constructor's convention.

Additionally, the `detach_copy()` method (lines 232-243) does not copy `integer_infeasible` when creating a detached node, causing loss of this state during diving operations.

🐛 Suggested fix for default constructor

```diff
 mip_node_t()
   : status(node_status_t::PENDING),
     lower_bound(-std::numeric_limits<f_t>::infinity()),
     depth(0),
     parent(nullptr),
     node_id(0),
     branch_var(-1),
     branch_dir(rounding_direction_t::NONE),
     branch_var_lower(-std::numeric_limits<f_t>::infinity()),
     branch_var_upper(std::numeric_limits<f_t>::infinity()),
     fractional_val(std::numeric_limits<f_t>::infinity()),
+    integer_infeasible(-1),
     objective_estimate(std::numeric_limits<f_t>::infinity()),
     vstatus(0)
```

cpp/src/dual_simplex/sparse_matrix.cpp (1)
572-580: ⚠️ Potential issue | 🟡 Minor

Fix the `col_start` size guard to prevent OOB in debug checks.

The guard uses `j >= col_start.size()`, but you read `col_start[j + 1]`; if `col_start.size() == n`, the current check won't trigger and the read is still out of bounds. Guard on `j + 1` (or pre-check `size() < n + 1`).

✅ Safer bound check

```diff
- if (j >= col_start.size()) {
+ if (j + 1 >= col_start.size()) {
    printf("Col start too small size %ld n %d\n", col_start.size(), this->n);
    return -1;
  }
```

cpp/src/dual_simplex/branch_and_bound.cpp (1)
1556-1573: ⚠️ Potential issue | 🟠 Major

Break the loop when the dual simplex task completes, not just when signals arrive.

The loop at lines 1569–1573 spins indefinitely if neither `root_crossover_solution_set_` nor `root_concurrent_halt()` changes. If dual simplex finishes (e.g., TIME_LIMIT) without an external solver providing a solution via `set_root_relaxation_solution()`, both signals stay false and the loop never exits. Add a check for the future's completion to guarantee the loop terminates:

Suggested guard

```diff
- while (!root_crossover_solution_set_.load(std::memory_order_acquire) &&
-        *get_root_concurrent_halt() == 0) {
+ while (!root_crossover_solution_set_.load(std::memory_order_acquire) &&
+        *get_root_concurrent_halt() == 0 &&
+        root_status_future.wait_for(std::chrono::milliseconds(0)) != std::future_status::ready) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
    continue;
  }
```
🤖 Fix all issues with AI agents
In `@cpp/src/dual_simplex/bounds_strengthening.cpp`:
- Around line 105-116: The loop indexes bounds_changed by j assuming it has
length n; add validation before the loop to avoid out-of-bounds access by
checking that bounds_changed.size() == static_cast<size_t>(n) (or
bounds_changed.size() >= n) and handle the mismatch (e.g., assert,
log+return/error, or resize/initialize bounds_changed) so the subsequent loop
over j and access to bounds_changed[j] is safe; apply this check near the start
of the block that references bounds_changed (the code surrounding the for (i_t j
= 0; j < n; ++j) loop that also touches A.col_start and constraint_changed) and
ensure constraint_changed is still reset only when the size check passes.
In `@cpp/src/dual_simplex/cuts.cpp`:
- Around line 44-95: Guard against near-zero norms in cut_distance and
cut_orthogonality: compute an epsilon (e.g., std::numeric_limits<f_t>::epsilon()
* max(1.0, cut_norm) or a small constant like 1e-12), then if cut_norm (used in
cut_distance and stored in cut_norms_) or either norm_i/norm_j is below that
epsilon, set cut_violation and cut_norm consistently and return a neutral score
(for cut_distance return 0.0 and for cut_orthogonality return 1.0) instead of
performing the division; reference the functions cut_pool_t::cut_distance,
cut_pool_t::cut_orthogonality and the field cut_norms_ when adding these guards.
- Around line 8-9: The cut scoring functions cut_distance() and
cut_orthogonality() currently divide by norms that can be zero; add a small
epsilon check before dividing to avoid NaN/Inf propagation. In cut_distance(),
compute cut_norm as before but only divide cut_violation by cut_norm when
cut_norm > epsilon (otherwise set distance to 0.0); in cut_orthogonality(),
check both norm_i and norm_j against epsilon and only compute the division when
(norm_i > epsilon && norm_j > epsilon) (otherwise set orthogonality to 0.0). Use
a consistent small epsilon (e.g. std::numeric_limits<f_t>::epsilon() or a tiny
constant) and ensure the functions return a well-defined 0.0 when norms are too
small.
In `@regression/benchmark_scripts/benchmark.py`:
- Around line 253-260: The code currently hard-codes n_files = 1 which prevents
iterating all discovered files and can cause IndexError when data_files is
empty; change n_files to use the actual number of discovered files (n_files =
len(data_files)) and short‑circuit when len(data_files) == 0 (e.g.,
return/continue or raise a clear error) before indexing into data_files; update
the logic around variables data_files, d_type, dataset_file_path, gpu_id/idx to
rely on n_files = len(data_files) so workers process all files and avoid
out‑of‑range access.
- Around line 237-333: The worker function currently has multiple processes
appending to test_status_file without synchronization (race/interleaving); fix
by creating a multiprocessing.Lock (or Manager().Lock) in run, pass that lock
into worker (add lock to worker signature and Process args), and in worker
surround the append to test_status_file with lock acquisition (use
lock.acquire()/release or a context manager) so only one process writes at a
time; alternatively implement per-process temp status files (named by gpu_id)
and merge them into test_status_file in run after all processes join.
In `@regression/benchmark_scripts/configs/README.md`:
- Around line 10-12: The fenced code block containing the command "aws s3 cp
/path/to/files s3://cuopt-datasets/regression_datasets/" in README.md lacks a
language tag; update the opening fence from ``` to ```bash so the block is
annotated (e.g., change the triple-backtick line immediately before that command
to include "bash") to satisfy markdownlint rule MD040.
In `@regression/config.sh`:
- Around line 30-34: Replace hard-coded placeholders by making
CUOPT_SLACK_APP_ID, S3_FILE_PREFIX and S3_URL_PREFIX required environment
variables and fail fast with a clear error if unset: update the logic that
defines CUOPT_SLACK_APP_ID, WEBHOOK_URL, S3_FILE_PREFIX and S3_URL_PREFIX to
check for empty values and print an explicit error message and exit when any
required var is missing; alternatively, if you prefer optional defaults, add
documentation comments above these vars explaining they MUST be overridden and
show example export commands for CUOPT_SLACK_APP_ID and S3_FILE_PREFIX to avoid
silent failures.
In `@regression/create-html-reports.sh`:
- Around line 173-180: The current snippet builds plot and log links using the
loop variable f which may be unset or only hold the last value; change the logic
in create-html-reports.sh to derive fixed relative paths from RESULTS_DIR
instead of using f: compute plot_rel_path and log_rel_path from RESULTS_DIR
(e.g. remove "$RESULTS_DIR/" and "$RESULTS_DIR/benchmarks/results/") and then
set plot_path and log_path from those fixed rel paths before echoing into
$report; update references to the variables plot_rel_path, plot_path,
log_rel_path, log_path and stop relying on f so links are always correct even
when the ALL_REPORTS loop has finished.
In `@regression/cronjob.sh`:
- Line 137: The cp invocation using glob patterns (cp
$ROUTING_CONFIGS_PATH/*config.json $LP_DATASETS_PATH/*config.json
$MIP_DATASETS_PATH/*config.json $ALL_CONFIGS_PATH/) can fail under set -e when
any glob doesn't match; update the script to either enable bash nullglob (shopt
-s nullglob) before this line and restore it after, or perform explicit
existence checks for each source pattern (e.g., test -e
"$ROUTING_CONFIGS_PATH/"*config.json) and only include patterns that match when
calling cp; reference the cp call and the variables ROUTING_CONFIGS_PATH,
LP_DATASETS_PATH, MIP_DATASETS_PATH, and ALL_CONFIGS_PATH when applying the fix.
In `@regression/functions.sh`:
- Around line 108-123: In cloneRepo, the function currently calls git clone
without the repo_name and uses an undefined error function; fix it by invoking
git clone with the explicit target directory (use repo_url and repo_name) and
handle failures locally (replace the undefined error call with a local non-zero
return or exit, e.g., return 1), ensure all variable expansions are quoted
(repo_url, repo_name, dest_dir) and check the rm -rf removal result by testing
if the directory still exists and returning an error code if so; keep pushd/popd
logic but make error handling self-contained inside cloneRepo.
In `@regression/get_datasets.py`:
- Around line 875-904: Both download_lp_dataset and download_mip_dataset may
attempt to write files into a non-existent directory causing FileNotFoundError;
fix by ensuring the target directory exists at the start of each function (e.g.,
call os.makedirs(dir, exist_ok=True) at the top of download_lp_dataset and
download_mip_dataset before using download() or constructing file paths) so
subsequent open/write in download() and extract() succeed.
- Around line 840-872: In extract(...) avoid shell=True and string interpolation
to prevent command injection: replace each subprocess.run(...) that currently
uses f-strings with subprocess.run([...], cwd=dir, check=True) using argument
lists (e.g., ["bzip2","-d", basefile] or ["gunzip","-c", basefile]) and for the
gunzip case capture stdout to a file via subprocess.run(..., stdout=opened_file)
instead of using ">" redirection; remove uses of "rm -rf ..." and instead delete
files with os.remove() or shutil.rmtree() (e.g., remove unzippedfile and emps
binaries), and ensure download(...) and the gcc/emps invocations also use
argument lists and cwd plus check=True; keep references to
MittelmannInstances["emps"], download, and the local filenames (basefile,
unzippedfile, outfile) when making these changes.
- Around line 823-835: The download function currently calls
urllib.request.urlopen on any scheme and doesn't close the response; update
download to validate the URL scheme (allow only "http" and "https", raising an
exception for others) and use a context manager to ensure the response is
closed: replace the direct urllib.request.urlopen(...) usage with a "with
urllib.request.urlopen(...)" block (passing the ssl context when "plato.asu.edu"
is in the URL), then read data inside that block and return or write it as
before; keep the scheme check and the special SSL context logic but ensure the
response variable is only used inside the with-statement so resources are
cleaned up.
In `@regression/run_regression.sh`:
- Around line 1-16: Enable strict Bash mode and guard against an empty
SCRATCH_DIR before any destructive operations: add shell strict flags (e.g., set
-euo pipefail with emphasis on set -u per repo preference) at the top of
run_regression.sh, check that SCRATCH_DIR is non-empty (fail early if
unbound/empty) before calling rm -rf $SCRATCH_DIR/routing_configs/*, and quote
all expansions of SCRATCH_DIR in commands (rm -rf
"$SCRATCH_DIR"/routing_configs/*, aws s3 cp
"s3://cuopt-datasets/regression_datasets/" "$SCRATCH_DIR"/routing_configs/
--recursive, python "$SCRATCH_DIR"/cuopt/regression/get_datasets.py
"$SCRATCH_DIR"/lp_datasets lp, etc.) so destructive ops never run when
SCRATCH_DIR is unset or contains spaces.
In `@regression/save_benchmark_results.py`:
- Line 10: The function name create_update_benchamrk_db contains a typo; rename
it to create_update_benchmark_db and update every reference (calls, imports,
tests, docs) to use the corrected name so the signature (benchmark_path,
output_path, commit_hash) remains the same; ensure any exported symbols or
__all__ entries are updated accordingly and run tests to confirm no remaining
references to create_update_benchamrk_db.
- Around line 21-32: The loop shadows the outer DataFrame named data by
assigning data = pd.read_csv(...) inside the for index, rows in data.iterrows()
loop; rename the inner variable (e.g., existing or existing_data) and update
subsequent uses (pd.concat and to_csv) to reference that new name so the outer
data and the loop variable rows remain unchanged and all rows are processed
correctly.
In `@regression/save_benchmarks.sh`:
- Around line 36-38: In save_benchmarks.sh ensure the script validates the
required argument ($1) before calling python
${CUOPT_SCRIPTS_DIR}/save_benchmark_results.py and aborts with a non‑zero exit
code if missing; after changing to ${CUOPT_SCRIPTS_DIR} check for
staged/unstaged changes (e.g., git status --porcelain) before running git
add/commit to avoid failing on "nothing to commit", only run git commit when
there are changes and handle git commit failure by exiting with an error, and
perform git push with basic retry logic and error handling so push failures are
detected and cause the script to fail instead of silently continuing. Use the
symbols save_benchmarks.sh, ${CUOPT_SCRIPTS_DIR}, ${RESULTS_DIR}, and
save_benchmark_results.py to locate where to add these checks.
In `@regression/test-container.sh`:
- Around line 10-18: The script calls logger (the function defined in
functions.sh) but only sources config.sh, so add sourcing of the functions file
after PROJECT_DIR is set; for example, after export PROJECT_DIR line source the
functions file that defines logger (e.g., source "${PROJECT_DIR}/functions.sh"
or the appropriate functions file in regression) so logger is available before
it's invoked and the script won't fail under set -e.
In `@regression/update_asv_database.py`:
- Around line 12-33: The function update_asv_db uses inputs like results_dir and
commitTime without validation (e.g., Path(results_dir) and commitTime * 1000),
so add an early required-arguments guard at the top of update_asv_db that checks
for presence/valid types of results_dir and either commitHash or commitTime (and
that commitTime is numeric if used); if validation fails, raise a clear
exception (ValueError) or return early with a clear message. Ensure the checks
reference the same parameter names (results_dir, commitTime, commitHash) so
callers get a deterministic error before any Path(...) or arithmetic is
attempted.
🧹 Nitpick comments (24)
cpp/src/dual_simplex/folding.cpp (1)
212-220: Consider propagating errors instead of using `exit(1)` in helper functions.

The new `coloring_status_t` pattern improves error handling in `color_graph`, but helper functions like `find_colors_to_split` (line 216) and `split_colors` (lines 262, 286, 328) still use `exit(1)` for error conditions. This can terminate the entire process unexpectedly in a library context.

Consider refactoring these helpers in a follow-up to return error status that can be propagated up through `color_graph`, ensuring consistent error handling throughout the folding module. As per coding guidelines, error propagation should be complete to user-facing APIs.

regression/lp_regression_test.sh (1)
5-10: Add `set -u` for unbound variable safety.

🔧 Suggested change

```diff
 # Abort script on first error
 set -e
+set -u
```

Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages. This approach keeps scripts concise and leverages Bash's built-in error handling.
regression/routing_regression_test.sh (1)
5-10: Add `set -u` for unbound variable safety.

🔧 Suggested change

```diff
 # Abort script on first error
 set -e
+set -u
```

Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages. This approach keeps scripts concise and leverages Bash's built-in error handling.
regression/mip_regression_test.sh (1)
5-10: Add `set -u` for unbound variable safety.

🔧 Suggested change

```diff
 set -e
+set -u
```

Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages. This approach keeps scripts concise and leverages Bash's built-in error handling.
cpp/src/dual_simplex/types.hpp (1)
22-23: Replace the macro with a typed constexpr constant.

The `CONCURRENT_HALT_RETURN` macro is used exclusively as a sentinel return value (37+ usages across the codebase) and never in preprocessor directives. Converting to `inline constexpr int` is safe and improves type safety and scoping without any functional impact.

♻️ Suggested refactor

```diff
-#define CONCURRENT_HALT_RETURN -2
+inline constexpr int CONCURRENT_HALT_RETURN = -2;
```

cpp/src/utilities/timer.hpp (1)
58-85: Potential precision loss in epoch conversion.

The function converts microseconds since epoch to `double` for `tv_sec` (line 81). Since `us_since_epoch.count()` returns a large integer (~1.7×10¹⁵ for current timestamps), and `double` has only ~15-16 significant digits, the integer division result stored in `double` may lose precision for the fractional part calculation.

Consider keeping intermediate values as integers:

♻️ Suggested improvement

```diff
- double tv_sec = us_since_epoch.count() / 1000000;
- double tv_usec = us_since_epoch.count() % 1000000;
-
- return tv_sec + 1e-6 * tv_usec;
+ auto total_us = us_since_epoch.count();
+ return static_cast<double>(total_us) / 1000000.0;
```

cpp/src/dual_simplex/bound_flipping_ratio_test.hpp (1)
17-21: Consider using `constexpr` instead of `#define` macros. The symbolic constants improve readability. For better type safety and debugging, consider using `constexpr`:
constexpr:♻️ Suggested improvement
```diff
-#define RATIO_TEST_NO_ENTERING_VARIABLE -1
-#define RATIO_TEST_CONCURRENT_LIMIT CONCURRENT_HALT_RETURN // -2
-#define RATIO_TEST_TIME_LIMIT -3
-#define RATIO_TEST_NUMERICAL_ISSUES -4
+inline constexpr int RATIO_TEST_NO_ENTERING_VARIABLE = -1;
+inline constexpr int RATIO_TEST_CONCURRENT_LIMIT = CONCURRENT_HALT_RETURN; // -2
+inline constexpr int RATIO_TEST_TIME_LIMIT = -3;
+inline constexpr int RATIO_TEST_NUMERICAL_ISSUES = -4;
```

Note: The static analysis error about
`initial_basis.hpp` not found is a false positive—the analyzer lacks full project context.

cpp/tests/mip/CMakeLists.txt (1)
29-31: LGTM! The `CUTS_TEST` addition is correctly configured, and the referenced `cuts_test.cu` file exists and validates numerical correctness with `EXPECT_NEAR` assertions on objective values.

Consider expanding test coverage in `cuts_test.cu` to include degenerate cases (infeasible, unbounded, empty problems) and free variables to align with broader optimization test best practices.

regression/benchmark_scripts/configs/test_name_config.json (1)
1-23: Replace placeholder names/details before real use.

Keeping "test_name" and "Add details about you test" may leak into reports; consider filling these with real identifiers.

regression/write-meta-data.sh (1)
5-6: Enable `set -u` for unbound-variable detection.

This script relies on env vars like PROJECT_DIR and METADATA_FILE; `set -u` makes missing vars fail fast.

Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages.

♻️ Proposed change

```diff
-# Abort script on first error
-set -e
+# Abort script on first error
+set -eu
```

regression/test-container.sh (1)
7-8: Consider `set -u` to catch missing env vars early.

Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages.

♻️ Suggested change

```diff
-# Abort script on first error
-set -e
+# Abort script on first error
+set -eu
```

regression/benchmark_scripts/configs/example_test_creation.py (1)
17-23: Avoid hard-coded absolute dataset paths in the example script.

The current path is developer-specific; consider an env/CLI parameter or a repo-relative default so others can run it without edits.

♻️ Proposed change

```diff
-from cuopt.routing import utils
-import json
+from cuopt.routing import utils
+import json
+import os
@@
-base_file_name = (
-    "/home/nfs/rgandham/git-repos/reopt/datasets/cvrptw/RC2_10_5.TXT"
-)
+base_file_name = os.environ.get(
+    "CUOPT_DATASET",
+    "datasets/cvrptw/RC2_10_5.TXT",
+)
```

cpp/src/dual_simplex/sparse_vector.cpp (1)
236-256: Consider reserving before clearing to avoid reallocation.

The current order `reserve → clear` is inefficient; `clear()` doesn't release capacity, but calling `reserve` after `clear` would be more idiomatic. Also, the two-pass approach (count then fill) could be replaced with a single pass if you reserve an upper bound.

♻️ Optional: Single-pass with upper-bound reserve

```diff
 template <typename i_t, typename f_t>
 void sparse_vector_t<i_t, f_t>::squeeze(sparse_vector_t<i_t, f_t>& y) const
 {
   y.n = n;
-
-  i_t nz = 0;
   const i_t nx = x.size();
-  for (i_t k = 0; k < nx; k++) {
-    if (x[k] != 0.0) { nz++; }
-  }
-  y.i.reserve(nz);
-  y.x.reserve(nz);
   y.i.clear();
   y.x.clear();
+  y.i.reserve(nx);
+  y.x.reserve(nx);
   for (i_t k = 0; k < nx; k++) {
     if (x[k] != 0.0) {
       y.i.push_back(i[k]);
       y.x.push_back(x[k]);
     }
   }
+  y.i.shrink_to_fit();
+  y.x.shrink_to_fit();
 }
```

regression/config.sh (1)
1-7: Consider enabling strict mode for unbound variable detection.

Adding `set -u` would help catch undefined variables early. The `# shellcheck disable=all` disables useful linting; consider enabling shellcheck and fixing issues instead.

♻️ Proposed fix

```diff
 #!/bin/bash
-# shellcheck disable=all
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
+set -u
-THIS_DIR=$(cd $(dirname ${BASH_SOURCE[0]}) && pwd)
+THIS_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
```

Based on learnings: "In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables."
regression/save_benchmark_results.py (1)
35-51: Userequired=Truein argparse instead of manual validation.This provides better error messages and follows argparse conventions. Also, consider guarding module-level execution with
if __name__ == "__main__":.♻️ Proposed refactor
-parser = argparse.ArgumentParser() -parser.add_argument( - "-b", "--benchmark_path", help="Path to new sets of results" -) -parser.add_argument("-o", "--output_path", help="Path to save results") -parser.add_argument("-c", "--commit_hash", help="Git commit hash for the run") - -# Read arguments from command line -args = parser.parse_args() - -if args.benchmark_path and args.output_path and args.commit_hash: - create_update_benchamrk_db( - args.benchmark_path, args.output_path, args.commit_hash - ) -else: - raise ValueError( - "Missing mandatory options, please provide all the options" - ) +if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument( + "-b", "--benchmark_path", required=True, help="Path to new sets of results" + ) + parser.add_argument("-o", "--output_path", required=True, help="Path to save results") + parser.add_argument("-c", "--commit_hash", required=True, help="Git commit hash for the run") + + args = parser.parse_args() + create_update_benchmark_db( + args.benchmark_path, args.output_path, args.commit_hash + )cpp/tests/mip/cuts_test.cu (1)
28-166: Consider adding edge case tests for cut robustness.The current tests cover basic functionality. Per coding guidelines, consider adding tests for:
- Infeasible problems (cuts should not change infeasibility)
- Problems where cuts are insufficient (requires branching)
- Empty or singleton problems
Would you like me to generate additional test cases for edge case handling?
regression/save_benchmarks.sh (1)
6-10: Addset -uand quote variable expansions.The script uses
set -ebut notset -u. The positional parameter$1on line 36 will fail silently if not provided. Also, quote the dirname expression.♻️ Proposed fix
# Abort script on first error set -e +set -u # Must ensure PROJECT_DIR is exported first then load rapids-mg-tools env -export PROJECT_DIR=${PROJECT_DIR:-$(cd "$(dirname ${BASH_SOURCE[0]})" && pwd)} +export PROJECT_DIR=${PROJECT_DIR:-$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)}Based on learnings: "In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables."
regression/cronjob.sh (1)
7-7: Addset -ufor unbound variable detection.Based on learnings from this repository, scripts should use
set -uto detect unbound variables and rely on Bash's default error messages. This helps catch typos and missing variable definitions early.Suggested change
-set -e +set -euregression/send-slack-report.sh (2)
7-7: Addset -ufor unbound variable detection.Consistent with repository conventions, add
set -uto catch unbound variables. Several variables likeWEBHOOK_URL,S3_FILE_PREFIX, andS3_URL_PREFIXare used without explicit validation.Suggested change
-set -e +set -eu
115-118: Consider validating WEBHOOK_URL before posting.If
WEBHOOK_URLis empty or unset, the curl command will fail. Adding a guard would provide a clearer error message.Suggested improvement
+ if [[ -z "${WEBHOOK_URL:-}" ]]; then + logger "Error: WEBHOOK_URL is not set. Cannot post to Slack." + exit 1 + fi echo "$(envsubst < ${PROJECT_DIR}/slack_msg.json)" curl -X POST \ -H 'Content-type: application/json' \ --data "$(envsubst < ${PROJECT_DIR}/slack_msg.json)" \ ${WEBHOOK_URL}regression/create-html-reports.sh (1)
1-9: Addset -uto catch unset variables early.
This script relies on many exported variables; enablingset -usurfaces missing exports immediately and aligns with repo Bash conventions.Based on learnings: In this repository, prefer using 'set -u' in Bash scripts to detect unbound variables and rely on the default unbound-variable error messages rather than implementing explicit guards with custom error messages.🔧 Suggested change
#!/bin/bash +set -u # shellcheck disable=SC1090regression/benchmark_scripts/utils.py (1)
55-67: Prefer os.path.join for path assembly.
This avoids edge cases with trailing slashes and keeps path handling consistent.♻️ Suggested change
- with open(data_file_path + "/" + d_type + "_config.json") as f: + with open(os.path.join(data_file_path, f"{d_type}_config.json")) as f: @@ - data_model, solver_settings = build_routing_datamodel_from_json( - data_file_path + "/" + data["file_name"] - ) + data_model, solver_settings = build_routing_datamodel_from_json( + os.path.join(data_file_path, data["file_name"]) + )cpp/src/dual_simplex/phase2.cpp (1)
1982-2005: Use solver tolerances instead of hard-coded1e-10A fixed
1e-10can be too strict or too loose for scaled problems. Prefer a settings-derived tolerance (e.g.,settings.zero_tolorsettings.dual_tol) to keep behavior consistent.As per coding guidelines: Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks.🔧 Example adjustment
- f_t tol = 1e-10; + f_t tol = settings.zero_tol;cpp/src/dual_simplex/cuts.cpp (1)
193-195: `drop_cuts()` is a stub—either implement or guard usage. If this is called, the pool won't be pruned and can grow without bound. Consider implementing it or adding a clear assertion/log to prevent accidental use.
If you want, I can help draft a pruning policy and implementation.
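For illustration only, here is a minimal standalone sketch of one possible score/age-based pruning policy; the struct, field names, and thresholds below are hypothetical placeholders, not cuOpt's `cut_pool_t` interface.

```cpp
// Illustrative pruning-policy sketch (hypothetical types; not the cuOpt API).
#include <algorithm>
#include <cstddef>
#include <vector>

struct pooled_cut_t {
  double score;  // e.g., distance * orthogonality from the last selection pass
  int age;       // cut passes since this cut was last selected into the LP
  bool active;   // currently part of the LP relaxation
};

// Drop stale inactive cuts, then cap the pool at max_pool_size entries,
// preferring active and high-scoring cuts.
inline void drop_cuts(std::vector<pooled_cut_t>& pool,
                      std::size_t max_pool_size,
                      int max_age)
{
  // Never drop cuts that are still in the LP; drop inactive cuts older than max_age.
  pool.erase(std::remove_if(pool.begin(),
                            pool.end(),
                            [max_age](const pooled_cut_t& c) {
                              return !c.active && c.age > max_age;
                            }),
             pool.end());

  if (pool.size() <= max_pool_size) { return; }

  // Partition so the best max_pool_size cuts come first (active cuts sort ahead),
  // then truncate. Assumes the number of active cuts stays below max_pool_size.
  std::nth_element(pool.begin(),
                   pool.begin() + static_cast<std::ptrdiff_t>(max_pool_size),
                   pool.end(),
                   [](const pooled_cut_t& a, const pooled_cut_t& b) {
                     if (a.active != b.active) { return a.active; }
                     return a.score > b.score;
                   });
  pool.resize(max_pool_size);
}
```

Keeping active cuts and dropping stale, low-scoring ones bounds pool growth without touching the selection logic; a real implementation would also need to keep `cut_norms_` and the RHS storage in sync with whatever is removed.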
| if (!bounds_changed.empty()) { | ||
| std::fill(constraint_changed.begin(), constraint_changed.end(), false); | ||
| for (i_t i = 0; i < n; ++i) { | ||
| if (bounds_changed[i]) { | ||
| const i_t row_start = A.col_start[i]; | ||
| const i_t row_end = A.col_start[i + 1]; | ||
| for (i_t p = row_start; p < row_end; ++p) { | ||
| const i_t j = A.i[p]; | ||
| constraint_changed[j] = true; | ||
| for (i_t j = 0; j < n; ++j) { | ||
| if (bounds_changed[j]) { | ||
| const i_t col_start = A.col_start[j]; | ||
| const i_t col_end = A.col_start[j + 1]; | ||
| for (i_t p = col_start; p < col_end; ++p) { | ||
| const i_t i = A.i[p]; | ||
| constraint_changed[i] = true; | ||
| } | ||
| } | ||
| } |
Validate bounds_changed size before indexing
If bounds_changed is non-empty but not n long, the loop reads out of bounds and corrupts memory. Add a size check (or assert) before indexing.
🛡️ Suggested guard
if (!bounds_changed.empty()) {
+ if (bounds_changed.size() != static_cast<size_t>(n)) {
+ settings.log.printf("bounds_changed size mismatch: %zu vs %d\n",
+ bounds_changed.size(),
+ n);
+ return false;
+ }
std::fill(constraint_changed.begin(), constraint_changed.end(), false);
for (i_t j = 0; j < n; ++j) {🤖 Prompt for AI Agents
In `@cpp/src/dual_simplex/bounds_strengthening.cpp` around lines 105 - 116, The
loop indexes bounds_changed by j assuming it has length n; add validation before
the loop to avoid out-of-bounds access by checking that bounds_changed.size() ==
static_cast<size_t>(n) (or bounds_changed.size() >= n) and handle the mismatch
(e.g., assert, log+return/error, or resize/initialize bounds_changed) so the
subsequent loop over j and access to bounds_changed[j] is safe; apply this check
near the start of the block that references bounds_changed (the code surrounding
the for (i_t j = 0; j < n; ++j) loop that also touches A.col_start and
constraint_changed) and ensure constraint_changed is still reset only when the
size check passes.
| #include <dual_simplex/basis_solves.hpp> | ||
| #include <dual_simplex/cuts.hpp> |
🧩 Analysis chain
🏁 Script executed:
# First, locate the header file
fd -a "basis_solves.hpp"Repository: NVIDIA/cuopt
Length of output: 112
🏁 Script executed:
# Check the directory structure of dual_simplex
find . -type d -name "dual_simplex" | head -20Repository: NVIDIA/cuopt
Length of output: 103
🏁 Script executed:
# List files in dual_simplex directories
find . -path "*/dual_simplex/*" -type f | grep -E "\.(hpp|h|cpp|cu|cuh)$" | head -30Repository: NVIDIA/cuopt
Length of output: 1299
🏁 Script executed:
# Check the actual includes in cuts.cpp
head -20 cpp/src/dual_simplex/cuts.cppRepository: NVIDIA/cuopt
Length of output: 758
🏁 Script executed:
# Search for basis_solves in the codebase
rg "basis_solves" --type-list | head -5
rg "basis_solves" -lRepository: NVIDIA/cuopt
Length of output: 809
🏁 Script executed:
# Get file size to determine how to read it
wc -l cpp/src/dual_simplex/cuts.cppRepository: NVIDIA/cuopt
Length of output: 90
🏁 Script executed:
# Read the full cuts.cpp file
cat -n cpp/src/dual_simplex/cuts.cppRepository: NVIDIA/cuopt
Length of output: 50369
🏁 Script executed:
# Check if drop_cuts() is called anywhere
rg "drop_cuts" --type cpp --type hppRepository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Check if drop_cuts() is called anywhere
rg "drop_cuts" -t c++Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Check if drop_cuts() is called anywhere
rg "drop_cuts"Repository: NVIDIA/cuopt
Length of output: 174
🏁 Script executed:
# Check if drop_cuts() is actually invoked/called
rg "drop_cuts\(\)"Repository: NVIDIA/cuopt
Length of output: 174
Guard against division by zero in cut scoring functions.
Lines 61 and 94 divide by norms that can be zero or near-zero:
- `cut_distance()` divides by `cut_norm = std::sqrt(dot)` without checking if the norm is zero
- `cut_orthogonality()` divides by `norm_i * norm_j` without checking for zero norms
Add epsilon checks to prevent NaN/Inf values from propagating into cut selection:
Example guard for cut_distance()
const f_t distance = (cut_norm > epsilon) ? (cut_violation / cut_norm) : 0.0;🧰 Tools
🪛 Clang (14.0.6)
[error] 8-8: 'dual_simplex/basis_solves.hpp' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
In `@cpp/src/dual_simplex/cuts.cpp` around lines 8 - 9, The cut scoring functions
cut_distance() and cut_orthogonality() currently divide by norms that can be
zero; add a small epsilon check before dividing to avoid NaN/Inf propagation. In
cut_distance(), compute cut_norm as before but only divide cut_violation by
cut_norm when cut_norm > epsilon (otherwise set distance to 0.0); in
cut_orthogonality(), check both norm_i and norm_j against epsilon and only
compute the division when (norm_i > epsilon && norm_j > epsilon) (otherwise set
orthogonality to 0.0). Use a consistent small epsilon (e.g.
std::numeric_limits<f_t>::epsilon() or a tiny constant) and ensure the functions
return a well-defined 0.0 when norms are too small.
| f_t cut_pool_t<i_t, f_t>::cut_distance(i_t row, | ||
| const std::vector<f_t>& x, | ||
| f_t& cut_violation, | ||
| f_t& cut_norm) | ||
| { | ||
| const i_t row_start = cut_storage_.row_start[row]; | ||
| const i_t row_end = cut_storage_.row_start[row + 1]; | ||
| f_t cut_x = 0.0; | ||
| f_t dot = 0.0; | ||
| for (i_t p = row_start; p < row_end; p++) { | ||
| const i_t j = cut_storage_.j[p]; | ||
| const f_t cut_coeff = cut_storage_.x[p]; | ||
| cut_x += cut_coeff * x[j]; | ||
| dot += cut_coeff * cut_coeff; | ||
| } | ||
| cut_violation = rhs_storage_[row] - cut_x; | ||
| cut_norm = std::sqrt(dot); | ||
| const f_t distance = cut_violation / cut_norm; | ||
| return distance; | ||
| } | ||
|
|
||
| template <typename i_t, typename f_t> | ||
| f_t cut_pool_t<i_t, f_t>::cut_density(i_t row) | ||
| { | ||
| const i_t row_start = cut_storage_.row_start[row]; | ||
| const i_t row_end = cut_storage_.row_start[row + 1]; | ||
| const i_t cut_nz = row_end - row_start; | ||
| const i_t original_vars = original_vars_; | ||
| return static_cast<f_t>(cut_nz) / original_vars; | ||
| } | ||
|
|
||
| template <typename i_t, typename f_t> | ||
| f_t cut_pool_t<i_t, f_t>::cut_orthogonality(i_t i, i_t j) | ||
| { | ||
| const i_t i_start = cut_storage_.row_start[i]; | ||
| const i_t i_end = cut_storage_.row_start[i + 1]; | ||
| const i_t i_nz = i_end - i_start; | ||
| const i_t j_start = cut_storage_.row_start[j]; | ||
| const i_t j_end = cut_storage_.row_start[j + 1]; | ||
| const i_t j_nz = j_end - j_start; | ||
|
|
||
| f_t dot = sparse_dot(cut_storage_.j.data() + i_start, | ||
| cut_storage_.x.data() + i_start, | ||
| i_nz, | ||
| cut_storage_.j.data() + j_start, | ||
| cut_storage_.x.data() + j_start, | ||
| j_nz); | ||
|
|
||
| f_t norm_i = cut_norms_[i]; | ||
| f_t norm_j = cut_norms_[j]; | ||
| return 1.0 - std::abs(dot) / (norm_i * norm_j); | ||
| } |
Guard cut-norm usage to prevent divide-by-zero/NaNs.
cut_distance and cut_orthogonality divide by a norm that can be zero or near-zero, producing NaN/Inf and destabilizing cut scoring/selection. Add an epsilon guard to return a neutral score when norms are too small.
🛡️ Proposed fix
f_t cut_pool_t<i_t, f_t>::cut_distance(i_t row,
const std::vector<f_t>& x,
f_t& cut_violation,
f_t& cut_norm)
{
@@
cut_violation = rhs_storage_[row] - cut_x;
cut_norm = std::sqrt(dot);
- const f_t distance = cut_violation / cut_norm;
- return distance;
+ const f_t eps = 1e-12;
+ if (cut_norm <= eps) { return 0.0; }
+ const f_t distance = cut_violation / cut_norm;
+ return distance;
}
@@
f_t norm_i = cut_norms_[i];
f_t norm_j = cut_norms_[j];
- return 1.0 - std::abs(dot) / (norm_i * norm_j);
+ const f_t eps = 1e-12;
+ const f_t denom = norm_i * norm_j;
+ if (denom <= eps) { return 0.0; }
+ return 1.0 - std::abs(dot) / denom;
}Based on learnings: Guard numerical stability by guarding near-zero norms in cut-related computations (e.g., cut_orthogonality) to avoid instability.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| f_t cut_pool_t<i_t, f_t>::cut_distance(i_t row, | |
| const std::vector<f_t>& x, | |
| f_t& cut_violation, | |
| f_t& cut_norm) | |
| { | |
| const i_t row_start = cut_storage_.row_start[row]; | |
| const i_t row_end = cut_storage_.row_start[row + 1]; | |
| f_t cut_x = 0.0; | |
| f_t dot = 0.0; | |
| for (i_t p = row_start; p < row_end; p++) { | |
| const i_t j = cut_storage_.j[p]; | |
| const f_t cut_coeff = cut_storage_.x[p]; | |
| cut_x += cut_coeff * x[j]; | |
| dot += cut_coeff * cut_coeff; | |
| } | |
| cut_violation = rhs_storage_[row] - cut_x; | |
| cut_norm = std::sqrt(dot); | |
| const f_t distance = cut_violation / cut_norm; | |
| return distance; | |
| } | |
| template <typename i_t, typename f_t> | |
| f_t cut_pool_t<i_t, f_t>::cut_density(i_t row) | |
| { | |
| const i_t row_start = cut_storage_.row_start[row]; | |
| const i_t row_end = cut_storage_.row_start[row + 1]; | |
| const i_t cut_nz = row_end - row_start; | |
| const i_t original_vars = original_vars_; | |
| return static_cast<f_t>(cut_nz) / original_vars; | |
| } | |
| template <typename i_t, typename f_t> | |
| f_t cut_pool_t<i_t, f_t>::cut_orthogonality(i_t i, i_t j) | |
| { | |
| const i_t i_start = cut_storage_.row_start[i]; | |
| const i_t i_end = cut_storage_.row_start[i + 1]; | |
| const i_t i_nz = i_end - i_start; | |
| const i_t j_start = cut_storage_.row_start[j]; | |
| const i_t j_end = cut_storage_.row_start[j + 1]; | |
| const i_t j_nz = j_end - j_start; | |
| f_t dot = sparse_dot(cut_storage_.j.data() + i_start, | |
| cut_storage_.x.data() + i_start, | |
| i_nz, | |
| cut_storage_.j.data() + j_start, | |
| cut_storage_.x.data() + j_start, | |
| j_nz); | |
| f_t norm_i = cut_norms_[i]; | |
| f_t norm_j = cut_norms_[j]; | |
| return 1.0 - std::abs(dot) / (norm_i * norm_j); | |
| } | |
| f_t cut_pool_t<i_t, f_t>::cut_distance(i_t row, | |
| const std::vector<f_t>& x, | |
| f_t& cut_violation, | |
| f_t& cut_norm) | |
| { | |
| const i_t row_start = cut_storage_.row_start[row]; | |
| const i_t row_end = cut_storage_.row_start[row + 1]; | |
| f_t cut_x = 0.0; | |
| f_t dot = 0.0; | |
| for (i_t p = row_start; p < row_end; p++) { | |
| const i_t j = cut_storage_.j[p]; | |
| const f_t cut_coeff = cut_storage_.x[p]; | |
| cut_x += cut_coeff * x[j]; | |
| dot += cut_coeff * cut_coeff; | |
| } | |
| cut_violation = rhs_storage_[row] - cut_x; | |
| cut_norm = std::sqrt(dot); | |
| const f_t eps = 1e-12; | |
| if (cut_norm <= eps) { return 0.0; } | |
| const f_t distance = cut_violation / cut_norm; | |
| return distance; | |
| } | |
| template <typename i_t, typename f_t> | |
| f_t cut_pool_t<i_t, f_t>::cut_density(i_t row) | |
| { | |
| const i_t row_start = cut_storage_.row_start[row]; | |
| const i_t row_end = cut_storage_.row_start[row + 1]; | |
| const i_t cut_nz = row_end - row_start; | |
| const i_t original_vars = original_vars_; | |
| return static_cast<f_t>(cut_nz) / original_vars; | |
| } | |
| template <typename i_t, typename f_t> | |
| f_t cut_pool_t<i_t, f_t>::cut_orthogonality(i_t i, i_t j) | |
| { | |
| const i_t i_start = cut_storage_.row_start[i]; | |
| const i_t i_end = cut_storage_.row_start[i + 1]; | |
| const i_t i_nz = i_end - i_start; | |
| const i_t j_start = cut_storage_.row_start[j]; | |
| const i_t j_end = cut_storage_.row_start[j + 1]; | |
| const i_t j_nz = j_end - j_start; | |
| f_t dot = sparse_dot(cut_storage_.j.data() + i_start, | |
| cut_storage_.x.data() + i_start, | |
| i_nz, | |
| cut_storage_.j.data() + j_start, | |
| cut_storage_.x.data() + j_start, | |
| j_nz); | |
| f_t norm_i = cut_norms_[i]; | |
| f_t norm_j = cut_norms_[j]; | |
| const f_t eps = 1e-12; | |
| const f_t denom = norm_i * norm_j; | |
| if (denom <= eps) { return 0.0; } | |
| return 1.0 - std::abs(dot) / denom; | |
| } |
🤖 Prompt for AI Agents
In `@cpp/src/dual_simplex/cuts.cpp` around lines 44 - 95, Guard against near-zero
norms in cut_distance and cut_orthogonality: compute an epsilon (e.g.,
std::numeric_limits<f_t>::epsilon() * max(1.0, cut_norm) or a small constant
like 1e-12), then if cut_norm (used in cut_distance and stored in cut_norms_) or
either norm_i/norm_j is below that epsilon, set cut_violation and cut_norm
consistently and return a neutral score (for cut_distance return 0.0 and for
cut_orthogonality return 1.0) instead of performing the division; reference the
functions cut_pool_t::cut_distance, cut_pool_t::cut_orthogonality and the field
cut_norms_ when adding these guards.
| def worker( | ||
| gpu_id, | ||
| dataset_file_path, | ||
| csv_path, | ||
| git_commit, | ||
| log_path, | ||
| test_status_file, | ||
| n_gpus, | ||
| d_type="routing", | ||
| ): | ||
| import os | ||
|
|
||
| os.environ["CUDA_VISIBLE_DEVICES"] = gpu_id | ||
|
|
||
| from utils import get_configuration | ||
|
|
||
| data_files = [] | ||
| if d_type == "lp" or d_type == "mip": | ||
| data_files = glob.glob(dataset_file_path + "/*.mps") | ||
| else: | ||
| data_files = glob.glob(dataset_file_path + "/*_config.json") | ||
| idx = int(gpu_id) | ||
| n_files = 1 # len(data_files) | ||
|
|
||
| while idx < n_files: | ||
| mr, stats_mr = reinitialize_rmm() | ||
|
|
||
| data_file = data_files[idx] | ||
| test_name = str(data_file) | ||
| status = "FAILED" | ||
| try: | ||
| test_name, data_model, solver_settings, requested_metrics = ( | ||
| get_configuration(data_file, dataset_file_path, d_type) | ||
| ) | ||
| log.basicConfig( | ||
| level=log.INFO, | ||
| filename=log_path + "/" + test_name + "_log.txt", | ||
| filemode="a+", | ||
| format="%(asctime)-15s %(levelname)-8s %(message)s", | ||
| ) | ||
| log.getLogger().setLevel(log.INFO) | ||
| log.info( | ||
| f"------------- Test Start : {test_name} gpu id : {gpu_id} -------------------" | ||
| ) | ||
| status = run_benchmark( | ||
| test_name, | ||
| data_model, | ||
| solver_settings, | ||
| requested_metrics, | ||
| csv_path, | ||
| git_commit, | ||
| test_status_file, | ||
| d_type, | ||
| ) | ||
| except Exception as e: | ||
| log.error(str(e)) | ||
|
|
||
| with open(test_status_file, "a") as f: | ||
| f.write("\n") | ||
| f.write(test_name + ": " + status) | ||
|
|
||
| # Delete instance of rmm | ||
| del mr | ||
| del stats_mr | ||
|
|
||
| log.info( | ||
| f"------------- Test End : {test_name} gpu id : {gpu_id} -------------------" | ||
| ) | ||
| idx = idx + n_gpus | ||
|
|
||
|
|
||
| def run( | ||
| dataset_file_path, | ||
| csv_path, | ||
| git_commit, | ||
| log_path, | ||
| test_status_file, | ||
| n_gpus, | ||
| d_type, | ||
| ): | ||
| # Restricting n_gpus to one to avoid resource sharing | ||
| # n_gpus = 1 | ||
| procs = [] | ||
| for gpu_id in range(int(n_gpus)): | ||
| p = Process( | ||
| target=worker, | ||
| args=( | ||
| str(gpu_id), | ||
| dataset_file_path, | ||
| csv_path, | ||
| git_commit, | ||
| log_path, | ||
| test_status_file, | ||
| int(n_gpus), | ||
| d_type, | ||
| ), | ||
| ) |
🧩 Analysis chain
🏁 Script executed:
# First, locate and read the benchmark file
fd -p "benchmark\.py" regression/Repository: NVIDIA/cuopt
Length of output: 97
🏁 Script executed:
# Get file info and read the relevant section
wc -l regression/benchmark_scripts/benchmark.pyRepository: NVIDIA/cuopt
Length of output: 101
🏁 Script executed:
# Read the worker and run functions with surrounding context
sed -n '230,350p' regression/benchmark_scripts/benchmark.pyRepository: NVIDIA/cuopt
Length of output: 3135
🏁 Script executed:
# Search for any existing synchronization imports or Lock usage in the file
rg -n "(Lock|Queue|Manager|Semaphore)" regression/benchmark_scripts/benchmark.pyRepository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
# Check the entire file to see if test_status_file is used elsewhere
rg -n "test_status_file" regression/benchmark_scripts/benchmark.pyRepository: NVIDIA/cuopt
Length of output: 295
🏁 Script executed:
# Verify process creation and ensure multiple processes are spawned
sed -n '310,340p' regression/benchmark_scripts/benchmark.pyRepository: NVIDIA/cuopt
Length of output: 717
Guard test_status_file writes across processes
Multiple processes append to the same file without synchronization; lines can interleave or be lost. Use a shared lock (or per-process files + merge).
🔒 Suggested locking
-from multiprocessing import Process
+from multiprocessing import Process, Lock
@@
def worker(
gpu_id,
dataset_file_path,
csv_path,
git_commit,
log_path,
test_status_file,
+ status_lock,
n_gpus,
d_type="routing",
):
@@
- with open(test_status_file, "a") as f:
- f.write("\n")
- f.write(test_name + ": " + status)
+ with status_lock:
+ with open(test_status_file, "a") as f:
+ f.write(f"\n{test_name}: {status}")
@@
def run(
dataset_file_path,
csv_path,
git_commit,
log_path,
test_status_file,
n_gpus,
d_type,
):
@@
- procs = []
+ procs = []
+ status_lock = Lock()
for gpu_id in range(int(n_gpus)):
p = Process(
target=worker,
args=(
str(gpu_id),
dataset_file_path,
csv_path,
git_commit,
log_path,
test_status_file,
+ status_lock,
int(n_gpus),
d_type,
),
)🧰 Tools
🪛 Ruff (0.14.14)
[warning] 291-291: Do not catch blind exception: Exception
(BLE001)
[warning] 292-292: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
In `@regression/benchmark_scripts/benchmark.py` around lines 237 - 333, The worker
function currently has multiple processes appending to test_status_file without
synchronization (race/interleaving); fix by creating a multiprocessing.Lock (or
Manager().Lock) in run, pass that lock into worker (add lock to worker signature
and Process args), and in worker surround the append to test_status_file with
lock acquisition (use lock.acquire()/release or a context manager) so only one
process writes at a time; alternatively implement per-process temp status files
(named by gpu_id) and merge them into test_status_file in run after all
processes join.
| data_files = [] | ||
| if d_type == "lp" or d_type == "mip": | ||
| data_files = glob.glob(dataset_file_path + "/*.mps") | ||
| else: | ||
| data_files = glob.glob(dataset_file_path + "/*_config.json") | ||
| idx = int(gpu_id) | ||
| n_files = 1 # len(data_files) | ||
|
|
Process all dataset files (n_files is hard-coded to 1)
This limits each worker to a single file and can raise IndexError when data_files is empty. Use len(data_files) and short‑circuit on empty input.
✅ Proposed fix
if d_type == "lp" or d_type == "mip":
data_files = glob.glob(dataset_file_path + "/*.mps")
else:
data_files = glob.glob(dataset_file_path + "/*_config.json")
idx = int(gpu_id)
- n_files = 1 # len(data_files)
+ n_files = len(data_files)
+ if n_files == 0:
+ log.warning("No dataset files found under %s", dataset_file_path)
+ return🤖 Prompt for AI Agents
In `@regression/benchmark_scripts/benchmark.py` around lines 253 - 260, The code
currently hard-codes n_files = 1 which prevents iterating all discovered files
and can cause IndexError when data_files is empty; change n_files to use the
actual number of discovered files (n_files = len(data_files)) and short‑circuit
when len(data_files) == 0 (e.g., return/continue or raise a clear error) before
indexing into data_files; update the logic around variables data_files, d_type,
dataset_file_path, gpu_id/idx to rely on n_files = len(data_files) so workers
process all files and avoid out‑of‑range access.
| import argparse | ||
|
|
||
|
|
||
| def create_update_benchamrk_db(benchmark_path, output_path, commit_hash): |
Typo in function name: benchamrk → benchmark.
✏️ Proposed fix
-def create_update_benchamrk_db(benchmark_path, output_path, commit_hash):
+def create_update_benchmark_db(benchmark_path, output_path, commit_hash):🤖 Prompt for AI Agents
In `@regression/save_benchmark_results.py` at line 10, The function name
create_update_benchamrk_db contains a typo; rename it to
create_update_benchmark_db and update every reference (calls, imports, tests,
docs) to use the corrected name so the signature (benchmark_path, output_path,
commit_hash) remains the same; ensure any exported symbols or __all__ entries
are updated accordingly and run tests to confirm no remaining references to
create_update_benchamrk_db.
| for index, rows in data.iterrows(): | ||
| out_file = index.split(".")[0] + ".csv" | ||
| out_file_path = out_path + "/" + out_file | ||
|
|
||
| if os.path.exists(out_file_path): | ||
| data = pd.read_csv(out_file_path) | ||
| data = pd.concat( | ||
| [data, rows.to_frame().T], ignore_index=True | ||
| ) | ||
| data.to_csv(out_file_path, index=False) | ||
| else: | ||
| rows.to_frame().T.to_csv(out_file_path, index=False) |
Variable data is shadowed inside the loop, causing logic error.
The outer data (line 19) is overwritten when reading the existing file (line 26), which means subsequent rows from the original CSV will be lost. Each row iteration should use a distinct variable name.
🐛 Proposed fix
for index, rows in data.iterrows():
out_file = index.split(".")[0] + ".csv"
out_file_path = out_path + "/" + out_file
if os.path.exists(out_file_path):
- data = pd.read_csv(out_file_path)
- data = pd.concat(
- [data, rows.to_frame().T], ignore_index=True
+ existing_data = pd.read_csv(out_file_path)
+ updated_data = pd.concat(
+ [existing_data, rows.to_frame().T], ignore_index=True
)
- data.to_csv(out_file_path, index=False)
+ updated_data.to_csv(out_file_path, index=False)
else:
rows.to_frame().T.to_csv(out_file_path, index=False)🤖 Prompt for AI Agents
In `@regression/save_benchmark_results.py` around lines 21 - 32, The loop shadows
the outer DataFrame named data by assigning data = pd.read_csv(...) inside the
for index, rows in data.iterrows() loop; rename the inner variable (e.g.,
existing or existing_data) and update subsequent uses (pd.concat and to_csv) to
reference that new name so the outer data and the loop variable rows remain
unchanged and all rows are processed correctly.
| python ${CUOPT_SCRIPTS_DIR}/save_benchmark_results.py -b ${RESULTS_DIR} -o ${CUOPT_SCRIPTS_DIR} -c $1 | ||
|
|
||
| cd ${CUOPT_SCRIPTS_DIR}; git add benchmarks/*; git commit -m "update benchmarks"; git push; cd - |
Git push in CI without error handling or validation.
The chained git commands could fail partially (e.g., nothing to commit, push rejected). Consider:
- Validating that `$1` is provided before use
- Handling the case where there are no changes to commit
- Using `git push` with appropriate retry or error handling
🛡️ Suggested improvement
-python ${CUOPT_SCRIPTS_DIR}/save_benchmark_results.py -b ${RESULTS_DIR} -o ${CUOPT_SCRIPTS_DIR} -c $1
+if [ -z "${1:-}" ]; then
+ echo "Error: commit hash argument required"
+ exit 1
+fi
+
+python "${CUOPT_SCRIPTS_DIR}/save_benchmark_results.py" -b "${RESULTS_DIR}" -o "${CUOPT_SCRIPTS_DIR}" -c "$1"
-cd ${CUOPT_SCRIPTS_DIR}; git add benchmarks/*; git commit -m "update benchmarks"; git push; cd -
+cd "${CUOPT_SCRIPTS_DIR}"
+git add benchmarks/* || true
+if ! git diff --cached --quiet; then
+ git commit -m "update benchmarks"
+ git push
+fi
+cd -🤖 Prompt for AI Agents
In `@regression/save_benchmarks.sh` around lines 36 - 38, In save_benchmarks.sh
ensure the script validates the required argument ($1) before calling python
${CUOPT_SCRIPTS_DIR}/save_benchmark_results.py and aborts with a non‑zero exit
code if missing; after changing to ${CUOPT_SCRIPTS_DIR} check for
staged/unstaged changes (e.g., git status --porcelain) before running git
add/commit to avoid failing on "nothing to commit", only run git commit when
there are changes and handle git commit failure by exiting with an error, and
perform git push with basic retry logic and error handling so push failures are
detected and cause the script to fail instead of silently continuing. Use the
symbols save_benchmarks.sh, ${CUOPT_SCRIPTS_DIR}, ${RESULTS_DIR}, and
save_benchmark_results.py to locate where to add these checks.
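As one possible shape for the retry/error-handling suggestion, here is a hypothetical Python sketch (not part of this PR; the commit message, repo path, and retry counts are placeholders) that commits only when something is staged and fails loudly if the push never succeeds:

import subprocess
import time

def commit_and_push(repo_dir: str, attempts: int = 3, delay_s: float = 5.0) -> None:
    # `git diff --cached --quiet` exits 0 when nothing is staged, so we can skip the commit.
    if subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo_dir).returncode == 0:
        print("nothing to commit")
        return
    subprocess.run(["git", "commit", "-m", "update benchmarks"], cwd=repo_dir, check=True)
    for attempt in range(1, attempts + 1):
        if subprocess.run(["git", "push"], cwd=repo_dir).returncode == 0:
            return
        print(f"push attempt {attempt} failed; retrying in {delay_s}s")
        time.sleep(delay_s)
    raise RuntimeError("git push failed after retries")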
# Must ensure PROJECT_DIR is exported first then load rapids-mg-tools env
export PROJECT_DIR=${PROJECT_DIR:-$(cd "$(dirname ${BASH_SOURCE[0]})" && pwd)}
source ${PROJECT_DIR}/config.sh

################################################################################

# Test
logger "Testing container image $IMAGE"
python -c "import cuopt; print(cuopt)"
logger will be undefined unless functions.sh is sourced.
This script only sources config.sh but calls logger, which is defined in regression/functions.sh. With set -e, this will abort.
🐛 Proposed fix
export PROJECT_DIR=${PROJECT_DIR:-$(cd "$(dirname ${BASH_SOURCE[0]})" && pwd)}
source ${PROJECT_DIR}/config.sh
+source ${PROJECT_DIR}/functions.sh

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Must ensure PROJECT_DIR is exported first then load rapids-mg-tools env
export PROJECT_DIR=${PROJECT_DIR:-$(cd "$(dirname ${BASH_SOURCE[0]})" && pwd)}
source ${PROJECT_DIR}/config.sh
source ${PROJECT_DIR}/functions.sh

################################################################################

# Test
logger "Testing container image $IMAGE"
python -c "import cuopt; print(cuopt)"
🤖 Prompt for AI Agents
In `@regression/test-container.sh` around lines 10 - 18, The script calls logger
(the function defined in functions.sh) but only sources config.sh, so add
sourcing of the functions file after PROJECT_DIR is set; for example, after
export PROJECT_DIR line source the functions file that defines logger (e.g.,
source "${PROJECT_DIR}/functions.sh" or the appropriate functions file in
regression) so logger is available before it's invoked and the script won't fail
under set -e.
def update_asv_db(
    commitHash=None,
    commitTime=None,
    branch=None,
    repo_url=None,
    results_dir=None,
    machine_name=None,
    gpu_type=None,
    configs=None,
):
    """
    Read the benchmark_result* files in results_dir/benchmarks and
    update an existing asv benchmark database or create one if one
    does not exist in results_dir/benchmarks/asv. If no
    benchmark_result* files are present, return without updating or
    creating.
    """

    # commitHash = commitHash + str(int(time.time()))
    benchmark_dir_path = Path(results_dir) / "benchmarks" / "results" / "csvs"
    asv_dir_path = Path(results_dir) / "benchmarks" / "results" / "asv"
Validate required inputs before use.
update_asv_db exposes optional parameters but uses them unconditionally (e.g., Path(results_dir) and commitTime * 1000). A programmatic call with defaults will crash with unclear errors. Add an explicit required-args check early.
🛡️ Proposed validation guard
 def update_asv_db(
     commitHash=None,
     commitTime=None,
     branch=None,
     repo_url=None,
     results_dir=None,
     machine_name=None,
     gpu_type=None,
     configs=None,
 ):
@@
-    # commitHash = commitHash + str(int(time.time()))
+    required = {
+        "commitHash": commitHash,
+        "commitTime": commitTime,
+        "branch": branch,
+        "repo_url": repo_url,
+        "results_dir": results_dir,
+        "machine_name": machine_name,
+        "gpu_type": gpu_type,
+        "configs": configs,
+    }
+    missing = [k for k, v in required.items() if v in (None, "")]
+    if missing:
+        raise ValueError(f"Missing required args: {', '.join(missing)}")
+
+    # commitHash = commitHash + str(int(time.time()))
     benchmark_dir_path = Path(results_dir) / "benchmarks" / "results" / "csvs"

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def update_asv_db(
    commitHash=None,
    commitTime=None,
    branch=None,
    repo_url=None,
    results_dir=None,
    machine_name=None,
    gpu_type=None,
    configs=None,
):
    """
    Read the benchmark_result* files in results_dir/benchmarks and
    update an existing asv benchmark database or create one if one
    does not exist in results_dir/benchmarks/asv. If no
    benchmark_result* files are present, return without updating or
    creating.
    """
    required = {
        "commitHash": commitHash,
        "commitTime": commitTime,
        "branch": branch,
        "repo_url": repo_url,
        "results_dir": results_dir,
        "machine_name": machine_name,
        "gpu_type": gpu_type,
        "configs": configs,
    }
    missing = [k for k, v in required.items() if v in (None, "")]
    if missing:
        raise ValueError(f"Missing required args: {', '.join(missing)}")

    # commitHash = commitHash + str(int(time.time()))
    benchmark_dir_path = Path(results_dir) / "benchmarks" / "results" / "csvs"
    asv_dir_path = Path(results_dir) / "benchmarks" / "results" / "asv"
🤖 Prompt for AI Agents
In `@regression/update_asv_database.py` around lines 12 - 33, The function
update_asv_db uses inputs like results_dir and commitTime without validation
(e.g., Path(results_dir) and commitTime * 1000), so add an early
required-arguments guard at the top of update_asv_db that checks for
presence/valid types of results_dir and either commitHash or commitTime (and
that commitTime is numeric if used); if validation fails, raise a clear
exception (ValueError) or return early with a clear message. Ensure the checks
reference the same parameter names (results_dir, commitTime, commitHash) so
callers get a deterministic error before any Path(...) or arithmetic is
attempted.
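To make the fail-fast behaviour concrete, a small self-contained sketch of the same guard pattern; the function below is a stand-in with a shortened signature, not the real update_asv_db:

from pathlib import Path

def update_db(results_dir=None, commitHash=None, commitTime=None):
    required = {"results_dir": results_dir, "commitHash": commitHash, "commitTime": commitTime}
    missing = [name for name, value in required.items() if value in (None, "")]
    if missing:
        raise ValueError(f"Missing required args: {', '.join(missing)}")
    return Path(results_dir) / "benchmarks" / "results" / "csvs"

try:
    update_db(results_dir="/tmp/results")  # commitHash and commitTime omitted
except ValueError as err:
    print(err)  # -> Missing required args: commitHash, commitTime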
…uts make it infeasible (NVIDIA#831)

Fixes an issue on neos-4413714-turia where basis repair was called during dual push, and the slack needed was superbasic. Fixes an issue where cbs-cta was incorrectly classified as infeasible after cuts were added.

Authors:
- Chris Maes (https://github.com/chris-maes)

Approvers:
- Alice Boucher (https://github.com/aliceb-nv)

URL: NVIDIA#831
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/src/dual_simplex/branch_and_bound.cpp (1)
408-455: ⚠️ Potential issue | 🟠 Major

Recompute obj after re-crushing to avoid stale comparisons.

If original_lp_ changed since the first crush, crushed_solution is rebuilt but obj still reflects the old LP, so acceptance/repair checks can run against a stale objective.

🛠️ Suggested fix

  if (crushed_solution.size() != original_lp_.num_cols) {
    // original problem has been modified since the solution was crushed
    // we need to re-crush the solution
    crush_primal_solution<i_t, f_t>(
      original_problem_, original_lp_, solution, new_slacks_, crushed_solution);
+   obj = compute_objective(original_lp_, crushed_solution);
  }

cpp/src/dual_simplex/primal.cpp (1)

299-321: ⚠️ Potential issue | 🟠 Major

Guard against superbasic output from basis_repair in primal_phase2.

basis_repair can now populate superbasic_list. If that happens, nonbasic_list may end up smaller than n - m, but subsequent loops still index it as n - m, risking out-of-bounds access and incorrect pricing. Rebuild the lists after repair and either merge superbasics or fail fast.

As per coding guidelines: "Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results."

🛠️ Suggested fix

  basis_repair(lp.A,
               settings,
               lp.lower,
               lp.upper,
               deficient,
               slacks_needed,
               basic_list,
               nonbasic_list,
               superbasic_list,
               vstatus);
+ get_basis_from_vstatus(m, vstatus, basic_list, nonbasic_list, superbasic_list);
+ if (!superbasic_list.empty()) {
+   settings.log.printf("Superbasic variables are not supported in primal_phase2\n");
+   return primal::status_t::NUMERICAL;
+ }
  rank = factorize_basis(lp.A, settings, basic_list, L, U, p, pinv, q, deficient, slacks_needed);
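To make the counting argument concrete, a toy Python illustration (not solver code; the sizes and lists are invented): with n variables and m in the basis, any superbasic variables shrink the nonbasic list below n - m, so loops written against n - m walk off its end.

n, m = 10, 4
basic = list(range(m))                       # 4 basic variables
superbasic = [8, 9]                          # suppose repair left two variables superbasic
nonbasic = [j for j in range(n) if j not in basic and j not in superbasic]

assert len(nonbasic) == n - m - len(superbasic)   # 4 entries, not 6
for k in range(n - m):                       # the stale assumption: n - m nonbasic entries
    if k >= len(nonbasic):
        print(f"index {k} would be out of bounds for nonbasic_list")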
🤖 Fix all issues with AI agents
In `@cpp/src/dual_simplex/basis_solves.cpp`:
- Around line 667-684: The branch that assigns vstatus for bad_j (inside the if
checking nonbasic_map[replace_j]) misses the case where lower[bad_j] ==
upper[bad_j] (fixed variable) and currently falls through to NONBASIC_LOWER;
update the condition order in that block to check for fixed bounds first and
assign variable_status_t::NONBASIC_FIXED when lower[bad_j] == upper[bad_j],
keeping the existing checks for NONBASIC_FREE, NONBASIC_LOWER and NONBASIC_UPPER
for the other cases; also replace the final assert(1 == 0) with assert(false &&
"unexpected bound configuration") to provide a clearer failure message.
if (nonbasic_map[replace_j] != -1) {
  nonbasic_list[nonbasic_map[replace_j]] = bad_j;
  // This is the main issue. What value should bad_j take on.
  if (lower[bad_j] == -inf && upper[bad_j] == inf) {
    vstatus[bad_j] = variable_status_t::NONBASIC_FREE;
  } else if (lower[bad_j] > -inf) {
    vstatus[bad_j] = variable_status_t::NONBASIC_LOWER;
  } else if (upper[bad_j] < inf) {
    vstatus[bad_j] = variable_status_t::NONBASIC_UPPER;
  } else {
    assert(1 == 0);
  }
} else if (superbasic_map[replace_j] != -1) {
  superbasic_list[superbasic_map[replace_j]] = bad_j;
  vstatus[bad_j] = variable_status_t::SUPERBASIC;
} else {
  assert(1 == 0);
  assert(nonbasic_map[replace_j] != -1 || superbasic_map[replace_j] != -1);
}
🧩 Analysis chain
🏁 Script executed:

# Check the variable_status_t enum to see if NONBASIC_FIXED exists
rg -n 'variable_status_t|NONBASIC_FIXED' --type=cpp --type=hpp -A2 -B2 | head -80

Repository: NVIDIA/cuopt
Length of output: 83

🏁 Script executed:

# Examine the basis_solves.cpp file structure around lines 667-684
sed -n '650,700p' cpp/src/dual_simplex/basis_solves.cpp

Repository: NVIDIA/cuopt
Length of output: 1947

🏁 Script executed:

# Check how nonbasic_map and superbasic_map are used after line 684 in this function
rg -n 'nonbasic_map|superbasic_map' cpp/src/dual_simplex/basis_solves.cpp | tail -20

Repository: NVIDIA/cuopt
Length of output: 766

🏁 Script executed:

# Find the variable_status_t enum definition
rg -n 'enum.*variable_status_t|NONBASIC_FIXED' -t c -t h -C3

Repository: NVIDIA/cuopt
Length of output: 789

🏁 Script executed:

# Check the function signature and full context
sed -n '620,690p' cpp/src/dual_simplex/basis_solves.cpp

Repository: NVIDIA/cuopt
Length of output: 2643

🏁 Script executed:

# Find where this function is defined and used
rg -n 'fix_basis|deficient_basis' cpp/src/dual_simplex/basis_solves.cpp -A1 -B1 | head -30

Repository: NVIDIA/cuopt
Length of output: 38

🏁 Script executed:

# Check how NONBASIC_FIXED is used in the codebase
rg -n 'NONBASIC_FIXED' --type-list

Repository: NVIDIA/cuopt
Length of output: 445

🏁 Script executed:

# Look for uses of NONBASIC_FIXED to understand if it's important for correctness
rg -n 'NONBASIC_FIXED' -C2

Repository: NVIDIA/cuopt
Length of output: 11626

🏁 Script executed:

# Check if the bounds checking logic elsewhere handles NONBASIC_FIXED
rg -n 'variable_status_t::NONBASIC' -C2 | head -100

Repository: NVIDIA/cuopt
Length of output: 7606
Missing NONBASIC_FIXED status assignment for fixed variables (lower == upper).
The bound-checking logic (lines 670–678) omits the case where lower[bad_j] == upper[bad_j]. This matters: elsewhere in the dual simplex solver (primal.cpp:465, phase2.cpp:1999, crossover.cpp:460 and beyond), variables with equal bounds are explicitly marked NONBASIC_FIXED and handled differently from NONBASIC_LOWER or NONBASIC_UPPER. The current code assigns NONBASIC_LOWER to such variables, which is corrected later but indicates an initial basis inconsistency.
Add a check before the lower > -inf condition:
if (lower[bad_j] == -inf && upper[bad_j] == inf) {
  vstatus[bad_j] = variable_status_t::NONBASIC_FREE;
} else if (lower[bad_j] > -inf && lower[bad_j] == upper[bad_j]) {
  vstatus[bad_j] = variable_status_t::NONBASIC_FIXED;
} else if (lower[bad_j] > -inf) {
  vstatus[bad_j] = variable_status_t::NONBASIC_LOWER;
} else if (upper[bad_j] < inf) {
  vstatus[bad_j] = variable_status_t::NONBASIC_UPPER;
} else {
  assert(false && "unexpected bound configuration");
}

Also prefer assert(false) with a descriptive message over assert(1 == 0) (lines 677, 683).
🤖 Prompt for AI Agents
In `@cpp/src/dual_simplex/basis_solves.cpp` around lines 667 - 684, The branch
that assigns vstatus for bad_j (inside the if checking nonbasic_map[replace_j])
misses the case where lower[bad_j] == upper[bad_j] (fixed variable) and
currently falls through to NONBASIC_LOWER; update the condition order in that
block to check for fixed bounds first and assign
variable_status_t::NONBASIC_FIXED when lower[bad_j] == upper[bad_j], keeping the
existing checks for NONBASIC_FREE, NONBASIC_LOWER and NONBASIC_UPPER for the
other cases; also replace the final assert(1 == 0) with assert(false &&
"unexpected bound configuration") to provide a clearer failure message.
For #827
Summary by CodeRabbit
New Features
New Configuration
Improvements
API Changes
Tests / Regression