Add gRPC server and remote execution integrated in cuOpt solver #799
base: release/26.02
Conversation
Supersedes #769
📝 Walkthrough

This PR introduces gRPC-based remote solve capabilities to cuOpt. Changes include new protobuf-based serialization, gRPC server/client implementations, a hybrid GPU/CPU solution memory architecture, build-system updates to support gRPC/Protobuf dependencies, and Python integration for remote problem submission and result retrieval.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 15
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cpp/include/cuopt/linear_programming/utilities/internals.hpp (1)
23-47: Document ABI-breaking change or refactor to maintain layout compatibility.
base_solution_callback_t is exposed in public API headers (solver_settings.hpp, optimization_problem.hpp, mip/solver_settings.hpp), and the new memory_location member changes the class layout. This is a breaking ABI change that is not documented in RELEASE-NOTES.md for 25.12. Either add it to the breaking-changes list or refactor to preserve binary compatibility (e.g., use a reserved padding field or an alternative mechanism that doesn't alter sizeof() for existing object layouts). Per coding guidelines, verify no struct layout changes in public headers.
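A minimal sketch of the reserved-padding idea mentioned above, using a hypothetical struct rather than the actual base_solution_callback_t layout:

```cpp
#include <cstddef>

// v1 (shipped ABI): reserve spare bytes up front.
struct callback_v1 {
  void* user_data;
  unsigned char reserved[8];  // padding reserved at first release
};

// v2: carve the new field out of the reserved block instead of appending,
// so sizeof() and all existing member offsets stay unchanged.
struct callback_v2 {
  void* user_data;
  unsigned char memory_location;  // new field, taken from reserved space
  unsigned char reserved[7];
};

static_assert(sizeof(callback_v1) == sizeof(callback_v2),
              "layout-compatible across versions");
```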
cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (2)

41-52: Missing null check on Python callback return values.

If PyObject_CallMethod fails (e.g., a Python exception was raised), it returns NULL. Calling Py_DECREF(NULL), or Py_DECREF on a failed res, can cause undefined behavior. Consider checking for NULL before decrementing reference counts.

🐛 Proposed fix to add null checks
```diff
 void get_solution(void* data, void* objective_value) override
 {
   PyObject* numba_matrix = data_on_device() ? get_numba_matrix(data, n_variables)
                                             : get_numpy_array(data, n_variables);
+  if (!numba_matrix) return;  // Python exception occurred
   PyObject* numpy_array = data_on_device() ? get_numba_matrix(objective_value, 1)
                                            : get_numpy_array(objective_value, 1);
+  if (!numpy_array) {
+    Py_DECREF(numba_matrix);
+    return;
+  }
   PyObject* res = PyObject_CallMethod(
     this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
   Py_DECREF(numba_matrix);
   Py_DECREF(numpy_array);
-  Py_DECREF(res);
+  Py_XDECREF(res);  // Use Py_XDECREF for potentially NULL res
 }
```
80-91: Same null-check issue in set_solution.

The set_solution method has the same pattern and should also check for NULL on Python callback failures.

🐛 Proposed fix
```diff
 void set_solution(void* data, void* objective_value) override
 {
   PyObject* numba_matrix = data_on_device() ? get_numba_matrix(data, n_variables)
                                             : get_numpy_array(data, n_variables);
+  if (!numba_matrix) return;
   PyObject* numpy_array = data_on_device() ? get_numba_matrix(objective_value, 1)
                                            : get_numpy_array(objective_value, 1);
+  if (!numpy_array) {
+    Py_DECREF(numba_matrix);
+    return;
+  }
   PyObject* res = PyObject_CallMethod(
     this->pyCallbackClass, "set_solution", "(OO)", numba_matrix, numpy_array);
   Py_DECREF(numba_matrix);
   Py_DECREF(numpy_array);
-  Py_DECREF(res);
+  Py_XDECREF(res);
 }
```
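As context for both fixes, a tiny standalone illustration of why Py_XDECREF is the right call here (CPython C API; the helper and method names are illustrative):

```cpp
#include <Python.h>

// PyObject_CallMethod returns NULL when the Python callback raises; passing
// NULL to Py_DECREF is undefined behavior, while Py_XDECREF(NULL) is a no-op.
static void call_and_release(PyObject* cb, PyObject* a, PyObject* b)
{
  PyObject* res = PyObject_CallMethod(cb, "set_solution", "(OO)", a, b);
  if (res == NULL) {
    PyErr_Print();  // report the exception raised inside the callback
  }
  Py_XDECREF(res);  // safe even when the call failed and res is NULL
}
```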
🤖 Fix all issues with AI agents
In `@build.sh`:
- Line 57: The default build unconditionally enables and builds
cuopt_grpc_server by forcing -DCUOPT_ENABLE_GRPC=ON; change build.sh so the
cuopt_grpc_server build is only executed when gRPC is detected or when the user
explicitly requests it: add a detection step (e.g., run pkg-config --exists grpc
or try cmake -DgRPC_DIR=.../ find_package(grpc) check) and gate the creation of
the cuopt_grpc_server target and the -DCUOPT_ENABLE_GRPC=ON flag on that
detection, otherwise skip adding cuopt_grpc_server to the default build (or
require an explicit CLI arg like --enable-grpc) so the script no longer forces
gRPC as a hard dependency.
In `@cpp/cuopt_remote_server.cpp`:
- Around line 1188-1217: The code currently moves result_data into the waiting
thread (waiter->result_data = std::move(result_data)), which empties it before
you store it in job_tracker; change the order or semantics so the tracker gets
the real result: either copy result_data into the waiter (use
waiter->result_data = result_data instead of std::move) and then move
result_data into the tracker (it->second.result_data = std::move(result_data)),
or update the job_tracker first (it->second.result_data =
std::move(result_data)) and then set waiter->result_data =
it->second.result_data (copy). Update the block touching
waiting_threads/waiters_mutex or the block touching job_tracker/tracker_mutex
accordingly so waiter->result_data retains the full data and job_tracker stores
the result (addresses: waiting_threads, waiters_mutex, waiter->result_data,
job_tracker, tracker_mutex, JobStatus::COMPLETED).
- Around line 1829-1833: The stdout/stderr redirection in stdout_streamer_t
(which calls dup2() in its constructor/destructor) must be serialized to avoid
cross-thread interference: introduce a process-wide mutex (e.g., a static
std::mutex stdout_redir_mutex) and acquire a lock whenever performing the
dup2-based redirection and when restoring the original fds; specifically, lock
inside stdout_streamer_t's constructor before the dup2 that redirects
stdout/stderr and keep the lock until after the dup2 in the destructor that
restores them (or alternatively lock around the scope where stdout_streamer_t
streamer(client_fd, stream_logs) is constructed in handle_sync_solve), ensuring
no two threads can perform the dup2/restore concurrently.
- Around line 303-312: The generate_job_id function currently uses shared static
RNG state (rd, gen, dis) which causes data races; change the RNG and
distribution to thread-local instead of static so each thread gets its own
std::random_device/ std::mt19937/ std::uniform_int_distribution instance, e.g.,
make rd, gen and dis thread_local inside generate_job_id (preserving the
snprintf formatting and return of buf) so concurrent calls from different client
handler threads are safe and avoid synchronization overhead.
In `@cpp/src/linear_programming/cuopt_c.cpp`:
- Around line 432-435: The remote-CPU copy currently assigns variable_types
verbatim (see cpu_data.variable_types assignment and the similar block at lines
535-540) so if callers pass 'B' the problem_cpu_data_t::is_mip() check will miss
integer/binary variables; update both copy sites to normalize
non-continuous/binary types to CUOPT_INTEGER (mirror the GPU path): iterate the
incoming variable_types array and when a type is 'B' (or otherwise
non-continuous), set the stored type to CUOPT_INTEGER before assigning
cpu_data.variable_types, ensuring MIP classification is correct.
- Around line 46-83: problem_cpu_data_t is missing an objective_scaling_factor
field and the remote cuOptReadProblem path doesn't propagate it into the view;
add a cuopt_float_t objective_scaling_factor member to problem_cpu_data_t,
populate it where cuOptReadProblem constructs or copies CPU problem data in the
remote path, and modify create_view()
(data_model_view_t<cuopt_int_t,cuopt_float_t>::create_view) to call
view.set_objective_scaling_factor(objective_scaling_factor) in addition to
set_maximize and set_objective_offset so the scaling from the MPS parser is
preserved end-to-end.
In `@cpp/src/linear_programming/solve.cu`:
- Around line 1207-1221: The cudaMemcpy calls that move var_types to
host_var_types lack error checking; update the block that uses
cudaPointerAttributes (vars: attrs, err, var_types, host_var_types) to capture
the cudaMemcpy return value (for the cudaMemcpyDeviceToHost path and any other
cudaMemcpy) into a cudaError_t, check if it's not cudaSuccess, call
cudaGetLastError() if needed, and handle/report the failure (e.g., log via
existing logger, return or throw an error) instead of ignoring it; ensure the
memcpy fallback path still clears prior cuda errors with cudaGetLastError() as
before.
- Around line 1264-1290: The cudaMemcpy calls that copy the quadratic objective
arrays (calls that move data into Q_values, Q_indices, Q_offsets from
view.get_quadratic_objective_values()/indices()/offsets()) are missing CUDA
error checks; wrap each cudaMemcpy in error-handling by capturing the
cudaError_t result, checking it against cudaSuccess, and handling failures
(e.g., log via the existing logger or return/throw an error) and/or call
cudaGetLastError() to clear and report any errors; ensure both the
device-to-host branch (the three cudaMemcpy calls) and the host branch use
consistent error reporting so failures during copying of the quadratic objective
matrix are detected and handled.
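A compact sketch of the checked-copy pattern both comments above request; names are illustrative, and the real code would report through the project logger rather than stderr:

```cpp
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdio>
#include <vector>

// Copy a device array of variable types to the host, checking the CUDA
// status instead of ignoring it. `d_var_types` is assumed to be a device
// pointer of length `n`.
bool copy_var_types_checked(const char* d_var_types, size_t n,
                            std::vector<char>& host_var_types)
{
  host_var_types.resize(n);
  cudaError_t err = cudaMemcpy(host_var_types.data(), d_var_types,
                               n * sizeof(char), cudaMemcpyDeviceToHost);
  if (err != cudaSuccess) {
    std::fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
    cudaGetLastError();  // clear the sticky error state before any fallback path
    return false;
  }
  return true;
}
```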
In `@cpp/src/linear_programming/utilities/protobuf_serializer.cu`:
- Around line 1395-1403: The cudaMemcpy call copying device MIP solution into
host vector (when solution.is_device_memory(), using d_sol and h_sol) lacks CUDA
error checking; wrap the cudaMemcpy(d_sol.data() -> h_sol.data()) call by
capturing its cudaError_t return (or using cudaMemcpyAsync + cudaGetLastError),
check for success (cudaSuccess), and on failure log or propagate the error
(e.g., via processLogger / throw) before iterating and calling
pb_solution->add_solution; ensure the check is applied in the block handling
device solutions for the MIP serialization same as LP solution handling.
- Around line 1305-1334: The cudaMemcpy calls that copy device buffers d_primal,
d_dual and d_reduced_cost into host vectors in protobuf_serializer.cu lack error
checking; wrap each cudaMemcpy invocation (the ones copying into h_primal,
h_dual, h_reduced_cost) with a cudaError_t result check, and on non-success use
cudaGetErrorString to log the error and fail the serialization path (e.g.,
return an error status or throw) so pb_solution is not populated with corrupted
data; reference the variables d_primal, d_dual, d_reduced_cost, nc_solution and
the optimization_problem_solution_t template to find the exact spots to add the
checks.
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp`:
- Around line 597-614: Replace the non-synchronized volatile flag with an atomic
boolean: change the parameter type of stream_logs_to_stdout from volatile bool*
stop_flag to std::atomic<bool>* stop_flag and replace reads of *stop_flag with
stop_flag->load(std::memory_order_acquire); update the writer side in
remote_solve.cu (the stop_logs variable and any writes) to use std::atomic<bool>
and stop_logs.store(true, std::memory_order_release) so the worker thread and
main thread synchronize safely; ensure all declarations and calls that pass
stop_logs are adjusted to the std::atomic<bool>* type.
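A minimal sketch of the acquire/release handshake described above; the loop body is an illustrative stand-in for the actual log-streaming code in remote_solve_grpc.cpp:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> stop_logs{false};

void stream_logs_to_stdout(std::atomic<bool>* stop_flag)
{
  while (!stop_flag->load(std::memory_order_acquire)) {
    // ... drain buffered log lines and print them ...
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
  }
}

int main()
{
  std::thread worker(stream_logs_to_stdout, &stop_logs);
  // ... run the solve ...
  stop_logs.store(true, std::memory_order_release);  // writer side
  worker.join();
  return 0;
}
```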
- Around line 88-100: The code calls ParseFromArray with a
static_cast<int>(size) on size_t without checking bounds; add an explicit check
before parsing (for both branches around req.mutable_lp_request() and
req.mutable_mip_request()) that if size >
static_cast<size_t>(std::numeric_limits<int>::max()) then set error_message
(e.g., "SubmitJob: request too large for ParseFromArray"), set status_code =
grpc::StatusCode::INVALID_ARGUMENT and return false; ensure you `#include`
<limits> or <climits> as needed and reference ProblemType, ParseFromArray,
req.mutable_lp_request()/req.mutable_mip_request(), error_message and
status_code in the fix.
In `@cpp/src/mip/solve.cu`:
- Around line 383-447: The cpu_problem_data_t is missing storage and propagation
for constraint row types, so add a member (e.g., std::vector<char> row_types or
the appropriate type) to cpu_problem_data_t, ensure any code that
constructs/populates cpu_problem_data_t from gpu_view copies gpu_view.row_types
into that vector, and update cpu_problem_data_t::create_view() to call
v.set_constraint_row_types(row_types.data(), row_types.size()) (in the same
manner as other constraint arrays) before returning; refer to
cpu_problem_data_t, create_view, row_types, gpu_view, and
set_constraint_row_types to locate where to add the field, copy it, and pass it
into the data_model_view_t.
- Around line 486-495: Wrap the cudaMemcpyAsync and cudaStreamSynchronize calls
with RAFT_CUDA_TRY to check for errors: replace the direct call to
cudaMemcpyAsync(cpu_data.variable_types.data(), var_types_span.data(),
var_types_span.size() * sizeof(char), cudaMemcpyDeviceToHost, stream) with
RAFT_CUDA_TRY(cudaMemcpyAsync(...)) and similarly wrap
cudaStreamSynchronize(stream) with RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
ensure you reference the same variables (cpu_data.variable_types,
var_types_span, stream) so failures in the async copy or stream sync are caught
and propagated per project guidelines.
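Spelled out, the wrapping this comment describes would look roughly like the following (RAFT_CUDA_TRY throws on a non-success CUDA status; includes and the surrounding function are omitted, and the variables are those named in the comment):

```cpp
RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
                              var_types_span.data(),
                              var_types_span.size() * sizeof(char),
                              cudaMemcpyDeviceToHost,
                              stream));
RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
```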
In `@python/cuopt/cuopt/linear_programming/remote.py`:
- Around line 136-140: The AsyncRequest's job_id field is defined as bytes but
the code assigns a str directly; update the creation of pb.AsyncRequest in
remote.py (where request.request_type = pb.CANCEL_JOB and request.job_id =
job_id) to ensure job_id is bytes (e.g., set request.job_id =
job_id.encode('utf-8') or conditionally keep it if already bytes) so
serialization no longer raises a TypeError.
🟡 Minor comments (16)
docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md-73-95 (1)
73-95: Add language tags to the ASCII diagram fences.

markdownlint (MD040) will fail on the two untyped fenced blocks; use a text fence to keep lint clean and rendering consistent.

🛠️ Suggested fix
Open both diagram fences with ```text instead of a bare ```:

```text
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Solver    │ ──► │  Solution   │ ──► │    User     │
│   (GPU)     │     │  (GPU mem)  │     │  (GPU/CPU)  │
└─────────────┘     └─────────────┘     └─────────────┘
```

```text
┌──────────┐     ┌──────────────────────────────────────────┐     ┌──────────┐
│  Client  │ ──► │                  SERVER                  │ ◄── │  Client  │
│ (no GPU) │     │    GPU solve → to_host() → serialize     │     │(solution)│
└──────────┘     └──────────────────────────────────────────┘     └──────────┘
```

cpp/cuopt_cancel_job.cpp-63-88 (1)
63-88: Harden port parsing and range validation.

std::stoi can throw on invalid input, and std::atoi silently maps invalid env values to 0. Without error handling or range checks, the tool can crash or select an invalid port. Replace both with explicit range validation (1–65535) and clear error messages.

🛠️ Suggested fix
```diff
+#include <cerrno>
 ...
+bool parse_port(const char* s, int& port)
+{
+  if (s == nullptr || *s == '\0') { return false; }
+  char* end = nullptr;
+  errno    = 0;
+  long val = std::strtol(s, &end, 10);
+  if (errno != 0 || *end != '\0' || val < 1 || val > 65535) { return false; }
+  port = static_cast<int>(val);
+  return true;
+}
+
 int main(int argc, char** argv)
 {
   // Parse arguments
   std::string job_id;
   std::string host = "localhost";
   int port         = 9090;
@@
-  if (env_port && env_port[0]) { port = std::atoi(env_port); }
+  if (env_port && env_port[0] && !parse_port(env_port, port)) {
+    std::cerr << "Error: invalid CUOPT_REMOTE_PORT\n";
+    return 1;
+  }
@@
-  } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
-    port = std::stoi(argv[++i]);
+  } else if (strcmp(argv[i], "-p") == 0) {
+    if (i + 1 >= argc || !parse_port(argv[i + 1], port)) {
+      std::cerr << "Error: invalid port\n\n";
+      print_usage(argv[0]);
+      return 1;
+    }
+    ++i;
   } else if (argv[i][0] != '-') {
     job_id = argv[i];
   }
```

python/cuopt/cuopt/utilities/__init__.py-15-25 (1)
15-25: Add __all__ and __dir__ to expose lazy utilities for IDE discoverability.

The lazy imports work correctly for direct access (e.g., from cuopt.utilities import type_cast), but without __all__ and __dir__, these names are not discoverable via IDE completion or star imports. Adding them improves the public API definition, consistent with other modules in the codebase.

Suggested fix
```diff
+__all__ = [
+    "InputRuntimeError",
+    "InputValidationError",
+    "OutOfMemoryError",
+    "catch_cuopt_exception",
+    "type_cast",
+    "series_from_buf",
+    "check_solution",
+]
+
+def __dir__():
+    return sorted(__all__)
+
 def __getattr__(name):
     """Lazy import CUDA-dependent utilities."""
     if name == "type_cast":
         from cuopt.utilities.type_casting import type_cast
         return type_cast
```

GRPC_PROTOTYPE_STATUS.md-130-144 (1)
130-144: Add a language identifier to the expected-output code fence.
This resolves MD040 and improves rendering.

✏️ Suggested edit
Open the fence as ```text:

```text
[Client] Job submitted successfully
[Client] Job ID: <hex-id>
[Client] Polling for job completion...
[Client] Job status: QUEUED - Job queued
[Client] Job status: PROCESSING - Job processing
[Client] Job status: COMPLETED - Job completed
[Client] Retrieving result...
[Client] Result retrieved successfully
[Client] LP Solution:
[Client]   Status: 2 (PDLP_OPTIMAL)
[Client]   Objective: 42.0 (dummy value)
[Client] Test completed successfully!
```

cpp/src/linear_programming/utilities/cython_solve.cu-293-304 (1)

293-304: Propagate solved_by_pdlp instead of hardcoding false.

Remote solves should mirror solver metadata; hardcoding misreports the algorithm used.

🐛 Suggested fix

```diff
-  lp_ret.solved_by_pdlp_ = false;
+  lp_ret.solved_by_pdlp_ = term_info.solved_by_pdlp;
```

cpp/cuopt_cli.cpp-508-549 (1)
508-549: Handle gRPC status probe failures before printing server max.
Failures silently report 0 MiB instead of surfacing the error; check error_message before printing the result.

🐛 Suggested guard
```diff
   cuopt::linear_programming::grpc_remote::check_status(
     address, "__cuopt_max_message_probe__", status, error_message,
     &result_size_bytes, &max_message_bytes);
-  std::cout << "Server max message MiB: " << (max_message_bytes / (1024 * 1024)) << "\n";
+  if (!error_message.empty()) {
+    std::cout << "Server max message MiB: unavailable (" << error_message << ")\n";
+  } else {
+    std::cout << "Server max message MiB: " << (max_message_bytes / (1024 * 1024)) << "\n";
+  }
```

GRPC_BUILD_STATUS.md-8-24 (1)
8-24: Clarify prototype RPC coverage to avoid mixed signals.
Lines 8–13 say CheckStatus is supported, but line 23 says "SubmitJob and GetResult only." Please align the list and the log snippet so the status is unambiguous.

GRPC_BUILD_STATUS.md-66-70 (1)
66-70: Add a language hint to the fenced block.
Line 66 opens a fenced block without a language; markdownlint MD040 flags this. Consider a text fence for the environment list.

✏️ Suggested fix
Open the fence as ```text:

```text
libgrpc: 1.73.1
libprotobuf: 6.31.1
protobuf (pip): 6.33.2
```

python/cuopt/cuopt/linear_programming/remote.py-141-213 (1)
141-213: Close sockets on all exception paths to prevent file descriptor leaks.

When socket.timeout or socket.error exceptions occur after sock.connect() (line 145), the socket is not closed before returning at lines 200–211. Wrap the I/O operations in a try/finally block to ensure sock.close() is always called.

Suggested fix
```diff
-    # Connect to server
-    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-    sock.settimeout(30.0)  # 30 second timeout
-    sock.connect((host, port))
-
-    # Send request (length-prefixed)
-    data = request.SerializeToString()
-    sock.sendall(struct.pack("<Q", len(data)))
-    sock.sendall(data)
-
-    # Receive response
-    size_data = b""
-    while len(size_data) < 8:
-        chunk = sock.recv(8 - len(size_data))
-        if not chunk:
-            sock.close()
-            return CancelResult(
-                success=False,
-                message="Failed to receive response size",
-                job_status=JobStatus.NOT_FOUND,
-            )
-        size_data += chunk
-
-    size = struct.unpack("<Q", size_data)[0]
-    response_data = b""
-    while len(response_data) < size:
-        chunk = sock.recv(size - len(response_data))
-        if not chunk:
-            break
-        response_data += chunk
-
-    sock.close()
+    # Connect to server
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    sock.settimeout(30.0)  # 30 second timeout
+    try:
+        sock.connect((host, port))
+
+        # Send request (length-prefixed)
+        data = request.SerializeToString()
+        sock.sendall(struct.pack("<Q", len(data)))
+        sock.sendall(data)
+
+        # Receive response
+        size_data = b""
+        while len(size_data) < 8:
+            chunk = sock.recv(8 - len(size_data))
+            if not chunk:
+                return CancelResult(
+                    success=False,
+                    message="Failed to receive response size",
+                    job_status=JobStatus.NOT_FOUND,
+                )
+            size_data += chunk
+
+        size = struct.unpack("<Q", size_data)[0]
+        response_data = b""
+        while len(response_data) < size:
+            chunk = sock.recv(size - len(response_data))
+            if not chunk:
+                break
+            response_data += chunk
+    finally:
+        sock.close()
```

cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp-15-18 (1)
15-18: Add a <cstdint> include for the uint8_t type definition.

This public header uses std::vector<uint8_t> extensively throughout (lines 53, 64, 79, 87, 99, 111, 123, 133, 142, 156, 169, 180, 188, 196, 205, 214, 247, 257, 265, 274, 283, 291, 305, 314, 322, 330, 355, 365, 377, 388, 399, 413) but does not explicitly include <cstdint>, which defines uint8_t. Add it to ensure the header is self-contained.

🔧 Suggested fix
```diff
+#include <cstdint>
 #include <memory>
 #include <optional>
 #include <string>
 #include <vector>
```

cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp-39-53 (1)
39-53: Validate remote port parsing and range.
Line 46: std::stoi accepts partial parses (e.g., "123abc" → 123) without an exception and does not validate that the port is within the valid range (1–65535). Invalid but numeric-prefixed strings will enable remote solve unintentionally. Add strict parsing with position tracking and range validation.

🔧 Suggested fix
```diff
-  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
-    try {
-      int port_num = std::stoi(port);
-      return remote_solve_config_t{std::string(host), port_num};
-    } catch (...) {
-      // Invalid port number, fall back to local solve
-      return std::nullopt;
-    }
-  }
+  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
+    try {
+      std::string port_str{port};
+      size_t pos   = 0;
+      int port_num = std::stoi(port_str, &pos);
+      if (pos != port_str.size() || port_num <= 0 || port_num > 65535) { return std::nullopt; }
+      return remote_solve_config_t{std::string(host), port_num};
+    } catch (...) {
+      // Invalid port number, fall back to local solve
+      return std::nullopt;
+    }
+  }
```

cpp/src/mip/solver_solution.cu-147-157 (1)
147-157: Potential null pointer dereference in get_solution().

Similarly, get_solution() dereferences solution_ without null checks. After calling to_host(), solution_ is reset to nullptr, so subsequent calls to get_solution() would crash.

Proposed defensive fix
```diff
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
+  cuopt_expects(is_device_memory_ && solution_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution() called but data is on host or device storage is null");
   return *solution_;
 }
```

cpp/src/mip/solver_solution.cu-159-169 (1)
159-169: Potential null pointer dereference in get_solution_host().

The get_solution_host() methods dereference solution_host_ without null checks. If called when is_device_memory_ is true and solution_host_ hasn't been initialized, this will crash.

Proposed defensive fix
```diff
 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
+  cuopt_expects(!is_device_memory_ && solution_host_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution_host() called but data is on device or host storage is null");
   return *solution_host_;
 }

 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
+  cuopt_expects(!is_device_memory_ && solution_host_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution_host() called but data is on device or host storage is null");
   return *solution_host_;
 }
```

cpp/include/cuopt/linear_programming/mip/solver_solution.hpp-77-86 (1)
77-86: Potential null pointer dereference in host accessors.

The get_solution_host() methods (lines 84-85) dereference solution_host_ without checking that it is valid. If is_device_memory() returns true but these methods are called, it will crash.

Consider adding an assertion or throwing an exception for misuse:
Suggested defensive check
```diff
+  /**
+   * @brief Returns the solution in host (CPU) memory.
+   * Only valid when is_device_memory() returns false.
+   * @throws std::runtime_error if called when data is on device
+   */
   std::vector<f_t>& get_solution_host();
   const std::vector<f_t>& get_solution_host() const;
```

And in the implementation:
```cpp
template <typename i_t, typename f_t>
std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
{
  cuopt_expects(!is_device_memory_ && solution_host_,
                error_type_t::RuntimeError,
                "get_solution_host() called but data is on device");
  return *solution_host_;
}
```

cpp/src/linear_programming/solver_solution.cu-529-579 (1)
529-579: Keep solved_by_pdlp consistent with termination_stats_.

set_termination_stats() updates termination_stats_, but get_solved_by_pdlp() returns solved_by_pdlp_. Unless both are kept in sync elsewhere, the getter can return stale data. Consider deriving the value from termination_stats_.solved_by_pdlp (or updating the setter).

Proposed fix (single source of truth)
```diff
 bool optimization_problem_solution_t<i_t, f_t>::get_solved_by_pdlp() const
 {
-  return solved_by_pdlp_;
+  return termination_stats_.solved_by_pdlp;
 }
```

cpp/src/linear_programming/solver_solution.cu-411-438 (1)
411-438: Device getters lack null-safety checks and risk crashes with CPU-only solutions.

The device getters get_primal_solution(), get_dual_solution(), and get_reduced_cost() unconditionally dereference unique_ptrs that are never initialized in CPU-only constructors, or that to_host() resets. While public APIs in cuopt_c.cpp and protobuf_serializer.cu properly guard these calls with is_device_memory() checks, internal code like translate_to_crossover_problem() calls the device getters without guards. Although this is currently safe due to implicit contracts (that function is only called with Optimal-status solutions, which are always device-based), the code is fragile. Consider adding defensive null checks or assertions to the device getters themselves, or ensure all call sites explicitly verify is_device_memory() before using device accessors.
🧹 Nitpick comments (11)
python/cuopt/cuopt/linear_programming/internals/internals.pyx (1)
52-57: Non-owning view and limited type support.

Two observations on this implementation:
Memory lifetime: np.ctypeslib.as_array returns a non-owning view into the memory at data. Callers must ensure the underlying buffer remains valid for the lifetime of the returned array.

Type handling: the else branch silently defaults to c_double for any typestr other than "float32". If an unexpected type (e.g., "int32") is passed, the data will be silently misinterpreted.

Consider adding an assertion or explicit check:
Suggested improvement
```diff
 def get_numpy_array(self, data, shape, typestr):
-    ctype = ctypes.c_float if typestr == "float32" else ctypes.c_double
+    if typestr == "float32":
+        ctype = ctypes.c_float
+    elif typestr == "float64":
+        ctype = ctypes.c_double
+    else:
+        raise ValueError(f"Unsupported typestr: {typestr}")
     buf_type = ctype * shape
     buf = buf_type.from_address(data)
     numpy_array = np.ctypeslib.as_array(buf)
     return numpy_array
```

cpp/src/mip/diversity/population.cu (1)
283-307: Consider keeping the incumbent-return log at DEBUG or throttling it.

If incumbents are frequent, INFO logging here can be noisy in long runs. Optional, but might be worth keeping this at DEBUG or rate-limiting.
🔧 Optional tweak (reduce log volume)
```diff
-  CUOPT_LOG_INFO("Returning incumbent solution with objective %g", user_objective);
+  CUOPT_LOG_DEBUG("Returning incumbent solution with objective %g", user_objective);
```

cpp/cuopt_cli.cpp (1)
122-213: The default is_device_memory value is already false; the suggested explicit call is optional.

The data_model_view_t member is_device_memory_ defaults to false (CPU memory) by design, which is correct for MPS-parsed data. Adding view.set_is_device_memory(false) would be redundant but could improve clarity as a defense against future default changes. This is optional.

ci/utils/install_protobuf.sh (1)
11-19: Ensure dnf config-manager availability on Rocky.

On minimal Rocky images, dnf config-manager can be missing unless dnf-plugins-core is installed, which can leave PowerTools/CRB disabled and make protobuf-devel unavailable. Consider installing the plugin before toggling repos.

♻️ Suggested tweak
```diff
 if [[ "$ID" == "rocky" ]]; then
   echo "Detected Rocky Linux. Installing Protobuf via dnf..."
+  dnf install -y dnf-plugins-core
   # Enable PowerTools (Rocky 8) or CRB (Rocky 9) repository for protobuf-devel
   if [[ "${VERSION_ID%%.*}" == "8" ]]; then
     dnf config-manager --set-enabled powertools || dnf config-manager --set-enabled PowerTools || true
   elif [[ "${VERSION_ID%%.*}" == "9" ]]; then
     dnf config-manager --set-enabled crb || true
   fi
   dnf install -y protobuf-devel protobuf-compiler
```

docs/developer/SERIALIZATION_PLUGIN_GUIDE.md (2)
13-25: Minor: Add a language specifier to the fenced code block.

Static analysis suggests adding a language specifier. For ASCII diagrams, use text, or leave as-is if the tooling accepts it.

📝 Proposed fix
Open the diagram fence as ```text instead of a bare ```.
374-377: Minor: Add a language specifier for the server output example.

📝 Proposed fix
Open the fence as ```text:

```text
[remote_solve] Loading custom serializer from: /path/to/libcuopt_my_serializer.so
[remote_solve] Using custom serializer: my_format
```

cpp/libmps_parser/CMakeLists.txt (1)

81-100: Keep mps_parser_static compile settings aligned with the shared target.

mps_parser_static doesn't inherit the C++20 standard or compile flags set on mps_parser, which can lead to divergent builds or failures if the sources require C++20. Consider applying the same properties/flags (or factoring them into a helper). If CMAKE_CXX_STANDARD is globally set, please confirm that it also applies to mps_parser_static.

♻️ Suggested alignment

```diff
 add_library(mps_parser SHARED ${MPS_PARSER_SOURCES})
 add_library(mps_parser_static STATIC ${MPS_PARSER_SOURCES})

 set_target_properties(mps_parser
   PROPERTIES
     BUILD_RPATH "\$ORIGIN"
     INSTALL_RPATH "\$ORIGIN"
     # set target compile options
     CXX_STANDARD 20
     CXX_STANDARD_REQUIRED ON
     INTERFACE_POSITION_INDEPENDENT_CODE ON
     CXX_SCAN_FOR_MODULES OFF
 )
+set_target_properties(mps_parser_static
+  PROPERTIES
+    CXX_STANDARD 20
+    CXX_STANDARD_REQUIRED ON
+    INTERFACE_POSITION_INDEPENDENT_CODE ON
+    CXX_SCAN_FOR_MODULES OFF
+)

 target_compile_options(mps_parser
   PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
 )
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)
```

cpp/include/cuopt/linear_programming/solve.hpp (1)
112-123: Clarify memory-location expectations for data_model_view_to_optimization_problem.

The doc says the view points to GPU memory, but the new API allows CPU-backed views for remote solves. Either enforce GPU-only usage or update the docs to reflect CPU/GPU support to avoid misuse. As per coding guidelines, public headers should document GPU requirements.

cpp/src/linear_programming/utilities/protobuf_serializer.cu (1)
1473-1515: Intentional dlopen handle leak is documented but could be improved.

The comment on line 1515 notes that the handle is intentionally not closed to keep the library loaded. While this is acceptable, consider storing the handle to enable proper cleanup in tests or if the serializer needs to be reloaded.
cpp/src/linear_programming/utilities/remote_solve.cu (2)
197-230: Consider exponential backoff for status polling.

The fixed 100 ms polling interval (line 229) may be inefficient for long-running jobs and can cause excessive server load. Consider exponential backoff:
Suggested improvement
```diff
+  int poll_delay_ms = 100;
+  constexpr int max_poll_delay_ms = 5000;
+
   while (true) {
     std::string st_err;
     if (!grpc_remote::check_status(
           address, job_id, status, st_err, &result_size_bytes, &max_message_bytes)) {
       // ... error handling
     }
     if (status == "COMPLETED") { break; }
     // ... error status handling
-    std::this_thread::sleep_for(std::chrono::milliseconds(100));
+    std::this_thread::sleep_for(std::chrono::milliseconds(poll_delay_ms));
+    poll_delay_ms = std::min(poll_delay_ms * 2, max_poll_delay_ms);
   }
```
414-439: Final incumbent drain loop has a hardcoded retry count.

The loop retries 5 times with 50 ms delays (lines 417, 427), which may not be sufficient in all cases. Consider making this configurable or documenting the rationale.
```diff
   -h                          - print this text
-  default action (no args) is to build and install 'libcuopt' then 'cuopt' then 'docs' targets
+  default action (no args) is to build and install 'libcuopt', 'cuopt', 'cuopt_grpc_server', then 'docs'
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Step 1: Inspect how CUOPT_ENABLE_GRPC is handled in CMakeLists.txt files
rg -n "CUOPT_ENABLE_GRPC|find_package\(gRPC|gRPC_FOUND" -t cmake -A 3

Repository: NVIDIA/cuopt
Length of output: 2406
🏁 Script executed:
#!/bin/bash
# Step 2: Examine build.sh lines around 57 and 461-483 to understand the gRPC build block
head -75 build.sh | tail -25

Repository: NVIDIA/cuopt
Length of output: 1343
🏁 Script executed:
#!/bin/bash
# Step 3: Check lines 461+ in build.sh for the gRPC build block
sed -n '450,490p' build.sh

Repository: NVIDIA/cuopt
Length of output: 1901
Gate cuopt_grpc_server build on gRPC availability, or remove it from default builds.
Lines 57 and 455–483: The default build (no arguments) includes cuopt_grpc_server and unconditionally forces -DCUOPT_ENABLE_GRPC=ON (line 468). Since gRPC is designed as optional in CMakeLists.txt, this forces gRPC as a hard dependency for default builds, causing failures when gRPC is not installed. Either gate this block on gRPC availability detection or make it explicit via user argument.
🤖 Prompt for AI Agents
In `@build.sh` at line 57, The default build unconditionally enables and builds
cuopt_grpc_server by forcing -DCUOPT_ENABLE_GRPC=ON; change build.sh so the
cuopt_grpc_server build is only executed when gRPC is detected or when the user
explicitly requests it: add a detection step (e.g., run pkg-config --exists grpc
or try cmake -DgRPC_DIR=.../ find_package(grpc) check) and gate the creation of
the cuopt_grpc_server target and the -DCUOPT_ENABLE_GRPC=ON flag on that
detection, otherwise skip adding cuopt_grpc_server to the default build (or
require an explicit CLI arg like --enable-grpc) so the script no longer forces
gRPC as a hard dependency.
cpp/cuopt_remote_server.cpp
Outdated
```cpp
std::string generate_job_id()
{
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);
}
```
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure and understand the server implementation
fd -t f "cuopt_remote_server" --extension cpp --extension hpp

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Check the overall structure and find the file
git ls-files | grep -E "cuopt_remote_server\.(cpp|hpp)"

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Get size of the file to understand how much we need to read
wc -l cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 88
🏁 Script executed:
# Read the relevant section and surrounding context to understand the function and its usage
sed -n '290,330p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 1177
🏁 Script executed:
# Search for calls to generate_job_id to understand where it's invoked
rg "generate_job_id" -B 2 -A 2

Repository: NVIDIA/cuopt
Length of output: 2795
🏁 Script executed:
# Check for thread creation or async patterns in the server
rg "std::thread|async|detach|join" cpp/cuopt_remote_server.cpp -B 2 -A 2

Repository: NVIDIA/cuopt
Length of output: 2989
🏁 Script executed:
# Let's verify the threading context more carefully - check handle_client to confirm it's called from detached threads
sed -n '1900,2000p' cpp/cuopt_remote_server.cpp | grep -A 50 "handle_client"

Repository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
# Get the exact line where detached threads are created for clients
rg "std::thread.*detach" cpp/cuopt_remote_server.cpp -B 5 -A 5

Repository: NVIDIA/cuopt
Length of output: 424
🌐 Web query:
std::mt19937 thread safety concurrent access C++
💡 Result:
std::mt19937 is not safe for concurrent access to the same engine object.
- Calling operator() on a Mersenne Twister engine advances (modifies) its internal state. If two threads call it on the same std::mt19937 instance without synchronization, that's a data race (undefined behavior). [1]
- More generally, the C++ standard library's data-race rules mean you can't concurrently call non-const operations on the same object from multiple threads without external synchronization. [2]
What to do instead
- Prefer one engine per thread, e.g. thread_local std::mt19937 rng(seed); (fast, no locking). [3]
- Or protect a shared engine with a mutex (correct but can bottleneck).
- Note: function-local static initialization is thread-safe, but subsequent concurrent use of that static engine is not. [3]
Sources: [1] [2] [3]
Make job ID generation thread-safe.
std::mt19937 and std::uniform_int_distribution are mutated concurrently by multiple client handler threads, causing a data race. Each call to generate_job_id() from different clients results in unsynchronized access to the shared static RNG state, leading to undefined behavior and potential duplicate IDs.
Use thread_local to give each thread its own RNG instance, eliminating the race condition without synchronization overhead.
Proposed fix

```diff
 std::string generate_job_id()
 {
-  static std::random_device rd;
-  static std::mt19937 gen(rd());
-  static std::uniform_int_distribution<uint64_t> dis;
+  thread_local std::mt19937_64 gen{std::random_device{}()};
+  thread_local std::uniform_int_distribution<uint64_t> dis;

   uint64_t id = dis(gen);
   char buf[32];
   snprintf(buf, sizeof(buf), "job_%016lx", id);
   return std::string(buf);
 }
```

🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 303 - 312, The generate_job_id
function currently uses shared static RNG state (rd, gen, dis) which causes data
races; change the RNG and distribution to thread-local instead of static so each
thread gets its own std::random_device/ std::mt19937/
std::uniform_int_distribution instance, e.g., make rd, gen and dis thread_local
inside generate_job_id (preserving the snprintf formatting and return of buf) so
concurrent calls from different client handler threads are safe and avoid
synchronization overhead.
cpp/cuopt_remote_server.cpp
Outdated
```cpp
// Check if there's a blocking waiter
{
  std::lock_guard<std::mutex> lock(waiters_mutex);
  auto wit = waiting_threads.find(job_id);
  if (wit != waiting_threads.end()) {
    // Wake up the waiting thread
    auto waiter           = wit->second;
    waiter->result_data   = std::move(result_data);
    waiter->error_message = error_message;
    waiter->success       = success;
    waiter->ready         = true;
    waiter->cv.notify_one();
  }
}

// Update job tracker
{
  std::lock_guard<std::mutex> lock(tracker_mutex);
  auto it = job_tracker.find(job_id);
  if (it != job_tracker.end()) {
    if (success) {
      it->second.status      = JobStatus::COMPLETED;
      it->second.result_data = result_data;
    } else if (cancelled) {
      it->second.status        = JobStatus::CANCELLED;
      it->second.error_message = error_message;
    } else {
      it->second.status        = JobStatus::FAILED;
      it->second.error_message = error_message;
    }
```
🧩 Analysis chain
🏁 Script executed:
# First, locate the file and verify it exists
find . -name "cuopt_remote_server.cpp" -type f

Repository: NVIDIA/cuopt
Length of output: 85
🏁 Script executed:
# Read the specific lines mentioned in the review (1188-1217) with context
sed -n '1180,1230p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt
Length of output: 2123
🏁 Script executed:
# Get more context about the function containing this code
sed -n '1100,1250p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt
Length of output: 6665
🏁 Script executed:
# Search for GET_RESULT or similar result retrieval methods
rg -A 5 "GET_RESULT|get_result|GetResult" cpp/cuopt_remote_server.cpp | head -60

Repository: NVIDIA/cuopt
Length of output: 1562
🏁 Script executed:
# Find the structure definitions for waiting threads and job tracker
rg -B 2 -A 10 "struct.*Waiter|struct.*JobInfo" cpp/cuopt_remote_server.cpp | head -80

Repository: NVIDIA/cuopt
Length of output: 675
🏁 Script executed:
# Check if there are any other operations on result_data after the tracker update
sed -n '85,150p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt
Length of output: 2793
🏁 Script executed:
# Find the get_job_result function implementation
rg -B 2 -A 20 "get_job_result\(" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 1715
🏁 Script executed:
# Also search for any other place where job_tracker.result_data is accessed
rg "job_tracker.*result_data|result_data.*job_tracker" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 38
Move result_data after updating job_tracker to prevent data loss.
When a blocking waiter exists, result_data is moved into the waiter at line 96, leaving an empty vector to be stored in job_tracker at line 111. Subsequent GET_RESULT calls retrieve this empty vector from the tracker, returning no solution data even for completed jobs. Copy the data for the waiter instead.
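A standalone illustration of this failure mode, independent of the server code:

```cpp
#include <cstdint>
#include <cstdio>
#include <utility>
#include <vector>

// After std::move, the source vector is left valid-but-unspecified (empty in
// practice on mainstream standard libraries), so storing it in the tracker
// afterwards stores nothing.
int main()
{
  std::vector<uint8_t> result_data{1, 2, 3};
  std::vector<uint8_t> waiter_data  = std::move(result_data);  // waiter gets the bytes
  std::vector<uint8_t> tracker_data = result_data;             // tracker sees the moved-from vector
  std::printf("waiter=%zu tracker=%zu\n", waiter_data.size(), tracker_data.size());
  // typically prints: waiter=3 tracker=0
  return 0;
}
```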
Proposed fix
{
std::lock_guard<std::mutex> lock(waiters_mutex);
auto wit = waiting_threads.find(job_id);
if (wit != waiting_threads.end()) {
// Wake up the waiting thread
- auto waiter = wit->second;
- waiter->result_data = std::move(result_data);
+ auto waiter = wit->second;
+ auto result_data_copy = result_data;
+ waiter->result_data = std::move(result_data_copy);
waiter->error_message = error_message;
waiter->success = success;
waiter->ready = true;
waiter->cv.notify_one();
}
}
@@
if (it != job_tracker.end()) {
if (success) {
it->second.status = JobStatus::COMPLETED;
- it->second.result_data = result_data;
+        it->second.result_data = std::move(result_data);

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
// Check if there's a blocking waiter
{
  std::lock_guard<std::mutex> lock(waiters_mutex);
  auto wit = waiting_threads.find(job_id);
  if (wit != waiting_threads.end()) {
    // Wake up the waiting thread
    auto waiter           = wit->second;
    auto result_data_copy = result_data;
    waiter->result_data   = std::move(result_data_copy);
    waiter->error_message = error_message;
    waiter->success       = success;
    waiter->ready         = true;
    waiter->cv.notify_one();
  }
}

// Update job tracker
{
  std::lock_guard<std::mutex> lock(tracker_mutex);
  auto it = job_tracker.find(job_id);
  if (it != job_tracker.end()) {
    if (success) {
      it->second.status      = JobStatus::COMPLETED;
      it->second.result_data = std::move(result_data);
    } else if (cancelled) {
      it->second.status        = JobStatus::CANCELLED;
      it->second.error_message = error_message;
    } else {
      it->second.status        = JobStatus::FAILED;
      it->second.error_message = error_message;
    }
```
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 1188 - 1217, The code currently
moves result_data into the waiting thread (waiter->result_data =
std::move(result_data)), which empties it before you store it in job_tracker;
change the order or semantics so the tracker gets the real result: either copy
result_data into the waiter (use waiter->result_data = result_data instead of
std::move) and then move result_data into the tracker (it->second.result_data =
std::move(result_data)), or update the job_tracker first (it->second.result_data
= std::move(result_data)) and then set waiter->result_data =
it->second.result_data (copy). Update the block touching
waiting_threads/waiters_mutex or the block touching job_tracker/tracker_mutex
accordingly so waiter->result_data retains the full data and job_tracker stores
the result (addresses: waiting_threads, waiters_mutex, waiter->result_data,
job_tracker, tracker_mutex, JobStatus::COMPLETED).
cpp/cuopt_remote_server.cpp
Outdated
```cpp
  // Use RAII stdout streamer - captures stdout and streams to client while
  // also echoing to server console. Destructor restores original stdout.
  {
    stdout_streamer_t streamer(client_fd, stream_logs);
```
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the handle_sync_solve function
fd -t f cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Search for handle_sync_solve function definition
rg -n "void handle_sync_solve" --type cpp

Repository: NVIDIA/cuopt
Length of output: 126
🏁 Script executed:
# Find stdout_streamer_t implementation
rg -n "class stdout_streamer_t|struct stdout_streamer_t" --type cpp --type hpp

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Read the handle_sync_solve function
sed -n '1808,1850p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 1551
🏁 Script executed:
# Search for stdout_streamer_t definition (broader search)
rg -n "stdout_streamer_t" cpp/cuopt_remote_server.cpp -A 5 -B 2

Repository: NVIDIA/cuopt
Length of output: 1438
🏁 Script executed:
# Search for stdout_streamer_t in all files
rg -n "class stdout_streamer_t|struct stdout_streamer_t"

Repository: NVIDIA/cuopt
Length of output: 113
🏁 Script executed:
# Read the full stdout_streamer_t class implementation
sed -n '133,200p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 1569
🏁 Script executed:
# Check the main server loop to understand concurrent request handling
rg -n "handle_sync_solve|handle_async_solve|main\(" cpp/cuopt_remote_server.cpp -A 3 -B 1 | head -60

Repository: NVIDIA/cuopt
Length of output: 547
🏁 Script executed:
# Search for any existing mutex or synchronization in server
rg -n "mutex|lock_guard|atomic|thread" cpp/cuopt_remote_server.cpp | head -20

Repository: NVIDIA/cuopt
Length of output: 1181
🏁 Script executed:
# Read the context around where handle_sync_solve is called (line 2113)
sed -n '2100,2120p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 743
🏁 Script executed:
# Find the function that contains line 2113
sed -n '2050,2115p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 2582
🏁 Script executed:
# Check server main loop and thread handling
sed -n '2138,2200p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 2509
🏁 Script executed:
# Find the function that contains the handle_sync_solve call
sed -n '1900,2115p' cpp/cuopt_remote_server.cpp | head -100

Repository: NVIDIA/cuopt
Length of output: 4499
🏁 Script executed:
# Search for the start of the function containing handle_sync_solve
rg -n "^void\s+\w+\(|^int\s+\w+\(" cpp/cuopt_remote_server.cpp | grep -B 1 "^1"

Repository: NVIDIA/cuopt
Length of output: 464
🏁 Script executed:
# Look for the accept() loop and threading model for handling clients
rg -n "accept\(|socket\(|listen\(" cpp/cuopt_remote_server.cpp -A 5 -B 2

Repository: NVIDIA/cuopt
Length of output: 785
🏁 Script executed:
# Read what happens after accept() in the main loop
sed -n '2259,2280p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 732
🏁 Script executed:
# Verify dup2 with STDOUT_FILENO only appears in stdout_streamer_t
rg -n "dup2.*STDOUT_FILENO" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 237
🏁 Script executed:
# Check lines 591 and 706 to see if they also have concurrency issues
sed -n '580,600p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 808
🏁 Script executed:
# Check lines 700-715
sed -n '700,715p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt
Length of output: 659
Protect stdout/stderr redirection with a mutex to prevent concurrent client interference.
stdout_streamer_t uses process-wide dup2() calls in lines 156 and 175. Since handle_sync_solve is invoked from multiple detached threads (line 2274), concurrent clients can interfere: when Thread A redirects stdout to pipe_A and Thread B subsequently redirects to pipe_B before A's destructor runs, logs leak across clients and stream restoration breaks. Add synchronization:
Proposed fix
void handle_sync_solve(int client_fd,
const std::vector<uint8_t>& request_data,
bool is_mip,
bool stream_logs)
{
+ static std::mutex stdout_redirect_mutex;
+ std::lock_guard<std::mutex> lock(stdout_redirect_mutex);
+
  std::string job_id = generate_job_id();

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cpp
void handle_sync_solve(int client_fd,
                       const std::vector<uint8_t>& request_data,
                       bool is_mip,
                       bool stream_logs)
{
  static std::mutex stdout_redirect_mutex;
  std::lock_guard<std::mutex> lock(stdout_redirect_mutex);

  std::string job_id = generate_job_id();
  ...
  // Use RAII stdout streamer - captures stdout and streams to client while
  // also echoing to server console. Destructor restores original stdout.
  {
    stdout_streamer_t streamer(client_fd, stream_logs);
```
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 1829 - 1833, The stdout/stderr
redirection in stdout_streamer_t (which calls dup2() in its
constructor/destructor) must be serialized to avoid cross-thread interference:
introduce a process-wide mutex (e.g., a static std::mutex stdout_redir_mutex)
and acquire a lock whenever performing the dup2-based redirection and when
restoring the original fds; specifically, lock inside stdout_streamer_t's
constructor before the dup2 that redirects stdout/stderr and keep the lock until
after the dup2 in the destructor that restores them (or alternatively lock
around the scope where stdout_streamer_t streamer(client_fd, stream_logs) is
constructed in handle_sync_solve), ensuring no two threads can perform the
dup2/restore concurrently.
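As a sketch of the alternative the prompt mentions (locking inside the streamer itself), a hypothetical RAII guard might look like this; it is illustrative, not the actual stdout_streamer_t:

```cpp
#include <mutex>
#include <unistd.h>

class scoped_stdout_redirect {
 public:
  explicit scoped_stdout_redirect(int target_fd)
  {
    mtx().lock();  // serialize all process-wide redirects
    saved_stdout_ = dup(STDOUT_FILENO);
    dup2(target_fd, STDOUT_FILENO);  // affects every thread in the process
  }
  ~scoped_stdout_redirect()
  {
    dup2(saved_stdout_, STDOUT_FILENO);  // restore the original stdout
    close(saved_stdout_);
    mtx().unlock();  // released only after restoration completes
  }

 private:
  static std::mutex& mtx()
  {
    static std::mutex m;  // one lock for the whole process
    return m;
  }
  int saved_stdout_ = -1;
};
```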
```cpp
  bool maximize = false;
  cuopt_float_t objective_offset = 0.0;
  std::vector<cuopt_float_t> objective_coefficients;

  // Quadratic objective (optional)
  std::vector<cuopt_float_t> Q_values;
  std::vector<cuopt_int_t> Q_indices;
  std::vector<cuopt_int_t> Q_offsets;

  // Constraint matrix (CSR format)
  std::vector<cuopt_float_t> A_values;
  std::vector<cuopt_int_t> A_indices;
  std::vector<cuopt_int_t> A_offsets;

  // Constraint bounds (two representations)
  std::vector<char> row_types;                         // '<', '>', '=' style
  std::vector<cuopt_float_t> constraint_bounds;        // single RHS for row_types style
  std::vector<cuopt_float_t> constraint_lower_bounds;  // ranged style
  std::vector<cuopt_float_t> constraint_upper_bounds;  // ranged style
  bool uses_ranged_constraints = false;

  // Variable bounds
  std::vector<cuopt_float_t> variable_lower_bounds;
  std::vector<cuopt_float_t> variable_upper_bounds;

  // Variable types
  std::vector<char> variable_types;  // 'C' for continuous, 'I' for integer

  /**
   * @brief Create a data_model_view_t pointing to this CPU data.
   */
  cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> create_view() const
  {
    cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> view;

    view.set_maximize(maximize);
    view.set_objective_offset(objective_offset);
```
Remote CPU path drops objective scaling factor from MPS input.
Line 46-83 and Line 330-333: problem_cpu_data_t doesn’t store objective_scaling_factor, and the remote cuOptReadProblem path doesn’t propagate it. If the MPS parser sets a scaling factor, remote solve may compute objectives incorrectly. Add a field and pass it into the view.
🔧 Suggested fix
struct problem_cpu_data_t {
// Objective
bool maximize = false;
cuopt_float_t objective_offset = 0.0;
+ cuopt_float_t objective_scaling_factor = 1.0;
std::vector<cuopt_float_t> objective_coefficients;
...
cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> create_view() const
{
cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> view;
view.set_maximize(maximize);
view.set_objective_offset(objective_offset);
+ view.set_objective_scaling_factor(objective_scaling_factor);
...
}
};
...
cpu_data.maximize = mps.get_sense();
cpu_data.objective_offset = mps.get_objective_offset();
+ cpu_data.objective_scaling_factor = mps.get_objective_scaling_factor();

Also applies to: 330-333
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/cuopt_c.cpp` around lines 46 - 83,
problem_cpu_data_t is missing an objective_scaling_factor field and the remote
cuOptReadProblem path doesn't propagate it into the view; add a cuopt_float_t
objective_scaling_factor member to problem_cpu_data_t, populate it where
cuOptReadProblem constructs or copies CPU problem data in the remote path, and
modify create_view() (data_model_view_t<cuopt_int_t,cuopt_float_t>::create_view)
to call view.set_objective_scaling_factor(objective_scaling_factor) in addition
to set_maximize and set_objective_offset so the scaling from the MPS parser is
preserved end-to-end.
```cpp
  if (problem_type == ProblemType::LP) {
    auto* lp_req = req.mutable_lp_request();
    if (!lp_req->ParseFromArray(data, static_cast<int>(size))) {
      error_message = "SubmitJob: failed to parse LP request";
      status_code   = grpc::StatusCode::INVALID_ARGUMENT;
      return false;
    }
  } else {
    auto* mip_req = req.mutable_mip_request();
    if (!mip_req->ParseFromArray(data, static_cast<int>(size))) {
      error_message = "SubmitJob: failed to parse MIP request";
      status_code   = grpc::StatusCode::INVALID_ARGUMENT;
      return false;
```
🧩 Analysis chain (condensed)

Repository checks and a protobuf documentation lookup confirm the issue: MessageLite::ParseFromArray(const void* data, int size) takes a 32-bit signed size, so a single serialized message is hard-capped below 2 GiB (protobuf.dev proto-limits and encoding guides), and a size that wraps into a negative int triggers truncation/limit parse failures. submit_job in remote_solve_grpc.cpp receives a size_t and performs no such bound check before the cast; the caller's configured max_bytes limit is the only guard. For payloads anywhere near that scale, protobuf's guidance is to chunk messages and keep bulk data out-of-band.
Add size bounds check before ParseFromArray to prevent int overflow.
ParseFromArray accepts an int for size, but submit_job receives a size_t parameter without validating against std::numeric_limits<int>::max(). If size > INT_MAX, the cast causes silent integer overflow, leading to unpredictable parse failures. The caller validates against a configured max_bytes limit, but not against the protobuf hard limit. Add an explicit check:
cuopt::remote::SubmitJobRequest req;
+ if (size > static_cast<size_t>(std::numeric_limits<int>::max())) {
+ error_message = "SubmitJob: request too large for protobuf ParseFromArray";
+ status_code = grpc::StatusCode::RESOURCE_EXHAUSTED;
+ return false;
+ }
+
if (problem_type == ProblemType::LP) {
auto* lp_req = req.mutable_lp_request();
if (!lp_req->ParseFromArray(data, static_cast<int>(size))) {🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp` around lines 88 -
100, The code calls ParseFromArray with a static_cast<int>(size) on size_t
without checking bounds; add an explicit check before parsing (for both branches
around req.mutable_lp_request() and req.mutable_mip_request()) that if size >
static_cast<size_t>(std::numeric_limits<int>::max()) then set error_message
(e.g., "SubmitJob: request too large for ParseFromArray"), set status_code =
grpc::StatusCode::INVALID_ARGUMENT and return false; ensure you `#include`
<limits> or <climits> as needed and reference ProblemType, ParseFromArray,
req.mutable_lp_request()/req.mutable_mip_request(), error_message and
status_code in the fix.
void stream_logs_to_stdout(const std::string& address,
                           const std::string& job_id,
                           volatile bool* stop_flag,
                           const std::string& print_prefix)
{
  auto stub = make_stub(address);
  grpc::ClientContext ctx;
  cuopt::remote::StreamLogsRequest req;
  req.set_job_id(job_id);
  req.set_from_byte(0);

  std::unique_ptr<grpc::ClientReader<cuopt::remote::LogMessage>> reader =
    stub->StreamLogs(&ctx, req);

  cuopt::remote::LogMessage msg;
  while (reader->Read(&msg)) {
    if (stop_flag != nullptr && *stop_flag) { ctx.TryCancel(); }
    if (!msg.line().empty()) {
Use an atomic stop flag instead of volatile.
volatile bool* does not provide inter-thread synchronization and creates a data race when one thread writes the flag while another thread reads it. In cpp/src/linear_programming/utilities/remote_solve.cu, stop_logs is passed to a worker thread that continuously reads it while the main thread writes to it. Use std::atomic<bool> and call load() with appropriate memory ordering to ensure safe inter-thread communication.
Proposed fix
-void stream_logs_to_stdout(const std::string& address,
- const std::string& job_id,
- volatile bool* stop_flag,
- const std::string& print_prefix)
+void stream_logs_to_stdout(const std::string& address,
+ const std::string& job_id,
+ std::atomic<bool>* stop_flag,
+ const std::string& print_prefix)
{
auto stub = make_stub(address);
grpc::ClientContext ctx;
@@
- while (reader->Read(&msg)) {
- if (stop_flag != nullptr && *stop_flag) { ctx.TryCancel(); }
+ while (reader->Read(&msg)) {
+ if (stop_flag && stop_flag->load(std::memory_order_relaxed)) { ctx.TryCancel(); }🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp` around lines 597
- 614, Replace the non-synchronized volatile flag with an atomic boolean: change
the parameter type of stream_logs_to_stdout from volatile bool* stop_flag to
std::atomic<bool>* stop_flag and replace reads of *stop_flag with
stop_flag->load(std::memory_order_acquire); update the writer side in
remote_solve.cu (the stop_logs variable and any writes) to use std::atomic<bool>
and stop_logs.store(true, std::memory_order_release) so the worker thread and
main thread synchronize safely; ensure all declarations and calls that pass
stop_logs are adjusted to the std::atomic<bool>* type.
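As a companion to the diff above, a minimal self-contained sketch of the atomic-flag pattern it proposes; the names (stop, reader) are illustrative rather than taken from remote_solve.cu:

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

int main()
{
  std::atomic<bool> stop{false};

  // Reader thread: polls the flag the way the log-streaming loop would.
  std::thread reader([&stop] {
    while (!stop.load(std::memory_order_acquire)) {
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    std::cout << "reader observed stop\n";
  });

  // Writer (main thread): requests shutdown; release pairs with acquire above.
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  stop.store(true, std::memory_order_release);
  reader.join();
  return 0;
}
```

Unlike volatile, the atomic store/load pair establishes a happens-before edge, so the reader observes the write without a data race.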
// Helper struct to hold CPU copies of GPU data for remote solve
template <typename i_t, typename f_t>
struct cpu_problem_data_t {
  std::vector<f_t> A_values;
  std::vector<i_t> A_indices;
  std::vector<i_t> A_offsets;
  std::vector<f_t> constraint_bounds;
  std::vector<f_t> constraint_lower_bounds;
  std::vector<f_t> constraint_upper_bounds;
  std::vector<f_t> objective_coefficients;
  std::vector<f_t> variable_lower_bounds;
  std::vector<f_t> variable_upper_bounds;
  std::vector<char> variable_types;
  std::vector<f_t> quadratic_objective_values;
  std::vector<i_t> quadratic_objective_indices;
  std::vector<i_t> quadratic_objective_offsets;
  bool maximize;
  f_t objective_scaling_factor;
  f_t objective_offset;

  data_model_view_t<i_t, f_t> create_view() const
  {
    data_model_view_t<i_t, f_t> v;
    v.set_maximize(maximize);
    v.set_objective_scaling_factor(objective_scaling_factor);
    v.set_objective_offset(objective_offset);

    if (!A_values.empty()) {
      v.set_csr_constraint_matrix(A_values.data(),
                                  A_values.size(),
                                  A_indices.data(),
                                  A_indices.size(),
                                  A_offsets.data(),
                                  A_offsets.size());
    }
    if (!constraint_bounds.empty()) {
      v.set_constraint_bounds(constraint_bounds.data(), constraint_bounds.size());
    }
    if (!constraint_lower_bounds.empty() && !constraint_upper_bounds.empty()) {
      v.set_constraint_lower_bounds(constraint_lower_bounds.data(), constraint_lower_bounds.size());
      v.set_constraint_upper_bounds(constraint_upper_bounds.data(), constraint_upper_bounds.size());
    }
    if (!objective_coefficients.empty()) {
      v.set_objective_coefficients(objective_coefficients.data(), objective_coefficients.size());
    }
    if (!variable_lower_bounds.empty()) {
      v.set_variable_lower_bounds(variable_lower_bounds.data(), variable_lower_bounds.size());
    }
    if (!variable_upper_bounds.empty()) {
      v.set_variable_upper_bounds(variable_upper_bounds.data(), variable_upper_bounds.size());
    }
    if (!variable_types.empty()) {
      v.set_variable_types(variable_types.data(), variable_types.size());
    }
    if (!quadratic_objective_values.empty()) {
      v.set_quadratic_objective_matrix(quadratic_objective_values.data(),
                                       quadratic_objective_values.size(),
                                       quadratic_objective_indices.data(),
                                       quadratic_objective_indices.size(),
                                       quadratic_objective_offsets.data(),
                                       quadratic_objective_offsets.size());
    }
    v.set_is_device_memory(false);
    return v;
  }
Preserve constraint row types when copying GPU views to CPU.
Line 383-447 and Line 469-475: row_types are never captured in cpu_problem_data_t nor copied from gpu_view. For models represented via row_types + RHS, the remote solve loses constraint sense and can produce invalid results. Add row_types storage and propagate it into the CPU view.
🔧 Suggested fix
struct cpu_problem_data_t {
std::vector<f_t> A_values;
std::vector<i_t> A_indices;
std::vector<i_t> A_offsets;
+ std::vector<char> row_types;
std::vector<f_t> constraint_bounds;
std::vector<f_t> constraint_lower_bounds;
std::vector<f_t> constraint_upper_bounds;
...
data_model_view_t<i_t, f_t> create_view() const
{
data_model_view_t<i_t, f_t> v;
...
+ if (!row_types.empty()) {
+ v.set_row_types(row_types.data(), row_types.size());
+ }
if (!constraint_bounds.empty()) {
v.set_constraint_bounds(constraint_bounds.data(), constraint_bounds.size());
}
...
}
};
...
copy_to_host(cpu_data.A_values, gpu_view.get_constraint_matrix_values());
copy_to_host(cpu_data.A_indices, gpu_view.get_constraint_matrix_indices());
copy_to_host(cpu_data.A_offsets, gpu_view.get_constraint_matrix_offsets());
+ copy_to_host(cpu_data.row_types, gpu_view.get_row_types());
  copy_to_host(cpu_data.constraint_bounds, gpu_view.get_constraint_bounds());

Also applies to: 469-475
🤖 Prompt for AI Agents
In `@cpp/src/mip/solve.cu` around lines 383 - 447, The cpu_problem_data_t is
missing storage and propagation for constraint row types, so add a member (e.g.,
std::vector<char> row_types or the appropriate type) to cpu_problem_data_t,
ensure any code that constructs/populates cpu_problem_data_t from gpu_view
copies gpu_view.row_types into that vector, and update
cpu_problem_data_t::create_view() to call
v.set_constraint_row_types(row_types.data(), row_types.size()) (in the same
manner as other constraint arrays) before returning; refer to
cpu_problem_data_t, create_view, row_types, gpu_view, and
set_constraint_row_types to locate where to add the field, copy it, and pass it
into the data_model_view_t.
    cudaMemcpyAsync(cpu_data.variable_types.data(),
                    var_types_span.data(),
                    var_types_span.size() * sizeof(char),
                    cudaMemcpyDeviceToHost,
                    stream);
  }

  // Synchronize to ensure all copies are complete
  cudaStreamSynchronize(stream);
Add CUDA error checks around async copy and stream synchronization.
Lines 486-495: cudaMemcpyAsync and cudaStreamSynchronize lack error checking. Wrap with RAFT_CUDA_TRY to catch and propagate CUDA failures. Per coding guidelines, all CUDA memory operations must have error verification.
🔧 Suggested fix
- cudaMemcpyAsync(cpu_data.variable_types.data(),
- var_types_span.data(),
- var_types_span.size() * sizeof(char),
- cudaMemcpyDeviceToHost,
- stream);
+ RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
+ var_types_span.data(),
+ var_types_span.size() * sizeof(char),
+ cudaMemcpyDeviceToHost,
+ stream));
}
// Synchronize to ensure all copies are complete
- cudaStreamSynchronize(stream);
+ RAFT_CUDA_TRY(cudaStreamSynchronize(stream));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cudaMemcpyAsync(cpu_data.variable_types.data(), | |
| var_types_span.data(), | |
| var_types_span.size() * sizeof(char), | |
| cudaMemcpyDeviceToHost, | |
| stream); | |
| } | |
| // Synchronize to ensure all copies are complete | |
| cudaStreamSynchronize(stream); | |
| RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(), | |
| var_types_span.data(), | |
| var_types_span.size() * sizeof(char), | |
| cudaMemcpyDeviceToHost, | |
| stream)); | |
| } | |
| // Synchronize to ensure all copies are complete | |
| RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); |
🤖 Prompt for AI Agents
In `@cpp/src/mip/solve.cu` around lines 486 - 495, Wrap the cudaMemcpyAsync and
cudaStreamSynchronize calls with RAFT_CUDA_TRY to check for errors: replace the
direct call to cudaMemcpyAsync(cpu_data.variable_types.data(),
var_types_span.data(), var_types_span.size() * sizeof(char),
cudaMemcpyDeviceToHost, stream) with RAFT_CUDA_TRY(cudaMemcpyAsync(...)) and
similarly wrap cudaStreamSynchronize(stream) with
RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); ensure you reference the same
variables (cpu_data.variable_types, var_types_span, stream) so failures in the
async copy or stream sync are caught and propagated per project guidelines.
    # Create cancel request
    request = pb.AsyncRequest()
    request.request_type = pb.CANCEL_JOB
    request.job_id = job_id
Fix type mismatch: job_id must be encoded to bytes.
The .proto file defines job_id as bytes in AsyncRequest (cuopt_remote.proto:285), but the function accepts str and assigns it directly without encoding. This causes a TypeError when the protobuf message serializes.
Suggested fix
- request.job_id = job_id
+ request.job_id = (
+ job_id if isinstance(job_id, (bytes, bytearray)) else job_id.encode()
+ )
🤖 Prompt for AI Agents
In `@python/cuopt/cuopt/linear_programming/remote.py` around lines 136 - 140, The
AsyncRequest's job_id field is defined as bytes but the code assigns a str
directly; update the creation of pb.AsyncRequest in remote.py (where
request.request_type = pb.CANCEL_JOB and request.job_id = job_id) to ensure
job_id is bytes (e.g., set request.job_id = job_id.encode('utf-8') or
conditionally keep it if already bytes) so serialization no longer raises a
TypeError.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
conda/recipes/libcuopt/recipe.yaml (1)
152-175: Add grpc-cpp to run dependencies.

The build enables CUOPT_ENABLE_GRPC=ON and installs both cuopt_grpc_server and cuopt_remote_server executables to bin/. Both link gRPC::grpc++ and will fail at runtime without grpc-cpp. While protobuf is already in run (correctly mitigating the ignored run_exports), grpc-cpp is missing and must be added.
🤖 Fix all issues with AI agents
In `@ci/utils/install_protobuf_grpc.sh`:
- Around line 22-24: The Rocky Linux gRPC install line uses "dnf install -y
grpc-devel grpc-plugins || dnf install -y grpc-devel || true", which silences
failures and violates set -euo pipefail; update the dnf installation logic in
ci/utils/install_protobuf_grpc.sh so it does not end with "|| true": run the
first dnf install attempt, if it fails try the fallback install, and if that
also fails print a clear error via echo or >&2 with context ("gRPC packages
missing: grpc-devel/grpc-plugins") and exit non‑zero; ensure this explicit check
replaces the trailing "|| true" so the script fails fast on missing gRPC
packages.
# gRPC C++ (headers/libs + grpc_cpp_plugin for codegen)
# Package names can vary by repo; try the common ones first.
dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true
Don't silently ignore gRPC install failures; breaks the script's fail-fast contract.
The script uses set -euo pipefail but the Rocky Linux gRPC install ends with || true, bypassing this contract. This causes silent success even when gRPC packages are unavailable, deferring the error to a later CMake build failure with obscured context. The Ubuntu path correctly fails hard on missing packages—Rocky should do the same.
Replace || true with explicit error handling and validation:
🔧 Suggested fix
- dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true
+ dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || {
+ echo "gRPC packages not available in enabled repos; aborting."
+ exit 1
+ }
+ command -v grpc_cpp_plugin >/dev/null || {
+ echo "grpc_cpp_plugin not found after install; aborting."
+ exit 1
🤖 Prompt for AI Agents
In `@ci/utils/install_protobuf_grpc.sh` around lines 22 - 24, The Rocky Linux gRPC
install line uses "dnf install -y grpc-devel grpc-plugins || dnf install -y
grpc-devel || true", which silences failures and violates set -euo pipefail;
update the dnf installation logic in ci/utils/install_protobuf_grpc.sh so it
does not end with "|| true": run the first dnf install attempt, if it fails try
the fallback install, and if that also fails print a clear error via echo or >&2
with context ("gRPC packages missing: grpc-devel/grpc-plugins") and exit
non‑zero; ensure this explicit check replaces the trailing "|| true" so the
script fails fast on missing gRPC packages.
 *
 * @param is_device true if data is in GPU memory, false if in CPU memory
 */
void set_is_device_memory(bool is_device) noexcept { is_device_memory_ = is_device; }
Should this be

void set_device_memory(bool is_device) noexcept { is_device_memory_ = is_device; }

or enable_device_memory?
template class mps_data_model_t<int, float>;

template class mps_data_model_t<int, double>;
template class mps_data_model_t<int64_t, float>;
Do we support int64_t?
@@ -0,0 +1,245 @@
// SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Is this generated by gRPC or manually added?
rgsl888prabhu left a comment
Let's add test cases to check remote execution for both gRPC and HTTP.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cpp/CMakeLists.txt`:
- Around line 280-307: The CMake logic makes gRPC mandatory because it uses
find_package(gRPC REQUIRED) and never declares an option; change to declare an
option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC support" OFF) ) before the
find_package call, remove REQUIRED from find_package(gRPC) so it becomes
optional, and only set CUOPT_ENABLE_GRPC ON /
add_compile_definitions(CUOPT_ENABLE_GRPC) when gRPC is found and the option is
enabled (otherwise leave the option OFF and handle the error branch that checks
CUOPT_ENABLE_GRPC). Ensure you update the branches that reference gRPC_FOUND and
CUOPT_ENABLE_GRPC accordingly so gRPC truly remains optional.
🧹 Nitpick comments (1)
cpp/CMakeLists.txt (1)
267-279: Avoid hard-coding CUDSS SONAMEs in fallbacks.

Using libcudss.so.0 / libcudss_mtlayer_gomp.so.0 risks breakage if the SONAME bumps. Consider find_library() or a target-location fallback instead.

♻️ Possible fallback using find_library

 if(NOT CUDSS_LIB_FILE AND DEFINED cudss_LIBRARY_DIR)
-  set(CUDSS_LIB_FILE "${cudss_LIBRARY_DIR}/libcudss.so.0")
+  find_library(CUDSS_LIB_FILE NAMES cudss
+               PATHS "${cudss_LIBRARY_DIR}" NO_DEFAULT_PATH)
 endif()
 if(NOT CUDSS_MT_LIB_FILE AND DEFINED cudss_LIBRARY_DIR)
-  set(CUDSS_MT_LIB_FILE "${cudss_LIBRARY_DIR}/libcudss_mtlayer_gomp.so.0")
+  find_library(CUDSS_MT_LIB_FILE NAMES cudss_mtlayer_gomp
+               PATHS "${cudss_LIBRARY_DIR}" NO_DEFAULT_PATH)
 endif()
# Protocol Buffers for remote solve serialization
#
# Use plain find_package so system installs (that don't ship ProtobufConfig.cmake)
# can fall back to CMake's built-in FindProtobuf module.
find_package(Protobuf REQUIRED)

# Normalize target names across Protobuf providers (config vs module).
if(TARGET Protobuf::libprotobuf AND NOT TARGET protobuf::libprotobuf)
  add_library(protobuf::libprotobuf ALIAS Protobuf::libprotobuf)
endif()
if(TARGET Protobuf::protoc AND NOT TARGET protobuf::protoc)
  add_executable(protobuf::protoc ALIAS Protobuf::protoc)
endif()
include_directories(${Protobuf_INCLUDE_DIRS})
include_directories(${CMAKE_CURRENT_BINARY_DIR})

# gRPC for remote solve service (optional, for gRPC-based implementation)
find_package(gRPC REQUIRED)
if(gRPC_FOUND)
  message(STATUS "Found gRPC: ${gRPC_VERSION}")
  set(CUOPT_ENABLE_GRPC ON)
  add_compile_definitions(CUOPT_ENABLE_GRPC)
elseif(CUOPT_ENABLE_GRPC)
  message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
else()
  message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
  set(CUOPT_ENABLE_GRPC OFF)
endif()
gRPC is effectively mandatory despite optional logic.
find_package(gRPC REQUIRED) terminates CMake configuration if gRPC is not found, preventing any fallback logic from running. Additionally, no option() declaration exists before this logic, so users cannot disable gRPC even if they want to. The code unconditionally forces CUOPT_ENABLE_GRPC ON whenever gRPC is detected, making "optional gRPC" impossible.
🐛 Suggested fix to respect the option and keep gRPC truly optional
-# gRPC for remote solve service (optional, for gRPC-based implementation)
-find_package(gRPC REQUIRED)
-if(gRPC_FOUND)
- message(STATUS "Found gRPC: ${gRPC_VERSION}")
- set(CUOPT_ENABLE_GRPC ON)
- add_compile_definitions(CUOPT_ENABLE_GRPC)
-elseif(CUOPT_ENABLE_GRPC)
- message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
-else()
- message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
- set(CUOPT_ENABLE_GRPC OFF)
-endif()
+# gRPC for remote solve service (optional, for gRPC-based implementation)
+option(CUOPT_ENABLE_GRPC "Enable gRPC-based remote solve" OFF)
+if(CUOPT_ENABLE_GRPC)
+ find_package(gRPC REQUIRED)
+else()
+ find_package(gRPC QUIET)
+endif()
+if(gRPC_FOUND)
+ message(STATUS "Found gRPC: ${gRPC_VERSION}")
+ add_compile_definitions(CUOPT_ENABLE_GRPC)
+elseif(CUOPT_ENABLE_GRPC)
+ message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
+else()
+ message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
+endif()🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 280 - 307, The CMake logic makes gRPC
mandatory because it uses find_package(gRPC REQUIRED) and never declares an
option; change to declare an option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC
support" OFF) ) before the find_package call, remove REQUIRED from
find_package(gRPC) so it becomes optional, and only set CUOPT_ENABLE_GRPC ON /
add_compile_definitions(CUOPT_ENABLE_GRPC) when gRPC is found and the option is
enabled (otherwise leave the option OFF and handle the error branch that checks
CUOPT_ENABLE_GRPC). Ensure you update the branches that reference gRPC_FOUND and
CUOPT_ENABLE_GRPC accordingly so gRPC truly remains optional.
cpp/scripts/run_dev_server.sh
Outdated
@@ -0,0 +1,24 @@
#!/bin/bash
this script will be removed ....
cpp/cuopt_remote_server.cpp
Outdated
@@ -0,0 +1,2298 @@
/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights
this will be removed, legacy
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@ci/build_wheel_libcuopt.sh`:
- Around line 20-22: The called script ci/utils/install_protobuf_grpc.sh
currently lacks the execute bit; update its file mode so it is executable (e.g.,
add the execute permission via chmod +x on ci/utils/install_protobuf_grpc.sh)
and commit that mode change so the script shows as executable in the repo; keep
the existing call in ci/build_wheel_libcuopt.sh (bash
ci/utils/install_protobuf_grpc.sh) unchanged.
In `@python/libcuopt/CMakeLists.txt`:
- Around line 42-64: The FetchContent_Declare for grpc pins an insecure v1.51.1;
update the GIT_TAG in FetchContent_Declare(grpc ...) to at least v1.65.4
(preferably the current stable such as v1.76.0) to remediate the listed CVEs,
and re-run dependency tests; also avoid globally forcing BUILD_SHARED_LIBS via
the CACHE with FORCE (which affects all subsequent add_library/add_subdirectory
calls like add_subdirectory(../../cpp cuopt-cpp)); instead set BUILD_SHARED_LIBS
for grpc only (e.g., set it non-cached or scoped locally before
FetchContent_MakeAvailable and restore/unset it afterward) or explicitly control
shared/static for the downstream add_subdirectory to prevent unintended
propagation.
🧹 Nitpick comments (7)

ci/build_wheel_libcuopt.sh (1)

60-73: Consider whether gRPC/Protobuf libraries should be excluded.

If gRPC and Protobuf are linked dynamically, you may want to add them to EXCLUDE_ARGS (e.g., --exclude "libgrpc*.so*" and --exclude "libprotobuf*.so*") to reduce wheel size, assuming they're expected as system dependencies. If they're intentionally bundled for portability, this can be ignored.

python/libcuopt/CMakeLists.txt (1)

47-55: Consider scoping BUILD_SHARED_LIBS to avoid polluting global CMake state.

Setting BUILD_SHARED_LIBS ON with FORCE affects all subsequent targets in the CMake configuration, not just gRPC. If other dependencies fetched later expect static linking, this could cause issues.

♻️ Suggested approach using a scoped block

 set(gRPC_BUILD_TESTS OFF CACHE BOOL "" FORCE)
 set(gRPC_INSTALL OFF CACHE BOOL "" FORCE)
-set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
+# Save and restore BUILD_SHARED_LIBS to avoid affecting other targets
+set(_CUOPT_SAVED_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS})
+set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
 set(gRPC_SSL_PROVIDER "package" CACHE STRING "" FORCE)
 ...
 FetchContent_MakeAvailable(grpc)
+set(BUILD_SHARED_LIBS ${_CUOPT_SAVED_BUILD_SHARED_LIBS} CACHE BOOL "" FORCE)

cpp/CMakeLists.txt (5)

229-264: Defensive ZLIB resolution is reasonable but complex; consider documenting the rationale.

The workaround to avoid broken system ZLIB configs is sensible, but the CMAKE_IGNORE_PATH manipulation (line 253) is aggressive and could cause issues if downstream projects legitimately need those paths for other packages. A brief comment explaining the specific failure scenario would help future maintainers.

293-294: Prefer target_include_directories over global include_directories.

Using global include_directories() pollutes the include path for all targets in this directory and subdirectories. Since protobuf headers are only needed by specific targets, consider using target_include_directories() on those targets instead.

♻️ Suggested change

Remove these global includes and add them to specific targets that need them:

-include_directories(${Protobuf_INCLUDE_DIRS})
-include_directories(${CMAKE_CURRENT_BINARY_DIR})

Then add to the cuopt target (around line 429):

target_include_directories(cuopt PRIVATE
  ${Protobuf_INCLUDE_DIRS}
  ${CMAKE_CURRENT_BINARY_DIR}
)

378-390: The else branch is currently dead code.

Since CUOPT_ENABLE_GRPC is unconditionally set to ON at line 304, the else branch (lines 385-390) will never execute. If gRPC is intended to always be required, consider removing the conditional to simplify the code. If optional gRPC support is planned for the future, add a comment explaining the intent.

731-741: protobuf::libprotobuf should likely be PRIVATE, not PUBLIC.

For cuopt_remote_server, protobuf is an implementation detail and shouldn't be propagated to consumers linking against this executable (which doesn't make sense anyway for an executable). Move it to PRIVATE for consistency with other targets.

♻️ Suggested fix

 target_link_libraries(cuopt_remote_server
   PUBLIC cuopt
          OpenMP::OpenMP_CXX
-         protobuf::libprotobuf
   PRIVATE
+         protobuf::libprotobuf
 )

757-799: Consider tracking the disabled test sources with an issue.

The if(FALSE) block preserves build rules for test executables whose sources don't exist yet. While the TODO comment explains this, tracking it in an issue would ensure it doesn't get forgotten.

Would you like me to help create an issue to track restoring remote_solve_test.cpp and local_solve_test.cpp?
# Install Protobuf + gRPC (protoc + grpc_cpp_plugin)
bash ci/utils/install_protobuf_grpc.sh
Script exists but lacks execute permissions.
The script ci/utils/install_protobuf_grpc.sh is present with correct error handling (set -euo pipefail) and structure. However, it has read-only permissions (-rw-r--r--) instead of being executable. Since the call uses bash explicitly, it will still execute, but add the execute bit for consistency with standard shell script practices.
🤖 Prompt for AI Agents
In `@ci/build_wheel_libcuopt.sh` around lines 20 - 22, The called script
ci/utils/install_protobuf_grpc.sh currently lacks the execute bit; update its
file mode so it is executable (e.g., add the execute permission via chmod +x on
ci/utils/install_protobuf_grpc.sh) and commit that mode change so the script
shows as executable in the repo; keep the existing call in
ci/build_wheel_libcuopt.sh (bash ci/utils/install_protobuf_grpc.sh) unchanged.
Introduce CPU/remote-safe solution handling, gRPC service definitions, and remote solve client/server plumbing to support remote execution and log streaming while retaining local solve behavior.
Include gRPC packages in dependency manifests and regenerate environment files from dependencies.yaml.
Ensure incumbent callbacks are retained for gRPC workers, switch incumbent job IDs to bytes in the proto, and harden remote solve CPU-only paths and build configuration defaults.
Drop cuopt_remote_server sources/scripts and the Python cancel helpers, and update docs/CMake to reflect gRPC-only server support. Signed-off-by: Trevor McKay <tmgithub1@gmail.com>
Align remote execution with gRPC-only path, drop legacy async APIs, refresh architecture docs, and fix protobuf target aliasing in Python builds.
Force-pushed 1cc9796 to 754c449

/ok to test 754c449
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/linear_programming/solver_solution.cu (1)
320-341: Avoid size_t underflow on empty solution vectors.

Line 320-341: the loops use size() - 1. If any solution vector is empty, this underflows and will crash. This is more likely now with remote error paths that can yield empty host vectors. Consider guarding for empties before iterating.

🛠️ Proposed fix (guard empty vectors)

- if (!var_names_.empty() && generate_variable_values) {
+ if (generate_variable_values && !var_names_.empty() && !row_names_.empty() &&
+     !primal_solution.empty() && !dual_solution.empty() && !reduced_cost.empty()) {
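A standalone illustration of the underflow (not cuOpt code): because size() returns an unsigned type, size() - 1 on an empty vector wraps to SIZE_MAX, so the emptiness guard has to run before the loop bound is computed:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
  std::vector<double> v;  // empty: v.size() == 0

  // 0 - 1 in unsigned arithmetic wraps around instead of going negative.
  std::size_t wrapped = v.size() - 1;
  std::cout << wrapped << '\n';  // 18446744073709551615 on 64-bit platforms

  // Guard the loop the way the proposed fix does.
  if (!v.empty()) {
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
      // safe: writing the bound as i + 1 < size() also avoids the subtraction
    }
  }
  return 0;
}
```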
🤖 Fix all issues with AI agents
In `@conda/recipes/libcuopt/recipe.yaml`:
- Line 32: The recipe enables BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC which
builds the cuopt_grpc_server binary but the package lacks an explicit grpc-cpp
runtime dependency and no test verifies the server is packaged; update the conda
recipe to declare grpc-cpp (grpc-cpp or grpc_cpp depending on channel naming) in
the runtime requirements section or confirm conda auto-detection will pull
grpc-cpp at runtime, and add a package_contents test that asserts the installed
cuopt_grpc_server binary exists (and is executable) to validate packaging; look
for references to BUILD_REMOTE_SERVER, CUOPT_ENABLE_GRPC, cuopt_grpc_server, and
package_contents to locate the exact spots to modify.
In `@cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp`:
- Around line 281-306: Add a runtime assertion in the implementations of
get_primal_solution_host(), get_dual_solution_host(), and
get_reduced_cost_host() to validate the documented precondition by calling
cuopt_assert(!is_device_memory_) before dereferencing the host vectors; locate
the definitions of these three methods and insert the assertion at the top of
each function to prevent returning host references when is_device_memory_ is
true (see the accessor-guard sketch after this list).
In `@cpp/include/cuopt/linear_programming/solve.hpp`:
- Around line 112-127: Docstring for data_model_view_to_optimization_problem
incorrectly states the view points to GPU memory only; update the comment to say
data_model_view_t<i_t, f_t> may reference either CPU or GPU memory and that the
function copies data from the view (regardless of host or device residence) into
an owning optimization_problem_t<i_t, f_t>, preserving mention of the
raft::handle_t CUDA stream usage when device transfers are required; reference
the function name data_model_view_to_optimization_problem and the type
data_model_view_t in the updated description.
In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp`:
- Around line 39-53: get_remote_solve_config currently uses std::stoi which
permits partial parsing and out-of-range/negative values; replace that with a
strict parse that verifies the entire CUOPT_REMOTE_PORT string is numeric and
within 1–65535 before constructing remote_solve_config_t. Specifically, in
get_remote_solve_config() parse port using a function that exposes the end
pointer (e.g., std::strtol or std::stoll with checking), ensure no leftover
characters remain after the number, detect conversion errors/overflows, and
check the resulting integer is between 1 and 65535; if any check fails, return
std::nullopt (see the strtol sketch after this list).
In `@cpp/src/linear_programming/solve.cu`:
- Around line 1403-1415: In copy_view_to_cpu, check CUDA return codes for the
GPU→CPU transfer and synchronization: capture and validate the cudaMemcpyAsync
return value (for the call that copies var_types_span into
cpu_data.variable_types) and capture and validate the cudaStreamSynchronize
return value for stream; on error, log/propagate a descriptive error including
cudaGetErrorName()/cudaGetErrorString or return a failure status instead of
proceeding, so failures from cudaMemcpyAsync and cudaStreamSynchronize are not
ignored.
In `@cpp/src/linear_programming/solver_solution.cu`:
- Around line 530-534: get_solved_by_pdlp() and reporting are reading different
sources (solved_by_pdlp_ vs termination_stats_.solved_by_pdlp) which can
diverge; unify them by making the termination stats the single source of truth:
in set_termination_stats(const additional_termination_information_t& stats)
assign termination_stats_ = stats and also set solved_by_pdlp_ =
stats.solved_by_pdlp (or alternatively change get_solved_by_pdlp() to return
termination_stats_.solved_by_pdlp directly) so both the accessor
get_solved_by_pdlp(), reporting, and the stored flag remain in sync; update the
implementation of get_solved_by_pdlp() and set_termination_stats() accordingly
(references: set_termination_stats, termination_stats_, get_solved_by_pdlp,
solved_by_pdlp_).
- Around line 411-476: The device/host getters (get_primal_solution,
get_dual_solution, get_reduced_cost and their _host counterparts) currently
dereference unique_ptrs (primal_solution_, dual_solution_, reduced_cost_,
primal_solution_host_, dual_solution_host_, reduced_cost_host_) without
verifying the memory mode; add a precondition check at the top of each getter
that verifies is_device_memory_ is true for device getters and false for host
getters (use an existing assert/EXPECTS macro or throw a clear runtime_error) so
callers fail fast with a descriptive message instead of causing a null deref.
- Around line 175-217: When copying from other into host mode
(other.is_device_memory_ == false) ensure any existing GPU allocations are
released: before assigning host vectors in the else branch, if is_device_memory_
is true call handle_ptr->sync_stream() and reset the device unique_ptrs
(primal_solution_.reset(), dual_solution_.reset(), reduced_cost_.reset()) and
set is_device_memory_ = false so GPU buffers are freed; apply the same change in
the other copy-path around the code referenced at lines 508-527 (the same
primal_solution_/dual_solution_/reduced_cost_ reset logic).
In `@cpp/src/linear_programming/utilities/cuopt_remote_service.proto`:
- Around line 216-221: The Incumbent protobuf message defines job_id as bytes
but other messages use string; update the Incumbent message's job_id field from
bytes to string to maintain type consistency (modify the Incumbent message
declaration and its job_id field definition to use string job_id = 4), and then
regenerate any affected stubs to pick up the change.
- Around line 210-214: IncumbentRequest.job_id is declared as bytes while all
other messages (e.g., StatusRequest.job_id, GetResultRequest.job_id,
ResultChunk.job_id, DeleteRequest.job_id, CancelRequest.job_id,
WaitRequest.job_id, StreamLogsRequest.job_id) use string; change the field type
in IncumbentRequest from bytes to string to make the schema consistent, update
any code that relied on bytes to use string values, and regenerate protobuf/GRPC
stubs so consumers and servers use the unified string job_id type across
functions/messages.
In `@cpp/src/linear_programming/utilities/cuopt_remote.proto`:
- Around line 21-52: The proto defines index arrays as int32 which can truncate
64-bit indices; update the OptimizationProblem fields A_indices and A_offsets
and the QuadraticProblem fields Q_indices and Q_offsets from int32 to int64 to
match the IndexType enum (INT64) and the RequestHeader handling; after changing
the field types, regenerate protobuf stubs and ensure serializers/deserializers
map i_t to int64 (and still to int32 when IndexType==INT32) so
IndexType/RequestHeader and the fields (A_indices, A_offsets, Q_indices,
Q_offsets) remain consistent.
In `@cpp/src/linear_programming/utilities/cython_solve.cu`:
- Around line 293-304: The code currently forces lp_ret.solved_by_pdlp_ = false
which ignores the actual termination info; change this to propagate the real
flag from the termination info (e.g., set lp_ret.solved_by_pdlp_ =
term_info.solved_by_pdlp or the equivalent field on the termination/info object)
so the response reflects whether the solve was performed by PDLP; update any
mapping if the flag resides on a different object (e.g.,
solution.get_termination_info().solved_by_pdlp) and ensure response.lp_ret is
still moved as before.
In `@cpp/src/mip/solver_solution.cu`:
- Around line 147-169: The getters mip_solution_t<i_t,f_t>::get_solution and
mip_solution_t<i_t,f_t>::get_solution_host can dereference nullptr (solution_ or
solution_host_) depending on how the object was constructed; add defensive
precondition checks at the start of these methods (e.g., assert or throw
std::runtime_error) that verify the corresponding pointer is non-null and/or
that the object's memory mode (use is_device_memory() or equivalent) matches the
requested accessor, and include a clear error message referencing solution_ or
solution_host_; this prevents undefined behavior and guides callers to check the
memory mode before calling the wrong getter.
In `@cpp/tests/linear_programming/CMakeLists.txt`:
- Around line 55-68: Remove the duplicated CONDA_PREFIX block by keeping a
single conditional that checks ENV{CONDA_PREFIX}, sets _CUOPT_CONDA_PREFIX,
tests for the version.h path, calls target_include_directories(C_API_TEST BEFORE
PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") if it exists, and then unsets
_CUOPT_CONDA_PREFIX; ensure only one copy of this logic remains (refer to the
repeated blocks that set _CUOPT_CONDA_PREFIX and call target_include_directories
for target C_API_TEST).
- Around line 31-44: Remove the duplicated CONDA_PREFIX include block: there are
two identical if(DEFINED ENV{CONDA_PREFIX}) ...
target_include_directories(c_api_tester BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") ... unset(_CUOPT_CONDA_PREFIX)
sections; keep only one. Locate the repeated block that checks for
"${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h" and ensure a
single conditional sets target_include_directories for the c_api_tester target
and unsets _CUOPT_CONDA_PREFIX once.
In `@docs/developer/SERIALIZATION_PLUGIN_GUIDE.md`:
- Around line 13-25: The markdown fenced code blocks for the ASCII diagram and
the example output are missing language tags and trigger markdownlint MD040;
update the two blocks that contain the ASCII diagram (the box with
serialize_lp_request() and serialize_lp_solution()) and the example output block
that starts with "[remote_solve] Loading custom serializer..." to use ```text
fence markers instead of plain ``` so both the diagram and the output snippet
are annotated as text; locate the blocks by searching for the ASCII diagram
containing serialize_lp_request/serialize_lp_solution and the output snippet
beginning "[remote_solve] Loading custom serializer" and add the language tag to
each fenced block.
In `@python/cuopt/cuopt/utilities/__init__.py`:
- Around line 11-25: Add a module-level __all__ list that includes "type_cast",
"series_from_buf", and "check_solution" so star-imports continue to expose these
names, and optionally implement __dir__ to return sorted(__all__ +
list(globals().keys())) to surface them in autocomplete; keep the existing
lazy-loader __getattr__ unchanged so the actual imports still occur on first
access (referencing __getattr__, type_cast, series_from_buf, check_solution).
🧹 Nitpick comments (15)
python/cuopt/cuopt/linear_programming/internals/internals.pyx (1)
52-57: Non-owning view: document or consider memory lifetime.
`np.ctypeslib.as_array` returns a view that does not own the underlying memory. If the original buffer at `data` is freed while the returned numpy array is still in use, this results in undefined behavior (use-after-free).
This is likely acceptable given the low-level nature of this callback interface and the PR's purpose (handling CPU memory for remote solves), but consider:
- Adding a brief docstring clarifying that the caller must ensure the underlying memory remains valid for the lifetime of the returned array.
- Alternatively, if a safe copy is ever needed, `np.ctypeslib.as_array(buf).copy()` would provide ownership.
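As a minimal illustration of the view-vs-copy distinction (the ctypes buffer here is a hypothetical stand-in for the pointer the callback receives as `data`):

```python
import ctypes

import numpy as np

# Hypothetical stand-in for the CPU buffer handed to the callback.
buf = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)

view = np.ctypeslib.as_array(buf)          # non-owning view; valid only while buf lives
owned = np.ctypeslib.as_array(buf).copy()  # independent copy; safe to keep around
```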
conda/environments/all_cuda-131_arch-x86_64.yaml (1)
28-28: Consider adding version constraints for `grpc-cpp` and `libprotobuf` in `dependencies.yaml`.
These dependencies are unversioned in `dependencies.yaml`, which is consistent with other build dependencies in the `build_cpp` section but may impact reproducibility. gRPC and Protobuf have known ABI compatibility concerns across major versions, and explicit version constraints would help prevent unexpected breakages from upstream changes.
Since this file is auto-generated from `dependencies.yaml`, ensure version constraints are added there (e.g., `grpc-cpp>=1.x`, `libprotobuf>=x.y`).
docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md (1)
133-140: Consider adding guidance on when to avoid calling `to_host()`.
The performance section mentions overhead is negligible for typical problems. Consider adding a note that for very large problems (millions of variables), users should be mindful of the memory bandwidth cost and avoid unnecessary `to_host()` calls in tight loops.
GRPC_ARCHITECTURE.md (1)
1-82: Good architecture documentation for the gRPC remote solve system.
The document clearly explains the service definition, components, data flow, and build integration. One minor issue flagged by static analysis:
📝 Fix: Add language specifier to empty code fence at end of file
Line 82 has an unclosed/empty code fence. The trailing triple backticks appear to be unintended (the document ends with the security notes section, not a code block). Consider removing them:

````diff
 - gRPC provides HTTP/2 framing, flow control, and standard status codes.
-```
````
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp (1)
173-180: Consider consolidating duplicate accessor methods.
`get_nodes()` and `get_simplex_iterations()` (lines 175, 180) appear to duplicate the existing `get_num_nodes()` (line 100) and `get_num_simplex_iterations()` (line 101) methods. Consider using only one naming convention to avoid API confusion.
♻️ Suggested consolidation
Either remove the new methods and use existing ones in the serializer, or deprecate the old methods:

```diff
-  /**
-   * @brief Get number of nodes
-   */
-  i_t get_nodes() const;
-
-  /**
-   * @brief Get number of simplex iterations
-   */
-  i_t get_simplex_iterations() const;
```

The protobuf serializer at `cpp/src/linear_programming/utilities/protobuf_serializer.cu` could use the existing `get_num_nodes()` and `get_num_simplex_iterations()` methods instead.
cpp/src/mip/solver_solution.cu (1)
398-408: Duplicate methods: `get_nodes()` and `get_simplex_iterations()`.
These methods duplicate `get_num_nodes()` (line 255) and `get_num_simplex_iterations()` (line 261) with identical implementations. Consider removing the duplicates to avoid API confusion.
build.sh (2)
419-425: Silent cache invalidation may surprise users.
The cache invalidation when system ZLIB is detected in `CMakeCache.txt` could be confusing to users who expect their build cache to persist. Consider logging a message when this happens.
💡 Suggested improvement

```diff
 if [ -f "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt" ]; then
   if grep -Eq "/(usr/)?lib64/cmake/ZLIB" "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"; then
+    echo "Note: Clearing CMake cache due to system ZLIB references"
     rm -f "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"
     rm -rf "${LIBCUOPT_BUILD_DIR}/CMakeFiles"
   fi
 fi
```
521-534: Consider extracting duplicated SKBUILD argument handling into a function.
The `SKBUILD_EXTRA_ARGS` construction logic (escaping semicolons, joining, building `SKBUILD_CMAKE_ARGS`) is duplicated between the `cuopt` build (lines 521-534) and the `cuopt_mps_parser` build (lines 542-553). Extracting this into a helper function would improve maintainability; a sketch follows.
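One possible shape for that helper; the function name is hypothetical, and the exact escaping and joining rules should be copied verbatim from the two existing blocks in build.sh:

```bash
# Hypothetical helper consolidating the duplicated logic described above.
build_skbuild_cmake_args() {
  local joined="" arg
  for arg in "$@"; do
    arg="${arg//;/\\;}"                   # escape semicolons within each argument
    joined="${joined:+${joined};}${arg}"  # join arguments with ';'
  done
  SKBUILD_CMAKE_ARGS="${joined}"
}

# Called from both the cuopt and cuopt_mps_parser builds, e.g.:
# build_skbuild_cmake_args "${EXTRA_CMAKE_ARGS[@]}"   # array name is an assumption
```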
python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx (2)
82-91: Consider caching lazy imports.
`_get_cuda_imports()` performs module imports on each call. While Python caches module imports, the function call overhead is repeated. For frequently called code paths, consider caching at module level after first import.
💡 Optional caching pattern

```python
_cuda_imports_cache = None


def _get_cuda_imports():
    """Lazily import CUDA-dependent modules."""
    global _cuda_imports_cache
    if _cuda_imports_cache is None:
        import cupy as cp
        import cudf
        from cuopt.utilities import series_from_buf
        _cuda_imports_cache = (cp, cudf, series_from_buf)
    return _cuda_imports_cache
```
488-498: Document warm start limitation for remote solves in user-facing API.
When using remote solve, warm start data is unavailable (empty arrays). Consider adding documentation or a warning when users configure warm start but are using remote solve, to prevent confusion.
cpp/src/linear_programming/utilities/remote_solve.cu (4)
56-84: CUDA error handling uses raw checks instead of CUDA_CHECK macro.
Per coding guidelines, CUDA kernel launches and memory operations should use `CUDA_CHECK` or equivalent verification. The current implementation uses raw `cudaError_t` checks, which is functional but inconsistent with the codebase's error checking patterns.
198-203: Log thread uses `volatile bool` for cross-thread signaling.
The `volatile bool stop_logs` is used for signaling the log thread to stop. While this works in practice, `std::atomic<bool>` would be the modern C++ approach for cross-thread communication with proper memory ordering guarantees.
💡 Modern alternative

```cpp
std::atomic<bool> stop_logs{false};
// ...
stop_logs.store(true);
```
184-188: Consider consistent logging instead of mixing `std::cerr` and `CUOPT_LOG_*`.
Error output uses `std::cerr` (lines 184-185) while other logging uses `CUOPT_LOG_INFO`. Consider using `CUOPT_LOG_ERROR` for consistency with the rest of the codebase's logging infrastructure.
💡 Suggested change

```diff
 if (!grpc_remote::submit_or_upload(...)) {
-  std::cerr << "[remote_solve] UploadAndSubmit failed: " << err << "\n";
-  std::cerr.flush();
+  CUOPT_LOG_ERROR("[remote_solve] UploadAndSubmit failed: %s", err.c_str());
   return optimization_problem_solution_t<i_t, f_t>(...);
 }
```
426-451: Final incumbent drain logic is reasonable but could be documented.
The post-completion drain loop (5 iterations, 50 ms sleep between retries if empty) handles race conditions where incumbents may arrive slightly after completion status. Consider adding a brief comment explaining this timing window, for example the sketch below.
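A comment along these lines (values taken from the loop described above) would capture the intent:

```cpp
// The COMPLETED status and the incumbent stream arrive on separate channels,
// so a final incumbent can land shortly after completion is observed.
// Poll up to 5 more times, sleeping 50 ms between empty reads, before
// concluding that no further incumbents will arrive.
```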
cpp/cuopt_cli.cpp (1)
113-213: Consolidate data_model_view construction helpers.
`create_view_from_mps_data_model` duplicates logic already present in `cpp/src/linear_programming/solve.cu` and `cpp/src/linear_programming/cuopt_c.cpp`. Consider extracting a shared utility to avoid drift and keep CPU view construction consistent. Based on learnings, refactor duplicated solver utilities.
```diff
 set +x
-./build.sh -n -v ${BUILD_EXTRA_FLAGS} libmps_parser libcuopt deb --allgpuarch --cmake-args=\"-DCMAKE_INSTALL_LIBDIR=lib\"
+./build.sh -n -v ${BUILD_EXTRA_FLAGS} libmps_parser libcuopt deb --allgpuarch --cmake-args=\"-DCMAKE_INSTALL_LIBDIR=lib -DBUILD_REMOTE_SERVER=ON -DCUOPT_ENABLE_GRPC=ON\"
```
🧩 Analysis chain
🏁 Script executed:

```shell
# Search for cuopt_grpc_server references in CMakeLists.txt files
find . -name "CMakeLists.txt" | head -20
```

Repository: NVIDIA/cuopt
Length of output: 769

🏁 Script executed:

```shell
# Look for cuopt_grpc_server in the repository
rg -n "cuopt_grpc_server" --type-list | head
rg -n "cuopt_grpc_server" .
```

Repository: NVIDIA/cuopt
Length of output: 3251

🏁 Script executed:

```shell
# Examine the conda recipe file for libcuopt
cat -n conda/recipes/libcuopt/recipe.yaml | head -200
```

Repository: NVIDIA/cuopt
Length of output: 7865

🏁 Script executed:

```shell
tail -n +189 conda/recipes/libcuopt/recipe.yaml | head -100
```

Repository: NVIDIA/cuopt
Length of output: 1278
Verify gRPC runtime dependencies and packaging of cuopt_grpc_server binary.
Line 32 enables BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC, and build.sh/CMakeLists.txt confirm that cuopt_grpc_server is built and installed to the package. However, the libcuopt output package does not explicitly declare grpc-cpp as a runtime dependency (only libprotobuf), and the package_contents tests do not verify the server binary is actually present. Ensure the grpc-cpp runtime dependency is properly declared or rely on conda's auto-detection; also add a test to verify cuopt_grpc_server is packaged.
Also applies to: 63-74, 153-176
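One possible shape for that packaging check, assuming the rattler-build-style `package_contents` tests already used in the recipe (the `bin/` install path is an assumption to verify against the actual install rules):

```yaml
tests:
  - package_contents:
      files:
        # assumed install location of the server binary
        - bin/cuopt_grpc_server
```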
🤖 Prompt for AI Agents
In `@conda/recipes/libcuopt/recipe.yaml` at line 32, The recipe enables
BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC which builds the cuopt_grpc_server
binary but the package lacks an explicit grpc-cpp runtime dependency and no test
verifies the server is packaged; update the conda recipe to declare grpc-cpp
(grpc-cpp or grpc_cpp depending on channel naming) in the runtime requirements
section or confirm conda auto-detection will pull grpc-cpp at runtime, and add a
package_contents test that asserts the installed cuopt_grpc_server binary exists
(and is executable) to validate packaging; look for references to
BUILD_REMOTE_SERVER, CUOPT_ENABLE_GRPC, cuopt_grpc_server, and package_contents
to locate the exact spots to modify.
```cpp
  /**
   * @brief Returns the primal solution in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the primal solution.
   */
  std::vector<f_t>& get_primal_solution_host();
  const std::vector<f_t>& get_primal_solution_host() const;

  /**
   * @brief Returns the dual solution in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the dual solution.
   */
  std::vector<f_t>& get_dual_solution_host();
  const std::vector<f_t>& get_dual_solution_host() const;

  /**
   * @brief Returns the reduced cost in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the reduced cost.
   */
  std::vector<f_t>& get_reduced_cost_host();
  const std::vector<f_t>& get_reduced_cost_host() const;
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Find the implementation file for solver_solution
find . -type f -name "solver_solution.cpp" | head -10
```

Repository: NVIDIA/cuopt
Length of output: 38

🏁 Script executed:

```shell
#!/bin/bash
# Also check if there's an inline implementation in the header
cat -n cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp | sed -n '281,350p'
```

Repository: NVIDIA/cuopt
Length of output: 2902

🏁 Script executed:

```shell
#!/bin/bash
# Search for the implementations of these accessor methods
rg -n "get_primal_solution_host|get_dual_solution_host|get_reduced_cost_host" --type cpp cpp/src/ -A 8 | head -80
```

Repository: NVIDIA/cuopt
Length of output: 1910

🏁 Script executed:

```shell
#!/bin/bash
# Search for solver_solution class in all cpp files
find cpp/src -type f -name "*.cpp" | xargs rg -l "get_primal_solution_host" | head -5
```

Repository: NVIDIA/cuopt
Length of output: 94

🏁 Script executed:

```shell
#!/bin/bash
# Look for pdlp directory structure
ls -la cpp/src/linear_programming/ | grep pdlp
```

Repository: NVIDIA/cuopt
Length of output: 386

🏁 Script executed:

```shell
#!/bin/bash
# Search for the class definition and implementation more broadly
rg -n "class.*solver_solution" cpp/include/ -A 10
```

Repository: NVIDIA/cuopt
Length of output: 38

🏁 Script executed:

```shell
#!/bin/bash
# Look for template implementations or inline implementations
rg -n "get_primal_solution_host\(\)" cpp/src/ -B 2 -A 5 | head -100
```

Repository: NVIDIA/cuopt
Length of output: 3586

🏁 Script executed:

```shell
#!/bin/bash
# Get complete implementations of all three host accessors
rg -n "get_primal_solution_host|get_dual_solution_host|get_reduced_cost_host" cpp/src/linear_programming/solver_solution.cu -A 4 | grep -E "^[0-9]+-" | head -50
```

Repository: NVIDIA/cuopt
Length of output: 369

🏁 Script executed:

```shell
#!/bin/bash
# Check if there's an is_device_memory_ member and how it's used
rg -n "is_device_memory_" cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp -B 2 -A 2
```

Repository: NVIDIA/cuopt
Length of output: 218

🏁 Script executed:

```shell
#!/bin/bash
# Check for any assertions in the solver_solution.cu file
rg -n "assert|ASSERT|cuopt_assert" cpp/src/linear_programming/solver_solution.cu | head -20
```

Repository: NVIDIA/cuopt
Length of output: 38

🏁 Script executed:

```shell
#!/bin/bash
# Look at the broader context of the implementation file to understand assertion patterns
head -100 cpp/src/linear_programming/solver_solution.cu
```

Repository: NVIDIA/cuopt
Length of output: 4281
Add runtime assertions to host accessor implementations to validate preconditions.
The implementations of get_primal_solution_host(), get_dual_solution_host(), and get_reduced_cost_host() lack any checks for the documented precondition "Only valid when is_device_memory() returns false." Currently these methods simply dereference the host vector pointers without validating the is_device_memory_ flag. Add cuopt_assert(!is_device_memory_) to catch misuse during development and prevent undefined behavior when callers use the wrong accessor for the memory location.
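A minimal sketch of such a guard, assuming the `cuopt_assert` macro and the `is_device_memory_` flag located above; the enclosing class and member names here are illustrative, not the actual declarations:

```cpp
// Illustrative only: solution_t and primal_solution_host_ are assumed names.
template <typename i_t, typename f_t>
std::vector<f_t>& solution_t<i_t, f_t>::get_primal_solution_host()
{
  // Documented precondition: only valid when the solution resides on the host.
  cuopt_assert(!is_device_memory_,
               "get_primal_solution_host() called on a device-memory solution");
  return *primal_solution_host_;
}
```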
🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp` around lines
281 - 306, Add a runtime assertion in the implementations of
get_primal_solution_host(), get_dual_solution_host(), and
get_reduced_cost_host() to validate the documented precondition by calling
cuopt_assert(!is_device_memory_) before dereferencing the host vectors; locate
the definitions of these three methods and insert the assertion at the top of
each function to prevent returning host references when is_device_memory_ is
true.
```cpp
/**
 * @brief Convert a data_model_view_t to an optimization_problem_t.
 *
 * This function copies data from the view (which points to GPU memory)
 * into an owning optimization_problem_t.
 *
 * @tparam i_t Data type of indexes
 * @tparam f_t Data type of the variables and their weights in the equations
 *
 * @param[in] handle_ptr A raft::handle_t object with its corresponding CUDA stream.
 * @param[in] view A data_model_view_t<i_t, f_t> object with spans pointing to GPU memory
 * @return optimization_problem_t<i_t, f_t> owning container for the problem
 */
template <typename i_t, typename f_t>
optimization_problem_t<i_t, f_t> data_model_view_to_optimization_problem(
  raft::handle_t const* handle_ptr, const data_model_view_t<i_t, f_t>& view);
```
Clarify doc: view may be CPU or GPU memory.
Line 115 describes GPU-only memory, but the view overloads are used with CPU views too (local CPU→GPU path). Update the doc to reflect both cases.
✏️ Suggested doc tweak
```diff
- * This function copies data from the view (which points to GPU memory)
+ * This function copies data from the view (which may point to GPU or CPU memory)
```

🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/solve.hpp` around lines 112 - 127,
Docstring for data_model_view_to_optimization_problem incorrectly states the
view points to GPU memory only; update the comment to say data_model_view_t<i_t,
f_t> may reference either CPU or GPU memory and that the function copies data
from the view (regardless of host or device residence) into an owning
optimization_problem_t<i_t, f_t>, preserving mention of the raft::handle_t CUDA
stream usage when device transfers are required; reference the function name
data_model_view_to_optimization_problem and the type data_model_view_t in the
updated description.
```cpp
inline std::optional<remote_solve_config_t> get_remote_solve_config()
{
  const char* host = std::getenv("CUOPT_REMOTE_HOST");
  const char* port = std::getenv("CUOPT_REMOTE_PORT");

  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
    try {
      int port_num = std::stoi(port);
      return remote_solve_config_t{std::string(host), port_num};
    } catch (...) {
      // Invalid port number, fall back to local solve
      return std::nullopt;
    }
  }
  return std::nullopt;
```
Validate CUOPT_REMOTE_PORT is numeric and within 1–65535.
std::stoi accepts partial strings and allows negative/zero values. Add a strict parse and bounds check to prevent invalid configs from being treated as enabled.
✏️ Suggested fix
```diff
-  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
+  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
     try {
-      int port_num = std::stoi(port);
-      return remote_solve_config_t{std::string(host), port_num};
+      std::string port_str(port);
+      size_t idx = 0;
+      int port_num = std::stoi(port_str, &idx);
+      if (idx != port_str.size() || port_num <= 0 || port_num > 65535) {
+        return std::nullopt;
+      }
+      return remote_solve_config_t{std::string(host), port_num};
     } catch (...) {
       // Invalid port number, fall back to local solve
       return std::nullopt;
     }
   }
```
In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp` around lines
39 - 53, get_remote_solve_config currently uses std::stoi which permits partial
parsing and out-of-range/negative values; replace that with a strict parse that
verifies the entire CUOPT_REMOTE_PORT string is numeric and within 1–65535
before constructing remote_solve_config_t. Specifically, in
get_remote_solve_config() parse port using a function that exposes the end
pointer (e.g., std::strtol or std::stoll with checking), ensure no leftover
characters remain after the number, detect conversion errors/overflows, and
check the resulting integer is between 1 and 65535; if any check fails, return
std::nullopt.
```cpp
  auto var_types_span = gpu_view.get_variable_types();
  if (var_types_span.size() > 0) {
    cpu_data.variable_types.resize(var_types_span.size());
    cudaMemcpyAsync(cpu_data.variable_types.data(),
                    var_types_span.data(),
                    var_types_span.size() * sizeof(char),
                    cudaMemcpyDeviceToHost,
                    stream);
  }

  // Synchronize to ensure all copies are complete
  cudaStreamSynchronize(stream);
```
Check CUDA errors for GPU→CPU copies in copy_view_to_cpu.
cudaMemcpyAsync and cudaStreamSynchronize are unchecked; failures will silently corrupt the serialized remote payload. Please add CUDA error checks as per guidelines.
🐛 Proposed fix
```diff
   if (var_types_span.size() > 0) {
     cpu_data.variable_types.resize(var_types_span.size());
-    cudaMemcpyAsync(cpu_data.variable_types.data(),
-                    var_types_span.data(),
-                    var_types_span.size() * sizeof(char),
-                    cudaMemcpyDeviceToHost,
-                    stream);
+    RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
+                                  var_types_span.data(),
+                                  var_types_span.size() * sizeof(char),
+                                  cudaMemcpyDeviceToHost,
+                                  stream));
   }

   // Synchronize to ensure all copies are complete
-  cudaStreamSynchronize(stream);
+  RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
```
In `@cpp/src/linear_programming/solve.cu` around lines 1403 - 1415, In
copy_view_to_cpu, check CUDA return codes for the GPU→CPU transfer and
synchronization: capture and validate the cudaMemcpyAsync return value (for the
call that copies var_types_span into cpu_data.variable_types) and capture and
validate the cudaStreamSynchronize return value for stream; on error,
log/propagate a descriptive error including
cudaGetErrorName()/cudaGetErrorString or return a failure status instead of
proceeding, so failures from cudaMemcpyAsync and cudaStreamSynchronize are not
ignored.
```diff
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
-  return solution_;
+  return *solution_;
 }

 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
 {
-  return solution_;
+  return *solution_;
 }

 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
   return *solution_host_;
 }

 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
   return *solution_host_;
 }
```
Potential null pointer dereference in get_solution() and get_solution_host().
When a solution is constructed via CPU-only constructors (for remote solve), solution_ is nullptr. Calling get_solution() will dereference this null pointer, causing undefined behavior. Similarly, get_solution_host() will crash when called on device-memory solutions where solution_host_ is nullptr.
Consider adding precondition checks or assertions to catch misuse early:
🛠️ Proposed fix: Add defensive checks
```diff
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
+  cuopt_assert(solution_ != nullptr,
+               "get_solution() called but solution is in host memory. Use get_solution_host().");
   return *solution_;
 }

 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
 {
+  cuopt_assert(solution_ != nullptr,
+               "get_solution() called but solution is in host memory. Use get_solution_host().");
   return *solution_;
 }

 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
+  cuopt_assert(solution_host_ != nullptr,
+               "get_solution_host() called but solution is in device memory. Use get_solution().");
   return *solution_host_;
 }

 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
+  cuopt_assert(solution_host_ != nullptr,
+               "get_solution_host() called but solution is in device memory. Use get_solution().");
   return *solution_host_;
 }
```

Alternatively, document the preconditions clearly in the header and rely on callers to check `is_device_memory()` first; the sketch below shows that caller-side pattern.
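A rough sketch of that pattern, built from the accessors quoted above (`sol` is a hypothetical `mip_solution_t` instance):

```cpp
// Check the memory mode before choosing an accessor.
if (sol.is_device_memory()) {
  auto& device_values = sol.get_solution();      // device vector, valid in this branch
  // ... device-side post-processing ...
} else {
  auto& host_values = sol.get_solution_host();   // host vector, valid in this branch
  // ... host-side post-processing ...
}
```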
🤖 Prompt for AI Agents
In `@cpp/src/mip/solver_solution.cu` around lines 147 - 169, The getters
mip_solution_t<i_t,f_t>::get_solution and
mip_solution_t<i_t,f_t>::get_solution_host can dereference nullptr (solution_ or
solution_host_) depending on how the object was constructed; add defensive
precondition checks at the start of these methods (e.g., assert or throw
std::runtime_error) that verify the corresponding pointer is non-null and/or
that the object's memory mode (use is_device_memory() or equivalent) matches the
requested accessor, and include a clear error message referencing solution_ or
solution_host_; this prevents undefined behavior and guides callers to check the
memory mode before calling the wrong getter.
```cmake
  if(DEFINED ENV{CONDA_PREFIX})
    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
      target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
    endif()
    unset(_CUOPT_CONDA_PREFIX)
  endif()
  if(DEFINED ENV{CONDA_PREFIX})
    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
      target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
    endif()
    unset(_CUOPT_CONDA_PREFIX)
  endif()
```
Remove duplicated CONDA_PREFIX include directory blocks.
Lines 31-37 and 38-44 are identical. This appears to be a copy-paste error—the same conditional block is repeated twice for the c_api_tester target.
🔧 Suggested fix
```diff
   if(DEFINED ENV{CONDA_PREFIX})
     set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
     if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
       target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
     endif()
     unset(_CUOPT_CONDA_PREFIX)
   endif()
-  if(DEFINED ENV{CONDA_PREFIX})
-    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
-    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
-      target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
-    endif()
-    unset(_CUOPT_CONDA_PREFIX)
-  endif()
```
- endif()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() | |
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() | |
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() |
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/CMakeLists.txt` around lines 31 - 44, Remove the
duplicated CONDA_PREFIX include block: there are two identical if(DEFINED
ENV{CONDA_PREFIX}) ... target_include_directories(c_api_tester BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") ... unset(_CUOPT_CONDA_PREFIX)
sections; keep only one. Locate the repeated block that checks for
"${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h" and ensure a
single conditional sets target_include_directories for the c_api_tester target
and unsets _CUOPT_CONDA_PREFIX once.
```cmake
  if(DEFINED ENV{CONDA_PREFIX})
    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
      target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
    endif()
    unset(_CUOPT_CONDA_PREFIX)
  endif()
  if(DEFINED ENV{CONDA_PREFIX})
    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
      target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
    endif()
    unset(_CUOPT_CONDA_PREFIX)
  endif()
```
Remove duplicated CONDA_PREFIX include directory blocks.
Same issue as above—lines 55-61 and 62-68 are identical duplicates for the C_API_TEST target.
🔧 Suggested fix
```diff
   if(DEFINED ENV{CONDA_PREFIX})
     set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
     if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
       target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
     endif()
     unset(_CUOPT_CONDA_PREFIX)
   endif()
-  if(DEFINED ENV{CONDA_PREFIX})
-    set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
-    if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
-      target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
-    endif()
-    unset(_CUOPT_CONDA_PREFIX)
-  endif()
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() | |
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() | |
| if(DEFINED ENV{CONDA_PREFIX}) | |
| set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}") | |
| if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h") | |
| target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") | |
| endif() | |
| unset(_CUOPT_CONDA_PREFIX) | |
| endif() |
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/CMakeLists.txt` around lines 55 - 68, Remove the
duplicated CONDA_PREFIX block by keeping a single conditional that checks
ENV{CONDA_PREFIX}, sets _CUOPT_CONDA_PREFIX, tests for the version.h path, calls
target_include_directories(C_API_TEST BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") if it exists, and then unsets
_CUOPT_CONDA_PREFIX; ensure only one copy of this logic remains (refer to the
repeated blocks that set _CUOPT_CONDA_PREFIX and call target_include_directories
for target C_API_TEST).
| ``` | ||
┌─────────────┐                              ┌─────────────┐
│   Client    │                              │   Server    │
│             │   serialize_lp_request()     │             │
│  Problem ───┼──────────────────────────────┼──► Problem  │
│             │                              │             │
│             │   serialize_lp_solution()    │             │
│  Solution ◄─┼──────────────────────────────┼─── Solution │
└─────────────┘                              └─────────────┘
       ▲                                            ▲
       │                                            │
       └────────────── Same Serializer ────────────┘
```
Add language tags to fenced blocks
markdownlint MD040 will flag these. Use `text` for the ASCII diagram and the output snippet.
📝 Proposed fix
````diff
-```
+```text
@@
-```
+```text
 [remote_solve] Loading custom serializer from: /path/to/libcuopt_my_serializer.so
 [remote_solve] Using custom serializer: my_format
````
Also applies to: 373-377
🤖 Prompt for AI Agents
In @docs/developer/SERIALIZATION_PLUGIN_GUIDE.md around lines 13 - 25, The
markdown fenced code blocks for the ASCII diagram and the example output are
missing language tags and trigger markdownlint MD040; update the two blocks that
contain the ASCII diagram (the box with serialize_lp_request() and
serialize_lp_solution()) and the example output block that starts with
"[remote_solve] Loading custom serializer..." to use text fence markers instead of plain so both the diagram and the output snippet are annotated as
text; locate the blocks by searching for the ASCII diagram containing
serialize_lp_request/serialize_lp_solution and the output snippet beginning
"[remote_solve] Loading custom serializer" and add the language tag to each
fenced block.
```python
# Lazy imports for CUDA-dependent modules to support CPU-only hosts
# These will be imported when first accessed


def __getattr__(name):
    """Lazy import CUDA-dependent utilities."""
    if name == "type_cast":
        from cuopt.utilities.type_casting import type_cast
        return type_cast
    elif name == "series_from_buf":
        from cuopt.utilities.utils import series_from_buf
        return series_from_buf
    elif name == "check_solution":
        from cuopt.utilities.utils import check_solution
        return check_solution
```
Preserve import * compatibility by defining __all__ (and optionally __dir__).
With lazy loading, from cuopt.utilities import * will no longer export type_cast, series_from_buf, or check_solution because they aren’t in the module dict. That’s a backward-incompatible change for public API users. Please add __all__ (and optionally __dir__) to keep star-import behavior consistent. As per coding guidelines, preserve backward compatibility in Python APIs.
✅ Proposed fix (keeps lazy loading and restores public exports)
```diff
+__all__ = [
+    "InputRuntimeError",
+    "InputValidationError",
+    "OutOfMemoryError",
+    "catch_cuopt_exception",
+    "type_cast",
+    "series_from_buf",
+    "check_solution",
+]
+
 def __getattr__(name):
     """Lazy import CUDA-dependent utilities."""
     if name == "type_cast":
         from cuopt.utilities.type_casting import type_cast
+        globals()["type_cast"] = type_cast
         return type_cast
     elif name == "series_from_buf":
         from cuopt.utilities.utils import series_from_buf
+        globals()["series_from_buf"] = series_from_buf
         return series_from_buf
     elif name == "check_solution":
         from cuopt.utilities.utils import check_solution
+        globals()["check_solution"] = check_solution
         return check_solution
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
+
+
+def __dir__():
+    return sorted(set(globals()) | {"type_cast", "series_from_buf", "check_solution"})
```
In `@python/cuopt/cuopt/utilities/__init__.py` around lines 11 - 25, Add a
module-level __all__ list that includes "type_cast", "series_from_buf", and
"check_solution" so star-imports continue to expose these names, and optionally
implement __dir__ to return sorted(__all__ + list(globals().keys())) to surface
them in autocomplete; keep the existing lazy-loader __getattr__ unchanged so the
actual imports still occur on first access (referencing __getattr__, type_cast,
series_from_buf, check_solution).
This change adds a new remote execution system which will ultimately replace the cuopt Python server. In this new implementation, remote solves are identical to local solves except that the env vars CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT are set to indicate a remote server.
The remote server code is included and can be built with "build.sh cuopt_grpc_server". It builds as part of the default build.
Internal code paths have been modified to use `data_model_view_t` with non-owning pointers, so problems and solutions can live in either GPU memory (local solves) or CPU memory (remote solves).
The server is built with gRPC and has support for TLS and mTLS.
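A minimal sketch of that workflow; only the env var names and the build target come from this change, while the host and port values shown are placeholders:

```sh
# Build the server (also produced by the default build)
./build.sh cuopt_grpc_server

# Point an otherwise-local solve at a remote server; with these
# variables unset, the identical code path solves locally.
export CUOPT_REMOTE_HOST=solver.example.com   # placeholder host
export CUOPT_REMOTE_PORT=50051                # placeholder port
```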
Summary by CodeRabbit
Release Notes
New Features
- `cuopt_grpc_server` executable for remote problem solving
- `--print-grpc-max` CLI option for gRPC configuration

Documentation