
Conversation

@tmckayus
Contributor

@tmckayus tmckayus commented Jan 26, 2026

This change adds a new remote execution system that will ultimately replace the cuOpt Python server. In this new implementation, remote solves are identical to local solves except that the environment variables CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT are set to indicate a remote server.
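
As an illustration of the dispatch rule, here is a minimal sketch; the detect_remote() helper and remote_config struct are hypothetical, and only the two environment variable names come from this change:

```cpp
#include <cstdlib>
#include <optional>
#include <string>

struct remote_config {
  std::string host;
  int port;
};

std::optional<remote_config> detect_remote()
{
  const char* host = std::getenv("CUOPT_REMOTE_HOST");
  const char* port = std::getenv("CUOPT_REMOTE_PORT");
  if (host == nullptr || port == nullptr || host[0] == '\0' || port[0] == '\0') {
    return std::nullopt;  // either variable unset or empty: solve locally
  }
  try {
    return remote_config{host, std::stoi(port)};  // both set: route to the remote server
  } catch (...) {
    return std::nullopt;  // non-numeric port: fall back to a local solve
  }
}
```

As the review notes below, std::stoi accepts partial parses, so production code should also validate that the port falls in the range 1-65535.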

The remote server code is included and can be built on its own with "build.sh cuopt_grpc_server"; it is also built as part of the default build.

Internal code paths have been modified to use data_model_view_t with non-owning pointers, storing problems and solutions in GPU memory for local solves and in CPU memory for remote solves.

The server is built with gRPC and has support for TLS and mTLS.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added remote solving capability for LP and MIP problems via gRPC
    • Added cuopt_grpc_server executable for remote problem solving
    • Added remote job lifecycle management (submit, status polling, result retrieval, cancellation)
    • Added --print-grpc-max CLI option for gRPC configuration
    • Added Protobuf-based serialization for remote data transport
  • Documentation

    • Added gRPC architecture and integration guide
    • Added remote serialization plugin development guide
    • Added solution memory architecture documentation

✏️ Tip: You can customize this high-level summary in your review settings.

@tmckayus tmckayus requested review from a team as code owners January 26, 2026 15:03
@tmckayus
Contributor Author

Supersedes #769

@tmckayus tmckayus added the feature request (New feature or request), non-breaking (Introduces a non-breaking change), and improvement (Improves an existing functionality) labels and removed the improvement (Improves an existing functionality) label on Jan 26, 2026
@coderabbitai

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

This PR introduces gRPC-based remote solve capabilities to cuOpt. Changes include new protobuf-based serialization, gRPC server/client implementations, hybrid GPU/CPU solution memory architecture, updated build system to support gRPC/Protobuf dependencies, and Python integration for remote problem submission and result retrieval.

Changes

Cohort / File(s) Summary
Build System & CMake Configuration
build.sh, ci/build_wheel_libcuopt.sh, ci/utils/install_protobuf_grpc.sh, cpp/CMakeLists.txt, cpp/libmps_parser/CMakeLists.txt, cpp/src/linear_programming/CMakeLists.txt, python/libcuopt/CMakeLists.txt, python/cuopt/CMakeLists.txt
Added new build target cuopt_grpc_server; integrated Protobuf and gRPC discovery; added in-tree gRPC FetchContent with v1.76.0; enhanced CMake argument handling for Python builds; added GRPC-related build flags and dependencies.
Conda & Package Dependencies
dependencies.yaml, conda/environments/all_cuda-*, conda/recipes/libcuopt/recipe.yaml
Added grpc-cpp and libprotobuf to all Conda environments (12x_aarch64 and x86_64) and recipe build/host dependencies; updated package run_exports.
Core Remote Solve API Headers
cpp/include/cuopt/linear_programming/data_model_view.hpp, cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp, cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp, cpp/include/cuopt/linear_programming/solve.hpp
Introduced non-owning data_model_view_t aliases; defined remote_solve_config_t and remote job status enums; declared abstract remote_serializer_t interface with factory accessors; added view-based solve overloads for remote/local dispatch.
Solution Memory Architecture
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp, cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp, cpp/include/cuopt/linear_programming/utilities/internals.hpp, cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp, cpp/include/cuopt/linear_programming/utilities/cython_solve.hpp
Added dual GPU/CPU storage with is_device_memory_ flag; CPU-only constructors for remote solve; host accessors/setters for solutions; to_host() migration methods; callback memory location tracking (a simplified sketch of this dual-storage pattern follows this table).
Protobuf Schemas
cpp/src/linear_programming/utilities/cuopt_remote.proto, cpp/src/linear_programming/utilities/cuopt_remote_service.proto
Defined comprehensive protobuf messages for LP/MIP problem serialization, solver settings, solutions, termination statuses, and async job management; specified gRPC service with SubmitJob, CheckStatus, GetResult, StreamResult, StreamLogs, CancelJob, DeleteResult, GetIncumbents RPCs.
Serialization & Remote Client Implementation
cpp/src/linear_programming/utilities/protobuf_serializer.cu, cpp/src/linear_programming/utilities/remote_solve.cu, cpp/src/linear_programming/utilities/remote_solve_grpc.cpp, cpp/src/linear_programming/utilities/remote_solve_grpc.hpp
Implemented protobuf_serializer_t with full problem/solution serialization; added remote solve workflows for LP/MIP with gRPC submission, status polling, result streaming, and incumbent handling; factory functions for serializer access.
Core Solver Implementation Updates
cpp/src/linear_programming/solve.cu, cpp/src/linear_programming/cuopt_c.cpp, cpp/src/linear_programming/solver_solution.cu, cpp/src/linear_programming/utilities/cython_solve.cu, cpp/src/mip/solve.cu, cpp/src/mip/solver_solution.cu
Reworked solvers to support view-based dispatch and dual GPU/CPU paths; added remote detection and routing; refactored solution storage to unique_ptr-wrapped device vectors with host mirrors; added CPU-side data management for remote solving.
CLI & MPS Parser
cpp/cuopt_cli.cpp, cpp/libmps_parser/CMakeLists.txt, cpp/libmps_parser/include/mps_parser/data_model_view.hpp, cpp/libmps_parser/src/...
Added --print-grpc-max CLI option for server message size probing; implemented create_view_from_mps_data_model() helper; exposed is_device_memory() accessors; replaced template instantiations (int → int64_t); added crash handlers.
MIP Solution & Callbacks
cpp/src/mip/diversity/population.cu
Added memory location checks for incumbent callbacks; marked GET_SOLUTION and SET_SOLUTION callbacks to execute on DEVICE memory.
Python Integration
python/cuopt/cuopt/__init__.py, python/cuopt/cuopt/linear_programming/__init__.py, python/cuopt/cuopt/linear_programming/internals/internals.pyx, python/cuopt/cuopt/linear_programming/solver/solver.pxd, python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx, python/cuopt/cuopt/utilities/...
Implemented lazy module loading via __getattr__/__dir__; added CPU/GPU data path handling in solution conversion; deferred CUDA imports for remote-only execution; replaced NumPy array interface with ctypes-based conversion.
Documentation & Configuration
GRPC_ARCHITECTURE.md, docs/developer/SERIALIZATION_PLUGIN_GUIDE.md, docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md, .gitignore
Added comprehensive gRPC architecture overview; documented serialization plugin interface and implementation guide; described hybrid GPU/CPU solution memory design; added dev_scripts/test_c_api to ignore list.
Test Configuration
cpp/tests/linear_programming/CMakeLists.txt
Added conditional RAPIDS CCCL header inclusion based on CONDA_PREFIX availability.
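
Following up on the Solution Memory Architecture row above, here is a simplified sketch of the dual-storage idea; std::vector stands in for the real device container (the actual code uses rmm::device_uvector), and all member names are assumptions:

```cpp
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

template <typename f_t>
class solution_storage {
  bool is_device_memory_;
  std::unique_ptr<std::vector<f_t>> device_;  // stand-in for rmm::device_uvector
  std::unique_ptr<std::vector<f_t>> host_;    // populated for remote (CPU-only) solves

 public:
  // CPU-only constructor used on machines without a GPU (remote solve path).
  explicit solution_storage(std::vector<f_t> host_values)
    : is_device_memory_(false),
      host_(std::make_unique<std::vector<f_t>>(std::move(host_values)))
  {
  }

  bool is_device_memory() const { return is_device_memory_; }

  // One-way migration: after to_host(), only host accessors remain valid.
  void to_host()
  {
    if (!is_device_memory_) { return; }
    host_ = std::make_unique<std::vector<f_t>>(*device_);  // device-to-host copy
    device_.reset();
    is_device_memory_ = false;
  }

  const std::vector<f_t>& get_solution_host() const
  {
    if (!host_) { throw std::runtime_error("solution is not in host memory"); }
    return *host_;
  }
};
```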

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 27.84%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically describes the main change: adding a gRPC-based remote execution system integrated with the cuOpt solver.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 15

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cpp/include/cuopt/linear_programming/utilities/internals.hpp (1)

23-47: Document ABI-breaking change or refactor to maintain layout compatibility.

base_solution_callback_t is exposed in public API headers (solver_settings.hpp, optimization_problem.hpp, mip/solver_settings.hpp) and the new memory_location member changes the class layout. This is a breaking ABI change that is not documented in RELEASE-NOTES.md for 25.12. Either add it to the breaking changes list or refactor to preserve binary compatibility (e.g., use a reserved padding field or alternative mechanism that doesn't alter sizeof() for existing object layouts). Per coding guidelines, verify no struct layout changes in public headers.
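
A self-contained sketch of why the new member is ABI-relevant, and of the reserved-padding alternative mentioned above; the struct names are illustrative, not the cuOpt headers:

```cpp
#include <cstdio>

struct callback_v1 {        // layout previously shipped in public headers
  void* user_data;
};

struct callback_v2 {        // adding a member changes sizeof() and offsets,
  void* user_data;          // breaking binary compatibility for existing users
  int memory_location;
};

struct callback_reserved {  // alternative: reserve space up front so future
  void* user_data;          // fields can be added without changing the layout
  unsigned char reserved[8];
};

int main()
{
  std::printf("v1: %zu bytes, v2: %zu bytes, reserved: %zu bytes\n",
              sizeof(callback_v1), sizeof(callback_v2), sizeof(callback_reserved));
  return 0;
}
```

Note the reserved-field trick only helps for structs that shipped with the padding in place; it cannot retroactively fix an already-released layout, which is why documenting the break is the other option offered here.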

cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (2)

41-52: Missing null check on Python callback return values.

If PyObject_CallMethod fails (e.g., Python exception), it returns NULL. Calling Py_DECREF(NULL) or Py_DECREF on a failed res can cause undefined behavior. Consider checking for NULL before decrementing reference counts.

🐛 Proposed fix to add null checks
   void get_solution(void* data, void* objective_value) override
   {
     PyObject* numba_matrix =
       data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+    if (!numba_matrix) return;  // Python exception occurred
     PyObject* numpy_array =
       data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+    if (!numpy_array) {
+      Py_DECREF(numba_matrix);
+      return;
+    }
     PyObject* res =
       PyObject_CallMethod(this->pyCallbackClass, "get_solution", "(OO)", numba_matrix, numpy_array);
     Py_DECREF(numba_matrix);
     Py_DECREF(numpy_array);
-    Py_DECREF(res);
+    Py_XDECREF(res);  // Use Py_XDECREF for potentially NULL res
   }

80-91: Same null check issue in set_solution.

The set_solution method has the same pattern and should also have null checks for Python callback failures.

🐛 Proposed fix
   void set_solution(void* data, void* objective_value) override
   {
     PyObject* numba_matrix =
       data_on_device() ? get_numba_matrix(data, n_variables) : get_numpy_array(data, n_variables);
+    if (!numba_matrix) return;
     PyObject* numpy_array =
       data_on_device() ? get_numba_matrix(objective_value, 1) : get_numpy_array(objective_value, 1);
+    if (!numpy_array) {
+      Py_DECREF(numba_matrix);
+      return;
+    }
     PyObject* res =
       PyObject_CallMethod(this->pyCallbackClass, "set_solution", "(OO)", numba_matrix, numpy_array);
     Py_DECREF(numba_matrix);
     Py_DECREF(numpy_array);
-    Py_DECREF(res);
+    Py_XDECREF(res);
   }
🤖 Fix all issues with AI agents
In `@build.sh`:
- Line 57: The default build unconditionally enables and builds
cuopt_grpc_server by forcing -DCUOPT_ENABLE_GRPC=ON; change build.sh so the
cuopt_grpc_server build is only executed when gRPC is detected or when the user
explicitly requests it: add a detection step (e.g., run pkg-config --exists grpc
or try cmake -DgRPC_DIR=.../ find_package(grpc) check) and gate the creation of
the cuopt_grpc_server target and the -DCUOPT_ENABLE_GRPC=ON flag on that
detection, otherwise skip adding cuopt_grpc_server to the default build (or
require an explicit CLI arg like --enable-grpc) so the script no longer forces
gRPC as a hard dependency.

In `@cpp/cuopt_remote_server.cpp`:
- Around line 1188-1217: The code currently moves result_data into the waiting
thread (waiter->result_data = std::move(result_data)), which empties it before
you store it in job_tracker; change the order or semantics so the tracker gets
the real result: either copy result_data into the waiter (use
waiter->result_data = result_data instead of std::move) and then move
result_data into the tracker (it->second.result_data = std::move(result_data)),
or update the job_tracker first (it->second.result_data =
std::move(result_data)) and then set waiter->result_data =
it->second.result_data (copy). Update the block touching
waiting_threads/waiters_mutex or the block touching job_tracker/tracker_mutex
accordingly so waiter->result_data retains the full data and job_tracker stores
the result (addresses: waiting_threads, waiters_mutex, waiter->result_data,
job_tracker, tracker_mutex, JobStatus::COMPLETED).
- Around line 1829-1833: The stdout/stderr redirection in stdout_streamer_t
(which calls dup2() in its constructor/destructor) must be serialized to avoid
cross-thread interference: introduce a process-wide mutex (e.g., a static
std::mutex stdout_redir_mutex) and acquire a lock whenever performing the
dup2-based redirection and when restoring the original fds; specifically, lock
inside stdout_streamer_t's constructor before the dup2 that redirects
stdout/stderr and keep the lock until after the dup2 in the destructor that
restores them (or alternatively lock around the scope where stdout_streamer_t
streamer(client_fd, stream_logs) is constructed in handle_sync_solve), ensuring
no two threads can perform the dup2/restore concurrently.
- Around line 303-312: The generate_job_id function currently uses shared static
RNG state (rd, gen, dis) which causes data races; change the RNG and
distribution to thread-local instead of static so each thread gets its own
std::random_device/ std::mt19937/ std::uniform_int_distribution instance, e.g.,
make rd, gen and dis thread_local inside generate_job_id (preserving the
snprintf formatting and return of buf) so concurrent calls from different client
handler threads are safe and avoid synchronization overhead.

In `@cpp/src/linear_programming/cuopt_c.cpp`:
- Around line 432-435: The remote-CPU copy currently assigns variable_types
verbatim (see cpu_data.variable_types assignment and the similar block at lines
535-540) so if callers pass 'B' the problem_cpu_data_t::is_mip() check will miss
integer/binary variables; update both copy sites to normalize
non-continuous/binary types to CUOPT_INTEGER (mirror the GPU path): iterate the
incoming variable_types array and when a type is 'B' (or otherwise
non-continuous), set the stored type to CUOPT_INTEGER before assigning
cpu_data.variable_types, ensuring MIP classification is correct.
- Around line 46-83: problem_cpu_data_t is missing an objective_scaling_factor
field and the remote cuOptReadProblem path doesn't propagate it into the view;
add a cuopt_float_t objective_scaling_factor member to problem_cpu_data_t,
populate it where cuOptReadProblem constructs or copies CPU problem data in the
remote path, and modify create_view()
(data_model_view_t<cuopt_int_t,cuopt_float_t>::create_view) to call
view.set_objective_scaling_factor(objective_scaling_factor) in addition to
set_maximize and set_objective_offset so the scaling from the MPS parser is
preserved end-to-end.

In `@cpp/src/linear_programming/solve.cu`:
- Around line 1207-1221: The cudaMemcpy calls that move var_types to
host_var_types lack error checking; update the block that uses
cudaPointerAttributes (vars: attrs, err, var_types, host_var_types) to capture
the cudaMemcpy return value (for the cudaMemcpyDeviceToHost path and any other
cudaMemcpy) into a cudaError_t, check if it's not cudaSuccess, call
cudaGetLastError() if needed, and handle/report the failure (e.g., log via
existing logger, return or throw an error) instead of ignoring it; ensure the
memcpy fallback path still clears prior cuda errors with cudaGetLastError() as
before.
- Around line 1264-1290: The cudaMemcpy calls that copy the quadratic objective
arrays (calls that move data into Q_values, Q_indices, Q_offsets from
view.get_quadratic_objective_values()/indices()/offsets()) are missing CUDA
error checks; wrap each cudaMemcpy in error-handling by capturing the
cudaError_t result, checking it against cudaSuccess, and handling failures
(e.g., log via the existing logger or return/throw an error) and/or call
cudaGetLastError() to clear and report any errors; ensure both the
device-to-host branch (the three cudaMemcpy calls) and the host branch use
consistent error reporting so failures during copying of the quadratic objective
matrix are detected and handled.

In `@cpp/src/linear_programming/utilities/protobuf_serializer.cu`:
- Around line 1395-1403: The cudaMemcpy call copying device MIP solution into
host vector (when solution.is_device_memory(), using d_sol and h_sol) lacks CUDA
error checking; wrap the cudaMemcpy(d_sol.data() -> h_sol.data()) call by
capturing its cudaError_t return (or using cudaMemcpyAsync + cudaGetLastError),
check for success (cudaSuccess), and on failure log or propagate the error
(e.g., via processLogger / throw) before iterating and calling
pb_solution->add_solution; ensure the check is applied in the block handling
device solutions for the MIP serialization same as LP solution handling.
- Around line 1305-1334: The cudaMemcpy calls that copy device buffers d_primal,
d_dual and d_reduced_cost into host vectors in protobuf_serializer.cu lack error
checking; wrap each cudaMemcpy invocation (the ones copying into h_primal,
h_dual, h_reduced_cost) with a cudaError_t result check, and on non-success use
cudaGetErrorString to log the error and fail the serialization path (e.g.,
return an error status or throw) so pb_solution is not populated with corrupted
data; reference the variables d_primal, d_dual, d_reduced_cost, nc_solution and
the optimization_problem_solution_t template to find the exact spots to add the
checks.

In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp`:
- Around line 597-614: Replace the non-synchronized volatile flag with an atomic
boolean: change the parameter type of stream_logs_to_stdout from volatile bool*
stop_flag to std::atomic<bool>* stop_flag and replace reads of *stop_flag with
stop_flag->load(std::memory_order_acquire); update the writer side in
remote_solve.cu (the stop_logs variable and any writes) to use std::atomic<bool>
and stop_logs.store(true, std::memory_order_release) so the worker thread and
main thread synchronize safely; ensure all declarations and calls that pass
stop_logs are adjusted to the std::atomic<bool>* type.
- Around line 88-100: The code calls ParseFromArray with a
static_cast<int>(size) on size_t without checking bounds; add an explicit check
before parsing (for both branches around req.mutable_lp_request() and
req.mutable_mip_request()) that if size >
static_cast<size_t>(std::numeric_limits<int>::max()) then set error_message
(e.g., "SubmitJob: request too large for ParseFromArray"), set status_code =
grpc::StatusCode::INVALID_ARGUMENT and return false; ensure you `#include`
<limits> or <climits> as needed and reference ProblemType, ParseFromArray,
req.mutable_lp_request()/req.mutable_mip_request(), error_message and
status_code in the fix.

In `@cpp/src/mip/solve.cu`:
- Around line 383-447: The cpu_problem_data_t is missing storage and propagation
for constraint row types, so add a member (e.g., std::vector<char> row_types or
the appropriate type) to cpu_problem_data_t, ensure any code that
constructs/populates cpu_problem_data_t from gpu_view copies gpu_view.row_types
into that vector, and update cpu_problem_data_t::create_view() to call
v.set_constraint_row_types(row_types.data(), row_types.size()) (in the same
manner as other constraint arrays) before returning; refer to
cpu_problem_data_t, create_view, row_types, gpu_view, and
set_constraint_row_types to locate where to add the field, copy it, and pass it
into the data_model_view_t.
- Around line 486-495: Wrap the cudaMemcpyAsync and cudaStreamSynchronize calls
with RAFT_CUDA_TRY to check for errors: replace the direct call to
cudaMemcpyAsync(cpu_data.variable_types.data(), var_types_span.data(),
var_types_span.size() * sizeof(char), cudaMemcpyDeviceToHost, stream) with
RAFT_CUDA_TRY(cudaMemcpyAsync(...)) and similarly wrap
cudaStreamSynchronize(stream) with RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
ensure you reference the same variables (cpu_data.variable_types,
var_types_span, stream) so failures in the async copy or stream sync are caught
and propagated per project guidelines.

In `@python/cuopt/cuopt/linear_programming/remote.py`:
- Around line 136-140: The AsyncRequest's job_id field is defined as bytes but
the code assigns a str directly; update the creation of pb.AsyncRequest in
remote.py (where request.request_type = pb.CANCEL_JOB and request.job_id =
job_id) to ensure job_id is bytes (e.g., set request.job_id =
job_id.encode('utf-8') or conditionally keep it if already bytes) so
serialization no longer raises a TypeError.
🟡 Minor comments (16)
docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md-73-95 (1)

73-95: Add language tags to the ASCII diagram fences.

markdownlint (MD040) will fail on the two untyped fenced blocks; use a text fence to keep lint clean and rendering consistent.

🛠️ Suggested fix
-```
+```text
 ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
 │   Solver    │ ──► │  Solution   │ ──► │    User     │
 │   (GPU)     │     │ (GPU mem)   │     │ (GPU/CPU)   │
 └─────────────┘     └─────────────┘     └─────────────┘
-```
+```

-```
+```text
 ┌──────────┐     ┌──────────────────────────────────────────┐     ┌──────────┐
 │  Client  │ ──► │              SERVER                      │ ◄── │  Client  │
 │ (no GPU) │     │ GPU solve → to_host() → serialize        │     │(solution)│
 └──────────┘     └──────────────────────────────────────────┘     └──────────┘
-```
+```
cpp/cuopt_cancel_job.cpp-63-88 (1)

63-88: Harden port parsing and range validation.

std::stoi can throw on invalid input, and std::atoi silently maps invalid env values to 0. Without error handling or range checks, the tool can crash or select an invalid port. Replace both with explicit range validation (1–65535) and clear error messages.

🛠️ Suggested fix
+#include <cerrno>
 ...
+bool parse_port(const char* s, int& port)
+{
+  if (s == nullptr || *s == '\0') { return false; }
+  char* end = nullptr;
+  errno     = 0;
+  long val  = std::strtol(s, &end, 10);
+  if (errno != 0 || *end != '\0' || val < 1 || val > 65535) { return false; }
+  port = static_cast<int>(val);
+  return true;
+}
+
 int main(int argc, char** argv)
 {
   // Parse arguments
   std::string job_id;
   std::string host = "localhost";
   int port         = 9090;
@@
-  if (env_port && env_port[0]) { port = std::atoi(env_port); }
+  if (env_port && env_port[0] && !parse_port(env_port, port)) {
+    std::cerr << "Error: invalid CUOPT_REMOTE_PORT\n";
+    return 1;
+  }
@@
-    } else if (strcmp(argv[i], "-p") == 0 && i + 1 < argc) {
-      port = std::stoi(argv[++i]);
+    } else if (strcmp(argv[i], "-p") == 0) {
+      if (i + 1 >= argc || !parse_port(argv[i + 1], port)) {
+        std::cerr << "Error: invalid port\n\n";
+        print_usage(argv[0]);
+        return 1;
+      }
+      ++i;
     } else if (argv[i][0] != '-') {
       job_id = argv[i];
     }
python/cuopt/cuopt/utilities/__init__.py-15-25 (1)

15-25: Add __all__ and __dir__ to expose lazy utilities for IDE discoverability.

The lazy imports work correctly for direct access (e.g., from cuopt.utilities import type_cast), but without __all__ and __dir__, these names are not discoverable via IDE completion or star imports. Adding these improves the public API definition, consistent with other modules in the codebase.

Suggested fix
+__all__ = [
+    "InputRuntimeError",
+    "InputValidationError",
+    "OutOfMemoryError",
+    "catch_cuopt_exception",
+    "type_cast",
+    "series_from_buf",
+    "check_solution",
+]
+
+def __dir__():
+    return sorted(__all__)
+
 def __getattr__(name):
     """Lazy import CUDA-dependent utilities."""
     if name == "type_cast":
         from cuopt.utilities.type_casting import type_cast
         return type_cast
GRPC_PROTOTYPE_STATUS.md-130-144 (1)

130-144: Add a language identifier to the expected-output code fence.
This resolves MD040 and improves rendering.

✏️ Suggested edit
-```
+```text
 [Client] Job submitted successfully
 [Client] Job ID: <hex-id>
 [Client] Polling for job completion...
 [Client] Job status: QUEUED - Job queued
 [Client] Job status: PROCESSING - Job processing
 [Client] Job status: COMPLETED - Job completed
 [Client] Retrieving result...
 [Client] Result retrieved successfully
 [Client] LP Solution:
 [Client]   Status: 2  (PDLP_OPTIMAL)
 [Client]   Objective: 42.0  (dummy value)
 [Client] Test completed successfully!
-```
+```

cpp/src/linear_programming/utilities/cython_solve.cu-293-304 (1)

293-304: Propagate solved_by_pdlp instead of hardcoding false.
Remote solves should mirror solver metadata; hardcoding misreports the algorithm used.

🐛 Suggested fix
-      lp_ret.solved_by_pdlp_     = false;
+      lp_ret.solved_by_pdlp_     = term_info.solved_by_pdlp;
cpp/cuopt_cli.cpp-508-549 (1)

508-549: Handle gRPC status probe failures before printing server max.
Failures silently report 0 MiB instead of surfacing the error; check error_message before printing the result.

🐛 Suggested guard
       cuopt::linear_programming::grpc_remote::check_status(address,
                                                            "__cuopt_max_message_probe__",
                                                            status,
                                                            error_message,
                                                            &result_size_bytes,
                                                            &max_message_bytes);
-      std::cout << "Server max message MiB: " << (max_message_bytes / (1024 * 1024)) << "\n";
+      if (!error_message.empty()) {
+        std::cout << "Server max message MiB: unavailable (" << error_message << ")\n";
+      } else {
+        std::cout << "Server max message MiB: " << (max_message_bytes / (1024 * 1024)) << "\n";
+      }
GRPC_BUILD_STATUS.md-8-24 (1)

8-24: Clarify prototype RPC coverage to avoid mixed signals.
Lines 8–13 say CheckStatus is supported, but line 23 says “SubmitJob and GetResult only.” Please align the list and the log snippet so the status is unambiguous.

GRPC_BUILD_STATUS.md-66-70 (1)

66-70: Add a language hint to the fenced block.
Line 66 opens a fenced block without a language; markdownlint MD040 flags this. Consider text for the environment list.

✏️ Suggested fix
-```
+```text
 libgrpc: 1.73.1
 libprotobuf: 6.31.1
 protobuf (pip): 6.33.2
-```
+```
python/cuopt/cuopt/linear_programming/remote.py-141-213 (1)

141-213: Close sockets on all exception paths to prevent file descriptor leaks.

When socket.timeout or socket.error exceptions occur after sock.connect() (line 145), the socket is not closed before returning at lines 200–211. Wrap the I/O operations in a try/finally block to ensure sock.close() is always called.

Suggested fix
-        # Connect to server
-        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-        sock.settimeout(30.0)  # 30 second timeout
-        sock.connect((host, port))
-
-        # Send request (length-prefixed)
-        data = request.SerializeToString()
-        sock.sendall(struct.pack("<Q", len(data)))
-        sock.sendall(data)
-
-        # Receive response
-        size_data = b""
-        while len(size_data) < 8:
-            chunk = sock.recv(8 - len(size_data))
-            if not chunk:
-                sock.close()
-                return CancelResult(
-                    success=False,
-                    message="Failed to receive response size",
-                    job_status=JobStatus.NOT_FOUND,
-                )
-            size_data += chunk
-
-        size = struct.unpack("<Q", size_data)[0]
-        response_data = b""
-        while len(response_data) < size:
-            chunk = sock.recv(size - len(response_data))
-            if not chunk:
-                break
-            response_data += chunk
-
-        sock.close()
+        # Connect to server
+        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+        sock.settimeout(30.0)  # 30 second timeout
+        try:
+            sock.connect((host, port))
+
+            # Send request (length-prefixed)
+            data = request.SerializeToString()
+            sock.sendall(struct.pack("<Q", len(data)))
+            sock.sendall(data)
+
+            # Receive response
+            size_data = b""
+            while len(size_data) < 8:
+                chunk = sock.recv(8 - len(size_data))
+                if not chunk:
+                    return CancelResult(
+                        success=False,
+                        message="Failed to receive response size",
+                        job_status=JobStatus.NOT_FOUND,
+                    )
+                size_data += chunk
+
+            size = struct.unpack("<Q", size_data)[0]
+            response_data = b""
+            while len(response_data) < size:
+                chunk = sock.recv(size - len(response_data))
+                if not chunk:
+                    break
+                response_data += chunk
+        finally:
+            sock.close()
cpp/include/cuopt/linear_programming/utilities/remote_serialization.hpp-15-18 (1)

15-18: Add <cstdint> include for uint8_t type definition.

This public header file uses std::vector<uint8_t> extensively throughout (lines 53, 64, 79, 87, 99, 111, 123, 133, 142, 156, 169, 180, 188, 196, 205, 214, 247, 257, 265, 274, 283, 291, 305, 314, 322, 330, 355, 365, 377, 388, 399, 413) but does not explicitly include <cstdint>, which defines uint8_t. Add it to ensure the header is self-contained.

🔧 Suggested fix
+#include <cstdint>
 #include <memory>
 #include <optional>
 #include <string>
 #include <vector>
cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp-39-53 (1)

39-53: Validate remote port parsing and range.
Line 46: std::stoi accepts partial parses (e.g., "123abc" → 123) without exception and does not validate the port is within the valid range (1–65535). Invalid but numeric-prefix strings will enable remote solve unintentionally. Add strict parsing with position tracking and range validation.

🔧 Suggested fix
-  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
-    try {
-      int port_num = std::stoi(port);
-      return remote_solve_config_t{std::string(host), port_num};
-    } catch (...) {
-      // Invalid port number, fall back to local solve
-      return std::nullopt;
-    }
-  }
+  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
+    try {
+      std::string port_str{port};
+      size_t pos = 0;
+      int port_num = std::stoi(port_str, &pos);
+      if (pos != port_str.size() || port_num <= 0 || port_num > 65535) { return std::nullopt; }
+      return remote_solve_config_t{std::string(host), port_num};
+    } catch (...) {
+      // Invalid port number, fall back to local solve
+      return std::nullopt;
+    }
+  }
cpp/src/mip/solver_solution.cu-147-157 (1)

147-157: Potential null pointer dereference in get_solution().

Similarly, get_solution() dereferences solution_ without null checks. After calling to_host(), solution_ is reset to nullptr, so subsequent calls to get_solution() would crash.

Proposed defensive fix
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
+  cuopt_expects(is_device_memory_ && solution_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution() called but data is on host or device storage is null");
   return *solution_;
 }
cpp/src/mip/solver_solution.cu-159-169 (1)

159-169: Potential null pointer dereference in get_solution_host().

The get_solution_host() methods dereference solution_host_ without null checks. If called when is_device_memory_ is true and solution_host_ hasn't been initialized, this will crash.

Proposed defensive fix
 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
+  cuopt_expects(!is_device_memory_ && solution_host_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution_host() called but data is on device or host storage is null");
   return *solution_host_;
 }
 
 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
+  cuopt_expects(!is_device_memory_ && solution_host_ != nullptr,
+                error_type_t::RuntimeError,
+                "get_solution_host() called but data is on device or host storage is null");
   return *solution_host_;
 }
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp-77-86 (1)

77-86: Potential null pointer dereference in host accessors.

The get_solution_host() methods (lines 84-85) dereference solution_host_ without checking if it's valid. If is_device_memory() returns true but these methods are called, it will crash.

Consider adding an assertion or throwing an exception for misuse:

Suggested defensive check
+  /**
+   * @brief Returns the solution in host (CPU) memory.
+   * Only valid when is_device_memory() returns false.
+   * @throws std::runtime_error if called when data is on device
+   */
   std::vector<f_t>& get_solution_host();
   const std::vector<f_t>& get_solution_host() const;

And in the implementation:

template <typename i_t, typename f_t>
std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
{
  cuopt_expects(!is_device_memory_ && solution_host_, 
                error_type_t::RuntimeError, 
                "get_solution_host() called but data is on device");
  return *solution_host_;
}
cpp/src/linear_programming/solver_solution.cu-529-579 (1)

529-579: Keep solved_by_pdlp consistent with termination_stats_.
set_termination_stats() updates termination_stats_, but get_solved_by_pdlp() returns solved_by_pdlp_. Unless both are kept in sync elsewhere, the getter can return stale data. Consider deriving from termination_stats_.solved_by_pdlp (or update the setter).

Proposed fix (single source of truth)
 bool optimization_problem_solution_t<i_t, f_t>::get_solved_by_pdlp() const
 {
-  return solved_by_pdlp_;
+  return termination_stats_.solved_by_pdlp;
 }
cpp/src/linear_programming/solver_solution.cu-411-438 (1)

411-438: Device getters lack null-safety checks and risk crashes with CPU-only solutions.

The device getters get_primal_solution(), get_dual_solution(), and get_reduced_cost() unconditionally dereference unique_ptrs that are never initialized in CPU-only constructors or after to_host() resets them. While public APIs in cuopt_c.cpp and protobuf_serializer.cu properly guard these calls with is_device_memory() checks, internal code like translate_to_crossover_problem() calls device getters without guards. Although this is currently safe due to implicit contracts (that function is only called with Optimal status solutions, which are always device-based), the code is fragile. Consider adding defensive null-checks or assertions to the device getters themselves, or ensure all callsites explicitly verify is_device_memory() before using device accessors.
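
A hedged sketch of the defensive device getter suggested here, with std::vector standing in for the device buffer and all names assumed:

```cpp
#include <memory>
#include <stdexcept>
#include <vector>

template <typename f_t>
class lp_solution {
  bool is_device_memory_ = false;
  std::unique_ptr<std::vector<f_t>> primal_solution_;  // stand-in for device storage

 public:
  // Guarded accessor: fail loudly instead of dereferencing a null unique_ptr
  // when the solution lives on the host (CPU-only construction or after to_host()).
  const std::vector<f_t>& get_primal_solution() const
  {
    if (!is_device_memory_ || !primal_solution_) {
      throw std::runtime_error(
        "get_primal_solution() called but data is on host or device storage is null");
    }
    return *primal_solution_;
  }
};
```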

🧹 Nitpick comments (11)
python/cuopt/cuopt/linear_programming/internals/internals.pyx (1)

52-57: Non-owning view and limited type support.

Two observations on this implementation:

  1. Memory lifetime: np.ctypeslib.as_array returns a non-owning view into the memory at data. Callers must ensure the underlying buffer remains valid for the lifetime of the returned array.

  2. Type handling: The else branch silently defaults to c_double for any typestr other than "float32". If an unexpected type (e.g., "int32") is passed, this will silently misinterpret the data.

Consider adding an assertion or explicit check:

Suggested improvement
     def get_numpy_array(self, data, shape, typestr):
-        ctype = ctypes.c_float if typestr == "float32" else ctypes.c_double
+        if typestr == "float32":
+            ctype = ctypes.c_float
+        elif typestr == "float64":
+            ctype = ctypes.c_double
+        else:
+            raise ValueError(f"Unsupported typestr: {typestr}")
         buf_type = ctype * shape
         buf = buf_type.from_address(data)
         numpy_array = np.ctypeslib.as_array(buf)
         return numpy_array
cpp/src/mip/diversity/population.cu (1)

283-307: Consider keeping the incumbent-return log at DEBUG or throttling it.

If incumbents are frequent, INFO logging here can be noisy in long runs. Optional, but might be worth keeping this at DEBUG or rate-limiting.

🔧 Optional tweak (reduce log volume)
-        CUOPT_LOG_INFO("Returning incumbent solution with objective %g", user_objective);
+        CUOPT_LOG_DEBUG("Returning incumbent solution with objective %g", user_objective);
cpp/cuopt_cli.cpp (1)

122-213: The default is_device_memory value is already false; the suggested explicit call is optional.

The data_model_view_t member is_device_memory_ defaults to false (CPU memory) by design, which is correct for MPS-parsed data. Adding view.set_is_device_memory(false) would be redundant but could improve clarity if defending against future default changes. This is optional.

ci/utils/install_protobuf.sh (1)

11-19: Ensure dnf config-manager availability on Rocky.

On minimal Rocky images, dnf config-manager can be missing unless dnf-plugins-core is installed, which can leave PowerTools/CRB disabled and make protobuf-devel unavailable. Consider installing the plugin before toggling repos.

♻️ Suggested tweak
 if [[ "$ID" == "rocky" ]]; then
     echo "Detected Rocky Linux. Installing Protobuf via dnf..."
+    dnf install -y dnf-plugins-core
     # Enable PowerTools (Rocky 8) or CRB (Rocky 9) repository for protobuf-devel
     if [[ "${VERSION_ID%%.*}" == "8" ]]; then
         dnf config-manager --set-enabled powertools || dnf config-manager --set-enabled PowerTools || true
     elif [[ "${VERSION_ID%%.*}" == "9" ]]; then
         dnf config-manager --set-enabled crb || true
     fi
     dnf install -y protobuf-devel protobuf-compiler
docs/developer/SERIALIZATION_PLUGIN_GUIDE.md (2)

13-25: Minor: Add language specifier to fenced code block.

Static analysis suggests adding a language specifier. For ASCII diagrams, use text or leave as-is if the tooling accepts it.

📝 Proposed fix
-```
+```text
 ┌─────────────┐                              ┌─────────────┐

374-377: Minor: Add language specifier for server output example.

📝 Proposed fix
-```
+```text
 [remote_solve] Loading custom serializer from: /path/to/libcuopt_my_serializer.so
 [remote_solve] Using custom serializer: my_format
-```
+```

cpp/libmps_parser/CMakeLists.txt (1)

81-100: Keep mps_parser_static compile settings aligned with the shared target.
mps_parser_static doesn’t inherit the C++20 standard or compile flags set on mps_parser, which can lead to divergent builds or failures if the sources require C++20. Consider applying the same properties/flags (or factoring into a helper). If CMAKE_CXX_STANDARD is globally set, please confirm that it also applies to mps_parser_static.

♻️ Suggested alignment
 add_library(mps_parser SHARED ${MPS_PARSER_SOURCES})
 add_library(mps_parser_static STATIC ${MPS_PARSER_SOURCES})

 set_target_properties(mps_parser
   PROPERTIES BUILD_RPATH "\$ORIGIN"
   INSTALL_RPATH "\$ORIGIN"
   # set target compile options
   CXX_STANDARD 20
   CXX_STANDARD_REQUIRED ON
   INTERFACE_POSITION_INDEPENDENT_CODE ON
   CXX_SCAN_FOR_MODULES OFF
 )
+set_target_properties(mps_parser_static
+  PROPERTIES
+  CXX_STANDARD 20
+  CXX_STANDARD_REQUIRED ON
+  INTERFACE_POSITION_INDEPENDENT_CODE ON
+  CXX_SCAN_FOR_MODULES OFF
+)

 target_compile_options(mps_parser
   PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
 )
+target_compile_options(mps_parser_static
+  PRIVATE "$<$<COMPILE_LANGUAGE:CXX>:${MPS_PARSER_CXX_FLAGS}>"
+)
cpp/include/cuopt/linear_programming/solve.hpp (1)

112-123: Clarify memory-location expectations for data_model_view_to_optimization_problem.
The doc says the view points to GPU memory, but the new API allows CPU-backed views for remote solves. Either enforce GPU-only usage or update the docs to reflect CPU/GPU support to avoid misuse. As per coding guidelines, public headers should document GPU requirements.

cpp/src/linear_programming/utilities/protobuf_serializer.cu (1)

1473-1515: Intentional dlopen handle leak is documented but could be improved.

The comment on line 1515 notes that the handle is intentionally not closed to keep the library loaded. While this is acceptable, consider storing the handle to enable proper cleanup in tests or if the serializer needs to be reloaded.

cpp/src/linear_programming/utilities/remote_solve.cu (2)

197-230: Consider exponential backoff for status polling.

The fixed 100ms polling interval (line 229) may be inefficient for long-running jobs or cause excessive server load. Consider exponential backoff:

Suggested improvement
+  int poll_delay_ms = 100;
+  constexpr int max_poll_delay_ms = 5000;
+
   while (true) {
     std::string st_err;
     if (!grpc_remote::check_status(
           address, job_id, status, st_err, &result_size_bytes, &max_message_bytes)) {
       // ... error handling
     }

     if (status == "COMPLETED") { break; }
     // ... error status handling

-    std::this_thread::sleep_for(std::chrono::milliseconds(100));
+    std::this_thread::sleep_for(std::chrono::milliseconds(poll_delay_ms));
+    poll_delay_ms = std::min(poll_delay_ms * 2, max_poll_delay_ms);
   }

414-439: Final incumbent drain loop has hardcoded retry count.

The loop retries 5 times with 50ms delays (lines 417, 427), which may not be sufficient for all cases. Consider making this configurable or documenting the rationale.

Comment on build.sh (default-action help text)

 -h - print this text
-default action (no args) is to build and install 'libcuopt' then 'cuopt' then 'docs' targets
+default action (no args) is to build and install 'libcuopt', 'cuopt', 'cuopt_grpc_server', then 'docs'

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Step 1: Inspect how CUOPT_ENABLE_GRPC is handled in CMakeLists.txt files
rg -n "CUOPT_ENABLE_GRPC|find_package\(gRPC|gRPC_FOUND" -t cmake -A 3

Repository: NVIDIA/cuopt

Length of output: 2406


🏁 Script executed:

#!/bin/bash
# Step 2: Examine build.sh lines around 57 and 461-483 to understand the gRPC build block
head -75 build.sh | tail -25

Repository: NVIDIA/cuopt

Length of output: 1343


🏁 Script executed:

#!/bin/bash
# Step 3: Check lines 461+ in build.sh for the gRPC build block
sed -n '450,490p' build.sh

Repository: NVIDIA/cuopt

Length of output: 1901


Gate cuopt_grpc_server build on gRPC availability, or remove it from default builds.
Lines 57 and 455–483: The default build (no arguments) includes cuopt_grpc_server and unconditionally forces -DCUOPT_ENABLE_GRPC=ON (line 468). Since gRPC is designed as optional in CMakeLists.txt, this forces gRPC as a hard dependency for default builds, causing failures when gRPC is not installed. Either gate this block on gRPC availability detection or make it explicit via user argument.

🤖 Prompt for AI Agents
In `@build.sh` at line 57, the default build unconditionally enables and builds
cuopt_grpc_server by forcing -DCUOPT_ENABLE_GRPC=ON; change build.sh so the
cuopt_grpc_server build is only executed when gRPC is detected or when the user
explicitly requests it: add a detection step (e.g., run pkg-config --exists grpc
or try cmake -DgRPC_DIR=.../ find_package(grpc) check) and gate the creation of
the cuopt_grpc_server target and the -DCUOPT_ENABLE_GRPC=ON flag on that
detection, otherwise skip adding cuopt_grpc_server to the default build (or
require an explicit CLI arg like --enable-grpc) so the script no longer forces
gRPC as a hard dependency.

Comment on lines 303 to 312
std::string generate_job_id()
{
  static std::random_device rd;
  static std::mt19937 gen(rd());
  static std::uniform_int_distribution<uint64_t> dis;

  uint64_t id = dis(gen);
  char buf[32];
  snprintf(buf, sizeof(buf), "job_%016lx", id);
  return std::string(buf);

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file structure and understand the server implementation
fd -t f "cuopt_remote_server" --extension cpp --extension hpp

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Check the overall structure and find the file
git ls-files | grep -E "cuopt_remote_server\.(cpp|hpp)"

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Get size of the file to understand how much we need to read
wc -l cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 88


🏁 Script executed:

# Read the relevant section and surrounding context to understand the function and its usage
sed -n '290,330p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 1177


🏁 Script executed:

# Search for calls to generate_job_id to understand where it's invoked
rg "generate_job_id" -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 2795


🏁 Script executed:

# Check for thread creation or async patterns in the server
rg "std::thread|async|detach|join" cpp/cuopt_remote_server.cpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 2989


🏁 Script executed:

# Let's verify the threading context more carefully - check handle_client to confirm it's called from detached threads
sed -n '1900,2000p' cpp/cuopt_remote_server.cpp | grep -A 50 "handle_client"

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Get the exact line where detached threads are created for clients
rg "std::thread.*detach" cpp/cuopt_remote_server.cpp -B 5 -A 5

Repository: NVIDIA/cuopt

Length of output: 424


🌐 Web query:

std::mt19937 thread safety concurrent access C++

💡 Result:

std::mt19937 is not safe for concurrent access to the same engine object.

  • Calling operator() on a Mersenne Twister engine advances (modifies) its internal state. If two threads call it on the same std::mt19937 instance without synchronization, that’s a data race (undefined behavior). [1]
  • More generally, the C++ standard library’s data-race rules mean you can’t concurrently call non-const operations on the same object from multiple threads without external synchronization. [2]

What to do instead

  • Prefer one engine per thread, e.g. thread_local std::mt19937 rng(seed); (fast, no locking). [3]
  • Or protect a shared engine with a mutex (correct but can bottleneck).
  • Note: function-local static initialization is thread-safe, but subsequent concurrent use of that static engine is not. [3]

Sources: [1] [2] [3]


Make job ID generation thread-safe.

std::mt19937 and std::uniform_int_distribution are mutated concurrently by multiple client handler threads, causing a data race. Each call to generate_job_id() from different clients results in unsynchronized access to the shared static RNG state, leading to undefined behavior and potential duplicate IDs.

Use thread_local to give each thread its own RNG instance, eliminating the race condition without synchronization overhead.

Proposed fix
 std::string generate_job_id()
 {
-  static std::random_device rd;
-  static std::mt19937 gen(rd());
-  static std::uniform_int_distribution<uint64_t> dis;
+  thread_local std::mt19937_64 gen{std::random_device{}()};
+  thread_local std::uniform_int_distribution<uint64_t> dis;
 
   uint64_t id = dis(gen);
   char buf[32];
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 303 - 312, the generate_job_id
function currently uses shared static RNG state (rd, gen, dis) which causes data
races; change the RNG and distribution to thread-local instead of static so each
thread gets its own std::random_device/ std::mt19937/
std::uniform_int_distribution instance, e.g., make rd, gen and dis thread_local
inside generate_job_id (preserving the snprintf formatting and return of buf) so
concurrent calls from different client handler threads are safe and avoid
synchronization overhead.

Comment on lines 1188 to 1217
// Check if there's a blocking waiter
{
  std::lock_guard<std::mutex> lock(waiters_mutex);
  auto wit = waiting_threads.find(job_id);
  if (wit != waiting_threads.end()) {
    // Wake up the waiting thread
    auto waiter           = wit->second;
    waiter->result_data   = std::move(result_data);
    waiter->error_message = error_message;
    waiter->success       = success;
    waiter->ready         = true;
    waiter->cv.notify_one();
  }
}

// Update job tracker
{
  std::lock_guard<std::mutex> lock(tracker_mutex);
  auto it = job_tracker.find(job_id);
  if (it != job_tracker.end()) {
    if (success) {
      it->second.status      = JobStatus::COMPLETED;
      it->second.result_data = result_data;
    } else if (cancelled) {
      it->second.status        = JobStatus::CANCELLED;
      it->second.error_message = error_message;
    } else {
      it->second.status        = JobStatus::FAILED;
      it->second.error_message = error_message;
    }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the file and verify it exists
find . -name "cuopt_remote_server.cpp" -type f

Repository: NVIDIA/cuopt

Length of output: 85


🏁 Script executed:

# Read the specific lines mentioned in the review (1188-1217) with context
sed -n '1180,1230p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt

Length of output: 2123


🏁 Script executed:

# Get more context about the function containing this code
sed -n '1100,1250p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt

Length of output: 6665


🏁 Script executed:

# Search for GET_RESULT or similar result retrieval methods
rg -A 5 "GET_RESULT|get_result|GetResult" cpp/cuopt_remote_server.cpp | head -60

Repository: NVIDIA/cuopt

Length of output: 1562


🏁 Script executed:

# Find the structure definitions for waiting threads and job tracker
rg -B 2 -A 10 "struct.*Waiter|struct.*JobInfo" cpp/cuopt_remote_server.cpp | head -80

Repository: NVIDIA/cuopt

Length of output: 675


🏁 Script executed:

# Check if there are any other operations on result_data after the tracker update
sed -n '85,150p' cpp/cuopt_remote_server.cpp | cat -n

Repository: NVIDIA/cuopt

Length of output: 2793


🏁 Script executed:

# Find the get_job_result function implementation
rg -B 2 -A 20 "get_job_result\(" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 1715


🏁 Script executed:

# Also search for any other place where job_tracker.result_data is accessed
rg "job_tracker.*result_data|result_data.*job_tracker" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 38


Move result_data after updating job_tracker to prevent data loss.

When a blocking waiter exists, result_data is moved into the waiter at line 96, leaving an empty vector to be stored in job_tracker at line 111. Subsequent GET_RESULT calls retrieve this empty vector from the tracker, returning no solution data even for completed jobs. Copy the data for the waiter instead.

Proposed fix
       {
         std::lock_guard<std::mutex> lock(waiters_mutex);
         auto wit = waiting_threads.find(job_id);
         if (wit != waiting_threads.end()) {
           // Wake up the waiting thread
-          auto waiter           = wit->second;
-          waiter->result_data   = std::move(result_data);
+          auto waiter = wit->second;
+          auto result_data_copy = result_data;
+          waiter->result_data   = std::move(result_data_copy);
           waiter->error_message = error_message;
           waiter->success       = success;
           waiter->ready         = true;
           waiter->cv.notify_one();
         }
       }
@@
         if (it != job_tracker.end()) {
           if (success) {
             it->second.status      = JobStatus::COMPLETED;
-            it->second.result_data = result_data;
+            it->second.result_data = std::move(result_data);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 // Check if there's a blocking waiter
 {
   std::lock_guard<std::mutex> lock(waiters_mutex);
   auto wit = waiting_threads.find(job_id);
   if (wit != waiting_threads.end()) {
     // Wake up the waiting thread
     auto waiter           = wit->second;
-    waiter->result_data   = std::move(result_data);
+    auto result_data_copy = result_data;
+    waiter->result_data   = std::move(result_data_copy);
     waiter->error_message = error_message;
     waiter->success       = success;
     waiter->ready         = true;
     waiter->cv.notify_one();
   }
 }

 // Update job tracker
 {
   std::lock_guard<std::mutex> lock(tracker_mutex);
   auto it = job_tracker.find(job_id);
   if (it != job_tracker.end()) {
     if (success) {
       it->second.status      = JobStatus::COMPLETED;
-      it->second.result_data = result_data;
+      it->second.result_data = std::move(result_data);
     } else if (cancelled) {
       it->second.status        = JobStatus::CANCELLED;
       it->second.error_message = error_message;
     } else {
       it->second.status        = JobStatus::FAILED;
       it->second.error_message = error_message;
     }
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 1188 - 1217, the code currently
moves result_data into the waiting thread (waiter->result_data =
std::move(result_data)), which empties it before you store it in job_tracker;
change the order or semantics so the tracker gets the real result: either copy
result_data into the waiter (use waiter->result_data = result_data instead of
std::move) and then move result_data into the tracker (it->second.result_data =
std::move(result_data)), or update the job_tracker first (it->second.result_data
= std::move(result_data)) and then set waiter->result_data =
it->second.result_data (copy). Update the block touching
waiting_threads/waiters_mutex or the block touching job_tracker/tracker_mutex
accordingly so waiter->result_data retains the full data and job_tracker stores
the result (addresses: waiting_threads, waiters_mutex, waiter->result_data,
job_tracker, tracker_mutex, JobStatus::COMPLETED).

Comment on lines 1829 to 1833
// Use RAII stdout streamer - captures stdout and streams to client while
// also echoing to server console. Destructor restores original stdout.
{
stdout_streamer_t streamer(client_fd, stream_logs);


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the handle_sync_solve function
fd -t f cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Search for handle_sync_solve function definition
rg -n "void handle_sync_solve" --type cpp

Repository: NVIDIA/cuopt

Length of output: 126


🏁 Script executed:

# Find stdout_streamer_t implementation
rg -n "class stdout_streamer_t|struct stdout_streamer_t" --type cpp --type hpp

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Read the handle_sync_solve function
sed -n '1808,1850p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 1551


🏁 Script executed:

# Search for stdout_streamer_t definition (broader search)
rg -n "stdout_streamer_t" cpp/cuopt_remote_server.cpp -A 5 -B 2

Repository: NVIDIA/cuopt

Length of output: 1438


🏁 Script executed:

# Search for stdout_streamer_t in all files
rg -n "class stdout_streamer_t|struct stdout_streamer_t"

Repository: NVIDIA/cuopt

Length of output: 113


🏁 Script executed:

# Read the full stdout_streamer_t class implementation
sed -n '133,200p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 1569


🏁 Script executed:

# Check the main server loop to understand concurrent request handling
rg -n "handle_sync_solve|handle_async_solve|main\(" cpp/cuopt_remote_server.cpp -A 3 -B 1 | head -60

Repository: NVIDIA/cuopt

Length of output: 547


🏁 Script executed:

# Search for any existing mutex or synchronization in server
rg -n "mutex|lock_guard|atomic|thread" cpp/cuopt_remote_server.cpp | head -20

Repository: NVIDIA/cuopt

Length of output: 1181


🏁 Script executed:

# Read the context around where handle_sync_solve is called (line 2113)
sed -n '2100,2120p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 743


🏁 Script executed:

# Find the function that contains line 2113
sed -n '2050,2115p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 2582


🏁 Script executed:

# Check server main loop and thread handling
sed -n '2138,2200p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 2509


🏁 Script executed:

# Find the function that contains the handle_sync_solve call
sed -n '1900,2115p' cpp/cuopt_remote_server.cpp | head -100

Repository: NVIDIA/cuopt

Length of output: 4499


🏁 Script executed:

# Search for the start of the function containing handle_sync_solve
rg -n "^void\s+\w+\(|^int\s+\w+\(" cpp/cuopt_remote_server.cpp | grep -B 1 "^1"

Repository: NVIDIA/cuopt

Length of output: 464


🏁 Script executed:

# Look for the accept() loop and threading model for handling clients
rg -n "accept\(|socket\(|listen\(" cpp/cuopt_remote_server.cpp -A 5 -B 2

Repository: NVIDIA/cuopt

Length of output: 785


🏁 Script executed:

# Read what happens after accept() in the main loop
sed -n '2259,2280p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 732


🏁 Script executed:

# Verify dup2 with STDOUT_FILENO only appears in stdout_streamer_t
rg -n "dup2.*STDOUT_FILENO" cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 237


🏁 Script executed:

# Check lines 591 and 706 to see if they also have concurrency issues
sed -n '580,600p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 808


🏁 Script executed:

# Check lines 700-715
sed -n '700,715p' cpp/cuopt_remote_server.cpp

Repository: NVIDIA/cuopt

Length of output: 659


Protect stdout/stderr redirection with a mutex to prevent concurrent client interference.
stdout_streamer_t performs process-wide dup2() calls at lines 156 and 175. Since handle_sync_solve is invoked from multiple detached threads (line 2274), concurrent clients can interfere: if Thread A redirects stdout to pipe_A and Thread B then redirects to pipe_B before A's destructor runs, logs leak across clients and stream restoration breaks. Add synchronization:

Proposed fix
 void handle_sync_solve(int client_fd,
                        const std::vector<uint8_t>& request_data,
                        bool is_mip,
                        bool stream_logs)
 {
+  static std::mutex stdout_redirect_mutex;
+  std::lock_guard<std::mutex> lock(stdout_redirect_mutex);
+
   std::string job_id = generate_job_id();
📝 Committable suggestion


Suggested change
// Use RAII stdout streamer - captures stdout and streams to client while
// also echoing to server console. Destructor restores original stdout.
{
stdout_streamer_t streamer(client_fd, stream_logs);
void handle_sync_solve(int client_fd,
const std::vector<uint8_t>& request_data,
bool is_mip,
bool stream_logs)
{
static std::mutex stdout_redirect_mutex;
std::lock_guard<std::mutex> lock(stdout_redirect_mutex);
std::string job_id = generate_job_id();
// Use RAII stdout streamer - captures stdout and streams to client while
// also echoing to server console. Destructor restores original stdout.
{
stdout_streamer_t streamer(client_fd, stream_logs);
🤖 Prompt for AI Agents
In `@cpp/cuopt_remote_server.cpp` around lines 1829 - 1833, The stdout/stderr
redirection in stdout_streamer_t (which calls dup2() in its
constructor/destructor) must be serialized to avoid cross-thread interference:
introduce a process-wide mutex (e.g., a static std::mutex stdout_redir_mutex)
and acquire a lock whenever performing the dup2-based redirection and when
restoring the original fds; specifically, lock inside stdout_streamer_t's
constructor before the dup2 that redirects stdout/stderr and keep the lock until
after the dup2 in the destructor that restores them (or alternatively lock
around the scope where stdout_streamer_t streamer(client_fd, stream_logs) is
constructed in handle_sync_solve), ensuring no two threads can perform the
dup2/restore concurrently.
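
A sketch of the constructor/destructor variant described above, assuming POSIX dup/dup2 and C++17; the class name is illustrative, and note the lock serializes redirected sections, so concurrent solves no longer overlap while streaming:

#include <mutex>
#include <unistd.h>

// RAII stdout redirection guarded by a process-wide mutex: dup2() swaps the
// file underneath every thread, so only one redirection may be active at once.
class locked_stdout_redirect_t {
 public:
  explicit locked_stdout_redirect_t(int target_fd) : lock_(mutex_)
  {
    saved_stdout_ = dup(STDOUT_FILENO);  // remember the real stdout
    dup2(target_fd, STDOUT_FILENO);      // redirect stdout to the client fd
  }

  ~locked_stdout_redirect_t()
  {
    dup2(saved_stdout_, STDOUT_FILENO);  // restore stdout first...
    close(saved_stdout_);
  }  // ...the lock is released afterward, when lock_ is destroyed

 private:
  static inline std::mutex mutex_;  // process-wide, matching dup2's scope
  std::unique_lock<std::mutex> lock_;
  int saved_stdout_{-1};
};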

Comment on lines +46 to +83
bool maximize = false;
cuopt_float_t objective_offset = 0.0;
std::vector<cuopt_float_t> objective_coefficients;

// Quadratic objective (optional)
std::vector<cuopt_float_t> Q_values;
std::vector<cuopt_int_t> Q_indices;
std::vector<cuopt_int_t> Q_offsets;

// Constraint matrix (CSR format)
std::vector<cuopt_float_t> A_values;
std::vector<cuopt_int_t> A_indices;
std::vector<cuopt_int_t> A_offsets;

// Constraint bounds (two representations)
std::vector<char> row_types; // '<', '>', '=' style
std::vector<cuopt_float_t> constraint_bounds; // single RHS for row_types style
std::vector<cuopt_float_t> constraint_lower_bounds; // ranged style
std::vector<cuopt_float_t> constraint_upper_bounds; // ranged style
bool uses_ranged_constraints = false;

// Variable bounds
std::vector<cuopt_float_t> variable_lower_bounds;
std::vector<cuopt_float_t> variable_upper_bounds;

// Variable types
std::vector<char> variable_types; // 'C' for continuous, 'I' for integer

/**
* @brief Create a data_model_view_t pointing to this CPU data.
*/
cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> create_view() const
{
cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> view;

view.set_maximize(maximize);
view.set_objective_offset(objective_offset);


⚠️ Potential issue | 🟠 Major

Remote CPU path drops objective scaling factor from MPS input.
Lines 46-83 and 330-333: problem_cpu_data_t doesn't store objective_scaling_factor, and the remote cuOptReadProblem path doesn't propagate it. If the MPS parser sets a scaling factor, a remote solve may compute objective values incorrectly. Add a field and pass it into the view.

🔧 Suggested fix
 struct problem_cpu_data_t {
   // Objective
   bool maximize                  = false;
   cuopt_float_t objective_offset = 0.0;
+  cuopt_float_t objective_scaling_factor = 1.0;
   std::vector<cuopt_float_t> objective_coefficients;
   ...
   cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> create_view() const
   {
     cuopt::linear_programming::data_model_view_t<cuopt_int_t, cuopt_float_t> view;

     view.set_maximize(maximize);
     view.set_objective_offset(objective_offset);
+    view.set_objective_scaling_factor(objective_scaling_factor);
     ...
   }
 };

 ...
     cpu_data.maximize         = mps.get_sense();
     cpu_data.objective_offset = mps.get_objective_offset();
+    cpu_data.objective_scaling_factor = mps.get_objective_scaling_factor();

Also applies to: 330-333

🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/cuopt_c.cpp` around lines 46 - 83,
problem_cpu_data_t is missing an objective_scaling_factor field and the remote
cuOptReadProblem path doesn't propagate it into the view; add a cuopt_float_t
objective_scaling_factor member to problem_cpu_data_t, populate it where
cuOptReadProblem constructs or copies CPU problem data in the remote path, and
modify create_view() (data_model_view_t<cuopt_int_t,cuopt_float_t>::create_view)
to call view.set_objective_scaling_factor(objective_scaling_factor) in addition
to set_maximize and set_objective_offset so the scaling from the MPS parser is
preserved end-to-end.

Comment on lines +88 to +100
if (problem_type == ProblemType::LP) {
auto* lp_req = req.mutable_lp_request();
if (!lp_req->ParseFromArray(data, static_cast<int>(size))) {
error_message = "SubmitJob: failed to parse LP request";
status_code = grpc::StatusCode::INVALID_ARGUMENT;
return false;
}
} else {
auto* mip_req = req.mutable_mip_request();
if (!mip_req->ParseFromArray(data, static_cast<int>(size))) {
error_message = "SubmitJob: failed to parse MIP request";
status_code = grpc::StatusCode::INVALID_ARGUMENT;
return false;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file around the mentioned lines
cat -n cpp/src/linear_programming/utilities/remote_solve_grpc.cpp | head -120 | tail -50

Repository: NVIDIA/cuopt

Length of output: 1867


🏁 Script executed:

# Check the full context of the submit_job function
wc -l cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 118


🏁 Script executed:

# Search for the SubmitJob function definition and any size checks
rg "SubmitJob" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 5 -A 30 | head -100

Repository: NVIDIA/cuopt

Length of output: 2181


🏁 Script executed:

# Look for protobuf includes and message definitions
rg "ParseFromArray" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 576


🏁 Script executed:

# Check for any existing size validation in the file
rg "size_t|INT_MAX|numeric_limits|size.*check|payload.*limit" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 1121


🏁 Script executed:

# Search for calls to submit_job
rg "submit_job\(" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 487


🏁 Script executed:

# Check public API headers to see if this function is exposed
fd "\.hpp$|\.h$" cpp/src/linear_programming/utilities/ | xargs grep -l "submit_job" 2>/dev/null

Repository: NVIDIA/cuopt

Length of output: 40


🏁 Script executed:

# Look for gRPC API definitions and how data is passed
rg "SubmitJobRequest|UploadJobRequest" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 2 -A 5 | head -60

Repository: NVIDIA/cuopt

Length of output: 2055


🏁 Script executed:

# Check protobuf documentation or usage patterns for ParseFromArray
web_search
Protocol Buffers ParseFromArray int parameter size overflow

Repository: NVIDIA/cuopt

Length of output: 153


🌐 Web query:

Protocol Buffers protobuf ParseFromArray size parameter limitations int overflow

💡 Result:

In protobuf C++, MessageLite::ParseFromArray(const void* data, int size) (and ParseFromString) are fundamentally limited by the fact that the size parameter and many internal length/byte-count fields are 32-bit signed ints. That implies:

  • Hard upper bound ≈ 2 GiB (2,147,483,647 bytes) for a single serialized message. Above this, the size can’t be represented safely as a positive int, and implementations commonly reject it or fail unpredictably due to integer overflow / negative lengths. This is the cross-language guideline documented by protobuf: serialized protos must be < 2 GiB. [1][2]
  • If you pass a value that overflowed into a negative int, parsing can fail with truncation/limit errors or other parse failures because internal limit logic assumes non-negative byte counts (a symptom noted by users when int length/limit math wraps). [3]
  • Separately, many runtimes impose smaller “defensive” size limits (commonly mentioned as ~64 MiB by default in some Google implementations) to avoid memory/DoS issues; these can apply especially when parsing from streams rather than raw arrays. [4][5]

Practical takeaway: if you’re anywhere near multi‑GiB payloads, protobuf is the wrong unit of framing—split into multiple messages/chunks, store the bulk data out-of-band (files/object store), and keep only metadata/refs in protobuf. [4]

Sources:
[1] Proto “Total Size of the Message” limit (<2 GiB) — protobuf.dev proto-limits: https://protobuf.dev/programming-guides/proto-limits/
[2] Encoded proto size limitations (<2 GiB) — protobuf.dev encoding guide: https://protobuf.dev/programming-guides/encoding/
[3] Discussion pointing to int overflow/negative limit behavior in parsing — Stack Overflow: https://stackoverflow.com/questions/55954550/failing-parsefrom-messages-in-java-from-c
[4] Explanation of 2 GB hard limit and typical smaller default limits — Stack Overflow: https://stackoverflow.com/questions/34128872/google-protobuf-maximum-size
[5] Protobuf group: ParseFromString == ParseFromArray(..., size) and “uses int32 for size/length → max 2G bytes” — Google Groups: https://groups.google.com/g/protobuf/c/7JG0VdfAfTQ


🏁 Script executed:

# Check the caller of submit_job - look at more context around line with submit_job call
rg "submit_job\(" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 10 -A 5

Repository: NVIDIA/cuopt

Length of output: 1224


🏁 Script executed:

# Check if there are protobuf limits or constraints defined
rg "max_bytes|kMaxMessageBytes|message.*size|ParseFromArray" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp | head -30

Repository: NVIDIA/cuopt

Length of output: 1428


🏁 Script executed:

# Check how similar operations handle the int cast elsewhere in the file
rg "static_cast<int>" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 275


🏁 Script executed:

# Look for proto file definition to understand message structure
fd "\.proto$" | xargs grep -l "SubmitJobRequest\|LPRequest\|MIPRequest" 2>/dev/null | head -5

Repository: NVIDIA/cuopt

Length of output: 175


🏁 Script executed:

# Check function declarations and public headers
rg "submit_job" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp | grep -E "^[^:]*\.cpp:|bool submit_job"

Repository: NVIDIA/cuopt

Length of output: 99


🏁 Script executed:

# Look for public header exposure
find cpp -name "*.hpp" -o -name "*.h" | xargs grep "submit_job" 2>/dev/null

Repository: NVIDIA/cuopt

Length of output: 40


🏁 Script executed:

# Verify the scope - is this function static or global?
rg "^(static )?bool submit_job" cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 3

Repository: NVIDIA/cuopt

Length of output: 106


🏁 Script executed:

# Check the proto file for message size constraints
cat cpp/src/linear_programming/utilities/cuopt_remote.proto | head -100

Repository: NVIDIA/cuopt

Length of output: 3018


Add size bounds check before ParseFromArray to prevent int overflow.

ParseFromArray accepts an int for size, but submit_job receives a size_t parameter without validating against std::numeric_limits<int>::max(). If size > INT_MAX, the cast causes silent integer overflow, leading to unpredictable parse failures. The caller validates against a configured max_bytes limit, but not against the protobuf hard limit. Add an explicit check:

  cuopt::remote::SubmitJobRequest req;

+  if (size > static_cast<size_t>(std::numeric_limits<int>::max())) {
+    error_message = "SubmitJob: request too large for protobuf ParseFromArray";
+    status_code   = grpc::StatusCode::RESOURCE_EXHAUSTED;
+    return false;
+  }
+
   if (problem_type == ProblemType::LP) {
     auto* lp_req = req.mutable_lp_request();
     if (!lp_req->ParseFromArray(data, static_cast<int>(size))) {
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp` around lines 88 -
100, The code calls ParseFromArray with a static_cast<int>(size) on size_t
without checking bounds; add an explicit check before parsing (for both branches
around req.mutable_lp_request() and req.mutable_mip_request()) that if size >
static_cast<size_t>(std::numeric_limits<int>::max()) then set error_message
(e.g., "SubmitJob: request too large for ParseFromArray"), set status_code =
grpc::StatusCode::INVALID_ARGUMENT and return false; ensure you `#include`
<limits> or <climits> as needed and reference ProblemType, ParseFromArray,
req.mutable_lp_request()/req.mutable_mip_request(), error_message and
status_code in the fix.
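
Since both the LP and MIP branches need the same narrowing check, one option is a small shared helper; bounded_parse is an illustrative name, and MessageLite::ParseFromArray(const void*, int) is the standard protobuf signature:

#include <cstddef>
#include <limits>

#include <google/protobuf/message_lite.h>

// Reject sizes that cannot be represented as a non-negative int before
// handing the buffer to protobuf; ParseFromArray takes an int.
bool bounded_parse(google::protobuf::MessageLite& msg, const void* data, std::size_t size)
{
  if (size > static_cast<std::size_t>(std::numeric_limits<int>::max())) {
    return false;  // caller maps this to INVALID_ARGUMENT / RESOURCE_EXHAUSTED
  }
  return msg.ParseFromArray(data, static_cast<int>(size));
}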

Comment on lines +597 to +614
void stream_logs_to_stdout(const std::string& address,
const std::string& job_id,
volatile bool* stop_flag,
const std::string& print_prefix)
{
auto stub = make_stub(address);
grpc::ClientContext ctx;
cuopt::remote::StreamLogsRequest req;
req.set_job_id(job_id);
req.set_from_byte(0);

std::unique_ptr<grpc::ClientReader<cuopt::remote::LogMessage>> reader =
stub->StreamLogs(&ctx, req);

cuopt::remote::LogMessage msg;
while (reader->Read(&msg)) {
if (stop_flag != nullptr && *stop_flag) { ctx.TryCancel(); }
if (!msg.line().empty()) {

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and the specific function
cat -n cpp/src/linear_programming/utilities/remote_solve_grpc.cpp | head -700 | tail -150

Repository: NVIDIA/cuopt

Length of output: 5697


🏁 Script executed:

# Let's get a better view of the function and its context
sed -n '580,650p' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp

Repository: NVIDIA/cuopt

Length of output: 2350


🏁 Script executed:

# Search for calls to stream_logs_to_stdout
rg 'stream_logs_to_stdout' -A 3 -B 3

Repository: NVIDIA/cuopt

Length of output: 2805


🏁 Script executed:

# Search for volatile usage in this file
rg 'volatile' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 3 -A 3

Repository: NVIDIA/cuopt

Length of output: 314


🏁 Script executed:

# Check how stop_flag is created and passed in the codebase
rg 'stop_flag' cpp/src/linear_programming/utilities/remote_solve_grpc.cpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 506


🏁 Script executed:

# Search for stop_logs declaration and usage in remote_solve.cu
rg 'stop_logs' cpp/src/linear_programming/utilities/remote_solve.cu -B 3 -A 3

Repository: NVIDIA/cuopt

Length of output: 3766


🏁 Script executed:

# Get the full context around the thread spawning in remote_solve.cu
sed -n '1,100p' cpp/src/linear_programming/utilities/remote_solve.cu | rg -B 10 -A 10 'stream_logs_to_stdout'

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Search more broadly for stop_logs in remote_solve.cu
grep -n 'stop_logs' cpp/src/linear_programming/utilities/remote_solve.cu

Repository: NVIDIA/cuopt

Length of output: 607


Use an atomic stop flag instead of volatile.

volatile bool* does not provide inter-thread synchronization and creates a data race when one thread writes the flag while another thread reads it. In cpp/src/linear_programming/utilities/remote_solve.cu, stop_logs is passed to a worker thread that continuously reads it while the main thread writes to it. Use std::atomic<bool> and call load() with appropriate memory ordering to ensure safe inter-thread communication.

Proposed fix
-void stream_logs_to_stdout(const std::string& address,
-                           const std::string& job_id,
-                           volatile bool* stop_flag,
-                           const std::string& print_prefix)
+void stream_logs_to_stdout(const std::string& address,
+                           const std::string& job_id,
+                           std::atomic<bool>* stop_flag,
+                           const std::string& print_prefix)
 {
   auto stub = make_stub(address);
   grpc::ClientContext ctx;
@@
-  while (reader->Read(&msg)) {
-    if (stop_flag != nullptr && *stop_flag) { ctx.TryCancel(); }
+  while (reader->Read(&msg)) {
+    if (stop_flag && stop_flag->load(std::memory_order_relaxed)) { ctx.TryCancel(); }
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/utilities/remote_solve_grpc.cpp` around lines 597
- 614, Replace the non-synchronized volatile flag with an atomic boolean: change
the parameter type of stream_logs_to_stdout from volatile bool* stop_flag to
std::atomic<bool>* stop_flag and replace reads of *stop_flag with
stop_flag->load(std::memory_order_acquire); update the writer side in
remote_solve.cu (the stop_logs variable and any writes) to use std::atomic<bool>
and stop_logs.store(true, std::memory_order_release) so the worker thread and
main thread synchronize safely; ensure all declarations and calls that pass
stop_logs are adjusted to the std::atomic<bool>* type.
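
A minimal runnable sketch of the writer/reader pairing the fix describes; the solve and gRPC details are elided, and the names mirror the review rather than the PR's exact code:

#include <atomic>
#include <chrono>
#include <thread>

int main()
{
  std::atomic<bool> stop_logs{false};

  // Reader: polls the flag between stream reads; acquire pairs with the
  // writer's release so the stop request is seen promptly and safely.
  std::thread log_thread([&stop_logs] {
    while (!stop_logs.load(std::memory_order_acquire)) {
      // ... read one log message from the gRPC stream, print it ...
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    // ... TryCancel the client context and drain remaining messages ...
  });

  // Writer: the main thread signals shutdown once the solve finishes.
  stop_logs.store(true, std::memory_order_release);
  log_thread.join();
  return 0;
}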

Comment on lines +383 to +447
// Helper struct to hold CPU copies of GPU data for remote solve
template <typename i_t, typename f_t>
struct cpu_problem_data_t {
std::vector<f_t> A_values;
std::vector<i_t> A_indices;
std::vector<i_t> A_offsets;
std::vector<f_t> constraint_bounds;
std::vector<f_t> constraint_lower_bounds;
std::vector<f_t> constraint_upper_bounds;
std::vector<f_t> objective_coefficients;
std::vector<f_t> variable_lower_bounds;
std::vector<f_t> variable_upper_bounds;
std::vector<char> variable_types;
std::vector<f_t> quadratic_objective_values;
std::vector<i_t> quadratic_objective_indices;
std::vector<i_t> quadratic_objective_offsets;
bool maximize;
f_t objective_scaling_factor;
f_t objective_offset;

data_model_view_t<i_t, f_t> create_view() const
{
data_model_view_t<i_t, f_t> v;
v.set_maximize(maximize);
v.set_objective_scaling_factor(objective_scaling_factor);
v.set_objective_offset(objective_offset);

if (!A_values.empty()) {
v.set_csr_constraint_matrix(A_values.data(),
A_values.size(),
A_indices.data(),
A_indices.size(),
A_offsets.data(),
A_offsets.size());
}
if (!constraint_bounds.empty()) {
v.set_constraint_bounds(constraint_bounds.data(), constraint_bounds.size());
}
if (!constraint_lower_bounds.empty() && !constraint_upper_bounds.empty()) {
v.set_constraint_lower_bounds(constraint_lower_bounds.data(), constraint_lower_bounds.size());
v.set_constraint_upper_bounds(constraint_upper_bounds.data(), constraint_upper_bounds.size());
}
if (!objective_coefficients.empty()) {
v.set_objective_coefficients(objective_coefficients.data(), objective_coefficients.size());
}
if (!variable_lower_bounds.empty()) {
v.set_variable_lower_bounds(variable_lower_bounds.data(), variable_lower_bounds.size());
}
if (!variable_upper_bounds.empty()) {
v.set_variable_upper_bounds(variable_upper_bounds.data(), variable_upper_bounds.size());
}
if (!variable_types.empty()) {
v.set_variable_types(variable_types.data(), variable_types.size());
}
if (!quadratic_objective_values.empty()) {
v.set_quadratic_objective_matrix(quadratic_objective_values.data(),
quadratic_objective_values.size(),
quadratic_objective_indices.data(),
quadratic_objective_indices.size(),
quadratic_objective_offsets.data(),
quadratic_objective_offsets.size());
}
v.set_is_device_memory(false);
return v;
}

⚠️ Potential issue | 🟠 Major

Preserve constraint row types when copying GPU views to CPU.
Lines 383-447 and 469-475: row_types are never captured in cpu_problem_data_t, nor copied from gpu_view. For models represented via row_types plus a single RHS, the remote solve loses the constraint sense and can produce invalid results. Add row_types storage and propagate it into the CPU view.

🔧 Suggested fix
 struct cpu_problem_data_t {
   std::vector<f_t> A_values;
   std::vector<i_t> A_indices;
   std::vector<i_t> A_offsets;
+  std::vector<char> row_types;
   std::vector<f_t> constraint_bounds;
   std::vector<f_t> constraint_lower_bounds;
   std::vector<f_t> constraint_upper_bounds;
   ...
   data_model_view_t<i_t, f_t> create_view() const
   {
     data_model_view_t<i_t, f_t> v;
     ...
+    if (!row_types.empty()) {
+      v.set_row_types(row_types.data(), row_types.size());
+    }
     if (!constraint_bounds.empty()) {
       v.set_constraint_bounds(constraint_bounds.data(), constraint_bounds.size());
     }
     ...
   }
 };

 ...
   copy_to_host(cpu_data.A_values, gpu_view.get_constraint_matrix_values());
   copy_to_host(cpu_data.A_indices, gpu_view.get_constraint_matrix_indices());
   copy_to_host(cpu_data.A_offsets, gpu_view.get_constraint_matrix_offsets());
+  copy_to_host(cpu_data.row_types, gpu_view.get_row_types());
   copy_to_host(cpu_data.constraint_bounds, gpu_view.get_constraint_bounds());

Also applies to: 469-475

🤖 Prompt for AI Agents
In `@cpp/src/mip/solve.cu` around lines 383 - 447, The cpu_problem_data_t is
missing storage and propagation for constraint row types, so add a member (e.g.,
std::vector<char> row_types or the appropriate type) to cpu_problem_data_t,
ensure any code that constructs/populates cpu_problem_data_t from gpu_view
copies gpu_view.row_types into that vector, and update
cpu_problem_data_t::create_view() to call
v.set_constraint_row_types(row_types.data(), row_types.size()) (in the same
manner as other constraint arrays) before returning; refer to
cpu_problem_data_t, create_view, row_types, gpu_view, and
set_constraint_row_types to locate where to add the field, copy it, and pass it
into the data_model_view_t.

Comment on lines +486 to +495
cudaMemcpyAsync(cpu_data.variable_types.data(),
var_types_span.data(),
var_types_span.size() * sizeof(char),
cudaMemcpyDeviceToHost,
stream);
}

// Synchronize to ensure all copies are complete
cudaStreamSynchronize(stream);


⚠️ Potential issue | 🔴 Critical

Add CUDA error checks around async copy and stream synchronization.

Lines 486-495: cudaMemcpyAsync and cudaStreamSynchronize lack error checking. Wrap with RAFT_CUDA_TRY to catch and propagate CUDA failures. Per coding guidelines, all CUDA memory operations must have error verification.

🔧 Suggested fix
-    cudaMemcpyAsync(cpu_data.variable_types.data(),
-                    var_types_span.data(),
-                    var_types_span.size() * sizeof(char),
-                    cudaMemcpyDeviceToHost,
-                    stream);
+    RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
+                                  var_types_span.data(),
+                                  var_types_span.size() * sizeof(char),
+                                  cudaMemcpyDeviceToHost,
+                                  stream));
   }

   // Synchronize to ensure all copies are complete
-  cudaStreamSynchronize(stream);
+  RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
📝 Committable suggestion


Suggested change
cudaMemcpyAsync(cpu_data.variable_types.data(),
var_types_span.data(),
var_types_span.size() * sizeof(char),
cudaMemcpyDeviceToHost,
stream);
}
// Synchronize to ensure all copies are complete
cudaStreamSynchronize(stream);
RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
var_types_span.data(),
var_types_span.size() * sizeof(char),
cudaMemcpyDeviceToHost,
stream));
}
// Synchronize to ensure all copies are complete
RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
🤖 Prompt for AI Agents
In `@cpp/src/mip/solve.cu` around lines 486 - 495, Wrap the cudaMemcpyAsync and
cudaStreamSynchronize calls with RAFT_CUDA_TRY to check for errors: replace the
direct call to cudaMemcpyAsync(cpu_data.variable_types.data(),
var_types_span.data(), var_types_span.size() * sizeof(char),
cudaMemcpyDeviceToHost, stream) with RAFT_CUDA_TRY(cudaMemcpyAsync(...)) and
similarly wrap cudaStreamSynchronize(stream) with
RAFT_CUDA_TRY(cudaStreamSynchronize(stream)); ensure you reference the same
variables (cpu_data.variable_types, var_types_span, stream) so failures in the
async copy or stream sync are caught and propagated per project guidelines.

Comment on lines 136 to 140
# Create cancel request
request = pb.AsyncRequest()
request.request_type = pb.CANCEL_JOB
request.job_id = job_id


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find proto files and locate job_id definition
find . -name "*.proto" -type f | head -20

Repository: NVIDIA/cuopt

Length of output: 179


🏁 Script executed:

# Search for job_id in proto files
rg -n "job_id" --glob '*.proto' -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 7093


🏁 Script executed:

# Examine the remote.py file around lines 136-140 and broader context
wc -l python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 107


🏁 Script executed:

# Read the relevant section of remote.py
sed -n '120,160p' python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 1377


🏁 Script executed:

# Check the cancel_job function signature and its imports
sed -n '1,50p' python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 1378


🏁 Script executed:

# Check the cancel_job function signature and parameters
rg -n "def cancel_job" -A 20 python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 842


🏁 Script executed:

# Check if job_id is used elsewhere in similar ways
rg -n "request.job_id" python/cuopt/cuopt/linear_programming/remote.py -B 5 -A 2

Repository: NVIDIA/cuopt

Length of output: 256


🏁 Script executed:

# Look for how job_id is received from API calls to understand its type
rg -n "job_id.*str\|str.*job_id" python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check the protobuf generated Python code to understand the actual field type
find . -name "cuopt_remote_pb2.py" -type f

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Examine what cuopt_remote_pb2 is and if it properly validates types
rg -n "class AsyncRequest" cpp/build/ 2>/dev/null | head -5

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Check if there's any type conversion or validation wrapper
rg -n "job_id" python/cuopt/cuopt/linear_programming/remote.py | head -20

Repository: NVIDIA/cuopt

Length of output: 182


🏁 Script executed:

# Look at the rest of the cancel_job function to see all job_id assignments
sed -n '75,170p' python/cuopt/cuopt/linear_programming/remote.py

Repository: NVIDIA/cuopt

Length of output: 3267


🏁 Script executed:

# Check if there are generated protobuf files in the repository or build directory
find . -name "*pb2.py" -type f 2>/dev/null | head -10

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

# Search for how job_id is handled in other functions in remote.py
sed -n '1,217p' python/cuopt/cuopt/linear_programming/remote.py | grep -n "job_id\|bytes\|encode" | head -30

Repository: NVIDIA/cuopt

Length of output: 182


🏁 Script executed:

# Check if there's any protobuf version specification or requirements
find . -name "requirements*.txt" -o -name "setup.py" -o -name "pyproject.toml" | xargs grep -l protobuf 2>/dev/null

Repository: NVIDIA/cuopt

Length of output: 40


🏁 Script executed:

# Look at how other functions handle similar proto assignments
rg -n "request\." python/cuopt/cuopt/linear_programming/remote.py | head -30

Repository: NVIDIA/cuopt

Length of output: 224


Fix type mismatch: job_id must be encoded to bytes.

The .proto file defines job_id as bytes in AsyncRequest (cuopt_remote.proto:285), but the function accepts str and assigns it directly without encoding, which raises a TypeError as soon as the string is assigned to the bytes field.

Suggested fix
-    request.job_id = job_id
+    request.job_id = (
+        job_id if isinstance(job_id, (bytes, bytearray)) else job_id.encode()
+    )
📝 Committable suggestion


Suggested change
# Create cancel request
request = pb.AsyncRequest()
request.request_type = pb.CANCEL_JOB
request.job_id = job_id
# Create cancel request
request = pb.AsyncRequest()
request.request_type = pb.CANCEL_JOB
request.job_id = (
job_id if isinstance(job_id, (bytes, bytearray)) else job_id.encode()
)
🤖 Prompt for AI Agents
In `@python/cuopt/cuopt/linear_programming/remote.py` around lines 136 - 140, The
AsyncRequest's job_id field is defined as bytes but the code assigns a str
directly; update the creation of pb.AsyncRequest in remote.py (where
request.request_type = pb.CANCEL_JOB and request.job_id = job_id) to ensure
job_id is bytes (e.g., set request.job_id = job_id.encode('utf-8') or
conditionally keep it if already bytes) so serialization no longer raises a
TypeError.

@anandhkb anandhkb added this to the 26.02 milestone Jan 26, 2026
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
conda/recipes/libcuopt/recipe.yaml (1)

152-175: Add grpc-cpp to run dependencies.

The build enables CUOPT_ENABLE_GRPC=ON and installs both cuopt_grpc_server and cuopt_remote_server executables to bin/. Both link gRPC::grpc++ and will fail at runtime without grpc-cpp. While protobuf is already listed in the run requirements (correctly mitigating the ignored run_exports), grpc-cpp is missing and must be added.

🤖 Fix all issues with AI agents
In `@ci/utils/install_protobuf_grpc.sh`:
- Around line 22-24: The Rocky Linux gRPC install line uses "dnf install -y
grpc-devel grpc-plugins || dnf install -y grpc-devel || true", which silences
failures and violates set -euo pipefail; update the dnf installation logic in
ci/utils/install_protobuf_grpc.sh so it does not end with "|| true": run the
first dnf install attempt, if it fails try the fallback install, and if that
also fails print a clear error via echo or >&2 with context ("gRPC packages
missing: grpc-devel/grpc-plugins") and exit non‑zero; ensure this explicit check
replaces the trailing "|| true" so the script fails fast on missing gRPC
packages.

Comment on lines +22 to +24
# gRPC C++ (headers/libs + grpc_cpp_plugin for codegen)
# Package names can vary by repo; try the common ones first.
dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat ci/utils/install_protobuf_grpc.sh

Repository: NVIDIA/cuopt

Length of output: 1751


🏁 Script executed:

# Search for patterns related to grpc_cpp_plugin in the repo
rg "grpc_cpp_plugin" --type py --type cpp --type bash

Repository: NVIDIA/cuopt

Length of output: 84


🏁 Script executed:

# Search for grpc_cpp_plugin references without type filter
rg "grpc_cpp_plugin" --type py --type cpp || rg "grpc_cpp_plugin"

Repository: NVIDIA/cuopt

Length of output: 580


🏁 Script executed:

# Verify the exact fallback logic and what packages grpc-plugins vs grpc-devel provide
# Check if there's a pattern of similar fallbacks in the script
rg "dnf install" ci/utils/install_protobuf_grpc.sh -A 1

Repository: NVIDIA/cuopt

Length of output: 238


Don't silently ignore gRPC install failures; the trailing || true breaks the script's fail-fast contract.

The script uses set -euo pipefail, but the Rocky Linux gRPC install ends with || true, bypassing this contract. This causes silent success even when gRPC packages are unavailable, deferring the error to a later CMake build failure with obscured context. The Ubuntu path correctly fails hard on missing packages; the Rocky path should do the same.

Replace || true with explicit error handling and validation:

🔧 Suggested fix
-        dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true
+        dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || {
+            echo "gRPC packages not available in enabled repos; aborting."
+            exit 1
+        }
+        command -v grpc_cpp_plugin >/dev/null || {
+            echo "grpc_cpp_plugin not found after install; aborting."
+            exit 1
+        }
📝 Committable suggestion


Suggested change
# gRPC C++ (headers/libs + grpc_cpp_plugin for codegen)
# Package names can vary by repo; try the common ones first.
dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || true
# gRPC C++ (headers/libs + grpc_cpp_plugin for codegen)
# Package names can vary by repo; try the common ones first.
dnf install -y grpc-devel grpc-plugins || dnf install -y grpc-devel || {
echo "gRPC packages not available in enabled repos; aborting."
exit 1
}
command -v grpc_cpp_plugin >/dev/null || {
echo "grpc_cpp_plugin not found after install; aborting."
exit 1
}
🤖 Prompt for AI Agents
In `@ci/utils/install_protobuf_grpc.sh` around lines 22 - 24, The Rocky Linux gRPC
install line uses "dnf install -y grpc-devel grpc-plugins || dnf install -y
grpc-devel || true", which silences failures and violates set -euo pipefail;
update the dnf installation logic in ci/utils/install_protobuf_grpc.sh so it
does not end with "|| true": run the first dnf install attempt, if it fails try
the fallback install, and if that also fails print a clear error via echo or >&2
with context ("gRPC packages missing: grpc-devel/grpc-plugins") and exit
non‑zero; ensure this explicit check replaces the trailing "|| true" so the
script fails fast on missing gRPC packages.

*
* @param is_device true if data is in GPU memory, false if in CPU memory
*/
void set_is_device_memory(bool is_device) noexcept { is_device_memory_ = is_device; }
Collaborator

Should this be

Suggested change
void set_is_device_memory(bool is_device) noexcept { is_device_memory_ = is_device; }
void set_device_memory(bool is_device) noexcept { is_device_memory_ = is_device; }

or enable_device_memory?

template class mps_data_model_t<int, float>;

template class mps_data_model_t<int, double>;
template class mps_data_model_t<int64_t, float>;
Collaborator

Do we support int64_t?

@@ -0,0 +1,245 @@
// SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Collaborator

Is this generated by gRPC or manually added?

Collaborator

@rgsl888prabhu rgsl888prabhu left a comment

Let's add test cases to check remote execution for both gRPC and HTTP.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cpp/CMakeLists.txt`:
- Around line 280-307: The CMake logic makes gRPC mandatory because it uses
find_package(gRPC REQUIRED) and never declares an option; change to declare an
option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC support" OFF) ) before the
find_package call, remove REQUIRED from find_package(gRPC) so it becomes
optional, and only set CUOPT_ENABLE_GRPC ON /
add_compile_definitions(CUOPT_ENABLE_GRPC) when gRPC is found and the option is
enabled (otherwise leave the option OFF and handle the error branch that checks
CUOPT_ENABLE_GRPC). Ensure you update the branches that reference gRPC_FOUND and
CUOPT_ENABLE_GRPC accordingly so gRPC truly remains optional.
🧹 Nitpick comments (1)
cpp/CMakeLists.txt (1)

267-279: Avoid hard‑coding CUDSS SONAMEs in fallbacks.

Using libcudss.so.0 / libcudss_mtlayer_gomp.so.0 risks breakage if the SONAME bumps. Consider find_library() or target-location fallback instead.

♻️ Possible fallback using find_library
 if(NOT CUDSS_LIB_FILE AND DEFINED cudss_LIBRARY_DIR)
-  set(CUDSS_LIB_FILE "${cudss_LIBRARY_DIR}/libcudss.so.0")
+  find_library(CUDSS_LIB_FILE NAMES cudss
+    PATHS "${cudss_LIBRARY_DIR}" NO_DEFAULT_PATH)
 endif()
 if(NOT CUDSS_MT_LIB_FILE AND DEFINED cudss_LIBRARY_DIR)
-  set(CUDSS_MT_LIB_FILE "${cudss_LIBRARY_DIR}/libcudss_mtlayer_gomp.so.0")
+  find_library(CUDSS_MT_LIB_FILE NAMES cudss_mtlayer_gomp
+    PATHS "${cudss_LIBRARY_DIR}" NO_DEFAULT_PATH)
 endif()

Comment on lines 280 to 312
# Protocol Buffers for remote solve serialization
#
# Use plain find_package so system installs (that don't ship ProtobufConfig.cmake)
# can fall back to CMake's built-in FindProtobuf module.
find_package(Protobuf REQUIRED)

# Normalize target names across Protobuf providers (config vs module).
if(TARGET Protobuf::libprotobuf AND NOT TARGET protobuf::libprotobuf)
add_library(protobuf::libprotobuf ALIAS Protobuf::libprotobuf)
endif()
if(TARGET Protobuf::protoc AND NOT TARGET protobuf::protoc)
add_executable(protobuf::protoc ALIAS Protobuf::protoc)
endif()
include_directories(${Protobuf_INCLUDE_DIRS})
include_directories(${CMAKE_CURRENT_BINARY_DIR})

# gRPC for remote solve service (optional, for gRPC-based implementation)
find_package(gRPC REQUIRED)
if(gRPC_FOUND)
message(STATUS "Found gRPC: ${gRPC_VERSION}")
set(CUOPT_ENABLE_GRPC ON)
add_compile_definitions(CUOPT_ENABLE_GRPC)
elseif(CUOPT_ENABLE_GRPC)
message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
else()
message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
set(CUOPT_ENABLE_GRPC OFF)
endif()

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's verify the file and lines in question
wc -l cpp/CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 78


🏁 Script executed:

# Read the specific lines mentioned in the review
sed -n '280,307p' cpp/CMakeLists.txt

Repository: NVIDIA/cuopt

Length of output: 1169


🏁 Script executed:

# Search for CUOPT_ENABLE_GRPC usage throughout the codebase
rg 'CUOPT_ENABLE_GRPC' --type cmake

Repository: NVIDIA/cuopt

Length of output: 713


🏁 Script executed:

# Check for any gRPC-related configuration
rg -i 'grpc' cpp/CMakeLists.txt -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 3496


🏁 Script executed:

# Check if there's any documentation or comments about optional vs required
rg -i 'optional' cpp/CMakeLists.txt -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 651


gRPC is effectively mandatory despite the optional-looking logic.

find_package(gRPC REQUIRED) terminates CMake configuration if gRPC is not found, preventing any fallback logic from running. Additionally, no option() declaration exists before this logic, so users cannot disable gRPC even if they want to. The code unconditionally forces CUOPT_ENABLE_GRPC ON whenever gRPC is detected, making "optional gRPC" impossible.

🐛 Suggested fix to respect the option and keep gRPC truly optional
-# gRPC for remote solve service (optional, for gRPC-based implementation)
-find_package(gRPC REQUIRED)
-if(gRPC_FOUND)
-  message(STATUS "Found gRPC: ${gRPC_VERSION}")
-  set(CUOPT_ENABLE_GRPC ON)
-  add_compile_definitions(CUOPT_ENABLE_GRPC)
-elseif(CUOPT_ENABLE_GRPC)
-  message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
-else()
-  message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
-  set(CUOPT_ENABLE_GRPC OFF)
-endif()
+# gRPC for remote solve service (optional, for gRPC-based implementation)
+option(CUOPT_ENABLE_GRPC "Enable gRPC-based remote solve" OFF)
+if(CUOPT_ENABLE_GRPC)
+  find_package(gRPC REQUIRED)
+else()
+  find_package(gRPC QUIET)
+endif()
+if(gRPC_FOUND)
+  message(STATUS "Found gRPC: ${gRPC_VERSION}")
+  add_compile_definitions(CUOPT_ENABLE_GRPC)
+elseif(CUOPT_ENABLE_GRPC)
+  message(FATAL_ERROR "CUOPT_ENABLE_GRPC=ON but gRPC was not found")
+else()
+  message(STATUS "CUOPT_ENABLE_GRPC=OFF, gRPC-based remote solve disabled")
+endif()
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 280 - 307, The CMake logic makes gRPC
mandatory because it uses find_package(gRPC REQUIRED) and never declares an
option; change to declare an option (e.g., option(CUOPT_ENABLE_GRPC "Enable gRPC
support" OFF) ) before the find_package call, remove REQUIRED from
find_package(gRPC) so it becomes optional, and only set CUOPT_ENABLE_GRPC ON /
add_compile_definitions(CUOPT_ENABLE_GRPC) when gRPC is found and the option is
enabled (otherwise leave the option OFF and handle the error branch that checks
CUOPT_ENABLE_GRPC). Ensure you update the branches that reference gRPC_FOUND and
CUOPT_ENABLE_GRPC accordingly so gRPC truly remains optional.

@@ -0,0 +1,24 @@
#!/bin/bash
Contributor Author

this script will be removed.

@@ -0,0 +1,2298 @@
/*
* SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights
Contributor Author

this will be removed; it's legacy code.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@ci/build_wheel_libcuopt.sh`:
- Around line 20-22: The called script ci/utils/install_protobuf_grpc.sh
currently lacks the execute bit; update its file mode so it is executable (e.g.,
add the execute permission via chmod +x on ci/utils/install_protobuf_grpc.sh)
and commit that mode change so the script shows as executable in the repo; keep
the existing call in ci/build_wheel_libcuopt.sh (bash
ci/utils/install_protobuf_grpc.sh) unchanged.

In `@python/libcuopt/CMakeLists.txt`:
- Around line 42-64: The FetchContent_Declare for grpc pins an insecure v1.51.1;
update the GIT_TAG in FetchContent_Declare(grpc ...) to at least v1.65.4
(preferably the current stable such as v1.76.0) to remediate the listed CVEs,
and re-run dependency tests; also avoid globally forcing BUILD_SHARED_LIBS via
the CACHE with FORCE (which affects all subsequent add_library/add_subdirectory
calls like add_subdirectory(../../cpp cuopt-cpp)); instead set BUILD_SHARED_LIBS
for grpc only (e.g., set it non-cached or scoped locally before
FetchContent_MakeAvailable and restore/unset it afterward) or explicitly control
shared/static for the downstream add_subdirectory to prevent unintended
propagation.
🧹 Nitpick comments (7)
ci/build_wheel_libcuopt.sh (1)

60-73: Consider whether gRPC/Protobuf libraries should be excluded.

If gRPC and Protobuf are linked dynamically, you may want to add them to EXCLUDE_ARGS (e.g., --exclude "libgrpc*.so*" and --exclude "libprotobuf*.so*") to reduce wheel size, assuming they're expected as system dependencies. If they're intentionally bundled for portability, this can be ignored.

python/libcuopt/CMakeLists.txt (1)

47-55: Consider scoping BUILD_SHARED_LIBS to avoid polluting global CMake state.

Setting BUILD_SHARED_LIBS ON with FORCE affects all subsequent targets in the CMake configuration, not just gRPC. If other dependencies fetched later expect static linking, this could cause issues.

♻️ Suggested approach using a scoped block
-set(gRPC_BUILD_TESTS OFF CACHE BOOL "" FORCE)
-set(gRPC_INSTALL OFF CACHE BOOL "" FORCE)
-set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
+set(gRPC_BUILD_TESTS OFF CACHE BOOL "" FORCE)
+set(gRPC_INSTALL OFF CACHE BOOL "" FORCE)
+# Save and restore BUILD_SHARED_LIBS to avoid affecting other targets
+set(_CUOPT_SAVED_BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS})
+set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
 set(gRPC_SSL_PROVIDER "package" CACHE STRING "" FORCE)
 ...
 FetchContent_MakeAvailable(grpc)
+set(BUILD_SHARED_LIBS ${_CUOPT_SAVED_BUILD_SHARED_LIBS} CACHE BOOL "" FORCE)
cpp/CMakeLists.txt (5)

229-264: Defensive ZLIB resolution is reasonable but complex; consider documenting the rationale.

The workaround to avoid broken system ZLIB configs is sensible, but the CMAKE_IGNORE_PATH manipulation (line 253) is aggressive and could cause issues if downstream projects legitimately need those paths for other packages. A brief comment explaining the specific failure scenario would help future maintainers.


293-294: Prefer target_include_directories over global include_directories.

Using global include_directories() pollutes the include path for all targets in this directory and subdirectories. Since protobuf headers are only needed by specific targets, consider using target_include_directories() on those targets instead.

♻️ Suggested change

Remove these global includes and add them to specific targets that need them:

-include_directories(${Protobuf_INCLUDE_DIRS})
-include_directories(${CMAKE_CURRENT_BINARY_DIR})

Then add to the cuopt target (around line 429):

target_include_directories(cuopt PRIVATE
  ${Protobuf_INCLUDE_DIRS}
  ${CMAKE_CURRENT_BINARY_DIR}
)

378-390: The else branch is currently dead code.

Since CUOPT_ENABLE_GRPC is unconditionally set to ON at line 304, the else branch (lines 385-390) will never execute. If gRPC is intended to always be required, consider removing the conditional to simplify the code. If optional gRPC support is planned for the future, add a comment explaining the intent.


731-741: protobuf::libprotobuf should likely be PRIVATE, not PUBLIC.

For cuopt_remote_server, protobuf is an implementation detail and shouldn't be propagated to consumers linking against this executable (which doesn't make sense anyway for an executable). Move it to PRIVATE for consistency with other targets.

♻️ Suggested fix
   target_link_libraries(cuopt_remote_server
     PUBLIC
     cuopt
     OpenMP::OpenMP_CXX
-    protobuf::libprotobuf
     PRIVATE
+    protobuf::libprotobuf
   )

757-799: Consider tracking the disabled test sources with an issue.

The if(FALSE) block preserves build rules for test executables whose sources don't exist yet. While the TODO comment explains this, tracking it in an issue would ensure it doesn't get forgotten.

Would you like me to help create an issue to track restoring remote_solve_test.cpp and local_solve_test.cpp?

Comment on lines +20 to +22
# Install Protobuf + gRPC (protoc + grpc_cpp_plugin)
bash ci/utils/install_protobuf_grpc.sh


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the install script exists and has proper structure
if [ -f "ci/utils/install_protobuf_grpc.sh" ]; then
    echo "=== Script exists ==="
    ls -la ci/utils/install_protobuf_grpc.sh
    echo ""
    echo "=== First 10 lines ==="
    head -10 ci/utils/install_protobuf_grpc.sh
else
    echo "ERROR: ci/utils/install_protobuf_grpc.sh not found"
    # Check if the utils directory exists
    if [ -d "ci/utils" ]; then
        echo "Directory ci/utils exists with contents:"
        ls -la ci/utils/
    else
        echo "Directory ci/utils does not exist"
    fi
fi

Repository: NVIDIA/cuopt

Length of output: 463


Script exists but lacks execute permissions.

The script ci/utils/install_protobuf_grpc.sh is present with correct error handling (set -euo pipefail) and structure. However, it has non-executable permissions (-rw-r--r--). Since the call uses bash explicitly, it will still execute, but add the execute bit for consistency with standard shell script practices.

🤖 Prompt for AI Agents
In `@ci/build_wheel_libcuopt.sh` around lines 20 - 22, The called script
ci/utils/install_protobuf_grpc.sh currently lacks the execute bit; update its
file mode so it is executable (e.g., add the execute permission via chmod +x on
ci/utils/install_protobuf_grpc.sh) and commit that mode change so the script
shows as executable in the repo; keep the existing call in
ci/build_wheel_libcuopt.sh (bash ci/utils/install_protobuf_grpc.sh) unchanged.

tmckayus and others added 7 commits January 26, 2026 23:38
Introduce CPU/remote-safe solution handling, gRPC service definitions, and remote solve client/server plumbing to support remote execution and log streaming while retaining local solve behavior.
Include gRPC packages in dependency manifests and regenerate environment files from dependencies.yaml.
Ensure incumbent callbacks are retained for gRPC workers, switch incumbent job IDs to bytes in the proto, and harden remote solve CPU-only paths and build configuration defaults.
rgsl888prabhu and others added 8 commits January 26, 2026 23:38
Drop cuopt_remote_server sources/scripts and the Python cancel helpers, and update docs/CMake to reflect gRPC-only server support.

Signed-off-by: Trevor McKay <tmgithub1@gmail.com>
Align remote execution with gRPC-only path, drop legacy async APIs,
refresh architecture docs, and fix protobuf target aliasing in Python
builds.
@copy-pr-bot

copy-pr-bot bot commented Jan 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@tmckayus
Contributor Author

/ok to test 754c449

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/linear_programming/solver_solution.cu (1)

320-341: Avoid size_t underflow on empty solution vectors

Lines 320-341: the loops use size() - 1. If any solution vector is empty, the unsigned subtraction wraps around and the loop reads out of bounds and will crash. This is more likely now that remote error paths can yield empty host vectors. Consider guarding for empties before iterating.

🛠️ Proposed fix (guard empty vectors)
-  if (!var_names_.empty() && generate_variable_values) {
+  if (generate_variable_values && !var_names_.empty() && !row_names_.empty() &&
+      !primal_solution.empty() && !dual_solution.empty() && !reduced_cost.empty()) {
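
A tiny standalone demonstration of the wrap the guard prevents (hypothetical vector, not the solver's data):

#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
  std::vector<double> v;  // empty, as on remote error paths

  // size() is unsigned, so size() - 1 wraps to SIZE_MAX instead of -1.
  std::printf("v.size() - 1 == %zu\n", v.size() - 1);

  // Guarding on empty() keeps the `i < size() - 1` loop from ever running
  // with a wrapped bound; `i + 1 < v.size()` is an equivalent wrap-free form.
  if (!v.empty()) {
    for (std::size_t i = 0; i < v.size() - 1; ++i) { /* pairwise access */ }
  }
  return 0;
}
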
🤖 Fix all issues with AI agents
In `@conda/recipes/libcuopt/recipe.yaml`:
- Line 32: The recipe enables BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC which
builds the cuopt_grpc_server binary but the package lacks an explicit grpc-cpp
runtime dependency and no test verifies the server is packaged; update the conda
recipe to declare grpc-cpp (grpc-cpp or grpc_cpp depending on channel naming) in
the runtime requirements section or confirm conda auto-detection will pull
grpc-cpp at runtime, and add a package_contents test that asserts the installed
cuopt_grpc_server binary exists (and is executable) to validate packaging; look
for references to BUILD_REMOTE_SERVER, CUOPT_ENABLE_GRPC, cuopt_grpc_server, and
package_contents to locate the exact spots to modify.

In `@cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp`:
- Around line 281-306: Add a runtime assertion in the implementations of
get_primal_solution_host(), get_dual_solution_host(), and
get_reduced_cost_host() to validate the documented precondition by calling
cuopt_assert(!is_device_memory_) before dereferencing the host vectors; locate
the definitions of these three methods and insert the assertion at the top of
each function to prevent returning host references when is_device_memory_ is
true.

In `@cpp/include/cuopt/linear_programming/solve.hpp`:
- Around line 112-127: Docstring for data_model_view_to_optimization_problem
incorrectly states the view points to GPU memory only; update the comment to say
data_model_view_t<i_t, f_t> may reference either CPU or GPU memory and that the
function copies data from the view (regardless of host or device residence) into
an owning optimization_problem_t<i_t, f_t>, preserving mention of the
raft::handle_t CUDA stream usage when device transfers are required; reference
the function name data_model_view_to_optimization_problem and the type
data_model_view_t in the updated description.

In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp`:
- Around line 39-53: get_remote_solve_config currently uses std::stoi which
permits partial parsing and out-of-range/negative values; replace that with a
strict parse that verifies the entire CUOPT_REMOTE_PORT string is numeric and
within 1–65535 before constructing remote_solve_config_t. Specifically, in
get_remote_solve_config() parse port using a function that exposes the end
pointer (e.g., std::strtol or std::stoll with checking), ensure no leftover
characters remain after the number, detect conversion errors/overflows, and
check the resulting integer is between 1 and 65535; if any check fails, return
std::nullopt.

In `@cpp/src/linear_programming/solve.cu`:
- Around line 1403-1415: In copy_view_to_cpu, check CUDA return codes for the
GPU→CPU transfer and synchronization: capture and validate the cudaMemcpyAsync
return value (for the call that copies var_types_span into
cpu_data.variable_types) and capture and validate the cudaStreamSynchronize
return value for stream; on error, log/propagate a descriptive error including
cudaGetErrorName()/cudaGetErrorString or return a failure status instead of
proceeding, so failures from cudaMemcpyAsync and cudaStreamSynchronize are not
ignored.

In `@cpp/src/linear_programming/solver_solution.cu`:
- Around line 530-534: get_solved_by_pdlp() and reporting are reading different
sources (solved_by_pdlp_ vs termination_stats_.solved_by_pdlp) which can
diverge; unify them by making the termination stats the single source of truth:
in set_termination_stats(const additional_termination_information_t& stats)
assign termination_stats_ = stats and also set solved_by_pdlp_ =
stats.solved_by_pdlp (or alternatively change get_solved_by_pdlp() to return
termination_stats_.solved_by_pdlp directly) so both the accessor
get_solved_by_pdlp(), reporting, and the stored flag remain in sync; update the
implementation of get_solved_by_pdlp() and set_termination_stats() accordingly
(references: set_termination_stats, termination_stats_, get_solved_by_pdlp,
solved_by_pdlp_).
- Around line 411-476: The device/host getters (get_primal_solution,
get_dual_solution, get_reduced_cost and their _host counterparts) currently
dereference unique_ptrs (primal_solution_, dual_solution_, reduced_cost_,
primal_solution_host_, dual_solution_host_, reduced_cost_host_) without
verifying the memory mode; add a precondition check at the top of each getter
that verifies is_device_memory_ is true for device getters and false for host
getters (use an existing assert/EXPECTS macro or throw a clear runtime_error) so
callers fail fast with a descriptive message instead of causing a null deref.
- Around line 175-217: When copying from other into host mode
(other.is_device_memory_ == false) ensure any existing GPU allocations are
released: before assigning host vectors in the else branch, if is_device_memory_
is true call handle_ptr->sync_stream() and reset the device unique_ptrs
(primal_solution_.reset(), dual_solution_.reset(), reduced_cost_.reset()) and
set is_device_memory_ = false so GPU buffers are freed; apply the same change in
the other copy-path around the code referenced at lines 508-527 (the same
primal_solution_/dual_solution_/reduced_cost_ reset logic).

In `@cpp/src/linear_programming/utilities/cuopt_remote_service.proto`:
- Around line 216-221: The Incumbent protobuf message defines job_id as bytes
but other messages use string; update the Incumbent message's job_id field from
bytes to string to maintain type consistency (modify the Incumbent message
declaration and its job_id field definition to use string job_id = 4), and then
regenerate any affected stubs to pick up the change.
- Around line 210-214: IncumbentRequest.job_id is declared as bytes while all
other messages (e.g., StatusRequest.job_id, GetResultRequest.job_id,
ResultChunk.job_id, DeleteRequest.job_id, CancelRequest.job_id,
WaitRequest.job_id, StreamLogsRequest.job_id) use string; change the field type
in IncumbentRequest from bytes to string to make the schema consistent, update
any code that relied on bytes to use string values, and regenerate protobuf/GRPC
stubs so consumers and servers use the unified string job_id type across
functions/messages.

In `@cpp/src/linear_programming/utilities/cuopt_remote.proto`:
- Around line 21-52: The proto defines index arrays as int32 which can truncate
64-bit indices; update the OptimizationProblem fields A_indices and A_offsets
and the QuadraticProblem fields Q_indices and Q_offsets from int32 to int64 to
match the IndexType enum (INT64) and the RequestHeader handling; after changing
the field types, regenerate protobuf stubs and ensure serializers/deserializers
map i_t to int64 (and still to int32 when IndexType==INT32) so
IndexType/RequestHeader and the fields (A_indices, A_offsets, Q_indices,
Q_offsets) remain consistent.

In `@cpp/src/linear_programming/utilities/cython_solve.cu`:
- Around line 293-304: The code currently forces lp_ret.solved_by_pdlp_ = false
which ignores the actual termination info; change this to propagate the real
flag from the termination info (e.g., set lp_ret.solved_by_pdlp_ =
term_info.solved_by_pdlp or the equivalent field on the termination/info object)
so the response reflects whether the solve was performed by PDLP; update any
mapping if the flag resides on a different object (e.g.,
solution.get_termination_info().solved_by_pdlp) and ensure response.lp_ret is
still moved as before.

In `@cpp/src/mip/solver_solution.cu`:
- Around line 147-169: The getters mip_solution_t<i_t,f_t>::get_solution and
mip_solution_t<i_t,f_t>::get_solution_host can dereference nullptr (solution_ or
solution_host_) depending on how the object was constructed; add defensive
precondition checks at the start of these methods (e.g., assert or throw
std::runtime_error) that verify the corresponding pointer is non-null and/or
that the object's memory mode (use is_device_memory() or equivalent) matches the
requested accessor, and include a clear error message referencing solution_ or
solution_host_; this prevents undefined behavior and guides callers to check the
memory mode before calling the wrong getter.

In `@cpp/tests/linear_programming/CMakeLists.txt`:
- Around line 55-68: Remove the duplicated CONDA_PREFIX block by keeping a
single conditional that checks ENV{CONDA_PREFIX}, sets _CUOPT_CONDA_PREFIX,
tests for the version.h path, calls target_include_directories(C_API_TEST BEFORE
PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids") if it exists, and then unsets
_CUOPT_CONDA_PREFIX; ensure only one copy of this logic remains (refer to the
repeated blocks that set _CUOPT_CONDA_PREFIX and call target_include_directories
for target C_API_TEST).
- Around line 31-44: Remove the duplicated CONDA_PREFIX include block: there are
two identical if(DEFINED ENV{CONDA_PREFIX}) ...
target_include_directories(c_api_tester BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") ... unset(_CUOPT_CONDA_PREFIX)
sections; keep only one. Locate the repeated block that checks for
"${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h" and ensure a
single conditional sets target_include_directories for the c_api_tester target
and unsets _CUOPT_CONDA_PREFIX once.

In `@docs/developer/SERIALIZATION_PLUGIN_GUIDE.md`:
- Around line 13-25: The markdown fenced code blocks for the ASCII diagram and
the example output are missing language tags and trigger markdownlint MD040;
update the two blocks that contain the ASCII diagram (the box with
serialize_lp_request() and serialize_lp_solution()) and the example output block
that starts with "[remote_solve] Loading custom serializer..." to use ```text
fence markers instead of plain ``` so both the diagram and the output snippet
are annotated as text; locate the blocks by searching for the ASCII diagram
containing serialize_lp_request/serialize_lp_solution and the output snippet
beginning "[remote_solve] Loading custom serializer" and add the language tag to
each fenced block.

In `@python/cuopt/cuopt/utilities/__init__.py`:
- Around line 11-25: Add a module-level __all__ list that includes "type_cast",
"series_from_buf", and "check_solution" so star-imports continue to expose these
names, and optionally implement __dir__ to return sorted(__all__ +
list(globals().keys())) to surface them in autocomplete; keep the existing
lazy-loader __getattr__ unchanged so the actual imports still occur on first
access (referencing __getattr__, type_cast, series_from_buf, check_solution).
🧹 Nitpick comments (15)
python/cuopt/cuopt/linear_programming/internals/internals.pyx (1)

52-57: Non-owning view: document or consider memory lifetime.

np.ctypeslib.as_array returns a view that does not own the underlying memory. If the original buffer at data is freed while the returned numpy array is still in use, this results in undefined behavior (use-after-free).

This is likely acceptable given the low-level nature of this callback interface and the PR's purpose (handling CPU memory for remote solves), but consider:

  1. Adding a brief docstring clarifying that the caller must ensure the underlying memory remains valid for the lifetime of the returned array.
  2. Alternatively, if a safe copy is ever needed, np.ctypeslib.as_array(buf).copy() would provide ownership.
conda/environments/all_cuda-131_arch-x86_64.yaml (1)

28-28: Consider adding version constraints for grpc-cpp and libprotobuf in dependencies.yaml.

These dependencies are unversioned in dependencies.yaml, which is consistent with other build dependencies in the build_cpp section but may impact reproducibility. gRPC and Protobuf have known ABI compatibility concerns across major versions, and explicit version constraints would help prevent unexpected breakages from upstream changes.

Since this file is auto-generated from dependencies.yaml, ensure version constraints are added there (e.g., grpc-cpp>=1.x, libprotobuf>=x.y).

docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md (1)

133-140: Consider adding guidance on when to avoid calling to_host().

The performance section mentions overhead is negligible for typical problems. Consider adding a note that for very large problems (millions of variables), users should be mindful of the memory bandwidth cost and avoid unnecessary to_host() calls in tight loops.
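
As a sketch of the pattern to recommend (the types and the to_host() signature below are stand-ins, not the actual cuOpt API):

💡 Hoisting the copy out of the loop (illustrative stand-in types)
#include <vector>

struct HostSolution { std::vector<double> values; };
struct Solution {
  // Stand-in for the to_host() conversion described in this doc:
  HostSolution to_host() const { return HostSolution{}; }
};

void inspect(const HostSolution&) {}

void check_many(const Solution& solution, int n_checks)
{
  // Anti-pattern: a device-to-host copy on every iteration.
  //   for (int i = 0; i < n_checks; ++i) inspect(solution.to_host());

  // Better: convert once and reuse the host-side copy.
  const HostSolution host_sol = solution.to_host();
  for (int i = 0; i < n_checks; ++i) inspect(host_sol);
}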

GRPC_ARCHITECTURE.md (1)

1-82: Good architecture documentation for the gRPC remote solve system.

The document clearly explains the service definition, components, data flow, and build integration. One minor issue flagged by static analysis:

📝 Fix: Remove stray empty code fence at end of file

Line 82 has an unclosed/empty code fence. The trailing triple backticks appear to be unintended (the document ends with the security notes section, not a code block). Consider removing them:

 - gRPC provides HTTP/2 framing, flow control, and standard status codes.
-```
cpp/include/cuopt/linear_programming/mip/solver_solution.hpp (1)

173-180: Consider consolidating duplicate accessor methods.

get_nodes() and get_simplex_iterations() (lines 175, 180) appear to duplicate the existing get_num_nodes() (line 100) and get_num_simplex_iterations() (line 101) methods. Consider using only one naming convention to avoid API confusion.

♻️ Suggested consolidation

Either remove the new methods and use existing ones in the serializer, or deprecate the old methods:

-  /**
-   * `@brief` Get number of nodes
-   */
-  i_t get_nodes() const;
-
-  /**
-   * `@brief` Get number of simplex iterations
-   */
-  i_t get_simplex_iterations() const;

The protobuf serializer at cpp/src/linear_programming/utilities/protobuf_serializer.cu could use the existing get_num_nodes() and get_num_simplex_iterations() methods instead.

cpp/src/mip/solver_solution.cu (1)

398-408: Duplicate methods: get_nodes() and get_simplex_iterations().

These methods duplicate get_num_nodes() (line 255) and get_num_simplex_iterations() (line 261) with identical implementations. Consider removing the duplicates to avoid API confusion.

build.sh (2)

419-425: Silent cache invalidation may surprise users.

The cache invalidation when system ZLIB is detected in CMakeCache.txt could be confusing to users who expect their build cache to persist. Consider logging a message when this happens.

💡 Suggested improvement
     if [ -f "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt" ]; then
         if grep -Eq "/(usr/)?lib64/cmake/ZLIB" "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"; then
+            echo "Note: Clearing CMake cache due to system ZLIB references"
             rm -f "${LIBCUOPT_BUILD_DIR}/CMakeCache.txt"
             rm -rf "${LIBCUOPT_BUILD_DIR}/CMakeFiles"
         fi
     fi

521-534: Consider extracting duplicated SKBUILD argument handling into a function.

The SKBUILD_EXTRA_ARGS construction logic (escaping semicolons, joining, building SKBUILD_CMAKE_ARGS) is duplicated between the cuopt build (lines 521-534) and cuopt_mps_parser build (lines 542-553). Extracting this into a helper function would improve maintainability.

python/cuopt/cuopt/linear_programming/solver/solver_wrapper.pyx (2)

82-91: Consider caching lazy imports.

_get_cuda_imports() performs module imports on each call. While Python caches module imports, the function call overhead is repeated. For frequently called code paths, consider caching at module level after first import.

💡 Optional caching pattern
_cuda_imports_cache = None

def _get_cuda_imports():
    """Lazily import CUDA-dependent modules."""
    global _cuda_imports_cache
    if _cuda_imports_cache is None:
        import cupy as cp
        import cudf
        from cuopt.utilities import series_from_buf
        _cuda_imports_cache = (cp, cudf, series_from_buf)
    return _cuda_imports_cache

488-498: Document warm start limitation for remote solves in user-facing API.

When using remote solve, warm start data is unavailable (empty arrays). Consider adding documentation or a warning when users configure warm start but are using remote solve, to prevent confusion.

cpp/src/linear_programming/utilities/remote_solve.cu (4)

56-84: CUDA error handling uses raw checks instead of CUDA_CHECK macro.

Per coding guidelines, CUDA kernel launches and memory operations should use CUDA_CHECK or equivalent verification. The current implementation uses raw cudaError_t checks which is functional but inconsistent with the codebase's error checking patterns.
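
For illustration, the checked pattern looks like this; a local macro stands in for whatever CUDA_CHECK / RAFT_CUDA_TRY macro the codebase standardizes on.

💡 Checked CUDA calls (generic sketch)
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Local stand-in for the codebase's CUDA_CHECK / RAFT_CUDA_TRY macro:
#define CUDA_CHECK(call)                                              \
  do {                                                                \
    cudaError_t err_ = (call);                                        \
    if (err_ != cudaSuccess) {                                        \
      std::fprintf(stderr, "CUDA error %s: %s\n",                     \
                   cudaGetErrorName(err_), cudaGetErrorString(err_)); \
      std::exit(EXIT_FAILURE);                                        \
    }                                                                 \
  } while (0)

int main()
{
  void* buf = nullptr;
  CUDA_CHECK(cudaMalloc(&buf, 1024));
  CUDA_CHECK(cudaMemset(buf, 0, 1024));
  CUDA_CHECK(cudaFree(buf));
  return 0;
}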


198-203: Log thread uses volatile bool for cross-thread signaling.

The volatile bool stop_logs is used for signaling the log thread to stop. While this works in practice, std::atomic<bool> would be the modern C++ approach for cross-thread communication with proper memory ordering guarantees.

💡 Modern alternative
std::atomic<bool> stop_logs{false};
// ... 
stop_logs.store(true);
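
A complete minimal example of the signaling pattern, with a placeholder body where the actual gRPC log drain would run:

#include <atomic>
#include <chrono>
#include <thread>

int main()
{
  std::atomic<bool> stop_logs{false};

  // Worker polls until signaled; the sleep stands in for the log read.
  std::thread log_thread([&stop_logs] {
    while (!stop_logs.load(std::memory_order_acquire)) {
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
    }
  });

  // ... solve runs here ...
  stop_logs.store(true, std::memory_order_release);
  log_thread.join();
  return 0;
}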

184-188: Consider consistent logging instead of mixing std::cerr and CUOPT_LOG_*.

Error output uses std::cerr (lines 184-185) while other logging uses CUOPT_LOG_INFO. Consider using CUOPT_LOG_ERROR for consistency with the rest of the codebase's logging infrastructure.

💡 Suggested change
   if (!grpc_remote::submit_or_upload(...)) {
-    std::cerr << "[remote_solve] UploadAndSubmit failed: " << err << "\n";
-    std::cerr.flush();
+    CUOPT_LOG_ERROR("[remote_solve] UploadAndSubmit failed: %s", err.c_str());
     return optimization_problem_solution_t<i_t, f_t>(...);
   }

426-451: Final incumbent drain logic is reasonable but could be documented.

The post-completion drain loop (5 iterations, 50ms sleep between retries if empty) handles race conditions where incumbents may arrive slightly after completion status. Consider adding a brief comment explaining this timing window.
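
For example, the comment plus the loop shape being described might look like this (fetch_pending_incumbents and deliver are hypothetical stand-ins for the gRPC client calls):

💡 Documented drain loop (illustrative stand-ins)
#include <chrono>
#include <string>
#include <thread>
#include <vector>

struct Incumbent { double objective = 0.0; };

// Hypothetical stand-ins for the actual gRPC client calls:
std::vector<Incumbent> fetch_pending_incumbents(const std::string&) { return {}; }
void deliver(const Incumbent&) {}

// After the job reports completion, incumbents can still arrive for a short
// window; poll a few more times (sleeping briefly after an empty read) so
// late incumbents are not dropped.
void drain_final_incumbents(const std::string& job_id)
{
  for (int attempt = 0; attempt < 5; ++attempt) {
    const auto batch = fetch_pending_incumbents(job_id);
    for (const auto& inc : batch) { deliver(inc); }
    if (batch.empty()) {
      std::this_thread::sleep_for(std::chrono::milliseconds(50));
    }
  }
}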

cpp/cuopt_cli.cpp (1)

113-213: Consolidate data_model_view construction helpers.

create_view_from_mps_data_model duplicates logic already present in cpp/src/linear_programming/solve.cu and cpp/src/linear_programming/cuopt_c.cpp. Consider extracting a shared utility to avoid drift and keep CPU view construction consistent. Based on learnings, refactor duplicated solver utilities.

set +x
-./build.sh -n -v ${BUILD_EXTRA_FLAGS} libmps_parser libcuopt deb --allgpuarch --cmake-args=\"-DCMAKE_INSTALL_LIBDIR=lib\"
+./build.sh -n -v ${BUILD_EXTRA_FLAGS} libmps_parser libcuopt deb --allgpuarch --cmake-args=\"-DCMAKE_INSTALL_LIBDIR=lib -DBUILD_REMOTE_SERVER=ON -DCUOPT_ENABLE_GRPC=ON\"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for cuopt_grpc_server references in CMakeLists.txt files
find . -name "CMakeLists.txt" | head -20

Repository: NVIDIA/cuopt

Length of output: 769


🏁 Script executed:

# Look for cuopt_grpc_server in the repository
rg -n "cuopt_grpc_server" --type-list | head
rg -n "cuopt_grpc_server" .

Repository: NVIDIA/cuopt

Length of output: 3251


🏁 Script executed:

# Examine the conda recipe file for libcuopt
cat -n conda/recipes/libcuopt/recipe.yaml | head -200

Repository: NVIDIA/cuopt

Length of output: 7865


🏁 Script executed:

tail -n +189 conda/recipes/libcuopt/recipe.yaml | head -100

Repository: NVIDIA/cuopt

Length of output: 1278


Verify gRPC runtime dependencies and packaging of cuopt_grpc_server binary.

Line 32 enables BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC, and build.sh/CMakeLists.txt confirm that cuopt_grpc_server is built and installed to the package. However, the libcuopt output package does not explicitly declare grpc-cpp as a runtime dependency (only libprotobuf), and the package_contents tests do not verify the server binary is actually present. Ensure the grpc-cpp runtime dependency is properly declared or rely on conda's auto-detection; also add a test to verify cuopt_grpc_server is packaged.

Also applies to: 63-74, 153-176

🤖 Prompt for AI Agents
In `@conda/recipes/libcuopt/recipe.yaml` at line 32, The recipe enables
BUILD_REMOTE_SERVER and CUOPT_ENABLE_GRPC which builds the cuopt_grpc_server
binary but the package lacks an explicit grpc-cpp runtime dependency and no test
verifies the server is packaged; update the conda recipe to declare grpc-cpp
(grpc-cpp or grpc_cpp depending on channel naming) in the runtime requirements
section or confirm conda auto-detection will pull grpc-cpp at runtime, and add a
package_contents test that asserts the installed cuopt_grpc_server binary exists
(and is executable) to validate packaging; look for references to
BUILD_REMOTE_SERVER, CUOPT_ENABLE_GRPC, cuopt_grpc_server, and package_contents
to locate the exact spots to modify.

Comment on lines +281 to +306
  /**
   * @brief Returns the primal solution in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the primal solution.
   */
  std::vector<f_t>& get_primal_solution_host();
  const std::vector<f_t>& get_primal_solution_host() const;

  /**
   * @brief Returns the dual solution in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the dual solution.
   */
  std::vector<f_t>& get_dual_solution_host();
  const std::vector<f_t>& get_dual_solution_host() const;

  /**
   * @brief Returns the reduced cost in host (CPU) memory.
   * Only valid when is_device_memory() returns false.
   *
   * @return std::vector<f_t>& The host memory container for the reduced cost.
   */
  std::vector<f_t>& get_reduced_cost_host();
  const std::vector<f_t>& get_reduced_cost_host() const;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the implementation file for solver_solution
find . -type f -name "solver_solution.cpp" | head -10

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Also check if there's an inline implementation in the header
cat -n cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp | sed -n '281,350p'

Repository: NVIDIA/cuopt

Length of output: 2902


🏁 Script executed:

#!/bin/bash
# Search for the implementations of these accessor methods
rg -n "get_primal_solution_host|get_dual_solution_host|get_reduced_cost_host" --type cpp cpp/src/ -A 8 | head -80

Repository: NVIDIA/cuopt

Length of output: 1910


🏁 Script executed:

#!/bin/bash
# Search for solver_solution class in all cpp files
find cpp/src -type f -name "*.cpp" | xargs rg -l "get_primal_solution_host" | head -5

Repository: NVIDIA/cuopt

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Look for pdlp directory structure
ls -la cpp/src/linear_programming/ | grep pdlp

Repository: NVIDIA/cuopt

Length of output: 386


🏁 Script executed:

#!/bin/bash
# Search for the class definition and implementation more broadly
rg -n "class.*solver_solution" cpp/include/ -A 10

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Look for template implementations or inline implementations
rg -n "get_primal_solution_host\(\)" cpp/src/ -B 2 -A 5 | head -100

Repository: NVIDIA/cuopt

Length of output: 3586


🏁 Script executed:

#!/bin/bash
# Get complete implementations of all three host accessors
rg -n "get_primal_solution_host|get_dual_solution_host|get_reduced_cost_host" cpp/src/linear_programming/solver_solution.cu -A 4 | grep -E "^[0-9]+-" | head -50

Repository: NVIDIA/cuopt

Length of output: 369


🏁 Script executed:

#!/bin/bash
# Check if there's an is_device_memory_ member and how it's used
rg -n "is_device_memory_" cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp -B 2 -A 2

Repository: NVIDIA/cuopt

Length of output: 218


🏁 Script executed:

#!/bin/bash
# Check for any assertions in the solver_solution.cu file
rg -n "assert|ASSERT|cuopt_assert" cpp/src/linear_programming/solver_solution.cu | head -20

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Look at the broader context of the implementation file to understand assertion patterns
head -100 cpp/src/linear_programming/solver_solution.cu

Repository: NVIDIA/cuopt

Length of output: 4281


Add runtime assertions to host accessor implementations to validate preconditions.

The implementations of get_primal_solution_host(), get_dual_solution_host(), and get_reduced_cost_host() lack any checks for the documented precondition "Only valid when is_device_memory() returns false." Currently these methods simply dereference the host vector pointers without validating the is_device_memory_ flag. Add cuopt_assert(!is_device_memory_) to catch misuse during development and prevent undefined behavior when callers use the wrong accessor for the memory location.
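
A minimal self-contained analog of the requested guard; the real implementation would use cuopt_assert and the class's actual members, so the class below is only a sketch with member names mirroring this review.

💡 Precondition guard (self-contained analog)
#include <stdexcept>
#include <vector>

class solution_like
{
 public:
  std::vector<double>& get_primal_solution_host()
  {
    // Stand-in for cuopt_assert(!is_device_memory_, ...): fail fast with a
    // clear message instead of dereferencing the wrong container.
    if (is_device_memory_) {
      throw std::runtime_error(
        "get_primal_solution_host() called on a device-memory solution; "
        "use get_primal_solution() instead.");
    }
    return primal_solution_host_;
  }

 private:
  bool is_device_memory_ = false;
  std::vector<double> primal_solution_host_;
};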

🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/pdlp/solver_solution.hpp` around lines
281 - 306, Add a runtime assertion in the implementations of
get_primal_solution_host(), get_dual_solution_host(), and
get_reduced_cost_host() to validate the documented precondition by calling
cuopt_assert(!is_device_memory_) before dereferencing the host vectors; locate
the definitions of these three methods and insert the assertion at the top of
each function to prevent returning host references when is_device_memory_ is
true.

Comment on lines +112 to +127
/**
 * @brief Convert a data_model_view_t to an optimization_problem_t.
 *
 * This function copies data from the view (which points to GPU memory)
 * into an owning optimization_problem_t.
 *
 * @tparam i_t Data type of indexes
 * @tparam f_t Data type of the variables and their weights in the equations
 *
 * @param[in] handle_ptr A raft::handle_t object with its corresponding CUDA stream.
 * @param[in] view A data_model_view_t<i_t, f_t> object with spans pointing to GPU memory
 * @return optimization_problem_t<i_t, f_t> owning container for the problem
 */
template <typename i_t, typename f_t>
optimization_problem_t<i_t, f_t> data_model_view_to_optimization_problem(
  raft::handle_t const* handle_ptr, const data_model_view_t<i_t, f_t>& view);

⚠️ Potential issue | 🟡 Minor

Clarify doc: view may be CPU or GPU memory.

Line 115 describes GPU-only memory, but the view overloads are used with CPU views too (local CPU→GPU path). Update the doc to reflect both cases.

✏️ Suggested doc tweak
- * This function copies data from the view (which points to GPU memory)
+ * This function copies data from the view (which may point to GPU or CPU memory)
🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/solve.hpp` around lines 112 - 127,
Docstring for data_model_view_to_optimization_problem incorrectly states the
view points to GPU memory only; update the comment to say data_model_view_t<i_t,
f_t> may reference either CPU or GPU memory and that the function copies data
from the view (regardless of host or device residence) into an owning
optimization_problem_t<i_t, f_t>, preserving mention of the raft::handle_t CUDA
stream usage when device transfers are required; reference the function name
data_model_view_to_optimization_problem and the type data_model_view_t in the
updated description.

Comment on lines +39 to +53
inline std::optional<remote_solve_config_t> get_remote_solve_config()
{
  const char* host = std::getenv("CUOPT_REMOTE_HOST");
  const char* port = std::getenv("CUOPT_REMOTE_PORT");

  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
    try {
      int port_num = std::stoi(port);
      return remote_solve_config_t{std::string(host), port_num};
    } catch (...) {
      // Invalid port number, fall back to local solve
      return std::nullopt;
    }
  }
  return std::nullopt;

⚠️ Potential issue | 🟡 Minor

Validate CUOPT_REMOTE_PORT is numeric and within 1–65535.

std::stoi accepts partial strings and allows negative/zero values. Add a strict parse and bounds check to prevent invalid configs from being treated as enabled.

✏️ Suggested fix
-  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
+  if (host != nullptr && port != nullptr && host[0] != '\0' && port[0] != '\0') {
     try {
-      int port_num = std::stoi(port);
-      return remote_solve_config_t{std::string(host), port_num};
+      std::string port_str(port);
+      size_t idx = 0;
+      int port_num = std::stoi(port_str, &idx);
+      if (idx != port_str.size() || port_num <= 0 || port_num > 65535) {
+        return std::nullopt;
+      }
+      return remote_solve_config_t{std::string(host), port_num};
     } catch (...) {
       // Invalid port number, fall back to local solve
       return std::nullopt;
     }
   }
🤖 Prompt for AI Agents
In `@cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp` around lines
39 - 53, get_remote_solve_config currently uses std::stoi which permits partial
parsing and out-of-range/negative values; replace that with a strict parse that
verifies the entire CUOPT_REMOTE_PORT string is numeric and within 1–65535
before constructing remote_solve_config_t. Specifically, in
get_remote_solve_config() parse port using a function that exposes the end
pointer (e.g., std::strtol or std::stoll with checking), ensure no leftover
characters remain after the number, detect conversion errors/overflows, and
check the resulting integer is between 1 and 65535; if any check fails, return
std::nullopt.

Comment on lines +1403 to +1415
  auto var_types_span = gpu_view.get_variable_types();
  if (var_types_span.size() > 0) {
    cpu_data.variable_types.resize(var_types_span.size());
    cudaMemcpyAsync(cpu_data.variable_types.data(),
                    var_types_span.data(),
                    var_types_span.size() * sizeof(char),
                    cudaMemcpyDeviceToHost,
                    stream);
  }

  // Synchronize to ensure all copies are complete
  cudaStreamSynchronize(stream);


⚠️ Potential issue | 🟠 Major

Check CUDA errors for GPU→CPU copies in copy_view_to_cpu.

cudaMemcpyAsync and cudaStreamSynchronize are unchecked; failures will silently corrupt the serialized remote payload. Please add CUDA error checks as per guidelines.

🐛 Proposed fix
   if (var_types_span.size() > 0) {
     cpu_data.variable_types.resize(var_types_span.size());
-    cudaMemcpyAsync(cpu_data.variable_types.data(),
-                    var_types_span.data(),
-                    var_types_span.size() * sizeof(char),
-                    cudaMemcpyDeviceToHost,
-                    stream);
+    RAFT_CUDA_TRY(cudaMemcpyAsync(cpu_data.variable_types.data(),
+                                  var_types_span.data(),
+                                  var_types_span.size() * sizeof(char),
+                                  cudaMemcpyDeviceToHost,
+                                  stream));
   }

   // Synchronize to ensure all copies are complete
-  cudaStreamSynchronize(stream);
+  RAFT_CUDA_TRY(cudaStreamSynchronize(stream));
🤖 Prompt for AI Agents
In `@cpp/src/linear_programming/solve.cu` around lines 1403 - 1415, In
copy_view_to_cpu, check CUDA return codes for the GPU→CPU transfer and
synchronization: capture and validate the cudaMemcpyAsync return value (for the
call that copies var_types_span into cpu_data.variable_types) and capture and
validate the cudaStreamSynchronize return value for stream; on error,
log/propagate a descriptive error including
cudaGetErrorName()/cudaGetErrorString or return a failure status instead of
proceeding, so failures from cudaMemcpyAsync and cudaStreamSynchronize are not
ignored.

Comment on lines 147 to 169
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
-  return solution_;
+  return *solution_;
 }

 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
 {
-  return solution_;
+  return *solution_;
 }

 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
   return *solution_host_;
 }

 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
   return *solution_host_;
 }

⚠️ Potential issue | 🟠 Major

Potential null pointer dereference in get_solution() and get_solution_host().

When a solution is constructed via CPU-only constructors (for remote solve), solution_ is nullptr. Calling get_solution() will dereference this null pointer, causing undefined behavior. Similarly, get_solution_host() will crash when called on device-memory solutions where solution_host_ is nullptr.

Consider adding precondition checks or assertions to catch misuse early:

🛠️ Proposed fix: Add defensive checks
 template <typename i_t, typename f_t>
 const rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution() const
 {
+  cuopt_assert(solution_ != nullptr,
+               "get_solution() called but solution is in host memory. Use get_solution_host().");
   return *solution_;
 }

 template <typename i_t, typename f_t>
 rmm::device_uvector<f_t>& mip_solution_t<i_t, f_t>::get_solution()
 {
+  cuopt_assert(solution_ != nullptr,
+               "get_solution() called but solution is in host memory. Use get_solution_host().");
   return *solution_;
 }

 template <typename i_t, typename f_t>
 std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host()
 {
+  cuopt_assert(solution_host_ != nullptr,
+               "get_solution_host() called but solution is in device memory. Use get_solution().");
   return *solution_host_;
 }

 template <typename i_t, typename f_t>
 const std::vector<f_t>& mip_solution_t<i_t, f_t>::get_solution_host() const
 {
+  cuopt_assert(solution_host_ != nullptr,
+               "get_solution_host() called but solution is in device memory. Use get_solution().");
   return *solution_host_;
 }

Alternatively, document the preconditions clearly in the header and rely on callers to check is_device_memory() first.

🤖 Prompt for AI Agents
In `@cpp/src/mip/solver_solution.cu` around lines 147 - 169, The getters
mip_solution_t<i_t,f_t>::get_solution and
mip_solution_t<i_t,f_t>::get_solution_host can dereference nullptr (solution_ or
solution_host_) depending on how the object was constructed; add defensive
precondition checks at the start of these methods (e.g., assert or throw
std::runtime_error) that verify the corresponding pointer is non-null and/or
that the object's memory mode (use is_device_memory() or equivalent) matches the
requested accessor, and include a clear error message referencing solution_ or
solution_host_; this prevents undefined behavior and guides callers to check the
memory mode before calling the wrong getter.

Comment on lines +31 to +44
    if(DEFINED ENV{CONDA_PREFIX})
        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
            target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
        endif()
        unset(_CUOPT_CONDA_PREFIX)
    endif()
    if(DEFINED ENV{CONDA_PREFIX})
        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
            target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
        endif()
        unset(_CUOPT_CONDA_PREFIX)
    endif()

⚠️ Potential issue | 🟡 Minor

Remove duplicated CONDA_PREFIX include directory blocks.

Lines 31-37 and 38-44 are identical. This appears to be a copy-paste error—the same conditional block is repeated twice for the c_api_tester target.

🔧 Suggested fix
     if(DEFINED ENV{CONDA_PREFIX})
         set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
         if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
             target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
         endif()
         unset(_CUOPT_CONDA_PREFIX)
     endif()
-    if(DEFINED ENV{CONDA_PREFIX})
-        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
-        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
-            target_include_directories(c_api_tester BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
-        endif()
-        unset(_CUOPT_CONDA_PREFIX)
-    endif()
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/CMakeLists.txt` around lines 31 - 44, Remove the
duplicated CONDA_PREFIX include block: there are two identical if(DEFINED
ENV{CONDA_PREFIX}) ... target_include_directories(c_api_tester BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") ... unset(_CUOPT_CONDA_PREFIX)
sections; keep only one. Locate the repeated block that checks for
"${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h" and ensure a
single conditional sets target_include_directories for the c_api_tester target
and unsets _CUOPT_CONDA_PREFIX once.

Comment on lines +55 to +68
    if(DEFINED ENV{CONDA_PREFIX})
        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
            target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
        endif()
        unset(_CUOPT_CONDA_PREFIX)
    endif()
    if(DEFINED ENV{CONDA_PREFIX})
        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
            target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
        endif()
        unset(_CUOPT_CONDA_PREFIX)
    endif()

⚠️ Potential issue | 🟡 Minor

Remove duplicated CONDA_PREFIX include directory blocks.

Same issue as above—lines 55-61 and 62-68 are identical duplicates for the C_API_TEST target.

🔧 Suggested fix
     if(DEFINED ENV{CONDA_PREFIX})
         set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
         if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
             target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
         endif()
         unset(_CUOPT_CONDA_PREFIX)
     endif()
-    if(DEFINED ENV{CONDA_PREFIX})
-        set(_CUOPT_CONDA_PREFIX "$ENV{CONDA_PREFIX}")
-        if(EXISTS "${_CUOPT_CONDA_PREFIX}/include/rapids/cuda/std/__cccl/version.h")
-            target_include_directories(C_API_TEST BEFORE PRIVATE "${_CUOPT_CONDA_PREFIX}/include/rapids")
-        endif()
-        unset(_CUOPT_CONDA_PREFIX)
-    endif()
🤖 Prompt for AI Agents
In `@cpp/tests/linear_programming/CMakeLists.txt` around lines 55 - 68, Remove the
duplicated CONDA_PREFIX block by keeping a single conditional that checks
ENV{CONDA_PREFIX}, sets _CUOPT_CONDA_PREFIX, tests for the version.h path, calls
target_include_directories(C_API_TEST BEFORE PRIVATE
"${_CUOPT_CONDA_PREFIX}/include/rapids") if it exists, and then unsets
_CUOPT_CONDA_PREFIX; ensure only one copy of this logic remains (refer to the
repeated blocks that set _CUOPT_CONDA_PREFIX and call target_include_directories
for target C_API_TEST).

Comment on lines +13 to +25
```
┌─────────────┐                              ┌─────────────┐
│   Client    │                              │   Server    │
│             │    serialize_lp_request()    │             │
│  Problem ───┼──────────────────────────────┼──► Problem  │
│             │                              │             │
│             │   serialize_lp_solution()    │             │
│  Solution ◄─┼──────────────────────────────┼─── Solution │
└─────────────┘                              └─────────────┘
       ▲                                            ▲
       │                                            │
       └────────────── Same Serializer ─────────────┘
```

⚠️ Potential issue | 🟡 Minor

Add language tags to fenced blocks
markdownlint MD040 will flag these. Use `text` for the ASCII diagram and output snippet.

📝 Proposed fix
-```
+```text
@@
-```
+```text
 [remote_solve] Loading custom serializer from: /path/to/libcuopt_my_serializer.so
 [remote_solve] Using custom serializer: my_format


Also applies to: 373-377

🤖 Prompt for AI Agents

In @docs/developer/SERIALIZATION_PLUGIN_GUIDE.md around lines 13 - 25, The
markdown fenced code blocks for the ASCII diagram and the example output are
missing language tags and trigger markdownlint MD040; update the two blocks that
contain the ASCII diagram (the box with serialize_lp_request() and
serialize_lp_solution()) and the example output block that starts with
"[remote_solve] Loading custom serializer..." to use text fence markers instead of plain so both the diagram and the output snippet are annotated as
text; locate the blocks by searching for the ASCII diagram containing
serialize_lp_request/serialize_lp_solution and the output snippet beginning
"[remote_solve] Loading custom serializer" and add the language tag to each
fenced block.



Comment on lines +11 to +25
# Lazy imports for CUDA-dependent modules to support CPU-only hosts
# These will be imported when first accessed


def __getattr__(name):
"""Lazy import CUDA-dependent utilities."""
if name == "type_cast":
from cuopt.utilities.type_casting import type_cast
return type_cast
elif name == "series_from_buf":
from cuopt.utilities.utils import series_from_buf
return series_from_buf
elif name == "check_solution":
from cuopt.utilities.utils import check_solution
return check_solution

⚠️ Potential issue | 🟠 Major

Preserve import * compatibility by defining __all__ (and optionally __dir__).

With lazy loading, from cuopt.utilities import * will no longer export type_cast, series_from_buf, or check_solution because they aren’t in the module dict. That’s a backward-incompatible change for public API users. Please add __all__ (and optionally __dir__) to keep star-import behavior consistent. As per coding guidelines, preserve backward compatibility in Python APIs.

✅ Proposed fix (keeps lazy loading and restores public exports)
+__all__ = [
+    "InputRuntimeError",
+    "InputValidationError",
+    "OutOfMemoryError",
+    "catch_cuopt_exception",
+    "type_cast",
+    "series_from_buf",
+    "check_solution",
+]
+
 def __getattr__(name):
     """Lazy import CUDA-dependent utilities."""
     if name == "type_cast":
         from cuopt.utilities.type_casting import type_cast
+        globals()["type_cast"] = type_cast
         return type_cast
     elif name == "series_from_buf":
         from cuopt.utilities.utils import series_from_buf
+        globals()["series_from_buf"] = series_from_buf
         return series_from_buf
     elif name == "check_solution":
         from cuopt.utilities.utils import check_solution
+        globals()["check_solution"] = check_solution
         return check_solution
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
+
+def __dir__():
+    return sorted(set(globals()) | {"type_cast", "series_from_buf", "check_solution"})
🤖 Prompt for AI Agents
In `@python/cuopt/cuopt/utilities/__init__.py` around lines 11 - 25, Add a
module-level __all__ list that includes "type_cast", "series_from_buf", and
"check_solution" so star-imports continue to expose these names, and optionally
implement __dir__ to return sorted(__all__ + list(globals().keys())) to surface
them in autocomplete; keep the existing lazy-loader __getattr__ unchanged so the
actual imports still occur on first access (referencing __getattr__, type_cast,
series_from_buf, check_solution).
