Conversation
Reuse that function call in sorting code-base where argsort is used.
|
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞 |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_336 ran successfully. |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_337 ran successfully. |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully. |
29d7198 to f1b2045
f1b2045 to 51ead2b
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully. |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully. |
Until it is passed over to the host function, and
unique_ptr's ownership is released.
Also reduced allocation sizes, where too much was being
allocated.
Introduce smart_malloc_device, etc.
The smart_malloc_device<T>(count, q) makes USM allocation
and returns a unique_ptr<T, USMDeleter> which owns the
allocation. The function throws an exception (std::runtime_error)
if USM allocation is not successful.
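The ownership pattern described above can be sketched on the host without a SYCL runtime. This is a minimal illustration only: `HostDeleter` and `smart_malloc_host` are hypothetical stand-ins for `USMDeleter` and `smart_malloc_device`, with `std::malloc`/`std::free` standing in for USM allocation and the queue argument dropped.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for USMDeleter: frees with std::free instead of
// sycl::free, so the sketch compiles without SYCL.
struct HostDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};

// Hypothetical host analogue of smart_malloc_device<T>(count, q).
// Returns an owning unique_ptr and throws std::runtime_error on failure.
// Intended for trivially destructible T and count > 0.
template <typename T>
std::unique_ptr<T, HostDeleter> smart_malloc_host(std::size_t count) {
    T *p = static_cast<T *>(std::malloc(count * sizeof(T)));
    if (p == nullptr) {
        throw std::runtime_error("allocation failed");
    }
    return std::unique_ptr<T, HostDeleter>(p);
}
```

If an exception is thrown between the allocation and the point where ownership is handed off, the `unique_ptr` destructor frees the memory, which is how this pattern avoids leaks in error paths.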
Introduce async_smart_free.
This function is intended to replace the use of host_task submissions
to manage deallocation of temporary USM allocations.
The usage is as follows:
```
// returns unique_ptr
auto alloc_owner = smart_malloc_device<T>(count, q);
// get raw pointer for use in kernels
T *data = alloc_owner.get();
[..SNIP..]
// submit host_task that releases the unique_ptr
// after the host task was successfully submitted
// and ownership of USM allocation is transferred to
// the said host task
sycl::event ht_ev =
    async_smart_free(q,
                     dependent_events,
                     alloc_owner);
[...SNIP...]
```
bbb55f1 to da3fbcc
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully. |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_338 ran successfully. |
Replaced three duplicates of the same kernel with calls to this function.
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_339 ran successfully. |
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_340 ran successfully. |
Factored out map_back_impl, which projects indexing from a flat index to a row-wise index. Removed dead code excluded by a preprocessor conditional.
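As a rough host-side illustration of projecting a flat index to a row-wise one: the sketch below assumes the array is a contiguous batch of rows of equal length, so the position within a row is the flat index modulo the row length. That mapping, and the name `map_back`, are assumptions for illustration, not the PR's kernel.

```cpp
#include <cstddef>
#include <vector>

// Assumption: rows are contiguous and all have length row_size, so the
// position within a row is the flat index modulo row_size.
// Indices are assumed to be non-negative.
template <typename IndexT>
void map_back(std::vector<IndexT> &indices, std::size_t row_size) {
    for (auto &i : indices) {
        i = i % static_cast<IndexT>(row_size);
    }
}
```

For two rows of length 3, flat indices `{3, 4, 5, 0, 2, 1}` become `{0, 1, 2, 0, 2, 1}`.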
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_341 ran successfully. |
|
Ping @AlexanderKalistratov |
Replaced it with a hand-written implementation of ceil_log2(n),
such that n <= (decltype(n){1} << ceil_log2(n)) is true for all
positive values of `n` in the range.
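One way such a function could be written is sketched below. It is not the PR's implementation, only a minimal version that satisfies the stated property; it relies on the observation that for `n >= 1` the result equals the number of significant bits in `n - 1`.

```cpp
#include <cstddef>
#include <type_traits>

// Smallest k such that n <= (T{1} << k), for n >= 1.
// Sketch only: for n so large that the result equals the bit width of T,
// the shift in the property check itself would overflow.
template <typename T>
constexpr T ceil_log2(T n) {
    static_assert(std::is_integral<T>::value, "integral types only");
    T k = 0;
    T m = n - 1; // ceil_log2(n) is the number of significant bits of n - 1
    while (m > 0) {
        m >>= 1;
        ++k;
    }
    return k;
}
```

Exact powers of two map to their exponent (`ceil_log2(8) == 3`), and everything in between rounds up (`ceil_log2(9) == 4`).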
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_342 ran successfully. |
Add check of computed against expected indices
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_344 ran successfully. |
One asserts that at least one unique pointer is specified. Another asserts that the specified arguments are unique pointers with USMDeleter.
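The two compile-time checks could look roughly like the following host-only sketch. `StubDeleter`, `is_stub_unique_ptr`, `all_stub_unique_ptrs` and `free_all` are hypothetical names introduced here; the real function takes a queue and dependent events and defers the free to a host task, which this sketch does not model.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for USMDeleter.
struct StubDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};

// True only for std::unique_ptr<T, StubDeleter>.
template <typename T> struct is_stub_unique_ptr : std::false_type {};
template <typename T>
struct is_stub_unique_ptr<std::unique_ptr<T, StubDeleter>> : std::true_type {};

// True when every type in the pack satisfies is_stub_unique_ptr.
template <typename... Ts> struct all_stub_unique_ptrs : std::true_type {};
template <typename T, typename... Rest>
struct all_stub_unique_ptrs<T, Rest...>
    : std::integral_constant<bool, is_stub_unique_ptr<T>::value &&
                                       all_stub_unique_ptrs<Rest...>::value> {};

// Frees every allocation immediately and returns how many were freed.
template <typename... UniquePtrTs>
std::size_t free_all(UniquePtrTs &...owners) {
    static_assert(sizeof...(UniquePtrTs) > 0,
                  "at least one unique_ptr must be specified");
    static_assert(all_stub_unique_ptrs<UniquePtrTs...>::value,
                  "arguments must be unique_ptr with the expected deleter");
    int expand[] = {(owners.reset(), 0)...}; // reset each owner in order
    (void)expand;
    return sizeof...(UniquePtrTs);
}
```

Calling `free_all()` with no arguments, or with a `std::unique_ptr<int>` that uses the default deleter, fails to compile with the corresponding message.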
|
Array API standard conformance tests for dpctl=0.19.0dev0=py310h93fe807_346 ran successfully. |
|
I suggest we exclude these failing |
ndgrigorian
left a comment
This LGTM, we can merge this into the topk branch and drop the test file PR, then remove the commit that adds test_top_k_largest_1d_radix_i1
This PR builds on top of feature/topk branch.
It adds `iota_impl` in new `sort_utils.hpp` file, and uses it in `merge_sort.hpp`, `radix_sort.hpp` and `topk.hpp`.
It also fixes possible USM allocation leak in exception handling.
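To show why an iota step appears wherever argsort is used, here is a small host-only sketch: the index array is first filled with 0, 1, 2, ... and then reordered by comparing the keys it points at. It uses `std::iota` and `std::stable_sort` from the standard library rather than the PR's `iota_impl` kernel, and the function name `argsort` is chosen here for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices that would sort keys in ascending order.
std::vector<std::size_t> argsort(const std::vector<int> &keys) {
    std::vector<std::size_t> idx(keys.size());
    // iota step: idx = {0, 1, 2, ...}
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // reorder indices by the keys they refer to; stable keeps ties in order
    std::stable_sort(idx.begin(), idx.end(),
                     [&keys](std::size_t a, std::size_t b) {
                         return keys[a] < keys[b];
                     });
    return idx;
}
```

For keys `{30, 10, 20}` the result is `{1, 2, 0}`.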