Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Reverse param in ordering functions #14989

Closed
wants to merge 12 commits into from

Conversation

mproszewska
Copy link
Contributor

ENH: add reverse parameter in np.sort (not in ndarray.sort()).

See #14728
It is done by flipping (np.flip) array before and after sort. I think that adding reverse in ndarray.sort() would include way more changes in code than it's worth (e.g. changing PyArray_SortFunc signature), but I'm open to any suggestions.

@WarrenWeckesser
Copy link
Member

WarrenWeckesser commented Nov 27, 2019

A reverse parameter in numpy.sort is a potentially useful enhancement, but it would be a good idea to get some feedback on the mailing list about whether we want to add it to NumPy. If we do add it, we should implement it in all the relevant functions and methods, so that we maintain a consistent API. That means the functions numpy.sort and numpy.argsort and the methods ndarray.sort and ndarray.argsort should have the parameter.

Also note that if reverse is added to argsort, simply reversing the array before and after sorting will not handle repeated values correctly when kind is 'stable'.

@eric-wieser
Copy link
Member

eric-wieser commented Nov 27, 2019

I'm in favor of adding this keyword, but only if we implement it everywhere as Warren says - which would mean implementing it in C. It's also not immediately clear to me how this would interact with the order kwarg.

Also note that if reverse is added to argsort, simply reversing the array before and after sorting will not handle repeated values correctly when kind is 'stable'.

Can you provide an example of where this fails, @WarrenWeckesser?

@hameerabbasi
Copy link
Contributor

It will fail if there are repeated values, because simply reversing the argsort output won't keep the relative position of the inputs. Concrete example:

a = np.argsort([1, 1], kind="stable")[::-1]

(Because a reverse parameter would return [0, 1] and this returns [1, 0]).

@WarrenWeckesser
Copy link
Member

WarrenWeckesser commented Nov 27, 2019

Hameer summed up the issue with argsort, but it turns out you can get the correct result if you also "flip" the values of the indices (i.e. convert i to len(a) - 1 - i) at the appropriate step.

For example, suppose z = [0, 0, 9, 9, 9, 0]. The expected result of np.argsort(z, kind='stable', reverse=True) is [2, 3, 4, 0, 1, 5]:

In [231]: z
Out[231]: array([0, 0, 9, 9, 9, 0])

In [232]: (len(z) - 1 - np.argsort(z[::-1], kind='stable'))[::-1]
Out[232]: array([2, 3, 4, 0, 1, 5])

@eric-wieser
Copy link
Member

If you didn't flip the indices for argsort then the result would just be wrong anyway.

@WarrenWeckesser
Copy link
Member

If you didn't flip the indices for argsort then the result would just be wrong anyway.

Right. I confess that I didn't fully think through the implication of the initial reversal of the input when I made my first comment about stable argsort. 😳

@mproszewska
Copy link
Contributor Author

mproszewska commented Dec 4, 2019

I think adding reverse in ordering functions would include adding:
sort - flip before and after sort
lexsort - flip each of keys
argsort - flip before and after sort with indices change
msort - call sort with reverse=True
sort_complex - call sort with reverse=True
partition - finding (n-i)-th element and flipping result
argpartition - finding (n-i)-th element and flipping result
searchsorted - could use sorter parameter

I can submit pull request with those changes.

Adding reverse in ndarray functions would include more changes (probably many signatures changes), but in that case using reverse when comparing elements, would be a solution for all ordering functions.

@mproszewska mproszewska changed the title ENH: Reverse param in np.sort ENH: Reverse param in ordering functions Dec 10, 2019
@Qiyu8
Copy link
Member

Qiyu8 commented May 25, 2020

@mproszewska do you want to keep working on this?I think the only thing left is some conflicting fix.

@mproszewska
Copy link
Contributor Author

I could, but I'm not sure how numpy team feels about my solution or what else they expect

@axil
Copy link
Contributor

axil commented Nov 30, 2020

The first flip, while perfectly valid for np.argsort, seems superfluous for np.sort to me. The only use-case where it would matter is with kind='stable'. When sorting by one dimension, it maintains the order in the "other" dimensions (which in argsort case is the index). But in the np.sort case if there're no "other" dimensions. Can you give a counter-example?

Also it would be nice to accept lists or tuples in case you need to sort by one column ascending and by other descending (with the order argument) - the way it is done in pandas.

I also think that having 'reverse' for np.sort and not having it for ndarray.sort would look a bit confusing.

It might also consider skipping the first flip for kind=quicksort as it doesn't maintain the order anyway.

Base automatically changed from master to main March 4, 2021 02:04
@InessaPawson InessaPawson added triage review Issue/PR to be discussed at the next triage meeting and removed 61 - Stale labels Jun 16, 2022
@seberg
Copy link
Member

seberg commented Jun 29, 2022

We discussed this a bit in today and considering the age of the PR decided to close this. There is a bit of an API question open. Generally, we think the feature itself is good though.
(The open question is that here decending is preferred, which is at odds with pythons reversed= though).

However, it seems to us that this should (unfortunately) be done "properly" on the C level, that may be quite a bit harder (so unfortunately).
Doing it in C may also require reorganizing the sort methods on the C level. That is certainly possible now by creating a new get_sort_method() slot on the DTypes! Unfortunately, it is also a pretty new mechanism, so there is no very easy blue-prints for it. I would be happy to help with that part, but for me it would mainly make sense to prioritize that if someone comfortable with the C sorting code will continue the work (i.e. use the new machinery).

@seberg seberg closed this Jun 29, 2022
@InessaPawson InessaPawson added triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Jul 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement component: numpy._core triaged Issue/PR that was discussed in a triage meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants