Without looking it up, can you say what the following code does:
import numpy as np from scipy import stats a = np.arange(25) b = np.arange(25)+4 print(stats.mannwhitneyu(a , b))
You probably guessed that it computes the Mann-Whitney test between two samples, but exactly which test? The two-sided or the one-sided test?
You can’t tell from the code because it depends on which version of scipy you are running and it has gone back and forth between the two! Pre-0.17.0 it used the one-sided test with the side being decided based on the input data. This was obviously the wrong thing to do. Then, the API was fixed in 0.17.0 to do the two-sided test. This was considered a bad thing because it broke backwards compatibility and now it’s back to performing the one-sided test! I wish I was making this up.
Reading through the github issues (#4933, #6034, #6062, #6100) is an example of how open source projects can stagnate. There is a basic, simple, solution to the issue: create a corrected version of the function with a new name and deprecate the old one. This keeps backwards compatibility while allowing the project to fix its API. Once the issue had been identified, this should have been a 20 minute job. Reading through the issues, this simple solution is proposed, discussed, seemingly agreed to. Instead, something else happens and at this point, it’d take me longer than 20 minutes to just read through the whole discussions.
This is not the first time I have run into numpy/scipy’s lack of respect for backwards compatibility either. Fortunately, there is a solution to this case, which is to use the full version:
stats.mannwhitneyu(a, b, alternative='two-sided')