Skip to content

[Failed] Fixed #1112 Migrate to New NumPy Random Number Generator (RNG) API#1131

Closed
seanlaw wants to merge 9 commits intostumpy-dev:mainfrom
seanlaw:migrate_new_numpy_rng
Closed

[Failed] Fixed #1112 Migrate to New NumPy Random Number Generator (RNG) API#1131
seanlaw wants to merge 9 commits intostumpy-dev:mainfrom
seanlaw:migrate_new_numpy_rng

Conversation

@seanlaw
Copy link
Contributor

@seanlaw seanlaw commented Feb 14, 2026

See #1112

Pull Request Checklist

Below is a simple checklist but please do not hesitate to ask for assistance!

  • Fork, clone, and checkout the newest version of the code
  • Create a new branch
  • Make necessary code changes
  • Install black (i.e., python -m pip install black or conda install -c conda-forge black)
  • Install flake8 (i.e., python -m pip install flake8 or conda install -c conda-forge flake8)
  • Install pytest-cov (i.e., python -m pip install pytest-cov or conda install -c conda-forge pytest-cov)
  • Run black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./ in the root stumpy directory
  • Run flake8 --extend-exclude=.venv ./ in the root stumpy directory
  • Run ./setup.sh dev && ./test.sh in the root stumpy directory
  • Reference a Github issue (and create one if one doesn't already exist)

To Do List

  1. Search for np.random
  2. Replace np.random.rand(x) with pytest.RNG.random(x)
  3. Replace np.random.rand(x*y).reshape(x,y) with pytest.RNG.random(size=(x, y))
  4. Replace np.random.uniform(x, y, [z]) with pytest.RNG.uniform(x, y, size=z)
  5. Replace np.random.permutation([x, y, z]) with pytest.RNG.permutation([x, y, z])
  6. Replace np.random.randint(x, y, z) with pytest.RNG.integers(x, y, z)
  7. Replace np.random.choice([x, y], l, replace=True) with pytest.RNG.choice([x, y], l, replace=True)
  8. Replace np.random.normal(size=x) with pytest.RNG.normal(size=x)
  9. Replace all np.random from docstrings, README, tutorials, etc

@gitnotebooks
Copy link

gitnotebooks bot commented Feb 14, 2026

Review these changes at https://app.gitnotebooks.com/stumpy-dev/stumpy/pull/1131

@seanlaw
Copy link
Contributor Author

seanlaw commented Feb 14, 2026

@NimaSarajpoor I've tested the migration on test_core.py as an initial trial run. Would you mind taking a look and providing any feedback before I do the same thing on the other tests?

Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanlaw
I've left just one comment for your consideration.

@seanlaw
Copy link
Contributor Author

seanlaw commented Feb 15, 2026

@NimaSarajpoor If it's okay with you, I'm going to add ~5 files at a time for you to incrementally review (there are 46 test files in total and I'll need to work on tutorials and other docs too).

@seanlaw
Copy link
Contributor Author

seanlaw commented Feb 16, 2026

Things are starting to get tricky. Previously, np.random.seed(seed) would set the seed for the global random number generator. However, with the new approach, we'll need to synchronize things across to naive.py (i.e., pass the RNG state). Extra care must be taken. Otherwise, things will fall out of sync and cause assertions to fail!

@seanlaw
Copy link
Contributor Author

seanlaw commented Feb 26, 2026

@NimaSarajpoor I added three new files for you to review

# naive.distance(naive.z_norm(T[10:30]), naive.z_norm(T[110:130])) = 0.47
# naive.distance(naive.z_norm(T[10:30]), naive.z_norm(T[210:230])) = 0.24
# naive.distance(naive.z_norm(T[110:130]), naive.z_norm(T[210:230])) = 0.72
# Hence T[10:30] is the motif representative for this motif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these comments should be revised. They show the calculation of z-normalized distances. However, aamp_motifs is for non-normalized case.

Copy link
Contributor Author

@seanlaw seanlaw Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, what should it be changed to? I don't remember adding this comment :(

This?

    # naive.distance(T[10:30], T[110:130]) = 0.47
    # naive.distance(T[10:30], T[210:230]) = 0.24
    # naive.distance(T[110:130], T[210:230]) = 0.72
    # Hence T[10:30] is the motif representative for this motif

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just recalculated the numbers for non-normalized distance.

  # naive.distance(T[10:30], T[110:130]) = 16.65
  # naive.distance(T[10:30], T[210:230]) = 0.1
  # naive.distance(T[110:130], T[210:230]) =   16.68
  # Hence T[10:30] is the motif representative for this motif

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use == because I had interpreted it as setting the distance rather than the value of the distance

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right... == should be used here 👍

@seanlaw
Copy link
Contributor Author

seanlaw commented Feb 27, 2026

I discovered yesterday that this migration will require a lot more work since, unlike the traditional np.random that shares a global state, the new random number generator is only local to, say, a module. In order to share that random state across naive.py, conftest.py, all of the unit tests, and all of the stumpy modules, I'll have to add a global RNG = np.random.default_rng() variable to core.py and then share it across everything.

As a result, I think I'm gonna need to start all over with a new PR and rethink what needs to be done

@NimaSarajpoor
Copy link
Collaborator

NimaSarajpoor commented Feb 28, 2026

@seanlaw

In order to share that random state across naive.py....

Can you please provide an example for this so that I can better understand the challenge? Do we need to share random state in certain cases? Because I think we create a randomly-generated input data T and then we simply pass that input T to both naive and performant functions. So, sharing the random state does not seem to be required. Am I missing something?

@seanlaw
Copy link
Contributor Author

seanlaw commented Mar 1, 2026

Can you please provide an example for this so that I can better understand the challenge? Do we need to share random state in certain cases? Because I think we create a randomly-generated input data T and then we simply pass that input T to both naive and performant functions. So, sharing the random state does not seem to be required. Am I missing something?

Here's a simple illustrative example.

# naive.py
import numpy as np

def some_func(T, choices):
    c = np.random.choice(choices)
    out = np.empty(len(T))
    for i in range(len(T)):
        out[i] = 0.0
        for _ in range(c):
            out[i] += T[i]
    return out
# some_module.py
import numpy as np

def some_func(T, choices):
    c = np.random.choice(choices)
    out = T * c
    return out
# test.py
import numpy as np
import numpy.testing as npt
import naive
import some_module

def test_some_func():
    T = np.random.rand(10)
    choices = np.random.randint(50, size=(4))
    
    np.random.seed(10)
    ref = naive.some_func(T, choices)
    
    np.random.seed(10)
    cmp = some_module.some_func(T, choices)
    
    npt.assert_almost_equal(ref, cmp)

In this case, we have a dummy function (naive and performant) and all it does is take an array T, an array of integers choices, and randomly chooses one integer to multiply against each element of T. This code works because:

  1. T is the same in both cases
  2. choices is the same in both cases
  3. We make sure to set the random seed BEFORE we randomly select an integer from choices. In doing so, we ensure that BOTH functions will end up choosing the same value from choices!

Note that we have exactly this case in some of our stumpy code!

Now, if we transition this "legacy" numpy code over to the modern way of implementing random number generators, we'd have to do something like (this is untested!):

# core.py
import numpy as np

RNG = np.random.default_rng()  # A shared RNG

def get_rng_state():
    return RNG.bit_generator.state

def set_rng_state(state):
    RNG.bit_generator.state = state
# naive.py
import numpy as np
import core

def some_func(T, choices):
    c = core.RNG.choice(choices)
    out = np.empty(len(T))
    for i in range(len(T)):
        out[i] = 0.0
        for _ in range(c):
            out[i] += T[i]
    return out
# some_module.py
import numpy as np
import core

def some_func(T, choices):
    c = core.RNG.choice(choices)
    out = T * c
    return out
# test.py
import numpy as np
import numpy.testing as npt
import naive
import some_module
import core

def test_some_func():
    # Technically, `T`, and `choices` does NOT need to depend on `core.RNG` and, instead,
    # this could've been generated from a local rng but this simplifies things
    T = core.RNG.random(10)
    choices = core.RNG.integers(0, 50, 4)
    
    state = core.get_rng_state()
    ref = naive.some_func(T, choices)
    
    core.set_rng_state(state)
    cmp = some_module.some_func(T, choices)
    
    npt.assert_almost_equal(ref, cmp)

@NimaSarajpoor at least this is what I am thinking. Please see if you can get my point and feel free to ask clarifying questions!

@NimaSarajpoor
Copy link
Collaborator

@seanlaw

Note that we have exactly this case in some of our stumpy code!

Got it 👍

And since we have a case like the following in some modules in /stumpy/:

# some_module.py
import numpy as np
import core

def some_func(T, choices):
    c = core.RNG.choice(choices)
    out = T * c
    return out

then we cannot set the random state in conftest.py, and we should use a stumpy's module (like core.py) instead.

@seanlaw
Copy link
Contributor Author

seanlaw commented Mar 1, 2026

And since we have a case like the following in some modules in /stumpy/:
then we cannot set the random state in conftest.py, and we should use a stumpy's module (like core.py) instead.

That's right! Hence why I eventually came to the conclusion that the migration will be tricky and one needs to take extra care because tests were failing and really hard to understand why (because it was literally "random"). I'm going to close this PR (use it as a reference) and submit a brand new PR now that I know what to do (and hopefully don't encounter another problem).

@seanlaw seanlaw changed the title Fixed #1112 Migrate to New NumPy Random Number Generator (RNG) API [Failed] Fixed #1112 Migrate to New NumPy Random Number Generator (RNG) API Mar 2, 2026
@seanlaw seanlaw closed this Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants