Version 0.58.0 (20 September 2023)

This is a major Numba release. Numba now uses towncrier to create the release notes, so please find a summary of all noteworthy items below.

Highlights 

Added towncrier

This PR adds towncrier as a GitHub workflow for checking release notes. From this PR onwards, every PR made in Numba will require an appropriate release note associated with it. The reviewer may decide to skip release notes for smaller PRs with minimal impact by adding a skip_release_notes label to the PR.

(PR-#8792)

The minimum supported NumPy version is 1.22.

Following NEP-0029, the minimum supported NumPy version is now 1.22.

(PR-#9093)

Add support for NumPy 1.25

Extend Numba to support new and changed features released in NumPy 1.25.

(PR-#9011)

Remove NVVM 3.4 and CTK 11.0 / 11.1 support

Support for CUDA toolkits < 11.2 is removed.

(PR-#9040)

Removal of Windows 32-bit Support

From this release onwards, Numba no longer supports Windows 32-bit operating systems.

(PR-#9083)

The minimum llvmlite version is now 0.41.0.

The minimum required version of llvmlite is now version 0.41.0.

(PR-#8916)

Added RVSDG-frontend

This PR is preliminary work on adding an RVSDG-frontend for processing bytecode. RVSDG (Regionalized Value-State Dependence Graph) gives a dataflow-centric view instead of a traditional SSA-CFG view, which should allow the compiler to be simplified in the future.

(PR-#9012)

New Features 

`numba.experimental.jitclass` gains support for `__*matmul__` methods.

numba.experimental.jitclass now has support for the following methods:

__matmul__
__imatmul__
__rmatmul__

(PR-#8892)

`numba.experimental.jitclass` gains support for reflected "dunder" methods.

numba.experimental.jitclass now has support for the following methods:

__radd__
__rand__
__rfloordiv__
__rlshift__
__ror__
__rmod__
__rmul__
__rpow__
__rrshift__
__rsub__
__rtruediv__
__rxor__

(PR-#8906)
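As a rough illustration of the reflected methods listed above, the sketch below defines a small jitclass and places it on the right-hand side of `+` and `-`. The class name `Offset`, its `value` field and the `as_jitclass` helper are assumptions made for this example; the fallback branch only keeps the snippet runnable where Numba is not installed.

```python
try:
    from numba import float64
    from numba.experimental import jitclass

    # Field specification for the hypothetical example class below.
    as_jitclass = jitclass([("value", float64)])
except ImportError:
    # Fallback (assumption for this sketch): leave the class uncompiled.
    def as_jitclass(cls):
        return cls


@as_jitclass
class Offset:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Used for `other + self` when `other` cannot handle an Offset.
        return other + self.value

    def __rsub__(self, other):
        # Used for `other - self`.
        return other - self.value


shifted = 2.0 + Offset(3.0)    # dispatches to Offset.__radd__
reduced = 10.0 - Offset(4.0)   # dispatches to Offset.__rsub__
print(shifted, reduced)        # 5.0 6.0
```

Without `__radd__` and `__rsub__`, both expressions would fail, because a Python float does not know how to combine itself with an `Offset`.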

Add support for value `max` to `NUMBA_OPT`.

The optimisation level that Numba applies when compiling can be set through the environment variable NUMBA_OPT. This has historically been a value between 0 and 3 (inclusive). Support for the value max has now been added. This is a Numba-specific optimisation level which indicates that the user would like Numba to try running as much optimisation as possible, potentially trading a longer compilation time for better run-time performance. In practice, use of the max level of optimisation may or may not benefit the run-time or compile-time performance of user code, but it has been added to present an easy-to-access option for users to try if they so wish.

(PR-#9094)
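A minimal sketch of opting in from Python follows. It assumes the variable is read when Numba is first imported, so it is set beforehand; the function `sum_of_squares` is a hypothetical workload, and the fallback decorator only keeps the snippet runnable without Numba.

```python
import os

# Set the optimisation level before Numba is imported (assumption: the
# value is picked up at import time).
os.environ["NUMBA_OPT"] = "max"

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # Hypothetical workload; any jitted function would do.
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(10))  # 285
```

The same setting can be made in the shell environment that launches the program. Whether `max` helps is workload dependent, so it is worth timing against the default.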

Improvements 

Updates to `numba.core.pythonapi`.

Support for Python C-API functions PyBytes_AsString and PyBytes_AsStringAndSize is added to numba.core.pythonapi.PythonAPI as bytes_as_string and bytes_as_string_and_size methods respectively.

(PR-#8462)

Support for `isinstance` is now non-experimental.

Support for the isinstance built-in function has moved from being considered an experimental feature to a fully supported feature.

(PR-#8911)
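The following sketch shows one way `isinstance` might be used inside a jitted function to branch on the argument type. The function name `kind_of` and its integer return codes are illustrative assumptions; the fallback decorator only keeps the snippet runnable without Numba.

```python
try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def kind_of(x):
    # Return a small integer code describing the argument's type.
    if isinstance(x, int):
        return 0
    elif isinstance(x, float):
        return 1
    else:
        return 2


print(kind_of(3), kind_of(2.5), kind_of(1j))  # 0 1 2
```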

NumPy Support 

All modes are supported in `numpy.correlate` and `numpy.convolve`.

All values for the mode argument to numpy.correlate and numpy.convolve are now supported.

(PR-#7543)
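As an illustration, the sketch below calls `numpy.convolve` from a jitted function with each of the three mode values. The wrapper names are assumptions made for this example, and the fallback decorator only keeps the snippet runnable without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def convolve_full(a, v):
    return np.convolve(a, v, mode="full")


@njit
def convolve_same(a, v):
    return np.convolve(a, v, mode="same")


@njit
def convolve_valid(a, v):
    return np.convolve(a, v, mode="valid")


a = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 0.5])

print(convolve_full(a, v))   # [0.  1.  2.5 4.  1.5]
print(convolve_same(a, v))   # [1.  2.5 4. ]
print(convolve_valid(a, v))  # [2.5]
```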

`@vectorize` accommodates arguments implementing `__array_ufunc__`.

Universal functions (ufuncs) created with numba.vectorize will now respect arguments implementing __array_ufunc__ (NEP-13) to allow pre- and post-processing of arguments and return values when the ufunc is called from the interpreter.

(PR-#8995)

Added support for `np.geomspace` function.

This PR improves on #4074 by adding support for np.geomspace. The current implementation only supports scalar start and stop parameters.

(PR-#9068)
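A minimal sketch with scalar start and stop values, matching the restriction noted above. The function name `decades` is an assumption for this example, and the fallback decorator only keeps the snippet runnable without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def decades(start, stop, num):
    # Geometrically spaced samples between two scalar endpoints.
    return np.geomspace(start, stop, num)


points = decades(1.0, 1000.0, 4)
print(points)  # approximately [1, 10, 100, 1000]
```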

Added support for `np.vsplit`, `np.hsplit`, `np.dsplit`.

This PR improves on #4074 by adding support for np.vsplit, np.hsplit, and np.dsplit.

(PR-#9082)
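The sketch below splits a small 2-D array into two equal parts along each axis. The function names are assumptions for this example. The parts are returned by explicit indexing, a conservative choice for the sketch rather than a documented requirement. The fallback decorator only keeps the snippet runnable without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def split_columns(a):
    parts = np.hsplit(a, 2)   # split along the second axis
    return parts[0], parts[1]


@njit
def split_rows(a):
    parts = np.vsplit(a, 2)   # split along the first axis
    return parts[0], parts[1]


grid = np.arange(8).reshape(2, 4)
left, right = split_columns(grid)
top, bottom = split_rows(grid)

print(left)    # [[0 1] [4 5]]
print(bottom)  # [[4 5 6 7]]
```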

Added support for `np.row_stack` function.

Support is added for numpy.row_stack.

(PR-#9085)

Added support for functions `np.polynomial.polyutils.trimseq`, as well as functions `polyadd`, `polysub`, `polymul` from `np.polynomial.polynomial`.

Support is added for np.polynomial.polyutils.trimseq, np.polynomial.polynomial.polyadd, np.polynomial.polynomial.polysub, np.polynomial.polynomial.polymul.

(PR-#9087)
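As a rough illustration, the sketch below uses the newly supported polynomial helpers on coefficient arrays ordered from lowest to highest degree. The function name `combine` and the choice of float arrays as inputs are assumptions for this example; the fallback decorator only keeps the snippet runnable without Numba.

```python
import numpy as np
from numpy.polynomial import polynomial as poly
from numpy.polynomial import polyutils as pu

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def combine(c1, c2, padded):
    total = poly.polyadd(c1, c2)     # coefficient-wise sum
    product = poly.polymul(c1, c2)   # polynomial product
    trimmed = pu.trimseq(padded)     # drop trailing zero coefficients
    return total, product, trimmed


c1 = np.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2
c2 = np.array([3.0, 2.0, 1.0])   # 3 + 2x + x^2
padded = np.array([1.0, 2.0, 0.0, 0.0])

total, product, trimmed = combine(c1, c2, padded)
print(total)    # [4. 4. 4.]
print(product)  # [ 3.  8. 14.  8.  3.]
print(trimmed)  # [1. 2.]
```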

Added support for `np.diagflat` function.

Support is added for numpy.diagflat.

(PR-#9113)

Added support for `np.resize` function.

Support is added for numpy.resize.

(PR-#9118)

Add np.trim_zeros

Support for np.trim_zeros() is added.

(PR-#9074)

CUDA Changes 

Bitwise operation `ufunc` support for the CUDA target.

Support is added for some ufuncs associated with bitwise operations on the CUDA target, namely:

numpy.bitwise_and
numpy.bitwise_or
numpy.bitwise_not
numpy.bitwise_xor
numpy.invert
numpy.left_shift
numpy.right_shift

(PR-#8974)

Add support for the latest CUDA driver codes.

Support is added for the latest set of CUDA driver codes.

(PR-#8988)

Add NumPy comparison ufunc in CUDA

This PR adds support for comparison ufuncs for the CUDA target (e.g. numpy.greater, numpy.greater_equal, numpy.less_equal, etc.).

(PR-#9007)

Report absolute path of `libcuda.so` on Linux

numba -s now reports the absolute path to libcuda.so on Linux, to aid troubleshooting driver issues, particularly on WSL2 where a Linux driver can incorrectly be installed in the environment.

(PR-#9034)

Add debuginfo support to `nvdisasm` output.

Support is added for debuginfo (source line and inlining information) in functions that make calls through nvdisasm. For example the CUDA dispatcher .inspect_sass method output is now augmented with this information.

(PR-#9035)

Add CUDA SASS CFG Support

This PR adds support for getting the SASS CFG in dot language format. It adds an inspect_sass_cfg() method to CUDADispatcher and the -cfg flag to the nvdisasm command line tool.

(PR-#9051)

Support NVRTC using the ctypes binding

NVRTC can now be used when the ctypes binding is in use, enabling float16 and the linking of CUDA C / C++ sources without needing the NVIDIA CUDA Python bindings.

(PR-#9086)

Fix CUDA atomics tests with toolkit 12.2

CUDA 12.2 generates slightly different PTX for some atomics, so the relevant tests are updated to look for the correct instructions when 12.2 is used.

(PR-#9088)

Bug Fixes 

Handling of different sized unsigned integer indexes is fixed in `numba.typed.List`.

An issue with the order of truncation/extension and casting of unsigned integer indexes in numba.typed.List has been fixed.

(PR-#7262)

Prevent invalid fusion

This PR fixes an issue in which an array first read in a parfor and later written in the same parfor would only be classified as used in the parfor. When a subsequent parfor also used the same array, the two parfors were fused, which should have been forbidden given that the first parfor was also writing to the array. This PR treats such arrays in a parfor as being both used and defined so that fusion is prevented.

(PR-#7582)

The `numpy.allclose` implementation now correctly handles default arguments.

The implementation of numpy.allclose is corrected to use TypingError to report typing errors.

(PR-#8885)

Add type validation to `numpy.isclose`.

Type validation is added to the implementation of numpy.isclose.

(PR-#8944)

Fix support for overloading dispatcher with non-compatible first-class functions

Fixes an error caused by not handling compilation errors during the casting of Dispatcher objects into first-class functions. With the fix, users can now overload a dispatcher with non-compatible first-class functions. Refer to https://github.com/numba/numba/issues/9071 for details.

(PR-#9072)

Support `dtype` keyword argument in `numpy.arange` with `parallel=True`

Fixes the parfors transformation to support the use of the dtype keyword argument in numpy.arange(..., dtype=dtype).

(PR-#9095)

Fix all `@overload`s to use parameter names that match public APIs.

Some of the Numba @overloads for functions in NumPy and Python’s built-ins were written using parameter names that did not match those used in the API they were overloading. As a result, calling such a function with the parameter names as keyword arguments at the call site would result in a compilation error. This has now been fixed throughout the code base, and a unit test now makes a best-effort attempt to prevent reintroduction of similar mistakes in the future. Fixed functions include:

From Python built-ins:

complex

From the Python random module:

random.seed
random.gauss
random.normalvariate
random.randrange
random.randint
random.uniform
random.shuffle

From the numpy module:

numpy.argmin
numpy.argmax
numpy.array_equal
numpy.average
numpy.count_nonzero
numpy.flip
numpy.fliplr
numpy.flipud
numpy.iinfo
numpy.isscalar
numpy.imag
numpy.real
numpy.reshape
numpy.rot90
numpy.swapaxes
numpy.union1d
numpy.unique

From the numpy.linalg module:

numpy.linalg.norm
numpy.linalg.cond
numpy.linalg.matrix_rank

From the numpy.random module:

numpy.random.beta
numpy.random.chisquare
numpy.random.f
numpy.random.gamma
numpy.random.hypergeometric
numpy.random.lognormal
numpy.random.pareto
numpy.random.randint
numpy.random.random_sample
numpy.random.ranf
numpy.random.rayleigh
numpy.random.sample
numpy.random.shuffle
numpy.random.standard_gamma
numpy.random.triangular
numpy.random.weibull

(PR-#9099)
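The sketch below shows the kind of call site this fix is aimed at: two of the listed NumPy functions called with their public parameter names as keyword arguments. The wrapper names are assumptions for this example, and it is not claimed that these exact calls failed in earlier releases. The fallback decorator only keeps the snippet runnable without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python.
    def njit(func):
        return func


@njit
def mirrored(grid):
    # `m` is the parameter name used by numpy.fliplr.
    return np.fliplr(m=grid)


@njit
def merged(first, second):
    # `ar1` and `ar2` are the parameter names used by numpy.union1d.
    return np.union1d(ar1=first, ar2=second)


grid = np.array([[1, 2], [3, 4]])
print(mirrored(grid))  # [[2 1] [4 3]]
print(merged(np.array([1, 3, 5]), np.array([3, 4])))  # [1 3 4 5]
```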

Changes 

Support for `@numba.extending.intrinsic(prefer_literal=True)`

In the high-level extension API, the prefer_literal option is added to the numba.extending.intrinsic decorator to prioritize the use of literal types when available. This has the same behavior as the prefer_literal option in the numba.extending.overload decorator.

(PR-#6647)

Deprecations 

Deprecation of old-style `NUMBA_CAPTURED_ERRORS`

A deprecation schedule has been added for NUMBA_CAPTURED_ERRORS=old_style. NUMBA_CAPTURED_ERRORS=new_style will become the default in future releases. Details are documented at https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-old-style-numba-captured-errors

(PR-#9090)

Pull-Requests 

PR #6647: Support prefer_literal option for intrinsic decorator (ashutoshvarma sklam)
PR #7262: fix order of handling and casting (esc)
PR #7543: Support for all modes in np.correlate and np.convolve (jeertmans)
PR #7582: Use get_parfor_writes to detect illegal array access that prevents fusion. (DrTodd13)
PR #8371: Added binomial distribution (esc kc611)
PR #8462: Add PyBytes_AsString and PyBytes_AsStringAndSize (ianna)
PR #8633: DOC: Convert vectorize and guvectorize examples to doctests (Matt711)
PR #8730: Update dev-docs (sgbaird esc)
PR #8792: Added towncrier as a github workflow (kc611)
PR #8854: Updated mk_alloc to support Numba-Dpex compute follows data. (mingjie-intel)
PR #8861: CUDA: Don’t add device kwarg for jit registry (gmarkall)
PR #8871: Don’t return the function in CallConv.decorate_function() (gmarkall)
PR #8885: Fix np.allclose not handling default args (guilhermeleobas)
PR #8892: Add support for __*matmul__ methods in jitclass (louisamand)
PR #8895: CUDA: Enable caching functions that use CG (gmarkall)
PR #8906: Add support for reflected dunder methods in jitclass (louisamand)
PR #8911: Remove isinstance experimental feature warning (guilhermeleobas)
PR #8916: Bump llvmlite requirement to 0.41.0dev0 (sklam)
PR #8925: Update release checklist template (sklam)
PR #8937: Remove old Website development documentation (esc gmarkall)
PR #8944: Add exceptions to np.isclose (guilhermeleobas)
PR #8974: CUDA: Add binary ufunc support (Matt711)
PR #8976: Fix index URL for ptxcompiler/cubinlinker packages. (bdice)
PR #8978: Import MVC packages when using MVCLinker. (bdice)
PR #8983: Fix typo in deprecation.rst (dsgibbons)
PR #8988: support for latest CUDA driver codes #8363 (s1Sharp)
PR #8995: Allow libraries that implement __array_ufunc__ to override DUFunc.__c… (jpivarski)
PR #9007: CUDA: Add comparison ufunc support (Matt711)
PR #9012: RVSDG-frontend (sklam)
PR #9021: update the release checklist following 0.57.1rc1 (esc)
PR #9022: fix: update the C++ ABI repo reference (emmanuel-ferdman)
PR #9028: Replace use of imp module removed in 3.12 (hauntsaninja)
PR #9034: CUDA libs test: Report the absolute path of the loaded libcuda.so on Linux, + other improvements (gmarkall)
PR #9035: CUDA: Allow for debuginfo in nvdisasm output (Matt711)
PR #9037: Recognize additional functions as being pure or not having side effects. (DrTodd13)
PR #9039: Correct git clone link in installation instructions. (ellifteria)
PR #9040: Remove NVVM 3.4 and CTK 11.0 / 11.1 support (gmarkall)
PR #9046: copy the change log changes for 0.57.1 to main (esc)
PR #9050: Update CODEOWNERS (sklam)
PR #9051: Add CUDA CFG support (Matt711)
PR #9056: adding weekly meeting notes script (esc)
PR #9068: Adding np.geomspace (KrisMinchev)
PR #9069: Fix towncrier error due to importlib_resources upgrade (sklam)
PR #9072: Fix support for overloading dispatcher with non-compatible first-class functions (gmarkall sklam)
PR #9074: Add np.trim_zeros (sungraek guilhermeleobas)
PR #9082: Add np.vsplit, np.hsplit, and np.dsplit (KrisMinchev)
PR #9083: Removed windows 32 references from code and documentation (kc611)
PR #9085: Add tests for np.row_stack (KrisMinchev)
PR #9086: Support NVRTC using ctypes binding (testhound gmarkall)
PR #9087: Add trimseq from np.polynomial.polyutils and polyadd, polysub, polymul from np.polynomial.polynomial (KrisMinchev)
PR #9088: Fix: Issue 9063 - CUDA atomics tests failing with CUDA 12.2 (gmarkall)
PR #9090: Add deprecation notice for old_style error capturing. (esc sklam)
PR #9094: Add support for a ‘max’ level to NUMBA_OPT environment variable. (stuartarchibald)
PR #9095: Support dtype keyword in arange_parallel_impl (DrTodd13 sklam)
PR #9105: NumPy 1.25 support (PR #9011) continued (gmarkall apmasell)
PR #9111: Fixes ReST syntax error in PR#9099 (stuartarchibald gmarkall sklam apmasell)
PR #9112: Fixups for PR#9100 (stuartarchibald sklam)
PR #9113: Add support for np.diagflat (KrisMinchev)
PR #9114: update np min to 122 (stuartarchibald esc)
PR #9117: Fixed towncrier template rendering (kc611)
PR #9118: Add support for np.resize() (KrisMinchev)
PR #9120: Update conda-recipe for numba-rvsdg (sklam)
PR #9127: Fix accidental cffi test deps, refactor cffi skipping (gmarkall)
PR #9128: Merge rvsdg_frontend branch to main (esc sklam)
PR #9152: Fix old_style error capturing deprecation warnings (sklam)
PR #9159: Fix uncaught exception in find_file() (gmarkall)
PR #9173: Towncrier fixups (Continue #9158 and retarget to main branch) (sklam)
PR #9181: Remove extra decrefs in RNG (sklam)
PR #9190: Fix issue with incompatible multiprocessing context in test. (stuartarchibald)