Version 0.58.0 (20 September 2023)
This is a major Numba release. Numba now uses towncrier to create the release notes, so please find a summary of all noteworthy items below.
Highlights
Added towncrier
This PR adds towncrier as a GitHub workflow for checking release notes.
From this PR onwards, every PR made to Numba requires an appropriate
release note associated with it. The reviewer may decide to skip adding
release notes in smaller PRs with minimal impact by adding a
skip_release_notes label to the PR.
(PR-#8792)
The minimum supported NumPy version is 1.22.
Following NEP-0029, the minimum supported NumPy version is now 1.22.
(PR-#9093)
Add support for NumPy 1.25
Extend Numba to support new and changed features released in NumPy 1.25.
(PR-#9011)
Remove NVVM 3.4 and CTK 11.0 / 11.1 support
Support for CUDA toolkits < 11.2 is removed.
(PR-#9040)
Removal of Windows 32-bit Support
From this release onwards, Numba no longer supports 32-bit Windows operating systems.
(PR-#9083)
The minimum llvmlite version is now 0.41.0.
The minimum required version of llvmlite is now version 0.41.0.
(PR-#8916)
Added RVSDG-frontend
This PR is preliminary work on adding an RVSDG frontend for processing bytecode. RVSDG (Regionalized Value-State Dependence Graph) provides a dataflow-centric view instead of a traditional SSA-CFG view, which should allow the compiler to be simplified in the future.
(PR-#9012)
New Features
numba.experimental.jitclass gains support for __*matmul__ methods.
numba.experimental.jitclass now has support for the following methods:
__matmul__
__imatmul__
__rmatmul__
(PR-#8892)
numba.experimental.jitclass gains support for reflected "dunder" methods.
numba.experimental.jitclass now has support for the following methods:
__radd__
__rand__
__rfloordiv__
__rlshift__
__ror__
__rmod__
__rmul__
__rpow__
__rrshift__
__rsub__
__rtruediv__
__rxor__
(PR-#8906)
Add support for value max to NUMBA_OPT.
The optimisation level that Numba applies when compiling can be set through the
environment variable NUMBA_OPT. This has historically been a value between
0 and 3 (inclusive). Support for the value max has now been added. This is a
Numba-specific optimisation level which indicates that the user would like Numba
to try running the most optimisation possible, potentially trading a longer
compilation time for better run-time performance. In practice, use of the max
level of optimisation may or may not benefit the run-time or compile-time
performance of user code, but it has been added to present an easy-to-access
option for users to try if they so wish.
(PR-#9094)
Improvements
Updates to numba.core.pythonapi.
Support for Python C-API functions PyBytes_AsString and
PyBytes_AsStringAndSize is added to numba.core.pythonapi.PythonAPI as
bytes_as_string and bytes_as_string_and_size methods respectively.
(PR-#8462)
Support for isinstance is now non-experimental.
Support for the isinstance built-in function has moved from being considered
an experimental feature to a fully supported feature.
(PR-#8911)
NumPy Support
All modes are supported in numpy.correlate and numpy.convolve.
All values for the mode argument to numpy.correlate and
numpy.convolve are now supported.
(PR-#7543)
@vectorize accommodates arguments implementing __array_ufunc__.
Universal functions (ufuncs) created with numba.vectorize will now
respect arguments implementing __array_ufunc__ (NEP-13) to allow pre- and
post-processing of arguments and return values when the ufunc is called from the
interpreter.
(PR-#8995)
Added support for np.geomspace function.
This PR improves on #4074 by
adding support for np.geomspace. The current implementation only supports
scalar start and stop parameters.
(PR-#9068)
Added support for np.vsplit, np.hsplit, np.dsplit.
This PR improves on #4074 by adding support for np.vsplit, np.hsplit, and np.dsplit.
(PR-#9082)
Added support for np.row_stack function.
Support is added for numpy.row_stack.
(PR-#9085)
Added support for functions np.polynomial.polyutils.trimseq, as well as functions polyadd, polysub, polymul from np.polynomial.polynomial.
Support is added for np.polynomial.polyutils.trimseq, np.polynomial.polynomial.polyadd, np.polynomial.polynomial.polysub, np.polynomial.polynomial.polymul.
(PR-#9087)
Added support for np.diagflat function.
Support is added for numpy.diagflat.
(PR-#9113)
Added support for np.resize function.
Support is added for numpy.resize.
(PR-#9118)
Add np.trim_zeros
Support for np.trim_zeros() is added.
(PR-#9074)
CUDA Changes
Bitwise operation ufunc support for the CUDA target.
Support is added for some ufuncs associated with bitwise operation on the
CUDA target. Namely:
numpy.bitwise_and
numpy.bitwise_or
numpy.bitwise_not
numpy.bitwise_xor
numpy.invert
numpy.left_shift
numpy.right_shift
(PR-#8974)
Add support for the latest CUDA driver codes.
Support is added for the latest set of CUDA driver codes.
(PR-#8988)
Add NumPy comparison ufunc in CUDA
This PR adds support for comparison ufuncs for the CUDA target
(e.g. numpy.greater, numpy.greater_equal, numpy.less_equal, etc.).
(PR-#9007)
Report absolute path of libcuda.so on Linux
numba -s now reports the absolute path to libcuda.so on Linux, to aid
troubleshooting driver issues, particularly on WSL2 where a Linux driver can
incorrectly be installed in the environment.
(PR-#9034)
Add debuginfo support to nvdisasm output.
Support is added for debuginfo (source line and inlining information) in
functions that make calls through nvdisasm. For example the CUDA dispatcher
.inspect_sass method output is now augmented with this information.
(PR-#9035)
Add CUDA SASS CFG Support
This PR adds support for getting the SASS CFG in dot language format.
It adds an inspect_sass_cfg() method to CUDADispatcher and the -cfg
flag to the nvdisasm command line tool.
(PR-#9051)
Support NVRTC using the ctypes binding
NVRTC can now be used when the ctypes binding is in use, enabling float16, and linking CUDA C / C++ sources without needing the NVIDIA CUDA Python bindings.
(PR-#9086)
Fix CUDA atomics tests with toolkit 12.2
CUDA 12.2 generates slightly different PTX for some atomics, so the relevant tests are updated to look for the correct instructions when 12.2 is used.
(PR-#9088)
Bug Fixes
Handling of different-sized unsigned integer indexes is fixed in numba.typed.List.
An issue with the order of truncation/extension and casting of unsigned integer
indexes in numba.typed.List has been fixed.
(PR-#7262)
Prevent invalid fusion
This PR fixes an issue in which an array first read in a parfor and later written in the same parfor would only be classified as used in the parfor. When a subsequent parfor also used the same array, the two parfors were fused, which should have been forbidden given that the first parfor was also writing to the array. This PR treats such arrays in a parfor as being both used and defined so that fusion is prevented.
(PR-#7582)
The numpy.allclose implementation now correctly handles default arguments.
The implementation of numpy.allclose is corrected to use TypingError to
report typing errors.
(PR-#8885)
Add type validation to numpy.isclose.
Type validation is added to the implementation of numpy.isclose.
(PR-#8944)
Fix support for overloading dispatcher with non-compatible first-class functions
Fixes an error caused by an unhandled compilation error during the casting of
Dispatcher objects into first-class functions. With the fix, users can now
overload a dispatcher with non-compatible first-class functions. Refer to
https://github.com/numba/numba/issues/9071 for details.
(PR-#9072)
Support dtype keyword argument in numpy.arange with parallel=True
Fixes the parfors transformation to support the use of the dtype keyword argument in
numpy.arange(..., dtype=dtype).
(PR-#9095)
Fix all @overloads to use parameter names that match public APIs.
Some of the Numba @overloads for functions in NumPy and Python’s built-ins
were written using parameter names that did not match those used in the API they
were overloading. As a result, calling such a function with the public
parameter names as keyword arguments at the call site would
result in a compilation error. This has now been fixed throughout
the code base, and a unit test makes a best-effort attempt to prevent
reintroduction of similar mistakes in the future. Fixed functions include:
From Python built-ins:
complex
From the Python random module:
random.seed
random.gauss
random.normalvariate
random.randrange
random.randint
random.uniform
random.shuffle
From the numpy module:
numpy.argmin
numpy.argmax
numpy.array_equal
numpy.average
numpy.count_nonzero
numpy.flip
numpy.fliplr
numpy.flipud
numpy.iinfo
numpy.isscalar
numpy.imag
numpy.real
numpy.reshape
numpy.rot90
numpy.swapaxes
numpy.union1d
numpy.unique
From the numpy.linalg module:
numpy.linalg.norm
numpy.linalg.cond
numpy.linalg.matrix_rank
From the numpy.random module:
numpy.random.beta
numpy.random.chisquare
numpy.random.f
numpy.random.gamma
numpy.random.hypergeometric
numpy.random.lognormal
numpy.random.pareto
numpy.random.randint
numpy.random.random_sample
numpy.random.ranf
numpy.random.rayleigh
numpy.random.sample
numpy.random.shuffle
numpy.random.standard_gamma
numpy.random.triangular
numpy.random.weibull
(PR-#9099)
Changes
Support for @numba.extending.intrinsic(prefer_literal=True)
In the high-level extension API, the prefer_literal option is added to the
numba.extending.intrinsic decorator to prioritize the use of literal types
when available. It behaves the same as the prefer_literal
option in the numba.extending.overload decorator.
(PR-#6647)
Deprecations
Deprecation of old-style NUMBA_CAPTURED_ERRORS
A deprecation schedule has been added for NUMBA_CAPTURED_ERRORS=old_style.
NUMBA_CAPTURED_ERRORS=new_style will become the default in a future release.
Details are documented at
https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-old-style-numba-captured-errors
(PR-#9090)
Pull-Requests
PR #6647: Support prefer_literal option for intrinsic decorator (ashutoshvarma sklam)
PR #7543: Support for all modes in np.correlate and np.convolve (jeertmans)
PR #7582: Use get_parfor_writes to detect illegal array access that prevents fusion. (DrTodd13)
PR #8462: Add PyBytes_AsString and PyBytes_AsStringAndSize (ianna)
PR #8633: DOC: Convert vectorize and guvectorize examples to doctests (Matt711)
PR #8854: Updated mk_alloc to support Numba-Dpex compute follows data. (mingjie-intel)
PR #8861: CUDA: Don’t add device kwarg for jit registry (gmarkall)
PR #8871: Don’t return the function in CallConv.decorate_function() (gmarkall)
PR #8885: Fix np.allclose not handling default args (guilhermeleobas)
PR #8892: Add support for __*matmul__ methods in jitclass (louisamand)
PR #8895: CUDA: Enable caching functions that use CG (gmarkall)
PR #8906: Add support for reflected dunder methods in jitclass (louisamand)
PR #8911: Remove isinstance experimental feature warning (guilhermeleobas)
PR #8937: Remove old Website development documentation (esc gmarkall)
PR #8944: Add exceptions to np.isclose (guilhermeleobas)
PR #8976: Fix index URL for ptxcompiler/cubinlinker packages. (bdice)
PR #8988: support for latest CUDA driver codes #8363 (s1Sharp)
PR #8995: Allow libraries that implement __array_ufunc__ to override DUFunc.__c… (jpivarski)
PR #9021: update the release checklist following 0.57.1rc1 (esc)
PR #9022: fix: update the C++ ABI repo reference (emmanuel-ferdman)
PR #9028: Replace use of imp module removed in 3.12 (hauntsaninja)
PR #9034: CUDA libs test: Report the absolute path of the loaded libcuda.so on Linux, + other improvements (gmarkall)
PR #9035: CUDA: Allow for debuginfo in nvdisasm output (Matt711)
PR #9037: Recognize additional functions as being pure or not having side effects. (DrTodd13)
PR #9039: Correct git clone link in installation instructions. (ellifteria)
PR #9040: Remove NVVM 3.4 and CTK 11.0 / 11.1 support (gmarkall)
PR #9046: copy the change log changes for 0.57.1 to main (esc)
PR #9068: Adding np.geomspace (KrisMinchev)
PR #9069: Fix towncrier error due to importlib_resources upgrade (sklam)
PR #9072: Fix support for overloading dispatcher with non-compatible first-class functions (gmarkall sklam)
PR #9074: Add np.trim_zeros (sungraek guilhermeleobas)
PR #9082: Add np.vsplit, np.hsplit, and np.dsplit (KrisMinchev)
PR #9083: Removed windows 32 references from code and documentation (kc611)
PR #9085: Add tests for np.row_stack (KrisMinchev)
PR #9086: Support NVRTC using ctypes binding (testhound gmarkall)
PR #9087: Add trimseq from np.polynomial.polyutils and polyadd, polysub, polymul from np.polynomial.polynomial (KrisMinchev)
PR #9088: Fix: Issue 9063 - CUDA atomics tests failing with CUDA 12.2 (gmarkall)
PR #9090: Add deprecation notice for old_style error capturing. (esc sklam)
PR #9094: Add support for a ‘max’ level to NUMBA_OPT environment variable. (stuartarchibald)
PR #9095: Support dtype keyword in arange_parallel_impl (DrTodd13 sklam)
PR #9105: NumPy 1.25 support (PR #9011) continued (gmarkall apmasell)
PR #9111: Fixes ReST syntax error in PR#9099 (stuartarchibald gmarkall sklam apmasell)
PR #9112: Fixups for PR#9100 (stuartarchibald sklam)
PR #9113: Add support for np.diagflat (KrisMinchev)
PR #9114: update np min to 122 (stuartarchibald esc)
PR #9118: Add support for np.resize() (KrisMinchev)
PR #9127: Fix accidental cffi test deps, refactor cffi skipping (gmarkall)
PR #9152: Fix old_style error capturing deprecation warnings (sklam)
PR #9173: Towncrier fixups (Continue #9158 and retarget to main branch) (sklam)
PR #9190: Fix issue with incompatible multiprocessing context in test. (stuartarchibald)