Operations#

These module-level functions wrap the native sparse primitives. In most cases the @ operator on COOArray, CSRArray, and CSCArray is the preferred spelling. These functions exist for explicit dispatch and for callers who prefer a functional style.

Sparse-dense operations return lazy mlx.core.array values. Sparse-sparse add / subtract and matmat operations return new sparse arrays and may synchronize structure to host because their output sparsity pattern is data-dependent.

add#

mlx_sparse.add(A, B)[source]#

Add two sparse arrays without densifying.

Computes A + B for rank-2 mlx-sparse arrays with equal shape and matching value dtype. The production path canonicalizes both operands with native sort/sum kernels, merges their CSR structures in native C++ or Metal, sums duplicate coordinates, and removes exact zero cancellations from the result.

CSR inputs return a canonical CSRArray. Homogeneous CSC inputs return a canonical CSCArray via native CSR/CSC conversion. COO and mixed-format inputs return canonical CSR output so no dense matrix is created.

Sparse+dense addition is intentionally out of scope for this release: adding a sparse matrix to a dense matrix returns a dense matrix mathematically, and this API does not hide that cost. Add or subtract a Python scalar only when the scalar is exactly zero; nonzero scalar addition is rejected for the same reason.

The output structure depends on the input structures and on exact numerical cancellation, so public sparse addition is treated as a dynamic-topology operation. Gradients through integer structure are unsupported, and no fixed-topology sparse-value autodiff contract is claimed for this dynamic operation.

Parameters:

A – Left sparse operand, or the scalar 0 for 0 + B.
B – Right sparse operand, or the scalar 0 for A + 0.

Returns:

A canonical sparse array. The result is CSR except for homogeneous CSC inputs, which return CSC.

Raises:

TypeError – If operands are dense, shapes differ, or value dtypes differ.
NotImplementedError – If nonzero scalar addition would densify.
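
The merge semantics described above (sum duplicates, then drop exact zeros) can be illustrated with a small pure-Python model. This is only a sketch of the documented behavior on (row, col, value) triplets, not the native CSR merge; the helper name canonical_add is hypothetical and is not part of mlx_sparse.

```python
from collections import defaultdict

def canonical_add(a_entries, b_entries):
    """Add two sparse matrices given as lists of (row, col, value) triplets.

    Hypothetical reference helper: duplicates are summed and entries that
    cancel to exactly zero are dropped, so the output pattern depends on
    the input values as well as the input structures.
    """
    acc = defaultdict(float)
    for row, col, value in list(a_entries) + list(b_entries):
        acc[(row, col)] += value
    # Sorting by (row, col) mimics canonical row-major ordering.
    return [(r, c, v) for (r, c), v in sorted(acc.items()) if v != 0.0]

A = [(0, 0, 1.0), (0, 0, 2.0), (1, 2, 5.0)]  # duplicate stored at (0, 0)
B = [(0, 0, -3.0), (1, 1, 4.0)]              # cancels (0, 0) exactly
result = canonical_add(A, B)
print(result)  # [(1, 1, 4.0), (1, 2, 5.0)]
```

The entry at (0, 0) disappears from the result even though both inputs store it, which is why the operation is treated as dynamic-topology.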

subtract#

mlx_sparse.subtract(A, B)[source]#

Subtract two sparse arrays without densifying.

Computes A - B for rank-2 mlx-sparse arrays with equal shape and matching value dtype. Semantics match add(): inputs are canonicalized natively, the structural union is merged in CSR form, duplicate coordinates are summed, and exact zero cancellations are pruned from the canonical output. Homogeneous CSC inputs return CSC; all other supported sparse combinations return CSR.

Sparse-dense subtraction and nonzero scalar subtraction are rejected because they would produce dense results. The scalar 0 is accepted as the additive identity: A - 0 returns A and 0 - A returns -A as a sparse array with the same structure.

The output topology is dynamic because exact cancellation can remove stored entries. Gradients through the public sparse subtraction structure are not claimed in this release.

Parameters:

A – Left sparse operand, or scalar 0 for 0 - B.
B – Right sparse operand, or scalar 0 for A - 0.

Returns:

A canonical sparse array. The result is CSR except for homogeneous CSC inputs, which return CSC.

Raises:

TypeError – If operands are dense, shapes differ, or value dtypes differ.
NotImplementedError – If nonzero scalar subtraction would densify.

kron#

mlx_sparse.kron(A, B, format=None)[source]#

Return the sparse Kronecker product of two rank-2 operands.

kron(A, B) builds the matrix whose stored entries follow row = row_A * B.shape[0] + row_B, col = col_A * B.shape[1] + col_B, and data = data_A * data_B. COO, CSR, CSC, and dense rank-2 MLX-compatible inputs are accepted; dense inputs are converted with the native mlx_sparse.fromdense() path before assembly, never with Python loops over entries.

format may be "coo", "csr", "csc", or None. The default is COO, matching the construction-oriented SciPy API. COO output is the direct native fixed-topology product and preserves duplicate structural entries if either input contains duplicates. CSR and CSC output canonicalize through native compressed conversion, summing duplicate products and returning duplicate-free compressed structures. Unsupported SciPy formats such as "bsr", "dia", "dok", and "lil" are rejected explicitly.

Value dtype promotion follows the package’s sparse value constraints: complex64 wins over real dtypes, any float32 operand yields float32, equal low-precision operands keep their dtype, and mixed float16/bfloat16 promotes to float32. Dense integer or boolean operands are converted to float32 because mlx-sparse sparse containers do not store integer or boolean value buffers in this release.
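
The promotion rules above can be written down as a small decision function. The sketch below models dtypes as plain strings and uses a hypothetical helper name, promote_kron_dtype; it restates the rules in this paragraph and is not the package's actual promotion code.

```python
LOW_PRECISION = {"float16", "bfloat16"}
SPARSE_VALUE_DTYPES = {"float32", "complex64"} | LOW_PRECISION

def _as_sparse_value_dtype(name):
    # Dense integer or boolean operands are converted to float32 first.
    if name == "bool" or name.startswith(("int", "uint")):
        return "float32"
    if name not in SPARSE_VALUE_DTYPES:
        raise ValueError(f"unsupported value dtype: {name}")
    return name

def promote_kron_dtype(dtype_a, dtype_b):
    a = _as_sparse_value_dtype(dtype_a)
    b = _as_sparse_value_dtype(dtype_b)
    if "complex64" in (a, b):
        return "complex64"   # complex64 wins over real dtypes
    if "float32" in (a, b):
        return "float32"     # any float32 operand yields float32
    if a == b:
        return a             # equal low-precision operands keep their dtype
    return "float32"         # mixed float16/bfloat16 promotes to float32

print(promote_kron_dtype("float16", "bfloat16"))
```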

Sparse-value JVP/VJP is implemented for the native COO data product when the input structures are fixed. Gradients through integer coordinates, dense-to-sparse extraction, and duplicate-summing canonicalization are not part of the differentiable contract.

Parameters:

A – Left COO, CSR, CSC, or dense rank-2 operand.
B – Right COO, CSR, CSC, or dense rank-2 operand.
format – Output format, one of None, "coo", "csr", or "csc". None defaults to "coo".

Returns:

A COOArray, CSRArray, or CSCArray with shape (A.shape[0] * B.shape[0], A.shape[1] * B.shape[1]).

Raises:

ValueError – If an operand is not rank-2, the requested format is unknown, or output dimensions exceed MLX limits.
TypeError – If format is not a string or None.
NotImplementedError – If a known unsupported SciPy sparse format is requested.
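
The coordinate formula for the COO product can be checked with a short pure-Python sketch. The helper kron_coo below is hypothetical and works on (row, col, value) triplets; it mirrors the documented index arithmetic and deliberately does not merge duplicates, matching the COO output described above.

```python
def kron_coo(a_entries, b_entries, b_shape):
    """Kronecker product of two triplet lists (hypothetical reference helper)."""
    b_rows, b_cols = b_shape
    out = []
    for row_a, col_a, val_a in a_entries:
        for row_b, col_b, val_b in b_entries:
            out.append((
                row_a * b_rows + row_b,   # row = row_A * B.shape[0] + row_B
                col_a * b_cols + col_b,   # col = col_A * B.shape[1] + col_B
                val_a * val_b,            # data = data_A * data_B
            ))
    return out

A = [(0, 0, 1.0), (1, 1, 2.0)]   # 2x2 diagonal matrix
B = [(0, 1, 3.0), (1, 0, 4.0)]   # 2x2 anti-diagonal matrix
K = kron_coo(A, B, b_shape=(2, 2))
print(K)  # [(0, 1, 3.0), (1, 0, 4.0), (2, 3, 6.0), (3, 2, 8.0)]
```

The output always holds nnz_A * nnz_B triplets, which is why the COO product has a fixed topology for fixed input structures.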

kronsum#

mlx_sparse.kronsum(A, B, format=None)[source]#

Return the Kronecker sum of two square sparse or dense matrices.

The Kronecker sum is defined as kron(I_n, A) + kron(B, I_m) for A.shape == (m, m) and B.shape == (n, n). Inputs may be COO, CSR, CSC, or dense rank-2 arrays. Dense inputs are extracted with native mlx_sparse.fromdense(); the two Kronecker products are assembled with native COO kernels and the sum is merged with native sparse addition.

format may be "coo", "csr", "csc", or None. The default is COO. The intermediate sum is canonical CSR, so returned CSR and CSC outputs are canonical; returned COO is produced by native CSR-to-COO expansion and is also canonical.

Parameters:

A – Left square COO, CSR, CSC, or dense rank-2 operand.
B – Right square COO, CSR, CSC, or dense rank-2 operand.
format – Output format, one of None, "coo", "csr", or "csc". None defaults to "coo".

Returns:

A sparse array with shape (A.shape[0] * B.shape[0], A.shape[1] * B.shape[1]).

Raises:

ValueError – If either operand is not square or if output shape/nnz limits are exceeded.
TypeError – If format is not a string or None.
NotImplementedError – If a known unsupported SciPy sparse format is requested.
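
As a worked illustration of the definition kron(I_n, A) + kron(B, I_m), the sketch below builds the Kronecker sum from triplet lists in pure Python. All helper names are hypothetical; the final merge sums duplicates and drops exact zeros, which is an assumption carried over from the documented behavior of sparse addition.

```python
from collections import defaultdict

def kron_entries(a_entries, b_entries, b_shape):
    b_rows, b_cols = b_shape
    return [
        (ra * b_rows + rb, ca * b_cols + cb, va * vb)
        for ra, ca, va in a_entries
        for rb, cb, vb in b_entries
    ]

def identity_entries(n):
    return [(i, i, 1.0) for i in range(n)]

def kronsum_entries(a_entries, m, b_entries, n):
    """Kronecker sum for A of shape (m, m) and B of shape (n, n)."""
    left = kron_entries(identity_entries(n), a_entries, (m, m))   # kron(I_n, A)
    right = kron_entries(b_entries, identity_entries(m), (m, m))  # kron(B, I_m)
    acc = defaultdict(float)
    for row, col, value in left + right:
        acc[(row, col)] += value
    return [(r, c, v) for (r, c), v in sorted(acc.items()) if v != 0.0]

A = [(0, 0, 1.0), (0, 1, 2.0), (1, 1, 3.0)]  # [[1, 2], [0, 3]], m = 2
B = [(0, 1, 5.0)]                            # [[0, 5], [0, 0]], n = 2
S = kronsum_entries(A, 2, B, 2)
print(S)
```

Here the result is the 4x4 matrix with A repeated on the block diagonal and 5 * I_2 in the upper-right block.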

csr_matvec#

mlx_sparse.csr_matvec(a, x)[source]#

Multiply a CSR sparse matrix by a dense vector.

Computes y = A @ x where A is a CSRArray and x is a rank-1 dense array. The result is added to the MLX computation graph and not evaluated eagerly.

On Apple Silicon, the Metal backend dispatches a scalar row kernel for short rows and a vector-reduction kernel for long rows. CPU and GPU paths support float32, float16, bfloat16, and complex64 values with int32 or int64 indices.

Parameters:

a (CSRArray) – The sparse matrix, shape (n_rows, n_cols).
x – Dense vector, shape (n_cols,). Converted to mx.array if needed. Must have the same dtype as a.data.

Returns:

Dense vector of shape (n_rows,) with the same dtype as a.data.

Raises:

TypeError – If a is not a CSRArray, or if the dtypes of a.data and x do not match.
ValueError – If shape constraints are violated.

Return type:

mlx.core.array

Example:

import mlx.core as mx
import mlx_sparse as ms

y = a @ x  # preferred via __matmul__
y = ms.csr_matvec(a, x)  # explicit call
mx.eval(y)
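
For readers who want to check results by hand, the row-wise product that csr_matvec computes can be modeled in pure Python. This is a reference sketch over plain lists (indptr, indices, data), with a hypothetical helper name; it says nothing about how the native kernels are scheduled.

```python
def csr_matvec_reference(indptr, indices, data, x):
    """y[row] = sum of data[k] * x[indices[k]] over the row's stored entries."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        total = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            total += data[k] * x[indices[k]]
        y[row] = total
    return y

# Dense view of the matrix:
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 4]]
indptr = [0, 2, 2, 4]
indices = [0, 2, 1, 2]
data = [1.0, 2.0, 3.0, 4.0]
y_ref = csr_matvec_reference(indptr, indices, data, [1.0, 10.0, 100.0])
print(y_ref)  # [201.0, 0.0, 430.0]
```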

coo_matvec#

mlx_sparse.coo_matvec(a, x)[source]#

Multiply a COO sparse matrix by a dense vector.

Parameters: a (COOArray)
Return type: mlx.core.array

csc_matvec#

mlx_sparse.csc_matvec(a, x)[source]#

Multiply a CSC sparse matrix by a dense vector.

Parameters: a (CSCArray)
Return type: mlx.core.array

csc_matvec_transpose#

mlx_sparse.csc_matvec_transpose(a, x)[source]#

Multiply the transpose of a CSC sparse matrix by a dense vector.

Parameters: a (CSCArray)
Return type: mlx.core.array

csr_matmul#

mlx_sparse.csr_matmul(a, rhs)[source]#

Multiply a CSR sparse matrix by a dense matrix.

Computes Y = A @ B where A is a CSRArray and B is a rank-2 or batched dense array. The result is added to the MLX computation graph and not evaluated eagerly.

On Apple Silicon, the Metal backend dispatches scalar output-element kernels for short rows and vector-reduction kernels for long rows. CPU and GPU paths support float32, float16, bfloat16, and complex64 values with int32 or int64 indices.

Parameters:

a (CSRArray) – The sparse matrix, shape (n_rows, n_cols).
rhs – Dense matrix, shape (n_cols, k), or batched dense matrix with sparse dimension at rhs.shape[-2]. Converted to mx.array if needed. Must have the same dtype as a.data.

Returns:

Dense matrix or batched dense matrix with sparse dimension replaced by n_rows and the same dtype as a.data.

Raises:

TypeError – If a is not a CSRArray, or if dtype constraints are violated.
ValueError – If shape constraints are violated.

Return type:

mlx.core.array

Example:

import mlx.core as mx
import mlx_sparse as ms

Y = a @ B  # preferred via __matmul__
Y = ms.csr_matmul(a, B)  # explicit call
mx.eval(Y)

coo_matmul#

mlx_sparse.coo_matmul(a, rhs)[source]#

Multiply a COO sparse matrix by a dense matrix or batched matrices.

Parameters: a (COOArray)
Return type: mlx.core.array

csc_matmul#

mlx_sparse.csc_matmul(a, rhs)[source]#

Multiply a CSC sparse matrix by a dense matrix or batched matrices.

Parameters: a (CSCArray)
Return type: mlx.core.array

csr_batched_matvec#

mlx_sparse.csr_batched_matvec(a, rhs)[source]#

Multiply a CSR sparse matrix by a batch of dense vectors.

Computes Y[b] = A @ X[b] for X with shape (..., n_cols) and returns shape (..., n_rows). The implementation uses native batched CPU/Metal kernels after flattening any leading batch dimensions.

Parameters: a (CSRArray)
Return type: mlx.core.array
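
The flattening of leading batch dimensions amounts to simple shape bookkeeping, sketched below in pure Python. The helper batched_matvec_shapes is hypothetical and only models the shape rule stated above; it does not call into mlx_sparse.

```python
from math import prod

def batched_matvec_shapes(rhs_shape, n_rows, n_cols):
    """Return (flattened RHS shape, result shape) for X of shape (..., n_cols)."""
    if len(rhs_shape) < 1 or rhs_shape[-1] != n_cols:
        raise ValueError("last RHS axis must equal n_cols")
    batch = tuple(rhs_shape[:-1])
    flat_shape = (prod(batch), n_cols)   # all leading batch axes collapsed into one
    out_shape = (*batch, n_rows)         # batch axes restored on the output
    return flat_shape, out_shape

flat, out = batched_matvec_shapes((4, 5, 7), n_rows=3, n_cols=7)
print(flat, out)  # (20, 7) (4, 5, 3)
```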

coo_batched_matvec#

mlx_sparse.coo_batched_matvec(a, rhs)[source]#

Multiply a COO sparse matrix by a batch of dense vectors.

Parameters: a (COOArray)
Return type: mlx.core.array

csc_batched_matvec#

mlx_sparse.csc_batched_matvec(a, rhs)[source]#

Multiply a CSC sparse matrix by a batch of dense vectors.

Parameters: a (CSCArray)
Return type: mlx.core.array

csr_batched_matmul#

mlx_sparse.csr_batched_matmul(a, rhs)[source]#

Multiply a CSR sparse matrix by a batch of dense matrices.

rhs must have shape (..., n_cols, k) and the result has shape (..., n_rows, k). For rank-2 dense matrices, use csr_matmul().

Parameters: a (CSRArray)
Return type: mlx.core.array

coo_batched_matmul#

mlx_sparse.coo_batched_matmul(a, rhs)[source]#

Multiply a COO sparse matrix by a batch of dense matrices.

Parameters: a (COOArray)
Return type: mlx.core.array

csc_batched_matmul#

mlx_sparse.csc_batched_matmul(a, rhs)[source]#

Multiply a CSC sparse matrix by a batch of dense matrices.

Parameters: a (CSCArray)
Return type: mlx.core.array

csr_matmat#

mlx_sparse.csr_matmat(a, rhs)[source]#

Multiply two CSR sparse matrices and return a canonical CSR matrix.

Computes C = A @ B where both A and B are CSRArray instances. The output sparsity pattern is not known at graph-build time, so this operation performs a native C++ structural assembly pass on the host (calling mx.eval on the input arrays internally) and returns a new CSRArray with canonical format.

Because the output size is data-dependent, this operation is not representable as a fixed-shape MLX primitive. It is suitable for one-shot matrix products and matrix-power computations, but is not appropriate inside a JIT-compiled function.

Parameters:

a (CSRArray) – Left-hand sparse matrix, shape (m, k).
rhs (CSRArray) – Right-hand sparse matrix, shape (k, n).

Returns:

A canonical CSRArray with shape (m, n), has_canonical_format=True, and sorted_indices=True.

Raises:

TypeError – If either argument is not a CSRArray.
ValueError – If the inner dimensions do not match (a.shape[1] != rhs.shape[0]).

Return type:

CSRArray

Example:

import mlx_sparse as ms

# Compute the square of a sparse matrix
C = A @ A  # dispatches csr_matmat when A is CSRArray
C = ms.csr_matmat(A, A)  # explicit call

# Chain sparse matrix products
D = ms.csr_matmat(ms.csr_matmat(A, B), C)
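
To see why the output pattern is data-dependent, it can help to look at a row-wise sparse product written out in pure Python. The sketch below follows the classic row-accumulator (Gustavson-style) scheme on plain CSR lists; it is an assumption that this resembles the native assembly, and the helper name is hypothetical. It keeps explicit zeros produced by cancellation, since this section does not state whether matmat prunes them.

```python
def csr_matmat_reference(a, b):
    """Multiply two CSR matrices given as (indptr, indices, data) tuples."""
    a_indptr, a_indices, a_data = a
    b_indptr, b_indices, b_data = b
    out_indptr, out_indices, out_data = [0], [], []
    for row in range(len(a_indptr) - 1):
        acc = {}  # column -> accumulated value for this output row
        for k in range(a_indptr[row], a_indptr[row + 1]):
            mid = a_indices[k]
            for j in range(b_indptr[mid], b_indptr[mid + 1]):
                col = b_indices[j]
                acc[col] = acc.get(col, 0.0) + a_data[k] * b_data[j]
        for col in sorted(acc):  # sorted, duplicate-free column indices
            out_indices.append(col)
            out_data.append(acc[col])
        out_indptr.append(len(out_indices))  # row length known only after the pass
    return out_indptr, out_indices, out_data

# A = [[1, 2], [0, 3]], so A @ A = [[1, 8], [0, 9]]
A_csr = ([0, 2, 3], [0, 1, 1], [1.0, 2.0, 3.0])
C_ref = csr_matmat_reference(A_csr, A_csr)
print(C_ref)  # ([0, 2, 3], [0, 1, 1], [1.0, 8.0, 9.0])
```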

coo_matmat#

mlx_sparse.coo_matmat(a, rhs)[source]#

Multiply two COO sparse matrices and return a canonical COO matrix.

The native implementation groups both operands by coordinate rows, performs a symbolic row pass to size the result, then fills sorted output coordinates without routing through CSR.

Parameters:

a (COOArray)
rhs (COOArray)

Return type:

COOArray

csc_matmat#

mlx_sparse.csc_matmat(a, rhs)[source]#

Multiply two CSC sparse matrices and return a canonical CSC matrix.

The native implementation traverses right-hand columns and left-hand compressed columns directly, producing sorted row indices per output column. It does not convert to CSR internally.

Parameters:

a (CSCArray)
rhs (CSCArray)

Return type:

CSCArray

Reductions#

All sparse containers expose reduction methods as well as module-level helper functions. row_sums / col_sums return the same dtype as the sparse values, row_norms / col_norms return float32, and diagonal / trace sum duplicate diagonal entries.

Format	Functions
COO	`coo_row_sums()`, `coo_col_sums()`, `coo_row_norms()`, `coo_col_norms()`, `coo_diagonal()`, `coo_trace()`
CSR	`csr_row_sums()`, `csr_col_sums()`, `csr_row_norms()`, `csr_diagonal()`, `csr_trace()`
CSC	`csc_row_sums()`, `csc_col_sums()`, `csc_row_norms()`, `csc_col_norms()`, `csc_diagonal()`, `csc_trace()`

COO and CSC reductions are native C++/Metal paths. Norm reductions use dense matrix semantics, so non-canonical COO/CSC inputs are canonicalized before norming to ensure duplicate coordinates are summed before the square is taken. Sparse-value JVP/VJP is implemented for row_sums, col_sums, diagonal, and trace on COO, CSR, and CSC. Row/column norms are forward-only in v0.0.6b0 until zero-norm and complex-norm gradient behavior is specified.
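
The difference between summing duplicates before or after squaring is easy to miss, so here is a small pure-Python comparison on COO triplets. Both helpers are hypothetical illustrations of the dense-matrix semantics described above, not the native reductions.

```python
import math
from collections import defaultdict

def row_norms_dense_semantics(entries, n_rows):
    """Merge duplicate coordinates first, then take the row-wise 2-norm."""
    merged = defaultdict(float)
    for row, col, value in entries:
        merged[(row, col)] += value
    squares = [0.0] * n_rows
    for (row, _col), value in merged.items():
        squares[row] += value * value
    return [math.sqrt(s) for s in squares]

def row_norms_naive(entries, n_rows):
    """Square each stored entry independently (wrong when duplicates exist)."""
    squares = [0.0] * n_rows
    for row, _col, value in entries:
        squares[row] += value * value
    return [math.sqrt(s) for s in squares]

entries = [(0, 0, 3.0), (0, 0, 1.0), (1, 1, 2.0)]  # duplicate at (0, 0)
correct = row_norms_dense_semantics(entries, n_rows=2)
naive = row_norms_naive(entries, n_rows=2)
print(correct)  # [4.0, 2.0]
print(naive)    # row 0 is sqrt(10), not 4
```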

todense#

mlx_sparse.todense(array)[source]#

Materialize a sparse array as a dense MLX array.

Convenience wrapper that calls array.todense() on any sparse container. Duplicate entries are summed, consistent with canonicalize().todense().

Parameters: array – A COOArray, CSRArray, or CSCArray instance.
Returns: Dense array of shape (n_rows, n_cols) with the same dtype as array.data.
Raises: TypeError – If array does not have a todense method.
Return type: mlx.core.array

Example:

import mlx_sparse as ms

dense = ms.todense(my_csr)
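
The duplicate-summing behavior can be pictured with a pure-Python densification of COO triplets into nested lists. The helper todense_reference is hypothetical and exists only to illustrate the semantics stated above.

```python
def todense_reference(entries, shape):
    """Scatter (row, col, value) triplets into a dense nested list, summing duplicates."""
    n_rows, n_cols = shape
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for row, col, value in entries:
        dense[row][col] += value
    return dense

dense_ref = todense_reference([(0, 1, 2.0), (0, 1, 0.5), (1, 0, -1.0)], (2, 2))
print(dense_ref)  # [[0.0, 2.5], [-1.0, 0.0]]
```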

identity_like#

mlx_sparse.identity_like(x)[source]#

Return a native MLX copy of x.

This function exists as an extension smoke test. It passes x through the native _ext module (if available) and returns an identical MLX array. For production code, prefer mlx.core operations directly.

Parameters: x (mlx.core.array) – Any MLX array.
Returns: An MLX array with the same shape, dtype, and values as x.
Return type: mlx.core.array

is_available#

mlx_sparse.is_available()[source]#

Return True if the native C++ extension is loaded.

The mlx-sparse native extension (_ext) provides MLX-primitive implementations of sparse operations with CPU and Metal backends. When it is absent (e.g. a pure-source checkout without a build step), all operations fall back to NumPy-based Python implementations in mlx_sparse._fallback.

Returns: True if mlx_sparse._ext was successfully imported at package load time, False otherwise.
Return type: bool

Example:

import mlx_sparse as ms

if not ms.is_available():
    print("Native extension not found. Using Python fallback.")

Dispatch summary#

The @ operator on CSRArray dispatches based on the type and rank of rhs:

C = A @ B  # rhs is CSRArray -> csr_matmat(A, B) returns CSRArray
y = A @ x  # rhs.ndim == 1 -> csr_matvec(A, x) returns mx.array
Y = A @ X  # rhs.ndim == 2 -> csr_matmul(A, X) returns mx.array
Yb = A @ Xb  # rhs.ndim > 2 -> csr_matmul(A, Xb) returns mx.array

The explicit function calls accept the same arguments:

y = ms.csr_matvec(A, x)
Y = ms.csr_matmul(A, X)
yb = ms.csr_batched_matvec(A, xb)
Yb = ms.csr_batched_matmul(A, Xb)
C = ms.csr_matmat(A, B)

For COOArray and CSCArray, dense RHS dispatch mirrors CSR: rank-1 RHS uses format-native matvec, rank-2 RHS uses format-native matmul, and higher-rank RHS is flattened into the corresponding native batched primitive. Sparse-sparse @ supports every COO/CSR/CSC format pair. The operator result follows the left-hand sparse format: COO left operands return canonical COO, CSR left operands return canonical CSR, and CSC left operands return canonical CSC. Same-format products dispatch directly to the native format-specific kernel. Mixed-format products normalize the RHS through native format conversion before calling the same native kernel, never through dense materialization or Python loops over stored entries.

Vectorized sparse-dense products use the same native path. For COO, CSR, and CSC matrices with fixed sparse structure, mx.vmap(lambda x: A @ x)(X) routes to the format-native batched matvec primitive and mx.vmap(lambda X: A @ X)(XB) routes to the format-native batched matmul primitive. The mapped dense RHS axis may be any valid RHS axis; the primitive normalizes it internally and reports a stable mapped output axis back to MLX, so out_axes is handled by MLX. Mapping sparse data or structural index buffers is intentionally rejected in v0.0.6b0; use the explicit batched dense RHS helpers when the sparse topology is fixed and only the right-hand side is batched.

The + and - operators on COOArray, CSRArray, and CSCArray dispatch to add() and subtract(). Sparse-sparse addition supports COO, CSR, and CSC operands with equal shape and matching value dtype. Inputs are canonicalized natively, duplicate coordinates are summed, exact zero cancellations are removed, and the output is canonical. CSR inputs return CSR, homogeneous CSC inputs return CSC, and COO or mixed-format inputs return CSR. Sparse+dense addition and nonzero scalar addition are rejected because they would produce dense matrices; call A.todense() explicitly when a dense result is intended. The scalar 0 is accepted as a sparse-preserving identity.

The kron() and kronsum() structural constructors accept COO, CSR, CSC, or dense rank-2 inputs and return COO, CSR, or CSC output. Dense operands are first extracted with the native fromdense() constructor. kron is assembled by the native COO Kronecker primitive after native compressed-to-COO conversion for CSR/CSC operands. format="coo" returns the direct fixed product topology and preserves duplicate coordinates when inputs contain duplicates; format="csr" and format="csc" canonicalize through native compressed conversion and sum duplicate products. kronsum builds kron(I_n, A) + kron(B, I_m) for square inputs and uses the native sparse addition merge. Unsupported SciPy storage names such as "bsr", "dia", "dok", and "lil" are rejected explicitly.

All sparse-dense products validate that rhs.dtype == A.data.dtype. There is no implicit type promotion. See Dtype policy for the full dtype matrix.

Native dispatch notes#

Sparse-dense matvec and matmul for COO, CSR, and CSC are fixed-output primitives and stay lazy in the MLX graph. Explicit batched helpers use native C++/Metal kernels for leading batch dimensions rather than materializing dense matrices in Python.

Transpose products used by autodiff are also native. On Metal, float32 transpose matvec/matmul use atomic scatter-add kernels. Other GPU value dtypes lower through native csr_transpose followed by the ordinary native product.

Sparse-sparse add / subtract and matmat are different: the number and position of their stored output entries depend on the input structure and, for addition and subtraction, on exact cancellation, so they perform count work and synchronize enough structure to allocate compact output buffers. Sparse addition canonicalizes to CSR for the native merge and uses CPU or Metal row-merge kernels. CSR matmat uses row symbolic/numeric assembly, COO groups coordinate rows without routing through CSR, and CSC walks right-hand columns against left-hand compressed columns to produce sorted row indices in each output column. Mixed-format @ uses native RHS conversion plus the left-hand format’s native sparse-sparse product; direct mixed-format kernels should be added only when benchmarks show that the conversion overhead is material.

kron is a fixed-output structural product before optional compressed canonicalization. The native COO kernel writes nnz_A * nnz_B values and coordinates directly on CPU or Metal without dense masks and without Python loops over stored entries. Sparse-value JVP/VJP is implemented for the COO data product when input structures are fixed. Gradients through integer structural buffers, dense-to-sparse extraction, duplicate-summing canonicalization, and structural triangular compaction are unsupported.

coo_matvec / coo_matmul are native coordinate scatter products. On Metal, float32 uses atomic scatter-add over stored coordinates; other value dtypes stay native through a serial scatter kernel because Metal does not provide storage-compatible atomic adds for float16, bfloat16, or complex64.
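
A coordinate scatter product is short enough to write out in pure Python. The sketch below is a hypothetical reference helper over plain lists; the accumulation y[row] += ... is the step that needs atomic adds (or serialization) once several stored entries that target the same row are processed in parallel.

```python
def coo_matvec_reference(rows, cols, data, x, n_rows):
    """Scatter each stored entry's contribution into its output row."""
    y = [0.0] * n_rows
    for row, col, value in zip(rows, cols, data):
        y[row] += value * x[col]   # several entries may hit the same row
    return y

rows = [0, 0, 2, 0]             # (0, 0) is stored twice: a duplicate coordinate
cols = [0, 2, 1, 0]
data = [1.0, 2.0, 3.0, 4.0]
y_coo = coo_matvec_reference(rows, cols, data, [1.0, 10.0, 100.0], n_rows=3)
print(y_coo)  # [205.0, 0.0, 30.0]
```

Duplicate coordinates need no special handling here because the scatter sums their contributions.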

csc_matvec / csc_matmul are native compressed-column scatter products. Forward CSC products walk columns and scatter into output rows; on Metal, float32 uses atomic scatter-add while other value dtypes use native serial scatter. CSC transpose products are the layout’s reduction fast path: each output entry is one compressed-column dot product.
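
The contrast between the forward scatter and the transpose reduction can be sketched in pure Python on plain CSC lists. Both helpers are hypothetical reference code rather than the native kernels: the forward product writes to many output rows per column, while the transpose product computes each output entry from a single column.

```python
def csc_matvec_reference(indptr, indices, data, x, n_rows):
    """y = A @ x: walk columns and scatter into output rows."""
    y = [0.0] * n_rows
    for col in range(len(indptr) - 1):
        for k in range(indptr[col], indptr[col + 1]):
            y[indices[k]] += data[k] * x[col]
    return y

def csc_matvec_transpose_reference(indptr, indices, data, x):
    """y = A.T @ x: one dot product per compressed column, no scatter."""
    return [
        sum((data[k] * x[indices[k]] for k in range(indptr[col], indptr[col + 1])), 0.0)
        for col in range(len(indptr) - 1)
    ]

# Dense view of the matrix:
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 4]]
indptr = [0, 1, 2, 4]
indices = [0, 2, 0, 2]
data = [1.0, 3.0, 2.0, 4.0]
x = [1.0, 10.0, 100.0]
forward = csc_matvec_reference(indptr, indices, data, x, n_rows=3)
transposed = csc_matvec_transpose_reference(indptr, indices, data, x)
print(forward)     # [201.0, 0.0, 430.0]
print(transposed)  # [1.0, 300.0, 402.0]
```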