filters

occulus.filters

Point cloud filtering and downsampling operations.

All filters operate on occulus.types.PointCloud instances and return new clouds; they never mutate the input.

Available filters
  • voxel_downsample: grid-based spatial downsampling (one point kept per occupied voxel)
  • random_downsample: uniform random point selection without replacement
  • statistical_outlier_removal: remove points whose mean nearest-neighbour distance is unusually large
  • radius_outlier_removal: remove points with too few neighbours within a radius
  • crop: crop to an axis-aligned bounding box

All implementations use pure NumPy and SciPy. No optional dependencies required.

crop(cloud, bbox)

Crop a point cloud to an axis-aligned bounding box.

Parameters:

Name Type Description Default
cloud PointCloud

Input point cloud.

required
bbox tuple[float, float, float, float, float, float]

Bounding box as (xmin, ymin, zmin, xmax, ymax, zmax). Bounds are inclusive.

required

Returns:

Type Description
PointCloud

Points that fall within (inclusive of) the bounding box.

Raises:

Type Description
OcculusValidationError

If bbox does not have exactly 6 elements or any min >= max.
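
The inclusive bounds check can be sketched on a bare (N, 3) NumPy array. This is a minimal illustration, not the library's code path: the helper name crop_xyz is hypothetical, it works on raw coordinates rather than a PointCloud, and it raises ValueError in place of OcculusValidationError.

```python
from typing import Sequence

import numpy as np


def crop_xyz(xyz: np.ndarray, bbox: Sequence[float]) -> np.ndarray:
    """Return the rows of an (N, 3) array inside an inclusive bounding box.

    Hypothetical helper for illustration; it is not part of occulus.
    """
    if len(bbox) != 6:
        raise ValueError(f"bbox must have 6 elements, got {len(bbox)}")
    lo = np.asarray(bbox[:3], dtype=float)  # (xmin, ymin, zmin)
    hi = np.asarray(bbox[3:], dtype=float)  # (xmax, ymax, zmax)
    if np.any(lo >= hi):
        raise ValueError(f"bbox min values must be strictly less than max values: {bbox}")
    # A point is kept only when all three coordinates lie within the bounds.
    mask = np.all((xyz >= lo) & (xyz <= hi), axis=1)
    return xyz[mask]


points = np.array(
    [
        [0.0, 0.0, 0.0],  # on the lower corner: kept, bounds are inclusive
        [0.5, 0.5, 0.5],  # strictly inside: kept
        [1.0, 1.0, 1.0],  # on the upper corner: kept
        [1.5, 0.5, 0.5],  # x exceeds xmax: dropped
    ]
)
cropped = crop_xyz(points, (0.0, 0.0, 0.0, 1.0, 1.0, 1.0))
print(cropped.shape)  # (3, 3)
```

Note that a degenerate box (any min equal to its max) is rejected even though the bounds themselves are inclusive, so a zero-thickness slice cannot be selected this way.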

Source code in src/occulus/filters/__init__.py
def crop(
    cloud: PointCloud,
    bbox: tuple[float, float, float, float, float, float],
) -> PointCloud:
    """Crop a point cloud to an axis-aligned bounding box.

    Parameters
    ----------
    cloud : PointCloud
        Input point cloud.
    bbox : tuple[float, float, float, float, float, float]
        Bounding box as ``(xmin, ymin, zmin, xmax, ymax, zmax)``.
        Bounds are inclusive.

    Returns
    -------
    PointCloud
        Points that fall within (inclusive of) the bounding box.

    Raises
    ------
    OcculusValidationError
        If ``bbox`` does not have exactly 6 elements or any min >= max.
    """
    if len(bbox) != 6:
        raise OcculusValidationError(f"bbox must have 6 elements, got {len(bbox)}")

    xmin, ymin, zmin, xmax, ymax, zmax = bbox

    if xmin >= xmax or ymin >= ymax or zmin >= zmax:
        raise OcculusValidationError(
            f"bbox min values must be strictly less than max values: {bbox}"
        )

    xyz = cloud.xyz
    mask: NDArray[np.bool_] = (
        (xyz[:, 0] >= xmin)
        & (xyz[:, 0] <= xmax)
        & (xyz[:, 1] >= ymin)
        & (xyz[:, 1] <= ymax)
        & (xyz[:, 2] >= zmin)
        & (xyz[:, 2] <= zmax)
    )

    logger.debug(
        "crop: %d -> %d points inside bbox %s",
        cloud.n_points,
        int(mask.sum()),
        bbox,
    )
    return _subset(cloud, np.where(mask)[0])

radius_outlier_removal(cloud, radius, min_neighbors=2)

Remove points that have fewer than min_neighbors within radius.

Parameters:

Name Type Description Default
cloud PointCloud

Input point cloud.

required
radius float

Search radius in the same units as the cloud coordinates. Must be strictly positive.

required
min_neighbors int

Minimum number of neighbours (excluding self) required to keep a point, by default 2.

2

Returns:

Type Description
PointCloud

Cloud with isolated points removed.

NDArray[bool_]

Boolean inlier mask of length n_points (True = kept).

Raises:

Type Description
OcculusValidationError

If radius is not positive or min_neighbors is non-positive.
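
The neighbour-count rule can be sketched with SciPy's KDTree on a raw coordinate array. The helper name radius_inlier_mask is hypothetical and the sketch returns only the mask; it assumes a SciPy version whose KDTree.query_ball_point accepts return_length.

```python
import numpy as np
from scipy.spatial import KDTree


def radius_inlier_mask(xyz: np.ndarray, radius: float, min_neighbors: int = 2) -> np.ndarray:
    """Boolean mask of points with at least min_neighbors others within radius.

    Hypothetical helper for illustration; it is not part of occulus.
    """
    if radius <= 0:
        raise ValueError(f"radius must be positive, got {radius}")
    if min_neighbors <= 0:
        raise ValueError(f"min_neighbors must be positive, got {min_neighbors}")
    tree = KDTree(xyz)
    # return_length=True gives the number of points in each ball.
    # Every query point lies in its own ball, so subtract 1 to exclude self.
    counts = np.asarray(tree.query_ball_point(xyz, r=radius, return_length=True))
    return (counts - 1) >= min_neighbors


points = np.array(
    [
        [0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0],
        [0.0, 0.1, 0.0],
        [5.0, 5.0, 5.0],  # isolated: no neighbours within 0.5
    ]
)
mask = radius_inlier_mask(points, radius=0.5, min_neighbors=2)
print(mask)  # [ True  True  True False]
```

Each of the three clustered points sees the other two within the radius, so it meets min_neighbors=2; the isolated point sees none and is flagged for removal.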

Source code in src/occulus/filters/__init__.py
def radius_outlier_removal(
    cloud: PointCloud,
    radius: float,
    min_neighbors: int = 2,
) -> tuple[PointCloud, NDArray[np.bool_]]:
    """Remove points that have fewer than ``min_neighbors`` within ``radius``.

    Parameters
    ----------
    cloud : PointCloud
        Input point cloud.
    radius : float
        Search radius in the same units as the cloud coordinates.
        Must be strictly positive.
    min_neighbors : int, optional
        Minimum number of neighbours (excluding self) required to keep a
        point, by default 2.

    Returns
    -------
    PointCloud
        Cloud with isolated points removed.
    NDArray[np.bool_]
        Boolean inlier mask of length ``n_points`` (``True`` = kept).

    Raises
    ------
    OcculusValidationError
        If ``radius`` is not positive or ``min_neighbors`` is non-positive.
    """
    if radius <= 0:
        raise OcculusValidationError(f"radius must be positive, got {radius}")
    if min_neighbors <= 0:
        raise OcculusValidationError(f"min_neighbors must be positive, got {min_neighbors}")

    tree = KDTree(cloud.xyz)
    counts: NDArray[np.intp] = np.asarray(
        tree.query_ball_point(cloud.xyz, r=radius, return_length=True, workers=-1)
    )
    inlier_mask: NDArray[np.bool_] = (counts - 1) >= min_neighbors

    logger.debug(
        "radius_outlier_removal: removed %d/%d points (r=%.4f, min_nb=%d)",
        (~inlier_mask).sum(),
        cloud.n_points,
        radius,
        min_neighbors,
    )
    return _subset(cloud, np.where(inlier_mask)[0]), inlier_mask

random_downsample(cloud, fraction, *, seed=None)

Randomly downsample a point cloud to a fraction of its points.

Parameters:

Name Type Description Default
cloud PointCloud

Input point cloud.

required
fraction float

Fraction of points to retain, in (0.0, 1.0].

required
seed int | None

Random seed for reproducibility, by default None.

None

Returns:

Type Description
PointCloud

Downsampled cloud of the same concrete subtype as the input.

Raises:

Type Description
OcculusValidationError

If fraction is not in (0.0, 1.0] or the cloud is empty.
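
How the retained count and indices are chosen can be sketched with NumPy alone. The helper name random_indices is hypothetical; it returns the sorted indices that would be kept, and uses ValueError in place of OcculusValidationError.

```python
from typing import Optional

import numpy as np


def random_indices(n_points: int, fraction: float, seed: Optional[int] = None) -> np.ndarray:
    """Sorted indices of a uniform random subset, drawn without replacement.

    Hypothetical helper for illustration; it is not part of occulus.
    """
    if not (0.0 < fraction <= 1.0):
        raise ValueError(f"fraction must be in (0.0, 1.0], got {fraction}")
    if n_points == 0:
        raise ValueError("Cannot downsample an empty point cloud")
    rng = np.random.default_rng(seed)
    # The count is truncated toward zero, but at least one point is always kept.
    k = max(1, int(n_points * fraction))
    selected = rng.choice(n_points, size=k, replace=False)
    selected.sort()  # keep the surviving points in their original order
    return selected


idx = random_indices(10, 0.35, seed=42)
print(len(idx))  # 3, because int(10 * 0.35) == 3

# The same seed reproduces the same selection.
assert np.array_equal(idx, random_indices(10, 0.35, seed=42))
```

Because the count is truncated rather than rounded, a small cloud with a small fraction still yields one point: random_indices(3, 0.1) returns a single index.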

Source code in src/occulus/filters/__init__.py
def random_downsample(
    cloud: PointCloud,
    fraction: float,
    *,
    seed: int | None = None,
) -> PointCloud:
    """Randomly downsample a point cloud to a fraction of its points.

    Parameters
    ----------
    cloud : PointCloud
        Input point cloud.
    fraction : float
        Fraction of points to retain, in (0.0, 1.0].
    seed : int | None, optional
        Random seed for reproducibility, by default None.

    Returns
    -------
    PointCloud
        Downsampled cloud of the same concrete subtype as the input.

    Raises
    ------
    OcculusValidationError
        If ``fraction`` is not in (0.0, 1.0] or the cloud is empty.
    """
    if not (0.0 < fraction <= 1.0):
        raise OcculusValidationError(f"fraction must be in (0.0, 1.0], got {fraction}")
    if cloud.n_points == 0:
        raise OcculusValidationError("Cannot downsample an empty point cloud")

    rng = np.random.default_rng(seed)
    k = max(1, int(cloud.n_points * fraction))
    selected = rng.choice(cloud.n_points, size=k, replace=False)
    selected.sort()

    logger.debug("random_downsample: %d -> %d points", cloud.n_points, k)
    return _subset(cloud, selected)

statistical_outlier_removal(cloud, nb_neighbors=20, std_ratio=2.0)

Remove statistical outliers based on nearest-neighbour distances.

For each point the mean distance to its nb_neighbors nearest neighbours is computed. Points whose mean distance exceeds global_mean + std_ratio * global_std are classified as outliers and removed.

Parameters:

Name Type Description Default
cloud PointCloud

Input point cloud.

required
nb_neighbors int

Number of nearest neighbours to consider (excluding self), by default 20.

20
std_ratio float

Standard deviation multiplier for the removal threshold, by default 2.0. Lower values remove points more aggressively.

2.0

Returns:

Type Description
PointCloud

Cloud with outliers removed.

NDArray[bool_]

Boolean inlier mask of length n_points (True = kept).

Raises:

Type Description
OcculusValidationError

If nb_neighbors is non-positive, or is greater than or equal to the number of points.
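
The threshold rule (global mean plus std_ratio times the global standard deviation of the per-point mean distances) can be sketched on a raw coordinate array with SciPy's KDTree. The helper name statistical_inlier_mask is hypothetical and the sketch returns only the mask.

```python
import numpy as np
from scipy.spatial import KDTree


def statistical_inlier_mask(
    xyz: np.ndarray, nb_neighbors: int = 20, std_ratio: float = 2.0
) -> np.ndarray:
    """Boolean mask of points whose mean neighbour distance is within the threshold.

    Hypothetical helper for illustration; it is not part of occulus.
    """
    if nb_neighbors <= 0:
        raise ValueError(f"nb_neighbors must be positive, got {nb_neighbors}")
    if nb_neighbors >= len(xyz):
        raise ValueError("nb_neighbors must be less than the number of points")
    tree = KDTree(xyz)
    # Ask for one extra neighbour: the nearest hit is the query point itself.
    distances, _ = tree.query(xyz, k=nb_neighbors + 1)
    mean_distances = distances[:, 1:].mean(axis=1)  # drop the self column
    threshold = mean_distances.mean() + std_ratio * mean_distances.std()
    return mean_distances <= threshold


# A flat 5 x 5 grid with unit spacing, plus one point far away from it.
grid = np.array([[x, y, 0.0] for x in range(5) for y in range(5)], dtype=float)
points = np.vstack([grid, [[50.0, 50.0, 50.0]]])

mask = statistical_inlier_mask(points, nb_neighbors=4, std_ratio=2.0)
print(int(mask.sum()), bool(mask[-1]))  # 25 False
```

The grid points have mean neighbour distances between 1 and about 1.35, while the far point's is above 80, which puts it well beyond two standard deviations of the global mean. Since the threshold is computed from the data itself, a single extreme point also inflates the standard deviation; lowering std_ratio makes the filter more aggressive.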

Source code in src/occulus/filters/__init__.py
def statistical_outlier_removal(
    cloud: PointCloud,
    nb_neighbors: int = 20,
    std_ratio: float = 2.0,
) -> tuple[PointCloud, NDArray[np.bool_]]:
    """Remove statistical outliers based on nearest-neighbour distances.

    For each point the mean distance to its ``nb_neighbors`` nearest
    neighbours is computed. Points whose mean distance exceeds
    ``global_mean + std_ratio * global_std`` are classified as outliers
    and removed.

    Parameters
    ----------
    cloud : PointCloud
        Input point cloud.
    nb_neighbors : int, optional
        Number of nearest neighbours to consider (excluding self), by default 20.
    std_ratio : float, optional
        Standard deviation multiplier for the removal threshold, by default 2.0.
        Lower values remove points more aggressively.

    Returns
    -------
    PointCloud
        Cloud with outliers removed.
    NDArray[np.bool_]
        Boolean inlier mask of length ``n_points`` (``True`` = kept).

    Raises
    ------
    OcculusValidationError
        If ``nb_neighbors`` is non-positive, or is greater than or equal to
        the number of points.
    """
    if nb_neighbors <= 0:
        raise OcculusValidationError(f"nb_neighbors must be positive, got {nb_neighbors}")
    if nb_neighbors >= cloud.n_points:
        raise OcculusValidationError(
            f"nb_neighbors ({nb_neighbors}) must be less than n_points ({cloud.n_points})"
        )

    tree = KDTree(cloud.xyz)
    distances, _ = tree.query(cloud.xyz, k=nb_neighbors + 1, workers=-1)
    mean_distances = distances[:, 1:].mean(axis=1)  # exclude self (distance = 0)

    threshold = mean_distances.mean() + std_ratio * mean_distances.std()
    inlier_mask: NDArray[np.bool_] = mean_distances <= threshold

    logger.debug(
        "statistical_outlier_removal: removed %d/%d points",
        (~inlier_mask).sum(),
        cloud.n_points,
    )
    return _subset(cloud, np.where(inlier_mask)[0]), inlier_mask

voxel_downsample(cloud, voxel_size)

Downsample a point cloud by retaining one point per voxel cell.

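Points are grouped into a regular 3D grid of cubes with side length voxel_size, anchored at the cloud's minimum corner. For each occupied voxel a single point is retained: the one that appears first in the input (voxels are ordered lexicographically by their index, and the sort is stable). All per-point attributes of the retained points are preserved.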

Parameters:

Name Type Description Default
cloud PointCloud

Input point cloud. Must have at least one point.

required
voxel_size float

Edge length of each voxel cube, in the same units as the cloud coordinates. Must be strictly positive.

required

Returns:

Type Description
PointCloud

Downsampled cloud of the same concrete subtype as the input.

Raises:

Type Description
OcculusValidationError

If voxel_size is not positive or the cloud is empty.
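
The voxel grouping can be sketched with NumPy alone. The helper name voxel_first_indices is hypothetical; it returns the indices of the points that would be kept, and relies on np.unique reporting the first occurrence of each key instead of an explicit stable sort.

```python
import numpy as np


def voxel_first_indices(xyz: np.ndarray, voxel_size: float) -> np.ndarray:
    """Indices of the first input point found in each occupied voxel.

    Hypothetical helper for illustration; it is not part of occulus.
    """
    if voxel_size <= 0:
        raise ValueError(f"voxel_size must be positive, got {voxel_size}")
    if len(xyz) == 0:
        raise ValueError("Cannot downsample an empty point cloud")
    # Voxel indices are measured from the minimum corner of the cloud.
    origin = xyz.min(axis=0)
    voxel_idx = np.floor((xyz - origin) / voxel_size).astype(np.int64)
    # Flatten the 3D index into one integer key (x-major, then y, then z).
    dims = voxel_idx.max(axis=0) + 1
    flat = voxel_idx[:, 0] * (dims[1] * dims[2]) + voxel_idx[:, 1] * dims[2] + voxel_idx[:, 2]
    # return_index gives, for each unique key, its first occurrence in the input.
    _, first = np.unique(flat, return_index=True)
    return first


points = np.array(
    [
        [0.1, 0.1, 0.1],  # voxel (0, 0, 0), first point in it: kept
        [0.2, 0.3, 0.4],  # voxel (0, 0, 0): dropped
        [1.5, 0.1, 0.1],  # voxel (1, 0, 0), first point in it: kept
        [0.9, 0.9, 0.9],  # voxel (0, 0, 0): dropped
        [1.6, 0.2, 0.3],  # voxel (1, 0, 0): dropped
    ]
)
kept = voxel_first_indices(points, voxel_size=1.0)
print(kept)  # [0 2]
```

The retained point is an actual input point, not a voxel centroid, which is why per-point attributes can be carried over unchanged.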

Source code in src/occulus/filters/__init__.py
def voxel_downsample(cloud: PointCloud, voxel_size: float) -> PointCloud:
    """Downsample a point cloud by retaining one point per voxel cell.

    Points are grouped into a regular 3D grid of cubes with side length
    ``voxel_size``. For each occupied voxel the first point (after lexicographic
    sort on voxel index) is retained, preserving all per-point attributes.

    Parameters
    ----------
    cloud : PointCloud
        Input point cloud. Must have at least one point.
    voxel_size : float
        Edge length of each voxel cube, in the same units as the cloud
        coordinates. Must be strictly positive.

    Returns
    -------
    PointCloud
        Downsampled cloud of the same concrete subtype as the input.

    Raises
    ------
    OcculusValidationError
        If ``voxel_size`` is not positive or the cloud is empty.
    """
    if voxel_size <= 0:
        raise OcculusValidationError(f"voxel_size must be positive, got {voxel_size}")
    if cloud.n_points == 0:
        raise OcculusValidationError("Cannot downsample an empty point cloud")

    xyz = cloud.xyz
    origin = xyz.min(axis=0)
    voxel_idx = np.floor((xyz - origin) / voxel_size).astype(np.int64)

    # Encode 3D voxel index as a single integer key for grouping
    max_idx = voxel_idx.max(axis=0) + 1
    flat = (
        voxel_idx[:, 0] * (max_idx[1] * max_idx[2]) + voxel_idx[:, 1] * max_idx[2] + voxel_idx[:, 2]
    )

    sort_order = np.argsort(flat, kind="stable")
    flat_sorted = flat[sort_order]
    _, first_occurrences = np.unique(flat_sorted, return_index=True)
    selected = sort_order[first_occurrences]

    logger.debug(
        "voxel_downsample: %d -> %d points (voxel_size=%.4f)",
        cloud.n_points,
        len(selected),
        voxel_size,
    )
    return _subset(cloud, selected)