API Reference

This section provides detailed documentation for all the public classes and methods available in the QuantileFlow package.

Core Components

DDSketch

class QuantileFlow.DDSketch(relative_accuracy: float, mapping_type: Literal['logarithmic', 'lin_interpol', 'cub_interpol'] = 'logarithmic', max_buckets: int = 2048, bucket_strategy: BucketManagementStrategy = BucketManagementStrategy.FIXED, cont_neg: bool = True)

Bases: object

DDSketch implementation for quantile approximation with relative-error guarantees.

This implementation supports several mapping schemes and storage types, so speed and memory use can be tuned to the workload. It handles both positive and negative values and provides configurable bucket management strategies.

Reference:

“DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees” by Charles Masson, Jee E. Rim and Homin K. Lee

__init__(relative_accuracy: float, mapping_type: Literal['logarithmic', 'lin_interpol', 'cub_interpol'] = 'logarithmic', max_buckets: int = 2048, bucket_strategy: BucketManagementStrategy = BucketManagementStrategy.FIXED, cont_neg: bool = True)

Initialize DDSketch.

Parameters:
  • relative_accuracy – The relative accuracy guarantee (alpha). Must be between 0 and 1.

  • mapping_type – The type of mapping scheme to use:

      - ‘logarithmic’: Basic logarithmic mapping
      - ‘lin_interpol’: Linear interpolation mapping
      - ‘cub_interpol’: Cubic interpolation mapping

  • max_buckets – Maximum number of buckets per store (default 2048). If cont_neg is True, the limit applies to each store separately, so each store can hold max_buckets buckets.

  • bucket_strategy – Strategy for managing bucket count. If FIXED, uses ContiguousStorage, otherwise uses SparseStorage.

  • cont_neg – Whether to handle negative values (default True).

Raises:

ValueError – If relative_accuracy is not between 0 and 1.
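To make the role of relative_accuracy concrete, the following is a minimal sketch of the logarithmic mapping described in the DDSketch paper. It is not QuantileFlow's implementation: the helper make_log_mapping and the names key and value are hypothetical, and only positive inputs are handled.

```python
import math


def make_log_mapping(relative_accuracy: float):
    """Return (key, value) functions for a logarithmic bucket mapping.

    Hypothetical helper for illustration; not part of QuantileFlow.
    """
    if not 0 < relative_accuracy < 1:
        raise ValueError("relative_accuracy must be between 0 and 1")
    # gamma is the ratio between consecutive bucket boundaries.
    gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
    log_gamma = math.log(gamma)

    def key(x: float) -> int:
        # Bucket i covers the interval (gamma**(i-1), gamma**i].
        return math.ceil(math.log(x) / log_gamma)

    def value(i: int) -> float:
        # Representative whose relative distance to both bucket edges is alpha.
        return 2 * gamma ** i / (gamma + 1)

    return key, value


key, value = make_log_mapping(0.01)
samples = [0.003, 1.0, 42.0, 98765.4]
# Relative error between each sample and its bucket's representative value.
errors = [abs(value(key(x)) - x) / x for x in samples]
```

Under this mapping every entry of errors stays at or below the requested accuracy, which is the guarantee the alpha parameter refers to.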

delete(value: int | float) → None

Delete a value from the sketch.

Parameters:

value – The value to delete.

Raises:

ValueError – If value is negative and cont_neg is False.

insert(value: int | float) → None

Insert a value into the sketch.

Parameters:

value – The value to insert.

Raises:

ValueError – If value is negative and cont_neg is False.
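The insert and delete semantics can be pictured as incrementing and decrementing a per-bucket counter. The toy class below is a sketch under that assumption; ToySketch, its attributes, and its handling of zero are hypothetical and are not taken from QuantileFlow's source.

```python
import math
from collections import Counter


class ToySketch:
    """Hypothetical stand-in that mimics the documented insert/delete behaviour."""

    def __init__(self, relative_accuracy: float, cont_neg: bool = True):
        self.gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
        self.cont_neg = cont_neg
        self.positive = Counter()  # bucket index -> count for values > 0
        self.negative = Counter()  # bucket index -> count for values < 0
        self.zero_count = 0

    def _key(self, magnitude: float) -> int:
        return math.ceil(math.log(magnitude) / math.log(self.gamma))

    def _update(self, value: float, delta: int) -> None:
        if value < 0 and not self.cont_neg:
            raise ValueError("negative values require cont_neg=True")
        if value == 0:
            self.zero_count = max(0, self.zero_count + delta)
            return
        store = self.positive if value > 0 else self.negative
        k = self._key(abs(value))
        if delta < 0 and store[k] == 0:
            return  # nothing recorded in this bucket, so nothing to delete
        store[k] += delta
        if store[k] == 0:
            del store[k]  # drop empty buckets

    def insert(self, value: float) -> None:
        self._update(value, +1)

    def delete(self, value: float) -> None:
        self._update(value, -1)


sketch = ToySketch(0.01)
for v in (5.0, 5.0, -2.5):
    sketch.insert(v)
sketch.delete(5.0)
total = (sum(sketch.positive.values())
         + sum(sketch.negative.values())
         + sketch.zero_count)
```

After three inserts and one delete, total counts the two values that remain in the toy sketch.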

merge(other: DDSketch) → None

Merge another DDSketch into this one.

Parameters:

other – Another DDSketch instance to merge with this one.

Raises:

ValueError – If the sketches are incompatible.
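Mergeability follows from the bucket structure: two sketches that share the same bucket boundaries can be combined by adding counts bucket by bucket. The snippet below sketches that idea with plain Counter objects. The function merge_into and the choice to treat differing relative accuracies as the incompatibility are assumptions for illustration, since the reference does not spell out the exact compatibility rule.

```python
from collections import Counter


def merge_into(target: Counter, other: Counter,
               target_alpha: float, other_alpha: float) -> None:
    """Add the bucket counts of `other` into `target` in place (hypothetical helper)."""
    if target_alpha != other_alpha:
        # Different accuracies imply different bucket boundaries.
        raise ValueError("sketches are incompatible")
    target.update(other)  # Counter.update adds counts instead of replacing them


shard_a = Counter({10: 3, 11: 1})  # bucket index -> count
shard_b = Counter({11: 2, 15: 4})
merge_into(shard_a, shard_b, 0.01, 0.01)
```

Because the operation is a plain sum, the merged counters are the same as if all values had been inserted into a single sketch.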

quantile(q: float) → float

Compute the approximate quantile.

Parameters:

q – The desired quantile (between 0 and 1).

Returns:

The approximate value at the specified quantile.

Raises:

ValueError – If q is not between 0 and 1 or if the sketch is empty.
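As a rough illustration of how a quantile can be read off bucket counts, the sketch below walks the buckets in index order until the cumulative count passes the target rank, then returns that bucket's representative value. It assumes the logarithmic mapping from the DDSketch paper and positive inputs only; build_buckets and estimate_quantile are hypothetical names, not QuantileFlow functions.

```python
import math
from collections import Counter


def build_buckets(values, relative_accuracy):
    """Count positive values per logarithmic bucket (illustrative only)."""
    gamma = (1 + relative_accuracy) / (1 - relative_accuracy)
    log_gamma = math.log(gamma)
    buckets = Counter(math.ceil(math.log(v) / log_gamma) for v in values)
    return buckets, gamma


def estimate_quantile(buckets, gamma, q):
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = sum(buckets.values())
    if total == 0:
        raise ValueError("sketch is empty")
    rank = q * (total - 1)  # zero-based rank of the requested quantile
    seen = 0
    for index in sorted(buckets):
        seen += buckets[index]
        if seen > rank:
            break  # this bucket contains the value at the target rank
    return 2 * gamma ** index / (gamma + 1)


buckets, gamma = build_buckets([float(v) for v in range(1, 1001)], 0.01)
p50 = estimate_quantile(buckets, gamma, 0.5)   # within 1% of 500
p99 = estimate_quantile(buckets, gamma, 0.99)  # within 1% of 990
```

The estimates differ from the exact order statistics by at most the configured relative accuracy, while the sketch stores only bucket counts rather than the 1000 raw values.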

MomentSketch

class QuantileFlow.MomentSketch(num_moments: int = 20, compress_values: bool = False)

Bases: object

MomentSketch implementation for quantile approximation using the moment-based approach.

This implementation uses power sums, Chebyshev moment conversion, and maximum entropy optimization to estimate the probability distribution of data and compute quantiles. It supports merging sketches from distributed sources and provides accurate quantile estimates with a compact representation.

Reference:

“Moment-Based Quantile Sketches for Efficient High Cardinality Aggregation Queries” by Edward Gan, Jialin Ding, Kai Sheng Tai, Vatsal Sharan and Peter Bailis

__init__(num_moments: int = 20, compress_values: bool = False)

Initialize MomentSketch.

Parameters:
  • num_moments – Number of moments to track (default 20). Higher values increase accuracy at the cost of computation.

  • compress_values – Whether to compress values using arcsinh transformation (default False). Useful for handling widely distributed data with extreme values.
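The two parameters can be illustrated with a small accumulator that tracks power sums, optionally after an arcsinh transform. This is a sketch of the general moment-sketch idea only: the class PowerSums and its field layout are hypothetical, and QuantileFlow's internal representation may differ.

```python
import math


class PowerSums:
    """Hypothetical accumulator of count, min, max and power sums."""

    def __init__(self, num_moments: int = 20, compress_values: bool = False):
        self.num_moments = num_moments
        self.compress_values = compress_values
        self.count = 0
        self.min = math.inf
        self.max = -math.inf
        # sums[k] holds the sum of x**(k + 1) over all inserted values.
        self.sums = [0.0] * num_moments

    def insert(self, value: float) -> None:
        # arcsinh grows like log for large |x| but stays defined at 0 and below.
        x = math.asinh(value) if self.compress_values else float(value)
        self.count += 1
        self.min = min(self.min, x)
        self.max = max(self.max, x)
        power = 1.0
        for k in range(self.num_moments):
            power *= x
            self.sums[k] += power


plain = PowerSums(num_moments=4)
for v in (1, 2, 3):
    plain.insert(v)
mean = plain.sums[0] / plain.count

compressed = PowerSums(num_moments=4, compress_values=True)
compressed.insert(1e6)  # stored on the arcsinh scale, so high powers stay small
```

Without compression, the 20th power of a value like 1e6 would dominate every other term, which is why a transform such as arcsinh helps when the data spans many orders of magnitude.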

classmethod from_dict(data: Dict) → MomentSketch

Create a sketch from a dictionary.

Parameters:

data – Dictionary representation of a sketch.

Returns:

New MomentSketch instance.

insert(value: int | float) → None

Insert a single value into the sketch.

Parameters:

value – The value to insert.

insert_batch(values: List[float] | ndarray) → None

Insert multiple values into the sketch.

Parameters:

values – Array or list of values to insert.

interquartile_range() → float

Get the interquartile range (IQR).

Returns:

Estimated IQR (difference between 75th and 25th percentiles).

median() → float

Get the median value (50th percentile).

Returns:

Estimated median value.

merge(other: MomentSketch) → None

Merge another MomentSketch into this one.

Parameters:

other – Another MomentSketch instance to merge.

Raises:

ValueError – If the sketches are incompatible (different compression settings).
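Merging moment-based summaries is cheap because power sums are additive. The snippet below sketches this with plain dictionaries; summarize, merge_summaries, and the dictionary keys are hypothetical and only mirror the documented rule that sketches with different compression settings cannot be merged.

```python
import math


def summarize(values, num_moments=3, compress_values=False):
    """Build a hypothetical moment summary from raw values."""
    xs = [math.asinh(v) if compress_values else float(v) for v in values]
    return {
        "compress_values": compress_values,
        "count": len(xs),
        "min": min(xs),
        "max": max(xs),
        # Power sums of order 1..num_moments.
        "sums": [sum(x ** k for x in xs) for k in range(1, num_moments + 1)],
    }


def merge_summaries(target: dict, other: dict) -> None:
    """Fold `other` into `target` in place."""
    if target["compress_values"] != other["compress_values"]:
        raise ValueError("sketches use different compression settings")
    target["count"] += other["count"]
    target["min"] = min(target["min"], other["min"])
    target["max"] = max(target["max"], other["max"])
    target["sums"] = [a + b for a, b in zip(target["sums"], other["sums"])]


left = summarize([1, 2])
right = summarize([3])
merge_summaries(left, right)
combined = summarize([1, 2, 3])
```

Here left ends up with the same contents as combined, which is what makes this kind of summary suitable for aggregating data from distributed sources.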

percentile(p: float) → float

Get the p-th percentile value.

Parameters:

p – Percentile between 0 and 100 (e.g., 75 for 75th percentile).

Returns:

Estimated value at the requested percentile.

Raises:

ValueError – If p is not between 0 and 100.
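The convenience methods median, interquartile_range, and percentile can be read as thin wrappers around a single quantile lookup. The class below sketches that relationship using exact quantiles over a sorted list in place of a sketch; ExactQuantiles and its nearest-rank rule are assumptions for illustration and do not describe how MomentSketch computes its estimates.

```python
class ExactQuantiles:
    """Hypothetical stand-in that answers quantile queries from sorted data."""

    def __init__(self, values):
        self._sorted = sorted(values)

    def quantile(self, fraction: float) -> float:
        if not 0 <= fraction <= 1:
            raise ValueError("fraction must be between 0 and 1")
        # Nearest-rank lookup on the sorted values.
        index = round(fraction * (len(self._sorted) - 1))
        return self._sorted[index]

    def percentile(self, p: float) -> float:
        if not 0 <= p <= 100:
            raise ValueError("p must be between 0 and 100")
        return self.quantile(p / 100.0)  # percentile 75 == quantile 0.75

    def median(self) -> float:
        return self.quantile(0.5)

    def interquartile_range(self) -> float:
        return self.quantile(0.75) - self.quantile(0.25)


stats = ExactQuantiles(range(101))  # the integers 0..100
median = stats.median()
iqr = stats.interquartile_range()
p90 = stats.percentile(90)
```

The only difference between percentile and quantile in this sketch is the scale of the argument: 0 to 100 versus 0 to 1.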

plot_distribution(figsize=(10, 6))

Plot the estimated probability distribution.

Parameters:

figsize – Figure size (width, height) in inches.

Returns:

Matplotlib figure object.

quantile(fraction: float) → float

Get the value at a given quantile.

Parameters:

fraction – Quantile fraction between 0 and 1 (e.g., 0.5 for median).

Returns:

Estimated value at the requested quantile.

Raises:

ValueError – If fraction is not between 0 and 1.

quantiles(fractions: List[float]) → List[float]

Get values at multiple quantiles.

Parameters:

fractions – List of quantile fractions between 0 and 1.

Returns:

List of estimated values at the requested quantiles.

Raises:

ValueError – If any fraction is not between 0 and 1.

summary_statistics() → Dict[str, float]

Get summary statistics.

Returns:

Dictionary containing min, q1, median, q3, max, count, and mean.

to_dict() → Dict

Convert sketch to a dictionary for serialization.

Returns:

Dictionary representation of the sketch.
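A to_dict / from_dict pair is typically used to move a sketch through a text format such as JSON. The round trip below is a sketch of that pattern with a hypothetical dataclass; ToyMomentState and its field names are invented for illustration, and the actual keys produced by MomentSketch.to_dict are not documented here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ToyMomentState:
    """Hypothetical serializable state; not QuantileFlow's MomentSketch."""

    num_moments: int = 20
    compress_values: bool = False
    count: int = 0
    sums: List[float] = field(default_factory=list)

    def to_dict(self) -> Dict:
        # Plain dict of built-in types, safe to pass to json.dumps.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict) -> "ToyMomentState":
        return cls(**data)


state = ToyMomentState(num_moments=3, count=3, sums=[6.0, 14.0, 36.0])
payload = json.dumps(state.to_dict())  # e.g. send over the network or store
restored = ToyMomentState.from_dict(json.loads(payload))
```

Keeping the dictionary limited to built-in types is what allows the state to survive the JSON round trip unchanged.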

Mapping Classes

These classes implement different mapping strategies for DDSketch:

Storage Classes

These classes implement different storage strategies for DDSketch:

Utility Classes