td = tdigest()
td = tdigest(compression)
td = tdigest(X)
td = tdigest(compression, X)
| Input argument | Description |
|---|---|
| compression | compression factor: positive integer scalar. |
| X | numeric array, for example double, single, or an integer type. |
| Output argument | Description |
|---|---|
| td | t-digest representation of the array elements. |
td = tdigest(compression, X) returns a t-digest representation of the array elements of X.
A t-digest is a data structure for accurate online accumulation of rank-based statistics such as quantiles and cumulative distribution functions. It is particularly effective for large data sets and for estimating extreme quantiles. The algorithm is described in detail in the paper "Computing Extremely Accurate Quantiles Using t-Digests" by Ted Dunning and Otmar Ertl.
The t-digest is particularly useful for:
The compression factor (100 in the examples) controls the trade-off between accuracy and memory usage: higher values give more accurate estimates but use more memory.
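The trade-off can be observed by building two digests from the same data with different compression factors and comparing their estimates against values read from the fully sorted data. The following is a minimal sketch, not a benchmark: the compression values 50 and 200, the sample size, and the nearest-rank reference are assumptions chosen for illustration, and it relies only on the tdigest constructor and quantile method shown on this page. The errors vary from run to run because the data are random.

```matlab
% Sketch: compare two compression factors on the same data.
X = randn(1, 100000);            % sample data (illustrative size)
p = [0.01 0.5 0.99];             % quantiles to estimate

td_low = tdigest(50, X);         % lower compression factor
td_high = tdigest(200, X);       % higher compression factor

% Reference values from the sorted data, using a nearest-rank
% definition (an assumption made here only for comparison).
S = sort(X);
exact = S(ceil(p * numel(S)));

q_low = td_low.quantile(p);
q_high = td_high.quantile(p);

% Absolute errors; (:)' forces a row so the shapes match.
disp(abs(q_low(:)' - exact));    % compression = 50
disp(abs(q_high(:)' - exact));   % compression = 200
```

Under these assumptions the second line of output is usually expected to show smaller errors, most visibly at the extreme quantiles.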
Once you have a t-digest object, you can add new data points to it using the + operator, and compute percentiles or quantiles using the percentile or quantile methods.
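As a hedged illustration of this incremental use, the sketch below feeds the same data to one digest all at once and to another in chunks with the + operator, then queries both. The chunk size, variable names, and data are hypothetical. The two sets of estimates are expected to be close but not necessarily identical.

```matlab
% Sketch: one-shot construction versus chunked accumulation.
X = randn(1, 20000);

td_all = tdigest(100, X);        % built from all data at once

td_inc = tdigest(100);           % starts with no data
for k = 1:20
  idx = (k - 1) * 1000 + (1:1000);   % indices of the k-th chunk
  td_inc = td_inc + X(idx);          % add 1000 values at a time
end

p = [5 50 95];
disp(td_all.percentile(p));      % estimates from the one-shot digest
disp(td_inc.percentile(p));      % estimates from the chunked digest
disp(td_inc.quantile(p / 100));  % same ranks given as fractions in [0, 1]
```

Accumulating in chunks is useful when the data arrive over time or do not fit in memory as a single array.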
For more details, see the original paper linked in the bibliography.
Methods available:
Properties:
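% Build a t-digest from 15000 uniform random values
M = rand(1, 15000);
td = tdigest(100, M);
% Add 15000 more data points with the + operator
td = td + [1:15000];
% Query the 5th, 50th and 95th percentiles
td.percentile([5, 50, 95])
% Query the same ranks as quantiles in [0, 1]
td.quantile([0.05 0.5 0.95])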
% Streaming use: start with no data and add one value at a time
td = tdigest(100);
while(1)  % runs until interrupted
  td = td + randn();
  td.percentile([5, 50, 95])
end
| Version | Description |
|---|---|
| 1.15.0 | initial version |