mapclassify.JenksCaspallSampled

class mapclassify.JenksCaspallSampled(y, k=5, pct=0.1)

Jenks Caspall Map Classification using a random sample.

Parameters:
y : numpy.array

\((n,1)\), values to classify.

k : int (default 5)

The number of classes required.

pct : float (default 0.10)

The fraction of \(n\) that should form the sample (the default 0.10 means 10%). If pct is specified such that \(n*pct > 1000\), then \(pct = 1000./n\), so the sample never exceeds 1000 observations.
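The cap on the sample size can be checked with a few lines of plain Python. This is a minimal sketch of the rule as stated above, not the library's code; the helper name `effective_pct` and its `cap` argument are hypothetical.

```python
def effective_pct(n, pct=0.1, cap=1000):
    """Return the sampling fraction after applying the documented cap.

    Hypothetical helper: mirrors the rule "if n*pct > 1000 then pct = 1000./n".
    """
    if n * pct > cap:
        # Shrink pct so that the sample holds about `cap` observations.
        return cap / n
    return pct


print(effective_pct(5_000))    # 0.1  (500 observations, below the cap)
print(effective_pct(100_000))  # 0.01 (capped at about 1000 observations)
```

For the 100000-value array used in the Examples section, the default pct of 0.1 would therefore be reduced to 0.01.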

Notes

This is intended for large \(n\) problems. The logic is to apply JenksCaspall to a random subset of \(y\) and then bin the complete vector \(y\) using the bins obtained from the subset. This trades some accuracy for a gain in speed.
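The sample-then-bin logic can be sketched with the standard library alone. This is an illustration under stated assumptions, not the mapclassify implementation: `quantile_breaks` is a hypothetical stand-in classifier (equal-count quantiles, not Jenks-Caspall), the function names are invented for this sketch, and the maximum of the data is forced into the sample so that the top bound covers every value.

```python
import bisect
import random


def quantile_breaks(values, k):
    """Stand-in classifier (NOT Jenks-Caspall): k equal-count upper bounds."""
    s = sorted(values)
    return [s[(i * len(s)) // k - 1] for i in range(1, k + 1)]


def sampled_classify(y, k=5, pct=0.1, seed=0, cap=1000):
    """Derive bins from a random sample of y, then bin all of y on them."""
    rng = random.Random(seed)
    # Sample size: int(n * pct), never more than `cap`, never fewer than k.
    size = max(k, min(int(len(y) * pct), cap))
    sample = rng.sample(y, size)
    sample.append(max(y))  # assumption: keep the data maximum in the sample
    bins = quantile_breaks(sample, k)
    # Assign every value of the full vector to the first bin whose
    # upper bound is >= the value.
    yb = [bisect.bisect_left(bins, v) for v in y]
    counts = [yb.count(c) for c in range(k)]
    return bins, yb, counts


data_rng = random.Random(42)
data = [data_rng.random() for _ in range(20_000)]
bins, yb, counts = sampled_classify(data, k=5)
print(bins)
print(counts)
```

Only the (cheap) binning step touches all 20000 values; the classifier itself sees at most 1001 of them, which is where the speed gain comes from.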

Examples

>>> import mapclassify
>>> import numpy
>>> cal = mapclassify.load_example()
>>> numpy.random.seed(0)
>>> x = numpy.random.random(100000)
>>> jc = mapclassify.JenksCaspall(x)
>>> jcs = mapclassify.JenksCaspallSampled(x)
>>> jc.bins
array([0.20108144, 0.4025151 , 0.60396127, 0.80302249, 0.99997795])
>>> jcs.bins
array([0.19978245, 0.40793025, 0.59253555, 0.78241472, 0.99997795])
>>> jc.counts.tolist()
[20286, 19951, 20310, 19708, 19745]
>>> jcs.counts.tolist()
[20147, 20633, 18591, 18857, 21772]

# not for testing since we get different times on different hardware
# just included for documentation of likely speed gains
#>>> t1 = time.time(); jc = Jenks_Caspall(x); t2 = time.time()
#>>> t1s = time.time(); jcs = Jenks_Caspall_Sampled(x); t2s = time.time()
#>>> t2 - t1; t2s - t1s
#1.8292930126190186
#0.061631917953491211

Attributes:
yb : numpy.array

\((n,1)\), bin IDs for observations.

bins : numpy.array

\((k,1)\), the upper bounds of each class.

k : int

The number of classes.

counts : numpy.array

\((k,1)\), the number of observations falling in each class.

__init__(y, k=5, pct=0.1)

Methods

__init__(y[, k, pct])

find_bin(x)

Sort input or inputs according to the current bin estimate.

get_adcm()

Absolute deviation around class median (ADCM).

get_fmt()

get_gadf()

Goodness of absolute deviation of fit.

get_legend_classes([fmt])

Format the strings for the classes on the legend.

get_tss()

Returns sum of squares over all class means.

make(*args, **kwargs)

Configure and create a classifier that will consume data and produce classifications, given the configuration options specified by this function.

plot(gdf[, border_color, border_width, ...])

Plot a mapclassifier object.

set_fmt(fmt)

table()

update([y, inplace])

Add data or change classification parameters.
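The fit measures listed above (get_adcm, get_tss, get_gadf) can be worked through by hand on a small classified vector. This is a sketch of the conventional definitions, not the library's code; in particular, the GADF formula used here (one minus ADCM divided by the sum of absolute deviations around the overall median) is an assumption that the source does not spell out.

```python
from statistics import mean, median

y = [1.0, 2.0, 3.0, 10.0, 12.0, 20.0]  # observations
yb = [0, 0, 0, 1, 1, 2]                # class ID of each observation
k = 3

# Group the observations by class.
classes = [[v for v, c in zip(y, yb) if c == j] for j in range(k)]

# ADCM: absolute deviations around each class median, summed over classes.
adcm = sum(abs(v - median(g)) for g in classes if g for v in g)
# TSS: squared deviations around each class mean, summed over classes.
tss = sum((v - mean(g)) ** 2 for g in classes if g for v in g)
# Assumed GADF definition: 1 - ADCM / (absolute deviations around the
# median of the whole vector).
adam = sum(abs(v - median(y)) for v in y)
gadf = 1 - adcm / adam

print(adcm)  # 4.0
print(tss)   # 4.0
print(gadf)
```

Lower ADCM and TSS indicate tighter classes; under the assumed definition, GADF approaches 1 as the classes absorb more of the overall deviation.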

Attributes

fmt

update(y=None, inplace=False, **kwargs)

Add data or change classification parameters.

Parameters:
y : numpy.array (default None)

\((n,1)\), array of data to classify.

inplace : bool (default False)

Whether to conduct the update in place or to return a copy estimated from the additional specifications.

**kwargs : dict

Additional parameters that are passed to the __init__ function of the class. For documentation, check the class constructor.
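The difference between inplace=True and inplace=False can be illustrated with a toy class. `ToyClassifier` is a hypothetical stand-in written for this sketch, not the mapclassify implementation; it assumes that new data is appended to the data already held and that the copy-returning form leaves the original object untouched.

```python
import copy


class ToyClassifier:
    """Hypothetical stand-in that only stores data and k."""

    def __init__(self, y, k=5):
        self.y = list(y)
        self.k = k

    def update(self, y=None, inplace=False, **kwargs):
        # Work on self when inplace, otherwise on a deep copy.
        target = self if inplace else copy.deepcopy(self)
        if y is not None:
            target.y = target.y + list(y)  # assumption: data is appended
        # Extra keyword arguments override the stored parameters.
        target.k = kwargs.get("k", target.k)
        return None if inplace else target


a = ToyClassifier([1, 2, 3], k=3)
b = a.update(y=[4], k=2)            # returns a new object, a is unchanged
print(a.y, a.k)                     # [1, 2, 3] 3
print(b.y, b.k)                     # [1, 2, 3, 4] 2
result = a.update(k=4, inplace=True)  # modifies a and returns None
print(a.k)                          # 4
```

The copy-returning default makes it easy to compare classifications with different parameters side by side.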