Version 0.5 is not backwards compatible.
Features :
- Special-case code for linking two datasets that are individually unique
- Parallel processing using the Python standard library's multiprocessing module
- Much faster canopy creation using zope.index
- Asynchronous active learning methods
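
As a rough illustration of the multiprocessing item above, the sketch below spreads pairwise record comparison across worker processes with `multiprocessing.Pool`. It is not the library's implementation: the `jaccard` comparator, the `score_pairs` helper, and the sample records are hypothetical names chosen for this example, and Python 3 is assumed.

```python
from itertools import combinations
from multiprocessing import Pool


def jaccard(pair):
    """Score one candidate pair by token overlap (a stand-in comparator)."""
    left, right = pair
    a, b = set(left.lower().split()), set(right.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score_pairs(records, processes=2):
    """Score every pair of records, splitting the work across processes."""
    pairs = list(combinations(records, 2))
    # Pool.map preserves input order, so scores line up with pairs.
    with Pool(processes=processes) as pool:
        scores = pool.map(jaccard, pairs, chunksize=64)
    return list(zip(pairs, scores))


if __name__ == "__main__":
    names = ["Acme Corp", "ACME corp", "Widget Works"]
    for (left, right), score in score_pairs(names):
        print(left, "|", right, "|", score)
```

The comparator is defined at module level so that worker processes can import it; whether parallelism pays off depends on how expensive each comparison is relative to the cost of passing records between processes.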
API breaks :
- `duplicateClusters` has been removed; it has been replaced by `match` and `matchBlocks`
- `goodThreshold` has been removed; it has been replaced by `threshold` and `thresholdBlocks`
- The meaning of `train` has changed. To train from a training file, use `readTraining`. To use console labeling, pass a dedupe instance to the `consoleLabel` function
- The convenience function `dataSample` has been removed. It has been replaced by the `sample` methods
- It is no longer necessary to pass `frozendicts` to `Matching` classes
- `blockingFunction` has been removed and replaced by the `blocker` method