Creates a descriptor matcher by name.
matcher = cv.DescriptorMatcher(type)
matcher = cv.DescriptorMatcher(type, 'OptionName',optionValue, ...)
Input
type In the first variant, it creates a descriptor matcher
of a given type with the default parameters (using default
constructor). The following types are recognized:
- 'BruteForce' (default) L2 distance
- 'BruteForce-SL2' L2SQR distance
- 'BruteForce-L1' L1 distance
- 'BruteForce-Hamming', 'BruteForce-HammingLUT'
- 'BruteForce-Hamming(2)'
- 'FlannBased' Flann-based indexing
In the second variant, it creates a matcher of the given
type using the specified parameters. The following
descriptor matcher types are supported:
- 'BFMatcher' Brute-force descriptor matcher. For each
descriptor in the first set, this matcher finds the
closest descriptor in the second set by trying each
one. This descriptor matcher supports masking
permissible matches of descriptor sets.
- 'FlannBasedMatcher' Flann-based descriptor matcher.
This matcher trains flann::Index_ on a train descriptor
collection and calls its nearest search methods to find
the best matches. So, this matcher may be faster when
matching a large train collection than the brute force
matcher. FlannBasedMatcher does not support masking
permissible matches of descriptor sets because
flann::Index does not support this.
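The two calling variants can be sketched as follows. This is only an illustration under stated assumptions: mexopencv is on the MATLAB path, the descriptors are made-up values, and the `match` method with zero-based `queryIdx`/`trainIdx` fields follows the usual mexopencv conventions rather than anything stated in this section.

```matlab
% First variant: create a matcher by name with default parameters.
bf1 = cv.DescriptorMatcher('BruteForce-L1');

% Second variant: create a matcher by class with explicit options.
bf2 = cv.DescriptorMatcher('BFMatcher', 'NormType', 'L1');

% Made-up float descriptors, one descriptor per row (illustrative only).
query = single([0 0; 10 10]);
train = single([9 10; 0 1; 50 50]);

% Find the best train descriptor for each query descriptor.
m1 = bf1.match(query, train);
m2 = bf2.match(query, train);

% Indices are assumed to be zero-based, as in OpenCV's DMatch.
disp([m1.queryIdx; m1.trainIdx]);
```

Both matchers use the L1 norm here, so they are expected to return the same pairs.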
Options
The Brute-force matcher constructor (BFMatcher) accepts the
following options:
- NormType One of 'L1', 'L2' (default), 'Hamming', or
'Hamming2'. See cv.DescriptorExtractor.defaultNorm.
- CrossCheck If false, the matcher uses the default BFMatcher
behaviour and finds the k nearest neighbors for each query
descriptor. If CrossCheck==true, then the
cv.DescriptorMatcher.knnMatch() method with k=1 will only
return pairs (i,j) such that for the i-th query descriptor
the j-th descriptor in the matcher's collection is the
nearest and vice versa, i.e. the BFMatcher will only return
consistent pairs. This technique usually produces the best
results with a minimal number of outliers when there are
enough matches. It is an alternative to the ratio test used
by D. Lowe in the SIFT paper. default false
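The effect of CrossCheck can be seen on a tiny, made-up descriptor set. This sketch assumes mexopencv is on the MATLAB path and that `match` returns a struct array with zero-based `queryIdx` and `trainIdx` fields; the numeric values are invented so that two query descriptors compete for the same train descriptor.

```matlab
% Made-up float descriptors, one per row (illustrative only).
query = single([0 0; 1 0]);
train = single([0.9 0; 5 5]);

% Default behaviour: every query descriptor gets its nearest train descriptor.
plainMatcher = cv.DescriptorMatcher('BFMatcher', 'NormType', 'L2');
plain = plainMatcher.match(query, train);

% Cross-checked: keep a pair (i,j) only if j is nearest to i AND i is nearest to j.
strictMatcher = cv.DescriptorMatcher('BFMatcher', ...
    'NormType', 'L2', 'CrossCheck', true);
strict = strictMatcher.match(query, train);

fprintf('without cross-check: %d matches\n', numel(plain));
fprintf('with cross-check:    %d matches\n', numel(strict));
```

Here both query descriptors are closest to the first train descriptor, but that train descriptor is closest to the second query only, so cross-checking is expected to keep just that one pair.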
The Flann-based matcher constructor (FlannBasedMatcher) takes
the following optional arguments:
Index Type of indexer, default 'KDTree'. One of the below.
Each index type takes optional arguments (see IndexParams
options below). You can specify the indexer by a cell
array that starts with the type name followed by option
arguments: {'Type', 'OptionName',optionValue, ...}.
- 'Linear' Brute-force matching, linear search
- 'KDTree' Randomized kd-trees, parallel search
- 'KMeans' Hierarchical k-means tree
- 'HierarchicalClustering' Hierarchical index
- 'Composite' Combination of KDTree and KMeans
- 'LSH' multi-probe LSH
- 'Autotuned' Automatic tuning to one of the above
(Linear, KDTree, KMeans)
- 'Saved' Load saved index from a file
Search Options for the matching (search) operation. Takes a
cell array of option pairs:
- Checks The number of times the tree(s) in the index
should be recursively traversed. A higher value for
this parameter would give better search precision, but
also take more time. If automatic configuration was
used when the index was created, the number of checks
required to achieve the specified precision was also
computed, in which case this parameter is ignored.
-1 for unlimited. default 32
- EPS search for eps-approximate neighbours. default 0
- Sorted only for radius search, require neighbours
sorted by distance. default true
IndexParams Options for FlannBasedMatcher
The following are the options for FLANN indexers
(Fast Library for Approximate Nearest Neighbors):
Linear
Linear index takes no options.
Saved
Saved index takes only one argument specifying the filename.
KDTree and Composite
- Trees The number of parallel kd-trees to use. Good values
are in the range [1..16]. default 4
KMeans and Composite
- Branching The branching factor to use for the hierarchical
k-means tree. default 32
- Iterations The maximum number of iterations to use in the
k-means clustering stage when building the k-means tree.
A value of -1 used here means that the k-means clustering
should be iterated until convergence. default 11
- CentersInit The algorithm to use for selecting the initial
centers when performing a k-means clustering step. The
possible values are (default is 'Random'):
- 'Random' picks the initial cluster centers randomly
- 'Gonzales' picks the initial centers using Gonzales
algorithm
- 'KMeansPP' picks the initial centers using the
algorithm suggested in [ArthurKmeansPP2007]
- 'Groupwise' chooses the initial centers in a way
inspired by Gonzales (by Pierre-Emmanuel Viel).
- CBIndex This parameter (cluster boundary index) influences
the way exploration is performed in the hierarchical
k-means tree. When CBIndex is zero, the next k-means domain
to be explored is chosen to be the one with the closest
center. A value greater than zero also takes into account
the size of the domain. default 0.2
HierarchicalClustering
- Branching same as above.
- CentersInit same as above.
- Trees same as above.
- LeafSize maximum leaf size. default 100
LSH
- TableNumber The number of hash tables to use (usually
between 10 and 30). default 20
- KeySize The length of the key in the hash tables (usually
between 10 and 20). default 15
- MultiProbeLevel Number of levels to use in multi-probe
(0 is regular LSH, 2 is recommended). default 0
Autotuned
- TargetPrecision A number between 0 and 1 specifying the
fraction of the approximate nearest-neighbor searches
that return the exact nearest neighbor. Using a higher
value for this parameter gives more accurate results, but
the search takes longer. The optimum value usually depends
on the application. default 0.8
- BuildWeight Specifies the importance of the index build
time relative to the nearest-neighbor search time. In some
applications it is acceptable for the index build step to
take a long time if the subsequent searches in the index
can be performed very fast. In other applications it is
required that the index be built as fast as possible even
if that leads to slightly longer search times. default 0.01
- MemoryWeight Is used to specify the tradeoff between time
(index build time and search time) and memory used by the
index. A value less than 1 gives more importance to the
time spent and a value greater than 1 gives more
importance to the memory usage. default 0
- SampleFraction A number between 0 and 1 indicating what
fraction of the dataset to use in the automatic parameter
configuration algorithm. Running the algorithm on the full
dataset gives the most accurate results, but for very
large datasets it can take longer than desired. In such
cases, using just a fraction of the data helps speed up
the algorithm while still giving good approximations of
the optimum parameters. default 0.1
Example
For example, a KDTree index with 4 trees is specified by:
matcher = cv.DescriptorMatcher('FlannBasedMatcher', ...
'Index', {'KDTree', 'Trees', 4}, ...
'Search', {'Sorted', true})
Here is an example for loading a saved index:
matcher = cv.DescriptorMatcher('FlannBasedMatcher', ...
'Index', {'Saved', '/path/to/saved/index.xml'})
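The other index types follow the same cell-array pattern. The calls below are a sketch assembled from the option names listed above, assuming mexopencv is on the MATLAB path; the specific values are arbitrary choices for illustration, not recommendations. Note that LSH is typically paired with binary descriptors (for example ORB), which is an assumption about usage rather than something stated in this section.

```matlab
% Hierarchical k-means index with explicit tree and search options.
kmeansMatcher = cv.DescriptorMatcher('FlannBasedMatcher', ...
    'Index',  {'KMeans', 'Branching', 32, 'Iterations', 11, ...
               'CentersInit', 'KMeansPP'}, ...
    'Search', {'Checks', 64});

% Multi-probe LSH index (usually used with binary descriptors).
lshMatcher = cv.DescriptorMatcher('FlannBasedMatcher', ...
    'Index', {'LSH', 'TableNumber', 20, 'KeySize', 15, ...
              'MultiProbeLevel', 2});

% Autotuned index: let FLANN choose parameters for a target precision.
autoMatcher = cv.DescriptorMatcher('FlannBasedMatcher', ...
    'Index', {'Autotuned', 'TargetPrecision', 0.9, ...
              'SampleFraction', 0.1});
```

The constructors only record the parameters; the index itself is built later, when train descriptors are added and the matcher is trained or first used.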
References:
[ArthurKmeansPP2007]:
D. Arthur and S. Vassilvitskii,
"k-means++: the advantages of careful seeding",
Proceedings of the eighteenth annual ACM-SIAM symposium
on Discrete algorithms, 2007
[LSH]:
Qin Lv, William Josephson, Zhe Wang, Moses Charikar, Kai Li,
"Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search",
Proceedings of the 33rd International Conference on
Very Large Data Bases (VLDB). Vienna, Austria. September 2007