init - initialize the project

This commit is contained in:
Lee Nony
2022-05-06 01:58:53 +08:00
commit 90a5cc7cb6
6772 changed files with 2837787 additions and 0 deletions


@@ -0,0 +1,319 @@
Camera calibration With OpenCV {#tutorial_camera_calibration}
==============================
@tableofcontents
@prev_tutorial{tutorial_camera_calibration_square_chess}
@next_tutorial{tutorial_real_time_pose}
| | |
| -: | :- |
| Original author | Bernát Gábor |
| Compatibility | OpenCV >= 4.0 |
Cameras have been around for a long time. However, with the introduction of cheap *pinhole*
cameras in the late 20th century, they became a common occurrence in our everyday life.
Unfortunately, this cheapness comes with a price: significant distortion. Luckily, the distortion
parameters are constant for a given camera, so with a calibration and some remapping we can correct
it. Furthermore, with calibration you may also determine the relation between the camera's natural
units (pixels) and real world units (for example millimeters).
Theory
------
For the distortion OpenCV takes into account the radial and tangential factors. For the radial
factor one uses the following formula:
\f[x_{distorted} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
y_{distorted} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)\f]
So for an undistorted pixel point at \f$(x,y)\f$ coordinates, its position in the distorted image
will be \f$(x_{distorted}, y_{distorted})\f$. The presence of the radial distortion manifests in the
form of the "barrel" or "fish-eye" effect.
Tangential distortion occurs because the image taking lenses are not perfectly parallel to the
imaging plane. It can be represented via the formulas:
\f[x_{distorted} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
y_{distorted} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]\f]
So we have five distortion parameters which in OpenCV are presented as one row matrix with 5
columns:
\f[distortion\_coefficients=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)\f]
Now for the unit conversion we use the following formula:
\f[\left [ \begin{matrix} x \\ y \\ w \end{matrix} \right ] = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} X \\ Y \\ Z \end{matrix} \right ]\f]
Here the presence of \f$w\f$ is explained by the use of the homogeneous coordinate system (with \f$w=Z\f$). The
unknown parameters are \f$f_x\f$ and \f$f_y\f$ (the camera focal lengths) and \f$(c_x, c_y)\f$, the optical
center expressed in pixel coordinates. If a common focal length is used for both axes with a given
aspect ratio \f$a\f$ (usually 1), then \f$f_y=f_x*a\f$ and in the formula above we have a single focal
length \f$f\f$. The matrix containing these four parameters is referred to as the *camera matrix*. While
the distortion coefficients are the same regardless of the camera resolution used, the camera matrix
should be scaled if the current resolution differs from the resolution used during calibration.
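As a quick illustration of how these two entities are typically held in code, here is a hedged sketch with placeholder numbers (not the result of a real calibration):
@code{.cpp}
// Illustrative only: a camera matrix and a (k1 k2 p1 p2 k3) row of distortion coefficients.
#include <opencv2/core.hpp>

int main()
{
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
        657.5,   0.0, 319.5,   // f_x, 0,   c_x
          0.0, 657.5, 239.5,   // 0,   f_y, c_y
          0.0,   0.0,   1.0);  // 0,   0,   1

    cv::Mat distCoeffs = (cv::Mat_<double>(1, 5) << -0.42, 0.51, 0.0, 0.0, -0.58);
    return 0;
}
@endcode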
The process of determining these two matrices is the calibration. Calculation of these parameters is
done through basic geometrical equations. The equations used depend on the chosen calibrating
objects. Currently OpenCV supports three types of objects for calibration:
- Classical black-white chessboard
- Symmetrical circle pattern
- Asymmetrical circle pattern
Basically, you need to take snapshots of these patterns with your camera and let OpenCV find them.
Each found pattern results in a new equation. To solve the equation you need at least a
predetermined number of pattern snapshots to form a well-posed equation system. This number is
higher for the chessboard pattern and lower for the circle ones. For example, in theory the
chessboard pattern requires at least two snapshots. However, in practice we have a good amount of
noise present in our input images, so for good results you will probably need at least 10 good
snapshots of the input pattern in different positions.
Goal
----
The sample application will:
- Determine the distortion matrix
- Determine the camera matrix
- Take input from Camera, Video and Image file list
- Read configuration from XML/YAML file
- Save the results into XML/YAML file
- Calculate re-projection error
Source code
-----------
You may also find the source code in the `samples/cpp/tutorial_code/calib3d/camera_calibration/`
folder of the OpenCV source library or [download it from here
](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp). For the usage of the program, run it with the `-h` argument. The program has an
essential argument: the name of its configuration file. If none is given then it will try to open the
one named "default.xml". [Here's a sample configuration file
](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/calib3d/camera_calibration/in_VID5.xml) in XML format. In the
configuration file you may choose to use camera as an input, a video file or an image list. If you
opt for the last one, you will need to create a configuration file where you enumerate the images to
use. Here's [an example of this ](https://github.com/opencv/opencv/tree/master/samples/cpp/tutorial_code/calib3d/camera_calibration/VID5.xml).
The important part to remember is that the images need to be specified using the absolute path or
the relative one from your application's working directory. You may find all this in the samples
directory mentioned above.
The application starts by reading the settings from the configuration file. Although this is
an important part of the application, it has nothing to do with the subject of this tutorial: *camera
calibration*. Therefore, I've chosen not to post that code here. You can find the technical background
on how to do this in the @ref tutorial_file_input_output_with_xml_yml tutorial.
Explanation
-----------
-# **Read the settings**
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp file_read
For this I've used the simple OpenCV class input operation. After reading the file, an
additional post-processing function checks the validity of the input. Only if all inputs are
good will the *goodInput* variable be true.
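As a generic illustration of the mechanism (not the tutorial's actual *Settings* class; the node names below are made up), reading a few values from an XML/YAML file with cv::FileStorage looks roughly like this:
@code{.cpp}
// Hypothetical example of reading settings with cv::FileStorage; names are illustrative.
#include <opencv2/core.hpp>
#include <string>

int main()
{
    cv::FileStorage fs("default.xml", cv::FileStorage::READ);  // the default configuration name
    if (!fs.isOpened())
        return -1;                                             // goodInput would be false here

    std::string input;
    int boardWidth = 0, boardHeight = 0;
    fs["Input"] >> input;                                      // hypothetical node names
    fs["BoardSize_Width"] >> boardWidth;
    fs["BoardSize_Height"] >> boardHeight;
    fs.release();
    return 0;
}
@endcode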
-# **Get next input, if it fails or we have enough of them - calibrate**
After this we have a big
loop in which we do the following operations: get the next image from the image list, camera or
video file. If this fails or we have enough images, then we run the calibration process. In case
of an image list we then step out of the loop; otherwise the remaining frames will be undistorted (if the
option is set) by switching from *DETECTION* mode to the *CALIBRATED* one.
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp get_input
For some cameras we may need to flip the input image. Here we do this too.
-# **Find the pattern in the current input**
The formation of the equations I mentioned above aims
at finding major patterns in the input: in case of the chessboard these are the corners of the
squares and for the circle patterns, well, the circles themselves. The positions of these will form the
result, which will be written into the *pointBuf* vector.
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp find_pattern
Depending on the type of the input pattern you use either the @ref cv::findChessboardCorners or
the @ref cv::findCirclesGrid function. For both of them you pass the current image and the size
of the board and you'll get the positions of the patterns. Furthermore, they return a boolean
variable which states if the pattern was found in the input (we only need to take into account
those images where this is true!).
Then again in case of cameras we only take camera images when an input delay time has passed.
This is done in order to allow the user to move the chessboard around and get different images.
Similar images result in similar equations, and similar equations at the calibration step will
form an ill-posed problem, so the calibration will fail. For chessboard images the detected positions of
the corners are only approximate. We may improve this by calling the @ref cv::cornerSubPix function.
(`winSize` is used to control the side length of the search window. Its default value is 11.
`winSize` may be changed by the command line parameter `--winSize=<number>`.)
This produces a better calibration result. After this we add the result of a valid input to the
*imagePoints* vector to collect all of the equations into a single container. Finally, for
visualization feedback purposes we draw the found points on the input image using the @ref
cv::drawChessboardCorners function.
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp pattern_found
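For reference, here is a condensed, stand-alone sketch of this detect/refine/draw sequence; the image name, board size and termination criteria are placeholders rather than the sample's exact settings:
@code{.cpp}
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

int main()
{
    cv::Mat view = cv::imread("board.jpg");   // hypothetical input image
    cv::Size boardSize(9, 6);                 // inner corners per row and column
    std::vector<cv::Point2f> pointBuf;

    bool found = cv::findChessboardCorners(view, boardSize, pointBuf,
                     cv::CALIB_CB_ADAPTIVE_THRESH | cv::CALIB_CB_NORMALIZE_IMAGE);
    if (found)
    {
        cv::Mat gray;
        cv::cvtColor(view, gray, cv::COLOR_BGR2GRAY);
        // refine the corner locations to sub-pixel accuracy (winSize = 11 as in the text)
        cv::cornerSubPix(gray, pointBuf, cv::Size(11, 11), cv::Size(-1, -1),
            cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.0001));
        // draw the detected corners for visual feedback
        cv::drawChessboardCorners(view, boardSize, cv::Mat(pointBuf), found);
    }
    return 0;
}
@endcode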
-# **Show state and result to the user, plus command line control of the application**
This part shows text output on the image.
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp output_text
If we ran the calibration and obtained the camera matrix with the distortion coefficients, we may want to
correct the image using the @ref cv::undistort function:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp output_undistorted
Then we show the image and wait for an input key: if it is *u* we toggle the distortion removal,
if it is *g* we restart the detection process, and finally for the *ESC* key we quit the application:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp await_input
-# **Show the distortion removal for the images too**
When you work with an image list it is not
possible to remove the distortion inside the loop, so you must do this after the loop.
Taking advantage of this, I'll now expand the @ref cv::undistort function, which in fact first
calls @ref cv::initUndistortRectifyMap to find the transformation matrices and then performs the
transformation using the @ref cv::remap function. Because after a successful calibration the map
calculation needs to be done only once, using this expanded form may speed up your
application:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp show_results
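A minimal sketch of this idea, assuming `cameraMatrix`, `distCoeffs` and `imageSize` come from a previous calibration; the helper name and the use of cv::getOptimalNewCameraMatrix are illustrative choices, not the sample's exact code:
@code{.cpp}
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

void undistortAll(const std::vector<cv::Mat>& frames,
                  const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                  cv::Size imageSize, std::vector<cv::Mat>& output)
{
    cv::Mat map1, map2;
    // the maps are computed only once per calibration ...
    cv::initUndistortRectifyMap(cameraMatrix, distCoeffs, cv::Mat(),
        cv::getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, imageSize, 1, imageSize),
        imageSize, CV_16SC2, map1, map2);

    // ... and reused for every frame
    for (size_t i = 0; i < frames.size(); ++i)
    {
        cv::Mat undistorted;
        cv::remap(frames[i], undistorted, map1, map2, cv::INTER_LINEAR);
        output.push_back(undistorted);
    }
}
@endcode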
The calibration and save
------------------------
Because the calibration needs to be done only once per camera, it makes sense to save it after a
successful calibration. This way later on you can just load these values into your program. Due to
this we first make the calibration, and if it succeeds we save the result into an OpenCV style XML
or YAML file, depending on the extension you give in the configuration file.
Therefore, in the first function we just split up these two processes. Because we want to save many
of the calibration variables, we create these variables here and pass them on to both the
calibration and the saving function. Again, I'll not show the saving part as that has little in common
with the calibration. Explore the source file in order to find out how and what:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp run_and_save
We do the calibration with the help of the @ref cv::calibrateCameraRO function. It has the following
parameters:
- The object points. This is a vector of *Point3f* vectors that for each input image describes how
the pattern should look. If we have a planar pattern (like a chessboard) then we can simply set
all Z coordinates to zero. This is a collection of the positions where the important points of the
pattern lie. Because we use a single pattern for all the input images, we can calculate it just
once and replicate it for all the other input views. We calculate the corner points with the
*calcBoardCornerPositions* function as:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp board_corners
And then multiply it as:
@code{.cpp}
vector<vector<Point3f> > objectPoints(1);
calcBoardCornerPositions(s.boardSize, s.squareSize, objectPoints[0], s.calibrationPattern);
objectPoints[0][s.boardSize.width - 1].x = objectPoints[0][0].x + grid_width;
newObjPoints = objectPoints[0];
objectPoints.resize(imagePoints.size(),objectPoints[0]);
@endcode
@note If your calibration board is an inaccurate, unmeasured, roughly planar target (checkerboard
patterns printed on paper with off-the-shelf printers are the most convenient calibration targets, and
most of them are not accurate enough), a method from @cite strobl2011iccv can be utilized to
dramatically improve the accuracy of the estimated camera intrinsic parameters. This new
calibration method is called if the command line parameter `-d=<number>` is provided. In the
above code snippet, `grid_width` is actually the value set by `-d=<number>`. It is the measured
distance between the top-left (0, 0, 0) and top-right (s.squareSize*(s.boardSize.width-1), 0, 0)
corners of the pattern grid. It should be measured precisely with rulers or vernier calipers.
After calibration, newObjPoints will be updated with the refined 3D coordinates of the object points.
- The image points. This is a vector of *Point2f* vectors which for each input image contains the
coordinates of the important points (corners for the chessboard and centers of the circles for the
circle pattern). We have already collected this from the @ref cv::findChessboardCorners or @ref
cv::findCirclesGrid function. We just need to pass it on.
- The size of the image acquired from the camera, video file or the images.
- The index of the object point to be fixed. We set it to -1 to request the standard calibration method.
If the new object-releasing method is to be used, set it to the index of the top-right corner point
of the calibration board grid. See cv::calibrateCameraRO for a detailed explanation.
@code{.cpp}
int iFixedPoint = -1;
if (release_object)
iFixedPoint = s.boardSize.width - 1;
@endcode
- The camera matrix. If we used the fixed aspect ratio option we need to set \f$f_x\f$:
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp fixed_aspect
- The distortion coefficient matrix. Initialize it with zeros.
@code{.cpp}
distCoeffs = Mat::zeros(8, 1, CV_64F);
@endcode
- For all the views the function will calculate rotation and translation vectors, which transform
the object points (given in the model coordinate space) into the camera coordinate space. The
7th and 8th parameters are output vectors of matrices containing, at the i-th position, the
rotation and translation vector of the i-th view.
- The updated output vector of calibration pattern points. This parameter is ignored with the standard
calibration method.
- The final argument is the flag field. Here you can specify options such as fixing the aspect ratio of
the focal length, assuming zero tangential distortion or fixing the principal point. Here we
additionally use CALIB_USE_LU to get faster calibration speed.
@code{.cpp}
rms = calibrateCameraRO(objectPoints, imagePoints, imageSize, iFixedPoint,
cameraMatrix, distCoeffs, rvecs, tvecs, newObjPoints,
s.flag | CALIB_USE_LU);
@endcode
- The function returns the average re-projection error. This number gives a good estimate of the
precision of the found parameters. It should be as close to zero as possible. Given the
intrinsic, distortion, rotation and translation matrices, we may calculate the error for one view
by using @ref cv::projectPoints to first transform the object points to image points. Then we
calculate the absolute norm between the result of our transformation and the output of the corner/circle
finding algorithm. To find the average error we calculate the arithmetic mean of the errors
computed for all the calibration images (a stand-alone sketch follows the snippet below).
@snippet samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp compute_errors
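A minimal sketch of the same computation, assuming the calibration outputs and the collected point lists are at hand (the function name is made up; like the sample, it reports an RMS value over all points):
@code{.cpp}
#include <opencv2/calib3d.hpp>
#include <cmath>
#include <vector>

double meanReprojectionError(const std::vector<std::vector<cv::Point3f> >& objectPoints,
                             const std::vector<std::vector<cv::Point2f> >& imagePoints,
                             const std::vector<cv::Mat>& rvecs, const std::vector<cv::Mat>& tvecs,
                             const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs)
{
    double totalErr = 0;
    size_t totalPoints = 0;
    std::vector<cv::Point2f> projected;
    for (size_t i = 0; i < objectPoints.size(); ++i)
    {
        // re-project the model points into view i with the estimated pose
        cv::projectPoints(objectPoints[i], rvecs[i], tvecs[i], cameraMatrix, distCoeffs, projected);
        double err = cv::norm(imagePoints[i], projected, cv::NORM_L2);  // absolute norm per view
        totalErr    += err * err;
        totalPoints += objectPoints[i].size();
    }
    return std::sqrt(totalErr / totalPoints);  // RMS error over all calibration points
}
@endcode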
Results
-------
Let there be [this input chessboard pattern ](pattern.png) which has a size of 9 x 6. I've used an
AXIS IP camera to create a couple of snapshots of the board and saved them into a VID5 directory. I've
put this inside the `images/CameraCalibration` folder of my working directory and created the
following `VID5.XML` file that describes which images to use:
@code{.xml}
<?xml version="1.0"?>
<opencv_storage>
<images>
images/CameraCalibration/VID5/xx1.jpg
images/CameraCalibration/VID5/xx2.jpg
images/CameraCalibration/VID5/xx3.jpg
images/CameraCalibration/VID5/xx4.jpg
images/CameraCalibration/VID5/xx5.jpg
images/CameraCalibration/VID5/xx6.jpg
images/CameraCalibration/VID5/xx7.jpg
images/CameraCalibration/VID5/xx8.jpg
</images>
</opencv_storage>
@endcode
Then I passed `images/CameraCalibration/VID5/VID5.XML` as an input in the configuration file. Here's a
chessboard pattern found during the runtime of the application:
![](images/fileListImage.jpg)
After applying the distortion removal we get:
![](images/fileListImageUnDist.jpg)
The same works for [this asymmetrical circle pattern ](acircles_pattern.png) by setting the input
width to 4 and height to 11. This time I've used a live camera feed by specifying its ID ("1") for
the input. Here's how a detected pattern should look:
![](images/asymetricalPattern.jpg)
In both cases in the specified output XML/YAML file you'll find the camera and distortion
coefficients matrices:
@code{.xml}
<camera_matrix type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>d</dt>
<data>
6.5746697944293521e+002 0. 3.1950000000000000e+002 0.
6.5746697944293521e+002 2.3950000000000000e+002 0. 0. 1.</data></camera_matrix>
<distortion_coefficients type_id="opencv-matrix">
<rows>5</rows>
<cols>1</cols>
<dt>d</dt>
<data>
-4.1802327176423804e-001 5.0715244063187526e-001 0. 0.
-5.7843597214487474e-001</data></distortion_coefficients>
@endcode
Add these values as constants to your program, call @ref cv::initUndistortRectifyMap and the
@ref cv::remap function to remove the distortion, and enjoy distortion-free inputs even from cheap, low
quality cameras.
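A minimal sketch of that workflow, assuming the calibration was saved with the node names shown above (the output file and image names are placeholders):
@code{.cpp}
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

int main()
{
    cv::FileStorage fs("out_camera_data.xml", cv::FileStorage::READ);  // placeholder file name
    cv::Mat cameraMatrix, distCoeffs;
    fs["camera_matrix"] >> cameraMatrix;
    fs["distortion_coefficients"] >> distCoeffs;
    fs.release();

    cv::Mat view = cv::imread("snapshot.jpg");   // hypothetical input image
    cv::Mat undistorted;
    cv::undistort(view, undistorted, cameraMatrix, distCoeffs);
    return 0;
}
@endcode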
You may observe a runtime instance of this on the [YouTube
here](https://www.youtube.com/watch?v=ViPN810E0SU).
@youtube{ViPN810E0SU}


@@ -0,0 +1,48 @@
Create calibration pattern {#tutorial_camera_calibration_pattern}
=========================================
@tableofcontents
@next_tutorial{tutorial_camera_calibration_square_chess}
| | |
| -: | :- |
| Original author | Laurent Berger |
| Compatibility | OpenCV >= 3.0 |
The goal of this tutorial is to learn how to create a calibration pattern.
You can find a chessboard pattern in https://github.com/opencv/opencv/blob/master/doc/pattern.png
You can find a circleboard pattern in https://github.com/opencv/opencv/blob/master/doc/acircles_pattern.png
Create your own pattern
---------------
Now, if you want to create your own pattern, you will need Python to use https://github.com/opencv/opencv/blob/master/doc/pattern_tools/gen_pattern.py
Example
create a checkerboard pattern in file chessboard.svg with 9 rows, 6 columns and a square size of 20mm:
python gen_pattern.py -o chessboard.svg --rows 9 --columns 6 --type checkerboard --square_size 20
create a circle board pattern in file circleboard.svg with 7 rows, 5 columns and a radius of 15mm:
python gen_pattern.py -o circleboard.svg --rows 7 --columns 5 --type circles --square_size 15
create an asymmetrical circle board pattern in file acircleboard.svg with 7 rows, 5 columns, a square size of 10mm and less spacing between circles:
python gen_pattern.py -o acircleboard.svg --rows 7 --columns 5 --type acircles --square_size 10 --radius_rate 2
If you want to change the unit, use the -u option (mm, inches, px, m)
If you want to change the page size, use the -w and -h options
@cond HAVE_opencv_aruco
If you want to create a ChArUco board read @ref tutorial_charuco_detection "tutorial Detection of ChArUco Corners" in opencv_contrib tutorial.
@endcond
@cond !HAVE_opencv_aruco
If you want to create a ChArUco board read tutorial Detection of ChArUco Corners in opencv_contrib tutorial.
@endcond


@@ -0,0 +1,66 @@
Camera calibration with square chessboard {#tutorial_camera_calibration_square_chess}
=========================================
@tableofcontents
@prev_tutorial{tutorial_camera_calibration_pattern}
@next_tutorial{tutorial_camera_calibration}
| | |
| -: | :- |
| Original author | Victor Eruhimov |
| Compatibility | OpenCV >= 4.0 |
The goal of this tutorial is to learn how to calibrate a camera given a set of chessboard images.
*Test data*: use images in your data/chess folder.
- Compile OpenCV with samples by setting BUILD_EXAMPLES to ON in cmake configuration.
- Go to bin folder and use imagelist_creator to create an XML/YAML list of your images.
- Then, run calibration sample to get camera parameters. Use square size equal to 3cm.
Pose estimation
---------------
Now, let us write code that detects a chessboard in an image and finds its distance from the
camera. You can apply this method to any object with known 3D geometry which you can detect in an
image.
*Test data*: use chess_test\*.jpg images from your data folder.
- Create an empty console project. Load a test image:
@code{.cpp}
Mat img = imread(argv[1], IMREAD_GRAYSCALE);
@endcode
- Detect a chessboard in this image using the findChessboardCorners function:
@code{.cpp}
bool found = findChessboardCorners( img, boardSize, ptvec, CALIB_CB_ADAPTIVE_THRESH );
@endcode
- Now, write a function that generates a vector\<Point3f\> array of 3d coordinates of a chessboard
in any coordinate system. For simplicity, let us choose a system such that one of the chessboard
corners is in the origin and the board is in the plane *z = 0*.
- Read the camera parameters from the XML/YAML file:
@code{.cpp}
FileStorage fs( filename, FileStorage::READ );
Mat intrinsics, distortion;
fs["camera_matrix"] >> intrinsics;
fs["distortion_coefficients"] >> distortion;
@endcode
- Now we are ready to find the chessboard pose by running `solvePnP`:
@code{.cpp}
vector<Point3f> boardPoints;
// fill the array
...
solvePnP(Mat(boardPoints), Mat(foundBoardCorners), cameraMatrix,
         distCoeffs, rvec, tvec, false);
@endcode
- Calculate reprojection error like it is done in calibration sample (see
opencv/samples/cpp/calibration.cpp, function computeReprojectionErrors).
Question: how would you calculate the distance from the camera origin to any one of the corners?
Answer: `solvePnP` gives us the rotation and translation that bring board coordinates into the camera frame. Transform the chosen corner into the camera frame with them; the distance to the camera origin is then simply the L2 norm of the resulting 3D point.
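A minimal sketch of that computation, assuming the `rvec`/`tvec` returned by `solvePnP` above (double precision, as solvePnP outputs; the helper name is made up):
@code{.cpp}
#include <opencv2/calib3d.hpp>

// Hypothetical helper: distance from the camera origin to one chessboard corner,
// given the corner in board coordinates and the rvec/tvec estimated by solvePnP.
double distanceToCorner(const cv::Point3f& corner, const cv::Mat& rvec, const cv::Mat& tvec)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R);                                            // rotation vector -> 3x3 matrix
    cv::Mat p = (cv::Mat_<double>(3, 1) << corner.x, corner.y, corner.z);
    cv::Mat pCam = R * p + tvec;                                       // corner in the camera frame
    return cv::norm(pCam);                                             // L2 distance from the camera origin
}
@endcode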


@@ -0,0 +1,208 @@
Interactive camera calibration application {#tutorial_interactive_calibration}
==============================
@tableofcontents
@prev_tutorial{tutorial_real_time_pose}
| | |
| -: | :- |
| Original author | Vladislav Sovrasov |
| Compatibility | OpenCV >= 3.1 |
With the classical calibration technique the user must collect all data first and then run the @ref cv::calibrateCamera function
to obtain the camera parameters. If the average re-projection error is huge or if the estimated parameters seem to be wrong, the process of
selecting or collecting data and running @ref cv::calibrateCamera is repeated.
The interactive calibration process assumes that after each new data portion the user can see the results and error estimates, and
can delete the last data portion; finally, when the dataset for calibration is big enough, the process of automatic data selection starts.
Main application features
------
The sample application will:
- Determine the distortion matrix and confidence interval for each element
- Determine the camera matrix and confidence interval for each element
- Take input from camera or video file
- Read configuration from XML file
- Save the results into XML file
- Calculate re-projection error
- Reject pattern views at sharp angles to prevent ill-conditioned Jacobian blocks
- Auto switch calibration flags (fix aspect ratio and elements of distortion matrix if needed)
- Auto detect when calibration is done by using several criteria
- Auto capture of static patterns (the user doesn't need to press any keys to capture a frame, just don't move the pattern for a second)
Supported patterns:
- Black-white chessboard
- Asymmetrical circle pattern
- Dual asymmetrical circle pattern
- chAruco (chessboard with Aruco markers)
Description of parameters
------
The application has two groups of parameters: primary (passed through the command line) and advanced (passed through an XML file).
### Primary parameters:
All of these parameters are passed to the application through the command line.
-[parameter]=[default value]: description
- -v=[filename]: get video from filename, default input -- camera with id=0
- -ci=[0]: get video from camera with specified id
- -flip=[false]: vertical flip of input frames
- -t=[circles]: pattern for calibration (circles, chessboard, dualCircles, chAruco)
- -sz=[16.3]: distance between two nearest centers of circles or squares on calibration board
- -dst=[295]: distance between white and black parts of the dualCircles pattern
- -w=[width]: width of pattern (in corners or circles)
- -h=[height]: height of pattern (in corners or circles)
- -of=[camParams.xml]: output file name
- -ft=[true]: auto tuning of calibration flags
- -vis=[grid]: captured boards visualization (grid, window)
- -d=[0.8]: delay between captures in seconds
- -pf=[defaultConfig.xml]: advanced application parameters file
### Advanced parameters:
By default, the values of the advanced parameters are stored in defaultConfig.xml:
@code{.xml}
<?xml version="1.0"?>
<opencv_storage>
<charuco_dict>0</charuco_dict>
<charuco_square_length>200</charuco_square_length>
<charuco_marker_size>100</charuco_marker_size>
<calibration_step>1</calibration_step>
<max_frames_num>30</max_frames_num>
<min_frames_num>10</min_frames_num>
<solver_eps>1e-7</solver_eps>
<solver_max_iters>30</solver_max_iters>
<fast_solver>0</fast_solver>
<frame_filter_conv_param>0.1</frame_filter_conv_param>
<camera_resolution>1280 720</camera_resolution>
</opencv_storage>
@endcode
- *charuco_dict*: name of special dictionary, which has been used for generation of chAruco pattern
- *charuco_square_length*: size of square on chAruco board (in pixels)
- *charuco_marker_size*: size of Aruco markers on chAruco board (in pixels)
- *calibration_step*: interval in frames between launches of @ref cv::calibrateCamera
- *max_frames_num*: if the number of frames for calibration is greater than this value, the frame filter starts working.
After filtration the size of the calibration dataset equals *max_frames_num*
- *min_frames_num*: if the number of frames is greater than this value, auto flags tuning, the undistorted view and quality evaluation are turned on
- *solver_eps*: precision of the Levenberg-Marquardt solver in @ref cv::calibrateCamera
- *solver_max_iters*: iteration limit of the solver
- *fast_solver*: if this value is nonzero and LAPACK is found, QR decomposition is used instead of SVD in the solver.
QR is faster than SVD, but potentially less precise
- *frame_filter_conv_param*: parameter used in the linear convolution of the bicriterial frame filter
- *camera_resolution*: resolution of the camera used for calibration
**Note:** *charuco_dict*, *charuco_square_length* and *charuco_marker_size* are used for chAruco pattern generation
(see Aruco module description for details: [Aruco tutorials](https://github.com/opencv/opencv_contrib/tree/master/modules/aruco/tutorials))
Default chAruco pattern:
![](images/charuco_board.png)
Dual circles pattern
------
To make this pattern you need the standard OpenCV circles pattern and a binary inverted one.
Place the two patterns on one plane so that all horizontal lines of circles in one pattern are
continuations of the corresponding lines in the other.
Measure the distance between the patterns as shown in the picture below and pass it as the **dst** command line parameter. Also measure the distance between the centers of the nearest circles and pass
this value as the **sz** command line parameter.
![](images/dualCircles.jpg)
This pattern is very sensitive to the quality of production and measurement.
Data filtration
------
When the size of the calibration dataset is greater than *max_frames_num*, the data filter starts working.
It tries to remove "bad" frames from the dataset: the filter removes the frame
on which \f$loss\_function\f$ takes its maximum.
\f[loss\_function(i)=\alpha RMS(i)+(1-\alpha)reducedGridQuality(i)\f]
**RMS** is the average re-projection error calculated for frame *i*, and **reducedGridQuality**
is the scene coverage quality evaluation without frame *i*. \f$\alpha\f$ equals
**frame_filter_conv_param**.
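A minimal sketch of this selection rule, assuming per-frame RMS and reduced grid quality values are already available (the function and variable names are illustrative, not the application's actual code):
@code{.cpp}
#include <cstddef>
#include <vector>

// Illustrative only: pick the frame with the largest loss, as described above.
// rms[i] and gridQuality[i] stand for RMS(i) and reducedGridQuality(i);
// alpha corresponds to frame_filter_conv_param.
std::size_t worstFrame(const std::vector<double>& rms,
                       const std::vector<double>& gridQuality,
                       double alpha)
{
    std::size_t worst = 0;
    double worstLoss = -1.0;
    for (std::size_t i = 0; i < rms.size(); ++i)
    {
        double loss = alpha * rms[i] + (1.0 - alpha) * gridQuality[i];
        if (loss > worstLoss) { worstLoss = loss; worst = i; }
    }
    return worst;  // index of the frame to remove from the dataset
}
@endcode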
Calibration process
------
To start the calibration just run the application. Place the pattern in front of the camera and hold it in some pose.
After that wait for capturing (a message like "Frame #i captured" will be shown).
The current focal distance and re-projection error will be shown on the main screen. Move the pattern to the next position and repeat the procedure. Try to cover the image plane
uniformly and don't show the pattern at sharp angles to the image plane.
![](images/screen_charuco.jpg)
If the calibration seems to be successful (confidence intervals and average re-projection
error are small, frame coverage quality and number of pattern views are big enough),
the application will show a message like the one on the screen below.
![](images/screen_finish.jpg)
Hot keys:
- Esc -- exit application
- s -- save current data to XML file
- r -- delete last frame
- d -- delete all frames
- u -- enable/disable applying of undistortion
- v -- switch visualization mode
Results
------
As a result you will get the camera parameters and confidence intervals for them.
Example of output XML file:
@code{.xml}
<?xml version="1.0"?>
<opencv_storage>
<calibrationDate>"Thu 07 Apr 2016 04:23:03 PM MSK"</calibrationDate>
<framesCount>21</framesCount>
<cameraResolution>
1280 720</cameraResolution>
<cameraMatrix type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>d</dt>
<data>
1.2519588293098975e+03 0. 6.6684948780852471e+02 0.
1.2519588293098975e+03 3.6298123112613683e+02 0. 0. 1.</data></cameraMatrix>
<cameraMatrix_std_dev type_id="opencv-matrix">
<rows>4</rows>
<cols>1</cols>
<dt>d</dt>
<data>
0. 1.2887048808572649e+01 2.8536856683866230e+00
2.8341737483430314e+00</data></cameraMatrix_std_dev>
<dist_coeffs type_id="opencv-matrix">
<rows>1</rows>
<cols>5</cols>
<dt>d</dt>
<data>
1.3569117181595716e-01 -8.2513063822554633e-01 0. 0.
1.6412101575010554e+00</data></dist_coeffs>
<dist_coeffs_std_dev type_id="opencv-matrix">
<rows>5</rows>
<cols>1</cols>
<dt>d</dt>
<data>
1.5570675523402111e-02 8.7229075437543435e-02 0. 0.
1.8382427901856876e-01</data></dist_coeffs_std_dev>
<avg_reprojection_error>4.2691743074130178e-01</avg_reprojection_error>
</opencv_storage>
@endcode


@@ -0,0 +1,806 @@
Real Time pose estimation of a textured object {#tutorial_real_time_pose}
==============================================
@tableofcontents
@prev_tutorial{tutorial_camera_calibration}
@next_tutorial{tutorial_interactive_calibration}
| | |
| -: | :- |
| Original author | Edgar Riba |
| Compatibility | OpenCV >= 3.0 |
Nowadays, augmented reality is one of the top research topics in the computer vision and robotics fields.
The most elemental problem in augmented reality is the estimation of the camera pose with respect to an
object: in computer vision, in order to do some 3D rendering afterwards; in robotics, to obtain the object
pose in order to grasp it and do some manipulation. However, this is not a trivial
problem to solve, since the most common issue in image processing is the computational
cost of applying a lot of algorithms or mathematical operations to solve a problem which is basic
and immediate for humans.
Goal
----
This tutorial explains how to build a real-time application to estimate the camera pose in
order to track a textured object with six degrees of freedom, given a 2D image and its 3D textured
model.
The application will have the following parts:
- Read 3D textured object model and object mesh.
- Take input from Camera or Video.
- Extract ORB features and descriptors from the scene.
- Match scene descriptors with model descriptors using Flann matcher.
- Pose estimation using PnP + Ransac.
- Linear Kalman Filter for bad poses rejection.
Theory
------
In computer vision, estimating the camera pose from *n* 3D-to-2D point correspondences is a fundamental
and well understood problem. The most general version of the problem requires estimating the six
degrees of freedom of the pose and five calibration parameters: focal length, principal point,
aspect ratio and skew. It can be solved with a minimum of 6 correspondences, using the well
known Direct Linear Transform (DLT) algorithm. There are, though, several simplifications to the
problem which have turned into an extensive list of different algorithms that improve on the accuracy of the
DLT.
The most common simplification is to assume known calibration parameters, which leads to the so-called
Perspective-*n*-Point problem:
![](images/pnp.jpg)
**Problem Formulation:** Given a set of correspondences between 3D points \f$p_i\f$ expressed in a world
reference frame, and their 2D projections \f$u_i\f$ onto the image, we seek to retrieve the pose (\f$R\f$
and \f$t\f$) of the camera w.r.t. the world and the focal length \f$f\f$.
OpenCV provides four different approaches to solve the Perspective-*n*-Point problem which return
\f$R\f$ and \f$t\f$. Then, using the following formula it's possible to project 3D points into the image
plane:
\f[s\ \left [ \begin{matrix} u \\ v \\ 1 \end{matrix} \right ] = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{matrix} \right ] \left [ \begin{matrix} X \\ Y \\ Z\\ 1 \end{matrix} \right ]\f]
The complete documentation on how to work with these equations is in @ref calib3d .
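A minimal sketch of the projection formula above, assuming the intrinsics \f$A\f$ and the pose \f$R\f$, \f$t\f$ are given as CV_64F matrices (the helper name is made up; in practice @ref cv::projectPoints does this, plus distortion, for you):
@code{.cpp}
#include <opencv2/core.hpp>

// Illustrative only: apply the projection formula above to one 3D world point.
// A is the 3x3 intrinsic matrix, R the 3x3 rotation, t the 3x1 translation (all CV_64F).
cv::Point2f projectPoint(const cv::Mat& A, const cv::Mat& R, const cv::Mat& t, const cv::Point3f& X)
{
    cv::Mat Xh = (cv::Mat_<double>(4, 1) << X.x, X.y, X.z, 1.0);  // homogeneous 3D point
    cv::Mat Rt;
    cv::hconcat(R, t, Rt);                                        // [R | t], a 3x4 matrix
    cv::Mat uvs = A * Rt * Xh;                                    // s * [u v 1]'
    double s = uvs.at<double>(2);
    return cv::Point2f((float)(uvs.at<double>(0) / s), (float)(uvs.at<double>(1) / s));
}
@endcode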
Source code
-----------
You can find the source code of this tutorial in the
`samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/` folder of the OpenCV source library.
The tutorial consists of two main programs:
-# **Model registration**
This application is for those who don't have a 3D textured model of the object to be detected.
You can use this program to create your own textured 3D model. This program only works for planar
objects, so if you want to model an object with a complex shape you should use more sophisticated
software to create it.
The application needs an input image of the object to be registered and its 3D mesh. We also have
to provide the intrinsic parameters of the camera with which the input image was taken. All the
files need to be specified using the absolute path or the path relative to your application's
working directory. If no files are specified, the program will try to open the provided default
parameters.
The application starts by extracting the ORB features and descriptors from the input image and
then uses the mesh along with the [Möller–Trumbore intersection
algorithm](http://en.wikipedia.org/wiki/M%C3%B6ller%E2%80%93Trumbore_intersection_algorithm)
to compute the 3D coordinates of the found features. Finally, the 3D points and the descriptors
are stored in different lists of a YAML-format file in which each row is a different point. The
technical background on how to store the files can be found in the @ref tutorial_file_input_output_with_xml_yml
tutorial.
![](images/registration.png)
-# **Model detection**
The aim of this application is to estimate the object pose in real time given its 3D textured model.
The application starts by loading the 3D textured model in YAML file format with the same
structure explained in the model registration program. From the scene, the ORB features and
descriptors are detected and extracted. Then, @ref cv::FlannBasedMatcher together with
@ref cv::flann::GenericIndex is used to do the matching between the scene descriptors and the model descriptors.
Using the found matches along with the @ref cv::solvePnPRansac function, the `R` and `t` of
the camera are computed. Finally, a Kalman filter is applied in order to reject bad poses.
If you compiled OpenCV with the samples, you can find the application in `opencv/build/bin/cpp-tutorial-pnp_detection`.
Then you can run the application and change some parameters:
@code{.cpp}
This program shows how to detect an object given its 3D textured model. You can choose to use a recorded video or the webcam.
Usage:
./cpp-tutorial-pnp_detection -help
Keys:
'esc' - to quit.
--------------------------------------------------------------------------
Usage: cpp-tutorial-pnp_detection [params]
-c, --confidence (value:0.95)
RANSAC confidence
-e, --error (value:2.0)
RANSAC reprojection error
-f, --fast (value:true)
use of robust fast match
-h, --help (value:true)
print this message
--in, --inliers (value:30)
minimum inliers for Kalman update
--it, --iterations (value:500)
RANSAC maximum iterations count
-k, --keypoints (value:2000)
number of keypoints to detect
--mesh
path to ply mesh
--method, --pnp (value:0)
PnP method: (0) ITERATIVE - (1) EPNP - (2) P3P - (3) DLS
--model
path to yml model
-r, --ratio (value:0.7)
threshold for ratio test
-v, --video
path to recorded video
@endcode
For example, you can run the application changing the pnp method:
@code{.cpp}
./cpp-tutorial-pnp_detection --method=2
@endcode
Explanation
-----------
Here the code of the real-time application is explained in detail:
-# **Read 3D textured object model and object mesh.**
In order to load the textured model I implemented the *class* **Model** which has the function
*load()* that opens a YAML file and takes the stored 3D points with their corresponding descriptors.
You can find an example of a 3D textured model in
`samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/cookies_ORB.yml`.
@code{.cpp}
/* Load a YAML file using OpenCV */
void Model::load(const std::string path)
{
cv::Mat points3d_mat;
cv::FileStorage storage(path, cv::FileStorage::READ);
storage["points_3d"] >> points3d_mat;
storage["descriptors"] >> descriptors_;
points3d_mat.copyTo(list_points3d_in_);
storage.release();
}
@endcode
In the main program the model is loaded as follows:
@code{.cpp}
Model model; // instantiate Model object
model.load(yml_read_path); // load a 3D textured object model
@endcode
In order to read the model mesh I implemented a *class* **Mesh** which has a function *load()*
that opens a \f$*\f$.ply file and stores the 3D points of the object and also the triangles composing it.
You can find an example of a model mesh in
`samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.ply`.
@code{.cpp}
/* Load a CSV with *.ply format */
void Mesh::load(const std::string path)
{
// Create the reader
CsvReader csvReader(path);
// Clear previous data
list_vertex_.clear();
list_triangles_.clear();
// Read from .ply file
csvReader.readPLY(list_vertex_, list_triangles_);
// Update mesh attributes
num_vertexs_ = list_vertex_.size();
num_triangles_ = list_triangles_.size();
}
@endcode
In the main program the mesh is loaded as follows:
@code{.cpp}
Mesh mesh; // instantiate Mesh object
mesh.load(ply_read_path); // load an object mesh
@endcode
You can also load different model and mesh:
@code{.cpp}
./cpp-tutorial-pnp_detection --mesh=/absolute_path_to_your_mesh.ply --model=/absolute_path_to_your_model.yml
@endcode
-# **Take input from Camera or Video**
To detect, it is necessary to capture video. This is done by loading a recorded video, passing the absolute
path where it is located on your machine. In order to test the application you can find a recorded
video in `samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.mp4`.
@code{.cpp}
cv::VideoCapture cap; // instantiate VideoCapture
cap.open(video_read_path); // open a recorded video
if(!cap.isOpened()) // check if we succeeded
{
std::cout << "Could not open the camera device" << std::endl;
return -1;
}
@endcode
Then the algorithm is computed frame by frame:
@code{.cpp}
cv::Mat frame, frame_vis;
while(cap.read(frame) && cv::waitKey(30) != 27) // capture frame until ESC is pressed
{
frame_vis = frame.clone(); // refresh visualisation frame
// MAIN ALGORITHM
}
@endcode
You can also load different recorded video:
@code{.cpp}
./cpp-tutorial-pnp_detection --video=/absolute_path_to_your_video.mp4
@endcode
-# **Extract ORB features and descriptors from the scene**
The next step is to detect the scene features and extract their descriptors. For this task I
implemented a *class* **RobustMatcher** which has functions for keypoint detection and feature
extraction. You can find it in
`samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/src/RobusMatcher.cpp`. In your
*RobustMatcher* object you can use any of the 2D feature detectors of OpenCV. In this case I used
@ref cv::ORB features because it is based on @ref cv::FAST to detect the keypoints and cv::xfeatures2d::BriefDescriptorExtractor
to extract the descriptors, which means that it is fast and robust to rotations. You can find more
detailed information about *ORB* in the documentation.
The following code shows how to instantiate and set the feature detector and the descriptor
extractor:
@code{.cpp}
RobustMatcher rmatcher; // instantiate RobustMatcher
cv::FeatureDetector * detector = new cv::OrbFeatureDetector(numKeyPoints); // instantiate ORB feature detector
cv::DescriptorExtractor * extractor = new cv::OrbDescriptorExtractor(); // instantiate ORB descriptor extractor
rmatcher.setFeatureDetector(detector); // set feature detector
rmatcher.setDescriptorExtractor(extractor); // set descriptor extractor
@endcode
The features and descriptors will be computed by the *RobustMatcher* inside the matching function.
-# **Match scene descriptors with model descriptors using Flann matcher**
This is the first step in our detection algorithm. The main idea is to match the scene descriptors
with our model descriptors in order to know the 3D coordinates of the features found in the
current scene.
Firstly, we have to set which matcher we want to use. In this case the
@ref cv::FlannBasedMatcher matcher is used, which in terms of computational cost is faster than the
@ref cv::BFMatcher matcher as we increase the trained collection of features. Then, for the
FlannBased matcher the index created is *Multi-Probe LSH: Efficient Indexing for High-Dimensional
Similarity Search*, because *ORB* descriptors are binary.
You can tune the *LSH* and search parameters to improve the matching efficiency:
@code{.cpp}
cv::Ptr<cv::flann::IndexParams> indexParams = cv::makePtr<cv::flann::LshIndexParams>(6, 12, 1); // instantiate LSH index parameters
cv::Ptr<cv::flann::SearchParams> searchParams = cv::makePtr<cv::flann::SearchParams>(50); // instantiate flann search parameters
cv::DescriptorMatcher * matcher = new cv::FlannBasedMatcher(indexParams, searchParams); // instantiate FlannBased matcher
rmatcher.setDescriptorMatcher(matcher); // set matcher
@endcode
Secondly, we have to call the matcher by using the *robustMatch()* or *fastRobustMatch()* function.
The difference between these two functions is their computational cost. The first method is slower
but more robust at filtering good matches because it uses two ratio tests and a symmetry test. In
contrast, the second method is faster but less robust because it only applies a single ratio test to
the matches.
The following code gets the model 3D points and their descriptors and then calls the matcher in
the main program:
@code{.cpp}
// Get the MODEL INFO
std::vector<cv::Point3f> list_points3d_model = model.get_points3d(); // list with model 3D coordinates
cv::Mat descriptors_model = model.get_descriptors(); // list with descriptors of each 3D coordinate
@endcode
@code{.cpp}
// -- Step 1: Robust matching between model descriptors and scene descriptors
std::vector<cv::DMatch> good_matches; // to obtain the model 3D points in the scene
std::vector<cv::KeyPoint> keypoints_scene; // to obtain the 2D points of the scene
if(fast_match)
{
rmatcher.fastRobustMatch(frame, good_matches, keypoints_scene, descriptors_model);
}
else
{
rmatcher.robustMatch(frame, good_matches, keypoints_scene, descriptors_model);
}
@endcode
The following code corresponds to the *robustMatch()* function which belongs to the
*RobustMatcher* class. This function uses the given image to detect the keypoints and extract the
descriptors, then matches the extracted descriptors against the given model descriptors (and vice
versa) using the *two nearest neighbours* strategy. Then, a ratio test is applied to the matches in
both directions in order to remove those matches whose distance ratio between the first and second
best match is larger than a given threshold. Finally, a symmetry test is applied in order to remove
non-symmetrical matches.
@code{.cpp}
void RobustMatcher::robustMatch( const cv::Mat& frame, std::vector<cv::DMatch>& good_matches,
std::vector<cv::KeyPoint>& keypoints_frame,
const std::vector<cv::KeyPoint>& keypoints_model, const cv::Mat& descriptors_model )
{
// 1a. Detection of the ORB features
this->computeKeyPoints(frame, keypoints_frame);
// 1b. Extraction of the ORB descriptors
cv::Mat descriptors_frame;
this->computeDescriptors(frame, keypoints_frame, descriptors_frame);
// 2. Match the two image descriptors
std::vector<std::vector<cv::DMatch> > matches12, matches21;
// 2a. From image 1 to image 2
matcher_->knnMatch(descriptors_frame, descriptors_model, matches12, 2); // return 2 nearest neighbours
// 2b. From image 2 to image 1
matcher_->knnMatch(descriptors_model, descriptors_frame, matches21, 2); // return 2 nearest neighbours
// 3. Remove matches for which NN ratio is > than threshold
// clean image 1 -> image 2 matches
int removed1 = ratioTest(matches12);
// clean image 2 -> image 1 matches
int removed2 = ratioTest(matches21);
// 4. Remove non-symmetrical matches
symmetryTest(matches12, matches21, good_matches);
}
@endcode
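The *ratioTest()* and *symmetryTest()* helpers are not shown above. A minimal free-function sketch of the ratio test (in the sample the threshold is a class member; here it is passed as a parameter):
@code{.cpp}
#include <opencv2/core.hpp>
#include <vector>

// Sketch of the ratio test described above: discard a match when the best distance is not
// clearly smaller than the second-best one. Returns the number of removed matches.
int ratioTest(std::vector<std::vector<cv::DMatch> >& matches, float ratio)
{
    int removed = 0;
    for (size_t i = 0; i < matches.size(); ++i)
    {
        if (matches[i].size() < 2 ||
            matches[i][0].distance / matches[i][1].distance > ratio)
        {
            matches[i].clear();   // mark as removed
            ++removed;
        }
    }
    return removed;
}
@endcode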
After the match filtering we have to extract the 2D and 3D correspondences from the found scene
keypoints and our 3D model using the obtained *DMatches* vector. For more information about
@ref cv::DMatch check the documentation.
@code{.cpp}
// -- Step 2: Find out the 2D/3D correspondences
std::vector<cv::Point3f> list_points3d_model_match; // container for the model 3D coordinates found in the scene
std::vector<cv::Point2f> list_points2d_scene_match; // container for the model 2D coordinates found in the scene
for(unsigned int match_index = 0; match_index < good_matches.size(); ++match_index)
{
cv::Point3f point3d_model = list_points3d_model[ good_matches[match_index].trainIdx ]; // 3D point from model
cv::Point2f point2d_scene = keypoints_scene[ good_matches[match_index].queryIdx ].pt; // 2D point from the scene
list_points3d_model_match.push_back(point3d_model); // add 3D point
list_points2d_scene_match.push_back(point2d_scene); // add 2D point
}
@endcode
You can also change the ratio test threshold and the number of keypoints to detect, as well as choose
whether to use the robust matcher:
@code{.cpp}
./cpp-tutorial-pnp_detection --ratio=0.8 --keypoints=1000 --fast=false
@endcode
-# **Pose estimation using PnP + Ransac**
Once we have the 2D and 3D correspondences we have to apply a PnP algorithm in order to estimate the
camera pose. The reason why we have to use @ref cv::solvePnPRansac instead of @ref cv::solvePnP is
that after the matching not all the found correspondences are correct; as likely
as not, there are false correspondences, also called *outliers*. [Random Sample
Consensus](http://en.wikipedia.org/wiki/RANSAC), or *Ransac*, is a non-deterministic iterative
method which estimates the parameters of a mathematical model from observed data, producing an increasingly
accurate result as the number of iterations increases. After applying *Ransac* the *outliers*
are eliminated, and the camera pose is then estimated with a certain probability of obtaining a good
solution.
For the camera pose estimation I have implemented a *class* **PnPProblem**. This *class* has 4
attributes: a given calibration matrix, the rotation matrix, the translation matrix and the
rotation-translation matrix. The intrinsic calibration parameters of the camera which you are
using to estimate the pose are necessary. In order to obtain the parameters you can check
@ref tutorial_camera_calibration_square_chess and @ref tutorial_camera_calibration tutorials.
The following code shows how to declare the *PnPProblem* class in the main program:
@code{.cpp}
// Intrinsic camera parameters: UVC WEBCAM
double f = 55; // focal length in mm
double sx = 22.3, sy = 14.9; // sensor size
double width = 640, height = 480; // image size
double params_WEBCAM[] = { width*f/sx, // fx
height*f/sy, // fy
width/2, // cx
height/2}; // cy
PnPProblem pnp_detection(params_WEBCAM); // instantiate PnPProblem class
@endcode
The following code shows how the *PnPProblem* class initialises its attributes:
@code{.cpp}
// Custom constructor given the intrinsic camera parameters
PnPProblem::PnPProblem(const double params[])
{
_A_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // intrinsic camera parameters
_A_matrix.at<double>(0, 0) = params[0]; // [ fx 0 cx ]
_A_matrix.at<double>(1, 1) = params[1]; // [ 0 fy cy ]
_A_matrix.at<double>(0, 2) = params[2]; // [ 0 0 1 ]
_A_matrix.at<double>(1, 2) = params[3];
_A_matrix.at<double>(2, 2) = 1;
_R_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // rotation matrix
_t_matrix = cv::Mat::zeros(3, 1, CV_64FC1); // translation matrix
_P_matrix = cv::Mat::zeros(3, 4, CV_64FC1); // rotation-translation matrix
}
@endcode
OpenCV provides four PnP methods: ITERATIVE, EPNP, P3P and DLS. Depending on the application type,
the estimation method will be different. If we want to make a real-time application,
the more suitable methods are EPNP and P3P, since they are faster than ITERATIVE and DLS at
finding an optimal solution. However, EPNP and P3P are not especially robust to planar
surfaces and sometimes the pose estimation seems to have a mirror effect. Therefore, in this
tutorial the ITERATIVE method is used because the object to be detected has planar surfaces.
The OpenCV RANSAC implementation wants you to provide three parameters: 1) the maximum number of
iterations until the algorithm stops, 2) the maximum allowed distance between the observed and
computed point projections to consider it an inlier and 3) the confidence to obtain a good result.
You can tune these parameters in order to improve your algorithm performance. Increasing the
number of iterations yields a more accurate solution, but takes more time to find it.
Increasing the allowed reprojection error reduces the computation time, but your solution
will be less accurate. Decreasing the confidence makes your algorithm faster, but the obtained
solution will again be less accurate.
The following parameters work for this application:
@code{.cpp}
// RANSAC parameters
int iterationsCount = 500; // number of Ransac iterations.
float reprojectionError = 2.0; // maximum allowed distance to consider it an inlier.
float confidence = 0.95; // RANSAC successful confidence.
@endcode
The following code corresponds to the *estimatePoseRANSAC()* function which belongs to the
*PnPProblem class*. This function estimates the rotation and translation matrix given a set of
2D/3D correspondences, the desired PnP method to use, the output inliers container and the Ransac
parameters:
@code{.cpp}
// Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use
void PnPProblem::estimatePoseRANSAC( const std::vector<cv::Point3f> &list_points3d, // list with model 3D coordinates
const std::vector<cv::Point2f> &list_points2d, // list with scene 2D coordinates
int flags, cv::Mat &inliers, int iterationsCount, // PnP method; inliers container
float reprojectionError, float confidence ) // RANSAC parameters
{
cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64FC1); // vector of distortion coefficients
cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64FC1); // output rotation vector
cv::Mat tvec = cv::Mat::zeros(3, 1, CV_64FC1); // output translation vector
bool useExtrinsicGuess = false; // if true the function uses the provided rvec and tvec values as
// initial approximations of the rotation and translation vectors
cv::solvePnPRansac( list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec,
useExtrinsicGuess, iterationsCount, reprojectionError, confidence,
inliers, flags );
Rodrigues(rvec,_R_matrix); // converts Rotation Vector to Matrix
_t_matrix = tvec; // set translation matrix
this->set_P_matrix(_R_matrix, _t_matrix); // set rotation-translation matrix
}
@endcode
The following code contains the 3rd and 4th steps of the main algorithm: first, calling the
above function, and second, taking the output inliers vector from RANSAC to get the 2D scene
points for drawing purposes. As seen in the code, we must make sure to apply RANSAC only if we have
matches; otherwise the @ref cv::solvePnPRansac function crashes due to an OpenCV *bug*.
@code{.cpp}
if(good_matches.size() > 0) // if there are no matches, RANSAC crashes
{
// -- Step 3: Estimate the pose using RANSAC approach
pnp_detection.estimatePoseRANSAC( list_points3d_model_match, list_points2d_scene_match,
pnpMethod, inliers_idx, iterationsCount, reprojectionError, confidence );
// -- Step 4: Catch the inliers keypoints to draw
for(int inliers_index = 0; inliers_index < inliers_idx.rows; ++inliers_index)
{
int n = inliers_idx.at<int>(inliers_index); // i-inlier
cv::Point2f point2d = list_points2d_scene_match[n]; // i-inlier point 2D
list_points2d_inliers.push_back(point2d); // add i-inlier to list
    }
}
@endcode
Finally, once the camera pose has been estimated we can use \f$R\f$ and \f$t\f$ in order to compute
the 2D projection onto the image of a given 3D point expressed in a world reference frame, using
the formula shown in *Theory*.
The following code corresponds to the *backproject3DPoint()* function which belongs to the
*PnPProblem class*. The function backprojects a given 3D point expressed in a world reference frame
onto the 2D image:
@code{.cpp}
// Backproject a 3D point to 2D using the estimated pose parameters
cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f &point3d)
{
// 3D point vector [x y z 1]'
cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);
point3d_vec.at<double>(0) = point3d.x;
point3d_vec.at<double>(1) = point3d.y;
point3d_vec.at<double>(2) = point3d.z;
point3d_vec.at<double>(3) = 1;
// 2D point vector [u v 1]'
cv::Mat point2d_vec = cv::Mat(3, 1, CV_64FC1);
point2d_vec = _A_matrix * _P_matrix * point3d_vec;
// Normalization of [u v]'
cv::Point2f point2d;
point2d.x = point2d_vec.at<double>(0) / point2d_vec.at<double>(2);
point2d.y = point2d_vec.at<double>(1) / point2d_vec.at<double>(2);
return point2d;
}
@endcode
The above function is used to project all the 3D points of the object *Mesh* in order to show the pose of
the object.
You can also change RANSAC parameters and PnP method:
@code{.cpp}
./cpp-tutorial-pnp_detection --error=0.25 --confidence=0.90 --iterations=250 --method=3
@endcode
-# **Linear Kalman Filter for bad poses rejection**
It is common in the computer vision and robotics fields that after applying detection or tracking
techniques, bad results are obtained due to sensor errors. In order to avoid these bad
detections, this tutorial explains how to implement a linear Kalman filter. The Kalman
filter will be applied after a given number of inliers has been detected.
You can find more information about what a [Kalman
filter](http://en.wikipedia.org/wiki/Kalman_filter) is. This tutorial uses the OpenCV
implementation, @ref cv::KalmanFilter, following
[Linear Kalman Filter for position and orientation tracking](http://campar.in.tum.de/Chair/KalmanFilter)
to set up the dynamics and measurement models.
Firstly, we have to define our state vector, which will have 18 states: the positional data (x,y,z)
with its first and second derivatives (velocity and acceleration), then rotation is added in the form
of three Euler angles (roll, pitch, yaw) together with their first and second derivatives (angular
velocity and acceleration):
\f[X = (x,y,z,\dot x,\dot y,\dot z,\ddot x,\ddot y,\ddot z,\psi,\theta,\phi,\dot \psi,\dot \theta,\dot \phi,\ddot \psi,\ddot \theta,\ddot \phi)^T\f]
Secondly, we have to define the number of measurements, which will be 6: from \f$R\f$ and \f$t\f$ we can
extract \f$(x,y,z)\f$ and \f$(\psi,\theta,\phi)\f$. In addition, we have to define the number of control
actions to apply to the system, which in this case will be *zero*. Finally, we have to define the
differential time between measurements, which in this case is \f$1/T\f$, where *T* is the frame rate of
the video.
@code{.cpp}
cv::KalmanFilter KF; // instantiate Kalman Filter
int nStates = 18; // the number of states
int nMeasurements = 6; // the number of measured states
int nInputs = 0;             // the number of control actions
double dt = 0.125; // time between measurements (1/FPS)
initKalmanFilter(KF, nStates, nMeasurements, nInputs, dt); // init function
@endcode
The following code corresponds to the *Kalman Filter* initialisation. Firstly, the process
noise, the measurement noise and the error covariance matrix are set. Secondly, the transition
matrix, which is the dynamic model, is set, and finally the measurement matrix, which is the
measurement model.
You can tune the process and measurement noise to improve the *Kalman Filter* performance. The
smaller the measurement noise, the faster the filter converges, but it also becomes more
sensitive to bad measurements; a small, purely illustrative tuning sketch is shown after the
initialisation code below.
@code{.cpp}
void initKalmanFilter(cv::KalmanFilter &KF, int nStates, int nMeasurements, int nInputs, double dt)
{
KF.init(nStates, nMeasurements, nInputs, CV_64F); // init Kalman Filter
cv::setIdentity(KF.processNoiseCov, cv::Scalar::all(1e-5)); // set process noise
cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-4)); // set measurement noise
cv::setIdentity(KF.errorCovPost, cv::Scalar::all(1)); // error covariance
/* DYNAMIC MODEL */
// [1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0 0 0]
// [0 1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0 0]
// [0 0 1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0]
// [0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2 0 0]
// [0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2 0]
// [0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2]
// [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0]
// [0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0]
// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt]
// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0]
// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]
// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
// position
KF.transitionMatrix.at<double>(0,3) = dt;
KF.transitionMatrix.at<double>(1,4) = dt;
KF.transitionMatrix.at<double>(2,5) = dt;
KF.transitionMatrix.at<double>(3,6) = dt;
KF.transitionMatrix.at<double>(4,7) = dt;
KF.transitionMatrix.at<double>(5,8) = dt;
KF.transitionMatrix.at<double>(0,6) = 0.5*pow(dt,2);
KF.transitionMatrix.at<double>(1,7) = 0.5*pow(dt,2);
KF.transitionMatrix.at<double>(2,8) = 0.5*pow(dt,2);
// orientation
KF.transitionMatrix.at<double>(9,12) = dt;
KF.transitionMatrix.at<double>(10,13) = dt;
KF.transitionMatrix.at<double>(11,14) = dt;
KF.transitionMatrix.at<double>(12,15) = dt;
KF.transitionMatrix.at<double>(13,16) = dt;
KF.transitionMatrix.at<double>(14,17) = dt;
KF.transitionMatrix.at<double>(9,15) = 0.5*pow(dt,2);
KF.transitionMatrix.at<double>(10,16) = 0.5*pow(dt,2);
KF.transitionMatrix.at<double>(11,17) = 0.5*pow(dt,2);
/* MEASUREMENT MODEL */
// [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
// [0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
// [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
// [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]
KF.measurementMatrix.at<double>(0,0) = 1; // x
KF.measurementMatrix.at<double>(1,1) = 1; // y
KF.measurementMatrix.at<double>(2,2) = 1; // z
KF.measurementMatrix.at<double>(3,9) = 1; // roll
KF.measurementMatrix.at<double>(4,10) = 1; // pitch
KF.measurementMatrix.at<double>(5,11) = 1; // yaw
}
@endcode
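As mentioned above, the process and measurement noise covariances are the main tuning knobs. The
values below are purely illustrative (they are not the tutorial's defaults) and only show where
such tuning would go:
@code{.cpp}
// Illustrative tuning only, not the tutorial's default values:
// a larger process noise lets the state adapt faster to changes,
// a larger measurement noise makes the filter trust the PnP measurements less.
cv::setIdentity(KF.processNoiseCov, cv::Scalar::all(1e-4));      // less confident dynamic model
cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-2));  // noisier pose measurements
@endcode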
The following code is the 5th step of the main algorithm. When the number of inliers obtained
after *RANSAC* is over the threshold, the measurements matrix is filled and the *Kalman
Filter* is updated:
@code{.cpp}
// -- Step 5: Kalman Filter
// GOOD MEASUREMENT
if( inliers_idx.rows >= minInliersKalman )
{
// Get the measured translation
cv::Mat translation_measured(3, 1, CV_64F);
translation_measured = pnp_detection.get_t_matrix();
// Get the measured rotation
cv::Mat rotation_measured(3, 3, CV_64F);
rotation_measured = pnp_detection.get_R_matrix();
// fill the measurements vector
fillMeasurements(measurements, translation_measured, rotation_measured);
}
// Instantiate estimated translation and rotation
cv::Mat translation_estimated(3, 1, CV_64F);
cv::Mat rotation_estimated(3, 3, CV_64F);
// update the Kalman filter with good measurements
updateKalmanFilter( KF, measurements,
translation_estimated, rotation_estimated);
@endcode
The following code corresponds to the *fillMeasurements()* function, which converts the measured
[rotation matrix to Euler
angles](http://euclideanspace.com/maths/geometry/rotations/conversions/matrixToEuler/index.htm)
via the *rot2euler()* helper (a sketch of such a conversion is shown after the function) and
fills the measurements matrix along with the measured translation vector:
@code{.cpp}
void fillMeasurements( cv::Mat &measurements,
const cv::Mat &translation_measured, const cv::Mat &rotation_measured)
{
// Convert rotation matrix to euler angles
cv::Mat measured_eulers(3, 1, CV_64F);
measured_eulers = rot2euler(rotation_measured);
// Set measurement to predict
measurements.at<double>(0) = translation_measured.at<double>(0); // x
measurements.at<double>(1) = translation_measured.at<double>(1); // y
measurements.at<double>(2) = translation_measured.at<double>(2); // z
measurements.at<double>(3) = measured_eulers.at<double>(0); // roll
measurements.at<double>(4) = measured_eulers.at<double>(1); // pitch
measurements.at<double>(5) = measured_eulers.at<double>(2); // yaw
}
@endcode
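The *rot2euler()* helper itself is implemented in the sample's *Utils* file and is not
reproduced in this tutorial. A minimal sketch of such a conversion, assuming a roll-pitch-yaw
(X-Y-Z) convention that may differ from the one used by the sample, could look like this:
@code{.cpp}
#include <cmath>
#include <opencv2/core.hpp>
// Sketch only: convert a 3x3 rotation matrix R = Rz(yaw)*Ry(pitch)*Rx(roll)
// into (roll, pitch, yaw); the sample's rot2euler() may use another convention.
cv::Mat rot2eulerSketch(const cv::Mat &R)
{
    cv::Mat euler(3, 1, CV_64F);
    double sy = std::sqrt(R.at<double>(0,0)*R.at<double>(0,0) + R.at<double>(1,0)*R.at<double>(1,0));
    if (sy > 1e-6) // regular case
    {
        euler.at<double>(0) = std::atan2(R.at<double>(2,1), R.at<double>(2,2)); // roll
        euler.at<double>(1) = std::atan2(-R.at<double>(2,0), sy);               // pitch
        euler.at<double>(2) = std::atan2(R.at<double>(1,0), R.at<double>(0,0)); // yaw
    }
    else // gimbal lock: pitch is close to +-90 degrees
    {
        euler.at<double>(0) = std::atan2(-R.at<double>(1,2), R.at<double>(1,1));
        euler.at<double>(1) = std::atan2(-R.at<double>(2,0), sy);
        euler.at<double>(2) = 0;
    }
    return euler;
}
@endcode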
The following code corresponds to the *updateKalmanFilter()* function, which updates the Kalman
Filter and sets the estimated rotation matrix and translation vector. The estimated rotation
matrix is obtained by converting the estimated [Euler angles to a rotation
matrix](http://euclideanspace.com/maths/geometry/rotations/conversions/eulerToMatrix/index.htm)
via *euler2rot()* (a sketch of the inverse conversion is shown after the function).
@code{.cpp}
void updateKalmanFilter( cv::KalmanFilter &KF, cv::Mat &measurement,
cv::Mat &translation_estimated, cv::Mat &rotation_estimated )
{
// First predict, to update the internal statePre variable
cv::Mat prediction = KF.predict();
// The "correct" phase that is going to use the predicted value and our measurement
cv::Mat estimated = KF.correct(measurement);
// Estimated translation
translation_estimated.at<double>(0) = estimated.at<double>(0);
translation_estimated.at<double>(1) = estimated.at<double>(1);
translation_estimated.at<double>(2) = estimated.at<double>(2);
// Estimated euler angles
cv::Mat eulers_estimated(3, 1, CV_64F);
eulers_estimated.at<double>(0) = estimated.at<double>(9);
eulers_estimated.at<double>(1) = estimated.at<double>(10);
eulers_estimated.at<double>(2) = estimated.at<double>(11);
// Convert estimated Euler angles to rotation matrix
rotation_estimated = euler2rot(eulers_estimated);
}
@endcode
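*euler2rot()* performs the inverse conversion; a minimal sketch under the same assumed X-Y-Z
convention (again, not necessarily the sample's exact implementation):
@code{.cpp}
// Sketch only: build R = Rz(yaw) * Ry(pitch) * Rx(roll) from (roll, pitch, yaw).
cv::Mat euler2rotSketch(const cv::Mat &euler)
{
    const double roll = euler.at<double>(0), pitch = euler.at<double>(1), yaw = euler.at<double>(2);
    cv::Mat Rx = (cv::Mat_<double>(3,3) << 1, 0, 0,
                                           0, std::cos(roll), -std::sin(roll),
                                           0, std::sin(roll),  std::cos(roll));
    cv::Mat Ry = (cv::Mat_<double>(3,3) <<  std::cos(pitch), 0, std::sin(pitch),
                                            0, 1, 0,
                                           -std::sin(pitch), 0, std::cos(pitch));
    cv::Mat Rz = (cv::Mat_<double>(3,3) << std::cos(yaw), -std::sin(yaw), 0,
                                           std::sin(yaw),  std::cos(yaw), 0,
                                           0, 0, 1);
    return Rz * Ry * Rx; // 3x3 rotation matrix
}
@endcode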
The 6th step is to set the estimated rotation-translation matrix:
@code{.cpp}
// -- Step 6: Set estimated projection matrix
pnp_detection_est.set_P_matrix(rotation_estimated, translation_estimated);
@endcode
The last and optional step is to draw the estimated pose. To do this, I implemented a function
to draw all the mesh 3D points and an extra reference axis:
@code{.cpp}
// -- Step X: Draw pose
drawObjectMesh(frame_vis, &mesh, &pnp_detection, green); // draw current pose
drawObjectMesh(frame_vis, &mesh, &pnp_detection_est, yellow); // draw estimated pose
double l = 5;
std::vector<cv::Point2f> pose_points2d;
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,0))); // axis center
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(l,0,0))); // axis x
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,l,0))); // axis y
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,l))); // axis z
draw3DCoordinateAxes(frame_vis, pose_points2d); // draw axes
@endcode
You can also modify the minimum number of inliers required to update the Kalman Filter:
@code{.cpp}
./cpp-tutorial-pnp_detection --inliers=20
@endcode
Results
-------
The following videos show the results of real-time pose estimation using the explained detection
algorithm with the following parameters:
@code{.cpp}
// Robust Matcher parameters
int numKeyPoints = 2000; // number of detected keypoints
float ratio = 0.70f; // ratio test
bool fast_match = true; // fastRobustMatch() or robustMatch()
// RANSAC parameters
int iterationsCount = 500; // number of RANSAC iterations.
float reprojectionError = 2.0; // maximum allowed distance to consider it an inlier.
float confidence = 0.95; // RANSAC successful confidence.
// Kalman Filter parameters
int minInliersKalman = 30; // Kalman update threshold
@endcode
You can watch the real-time pose estimation on [YouTube
here](http://www.youtube.com/user/opencvdev/videos).
@youtube{XNATklaJlSQ}
@youtube{YLS9bWek78k}

@@ -0,0 +1,8 @@
Camera calibration and 3D reconstruction (calib3d module) {#tutorial_table_of_content_calib3d}
==========================================================
- @subpage tutorial_camera_calibration_pattern
- @subpage tutorial_camera_calibration_square_chess
- @subpage tutorial_camera_calibration
- @subpage tutorial_real_time_pose
- @subpage tutorial_interactive_calibration

@@ -0,0 +1,307 @@
---
author:
- Maksym Ivashechkin
bibliography: 'bibs.bib'
csl: 'acm-sigchi-proceedings.csl'
date: August 2020
title: 'Google Summer of Code: Improvement of Random Sample Consensus in OpenCV'
...
Contribution
============
The part integrated into the OpenCV `calib3d` module is the RANSAC-based universal
framework USAC (`namespace usac`), written in C++. The framework includes
different state-of-the-art methods for sampling, verification and local
optimization. The main advantage of the framework is its independence from
any particular estimation problem and its modular structure. Therefore, new
solvers or methods can be added/removed easily. So far it includes the
following components:
1.  Sampling method:
    1.  Uniform standard RANSAC sampling proposed in \[8\], which draws a
        minimal subset independently and uniformly at random. *The default
        option in the proposed framework*.
    2.  PROSAC method \[4\], which assumes the input data points are sorted by
        quality, so sampling can start from the most promising points.
        Correspondences for this method can be sorted, e.g., by the ratio of
        descriptor distances of the best to the second-best match obtained
        from the SIFT detector. *This method is recommended because it can
        find a good model and terminate much earlier*.
    3.  NAPSAC sampling method \[10\], which takes an initial point uniformly
        at random and the rest of the points of the minimal sample from the
        neighborhood of the initial point. This method can be potentially
        useful when models are localized, for example for plane fitting.
        However, in practice it suffers from degeneracy issues and from having
        to define an optimal neighborhood size.
    4.  Progressive-NAPSAC sampler \[2\], which is similar to NAPSAC, although
        it starts from local sampling and gradually converges to global
        sampling. This method can be quite useful if local models are expected
        but the distribution of the data can be arbitrary. The implemented
        version assumes the data points to be sorted by quality, as in PROSAC.
2.  Score Method. USAC, as well as standard RANSAC, finds the model which
    minimizes the total loss. The loss can be represented by the following
    functions:
    1.  RANSAC binary 0 / 1 loss: 1 for an outlier, 0 for an inlier. *A good
        option if the goal is to find as many inliers as possible.*
    2.  MSAC truncated squared error distance of a point to the model. *The
        default option in the framework*. The model might not have as many
        inliers as with the RANSAC score, however it will be more accurate.
    3.  MAGSAC threshold-free method \[3\] to compute the score. It does,
        however, use a maximum sigma (standard deviation of noise) level to
        marginalize the residual of a point over sigma. The score of a point
        represents the likelihood of the point being an inlier. *Recommended
        option when the image noise is unknown, since the method does not
        require a threshold*. However, it is still recommended to provide at
        least an approximate threshold, because the termination criterion is
        based on the number of points whose error is less than the threshold.
        By giving a threshold of 0 the method will output a model only after
        the maximum number of iterations is reached.
    4.  LMedS the least median of squared error distances. In the framework
        the median is found efficiently, in $O(n)$ average time, using a
        quicksort-style partitioning (selection) algorithm. Note, LMedS is not
        guaranteed to work properly when the inlier ratio is less than 50%; in
        other cases this method is robust and does not require a threshold.
3.  Error metric, which describes the error distance of a point to the
    estimated model.
    1.  Re-projection distance used for affine, homography and projection
        matrices. For the homography the symmetric re-projection distance can
        also be used.
    2.  Sampson distance used for the fundamental matrix.
    3.  Symmetric geometric distance used for the essential matrix.
4.  Degeneracy:
    1.  DEGENSAC method \[7\], which for fundamental matrix estimation
        efficiently verifies and recovers a model when at least 5 points of
        the minimal sample lie on a dominant plane.
    2.  Collinearity test for affine and homography matrix estimation, which
        checks that no 3 points lie on a line. For the homography matrix,
        since the points are planar, an additional test is applied which
        checks whether the points in the minimal sample lie on the same side
        w.r.t. any line crossing any two points in the sample (reflections are
        not assumed).
    3.  Oriented epipolar constraint method \[6\] for epipolar geometry, which
        verifies that the model (fundamental or essential matrix) has its
        points visible in front of the camera.
5.  SPRT verification method \[9\], which verifies the model by evaluating it
    on randomly shuffled points using statistical properties given by the
    probability of an inlier, the relative time for estimation, the average
    number of output models, etc. It significantly speeds up the framework,
    because a bad model can be rejected very quickly without explicitly
    computing the error for every point.
6.  Local Optimization:
    1.  Locally Optimized RANSAC method \[5\], which iteratively improves the
        so-far-the-best model by non-minimal estimation. *The default option
        in the framework. This procedure is the fastest and not worse than
        other local optimization methods.*
    2.  Graph-Cut RANSAC method \[1\], which refines the so-far-the-best model
        and additionally exploits the spatial coherence of the data points.
        *This procedure is quite precise, however computationally slower.*
    3.  Sigma Consensus method \[3\], which improves the model by applying
        non-minimal weighted estimation, where the weights are computed with
        the same logic as in the MAGSAC score. This method is best used
        together with the MAGSAC score.
7.  Termination:
    1.  Standard the standard termination equation for independent and
        uniform sampling.
    2.  PROSAC termination for the PROSAC sampler.
    3.  SPRT termination for SPRT verification.
8.  Solver. The framework contains minimal and non-minimal solvers. In the
    minimal solvers standard estimation methods are applied. In the
    non-minimal solvers usually a covariance matrix is built and the model is
    found as the eigenvector corresponding to the largest eigenvalue.
    1.  Affine2D matrix.
    2.  Homography matrix for the minimal solver the RHO (Gaussian
        elimination) algorithm from OpenCV is used.
    3.  Fundamental matrix for the 7-point algorithm two null vectors are
        found using Gaussian elimination (reduction to an upper triangular
        matrix and back-substitution) instead of SVD, and a cubic polynomial
        is then solved. For the 8-point solver Gaussian elimination is used
        too.
    4.  Essential matrix 4 null vectors are found using Gaussian elimination.
        Then the solver based on the Gröbner basis described in \[11\] is
        used. The essential matrix can be computed only if LAPACK or Eigen is
        installed, as the solver requires eigendecomposition with complex
        eigenvalues.
    5.  Perspective-n-Point the minimal solver is the classical 3-point solver
        with up to 4 solutions. For RANSAC the small sample size plays a
        significant role as it requires fewer iterations; furthermore, on
        average the P3P solver returns around 1.39 estimated models. Also, in
        the new version of `solvePnPRansac(...)` with `UsacParams` there is an
        option to pass an empty intrinsic matrix `InputOutputArray
        cameraMatrix`. If the matrix is empty then the Direct Linear
        Transformation algorithm (PnP with 6 points) is used and the framework
        outputs not only the rotation and translation vectors but also the
        calibration matrix.
Also, the framework can be run in parallel. The parallelization is done in
such a way that multiple RANSACs are created and share two atomic variables,
`bool success` and `int num_hypothesis_tested`, which determine when all
RANSACs must terminate. If one RANSAC terminates successfully then all the
others terminate as well. In the end the best model is synchronized across
all threads. If the PROSAC sampler is used then the threads must share the
same sampler, since sampling is done sequentially. However, with the default
options of the framework, parallel RANSAC is not deterministic, since the
result depends on how often each thread is scheduled. The easiest way to make
it deterministic is to use the PROSAC sampler without SPRT and Local
Optimization, and not for the fundamental matrix, because these components
internally use random generators.\
\
For the NAPSAC, Progressive NAPSAC or Graph-Cut methods a neighborhood graph
is required. In the framework there are 3 options to build it:
1.  `NEIGH_FLANN_KNN` estimates the neighborhood graph using OpenCV FLANN
    K nearest neighbors. The default value for KNN is 7. The KNN method may
    work well for sampling but not for GC-RANSAC.
2.  `NEIGH_FLANN_RADIUS` similarly to the previous case, finds the neighbor
    points whose distance is less than 20 pixels.
3.  `NEIGH_GRID` tiles the points into cells using a hash table to find a
    point's neighborhood. The method is described in \[2\]. Less accurate
    than `NEIGH_FLANN_RADIUS`, although significantly faster.
Note, `NEIGH_FLANN_RADIUS` and `NEIGH_GRID` cannot be used for the PnP
solver, since the object points are 3D.\
\
New flags (a short usage example is shown below):
1.  `USAC_DEFAULT` standard LO-RANSAC.
2.  `USAC_PARALLEL` LO-RANSAC with RANSACs running in parallel.
3.  `USAC_ACCURATE` GC-RANSAC.
4.  `USAC_FAST` LO-RANSAC with a smaller number of iterations in the local
    optimization step. Uses the RANSAC score to maximize the number of
    inliers and terminate earlier.
5.  `USAC_PROSAC` PROSAC sampling. Note, the points must be sorted.
6.  `USAC_FM_8PTS` LO-RANSAC. Only valid for fundamental matrix estimation
    with the 8-point solver.
7.  `USAC_MAGSAC` MAGSAC++.
Every flag uses SPRT verification. In the end the final so-far-the-best model
is polished by a non-minimal estimation on all found inliers.\
\
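As a quick illustration (a sketch only, with placeholder point arrays), these flags can be
passed to the existing estimation functions in the same way as the classical `RANSAC`/`LMEDS`
flags:

```cpp
// Sketch: srcPoints/dstPoints and pts1/pts2 are placeholders for the user's
// matched 2D points (e.g. std::vector<cv::Point2f>).
cv::Mat maskH, maskF;
cv::Mat H = cv::findHomography(srcPoints, dstPoints, cv::USAC_MAGSAC, 3.0, maskH);
cv::Mat F = cv::findFundamentalMat(pts1, pts2, cv::USAC_FM_8PTS, 1.0, 0.999, maskF);
```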
A few other important parameters (a usage sketch with `UsacParams` follows
this list):
1.  `randomGeneratorState` since every USAC solver in OpenCV is deterministic
    (i.e., for the same points and parameters it returns the same result),
    providing a new state makes it output a different model.
2.  `loIterations` the number of iterations for the Local Optimization
    method. *The default value is 10*. By increasing `loIterations` the
    output model could be more accurate; however, the computational time may
    also increase.
3.  `loSampleSize` the maximum sample size for Local Optimization. *The
    default value is 14*. Note that increasing `loSampleSize` can increase
    the accuracy of the model as well as the computational time. However, it
    is recommended to keep the value below 100, because estimation on a small
    number of points is faster and more robust.
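A hedged sketch of how these parameters might be set through `cv::UsacParams` and passed to
`solvePnPRansac` (the exact field and enum names may vary between OpenCV versions;
`objectPoints`, `imagePoints`, `K` and `dist` are placeholders for the user's data):

```cpp
#include <opencv2/calib3d.hpp>

// Sketch only: configure USAC and run PnP with it.
cv::UsacParams params;
params.sampler              = cv::SAMPLING_PROSAC;      // assumes points sorted by quality
params.score                = cv::SCORE_METHOD_MAGSAC;  // threshold-free scoring
params.loMethod             = cv::LOCAL_OPTIM_INNER_LO; // LO-RANSAC style refinement
params.loIterations         = 10;
params.loSampleSize         = 14;
params.maxIterations        = 5000;
params.confidence           = 0.99;
params.threshold            = 1.5;                      // still used for termination
params.randomGeneratorState = 0;                        // change to obtain a different model

cv::Mat rvec, tvec, inliers;
bool ok = cv::solvePnPRansac(objectPoints, imagePoints, K, dist,
                             rvec, tvec, inliers, params);
```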
Samples:
There are three new sample files in the opencv/samples directory.
1.  `epipolar_lines.cpp` the input arguments of the `main` function are two
    paths to images. Correspondences are then found using the SIFT detector.
    The fundamental matrix is estimated with RANSAC from the tentative
    correspondences and the epipolar lines are plotted.
2.  `essential_mat_reconstr.cpp` the input arguments are the path to a data
    file containing the image names and a single intrinsic matrix, and the
    directory where these images are located. Correspondences are found using
    SIFT. The essential matrix is estimated using RANSAC and decomposed into
    rotation and translation. Then, by building two relative poses with
    projection matrices, the image points are triangulated into object
    points. By running RANSAC with 3D plane fitting, the object points as
    well as the correspondences are clustered into planes.
3.  `essential_mat_reconstr.py` the same functionality as in the .cpp file;
    however, instead of clustering the points into planes, the 3D map of the
    object points is plotted.
References:
1\. Daniel Barath and Jiří Matas. 2018. Graph-Cut RANSAC. In *Proceedings of
the IEEE conference on computer vision and pattern recognition*, 6733–6741.
2\. Daniel Barath, Maksym Ivashechkin, and Jiri Matas. 2019. Progressive
NAPSAC: Sampling from gradually growing neighborhoods. *arXiv preprint
arXiv:1906.02295*.
3\. Daniel Barath, Jana Noskova, Maksym Ivashechkin, and Jiri Matas. 2020.
MAGSAC++, a fast, reliable and accurate robust estimator. In *Proceedings of
the IEEE/CVF conference on computer vision and pattern recognition (CVPR)*.
4\. O. Chum and J. Matas. 2005. Matching with PROSAC – progressive sample
consensus. In *Computer vision and pattern recognition*.
5\. O. Chum, J. Matas, and J. Kittler. 2003. Locally optimized RANSAC. In
*Joint pattern recognition symposium*.
6\. O. Chum, T. Werner, and J. Matas. 2004. Epipolar geometry estimation via
RANSAC benefits from the oriented epipolar constraint. In *International
conference on pattern recognition*.
7\. Ondrej Chum, Tomas Werner, and Jiri Matas. 2005. Two-view geometry
estimation unaffected by a dominant plane. In *2005 IEEE computer society
conference on computer vision and pattern recognition (CVPR'05)*, 772–779.
8\. M. A. Fischler and R. C. Bolles. 1981. Random sample consensus: A
paradigm for model fitting with applications to image analysis and automated
cartography. *Communications of the ACM*.
9\. Jiri Matas and Ondrej Chum. 2005. Randomized RANSAC with sequential
probability ratio test. In *Tenth IEEE international conference on computer
vision (ICCV'05), volume 1*, 1727–1732.
10\. D. R. Myatt, P. H. S. Torr, S. J. Nasuto, J. M. Bishop, and R. Craddock.
2002. NAPSAC: High noise, high dimensional robust estimation. In *BMVC'02*,
458–467.
11\. Henrik Stewénius, Christopher Engels, and David Nistér. 2006. Recent
developments on direct relative orientation.