init - 初始化项目

This commit is contained in:
Lee Nony
2022-05-06 01:58:53 +08:00
commit 90a5cc7cb6
6772 changed files with 2837787 additions and 0 deletions

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.0 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.1 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.9 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 26 KiB

View File

@@ -0,0 +1,201 @@
Basic Operations on Images {#tutorial_py_basic_ops}
==========================
Goal
----
Learn to:
- Access pixel values and modify them
- Access image properties
- Set a Region of Interest (ROI)
- Split and merge images
Almost all the operations in this section are mainly related to Numpy rather than OpenCV. A good
knowledge of Numpy is required to write better optimized code with OpenCV.
*( Examples will be shown in a Python terminal, since most of them are just single lines of code )*
Accessing and Modifying pixel values
------------------------------------
Let's load a color image first:
@code{.py}
>>> import numpy as np
>>> import cv2 as cv
>>> img = cv.imread('messi5.jpg')
@endcode
You can access a pixel value by its row and column coordinates. For BGR image, it returns an array
of Blue, Green, Red values. For grayscale image, just corresponding intensity is returned.
@code{.py}
>>> px = img[100,100]
>>> print( px )
[157 166 200]
# accessing only blue pixel
>>> blue = img[100,100,0]
>>> print( blue )
157
@endcode
You can modify the pixel values the same way.
@code{.py}
>>> img[100,100] = [255,255,255]
>>> print( img[100,100] )
[255 255 255]
@endcode
**Warning**
Numpy is an optimized library for fast array calculations. So simply accessing each and every pixel
value and modifying it will be very slow and it is discouraged.
@note The above method is normally used for selecting a region of an array, say the first 5 rows
and last 3 columns. For individual pixel access, the Numpy array methods, array.item() and
array.itemset() are considered better. They always return a scalar, however, so if you want to access
all the B,G,R values, you will need to call array.item() separately for each value.
Better pixel accessing and editing method :
@code{.py}
# accessing RED value
>>> img.item(10,10,2)
59
# modifying RED value
>>> img.itemset((10,10,2),100)
>>> img.item(10,10,2)
100
@endcode
Accessing Image Properties
--------------------------
Image properties include number of rows, columns, and channels; type of image data; number of pixels; etc.
The shape of an image is accessed by img.shape. It returns a tuple of the number of rows, columns, and channels
(if the image is color):
@code{.py}
>>> print( img.shape )
(342, 548, 3)
@endcode
@note If an image is grayscale, the tuple returned contains only the number of rows
and columns, so it is a good method to check whether the loaded image is grayscale or color.
Total number of pixels is accessed by `img.size`:
@code{.py}
>>> print( img.size )
562248
@endcode
Image datatype is obtained by \`img.dtype\`:
@code{.py}
>>> print( img.dtype )
uint8
@endcode
@note img.dtype is very important while debugging because a large number of errors in OpenCV-Python
code are caused by invalid datatype.
Image ROI
---------
Sometimes, you will have to play with certain regions of images. For eye detection in images, first
face detection is done over the entire image. When a face is obtained, we select the face region alone
and search for eyes inside it instead of searching the whole image. It improves accuracy (because eyes
are always on faces :D ) and performance (because we search in a small area).
ROI is again obtained using Numpy indexing. Here I am selecting the ball and copying it to another
region in the image:
@code{.py}
>>> ball = img[280:340, 330:390]
>>> img[273:333, 100:160] = ball
@endcode
Check the results below:
![image](images/roi.jpg)
Splitting and Merging Image Channels
------------------------------------
Sometimes you will need to work separately on the B,G,R channels of an image. In this case, you need
to split the BGR image into single channels. In other cases, you may need to join these individual
channels to create a BGR image. You can do this simply by:
@code{.py}
>>> b,g,r = cv.split(img)
>>> img = cv.merge((b,g,r))
@endcode
Or
@code
>>> b = img[:,:,0]
@endcode
Suppose you want to set all the red pixels to zero - you do not need to split the channels first.
Numpy indexing is faster:
@code{.py}
>>> img[:,:,2] = 0
@endcode
**Warning**
cv.split() is a costly operation (in terms of time). So use it only if necessary. Otherwise go
for Numpy indexing.
Making Borders for Images (Padding)
-----------------------------------
If you want to create a border around an image, something like a photo frame, you can use
**cv.copyMakeBorder()**. But it has more applications for convolution operation, zero
padding etc. This function takes following arguments:
- **src** - input image
- **top**, **bottom**, **left**, **right** - border width in number of pixels in corresponding
directions
- **borderType** - Flag defining what kind of border to be added. It can be following types:
- **cv.BORDER_CONSTANT** - Adds a constant colored border. The value should be given
as next argument.
- **cv.BORDER_REFLECT** - Border will be mirror reflection of the border elements,
like this : *fedcba|abcdefgh|hgfedcb*
- **cv.BORDER_REFLECT_101** or **cv.BORDER_DEFAULT** - Same as above, but with a
slight change, like this : *gfedcb|abcdefgh|gfedcba*
- **cv.BORDER_REPLICATE** - Last element is replicated throughout, like this:
*aaaaaa|abcdefgh|hhhhhhh*
- **cv.BORDER_WRAP** - Can't explain, it will look like this :
*cdefgh|abcdefgh|abcdefg*
- **value** - Color of border if border type is cv.BORDER_CONSTANT
Below is a sample code demonstrating all these border types for better understanding:
@code{.py}
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
BLUE = [255,0,0]
img1 = cv.imread('opencv-logo.png')
replicate = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REPLICATE)
reflect = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT)
reflect101 = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT_101)
wrap = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_WRAP)
constant= cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_CONSTANT,value=BLUE)
plt.subplot(231),plt.imshow(img1,'gray'),plt.title('ORIGINAL')
plt.subplot(232),plt.imshow(replicate,'gray'),plt.title('REPLICATE')
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')
plt.show()
@endcode
See the result below. (Image is displayed with matplotlib. So RED and BLUE channels will be
interchanged):
![image](images/border.jpg)
Additional Resources
--------------------
Exercises
---------

Binary file not shown.

After

Width:  |  Height:  |  Size: 18 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 23 KiB

View File

@@ -0,0 +1,116 @@
Arithmetic Operations on Images {#tutorial_py_image_arithmetics}
===============================
Goal
----
- Learn several arithmetic operations on images, like addition, subtraction, bitwise operations, and etc.
- Learn these functions: **cv.add()**, **cv.addWeighted()**, etc.
Image Addition
--------------
You can add two images with the OpenCV function, cv.add(), or simply by the numpy operation
res = img1 + img2. Both images should be of same depth and type, or the second image can just be a
scalar value.
@note There is a difference between OpenCV addition and Numpy addition. OpenCV addition is a
saturated operation while Numpy addition is a modulo operation.
For example, consider the below sample:
@code{.py}
>>> x = np.uint8([250])
>>> y = np.uint8([10])
>>> print( cv.add(x,y) ) # 250+10 = 260 => 255
[[255]]
>>> print( x+y ) # 250+10 = 260 % 256 = 4
[4]
@endcode
This will be more visible when you add two images. Stick with OpenCV functions, because they will provide a better result.
Image Blending
--------------
This is also image addition, but different weights are given to images in order to give a feeling of
blending or transparency. Images are added as per the equation below:
\f[g(x) = (1 - \alpha)f_{0}(x) + \alpha f_{1}(x)\f]
By varying \f$\alpha\f$ from \f$0 \rightarrow 1\f$, you can perform a cool transition between one image to
another.
Here I took two images to blend together. The first image is given a weight of 0.7 and the second image
is given 0.3. cv.addWeighted() applies the following equation to the image:
\f[dst = \alpha \cdot img1 + \beta \cdot img2 + \gamma\f]
Here \f$\gamma\f$ is taken as zero.
@code{.py}
img1 = cv.imread('ml.png')
img2 = cv.imread('opencv-logo.png')
dst = cv.addWeighted(img1,0.7,img2,0.3,0)
cv.imshow('dst',dst)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
Check the result below:
![image](images/blending.jpg)
Bitwise Operations
------------------
This includes the bitwise AND, OR, NOT, and XOR operations. They will be highly useful while extracting
any part of the image (as we will see in coming chapters), defining and working with non-rectangular
ROI's, and etc. Below we will see an example of how to change a particular region of an image.
I want to put the OpenCV logo above an image. If I add two images, it will change the color. If I blend them,
I get a transparent effect. But I want it to be opaque. If it was a rectangular region, I could use
ROI as we did in the last chapter. But the OpenCV logo is a not a rectangular shape. So you can do it with
bitwise operations as shown below:
@code{.py}
# Load two images
img1 = cv.imread('messi5.jpg')
img2 = cv.imread('opencv-logo-white.png')
# I want to put logo on top-left corner, So I create a ROI
rows,cols,channels = img2.shape
roi = img1[0:rows, 0:cols]
# Now create a mask of logo and create its inverse mask also
img2gray = cv.cvtColor(img2,cv.COLOR_BGR2GRAY)
ret, mask = cv.threshold(img2gray, 10, 255, cv.THRESH_BINARY)
mask_inv = cv.bitwise_not(mask)
# Now black-out the area of logo in ROI
img1_bg = cv.bitwise_and(roi,roi,mask = mask_inv)
# Take only region of logo from logo image.
img2_fg = cv.bitwise_and(img2,img2,mask = mask)
# Put logo in ROI and modify the main image
dst = cv.add(img1_bg,img2_fg)
img1[0:rows, 0:cols ] = dst
cv.imshow('res',img1)
cv.waitKey(0)
cv.destroyAllWindows()
@endcode
See the result below. Left image shows the mask we created. Right image shows the final result. For
more understanding, display all the intermediate images in the above code, especially img1_bg and
img2_fg.
![image](images/overlay.jpg)
Additional Resources
--------------------
Exercises
---------
-# Create a slide show of images in a folder with smooth transition between images using
cv.addWeighted function

View File

@@ -0,0 +1,167 @@
Performance Measurement and Improvement Techniques {#tutorial_py_optimization}
==================================================
Goal
----
In image processing, since you are dealing with a large number of operations per second, it is mandatory that your code is not only providing the correct solution, but that it is also providing it in the fastest manner.
So in this chapter, you will learn:
- To measure the performance of your code.
- Some tips to improve the performance of your code.
- You will see these functions: **cv.getTickCount**, **cv.getTickFrequency**, etc.
Apart from OpenCV, Python also provides a module **time** which is helpful in measuring the time of
execution. Another module **profile** helps to get a detailed report on the code, like how much time
each function in the code took, how many times the function was called, etc. But, if you are using
IPython, all these features are integrated in an user-friendly manner. We will see some important
ones, and for more details, check links in the **Additional Resources** section.
Measuring Performance with OpenCV
---------------------------------
The **cv.getTickCount** function returns the number of clock-cycles after a reference event (like the
moment the machine was switched ON) to the moment this function is called. So if you call it before and
after the function execution, you get the number of clock-cycles used to execute a function.
The **cv.getTickFrequency** function returns the frequency of clock-cycles, or the number of
clock-cycles per second. So to find the time of execution in seconds, you can do following:
@code{.py}
e1 = cv.getTickCount()
# your code execution
e2 = cv.getTickCount()
time = (e2 - e1)/ cv.getTickFrequency()
@endcode
We will demonstrate with following example. The following example applies median filtering with kernels
of odd sizes ranging from 5 to 49. (Don't worry about what the result will look like - that is not our
goal):
@code{.py}
img1 = cv.imread('messi5.jpg')
e1 = cv.getTickCount()
for i in range(5,49,2):
img1 = cv.medianBlur(img1,i)
e2 = cv.getTickCount()
t = (e2 - e1)/cv.getTickFrequency()
print( t )
# Result I got is 0.521107655 seconds
@endcode
@note You can do the same thing with the time module. Instead of cv.getTickCount, use the time.time() function.
Then take the difference of the two times.
Default Optimization in OpenCV
------------------------------
Many of the OpenCV functions are optimized using SSE2, AVX, etc. It contains the unoptimized code also.
So if our system support these features, we should exploit them (almost all modern day processors
support them). It is enabled by default while compiling. So OpenCV runs the optimized code if it is
enabled, otherwise it runs the unoptimized code. You can use **cv.useOptimized()** to check if it is
enabled/disabled and **cv.setUseOptimized()** to enable/disable it. Let's see a simple example.
@code{.py}
# check if optimization is enabled
In [5]: cv.useOptimized()
Out[5]: True
In [6]: %timeit res = cv.medianBlur(img,49)
10 loops, best of 3: 34.9 ms per loop
# Disable it
In [7]: cv.setUseOptimized(False)
In [8]: cv.useOptimized()
Out[8]: False
In [9]: %timeit res = cv.medianBlur(img,49)
10 loops, best of 3: 64.1 ms per loop
@endcode
As you can see, optimized median filtering is \~2x faster than the unoptimized version. If you check its source,
you can see that median filtering is SIMD optimized. So you can use this to enable optimization at the
top of your code (remember it is enabled by default).
Measuring Performance in IPython
--------------------------------
Sometimes you may need to compare the performance of two similar operations. IPython gives you a
magic command %timeit to perform this. It runs the code several times to get more accurate results.
Once again, it is suitable to measuring single lines of code.
For example, do you know which of the following addition operations is better, x = 5; y = x\*\*2,
x = 5; y = x\*x, x = np.uint8([5]); y = x\*x, or y = np.square(x)? We will find out with %timeit in the
IPython shell.
@code{.py}
In [10]: x = 5
In [11]: %timeit y=x**2
10000000 loops, best of 3: 73 ns per loop
In [12]: %timeit y=x*x
10000000 loops, best of 3: 58.3 ns per loop
In [15]: z = np.uint8([5])
In [17]: %timeit y=z*z
1000000 loops, best of 3: 1.25 us per loop
In [19]: %timeit y=np.square(z)
1000000 loops, best of 3: 1.16 us per loop
@endcode
You can see that, x = 5 ; y = x\*x is fastest and it is around 20x faster compared to Numpy. If you
consider the array creation also, it may reach up to 100x faster. Cool, right? *(Numpy devs are
working on this issue)*
@note Python scalar operations are faster than Numpy scalar operations. So for operations including
one or two elements, Python scalar is better than Numpy arrays. Numpy has the advantage when the size of
the array is a little bit bigger.
We will try one more example. This time, we will compare the performance of **cv.countNonZero()**
and **np.count_nonzero()** for the same image.
@code{.py}
In [35]: %timeit z = cv.countNonZero(img)
100000 loops, best of 3: 15.8 us per loop
In [36]: %timeit z = np.count_nonzero(img)
1000 loops, best of 3: 370 us per loop
@endcode
See, the OpenCV function is nearly 25x faster than the Numpy function.
@note Normally, OpenCV functions are faster than Numpy functions. So for same operation, OpenCV
functions are preferred. But, there can be exceptions, especially when Numpy works with views
instead of copies.
More IPython magic commands
---------------------------
There are several other magic commands to measure performance, profiling, line profiling, memory
measurement, and etc. They all are well documented. So only links to those docs are provided here.
Interested readers are recommended to try them out.
Performance Optimization Techniques
-----------------------------------
There are several techniques and coding methods to exploit maximum performance of Python and Numpy.
Only relevant ones are noted here and links are given to important sources. The main thing to be
noted here is, first try to implement the algorithm in a simple manner. Once it is working,
profile it, find the bottlenecks, and optimize them.
-# Avoid using loops in Python as much as possible, especially double/triple loops etc. They are
inherently slow.
2. Vectorize the algorithm/code to the maximum extent possible, because Numpy and OpenCV are
optimized for vector operations.
3. Exploit the cache coherence.
4. Never make copies of an array unless it is necessary. Try to use views instead. Array copying is a
costly operation.
If your code is still slow after doing all of these operations, or if the use of large loops is inevitable, use additional libraries like Cython to make it faster.
Additional Resources
--------------------
-# [Python Optimization Techniques](http://wiki.python.org/moin/PythonSpeed/PerformanceTips)
2. Scipy Lecture Notes - [Advanced
Numpy](http://scipy-lectures.github.io/advanced/advanced_numpy/index.html#advanced-numpy)
3. [Timing and Profiling in IPython](http://pynash.org/2013/03/06/timing-and-profiling/)
Exercises
---------

View File

@@ -0,0 +1,18 @@
Core Operations {#tutorial_py_table_of_contents_core}
===============
- @subpage tutorial_py_basic_ops
Learn to read and
edit pixel values, working with image ROI and other basic operations.
- @subpage tutorial_py_image_arithmetics
Perform arithmetic
operations on images
- @subpage tutorial_py_optimization
Getting a solution is
important. But getting it in the fastest way is more important. Learn to check the speed of your
code, optimize the code etc.