init - initialize project
BIN doc/py_tutorials/py_core/images/image_arithmetic.jpg (Normal file, 2.0 KiB)
BIN doc/py_tutorials/py_core/images/maths_tools.jpg (Normal file, 3.1 KiB)
BIN doc/py_tutorials/py_core/images/pixel_ops.jpg (Normal file, 4.1 KiB)
BIN doc/py_tutorials/py_core/images/speed.jpg (Normal file, 2.9 KiB)
BIN doc/py_tutorials/py_core/py_basic_ops/images/border.jpg (Normal file, 44 KiB)
BIN doc/py_tutorials/py_core/py_basic_ops/images/roi.jpg (Normal file, 26 KiB)
201
doc/py_tutorials/py_core/py_basic_ops/py_basic_ops.markdown
Normal file
@@ -0,0 +1,201 @@
|
||||
Basic Operations on Images {#tutorial_py_basic_ops}
|
||||
==========================
|
||||
|
||||
Goal
|
||||
----
|
||||
|
||||
Learn to:
|
||||
|
||||
- Access pixel values and modify them
|
||||
- Access image properties
|
||||
- Set a Region of Interest (ROI)
|
||||
- Split and merge images
|
||||
|
||||
Almost all the operations in this section rely on Numpy rather than OpenCV. A good
knowledge of Numpy is required to write well-optimized code with OpenCV.
|
||||
|
||||
*( Examples will be shown in a Python terminal, since most of them are just single lines of code )*
|
||||
|
||||
Accessing and Modifying pixel values
|
||||
------------------------------------
|
||||
|
||||
Let's load a color image first:
|
||||
@code{.py}
|
||||
>>> import numpy as np
|
||||
>>> import cv2 as cv
|
||||
|
||||
>>> img = cv.imread('messi5.jpg')
|
||||
@endcode
|
||||
You can access a pixel value by its row and column coordinates. For a BGR image, it returns an array
of Blue, Green, Red values. For a grayscale image, just the corresponding intensity is returned.
|
||||
@code{.py}
|
||||
>>> px = img[100,100]
|
||||
>>> print( px )
|
||||
[157 166 200]
|
||||
|
||||
# accessing only the blue channel value
|
||||
>>> blue = img[100,100,0]
|
||||
>>> print( blue )
|
||||
157
|
||||
@endcode
|
||||
You can modify the pixel values the same way.
|
||||
@code{.py}
|
||||
>>> img[100,100] = [255,255,255]
|
||||
>>> print( img[100,100] )
|
||||
[255 255 255]
|
||||
@endcode
|
||||
|
||||
**Warning**
|
||||
|
||||
Numpy is optimized for fast whole-array calculations, so accessing and modifying pixel values
one at a time is very slow and is discouraged.
|
||||
|
||||
@note The above method is normally used for selecting a region of an array, say the first 5 rows
and last 3 columns. For individual pixel access, the Numpy array methods array.item() and
array.itemset() are considered better. array.item() always returns a scalar, however, so if you want
to access all the B,G,R values, you will need to call array.item() separately for each value.
Be aware that array.itemset() was removed in NumPy 2.0; with newer versions, write single values
through index assignment instead.
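The region selection mentioned in the note can be sketched as follows. This is a minimal illustration that assumes a small synthetic array in place of the loaded image; the shape and values are made up for the example.

```python
import numpy as np

# Synthetic 8x10 BGR image; an array returned by cv.imread() is indexed the same way.
img = np.zeros((8, 10, 3), dtype=np.uint8)

# First 5 rows and last 3 columns, all channels.
region = img[:5, -3:]
print(region.shape)  # (5, 3, 3)

# Assigning to the slice modifies every pixel in the region at once.
img[:5, -3:] = [255, 255, 255]
changed = int(np.count_nonzero(img.any(axis=2)))  # pixels that are no longer black
print(changed)  # 15
```

Slice assignment like this runs inside Numpy, which is why it is preferred over visiting pixels one by one.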
|
||||
|
||||
A better method for accessing and editing individual pixel values:
|
||||
@code{.py}
|
||||
# accessing RED value
|
||||
>>> img.item(10,10,2)
|
||||
59
|
||||
|
||||
# modifying RED value
|
||||
>>> img.itemset((10,10,2),100)
|
||||
>>> img.item(10,10,2)
|
||||
100
|
||||
@endcode
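A minimal sketch of the same read and modify cycle, using index assignment for the write so that it works on NumPy versions with or without array.itemset(). The synthetic array is an assumption standing in for the loaded image.

```python
import numpy as np

img = np.zeros((20, 20, 3), dtype=np.uint8)  # synthetic stand-in for a loaded image

# Reading a single value: item() returns a plain Python int.
red = img.item(10, 10, 2)

# Writing a single value with index assignment.
img[10, 10, 2] = 100
new_red = img.item(10, 10, 2)

print(red, new_red)  # 0 100
```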
|
||||
|
||||
Accessing Image Properties
|
||||
--------------------------
|
||||
|
||||
Image properties include number of rows, columns, and channels; type of image data; number of pixels; etc.
|
||||
|
||||
The shape of an image is accessed by img.shape. It returns a tuple of the number of rows, columns, and channels
|
||||
(if the image is color):
|
||||
@code{.py}
|
||||
>>> print( img.shape )
|
||||
(342, 548, 3)
|
||||
@endcode
|
||||
|
||||
@note If an image is grayscale, the tuple returned contains only the number of rows
|
||||
and columns, so it is a good method to check whether the loaded image is grayscale or color.
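The grayscale check described in the note could look like the sketch below. The helper name `is_grayscale` is hypothetical, and the arrays are synthetic stand-ins with the same shape as the example image.

```python
import numpy as np

def is_grayscale(image):
    """Return True when the array has no channel axis (a 2-D array)."""
    return image.ndim == 2

gray = np.zeros((342, 548), dtype=np.uint8)
color = np.zeros((342, 548, 3), dtype=np.uint8)

print(is_grayscale(gray), is_grayscale(color))  # True False

# shape[:2] gives rows and columns for both kinds of image.
rows, cols = color.shape[:2]
print(rows, cols)  # 342 548
```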
|
||||
|
||||
The total number of array elements (rows * columns * channels) is accessed by `img.size`:
|
||||
@code{.py}
|
||||
>>> print( img.size )
|
||||
562248
|
||||
@endcode
|
||||
Image datatype is obtained by `img.dtype`:
|
||||
@code{.py}
|
||||
>>> print( img.dtype )
|
||||
uint8
|
||||
@endcode
|
||||
|
||||
@note img.dtype is very important while debugging because a large number of errors in OpenCV-Python
code are caused by an invalid datatype.
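One common datatype pitfall, shown as a small sketch with made-up values: arithmetic on uint8 arrays wraps around silently, so checking the dtype before subtracting is worthwhile.

```python
import numpy as np

a = np.array([[10, 200]], dtype=np.uint8)
b = np.array([[20, 100]], dtype=np.uint8)

# uint8 arithmetic wraps around: 10 - 20 becomes 246 instead of -10.
wrapped = a - b

# Converting to a wider signed type first keeps the true differences.
diff = a.astype(np.int16) - b.astype(np.int16)

print(wrapped.dtype, wrapped.tolist())  # uint8 [[246, 100]]
print(diff.dtype, diff.tolist())        # int16 [[-10, 100]]
```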
|
||||
|
||||
Image ROI
|
||||
---------
|
||||
|
||||
Sometimes, you will have to play with certain regions of images. For eye detection in images, first
|
||||
face detection is done over the entire image. When a face is obtained, we select the face region alone
|
||||
and search for eyes inside it instead of searching the whole image. It improves accuracy (because eyes
|
||||
are always on faces :D ) and performance (because we search in a small area).
|
||||
|
||||
ROI is again obtained using Numpy indexing. Here I am selecting the ball and copying it to another
|
||||
region in the image:
|
||||
@code{.py}
|
||||
>>> ball = img[280:340, 330:390]
|
||||
>>> img[273:333, 100:160] = ball
|
||||
@endcode
|
||||
Check the results below:
|
||||
|
||||

|
||||
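One detail worth knowing when working with ROIs: Numpy slicing returns a view, not a copy, so writing into the ROI also changes the original image. The sketch below uses a small synthetic grayscale array (an assumption for illustration) to show the difference.

```python
import numpy as np

img = np.zeros((6, 6), dtype=np.uint8)  # synthetic grayscale image

roi_view = img[1:3, 1:3]          # slicing returns a view into img
roi_copy = img[1:3, 1:3].copy()   # an independent copy

roi_view[:] = 255                 # also changes img
roi_copy[:] = 7                   # leaves img untouched

print(img[1, 1], np.shares_memory(img, roi_view), np.shares_memory(img, roi_copy))
# 255 True False
```

Call `.copy()` on the slice whenever the ROI must stay unchanged while the source image is edited.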
|
||||
Splitting and Merging Image Channels
|
||||
------------------------------------
|
||||
|
||||
Sometimes you will need to work separately on the B,G,R channels of an image. In this case, you need
|
||||
to split the BGR image into single channels. In other cases, you may need to join these individual
|
||||
channels to create a BGR image. You can do this simply by:
|
||||
@code{.py}
|
||||
>>> b,g,r = cv.split(img)
|
||||
>>> img = cv.merge((b,g,r))
|
||||
@endcode
|
||||
Or
|
||||
@code{.py}
|
||||
>>> b = img[:,:,0]
|
||||
@endcode
|
||||
Suppose you want to set all the red pixels to zero - you do not need to split the channels first.
|
||||
Numpy indexing is faster:
|
||||
@code{.py}
|
||||
>>> img[:,:,2] = 0
|
||||
@endcode
|
||||
|
||||
**Warning**
|
||||
|
||||
cv.split() is a costly operation (in terms of time). So use it only if necessary. Otherwise go
|
||||
for Numpy indexing.
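A sketch of splitting and merging with Numpy alone, assuming a small random array in place of a real image. Using `np.dstack` for the merge is a choice made for this example, not something the tutorial prescribes.

```python
import numpy as np

img = np.random.default_rng(0).integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Channel access by indexing: each result is a view, no pixel data is copied.
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]

# Re-stack the channels along the last axis (this does allocate a new array).
merged = np.dstack((b, g, r))

print(np.shares_memory(img, b), np.array_equal(merged, img))  # True True
```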
|
||||
|
||||
Making Borders for Images (Padding)
|
||||
-----------------------------------
|
||||
|
||||
If you want to create a border around an image, something like a photo frame, you can use
**cv.copyMakeBorder()**. It is also useful for convolution operations, zero padding, etc.
This function takes the following arguments:
|
||||
|
||||
- **src** - input image
- **top**, **bottom**, **left**, **right** - border width in number of pixels in the corresponding
  directions
- **borderType** - Flag defining what kind of border is added. It can be one of the following types:
    - **cv.BORDER_CONSTANT** - Adds a constant colored border. The value should be given
      as the next argument.
    - **cv.BORDER_REFLECT** - Border will be a mirror reflection of the border elements,
      like this : *fedcba|abcdefgh|hgfedcb*
    - **cv.BORDER_REFLECT_101** or **cv.BORDER_DEFAULT** - Same as above, but with a
      slight change, like this : *gfedcb|abcdefgh|gfedcba*
    - **cv.BORDER_REPLICATE** - Last element is replicated throughout, like this:
      *aaaaaa|abcdefgh|hhhhhhh*
    - **cv.BORDER_WRAP** - The image is wrapped around, so the border is taken from the
      opposite edge, like this : *cdefgh|abcdefgh|abcdefg*
- **value** - Color of border if border type is cv.BORDER_CONSTANT
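The letter patterns above can be reproduced on a single row of values. The sketch below uses `np.pad` rather than cv.copyMakeBorder(), on the assumption that the listed `np.pad` modes follow the same extrapolation patterns; it is meant only to make the patterns concrete, with a border width of 3.

```python
import numpy as np

letters = "abcdefgh"
row = np.arange(len(letters))  # indices 0..7 stand in for pixels a..h

def as_text(values):
    """Map index values back to letters for display."""
    return "".join(letters[v] for v in values)

# OpenCV border type -> np.pad mode that produces the same pattern.
modes = {
    "BORDER_REPLICATE": "edge",
    "BORDER_REFLECT": "symmetric",
    "BORDER_REFLECT_101": "reflect",
    "BORDER_WRAP": "wrap",
}

patterns = {}
for name, mode in modes.items():
    padded = np.pad(row, 3, mode=mode)
    patterns[name] = f"{as_text(padded[:3])}|{as_text(padded[3:-3])}|{as_text(padded[-3:])}"
    print(f"{name:20s} {patterns[name]}")
```

With a border of 3, BORDER_REFLECT gives `cba|abcdefgh|hgf` while BORDER_REFLECT_101 gives `dcb|abcdefgh|gfe`: the second does not repeat the edge pixel.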
|
||||
|
||||
Below is sample code demonstrating all these border types:
|
||||
@code{.py}
|
||||
import cv2 as cv
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
BLUE = [255,0,0]
|
||||
|
||||
img1 = cv.imread('opencv-logo.png')
|
||||
|
||||
replicate = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REPLICATE)
|
||||
reflect = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT)
|
||||
reflect101 = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT_101)
|
||||
wrap = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_WRAP)
|
||||
constant= cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_CONSTANT,value=BLUE)
|
||||
|
||||
plt.subplot(231),plt.imshow(img1,'gray'),plt.title('ORIGINAL')
|
||||
plt.subplot(232),plt.imshow(replicate,'gray'),plt.title('REPLICATE')
|
||||
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
|
||||
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
|
||||
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
|
||||
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')
|
||||
|
||||
plt.show()
|
||||
@endcode
|
||||
See the result below. (The image is displayed with matplotlib, which expects RGB order, so the RED
and BLUE channels appear interchanged):
|
||||
|
||||

|
||||
|
||||
Additional Resources
|
||||
--------------------
|
||||
|
||||
Exercises
|
||||
---------
|
||||
|
After Width: | Height: | Size: 18 KiB |
BIN
doc/py_tutorials/py_core/py_image_arithmetics/images/overlay.jpg
Normal file
|
After Width: | Height: | Size: 23 KiB |
@@ -0,0 +1,116 @@
|
||||
Arithmetic Operations on Images {#tutorial_py_image_arithmetics}
|
||||
===============================
|
||||
|
||||
Goal
|
||||
----
|
||||
|
||||
- Learn several arithmetic operations on images, like addition, subtraction, bitwise operations, etc.
|
||||
- Learn these functions: **cv.add()**, **cv.addWeighted()**, etc.
|
||||
|
||||
Image Addition
|
||||
--------------
|
||||
|
||||
You can add two images with the OpenCV function cv.add(), or simply by the Numpy operation
res = img1 + img2. Both images should be of the same depth and type, or the second image can just be a
scalar value.
|
||||
|
||||
@note There is a difference between OpenCV addition and Numpy addition. OpenCV addition is a
|
||||
saturated operation while Numpy addition is a modulo operation.
|
||||
|
||||
For example, consider the below sample:
|
||||
@code{.py}
|
||||
>>> x = np.uint8([250])
|
||||
>>> y = np.uint8([10])
|
||||
|
||||
>>> print( cv.add(x,y) ) # 250+10 = 260 => 255
|
||||
[[255]]
|
||||
|
||||
>>> print( x+y ) # 250+10 = 260 % 256 = 4
|
||||
[4]
|
||||
@endcode
|
||||
The difference is more visible when you add two images: saturation keeps bright areas white, while modulo addition wraps them around to dark values. Stick with the OpenCV functions, because they provide a better result.
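To make the two behaviours concrete without calling OpenCV, the sketch below emulates saturated addition in Numpy by adding in a wider type and clipping. This is an illustration of the idea, not how cv.add() is implemented.

```python
import numpy as np

x = np.array([250, 100], dtype=np.uint8)
y = np.array([10, 100], dtype=np.uint8)

# Plain uint8 addition wraps modulo 256.
modulo_sum = x + y

# Saturated addition: add in a wider type, clip to [0, 255], convert back.
saturated_sum = np.clip(x.astype(np.int16) + y.astype(np.int16), 0, 255).astype(np.uint8)

print(modulo_sum.tolist())     # [4, 200]
print(saturated_sum.tolist())  # [255, 200]
```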
|
||||
|
||||
Image Blending
|
||||
--------------
|
||||
|
||||
This is also image addition, but different weights are given to images in order to give a feeling of
|
||||
blending or transparency. Images are added as per the equation below:
|
||||
|
||||
\f[g(x) = (1 - \alpha)f_{0}(x) + \alpha f_{1}(x)\f]
|
||||
|
||||
By varying \f$\alpha\f$ from \f$0 \rightarrow 1\f$, you can perform a cool transition from one image to
another.
|
||||
|
||||
Here I took two images to blend together. The first image is given a weight of 0.7 and the second image
is given 0.3. cv.addWeighted() applies the following equation to the images:
|
||||
|
||||
\f[dst = \alpha \cdot img1 + \beta \cdot img2 + \gamma\f]
|
||||
|
||||
Here \f$\gamma\f$ is taken as zero.
|
||||
@code{.py}
|
||||
img1 = cv.imread('ml.png')
|
||||
img2 = cv.imread('opencv-logo.png')
|
||||
|
||||
dst = cv.addWeighted(img1,0.7,img2,0.3,0)
|
||||
|
||||
cv.imshow('dst',dst)
|
||||
cv.waitKey(0)
|
||||
cv.destroyAllWindows()
|
||||
@endcode
|
||||
Check the result below:
|
||||
|
||||

|
||||
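The blending equation can also be written directly in Numpy, which helps when experimenting with the transition effect. In this sketch the `blend` helper and the constant-valued test images are hypothetical; the rounding and clipping are assumptions made to keep the result in the uint8 range.

```python
import numpy as np

def blend(img1, img2, alpha, gamma=0.0):
    """Weighted sum (1 - alpha) * img1 + alpha * img2 + gamma, saturated to uint8."""
    mixed = (1.0 - alpha) * img1.astype(np.float64) + alpha * img2.astype(np.float64) + gamma
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

first = np.full((2, 2, 3), 200, dtype=np.uint8)
second = np.full((2, 2, 3), 100, dtype=np.uint8)

# Sweep alpha from 0 to 1 to get the frames of a simple cross-fade.
frames = [blend(first, second, a) for a in np.linspace(0.0, 1.0, 5)]
print([int(f[0, 0, 0]) for f in frames])  # [200, 175, 150, 125, 100]
```

The weights 0.7 and 0.3 used in the tutorial correspond to `alpha = 0.3` in this form.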
|
||||
Bitwise Operations
|
||||
------------------
|
||||
|
||||
This includes the bitwise AND, OR, NOT, and XOR operations. They will be highly useful while extracting
any part of the image (as we will see in coming chapters), defining and working with non-rectangular
ROIs, etc. Below we will see an example of how to change a particular region of an image.
|
||||
|
||||
I want to put the OpenCV logo on top of an image. If I add two images, it will change the color. If I blend them,
I get a transparent effect. But I want it to be opaque. If it were a rectangular region, I could use
ROI as we did in the last chapter. But the OpenCV logo is not a rectangular shape. So you can do it with
bitwise operations as shown below:
|
||||
@code{.py}
|
||||
# Load two images
|
||||
img1 = cv.imread('messi5.jpg')
|
||||
img2 = cv.imread('opencv-logo-white.png')
|
||||
|
||||
# I want to put the logo in the top-left corner, so I create a ROI
|
||||
rows,cols,channels = img2.shape
|
||||
roi = img1[0:rows, 0:cols]
|
||||
|
||||
# Now create a mask of logo and create its inverse mask also
|
||||
img2gray = cv.cvtColor(img2,cv.COLOR_BGR2GRAY)
|
||||
ret, mask = cv.threshold(img2gray, 10, 255, cv.THRESH_BINARY)
|
||||
mask_inv = cv.bitwise_not(mask)
|
||||
|
||||
# Now black-out the area of logo in ROI
|
||||
img1_bg = cv.bitwise_and(roi,roi,mask = mask_inv)
|
||||
|
||||
# Take only region of logo from logo image.
|
||||
img2_fg = cv.bitwise_and(img2,img2,mask = mask)
|
||||
|
||||
# Put logo in ROI and modify the main image
|
||||
dst = cv.add(img1_bg,img2_fg)
|
||||
img1[0:rows, 0:cols ] = dst
|
||||
|
||||
cv.imshow('res',img1)
|
||||
cv.waitKey(0)
|
||||
cv.destroyAllWindows()
|
||||
@endcode
|
||||
See the result below. Left image shows the mask we created. Right image shows the final result. For
|
||||
more understanding, display all the intermediate images in the above code, especially img1_bg and
|
||||
img2_fg.
|
||||
|
||||

|
||||
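The same masking idea can be sketched with Numpy alone, which makes the intermediate steps easy to inspect. Everything here is an assumption for illustration: the tiny synthetic background and logo, the standard B, G, R luminance weights used for the grayscale step, and `np.where` in place of the bitwise calls.

```python
import numpy as np

# Synthetic data: a gray 4x4 background ROI and a logo on a black background.
roi = np.full((4, 4, 3), 50, dtype=np.uint8)
logo = np.zeros((4, 4, 3), dtype=np.uint8)
logo[1:3, 1:3] = [0, 0, 255]  # a red 2x2 "logo" in BGR order

# Grayscale conversion followed by a threshold of 10, as in the code above.
gray = logo @ np.array([0.114, 0.587, 0.299])  # B, G, R weights
mask = gray > 10

# Keep the logo where the mask is set, the background elsewhere.
result = np.where(mask[:, :, None], logo, roi)

print(int(mask.sum()), result[0, 0].tolist(), result[1, 1].tolist())
# 4 [50, 50, 50] [0, 0, 255]
```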
|
||||
Additional Resources
|
||||
--------------------
|
||||
|
||||
Exercises
|
||||
---------
|
||||
|
||||
-# Create a slide show of images in a folder with a smooth transition between images using
the cv.addWeighted function
|
||||
@@ -0,0 +1,167 @@
|
||||
Performance Measurement and Improvement Techniques {#tutorial_py_optimization}
|
||||
==================================================
|
||||
|
||||
Goal
|
||||
----
|
||||
|
||||
In image processing, since you are dealing with a large number of operations per second, your code must not only produce the correct result but also produce it as fast as possible.
|
||||
So in this chapter, you will learn:
|
||||
|
||||
- To measure the performance of your code.
|
||||
- Some tips to improve the performance of your code.
|
||||
- You will see these functions: **cv.getTickCount**, **cv.getTickFrequency**, etc.
|
||||
|
||||
Apart from OpenCV, Python also provides a module **time** which is helpful in measuring the time of
|
||||
execution. Another module **profile** helps to get a detailed report on the code, like how much time
|
||||
each function in the code took, how many times the function was called, etc. But, if you are using
|
||||
IPython, all these features are integrated in a user-friendly manner. We will see some important
|
||||
ones, and for more details, check links in the **Additional Resources** section.
|
||||
|
||||
Measuring Performance with OpenCV
|
||||
---------------------------------
|
||||
|
||||
The **cv.getTickCount** function returns the number of clock-cycles after a reference event (like the
|
||||
moment the machine was switched ON) to the moment this function is called. So if you call it before and
|
||||
after the function execution, you get the number of clock-cycles used to execute a function.
|
||||
|
||||
The **cv.getTickFrequency** function returns the frequency of clock-cycles, or the number of
|
||||
clock-cycles per second. So to find the time of execution in seconds, you can do the following:
|
||||
@code{.py}
|
||||
e1 = cv.getTickCount()
|
||||
# your code execution
|
||||
e2 = cv.getTickCount()
|
||||
t = (e2 - e1) / cv.getTickFrequency()  # elapsed time in seconds
|
||||
@endcode
|
||||
The following example applies median filtering with kernels of odd sizes ranging from 5 to 47
(range(5,49,2) stops before 49). (Don't worry about what the result will look like - that is not our
goal):
|
||||
@code{.py}
|
||||
img1 = cv.imread('messi5.jpg')
|
||||
|
||||
e1 = cv.getTickCount()
|
||||
for i in range(5,49,2):
    img1 = cv.medianBlur(img1,i)
|
||||
e2 = cv.getTickCount()
|
||||
t = (e2 - e1)/cv.getTickFrequency()
|
||||
print( t )
|
||||
|
||||
# Result I got is 0.521107655 seconds
|
||||
@endcode
|
||||
@note You can do the same thing with the time module. Instead of cv.getTickCount, use the time.time() function.
|
||||
Then take the difference of the two times.
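A minimal sketch of timing with the standard library. It uses `time.perf_counter()` instead of `time.time()`, since that clock is intended for measuring short intervals; `busy_work` is a hypothetical placeholder for the code being measured.

```python
import time

def busy_work(n):
    """Placeholder workload: sum of squares computed in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

start = time.perf_counter()
result = busy_work(100_000)
elapsed = time.perf_counter() - start

print(f"result={result}, elapsed={elapsed:.6f} s")
```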
|
||||
|
||||
Default Optimization in OpenCV
|
||||
------------------------------
|
||||
|
||||
Many of the OpenCV functions are optimized using SSE2, AVX, etc. OpenCV also contains the unoptimized code.
So if our system supports these features, we should exploit them (almost all modern processors
support them). Optimization is enabled by default while compiling. So OpenCV runs the optimized code if it is
enabled, otherwise it runs the unoptimized code. You can use **cv.useOptimized()** to check if it is
enabled/disabled and **cv.setUseOptimized()** to enable/disable it. Let's see a simple example.
|
||||
@code{.py}
|
||||
# check if optimization is enabled
|
||||
In [5]: cv.useOptimized()
|
||||
Out[5]: True
|
||||
|
||||
In [6]: %timeit res = cv.medianBlur(img,49)
|
||||
10 loops, best of 3: 34.9 ms per loop
|
||||
|
||||
# Disable it
|
||||
In [7]: cv.setUseOptimized(False)
|
||||
|
||||
In [8]: cv.useOptimized()
|
||||
Out[8]: False
|
||||
|
||||
In [9]: %timeit res = cv.medianBlur(img,49)
|
||||
10 loops, best of 3: 64.1 ms per loop
|
||||
@endcode
|
||||
As you can see, optimized median filtering is \~2x faster than the unoptimized version. If you check its source,
you can see that median filtering is SIMD optimized. So you can call **cv.setUseOptimized(True)** at the
top of your code to make sure optimization is on (remember it is enabled by default).
|
||||
|
||||
Measuring Performance in IPython
|
||||
--------------------------------
|
||||
|
||||
Sometimes you may need to compare the performance of two similar operations. IPython gives you a
|
||||
magic command %timeit to perform this. It runs the code several times to get more accurate results.
|
||||
It is best suited to measuring single lines of code.
|
||||
|
||||
For example, do you know which of the following squaring operations is faster: x = 5; y = x\*\*2,
x = 5; y = x\*x, x = np.uint8([5]); y = x\*x, or y = np.square(x)? We will find out with %timeit in the
IPython shell.
|
||||
@code{.py}
|
||||
In [10]: x = 5
|
||||
|
||||
In [11]: %timeit y=x**2
|
||||
10000000 loops, best of 3: 73 ns per loop
|
||||
|
||||
In [12]: %timeit y=x*x
|
||||
10000000 loops, best of 3: 58.3 ns per loop
|
||||
|
||||
In [15]: z = np.uint8([5])
|
||||
|
||||
In [17]: %timeit y=z*z
|
||||
1000000 loops, best of 3: 1.25 us per loop
|
||||
|
||||
In [19]: %timeit y=np.square(z)
|
||||
1000000 loops, best of 3: 1.16 us per loop
|
||||
@endcode
|
||||
You can see that, x = 5 ; y = x\*x is fastest and it is around 20x faster compared to Numpy. If you
|
||||
consider the array creation also, it may reach up to 100x faster. Cool, right? *(Numpy devs are
|
||||
working on this issue)*
|
||||
|
||||
@note Python scalar operations are faster than Numpy scalar operations. So for operations including
|
||||
one or two elements, Python scalar is better than Numpy arrays. Numpy has the advantage when the size of
|
||||
the array is a little bit bigger.
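Outside IPython, a comparable measurement can be made with the standard `timeit` module. This is a rough sketch: the repetition count is an arbitrary choice, and absolute timings will differ between machines and library versions.

```python
import timeit
import numpy as np

x = 5
z = np.uint8([5])

# Time each statement; number= sets how many executions are timed in total.
n = 100_000
timings = {
    "x*x (Python int)": timeit.timeit("x*x", globals={"x": x}, number=n),
    "z*z (NumPy array)": timeit.timeit("z*z", globals={"z": z}, number=n),
    "np.square(z)": timeit.timeit("np.square(z)", globals={"np": np, "z": z}, number=n),
}

for label, seconds in timings.items():
    print(f"{label:20s} {seconds / n * 1e9:8.1f} ns per call")
```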
|
||||
|
||||
We will try one more example. This time, we will compare the performance of **cv.countNonZero()**
|
||||
and **np.count_nonzero()** for the same image.
|
||||
|
||||
@code{.py}
|
||||
In [35]: %timeit z = cv.countNonZero(img)
|
||||
100000 loops, best of 3: 15.8 us per loop
|
||||
|
||||
In [36]: %timeit z = np.count_nonzero(img)
|
||||
1000 loops, best of 3: 370 us per loop
|
||||
@endcode
|
||||
See, the OpenCV function is nearly 25x faster than the Numpy function.
|
||||
|
||||
@note Normally, OpenCV functions are faster than Numpy functions. So for the same operation, OpenCV
|
||||
functions are preferred. But, there can be exceptions, especially when Numpy works with views
|
||||
instead of copies.
|
||||
|
||||
More IPython magic commands
|
||||
---------------------------
|
||||
|
||||
There are several other magic commands to measure performance, profiling, line profiling, memory
|
||||
measurement, etc. They are all well documented, so only links to those docs are provided here.
|
||||
Interested readers are recommended to try them out.
|
||||
|
||||
Performance Optimization Techniques
|
||||
-----------------------------------
|
||||
|
||||
There are several techniques and coding methods to get the maximum performance out of Python and Numpy.
Only relevant ones are noted here and links are given to important sources. The main point is:
first implement the algorithm in a simple manner. Once it is working,
profile it, find the bottlenecks, and optimize them.
|
||||
|
||||
-#  Avoid using loops in Python as much as possible, especially double/triple loops etc. They are
    inherently slow.
-#  Vectorize the algorithm/code to the maximum extent possible, because Numpy and OpenCV are
    optimized for vector operations.
-#  Exploit the cache coherence.
-#  Never make copies of an array unless it is necessary. Try to use views instead. Array copying is a
    costly operation.
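The first two points can be illustrated with a small threshold operation written both ways. This is a sketch on a synthetic random image; the function names are hypothetical and the timings printed will vary by machine.

```python
import time
import numpy as np

img = np.random.default_rng(0).integers(0, 256, size=(120, 160), dtype=np.uint8)

def threshold_loop(image, level):
    """Per-pixel double loop: simple but slow in Python."""
    out = np.zeros_like(image)
    for row in range(image.shape[0]):
        for col in range(image.shape[1]):
            if image[row, col] > level:
                out[row, col] = 255
    return out

def threshold_vectorized(image, level):
    """Same result with one array expression."""
    return np.where(image > level, 255, 0).astype(np.uint8)

t0 = time.perf_counter()
slow = threshold_loop(img, 127)
t1 = time.perf_counter()
fast = threshold_vectorized(img, 127)
t2 = time.perf_counter()

print(np.array_equal(slow, fast))  # True
print(f"loop: {t1 - t0:.4f} s, vectorized: {t2 - t1:.6f} s")
```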
|
||||
|
||||
If your code is still slow after doing all of these operations, or if the use of large loops is inevitable, use additional libraries like Cython to make it faster.
|
||||
|
||||
Additional Resources
|
||||
--------------------
|
||||
|
||||
-#  [Python Optimization Techniques](http://wiki.python.org/moin/PythonSpeed/PerformanceTips)
-#  Scipy Lecture Notes - [Advanced Numpy](http://scipy-lectures.github.io/advanced/advanced_numpy/index.html#advanced-numpy)
-#  [Timing and Profiling in IPython](http://pynash.org/2013/03/06/timing-and-profiling/)
|
||||
|
||||
Exercises
|
||||
---------
|
||||
18
doc/py_tutorials/py_core/py_table_of_contents_core.markdown
Normal file
@@ -0,0 +1,18 @@
|
||||
Core Operations {#tutorial_py_table_of_contents_core}
|
||||
===============
|
||||
|
||||
- @subpage tutorial_py_basic_ops
|
||||
|
||||
Learn to read and
edit pixel values, work with image ROI, and perform other basic operations.
|
||||
|
||||
- @subpage tutorial_py_image_arithmetics
|
||||
|
||||
Perform arithmetic
|
||||
operations on images
|
||||
|
||||
- @subpage tutorial_py_optimization
|
||||
|
||||
Getting a solution is
important. But getting it in the fastest way is more important. Learn to check the speed of your
code, optimize the code, etc.
|
||||