Getting Started with OpenCV
Introduction
When discussing computer vision, it's impossible to ignore image processing. Although there is no clear boundary between the two, image processing is generally understood as the preprocessing stage of computer vision. Therefore, before introducing computer vision, it is necessary to first cover image processing.
Image processing generally refers to digital image processing: analyzing two-dimensional digital images with mathematical functions and image transformations in order to extract latent information from the image data. It is typically divided into three main areas: image compression; image enhancement and restoration; and matching, description, and recognition. These areas cover techniques such as noise removal, image segmentation, and feature extraction.
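To make "noise removal" concrete, here is a minimal sketch of one common approach, a 3x3 median filter, written in plain Python. The function name `median_filter_3x3`, the toy 5x5 image, and the choice to leave border pixels untouched are all assumptions made for illustration; this is not how any particular library implements the operation.

```python
from statistics import median


def median_filter_3x3(image):
    """Return a copy of a 2D grayscale image with a 3x3 median filter applied.

    `image` is a list of equally long rows of pixel intensities.
    Border pixels are copied unchanged to keep the sketch short (an assumption).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Collect the pixel and its eight neighbors.
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            # Replace the pixel with the median of its neighborhood.
            result[y][x] = median(window)
    return result


# Hypothetical test image: a flat gray patch with one bright "salt" pixel.
noisy = [
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 255, 10, 10],
    [10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10],
]
cleaned = median_filter_3x3(noisy)
print(cleaned[2][2])
```

The isolated bright pixel is replaced by the median of its neighborhood, so the outlier disappears while the flat region is left as it was. Libraries such as OpenCV provide this operation ready-made (for example `cv2.medianBlur` in the Python bindings), so in practice it rarely needs to be written by hand.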
Computer vision is the science of enabling machines to "see," using computers to simulate human visual mechanisms. It replaces the human eye with cameras to recognize, track, and measure targets, and extracts deeper information by processing visual data. For example, applying 3D reconstruction techniques to a video captured around a building makes it possible to reconstruct a three-dimensional model of that building in a computer. Similarly, cameras mounted on a vehicle can capture the scene ahead so that the system can infer whether the vehicle can safely pass through the area, thereby aiding decision-making.
For humans, obtaining environmental information through vision is a very easy task, which leads some people to mistakenly believe that implementing computer vision is also simple. However, this is not the case. Computer vision is essentially an "inverse problem," where observed information is used to recover details about the observed object or environment. In this process, some information is often lost, resulting in incomplete data and increasing the complexity of the problem. For example, when capturing a scene with a single camera, the lack of distance information often causes phenomena such as "a person appearing taller than a building" in the image. Therefore, computer vision remains a challenging research field with a long way to go.
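The "person appearing taller than a building" effect can be sketched with the idealized pinhole camera model, in which an object of real height H at distance Z projects to an image height of f * H / Z for a focal length f. The function name `projected_height_px`, the focal length of 800 pixels, and the heights and distances below are hypothetical values chosen only for illustration.

```python
def projected_height_px(real_height_m, distance_m, focal_length_px=800.0):
    """Image height (in pixels) of an object under an ideal pinhole camera.

    Assumes the object is upright, faces the camera, and lies at a single
    distance; the default focal length is an arbitrary illustrative value.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_px * real_height_m / distance_m


# A 1.7 m person standing 2 m from the camera.
person_px = projected_height_px(1.7, 2.0)
# A 30 m building 100 m from the camera.
building_px = projected_height_px(30.0, 100.0)
print("person looks taller:", person_px > building_px)

# The inverse problem is ambiguous: an object twice as tall at twice the
# distance produces the same image height, so distance cannot be recovered
# from a single projected height alone.
twice_as_tall_px = projected_height_px(3.4, 4.0)
print("person:", person_px, "twice as tall, twice as far:", twice_as_tall_px)
```

Because only the ratio H / Z survives the projection, a single image cannot tell a nearby small object from a distant large one without extra information such as a second camera or a known reference size.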
Both image processing and computer vision ultimately come down to processing data within a computer. As a result, researchers face a tricky problem: how to implement their research findings in code and verify them through simulation. In doing so, they often find themselves rewriting the same basic functionality over and over, which is commonly referred to as "reinventing the wheel."
To provide researchers with "ready-made wheels," Intel launched the Open Source Computer Vision Library (OpenCV). The library integrates a large number of common algorithms from image processing and computer vision, sparing researchers repetitive and inefficient development work.
OpenCV began as a collection of C functions and C++ classes; current versions are written primarily in C++. In addition to C/C++ development, it provides language interfaces for Python, Java, and MATLAB, and languages such as C# and Ruby can use it through wrapper libraries. It runs on a range of operating systems, including Linux, Windows, macOS, Android, and iOS. OpenCV has greatly streamlined the process of validating computer vision algorithms, which has made it popular with researchers.
After more than two decades of development, OpenCV has become one of the most important tools in the field of computer vision.
References
- Video tutorial: https://www.bilibili.com/video/BV1jk4y1i7gN
- Book: https://www.epubit.com/bookDetails?id=UB7209964621702
- Reference code: https://github.com/tungchiahui/OpenCV_Projects