Recently, I started experimenting with stereo vision.
It is a technique to produce depth maps from images captured at nearby positions. Then, with these maps, it is possible to create 3D representations.
The core of this workflow is the matching algorithm, which takes pairs of post-processed images and creates a “disparity” map. The disparity of a point is the distance between its positions in the two images. Depth and disparity are inversely proportional, so it is easy to convert one into the other.
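As a minimal sketch of that conversion, assuming rectified images from two identical cameras, depth is Z = f * B / d, where f is the focal length in pixels, B is the distance between the cameras, and d is the disparity in pixels. The function name and the numbers below are made up for illustration; they do not come from any library.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (metres): Z = f * B / d."""
    disparity = np.asarray(disparity, dtype=np.float32)
    # Pixels without a usable disparity are reported as infinitely far away.
    depth = np.full(disparity.shape, np.inf, dtype=np.float32)
    valid = disparity > 0  # zero or negative disparity carries no depth information
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Illustrative values only: a 640 px focal length and a 10 cm baseline.
disparity = np.array([[64.0, 32.0], [16.0, 0.0]], dtype=np.float32)
depth = disparity_to_depth(disparity, focal_length_px=640.0, baseline_m=0.1)
print(depth)
```

Halving the disparity doubles the depth, which is why distant objects, with their tiny disparities, are the hardest to measure precisely.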
OpenCV contains some stereo matching algorithms, but the results I got from them were very noisy. So I looked for another library, and I found libelas.
It is a GPLv3 C++/MATLAB library with many parameters to tune, but I could not find a Python version. My options were to switch to C++ or to port it myself. I chose the latter, hoping that others can benefit from it too 🙂️.
Long story short, I published my first package on PyPI: PyElas.
How to use it
You can install it using pip. Then you just have to do this:
    import cv2
    from elas import Elas

    # Somehow, load left and right images in grayscale.
    # They must be accessible through the buffer protocol.
    # For example, OpenCV can do that
    left = cv2.imread('left.jpg', cv2.IMREAD_GRAYSCALE)
    right = cv2.imread('right.jpg', cv2.IMREAD_GRAYSCALE)

    # Create an Elas object.
    # You can pass parameters to the constructor or set them as members
    e = Elas()

    # Finally, process the images.
    # You will obtain two Numpy float32 arrays with the two disparity maps
    disp_left, disp_right = e.process(left, right)
The list of all parameters is available in the README.md (you can find it on my GitHub). Or, even better, you can type help(elas.Elas) in the Python interactive interpreter.
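To look at a result, the float32 disparity map has to be scaled to an 8-bit image first. Here is a rough sketch using only NumPy. It assumes that pixels without a match are marked with a negative value, which is my understanding of libelas but worth checking against your own output. The helper name is mine, and the small array stands in for a real disparity map.

```python
import numpy as np

def disparity_to_uint8(disparity):
    """Scale a float disparity map to 0..255 for display; invalid pixels become 0."""
    disparity = np.asarray(disparity, dtype=np.float32)
    valid = disparity >= 0  # assumption: negative values mark pixels with no match
    image = np.zeros(disparity.shape, dtype=np.uint8)
    if valid.any():
        max_disp = disparity[valid].max()
        if max_disp > 0:
            scaled = disparity[valid] / max_disp * 255
            image[valid] = np.round(scaled).astype(np.uint8)
    return image

# Stand-in for a real disparity map such as disp_left.
fake_disp = np.array([[-10.0, 0.0], [20.0, 100.0]], dtype=np.float32)
preview = disparity_to_uint8(fake_disp)
print(preview)
```

The resulting array can be saved or shown with any image library, for example with cv2.imwrite.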
Some technicalities
Libelas is quite a black box: apart from the parameters, it exposes only the function to process a pair of images.
The parameters are the only member actually reused between runs of the algorithm. However, once set, they cannot be changed.
I took a similar but slightly different approach for the Python version. The process method is still the only operation available, but you can modify the parameters at any time.
On the C++ side, I do not store any Elas object. I keep just an Elas::parameters and create a new Elas object for each process call. As you can see in the code, the constructor just copies the settings into the instance. So, constructing it at every run does not add much overhead.
Moreover, Elas::parameters is almost a POD type, so it is trivial to manage. And, with direct access to it, the Python interpreter can read and write its values. This saved me from writing a lot of getters and setters.
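The idea can be illustrated in pure Python. This is only a sketch of the pattern, not the actual binding code, which is written in C++ against the Python C API. The class names Parameters, Engine and Matcher are hypothetical, the two fields are just examples, and the matching itself is replaced by a placeholder.

```python
from dataclasses import dataclass, replace

@dataclass
class Parameters:
    # Example settings with made-up defaults, for illustration only.
    disp_min: int = 0
    disp_max: int = 255

class Engine:
    """Stand-in for the C++ Elas class: settings are fixed at construction."""
    def __init__(self, params):
        self._params = replace(params)  # cheap copy, like the C++ constructor

    def process(self, left, right):
        # Placeholder for the real matching: just report the settings in use.
        return (self._params.disp_min, self._params.disp_max)

class Matcher:
    """Stand-in for the Python-facing object: it stores only the parameters."""
    def __init__(self, **kwargs):
        self.params = Parameters(**kwargs)

    def process(self, left, right):
        # A fresh engine per call, so parameter changes always take effect.
        return Engine(self.params).process(left, right)

m = Matcher(disp_max=64)
first = m.process(None, None)
m.params.disp_max = 128  # modified between two runs
second = m.process(None, None)
print(first, second)
```

Since the engine holds no state worth keeping between runs, throwing it away after each call costs little and keeps the parameters freely editable.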
This simplicity allowed me to use the Python C API directly, without any wrapper, and still get the result quickly.
Does it work well?
I do not know yet 😂️! I wrote this binding/port to see what it can do, but I still have to test it in depth.
I was looking for a stereo matching library, but nowadays AI is the protagonist in computer vision. If you look on GitHub, the most popular projects on this topic are neural networks. Maybe I will try them as well in the future.
Also, this is only one part of the workflow, and there is certainly much more to say. In the future, I will post about my setup.