Piero V.

OpenCV and time lapses

After buying my Pixel 4a, I decided to take a picture of a poplar near my home every day. I did this for one year, and I created a time lapse. But I will not publish it here because it would reveal where my home is 😜️.

Methodical is not enough

With time lapses, you usually keep your camera still, but this was not an option in my case. Therefore, I tried to be methodical in taking the various pictures.

I used a sewer cover as a point to shoot the photo and a telephone pole as a reference (its tip is close to the upper-left corner in every picture).

Still, the framing varied from shot to shot, but luckily OpenCV came to the rescue.

Homography matrices

We could say that my scenario is like capturing the same scene with different cameras. Therefore, we can compute the homography matrix to reproject one image to the previous one.

And OpenCV has a very handy function to do so: findHomography. It takes the coordinates of corresponding points as inputs, and it returns a 3-by-3 matrix as output.

If you are using Python, you must pass the points as two NumPy arrays. Both must have the same shape: one row for each pair and two columns for the coordinates. The point at the i-th row of the first array must correspond to the point at the i-th row of the second array.

OpenCV can use several algorithms to find a transformation. I used RANSAC, which worked quite well for many of my images. Therefore, I did not test the other options.

Then, you can pass the result to warpPerspective to make the two images hopefully overlap.

One of the function parameters is the output size. I added 500px borders around the first image and then passed the extended dimensions to warpPerspective. In this way, I avoided losing details not captured from the first perspective, at least for this phase.

As already mentioned, with time lapses the camera is supposed to stay still, so every homography should ultimately be relative to the first image. Therefore, I ran findHomography on the points matched against the five previous, already transformed images. Using five instead of only one helped me with a few pictures that the point matching algorithm could not align correctly.

Point matching

Finding point correspondences in more than 300 images is a long, boring, and tedious task! And of course, there are algorithms that are faster and much more precise than a human 😁️!

I decided to use feature matching; in particular, I chose SIFT. I did not have a time limit, so taking 2-3 seconds to detect a few thousand points was acceptable. It may even be negligible considering the matching time: I settled on the brute force matcher, which can take tens of seconds or even minutes, depending on the number of features to match!

The last time I used SIFT was like 3 and a half years ago. So I copy&pasted&adapted the code from the first refresher I found with Google 😂️.

I also tried the FLANN matcher, but it did not work on the first try, while the brute force matcher succeeded flawlessly… So I left my computer running while I was at work, and a few hours later I got my results 😎️.

In the first attempt, I did not use the “moving average” approach, i.e., I matched each picture only with the previous one (already transformed). Unfortunately, one outlier made every following picture fail. Luckily, it was about ⅔ of the way through the video, so I did not waste too much time.

Eventually, a few images were still not processed correctly, so I aligned them manually.

Video creation

The video creation was the easiest step.

First, you need to find a crop window that works for all the images.

In particular, I chose a rough rectangle, and then I refined it to match a 16:9 aspect ratio. I also decided to resize each cropped picture to 2560×1440, the closest “standard” resolution.

To make the transition from a shot to the next one smoother, I blended them with addWeighted, to which I passed a decreasing alpha and an increasing beta.

With the VideoWriter class, you can create a video file. However, when using AVC, it seemed very slow to me. So I preferred using motion JPEG and then converting the result with ffmpeg.

Alternatively, you can save the frames as image files and feed them directly to ffmpeg instead.

Here you can find the scripts I used. However, I did not write them with the intent of publishing them. So, use them at your own peril 😂️. And apart from this, they can be improved and generalized a lot.

References