E27: Computer Vision - Spring 2016 - PROJECT 3

PROJECT - STRUCTURED LIGHT AND POINT CLOUD DATA

OVERVIEW

In this lab you will investigate how the first-generation Microsoft Kinect sensor uses structured light to generate dense range data, and learn how to reconstruct 3D point clouds from such data.

TASKS

Background. Before you begin, read the ROS Kinect technical documentation at https://wiki.ros.org/kinect_calibration/technical to get an idea of how the first-generation Kinect works.

Getting started. Begin by downloading the project 3 starter code and files from the course website. Inside the archive, you will find a projector image and a number of simulated camera images of 3D objects, rendered using a simplified model of a Kinect sensor. There are also two Python scripts, starter.py and PointCloudApp.py, which help you compute disparity maps and visualize 3D data.

You can test the PointCloudApp module by running one of the two commands

python PointCloudApp.py

python PointCloudApp.py cam1_xyz.npz

The first should display a 3D torus, and the second should display the 3D reconstruction corresponding to the cam1.png file. In either case, you can click and drag in the 3D window to rotate the view.
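If you want to inspect the supplied .npz file, or save your own reconstruction in the same container format, numpy's load and savez functions are the relevant tools. The key names below are illustrative; check PointCloudApp.py to see how it actually loads the file.

import numpy as np

# Inspect the arrays stored in the supplied file; key names vary per file.
data = np.load('cam1_xyz.npz')
print(data.files)                 # names of the arrays in the archive
xyz = data[data.files[0]]         # first stored array, expected shape (n, 3)
print(xyz.shape, xyz.dtype)

# Save an n-by-3 reconstruction in the same container format.
np.savez('my_xyz.npz', xyz=xyz)   # the 'xyz' key name is illustrative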

Dense point cloud reconstruction. Given the projector image and any one of the camera images, you can compute a disparity image for the scene. OpenCV makes this very easy via the StereoSGBM object (see https://goo.gl/U5iW51).
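For reference, here is a minimal sketch of computing a disparity map with semi-global block matching. It assumes OpenCV 3's cv2.StereoSGBM_create API (older versions use cv2.StereoSGBM); the file names and matcher parameters are illustrative starting points, not required settings.

import cv2
import numpy as np

# File names are illustrative; substitute the files from the starter archive.
projector = cv2.imread('projector.png', cv2.IMREAD_GRAYSCALE)
camera = cv2.imread('cam1.png', cv2.IMREAD_GRAYSCALE)

window_size = 9
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,          # must be a multiple of 16
    blockSize=window_size,
    P1=8 * window_size ** 2,
    P2=32 * window_size ** 2,
)

# compute() returns fixed-point disparities scaled by 16; swap the argument
# order if the resulting disparities come out mostly invalid for your setup.
disparity = sgbm.compute(projector, camera).astype(np.float32) / 16.0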


Your goal is to extend the starter.py program to generate an n-by-3 array of XYZ data to be used as input to the PointCloudApp module, where n is the number of usable pixels in the disparity image.

To do so, you will need to consider the camera geometry as well as the intrinsic parameters. For the simulated setup, the calibration matrices K for both the projector and the camera are identical, with

        | f   0   u0 |
    K = | 0   f   v0 |
        | 0   0    1 |

where f is the focal length in pixels and (u0, v0) is the principal point,

and b, the stereo baseline, is 0.05 m.

Remember, K maps a 3D point P in the frame of the camera to a point on the sensor plane:

        | u |       | X |
    q = | v | ∝ K · | Y |
        | 1 |       | Z |

Hence, mapping each point q through K⁻¹ yields a point proportional to (X, Y, Z). Furthermore, the Z coordinate of each point P can be obtained from the disparity value δ at each (u, v) location via

Z = b · f / δ

Using this relationship, it is possible to reconstruct the (X, Y, Z) location of every pixel in the disparity image where δ ≠ 0.
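As a concrete illustration of this geometry, the sketch below recovers the 3D point for a single pixel. It is a non-vectorized example that assumes K has the standard pinhole form given above; the function and variable names are illustrative.

import numpy as np

def reconstruct_pixel(u, v, disp, K, b):
    """Recover (X, Y, Z) for pixel (u, v) given its disparity disp.

    K is the 3x3 calibration matrix and b the stereo baseline in meters.
    Returns None where the disparity is zero (no match found).
    """
    if disp == 0:
        return None
    f = K[0, 0]                    # focal length in pixels
    Z = b * f / disp               # depth from disparity: Z = b * f / delta
    q = np.array([u, v, 1.0])
    P = np.linalg.solve(K, q)      # K^-1 q, proportional to (X, Y, Z)
    return P * (Z / P[2])          # scale so the third coordinate equals Z

# Illustrative usage: point = reconstruct_pixel(320, 240, disparity[240, 320], K, 0.05)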

Vectorized code. Because explicit loops are slow in interpreted languages, and because the underlying libraries can exploit data parallelism, programs in environments like MATLAB and numpy often perform much better when they operate on entire arrays rather than iterating over the data with for loops. Although such vectorized implementations are faster, they can sometimes be more difficult to write.

Your final program should be vectorized, containing no explicit iteration (i.e., no loops). The bottom part of PointCloudApp.py contains some helpful examples of array operations that avoid loops; in particular, pay attention to the use of numpy.meshgrid and the use of logical masks for array indexing.

You might find it helpful to write and debug an iterative version of your program before writing the vectorized version. For this problem, note that K⁻¹ has a very simple form, which can help guide your implementation.
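For reference, the sketch below shows one possible arrangement of the meshgrid-and-mask pattern mentioned above; it is not the only valid structure, and it again assumes the pinhole form of K with focal length f and principal point (u0, v0).

import numpy as np

def reconstruct_cloud(disp, f, u0, v0, b):
    """Vectorized (X, Y, Z) reconstruction from an h-by-w disparity image."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinate grids, each h-by-w

    mask = disp > 0                    # logical mask selecting usable pixels
    Z = b * f / disp[mask]             # depth from disparity
    X = (u[mask] - u0) * Z / f         # back-projection using the simple form of K^-1
    Y = (v[mask] - v0) * Z / f

    return np.column_stack((X, Y, Z))  # n-by-3 array suitable for PointCloudApp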

WHAT TO TURN IN

In addition to your program source code, you should submit a PDF write-up that addresses the following topics:

1. How did you approach writing the vectorized version of the program? Did you implement the iterative version first and then modify it, or did you do something else?

2. Vary the window_size parameter for stereo matching. Try values of 7, 9, 15, and 21. How do the point clouds change? Are there any benefits to using smaller values? Larger ones?

3. In addition to the projector and IR camera, Kinect sensors also have an RGB camera, which records color images. This can be useful for assigning colors to the points in the XYZ point cloud data (the resulting representation is sometimes referred to as XYZRGB).

The starter code for this project includes both data taken from a real Kinect and a program, load_kinect_data.py, to view it. This particular data was taken from a sensor mounted on a TurtleBot 2 (https://www.turtlebot.com/). Given that knowledge, why is there a blank vertical stripe on the right-hand side of each point cloud from the real Kinect? What physical feature does the stripe correspond to?

Assignment Data - https://www.dropbox.com/s/3jqfjacbt5mnfr3/Project.zip?dl=0.
