The position of a point in space can be computed from two views using triangulation.
Of course, to get a reasonably accurate 3D reconstruction (up to scale) we need the parameters of the stereo pair: the focal lengths and the relative position and orientation of the cameras. Otherwise the reconstruction is only defined up to a projective transformation, and the estimated 3D points may be unacceptably distorted.
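As a rough illustration of the triangulation step, here is a minimal Python/NumPy sketch of linear (DLT) triangulation for a calibrated pair. It is not taken from easyVision: the function names `triangulate` and `project`, the camera matrices and the test point are all assumptions made up for the example.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen in two views.

    P1, P2: 3x4 camera matrices. x1, x2: image coordinates (u, v).
    Returns the 3D point in non-homogeneous coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point X.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical stereo pair: same focal length, second camera shifted along x.
f = 1.5
K = np.diag([f, f, 1.0])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_est)
```

With noisy image points the same system is solved in the least-squares sense, so the result is only an approximation of the true position.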
One of the most wonderful results of multiview geometry is that the calibration parameters can be inferred automatically just from the views of a small number of arbitrary 3D points (in unknown positions).
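The usual first step of such an inference is to estimate the fundamental matrix from point correspondences. The following Python/NumPy sketch uses the normalized eight-point algorithm on synthetic data. It is only an illustration under assumed values (the cameras, the 12 random points and the names `normalize` and `fundamental_8pt` are invented here), not the code used by the demo.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: zero mean, mean distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def fundamental_8pt(x1, x2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 correspondences (Nx2 arrays)."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # One linear equation per correspondence on the 9 entries of F (row-major).
    A = np.column_stack([p2[:, 0:1] * p1, p2[:, 1:2] * p1, p2[:, 2:3] * p1])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)

# Synthetic scene: 12 arbitrary 3D points in front of two hypothetical cameras.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 12),
                     rng.uniform(-1, 1, 12),
                     rng.uniform(4, 8, 12)])
a = 0.2
R = np.array([[np.cos(a), 0, np.sin(a)],
              [0, 1, 0],
              [-np.sin(a), 0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])
K = np.diag([1.5, 1.5, 1.0])

def project(Xw, R, t):
    x = (K @ (R @ Xw.T + t[:, None])).T
    return x[:, :2] / x[:, 2:3]

x1 = project(X, np.eye(3), np.zeros(3))
x2 = project(X, R, t)
F = fundamental_8pt(x1, x2)

# Epipolar residuals x2^T F x1, one per correspondence.
h1 = np.column_stack([x1, np.ones(len(x1))])
h2 = np.column_stack([x2, np.ones(len(x2))])
residuals = np.abs(np.einsum('ij,jk,ik->i', h2, F, h1))
print(residuals.max())
```

The 3D positions are used only to generate the two sets of image points; the estimation itself sees nothing but the correspondences.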
The easyVision example program "demostereo" illustrates this idea using two webcams. The interest point is detected by a simple heuristic in the HSV color space, and a good estimate of its position and velocity is obtained with a Kalman filter. First we adjust the parameters of the region detector and check that the tracker works.
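To give an idea of the tracking step, here is a minimal Python/NumPy sketch of a Kalman filter with a constant-velocity model applied to noisy 2D image positions. The state layout, the noise levels `q` and `r`, and the simulated trajectory are assumptions chosen for the example; the actual model used by the demo may differ.

```python
import numpy as np

def make_cv_model(dt, q, r):
    """Constant-velocity model with state [x, y, vx, vy] and position measurements."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q = q * np.eye(4)   # process noise covariance (assumed isotropic)
    R = r * np.eye(2)   # measurement noise covariance
    return F, H, Q, R

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of the linear Kalman filter."""
    # Predict the state and its covariance.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct the prediction with the measurement z.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Simulated detections: a point moving at constant velocity, observed with noise.
rng = np.random.default_rng(1)
p0 = np.array([100.0, 50.0])
v_true = np.array([2.0, -1.0])   # pixels per frame
F, H, Q, R = make_cv_model(dt=1.0, q=1e-5, r=1.0)

z0 = p0 + rng.normal(0.0, 1.0, 2)
x = np.array([z0[0], z0[1], 0.0, 0.0])   # start at the first detection, zero velocity
P = 1e3 * np.eye(4)                      # large initial uncertainty

for k in range(1, 100):
    truth = p0 + v_true * k
    z = truth + rng.normal(0.0, 1.0, 2)
    x, P = kalman_step(x, P, z, F, H, Q, R)

print("position:", x[:2], "velocity:", x[2:])
```

The estimated velocity is what makes it possible to decide when the tracked point has come to rest.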
If the point "stops", it is stored. When the set of stored points produces a promising estimate of the stereo geometry, the camera parameters are updated in the 3D view.
Subsequent estimates should get progressively closer to a similarity reconstruction, that is, one that is correct up to a global scale and rigid motion.
Now we can register points in the desired positions. For example, here we mark something like a box.
Note that auto-calibration from a single fundamental matrix is possible here because the camera model has only one unknown internal parameter, the focal length.
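One simple way to see why a single unknown is recoverable: with the correct focal length f and K = diag(f, f, 1), the matrix K^T F K must be an essential matrix, which has two equal singular values and one zero. The Python/NumPy sketch below scans candidate focal lengths and keeps the one that best satisfies this condition. It is an illustration only, and it makes extra assumptions that are not stated above: both cameras share the same focal length, the principal point is at the image origin, and the relative pose is not degenerate. It is not claimed to be the method used by easyVision.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

def rotation(axis, angle):
    """Rotation matrix from axis and angle (Rodrigues formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    Kx = skew(k)
    return np.eye(3) + np.sin(angle) * Kx + (1 - np.cos(angle)) * Kx @ Kx

def essential_quality(F, f):
    """Relative gap between the two largest singular values of K^T F K.

    It is zero when K^T F K is a valid essential matrix.
    """
    K = np.diag([f, f, 1.0])
    s = np.linalg.svd(K.T @ F @ K, compute_uv=False)
    return (s[0] - s[1]) / s[0]

def focal_from_F(F, candidates):
    """Pick the candidate focal length that makes K^T F K closest to essential."""
    costs = [essential_quality(F, f) for f in candidates]
    return candidates[int(np.argmin(costs))]

# Synthetic fundamental matrix built from a known (hypothetical) configuration.
f_true = 1.7
K = np.diag([f_true, f_true, 1.0])
R = rotation([0.2, 1.0, 0.1], 0.3)
t = np.array([1.0, 0.2, 0.3])
E = skew(t) @ R
Kinv = np.linalg.inv(K)
F = Kinv.T @ E @ Kinv

candidates = np.linspace(0.5, 3.0, 251)
f_est = focal_from_F(F, candidates)
print(f_est)
```

A grid search is used here only to keep the sketch short. With more unknown internal parameters per camera, a single fundamental matrix no longer provides enough constraints.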
This program demonstrates the basics of stereo vision in a very attractive way, and this kind of 3D point capture can also be useful in some applications.