Abstract

A well-known method proposed by Quan computes projective invariants of 3D points from six points in three 2D images. The method is nonlinear and complicated, and it usually produces three possible solutions. It has been noted previously that the problem can be solved directly and linearly using six points in five images. This paper presents a method to compute projective invariants of 3D points directly from four uncalibrated images. For a set of six 3D points in general position, we choose four of them as the reference basis and represent the other two points under this basis. It is known that the cross ratios of the coefficients of these representations are projective invariants. After a series of linear transformations, a system of four bilinear equations in the three unknown projective invariants is derived. Systems of nonlinear multivariable equations are usually hard to solve, but we show that this particular system can be solved linearly and uniquely. This finding suggests that the natural configuration of the projective reconstruction problem might be six points and four images. The solutions are given in explicit formulas.

1. Introduction

The recovery of the geometric structure of 3D points from 2D images is fundamental in computer vision. After decades of research, most of the mathematical aspects of this problem are well understood. It has been proved that the geometric information of a 3D point configuration cannot be recovered from a single image unless the configuration is further constrained [1]. When two or more images are available, the 3D structure of a scene can be recovered up to an unknown projective transformation. The projective reconstruction of camera parameters and 3D scene structure from multiple uncalibrated views is also called projective structure and motion [15].

A camera is a device that maps a 3D scene onto an image plane. A pinhole camera model is used to represent the linear projection from 3D space onto each image plane. In this paper, 3D world points are represented by homogeneous 4-vectors $X_j$, $j = 1, \dots, n$. The projection of the $j$th 3D point in the $i$th image is represented by a homogeneous 3-vector $x_{ij}$. The relationships among the 3D points and their 2D projections are

$$\lambda_{ij} x_{ij} = P_i X_j, \quad i = 1, \dots, m, \; j = 1, \dots, n, \tag{1}$$

where $P_i$ is the projection matrix (which is 3 × 4 and is also called the camera matrix) of the $i$th camera, $\lambda_{ij}$ is a nonzero scale factor called the projective depth, and $x_{ij}$ is the $i$th projection of the $j$th 3D point. Suppose that $m$ perspective images of a set of $n$ 3D points are given. The structure and motion problem is to recover the 3D point locations and camera locations from the image measurements. When the cameras are uncalibrated and no additional geometric information about the point set is available, the reconstruction is determined only up to an unknown projective transformation. For any 3D projective transformation matrix $H$, the cameras $P_i H^{-1}$ and the points $H X_j$ produce an equally valid reconstruction.
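The projective ambiguity can be checked in one line. The sketch below assumes the usual notation: $H$ is an arbitrary invertible 4 × 4 matrix, $P_i$ a camera matrix, $X_j$ a homogeneous 3D point, and $\lambda_{ij} x_{ij}$ its scaled image.

```latex
(P_i H^{-1})(H X_j) = P_i (H^{-1} H) X_j = P_i X_j = \lambda_{ij} x_{ij}
```

The transformed cameras and points therefore reproduce exactly the same image measurements, which is why image data alone cannot fix the reconstruction beyond a projective transformation.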

Existing methods for projective reconstruction are usually indirect. They rely on a prior estimate of certain multiple-view tensors of the scene in order to estimate the 3D point structure. A second-order tensor, usually called the fundamental matrix, captures the geometry between two views of a 3D scene. A third-order tensor, usually called the trifocal tensor, captures the geometry among three views of a 3D scene. When these tensors are known, there are many algorithms to recover the 3D geometric structure of the scene from them [6–17].

We can also compute 3D projective invariants of a point set from its 2D images directly. In the famous paper [9], Quan proposed a method to compute 3D projective invariants of six 3D points from three uncalibrated images. However, the method proposed by Quan is rather complicated and hard to use in real applications.

This paper presents a fast linear method for computing projective invariants of six 3D points from four 2D images. A 3D point structure can be described by first choosing four reference points as a basis and then representing the other two points under this basis. The cross ratios of the coordinates of the other two points under this basis are projective invariants. A system of four bilinear equations in three unknowns is derived first. Traditional methods for solving nonlinear multivariable equations are very complicated. The main contribution of this paper is to show that this system of equations can be easily transformed into linear equations. This finding suggests that the natural configuration of the projective reconstruction problem is six points and four images. The projective invariants are given in explicit formulas.

2. Related Works

We review a few related works in this section, the most famous of which is the work of Quan [9].

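In [1], Faugeras studied projective reconstruction using five reference points, called the standard basis, whose homogeneous coordinates are

$$E_1 = (1,0,0,0)^T, \; E_2 = (0,1,0,0)^T, \; E_3 = (0,0,1,0)^T, \; E_4 = (0,0,0,1)^T, \; E_5 = (1,1,1,1)^T. \tag{2}$$

Suppose that the 3D points $E_1$, $E_2$, $E_3$, and $E_4$ are transformed by each camera into the 2D image points

$$e_1 = (1,0,0)^T, \; e_2 = (0,1,0)^T, \; e_3 = (0,0,1)^T, \; e_4 = (1,1,1)^T. \tag{3}$$

They form a projective basis for the $i$th image plane. Then the reduced camera matrix looks like

$$P_i = \begin{pmatrix} p_i & 0 & 0 & s_i \\ 0 & q_i & 0 & s_i \\ 0 & 0 & r_i & s_i \end{pmatrix}. \tag{4}$$

If point correspondences between images are known, projective reconstruction can be performed by solving a system of quadratic equations.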

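Quan proposed an algorithm to compute projective invariants of six 3D points from three projection images [9]. Given any six 3D points, the author selected five points as the standard basis as in (2). The six unknown points in 3D space are then projectively equivalent to the following normalized points:

$$(1,0,0,0)^T, \; (0,1,0,0)^T, \; (0,0,1,0)^T, \; (0,0,0,1)^T, \; (1,1,1,1)^T, \; (X,Y,Z,T)^T. \tag{5}$$

The known point locations in the three 2D images are first normalized according to the projective basis. After this, the known point locations in the $i$th image correspond to

$$(1,0,0)^T, \; (0,1,0)^T, \; (0,0,1)^T, \; (1,1,1)^T, \; (u_5^i, v_5^i, w_5^i)^T, \; (u_6^i, v_6^i, w_6^i)^T. \tag{6}$$

From these correspondence relations, a homogeneous nonlinear equation of the form

$$c_1^i XY + c_2^i XZ + c_3^i XT + c_4^i YZ + c_5^i YT + c_6^i ZT = 0 \tag{7}$$

can be derived for the $i$th image, where

$$c_1^i = w_6^i (u_5^i - v_5^i), \; c_2^i = v_6^i (w_5^i - u_5^i), \; c_3^i = u_5^i (v_6^i - w_6^i), \; c_4^i = u_6^i (v_5^i - w_5^i), \; c_5^i = v_5^i (w_6^i - u_6^i), \; c_6^i = w_5^i (u_6^i - v_6^i). \tag{8}$$

It is also noticed that

$$c_1^i + c_2^i + c_3^i + c_4^i + c_5^i + c_6^i = 0. \tag{9}$$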
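The bilinear constraint for one image can be sanity-checked numerically. The following Python sketch projects the fifth and sixth normalized points through a reduced camera of the diagonal-plus-column form and evaluates one standard way of writing the constraint. The coefficient layout, the function names, and the numeric values are illustrative assumptions and are not taken from [9].

```python
def reduced_camera_image(cam, point):
    """Project a homogeneous 3D point with the reduced camera
    [[p, 0, 0, s], [0, q, 0, s], [0, 0, r, s]] given as cam = (p, q, r, s)."""
    p, q, r, s = cam
    X, Y, Z, T = point
    return (p * X + s * T, q * Y + s * T, r * Z + s * T)


def bilinear_constraint(p5, p6, point):
    """Return the six coefficients and the value of the bilinear form
    c1*XY + c2*XZ + c3*XT + c4*YZ + c5*YT + c6*ZT."""
    u5, v5, w5 = p5
    u6, v6, w6 = p6
    X, Y, Z, T = point
    coeffs = [w6 * (u5 - v5), v6 * (w5 - u5), u5 * (v6 - w6),
              u6 * (v5 - w5), v5 * (w6 - u6), w5 * (u6 - v6)]
    monomials = [X * Y, X * Z, X * T, Y * Z, Y * T, Z * T]
    return coeffs, sum(c * m for c, m in zip(coeffs, monomials))


cam = (2, -3, 5, 7)       # hypothetical reduced camera entries
sixth = (4, 1, -2, 3)     # hypothetical sixth point (X, Y, Z, T)
p5 = reduced_camera_image(cam, (1, 1, 1, 1))
p6 = reduced_camera_image(cam, sixth)
coeffs, value = bilinear_constraint(p5, p6, sixth)
print(sum(coeffs), value)
```

For this example both printed values are 0: the coefficients sum to zero for any image data, and the bilinear form vanishes at the true sixth point.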

Since six 3D points have 18 degrees of freedom and a 3D projective transformation has 15 degrees of freedom, six points in 3D space have $18 - 15 = 3$ independent projective invariants. There are many forms of projective invariants. It is noticed that the ratios of $X$, $Y$, $Z$, and $T$ in (7) are projective invariants. Three independent invariants of this kind are

$$X/T, \quad Y/T, \quad Z/T. \tag{10}$$

So the goal is to compute these unknown 3D projective invariants from three 2D images.

Quan solved the system of bilinear equations (7) using the classical resultant technique. After eliminating one variable, he obtained two homogeneous polynomial equations of the third degree in the three remaining variables. Eliminating a second variable results in a homogeneous polynomial equation of degree eight in the remaining two variables. A third-degree polynomial equation can then be derived numerically through polynomial factorization.

As the procedure described above shows, the method proposed by Quan is hard for ordinary users to implement and inconvenient for real applications. In [13], the author proposed a method to eliminate two variables in a single step, and a third-degree polynomial equation in a single variable was given explicitly.

3. A Linear Method to Compute Projective Invariants from 4 Images

A novel direct method for computing projective invariants of six 3D points from four images is presented in this section. We begin by considering a set of 3D points which are seen from four views.

Suppose that a set of six 3D points $X_1, X_2, \dots, X_6$ is given, the geometric structure of which is unknown. The point set is projected into four images by four unknown camera matrices $P_1$, $P_2$, $P_3$, and $P_4$. The relationships between them are

$$\lambda_{ij} x_{ij} = P_i X_j, \quad i = 1, \dots, 4, \; j = 1, \dots, 6. \tag{13}$$

The only information available is the point locations in the four images and the point correspondences between the four projections,

$$x_{1j} \leftrightarrow x_{2j} \leftrightarrow x_{3j} \leftrightarrow x_{4j}, \tag{14}$$

where $j = 1, \dots, 6$. It is assumed that no four points in space are coplanar and no three points in the images are collinear; otherwise the problem is much simpler.

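Points $X_5$ and $X_6$ can be represented as linear combinations of $X_1$, $X_2$, $X_3$, and $X_4$:

$$X_5 = \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_3 + \alpha_4 X_4, \qquad X_6 = \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4. \tag{15}$$

Since points $X_1$, $X_2$, $X_3$, and $X_4$ are linearly independent, this representation is unique, and since no four of the six points are coplanar, all the $\alpha_k$ and $\beta_k$ are nonzero. There are many forms of projective invariants. It is observed that the cross ratios of the coefficients in (15) are projective invariants. Six 3D points have 18 degrees of freedom and a 3D projective transformation has 15 degrees of freedom, so six 3D points have 3 independent projective invariants. A set of functionally independent projective invariants of this form is

$$I_1 = \frac{\alpha_1 \beta_2}{\alpha_2 \beta_1}, \quad I_2 = \frac{\alpha_1 \beta_3}{\alpha_3 \beta_1}, \quad I_3 = \frac{\alpha_1 \beta_4}{\alpha_4 \beta_1}. \tag{16}$$

The projective invariance of $I_1$, $I_2$, and $I_3$ can be proved easily. Suppose that the six points $X_1, \dots, X_6$ are transformed into $X'_1, \dots, X'_6$ by a 3D projective transformation $H$, where $H$ is a 4 × 4 full rank matrix. That is,

$$X'_j = \rho_j H X_j, \quad j = 1, \dots, 6, \tag{17}$$

where each $\rho_j$ is a nonzero real number. Let

$$X'_5 = \alpha'_1 X'_1 + \alpha'_2 X'_2 + \alpha'_3 X'_3 + \alpha'_4 X'_4, \qquad X'_6 = \beta'_1 X'_1 + \beta'_2 X'_2 + \beta'_3 X'_3 + \beta'_4 X'_4 \tag{18}$$

be the linear representations of $X'_5$ and $X'_6$ in $X'_1$, $X'_2$, $X'_3$, and $X'_4$. Substituting (17) into (18) and multiplying each side of each equation by the matrix $H^{-1}$, we have

$$\rho_5 X_5 = \sum_{k=1}^{4} \alpha'_k \rho_k X_k, \qquad \rho_6 X_6 = \sum_{k=1}^{4} \beta'_k \rho_k X_k. \tag{19}$$

Since the vectors $X_1$, $X_2$, $X_3$, and $X_4$ are linearly independent, the linear representations in (15) and (19) must coincide, so $\alpha_k = \alpha'_k \rho_k / \rho_5$ and $\beta_k = \beta'_k \rho_k / \rho_6$. The scale factors cancel in the cross ratios, so we have

$$\frac{\alpha'_1 \beta'_{k+1}}{\alpha'_{k+1} \beta'_1} = \frac{\alpha_1 \beta_{k+1}}{\alpha_{k+1} \beta_1}, \quad k = 1, 2, 3. \tag{20}$$

This proves the invariance of $I_1$, $I_2$, and $I_3$.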
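The invariance argument can be illustrated with exact rational arithmetic. The Python sketch below assumes the invariants are the cross ratios $\alpha_1\beta_{k+1}/(\alpha_{k+1}\beta_1)$ of the coefficients of the fifth and sixth points in the basis of the first four; the sample points, the matrix `H`, the per-point scales, and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([r[:c] + r[c + 1:] for r in m[1:]])
               for c in range(len(m)))


def coefficients(basis, point):
    """Solve point = sum_k c_k * basis[k] exactly by Cramer's rule."""
    d = det(basis)
    return [Fraction(det(basis[:k] + [point] + basis[k + 1:]), d)
            for k in range(len(basis))]


def invariants(points):
    """Cross ratios of the coefficients of points 5 and 6 in the basis 1..4."""
    alpha = coefficients(points[:4], points[4])
    beta = coefficients(points[:4], points[5])
    return [alpha[0] * beta[k] / (alpha[k] * beta[0]) for k in (1, 2, 3)]


def transform(H, rho, point):
    """Apply the projective map H and an arbitrary nonzero scale rho."""
    return [rho * sum(h * p for h, p in zip(row, point)) for row in H]


points = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],
          [1, 1, 1, 1], [2, 3, 5, 1], [3, -1, 4, 1]]
H = [[2, 1, 0, 0], [0, 1, 3, 0], [1, 0, 1, 1], [0, 2, 0, 1]]  # invertible
scales = [1, 2, -3, 5, 7, -1]                                  # one per point

moved = [transform(H, s, p) for s, p in zip(scales, points)]
before = invariants(points)
after = invariants(moved)
print([str(v) for v in before], before == after)
```

Using `Fraction` keeps the comparison exact, so the equality test does not depend on a floating-point tolerance.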

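The set of projective invariants in (16) has the property that when an invariant equals one, four of the 3D points are coplanar. This can be proved easily. For example, if $I_1 = 1$, then $\alpha_1 \beta_2 = \alpha_2 \beta_1$. From (15), we have

$$\beta_1 X_5 = \alpha_1 \beta_1 X_1 + \alpha_2 \beta_1 X_2 + \alpha_3 \beta_1 X_3 + \alpha_4 \beta_1 X_4, \qquad \alpha_1 X_6 = \alpha_1 \beta_1 X_1 + \alpha_1 \beta_2 X_2 + \alpha_1 \beta_3 X_3 + \alpha_1 \beta_4 X_4. \tag{21}$$

Subtracting one equation from the other in (21), we get

$$\beta_1 X_5 - \alpha_1 X_6 = (\alpha_3 \beta_1 - \alpha_1 \beta_3) X_3 + (\alpha_4 \beta_1 - \alpha_1 \beta_4) X_4. \tag{22}$$

Since $\alpha_1$ and $\beta_1$ are not zero, (22) is a nontrivial linear relation among the points $X_3$, $X_4$, $X_5$, and $X_6$. So they are coplanar.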

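On the other hand, if points $X_3$, $X_4$, $X_5$, and $X_6$ are coplanar, then there are numbers $c_3$, $c_4$, $c_5$, and $c_6$, not all zero, such that

$$c_3 X_3 + c_4 X_4 + c_5 X_5 + c_6 X_6 = 0. \tag{23}$$

Substituting $X_5$ and $X_6$ from (15) into (23), we obtain

$$(c_5 \alpha_1 + c_6 \beta_1) X_1 + (c_5 \alpha_2 + c_6 \beta_2) X_2 + (c_3 + c_5 \alpha_3 + c_6 \beta_3) X_3 + (c_4 + c_5 \alpha_4 + c_6 \beta_4) X_4 = 0. \tag{24}$$

Since points $X_1$, $X_2$, $X_3$, and $X_4$ are not coplanar, the coefficients in (24) have to be exactly zero. From this condition we have

$$c_5 \alpha_1 + c_6 \beta_1 = 0, \qquad c_5 \alpha_2 + c_6 \beta_2 = 0. \tag{25}$$

The numbers $c_5$ and $c_6$ cannot both be zero, since otherwise (24) would force $c_3 = c_4 = 0$ as well. From (25), we therefore obtain

$$\alpha_1 \beta_2 - \alpha_2 \beta_1 = 0, \quad \text{that is,} \quad I_1 = 1. \tag{26}$$

This proves the claim that the necessary and sufficient condition for four of the six points to be coplanar is that one of the projective invariants equals one.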
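The coplanarity criterion can be illustrated with a small integer example. The Python sketch below builds the fifth and sixth points from chosen coefficients, once with $\alpha_1\beta_2 = \alpha_2\beta_1$ and once without, and tests coplanarity of the last four points through a 4 × 4 determinant of their homogeneous coordinates. The basis points, the coefficient values, and the function names are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([r[:c] + r[c + 1:] for r in m[1:]])
               for c in range(len(m)))


def combine(coeffs, basis):
    """Return sum_k coeffs[k] * basis[k] as a homogeneous 4-vector."""
    return [sum(c * p[r] for c, p in zip(coeffs, basis)) for r in range(4)]


basis = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [1, 1, 1, 1]]  # X1..X4
alpha = [2, 3, 5, 7]                                              # X5 coefficients
X5 = combine(alpha, basis)

results = {}
for label, beta in (("I1 equals 1", [4, 6, 1, 9]), ("I1 differs from 1", [4, 5, 1, 9])):
    X6 = combine(beta, basis)
    I1 = Fraction(alpha[0] * beta[1], alpha[1] * beta[0])
    # Four points are coplanar exactly when this determinant vanishes.
    volume = det([basis[2], basis[3], X5, X6])
    results[label] = (I1, volume)
    print(label, I1, volume)
```

In the first case the determinant vanishes, so the last four points lie in one plane; in the second case it does not.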

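Our next objective is to derive these invariants from image point correspondences. Multiplying each side of (15) by the projection matrices $P_1$, $P_2$, $P_3$, and $P_4$, we have

$$P_i X_5 = \sum_{k=1}^{4} \alpha_k P_i X_k, \qquad P_i X_6 = \sum_{k=1}^{4} \beta_k P_i X_k, \quad i = 1, \dots, 4. \tag{27}$$

That is,

$$\lambda_{i5} x_{i5} = \sum_{k=1}^{4} \alpha_k \lambda_{ik} x_{ik}, \qquad \lambda_{i6} x_{i6} = \sum_{k=1}^{4} \beta_k \lambda_{ik} x_{ik}. \tag{28}$$

Writing $x_{ij} = (u_{ij}, v_{ij}, 1)^T$ and eliminating $\lambda_{i5}$ and $\lambda_{i6}$ from (28), we get

$$\sum_{k=1}^{4} \alpha_k \lambda_{ik} a_{i5k} = 0, \quad \sum_{k=1}^{4} \alpha_k \lambda_{ik} b_{i5k} = 0, \quad \sum_{k=1}^{4} \beta_k \lambda_{ik} a_{i6k} = 0, \quad \sum_{k=1}^{4} \beta_k \lambda_{ik} b_{i6k} = 0, \tag{29}$$

where

$$a_{ijk} = u_{ik} - u_{ij}, \qquad b_{ijk} = v_{ik} - v_{ij}. \tag{30}$$

Rewriting (29) in matrix form, we have

$$M_i \, (\lambda_{i1}, \lambda_{i2}, \lambda_{i3}, \lambda_{i4})^T = 0, \tag{31}$$

where

$$M_i = \begin{pmatrix} \alpha_1 a_{i51} & \alpha_2 a_{i52} & \alpha_3 a_{i53} & \alpha_4 a_{i54} \\ \alpha_1 b_{i51} & \alpha_2 b_{i52} & \alpha_3 b_{i53} & \alpha_4 b_{i54} \\ \beta_1 a_{i61} & \beta_2 a_{i62} & \beta_3 a_{i63} & \beta_4 a_{i64} \\ \beta_1 b_{i61} & \beta_2 b_{i62} & \beta_3 b_{i63} & \beta_4 b_{i64} \end{pmatrix}. \tag{32}$$

Since we know in advance that the systems of equations in (31) have nontrivial solutions, the coefficient matrices in (31) must be rank deficient. That is,

$$\det M_i = 0, \quad i = 1, \dots, 4. \tag{33}$$

Expanding each determinant along its first two rows gives

$$\alpha_3 \alpha_4 \beta_1 \beta_2 T_{i1} + \alpha_2 \alpha_4 \beta_1 \beta_3 T_{i2} + \alpha_2 \alpha_3 \beta_1 \beta_4 T_{i3} + \alpha_1 \alpha_4 \beta_2 \beta_3 T_{i4} + \alpha_1 \alpha_3 \beta_2 \beta_4 T_{i5} + \alpha_1 \alpha_2 \beta_3 \beta_4 T_{i6} = 0. \tag{34}$$

Dividing (34) by $\alpha_2 \alpha_3 \alpha_4 \beta_1^2 / \alpha_1$ and using (16), we obtain a system of four bilinear equations in the variables $I_1$, $I_2$, and $I_3$ of the following form:

$$T_{i1} I_1 + T_{i2} I_2 + T_{i3} I_3 + T_{i4} I_1 I_2 + T_{i5} I_1 I_3 + T_{i6} I_2 I_3 = 0, \quad i = 1, \dots, 4, \tag{35}$$

where

$$\begin{aligned} T_{i1} &= (a_{i53} b_{i54} - a_{i54} b_{i53})(a_{i61} b_{i62} - a_{i62} b_{i61}), \\ T_{i2} &= (a_{i54} b_{i52} - a_{i52} b_{i54})(a_{i61} b_{i63} - a_{i63} b_{i61}), \\ T_{i3} &= (a_{i52} b_{i53} - a_{i53} b_{i52})(a_{i61} b_{i64} - a_{i64} b_{i61}), \\ T_{i4} &= (a_{i51} b_{i54} - a_{i54} b_{i51})(a_{i62} b_{i63} - a_{i63} b_{i62}), \\ T_{i5} &= (a_{i53} b_{i51} - a_{i51} b_{i53})(a_{i62} b_{i64} - a_{i64} b_{i62}), \\ T_{i6} &= (a_{i51} b_{i52} - a_{i52} b_{i51})(a_{i63} b_{i64} - a_{i64} b_{i63}). \end{aligned} \tag{36}$$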
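The bilinear constraint for a single image can be verified exactly. The Python sketch below assumes the invariants are $I_k = \alpha_1\beta_{k+1}/(\alpha_{k+1}\beta_1)$ and that the six coefficients of each image are products of 2 × 2 determinants of image-point differences, laid out as in the implementation of Section 4. The sample points, the two camera matrices, and the function names are hypothetical.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([r[:c] + r[c + 1:] for r in m[1:]])
               for c in range(len(m)))


def coefficients(basis, point):
    """Solve point = sum_k c_k * basis[k] exactly by Cramer's rule."""
    d = det(basis)
    return [Fraction(det(basis[:k] + [point] + basis[k + 1:]), d)
            for k in range(len(basis))]


def project(P, X):
    """Pinhole projection to inhomogeneous image coordinates (u, v)."""
    x = [sum(p * c for p, c in zip(row, X)) for row in P]
    return Fraction(x[0], x[2]), Fraction(x[1], x[2])


def bilinear_coefficients(uv):
    """Six coefficients T1..T6 built from six image points (u, v)."""
    def D(j, p, q):
        # 2x2 determinant of x_p - x_j and x_q - x_j (1-based point labels)
        ap, bp = uv[p - 1][0] - uv[j - 1][0], uv[p - 1][1] - uv[j - 1][1]
        aq, bq = uv[q - 1][0] - uv[j - 1][0], uv[q - 1][1] - uv[j - 1][1]
        return ap * bq - aq * bp
    return [D(5, 3, 4) * D(6, 1, 2), D(5, 4, 2) * D(6, 1, 3), D(5, 2, 3) * D(6, 1, 4),
            D(5, 1, 4) * D(6, 2, 3), D(5, 3, 1) * D(6, 2, 4), D(5, 1, 2) * D(6, 3, 4)]


def residual(T, I):
    """Left-hand side of the bilinear equation for one image."""
    I1, I2, I3 = I
    return (T[0] * I1 + T[1] * I2 + T[2] * I3
            + T[3] * I1 * I2 + T[4] * I1 * I3 + T[5] * I2 * I3)


points = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],
          [1, 1, 1, 1], [2, 3, 5, 1], [3, -1, 4, 1]]
alpha = coefficients(points[:4], points[4])
beta = coefficients(points[:4], points[5])
I = [alpha[0] * beta[k] / (alpha[k] * beta[0]) for k in (1, 2, 3)]

cameras = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]],    # affine camera
    [[2, 1, 0, 1], [0, 3, 1, -1], [1, 1, 1, 5]],   # perspective camera
]
checks = []
for P in cameras:
    T = bilinear_coefficients([project(P, X) for X in points])
    checks.append((sum(T), residual(T, I)))
print(checks)
```

Each pair in `checks` holds the sum of the six coefficients and the residual of the bilinear equation at the invariants computed from the 3D points; both vanish for every camera in this example.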

The system of bilinear equations in (35) can be solved directly by numerical methods. However, nonlinear numerical methods are usually time consuming and sometimes unstable. The classical method of variable elimination through the resultant technique results in a high-order polynomial equation in a single variable, which is not what we want. The main contribution of this paper is to show that the system of nonlinear equations can be solved linearly. This is done by using a modified scheme of variable elimination.

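Now we proceed to derive the linear solution of the system of equations (35). Collecting the terms of (35) with respect to $I_2$, $I_3$, $I_2 I_3$, and $I_1$ and rewriting in matrix form, we obtain

$$\begin{pmatrix} T_{12} + T_{14} I_1 & T_{13} + T_{15} I_1 & T_{16} & T_{11} \\ T_{22} + T_{24} I_1 & T_{23} + T_{25} I_1 & T_{26} & T_{21} \\ T_{32} + T_{34} I_1 & T_{33} + T_{35} I_1 & T_{36} & T_{31} \\ T_{42} + T_{44} I_1 & T_{43} + T_{45} I_1 & T_{46} & T_{41} \end{pmatrix} \begin{pmatrix} I_2 \\ I_3 \\ I_2 I_3 \\ I_1 \end{pmatrix} = 0. \tag{37}$$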

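Since $I_1$, $I_2$, and $I_3$ are nonzero, the determinant of the coefficient matrix in (37) has to be zero. Writing $T_k = (T_{1k}, T_{2k}, T_{3k}, T_{4k})^T$ for the $k$th column of coefficients and $|\cdot|$ for the determinant of the 4 × 4 matrix with the listed columns, we have

$$|T_2 + I_1 T_4, \; T_3 + I_1 T_5, \; T_6, \; T_1| = 0. \tag{38}$$

This is a second-degree polynomial equation in the variable $I_1$. A quadratic equation generally has two solutions. To obtain a unique solution, we have to apply further constraints. It can be checked that

$$T_{i1} + T_{i2} + T_{i3} + T_{i4} + T_{i5} + T_{i6} = 0, \quad i = 1, \dots, 4. \tag{39}$$

Applying constraints (39) to (38), we obtain the following equation:

$$(I_1 - 1) \left( |T_4, T_5, T_6, T_1| \, I_1 + |T_2, \; T_4 + T_5, \; T_6, \; T_1| \right) = 0. \tag{40}$$

The solutions of (40) are

$$I_1 = 1 \quad \text{and} \quad I_1 = -\frac{|T_2, \; T_4 + T_5, \; T_6, \; T_1|}{|T_4, \; T_5, \; T_6, \; T_1|}. \tag{41}$$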
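The reason the quadratic always has the root 1 can be written out as a short worked step. The sketch below assumes the bilinear system is grouped by $I_1$, so that the coefficient matrix has columns $T_2 + I_1 T_4$, $T_3 + I_1 T_5$, $T_6$, and $T_1$, where $T_k$ denotes the column of the $k$th coefficients over the four images and $|\cdot|$ the determinant of the matrix with the listed columns; the names $q$, $A$, $B$, and $C$ are introduced here only for illustration.

```latex
\begin{aligned}
q(I_1) &= |T_2 + I_1 T_4,\; T_3 + I_1 T_5,\; T_6,\; T_1| = A I_1^2 + B I_1 + C, \\
A &= |T_4, T_5, T_6, T_1|, \qquad C = |T_2, T_3, T_6, T_1|, \qquad
B = |T_4, T_3, T_6, T_1| + |T_2, T_5, T_6, T_1|, \\
q(1) &= |T_2 + T_4,\; T_3 + T_5,\; T_6,\; T_1| = 0
\quad \text{because } T_1 = -(T_2 + T_4) - (T_3 + T_5) - T_6, \\
q(I_1) &= (I_1 - 1)(A I_1 - C), \qquad
C = |T_2,\; -(T_4 + T_5),\; T_6,\; T_1| = -\,|T_2,\; T_4 + T_5,\; T_6,\; T_1|.
\end{aligned}
```

The last line uses the zero-sum constraint once more to replace $T_3$, which brings the nontrivial root $C/A$ into a form with a single summed column.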

The solution $I_1 = 1$ corresponds to the condition that four of the 3D points are coplanar. We discard this solution according to the assumption that no four points are coplanar. In this way a unique linear solution for the projective invariant $I_1$ is obtained.

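Now we derive the solution of $I_2$. Here $T_k = (T_{1k}, T_{2k}, T_{3k}, T_{4k})^T$ denotes the $k$th column of coefficients and $|\cdot|$ the determinant of the 4 × 4 matrix with the listed columns. From (35), we can obtain

$$(T_{i1} + T_{i4} I_2) I_1 + (T_{i3} + T_{i6} I_2) I_3 + T_{i5} I_1 I_3 + T_{i2} I_2 = 0, \quad i = 1, \dots, 4. \tag{42}$$

Since $I_1$, $I_2$, and $I_3$ are nonzero, we have

$$|T_1 + I_2 T_4, \; T_3 + I_2 T_6, \; T_5, \; T_2| = 0. \tag{43}$$

Applying constraints (39) to (43), we obtain

$$(I_2 - 1) \left( |T_4, T_6, T_5, T_2| \, I_2 + |T_4 + T_6, \; T_3, \; T_5, \; T_2| \right) = 0. \tag{44}$$

Discarding the solution $I_2 = 1$, the unique solution of $I_2$ is

$$I_2 = -\frac{|T_4 + T_6, \; T_3, \; T_5, \; T_2|}{|T_4, \; T_6, \; T_5, \; T_2|}. \tag{45}$$

Now we derive the solution of $I_3$. From (35), we can obtain

$$(T_{i1} + T_{i5} I_3) I_1 + (T_{i2} + T_{i6} I_3) I_2 + T_{i4} I_1 I_2 + T_{i3} I_3 = 0, \quad i = 1, \dots, 4. \tag{46}$$

Since $I_1$, $I_2$, and $I_3$ are nonzero, we have

$$|T_1 + I_3 T_5, \; T_2 + I_3 T_6, \; T_4, \; T_3| = 0. \tag{47}$$

Applying constraints (39) to (47), we obtain

$$(I_3 - 1) \left( |T_5, T_6, T_4, T_3| \, I_3 + |T_5 + T_6, \; T_2, \; T_4, \; T_3| \right) = 0. \tag{48}$$

Discarding the solution $I_3 = 1$, the unique solution of $I_3$ is

$$I_3 = -\frac{|T_5 + T_6, \; T_2, \; T_4, \; T_3|}{|T_5, \; T_6, \; T_4, \; T_3|}. \tag{49}$$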
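The whole linear solution can be exercised end to end with exact arithmetic. The Python sketch below assumes the invariants $I_k = \alpha_1\beta_{k+1}/(\alpha_{k+1}\beta_1)$, the coefficient layout of the implementation in Section 4, and determinant ratios of the form "minus a determinant with one summed column over a determinant of four plain columns". The sample points, the seeded random integer cameras, and every function name are hypothetical; degenerate camera draws are simply redrawn.

```python
import random
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([r[:c] + r[c + 1:] for r in m[1:]])
               for c in range(len(m)))


def coefficients(basis, point):
    """Solve point = sum_k c_k * basis[k] exactly by Cramer's rule."""
    d = det(basis)
    return [Fraction(det(basis[:k] + [point] + basis[k + 1:]), d)
            for k in range(len(basis))]


def project(P, X):
    """Pinhole projection to inhomogeneous image coordinates (u, v)."""
    x = [sum(p * c for p, c in zip(row, X)) for row in P]
    return Fraction(x[0], x[2]), Fraction(x[1], x[2])


def bilinear_coefficients(uv):
    """Six coefficients T1..T6 built from six image points (u, v)."""
    def D(j, p, q):
        ap, bp = uv[p - 1][0] - uv[j - 1][0], uv[p - 1][1] - uv[j - 1][1]
        aq, bq = uv[q - 1][0] - uv[j - 1][0], uv[q - 1][1] - uv[j - 1][1]
        return ap * bq - aq * bp
    return [D(5, 3, 4) * D(6, 1, 2), D(5, 4, 2) * D(6, 1, 3), D(5, 2, 3) * D(6, 1, 4),
            D(5, 1, 4) * D(6, 2, 3), D(5, 3, 1) * D(6, 2, 4), D(5, 1, 2) * D(6, 3, 4)]


def recover_invariants(T):
    """T holds one row of six coefficients per image (four images)."""
    def col(*ks):
        # per-image sum of the selected 1-based coefficient columns
        return [sum(row[k - 1] for k in ks) for row in T]

    def ratio(num_cols, den_cols):
        num = [list(r) for r in zip(*num_cols)]
        den = [list(r) for r in zip(*den_cols)]
        return -Fraction(det(num), det(den))

    I1 = ratio([col(2), col(4, 5), col(6), col(1)], [col(4), col(5), col(6), col(1)])
    I2 = ratio([col(4, 6), col(3), col(5), col(2)], [col(4), col(6), col(5), col(2)])
    I3 = ratio([col(5, 6), col(2), col(4), col(3)], [col(5), col(6), col(4), col(3)])
    return [I1, I2, I3]


points = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1],
          [1, 1, 1, 1], [2, 3, 5, 1], [3, -1, 4, 1]]
alpha = coefficients(points[:4], points[4])
beta = coefficients(points[:4], points[5])
truth = [alpha[0] * beta[k] / (alpha[k] * beta[0]) for k in (1, 2, 3)]

rng = random.Random(0)
while True:
    cameras = [[[rng.randint(-9, 9) for _ in range(4)] for _ in range(3)]
               for _ in range(4)]
    try:
        T = [bilinear_coefficients([project(P, X) for X in points]) for P in cameras]
        recovered = recover_invariants(T)
        break
    except ZeroDivisionError:
        continue  # zero depth or singular denominator: redraw the cameras

print(recovered == truth)
```

Only the image coordinates enter `recover_invariants`; the 3D points are used solely to produce the reference values in `truth`.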

4. Implementation of the Algorithm

We have validated the proposed method in Mathematica. The implementation is very simple. The code is given in Algorithm 1.

(* Random 3D points (rows of X) and four random 3 x 4 camera matrices *)
X = RandomReal[{-1000, 1000}, {6, 4}];
M = RandomReal[{-1, 1}, {4, 3, 4}];
(* Work arrays; the random values only allocate storage and are overwritten *)
T = RandomReal[{0, 1}, {4, 6}];
x = RandomReal[{0, 1}, {4, 6, 3}];
a = RandomReal[{0, 1}, {4, 6, 4}];
b = RandomReal[{0, 1}, {4, 6, 4}];
u = RandomReal[{0, 1}, {4, 6}];
v = RandomReal[{0, 1}, {4, 6}];
(* Set the homogeneous coordinate of every 3D point to 1 *)
X[[1, 4]] = 1;
X[[2, 4]] = 1;
X[[3, 4]] = 1;
X[[4, 4]] = 1;
X[[5, 4]] = 1;
X[[6, 4]] = 1;
(* Coefficients of points 5 and 6 in the basis formed by points 1 to 4 *)
XT = Transpose[{X[[1]], X[[2]], X[[3]], X[[4]]}];
A = LinearSolve[XT, X[[5]]];
B = LinearSolve[XT, X[[6]]];
Inv1 = (A[[1]] B[[2]])/(A[[2]] B[[1]]);
Inv2 = (A[[1]] B[[3]])/(A[[3]] B[[1]]);
Inv3 = (A[[1]] B[[4]])/(A[[4]] B[[1]]);
Print["The three invariants computed from 3D point locations: ", Inv1, " ", Inv2, " ", Inv3];
(* Project every point into every image and normalize to (u, v) *)
For[i = 1, i <= 4, i++, For[j = 1, j <= 6, j++,
x[[i, j]] = M[[i]] . X[[j]];
u[[i, j]] = x[[i, j, 1]]/x[[i, j, 3]];
v[[i, j]] = x[[i, j, 2]]/x[[i, j, 3]];
]];
For[i = 1, i <= 4, i++, For[j = 5, j <= 6, j++,
a[[i, j, 1]] = u[[i, 1]] - u[[i, j]];
a[[i, j, 2]] = u[[i, 2]] - u[[i, j]];
a[[i, j, 3]] = u[[i, 3]] - u[[i, j]];
a[[i, j, 4]] = u[[i, 4]] - u[[i, j]];
b[[i, j, 1]] = v[[i, 1]] - v[[i, j]];
b[[i, j, 2]] = v[[i, 2]] - v[[i, j]];
b[[i, j, 3]] = v[[i, 3]] - v[[i, j]];
b[[i, j, 4]] = v[[i, 4]] - v[[i, j]];
]];
For[i = 1, i <= 4, i++,
T[[i, 1]] = (a[[i, 5, 3]] b[[i, 5, 4]] - a[[i, 5, 4]] b[[i, 5, 3]]) *
  (a[[i, 6, 1]] b[[i, 6, 2]] - a[[i, 6, 2]] b[[i, 6, 1]]);
T[[i, 2]] = (a[[i, 5, 4]] b[[i, 5, 2]] - a[[i, 5, 2]] b[[i, 5, 4]]) *
  (a[[i, 6, 1]] b[[i, 6, 3]] - a[[i, 6, 3]] b[[i, 6, 1]]);
T[[i, 3]] = (a[[i, 5, 2]] b[[i, 5, 3]] - a[[i, 5, 3]] b[[i, 5, 2]]) *
  (a[[i, 6, 1]] b[[i, 6, 4]] - a[[i, 6, 4]] b[[i, 6, 1]]);
T[[i, 4]] = (a[[i, 5, 1]] b[[i, 5, 4]] - a[[i, 5, 4]] b[[i, 5, 1]]) *
  (a[[i, 6, 2]] b[[i, 6, 3]] - a[[i, 6, 3]] b[[i, 6, 2]]);
T[[i, 5]] = (a[[i, 5, 3]] b[[i, 5, 1]] - a[[i, 5, 1]] b[[i, 5, 3]]) *
  (a[[i, 6, 2]] b[[i, 6, 4]] - a[[i, 6, 4]] b[[i, 6, 2]]);
T[[i, 6]] = (a[[i, 5, 1]] b[[i, 5, 2]] - a[[i, 5, 2]] b[[i, 5, 1]]) *
  (a[[i, 6, 3]] b[[i, 6, 4]] - a[[i, 6, 4]] b[[i, 6, 3]]);
];
(* Each invariant is the root other than 1 of a quadratic; row i uses image i *)
I1 = -Det[{
{T[[1, 2]], T[[1, 4]] + T[[1, 5]], T[[1, 6]], T[[1, 1]]},
{T[[2, 2]], T[[2, 4]] + T[[2, 5]], T[[2, 6]], T[[2, 1]]},
{T[[3, 2]], T[[3, 4]] + T[[3, 5]], T[[3, 6]], T[[3, 1]]},
{T[[4, 2]], T[[4, 4]] + T[[4, 5]], T[[4, 6]], T[[4, 1]]}
}]/Det[{
{T[[1, 4]], T[[1, 5]], T[[1, 6]], T[[1, 1]]},
{T[[2, 4]], T[[2, 5]], T[[2, 6]], T[[2, 1]]},
{T[[3, 4]], T[[3, 5]], T[[3, 6]], T[[3, 1]]},
{T[[4, 4]], T[[4, 5]], T[[4, 6]], T[[4, 1]]}
}];
I2 = -Det[{
{T[[1, 4]] + T[[1, 6]], T[[1, 3]], T[[1, 5]], T[[1, 2]]},
{T[[2, 4]] + T[[2, 6]], T[[2, 3]], T[[2, 5]], T[[2, 2]]},
{T[[3, 4]] + T[[3, 6]], T[[3, 3]], T[[3, 5]], T[[3, 2]]},
{T[[4, 4]] + T[[4, 6]], T[[4, 3]], T[[4, 5]], T[[4, 2]]}
}]/Det[{
{T[[1, 4]], T[[1, 6]], T[[1, 5]], T[[1, 2]]},
{T[[2, 4]], T[[2, 6]], T[[2, 5]], T[[2, 2]]},
{T[[3, 4]], T[[3, 6]], T[[3, 5]], T[[3, 2]]},
{T[[4, 4]], T[[4, 6]], T[[4, 5]], T[[4, 2]]}
}];
I3 = -Det[{
{T[[1, 5]] + T[[1, 6]], T[[1, 2]], T[[1, 4]], T[[1, 3]]},
{T[[2, 5]] + T[[2, 6]], T[[2, 2]], T[[2, 4]], T[[2, 3]]},
{T[[3, 5]] + T[[3, 6]], T[[3, 2]], T[[3, 4]], T[[3, 3]]},
{T[[4, 5]] + T[[4, 6]], T[[4, 2]], T[[4, 4]], T[[4, 3]]}
}]/Det[{
{T[[1, 5]], T[[1, 6]], T[[1, 4]], T[[1, 3]]},
{T[[2, 5]], T[[2, 6]], T[[2, 4]], T[[2, 3]]},
{T[[3, 5]], T[[3, 6]], T[[3, 4]], T[[3, 3]]},
{T[[4, 5]], T[[4, 6]], T[[4, 4]], T[[4, 3]]}
}];
Print["The three invariants computed from 2D projections: ", I1, " ", I2, " ", I3];

5. Conclusions

We have presented a direct and linear method for computing projective invariants of six 3D points from four 2D projection images. It can be used directly for 3D point pattern recognition from 2D images. Traditional methods for solving this problem are nonlinear and very complicated to use in real applications. The proposed formulas are clear and easy for ordinary users to implement. Another feature of our method is that we compute the projective invariants using only the original data. It has been noted that transformations of the original data can amplify the noise level of the data. This study provides a deeper understanding of the structure and motion problem. It seems that the natural configuration of the projective reconstruction problem is six points and four images.

Future directions of research include using this method in iterative or minimization schemes to solve the projective reconstruction problem with noisy data, missing data, or outliers. It is also possible to develop similar methods for the cases of seven points in three images and eight points in two images.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation for Distinguished Young Scholars of China under Grant nos. 61225012 and 71325002; the Specialized Research Fund of the Doctoral Program of Higher Education for the Priority Development Areas under Grant no. 20120042130003; the Specialized Research Fund for the Doctoral Program of Higher Education under Grant no. 20110042110024; the Fundamental Research Funds for the Central Universities under Grant nos. N110204003 and N120104001.