8
Factorization with Missing and Noisy Data Carme Juli`a, Angel Sappa, Felipe Lumbreras, Joan Serrat, and Antonio L´opez Computer Vision Center and Computer Science Department, Universitat Aut`onoma de Barcelona, 08193 Bellaterra, Spain {cjulia,asappa,felipe,joans,antonio}@cvc.uab.es Abstract. Several factorization techniques have been proposed for tack- ling the Structure from Motion problem. Most of them provide a good solution, while the amount of missing data is within an acceptable ratio. Focussing on this problem, we propose an incremental multiresolution scheme able to deal with a high rate of missing data, as well as noisy data. It is based on an iterative approach that applies a classical fac- torization technique in an incrementally reduced space. Information re- covered following a coarse-to-fine strategy is used for both, filling in the missing entries of the input matrix and denoising original data. A sta- tistical study of the proposed scheme in front of a classical factorization technique is given. Experimental results obtained with synthetic data and real video sequences are presented to demonstrate the viability of the proposed approach. 1 Introduction Structure From Motion (SFM) consists in extracting the 3d shape of a scene as well as the camera motion from trajectories of tracked features. Factorization is a method addressing this problem and the Singular Value Decomposition (SVD) is generally used when there are not missing entries. Unfortunately, in most of the real cases not all the data points are available, hence other decomposition methods need to be used. The central idea of all factorization techniques is to express a matrix of trajectories W as the product of two unknown matrices, namely, the 3d object’s shape S and the relative camera pose at each frame M : W 2f ×p = M 2f ×4 S 4×p , where f and p are the number of frames and feature points respectively. An affine camera model is assumed in this paper. Although unknown, these factors can be estimated thanks to the key result that their rank is small and due to constraints derived from the orthonormality of the camera axes. Working with non degenerate cases, the rank of the measurement matrix W is 4. In the seminal approach Tomasi and Kanade [1] propose an initialization method in which they first decompose the largest full submatrix by the factor- ization method and then the initial solution grows by one row or by one column at a time, unveiling missing data. The problem is that finding the largest full submatrix of a matrix with missing entries is a NP-hard problem.

Factorization with Missing and Noisy Data

  • Upload
    uab

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Factorization with Missing and Noisy Data

Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat, and Antonio Lopez

Computer Vision Center and Computer Science Department,Universitat Autonoma de Barcelona,

08193 Bellaterra, Spain{cjulia,asappa,felipe,joans,antonio}@cvc.uab.es

Abstract. Several factorization techniques have been proposed for tack-ling the Structure from Motion problem. Most of them provide a goodsolution, while the amount of missing data is within an acceptable ratio.Focussing on this problem, we propose an incremental multiresolutionscheme able to deal with a high rate of missing data, as well as noisydata. It is based on an iterative approach that applies a classical fac-torization technique in an incrementally reduced space. Information re-covered following a coarse-to-fine strategy is used for both, filling in themissing entries of the input matrix and denoising original data. A sta-tistical study of the proposed scheme in front of a classical factorizationtechnique is given. Experimental results obtained with synthetic dataand real video sequences are presented to demonstrate the viability ofthe proposed approach.

1 Introduction

Structure From Motion (SFM) consists in extracting the 3d shape of a scene aswell as the camera motion from trajectories of tracked features. Factorization isa method addressing this problem and the Singular Value Decomposition (SVD)is generally used when there are not missing entries. Unfortunately, in most ofthe real cases not all the data points are available, hence other decompositionmethods need to be used.

The central idea of all factorization techniques is to express a matrix oftrajectories W as the product of two unknown matrices, namely, the 3d object’sshape S and the relative camera pose at each frame M : W2f×p = M2f×4S4×p,where f and p are the number of frames and feature points respectively. Anaffine camera model is assumed in this paper. Although unknown, these factorscan be estimated thanks to the key result that their rank is small and due toconstraints derived from the orthonormality of the camera axes. Working withnon degenerate cases, the rank of the measurement matrix W is 4.

In the seminal approach Tomasi and Kanade [1] propose an initializationmethod in which they first decompose the largest full submatrix by the factor-ization method and then the initial solution grows by one row or by one columnat a time, unveiling missing data. The problem is that finding the largest fullsubmatrix of a matrix with missing entries is a NP-hard problem.

2

Jacobs [2] treats each column with missing entries as an affine subspaceand shows that for every r-tuple of columns the space spanned by all possiblecompletions of them must contain the column space of the completely filledmatrix. Unknown entries are recovered by finding the least squares regressiononto that subspace. However it is strongly affected by noise on the data.

An incremental SVD scheme of incomplete data is proposed by Brand [3].The main drawback of that technique is that the final result depends on the orderin which the data are encountered. Brandt [4] proposes a different technique thataddresses missing data by means of an EM algorithm.

A method for recovering the most reliable imputation, addressing the SFMproblem, is provided by Suter and Chen [5]. They propose an iterative algorithmto employ this criterion to the problem of missing data. Their aim is not to obtainthe factors M and S, but the projection onto a low rank matrix to reduce noiseand to fill in missing data.

A different approach to deal with missing data is the Alternation technique.Since Wiberg [6] introduced it to solve the factorization problem for missing data,many variants have been proposed in the literature. In [7], Buchanan resumemost of these methods and proposes the Alternation/Damped Newton Hybrid,which combines the Alternation strategy with the Newton method.

One disadvantage of the above methods is that the solution depends on thepercentage of missing data. They give a good factorization while the amountof missing points is reduced, which is not common in real image sequences,unfortunately. Additionally to this problem, when real sequences are considered,the presence of noisy data needs to be taken into account in order to evaluatethe performance of the factorization technique.

Addressing to these problems, we propose to use an iterative multiresolutionscheme, which incrementally fill in missing data. A statistical study of the per-formance of the proposed scheme is carried out considering different percentagesof missing and noisy data. The key point of the implemented approach is to workwith a reduced set of feature points along a few number of consecutive frames.Thus, the 3d reconstruction corresponding to the selected feature points and thecamera motion of the used frames are obtained. Missing entries of the trajectorymatrix are recovered, while, at the same time, noisy data are filtered.

This paper is organized as follows. Section 2 contains a brief review of Al-ternation factorization techniques for the case where there are missing data.Section 3 presents the incremental multiresolution scheme used to factorize amatrix of trajectories that has a large amount of missing data. The error func-tion is defined in section 4. Section 5 contains results obtained with syntheticand real data. Conclusions and future work are given in section 6.

2 Alternation techniques

Alternation is a strategy that iteratively solves one variable at each time. Wiberg [6]introduced these methods to solve the factorization problem when some data aremissed. Given a matrix of trajectories W2f×p, the goal is to find the best rank

3

r approximating matrix, where r < 2f, p. That is, to compute matrix factors Mand S such that minimize the cost function:

‖W −MS‖ (1)

In the case of missing data, only available elements of W are considered. Thealgorithm starts with an initial random 2f × r matrix M0 and repeats the nextsteps until the product MkSk converges:

Sk = (M tk−1Mk−1)−1M t

k−1W (2)

Mk = WSk(StkSk)−1 (3)

As it is pointed out in [8], the most important advantage of this 2-step algo-rithm is that these equations are the matrix versions of the normal equations.That is, each Mk and Sk is the least-squares solution of a set of equations ofthe form W = MS. Besides, since the updates of M given S (and analogouslyin the case of S given M) can be independently done for each row of S, missingentries in W correspond to omitted equations. Due to that fact, with too fewdata points the method would fail to converge or converge very slowly, but thishappens only with large amounts of missing data.

In the application of affine SFM, the last row of S should be filled with onesS =

[X 1

]t, where X are the 3d recovered coordinates of the feature points.Through the paper, an Alternation with motion constraints (AM) approach

has been used. Our AM implementation is based on the Alternation for SFM andadditionally, at each iteration k, it is used the fact that M is the motion matrix—the camera pose at each frame. Therefore, given Mk−1, and before computingSk, we impose the orthonormality of the camera axes at each frame.

3 Proposed Approach

We propose an iterative multiresolution approach using the AM previously de-scribed. We will refer it as Incremental Alternation with Motion constraints(IAM). Essentially, our basic idea is to generate sub-matrices with a reduceddensity of missing points. Thus, the AM could be used for factoring these sub-matrices and recovering their corresponding 3d shape and motion. The proposedtechnique consists of two stages, which are fully explained below.

3.1 Observation Matrix Splitting

Let W2f×p be the observation matrix of p feature points tracked over f framescontaining missing entries. The first stage consists in splitting this matrix ina uniformly distributed set of k × k non-overlapped sub-matrices, each one ofthem defined as Wk(i,j), i ∈ (0, b 2f

k c], j ∈ (0, b pk c]. For the sake of presentation

simplicity, hereinafter a sub-matrix in the current iteration level k will be referredas Wk (assuming k > 1, since k = 1 is simply the AM method).

4

Different strategies were tested in order to compute in a fast and robust waysub-matrices with a reduced density of missing entries (e.g. quadtrees, ternarygraph structure), but they do not give the desired and necessary properties ofoverlapping. Finally, a two step process has been chosen.

Although the idea is to focus the process in a small area (sub-matrix Wk),with a reduced density of missing data, recovering information from a smallpatch can be easily affected from noisy data, as it is pointed out in [5]. Hencethere is a trade off between the size of a sub-matrix and the confidence of itsrecovered data. Therefore, in order to improve the confidence of recovered data amultiresolution approach is proposed. In a second step four W2k overlapped sub-matrices, with twice the size of Wk are computed as illustrated in Fig.1. The ideaof this enlargement process is to study the behavior of feature points containedin Wk when a bigger region is considered. W2k matrices are only generated whenk > 2.

Since generating four W2k, for every Wk, is a computationally expensive task,a simple and more direct approach is followed. It consists in splitting the originalW matrix in four different ways, by shifting W2k half of its size (i.e., Wk) throughrows, columns or both at the same time, see Fig. 1 (right).

Fig. 1. W2k overlapped matrices of the observation matrix W , computed during thefirst stage (section 3.1), at iteration k = 6.

3.2 Sub-matrices Processing

At this stage, the objective is to recover missing data by applying AM at everysingle sub-matrix—independently of their size hereinafter sub-matrices will bereferred as Wi.

Given a sub-matrix Wi, the AM gives its corresponding Mi and Si matriceswhose product is used for computing an approximation error εi (1). In case theresulting error is smaller than a given threshold σ, recovered data are associatedwith a weighting factor, defined as 1

εi, in order to measure the goodness of that

value. These weighting factors are later on used for filling, those missing data, inthe initial matrix W . Otherwise, the resulting error is higher than σ, computeddata are discarded.

5

Finally, when every sub-matrix Wi has been processed, recovered missingdata are used for filling in the initial matrix W . In case a missing data has beenrecovered from more than one sub-matrix (overlapped regions for k > 2), thoserecovered data are merged by using their corresponding normalized weightingfactors. On the contrary, when a missing data has been recovered from only onesub-matrix, this value is directly used for filling in that position.

Once recovered missing data were used for filling in the original matrix W ,the iterative process starts again (section 3.1) splitting the new matrix W (theoriginal one plus recovered data) either by increasing k one unit or, in case thesize of sub-matrices Wk at the new iteration stage is quite small, by settingk = 2. This iterative process is applied until one of the following conditions istrue: a) the amount of recovered missing data is large enough for computingmatrices M and S from the filled matrix W ; b) at the current iteration nomissing data were recovered. The smaller Wk size was set to 5×5.

4 The Error Function

As it is pointed out in [5], the error function presented in section 2 could beambiguous and in some cases contradictory. That is because that formula onlytakes into account the recovered values corresponding to known features, but itignores how the rest of entries are filled. Therefore in order to compare the pro-posed IAM scheme with the classical AM, an error function that considers all thefeatures of the sequence, including missing points, is proposed. Unfortunately,this is only possible when we have access to the whole information. Hence, whenreal data are considered, in order to perform a comparison by using the proposederror function, a full matrix should be selected. This matrix is used as input andmissing data are randomly removed.

Since the aim is also to study how the data are filtered, the entries of thematrix of trajectories given by the product MS are compared with the corre-sponding elements in the original matrix W . Therefore, the error function isdefined as:

‖W −MS‖ =∑

i,j

|Wij − (MS)ij |2 ∀i, j (4)

5 Experiments

As it was previously mentioned we want to do an statistical study about thefiltering capability of the IAM technique in front of the classical AM. At thesame time, the robustness to missing data will be considered. In order to performa comparison that help us to infer some conclusions, different levels of Gaussiannoise—standard deviation with values from 1

8 to 1, both in the synthetic and realcases—are added into the 2d feature point trajectories and different amounts ofmissing data are considered—from 20% up to 80%.

6

Statistical results are obtained by applying: a) AM over the given matrix Wand b) AM after filling in missing data with the proposed IAM. Although thisstrategy consists of two parts (IAM+AM), for simplicity, we referred it as IAM.

For each setting (level of noise, amount of missing data) 100 attempts arerepeated and the number of convergent cases—those in which the error valueis smaller than a threshold—obtained with each approach is computed. Thethreshold is defined for that particular setting, by the mean of the inliers errorvalues ε, which are defined as |ε| < q3 +1.5∆q—where q3 is the value of the thirdquartile and ∆q is the interquartile distance q3 − q1.

Experiments using both synthetic and real data are presented below.

5.1 Synthetic Object

Synthetic data were generated by randomly distributing 35 3D feature pointsover the whole surface of a cylinder. The cylinder is defined by a radius of 100and a height of 400; it rotates over its axis. The corresponding measurementmatrices are directly obtained using different number of frames. Hence thesematrices have a banded structure.

As it is shown in Fig. 2 (left), the case of free-noisy data differs consider-ably from the others. In particular, for the IAM approach, a high average ofconvergent cases is obtained, no matter the ratio of missing data.

The fact that the size of the used matrices are small does not help to see thedenoising capability of both IAM and AM.

20 30 40 50 60 70 800

10

20

30

40

50

60

70

80

90

100

%missing data

Num

ber

of e

xper

imen

ts th

at c

onve

rges

Ratio of convergence for different percentages of missing data (m.d.)

AM, no noiseIAM, no noiseAM, std = 1/8IAM, std = 1/8AM, std = 2/3IAM, std = 2/3

0 0.2 0.4 0.6 0.8 10

10

20

30

40

50

60

70

80

90

100

standard deviation (std)

Num

ber

of e

xper

imen

ts th

at c

onve

rges

Ratio of convergence for different standard deviation values

AM, 20% m.d.IAM, 20% m.d.AM, 40% m.d.IAM, 40% m.d.AM, 70% m.d.IAM, 70% m.d.

Fig. 2. Synthetic case. (left) Ratio of convergence for the different percentages of miss-ing data, fixing different sigma values. (right) The same for different deviation values,fixing different percentages of missing data. Zero standard deviation means no noisydata.

7

5.2 Real Object

Experimental results with a real video sequence of 101 frames with a resolutionof 640 × 480 pixels are presented. The studied object is shown in Fig. 3 (left).Different ratios of missing data are obtained by removing data randomly; theremoved data are recorded in order to compute the defined error value (4).

Feature points are selected by means of a corner detector algorithm and onlypoints over the object to be studied were considered (concretely, 87 points aretracked). A single rotation around a vertical axis was performed. After selectinga set of feature points over the object to be studied an iterative feature trackingalgorithm has been used. More details about corner detection and tracking algo-rithm can be found in [9]. Fig. 3 (middle) shows feature point trajectories usedto fill in the matrix W and Fig. 3 (right) is an example of the 3d reconstructionobtained applying IAM.

250 300 350 400 450 500 55050

100

150

200

250

300

50100

150200

250

340

360

380

400

420

440

460

480

100

200

Fig. 3. (left) Object used to test the proposed approach. (middle) Feature point tra-jectories from the previous object. (right) Its corresponding 3d reconstruction.

20 30 40 50 60 70 800

10

20

30

40

50

60

70

80

90

100

%missing data

Num

ber

of e

xper

imen

ts th

at c

onve

rges

Ratio of convergence for different percentages of missing data (m.d.)

AM, no noiseIAM, no noiseAM, std = 1/6IAM, std = 1/6AM, std = 2/3IAM, std = 2/3

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 10

10

20

30

40

50

60

70

80

90

100

standard deviation (std)

Num

ber

of e

xper

imen

ts th

at c

onve

rges

Ratio of convergence for different standard deviation values

AM, 20% m.d.IAM, 20% m.d.AM, 40% m.d.IAM, 40% m.d.AM, 70% m.d.IAM, 70% m.d.

Fig. 4. Real case. (left) Ratio of convergence for the different percentages of missingdata, fixing different sigma values. (right) The same for different deviation values, fixingdifferent percentages of missing data. Zero standard deviation means no noisy data.

8

As it is shown in Fig. 4 (right), the number of convergent cases is in generalhigher applying IAM and although the results with this approach are also betterin the case of free-noisy data, the results do not differs so substantially from thenoisy data case as in the synthetic experiment. In fact, notice that we work witha real sequence, hence noisy data are already included into the original datapoints.

6 Conclusions and Future Work

This paper presents an efficient technique for tackling the SFM problem whena high ratio of missing and noisy data is considered. The proposed approachexploits the simplicity of an Alternation technique by means of an iterativescheme. Missing data are incrementally recovered improving the final results.Noise have been added to the data and a statistical study about the filteringcapability of AM in front of our incremental strategy have been done. It hasbeen shown that, in most of the cases, results of IAM are better than the onesof AM in the sense of number of convergent cases.

In the future, we would like to use our icremental multiresolution schemewith other classical factorization techniques. Additionally, other functions thatconsider the goodness of the obtained M and S and not only the recoveredelements of W will be studied.

References

1. Tomasi, C., Kanade, T.: Shape and motion from image streams: a factorizationmethod. Full report on the orthographic case (1992)

2. Jacobs, D.: Linear fitting with missing data for structure-from-motion. Computervision and image understanding, CVIU (2001) 7–81

3. Brand, M.: Incremental singular value decomposition of uncertain data with missingvalues. In: Proceedings, European Conference on Computer Vision. (2002) 707–720

4. Brandt, S.: Closed-form solutions for affine reconstruction under missing data. In:Proceedings Statistical Methods for Video Processing Workshop, in conjunctionwith ECCV. (2002) 109–114

5. Chen, P., Suter, D.: Recovering the missing components in a large noisy low-rankmatrix: Application to sfm. IEEE Transactions on Pattern Analysis and MachineIntelligence 26 (2004)

6. Wiberg, T.: Computation of principal components when data is missing. In: Pro-ceedings of the Second Symposium of Computational Statistics. (1976) 229–326

7. Buchanan, A.: Investigation into matrix factorization when elements are unknown.Technical report (2004)

8. Hartley, R., Schaffalitzky, F.: Powerfactorization: 3d reconstruction with missing oruncertain data. Australian-Japan advanced workshop on Computer Vision (2003)

9. Ma, Y., Soatto, J., Koseck, J., Sastry, S.: An invitation to 3d vision: From imagesto geometric models. Springer-Verlang, New York (2004)