Least Squares Fitting: Noisy Data

Very often data contains a significant amount of noise. The least squares approximation is particularly well-suited to representing noisy data. The next example shows the effect noise can have and how least squares is used.

Traffic flow model:

Suppose that, for planning purposes, you are interested in the time it takes to travel on a certain section of highway. According to theory, assuming up to a moderate amount of traffic, the time should be approximately:

T(x) = ax + b

where b is the travel time when there is no other traffic, and x is the current number of cars on the road (in hundreds). To determine the coefficients a and b you could run several experiments, each consisting of driving the highway at a different time of day while estimating the number of cars on the road using a counter. Of course both of these measurements will contain noise, that is, random fluctuations.

We could simulate such data in Matlab as follows:

> x = 1:.1:6;
> T = .1*x + 1;
> Tn = T + .1*randn(size(x));
> plot(x,Tn,'*')

The data should look like it lies on a line, but with noise. In the figure window, click on the Tools menu and choose Basic fitting. Then select a linear fit. The resulting line should go through the data in what looks like a very reasonable way. Click on 'show equations'. Compare the equation with T(x) = .1x + 1. The coefficients should be pretty close, considering the amount of noise in the plot.
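The same linear fit can also be computed from the command line with polyfit, which performs a least squares polynomial fit of the given degree. A minimal sketch (the exact coefficients you get depend on the random noise, so the values are indicative only):

```matlab
% Simulate the noisy travel-time data from above.
x = 1:.1:6;
T = .1*x + 1;                 % true model: T(x) = .1x + 1
Tn = T + .1*randn(size(x));   % add random measurement noise

% Least squares fit of a degree-1 polynomial (a line).
p = polyfit(x, Tn, 1);        % p(1) is the slope a, p(2) the intercept b

% To visualize, plot the data and the fitted line together:
% plot(x, Tn, '*', x, polyval(p, x), '-')
```

The slope p(1) should come out near .1 and the intercept p(2) near 1, just as the Basic fitting tool reports.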

Next, try to fit the data with a spline. The result should be ugly. We can see from this example that splines are not suited to noisy data.

How does Matlab obtain such a nice line to approximate noisy data? The answer is a very standard numerical/statistical method known as least squares.

Linear least squares:

Suppose, as in the previous example, that we wish to fit a line to a lot of data that does not exactly lie on a line. For the equation of the line we have only two free coefficients, but we have many data points. We cannot possibly make the line go through every data point; we can only hope for it to come reasonably close to as many data points as possible. Thus our line has an error with respect to each data point. If l(x) is our line and {(xi, yi)} are the data, then:

ei = yi − l(xi)

is the error of l with respect to each (xi, yi). To make l(x) reasonable, we wish to simultaneously minimize all the errors: {e1, e2, . . . , en}. There are many possible ways one could go about this, but the standard one is to minimize the sum of the squares of the errors. That is, we denote by ε the sum of the squares:

ε = e1² + e2² + · · · + en² = Σi (yi − (a xi + b))²

In the above expression the xi and yi are given, but we are free to choose a and b, so we can think of ε as a function of a and b, i.e. E(a, b). In calculus, to find a minimum value of a function of two variables, we set the partial derivatives equal to zero:

∂E/∂a = −2 Σi xi (yi − (a xi + b)) = 0
∂E/∂b = −2 Σi (yi − (a xi + b)) = 0

We can simplify these equations to obtain:

(Σi xi²) a + (Σi xi) b = Σi xi yi
(Σi xi) a + n b = Σi yi

Thus the whole problem reduces to a 2 by 2 linear system for the coefficients a and b. The entries in the matrix are determined from simple formulas using the data. The process is quick and easily automated, which is one reason it is so standard.
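As a sketch of the process, the 2 by 2 system can be assembled and solved directly, and the answer agrees with Matlab's built-in polyfit (the data here is the simulated traffic data from above):

```matlab
% Noisy data on a line, as in the traffic example.
x = 1:.1:6;
y = .1*x + 1 + .1*randn(size(x));
n = length(x);

% Assemble the 2 x 2 normal equations for the line a*x + b:
%   (sum xi^2) a + (sum xi) b = sum xi*yi
%   (sum xi)   a +        n b = sum yi
A = [sum(x.^2) sum(x); sum(x) n];
r = [sum(x.*y); sum(y)];
c = A\r;                  % c(1) = a, c(2) = b

% polyfit solves the same least squares problem.
p = polyfit(x, y, 1);     % p(1) = a, p(2) = b
```

The two answers should agree to rounding error, since they minimize the same sum of squares.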

We could use the same process to obtain a quadratic or higher-order polynomial fit to data. If we try to fit a degree-n polynomial, the software has to solve an (n + 1) × (n + 1) linear system, which is easily done.

This is what Matlab's basic fitting tool uses to find a degree-n polynomial fit whenever the number of data points is more than n + 1.

Drag coefficients:

Drag due to air resistance is proportional to the square of the velocity, i.e. d = kv². In a wind tunnel experiment the velocity v can be varied by setting the speed of the fan, and the drag can be measured directly (it is the force on the object). As in every experiment, some random noise will occur. The following series of commands simulates the data one might obtain from a wind tunnel:

> v = 0:1:60;
> d = .1234*v.^2;
> dn = d + .4*v.*randn(size(v));
> plot(v,dn,'*')

The plot should look like a quadratic, but with some noise. Using the Tools menu, add a quadratic fit and enable the 'show equations' option. What is the coefficient of x²? How close is it to 0.1234?
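The GUI's quadratic fit can be reproduced at the command line with polyfit of degree 2. A sketch, with the caveat that the recovered coefficients vary with the noise:

```matlab
% Simulated wind tunnel data, as above.
v = 0:1:60;
d = .1234*v.^2;
dn = d + .4*v.*randn(size(v));

% Degree-2 least squares fit: p(1)*v^2 + p(2)*v + p(3).
p = polyfit(v, dn, 2);
% p(1) should be near .1234; p(2) and p(3) are artifacts of the noise.
```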

Note that whenever you choose a polynomial fit in Matlab with degree less than n − 1 (where n is the number of data points), Matlab will produce a least squares fit.

You will notice that the quadratic fit includes both a constant and a linear term. We know from the physical situation that these should not be there; they are artifacts of the noise and the fitting process. Since we know the curve should be kv², we can do better by using that knowledge.

For example, we know that the graph of d versus v² should be a straight line. We can produce this plot easily:

> vs = v.^2;
> plot(vs,dn,'*')

By changing the independent variable from v to v², we produced a plot that looks like a line with noise. Add a linear fit. What is the linear coefficient? It should be closer to 0.1234 than the one from the quadratic fit.
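The fit in the transformed variable can likewise be done from the command line; a sketch:

```matlab
% Wind tunnel data, then change the independent variable to v^2.
v = 0:1:60;
dn = .1234*v.^2 + .4*v.*randn(size(v));
vs = v.^2;

% Linear least squares fit of dn against vs = v^2.
p = polyfit(vs, dn, 1);   % p(1) estimates k; p(2) is a leftover constant
```

The slope p(1) should land close to .1234, typically closer than the leading coefficient of the quadratic fit.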

The second fit still has a constant term, which we know should not be there. If there were no noise, then at every data point we would have k = d/v². We can express this as a linear system vs'*k = dn', which is overdetermined since there are more equations than unknowns. Because of the noise, every point gives a different estimate for k; in other words, the overdetermined linear system is also inconsistent. When Matlab encounters such a system, it automatically gives a least squares solution of the matrix problem, i.e. one that minimizes the sum of the squared errors, which is exactly what we want. To get the least squares approximation for k, do

> k = vs'\dn'

This will produce a number close to .1234.
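For a single unknown the least squares solution also has a simple closed form, k = Σ(vsi·dni) / Σ vsi², and the backslash result matches it. A sketch:

```matlab
% Wind tunnel data and the transformed variable.
v = 0:1:60;
dn = .1234*v.^2 + .4*v.*randn(size(v));
vs = v.^2;

% Backslash on the overdetermined system vs'*k = dn'
% returns the least squares k.
k = vs'\dn';

% Closed form: the k minimizing sum((dn - k*vs).^2).
k2 = sum(vs.*dn)/sum(vs.^2);
```

The two values should agree to rounding error, and both should be close to the true coefficient .1234.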

Note that this is an application where we have physical knowledge. In such circumstances extrapolation is meaningful. For example, we could use the fit to predict the drag at 80 mph.
