I say all of that just to say that, I am very happy that I am learning statistics this summer. More importantly, I am learning how to specifically apply it to astrophysics research. Here is what I have done and learned:
First of all, if you want to follow along you can look at my code on GitHub here.
(I didn't do well at commenting things. It's on my To Do List. And keep in mind that this code is not polished up at all.)
Objective.
The idea was to find a good fitting model to our data points. The data points had the desired signal and random noise associated with it. So, we had to use likelihood and thus the \( chi^2\) function to find out what kind of line fitted our data best, which insinuated how the data would look if there was no random noise.Phase I.
(No FOR loops allowed)The first challenge was to create a Python function that could deduce the highest degree of the polynomial equation (linear, quadratic, cubic, quartic, etc.) necessary to calculate the "y" values when only being given the coefficients and the "x" values. For example, if I give my Python function, called "poly" the input a_values = [2, 5] then the function is suppose to know that the equation necessary will be of the form \( a_0 + a_1*x_{values}\)
and thus the equation should be \( 2 + 5\times x_{values} \). Therefore, if the "x" values were x_values = [ 1, 2, 3] then poly should give me the "y" values 7, 12 and 17.
However, if the user inputs a_values = [ 2, 5, 1, 150], "poly" should know to use the formula
\[ 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3 \]
to get the "y" values.
Furthermore, once "poly" finds the "y" values it is suppose to add noise to the data as a way of simulating real data that has been measured with uncertainties (sigma). This is why my "poly" function within my Python script essentially has two equations, e.g.
\[ data = 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3 + \sigma\times random_{noise}\]
\[ model = 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3 \]
Because no FOR loops were allowed, matrices were used to complete this task. (Technically, in Python they are numpy arrays that acted like matrices, but there was no matrix multiplication techniques employed.)
Feel free to run my code as follows:
>>>import numpy as np
>>>execfile('stat_ex')
>>>poly1st(x_vals, a_values = np.array([2, 5, 1, 150]) , plot=True, sigma = 0.5)
You can change the "x_vals", "a_values", and "sigma" parameters however you want.
Because the noise is generated by random numbers (of a Gaussian distribution), the plot you see will change every time you run the function "poly1st." The line represents how the data (points) would look without the noise.
to be continued...
No comments:
Post a Comment