Tuesday, June 30, 2015

Grad level Intro to Astro-Stats for Undergraduates

Another day, another blog.  Today in my very interactive Astro-statistics-Python-coding class, I gained a better understanding about the concepts of probability density/distribution function and the cumulative probability distribution function.  I will explain how the other interns and I have made significantly more progress in enhancing our Python scripts to do much more analysis regarding our use of simulated data that constitutes a linear trend with random noise added to it.  If you want to follow along, here is my code that you can copy and paste into a script or go to RAW and then download (with the "wget" Terminal command).  The next series of plots are but one of the many possible outcomes from executing my code:

>>>cdf(a_vals, plot=True)      # this can be ran however many times you wish

Figure 1: Once again, 5 "x" values were chosen and a linear equation \( a_0 + a_1\times x_{vals} +\) noise\(_{random}\)  was used to simulate 5 measurements that were taken.  For this step, I chose the intercept and slope to be \(a_0=1\) and \(a_1=2\).

Figure 2: Contour plot of the Likelihood matrix with the dimensions  (number of \(a_1\) elements) X (number of \(a_0\) elements).  The regions within the darker and redder lines indicate areas where the likelihood is maximized.  In this example, it seems to imply that \(a_1\) values close to 2.0 paired with \(a_0\) values near 1.0 are most likely to be the correct parameter values.  This is great!  The Likelihood-\(\chi^2\) fitting was applied to the simulated data and it is determining what the two fitting parameters (\(a_0 \) and \(a_1\)) are.
A quick aside: why does the contour plot have that slope to it?
 If you run my "cdf(a_vals, plot=True)" function multiple times, the values and eccentricities of the regions may change but it will still have that tilt.  As I am writing this blog, I force myself to explain these details meticulously, which in turn compels me to consider all questions that might arise in the thoughts of my audience.  The question I have posed is one I am still contemplating.  As of now, I think that the slope of the ellipses attests to how well the two parameters are being constrained.  The linear slope parameter \(a_1\) is being constrained more than the intercept \(a_0\).  I am inclined to believe that this is why the regions are stretched horizontally more than they are stretched vertically.

The next two plots reiterate and confirm what the contour plot is showing but from a 1-dimensional point of view.  (Thus the need for 2 plots to express the results of the 2D contour.)

Figure 3: Marginalization of Likelihood for \(a_0\).  In my Likelihood matrix, the columns are considered to be the x-axis (\(a_0\)) for the contour plot.  The rows act as the y-axis (\(a_1\)) for the contour plot.  Therefore, when I get the sum for each column of the matrix I obtain the summation of all \(a_0\) likelihoods for each column.  Each of these sums have been plotted here.  The maximum marginalized likelihood seems to occur when \(a_0\) is ~1.25.  This suggests that the best value for the \(a_0\) fitting parameter is near 1.25, but I want to find a very precise number with minimal uncertainty for my fitting parameter.

Figure 4: Marginalization of Likelihood for \(a_1\).  The sum of each row in my likelihood matrix was taken and plotted here with the rows' corresponding value of \(a_1\).  The maximum marginalized likelihood seems to occur when \(a_1\) is equivalent to 2.0. In other words, this plot is stating that the correct value for \(a_1\) is most likely 2.0.

Now that you're all caught up to speed, I can discuss the extra steps (since my last post) we have done for this exercise.  I guess you can call this Phase III just to stay consistent with the previous couple posts.

 Figures 3 and 4 illustrate the Probability Density Function (PDF) that we know to be a Gaussian.  This means that the contour plot of Figure 2 yields a Gaussian distribution as well.  The data in those plots also show the best, or most satisfactory, value for each fitting parameter.  However, one of those circular data points may not reveal the most satisfactory value for the fitting parameter despite the fact that one of those points truly does look like it is at the peak of the curve (in both plots).  To find a value that is truly at the peak of the curve, I had to first integrate the PDF produced by my contour plot in Figure 2 and subsequently interpolate the resultant curve in order to find the x-value that corresponds to where the area under the curves of Figures 3 and 4 equate to \(\frac{1}{2} \) the total area.

to be continued...

Monday, June 29, 2015

Exercise in Likelihood and \(\chi^2\), continued...

Phase II.

Now that we understood how to simulate data measured with random noise, it was time to find the best fitting parameters for the data.  As an example, we used data that represented a linear trend.  We also specified 5 x-axis points, which implies that we have 5 data points each with a y-axis value called "d" (for data).
If you want to follow along, feel free to download (with the Terminal "wget" command after clicking RAW) or copy and paste my Python script into another script.
Figure 1: Five data points representing 5 measurements with uncertainties associated with them.  To create this plot, I only needed an array of x-axis points and the two coefficients (a) for the linear equation: \( a_0 + a_1\times x_{vals} \) .   There are uncertainties for each measurement due to what we, in the Astro-Statistics lecture, call "noise."  This random noise is added to the linear equation to simulate realistic measurements.  This plot was generated by calling my "poly" function.  For example, >>>execfile('stat_ex')       >>>poly(x_vals, a_vals, sigmas, plot=True)
As Figure 1 implies, a satisfactory model to fit that data would more than likely be a linear equation.  Figure 1 was forged by actually knowing the correct fitting parameters from the beginning: \(a_0 = 1 \) and \(a_1 = 2 \) .  The goal of this exercise is to use the simulated \( d_{vals} \), i.e. measurements, to figure out two a coefficients that would provide a satisfactory linear model to fit the data.

The probability function will allow me to find satisfactory fitting parameters:
\[ P(d_1 | \mu ; \sigma_1 ) = \frac{1}{\sigma_1 \sqrt{2\pi} } e^{-\frac{1}{2} \Big(\frac{d_1 - \mu}{\sigma_1} \Big)^2 } \]
P is the probability of me measuring \(d_1 \), which is dependent on the function \(\mu \) and the uncertainty \(\sigma_1 \) of that specific measurement.  With the probability formula, I can find the maximum Likelihood and optimal \(\chi^2 \).  The derivation is as follows, where L is Likelihood:
\[ L = \prod_i P(d_i | \mu ; \sigma_i) \]
\[ ln(L) = \sum_i ln\big[P(d_i | \mu ; \sigma_i )\big] \]
\[ ln(L) = - \sum_i ln\big(\sigma_i \sqrt{2 \pi}) - \frac{1}{2}\sum_i \big(\frac{d_i - \mu}{\sigma_i}\big)^2 \]
\[ln(L) = C - \frac{1}{2}\sum_i \chi^2 \]

Likelihood is the probability of measuring all of the data ("d_vals") that I measured.  This is why the probability of measuring each data point are multiplied together in the \( \Pi \) symbol.  The C means that the first term is a mere constant, or offset, while the second term contains the famous \( \chi^2 \) expression.

Understanding how to use the equations is one thing.  Figuring out how to transfer that math language into a coding language is another thing entirely.  The example that I am about to convey is but one of many possible outcomes when running my Python script. If you repeatedly run the "poly" function in the command line, you will get different results and plots each time.  The following plots are exemplary of what you may see.

>>>poly(x_vals, a_vals, sigmas, plot=True)  # feel free to run this as many times as you wish

Figure 2: Contour plot of the Likelihood matrix in my Python script.  The darker and redder regions represent where the Likelihood was the most.  Therefore, within that region lies the x and y coordinate pair that indicates the best fitting parameters: \(a_0 \) and \(a_1 \).  In this specific example, the contour suggests that the best fitting parameters are where \(a_1 \) is a little below 2.0 and \( a_0 \) ~ 1.40 .

The last two figures reiterate the results of Figure 2 but with much more clarity.  The matrix that was utilized to make the contour plot has now been transformed into a 1-dimensional version of the results.  These last figures confirm what was conjectured due to Figure 2.  The maximum Likelihood is found when \(a_0 \) is ~ 1.40 and \( a_1 \) is a tiny amount below 2.0.  The true values were \(a_0 = 1.0 \) and \(a_1 = 2.0\).

This was a superb exercise for my first week of learning Astro-Statistics.

Saturday, June 27, 2015

Exercise in Likelihood and \(\chi^2\)

At the Banneker Institute, the other interns and I are being taught Astro-Statistics.  Statistics is an essential part of astrophysics research.  During my first research project at the Center for Astrophysics, I was amazed at how much statistics was necessary to understand the instrumentation of the telescope and the Confidence and Precision when fitting a model to collected data.  Moreover, I felt somewhat badly about how little statistics I knew at the time.  However, my advisors were able to explain to me what the equations were and what I needed to know at the time to get the job done by the end of the summer.
I say all of that just to say that, I am very happy that I am learning statistics this summer.  More importantly, I am learning how to specifically apply it to astrophysics research.  Here is what I have done and learned:

First of all, if you want to follow along you can look at my code on GitHub here.
(I didn't do well at commenting things.  It's on my To Do List.  And keep in mind that this code is not polished up at all.)


The idea was to find a good fitting model to our data points.  The data points had the desired signal and random noise associated with it. So, we had to use likelihood and thus the \( chi^2\) function to find out what kind of line fitted our data best, which insinuated how the data would look if there was no random noise. 

Phase I.

 (No FOR loops allowed)

 The first challenge was to create a Python function that could deduce the highest degree of the polynomial equation (linear, quadratic, cubic, quartic, etc.) necessary to calculate the "y" values when only being given the coefficients and the "x" values. For example, if I give my Python function, called "poly" the input a_values = [2, 5]  then the function is suppose to know that the equation necessary will be of the form \( a_0 + a_1*x_{values}\)
and thus the equation should be \( 2 + 5\times x_{values} \).  Therefore, if the "x" values were x_values = [ 1, 2, 3] then poly should give me the "y" values 7, 12 and 17.

However, if the user inputs a_values = [ 2, 5, 1, 150], "poly" should know to use the formula
 \[ 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3 \]
to get the "y" values.
Furthermore, once "poly" finds the "y" values it is suppose to add noise to the data as a way of simulating real data that has been measured with uncertainties (sigma). This is why my "poly" function within my Python script essentially has two equations, e.g.
\[ data = 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3  + \sigma\times random_{noise}\]
\[ model = 2 + 5\times x_{values} + 1\times x_{values}^2 + 150 \times x_{values}^3 \]
Because no FOR loops were allowed, matrices were used to complete this task.  (Technically, in Python they are numpy arrays that acted like matrices, but there was no matrix multiplication techniques employed.)

Feel free to run my code as follows:
>>>import numpy as np
>>>poly1st(x_vals, a_values = np.array([2, 5, 1, 150]) , plot=True, sigma = 0.5)

You can change the "x_vals", "a_values", and "sigma" parameters however you want.

Because the noise is generated by random numbers (of a Gaussian distribution), the plot you see will change every time you run the function "poly1st."  The line represents how the data (points) would look without the noise. 

to be continued...

High Precision Photometry of Transiting Exoplanets as I currently know it : Part III

Background &
Doppler Shift  cont.

The results from a star's spectrum are used to plot the data shown on radial velocity curves.  The following gif illustrates how the moving position of the absorption lines in a spectrum can directly relate to the velocity at which the star is moving/wobbling away or toward us observers.  

Animation exhibiting the real phenomenon known as Doppler Shift.  As the star moves/wobbles away or toward the observer, the absorption lines that are indicative of the star's outer atmospheric composition shift from left to right or right to left on the spectrum.
The displacement between the absorption lines' original positions and their new position insinuate the velocity at which the star moves away or toward the observer.  However, this technique alone keeps observer's oblivious to the velocity of the star when it is moving in a direction that moving along the observer's line of sight.  This means that stellar systems that we view from pole-on or top-view will not provide any velocity information when using this doppler shift technique.

An exemplary orbiting stellar-planet system that would not exhibit doppler shift for the observing reader because as you look at your monitor, your line of sight is perpendicular to the plane of the screen, i.e. the directions at which the star and planet move.

The concept of doppler shift can be further elaborated by the equation 
\[ \frac{\Delta\lambda}{\lambda{_0}}=\frac{v_{star}}{c}  \]

where \(\Delta\lambda \) is the change in wavelengths of the moved absorption lines, \(\lambda{_0} \) is the initial wavelength, \(v_{star}\) is the velocity of the star moving radially and c is the speed of light.  Evidently, there is a direct relationship between the radial velocity of the star and the change of wavelengths exhibited by the star's absorption lines in the spectrum.  Thus, the radial velocity curves are made.  

to be continued...

Wednesday, June 24, 2015

High Precision Photometry of Transiting Exoplanets as I currently know it : Part II

Background &
 Non-Transit and Transiting Exoplanets  continued...

Exoplanets that do not transit their parent star are found using photometry and verified with spectroscopy.  The same applies for transiting exoplanets; however, with modern technology and current methods utilized in the new field of research, much more information can be extracted and extrapolated from the data of a transiting exoplanet.  For those that do not transit, astronomers can find, at most, the orbital parameters and lower limits on the planet's mass.  For transiting planets, we can find those parameters and even the physical properties of the individual planet.  These properties include radius, equilibrium temperature, mass, density, surface gravity, and more.

Radial Velocity Curves

A very common method of detecting hints of a revolving exoplanet is to observe if radial velocity measurements of a star insinuate that the star is wobbling.  Although the planet revolves around the star, the star also revolves around the barycenter, or center of mass, of the stellar-planet system.  However, because stars are typically immensely more massive than their revolving planets, the star makes very small movements around the barycenter due to the tug of the planets relatively weak gravity.  These movements can be so small that the star it is more accurately described to be a wobbling motion as opposed to a revolving motion.  This is also indicative of the fact that stars "revolve" around their barycenter that is, many times, within the radius of star itself.  The following "gif" illustrates this.  (However, there are many cases where the barycenter is outside of the stellar radius and the star visually does revolve around a common center of mass shared with the star's orbital companion, which may well be another star.)

In this motion-picture, the star and planet move around a common center of mass.  This is an apt illustration to compare to our Sun.  The barycenter for the Sun, its eight revolving planets, and all of the other orbiting material in the Solar System is just beneath the Sun's surface.  This illustration demonstrates stellar wobble due to a planet's (weak) gravitational pull.

Because the data astronomers analyze never look as clear and unambiguous as that gif, we use plots.  Oh! How vast in number the plots we Make!  Radial velocity plots, such as the one below, give evidence when stellar wobble due to a planet occurs.

Radial velocity curve for the star 51 Pegasi.  The points indicate the measurements obtain by a telescope and spectrograph.  The line indicates the fitting done on the data points that best represents the continuous trend that the star may be following.  As implied, the star's velocity toward the observer (Earthlings) can reach up to 50 meters per second.  When going away from the observer, the speed can be up to 70 m/s.

Doppler Shift

The following gif superbly illustrates how the velocity measurements in the above radial velocity curve were detected.

to be continued...

Monday, June 22, 2015

High Precision Photometry of Transiting Exoplanets as I currently know it : Part I

Current Research Project Goal:  Obtain high precision measurements of Transiting Exoplanets (TEPs) in order to determine the physical properties of the planets with minimal uncertainties.

Methodology: Use the defocusing technique with MINERVA telescopes, located on Mt. Hopkins, to observe bright stars with exoplanet candidates.


The study of exoplanets is a relatively new field of research.  Although great scientists like Benjamin Banneker contemplated the topic as far back as the 18th century, the astronomy community have only recently confirmed that exoplanets exist.  To clarify, an exoplanet, or formerly known as extra-solar planet, is a planet that revolves around a star that is not our Sun.  A paper published in 1995 announced Mayor and Queloz' discovery of the first known exoplanet.  In 1999, Harvard's very own David Charbonneau led the team that found the first transiting exoplanet.

Artist's rendering of the exoplanet known as 51 Pegasi b.  Discovered by Mayor and Queloz, they are credited for being the first to confirm the existence of exoplanets.

Non-Transit and Transiting Exoplanets 

In order to confirm the finding of an exoplanet, astronomers use photometry and spectroscopy.  Simply put, photometry is the measurement of light/electromagnetic radiation/photons and spectroscopy is the study of how matter reacts to electromagnetic radiation.  Photometry helps astronomers produce graphs known as light curves.  Spectroscopy allows astronomers to determine how fast an object is moving toward or away from the observer and helps to produce radial velocity plots.  Both light curves and radial velocity plots are used to detect, confirm and analyze exoplanets.  These plots become even more fascinating when the star being observed by an astronomer is also being transited by its revolving exoplanet.

to be continued...