Monday, August 17, 2015

Astro-Stats : Bayes' Theorem (for Astrophysics), cont.

Now, it is time for us to apply what I explained in the previous post to Python code.  As always, you can follow along on your computer if you have Python installed.  You can copy my code (from here) and paste it into your own file, or you can download my file using the "wget" command in the Terminal. (And don't forget to get the necessary text file mentioned in the previous post.)

I will lead you into this exercise a little differently than I typically do.  I will first show you an unsatisfactory fit.  This will let you see how the search for a satisfactory fit sometimes goes.  After examining the unsatisfactory result, we will learn from it and then run my code again to find better results.

To follow along...
>>> execfile('companion_detect.py')
>>> p_posterior(metal_0, metal_1, alphas=np.linspace(0,1,100), betas=np.linspace(0,1,120), plot=True)

As stated in the previous post, a grid of trial values (priors) must be supplied for each parameter to perform the Grid Search methodology.  In the call above, we input 100 values between 0 and 1 for the \(\alpha\) parameter and 120 values between 0 and 1 for \(\beta\).  The contour plot shows the Likelihood for each combination of \(\alpha\) and \(\beta\).
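To make the grid idea concrete, here is a minimal sketch of the kind of loop a function like p_posterior could run internally (my actual code in companion_detect.py may differ, and the argument names here are illustrative); it assumes the probability model \(P(\alpha,\beta)=\alpha\,10^{\beta M}\) described in the previous post:

```python
import numpy as np

def likelihood_grid(metal_no_planet, metal_planet, alphas, betas):
    """Evaluate L = prod P(alpha, beta) * prod [1 - P(alpha, beta)]
    at every (alpha, beta) combination, where P = alpha * 10**(beta * M)
    and M is each star's metallicity.  Returns a 2-D array indexed
    [beta, alpha], ready for a contour plot."""
    grid = np.zeros((len(betas), len(alphas)))
    for i, b in enumerate(betas):
        for j, a in enumerate(alphas):
            p_planet = a * 10.0 ** (b * metal_planet)         # stars with planets
            p_none = 1.0 - a * 10.0 ** (b * metal_no_planet)  # stars without
            grid[i, j] = np.prod(p_planet) * np.prod(p_none)
    return grid
```

Note that with many stars the raw product becomes extremely small, which is part of why the numbers on the contour plots below look so strange.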

Figure 1: Contour plot of Likelihoods for the two parameters.  Colors from blue to red represent the lowest to highest Likelihoods, respectively.

I know what you're thinking... "This is an ugly plot.  Did something go wrong?"  Well, no.  Everything is fine with my code.  This ugliness suggests that the priors assigned were far from correct.  Notice how the reddish area is barely visible?  This means that values of \(\alpha\) near 0.1 yield higher Likelihoods than the blue or colorless regions.  The location of the red area also implies that values near 1.0 seem promising for the \(\beta\) parameter. 
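If you would rather not eyeball the colors, you can ask NumPy directly where the grid's maximum lives and use that point to choose the next range of priors.  A small sketch (assuming the Likelihood grid is a 2-D array with \(\beta\) along the rows and \(\alpha\) along the columns, as in a typical contour setup):

```python
import numpy as np

def best_grid_point(grid, alphas, betas):
    """Return the (alpha, beta) pair at the grid's maximum Likelihood.
    Useful for picking a narrower prior range on the next run."""
    # argmax gives a flat index; unravel_index maps it back to (row, col)
    i, j = np.unravel_index(np.argmax(grid), grid.shape)
    return alphas[j], betas[i]
```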

Figure 2: Terrible looking cumulative distribution function (CDF) plot.  This CDF was supposed to derive from a normal probability distribution function (PDF).  This plot suggests that the PDF was not of a normal distribution.  This is unfortunate primarily because the parameter and its uncertainties were chosen by assuming that the PDF was Gaussian.

If you want to understand the intricate details about how the plot of Figure 2 came about, read my Grad Level Intro... post and the Grad Level Intro... cont. post.  Those posts will also explain why the PDF and CDF are important.  For now, I'll just say that the purpose of Figure 2 is to provide the best fitting parameter.  There are two reasons why I am not satisfied with the result given by Figure 2.  The first reason for my disdain is the hideosity of the plots (and yes, that is a real word).  The second is the non-Gaussian PDF.  The plots in Figures 1 and 2 suggest that this fitting did not go well and that the unseen PDF is not a normal distribution, although it should be.


Figure 3: Atrocious attempt at finding the best fitting value for \(\beta\). 
The disgusting plots of Figures 2 and 3 only confirm the terrible fitting job that was insinuated first by Figure 1. 

Well, all is not lost!  As stated previously, the contour plot gave us a clue as to how we can ameliorate the fitting.  Now, let's run the code with different priors.

>>> execfile('companion_detect.py')
>>> p_posterior(metal_0, metal_1, alphas=np.linspace(0,0.1,100), betas=np.linspace(1,3,120), plot=True)

Now, our \(\alpha\) priors range from 0 to 0.1 and our \(\beta\) priors span from 1 to 3.  Let's check out the results!

Figure 4: Contour plot for the grid search methodology employed.  The calculated Likelihoods are so close together that every number shown looks the same, because only the first 4 digits are displayed.  The colors still represent the different Likelihoods from blue (least) to red (greatest).
That contour plot looks much better than the previous one.  However, the identical-looking numbers can be very confusing at first glance.
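One common workaround for those indistinguishable labels (a sketch, not necessarily what my script does): contour the base-10 logarithm of the Likelihoods, shifted so the maximum sits at 0.  The differences then show up in the leading digits instead of being buried far past the displayed precision.

```python
import numpy as np

def log_grid(grid):
    """Convert a Likelihood grid to log10, shifted so its maximum is 0.
    Contouring this instead of the raw grid keeps the labels readable
    when the raw Likelihoods differ only in distant decimal places."""
    logg = np.log10(grid)
    return logg - logg.max()
```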

Figure 5: Plot of the CDF for the \(\alpha\) parameter.
Figure 6: Plot of the CDF for the \(\beta\) parameter.  The value yielding the best fit was 1.6 for this parameter.

There are your results!  I suspect, though, that the results would be more accurate if I were to constrain the range of priors further. 


I wanted to show the bad fitting first as a way to convey the detective-like nature of astrophysics research.  Researchers, no matter how smart, almost never get the most accurate or satisfactory results on the first try.  This is why one must put on the detective hat and sniff for clues as to what might have gone wrong during the analysis.  I looked for clues in that first contour plot once I noticed how "fishy" it looked.  Figure 1 led me to choose better priors for my second attempt at the Grid Search.  Moments like that reaffirm my love for plots.  Experiences like that have persuaded me to create many plots as I write complex code.  Graphics can reveal subtle issues in one's code or methodology that would never be discerned by merely glancing at a bunch of numbers on a computer screen.

Monday, August 3, 2015

Astro-Stats & Python : Bayes' Theorem (for Astrophysics)

Back to the Astro-Stats!

In my Bayes' Theorem (for a Coin) post, I introduced Bayes' Theorem for probability along with an example with a Python script.  This time, I will use the same concept for astrophysical data.  Let me first set the stage for this new example, and then I'll get to the code so that you can follow along on your computer.

My data consist of the metallicities of stars and the confirmation or denial of a planet detection around each star.  You can see the text file containing this data here.  If you want to follow along, you must copy this information and paste it into your own text file; or, you can download this file using the "wget" command (on that same link) in the Terminal.

Figure 1: Snapshot of part of table in the ".txt" file.  The first column is the real name of the star.  The second column is the metallicity of the star relative to the Sun.  This metallicity, written [Fe/H], is the logarithm of the star's iron-to-hydrogen abundance ratio relative to the solar value.  Because the Sun is used as the reference, [Fe/H] for the Sun equals 0.
As with my last analysis with Bayes' Theorem, I must input a prior for each parameter sought, find the Likelihood and then evaluate the new parameters.  In terms of mathematics, the differences between this exercise and the Coin example are conspicuously seen within my Likelihood and Bayes' Probability equations.

The Likelihood is equal to the probability (for the parameters being measured) of all of the stars with detected planets multiplied by the probability (for the same parameters) of all of the stars without a planet detection.  Because each detection, or absence thereof, is an event that occurs independently of the others, the probabilities must be multiplied together to obtain the total probability for the stars with detected planets, as well as for those without planets.  This can be described as
\[\mathcal{L}=\prod^{k} P(\alpha,\beta) \cdot \prod^{N-k}[ 1-P(\alpha, \beta) ]     \]
where \(\mathcal{L} \) is the Likelihood, k is the number of stars with planets, N is the number of stars total, P is the probability of stars with planets, and \(\alpha \) and \(\beta\) are parameters.  The rightmost \(\prod\) term is the total probability of stars without detected planets and the \(\prod\) term to the left of that is the total probability of stars with detected planets.

Essentially, the Likelihood is being found by multiplying two total probabilities together.  This corresponds to the generic definition of Likelihood, which can be written as \(\mathcal{L} = \prod^{N}_{i}P_{i} \), where N is the total number of probabilities: the Likelihood is the product of all the individual probabilities.  This was discussed briefly in my Exercise in Likelihood and \(\chi^2\), continued... post.

For this specific example, the given probability equation is defined as
\[P (\alpha, \beta) = \alpha 10^{\beta M}    \]
where M is the metallicity of one of the observed stars.  Be aware that many statistical applications assume the probability of all outcomes to be equal to each other.  This equation signifies that the probability of each star for having a planet companion is distinct and depends primarily on the metallicity of that star along with the parameters.
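Putting the two equations together in code is straightforward, but one practical caveat deserves a mention: multiplying hundreds of probabilities underflows to zero in floating point, so a common trick (not necessarily what my script does) is to sum logarithms instead.  A sketch, with illustrative argument names:

```python
import numpy as np

def log_likelihood(alpha, beta, metal_planet, metal_no_planet):
    """ln L = sum of ln P(alpha, beta) over stars with detected planets
            + sum of ln [1 - P(alpha, beta)] over stars without,
    where P = alpha * 10**(beta * M).  Summing logs avoids the underflow
    that the raw product suffers when N is large."""
    p_planet = alpha * 10.0 ** (beta * metal_planet)
    p_none = alpha * 10.0 ** (beta * metal_no_planet)
    return np.sum(np.log(p_planet)) + np.sum(np.log(1.0 - p_none))
```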

Unlike the Coin example, I must now suggest many priors for my parameters \(\alpha\) and \(\beta\).  Previously, I provided only one prior probability and then let my function use the newfound posterior probability as the subsequent iteration's prior.  My methodology is different now because I will use the Grid Search and contour plot strategy to determine which of my priors yield the highest Likelihoods and thus the highest posterior probabilities.  This method was used in my Grad Level Intro to Astro-Stats for Undergraduates post and its sequel.  Similar to the equation stated for my Coin example, the posterior Likelihood is described as the following proportion.
\[\mathcal{L}_{posterior} \propto  \alpha \cdot \beta \cdot \mathcal{L}_{prior}      \]
Each \( \mathcal{L}_{posterior} \) will be tracked inside of a matrix, or grid, of numbers that will later be used to create a contour plot.
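Filling that matrix might look like the following sketch, which implements the proportionality above literally (the likelihood argument is a stand-in for whatever function computes \(\mathcal{L}\) as defined earlier):

```python
import numpy as np

def posterior_grid(alphas, betas, likelihood):
    """Fill a (len(betas), len(alphas)) matrix with the posterior
    proportionality alpha * beta * L(alpha, beta) from the text.
    `likelihood` is any callable taking (alpha, beta)."""
    post = np.zeros((len(betas), len(alphas)))
    for i, b in enumerate(betas):
        for j, a in enumerate(alphas):
            post[i, j] = a * b * likelihood(a, b)
    return post
```

The resulting matrix is exactly what gets handed to the contour plotting routine in the next post.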


Now that the math and concepts are (hopefully) clear to you, we can get to the real-life application of all of this knowledge.  My next post will illustrate the results of utilizing this statistical method in my Python script.

 to be continued...