Wilson Inspired: Part III : Astro-Stats & Python : Lev-Marq to Markov Chain Monte Carlo and Bootstrapping

Now that my function (using the Levenberg-Marquardt, or LM, statistical method) has found the best fitting parameters, another function takes action and performs the Markov Chain Monte Carlo (MCMC).

Figure 1: Plot of steps taken by the "random walker" representing the parameter \(a_0\). The MCMC process is causing the walker to search around the best value for this parameter while looking at the Likelihood at each step. The gaps represent iterations of the the MCMC process when this specific parameter was not changed.

As you can see in Figure 1, the "walker" moves vertically on the plot over time. Some values seem to be repeated. This is the Bootstrapping application of my code. It is okay for the walker to repeat steps that were chosen previously. However, because of the MCMC method, the walker gravitates towards values that yield high Likelihoods and probabilities.

From here, I will follow the same analysis I conducted in my Astro-Stats & Python : Bootstrapping, Monte Carlo and a Histogram post. The histogram of each parameter will reveal the probability distribution function (PDF) of the parameter although the PDF is not explicitly plotted. Then the cumulative distribution function (CDF) will reveal where the peak of the probability distribution function is as well as the interval for which 68.3% of the central area under the PDF curve is accounted for. (This is explained thoroughly in my Grad level Intro to Astro-Stats for Undergraduates, cont. post with an example given to apply the concept.)

Despite similar analyses, the results are tremendously different now though. Because I have now used the LM statistical method first, my MCMC didn't have to waste time in finding the vicinity of the best fitting parameters and subsequently the confidence interval. This time, my MCMC process devoted 2,000 trials solely towards searching around the best fitting parameters. This distinction is evident if you compare the uncertainties of the 3 parameters in that post with the 3 parameters' confidence intervals in this post. This is why I set the "par_sigmas" so low. Because the "walker" is so close to the best values with the highest Likelihoods, the walker will refuse to make steps that are very far away (where there are much lower Likelihoods). Therefore, it is best to tell the walker to attempt to make very small steps. The walker will be more willing to accept very small steps than larger ones and thus the acceptance rate has a chance at being above 40%.

I will now just show the rest of the results. At a later time, I will run the same code and explain what is happening when the data follows a Gaussian profile, as opposed to my current example's trinomial of second degree profile. But if you are just too eager to wait and see, you can run my code again with the "sim_data_gauss" option set to "True".

Figure 2: The Y-axis is the number of times that that value appeared during the 2,000 trials. The highest bin(s) seems to be around 0.003 + 9.469, which is 9.472.

Figure 3: As the histogram for \(a_0\) implied, 9.472 was calculated to be the best value for this parameter. This is good because the true value was 10. The red error bars represent the confidence interval, i.e. the \(\pm\) uncertainties. The green line shows that the CDF goes from 0 to 1 (because it's a type of probability distribution function). It also helps guide your eye towards seeing the X-axis value that is chosen.

Figure 4: Plot showing the changes made to parameter \(a_1\) throughout the Random Walk.

Figure 5: The highest bin(s) conspicuously suggest that a number near 0.003 + 499.997 (which is 500) is the best value.

Figure 6: Look at those low uncertainties! Talk about precision, eh!

Figure 7: Plot showing the changes made to parameter \(a_2\) throughout the MCMC process.

Figure 8: The highest bin(s), i.e. the highest probabilities and frequencies, are around 0.015 + 1999.98, which is 1999.995.

Figure 9: I must say, I am proud of this one. Beautiful plots can make all of that hard work seem worth it. And oh boy! that precision though!

Wilson Inspired

Saturday, July 18, 2015

Part III : Astro-Stats & Python : Lev-Marq to Markov Chain Monte Carlo and Bootstrapping

No comments:

Post a Comment