Project Title  Parallel Stability Analysis of Exoplanetary Systems Detected via Radial Velocity
Summary  The intern will perform N-body simulations of planetary systems detected via the radial velocity technique, in order to assess each system's gravitational stability and determine any resulting modifications to the orbital parameters. Individual simulations do not parallelize well, since each system contains only a small number of bodies. However, radial velocity analysis typically produces a probability distribution for the system's parameters rather than a single specific value, so each system must be analyzed many times, not just once, for stability. Probing a sample of that distribution, including an investigation of orbital inclination (which cannot be determined via the radial velocity technique), provides information that helps constrain the results. This sampling is highly concurrent in nature and will benefit from parallelization.
Job Description  The radial velocity technique for detecting planets around other stars involves analyzing the wobbles a star makes as objects orbit it. Larger and closer-orbiting bodies produce larger wobbles; closer bodies produce more frequent wobbles; eccentric orbits change the shape of the wobbles. By fitting a sum of Keplerian two-body solutions, one per planet, one can determine the likely number of orbiting planets along with each planet's mass, orbital distance, and eccentricity. However, the results of this data-fitting technique carry some uncertainty. Mathematically, uncertainty in the data itself can be linked directly to uncertainty in the model fit using Bayesian methods, typically Markov chain Monte Carlo (MCMC) analysis. The output of the MCMC algorithm is not a single “best fit,” but rather the posterior distribution: the likelihood of each model parameter given the data and the known or expected error in the data. Nothing in the Keplerian two-body solution accounts for interactions between planets that could result in instability. As a result, dynamical stability analysis is a typical second step after detecting and characterizing an exoplanetary system. Our approach is to sample the posterior distribution under the assumption of an “edge-on” system (minimum possible planet masses) and integrate each sample forward for a significant time (10^6 to 10^8 years, depending on available resources) to determine what portion of the model-fit posterior distribution is dynamically stable. Additionally, a subset is sampled and rerun at progressively higher system inclinations (and thus higher planet masses) to determine any constraints on the inclination of the system. Our current model uses an existing Fortran N-body code, which we have modified to account for star-planet tidal effects and to calculate relativistic precession, integrating forward with a 4th-order symplectic integrator.
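To illustrate the model being fit (not the project's actual Fortran code), the summed-Keplerian radial velocity signal can be sketched as follows; the parameter set per planet (semi-amplitude K, period P, eccentricity e, argument of pericenter omega, time of pericenter t_p) is the standard one, and the Newton solver for Kepler's equation is an assumed minimal implementation:

```python
import math

def solve_kepler(M, e, tol=1e-12):
    """Solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E
    using Newton's method."""
    E = M if e < 0.8 else math.pi
    for _ in range(100):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def radial_velocity(t, planets):
    """Summed Keplerian signal: each planet contributes
    K * (cos(nu + omega) + e * cos(omega)), where nu is the true anomaly."""
    v = 0.0
    for K, P, e, omega, t_p in planets:
        M = (2.0 * math.pi * (t - t_p) / P) % (2.0 * math.pi)  # mean anomaly
        E = solve_kepler(M, e)                                 # eccentric anomaly
        # true anomaly from the eccentric anomaly
        nu = 2.0 * math.atan2(math.sqrt(1 + e) * math.sin(E / 2),
                              math.sqrt(1 - e) * math.cos(E / 2))
        v += K * (math.cos(nu + omega) + e * math.cos(omega))
    return v
```

An MCMC fit of a model like this to the measured velocities is what yields the posterior distribution over (K, P, e, omega, t_p) that the stability runs then sample.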
The code produces many temporary files, and we run in parallel by scheduling many individual runs. Each run is integrated forward only until two planets, or a planet and the central star, come within one Hill radius, at which point a collision, and thus instability, is assumed. We have a prototype C/CUDA code that instead runs multiple integrations, each from a different posterior sample, concurrently within a CUDA kernel. The intern will work to test and improve this prototype and compare our GPGPU-enabled code to our current process.
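The termination criterion can be sketched as below; this is an illustrative Python version of the check (the production code is Fortran and the prototype is C/CUDA), using the common pericenter form of the Hill radius, r_H = a(1 - e)(m_p / 3M_*)^(1/3):

```python
import math

def hill_radius(a, e, m_planet, m_star):
    """Hill radius at pericenter; a in any length unit, masses in the same
    mass unit (e.g. a in AU, masses in solar masses)."""
    return a * (1.0 - e) * (m_planet / (3.0 * m_star)) ** (1.0 / 3.0)

def close_encounter(pos_i, pos_j, r_hill):
    """True when two bodies are separated by less than one Hill radius,
    the condition the integration treats as a collision/instability."""
    dx = [pi - pj for pi, pj in zip(pos_i, pos_j)]
    return math.sqrt(sum(d * d for d in dx)) < r_hill
```

In the GPU version each CUDA thread (or block) would advance one posterior sample and set a per-sample flag when this condition fires, so finished samples simply stop contributing work while the rest continue.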
Use of Blue Waters  Our current production code is not CUDA-enabled and runs on a traditional 130-node CPU cluster with 8 cores per node. We have a prototype C/CUDA code that is currently being tested. We have two small Tesla K20-equipped workstations for testing, but would like to be able to build and deploy our code for the Blue Waters architecture. Resource needs vary by the system being modeled, but as a typical case, our current analysis of HD 10180 involves running concurrent sampled N-body simulations with a time step of 0.01 days (1 percent of the shortest planetary orbital period) out to 10^6 years. A single run typically takes less than a week on one CPU of our cluster, and by running 1000 samples concurrently on 80 nodes of our 130-node cluster (the remaining nodes are reserved for shorter jobs) we complete one sample of our posterior distribution in about a week; some runs, typically many, terminate before completion due to planetary collision. Each sample of the posterior distribution represents roughly 100,000 CPU-hours, and over the course of studying the system we will sample many times. Our hope is that by moving to accelerated nodes, we can reduce the total number of nodes needed to perform the same analysis.
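As a back-of-envelope check of the figures above, the step count and CPU-hour total follow from simple arithmetic; the one-week-per-run figure is the quoted worst case, so the product is an upper bound that early terminations bring down toward the roughly 100,000 CPU-hours cited:

```python
# All input values are taken from the text above.
years = 1.0e6                              # integration length per run
dt_days = 0.01                             # time step
steps_per_run = years * 365.25 / dt_days   # ~3.7e10 integration steps per run
samples = 1000                             # posterior samples per batch
hours_per_run = 7 * 24                     # "less than a week" on one CPU
cpu_hours_upper = samples * hours_per_run  # 168,000 CPU-hours if no run
                                           # terminates early
print(f"{steps_per_run:.2e} steps/run, <= {cpu_hours_upper} CPU-hours/sample")
```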
Conditions/Qualifications  Must be an undergraduate at Kean University with programming experience. 
Start Date  05/31/2018 
End Date  05/31/2019 
Location  Kean University, Union, NJ
Interns  Carley Garlasco
