GalaxSee

Creating a Galaxy

Starting Simple

There are various methods for creating a model of a galaxy on a computer, ranging from a straightforward application of the universal law of gravitation to complex "treecodes" designed to run on massively parallel supercomputers. For the purposes of understanding the general principles involved, we will discuss the most straightforward approach here.

First Steps

One of the first things to decide is how many stars there will be in the model. Just to make things more concrete, lets suppose we decide to use 1,000 stars (much less than in a real galaxy, but also a much more manageable number). Now, the kinds of questions we are going to ask will be things like "why are galaxies shaped the way they are?" and the way we're going to "run" the model is to simulate the action of gravity on the stars in the model. Both of these facts tell us that we are going to have to keep track of where each individual star is, i.e., its position in space. In order to store each star's location in three dimensional space, we must have a "place" in the computer to put an x, y, and z position for each star.

Keeping Track of Things (Arrays)

One way to do this would be to have variables called "star_1_x_position", "star_1_y_position", "star_1_z_position", "star_2_x_position", "star_2_y_position", "star_2_z_position", etc. However, not only would this require some poor programmer to type in three thousand different variable names, it would make a lot of other aspects of the program a lot more difficult than they had to be.

What is generally done instead is to use one or more arrays, or, ordered lists of numbers. An example of storing things in an array is a gradebook. All of the scores for one test are generally stored in a single column, with a heading like "Test 1." An array is just a "column" of numbers in a computer. For example, you could have an array called first_test(), with first_test(1) containing student number one's score, first_test(2) containing student number two's score, etc.

There are various ways to use arrays to store the positions of a thousand stars--a simple way is to have three arrays which each have a thousand elements. We could name these arrays x_position, y_position, and z_position, and then in the first slot in the x_position array, we store the x-coordinate of the first star, in the second slot of the x_position array would be stored the x-coordinate of the second star, etc. The advantages of using arrays to store these numbers are numerous. For example, suppose we wanted to start out with all of our stars in random positions. If we are using three arrays as mentioned above, we can write a single, simple loop that will give initial values to each of the three thousand coordinates. For example:

step 1: let i=1 step 2: let x_position(i)=a random number step 3: let y_position(i)=a random number step 4: let z_position(i)=a random number step 5: add 1 to the value of i (let i=i+1) step 6: if i<=1000, go back to step 2 (of course, in place of the words "a random number" would be some function that returns a random number in the appropriate range).

These statements are easier to type and easier to check for errors than the corresponding three thousand lines of code it would require to do it the other way. They are also easier to modify if, for example, we wanted to make the random positions all fall within a sphere, or to have some particular statistical distribution. In short, for modeling systems with lots of individual members, arrays are your friends.

Moving Things Around

Once we have locations for all of these stars, we need to figure out how we will change these location as the program progresses. In fact, the whole point of our model is to keep changing the positions, hopefully in a way that reflects what actually happens in nature, and allowing us to see what configuration the stars are in as "time" goes on. So the main (computational) effort in the program will go into moving the stars around according to whatever rules we think will appropriately model the real life phenomenon of galactic dynamics.

How to Truck Around Billions of Tons of Blazing Gas

The first thing we will need to know if we are going to move the stars is how fast, and in what direction, to move them. In real life, the speed and direction of an object is referred to as that object's velocity. Like position, velocity can be broken up into three parts (three vector components), corresponding to how fast the position is changing in the x, y, and z directions. We will use three arrays to hold the velocities of the stars: x_velocity, y_velocity, and z_velocity. For example, if star number 533 were moving straight in the x-direction at 2,500 meters per second, then x_velocity(533) would be 2,500, y_velocity(533) would be zero, and z_velocity(533) would be zero. Of course, if star 533 were headed in a direction other than straight in the x direction, the other values wouldn't be zero--they would have values corresponding to the motion of the star in these directions.

If we know the velocity for each star, we know which direction each star is headed and how fast it is going there. That knowledge allows us to change the position of the star to reflect its motion.

For example, suppose x_position(533)=10,000, y_position(533)=50,000, and z_position(533)=600,000, and we have the values for x-,y-, and z_velocity(533) as above. Then in a time step of, say, 86,400 seconds (one day), star 533 would move 2,500 (m/s) * 86,400 s=216,000,000 m. So after 1 day, the position of star 533 would be as follows:

x_position(533)=10000+216,000,000=216,010,000 y_position(533)=50,000+0=50,000 z_position(533)=600,000+0=600,000 Now, what will actually happen is that the computer will have all of the velocities for all of the stars and it will loop through the entire list, updating all of them. Of course, if the velocities weren't changing, all of the stars would move off in straight lines and we wouldn't have galaxies at all. In order for the stars in our model to move around the way real stars do, we have to calculate the change in velocity (acceleration) that the force of gravity imparts to the stars.

XLR8-ing

This is accomplished by applying Newton's second law of motion (force = mass * acceleration) and the universal(?) law of gravitation. Once we know the acceleration for each star, we use that to update the velocity of each star, which is used in turn to update the position of each star. Remember that the new positions are used to calculate the new force (which we use to find the acceleration) which is then used to calculate the new velocities, which are then used to determine the new position.

Some important considerations must be raised--first, how, exactly, are we going to apply Newton's second law (and in doing so, the law of gravity), and second, how much error will be introduced by this application?

Details

There are a number of ways that could be used to simulate the effect of the gravitational field and use Newton's second law of motion to "move" the stars around. The most straightforward way is known as the Euler method. The Euler method involves a simple concept which can be easily summed up as "what you have is what you had plus what you gained (or minus what you lost)", or, "(what it is) = (what it was) plus (how much it changed)". For example, above we actually implemented the Euler method when we showed how to "update" the position of the now legendary star number 533. The value we were working with was the position. We applied "(what it is) = (what it was) plus (how much it changed)" using 10,000 for "what it was" and 216,000,000 for "how much it changed" to arrive at 216,010,000 for "what it is".

That is the basic procedure that will be used to update the positions, and the process for updating the velocities is analogous--if we know the acceleration of a star, we can multiply that acceleration by our "updating interval" (or "time step") to give us a change in velocity, add that change to the current velocity, and we have the new velocity. "Knowing the acceleration of a star" requires us, as we know from the second law of motion, to have the mass of the star and the force being applied to the star. The mass of the star will simply be a number stored in an array, much like the arrays that hold the positions and velocities (except that the masses won't be changing). The force, on the other hand, will have to be calculated.

The Gravity Thing

Calculating the gravitational force on one object due to a bunch of other objects is a fairly straightforward (albeit numerically intensive) task. The nice thing about gravity is that (we think) we know how it works. F=GMm/r^2, for all times, places, and objects. Since this is really a vector equation, it is more appropriately written F(vector)=-(GMm/r^2)r(unit vector). We do have to be "picky" about the vectors, making sure that all of the forces pull in all of the correct directions. However, one of the nice things about doing this on the computer is that we only have to grind through the details once and (in the unlikely event that you get it right the first time) the computer does all of the number crunching thereafter.

Vectors

The vector nature of the problem is easily taken care of, since the locations of the stars are all stored, essentially, as vector quantities. The symbol r(unit vector) in the equation above is a unit vector pointing from one mass to the other mass. Sometimes the law of gravitation is written

F_{1,2}=-G\frac{m_1m_2}{r^2} \circonflex{r_{2,1}

meaning "the force that object 1 exerts on object 2 is equal to the negative of the gravitational constant G times the product of the masses divided by the square of the distance between them times a unit vector pointing from object 2 to object 1".

The numbers G, m_1, m_2, and r are easy to obtain (r is obtained by the three dimensional version of the Pythagorean theorem, and the remaining numbers are stored constants). The only sticky point is that unit vector r(unit vector).

The vector quantity r(vector) pointing from one star to another star is easily computed. Suppose we want three variables r_x, r_y, and r_z to contain the x-, y-, and z-components of the position vector pointing from, say, star number 533 to star number 621. That is accomplished by the following three lines of code:

r_x = x_position(621) - x_position(533) r_y = y_position(621) - y_position(533) r_z = z_position(621) - z_position(533). A unit vector in the direction of r(vector) is easy to create, by dividing each of r_x, r_y, and r_z by the magnitude of r(vector). But hey! We already know how to find the magnitude of r(vector)--that's just r, the distance between the two stars, which we already talked about computing with the Pythagorean theorem. That is, r_magnitude=sqrt( (r_x)^2 + (r_y)^2 + (r_z)^2 ).

Being the clever folks that we are, we'll make sure to reuse the value of r_magnitude rather than computing it twice--in fact, we'll be so clever that we only use it once, raised to the appropriate power.

Now, all of these computations regarding the acceleration due to the force of gravity are being done with a computer, with a finite number of decimal places and hence finite accuracy. These effects must be considered carefully in order to correctly interpret the results of the calculation. A simple example, with which most of us are familiar: divide one by three with a calculator. The answer it gives is an approximation. When we are doing science with a computer, it is important to know how approximate the answers are.

The error introduced by the finitude of the possible accuracy of the computer (known as machine error or roundoff error) is only one of our problems--there are much harder-to-address sources of error, related, for example, to using the Euler method, and, in a more fundamental sense, using a "small" number of stars. In any scientific experiment, it is extremely important to know how much error is expected in the measurement, and of course experiments performed on a computer are no exception.

Error In the Euler Method

Numerical integration is of course an approximate procedure. There are sources of error in every method. Let's look at the error introduced by the Euler method for our particular problem. First, let's review how the method works:
  1. In the beginning, we have the positions and velocities of all of the stars.
  2. Using these positions, we compute the acceleration, due to gravity, of each star
  3. Using the relationship velocity=original velocity + acceleration*time, we compute new velocities.
  4. Using the new velocities and the relationship position = original position + velocity*time we compute new positions.
  5. Return to step 2

What's Wrong With This Picture?

The error introduced by this method (other than machine error) lies in the relationships

velocity=original velocity + acceleration*time

position = original position + velocity*time

used in steps 3 and 4. These relationships are correct only when (for the first one) the acceleration is constant or (for the second) the velocity is constant. Neither of these conditions hold for our problem, since the acceleration is changing as the positions change, and of course the velocity is changing when there is an acceleration.

What Does This Mean?

This means that we are essentially assuming that, for the amount of time we choose to "let pass" between steps, the acceleration and velocity are constant. The smaller we make the time step, the less this assumption hurts us. Lets consider an analogous example.

Suppose we are trying to calculate the area of a circle, but we don't have an accurate value for pi. One thing we could do would be to make an "approximate" circle out of things that we have exact area formulas for (like triangles). So we take a (computational) protractor and mark off, say, 20 degree angles all around the circle, and draw rays from the center out to the circle at each 20 degree increment. We then form (long, thin) triangles by drawing a line segment from each point of intersection with a ray and the circle to (both of) the nearest such points. At this point we have an18 (=360/20) -gon "approximating" our circle. Suppose we can find the areas of these triangles as accurately as the machine allows. Then the error in our estitmate of the area (obtained from summing the areas of these 18 triangles (okay, we would really find one area and multiply it by 18)) would be due to the fact that we assumed that the curved shape of the circle was approximated by a straight line drawn from one particular point to the next. We could improve our estimate by using more triangles, so that the shape of the n-gon was closer to that of the circle.

With the galaxy model, by choosing a smaller time step we let the stars move in straight lines for shorter distances and so reduce the error associated with the method. Of course, as with most things in computational science, giving here requires taking there. We trade, in this case, speed for accuracy. It takes longer to compute, say, the earth' s orbit if we step a day at a time than if we step a week at a time, but in one case we have a 52-gon and the other a 365-gon, which (if we do everything correctly) should give us a better approximation to the actual smooth curve that is the earth's orbit. Of course, if all we want is a general idea of what the orbit looks like, a time step of one week might be very satisfactory. If we are trying to get a sense of how the orbit is perturbed from an ellipse by the action of the other planets in the solar system, it is likely that the error introduced by using a time step that large will be bigger than the effect we are attempting to measure.

Hard Questions

The questions like "how accurate is accurate enough?" "How much will this approximating assumption affect the answer?", etc. are often very difficult, and are basically the life's work of many scientists and mathematicians. A great deal of effort goes into learning how the assumptions that we make affect the conclusions that we reach--in a very real sense, this is the very essence of doing science.


[ Home | GalaxSee Home | Index ]
[ Introduction | Instructional Resources | Simulation Software | GalaxSee Help ]
[ Fractal Modeling Tools | Baroreceptor Modeling | SimSurface | Gnuplot ]
[ The Pit and the Pendulum | Environmental Models | InteGreat ]

Last Update: August 8, 1997
Please direct questions and comments about this page to WebMaster@shodor.org
© Copyright 1997 The Shodor Education Foundation, Inc.