There is a "chunk of the Virgo consortium universe"
available for you here.
The data come from a massive simulation of a cube of the universe
measuring 140 Mpc on a side. Details of the simulation and the galaxy
creation can be found at
http://www.mpa-garching.mpg.de/Virgo/data_download.html
The data give the x, y, and z coordinates in Mpc and the star formation
rate in solar masses per year of 8384 simulated galaxies.
We are going to define subsets of galaxies as "late types" (i.e.,
Sb/Sc spirals) and "early types" (ellipticals and S0s) based on their
star formation rates (SFRs). Let's say late types are galaxies with
SFR > 1 Msun/yr, and early types are galaxies with SFR < 0.1 Msun/yr.
(Does this definition make sense?)
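As a minimal sketch of the selection above, assuming the star formation rates have already been read into a NumPy array (the file layout is not described here, so loading is left out), boolean masks are one way to pick out the two subsets. The function name split_by_sfr and the demo values are made up for illustration and are not taken from the Virgo data.

```python
import numpy as np

def split_by_sfr(sfr, late_min=1.0, early_max=0.1):
    """Return boolean masks (late, early) using the SFR cuts from the text."""
    sfr = np.asarray(sfr, dtype=float)
    late = sfr > late_min     # "late types": SFR > 1 Msun/yr
    early = sfr < early_max   # "early types": SFR < 0.1 Msun/yr
    return late, early

# Made-up SFR values, purely for illustration (not from the Virgo catalog).
sfr_demo = np.array([0.01, 0.5, 2.3, 0.09, 1.0, 7.2])
late_demo, early_demo = split_by_sfr(sfr_demo)
```

Galaxies with SFR between 0.1 and 1 Msun/yr fall in neither mask, so the two subsets do not add up to the full sample. The same masks can be applied to the x, y, and z arrays to get the positions of each subset.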
- Plot up x vs y for all galaxies to give a feel for what the
data look like. Then do the same thing just for early types (E's) and
just for late types (S's). Describe any differences and explain whether
or not they make sense.
- Now calculate the two point correlation function for all the
galaxies (see below for how to do this). Do this for separations
between 2 and 20 Mpc, then fit a power law and see what slope you get.
Describe qualitatively what this is telling you about how galaxies are
distributed.
- Now do the same thing, looking at the 2 point correlation
function for spirals only, and again for ellipticals only. Plot 'em up,
fit a slope, and describe the differences, i.e., what does this tell you
about the clustering of different types of galaxies in the universe,
and does this make sense qualitatively based on what you know about
galaxies?
- Compare your values for the clustering properties of
different galaxy sets (all, early, late) to those published in the
literature.
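For the power-law fits asked for above, one common approach (an assumption here, not something the assignment prescribes) is a straight-line least-squares fit in log-log space, since xi(r) = A * r^slope becomes log xi = log A + slope * log r. The sketch below uses a hypothetical helper fit_power_law and a synthetic power law with an arbitrary slope, just to show that the fit recovers what was put in.

```python
import numpy as np

def fit_power_law(r, xi):
    """Fit xi(r) = A * r**slope by least squares in log-log space."""
    r = np.asarray(r, dtype=float)
    xi = np.asarray(xi, dtype=float)
    good = (r > 0) & (xi > 0)  # logs need positive values
    slope, intercept = np.polyfit(np.log10(r[good]), np.log10(xi[good]), 1)
    return slope, 10.0 ** intercept

# Synthetic input with an arbitrary slope and amplitude (not real results).
r_demo = np.linspace(2.0, 20.0, 10)
xi_demo = 30.0 * r_demo ** -1.5
slope_demo, amp_demo = fit_power_law(r_demo, xi_demo)
```

Bins where xi(r) comes out zero or negative are dropped before taking logs, which can bias the fit if many bins are noisy, so it is worth checking how many bins survive.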
Calculating the 2pt correlation function:
Remember the 2ptcf describes the probability of finding two galaxies
separated by a distance r, over and above that expected from a random
distribution.
The simplest estimator for this is given by
1 + xi(r) = DD(r)/RR(r)
where xi(r) is the 2ptcf, DD(r) is the number of pairs in the dataset
with separation r, and RR(r) is the number of pairs with separation r
that you'd expect just from random points. Note that this expression
assumes equal total numbers of data points and random points.
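A minimal sketch of this estimator, under some stated assumptions: positions are held in an (N, 3) NumPy array, the coordinates run from 0 to the box size (140 Mpc here), the random catalog is drawn uniformly in the same box with the same number of points, and periodic boundaries are ignored. The function names pair_counts and xi_simple are made up for this example, and the demo uses uniform random points rather than the Virgo data.

```python
import numpy as np

def pair_counts(points, bin_edges):
    """Histogram the separations of all unique pairs in an (N, 3) array."""
    points = np.asarray(points, dtype=float)
    counts = np.zeros(len(bin_edges) - 1, dtype=np.int64)
    for i in range(len(points) - 1):
        # Distances from point i to every later point (each pair counted once).
        d = np.sqrt(((points[i + 1:] - points[i]) ** 2).sum(axis=1))
        counts += np.histogram(d, bins=bin_edges)[0]
    return counts

def xi_simple(data, bin_edges, box_size, seed=0):
    """Estimate xi(r) = DD(r)/RR(r) - 1 with as many randoms as data points."""
    rng = np.random.default_rng(seed)
    # Assumption: data coordinates span [0, box_size] along each axis.
    randoms = rng.uniform(0.0, box_size, size=np.shape(data))
    dd = pair_counts(data, bin_edges)
    rr = pair_counts(randoms, bin_edges)
    return dd / rr - 1.0  # an empty RR bin would give inf or nan

# Demo on unclustered points: xi should scatter around zero in every bin.
edges_demo = np.linspace(2.0, 20.0, 4)
data_demo = np.random.default_rng(1).uniform(0.0, 140.0, size=(1000, 3))
xi_demo = xi_simple(data_demo, edges_demo, box_size=140.0)
```

The loop costs on the order of N^2 distance evaluations, which is still manageable for 8384 galaxies. Averaging RR over several random catalogs is a simple way to cut down the noise in the denominator.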