Working with SDSS data

OK, so now that you have a feel for the cluster, let's get some data and work through some quantitative analysis.

Instructions for downloading data
Jupyter notebook to get you started on analysis

Try a few of these things:

(Note: when I say plot "this" versus "that", "this" goes on the y-axis and "that" goes on the x-axis. So "plot color versus magnitude" means that color goes on the y-axis and magnitude goes on the x-axis.)

Plot r mag uncertainty versus r mag. If you want your mags accurate to 10% or better, what is the rough magnitude limit of your analysis?
Plot dec vs ra (so a sky map) of resolved sources (type=3) brighter than r=20. Make another for unresolved sources (type=6) brighter than r=20. Think about the differences.
Plot redshift versus r-band magnitude for galaxies that have a measured redshift and are within 30 arcminutes of the cluster center. Using the plot as a guide, work out a quantitative, statistical estimate of the cluster redshift. If you wanted to define a "spectroscopically confirmed cluster member", how might you do it?

Tip: Look at the redshift plot and decide what range of redshifts defines the cluster. Select the galaxies with redshifts in that range, and calculate the average redshift of those objects.
Plot g-r color versus r magnitude for all resolved sources projected within 1 Mpc of the cluster center -- this is a color-magnitude diagram (CMD) for galaxies.

Tip: Given the redshift you calculated above, use the astropy code in the sample workbook to work out the angular radius (in arcseconds) that encompasses 1 Mpc at the cluster. Then make a selection on galaxies with a radial distance less than that angular radius, and plot the color-magnitude diagram for those galaxies.
Plot the CMD, then overplot in a different color the CMD for spectroscopically confirmed cluster members. Again, restrict it to resolved sources projected within 1 Mpc of the center.

Tip: Do a dual selection: galaxies within the 1 Mpc projected radius AND within the range of redshifts that you defined for the cluster. Then plot the CMD for those galaxies on top of the one you made in the previous step for all objects projected within 1 Mpc.
Identify the bluest spectroscopically confirmed galaxies, find their ra and dec, and then find them using Skyserver's "Navigate" function. What do they look like morphologically? Spectroscopically?

Tip: Define a new column of data for the SDSS data table to hold the g-r color: SDSS['g-r']=SDSS['g']-SDSS['r']. Then do a show_in_browser call on objects within the cluster redshift range. Sort on the g-r column to find the bluest objects and look at their coordinates. Then find them using Navigate.
Identify the most luminous spectroscopically confirmed galaxies and look at them in Navigate. What do they look like morphologically? Spectroscopically?

Tip: Now sort your show_in_browser table on the r-magnitude and look at the coordinates of the brightest object. Then find it using Navigate.
Identify the highest redshift objects in the field (they won't be in the cluster, obviously!) and find them in Navigate. What do they look like, morphologically and spectroscopically?

Tip: This time do a show_in_browser call on all objects with a redshift (whether or not they are in the cluster redshift range), and sort on the redshift to find the coordinates of the highest redshift object.
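
The redshift and dual-selection tips above can be sketched roughly as follows. This is only an illustration on made-up numbers, not the template notebook's code: the arrays redshift and dist_arcsec, the redshift window (z_lo, z_hi), and the value of radius_arcsec are all assumptions. In your own analysis the window comes from your redshift plot, and the 1 Mpc angular radius comes from the astropy calculation in the sample workbook.

```python
import numpy as np

# Made-up stand-in data; in the notebook these would be columns of the SDSS table.
# A redshift of -999 marks an object with no measured redshift.
redshift = np.array([0.021, 0.023, 0.022, 0.150, -999.0, 0.024, 0.310, 0.022])
dist_arcsec = np.array([120.0, 900.0, 300.0, 200.0, 50.0, 2500.0, 100.0, 400.0])

# Assumed values, for illustration only.
z_lo, z_hi = 0.015, 0.030   # redshift window read off the redshift plot
radius_arcsec = 1000.0      # placeholder for the 1 Mpc angular radius

# Objects with a measured redshift that falls inside the cluster window.
has_z = redshift != -999
in_cluster_z = has_z & (redshift > z_lo) & (redshift < z_hi)

# Statistical estimate of the cluster redshift from those objects.
z_members = redshift[in_cluster_z]
z_mean = z_members.mean()
z_scatter = z_members.std(ddof=1)              # sample standard deviation
z_err = z_scatter / np.sqrt(z_members.size)    # standard error of the mean

# Dual selection: inside the redshift window AND inside the projected radius.
confirmed = in_cluster_z & (dist_arcsec < radius_arcsec)

print(f"z = {z_mean:.4f} +/- {z_err:.4f} from {z_members.size} galaxies")
print(f"{confirmed.sum()} confirmed members within the projected radius")
```

The same boolean array (confirmed here) can then be used to index the magnitude and color columns when overplotting the confirmed members on the CMD.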

Feel free to try other things, even if they might seem nonsensical at first. If a pattern shows up, think about it! And remember, there are other source properties available in the SDSS database; you can browse the PhotoObj and SpecObj tables to see what else is there and add them to your download request if you want.

Plotting and Slicing tips

When plotting mags and colors, don't autoscale, or you'll get unreadable plots. For CMDs, for example, reasonable limits on the magnitude range would be r = 13-24, and limits on the color range would be g-r = -1 to +2.

When plotting lots of data points, make the markers small so that crowded points don't hide the rest of the data. Try something like scatter(x,y,s=1).
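
Here is a minimal sketch of both tips together (fixed axis limits and small markers). It uses randomly generated magnitudes and colors rather than real SDSS data, and the variable names r and g_r are just placeholders for your own table columns.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up magnitudes and colors standing in for SDSS['r'] and SDSS['g']-SDSS['r'].
rng = np.random.default_rng(0)
r = rng.uniform(14, 23, 5000)
g_r = rng.normal(0.8, 0.4, 5000)

fig, ax = plt.subplots()
ax.scatter(r, g_r, s=1)   # s=1 keeps crowded regions readable

# Set the limits by hand instead of autoscaling.
ax.set_xlim(13, 24)       # r magnitude range suggested above
ax.set_ylim(-1, 2)        # g-r color range suggested above
ax.set_xlabel("r")
ax.set_ylabel("g-r")
```

In a Jupyter notebook the figure shows up when the cell finishes; in a plain script you would add plt.show() at the end.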

Also, the easiest way to plot subsamples is to set a selection flag like this:

want=(SDSS['g']<20)# for selecting objects with a g mag brighter than 20
or
want=(np.abs(SDSS['redshift']-0.1)<0.05)# for selecting objects in the redshift range 0.05 to 0.15
or
want=(SDSS['modelMagErr_r']<0.2)# for selecting objects with an r magnitude uncertainty less than 0.2
etc.....

followed by, for example,

scatter(SDSS['r'], SDSS['g']-SDSS['r'], s=1)# if you want to plot the whole sample
scatter(SDSS['r'][want], SDSS['g'][want]-SDSS['r'][want], s=20,color='red')# to then overplot the subsample

You can also "stack" selections like this:

want=SDSS['g']<18 # bright
want=np.logical_and(want, SDSS['g']-SDSS['r']>0.7)# red
want=np.logical_and(want, SDSS['redshift'] != -999)# has redshift

which would give you a "want" selection of bright, red galaxies with measured redshifts.
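
If you want to convince yourself that stacking works, here is a small self-contained sketch. The SDSS dictionary below is a made-up stand-in for the real data table (five fake objects), so only the selection logic carries over. It also shows an equivalent one-line form using the & operator, which needs parentheses around each comparison because & binds more tightly than < and >.

```python
import numpy as np

# Made-up stand-in for the SDSS table: five fake objects.
SDSS = {
    "g": np.array([17.5, 17.9, 19.0, 17.0, 16.5]),
    "r": np.array([16.6, 17.5, 18.0, 16.2, 15.7]),
    "redshift": np.array([0.02, 0.03, 0.02, -999.0, 0.025]),
}

# Stacked selection, one cut at a time.
want = SDSS["g"] < 18                                      # bright
want = np.logical_and(want, SDSS["g"] - SDSS["r"] > 0.7)   # red
want = np.logical_and(want, SDSS["redshift"] != -999)      # has redshift

# Equivalent single expression; note the parentheses around each comparison.
same = (SDSS["g"] < 18) & (SDSS["g"] - SDSS["r"] > 0.7) & (SDSS["redshift"] != -999)

print(want)        # boolean flag per object
print(want.sum())  # number of objects that pass all three cuts
```

Printing want.sum() after each cut is a quick sanity check that a selection isn't accidentally empty before you try to plot it.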

Many of these tips are used in the template plotting code, so it's a good place to start.