Most of the numerical SDSS data is stored in the form of FITS files. These files can contain both images and binary data tables in a well-defined format. FITS files can be read and written with many programming languages, but the most common ones used by SDSS are IDL and Python.
The Goddard utilities contain tools for reading and writing FITS files. The most commonly used functions are
Older versions of the idlutils package (up to v5) bundled the Goddard utilities directly with the product. Starting with version
idlutils expects the Goddard library to be installed externally. SDSS provides a forked version of the library in our GitHub organization.
idlutils also contains additional programs for manipulating FITS files.
The astropy.io.fits package handles the reading and writing of FITS files in Python. Because of the general usefulness of the astropy package, this is the recommended Python reader for most FITS files.
Another package is fitsio, developed by Erin Sheldon, which is a Python wrapper around the CFITSIO library. It allows direct access to the columns of a FITS binary table, which can be useful for reading large FITS files, as detailed below. However, fitsio requires that the input files adhere rather strictly to the FITS standard. The package is available from its GitHub repository (esheldon/fitsio).
Large FITS Files
FITS files larger than about 2 GB can be more challenging to read. One such file is the spAll file. The simplest method for reading large FITS files is to use the fitsio Python module described above. The module can read only selected columns from the FITS file:
    import fitsio
    columns = ['PLATE', 'MJD', 'FIBERID', 'Z', 'ZWARNING', 'Z_ERR']
    d = fitsio.read('spAll-v5_10_0.fits', columns=columns)
The astropy.io.fits module has more stringent hardware requirements, because it must read the whole file in order to use it. On a 64-bit machine with more than 4 GB of memory, it is possible to use the memmap option:
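    from astropy.io import fits
    fx = fits.open('spAll-v5_10_0.fits', memmap=True)
    # fits.open() returns a list of HDUs, which has no .data attribute itself;
    # the table is assumed to be in extension 1.
    d = fx[1].data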
In IDL, the routine HOGG_MRDFITS() is available as part of the idlutils package. This routine is similar to fitsio in that one can specify a subset of columns to read. It avoids memory overload by reading only a subset of the rows of a FITS file, extracting the columns, then moving on to the next subset of rows.