APOGEE Visit Spectra Reduction

The first stage of the APOGEE data reduction pipeline (apred) reduces the raw spectra of consecutive, spectrally-dithered exposures of one visit (of a particular plate on a given night) and extracts the individual spectra of each of the objects on a plate. The other steps of the reduction process include dark subtraction, flat-fielding, wavelength and flux calibration, removal of sky emission and absorption within the Earth’s atmosphere, and combination of the individual spectrally-dithered exposures into a single spectrum for each object. At the visit reduction level, the pipeline also provides an initial estimate of the radial velocity of each object. A more in-depth description of the visit-level reduction process follows below. See Nidever et al. (2015) for further technical details.

APOGEE Visit Data Reduction

The apred sequence consists of three sequential steps:

1. Extract 2-dimensional images from the 3-dimensional raw data cubes and apply the basic calibration steps of dark subtraction and flat fielding.
2. Extract and calibrate 1-dimensional spectra from the 2-dimensional images and attach a wavelength calibration.
3. From the 1-dimensional spectra, measure the dither shifts between the individual exposures, subtract sky from each fiber, correct for telluric absorption in each fiber, combine the dithered exposures into a single well-sampled visit spectrum, perform flux calibration, and obtain an initial radial velocity estimate for the spectrum.


For each readout of each exposure, the raw data are first corrected for bias variations in the IR detectors and electronics. This is accomplished by using a reference array of pixels that are generated by the readout electronics, as well as a set of reference pixels around the edge of each detector.

Each individual readout is then corrected for a dark current contribution, by subtracting a calibration dark current frame made from a combination of multiple individual dark frames.

The data are then collapsed from the 3D data cubes into 2D images. This is done on a pixel-by-pixel basis. At the most basic level, a linear function is fit to the series of up-the-ramp readouts for each pixel to determine the best-fitting slope. A linear function is used to fit all exposures, even if conditions vary throughout the exposure. The best-fitting linear slope, multiplied by the total exposure time, is taken to be the flux at this pixel location for the exposure.
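The slope fit for a single pixel can be sketched as follows. This is a minimal illustration, not the ap3d implementation: the function name `ramp_flux` and the parameter `read_time` are hypothetical, the readouts are assumed to be equally spaced in time, and the total exposure time is assumed to be the interval between the first and last readout.

```python
def ramp_flux(reads, read_time):
    """Fit a straight line to up-the-ramp counts and return slope * exposure time.

    reads     -- counts in one pixel for each successive readout
    read_time -- time between readouts in seconds (hypothetical parameter)
    """
    n = len(reads)
    times = [i * read_time for i in range(n)]
    t_mean = sum(times) / n
    c_mean = sum(reads) / n
    # Ordinary least-squares slope of counts versus time.
    num = sum((t - t_mean) * (c - c_mean) for t, c in zip(times, reads))
    den = sum((t - t_mean) ** 2 for t in times)
    slope = num / den
    # Assumption: exposure time is the span from the first to the last readout.
    exposure_time = times[-1] - times[0]
    return slope * exposure_time
```

For a perfectly linear ramp the result equals the difference between the last and first readout; the benefit of the fit is that read noise in the individual readouts averages down.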

The up-the-ramp sampling allows for the recognition of cosmic ray events during the course of the exposure, which appear as significant jumps in the rate of charge accumulation in the up-the-ramp sampling. The ap3d software attempts to recognize these events and flag the affected pixels.
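One simple way to recognize such jumps is to look for outliers in the differences between consecutive readouts. The sketch below is an illustration of that idea only; the source does not describe the detection criterion used by ap3d, and the name `find_jumps`, the thresholds `nsigma` and `min_sigma`, and the use of the median absolute deviation are all assumptions.

```python
from statistics import median

def find_jumps(reads, nsigma=10.0, min_sigma=1.0):
    """Return indices of readouts that follow an abnormally large count jump.

    nsigma and min_sigma are illustrative thresholds, not pipeline values.
    """
    # Count increment between each pair of consecutive readouts.
    diffs = [b - a for a, b in zip(reads, reads[1:])]
    med = median(diffs)
    # Robust scatter estimate from the median absolute deviation.
    mad = median([abs(d - med) for d in diffs])
    sigma = max(1.4826 * mad, min_sigma)
    # A jump between readout i and i+1 is reported as index i+1.
    return [i + 1 for i, d in enumerate(diffs) if abs(d - med) > nsigma * sigma]
```

Using the median rather than the mean keeps the baseline accumulation rate from being biased by the jump itself.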

The 2D images are then corrected for variations in pixel-to-pixel response by dividing them by a calibration flat field, which is constructed from an average of multiple exposure frames illuminated by a flat light source within the spectrograph.
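The flat-field step can be sketched as below. The names `build_flat` and `apply_flat` are hypothetical, normalizing the flat to a median of one is an assumption (the source only says the flat is an average of multiple frames), and the `min_response` cut used to flag unresponsive pixels is likewise illustrative.

```python
from statistics import median

def build_flat(frames):
    """Average several 2D flat exposures pixel by pixel, then normalize to median 1."""
    nrow, ncol = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[r][c] for f in frames) / len(frames) for c in range(ncol)]
            for r in range(nrow)]
    norm = median([v for row in mean for v in row])
    return [[v / norm for v in row] for row in mean]

def apply_flat(image, flat, min_response=0.01):
    """Divide an image by the flat; pixels with very low response become NaN."""
    nan = float("nan")
    return [[(pix / resp) if resp > min_response else nan
             for pix, resp in zip(img_row, flat_row)]
            for img_row, flat_row in zip(image, flat)]
```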

For DR13, we have also made an attempt to correct for some of the persistence that affects about a third of the “blue” detector, and a smaller fraction of the “green” detector. However, this correction is only partial in nature and does not completely remove the persistence issues.


The ap2d routine takes the calibrated 2D images and extracts individual 1D spectra for each exposure. This is accomplished by modelling the distribution of the light from each fiber as a function of wavelength. The flux from all 300 fibers is fit simultaneously in order to account for contributions of the wings of the light distribution from each fiber into the distribution of the two adjacent spectra. The profiles for each fiber are derived from a calibration frame taken through the telescope immediately after the exposure sequence on each plate. The shape and magnitude of the contribution of light from the wings of the fiber into the adjacent fibers is estimated using a library of calibration observations where only every sixth fiber is illuminated.
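The idea of fitting all fibers simultaneously can be illustrated for a single detector column: if the spatial profile of each fiber is known, the observed column is a linear combination of the profiles, and the fluxes follow from a linear least-squares solve. The sketch below makes several assumptions that are not in the source: Gaussian profiles stand in for the empirically measured ones, only three fibers are used, and the function names are hypothetical.

```python
import math

def gaussian_profile(npix, center, sigma):
    """Unit-area Gaussian profile (a stand-in for a measured fiber profile)."""
    prof = [math.exp(-0.5 * ((y - center) / sigma) ** 2) for y in range(npix)]
    total = sum(prof)
    return [p / total for p in prof]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def extract_column(data, profiles):
    """Least-squares flux of every fiber in one column, fit simultaneously."""
    n = len(profiles)
    # Normal equations: the off-diagonal terms carry the overlap of profile wings.
    a = [[sum(pj * pk for pj, pk in zip(profiles[j], profiles[k]))
          for k in range(n)] for j in range(n)]
    b = [sum(p * d for p, d in zip(profiles[j], data)) for j in range(n)]
    return solve(a, b)
```

Because the overlap terms are part of the system being solved, light that spills from one fiber into its neighbors is assigned back to the correct fiber rather than being counted twice.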

After the 1D spectra are extracted, a wavelength calibration is applied, as determined from observations of arc calibration lamps. Because the APOGEE spectrograph is in a gravitationally-fixed orientation and is kept at a stable vacuum and temperature, the form of this wavelength correction is very stable, and a single wavelength calibration is adopted to determine the non-linear terms in the conversion between pixel location and wavelength. Note that the wavelength scale for each fiber is slightly different because of the different locations of the fibers in the pseudo-slit.

The wavelength calibration of the APOGEE data is done using vacuum wavelengths.  However, the wavelengths of atomic transitions are usually quoted at standard temperature and pressure (S.T.P.); this is how the CRC Handbook of Chemistry and Physics lists them for transitions redward of 2000 Ångstroms. Thus, recognizing spectral lines associated with specific atomic transitions may require converting the SDSS data to the equivalent values at S.T.P.  For APOGEE data, we have used the conversion from Ciddor (Applied Optics, Vol 35, p 1566, 1996) to convert between vacuum and air wavelengths. For a vacuum wavelength (VAC) in Ångstroms, convert to air wavelength (AIR) using the equation:

AIR = VAC / (1.0 + 5.792105E-2/(238.0185E0 - (1.E4/VAC)^2) + 1.67917E-3/(57.362E0 - (1.E4/VAC)^2))
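The conversion can be written as a short function. The coefficients are those of the equation above; the function name `vac_to_air` is a hypothetical choice, and the input is assumed to be in Ångstroms.

```python
def vac_to_air(vac):
    """Convert a vacuum wavelength in Angstroms to an air wavelength.

    Uses the coefficients quoted in the text (Ciddor 1996).
    """
    # Squared wavenumber in inverse microns.
    sigma2 = (1.0e4 / vac) ** 2
    # Refractive index of air at this wavelength.
    n_air = (1.0
             + 5.792105e-2 / (238.0185 - sigma2)
             + 1.67917e-3 / (57.362 - sigma2))
    return vac / n_air
```

In the APOGEE wavelength region the refractive index exceeds unity by a few parts in ten thousand, so air wavelengths come out a few Ångstroms shorter than the vacuum values.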

There are small linear shifts in the wavelength scale between different exposures, which result from (i) the intentional dithering of the detectors between exposures to allow for well-sampled combined images, and (ii) a small, slowly varying flexure in the instrument optical bench as the liquid nitrogen tank depletes over time (a larger “reset” shift occurs when this tank is filled, but this is always done during the daytime). The linear shifts are measured using prominent night sky emission lines that appear in every spectrum, and these shifts are applied to the wavelength solution.


The first stage in ap1dvisit determines to high accuracy the linear shifts between the exposures in a visit that result from the dithering of the detectors. By cross-correlating the different exposures with each other, these shifts can be measured more accurately than the wavelength zeropoint determined from the sky lines.
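A sub-pixel shift between two exposures can be estimated by cross-correlating them at integer lags and interpolating the peak. The sketch below illustrates that general technique under stated assumptions: the parabolic peak interpolation, the `max_lag` search window, and the function names are illustrative choices rather than what ap1dvisit does, and no continuum removal or masking is attempted.

```python
import math

def cross_correlate(a, b, lag):
    """Sum of a[i] * b[i + lag] over the overlapping pixels."""
    n = len(a)
    return sum(a[i] * b[i + lag] for i in range(max(0, -lag), min(n, n - lag)))

def dither_shift(a, b, max_lag=5):
    """Shift of b relative to a in pixels (positive if b lies at higher pixels)."""
    lags = list(range(-max_lag, max_lag + 1))
    cc = [cross_correlate(a, b, lag) for lag in lags]
    k = max(range(len(cc)), key=cc.__getitem__)
    # No neighbors on both sides: fall back to the integer lag.
    if k == 0 or k == len(cc) - 1:
        return float(lags[k])
    left, mid, right = cc[k - 1], cc[k], cc[k + 1]
    denom = left - 2.0 * mid + right
    if denom == 0.0:
        return float(lags[k])
    # Vertex of the parabola through the three points around the peak.
    return lags[k] + 0.5 * (left - right) / denom
```

As a usage example, two copies of the same Gaussian line whose centers differ by half a pixel yield a shift close to 0.5.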

Each fiber of each exposure is then corrected for the contribution of night sky emission. The IR portion of the spectrum includes significant numbers of very bright OH emission lines. There can also be some continuum sky contribution, especially when there is significant moonlight (and even more so when thin clouds are present). Sky subtraction is accomplished using 35 sky fibers that are distributed across each plug plate. Multiple fibers are used because the IR sky can vary spatially. For each object, the sky is estimated from the nearest four sky fibers. However, as the wavelength scale is not identical for each fiber, the sky spectra need to be shifted slightly before they can be subtracted. Also, because the line profiles differ slightly from fiber to fiber, there are small differences that lead to imperfect sky subtraction, in particular of the bright night sky lines. Since the subtraction of the bright night sky lines is non-ideal, there are small regions surrounding each line that are rendered useless for science. This is an area for potential improvement in the pipeline, but we note that even with perfect sky modelling, the signal-to-noise under bright sky lines would be substantially degraded compared with the surrounding spectrum.
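The selection of the nearest four sky fibers can be sketched as follows. This is an illustration only: the source does not say how the four sky spectra are combined, so the unweighted mean used here is an assumption, as are the function names and the plate-coordinate representation. The sky spectra are also assumed to have been shifted onto the object fiber's wavelength scale already.

```python
import math

def nearest_sky_mean(obj_xy, sky_fibers, n_nearest=4):
    """Mean spectrum of the sky fibers closest to the object on the plate.

    sky_fibers -- list of ((x, y), spectrum) tuples, positions in plate units
    """
    ranked = sorted(sky_fibers, key=lambda fiber: math.dist(obj_xy, fiber[0]))
    chosen = [spectrum for _, spectrum in ranked[:n_nearest]]
    # Assumption: a plain average of the selected sky spectra.
    return [sum(values) / len(chosen) for values in zip(*chosen)]

def subtract_sky(obj_spectrum, sky_spectrum):
    """Pixel-by-pixel sky subtraction on a common wavelength scale."""
    return [o - s for o, s in zip(obj_spectrum, sky_spectrum)]
```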

The Earth’s atmosphere also leads to significant absorption in the observed spectra, which arises from CO2, H2O, and CH4 bands in the APOGEE spectral window. A correction for this telluric absorption is derived from observations of 35 “telluric” standards spread across the plate. These stars are chosen by their intrinsic color, with the goal of targeting hot stars that exhibit relatively few spectral features in the APOGEE wavelength region. Multiple telluric stars are chosen because the absorption can vary across the field of view. For each telluric standard, the amplitudes of the absorption for the separate families of CO2, H2O, and CH4 bands are estimated by fitting model absorption spectra to the observed spectrum. A surface is fit to these scaling factors, and this surface is used to predict the appropriate scale factors to be used for each individual fiber. The individual fiber scaling factors, together with model telluric spectra that are convolved with the fiber-specific line spread function, are used to correct each individual spectrum. Significant improvement was made to the telluric correction for DR13 (as compared with DR12), but there are still some cases where the correction is imperfect.

After sky correction, pairs of dithered frames are combined to produce well-sampled images. All of the different pairs are then combined to produce a single spectrum of each object for the visit.
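The reason a dithered pair yields a well-sampled spectrum can be seen in an idealized case: if the second exposure is offset by exactly half a pixel, interleaving the two gives a spectrum with twice the sampling. The sketch below assumes that exact half-pixel offset and a plain average over pairs; both are simplifications introduced here, since real dither shifts are measured rather than exact and therefore require resampling. The function names are hypothetical.

```python
def interleave(exp_a, exp_b):
    """Interleave two exposures; exp_b is assumed offset by exactly +0.5 pixel."""
    combined = []
    for a, b in zip(exp_a, exp_b):
        combined.extend((a, b))
    return combined

def combine_pairs(pairs):
    """Average the interleaved spectra from all dither pairs of a visit."""
    stacked = [interleave(a, b) for a, b in pairs]
    return [sum(values) / len(stacked) for values in zip(*stacked)]
```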

The final visit spectra are then approximately flux calibrated. The relative flux calibration is performed using a calibration frame that computes the instrument spectral response, as determined from an observation of a blackbody source. The absolute level of the spectrum is then determined using a scaling based on the object’s catalog H-band magnitude. We note that the subsequent pipeline for the analysis of stellar parameters and abundances (ASPCAP) normalizes the spectra to a pseudo-continuum, so the flux calibration done here is not critical.

Finally, an initial radial velocity (RV) estimate is made by cross-correlating each visit spectrum with a grid of synthetic spectra. The best-matching synthetic spectrum serves as a template, and the derived shift between the observed spectrum and this template provides an initial RV estimate. Note that this estimate is later refined using multiple visits to the same object, because these provide a higher signal-to-noise spectrum.
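A template cross-correlation of this kind can be sketched as follows. The sketch assumes that the observed spectrum and all templates share a grid that is uniform in log10(wavelength), so that a pixel lag maps to a single velocity; the grid step `dlog10`, the integer-lag search, and the function names are illustrative assumptions, not the pipeline's actual settings.

```python
import math

C_KMS = 299792.458  # speed of light in km/s

def xcorr_peak(obs, tmpl, max_lag):
    """Return (best_lag, peak_value) of the normalized cross-correlation."""
    def centered(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]
    o, t = centered(obs), centered(tmpl)
    norm = math.sqrt(sum(v * v for v in o) * sum(v * v for v in t))
    n = len(o)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Template pixel i is compared with observed pixel i + lag.
        val = sum(t[i] * o[i + lag]
                  for i in range(max(0, -lag), min(n, n - lag))) / norm
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val

def initial_rv(obs, templates, dlog10, max_lag=20):
    """Pick the best-matching template and convert its lag to a velocity in km/s."""
    peaks = [xcorr_peak(obs, tmpl, max_lag) for tmpl in templates]
    idx = max(range(len(templates)), key=lambda k: peaks[k][1])
    lag = peaks[idx][0]
    # On a log10 grid, a lag of N pixels is a wavelength ratio of 10**(N * dlog10).
    return idx, C_KMS * (10.0 ** (lag * dlog10) - 1.0)
```

A positive lag means the observed features sit at longer wavelengths than in the template, which corresponds to a positive (receding) velocity.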

Output visit spectra: apVisit files

The final dither-combined spectra from a given visit are written into individual apVisit files, as described in detail in the apVisit data model. See the documentation on APOGEE data for more information on how to retrieve these.

Multiple visit spectra of the same object are combined in the next stage of the pipeline, visit combination.