APOGEE Visit Spectra Reduction

This page gives a brief description of the first stage of the APOGEE Data Reduction Pipeline (apred), which reduces the raw spectra of consecutive, spectrally-dithered exposures of one visit of a particular plate on a given night. Essentially, the apred pipeline stage extracts the individual visit spectra for each of the objects targeted on a plate. See Nidever et al. (2015), the publication that describes the Data Reduction Pipeline, for further details.

APOGEE Visit Data Reduction

Data taken from the northern spectrograph are reduced in the same way as data taken from the southern spectrograph unless otherwise noted.

The apred sequence consists of three sequential steps:

1. Extract 2-dimensional images from the 3-dimensional raw data cubes and apply the basic calibration steps of dark subtraction and flat fielding (ap3d).
2. Extract and calibrate 1-dimensional spectra from the 2-dimensional images and attach a wavelength calibration (ap2d).
3. From the 1-dimensional spectra, measure the dither shifts between the individual exposures, subtract sky from each fiber, correct for telluric absorption in each fiber, combine the dithered exposures into a single well-sampled visit spectrum, perform flux calibration, and obtain an initial radial velocity estimate for the spectrum (ap1dvisit).


For each readout of each exposure, the raw data are first corrected for bias variations in the IR detectors and electronics. This is accomplished by using a reference array of pixels that is generated by the readout electronics, as well as a set of reference pixels around the edge of each detector.

Each individual readout is then corrected for a dark current contribution, by subtracting a calibration dark current frame made from a combination of multiple individual dark frames.

The data are then collapsed from the 3D data cubes into 2D images, which is done on a pixel-by-pixel basis. At the most basic level, a linear function is fit to the series of up-the-ramp readouts for each pixel to determine the best-fitting slope. A linear function is used to fit all exposures, even if conditions vary throughout the exposure. The best-fitting linear slope, multiplied by the total exposure time, is taken to be the flux at this pixel location for the exposure.
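The slope fit described above can be illustrated with a minimal sketch. It assumes equally spaced reads and ignores noise weighting, saturation, and cosmic rays; the function name `ramp_flux` and its arguments are hypothetical and are not part of ap3d.

```python
import numpy as np

def ramp_flux(cube, read_time):
    """Fit a straight line to each pixel's up-the-ramp reads.

    cube      : array of shape (nreads, ny, nx), accumulated counts per read
    read_time : time between consecutive reads in seconds (assumed constant)
    Returns a 2D image of slope * total exposure time.
    """
    nreads = cube.shape[0]
    t = np.arange(nreads) * read_time           # time of each read
    dt = (t - t.mean())[:, None, None]          # broadcast over the image
    # Closed-form least-squares slope, evaluated for all pixels at once.
    slope = (dt * (cube - cube.mean(axis=0))).sum(axis=0) / (dt ** 2).sum()
    total_time = t[-1] - t[0]                   # first read to last read
    return slope * total_time
```

Because the slope is fit to all reads, a constant offset in the counts (such as a residual bias level) does not change the result.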

The up-the-ramp sampling allows for the recognition of cosmic ray events during the exposure. Cosmic ray events appear as significant jumps in the rate of charge accumulation within the series of data points in up-the-ramp sampling. The ap3d software attempts to recognize these events using this signature and then flags the affected pixels.
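As an illustration of how such jumps can be recognized, the sketch below flags reads at which a single pixel's count rate rises far above its typical read-to-read increment. The threshold `nsigma`, the robust scatter estimate, and the function name `flag_jumps` are assumptions made for this example; the criteria actually used by ap3d may differ.

```python
import numpy as np

def flag_jumps(ramp, nsigma=10.0):
    """Return indices of reads where one pixel's count rate jumps sharply.

    ramp   : 1D array of accumulated counts for a single pixel
    nsigma : hypothetical threshold in units of the robust scatter
    """
    diffs = np.diff(ramp)                       # counts gained between reads
    med = np.median(diffs)
    # Robust scatter from the median absolute deviation (MAD).
    mad = np.median(np.abs(diffs - med))
    sigma = 1.4826 * mad if mad > 0 else 1.0    # fall back to 1 count
    # +1 so that each index points at the first read after the jump.
    return np.flatnonzero(diffs - med > nsigma * sigma) + 1
```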

The 2D images are then corrected for variations in pixel-to-pixel response by dividing them by a calibration flat field. The calibration flat field is an average of multiple exposure frames illuminated by a flat light source within the spectrograph.
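A minimal sketch of the flat-field step follows. Normalizing the averaged flat to unit median and masking pixels below a minimum response are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def make_flat(flat_frames):
    """Average flat exposures and normalize to unit median response."""
    flat = np.mean(flat_frames, axis=0)
    return flat / np.median(flat)

def apply_flat(image, flat, min_response=0.1):
    """Divide by the flat; pixels with very low response become NaN."""
    out = np.full(image.shape, np.nan)
    good = flat > min_response
    out[good] = image[good] / flat[good]
    return out
```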

Attempts have been made to correct for some of the persistence in the Northern spectrograph, which affects a third of the "blue" detector and a smaller fraction of the "green" detector. Based on an analysis of illuminated frames followed by a series of long dark frames, a double-exponential fit for the amplitude of the persistence was derived for all pixels. This correction, described in detail in Holtzman et al. (2018), depends only on the exposure level and elapsed time, and was applied only to the "blue" detector. The correction is only partial and does not wholly remove the persistence. Therefore, during the visit combination step, visits that have been significantly affected by persistence are down-weighted. Note that the persistence effects in the Southern spectrograph are minimal.


The ap2d routine takes the calibrated 2D images and extracts individual 1D spectra for each exposure by modeling the distribution of the light from each fiber as a function of wavelength. The wings of the light distribution from each fiber can affect its two adjacent spectra; thus, the flux from all 300 fibers is fit simultaneously. The profiles for each fiber are derived from a calibration frame taken through the telescope immediately after the exposure sequence. The shape and magnitude of the contribution of light from the wings of the fiber into the adjacent fibers are estimated using a library of calibration observations where only every sixth fiber is illuminated.
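The idea of fitting all fibers simultaneously can be sketched for a single detector column as a linear least-squares problem: the observed cut along the spatial direction is modeled as a sum of per-fiber profiles, each scaled by that fiber's flux. This is an unweighted illustration under the assumption that the profiles are already known; `extract_column` is a hypothetical name and not the ap2d implementation.

```python
import numpy as np

def extract_column(column, profiles):
    """Solve for all fiber fluxes in one detector column at once.

    column   : 1D array of counts along the spatial direction
    profiles : array (npix, nfibers); column j is the unit-flux spatial
               profile of fiber j, including its wings
    """
    fluxes, *_ = np.linalg.lstsq(profiles, column, rcond=None)
    return fluxes
```

Solving for every fiber together lets the fit assign light in the overlapping wings to the correct fiber, which a fiber-by-fiber sum over a fixed aperture cannot do.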

After the 1D spectra are extracted, a wavelength calibration is applied, which is determined from observations of arc calibration lamps. Because the APOGEE spectrograph is in a gravitationally-fixed orientation and is kept at a stable vacuum and temperature, the form of this wavelength calibration is very robust. Thus, a single wavelength calibration is adopted to determine the non-linear terms in the conversion between pixel location and wavelength. Note that the wavelength scale for each fiber is slightly different because of the distinct locations of the fibers in the pseudo-slit.

The wavelength calibration of the APOGEE data is done in vacuum wavelengths. However, the wavelengths of atomic transitions are usually quoted at standard temperature and pressure (S.T.P.); this is how the CRC Handbook of Chemistry and Physics lists them for transitions redward of 2000 Ångstroms. Thus, recognizing spectral lines associated with specific atomic transitions may require converting the SDSS vacuum wavelengths to the equivalent values at S.T.P. For APOGEE data, we have used the relation from Ciddor (Applied Optics, Vol 35, p 1566, 1996) to convert between vacuum and air wavelengths. For a vacuum wavelength (VAC) in Ångstroms, the air wavelength (AIR) is given by:

AIR = VAC / (1.0 + 5.792105E-2/(238.0185E0 - (1.E4/VAC)^2) + 1.67917E-3/(57.362E0 - (1.E4/VAC)^2))
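The conversion can be written as a short function. The coefficients are those quoted in the equation above; the function name `vac_to_air` is chosen here for illustration.

```python
def vac_to_air(vac):
    """Convert a vacuum wavelength in Angstroms to an air wavelength."""
    sigma2 = (1.0e4 / vac) ** 2                 # (1 / wavelength in microns)^2
    # Refractive index of air from the two-term formula quoted in the text.
    n = (1.0
         + 5.792105e-2 / (238.0185 - sigma2)
         + 1.67917e-3 / (57.362 - sigma2))
    return vac / n
```

In the APOGEE wavelength range the air wavelength is roughly 4.4 Ångstroms shorter than the vacuum wavelength; for example, `vac_to_air(16000.0)` is close to 15995.63.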

There are small linear shifts in the wavelength scale between different exposures. These result from two sources: (i) the intentional dithering of the detectors between exposures to allow for well-sampled combined images, and (ii) a small, slowly varying flexure in the instrument optical bench. The flexure in the optical bench occurs as the liquid nitrogen tank depletes over time (a larger "reset" shift occurs when this tank is filled, but this is always done during the day). The linear shifts are measured using prominent night sky emission lines that appear in every spectrum, and these shifts are applied to the wavelength solution.


The first stage in ap1dvisit determines the linear shift between each exposure within a visit; these shifts result from the dithering of the detectors. A linear shift can be determined to higher precision than a direct measurement of the wavelength zero point (e.g., determined from the skylines by cross-correlating the different exposures with each other).
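A relative shift of this kind can be estimated by cross-correlation, as in the sketch below. It correlates two 1D spectra over a small range of integer lags and refines the peak with a parabola to reach sub-pixel precision. The function name, the lag range, and the parabolic refinement are assumptions made for this example rather than a description of the ap1dvisit code.

```python
import numpy as np

def measure_shift(ref, spec, max_lag=5):
    """Estimate the pixel shift of `spec` relative to `ref`.

    Positive values mean features in `spec` sit at larger pixel indices.
    """
    a = ref - ref.mean()
    b = spec - spec.mean()
    n = len(a)
    lags = np.arange(-max_lag, max_lag + 1)
    # Cross-correlation over a fixed central window of the reference.
    cc = np.array([
        np.sum(a[max_lag:n - max_lag] * b[max_lag + lag:n - max_lag + lag])
        for lag in lags
    ])
    k = int(np.argmax(cc))
    if k == 0 or k == len(cc) - 1:
        return float(lags[k])                   # peak at edge: no refinement
    # Parabolic interpolation through the peak and its two neighbours.
    y0, y1, y2 = cc[k - 1], cc[k], cc[k + 1]
    return lags[k] + 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)
```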

Each fiber of each exposure is then corrected for the contribution of night sky emission. The IR portion of the spectrum includes a significant number of very bright OH emission lines. There can also be some continuum sky contribution, especially when there is substantial moonlight or when thin clouds are present. Sky subtraction is accomplished using sky fibers that are distributed across each plug plate. Multiple fibers are used to take into account variations in the IR sky. For each object, the sky is estimated from the nearest four sky fibers. However, as the wavelength scale is not identical for each fiber, the sky spectra need to be shifted before they can be subtracted. Also, because the line profiles differ slightly from fiber to fiber, there are small differences that lead to imperfect sky subtraction, particularly for the brightest night skylines. Because the sky subtraction for the bright night skylines is non-ideal, there are small regions of the spectra that are effectively rendered useless for science surrounding each sky feature. Sky removal remains an area for improvement in the pipeline. We note, however, that even with perfect sky modeling, the signal-to-noise under bright skylines would be substantially degraded compared with the surrounding spectrum.
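The basic procedure can be sketched as follows: select the four sky fibers nearest to the object on the plate, resample each onto the object's wavelength scale, average them, and subtract. The unweighted mean, the linear interpolation, and all names in this example are simplifying assumptions; the pipeline's treatment of fiber-to-fiber line profile differences is not modeled here.

```python
import numpy as np

def subtract_sky(obj_wave, obj_flux, obj_xy, sky_waves, sky_fluxes, sky_xy, nsky=4):
    """Subtract the mean of the nearest sky fibers from one object spectrum.

    obj_xy, sky_xy        : plate positions, shapes (2,) and (nskyfib, 2)
    sky_waves, sky_fluxes : arrays of shape (nskyfib, npix)
    """
    dist = np.hypot(sky_xy[:, 0] - obj_xy[0], sky_xy[:, 1] - obj_xy[1])
    nearest = np.argsort(dist)[:nsky]
    # Each fiber has its own wavelength scale, so resample before averaging.
    resampled = [np.interp(obj_wave, sky_waves[i], sky_fluxes[i]) for i in nearest]
    sky = np.mean(resampled, axis=0)
    return obj_flux - sky
```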

The Earth's atmosphere also leads to significant absorption in the observed spectra, which arises from CO2, H2O, and CH4 bands in the APOGEE spectral window. A correction for this telluric absorption is derived from observations of "telluric" standards spread across the plate. The goal is to target hot stars that exhibit relatively few spectral features in the APOGEE wavelength region, which is accomplished by selecting stars based on their intrinsic color. Multiple telluric stars are chosen for each plate because the absorption can vary across the field of view. For each telluric standard, the amplitudes of the absorption from the separate families of CO2, H2O, and CH4 bands are estimated by fitting model absorption spectra to the observed spectrum. A surface is fit to these scaling factors, and this surface is used to predict the appropriate scale factors for each fiber. The individual-fiber scaling factors, together with model telluric spectra that have been convolved with the fiber-specific line spread function, are used to correct each individual science spectrum. Significant improvements have been made to the telluric correction over time, but there are still some cases where the correction remains imperfect.
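The two fitting steps can be sketched for a single absorbing species. The example assumes that the observed transmission is the model transmission raised to a scale factor, and that the scale factors vary across the plate as a plane. Both assumptions, and all function names, are illustrative only; the pipeline's actual parameterization and surface may differ.

```python
import numpy as np

def fit_scale(observed, model):
    """Least-squares exponent a such that observed ~ model**a.

    Both inputs are continuum-normalized transmissions in (0, 1].
    """
    ln_obs, ln_mod = np.log(observed), np.log(model)
    return np.sum(ln_obs * ln_mod) / np.sum(ln_mod ** 2)

def fit_plane(x, y, scales):
    """Fit scales = c0 + c1*x + c2*y across the telluric standards."""
    design = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(design, scales, rcond=None)
    return coeffs

def predict_scale(coeffs, x, y):
    """Evaluate the fitted plane at a science fiber's plate position."""
    return coeffs[0] + coeffs[1] * x + coeffs[2] * y
```

Under these assumptions, a science spectrum would then be divided by the model transmission raised to the predicted scale factor for its fiber.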

After sky correction, pairs of dithered frames are combined to produce well-sampled images. The spectra from each pair are combined to create a single "visit" spectrum for each object observed.
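To show why a dithered pair yields a better-sampled spectrum, the sketch below interleaves two exposures under the simplifying assumption that they are offset by exactly half a pixel. This is not the pipeline's resampling scheme, which must handle the measured (not exactly half-pixel) shifts and is described in Nidever et al. (2015); the function name is hypothetical.

```python
import numpy as np

def interleave_pair(spec_a, spec_b):
    """Interleave two spectra offset by half a pixel onto a 2x finer grid.

    spec_a is assumed to sample pixel positions 0, 1, 2, ... and
    spec_b positions 0.5, 1.5, 2.5, ...
    """
    combined = np.empty(2 * len(spec_a), dtype=float)
    combined[0::2] = spec_a
    combined[1::2] = spec_b
    return combined
```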

The final visit spectra are then approximately flux calibrated. The relative flux calibration is performed using a calibration frame that computes the instrument spectral response, as determined from an observation of a blackbody source. The absolute level of the spectrum is then determined using a scaling based on the object's catalog H-band magnitude. We note that the subsequent pipeline for the analysis for stellar parameters and abundances (ASPCAP) normalizes the spectra to a pseudo-continuum, so the flux calibration done here is not critical.

Finally, an initial radial velocity (RV) estimate is made by cross-correlating each visit spectrum with a grid of synthetic spectra. The best-matching synthetic spectrum serves as a template, and the derived shift between the observed spectrum and this template provides the initial RV estimate. Note that this estimate is later refined using multiple visits to the same object because these provide a higher signal-to-noise spectrum.
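The template selection and velocity estimate can be sketched as follows. The example assumes that the observed spectrum and all templates share a grid that is uniform in log10(wavelength), so that a Doppler shift corresponds to a constant pixel shift; it works at integer-pixel precision only. The function names, the grid assumption, and the default lag range are illustrative and do not describe the pipeline's implementation.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def best_lag(spec, template, max_lag):
    """Return (lag, peak height) of the normalized cross-correlation."""
    a = (spec - spec.mean()) / spec.std()
    b = (template - template.mean()) / template.std()
    n = len(a)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.array([
        np.mean(a[max_lag + lag:n - max_lag + lag] * b[max_lag:n - max_lag])
        for lag in lags
    ])
    k = int(np.argmax(cc))
    return int(lags[k]), float(cc[k])

def initial_rv(spec, templates, dlog10_wave, max_lag=50):
    """Pick the best-matching template and convert its lag to a velocity.

    dlog10_wave is the grid step in log10(wavelength). A positive lag
    (features at longer wavelengths than the template) gives a positive RV.
    """
    results = [best_lag(spec, t, max_lag) for t in templates]
    best = int(np.argmax([peak for _, peak in results]))
    lag = results[best][0]
    rv = C_KMS * (10.0 ** (lag * dlog10_wave) - 1.0)
    return best, rv
```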

Output visit spectra: apVisit/asVisit files

The final dither-combined spectra from a given visit are written into individual apVisit/asVisit files, as described in detail in the apVisit/asVisit data model. Here, the "ap" refers to spectra taken from the northern spectrograph at APO and "as" refers to spectra taken from the southern spectrograph at LCO.

See the documentation on APOGEE data for more information on how to retrieve these visit-level spectra.

Multiple visit spectra of the same object are combined in the next stage of the pipeline: Visit Combination.