Using APOGEE Spectra

There are several important aspects relevant to all APOGEE spectra that you should be aware of as you examine and use the data.

Types of APOGEE Spectra

Several different types of APOGEE spectra are available:

Individual Visit Spectra of each visit to each star are available in apVisit/asVisit files.
Combined Spectra from all visits to a star are available in apStar/asStar files.
Pseudo-Continuum Normalized Spectra that are used in the derivation of stellar parameters are available in aspcapStar files.

The "ap" and "as" prefixes to the Visit and Star files indicate which instrument obtained the spectra: "ap" refers to spectra taken with the northern instrument and "as" to spectra taken with the southern instrument.

Instructions for how to retrieve the spectra can be found here.

The combined spectra in the apStar/asStar files may be the most useful. These combined spectra are generated by resampling individual visits onto a common, logarithmically spaced wavelength scale after removing each visit's derived radial velocity (log λ_{i+1} - log λ_i = 6E-6, with a common starting wavelength of 15100.802 Angstroms). The resulting spectra are in rest, vacuum wavelengths. Data from the entire APOGEE wavelength range (which includes some gaps) are included in a single array. The wavelength scale is recorded in the header in standard FITS cards; thus, standard software should allow straightforward plotting of flux vs. wavelength and other common tasks.
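As a rough illustration of the logarithmic wavelength scale described above, the following sketch rebuilds such a grid from the starting wavelength and step quoted in the text. It assumes the logarithm is base 10 and that the caller supplies the number of pixels; the function name is hypothetical, and in practice the values should be read from the FITS header rather than hard-coded.

```python
import math

START_WAVELENGTH = 15100.802  # Angstroms, starting wavelength quoted in the text
LOG_STEP = 6e-6               # step in log wavelength per pixel, quoted in the text


def apstar_wavelengths(n_pixels, start=START_WAVELENGTH, log_step=LOG_STEP):
    """Return vacuum rest wavelengths (Angstroms) on a log-spaced grid.

    Assumption: the logarithm is base 10. n_pixels is supplied by the caller.
    """
    log_start = math.log10(start)
    return [10.0 ** (log_start + i * log_step) for i in range(n_pixels)]


# First few wavelengths of the grid
grid = apstar_wavelengths(5)
```

Because the grid is uniform in log wavelength, each pixel corresponds to the same velocity width, which is convenient when working with radial velocities.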

The apStar/asStar files also include the individual visit spectra that have been resampled and shifted to rest wavelength.

The apVisit/asVisit files contain individual visit spectra before resampling and removal of radial velocities. Note that while these have wavelength calibration information, the native wavelength scale is not an evenly spaced linear or logarithmic scale. The wavelength information is included as a separate wavelength array. The wavelength information is also available in a table that provides the parameters of the function used to fit the pixel-wavelength relation, and this information is required if you wish to plot flux against wavelength. For the apVisit files, spectra from each of the three chips are stored in different rows in the image extensions.

Data Quality Flags

Information about the data quality of APOGEE spectra is encoded in several different bitmasks that are included with the spectra.

At the individual pixel level, the visit and combined spectra include a mask array in HDU3 in the apVisit and apStar files, with bits set according to the APOGEE_PIXMASK bitmask. This bitmask flags both bad pixels and "warning" pixels; data in bad pixels are unreliable, while data in "warning" pixels may be unreliable.
At the individual visit level, information is included in an APOGEE_STARFLAG bitmask that is recorded in the FITS header as card STARFLAG.
At the combined spectrum level, there is a bitwise OR and a bitwise AND of the APOGEE_PIXMASK bitmasks from the individual visits that is output in HDU3 in the apStar files (rows 1 and 2), as well as a bitwise OR and a bitwise AND of the APOGEE_STARFLAG bitmasks from the individual visits that are recorded in the apStar headers in the STARFLAG and ANDFLAG cards.
If you inspect the bitmasks, you will see that data in some locations may not be reliable; some of the specific reasons for this are discussed below. In many cases, unreliable pixels come in "blocks" of contiguous pixels. This can arise even if only a single raw pixel within the block has bad data, because the final spectra combine dithered exposures or spectra from multiple visits, so each value in a final spectrum has contributions from multiple pixels in the raw input spectra. If a bad pixel is expected to contribute significantly to any pixel in the final spectrum, then that final pixel is flagged.
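The bitwise OR and AND combinations described above can be sketched with plain integers. This is only an illustration: the function names are hypothetical, the masks are toy values, and the bit number used is a placeholder rather than a real APOGEE_PIXMASK position, which should be looked up in the bitmask definition.

```python
from functools import reduce
from operator import and_, or_


def bit_is_set(mask_value, bit):
    """Return True if the given bit number is set in an integer mask value."""
    return bool((mask_value >> bit) & 1)


def combine_visit_masks(visit_masks):
    """Return (OR mask, AND mask), pixel by pixel, across visits.

    visit_masks is a list of per-visit lists of integer mask values,
    all of the same length.
    """
    or_mask = [reduce(or_, pixel) for pixel in zip(*visit_masks)]
    and_mask = [reduce(and_, pixel) for pixel in zip(*visit_masks)]
    return or_mask, and_mask


# Hypothetical bit number, for illustration only.
EXAMPLE_BIT = 0

# Two toy visits with three pixels each
visits = [[0b001, 0b000, 0b100], [0b001, 0b010, 0b000]]
or_mask, and_mask = combine_visit_masks(visits)
```

A bit set in the OR mask means at least one visit flagged that pixel, while a bit set in the AND mask means every visit did.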

Vacuum Wavelengths

The wavelength calibration of the APOGEE data is done using vacuum wavelengths. However, the wavelengths of atomic transitions in the optical and infrared are usually quoted at standard temperature and pressure (S.T.P.); this is how the CRC Handbook of Chemistry and Physics (also available free online from The Internet Archive) lists them for transitions redward of 2000 Ångstroms. Thus, spectral lines associated with specific atomic transitions may require converting the SDSS data to the equivalent values at S.T.P. For APOGEE data, the conversion from Ciddor (Applied Optics, Vol 35, p 1566, 1996) has been employed to convert between vacuum and air wavelengths. For a vacuum wavelength (VAC) in Ångstroms, convert to air wavelength (AIR) using the equation:

AIR = VAC / (1.0 + 5.792105E-2/(238.0185E0 - (1.E4/VAC)^2) + 1.67917E-3/(57.362E0 - (1.E4/VAC)^2))
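The conversion above translates directly into a short function. This is a minimal sketch with a hypothetical function name; it takes the vacuum wavelength in Ångstroms and uses the same coefficients as the equation.

```python
def vacuum_to_air(vac):
    """Convert a vacuum wavelength in Angstroms to an air wavelength.

    Uses the coefficients of the equation quoted in the text.
    """
    sigma2 = (1.0e4 / vac) ** 2  # squared wavenumber, in inverse microns squared
    refractive_index = (
        1.0
        + 5.792105e-2 / (238.0185 - sigma2)
        + 1.67917e-3 / (57.362 - sigma2)
    )
    return vac / refractive_index


# Example: a wavelength in the middle of the H band
air = vacuum_to_air(16000.0)
```

In the H band the air wavelength is shorter than the vacuum wavelength by a few Ångstroms, which is much larger than a pixel, so line lists must be in the same system as the spectra.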

Wavelength Coverage and Detector Gaps

The spectra are recorded onto three different detectors ("chips"). While the overall coverage ranges from 1.514 to 1.696 microns, there are small gaps between the detectors, which result in gaps in the wavelength coverage. Although all of the spectra lie in the infrared H-band, the chips are sometimes referred to as the "blue", "green", and "red" chips, going from shorter to longer wavelengths. Data products refer to the separate chips as chips "a", "b", and "c", in the order in which they are read out. The "red" chip is the first one to be read out, so this nomenclature is in reverse wavelength order. The following table explains the terminology.

chip	name	start wavelength	end wavelength	central dispersion
a	"red"	1.647 μm	1.696 μm	-0.236 A/pix
b	"green"	1.585 μm	1.644 μm	-0.283 A/pix
c	"blue"	1.514 μm	1.581 μm	-0.326 A/pix

Note that the starting and ending wavelengths vary slightly from fiber to fiber because of variations of their placement along the instrument pseudo-slit. The dispersion varies with wavelength, and, to a lesser extent, by fiber.

For ease in analysis, all of the spectra are rebinned to the same dispersion and wavelength axis such that "chip gaps" fall in the same place for all spectra.
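For quick checks, the chip table above can be encoded as a small lookup. The sketch below uses the nominal start and end wavelengths from the table; the names are hypothetical, and because the true limits vary slightly from fiber to fiber, results near the chip edges are only approximate.

```python
# Nominal chip limits in microns, copied from the table above:
# chip letter -> (colour name, start wavelength, end wavelength)
CHIPS = {
    "a": ("red", 1.647, 1.696),
    "b": ("green", 1.585, 1.644),
    "c": ("blue", 1.514, 1.581),
}


def chip_for_wavelength(microns):
    """Return (chip letter, colour name), or None if in a gap or out of range."""
    for chip, (name, start, end) in CHIPS.items():
        if start <= microns <= end:
            return chip, name
    return None


green = chip_for_wavelength(1.60)
gap = chip_for_wavelength(1.583)
```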

Imperfect Subtraction of Night Sky Lines

The night sky lines (i.e., "airglow"), primarily from OH emission in the Earth's atmosphere, can be extremely bright. The sky emission is removed from the science spectra using the combination of the emission from sky fibers at sky positions near that of the target. However, this subtraction is almost always imperfect for two reasons. First, the sky spectrum has to be wavelength-shifted to match the science spectra; this occurs because the fibers have different locations along the pseudo-slit. Second, the line spread function (LSF) varies by fiber due to changes in image quality across the field-of-view. Because the night skylines are so bright, even small fractional variations due to these issues can cause the subtraction to be very noticeably imperfect; thus, most skylines are either under- or over-subtracted.

Note that, even if the airglow subtraction were perfect, the region of the spectrum "under" the sky lines would have significantly lower signal-to-noise because of the substantial Poisson noise contributed by the bright lines. For this reason, significant effort has not yet been put into improving the sky subtraction. Additional work along these lines may be attempted for subsequent data releases.

The imperfect night sky line subtraction does have the unfortunate result of making the APOGEE spectra appear a bit "ugly" to a quick, casual inspection. The APOGEE data products (e.g., apVisit/asVisit and apStar/asStar files) include a record of the sky spectrum that was subtracted, and it is possible to use this as a guide for recognizing pixels that are likely to be affected by imperfect sky subtraction.
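One simple way to use the recorded sky spectrum as a guide is to mark pixels where the subtracted sky is much brighter than its typical level. The sketch below is only one possible approach: the function name and the factor of 5 are arbitrary assumptions, not values used by the pipeline.

```python
from statistics import median


def sky_line_pixels(sky, factor=5.0):
    """Return indices of pixels where the sky flux exceeds factor * median.

    Assumption: `factor` is an arbitrary illustrative threshold.
    """
    typical = median(sky)
    return [i for i, value in enumerate(sky) if value > factor * typical]


# Toy sky spectrum with two bright lines
toy_sky = [1.0, 1.0, 1.0, 50.0, 1.0, 1.0, 40.0, 1.0]
affected = sky_line_pixels(toy_sky)
```

Pixels identified this way can be masked or down-weighted in later analysis, along with any neighbors a more careful treatment would include.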

Uncertainty Arrays

All APOGEE spectra include an array of uncertainties ("errors") for each pixel; these are given as the standard deviation of the flux values. These uncertainties are initially calculated from the raw pixel data based on the inherent properties of the detectors (gain and readout noise). These are then propagated into the uncertainties for subsequent data products.

However, in downstream spectral products, data for any given pixel may have been derived from some combination of pixels in the raw data, and data from any individual raw pixel may contribute to more than one pixel in the combined spectra. As a result, there may be correlated errors between pixels. This can occur even in visit spectra because these are the combination of two dithered observations. If the dithers are spaced by exactly 0.5 pixels, then the spectral combination software interleaves the two dithered exposures. Still, if the dithers are even slightly imperfect (as they generally are), any pixel in the combined well-sampled spectrum will have contributions from multiple raw pixels. For the visit-combined apStar spectra, the pixels have inputs from multiple raw pixels, because the apStar spectra are RV-corrected and resampled onto a standard wavelength grid. Although the uncertainties are propagated into the apVisit and apStar spectra, this propagation ignores the correlation of uncertainties that results from having processed pixels that are derived from multiple raw pixels.
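The ideal case of dithers spaced by exactly 0.5 pixels can be sketched as a simple interleave. This is an illustration of the idea only, not the pipeline's combination code; the function name is hypothetical, and which exposure comes first depends on the dither direction.

```python
def interleave(dither_a, dither_b):
    """Interleave two equal-length exposures offset by half a pixel.

    Assumption: dither_a samples the position just before dither_b.
    """
    combined = []
    for a, b in zip(dither_a, dither_b):
        combined.extend((a, b))
    return combined


well_sampled = interleave([1, 3, 5], [2, 4, 6])
```

In this ideal case each output pixel comes from exactly one raw pixel, so no correlations are introduced. Any departure from an exact half-pixel shift requires resampling, which mixes raw pixels.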

Multiple observations of selected targets have been used to estimate empirical uncertainties, and these demonstrate that, for most targets, the calculated uncertainties are reasonable, i.e., the scatter from observation to observation is comparable to the estimated uncertainty for each observation. However, for very bright targets, the calculated uncertainties are almost certainly underestimated, because systematic uncertainties from the data processing and the calibration data products most likely limit the accuracy of these data. These have not yet been fully quantified, but we expect an uncertainty "floor" at the 0.5% level, i.e., a maximum S/N of ~200. Such a floor has not been applied to the spectrum uncertainty arrays, so users should be aware that the likely maximum S/N is ~200.
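Since the floor is not applied in the released uncertainty arrays, users may wish to impose it themselves. A minimal sketch, assuming the 0.5% level quoted above and a hypothetical function name:

```python
def apply_error_floor(flux, err, floor=0.005):
    """Return uncertainties no smaller than floor * |flux| for each pixel."""
    return [max(e, floor * abs(f)) for f, e in zip(flux, err)]


# The first pixel has a nominal S/N of 1000 and is raised to the floor;
# the second pixel has a nominal S/N of 50 and is left unchanged.
floored = apply_error_floor([1000.0, 100.0], [1.0, 2.0])
```

After the floor is applied, no pixel has a signal-to-noise ratio above 1/0.005 = 200.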

Bad Pixels/Missing Regions

The IR detectors used for APOGEE are not cosmetically perfect. Small regions of each chip are bad, and there are a significant number of individual bad or "hot" pixels. These are flagged during the data processing and can lead to bad or missing regions in any given spectrum. Because visit spectra are combined from multiple individual dithered spectra, a single bad pixel can propagate into multiple pixels in the visit-combined spectra. In combination with the poorly subtracted skylines, these bad pixels can have the effect of making individual visit spectra look rather "ugly." The mask arrays can be used to identify the cause of most bad pixels.

Because any given star will typically not use the same fiber for different visits, combined spectra generally look somewhat cleaner, especially if the observed radial velocity (including differences in barycentric RV) of a target differs significantly from visit-to-visit. However, even if the combined spectra do not have regions with missing data, there may be regions where the noise level is elevated if that portion of the spectrum landed on a bad region of one of the arrays in one or more of its visits.

Ghosts

The use of VPH gratings results in the production of some "ghosts" on the 2-D images. The most prominent of these is the "Littrow ghost," which for the APOGEE data falls somewhere in the wavelength region 1.624 to 1.626 microns, depending on the fiber.

The amplitude of the ghost depends on the brightness of other stars in the field, so it does not always contribute a significant amount of flux. Pixels possibly affected by the "Littrow ghost" are flagged with the LITTROW_GHOST bit in the APOGEE_PIXMASK bitmask.

Fiber Cross Talk

The spacing of adjacent spectra is ~ 6.5 pixels (as measured between adjacent PSF peaks) to pack the spectra of as many stars as possible across the APOGEE detectors. Therefore, the wings of the PSF overlap slightly between neighboring spectra, and the effect is particularly apparent if an object is located adjacent to a much brighter object. As a result, the targets on each plate are sorted into three brightness categories -- bright (B), medium (M), and faint (F) -- and these categories are placed along the pseudo-slit (and hence, on the detectors) in the order FMBBMF FMBBMF. In principle, an object in the faint class or a sky fiber should never land next to a much brighter object. Yet, the magnitude ranges of these brightness categories can be broad and make it possible for targets of significantly different brightness to have spectra adjacent to one another.

The extraction portion of the data reduction pipeline accounts for contributions of light from the two neighboring spectra for each target. The effectiveness of this extraction depends on our knowledge of the amplitude of the wings of the light distribution. In cases where adjacent targets are significantly brighter than a given object, small inaccuracies in the PSF model may lead to significant errors in the extraction of the spectrum.

For each visit, a bit is set in the APOGEE_STARFLAG bitmask if an adjacent object is more than 100 times brighter than the star itself (VERY_BRIGHT_NEIGHBOR) or more than 10 times brighter (BRIGHT_NEIGHBOR). The former case, which is rare, is automatically considered as a bad spectrum and will not be included in the combined spectrum.
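The thresholds above amount to a simple comparison of brightness ratios. The sketch below assumes both brightnesses are given as linear fluxes in the same units; the text does not say which brightness measure the pipeline uses, so the function and its inputs are hypothetical.

```python
def neighbor_flag(star_flux, neighbor_flux):
    """Return the name of the neighbor flag that would apply, or None.

    Assumption: both inputs are linear fluxes in the same units.
    """
    ratio = neighbor_flux / star_flux
    if ratio > 100:
        return "VERY_BRIGHT_NEIGHBOR"
    if ratio > 10:
        return "BRIGHT_NEIGHBOR"
    return None


flag = neighbor_flag(star_flux=1.0, neighbor_flux=150.0)
```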

Incomplete Data Acquisition

DR16 contains all APOGEE-1 data as well as APOGEE-2 data that were collected through August 2018. The APOGEE-2 data set includes four years of observations from the northern instrument, but only 16 months of observations from the southern instrument. The full complement of visits, leading to a net S/N > 100, was successfully acquired for the majority of APOGEE targets. However, for a small number of APOGEE-2 northern fields and a substantial fraction of APOGEE-2 southern fields, the full set of visits was not completed. Therefore, the S/N for the combined spectra of some stars in these fields (generally the faintest ones) may not achieve the survey goal of S/N > 100.

Persistence in the "Blue" Detector

Some areas of the detectors used in the APOGEE northern instrument suffer from a problem referred to as "super-persistence." In these locations on the detector, previous exposure to light causes a glow in subsequent images that can be substantial and last for a significant amount of time. The problem is most severe on about 1/3 of the "blue" chip, i.e., the chip that records wavelengths between 1.514 and 1.581 microns. Due to the chip orientation, this full wavelength region may be impacted for all objects in this part of the chip (~1/3). There are also regions in the "green" chip that are affected by a lower level of super-persistence, but these regions are less clearly defined by fiber number or wavelength.

After the completion of the APOGEE-1 survey in summer 2014, the instrument was opened, and the blue detector, which had the worst impact from super-persistence, was replaced with a detector with better performance. While this will mitigate the effects of super-persistence for data taken after summer 2014, DR16 includes data taken from before this date as well.

The impact of super-persistence for a given object is dependent on (1) the prior exposure history and (2) the brightness of the current target. The fiber management system described in the Fiber Cross Talk section provides some level of mitigation. Grouping the fibers by target brightness makes it relatively uncommon for a faint target to be observed through a fiber that was previously placed on a bright target. However, because the magnitude ranges that define these categories are broad, there can still be cases where faint targets follow relatively brighter ones. Also, calibration flat field exposures are taken between every plate to map the distribution of light between fibers and to measure the fiber-to-fiber throughput variations. These roughly evenly-illuminated frames are sufficient to give rise to some super-persistence.

Super-persistence is a complex phenomenon. In DR13/DR14, we attempted to implement a first-order super-persistence correction, based on the calibration data that were obtained, scaling the results to try to match the persistence observed in dark frames taken before most of the science exposures. We also implemented a scheme in which pixels affected by persistence are assigned a lower weight when different visits are combined. Additional description of how this problem is addressed is available in Holtzman et al. (2018). We implemented the same procedure for DR16.

The effect of super-persistence can be significant and is easily noticed: the flux levels in the region of the spectrum affected can be enhanced by tens of percent or more. This enhancement is likely to have some wavelength dependence meaning that spectral features may be distorted. However, depending on the brightness of the target and the preceding ones, it is not guaranteed that the spectra are adversely affected at a significant level, so we do not flag all data that falls within the super-persistence region as bad by default.

In the data reduction pipeline, we flag all pixels in the regions where significant super-persistence is known to occur in the APOGEE_PIXMASK bitmask using three different flags corresponding to the level of the effect: PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. In addition, the visit-level APOGEE_STARFLAG bitmask for each object has bits that are set when a significant fraction of the pixels (>20%) in the spectrum are affected; these are again split into the categories PERSIST_HIGH, PERSIST_MED, and PERSIST_LOW. We also look for evidence in the spectra of a "jump" in flux between the "green" and the "blue" chips, and if this is present at an easily recognized level, we set a flag PERSIST_JUMP_HIGH or PERSIST_JUMP_LOW if the "blue" portion of the spectrum seems abnormally high or abnormally low (the latter could occur, e.g., if a sky fiber from a region affected by super-persistence is used for sky subtraction, although the pipeline takes some steps to try to avoid this occurrence).
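The 20% criterion for the visit-level flags can be checked against a pixel mask along these lines. This is a sketch only: the function names are hypothetical, and the bit number passed in must be taken from the APOGEE_PIXMASK definition (the value 3 used below is a placeholder, not the real position of any persistence bit).

```python
def flagged_fraction(pixmask, bit):
    """Return the fraction of pixels with the given bit set."""
    flagged = sum(1 for value in pixmask if (value >> bit) & 1)
    return flagged / len(pixmask)


def exceeds_persistence_threshold(pixmask, bit, threshold=0.20):
    """True if more than `threshold` of the pixels have the bit set."""
    return flagged_fraction(pixmask, bit) > threshold


# Placeholder bit number, for illustration only.
HYPOTHETICAL_PERSIST_BIT = 3

# Toy mask: 3 of 10 pixels flagged
toy_mask = [1 << HYPOTHETICAL_PERSIST_BIT] * 3 + [0] * 7
is_affected = exceeds_persistence_threshold(toy_mask, HYPOTHETICAL_PERSIST_BIT)
```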

In the combined spectra, star level flags are provided that are bitwise AND and bitwise OR combinations of the visit APOGEE_STARFLAG flags to indicate whether a given object was marked as having a significant number of pixels in the super-persistence region in all or any of the visit spectra comprising the combination. Starting in DR13, a scheme was implemented in which pixels affected by super-persistence are given inflated uncertainties to reduce their impact on the combined spectra. For those stars where some visits were impacted by super-persistence, and others were not, the random uncertainties are larger because those visits affected by the super-persistence are essentially ignored. For stars in which all visits are impacted by persistence, this has the effect of giving the persistence-affected wavelengths less weight in the ASPCAP fitting than pixels at other wavelengths.
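The effect of inflating uncertainties can be illustrated with an inverse-variance weighted mean of one pixel across visits. Inverse-variance weighting is assumed here for illustration; the text does not specify the exact weighting used by the pipeline, and the function name and numbers are hypothetical.

```python
def combine_pixel(fluxes, errors):
    """Return the inverse-variance weighted mean flux and its uncertainty."""
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    flux = sum(w * f for w, f in zip(weights, fluxes)) / total
    return flux, (1.0 / total) ** 0.5


# Two visits; the second is boosted by persistence (flux 1.3 instead of 1.0).
equal_flux, equal_err = combine_pixel([1.0, 1.3], [0.01, 0.01])

# Same visits, with the affected visit's uncertainty inflated tenfold.
down_flux, down_err = combine_pixel([1.0, 1.3], [0.01, 0.1])
```

With the inflated uncertainty, the combined flux stays close to the unaffected visit and the combined uncertainty grows, which matches the behavior described above for stars with a mix of affected and unaffected visits.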

As noted in Holtzman et al. (2015), the effects of persistence can be seen in some of the individual element abundances for the DR12 data. While this may be true for some objects in the current data release, comparison of results for stars unaffected by persistence to those affected by persistence in all visits suggests that the down-weighting scheme has helped significantly to mitigate persistence effects. The stars most likely to still be impacted are those that are the faintest stars or those with weak absorption features. Additional discussion on these issues relevant to the DR16 reductions can be found in Holtzman et al. (2018).