MaNGA Data Reduction Pipeline
The MaNGA Data Reduction Pipeline (DRP) processes the raw data to produce flux-calibrated, sky-subtracted, coadded data cubes from each of the individual exposures for a given galaxy.
The DRP consists of two primary parts: the 2d stage, which produces flux-calibrated fiber spectra from raw individual exposures, and the 3d stage, which combines multiple flux-calibrated exposures with astrometric information to produce stacked data cubes.
These science-grade data cubes are then processed by the MaNGA Data Analysis Pipeline (DAP), which measures the shape and location of various spectral features, fits stellar population models, and performs a variety of other analyses necessary to derive astrophysically meaningful quantities from the calibrated data cubes.
On this page we provide a basic overview of the major steps in the MaNGA DRP in DR17 (the DAP is described here). The DR17 DRP lives in an svn-controlled repository, mangadrp, and is broadly similar to the versions in use for previous data releases (see differences here); for detailed information see the technical pipeline papers (Law et al. 2016 and the DR17 update, Law et al. 2021).
See Law et al. 2021 for a detailed discussion of DRP versions (their Table 2) and the differences between the DR17 DRP and earlier pipeline versions. Each version of the DRP contains a userguide within the svn product detailing how to install and run the DRP and its dependent software products and data directories.
DRP: 2d Stage
The 2d stage of the MaNGA DRP is largely derived from the BOSS idlspec2d pipeline that has been modified to address the different hardware design and science requirements of the MaNGA survey.
Each of the four raw camera images (blue and red cameras for each of two spectrographs) for a given exposure is preprocessed to remove quadrant-dependent biases from the four readout amplifiers, identify cosmic ray tracks by looking for features sharper than the detector PSF, and estimate the typical uncertainty in each pixel based on the raw counts and the detector read noise. The resulting cleaned camera images show the spectra of each fiber dispersed vertically along the detector.
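As a rough illustration of the bias-removal and uncertainty steps (cosmic ray identification is not shown), the following Python sketch subtracts one bias level per detector quadrant and forms a per-pixel variance from Poisson noise plus read noise. The function name, the single gain value, and the assumption that the bias levels are supplied by the caller are all hypothetical simplifications, not the DRP implementation.

```python
import numpy as np

def preprocess(raw, bias_levels, read_noise, gain=1.0):
    """Subtract a per-quadrant bias and estimate the per-pixel variance.

    raw         : 2D array of raw counts (ADU)
    bias_levels : 2x2 nested sequence, one bias level (ADU) per quadrant
                  (hypothetical input, e.g. measured elsewhere)
    read_noise  : detector read noise in electrons
    gain        : electrons per ADU (assumed identical for all amplifiers)
    """
    ny, nx = raw.shape
    electrons = raw.astype(float)  # astype returns a copy, raw is untouched
    halves_y = (slice(0, ny // 2), slice(ny // 2, ny))
    halves_x = (slice(0, nx // 2), slice(nx // 2, nx))
    # Each readout amplifier serves one quadrant, so remove its own bias level.
    for i, ys in enumerate(halves_y):
        for j, xs in enumerate(halves_x):
            electrons[ys, xs] -= bias_levels[i][j]
    electrons *= gain
    # Variance in electrons^2: Poisson term from the signal plus read noise.
    variance = np.clip(electrons, 0.0, None) + read_noise ** 2
    return electrons, variance
```

Clipping negative pixels to zero before forming the Poisson term keeps the variance from dropping below the read-noise floor.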
Each MaNGA science exposure is preceded by two calibration exposures. The first calibration exposure is of a bright quartz flatfield lamp that uniformly illuminates every fiber with a bright background. This bright flatfield exposure is used to measure the precise location and width of each fiber spectrum on the detector (since they change over time due to gravitational flexure), and to calibrate the relative throughput of each fiber (since dust on the fiber bundles and various optical stresses can alter performance). The second calibration exposure is of a bright Neon-Mercury-Cadmium lamp, which produces bright spectral emission lines at well-determined wavelengths. This exposure allows us to calibrate the relative wavelength-to-pixel solution of each fiber spectrum, in addition to measuring the effective spectral resolution from the width of the arc-lamp emission lines.
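The arc-lamp calibration can be pictured as a polynomial fit from pixel position to wavelength, with the local dispersion converting a measured line width in pixels into a width in Angstroms. The Python sketch below is a minimal illustration only: the function names, the polynomial form, and the default order are assumptions, and the actual pipeline may fit the solution differently.

```python
import numpy as np

def fit_wavelength_solution(pixel_centroids, line_wavelengths, order=3):
    """Least-squares polynomial mapping pixel position -> wavelength.

    pixel_centroids  : measured centers of arc lines along one fiber trace
    line_wavelengths : known wavelengths of those lines (Angstroms)
    """
    return np.polynomial.Polynomial.fit(pixel_centroids, line_wavelengths, deg=order)

def line_fwhm_angstrom(solution, pixel, sigma_pix):
    """Convert a Gaussian line width (sigma, in pixels) to a FWHM in Angstroms."""
    dispersion = solution.deriv()(pixel)  # Angstroms per pixel at this position
    return 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma_pix * abs(dispersion)

# Synthetic demonstration: line positions generated from a made-up solution,
# not real lamp wavelengths.
true_solution = np.polynomial.Polynomial([3600.0, 1.0, 1.0e-5])
pixels = np.array([100.0, 600.0, 1200.0, 1900.0, 2700.0, 3500.0])
solution = fit_wavelength_solution(pixels, true_solution(pixels), order=2)
fwhm = line_fwhm_angstrom(solution, 1000.0, sigma_pix=1.0)
```

Dividing the wavelength of a line by its FWHM from `line_fwhm_angstrom` gives an estimate of the resolving power at that position.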
Science exposures (and calibration exposures) are extracted from the two-dimensional detector images using a Gaussian profile fitting method. For each row on the detector, we model the counts in the row as the linear sum of Nfiber Gaussian profiles plus a smoothly varying background to account for scattered light. The integral of each of these Gaussian profiles is then recorded as the total flux for a given fiber in that row, and the spectrum of each fiber is constructed as the vector of fluxes across all rows. The extracted science spectra are saved in mgFrame files, while the extracted flatfield and arc-lamp calibration spectra are saved in mgFlat and mgArc files respectively.
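Because the profile centers and widths are already known from the flatfield, the row model is linear in the fiber fluxes and can be solved by ordinary least squares. The Python sketch below shows this for a single detector row. It is a simplified illustration: the function name, the polynomial background, and the unweighted fit are assumptions rather than the pipeline's actual choices.

```python
import numpy as np

def extract_row(counts, centers, sigmas, bg_order=1):
    """Fit one detector row as a sum of Gaussians plus a smooth background.

    counts  : 1D array of counts along the row
    centers : profile centers (pixels), one per fiber, e.g. from the flatfield
    sigmas  : profile widths (pixels), one per fiber
    Returns (integrated flux per fiber, fitted background along the row).
    """
    counts = np.asarray(counts, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    x = np.arange(counts.size, dtype=float)
    # One unit-area Gaussian column per fiber, so each fitted coefficient
    # is directly the integrated flux of that fiber in this row.
    profiles = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / sigmas[None, :]) ** 2)
    profiles /= np.sqrt(2.0 * np.pi) * sigmas[None, :]
    # Low-order polynomial columns stand in for the scattered-light background.
    xn = 2.0 * x / (counts.size - 1) - 1.0
    background = np.vander(xn, bg_order + 1, increasing=True)
    design = np.hstack([profiles, background])
    coeffs, *_ = np.linalg.lstsq(design, counts, rcond=None)
    nfiber = centers.size
    return coeffs[:nfiber], background @ coeffs[nfiber:]
```

Repeating this fit for every row and stacking the per-fiber fluxes yields one extracted spectrum per fiber.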
Since the night sky is quite bright at optical and near-infrared wavelengths (especially in the vicinity of OH emission lines), the pipeline next creates a model of the background sky to subtract from the data so that the underlying galaxy or star light can be isolated. This is achieved by combining each of the 92 single fibers (46 per spectrograph) that sample sky regions away from astronomical targets into a super-sampled spectrum of the night sky background. This super-sampled model is evaluated on the pixellized wavelength grid of each of the science fibers (with some scaling factors to account for spatial variation in the sky background, and the shape and strength of individual OH features), and subtracted from the science fiber spectra to produce the mgSFrame files.
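The idea of a super-sampled sky model can be sketched in a few lines: because each sky fiber has a slightly different wavelength solution, pooling their pixels samples the sky spectrum more finely than any single fiber does. The Python sketch below pools the sky pixels, averages them in fine wavelength bins, and interpolates the result onto a science fiber's grid. The binned mean and the single scale factor are stand-ins chosen for brevity, and both function names are hypothetical; the pipeline's actual model is more sophisticated.

```python
import numpy as np

def build_sky_model(sky_waves, sky_fluxes, fine_step):
    """Pool all sky-fiber pixels and average them in fine wavelength bins.

    sky_waves, sky_fluxes : sequences of 1D arrays, one pair per sky fiber
    fine_step             : approximate bin width (Angstroms) of the model
    """
    w = np.concatenate([np.ravel(a) for a in sky_waves])
    f = np.concatenate([np.ravel(a) for a in sky_fluxes])
    nbins = max(1, int(np.ceil((w.max() - w.min()) / fine_step)))
    edges = np.linspace(w.min(), w.max(), nbins + 1)
    total, _ = np.histogram(w, bins=edges, weights=f)
    count, _ = np.histogram(w, bins=edges)
    good = count > 0  # skip bins that no sky pixel happens to sample
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[good], total[good] / count[good]

def subtract_sky(sci_wave, sci_flux, model_wave, model_flux, scale=1.0):
    """Evaluate the sky model on a science fiber's own grid and subtract it."""
    return sci_flux - scale * np.interp(sci_wave, model_wave, model_flux)
```

The `scale` argument is a placeholder for the scaling factors mentioned above, reduced here to a single number per fiber.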
Each individual MaNGA exposure is then calibrated to absolute radiometric units using the 12 minibundles (6 per spectrograph) that have been placed on spectrophotometric standard stars. Using a model of the focal-plane point spread function derived from the guider data, we compensate for the light losses from inter-fiber regions in each of these minibundles to construct the total flux for each standard star. These spectra are then compared against theoretical stellar spectra (BOSZ in DR15/DR17, Kurucz in DR13/DR14) in order to derive the best-fit calibration vector as a function of wavelength. This calibration step converts the science spectra from units of flatfielded electron counts to units of 10^-17 erg/s/cm^2/Angstrom/fiber (hence calibrating out both system and atmospheric throughput variations), and the calibrated spectra are saved to the mgFFrame files.
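Conceptually, the calibration vector is the smooth ratio between the model flux of a standard star and its observed counts, combined over all the standards in an exposure. The Python sketch below takes the median ratio across stars and smooths it with a low-order polynomial. The function name, the median, and the polynomial smoothing are illustrative assumptions and not the DRP's fitting procedure.

```python
import numpy as np

def calibration_vector(wave, observed_counts, model_flux, order=4):
    """Derive a smooth counts -> flux conversion from standard stars.

    wave            : 1D wavelength grid (Angstroms), shared by all stars
    observed_counts : array (nstar, nwave) of aperture-corrected counts
    model_flux      : array (nstar, nwave) of model fluxes in the target units
    """
    ratio = np.asarray(model_flux, float) / np.asarray(observed_counts, float)
    # The median across stars limits the influence of a single poor standard.
    median_ratio = np.median(ratio, axis=0)
    # A low-order polynomial keeps only the smooth throughput variation.
    smooth = np.polynomial.Polynomial.fit(wave, median_ratio, deg=order)
    return smooth(wave)

def apply_calibration(science_counts, cal_vector):
    """Multiply science spectra (nfiber, nwave) by the calibration vector."""
    return np.asarray(science_counts, float) * cal_vector[None, :]
```

Since the same vector multiplies every fiber, any fiber-to-fiber throughput differences need to have been removed beforehand by the flatfield.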
Finally, the flux-calibrated spectra from each of the four individual cameras (a blue and a red camera for each spectrograph) are combined together across the dichroic break into a single file with a regular and fixed wavelength solution. Two such mgCFrame files are produced by the pipeline: a LINEAR version with a wavelength step of 1.0 Angstroms per pixel, and a LOG version with a wavelength step of 1e-4 in units of log10(lambda/Angstroms) per pixel. These mgCFrame files are thus row-stacked, one-dimensional spectra of each of the 1423 MaNGA fibers in a given exposure on a common wavelength grid.
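The difference between the two sampling schemes is easiest to see by constructing both grids. In the Python sketch below, the wavelength limits of 3622 and 10353 Angstroms are assumed for illustration and are not taken from this page; only the step sizes come from the text above.

```python
import numpy as np

def linear_grid(wmin, wmax, step=1.0):
    """Wavelengths spaced by a constant step in Angstroms."""
    n = int(np.floor((wmax - wmin) / step)) + 1
    return wmin + step * np.arange(n)

def log_grid(wmin, wmax, dlog=1.0e-4):
    """Wavelengths spaced by a constant step in log10(lambda/Angstroms)."""
    n = int(np.floor((np.log10(wmax) - np.log10(wmin)) / dlog)) + 1
    return 10.0 ** (np.log10(wmin) + dlog * np.arange(n))

# Assumed wavelength limits, for illustration only.
lin = linear_grid(3622.0, 10353.0)
log = log_grid(3622.0, 10353.0)

# On the LOG grid the pixel width in Angstroms grows with wavelength:
# d(lambda) = lambda * ln(10) * dlog.
log_pixel_width = log * np.log(10.0) * 1.0e-4
```

A constant step in log10(lambda) corresponds to a constant velocity width per pixel, which is convenient for kinematic measurements.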
DRP: 3d Stage
Once a sufficient number of exposures have been obtained on a given plate (and processed through the 2d DRP), the 3d stage of the DRP combines these exposures into data cubes and summary RSS files for each IFU on the plate.
First, the pipeline searches through the mgCFrame files for each of the individual exposures and selects out the rows corresponding to a given IFU. A baseline astrometric solution is then computed that gives the position on sky of each fiber as a function of wavelength. This baseline solution takes into account the manufactured fiber bundle metrology (i.e., the lab-measured physical location of each fiber in the IFU bundle), the applied telescope dithering, measured offsets of the drilled hole from the nominal location, optical distortions from a model of the SDSS telescope, and chromatic and field differential refraction effects. Since the chromatic differential refraction term is a function of wavelength, this means that each fiber covers a slightly different region of the sky at different wavelengths.
Although this baseline solution provides a reasonable first guess at the astrometric position of each fiber, it is nonetheless imperfect and cannot account for such issues as guiding errors and translational and rotational tolerances of the IFU ferrule within its hole. We therefore use a second pass 'extended' astrometry module (EAM) that compares the MaNGA fiber spectra to SDSS broadband imaging of the target galaxies obtained during prior generations of SDSS. This broadband image registration improves the typical accuracy of the MaNGA fiber astrometry to about 0.1 arcsec, and provides an independent assessment of the flux calibration accuracy. The summary RSS files providing the individual fiber spectra and astrometric solutions for all exposures of a given galaxy are provided in both LOG and LINEAR formats as output products of the 3d DRP.
Once the on-sky location of each fiber spectrum has been derived using the DRP astrometry module, the spectra are combined into a spatially rectified data cube using a flux-conserving variant of Shepard's method. In brief, at each wavelength the individual dithered spectra provide an irregularly distributed set of measurements of the on-sky intensity, which we map to a regular grid sampled every 0.5 arcsec. The data cubes are then built up in the spectral direction from these reconstructed maps at each wavelength slice.
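Shepard's method is an inverse-distance style weighted average: each output pixel receives a weighted mean of the nearby fiber fluxes, with weights that fall off with distance and are normalized to sum to one. The Python sketch below reconstructs a single wavelength slice with a truncated Gaussian kernel. The kernel width, truncation radius, and fiber radius used as defaults are illustrative assumptions not stated on this page (see Law et al. 2016 for the values the pipeline adopts), and the function name is hypothetical.

```python
import numpy as np

def shepard_slice(fiber_x, fiber_y, fiber_flux, grid_x, grid_y,
                  sigma=0.7, r_lim=1.6, pixel_scale=0.5, fiber_radius=1.0):
    """Map irregular fiber fluxes at one wavelength onto a regular grid.

    fiber_x, fiber_y : fiber positions at this wavelength (arcsec)
    fiber_flux       : flux per fiber at this wavelength
    grid_x, grid_y   : 2D arrays of output pixel centers (arcsec)
    sigma, r_lim, fiber_radius : assumed kernel and fiber parameters (arcsec)
    Returns (image in flux per pixel, normalized weight matrix).
    """
    dx = grid_x.ravel()[:, None] - fiber_x[None, :]
    dy = grid_y.ravel()[:, None] - fiber_y[None, :]
    r2 = dx ** 2 + dy ** 2
    w = np.exp(-0.5 * r2 / sigma ** 2)
    w[r2 > r_lim ** 2] = 0.0  # fibers beyond r_lim do not contribute
    norm = w.sum(axis=1, keepdims=True)
    # Normalizing each row to unit sum is what makes the average flux-conserving.
    weights = np.divide(w, norm, out=np.zeros_like(w), where=norm > 0)
    # Convert flux per fiber to flux per output pixel via the area ratio.
    alpha = pixel_scale ** 2 / (np.pi * fiber_radius ** 2)
    image = alpha * (weights @ fiber_flux)
    return image.reshape(grid_x.shape), weights
```

Looping this over every wavelength channel, with fiber positions updated for chromatic differential refraction, builds the cube one slice at a time.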
Finally, a post-processing stage applied to the data cubes computes a variety of useful information, including covariance information, broadband griz images reconstructed from the MaNGA cubes, estimates of the griz reconstructed PSF, and a variety of quality control metrics and reference information.
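Covariance arises because neighboring output pixels in a cube are built from overlapping sets of fibers. If a wavelength slice is a linear combination of fiber fluxes with independent errors, its covariance follows directly from the interpolation weights. The Python sketch below illustrates this for a hypothetical weight matrix; the function names are assumptions, and this is not a description of how the DRP stores or computes its covariance products.

```python
import numpy as np

def slice_covariance(weights, fiber_variance):
    """Covariance between output pixels of one wavelength slice.

    weights        : array (npix, nfiber); pixel i = sum_j weights[i, j] * flux_j
    fiber_variance : array (nfiber,) of independent fiber variances
    Returns the (npix, npix) matrix W diag(var) W^T.
    """
    weights = np.asarray(weights, dtype=float)
    fiber_variance = np.asarray(fiber_variance, dtype=float)
    return (weights * fiber_variance[None, :]) @ weights.T

def correlation_matrix(cov):
    """Normalize a covariance matrix to unit diagonal."""
    s = np.sqrt(np.diag(cov))
    return cov / np.outer(s, s)

# Two output pixels that share the middle of three fibers.
W = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]])
cov = slice_covariance(W, np.ones(3))
rho = correlation_matrix(cov)
```

In this toy case the two pixels share one of their two fibers and come out with a correlation coefficient of 0.5, which is why errors on spatially binned spectra cannot be combined as if the pixels were independent.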