HSTGRISM

Author

Kornpob Bhirombhakdi, Andrew Fruchter

Contact

kbhirombhakdi@stsci.edu

Installation

pip install hstgrism, polynomial2d, scalex

Also, other simple packages such as numpy, pandas, and matplotlib.

Version

8.0.0

What does hstgrism do?

hstgrism is a python 3 package aiming to reduce grism observations from HST. This software follows the reduction process as implemented in the standard software aXe. However, hstgrism provides several advantages compared to aXe such as:

  • hstgrism implements in python 3, while aXe requires python 2.
  • hstgrism does not depend on packages such as SExtractor and iraf/pyraf (which is already at the end-of-support state). aXe depends on these packages.
  • hstgrism can installed easily, while installing aXe (together with setting up its environment) can be difficult.
  • Following divide-and-conquer practice, hstgrism breaks down the reduction process into small steps. Each step produces files that can be used in the next steps. This practice provides a lot of flexibility especially when a user would like to build their routines around hstgrism. In aXe, the main process is called through 'axecore,' which is difficult for any customization.
  • Each step in hstgrism has its diagnostic plots, which would help a user examining the results along the way.
  • hstgrism maintains simple APIs for each step. For aXe, even though a one-line command can complete the reduction, setting it up can be difficult.
  • hstgrism implements 2D polynomial model with the ability to construct any type of mask when estimating for background. aXe implements 1D polynomial model with 2D smoothing filter, and a customized mask is difficult to achieve.
  • hstgrism performs reduction to the very end of the process by properly including aperture correction, while aXe does not perform this step and can be difficult to achieve by an external code.
  • hstgrism provides a lot of flexibility when it comes to making a customized routine, or updating with new calibration files when available.
  • hstgrism is designed for a point source extraction, which can be a powerful core engine for more complicated extraction that a user can build their routines around.
  • hstgrism provides a cleaner and simpler interfaces. In aXe, a user will have to setup the working environment in a specific way, and it would require the user to clean up manually.
  • hstgrism saves outputs in csv format for parameters, fits format for any image, and user-specific graphic format for plots. These output formats are more practical for downstream analysis compared to those of aXe.

With all the advantanges hstgrism provides, a user should note that hstgrism is still at its early stage, compared to the standard software like aXe. There are several options that aXe can offer, but still are unavailable in hstgrism. Something that aXe can perform but hstgrism cannot includes for example:

  • Master Sky subtraction -- aXe internally provides using master sky subtraction to globally remove the sky signatures. hstgrism does not provide this process internally. While performing only a local background subtraction with 2D polynomial model in hstgrism would be sufficient in most cases, there are some certain cases that performing master sky subtraction would yield better advantage. However, if a user prefers to use hstgrism, introducing master sky subtraction into the process is possible.
  • Various specifications for an extraction aperture -- While hstgrism performs only simple extraction aperture by assuming a constant wavelength at each pixel X (as the dispersion direction for HST) with its cross-dispersion direction along Y axis, aXe offers more complicated mapping that considers, for example, the tilt of traces and shape of the sources. In cases such as traces with strong curvatures, or sources with large PSFs or extended, aXe would provide better extraction.
  • Optimal extraction -- Optimal extraction (Horne 1986), which was shown to optimize the signal-to-noise ratio of the extracted spectra, is implemented in aXe. There would be vaious cases that this functionality would be useful.
  • Contamination flag and model -- aXe performs contamination flagging and modelling internally when a source catalog with multiple sources is provided. However, a user should note that aXe does not perform any contamination correction. While unavailable in the core engine, hstgrism has modules that can be put together and achieves these functionalities.
  • axedrizzle -- aXe is imbued with axedrizzle, which is designed to perform drizzling technique properly for grism images. A user might want to extract spectra from a drizzled image in some cases such as for improving the signal-to-noise ratio, or to remove cosmic rays. However, it is worth noting that there is a trade-off between performing extraction on N individual images versus one drizzled image.

What is this document?

This document serves as a quick overview and a one-cell template of hstgrism workflow. A user can copy the template into a new working environment, and change values accordingly. More details of each step can be found in other related demo files which include:

  • 01_trace.ipynb: Start the process by mappging trace and wavelength
  • 02_cutout2d.ipynb: Make a 2D cutout from a given image, which will simplify the problem.
  • 03_flatfield.ipynb: Generate flatfield image.
  • 04_background.ipynb: Perform background estimation.
  • 05_tofullframe.ipynb: Place any 2D cutout back to a full image, which will help with diagnostic purposes or other complicated and customized routines.
  • 06_cps_to_flam.ipynb: Finish the process by extracting the counts per second, and converting to fluxes in flam unit.

Outputs

Each hstgrism modules will save outputs that can be useful for the other steps downstream. Each output file will have its name convention as ./savefolder/saveprefix_savesuffix.extension where savefolder and saveprefix are specified in hstgrism.container.Container class, savesuffix and extension are preseted in most cases. If the file output is a plot, a user can specify its extension through plotformat in the Container.

Overall, hstgrism produces:

Step Outputs What it contains
01_trace.ipynb xyref.csv Information related to computing xyref
trace.csv Maps of trace and wavelength
overview.pdf Diagnostic plot for trace and wavelength maps
02_cutout.ipynb tbox.csv Information for extraction region around the object
bbox.csv Contain information for bounding box as the outer region from the extraction region, which will be used mainly for background estimation
bbcorner.csv Four corners of the cutout region
cutout.fits Cutout image
mask.fits Object mask image
bbox.pdf Diagnostic plot for the cutout definition
03_flatfield.ipynb fullflat.fits Flatfield image in a full frame
cutflat.fits Cutout of flatfield
04_background.ipynb cutbkg.fits Cutout size of estimated background
maskfit.fits Final mask applied during the iterative background estimation
sum1d.pdf Diagnostic plot
05_tofullframe.ipynb User specifies savesuffix.fits A full frame image replaced by a cutout
06_cps_to_flam.ipynb User specifies path Fluxes or cps, and other related information specified by a user

APIs

hstgrism provides simple APIs which are explained more in the demo files, and in each module's docstring.

One-cell template

In [1]:
from hstgrism.container import Container
from hstgrism.confreader import ConfReader
from hstgrism.computesip import ComputeSIP
from hstgrism.computetracenwavelength import ComputeTraceNWavelength
from hstgrism.computexyref import ComputeXYREF
from hstgrism.wfc3irg102 import WFC3IRG102
from hstgrism.cutout2d import Cutout2D
from hstgrism.flatfield import FlatField
from hstgrism.background import Background
from hstgrism.to_fullframe import to_fullframe
from hstgrism.grismapcorr import GrismApCorr
from hstgrism.show_overview import show_overview
from hstgrism.show_bbox import show_bbox
from hstgrism.show_sum1d import show_sum1d
from scipy.interpolate import interp1d
import numpy as np
import pandas as pd
from astropy.io import fits
import matplotlib.pyplot as plt
import copy
##########
# Make Container
##########
saveprefix = 'testprefix'
savefolder = 'testfolder'
plotformat = 'pdf'
overwrite = True
container_params = {'saveprefix':saveprefix,
                    'savefolder':savefolder,
                    'plotformat':plotformat,
                    'overwrite':overwrite
                   }
containerobj = Container(**container_params)
##########
# Compute xyref
##########
xyd = (351.32706,545.13459)
confile = '/Users/kbhirombhakdi/_work/_calib_files/WFC3.IR.G102.cal.V4.32/G102.F105W.V4.32.conf'
beam = 'A'
gfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ekq/iab901ekq_flt.fits',1)
dfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ejq/iab901ejq_flt.fits',1) # file and extension
xyref_params = {'xyd':xyd,
                'xydiff':'default',
                'xyoff':'default',
                'confile':confile,
                'beam':beam,
                'gfile':gfile,
                'dfile':dfile,
                'container':containerobj
               }
xyrefobj = ComputeXYREF(**xyref_params)
xyrefobj.compute()
xyrefobj.save()
##########
# Map trace and wavelength
##########
xyref = xyrefobj.data['xyref']
grism_params = {'confile':confile,
                'beam':beam,
                'xyref':xyref,
                'container':containerobj
               }
grismobj = WFC3IRG102(**grism_params)
grismobj.compute()
grismobj.save()
##########
# Make cutout
##########
gfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ekq/iab901ekq_flt.fits',1)
tfile = './{0}/{1}_trace.csv'.format(containerobj.data['savefolder'],containerobj.data['saveprefix'])
tdata = pd.read_csv(tfile)
xg = tdata.xh + tdata.xyref[0]
yg = tdata.yh + tdata.xyref[1]
halfdy = 20
tbox = {'xg':xg,'yg':yg,'halfdyup':halfdy,'halfdylow':halfdy}
bbox={'padxleft': 15, 'padxright': 15, 'padyup': 15, 'padylow': 15}
image = fits.open(gfile[0])[gfile[1]].data
do_mask = True
cutoutobj = Cutout2D(tbox,bbox,image,containerobj)
cutoutobj.compute(do_mask)
cutoutobj.save()
##########
# Make flatfield
##########
flatfile = '/Users/kbhirombhakdi/_work/_calib_files/WFC3.IR.G102.cal.V4.32/WFC3.IR.G102.flat.2.fits'
tfile = './{0}/{1}_trace.csv'.format(containerobj.data['savefolder'],containerobj.data['saveprefix'])
tdata = pd.read_csv(tfile)
bbcorner = './{0}/{1}_bbcorner.csv'.format(containerobj.data['savefolder'],containerobj.data['saveprefix'])
bbdata = pd.read_csv(bbcorner)
flatobj = FlatField(container=containerobj,flatfile=flatfile,xyref=tdata.xyref,dldp=tdata.dldp)
flatobj.compute()
flatobj.save(do_cutout=True,
             bb0x=bbdata.bb0x[0],bb1x=bbdata.bb1x[0],bb0y=bbdata.bb0y[0],bb1y=bbdata.bb1y[0])
##########
# Background estimation
##########
sfolder,sprefix = containerobj.data['savefolder'],containerobj.data['saveprefix']
gfile = ('./{0}/{1}_cutout.fits'.format(sfolder,sprefix),1)
ffile = ('./{0}/{1}_cutflat.fits'.format(sfolder,sprefix),1)
mfile = ('./{0}/{1}_mask.fits'.format(sfolder,sprefix),1)
gdata = fits.open(gfile[0])[gfile[1]].data / fits.open(ffile[0])[ffile[1]].data
mdata = fits.open(mfile[0])[mfile[1]].data.astype(bool)
norder = 0
sigclip=(True, 5, 1.)
bkgobj = Background(gdata=gdata,norder=norder,mdata=mdata,sigclip=sigclip,container=containerobj)
bkgobj.poly2d.fit()
bkgobj.save(do_yfit=True,do_maskfit=True)
##########
# Place back to full frame
##########
savesuffix = 'FlattoFullFrame'
saveprefix = 'iab901ekq_replaced'
overwrite = False
containerobj_new = copy.deepcopy(containerobj)
containerobj_new.data['saveprefix'] = saveprefix
containerobj_new.data['overwrite'] = overwrite
cutfile = ('./{0}/{1}_cutflat.fits'.format(containerobj.data['savefolder'],containerobj.data['saveprefix']),1)
cutdata = fits.open(cutfile[0])[cutfile[1]].data
fullfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ekq/iab901ekq_flt.fits',1)
bb = pd.read_csv('./{0}/{1}_bbcorner.csv'.format(containerobj.data['savefolder'],containerobj.data['saveprefix']))
bb0x = bb.bb0x[0]
bb1x = bb.bb1x[0]
bb0y = bb.bb0y[0]
bb1y = bb.bb1y[0]
to_fullframe(cutdata,fullfile,
             bb0x,bb1x,bb0y,bb1y,
             savesuffix,containerobj
            )
##########
# Extraction
##########
sfolder,sprefix = containerobj.data['savefolder'],containerobj.data['saveprefix']
gfile = ('./{0}/{1}_cutout.fits'.format(sfolder,sprefix),1)
ffile = ('./{0}/{1}_cutflat.fits'.format(sfolder,sprefix),1)
bfile = ('./{0}/{1}_cutbkg.fits'.format(sfolder,sprefix),1)
mfile = ('./{0}/{1}_mask.fits'.format(sfolder,sprefix),1)
bbox = pd.read_csv('./{0}/{1}_bbox.csv'.format(sfolder,sprefix))
trace = pd.read_csv('./{0}/{1}_trace.csv'.format(sfolder,sprefix))
tbox = pd.read_csv('./{0}/{1}_tbox.csv'.format(sfolder,sprefix))
instrument = 'HST-WFC3-IR-G102'
ww = trace.ww.values
xh = trace.xh.values
cps = fits.open(gfile[0])[gfile[1]].data / fits.open(ffile[0])[ffile[1]].data
cps -= fits.open(bfile[0])[bfile[1]].data
cps *= fits.open(mfile[0])[mfile[1]].data
cps = np.nansum(cps,axis=0)[bbox.padxleft[0]:-bbox.padxright[0]]
apsizepix = tbox.halfdyup.values + tbox.halfdylow.values + 1
apsizearcsec = apsizepix * GrismApCorr().table[instrument]['scale']
apcorrobj = GrismApCorr(instrument=instrument,apsize=apsizearcsec,wave=ww,aptype='diameter',apunit='arcsec',waveunit='A')
apcorrobj.compute()
apcorr = apcorrobj.data['apcorr']
sfile = ('/Users/kbhirombhakdi/_work/_calib_files/WFC3.IR.G102.cal.V4.32/WFC3.IR.G102.1st.sens.2.fits',1)
sdata = pd.DataFrame(fits.open(sfile[0])[sfile[1]].data)
smodel = interp1d(sdata.WAVELENGTH,sdata.SENSITIVITY,kind='linear',bounds_error=False,fill_value=None)
sensitivity = smodel(ww)
sensitivity[~np.isfinite(sensitivity)] = 1. # replace nan with 1. to avoid divided-by-zero problem
xhdiff = np.diff(xh)
xhdiff = np.append(xhdiff,xhdiff[-1]) # add one last element to maintain the array dimension
wwdiff = np.diff(ww)
wwdiff = np.append(wwdiff,wwdiff[-1]) # add one last element to maintain the array dimension
wwperpix = wwdiff/xhdiff
flam = cps / (apcorr * sensitivity * wwperpix)
output = {'ww':ww,'cps':cps,'apcorr':apcorr,'sensitivity':sensitivity,'wwperpix':wwperpix,'flam':flam}
string = './{0}/flam.csv'.format(sfolder)
pd.DataFrame(output).to_csv(string)
Save ./testfolder/testprefix_xyref.csv
Rescale = False
Perform compute
Rescale = False
Perform compute
Rescale = False
Perform compute
Rescale = False
Perform compute
Save ./testfolder/testprefix_trace.csv
Save ./testfolder/testprefix_tbox.csv
Save ./testfolder/testprefix_bbox.csv
Save ./testfolder/testprefix_bbcorner.csv
Save ./testfolder/testprefix_cutout.fits
Save ./testfolder/testprefix_mask.fits
Save ./testfolder/testprefix_fullflat.fits
Save ./testfolder/testprefix_cutflat.fits
Rescale = True

Exclude 0 invalid data points
data_minmax = (0.0,254.0) : scale_minmax = (-1.0,1.0) : method = linear

Exclude 0 invalid data points
data_minmax = (0.0,73.0) : scale_minmax = (-1.0,1.0) : method = linear

Exclude 1 invalid data points
data_minmax = (-1.757429022930123,187.76333427616157) : scale_minmax = (-1.0,1.0) : method = linear
Rescale = False
Perform fit
Perform compute
Sigma clipping = True, sigma level = 1.0, iter #1
Update maskfit
Perform fit
Perform compute
Sigma clipping = True, sigma level = 1.0, iter #2
Update maskfit
Perform fit
Perform compute
Sigma clipping = True, sigma level = 1.0, iter #3
Update maskfit
Perform fit
Perform compute
Sigma clipping = True, sigma level = 1.0, iter #4
Update maskfit
Perform fit
Perform compute
Sigma clipping = True, sigma level = 1.0, iter #5
Save ./testfolder/testprefix_cutbkg.fits
Save ./testfolder/testprefix_maskfit.fits
Make a copy of /Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ekq/iab901ekq_flt.fits to ./testfolder/testprefix_FlattoFullFrame.fits
/Users/kbhirombhakdi/anaconda3/envs/py3/lib/python3.7/site-packages/ipykernel_launcher.py:101: RuntimeWarning: invalid value encountered in true_divide
/Users/kbhirombhakdi/anaconda3/envs/py3/lib/python3.7/site-packages/polynomial2d/polynomial2d.py:173: RuntimeWarning: invalid value encountered in greater
  tmpmask = (n_sig > sigma_level) # good data == False
Replace bad apcorr values with median = 0.988
/Users/kbhirombhakdi/anaconda3/envs/py3/lib/python3.7/site-packages/ipykernel_launcher.py:143: RuntimeWarning: invalid value encountered in true_divide

Examples of diagnostic plot

In [2]:
gfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ekq/iab901ekq_flt.fits',1)
dfile = ('/Users/kbhirombhakdi/_work/_data/11552/G102_set/mastDownload/HST/iab901ejq/iab901ejq_flt.fits',1) # file and extension
show_overview(gfile,dfile,xyd,xyref,grismobj.output['xh'],grismobj.output['yh'],grismobj.output['ww'],
              objname='GD153',
              save=True,
              container=containerobj,
             )
Save ./testfolder/testprefix_overview.pdf

In [3]:
show_bbox(cutoutdata=cutoutobj.cutout,
          objname='GD153',
          maskdata=cutoutobj.mask,
          do_trace=True,
          xcut=xg - cutoutobj.bbcorner['bb0x'],
          ycut=yg - cutoutobj.bbcorner['bb0y'],
          ww=tdata.ww,
          save=True,container=containerobj
         )
Save ./testfolder/testprefix_bbox.pdf
In [4]:
tmpdata = gdata - bkgobj.poly2d.model['YFIT']
tmpcont = copy.deepcopy(containerobj)
tmpcont.data['saveprefix'] += '_withMASKFIT'
show_sum1d(gdata=tmpdata,objname='GD153',mdata=bkgobj.poly2d.model['MASKFIT'],do_zero=True,save=True,container=tmpcont)
Save ./testfolder/testprefix_withMASKFIT_sum1d.pdf
In [5]:
tmpdata = gdata - bkgobj.poly2d.model['YFIT']
tmpcont = copy.deepcopy(containerobj)
tmpcont.data['saveprefix'] += '_withoutMASKFIT'
show_sum1d(gdata=tmpdata,objname='GD153',mdata=None,do_zero=True,save=True,container=tmpcont)
Save ./testfolder/testprefix_withoutMASKFIT_sum1d.pdf
In [6]:
plt.plot(ww,cps)
plt.xlabel('wavelength (A)',fontsize=12)
plt.ylabel('cps',fontsize=12)
plt.title('GD153',fontsize=12)
Out[6]:
Text(0.5, 1.0, 'GD153')
In [7]:
plt.plot(ww,sensitivity,'x:',label='from computation')
plt.errorbar(sdata.WAVELENGTH,sdata.SENSITIVITY,sdata.ERROR,alpha=0.4,label='from file')
plt.legend()
plt.xlabel('wavelength (A)',fontsize=12)
plt.ylabel('sensitivity (cps/A per flam)',fontsize=12)
plt.title(instrument,fontsize=12)
Out[7]:
Text(0.5, 1.0, 'HST-WFC3-IR-G102')
In [8]:
calspec = ('/Users/kbhirombhakdi/_work/_calib_files/PYSYN_CDBS/calspec/gd153_mod_010.fits',1)
caldata = pd.DataFrame(fits.open(calspec[0])[calspec[1]].data)

plt.figure(figsize=(10,10))
plt.plot(ww,flam,label='from extraction')
m = np.argwhere((caldata.WAVELENGTH.values >= ww.min())&(caldata.WAVELENGTH.values <= ww.max())).flatten()
plt.plot(caldata.WAVELENGTH.values[m],caldata.FLUX.values[m],'--',label='from calspec')
plt.legend()
plt.xlabel('wavelength (A)',fontsize=12)
plt.ylabel('flam',fontsize=12)
plt.title('GD153',fontsize=12)
plt.ylim(1e-15,5e-15)
plt.yscale('log')
In [ ]: