Archive Class¶
The Archive class is the primary mechanism for opening PSRFITS files.
-
class
Archive
(filename[, prepare=True, lowmem=False, verbose=True, weight=True, center_pulse=True, baseline_removal=True, wcfreq=True, thread=False, onlyheader=False])¶
Parameters:
- prepare (bool) – Argument passed to load(). If True, then the file will be automatically polarization averaged with pscrunch(), dedispersed with dedisperse() (using the weighted center frequency if the parameter wcfreq is set to True), and centered with center() if center_pulse is set to True.
- lowmem (bool) – Argument passed to load(). If True, then the PSRFITS file is opened in memmap mode and the data arrays are also replaced with memmaps.
- verbose (bool) – Print extra information on loading and processing.
- weight (bool) – Argument passed to load(). Use the stored data weights, which is the typical mode.
- center_pulse (bool) – Argument passed to load(). If True, then the peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes; the resulting arrival-time shifts are computed and stored internally.
- baseline_removal (bool) – Argument passed to load(). Subtracts the baseline intensity of the average profile off-pulse region from all individual data profiles using remove_baseline().
- wcfreq (bool) – Argument passed to load(). If True, then the weighted center frequency is used in dedisperse() if prepare=True.
- thread (bool) – Argument passed to load(). If True, then the calculation of the data array will be parallelized, which can lead to some speed-up for large data files but will take longer for small data files given the extra overhead required to start the process.
- onlyheader (bool) – Argument passed to load(). If True, then only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.
Usage:
from pypulse.archive import Archive
ar = Archive(FILENAME)  # loads the archive; dedispersed and polarization averaged by default
ar.tscrunch()  # averages the pulses in time
data = ar.getData()  # returns the numpy data array
ar.imshow()  # plots frequency vs. phase for the pulses
Description of Data¶
From Appendix A.1 of the thesis Lam 2016:
The primary data array of profiles in a PSRFITS file is given by \(\mathcal{I}(t,\mathrm{pol},\nu,\phi)\), the pulse intensity as a function of time \(t\), polarization \(\mathrm{pol}\), frequency \(\nu\), and phase \(\phi\), where the arguments are in the order of the array dimensions. To save memory, intensity data are stored in multiple arrays. The raw data array (DATA) \(d\) is the largest in dimensionality but for folded pulse data is typically stored as an array of 16-bit integers. To retrieve the data value for each pulse profile, the data array is multiplied by a scale array (DAT_SCL) \(s\) and an offset array (DAT_OFFS) \(o\) is added. An array of weights (DAT_WTS) \(w\) is also stored internally and typically modifies the raw data, e.g., when excising radio frequency interference. The three modifier arrays are of much smaller size than the data array and are typically stored in 32-bit single-precision float format. Mathematically, the resultant array of pulse intensities can be written as
\[\mathcal{I}(t,\mathrm{pol},\nu,\phi) = w(t,\nu) \times \left[ s(t,\mathrm{pol},\nu) \times d(t,\mathrm{pol},\nu,\phi) + o(t,\mathrm{pol},\nu) \right]\]
PSRFITS files also contain a wide range of additional information stored internally, including a history of all PSRCHIVE modifications to the file, a folding ephemeris, and a large global header of useful metadata. Besides the data array, PyPulse will unpack and store all extra information for retrieval via get() methods as desired.
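The reconstruction described above can be sketched with NumPy broadcasting. This is a minimal illustration on synthetic arrays, not PyPulse's own loading code; the function name and the assumed array shapes (scales and offsets per subintegration, polarization, and channel; weights per subintegration and channel) are assumptions made for the example.

```python
import numpy as np

def reconstruct_intensity(d, s, o, w):
    """Apply scale, offset, and weights to a raw PSRFITS-style data array.

    Assumed shapes (hypothetical, for illustration):
      d: raw data (DATA), (nsubint, npol, nchan, nbin), e.g. int16
      s: scales (DAT_SCL), (nsubint, npol, nchan)
      o: offsets (DAT_OFFS), (nsubint, npol, nchan)
      w: weights (DAT_WTS), (nsubint, nchan)
    """
    d = d.astype(np.float64)
    # Broadcast s and o over the phase axis.
    scaled = s[..., np.newaxis] * d + o[..., np.newaxis]
    # Broadcast w over the polarization and phase axes.
    return w[:, np.newaxis, :, np.newaxis] * scaled

# Tiny synthetic example: 2 subints, 1 polarization, 3 channels, 4 bins.
rng = np.random.default_rng(0)
d = rng.integers(-100, 100, size=(2, 1, 3, 4)).astype(np.int16)
s = np.full((2, 1, 3), 0.5, dtype=np.float32)
o = np.full((2, 1, 3), 10.0, dtype=np.float32)
w = np.ones((2, 3), dtype=np.float32)
w[0, 1] = 0.0  # e.g. a channel zero-weighted during RFI excision

intensity = reconstruct_intensity(d, s, o, w)
```

Storing the large array as 16-bit integers and keeping only the small modifier arrays as floats is what makes the memory saving possible; the full floating-point cube exists only after this step.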
Methods¶
-
load
(filename[, prepare=True, center_pulse=True, baseline_removal=True, weight=True, wcfreq=True, onlyheader=False])¶ Load a PSRFITS file, process the metadata, and form the data arrays. This is called internally by __init__().
Parameters:
- filename (str) – Path to load file from.
- prepare (bool) – This performs three tasks. It will polarization average the data via pscrunch(), dedisperse the data with dedisperse(), and rotate the pulse so that the peak is in the center of phase with center(). For centering, this will store the relevant time delays associated with the rotation.
- center_pulse (bool) – The peak of the pulse is centered in the middle of the data arrays. This is preferred for plotting purposes; the resulting arrival-time shifts are computed and stored internally.
- baseline_removal (bool) – Subtract the baseline intensity of the average profile off-pulse region from all individual data profiles.
- weight (bool) – Use the stored data weights, which is the typical mode.
- wcfreq (bool) – The weighted center frequency is used in dedisperse() if prepare=True.
- onlyheader (bool) – Only the primary and table headers are processed, without the data array. This is much faster if you only need access to metadata.
Returns: None
-
save
(filename)¶ Save the data to a new PSRFITS file.
Parameters: filename (str) – Path to save file to.
Warning
save()
will output a PSRFITS file but the output data arrays vary slightly from the input data arrays. More
-
gc
()¶ Manually clear the data cube and weights for Python garbage collection
-
shape
([squeeze=True])¶ Return the shape of the data array.
Parameters: squeeze (bool) – Return the shape of the data array when dimensions of length 1 are removed. Returns: shape, tuple of integers
-
reset
([prepare=True])¶ Replace the data with the original clone, preventing full reloading. Useful for larger files but only if the lowmem flag is set to True.
Parameters: prepare (bool) – Argument passed to load()
.
-
scrunch
([arg='Dp', **kwargs])¶ Average the data cube along different axes.
Parameters: arg (str) – Can be T for tscrunch(), p for pscrunch(), F for fscrunch(), B for bscrunch(), and D for dedisperse(), following the PSRCHIVE conventions.
Returns: self
-
tscrunch
([nsubint=None, factor=None])¶ Perform a weighted average of the data cube along the time dimension.
Parameters: Returns: self
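A weighted average along the time axis can be sketched as follows. This is an illustrative NumPy version under assumed shapes (data already polarization averaged), not the PyPulse implementation; the function name and the choice to return zeros for channels with no weight are assumptions.

```python
import numpy as np

def weighted_time_average(data, weights):
    """Weighted mean over subintegrations.

    data: (nsubint, nchan, nbin); weights: (nsubint, nchan).
    Returns an array of shape (nchan, nbin).
    """
    w = weights[:, :, np.newaxis]          # broadcast weights over phase
    wsum = w.sum(axis=0)                   # total weight per channel
    total = (w * data).sum(axis=0)         # weighted sum of profiles
    safe = np.where(wsum > 0, wsum, 1.0)   # avoid dividing by zero
    # Channels whose total weight is zero are returned as zeros.
    return np.where(wsum > 0, total / safe, 0.0)

data = np.array([[[1.0, 2.0]], [[3.0, 6.0]]])  # 2 subints, 1 channel, 2 bins
weights = np.array([[1.0], [3.0]])
avg = weighted_time_average(data, weights)
```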
-
pscrunch
()¶ Perform an average of the data cube along the polarization dimension. Can handle data in Coherence (AABBCRCI) or Stokes (IQUV) format.
Returns: self
Todo
Perform a weighted average of the data cube
-
fscrunch
([nchan=None, factor=None])¶ Perform a weighted average of the data cube along the frequency dimension.
Parameters: Returns: self
-
bscrunch
([nbins=None, factor=None])¶ Perform an average of the data cube along the phase (bin) dimension.
Parameters: Returns: self
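Averaging along the phase axis by an integer factor can be sketched with a reshape. This is an unweighted illustration with a hypothetical helper name; it assumes the factor divides the number of bins evenly and is not taken from PyPulse.

```python
import numpy as np

def bin_average(profile, factor):
    """Average groups of `factor` adjacent phase bins along the last axis."""
    nbin = profile.shape[-1]
    if nbin % factor != 0:
        raise ValueError("factor must divide the number of phase bins")
    # Split the phase axis into (nbin // factor) groups of length factor.
    new_shape = profile.shape[:-1] + (nbin // factor, factor)
    return profile.reshape(new_shape).mean(axis=-1)

profile = np.arange(8, dtype=float)
reduced = bin_average(profile, 2)
```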
Todo
Perform a weighted average of the data cube
-
dedisperse
([DM=None, reverse=False, wcfreq=False])¶ Dedisperse the pulses by introducing the appropriate time delays and rotating in phase.
Parameters: Returns: self
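The idea of dedispersion by phase rotation can be sketched using the standard cold-plasma dispersion delay, \(\Delta t = K\,\mathrm{DM}\,(\nu^{-2} - \nu_\mathrm{ref}^{-2})\). The sketch below rotates each channel by a whole number of bins with numpy.roll; the function names, the integer-bin rounding, and the argument list are assumptions for illustration and do not reproduce PyPulse's internals.

```python
import numpy as np

K_DM = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3

def dispersion_delay(dm, freqs_mhz, ref_freq_mhz):
    """Delay in seconds of each frequency relative to the reference frequency."""
    return K_DM * dm * (freqs_mhz ** -2.0 - ref_freq_mhz ** -2.0)

def dedisperse_sketch(data, freqs_mhz, dm, period, ref_freq_mhz, reverse=False):
    """Rotate each channel of data (nchan, nbin) by its dispersion delay.

    With reverse=True the rotation is applied in the opposite direction,
    re-introducing the dispersive delays.
    """
    nbin = data.shape[-1]
    delays = dispersion_delay(dm, freqs_mhz, ref_freq_mhz)
    shifts = np.rint(delays / period * nbin).astype(int)  # whole bins only
    sign = 1 if reverse else -1
    out = np.empty_like(data)
    for i, shift in enumerate(shifts):
        out[i] = np.roll(data[i], sign * shift)
    return out

# Synthetic dispersed pulse: one spike per channel, delayed at lower frequency.
freqs = np.array([1400.0, 1200.0, 1000.0])
period, nbin, dm = 0.005, 64, 10.0
delays = dispersion_delay(dm, freqs, freqs[0])
data = np.zeros((3, nbin))
for i, delay in enumerate(delays):
    data[i, (10 + int(np.rint(delay / period * nbin))) % nbin] = 1.0

aligned = dedisperse_sketch(data, freqs, dm, period, freqs[0])
redispersed = dedisperse_sketch(aligned, freqs, dm, period, freqs[0], reverse=True)
```

Rounding to whole bins keeps the example short; a sub-bin rotation would need interpolation or a Fourier-domain phase shift.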
-
dededisperse
([DM=None, wcfreq=False])¶ Runs
dedisperse()
with the reverse=True flag to undo the dedispersion. See that function for parameter notation.
-
calculateAverageProfile
()¶ Calculate the average profile by performing an unweighted average along each dimension. Automatically calls
calculateOffpulseWindow()
.
Todo
Perform a weighted average.
-
calculateOffpulseWindow
()¶ Calculate an off-pulse window using the
SinglePulse
, with the windowsize parameter equal to one-eighth the number of phase bins.
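One plausible way to pick an off-pulse window is to slide a window of one-eighth the number of bins around the profile and keep the position with the smallest summed intensity. The sketch below does that with wrap-around in phase; the minimum-sum criterion and the function name are assumptions, and the actual SinglePulse logic may differ.

```python
import numpy as np

def offpulse_window(profile, windowsize=None):
    """Return bin indices of the contiguous window with the smallest sum."""
    nbin = len(profile)
    if windowsize is None:
        windowsize = nbin // 8
    # Circularly extend the profile so windows may wrap around in phase.
    extended = np.concatenate([profile, profile[:windowsize - 1]])
    # sums[k] is the sum of the window starting at bin k.
    sums = np.convolve(extended, np.ones(windowsize), mode="valid")
    start = int(np.argmin(sums))
    return (start + np.arange(windowsize)) % nbin

# Gaussian pulse centered on bin 32 of 64.
profile = np.exp(-0.5 * ((np.arange(64) - 32) / 3.0) ** 2)
window = offpulse_window(profile)
```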
-
center
([phase_offset=0.5])¶ Center the peak of the pulse in the middle of the data arrays.
Parameters: phase_offset (float) – Determine the phase offset (in [0,1]) of the peak, i.e., impose an arbitrary rotation to where the center of the peak should fall. Returns: self
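Centering amounts to a circular rotation that moves the profile peak to the bin at phase_offset. A minimal sketch on a one-dimensional profile, with a hypothetical function name and whole-bin rotation assumed:

```python
import numpy as np

def center_sketch(profile, phase_offset=0.5):
    """Rotate profile so its peak lands at phase_offset (fraction of a turn).

    Returns the rotated profile and the applied shift in bins, which is the
    quantity that would need to be remembered to correct arrival times.
    """
    nbin = len(profile)
    target = int(phase_offset * nbin)
    shift = target - int(np.argmax(profile))
    return np.roll(profile, shift), shift

profile = np.zeros(16)
profile[3] = 1.0
centered, shift = center_sketch(profile)
```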
-
removeBaseline
()¶ Removes the baseline of the pulses given the off-pulse window of the average pulse profile pre-calculated by
calculateAverageProfile()
. Returns: self
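Baseline removal can be sketched as subtracting, from every profile, its mean over a shared set of off-pulse bins. The helper name and the choice of a per-profile mean are assumptions for illustration; the off-pulse bins are taken as given.

```python
import numpy as np

def remove_baseline_sketch(data, offpulse_bins):
    """Subtract each profile's off-pulse mean.

    data: array whose last axis is phase, any number of leading axes.
    offpulse_bins: integer indices of the off-pulse window.
    """
    baseline = data[..., offpulse_bins].mean(axis=-1, keepdims=True)
    return data - baseline

data = np.array([[5.0, 5.0, 9.0, 5.0],
                 [1.0, 1.0, 4.0, 1.0]])
flattened = remove_baseline_sketch(data, np.array([0, 1]))
```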
-
remove_baseline
()¶ See
removeBaseline()
.
-
getLevels
([differences=False])¶ Returns calibration levels if the Archive is a calibrator in the form of a square wave signal. If differences is set to True, then this function will return the frequencies, the amplitude differences in the height of the square wave as a function of polarization/frequency, and the associated errors. If False, then it will return the frequencies, the mean values of the low and high portions of the square wave and the associated errors.
Parameters: differences (bool) –
-
getPulsarCalibrator
()¶ Uses
getLevels()
to get a Calibrator
object with associated metadata. Return type: Calibrator
-
calibrate
(psrcal[, fluxcal=None])¶ Polarization calibrates the data using another archive file. Flux calibration optional.
Parameters:
Warning
This function is under construction.
-
getData
([squeeze=True, setnan=None, weight=True])¶ Return the data array.
Parameters: Returns: the data array (numpy.ndarray)
-
setData
(newdata)¶ Replaces the data array with new data. Must be the same shape.
Parameters: newdata (numpy.ndarray) – New data array.
-
getWeights
([squeeze=True])¶ Return a copy of the weights array.
Parameters: squeeze (bool) – All dimensions of length 1 are removed.
-
setWeights
(val[, t=None, f=None])¶ Set weights to a certain value. Can be used for RFI-excision routines.
Parameters:
-
saveData
([filename=None, ext='npy', ascii=False])¶ Save the data array to a different format. Default is to save to a numpy binary file (.npy).
Parameters:
- filename (str) – Filename to save the data to. If None, save to the archive’s original filename after replacing the extension with ext.
- ext (str) – Filename extension.
- ascii (bool) – Save the data to a text file. If all four dimensions have length greater than 1, the data are saved in time, polarization, frequency, and phase order, with intensity as the fifth column. Otherwise, use numpy’s savetxt() to output the array.
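The five-column text layout described for ascii=True can be sketched by pairing every intensity with its four array indices. Whether the saved columns hold indices or physical axis values is not stated here, so using indices is an assumption, as is the helper name.

```python
import numpy as np

def to_five_columns(data):
    """Flatten a 4-D (time, pol, freq, phase) array into rows of t, pol, f, phi, I."""
    # One row of four indices per element, in C (row-major) order.
    idx = np.indices(data.shape).reshape(4, -1).T
    return np.column_stack([idx, data.ravel()])

data = np.arange(16, dtype=float).reshape(2, 2, 2, 2)
table = to_five_columns(data)
```

A table like this can then be written with numpy.savetxt.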
-
outputPulses
(filename)¶ Write out a standard .npy file by calling
saveData()
. Parameters: filename (str) – Filename to save the data to.
-
getAxis
([flag=None, edges=False, wcfreq=False])¶ Get the time or frequency axes for plotting.
Parameters: Return type: numpy.ndarray
Todo
Let flag be both “T” and “F”.
-
getFrequencies
()¶ Convenience function for
getAxis('F')
-
getFreqs
()¶ See
getFrequencies()
.
-
getTimes
()¶ Convenience function for
getAxis('T')
-
getPulse
(t[, f=None])¶ Get the pulse shape as a function of time and potentially frequency if provided. Assumes the shape of the data is polarization averaged.
Parameters: Return type: numpy.ndarray
Todo
Do not assume polarization averaging.
-
getPeakFlux
(t[, f=None])¶ Return the maximum value of the pulses, with parameters passed to
getPulse()
Parameters: Return type:
-
getIntegratedFlux
(t[, f=None])¶ Return the integrated value of the pulses, with parameters passed to
getPulse()
Parameters: Return type:
-
getSinglePulses
([func=None, windowsize=None, **kwargs])¶ Efficiently wrap the data array with SinglePulse.
Parameters:
- func (function) – Arbitrary function to map onto the data array.
- windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
- **kwargs – Additional parameters passed to SinglePulse.
Return type: numpy.ndarray of type np.object
-
fitPulses
(template, nums[, flatten=False, func=None, windowsize=None, **kwargs])¶ Fit all of the pulses with a given template shape.
Parameters:
- template (list/numpy.ndarray) – Template shape.
- nums (list/numpy.ndarray) – Numbers that denote which return values from fitPulse() of SinglePulse to keep. Example: to return only TOA values, use nums=[1]. For TOA values and scale factors, use nums=[1,3].
- flatten (bool) – Flatten the data array.
- func (function) – Arbitrary function to map onto the data array.
- windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
- **kwargs – Additional parameters passed to SinglePulse.
-
getDynamicSpectrum
([window=None, template=None, mpw=None, align=None, windowsize=None, verbose=False, snr=False, maketemplate=True])¶ Return the dynamic spectrum.
Parameters:
- window (numpy.ndarray) – Return the dynamic spectrum using only certain phase bins.
- template (list/numpy.ndarray) – Generate the dynamic spectrum using the scale factor from template matching. Otherwise simply sum along the phase axis.
- mpw (list/numpy.ndarray) – Main-pulse window if calculating the dynamic spectrum using a template. Required if a template is provided.
- align (float) – Parameter passed to SinglePulse that describes a rotation of the pulse.
- windowsize (int) – Parameter passed to SinglePulse that describes the off-pulse window length.
- verbose (bool) – Print the time index as each template is fit.
- snr (bool) – Instead of the scale factors, return the signal-to-noise ratios.
- maketemplate (bool) – Instead of supplying a template, make a basic smoothed one from the average pulse for matched filtering.
Warning
Return values are not well-defined. This can either return the dynamic spectrum, or a tuple of the scale factors, offsets, and errors of the template fit.
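The simplest case described above, summing along the phase axis with an optional set of phase bins, can be sketched as follows. The function name and the assumed (time, frequency, phase) layout are illustrative; the template-matching path is not shown.

```python
import numpy as np

def dynamic_spectrum_sketch(data, window=None):
    """Sum intensity over phase to get a (nsubint, nchan) dynamic spectrum.

    data: (nsubint, nchan, nbin); window: optional integer bin indices.
    """
    if window is not None:
        data = data[..., window]  # keep only the selected phase bins
    return data.sum(axis=-1)

data = np.ones((3, 4, 8))
ds_full = dynamic_spectrum_sketch(data)
ds_window = dynamic_spectrum_sketch(data, window=np.arange(2, 5))
```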
-
plot
([ax=None, show=True])¶ Basic plotter of the data, if the data array can be reduced to one dimension.
Parameters: - ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
- show (bool) – Generate a matplotlib plot display.
-
imshow
([ax=None, cbar=False, mask=None, show=True, **kwargs])¶ Basic plotter of the data, if the data array can be reduced to two dimensions. The origin is set to the lower left.
Parameters: - ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
- cbar (bool) – Include a matplotlib colorbar.
- mask (numpy.ndarray) – Apply a mask array using the conventions of a numpy masked array (numpy.ma.core.MaskedArray)
- show (bool) – Generate a matplotlib plot display.
- **kwargs – Additional arguments to pass to imshow.
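Under the NumPy masked-array convention, a mask value of True marks an element as hidden. A short sketch of building such a mask for a two-dimensional array (the array contents here are synthetic):

```python
import numpy as np

data = np.arange(12, dtype=float).reshape(3, 4)

# True means "masked out"; here the whole middle row is hidden.
mask = np.zeros(data.shape, dtype=bool)
mask[1, :] = True

masked = np.ma.masked_array(data, mask=mask)
visible_mean = masked.mean()  # computed over unmasked elements only
```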
-
pavplot
([ax=None, mode='GTpd', show=True, wcfreq=True])¶ Produces a PSRCHIVE pav-like plot for comparison
Parameters: ax (matplotlib.axes._subplots.AxesSubplot) – Provide a matplotlib axis to plot to.
-
waterfall
([offset=None, border=0, labels=True, album=False, bins=None, show=True])¶ Produce a waterfall plot if the data array can be reduced to two dimensions.
Parameters:
-
joyDivision
([border=0.1, labels=False, album=True, **kwargs])¶ Calls
waterfall()
in the style of the Joy Division album cover. All parameters are passed to the function.
-
time
(template, filename[, MJD=False, wcfreq=False, **kwargs])¶ Calculate times-of-arrival (TOAs).
Parameters: - template (list/numpy.ndarray/Archive) – Template shape to fit to the pulses.
- filename (str) – Path to save text to. If filename=None, print the text.
- MJD (bool) – Calculate absolute TOAs in MJD units instead of relative TOAs in bin (time) units.
- simple (bool) –
- wcfreq (bool) – Use the weighted center frequency.
Warning
MJD=True is currently under testing and comparisons with PSRCHIVE.
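The basic idea behind a relative TOA in bin units is to find the shift that best aligns a template with an observed profile. The sketch below uses the peak of a circular cross-correlation computed with FFTs, which gives only whole-bin resolution. It is a simplified stand-in chosen for brevity, with a hypothetical function name, and is not the fitting procedure used by this method.

```python
import numpy as np

def lag_by_cross_correlation(profile, template):
    """Integer bin shift by which `profile` lags `template` (circular)."""
    n = len(profile)
    # Correlation theorem: ccf[k] = sum_n profile[n + k] * template[n]
    ccf = np.fft.irfft(np.fft.rfft(profile) * np.conj(np.fft.rfft(template)), n=n)
    return int(np.argmax(ccf))

nbin = 64
template = np.exp(-0.5 * ((np.arange(nbin) - 20) / 2.0) ** 2)
profile = np.roll(template, 7)  # same shape, arriving 7 bins later
lag = lag_by_cross_correlation(profile, template)
```

Multiplying the lag by the time per phase bin converts it to a delay in seconds.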
-
getPeriod
([header=False])¶ Returns the period of the pulsar. By default returns the Polyco-calculated period. Otherwise, returns the period as calculated by the pulsar parameter table. If a calibrator file, returns 1 divided by the header CAL_FREQ value.
Parameters: header (bool) – Enforce a return of the pulsar parameter table value. Return type: float
-
getValue
(value)¶ Looks for a key in one of the headers and returns the value. First looks in the primary header, then the subintegration header, then the pulsar parameter table if it exists.
Parameters: value (str) – Value to look for. Return type: str
-
getSubintinfo
(value)¶ Looks for a key in the subintegration header, a subset of the functionality of
getValue()
Parameters: value (str) – Value to look for. Return type: str
-
getMJD
([full=False, numwrap=float])¶
-
getTbin
([numwrap=float])¶ Returns the time per phase bin.
Parameters: numwrap (type) – Cast the return value into a type. Return type: Value given by numwrap
-
getCoords
([parse=True])¶ Returns the header coordinate (RA, DEC) values.
Parameters: parse (bool) – Return each value as a tuple of floats. Returns: RA, Dec, each either as strings or as tuples.
-
getPulsarCoords
([parse=True])¶ See
getCoords()
.
-
getBandwidth
([header=False])¶ Returns the observation bandwidth as the product of the channel bandwidth (subintegration header CHAN_BW) and the number of channels (subintegration header NCHAN) values.
Parameters: header (bool) – Returns the header OBSBW value Return type: float
-
getDurations
()¶ Return the subintegration durations array. Return type: numpy.ndarray
Todo
Check for completeness of inputs into the durations array
-
getCenterFrequency
([weighted=False])¶ Returns the center frequency. If a HISTORY table is provided in the PSRFITS file, return the latest CTR_FREQ value. Otherwise, return the header OBSFREQ value.
Parameters: weighted (bool) – Return the center frequency weighted by the weights array, \(\sum_i w_i \nu_i / \sum_i w_i\) for frequency channel \(i\). Return type: float
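The weighted center frequency formula can be checked with a few lines of NumPy. The function name and the example values are hypothetical.

```python
import numpy as np

def weighted_center_frequency(freqs, weights):
    """Return sum_i(w_i * nu_i) / sum_i(w_i)."""
    freqs = np.asarray(freqs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * freqs) / np.sum(weights))

# Three channels in MHz; the top channel is zero-weighted (e.g. excised RFI).
wcf = weighted_center_frequency([1000.0, 1200.0, 1400.0], [1.0, 1.0, 0.0])
```

With the top channel excluded, the weighted value sits below the nominal band center of 1200 MHz, which is why the choice matters for dedispersion.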
-
getFreqUnit
()¶
-
getScaleUnit
()¶ See
getDataUnit()
-
getIntensityUnit
()¶ See
getDataUnit()
-
getFluxDensityUnit
()¶ See
getDataUnit()
-
getFluxUnit
()¶ See
getDataUnit()
-
isCalibrator
()¶ Returns if the file is a calibration observation or not, given by the OBS_MODE flag in the header.
Return type: bool
-
record
(frame)¶ Internal function that runs within state-changing functions to record those state changes to a history variable that can be written out if the archive is saved.
Parameters: frame (frame) – Frame object returned by python’s inspect module.
-
print_pypulse_history
()¶ Prints all elements in the PyPulse history list.
History class¶
The History class stores the History table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.history (no function call).
-
class
History
(history)¶ Parameters: history (pyfits.hdu.table.BinTableHDU) – The binary table header data unit (HDU).
-
getValue
(field[, num=None]) Returns a dictionary array value.
Parameters: field (str) – A column name (i.e. as provided by hdulist['HISTORY'].columns). Example: getValue('NCHAN') will return a list of the frequency channelization history of the file.
-
getLatest
(field)¶ Returns the latest key value for a given field.
Parameters: field (str) – A column name, see getValue()
Polyco Class¶
The Polyco class stores the Polyco table in the PSRFITS file. Typical users should not need to worry about using this class directly. It can be accessed in an Archive ar using ar.polyco (no function call).
-
class
Polyco
(polyco[, MJD=None])¶ Parameters: MJD (float) – A default MJD to calculate the Polyco on.
-
getValue
(field[, num=None]) Returns a dictionary array value.
Parameters: field (str) – A column name (i.e. as provided by hdulist['POLYCO'].columns)
-
getLatest
(field) Returns the latest key value for a given field.
Parameters: field (str) – A column name, see getValue()