TESGroup¶
- class mass.TESGroup(filenames, noise_filenames=None, noise_only=False, noise_is_continuous=True, hdf5_filename=None, hdf5_noisefilename=None, never_use=None, use_only=None, max_chans=None, experimentStateFile=None, excludeStates='auto', overwrite_hdf5_file=False)[source]¶
The interface for a group of one or more microcalorimeters.
- property channel¶
- compute_average_pulse(masks, subtract_mean=True, forceNew=False)[source]¶
Compute an average pulse in each TES channel.
Store the averages in self.datasets.average_pulse, a length nSamp vector. Note that this method replaces any previously computed self.datasets.average_pulse
- Args:
- masks: A sequence of length self.n_channels, one sequence per channel.
The elements of masks should be booleans or interpretable as booleans.
- subtract_mean (bool): whether each average pulse will subtract a constant
to ensure that the pretrigger mean (first self.nPresamples elements) is zero.
- compute_filters(fmax=None, f_3db=None, cut_pre=0, cut_post=0, forceNew=False, category=None, filter_type='ats')[source]¶
- compute_noise_spectra(max_excursion=1000, forceNew=False)[source]¶
Replaced by the equivalent compute_noise(…)
Deprecated since version 0.7.9: Use compute_noise(), which is equivalent but better named
- correct_flux_jumps(flux_quant)[source]¶
Remove ‘flux jumps’ from pretrigger mean.
When using umux readout, if a pulse is recorded that has a very fast rising edge (e.g. a cosmic ray), the readout system will “slip” an integer number of flux quanta. This means that the baseline level returned to after the pulse will differ from the pretrigger value by an integer number of flux quanta. This causes the pretrigger mean summary quantity to jump around in a way that causes trouble for the rest of MASS. This function attempts to correct these jumps.
Arguments: flux_quant – size of 1 flux quantum
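As a rough illustration of the idea (not necessarily the algorithm MASS uses), the step between consecutive pretrigger means can be rounded to the nearest whole number of flux quanta, and the accumulated offset subtracted. The function name remove_flux_jumps and the sample values below are hypothetical.

```python
import numpy as np

def remove_flux_jumps(pretrig_mean, flux_quant):
    """Subtract accumulated integer-flux-quantum jumps from a baseline series."""
    pretrig_mean = np.asarray(pretrig_mean, dtype=float)
    # Step between consecutive pulses (the first step is defined as zero).
    steps = np.diff(pretrig_mean, prepend=pretrig_mean[0])
    # Express each step in flux quanta and round to the nearest integer.
    n_quanta = np.round(steps / flux_quant)
    # Running total of slipped quanta, converted back to raw units.
    offset = np.cumsum(n_quanta) * flux_quant
    return pretrig_mean - offset

# Hypothetical baseline: a slip of +2 quanta, later a slip of -1 quantum.
flux_quant = 4096.0
baseline = np.array([100.0, 101.0, 8293.0, 8292.0, 4197.0, 4196.0])
corrected = remove_flux_jumps(baseline, flux_quant)
```

This simple version assumes that genuine baseline changes between consecutive pulses are much smaller than half a flux quantum; larger drifts would be misread as jumps.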
- property external_trigger_subframe_as_seconds¶
This is not a posix timestamp, it is just the external trigger subframecount converted to seconds based on the nominal clock rate of the crate.
- property external_trigger_subframe_count¶
- property first_good_dataset¶
- property good_channels¶
- hist(bin_edges, attr='p_energy', t0=0, tlast=1e+20, category={}, g_func=None)[source]¶
Return a tuple of (bin_centers, counts) of p_energy of good pulses in all good datasets (use .hists to get the histograms individually). NaN values are filtered out.
- Args:
- bin_edges: edges of the bins used for the histogram.
- attr: which attribute to histogram, “p_energy” or “p_filt_value”.
- t0, tlast: pulses outside this time range are cut before histogramming.
- g_func: a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer.
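The selection logic described above can be sketched with plain numpy. This is an illustration under assumptions, not the MASS implementation: hist_good and its arguments are hypothetical names, and the arrays are made-up data.

```python
import numpy as np

def hist_good(values, good, bin_edges, extra_mask=None):
    """Histogram the good, non-NaN values and return (bin_centers, counts)."""
    values = np.asarray(values, dtype=float)
    # Keep pulses that pass cuts and whose value is not NaN.
    keep = np.asarray(good, dtype=bool) & ~np.isnan(values)
    # Optionally AND in a user-supplied mask (the role g_func plays above).
    if extra_mask is not None:
        keep &= np.asarray(extra_mask, dtype=bool)
    counts, _ = np.histogram(values[keep], bins=bin_edges)
    bin_centers = 0.5 * (bin_edges[1:] + bin_edges[:-1])
    return bin_centers, counts

energies = np.array([5898.0, 5899.5, np.nan, 5901.2, 5899.9])
good = np.array([True, True, True, False, True])
bin_edges = np.arange(5897.0, 5903.0, 1.0)
bin_centers, counts = hist_good(energies, good, bin_edges)
```

Note that the returned centers array is one element shorter than bin_edges.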
- hists(bin_edges, attr='p_energy', t0=0, tlast=1e+20, category={}, g_func=None)[source]¶
Return a tuple of (bin_centers, countsdict), where countsdict is a dictionary mapping channel numbers to numpy arrays of counts. NaN values are filtered out automatically.
- Args:
- bin_edges: edges of the bins used for the histogram.
- attr: which attribute to histogram, “p_energy” or “p_filt_value”.
- t0, tlast: pulses outside this time range are cut before histogramming.
- g_func: a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer.
- iter_channel_numbers(include_badchan=False)[source]¶
Iterator over the channel numbers in channel number order
- Args:
- include_badchan (bool): whether to include officially bad channels
in the result (default False).
- iter_channels(include_badchan=False)[source]¶
Iterator over the self.datasets in channel number order
- Args:
- include_badchan (bool): whether to include officially bad channels
in the result (default False).
- linefit(line_name='MnKAlpha', t0=0, tlast=1e+20, axis=None, dlo=50, dhi=50, binsize=1, bin_edges=None, attr='p_energy', label='full', plot=True, guess_params=None, ph_units='eV', category={}, g_func=None, has_tails=False)[source]¶
Do a fit to line_name and return the fitter. You can get the params results with fitter.last_fit_params_dict or any other way you like.
- Args:
- line_name: a string like “MnKAlpha” will get “MnKAlphaFitter”, or you can pass in a fitter such as a mass.GaussianFitter().
- t0, tlast: pulses outside this time range are cut before fitting.
- axis: if axis is None and plot==True, a new figure is created; otherwise the plot is drawn onto this axis.
- dlo, dhi, binsize: by default the fit uses bin edges given by np.arange(fitter.spect.nominal_peak_energy-dlo, fitter.spect.nominal_peak_energy+dhi, binsize).
- bin_edges: pass the bin edges you want as a numpy array.
- attr: default is “p_energy”; you could pick “p_filt_value” or others. Be sure to pass in bin_edges as well, because the default calculation will probably fail for anything other than p_energy.
- label: passed to fitter.plot.
- plot: passed to fitter.fit; determines whether a plot is made.
- guess_params: passed to fitter.fit; fitter.fit will guess the params on its own if this is None.
- ph_units: passed to fitter.fit; used in the plot label.
- category: pass {“side”: “A”} or similar to use categorical cuts.
- g_func: a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer.
- holdvals: a dictionary mapping keys from fitter.params_meaning to values, e.g. {“background”: 0, “dP_dE”: 1}.
this should be the same as ds.linefit, but for now I’ve just copied and pasted the code
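To make the default binning concrete, the np.arange expression quoted above can be evaluated directly. The peak energy used here is a hypothetical round number chosen for illustration, not a value taken from any MASS fitter.

```python
import numpy as np

# Hypothetical nominal peak energy in eV (illustrative round value only).
nominal_peak_energy = 5900.0
dlo, dhi, binsize = 50, 50, 1  # the documented defaults

# Same expression as in the description of dlo, dhi and binsize.
bin_edges = np.arange(nominal_peak_energy - dlo, nominal_peak_energy + dhi, binsize)
```

Because np.arange excludes its stop value, the last edge is one binsize below nominal_peak_energy+dhi. The window is expressed in energy units, which is why an explicit bin_edges array is advisable when attr is something other than p_energy.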
- make_masks(pulse_avg_range=None, pulse_peak_range=None, pulse_rms_range=None, gains=None)[source]¶
Generate a sequence of masks for use in compute_average_pulse().
- Args:
- pulse_avg_range: a 2-sequence giving the (minimum, maximum) p_pulse_average.
- pulse_peak_range: a 2-sequence giving the (minimum, maximum) p_peak_value.
- pulse_rms_range: a 2-sequence giving the (minimum, maximum) p_pulse_rms.
- gains: the set of gains to use, if any.
- Returns:
a list of boolean ndarrays, one per channel. Each vector says whether each pulse in that channel is in the given range of allowed pulse sizes.
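A minimal sketch of range-based masks, one boolean vector per channel. The helper range_mask, the per-channel values, and the choice of strict inequalities at the range boundaries are all assumptions made for illustration; gains are not modeled.

```python
import numpy as np

def range_mask(values, lo_hi):
    """Boolean vector: True where lo < value < hi (boundary handling is assumed)."""
    lo, hi = lo_hi
    values = np.asarray(values)
    return (values > lo) & (values < hi)

# Hypothetical pulse-average values for two channels, keyed by channel number.
pulse_avg = {
    1: np.array([900.0, 1000.0, 1100.0, 5000.0]),
    3: np.array([200.0, 1050.0, 980.0]),
}
pulse_avg_range = (950.0, 1150.0)

# One mask per channel, in channel-number order.
masks = [range_mask(pulse_avg[ch], pulse_avg_range) for ch in sorted(pulse_avg)]
```

Each mask has the same length as its channel's pulse list, so channels with different pulse counts yield vectors of different lengths.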
- property num_good_channels¶
- plot_average_pulses(axis=None, channels=None, cmap=None, legend=True, fcut=None, include_badchan=False)[source]¶
Plot average pulse for channel number <channum> on matplotlib.Axes <axis>, or on a new Axes if <axis> is None. If <channum> is not a valid channel number, then plot all average pulses. If <fcut> is not None, then lowpass filter the traces with this cutoff frequency prior to plotting.
- plot_filters(axis=None, channels=None, cmap=None, filtname='filt_noconst', legend=True)[source]¶
Plot the optimal filters.
- Args:
channels: Sequence of channel numbers to display. If None, then show all.
- plot_histogram_pages(attr, valrange, bins, y_range=None, subplot_shape=(3, 4), suffix=None, lines=None, fileformat='png', one_file=False)[source]¶
Make plots of histograms for all channels.
This creates the plots for each good channel, placing multiple plots on each page, and saves each page to its own file. Only pulses that pass cuts are included. The file names have the form “<attr>-hist-<suffix>-<page number>.png”. The default value for the suffix is that pulsefile’s base name.
- Arguments:
- attr – string containing the name of the attribute to plot.
- valrange – range of values over which to histogram (passed into histogram function).
- bins – number of bins (passed into histogram function).
- y_range – if not None, values to use for y limits. Defaults to None.
- subplot_shape – tuple indicating shape of subplots. First element is number of rows, second is number of columns.
- suffix – suffix to use for filenames. Defaults to None, which causes the function to use the first 15 characters of the pulse filename for the first data set (which typically will have a value like ‘20171017_103454’).
- lines – if not None, must contain a hashtable, keyed off of channel number. The value for each channel is a list of numbers. A dashed horizontal line is plotted for each value in this list. Defaults to None.
- fileformat – output format (‘png’, ‘pdf’, etc). Must be a value supported by your installation of matplotlib.
- one_file – If True, combine all pages to one pdf file. If False, use separate files for all pages. Defaults to False. If format is something other than ‘pdf’, this uses the ImageMagick program convert to combine the files. You can install it on ubuntu via apt-get install imagemagick.
- plot_noise(axis=None, channels=None, cmap=None, scale_factor=1.0, sqrt_psd=False, legend=True, include_badchan=False)[source]¶
Plot the noise power spectra.
- Args:
- channels: Sequence of channels to display. If None, then show all.
- scale_factor: Multiply counts by this number to get physical units.
- sqrt_psd: Whether to show the sqrt(PSD) or (by default) the PSD itself.
- cmap: A matplotlib color map. Defaults to something.
- legend (bool): Whether to plot the legend (default True).
- plot_noise_autocorrelation(axis=None, channels=None, cmap=None, legend=True)[source]¶
Plot the noise autocorrelation functions.
- Args:
channels: Sequence of channel numbers to display. If None, then show all.
- plot_summaries(quantity, valid='uncut', downsample=None, log=False, hist_limits=None, channel_numbers=None, dataset_numbers=None)[source]¶
Plot a summary of one quantity from the data set.
This plot includes time series and histograms of this quantity. This method plots all channels in the group, but only one quantity. If you would rather see all quantities for one channel, then use the group’s group.channel[i].plot_summaries() method.
- Args:
- quantity: A case-insensitive, whitespace-ignored one of the following list, or the number that goes with it:
“Pulse RMS” (0)
“Pulse Avg” (1)
“Peak Value” (2)
“Pretrig RMS” (3)
“Pretrig Mean” (4)
“Max PT Deriv” (5)
“Rise Time” (6)
“Peak Time” (7)
“Peak Index” (8)
- valid: The words ‘uncut’ or ‘cut’, meaning that only uncut or cut data are to be plotted OR None, meaning that all pulses should be plotted.
- downsample (int): To prevent the scatter plots (left panels) from getting too crowded, plot only one out of this many samples. If None, then plot will be downsampled to 10,000 total points.
- log (bool): Use logarithmic y-axis on the histograms (right panels).
- hist_limits: if not None, limit the right-panel histograms to this range.
- channel_numbers: A sequence of channel numbers to plot. If None, then plot all.
- dataset_numbers: A sequence of the datasets [0…n_channels-1] to plot. If None (the default) then plot all datasets in numerical order. But ignored if channel_numbers is not None.
- plot_summary_pages(x_attr, y_attr, x_range=None, y_range=None, subplot_shape=(3, 4), suffix=None, lines=None, down=10, fileformat='png', one_file=False)[source]¶
Make scatter plots of summary quantities for all channels.
This creates the plots for each good channel, placing multiple plots on each page, and saves each page to its own file. Pulses that pass cuts are plotted in blue, and cut pulses are plotted in gray. The file names have the form “<x_attr>.vs.<y-attr>-<suffix>-<page number>.png”. The default value for the suffix is that pulsefile’s base name.
- Arguments:
- x_attr – string containing the name of the X value attribute.
- y_attr – string containing the name of the Y value attribute.
- x_range – if not None, values to use for x limits. Defaults to None.
- y_range – if not None, values to use for y limits. Defaults to None.
- subplot_shape – tuple indicating shape of subplots. First element is number of rows, second is number of columns.
- suffix – suffix to use for filenames. Defaults to None, which causes the function to use the first 15 characters of the pulse filename for the first data set (which typically will have a value like ‘20171017_103454’).
- lines – if not None, must contain a hashtable, keyed off of channel number. The value for each channel is a list of numbers. A dashed horizontal line is plotted for each value in this list. Defaults to None.
- down – downsample by this factor. Defaults to 10.
- fileformat – output format (‘png’, ‘pdf’, etc). Must be a value supported by your installation of matplotlib.
- one_file – If True, combine all pages to one pdf file. If False, use separate files for all pages. Defaults to False. If format is something other than ‘pdf’, this uses the ImageMagick program convert to combine the files. You can install it on ubuntu via apt-get install imagemagick.
- plot_traces(pulsenums, dataset_num=0, channum=None, pulse_summary=True, axis=None, difference=False, residual=False, valid_status=None, shift1=False)[source]¶
Plot some example pulses, given by record number.
- Args:
- <pulsenums> A sequence of record numbers, or a single number.
- <dataset_num> Dataset index (0 to n_dets-1, inclusive). Will be used only if <channum> is invalid.
- <channum> Channel number. If valid, it will be used instead of dataset_num.
- <pulse_summary> Whether to put text about the first few pulses on the plot (default True).
- <axis> A plt axis to plot on (default None, i.e., create a new axis).
- <difference> Whether to show successive differences (that is, d(pulse)/dt) or the raw data (default False).
- <residual> Whether to show the residual between data and opt filtered model, or just raw data (default False).
- <valid_status> If None, plot all pulses in <pulsenums>. If “valid”, omit any from that set that have been cut. If “cut”, show only those that have been cut (default None).
- <shift1> Whether to take pulses with p_shift1==True and delay them by 1 sample (default False, i.e., show the pure raw data w/o shifting).
- pulse_model_to_hdf5(hdf5_file=None, n_basis=6, replace_output=False, maximum_n_pulses=4000, extra_n_basis_5lag=0, noise_weight_basis=True, category=None, f_3db_5lag=None, _rethrow=False)[source]¶
- read_trace(record_num, dataset_num=0, channum=None)[source]¶
Read one trace from cache or disk.
- Args:
- record_num (int): the pulse record number to read.
- dataset_num (int): the dataset number to use.
- channum (int): the channel number to use (if both this and dataset_num are given, use channum in preference).
- Returns:
an ndarray: the pulse numbered <record_num>
- segnum2sample_range(segnum)[source]¶
Return the (first,end) sample numbers of the segment numbered <segnum>. Note that <end> is 1 beyond the last sample number in that segment.
- set_chan_bad(*args)[source]¶
Set one or more channels to be bad.
(No effect for channels already listed as bad.)
- Args:
- *args Arguments to this function are integers or containers of integers. Each
integer is added to the bad-channels list.
- Examples:
data.set_chan_bad(1, “too few good pulses”)
data.set_chan_bad(103, [1, 3, 5], “detector unstable”)
- set_chan_good(*args)[source]¶
Set one or more channels to be good.
(No effect for channels already listed as good.)
- Args:
- *args Arguments to this function are integers or containers of integers. Each
integer is removed from the bad-channels list.
- property shortname¶
Return a string containing part of the filename and the number of good channels
- summarize_data(peak_time_microsec=None, pretrigger_ignore_microsec=None, cut_pre=0, cut_post=0, include_badchan=False, forceNew=False, use_cython=True, doPretrigFit=False)[source]¶
Summarize the data with per-pulse summary quantities for each channel.
peak_time_microsec will be determined automatically if None, and will be stored in channels as ds.peak_samplenumber.
- Args:
use_cython uses a cython (aka faster) implementation of summarize.
- property timestamp_offset¶
- property why_chan_bad¶
MicrocalDataSet¶
- class mass.MicrocalDataSet(pulserec_dict, tes_group=None, hdf5_group=None)[source]¶
Represent a single microcalorimeter’s PROCESSED data.
- HDF5_CHUNK_SIZE = 256¶
- apply_cuts(controls, clear=False, forceNew=True)[source]¶
Apply the cuts.
- Args:
- controls (AnalysisControl): contains the cuts to apply.
- clear (bool): Whether to clear previous cuts first (default False).
- forceNew (bool): whether to recompute if it already exists (default True, per the signature).
- assume_white_noise(noise_variance=1.0, forceNew=False)[source]¶
Set the noise variance to noise_variance and the spectrum to be white.
This is appropriate when no noise files were taken. Though you may set noise_variance to a value other than 1, this will affect only the predicted resolution, and will not change the optimal filters that get computed/used.
- Args:
- noise_variance (number): what to set as the lag-0 noise autocorrelation.
- forceNew (bool): whether to update the noise autocorrelation if it’s already been set (default False).
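For intuition, white noise has an autocorrelation that is nonzero only at lag 0, and the Fourier transform of such a sequence is flat. The sketch below shows this with numpy; the record length is hypothetical and the normalization conventions MASS uses for its spectra are not reproduced.

```python
import numpy as np

n_samples = 8          # hypothetical record length
noise_variance = 1.0   # the lag-0 autocorrelation value

# White noise: all the autocorrelation sits at lag 0.
autocorr = np.zeros(n_samples)
autocorr[0] = noise_variance

# The DFT of a lag-0 spike is constant across frequency (a flat spectrum).
spectrum = np.fft.rfft(autocorr).real
```

Changing noise_variance rescales the whole spectrum by one constant, which is consistent with the statement above that it affects the predicted resolution but not the shape of the optimal filters.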
- auto_cuts(nsigma_pt_rms=8.0, nsigma_max_deriv=8.0, pretrig_rms_percentile=None, forceNew=False, clearCuts=True)[source]¶
Compute and apply an appropriate set of automatically generated cuts.
The peak time and rise time come from the measured most-common peak time. The pulse RMS and postpeak-derivative cuts are based on what’s observed in the (presumably) pulse-free noise file associated with this data file.
- Args:
- nsigma_pt_rms (float): How big an excursion is allowed in pretrig RMS
(default 8.0).
- nsigma_max_deriv (float): How big an excursion is allowed in max
post-peak derivative (default 8.0).
- pretrig_rms_percentile (float): Make upper limit for pretrig_rms at
least as large as this percentile of the data. I.e., if you pass in 99, then the upper limit for pretrig_rms will exclude no more than the 1 % largest values. This number is a percentage, not a fraction. This should not be routinely used - it is intended to help auto_cuts work even if there is a problem during a data acquisition that causes large drifts in noise properties.
- forceNew (bool): Whether to perform auto-cuts even if cuts already exist.
- clearCuts (bool): Whether to clear any existing cuts first (default True).
The two excursion limits are given in units of equivalent sigma from the noise file. “Equivalent” meaning that the noise file was assessed not for RMS but for median absolute deviation, normalized to Gaussian distributions.
- Returns:
The cut object that was applied.
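The “equivalent sigma” described above can be sketched as a median absolute deviation (MAD) scaled by 1.4826, the factor that makes the MAD match the standard deviation for Gaussian data. The function name, the synthetic data, and the way the percentile floor is combined with the limit are assumptions for illustration, not the MASS code.

```python
import numpy as np

def upper_limit_from_noise(noise_values, nsigma):
    """Median plus nsigma equivalent sigmas, with sigma estimated from the MAD."""
    noise_values = np.asarray(noise_values, dtype=float)
    median = np.median(noise_values)
    mad = np.median(np.abs(noise_values - median))
    sigma_equiv = 1.4826 * mad  # MAD normalized to a Gaussian standard deviation
    return median + nsigma * sigma_equiv

rng = np.random.default_rng(0)
# Synthetic noise-file quantity: Gaussian with mean 10 and sigma 2.
noise_rms = rng.normal(loc=10.0, scale=2.0, size=20_000)
limit = upper_limit_from_noise(noise_rms, nsigma=8.0)

# Synthetic pulse data in which 10% of records drifted to much noisier values.
data_rms = np.concatenate([rng.normal(10.0, 2.0, 900), rng.normal(60.0, 5.0, 100)])
pretrig_rms_percentile = 99
# The limit is made at least as large as the requested percentile of the data.
relaxed_limit = max(limit, np.percentile(data_rms, pretrig_rms_percentile))
```

The MAD is used because it is far less sensitive to outliers than a direct RMS estimate.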
- avg_pulses_auto_masks(max_pulses_to_use=7000, subtract_mean=True, forceNew=False)[source]¶
Compute an average pulse.
Compute the average pulse using an automatically generated mask of ±5% around the median pulse_average value.
- Args:
- max_pulses_to_use (int): Use no more than
the first this many good pulses (default 7000).
forceNew (bool): whether to re-compute if results already exist (default False)
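A sketch of such an automatic mask, assuming the median is taken over good pulses only and that the ±5% window is relative to that median. The function name auto_mask and the data are hypothetical.

```python
import numpy as np

def auto_mask(pulse_average, good, max_pulses_to_use=7000, fraction=0.05):
    """Select good pulses within +/- fraction of the median pulse average."""
    pulse_average = np.asarray(pulse_average, dtype=float)
    good = np.asarray(good, dtype=bool)
    median = np.median(pulse_average[good])
    mask = good & (np.abs(pulse_average - median) <= fraction * np.abs(median))
    # Keep only the first max_pulses_to_use selected pulses.
    selected = np.nonzero(mask)[0]
    mask[selected[max_pulses_to_use:]] = False
    return mask

pulse_average = np.array([1000.0, 1020.0, 980.0, 1500.0, 1049.0, 400.0, 1000.0])
good = np.ones(len(pulse_average), dtype=bool)
mask = auto_mask(pulse_average, good, max_pulses_to_use=3)
```

Here five pulses fall inside the window, but only the first three are kept because of the max_pulses_to_use limit.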
- bad(*args, **kwargs)[source]¶
Returns a boolean vector, one per pulse record, saying whether record is bad
- calibrate(attr, line_names, name_ext='', size_related_to_energy_resolution=10, fit_range_ev=200, excl=(), plot_on_fail=False, bin_size_ev=2.0, category={}, forceNew=False, maxacc=0.015, nextra=3, param_adjust_closure=None, curvetype='gain', approximate=False, diagnose=False)[source]¶
- compute_5lag_filter(fmax=None, f_3db=None, cut_pre=0, cut_post=0, category={}, forceNew=False)[source]¶
Requires that compute_noise has been run and that average pulse has been computed
- compute_ats_filter(fmax=None, f_3db=None, transform=None, cut_pre=0, cut_post=0, category={}, shift1=True, forceNew=False, minimum_n_pulses=20, maximum_n_pulses=4000, optimize_dp_dt=True)[source]¶
Compute an arrival-time-safe filter to model the pulse and its time-derivative. Requires that compute_noise has been run.
- Args:
- fmax: if not None, the hard cutoff in frequency space, above which
the DFT of the filter will be set to zero (default None)
- f_3db: if not None, the 3 dB rolloff point in frequency space, above which
the DFT of the filter will be rolled off with a 1-pole filter (default None)
- transform: a callable object that will be called on all data records
before filtering (default None)
- optimize_dp_dt: bool, try a more elaborate approach to dp_dt than just the finite
difference (works well for x-ray, bad for gamma rays)
- cut_pre: Cut this many samples from the start of the filter, giving them 0 weight.
- cut_post: Cut this many samples from the end of the filter, giving them 0 weight.
- shift1: Potentially shift each pulse by one sample based on the ds.shift1 value; the resulting filter is one sample shorter than the pulse records. If you used a zero-threshold trigger (e.g. dastard edgeMulti), you can likely use shift1=False.
- Returns:
the filter (an ndarray)
Modified in April 2017 to make the model for the rising edge and the rest of the pulse differently. For the rising edge, we use entropy minimization to understand the pulse shape dependence on arrival-time. For the rest of the pulse, it is less noisy and in fact more robust to rely on the finite-difference of the pulse average to get the arrival-time dependence.
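The roles of fmax and f_3db can be illustrated by shaping the DFT of an arbitrary filter. This is a sketch under assumptions: shape_filter is a hypothetical helper, and the rolloff uses the textbook one-pole magnitude response 1/sqrt(1 + (f/f_3db)^2), which may differ from the exact transfer function MASS applies.

```python
import numpy as np

def shape_filter(filt, timebase, fmax=None, f_3db=None):
    """Apply a hard cutoff and/or a one-pole rolloff to a filter's DFT."""
    filt = np.asarray(filt, dtype=float)
    spectrum = np.fft.rfft(filt)
    freq = np.fft.rfftfreq(len(filt), d=timebase)
    if fmax is not None:
        # Hard cutoff: zero every DFT bin above fmax.
        spectrum[freq > fmax] = 0.0
    if f_3db is not None:
        # One-pole magnitude response; the gain is 1/sqrt(2) at f = f_3db.
        spectrum = spectrum / np.sqrt(1.0 + (freq / f_3db) ** 2)
    return np.fft.irfft(spectrum, n=len(filt))

# A unit impulse has a flat DFT, so the output spectrum shows the shaping itself.
impulse = np.zeros(8)
impulse[0] = 1.0
shaped = shape_filter(impulse, timebase=1.0, fmax=0.4, f_3db=0.25)
gain = np.abs(np.fft.rfft(shaped))
```

With these hypothetical settings the bin at frequency 0.25 is attenuated by 3 dB and the bin at 0.5 is removed entirely.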
- compute_average_pulse(mask, subtract_mean=True, forceNew=False)[source]¶
Compute the average pulse for this channel.
Store as self.average_pulse
- Args:
- mask – A boolean array saying which records to average.
- subtract_mean – Whether to subtract the pretrigger mean and set the pretrigger period to strictly zero (default True).
- forceNew – Whether to recompute when already exists (default False).
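The averaging step can be sketched in a few lines of numpy. This is an illustration only: average_pulse is a hypothetical stand-in, the records are made up, and caching or forceNew behavior is not modeled.

```python
import numpy as np

def average_pulse(records, mask, n_presamples, subtract_mean=True):
    """Average the masked records; optionally zero the pretrigger baseline."""
    records = np.asarray(records, dtype=float)
    avg = records[np.asarray(mask, dtype=bool)].mean(axis=0)
    if subtract_mean:
        # Subtract the pretrigger mean, then force the pretrigger period to zero.
        avg = avg - avg[:n_presamples].mean()
        avg[:n_presamples] = 0.0
    return avg

# Three hypothetical records of four samples; the third is excluded by the mask.
records = np.array([[10.0, 10.0, 50.0, 30.0],
                    [12.0, 12.0, 56.0, 34.0],
                    [0.0, 0.0, 999.0, 999.0]])
mask = np.array([True, True, False])
avg = average_pulse(records, mask, n_presamples=2)
```

Records excluded by the mask, such as the outlier in the last row, have no influence on the result.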
- compute_noise(max_excursion=1000, forceNew=False)[source]¶
Compute the noise autocorrelation and power spectrum of this channel.
- Args:
- max_excursion (number): the biggest excursion from the median allowed
in each data segment, or else it will be ignored (default 1000).
- n_lags: if not None, the number of lags in each noise spectrum and the max lag
for the autocorrelation. If None, the record length is used (default None).
forceNew (bool): whether to recompute if it already exists (default False).
- compute_noise_nlags(n_lags, max_excursion=1000, plot=False)[source]¶
Compute the noise autocorrelation and power spectrum of this channel using records of length nlags. Treats data in separate noise traces as continuous.
- Args:
- max_excursion (number): the biggest excursion from the median allowed
in each data segment, or else it will be ignored (default 1000).
- n_lags: if not None, the number of lags in each noise spectrum and the max lag
for the autocorrelation. If None, the record length is used (default None).
forceNew (bool): whether to recompute if it already exists (default False).
- compute_noise_spectra(max_excursion=1000, n_lags=None, forceNew=False)[source]¶
Replaced by the equivalent compute_noise(…)
Deprecated since version 0.7.9: Use compute_noise(), which is equivalent but better named
- correct_flux_jumps(flux_quant)[source]¶
Remove ‘flux jumps’ from pretrigger mean.
When using umux readout, if a pulse is recorded that has a very fast rising edge (e.g. a cosmic ray), the readout system will “slip” an integer number of flux quanta. This means that the baseline level returned to after the pulse will differ from the pretrigger value by an integer number of flux quanta. This causes the pretrigger mean summary quantity to jump around in a way that causes trouble for the rest of MASS. This function attempts to correct these jumps.
Arguments: flux_quant – size of 1 flux quantum
- drift_correct(attr='p_filt_value', forceNew=False, category={})[source]¶
Drift correct using the standard entropy-minimizing algorithm
- expected_attributes = ('nSamples', 'nPresamples', 'nPulses', 'timebase', 'channum', 'timestamp_offset')¶
- filter_data(filter_name='filt_noconst', transform=None, forceNew=False, use_cython=None)[source]¶
Filter the complete data file one chunk at a time.
- Args:
- filter_name: the object under self.filter to use for filtering the
data records (default ‘filt_noconst’)
- transform: a callable object that will be called on all data records
before filtering (default None)
forceNew: Whether to recompute when already exists (default False)
- first_n_good_pulses(n=50000, category={})[source]¶
Return the first good pulse records.
- Args:
n: maximum number of good pulses to include (default 50000).
- Returns:
(data, g), where data is an (X, Y) array, X being the number of records and Y the number of samples per record, and g is a 1d array of the pulse record numbers of the pulses in data.
If we did load all of ds.data at once, this would be roughly equivalent to return ds.data[ds.cuts.good()][:n], np.nonzero(ds.cuts.good())[0][:n]
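The equivalent expression quoted above can be tried on a small in-memory array. Here data and good are hypothetical stand-ins for ds.data and ds.cuts.good().

```python
import numpy as np

def first_n_good(data, good, n):
    """Return the first n good records and their record numbers."""
    idx = np.nonzero(good)[0][:n]
    return data[idx], idx

# Five hypothetical records of four samples each.
data = np.arange(20).reshape(5, 4)
good = np.array([False, True, True, False, True])
pulses, g = first_n_good(data, good, n=2)
```

Indexing with the truncated index array avoids selecting every good record before discarding all but the first n.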
- fit_spectral_line(prange, mask=None, times=None, fit_type='dc', line='MnKAlpha', nbins=200, plot=True, **kwargs)[source]¶
- flag_crosstalking_pulses(priorTime, postTime, combineCategories=True, nearestNeighborsDistances=1, crosstalk_key='is_crosstalking', forceNew=False)[source]¶
Uses a list of nearest neighbor channels to flag pulses in the current channel based on the arrival times of pulses in neighboring channels.
- Args:
- priorTime (float): amount of time to check, in ms, before the pulse arrival time.
- postTime (float): amount of time to check, in ms, after the pulse arrival time.
- combineCategories (bool): whether to combine all neighboring channel pulses for flagging crosstalk.
- nearestNeighborsDistances (int or int array): nearest neighbor distances to use for flagging, i.e. 1 = 1st nearest neighbors, 2 = 2nd nearest neighbors, etc.
- forceNew (bool): whether to re-compute the crosstalk cuts (default False).
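One way to picture the time-window test is with np.searchsorted on sorted neighbor arrival times. This is a sketch under assumptions, not the MASS implementation: flag_crosstalk and the arrival times are hypothetical, and the window is taken as closed on both ends.

```python
import numpy as np

def flag_crosstalk(arrival_ms, neighbor_arrival_ms, prior_ms, post_ms):
    """True where a neighbor pulse arrives within [t - prior_ms, t + post_ms]."""
    arrival_ms = np.asarray(arrival_ms, dtype=float)
    neighbor = np.sort(np.asarray(neighbor_arrival_ms, dtype=float))
    # Index of the first neighbor arrival at or after the window start.
    lo = np.searchsorted(neighbor, arrival_ms - prior_ms, side="left")
    # Index just past the last neighbor arrival at or before the window end.
    hi = np.searchsorted(neighbor, arrival_ms + post_ms, side="right")
    # A pulse is flagged if at least one neighbor arrival lies in its window.
    return hi > lo

arrival = np.array([10.0, 50.0, 90.0])
neighbor = np.array([9.5, 52.5, 200.0])
flags = flag_crosstalk(arrival, neighbor, prior_ms=1.0, post_ms=2.0)
```

Only the first pulse is flagged: the neighbor pulse at 52.5 ms arrives 2.5 ms after the second pulse, just outside its 2 ms post-window.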
- get_pulse_model(f, f_5lag, n_basis, pulses_for_svd, extra_n_basis_5lag=0, maximum_n_pulses=4000, noise_weight_basis=True, category={})[source]¶
- good(*args, **kwargs)[source]¶
Returns a boolean vector, one per pulse record, saying whether record is good
- hist(bin_edges, attr='p_energy', t0=0, tlast=1e+20, category={}, g_func=None)[source]¶
Return a tuple of (bin_centers, counts) of p_energy of good pulses (or another attribute).
Automatically filters out NaN values.
Parameters¶
- bin_edges : _type_
edges of the bins used for the histogram
- attr : str, optional
which attribute to histogram, “p_energy” or “p_filt_value”, by default “p_energy”
- t0 : int, optional
cuts all pulses before this time, by default 0
- tlast : _type_, optional
cuts all pulses after this time, by default 1e20
- category : dict, optional
_description_, by default {}
- g_func : _type_, optional
a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer. By default None.
Returns¶
- ndarray, ndarray
Histogram bin centers, counts
- linefit(line_name='MnKAlpha', t0=0, tlast=1e+20, axis=None, dlo=50, dhi=50, binsize=1, bin_edges=None, attr='p_energy', label='full', plot=True, guess_params=None, ph_units='eV', category={}, g_func=None, has_tails=False)[source]¶
Do a fit to line_name and return the fitter. You can get the params results with fitter.last_fit_params_dict or any other way you like.
- Args:
- line_name: a string like “MnKAlpha” will get “MnKAlphaFitter”, or you can pass in a fitter such as a mass.GaussianFitter().
- t0, tlast: pulses outside this time range are cut before fitting.
- axis: if axis is None and plot==True, a new figure is created; otherwise the plot is drawn onto this axis.
- dlo, dhi, binsize: by default the fit uses bin edges given by np.arange(fitter.spect.nominal_peak_energy-dlo, fitter.spect.nominal_peak_energy+dhi, binsize).
- bin_edges: pass the bin edges you want as a numpy array.
- attr: default is “p_energy”; you could pick “p_filt_value” or others. Be sure to pass in bin_edges as well, because the default calculation will probably fail for anything other than p_energy.
- label: passed to fitter.plot.
- plot: passed to fitter.fit; determines whether a plot is made.
- guess_params: passed to fitter.fit; fitter.fit will guess the params on its own if this is None.
- ph_units: passed to fitter.fit; used in the plot label.
- category: pass {“side”: “A”} or similar to use categorical cuts.
- g_func: a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer.
- holdvals: a dictionary mapping keys from fitter.params_meaning to values, e.g. {“background”: 0, “dP_dE”: 1}.
- property p_peak_time¶
- phase_correct(attr='p_filt_value_dc', forceNew=False, category={}, ph_peaks=None, method2017=True, kernel_width=None, save_to_hdf5=True)[source]¶
Apply the 2017 or 2015 phase correction method.
- Args:
forceNew (bool): whether to recompute if it already exists (default False). category (dict): if not None, then a dict giving a category name and the
required category label.
ph_peaks: Peaks to use for alignment. If None, then use _find_peaks_heuristic() kernel_width: Width (in PH units) of the kernel-smearing function. If None, use a heuristic.
- phase_correct2014(typical_resolution, maximum_num_records=50000, plot=False, forceNew=False, category={})[source]¶
Apply the phase correction that worked for calibronium-like data as of June 2014.
For more notes, do help(mass.core.analysis_algorithms.FilterTimeCorrection)
- Args:
- typical_resolution (number): should be an estimated energy resolution in UNITS OF
self.p_pulse_rms. This helps the peak-finding (clustering) algorithm decide which pulses go together into a single peak. Be careful to use a semi-reasonable quantity here.
- maximum_num_records (int): don’t use more than this many records to learn
the correction (default 50000).
plot (bool): whether to make a relevant plot forceNew (bool): whether to recompute if it already exists (default False). category (dict): if not None, then a dict giving a category name and the
required category label.
- property pkl_fname¶
- plot_hist(bin_edges, attr='p_energy', axis=None, label_lines=[], category={}, g_func=None)[source]¶
Plot a coadded histogram from all good datasets and all good pulses.
- Args:
- bin_edges: edges of the bins used for the histogram.
- attr: which attribute to histogram, “p_energy” or “p_filt_value”.
- axis: if None, then create a new figure; otherwise plot onto this axis.
- annotate_lines: enter line names in STANDARD_FEATURES to add to the plot; calls annotate_lines.
- g_func: a function taking a MicrocalDataSet and returning a vector like the one ds.good() would return. This vector is ANDed with the vector calculated by the histogrammer.
- plot_summaries(valid='uncut', downsample=None, log=False)[source]¶
Plot a summary of the data set, including time series and histograms of key pulse properties.
- Args:
- valid: An array of booleans self.nPulses long saying which pulses are to be plotted
OR ‘uncut’ or ‘cut’, meaning that only uncut or cut data are to be plotted OR None or ‘all’, meaning that all pulses should be plotted.
- downsample: To prevent the scatter plots (left panels) from getting too crowded,
plot only one out of this many samples. If None, then plot will be downsampled to 10,000 total points (default None).
log (bool): Use logarithmic y-axis on the histograms (right panels). (Default False)
- plot_traces(pulsenums, pulse_summary=True, axis=None, difference=False, residual=False, valid_status=None, shift1=False, subtract_baseline=False, fcut=None)[source]¶
Plot some example pulses, given by sample number.
- Args:
- <pulsenums> A sequence of sample numbers, or a single one.
- <pulse_summary> Whether to put text about the first few pulses on the plot.
- <axis> A plt axis to plot on.
- <difference> Whether to show successive differences (that is, d(pulse)/dt) or the raw data.
- <residual> Whether to show the residual between data and opt filtered model, or just raw data.
- <valid_status> If None, plot all pulses in <pulsenums>. If “valid”, omit any from that set that have been cut. If “cut”, show only those that have been cut.
- <shift1> Whether to take pulses with p_shift1==True and delay them by 1 sample.
- <subtract_baseline> Whether to subtract the pretrigger mean prior to plotting the pulse.
- <fcut> If not None, apply a lowpass filter with this cutoff frequency prior to plotting.
- property rowcount¶
Deprecated since version 0.8.2: Use subframecount, which is equivalent but better named
- set_nearest_neighbors_list(mapFilename, nearestNeighborCategory='physical', distanceType='cartesian', forceNew=False)[source]¶
Finds the nearest neighbors in a given space for all channels in a data set
- Args:
- mapFilename (str): Location of map file in the following format: column 0 is the list of channel numbers; the remaining column(s) are coordinates that define a particular column in a given space. For example, these can be the row and column number in a physical space or the frequency order number in a frequency space (umux readout).
- nearestNeighborCategory (str): Name used to categorize the type of nearest neighbor. This will be the name given to the subgroup of the hdf5 file under the nearest_neighbor group. This will also be a key for the dictionary nearest_neighbors_dictionary.
- distanceType (str): Type of distance to measure between nearest neighbors, i.e. cartesian.
- forceNew (bool): Whether to re-compute the nearest neighbors list if it exists (default False).
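A minimal sketch of the idea, assuming a whitespace-separated map and reading "cartesian" as ordinary Euclidean distance. The functions parse_map and nearest_neighbors, the n_neighbors parameter, and the toy map are hypothetical and are not part of MASS.

```python
import math

# Toy map: column 0 is the channel number, the rest are coordinates.
MAP_TEXT = """\
1 0 0
3 0 1
5 1 0
7 1 1
9 5 5
"""


def parse_map(text):
    """Return {channel: (coord, ...)} from whitespace-separated text."""
    positions = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        positions[int(fields[0])] = tuple(float(f) for f in fields[1:])
    return positions


def nearest_neighbors(positions, n_neighbors=2):
    """Map each channel to its n closest channels by Euclidean distance."""
    result = {}
    for chan, pos in positions.items():
        others = [(math.dist(pos, p), c) for c, p in positions.items() if c != chan]
        others.sort()  # ties in distance are broken by channel number
        result[chan] = [c for _, c in others[:n_neighbors]]
    return result


neighbors = nearest_neighbors(parse_map(MAP_TEXT))
```

A frequency-space map would have a single coordinate column, and the same code applies unchanged.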
- property shortname¶
Return a string containing part of the filename and the channel number, useful for labelling plots.
- smart_cuts(threshold=10.0, n_trainings=10000, forceNew=False)[source]¶
Young! Why is there no doc string here??
- property subframes_after_last_external_trigger¶
- property subframes_from_nearest_external_trigger¶
- property subframes_until_next_external_trigger¶
- summarize_data(peak_time_microsec=None, pretrigger_ignore_microsec=None, cut_pre=0, cut_post=0, forceNew=False, use_cython=True, doPretrigFit=False)[source]¶
Summarize the complete data set one chunk at a time.
Store results in the HDF5 datasets p_pretrig_mean and similar.
- Args:
- peak_time_microsec: the time in microseconds at which this channel’s pulses typically peak (default None). You should leave this as None, and let the value be estimated from the data.
- pretrigger_ignore_microsec: how much time before the trigger to ignore when computing pretrigger mean (default None). If None, it will be chosen sensibly.
- cut_pre: cut this many samples from the start of a pulse record when calculating summary values.
- cut_post: cut this many samples from the end of a pulse record when calculating summary values.
- forceNew: whether to re-compute summaries if they exist (default False).
- use_cython: whether to use cython for summarizing the data (default True).
- doPretrigFit: whether to do a linear fit of the pretrigger data.
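To show why a few samples just before the trigger might be ignored, here is a toy pretrigger-mean calculation. The function pretrigger_mean and its sample-count arguments are hypothetical. Converting pretrigger_ignore_microsec to a number of samples is assumed to happen elsewhere, using the sample period.

```python
def pretrigger_mean(record, n_presamples, pretrigger_ignore_samples=0, cut_pre=0):
    """Mean of the pretrigger region of one pulse record.

    Skips `cut_pre` samples at the start of the record and ignores the last
    `pretrigger_ignore_samples` samples before the trigger.
    """
    stop = n_presamples - pretrigger_ignore_samples
    window = record[cut_pre:stop]
    if not window:
        raise ValueError("no pretrigger samples left after cuts")
    return sum(window) / len(window)


# Toy record: flat baseline of 100, but the pulse starts rising 2 samples
# before the nominal trigger position.
record = [100.0] * 8 + [130.0, 180.0] + [900.0, 800.0, 700.0, 600.0]
n_presamples = 10
naive = pretrigger_mean(record, n_presamples)
cleaned = pretrigger_mean(record, n_presamples, pretrigger_ignore_samples=2)
```

Here the naive mean is biased upward by the early rise, while the cleaned value recovers the flat baseline.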
- time_drift_correct(attr='p_filt_value_phc', sec_per_degree=2000, pulses_per_degree=2000, max_degrees=20, forceNew=False, category={})[source]¶
Drift correct over long times with an entropy-minimizing algorithm. Here we correct as a low-ish-order Legendre polynomial in time.
- attr: the attribute of self that is to be corrected. (The result will be stored in self.p_filt_value_tdc[:]).
- sec_per_degree: assign as many as one polynomial degree per this many seconds.
- pulses_per_degree: assign as many as one polynomial degree per this many pulses.
- max_degrees: never use more than this many degrees of Legendre polynomial.
- forceNew: whether to do this step, if it appears already to have been done.
- category: choices for categorical cuts.
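One plausible reading of the three limits above is that the polynomial degree is the smallest of them. The sketch below encodes that reading. The function name, the truncation to an integer, and the floor of one degree are assumptions, not confirmed against the MASS source.

```python
def choose_legendre_degree(duration_sec, n_pulses, sec_per_degree=2000,
                           pulses_per_degree=2000, max_degrees=20):
    """Pick a polynomial degree limited by run length, pulse count, and a hard cap."""
    by_time = int(duration_sec / sec_per_degree)
    by_count = int(n_pulses / pulses_per_degree)
    # Assumption: always allow at least one degree.
    return max(1, min(by_time, by_count, max_degrees))


long_busy_run = choose_legendre_degree(duration_sec=36_000, n_pulses=500_000)
very_long_run = choose_legendre_degree(duration_sec=100_000, n_pulses=500_000)
sparse_run = choose_legendre_degree(duration_sec=36_000, n_pulses=5_000)
```

A ten-hour run with many pulses is limited by time, a much longer run hits max_degrees, and a sparse run is limited by its pulse count.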
SpectralLine (base class of line models)¶
- class mass.SpectralLine(element, material, linetype, energies, lorentzian_fwhm, intrinsic_sigma, reference_plot_instrument_gaussian_fwhm, reference_short, reference_amplitude, reference_amplitude_type, normalized_lorentzian_integral_intensity, nominal_peak_energy, fitter_type, position_uncertainty, reference_measurement_type, is_default_material)[source]¶
An abstract base class for modeling spectral lines as a sum of Voigt profiles (i.e., Gaussian-convolved Lorentzians).
Call addline to create a new subclass properly.
The API follows scipy.stats.rv_continuous and is kind of like rv_frozen. Calling this object with an argument evaluates the pdf at the argument; it does not return an rv_frozen.
But so far we only define rvs and pdf.
- components(x, instrument_gaussian_fwhm)[source]¶
List of spectrum components as a function of <x>, the energy in eV
- minimum_fwhm(instrument_gaussian_fwhm)[source]¶
For the narrowest Lorentzian in the line model, calculate the combined FWHM including the Lorentzian, intrinsic_sigma, and instrument_gaussian_fwhm.
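One common way to combine these widths is sketched below. It adds the two Gaussian contributions in quadrature and then applies the Olivero-Longbothum approximation for the FWHM of a Voigt profile. Whether minimum_fwhm uses exactly this formula is an assumption, and approx_voigt_fwhm is a hypothetical name.

```python
import math

# Conversion from Gaussian sigma to FWHM, about 2.3548.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def approx_voigt_fwhm(lorentzian_fwhm, intrinsic_sigma, instrument_gaussian_fwhm):
    """Approximate FWHM of a Lorentzian convolved with two Gaussians."""
    # Independent Gaussian broadenings add in quadrature.
    gaussian_fwhm = math.hypot(intrinsic_sigma * FWHM_PER_SIGMA,
                               instrument_gaussian_fwhm)
    # Olivero-Longbothum approximation to the Voigt FWHM.
    return (0.5346 * lorentzian_fwhm
            + math.sqrt(0.2166 * lorentzian_fwhm**2 + gaussian_fwhm**2))


# Toy widths in eV, for illustration only.
width = approx_voigt_fwhm(lorentzian_fwhm=2.5, intrinsic_sigma=0.0,
                          instrument_gaussian_fwhm=4.0)
```

The approximation reduces to the Gaussian FWHM when the Lorentzian width is zero, and to (very nearly) the Lorentzian FWHM when there is no Gaussian broadening.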
- model(has_linear_background=True, has_tails=False, prefix='', qemodel=None)[source]¶
Generate a LineModel instance from a SpectralLine
- pdf(x, instrument_gaussian_fwhm)[source]¶
Spectrum (units of fraction per eV) as a function of <x>, the energy in eV
- property peak_energy¶
- plot(x=None, instrument_gaussian_fwhm=0, axis=None, components=True, label=None, setylim=True)[source]¶
Plot the spectrum.
x - np array of energy in eV to plot at (sensible default)
axis - axis to plot on (default creates new figure)
components - True plots each voigt component in addition to the spectrum
label - a string to label the plot with (optional)
- classmethod quick_monochromatic_line(name, energy, lorentzian_fwhm, intrinsic_sigma)[source]¶
Create a quick monochromatic line. Intended for use in calibration when we know a line energy, but not a lineshape model. Returns an instance of SpectralLine with most fields having contents like “unknown: quick_line”. The line will have a single Lorentzian element with the given energy, fwhm, and intrinsic_sigma values.
- property reference¶
- rvs(size, instrument_gaussian_fwhm, rng=None)[source]¶
The CDF and PPF (cumulative distribution and percentile point functions) are hard to compute. But it’s easy enough to generate the random variates themselves, so we override that method.
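The reason variates are easy to draw is that a Voigt-distributed value is the sum of a Lorentzian (Cauchy) variate and an independent Gaussian variate. The sketch below uses only the standard library. The function sample_line, its argument layout, and the toy line parameters are hypothetical, and intrinsic_sigma is left out for brevity.

```python
import math
import random

FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def sample_line(size, energies, lorentzian_fwhms, weights,
                instrument_gaussian_fwhm, rng=None):
    """Draw `size` energies from a weighted sum of Voigt components."""
    rng = rng or random.Random()
    sigma = instrument_gaussian_fwhm / FWHM_PER_SIGMA
    samples = []
    for _ in range(size):
        # Pick one component with probability proportional to its weight.
        (i,) = rng.choices(range(len(energies)), weights=weights)
        # Cauchy variate by inverse transform; the half-width is fwhm / 2.
        lorentz = energies[i] + 0.5 * lorentzian_fwhms[i] * math.tan(
            math.pi * (rng.random() - 0.5))
        # Gaussian broadening is an independent additive term.
        samples.append(lorentz + rng.gauss(0.0, sigma))
    return samples


# Toy two-component line (illustrative numbers, not reference data).
draws = sample_line(1000, energies=[5898.8, 5887.6], lorentzian_fwhms=[2.0, 2.5],
                    weights=[2.0, 1.0], instrument_gaussian_fwhm=3.0,
                    rng=random.Random(0))
```

No CDF or PPF is needed at any point, which is the advantage the docstring describes.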
- property shortname¶
EnergyCalibration¶
- class mass.EnergyCalibration(nonlinearity=1.1, curvetype='loglog', approximate=False, useGPR=True)[source]¶
Object to store information relevant to one detector’s absolute energy calibration and to offer conversions between pulse height and energy.
The behavior is governed by the constructor arguments curvetype and approximate and by the number of data points. The construction-time arguments can be changed by calling EnergyCalibration.set_use_approximation() and EnergyCalibration.set_curvetype().
- curvetype – Either a code number in the range [0,len(self.CURVETYPE)) or a
string from the tuple self.CURVETYPE.
- approximate – Whether to construct a smoothing spline (minimal curvature
subject to a condition that chi-squared not be too large). If not, curve will be an exact spline in E vs PH, in log(E) vs log(PH), or as appropriate to the curvetype.
The forward conversion from PH to E uses the callable __call__ method or its synonym, the method ph2energy.
The inverse conversion method energy2ph uses a spline approximation to the inverse. The method energy2ph_exact instead calls Brent’s method of root-finding; it’s probably quite slow compared to self.ph2energy for an array of equal length.
All of __call__, ph2energy, and energy2ph should return a scalar when given a scalar input, or a matching numpy array when given any sequence as an input.
- CURVETYPE = ('loglog', 'linear', 'linear+0', 'gain', 'invgain', 'loggain')¶
- add_cal_point(pht, energy, name='', pht_error=None, e_error=None, overwrite=True)[source]¶
Add a single energy calibration point <pht>, <energy>.
<pht> must be in units of the self.ph_field and <energy> is in eV. <pht_error> is the 1-sigma uncertainty on the pulse height. If None (the default), then assign pht_error = <pht>/1000. <e_error> is the 1-sigma uncertainty on the energy itself. If None (the default), then assign e_error=<energy>/10^5 (typically 0.05 eV).
Also, you can call it with <energy> as a string, provided it’s the name of a known feature appearing in the dictionary mass.energy_calibration.STANDARD_FEATURES. Thus the following are equivalent:
cal.add_cal_point(12345.6, 5898.801, "Mn Ka1")
cal.add_cal_point(12345.6, "Mn Ka1")
Careful! If you give a name that’s already in the list, then this value replaces the previous one. If you do NOT give a name, though, then this will NOT replace but will add to any existing points at the same energy. You can prevent overwriting by setting <overwrite>=False.
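The default uncertainties and the replace-by-name rule can be mimicked with a plain list of dictionaries. This is a sketch of the semantics described above, not the EnergyCalibration implementation. The helper names are hypothetical, and raising ValueError when overwrite=False is an assumption, since the text does not say what happens in that case.

```python
def default_errors(pht, energy, pht_error=None, e_error=None):
    """Fill in the documented defaults: pht/1000 and energy/10^5."""
    if pht_error is None:
        pht_error = pht / 1000.0
    if e_error is None:
        e_error = energy / 1e5
    return pht_error, e_error


def add_point(points, pht, energy, name="", overwrite=True):
    """Named points replace same-named ones; unnamed points always append."""
    pht_error, e_error = default_errors(pht, energy)
    new = {"name": name, "pht": pht, "energy": energy,
           "pht_error": pht_error, "e_error": e_error}
    if name:
        for i, point in enumerate(points):
            if point["name"] == name:
                if not overwrite:
                    # Assumed behavior; the real method may differ.
                    raise ValueError(f"calibration point {name!r} already exists")
                points[i] = new
                return points
    points.append(new)
    return points


points = []
add_point(points, 12345.6, 5898.801, "Mn Ka1")
add_point(points, 12400.0, 5898.801, "Mn Ka1")  # replaces the first point
add_point(points, 12345.6, 5898.801)            # unnamed: appended
add_point(points, 12345.6, 5898.801)            # unnamed: appended again
```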
- property cal_point_energies¶
- property cal_point_names¶
- property cal_point_phs¶
- drop_one_errors()[source]¶
For each calibration point, calculate the difference between the ‘correct’ energy and the energy predicted by a calibration created without that point, using ph2energy to calculate the predicted energy. Returns energies, drop_one_energy_diff.
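The leave-one-out idea can be sketched with a much simpler curve than the real calibration. Below, piecewise-linear interpolation of log(E) vs log(PH) stands in for the spline, and the sign convention (predicted minus correct) is an assumption. All names and numbers are hypothetical.

```python
import math


def loglog_interp(ph, cal_ph, cal_e):
    """Piecewise-linear interpolation (with end-segment extrapolation)
    of log(E) vs log(PH). cal_ph must be sorted and have >= 2 points."""
    xs = [math.log(p) for p in cal_ph]
    ys = [math.log(e) for e in cal_e]
    x = math.log(ph)
    # Choose the segment containing x, or the nearest end segment.
    j = 1
    while j < len(xs) - 1 and x > xs[j]:
        j += 1
    slope = (ys[j] - ys[j - 1]) / (xs[j] - xs[j - 1])
    return math.exp(ys[j - 1] + slope * (x - xs[j - 1]))


def drop_one_errors(cal_ph, cal_e):
    """For each point, predict its energy from all the other points."""
    diffs = []
    for i in range(len(cal_ph)):
        ph_rest = cal_ph[:i] + cal_ph[i + 1:]
        e_rest = cal_e[:i] + cal_e[i + 1:]
        predicted = loglog_interp(cal_ph[i], ph_rest, e_rest)
        diffs.append(predicted - cal_e[i])
    return cal_e, diffs


# Toy calibration points lying on a power law, with one point nudged by 5 eV.
cal_ph = [1000.0, 2000.0, 4000.0, 8000.0]
cal_e = [0.5 * p**1.1 for p in cal_ph]
cal_e[2] += 5.0
energies, drop_one_energy_diff = drop_one_errors(cal_ph, cal_e)
```

The nudged point stands out because the calibration built from the remaining points predicts an energy about 5 eV lower than the value assigned to it.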
- energy2ph(energy)[source]¶
Convert energy (or array of energies) energy to pulse height in arbs.
Should return a scalar if passed a scalar, and a numpy array if passed a list or array. Uses a spline with steps no greater than ~1% in pulse height space. For a Brent’s method root finding (i.e., an actual inversion of the ph->energy function), use method energy2ph_exact.
- energy2ph_exact(energy)[source]¶
Convert energy (or array of energies) energy to pulse height in arbs.
Inverts the _ph2e function by Brent’s method for root finding. Can be fragile! Use method energy2ph for less precise but more generally error-free computation. Should return a scalar if passed a scalar, and a numpy array if passed a list or array.
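Inverting by root finding means solving ph2energy(ph) - energy = 0 for ph. The sketch below uses plain bisection as a dependency-free stand-in for Brent's method, together with a toy monotonic calibration curve. Both functions are hypothetical and only illustrate the approach.

```python
def ph2energy(ph):
    """Toy monotonic calibration curve (hypothetical), energy in eV."""
    return 0.5 * ph**1.1


def energy2ph_bisect(energy, ph_lo=1e-3, ph_hi=1e6, tol=1e-9, max_iter=200):
    """Solve ph2energy(ph) == energy by bisection on [ph_lo, ph_hi]."""
    f_lo = ph2energy(ph_lo) - energy
    f_hi = ph2energy(ph_hi) - energy
    if f_lo * f_hi > 0:
        # The root must be bracketed; this is one way such methods are fragile.
        raise ValueError("energy is not bracketed by [ph_lo, ph_hi]")
    for _ in range(max_iter):
        mid = 0.5 * (ph_lo + ph_hi)
        f_mid = ph2energy(mid) - energy
        if f_lo * f_mid <= 0:
            ph_hi = mid
        else:
            ph_lo, f_lo = mid, f_mid
        if ph_hi - ph_lo < tol * max(1.0, abs(mid)):
            break
    return 0.5 * (ph_lo + ph_hi)


ph_recovered = energy2ph_bisect(ph2energy(12345.6))
```

Each inversion costs many evaluations of the forward function, which is why a precomputed spline inverse is the faster choice for large arrays.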
- name2ph(name)[source]¶
Convert a named energy feature to pulse height. name need not be a calibration point.
- ph2energy(pulse_ht, der=0)[source]¶
Convert pulse height (or array of pulse heights) <pulse_ht> to energy (in eV). Should return a scalar if passed a scalar, and a numpy array if passed a list or array.
- Args:
- pulse_ht (float or np.array(dtype=float)): pulse heights in an arbitrary unit.
- der (int): the order of derivative. der should be >= 0.
- plot(axis=None, color='blue', markercolor='red', plottype='linear', ph_rescale_power=0.0, removeslope=False, energy_x=False, showtext=True, showerrors=True, min_energy=None, max_energy=None)[source]¶
- remove_cal_point_name(name)[source]¶
If you don’t like calibration point named <name>, this removes it.
- remove_cal_point_prefix(prefix)[source]¶
This removes all cal points whose name starts with <prefix>. Return number removed.
EnergyCalibrationAutocal¶
- class mass.EnergyCalibrationAutocal(calibration, ph=None, line_names=None)[source]¶
- property anyfailed¶
- autocal(smoothing_res_ph=20, fit_range_ev=200.0, binsize_ev=1.0, nextra=2, nincrement=3, nextramax=8, maxacc=0.015)[source]¶
- fit_lines()[source]¶
All calibration emission lines are fitted with an appropriate model. self.line_names will be sorted by energy after this method is finished.
- guess_fit_params(smoothing_res_ph=20, fit_range_ev=200.0, binsize_ev=1.0, nextra=2, nincrement=3, nextramax=8, maxacc=0.015)[source]¶
Calculate reasonable parameters for complex models or Gaussian models.
- Args:
- binsize_ev (float or list[float]): bin sizes of the histograms of given calibration lines.
If a single number is given, this same number will be used for all calibration lines.