Executable and modules¶
bin.OriginFinder
¶
- Script name:
OriginFinder.py
- Usage:
python OriginFinder.py --configure /path/to/configuration.ini --override section0:option0:value0 section1:option1
where section0 is a section, option0 is one of its options, and value0 is the value you choose. If an option is boolean, omit the value, as in section1:option1.
- Online mode
- Description:
Online mode automatically builds a list of glitches using an external event generator; the current version supports Omicron and PyCBC Live.
- Processes:
It automatically queries a list of glitches generated by an external event trigger generator (currently Omicron or PyCBC Live triggers).
It selects the glitches that do not overlap with any other glitches.
It conditions auxiliary channels and quantifies them with an “importance” value: the fraction of frequency bins of the on-source window above an upper threshold of the off-source window in a given frequency band.
It produces a plot of the importance values for the auxiliary channels of a glitch as soon as they are computed.
It creates summary plots: glitch indices vs. channels, SNR of h(t) vs. channels, time vs. channels, and the averaged importance values vs. channels.
It clusters glitches based on PCA and a Gaussian mixture model, plots the result, and reports the rate of each cluster.
Note that importance values are saved as .csv files.
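The “importance” quantity described above can be sketched as follows. The helper name, the mean-plus-sigma upper bound, and the toy data are illustrative assumptions, not the package’s actual implementation:

```python
import numpy as np

def importance(on_fft, off_fft, sigma=3):
    """Fraction of on-source frequency bins whose amplitude exceeds an
    upper bound of the off-source window (here mean + sigma * std).
    Illustrative helper, not the package's actual API."""
    off_amp = np.abs(off_fft)
    upper = off_amp.mean() + sigma * off_amp.std()
    return np.mean(np.abs(on_fft) > upper)

# toy data: a quiet background and an on-source spectrum with excess power
rng = np.random.default_rng(0)
off = rng.normal(1.0, 0.1, 1024)
on = rng.normal(1.0, 0.1, 1024)
on[100:200] += 5.0                     # loud bins coupled to the glitch
print(importance(on, off))             # roughly 100/1024 ≈ 0.1
```

A channel whose spectrum rises well above its own off-source statistics during the glitch thus gets an importance close to the fraction of excited bins.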
- Offline mode
- Description:
Offline mode is similar to online mode; the main difference is that the list of glitches is provided by the user.
For instance, a user can supply a .csv file generated by Gravity Spy (or potentially cWB or other pipelines). This mode is useful when external event trigger generators cannot catch the glitches of interest.
- Processes:
Offline mode operates on a list of glitches already supplied by the user, e.g., a Gravity Spy .csv file.
It selects the glitches that do not overlap with any other glitches.
It conditions auxiliary channels and quantifies them with an “importance” value: the fraction of frequency bins of the on-source window above an upper threshold of the off-source window in a given frequency band.
It creates summary plots: glitch indices vs. channels, SNR of h(t) vs. channels, time vs. channels, and the averaged importance values vs. channels.
It clusters glitches based on PCA and a Gaussian mixture model, plots the result, and reports the rate of each cluster.
It produces a plot of the importance values for the auxiliary channels of individual glitches.
Note that importance values are saved as .csv files.
- Null sample generator
- Description:
In order to perform the statistical tests (see the following sections), the null sample generator builds a null-hypothesis set of samples drawn from quiet times.
Note that a large number of null samples is preferred.
- Processes:
It creates randomly distributed synthetic event time periods whose duration distribution follows that of the target glitches.
It keeps only those synthetic time periods (null samples) that do not overlap with anything else, including all the actual glitches and the other null samples.
It analyzes the null samples in the same way as online or offline mode.
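The null-sample generation above can be sketched as a rejection loop: draw a duration from the empirical target distribution, place it at a random time in the epoch, and reject it if it overlaps anything already known. All names and the overlap rule are illustrative assumptions:

```python
import numpy as np

def make_null_samples(target_durations, epoch_start, epoch_end,
                      busy_segments, n_samples, seed=0):
    """Draw synthetic quiet-time segments whose durations follow the
    empirical distribution of the target glitches, rejecting any that
    overlap known glitch segments or earlier null samples. Sketch only."""
    rng = np.random.default_rng(seed)
    accepted = []
    while len(accepted) < n_samples:
        dur = rng.choice(target_durations)            # empirical duration
        t0 = rng.uniform(epoch_start, epoch_end - dur)
        seg = (t0, t0 + dur)
        overlaps = any(seg[0] < b[1] and b[0] < seg[1]
                       for b in busy_segments + accepted)
        if not overlaps:
            accepted.append(seg)
    return accepted

samples = make_null_samples([0.5, 1.0, 2.0], 0.0, 1000.0,
                            [(100.0, 101.0), (500.0, 502.0)], 5)
print(samples)
```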
- Statistics mode
- Description:
It identifies channels responsible for the target glitches using various statistical methods together with the null samples.
Thanks to the null samples, it can find channels whose coupling is not obvious by eye.
The user supplies a confidence level used to decide which channels reject the null hypothesis.
Note that each statistical test is run for each channel (in each frequency band) independently.
- Processes:
It performs a one-sided binomial test for the target glitches as a whole. It also calculates the “witness ratio statistic” (WRS): the fraction of target samples with importance above a threshold, divided by the sum of the fractions of target samples and null samples above that threshold. The threshold is set to the mean importance of the null samples, based on the experiments conducted so far. Strictly, the threshold should be derived from a separate set of null samples to obtain unbiased values; however, if the number of null samples is sufficiently large, both sets are expected to converge to the same value. Although the threshold is therefore not truly independent of the null samples used in the statistical test, WRS approximates the probability that a channel glitches in coincidence with the target glitches in h(t).
It performs a one-sided Welch’s t-test on the target glitches as a whole, makes a plot, and saves the table as a .csv file.
It makes a plot combining WRS and the t-value to show the channels that reject both the binomial test and the t-test.
It incorporates the binomial test and t-test into the clustering to reduce the number of redundant sub-classes.
It analyzes each individual glitch by means of a chi-square test and makes the corresponding plots.
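The WRS and one-sided binomial test described above can be sketched as follows. All helper names are hypothetical, and the exact acceptance rule may differ from the package’s:

```python
from math import comb
import numpy as np

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): one-sided binomial p-value."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def wrs_and_binomial(target_imp, null_imp, confidence=0.95):
    """Witness ratio statistic and one-sided binomial test as described
    above; hypothetical helper, not the package API."""
    target_imp, null_imp = np.asarray(target_imp), np.asarray(null_imp)
    thr = null_imp.mean()                 # threshold: mean null importance
    f_t = np.mean(target_imp > thr)       # target fraction above threshold
    f_n = np.mean(null_imp > thr)         # null fraction above threshold
    wrs = f_t / (f_t + f_n) if (f_t + f_n) > 0 else 0.0
    k, n = int(np.sum(target_imp > thr)), len(target_imp)
    p_value = binom_sf(k, n, float(f_n))  # null rate as success probability
    return wrs, p_value, p_value < (1 - confidence)

null_imp = [0.0, 0.1] * 25                # threshold = 0.05, null rate 0.5
target_imp = [0.3] * 20                   # every target sample above it
wrs, p, passed = wrs_and_binomial(target_imp, null_imp)
print(wrs, p, passed)                     # ≈0.667, ≈9.5e-07, True
```

A channel with WRS near 1 and a tiny binomial p-value is excited far more often during target glitches than during quiet times.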
- WitnessFlag mode
- Description:
The methodology consists of two major parts: 1) finding witness channels and 2) finding flags with the chosen witness channels. The witness channels are determined by statistical tests, which automatically reduce the list of channels and stop the analysis. Flags can be determined from multiple channels. The whole process uses null samples, i.e., samples taken at quiet times.
- Processes:
- Finding witness channels:
(1) shuffle a list of glitches
(2) take a few glitches and analyze them with all the safe channels
(3) perform the one-sided binomial test and one-sided Welch’s t-test against the null samples
(4) reject the channels which do NOT pass both tests, i.e., which cannot reject the hypothesis that a channel is consistent with the null samples
(5) calculate the error ratio of the t-value of the top-ranking channel relative to its previous t-value
(6) analyze the next glitch using the channels that pass both tests
(7) add the values of importance to the target samples
(8) repeat steps (3)-(7)
(9) terminate the process when the error ratio reaches the tolerance
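The iterative reduction loop above can be sketched as follows; `analyze`, `passes_tests`, and `top_statistic` are hypothetical stand-ins for the importance computation and the statistical tests, and the toy stand-ins below are purely illustrative:

```python
import random

def find_witness_channels(glitches, channels, analyze, passes_tests,
                          top_statistic, tol=0.05):
    """Sketch of the iterative witness-channel search: analyze glitches
    one at a time, drop channels that fail the tests, and stop once the
    top-ranking statistic stabilizes (error ratio below tol)."""
    random.shuffle(glitches)                        # shuffle the glitch list
    samples = {ch: [] for ch in channels}
    surviving = list(channels)
    t_prev = None
    for g in glitches:
        for ch in surviving:
            samples[ch].append(analyze(g, ch))      # values of importance
        surviving = [ch for ch in surviving
                     if passes_tests(samples[ch])]  # binomial + t-test
        t_now = top_statistic(samples, surviving)
        if t_prev is not None and abs(t_now - t_prev) / abs(t_prev) < tol:
            break                                   # error ratio within tol
        t_prev = t_now
    return surviving

# toy stand-ins: one channel always witnesses the glitch, one never does
witnesses = find_witness_channels(
    glitches=[1.0] * 10,
    channels=['witness', 'quiet'],
    analyze=lambda g, ch: g if ch == 'witness' else 0.0,
    passes_tests=lambda vals: sum(vals) / len(vals) > 0.5,
    top_statistic=lambda s, surv: max(sum(s[ch]) / len(s[ch]) for ch in surv))
print(witnesses)  # ['witness']
```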
- Finding flags
select high-ranking witness channels
determine the upper cut for those high-ranking witness channels from the null samples
analyze all the glitches using the selected witness channels
make a flag when those channels give importance above the upper cut of the null samples
calculate efficiency and deadtime
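Plausible definitions of the efficiency and deadtime computed in the last step, as a sketch (the package’s exact definitions may differ):

```python
def efficiency_and_deadtime(flag_segments, glitch_times, observing_time):
    """Efficiency: fraction of target glitches falling inside a flagged
    segment. Deadtime: fraction of observing time removed by the flags.
    Illustrative definitions only."""
    flagged = sum(any(s <= t <= e for s, e in flag_segments)
                  for t in glitch_times)
    efficiency = flagged / len(glitch_times)
    deadtime = sum(e - s for s, e in flag_segments) / observing_time
    return efficiency, deadtime

eff, dt = efficiency_and_deadtime([(10.0, 12.0), (50.0, 51.0)],
                                  [11.0, 50.5, 80.0], 100.0)
print(eff, dt)   # 2/3 of glitches flagged, 3% deadtime
```

A good flag keeps efficiency high (most target glitches vetoed) while keeping deadtime low (little observing time lost).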
origli.utilities.const
¶
Script name: const.py
- Description:
File containing the list of glitch names. It is intended to be used by import_data_hdf5.py under the bin directory.
origli.utilities.multiband_search_utilities
¶
file name: multiband_search_utilities.py
This file contains the utilities used for the multi-frequency band search.
-
origli.utilities.multiband_search_utilities.
CreateAllChannels_rho_multband
(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma)[source]¶ - description:
use a single glitch time
query timeseries of all the channels around a glitch
condition the data (whiten and compare the on- and off-source windows)
quantify all the channels (compute the importance values of all the channels)
USAGE: IndexSatisfied, Mat_Count_in_multibands, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma)
- Parameters
Listsegment – a list of segment parameters
IFO – a type of interferometer
re_sfchs – a list of safe channels
number_process – a number of processes in parallel
PlusHOFT – whether to get data of h(t), {‘True’ or ‘False’}
sigma – an integer to be used for calculating values of importance
IndexSatisfied: glitch index
Mat_Count_in_multibands: a matrix of rho with frequencies in rows and channels in columns, numpy array
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: a list of channels without “IFO:” at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in h(t)
confi: a confidence level of the classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy
-
origli.utilities.multiband_search_utilities.
CreateRho_multiband
(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma)[source]¶ - description:
calculate the whitened FFT of the on- and off-source window for a single channel
compute the value of importance for a single channel
USAGE: Counts_in_multibands, sample_rate = CreateRho_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma)
- Parameters
full_timeseries – the full time series in gwpy object including on- and off source windows
target_timeseries_start – the start time of the on-source window
target_timeseries_end – the end time of the on-source window
pre_background_start – the start time of the preceding off-source window
pre_background_end – the end time of the preceding off-source window
fol_background_start – the start time of the following off-source window
fol_background_end – the end time of the following off-source window
sigma – an integer used to calculate the value of importance
- Returns
Counts_in_multibands: importance values in different frequency bands, where importance is the fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
sample_rate: a sampling rate of a single channel
-
origli.utilities.multiband_search_utilities.
HierarchyChannelAboveThreshold_single_channel_multiband
(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma)[source]¶ - description:
calculate the values of importance, i.e., the fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel, in different frequency bands
USAGE: Counts_in_multibands = HierarchyChannelAboveThreshold_single_channel_multiband(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma)
- Parameters
whitened_fft_target – whitened fft of the on-source window
whitened_fft_PBG – whitened fft of the preceding off-source window
whitened_fft_FBG – whitened fft of the following off-source window
duration – a duration of the on-source window
sampling_rate – sampling rate of a channel
sigma – an integer to determine the upper bound of the off-source window
- Returns
Counts_in_multibands: values of importance in different frequency bands, numpy array
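The per-band counting this function performs might look like the following sketch; the helper name and the per-bin upper bound are assumptions, not the package’s implementation:

```python
import numpy as np

def importance_in_bands(whitened_fft, off_upper_bound, freqs, freq_bands):
    """Per-band importance: the fraction of frequency bins in each band
    whose whitened amplitude exceeds the off-source upper bound.
    Hypothetical names, sketching the computation described above."""
    counts = []
    for lo, hi in freq_bands:
        sel = (freqs >= lo) & (freqs < hi)          # bins in this band
        counts.append(np.mean(np.abs(whitened_fft[sel]) > off_upper_bound[sel])
                      if sel.any() else 0.0)
    return np.array(counts)

freqs = np.arange(100.0)
fft = np.ones(100)
fft[10:20] = 5.0                                    # excess power at 10-20 Hz
upper = np.full(100, 2.0)                           # off-source upper bound
print(importance_in_bands(fft, upper, freqs, [(0, 50), (50, 100)]))  # [0.2 0. ]
```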
-
class
origli.utilities.multiband_search_utilities.
PlotTableAnalysis_multiband
[source]¶ Bases:
object
-
CreateMatCount_multiband
(g_Individual=None)[source]¶ - description:
find counts for each glitch
stack over all the glitches
make a matrix comprising importance versus channels
dependencies: self.HierarchyChannelAboveThreshold(g, LowerCutOffFreq, UpperCutOffFreq) USAGE: MatCount, ListChannelName, ListSNR, ListConf, ListGPS, ListDuration, ListID, mat_rho, freq_bands, ListOriginalChannelName = CreateMatCount_multiband() # for all glitches
MatCount, ListChannelName, SNR, Conf, GPS, Duration, ID, mat_rho, freq_bands, ListOriginalChannelName = CreateMatCount_multiband(g_Individual) # for an individual glitch
- Parameters
g_Individual – a HDF5 file group for a glitch
- Returns
MatCount: a matrix comprising importance versus channels
ListChannelName: a list of channel names, combined with frequency band information
ListSNR: a list of SNRs
ListConf: a list of confidence levels
ListGPS: a list of GPS times
ListDuration: a list of durations
ListID: a list of IDs
mat_rho: a matrix of rho with frequencies in rows and channels in columns, numpy array
freq_bands: a matrix of frequency bands
ListOriginalChannelName: a list of original channel names
-
PlotCausalityVSChannelMultiBand
(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the witness ratio statistics of channels in multiple frequency bands; the cells which do not pass the test are masked
USAGE: PlotCausalityVSChannelMultiBand(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)
- Parameters
list_Causal_passed – a list of the causal probabilities that passed the one-tailed binomial test; otherwise zero
list_Causal_fail – a list of the causal probabilities that failed the one-tailed binomial test; otherwise zero
list_causal_passed_err – a list of the errors of the causal probabilities that passed the one-tailed binomial test; otherwise zero
list_causal_failed_err – a list of the errors of the causal probabilities that failed the one-tailed binomial test; otherwise zero
list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’
ListChannelName – a list of channel names
output_dir – (only used for all glitches)
output_file – (only used for all glitches)
BinomialTestConfidence – binomial test confidence level
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
PlotCausalityVSChannelMultiBandNoMaskNoTable
(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the witness ratio statistics of channels in multiple frequency bands, without mask or table
USAGE: PlotCausalityVSChannelMultiBandNoMaskNoTable(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)
- Parameters
list_Causal_passed – a list of the causal probabilities that passed the one-tailed binomial test; otherwise zero
list_Causal_fail – a list of the causal probabilities that failed the one-tailed binomial test; otherwise zero
list_causal_passed_err – a list of the errors of the causal probabilities that passed the one-tailed binomial test; otherwise zero
list_causal_failed_err – a list of the errors of the causal probabilities that failed the one-tailed binomial test; otherwise zero
list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’
ListChannelName – a list of channel names
output_dir – (only used for all glitches)
output_file – (only used for all glitches)
BinomialTestConfidence – binomial test confidence level
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
PlotFrequencyVSChannel
(glitchtype, SNR, Conf, GPS, Duration, ID, URL, mat_rho, ListOriginalChannelName, freq_bands, output_dir, output_file)[source]¶ - description:
make a plot of frequencies versus channels for a glitch
dependencies: CreateChannelTicks() USAGE: PlotFrequencyVSChannel(glitchtype, SNR, Conf, GPS, Duration, ID, URL, mat_rho, ListOriginalChannelName, freq_bands, output_dir, output_file)
- Parameters
glitchtype – a type of a glitch
SNR – SNR in h(t)
Conf – classification confidence level
GPS – a gps time
Duration – a duration of a glitch
ID – Gravity Spy ID
URL – a Q-transform of h(t) of a glitch stored in Gravity Spy
mat_rho – a matrix of rho where frequencies in rows channels in columns, numpy array
ListOriginalChannelName – a list of original channel names
freq_bands – a matrix of frequency bands
output_dir –
output_file –
- Returns
None
-
PlotIndividualFCS_ImportanceVSChannel_multiband
(glitchtype, IFO, GravitySpy_df, output_dir, mode='offline', sigma=None, Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)[source]¶ - description:
load a file comprising all glitches in a class
- create a plot comprising frequency versus channel & importance versus channel
- dependencies:
self.make_subset_channel_based_on_samplingrate(), self.CreateChannelTicks(ListChannel), self.CreateMatCount(), self.PlotImportanceVSChannel()
save a plot
USAGE: PlotIndividualFCS_ImportanceVSChannel_multiband(glitchtype, IFO, GravitySpy_df, output_dir, mode='offline', sigma=None, Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)
- Parameters
glitchtype – a type of glitch, used to create the name of a plot
IFO – a type of IFO, used in the name of a plot
GravitySpy_df – Gravity Spy metadata in a pandas frame
output_dir – an output directory
mode – ‘offline’ or ‘online’
sigma – an integer to determine the upper bound of the off-source window
Listsegments – a list of allowed glitches, which is used for online mode only, None in default
re_sfchs – a list of safe channels except unused channels, which is used for online mode only, None in default
Data_outputpath – a directory saving for a HDF5 file, which is used for online mode only, None in default
Data_outputfilename – a file saving for a HDF5 file, which is used for online mode only, None in default
PlusHOFT – whether to get data of HOFT {‘True’, ‘False’}, which is used for online mode only, ‘False’ in default
number_process – a number of processes in parallel, which is used for online mode only, None in default
- Returns
None
-
Plot_WRS_Welch_t_test_MultiBand
(channels, list_Causal_passed, list_Causal_fail, list_Test_binomial, list_t_values_passed, list_t_values_failed, list_t_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the combined result of the WRS and the one-sided Welch’s t-test
USAGE: Plot_WRS_Welch_t_test_MultiBand(channels, list_Causal_passed, list_Causal_fail, list_Test_binomial, list_t_values_passed, list_t_values_failed, list_t_Test, confidence_level, output_dir, output_file)
- Parameters
channels – a list of channels
list_Causal_passed – a list of the causal probabilities that passed the one-tailed binomial test; otherwise zero
list_Causal_fail – a list of the causal probabilities that failed the one-tailed binomial test; otherwise zero
list_Test_binomial – a list of results of the binomial test, ‘pass’ or ‘fail’
list_t_values_passed – a list of t-values that pass the test
list_t_values_failed – a list of t-values that fail the test
list_t_Test – a list of the t-test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_Welch_t_test_MultiBand
(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of one-sided Welch t-test
USAGE: Plot_Welch_t_test_MultiBand(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file)
- Parameters
channels – a list of channels
list_t_values_passed – a list of t-values that pass the test
list_t_values_failed – a list of t-values that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_p_greater_MultiBand
(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of p_greater
USAGE: Plot_p_greater_MultiBand(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file)
- Parameters
channels – a list of channels
list_p_greater_passed – a list of p_greater that pass the test
list_p_greater_failed – a list of p_greater that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
ReconstructFromFlattenedList
(flattened_list, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
reconstruct the matrix in its original order from the flattened list
USAGE: mat_originl, mat_flipped = ReconstructFromFlattenedList(flattened_list, freq_bands=Const.freq_bands)
- Parameters
flattened_list – a flattened list obtained from the matrix using np.flatten(order=’F’)
freq_bands – frequency bands
- Returns
mat_originl: a matrix with frequency bands in rows from top to bottom and channels in columns
mat_flipped: a matrix with frequency bands in rows from bottom to top and channels in columns
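Since the flattened list comes from np.flatten(order=’F’), the reconstruction amounts to a column-major reshape plus a vertical flip for the flipped variant. A minimal sketch with assumed shapes:

```python
import numpy as np

# Invert a column-major flatten: reshape with order='F' restores the
# original (bands x channels) matrix; flipping reorders bands bottom-to-top.
n_bands, n_channels = 3, 4
mat = np.arange(12).reshape(n_bands, n_channels)
flattened = mat.flatten(order='F')               # column-major flatten
mat_original = flattened.reshape(n_bands, n_channels, order='F')
mat_flipped = np.flipud(mat_original)            # bands bottom-to-top
print(np.array_equal(mat_original, mat))         # True
```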
-
-
origli.utilities.multiband_search_utilities.
SaveTargetAndBackGroundHDF5_multiband_OFFLINE
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False')[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE 0. assumes Listsegments is given by Findglitchlist() 1. take the information of the list of allowed targets and the preceding and following segments 2. whiten a target segment based on the average background segment 3. find the whitened FFT 4. save the whitened target and background FFTs Note this depends on
USAGE: SaveTargetAndBackGroundHDF5_multiband_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False')
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
IFO – a type of interferometer
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
sigma – an integer to determine the upper bound of the off-source window
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default
- Returns
None
-
origli.utilities.multiband_search_utilities.
SaveTargetAndBackGroundHDF5_multiband_ONLINE
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT)[source]¶ - description:
THIS IS USED FOR “ONLINE” MODE 0. assumes Listsegments is given by Findglitchlist() 1. take the information of the list of allowed targets and the preceding and following segments 2. whiten a target segment based on the average background segment 3. find the whitened FFT 4. save the whitened target and background FFTs Note this depends on Multiprocess_whitening()
USAGE: SaveTargetAndBackGroundHDF5_multiband_ONLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
IFO – a type of interferometer
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
sigma – an integer to determine the upper bound of the off-source window
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}
- Returns
None
origli.utilities.utilities
¶
Script name: utilities.py
- Description:
File containing utilities
-
origli.utilities.utilities.
FindBGlist
(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]¶ - description:
load Gravity Spy data set (.csv file)
get the data about the target glitch class
get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines
accept glitches whose background segments do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = FindBGlist(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int )
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default
gap – a time gap between the target and the background segments in sec, 1 sec in default
flag – ‘Both’ or ‘Either’: require both backgrounds, or either the preceding or the following background, respectively, to accept glitches
- Returns
the list of parameters of the glitches passing the above thresholds; Listsegments consists of:
ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.utilities.
FindRadomlistPoints
(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)[source]¶ - description:
within an epoch, create a list of synthetic points at randomly chosen times, with durations following the distribution of the target glitch class
make pandas frame dataset
USAGE: df = FindRadomlistPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)
- Parameters
state – IFO state {observing, nominal-lock}
IFO – an observer {H1, L1}
Epoch_lt – a list of epochs
number_samples – number of samples picked up
step – step of data points in sec
outputMother_dir – an output directory in which the data set is placed
df_target – true glitch samples generated by an ETG with SNR above an upper threshold of background
- Returns
df: synthetic random data points within an epoch with durations generated from a distribution of a target glitch
-
origli.utilities.utilities.
FindShiftedPoints
(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)[source]¶ - description:
within an epoch, create a list of synthetic points by time-shifting the target glitch class
make pandas frame dataset
USAGE: df = FindShiftedPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)
- Parameters
state – IFO state {observing, nominal-lock}
IFO – an observer {H1, L1}
Epoch_lt – a list of epochs
number_samples – number of samples picked up
step – step of data points in sec
outputMother_dir – an output directory in which the data set is placed
df_target – true glitch samples generated by an ETG with SNR above an upper threshold of background
- Returns
df: synthetic random data points within an epoch with durations generated from a distribution of a target glitch
-
origli.utilities.utilities.
Findglitchlist
(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, UpperDurationThresh, LowerDurationThresh, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]¶ - description:
load Gravity Spy data set (.csv file)
get the data about the target glitch class
get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines
accept glitches whose background segments do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = Findglitchlist(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, UpperDurationThresh, LowerDurationThresh, gap, position_duration_bfr_centr)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int )
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default
gap – a time gap between the target and the background segments in sec, 1 sec in default
position_duration_bfr_centr – the proportion of the duration placed before the center time of a target segment; e.g., 0.5 indicates the duration is evenly distributed around the center time, and 0.83 indicates 5/6 of it is before the center time
TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG queries for target glitches
targetUpperSNR_thre – an upper limit cutoff value of SNR of triggers given by an ETG queries for target glitches
flag – ‘Both’ or ‘Either’: require both backgrounds, or either the preceding or the following background, respectively, to accept glitches
- Returns
the list of parameters of the glitches passing the above thresholds; Listsegments consists of:
ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.utilities.
FindglitchlistLongestBG
(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]¶ - description:
load Gravity Spy data set (.csv file)
get the data about the target glitch class
get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines
accept glitches whose background segments do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = FindglitchlistLongestBG(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int )
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default
gap – a time gap between the target and the background segments in sec, 1 sec in default
position_duration_bfr_centr – proportion of duration for a target segment around a center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, 0.83 indicates 5/6 is before the center time
TriggerPeakFreqLowerCutoff – a lower cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper cutoff value of the peak frequency of triggers given by an ETG for target glitches
targetUpperSNR_thre – an upper cutoff value of SNR of triggers given by an ETG for target glitches
flag – 'Both' or 'Either': accept glitches only if both backgrounds are clean, or if either the preceding or the following background is clean, respectively
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments contains:
ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
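The non-overlap acceptance step described above can be sketched as follows. Note that `accept_isolated` is a hypothetical helper, and the symmetric `half_window` background is an illustrative simplification of the separate preceding/following segments used by the package:

```python
def accept_isolated(gps_times, half_window):
    """Return indices of glitches whose background window (GPS +/- half_window,
    a simplification of the preceding/following segments) contains no other
    glitch from the same list."""
    accepted = []
    for i, t0 in enumerate(gps_times):
        # a glitch is kept only if every other glitch lies outside its window
        clean = all(abs(t - t0) > half_window
                    for j, t in enumerate(gps_times) if j != i)
        if clean:
            accepted.append(i)
    return accepted

gps = [100.0, 100.5, 200.0]
idx = accept_isolated(gps, half_window=1.0)
# idx == [2]: the first two glitches fall inside each other's windows
```

The real functions additionally apply the SNR, confidence, and duration cuts before this isolation check.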
-
origli.utilities.utilities.
Findglitchlist_for_timeseries_analysis
(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]¶ - description:
load Gravity Spy data set (.csv file)
get the data about the target glitch class
get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines
accept glitches whose background segments do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = Findglitchlist_for_timeseries_analysis(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=np.inf, flag='Both')
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int )
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default
gap – a time gap between the target and the background segments in sec, 1 sec in default
position_duration_bfr_centr – proportion of duration for a target segment around a center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, 0.83 indicates 5/6 is before the center time
TriggerPeakFreqLowerCutoff – a lower cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper cutoff value of the peak frequency of triggers given by an ETG for target glitches
targetUpperSNR_thre – an upper cutoff value of SNR of triggers given by an ETG for target glitches
flag – 'Both' or 'Either': accept glitches only if both backgrounds are clean, or if either the preceding or the following background is clean, respectively
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments contains:
ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.utilities.
GrabGPStimesSafechannel
(fileid, ifo, SNRthre, glitch, PathSafeChannel, Epochstart, Epochend, Commissioning_lt=None)[source]¶ description: this imports a file containing the output of Gravity Spy
take the GPS times during the O2 run, take the list of safe channels, and modify it so that it works with gwpy
USAGE: GPSs, ids, re_sfchs = GrabGPStimesSafechannel('/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/GravityspyTrainingset/gspy-db-20180813.csv', 'L1', 7, 'Blip', '/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt')
- Parameters
fileid – a file that contains all the metadata of glitches used for the Gravity Spy training set
ifo – a kind of interferometer {L1, H1, V1}
SNRthre – the minimum threshold of SNR, e.g., 7
glitch – a kind of glitch
PathSafeChannel – the full path of the file of the metadata
Epochstart – GPS time when a science run begins, float or int
Epochend – GPS time when a science run ends, float or int
Commissioning_lt – the set of commissioning times in a list of lists, e.g., [[Cstart1, Cend1], [Cstart2, Cend2]], None in default
- Returns
GPSs: a list of GPS times
ids: a list of unique IDs
re_sfchs: a list of safe channels
-
origli.utilities.utilities.
GrabSafechannel
(PathSafeChannel)[source]¶ - description:
take the list of the safe channels and modify it so that it works with gwpy
USAGE: re_sfchs = GrabSafechannel('/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt')
- Parameters
PathSafeChannel – the full path of the safe-channel list file
- Returns
re_sfchs: a list of safe channels
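A minimal sketch of reading a safe-channel list and normalizing names for gwpy-style queries. The one-channel-per-line layout and the bare-name-plus-IFO-prefix convention are assumptions about the file format, not the package's exact parsing:

```python
import os
import tempfile

def load_safe_channels(path, ifo):
    """Read a plain-text safe-channel list (one channel name per line; this
    layout is an assumption) and return gwpy-style names like
    'L1:LSC-POP_A_LF_OUT_DQ'."""
    channels = []
    with open(path) as f:
        for line in f:
            name = line.strip().split()[0] if line.strip() else ""
            if not name or name.startswith("#"):
                continue  # skip blanks and comments
            # prepend the IFO prefix only if the file stores bare channel names
            channels.append(name if ":" in name else f"{ifo}:{name}")
    return channels

# demo with a throwaway file (the real list lives at the path shown above)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("# hveto-safe channels\nLSC-POP_A_LF_OUT_DQ\nASC-AS_B_RF45_Q_YAW_OUT_DQ\n")
    tmp = f.name
channels = load_safe_channels(tmp, "L1")
os.remove(tmp)
# channels == ['L1:LSC-POP_A_LF_OUT_DQ', 'L1:ASC-AS_B_RF45_Q_YAW_OUT_DQ']
```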
-
class
origli.utilities.utilities.
IdentifyGlitch
[source]¶ Bases:
object
-
CombinedIndentifyingProcess
(IFO, ListSegments, TriggerDir)[source]¶ - description:
use OmimcronTriggerPath() to log in to either the L1 or H1 cluster and get a list of trigger file paths
use CopyOmicroTriggerandUnzip() to copy the trigger files and unzip them
- iterate over the trigger XML files:
3-1. use readXML() to get the metadata stored in a .xml file
3-2. use ExtractOmcronTriggerMetadata() to re-arrange the metadata matrix
save the matrix as a .csv file
copy the .csv file into the CIT cluster and go back to the CIT cluster
USAGE: CombinedIndentifyingProcess('L1', ListSegments, '/home/kentaro.mogushi/longlived/OmicronTrigger')
- Parameters
IFO – ifo {H1, L1, V1}
ListSegments – a list of segments
TriggerDir – a mother directory of a trigger file
- Returns
None
-
CopyOmicroTriggerandUnzip
(input_file, input_dir, TriggerDir, output_dir='OmicronTriggerXML')[source]¶ - description:
load a file comprising all the paths of the omicron trigger files you are interested in
copy all the omicron trigger files into your working area
go to the output directory
unzip all the files, replacing the zipped files
go back to a working directory
USAGE: CopyOmicroTriggerandUnzip(‘omicron.txt’, trigger_dir, trigger_dir)
- Parameters
input_file – an input file
input_dir – an input directory, the current directory by default
trigger_dir – a directory right above the output directory
output_dir – an output directory where all the omicron trigger files will be stored
- Returns
None
-
ExtractOmcronTriggerMetadata
(name_a)[source]¶ - description:
load the metadata matrix, expected to be created by readXML()
re-arrange it for convenience and convert it to a numpy array
return the re-arranged metadata matrix as a numpy array
USAGE: Matdataset = ExtractOmcronTriggerMetadata(name_a)
- Parameters
name_a – omicron trigger info (list)
- Returns
Matdataset: omicron trigger metadata (numpy array)
-
FindGlitchNearestGPS
(PathMetadataFile, candidate_GPS)[source]¶ - description:
take an omicron trigger metadata file
find the glitch nearest to an input GPS time of interest
replace the label of this glitch from 'arbitrary' to 'candidate'
This function is assumed to be used in a DQR. Once GraceDB provides a GPS time, this function labels the glitch nearest to that time among the Omicron triggers as 'candidate', since it can be considered the most significant candidate of an astronomical event. In this way, one can study only the candidate glitch by specifying glitch_type = candidate in a configuration file.
USAGE: largetSNR = FindGlitchNearestGPS(PathMetadataFile, GPS)
- Parameters
PathMetadataFile – a path to omicron meta data file
GPS – a GPS time to be concerned
- Returns
None
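The nearest-trigger search at the heart of this function can be sketched in a few lines; `nearest_trigger_index` is a hypothetical helper name, not the package's API:

```python
def nearest_trigger_index(trigger_gps, candidate_gps):
    """Index of the trigger whose GPS time is closest to the candidate GPS,
    i.e., the trigger that would be relabelled 'candidate'."""
    return min(range(len(trigger_gps)),
               key=lambda i: abs(trigger_gps[i] - candidate_gps))

gps_times = [1187008882.4, 1187008890.1, 1187008900.7]
idx = nearest_trigger_index(gps_times, 1187008891.0)
# idx == 1: the second trigger is nearest to the GraceDB-provided time
```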
-
FindObservingTimeSegments
(IFO, startT, endT, outputMother_dir, state='observing')[source]¶ - description:
take a DataQualityFlag from a server
save its segment data as an HDF file
take active segments
return active segments as a numpy array
USAGE: SegmentsMat, trigger_dir = FindObservingTimeSegments(IFO, startT, endT, outputMother_dir, state)
- Parameters
IFO – a type of interferometer, ‘L1’ or ‘H1’
startT – (float, int or string: e.g., ‘Dec 8 2016’) starting time
endT – ending time
outputMother_dir – an output directory where the HDF5 file will be stored
state – state of an interferometer, {observing, nominal-lock}
- Returns
SegmentsMat: active segments in a numpy array
-
GetTriggerMetadata
(ListSegments, IFO, output_dir, number_process, trigger_pipeline='omicron', output_file='TriggerMetadata.csv', channel='GDS-CALIB_STRAIN')[source]¶ - description:
take the path of files storing omicron triggers during a segment
make a metadata in pandas frame
3. save it as a .csv file
Note: this function does the same job as CombinedIndentifyingProcess() but is faster, and it is supposed to support pyCBC triggers as well.
USAGE: GetTriggerMetadata(ListSegments, IFO, output_dir, number_process, trigger_pipeline=’omicron’, output_file=’TriggerMetadata.csv’, channel=’GDS-CALIB_STRAIN’)
- Parameters
ListSegments – (list of lists) [[s1, e1], [s2, e2], …], so that segments of non-observing time can be excluded
IFO – interferometer (L1, or H1)
output_dir – an output directory
output_file – a name of an output file
number_process – the maximum number of processes run in parallel
trigger_pipeline – trigger method ‘omicron’, ‘pycbc-live’
channel – (str) the name of a channel
- Returns
ListXML: a list of trigger XML files
-
OmimcronTriggerPath
(ListSegments, TriggerDir, IFO, output_file='omicron.txt', channel='GDS-CALIB_STRAIN')[source]¶ - description:
log in to either the Livingston or Hanford cluster
take the path of files storing omicron triggers during a segment
save those paths into an output file
USAGE: trigger_dir = OmimcronTriggerPath(ListSegments, '/home/kentaro.mogushi/longlived/OmicronTrigger', 'L1')
- Parameters
IFO – interferometer (L1, or H1)
ListSegments – (list of lists) [[s1, e1], [s2, e2], …], so that segments of non-observing time can be excluded
channel – (str) the name of a channel
trigger_dir – (str) an output directory
output_file – (str) an output file
- Returns
trigger_dir
-
SaveMetaDataAsCSV
(MetaData, output_dir, output_file)[source]¶ - description:
load the trigger metadata matrix, which is expected to be created by ExtractOmcronTriggerMetadata()
label these triggers as 'unknown'
set imgUrl to None
save this matrix as a .csv file
USAGE: SaveMetaDataAsCSV(AllMatDataStr, trigger_dir, ‘OmicrontriggerMetadata.csv’)
- Parameters
MetaData – numpy array, omicron trigger metadata
output_file – an output file
output_dir – an output directory
- Returns
None
-
calculate_chisqr_weighted_snr
(snr, chisq, chisq_dof)[source]¶ - description:
calculate the chi-square weighted SNR. Reference: Macleod et al. 2015, Equation (21)
USAGE: chisqr_weighted_snr = calculate_chisqr_weighted_snr(snr, chisq, chisq_dof)
- Parameters
snr – SNR
chisq – chi-square
chisq_dof – chi-square degrees of freedom
- Returns
chisqr_weighted_snr: the chi-square weighted SNR
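A sketch of the re-weighting, written with the convention commonly used for CBC searches (PyCBC's "newsnr"): the SNR is left unchanged when the reduced chi-square is at most 1 and suppressed otherwise. Whether this matches Equation (21) of the cited reference exactly is an assumption:

```python
def chisq_weighted_snr(snr, chisq, chisq_dof):
    """Chi-square re-weighted SNR (assumed 'newsnr'-style convention):
    snr unchanged for reduced chi-square <= 1, otherwise suppressed by
    [(1 + chisq_r**3) / 2]**(1/6)."""
    chisq_r = chisq / chisq_dof
    if chisq_r <= 1.0:
        return snr
    return snr * ((1.0 + chisq_r ** 3) / 2.0) ** (-1.0 / 6.0)

# a clean trigger (reduced chi-square of 1) keeps its SNR
clean = chisq_weighted_snr(10.0, 2.0, 2)      # returns 10.0
# a noisy trigger (reduced chi-square of 3) is down-weighted below 10
noisy = chisq_weighted_snr(10.0, 6.0, 2)
```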
-
pyCBC_SNR_filter
(all_pycbc_triggers_frame, SNR_low_cut, SNR_high_cut)[source]¶ - description:
band pass filter with SNR for pycbc triggers
USAGE: df_SNR_cut = pyCBC_SNR_filter(all_pycbc_triggers_frame, SNR_low_cut=7.5, SNR_high_cut=150)
- Parameters
all_pycbc_triggers_frame – a pandas frame of all the pyCBC triggers
SNR_low_cut – a lower cutoff SNR
SNR_high_cut – a higher cutoff SNR
- Returns
df_new: pycbc triggers that pass all the conditions defined above
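The band-pass cut shared by the pyCBC_*_filter family reduces to a boolean mask on a pandas frame. The column name "snr" and the inclusive bounds here are assumptions for illustration:

```python
import pandas as pd

def snr_band_filter(triggers, snr_low, snr_high):
    """Keep triggers with snr_low <= SNR <= snr_high (the 'snr' column name
    and inclusive boundary handling are assumptions)."""
    keep = (triggers["snr"] >= snr_low) & (triggers["snr"] <= snr_high)
    return triggers[keep]

df = pd.DataFrame({"snr": [5.0, 9.2, 40.0, 300.0]})
df_cut = snr_band_filter(df, snr_low=7.5, snr_high=150.0)
# the rows with snr 9.2 and 40.0 survive the band-pass cut
```

The total-mass, mass-ratio, and template-duration filters below follow the same pattern with different columns.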
-
pyCBC_chisqr_weighted_snr_filter
(pycbc_triggers_frame, chisqr_weighted_snr_lower_cutoff)[source]¶ - description
high pass filter with chi square weighted SNR for pycbc triggers
USAGE: df_chi_square_weighted_cut = pyCBC_chisqr_weighted_snr_filter(pycbc_triggers_frame, chisqr_weighted_snr_lower_cutoff)
- Parameters
pycbc_triggers_frame – pyCBC triggers in pandas frame
chisqr_weighted_snr_lower_cutoff – a lower cutoff chi-square weighted snr
- Returns
df_new: triggers with chi square weighted SNR above chisqr_weighted_snr_lower_cutoff
-
pyCBC_massratio_filter
(pycbc_triggers_frame, low_cutoff_massratio, high_cutoff_massratio)[source]¶ - description
band-pass filter on mass ratio for pycbc triggers
USAGE: df_massratio_cut = pyCBC_massratio_filter(pycbc_triggers_frame, low_cutoff_massratio, high_cutoff_massratio)
- Parameters
pycbc_triggers_frame – pyCBC triggers in pandas frame
low_cutoff_massratio – a lower cutoff of the mass ratio
high_cutoff_massratio – an upper cutoff of the mass ratio
- Returns
df_new: triggers with mass ratio between low_cutoff_massratio and high_cutoff_massratio
-
pyCBC_query_outlier
(pycbc_triggers_frame, BinNum, Nsigma, cut)[source]¶ - description:
1. bin the triggers by values of log10 of chi-square per DOF
2. calculate the lower bound of log10 of chi-square per DOF in each bin
3. (cut = 'median') calculate the median of log10 of SNR and log10 of chi-square per DOF in each bin
3. (cut = 'mad') calculate the upper bound of log10 of SNR in each bin, where the lower bound is the median minus the median absolute deviation and the upper bound is the median plus the median absolute deviation
4. polynomial-fit the upper bound of log10 of SNR as a function of the lower bound of log10 of chi-square per DOF
5. split the triggers into loud and quiet using the polynomial fit
USAGE: pycbc_loud = pyCBC_query_outlier(pycbc_triggers_frame, BinNum=50, Nsigma=1, cut=’median’)
- Parameters
pycbc_triggers_frame – pycbc triggers in pandas frame
BinNum – the number of bins of the histogram of log10 of chi-square per degrees of freedom
Nsigma – an integer to determine the upper bound of the quiet triggers
cut – a method of cut {median or mad}
- Returns
pycbc_loud: loud triggers
-
pyCBC_template_duration_filter
(pycbc_triggers_frame, low_cutoff_duration, high_cutoff_duration)[source]¶ - description:
band pass filter with template duration for pycbc triggers
USAGE: df_template_duration_cut = pyCBC_template_duration_filter(pycbc_triggers_frame, low_cutoff_duration, high_cutoff_duration)
- Parameters
pycbc_triggers_frame – pyCBC triggers in pandas frame
low_cutoff_duration – a lower cutoff of the template duration in sec
high_cutoff_duration – a higher cutoff of the template duration in sec
- Returns
df_new: triggers with template duration between low_cutoff_duration and high_cutoff_duration
-
pyCBC_totalmass_filter
(pycbc_triggers_frame, low_cutoff_totalmass, high_cutoff_totalmass)[source]¶ - description
band pass filter with total mass for pycbc triggers
USAGE: df_totalmass_cut = pyCBC_totalmass_filter(pycbc_triggers_frame, low_cutoff_totalmass, high_cutoff_totalmass)
- Parameters
pycbc_triggers_frame – pyCBC triggers in pandas frame
low_cutoff_totalmass – a lower cutoff of the total mass in solar mass
high_cutoff_totalmass – a higher cutoff of the total mass in solar mass
- Returns
triggers with total mass between low_cutoff_totalmass and high_cutoff_totalmass
-
pycbc_clustering_timeslice
(pycbc_trigger, IFO, startT, endT, window, extension_duration)[source]¶ - description:
This is a clustering filter:
1. take a frame of pycbc triggers
2. pick the trigger with the highest SNR in each window (a time-sliced bin)
3. create the new columns required to run this code
USAGE: pycbc_trigger = pycbc_clustering_timeslice(pycbc_trigger, IFO, startT, endT, window=0.1, extension_duration=1.5)
- Parameters
pycbc_trigger – pycbc triggers in pandas frame
IFO – ifo
startT – start time of an epoch
endT – end time of an epoch
window – a window length in sec for clustering
extension_duration – factor for extending the template duration, e.g., extension_duration = 1.5 makes the duration of the on-source window 1.5 times longer than that of the trigger
- Returns
pycbc_trigger: clustered pycbc triggers
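The time-sliced clustering step can be sketched with a pandas groupby: assign each trigger to a fixed-width time bin and keep only the loudest trigger per bin. The column names "end_time" and "snr" are assumptions about the trigger frame:

```python
import pandas as pd

def cluster_by_time_bin(triggers, start, end, window):
    """In each fixed time bin of length `window`, keep only the trigger with
    the highest SNR ('end_time' and 'snr' column names are assumptions)."""
    t = triggers[(triggers["end_time"] >= start) & (triggers["end_time"] < end)].copy()
    t["bin"] = ((t["end_time"] - start) // window).astype(int)
    idx = t.groupby("bin")["snr"].idxmax()          # loudest trigger per bin
    return t.loc[idx].drop(columns="bin").sort_values("end_time")

trig = pd.DataFrame({
    "end_time": [100.01, 100.05, 100.52, 101.3],
    "snr": [8.0, 11.0, 9.0, 7.6],
})
clustered = cluster_by_time_bin(trig, 100.0, 102.0, window=0.1)
# the two triggers in the first 0.1 s bin collapse to the louder one (snr 11.0)
```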
-
pycbc_clustering_window_around_trigger
(pycbc_trigger, IFO, one_sided_window, extension_duration)[source]¶ - description:
This is a clustering filter:
1. take a frame of pycbc triggers
2. pick the trigger with the highest SNR in a window around a trigger; the window size is twice "one_sided_window"
3. create the new columns required to run this code
USAGE: pycbc_trigger = pycbc_clustering_window_around_trigger(pycbc_trigger, IFO, one_sided_window=0.1, extension_duration=1.5)
- Parameters
pycbc_trigger – pycbc triggers in pandas frame format
IFO – ifo
one_sided_window – one-sided window in seconds around a trigger for clustering
extension_duration – factor for extending the template duration, e.g., extension_duration = 1.5 makes the duration of the on-source window 1.5 times longer than that of the trigger
- Returns
pycbc_trigger: clustered pycbc triggers
-
readXML
(input_dir, input_file)[source]¶ - description:
take the metadata stored as a .xml file in a directory named OmicronTriggerXML
return the metadata matrix in the form of a list
USAGE: TriggerMat = readXML(input_dir, input_file)
- Parameters
input_dir – an input directory
input_file – input file name
- Returns
name_a: the metadata matrix
-
-
origli.utilities.utilities.
ListUsedSafeChannel
(path_list_channel, ifo)[source]¶ - description:
take a path to a .csv file that has lists of channels
remove unused safe channels from the list of safe channels
USAGE: sfchs = ListUsedSafeChannel(path_list_channel, ifo)
- Parameters
path_list_channel – a path to a list of channels (.csv file)
ifo – observatory {L1, H1} in str
- Returns
sfchs: a subset of safe channels in a numpy array
-
origli.utilities.utilities.
Multiprocess_ConvertToTable
(cache_indiv, trigger_pipeline, IFO, Columns)[source]¶ - description:
Multiprocessing does not work if this is defined inside the class IdentifyGlitch(), so it is defined globally here
- Parameters
cache_indiv – an individual cache (each trigger file)
trigger_pipeline – a name of trigger pipeline {omicron, pycbc-live}
IFO – a name of the detector {L1, H1}
Columns – a list of columns for the metadata. This is None for omicron triggers, as there is nothing to do
- Returns
df_indiv: a metadata of triggers in pandas frame
-
origli.utilities.utilities.
Multiprocess_whitening
(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)[source]¶ - description:
This is used for multi processing for whitening segments
- Parameters
full_timeseries – time series comprising target and BGs
target_timeseries_start – a start time of a target segment
target_timeseries_end – an end time of a target segment
pre_background_start – a start time of a preceding BG
pre_background_end – an end time of a preceding BG
fol_background_start – a start time of a following BG
fol_background_end – an end time of a following BG
- Returns
whitened_fft_target: whitened FFT of a target segment
whitened_fft_PBG: whitened FFT of a preceding segment
whitened_fft_FBG: whitened FFT of a following segment
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
-
origli.utilities.utilities.
Multiprocess_whitening_timeseries
(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)[source]¶ - description:
This is used for multiprocessing for whitening segments; it outputs the absolute values of the whitened timeseries of the on- and off-source windows instead of the frequency series. The deviation of the whitened timeseries is informative, so absolute values are calculated. This function was used to study the metric evaluated in the time domain compared with the metric evaluated in the frequency domain. In conclusion, the frequency-domain metric is better, so this function is no longer used.
USAGE: whitened_target_timeseries_abs, whitened_pre_off_source_abs, whitened_fol_off_source_abs, sample_rate, DURATION = Multiprocess_whitening_timeseries(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)
- Parameters
full_timeseries – time series comprising target and BGs
target_timeseries_start – a start time of a target segment
target_timeseries_end – an end time of a target segment
pre_background_start – a start time of a preceding BG
pre_background_end – an end time of a preceding BG
fol_background_start – a start time of a following BG
fol_background_end – an end time of a following BG
- Returns
whitened_target_timeseries_abs: the absolute values of the whitened timeseries in the on-source window
whitened_pre_off_source_abs: the absolute values of the whitened timeseries in the preceding off-source window
whitened_fol_off_source_abs: the absolute values of the whitened timeseries in the following off-source window
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
-
class
origli.utilities.utilities.
PlotTableAnalysis
[source]¶ Bases:
object
-
AutoDetermineTrendBin
(SegmentStart, SegmentEnd)[source]¶ description: automatically determine the bins of the subclass trend plot USAGE: trend = AutoDetermineTrendBin(SegmentStart, SegmentEnd)
- Parameters
SegmentStart – start time of a segment
SegmentEnd – end time of a segment
- Returns
trend: {‘mins’, ‘hours’, ‘days’, ‘month’}
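The unit selection can be sketched as a simple cascade over the segment length. The thresholds below are illustrative assumptions, not the values used by the package:

```python
def auto_trend_bin(start, end):
    """Pick a trend-plot bin unit from the segment length in GPS seconds
    (the thresholds here are illustrative assumptions)."""
    span = end - start  # seconds
    if span < 6 * 3600:        # under ~6 hours: minute bins
        return "mins"
    if span < 3 * 86400:       # under ~3 days: hour bins
        return "hours"
    if span < 60 * 86400:      # under ~2 months: day bins
        return "days"
    return "month"

trend = auto_trend_bin(0, 3600)
# a one-hour segment gets minute-scale bins
```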
-
BinomialDist
(k, n, likelihood)[source]¶ description: Binomial distribution USAGE: out = BinomialDist(k, n, likelihood)
- Parameters
k – the number of detections
n – the number of trials
likelihood – probability of detection
- Returns
out: a value of probability density to find k detection out of n trials
-
BinomialTest
(k, N, rate)[source]¶ description: compute p-value of one-tailed Binomial test against null rate USAGE: p_value = BinomialTest(k, N, rate)
- Parameters
k – observed number of successes
N – total number of samples
rate – rate of successes drawn from a null hypothesis
- Returns
p_value: probability of observing a number of successes equal to or greater than the observed number
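The one-tailed test above is the upper-tail binomial probability, which can be written with only the standard library (`binomial_test_one_tailed` is a hypothetical name, not the package's API):

```python
from math import comb

def binomial_test_one_tailed(k, n, rate):
    """P(X >= k) for X ~ Binomial(n, rate): the probability, under the null
    rate, of observing at least the measured number of successes."""
    return sum(comb(n, i) * rate**i * (1 - rate) ** (n - i)
               for i in range(k, n + 1))

p = binomial_test_one_tailed(8, 10, 0.5)
# p = 56/1024 ~ 0.055: 8 or more successes out of 10 is unlikely at a null rate of 0.5
```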
-
Calculate_number_and_rate
(df_target, df_null, d_c, err_cal, channels)[source]¶ - description:
use the target and null samples
calculate the numbers of the target and null samples above threshold
calculate the fraction of the target and null samples above threshold
USAGE: channels, list_num1, list_num0, total_sample1, total_sample0, list_Causal, list_causal_err = Calculate_number_and_rate(df_target, df_null, d_c=None, err_cal=False, channels=None)
- Parameters
df_target – target samples in pandas frame
df_null – null samples in pandas frame
d_c – a threshold. if it is None, the threshold is the mean value of null samples
err_cal – boolean, whether to calculate the error of the statistics or not
channels – a list of channels
- Returns
channels: a list of channels
list_num1: a list of the numbers of target samples above a threshold
list_num0: a list of the numbers of null samples above a threshold
total_sample1: the number of target samples, float
total_sample0: the number of null samples, float
list_Causal: a list of the statistics
list_causal_err: a list of the errors of the statistics; None if err_cal = False
-
CreateChannelTicks
(ListChannelName)[source]¶ - description:
take the dominant sub-channel names
get a list of indices where the sub-sensor name changes
dependencies: (tacitly) CreateMatCount(), make_subset_channel_based_on_samplingrate() USAGE: CenterTicks, ListInd, ListSubsys = CreateChannelTicks(ListChannelName)
- Parameters
ListChannelName – a list of channel names
- Returns
CenterTicks: the center index of each dominant sub-sensor group
ListInd: the edge indices of each dominant sub-sensor group
ListSubsys: a list of dominant sensor names
-
CreateMatCount
(sigma, g_Individual=None, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]¶ - description:
find counts for each glitch
stack over all the glitches
make a matrix comprising importance versus channels
- USAGE: MatCount, ListChannelName, ListSNR, ListConf, ListGPS, ListDuration = CreateMatCount(sigma, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’) # for all glitches
MatCount, ListChannelName, ListSNR, ListConf = CreateMatCount(sigma, g_Individual, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’) # for an individual glitch
- Parameters
sigma – an integer to determine the upper bound of the off-source window
g_Individual – a group of an HDF5 file that has values of importance for a glitch
LowerCutOffFreq – a lower limit frequency cut to calculate a value of importance
UpperCutOffFreq – an upper limit frequency cut to calculate a value of importance
- Returns
MatCount: a matrix comprising importance versus channels
ListChannelName: a list of channel names
ListSNR: a list of SNRs
ListConf: a list of confidence values
dependencies: self.HierarchyChannelAboveThreshold(g, LowerCutOffFreq, UpperCutOffFreq)
-
Determine_number_of_subclass
(Path_Target_Glitch_SubClassClustered_Dataset, Path_Null_Dataset, test_confidence)[source]¶ - description:
query the clustered target samples
query null samples
perform one-sided binomial test and one-sided Welch t-test on each subclass
count the number of subclasses that have at least one channel passing both tests
USAGE: num_subclass = Determine_number_of_subclass(Path_Target_Glitch_SubClassClustered_Dataset, Path_Null_Dataset, test_confidence)
- Parameters
Path_Target_Glitch_SubClassClustered_Dataset – a path to the clustered target samples
Path_Null_Dataset – a path to null samples
test_confidence – a statistical confidence level
- Returns
num_subclass: the number of subclasses that have at least one channel passing both tests
-
FindNumberChannels
(g)[source]¶ - description:
count the number of channels that are analyzed
USAGE: NumberOfChannels = FindNumberChannels(g)
- Parameters
g – a HDF file group object
- Returns
NumberOfChannels: the number of channels that are analyzed
-
FindSubClass
(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, upper_number_cluster, applied_Transformation)[source]¶ - description:
use a clustering approach
plot glitch index VS channel grouped by clusters
make a table comprising a list of GPS times for a given cluster
plot Importance VS channel of a given sub-class
make a corresponding table
USAGE: FindSubClass(MatCount, ListChannelName, ListGPS, output_dir)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListGPS – a list of GPS times
ListDuration – a list of durations
output_dir – an output directory
upper_number_cluster – a value of upper limit of the number of clusters
applied_Transformation – decomposition applied {'PCA', 'kernelPCA', 'None'}
- Returns
None
-
FrequencyBandColor
(PathGlitchChannel_Low, PathGlitchChannel_Mid, PathGlitchChannel_High)[source]¶ - description:
load files that have the low, middle, and high frequency band importances (GPSImportanceChannels.csv)
make a matrix that has RGB color based on each importance
output this matrix
USAGE: RGBMat = FrequencyBandColor(PathGlitchChannel_Low, PathGlitchChannel_Mid, PathGlitchChannel_High)
- Parameters
PathGlitchChannel_Low – a path to a file that has a low frequency band
PathGlitchChannel_Mid – a path to a file that has a middle frequency band
PathGlitchChannel_High – a path to a file that has a high frequency band
- Returns
RGBMat
-
Get_MatCountGPSDuration
(Path_Target_Glitch_SubClassClustered_Dataset)[source]¶ - description:
get an importance matrix, a list of channels, a list of GPS times, and a list of durations from the clustered target samples
- Parameters
Path_Target_Glitch_SubClassClustered_Dataset – a path to the clustered target samples
- Returns
MatCount: an importance matrix
ListChannelName: a list of channels
ListGPS: a list of GPS times
ListDuration: a list of durations
-
HierarchyChannelAboveThreshold
(g, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]¶ - description:
calculate values of importance, i.e., the fraction of frequency bins of the on-source window above an upper threshold of the off-source window, for each channel at the time of a given glitch
USAGE: RankingChannelAndCount, ListCount, ListChannelName, GPS, ID, SNR, confidence, duration = HierarchyChannelAboveThreshold(g, LowerCutOffFreq, UpperCutOffFreq, sigma=10)
- Parameters
g – (hdf5 format) a group having a glitch
pt – the value up to which the distribution is integrated
sigma – value of standard deviation of ratio of medians to determine important channels
- Returns
RankingChannelAndRatioMed: a list of channel names with their ratios, sorted in descending order of ratio
Importantchannels: channels whose ratio is greater than the ratio threshold, along with their ratios
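The "importance" statistic computed here is the fraction of on-source frequency bins that exceed an upper threshold derived from the off-source window. A minimal sketch, where the threshold definition (median plus sigma population standard deviations of the off-source bins) is an illustrative assumption rather than the package's exact formula:

```python
from statistics import median, pstdev

def importance(on_source_bins, off_source_bins, sigma):
    """Fraction of on-source frequency bins exceeding an upper threshold set
    from the off-source window (the median + sigma*pstdev threshold is an
    illustrative assumption)."""
    threshold = median(off_source_bins) + sigma * pstdev(off_source_bins)
    return sum(1 for b in on_source_bins if b > threshold) / len(on_source_bins)

off = [1.0, 1.1, 0.9, 1.0, 1.0]   # quiet off-source spectrum
on = [5.0, 4.0, 1.0, 0.8]          # on-source spectrum with two excited bins
frac = importance(on, off, sigma=2)
# frac == 0.5: two of the four on-source bins stand above the off-source threshold
```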
-
MakeMatrixOccurrenceVSChannel
(MatCount, ListChannelName, ListGPS, ListDuration, output_dir)[source]¶ - description:
save the matrix of GPS, duration, and importance as a .csv file
USAGE: MakeMatrixOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListGPS – a list of GPS times
ListDuration – a list of durations
output_dir – an output directory
- Returns
None
dependencies: CreateChannelTicks()
-
PlotCausalityVSChannel
(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
Calculate the probabilities of the causality of channels
USAGE: PlotCausalityVSChannel(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)
- Parameters
list_Causal_passed – a list of the probabilities of causality that passed the one-tailed Binomial test, otherwise zero
list_Causal_fail – a list of the probabilities of causality that failed the one-tailed Binomial test, otherwise zero
list_causal_passed_err – a list of the errors of the causal probability that passed the one-tailed Binomial test, otherwise zero
list_causal_failed_err – a list of the errors of the causal probability that failed the one-tailed Binomial test, otherwise zero
list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’
ListChannelName – a list of channel names
output_dir – (only used for all glitches)
output_file – (only used for all glitches)
BinomialTestConfidence – binomial test confidence level
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
PlotConfidenceVSChannel
(MatCount, ListChannelName, ListConf, output_dir, output_file)[source]¶ make a plot of values of confidence level of Gravity Spy versus channels dependencies: CreateChannelTicks() USAGE: PlotConfidenceVSChannel(MatCount, ListChannelName, ListConf, output_dir, output_file)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListConf – a list of confidence levels
output_dir – an output directory
output_file – an output file name
- Returns
None
-
PlotImportanceVSChannel
(MatCount, ListChannelName, output_dir, output_file, ax=None)[source]¶ - description:
For all glitches in a class, this method works stand-alone:
1. convert numbers to channel names in the x ticks
2. color the background based on channel types
3. plot a bar showing the importance of channels
For an individual glitch in a class, this method is used by …, with the same three steps.
dependencies: CreateChannelTicks(), tacitly CreateMatCount()
USAGE: PlotImportanceVSChannel(MatCount, ListChannelName, None, None, ax) # for online mode
USAGE: PlotImportanceVSChannel(MatCount, ListChannelName, output_dir, output_file) # for offline mode
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
output_dir – (only used for all glitches)
output_file – (only used for all glitches)
ax – matplotlib.pyplot object (used only for an individual glitch)
- Returns
None
-
PlotIndividualFCS_ImportanceVSChannel
(glitchtype, IFO, GravitySpy_df, output_dir, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None', mode='offline', Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)[source]¶ - description:
1. load a file comprising all glitches in a class
2. create a plot comprising frequency versus channel and importance versus channel
3. save the plot
dependencies: make_subset_channel_based_on_samplingrate(), CreateChannelTicks(), CreateMatCount(), PlotImportanceVSChannel()
USAGE: PlotIndividualFCS_ImportanceVSChannel(glitchtype, IFO, output_dir, sigma, LowerCutOffFreq, UpperCutOffFreq)
- Parameters
glitchtype – the glitch type, used to create the name of the plot
IFO – the IFO type, used in the name of the plot
GravitySpy_df – Gravity Spy metadata as a pandas DataFrame
output_dir – an output directory
sigma – an integer used for the upper bound of the background noise
LowerCutOffFreq – the lower cut-off frequency used in CreateMatCount, ‘None’ in default
UpperCutOffFreq – the upper cut-off frequency used in CreateMatCount, ‘None’ in default
mode – ‘offline’ or ‘online’
Listsegments – a list of allowed glitches; online mode only, None by default
re_sfchs – a list of safe channels excluding unused channels; online mode only, None by default
Data_outputpath – a directory for saving an HDF5 file; online mode only, None by default
Data_outputfilename – a file name for saving an HDF5 file; online mode only, None by default
PlusHOFT – whether to get h(t) data {‘True’, ‘False’}; online mode only, ‘False’ by default
number_process – the number of parallel processes; online mode only, None by default
- Returns
None
-
PlotOccurrenceVSChannel
(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, output_file)[source]¶ - description:
plot glitch indices versus channels
USAGE: PlotOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, output_file)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListGPS – a list of GPS times
ListDuration – a list of durations
output_dir – an output directory
output_file – an output file name
- Returns
None
dependencies: CreateChannelTicks()
-
PlotSNRVSChannel
(MatCount, ListChannelName, ListSNR, output_dir, output_file)[source]¶ - description:
make a plot of SNR of h(t) versus channels
dependencies: CreateChannelTicks() USAGE: PlotSNRVSChannel(self, MatCount, ListChannelName, ListSNR, output_dir, output_file)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListSNR – a list of SNRs
output_dir – an output directory
output_file – an output file name
- Returns
None
-
PlotTimeVSChannel
(MatCount, ListChannelName, ListGPS, output_dir, output_file, startT, endT, dt)[source]¶ - description:
make a plot of glitch times versus channels, where time flows from top to bottom. If there is more than one glitch in a time bin, those glitches’ values of importance are averaged. If there are no glitches in a time bin, all values of importance are set to zero.
dependencies: CreateChannelTicks() USAGE: PlotTimeVSChannel(MatCount, ListChannelName, ListGPS, output_dir, output_file, startT, endT, dt)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
ListGPS – a list of GPS times
output_dir – a path to an output directory
output_file – a name of an output file
startT – start time of an epoch
endT – end time of an epoch
dt – step size in sec of the time slice
- Returns
None
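The time-binning rule stated above (average the importance of glitches sharing a bin, leave empty bins at zero) can be sketched in numpy. The helper name `bin_importance_in_time` is hypothetical, not part of the package:

```python
import numpy as np

def bin_importance_in_time(MatCount, ListGPS, startT, endT, dt):
    """Average importance per time bin: glitches sharing a bin are
    averaged; bins with no glitches stay zero."""
    mat = np.asarray(MatCount, dtype=float)
    gps = np.asarray(ListGPS, dtype=float)
    edges = np.arange(startT, endT + dt, dt)   # bin edges over the epoch
    n_bins = len(edges) - 1
    out = np.zeros((n_bins, mat.shape[1]))
    which = np.digitize(gps, edges) - 1        # bin index of each glitch
    for b in range(n_bins):
        rows = mat[which == b]
        if rows.size:
            out[b] = rows.mean(axis=0)
    return out
```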
-
Plot_Welch_t_test
(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of one-sided Welch t-test
USAGE: Plot_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, output_dir, output_file)
- Parameters
channels – a list of channels
list_t_values_passed – a list of t-values that pass the test
list_t_values_failed – a list of t-values that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_fap
(channels_band, list_GPS, list_duration, mat_fap, mat_Test, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the FAP (false-alarm probability) results
USAGE: Plot_fap(channels_band, list_GPS, list_duration, mat_fap, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)
- Parameters
channels_band – a list of channels in numpy array
list_GPS – a list of GPS times in numpy array
list_duration – a list of durations in numpy array
mat_fap – matrix of fap
mat_Test – a list of the test results {‘pass’, ‘fail’}
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_p_belong
(channels_band, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of p_belong
USAGE: Plot_p_belong(channels, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)
- Parameters
channels_band – a list of channels in numpy array
list_GPS – a list of GPS times in numpy array
list_duration – a list of durations in numpy array
mat_p_belong – matrix of p_belong
mat_Test – a list of the test results {‘pass’, ‘fail’}
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_p_greater
(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of p_greater
USAGE: Plot_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, output_dir, output_file)
- Parameters
channels – a list of channels
list_p_greater_passed – a list of p_greater above confidence level
list_p_greater_failed – a list of p_greater that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Plot_point_chisqur_test
(channels_band, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, p_values, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
plot the result of point chi-square test
USAGE: Plot_point_chisqur_test(channels, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)
- Parameters
channels_band – a list of channels in numpy array
list_GPS – a list of GPS times in numpy array
list_duration – a list of durations in numpy array
mat_chsqr_passed – a matrix of “passed” chi-square values (glitch indices in rows, channels in columns) in numpy array; channels in glitches that passed the test have non-zero values, zero otherwise
mat_chsqr_failed – a matrix of “failed” chi-square values (glitch indices in rows, channels in columns) in numpy array; channels in glitches that failed the test have non-zero values, zero otherwise
mat_Test – a list of the test results {‘pass’, ‘fail’}
p_values – a matrix of p-values where glitch indices are in rows and channels are columns in numpy array
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Principal_component_analysis
(MatCount, frac_component=0.9)[source]¶ - description:
PCA decomposition applied to the feature matrix
USAGE: MatCount_pca, MatCount_inverse_pca = Principal_component_analysis(MatCount, frac_component=0.9)
- Parameters
MatCount – a feature matrix of glitches with samples in rows and features in columns
frac_component – the target cumulative explained variance
- Returns
MatCount_pca: a feature matrix in PCA space
MatCount_inverse_pca: a reconstructed feature matrix in the original space
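The described step (keep components up to the cumulative variance `frac_component`, then map back to the original space) can be sketched with a plain numpy SVD. The lowercased name marks this as an illustration, not the package's implementation:

```python
import numpy as np

def principal_component_analysis(MatCount, frac_component=0.9):
    """Project onto the fewest principal components whose cumulative
    explained variance reaches frac_component, then reconstruct the
    matrix in the original feature space."""
    X = np.asarray(MatCount, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S ** 2 / (len(X) - 1)                  # explained variance per PC
    cum = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(cum, frac_component) + 1)
    MatCount_pca = Xc @ Vt[:k].T                 # projection onto k PCs
    MatCount_inverse_pca = MatCount_pca @ Vt[:k] + mean  # back-projection
    return MatCount_pca, MatCount_inverse_pca
```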
-
Save_fap_csv
(channels, list_GPS, list_duration, mat_fap, output_dir)[source]¶ - description:
save the FAP table as csv file
USAGE: Save_fap_csv(channels, list_GPS, list_duration, mat_fap, output_dir)
- Parameters
channels – a list of channels
list_GPS – a list of GPS times
list_duration – a list of durations
mat_fap – a matrix of FAP of each channel for each glitch
output_dir – output directory
- Returns
None
-
Save_p_belong_csv
(channels_passed, list_GPS, list_duration, mat_p_belong, output_dir)[source]¶ - description:
save the p_belong table as a .csv file; channels_passed is the list of channels whose p_greater is above 0.5, to avoid misinterpretation of p_belong
USAGE: Save_p_belong_csv(channels_passed, list_GPS, list_duration, mat_p_belong, output_dir)
- Parameters
channels_passed – a list of channels
list_GPS – a list of GPS times
list_duration – a list of durations
mat_p_belong – a matrix of p_belong of each channel for each glitch
output_dir – output directory
- Returns
None
-
Student_t_independet_test
(data1, data2, Welch_test=True)[source]¶ Independent Student t-test (Welch’s unequal-variance form when Welch_test=True) USAGE: stat, p_value = self.Student_t_independet_test(data1, data2)
- Parameters
data1 – a population
data2 – another population
- Returns
t_value: the t statistic
p: the p-value
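This test maps directly onto `scipy.stats.ttest_ind`; the sketch below (lowercased name, hypothetical) shows the likely wrapper, where `Welch_test=True` selects the unequal-variance form:

```python
from scipy import stats

def student_t_independent_test(data1, data2, Welch_test=True):
    """Independent two-sample t-test; Welch_test=True uses Welch's
    unequal-variance variant, otherwise the pooled-variance Student test."""
    t_value, p = stats.ttest_ind(data1, data2, equal_var=not Welch_test)
    return t_value, p
```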
-
TableCausality
(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ make a table of witness ratio statistic (WRS) of channels as a .csv file USAGE: TableCausality(self, list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file)
- Parameters
list_Causal_passed – a list of the causal probabilities for channels that passed the one-tailed binomial test; zero otherwise
list_Causal_fail – a list of the causal probabilities for channels that failed the one-tailed binomial test; zero otherwise
list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’
ListChannelName – a list of channel names
output_dir – an output directory
output_file – an output file name
BinomialTestConfidence – binomial test confidence level
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
TableImportance
(MatCount, ListChannelName, output_dir, output_file)[source]¶ make a table of values of importance as a .csv file USAGE: TableImportance(MatCount, ListChannelName, output_dir, output_file)
- Parameters
MatCount – a matrix comprising importance versus channels
ListChannelName – a list of channel names
output_dir – an output directory
output_file – an output file name
- Returns
None
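A minimal sketch of the table-writing step, assuming `MatCount` has glitches in rows and channels in columns. The lowercased name and the optional output arguments are illustrative, not the package API:

```python
import numpy as np
import pandas as pd

def table_importance(MatCount, ListChannelName, output_dir=None, output_file=None):
    """Build the importance table (glitches in rows, channels in columns)
    and optionally save it as a .csv file."""
    df = pd.DataFrame(np.asarray(MatCount, dtype=float),
                      columns=list(ListChannelName))
    if output_dir is not None and output_file is not None:
        df.to_csv(f"{output_dir}/{output_file}")
    return df
```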
-
Table_Welch_t_test
(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
save the result of one-sided Welch t-test as a .csv file
USAGE: Table_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, output_dir, output_file)
- Parameters
channels – a list of channels
list_t_values_passed – a list of t-values that pass the test
list_t_values_failed – a list of t-values that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
Table_p_greater
(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
save the result of p_greater as a .csv file
USAGE: Table_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, output_dir, output_file)
- Parameters
channels – a list of channels
list_p_greater_passed – a list of p_greater that pass the test
list_p_greater_failed – a list of p_greater that fail the test
list_Test – a list of the test results {‘pass’, ‘fail’}
confidence_level – a confidence level
output_dir – an output directory
output_file – an output file name
freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py
- Returns
None
-
TrendSubClass
(SegmentStart, SegmentEnd, input_dir, input_file, output_dir, trend='months', norm=False)[source]¶ the input file is expected to be ClusteredGPSImportanceChannels.csv USAGE: TrendSubClass(SegmentStart, SegmentEnd, input_dir, input_file, output_dir, trend=’months’)
- Parameters
SegmentStart – start GPS time of a segment
SegmentEnd – end GPS time of a segment
input_dir – input directory
input_file – an input file
output_dir – an output directory
trend – a trend ‘months’, ‘days’, ‘hours’ or ‘mins’
- Returns
None
-
calculate_fap
(target_mat, null_mat)[source]¶ - description:
calculate the values of fap of each channel in each glitch
USAGE: faps = calculate_fap(target_mat, null_mat)
- Parameters
target_mat – a matrix of target samples where glitch indices are in rows and channels are in columns
null_mat – a matrix of null samples where glitch indices are in rows and channels are in columns
- Returns
fap
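A common empirical definition consistent with this description is the fraction of null samples at least as large as each target value, per channel; this numpy sketch assumes that definition (the package may differ):

```python
import numpy as np

def calculate_fap(target_mat, null_mat):
    """For each glitch (row) and channel (column), estimate the false-alarm
    probability as the fraction of null samples >= the target value."""
    target = np.asarray(target_mat, dtype=float)
    null = np.asarray(null_mat, dtype=float)
    n_null = null.shape[0]
    fap = np.empty_like(target)
    for j in range(target.shape[1]):  # loop over channels
        # compare every target value against channel j's null distribution
        fap[:, j] = (null[:, j][None, :] >= target[:, j][:, None]).sum(axis=1) / n_null
    return fap
```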
-
calculate_reweighted_importance
(df1, df_FAP)[source]¶ - description:
reweight the values of importance using FAP the reweighted importance is defined as rho_new = rho / (FAP + 1)
USAGE: df1_new = calculate_reweighted_importance(df1, df_FAP)
- Parameters
df1 – a target samples in pandas frame
df_FAP – a FAP matrix in pandas frame
- Returns
df1_new: reweighted importance matrix in pandas frame
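Given the stated definition rho_new = rho / (FAP + 1), the reweighting is a one-liner on aligned pandas DataFrames; this sketch assumes df1 and df_FAP share the same rows and columns:

```python
import pandas as pd

def calculate_reweighted_importance(df1, df_FAP):
    # rho_new = rho / (FAP + 1): channels with a high false-alarm
    # probability are down-weighted (assumes matching rows/columns)
    return df1 / (df_FAP + 1.0)
```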
-
chisquare_test
(target_mat, null_mat)[source]¶ - description:
calculate the chi-square value of each channel in each glitch and output the chi-square values with their corresponding p-values
USAGE: chi2_value, p_value = chisquare_test(self, target_mat, null_mat)
- Parameters
target_mat – a matrix of target samples where glitch indices are in rows and channels are in columns
null_mat – a matrix of null samples where glitch indices are in rows and channels are in columns
- Returns
chi2_value: a matrix of chi-square values with glitch indices in rows and channels in columns
p_value: a matrix of p-values with glitch indices in rows and channels in columns
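One consistent reading of this description is a point chi-square of each target value against the per-channel null mean and variance, with a 1-degree-of-freedom p-value; the sketch below assumes that form:

```python
import numpy as np
from scipy import stats

def chisquare_test(target_mat, null_mat):
    """Point chi-square statistic of each target value against the null
    distribution of its channel, with the survival-function p-value
    (1 degree of freedom)."""
    target = np.asarray(target_mat, dtype=float)
    null = np.asarray(null_mat, dtype=float)
    mu = null.mean(axis=0)            # per-channel null mean
    var = null.var(axis=0, ddof=1)    # per-channel null variance
    chi2_value = (target - mu) ** 2 / var
    p_value = stats.chi2.sf(chi2_value, df=1)
    return chi2_value, p_value
```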
-
find_channels
(df_target)[source]¶ - description:
find a list of channels from the target samples
USAGE: channels = find_channels(df_target)
- Parameters
df_target – target samples in pandas frame
- Returns
channels: a list of channels
-
find_meaing_ful_confidence
(df1, df0, BinomialTestConfidence, d_c=None, err_cal=False, channels=0)[source]¶ - description
1. load files of target glitches (df1) and a dummy quiet dataset (df0)
2. compute the true positive probability
3. perform the one-tailed binomial test
USAGE: list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, list_channels = find_meaing_ful_confidence(df1, df0, BinomialTestConfidence, d_c)
- Parameters
df1 – target glitches in the pandas format
df0 – null dataset in the pandas format
BinomialTestConfidence – a confidence level used for one-tailed Binomial test
d_c – user defined threshold to claim detection of a glitch, None in default
channels – a list of channels, 0 by default; if d_c is None, the detection threshold is given by the mean importance generated from the dummy quiet dataset
- Returns
list_Causal_passed: a list of the causal probabilities for channels that passed the one-tailed binomial test; zero otherwise
list_Causal_fail: a list of the causal probabilities for channels that failed the one-tailed binomial test; zero otherwise
list_causal_passed_err: a list of the errors of the causal probability for channels that passed the one-tailed binomial test; zero otherwise
list_causal_failed_err: a list of the errors of the causal probability for channels that failed the one-tailed binomial test; zero otherwise
list_Test: a list of results of the binomial test, ‘pass’ or ‘fail’
channels: a list of channel names
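The true-positive-probability step followed by a one-tailed binomial test can be sketched for a single channel as follows; `causal_probability` and the `p0` chance rate are illustrative assumptions, not the package API:

```python
import numpy as np
from scipy import stats

def causal_probability(importances, d_c, p0=0.5, confidence=0.95):
    """Fraction of glitches whose importance exceeds the threshold d_c,
    with a one-tailed binomial test against a chance rate p0."""
    x = np.asarray(importances, dtype=float)
    k, n = int((x > d_c).sum()), x.size
    prob = k / n
    p_value = stats.binomtest(k, n, p0, alternative="greater").pvalue
    test = "pass" if p_value < 1.0 - confidence else "fail"
    return prob, p_value, test
```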
-
make_subset_channel_based_on_samplingrate
(g, target_sampling_rate)[source]¶ used by: PlotIndividualFCS_ImportanceVSChannel() USAGE: X, duration, Listch_label_num, ListChannel, GPS, SNR, confidence, ID = make_subset_channel_based_on_samplingrate(f[‘gps00000’], 256)
- Parameters
g – the target glitch class’s group or a file itself (at a GPS time), HDF5 format
target_sampling_rate – the sampling rate used to group channels (256, 512, 1024, 2048, 4096, 8192, or 16384)
- Returns
X: a matrix comprising the channels with the same sampling rate of a target glitch class at a given time, whitened by a reference
duration: the duration of the time series
Listch_label_num: the list of channel labels
ListChannel: the list of channel names
GPS: the GPS time of this group
SNR: the SNR of this glitch
confidence: the confidence level of this glitch
ID: the Gravity Spy unique ID
-
perform_Welch_test
(df_target, df_null, confidence_level, channels=0)[source]¶ - description:
perform one-sided Welch t-test
USAGE: channels, list_t_values_passed, list_t_values_failed, list_Test = perform_Welch_test(df_target, df_null, confidence_level, channels=None)
- Parameters
df_target – target samples in pandas frame
df_null – null samples in pandas frame
confidence_level – a confidence level
channels – a list of channels, 0 in default
- Returns
channels: a list of channels
list_t_values_passed: a list of t-values that pass the test
list_t_values_failed: a list of t-values that fail the test
list_Test: a list of the test results {‘pass’, ‘fail’}
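A per-channel one-sided Welch test splitting the results into passed/failed lists can be sketched as below; it mirrors the documented outputs but is not the package's code (e.g., it uses `channels=None` rather than 0 as the sentinel):

```python
from scipy import stats

def perform_welch_test(df_target, df_null, confidence_level, channels=None):
    """One-sided Welch t-test per channel: does the target population have
    a larger mean importance than the null at confidence_level?"""
    if channels is None:
        channels = list(df_target.columns)
    alpha = 1.0 - confidence_level
    list_t_passed, list_t_failed, list_Test = [], [], []
    for ch in channels:
        t, p_two = stats.ttest_ind(df_target[ch], df_null[ch], equal_var=False)
        p_one = p_two / 2.0 if t > 0 else 1.0 - p_two / 2.0  # one-sided p
        if p_one < alpha:
            list_t_passed.append(float(t)); list_t_failed.append(0.0)
            list_Test.append("pass")
        else:
            list_t_passed.append(0.0); list_t_failed.append(float(t))
            list_Test.append("fail")
    return channels, list_t_passed, list_t_failed, list_Test
```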
-
perform_beta_dist
(df_target, df_null, channels=0)[source]¶ - description:
create beta distribution fits for target and null samples
USAGE: rv_t_dict, rv_n_dict = perform_beta_dist(df_target, df_null, channels=0)
- Parameters
df_target – target samples in pandas frame
df_null – null samples in pandas frame
channels – channels (optional)
- Returns
rv_t_dict: a dictionary of beta distribution (scipy obj) for the target samples rv_n_dict: a dictionary of beta distribution (scipy obj) for the null samples
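Since importance lies in [0, 1], a beta model per channel is a natural fit; this sketch fixes the support via `floc`/`fscale` in `scipy.stats.beta.fit` and assumes the samples lie strictly inside (0, 1):

```python
from scipy import stats

def perform_beta_dist(df_target, df_null, channels=None):
    """Fit a beta distribution on [0, 1] to the importance samples of each
    channel, for the target and null populations separately."""
    if channels is None:
        channels = list(df_target.columns)
    rv_t_dict, rv_n_dict = {}, {}
    for ch in channels:
        for df, out in ((df_target, rv_t_dict), (df_null, rv_n_dict)):
            a, b, loc, scale = stats.beta.fit(df[ch], floc=0, fscale=1)
            out[ch] = stats.beta(a, b, loc=loc, scale=scale)  # frozen rv
    return rv_t_dict, rv_n_dict
```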
-
perform_fap
(df_target, df_null, confidence_level, channels=0)[source]¶ - description:
calculate a FAP for each channel at each glitch
USAGE: channels, list_GPS, list_duration, mat_fap, mat_Test, confidence_level= perform_fap(df_target, df_null, confidence_level, channels)
- Parameters
df_target – target samples in pandas frame
df_null – null samples in pandas frame
confidence_level – a confidence level
channels – a list of channels, 0 in default
- Returns
channels: a list of channels in numpy array
list_GPS: a list of GPS times in numpy array
list_duration: a list of durations in numpy array
mat_fap: a matrix of FAP values
mat_Test: a list of the test results {‘pass’, ‘fail’}
confidence_level: the user-defined confidence level used for the test
-
perform_p_belong
(df_target, p_greater_dict, rv_t_dict, rv_n_dict, confidence_level, channels=0)[source]¶ - description:
1. calculate p_belong for each channel in each frequency band in each glitch
2. keep only channels whose p_greater is greater than 0.5
USAGE: list_channels_passed, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level = perform_p_belong(df_target, p_greater_dict, rv_t_dict, rv_n_dict, confidence_level, channels=0)
- Parameters
df_target – target samples in pandas frame
p_greater_dict – a dictionary of p_greater values
rv_t_dict – a dictionary of beta distribution (scipy obj) for the target samples
rv_n_dict – a dictionary of beta distribution (scipy obj) for the null samples
confidence_level – a confidence level
channels – channels (optional)
- Returns
list_channels_passed: a list of channels whose p_greater is above 0.5
list_GPS: a list of GPS times of the target samples
list_duration: a list of durations
mat_p_belong: a matrix of p_belong with glitch samples in rows and passed channels in columns
mat_Test: a matrix of the test results {‘pass’, ‘fail’}
confidence_level: the confidence level used
-
perform_p_greater
(df_target, rv_t_dict, rv_n_dict, confidence_level, channels=0)[source]¶ - description:
calculate p_greater for channels; note that p_greater is set to 0.5 if it is not monotonically growing for the target samples
USAGE: channels, p_greater_dict, list_p_greater, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level = perform_p_greater(df_target, rv_t_dict, rv_n_dict, confidence_level, channels=0)
- Parameters
df_target – target samples in pandas frame
rv_t_dict – a dictionary of beta distribution (scipy obj) for the target samples
rv_n_dict – a dictionary of beta distribution (scipy obj) for the null samples
confidence_level – confidence level
channels – channels
- Returns
channels: a list of channels
p_greater_dict: a dictionary of p_greater
list_p_greater: a list of p_greater
list_p_greater_passed: a list of p_greater where the value is kept if it is greater than the confidence level, zero otherwise
list_p_greater_failed: a list of p_greater where the value is kept if it is less than the confidence level, zero otherwise
list_Test: a list of the test results {‘pass’, ‘fail’}
confidence_level: the confidence level
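One way to read p_greater is P(X_target > X_null) under the two fitted distributions; this Monte-Carlo sketch over frozen scipy objects assumes that reading (helper name hypothetical):

```python
import numpy as np

def p_greater(rv_target, rv_null, n_samples=20000, seed=0):
    """Monte-Carlo estimate of P(X_target > X_null) for two fitted
    frozen scipy distributions."""
    rng = np.random.default_rng(seed)
    x_t = rv_target.rvs(size=n_samples, random_state=rng)
    x_n = rv_null.rvs(size=n_samples, random_state=rng)
    return float((x_t > x_n).mean())
```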
-
perform_point_chisqr_test
(df_target, df_null, confidence_level, channels=0)[source]¶ - description:
perform a single point chi-square test for each channel at each glitch
USAGE: channels, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, p_values, confidence_level= perform_point_chisqr_test(df_target, df_null, confidence_level, channels)
- Parameters
df_target – target samples in pandas frame
df_null – null samples in pandas frame
confidence_level – a confidence level
channels – a list of channels, 0 in default
- Returns
channels: a list of channels in numpy array
list_GPS: a list of GPS times in numpy array
list_duration: a list of durations in numpy array
mat_chsqr_passed: a matrix of “passed” chi-square values (glitch indices in rows, channels in columns) in numpy array; channels in glitches that passed the test have non-zero values, zero otherwise
mat_chsqr_failed: a matrix of “failed” chi-square values (glitch indices in rows, channels in columns) in numpy array; channels in glitches that failed the test have non-zero values, zero otherwise
mat_Test: a list of the test results {‘pass’, ‘fail’}
p_values: a matrix of p-values (glitch indices in rows, channels in columns) in numpy array
confidence_level: the user-defined confidence level used for the test
-
query_targetglitch_null
(path_target_glitch_dataset, path_null_dataset)[source]¶ - description:
load the input files; if loading fails, end the program
- Parameters
path_target_glitch_dataset – a path to the .csv file of a target glitch class
path_null_dataset – a path to .csv file of a null dataset
- Returns
df1: a pandas dataframe of the target glitch class
df0: a pandas dataframe of the null dataset
-
ranking_channels
(list_ranking_statistic, list_Test)[source]¶ - description:
1. sort based on the value of the ranking statistic
2. sort based on the test result, where ‘pass’ comes before ‘fail’
3. find the indices based on sorts 1) and 2)
USAGE: list_sorted_base_index_pass_fail, list_sorted_ranking_statistic_pass_fail, list_sorted_Test_pass_fail = ranking_channels(list_ranking_statistic, list_Test)
- Parameters
list_ranking_statistic – a list of ranking-statistic values
list_Test – a list of the test results {‘pass’, ‘fail’}
- Returns
list_sorted_base_index_pass_fail: the sorted indices
list_sorted_ranking_statistic_pass_fail: the sorted ranking statistics
list_sorted_Test_pass_fail: the sorted test results
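The three-step sort above (statistic descending, with ‘pass’ before ‘fail’) can be sketched as:

```python
import numpy as np

def ranking_channels(list_ranking_statistic, list_Test):
    """Sort channels by ranking statistic (descending), placing all 'pass'
    channels before all 'fail' channels."""
    stat = np.asarray(list_ranking_statistic, dtype=float)
    test = np.asarray(list_Test)
    # sort key: primary = pass-before-fail, secondary = statistic descending
    order = sorted(range(len(stat)),
                   key=lambda i: (0 if test[i] == "pass" else 1, -stat[i]))
    idx = np.asarray(order)
    return idx, stat[idx], test[idx]
```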
-
origli.utilities.utilities.
RemoveChannelUnused
(re_sfchs, PathListChannelUnused)[source]¶ - Description:
K.M. found that some of the channels in the list of safe channels were not used in O2, so gwpy cannot get the time series of those channels. This function removes those channels.
- USAGE:
re_sfchs = RemoveChannelUnused(re_sfchs, ‘/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt’)
- Parameters
re_sfchs – the list of safe channels in numpy array format
- Returns
re_sfchs: the list of safe channels without unused channels in numpy array format
-
origli.utilities.utilities.
SaveTargetAndBackGroundHDF5_OFFLINE
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE
0. assume Listsegments is given by Findglitchlist()
1. take the information of the list of allowed targets and the preceding and following segments
2. whiten a target segment based on the average background segment
3. compute the whitened FFT
4. save the whitened target and background FFTs
USAGE: SaveTargetAndBackGroundHDF5(Listsegments, re_sfchs, IFO, outputpath, outputfilename, mode=’offline’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default
-
origli.utilities.utilities.
SaveTargetAndBackGroundHDF5_ONLINE
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT)[source]¶ - description:
THIS IS USED FOR “ONLINE” MODE
0. assume Listsegments is given by Findglitchlist()
1. take the information of the list of allowed targets and the preceding and following segments
2. whiten a target segment based on the average background segment
3. compute the whitened FFT
4. save the whitened target and background FFTs
Note: this depends on Multiprocess_whitening()
USAGE: SaveTargetAndBackGroundHDF5(Listsegments, re_sfchs, IFO, outputpath, outputfilename, mode=’online’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}
-
origli.utilities.utilities.
SaveTargetAndBackGroundHDF5_TimeShift
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE
0. assume Listsegments is given by Findglitchlist()
1. take the information of the list of allowed targets and the preceding and following segments
2. whiten a target segment based on the average background segment
3. compute the whitened FFT
4. save the whitened target and background FFTs
USAGE: SaveTargetAndBackGroundHDF5(Listsegments, re_sfchs, IFO, outputpath, outputfilename, mode=’offline’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default
-
origli.utilities.utilities.
TimeShiftingSamplePrecedingBGonly
(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]¶ - description:
1. load the Gravity Spy data set (.csv file)
2. get the data about the target glitch class
3. get the subset of target glitches based on the user-defined SNR and confidence-level thresholds
4. accept glitches whose background segments do not coincide with any other glitches
5. return the information of the accepted glitches
USAGE: Listsegments = FindglitchlistOnLineMode(df, Epochstart, Epochend, Commissioning_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, gap, flag)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int )
UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default
gap – a time gap between the target and the background segments in sec, 1 sec in default
position_duration_bfr_centr – proportion of the duration for a target segment around the center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, 0.83 indicates 5/6 is before the center time
TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG queries for target glitches
targetUpperSNR_thre – an upper limit cutoff value of SNR of triggers given by an ETG queries for target glitches
flag – ‘Both’ or ‘Either’ taking both backgronds or either the preceding or the following background, respectively to accept glitches
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments consists of
ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.utilities.
select_some_trials
(iterate, maximum_iterate=5)[source]¶ - description:
this function is used to reduce the number of trials for the off-source window
USAGE: list_index_trials = select_some_trials(iterate, maximum_iterate=5)
- Parameters
iterate – a number of trials
maximum_iterate – a maximum number of trials
- Returns
randomly chosen trials, where the total number is maximum_iterate or less
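As a minimal sketch of this capping behavior (the helper name `select_some_trials_sketch` is hypothetical, not the library's implementation):

```python
import random

def select_some_trials_sketch(iterate, maximum_iterate=5):
    """Return a sorted random subset of trial indices, capped at maximum_iterate."""
    indices = list(range(iterate))
    if iterate <= maximum_iterate:
        return indices  # few enough trials: keep them all
    return sorted(random.sample(indices, maximum_iterate))

# Example: 20 candidate trials reduced to at most 5
chosen = select_some_trials_sketch(20, maximum_iterate=5)
```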
origli.utilities.veto_utilities
¶
file name: veto_utilities.py
this file contains the utilities to be used for finding veto channel
-
origli.utilities.veto_utilities.
BackgroundCut
(df_null, channel, background_upper_cut)[source]¶ - description:
calculate the upper cut of channels using the FAP distribution
USAGE: cut = BackgroundCut(df_null, channel, background_upper_cut)
- Parameters
df_null – null samples in pandas frame
channel – a list of channels
background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426
- Returns
cut: an upper cut of the null sample of those channels
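The upper cut at a given confidence level can be read off as a quantile of the null-sample distribution. A minimal numpy sketch, assuming the null samples are importance values of a single channel (`background_cut_sketch` is a hypothetical stand-in, not the library's code):

```python
import numpy as np

def background_cut_sketch(null_samples, background_upper_cut):
    """Upper cut of a channel's null samples at a given confidence level,
    e.g. background_upper_cut = 0.95449 for a 2-sigma cut."""
    return np.quantile(null_samples, background_upper_cut)

rng = np.random.default_rng(0)
null_samples = rng.random(10000)  # stand-in for a channel's null importance values
cut = background_cut_sketch(null_samples, 0.95449)
```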
-
origli.utilities.veto_utilities.
CreateAllChannels_rho
(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, LowerCutOffFreq, UpperCutOffFreq)[source]¶ - description:
use a single glitch time
query timeseries of all the channels around a glitch
condition (whitening and compare the on- and off-source window)
quantify all the channels (compute values of importance of all the channels)
USAGE: List_Count, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, LowerCutOffFreq, UpperCutOffFreq)
- Parameters
Listsegment – a list of segment parameters
IFO – ifo
channels – a list of safe channels
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}
sigma – an integer to be used for calculating values of importance
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
-
origli.utilities.veto_utilities.
CreateRho
(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma, LowerCutOffFreq, UpperCutOffFreq)[source]¶ - description:
calculate the whitened FFT of the on- and off-source window for a single channel
compute the value of importance for a single channel
- Parameters
full_timeseries – the full time series in gwpy object including on- and off source windows
target_timeseries_start – the start time of the on-source window
target_timeseries_end – the end time of the on-source window
pre_background_start – the start time of the preceding off-source window
pre_background_end – the end time of the preceding off-source window
fol_background_start – the start time of the following off-source window
fol_background_end – the end time of the following off-source window
sigma – an integer to calculate the value of importance
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
- Returns
Count: the importance: a fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
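The importance computation can be sketched as follows. The exact estimator of the off-source upper bound is not spelled out here; this sketch assumes it is mean + sigma*std of the combined off-source bins, which may differ from the library's definition:

```python
import numpy as np

def importance_sketch(on_fft, pre_fft, fol_fft, sigma=2):
    """Fraction of on-source frequency bins exceeding an upper bound of the
    off-source windows; the bound is ASSUMED here to be mean + sigma*std
    of the combined off-source bins (the library's estimator may differ)."""
    off_bins = np.concatenate([pre_fft, fol_fft])
    upper_bound = off_bins.mean() + sigma * off_bins.std()
    return np.mean(on_fft > upper_bound)

# A loud on-source window relative to quiet off-source windows gives importance near 1
on = np.full(128, 10.0)
off = np.ones(128)
rho = importance_sketch(on, off, off, sigma=2)
```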
-
origli.utilities.veto_utilities.
FlagFinder
(Epoch_lt, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
select high ranking witness channels
determine the upper cut of the null samples for those high ranking witness channels
analyze all the glitches using the selected witness channels
make a flag when those channels give importance above the upper cut of the null (flags are made only if ALL the chosen witness channels have values of importance above the upper cut of the null samples)
calculate efficiency and deadtime
USAGE: efficiency, deadtime_frac, df = FlagFinder(Epoch_lt, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq)
- Parameters
Epoch_lt – a list of an epoch
Listsegments – a list of glitches
IFO – ifo
channels – a list of channels, which are expected to be witness channels
list_statistics – a list of ranking statistics, either witness ratio statistics or t-value
num_high_rank_channels_to_be_used – number of high ranking channels to be used for making flag
df_null – null samples in pandas frame
background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426
number_process – number of processors
sigma – an integer to determine the upper bound of the off-source window
PlusHOFT – boolean, whether to analyze hoft
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
- Returns
efficiency: a ratio of glitches that are flagged to total glitches analyzed without the issue of no data being available
deadtime_frac: a ratio of total on-source windows to the total analysis time
df: a matrix of GPS time, duration, SNR, confidence level of classification of glitches, flag, and importance in a pandas frame
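The efficiency and deadtime definitions in the return values can be sketched directly (hypothetical helper, not the library's code):

```python
import numpy as np

def efficiency_and_deadtime_sketch(flags, on_source_durations, analysis_time):
    """flags: boolean array, True where a glitch was flagged;
    on_source_durations: on-source window length in sec per analyzed glitch;
    analysis_time: total analysis time in sec."""
    flags = np.asarray(flags, dtype=bool)
    efficiency = flags.mean()                              # flagged / analyzed glitches
    deadtime_frac = np.sum(on_source_durations) / analysis_time
    return efficiency, deadtime_frac

# 3 of 4 glitches flagged; 6 s of on-source windows in a 600 s analysis
eff, dt = efficiency_and_deadtime_sketch([True, True, False, True],
                                         [2.0, 2.0, 1.0, 1.0], 600.0)
```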
-
origli.utilities.veto_utilities.
FlagFinder_all_witnesses
(Proportion_Duration_Bfr_Centr, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
select high ranking witness channels
determine the upper cut of the null samples for those high ranking witness channels
analyze all the glitches using the selected witness channels
make flags when those channels give importance above the upper cut of the null (flags are made for individual channels)
USAGE: df_flag = FlagFinder_all_witnesses(Proportion_Duration_Bfr_Centr, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq)
- Parameters
Proportion_Duration_Bfr_Centr – a fraction of the on-source window before the peak GPS time
Listsegments – a list of glitches
IFO – ifo
channels – a list of channels, which are expected to be witness channels
list_statistics – a list of ranking statistics, either witness ratio statistics or t-value
num_high_rank_channels_to_be_used – number of high ranking channels to be used for making flag
df_null – null samples in pandas frame
background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426
number_process – number of processors
sigma – an integer to determine the upper bound of the off-source window
PlusHOFT – boolean, whether to analyze hoft
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
- Returns
df_flag: a matrix of GPS time, duration, SNR, confidence level of classification of glitches, importance of the witness channels, and the flags of the witness channels in a pandas frame
-
origli.utilities.veto_utilities.
HierarchyChannelAboveThreshold_single_channel
(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]¶ - description:
calculate the importance: a fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
USAGE: Count = HierarchyChannelAboveThreshold_single_channel(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’)
- Parameters
whitened_fft_target – whitened fft of the on-source window
whitened_fft_PBG – whitened fft of the preceding off-source window
whitened_fft_FBG – whitened fft of the following off-source window
duration – a duration of the on-source window
sampling_rate – sampling rate of a channel
sigma – an integer to determine the upper bound of the off-source window
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
- Returns
Count: importance
-
origli.utilities.veto_utilities.
WitnessFinder
(Listsegments, IFO, re_sfchs_init, sigma, number_process, first_chunk, tolerance, confidence_level, df_null, shuffle='True', PlusHOFT='False', LowerCutOffFreq='None', UpperCutOffFreq='None', freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]¶ - description:
use a list of glitches
analyze the first ‘first_chunk’ glitches with all the channels of ‘re_sfchs_init’
perform one-sided binomial test and Welch one-sided t-test
reject the channels that do NOT pass both tests, i.e., for which the hypothesis that the channel is consistent with null samples cannot be rejected
calculate the error ratio of the t-value of the top ranking channel to the previous t-value
analyze the next glitch using the channels that pass both tests
append the values of importance of the passing channels to the analyzed samples
repeat (3)-(7)
terminate the process when the error ratio reaches the tolerance
USAGE: re_sfchs, MatCount, list_Causal_passed_final, list_t_values_passed_final = WitnessFinder(Listsegments, IFO, re_sfchs_init, sigma, number_process, first_chunk, tolerance, confidence_level, df_null, shuffle=True, PlusHOFT=’False’, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’)
- param Listsegments
a list of glitches
- param IFO
ifo
- param re_sfchs_init
all the safe channels to be used at the beginning
- param sigma
an integer to determine the upper bound of the off-source window
- param number_process
number of processes of a machine
- param first_chunk
the number of samples to be used for the first chunk, where all the channels are to be used
- param tolerance
tolerance number to stop the analysis
- param confidence_level
confidence level for one-sided binomial test and Welch one-sided t-test
- param df_null
null samples in pandas frame, which are expected to have already been created
- param shuffle
boolean, whether to shuffle the list of glitches
- param PlusHOFT
boolean, whether to analyze hoft
- param LowerCutOffFreq
a lower cutoff frequency
- param UpperCutOffFreq
an upper cutoff frequency
- return
re_sfchs: a list of channels that have passed the tests until the tolerance is reached
MatCount: a matrix of importance of the channels that passed the tests
list_Causal_passed_final: a list of witness ratio statistics of the channels that passed the tests
list_t_values_passed_final: a list of t-values of the channels that passed the tests
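The Welch one-sided t-test in the channel-rejection step compares a channel's importance on glitches against its null samples. A numpy sketch of the Welch t-statistic (the library may use scipy or a different implementation):

```python
import numpy as np

def welch_t_sketch(glitch_importance, null_importance):
    """Welch's t-statistic for H1: mean(glitch) > mean(null), unequal variances."""
    a = np.asarray(glitch_importance, dtype=float)
    b = np.asarray(null_importance, dtype=float)
    va = a.var(ddof=1) / a.size  # per-sample variance contributions
    vb = b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

# A channel whose importance is clearly elevated on glitches ranks high
t = welch_t_sketch([0.8, 0.9, 0.85, 0.95], [0.1, 0.2, 0.15, 0.05])
```

A large positive t-value means the null hypothesis (channel consistent with the null samples) can be rejected, so the channel survives as a witness candidate.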
-
origli.utilities.veto_utilities.
make_veto_omicron_in_aux
(epoch_start, epoch_end, IFO, channel, df_foreground, glitch_type, Proportion_Duration_Bfr_Centr, SNR_thresh, df_flag, OutputHDF5_dir, ifostate, N_processes)[source]¶ - description:
query omicron triggers (aux) of a witness channel
find the aux omicron triggers which are coincident with the glitches that are analyzed
find the aux SNR cut which corresponds to the importance cut of this witness channel
find the aux omicron triggers which are coincident with all the glitches with label being studied
veto glitches when the coincident aux triggers have SNR above the aux SNR cut (given in the step 3)
USAGE: rho_cut, snr_cut, deadtime, efficiency, efficiency_over_deadtime, df_target = make_veto_omicron_in_aux(epoch_start, epoch_end, IFO, channel, df_foreground, glitch_type, Proportion_Duration_Bfr_Centr, SNR_thresh, df_flag, OutputHDF5_dir, ifostate, N_processes)
- Parameters
epoch_start – start time of the analysis period
epoch_end – end time of the analysis period
IFO – ifo {‘H1’ or ‘L1’}
channel – a witness channel name, it could be a channel in a particular frequency band
df_foreground – a pandas data frame of all the glitches in the strain channel fed into pychChoo
glitch_type – a glitch type that is focused on
Proportion_Duration_Bfr_Centr – a fraction of the on-source window before the peak GPS time of a glitch
SNR_thresh – a lower SNR threshold to select glitches that are studied
df_flag – a pandas data frame of flagged (including ‘Y’ and ‘N’) of the witness channels for the glitches that have been analyzed with FlagFinder_all_witnesses()
OutputHDF5_dir – an output directory where the omicron triggers of a witness channel are stored
ifostate – state of an ifo
N_processes – number of cores
- Returns
rho_cut: a lower cut of importance of a witness channel
snr_cut: the corresponding SNR cut of this witness channel
deadtime: a fraction of the time that is vetoed during the analysis time
efficiency: a fraction of glitches that are vetoed
efficiency_over_deadtime: the ratio of efficiency over deadtime
df_target: a pandas data frame of this witness, where the glitches that are vetoed are marked as ‘Y’
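The step of finding aux omicron triggers coincident with the analyzed glitches amounts to an interval-membership test. A minimal sketch with hypothetical inputs:

```python
import numpy as np

def coincident_triggers_sketch(trigger_times, window_starts, window_ends):
    """Boolean mask over triggers that fall inside any glitch on-source window."""
    t = np.asarray(trigger_times)[:, None]       # shape (n_triggers, 1)
    starts = np.asarray(window_starts)[None, :]  # shape (1, n_windows)
    ends = np.asarray(window_ends)[None, :]
    return np.any((t >= starts) & (t <= ends), axis=1)

# Hypothetical trigger times against two on-source windows [10, 11] and [95, 98]
mask = coincident_triggers_sketch([10.5, 20.0, 99.0], [10.0, 95.0], [11.0, 98.0])
```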
origli.utilities.condor_utilities
¶
Script name: condor_utilities.py
- Description:
File containing utilities
origli.utilities.burn_in_utilities
¶
-
origli.utilities.burn_in_utilities.
BG_upper_threshold_single_channel_given_freqband
(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]¶ - description:
For a single channel per glitch, this function calculates a list of the background upper threshold across dummy on-source windows
USAGE: list_bg_upper_threshold = BG_upper_threshold_single_channel_given_freqband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’)
- Parameters
list_dummy_duration – a list of dummy on-source windows
list_whitened_fft – a list of the normalized spectra where each element is a spectrum for a given dummy on-source window
sampling_rate – sampling rate of a channel
list_num_trial_used – a list of the numbers of trials per dummy on-source window within the total on-source window
sigma – an integer to determine the upper bound of the off-source window
LowerCutOffFreq – a lower cutoff frequency
UpperCutOffFreq – an upper cutoff frequency
- Returns
list_bg_upper_threshold: a list of the background upper thresholds across dummy on-source windows
-
origli.utilities.burn_in_utilities.
BG_upper_threshold_single_channel_multiband
(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma)[source]¶ - description:
calculate values of the background upper threshold per dummy on-source window and per frequency band
USAGE: MatBGUpperThresh = BG_upper_threshold_single_channel_multiband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma)
- Parameters
list_dummy_duration – a list of dummy on-source windows
list_whitened_fft – a list of the normalized spectra where each element is a spectrum for a given dummy on-source window
sampling_rate – sampling rate of a channel
list_num_trial_used – a list of the numbers of trials per dummy on-source window within the total on-source window
sigma – an integer to determine the upper bound of the off-source window
- Returns
MatBGUpperThresh: values of the background upper threshold, numpy array, with frequency bands in rows (top to bottom) and dummy on-source windows in columns (left to right)
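One column of MatBGUpperThresh (one dummy on-source window length, all frequency bands) can be sketched as below. The per-band estimator is ASSUMED to be mean + sigma*std over the trial bins; the library's actual estimator may differ:

```python
import numpy as np

def bg_upper_thresh_column_sketch(trial_power, band_slices, sigma=2):
    """trial_power: array of shape (n_trials, n_bins) of whitened power per trial
    for ONE dummy on-source window length; band_slices: list of (lo, hi) bin ranges.
    Returns one column of MatBGUpperThresh: a per-band upper threshold."""
    col = []
    for lo, hi in band_slices:
        band = trial_power[:, lo:hi].ravel()
        col.append(band.mean() + sigma * band.std())  # assumed estimator
    return np.array(col)

rng = np.random.default_rng(1)
power = rng.random((20, 64))  # 20 trials, 64 frequency bins (hypothetical)
col = bg_upper_thresh_column_sketch(power, [(0, 32), (32, 64)], sigma=2)
```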
-
origli.utilities.burn_in_utilities.
CreateAllChannels_BGUpperThresh_multband
(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, duration_max, trial_duration_sample)[source]¶ - description:
use a single glitch time
query timeseries of all the channels around a glitch
calculate values of the background upper threshold per dummy on-source window per frequency band
iterate through channels
USAGE: IndexSatisfied, list_mat_BG_upper_thresh, array_dummy_duration, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_BGUpperThresh_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, duration_max = 15, trial_duration_sample = 20)
- Parameters
Listsegment – a list of segment parameters
IFO – ifo
re_sfchs – a list of safe channels
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}
sigma – an integer to be used for calculating values of importance
duration_max – a maximum value of the dummy on-source window in sec
trial_duration_sample – a number of dummy on-source windows within the total on-source window
- Returns
IndexSatisfied: glitch index
list_mat_BG_upper_thresh: a list of matrices per channel, where each element of a matrix is a value of the background upper threshold; numpy array, with frequency bands in rows (top to bottom) and dummy on-source windows in columns (left to right)
array_dummy_duration: numpy array of the dummy on-source windows
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: a list of channels without “IFO:” at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in the h(t)
confi: a confidence level of classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy
-
origli.utilities.burn_in_utilities.
CreateAllChannels_rho_multband_from_bg_up_bd_prior
(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, hdf5_obj_bg_up_thresh)[source]¶ - description:
use a single glitch time
query timeseries of all the channels around a glitch
calculate the normalized spectrum
compute the value of importance for a single channel across frequency bands
USAGE: IndexSatisfied, Mat_Count_in_multibands, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho_multband_from_bg_up_bd_prior(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, hdf5_obj_bg_up_thresh)
- Parameters
Listsegment – a list of segment parameters
IFO – ifo
channels – a list of safe channels
number_process – a number of processes in parallel
PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}
hdf5_obj_bg_up_thresh – an HDF5 object that contains the polynomial parameters of the fit that represents the background upper threshold as a function of on-source window length for all the channels and frequency bands
- Returns
IndexSatisfied: glitch index
Mat_Count_in_multibands: a matrix of rho with frequencies in rows and channels in columns, numpy array
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: a list of channels without “IFO:” at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in the h(t)
confi: a confidence level of classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy
-
origli.utilities.burn_in_utilities.
CreateBGUpperThresh_single_channel_multiband
(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration, sigma)[source]¶ - description:
make list_whitened_fft: a list of numpy arrays of the normalized spectrum, where each element of this list is the normalized spectrum for each trial with a given dummy on-source window. These spectra are concatenated into a vector from left to right, e.g., np.array([sp0_try0, sp1_try0, …, sp0_try1, sp1_try1, …]); hence, this list is [(sp for dummy 0), (sp for dummy 1), …]
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
list_num_trial_used: a list of the numbers of trials per dummy on-source window. Note the number of trials per dummy on-source window varies because of the limited length of the extended total on-source window; the longer the dummy on-source window, the fewer trials are available
calculate values of the background upper threshold per dummy on-source window and per frequency band
USAGE: MatBGUpperThresh, sample_rate = CreateBGUpperThresh_single_channel_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration, sigma)
- Parameters
full_timeseries – the full time series in gwpy object including on- and off source windows
target_timeseries_start – the start time of the on-source window
target_timeseries_end – the end time of the on-source window
array_dummy_duration – numpy array of dummy on-source windows
sigma – an integer to calculate the value of importance
- Returns
MatBGUpperThresh: values of the background upper threshold, numpy array, with frequency bands in rows (top to bottom) and dummy on-source windows in columns (left to right)
sample_rate: a sampling rate of a single channel
-
origli.utilities.burn_in_utilities.
CreateRho_single_channel_multiband_from_bg_up_bd_prior
(full_timeseries, target_timeseries_start, target_timeseries_end, list_poly_para)[source]¶ - description:
calculate the normalized spectrum
compute the value of importance for a single channel across frequency bands
USAGE: Counts_in_multibands, sample_rate = CreateRho_single_channel_multiband_from_bg_up_bd_prior(full_timeseries, target_timeseries_start, target_timeseries_end, list_poly_para)
- Parameters
full_timeseries – the full time series in gwpy object including on- and off source windows
target_timeseries_start – the start time of the on-source window
target_timeseries_end – the end time of the on-source window
list_poly_para – a list of polynomial fit of the background upper threshold per freq band
- Returns
Counts_in_multibands: values of importance in different frequency bands, where importance is a fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
sample_rate: a sampling rate of a single channel
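Given list_poly_para, evaluating the background upper threshold per band for a given on-source window length is a polynomial evaluation. The coefficient layout below (highest degree first, np.polyval order) is an assumption for illustration:

```python
import numpy as np

# Hypothetical example: one polynomial per frequency band, fitting
# background upper threshold vs. on-source window length in seconds.
list_poly_para = [
    np.array([0.01, 0.50]),   # band 0: thresh ~= 0.01 * duration + 0.50
    np.array([-0.02, 0.90]),  # band 1: thresh ~= -0.02 * duration + 0.90
]

duration = 2.0  # on-source window length in seconds
thresholds = np.array([np.polyval(p, duration) for p in list_poly_para])
```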
-
origli.utilities.burn_in_utilities.
FindBGlis_extendBG
(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf)[source]¶ - description: get the null samples whose durations are drawn from the target set. Only the subset of the null samples whose on-source windows do not coincide with any other glitches is chosen.
load the glitch file (.csv file)
the target samples
create random time stamps with their durations drawn from the target set
accept the random time stamps where their on-source windows do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = FindBGlis_extendBG(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff, TriggerPeakFreqUpperCutoff, targetUpperSNR_thre)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int)
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 by default
gap – a time gap between the target and the background segments in sec, 1 sec by default
TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
targetUpperSNR_thre – the upper SNR threshold for selecting the target set
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments consists of
ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
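The acceptance step (keeping only random time stamps whose on-source windows do not coincide with any other glitches) is an interval-overlap rejection. A minimal sketch with hypothetical inputs:

```python
import numpy as np

def accept_nonoverlapping_sketch(window_starts, window_ends, glitch_starts, glitch_ends):
    """Keep candidate on-source windows [start, end] that do not overlap any
    known glitch interval. Two intervals overlap iff start_a < end_b and start_b < end_a."""
    ws = np.asarray(window_starts)[:, None]
    we = np.asarray(window_ends)[:, None]
    gs = np.asarray(glitch_starts)[None, :]
    ge = np.asarray(glitch_ends)[None, :]
    overlaps = (ws < ge) & (gs < we)      # (n_windows, n_glitches) overlap matrix
    return ~np.any(overlaps, axis=1)      # True where the window is clean

# Window [0, 1] overlaps glitch [0.5, 2]; window [10, 11] is clean
keep = accept_nonoverlapping_sketch([0.0, 10.0], [1.0, 11.0], [0.5, 50.0], [2.0, 51.0])
```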
-
origli.utilities.burn_in_utilities.
FindBGlistBurnIn
(state, duration_max, number_trials, step, outputMother_dir, df, Epoch_lt, IFO, BGSNR_thre, UserDefinedDuration, gap)[source]¶ description: From the random time stamps created with FindRadomlistPointsForBurnIn(), select the subset in which the on-source windows do not overlap with any other glitches. Note that if the on-source window can be extended, it will be.
load glitch data set (.csv file)
accept time stamps where their on-source windows do not coincide with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = FindBGlistBurnIn(state, duration_max, number_trials, step, outputMother_dir, df, Epoch_lt, IFO, BGSNR_thre, UserDefinedDuration, gap)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int)
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 by default
gap – a time gap between the target and the background segments in sec, 1 sec by default
flag – ‘Both’ or ‘Either’, taking both backgrounds or either the preceding or the following background, respectively, to accept glitches
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments consists of
ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.burn_in_utilities.
FindRadomlistPointsForBurnIn
(state, IFO, Epoch_lt, number_samples, step, duration_max, outputMother_dir)[source]¶ - description: create random time stamps whose durations are uniformly distributed in log10 between 0.02 sec and duration_max
within an epoch, create a list of synthetic points with randomly chosen durations
make pandas frame dataset
USAGE: df = FindRadomlistPointsForBurnIn(state, IFO, Epoch_lt, number_samples, step, duration_max, outputMother_dir)
- Parameters
state – IFO state {observing, nominal-lock}
IFO – an observer {H1, L1}
Epoch_lt – a list of epochs
number_samples – number of samples picked up
step – step of data points in sec
duration_max – a maximum value of the duration in sec
outputMother_dir – an output directory in which the data set is stored
- Returns
df: synthetic random data points within an epoch
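Durations uniform in log10 between 0.02 sec and duration_max can be drawn as follows (hypothetical helper, seeded for reproducibility; not the library's implementation):

```python
import numpy as np

def random_durations_sketch(number_samples, duration_max, seed=0):
    """Durations uniformly distributed in log10 between 0.02 sec and duration_max."""
    rng = np.random.default_rng(seed)
    log_d = rng.uniform(np.log10(0.02), np.log10(duration_max), size=number_samples)
    return 10.0 ** log_d  # back from log10 space to seconds

durations = random_durations_sketch(1000, 15.0)
```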
-
origli.utilities.burn_in_utilities.
FindglitchlistextendBG
(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf)[source]¶ - description:
glitch data set (.csv file)
get the data about the target glitch class
get the subset of the target glitches based on SNR and confidence level threshold a user defines
the preceding and following BGs are 64 sec long for every sample, regardless of overlap with any other glitches
return the info of the accepted glitches
USAGE: Listsegments = FindglitchlistextendBG(df, Epochstart, Epochend, Commissioning_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=np.inf)
- Parameters
df – GravitySpy meta data in pandas format
Epochstart – starting time of an epoch
Epochend – end time of an epoch
Commissioning_lt – commissioning time in list
TargetGlitchClass – a target glitch class name (str)
IFO – a type of interferometer (H1, L1, V1) (str)
BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int
targetSNR_thre – a lower threshold of SNR for target glitches, float or int
Confidence_thre – a threshold of confidence level (float or int)
UpperDurationThresh – an upper bound of duration in sec (float or int)
LowerDurationThresh – a lower bound of duration in sec (float or int)
UserDefinedDuration – user defined duration of a glitch (float or int), 0 by default
gap – a time gap between the target and the background segments in sec, 1 sec by default
position_duration_bfr_centr – proportion of duration for a target segment around a center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, 0.83 indicates 5/6 is before the center time
TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG for target glitches
targetUpperSNR_thre – an upper limit cutoff value of SNR of triggers given by an ETG for target glitches
flag – ‘Both’ or ‘Either’, taking both backgrounds or either the preceding or the following background, respectively, to accept glitches
- Returns
the list of parameters of glitches passing the above thresholds. Listsegments consists of
ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
-
origli.utilities.burn_in_utilities.
HierarchyChannelAboveThreshold_single_channel_multiband_from_bg_up_bd_prior
(whitened_fft_target, sampling_rate, duration, list_poly_para)[source]¶ - description:
calculate the importance for a single channel across frequency bands
USAGE: Counts_in_multibands = HierarchyChannelAboveThreshold_single_channel_multiband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, duration, list_poly_para)
- Parameters
whitened_fft_target – whitened fft of the on-source window
duration – a duration of the on-source window
sampling_rate – sampling rate of a channel
list_poly_para – a list of polynomial fit of the background upper threshold per freq band
- Returns
Counts_in_multibands: values of importance in different frequency bands, numpy array
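The importance defined here is the fraction of on-source frequency bins exceeding the background upper threshold in a band. A minimal re-implementation of that fraction for one band, assuming a flat threshold (the real function evaluates a per-band polynomial fit supplied via list_poly_para):

```python
import numpy as np

def importance_in_band(whitened_fft, freqs, upper_threshold, f_lo, f_hi):
    """Fraction of frequency bins in [f_lo, f_hi) whose whitened amplitude
    exceeds the background upper threshold (hypothetical sketch)."""
    band = (freqs >= f_lo) & (freqs < f_hi)
    amp = np.abs(whitened_fft[band])
    if amp.size == 0:
        return 0.0
    return float(np.mean(amp > upper_threshold))

# Toy example: 1 Hz bins from 0 to 512 Hz, excess amplitude in 100-150 Hz.
freqs = np.linspace(0, 512, 513)
spec = np.ones(513)
spec[100:150] = 5.0
imp = importance_in_band(spec, freqs, upper_threshold=2.0, f_lo=64, f_hi=256)
```

In this toy case 50 of the 192 bins in the 64-256 Hz band are above threshold, so the importance is 50/192.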
-
origli.utilities.burn_in_utilities.
Multiprocess_whitening_for_burn_in
(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration)[source]¶ - description:
This normalizes the spectrum of each trial in the on-source windows at the random time stamps created with FindBGlistBurnIn(). For each dummy on-source window: 1. take the time series of a single channel 2. calculate how many trials are available per dummy on-source window within the extended on-source window 3. iterate over the trials per dummy on-source window
USAGE: list_whitened_fft, sample_rate, DURATION, list_num_trial_used = Multiprocess_whitening_for_burn_in(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration)
- Parameters
full_timeseries – time series comprising target and BGs
target_timeseries_start – a start time of a target segment
target_timeseries_end – an end time of a target segment
array_dummy_duration – numpy array of dummy on-source windows
- Returns
- list_whitened_fft: a list of numpy arrays of the normalized spectrum, where each element of this list is the normalized spectrum for each trial with a given dummy on-source window.
These spectra are concatenated into a vector from left to right, e.g., np.array([sp0_try0, sp1_try0, …, sp0_try1, sp1_try1, …]). Hence, this list is [ (sp for dummy 0), (sp for dummy 1), … ]
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
list_num_trial_used: a list of the number of trials per dummy on-source window. Note the number of trials per dummy on-source window varies because of the limited length of the extended total on-source window:
the longer the dummy on-source window, the fewer trials are available.
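Given the concatenated layout described above, the trials for one dummy on-source window can be recovered by splitting the vector into equal-length pieces. A hypothetical sketch (split_trials is not part of the package):

```python
import numpy as np

def split_trials(concatenated_fft, num_trials):
    """Split np.array([sp_try0, sp_try1, ...]) back into individual trial
    spectra, assuming every trial spectrum has the same length."""
    return np.split(concatenated_fft, num_trials)

# Stand-in data: 3 trials, each with a length-4 spectrum.
spec_len = 4
num_trials = 3
concat = np.arange(spec_len * num_trials, dtype=float)
trials = split_trials(concat, num_trials)
```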
-
origli.utilities.burn_in_utilities.
Multiprocess_whitening_for_target
(full_timeseries, target_timeseries_start, target_timeseries_end)[source]¶ - description:
This is used for multiprocessing the whitening of segments
USAGE: whitened_fft_target, sample_rate, DURATION = Multiprocess_whitening_for_target(full_timeseries, target_timeseries_start, target_timeseries_end)
- Parameters
full_timeseries – time series comprising target and BGs
target_timeseries_start – a start time of a target segment
target_timeseries_end – an end time of a target segment
- Returns
whitened_fft_target: whitened FFT of a target segment
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
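The core of the whitening step is dividing the target's spectrum by the background's amplitude spectrum. A minimal sketch of that idea, assuming a simple rFFT-based estimate (the actual function presumably relies on gwpy's ASD estimation and runs under multiprocessing):

```python
import numpy as np

def whiten_target(target, background):
    """Divide the target's rFFT by the background's amplitude spectrum.
    A minimal sketch of spectral whitening, not the package's actual code."""
    asd = np.abs(np.fft.rfft(background))
    asd[asd == 0] = 1.0          # avoid division by zero in empty bins
    return np.fft.rfft(target) / asd

rng = np.random.default_rng(0)
bg = rng.standard_normal(1024)       # stand-in off-source data
target = rng.standard_normal(1024)   # stand-in on-source data
white = whiten_target(target, bg)    # 513 complex bins for 1024 samples
```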
-
origli.utilities.burn_in_utilities.
SaveDummyTargetHDF5_OFFLINE_burn_in
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False', trial_duration_sample=20)[source]¶ - description:
Save the normalized spectrum for every channel for every glitch, using Multiprocess_whitening_for_burn_in(): 1. iterate over all the samples 2. whiten the on-source window for every channel 3. save the whitened spectrum as an HDF5 file
Each group in the HDF5 file corresponds to a single channel, and each dataset within a group holds the normalized spectrum for a dummy on-source window.
USAGE: SaveDummyTargetHDF5_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT=’False’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to also fetch h(t) data, {‘True’ or ‘False’}, ‘False’ by default
trial_duration_sample – a number of dummy on-source
- Returns
None
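The resulting HDF5 layout, one group per channel and one dataset per dummy on-source window, can be reproduced and read back with h5py. The channel names, dataset names, and file name below are placeholders, not values produced by the package:

```python
import numpy as np
import h5py

# Write a file with the documented layout (placeholder channel names;
# dataset names here are the dummy on-source window durations).
with h5py.File("burn_in_demo.h5", "w") as f:
    for ch in ["H1:AUX-CHANNEL_A", "H1:AUX-CHANNEL_B"]:
        grp = f.create_group(ch)
        for dur in [0.5, 1.0]:
            grp.create_dataset(str(dur), data=np.ones(8) * dur)

# Read it back: one group per channel, one dataset per dummy window.
with h5py.File("burn_in_demo.h5", "r") as f:
    channels = sorted(f.keys())
    spectrum = f["H1:AUX-CHANNEL_A"]["0.5"][:]
```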
-
origli.utilities.burn_in_utilities.
SaveOnlyTargetHDF5_OFFLINE
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE 0. assuming Listsegments is given by Findglitchlist() 1. take the information from the list of allowed target and preceding and following segments 2. whiten a target segment based on the average background segment 3. find the whitened FFT 4. save the whitened target and background FFTs
USAGE: SaveOnlyTargetHDF5_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT=’False’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to also fetch h(t) data, {‘True’ or ‘False’}, ‘False’ by default
-
origli.utilities.burn_in_utilities.
SaveOnlyTargetHDF5_multiband_OFFLINE_from_bg_up_bd_prior
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, path_hdf5_bg_up_thresh, PlusHOFT='False')[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE 1. take the information from the list of allowed target and preceding and following segments 2. query the time series, get the normalized spectrum, and calculate the importance 3. save the whitened target and background FFTs
USAGE: SaveOnlyTargetHDF5_multiband_OFFLINE_from_bg_up_bd_prior(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, path_hdf5_bg_up_thresh, PlusHOFT=’False’)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
path_hdf5_bg_up_thresh – path to an HDF5 file that contains the polynomial fit of the background upper threshold
PlusHOFT – whether to also fetch h(t) data, {‘True’ or ‘False’}, ‘False’ by default
- Returns
None
-
origli.utilities.burn_in_utilities.
SaveUppperThreshodBG_multiband_OFFLINE_burn_in
(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False', duration_max=15, trial_duration_sample=20)[source]¶ - description:
THIS IS USED FOR “OFFLINE” MODE 1. iterate over glitch samples 2. get values of the background upper threshold per dummy on-source window per frequency band for each channel 3. save the values
USAGE: SaveUppperThreshodBG_multiband_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT=’False’, duration_max=15, trial_duration_sample=20)
- Parameters
Listsegments – a list of segment parameters
re_sfchs – a list of safe channels
outputpath – a directory of an output file
outputfilename – a name of an output file
number_process – a number of processes in parallel
PlusHOFT – whether to also fetch h(t) data, {‘True’ or ‘False’}, ‘False’ by default
sigma – an integer to determine the upper bound of the off-source window
duration_max – a maximum value of length of dummy on-source window in sec
trial_duration_sample – a number of dummy on-source window within the total on-source window
- Returns
None
-
origli.utilities.burn_in_utilities.
cal_importance_single_channel_singl_freqband_from_bg_up_bd_prior
(whitened_fft_target, sampling_rate, DURATION, poly_para, LowerCutOffFreq, UpperCutOffFreq)[source]¶ - description:
calculate the importance for a single channel in a given frequency band
USAGE: Count = cal_importance_single_channel_singl_freqband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, DURATION, poly_para, LowerCutOffFreq, UpperCutOffFreq)
- Parameters
whitened_fft_target – on-source window normalized spectrum
sampling_rate – sample rate
DURATION – duration of on-source window
poly_para – polynomial parameters of the fit of the background upper threshold as a function of on-source window length
LowerCutOffFreq – a lower cutoff frequency of the band
UpperCutOffFreq – an upper cutoff frequency of the band
- Returns
Count: the value of importance for the channel in the given frequency band
-
origli.utilities.burn_in_utilities.
clean_duration_for_asd
(duration)[source]¶ - description:
This function cleans up the decimal places in the duration value to avoid an error raised by the ASD estimator in gwpy
- Parameters
duration – duration in sec
- Returns
duration: cleaned duration in sec
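A plausible version of this clean-up is rounding to a fixed number of decimal places; the helper below and the precision it uses are assumptions, not clean_duration_for_asd's actual internals:

```python
# Hypothetical sketch: round the duration so gwpy's ASD estimator
# receives a tidy value instead of a float with stray decimals.
def clean_duration(duration, decimals=3):
    return round(float(duration), decimals)

d = clean_duration(1.0000000000000002)   # stray floating-point residue
```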
-
origli.utilities.burn_in_utilities.
make_bg_up_thesh_interpolate_hdf5
(input_dir, input_hdf5_file)[source]¶ - description:
make a dictionary of values of the background upper threshold across dummy on-source windows and frequency bands per channel
USAGE: list_all_dummy_duration_sorted, mat_ch_sorted_dict = make_bg_up_thesh_interpolate_hdf5(input_dir, input_hdf5_file)
- Parameters
input_dir – input directory
input_hdf5_file – name of an HDF5 file
- Returns
list_all_dummy_duration_sorted: an ascending list of dummy on-source windows
mat_ch_sorted_dict: a dictionary in which each key contains an array of values of the background upper threshold per dummy on-source window per frequency band, where frequency bands are rows from top to bottom and dummy on-source windows are columns from left to right
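The orientation of the returned arrays, frequency bands as rows and dummy on-source windows as columns, can be illustrated with dummy values (the channel name and all numbers below are made up):

```python
import numpy as np

# Three dummy on-source windows, ascending, matching the three columns.
list_all_dummy_duration_sorted = [0.5, 1.0, 2.0]
mat_ch_sorted_dict = {
    "H1:AUX-CHANNEL_A": np.array([
        [1.1, 1.2, 1.3],   # band 0 thresholds across the 3 windows
        [2.1, 2.2, 2.3],   # band 1 thresholds across the 3 windows
    ]),
}

mat = mat_ch_sorted_dict["H1:AUX-CHANNEL_A"]
n_bands, n_windows = mat.shape
threshold_band1_window2 = mat[1, 2]   # band 1, longest window
```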
-
origli.utilities.burn_in_utilities.
save_interpolate_bg_upper_thres_hdf5
(list_all_dummy_duration_sorted, mat_ch_sorted_dict, output_dir, output_hdf5_file, med_abs_sigma=6, poly_degree=10)[source]¶ - description:
fit a polynomial function to the background upper threshold as a function of dummy on-source window per frequency band per channel
save the polynomial parameters to an HDF5 file
USAGE: save_interpolate_bg_upper_thres_hdf5(list_all_dummy_duration_sorted, mat_ch_sorted_dict, output_dir, output_hdf5_file, med_abs_sigma=6, poly_degree=10)
- Parameters
list_all_dummy_duration_sorted – ascending list of dummy on-source windows
mat_ch_sorted_dict – a dictionary in which each key contains an array of values of the background upper threshold per dummy on-source window per frequency band, where frequency bands are rows from top to bottom and dummy on-source windows are columns from left to right
output_dir – output directory
output_hdf5_file – output file name
med_abs_sigma – an integer number of median absolute errors used to remove outliers in fitting
poly_degree – polynomial degree
- Returns
None
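The fitting step described above, outlier rejection by median absolute deviation followed by a polynomial fit, might look like the following sketch; fit_upper_threshold and its internals are assumptions, not the package's actual code:

```python
import numpy as np

def fit_upper_threshold(durations, thresholds, med_abs_sigma=6, poly_degree=10):
    """Drop points beyond med_abs_sigma median-absolute deviations, then
    fit a polynomial of the requested degree (hypothetical sketch)."""
    x = np.asarray(durations, dtype=float)
    y = np.asarray(thresholds, dtype=float)
    mad = np.median(np.abs(y - np.median(y)))
    if mad > 0:
        keep = np.abs(y - np.median(y)) <= med_abs_sigma * mad
    else:
        keep = np.ones_like(y, dtype=bool)
    deg = min(poly_degree, int(keep.sum()) - 1)  # polyfit needs > deg points
    return np.polyfit(x[keep], y[keep], deg)

# Linear toy thresholds with one injected outlier.
x = np.linspace(0.5, 15, 30)
y = 2.0 + 0.1 * x
y[5] = 100.0
coeffs = fit_upper_threshold(x, y, poly_degree=2)
```

With the outlier rejected, the remaining points lie exactly on the line 2.0 + 0.1·x, so the fitted coefficients recover the intercept and slope.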