Executables and modules

bin.OriginFinder

Script name:

OriginFinder.py

Usage:

python OriginFinder.py --configure /path/to/configuration.ini --override section0:option0:value0 section1:option1

where section0 is a section name, option0 is an option in that section, and value0 is the value you choose. If an option is boolean, omit the value and write it as section1:option1.
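As a sketch of how such `section:option[:value]` override strings could be applied to an INI configuration, consider the following; the helper `apply_override` and the section/option names are hypothetical, and OriginFinder's actual parsing may differ.

```python
import configparser

def apply_override(config, override):
    """Apply one hypothetical 'section:option[:value]' override string.

    'section:option:value' sets the option to the given value;
    'section:option' (no value) is treated as a boolean set to 'True'.
    """
    parts = override.split(":", 2)
    if len(parts) == 3:
        section, option, value = parts
    else:  # boolean option given as section:option
        section, option = parts
        value = "True"
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, option, value)
    return config

cfg = configparser.ConfigParser()
apply_override(cfg, "run:snr_threshold:7.5")  # hypothetical option
apply_override(cfg, "run:verbose")            # boolean form
```

Multiple overrides on the command line would simply be applied in sequence, each replacing the value loaded from configuration.ini.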

  1. Online mode
    Description:
    • Online mode automatically creates a list of glitches using an external event trigger generator (Omicron or PyCBC Live in the current version).

    Processes:
    • It automatically queries a list of glitches generated by an external event trigger generator (e.g., up-to-date Omicron or PyCBC Live triggers).

    • It chooses the glitches that do not overlap with any other glitches.

    • It conditions auxiliary channels and quantifies each of them with an "importance" value, which is the fraction of frequency bins of the on-source window above an upper threshold of the off-source window in a given frequency band.

    • It produces a plot of the importance values of the auxiliary channels for each glitch as soon as those values are computed.

    • It creates summary plots: glitch indices vs. channels, SNR of h(t) vs. channels, time vs. channels, and averaged importance vs. channels.

    • It clusters glitches based on PCA and a Gaussian mixture model, plots the clusters, and reports the rate of each cluster.

    • Note that importance values are saved as .csv files.
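The importance statistic used in the steps above can be sketched as follows, assuming a mean + sigma * std upper bound for the off-source window; the function name, thresholding convention, and test data are invented for illustration and are not the package's actual code.

```python
import numpy as np

def importance(on_fft, off_fft, sigma=3):
    """Illustrative importance statistic: the fraction of on-source
    frequency bins whose whitened amplitude exceeds an upper threshold
    (assumed here to be mean + sigma * std) of the off-source window."""
    on = np.abs(on_fft)
    off = np.abs(off_fft)
    threshold = off.mean() + sigma * off.std()
    return np.mean(on > threshold)

# Synthetic example: a quiet off-source spectrum and a loud on-source one.
rng = np.random.default_rng(0)
off = rng.normal(size=1024)   # quiet off-source spectrum
on = off + 5.0                # on-source spectrum with excess power
rho = importance(on, off)     # close to 1 for a loud coincident channel
```

For a quiet channel (on-source statistically identical to off-source), the same function returns a value near zero.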

  2. Offline mode
    Description:
    • Offline mode is similar to online mode; the main difference is that the list of glitches is provided by the user.

    For instance, a user can use a .csv file generated by Gravity Spy (or potentially cWB or other pipelines). This mode is useful when external event trigger generators do not catch the glitches you are interested in.

    Processes:
    • Offline mode operates on a list of glitches supplied by the user; for instance, a Gravity Spy .csv file can be used.

    • It chooses the glitches that do not overlap with any other glitches.

    • It conditions auxiliary channels and quantifies each of them with an "importance" value, which is the fraction of frequency bins of the on-source window above an upper threshold of the off-source window in a given frequency band.

    • It creates summary plots: glitch indices vs. channels, SNR of h(t) vs. channels, time vs. channels, and averaged importance vs. channels.

    • It clusters glitches based on PCA and a Gaussian mixture model, plots the clusters, and reports the rate of each cluster.

    • It produces a plot of the importance values of the auxiliary channels for each individual glitch.

    • Note that importance values are saved as .csv files.

  3. Null sample generator
    Description:
    • In order to perform the statistics (see the following section), the null sample generator creates a null-hypothesis set whose samples are drawn from quiet times.

    Note that a large number of null samples is preferred.

    Processes:
    • It creates randomly distributed synthetic event time periods whose distribution of durations follows that of the target glitches.

    • It keeps only those synthetic time periods (null samples) that do not overlap with anything else, including all the actual glitches and the other null samples.

    • It analyzes null samples in the same way as online or offline mode.
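The null-sample generation above can be sketched as rejection sampling against a list of busy (glitch or already-accepted null) segments; all names here are hypothetical and the actual implementation may differ.

```python
import numpy as np

def make_null_samples(epoch_start, epoch_end, target_durations,
                      busy_segments, n_samples, seed=0):
    """Illustrative sketch: draw random (start, end) null segments inside
    an epoch, with durations resampled from the target-glitch duration
    distribution, rejecting any candidate that overlaps a busy segment
    (a glitch or a previously accepted null sample)."""
    rng = np.random.default_rng(seed)
    accepted = []
    busy = list(busy_segments)
    while len(accepted) < n_samples:
        dur = rng.choice(target_durations)          # resample a duration
        start = rng.uniform(epoch_start, epoch_end - dur)
        end = start + dur
        # reject on overlap with any busy segment
        if any(start < b_end and end > b_start for b_start, b_end in busy):
            continue
        accepted.append((start, end))
        busy.append((start, end))                   # nulls must not overlap
    return accepted

# Hypothetical epoch with one known glitch at 100-105 s.
nulls = make_null_samples(0.0, 1000.0, [0.5, 1.0, 2.0],
                          busy_segments=[(100.0, 105.0)], n_samples=5)
```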

  4. Statistics mode
    Description:
    • It finds channels that are responsible for the target glitches based on various statistical methods using the null samples.

    • Thanks to the null samples, it can find channels that are not obvious by eye.

    • A user must supply a confidence level used to decide which channels reject the null hypothesis.

    • Note that each statistical test is performed for each channel (in each frequency band) independently.

    Processes:
    • It performs a one-sided binomial test on the target glitches as a whole. It also calculates the "witness ratio statistic" (WRS), which is the fraction of the target samples with importance above a threshold, divided by the sum of the fractions of target and null samples above that threshold. The threshold is set to the mean importance of the null samples, based on the experiments conducted so far. Strictly, the threshold should be chosen from a separate set of null samples to obtain unbiased values; however, if the number of null samples is sufficiently large, both sets of null samples are expected to converge to the same threshold. Although the threshold is therefore not truly independent of the null samples used in the statistical test, the WRS is approximately the probability that a channel has glitches in coincidence with the target glitches in h(t).

    • It performs the one-sided Welch's t-test on the target glitches as a whole, makes a plot, and saves the table as a .csv file.

    • It makes a plot combining WRS and t-value to identify the channels that reject both the binomial test and the t-test.

    • It incorporates the binomial test and t-test into the clustering to reduce the number of redundant sub-classes.

    • It analyzes each individual glitch by means of a chi-square test and makes the corresponding plots.
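The witness ratio statistic (WRS) described above can be sketched as follows; the threshold convention (mean importance of the null samples) follows the text, but the function and example values are illustrative, not the package's code.

```python
import numpy as np

def witness_ratio_statistic(target_importance, null_importance):
    """Illustrative WRS: the fraction of target samples above a
    threshold, divided by the sum of the target and null fractions above
    that threshold, where the threshold is the mean importance of the
    null samples."""
    target = np.asarray(target_importance, dtype=float)
    null = np.asarray(null_importance, dtype=float)
    threshold = null.mean()
    f_target = np.mean(target > threshold)
    f_null = np.mean(null > threshold)
    denom = f_target + f_null
    return f_target / denom if denom > 0 else 0.0

# A channel whose target importances clearly exceed the null background:
# f_target = 1.0, f_null = 0.5 here, so WRS = 1.0 / 1.5.
wrs = witness_ratio_statistic([0.8, 0.9, 0.7], [0.1, 0.2, 0.15, 0.05])
```

As the null fraction above threshold shrinks, the WRS approaches 1, consistent with its interpretation as an approximate coincidence probability.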

  5. WitnessFlag mode
    Description:
    • The methodology consists of two major parts: 1) finding witness channels, and 2) constructing flags with the chosen witness channels. The witness channels are determined by statistical tests, which automatically reduce the list of channels and terminate the analysis. Flags can be determined using multiple channels. The following process uses null samples, i.e., samples taken at quiet times.

    Processes:
    Finding witness channels:
    • shuffle a list of glitches

    • take a few glitches and analyze them with all the safe channels

    • perform the one-sided binomial test and one-sided Welch’s t-test against null samples

    • reject the channels which do NOT pass both tests, i.e., which cannot reject the hypothesis that a channel is consistent with the null samples

    • calculate the error ratio of the t-value of the top-ranking channel to its previous t-value

    • analyze the next glitch using the channels that pass both the tests

    • add the values of importance to the target samples

    • repeat steps (3)-(7) above

    • terminate the process when the error ratio reaches the tolerance
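The error-ratio termination criterion used in the loop above might look like the following sketch; the function name and tolerance value are assumptions.

```python
def converged(t_value_history, tolerance=0.05):
    """Illustrative stopping criterion for the iterative witness search:
    compare the top-ranking channel's t-value with its value from the
    previous iteration and stop once the relative change (the "error
    ratio") falls below a tolerance."""
    if len(t_value_history) < 2:
        return False
    prev, curr = t_value_history[-2], t_value_history[-1]
    error_ratio = abs(curr - prev) / abs(prev)
    return error_ratio <= tolerance

# t-values of the top channel after each batch of analyzed glitches:
history = [4.0, 5.0, 5.1]  # hypothetical values
```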

    Finding flags:
    • select high-ranking witness channels

    • determine the upper cut for those high-ranking witness channels using the null samples

    • analyze all the glitches using the selected witness channels

    • make a flag when those channels give importance above the upper cut of the null samples

    • calculate efficiency and deadtime
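The efficiency and deadtime figures of merit in the last step can be sketched as below, assuming non-overlapping flag segments; the function and example numbers are hypothetical.

```python
def efficiency_and_deadtime(glitch_times, flag_segments, total_time):
    """Illustrative figures of merit: efficiency is the fraction of
    glitches falling inside a flag segment; deadtime is the fraction of
    the analyzed time covered by flags (assuming non-overlapping flag
    segments)."""
    flagged = sum(
        any(start <= t <= end for start, end in flag_segments)
        for t in glitch_times
    )
    efficiency = flagged / len(glitch_times)
    deadtime = sum(end - start for start, end in flag_segments) / total_time
    return efficiency, deadtime

# Two of four glitches fall inside flags covering 4 s of a 100 s stretch.
eff, dt = efficiency_and_deadtime(
    glitch_times=[10.0, 20.0, 30.0, 40.0],
    flag_segments=[(9.0, 11.0), (19.0, 21.0)],
    total_time=100.0,
)
```

A good flag has high efficiency at low deadtime; the trade-off is controlled by the upper cut determined from the null samples.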

origli.utilities.const

Script name: const.py

Description:

File containing the list of glitch names. This is intended to be used by import_data_hdf5.py under the bin directory.

origli.utilities.multiband_search_utilities

file name: multiband_search_utilities.py

This file contains the utilities used for the multi-frequency band search.

origli.utilities.multiband_search_utilities.CreateAllChannels_rho_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma)[source]
description:
  1. use a single glitch time

  2. query timeseries of all the channels around a glitch

  3. condition (whitening and compare the on- and off-source window)

  4. quantify all the channels (compute values of importance of all the channels)

USAGE: IndexSatisfied, Mat_Count_in_multibands, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma)

Parameters

Listsegment – a list of segment parameters

IFO – a type of interferometer (H1, L1, V1)

channels – a list of safe channels

number_process – a number of processes in parallel

PlusHOFT – whether to get data of h(t), {'True' or 'False'}

sigma – an integer to be used for calculating values of importance

IndexSatisfied: glitch index
Mat_Count_in_multibands: a matrix of rho with frequencies in rows and channels in columns, numpy array
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: a list of channels without "IFO:" at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in h(t)
confi: a confidence level of the classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy

origli.utilities.multiband_search_utilities.CreateRho_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma)[source]
description:
  1. calculate the whitened FFT of the on- and off-source window for a single channel

  2. compute the value of importance for a single channel

USAGE: Counts_in_multibands, sample_rate = CreateRho_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma)

Parameters
  • full_timeseries – the full time series as a gwpy object, including the on- and off-source windows

  • target_timeseries_start – the start time of the on-source window

  • target_timeseries_end – the end time of the on-source window

  • pre_background_start – the start time of the preceding off-source window

  • pre_background_end – the end time of the preceding off-source window

  • fol_background_start – the start time of the following off-source window

  • fol_background_end – the end time of the following off-source window

  • sigma – an integer used to calculate the value of importance

Returns

Counts_in_multibands: values of importance in different frequency bands, where importance is the fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
sample_rate: a sampling rate of a single channel

origli.utilities.multiband_search_utilities.HierarchyChannelAboveThreshold_single_channel_multiband(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma)[source]
description:

calculate values of importance (the fraction of frequency bins in a frequency range above an upper bound of the off-source window) for a single channel, in different frequency bands

USAGE: Counts_in_multibands = HierarchyChannelAboveThreshold_single_channel_multiband(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma)

Parameters
  • whitened_fft_target – whitened fft of the on-source window

  • whitened_fft_PBG – whitened fft of the preceding off-source window

  • whitened_fft_FBG – whitened fft of the following off-source window

  • duration – a duration of the on-source window

  • sampling_rate – sampling rate of a channel

  • sigma – an integer to determine the upper bound of the off-source window

Returns

Counts_in_multibands: values of importance in different frequency bands, numpy array
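The band-wise importance computation this function performs can be sketched as follows; the mean + sigma * std upper bound, the example band edges, and all names are assumptions rather than the actual implementation.

```python
import numpy as np

def importance_per_band(on_psd, off_psd, freqs, freq_bands, sigma=3):
    """Illustrative sketch: within each frequency band, count the
    fraction of on-source bins above an upper bound (assumed here to be
    mean + sigma * std) of the off-source bins in that band."""
    counts = []
    for f_lo, f_hi in freq_bands:
        mask = (freqs >= f_lo) & (freqs < f_hi)
        off = off_psd[mask]
        threshold = off.mean() + sigma * off.std()
        counts.append(np.mean(on_psd[mask] > threshold))
    return np.array(counts)

# Synthetic spectrum with excess power only in the 128-256 Hz band.
freqs = np.linspace(0, 512, 1024)
rng = np.random.default_rng(1)
off = np.abs(rng.normal(size=1024))
on = off.copy()
on[(freqs >= 128) & (freqs < 256)] += 10.0
rho = importance_per_band(on, off, freqs, [[1, 128], [128, 256], [256, 512]])
```

Only the band containing the excess power yields an importance near 1; the other bands stay near 0, which is what makes the band-resolved search informative.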

class origli.utilities.multiband_search_utilities.PlotTableAnalysis_multiband[source]

Bases: object

CreateMatCount_multiband(g_Individual=None)[source]
description:
  1. find counts for each glitch

  2. stack over all the glitches

  3. make a matrix comprising importance versus channels

dependencies: self.HierarchyChannelAboveThreshold(g, LowerCutOffFreq, UpperCutOffFreq)

USAGE: MatCount, ListChannelName, ListSNR, ListConf, ListGPS, ListDuration, ListID, mat_rho, freq_bands, ListOriginalChannelName = CreateMatCount_multiband() # for all glitches

MatCount, ListChannelName, SNR, Conf, GPS, Duration, ID, mat_rho, freq_bands, ListOriginalChannelName = CreateMatCount_multiband(g_Individual) # for an individual glitch

Parameters

g_Individual – a HDF5 file group for a glitch

Returns

MatCount: a matrix comprising importance versus channels
ListChannelName: a list of channel names, combined with frequency band information
ListSNR: a list of SNRs
ListConf: a list of confidence levels
ListGPS: a list of GPS times
ListDuration: a list of durations
ListID: a list of IDs
mat_rho: a matrix of rho with frequencies in rows and channels in columns, numpy array
freq_bands: a matrix of frequency bands
ListOriginalChannelName: a list of original channel names

PlotCausalityVSChannelMultiBand(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the witness ratio statistics of channels in multiple frequency bands; the cells which do not pass the test are masked

USAGE: PlotCausalityVSChannelMultiBand(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)

Parameters
  • list_Causal_passed – a list of the causal probabilities that passed the one-tailed binomial test; zero otherwise

  • list_Causal_fail – a list of the causal probabilities that failed the one-tailed binomial test; zero otherwise

  • list_causal_passed_err – a list of the errors of the causal probabilities that passed the one-tailed binomial test; zero otherwise

  • list_causal_failed_err – a list of the errors of the causal probabilities that failed the one-tailed binomial test; zero otherwise

  • list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’

  • ListChannelName – a list of channel names

  • output_dir – (only used for all glitches)

  • output_file – (only used for all glitches)

  • BinomialTestConfidence – binomial test confidence level

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

PlotCausalityVSChannelMultiBandNoMaskNoTable(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the witness ratio statistics of channels in multiple frequency bands, without mask and table

USAGE: PlotCausalityVSChannelMultiBandNoMaskNoTable(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)

Parameters
  • list_Causal_passed – a list of the causal probabilities that passed the one-tailed binomial test; zero otherwise

  • list_Causal_fail – a list of the causal probabilities that failed the one-tailed binomial test; zero otherwise

  • list_causal_passed_err – a list of the errors of the causal probabilities that passed the one-tailed binomial test; zero otherwise

  • list_causal_failed_err – a list of the errors of the causal probabilities that failed the one-tailed binomial test; zero otherwise

  • list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’

  • ListChannelName – a list of channel names

  • output_dir – (only used for all glitches)

  • output_file – (only used for all glitches)

  • BinomialTestConfidence – binomial test confidence level

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

PlotFrequencyVSChannel(glitchtype, SNR, Conf, GPS, Duration, ID, URL, mat_rho, ListOriginalChannelName, freq_bands, output_dir, output_file)[source]
description:

make a plot of frequencies versus channels for a glitch

dependencies: CreateChannelTicks()

USAGE: PlotFrequencyVSChannel(glitchtype, SNR, Conf, GPS, Duration, ID, URL, mat_rho, ListOriginalChannelName, freq_bands, output_dir, output_file)

Parameters
  • glitchtype – a type of a glitch

  • SNR – SNR in h(t)

  • Conf – classification confidence level

  • GPS – a gps time

  • Duration – a duration of a glitch

  • ID – Gravity Spy ID

  • URL – the Q-transform in h(t) of a glitch stored in Gravity Spy

  • mat_rho – a matrix of rho with frequencies in rows and channels in columns, numpy array

  • ListOriginalChannelName – a list of original channel names

  • freq_bands – a matrix of frequency bands

  • output_dir

  • output_file

Returns

None

PlotIndividualFCS_ImportanceVSChannel_multiband(glitchtype, IFO, GravitySpy_df, output_dir, mode='offline', sigma=None, Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)[source]
description:
  1. load a file comprising all glitches in a class

  2. create a plot comprising frequency versus channel & importance versus channel
    • dependency:

      self.CreateChannelTicks(ListChannel), self.make_subset_channel_based_on_samplingrate(), self.CreateMatCount(), self.PlotImportanceVSChannel()

  3. save a plot

dependencies: make_subset_channel_based_on_samplingrate(), CreateChannelTicks(), CreateMatCount(), PlotImportanceVSChannel()

USAGE: PlotIndividualFCS_ImportanceVSChannel_multiband(glitchtype, IFO, GravitySpy_df, output_dir, mode='offline', sigma=None, Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)

Parameters
  • glitchtype – a type of glitch, used to create the name of a plot

  • IFO – a type of IFO, used in the name of a plot

  • GravitySpy_df – Gravity Spy metadata in a pandas DataFrame

  • output_dir – an output directory

  • mode – ‘offline’ or ‘online’

  • sigma – an integer to determine the upper bound of the off-source window

  • Listsegments – a list of allowed glitches, used for online mode only, None by default

  • re_sfchs – a list of safe channels excluding unused channels, used for online mode only, None by default

  • Data_outputpath – a directory for saving an HDF5 file, used for online mode only, None by default

  • Data_outputfilename – a file name for saving an HDF5 file, used for online mode only, None by default

  • PlusHOFT – whether to get data of h(t), {'True', 'False'}, used for online mode only, 'False' by default

  • number_process – a number of processes in parallel, used for online mode only, None by default

Returns

None

Plot_WRS_Welch_t_test_MultiBand(channels, list_Causal_passed, list_Causal_fail, list_Test_binomial, list_t_values_passed, list_t_values_failed, list_t_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the combined result of the WRS and the one-sided Welch t-test

USAGE: Plot_WRS_Welch_t_test_MultiBand(channels, list_Causal_passed, list_Causal_fail, list_Test_binomial, list_t_values_passed, list_t_values_failed, list_t_Test, confidence_level, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_t_values_passed – a list of t-values that pass the test

  • list_t_values_failed – a list of t-values that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_Welch_t_test_MultiBand(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of one-sided Welch t-test

USAGE: Plot_Welch_t_test_MultiBand(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_t_values_passed – a list of t-values that pass the test

  • list_t_values_failed – a list of t-values that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_p_greater_MultiBand(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of p_greater

USAGE: Plot_p_greater_MultiBand(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_p_greater_passed – a list of p_greater that pass the test

  • list_p_greater_failed – a list of p_greater that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

ReconstructFromFlattenedList(flattened_list, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

reconstruct the matrix in its original order from the flattened list

USAGE: mat_originl, mat_flipped = ReconstructFromFlattenedList(flattened_list, freq_bands=Const.freq_bands)

Parameters
  • flattened_list – a flattened list obtained from the matrix using np.flatten(order='F')

  • freq_bands – frequency bands

Returns

mat_originl: a matrix with frequency bands in rows from top to bottom and channels in columns
mat_flipped: a matrix with frequency bands in rows from bottom to top and channels in columns
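Since the list is produced with numpy's flatten(order='F'), the reconstruction amounts to a column-major reshape plus a vertical flip; the following sketch (with invented names) illustrates the idea.

```python
import numpy as np

def reconstruct(flattened, n_bands):
    """Illustrative sketch: rebuild the (bands x channels) matrix from a
    list flattened with numpy's flatten(order='F') (column-major), plus
    the vertically flipped copy used for plotting."""
    flat = np.asarray(flattened)
    mat_original = flat.reshape((n_bands, -1), order="F")
    mat_flipped = np.flipud(mat_original)
    return mat_original, mat_flipped

# Round trip: flatten a 2-band x 3-channel matrix column-major, then rebuild.
mat = np.array([[1, 2, 3],
                [4, 5, 6]])
flat = mat.flatten(order="F")
orig, flipped = reconstruct(flat, n_bands=2)
```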

getHDF5Object()[source]

show an HDF5 file object

USAGE: getHDF5Object()

Returns

a dictionary

setHDF5Object(input_dir, input_file)[source]

set a HDF5 file object

Parameters

input_dir – an input directory containing an HDF5 file

input_file – an input file name of an HDF5 file

Returns

None

origli.utilities.multiband_search_utilities.SaveTargetAndBackGroundHDF5_multiband_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False')[source]
description:

THIS IS USED FOR "OFFLINE" MODE
  0. assume Listsegments is given by Findglitchlist()
  1. take the information of the list of allowed targets and the preceding and following segments
  2. whiten a target segment based on the average background segment
  3. compute the whitened FFT
  4. save the whitened target and background FFTs

Note: this depends on Multiprocess_whitening()

USAGE: SaveTargetAndBackGroundHDF5_multiband_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False')

Parameters
  • Listsegments – a list of segment parameters

  • channels – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • sigma – an integer to determine the upper bound of the off-source window

  • PlusHOFT – whether to get data of h(t), {'True' or 'False'}, 'False' by default

Returns

None

origli.utilities.multiband_search_utilities.SaveTargetAndBackGroundHDF5_multiband_ONLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT)[source]
description:

THIS IS USED FOR "ONLINE" MODE
  0. assume Listsegments is given by Findglitchlist()
  1. take the information of the list of allowed targets and the preceding and following segments
  2. whiten a target segment based on the average background segment
  3. compute the whitened FFT
  4. save the whitened target and background FFTs

Note: this depends on Multiprocess_whitening()

USAGE: SaveTargetAndBackGroundHDF5_multiband_ONLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT)

Parameters
  • Listsegments – a list of segment parameters

  • channels – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • sigma – an integer to determine the upper bound of the off-source window

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}

Returns

None

origli.utilities.utilities

Script name: utilities.py

Description:

File containing utilities

origli.utilities.utilities.FindBGlist(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]
description:
  1. load Gravity Spy data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines

  4. accept glitches whose background segments do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = FindBGlist(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int)

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • flag – 'Both' or 'Either', taking both backgrounds or either the preceding or the following background, respectively, to accept glitches

Returns

the list of parameters of glitches passing the above thresholds; Listsegments consists of

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.utilities.FindRadomlistPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)[source]
description:
  1. within an epoch, create a list of synthetic points at randomly chosen times, with durations following the duration distribution of the target glitch class

  2. make a pandas DataFrame dataset

USAGE: df = FindRadomlistPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)

Parameters
  • state – IFO state {observing, nominal-lock}

  • IFO – an observatory {H1, L1}

  • Epoch_lt – a list of epochs

  • number_samples – number of samples picked up

  • step – step of data points in sec

  • outputMother_dir – an output directory in which the data set is placed

  • df_target – true glitch samples generated by an ETG with SNR above an upper threshold of background

Returns

df: synthetic random data points within an epoch with durations generated from a distribution of a target glitch

origli.utilities.utilities.FindShiftedPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)[source]
description:
  1. within an epoch, create a list of synthetic points by shifting a target glitch class

  2. make a pandas DataFrame dataset

USAGE: df = FindShiftedPoints(state, IFO, Epoch_lt, number_samples, step, outputMother_dir, df_target)

Parameters
  • state – IFO state {observing, nominal-lock}

  • IFO – an observatory {H1, L1}

  • Epoch_lt – a list of epochs

  • number_samples – number of samples picked up

  • step – step of data points in sec

  • outputMother_dir – an output directory in which the data set is placed

  • df_target – true glitch samples generated by an ETG with SNR above an upper threshold of background

Returns

df: synthetic random data points within an epoch with durations generated from a distribution of a target glitch

origli.utilities.utilities.Findglitchlist(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, UpperDurationThresh, LowerDurationThresh, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]
description:
  1. load Gravity Spy data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines

  4. accept glitches whose background segments do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = Findglitchlist(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, UpperDurationThresh, LowerDurationThresh, gap, position_duration_bfr_centr)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int)

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • position_duration_bfr_centr – the proportion of the duration placed before the center time of a target segment; e.g., 0.5 indicates the duration is evenly distributed around the center time, and 0.83 indicates that 5/6 of it is before the center time

  • TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • TriggerPeakFreqUpperCutoff – an upper cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • targetUpperSNR_thre – an upper cutoff value of SNR of triggers given by an ETG for target glitches

  • flag – 'Both' or 'Either', taking both backgrounds or either the preceding or the following background, respectively, to accept glitches

Returns

the list of parameters of glitches passing the above thresholds; Listsegments consists of

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.utilities.FindglitchlistLongestBG(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]
description:
  1. load Gravity Spy data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines

  4. accept glitches whose background segments do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = FindglitchlistLongestBG(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, flag)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epoch_lt – a list of epoch [start, end] GPS times

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int)

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • position_duration_bfr_centr – proportion of the duration of a target segment around the center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, and 0.83 indicates 5/6 of it is before the center time

  • TriggerPeakFreqLowerCutoff – a lower cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • TriggerPeakFreqUpperCutoff – an upper cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • targetUpperSNR_thre – an upper cutoff value of SNR of triggers given by an ETG for target glitches

  • flag – ‘Both’ or ‘Either’: accept glitches requiring both backgrounds, or either the preceding or the following background, respectively

Returns

the list of parameters of glitches passing the above thresholds; Listsegments consists of

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.utilities.Findglitchlist_for_timeseries_analysis(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]
description:
  1. load Gravity Spy data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on the SNR and confidence-level thresholds a user defines

  4. accept glitches whose background segments do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = Findglitchlist_for_timeseries_analysis(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=np.inf, flag=‘Both’)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epoch_lt – a list of epoch [start, end] GPS times

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int)

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • position_duration_bfr_centr – proportion of the duration of a target segment around the center time, e.g., 0.5 indicates the duration is evenly distributed around the center time, and 0.83 indicates 5/6 of it is before the center time

  • TriggerPeakFreqLowerCutoff – a lower cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • TriggerPeakFreqUpperCutoff – an upper cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • targetUpperSNR_thre – an upper cutoff value of SNR of triggers given by an ETG for target glitches

  • flag – ‘Both’ or ‘Either’: accept glitches requiring both backgrounds, or either the preceding or the following background, respectively

Returns

the list of parameters of glitches passing the above thresholds; Listsegments consists of

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.utilities.GrabGPStimesSafechannel(fileid, ifo, SNRthre, glitch, PathSafeChannel, Epochstart, Epochend, Commissioning_lt=None)[source]

description: This imports a file containing the output of GravitySpy.

It takes the GPS times during the O2 run, takes the list of safe channels, and modifies the list so that it works with gwpy.

USAGE: GPSs, ids, re_sfchs = GrabGPStimesSafechannel(‘/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/GravityspyTrainingset/gspy-db-20180813.csv’, ‘L1’, 7, ‘Blip’, ‘/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt’)

Parameters
  • fileid – a file that contains all the metadata of glitches used for the GravitySpy training set

  • ifo – a kind of interferometer {L1, H1, V1}

  • SNRthre – the minimum threshold of SNR, e.g., 7

  • glitch – a kind of glitch

  • PathSafeChannel – the full path of the file of the metadata

  • Epochstart – GPS time when a science run begins, float or int

  • Epochend – GPS time when a science run ends, float or int

  • Commissioning_lt – the set of commissioning times in a list of lists, e.g., [[Cstart1, Cend1], [Cstart2, Cend2]], None by default

Returns

GPSs: a list of GPS times
ids: a list of unique IDs
re_sfchs: a list of safe channels

origli.utilities.utilities.GrabSafechannel(PathSafeChannel)[source]
description:

take the list of the safe channels and modify it so that it works with gwpy

USAGE: re_sfchs = GrabSafechannel(‘/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt’)

Parameters

  • PathSafeChannel – the full path of the file containing the safe channel list

Returns

re_sfchs: a list of safe channels
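A minimal sketch of what GrabSafechannel() might do. The exact rewriting rule applied to each channel name is not documented here, so the underscore-to-colon normalization below is an assumption (gwpy expects names of the form ‘L1:SUS-…’).

```python
def grab_safe_channels(path_safe_channel):
    """Sketch of reading a safe-channel list for use with gwpy.

    ASSUMPTION: lists sometimes write 'L1_SUS-...' while gwpy expects
    'L1:SUS-...', so the first underscore is replaced with a colon
    when no colon is present. Blank lines and comments are skipped.
    """
    channels = []
    with open(path_safe_channel) as f:
        for line in f:
            name = line.strip()
            if not name or name.startswith("#"):
                continue
            if ":" not in name:
                name = name.replace("_", ":", 1)  # assumed normalization
            channels.append(name)
    return channels
```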

class origli.utilities.utilities.IdentifyGlitch[source]

Bases: object

CombinedIndentifyingProcess(IFO, ListSegments, TriggerDir)[source]
description:
  1. use OmimcronTriggerPath() to log in to either the L1 or H1 cluster and get a list of trigger file paths

  2. use CopyOmicroTriggerandUnzip() to copy trigger files and unzip them

  3. iterate with trigger XML files

    3-1. use readXML() to get metadata stored in a .xml file

    3-2. use ExtractOmcronTriggerMetadata() to re-arrange the metadata matrix

  4. save the matrix as .csv file

  5. copy the .csv file into the CIT cluster and go back to the CIT cluster

USAGE: CombinedIndentifyingProcess(‘L1’, ListSegments, ‘/home/kentaro.mogushi/longlived/OmicronTrigger’)

Parameters
  • IFO – ifo {H1, L1, V1}

  • ListSegments – a list of segments

  • TriggerDir – a mother directory of a trigger file

Returns

None

CopyOmicroTriggerandUnzip(input_file, input_dir, TriggerDir, output_dir='OmicronTriggerXML')[source]
description:
  1. load a file comprising all the paths of omicron trigger files you are interested in

  2. copy all the omicron trigger file in your working place

  3. go to an output directory

  4. unzip all the files and replace them with the zipped files

  5. go back to a working directory

USAGE: CopyOmicroTriggerandUnzip(‘omicron.txt’, trigger_dir, trigger_dir)

Parameters
  • input_file – an input file

  • input_dir – an input directory, the current directory by default

  • trigger_dir – a directory right above the output directory

  • output_dir – an output directory where all the omicron trigger files will be stored

Returns

None

ExtractOmcronTriggerMetadata(name_a)[source]
description:
  1. load the metadata matrix, expected to be created by readXML()

  2. re-arrange it for convenience and convert it to a numpy array

  3. return the re-arranged metadata matrix as a numpy array

USAGE: Matdataset = ExtractOmcronTriggerMetadata(name_a)

Parameters

name_a – omicron trigger info (list)

Returns

Matdataset: omicron trigger metadata (numpy array)

FindGlitchNearestGPS(PathMetadataFile, candidate_GPS)[source]
description:
  1. take an omicron trigger metadata file

  2. find the glitch nearest to an input GPS time of interest

  3. replace the label of this glitch from ‘arbitrary’ to ‘candidate’

This function is assumed to be used in DQR. Once GraceDB provides a GPS time, this function labels the Omicron trigger glitch nearest to that time as ‘candidate’, since it can be considered the most significant candidate of an astronomical event. In this way, one can study only the candidate glitch by specifying glitch_type = candidate in a configuration file.

USAGE: FindGlitchNearestGPS(PathMetadataFile, GPS)

Parameters
  • PathMetadataFile – a path to omicron meta data file

  • GPS – the GPS time of interest

Returns

None

FindObservingTimeSegments(IFO, startT, endT, outputMother_dir, state='observing')[source]
description:
  1. take a DataQualityFlag from a server

  2. save its segment data as a HDF file

  3. take active segments

  4. return active segments as a numpy array

USAGE: SegmentsMat, trigger_dir = FindObservingTimeSegments(IFO, startT, endT, outputMother_dir, state)

Parameters
  • IFO – a type of interferometer, ‘L1’ or ‘H1’

  • startT – (float, int or string: e.g., ‘Dec 8 2016’) starting time

  • endT – ending time

  • outputMother_dir – an output directory where the HDF5 file will be stored

  • state – state of an interferometer, {observing, nominal-lock}

Returns

SegmentsMat: active segments in a numpy array
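The gwpy query itself needs a server connection, but the segment arithmetic can be sketched with numpy alone. The helper below is hypothetical: it only shows how already-queried active segments might be clipped to an epoch and packed into a numpy array, as in the returned SegmentsMat.

```python
import numpy as np

def clip_active_segments(active_segments, startT, endT):
    """Sketch: intersect active [start, end] segments with an epoch and
    return them as an (N, 2) numpy array; empty intersections are dropped."""
    clipped = []
    for s, e in active_segments:
        s2, e2 = max(s, startT), min(e, endT)  # intersect with the epoch
        if s2 < e2:                            # keep only non-empty overlaps
            clipped.append([s2, e2])
    return np.array(clipped)
```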

GetTriggerMetadata(ListSegments, IFO, output_dir, number_process, trigger_pipeline='omicron', output_file='TriggerMetadata.csv', channel='GDS-CALIB_STRAIN')[source]
description:
  1. take the path of files storing omicron triggers during a segment

  2. make a metadata in pandas frame

  3. save it as a .csv file

Note: this function works the same as CombinedIndentifyingProcess() but is faster, and it is supposed to support pyCBC triggers as well.

USAGE: GetTriggerMetadata(ListSegments, IFO, output_dir, number_process, trigger_pipeline=’omicron’, output_file=’TriggerMetadata.csv’, channel=’GDS-CALIB_STRAIN’)

Parameters
  • ListSegments – (list of lists) [[s1, e1], [s2, e2], …]; segments of non-observing time are excluded

  • IFO – interferometer (L1, or H1)

  • output_dir – an output directory

  • output_file – a name of an output file

  • number_process – the maximum number of parallel processes

  • trigger_pipeline – trigger method ‘omicron’, ‘pycbc-live’

  • channel – (str) the name of a channel

Returns

ListXML: a list of trigger XML files

OmimcronTriggerPath(ListSegments, TriggerDir, IFO, output_file='omicron.txt', channel='GDS-CALIB_STRAIN')[source]
description:
  1. log-in either Livingston or Hanford cluster

  2. take the path of files storing omicron triggers during a segment

  3. save those paths into an output file

USAGE: trigger_dir = OmimcronTriggerPath(ListSegments, ‘/home/kentaro.mogushi/longlived/OmicronTrigger’, ‘L1’)

Parameters
  • IFO – interferometer (L1, or H1)

  • ListSegments – (list of lists) [[s1, e1], [s2, e2], …]; segments of non-observing time are excluded

  • channel – (str) the name of a channel

  • trigger_dir – (str) an output directory

  • output_file – (str) an output file


Returns

trigger_dir

SaveMetaDataAsCSV(MetaData, output_dir, output_file)[source]
description:
  1. load trigger meta data matrix, which is expected to be created by ExtractOmcronTriggerMetadata()

  2. label these triggers as ‘unknown’

  3. set imgUrl to None

  4. save this matrix as .csv file

USAGE: SaveMetaDataAsCSV(AllMatDataStr, trigger_dir, ‘OmicrontriggerMetadata.csv’)

Parameters
  • MetaData – numpy array, omicron trigger metadata

  • output_file – an output file

  • output_dir – an output directory

Returns

None

calculate_chisqr_weighted_snr(snr, chisq, chisq_dof)[source]
description:

calculate the chi-square weighted SNR. Reference: Macleod et al. 2015, Equation (21)

USAGE: chisqr_weighted_snr = calculate_chisqr_weighted_snr(snr, chisq, chisq_dof)

Parameters
  • snr – SNR

  • chisq – chi-square

  • chisq_dof – chi-square degrees of freedom

Returns

chisqr_weighted_snr: the chi-square weighted SNR
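Assuming the cited equation is the re-weighted (“new”) SNR commonly used in CBC searches, the calculation can be sketched as:

```python
def chisq_weighted_snr(snr, chisq, chisq_dof):
    """Sketch of the re-weighted SNR used in CBC searches.

    ASSUMPTION: this is the standard form
        snr / [((1 + chisq_r**3) / 2)] ** (1/6)   for chisq_r > 1,
        snr                                        otherwise,
    with chisq_r = chisq / chisq_dof; it is assumed (not verified here)
    to match the equation cited in the docstring above.
    """
    chisq_r = chisq / chisq_dof  # reduced chi-square
    if chisq_r <= 1.0:
        return snr
    return snr / ((1.0 + chisq_r ** 3) / 2.0) ** (1.0 / 6.0)
```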

pyCBC_SNR_filter(all_pycbc_triggers_frame, SNR_low_cut, SNR_high_cut)[source]
description:

band pass filter with SNR for pycbc triggers

USAGE: df_SNR_cut = pyCBC_SNR_filter(all_pycbc_triggers_frame, SNR_low_cut=7.5, SNR_high_cut=150)

Parameters
  • all_pycbc_triggers_frame – a pandas frame of all the pyCBC triggers

  • SNR_low_cut – a lower cutoff SNR

  • SNR_high_cut – a higher cutoff SNR

Returns

df_new: pycbc triggers that pass all the conditions defined above

pyCBC_chisqr_weighted_snr_filter(pycbc_triggers_frame, chisqr_weighted_snr_lower_cutoff)[source]
description

high pass filter with chi square weighted SNR for pycbc triggers

USAGE: df_chi_square_weighted_cut = pyCBC_chisqr_weighted_snr_filter(pycbc_triggers_frame, chisqr_weighted_snr_lower_cutoff)

Parameters
  • pycbc_triggers_frame – pyCBC triggers in pandas frame

  • chisqr_weighted_snr_lower_cutoff – a lower cutoff chi-square weighted snr

Returns

df_new: triggers with chi square weighted SNR above chisqr_weighted_snr_lower_cutoff

pyCBC_massratio_filter(pycbc_triggers_frame, low_cutoff_massratio, high_cutoff_massratio)[source]
description

band pass filter with mass ratio for pycbc triggers

USAGE: df_massratio_cut = pyCBC_massratio_filter(pycbc_triggers_frame, low_cutoff_massratio, high_cutoff_massratio)

Parameters
  • pycbc_triggers_frame – pyCBC triggers in pandas frame

  • low_cutoff_massratio – a lower cutoff of the mass ratio

  • high_cutoff_massratio – a higher cutoff of the mass ratio

Returns

df_new: triggers with mass ratio between low_cutoff_massratio and high_cutoff_massratio

pyCBC_query_outlier(pycbc_triggers_frame, BinNum, Nsigma, cut)[source]
description:
  1. bin the triggers with values of log10 of chi-square per DOF

  2. calculate the lower bound of log10 of chi-square per DOF in each bin

  3. (cut = ‘median’) calculate the median of log10 of SNR and of log10 of chi-square per DOF in each bin; (cut = ‘mad’) calculate the upper bound of log10 of SNR in each bin, where the lower bound is the median minus the median absolute deviation and the upper bound is the median plus the median absolute deviation

  4. polynomial-fit the upper bound of log10 of SNR as a function of the lower bound of log10 of chi-square per DOF

  5. split the triggers into loud and quiet using the polynomial fit

USAGE: pycbc_loud = pyCBC_query_outlier(pycbc_triggers_frame, BinNum=50, Nsigma=1, cut=’median’)

Parameters
  • pycbc_triggers_frame – pycbc triggers in pandas frame

  • BinNum – the number of bins of a histogram for log10 of chi-square per degree of freedom

  • Nsigma – an integer to determine the upper bound of the quiet triggers

  • cut – a method of cut {median or mad}

Returns

pycbc_loud: loud triggers
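The median/MAD split above can be sketched as follows. This illustrates the general technique only; the binning choices, the exact bound definitions, and the column names are assumptions, not the packaged code.

```python
import numpy as np
import pandas as pd

def split_loud_triggers(df, bin_num=20, nsigma=1, deg=1):
    """Sketch of the median-based loud/quiet split (details assumed):
    bin triggers in log10(chisq/dof), set a per-bin bound at the median of
    log10(SNR) plus nsigma * MAD, fit the bounds with a polynomial, and
    flag triggers above the fitted curve as loud."""
    logx = np.log10(df["chisq"] / df["chisq_dof"])
    logy = np.log10(df["snr"])
    edges = np.linspace(logx.min(), logx.max(), bin_num + 1)
    centers, bounds = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (logx >= lo) & (logx < hi)
        if sel.sum() < 2:          # skip under-populated bins
            continue
        med = np.median(logy[sel])
        mad = np.median(np.abs(logy[sel] - med))
        centers.append(0.5 * (lo + hi))
        bounds.append(med + nsigma * mad)
    coeffs = np.polyfit(centers, bounds, deg)
    loud = logy > np.polyval(coeffs, logx)
    return df[loud]
```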

pyCBC_template_duration_filter(pycbc_triggers_frame, low_cutoff_duration, high_cutoff_duration)[source]
description:

band pass filter with template duration for pycbc triggers

USAGE: df_template_duration_cut = pyCBC_template_duration_filter(pycbc_triggers_frame, low_cutoff_duration, high_cutoff_duration)

Parameters
  • pycbc_triggers_frame – pyCBC triggers in pandas frame

  • low_cutoff_duration – a lower cutoff of the template duration in sec

  • high_cutoff_duration – a higher cutoff of the template duration in sec

Returns

df_new: triggers with template duration between low_cutoff_duration and high_cutoff_duration

pyCBC_totalmass_filter(pycbc_triggers_frame, low_cutoff_totalmass, high_cutoff_totalmass)[source]
description

band pass filter with total mass for pycbc triggers

USAGE: df_totalmass_cut = pyCBC_totalmass_filter(pycbc_triggers_frame, low_cutoff_totalmass, high_cutoff_totalmass)

Parameters
  • pycbc_triggers_frame – pyCBC triggers in pandas frame

  • low_cutoff_totalmass – a lower cutoff of the total mass in solar mass

  • high_cutoff_totalmass – a higher cutoff of the total mass in solar mass

Returns

triggers with total mass between low_cutoff_totalmass and high_cutoff_totalmass

pycbc_clustering_timeslice(pycbc_trigger, IFO, startT, endT, window, extension_duration)[source]
description:

This is a clustering filter.
  1. take a frame of pycbc triggers

  2. pick the trigger with the highest SNR in each window, i.e., each time-sliced bin

  3. create the new columns required to run this code

USAGE: pycbc_trigger = pycbc_clustering_timeslice(pycbc_trigger, IFO, startT, endT, window=0.1, extension_duration=1.5)

Parameters
  • pycbc_trigger – pycbc triggers in pandas frame

  • IFO – ifo

  • startT – start time of an epoch

  • endT – end time of an epoch

  • window – a window length in sec for clustering

  • extension_duration – factor for extending the template duration, e.g., extension_duration = 1.5 makes the duration of the on-source window 1.5 times longer than that of the trigger

Returns

pycbc_trigger: clustered pycbc triggers
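The core of the time-slice clustering (step 2) can be sketched with a pandas groupby; the column names end_time and snr are assumptions about the trigger frame.

```python
import numpy as np
import pandas as pd

def cluster_timeslice(triggers, startT, window):
    """Sketch: keep only the loudest (highest-SNR) trigger per fixed
    time slice of length `window`; column names are assumed."""
    bins = np.floor((triggers["end_time"] - startT) / window)  # slice index
    idx = triggers.groupby(bins)["snr"].idxmax()               # loudest per slice
    return triggers.loc[idx].reset_index(drop=True)
```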

pycbc_clustering_window_around_trigger(pycbc_trigger, IFO, one_sided_window, extension_duration)[source]
description:

This is a clustering filter.
  1. take a frame of pycbc triggers

  2. pick the trigger with the highest SNR in a window around each trigger; the window size is twice “one_sided_window”

  3. create the new columns required to run this code

USAGE: pycbc_trigger = pycbc_clustering_window_around_trigger(pycbc_trigger, IFO, one_sided_window=0.1, extension_duration=1.5)

Parameters
  • pycbc_trigger – pycbc triggers in pandas frame format

  • IFO – ifo

  • one_sided_window – a one-sided window in seconds around a trigger for clustering

  • extension_duration – factor for extending the template duration, e.g., extension_duration = 1.5 makes the duration of the on-source window 1.5 times longer than that of the trigger

Returns

pycbc_trigger: clustered pycbc triggers

readXML(input_dir, input_file)[source]
description:
  1. take metadata stored as a .xml file in a directory named OmicronTriggerXML

  2. return the metadata matrix in the form of a list

USAGE: TriggerMat = readXML(input_dir, input_file)

Parameters
  • input_dir – an input directory

  • input_file – input file name

Returns

name_a: (numpy array)

origli.utilities.utilities.ListUsedSafeChannel(path_list_channel, ifo)[source]
description:
  1. take a path to a .csv file that has lists of channels

  2. remove unused safe channel from a list of safe channels

USAGE: sfchs = ListUsedSafeChannel(path_list_channel, ifo)

Parameters
  • path_list_channel – a path to a list of channels (.csv file)

  • ifo – observatory {L1, H1} in str

Returns

sfchs: subset of safe channels in a numpy array

origli.utilities.utilities.Multiprocess_ConvertToTable(cache_indiv, trigger_pipeline, IFO, Columns)[source]
description:

Multiprocessing does not work if this function is defined inside the class IdentifyGlitch(), so it is defined globally here

Parameters
  • cache_indiv – an individual cache (each trigger file)

  • trigger_pipeline – a name of trigger pipeline {omicron, pycbc-live}

  • IFO – a name of the detector {L1, H1}

  • Columns – a list of columns for the metadata. This is None for omicron triggers, as there is nothing to do

Returns

df_indiv: a metadata of triggers in pandas frame

origli.utilities.utilities.Multiprocess_whitening(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)[source]
description:

This is used for multi processing for whitening segments

Parameters
  • full_timeseries – time series comprising target and BGs

  • target_timeseries_start – a start time of a target segment

  • target_timeseries_end – an end time of a target segment

  • pre_background_start – a start time of a preceding BG

  • pre_background_end – an end time of a preceding BG

  • fol_background_start – a start time of a following BG

  • fol_background_end – an end time of a following BG

Returns

whitened_fft_target: whitened fft of a target segment
whitened_fft_PBG: whitened fft of a preceding segment
whitened_fft_FBG: whitened fft of a following segment
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
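The whitening itself can be sketched in the frequency domain with numpy: estimate an amplitude spectral density (ASD) from the off-source data and divide the on-source FFT by it. This is a crude single-pass sketch with an arbitrary normalization convention; the actual pipeline (via gwpy) does this more carefully with overlapping FFTs and interpolation.

```python
import numpy as np

def whiten_on_source(on_source, off_source, fs):
    """Sketch of frequency-domain whitening (details assumed).

    The ASD is estimated by averaging periodograms of non-overlapping
    off-source chunks the same length as the on-source window."""
    n = len(on_source)
    segs = [off_source[i:i + n] for i in range(0, len(off_source) - n + 1, n)]
    psd = np.mean([np.abs(np.fft.rfft(s)) ** 2 for s in segs], axis=0) / (fs * n)
    asd = np.sqrt(psd)
    # normalization convention chosen for simplicity in this sketch
    return np.fft.rfft(on_source) / (asd * fs)
```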

origli.utilities.utilities.Multiprocess_whitening_timeseries(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)[source]
description:

This is used for multiprocessing for whitening segments; it outputs the absolute values of the whitened timeseries of the on- and off-source windows instead of the frequency series. The deviation of the whitened timeseries is informative, so absolute values are calculated. This function was used to compare the metric evaluated in the time domain with the metric evaluated in the frequency domain. In conclusion, the metric in the frequency domain performs better, so this function is no longer used.

USAGE: whitened_target_timeseries_abs, whitened_pre_off_source_abs, whitened_fol_off_source_abs, sample_rate, DURATION = Multiprocess_whitening_timeseries(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end)

Parameters
  • full_timeseries – time series comprising target and BGs

  • target_timeseries_start – a start time of a target segment

  • target_timeseries_end – an end time of a target segment

  • pre_background_start – a start time of a preceding BG

  • pre_background_end – an end time of a preceding BG

  • fol_background_start – a start time of a following BG

  • fol_background_end – an end time of a following BG

Returns

whitened_target_timeseries_abs: the absolute values of the whitened timeseries in the on-source window
whitened_pre_off_source_abs: the absolute values of the whitened timeseries in the preceding off-source window
whitened_fol_off_source_abs: the absolute values of the whitened timeseries in the following off-source window
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment

class origli.utilities.utilities.PlotTableAnalysis[source]

Bases: object

AutoDetermineTrendBin(SegmentStart, SegmentEnd)[source]

description: automatically determine the bins of the subclass trend plot

USAGE: trend = AutoDetermineTrendBin(SegmentStart, SegmentEnd)

Parameters
  • SegmentStart – start time of a segment

  • SegmentEnd – end time of a segment

Returns

trend: {‘mins’, ‘hours’, ‘days’, ‘month’}
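A sketch of how such a trend bin might be chosen from the segment span; the exact boundaries used by OriginFinder are assumptions.

```python
def auto_trend_bin(segment_start, segment_end):
    """Sketch of choosing a trend bin from the segment span (seconds).

    ASSUMPTION: the boundary values below are illustrative only."""
    span = segment_end - segment_start
    if span < 6 * 3600:        # under ~6 hours: minute bins
        return "mins"
    if span < 3 * 86400:       # under ~3 days: hour bins
        return "hours"
    if span < 60 * 86400:      # under ~2 months: day bins
        return "days"
    return "month"
```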

BinomialDist(k, n, likelihood)[source]

description: Binomial distribution

USAGE: out = BinomialDist(k, n, likelihood)

Parameters
  • k – the number of detections

  • n – the number of trials

  • likelihood – likelihood of detection

Returns

out: the probability of finding k detections out of n trials

BinomialTest(k, N, rate)[source]

description: compute the p-value of a one-tailed Binomial test against a null rate

USAGE: p_value = BinomialTest(k, N, rate)

Parameters
  • k – observed number of successes

  • N – total number of samples

  • rate – rate of successes drawn from a null hypothesis

Returns

p_value: probability of a number of successes equal to or greater than the observed number of successes
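Both BinomialDist() and BinomialTest() can be sketched with the standard library: the one-tailed p-value sums the binomial probability mass from k up to N.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_test_one_tailed(k, n, rate):
    """One-tailed p-value: probability of k or more successes under the null rate."""
    return sum(binomial_pmf(i, n, rate) for i in range(k, n + 1))
```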

Calculate_number_and_rate(df_target, df_null, d_c, err_cal, channels)[source]
description:
  1. use the target and null samples

  2. calculate the numbers of the target and null samples above threshold

  3. calculate the fraction of the target and null samples above threshold

USAGE: channels, list_num1, list_num0, total_sample1, total_sample0, list_Causal, list_causal_err = Calculate_number_and_rate(df_target, df_null, d_c=None, err_cal=False, channels=None)

Parameters
  • df_target – target samples in pandas frame

  • df_null – null samples in pandas frame

  • d_c – a threshold. if it is None, the threshold is the mean value of null samples

  • err_cal – boolean, whether to calculate the error of the statistics or not

  • channels – a list of channels

Returns

channels: a list of channels
list_num1: a list of the numbers of target samples above a threshold
list_num0: a list of the numbers of null samples above a threshold
total_sample1: the number of target samples, float
total_sample0: the number of null samples, float
list_Causal: a list of the statistics
list_causal_err: a list of the errors of the statistics; None if err_cal = False

CreateChannelTicks(ListChannelName)[source]
description:
  1. take the dominant sub-channel names

  2. get a list of index where a sub-sensor name changes

dependencies: (tacitly) CreateMatCount(), make_subset_channel_based_on_samplingrate()

USAGE: CenterTicks, ListInd, ListSubsys = CreateChannelTicks(ListChannelName)

Parameters

ListChannelName – a list of channel names

Returns

CenterTicks: the center positions of each dominant sub-sensor group
ListInd: the edge indices of each dominant sub-sensor group
ListSubsys: a list of dominant sensor names

CreateMatCount(sigma, g_Individual=None, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]
description:
  1. find counts for each glitch

  2. stack over all the glitches

  3. make a matrix comprising importance versus channels

USAGE: MatCount, ListChannelName, ListSNR, ListConf, ListGPS, ListDuration = CreateMatCount(sigma, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’) # for all glitches

MatCount, ListChannelName, ListSNR, ListConf = CreateMatCount(sigma, g_Individual, LowerCutOffFreq=’None’, UpperCutOffFreq=’None’) # for an individual glitch

Parameters
  • sigma – an integer to determine the upper bound of the off-source window

  • g_Individual – a group of a HDF5 file that has values of importance for a glitch

  • LowerCutOffFreq – a lower limit frequency cut to calculate a value of importance

  • UpperCutOffFreq – an upper limit frequency cut to calculate a value of importance

Returns

MatCount: a matrix comprising importance versus channels
ListChannelName: a list of channel names
ListSNR: a list of SNRs
ListConf: a list of confidences

dependencies: self.HierarchyChannelAboveThreshold(g, LowerCutOffFreq, UpperCutOffFreq)

Determine_number_of_subclass(Path_Target_Glitch_SubClassClustered_Dataset, Path_Null_Dataset, test_confidence)[source]
description:
  1. query the clustered target samples

  2. query null samples

  3. perform one-sided binomial test and one-sided Welch t-test on each subclass

  4. count the number of subclasses that have at least one channel passing both tests

USAGE: num_subclass = Determine_number_of_subclass(Path_Target_Glitch_SubClassClustered_Dataset, Path_Null_Dataset, test_confidence)

Parameters
  • Path_Target_Glitch_SubClassClustered_Dataset – a path to the clustered target samples

  • Path_Null_Dataset – a path to null samples

  • test_confidence – a statistical confidence level

Returns

num_subclass: the number of subclasses that have at least one channel passing both tests

FindNumberChannels(g)[source]
description:

count a number of channels that are analyzed

USAGE: NumberOfChannels = FindNumberChannels(g)

Parameters

g – a HDF file group object

Returns

NumberOfChannels: a number of channels that are analyzed

FindNumberSample()[source]
description:

find the number of samples

Returns

the number of samples

FindSubClass(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, upper_number_cluster, applied_Transformation)[source]
description:
  1. use a clustering approach

  2. plot glitch index VS channel grouped by clusters

  3. make a table comprising a list of GPS times for a given cluster

  4. plot Importance VS channel of a given sub-class

  5. make a corresponding table

USAGE: FindSubClass(MatCount, ListChannelName, ListGPS, output_dir)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListGPS – a list of GPS times

  • ListDuration – a list of durations

  • output_dir

  • upper_number_cluster – a value of upper limit of the number of clusters

  • applied_Transformation – decomposition applied {‘PCA’, ‘kernelPCA’, ‘None’}

Returns

None
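The decomposition step (applied_Transformation = ‘PCA’) can be sketched with a numpy SVD; the subsequent Gaussian-mixture clustering of the projected samples is omitted here, and the number of retained components is an illustrative choice.

```python
import numpy as np

def pca_reduce(mat_count, n_components=2):
    """Sketch of the PCA step of the sub-classification.

    Projects the (glitches x channels) importance matrix onto its top
    principal components via SVD of the mean-centered data. The pipeline
    pairs this with a Gaussian mixture model for the actual clustering."""
    centered = mat_count - mat_count.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T  # glitches projected onto top PCs
```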

FrequencyBandColor(PathGlitchChannel_Low, PathGlitchChannel_Mid, PathGlitchChannel_High)[source]
description:
  1. load files that have low, middle, and high frequency band importances (GPSImportanceChannels.csv)

  2. make a matrix that has RGB color based on each importance

  3. output this matrix

USAGE: RGBMat = FrequencyBandColor(PathGlitchChannel_Low, PathGlitchChannel_Mid, PathGlitchChannel_High)

Parameters
  • PathGlitchChannel_Low – a path to a file that has a low frequency band

  • PathGlitchChannel_Mid – a path to a file that has a middle frequency band

  • PathGlitchChannel_High – a path to a file that has a high frequency band

Returns

RGBMat

Get_MatCountGPSDuration(Path_Target_Glitch_SubClassClustered_Dataset)[source]
description:

get an importance matrix, a list of channels, a list of GPS times, and a list of durations from the clustered target samples

Parameters

Path_Target_Glitch_SubClassClustered_Dataset – a path to the clustered target samples

Returns

MatCount: an importance matrix
ListChannelName: a list of channels
ListGPS: a list of GPS times
ListDuration: a list of durations

HierarchyChannelAboveThreshold(g, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]
description:

calculate values of importance, i.e., the fraction of frequency bins of the on-source window above an upper threshold of the off-source window, for each channel at a given glitch time

USAGE: RankingChannelAndCount, ListCount, ListChannelName, GPS, ID, SNR, confidence, duration = HierarchyChannelAboveThreshold(g, sigma, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • g – (hdf5 format) a group having a glitch

  • pt – the area of the distribution integrated up to the threshold

  • sigma – the number of standard deviations of the ratio of medians used to determine important channels

Returns

RankingChannelAndRatioMed: a list of channel names with their ratios, sorted in descending order of ratio
Importantchannels: channels whose ratio is greater than the ratio threshold, along with their ratios
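The bin-counting idea behind the importance statistic can be sketched as follows. The choice of threshold (median plus sigma standard deviations of the off-source magnitudes) and the return value as a fraction are assumptions for illustration; the statistic actually used internally may differ.

```python
import numpy as np

def importance_fraction(on_fft, off_fft, sigma=10):
    """Fraction of on-source frequency bins above an off-source threshold.

    The threshold here is median + sigma * std of the off-source
    magnitudes; `on_fft` and `off_fft` are magnitude spectra of the
    on-source and off-source windows for one channel.
    """
    threshold = np.median(off_fft) + sigma * np.std(off_fft)
    return float(np.mean(on_fft > threshold))
```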

MakeMatrixOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir)[source]
description:

save matrix of GPS, duration, importance as .csv file

USAGE: PlotOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, output_file)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListGPS – a list of GPS times

  • ListDuration – a list of durations

  • output_dir

Returns

None

dependencies: CreateChannelTicks()

PlotCausalityVSChannel(list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

Calculate the probabilities of the causality of channels

USAGE: PlotCausalityVSChannel(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence)

Parameters
  • list_Causal_passed – a list of the probability of the causality that passed the one-tailed Binomial test, otherwise zero

  • list_Causal_fail – a list of the probability of the causality that failed the one-tailed Binomial test, otherwise zero

  • list_causal_passed_err – a list of the error of the causal probability that passed the one-tailed Binomial test, otherwise zero

  • list_causal_failed_err – a list of the error of the causal probability that failed the one-tailed Binomial test, otherwise zero

  • list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’

  • ListChannelName – a list of channel names

  • output_dir – (only used for all glitches)

  • output_file – (only used for all glitches)

  • BinomialTestConfidence – binomial test confidence level

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None
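The one-tailed Binomial test that separates channels into passed and failed lists can be illustrated with a stdlib-only sketch. The null probability p0 = 0.5 and the pass criterion (p-value below 1 minus the confidence level) are assumptions, not the package's exact convention.

```python
from math import comb

def binomial_p_value(k, n, p0=0.5):
    """One-tailed (greater) binomial p-value: P(X >= k) for X ~ Bin(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

def one_tailed_binomial_test(k, n, p0=0.5, confidence=0.95):
    """Return the p-value and 'pass'/'fail' at the given confidence level."""
    p = binomial_p_value(k, n, p0)
    return p, 'pass' if p < (1.0 - confidence) else 'fail'
```

For example, a channel that witnessed 10 of 10 glitches yields a p-value of 0.5**10 against a fair-coin null, which passes at 95% confidence.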

PlotConfidenceVSChannel(MatCount, ListChannelName, ListConf, output_dir, output_file)[source]

make a plot of values of confidence level of Gravity Spy versus channels

dependencies: CreateChannelTicks()

USAGE: PlotConfidenceVSChannel(MatCount, ListChannelName, ListConf, output_dir, output_file)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListConf – a list of confidence levels

  • output_dir

  • output_file

Returns

None

PlotImportanceVSChannel(MatCount, ListChannelName, output_dir, output_file, ax=None)[source]
description:
  • for all glitches in a class, this method works stand-alone

  1. convert number to channel names in x ticks

  2. color background based on channel types

  3. plot a bar showing importance of channels

USAGE: PlotImportanceVSChannel(MatCount, ListChannelName, output_dir, output_file)

  • for an individual glitch in a class, this method is used by ….

  1. convert number to channel names in x ticks

  2. color background based on channel types

  3. plot a bar showing importance of channels

dependencies: CreateChannelTicks(), tacitly CreateMatCount()

USAGE: PlotImportanceVSChannel(MatCount, ListChannelName, None, None, ax) # for on-line mode

USAGE: PlotImportanceVSChannel(MatCount, ListChannelName, output_dir, output_file) # for off-line mode

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • output_dir – (only used for all glitches)

  • output_file – (only used for all glitches)

  • ax – matplotlib.pyplot object (used only for an individual glitch)

Returns

None

PlotIndividualFCS_ImportanceVSChannel(glitchtype, IFO, GravitySpy_df, output_dir, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None', mode='offline', Listsegments=None, re_sfchs=None, Data_outputpath=None, Data_outputfilename=None, PlusHOFT='False', number_process=None)[source]
description:
  1. load a file comprising all glitches in a class

  2. create a plot comprising frequency versus channel & importance versus channel
    • dependency:

self.CreateChannelTicks(ListChannel), self.make_subset_channel_based_on_samplingrate(), self.CreateMatCount(), self.PlotImportanceVSChannel()

  3. save a plot

dependencies: make_subset_channel_based_on_samplingrate(), CreateChannelTicks(), CreateMatCount(), PlotImportanceVSChannel()

USAGE: PlotIndividualFCS_ImportanceVSChannel(glitchtype, IFO, output_dir, sigma, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • glitchtype – the type of glitch, used to create the name of a plot

  • IFO – the type of IFO, used in the name of a plot

  • GravitySpy_df – Gravity Spy metadata in a pandas frame

  • output_dir – an output directory

  • sigma – an integer used for the upper bound of background noise

  • LowerCutOffFreq – the lower cut-off frequency used in CreateMatCount, ‘None’ in default

  • UpperCutOffFreq – the upper cut-off frequency used in CreateMatCount, ‘None’ in default

  • mode – ‘offline’ or ‘online’

  • Listsegments – a list of allowed glitches, which is used for online mode only, None in default

  • re_sfchs – a list of safe channels except unused channels, which is used for online mode only, None in default

  • Data_outputpath – a directory in which to save an HDF5 file, which is used for online mode only, None in default

  • Data_outputfilename – a file name for the saved HDF5 file, which is used for online mode only, None in default

  • PlusHOFT – whether to get data of HOFT {‘True’, ‘False’}, which is used for online mode only, ‘False’ in default

  • number_process – a number of processes in parallel, which is used for online mode only, None in default

Returns

None

PlotOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, output_file)[source]
description:

plot glitch indices versus channels

USAGE: PlotOccurrenceVSChannel(MatCount, ListChannelName, ListGPS, ListDuration, output_dir, output_file)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListGPS – a list of GPS times

  • ListDuration – a list of durations

  • output_dir

  • output_file

Returns

None

dependencies: CreateChannelTicks()

PlotSNRVSChannel(MatCount, ListChannelName, ListSNR, output_dir, output_file)[source]
description:

make a plot of SNR of h(t) versus channels

dependencies: CreateChannelTicks()

USAGE: PlotSNRVSChannel(self, MatCount, ListChannelName, ListSNR, output_dir, output_file)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListSNR – a list of SNRs

  • output_dir

  • output_file

Returns

None

PlotTimeVSChannel(MatCount, ListChannelName, ListGPS, output_dir, output_file, startT, endT, dt)[source]
description:

make a plot of glitch times versus channels, where time flows from top to bottom. If there is more than one glitch in a time bin, those glitches’ values of importance are averaged. If there are no glitches in a time bin, all values of importance are set to zero.

dependencies: CreateChannelTicks()

USAGE: PlotTimeVSChannel(MatCount, ListChannelName, output_dir, output_file, startT, endT, dt)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • ListGPS – a list of GPS times

  • output_dir – a path to an output directory

  • output_file – a name of an output file

  • startT – start time of an epoch

  • endT – end time of an epoch

  • dt – step size in sec of the time slice

Returns

None
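The binning rule described above (average the importance rows within a time bin, leave empty bins at zero) might look like this hypothetical helper; the bin-edge convention is an assumption.

```python
import numpy as np

def bin_importance_by_time(mat_count, list_gps, start_t, end_t, dt):
    """Average importance rows falling in each time bin of width dt.

    mat_count has glitches in rows and channels in columns; bins with
    no glitches keep all-zero importance, as described for the plot.
    """
    edges = np.arange(start_t, end_t + dt, dt)
    binned = np.zeros((len(edges) - 1, mat_count.shape[1]))
    bin_idx = np.digitize(list_gps, edges) - 1
    for b in range(binned.shape[0]):
        mask = bin_idx == b
        if mask.any():
            binned[b] = mat_count[mask].mean(axis=0)
    return binned
```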

Plot_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of one-sided Welch t-test

USAGE: Plot_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_t_values_passed – a list of t-values that pass the test

  • list_t_values_failed – a list of t-values that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_fap(channels_band, list_GPS, list_duration, mat_fap, mat_Test, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the FAP results

USAGE: Plot_fap(channels, list_GPS, list_duration, mat_fap, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)

Parameters
  • channels_band – a list of channels in numpy array

  • list_GPS – a list of GPS times in numpy array

  • list_duration – a list of durations in numpy array

  • mat_fap – a matrix of FAP values

  • mat_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_p_belong(channels_band, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of p_belong

USAGE: Plot_p_belong(channels, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)

Parameters
  • channels – a list of channels in numpy array

  • list_GPS – a list of GPS times in numpy array

  • list_duration – a list of durations in numpy array

  • mat_p_belong – matrix of p_belong

  • mat_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of p_greater

USAGE: Plot_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_p_greater_passed – a list of p_greater above confidence level

  • list_p_greater_failed – a list of p_greater that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Plot_point_chisqur_test(channels_band, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, p_values, confidence_level, output_dir, output_file=None, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

plot the result of point chi-square test

USAGE: Plot_point_chisqur_test(channels, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, confidence_level, output_dir, output_file, freq_bands=Const.freq_bands)

Parameters
  • channels_band – a list of channels in numpy array

  • list_GPS – a list of GPS times in numpy array

  • list_duration – a list of durations in numpy array

  • mat_chsqr_passed – a matrix of “passed” chi-square values where glitch indices are in rows and channels are columns in numpy array note that channels in glitches that passed the test have non-zero values, otherwise zero

  • mat_chsqr_failed – a matrix of “failed” chi-square values where glitch indices are in rows and channels are columns in numpy array note that channels in glitches that failed the test have non-zero values, otherwise zero

  • mat_Test – a list of the test results {‘pass’, ‘fail’}

  • p_values – a matrix of p-values where glitch indices are in rows and channels are columns in numpy array

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Principal_component_analysis(MatCount, frac_component=0.9)[source]
description:

PCA decomposition applied to the feature matrix

USAGE: MatCount, MatCount_inverse_pca = Principal_component_analysis(self, MatCount, frac_component=0.9)

Parameters
  • MatCount – a feature matrix of glitches with samples in rows and features in columns

  • frac_component – a cumulative variance

Returns

MatCount_pca: a feature matrix in PCA space
MatCount_inverse_pca: a reconstructed feature matrix in the original space
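Selecting the number of components from the cumulative variance fraction, plus the inverse reconstruction, can be sketched with numpy's SVD. This is an illustrative reimplementation under that assumption; the package may use scikit-learn's PCA (where `n_components=0.9` has the same meaning) instead.

```python
import numpy as np

def pca_by_variance(mat, frac_component=0.9):
    """Keep the leading principal components whose cumulative explained
    variance reaches `frac_component`, and return both the projected
    matrix and its reconstruction in the original space.
    """
    mean = mat.mean(axis=0)
    centered = mat - mean
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    cum_var = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(cum_var, frac_component)) + 1
    mat_pca = centered @ vt[:k].T                 # projection onto k components
    mat_inverse = mat_pca @ vt[:k] + mean         # reconstruction in original space
    return mat_pca, mat_inverse
```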

Save_fap_csv(channels, list_GPS, list_duration, mat_fap, output_dir)[source]
description:

save the FAP table as csv file

USAGE: Save_fap_csv(channels, list_GPS, list_duration, mat_fap, output_dir)

Parameters
  • channels – a list of channels

  • list_GPS – a list of GPS times

  • list_duration – a list of durations

  • mat_fap – a matrix of FAP of each channel for each glitch

  • output_dir – output directory

Returns

None

Save_p_belong_csv(channels_passed, list_GPS, list_duration, mat_p_belong, output_dir)[source]
description:

save the p_belong table as a .csv file. channels_passed is the list of channels whose p_greater is above 0.5, to avoid misinterpretation of p_belong.

USAGE: Save_p_belong_csv(channels_passed, list_GPS, list_duration, mat_p_belong, output_dir)

Parameters
  • channels_passed – a list of channels

  • list_GPS – a list of GPS times

  • list_duration – a list of durations

  • mat_p_belong – a matrix of p_belong of each channel for each glitch

  • output_dir – output directory

Returns

None

Student_t_independet_test(data1, data2, Welch_test=True)[source]

Independent Student t-test

USAGE: stat, p_value = self.Student_t_independet_test(data1, data0)

Parameters
  • data1 – a population

  • data2 – another population

Returns

t_value: the t statistic
p: the p-value
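With Welch_test=True this is the unequal-variance (Welch) form of the test. A minimal sketch of the statistic and the Welch–Satterthwaite degrees of freedom is below; converting them to a p-value additionally requires the Student-t survival function (e.g. scipy.stats.t.sf(t, df)), which this sketch omits.

```python
import numpy as np

def welch_t_statistic(data1, data2):
    """Welch's t statistic and Welch–Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    n1, n2 = len(data1), len(data2)
    v1, v2 = np.var(data1, ddof=1), np.var(data2, ddof=1)
    se2 = v1 / n1 + v2 / n2                      # squared standard error
    t = (np.mean(data1) - np.mean(data2)) / np.sqrt(se2)
    df = se2**2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df
```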

TableCausality(list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file, BinomialTestConfidence, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]

make a table of witness ratio statistic (WRS) of channels as a .csv file

USAGE: TableCausality(self, list_Causal_passed, list_Causal_fail, list_Test, ListChannelName, output_dir, output_file)

Parameters
  • list_Causal_passed – a list of the probability of the causality that passed the one-tailed Binomial test, otherwise zero

  • list_Causal_fail – a list of the probability of the causality that failed the one-tailed Binomial test, otherwise zero

  • list_Test – a list of results of the Binomial test, ‘pass’ or ‘fail’

  • ListChannelName – a list of channel names

  • output_dir

  • output_file

  • BinomialTestConfidence – binomial test confidence level

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

TableImportance(MatCount, ListChannelName, output_dir, output_file)[source]

make a table of values of importance as a .csv file

USAGE: TableImportance(MatCount, ListChannelName, output_dir, output_file)

Parameters
  • MatCount – a matrix comprising importance versus channels

  • ListChannelName – a list of channel names

  • output_dir

  • output_file

Returns

None

Table_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

save the result of one-sided Welch t-test as a .csv file

USAGE: Table_Welch_t_test(channels, list_t_values_passed, list_t_values_failed, list_Test, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_t_values_passed – a list of t-values that pass the test

  • list_t_values_failed – a list of t-values that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

Table_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level, output_dir, output_file, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:

save the result of p_greater as a .csv file

USAGE: Table_p_greater(channels, list_p_greater_passed, list_p_greater_failed, list_Test, output_dir, output_file)

Parameters
  • channels – a list of channels

  • list_p_greater_passed – a list of p_greater that pass the test

  • list_p_greater_failed – a list of p_greater that fail the test

  • list_Test – a list of the test results {‘pass’, ‘fail’}

  • confidence_level – a confidence level

  • output_dir – an output directory

  • output_file – an output file name

  • freq_bands – frequency bands used for the multi-frequency band search, which is defined in const.py

Returns

None

TrendSubClass(SegmentStart, SegmentEnd, input_dir, input_file, output_dir, trend='months', norm=False)[source]

the input file is supposed to be ClusteredGPSImportanceChannels.csv

USAGE: TrendSubClass(SegmentStart, SegmentEnd, input_dir, input_file, output_dir, trend=’months’)

Parameters
  • SegmentStart – start GPS time of a segment

  • SegmentEnd – end GPS time of a segment

  • input_dir – input directory

  • input_file – an input file

  • output_dir – an output directory

  • trend – a trend ‘months’, ‘days’, ‘hours’ or ‘mins’

Returns

None

calculate_fap(target_mat, null_mat)[source]
description:

calculate the values of fap of each channel in each glitch

USAGE: faps = calculate_fap(target_mat, null_mat)

Parameters
  • target_mat – a matrix of target samples where glitch indices are in rows and channels are in columns

  • null_mat – a matrix of null samples where glitch indices are in rows and channels are in columns

Returns

fap
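One common definition of an empirical FAP, which may approximate what this function computes, is the fraction of null samples at least as large as the observed target value, per channel. `empirical_fap` is a hypothetical sketch under that assumption, not the package code.

```python
import numpy as np

def empirical_fap(target_mat, null_mat):
    """Empirical false-alarm probability per channel per glitch.

    For each (glitch, channel) entry, count the fraction of null-sample
    importances in the same channel that reach or exceed the target value.
    Glitches are in rows, channels in columns.
    """
    fap = np.empty_like(target_mat, dtype=float)
    for j in range(target_mat.shape[1]):
        null_col = null_mat[:, j]
        for i in range(target_mat.shape[0]):
            fap[i, j] = np.mean(null_col >= target_mat[i, j])
    return fap
```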

calculate_reweighted_importance(df1, df_FAP)[source]
description:

reweight the values of importance using FAP. The reweighted importance is defined as rho_new = rho / (FAP + 1).

USAGE: df1_new = calculate_reweighted_importance(df1, df_FAP)

Parameters
  • df1 – a target samples in pandas frame

  • df_FAP – a FAP matrix in pandas frame

Returns

df1_new: reweighted importance matrix in pandas frame
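The stated formula rho_new = rho / (FAP + 1) in array form (the real function operates on pandas frames with matching channel columns; this numpy sketch shows only the arithmetic):

```python
import numpy as np

def reweight_importance(rho, fap):
    """rho_new = rho / (FAP + 1): FAP = 0 keeps rho, FAP = 1 halves it."""
    return np.asarray(rho, dtype=float) / (np.asarray(fap, dtype=float) + 1.0)
```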

chisquare_test(target_mat, null_mat)[source]
description:

calculate the chi-square value of each channel in each glitch and output the chi-square values and corresponding p-values

USAGE: chi2_value, p_value = chisquare_test(self, target_mat, null_mat)

Parameters
  • target_mat – a matrix of target samples where glitch indices are in rows and channels are in columns

  • null_mat – a matrix of null samples where glitch indices are in rows and channels are in columns

Returns

chi2_value: a matrix of chi-square values where glitch indices are in rows and channels are in columns
p_value: a matrix of p-values where glitch indices are in rows and channels are in columns

find_channels(df_target)[source]
description:

find a list of channels from the target samples

USAGE: channels = find_channels(df_target)

Parameters

df_target – target samples in pandas frame

Returns

channels: a list of channels

find_meaing_ful_confidence(df1, df0, BinomialTestConfidence, d_c=None, err_cal=False, channels=0)[source]
description:
  1. load files of target glitches (df1) and a dummy quiet data set (df0)

  2. compute true positive probability

  3. perform one-tailed Binomial test

USAGE: list_Causal_passed, list_Causal_fail, list_causal_passed_err, list_causal_failed_err, list_Test, list_channels= find_meaing_ful_confidence(df1, df0, BinomialTestConfidence, d_c)

Parameters
  • df1 – target glitches in the pandas format

  • df0 – null dataset in the pandas format

  • BinomialTestConfidence – a confidence level used for one-tailed Binomial test

  • d_c – user defined threshold to claim detection of a glitch, None in default

  • channels – a list of channels, 0 in default. If d_c is None, the threshold values are given by the mean value of importance generated by the dummy quiet dataset.

Returns

list_Causal_passed: a list of the probability of the causality that passed the one-tailed Binomial test, otherwise zero
list_Causal_fail: a list of the probability of the causality that failed the one-tailed Binomial test, otherwise zero
list_causal_passed_err: a list of the error of the causal probability that passed the one-tailed Binomial test, otherwise zero
list_causal_failed_err: a list of the error of the causal probability that failed the one-tailed Binomial test, otherwise zero
list_Test: a list of results of the Binomial test, ‘pass’ or ‘fail’
channels: a list of channel names

getHDF5Object()[source]

show an HDF5 file object

USAGE: getHDF5Object()

Returns

a dictionary

make_subset_channel_based_on_samplingrate(g, target_sampling_rate)[source]

dependencies: PlotIndividualFCS_ImportanceVSChannel()

USAGE: X, duration, Listch_label_num, ListChannel, GPS, SNR, confidence, ID = make_subset_channel_based_on_samplingrate(f[‘gps00000’], 256)

Parameters
  • g – target glitch class’s group or a file itself (at a GPS time), HDF5 format

  • target_sampling_rate – the sampling rate used to group channels {256, 512, 1024, 2048, 4096, 8192, 16384}

Returns

X: a matrix comprising the channels with the same sampling rate for a target glitch class at a given time, whitened by a reference
duration: the duration of the time series
Listch_label_num: the list of channel labels
ListChannel: the list of channel names
GPS: the GPS time of this group
SNR: the SNR of this glitch
confidence: the confidence level of this glitch
ID: the Gravity Spy unique ID

perform_Welch_test(df_target, df_null, confidence_level, channels=0)[source]
description:

perform one-sided Welch t-test

USAGE: channels, list_t_values_passed, list_t_values_failed, list_Test = perform_Welch_test(df_target, df_null, confidence_level, channels=None)

Parameters
  • df_target – target samples in pandas frame

  • df_null – null samples in pandas frame

  • confidence_level – a confidence level

  • channels – a list of channels, 0 in default

Returns

channels: a list of channels
list_t_values_passed: a list of t-values that pass the test
list_t_values_failed: a list of t-values that fail the test
list_Test: a list of the test results {‘pass’, ‘fail’}

perform_beta_dist(df_target, df_null, channels=0)[source]
description:

create beta distribution fits for target and null samples

USAGE: rv_t_dict, rv_n_dict = perform_beta_dist(df_target, df_null, channels=0)

Parameters
  • df_target – target samples in pandas frame

  • df_null – null samples in pandas frame

  • channels – channels (optional)

Returns

rv_t_dict: a dictionary of beta distribution (scipy obj) for the target samples rv_n_dict: a dictionary of beta distribution (scipy obj) for the null samples
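A beta fit can be illustrated with a method-of-moments sketch, mapping a sample's mean and variance to the shape parameters (a, b). This is an assumption about the fitting method; the package may instead use a maximum-likelihood fit such as scipy.stats.beta.fit.

```python
import numpy as np

def fit_beta_moments(samples):
    """Method-of-moments fit of Beta(a, b) to samples in (0, 1).

    Matches the first two sample moments:
    a = m * c, b = (1 - m) * c, with c = m * (1 - m) / v - 1.
    """
    m, v = np.mean(samples), np.var(samples)
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common
```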

perform_fap(df_target, df_null, confidence_level, channels=0)[source]
description:

calculate a FAP for each channel at each glitch

USAGE: channels, list_GPS, list_duration, mat_fap, mat_Test, confidence_level= perform_fap(df_target, df_null, confidence_level, channels)

Parameters
  • df_target – target samples in pandas frame

  • df_null – null samples in pandas frame

  • confidence_level – a confidence level

  • channels – a list of channels, 0 in default

Returns

channels: a list of channels in numpy array
list_GPS: a list of GPS times in numpy array
list_duration: a list of durations in numpy array
mat_fap: a matrix of FAP values
mat_Test: a list of the test results {‘pass’, ‘fail’}
confidence_level: the user-defined confidence level used for the test

perform_p_belong(df_target, p_greater_dict, rv_t_dict, rv_n_dict, confidence_level, channels=0)[source]
description:

  1. calculate p_belong for each channel, in each frequency band, in each glitch

  2. keep only channels whose p_greater is greater than 0.5

USAGE: list_channels_passed, list_GPS, list_duration, mat_p_belong, mat_Test, confidence_level = perform_p_belong(df_target, p_greater_dict, rv_t_dict, rv_n_dict, confidence_level, channels=0)

Parameters
  • df_target – target samples in pandas frame

  • p_greater_dict – a dictionary of p_greater values

  • rv_t_dict – a dictionary of beta distribution (scipy obj) for the target samples

  • rv_n_dict – a dictionary of beta distribution (scipy obj) for the null samples

  • confidence_level – a confidence level

  • channels – channels (optional)

Returns

list_channels_passed: a list of channels whose p_greater is above 0.5
list_GPS: a list of GPS times of the target samples
list_duration: a list of durations
mat_p_belong: a matrix of p_belong where glitch samples are in rows and passed channels are in columns
mat_Test: a matrix of test results {‘pass’, ‘fail’}
confidence_level: the confidence level used

perform_p_greater(df_target, rv_t_dict, rv_n_dict, confidence_level, channels=0)[source]
description:

calculate p_greater for channels. Note that p_greater is set to 0.5 if p_greater is not monotonically growing for the target samples.

USAGE: channels, p_greater_dict, list_p_greater, list_p_greater_passed, list_p_greater_failed, list_Test, confidence_level = perform_p_greater(df_target, rv_t_dict, rv_n_dict, confidence_level, channels=0)

Parameters
  • df_target – target samples in pandas frame

  • rv_t_dict – a dictionary of beta distribution (scipy obj) for the target samples

  • rv_n_dict – a dictionary of beta distribution (scipy obj) for the null samples

  • confidence_level – confidence level

  • channels – channels

Returns

channels: a list of channels
p_greater_dict: a dictionary of p_greater
list_p_greater: a list of p_greater
list_p_greater_passed: a list of p_greater where the value is kept if greater than the confidence level, otherwise 0
list_p_greater_failed: a list of p_greater where the value is kept if less than the confidence level, otherwise 0
list_Test: a list of {‘pass’, ‘fail’}
confidence_level: the confidence level
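p_greater can be read as the probability that a random target-importance draw exceeds a random null draw. A sample-based sketch of that quantity is below; it is an illustrative analogue, since the function above computes the corresponding quantity from the fitted beta distributions rather than from raw samples.

```python
import numpy as np

def empirical_p_greater(target, null):
    """Probability that a random target draw exceeds a random null draw,
    estimated over all pairs of the two samples."""
    t = np.asarray(target)[:, None]   # shape (n_target, 1)
    n = np.asarray(null)[None, :]     # shape (1, n_null)
    return float(np.mean(t > n))      # mean over the (n_target, n_null) grid
```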

perform_point_chisqr_test(df_target, df_null, confidence_level, channels=0)[source]
description:

perform a single point chi-square test for each channel at each glitch

USAGE: channels, list_GPS, list_duration, mat_chsqr_passed, mat_chsqr_failed, mat_Test, p_values, confidence_level= perform_point_chisqr_test(df_target, df_null, confidence_level, channels)

Parameters
  • df_target – target samples in pandas frame

  • df_null – null samples in pandas frame

  • confidence_level – a confidence level

  • channels – a list of channels, 0 in default

Returns

channels: a list of channels in numpy array
list_GPS: a list of GPS times in numpy array
list_duration: a list of durations in numpy array
mat_chsqr_passed: a matrix of “passed” chi-square values in numpy array, where glitch indices are in rows and channels are in columns; channels in glitches that passed the test have non-zero values, otherwise zero
mat_chsqr_failed: a matrix of “failed” chi-square values in numpy array, where glitch indices are in rows and channels are in columns; channels in glitches that failed the test have non-zero values, otherwise zero
mat_Test: a list of the test results {‘pass’, ‘fail’}
p_values: a matrix of p-values in numpy array, where glitch indices are in rows and channels are in columns
confidence_level: the user-defined confidence level used for the test

query_targetglitch_null(path_target_glitch_dataset, path_null_dataset)[source]
description:

load files, otherwise, end the program

Parameters
  • path_target_glitch_dataset – a path to the .csv file of a target glitch class

  • path_null_dataset – a path to .csv file of a null dataset

Returns

df1: a pandas dataframe of a target glitch class
df0: a pandas dataframe of a null dataset

ranking_channels(list_ranking_statistic, list_Test)[source]
description:
  1. sort based on the value of the ranking statistic

  2. sort based on the test with “pass” and “fail”, where “pass” comes before “fail”

  3. find the indices based on sorts 1) and 2)

USAGE: list_sorted_base_index_pass_fail, list_sorted_ranking_statistic_pass_fail, list_sorted_Test_pass_fail = ranking_channels(list_ranking_statistic, list_Test)

Parameters
  • list_ranking_statistic

  • list_Test

Returns

list_sorted_base_index_pass_fail: the channel indices in the sorted order
list_sorted_ranking_statistic_pass_fail: the ranking statistics in the sorted order
list_sorted_Test_pass_fail: the test results in the sorted order

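The two-level sort in steps 1–3 can be sketched as follows; `rank_channels` is a hypothetical reimplementation of the ordering rule (pass before fail, descending statistic within each group), not the package code.

```python
def rank_channels(list_ranking_statistic, list_test):
    """Order indices so 'pass' entries precede 'fail' entries, with the
    ranking statistic descending within each group."""
    order = sorted(
        range(len(list_ranking_statistic)),
        # sort key: (is-fail, negated statistic) -> pass first, large stats first
        key=lambda i: (list_test[i] != 'pass', -list_ranking_statistic[i]),
    )
    stats = [list_ranking_statistic[i] for i in order]
    tests = [list_test[i] for i in order]
    return order, stats, tests
```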
setHDF5Object(input_dir, input_file)[source]

set a HDF5 file object

Parameters

  • input_dir – an input directory

  • input_file – an input file name

Returns

None

origli.utilities.utilities.RemoveChannelUnused(re_sfchs, PathListChannelUnused)[source]
Description:

K.M. found that some of the channels in the list of safe channels were not used in O2, so gwpy cannot get the time series of those channels. This function removes those channels.

USAGE:

re_sfchs = RemoveChannelUnused(re_sfchs, ‘/home/kentaro.mogushi/longlived/MachineLearningJointPisaUM/dataset/ListSaveChannel/L1/O2_omicron_channel_list_hvetosafe_GDS.txt’)

Parameters

re_sfchs – the list of safe channels in numpy array format

Returns

re_sfchs: the list of safe channels without unused channels, in numpy array format

origli.utilities.utilities.SaveTargetAndBackGroundHDF5_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]
description:

THIS IS USED FOR “OFFLINE” MODE

  0. assume Listsegments is given by Findglitchlist()

  1. take the information of the list of allowed targets and the preceding and following segments

  2. whiten a target segment based on the average background segment

  3. find the whitened FFT

  4. save the whitened target and background FFTs

Note this depends on

USAGE: SaveTargetAndBackGroundHDF5(Listsegments, re_sfchs, IFO, outputpath, outputfilename, mode=’offline’)

Parameters
  • Listsegments – a list of segment parameters

  • re_sfchs – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default

origli.utilities.utilities.SaveTargetAndBackGroundHDF5_ONLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT)[source]
description:

THIS IS USED FOR “ONLINE” MODE

  0. assume Listsegments is given by Findglitchlist()

  1. take the information of the list of allowed targets and the preceding and following segments

  2. whiten a target segment based on the average background segment

  3. find the whitened FFT

  4. save the whitened target and background FFTs

Note this depends on Multiprocess_whitening()

USAGE: SaveTargetAndBackGroundHDF5_ONLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT)

Parameters
  • Listsegments – a list of segment parameters

  • channels – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}

origli.utilities.utilities.SaveTargetAndBackGroundHDF5_TimeShift(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]
description:

THIS IS USED FOR “OFFLINE” MODE
  0. assumes Listsegments is given by Findglitchlist()

  1. take the information of the list of allowed targets and the preceding and following segments

  2. whiten a target segment based on the average background segment

  3. find the whitened FFT

  4. save the whitened target and background FFTs

USAGE: SaveTargetAndBackGroundHDF5_TimeShift(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')

Parameters
  • Listsegments – a list of segment parameters

  • channels – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ by default

origli.utilities.utilities.TimeShiftingSamplePrecedingBGonly(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf, flag='Both')[source]
description:
  1. load Gravity Spy data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on SNR and confidence level threshold a user defines

  4. accept glitches whose background segments do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = TimeShiftingSamplePrecedingBGonly(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff, TriggerPeakFreqUpperCutoff, targetUpperSNR_thre, flag)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int )

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • position_duration_bfr_centr – the proportion of the duration of a target segment placed before the center time, e.g., 0.5 means the duration is evenly distributed around the center time, while 0.83 means 5/6 of it lies before the center time

  • TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG queries for target glitches

  • targetUpperSNR_thre – an upper limit cutoff value of SNR of triggers given by an ETG queries for target glitches

  • flag – ‘Both’ or ‘Either’: require both backgrounds, or either the preceding or the following background, respectively, to accept glitches

Returns

the list of parameters of glitches passing the above thresholds. Listsegments consists of:

ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.utilities.select_some_trials(iterate, maximum_iterate=5)[source]
description:

this function is used to reduce the number of trials for the off-source window

USAGE: list_index_trials = select_some_trials(iterate, maximum_iterate=5)

Parameters
  • iterate – a number of trials

  • maximum_iterate – a maximum number of trials

Returns

randomly chosen trials, where the total number is maximum_iterate or less
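The selection can be sketched with a short hypothetical reimplementation, assuming trials are indexed 0 to iterate-1 and chosen without replacement:

```python
import numpy as np

def select_some_trials(iterate, maximum_iterate=5):
    """Return at most `maximum_iterate` randomly chosen trial indices."""
    n = min(iterate, maximum_iterate)
    # choose without replacement so no trial index repeats
    return np.sort(np.random.choice(iterate, size=n, replace=False))

idx = select_some_trials(20, maximum_iterate=5)
```

Capping the number of trials this way bounds the cost of estimating the off-source statistics for long windows.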

origli.utilities.veto_utilities

file name: veto_utilities.py

this file contains the utilities to be used for finding veto channels

origli.utilities.veto_utilities.BackgroundCut(df_null, channel, background_upper_cut)[source]
description:

calculate the upper cut of channels using the FAP distribution

USAGE: cut = BackgroundCut(df_null, channel, background_upper_cut)

Parameters
  • df_null – null samples in pandas frame

  • channel – a list of channels

  • background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426

Returns

cut: an upper cut of the null samples of those channels
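A sketch of the idea, assuming df_null holds one column of null importance samples per channel; the column name and the quantile-based cut are assumptions, not the package's exact method:

```python
import numpy as np
import pandas as pd

def background_cut(df_null, channel, background_upper_cut):
    """Upper cut = the `background_upper_cut` quantile of the null samples."""
    null_samples = df_null[channel].dropna().to_numpy()
    # e.g. background_upper_cut = 0.95449 corresponds to a 2-sigma cut
    return float(np.quantile(null_samples, background_upper_cut))

# toy null distribution: importance values spread uniformly in [0, 1]
df_null = pd.DataFrame({"AUX_CHANNEL": np.linspace(0.0, 1.0, 101)})
cut = background_cut(df_null, "AUX_CHANNEL", 0.95)
```

An importance value of the witness channel above this cut then corresponds to a false-alarm probability below 1 - background_upper_cut.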

origli.utilities.veto_utilities.CreateAllChannels_rho(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, LowerCutOffFreq, UpperCutOffFreq)[source]
description:
  1. use a single glitch time

  2. query timeseries of all the channels around a glitch

  3. condition the data (whiten, and compare the on- and off-source windows)

  4. quantify all the channels (compute values of importance of all the channels)

USAGE: List_Count, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • Listsegment – a list of segment parameters

  • IFO – ifo

  • channels – a list of safe channels

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}

  • sigma – an integer to be used for calculating values of importance

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

origli.utilities.veto_utilities.CreateRho(full_timeseries, target_timeseries_start, target_timeseries_end, pre_background_start, pre_background_end, fol_background_start, fol_background_end, sigma, LowerCutOffFreq, UpperCutOffFreq)[source]
description:
  1. calculate the whitened FFT of the on- and off-source window for a single channel

  2. compute the value of importance for a single channel

Parameters
  • full_timeseries – the full time series in gwpy object including on- and off source windows

  • target_timeseries_start – the start time of the on-source window

  • target_timeseries_end – the end time of the on-source window

  • pre_background_start – the start time of the preceding off-source window

  • pre_background_end – the end time of the preceding off-source window

  • fol_background_start – the start time of the following off-source window

  • fol_background_end – the end time of the following off-source window

  • sigma – an integer to calculate the value of importance

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

Count: the importance, i.e., the fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
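The importance statistic in step 2 can be illustrated with a toy version. The mean + sigma·std upper bound used here is an assumption for the example; the package derives the bound from the whitened off-source window:

```python
import numpy as np

def importance(on_spec, off_spec, sigma=2):
    """Fraction of on-source frequency bins above the off-source upper bound."""
    # hypothetical upper bound: mean + sigma * std of the off-source spectrum
    upper_bound = off_spec.mean() + sigma * off_spec.std()
    return float(np.mean(on_spec > upper_bound))
```

A value near 1 means most on-source bins exceed the background, i.e., the channel witnessed excess power during the glitch.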

origli.utilities.veto_utilities.FlagFinder(Epoch_lt, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:
  1. select high ranking witness channels

  2. determine the upper cut of the null samples for those high ranking witness channels

  3. analyze all the glitches using the selected witness channels

  4. make a flag when those channels give importance above the upper cut of the null (flags are made only if ALL the chosen witness channels have values of importance above the upper cut of the null samples)

  5. calculate efficiency and deadtime

USAGE: efficiency, deadtime_frac, df = FlagFinder(Epoch_lt, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • Epoch_lt – a list of an epoch

  • Listsegments – a list of glitches

  • IFO – ifo

  • channels – a list of channels, which are expected to be witness channels

  • list_statistics – a list of ranking statistics, either witness ratio statistics or t-value

  • num_high_rank_channels_to_be_used – number of high ranking channels to be used for making flag

  • df_null – null samples in pandas frame

  • background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426

  • number_process – number of processors

  • sigma – an integer to determine the upper bound of the off-source window

  • PlusHOFT – boolean, whether to analyze hoft

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

efficiency: the ratio of flagged glitches to all glitches analyzed without data-availability issues
deadtime_frac: the ratio of the total on-source window time to the total analysis time
df: a matrix of GPS time, duration, SNR, classification confidence level, flag, and importance of glitches, in a pandas frame
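The efficiency and deadtime fraction in step 5 reduce to simple ratios; a toy sketch with hypothetical names:

```python
import numpy as np

def efficiency_and_deadtime(flags, durations, analysis_time):
    """flags: 1 if a glitch was flagged, 0 otherwise; durations in sec."""
    efficiency = float(np.mean(flags))              # fraction of flagged glitches
    deadtime_frac = float(np.sum(durations)) / analysis_time  # vetoed time fraction
    return efficiency, deadtime_frac

# 3 of 4 glitches flagged; 4 sec of on-source windows in a 100 sec analysis
eff, dt = efficiency_and_deadtime([1, 1, 0, 1], [0.5, 0.5, 1.0, 2.0], 100.0)
```

A useful witness channel has high efficiency at a small deadtime fraction; their ratio is the usual figure of merit.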

origli.utilities.veto_utilities.FlagFinder_all_witnesses(Proportion_Duration_Bfr_Centr, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq, freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:
  1. select high ranking witness channels

  2. determine the upper cut of the null samples for those high ranking witness channels

  3. analyze all the glitches using the selected witness channels

  4. make flags when those channels give importance above the upper cut of the null (flags are made for individual channels)

USAGE: df_flag = FlagFinder_all_witnesses(Proportion_Duration_Bfr_Centr, Listsegments, IFO, channels, list_statistics, num_high_rank_channels_to_be_used, df_null, background_upper_cut, number_process, sigma, PlusHOFT, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • Proportion_Duration_Bfr_Centr – a fraction of the on-source window before the peak GPS time

  • Listsegments – a list of glitches

  • IFO – ifo

  • channels – a list of channels, which are expected to be witness channels

  • list_statistics – a list of ranking statistics, either witness ratio statistics or t-value

  • num_high_rank_channels_to_be_used – number of high ranking channels to be used for making flag

  • df_null – null samples in pandas frame

  • background_upper_cut – confidence level of the upper cut of null samples of witness channel(s), e.g., 1sigma = 0.68268, 2sigma = 0.95449, 3sigma = 0.997300204, 4sigma = 0.99993666, and 5sigma = 0.999999426

  • number_process – number of processors

  • sigma – an integer to determine the upper bound of the off-source window

  • PlusHOFT – boolean, whether to analyze hoft

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

df_flag: a matrix of GPS time, duration, SNR, classification confidence level of glitches, the importance of the witness channels, and the flags of the witness channels, in a pandas frame

origli.utilities.veto_utilities.HierarchyChannelAboveThreshold_single_channel(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]
description:

calculate the importance: a fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel

USAGE: Count = HierarchyChannelAboveThreshold_single_channel(whitened_fft_target, whitened_fft_PBG, whitened_fft_FBG, duration, sampling_rate, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')

Parameters
  • whitened_fft_target – whitened fft of the on-source window

  • whitened_fft_PBG – whitened fft of the preceding off-source window

  • whitened_fft_FBG – whitened fft of the following off-source window

  • duration – a duration of the on-source window

  • sampling_rate – sampling rate of a channel

  • sigma – an integer to determine the upper bound of the off-source window

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

Count: importance

origli.utilities.veto_utilities.WitnessFinder(Listsegments, IFO, re_sfchs_init, sigma, number_process, first_chunk, tolerance, confidence_level, df_null, shuffle='True', PlusHOFT='False', LowerCutOffFreq='None', UpperCutOffFreq='None', freq_bands=[[1, 50], [1, 128], [128, 256], [256, 512], [512, 1024], [1024, 2048], [2048, 4096], [4096, 8192], ['None', 'None']])[source]
description:
  1. use a list of glitches

  2. analyze the first ‘first_chunk’ glitches with all the channels in ‘re_sfchs_init’

  3. perform one-sided binomial test and Welch one-sided t-test

  4. reject the channels that do NOT pass both tests, i.e., for which the hypothesis that the channel is consistent with the null samples cannot be rejected

  5. calculate the error ratio of the t-value of the top ranking channel to the previous t-value

  6. analyze the next glitch using the channels that pass both tests

  7. add the values of importance to the passed analyzed samples

  8. repeat (3)-(7)

  9. terminate the process when the error ratio reaches the tolerance

USAGE: re_sfchs, MatCount, list_Causal_passed_final, list_t_values_passed_final = WitnessFinder(Listsegments, IFO, re_sfchs_init, sigma, number_process, first_chunk, tolerance, confidence_level, df_null, shuffle='True', PlusHOFT='False', LowerCutOffFreq='None', UpperCutOffFreq='None')

Parameters
  • Listsegments – a list of glitches

  • IFO – ifo

  • re_sfchs_init – all the safe channels to be used at the beginning

  • sigma – an integer to determine the upper bound of the off-source window

  • number_process – number of processes of a machine

  • first_chunk – the number of samples to be used for the first chunk, where all the channels are to be used

  • tolerance – tolerance number at which to stop the analysis

  • confidence_level – confidence level for the one-sided binomial test and Welch one-sided t-test

  • df_null – null samples in pandas frame, which are expected to be already created

  • shuffle – boolean, whether to shuffle the list of glitches

  • PlusHOFT – boolean, whether to analyze hoft

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

re_sfchs: a list of channels that have passed the tests until the tolerance is reached
MatCount: a matrix of importance of the channels that passed the tests
list_Causal_passed_final: a list of witness ratio statistics of the channels that passed the tests
list_t_values_passed_final: a list of t-values of the channels that passed the tests
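Steps 3 and 4 can be sketched as below. This is an illustrative stand-in, not the package's implementation: the Welch one-sided t-test p-value is replaced by a comparison against the large-sample normal critical value (1.645 at 95% confidence) to keep the example dependency-free.

```python
import math
import numpy as np

def channel_passes(target_vals, null_vals, null_cut, confidence_level=0.95):
    """Keep a channel only when both one-sided tests reject the null."""
    alpha = 1.0 - confidence_level
    target_vals = np.asarray(target_vals, dtype=float)
    null_vals = np.asarray(null_vals, dtype=float)
    # one-sided binomial test: under the null, a target importance
    # exceeds the null cut with probability alpha
    n = len(target_vals)
    k = int(np.sum(target_vals > null_cut))
    p_binom = sum(math.comb(n, i) * alpha**i * (1.0 - alpha)**(n - i)
                  for i in range(k, n + 1))
    # Welch one-sided t statistic, compared against the large-sample
    # normal critical value (1.645 at 95%) instead of a t p-value
    se = math.sqrt(target_vals.var(ddof=1) / n
                   + null_vals.var(ddof=1) / len(null_vals))
    t = (target_vals.mean() - null_vals.mean()) / se
    return bool(p_binom < alpha and t > 1.645)
```

A channel whose importance values look like the null distribution fails at least one test and is dropped from the next iteration.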

origli.utilities.veto_utilities.make_veto_omicron_in_aux(epoch_start, epoch_end, IFO, channel, df_foreground, glitch_type, Proportion_Duration_Bfr_Centr, SNR_thresh, df_flag, OutputHDF5_dir, ifostate, N_processes)[source]
description:
  1. query omicron triggers (aux) of a witness channel

  2. find the aux omicron triggers which are coincident with the glitches that are analyzed

  3. find the aux SNR cut which corresponds to the importance cut of this witness channel

  4. find the aux omicron triggers which are coincident with all the glitches with label being studied

  5. veto glitches when the coincident aux triggers have SNR above the aux SNR cut (given in the step 3)

USAGE: rho_cut, snr_cut, deadtime, efficiency, efficiency_over_deadtime, df_target = make_veto_omicron_in_aux(epoch_start, epoch_end, IFO, channel, df_foreground, glitch_type, Proportion_Duration_Bfr_Centr, SNR_thresh, df_flag, OutputHDF5_dir, ifostate, N_processes)

Parameters
  • epoch_start – start time of the analysis period

  • epoch_end – end time of the analysis period

  • IFO – ifo {‘H1’ or ‘L1’}

  • channel – a witness channel name, it could be a channel in a particular frequency band

  • df_foreground – a pandas data frame of all the glitches in the strain channel fed into pychChoo

  • glitch_type – a glitch type that is focused on

  • Proportion_Duration_Bfr_Centr – a fraction of the on-source window before the peak GPS time of a glitch

  • SNR_thresh – a lower SNR threshold to select glitches that are studied

  • df_flag – a pandas data frame of flagged (including ‘Y’ and ‘N’) of the witness channels for the glitches that have been analyzed with FlagFinder_all_witnesses()

  • OutputHDF5_dir – an output directory where the omicron triggers of a witness channel are stored

  • ifostate – state of an ifo

  • N_processes – number of cores

Returns

rho_cut: the lower cut of importance of a witness channel
snr_cut: the corresponding SNR cut of this witness channel
deadtime: the fraction of the analysis time that is vetoed
efficiency: the fraction of glitches that are vetoed
efficiency_over_deadtime: the ratio of efficiency to deadtime
df_target: a pandas data frame for this witness channel, where the glitches that are vetoed are marked ‘Y’

origli.utilities.condor_utilities

Script name: condor_utilities.py

Description:

File containing utilities for creating HTCondor submission and DAG files

class origli.utilities.condor_utilities.CondorUtils[source]

Bases: object

create_condor_dag_file(path_sub, work_dir, num_glitches_to_be_analyzed)[source]
create_condor_submission_file(work_dir, abs_path_executable, abs_path_config, obs_run='o3')[source]

origli.utilities.burn_in_utilities

origli.utilities.burn_in_utilities.BG_upper_threshold_single_channel_given_freqband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')[source]
description:

For a single channel per glitch, this function calculates a list of the background upper threshold across dummy on-source windows

USAGE: list_bg_upper_threshold = BG_upper_threshold_single_channel_given_freqband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma, LowerCutOffFreq='None', UpperCutOffFreq='None')

Parameters
  • list_dummy_duration – a list of dummy on-source windows

  • list_whitened_fft – a list of the normalized spectra, where each element is a spectrum for a given dummy on-source window

  • sampling_rate – sampling rate of a channel

  • list_num_trial_used – a list of trials of dummy on-source window within the total on-source window

  • sigma – an integer to determine the upper bound of the off-source window

  • LowerCutOffFreq – a lower cutoff frequency

  • UpperCutOffFreq – an upper cutoff frequency

Returns

list_bg_upper_threshold: a list of the background upper thresholds across the dummy on-source windows

origli.utilities.burn_in_utilities.BG_upper_threshold_single_channel_multiband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma)[source]
description:

calculate values of the background upper threshold per dummy on-source window and per frequency band

USAGE: MatBGUpperThresh = BG_upper_threshold_single_channel_multiband(list_dummy_duration, list_whitened_fft, sampling_rate, list_num_trial_used, sigma)

Parameters
  • list_dummy_duration – a list of dummy on-source windows

  • list_whitened_fft – a list of the normalized spectra, where each element is a spectrum for a given dummy on-source window

  • sampling_rate – sampling rate of a channel

  • list_num_trial_used – a list of trials of dummy on-source window within the total on-source window

  • sigma – an integer to determine the upper bound of the off-source window

Returns

MatBGUpperThresh: values of the background upper threshold, numpy array, where the frequency bands are rows from the top to bottom, the dummy on-source windows are in columns from left to right
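The layout of MatBGUpperThresh can be illustrated with synthetic values (the bands, durations, and threshold formula below are placeholders; the real values come from the whitened off-source spectra):

```python
import numpy as np

freq_bands = [[1, 128], [128, 256], [256, 512]]
dummy_durations = [0.5, 1.0, 2.0]

# frequency bands in rows (top to bottom), dummy on-source
# windows in columns (left to right)
MatBGUpperThresh = np.zeros((len(freq_bands), len(dummy_durations)))
for i, _band in enumerate(freq_bands):
    for j, dur in enumerate(dummy_durations):
        MatBGUpperThresh[i, j] = 0.1 * (i + 1) + 0.01 * dur  # placeholder value
```

Row i, column j thus holds the threshold for frequency band i at dummy on-source duration j.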

origli.utilities.burn_in_utilities.CreateAllChannels_BGUpperThresh_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, duration_max, trial_duration_sample)[source]
description:
  1. use a single glitch time

  2. query timeseries of all the channels around a glitch

  3. calculate values of the background upper threshold per dummy on-source window per frequency band

  4. iterate through channels

USAGE: IndexSatisfied, list_mat_BG_upper_thresh, array_dummy_duration, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_BGUpperThresh_multband(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, sigma, duration_max = 15, trial_duration_sample = 20)

Parameters
  • Listsegment – a list of segment parameters

  • IFO – ifo

  • re_sfchs – a list of safe channels

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}

  • sigma – an integer to be used for calculating values of importance

  • duration_max – the maximum length of a dummy on-source window in sec

  • trial_duration_sample – the number of dummy on-source windows within the total on-source window

Returns

IndexSatisfied: glitch index
list_mat_BG_upper_thresh: a list of matrices, one per channel, where each element is a value of the background upper threshold; numpy array with frequency bands in rows (top to bottom) and dummy on-source windows in columns (left to right)
array_dummy_duration: numpy array of the dummy on-source windows
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: list of channels without “IFO:” at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in h(t)
confi: a confidence level of classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy

origli.utilities.burn_in_utilities.CreateAllChannels_rho_multband_from_bg_up_bd_prior(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, hdf5_obj_bg_up_thresh)[source]
description:
  1. use a single glitch time

  2. query timeseries of all the channels around a glitch

  3. calculate the normalized spectrum

  4. compute the value of importance for a single channel across frequency bands

USAGE: IndexSatisfied, Mat_Count_in_multibands, list_sample_rates, re_sfchs, gpstime, duration, SNR, confi, ID = CreateAllChannels_rho_multband_from_bg_up_bd_prior(Listsegment, IFO, re_sfchs, number_process, PlusHOFT, hdf5_obj_bg_up_thresh)

Parameters
  • Listsegment – a list of segment parameters

  • IFO – ifo

  • channels – a list of safe channels

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}

  • hdf5_obj_bg_up_thresh – an HDF5 object that contains the polynomial parameters of the fit representing the background upper threshold as a function of the on-source window length, for all the channels and frequency bands

Returns

IndexSatisfied: glitch index
Mat_Count_in_multibands: a matrix of rho with frequency bands in rows and channels in columns, numpy array
list_sample_rates: a list of sampling rates of channels, numpy array
re_sfchs: list of channels without “IFO:” at the beginning
gpstime: a GPS time
duration: a value of duration
SNR: signal-to-noise ratio in h(t)
confi: a confidence level of classification of a glitch, provided by Gravity Spy; otherwise None
ID: a glitch ID, usually provided by Gravity Spy

origli.utilities.burn_in_utilities.CreateBGUpperThresh_single_channel_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration, sigma)[source]
description:
  1. make list_whitened_fft: a list of numpy arrays of the normalized spectrum, where each element is the normalized spectrum for one trial of a given dummy on-source window.

     The spectra are concatenated into a vector from left to right, e.g., np.array([sp0_try0, sp1_try0, …, sp0_try1, sp1_try1, …]); hence this list is [ (sp for dummy 0), (sp for dummy 1), … ]

     Also record: sample_rate, the sampling rate of this channel; DURATION, the duration of a target segment; and list_num_trial_used, a list of the number of trials per dummy on-source window. The number of trials per dummy on-source window varies because of the limited length of the extended total on-source window: the longer the dummy on-source window, the fewer trials are available.

  2. calculate values of the background upper threshold per dummy on-source window and per frequency band

  2. calculate values of the background upper threshold per dummy on-source window and per frequency band

USAGE: MatBGUpperThresh, sample_rate = CreateBGUpperThresh_single_channel_multiband(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration, sigma)

Parameters
  • full_timeseries – the full time series in gwpy object including on- and off source windows

  • target_timeseries_start – the start time of the on-source window

  • target_timeseries_end – the end time of the on-source window

  • array_dummy_duration – numpy array of dummy on-source windows

  • sigma – an integer to calculate the value of importance

Returns

MatBGUpperThresh: values of the background upper threshold, numpy array, with frequency bands in rows (top to bottom) and dummy on-source windows in columns (left to right)
sample_rate: the sampling rate of a single channel

origli.utilities.burn_in_utilities.CreateRho_single_channel_multiband_from_bg_up_bd_prior(full_timeseries, target_timeseries_start, target_timeseries_end, list_poly_para)[source]
description:
  1. calculate the normalized spectrum

  2. compute the value of importance for a single channel across frequency bands

USAGE: Counts_in_multibands, sample_rate = CreateRho_single_channel_multiband_from_bg_up_bd_prior(full_timeseries, target_timeseries_start, target_timeseries_end, list_poly_para)

Parameters
  • full_timeseries – the full time series in gwpy object including on- and off source windows

  • target_timeseries_start – the start time of the on-source window

  • target_timeseries_end – the end time of the on-source window

  • list_poly_para – a list of polynomial fit of the background upper threshold per freq band

Returns

Counts_in_multibands: values of importance in different frequency bands, where importance is the fraction of frequency bins in a frequency range above an upper bound of the off-source window for a single channel
sample_rate: the sampling rate of a single channel
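The use of the polynomial fit in step 2 can be sketched with numpy. The threshold-vs-duration samples below are synthetic; in the package, the coefficients come from the burn-in stage per channel and per frequency band:

```python
import numpy as np

# synthetic background-upper-threshold vs. on-source-duration samples
dummy_durations = np.array([0.1, 0.5, 1.0, 2.0, 5.0, 10.0])
thresholds = 0.3 + 0.02 * dummy_durations

# fit the relation once (burn-in stage) ...
poly_para = np.polyfit(dummy_durations, thresholds, deg=1)

# ... then read off the threshold for the actual on-source duration
upper_threshold = float(np.polyval(poly_para, 3.0))
```

Storing only the polynomial coefficients lets the online analysis evaluate the threshold for any on-source length without redoing the dummy-window trials.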

origli.utilities.burn_in_utilities.FindBGlis_extendBG(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf)[source]
description: get the null samples whose durations are drawn from the target set. Only the subset of the null samples whose on-source windows do not coincide with any other glitches is kept.
  1. load the glitch file (.csv file)

  2. get the target samples

  3. create random time stamps with their durations drawn from the target set

  4. accept the random time stamps where their on-source windows do not coincide with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = FindBGlis_extendBG(state, number_trials, step, outputMother_dir, df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, TriggerPeakFreqLowerCutoff, TriggerPeakFreqUpperCutoff, targetUpperSNR_thre)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int )

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers for selecting the target set

  • targetUpperSNR_thre – the upper SNR threshold for selecting the target set

Returns

the list of parameters of glitches passing the above thresholds. Listsegments consists of:

ListIndexSatisfied: a list of glitch indices
Listtarget_timeseries_start: a list of target glitch start times
Listtarget_timeseries_end: a list of target glitch end times
Listpre_background_start: a list of preceding background start times
Listpre_background_end: a list of preceding background end times
Listfol_background_start: a list of following background start times
Listfol_background_end: a list of following background end times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs

origli.utilities.burn_in_utilities.FindBGlistBurnIn(state, duration_max, number_trials, step, outputMother_dir, df, Epoch_lt, IFO, BGSNR_thre, UserDefinedDuration, gap)[source]

description: From the random time stamps created with FindRadomlistPointsForBurnIn(), select the subset whose on-source windows do not overlap with any other glitches. Note that the on-source window is extended whenever possible.

  1. load the glitch data set (.csv file)

  2. accept time stamps whose on-source windows do not coincide with any other glitches

  3. return the info of the accepted time stamps

USAGE: Listsegments = FindBGlistBurnIn(state, duration_max, number_trials, step, outputMother_dir, df, Epoch_lt, IFO, BGSNR_thre, UserDefinedDuration, gap)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int )

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user-defined duration of a glitch (float or int), 0 by default

  • gap – a time gap between the target and the background segments in sec, 1 sec by default

  • flag – ‘Both’ or ‘Either’: require both backgrounds, or either the preceding or the following background, respectively, to accept glitches

Returns

the list of parameters of glitches passing the above thresholds Listsegments contains of

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
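The overlap rejection in step 2 amounts to a simple interval test: a candidate on-source window is accepted only if it intersects no known glitch interval. A minimal sketch, with hypothetical function and argument names (not the package's API):

```python
import numpy as np

def reject_overlapping_windows(candidates, glitch_times, glitch_durations):
    """Keep candidate (start, end) windows that overlap no known glitch.

    glitch_times / glitch_durations: GPS centre times and durations of
    the glitches to avoid (hypothetical layout).
    """
    glitch_starts = np.asarray(glitch_times) - np.asarray(glitch_durations) / 2
    glitch_ends = np.asarray(glitch_times) + np.asarray(glitch_durations) / 2
    accepted = []
    for start, end in candidates:
        # Intervals [a, b] and [c, d] overlap iff a < d and c < b.
        if not np.any((glitch_starts < end) & (start < glitch_ends)):
            accepted.append((start, end))
    return accepted
```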

origli.utilities.burn_in_utilities.FindRadomlistPointsForBurnIn(state, IFO, Epoch_lt, number_samples, step, duration_max, outputMother_dir)[source]
description: create random time stamps whose durations are uniformly distributed in log10 between 0.02 sec and duration_max
  1. within an epoch, create a list of synthetic points with randomly chosen durations

  2. make a pandas DataFrame of the data set

USAGE: df = FindRadomlistPointsForBurnIn(state, IFO, Epoch_lt, number_samples, step, duration_max, outputMother_dir)

Parameters
  • state – IFO state {observing, nominal-lock}

  • IFO – an observer {H1, L1}

  • Epoch_lt – a list of epochs

  • number_samples – number of samples picked up

  • step – step of data points in sec

  • duration_max – a maximum value of the duration in sec

  • outputMother_dir – an output directory in which the data set is placed

Returns

df: synthetic random data points within an epoch
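The log10-uniform draw of durations described above can be sketched as follows; the column names, signature, and seed handling are assumptions, not the package's actual interface:

```python
import numpy as np
import pandas as pd

def random_points_log_uniform(epoch_start, epoch_end, number_samples,
                              duration_max, duration_min=0.02, seed=0):
    """Draw random time stamps within an epoch, with durations uniform in
    log10 between duration_min (0.02 s per the docstring) and duration_max."""
    rng = np.random.default_rng(seed)
    gpstime = rng.uniform(epoch_start, epoch_end, number_samples)
    log_dur = rng.uniform(np.log10(duration_min), np.log10(duration_max),
                          number_samples)
    # Hypothetical column names for the synthetic data set.
    return pd.DataFrame({"gpstime": gpstime, "duration": 10.0 ** log_dur})
```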

origli.utilities.burn_in_utilities.FindglitchlistextendBG(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=inf)[source]
description:
  1. load the glitch data set (.csv file)

  2. get the data about the target glitch class

  3. get the subset of the target glitches based on SNR and confidence level threshold a user defines

  4. the preceding and following BGs are 64 sec long for every sample, regardless of overlap with any other glitches

  5. return the info of the accepted glitches

USAGE: Listsegments = FindglitchlistextendBG(df, Epoch_lt, TargetGlitchClass, IFO, BGSNR_thre, targetSNR_thre, Confidence_thre, UpperDurationThresh, LowerDurationThresh, UserDefinedDuration, gap, position_duration_bfr_centr, TriggerPeakFreqLowerCutoff=0, TriggerPeakFreqUpperCutoff=8192, targetUpperSNR_thre=np.inf)

Parameters
  • df – GravitySpy meta data in pandas format

  • Epochstart – starting time of an epoch

  • Epochend – end time of an epoch

  • Commissioning_lt – commissioning time in list

  • TargetGlitchClass – a target glitch class name (str)

  • IFO – a type of interferometer (H1, L1, V1) (str)

  • BGSNR_thre – an upper threshold of SNR for background glitches (i.e., quiet enough), float or int

  • targetSNR_thre – a lower threshold of SNR for target glitches, float or int

  • Confidence_thre – a threshold of confidence level (float or int )

  • UpperDurationThresh – an upper bound of duration in sec (float or int)

  • LowerDurationThresh – a lower bound of duration in sec (float or int)

  • UserDefinedDuration – user defined duration of a glitch (float or int), 0 in default

  • gap – a time gap between the target and the background segments in sec, 1 sec in default

  • position_duration_bfr_centr – proportion of the duration of a target segment placed before the center time, e.g., 0.5 indicates the duration is evenly distributed around the center time; 0.83 indicates 5/6 of it is before the center time

  • TriggerPeakFreqLowerCutoff – a lower limit cutoff value of the peak frequency of triggers given by an ETG for target glitches

  • TriggerPeakFreqUpperCutoff – an upper limit cutoff value of the peak frequency of triggers given by an ETG queries for target glitches

  • targetUpperSNR_thre – an upper limit cutoff value of SNR of triggers given by an ETG queries for target glitches

  • flag – ‘Both’ or ‘Either’: accept glitches using both backgrounds, or either the preceding or the following background, respectively

Returns

the list of parameters of glitches passing the above thresholds. Listsegments consists of:

ListIndexSatisfied: a list of indices of glitches
Listtarget_timeseries_start: a list of target glitch starting times
Listtarget_timeseries_end: a list of target glitch ending times
Listpre_background_start: a list of preceding background starting times
Listpre_background_end: a list of preceding background ending times
Listfol_background_start: a list of following background starting times
Listfol_background_end: a list of following background ending times
Listgpstime: a list of GPS times
Listduration: a list of durations
ListSNR: a list of SNRs
Listconfi: a list of confidence levels
ListID: a list of IDs
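The class/SNR/confidence selection in steps 2-3 amounts to boolean masking of the Gravity Spy table; the column names below ("label", "snr", "confidence") are assumptions about the .csv layout, not guaranteed by the package:

```python
import pandas as pd

def select_target_glitches(df, target_class, snr_low, snr_high, conf_thre):
    """Subset a Gravity Spy-style table by class, SNR band, and confidence.

    Sketch only: actual column names in the Gravity Spy .csv may differ.
    """
    mask = (
        (df["label"] == target_class)
        & (df["snr"] >= snr_low)       # lower SNR threshold for targets
        & (df["snr"] <= snr_high)      # optional upper SNR cutoff
        & (df["confidence"] >= conf_thre)
    )
    return df[mask]
```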

origli.utilities.burn_in_utilities.HierarchyChannelAboveThreshold_single_channel_multiband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, duration, list_poly_para)[source]
description:

calculate the importance for a single channel across frequency bands

USAGE: Counts_in_multibands = HierarchyChannelAboveThreshold_single_channel_multiband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, duration, list_poly_para)

Parameters
  • whitened_fft_target – whitened fft of the on-source window

  • duration – a duration of the on-source window

  • sampling_rate – sampling rate of a channel

  • list_poly_para – a list of polynomial fit of the background upper threshold per freq band

Returns

Counts_in_multibands: values of importance in different frequency bands, numpy array
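The "importance" statistic is defined in the overview as the fraction of on-source frequency bins above the background upper threshold in a given band. A minimal sketch of the multiband version, assuming one scalar threshold per band (the package interpolates a polynomial fit instead, and the interface below is hypothetical):

```python
import numpy as np

def importance_multiband(whitened_fft, sampling_rate, duration, band_edges,
                         band_thresholds):
    """Fraction of on-source bins exceeding the per-band threshold.

    band_edges: [f0, f1, ..., fN] in Hz; band_thresholds: N scalars,
    one per band.  Assumed stand-ins for the fitted background bounds.
    """
    n = int(duration * sampling_rate)
    freqs = np.fft.rfftfreq(n, d=1.0 / sampling_rate)
    spectrum = np.abs(whitened_fft)
    counts = []
    for lo, hi, thresh in zip(band_edges[:-1], band_edges[1:], band_thresholds):
        in_band = (freqs >= lo) & (freqs < hi)
        n_bins = in_band.sum()
        counts.append((spectrum[in_band] > thresh).sum() / n_bins
                      if n_bins else 0.0)
    return np.asarray(counts)
```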

origli.utilities.burn_in_utilities.Multiprocess_whitening_for_burn_in(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration)[source]
description:

This is used for normalizing the spectrum of each trial in the on-source window of random time stamps created with FindBGlistBurnIn(). For each dummy on-source window:
  1. take the time series of a single channel

  2. calculate how many trials are available per dummy on-source window within the extended on-source window

  3. iterate over the trials per dummy on-source window

USAGE: list_whitened_fft, sample_rate, DURATION, list_num_trial_used = Multiprocess_whitening_for_burn_in(full_timeseries, target_timeseries_start, target_timeseries_end, array_dummy_duration)

Parameters
  • full_timeseries – time series comprising target and BGs

  • target_timeseries_start – a start time of a target segment

  • target_timeseries_end – an end time of a target segment

  • array_dummy_duration – numpy array of dummy on-source windows

Returns

list_whitened_fft: a list of numpy arrays of the normalized spectrum, where each element of this list is the normalized spectrum for each trial with a given dummy on-source window.

These spectra are concatenated into a vector from left to right, e.g., np.array([sp0_try0, sp1_try0, …, sp0_try1, sp1_try1, …]). Hence, this list is [(sp for dummy 0), (sp for dummy 1), …]

sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
list_num_trial_used: a list of the number of trials per dummy on-source window. Note that the number of trials per dummy on-source window varies because of the limited length of the extended total on-source window: the longer the dummy on-source window, the fewer trials are available.
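The trade-off noted above (longer dummy on-source windows leave room for fewer trials) reduces to a floor division in the simplest case; the real code may stride or overlap trials differently, so treat this as a sketch:

```python
import numpy as np

def trials_per_dummy_duration(extended_window_length, dummy_durations):
    """Number of non-overlapping trials that fit in the extended on-source
    window for each dummy duration (sketch: simple floor division)."""
    dummy = np.asarray(dummy_durations, dtype=float)
    return np.floor(extended_window_length / dummy).astype(int)
```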

origli.utilities.burn_in_utilities.Multiprocess_whitening_for_target(full_timeseries, target_timeseries_start, target_timeseries_end)[source]
description:

This is used for multi processing for whitening segments

USAGE: whitened_fft_target, sample_rate, DURATION = Multiprocess_whitening_for_target(full_timeseries, target_timeseries_start, target_timeseries_end)

Parameters
  • full_timeseries – time series comprising target and BGs

  • target_timeseries_start – a start time of a target segment

  • target_timeseries_end – an end time of a target segment

Returns

whitened_fft_target: whitened fft of a target segment
sample_rate: sampling rate of this channel
DURATION: a duration of a target segment
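A minimal sketch of the normalization: divide the on-source FFT by an amplitude spectral density estimated from the off-source data. The package relies on gwpy's ASD estimator; the crude single-FFT background estimate below is an illustration only, not the actual whitening routine:

```python
import numpy as np

def whiten_target_fft(target, background):
    """Normalize the on-source spectrum by an ASD estimated from the
    off-source (background) samples.  Crude sketch: a single background
    FFT stands in for a proper (e.g. Welch / gwpy) ASD estimate."""
    n = len(target)
    bg_fft = np.fft.rfft(background[:n])
    asd = np.abs(bg_fft) / np.sqrt(n)   # rough per-bin amplitude scale
    asd[asd == 0] = np.inf              # avoid division by zero
    return np.fft.rfft(target) / asd
```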

origli.utilities.burn_in_utilities.SaveDummyTargetHDF5_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False', trial_duration_sample=20)[source]
description:

Save the normalized spectrum for every channel for every glitch, with the use of Multiprocess_whitening_for_burn_in():
  1. iterate over all the samples

  2. whiten the on-source window for every channel

  3. save the whitened spectrum as an HDF5 file

each group in the HDF5 file is for a single channel, and each of the datasets per group has the normalized spectrum for a dummy on-source window

USAGE: SaveDummyTargetHDF5_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT=’False’, trial_duration_sample=20)

Parameters
  • Listsegments – a list of segment parameters

  • re_sfchs – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default

  • trial_duration_sample – a number of dummy on-source windows

Returns

None

origli.utilities.burn_in_utilities.SaveOnlyTargetHDF5_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT='False')[source]
description:

THIS IS USED FOR “OFFLINE” MODE
  0. assuming Listsegments is given by Findglitchlist()

  1. take the information of the list of allowed target and the preceding and following segments

  2. whiten a target segment based on the average background segment

  3. find the whitened FFT

  4. save the whitened target and background FFTs

Note this depends on

USAGE: SaveOnlyTargetHDF5_OFFLINE(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, PlusHOFT=’False’)

Parameters
  • Listsegments – a list of segment parameters

  • re_sfchs – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default

origli.utilities.burn_in_utilities.SaveOnlyTargetHDF5_multiband_OFFLINE_from_bg_up_bd_prior(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, path_hdf5_bg_up_thresh, PlusHOFT='False')[source]
description:

THIS IS USED FOR “OFFLINE” MODE
  1. take the information of the list of allowed target and the preceding and following segments

  2. query the time series, get the normalized spectrum, and calculate the importance

  3. save the whitened target and background FFTs

Note this depends on

USAGE: SaveOnlyTargetHDF5_multiband_OFFLINE_from_bg_up_bd_prior(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, path_hdf5_bg_up_thresh, PlusHOFT=’False’)

Parameters
  • Listsegments – a list of segment parameters

  • re_sfchs – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • path_hdf5_bg_up_thresh – path to an HDF5 file that contains the polynomial fit of the background upper threshold

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default

Returns

None

origli.utilities.burn_in_utilities.SaveUppperThreshodBG_multiband_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT='False', duration_max=15, trial_duration_sample=20)[source]
description:

THIS IS USED FOR “OFFLINE” MODE
  1. iterate over the glitch samples

  2. get values of the background upper threshold per dummy on-source window per frequency band for each of the channels

  3. save the values

Note this depends on

USAGE: SaveUppperThreshodBG_multiband_OFFLINE_burn_in(Listsegments, re_sfchs, IFO, outputpath, outputfilename, number_process, sigma, PlusHOFT=’False’, duration_max=15, trial_duration_sample=20)

Parameters
  • Listsegments – a list of segment parameters

  • re_sfchs – a list of safe channels

  • outputpath – a directory of an output file

  • outputfilename – a name of an output file

  • number_process – a number of processes in parallel

  • PlusHOFT – whether to get data of hoft, {‘True’ or ‘False’}, ‘False’ in default

  • sigma – an integer to determine the upper bound of the off-source window

  • duration_max – a maximum value of length of dummy on-source window in sec

  • trial_duration_sample – a number of dummy on-source window within the total on-source window

Returns

None

origli.utilities.burn_in_utilities.cal_importance_single_channel_singl_freqband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, DURATION, poly_para, LowerCutOffFreq, UpperCutOffFreq)[source]
description:

calculate the importance for a single channel in a given frequency band

USAGE: Count = cal_importance_single_channel_singl_freqband_from_bg_up_bd_prior(whitened_fft_target, sampling_rate, DURATION, poly_para, LowerCutOffFreq, UpperCutOffFreq)

Parameters
  • whitened_fft_target – on-source window normalized spectrum

  • sampling_rate – sample rate

  • DURATION – duration of on-source window

  • poly_para – polynomial parameters of the fit of the background upper threshold as a function of on-source window length

  • LowerCutOffFreq

  • UpperCutOffFreq

Returns

Count: the value of importance for the channel in the given frequency band

origli.utilities.burn_in_utilities.clean_duration_for_asd(duration)[source]
description:

This function is used to clean the decimal points in the value of duration, to avoid the error raised by the ASD estimator in gwpy

Parameters

duration – duration in sec

Returns

duration: cleaned duration in sec
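One plausible cleaning policy, shown purely as an illustration (the actual rounding rule used by the package is not documented here, and the sampling rate and precision below are assumptions), is to snap the duration to an integer number of samples and round the result:

```python
def clean_duration(duration, sampling_rate=16384, decimals=4):
    """Snap a duration to an integer number of samples and round away
    stray decimal digits.  Illustrative policy only; sampling_rate and
    decimals are assumptions, not the package's defaults."""
    n_samples = round(duration * sampling_rate)
    return round(n_samples / sampling_rate, decimals)
```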

origli.utilities.burn_in_utilities.make_bg_up_thesh_interpolate_hdf5(input_dir, input_hdf5_file)[source]
description:

make a dictionary of values of the background upper threshold across dummy on-source windows and frequency bands per channel

USAGE: list_all_dummy_duration_sorted, mat_ch_sorted_dict = make_bg_up_thesh_interpolate_hdf5(input_dir, input_hdf5_file)

Parameters
  • input_dir – input directory

  • input_hdf5_file – name of an HDF5 file

Returns

list_all_dummy_duration_sorted: an ascending list of dummy on-source windows
mat_ch_sorted_dict: a dictionary in which each key contains an array of values of the background upper threshold per dummy on-source window per frequency band, where frequency bands are rows from top to bottom and dummy on-source windows are columns from left to right

origli.utilities.burn_in_utilities.save_interpolate_bg_upper_thres_hdf5(list_all_dummy_duration_sorted, mat_ch_sorted_dict, output_dir, output_hdf5_file, med_abs_sigma=6, poly_degree=10)[source]
description:
  1. fit the polynomial function against the background upper threshold as a function of dummy on-source window per freq band per channel

  2. save the polynomial parameters to an HDF5 file

USAGE: save_interpolate_bg_upper_thres_hdf5(list_all_dummy_duration_sorted, mat_ch_sorted_dict, output_dir, output_hdf5_file, med_abs_sigma=6, poly_degree=10)

Parameters
  • list_all_dummy_duration_sorted – ascending list of dummy on-source windows

  • mat_ch_sorted_dict – a dictionary in which each key contains an array of values of the background upper threshold per dummy on-source window per frequency band, where frequency bands are rows from top to bottom and dummy on-source windows are columns from left to right

  • output_dir – output directory

  • output_hdf5_file – output file name

  • med_abs_sigma – an integer number of median absolute errors used to remove outliers for the fitting

  • poly_degree – polynomial degree

Returns

None
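The MAD-based outlier rejection plus polynomial fit described above can be sketched with numpy; the function name and interface below are hypothetical, not the package's:

```python
import numpy as np

def fit_bg_upper_threshold(durations, thresholds, med_abs_sigma=6,
                           poly_degree=10):
    """Fit a polynomial to the background upper threshold as a function
    of dummy on-source window length, discarding points farther than
    med_abs_sigma median-absolute-deviations from the median (sketch)."""
    x = np.asarray(durations, dtype=float)
    y = np.asarray(thresholds, dtype=float)
    dev = np.abs(y - np.median(y))
    mad = np.median(dev)
    keep = dev <= med_abs_sigma * max(mad, np.finfo(float).tiny)
    degree = min(poly_degree, keep.sum() - 1)   # polyfit needs enough points
    return np.polyfit(x[keep], y[keep], degree)
```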