pyflowbat package#
Submodules#
pyflowbat.gating module#
- pyflowbat.gating.find_percentile(workspace: pyflowbat.pyflowbat.Workspace, sample_collection_name: str, sample_name: str, channel_name: str, percentile: float, **kwargs) float #
Finds the specified percentile of the data in a given channel of a PyFlowBAT sample.
- Parameters
workspace (pyflowbat.pyflowbat.Workspace) – the PyFlowBAT Workspace containing the sample whose percentile is desired.
sample_collection_name (str) – the name of the sample collection containing the sample whose percentile is desired
sample_name (str) – the name of the sample whose percentile is desired
channel_name (str) – the name of the channel whose percentile is desired
percentile (float) – the desired percentile
- Returns
the desired percentile of the data in the specified channel of the PyFlowBAT sample
- Return type
float
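As a rough illustration of what a percentile of channel data means, the following minimal sketch computes a linearly interpolated percentile over a plain list of event values. The helper name `percentile_of_values`, the 0 to 100 scale, and the interpolation rule are assumptions made for illustration; this reference does not state which scale or interpolation find_percentile uses.

```python
def percentile_of_values(values, percentile):
    """Return the given percentile (0-100) of values using linear interpolation.

    Hypothetical helper for illustration; not part of PyFlowBAT.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the requested percentile within the sorted data.
    position = (len(ordered) - 1) * percentile / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Stand-in for the event values of one channel of one sample.
channel_values = [10.0, 20.0, 30.0, 40.0, 50.0]
median = percentile_of_values(channel_values, 50)
```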
- pyflowbat.gating.gate_heks(data_to_gate: dict[str, FlowCal.io.FCSData], limits: dict[str, list[float]], method: str = 'same', samples: Optional[list[str]] = None, **kwargs) dict[str, FlowCal.io.FCSData] #
Gates all samples in a collection for HEK 293FT by finding the most extreme valleys in the FSC-A and SSC-A channels.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
limits (dict[str, list[float]]) – the limits of the FSC-A and SSC-A channels; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user; to change its value, change the value of limits within the Workspace in which this gate is being applied
method (str) – whether to use a unique gate for each sample in the collection or the same gate for all samples; options are “unique” or “same”, defaults to “same”
samples (Optional[list[str]]) – the samples to use to define the gate if the “same” method is used, defaults to None
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
- pyflowbat.gating.gate_high_low(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_name: str, high: Optional[float] = None, low: Optional[float] = None, **kwargs) dict[str, FlowCal.io.FCSData] #
Gates all samples in a collection within an upper and lower bound in a specified channel.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
gating_channel_name (str) – the name of the channel to gate the samples
high (Optional[float]) – the upper bound
low (Optional[float]) – the lower bound
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
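The effect of optional upper and lower bounds can be sketched on a plain list of event values. This is a minimal sketch under stated assumptions: the helper `keep_between` is hypothetical, the bounds are treated as inclusive, and a bound of None is treated as "no bound"; this reference does not confirm how gate_high_low handles either case.

```python
from typing import Optional


def keep_between(values, high: Optional[float] = None, low: Optional[float] = None):
    """Keep values within [low, high]; a bound left as None is not applied.

    Hypothetical helper for illustration; not part of PyFlowBAT.
    """
    kept = []
    for value in values:
        if high is not None and value > high:
            continue  # above the upper bound
        if low is not None and value < low:
            continue  # below the lower bound
        kept.append(value)
    return kept


# Stand-in for the event values of the gating channel.
events = [5.0, 50.0, 500.0, 5000.0]
gated = keep_between(events, high=1000.0, low=10.0)
only_upper = keep_between(events, high=1000.0)
```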
- pyflowbat.gating.gate_singlets(data_to_gate: dict[str, FlowCal.io.FCSData], a: float = 10000000000, b: float = 20000, **kwargs) dict[str, FlowCal.io.FCSData] #
Gates all samples in a collection for singlets by drawing a tilted ellipse.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
a (float) – the long axis of the ellipse surrounding the singlets, defaults to 1*10**10
b (float) – the short axis of the ellipse surrounding the singlets, defaults to 2*10**4
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
pyflowbat.operations module#
- pyflowbat.operations.apply_conversion_factor(x: float, factor: float, **kwargs) float #
Given a number x and a conversion factor, returns x multiplied by the factor.
- Parameters
x (float) – the value to convert
factor (float) – the conversion factor to apply to x
- Returns
x multiplied by the conversion factor
- Return type
float
- pyflowbat.operations.channel_mean(data: FlowCal.io.FCSData, channel_name: str, **kwargs) float #
Computes the mean value of a given channel of FCS data.
- Parameters
data (FlowCal.io.FCSData) – the FCS data with the desired mean
channel_name (str) – the name of the channel of the desired mean
- Returns
the mean value of the specified channel of the specified FCS sample
- Return type
float
- pyflowbat.operations.compute_conversion_factor_stdErr(x: float, factor: float, factor_err: float, **kwargs) float #
Computes the standard error of the conversion factor multiplication.
- Parameters
x (float) – the value being converted
factor (float) – the conversion factor being applied
factor_err (float) – the standard error of the conversion factor
- Returns
the standard error of the conversion
- Return type
float
- pyflowbat.operations.non_negative(x: float, **kwargs) float #
Given a number x, returns x if x is non-negative and 0 otherwise.
- Parameters
x (float) – a number
- Returns
x if x is non-negative and 0 otherwise
- Return type
float
- pyflowbat.operations.split_sample_name(name: str, by: str, index: int, **kwargs) str #
Splits a string by a character and returns the piece at the specified index.
- Parameters
name (str) – the string to split
by (str) – the character to split by
index (int) – the index of the split string to return
- Returns
the string at the specified index of the provided string split by the specified character
- Return type
str
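The simpler operations in this module can be sketched directly from their descriptions. The functions below are hypothetical stand-ins (note the `_sketch` suffix), written only from the behavior described above and not taken from the PyFlowBAT source; the sample name used in the example is also made up.

```python
def apply_conversion_factor_sketch(x, factor, **kwargs):
    """Return x multiplied by the conversion factor."""
    return x * factor


def non_negative_sketch(x, **kwargs):
    """Return x if x is non-negative and 0 otherwise."""
    return x if x >= 0 else 0


def split_sample_name_sketch(name, by, index, **kwargs):
    """Split name by the given character and return the piece at index."""
    return name.split(by)[index]


converted = apply_conversion_factor_sketch(2.0, 1.5)
clipped = non_negative_sketch(-3.2)
condition = split_sample_name_sketch("HEK_dox100_rep2", by="_", index=1)
```

Each sketch accepts and ignores extra keywords through **kwargs, mirroring the documented signatures.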
pyflowbat.pyflowbat module#
- class pyflowbat.pyflowbat.StatisticsExtraction(sample_collection_name: str, statistics_collection_name: str, include: list[str], not_include: list[str])#
Bases:
object
A class containing a rule for extracting statistics from a sample collection.
- Parameters
sample_collection_name (str) – the name of the sample collection in the workspace from which to extract the statistics
statistics_collection_name (str) – the name of the statistics collection to create/extract statistics into
include (list[str]) – the list of keywords in the names of samples that must be present to extract statistics from that sample, all keywords must be present in a sample’s name for statistics to be extracted from it
not_include (list[str]) – the list of keywords in the names of samples that must not be present to extract statistics from that sample; none of these keywords may be present in a sample’s name for statistics to be extracted from it
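The include and not_include rules can be read as a simple name filter: every include keyword must appear in the name and no not_include keyword may appear. A minimal sketch, assuming case-sensitive substring matching (the helper `name_matches` and the sample names are hypothetical; the matching details are not stated in this reference):

```python
def name_matches(sample_name, include, not_include):
    """Return True if sample_name contains every include keyword and no not_include keyword."""
    has_all_required = all(keyword in sample_name for keyword in include)
    has_no_excluded = not any(keyword in sample_name for keyword in not_include)
    return has_all_required and has_no_excluded


sample_names = ["HEK_GFP_rep1.fcs", "HEK_GFP_beads.fcs", "HEK_mCherry_rep1.fcs"]
selected = [
    name for name in sample_names
    if name_matches(name, include=["HEK", "GFP"], not_include=["beads"])
]
```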
- class pyflowbat.pyflowbat.Workspace(stylesheet: dict = {'axes.axisbelow': True, 'axes.edgecolor': 'B0B0B0', 'axes.facecolor': 'white', 'axes.grid': True, 'axes.labelsize': 'large', 'axes.linewidth': 3.0, 'axes.prop_cycle': cycler('color', ['#0267c1', '#efa00b', '#00af54', '#e2cfea', '#d65108', '#6915E0', '#3DA5D9', '#FEC601', '#097109', '#EA7317', '#AF0BA5']), 'axes.spines.bottom': False, 'axes.spines.left': False, 'axes.spines.right': False, 'axes.spines.top': False, 'axes.titlesize': 'x-large', 'figure.facecolor': 'white', 'figure.subplot.bottom': 0.07, 'figure.subplot.left': 0.08, 'figure.subplot.right': 0.95, 'font.size': 14.0, 'grid.color': '#B0B0B0', 'grid.linestyle': ':', 'grid.linewidth': 1.0, 'legend.fancybox': True, 'lines.linewidth': 4, 'lines.solid_capstyle': 'butt', 'patch.edgecolor': 'white', 'patch.linewidth': 0.5, 'savefig.edgecolor': 'white', 'savefig.facecolor': 'white', 'svg.fonttype': 'path', 'xtick.major.size': 0, 'xtick.minor.size': 0, 'ytick.major.size': 0, 'ytick.minor.size': 0}, lims_file: str = '_std', full_output: bool = False)#
Bases:
object
A class describing PyFlowBAT workspaces. Workspaces contain all the methods for operating on batches of PyFlowBAT data and contain all samples and statistics.
- Parameters
stylesheet (dict) – the stylesheet for plotting PyFlowBAT analyzed data, defaults to the PyFlowBAT standard stylesheet
lims_file (str) – the path to the CSV file defining upper and lower limits for standard PyFlowBAT gating functions, defaults to “_std”
full_output (bool) – whether or not to display all output of PyFlowBAT operations; it is HIGHLY recommended to leave this at the default value, defaults to False
- apply_compensation_matrix(sample_collection_name: str, new_sample_collection_name: str) None #
Applies the workspace compensation matrix to one sample collection creating another. NOTE: the workspace’s calculate_compensation_matrix method must be run before this method may be run.
- Parameters
sample_collection_name (str) – the name of the sample collection to compensate
new_sample_collection_name (str) – the name of the new sample collection to create for the compensated samples
- apply_gate(sample_collection_name: str, new_sample_collection_name: str, gating_function: Callable, output_plots: int = 5, gating_channel_names: list[str] = ['FSC-A', 'SSC-A'], output_override: Optional[bool] = None, **kwargs) None #
Applies a specified gate to a sample collection.
- Parameters
sample_collection_name (str) – the name of the sample collection to gate
new_sample_collection_name (str) – the name of the new sample collection to create for the gated samples
gating_function (function/Callable) – the function defining the gate to apply
output_plots (int) – the number of plots to visualize if full output is true, defaults to 5
gating_channel_names (list[str]) – the channels to visualize; if the list has more than one element, the first two elements are used; if it has one element, that value is used twice; defaults to [“FSC-A”, “SSC-A”]
output_override (Optional[bool]) – if not None, workspace will use this instead of its full_output variable, defaults to None
**kwargs – keywords to pass to the gating function
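The gating functions documented in this reference share one shape: the sample collection comes first, extra keywords are accepted through **kwargs, and the gated collection is returned. A custom function passed as gating_function could plausibly follow the same shape. The sketch below is an illustration under assumptions: `gate_above` and its `threshold` keyword are hypothetical, plain lists of floats stand in for FlowCal.io.FCSData, and the final direct call only imitates what apply_gate is assumed to do, since this reference does not show how apply_gate invokes the function.

```python
def gate_above(data_to_gate, threshold, **kwargs):
    """Keep only events greater than or equal to threshold in every sample.

    Hypothetical custom gate. **kwargs absorbs any extra keywords the caller
    forwards (for example limits), so unused keywords do not raise an error.
    """
    return {
        sample_name: [event for event in events if event >= threshold]
        for sample_name, events in data_to_gate.items()
    }


# Stand-in sample collection: sample name -> event values of one channel.
collection = {"sample_a": [1.0, 12.0, 30.0], "sample_b": [0.5, 8.0]}

# Stand-in for the call that apply_gate is assumed to make on the user's behalf.
gated = gate_above(collection, threshold=10.0, limits={"FSC-A": [0.0, 1.0]})
```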
- apply_operation(statistics_collection_name: str, new_statistics_collection_name: str, statistic_name: Union[str, list[str]], new_statistic_name: str, operation: Callable, **kwargs) None #
Applies a specified operation to a statistic in a statistics collection.
- Parameters
statistics_collection_name (str) – the name of the statistics collection to operate on
new_statistics_collection_name (str) – the name of the new statistics collection to create for the post-operation statistics
statistic_name (Union[str, list[str]]) – the name of the statistic to operate on; can be a single statistic or multiple; only statistics specified here will be used in the operation
new_statistic_name (str) – the name of the statistic to be created from the operation
operation (function/Callable) – the function defining the operation to apply
**kwargs – keywords to pass to the operation
- calculate_beads_factors(beads_file_file_path: str, beads_fluorescent_channels: list[tuple[str, str]], beads_num_pops: int, beads_conversions_file: str = '_std') None #
Calculates the conversion factors for this workspace from a specified beads file.
- Parameters
beads_file_file_path (str) – the path to the beads file to calculate conversion factors from
beads_fluorescent_channels (list[tuple[str, str]]) – a list of tuples containing the names of the fluorescent channels and the corresponding MEFs; follows the pattern of: [(CHANNEL, MEF), (CHANNEL, MEF),…]
beads_num_pops (int) – the number of beads populations in the beads file to calculate conversion factors from
beads_conversions_file (str) – the path to the file specifying the manufacturer’s bead conversion values, defaults to the PyFlowBAT standard beads values, i.e. those for Spherotech RCP-30-5 Rainbow Calibration Beads
- calculate_compensation_matrix(sample_collection_name: str, compensation_sample_names: str, compensation_channel_names: list[str], threshold: int = 0.0001, compensation_rate: float = 0.1) None #
Calculates a compensation matrix by repeatedly flattening to zero the best fit line through the specified compensation sample data. Flattening stops once the best fit line has a slope under the provided threshold. NOTE: at present, only 2- and 3-sample compensation matrices can be calculated. NOTE: at present, flattening continues for all compensation samples as long as any compensation sample has a best fit slope greater than the threshold. NOTE: this method must always be run before the workspace’s apply_compensation_matrix method.
- Parameters
sample_collection_name (str) – the name of the collection from which to find the samples for compensation matrix calculation
compensation_sample_names (list[str]) – the names of the samples used for compensation
compensation_channel_names (list[str]) – the channel names to be compensated
threshold (float) – the threshold to flatten the best fit lines to, defaults to 10**-4
compensation_rate (float) – the rate at which to calculate the compensation matrix, defaults to 0.1
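The idea of repeatedly flattening a best fit line until its slope falls under a threshold can be sketched for a single pair of channels. This is a minimal sketch, not PyFlowBAT's implementation: the update rule (add `rate` times the current slope to a spillover coefficient), the helper names, and the synthetic data are all assumptions made for illustration.

```python
def best_fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


def flatten_spillover(primary, secondary, threshold=1e-4, rate=0.1, max_iterations=10_000):
    """Grow a spillover coefficient until the compensated slope is under threshold.

    Hypothetical two-channel illustration of the flattening loop.
    """
    coefficient = 0.0
    compensated = list(secondary)
    for _ in range(max_iterations):
        # Remove the currently estimated spillover of primary into secondary.
        compensated = [s - coefficient * p for p, s in zip(primary, secondary)]
        slope = best_fit_slope(primary, compensated)
        if abs(slope) < threshold:
            break  # the best fit line is flat enough
        coefficient += rate * slope  # take a partial step toward a flat line
    return coefficient, compensated


# Synthetic single-color control: about 20% of the primary signal spills over.
primary = [100.0 * i for i in range(1, 11)]
secondary = [0.2 * p + (3.0 if i % 2 else -3.0) for i, p in enumerate(primary)]
coefficient, compensated = flatten_spillover(primary, secondary)
```

A smaller rate takes more iterations to converge, while a smaller threshold leaves less residual slope in the compensated data.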
- combine_replicates(statistics_collection_name: str, combined_statistics_collection_name: str, combine_by: list[str], combination_operations: dict[str, Union[str, Callable]], sem_cols: list[str]) None #
Combines replicates in a statistics collection.
- Parameters
statistics_collection_name (str) – the name of the statistics collection from which to combine replicates
combined_statistics_collection_name (str) – the name of the new statistics collection storing the combined statistics
combine_by (list[str]) – the statistics by which to combine replicates; rows are combined when they have equal values for every statistic listed here
combination_operations (dict[str, Union[function/Callable, str]]) – the operations by which to combine rows for each column; each operation may be the name of a Pandas GroupBy function or a function; follows the pattern: {STATISTIC: OPERATION, STATISTIC: OPERATION…}
sem_cols (list[str]) – list of statistics for which the standard error of the mean of the combination should be calculated
**kwargs – keywords to pass to the combination operation
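Combining replicates amounts to grouping rows that share the combine_by values and reducing each group, optionally with a standard error of the mean. The sketch below uses only the standard library and plain dictionaries; the helper name, the `_sem` column suffix, and the choice of the sample standard deviation divided by the square root of n are assumptions for illustration, not details taken from PyFlowBAT.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def combine_replicates_sketch(rows, combine_by, value_column):
    """Average value_column over rows that agree on every combine_by statistic."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in combine_by)
        groups[key].append(row[value_column])

    combined = []
    for key, values in groups.items():
        entry = dict(zip(combine_by, key))
        entry[value_column] = mean(values)
        # SEM = sample standard deviation / sqrt(n); undefined for one replicate.
        entry[value_column + "_sem"] = (
            stdev(values) / sqrt(len(values)) if len(values) > 1 else None
        )
        combined.append(entry)
    return combined


rows = [
    {"condition": "dox", "GFP": 10.0},
    {"condition": "dox", "GFP": 14.0},
    {"condition": "none", "GFP": 2.0},
    {"condition": "none", "GFP": 4.0},
]
combined = combine_replicates_sketch(rows, combine_by=["condition"], value_column="GFP")
```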
- create_statistic_extraction(sample_collection_name: str, statistics_collection_name: str, include: list[str], not_include: list[str], statistic_names: list[str]) pyflowbat.pyflowbat.StatisticsExtraction #
Creates a rule needed for extracting statistics from a sample collection.
- Parameters
sample_collection_name (str) – the name of the sample collection from which to extract
statistics_collection_name (str) – the name of the statistics collection to create and extract statistics to
include (list[str]) – a list of words that must be included in the samples from which to extract statistics
not_include (list[str]) – a list of words that must NOT be included in the samples from which to extract statistics
statistic_names (list[str]) – a list of the statistics that will be extracted using this rule
- Returns
the extraction rule to be used in extract_statistic
- Return type
pyflowbat.pyflowbat.StatisticsExtraction
- extract_statistic(extraction: pyflowbat.pyflowbat.StatisticsExtraction, statistic_name: str, operation: Callable, **kwargs) None #
Extracts a statistic for a StatisticsExtraction rule.
- Parameters
extraction (pyflowbat.pyflowbat.StatisticsExtraction) – the StatisticsExtraction rule to use for the extraction
statistic_name (str) – the name of the statistic being created
operation (function/Callable) – the function defining the operation used to extract the statistic
**kwargs – keywords to pass to the operation
- graph_statistics(data: list[list[Union[str, tuple[str, str]]]], errors: tuple[bool, bool] = (False, False), legend: Optional[list[str]] = None, title: Optional[str] = None, labels: tuple[Optional[str], Optional[str]] = (None, None), xlog: bool = False, ylog: bool = False, save: Union[bool, str] = True) None #
Plots statistics from a collection.
- Parameters
data (list[list[Union[str, tuple[str, str]]]]) – the data to plot; follows the pattern of a list of the following: [STATISTIC_COLLECTION_NAME, X_STATISTIC_NAME, Y_STATISTIC_NAME, SPECIFICATION_TUPLEs…]; where each SPECIFICATION_TUPLE follows the pattern of: (STATISTIC, VALUE); and only rows where STATISTIC=VALUE are included in this x,y plot; an arbitrary number of SPECIFICATION_TUPLEs can be included; an arbitrary number of these lists can be included to define different scatter plots in one figure
errors (tuple[bool, bool]) – if x and y axis errors should be included; only one tuple can be provided for the entire figure, defaults to (False, False)
legend (Optional[list[str]]) – the legend for the plot, defaults to None
title (Optional[str]) – the title for the plot, defaults to None
labels (tuple[Optional[str], Optional[str]]) – the x and y labels for the plot, defaults to (None, None)
xlog (bool) – if the x axis should be in logarithmic scale, defaults to False
ylog (bool) – if the y axis should be in logarithmic scale, defaults to False
save (Union[bool, str]) – if the plot should be saved; can be a boolean or a string; if a string, saves to the specified file; if a boolean, saves to a file with the name of the plot title if True; does not save figure if set to False, defaults to True
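The data pattern described above can be easier to read with a concrete entry. The sketch below builds one such entry and applies its SPECIFICATION_TUPLEs as row filters over a list of dictionaries. The collection, statistic names, and values are made up, and the helper `select_points` is a hypothetical illustration of the "only rows where STATISTIC=VALUE" rule rather than PyFlowBAT code.

```python
def select_points(rows, x_name, y_name, specifications):
    """Return (x, y) pairs from rows where every (STATISTIC, VALUE) pair matches."""
    points = []
    for row in rows:
        if all(row.get(statistic) == value for statistic, value in specifications):
            points.append((row[x_name], row[y_name]))
    return points


# One entry of the documented pattern:
# [STATISTIC_COLLECTION_NAME, X_STATISTIC_NAME, Y_STATISTIC_NAME, SPECIFICATION_TUPLEs...]
plot_entry = ["combined_stats", "dox_conc", "GFP_mean", ("cell_line", "HEK"), ("plasmid", "pA")]
collection_name, x_name, y_name, *specifications = plot_entry

# Stand-in for the rows of the named statistics collection.
rows = [
    {"cell_line": "HEK", "plasmid": "pA", "dox_conc": 0.0, "GFP_mean": 5.0},
    {"cell_line": "HEK", "plasmid": "pA", "dox_conc": 100.0, "GFP_mean": 80.0},
    {"cell_line": "HEK", "plasmid": "pB", "dox_conc": 100.0, "GFP_mean": 40.0},
    {"cell_line": "CHO", "plasmid": "pA", "dox_conc": 100.0, "GFP_mean": 60.0},
]
points = select_points(rows, x_name, y_name, specifications)
```

Passing several such entries in the outer list would define several scatter plots in one figure, as described for the data parameter.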
- init_r(check_R_installation: bool = True) None #
Initializes R functionality for R language gating functions. NOTE: this method and R gating functionality both require the R programming language to be installed for features to work properly.
- Parameters
check_R_installation (bool) – whether or not to check if the R language is installed; it is HIGHLY recommended to leave this at the default value: True, defaults to True
- load_samples(sample_collection_name: str, samples_folder_path: str, include: list[str], not_include: list[str]) None #
Loads samples from a folder into a sample collection in the workspace.
- Parameters
sample_collection_name (str) – the name of the sample collection into which to load samples
samples_folder_path (str) – the path to the folder with the samples to load
include (list[str]) – the list of keywords in the names of files that must be present to load, all keywords must be present in a file’s name to load as a sample
not_include (list[str]) – the list of keywords in the names of files that must not be present to load; none of these keywords may be present in a file’s name for it to be loaded as a sample
- visualize_plot_change(sample_collection_name_0: str, sample_collection_name_f: str, sample_name: str, channel_names: tuple[str, str]) None #
Visualizes the change in a sample from one collection to another.
- Parameters
sample_collection_name_0 (str) – the name of the initial sample collection
sample_collection_name_f (str) – the name of the final sample collection
sample_name (str) – the name of the sample to visualize
channel_names (tuple[str, str]) – the names of the channels to visualize change in
- visualize_plot_overlay(plots: list[list], colors: list[str], channel_names: tuple[str, str], sizes: Optional[list[int]] = None, markers: Optional[list[str]] = None, legend: Optional[list] = None, title: Optional[str] = None) None #
Visualizes an overlay between samples.
- Parameters
plots (list[list]) – the data to plot
colors (list[str]) – the names of the colors for the overlaid plots
channel_names (tuple[str, str]) – the names of the channels to visualize change in
sizes (Optional[list[int]]) – the sizes of the scatters to overlay
markers (Optional[list[str]]) – the markers for the plot
legend (Optional[list]) – the legend for the plot
title (Optional[str]) – the title for the plot
pyflowbat.r_gating module#
- pyflowbat.r_gating.clust_2d_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], target: list[float], quantile: float = 0.95, K: int = 2, _r_ready: bool = False, **kwargs) dict[str, FlowCal.io.FCSData] #
PyFlowBAT wrapper for the flowClust.2d function from the R OpenCyto library. For more, please see the OpenCyto documentation.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
gating_channel_names (list[str]) – the 2 channels to use for gating
target (list[float]) – the 2-dimensional target around which to cluster
quantile (float) – the quantile provided to flowClust.2d for gating, defaults to 0.95
K (int) – the K provided to flowClust.2d for gating, defaults to 2
_r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
- pyflowbat.r_gating.singlet_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], _r_ready: bool = False, **kwargs) dict[str, FlowCal.io.FCSData] #
PyFlowBAT wrapper for the singletGate function from the R OpenCyto library. For more, please see the OpenCyto documentation.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
gating_channel_names (list[str]) – the 2 channels to use for gating
_r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
- pyflowbat.r_gating.transitional_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], target: list[float], quantile: float = 0.95, K: int = 2, translation: int = 0.15, _r_ready: bool = False, **kwargs) dict[str, FlowCal.io.FCSData] #
PyFlowBAT wrapper for the flowClust.2d translational gate function from the R OpenCyto library. For more, please see the OpenCyto documentation.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
gating_channel_names (list[str]) – the 2 channel names to use for gating
target (list[float]) – the 2-dimensional target around which to cluster
quantile (float) – the quantile provided to flowClust.2d for gating, defaults to 0.95
K (int) – the K provided to flowClust.2d for gating, defaults to 2
translation (float) – the translation provided to flowClust.2d for gating, defaults to 0.15
_r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection
pyflowbat.r_gating_general module#
- pyflowbat.r_gating_general.general_r_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], r_file: str, r_function: str, arguments: dict, _r_ready: bool = False, **kwargs) dict[str, FlowCal.io.FCSData] #
PyFlowBAT wrapper for an arbitrary R function.
- Parameters
data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
gating_channel_names (list[str]) – the channels to use for gating
r_file (str) – the path to the R file with the R function to use
r_function (str) – the function in the R file to use
arguments (dict) – the arguments to pass to the R function
_r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user
- Returns
the gated PyFlowBAT sample collection
- Return type
pyflowbat.pyflowbat.SampleCollection