pyflowbat package

Submodules

pyflowbat.gating module

pyflowbat.gating.find_percentile(workspace: pyflowbat.pyflowbat.Workspace, sample_collection_name: str, sample_name: str, channel_name: str, percentile: float, **kwargs) → float

Finds the specified percentile of the data in a given channel of a PyFlowBAT sample.

Parameters
  • workspace (pyflowbat.pyflowbat.Workspace) – the PyFlowBAT Workspace containing the sample whose percentile is desired.

  • sample_collection_name (str) – the name of the sample collection containing the sample whose percentile is desired

  • sample_name (str) – the name of the sample whose percentile is desired

  • channel_name (str) – the name of the channel whose percentile is desired

  • percentile (float) – the desired percentile

Returns

the desired percentile of the data in the specified channel of the PyFlowBAT sample

Return type

float
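
As a rough illustration of what a percentile lookup on one channel involves, the sketch below computes a percentile over a plain list of event values using linear interpolation between ranks. It does not call PyFlowBAT; the percentile helper, the 0 to 100 scale, and the interpolation rule are assumptions made for this example and may differ from what find_percentile does internally.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100 scale, an assumption) of values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    # Fractional rank into the sorted data, interpolated linearly.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical event values for a single channel of a single sample.
channel_events = [120.0, 80.0, 200.0, 150.0, 100.0]
p50 = percentile(channel_events, 50)
p90 = percentile(channel_events, 90)
```

A value found this way is the kind of number that could be passed as the high or low bound of a threshold gate.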

pyflowbat.gating.gate_heks(data_to_gate: dict[str, FlowCal.io.FCSData], limits: dict[str, list[float]], method: str = 'same', samples: Optional[list[str]] = None, **kwargs) → dict[str, FlowCal.io.FCSData]

Gates all samples in a collection for HEK 293FT cells by finding the most extreme valleys in the FSC-A and SSC-A channels.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • limits (dict[str, list[float]]) – the limits of the FSC-A and SSC-A channels; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user; to change its value, change the value of limits within the Workspace in which this gate is being applied

  • method (str) – whether to use a unique gate for each sample in the collection or the same gate for all samples; options are “unique” or “same”, defaults to “same”

  • samples (Optional[list[str]]) – the samples to use to define the gate if the “same” method is used, defaults to None

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

pyflowbat.gating.gate_high_low(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_name: str, high: Optional[float] = None, low: Optional[float] = None, **kwargs) → dict[str, FlowCal.io.FCSData]

Gates all samples in a collection within an upper and lower bound in a specified channel.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • gating_channel_name (str) – the name of the channel on which to gate the samples

  • high (Optional[float]) – the upper bound

  • low (Optional[float]) – the lower bound

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection
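
To make the idea of a high/low gate concrete, the sketch below applies an upper and lower bound to every sample in a collection. It is not PyFlowBAT code: samples are modelled as plain lists of dicts instead of FlowCal.io.FCSData, and the choices that bounds are inclusive and that None disables a bound are assumptions made for illustration.

```python
def gate_high_low_sketch(samples, channel, high=None, low=None):
    """Keep events whose value in `channel` lies within [low, high].

    `samples` maps sample names to lists of events; each event is a dict
    of channel name -> value (a stand-in for FCS data).
    """
    gated = {}
    for name, events in samples.items():
        kept = []
        for event in events:
            value = event[channel]
            if low is not None and value < low:
                continue  # below the lower bound
            if high is not None and value > high:
                continue  # above the upper bound
            kept.append(event)
        gated[name] = kept
    return gated


# Hypothetical two-sample collection with one channel.
collection = {
    "sample_a": [{"FITC-A": 50.0}, {"FITC-A": 500.0}, {"FITC-A": 5000.0}],
    "sample_b": [{"FITC-A": 10.0}, {"FITC-A": 900.0}],
}
gated = gate_high_low_sketch(collection, "FITC-A", high=1000.0, low=20.0)
```

Every sample is filtered with the same bounds, and the result has the same keys as the input collection.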

pyflowbat.gating.gate_singlets(data_to_gate: dict[str, FlowCal.io.FCSData], a: float = 10000000000, b: float = 20000, **kwargs) → dict[str, FlowCal.io.FCSData]

Gates all samples in a collection for singlets by drawing a tilted ellipse.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • a (float) – the long axis of the ellipse surrounding the singlets, defaults to 1*10**10

  • b (float) – the short axis of the ellipse surrounding the singlets, defaults to 2*10**4

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

pyflowbat.operations module

pyflowbat.operations.apply_conversion_factor(x: float, factor: float, **kwargs) → float

Given a number x and a conversion factor, returns x multiplied by the factor.

Parameters
  • x (float) – the value to convert

  • factor (float) – the conversion factor to apply to x

Returns

x multiplied by the conversion factor

Return type

float

pyflowbat.operations.channel_mean(data: FlowCal.io.FCSData, channel_name: str, **kwargs) → float

Computes the mean value of a given channel of FCS data.

Parameters
  • data (FlowCal.io.FCSData) – the FCS data whose mean is desired

  • channel_name (str) – the name of the channel whose mean is desired

Returns

the mean value of the specified channel of the specified FCS sample

Return type

float

pyflowbat.operations.compute_conversion_factor_stdErr(x: float, factor: float, factor_err: float, **kwargs) → float

Computes the standard error of a value after multiplication by a conversion factor.

Parameters
  • x (float) – the value being converted

  • factor (float) – the conversion factor being applied

  • factor_err (float) – the standard error of the conversion factor

Returns

the standard error of the conversion

Return type

float
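
The two conversion helpers can be pictured with the sketch below. The error formula is an assumption: it treats x as exact and the conversion factor as the only uncertain quantity, so the standard error of x * factor is |x| * factor_err. The source does not state which formula compute_conversion_factor_stdErr uses, and the function names here are stand-ins.

```python
def apply_conversion_factor_sketch(x, factor):
    """Return x multiplied by the conversion factor."""
    return x * factor


def conversion_std_err_sketch(x, factor, factor_err):
    """Standard error of x * factor, ASSUMING x itself has no error.

    Under that assumption d(x * factor)/d(factor) = x, so the error is
    |x| * factor_err; `factor` is accepted only to mirror the documented
    signature and is not used.
    """
    return abs(x) * factor_err


# Hypothetical values: a measured mean and a calibration factor.
converted = apply_conversion_factor_sketch(1500.0, 2.5)
converted_err = conversion_std_err_sketch(1500.0, 2.5, 0.1)
```

If x also carried its own standard error, a fuller propagation would add a second term for it; that case is outside this sketch.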

pyflowbat.operations.non_negative(x: float, **kwargs) → float

Given a number x, returns x if x is non-negative and 0 otherwise.

Parameters

x (float) – a number

Returns

x if x is non-negative and 0 otherwise

Return type

float
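
The behaviour described above amounts to clamping at zero. A minimal stand-in, written without PyFlowBAT and ignoring the extra keyword arguments the real function accepts:

```python
def non_negative_sketch(x):
    """Return x if it is non-negative, otherwise 0."""
    return x if x >= 0 else 0.0


# Hypothetical background-subtracted values; negatives are clamped.
clamped = [non_negative_sketch(v) for v in [12.5, -3.0, 0.0]]
```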

pyflowbat.operations.split_sample_name(name: str, by: str, index: int, **kwargs) → str

Splits a string by a character and returns the piece at the given index.

Parameters
  • name (str) – the string to split

  • by (str) – the character to split by

  • index (int) – the index of the split string to return

Returns

the string at the specified index of the provided string split by the specified character

Return type

str
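
A short sketch of this kind of name splitting, assuming it behaves like Python's built-in str.split followed by indexing (the source does not say how an out-of-range index or a multi-character separator is handled). The sample name used is hypothetical.

```python
def split_sample_name_sketch(name, by, index):
    """Split `name` on `by` and return the piece at position `index`."""
    return name.split(by)[index]


# Hypothetical sample name encoding cell line, reporter, and replicate.
reporter = split_sample_name_sketch("HEK_GFP_rep2", "_", 1)
replicate = split_sample_name_sketch("HEK_GFP_rep2", "_", -1)
```

Used this way, a field embedded in each sample name can become its own statistic, for example a condition label to group replicates by.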

pyflowbat.pyflowbat module

class pyflowbat.pyflowbat.StatisticsExtraction(sample_collection_name: str, statistics_collection_name: str, include: list[str], not_include: list[str])

Bases: object

A class containing a rule for extracting statistics from a sample collection.

Parameters
  • sample_collection_name (str) – the name of the sample collection in the workspace from which to extract the statistics

  • statistics_collection_name (str) – the name of the statistics collection to create/extract statistics into

  • include (list[str]) – the list of keywords that must be present in a sample’s name for statistics to be extracted from it; all keywords must be present

  • not_include (list[str]) – the list of keywords that must not be present in a sample’s name for statistics to be extracted from it; none of the keywords may be present
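
The include / not_include rule can be sketched as a simple name filter. The helper below is hypothetical and assumes case-sensitive substring matching, which the source does not confirm.

```python
def name_matches(name, include, not_include):
    """True if every `include` keyword and no `not_include` keyword is in `name`."""
    has_all = all(keyword in name for keyword in include)
    has_none = not any(keyword in name for keyword in not_include)
    return has_all and has_none


# Hypothetical sample names.
names = ["HEK_GFP_rep1", "HEK_GFP_rep2", "HEK_mCherry_rep1", "beads_GFP"]
selected = [n for n in names if name_matches(n, ["HEK", "GFP"], ["rep2"])]
```

With empty include and not_include lists, every name matches.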

class pyflowbat.pyflowbat.Workspace(stylesheet: dict = {'axes.axisbelow': True, 'axes.edgecolor': 'B0B0B0', 'axes.facecolor': 'white', 'axes.grid': True, 'axes.labelsize': 'large', 'axes.linewidth': 3.0, 'axes.prop_cycle': cycler('color', ['#0267c1', '#efa00b', '#00af54', '#e2cfea', '#d65108', '#6915E0', '#3DA5D9', '#FEC601', '#097109', '#EA7317', '#AF0BA5']), 'axes.spines.bottom': False, 'axes.spines.left': False, 'axes.spines.right': False, 'axes.spines.top': False, 'axes.titlesize': 'x-large', 'figure.facecolor': 'white', 'figure.subplot.bottom': 0.07, 'figure.subplot.left': 0.08, 'figure.subplot.right': 0.95, 'font.size': 14.0, 'grid.color': '#B0B0B0', 'grid.linestyle': ':', 'grid.linewidth': 1.0, 'legend.fancybox': True, 'lines.linewidth': 4, 'lines.solid_capstyle': 'butt', 'patch.edgecolor': 'white', 'patch.linewidth': 0.5, 'savefig.edgecolor': 'white', 'savefig.facecolor': 'white', 'svg.fonttype': 'path', 'xtick.major.size': 0, 'xtick.minor.size': 0, 'ytick.major.size': 0, 'ytick.minor.size': 0}, lims_file: str = '_std', full_output: bool = False)

Bases: object

A class describing PyFlowBAT workspaces. Workspaces contain all the methods for operating on batches of PyFlowBAT data and contain all samples and statistics.

Parameters
  • stylesheet (dict) – the stylesheet for plotting PyFlowBAT analyzed data, defaults to the PyFlowBAT standard stylesheet

  • lims_file (str) – the path to the CSV file defining upper and lower limits for standard PyFlowBAT gating functions

  • full_output (bool) – whether or not to display all output of PyFlowBAT operations; it is HIGHLY recommended to leave this at the default value; NOTE: the class signature gives the default as False

apply_compensation_matrix(sample_collection_name: str, new_sample_collection_name: str) → None

Applies the workspace compensation matrix to one sample collection, creating another. NOTE: the workspace’s calculate_compensation_matrix method must be run before this method may be run.

Parameters
  • sample_collection_name (str) – the name of the sample collection to compensate

  • new_sample_collection_name (str) – the name of the new sample collection to create for the compensated samples

apply_gate(sample_collection_name: str, new_sample_collection_name: str, gating_function: Callable, output_plots: int = 5, gating_channel_names: list[str] = ['FSC-A', 'SSC-A'], output_override: Optional[bool] = None, **kwargs) → None

Applies a specified gate to a sample collection.

Parameters
  • sample_collection_name (str) – the name of the sample collection to gate

  • new_sample_collection_name (str) – the name of the new sample collection to create for the gated samples

  • gating_function (function/Callable) – the function defining the gate to apply

  • output_plots (int) – the number of plots to visualize if full output is true, defaults to 5

  • gating_channel_names (list[str]) – the channels to visualize; if the list has more than one element, the first two elements are used; if it has one element, that value is used twice; defaults to [“FSC-A”, “SSC-A”]

  • output_override (Optional[bool]) – if not None, workspace will use this instead of its full_output variable, defaults to None

  • **kwargs – keywords to pass to the gating function
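
The gating functions in this package note that data_to_gate is supplied by apply_gate and not by the user. The sketch below illustrates that calling pattern with a hypothetical MiniWorkspace that stores collections as plain dicts. It is not the PyFlowBAT Workspace, which also handles limits, plotting, and R state; the keep_above gate and plot_channel_pair helper are likewise invented for this example.

```python
class MiniWorkspace:
    """Hypothetical stand-in for a workspace holding named sample collections."""

    def __init__(self):
        self.sample_collections = {}

    def apply_gate(self, sample_collection_name, new_sample_collection_name,
                   gating_function, **kwargs):
        # The workspace looks up the collection and passes it as the first
        # argument; user keywords are forwarded unchanged.
        data_to_gate = self.sample_collections[sample_collection_name]
        gated = gating_function(data_to_gate, **kwargs)
        self.sample_collections[new_sample_collection_name] = gated


def keep_above(data_to_gate, channel, minimum, **kwargs):
    """Custom gate: keep events whose `channel` value is at least `minimum`."""
    return {
        name: [event for event in events if event[channel] >= minimum]
        for name, events in data_to_gate.items()
    }


def plot_channel_pair(gating_channel_names):
    """Apply the documented rule: first two names, or one name used twice."""
    if len(gating_channel_names) > 1:
        return gating_channel_names[0], gating_channel_names[1]
    return gating_channel_names[0], gating_channel_names[0]


workspace = MiniWorkspace()
workspace.sample_collections["raw"] = {
    "s1": [{"FSC-A": 100.0}, {"FSC-A": 900.0}],
}
workspace.apply_gate("raw", "gated", keep_above, channel="FSC-A", minimum=500.0)
```

The original collection is left in place and the gated result is stored under the new name, mirroring the sample_collection_name / new_sample_collection_name pair in the signature.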

apply_operation(statistics_collection_name: str, new_statistics_collection_name: str, statistic_name: Union[str, list[str]], new_statistic_name: str, operation: Callable, **kwargs) → None

Applies a specified operation to a statistic in a statistics collection.

Parameters
  • statistics_collection_name (str) – the name of the statistics collection to operate on

  • new_statistics_collection_name (str) – the name of the new statistics collection to create for the post-operation statistics

  • statistic_name (Union[str, list[str]]) – the name of the statistic to operate on; can be a single statistic or multiple; only statistics specified here will be used in the operation

  • new_statistic_name (str) – the name of the statistic to be created from the operation

  • operation (function/Callable) – the function defining the operation to apply

  • **kwargs – keywords to pass to the operation

calculate_beads_factors(beads_file_file_path: str, beads_fluorescent_channels: list[tuple[str, str]], beads_num_pops: int, beads_conversions_file: str = '_std') → None

Calculates the conversion factors for this workspace from a specified beads file.

Parameters
  • beads_file_file_path (str) – the path to the beads file to calculate conversion factors from

  • beads_fluorescent_channels (list[tuple[str, str]]) – a list of tuples containing the names of the fluorescent channels and the corresponding MEFs; follows the pattern of: [(CHANNEL, MEF), (CHANNEL, MEF),…]

  • beads_num_pops (int) – the number of beads populations in the beads file to calculate conversion factors from

  • beads_conversions_file (str) – the path to the file specifying the manufacturer’s bead conversion values, defaults to the PyFlowBAT standard bead values (Spherotech RCP-30-5 Rainbow Calibration Beads)

calculate_compensation_matrix(sample_collection_name: str, compensation_sample_names: str, compensation_channel_names: list[str], threshold: int = 0.0001, compensation_rate: float = 0.1) → None

Calculates a compensation matrix by repeatedly flattening to zero the best fit line through the specified compensation sample data. Flattening stops once the best fit line has a slope under the provided threshold. NOTE: at present, only 2- and 3-sample compensation matrices can be calculated. NOTE: at present, flattening continues for all compensation samples as long as any compensation sample has a best fit slope greater than the threshold. NOTE: this method must always be run before the workspace’s apply_compensation_matrix method.

Parameters
  • sample_collection_name (str) – the name of the collection from which to find the samples for compensation matrix calculation

  • compensation_sample_names (list[str]) – the names of the samples used for compensation

  • compensation_channel_names (list[str]) – the channel names to be compensated

  • threshold (float) – the threshold to flatten the best fit lines to, defaults to 10**-4

  • compensation_rate (float) – the rate at which to calculate the compensation matrix, defaults to 0.1
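
The flattening idea can be illustrated for a single pair of channels. The sketch below is one possible reading of the description, not PyFlowBAT's implementation: it assumes a single-stain sample with a primary channel and one spillover channel, and it interprets compensation_rate as the fraction of the remaining best fit slope removed per iteration. All function and variable names are hypothetical.

```python
def best_fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


def flatten_spillover(primary, spill, threshold=1e-4, rate=0.1, max_iter=10000):
    """Grow a spillover coefficient until the corrected slope is under threshold."""
    coefficient = 0.0
    corrected = list(spill)
    for _ in range(max_iter):
        slope = best_fit_slope(primary, corrected)
        if abs(slope) < threshold:
            break  # the best fit line is flat enough
        # Remove a fraction (`rate`) of the remaining slope (assumed meaning).
        coefficient += rate * slope
        corrected = [s - coefficient * p for s, p in zip(spill, primary)]
    return coefficient, corrected


# Hypothetical single-stain data: 20% of the primary signal spills over.
primary = [100.0, 200.0, 400.0, 800.0, 1600.0]
spill = [0.2 * p + 5.0 for p in primary]
coefficient, corrected = flatten_spillover(primary, spill)
```

With these inputs the coefficient converges toward the 0.2 spillover fraction; a smaller rate takes more iterations, and a smaller threshold gives a flatter final line.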

combine_replicates(statistics_collection_name: str, combined_statistics_collection_name: str, combine_by: list[str], combination_operations: dict[str, Union[str, Callable]], sem_cols: list[str]) → None

Combines replicates in a statistics collection.

Parameters
  • statistics_collection_name (str) – the name of the statistics collection from which to combine replicates

  • combined_statistics_collection_name (str) – the name of the new statistics collection storing the combined statistics

  • combine_by (list[str]) – the statistics by which to combine replicates; each set of statistics is combined if all statistics here are equivalent

  • combination_operations (dict[str, Union[function/Callable, str]]) – the operations by which to combine rows for each column; each may be the name of a Pandas GroupBy function (as a string) or a callable; follows the pattern: {STATISTIC: OPERATION, STATISTIC: OPERATION…}

  • sem_cols (list[str]) – list of statistics for which the standard error of the mean of the combination should be calculated

  • **kwargs – keywords to pass to the combination operation
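
Combining replicates amounts to grouping rows that share the combine_by values and reducing each group. The sketch below shows this with the standard library only, using a mean as the combination operation and the sample standard deviation divided by sqrt(n) as the standard error of the mean. The row layout and the "_sem" column suffix are assumptions for illustration; PyFlowBAT's own statistics collections and column naming may differ.

```python
import math
import statistics
from collections import defaultdict


def combine_replicates_sketch(rows, combine_by, value_col):
    """Group rows on `combine_by` and report the mean and SEM of `value_col`."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in combine_by)
        groups[key].append(row[value_col])

    combined = []
    for key, values in groups.items():
        entry = dict(zip(combine_by, key))
        entry[value_col] = statistics.mean(values)
        # SEM needs at least two replicates; otherwise it is undefined.
        if len(values) > 1:
            sem = statistics.stdev(values) / math.sqrt(len(values))
        else:
            sem = float("nan")
        entry[value_col + "_sem"] = sem  # hypothetical column name
        combined.append(entry)
    return combined


# Hypothetical statistics rows: three replicates of A, two of B.
rows = [
    {"condition": "A", "mean_gfp": 10.0},
    {"condition": "A", "mean_gfp": 12.0},
    {"condition": "A", "mean_gfp": 14.0},
    {"condition": "B", "mean_gfp": 20.0},
    {"condition": "B", "mean_gfp": 22.0},
]
combined = combine_replicates_sketch(rows, ["condition"], "mean_gfp")
```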

create_statistic_extraction(sample_collection_name: str, statistics_collection_name: str, include: list[str], not_include: list[str], statistic_names: list[str]) → pyflowbat.pyflowbat.StatisticsExtraction

Creates a rule needed for extracting statistics from a sample collection.

Parameters
  • sample_collection_name (str) – the name of the sample collection from which to extract

  • statistics_collection_name (str) – the name of the statistics collection to create and extract statistics to

  • include (list[str]) – a list of words that must be included in the samples from which to extract statistics

  • not_include (list[str]) – a list of words that must NOT be included in the samples from which to extract statistics

  • statistic_names (list[str]) – a list of the statistics that will be extracted using this rule

Returns

the extraction rule to be used in extract_statistic

Return type

pyflowbat.pyflowbat.StatisticsExtraction

extract_statistic(extraction: pyflowbat.pyflowbat.StatisticsExtraction, statistic_name: str, operation: Callable, **kwargs) → None

Extracts a statistic for a StatisticsExtraction rule.

Parameters
  • extraction (pyflowbat.pyflowbat.StatisticsExtraction) – the StatisticsExtraction rule to use for the extraction

  • statistic_name (str) – the name of the statistic being created

  • operation (function/Callable) – the function defining the operation used for the extraction

  • **kwargs – keywords to pass to the operation

graph_statistics(data: list[list[Union[str, tuple[str, str]]]], errors: tuple[bool, bool] = (False, False), legend: Optional[list[str]] = None, title: Optional[str] = None, labels: tuple[Optional[str], Optional[str]] = (None, None), xlog: bool = False, ylog: bool = False, save: Union[bool, str] = True) → None

Plots statistics from a collection.

Parameters
  • data (list[list[Union[str, tuple[str, str]]]]) – the data to plot; follows the pattern of a list of the following: [STATISTIC_COLLECTION_NAME, X_STATISTIC_NAME, Y_STATISTIC_NAME, SPECIFICATION_TUPLEs…]; where each SPECIFICATION_TUPLE follows the pattern of: (STATISTIC, VALUE); and only rows where STATISTIC=VALUE are included in this x,y plot; an arbitrary number of SPECIFICATION_TUPLEs can be included; an arbitrary number of these lists can be included to define different scatter plots in one figure

  • errors (tuple[bool, bool]) – if x and y axis errors should be included; only one tuple can be provided for the entire figure, defaults to (False, False)

  • legend (Optional[list[str]]) – the legend for the plot, defaults to None

  • title (Optional[str]) – the title for the plot, defaults to None

  • labels (tuple[Optional[str], Optional[str]]) – the x and y labels for the plot, defaults to (None, None)

  • xlog (bool) – if the x axis should be in logarithmic scale, defaults to False

  • ylog (bool) – if the y axis should be in logarithmic scale, defaults to False

  • save (Union[bool, str]) – if the plot should be saved; can be a boolean or a string; if a string, saves to the specified file; if a boolean, saves to a file with the name of the plot title if True; does not save figure if set to False, defaults to True
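
The data pattern above can be read as "collection name, x statistic, y statistic, then any number of (STATISTIC, VALUE) filters". The sketch below unpacks one such list and applies the filters to a statistics collection modelled as a list of dicts. The collection layout, the statistic names, and the select_xy helper are hypothetical; no plotting is done here.

```python
def select_xy(rows, x_name, y_name, *specification_tuples):
    """Return x and y values from rows matching every (STATISTIC, VALUE) pair."""
    selected = [
        row for row in rows
        if all(row[statistic] == value for statistic, value in specification_tuples)
    ]
    return [row[x_name] for row in selected], [row[y_name] for row in selected]


# Hypothetical statistics collections keyed by name.
collections = {
    "stats": [
        {"dose": 1.0, "mean_gfp": 10.0, "cell_line": "HEK", "inducer": "dox"},
        {"dose": 2.0, "mean_gfp": 25.0, "cell_line": "HEK", "inducer": "dox"},
        {"dose": 2.0, "mean_gfp": 3.0, "cell_line": "HEK", "inducer": "none"},
        {"dose": 1.0, "mean_gfp": 8.0, "cell_line": "CHO", "inducer": "dox"},
    ],
}

# One entry of the `data` argument, following the documented pattern.
series = ["stats", "dose", "mean_gfp", ("cell_line", "HEK"), ("inducer", "dox")]
collection_name, x_name, y_name, *specs = series
xs, ys = select_xy(collections[collection_name], x_name, y_name, *specs)
```

Each additional list in data would be unpacked the same way to give another scatter series in the same figure.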

init_r(check_R_installation: bool = True) → None

Initializes R functionality for R language gating functions. NOTE: this method and R gating functionality both require the R programming language to be installed for features to work properly.

Parameters

check_R_installation (bool) – whether or not to check if the R language is installed; it is HIGHLY recommended to leave this at the default value: True, defaults to True

load_samples(sample_collection_name: str, samples_folder_path: str, include: list[str], not_include: list[str]) → None

Loads samples from a folder into a sample collection in the workspace.

Parameters
  • sample_collection_name (str) – the name of the sample collection into which to load samples

  • samples_folder_path (str) – the path to the folder with the samples to load

  • include (list[str]) – the list of keywords that must be present in a file’s name for it to be loaded as a sample; all keywords must be present

  • not_include (list[str]) – the list of keywords that must not be present in a file’s name for it to be loaded as a sample; none of the keywords may be present

visualize_plot_change(sample_collection_name_0: str, sample_collection_name_f: str, sample_name: str, channel_names: tuple[str, str]) → None

Visualizes the change in a sample from one collection to another.

Parameters
  • sample_collection_name_0 (str) – the name of the initial sample collection

  • sample_collection_name_f (str) – the name of the final sample collection

  • sample_name (str) – the name of the sample to visualize

  • channel_names (tuple[str, str]) – the names of the channels to visualize change in

visualize_plot_overlay(plots: list[list], colors: list[str], channel_names: tuple[str, str], sizes: Optional[list[int]] = None, markers: Optional[list[str]] = None, legend: Optional[list] = None, title: Optional[str] = None) → None

Visualizes an overlay between samples.

Parameters
  • plots (list[list]) – the data to plot

  • colors (list[str]) – the names of the colors for the overlaid plots

  • channel_names (tuple[str, str]) – the names of the two channels to plot

  • sizes (Optional[list[int]]) – the sizes of the scatters to overlay

  • markers (Optional[list[str]]) – the markers for the plot

  • legend (Optional[list]) – the legend for the plot

  • title (Optional[str]) – the title for the plot

pyflowbat.r_gating module

pyflowbat.r_gating.clust_2d_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], target: list[float], quantile: float = 0.95, K: int = 2, _r_ready: bool = False, **kwargs) → dict[str, FlowCal.io.FCSData]

PyFlowBAT wrapper for the flowClust.2d function from the R OpenCyto library. For more, please see the OpenCyto documentation.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • gating_channel_names (list[str]) – the 2 channels to use for gating

  • target (list[float]) – the 2-dimensional target around which to cluster

  • quantile (float) – the quantile provided to flowClust.2d for gating, defaults to 0.95

  • K (int) – the K provided to flowClust.2d for gating, defaults to 2

  • _r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

pyflowbat.r_gating.singlet_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], _r_ready: bool = False, **kwargs) → dict[str, FlowCal.io.FCSData]

PyFlowBAT wrapper for the singletGate function from the R OpenCyto library. For more, please see the OpenCyto documentation.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • gating_channel_names (list[str]) – the 2 channels to use for gating

  • _r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

pyflowbat.r_gating.transitional_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], target: list[float], quantile: float = 0.95, K: int = 2, translation: int = 0.15, _r_ready: bool = False, **kwargs) → dict[str, FlowCal.io.FCSData]

PyFlowBAT wrapper for the flowClust.2d translational gate function from the R OpenCyto library. For more, please see the OpenCyto documentation.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • gating_channel_names (list[str]) – the 2 channel names to use for gating

  • target (list[float]) – the 2-dimensional target around which to cluster

  • quantile (float) – the quantile provided to flowClust.2d for gating, defaults to 0.95

  • K (int) – the K provided to flowClust.2d for gating, defaults to 2

  • translation (float) – the translation provided to flowClust.2d for gating, defaults to 0.15

  • _r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

pyflowbat.r_gating_general module

pyflowbat.r_gating_general.general_r_gate(data_to_gate: dict[str, FlowCal.io.FCSData], gating_channel_names: list[str], r_file: str, r_function: str, arguments: dict, _r_ready: bool = False, **kwargs) → dict[str, FlowCal.io.FCSData]

PyFlowBAT wrapper for an arbitrary R function.

Parameters
  • data_to_gate (pyflowbat.pyflowbat.SampleCollection) – the PyFlowBAT sample collection to gate; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

  • gating_channel_names (list[str]) – the channels to use for gating

  • r_file (str) – the path to the R file with the R function to use

  • r_function (str) – the function in the R file to use

  • arguments (dict) – the arguments to pass to the R function

  • _r_ready (bool) – a boolean describing whether or not R functionality has been initialized for this Workspace; NOTE: this parameter is provided by the pyflowbat.pyflowbat.Workspace.apply_gate method and should NOT be specified by the user

Returns

the gated PyFlowBAT sample collection

Return type

pyflowbat.pyflowbat.SampleCollection

Module contents