
# FlowIO: Flow Cytometry Standard File Handler

## Overview

FlowIO is a lightweight Python library for reading and writing Flow Cytometry Standard (FCS) files. Parse FCS metadata, extract event data, and create new FCS files with minimal dependencies. The library supports FCS versions 2.0, 3.0, and 3.1, making it ideal for backend services, data pipelines, and basic cytometry file operations.

## When to Use This Skill

This skill should be used when:

- FCS files require parsing or metadata extraction
- Flow cytometry data needs conversion to NumPy arrays
- Event data requires export to FCS format
- Multi-dataset FCS files need to be separated
- Channel information (scatter, fluorescence, time) must be extracted
- Cytometry files need validation or inspection
- Pre-processing is required before advanced analysis

**Related Tools:** For advanced flow cytometry analysis including compensation, gating, and FlowJo/GatingML support, use the FlowKit library as a companion to FlowIO.

## Installation

```bash
uv pip install flowio
```

Requires Python 3.9 or later.

## Quick Start

### Basic File Reading

```python
from flowio import FlowData

# Read FCS file
flow_data = FlowData('experiment.fcs')

# Access basic information
print(f"FCS Version: {flow_data.version}")
print(f"Events: {flow_data.event_count}")
print(f"Channels: {flow_data.pnn_labels}")

# Get event data as NumPy array
events = flow_data.as_array()  # Shape: (events, channels)
```

### Creating FCS Files

```python
import numpy as np
from flowio import create_fcs

# Prepare data: 2 events, 3 channels
data = np.array([[100, 200, 50], [150, 180, 60]])
channels = ['FSC-A', 'SSC-A', 'FL1-A']

# Create FCS file
create_fcs('output.fcs', data, channels)
```

## Core Workflows

### Reading and Parsing FCS Files

The FlowData class provides the primary interface for reading FCS files.

**Standard Reading:**

```python
from flowio import FlowData

# Basic reading
flow = FlowData('sample.fcs')

# Access attributes
version = flow.version              # '3.0', '3.1', etc.
event_count = flow.event_count      # Number of events
channel_count = flow.channel_count  # Number of channels
pnn_labels = flow.pnn_labels        # Short channel names
pns_labels = flow.pns_labels        # Descriptive stain names

# Get event data
events = flow.as_array()                      # Preprocessed (gain, log scaling applied)
raw_events = flow.as_array(preprocess=False)  # Raw data
```

**Memory-Efficient Metadata Reading:**

When only metadata is needed (no event data):

```python
# Only parse the TEXT segment, skipping DATA and ANALYSIS
flow = FlowData('sample.fcs', only_text=True)

# Access metadata
metadata = flow.text          # Dictionary of TEXT segment keywords
print(metadata.get('$DATE'))  # Acquisition date
print(metadata.get('$CYT'))   # Instrument name
```

**Handling Problematic Files:**

Some FCS files have offset discrepancies or errors:

```python
# Ignore offset discrepancies between the HEADER and TEXT sections
flow = FlowData('problematic.fcs', ignore_offset_discrepancy=True)

# Use HEADER offsets instead of TEXT offsets
flow = FlowData('problematic.fcs', use_header_offsets=True)

# Ignore offset errors entirely
flow = FlowData('problematic.fcs', ignore_offset_error=True)
```

**Excluding Null Channels:**

```python
# Exclude specific channels during parsing
flow = FlowData('sample.fcs', null_channel_list=['Time', 'Null'])
```

### Extracting Metadata and Channel Information

FCS files contain rich metadata in the TEXT segment.

**Common Metadata Keywords:**

```python
flow = FlowData('sample.fcs')

# File-level metadata
text_dict = flow.text
acquisition_date = text_dict.get('$DATE', 'Unknown')
instrument = text_dict.get('$CYT', 'Unknown')
data_type = flow.data_type  # 'I', 'F', 'D', 'A'

# Channel metadata
for i in range(flow.channel_count):
    pnn = flow.pnn_labels[i]  # Short name (e.g., 'FSC-A')
    pns = flow.pns_labels[i]  # Descriptive name (e.g., 'Forward Scatter')
    pnr = flow.pnr_values[i]  # Range/max value
    print(f"Channel {i}: {pnn} ({pns}), Range: {pnr}")
```

**Channel Type Identification:**

FlowIO automatically categorizes channels:

```python
# Get indices by channel type
scatter_idx = flow.scatter_indices  # e.g., [0, 1] for FSC, SSC
fluoro_idx = flow.fluoro_indices    # e.g., [2, 3, 4] for FL channels
time_idx = flow.time_index          # Index of the time channel (or None)

# Access specific channel types
events = flow.as_array()
scatter_data = events[:, scatter_idx]
fluorescence_data = events[:, fluoro_idx]
```

**ANALYSIS Segment:**

If present, access processed results:

```python
if flow.analysis:
    analysis_keywords = flow.analysis  # Dictionary of ANALYSIS keywords
    print(analysis_keywords)
```

### Creating New FCS Files

Generate FCS files from NumPy arrays or other data sources.

**Basic Creation:**

```python
import numpy as np
from flowio import create_fcs

# Create event data (rows=events, columns=channels)
events = np.random.rand(10000, 5) * 1000

# Define channel names
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']

# Create FCS file
create_fcs('output.fcs', events, channel_names)
```

**With Descriptive Channel Names:**

```python
# Add optional descriptive names (PnS)
channel_names = ['FSC-A', 'SSC-A', 'FL1-A', 'FL2-A', 'Time']
descriptive_names = ['Forward Scatter', 'Side Scatter', 'FITC', 'PE', 'Time']
create_fcs('output.fcs', events, channel_names, opt_channel_names=descriptive_names)
```

**With Custom Metadata:**

```python
# Add TEXT segment metadata
metadata = {
    '$SRC': 'Python script',
    '$DATE': '19-OCT-2025',
    '$CYT': 'Synthetic Instrument',
    '$INST': 'Laboratory A'
}
create_fcs('output.fcs', events, channel_names,
           opt_channel_names=descriptive_names, metadata=metadata)
```

**Note:** FlowIO exports as FCS 3.1 with single-precision floating-point data.

### Exporting Modified Data

Modify existing FCS files and re-export them.

**Approach 1: Using the write_fcs() Method:**

```python
from flowio import FlowData

# Read original file
flow = FlowData('original.fcs')

# Write with updated metadata
flow.write_fcs('modified.fcs', metadata={'$SRC': 'Modified data'})
```

**Approach 2: Extract, Modify, and Recreate:**

For modifying event data:

```python
from flowio import FlowData, create_fcs

# Read and extract data
flow = FlowData('original.fcs')
events = flow.as_array(preprocess=False)

# Modify event data
events[:, 0] = events[:, 0] * 1.5  # Scale the first channel

# Create a new FCS file with the modified data
create_fcs('modified.fcs', events, flow.pnn_labels,
           opt_channel_names=flow.pns_labels, metadata=flow.text)
```

### Handling Multi-Dataset FCS Files

Some FCS files contain multiple datasets in a single file.

**Detecting Multi-Dataset Files:**

```python
from flowio import FlowData, MultipleDataSetsError

try:
    flow = FlowData('sample.fcs')
except MultipleDataSetsError:
    print("File contains multiple datasets")
    # Use read_multiple_data_sets() instead
```

**Reading All Datasets:**

```python
from flowio import read_multiple_data_sets

# Read all datasets from file
datasets = read_multiple_data_sets('multi_dataset.fcs')
print(f"Found {len(datasets)} datasets")

# Process each dataset
for i, dataset in enumerate(datasets):
    print(f"\nDataset {i}:")
    print(f"  Events: {dataset.event_count}")
    print(f"  Channels: {dataset.pnn_labels}")

    # Get event data for this dataset
    events = dataset.as_array()
    print(f"  Shape: {events.shape}")
    print(f"  Mean values: {events.mean(axis=0)}")
```

**Reading a Specific Dataset:**

```python
from flowio import FlowData

# Read first dataset (nextdata_offset=0)
first_dataset = FlowData('multi.fcs', nextdata_offset=0)

# Read second dataset using the NEXTDATA offset from the first
next_offset = int(first_dataset.text['$NEXTDATA'])
if next_offset > 0:
    second_dataset = FlowData('multi.fcs', nextdata_offset=next_offset)
```

### Data Preprocessing

FlowIO applies standard FCS preprocessing transformations when `preprocess=True`.

**Preprocessing Steps:**

1. **Gain Scaling:** Multiply values by the PnG (gain) keyword
2. **Logarithmic Transformation:** Apply the PnE exponential transformation if present
   - Formula: `value = a * 10^(b * raw_value)` where PnE = "a,b"
3. **Time Scaling:** Convert time values to appropriate units

**Controlling Preprocessing:**

```python
# Preprocessed data (default)
preprocessed = flow.as_array(preprocess=True)

# Raw data (no transformations)
raw = flow.as_array(preprocess=False)
```
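The scaling steps above can be sketched in plain NumPy, following the formula as stated. This is a simplified illustration, not FlowIO's actual implementation; `png` and `pne` stand for the parsed PnG/PnE keyword values:

```python
import numpy as np

def scale_channel(raw, png='1.0', pne='0,0'):
    """Sketch of the per-channel scaling described above.

    raw: raw values for one channel
    png: PnG keyword value (linear gain), e.g. '2.0'
    pne: PnE keyword value 'a,b'; '0,0' marks a linear channel
    """
    values = np.asarray(raw, dtype=float)
    a, b = (float(x) for x in pne.split(','))
    if a > 0 and b > 0:
        # Log-amplified channel: value = a * 10^(b * raw_value)
        return a * 10.0 ** (b * values)
    # Linear channel: gain scaling only
    return values * float(png)

print(scale_channel([100.0, 200.0], png='2.0'))  # [200. 400.]
print(scale_channel([2.0], pne='1,1'))           # [100.]
```

In practice, prefer `flow.as_array(preprocess=True)` and treat this sketch only as a reading aid for the formula.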

### Error Handling

Handle common FlowIO exceptions appropriately.

```python
from flowio import (
    FlowData,
    FCSParsingError,
    DataOffsetDiscrepancyError,
    MultipleDataSetsError
)

try:
    flow = FlowData('sample.fcs')
    events = flow.as_array()

except FCSParsingError as e:
    print(f"Failed to parse FCS file: {e}")
    # Try with relaxed parsing
    flow = FlowData('sample.fcs', ignore_offset_error=True)

except DataOffsetDiscrepancyError as e:
    print(f"Offset discrepancy detected: {e}")
    # Use the ignore_offset_discrepancy parameter
    flow = FlowData('sample.fcs', ignore_offset_discrepancy=True)

except MultipleDataSetsError as e:
    print(f"Multiple datasets detected: {e}")
    # Use read_multiple_data_sets instead
    from flowio import read_multiple_data_sets
    datasets = read_multiple_data_sets('sample.fcs')

except Exception as e:
    print(f"Unexpected error: {e}")
```

## Common Use Cases

### Inspecting FCS File Contents

Quick exploration of FCS file structure:

```python
from flowio import FlowData

flow = FlowData('unknown.fcs')

print("=" * 50)
print(f"File: {flow.name}")
print(f"Version: {flow.version}")
print(f"Size: {flow.file_size:,} bytes")
print("=" * 50)

print(f"\nEvents: {flow.event_count:,}")
print(f"Channels: {flow.channel_count}")

print("\nChannel Information:")
for i, (pnn, pns) in enumerate(zip(flow.pnn_labels, flow.pns_labels)):
    ch_type = "scatter" if i in flow.scatter_indices else \
              "fluoro" if i in flow.fluoro_indices else \
              "time" if i == flow.time_index else "other"
    print(f"  [{i}] {pnn:10s} | {pns:30s} | {ch_type}")

print("\nKey Metadata:")
for key in ['$DATE', '$BTIM', '$ETIM', '$CYT', '$INST', '$SRC']:
    value = flow.text.get(key, 'N/A')
    print(f"  {key:15s}: {value}")
```

### Batch Processing Multiple Files

Process a directory of FCS files:

```python
from pathlib import Path
from flowio import FlowData
import pandas as pd

# Find all FCS files
fcs_files = list(Path('data/').glob('*.fcs'))

# Extract summary information
summaries = []
for fcs_path in fcs_files:
    try:
        flow = FlowData(str(fcs_path), only_text=True)
        summaries.append({
            'filename': fcs_path.name,
            'version': flow.version,
            'events': flow.event_count,
            'channels': flow.channel_count,
            'date': flow.text.get('$DATE', 'N/A')
        })
    except Exception as e:
        print(f"Error processing {fcs_path.name}: {e}")

# Create summary DataFrame
df = pd.DataFrame(summaries)
print(df)
```

### Converting FCS to CSV

Export event data to CSV format:

```python
from flowio import FlowData
import pandas as pd

# Read FCS file
flow = FlowData('sample.fcs')

# Convert to DataFrame
df = pd.DataFrame(flow.as_array(), columns=flow.pnn_labels)

# Add metadata as attributes
df.attrs['fcs_version'] = flow.version
df.attrs['instrument'] = flow.text.get('$CYT', 'Unknown')

# Export to CSV
df.to_csv('output.csv', index=False)
print(f"Exported {len(df)} events to CSV")
```

### Filtering Events and Re-exporting

Apply filters and save the filtered data:

```python
from flowio import FlowData, create_fcs
import numpy as np

# Read original file
flow = FlowData('sample.fcs')
events = flow.as_array(preprocess=False)

# Apply filtering (example: threshold on first channel)
fsc_idx = 0
threshold = 500
mask = events[:, fsc_idx] > threshold
filtered_events = events[mask]
print(f"Original events: {len(events)}")
print(f"Filtered events: {len(filtered_events)}")

# Create a new FCS file with the filtered data
create_fcs('filtered.fcs', filtered_events, flow.pnn_labels,
           opt_channel_names=flow.pns_labels,
           metadata={**flow.text, '$SRC': 'Filtered data'})
```

### Extracting Specific Channels

Extract and process specific channels:

```python
from flowio import FlowData
import numpy as np

flow = FlowData('sample.fcs')
events = flow.as_array()

# Extract fluorescence channels only
fluoro_indices = flow.fluoro_indices
fluoro_data = events[:, fluoro_indices]
fluoro_names = [flow.pnn_labels[i] for i in fluoro_indices]
print(f"Fluorescence channels: {fluoro_names}")
print(f"Shape: {fluoro_data.shape}")

# Calculate statistics per channel
for i, name in enumerate(fluoro_names):
    channel_data = fluoro_data[:, i]
    print(f"\n{name}:")
    print(f"  Mean: {channel_data.mean():.2f}")
    print(f"  Median: {np.median(channel_data):.2f}")
    print(f"  Std Dev: {channel_data.std():.2f}")
```

## Best Practices

1. **Memory Efficiency:** Use `only_text=True` when event data is not needed
2. **Error Handling:** Wrap file operations in try-except blocks for robust code
3. **Multi-Dataset Detection:** Check for MultipleDataSetsError and use the appropriate function
4. **Preprocessing Control:** Explicitly set the `preprocess` parameter based on analysis needs
5. **Offset Issues:** If parsing fails, try the `ignore_offset_discrepancy=True` parameter
6. **Channel Validation:** Verify channel counts and names match expectations before processing
7. **Metadata Preservation:** When modifying files, preserve the original TEXT segment keywords
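The channel-validation practice can be made concrete with a small pre-flight check. This is a sketch: `expected` is whatever channel list your pipeline assumes, and the labels would come from `flow.pnn_labels`:

```python
def validate_channels(pnn_labels, expected):
    """Return a list of problems; an empty list means the file matches expectations."""
    problems = []
    if len(pnn_labels) != len(expected):
        problems.append(
            f"channel count {len(pnn_labels)} != expected {len(expected)}")
    missing = [ch for ch in expected if ch not in pnn_labels]
    if missing:
        problems.append(f"missing channels: {missing}")
    return problems

# A file missing its FL1-A channel fails both checks
print(validate_channels(['FSC-A', 'SSC-A'], ['FSC-A', 'SSC-A', 'FL1-A']))
```

Run such a check right after opening each file (with `only_text=True` it is cheap) and skip or flag files that return a non-empty list.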

## Advanced Topics

### Understanding FCS File Structure

FCS files consist of four segments:

1. **HEADER:** FCS version and byte offsets for the other segments
2. **TEXT:** Key-value metadata pairs (delimiter-separated)
3. **DATA:** Raw event data (binary/float/ASCII format)
4. **ANALYSIS (optional):** Results from data processing

Access these segments via FlowData attributes:

- `flow.header` - HEADER segment
- `flow.text` - TEXT segment keywords
- `flow.events` - DATA segment (as bytes)
- `flow.analysis` - ANALYSIS segment keywords (if present)
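The TEXT segment's delimiter-separated layout can be illustrated with a few lines of plain Python. This is a simplified sketch of the idea (the first character defines the delimiter, then keys alternate with values); it ignores escaped delimiters, which the standard encodes as doubled delimiter characters, and FlowIO's own parser should be used for real files:

```python
def parse_text_segment(segment: bytes) -> dict:
    """Parse a simplified FCS TEXT segment of the form /key1/value1/key2/value2/."""
    text = segment.decode('ascii')
    delim = text[0]  # the segment's first character defines the delimiter
    tokens = text.strip(delim).split(delim)
    # Tokens alternate key, value, key, value, ...
    return dict(zip(tokens[::2], tokens[1::2]))

meta = parse_text_segment(b'/$DATE/19-OCT-2025/$CYT/Synthetic Instrument/')
print(meta['$CYT'])  # Synthetic Instrument
```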

### Detailed API Reference

For comprehensive API documentation including all parameters, methods, exceptions, and an FCS keyword reference, consult the detailed reference file:

Read: `references/api_reference.md`

The reference includes:

- Complete FlowData class documentation
- All utility functions (read_multiple_data_sets, create_fcs)
- Exception classes and handling
- FCS file structure details
- Common TEXT segment keywords
- Extended example workflows

When working with complex FCS operations or encountering unusual file formats, load this reference for detailed guidance.

## Integration Notes

**NumPy Arrays:** All event data is returned as NumPy ndarrays with shape (events, channels).

**Pandas DataFrames:** Easily convert to DataFrames for analysis:

```python
import pandas as pd
df = pd.DataFrame(flow.as_array(), columns=flow.pnn_labels)
```

**FlowKit Integration:** For advanced analysis (compensation, gating, FlowJo support), use the FlowKit library, which builds on FlowIO's parsing capabilities.

**Web Applications:** FlowIO's minimal dependencies make it ideal for web backend services that process FCS uploads.

## Troubleshooting

**Problem:** "Offset discrepancy error"
**Solution:** Use the `ignore_offset_discrepancy=True` parameter

**Problem:** "Multiple datasets error"
**Solution:** Use the `read_multiple_data_sets()` function instead of the FlowData constructor

**Problem:** Out of memory with large files
**Solution:** Use `only_text=True` for metadata-only operations, or process events in chunks

**Problem:** Unexpected channel counts
**Solution:** Check for null channels; use the `null_channel_list` parameter to exclude them

**Problem:** Cannot modify event data in place
**Solution:** FlowIO doesn't support direct modification; extract the data, modify it, then use `create_fcs()` to save
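For the out-of-memory case, "process events in chunks" means working over row slices of the event array instead of materializing derived arrays all at once. A sketch, with synthetic data standing in for `flow.as_array()`:

```python
import numpy as np

def channel_means_chunked(events, chunk_size=100_000):
    """Accumulate per-channel sums over row chunks, then divide once at the end."""
    totals = np.zeros(events.shape[1], dtype=np.float64)
    for start in range(0, len(events), chunk_size):
        totals += events[start:start + chunk_size].sum(axis=0)
    return totals / len(events)

events = np.arange(12, dtype=float).reshape(6, 2)  # synthetic stand-in for event data
print(channel_means_chunked(events, chunk_size=4))  # [5. 6.]
```

The same accumulate-then-finalize pattern works for histograms, counts above a threshold, and other per-channel statistics, keeping peak memory proportional to the chunk size rather than to any intermediate full-size array.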

## Summary

FlowIO provides essential FCS file handling capabilities for flow cytometry workflows. Use it for parsing, metadata extraction, and file creation. For simple file operations and data extraction, FlowIO is sufficient. For complex analysis including compensation and gating, integrate with FlowKit or other specialized tools.