Stats Operations
The copick.ops.stats module provides statistical analysis functions for Copick projects. Each function returns a dictionary of statistics about picks, meshes, or segmentations: total counts, distributions by attribute (such as user or session), and, for meshes and segmentations, frequencies of attribute combinations.
Functions
copick.ops.stats.picks_stats
picks_stats(root: Union[str, CopickRoot], runs: Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None] = None, user_id: Union[str, Iterable[str], None] = None, session_id: Union[str, Iterable[str], None] = None, object_name: Union[str, Iterable[str], None] = None, parallel: bool = False, workers: Optional[int] = 8, show_progress: bool = True) -> Dict[str, Union[int, Dict[str, int]]]
Generate statistics for picks in a Copick project.
Parameters:
- root (Union[str, CopickRoot]) – The Copick project root.
- runs (Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None], default: None) – The runs to query picks from. If None, query all runs.
- user_id (Union[str, Iterable[str], None], default: None) – The user ID of the picks. If None, query all users.
- session_id (Union[str, Iterable[str], None], default: None) – The session ID of the picks. If None, query all sessions.
- object_name (Union[str, Iterable[str], None], default: None) – The pickable object of the picks. If None, query all picks.
- parallel (bool, default: False) – Whether to query picks in parallel. Default is False.
- workers (Optional[int], default: 8) – The number of workers to use. Default is 8.
- show_progress (bool, default: True) – Whether to show progress. Default is True.
Returns:
- Dict[str, Union[int, Dict[str, int]]] – A dictionary containing total pick count and distribution statistics.
copick.ops.stats.meshes_stats
meshes_stats(root: Union[str, CopickRoot], runs: Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None] = None, user_id: Union[str, Iterable[str], None] = None, session_id: Union[str, Iterable[str], None] = None, object_name: Union[str, Iterable[str], None] = None, parallel: bool = False, workers: Optional[int] = 8, show_progress: bool = True) -> Dict[str, Union[int, Dict[str, int]]]
Generate statistics for meshes in a Copick project.
Parameters:
- root (Union[str, CopickRoot]) – The Copick project root.
- runs (Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None], default: None) – The runs to query meshes from. If None, query all runs.
- user_id (Union[str, Iterable[str], None], default: None) – The user ID of the meshes. If None, query all users.
- session_id (Union[str, Iterable[str], None], default: None) – The session ID of the meshes. If None, query all sessions.
- object_name (Union[str, Iterable[str], None], default: None) – The pickable object of the meshes. If None, query all meshes.
- parallel (bool, default: False) – Whether to query meshes in parallel. Default is False.
- workers (Optional[int], default: 8) – The number of workers to use. Default is 8.
- show_progress (bool, default: True) – Whether to show progress. Default is True.
Returns:
- Dict[str, Union[int, Dict[str, int]]] – A dictionary containing mesh count and frequency statistics.
copick.ops.stats.segmentations_stats
segmentations_stats(root: Union[str, CopickRoot], runs: Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None] = None, user_id: Union[str, Iterable[str], None] = None, session_id: Union[str, Iterable[str], None] = None, is_multilabel: bool = None, name: Union[str, Iterable[str], None] = None, voxel_size: Union[float, Iterable[float], None] = None, parallel: bool = False, workers: Optional[int] = 8, show_progress: bool = True) -> Dict[str, Union[int, Dict[str, int]]]
Generate statistics for segmentations in a Copick project.
Parameters:
- root (Union[str, CopickRoot]) – The Copick project root.
- runs (Union[str, CopickRun, Iterable[str], Iterable[CopickRun], None], default: None) – The runs to query segmentations from. If None, query all runs.
- user_id (Union[str, Iterable[str], None], default: None) – The user ID of the segmentations. If None, query all users.
- session_id (Union[str, Iterable[str], None], default: None) – The session ID of the segmentations. If None, query all sessions.
- is_multilabel (bool, default: None) – Whether the segmentations are multilabel. If None, query all segmentations.
- name (Union[str, Iterable[str], None], default: None) – The name of the segmentation. If None, query all segmentations.
- voxel_size (Union[float, Iterable[float], None], default: None) – The voxel size of the segmentation. If None, query all segmentations.
- parallel (bool, default: False) – Whether to query segmentations in parallel. Default is False.
- workers (Optional[int], default: 8) – The number of workers to use. Default is 8.
- show_progress (bool, default: True) – Whether to show progress. Default is True.
Returns:
- Dict[str, Union[int, Dict[str, int]]] – A dictionary containing segmentation count and frequency statistics.
Usage Examples
Picks Statistics
from copick.ops.stats import picks_stats
from copick.impl.filesystem import CopickRootFSSpec

# Open a project
root = CopickRootFSSpec.from_file("config.json")

# Get comprehensive picks statistics
all_picks_stats = picks_stats(root)
print(f"Total picks: {all_picks_stats['total_picks']}")
print(f"Distribution by run: {all_picks_stats['distribution_by_run']}")

# Get statistics for specific runs and objects
filtered_stats = picks_stats(
    root=root,
    runs=["experiment_001", "experiment_002"],
    object_name=["ribosome", "proteasome"],
    user_id=["annotator_1"],
)

# Enable parallel processing for large projects
parallel_stats = picks_stats(
    root=root,
    parallel=True,
    workers=12,
    show_progress=True,
)
Meshes Statistics
from copick.ops.stats import meshes_stats

# `root` is the project opened in the picks example above

# Get mesh statistics with filtering
mesh_stats = meshes_stats(
    root=root,
    runs=["experiment_001"],
    user_id=["modeler_1", "modeler_2"],
    object_name=["ribosome"],
)
print(f"Total meshes: {mesh_stats['total_meshes']}")
print(f"Distribution by user: {mesh_stats['distribution_by_user']}")
print(f"Combination frequencies: {mesh_stats['session_user_object_combinations']}")

# Get statistics for all meshes with parallel processing
all_mesh_stats = meshes_stats(root, parallel=True)
Segmentations Statistics
from copick.ops.stats import segmentations_stats

# `root` is the project opened in the picks example above

# Get segmentation statistics with comprehensive filtering
seg_stats = segmentations_stats(
    root=root,
    runs=["experiment_001"],
    user_id=["segmentation_model"],
    name=["membrane", "organelle"],
    voxel_size=[10.0, 20.0],
    is_multilabel=True,
)
print(f"Total segmentations: {seg_stats['total_segmentations']}")
print(f"Distribution by name: {seg_stats['distribution_by_name']}")
print(f"Distribution by voxel size: {seg_stats['distribution_by_voxel_size']}")
print(f"Distribution by multilabel: {seg_stats['distribution_by_multilabel']}")

# Analyze combination frequencies; max() raises ValueError on an empty dict,
# so only look for the most frequent entry when the filters matched something
combinations = seg_stats['session_user_voxelspacing_multilabel_combinations']
if combinations:
    most_frequent = max(combinations.items(), key=lambda x: x[1])
    print(f"Most frequent combination: {most_frequent[0]} ({most_frequent[1]} occurrences)")
Return Value Structure
Picks Statistics
{
    "total_picks": int,              # Total number of individual pick points
    "total_pick_files": int,         # Total number of pick files
    "distribution_by_run": {         # Pick count per run
        "run_name": pick_count
    },
    "distribution_by_user": {        # Pick count per user
        "user_id": pick_count
    },
    "distribution_by_session": {     # Pick count per session
        "session_id": pick_count
    },
    "distribution_by_object": {      # Pick count per object
        "object_name": pick_count
    }
}
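As a rough illustration of working with this structure, the sketch below turns one distribution into percentage shares and derives the mean number of picks per file. It does not call Copick: `example_stats` is a hand-written dictionary with made-up values shaped like the structure above, and `distribution_shares` is a hypothetical helper, not part of copick.ops.stats.

```python
from typing import Dict, List, Tuple


def distribution_shares(distribution: Dict[str, int]) -> List[Tuple[str, int, float]]:
    """Return (key, count, percent) tuples sorted by descending count."""
    total = sum(distribution.values())
    rows = []
    for key, count in sorted(distribution.items(), key=lambda kv: kv[1], reverse=True):
        # Avoid dividing by zero when the distribution is empty or all zeros.
        percent = 100.0 * count / total if total else 0.0
        rows.append((key, count, percent))
    return rows


# Hypothetical result shaped like the picks structure (values are made up).
example_stats = {
    "total_picks": 200,
    "total_pick_files": 4,
    "distribution_by_run": {"run_a": 50, "run_b": 150},
    "distribution_by_user": {"annotator_1": 200},
    "distribution_by_session": {"0": 200},
    "distribution_by_object": {"ribosome": 120, "proteasome": 80},
}

rows = distribution_shares(example_stats["distribution_by_run"])
for name, count, percent in rows:
    print(f"{name}: {count} picks ({percent:.1f}%)")

# Mean picks per file; max(..., 1) guards against a project with no pick files.
mean_picks_per_file = example_stats["total_picks"] / max(example_stats["total_pick_files"], 1)
print(f"Mean picks per file: {mean_picks_per_file:.1f}")
```

The same helper should work on any of the distribution_by_* entries, since each maps a string key to a count.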
Meshes Statistics
{
    "total_meshes": int,                     # Total number of mesh files
    "distribution_by_user": {                # Mesh count per user
        "user_id": mesh_count
    },
    "distribution_by_session": {             # Mesh count per session
        "session_id": mesh_count
    },
    "distribution_by_object": {              # Mesh count per object
        "object_name": mesh_count
    },
    "session_user_object_combinations": {    # Frequency of specific combinations
        "session_user_object": frequency
    }
}
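One way to rank the combination frequencies is with collections.Counter from the standard library. The sketch below is an assumption-laden example: `example_mesh_stats` is hand-written, the combination keys are invented, and they are treated as opaque strings because the exact key format is not specified here. `top_combinations` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_combinations(combinations: Dict[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most frequent (combination, frequency) pairs."""
    # Counter accepts a mapping of key -> count; most_common sorts by count.
    return Counter(combinations).most_common(n)


# Hypothetical result shaped like the meshes structure (keys and values are made up).
example_mesh_stats = {
    "total_meshes": 6,
    "session_user_object_combinations": {
        "0_modeler-1_ribosome": 3,
        "1_modeler-2_ribosome": 2,
        "0_modeler-1_membrane": 1,
    },
}

top = top_combinations(example_mesh_stats["session_user_object_combinations"], n=2)
for combination, frequency in top:
    print(f"{combination}: {frequency}")
```

Unlike a bare max() call, most_common returns an empty list for an empty dictionary, so no extra guard is needed when the filters match nothing.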
Segmentations Statistics
{
    "total_segmentations": int,          # Total number of segmentation files
    "distribution_by_user": {            # Segmentation count per user
        "user_id": segmentation_count
    },
    "distribution_by_session": {         # Segmentation count per session
        "session_id": segmentation_count
    },
    "distribution_by_name": {            # Segmentation count per name
        "name": segmentation_count
    },
    "distribution_by_voxel_size": {      # Segmentation count per voxel size
        "voxel_size": segmentation_count
    },
    "distribution_by_multilabel": {      # Segmentation count by multilabel status
        "True/False": segmentation_count
    },
    "session_user_voxelspacing_multilabel_combinations": {    # Frequency of specific combinations
        "session_user_name_voxelsize_multilabel": frequency
    }
}
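The structure above does not make clear whether the distribution_by_multilabel keys are booleans or the strings "True" and "False". A defensive sketch that tolerates either form is shown below; `multilabel_split` is a hypothetical helper and the input dictionaries are made-up examples, not output from Copick.

```python
from typing import Dict, Tuple, Union


def multilabel_split(distribution: Dict[Union[bool, str], int]) -> Tuple[int, int]:
    """Return (multilabel_count, single_label_count) for bool or str keys."""
    multilabel = 0
    single = 0
    for key, count in distribution.items():
        # str(True) == "True", so booleans and strings normalize the same way.
        if str(key).strip().lower() == "true":
            multilabel += count
        else:
            single += count
    return multilabel, single


# Made-up inputs covering both possible key types.
from_string_keys = multilabel_split({"True": 4, "False": 6})
from_bool_keys = multilabel_split({True: 1, False: 2})

print(f"String keys: {from_string_keys}")
print(f"Boolean keys: {from_bool_keys}")
```

If the key type is known for your Copick version, a direct lookup is simpler than this normalization.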
Performance Considerations
Parallel Processing
All stats functions support parallel processing for improved performance with large projects: