Sync Operations
The copick.ops.sync module provides functions for synchronizing picks, meshes, segmentations, and tomograms between Copick projects. Each function copies data from a source project to a target project and supports multi-threaded execution and name mapping; the functions for picks, meshes, and segmentations also support filtering by user ID.
Core Features
- Parallel Processing: Multi-threaded synchronization for improved performance
- Name Mapping: Flexible mapping of source names to target names for runs, objects, and users
- User Filtering: Selective synchronization based on user IDs (picks, meshes, and segmentations)
- Object Validation: Automatic creation of missing pickable objects in target projects
- Progress Tracking: Optional logging and progress reporting
- Error Handling: Comprehensive error reporting with detailed failure information
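The parallel processing and error handling features listed above can be pictured with a generic thread-pool pattern. The following is a minimal sketch, not copick's actual implementation: run_in_parallel and fake_copy are hypothetical names, and the per-run task is a stand-in for the real copy work.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def run_in_parallel(
    run_names: List[str],
    task: Callable[[str], None],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Apply `task` to each run name on a thread pool.

    Returns a dict mapping each failed run name to its error message.
    Hypothetical helper for illustration; not part of copick.
    """
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which run.
        futures = {pool.submit(task, name): name for name in run_names}
        for future in as_completed(futures):
            name = futures[future]
            try:
                future.result()  # Re-raises any exception from the worker.
            except Exception as exc:
                errors[name] = str(exc)
    return errors


def fake_copy(run_name: str) -> None:
    # Stand-in for the per-run copy work; fails for one run on purpose.
    if run_name == "run2":
        raise ValueError("target already exists")


failures = run_in_parallel(["run1", "run2", "run3"], fake_copy, max_workers=2)
```

Collecting errors per run, rather than stopping at the first exception, lets the remaining runs finish and gives one place to report what failed.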
Functions
copick.ops.sync.sync_picks
sync_picks(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, source_objects: Optional[List[str]] = None, target_objects: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None
Synchronize picks between two Copick projects.
Parameters:
- source_root (CopickRoot) – The source Copick project root.
- target_root (CopickRoot) – The target Copick project root.
- source_runs (Optional[List[str]], default: None) – The list of source run names to synchronize.
- target_runs (Optional[Dict[str, str]], default: None) – A dictionary mapping source run names to target run names.
- source_objects (Optional[List[str]], default: None) – The list of source object types to synchronize.
- target_objects (Optional[Dict[str, str]], default: None) – The dictionary mapping source object types to target types.
- source_users (Optional[List[str]], default: None) – The list of source user IDs to synchronize. If None, all users are synced.
- target_users (Optional[Dict[str, str]], default: None) – The dictionary mapping source user IDs to target user IDs.
- exist_ok (bool, default: False) – Whether to overwrite existing picks in the target project.
- max_workers (int, default: 4) – The maximum number of worker threads to use for synchronization.
- log (bool, default: False) – Whether to log the synchronization process.
copick.ops.sync.sync_meshes
sync_meshes(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, source_objects: Optional[List[str]] = None, target_objects: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None
Synchronize meshes between two Copick projects.
Parameters:
- source_root (CopickRoot) – The source Copick project root.
- target_root (CopickRoot) – The target Copick project root.
- source_runs (Optional[List[str]], default: None) – The list of source run names to synchronize.
- target_runs (Optional[Dict[str, str]], default: None) – A dictionary mapping source run names to target run names.
- source_objects (Optional[List[str]], default: None) – The list of source object types to synchronize.
- target_objects (Optional[Dict[str, str]], default: None) – The dictionary mapping source object types to target types.
- source_users (Optional[List[str]], default: None) – The list of source user IDs to synchronize. If None, all users are synced.
- target_users (Optional[Dict[str, str]], default: None) – The dictionary mapping source user IDs to target user IDs.
- exist_ok (bool, default: False) – Whether to overwrite existing meshes in the target project.
- max_workers (int, default: 4) – The maximum number of worker threads to use for synchronization.
- log (bool, default: False) – Whether to log the synchronization process.
copick.ops.sync.sync_segmentations
sync_segmentations(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, voxel_spacings: Optional[List[float]] = None, source_names: Optional[List[str]] = None, target_names: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None
Synchronize segmentations between two Copick projects.
Parameters:
- source_root (CopickRoot) – The source Copick project root.
- target_root (CopickRoot) – The target Copick project root.
- source_runs (Optional[List[str]], default: None) – The list of source run names to synchronize.
- target_runs (Optional[Dict[str, str]], default: None) – A dictionary mapping source run names to target run names.
- voxel_spacings (Optional[List[float]], default: None) – The voxel spacings to consider for synchronization.
- source_names (Optional[List[str]], default: None) – The list of source segmentation names to synchronize. If None, all segmentations are synced.
- target_names (Optional[Dict[str, str]], default: None) – The dictionary mapping source segmentation names to target names.
- source_users (Optional[List[str]], default: None) – The list of source user IDs to synchronize. If None, all users are synced.
- target_users (Optional[Dict[str, str]], default: None) – The dictionary mapping source user IDs to target user IDs.
- exist_ok (bool, default: False) – Whether to overwrite existing segmentations in the target project.
- max_workers (int, default: 4) – The maximum number of worker threads to use for synchronization.
- log (bool, default: False) – Whether to log the synchronization process.
copick.ops.sync.sync_tomograms
sync_tomograms(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, voxel_spacings: Optional[List[float]] = None, source_tomo_types: Optional[List[str]] = None, target_tomo_types: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None
Synchronize tomograms between two Copick projects.
Parameters:
- source_root (CopickRoot) – The source Copick project root.
- target_root (CopickRoot) – The target Copick project root.
- source_runs (Optional[List[str]], default: None) – The list of source run names to synchronize.
- target_runs (Optional[Dict[str, str]], default: None) – A dictionary mapping source run names to target run names.
- voxel_spacings (Optional[List[float]], default: None) – The voxel spacings to consider for synchronization.
- source_tomo_types (Optional[List[str]], default: None) – The list of source tomogram types to synchronize.
- target_tomo_types (Optional[Dict[str, str]], default: None) – The dictionary mapping source tomogram types to target types.
- exist_ok (bool, default: False) – Whether to overwrite existing tomograms in the target project.
- max_workers (int, default: 4) – The maximum number of worker threads to use for synchronization.
- log (bool, default: False) – Whether to log the synchronization process.
Usage Examples
Basic Synchronization
import copick
from copick.ops.sync import sync_picks
# Load source and target projects
source_root = copick.from_file("source_config.json")
target_root = copick.from_file("target_config.json")
# Sync all picks from all runs
sync_picks(
    source_root=source_root,
    target_root=target_root,
    log=True,
)
Selective Synchronization with Name Mapping
# Sync specific runs with name mapping
sync_picks(
    source_root=source_root,
    target_root=target_root,
    source_runs=["run1", "run2"],
    target_runs={"run1": "experiment_A", "run2": "experiment_B"},
    source_objects=["ribosome", "membrane"],
    target_objects={"ribosome": "ribo", "membrane": "mem"},
    log=True,
)
User-Specific Synchronization
# Sync picks from specific users with user ID mapping
sync_picks(
    source_root=source_root,
    target_root=target_root,
    source_users=["user123", "user456"],
    target_users={"user123": "analyst1", "user456": "analyst2"},
    exist_ok=True,  # Allow overwriting existing picks
    max_workers=8,  # Use more threads for faster processing
    log=True,
)
Synchronizing Segmentations with Voxel Spacing Filtering
from copick.ops.sync import sync_segmentations
# Sync segmentations for specific voxel spacings
sync_segmentations(
    source_root=source_root,
    target_root=target_root,
    voxel_spacings=[10.0, 20.0],  # Only sync these voxel spacings
    source_names=["membrane", "organelle"],
    target_names={"membrane": "cell_membrane", "organelle": "mitochondria"},
    log=True,
)
Synchronizing Tomograms with Type Mapping
from copick.ops.sync import sync_tomograms
# Sync tomograms with type mapping
sync_tomograms(
    source_root=source_root,
    target_root=target_root,
    voxel_spacings=[10.0],
    source_tomo_types=["wbp", "raw"],
    target_tomo_types={"wbp": "filtered", "raw": "original"},
    exist_ok=True,
    log=True,
)
Complete Multi-Data Type Synchronization
from copick.ops.sync import sync_picks, sync_meshes, sync_segmentations
# Sync multiple data types in sequence
data_types = [
    (sync_picks, {}),
    (sync_meshes, {}),
    (sync_segmentations, {"voxel_spacings": [10.0, 20.0]}),
]
common_args = {
    "source_root": source_root,
    "target_root": target_root,
    "source_runs": ["run1", "run2"],
    "target_runs": {"run1": "exp1", "run2": "exp2"},
    "max_workers": 6,
    "log": True,
}
for sync_func, extra_args in data_types:
    print(f"Synchronizing {sync_func.__name__}...")
    sync_func(**common_args, **extra_args)
    print(f"Completed {sync_func.__name__}")
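If one data type fails, a plain loop like the one above stops at the first exception. The sketch below shows one way to keep going and collect failures per step. It assumes the sync functions raise on failure; run_sync_steps and the two fake_* functions are hypothetical stand-ins so the example runs without a Copick project.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[Callable[..., None], Dict[str, Any]]


def run_sync_steps(steps: List[Step], common_args: Dict[str, Any]) -> Dict[str, str]:
    """Run each sync step in order; return {function name: error message}.

    Hypothetical helper for illustration; not part of copick.
    """
    failures: Dict[str, str] = {}
    for sync_func, extra_args in steps:
        try:
            sync_func(**common_args, **extra_args)
        except Exception as exc:
            # Record the failure and continue with the next data type.
            failures[sync_func.__name__] = str(exc)
    return failures


def fake_sync_picks(**kwargs: Any) -> None:
    # Stand-in for sync_picks; succeeds.
    return None


def fake_sync_meshes(**kwargs: Any) -> None:
    # Stand-in for sync_meshes; always fails.
    raise RuntimeError("mesh store unreachable")


failures = run_sync_steps(
    [(fake_sync_picks, {}), (fake_sync_meshes, {})],
    {"source_runs": ["run1"], "max_workers": 2},
)
```

In real use, the stand-ins would be replaced by sync_picks, sync_meshes, and sync_segmentations, with common_args built as in the example above.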
Common Patterns
Name Mapping Syntax
All synchronization functions support flexible name mapping using dictionaries:
# Run name mapping
target_runs = {
    "source_run1": "target_run1",
    "source_run2": "target_run2",
}
# Object name mapping
target_objects = {
    "ribosome": "large_ribosomal_subunit",
    "membrane": "plasma_membrane",
    "vesicle": "transport_vesicle",
}
# User ID mapping
target_users = {
    "original_user": "new_user_id",
    "temp_user": "permanent_user",
}
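A mapping key that does not match any source name is easy to miss. The sketch below checks a mapping against the source list before syncing. It assumes that names absent from the mapping keep their source name in the target, which this page does not state explicitly; resolve_names is a hypothetical helper, not a copick function.

```python
from typing import Dict, List, Optional


def resolve_names(
    source_names: List[str],
    mapping: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return {source name: target name} for every source name.

    Assumption: unmapped names are carried over unchanged.
    Raises ValueError if the mapping mentions a name not in source_names.
    """
    mapping = mapping or {}
    unknown = sorted(set(mapping) - set(source_names))
    if unknown:
        raise ValueError(f"Mapping keys not in source list: {unknown}")
    return {name: mapping.get(name, name) for name in source_names}


resolved = resolve_names(
    ["ribosome", "membrane", "vesicle"],
    {"ribosome": "large_ribosomal_subunit"},
)
```

Running a check like this first catches typos such as "ribosme" before any data is copied.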
Error Handling and Logging
import logging
# Configure logging for detailed output
logging.basicConfig(level=logging.INFO)
try:
    sync_picks(
        source_root=source_root,
        target_root=target_root,
        log=True,  # Enable verbose logging
    )
except Exception as e:
    print(f"Synchronization failed: {e}")
    # Check logs for detailed error information
Performance Optimization
# Optimize for large datasets
sync_picks(
    source_root=source_root,
    target_root=target_root,
    max_workers=12,  # Increase parallelism
    exist_ok=True,  # Overwrite picks that already exist in the target
    log=False,  # Reduce logging overhead
)
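Rather than hard-coding max_workers, the value can be derived from the machine and the amount of work. The helper below is a rough heuristic under stated assumptions (the work is mostly I/O-bound, so up to two threads per CPU is reasonable, and a fixed cap keeps storage from being overloaded). pick_max_workers is a hypothetical name and the numbers are not a copick recommendation.

```python
import os


def pick_max_workers(n_runs: int, cap: int = 12) -> int:
    """Choose a thread count: at most one per run, two per CPU, and `cap`.

    Hypothetical heuristic for illustration; always returns at least 1.
    """
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None.
    return max(1, min(n_runs, cpu_count * 2, cap))


workers = pick_max_workers(n_runs=40)
```

The result can then be passed as max_workers=workers to any of the sync functions.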
Integration with CLI
The sync operations are also available through the copick command-line interface:
# Basic synchronization
copick sync picks -c source_config.json --target-config target_config.json
# With name mapping and user filtering
copick sync picks -c source_config.json --target-config target_config.json \
    --source-runs "run1,run2" \
    --target-runs "run1:exp1,run2:exp2" \
    --source-users "user1,user2" \
    --target-users "user1:analyst1,user2:analyst2" \
    --log
# From CryoET Data Portal
copick sync picks \
    --source-dataset-ids "12345,67890" \
    --target-config target_config.json \
    --log
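The CLI strings above correspond to the list and dictionary arguments of the Python functions: "run1,run2" plays the role of source_runs and "run1:exp1,run2:exp2" the role of target_runs. The sketch below shows that correspondence. parse_list and parse_mapping are hypothetical helpers written for this illustration; they are not copick's own parser and may differ from it in edge cases.

```python
from typing import Dict, List


def parse_list(value: str) -> List[str]:
    """Split 'a,b,c' into ['a', 'b', 'c'], ignoring empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_mapping(value: str) -> Dict[str, str]:
    """Turn 'a:x,b:y' into {'a': 'x', 'b': 'y'}."""
    mapping: Dict[str, str] = {}
    for pair in parse_list(value):
        source, sep, target = pair.partition(":")
        if not sep or not source.strip() or not target.strip():
            raise ValueError(f"Expected 'source:target', got {pair!r}")
        mapping[source.strip()] = target.strip()
    return mapping


source_runs = parse_list("run1,run2")
target_runs = parse_mapping("run1:exp1,run2:exp2")
```

The resulting values have the same shape as the source_runs and target_runs arguments used in the Python examples on this page.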