Sync Operations

The copick.ops.sync module provides functions for synchronizing picks, meshes, segmentations, and tomograms between Copick projects. Each function copies the selected data from a source project to a target project, with support for parallel processing and name mapping. The functions for picks, meshes, and segmentations also support filtering by user.

Core Features

  • Parallel Processing: Multi-threaded synchronization for improved performance
  • Name Mapping: Flexible mapping of source names to target names for runs, objects, and users
  • User Filtering: Selective synchronization based on user IDs (picks, meshes, and segmentations)
  • Object Validation: Automatic creation of missing pickable objects in target projects
  • Progress Tracking: Optional logging and progress reporting
  • Error Handling: Comprehensive error reporting with detailed failure information
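The parallel-processing and error-handling features listed above can be illustrated with a standard-library sketch. This is not copick's implementation: `run_parallel` and `fake_sync_run` are hypothetical names, and the sketch assumes only that each unit of work is independent and that a failure in one unit should not stop the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def run_parallel(
    items: Iterable[str],
    worker: Callable[[str], int],
    max_workers: int = 4,
) -> Tuple[Dict[str, int], Dict[str, str]]:
    """Apply `worker` to each item in a thread pool.

    Returns (results, errors): `results` maps item -> return value,
    `errors` maps item -> message for items whose worker raised.
    """
    results: Dict[str, int] = {}
    errors: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[item] = f"{type(exc).__name__}: {exc}"
    return results, errors


def fake_sync_run(run_name: str) -> int:
    """Hypothetical stand-in for per-run work; returns a dummy item count."""
    if run_name == "broken_run":
        raise ValueError("source data missing")
    return len(run_name)


results, errors = run_parallel(["run1", "run2", "broken_run"], fake_sync_run)
```

Collecting errors per item, rather than raising on the first failure, is one way to produce a complete failure report after all workers have finished.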

Functions

copick.ops.sync.sync_picks

sync_picks(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, source_objects: Optional[List[str]] = None, target_objects: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None

Synchronize picks between two Copick projects.

Parameters:

  • source_root (CopickRoot) –

    The source Copick project root.

  • target_root (CopickRoot) –

    The target Copick project root.

  • source_runs (Optional[List[str]], default: None ) –

    The list of source run names to synchronize. If None, all runs are synced.

  • target_runs (Optional[Dict[str, str]], default: None ) –

    A dictionary mapping source run names to target run names.

  • source_objects (Optional[List[str]], default: None ) –

    The list of source object types to synchronize.

  • target_objects (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source object types to target types.

  • source_users (Optional[List[str]], default: None ) –

    The list of source user IDs to synchronize. If None, all users are synced.

  • target_users (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source user IDs to target user IDs.

  • exist_ok (bool, default: False ) –

    Whether to overwrite existing picks in the target project.

  • max_workers (int, default: 4 ) –

    The maximum number of worker threads to use for synchronization.

  • log (bool, default: False ) –

    Whether to log the synchronization process.

copick.ops.sync.sync_meshes

sync_meshes(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, source_objects: Optional[List[str]] = None, target_objects: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None

Synchronize meshes between two Copick projects.

Parameters:

  • source_root (CopickRoot) –

    The source Copick project root.

  • target_root (CopickRoot) –

    The target Copick project root.

  • source_runs (Optional[List[str]], default: None ) –

    The list of source run names to synchronize.

  • target_runs (Optional[Dict[str, str]], default: None ) –

    A dictionary mapping source run names to target run names.

  • source_objects (Optional[List[str]], default: None ) –

    The list of source object types to synchronize.

  • target_objects (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source object types to target types.

  • source_users (Optional[List[str]], default: None ) –

    The list of source user IDs to synchronize. If None, all users are synced.

  • target_users (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source user IDs to target user IDs.

  • exist_ok (bool, default: False ) –

    Whether to overwrite existing meshes in the target project.

  • max_workers (int, default: 4 ) –

    The maximum number of worker threads to use for synchronization.

  • log (bool, default: False ) –

    Whether to log the synchronization process.

copick.ops.sync.sync_segmentations

sync_segmentations(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, voxel_spacings: Optional[List[float]] = None, source_names: Optional[List[str]] = None, target_names: Optional[Dict[str, str]] = None, source_users: Optional[List[str]] = None, target_users: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None

Synchronize segmentations between two Copick projects.

Parameters:

  • source_root (CopickRoot) –

    The source Copick project root.

  • target_root (CopickRoot) –

    The target Copick project root.

  • source_runs (Optional[List[str]], default: None ) –

    The list of source run names to synchronize.

  • target_runs (Optional[Dict[str, str]], default: None ) –

    A dictionary mapping source run names to target run names.

  • voxel_spacings (Optional[List[float]], default: None ) –

    The voxel spacings to consider for synchronization.

  • source_names (Optional[List[str]], default: None ) –

    The list of source segmentation names to synchronize. If None, all segmentations are synced.

  • target_names (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source segmentation names to target names.

  • source_users (Optional[List[str]], default: None ) –

    The list of source user IDs to synchronize. If None, all users are synced.

  • target_users (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source user IDs to target user IDs.

  • exist_ok (bool, default: False ) –

    Whether to overwrite existing segmentations in the target project.

  • max_workers (int, default: 4 ) –

    The maximum number of worker threads to use for synchronization.

  • log (bool, default: False ) –

    Whether to log the synchronization process.

copick.ops.sync.sync_tomograms

sync_tomograms(source_root: CopickRoot, target_root: CopickRoot, source_runs: Optional[List[str]] = None, target_runs: Optional[Dict[str, str]] = None, voxel_spacings: Optional[List[float]] = None, source_tomo_types: Optional[List[str]] = None, target_tomo_types: Optional[Dict[str, str]] = None, exist_ok: bool = False, max_workers: int = 4, log: bool = False) -> None

Synchronize tomograms between two Copick projects.

Parameters:

  • source_root (CopickRoot) –

    The source Copick project root.

  • target_root (CopickRoot) –

    The target Copick project root.

  • source_runs (Optional[List[str]], default: None ) –

    The list of source run names to synchronize.

  • target_runs (Optional[Dict[str, str]], default: None ) –

    A dictionary mapping source run names to target run names.

  • voxel_spacings (Optional[List[float]], default: None ) –

    The voxel spacings to consider for synchronization.

  • source_tomo_types (Optional[List[str]], default: None ) –

    The list of source tomogram types to synchronize.

  • target_tomo_types (Optional[Dict[str, str]], default: None ) –

    The dictionary mapping source tomogram types to target types.

  • exist_ok (bool, default: False ) –

    Whether to overwrite existing tomograms in the target project.

  • max_workers (int, default: 4 ) –

    The maximum number of worker threads to use for synchronization.

  • log (bool, default: False ) –

    Whether to log the synchronization process.

Usage Examples

Basic Synchronization

import copick
from copick.ops.sync import sync_picks

# Load source and target projects
source_root = copick.from_file("source_config.json")
target_root = copick.from_file("target_config.json")

# Sync all picks from all runs
sync_picks(
    source_root=source_root,
    target_root=target_root,
    log=True
)

Selective Synchronization with Name Mapping

# Sync specific runs with name mapping
sync_picks(
    source_root=source_root,
    target_root=target_root,
    source_runs=["run1", "run2"],
    target_runs={"run1": "experiment_A", "run2": "experiment_B"},
    source_objects=["ribosome", "membrane"],
    target_objects={"ribosome": "ribo", "membrane": "mem"},
    log=True
)

User-Specific Synchronization

# Sync picks from specific users with user ID mapping
sync_picks(
    source_root=source_root,
    target_root=target_root,
    source_users=["user123", "user456"],
    target_users={"user123": "analyst1", "user456": "analyst2"},
    exist_ok=True,  # Allow overwriting existing picks
    max_workers=8,  # Use more threads for faster processing
    log=True
)

Synchronizing Segmentations with Voxel Spacing Filtering

from copick.ops.sync import sync_segmentations

# Sync segmentations for specific voxel spacings
sync_segmentations(
    source_root=source_root,
    target_root=target_root,
    voxel_spacings=[10.0, 20.0],  # Only sync these voxel spacings
    source_names=["membrane", "organelle"],
    target_names={"membrane": "cell_membrane", "organelle": "mitochondria"},
    log=True
)

Synchronizing Tomograms with Type Mapping

from copick.ops.sync import sync_tomograms

# Sync tomograms with type mapping
sync_tomograms(
    source_root=source_root,
    target_root=target_root,
    voxel_spacings=[10.0],
    source_tomo_types=["wbp", "raw"],
    target_tomo_types={"wbp": "filtered", "raw": "original"},
    exist_ok=True,
    log=True
)

Complete Multi-Data Type Synchronization

from copick.ops.sync import sync_picks, sync_meshes, sync_segmentations

# Sync multiple data types in sequence
data_types = [
    (sync_picks, {}),
    (sync_meshes, {}),
    (sync_segmentations, {"voxel_spacings": [10.0, 20.0]})
]

common_args = {
    "source_root": source_root,
    "target_root": target_root,
    "source_runs": ["run1", "run2"],
    "target_runs": {"run1": "exp1", "run2": "exp2"},
    "max_workers": 6,
    "log": True
}

for sync_func, extra_args in data_types:
    print(f"Synchronizing {sync_func.__name__}...")
    sync_func(**common_args, **extra_args)
    print(f"Completed {sync_func.__name__}")

Common Patterns

Name Mapping Syntax

Each synchronization function accepts dictionaries that map source names to target names:

# Run name mapping
target_runs = {
    "source_run1": "target_run1",
    "source_run2": "target_run2"
}

# Object name mapping
target_objects = {
    "ribosome": "large_ribosomal_subunit",
    "membrane": "plasma_membrane",
    "vesicle": "transport_vesicle"
}

# User ID mapping
target_users = {
    "original_user": "new_user_id",
    "temp_user": "permanent_user"
}
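The relationship between a source list and its mapping dictionary can be sketched as follows. `resolve_targets` is a hypothetical helper, not part of copick, and it rests on a labeled assumption: names that do not appear in the mapping keep their source name in the target project.

```python
from typing import Dict, List, Optional


def resolve_targets(
    source_names: List[str],
    mapping: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Pair every selected source name with its name in the target.

    ASSUMPTION: names absent from `mapping` keep their source name.
    Mapping keys that were not selected are treated as a user error.
    """
    mapping = mapping or {}
    unknown = sorted(set(mapping) - set(source_names))
    if unknown:
        raise ValueError(f"mapping refers to unselected names: {unknown}")
    return {name: mapping.get(name, name) for name in source_names}


pairs = resolve_targets(
    ["ribosome", "membrane", "vesicle"],
    {"ribosome": "large_ribosomal_subunit"},
)
```

Checking the mapping keys against the selected names before starting a sync catches typos such as a misspelled run or object name early.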

Error Handling and Logging

import logging

# Configure logging for detailed output
logging.basicConfig(level=logging.INFO)

try:
    sync_picks(
        source_root=source_root,
        target_root=target_root,
        log=True  # Enable verbose logging
    )
except Exception as e:
    print(f"Synchronization failed: {e}")
    # Check logs for detailed error information

Performance Optimization

# Optimize for large datasets
sync_picks(
    source_root=source_root,
    target_root=target_root,
    max_workers=12,  # Increase parallelism
    exist_ok=True,   # Allow overwriting existing picks
    log=False        # Reduce logging overhead
)
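A value for `max_workers` can be derived from the machine and the amount of work rather than hard-coded. The helper below is a hypothetical sketch, not a copick function. It assumes the work is I/O-bound and borrows the `min(32, cpu_count + 4)` heuristic that Python's `ThreadPoolExecutor` has used as its default since Python 3.8, capped by the number of items to process.

```python
import os


def suggest_max_workers(n_items: int, cap: int = 32) -> int:
    """Suggest a thread count for `n_items` independent, I/O-bound tasks."""
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    # Never exceed the number of items or the cap; always return at least 1.
    return max(1, min(n_items, cap, cpu_count + 4))


workers = suggest_max_workers(n_items=3)
```

The result could then be passed as `max_workers=workers`. The best value depends on the storage backend and is worth measuring.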

Integration with CLI

The sync operations are also available through the command-line interface:

# Basic synchronization
copick sync picks -c source_config.json --target-config target_config.json

# With name mapping and user filtering
copick sync picks -c source_config.json --target-config target_config.json \
    --source-runs "run1,run2" \
    --target-runs "run1:exp1,run2:exp2" \
    --source-users "user1,user2" \
    --target-users "user1:analyst1,user2:analyst2" \
    --log

# From CryoET Data Portal
copick sync picks \
    --source-dataset-ids "12345,67890" \
    --target-config target_config.json \
    --log
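The comma-separated lists and `source:target` pairs in the commands above correspond to the list and dictionary arguments of the Python functions. The sketch below shows one way such strings could be converted; `parse_list` and `parse_mapping` are hypothetical helpers written for illustration and are not the parser used by the copick CLI.

```python
from typing import Dict, List


def parse_list(value: str) -> List[str]:
    """Split 'a,b,c' into ['a', 'b', 'c'], ignoring blanks and whitespace."""
    return [part.strip() for part in value.split(",") if part.strip()]


def parse_mapping(value: str) -> Dict[str, str]:
    """Split 'a:x,b:y' into {'a': 'x', 'b': 'y'}."""
    mapping: Dict[str, str] = {}
    for pair in parse_list(value):
        source, sep, target = pair.partition(":")
        if not sep or not source.strip() or not target.strip():
            raise ValueError(f"expected 'source:target', got {pair!r}")
        mapping[source.strip()] = target.strip()
    return mapping


kwargs = {
    "source_runs": parse_list("run1,run2"),
    "target_runs": parse_mapping("run1:exp1,run2:exp2"),
    "source_users": parse_list("user1,user2"),
    "target_users": parse_mapping("user1:analyst1,user2:analyst2"),
}
```

The resulting `kwargs` has the same shape as the keyword arguments shown in the Python usage examples.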