Copick and Album


Album provides an easy way to deploy code and environments for processing tasks across platforms. An album solution that iterates over all runs in a copick project takes only a few steps to set up. Below is a step-by-step guide to writing an album solution that processes all runs in a copick project and stores a set of random points in each run.

A cookiecutter solution for copick can be found at the end of this page.

Step 1: Install Album

Comprehensive installation instructions for Album can be found on the Album docs website.

TL;DR:

conda create -n album album -c conda-forge
conda activate album

Step 2: Set up your copick project

In this example, we will create a solution that processes all runs in a copick project and stores a set of random points. Here, we will use the runs from dataset 10301 on the CZ cryoET Data Portal and a local overlay project. Other overlay backends can be used as well, see here.

Configuration Template

{
    "config_type": "cryoet_data_portal",
    "name": "Example Project",
    "description": "This is an example project.",
    "version": "0.5.0",
    "pickable_objects": [
        {
            "name": "random-points",
            "is_particle": true,
            "label": 1,
            "color": [
                0,
                117,
                220,
                255
            ],
            "radius": 10
        }
    ],
    "overlay_root": "local:///Users/utz.ermel/Documents/chlamy_proc/random_points/",
    "overlay_fs_args": {
        "auto_mkdir": true
    },
    "dataset_ids" : [10301]
}
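Before running a solution against this configuration, it can help to check that the pickable object the solution will write to is actually defined. The following is a minimal sketch using only the standard library; `check_config` is a hypothetical helper (not part of copick or album), and the embedded JSON is a trimmed copy of the configuration above with a placeholder overlay path.

```python
import json

# Trimmed, hypothetical copy of the configuration above. In practice this
# text would be read from the JSON file on disk.
config_text = """
{
    "config_type": "cryoet_data_portal",
    "pickable_objects": [
        {"name": "random-points", "is_particle": true, "label": 1, "radius": 10}
    ],
    "overlay_root": "local:///tmp/random_points/",
    "dataset_ids": [10301]
}
"""


def check_config(config: dict, out_object: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Collect the names of all pickable objects declared in the config.
    names = [obj["name"] for obj in config.get("pickable_objects", [])]
    if out_object not in names:
        problems.append(f"pickable object '{out_object}' is not defined")
    # A data-portal project needs at least one dataset to provide runs.
    if not config.get("dataset_ids"):
        problems.append("no dataset_ids given")
    return problems


config = json.loads(config_text)
print(check_config(config, "random-points"))
```

The object name checked here should match the `out_object` argument of the solution defined in Step 3, whose default is `random-points`.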

Step 3: Create your solution

Environment setup

Album solutions are single Python files that contain both the code to be run and a description of the environment in which it should run. First, we will define a baseline environment containing copick, scikit-image and numpy. Album uses this definition to create a conda environment for the solution upon installation.

###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""

Arguments

Album automatically parses command line arguments passed to the album runner. These arguments can be accessed by calling get_args inside the run function. They are defined as a list of dictionaries, which is passed to the album.runner.api.setup() call in the last step. For copick, useful arguments include the path to the copick config, the run names and any output object definitions.

args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "num_points",
        "description": "Number of random points to generate",
        "type": "integer",
        "required": False,
        "default": 10,
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]

Solution code

Next, we will write the code that will be run by album. This code has to be defined within a function called run. As run will be executed in a different environment than the solution file, it is important to move all imports into the body of the function.

def run():
    # Imports
    from typing import List, Sequence

    import copick
    import numpy as np
    import zarr
    from copick.models import CopickLocation, CopickPoint

Next we will parse the input arguments passed from the album runner.

    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path
    run_names = args.run_names.split(",")
    voxel_spacing = args.voxel_spacing
    tomo_type = args.tomo_type
    num_points = args.num_points
    out_object = args.out_object
    out_user = args.out_user
    out_session = args.out_session
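The `run_names` argument arrives as a single string, and `"".split(",")` returns `[""]` rather than an empty list, which is why the solution later compares against `[""]`. As an alternative, a small helper can normalize the input. This is a minimal sketch; `parse_run_names` and the run names used below are hypothetical and not part of album or copick.

```python
def parse_run_names(raw: str) -> list:
    """Split a comma-separated string into run names.

    Surrounding whitespace is stripped and empty entries are dropped, so an
    empty input string yields an empty list.
    """
    return [name.strip() for name in raw.split(",") if name.strip()]


print(parse_run_names(""))               # []
print(parse_run_names("run_a, run_b,"))  # ['run_a', 'run_b']
```

With this helper, the "process all runs" case would be detected with `if not run_names:` instead of `if run_names == [""]:`.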

Now we will define any function that we need to process the runs. In this case, we will generate a set of random points for each run and store them in the copick project.

Note

This function needs to be defined within the run function to ensure that it is available in the album environment.

    def generate_random_points(npoints: int, mdim: Sequence[float]) -> List[CopickPoint]:
        """Generate a set of random points inside a box of size mdim (x, y, z)."""
        points = []
        for _i in range(npoints):
            # Convert to plain Python floats so each coordinate is a scalar,
            # not a one-element numpy array.
            point = CopickPoint(
                location=CopickLocation(
                    x=float(np.random.rand() * mdim[0]),
                    y=float(np.random.rand() * mdim[1]),
                    z=float(np.random.rand() * mdim[2]),
                ),
            )
            points.append(point)
        return points
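For reproducible output, the coordinates can also be drawn in one vectorized call from a seeded generator. The sketch below is standalone and only produces a coordinate array; `random_coordinates` and its `seed` parameter are assumptions for illustration and are not part of the solution above.

```python
import numpy as np


def random_coordinates(npoints, mdim, seed=None):
    """Draw npoints uniform (x, y, z) coordinates inside a box of size mdim."""
    rng = np.random.default_rng(seed)
    # rng.random returns values in [0, 1); scaling each column by the box
    # extent yields coordinates in [0, mdim[i]).
    return rng.random((npoints, 3)) * np.asarray(mdim, dtype=float)


coords = random_coordinates(5, (100.0, 200.0, 50.0), seed=42)
print(coords.shape)  # (5, 3)
```

Each row could then be turned into a point by passing `float(row[0])`, `float(row[1])` and `float(row[2])` as the x, y and z coordinates of a location.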

Next, we will iterate over all runs in the copick project and store a set of random points in each run.

    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)

        # Get the physical tomogram dimensions
        vs = run.get_voxel_spacing(voxel_spacing)
        tomo = vs.get_tomogram(tomo_type)
        pixel_max_dim = zarr.open(tomo.zarr())["0"].shape[::-1]
        max_dim = np.array([d * voxel_spacing for d in pixel_max_dim])

        # If picks of the same type already exist, we will get and overwrite them
        picks = run.get_picks(object_name=out_object, user_id=out_user, session_id=out_session)

        # If picks do not exist, we will generate new picks and add them to the run
        if len(picks) == 0:
            picks = run.new_picks(object_name=out_object, user_id=out_user, session_id=out_session)
        else:
            picks = picks[0]

        points = generate_random_points(num_points, max_dim)
        picks.points = points
        picks.store()

    print("Processing complete.")
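The dimension calculation in the loop above reverses the array shape before scaling it, because the array is indexed as (z, y, x) while the points are specified as (x, y, z). A minimal sketch of that arithmetic, using a hypothetical tomogram shape and the default voxel spacing of 10.0:

```python
def physical_extent(shape_zyx, voxel_spacing):
    """Convert a (z, y, x) voxel shape to (x, y, z) physical extents."""
    # Reverse the axis order, then scale each voxel count by the spacing.
    return tuple(n * voxel_spacing for n in shape_zyx[::-1])


# Hypothetical shape: 200 slices in z, 928 rows in y, 960 columns in x.
extent = physical_extent((200, 928, 960), 10.0)
print(extent)  # (9600.0, 9280.0, 2000.0)
```

The resulting extents are in the same physical units as the voxel spacing, and they bound the random coordinates generated for each run.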

Album solution setup

Finally, we will set up the album solution. This is done by calling the setup function with the argument list, the run function and the environment file.

setup(
    group="copick",
    name="random_points",
    version="0.1.0",
    title="Random Points",
    description="This solution generates a set of random points for each run in a copick project.",
    solution_creators=["Alice", "Bob"],
    tags=["copick", "points"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)

Step 4: Run your solution

To run the solution, save the solution code to a file, e.g. random_points.py, and run the following command:

album install random_points.py
album run random_points.py --copick_config_path /path/to/copick_config.json --voxel_spacing 7.84

This will generate a set of 10 random points (the default for num_points) for each run in the copick project.

Step 5: Visualize your results

You can visualize your output using ChimeraX-copick, napari-copick or any other visualization tool that supports the copick dataset API.

The album documentation provides a comprehensive guide on how to share your solution with others using the album catalog.

TL;DR

Full code for the solution above:

Random Points

###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""

args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "num_points",
        "description": "Number of random points to generate",
        "type": "integer",
        "required": False,
        "default": 10,
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]


def run():
    # Imports
    from typing import List, Sequence

    import copick
    import numpy as np
    import zarr
    from copick.models import CopickLocation, CopickPoint

    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path
    run_names = args.run_names.split(",")
    voxel_spacing = args.voxel_spacing
    tomo_type = args.tomo_type
    num_points = args.num_points
    out_object = args.out_object
    out_user = args.out_user
    out_session = args.out_session

    # Function definitions
    def generate_random_points(npoints: int, mdim: Sequence[float]) -> List[CopickPoint]:
        """Generate a set of random points inside a box of size mdim (x, y, z)."""
        points = []
        for _i in range(npoints):
            # Convert to plain Python floats so each coordinate is a scalar,
            # not a one-element numpy array.
            point = CopickPoint(
                location=CopickLocation(
                    x=float(np.random.rand() * mdim[0]),
                    y=float(np.random.rand() * mdim[1]),
                    z=float(np.random.rand() * mdim[2]),
                ),
            )
            points.append(point)
        return points

    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)

        # Get the physical tomogram dimensions
        vs = run.get_voxel_spacing(voxel_spacing)
        tomo = vs.get_tomogram(tomo_type)
        pixel_max_dim = zarr.open(tomo.zarr())["0"].shape[::-1]
        max_dim = np.array([d * voxel_spacing for d in pixel_max_dim])

        # If picks of the same type already exist, we will get and overwrite them
        picks = run.get_picks(object_name=out_object, user_id=out_user, session_id=out_session)

        # If picks do not exist, we will generate new picks and add them to the run
        if len(picks) == 0:
            picks = run.new_picks(object_name=out_object, user_id=out_user, session_id=out_session)
        else:
            picks = picks[0]

        points = generate_random_points(num_points, max_dim)
        picks.points = points
        picks.store()

    print("Processing complete.")


setup(
    group="copick",
    name="random_points",
    version="0.1.0",
    title="Random Points",
    description="This solution generates a set of random points for each run in a copick project.",
    solution_creators=["Alice", "Bob"],
    tags=["copick", "points"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)

Cookiecutter template

A cookiecutter template for copick solutions can be found below:

Cookiecutter Template

###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""

args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]


def run():
    # Imports
    import copick
    from copick.models import CopickRun

    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path

    run_names = args.run_names.split(",")

    # Function definitions
    def process_run(run: CopickRun):
        # some code ...
        pass

    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)

        process_run(run)

        # Store result

    print("Processing complete.")


setup(
    group="copick",
    name="solution-name",
    version="0.1.0",
    title="Template",
    description="Description.",
    solution_creators=["Alice", "Bob"],
    tags=["copick"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)