Copick and Album
Album provides an easy way to deploy code and environments for processing tasks across platforms. Setting up an album solution that iterates over all runs in a copick project takes only a few steps. Below is a step-by-step guide to writing an album solution that processes all runs in a copick project and stores a set of random points in each run.
A cookiecutter solution for copick can be found at the end of this page.
Step 1: Install Album
Comprehensive installation instructions for Album can be found on the Album docs website.
TL;DR:
Step 2: Set up your copick project
In this example, we will create a solution that processes all runs in a copick project and stores a set of random points. Here, we will use the runs from dataset 10301 on the CZ cryoET Data Portal and a local overlay project. Other overlay backends can be used as well, see here.
Configuration Template
{
    "config_type": "cryoet_data_portal",
    "name": "Example Project",
    "description": "This is an example project.",
    "version": "0.5.0",
    "pickable_objects": [
        {
            "name": "random-points",
            "is_particle": true,
            "label": 1,
            "color": [
                0,
                117,
                220,
                255
            ],
            "radius": 10
        }
    ],
    "overlay_root": "local:///Users/utz.ermel/Documents/chlamy_proc/random_points/",
    "overlay_fs_args": {
        "auto_mkdir": true
    },
    "dataset_ids": [10301]
}
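Before running a solution against this configuration, it can help to confirm that the file parses and contains the fields shown above. The following is a minimal sketch using only the Python standard library. The helper name `check_config`, the set of keys treated as required, and the placeholder `overlay_root` path are assumptions made for illustration; none of them are part of copick's API.

```python
import json

# Keys taken from the configuration template above; treating them as
# required is an assumption made for this sketch.
EXPECTED_KEYS = {"config_type", "name", "pickable_objects", "overlay_root", "dataset_ids"}


def check_config(text: str) -> dict:
    """Parse a copick-style config and report missing keys (hypothetical helper)."""
    config = json.loads(text)
    missing = EXPECTED_KEYS - config.keys()
    if missing:
        raise ValueError(f"Config is missing keys: {sorted(missing)}")
    return config


# Shortened version of the template; the overlay path is a placeholder.
example = """
{
    "config_type": "cryoet_data_portal",
    "name": "Example Project",
    "pickable_objects": [{"name": "random-points", "is_particle": true, "label": 1}],
    "overlay_root": "local:///path/to/overlay/",
    "dataset_ids": [10301]
}
"""

config = check_config(example)
object_names = [obj["name"] for obj in config["pickable_objects"]]
print(object_names)
```

The name of the pickable object matters later on: the solution's `out_object` argument defaults to `random-points`, which matches the entry defined here.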
Step 3: Create your solution
Environment setup
Album solutions are single Python files that contain the code to be run and information about the environment in which the code should be run. First, we will set up a baseline environment for copick projects, scikit-image and numpy. This will be used by album to create a conda environment for the solution upon installation.
###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""
Arguments
Album automatically parses command line arguments passed to the album runner. These arguments can be accessed using the get_args function inside the run function. They can be defined as a list of dictionaries, which is passed to the album.runner.api.setup() call in the last step. In the case of copick, useful arguments could be the path to the copick config, the run names and any output object definitions.
args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "num_points",
        "description": "Number of random points to generate",
        "type": "integer",
        "required": False,
        "default": 10,
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]
Solution code
Next, we will write the code that will be run by album. This code has to be defined within a function called run. As run will be executed in a different environment than the solution file, it is important to move all imports into the body of the function.
def run():
    # Imports
    from typing import List, Sequence

    import copick
    import numpy as np
    import zarr
    from copick.models import CopickLocation, CopickPoint
Next we will parse the input arguments passed from the album runner.
    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path
    run_names = args.run_names.split(",")
    voxel_spacing = args.voxel_spacing
    tomo_type = args.tomo_type
    num_points = args.num_points
    out_object = args.out_object
    out_user = args.out_user
    out_session = args.out_session
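One detail of the parsing step is worth spelling out: splitting the empty default string on commas yields a list containing one empty string, which is why the solution later compares `run_names` against `[""]`. The sketch below illustrates this and shows a slightly more forgiving alternative. The helper `parse_run_names` and the run names `run_a` and `run_b` are hypothetical and not part of album or copick.

```python
from typing import List


def parse_run_names(raw: str) -> List[str]:
    """Split a comma-separated string, dropping blanks (hypothetical helper)."""
    return [name.strip() for name in raw.split(",") if name.strip()]


# The plain split used in the solution yields [""] for the empty default ...
plain = "".split(",")

# ... whereas the helper yields an empty list, trims whitespace and
# ignores a trailing comma.
cleaned = parse_run_names("run_a, run_b,")
```

With a helper like this, the "process all runs" check would become `if not run_names:` instead of a comparison with `[""]`.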
Now we will define any function that we need to process the runs. In this case, we will generate a set of random points for each run and store them in the copick project.
Note
This function needs to be defined within the run function to ensure that it is available in the album environment.
    def generate_random_points(npoints: int, mdim: Sequence[float]) -> List[CopickPoint]:
        """Generate a set of random points within the given physical dimensions."""
        points = []
        for _i in range(npoints):
            # Convert to plain Python floats so each coordinate is a scalar,
            # not a one-element array
            point = CopickPoint(
                location=CopickLocation(
                    x=float(np.random.rand() * mdim[0]),
                    y=float(np.random.rand() * mdim[1]),
                    z=float(np.random.rand() * mdim[2]),
                ),
            )
            points.append(point)
        return points
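To experiment with the sampling logic without installing copick, the same idea can be sketched with the standard library alone. The function below is an illustrative stand-in rather than the solution's code: it returns plain coordinate tuples instead of `CopickPoint` objects, and the name `random_coordinates` and its `seed` parameter are assumptions added here for reproducibility.

```python
import random
from typing import List, Sequence, Tuple


def random_coordinates(
    npoints: int, mdim: Sequence[float], seed: int = 0
) -> List[Tuple[float, float, float]]:
    """Draw npoints (x, y, z) tuples uniformly from [0, mdim) along each axis."""
    rng = random.Random(seed)  # seeded so that repeated calls give the same points
    return [
        (rng.random() * mdim[0], rng.random() * mdim[1], rng.random() * mdim[2])
        for _ in range(npoints)
    ]


# Example extent in physical units (hypothetical values)
coords = random_coordinates(5, (100.0, 200.0, 50.0))
```

Each tuple could then be wrapped in a `CopickLocation` in the same way as in the function above.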
Next, we will iterate over all runs in the copick project and store a set of random points in each run.
    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)

        # Get the physical tomogram dimensions
        vs = run.get_voxel_spacing(voxel_spacing)
        tomo = vs.get_tomogram(tomo_type)
        pixel_max_dim = zarr.open(tomo.zarr())["0"].shape[::-1]
        max_dim = np.array([d * voxel_spacing for d in pixel_max_dim])

        # If picks of the same type already exist, we will get and overwrite them
        picks = run.get_picks(object_name=out_object, user_id=out_user, session_id=out_session)

        # If picks do not exist, we will generate new picks and add them to the run
        if len(picks) == 0:
            picks = run.new_picks(object_name=out_object, user_id=out_user, session_id=out_session)
        else:
            picks = picks[0]

        points = generate_random_points(num_points, max_dim)
        picks.points = points
        picks.store()

    print("Processing complete.")
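The dimension calculation in the loop reverses the array shape before scaling it by the voxel spacing. The sketch below isolates that step, assuming the array axes are ordered (z, y, x), which is what the reversal implies. The function name `physical_extent` and the example shape are hypothetical.

```python
from typing import Sequence, Tuple


def physical_extent(shape_zyx: Sequence[int], voxel_spacing: float) -> Tuple[float, float, float]:
    """Convert an array shape in (z, y, x) order to a physical (x, y, z) extent."""
    nz, ny, nx = shape_zyx
    return (nx * voxel_spacing, ny * voxel_spacing, nz * voxel_spacing)


# A hypothetical tomogram of 200 x 630 x 630 voxels at a spacing of 10.0
extent = physical_extent((200, 630, 630), 10.0)
print(extent)
```

Random coordinates drawn within this extent are expressed in the same physical units as the voxel spacing rather than in voxel indices.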
Album solution setup
Finally, we will set up the album solution. This is done by calling the setup function with the argument list, the solution code and the environment file.
setup(
    group="copick",
    name="random_points",
    version="0.1.0",
    title="Random Points",
    description="This solution generates a set of random points for each run in a copick project.",
    solution_creators=["Alice", "Bob"],
    tags=["copick", "points"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)
Step 4: Run your solution
To run the solution, save the solution code to a file, e.g. random_points.py, and run the following commands:
album install random_points.py
album run random_points.py --copick_config_path /path/to/copick_config.json --voxel_spacing 7.84
This will generate a set of 10 random points for each run in the copick project.
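When a solution is launched from another script, for example to sweep over several parameter values, the command line can be assembled programmatically. The following is a minimal sketch that only builds and prints the command. It assumes that every argument defined in the solution's argument list is passed in the same `--name value` form as `--voxel_spacing` above; the helper `album_run_command` is hypothetical.

```python
import shlex
from typing import Dict, List


def album_run_command(solution: str, options: Dict[str, object]) -> List[str]:
    """Build the argument list for `album run` (hypothetical helper)."""
    command = ["album", "run", solution]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    return command


command = album_run_command(
    "random_points.py",
    {
        "copick_config_path": "/path/to/copick_config.json",
        "voxel_spacing": 7.84,
        "num_points": 25,
    },
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run` once album is installed and the solution has been installed with `album install`.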
Step 5: Visualize your results
You can visualize your output using ChimeraX-copick, napari-copick or any other visualization tool that supports the copick dataset API.
Step 6: Share your solution
The album documentation provides a comprehensive guide on how to share your solution with others using the album catalog.
TL;DR
Full code for the solution above:
Random Points
###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""

args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "num_points",
        "description": "Number of random points to generate",
        "type": "integer",
        "required": False,
        "default": 10,
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]


def run():
    # Imports
    from typing import List, Sequence

    import copick
    import numpy as np
    import zarr
    from copick.models import CopickLocation, CopickPoint

    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path
    run_names = args.run_names.split(",")
    voxel_spacing = args.voxel_spacing
    tomo_type = args.tomo_type
    num_points = args.num_points
    out_object = args.out_object
    out_user = args.out_user
    out_session = args.out_session

    # Function definitions
    def generate_random_points(npoints: int, mdim: Sequence[float]) -> List[CopickPoint]:
        """Generate a set of random points within the given physical dimensions."""
        points = []
        for _i in range(npoints):
            # Convert to plain Python floats so each coordinate is a scalar,
            # not a one-element array
            point = CopickPoint(
                location=CopickLocation(
                    x=float(np.random.rand() * mdim[0]),
                    y=float(np.random.rand() * mdim[1]),
                    z=float(np.random.rand() * mdim[2]),
                ),
            )
            points.append(point)
        return points

    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)

        # Get the physical tomogram dimensions
        vs = run.get_voxel_spacing(voxel_spacing)
        tomo = vs.get_tomogram(tomo_type)
        pixel_max_dim = zarr.open(tomo.zarr())["0"].shape[::-1]
        max_dim = np.array([d * voxel_spacing for d in pixel_max_dim])

        # If picks of the same type already exist, we will get and overwrite them
        picks = run.get_picks(object_name=out_object, user_id=out_user, session_id=out_session)

        # If picks do not exist, we will generate new picks and add them to the run
        if len(picks) == 0:
            picks = run.new_picks(object_name=out_object, user_id=out_user, session_id=out_session)
        else:
            picks = picks[0]

        points = generate_random_points(num_points, max_dim)
        picks.points = points
        picks.store()

    print("Processing complete.")


setup(
    group="copick",
    name="random_points",
    version="0.1.0",
    title="Random Points",
    description="This solution generates a set of random points for each run in a copick project.",
    solution_creators=["Alice", "Bob"],
    tags=["copick", "points"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)
Cookiecutter template
A cookiecutter template for copick solutions can be found below:
Cookiecutter Template
###album catalog: mycatalog

from album.runner.api import get_args, setup

env_file = """
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - zarr
  - ome-zarr
  - numpy<2
  - scipy
  - scikit-image
  - trimesh
  - pip:
    - album
    - "copick[all]>=0.5.2"
"""

args = [
    {
        "name": "copick_config_path",
        "description": "Path to the copick config file",
        "type": "string",
        "required": True,
    },
    {
        "name": "run_names",
        "description": "List of comma-separated run names to process",
        "type": "string",
        "required": False,
        "default": "",
    },
    {
        "name": "voxel_spacing",
        "description": "Voxel spacing for the tomograms",
        "type": "float",
        "required": False,
        "default": 10.0,
    },
    {
        "name": "tomo_type",
        "description": "Type of tomogram",
        "type": "string",
        "required": False,
        "default": "wbp",
    },
    {
        "name": "out_object",
        "description": "Name of the output pickable object.",
        "type": "string",
        "required": False,
        "default": "random-points",
    },
    {
        "name": "out_user",
        "description": "User/Tool name for output points.",
        "type": "string",
        "required": False,
        "default": "solution-01",
    },
    {
        "name": "out_session",
        "description": "Output session, indicating this set was generated by a tool.",
        "type": "string",
        "required": False,
        "default": "0",
    },
]


def run():
    # Imports
    import copick
    from copick.models import CopickRun

    # Parse arguments
    args = get_args()
    copick_config_path = args.copick_config_path
    run_names = args.run_names.split(",")

    # Function definitions
    def process_run(run: CopickRun):
        # some code ...
        pass

    # Load copick project root
    root = copick.from_file(copick_config_path)

    # If no run names are provided, process all runs
    if run_names == [""]:
        run_names = [r.name for r in root.runs]

    # Process runs
    for run_name in run_names:
        print(f"Processing run {run_name}")
        run = root.get_run(run_name)
        process_run(run)
        # Store result

    print("Processing complete.")


setup(
    group="copick",
    name="solution-name",
    version="0.1.0",
    title="Template",
    description="Description.",
    solution_creators=["Alice", "Bob"],
    tags=["copick"],
    license="MIT",
    album_api_version="0.5.1",
    args=args,
    run=run,
    dependencies={"environment_file": env_file},
)