CZ cryoET Data Portal
copick CZ cryoET Data Portal
The CZ cryoET Data Portal provides a standardized access to cryoET datasets and annotations. This tutorial demonstrates how to datasets and annotations stored on the CZ cryoET Data Portal as static data in a copick project.
Step 1: Install copick
See the quickstart guide for instructions on how to install copick.
Step 2: Setup your project
We will create a project that uses dataset 10301 from the CZ
cryoET Data Portal. We will store locally created annotations in a local directory, called the "overlay". In the following,
we will create a configuration file copick_config.json
that describes the project.
The configuration file is a JSON file that contains all information necessary to access the data. We first provide general information about the project, such as the project name, description, and version.
{
"config_type": "cryoet_data_portal",
"name": "Example Project",
"description": "This is an example project.",
"version": "0.5.0"
}
Next, we define the objects that can be accessed and created using the copick API. Dataset 10301 already contains
annotations from multiple authors, which we can access from within copick. In order to make data portal annotations
available, we need to include the Gene Ontology IDs or UniProtKB accessions (identifier
) of the objects we want to access. Any portal annotation
that has a matching GO ID will be available in the copick project.
In this case, we will obtain pre-existing annotations for the ribosome, ATPase, and membrane. We will also create a new object called "prohibitin" that will be stored in the overlay directory, but is not available on the data portal.
{
"pickable_objects": [
{
"name": "ribosome",
"is_particle": true,
"identifier": "GO:0022626",
"label": 1,
"color": [ 0, 117, 220, 255],
"radius": 150
},
{
"name": "atpase",
"is_particle": true,
"identifier": "GO:0045259",
"label": 2,
"color": [251, 192, 147, 255],
"radius": 150
},
{
"name": "membrane",
"is_particle": false,
"identifier": "GO:0016020",
"label": 3,
"color": [200, 200, 200, 255],
"radius": 10
},
{
"name": "prohibitin",
"is_particle": true,
"label": 4,
"color": [ 155, 117, 220, 255],
"radius": 10
}
]
}
Finally, we define the overlay directory where the new annotations will be stored, and the dataset ID of dataset 10301 on the CZ cryoET Data Portal.
{
"overlay_root": "local:///home/bob/copick_project/",
"overlay_fs_args": {
"auto_mkdir": true
},
"dataset_ids" : [10301]
}
Full Configuration Template
{
"config_type": "cryoet_data_portal",
"name": "Example Project",
"description": "This is an example project.",
"version": "0.5.0",
"pickable_objects": [
{
"name": "ribosome",
"is_particle": true,
"identifier": "GO:0022626",
"label": 1,
"color": [ 0, 117, 220, 255],
"radius": 150
},
{
"name": "atpase",
"is_particle": true,
"identifier": "GO:0045259",
"label": 2,
"color": [251, 192, 147, 255],
"radius": 150
},
{
"name": "membrane",
"is_particle": false,
"identifier": "GO:0016020",
"label": 3,
"color": [200, 200, 200, 255],
"radius": 10
}
],
"overlay_root": "local:///Users/utz.ermel/Documents/chlamy_proc/random_points/",
"overlay_fs_args": {
"auto_mkdir": true
},
"dataset_ids" : [10301]
}
Step 3: Visualize, curate or process the data
You can now use the copick API to access the data from dataset 10301 and the overlay directory. As a first step, you can print the available objects and runs.
"""Print all objects and runs in a copick project."""
import copick
# Initialize the root object from a configuration file
root = copick.from_file("path/to/config.json")
# List all available objects
obj_info = [(o.name, o.label) for o in root.objects.values()]
print("Pickable objects in this project:")
for name, label in obj_info:
print(f" {name}: {label}")
# Execute a function on each run in the project
runs = root.runs
print("Runs in this project:")
for run in runs:
print(f"Run: {run.name}")
# Do something with the run
Visualization works as with any other copick project. For more information, see the tutorial on ChimeraX-copick integration., or check out CellCanvas or napari-copick for alternative visualization options.