Motl basics#

  • Motl stands for “motive list” and contains a list of particles and their properties. It can be used in subtomogram averaging or to perform contextual analysis.

  • Internally, the particle list is stored within a Motl class as a pandas DataFrame with 20 columns (see below) and N rows, where N corresponds to the number of particles.

  • Externally, the particle list can be loaded from and written to a file in any of the following formats: the binary EM format (novaSTA, TOM/AV3, ArtiaX compatible), a RELION starfile (currently up to version 4.x), a STOPGAP starfile, an IMOD .mod file, or a simple CSV file.

  • The module cryomotl contains stand-alone functions for hassle-free conversions between formats, as well as the classes described below.

Motl class#

The Motl class is the parent class containing functions that are general to all formats. It is the class used for all manipulations applied to the particle list. The particle list itself is stored in the member variable df as a pandas DataFrame with the following columns:

  1. “score” - a quality metric (typically cross-correlation value between the particle and the reference)

  2. “geom1” - a free geometric property

  3. “geom2” - a free geometric property

  4. “subtomo_id” - a subtomogram id; IMPORTANT: many functions rely on this one being unique

  5. “tomo_id” - the id of the tomogram to which the particle belongs

  6. “object_id” - the id of the object to which the particle belongs

  7. “subtomo_mean” - a mean value of the subtomogram

  8. “x” - a position in the tomogram (an integer value), typically used for subtomogram extraction

  9. “y” - a position in the tomogram (an integer value), typically used for subtomogram extraction

  10. “z” - a position in the tomogram (an integer value), typically used for subtomogram extraction

  11. “shift_x” - shift of the particle in X direction (a float value); the complete position of a particle is given by x + shift_x

  12. “shift_y” - shift of the particle in Y direction (a float value); the complete position of a particle is given by y + shift_y

  13. “shift_z” - shift of the particle in Z direction (a float value); the complete position of a particle is given by z + shift_z

  14. “geom3” - a free geometric property

  15. “geom4” - a free geometric property

  16. “geom5” - a free geometric property

  17. “phi” - a phi angle describing rotation around the first Z axis (following Euler zxz convention)

  18. “psi” - a psi angle describing rotation around the second Z axis (following Euler zxz convention)

  19. “theta” - a theta angle describing rotation around the X axis (following Euler zxz convention)

  20. “class” - a class of the particle
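
To make the column layout concrete, the following dependency-free sketch represents one particle as a plain dictionary holding the 20 columns listed above and computes its complete position as x + shift_x, y + shift_y, z + shift_z. The helpers make_particle() and complete_position() are hypothetical names introduced only for this illustration; they are not part of cryomotl, where the same information lives in the df attribute.

```python
# Column names in the order listed above.
MOTL_COLUMNS = [
    "score", "geom1", "geom2", "subtomo_id", "tomo_id", "object_id",
    "subtomo_mean", "x", "y", "z", "shift_x", "shift_y", "shift_z",
    "geom3", "geom4", "geom5", "phi", "psi", "theta", "class",
]


def make_particle(**values):
    """Hypothetical helper: one particle row with all 20 columns (unset ones are 0.0)."""
    unknown = set(values) - set(MOTL_COLUMNS)
    if unknown:
        raise KeyError(f"not a motl column: {sorted(unknown)}")
    row = dict.fromkeys(MOTL_COLUMNS, 0.0)
    row.update(values)
    return row


def complete_position(row):
    """Hypothetical helper: integer position plus float shift for each axis."""
    return tuple(row[axis] + row["shift_" + axis] for axis in ("x", "y", "z"))


particle = make_particle(subtomo_id=1, tomo_id=2, x=100, y=50, z=30,
                         shift_x=0.25, shift_y=-0.5, shift_z=0.0)
position = complete_position(particle)
print(position)
```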

Child classes#

The subclasses include EmMotl, RelionMotl, StopgapMotl, DynamoMotl and ModMotl, and provide functionality for smooth transitions between the conventions used by the respective software. They mostly contain functions for reading and writing particle lists and are used under the hood to ensure compatibility with the parent Motl class, which provides the functions for modifying and inspecting particle lists.

Working with particle lists: Basic examples#

The following section provides some examples of how to use Motl files in your analysis pipelines. For a complete list of functions, please refer to the cryomotl module in the API guide. NOTE: All the examples shown assume that the cryomotl module has been imported:

[ ]:
from cryocat import cryomotl

Loading a particle list as a Motl object#

The first step in working with a particle list is to load it and store it as a Motl object. This can be accomplished with the load() function, which is used to initialize the Motl class. Example:

[ ]:
my_motl = cryomotl.Motl.load('path/to/motl_file')

Then, the properties of the particles can be displayed by inspecting the df attribute (pandas DataFrame).

[ ]:
display(my_motl.df)

This will print out the content of the pandas DataFrame with all the columns described in the previous section.

Extract a subset of particles#

To work on a subset of particles based on the value of a particular feature, you can extract them with the get_motl_subset() function. By default, the feature that is taken into account is the tomogram ID (tomo_id) and the function returns a new Motl object; however, you can ask for a pandas DataFrame instead. Here are some examples:

[ ]:
subset_motl_tomo = my_motl.get_motl_subset(2) #select particles belonging to tomogram 2 and return a new Motl object
subset_motl_class = my_motl.get_motl_subset(1, feature_id="class") #select particles belonging to class 1 and return a new Motl object
subset_df_tomo = my_motl.get_motl_subset(2, return_df=True) #select particles belonging to tomogram 2 and return a pandas dataframe
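
Conceptually, taking a subset amounts to keeping only the rows whose feature column matches the requested value. The sketch below shows this on a plain list of dictionaries. The function subset_by_feature() is a hypothetical stand-in written for this illustration and makes no claim about how get_motl_subset() is implemented.

```python
def subset_by_feature(rows, feature_values, feature_id="tomo_id"):
    """Hypothetical helper: keep rows whose `feature_id` is one of `feature_values`."""
    if not isinstance(feature_values, (list, tuple, set)):
        feature_values = [feature_values]  # accept a single value as well
    wanted = set(feature_values)
    return [row for row in rows if row[feature_id] in wanted]


# Toy particle list with only the columns needed here.
particles = [
    {"subtomo_id": 1, "tomo_id": 1, "class": 1},
    {"subtomo_id": 2, "tomo_id": 2, "class": 1},
    {"subtomo_id": 3, "tomo_id": 2, "class": 3},
]

tomo_2 = subset_by_feature(particles, 2)                       # particles of tomogram 2
class_1 = subset_by_feature(particles, 1, feature_id="class")  # particles of class 1
print([row["subtomo_id"] for row in tomo_2])
print([row["subtomo_id"] for row in class_1])
```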

Clean a particle list#

Different functions are available to clean a particle list, depending on the analysis pipeline.

  • Clean particles by distance: Clean duplicate particles, commonly required when working with particles from oversampling. The function that accomplishes this is clean_by_distance(). This function changes the df attribute of the Motl. To save the particle list after cleaning, it is necessary to write the Motl to file with the write_out() function (see below).

[ ]:
my_motl.clean_by_distance(10) #remove particles that are closer than 10 voxels to each other, keeping the one with the highest score value
my_motl.clean_by_distance(10, feature_id="class") #remove particles that are closer than 10 voxels to each other, grouping the particles by class and keeping the one with the highest score value
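
The idea behind distance-based cleaning can be sketched with a simple greedy procedure: visit the particles from the highest to the lowest score and keep a particle only if no already kept particle lies closer than the given distance. This is a minimal sketch under that assumption, using the hypothetical function clean_by_distance_sketch() on plain dictionaries; the actual clean_by_distance() may resolve neighbourhoods differently.

```python
import math


def position(row):
    """Complete position of a particle (shifts default to 0 if absent)."""
    return tuple(row[a] + row.get("shift_" + a, 0.0) for a in ("x", "y", "z"))


def clean_by_distance_sketch(rows, distance):
    """Greedy duplicate removal that prefers particles with higher scores."""
    kept = []
    for row in sorted(rows, key=lambda r: r["score"], reverse=True):
        # Keep the particle only if every kept particle is at least `distance` away.
        if all(math.dist(position(row), position(k)) >= distance for k in kept):
            kept.append(row)
    return kept


particles = [
    {"subtomo_id": 1, "score": 0.9, "x": 0, "y": 0, "z": 0},
    {"subtomo_id": 2, "score": 0.5, "x": 3, "y": 4, "z": 0},   # 5 voxels from particle 1
    {"subtomo_id": 3, "score": 0.7, "x": 30, "y": 0, "z": 0},
]

cleaned = clean_by_distance_sketch(particles, 10)
print(sorted(row["subtomo_id"] for row in cleaned))
```

Grouping by a feature, as in the second call above, would correspond to applying the same procedure separately to each group of particles.
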
  • Clean particles by CC value: This is accomplished with the clean_by_otsu() function.

[ ]:
my_motl.clean_by_otsu(feature_id="tomo_id") #group the particles by tomogram and use Otsu's method to compute, for each group, the score threshold by which the particles are cleaned
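
Otsu's method picks the threshold that best separates the values into a low and a high group by maximising the between-class variance. The sketch below is an unbinned variant that operates directly on a list of scores, whereas common library implementations work on a histogram, so the numbers can differ slightly. The function name otsu_threshold() and the choice to keep particles at or above the threshold are assumptions made for this illustration.

```python
def otsu_threshold(scores):
    """Return the split value that maximises the between-class variance."""
    values = sorted(scores)
    n = len(values)
    if n == 0:
        raise ValueError("at least one score is required")
    total = sum(values)
    best_threshold, best_variance = values[0], -1.0
    low_sum = 0.0
    for i in range(1, n):  # split into values[:i] (low) and values[i:] (high)
        low_sum += values[i - 1]
        if values[i] == values[i - 1]:
            continue  # no split possible between identical values
        weight_low, weight_high = i / n, (n - i) / n
        mean_low = low_sum / i
        mean_high = (total - low_sum) / (n - i)
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = (values[i - 1] + values[i]) / 2
    return best_threshold


scores = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]
threshold = otsu_threshold(scores)
kept = [s for s in scores if s >= threshold]  # assumption: low scores are discarded
print(threshold, kept)
```
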
  • Clean particles by radius: For particles whose coordinates were sampled on a spherical surface, the clean_by_radius() function removes the particles that do not fit the surface, i.e. those lying outside the sphere radius. This function is part of the structure module. If an output path is passed, the cleaned particle list is written to the specified file.

[ ]:
from cryocat import structure
my_motl_cleaned = structure.PleomorphicSurface.clean_by_radius(my_motl, feature_id='object_id', threshold=None, output_file='./my_motl_cleaned.em') #for each object, remove the particles that are outside its radius ± the standard deviation of the distance between the particles and the object center
my_motl_cleaned = structure.PleomorphicSurface.clean_by_radius(my_motl, feature_id='object_id', threshold=5, output_file='./my_motl_cleaned.em') #for each object, remove the particles that are outside its radius ± the threshold value
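
The radius criterion described in the comments above can be sketched as follows for a single object. This sketch assumes that the object centre is the centroid of the particle positions and that the radius is the mean distance to that centre; how the structure module actually estimates these is not stated here, so both are assumptions, as is the function name clean_by_radius_sketch().

```python
import math
import statistics


def clean_by_radius_sketch(positions, threshold=None):
    """Keep positions whose distance to the centre lies within radius ± tolerance."""
    n = len(positions)
    # Assumption: centre = centroid, radius = mean distance to the centre.
    center = tuple(sum(p[axis] for p in positions) / n for axis in range(3))
    distances = [math.dist(p, center) for p in positions]
    radius = statistics.fmean(distances)
    # Without an explicit threshold, use the standard deviation of the distances.
    tolerance = statistics.pstdev(distances) if threshold is None else threshold
    return [p for p, d in zip(positions, distances) if abs(d - radius) <= tolerance]


# Six particles on a sphere of radius 10 plus one particle far off the surface.
positions = [(10, 0, 0), (-10, 0, 0), (0, 10, 0), (0, -10, 0),
             (0, 0, 10), (0, 0, -10), (0, 0, 0)]

cleaned_std = clean_by_radius_sketch(positions)                 # tolerance = std
cleaned_fixed = clean_by_radius_sketch(positions, threshold=5)  # tolerance = 5
print(len(cleaned_std), len(cleaned_fixed))
```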

Write a Motl to file#

If you have edited the df attribute of your Motl object or generated a new Motl object and wish to save it to a particle list file, you need to use the write_out() function:

[ ]:
my_motl.write_out("path/to/desired_output_file.em") #this will save my_motl to desired_output_file.em

Example of different classes in use#

Let’s assume we are working with a STOPGAP motl file and want to scale the coordinates by a factor of 2, i.e. to halve the binning factor we are working at. To do that, we need to load the file following the conventions of the appropriate class by specifying the type of the motl we are dealing with.

[ ]:
my_sg_motl = cryomotl.Motl.load('input_motl_file.star', motl_type='stopgap')

Still using the Motl class, we modify the object according to our needs.

[ ]:
my_sg_motl.scale_coordinates(2)
my_sg_motl.update_coordinates()
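
The effect of these two steps on a single particle can be sketched as follows. The sketch assumes that scaling multiplies both the positions and the shifts by the factor, and that updating folds the shifts into the positions so that x becomes the rounded value of x + shift_x while shift_x keeps the remainder. Both behaviours and both function names are assumptions made for this illustration and should be checked against the API guide.

```python
import math


def scale_coordinates_sketch(row, factor):
    """Assumed behaviour: multiply positions and shifts by `factor`."""
    scaled = dict(row)
    for key in ("x", "y", "z", "shift_x", "shift_y", "shift_z"):
        scaled[key] = row[key] * factor
    return scaled


def update_coordinates_sketch(row):
    """Assumed behaviour: move the integer part of the shifts into the positions."""
    updated = dict(row)
    for axis in ("x", "y", "z"):
        total = row[axis] + row["shift_" + axis]
        updated[axis] = math.floor(total + 0.5)           # round half up
        updated["shift_" + axis] = total - updated[axis]  # remaining sub-voxel shift
    return updated


particle = {"x": 10, "y": 20, "z": 30,
            "shift_x": 0.4, "shift_y": -0.3, "shift_z": 0.0}
rescaled = update_coordinates_sketch(scale_coordinates_sketch(particle, 2))
print(rescaled)
```

Under these assumptions the complete position x + shift_x is simply doubled, while the integer part used for subtomogram extraction stays as close to it as possible.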

If one displays the df of the object at this stage, one would notice that the column labels are as described above, i.e. characteristic of a Motl class object. To write out the file in the correct format for further use in STOPGAP, we call:

[ ]:
cryomotl.Motl.write_out(my_sg_motl, 'output_motl_file.star', 'stopgap') #specify motl type to ensure correct convention

We can inspect the written-out data using either of the classes: Motl or StopgapMotl.

[ ]:
display(cryomotl.StopgapMotl.read_in('output_motl_file.star')) # keeps the specifiers (labels) and the column order of the file as is; object stored as DataFrame
display(cryomotl.Motl.load('output_motl_file.star', motl_type='stopgap').df) # the labels correspond to the Motl object; to display the values we need to inspect the df attribute

Reading model files from IMOD#

Amongst the child classes of Motl, ModMotl allows reading binary .mod files from IMOD and working with them as ModMotl objects, which inherit the same parameters as Motl objects. Examples:

  • Load one or multiple .mod files from IMOD and store them as ModMotl:

[ ]:
my_modmotl = cryomotl.ModMotl('path/to/model_folder/', mod_prefix='tomo_') #load all mod files in model_folder that start with tomo_
my_modmotl = cryomotl.ModMotl('path/to/model_file.mod') #load model_file.mod

display(my_modmotl.df) # inspect the content
  • Get the content of a model file as a pandas DataFrame object:

[ ]:
mod_df = cryomotl.ModMotl.read_in('path/to/model_file.mod') # read in a mod file and return a pandas dataframe

Conversely, to convert a Motl and write it out to a binary .mod file, the write_to_model_file() function is available:

[ ]:
my_motl.write_to_model_file("tomo_id", "tomo", point_size=1) #split my_motl based on the tomo_id and write the resulting motl files to individual model files with the prefix "tomo_"