Wedge lists#

Basics#

Wedge lists are tabular files that describe the missing wedge. They can cover a single tomogram or multiple tomograms and are required for subtomogram averaging (STA) and template matching (TM). At a minimum, they contain the tomogram identification number and the minimum and maximum tilt angles. The wedgeutils module in cryoCAT lets you work with wedge lists in EM or STOPGAP STAR format. Internally, wedge lists are stored as pandas DataFrames whose number of columns depends on the format.

  • STOPGAP STAR format: These files use the widely adopted STAR format. They contain one row per projection image and the 12 columns listed below:

    • tomo_id: Identification number of the tomogram to which the projection image belongs;

    • pixelsize: Original pixel size in Ångströms (Å);

    • tomo_x: Number of pixels on the x-axis;

    • tomo_y: Number of pixels on the y-axis;

    • tomo_z: Number of pixels on the z-axis. All three dimensions refer to the unbinned tomogram;

    • z_shift: Any shifts applied on the z-axis to center the tomogram;

    • tilt_angle: Tilt angle at which the projection image was collected;

    • defocus: Estimated defocus of that projection image;

    • exposure: Corrected dose for that projection image. It takes into account any prior dose and the cumulative dose;

    • voltage;

    • amp_contrast: Amplitude contrast;

    • cs: Spherical aberration.

  • EM format: These files contain as many rows as the number of tomograms and 3 columns:

    • tomo_num: Tomogram identification number;

    • min_angle: Minimum tilt angle;

    • max_angle: Maximum tilt angle.
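
The relationship between the two layouts can be sketched with plain pandas. The snippet below is only an illustration under assumptions: the values are made up, only a subset of the STOPGAP columns is shown, and the groupby step is a generic pandas operation rather than cryoCAT's own conversion routine.

```python
import pandas as pd

# Made-up values: two tomograms with three projection images each.
# Only a subset of the STOPGAP columns is shown for brevity.
sg_like = pd.DataFrame(
    {
        "tomo_id": [1, 1, 1, 2, 2, 2],
        "pixelsize": [2.176] * 6,
        "tilt_angle": [-60.0, 0.0, 60.0, -54.0, 3.0, 57.0],
    }
)

# Collapse to one row per tomogram with the minimum and maximum tilt angle,
# which gives the three columns of the EM layout.
em_like = (
    sg_like.groupby("tomo_id")["tilt_angle"]
    .agg(min_angle="min", max_angle="max")
    .reset_index()
    .rename(columns={"tomo_id": "tomo_num"})
)

print(em_like)
```

The per-projection table carries acquisition details for every tilt image, whereas the per-tomogram table keeps only the tilt range.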

In the following sections, we provide some examples of the wedgeutils module functionality.

Pre-requisites#

To work with the functions illustrated here, you need to import the wedgeutils module. Another useful third-party package is pandas. Any additional requirements are pointed out where necessary.

[ ]:
from cryocat import wedgeutils
import pandas as pd

Functions to generate wedge lists#

cryoCAT offers functions to generate wedge lists in both EM and STOPGAP STAR format, as shown in the basic examples below:

  • EM format: create_wedge_list_em_batch()

[ ]:
# Return a pandas DataFrame object with the wedge list in EM format and save it to the file specified by the desired path
my_em_wedgelist_df = wedgeutils.create_wedge_list_em_batch("/path/to/tomo_list.txt",
                                                           "/path/to/TS_$xxx/$xxx.mrc.mdoc",
                                                           output_file="/path/to/wedgelist.em",
                                                           )
  • STOPGAP STAR format: create_wedge_list_sg_batch(), and create_wedge_list_sg() for a wedge list covering a single tomogram. Typically, you would use the former to generate a wedge list for all the tomograms in your dataset.

[ ]:
# Return a pandas DataFrame object with the wedge list in STOPGAP format and save it to the file specified by the desired path
my_sg_wedgelist_df = wedgeutils.create_wedge_list_sg_batch("/path/to/tomo_list.txt",
                                                           2.176,
                                                           "/path/to/TS_$xxx/$xxx.mrc.mdoc",
                                                           tomo_dim=[4096, 4096, 1800],
                                                           ctf_file_format="/path/to/TS_$xxx/$xxx_ctf_output.txt",
                                                           ctf_file_type="ctffind4",
                                                           output_file="/path/to/wedgelist.star",
                                                           )
  • Directly from the terminal via the wedge_list command. It generates a STOPGAP STAR wedge list for either a single tomogram or a list of tomograms, mirroring the two functions illustrated above. To print the help message, call the command followed by --help or -h.

# Example for a single tomogram with only the required arguments plus the output path; the resulting file will not contain the defocus column
wedge_list stopgap --tomo_id 30 --tomo_dim 5760,4092,1800 --pixel_size 2.176 --tlt_file /path/to/TS_030.mrc.mdoc --output_file /path/to/TS_030_wedgelist.star

# Example for a list of tomograms (the pattern is single-quoted so that the shell does not expand $xxx)
wedge_list stopgap_batch --tomo_list /path/to/tomo_list.txt --pixel_size 2.176 --tlt_file_format '/path/to/TS_$xxx/$xxx.mrc.mdoc' --output_file /path/to/wedgelist.star
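
The path patterns above contain a `$xxx` placeholder. The single-tomogram example (tomogram 30 and `TS_030.mrc.mdoc`) suggests that it stands for the tomogram number, zero-padded to as many digits as there are `x` characters. The sketch below illustrates that reading only; `expand_tomo_pattern` is a hypothetical helper written for this example, not a cryoCAT function, and the padding rule is an assumption.

```python
import re


def expand_tomo_pattern(pattern: str, tomo_id: int) -> str:
    """Replace every `$xxx`-style placeholder with the zero-padded tomogram number.

    Hypothetical helper: assumes the number of `x` characters sets the padding width.
    """

    def _fill(match: re.Match) -> str:
        width = len(match.group(1))  # number of `x` characters after the `$`
        return str(tomo_id).zfill(width)

    return re.sub(r"\$(x+)", _fill, pattern)


print(expand_tomo_pattern("/path/to/TS_$xxx/$xxx.mrc.mdoc", 30))
# /path/to/TS_030/030.mrc.mdoc
```

Expanding a pattern by hand like this can be a quick way to check that the expected files exist before generating a wedge list for a whole dataset.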

Functions to load wedge lists#

  • To load a wedge list in STOPGAP format, you can use the load_wedge_list_sg() function;

  • To load a wedge list in EM format, you can use the load_wedge_list_em() function. Both functions return a pandas DataFrame.

[ ]:
my_sg_wedgelist_df = wedgeutils.load_wedge_list_sg("/path/to/my_sg_wedgelist.star") # STOPGAP format
my_em_wedgelist_df = wedgeutils.load_wedge_list_em("/path/to/my_em_wedgelist.em") # EM format
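
Because both loaders return DataFrames, the usual pandas tools apply to the result. As an illustration, the sketch below computes the fraction of the full 180° tilt range covered by each tomogram from the EM-format columns. The table is built by hand with made-up values in place of a loaded file, and `covered_fraction` is a column name introduced here for the example.

```python
import pandas as pd

# Hand-built stand-in for a loaded EM-format wedge list (made-up values)
em_wedgelist = pd.DataFrame(
    {"tomo_num": [1, 2], "min_angle": [-60.0, -54.0], "max_angle": [60.0, 57.0]}
)

# Fraction of the full 180-degree range covered by the tilt series;
# the remainder corresponds to the missing wedge.
em_wedgelist["covered_fraction"] = (
    em_wedgelist["max_angle"] - em_wedgelist["min_angle"]
) / 180.0

print(em_wedgelist)
```

A tilt series spanning -60° to +60° covers two thirds of the range, so one third is left unsampled.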

Others#

  • Convert a wedge list from STOPGAP to EM format: wedge_list_sg_to_em()

  • Create a wedge mask: create_wg_mask()

  • Apply the wedge mask to an EM map: apply_wg_mask()