RelionMotlv5

class cryocat.core.cryomotl.RelionMotlv5(input_particles=None, input_tomograms=None, tomo_idx=None, pixel_size=None, binning=1.0, optics_data=None, tomo_format='', subtomo_format='', version=None)

Bases: RelionMotl, Motl

check_isWarp()
static clean_subtomo_name_column(df)
static clean_tomo_name_column(df)

Clean the ‘rlnTomoName’ column of a tomogram DataFrame by extracting only the numeric ID from names like ‘TS_17.tomostar’.

Parameters:
df : pandas.DataFrame

The tomogram DataFrame; it must contain the ‘rlnTomoName’ column.

Returns:
pandas.DataFrame

The same DataFrame with a cleaned ‘rlnTomoName’ column.
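
The cleaning step described above can be illustrated with a short pandas sketch. It is not the cryoCAT implementation: the function name is hypothetical, the result is assumed to be an integer ID, and the sketch returns a copy rather than modifying the input.

```python
import pandas as pd


def clean_tomo_name_sketch(df):
    """Hypothetical helper: keep only the numeric ID of each 'rlnTomoName' entry."""
    out = df.copy()
    # Take the first run of digits in each name, e.g. 'TS_17.tomostar' -> '17'.
    ids = out["rlnTomoName"].astype(str).str.extract(r"(\d+)", expand=False)
    # Assumption: every name contains at least one digit, so the cast cannot fail.
    out["rlnTomoName"] = ids.astype(int)
    return out


tomograms = pd.DataFrame({"rlnTomoName": ["TS_17.tomostar", "TS_003.tomostar"]})
cleaned = clean_tomo_name_sketch(tomograms)
```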

columns_v5 = ['rlnCoordinateX', 'rlnCoordinateY', 'rlnCoordinateZ', 'rlnAngleRot', 'rlnAngleTilt', 'rlnAnglePsi', 'rlnTomoName', 'rlnTomoParticleName', 'rlnOpticsGroup', 'rlnOriginXAngst', 'rlnOriginYAngst', 'rlnOriginZAngst']
columns_v5_centAng = ['rlnCenteredCoordinateXAngst', 'rlnCenteredCoordinateYAngst', 'rlnCenteredCoordinateZAngst', 'rlnAngleRot', 'rlnAngleTilt', 'rlnAnglePsi', 'rlnTomoName', 'rlnTomoParticleName', 'rlnOpticsGroup', 'rlnOriginXAngst', 'rlnOriginYAngst', 'rlnOriginZAngst', 'rlnClassNumber', 'rlnRandomSubset', 'rlnGroupNumber']
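
The two column lists differ mainly in how particle positions are stored: as rlnCoordinateX/Y/Z or as rlnCenteredCoordinateX/Y/ZAngst. The sketch below shows one way to relate the two. It assumes that the centered coordinates are measured in Ångström from the tomogram centre and that the plain coordinates are in pixels from the tomogram corner. The function name, the tomo_dims argument and the exact convention are assumptions, not taken from cryoCAT.

```python
import pandas as pd


def centered_angstrom_to_pixels(df, tomo_dims, pixel_size):
    """Hypothetical conversion from centered Angstrom to corner-based pixel coordinates.

    tomo_dims is the tomogram size in pixels as (x, y, z); pixel_size is in Angstrom.
    """
    out = pd.DataFrame(index=df.index)
    for axis, dim in zip("XYZ", tomo_dims):
        centered = df[f"rlnCenteredCoordinate{axis}Angst"]
        # Scale Angstrom to pixels, then shift the origin from the centre to the corner.
        out[f"rlnCoordinate{axis}"] = centered / pixel_size + dim / 2.0
    return out


particles = pd.DataFrame(
    {
        "rlnCenteredCoordinateXAngst": [0.0, 10.0],
        "rlnCenteredCoordinateYAngst": [0.0, -20.0],
        "rlnCenteredCoordinateZAngst": [0.0, 4.0],
    }
)
pixels = centered_angstrom_to_pixels(particles, tomo_dims=(100, 200, 50), pixel_size=2.0)
```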
convert_coordinates_ang_merge(relion_df)
convert_coordinates_merge(relion_df)
convert_to_motl(relion_df, optics_df=None, tomo_format='', subtomo_format='')

The function converts a DataFrame in relion format into a motl DataFrame.

Parameters:
relion_df : pandas.DataFrame

DataFrame in relion format.

version : float, optional

Version of Relion DataFrame. Defaults to None.

optics_df : pandas.DataFrame, optional

DataFrame with optics data. Defaults to None.

Returns:
None

Notes

This method modifies the df attribute of the object.

create_optics_group_v5(pixel_size=None, subtomo_size=None, binning=None)
create_particles_data()

Creates an empty DataFrame in Relion version-specific format with the size corresponding to self.df.

Parameters:
version : float

The Relion version to be used. Valid values are 3.0, 3.1, and any other value for version 4 or higher.

Returns:
pandas.DataFrame

The empty DataFrame with columns corresponding to the specified Relion version.

create_relion_df(use_original_entries=False, tomo_format='', subtomo_format='', keep_all_entries=False, add_object_id=False, add_subunit_id=False, binning=None, pixel_size=None, adapt_object_attr=False, convert=False)

This function takes the self.df attribute and creates a DataFrame in Relion format.

Parameters:
tomo_format : str, default=""

Output format of the tomogram name. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.

subtomo_format : str, default=""

Output format of the subtomogram name. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.

use_original_entries : bool, default=False

Whether to use the original entries stored in self.relion_df (True) or not (False). If True, all relion entries that are not used in motl (e.g., rlnCtfImage, rlnHelicalTubeID) are fetched from the original relion dataframe. Coordinates, rotations, classes etc. will be updated. Defaults to False.

keep_all_entries : bool, default=False

Used only if use_original_entries is True. If True, all entries are kept as they were loaded, including coordinates, rotations and classes. It should be set to True only if particles were selected but none of their values changed. Defaults to False.

version : float, optional

Specifies the version and thereby the format of the DataFrame. If not provided, the value from self.version will be used. Defaults to None.

add_object_id : bool, default=False

Whether to add “object_id” from self.df to the DataFrame. If True, the column will be named “ccObjectName”. This is particularly useful for exporting fields mapped during loading, such as “rlnHelicalTubeID”. Defaults to False.

add_subunit_id : bool, default=False

Whether to add “subunit_id” from self.df to the DataFrame. If True, the column will be named “ccSubunitName”. Defaults to False.

binning : int, optional

Binning that should be used for conversion in case of Relion v. 4.x. If not provided, the value from self.binning will be used. Defaults to None.

pixel_size : float, optional

The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.

adapt_object_attr : bool, default=False

Whether to store the created DataFrame in the self.relion_df attribute of the object. Defaults to False.

Returns:
pandas.DataFrame

A dataframe in Relion format.

See also

cryocat.cryomotl.RelionMotl.prepare_particles_data()

Provides more information on tomo_format and subtomo_format.

prepare_optics_data(use_original_entries=True, optics_data=None, pixel_size=None, subtomo_size=None)

The function prepares the optics data for a relion DataFrame. It takes a dictionary, a DataFrame or a starfile path as an argument and returns a pandas DataFrame containing the optics information in version-specific format.

Parameters:
use_original_entries : bool, default=True

Whether to use self.optics_df as the source (True) or not. If set to True, optics_data as well as version will be ignored. Defaults to True.

optics_data : str, optional

The optics data specified as a path to the starfile (which can also contain the particle list), as a dictionary, or as a DataFrame. It is used only if “use_original_entries” is set to False. Defaults to None.

version : float, optional

Relion version to be used for the DataFrame. It is used only if use_original_entries is set to False and “optics_data” is a path to a starfile. If not set, self.version will be used instead. Defaults to None.

pixel_size : float, optional

The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.

subtomo_size : int, optional

The size of the subtomograms. If not provided, it will be set to “NaN”. Defaults to None.

Returns:
pandas.DataFrame

DataFrame with the optics data.

Raises:
UserInputError

If optics_data is neither a str nor a pandas.DataFrame.

Warning

If optics_data is not specified and self.optics_df is empty.
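
The source selection described for prepare_optics_data can be sketched as follows. This is an illustration, not the cryoCAT code: the function name, the read_star callback and the local UserInputError class are hypothetical stand-ins, and turning a dictionary into a one-row DataFrame is an assumption.

```python
import pandas as pd


class UserInputError(Exception):
    """Local stand-in for the cryoCAT exception of the same name."""


def select_optics_source(stored_optics, use_original_entries=True,
                         optics_data=None, read_star=None):
    """Hypothetical helper that picks where the optics table comes from."""
    if use_original_entries:
        # The stored table wins and optics_data is ignored.
        return stored_optics
    if isinstance(optics_data, pd.DataFrame):
        return optics_data
    if isinstance(optics_data, dict):
        # Assumption: a dictionary of scalars describes a single optics group.
        return pd.DataFrame([optics_data])
    if isinstance(optics_data, str):
        if read_star is None:
            raise UserInputError("A starfile reader is needed to load a path.")
        # read_star is a caller-supplied function mapping a path to a DataFrame.
        return read_star(optics_data)
    raise UserInputError("optics_data must be a path, a dictionary or a DataFrame.")


stored = pd.DataFrame({"rlnOpticsGroup": [1]})
from_dict = select_optics_source(
    stored, use_original_entries=False, optics_data={"rlnOpticsGroup": 2}
)
```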

prepare_particles_data(tomo_format='', subtomo_format='')

The function creates a DataFrame that contains the information on particles in Relion format. The function takes in the version of Relion to be used and formats describing how the tomogram/tilt-series and subtomogram names should be assembled.

Parameters:
tomo_format : str, default=""

Format of the tomogram/tilt-series name. It contains a sequence of “x” characters introduced by the “$” character. The longest such sequence marks the position of the tomo_id and is replaced with the corresponding tomo_id. The number of “x” letters in the longest sequence determines the number of digits, padded with leading zeros. For example, for tomo_id 5 the format “/path/to/tomo/$xxxx.rec” results in “/path/to/tomo/0005.rec”. The sequence can be present multiple times; sequences of “x” shorter than the longest one are kept intact: for tomo_id 5, “/path/to/tomo/$xxxx/$xxxx_$xx.mrc” results in “/path/to/tomo/0005/0005_$xx.mrc”. Defaults to empty string, in which case the tomo_id will be used without any zero padding.

subtomo_format : str, default=""

Format of the subtomogram name. It contains a sequence of “y” characters introduced by the “$” character. The longest such sequence marks the position of the subtomo_id and is replaced with the corresponding subtomo_id. The number of “y” letters in the longest sequence determines the number of digits, padded with leading zeros. For example, for subtomo_id 65 the format “/path/to/subtomograms/$yyy.mrc” results in “/path/to/subtomograms/065.mrc”. The sequence can be present multiple times; sequences of “y” shorter than the longest one are kept intact: for subtomo_id 65, “/path/to/subtomograms/$yy_$yyy.mrc” results in “/path/to/subtomograms/$yy_065.mrc”. The subtomo_format can also contain a sequence of “x” letters introduced by “$”, in which case these are replaced by the tomo_id in the same way as for tomo_format. For example, for tomo_id 5 and subtomo_id 65 the format “/path/to/subtomograms/$xxxx/$xxxx_$yyy.mrc” results in “/path/to/subtomograms/0005/0005_065.mrc”. Defaults to empty string, in which case the subtomo_id will be used without any zero padding.

version : float, optional

Relion version to be used for the DataFrame. Defaults to None, in which case self.version is used.

pixel_size : float, optional

The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.

Returns:
pandas.DataFrame

A DataFrame with particle list in Relion format.

Raises:
UserInputError

In case the format does not contain a valid sequence.
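
The placeholder rule described for tomo_format and subtomo_format can be written as a small standalone function. This is a minimal sketch of the documented behaviour, not the cryoCAT implementation: the names fill_placeholder and build_subtomo_name are hypothetical, and a plain ValueError is raised in place of the library's own exception.

```python
import re


def fill_placeholder(fmt, value, letter):
    """Replace the longest '$' + letter run in fmt with the zero-padded value."""
    runs = re.findall(r"\$(" + letter + r"+)", fmt)
    if not runs:
        raise ValueError(
            f"The format {fmt} does not contain any sequence of $ followed by {letter}."
        )
    longest = max(runs, key=len)
    # Shorter runs such as '$xx' do not contain '$xxxx', so they stay untouched.
    return fmt.replace("$" + longest, str(value).zfill(len(longest)))


def build_subtomo_name(fmt, tomo_id, subtomo_id):
    """Fill the mandatory '$y' run, then the optional '$x' run."""
    name = fill_placeholder(fmt, subtomo_id, "y")
    if re.search(r"\$x", name):
        name = fill_placeholder(name, tomo_id, "x")
    return name


tomo_name = fill_placeholder("/path/to/tomo/$xxxx/$xxxx_$xx.mrc", 5, "x")
subtomo_name = build_subtomo_name("/path/to/subtomograms/$xxxx/$xxxx_$yyy.mrc", 5, 65)
```

With the inputs above, tomo_name and subtomo_name reproduce the two multi-sequence results quoted in the parameter descriptions.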

Examples

>>> rln_motl = cryomotl.RelionMotl()
>>> rln_motl.fill({"tomo_id": [2], "subtomo_id":[65]})
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xxxx.rec",
... subtomo_format="/path/to/$xxxx/$xxxx_$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
/path/to/0002.rec
/path/to/0002/0002_65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xxxx",
... subtomo_format="/path/to/$xxxx/$xxxx_$yy_2.6A", version=4.0)
>>> print(rln_df["rlnTomoName"].values[0])
>>> print(rln_df["rlnTomoParticleName"].values[0])
/path/to/0002
/path/to/0002/0002_65_2.6A
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xx.rec",
... subtomo_format="/path/to/xxxx/xxxx_$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
/path/to/02.rec
/path/to/xxxx/xxxx_65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="",
... subtomo_format="/path/to/$xxx/$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
2
/path/to/002/65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="",
... subtomo_format="/path/to/$xxx/yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
ValueError: The format /path/to/$xxx/yy_2.6A.mrc does not contain any sequence of \$ followed by y.

read_in(input_path)

Reads in a starfile and returns the particle list, version of the starfile and optics data if present.

Parameters:
input_pathstr

The path to the starfile.

Returns:
frames : pandas.DataFrame

pandas.DataFrame containing the particle list in relion format.

version : float

The version extracted from the starfile. See cryocat.cryomotl.RelionMotl.get_version_from_file() for more information.

optics_df : pandas.DataFrame or None

pandas.DataFrame containing optics if available, otherwise None.

read_in_tomograms(input_path)
relion2warp(input_df)

Converts relion_df from relion5 to warp2.

Returns:

warp2relion(input_df)

Converts relion_df from warp2 to relion5.

Returns:

write_out(output_path, write_optics=True, tomo_format='', subtomo_format='', use_original_entries=False, keep_all_entries=False, add_object_id=False, add_subunit_id=False, binning=None, pixel_size=None, optics_data=None, subtomo_size=None, convert=False)

This function converts self.df DataFrame to a DataFrame in Relion format and writes it out as a starfile.

Parameters:
output_path : str

The output path to the starfile to be written out.

write_optics : bool, default=True

Whether to include optics data in the starfile or not. Defaults to True.

tomo_format : str, default=""

Output format of the tomogram name. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.

subtomo_format : str, default=""

Output format of the subtomogram name. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.

use_original_entries : bool, default=False

Whether to use the original entries stored in self.relion_df (True) or not (False). If True, all relion entries that are not used in motl (e.g., rlnCtfImage, rlnHelicalTubeID) are fetched from the original relion dataframe. Coordinates, rotations, classes etc. will be updated. Defaults to False.

keep_all_entries : bool, default=False

Used only if use_original_entries is True. If True, all entries are kept as they were loaded, including coordinates, rotations and classes. It should be set to True only if particles were selected but none of their values changed. Defaults to False.

version : float, optional

Specifies the version and thereby the format of the DataFrame. If not provided, the value from self.version will be used. Defaults to None.

add_object_id : bool, default=False

Whether to add “object_id” from self.df to the DataFrame. If True, the column will be named “ccObjectName”. This is particularly useful for exporting fields mapped during loading, such as “rlnHelicalTubeID”. Defaults to False.

add_subunit_id : bool, default=False

Whether to add “subunit_id” from self.df to the DataFrame. If True, the column will be named “ccSubunitName”. Defaults to False.

binning : int, optional

Binning that should be used for conversion in case of Relion v. 4.x. If not provided, the value from self.binning will be used. Defaults to None.

pixel_size : float, optional

The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.

optics_data : str, optional

A DataFrame or a dictionary containing optics data, or a path to the starfile from which the optics should be fetched. See cryocat.cryomotl.RelionMotl.prepare_optics_data() for more details. Used only if write_optics is True. If it is None and write_optics is True, the attribute self.optics_df will be used. Defaults to None.

subtomo_size : int, optional

The size of the subtomograms. If not provided, it will be set to “NaN”. Defaults to None.

Returns:
None

See also

cryocat.cryomotl.RelionMotl.prepare_particles_data()

Provides more information on tomo_format and subtomo_format.

cryocat.cryomotl.RelionMotl.prepare_optics_data()

Provides more information on optics_data inputs.