RelionMotlv5
- class cryocat.core.cryomotl.RelionMotlv5(input_particles=None, input_tomograms=None, tomo_idx=None, pixel_size=None, binning=1.0, optics_data=None, tomo_format='', subtomo_format='', version=None)
Bases: RelionMotl, Motl
- check_isWarp()
- static clean_subtomo_name_column(df)
- static clean_tomo_name_column(df)
Clean the ‘rlnTomoName’ column of a tomogram DataFrame by extracting only the numeric ID from names like ‘TS_17.tomostar’.
- Parameters:
- df : pandas.DataFrame
The tomogram DataFrame, must contain ‘rlnTomoName’ column.
- Returns:
- pandas.DataFrame
The same DataFrame with a cleaned ‘rlnTomoName’ column.
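The extraction described above can be illustrated on plain strings. This is a minimal sketch, not cryocat's implementation: `extract_tomo_number` is a hypothetical helper, and it assumes that the numeric ID is the first run of digits in the name and that an integer is wanted.

```python
import re


def extract_tomo_number(tomo_name):
    """Return the first run of digits in tomo_name as an int.

    Hypothetical helper for illustration; not part of cryocat.
    Assumption: the numeric ID is the first digit run in the name.
    """
    match = re.search(r"\d+", tomo_name)
    if match is None:
        raise ValueError(f"No numeric ID found in {tomo_name!r}")
    return int(match.group(0))


names = ["TS_17.tomostar", "TS_003.tomostar"]
print([extract_tomo_number(name) for name in names])  # prints [17, 3]
```

Applied to a DataFrame column, a function like this could be passed to pandas' `Series.map`; whether the library keeps the ID as a string or converts it to a number is not stated in this reference.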
- columns_v5 = ['rlnCoordinateX', 'rlnCoordinateY', 'rlnCoordinateZ', 'rlnAngleRot', 'rlnAngleTilt', 'rlnAnglePsi', 'rlnTomoName', 'rlnTomoParticleName', 'rlnOpticsGroup', 'rlnOriginXAngst', 'rlnOriginYAngst', 'rlnOriginZAngst']
- columns_v5_centAng = ['rlnCenteredCoordinateXAngst', 'rlnCenteredCoordinateYAngst', 'rlnCenteredCoordinateZAngst', 'rlnAngleRot', 'rlnAngleTilt', 'rlnAnglePsi', 'rlnTomoName', 'rlnTomoParticleName', 'rlnOpticsGroup', 'rlnOriginXAngst', 'rlnOriginYAngst', 'rlnOriginZAngst', 'rlnClassNumber', 'rlnRandomSubset', 'rlnGroupNumber']
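The two column lists differ only in a few labels, which a short comparison makes explicit. The sketch below copies the lists from this page; `uses_centered_coordinates` is a hypothetical helper, and the idea of detecting the layout from the column names is an assumption, not documented library behavior.

```python
columns_v5 = [
    "rlnCoordinateX", "rlnCoordinateY", "rlnCoordinateZ",
    "rlnAngleRot", "rlnAngleTilt", "rlnAnglePsi",
    "rlnTomoName", "rlnTomoParticleName", "rlnOpticsGroup",
    "rlnOriginXAngst", "rlnOriginYAngst", "rlnOriginZAngst",
]
columns_v5_centAng = [
    "rlnCenteredCoordinateXAngst", "rlnCenteredCoordinateYAngst",
    "rlnCenteredCoordinateZAngst",
    "rlnAngleRot", "rlnAngleTilt", "rlnAnglePsi",
    "rlnTomoName", "rlnTomoParticleName", "rlnOpticsGroup",
    "rlnOriginXAngst", "rlnOriginYAngst", "rlnOriginZAngst",
    "rlnClassNumber", "rlnRandomSubset", "rlnGroupNumber",
]

# Labels present in one layout but not in the other, in their original order.
only_in_v5 = [c for c in columns_v5 if c not in columns_v5_centAng]
only_in_centAng = [c for c in columns_v5_centAng if c not in columns_v5]


def uses_centered_coordinates(columns):
    """Hypothetical check: True if all three centered-coordinate labels are present."""
    centered = columns_v5_centAng[:3]
    return all(label in columns for label in centered)


print(only_in_v5)
print(only_in_centAng)
print(uses_centered_coordinates(columns_v5_centAng))
```

The comparison shows that the centered layout replaces the three `rlnCoordinate*` labels with `rlnCenteredCoordinate*Angst` labels and adds `rlnClassNumber`, `rlnRandomSubset` and `rlnGroupNumber`.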
- convert_coordinates_ang_merge(relion_df)
- convert_coordinates_merge(relion_df)
- convert_to_motl(relion_df, optics_df=None, tomo_format='', subtomo_format='')
Converts a DataFrame in Relion format into a motl DataFrame.
- Parameters:
- relion_df : pandas.DataFrame
DataFrame in Relion format.
- version : float, optional
Version of the Relion DataFrame. Defaults to None.
- optics_df : pandas.DataFrame, optional
DataFrame with optics data. Defaults to None.
- Returns:
- None
Notes
This method modifies the df attribute of the object.
- create_optics_group_v5(pixel_size=None, subtomo_size=None, binning=None)
- create_particles_data()
Creates an empty DataFrame in Relion version-specific format with the size corresponding to self.df.
- Parameters:
- version : float
The Relion version to be used. Valid values are 3.0, 3.1, and any other value for version 4 or higher.
- Returns:
- pandas.DataFrame
The empty DataFrame with columns corresponding to the specified Relion version.
- create_relion_df(use_original_entries=False, tomo_format='', subtomo_format='', keep_all_entries=False, add_object_id=False, add_subunit_id=False, binning=None, pixel_size=None, adapt_object_attr=False, convert=False)
This function takes the self.df attribute and creates a DataFrame in Relion format.
- Parameters:
- tomo_format : str, default=""
Format of the tomogram name in the output. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.
- subtomo_format : str, default=""
Format of the subtomogram name in the output. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.
- use_original_entries : bool, default=False
Determines whether to use the original entries stored in self.relion_df (True) or not (False). If True, all Relion entries that are not used in motl (e.g., rlnCtfImage, rlnHelicalTubeID) are fetched from the original Relion DataFrame. Coordinates, rotations, classes etc. will be updated. Defaults to False.
- keep_all_entries : bool, default=False
Used only if use_original_entries is True. If True, all entries are kept as they were loaded, including coordinates, rotations and classes. It should be set to True only if some selection of particles was done and nothing else changed. Defaults to False.
- version : float, optional
Specifies the version and thereby the format of the DataFrame. If not provided, the value from self.version will be used. Defaults to None.
- add_object_id : bool, default=False
Whether to add “object_id” from self.df to the DataFrame. If True, the column will be named “ccObjectName”. This is particularly useful for exporting fields mapped during loading, such as “rlnHelicalTubeID”. Defaults to False.
- add_subunit_id : bool, default=False
Whether to add “subunit_id” from self.df to the DataFrame. If True, the column will be named “ccSubunitName”. Defaults to False.
- binning : int, optional
Binning that should be used for conversion in case of Relion v. 4.x. If not provided, the value from self.binning will be used. Defaults to None.
- pixel_size : float, optional
The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.
- adapt_object_attr : bool, default=False
Whether to store the created DataFrame in the self.relion_df attribute of the object. Defaults to False.
- Returns:
- pandas.DataFrame
A dataframe in Relion format.
See also
cryocat.cryomotl.RelionMotl.prepare_particles_data()
Provides more information on tomo_format and subtomo_format.
- prepare_optics_data(use_original_entries=True, optics_data=None, pixel_size=None, subtomo_size=None)
Prepares the optics data for a Relion DataFrame. It takes a path to a starfile, a dictionary, or a DataFrame and returns a pandas DataFrame containing the optics information in a version-specific format.
- Parameters:
- use_original_entries : bool, default=True
Whether to use self.optics_df as the source (True) or not (False). If set to True, optics_data as well as version will be ignored. Defaults to True.
- optics_data : str, optional
The optics data specified as a path to a starfile (which can also contain the particle list), as a dictionary, or as a DataFrame. It is used only if use_original_entries is set to False. Defaults to None.
- version : float, optional
Relion version to be used for the DataFrame. It is used only if use_original_entries is set to False and optics_data is a path to a starfile. If not set, self.version will be used instead. Defaults to None.
- pixel_size : float, optional
The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.
- subtomo_size : int, optional
The size of the subtomograms. If not provided, it will be set to “NaN”. Defaults to None.
- Returns:
- pandas.DataFrame
DataFrame with the optics data.
- Raises:
- UserInputError
If optics_data is neither a str nor a pandas.DataFrame.
- Warning
If optics_data is not specified and self.optics_df is empty.
- prepare_particles_data(tomo_format='', subtomo_format='')
Creates a DataFrame that contains the information on particles in Relion format. The function takes in the version of Relion to be used and formats describing how the tomogram/tilt-series and subtomogram names should be assembled.
- Parameters:
- tomo_format : str, default=""
Format specifying the tomogram/tilt-series name. It contains a sequence of “x” letters introduced by the “$” character. The longest sequence is evaluated as the position of the tomo_id and replaced with the corresponding tomo_id. The number of “x” letters in the longest sequence determines the number of digits to pad with zeros. For example, for tomo_id 5 the format “/path/to/tomo/$xxxx.rec” results in “/path/to/tomo/0005.rec”. The sequence can be present multiple times; sequences of “x” shorter than the longest one are kept intact: for tomo_id 5, “/path/to/tomo/$xxxx/$xxxx_$xx.mrc” results in “/path/to/tomo/0005/0005_$xx.mrc”. Defaults to empty string, in which case the tomo_id is used without any zero padding.
- subtomo_format : str, default=""
Format specifying the subtomogram name. It contains a sequence of “y” letters introduced by the “$” character. The longest sequence is evaluated as the position of the subtomo_id and replaced with the corresponding subtomo_id. The number of “y” letters in the longest sequence determines the number of digits to pad with zeros. For example, for subtomo_id 65 the format “/path/to/subtomograms/$yyy.mrc” results in “/path/to/subtomograms/065.mrc”. The sequence can be present multiple times; sequences of “y” shorter than the longest one are kept intact: for subtomo_id 65, “/path/to/subtomograms/$yy_$yyy.mrc” results in “/path/to/subtomograms/$yy_065.mrc”. The subtomo_format can also contain a sequence of “x” letters introduced by “$”, in which case these are replaced by the tomo_id in the same way as for tomo_format. For example, for tomo_id 5 and subtomo_id 65, “/path/to/subtomograms/$xxxx/$xxxx_$yyy.mrc” results in “/path/to/subtomograms/0005/0005_065.mrc”. Defaults to empty string, in which case the subtomo_id is used without any zero padding.
- version : float, optional
Relion version to be used for the DataFrame. Defaults to None, in which case self.version is used.
- pixel_size : float, optional
The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.
- Returns:
- pandas.DataFrame
A DataFrame with particle list in Relion format.
- Raises:
- UserInputError
If the format does not contain a valid sequence.
Examples
>>> rln_motl = cryomotl.RelionMotl()
>>> rln_motl.fill({"tomo_id": [2], "subtomo_id":[65]})
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xxxx.rec",
... subtomo_format="/path/to/$xxxx/$xxxx_$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
/path/to/0002.rec
/path/to/0002/0002_65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xxxx",
... subtomo_format="/path/to/$xxxx/$xxxx_$yy_2.6A", version=4.0)
>>> print(rln_df["rlnTomoName"].values[0])
>>> print(rln_df["rlnTomoParticleName"].values[0])
/path/to/0002
/path/to/0002/0002_65_2.6A
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="/path/to/$xx.rec",
... subtomo_format="/path/to/xxxx/xxxx_$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
/path/to/02.rec
/path/to/xxxx/xxxx_65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="",
... subtomo_format="/path/to/$xxx/$yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
2
/path/to/002/65_2.6A.mrc
>>> rln_df = rln_motl.prepare_particles_data(tomo_format="",
... subtomo_format="/path/to/$xxx/yy_2.6A.mrc", version=3.1)
>>> print(rln_df["rlnMicrographName"].values[0])
>>> print(rln_df["rlnImageName"].values[0])
ValueError: The format /path/to/$xxx/yy_2.6A.mrc does not contain any sequence of \$ followed by y.
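The substitution rule described for tomo_format and subtomo_format can be sketched in plain Python. This is an illustration written from the description above, not cryocat's implementation: `expand_id_format` and `format_subtomo_name` are hypothetical names, the IDs are assumed to be non-negative integers, and a plain ValueError stands in for the library's own exception.

```python
import re


def expand_id_format(fmt, letter, id_value, required=True):
    """Replace the longest "$" + letter run(s) in fmt with the zero-padded id.

    Hypothetical helper for illustration; not part of cryocat.
    """
    if fmt == "":
        # Empty format: use the id as-is, without zero padding.
        return str(id_value)

    pattern = re.compile(r"\$(" + re.escape(letter) + r"+)")
    runs = [m.group(1) for m in pattern.finditer(fmt)]
    if not runs:
        if required:
            raise ValueError(
                f"The format {fmt} does not contain any sequence of $ followed by {letter}."
            )
        return fmt

    # The length of the longest run sets the zero-padding width.
    width = max(len(run) for run in runs)
    padded = str(id_value).zfill(width)

    def substitute(match):
        # Only runs of the maximal length are replaced; shorter ones stay intact.
        return padded if len(match.group(1)) == width else match.group(0)

    return pattern.sub(substitute, fmt)


def format_subtomo_name(fmt, tomo_id, subtomo_id):
    """Expand optional "$x..." runs with tomo_id, then "$y..." runs with subtomo_id."""
    if fmt == "":
        return str(subtomo_id)
    with_tomo = expand_id_format(fmt, "x", tomo_id, required=False)
    return expand_id_format(with_tomo, "y", subtomo_id)


print(expand_id_format("/path/to/tomo/$xxxx/$xxxx_$xx.mrc", "x", 5))
print(format_subtomo_name("/path/to/subtomograms/$xxxx/$xxxx_$yyy.mrc", 5, 65))
```

The two calls print “/path/to/tomo/0005/0005_$xx.mrc” and “/path/to/subtomograms/0005/0005_065.mrc”, matching the examples given in the parameter descriptions. A format without any “$y” run raises an error in this sketch, mirroring the last doctest above.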
- read_in(input_path)
Reads in a starfile and returns the particle list, version of the starfile and optics data if present.
- Parameters:
- input_path : str
The path to the starfile.
- Returns:
- frames : pandas.DataFrame
pandas.DataFrame containing the particle list in Relion format.
- version : float
The version extracted from the starfile. See cryocat.cryomotl.RelionMotl.get_version_from_file() for more information.
- optics_df : pandas.DataFrame or None
pandas.DataFrame containing optics if available, otherwise None.
- read_in_tomograms(input_path)
- relion2warp(input_df)
Converts relion_df from Relion 5 to Warp 2 format.
- Returns:
- warp2relion(input_df)
Converts relion_df from Warp 2 to Relion 5 format.
- Returns:
- write_out(output_path, write_optics=True, tomo_format='', subtomo_format='', use_original_entries=False, keep_all_entries=False, add_object_id=False, add_subunit_id=False, binning=None, pixel_size=None, optics_data=None, subtomo_size=None, convert=False)
This function converts the self.df DataFrame to a DataFrame in Relion format and writes it out as a starfile.
- Parameters:
- output_path : str
The output path of the starfile to be written out.
- write_optics : bool, default=True
Whether to include optics data in the starfile or not. Defaults to True.
- tomo_format : str, default=""
Format of the tomogram name in the output. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.
- subtomo_format : str, default=""
Format of the subtomogram name in the output. See cryocat.cryomotl.RelionMotl.prepare_particles_data() for more information. Defaults to empty string.
- use_original_entries : bool, default=False
Determines whether to use the original entries stored in self.relion_df (True) or not (False). If True, all Relion entries that are not used in motl (e.g., rlnCtfImage, rlnHelicalTubeID) are fetched from the original Relion DataFrame. Coordinates, rotations, classes etc. will be updated. Defaults to False.
- keep_all_entries : bool, default=False
Used only if use_original_entries is True. If True, all entries are kept as they were loaded, including coordinates, rotations and classes. It should be set to True only if some selection of particles was done and nothing else changed. Defaults to False.
- version : float, optional
Specifies the version and thereby the format of the DataFrame. If not provided, the value from self.version will be used. Defaults to None.
- add_object_id : bool, default=False
Whether to add “object_id” from self.df to the DataFrame. If True, the column will be named “ccObjectName”. This is particularly useful for exporting fields mapped during loading, such as “rlnHelicalTubeID”. Defaults to False.
- add_subunit_id : bool, default=False
Whether to add “subunit_id” from self.df to the DataFrame. If True, the column will be named “ccSubunitName”. Defaults to False.
- binning : int, optional
Binning that should be used for conversion in case of Relion v. 4.x. If not provided, the value from self.binning will be used. Defaults to None.
- pixel_size : float, optional
The pixel size of the data. If not provided, the pixel size of the object instance (self.pixel_size) will be used. Defaults to None.
- optics_data : str, optional
A DataFrame or a dictionary containing optics data, or a path to the starfile from which the optics should be fetched. See cryocat.cryomotl.RelionMotl.prepare_optics_data() for more details. Used only if write_optics is True. If it is None and write_optics is True, the attribute self.optics_df will be used. Defaults to None.
- subtomo_size : int, optional
The size of the subtomograms. If not provided, it will be set to “NaN”. Defaults to None.
- Returns:
- None
See also
cryocat.cryomotl.RelionMotl.prepare_particles_data()
Provides more information on tomo_format and subtomo_format.
cryocat.cryomotl.RelionMotl.prepare_optics_data()
Provides more information on optics_data inputs.