tango#

class cryocat.analysis.tango.AlphaComplexDescriptor(twist_df, alpha_param=200.0, build_unique_desc=True)#

Bases: Descriptor

compute_alpha_complex(coord)#

Compute the stars and links of vertices corresponding to an alpha complex. The computation is performed in 2D, so the input coordinates have to lie close enough to a plane.

Parameters:

coordnumpy.ndarray: The coordinates of the points in the alpha complex.

Returns:

tuple
- triangleslist: A list of triangles representing the stars of the vertices in the alpha complex.
- link_edgeslist: A list of edges representing the links of the vertices in the alpha complex.
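
As an illustration of the two notions used above: the star of a vertex is the set of triangles containing it, and its link is the set of edges opposite to it in those triangles. The following is a minimal sketch, not cryocat's implementation. It assumes a triangulation is already given as vertex-index triples (the alpha-complex construction itself is not shown), and `star_and_link` and the sample data are hypothetical.

```python
def star_and_link(triangles, vertex):
    """Return (star, link) of `vertex` in a triangulation.

    star: all triangles that contain the vertex.
    link: for each star triangle, the edge opposite the vertex.
    """
    star = [tri for tri in triangles if vertex in tri]
    link = [tuple(sorted(v for v in tri if v != vertex)) for tri in star]
    return star, link


# Hypothetical triangulation: vertex 0 is surrounded by a fan of three
# triangles; the fourth triangle does not touch vertex 0.
triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 1), (1, 2, 4)]
star, link = star_and_link(triangles, 0)
print(star)
print(link)
```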

compute_features(qp_id)#

Compute geometric features for a given query point ID. These features include the central angles, the triangle areas, the radii of the inner circles, the radii of the circumcircles, and the ratio of the circumcircle radius to the inner circle radius.

Parameters:

qp_idint: The ID of the query point.

Returns:

pandas.DataFrame: A DataFrame containing the computed features for the given query point ID.
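
The listed quantities can be sketched for a single triangle of a vertex's star. This is an assumption-laden illustration rather than the library's code: `triangle_features` is a hypothetical helper, the central angle is taken to be the interior angle at the query point, and the output is a plain dictionary instead of a DataFrame.

```python
import math


def triangle_features(q, a, b):
    """Features of the triangle (q, a, b), where q is the query point."""
    qa, qb, ab = math.dist(q, a), math.dist(q, b), math.dist(a, b)
    s = (qa + qb + ab) / 2.0                          # semi-perimeter
    area = math.sqrt(s * (s - qa) * (s - qb) * (s - ab))  # Heron's formula
    inradius = area / s                               # inner-circle radius
    circumradius = qa * qb * ab / (4.0 * area)        # circumcircle radius
    # Angle at q from the law of cosines, clamped against rounding errors.
    cos_q = (qa**2 + qb**2 - ab**2) / (2.0 * qa * qb)
    central_angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_q))))
    return {
        "central_angle": central_angle,
        "area": area,
        "inradius": inradius,
        "circumradius": circumradius,
        "radius_ratio": circumradius / inradius,
    }


features = triangle_features((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))
print(features)
```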

class cryocat.analysis.tango.AngularScoreNN(twist_desc, num_neighbors=1)#: Bases: Filter

class cryocat.analysis.tango.AngularScoreStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the angular score between the query point and its neighbors. Works only for particles with symmetry.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the angular score between each query point and its neighbors. If the TwistDescriptor does not contain the angular score, an empty DataFrame is returned.

class cryocat.analysis.tango.AxisRot(twist_desc, max_angle, axis='z', min_angle=0.0)#: Bases: Filter

class cryocat.analysis.tango.Catalog#

Bases: object

get_all_classes(filter_contains=None, filter_exclude=None)#

Returns all class names in the catalog that are subclasses of the respective parent class.

Parameters:

filter_containsstr, optional: Only class names that contain this substring are returned. Default is None.
filter_excludestr, optional: Only class names that do not contain this substring are returned. Default is None.

Returns:

list: A list of class names that are subclasses of the parent class and match the filters.
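
The filtering behavior can be sketched as follows. This is a standalone illustration under assumptions, not the Catalog code: the parent class and its subclasses below are empty stand-ins that only borrow names from this page, and the free function `get_all_classes` takes the parent class as an explicit argument.

```python
class Feature:
    """Hypothetical stand-in for a catalog's parent class."""


class NNCountTwist(Feature):
    pass


class CountSHOT(Feature):
    pass


class EulerCharAlphaComplex(Feature):
    pass


def get_all_classes(parent, filter_contains=None, filter_exclude=None):
    """Names of direct subclasses of `parent`, optionally filtered by substring."""
    names = sorted(cls.__name__ for cls in parent.__subclasses__())
    if filter_contains is not None:
        names = [name for name in names if filter_contains in name]
    if filter_exclude is not None:
        names = [name for name in names if filter_exclude not in name]
    return names


print(get_all_classes(Feature, filter_contains="Twist"))
print(get_all_classes(Feature, filter_exclude="SHOT"))
```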

class cryocat.analysis.tango.CentralAngleStatsAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the central angles of a vertex’s star.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the central angles for each query point.

class cryocat.analysis.tango.CentralAngleStatsPLComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the central angles of a vertex’s star.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the central angles for each query point.

class cryocat.analysis.tango.Cone(twist_desc, cone_height, cone_radius, axis=None, mode='position')#: Bases: Support

class cryocat.analysis.tango.CountSHOT(assoc_desc)#

Bases: Feature

compute()#

Compute the number of occurrences of each (cone_id, shell_id) combination.

Returns:

pandas.DataFrame: A DataFrame containing the counts of each (cone_id, shell_id) combination for each query point.
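
Counting (cone_id, shell_id) combinations per query point amounts to building a small histogram. A minimal sketch with the standard library, assuming the bin assignments are available as (qp_id, cone_id, shell_id) tuples; the helper `count_bins` and the sample rows are hypothetical, and the result is a dictionary rather than a DataFrame.

```python
from collections import Counter


def count_bins(rows):
    """Map each qp_id to a Counter over its (cone_id, shell_id) bins."""
    counts = {}
    for qp_id, cone_id, shell_id in rows:
        counts.setdefault(qp_id, Counter())[(cone_id, shell_id)] += 1
    return counts


# Hypothetical assignments: (qp_id, cone_id, shell_id) per neighbor.
rows = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (1, 2, 1)]
counts = count_bins(rows)
print(counts)
```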

class cryocat.analysis.tango.CustomDescriptor(twist_df, feature_list=None, feature_kwargs=None, support_class=None, support_kwargs=None)#

Bases: Descriptor

build_descriptor(feature_list, feature_kwargs, support_class=None, support_kwargs=None)#

Build the descriptor by computing the features based on the provided support and feature list. This method creates the necessary keyword arguments for the features based on the support. It then computes the features and merges them into a single DataFrame. The resulting DataFrame is stored in the desc attribute of the CustomDescriptor instance. The df attribute contains the original DataFrame. This method is a class method that creates an instance of the CustomDescriptor class.

Parameters:

feature_listlist: A list of feature classes to be computed.
feature_kwargslist: A list of dictionaries containing keyword arguments for each feature class.
support_classclass, optional: The support class used to compute the features. If None, the original support from twist descriptor is used. Default is None.
support_kwargsdict, optional: A dictionary containing keyword arguments for the support class (if specified). Default is None.

Returns:

new_desc_dfpandas.DataFrame: A data frame with the custom descriptor values.

create_additional_descriptors(support_df, feature_list, feature_kwargs)#

Create additional descriptors based on the provided feature list and support. This method generates the necessary descriptors associated with the features and based on the provided support.

Parameters:

support_dfpd.DataFrame: DataFrame with the query points from the defined support.
feature_listlist: A list of feature class names to be computed.
feature_kwargslist: A list of dictionaries containing keyword arguments for each descriptor class.

Returns:

dict: A dictionary containing computed descriptors for all features.

classmethod load(desc_df)#

Load a custom descriptor from a DataFrame. This method is a class method that creates an instance of the CustomDescriptor class.

Parameters:

desc_dfpandas.DataFrame: The DataFrame containing the custom descriptor data.

class cryocat.analysis.tango.CustomSupport(twist_desc, binary_mask)#: Bases: Support

class cryocat.analysis.tango.Cylinder(twist_desc, radius, height, axis=None, mode='position', symmetric=True)#: Bases: Support

class cryocat.analysis.tango.Descriptor#

Bases: object

build_descriptor()#

Builds a descriptor DataFrame by merging computed features based on unique ‘qp_id’ values.

This method retrieves available features from the FeatureCatalog that match the class name, instantiates each feature, computes its values, and merges the results into a single DataFrame.

Returns:

new_desc_dfpandas.DataFrame: A data frame with the descriptor values.

Notes

The method assumes that the DataFrame self.df contains a column named ‘qp_id’.
The features are filtered based on the class name, which is derived from the class of the instance.
The merging is performed using a left join on the ‘qp_id’ column.
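
The left join on ‘qp_id’ described in the notes can be sketched with pandas. The feature tables and their column names (`nn_count`, `euler_char`) are made up for illustration; only the merge pattern is the point. A query point missing from a feature table keeps its row and receives NaN for that feature.

```python
from functools import reduce

import pandas as pd

# Unique query point IDs, as they would be taken from self.df.
base = pd.DataFrame({"qp_id": [1, 2, 3]})

# Hypothetical per-feature results; the second one lacks qp_id 2.
feat_a = pd.DataFrame({"qp_id": [1, 2, 3], "nn_count": [4, 2, 5]})
feat_b = pd.DataFrame({"qp_id": [1, 3], "euler_char": [0, 1]})

# Merge every feature table onto the base with a left join on qp_id.
desc = reduce(
    lambda left, right: left.merge(right, on="qp_id", how="left"),
    [base, feat_a, feat_b],
)
print(desc)
```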

static build_descriptor_feature_map(desc_list, feat_list)#

Build a mapping between descriptors and features.

Parameters:

desc_listlist of str: List of descriptor names.
feat_listlist of str: List of feature names.

Returns:

dict: A dictionary where keys are descriptor names and values are lists of matching feature names.

static build_feature_descriptor_map(feat_list, desc_list)#

Build a mapping between features and descriptors.

Parameters:

feat_listlist of str: List of feature names.
desc_listlist of str: List of descriptor names.

Returns:

dict: A dictionary where keys are feature names and values are the corresponding descriptor names.

compute_pca(pca_components=None, feature_ids='all', nan_drop='row')#

Compute PCA on the descriptor DataFrame.

Parameters:

pca_componentsint, optional: The number of PCA components to compute. Default is None.
feature_idsstr or list, default=”all”: The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.
nan_dropstr, {“row”, “column”}: The axis to drop NaN values from. Default is “row”.

Returns:

tuple
- pca_dfpandas.DataFrame: The DataFrame containing the PCA components.
- qp_idsnumpy.ndarray: The array of query point indices corresponding to the PCA components.

filter_features(input_df, feature_ids='all')#

Filter features based on the feature_ids parameter.

Parameters:

input_dfpandas.DataFrame: The input DataFrame to filter.
feature_idsstr or list, default=”all”: The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.

Returns:

pandas.DataFrame: The filtered DataFrame containing only the specified features.

Raises:

ValueError: If feature_ids is not a valid option. If none of the provided features are in the DataFrame. If feature_ids is not a string or list.

get_important_features(pca, input_df, n_components)#

Get important features based on PCA loadings.

Parameters:

pcasklearn.decomposition.PCA: PCA object fitted to the data.
input_dfpandas.DataFrame: The input DataFrame containing the features.
n_componentsint: The number of components to consider.

Returns:

pandas.DataFrame: A DataFrame containing the feature importance scores.

k_means_clustering(n_clusters, nan_drop='row', pca_dict=None, feature_ids='all', scale_data=True)#

Perform k-means clustering on the descriptor DataFrame.

Parameters:

n_clustersint: The number of clusters to form.
nan_dropstr, {“row”, “column”}: The axis to drop NaN values from. Default is “row”.
pca_dictdict, optional: A dictionary containing PCA parameters. If None, PCA is not applied. Default is None.
feature_idsstr or list, default=”all”: The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.
scale_databool, default=True: Whether to scale the data before clustering. Default is True.

Returns:

pandas.DataFrame: A DataFrame containing the clustering results, including the cluster labels and query point IDs.

pca_analysis(variance_threshold=0.95, show_fig=True, nan_drop='row', scatter_kwargs=None, bar_kwargs=None)#

Perform PCA analysis on the descriptor DataFrame.

Parameters:

variance_thresholdfloat, default=0.95: The threshold for cumulative explained variance to determine the number of components. Default is 0.95.
show_figbool, default=True: Whether to show the figure. Default is True.
nan_dropstr, default=”row”: The axis to drop NaN values from. Default is “row”.
scatter_kwargsdict, optional: Additional arguments for the scatter plot. Default is None.
bar_kwargsdict, optional: Additional arguments for the bar plot. Default is None.

Returns:

tuple

n_componentsint
The number of components chosen based on the variance threshold.
important_featurespandas.DataFrame
A DataFrame containing the important features based on PCA loadings.
figplotly.graph_objects.Figure
The figure object containing the PCA summary plot.
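
How a cumulative-variance threshold translates into a number of components can be sketched with NumPy alone. This is an assumption-based illustration, not the method's implementation (which may scale the data or use a PCA class): `components_for_variance` is a hypothetical helper working on a plain array.

```python
import numpy as np


def components_for_variance(data, variance_threshold=0.95):
    """Smallest number of principal components reaching the threshold."""
    centered = data - data.mean(axis=0)
    # Squared singular values are proportional to the explained variance.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(explained)
    # First index whose cumulative variance reaches the threshold.
    n_components = int(np.searchsorted(cumulative, variance_threshold)) + 1
    return min(n_components, len(cumulative)), explained


# Hypothetical data: the first column carries almost all of the variance.
data = np.array([[-10.0, 0.1], [10.0, -0.1], [-10.0, -0.1], [10.0, 0.1]])
n_low, explained = components_for_variance(data, 0.95)
n_high, _ = components_for_variance(data, 0.99999)
print(n_low, n_high, explained)
```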

plot_k_means(color_column)#

Plot the k-means clustering results in 3D.

Parameters:

color_columnstr: The column name in the DataFrame to use for coloring the points.

proximity_clustering(num_connected_components=1, size_connected_components=None)#

Cluster particles based on spatial proximity. If size_connected_components is None, num_connected_components is used to determine the number of connected components to return. If size_connected_components is specified, it returns all connected components with size >= size_connected_components.

Parameters:

num_connected_componentsint, default=1: The number of connected components to return. Default is 1.
size_connected_componentsint, optional: The minimum size of the connected components to return. Default is None.

Returns:

list: A list of connected components, each represented as a subgraph of the original graph.
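
The two selection rules can be sketched on a toy point set. The sketch below is not the library's implementation: it assumes that two particles are connected when they lie within a hypothetical `radius` of each other, returns components as sets of point indices rather than subgraphs, and assumes that num_connected_components selects the largest components first.

```python
import math
from collections import deque


def connected_components(points, radius):
    """Components of the graph linking points closer than `radius`, largest first."""
    n = len(points)
    neighbors = {
        i: [j for j in range(n) if j != i and math.dist(points[i], points[j]) <= radius]
        for i in range(n)
    }
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            component.add(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def select_components(components, num_connected_components=1, size_connected_components=None):
    """Apply the count-based or the size-based selection rule."""
    if size_connected_components is None:
        return components[:num_connected_components]
    return [c for c in components if len(c) >= size_connected_components]


points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (50, 0, 0)]
components = connected_components(points, radius=1.5)
print(select_components(components))
print(select_components(components, size_connected_components=2))
```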

static remove_nans(df, axis_type='row')#

Remove NaN values from the DataFrame.

Parameters:

dfpandas.DataFrame
axis_typestr, {“row”, “column”}: Default is “row”.

Returns:

pandas.DataFrame

Raises:

ValueError: If axis_type is not “row” or “column”.
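
A minimal sketch of the two options, assuming the method is a thin wrapper around `pandas.DataFrame.dropna` (an assumption; the actual implementation is not shown on this page). The sample frame is hypothetical.

```python
import pandas as pd


def remove_nans(df, axis_type="row"):
    """Drop rows or columns that contain at least one NaN."""
    if axis_type == "row":
        return df.dropna(axis=0)
    if axis_type == "column":
        return df.dropna(axis=1)
    raise ValueError('axis_type must be "row" or "column"')


df = pd.DataFrame({"a": [1.0, float("nan"), 3.0], "b": [4.0, 5.0, 6.0]})
rows_kept = remove_nans(df, "row")        # drops the second row
columns_kept = remove_nans(df, "column")  # drops column "a"
print(rows_kept)
print(columns_kept)
```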

class cryocat.analysis.tango.DistProductStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the product of Euclidean and geodesic distance between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the product of Euclidean and geodesic distance between each query point and its neighbors.

class cryocat.analysis.tango.EuclideanDistNN(twist_desc, num_neighbors=1)#: Bases: Filter

class cryocat.analysis.tango.EuclideanDistStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the Euclidean distance between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the Euclidean distance between each query point and its neighbors.
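
The per-query-point statistics computed by the `...StatsTwist` features can be sketched as a pandas group-by. The column names `qp_id` and `euclidean_distance` appear elsewhere on this page; the data, the output column names, and the use of the sample standard deviation (pandas default, ddof=1) are assumptions.

```python
import pandas as pd

# Hypothetical twist table: one row per (query point, neighbor) pair.
twist_df = pd.DataFrame(
    {
        "qp_id": [1, 1, 1, 2, 2],
        "euclidean_distance": [2.0, 4.0, 6.0, 1.0, 3.0],
    }
)

# One row per query point with the four summary statistics.
stats = (
    twist_df.groupby("qp_id")["euclidean_distance"]
    .agg(["mean", "median", "std", "var"])
    .reset_index()
)
print(stats)
```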

class cryocat.analysis.tango.EulerCharAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the Euler characteristic of a 1-dimensional simplicial complex.

Returns:

pandas.DataFrame: A DataFrame containing the Euler characteristic for each query point.
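
For a 1-dimensional simplicial complex the Euler characteristic is the number of vertices minus the number of edges. A short sketch, assuming the complex is given as a list of distinct edges without isolated vertices (for instance the link edges of a vertex); `euler_characteristic` is a hypothetical helper.

```python
def euler_characteristic(edges):
    """Number of vertices minus number of edges of a 1-dimensional complex."""
    vertices = {v for edge in edges for v in edge}
    return len(vertices) - len(edges)


closed_link = [(1, 2), (2, 3), (3, 1)]  # a cycle
open_link = [(1, 2), (2, 3)]            # a path
print(euler_characteristic(closed_link), euler_characteristic(open_link))
```

A closed cycle yields 0 and an open path yields 1, so under these assumptions the value distinguishes the two kinds of link.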

class cryocat.analysis.tango.Feature(desc_df)#

Bases: object

compute()#

compute_stats(column_name)#

class cryocat.analysis.tango.FeatureCatalog#: Bases: Catalog

class cryocat.analysis.tango.Filter#: Bases: object

class cryocat.analysis.tango.FilterCatalog#: Bases: Catalog

class cryocat.analysis.tango.GeodesicDistNN(twist_desc, num_neighbors=1)#: Bases: Filter

class cryocat.analysis.tango.GeodesicnDistStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the geodesic distance between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the geodesic distance between each query point and its neighbors.

class cryocat.analysis.tango.LinkTypeAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute a simplicial isomorphism invariant (‘link type’) for a 1-dimensional simplicial complex.

Returns:

pandas.DataFrame: A DataFrame containing the simplicial isomorphism invariant (‘link type’) for each query point.

class cryocat.analysis.tango.MixedDistNN(twist_desc, num_neighbors=1)#: Bases: Filter

class cryocat.analysis.tango.NNCountTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the number of nearest neighbors for each query point.

Returns:

pandas.DataFrame: A DataFrame containing the number of nearest neighbors for each query point.

class cryocat.analysis.tango.PLComplexDescriptor(twist_df, build_unique_desc=True)#

Bases: Descriptor

compute_features(qp_id)#

Compute geometric features for a given query point ID.

Parameters:

qp_idint: The ID of the query point.

Returns:

pandas.DataFrame: A DataFrame containing the computed features for the given query point ID.

compute_outer_edges()#

Compute the edges making up links of the triangulated surface.

Returns:

list: A list of tuples representing the edges of the triangulated surface.

compute_triangles(qp_id)#

Compute the triangles for a given query point ID.

Parameters:

qp_idint: The ID of the query point.

class cryocat.analysis.tango.Particle(rotation, position, tomo_id=None, motl_fid=None, degrees=True, particle_id=0)#

Bases: object

add_noise(noise_level=0.05, mode='orientation', degrees=False)#

Add noise to particle based on mode.

Parameters:

noise_levelint, float, tuple, list: Controls how much the perturbed particle deviates from the input particle. Default is 0.05. A tuple or list is only to be used in mixed mode, where the first entry refers to orientational noise and the second to positional noise.
modestr, {“orientation”, “position”, “mixed”}: The notion of distance to apply: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.
degreesbool, default=False: If True, orientational noise is expressed in degrees. Otherwise: radians. Default is False.

Returns:

Particle

Raises:

ValueError: If the mode is invalid.

distance(other, mode='orientation', degrees=False)#

Compute distance between particles based on mode.

Parameters:

modestr, {“orientation”, “position”, “mixed”}: The notion of distance to apply: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.
degreesbool, default=False: If True, the angular distance is expressed in degrees. Otherwise: radians. Default is False.

Returns:

float

Raises:

ValueError: If the input is not a Particle object or if the mode is invalid.
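
The ‘orientation’ and ‘position’ distances can be sketched with NumPy on raw rotation matrices and position vectors. This is not the Particle API: the helper functions are hypothetical, rotations are assumed to be 3x3 matrices, and the ‘mixed’ product metric is left out because its exact definition is not given here.

```python
import numpy as np


def geodesic_distance(rot_a, rot_b, degrees=False):
    """Rotation angle of the relative rotation rot_a^T rot_b."""
    relative = rot_a.T @ rot_b
    # trace(R) = 1 + 2 cos(angle); clip guards against rounding errors.
    cos_angle = np.clip((np.trace(relative) - 1.0) / 2.0, -1.0, 1.0)
    angle = float(np.arccos(cos_angle))
    return float(np.degrees(angle)) if degrees else angle


def euclidean_distance(pos_a, pos_b):
    """Straight-line distance between two positions."""
    return float(np.linalg.norm(np.asarray(pos_b, float) - np.asarray(pos_a, float)))


def rot_z(angle_rad):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


angle = geodesic_distance(np.eye(3), rot_z(np.pi / 2), degrees=True)
dist = euclidean_distance((0, 0, 0), (3, 4, 0))
print(angle, dist)
```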

classmethod identity()#

Returns the identity particle. This is the particle positioned at the origin and equipped with the trivial (canonical) orientation.

Returns:

Particle

in_plane_angle(degrees=True)#

Returns the rotation angle of the in-plane portion of a rotation matrix.

Parameters:

degreesbool, default=True: If True, rotation angle is expressed in degrees. Otherwise: radians. Default is True.

Returns:

float: The rotation angle.

inv()#: Compute inverse of input particle.

classmethod random(x_range, y_range, z_range)#

Generates a random particle with a translation vector position within bounds.

Parameters:

x_rangetuple: The bounds (min_x, max_x).
y_rangetuple: The bounds (min_y, max_y).
z_rangetuple: The bounds (min_z, max_z).

Returns:

Particle

scale(scaling_factor, overwrite=True)#

Scales the translation vector associated to self.

Parameters:

scaling_factorfloat or int: The factor by which the particle position is scaled.
overwritebool, default=True: If True, the original particle is overwritten. Otherwise, a new Particle is returned. Default is True.

Returns:

If overwrite == False, Particle is returned.

Raises:

TypeError: If scaling_factor is not of type int or float.

tangent_at_identity()#

Compute tangent from identity in SE(3) pointing in direction of self.

Returns:

numpy ndarray (6,)

tangent_subspace_projection(other, mode='orientation')#

Project tangent vector at identity pointing in direction self –> other onto subspace corresponding to mode.

Parameters:

modestr, {“orientation”, “position”, “mixed”}: The notion of distance to apply: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.

Returns:

numpy ndarray (3,) or (6,) (depends on mode)

Raises:

ValueError: If the input is not a Particle object or if the mode is invalid.

twist_vector(other)#

Compute twist vector describing relative pose of input particles.

Returns:

numpy ndarray (6,)

Raises:

ValueError: If the input is not a Particle object.

class cryocat.analysis.tango.RotAngleXStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of x axis between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of x axis between each query point and its neighbors.

class cryocat.analysis.tango.RotAngleYStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of y axis between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of y axis between each query point and its neighbors.

class cryocat.analysis.tango.RotAngleZStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of z axis between the query point and its neighbors.

Returns:

pandas.DataFrame: A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of z axis between each query point and its neighbors.

class cryocat.analysis.tango.SHOTDescriptor(twist_df, cone_number=6, shell_number=1, north_pole_axis=None, build_unique_desc=True)#

Bases: Descriptor

assign_cone_ids_by_dot(cone_dirs)#

Assign each point to the cone direction with which it has the largest cosine similarity. All inputs must be normalized.

Parameters:

pointsnumpy.ndarray: The points to be assigned to cone directions.
cone_dirsnumpy.ndarray: The cone directions to which the points are assigned.

Returns:

numpy.ndarray: The indices of the cone directions to which each point is assigned.
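
Assignment by largest cosine similarity reduces to an argmax over dot products of unit vectors. A minimal NumPy sketch, written as a standalone function with a hypothetical name; unlike the method above, it normalizes its inputs defensively, and the six axis-aligned cone directions are only an example.

```python
import numpy as np


def nearest_cone_ids(points, cone_dirs):
    """Index of the cone direction with the largest cosine similarity per point."""
    points = points / np.linalg.norm(points, axis=1, keepdims=True)
    cone_dirs = cone_dirs / np.linalg.norm(cone_dirs, axis=1, keepdims=True)
    similarity = points @ cone_dirs.T  # shape (n_points, n_cones)
    return np.argmax(similarity, axis=1)


# Example directions in the order +x, -x, +y, -y, +z, -z.
cone_dirs = np.array(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]],
    dtype=float,
)
points = np.array([[0.1, 0.2, 5.0], [-3.0, 0.5, 0.5]])
cone_ids = nearest_cone_ids(points, cone_dirs)
print(cone_ids)
```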

assign_shell_and_cone_ids(qp_id, points, num_shells, num_cones, radius=1.0, north_pole_axis=None)#

Assign each point to a shell and cone direction based on its radial distance and angular position. The points are normalized to lie within a sphere of the specified radius. The cone directions are generated based on the specified number of cones and the north pole axis.

Parameters:

qp_idint: The ID of the query point.
pointsnumpy.ndarray: The points to be assigned to shells and cone directions.
num_shellsint: The number of shells to divide the spherical support into.
num_conesint: The number of cones to divide the spherical support into.
radiusfloat, default=1.0: The radius of the spherical support. Default is 1.0.
north_pole_axisnumpy.ndarray, optional: The axis corresponding to the north-pole of the subdivided support. Default is None, which uses the z-axis.

Returns:

pandas.DataFrame: A DataFrame containing the assigned shell and cone IDs for each point.

fixed_cone_directions(num_cones)#

Return evenly distributed unit vectors for small num_cones (hand-picked for symmetry). These directions include one at [0, 0, 1] and others at standard axes for low counts.

Parameters:

num_conesint: The number of cones to divide the spherical support into.

Returns:

numpy.ndarray: The fixed cone directions for the subdivided spherical support.

generate_rotated_axes(num_cones, north_pole_axis=None)#

Compute the rotated axes for the subdivided spherical support.

Parameters:

num_conesint: The number of cones to divide the spherical support into.
north_pole_axisnumpy.ndarray, optional: The axis corresponding to the north-pole of the subdivided support. Default is None, which uses the z-axis.

Returns:

numpy.ndarray: The rotated axes for the subdivided spherical support.

class cryocat.analysis.tango.Shell(twist_desc, radius_min, radius_max)#: Bases: Support

class cryocat.analysis.tango.Sphere(twist_desc, radius=None)#: Bases: Support

class cryocat.analysis.tango.Support#

Bases: object

static set_axis_and_columns(mode, axis=None)#

Set the rotation axis for the twist descriptor based on the specified mode. The axis is considered in a different subspace depending on the mode. The choice depends on the type of support that is being used.

Parameters:

modestr: The mode of the twist descriptor. Can be “orientation”, “position”, or “mixed”.
axisnumpy.ndarray, optional: The axis used in the definition of the required support. Default is None.

Returns:

tuple
- axisnumpy.ndarray: The normalized axis vector.
- columnslist: The feature IDs corresponding to the specified mode.

Raises:

ValueError: If the mode is not supported or if the axis is not a numpy array of size 3 or 6.

class cryocat.analysis.tango.SupportCatalog#: Bases: Catalog

class cryocat.analysis.tango.SymmParticle(rotation, position, tomo_id=None, motl_fid=None, particle_id=None, symm=None, custom_rot=None)#

Bases: Particle

classmethod equip_symmetry(symm, custom_rot=None)#

Equip an existing particle object with symmetry information.

Parameters:

input_particleParticle: The input particle to be equipped with symmetry information.
symmstr or int: The symmetry type. Either one of ‘tetra’, ‘octa’, ‘cube’, ‘ico’, ‘dodeca’ for platonic solids, or an integer n > 1 for the cyclic group C_n.
custom_rotnumpy ndarray (3,3) or rotation object, optional: Rotation matrix or rotation object describing the symmetry of the particle in the case of a platonic solid. Default is None.

Returns:

SymmParticle

max_dissimilarity()#

Given the input particle’s symmetry type, this function returns the associated maximum angular dissimilarity.

Returns:

float

similarity_symm(other, max=None)#

Compute angular similarity between two symmetric particles in an unambiguous manner.

Parameters:

maxfloat, optional: If not None, the maximum dissimilarity is set to the input value. This is designed to accelerate computations by computing max only once. Default is None.

Returns:

float

Raises:

ValueError: If the symmetry types of the input particles don’t match.

class cryocat.analysis.tango.Torus(twist_desc, inner_radius, outer_radius, axis=None, mode='position')#: Bases: Support

class cryocat.analysis.tango.TwistDescriptor(input_twist=None, input_motl=None, nn_radius=None, feature_id='tomo_id', symm=None, remove_qp=False, remove_duplicates=False, build_unique_desc=True)#

Bases: Descriptor

static get_all_feature_ids(symm=False)#

Return the IDs of all features available to a TwistDescriptor.

Returns:

list

static get_axis_feature_id(feature_id, axis='z')#

Get the feature ID for a specific axis.

Parameters:

feature_idstr: The base feature ID (e.g., “twist_so”).
axisstr, {“z”, “x”, “y”}: The axis to use. Default is “z”.

Returns:

str: The feature ID for the specified axis.

Raises:

ValueError: If the axis is not valid (not “x”, “y”, or “z”).
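
A sketch of the likely naming scheme, inferred only from the column names listed on this page (for example “twist_so” with axis “x” giving “twist_so_x”). The underscore concatenation is an assumption, and `axis_feature_id` is a hypothetical stand-in for the static method.

```python
def axis_feature_id(feature_id, axis="z"):
    """Append the axis suffix to a base feature ID (assumed naming scheme)."""
    if axis not in ("x", "y", "z"):
        raise ValueError(f"Invalid axis {axis!r}; expected 'x', 'y', or 'z'.")
    return f"{feature_id}_{axis}"


print(axis_feature_id("twist_so", "x"))
print(axis_feature_id("twist_so"))
```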

classmethod get_data_range(twist_desc, twist_descriptor_id=None, min_value=None, max_value=None)#

Filter a TwistDescriptor instance by a specific column and return a new instance.

Parameters:

twist_descTwistDescriptor: The original TwistDescriptor instance.
twist_descriptor_idstr, optional: The name of the column to filter on. If None, the original twist_desc will be returned. Default is None.
min_valuefloat, optional: Minimum value for filtering. Default is None.
max_valuefloat, optional: Maximum value for filtering. Default is None.

Returns:

TwistDescriptor: A new instance of TwistDescriptor with filtered data.

static get_mixed_feature_ids()#

Return the feature IDs that give access to both relative position and orientation information.

Returns:

list

static get_nn_twist_stats_within_radius(input_motl, nn_radius, feature_id='tomo_id', symm=None, remove_qp=None, remove_duplicates=False)#

Compute twist descriptor for a given input_motl within a specified radius.

Parameters:

input_motlstr or Motl: The path to the input Motl file or Motl object to be loaded.
nn_radiusfloat: The radius within which to compute the twist descriptor.
feature_idstr, {“tomo_id”,”object_id”,”class”,”geom1”,”geom2”,”geom3”,”geom4”,”geom5”}: The identifier for the motl feature which specifies the level of comparison. For instance, if “tomo_id” is specified the NN analysis will be computed at the tomogram level. If “object_id” is specified, the nearest neighbors will be searched in the objects with same “object_id”. Note that one should ensure unique numbering among all tomograms in case “tomo_id” is not set, otherwise objects from different tomograms might be incorrectly grouped together. Default is “tomo_id”.
symmint or str, optional: Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.
remove_qpbool, optional: If True, the query point is removed from the nearest neighbors in the DataFrame. Default is None.
remove_duplicatesbool, default=False: If True, duplicate entries are removed from the DataFrame. Default is False.

Returns:

pandas.DataFrame: DataFrame containing twist vectors and additional information.

static get_pos_feature_ids()#

Return the feature IDs that give access to relative position information.

Returns:

list

get_qp_twist_desc(query_particle, tomo_id=None)#

Get twist descriptor for a specific query particle from a twist dataframe.

Parameters:

query_particleint, float or Particle: The index of the particle or a Particle instance for which statistics are to be retrieved.
tomo_idint, optional: The tomogram id to filter the data. If None, data will be retrieved for all tomograms associated with the query particle. Default is None.

Returns:

filtered_twist_descTwistDescriptor: A TwistDescriptor containing the filtered statistics for the specified query particle and tomo id.

Raises:

ValueError: If query_particle is neither an instance of Particle nor an integer index.

Notes

This function assumes that twist_df contains columns named “qp_id” for particle IDs and “tomo_id” for tomogram IDs.

static get_rot_feature_ids()#

Return the feature IDs that give access to relative orientation information.

Returns:

list

static get_symm_parameters(input_motl, symm=None)#

Get symmetry parameters (symmetry type, maximum dissimilarity) for a given input_motl.

Parameters:

input_motlstr or Motl: The path to the input Motl file or Motl object to be loaded.
symmint or str, optional: Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.

Returns:

tuple
- max_dissimilarityfloat: The maximum dissimilarity for the given symmetry type.
- categorystr: The symmetry type (e.g., ‘tetrahedron’, ‘octahedron’, etc.).

get_twist_mixed_df()#

Get the relative position and orientation from the twist descriptor DataFrame. This data corresponds to the twist vectors.

Returns:

pandas.DataFrame: A DataFrame containing the relative position and orientation (twist_x, twist_y, twist_z, twist_so_x, twist_so_y, twist_so_z).

get_twist_mixed_np()#

Returns the twist vectors as a numpy array with shape (n_samples, 6). The first three columns correspond to the relative orientation as described by (twist_so_x, twist_so_y, twist_so_z), and the last three columns correspond to the relative position as described by (twist_x, twist_y, twist_z).

Returns:

numpy.ndarray: The twist vectors as a numpy array.

get_twist_pos_df()#

Get the relative position from the twist descriptor DataFrame.

Returns:

pandas.DataFrame: A DataFrame containing the relative positions (twist_x, twist_y, twist_z).

get_twist_pos_np()#

Get the relative position from the twist descriptor DataFrame as a numpy array. This data corresponds to the twist vectors.

Returns:

numpy.ndarray: A numpy array containing the relative positions (twist_x, twist_y, twist_z).

get_twist_rot_df()#

Get the relative orientation from the twist descriptor DataFrame.

Returns:

pandas.DataFrame: A DataFrame containing the relative orientations (twist_so_x, twist_so_y, twist_so_z).

get_twist_rot_np()#

Get the relative orientation from the twist descriptor DataFrame as a numpy array.

Returns:

numpy.ndarray: A numpy array containing the relative orientations (twist_so_x, twist_so_y, twist_so_z).

classmethod load(input_data)#

Create a TwistDescriptor object from input data.

Parameters:

input_datastr or TwistDescriptor: File path or TwistDescriptor object.

Raises:

ValueError: If the input data is not a valid type (TwistDescriptor or str).

static process_tomo_twist(t_nn, symm=None, symm_max_value=None, symm_category=None)#

Compute twist descriptors for a single tomogram.

Parameters:

t_nncryocat.nnana.NearestNeighbors object: Contains nearest neighbors data for a single tomogram.
symmint or str, optional: Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.
symm_max_valuefloat, optional: Maximum dissimilarity for symmetry. Default is None.
symm_categorystr, optional: Type of symmetry. Default is None.

Returns:

pandas.DataFrame: DataFrame containing twist vectors and additional information.

static read_in(input_file)#

Reads in a pandas DataFrame from a CSV or Pickle file depending on file extension.

Parameters:

input_filestr: File path. Must end with .csv or .pkl.

Raises:

ValueError: If the file type is not supported.

sort_by_distance(twist_descriptor_id='geodesic_distance_rad')#

Sort the twist descriptor DataFrame by a specific column. The feature corresponding to the twist descriptor ID is interpolated at the sorted distances.

Parameters:

twist_descriptor_idstr, {“geodesic_distance_rad”, “euclidean_distance”, “product_distance”, “twist_so_x”, “twist_so_y”, “twist_so_z”, “twist_x”, “twist_y”, “twist_z”, “qp_inplane”,”nn_inplane”}: The name of the distance column to sort by. In principle, all columns of a TwistDescriptor data frame are valid input. Default is “geodesic_distance_rad”.

Returns:

tuple
- unified_distancesnumpy.ndarray: The sorted unique distances.
- average_valuesnumpy.ndarray: The average values of the twist descriptor at the sorted distances.
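
The shape of the returned pair can be illustrated by averaging a quantity over identical distance values. This sketch only covers the grouping and averaging step and takes two plain arrays as input; the interpolation mentioned above is not reproduced, and `average_by_distance` is a hypothetical helper.

```python
import numpy as np


def average_by_distance(distances, values):
    """Sorted unique distances and the mean of `values` at each of them."""
    distances = np.asarray(distances, dtype=float).ravel()
    values = np.asarray(values, dtype=float).ravel()
    unique_distances, inverse = np.unique(distances, return_inverse=True)
    sums = np.bincount(inverse, weights=values)  # sum of values per distance
    counts = np.bincount(inverse)                # number of entries per distance
    return unique_distances, sums / counts


unified_distances, average_values = average_by_distance(
    [0.5, 0.1, 0.5, 0.3], [2.0, 10.0, 4.0, 6.0]
)
print(unified_distances, average_values)
```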

symmetry_statistics(c_range=None, plot_graph=True)#

Plot the angular scores for different C_n symmetries.

Parameters:

c_rangeint, float, or range, optional: The range of C_n symmetries to consider. If None, defaults to range(2, 10). If a single integer or float is provided, it will be converted to a range from 2 to that value. Default is None.
plot_graphbool, default=True: Whether to plot the graph. If True, the graph will be displayed. Default is True.

Returns:

figplotly.graph_objects.Figure: A Plotly figure containing the box plot of angular scores for different C_n symmetries.

twist_template_comparison(referenc_particle)#

Shift twist vectors by tangent vector corresponding to input particle. Ideally, the input particle corresponds to the relative particle pose of interest.

Parameters:

referenc_particleParticle: The reference particle to compare against.

Returns:

numpy.ndarray: The shifted twist vectors as a numpy array.

weighted_stats(position_weight, orientation_weight)#

Compute a weighted product metric for the twist vectors in the TwistDescriptor.

Parameters:

position_weightfloat: The weight for the position component of the twist vector.
orientation_weightfloat: The weight for the orientation component of the twist vector.

Returns:

TwistDescriptor: A new TwistDescriptor object containing the weighted product metric.

write_out(output_file)#

Save self.df to CSV or Pickle depending on file extension.

Parameters:

output_filestr: File path. Must end with .csv or .pkl.

Raises:

ValueError: If the file type is not supported.

cryocat.analysis.tango.convert_to_particle_list(input_motl, motl_fid=None, subset_tomo_id=None, symm=None, custom_rot=None)#

Convert a Motl object to a list of Particle objects.

Parameters:

input_motlstr or Motl: The path to the input Motl file or Motl object to be loaded.
motl_fidstr, optional: The column name in the Motl dataframe that should be used as motl_fid. If None, motl_fids will be set to None for all particles. Default is None.
subset_tomo_idint or list, optional: The tomo_id(s) to use. If None, the entire Motl will be used. Default is None.
symmint or str, optional: Specifies the symmetry of a particle. If an int is passed, cyclic (C) symmetry is assumed. See SymmParticle for more details on str specifications. If specified, a list of SymmParticle objects is created instead of a list of Particle objects. Default is None.
custom_rotnumpy.ndarray, optional: A rotation matrix for the case where the given particle symmetry does not align with the canonical options for platonic solids as defined in geom. It is not needed when symm == n > 1 and is used only if symm is specified. Default is None.

Returns:

list of Particle or SymmParticle: A list of Particle or SymmParticle objects, each containing the rotation angles, position coordinates, tomo id, motl feature id, particle id, and symmetry (for SymmParticle).

cryocat.analysis.tango.get_colormap_color(cmap, val)#

Get a color from a colormap based on a value.

Parameters:

cmapmatplotlib.colors.Colormap: The colormap to use.
valfloat: The value to map to a color. Should be in the range [0, 1].

Returns:

str: The color in hexadecimal format.