tango#

class cryocat.analysis.tango.AlphaComplexDescriptor(twist_df, alpha_param=200.0, build_unique_desc=True)#

Bases: Descriptor

compute_alpha_complex(coord)#

Compute the stars and links of vertices corresponding to an alpha complex. This computation is performed in 2D, so the input coordinates have to lie close enough to a plane.

Parameters:
coordnumpy.ndarray

The coordinates of the points in the alpha complex.

Returns:
tuple
- triangleslist

A list of triangles representing the stars of the vertices in the alpha complex.

- link_edgeslist

A list of edges representing the links of the vertices in the alpha complex.
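
To illustrate the two returned quantities: in a 2D triangulation, the star of a vertex is the set of triangles containing it, and its link is the set of edges opposite to it in those triangles. The following is a minimal sketch on a hand-written triangle list. `star_and_link` is a hypothetical helper, not part of cryocat, and the alpha-complex construction itself is not shown.

```python
def star_and_link(triangles, vertex):
    """Return (star, link) of `vertex` for a list of vertex-index triangles."""
    # Star: every triangle that contains the vertex.
    star = [tuple(t) for t in triangles if vertex in t]
    # Link: for each star triangle, the edge opposite to the vertex.
    link = [tuple(sorted(v for v in t if v != vertex)) for t in star]
    return star, link


# A fan of three triangles around vertex 0, plus one unrelated triangle
# (hypothetical example data).
triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (5, 6, 7)]
star, link = star_and_link(triangles, 0)
print(star)  # triangles incident to vertex 0
print(link)  # edges opposite to vertex 0
```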

compute_features(qp_id)#

Compute geometric features for a given query point ID. These features include the central angles, the triangle areas, the radii of the inner circles, the radii of the circumcircles, and the ratio of the circumcircle radius to the inner circle radius.

Parameters:
qp_idint

The ID of the query point.

Returns:
pandas.DataFrame

A DataFrame containing the computed features for the given query point ID.
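
The listed features follow from elementary triangle geometry. The sketch below computes them for a single triangle with the query point as one vertex, using the law of cosines, Heron's formula, r = A / s for the inscribed circle and R = abc / (4A) for the circumcircle. The function name and the dictionary keys are assumptions made for this example and may differ from the columns cryocat produces.

```python
import math


def triangle_features(p, a, b):
    """Features of the triangle (p, a, b), where p is the query point (2D)."""
    # Side lengths.
    pa = math.dist(p, a)
    pb = math.dist(p, b)
    ab = math.dist(a, b)
    # Central angle at p from the law of cosines (clamped for safety).
    cos_angle = (pa**2 + pb**2 - ab**2) / (2 * pa * pb)
    central_angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    # Area from Heron's formula, with s the semi-perimeter.
    s = (pa + pb + ab) / 2
    area = math.sqrt(s * (s - pa) * (s - pb) * (s - ab))
    inradius = area / s
    circumradius = pa * pb * ab / (4 * area)
    return {
        "central_angle": central_angle,
        "area": area,
        "inradius": inradius,
        "circumradius": circumradius,
        "radius_ratio": circumradius / inradius,
    }


# 3-4-5 right triangle with the right angle at the query point.
features = triangle_features((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))
print(features)
```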

class cryocat.analysis.tango.AngularScoreNN(twist_desc, num_neighbors=1)#

Bases: Filter

class cryocat.analysis.tango.AngularScoreStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the angular score between the query point and its neighbors. Works only for particles with symmetry.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the angular score between each query point and its neighbors. If the TwistDescriptor does not contain the angular score, an empty DataFrame is returned.

class cryocat.analysis.tango.AxisRot(twist_desc, max_angle, axis='z', min_angle=0.0)#

Bases: Filter

class cryocat.analysis.tango.Catalog#

Bases: object

get_all_classes(filter_contains=None, filter_exclude=None)#

Returns all class names in the catalog that are subclasses of the respective parent class.

Parameters:
filter_containsstr, optional

A string to filter the class names that contain this substring. Default is None.

filter_excludestr, optional

A string used to exclude class names that contain this substring. Default is None.

Returns:
list

A list of class names that are subclasses of the parent class and match the filters.
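
The filtering behaviour described above can be sketched with Python's built-in `__subclasses__()`. This is an illustration only: the stand-in classes and the free function `list_subclass_names` are hypothetical, only direct subclasses are listed, and the actual Catalog implementation may collect classes differently.

```python
class Feature:
    """Stand-in parent class for this sketch."""


class NNCountTwist(Feature):
    pass


class CountSHOT(Feature):
    pass


class EulerCharAlphaComplex(Feature):
    pass


def list_subclass_names(parent, filter_contains=None, filter_exclude=None):
    """Names of direct subclasses of `parent`, optionally filtered by substring."""
    names = [cls.__name__ for cls in parent.__subclasses__()]
    if filter_contains is not None:
        # Keep only names that contain the substring.
        names = [n for n in names if filter_contains in n]
    if filter_exclude is not None:
        # Drop names that contain the substring.
        names = [n for n in names if filter_exclude not in n]
    return names


twist_features = list_subclass_names(Feature, filter_contains="Twist")
non_shot_features = list_subclass_names(Feature, filter_exclude="SHOT")
print(twist_features)
print(non_shot_features)
```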

class cryocat.analysis.tango.CentralAngleStatsAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the central angles of a vertex’s star.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the central angles for each query point.

class cryocat.analysis.tango.CentralAngleStatsPLComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the central angles of a vertex’s star.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the central angles for each query point.

class cryocat.analysis.tango.Cone(twist_desc, cone_height, cone_radius, axis=None, mode='position')#

Bases: Support

class cryocat.analysis.tango.CountSHOT(assoc_desc)#

Bases: Feature

compute()#

Compute the number of occurrences of each (cone_id, shell_id) combination.

Returns:
pandas.DataFrame

A DataFrame containing the counts of each (cone_id, shell_id) combination for each query point.
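
Conceptually, this feature is a histogram over (cone_id, shell_id) bins per query point. A minimal sketch with the standard library, using made-up assignments (the real method works on the descriptor DataFrame and returns a DataFrame):

```python
from collections import Counter

# Hypothetical neighbor assignments: one (qp_id, cone_id, shell_id) per neighbor.
assignments = [
    (0, 1, 0), (0, 1, 0), (0, 3, 0),
    (1, 2, 0), (1, 2, 1),
]

# Count how often each bin occurs for each query point.
counts = Counter(assignments)
for (qp_id, cone_id, shell_id), n in sorted(counts.items()):
    print(f"qp_id={qp_id} cone_id={cone_id} shell_id={shell_id}: {n}")
```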

class cryocat.analysis.tango.CustomDescriptor(twist_df, feature_list=None, feature_kwargs=None, support_class=None, support_kwargs=None)#

Bases: Descriptor

build_descriptor(feature_list, feature_kwargs, support_class=None, support_kwargs=None)#

Build the descriptor by computing the features based on the provided support and feature list. This method creates the necessary keyword arguments for the features based on the support, computes the features, and merges them into a single DataFrame. The resulting DataFrame is stored in the desc attribute of the CustomDescriptor instance; the df attribute contains the original DataFrame.

Parameters:
feature_listlist

A list of feature classes to be computed.

feature_kwargslist

A list of dictionaries containing keyword arguments for each feature class.

support_classclass, optional

The support class used to compute the features. If None, the original support from twist descriptor is used. Default is None.

support_kwargsdict, optional

A dictionary containing keyword arguments for the support class (if specified). Default is None.

Returns:
new_desc_dfpandas.DataFrame

A data frame with the custom descriptor values.

create_additional_descriptors(support_df, feature_list, feature_kwargs)#

Create additional descriptors based on the provided feature list and support. This method generates the necessary descriptors associated with the features and based on the provided support.

Parameters:
support_dfpd.DataFrame

DataFrame with the query points (qp) from the defined support.

feature_listlist

A list of feature class names to be computed.

feature_kwargslist

A list of dictionaries containing keyword arguments for each descriptor class.

Returns:
dict

A dictionary containing computed descriptors for all features.

classmethod load(desc_df)#

Load a custom descriptor from a DataFrame. This method is a class method that creates an instance of the CustomDescriptor class.

Parameters:
desc_dfpandas.DataFrame

The DataFrame containing the custom descriptor data.

class cryocat.analysis.tango.CustomSupport(twist_desc, binary_mask)#

Bases: Support

class cryocat.analysis.tango.Cylinder(twist_desc, radius, height, axis=None, mode='position', symmetric=True)#

Bases: Support

class cryocat.analysis.tango.Descriptor#

Bases: object

build_descriptor()#

Builds a descriptor DataFrame by merging computed features based on unique ‘qp_id’ values.

This method retrieves available features from the FeatureCatalog that match the class name, instantiates each feature, computes its values, and merges the results into a single DataFrame.

Returns:
new_desc_dfpandas.DataFrame

A data frame with the descriptor values.

Notes

  • The method assumes that the DataFrame self.df contains a column named ‘qp_id’.

  • The features are filtered based on the class name, which is derived from the class of the instance.

  • The merging is performed using a left join on the ‘qp_id’ column.
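
The left join on ‘qp_id’ mentioned in the notes can be sketched with pandas as follows. The feature column names (`nn_count`, `euler_char`) and the data are invented for the example; a query point missing from a feature table simply receives NaN in that feature's column.

```python
from functools import reduce

import pandas as pd

# Unique query point IDs, as they would be taken from self.df.
base = pd.DataFrame({"qp_id": [0, 1, 2]})

# Hypothetical per-feature results; feat_b has no entry for qp_id 1.
feat_a = pd.DataFrame({"qp_id": [0, 1, 2], "nn_count": [4, 6, 5]})
feat_b = pd.DataFrame({"qp_id": [0, 2], "euler_char": [0, 1]})

# Merge all feature tables onto the base table, one left join at a time.
desc = reduce(
    lambda left, right: left.merge(right, on="qp_id", how="left"),
    [base, feat_a, feat_b],
)
print(desc)
```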

static build_descriptor_feature_map(desc_list, feat_list)#

Build a mapping between descriptors and features.

Parameters:
desc_listlist of str

List of descriptor names.

feat_listlist of str

List of feature names.

Returns:
dict

A dictionary where keys are descriptor names and values are lists of matching feature names.
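
One plausible way to build such a mapping is to match names by suffix, since the feature classes on this page end in the descriptor's base name (for example, `NNCountTwist` and `TwistDescriptor`). The sketch below relies on that naming convention, which is an assumption; `sketch_descriptor_feature_map` is a hypothetical function and the library may match names differently.

```python
def sketch_descriptor_feature_map(desc_list, feat_list):
    """Map each descriptor name to the feature names sharing its base name."""
    suffix = "Descriptor"
    mapping = {}
    for desc in desc_list:
        # "TwistDescriptor" -> "Twist"
        base = desc[: -len(suffix)] if desc.endswith(suffix) else desc
        mapping[desc] = [feat for feat in feat_list if feat.endswith(base)]
    return mapping


desc_names = ["TwistDescriptor", "SHOTDescriptor"]
feat_names = ["NNCountTwist", "CountSHOT", "EuclideanDistStatsTwist"]
feature_map = sketch_descriptor_feature_map(desc_names, feat_names)
print(feature_map)
```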

static build_feature_descriptor_map(feat_list, desc_list)#

Build a mapping between features and descriptors.

Parameters:
feat_listlist of str

List of feature names.

desc_listlist of str

List of descriptor names.

Returns:
dict

A dictionary where keys are feature names and values are the corresponding descriptor names.

compute_pca(pca_components=None, feature_ids='all', nan_drop='row')#

Compute PCA on the descriptor DataFrame.

Parameters:
pca_componentsint, optional

The number of PCA components to compute. Default is None.

feature_idsstr or list, default=”all”

The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.

nan_dropstr, {“row”, “column”}

The axis to drop NaN values from. Default is “row”.

Returns:
tuple
- pca_dfpandas.DataFrame

The DataFrame containing the PCA components.

- qp_idsnumpy.ndarray

The array of query point indices corresponding to the PCA components.
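
The steps implied here (drop NaN rows, keep track of the surviving query point IDs, project onto the leading principal components) can be sketched with NumPy alone. This is not the library's implementation: the signature shown above suggests scikit-learn is used internally, no feature scaling is applied in this sketch, and `pca_rows` is a hypothetical name.

```python
import numpy as np


def pca_rows(X, qp_ids, n_components):
    """Project rows of X onto the first principal components.

    Rows containing NaN are dropped, and the matching qp_ids are returned.
    """
    X = np.asarray(X, dtype=float)
    qp_ids = np.asarray(qp_ids)
    keep = ~np.isnan(X).any(axis=1)
    X, qp_ids = X[keep], qp_ids[keep]
    # Center the data; the rows of vt are the principal axes.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T, qp_ids


# Hypothetical descriptor values; the third row contains a NaN.
X = [
    [1.0, 2.0, 0.0],
    [2.0, 4.1, 0.0],
    [np.nan, 1.0, 1.0],
    [3.0, 5.9, 0.0],
    [4.0, 8.0, 0.0],
]
scores, kept_ids = pca_rows(X, qp_ids=[10, 11, 12, 13, 14], n_components=2)
print(scores.shape, kept_ids)
```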

filter_features(input_df, feature_ids='all')#

Filter features based on the feature_ids parameter.

Parameters:
input_dfpandas.DataFrame

The input DataFrame to filter.

feature_idsstr or list, default=”all”

The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.

Returns:
pandas.DataFrame

The filtered DataFrame containing only the specified features.

Raises:
ValueError

If feature_ids is not a valid option. If none of the provided features are in the DataFrame. If feature_ids is not a string or list.

get_important_features(pca, input_df, n_components)#

Get important features based on PCA loadings.

Parameters:
pcasklearn.decomposition.PCA

PCA object fitted to the data.

input_dfpandas.DataFrame

The input DataFrame containing the features.

n_componentsint

The number of components to consider.

Returns:
pandas.DataFrame

A DataFrame containing the feature importance scores.

k_means_clustering(n_clusters, nan_drop='row', pca_dict=None, feature_ids='all', scale_data=True)#

Perform k-means clustering on the descriptor DataFrame.

Parameters:
n_clustersint

The number of clusters to form.

nan_dropstr, {“row”, “column”}

The axis to drop NaN values from. Default is “row”.

pca_dictdict, optional

A dictionary containing PCA parameters. If None, PCA is not applied. Default is None.

feature_idsstr or list, default=”all”

The feature IDs to filter by. Can be “all” or a list of feature names corresponding to columns from input_df. Default is “all”.

scale_databool, default=True

Whether to scale the data before clustering. Default is True.

Returns:
pandas.DataFrame

A DataFrame containing the clustering results, including the cluster labels and query point IDs.

pca_analysis(variance_threshold=0.95, show_fig=True, nan_drop='row', scatter_kwargs=None, bar_kwargs=None)#

Perform PCA analysis on the descriptor DataFrame.

Parameters:
variance_thresholdfloat, default=0.95

The threshold for cumulative explained variance to determine the number of components. Default is 0.95.

show_figbool, default=True

Whether to show the figure. Default is True.

nan_dropstr, default=”row”

The axis to drop NaN values from. Default is “row”.

scatter_kwargsdict, optional

Additional arguments for the scatter plot. Default is None.

bar_kwargsdict, optional

Additional arguments for the bar plot. Default is None.

Returns:
tuple
  • n_componentsint

    The number of components chosen based on the variance threshold.

  • important_featurespandas.DataFrame

    A DataFrame containing the important features based on PCA loadings.

  • figplotly.graph_objects.Figure

    The figure object containing the PCA summary plot.
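
Choosing the number of components from a cumulative explained-variance threshold can be sketched as below. The input is assumed to be the per-component explained variance ratio of a fitted PCA; whether the library treats the threshold as inclusive is not stated here, so the boundary handling is an assumption.

```python
import numpy as np


def n_components_for(explained_variance_ratio, variance_threshold=0.95):
    """Smallest number of components whose cumulative variance reaches the threshold."""
    cumulative = np.cumsum(explained_variance_ratio)
    # Index of the first cumulative value >= threshold, converted to a count.
    n = int(np.searchsorted(cumulative, variance_threshold)) + 1
    # Never report more components than are available.
    return min(n, len(cumulative))


ratios = [0.7, 0.2, 0.07, 0.03]  # hypothetical explained variance ratios
n_default = n_components_for(ratios)
n_half = n_components_for(ratios, variance_threshold=0.5)
print(n_default, n_half)
```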

plot_k_means(color_column)#

Plot the k-means clustering results in 3D.

Parameters:
color_columnstr

The column name in the DataFrame to use for coloring the points.

proximity_clustering(num_connected_components=1, size_connected_components=None)#

Cluster particles based on spatial proximity. If size_connected_components is None, num_connected_components is used to determine the number of connected components to return. If size_connected_components is specified, it returns all connected components with size >= size_connected_components.

Parameters:
num_connected_componentsint, default=1

The number of connected components to return. Default is 1.

size_connected_componentsint, optional

The minimum size of the connected components to return. Default is None.

Returns:
list

A list of connected components, each represented as a subgraph of the original graph.
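
The selection logic for the two parameters can be sketched as follows. The graph construction is an assumption (two points are connected if they lie within a fixed `radius`), components are returned as sorted index lists rather than subgraphs, and the sketch assumes the largest components are preferred when `num_connected_components` is used. Both function names are hypothetical.

```python
import math
from collections import deque


def connected_components(points, radius):
    """Connected components of the graph linking points closer than `radius`."""
    unvisited = set(range(len(points)))
    components = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        queue = deque([start])
        component = [start]
        # Breadth-first search from the start point.
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                component.append(j)
        components.append(sorted(component))
    # Largest components first.
    return sorted(components, key=len, reverse=True)


def select_components(components, num_connected_components=1,
                      size_connected_components=None):
    """Apply the two selection rules described above."""
    if size_connected_components is None:
        return components[:num_connected_components]
    return [c for c in components if len(c) >= size_connected_components]


points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (50, 0, 0)]
components = connected_components(points, radius=1.5)
largest = select_components(components)
at_least_two = select_components(components, size_connected_components=2)
print(largest, at_least_two)
```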

static remove_nans(df, axis_type='row')#

Remove NaN values from the DataFrame.

Parameters:
dfpandas.DataFrame
axis_typestr, {“row”, “column”}

Default is “row”.

Returns:
pandas.DataFrame
Raises:
ValueError

If axis_type is not “row” or “column”.
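
The two options correspond to dropping along different axes in pandas. A minimal sketch of the described behaviour, assuming `DataFrame.dropna` is an adequate stand-in for the library's implementation (`drop_nans_sketch` is a hypothetical name):

```python
import numpy as np
import pandas as pd


def drop_nans_sketch(df, axis_type="row"):
    """Drop rows or columns containing NaN, mirroring the documented options."""
    if axis_type == "row":
        return df.dropna(axis=0)
    if axis_type == "column":
        return df.dropna(axis=1)
    raise ValueError('axis_type must be "row" or "column"')


df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})
rows_kept = drop_nans_sketch(df, "row")
cols_kept = drop_nans_sketch(df, "column")
print(rows_kept)
print(cols_kept)
```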

class cryocat.analysis.tango.DistProductStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the product of Euclidean and geodesic distance between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the product of Euclidean and geodesic distance between each query point and its neighbors.

class cryocat.analysis.tango.EuclideanDistNN(twist_desc, num_neighbors=1)#

Bases: Filter

class cryocat.analysis.tango.EuclideanDistStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the Euclidean distance between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the Euclidean distance between each query point and its neighbors.
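
The per-query-point statistics shared by the *Stats* features can be sketched with the standard library. The data are invented, and the sketch uses the sample convention (ddof=1) for standard deviation and variance, which is an assumption about the library's choice.

```python
import statistics
from collections import defaultdict

# Hypothetical (qp_id, euclidean_distance) pairs, one per neighbor.
rows = [(0, 2.0), (0, 4.0), (0, 6.0), (1, 1.0), (1, 3.0)]

# Group the neighbor distances by query point.
by_qp = defaultdict(list)
for qp_id, dist in rows:
    by_qp[qp_id].append(dist)

stats = {
    qp_id: {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),     # sample standard deviation
        "var": statistics.variance(values),  # sample variance
    }
    for qp_id, values in by_qp.items()
}
print(stats)
```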

class cryocat.analysis.tango.EulerCharAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute the Euler characteristic of a 1-dimensional simplicial complex.

Returns:
pandas.DataFrame

A DataFrame containing the Euler characteristic for each query point.
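
For a 1-dimensional simplicial complex, the Euler characteristic is the number of vertices minus the number of edges. The sketch below computes it from an edge list such as the link of a vertex. Isolated vertices cannot be represented in an edge list and are ignored here, and `euler_characteristic` is a hypothetical helper.

```python
def euler_characteristic(edges):
    """chi = V - E for a 1-dimensional simplicial complex given by its edges."""
    vertices = {v for edge in edges for v in edge}
    unique_edges = {frozenset(edge) for edge in edges}
    return len(vertices) - len(unique_edges)


closed_link = [(1, 2), (2, 3), (3, 4), (4, 1)]  # a cycle (interior vertex)
open_link = [(1, 2), (2, 3), (3, 4)]            # a path (boundary vertex)
chi_closed = euler_characteristic(closed_link)
chi_open = euler_characteristic(open_link)
print(chi_closed, chi_open)
```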

class cryocat.analysis.tango.Feature(desc_df)#

Bases: object

compute()#
compute_stats(column_name)#
class cryocat.analysis.tango.FeatureCatalog#

Bases: Catalog

class cryocat.analysis.tango.Filter#

Bases: object

class cryocat.analysis.tango.FilterCatalog#

Bases: Catalog

class cryocat.analysis.tango.GeodesicDistNN(twist_desc, num_neighbors=1)#

Bases: Filter

class cryocat.analysis.tango.GeodesicnDistStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the geodesic distance between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the geodesic distance between each query point and its neighbors.

class cryocat.analysis.tango.LinkTypeAlphaComplex(assoc_desc)#

Bases: Feature

compute()#

Compute a simplicial isomorphism invariant (‘link type’) for a 1-dimensional simplicial complex.

Returns:
pandas.DataFrame

A DataFrame containing the simplicial isomorphism invariant (‘link type’) for each query point.

class cryocat.analysis.tango.MixedDistNN(twist_desc, num_neighbors=1)#

Bases: Filter

class cryocat.analysis.tango.NNCountTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the number of nearest neighbors for each query point.

Returns:
pandas.DataFrame

A DataFrame containing the number of nearest neighbors for each query point.

class cryocat.analysis.tango.PLComplexDescriptor(twist_df, build_unique_desc=True)#

Bases: Descriptor

compute_features(qp_id)#

Compute geometric features for a given query point ID.

Parameters:
qp_idint

The ID of the query point.

Returns:
pandas.DataFrame

A DataFrame containing the computed features for the given query point ID.

compute_outer_edges()#

Compute the edges making up links of the triangulated surface.

Returns:
list

A list of tuples representing the edges of the triangulated surface.

compute_triangles(qp_id)#

Compute the triangles for a given query point ID.

Parameters:
qp_idint

The ID of the query point.

class cryocat.analysis.tango.Particle(rotation, position, tomo_id=None, motl_fid=None, degrees=True, particle_id=0)#

Bases: object

add_noise(noise_level=0.05, mode='orientation', degrees=False)#

Add noise to particle based on mode.

Parameters:
noise_levelint, float, tuple, list

Controls how much the perturbed particle deviates from the input particle. Defaults to 0.05. A tuple or list is only to be used in mixed mode, where the first entry refers to orientational noise and the second to positional noise.

modestr, {“orientation”, “position”, “mixed”}

The mode refers to the notion of distance that is to be applied: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.

degreesbool, default=False

If True, orientational noise is expressed in degrees. Otherwise: radians. Default is False.

Returns:
Particle
Raises:
ValueError

If the mode is invalid.

distance(other, mode='orientation', degrees=False)#

Compute distance between particles based on mode.

Parameters:
modestr, {“orientation”, “position”, “mixed”}

The mode refers to the notion of distance that is to be applied: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.

degreesbool, default=False

If True, angular distance is expressed in degrees. Otherwise: radians. Default is False.

Returns:
float
Raises:
ValueError

If the input is not a Particle object or if the mode is invalid.
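
The “orientation” and “position” modes correspond to two standard distances, sketched below with NumPy on plain rotation matrices and position vectors instead of Particle objects. The function names are hypothetical, and the “mixed” product metric is left out because its exact form is not specified here.

```python
import numpy as np


def geodesic_distance(rot_a, rot_b, degrees=False):
    """Rotation angle of the relative rotation rot_a^T rot_b."""
    rel = rot_a.T @ rot_b
    cos_angle = np.clip((np.trace(rel) - 1.0) / 2.0, -1.0, 1.0)
    angle = float(np.arccos(cos_angle))
    return float(np.degrees(angle)) if degrees else angle


def euclidean_distance(pos_a, pos_b):
    """Straight-line distance between two positions."""
    diff = np.asarray(pos_b, dtype=float) - np.asarray(pos_a, dtype=float)
    return float(np.linalg.norm(diff))


def rot_z(angle_rad):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


angle_deg = geodesic_distance(np.eye(3), rot_z(np.pi / 2), degrees=True)
dist = euclidean_distance((0, 0, 0), (3, 4, 0))
print(angle_deg, dist)
```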

classmethod identity()#

Returns the identity particle. This is the particle positioned at the origin and equipped with the trivial (canonical) orientation.

Returns:
Particle
in_plane_angle(degrees=True)#

Returns the rotation angle of the in-plane portion of a rotation matrix.

Parameters:
degreesbool, default=True

If True, rotation angle is expressed in degrees. Otherwise: radians. Default is True.

Returns:
float

The rotation angle.

inv()#

Compute inverse of input particle.

classmethod random(x_range, y_range, z_range)#

Generates a random particle with a translation vector position within bounds.

Parameters:
- x_range: Tuple of (min_x, max_x)
- y_range: Tuple of (min_y, max_y)
- z_range: Tuple of (min_z, max_z)
Returns:
Particle
scale(scaling_factor, overwrite=True)#

Scales the translation vector associated with self.

Parameters:
scaling_factorfloat or int

The factor by which the particle position is scaled.

overwritebool, default=True

If True, the original particle is overwritten. Otherwise, a new Particle is returned.

Returns:
If overwrite == False, Particle is returned.
Raises:
TypeError

If scaling_factor is not of type int or float.

tangent_at_identity()#

Compute tangent from identity in SE(3) pointing in direction of self.

Returns:
numpy ndarray (6,)
tangent_subspace_projection(other, mode='orientation')#

Project tangent vector at identity pointing in direction self –> other onto subspace corresponding to mode.

Parameters:
modestr, {“orientation”, “position”, “mixed”}

The mode refers to the notion of distance that is to be applied: ‘orientation’ for geodesic (angular) distance between particle orientations, ‘position’ for Euclidean distance between physical particle positions, ‘mixed’ for the product metric. Default is “orientation”.

Returns:
numpy ndarray (3,) or (6,) (depends on mode)
Raises:
ValueError

If the input is not a Particle object or if the mode is invalid.

twist_vector(other)#

Compute twist vector describing relative pose of input particles.

Returns:
numpy ndarray (6,)
Raises:
ValueError

If the input is not a Particle object.
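
A 6-vector describing a relative pose can be sketched as a rotation vector (the logarithm of the relative rotation) followed by the relative position expressed in the frame of the first particle. This is a simplified convention chosen for illustration: the exact SE(3) logarithm also couples translation with rotation, the component order follows the one described for get_twist_mixed_np, and the library's twist_vector may differ in both respects. All function names are hypothetical.

```python
import numpy as np


def rotation_log(rot):
    """Rotation vector (axis * angle) of a rotation matrix.

    Valid away from rotation angles of pi, where the axis formula degenerates.
    """
    cos_angle = np.clip((np.trace(rot) - 1.0) / 2.0, -1.0, 1.0)
    angle = np.arccos(cos_angle)
    if angle < 1e-12:
        return np.zeros(3)
    axis = np.array([
        rot[2, 1] - rot[1, 2],
        rot[0, 2] - rot[2, 0],
        rot[1, 0] - rot[0, 1],
    ]) / (2.0 * np.sin(angle))
    return angle * axis


def relative_pose_vector(rot_a, pos_a, rot_b, pos_b):
    """Relative orientation (3,) followed by relative position (3,)."""
    rel_rot = rot_a.T @ rot_b
    rel_pos = rot_a.T @ (np.asarray(pos_b, float) - np.asarray(pos_a, float))
    return np.concatenate([rotation_log(rel_rot), rel_pos])


# Particle a: identity pose. Particle b: 90 degrees about z, shifted.
quarter_turn_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
twist = relative_pose_vector(np.eye(3), (0, 0, 0), quarter_turn_z, (1, 2, 3))
print(twist)
```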

class cryocat.analysis.tango.RotAngleXStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of x axis between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of x axis between each query point and its neighbors.

class cryocat.analysis.tango.RotAngleYStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of y axis between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of y axis between each query point and its neighbors.

class cryocat.analysis.tango.RotAngleZStatsTwist(assoc_desc)#

Bases: Feature

compute()#

Compute the mean, median, standard deviation, and variance of the rotation angle of z axis between the query point and its neighbors.

Returns:
pandas.DataFrame

A DataFrame containing the mean, median, standard deviation, and variance of the rotation angle of z axis between each query point and its neighbors.

class cryocat.analysis.tango.SHOTDescriptor(twist_df, cone_number=6, shell_number=1, north_pole_axis=None, build_unique_desc=True)#

Bases: Descriptor

assign_cone_ids_by_dot(cone_dirs)#

Assign each point to the cone direction with which it has the largest cosine similarity. All inputs must be normalized.

Parameters:
pointsnumpy.ndarray

The points to be assigned to cone directions.

cone_dirsnumpy.ndarray

The cone directions to which the points are assigned.

Returns:
numpy.ndarray

The indices of the cone directions to which each point is assigned.
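
For unit vectors, the largest cosine similarity is the largest dot product, so the assignment reduces to an argmax over a matrix product. A minimal NumPy sketch follows. The six axis-aligned directions and their order are example values, and the sketch normalizes its inputs itself rather than requiring normalized input.

```python
import numpy as np


def assign_cone_ids(points, cone_dirs):
    """Index of the cone direction with the largest cosine similarity per point."""
    points = np.asarray(points, dtype=float)
    cone_dirs = np.asarray(cone_dirs, dtype=float)
    # Normalize so that dot products equal cosine similarities.
    points = points / np.linalg.norm(points, axis=1, keepdims=True)
    cone_dirs = cone_dirs / np.linalg.norm(cone_dirs, axis=1, keepdims=True)
    return np.argmax(points @ cone_dirs.T, axis=1)


# Example directions along +z, -z, +x, -x, +y, -y (order chosen for this sketch).
cone_dirs = [[0, 0, 1], [0, 0, -1], [1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]]
points = [[0.1, 0.2, 0.9], [-2.0, 0.1, 0.3], [0.2, -5.0, 1.0]]
cone_ids = assign_cone_ids(points, cone_dirs)
print(cone_ids)
```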

assign_shell_and_cone_ids(qp_id, points, num_shells, num_cones, radius=1.0, north_pole_axis=None)#

Assign each point to a shell and cone direction based on its radial distance and angular position. The points are normalized to lie within a sphere of the specified radius. The cone directions are generated based on the specified number of cones and the north pole axis.

Parameters:
qp_idint

The ID of the query point.

pointsnumpy.ndarray

The points to be assigned to shells and cone directions.

num_shellsint

The number of shells to divide the spherical support into.

num_conesint

The number of cones to divide the spherical support into.

radiusfloat, default=1.0

The radius of the spherical support. Default is 1.0.

north_pole_axisnumpy.ndarray, optional

The axis corresponding to the north-pole of the subdivided support. Default is None, which uses the z-axis.

Returns:
pandas.DataFrame

A DataFrame containing the assigned shell and cone IDs for each point.
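
The radial part of the assignment can be sketched by binning each point's distance from the origin. The sketch assumes shells of equal width between 0 and `radius`, with points on or beyond the outer boundary placed in the last shell; the library may define shell boundaries differently. `assign_shell_ids` is a hypothetical helper.

```python
import numpy as np


def assign_shell_ids(points, num_shells, radius=1.0):
    """Equal-width radial bin index in [0, num_shells - 1] for each point."""
    r = np.linalg.norm(np.asarray(points, dtype=float), axis=1)
    shell_ids = np.floor(r / radius * num_shells).astype(int)
    # Points on or outside the outer boundary go to the last shell.
    return np.clip(shell_ids, 0, num_shells - 1)


points = [[0.1, 0.0, 0.0], [0.0, 0.6, 0.0], [0.0, 0.0, 1.0]]
shell_ids = assign_shell_ids(points, num_shells=2, radius=1.0)
print(shell_ids)
```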

fixed_cone_directions(num_cones)#

Return evenly distributed unit vectors for small num_cones (hand-picked for symmetry). These directions include one at [0, 0, 1] and others at standard axes for low counts.

Parameters:
num_conesint

The number of cones to divide the spherical support into.

Returns:
numpy.ndarray

The fixed cone directions for the subdivided spherical support.

generate_rotated_axes(num_cones, north_pole_axis=None)#

Compute the rotated axes for the subdivided spherical support.

Parameters:
num_conesint

The number of cones to divide the spherical support into.

north_pole_axisnumpy.ndarray, optional

The axis corresponding to the north-pole of the subdivided support. Default is None, which uses the z-axis.

Returns:
numpy.ndarray

The rotated axes for the subdivided spherical support.

class cryocat.analysis.tango.Shell(twist_desc, radius_min, radius_max)#

Bases: Support

class cryocat.analysis.tango.Sphere(twist_desc, radius=None)#

Bases: Support

class cryocat.analysis.tango.Support#

Bases: object

static set_axis_and_columns(mode, axis=None)#

Set the rotation axis for the twist descriptor based on the specified mode. The axis is considered in a different subspace depending on the mode. The choice depends on the type of support that is being used.

Parameters:
modestr

The mode of the twist descriptor. Can be “orientation”, “position”, or “mixed”.

axisnumpy.ndarray, optional

The axis used in the definition of the required support. Default is None.

Returns:
tuple
- axisnumpy.ndarray

The normalized axis vector.

- columnslist

The feature IDs corresponding to the specified mode.

Raises:
ValueError

If the mode is not supported or if the axis is not a numpy array of size 3 or 6.

class cryocat.analysis.tango.SupportCatalog#

Bases: Catalog

class cryocat.analysis.tango.SymmParticle(rotation, position, tomo_id=None, motl_fid=None, particle_id=None, symm=None, custom_rot=None)#

Bases: Particle

classmethod equip_symmetry(symm, custom_rot=None)#

Equip an existing particle object with symmetry information.

Parameters:
input_particleParticle

The input particle to be equipped with symmetry information.

symmstr or int

Refers to the symmetry type. Can be one of the following:

  • ‘tetra’, ‘octa’, ‘cube’, ‘ico’, ‘dodeca’ for platonic solids

  • An integer n > 1 for cyclic groups C_n.

custom_rotnumpy ndarray (3,3) or rotation object, optional

Rotation matrix or rotation object describing the symmetry of the particle in the case of a platonic solid. Default is None.

Returns:
SymmParticle
max_dissimilarity()#

Given the input particle’s symmetry type, this function returns the associated maximum angular dissimilarity.

Returns:
float
similarity_symm(other, max=None)#

Compute angular similarity between two symmetric particles in an unambiguous manner.

Parameters:
maxfloat, optional

If not None, the maximum dissimilarity is set to the input value. This is designed to accelerate computations by computing max only once. Default is None.

Returns:
float
Raises:
ValueError

If the symmetry types of the input particles don’t match.

class cryocat.analysis.tango.Torus(twist_desc, inner_radius, outer_radius, axis=None, mode='position')#

Bases: Support

class cryocat.analysis.tango.TwistDescriptor(input_twist=None, input_motl=None, nn_radius=None, feature_id='tomo_id', symm=None, remove_qp=False, remove_duplicates=False, build_unique_desc=True)#

Bases: Descriptor

static get_all_feature_ids(symm=False)#

Get the feature IDs of all information available to TwistDescriptors.

Returns:
list
static get_axis_feature_id(feature_id, axis='z')#

Get the feature ID for a specific axis.

Parameters:
feature_idstr

The base feature ID (e.g., “twist_so”).

axisstr, {“z”, “x”, “y”}

The axis to use. Default is “z”.

Returns:
str

The feature ID for the specified axis.

Raises:
ValueError

If the axis is not valid (not “x”, “y”, or “z”).
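
Judging from column names such as twist_so_x, twist_so_y and twist_so_z used elsewhere on this page, the axis-specific ID is presumably the base ID with the axis appended. A sketch under that assumption (`axis_feature_id` is a hypothetical stand-in for the method):

```python
def axis_feature_id(feature_id, axis="z"):
    """Append the axis name to a base feature ID, validating the axis."""
    if axis not in ("x", "y", "z"):
        raise ValueError(f"Invalid axis {axis!r}; expected 'x', 'y' or 'z'.")
    return f"{feature_id}_{axis}"


default_id = axis_feature_id("twist_so")
x_id = axis_feature_id("twist", axis="x")
print(default_id, x_id)
```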

classmethod get_data_range(twist_desc, twist_descriptor_id=None, min_value=None, max_value=None)#

Filter a TwistDescriptor instance by a specific column and return a new instance.

Parameters:
twist_descTwistDescriptor

The original TwistDescriptor instance.

twist_descriptor_idstr, optional

The name of the column to filter on. If None, the original twist_desc will be returned. Default is None.

min_valuefloat, optional

Minimum value for filtering. Default is None.

max_valuefloat, optional

Maximum value for filtering. Default is None.

Returns:
TwistDescriptor

A new instance of TwistDescriptor with filtered data.

static get_mixed_feature_ids()#

Get the feature IDs for accessing both relative position and orientation information.

Returns:
list
static get_nn_twist_stats_within_radius(input_motl, nn_radius, feature_id='tomo_id', symm=None, remove_qp=None, remove_duplicates=False)#

Compute twist descriptor for a given input_motl within a specified radius.

Parameters:
input_motlstr or Motl

The path to the input Motl file or Motl object to be loaded.

nn_radiusfloat

The radius within which to compute the twist descriptor.

feature_idstr, {“tomo_id”,”object_id”,”class”,”geom1”,”geom2”,”geom3”,”geom4”,”geom5”}

The identifier for the motl feature which specifies the level of comparison. For instance, if “tomo_id” is specified the NN analysis will be computed at the tomogram level. If “object_id” is specified, the nearest neighbors will be searched in the objects with same “object_id”. Note that one should ensure unique numbering among all tomograms in case “tomo_id” is not set, otherwise objects from different tomograms might be incorrectly grouped together. Default is “tomo_id”.

symmint or str, optional

Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.

remove_qpbool, optional

If True, the query point is removed from the nearest neighbors in the DataFrame. Default is None.

remove_duplicatesbool, default=False

If True, duplicate entries are removed from the DataFrame. Default is False.

Returns:
pandas.DataFrame

DataFrame containing twist vectors and additional information.

static get_pos_feature_ids()#

Get the feature IDs for accessing relative position information.

Returns:
list
get_qp_twist_desc(query_particle, tomo_id=None)#

Get twist descriptor for a specific query particle from a twist dataframe.

Parameters:
query_particleint, float or Particle

The index of the particle or a Particle instance for which statistics are to be retrieved.

tomo_idint, optional

The tomogram id to filter the data. If None, data will be retrieved for all tomograms associated with the query particle. Default is None.

Returns:
filtered_twist_descTwistDescriptor

A TwistDescriptor containing the filtered statistics for the specified query particle and tomo id.

Raises:
ValueError

If query_particle is neither an instance of Particle nor an integer index.

Notes

This function assumes that twist_df contains a column named “qp_id” for particle IDs and “tomo_id” for tomogram IDs.

static get_rot_feature_ids()#

Get the feature IDs for accessing relative orientation information.

Returns:
list
static get_symm_parameters(input_motl, symm=None)#

Get symmetry parameters (symmetry type, maximum dissimilarity) for a given input_motl.

Parameters:
input_motlstr or Motl

The path to the input Motl file or Motl object to be loaded.

symmint or str, optional

Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.

Returns:
tuple
- max_dissimilarityfloat

The maximum dissimilarity for the given symmetry type.

- categorystr

The symmetry type (e.g., ‘tetrahedron’, ‘octahedron’, etc.).

get_twist_mixed_df()#

Get the relative position and orientation from the twist descriptor DataFrame. This data corresponds to the twist vectors.

Returns:
pandas.DataFrame

A DataFrame containing the relative position and orientation (twist_x, twist_y, twist_z, twist_so_x, twist_so_y, twist_so_z).

get_twist_mixed_np()#

Returns the twist vectors as a numpy array with shape (n_samples, 6). The first three columns correspond to the relative orientation as described by (twist_so_x, twist_so_y, twist_so_z), and the last three columns correspond to the relative position as described by (twist_x, twist_y, twist_z).

Returns:
numpy.ndarray

The twist vectors as a numpy array.

get_twist_pos_df()#

Get the relative position from the twist descriptor DataFrame.

Returns:
pandas.DataFrame

A DataFrame containing the relative positions (twist_x, twist_y, twist_z).

get_twist_pos_np()#

Get the relative position from the twist descriptor DataFrame as a numpy array. This data corresponds to the twist vectors.

Returns:
numpy.ndarray

A numpy array containing the relative positions (twist_x, twist_y, twist_z).

get_twist_rot_df()#

Get the relative orientation from the twist descriptor DataFrame.

Returns:
pandas.DataFrame

A DataFrame containing the relative orientations (twist_so_x, twist_so_y, twist_so_z).

get_twist_rot_np()#

Get the relative orientation from the twist descriptor DataFrame as a numpy array.

Returns:
numpy.ndarray

A numpy array containing the relative orientations (twist_so_x, twist_so_y, twist_so_z).

classmethod load(input_data)#

Create a TwistDescriptor object from input data.

Parameters:
input_datastr or TwistDescriptor

File path or TwistDescriptor object.

Raises:
ValueError

If the input data is not a valid type (TwistDescriptor or str).

static process_tomo_twist(t_nn, symm=None, symm_max_value=None, symm_category=None)#

Compute twist descriptors for a single tomogram.

Parameters:
t_nncryocat.nnana.NearestNeighbors object

Contains nearest neighbors data for a single tomogram.

symmint or str, optional

Specifies whether to use symmetry information. If None, no symmetry will be used. For allowed values see cryocat.analysis.tango.SymmParticle. Default is None.

symm_max_valuefloat, optional

Maximum dissimilarity for symmetry. Default is None.

symm_categorystr, optional

Type of symmetry. Default is None.

Returns:
pandas.DataFrame

DataFrame containing twist vectors and additional information.

static read_in(input_file)#

Reads in a pandas DataFrame from a CSV or Pickle file depending on file extension.

Parameters:
input_filestr

File path. Must end with .csv or .pkl.

Raises:
ValueError

If the file type is not supported.

sort_by_distance(twist_descriptor_id='geodesic_distance_rad')#

Sort the twist descriptor DataFrame by a specific column. The feature corresponding to the twist descriptor ID is interpolated at the sorted distances.

Parameters:
twist_descriptor_idstr, {“geodesic_distance_rad”, “euclidean_distance”, “product_distance”, “twist_so_x”, “twist_so_y”, “twist_so_z”, “twist_x”, “twist_y”, “twist_z”, “qp_inplane”,”nn_inplane”}

The name of the distance column to sort by. In principle, all columns of a TwistDescriptor data frame are valid input. Default is “geodesic_distance_rad”.

Returns:
tuple
- unified_distancesnumpy.ndarray

The sorted unique distances.

- average_valuesnumpy.ndarray

The average values of the twist descriptor at the sorted distances.

symmetry_statistics(c_range=None, plot_graph=True)#

Plot the angular scores for different C_n symmetries.

Parameters:
c_rangeint, float, or range, optional

The range of C_n symmetries to consider. If None, defaults to range(2, 10). If a single integer or float is provided, it will be converted to a range from 2 to that value. Default is None.

plot_graphbool, default=True

Whether to plot the graph. If True, the graph will be displayed. Default is True.

Returns:
figplotly.graph_objects.Figure

A Plotly figure containing the box plot of angular scores for different C_n symmetries.

twist_template_comparison(referenc_particle)#

Shift twist vectors by tangent vector corresponding to input particle. Ideally, the input particle corresponds to the relative particle pose of interest.

Parameters:
referenc_particleParticle

The reference particle to compare against.

Returns:
numpy.ndarray

The shifted twist vectors as a numpy array.

weighted_stats(position_weight, orientation_weight)#

Compute a weighted product metric for the twist vectors in the TwistDescriptor.

Parameters:
position_weightfloat

The weight for the position component of the twist vector.

orientation_weightfloat

The weight for the orientation component of the twist vector.

Returns:
TwistDescriptor

A new TwistDescriptor object containing the weighted product metric.

write_out(output_file)#

Save self.df to CSV or Pickle depending on file extension.

Parameters:
output_filestr

File path. Must end with .csv or .pkl.

Raises:
ValueError

If the file type is not supported.

cryocat.analysis.tango.convert_to_particle_list(input_motl, motl_fid=None, subset_tomo_id=None, symm=None, custom_rot=None)#

Convert a Motl object to a list of Particle objects.

Parameters:
input_motlstr or Motl

The path to the input Motl file or Motl object to be loaded.

motl_fidstr, optional

The column name in the Motl dataframe that should be used as motl_fid. If None, motl_fids will be set to None for all particles. Default is None.

subset_tomo_idint or list, optional

The tomo_id(s) that should be used. If None, the entire Motl will be used. Default is None.

symm: int or str, optional

Specifies the symmetry of a particle. If an int is passed, cyclic (C) symmetry is assumed. See SymmParticle for more details on str specifications. If specified, a list of SymmParticle objects is created instead of a list of Particle objects. Default is None.

custom_rot: np.array, optional

The input custom_rot can be a rotation matrix for the case that the given particle symmetry does not align with the canonical options for platonic solids as defined in geom. It is not needed when symm == n > 1 and is used only if symm is specified. Default is None.

Returns:
list of Particle or SymmParticle

A list of Particle or SymmParticle objects, each containing the rotation angles, position coordinates, tomo id, motl feature id, particle id, and symmetry (for SymmParticle).

cryocat.analysis.tango.get_colormap_color(cmap, val)#

Get a color from a colormap based on a value.

Parameters:
cmapmatplotlib.colors.Colormap

The colormap to use.

valfloat

The value to map to a color. Should be in the range [0, 1].

Returns:
str

The color in hexadecimal format.
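
Matplotlib colormaps are callable and return an RGBA tuple with components in [0, 1] for a scalar input, so the conversion amounts to formatting the first three components as hexadecimal. The sketch below avoids a matplotlib dependency by using a stand-in grayscale colormap. `gray_cmap`, `rgba_to_hex` and `colormap_color` are hypothetical names, and the library may perform the conversion with matplotlib's own utilities instead.

```python
def rgba_to_hex(rgba):
    """Convert an RGBA tuple with components in [0, 1] to '#rrggbb'."""
    r, g, b = (round(max(0.0, min(1.0, c)) * 255) for c in rgba[:3])
    return f"#{r:02x}{g:02x}{b:02x}"


def gray_cmap(val):
    """Stand-in colormap: maps val in [0, 1] to a gray RGBA tuple."""
    return (val, val, val, 1.0)


def colormap_color(cmap, val):
    """Look up val in the colormap and return the color as a hex string."""
    return rgba_to_hex(cmap(val))


black = colormap_color(gray_cmap, 0.0)
white = colormap_color(gray_cmap, 1.0)
print(black, white)
```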