multipers.data package

Submodules

multipers.data.MOL2 module

multipers.data.MOL2.EF_AUC(distances: ndarray, labels: ndarray, anchors_in_test=0)
multipers.data.MOL2.EF_from_distance_matrix(distances: ndarray, labels: list | ndarray, alpha: float, anchors_in_test=True)
Computes the Enrichment Factor (EF) from a distance matrix and its labels.
  • First axis of the distance matrix is the anchors on which to compute the EF

  • Second axis is the test set. For convenience, anchors can be included in the test set by setting the flag anchors_in_test to True.

  • labels is an array of bools giving the labels of the test axis of the distance matrix.

  • alpha : the EF alpha parameter.
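The following is a minimal sketch of one common definition of the enrichment factor, not the implementation used by multipers. It assumes that, for each anchor (row), the test items are ranked by increasing distance, that the top ceil(alpha * n_test) items are kept, and that the EF is the fraction of positive labels among them divided by the overall fraction of positive labels. The function name enrichment_factor_sketch is hypothetical, and the handling of anchors_in_test is not modeled.

```python
import numpy as np


def enrichment_factor_sketch(distances, labels, alpha):
    """Per-anchor enrichment factor under the assumptions stated above.

    distances: (n_anchors, n_test) array, smaller means more similar.
    labels:    (n_test,) array of bools, True for positive test items.
    alpha:     fraction of the ranked test set to keep, in (0, 1].
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_test = labels.shape[0]

    # Number of top-ranked test items kept for each anchor (at least one).
    k = max(1, int(np.ceil(alpha * n_test)))

    # Overall fraction of positives; the ratio is undefined if it is zero.
    base_rate = labels.mean()

    # Indices of the k closest test items for each anchor.
    nearest = np.argsort(distances, axis=1, kind="stable")[:, :k]

    # Fraction of positives among the k closest items, per anchor.
    hit_rate = labels[nearest].mean(axis=1)
    return hit_rate / base_rate


# Hypothetical toy data: 2 anchors, 4 test items, half of them positive.
toy_distances = [[0.1, 0.2, 0.9, 0.8],
                 [0.9, 0.8, 0.1, 0.2]]
toy_labels = [True, True, False, False]
toy_ef = enrichment_factor_sketch(toy_distances, toy_labels, alpha=0.5)
```

Under this definition, an anchor whose nearest neighbors are all positive reaches the maximum value 1 / base_rate when k does not exceed the number of positives.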

class multipers.data.MOL2.Molecule2SimplexTree(delayed: bool = False, filtrations: Iterable[str] = [], graph: bool = True, n_jobs: int = 1)

Bases: BaseEstimator, TransformerMixin

Transforms a list of MDA-compatible files into a list of multiparameter simplex trees.

Input

X: Iterable[path_to_files:str]

Output

Iterable[multipers.SimplexTreeMulti]

Parameters

  • filtrations : list of filtration names. Available ones : ‘charge’, ‘atomic_mass’, ‘bond_length’, ‘bond_type’. Others are ignored.

  • graph : bool. If True, uses the graph given by the molecule; otherwise, uses a Rips complex based on the distances.

In that case, bond_length is ignored (it’s the first parameter).

fit(X: Iterable[str], y=None)
transform(X: Iterable[str])
multipers.data.MOL2.apply_pipeline(pathes: dict, pipeline)
multipers.data.MOL2.get_EF_vector_from_distances(distances, ytest, alpha=0.05)
multipers.data.MOL2.get_all_JC_path()
multipers.data.MOL2.get_data_path_JC(type='dict')
multipers.data.MOL2.img_distances(img_dict: dict)
multipers.data.MOL2.lines2bonds(mol: Universe, bond_types=['ar', 'am', 3, 2, 1, 0], molecule_format=None)
multipers.data.MOL2.lines2bonds_MOL2(mol: Universe)
multipers.data.MOL2.lines2bonds_PDB(mol: Universe)
multipers.data.MOL2.plot_EF_from_distances(alphas=[0.01, 0.02, 0.05, 0.1], EF=<function EF_from_distance_matrix>, plot: bool = True)
multipers.data.MOL2.split_multimol(path: str, mol_name: str, out_folder_name: str = 'splitted', enforce_charges: bool = False)
multipers.data.MOL2.theorical_max_EF(distances, labels, alpha)
multipers.data.MOL2.theorical_max_EF_from_distances(list_of_distances, list_of_labels, alpha)

multipers.data.UCR module

multipers.data.UCR.get(dataset: str = 'UCR/Coffee', test: bool = False, DATASET_PATH: str = '/user/dloiseau/home/Datasets/', dim=3, delay=1, skip=1)
multipers.data.UCR.get_test(*args, **kwargs)
multipers.data.UCR.get_train(*args, **kwargs)

multipers.data.graphs module

class multipers.data.graphs.Graph2SimplexTrees(filtrations=[], delayed=False, num_collapses=100, progress: bool = False)

Bases: BaseEstimator, TransformerMixin

Transforms a list of networkx graphs into a list of SimplexTreeMulti.

Usual Filtrations

  • “cc” closeness centrality

  • “geodesic” if the graph provides data to compute it, e.g., BZR, COX2, PROTEINS

  • “degree”

  • “ricciCurvature” the Ricci curvature

  • “fiedler” the square of the Fiedler vector
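As a rough illustration of what two of the node filtrations listed above measure, the sketch below computes the degree and the squared Fiedler vector of a small graph. It is not the multipers implementation: it assumes a connected, undirected graph given as a dense NumPy adjacency matrix instead of a networkx graph, uses the unnormalized Laplacian with a unit-norm eigenvector, and the function names are hypothetical.

```python
import numpy as np


def degree_filtration(adjacency):
    """Degree of each node, read off the rows of the adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return adjacency.sum(axis=1)


def squared_fiedler_filtration(adjacency):
    """Entrywise square of the Fiedler vector of the graph Laplacian.

    The Fiedler vector is the eigenvector associated with the second
    smallest eigenvalue of L = D - A. Squaring removes the sign ambiguity
    of the eigenvector.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    # eigh returns eigenvalues in ascending order, so column 1 is the
    # Fiedler vector when the graph is connected.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, 1] ** 2


# Path graph on three nodes: 0 - 1 - 2.
path_graph = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])
node_degrees = degree_filtration(path_graph)
node_fiedler = squared_fiedler_filtration(path_graph)
```

Each function returns one value per node, which is the shape of data a node-based filtration needs.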

fit(X, y=None)
transform(X: list[Graph])
multipers.data.graphs.compute_cc(graphs: list[Graph], progress=1)
multipers.data.graphs.compute_degree(graphs: list[Graph], progress=1)
multipers.data.graphs.compute_fiedler(graphs: list[Graph], progress=1)
multipers.data.graphs.compute_filtration(dataset: str, filtration: str = 'ALL', **kwargs)
multipers.data.graphs.compute_geodesic(graphs: list[Graph], progress=1)
multipers.data.graphs.compute_hks(graphs: list[Graph], t: float, progress=1)
multipers.data.graphs.compute_intrinsic(graphs: list[Graph], progress=1, nowarning=False)
multipers.data.graphs.compute_ricci(graphs: list[Graph], alpha=0.5, progress=1)
multipers.data.graphs.get(dataset: str, filtration: str | None = None)
multipers.data.graphs.get_from_file(dataset: str)
multipers.data.graphs.get_from_file_old(dataset: str, label='lb')
multipers.data.graphs.get_graphs(dataset: str, N: int | str = '') → tuple[list[Graph], list[int]]
multipers.data.graphs.reset_graphs(dataset: str, N=None)
multipers.data.graphs.set_graphs(graphs: list[Graph], labels: list, dataset: str, N: int | str = '')

multipers.data.immuno_regions module

multipers.data.immuno_regions.get(DATASET_PATH='/user/dloiseau/home/Datasets/')
multipers.data.immuno_regions.get_immuno(i=1, DATASET_PATH='/user/dloiseau/home/Datasets/')

multipers.data.minimal_presentation_to_st_bf module

multipers.data.pytorch2simplextree module

class multipers.data.pytorch2simplextree.Torch2SimplexTree(filtrations: Iterable[str] = [])

Bases: BaseEstimator, TransformerMixin

WARNING: build in progress. Converts PyTorch Data-like objects to simplex trees.

Input

A class providing pos, edges, and faces methods.

Filtrations

  • Geodesic (geodesic rips)

  • eccentricity

fit(X, y=None)
transform(X: list[Graph])
multipers.data.pytorch2simplextree.modelnet2graphs(version='10', print_flag=False, labels_only=False, a=0, b=10, weight_flag=False)

Loads ModelNet10 or ModelNet40 and converts it to graphs.

multipers.data.pytorch2simplextree.modelnet2pts2gs(train_dataset, test_dataset, nbr_size=8, exp_flag=True, labels_only=False, n=100, n_jobs=1, random=False)
multipers.data.pytorch2simplextree.torch_geometric_2nx(dataset, labels_only=False, print_flag=False, weight_flag=False)
Parameters:
  • dataset

  • labels_only – return labels only

  • print_flag

  • weight_flag – whether to compute distances as weights

Returns:

multipers.data.shape3d module

multipers.data.shape3d.get(dataset: str, num_graph=0, seed=0, node_per_graph=0)
multipers.data.shape3d.get_(dataset: str, dataset_num: int | None = None, num_sample: int = 0, DATASET_PATH='/user/dloiseau/home/Datasets/')
multipers.data.shape3d.get_ModelNet(dataset, num_graph, seed)
multipers.data.shape3d.load_modelnet(version='10', sample_points=False, reset: bool = False, remove_faces=False)

multipers.data.synthetic module

multipers.data.synthetic.get_orbit5k(num_pts=1000, num_data=5000)
multipers.data.synthetic.noisy_annulus(n1: int = 1000, n2: int = 200, r1: float = 1, r2: float = 2, dim: int = 2, center: ndarray | list | None = None, **kwargs) → ndarray

Generates a noisy annulus dataset.

Parameters

r1 : float

Lower radius of the annulus.

r2 : float

Upper radius of the annulus.

n1 : int

Number of points in the annulus.

n2 : int

Number of points in the square.

dim : int

Dimension of the annulus.

center : list or array

Center of the annulus.

Returns

numpy array

Dataset of shape (n1 + n2) × dim.
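The parameters above can be read as the following sketch, which is an illustration and not the multipers implementation. It assumes that the n1 annulus points have a uniformly random direction and a radius drawn uniformly in [r1, r2], and that the n2 noise points are drawn uniformly in the square [-r2, r2]^dim. The function name noisy_annulus_sketch and the seed argument are hypothetical.

```python
import numpy as np


def noisy_annulus_sketch(n1=1000, n2=200, r1=1.0, r2=2.0, dim=2,
                         center=None, seed=None):
    """Return an (n1 + n2) x dim array: annulus points first, then noise."""
    rng = np.random.default_rng(seed)

    # Uniform directions on the unit sphere: normalize Gaussian samples.
    directions = rng.normal(size=(n1, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Radii uniform in [r1, r2]; this is not uniform with respect to area.
    radii = rng.uniform(r1, r2, size=(n1, 1))
    annulus = directions * radii

    # Background noise, uniform in the square [-r2, r2]^dim (assumption).
    noise = rng.uniform(-r2, r2, size=(n2, dim))

    points = np.vstack([annulus, noise])
    if center is not None:
        # Translate the whole dataset so the annulus is centered at `center`.
        points = points + np.asarray(center, dtype=float)
    return points


sample = noisy_annulus_sketch(n1=100, n2=20, r1=1.0, r2=2.0, dim=2, seed=0)
```

The first n1 rows of the result lie between the two radii, and the remaining n2 rows fill the surrounding square.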

multipers.data.synthetic.orbit(n: int = 1000, r: float = 1.0, x0=[])
multipers.data.synthetic.three_annulus(num_pts: int = 500, num_outliers: int = 500)

Module contents