multipers.data package
Submodules
multipers.data.MOL2 module
- multipers.data.MOL2.EF_AUC(distances: ndarray, labels: ndarray, anchors_in_test=0)
- multipers.data.MOL2.EF_from_distance_matrix(distances: ndarray, labels: list | ndarray, alpha: float, anchors_in_test=True)
- Computes the Enrichment Factor (EF) from a distance matrix and its labels.
The first axis of the distance matrix indexes the anchors on which the EF is computed.
The second axis indexes the test set. For convenience, the anchors may be included in the test set if the flag anchors_in_test is set to True.
labels : array of bools giving the labels of the test axis of the distance matrix.
alpha : the EF alpha parameter.
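The description above can be illustrated with a minimal NumPy sketch. It assumes the usual definition of the enrichment factor (the fraction of positives among the closest alpha fraction of the test set, divided by the fraction of positives in the whole test set) and averages over anchors. The function name enrichment_factor is hypothetical, the anchors_in_test flag is not modelled, and this is not necessarily how multipers computes the value.

```python
import numpy as np


def enrichment_factor(distances, labels, alpha):
    """Mean enrichment factor over anchors (illustrative sketch only).

    distances: (n_anchors, n_test) array; labels: (n_test,) booleans.
    """
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_test = labels.size
    # Keep the closest alpha fraction of the test set, at least one item.
    k = max(1, int(np.ceil(alpha * n_test)))
    # Indices of the k nearest test items for each anchor (one row per anchor).
    nearest = np.argsort(distances, axis=1, kind="stable")[:, :k]
    # Fraction of positives among the selected items, per anchor.
    hit_rate = labels[nearest].mean(axis=1)
    # Fraction of positives in the whole test set (the random baseline).
    base_rate = labels.mean()
    return float((hit_rate / base_rate).mean())


# Toy data: two anchors (rows) against four test items (columns).
distances = np.array([[0.1, 0.2, 0.9, 0.8],
                      [0.9, 0.8, 0.1, 0.7]])
labels = np.array([True, True, False, False])
ef = enrichment_factor(distances, labels, alpha=0.5)
print(ef)
```

In this toy case the first anchor retrieves only positives and the second retrieves none, so the per-anchor values are 2.0 and 0.0.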
- class multipers.data.MOL2.Molecule2SimplexTree(delayed: bool = False, filtrations: Iterable[str] = [], graph: bool = True, n_jobs: int = 1)
Bases: BaseEstimator, TransformerMixin
Transforms a list of MDA-compatible files into a list of multiparameter simplex trees.
Input
X: Iterable[path_to_files:str]
Output
Iterable[multipers.SimplexTreeMulti]
Parameters
filtrations : list of filtration names. Available ones : ‘charge’, ‘atomic_mass’, ‘bond_length’, ‘bond_type’. Others are ignored.
graph : bool. If True, uses the graph given by the molecule; otherwise, uses a Rips complex based on the distances.
In that case bond_length is ignored (it is the first parameter).
- fit(X: Iterable[str], y=None)
- transform(X: Iterable[str])
- multipers.data.MOL2.apply_pipeline(pathes: dict, pipeline)
- multipers.data.MOL2.get_EF_vector_from_distances(distances, ytest, alpha=0.05)
- multipers.data.MOL2.get_all_JC_path()
- multipers.data.MOL2.get_data_path_JC(type='dict')
- multipers.data.MOL2.img_distances(img_dict: dict)
- multipers.data.MOL2.lines2bonds(mol: Universe, bond_types=['ar', 'am', 3, 2, 1, 0], molecule_format=None)
- multipers.data.MOL2.lines2bonds_MOL2(mol: Universe)
- multipers.data.MOL2.lines2bonds_PDB(mol: Universe)
- multipers.data.MOL2.plot_EF_from_distances(alphas=[0.01, 0.02, 0.05, 0.1], EF=<function EF_from_distance_matrix>, plot: bool = True)
- multipers.data.MOL2.split_multimol(path: str, mol_name: str, out_folder_name: str = 'splitted', enforce_charges: bool = False)
- multipers.data.MOL2.theorical_max_EF(distances, labels, alpha)
- multipers.data.MOL2.theorical_max_EF_from_distances(list_of_distances, list_of_labels, alpha)
multipers.data.UCR module
- multipers.data.UCR.get(dataset: str = 'UCR/Coffee', test: bool = False, DATASET_PATH: str = '/user/dloiseau/home/Datasets/', dim=3, delay=1, skip=1)
- multipers.data.UCR.get_test(*args, **kwargs)
- multipers.data.UCR.get_train(*args, **kwargs)
multipers.data.graphs module
- class multipers.data.graphs.Graph2SimplexTrees(filtrations=[], delayed=False, num_collapses=100, progress: bool = False)
Bases: BaseEstimator, TransformerMixin
Transforms a list of networkx graphs into a list of SimplexTreeMulti objects.
Usual Filtrations
“cc” closeness centrality
“geodesic” if the graph provides data to compute it, e.g., BZR, COX2, PROTEINS
“degree”
“ricciCurvature” the Ricci curvature
“fiedler” the square of the Fiedler vector
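As an illustration of two of the filtrations listed above, the following NumPy sketch computes node degrees and the squared Fiedler vector on a toy path graph. It assumes the combinatorial Laplacian L = D - A and a unit-norm eigenvector; whether multipers uses this Laplacian or this normalisation is not stated here, so treat the choices as assumptions.

```python
import numpy as np

# Adjacency matrix of the path graph 0 - 1 - 2 - 3 (a toy example).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# "degree": number of neighbours of each node.
degree = adjacency.sum(axis=1)

# "fiedler": square of the Fiedler vector, i.e. of the unit eigenvector
# associated with the second-smallest eigenvalue of the Laplacian L = D - A.
laplacian = np.diag(degree) - adjacency
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # ascending eigenvalues
fiedler_squared = eigenvectors[:, 1] ** 2

print(degree)           # one value per node
print(fiedler_squared)  # one value per node
```

Squaring removes the sign ambiguity of the eigenvector, so the resulting node values are well defined whenever the second-smallest eigenvalue is simple.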
- fit(X, y=None)
- transform(X: list[Graph])
- multipers.data.graphs.compute_cc(graphs: list[Graph], progress=1)
- multipers.data.graphs.compute_degree(graphs: list[Graph], progress=1)
- multipers.data.graphs.compute_fiedler(graphs: list[Graph], progress=1)
- multipers.data.graphs.compute_filtration(dataset: str, filtration: str = 'ALL', **kwargs)
- multipers.data.graphs.compute_geodesic(graphs: list[Graph], progress=1)
- multipers.data.graphs.compute_hks(graphs: list[Graph], t: float, progress=1)
- multipers.data.graphs.compute_intrinsic(graphs: list[Graph], progress=1, nowarning=False)
- multipers.data.graphs.compute_ricci(graphs: list[Graph], alpha=0.5, progress=1)
- multipers.data.graphs.get(dataset: str, filtration: str | None = None)
- multipers.data.graphs.get_from_file(dataset: str)
- multipers.data.graphs.get_from_file_old(dataset: str, label='lb')
- multipers.data.graphs.get_graphs(dataset: str, N: int | str = '') → tuple[list[Graph], list[int]]
- multipers.data.graphs.reset_graphs(dataset: str, N=None)
- multipers.data.graphs.set_graphs(graphs: list[Graph], labels: list, dataset: str, N: int | str = '')
multipers.data.immuno_regions module
- multipers.data.immuno_regions.get(DATASET_PATH='/user/dloiseau/home/Datasets/')
- multipers.data.immuno_regions.get_immuno(i=1, DATASET_PATH='/user/dloiseau/home/Datasets/')
multipers.data.minimal_presentation_to_st_bf module
multipers.data.pytorch2simplextree module
- class multipers.data.pytorch2simplextree.Torch2SimplexTree(filtrations: Iterable[str] = [])
Bases: BaseEstimator, TransformerMixin
WARNING: work in progress. Converts PyTorch Data-like objects to simplex trees.
Input
Class having pos, edges, faces methods
Filtrations
Geodesic (geodesic rips)
eccentricity
- fit(X, y=None)
- transform(X: list[Graph])
- multipers.data.pytorch2simplextree.modelnet2graphs(version='10', print_flag=False, labels_only=False, a=0, b=10, weight_flag=False)
Loads ModelNet10 or ModelNet40 and converts it to graphs.
- multipers.data.pytorch2simplextree.modelnet2pts2gs(train_dataset, test_dataset, nbr_size=8, exp_flag=True, labels_only=False, n=100, n_jobs=1, random=False)
- multipers.data.pytorch2simplextree.torch_geometric_2nx(dataset, labels_only=False, print_flag=False, weight_flag=False)
- Parameters:
dataset
labels_only – return labels only
print_flag
weight_flag – whether to compute distances as weights
- Returns:
multipers.data.shape3d module
- multipers.data.shape3d.get(dataset: str, num_graph=0, seed=0, node_per_graph=0)
- multipers.data.shape3d.get_(dataset: str, dataset_num: int | None = None, num_sample: int = 0, DATASET_PATH='/user/dloiseau/home/Datasets/')
- multipers.data.shape3d.get_ModelNet(dataset, num_graph, seed)
- multipers.data.shape3d.load_modelnet(version='10', sample_points=False, reset: bool = False, remove_faces=False)
multipers.data.synthetic module
- multipers.data.synthetic.get_orbit5k(num_pts=1000, num_data=5000)
- multipers.data.synthetic.noisy_annulus(n1: int = 1000, n2: int = 200, r1: float = 1, r2: float = 2, dim: int = 2, center: ndarray | list | None = None, **kwargs) → ndarray
Generates a noisy annulus dataset.
Parameters
- r1 : float
Lower radius of the annulus.
- r2 : float
Upper radius of the annulus.
- n1 : int
Number of points in the annulus.
- n2 : int
Number of points in the square.
- dim : int
Dimension of the annulus.
- center : list or array
Center of the annulus.
Returns
- numpy array
Dataset of shape (n1 + n2, dim).
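The parameters above can be illustrated with a minimal NumPy sketch of such a generator. It is not the multipers implementation: the function name noisy_annulus_sketch and the seed argument are hypothetical, and the sampling choices (radii drawn uniformly in [r1, r2], noise drawn uniformly in the square [-r2, r2]^dim) are assumptions.

```python
import numpy as np


def noisy_annulus_sketch(n1=1000, n2=200, r1=1.0, r2=2.0, dim=2,
                         center=None, seed=0):
    """Return an (n1 + n2, dim) array: annulus points followed by noise."""
    rng = np.random.default_rng(seed)  # seed is a hypothetical extra argument
    # Random unit directions: normalised Gaussian vectors are uniform on the sphere.
    directions = rng.normal(size=(n1, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii drawn uniformly between the lower and upper radius (an assumption).
    radii = rng.uniform(r1, r2, size=(n1, 1))
    annulus = radii * directions
    # Background noise: n2 points uniform in the square [-r2, r2]^dim (an assumption).
    noise = rng.uniform(-r2, r2, size=(n2, dim))
    points = np.vstack([annulus, noise])
    if center is not None:
        # Translate the whole dataset so that it is centred at `center`.
        points = points + np.asarray(center, dtype=float)
    return points


points = noisy_annulus_sketch(n1=50, n2=10)
print(points.shape)  # (60, 2)
```

The first n1 rows lie in the annulus and the last n2 rows are the background noise, which matches the (n1 + n2) x dim size documented above.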
- multipers.data.synthetic.orbit(n: int = 1000, r: float = 1.0, x0=[])
- multipers.data.synthetic.three_annulus(num_pts: int = 500, num_outliers: int = 500)