import multipers as mp
import numpy as np
import gudhi as gd
import matplotlib.pyplot as plt
mp.simplex_tree_multi.SAFE_CONVERSION=True ## Enable this if you encounter crashes.

Custom multi-filtrations

Toy examples

This notebook will showcase how to use the multipers library core functions. As it relies on gudhi code, the syntax is almost the same.

The example we provide can be tweaked in different ways; don’t hesitate to read the functions documentations to figure out how to do so.

st = gd.SimplexTree() # usual gudhi simplextree
st.insert([0,1],1)
st.insert([1],0)
# converts the simplextree into a multiparameter simplextree
st_multi = mp.SimplexTreeMulti(st, num_parameters=2)
list(st_multi.get_simplices())
[(array([0, 1]), array([  1., -inf])),
 (array([0]), array([  1., -inf])),
 (array([1]), array([  0., -inf]))]
st = mp.SimplexTreeMulti(num_parameters=2) # multiparameter syntax
st.insert([0],[0,1])
st.insert([1],[1,0])
st.insert([0,1],[1,1])
list(st.get_simplices()) # list of (simplex, filtration)
[(array([0, 1]), array([1., 1.])),
 (array([0]), array([0., 1.])),
 (array([1]), array([1., 0.]))]

Almost all methods of gudhi simplextree should work ootb on simplextreemultis, and some are reimplemented, e.g., make_filtration_non_decreasing, expand, edge_collapse.

Furthermore, some methods are removed, e.g., persistence, and are replaced by other topological invariants. For instance interval approximations, provided by MMA (wich comes itself with its own PyModule structure), or signed measures derived from the dimension vector of the module or the Euler characteristic.

mma_pymodule = st.persistence_approximation()
mma_pymodule.plot(degree=0, box=[[0,0],[2,2]])
mma_pymodule[0].get_birth_list(), mma_pymodule[0].get_death_list()
([array([1., 0.]), array([0., 1.])], [array([inf, inf])])
../_images/fe58df39afcf71357eb4a769c63b2465d7f39b4cafb0debe4e5ecca14beae636.png

The signed measure obtained by the degree 0 homology can be retrieved using

mp.signed_measure(st, degree=0, mass_default=None)
((array([[0., 1.],
         [1., 0.],
         [1., 1.]]),
  array([ 1,  1, -1])),)

And the signed measure obtained via the euler characteristic (degree=None)

mp.signed_measure(st,mass_default=None)
[(array([[0., 1.],
         [1., 0.],
         [1., 1.]]),
  array([ 1,  1, -1]))]

On point clouds

# Dataset
from multipers.data import noisy_annulus
X = noisy_annulus(n1=1000, n2=1000) # 1000 points on the annulus, 500 points on the square.
plt.scatter(X[:,0], X[:,1], s=5, c='k');plt.gca().set_aspect(1);
# Gudhi SimplexTree
st = gd.RipsComplex(points=X, max_edge_length=2.1, sparse=.2).create_simplex_tree()
# SimplexTreeMulti, with 2 parameters
st = mp.SimplexTreeMulti(st, num_parameters=2)
../_images/5f1c279c09723b177ef84ab48f965d920d737400b6c69f147ca73ec6ea30f7ff.png

st is now a SimplexTreeMulti, whose first parameter is a rips, and the second parameter is undefined.

Let’s fill the second parameter.

from multipers.ml.convolutions import KDE as KernelDensity
codensity = - KernelDensity(bandwidth=0.2).fit(X).score_samples(X) 
# parameter = 1 is the second parameter as python starts with 0.
st.fill_lowerstar(codensity, parameter=1) # This fills the second parameter with the co-log-density
[KeOps] Warning : Cuda libraries were not detected on the system ; using cpu only mode
<multipers.simplex_tree_multi.SimplexTreeMulti_Ff64 at 0x7f4906bb6af0>

Note that the KernelDensity is a (faster) reimplementation of sklearn’s KernelDensity, using PyKeops, which allows us to do automatic-differentiation.

Remark : A rips complex contains a lot of unnecessary edges, which can make the filtration grid prohibitly large; we will simplificate the simplextree, before computing this exact grid. This can be done using the collapse_edges method, which relies on filtration_domination.

print("Before edge collapses :", st.num_simplices)
st.collapse_edges(full=True) # This should take less than 20s
print("After edge collapses :", st.num_simplices)
st.expansion(2) # Be careful, we have to expand the dimension to 2 before computing degree 1 homology.
print("After expansion :", st.num_simplices)
Before edge collapses : 162364
After edge collapses : 9444
After expansion : 19167

Interval decomposition approximation

To get an idea of the shape of the module, we can compute first an approximation of it, using MMA.

st.persistence_approximation().plot(degree=1) # Should take less than ~10 seconds depending on the seed
../_images/67859706c387643def38ed8c84d1d3518134b48b2174885d6a18f47b5ebf4c90.png

Signed Barcode Decompositions

Another interesting topological invariants are signed measures given by module decompositions. Unlike MMA, they heavily rely on computation on a fixed grid, so in order to significantly reduce the computational time, we can coarsen the multi-filtration before doing the computations. This can be done using the grid_squeeze method.

The easiest measure to compute on point cloud is the one given by the decomposition of the hilbert function, i.e., the pointwise dimension of the module.

st_coord = mp.SimplexTreeMulti(st) # We do a copy first (for later use), as this changes the filtration values.
st_coord = st_coord.grid_squeeze(grid_strategy="exact") # exact filtration
list(st_coord.get_simplices())[:3] # filtration values are now integers : the coordinates in that grid
[(array([  0, 150, 758]), array([606, 187], dtype=int32)),
 (array([   0,  150, 1785]), array([835, 180], dtype=int32)),
 (array([  0, 150]), array([499, 180], dtype=int32))]

We can now compute the signed measure of degree 1 of this complex.

# The zero pad zeros out the grid, should take less than ~10s
sm_hilbert = mp.signed_measure(st_coord,degrees=[1], plot=True, mass_default=None, invariant="hilbert") 
../_images/f1c172493e3652b9b5fc33b193087a402b4b8128fa1b37bf9b129ad3bdc57b28.png

Note that the Euler signed measure cannot be computed here (unless you have a lot of RAM), as one would need to do a st.expansion(#num_verticies) beforehand.

This is why we need to simplificate the original multi-filtration, by using a grid of lower resolution.

F = st.get_filtration_grid(grid_strategy="regular_closest", resolution=100)
[len(f) for f in F]
[100, 100]
st_coord = mp.SimplexTreeMulti(st,safe_conversion=True)
st_coord = st_coord.grid_squeeze(F) # Squeezing simplificates the multi-filtration,  thus we can collapse edges again :)
st_coord.collapse_edges(full=True, progress=True, ignore_warning=True)
st_coord.expansion(st_coord.num_vertices)
st_coord.num_simplices
Removing edges:   9%|██████████████▍                                                                                                                                                 | 9/100 [00:00<00:03, 24.77it/s]
21939
# should be instantaneous
sm_euler = mp.signed_measure(st_coord,degree=None,plot=True) # degree=None automatically calls the euler invariant, and integers the hilbert invariant
../_images/a94aa41029f5070a56c62cbe3dec91ced261aacc886022de4c4efcdbbc6b20ac.png

Finally, the measure given by the rank decomposition has an even worse scaling with the grid size, so we recommend decreasing the grid size before computing this measure

## The rank invariant can be even slower to compute; it is in particular very affected by the resolution.
st_coord = st.grid_squeeze(grid_strategy="regular_closest", resolution=50)
st_coord.collapse_edges(full=True, ignore_warning=True)
sm_rank, = mp.signed_measure(st_coord,degrees=[1], invariant="rank")
st_copy = st.grid_squeeze(grid_strategy="regular_closest", resolution=200)
sm_hilbert1 = mp.signed_measure(st_copy, degree=1, invariant="hilbert")
pts, weights = sm_hilbert1[0]
hf, grid = mp.point_measure_integration.integrate_measure(pts,weights, return_grid=True)
mp.plots.plot_surface(grid, hf, discrete_surface=True)
mp.plots.plot_signed_measure(sm_rank)
../_images/a556520cbe49a99ecb6b9a104bf453ba33a116b58ef8a278a1268c3e66b9a8eb.png

Signed Measure Vectorizations

Sliced Wasserstein

For the slicedwasserstein distances, and the convolutions of the signed measures, we recommend to use the already provided pipelines :

  • multipers.ml.multi.SignedMeasure2Convolution

  • multipers.ml.multi.SignedMeasure2SlicedWassersteinDistance

Note that the second one needs signed measures of null mass. So set the zero_pad flag to True when using these pipelines.

from multipers.ml.signed_measures import SignedMeasure2Convolution, SignedMeasure2SlicedWassersteinDistance, SignedMeasureFormatter
[sm_hilbert, sm_euler] = SignedMeasureFormatter(normalize=True).fit_transform([sm_hilbert, sm_euler])
imgs = SignedMeasure2Convolution(plot=True,bandwidth=0.05, grid_strategy='regular', resolution=200).fit_transform([sm_hilbert, sm_euler])
../_images/31f04d5f9bc8d7399a1e434b168945c7b7cc9b44b5e769d414fa7392dc18d84b.png

Multi-critical multi-Filtrations

Note. This is still experimental, and interface is subject to changes.
Furthermore, no other library is compatible with multi-critical filtrations, so calling code that requires them will not work.
Check the simplextree with list(st) if necessary.

import multipers as mp
import numpy as np

Simply add the simplices multiple times! This can also seamlessly be converted to multi-critical slicers

st = mp.SimplexTreeMulti(num_parameters=2, kcritical=True, dtype = np.float64)
st.insert([0,1,2], [0,1])
st.insert([0,1,2], [1,0])
st.remove_maximal_simplex([0,1,2])
st.insert([0,1,2], [1,2])
st.insert([0,1,2], [2,1])
st.insert([0,1,2],[1.5,1.5])
st.insert([0,1,2], [2.5,.5])
st.insert([0,1,2], [.5,2.5])
list(st)
[(array([0, 1, 2]),
  [array([1., 2.]),
   array([2., 1.]),
   array([1.5, 1.5]),
   array([2.5, 0.5]),
   array([0.5, 2.5])]),
 (array([0, 1]), [array([0., 1.]), array([1., 0.])]),
 (array([0, 2]), [array([0., 1.]), array([1., 0.])]),
 (array([0]), [array([0., 1.]), array([1., 0.])]),
 (array([1, 2]), [array([0., 1.]), array([1., 0.])]),
 (array([1]), [array([0., 1.]), array([1., 0.])]),
 (array([2]), [array([0., 1.]), array([1., 0.])])]
mp.module_approximation(st, box = [[0,0],[3,3]]).plot()
../_images/619ac48e5594f259b0675e8e5aec02c3a7a8a0eb251bb647cf675d2960565810.png