# TD1: Simplicial complexes and Homology

In this first practical session, we will use `gudhi` to create simplicial complexes and compute their homology groups and Betti numbers. We will start with simple synthetic examples, and then apply computational topology to a real-world data set of images.

First, load the required Python libraries. You will need `numpy`, `gudhi`, `matplotlib`, `networkx`, and `scikit-learn`.

In [None]:
import os
import itertools
import numpy as np
import gudhi as gd
import networkx as nx
import matplotlib.pyplot as plt
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.metrics import pairwise_distances
from sklearn.preprocessing import OneHotEncoder
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

In [None]:
%matplotlib notebook

## 1. Betti numbers of standard topological spaces

The goal of this first exercise is to become familiar with `gudhi` and its `SimplexTree` data structure. The documentation is [here](https://gudhi.inria.fr/python/latest/simplex_tree_ref.html).

Q1. Triangulate the torus, represented as a quotient space.

![torus.png](attachment:torus.png)

![119733806.jpg](attachment:119733806.jpg)

Q2. Enter your triangulation as a simplex tree in `gudhi`.

Q3. Check that your triangulation is correct by computing its Betti numbers. You can use the `Betti_numbers` function below for this. If the Betti numbers are not correct, check that the number of simplices of each dimension in your simplex tree matches the number in your triangulation.
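
One way to run the simplex-count check is to enumerate all faces of the maximal simplices and compare the counts per dimension, and their alternating sum (the Euler characteristic), with your drawing. The sketch below is a minimal pure-Python illustration: the helper name `count_simplices` and the example complex (the boundary of a tetrahedron, i.e. a triangulated sphere) are assumptions chosen for illustration and are not part of `gudhi`.

```python
import itertools

def count_simplices(maximal_simplices):
    """Count the simplices of each dimension generated by the given maximal simplices."""
    faces = set()
    for simplex in maximal_simplices:
        vertices = sorted(simplex)
        # Every non-empty subset of a simplex is one of its faces.
        for size in range(1, len(vertices) + 1):
            faces.update(itertools.combinations(vertices, size))
    top_dim = max(len(face) for face in faces) - 1
    return [sum(1 for face in faces if len(face) == d + 1) for d in range(top_dim + 1)]

# Illustrative complex: the four triangles bounding a tetrahedron (a sphere).
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
counts = count_simplices(sphere)
euler = sum((-1) ** d * n for d, n in enumerate(counts))
print(counts, euler)  # prints [4, 6, 4] 2
```

The Euler characteristic equals the alternating sum of the Betti numbers, so any correct triangulation of the torus should give 0, whatever its number of simplices.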

In [None]:
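def Betti_numbers(st):
    # persistence_dim_max is a boolean flag: when True, homology is also
    # computed in the top dimension of the complex (skipped by default).
    st.compute_persistence(persistence_dim_max=True)
    return st.betti_numbers()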

Q4. Do the same with the [dunce hat](https://en.wikipedia.org/wiki/Dunce_hat_(topology)). The dunce hat is known to be contractible (homotopy equivalent to a point) but not collapsible (it cannot be reduced to a point by a sequence of elementary collapses, each of which removes a free face together with the unique simplex containing it).

![dunce.png](attachment:dunce.png)

![fig7_11dunce.jpg](attachment:fig7_11dunce.jpg)

## 2. Application to the COIL data set

The goal of this second exercise is to visualize and classify a data set using simplicial complexes and Betti numbers. The data set is the Columbia Object Image Library (COIL), which consists of grayscale images of rotating objects.

![obj1.gif](attachment:obj1.gif)![obj2.gif](attachment:obj2.gif)![obj3.gif](attachment:obj3.gif)![obj4.gif](attachment:obj4.gif)![obj5.gif](attachment:obj5.gif)

You can download the data [here](http://www-sop.inria.fr/abs/teaching/centrale-FGMDA/slides_mathieu/coil-20-proc.zip). Set the `path` variable to the location of the data on your machine.

In [None]:
path = './coil-20-proc/'

In [None]:
plt.figure()
plt.imshow(plt.imread(path + 'obj1__0.png'), cmap='gray')
plt.show()

An implementation of the Mapper complex is provided. Familiarize yourself with its arguments and their formats before using it.

Q1. Read the images and their labels (the object they represent) in `numpy` arrays.
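
A possible starting point for reading the data is sketched below. It assumes that every file follows the `obj<k>__<i>.png` naming seen in `obj1__0.png` above, with `<k>` the object label and `<i>` the view index, and that all images have the same shape. The helper names `parse_coil_filename` and `load_coil` are hypothetical.

```python
import os
import re
import numpy as np

# Assumed naming scheme: obj<label>__<view>.png, as in 'obj1__0.png'.
FILENAME_PATTERN = re.compile(r"obj(\d+)__(\d+)\.png$")

def parse_coil_filename(filename):
    """Return (object label, view index), or None if the name does not match."""
    match = FILENAME_PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def load_coil(path):
    """Read all matching images in `path` into arrays of images and labels."""
    # Imported here so that the filename helper works without matplotlib.
    import matplotlib.pyplot as plt

    images, labels = [], []
    for name in sorted(os.listdir(path)):
        parsed = parse_coil_filename(name)
        if parsed is None:
            continue  # skip files that are not COIL images
        images.append(plt.imread(os.path.join(path, name)))
        labels.append(parsed[0])
    # np.stack requires all images to share the same shape (assumed here).
    return np.stack(images), np.array(labels)

print(parse_coil_filename("obj1__0.png"))  # prints (1, 0)
```

With the `path` variable defined earlier, `images, labels = load_coil(path)` would then give one array entry per image file.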

Q2. Compute the Mapper complex on the images of a given rotating object, using the `MapperComplex` class in `gudhi` (documentation [here](https://gudhi.inria.fr/python/latest/cover_complex_sklearn_isk_ref.html)) with, e.g., PCA components as filters, and compute its Betti numbers. Try different filters, resolutions, gains and clusterings, and see how they influence the results.
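
To build intuition for the resolution and gain parameters, the sketch below constructs an overlapping interval cover of the range of a one-dimensional filter. It assumes that the resolution is the number of intervals and that the gain is the fraction of each interval shared with its successor. This is an illustration of the idea only, with a hypothetical helper `interval_cover`; the exact convention used inside `gudhi` may differ.

```python
import numpy as np

def interval_cover(values, resolution, gain):
    """Cover [min(values), max(values)] with `resolution` equal-length intervals.

    Consecutive intervals overlap on a fraction `gain` (0 <= gain < 1) of
    their length. Returns an array of shape (resolution, 2) of [start, end].
    """
    lo, hi = float(np.min(values)), float(np.max(values))
    # Total length covered: resolution * length - (resolution - 1) * gain * length.
    length = (hi - lo) / (resolution - (resolution - 1) * gain)
    step = length * (1.0 - gain)
    starts = lo + step * np.arange(resolution)
    return np.column_stack([starts, starts + length])

# Illustrative filter values spanning [0, 1].
cover = interval_cover(np.array([0.0, 0.4, 1.0]), resolution=2, gain=0.5)
print(cover)
```

Under these assumptions, a higher resolution produces more, shorter intervals (hence more nodes), while a higher gain increases the overlaps (hence more edges and higher-dimensional simplices).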

Q3. Use `networkx`, its `draw` function and the `get_networkx` method of the `MapperComplex` class to visualize the complex.

Q4. We are now going to classify images based on their Betti numbers. First, write a function that turns an image into a simplicial complex by triangulating every pixel with two triangles, leaving out the pixels whose grayscale value is below a predefined threshold. Test your function on an image, compute the Betti numbers of the associated complex, and visualize the thresholded image to check that the Betti numbers make sense.
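
The pixel triangulation can be sketched as follows, under the assumption that vertices are the pixel corners, numbered row by row on a grid with one more row and column than the image. The second helper is an independent cross-check that does not use `gudhi`: since the complex lives in the plane, its second Betti number is 0, so the first Betti number follows from the number of connected components and the Euler characteristic. Both helper names are hypothetical.

```python
import itertools
import numpy as np

def image_to_triangles(image, threshold):
    """Two triangles per pixel whose value is at least `threshold`."""
    width = image.shape[1]
    triangles = []
    for i, j in zip(*np.nonzero(image >= threshold)):
        i, j = int(i), int(j)
        # Corner ids on the (height + 1) x (width + 1) grid of pixel corners.
        a, b = i * (width + 1) + j, i * (width + 1) + j + 1
        c, d = (i + 1) * (width + 1) + j, (i + 1) * (width + 1) + j + 1
        triangles += [(a, b, d), (a, c, d)]  # split the pixel along a diagonal
    return triangles

def planar_betti_numbers(triangles):
    """Return [b0, b1] of a planar 2-complex, using b1 = b0 - Euler characteristic."""
    vertices = {v for t in triangles for v in t}
    edges = {e for t in triangles for e in itertools.combinations(sorted(t), 2)}
    parent = {v: v for v in vertices}  # union-find over the vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    b0 = len({find(v) for v in vertices})
    euler = len(vertices) - len(edges) + len(set(triangles))
    return [b0, b0 - euler]

# Toy image: a 3 x 3 ring of bright pixels around one dark pixel.
ring = np.ones((3, 3))
ring[1, 1] = 0.0
print(planar_betti_numbers(image_to_triangles(ring, threshold=0.5)))  # prints [1, 1]
```

The triangles returned by `image_to_triangles` can be inserted one by one into a `SimplexTree` and passed to `Betti_numbers`. Note that with this construction two pixels touching only at a corner share a vertex and therefore belong to the same connected component.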

Q5. For a fixed threshold, compute the Betti numbers of the complexes built from the images of a few objects. Then, use these features to train classifiers (such as SVMs or random forests) and compute their accuracies on random 80%/20% train/test splits of the data.
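
One way to organize the evaluation is sketched below with `scikit-learn`, using `ShuffleSplit` to draw repeated random 80%/20% splits. The feature matrix here is a synthetic placeholder and not a COIL result: it only stands in for an array with one row of Betti numbers per image, and should be replaced by your own features and labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ShuffleSplit, cross_val_score

# Synthetic placeholder data (NOT real Betti numbers): 3 objects, 20 images each,
# two integer features per image.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
features = np.column_stack([labels + 1, rng.integers(0, 3, size=labels.size)])

# Ten random splits, each holding out 20% of the images for testing.
splitter = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
classifier = RandomForestClassifier(n_estimators=50, random_state=0)
scores = cross_val_score(classifier, features, labels, cv=splitter)

print("mean accuracy:", scores.mean(), "std:", scores.std())
```

Reporting the mean and standard deviation over several splits gives a more stable picture than a single split. Swapping `RandomForestClassifier` for `SVC` only requires changing the `classifier` line.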

Q6. Since Betti numbers are ordinal data, check how one-hot encoding them affects the accuracies.
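
A minimal sketch of the encoding step with `OneHotEncoder` is given below. The small array of Betti numbers is made up for illustration. Each column (one Betti number) is replaced by one indicator column per distinct value observed in that column.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Made-up Betti numbers (b0, b1) for four images, for illustration only.
betti = np.array([[1, 0], [1, 2], [3, 0], [1, 2]])

# handle_unknown="ignore" encodes values unseen during fit as all zeros,
# which matters when the encoder is fitted on the training split only.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(betti).toarray()

print(encoder.categories_)
print(encoded)
```

In a train/test setting, the encoder should be fitted on the training features and then applied to the test features with `transform`, so that no information from the test split leaks into the encoding.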