**PostDoc: Multi-Agent Reinforcement Learning with Application to Access Control**

**PhD topic description:**

Computation of stationary distribution of large Markov chains has immense number of applications, for example, in computing systems, supply chains, recommendation systems, and computational biology. Arguably, the largest Markov chain was introduced by Google founders. This Markov chain describes a stylized web surfer hopping from one web page to another. The stationary distribution of this Markov chain, called PageRank, was used by Google to rank search results by their relevance. This gave an enormous boost to the development of new computational methods for solving very large Markov chains.

In this project you will develop and analyse a new class of methods for computing stationary distribution of general large Markov chains. These methods converge much faster than the current state-of-the art, and allow for distributed implementation.

The analysis of the algorithms involves techniques from probability theory and optimisation such as coupling, large deviations, and dynamic programming. Since we analyse very large systems, it is inevitable to look for high quality approximations, using, for example, perturbation theory and reinforcement learning. We will also explore the potential of distributed implementations where calculation can be, in a natural way, divided between multiple units such as GPU's.

This project will be supervised by Dr. Konstantin Avrachenkov (Inria) in collaboration with Prof. Nelly Litvak (University of Twente and Eindhoven University of Technology, Netherlands). The PhD student will have an opportunity for research visit(s) to the Netherlands.

**Requirements:**

- MSc degree in mathematics or related fields (computer science, physics, engineering) with strong focus on mathematics
- Strong background in probability theory, linear algebra and optimization is welcome
- Good command of English language

**Related literature:**

S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN systems, 30(1-7):107-117, 1998.

S. Abiteboul, M. Preda and G. Cobena. Adaptive on-line page importance computation. In Proceedings of the 12th ACM international conference on World Wide Web, 280-290, 2003.

K. Avrachenkov, N. Litvak, D. Nemirovsky and N. Osipova. Monte Carlo methods in PageRank computation: When one iteration is sufficient. SIAM Journal on Numerical Analysis, 45(2): 890-904, 2007.

K. Avrachenkov, P. Brown and N. Litvak. Red Light-Green Light Solution Method for Large Markov Chains. Arxiv preprint arXiv:2008.02710, 2020.

**Contact:**

Konstantin Avrachenkov (K.Avrachenkov@inria.fr)

Nelly Litvak (N.Litvak@utwente.nl)

Back to the front page