P2P Rec

Friend-Of-A-Friend files

P2Prec is a recommendation service (RS) for P2P content-sharing systems that exploits users’ social data. To manage this data, we rely on the Friend-Of-A-Friend (FOAF) project. FOAF offers an open, detailed description of user profiles and of the relationships between users in a machine-readable syntax. When a user generates a FOAF file, the file can be given an identity on the Web in the form of a URI. This URI can point to the user's FOAF file stored on a server that the user trusts. In this sense, FOAF is a useful tool for providing simple directory services, and information from FOAF files can be used to locate people. One can think of FOAF as a way of describing a distributed directed graph of friendship relations: each user specifies its interests, topics of expertise, and friends in its FOAF file, and stores the file on a server that it trusts.
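To make the directed-graph view concrete, here is a minimal Python sketch. It is only an illustration: the `FoafProfile` class, its field names, the `friends_within` helper, and the `example.org` URIs are all hypothetical and are not part of FOAF or of P2Prec. It models each FOAF file as a node whose outgoing edges are the friends it lists, and collects friends and friends of friends with a breadth-first walk.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FoafProfile:
    """Hypothetical, simplified stand-in for the contents of one FOAF file."""
    uri: str                                      # identity of the file on the Web
    interests: set = field(default_factory=set)
    expertise: set = field(default_factory=set)   # topics of expertise
    knows: set = field(default_factory=set)       # URIs of direct friends (outgoing edges)


def friends_within(profiles, start_uri, max_hops=2):
    """Breadth-first walk over 'knows' edges; returns {uri: hop distance}."""
    distance = {start_uri: 0}
    queue = deque([start_uri])
    while queue:
        current = queue.popleft()
        if distance[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in profiles[current].knows:
            if friend in profiles and friend not in distance:
                distance[friend] = distance[current] + 1
                queue.append(friend)
    del distance[start_uri]  # the user is not its own friend
    return distance


# Made-up example data: alice -> bob -> carol -> dave (and carol -> alice).
ALICE, BOB, CAROL, DAVE = (
    f"http://example.org/{name}/foaf.rdf#me"
    for name in ("alice", "bob", "carol", "dave")
)
profiles = {
    ALICE: FoafProfile(ALICE, knows={BOB}),
    BOB: FoafProfile(BOB, expertise={"vegetarian cuisine"}, knows={CAROL}),
    CAROL: FoafProfile(CAROL, knows={DAVE, ALICE}),
    DAVE: FoafProfile(DAVE),
}
reachable = friends_within(profiles, ALICE)
```

With `max_hops=2`, the walk from alice reaches bob (a friend) and carol (a friend of a friend) but not dave. Because the graph is directed, the edge from carol to alice does not make alice a friend of carol's friends in the other direction.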

A Recommendation System

P2Prec’s general goal is to improve the quality and efficiency of query responses in P2P content-sharing systems by exploring the synergy between RSs and the social relations between users. P2Prec recommends to a user high-quality documents on a specific topic, drawn from documents that have been seen or created, and rated, by friends (or friends of friends) who are experts in that topic.

As an example, consider a user who is new to a topic, e.g., vegetarian cuisine, and would like to get good, valuable documents on that topic. The user should be able to consult friends (or friends of friends) who have some expertise or experience in vegetarian cuisine, so that those users can return high-quality documents related to the topic.

Topic Extraction

Thus, a key capability of P2Prec is to help users of a P2P content-sharing system find high-quality documents on one or more topics from friends (or friends of friends) who are experts in those topics. Users’ topics of expertise are computed automatically from a combination of topic extraction over the documents they share and ratings.

To extract and classify the hidden topics in the documents, we use the Latent Dirichlet Allocation (LDA) technique. Users who do not have expert friends, whom we call isolated users, can still use the system to get high-quality recommendations.
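The text does not give the formula that combines extracted topics with ratings, so the following Python sketch shows only one plausible combination, under stated assumptions: each shared document already has a topic distribution (the kind of output LDA produces), each document has a rating in [0, 1], and a user counts as an expert in a topic when the rating-weighted average weight of that topic reaches a threshold. The function names, the averaging rule, and the threshold value are hypothetical and are not taken from P2Prec.

```python
def topic_scores(documents):
    """Rating-weighted average topic weight over a user's shared documents.

    documents: list of (topic_distribution, rating) pairs, where
    topic_distribution maps topic name -> weight and rating is in [0, 1].
    """
    totals = {}
    for distribution, rating in documents:
        for topic, weight in distribution.items():
            totals[topic] = totals.get(topic, 0.0) + weight * rating
    count = len(documents)
    return {topic: total / count for topic, total in totals.items()} if count else {}


def topics_of_expertise(documents, threshold=0.3):
    """Topics whose score reaches the (assumed) expertise threshold."""
    return {t for t, score in topic_scores(documents).items() if score >= threshold}


# Made-up example: two shared documents with hand-written topic distributions.
shared = [
    ({"cuisine": 0.8, "travel": 0.2}, 1.0),   # highly rated, mostly about cuisine
    ({"cuisine": 0.6, "sports": 0.4}, 0.5),   # average rating
]
scores = topic_scores(shared)
expert_in = topics_of_expertise(shared)
```

In this example only "cuisine" passes the threshold: it dominates both documents, and one of them is highly rated. A topic that appears only marginally, or only in poorly rated documents, would not make the user an expert under this rule.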

A P2P Application

P2Prec has a hybrid P2P architecture designed to work on top of any P2P content-sharing system. It combines efficient DHT indexing, to manage the users who are experts in one or more topics along with their FOAF files, with the robustness of gossip, to disseminate topics of expertise between friends (and friends of friends). This hybrid architecture has two layers: expertise and recommendation.

  • The expertise layer organizes the expert users of the P2P content-sharing system in a Distributed Hash Table (DHT), so that they can be found and their FOAF files accessed efficiently.
  • The recommendation layer is an unstructured overlay that runs a gossip-based protocol. It lets each user maintain an up-to-date view of the topics of expertise of its friends and of a subset of their friends of friends, for the purpose of generating recommendations.
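As a rough illustration of the expertise layer, the Python sketch below uses a toy in-memory hash ring. It is not P2Prec's DHT: the `ToyDht` class, the choice of the topic name as the DHT key, the SHA-1 ring positions, and the node names are all assumptions made for this example. The point is only to show how hashing lets any peer find the node that indexes the experts for a topic without flooding the network.

```python
import bisect
import hashlib


def ring_position(key):
    """Map a string to a position on the hash ring (SHA-1, as an assumption)."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class ToyDht:
    """In-memory stand-in for a DHT: every node lives in one process."""

    def __init__(self, node_names):
        self.ring = sorted((ring_position(name), name) for name in node_names)
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key):
        # The first node clockwise from the key's position owns the key.
        positions = [position for position, _ in self.ring]
        index = bisect.bisect_left(positions, ring_position(key)) % len(self.ring)
        return self.ring[index][1]

    def register_expert(self, topic, foaf_uri):
        node = self.responsible_node(topic)
        self.storage[node].setdefault(topic, set()).add(foaf_uri)

    def find_experts(self, topic):
        node = self.responsible_node(topic)
        return set(self.storage[node].get(topic, set()))


# Made-up node names and FOAF URI.
dht = ToyDht(["node-a", "node-b", "node-c"])
dht.register_expert("vegetarian cuisine", "http://example.org/bob/foaf.rdf#me")
experts = dht.find_experts("vegetarian cuisine")
```

A lookup for the same topic always lands on the same node, so a registration and a later search meet at one place. A real DHT would distribute the ring across peers and route each lookup in a logarithmic number of hops.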

In our experimental evaluation [DPV+10], using the CiteSeer dataset, we show that P2Prec can reach the maximum recall with very good performance. Furthermore, it increases recall and precision by a factor of 2 compared with centralized solutions.

Gossip Algorithm

The Gossip Algorithm is designed to disseminate users’ relevant topics (the topics a user can provide). It also proposes to the owner new users with similar interests whom the owner may want to befriend. The Gossip Algorithm consists of four modules:

  1. View: a fixed number of entries, each referring to a user. An entry contains the user’s IP address and the topics the user can provide.
  2. Initialize View: initializes the user’s view by exchanging its FOAF file with its direct friends.
  3. View Management: updates the user’s view. When the user receives a gossip message, it updates its view based on that message. This module also implements the two behaviors of the gossip protocol, active and passive:
    • Active: a user initiates a gossip exchange by selecting a random contact from its view to gossip with. The user then selects a random subset of its view (the gossip message) and sends it to the selected contact.
    • Passive: when a user receives a gossip message, it creates a gossip message from its own view and sends it back to the initiator.
  4. Friend Matcher: measures the similarity between the user and the users in its view, then proposes to the owner those users whose similarity exceeds a certain threshold.
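The four modules above can be sketched in Python as follows. This is a simplified single-process simulation under several assumptions that the description does not state: each peer adds its own entry to the messages it sends, a full view evicts random entries, views are seeded by hand rather than through a FOAF exchange, and the Friend Matcher uses Jaccard similarity over topic sets. The `Peer` class and all function names, parameters, and addresses are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Peer:
    address: str                      # IP address of the user
    topics: frozenset                 # topics the user can provide
    view_size: int = 4                # fixed maximum number of view entries
    view: dict = field(default_factory=dict)   # address -> topics

    def make_message(self, rng, gossip_length=2):
        """Random subset of the view, plus (assumption) the peer's own entry."""
        entries = list(self.view.items())
        message = dict(rng.sample(entries, min(gossip_length, len(entries))))
        message[self.address] = self.topics
        return message

    def merge(self, message, rng):
        """Update the view from a received message, then trim to view_size."""
        for address, topics in message.items():
            if address != self.address:
                self.view[address] = topics
        while len(self.view) > self.view_size:
            del self.view[rng.choice(list(self.view))]   # assumed random eviction


def gossip_round(initiator, peers, rng):
    """One exchange: active behavior at the initiator, passive at the contact."""
    if not initiator.view:
        return
    contact = peers[rng.choice(list(initiator.view))]
    request = initiator.make_message(rng)   # active: send a subset of the view
    reply = contact.make_message(rng)       # passive: answer with own subset
    contact.merge(request, rng)
    initiator.merge(reply, rng)


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propose_friends(peer, threshold=0.5):
    """Friend Matcher: view entries whose similarity exceeds the threshold."""
    return sorted(address for address, topics in peer.view.items()
                  if jaccard(peer.topics, topics) > threshold)


# Made-up peers; views are seeded by hand in place of the FOAF exchange.
rng = random.Random(0)
a = Peer("10.0.0.1", frozenset({"cuisine", "health"}))
b = Peer("10.0.0.2", frozenset({"sports"}))
c = Peer("10.0.0.3", frozenset({"cuisine", "health", "travel"}))
peers = {p.address: p for p in (a, b, c)}
a.view[b.address] = b.topics
b.view[c.address] = c.topics

gossip_round(a, peers, rng)
suggestions = propose_friends(a)
```

After one round, peer `a` has learned about `c` through `b`, and `b` has learned about `a`. Since `a` and `c` share two of three topics, `c` is proposed to `a` as a potential friend, while `b` is not.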

Query Processing

The Query Processing component routes the user’s own queries as well as the queries the user has received from other users. It is also responsible for measuring the similarity between received queries and the documents the user has shared. Query Processing consists of three modules:

  1. Query Routing: when a user sends or receives a query, this module selects, from the user’s view and friends, the users that may be able to serve the query and forwards the query to them.
  2. Process Query: measures the similarity between received queries and the documents in the user’s shared area.
  3. Topic Extractor: computes, using Latent Dirichlet Allocation (LDA), the topics of the documents the user maintains and the topics the user can provide.
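The routing and matching steps can be illustrated with the Python sketch below. The similarity measure is not specified above, so this example assumes cosine similarity over plain word counts. It also assumes that a query carries a set of topics and that a user "may serve" a query when its advertised topics overlap with them. All function names, thresholds, file names, and addresses are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts, using raw word counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def process_query(query, shared_documents, threshold=0.2):
    """Process Query: shared documents similar enough to the query, best first."""
    scored = [(cosine_similarity(query, text), name)
              for name, text in shared_documents.items()]
    return [name for score, name in sorted(scored, reverse=True)
            if score >= threshold]


def route_query(query_topics, known_users):
    """Query Routing: users (from view and friends) whose topics match the query."""
    return sorted(address for address, topics in known_users.items()
                  if topics & query_topics)


# Made-up shared area and view.
shared_documents = {
    "tofu.txt": "vegetarian tofu curry recipe",
    "bike.txt": "mountain bike repair guide",
}
known_users = {
    "10.0.0.2": {"cuisine", "health"},
    "10.0.0.3": {"sports"},
}
matches = process_query("vegetarian curry", shared_documents)
targets = route_query({"cuisine"}, known_users)
```

Here the query matches only the tofu document locally and is forwarded only to the user who advertises the "cuisine" topic. A fuller version would also limit forwarding, for example with a hop counter, so that queries do not circulate indefinitely.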
