Find the perfect match!

Problem being addressed

Many real-world two-sided matching markets are decentralized: no central agency knows everyone’s preferences and can use them to match the two sides of the market. Such a central entity is often infeasible. Furthermore, privacy issues can arise if preferences are confidential.


The researchers train autonomous agents to find suitable matches for themselves using reinforcement learning. They consider the decentralized two-sided stable matching problem, in which each agent may have at most one partner at a time from the opposite set and receives some utility for being matched with a member of that set. The problem is formulated spatially as a grid world environment; because the autonomous agents act independently, the environment is highly uncertain and dynamic. Agents learn their policies separately, using separate training modules. The goal is to train agents to find partners such that the outcome is a stable matching, if one exists, and also a matching with set-equality, meaning that the outcome is approximately equally favorable to agents from both sets.
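To make the two target properties concrete, here is a minimal Python sketch of how a finished matching could be checked. It is not the researchers' code. It assumes that a matching is unstable when some unmatched pair would both strictly gain utility by pairing up (with an unmatched agent receiving utility 0), and it measures set-equality as the absolute difference between the total utilities of the two sets. The function names, the utility tables, and the zero utility for unmatched agents are all illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical utility tables: (agent, partner) -> utility the agent receives.
Utility = Dict[Tuple[str, str], float]
Matching = Dict[str, Optional[str]]  # side-A agent -> side-B partner, or None


def blocking_pairs(match: Matching, util_a: Utility, util_b: Utility,
                   side_a: List[str], side_b: List[str]) -> List[Tuple[str, str]]:
    """Return every pair (a, b) that would both strictly prefer each other
    to their current situation. An empty list means the matching is stable."""
    partner_of_b = {b: a for a, b in match.items() if b is not None}
    pairs = []
    for a in side_a:
        current = match.get(a)
        # Assumption: an unmatched agent receives utility 0.
        cur_a = util_a[(a, current)] if current is not None else 0.0
        for b in side_b:
            if current == b:
                continue
            b_partner = partner_of_b.get(b)
            cur_b = util_b[(b, b_partner)] if b_partner is not None else 0.0
            if util_a[(a, b)] > cur_a and util_b[(b, a)] > cur_b:
                pairs.append((a, b))
    return pairs


def set_equality_gap(match: Matching, util_a: Utility, util_b: Utility) -> float:
    """Absolute difference between the total utility of each set.
    A value near 0 means both sets like the outcome about equally."""
    total_a = sum(util_a[(a, b)] for a, b in match.items() if b is not None)
    total_b = sum(util_b[(b, a)] for a, b in match.items() if b is not None)
    return abs(total_a - total_b)


# Toy instance with two agents per set (made-up utilities).
side_a, side_b = ["a1", "a2"], ["b1", "b2"]
util_a = {("a1", "b1"): 3, ("a1", "b2"): 1, ("a2", "b1"): 2, ("a2", "b2"): 1}
util_b = {("b1", "a1"): 3, ("b1", "a2"): 1, ("b2", "a1"): 2, ("b2", "a2"): 1}

stable = {"a1": "b1", "a2": "b2"}
unstable = {"a1": "b2", "a2": "b1"}

print(blocking_pairs(stable, util_a, util_b, side_a, side_b))    # []
print(blocking_pairs(unstable, util_a, util_b, side_a, side_b))  # [('a1', 'b1')]
print(set_equality_gap(stable, util_a, util_b))                  # 0
```

In the second matching, a1 and b1 would both rather be with each other than with their assigned partners, so the pair blocks the matching and it is not stable.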

Advantages of this solution

The multi-agent reinforcement learning method found stable matchings with set-equality in almost all of the instances in the experiments.

Solution originally applied in these industries


Entertainment Industry

Possible New Application of the Work


Management Sector

In many matching markets, navigating (physically or virtually) to locate and approach a potential match is a crucial task. Several decentralized matching markets, such as worker-employer markets, have locations at which potential partners may meet. Such locations can be physical or online. To incorporate this important aspect of the matching process, the problem can be modeled as a grid world environment in which agents must learn to navigate to other agents in order to form matches.
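The grid world idea can be illustrated with a small Python sketch. This is only an assumed, simplified environment, not the one used in the research: the class name GridWorld, the five move actions, the rule that agents can meet only when they share a cell, and the "worker"/"employer" labels are all hypothetical choices made for illustration. No learning is shown here, only the navigation mechanics an agent's policy would act on.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical action set: (row change, column change).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1),
         "right": (0, 1), "stay": (0, 0)}


@dataclass
class GridWorld:
    size: int                              # the grid is size x size
    positions: Dict[str, Tuple[int, int]]  # agent -> (row, column)
    side: Dict[str, str]                   # agent -> which set it belongs to

    def step(self, agent: str, action: str) -> Tuple[int, int]:
        """Move one agent, keeping it inside the grid boundaries."""
        dr, dc = MOVES[action]
        r, c = self.positions[agent]
        r = min(max(r + dr, 0), self.size - 1)
        c = min(max(c + dc, 0), self.size - 1)
        self.positions[agent] = (r, c)
        return (r, c)

    def candidates(self, agent: str) -> List[str]:
        """Agents from the opposite set standing in the same cell.
        Assumption: only co-located agents can propose a match."""
        here = self.positions[agent]
        return sorted(other for other, pos in self.positions.items()
                      if pos == here and self.side[other] != self.side[agent])


world = GridWorld(
    size=3,
    positions={"w1": (0, 0), "e1": (0, 2)},
    side={"w1": "worker", "e1": "employer"},
)
print(world.candidates("w1"))  # [] (nobody in the same cell yet)
world.step("w1", "right")
world.step("w1", "right")
print(world.candidates("w1"))  # ['e1']
```

In a learning setup, each agent's policy would choose the action passed to step, and a reward could depend on the utility of whichever candidate it ends up matched with.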

Author of original research described in this blitzcard:


Name of the author who conducted the original research that this blitzcard is based on.

Source URL: #############
