(This post is dedicated to Taha and Rıza, my fellow classmates.)
I am trying to survive finals and tomorrow I have a CS final, so I wanted to write about a fun problem in computer science (and maths).
As you have probably noticed, CS people tend to give geeky names to algorithms and methods (we have The Chinese Restaurant Process and Monte Carlo Algorithm, for example). Perfect marriage problem is actually nothing but a funny name for an optimal matching problem.
Let me tell you the problem first: suppose you are the top coder for a celebrity matchmaking website. The company is not very famous yet, so you just have fourteen, heterosexual clients for now, and they tell you their preferences among the opposite sex clients:
Scarlett Johansson: “I would marry Justin Timberlake or Jared Leto. They are both very handsome.”
Jennifer Aniston: “I love Brad Pitt. However, if I marry Orlando Bloom or Justin Theroux I wouldn’t get so sad.”
Angelina Jolie: “Brad Pitt is the love of my life. Please do not match him with anyone else!”
Lindsay Lohan: “It doesn’t really matter to me. Just do not match me with Brad Pitt or Justin Theroux. They are booo-ring.”
Paris Hilton: “Umm, I guess Jared Leto or Leonardo di Caprio would do.”
Demi Moore: “I would marry Brad Pitt or Leonardo di Caprio. I like blondies.”
Miranda Kerr: “Leonardo di Caprio or Orlando Bloom, please.”
(Suppose if Jennifer Aniston, Demi Moore and Angelina Jolie want to marry Brad Pitt, he also wants to marry one of these women.)
(yes, these statements are made up just for procrastination purposes)
(…and yes, I actually googled “celebrities love triangles”.)
It is very convenient to represent this as a bipartite graph with two sets of vertices (women and men):
Before jumping right in to the problem, however, I guess we have to talk about network flow problems. In graph theory, flow network is a directed graph where each edge has a capacity and each edge receives a flow. Amount of flow into a vertex equals amount of flow out of it, unless it is a source (a vertex with only outgoing flow) or a sink (a vertex with only incoming flow). In real life applications, flow network is used to model traffic, water pipes, electric circuits, cargo delivery (actually anything you want that involves something travelling through a network of nodes).
For example, let the flow network below represent the traffic in a road system in Ezgiland (a capacity of 4 cars, because why not).
Here are some of the possible flows:
The black numbers in the graph are the capacities (how much car can be on a road at the same time) and the red ones are the flows (how much car we allow on that road at that time).
Well, you could try every possibility and find the optimum (maximum here) flow but for bigger flow networks (for example, the road system in Istanbul) the trial-and-error method would take ages.
Ford-Fulkerson algorithm to the rescue! In this algorithm, we try to find the augmenting flow chains, the sequences of the edges that forms a path in the network when the directions are ignored and that we can add more flow. After we find a path, we can try to optimize the flow with this algorithm.
Let’s try to improve our worst flow in the road system of Ezgiland:
The blue writings in the form (where the flow came from, additional flow) represent my attempts on finding an augmenting flow chain in this flow network. From the source vertex, I can push 3 more “cars” to V2 so I write (+, +5) near V2. From V2, I can push 3 more cars to V4 but after that I would get stuck since the roads with capacities 7 and 4 are already at their maximum capacity. The same thing happens when I try to push 3 more cars to V1. So what will I do next?
Since I am looking for an augmenting flow chain, I am allowed to backflow, so I continue with the road having capacity 9. Whenever I take away flow, I put a minus sign before the additional flow. While travelling from V2 to V3, I take away 4 cars from the road so I write (V2, -4) near vertex V3. Those 4 cars that I took away from the road, I can add them to the road connected to the sink vertex. This augmented chain has bottleneck capacity of 4 (we started with pushing 5 from the source but ended up with 4 cars at the end). That would increase the total flow to 15 + 4 + 4 = 23, which gives us the optimal solution. You could try to find other augmenting paths but I believe this is the one with the biggest bottleneck capacity (if you find a better solution please let me know).
Since we learned the algorithm, FINALLY, we can go back to our perfect marriage problem.
I am expecting this question from you now: “What does the network flow problem have to do with the perfect marriage problem? The graph we previously had in the matchmaking example doesn’t even have any source or sink vertex, and it is undirected!”
Well… you are right. In fact, in order to use Ford-Fulkerson algorithm in perfect matching problem, we add source and sink vertices and we make the graph directed!
Then, we arbitrarily choose husbands for each female celeb. Scarlett Johansson (SJ) matches with Justin Timberlake (JT), Jennifer Aniston (JA) with Brad Pitt (BP) (…yes I did this on purpose lol), Lindsay Lohan (LL) with Leonardo di Caprio (LC), Paris Hilton (PH) with Jared Leto (JL) and Miranda Kerr (MK) with Orlando Bloom (OB). Then we are left with Angelina Jolie (AJ) and Demi Moore (DM). Although there are two male celebs left, since their dream husbands are matched with others now, they remain single.
Is this the perfect matching? We have to inspect the graph. We cannot directly say “Oh, there are two women and two men left, so there should be a better solution”.
Sometimes we will not find the perfect marriage. There won’t be enough men for every women, or every women will want the same man. Maybe a woman will refuse to marry any of these man even if it’s the end of the world. Angelina Jolie, for example, does not accept that there’s plenty of handsome fish in the sea. She just wants to marry Brad Pitt. If Brad Pitt was not in this set, she would definitely remain single.
Anyway, let’s go back to graphs and network flows. In this graph, every edge has a maximum capacity of 1. If we push a flow of 1 to an edge, this means the endpoints of the edge are getting married. An empty edge means 0 flow.
We apply Ford-Fulkerson algorithm to find the optimal solution. From the source, we first push 1 to AJ to maximize the flow. Then, there is only one edge coming out from AJ, which is going in to BP. Then we do something interesting: since male celebs always have incoming edges, we have to separate BP from his current wife and hope that his wife will not get upset and can marry someone else. BP’s current wife turns out to be JA ( 😦 ) so we send -1 flow from BP to JA (You could think of +1 as a wedding proposal and -1 as filing for divorce). Then, we see that Justin Theroux (JTo) is not taken yet, so we match them.
We managed to solve this one. We still have the problem of matching DM to someone though.
Do not give up. We can do it!
From the source, we again push 1, but this time to DM. We check the men set and we see DM either wants to marry BP or LC. They are both taken, so we definitely have to seperate a couple. However, BP has just got divorced, so we do not want to make him upset again (he will refuse to pay us if we do). We choose to seperate LC from LL. LL said he would nearly marry anyone (thank goodness), so we would match her with Adam Levine (AL), who is currently single.
Aaand we are done! Your clients will live happily ever after (?), the matchmaking company will soon become famous and you will get a raise as the top coder. Congrats!
The final matching looks like this, if you are curious:
Thanks for reading. Wish me luck on my finals!