Shell sort with Algo-rythmics ♫

Whoa, it has been over two months since my last post!

Let’s get this train back on track (choof choof).

So, two weeks ago I was studying for my midterms and I was reminded of an awesome group called AlgoRythmics. They are a bunch of people from a university in Romania that demonstrate the sorting algorithms with various folk dances (mostly Hungarian). Although my midterm was not about sorting or any sort of algorithm at all, I watched all of their videos, totally hypnotized by their nerdiness and awesomeness.  My favorite ones are merge sort, shell sort and quick sort (although I feel like they should’ve chosen a different dance for quick sort – the video is even longer than the one on bubble sort!).

I understood the mechanisms behind all of them at first glance, except shell sort. I haven’t learned it before so I had to watch it more than once (sorry, systems programming midterm)  before I could have a clue of what is going on. That’s why today we’re going to learn how shell sort works.

First things first: please watch this to continue.

for-my-cakeday-i-present-my-skeptical-dog-62998
you watched it, right?

 

By the way, I do not understand what they yell  most of the time but I have some legit guesses. So let’s learn some Hungarian.

  1. öt: five
  2. három: three
  3. Hej, te szomszed!: Hey, you’re next!

…and that’s all of my Hungarian knowledge.

…sooo when a bunch of people turns to camera (like in 0:05) I will assume they are yelling “holeeey”. 

for-my-cakeday-i-present-my-skeptical-dog-62998
sceptical dog is still sceptical

Shell sort works as follows: We compare the n elements that have distance k from each other only, and then decrease k until it becomes 1. There are several ways to decrease the gap. Easiest way is to take k = n/2 and go with k = k/2 in each iteration. In this video, however, they go with “5,3,1” instead of “5,2,1”. I will take a wild guess here: if n/2 evaluates to an odd number, they start with that k and subtract 2 in each iteration. If n/2 is even, they start with k = n/2 – 1 since the algorithm should end with k=1, not k=0 (if you have any other guesses, please let me know by writing a comment below).

Let’s go step by step. This is their first arrangement:

first

This is when they say “holeeey” for the first time:

second

We should compare a[0] with a[5], a[1] with a[6] and so on. If the one on left is larger than the one on right, we should swap them (since we are looking for an increasing order).

(Note that the array elements with the same color of dot are in the same sublist, so swapping will occur only between those elements, if there is any).

third
a[0] – a[5] and a[4] -a[9] are swapped. The others just lift their hands, indicating that no swapping will be done.
This is when they say “holeeey” for the second time (or something like haj mosteve?)

fourth

Now, we are comparing a[0] with a[3], a[1] with a[4] and so on (that’s why they yell something with harow). a[0] – a[3], a[1] – a[4] and a[2] – a[5] dance as pairs but no swapping occur, so a[6] starts to dance with a[3].

fifth

Now be careful! a[6] swapped with a[3], but we still have to check a[0] and a[3] since they have a distance of three.

We check it, and see that no swapping should happen. False alarm. Phew.

We apply the same to the rest:

sixth
after some dancing, a[4] swaps with a[7] and a[6] swaps with a[9].
Then, the dancers face the camera once more:

seventh

Now, since the interval is 1, every element is in the same sublist and every one of them can swap with each other, so we apply basic insertion sort (we try to place the element where it belongs at last).

“Why should I even bother? What’s the point, really?” He thought for a moment.amca.png

“Who says there has to be a point?” he asked.

“Or a reason. Maybe it’s just something you have to do.”

 

You may ask yourself (as I also asked myself) why you even bothered to learn this algorithm since it boils down to insertion sort anyway. Now, imagine that you have a list that is almost in order except for an element or two and they are far away from each other.

deck1

If you performed an insertion sort on this list (or shell sort with gap = 1), you would have to move 3, 4, 5 and 6 to the left of 7, then you would move 2 to the left of 6, 5, 4 and finally 3, a total number of 8 moves. However, if you performed shell sort with gap = 5, you would just swap 2 and 7 and your list would be sorted.

Valla kod yazmadan bırakmam. I am currently learning Python for a course, so I gave it a shot.


def main():
    mylist = [3,0,1,8,7,2,5,4,9,6]
    shellSort(mylist) 
    print mylist

def shellSort(mylist):
    gap = len(mylist)//2 #gap is integer division of the list's length
    if gap % 2 == 0: #if even gap, make it odd
        gap-=1

    while gap > 0:

        for start in range(gap):
            for i in range(start+gap, len(mylist), gap): #start from starting position + gap, go until end, increment by gap in each iteration

                pos = i #current position
                current = mylist[pos]
                
                while pos >= gap and mylist[pos-gap] > current: #if one on the right is less, swap
                    mylist[pos] = mylist[pos-gap]
                    pos -= gap 
            
                mylist[pos] = current
        gap -= 2 #subtract 2 from gap in every iteration

main()

p e a c e   o u t

Advertisements

Build Your Own Maze

So far, I’ve only talked about how to solve problems. Today I will tell you how to create one for a change. You can create mazes as hard as you want and have fun watching your friends try to solve them!

ShiningMaze
mehehehehe

There are many algorithms for creating a maze, of course (♫ You’ll take the high road and I’ll take the low road, and I’ll be in Scotland before you ♫). The algorithm I will show you is called recursive backtracking. It will be easy as a Dr. Oetker pudding recipe once you know what depth-first search (DFS) and a stack means. I will try to explain them as I go on.

We start with a table like this (here, I chose a 4×8 table. You can start your maze with anything you like, it’s not important – unless you start with a 1×1 one).

maze0

We randomly choose a box to start. Then, we again randomly choose an unvisited “neighbor” to step on – either the left, right, up or down one (down-left box is not a neighbor, for example). When we travel from one box to another, we remove the wall between them. We keep repeating this algorithm until all squares are visited.

We have to be careful though – if we step to a box and that box happens to have no unvisited neighbors, we will have to take a step/steps back until everything turns back to normal. We will still mark that box as visited. This is called as DFS since we do not backtrack until we reach the very end.

As we visit boxes for the first time, we will put them in a stack. Stack is a data structure which follows the last in first out (LIFO) principle. A box of Pringles is an example of a

pringles
yup, just like a stack

stack – the last chip you will eat is actually the one which is packed first (we also use “stack” as a term in Ultimate Frisbee – usually the last one in the stack makes the first cut). We dynamically grow or shrink the stack as we continue on. The program terminates when stack is finally empty.

 

‘Nuff said. Let’s start with the box 32. I will color the visited boxes with blue and add them to a stack as I proceed (click on the images pls):

Then, I randomly chose 31. I delete the walls between 30 and 31, and mark 31 as visited as usual.

 

Then…

dustin-hoffman-rain-man-1
Uh-oooh…

We’ve reached a dead end. We cannot proceed from 31 to anywhere since all of its neighbors are already visited. We “pop” 31 from the stack and see if 30 has other neighbors. Fortunately, 30 has 29 as a neighbor so we do not need to backtrack anymore.

 

So far so good…

Let’s continue:

Obviously this maze will start at 32 and end at 1, but this doesn’t change the fact that we need to visit every box in this table, meaning that the big square should be completely blue and the stack should be empty when we finish.

It’s actually very difficult to show these step by step, so I will skip a few steps. I believe you already understood how to proceed anyway:

Again, we’ve reached a dead end. This time, one step back is not enough, so we continue backtracking and popping off the stack until we can proceed with another box.

We still have 13 to visit:

Although we see it clearly from this image, we still need to backtrack from 13 in order to see if any box is unvisited:

Aaaaaaand we-are-done!

finalmaze
yes… I know it’s the easiest maze in the world

You may now think this algorithm is not worth it since our final maze is very easy. However, I used a table of only 4×8 as an example. Also, I explained it step by step which made the creation seem so long. That’s what computers are for: once you implement this algorithm, you will see it’s a very fast algorithm. I will not give a full implementation though (if I feel like it later, I will edit here). I just wanted to show you how mazes are created with computers.

Some day, inshallah mashallah (lol), I will be able to create my own demos. Until that day comes, here is an a-mazing (lolol) animation (warning, may cause addiction):

~~IMPORTANT EDIT~~ My classmate Mahmut made an awesome demo! Please also check that out. You can play with the animation rate and grid size.

So long, folks!

Source:

  1. Check out this awesome demo!
  2. Wikipedia, of course
  3. I love this blog
  4. Easy learn tutorial – implementation

0/1 Knapsack Problem or: How I Learned to Stop Coding and Rob a House

Ahoy!

It’s been over a week since I posted about another dynamic programming problem. I thought I would be posting more on winter break. Sorry about that.

I have usually written about everyday problems in this blog. Again, I will teach you a very useful, daily life application of computer science: efficient stealing.

Here we go:

Suppose you are a thief and you planned to enter a house. You know that there are only four valuable items in the house that weigh 1, 3, 4, 5 kilograms and cost 1000, 2000, 4000 and 4500 dollars, respectively. You are a very sloppy thief, i.e., you did not plan things ahead, so you brought a knapsack with a capacity of only 7 kilograms. What is the maximum total value of goods that you can steal? 

problem
(the vase is probably from Beymen lol)

In the above example, you can easily see that the wisest decision would be stealing 6000$ worth of goods. In some houses though, there will be thousands of objects with very small weights or you will have a bigger knapsack so it will definitely be harder to decide. It’s easier to find the optimum solution with a clever algorithm rather than trying every combination and calculating the values each time.

In this case, our clever algorithm will be dynamic programming.

Remember that the dynamic programming approach is all about dividing the problem into subproblems. As in the stair-climbing problem, we do not jump directly into the problem of fitting the combination of the items with the maximum total value. Instead, we check for capacities less than 7, store the maximum total value up to that capacity and increase the capacity one by one.

Let’s see this on a table:

table

I inserted the items in non decreasing order and filled the “zeroth” column (I will implement this with arrays, that’s why there is a zeroth column) with all zeros since we cannot fit anything to the knapsack if it has zero capacity. Then, we start with only the first item with weight 1 (camera) and we increase the capacity of the knapsack one by one. For each capacity, the best (by best, I mean the maximum total value) we can do with that item is to include that item if it fits. It seems that weight 1 fits to every capacity greater than or equal to 1, so we fill the table according to that:

table2
(the max total value we can get with the camera only)

Then, we consider the second item with weight 3 (laptop) together with the camera. If the capacity of the knapsack is 1 or 2, we can only fit the camera to the knapsack. If the capacity of the knapsack is 3, we can either put the camera or the laptop. We compare their values (we go one row up) and laptop turns out to cost more than the camera, so we choose the laptop. Here comes the trick for capacities bigger than 3 kg: we check for the maximum total value of the goods before adding 3 and add 3 on that value. If we are on column 4, for example, we go up and go 3 columns back and add the value of the laptop to that. If this value we have is bigger than the value when we go one row up, we write that value to that cell.

I know my attempt to explain the algorithm is very confusing, but I am sure that you will understand what I am trying to say:

table3-1
we can fit the item with weight 1 only
table3-2
we choose laptop over camera
table3-3
we check for the maximum before adding the item with weight 3, then choose to add it or not

Next, we include the items with weights 1, 3 and 4. For the capacities less than 4, the best we can do is what we can do without including the item with weight 4 (necklace). Instead of recalculating these, we directly go up one row and write the maximum total values with just including the camera and the laptop. Then, for the capacities greater than or equal to 4, we either do not include item with weight 4 to the solution (go 1 row up and write the value 3000) or we go 4 back, one up, add necklace’s value to that value. For example, for the capacity of 4, we check for maximum total value with capacity of zero and add 4000 to it or we directly write 3000. Since 4000 is greater than 3000, we choose to write 4000.

table4.jpg
(here, blue arrow means directly including the value above. orange arrow means adding 4000 to the value that the arrow comes from)

For the final row, we apply the same steps:

table5

Note that in the last row, we chose to include the value above the 5th column instead of including the one where we go 1 up and 5 back since 5000 is bigger than 0 + 4500 = 4500. Same goes for the 7th column: we chose 6000 over 1000 + 4500 = 5500.

table5.jpg
The value we want to find is the value on bottom right!

From this table, we can also see that, for example, the maximum value of goods we can steal when we have a knapsack with capacity 6 is 5500, with 5 is 5000, with 4 is 4000, and so on.

~~~~END OF TORTURE~~~~

Now, you may think implementation of this algorithm takes a loooooong time. Well, you will be surprised. All of the text above comes down to this formula:


i = current row/item, j = current column/capacity
if(i == 0 || j == 0) 
      TABLE[i][j] = 0  
else if (j < (i-1).weight) 
      TABLE[i][j] = TABLE[i-1][j] 
else 
      TABLE[i][j] = max(TABLE[i-1][j], (i-1).value + TABLE[i-1][j-(i-1).weight])

(For all j, TABLE[0][j] = 0 in the above formula. I did this in order not to manually fill in the first row. That’s also why I am looking for weights and values of i-1 instead of i. I did not show this on the tables, sorry about that.)

TL;DR What we do here is for the capacities less than the weight, we check for the maximum total value we found before, without including that weight. If the capacity is greater than or equal to the weight, we either do not include the weight and directly write the maximum total value we found before that fits that capacity or we include that weight and check for the maximum value subtracting that weight from the capacity and check for the maximum value at that capacity without including that weight.

Easy, right?

Well, maybe it will be easier if you also see a full implementation (in C++ this time):


#include <iostream>
#include <utility>
using namespace std;

int capacity; int numOfItems;

//returns the max value of two integers
int max(int a, int b) { return (a > b) ? a : b; } 

int knapsack_01(pair<int, int>* items){
    int TABLE[numOfItems+1][capacity+1]; //construct the table
    
    for(int i=0; i<numOfItems+1; i++){
        for(int j=0; j<capacity+1; j++){
            if(i == 0 || j == 0) 
                //fill TABLE[0][j]s and TABLE[i][0]s with 0s
                TABLE[i][j] = 0;  
            else if (j < items[i-1].second)
                TABLE[i][j] = TABLE[i-1][j]; 
            else
                TABLE[i][j] = max(TABLE[i-1][j], items[i-1].first + TABLE[i-1][j-items[i-1].second]);
        }
    }
    return TABLE[numOfItems][capacity];
}

int main(){
    capacity = 7;
    numOfItems = 4;
    
    //a pair holds one item
    //items[item].first = value of item, items[item].second = weight of item
    
    pair<int, int> items [numOfItems] = {make_pair(1000, 1), make_pair(2000, 3), make_pair(4000, 4), make_pair(4500, 5)};//array of pairs (items)
    
    cout << knapsack_01(items) << endl;
}

Aaand that’s pretty much it. If you have a computer with you while stealing (…or you can steal one from the house), you can copy and paste the code above to a C++ shell and see the result. You can always change the capacity and the items.

computer_thief.jpg
(formal attire is not required)

(If you get caught in the act, you don’t know me and you didn’t read this post.)

Source:

  1. Similar source code
  2. Tushar Roy strikes again!

Stairs Climbing Puzzle (Dynamic Programming)

 

orig-21315475
(Done with finals!!!)

Yello,

Today, I will introduce you a very efficient algorithm called dynamic programming.

“If it’s just an algorithm, why does it sound like it’s totally a different aspect of programming?

“Why didn’t they give it a boring name like Kruskal’s Algorithm, for example? (no offense)”

The term dynamic programming was originally used by Bellman in 1940. He says he chose the word dynamic because the output characteristics of the solution depend upon time… and simply because it sounds cool. The word programming refers to the use of the method to find an optimal program.

Let’s see the problem first before actually talking about the algorithm (one of the easiest problems):

Suppose you are climbing up a staircase with a friend. Your friend is talking about how much he likes his pet goldfish for the fifth time that day. You do not want to hear that story again, so you want to keep your mind busy with counting how many possible ways you can jump up the stairs if you can just hop 1 or 2 step(s) at a time (you know how many steps the staircase has).

Here are some examples:

Do you see a pattern here? (Hint: It’s a very famous sequence).

The answer is…

(♫♫ drum roll ♫♫)

FIBONACCI SEQUENCE!

So, how does it work?

For n = 0, there is only one way to climb the staircase (which is no way). For n = 1, you could do only one hop of one. For n = 2, you can either directly hop two steps or you can jump from first step to second step with a single hop of one. For n = 3, you first have to reach step 1 somehow and add a hop of 2 steps or you first reach step 2 somehow and add a hop of 1 step. For n = 4, either you reach step 3 and add a hop of one or you reach step 2 and add a hop of two. Note that I did not include reaching step 2 and adding two hops of one since it is already included in reaching step 3 and adding one more step (think about it).

(F(n) is the number of ways to reach the nth step).

We can define our solution recursively from now on:

table

The sequence is exactly the same as Fibonacci except that F(0) = 1 and F(1) = 1 are the base cases here whereas in Fibonacci, F(0)=0 and F(1)=1 are the base cases.

Speaking of implementation, let’s implement this in Java (because I miss coding in Java 😦 )

public class StaircaseProblem {
 
 public long numOfWays(int numOfSteps) {
   if(numOfSteps == 1 || numOfSteps == 0) 
     return 1;
   return numOfWays(numOfSteps-2) + numOfWays(numOfSteps-1);
 }
 
 public static void main(String[] args) {
   StaircaseProblem sp = new StaircaseProblem();
   int steps = 5;
   System.out.println(sp.numOfWays(steps));
 }

}

This simple, naive recursive program gives an output of 8, as it was supposed to do. However, suppose you are climbing to a staircase of 100 steps.

yamadera-risshakuji-temple_miyagi_pref_japan_photo_jnto.jpg
…or Risshakuji Temple in Japan (good luck with your boring friend lol)

If you try this program with numOfSteps = 100, then your program will take strangely long to run.

Why? I said we need to calculate F(4) and F(3) in order to calculate F(5). In order to calculate F(4) we need to calculate F(3) and F(2), and to calculate F(3), we need to calculate F(2) and F(1). Calculating F(4) and F(3) have nothing to do with each other, so we calculate F(3) once for F(4) and once for F(3) itself.

Imagine how many multiple calculations occur in n steps:

fibo
a tree for showing the number of calculations (n=6)

If there was just a way to calculate each subproblem only once…

But wait! There is one!

This is where dynamic programming comes in. In dynamic programming, we compute a subproblem the first time we encounter it. The key is to store the solutions. The next time the subproblem occurs, instead of recomputing its solution, we simply look up the previously computed solution. Instead of time, we use space for storage. The technique of storing solutions instead of recomputing them is called memoization (no it’s not a typo, WordPress, stop underlining it).

fibody.png
eliminating a certain number of calculations using dynamic programming

import java.util.Arrays;
public class StaircaseProblem {

static long[] memoization; //new line of code
 public long numOfWays(int numOfSteps) {
   if(numOfSteps == 1 || numOfSteps == 0) 
     return 1;
 //start of new lines of code
   if (memoization[numOfSteps] != -1) 
     return memoization[numOfSteps]; 
 
   long currentResult = numOfWays(numOfSteps-2) + numOfWays(numOfSteps-1);
   memoization[numOfSteps] = currentResult;
   return currentResult;
 //end of new lines of code
 }
 
 public static void main(String[] args) {
   StaircaseProblem sp = new StaircaseProblem();
   int steps = 5;

   memoization = new long[steps+1]; //new line of code
   Arrays.fill(memoization, -1); //new line of code
   
   System.out.println(sp.numOfWays(steps));
 }

}

The two codes I wrote are basically the same lines of code except some few modifications about storage. However, it had a huge effect on the run time. For n = 100, the dynamic code took between 200000 ns to 300000 ns each time (about 0.25 milliseconds in average) whereas I waited for more than 5 minutes for the naive code to run before terminating!

What if you could hop 1, 2 or 3 steps at a time? You would simply modify the formula as F(n) = F(n-1) + F(n-2) + F(n-3), since you can also start with hopping 3 steps at the same time now.

I guess I will be writing about dynamic programming from time to time since I think it is a very cool algorithm. It makes you use a mathematical approach to any problem in real life, and after finding a pattern and the formula for the problem, it is quite easy to implement.

Hope you enjoyed my first blog post with (serious) source code!

Source:

  1. Where the name comes from
  2. I like this guy’s tutorials

 

Network Flows and Perfect Marriage Problem

(This post is dedicated to Taha and Rıza, my fellow classmates.)

Hello!

I am trying to survive finals and tomorrow I have a CS final, so I wanted to write about a fun problem in computer science (and maths).

As you have probably noticed, CS people tend to give geeky names to algorithms and methods (we have The Chinese Restaurant Process and Monte Carlo Algorithm, for example). Perfect marriage problem is actually nothing but a funny name for an optimal matching problem.

Let me tell you the problem first: suppose you are the top coder for a celebrity matchmaking website. The company is not very famous yet, so you just have fourteen, heterosexual clients for now, and they tell you their preferences among the opposite sex clients:

Scarlett Johansson: “I would marry Justin Timberlake or Jared Leto. They are both very handsome.”

Jennifer Aniston: “I love Brad Pitt. However, if I marry Orlando Bloom or Justin Theroux I wouldn’t get so sad.”

Angelina Jolie: “Brad Pitt is the love of my life. Please do not match him with anyone else!”

Lindsay Lohan: “It doesn’t really matter to me. Just do not match me with Brad Pitt or Justin Theroux. They are booo-ring.”

Paris Hilton: “Umm, I guess Jared Leto or Leonardo di Caprio would do.”

Demi Moore: “I would marry Brad Pitt or Leonardo di Caprio. I like blondies.”

Miranda Kerr: “Leonardo di Caprio or Orlando Bloom, please.”

(Suppose if Jennifer Aniston,  Demi Moore and Angelina Jolie want to marry Brad Pitt, he also wants to marry one of these women.)
(yes, these statements are made up just for procrastination purposes)
(…and yes, I actually googled “celebrities love triangles”.)

It is very convenient to represent this as a bipartite graph with two sets of vertices (women and men):

marriage
(seems like a complicated one!)

Before jumping right in to the problem, however, I guess we have to talk about network flow problems. In graph theory,  flow network is a directed graph where each edge has a capacity and each edge receives a flow. Amount of flow into a vertex equals amount of flow out of it, unless it is a source (a vertex with only outgoing flow) or a sink (a vertex with only incoming flow). In real life applications, flow network is used to model traffic, water pipes, electric circuits, cargo delivery (actually anything you want that involves something travelling through a network of nodes).

For example, let the flow network below represent the traffic in a road system in Ezgiland (a capacity of 4 cars, because why not).

flow-network.png

Here are some of the possible flows:

The black numbers in the graph are the capacities (how much car can be on a road at the same time) and the red ones are the flows (how much car we allow on that road at that time).

Well, you could try every possibility and find the optimum (maximum here) flow but for bigger flow networks (for example, the road system in Istanbul) the trial-and-error method would take ages.

Ford-Fulkerson algorithm to the rescue! In this algorithm, we try to find the augmenting flow chains, the sequences of the edges that forms a path in the network when the directions are ignored and that we can add more flow. After we find a path, we can try to optimize the flow with this algorithm.

Let’s try to improve our worst flow in the road system of Ezgiland:

The blue writings in the form (where the flow came from, additional flow) represent my attempts on finding an augmenting flow chain in this flow network. From the source vertex, I can push 3 more “cars” to V2 so I write (+, +5) near V2. From V2, I can push 3 more cars to V4 but after that I would get stuck since the roads with capacities 7 and 4 are already at their maximum capacity. The same thing happens when I try to push 3 more cars to V1. So what will I do next?

Since I am looking for an augmenting flow chain, I am allowed to backflow, so I continue with the road having capacity 9. Whenever I take away flow, I put a minus sign before the additional flow. While travelling from V2 to V3, I take away 4 cars from the road so I write (V2, -4) near vertex V3. Those 4 cars that I took away from the road, I can add them to the road connected to the sink vertex. This augmented chain has bottleneck capacity of 4 (we started with pushing 5 from the source but ended up with 4 cars at the end). That would increase the total flow to 15 + 4 + 4 = 23, which gives us the optimal solution. You could try to find other augmenting paths but I believe this is the one with the biggest bottleneck capacity (if you find a better solution please let me know).

Since we learned the algorithm, FINALLY, we can go back to our perfect marriage problem.

I am expecting this question from you now: “What does the network flow problem have to do with the perfect marriage problem? The graph we previously had in the matchmaking example doesn’t even have any source or sink vertex, and it is undirected!”

Well… you are right. In fact, in order to use Ford-Fulkerson algorithm in perfect matching problem, we add source and sink vertices and we make the graph directed!

marriageford.png

Then, we arbitrarily choose husbands for each female celeb. Scarlett Johansson (SJ) matches with Justin Timberlake (JT), Jennifer Aniston (JA) with Brad Pitt (BP) (…yes I did this on purpose lol), Lindsay Lohan (LL) with Leonardo di Caprio (LC), Paris Hilton (PH) with Jared Leto (JL) and Miranda Kerr (MK) with Orlando Bloom (OB). Then we are left with Angelina Jolie (AJ) and Demi Moore (DM). Although there are two male celebs left, since their dream husbands are matched with others now, they remain single.

Is this the perfect matching? We have to inspect the graph. We cannot directly say “Oh, there are two women and two men left, so there should be a better solution”.

Sometimes we will not find the perfect marriage. There won’t be enough men for every women, or every women will want the same man. Maybe a woman will refuse to marry any of these man even if it’s the end of the world. Angelina Jolie, for example, does not accept that there’s plenty of handsome fish in the sea. She just wants to marry Brad Pitt. If Brad Pitt was not in this set, she would definitely remain single. marriageford1

Anyway, let’s go back to graphs and network flows. In this graph, every edge has a maximum capacity of 1. If we push a flow of 1 to an edge, this means the endpoints of the edge are getting married. An empty edge means 0 flow.

We apply Ford-Fulkerson algorithm to find the optimal solution. From the source, we first push 1 to AJ to maximize the flow. Then, there is only one edge coming out from AJ, which is going in to BP. Then we do something interesting: since male celebs always have incoming edges, we have to separate BP from his current wife and hope that his wife will not get upset and can marry someone else. BP’s current wife turns out to be JA ( 😦 ) so we send -1 flow from BP to JA (You could think of +1 as a wedding proposal and -1 as filing for divorce). Then, we see that Justin Theroux (JTo) is not taken yet, so we match them.

marriageford2

 

marriageford3

We managed to solve this one. We still have the problem of matching DM to someone though.

Do not give up. We can do it!

From the source, we again push 1, but this time to DM. We check the men set and we see DM either wants to marry BP or LC. They are both taken, so we definitely have to seperate a couple. However, BP has just got divorced, so we do not want to make him upset again (he will refuse to pay us if we do). We choose to seperate LC from LL. LL said he would nearly marry anyone (thank goodness), so we would match her with Adam Levine (AL), who is currently single. marriageford4

 

Aaand we are done! Your clients will live happily ever after (?), the matchmaking company will soon become famous and you will get a raise as the top coder. Congrats!

The final matching looks like this, if you are curious:

marriagefinal.png
(Scarlett Johansson – Justin Timberlake, Jennifer Aniston – Justin Theroux, Angelina Jolie – Brad Pitt, Lindsay Lohan – Adam Levine, Paris Hilton – Jared Leto, Demi Moore – Leonardo di Caprio, Miranda Kerr – Orlando Bloom)

 

Thanks for reading. Wish me luck on my finals!

Source:

  1. Tutorial (a very good one)
  2. Perfect Marriage and Maximum Flow
  3. Maximum Flow Problem

 

“Draw Without Lifting Pencil” Puzzles (Euler Paths & Circuits)

envelope
Shape 1

Remember this shape? I bet you could tell me the answer without me asking the question. Most of us solved this puzzle million times when we were at primary school.

Let me ask the question again: how can you draw this shape without tracing the same line twice and without taking the pencil off the paper?

I will talk about the mathematical approach to this problem in this post, which includes graph theory (so it would be very nice if you go and read the basics of graph theory before continuing, although it is not that necessary since I assume you will understand the general concept).

For this shape, you probably memorized the answer by doing it so many times. You know where to start and where to finish. However, suppose you see a shape like this:

main-qimg-29f049440bad4571da709d94211bcaf3
Shape 2

Now we have to modify our question a little: can you draw this?

We can convert the shape into a graph by assigning vertices to intersections and assigning edges to the lines between the vertices. Let’s try this with our popular shape:

envelope

(There are ten edges in this graph: 1-2, 1-3, 2-3, 2-4, 2-5, 3-4, 3-6, 4-5, 4-6 and 5-6.)

Now we are looking for a path (or a cycle) in the graph that visits every edge exactly once. This problem was solved by the famous mathematician Euler in 1736 and is considered to be the beginning of the graph theory. The problem is often referred as an Euler path or Euler circuit problem. An Euler path starts and ends at different vertices, whereas an Euler circuit starts and ends at the same vertex.

For an Euler path P, for every vertex v other than the endpoints, the path enters v the same number of times it leaves v (what goes in must come out). Then, there should be twice this number of edges having v as an endpoint (try to visualize this: -*-, where asterisk is a vertex which has one entrance = one exit, and to which one times two edges are connected).  Therefore, every v should have an even degree (even number of edges should be connected to v).

Now suppose P starts at vertex p and ends at vertex q. Then P should leave p one more time than it enters, and enter q one more time than it leaves. This makes degrees of p and q even degree minus one, therefore, the endpoints of P should have odd degrees (odd number edges should be connected to v).

Then the conclusion  for P is this:

“If a graph has an Euler path, then it must have exactly two odd vertices.”

For an Euler circuit C, the starting point must be the same with the endpoint, so C enters the endpoint the same number of times it leaves it, which makes it a vertex of even degree.

Then the conclusion for C is this:

“If a graph has an Euler circuit, then all of its vertices must be even vertices.”

Let’s apply these on our examples. In shape 1, vertices 1, 2, 3 and 4 all have even degrees of two, four, four and four, respectively. Vertices 5 and 6 both have odd degrees of three. Then, this graph has at least one Euler path but it does not have any Euler circuit. In shape 2, there are four vertices of odd degree and one vertex of even degree, so it does not have any Euler path or Euler circuit.

Question: Is it possible for a graph to have both an Euler path and an Euler circuit? 
Answer: No. (I think you can figure this out by yourself. I trust you!)

Now we have to focus on our main question: if a path/circuit exists, how do we find it?

For small graphs, you could try every possibility, of course, but in real life applications there will surely be graphs with thousands or millions of vertices and trial-and-error method will take so much time even with a computer program.

A systematic approach would be Fleury’s Algorithm. In this algorithm, our motto is this old proverb: Don’t burn your bridges behind you (it also exists in Turkish: Gectigin kopruleri yakma. I like the English version better though). In graph theory, bridge is the only edge which connects two separate sections of the graph. Removing this edge from the graph would make it disconnected.

Bridges_800

To find an Euler path/circuit in a graph:

  • Make sure it has one.
  •  If you are looking for an Euler path, start from any odd vertex. Else, start from any vertex.
  • A non-bridge ALWAYS has priority over a bridge. ALWAYS CHOOSE THE NON-BRIDGE.
  • Delete the edge that you have traversed.
  • If you do not have any edges left, stop.

Let’s see an example (I am always taking the images from the other sites in this post but I will start with a different vertex, I promise).

bridge.png

In this graph, vertices A, B, C, D, E and F are all even, so we will find an Euler circuit. We could start with any vertex, say B. (1) Then we would proceed again with any one of them, say A. From A, there is no bridge so we can safely (2) travel to D. Then from D, we would either (3) travel to F or C (say C) BUT DEFINITELY NOT B since we would get stuck at B, we would burn the bridge (…ne gemiler yaktim). Then we could proceed with (4) A or E (say A) (not F!), then with (5) E, then (6) C, (7) F, (8) D, (9) B, and since there is no edge left we would finally stop.

(Notice that we started and ended with vertex B, as we were supposed to do.)

If we go back to shape 1, it does not matter if you start from 5 or 6 since the solutions are mirror images of each other. There are 44 possible solutions if you start from vertex 5 and there are only 10 ways to lose.

HausVomNikolaus.svg.png
Ways to win this game (Do not forget to say das ist das Haus vom Ni-ko-laus while trying these at home!)

Since there are so many solutions to this, let’s see an example of a failure. If we follow the path 6-5-4-3-6-4-2 and delete all the edges that we travel, the final graph would look like this:

envelope

If we continue with vertex 5, which is a bridge, we would get stuck there, so we would have to lift our pencil to draw this shape.

I hope you are not bored yet. If you are, though, here is a small exercise for you. The question is this: can you draw the shape below without tracing the same line twice and without taking the pencil off the paper?

path

I hear you say “No, since it neither contains zero nor two odd vertices.”

Well, the answer is…

YES!!!! You could draw it!

“WHAT?! But how? I trusted you, Ezgi!”

Here’s how:

(Well, he did lift his hand… So Euler and I are not liars after all…)

Source:

  1. math.ku.edu
  2. ncsu.edu
  3. Haus vom Nikolaus
  4. Data Structures and Algorithm Analysis in C++, Mark A. Weiss, Fourth Edition