Shell sort with Algo-rythmics ♫

Whoa, it has been over two months since my last post!

Let’s get this train back on track (choof choof).

So, two weeks ago I was studying for my midterms and I was reminded of an awesome group called AlgoRythmics. They are a bunch of people from a university in Romania that demonstrate the sorting algorithms with various folk dances (mostly Hungarian). Although my midterm was not about sorting or any sort of algorithm at all, I watched all of their videos, totally hypnotized by their nerdiness and awesomeness.  My favorite ones are merge sort, shell sort and quick sort (although I feel like they should’ve chosen a different dance for quick sort – the video is even longer than the one on bubble sort!).

I understood the mechanisms behind all of them at first glance, except shell sort. I haven’t learned it before so I had to watch it more than once (sorry, systems programming midterm)  before I could have a clue of what is going on. That’s why today we’re going to learn how shell sort works.

First things first: please watch this to continue.

you watched it, right?


By the way, I do not understand what they yell  most of the time but I have some legit guesses. So let’s learn some Hungarian.

  1. öt: five
  2. három: three
  3. Hej, te szomszed!: Hey, you’re next!

…and that’s all of my Hungarian knowledge.

…sooo when a bunch of people turns to camera (like in 0:05) I will assume they are yelling “holeeey”. 

sceptical dog is still sceptical

Shell sort works as follows: We compare the n elements that have distance k from each other only, and then decrease k until it becomes 1. There are several ways to decrease the gap. Easiest way is to take k = n/2 and go with k = k/2 in each iteration. In this video, however, they go with “5,3,1” instead of “5,2,1”. I will take a wild guess here: if n/2 evaluates to an odd number, they start with that k and subtract 2 in each iteration. If n/2 is even, they start with k = n/2 – 1 since the algorithm should end with k=1, not k=0 (if you have any other guesses, please let me know by writing a comment below).

Let’s go step by step. This is their first arrangement:


This is when they say “holeeey” for the first time:


We should compare a[0] with a[5], a[1] with a[6] and so on. If the one on left is larger than the one on right, we should swap them (since we are looking for an increasing order).

(Note that the array elements with the same color of dot are in the same sublist, so swapping will occur only between those elements, if there is any).

a[0] – a[5] and a[4] -a[9] are swapped. The others just lift their hands, indicating that no swapping will be done.
This is when they say “holeeey” for the second time (or something like haj mosteve?)


Now, we are comparing a[0] with a[3], a[1] with a[4] and so on (that’s why they yell something with harow). a[0] – a[3], a[1] – a[4] and a[2] – a[5] dance as pairs but no swapping occur, so a[6] starts to dance with a[3].


Now be careful! a[6] swapped with a[3], but we still have to check a[0] and a[3] since they have a distance of three.

We check it, and see that no swapping should happen. False alarm. Phew.

We apply the same to the rest:

after some dancing, a[4] swaps with a[7] and a[6] swaps with a[9].
Then, the dancers face the camera once more:


Now, since the interval is 1, every element is in the same sublist and every one of them can swap with each other, so we apply basic insertion sort (we try to place the element where it belongs at last).

“Why should I even bother? What’s the point, really?” He thought for a moment.amca.png

“Who says there has to be a point?” he asked.

“Or a reason. Maybe it’s just something you have to do.”


You may ask yourself (as I also asked myself) why you even bothered to learn this algorithm since it boils down to insertion sort anyway. Now, imagine that you have a list that is almost in order except for an element or two and they are far away from each other.


If you performed an insertion sort on this list (or shell sort with gap = 1), you would have to move 3, 4, 5 and 6 to the left of 7, then you would move 2 to the left of 6, 5, 4 and finally 3, a total number of 8 moves. However, if you performed shell sort with gap = 5, you would just swap 2 and 7 and your list would be sorted.

Valla kod yazmadan bırakmam. I am currently learning Python for a course, so I gave it a shot.

def main():
    mylist = [3,0,1,8,7,2,5,4,9,6]
    print mylist

def shellSort(mylist):
    gap = len(mylist)//2 #gap is integer division of the list's length
    if gap % 2 == 0: #if even gap, make it odd

    while gap > 0:

        for start in range(gap):
            for i in range(start+gap, len(mylist), gap): #start from starting position + gap, go until end, increment by gap in each iteration

                pos = i #current position
                current = mylist[pos]
                while pos >= gap and mylist[pos-gap] > current: #if one on the right is less, swap
                    mylist[pos] = mylist[pos-gap]
                    pos -= gap 
                mylist[pos] = current
        gap -= 2 #subtract 2 from gap in every iteration


p e a c e   o u t


0/1 Knapsack Problem or: How I Learned to Stop Coding and Rob a House


It’s been over a week since I posted about another dynamic programming problem. I thought I would be posting more on winter break. Sorry about that.

I have usually written about everyday problems in this blog. Again, I will teach you a very useful, daily life application of computer science: efficient stealing.

Here we go:

Suppose you are a thief and you planned to enter a house. You know that there are only four valuable items in the house that weigh 1, 3, 4, 5 kilograms and cost 1000, 2000, 4000 and 4500 dollars, respectively. You are a very sloppy thief, i.e., you did not plan things ahead, so you brought a knapsack with a capacity of only 7 kilograms. What is the maximum total value of goods that you can steal? 

(the vase is probably from Beymen lol)

In the above example, you can easily see that the wisest decision would be stealing 6000$ worth of goods. In some houses though, there will be thousands of objects with very small weights or you will have a bigger knapsack so it will definitely be harder to decide. It’s easier to find the optimum solution with a clever algorithm rather than trying every combination and calculating the values each time.

In this case, our clever algorithm will be dynamic programming.

Remember that the dynamic programming approach is all about dividing the problem into subproblems. As in the stair-climbing problem, we do not jump directly into the problem of fitting the combination of the items with the maximum total value. Instead, we check for capacities less than 7, store the maximum total value up to that capacity and increase the capacity one by one.

Let’s see this on a table:


I inserted the items in non decreasing order and filled the “zeroth” column (I will implement this with arrays, that’s why there is a zeroth column) with all zeros since we cannot fit anything to the knapsack if it has zero capacity. Then, we start with only the first item with weight 1 (camera) and we increase the capacity of the knapsack one by one. For each capacity, the best (by best, I mean the maximum total value) we can do with that item is to include that item if it fits. It seems that weight 1 fits to every capacity greater than or equal to 1, so we fill the table according to that:

(the max total value we can get with the camera only)

Then, we consider the second item with weight 3 (laptop) together with the camera. If the capacity of the knapsack is 1 or 2, we can only fit the camera to the knapsack. If the capacity of the knapsack is 3, we can either put the camera or the laptop. We compare their values (we go one row up) and laptop turns out to cost more than the camera, so we choose the laptop. Here comes the trick for capacities bigger than 3 kg: we check for the maximum total value of the goods before adding 3 and add 3 on that value. If we are on column 4, for example, we go up and go 3 columns back and add the value of the laptop to that. If this value we have is bigger than the value when we go one row up, we write that value to that cell.

I know my attempt to explain the algorithm is very confusing, but I am sure that you will understand what I am trying to say:

we can fit the item with weight 1 only
we choose laptop over camera
we check for the maximum before adding the item with weight 3, then choose to add it or not

Next, we include the items with weights 1, 3 and 4. For the capacities less than 4, the best we can do is what we can do without including the item with weight 4 (necklace). Instead of recalculating these, we directly go up one row and write the maximum total values with just including the camera and the laptop. Then, for the capacities greater than or equal to 4, we either do not include item with weight 4 to the solution (go 1 row up and write the value 3000) or we go 4 back, one up, add necklace’s value to that value. For example, for the capacity of 4, we check for maximum total value with capacity of zero and add 4000 to it or we directly write 3000. Since 4000 is greater than 3000, we choose to write 4000.

(here, blue arrow means directly including the value above. orange arrow means adding 4000 to the value that the arrow comes from)

For the final row, we apply the same steps:


Note that in the last row, we chose to include the value above the 5th column instead of including the one where we go 1 up and 5 back since 5000 is bigger than 0 + 4500 = 4500. Same goes for the 7th column: we chose 6000 over 1000 + 4500 = 5500.

The value we want to find is the value on bottom right!

From this table, we can also see that, for example, the maximum value of goods we can steal when we have a knapsack with capacity 6 is 5500, with 5 is 5000, with 4 is 4000, and so on.


Now, you may think implementation of this algorithm takes a loooooong time. Well, you will be surprised. All of the text above comes down to this formula:

i = current row/item, j = current column/capacity
if(i == 0 || j == 0) 
      TABLE[i][j] = 0  
else if (j < (i-1).weight) 
      TABLE[i][j] = TABLE[i-1][j] 
      TABLE[i][j] = max(TABLE[i-1][j], (i-1).value + TABLE[i-1][j-(i-1).weight])

(For all j, TABLE[0][j] = 0 in the above formula. I did this in order not to manually fill in the first row. That’s also why I am looking for weights and values of i-1 instead of i. I did not show this on the tables, sorry about that.)

TL;DR What we do here is for the capacities less than the weight, we check for the maximum total value we found before, without including that weight. If the capacity is greater than or equal to the weight, we either do not include the weight and directly write the maximum total value we found before that fits that capacity or we include that weight and check for the maximum value subtracting that weight from the capacity and check for the maximum value at that capacity without including that weight.

Easy, right?

Well, maybe it will be easier if you also see a full implementation (in C++ this time):

#include <iostream>
#include <utility>
using namespace std;

int capacity; int numOfItems;

//returns the max value of two integers
int max(int a, int b) { return (a > b) ? a : b; } 

int knapsack_01(pair<int, int>* items){
    int TABLE[numOfItems+1][capacity+1]; //construct the table
    for(int i=0; i<numOfItems+1; i++){
        for(int j=0; j<capacity+1; j++){
            if(i == 0 || j == 0) 
                //fill TABLE[0][j]s and TABLE[i][0]s with 0s
                TABLE[i][j] = 0;  
            else if (j < items[i-1].second)
                TABLE[i][j] = TABLE[i-1][j]; 
                TABLE[i][j] = max(TABLE[i-1][j], items[i-1].first + TABLE[i-1][j-items[i-1].second]);
    return TABLE[numOfItems][capacity];

int main(){
    capacity = 7;
    numOfItems = 4;
    //a pair holds one item
    //items[item].first = value of item, items[item].second = weight of item
    pair<int, int> items [numOfItems] = {make_pair(1000, 1), make_pair(2000, 3), make_pair(4000, 4), make_pair(4500, 5)};//array of pairs (items)
    cout << knapsack_01(items) << endl;

Aaand that’s pretty much it. If you have a computer with you while stealing (…or you can steal one from the house), you can copy and paste the code above to a C++ shell and see the result. You can always change the capacity and the items.

(formal attire is not required)

(If you get caught in the act, you don’t know me and you didn’t read this post.)


  1. Similar source code
  2. Tushar Roy strikes again!

Stairs Climbing Puzzle (Dynamic Programming)


(Done with finals!!!)


Today, I will introduce you a very efficient algorithm called dynamic programming.

“If it’s just an algorithm, why does it sound like it’s totally a different aspect of programming?

“Why didn’t they give it a boring name like Kruskal’s Algorithm, for example? (no offense)”

The term dynamic programming was originally used by Bellman in 1940. He says he chose the word dynamic because the output characteristics of the solution depend upon time… and simply because it sounds cool. The word programming refers to the use of the method to find an optimal program.

Let’s see the problem first before actually talking about the algorithm (one of the easiest problems):

Suppose you are climbing up a staircase with a friend. Your friend is talking about how much he likes his pet goldfish for the fifth time that day. You do not want to hear that story again, so you want to keep your mind busy with counting how many possible ways you can jump up the stairs if you can just hop 1 or 2 step(s) at a time (you know how many steps the staircase has).

Here are some examples:

Do you see a pattern here? (Hint: It’s a very famous sequence).

The answer is…

(♫♫ drum roll ♫♫)


So, how does it work?

For n = 0, there is only one way to climb the staircase (which is no way). For n = 1, you could do only one hop of one. For n = 2, you can either directly hop two steps or you can jump from first step to second step with a single hop of one. For n = 3, you first have to reach step 1 somehow and add a hop of 2 steps or you first reach step 2 somehow and add a hop of 1 step. For n = 4, either you reach step 3 and add a hop of one or you reach step 2 and add a hop of two. Note that I did not include reaching step 2 and adding two hops of one since it is already included in reaching step 3 and adding one more step (think about it).

(F(n) is the number of ways to reach the nth step).

We can define our solution recursively from now on:


The sequence is exactly the same as Fibonacci except that F(0) = 1 and F(1) = 1 are the base cases here whereas in Fibonacci, F(0)=0 and F(1)=1 are the base cases.

Speaking of implementation, let’s implement this in Java (because I miss coding in Java 😦 )

public class StaircaseProblem {
 public long numOfWays(int numOfSteps) {
   if(numOfSteps == 1 || numOfSteps == 0) 
     return 1;
   return numOfWays(numOfSteps-2) + numOfWays(numOfSteps-1);
 public static void main(String[] args) {
   StaircaseProblem sp = new StaircaseProblem();
   int steps = 5;


This simple, naive recursive program gives an output of 8, as it was supposed to do. However, suppose you are climbing to a staircase of 100 steps.

…or Risshakuji Temple in Japan (good luck with your boring friend lol)

If you try this program with numOfSteps = 100, then your program will take strangely long to run.

Why? I said we need to calculate F(4) and F(3) in order to calculate F(5). In order to calculate F(4) we need to calculate F(3) and F(2), and to calculate F(3), we need to calculate F(2) and F(1). Calculating F(4) and F(3) have nothing to do with each other, so we calculate F(3) once for F(4) and once for F(3) itself.

Imagine how many multiple calculations occur in n steps:

a tree for showing the number of calculations (n=6)

If there was just a way to calculate each subproblem only once…

But wait! There is one!

This is where dynamic programming comes in. In dynamic programming, we compute a subproblem the first time we encounter it. The key is to store the solutions. The next time the subproblem occurs, instead of recomputing its solution, we simply look up the previously computed solution. Instead of time, we use space for storage. The technique of storing solutions instead of recomputing them is called memoization (no it’s not a typo, WordPress, stop underlining it).

eliminating a certain number of calculations using dynamic programming

import java.util.Arrays;
public class StaircaseProblem {

static long[] memoization; //new line of code
 public long numOfWays(int numOfSteps) {
   if(numOfSteps == 1 || numOfSteps == 0) 
     return 1;
 //start of new lines of code
   if (memoization[numOfSteps] != -1) 
     return memoization[numOfSteps]; 
   long currentResult = numOfWays(numOfSteps-2) + numOfWays(numOfSteps-1);
   memoization[numOfSteps] = currentResult;
   return currentResult;
 //end of new lines of code
 public static void main(String[] args) {
   StaircaseProblem sp = new StaircaseProblem();
   int steps = 5;

   memoization = new long[steps+1]; //new line of code
   Arrays.fill(memoization, -1); //new line of code


The two codes I wrote are basically the same lines of code except some few modifications about storage. However, it had a huge effect on the run time. For n = 100, the dynamic code took between 200000 ns to 300000 ns each time (about 0.25 milliseconds in average) whereas I waited for more than 5 minutes for the naive code to run before terminating!

What if you could hop 1, 2 or 3 steps at a time? You would simply modify the formula as F(n) = F(n-1) + F(n-2) + F(n-3), since you can also start with hopping 3 steps at the same time now.

I guess I will be writing about dynamic programming from time to time since I think it is a very cool algorithm. It makes you use a mathematical approach to any problem in real life, and after finding a pattern and the formula for the problem, it is quite easy to implement.

Hope you enjoyed my first blog post with (serious) source code!


  1. Where the name comes from
  2. I like this guy’s tutorials



It seems like it was only yesterday when I drew my first “for loop shape” in my introductory Java class and felt like a GODDESS. 

Now, here I am, doing the same kind of stuff in C++ and still having fun (…not feeling a bit ashamed).

Anyway, just copy and paste the code here (or just click here) and see what happens!

(If you are on a mobile device, click here in order to have a meaningful shape).

#include <iostream>
using namespace std;

int main(){
  int size;
  cout << "What do you want the size of your \"shape\" to be?";
  cout << " (pls no funny input) ";   cin >> size;
  while(size < 2 || size > 50){
    cout << "I SAID NO FUNNY INPUT!!" << endl;
    cout << "What do you want the size of your \"shape\" to be?";
    cout << " (pls no funny input) ";     cin >> size;
      int i;
      for(i=0; i<size-1; i++)
        cout << " ";
      cout << "**" << endl;
      for(i=0; i<size; i++){
        int j;
        for(j=0; j<size-1-i; j++)
            cout << " ";
        for(j=0; j<i+1; j++)
            cout << "/";
        for(j=0; j<i+1; j++)
            cout << "\\";
        cout << endl;    
      for(i=0; i<size-1; i++){
          cout << " ";
      cout << "||" << endl;
  return 0;