Thursday, July 18, 2013

Solving Knapsack problem using Dynamic Programming

This is post is basically for solving the Knapsack problem, very famous problem in optimization community, using dynamic programming. But remember this problem can be solved using various approaches with different complexities, but here I shall talk about only dynamic programming, specifically bottom-up  approach. So lets first talk about what is Knapsack problem for the people who are unfamiliar with it. The Knapsack problem states that “Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible”. Lets consider a concrete example , suppose you have a Knapsack and you have lots of books and articles. You are planning to go somewhere in your vacation and you want to keep some important books and articles with you. The Knapsack has fix capacity so you cannot keep all the books and articles that you got inside the knapsack.
Now your job is to select only those books and articles, whose total capacity is less than or equal to that of Knapsack, so that you get maximum valuable books and articles inside your knapsack. In this case the weight of item can be taken as real weight of the books and articles and value of the item can be taken as how important those articles and books for you. And also the Knapsack can support only fix weight items. This is the theoretical explanation, mathematically the problem can be defined as follows

\[ \text {maximize} \sum_{i \in 1 ... j} v_ix_i \\
\text {subject to} \sum_{i \in 1 ... j}w_ix_i \le k,\;x_i \in \{0,1\}, \; i \in \{1...j\} \]
Where vi is value of item i, j is total number of items, xi is binary number indicating whether you take item i or not. If you take item i then value of xi will be 1 otherwise 0. wi is the weight of item i and k is the capacity of Knapsack. So far I gave you the theoretical and mathematical background of the problem, now I talk about programming model. To illustrate this, here is an example
weight    value
2             16
3             19
4             23
5             28
Total number of item = 4
Capacity of Knapsack = 7
Here in this example we have total 4 items with different weights and values. The total value of the Knapsack is 7 so the total weight of the items must be less than or equal to 7. Mathematically this can be written as
\[ \text {maximize}  16x_1 + 19x_2 + 23x_3 + 28x_4  \\
\text {subject to}  2x_1 + 3x_2 + 4x_3 + 5x_4 \le 7  \\
x_i \in \{0,1\}, \; i \in \{0...4\} \]

In bottom up approach what we do is we compute the recursive equations bottom up. i.e. we start with a zero items and find out what is the best we can do when we have 0 items to work with. And we take 1 item and find out what is the best we can do when we have 1 item to work with. We perform this up to j items which is 4 in our case. We create a big table with capacity along one side(row) and items on other side (column) and fill that table step by step. First we take 0 items and try to optimize the Knapsack of size 0, then we take again 0 items and try to optimize for size 1. We perform this up to the total capacity of Knapsack which 7 in our case. We we will have capacity +1 rows in the table. Similarly we take 1 item and try to optimize the Knapsack of size 0 to capacity and so on up to total number of items. This makes total no of times + 1 columns in table. Now we have table with size (capacity +1) along rows and (no of items + 1) along column. In our example this is 8 x 5. This may look like this
First, we take 0 item. Since there is no item we cannot do any optimization. So the best optimization we get will be
Second, we take 1 item with weight = 2 and value = 16. Since the weight is 2 we need Knapsack with size at least 2. We cannot keep the item with Knapsack of size 0 and 1. So after taking one item the table will look like.
Third, we take two items. First item is one of given above and the next items is of size = 3 and value =  19. To keep this item we need Knapsack of size at least 3. Since total size of these two items is 5 the Knapsack of size at least 5 can hold both items. So the table for 2 items look like.
Similarly for 3 items.
and last for all 4 items
So the best optimization value is 44. Python Script for bottom-up dynamic programming is given below
bestvalues = [[0] * (capacity + 1) for i in range(1,2)]
    for i in range(1,itmes+1):
     for j in range(1,capacity+1):
      if weights[i-1] > j:
       bestvalues[i][j] = bestvalues[i-1][j]
       candidate1 = bestvalues[i-1][j]
       candidate2 = bestvalues[i-1][j-weights[i-1]] + values[i-1]
       bestvalues[i][j] = max(candidate1,candidate2)
This only gives the best values. But to determine which items are actually taken to make this optimized value we must backtrack the table. This can be done using following python script
reconstruction = []
    i = items
    j = capacity
    while i > 0:
        if bestvalues[i][j] != bestvalues[i - 1][j]:
            j -= weights[i-1]
        i -= 1


  1. This article has helped me to understand original article from I am saying this as a programmer with over 8 years of coding in various languages. Thanks for the python`s sample of code. Helped a lot