JOIN
 Select a Forum     Round Tables New Member Discussions News Discussions Algorithm Matches Marathon Matches NASA Tournament Lab TopCoder Cookbook High School Matches Sponsor Discussions Development Forums Design Forums Search | Watch Thread  |  My Post History  |  My Watches  |  User Settings View: Flat (newest first)  | Threaded  | Tree Previous Thread  |  Next Thread Forums TopCoder Cookbook Algorithm Competitions - New Recipes 2.4.? Commonly used DP state domains [ 1 2 ]    NEXT >
 2.4.? Commonly used DP state domains | Reply Problem:The most creative part of inventing dynamic programming solution is defining recurrent relations. The recurrent relations consist of two parts: state domain and transitions. State domain is a set of states (subproblems) in dynamic programming. For each state the subresult will be calculated eventually. Transitions are the relations between different states which help calculate the subresults.This recipe covers frequently used state domain types. The general approaches of dealing with them and real SRM examples are given. Also few optimizations specific to particular domains are mentioned here.SolutionCode of DP solution usually contains an array representing subresults on the state domain. For example, classic knapsack problem solution will be like:``` int maxcost[items+1][space+1]; memset(maxcost, -63, sizeof(maxcost)); //fill with negative infinity maxcost[0][0] = 0; //base of DP for (int i = 0; imc to state (ni,nj)->nmc   ni = i + 1; //forward transition: do not add i-th item nj = j; nmc = mc; if (maxcost[ni][nj] < nmc) //relaxing result for new state maxcost[ni][nj] = nmc;   ni = i + 1; //forward transition: add i-th item nj = j + size[i]; nmc = mc + cost[i]; if (nj <= space && maxcost[ni][nj] < nmc) maxcost[ni][nj] = nmc; } int answer = -1000000000; //getting answer from state results for (j = 0; j<=space; j++) if (maxcost[items][j] > answer) answer = maxcost[items][j]; return answer; ```Here (i,j) is state of DP with result equal to maxcost[i][j]. The result here means the maximal cost of items we can get by taking some of first i items with overall size of exactly j. So the set of (i,j) pairs and concept of maxcost[i][j] here comprise a state domain. The forward transition is adding or not adding the i-th item to the set of items we have already chosen.The order of iterations through all DP states is important. The code above iterates through states with pairs (i,j) sorted lexicographically. It is correct since any transition goes from set (i,*) to set (i+1,*), so we see that i is increasing by one. Speaking in backward (recurrent) style, the result for each state (i,j) directly depends only on the results for the states (i-1,*).To determine order or iteration through states we have to define order on state domain. We say that state (i1,j1) is greater than state (i2,j2) if (i1,j1) directly or indirectly (i.e. through several other states) depends on (i2,j2). This is definition of order on the state domain used. In DP solution any state must be considered after all the lesser states. Else the solution would give incorrect result.Multidimensional arrayThe knapsack DP solution described above is an example of multidimensional array state domain (with 2 dimensions). A lot of other problems have similar state domains. Generally speaking, in this category states are represented by k parameters: (i1, i2, i3, ..., ik). So in the code we define a multidimensional array for state results like: int Result[N1][N2][N3]...[Nk]. Of course there are some transition rules (recurrent relations). These rules themselves can be complex, but the order of states is usually simple.In most cases the states can be iterated through in lexicographical order. To do this you have to ensure that if I = (i1, i2, i3, ..., ik) directly depends on J = (j1, j2, j3, ..., jk) then I is lexicographically greater that J. This can be achieved by permuting parameters (like using (j,i) instead of (i,j)) or reversing them. But it is usually easier to change the order and direction of nested loops. Here is general code of lexicographical traversion:``` for (int i1 = 0; i1 jres by performing transitions //and handle them } ```Note: changing order of DP parameters in array and order of nested loops can noticably affect performance on modern computers due to CPU cache behavior.This type of state domain is the easiest to understand and implement, that's why most DP tutorials show problems of this type. But it is not the most frequently used type of state domain in SRMs. DP over subsets is much more popular.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply Subsets of a given setThe problems of this type has some set X. The number of elements in this set is small: less than 20. The idea of DP solution is to consider all subsets of X as state domain. Often there are additional parameters. So generally we have state domain in form (s,a) where s is a subset of X and "a" represents additional parameters.Consider TSP problem as an example. The set of cities X={0, 1, 2, ..., N-1} is used here. State domain will have two parameters: s and a. The state (s,a)->R means that R is the shortest path from city 0 to city "a" which goes through all the vertices from subset s exactly once. The transition is simply adding one city v to the end of path: (s,a)->R turns into (s+{v},v)->R + M[a,v]. Here M[i,j] is distance between i-th and j-th city. Any hamiltonian cycle is a path which goes through each vertex exactly once plus the edge which closes the cycle, so the answer for TSP problem can be computed as min(R[X,a]+M[a,0]) among all vertices "a".It is very convenient to encode subsets with binary numbers. Look recipe "Representing sets with bitfields" for detailed explanation.The state domain of DP over subsets is usually ordered by set inclusion. Each forward transition adds some elements to the current subset, but does not subtract any. So result for each state (s,a) depends only on the results of states (t,b) where t is subset of s. If state domain is ordered like this, then we can iterate through subsets in lexicographical order of binary masks. Since subsets are usually represented with binary integers, we can iterate through all subsets by iterating through all integers from 0 to 2^N - 1. For example in TSP problem solution looks like:``` int res[1< nr) //relax result for state (ns,na) with nr res[ns][na] = nr; } } int answer = 1000000000; //get TSP answer for (int a = 0; a < N; a++) answer = min(answer, res[(1<
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply Subtrees(vertices) of a given rooted treeThe problem involves a rooted tree. Sometimes a graph is given and its DFS search tree is used. Some sort of result can be calculated for each subtree. Since each subtree is uniquely identified by its root, we can treat DP over subtrees as DP over vertices. The result for each non-leaf vertex is determined by the results of its immediate children.The DP over subtree has a state domain in form (v,a) where v is a root of subtree and "a" may be some additional parameters. states are ordered naturally be tree order on vertices. Therefore the easiest way to iterate through states in correct order is to launch DFS from the root of tree. When DFS exits from a vertex, its result must be finally computed and stored in global memory. The code generally looks like:``` bool vis[N]; //visited mark for DFS res[N]; //DP result array   void DFS(int v) { //visit v-rooted subtree recursively vis[v] = true; //mark vertex as visited res[v] = ???; //initial result, which is final result in case v is leaf for (int i = 0; i sons; for (int i = 0; iKi. Since the sum of all Ki is N, you will get O(N*A) space.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply Layer count + layer profileThis is the toughest type of DP state domain. It is usually used in tiling or covering problems on special graphs. The classic examples are: calculate number of ways to tile the rectangular board with dominoes (certain cells cannot be used); or put as many chess figures on the chessboard as you can so that they do not hit each other (again, some cells may be restricted).Generally speaking, all these problems can be solved with DP over subsets (use set of all cells of board). DP with profiles is an optimization which exploits special structure in this set. The board we have to cover/tile is represented as an array of layers. We try to consider layers one by one and store partial solutions after each layer. In simple rectangular board case layer is one row of the board. The profile is a subset of cells in current row which are already tiled.The state domain has form (k,p) where k is number of fully processed layers and p is so-called profile of solution. Profile is the necessary information about solution in layers that are not fully processed yet. The transitions go from (k,p) to (k+1,q) where q is some new profile. The number of transitions for each state is usually large, so they all are iterated through by recursive search, sometimes with pruning. The search has to find all the ways to increase the partial solution up to the next layer.The example code below calculates the number of way to fully cover empty cells on the given rectangular board with dominoes.```int res[M+1][1<
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply Examples:InformFriendsWe are asked to find assignment man->fact which maximizes the number of facts with constraint that everyone must know all the facts. In the optimal assignment people can be divided into fact-groups. Each fact-group consists of people who are told the same fact. And each fact-group must be able to tell everybody else about the fact.Let's precalculate for each subset of people whether they can become a fact-group. The subset can be a fact-group if set of all thier friends united with them is the whole set. After possible fact-groups are calculated, we have to determine maximal number of non-intersecting fact-groups in the set of people. We can define state domain (s)->R where s is subset of people and R is problem answer for them. To determine the answer for state s we have to subtract one of its fact-groups. It is a subset of s and forms a fact-group. So we can iterate through all subsets of s and try them as fact-groups.``` n = matr.size(); int i, j, u; for (i = 0; i<(1<0; j = (j-1)&i) if (cover[j]) //iterate through all fact-group subsets if (res[i] < res[i^j] + 1) res[i] = res[i^j] + 1; //relax the result for i if (ans < res[i]) ans = res[i]; //relax the global answer } return ans; ```Breaking strings (ZOJ 2860)This problem is solved with DP over substrings. Let's enumerate all required breaks and two ends of string with numbers 0, 1, 2, ..., k. Then res[L,R] will be result for the substring which starts in L-th point and ends in R-th point. To get this result we should look through all possible middle points M and consider res[L][M] + res[M][R] + (x[R]-x[L]) as a result. By doing this we get a clear O(k^3) solution (which is TL).What makes this problem exceptional is the application of Knuth's optimization. This trick works only for optimization DP over substrings for which optimal middle point depends monotonously on the end points. Let mid[L,R] be the first middle point for (L,R) substring which gives optimal result. It can be proven that mid[L,R-1] <= mid[L,R] <= mid[L+1,R] - this means monotonicity of mid by L and R. If you are interested in a proof, read about optimal binary search trees in Knuth's "The Art of Computer Programming" volume 3 binary search tree section.Applying this optimization reduces time complexity from O(k^3) to O(k^2) because with fixed s (substring length) we have m_right(L) = mid[L+1][R] = m_left(L+1). That's why nested L and M loops require not more than 2k iterations overall.``` for (int s = 0; s<=k; s++) //s - length(size) of substring for (int L = 0; L+s<=k; L++) { //L - left point int R = L + s; //R - right point if (s < 2) { res[L][R] = 0; //DP base - nothing to break mid[L][R] = l; //mid is equal to left border continue; } int mleft = mid[L][R-1]; //Knuth's trick: getting bounds on M int mright = mid[L+1][R]; res[L][R] = 1000000000000000000LL; for (int M = mleft; M<=mright; M++) { //iterating for M in the bounds only int64 tres = res[L][M] + res[M][R] + (x[R]-x[L]); if (res[L][R] > tres) { //relax current solution res[L][R] = tres; mid[L][R] = M; } } } int64 answer = res[0][k]; ```
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply BlockEnemySince tree is given in the problem statement, we should try DP on subtrees first=) Given any correct solution for the whole tree, any subtree of it is also correct. So if all pairs of occupied towns are separated, then in each subtree they are also separated. So we can try to use state domain (v)->R where v represents subtree and R represents minimal effort to solve the problem for this subtree. But we'll discover soon that connecting subtrees correctly is impossible because we need to know whether there is an occupied town connected with outside part of subtree. We call such a solution for subtree dangerous. Now we add the boolean parameter (solution is safe/dangerous) in the state domain. Let res[v][t] be the minimal effort needed to get a correct solution for v-subtree which is dangerous(if d=1)/safe(if d=0).Initially we consider edge which is going out of subtree indestructible. We will handle the destruction of outgoing edge later. In this case, the solution for leaves is easy to obtain. If v is non-occupied leaf, then it forms a safe solution, and dangerous solution is impossible to get. If it is occupied, then the solution is dangerous with no effort, and impossible to make safe. Then if the vertex v is not leaf, we add its sons one by one. When adding a son s to v-subtree, we have the following merging rules:1. v=safe + s=safe -> v=safe2. v=dangerous + s=safe -> v=dangerous3. v=safe + s=dangerous -> v=dangerous4. v=dangerous + s=dangerous -> incorrect solutionAfter merging by these rules, we receive a solution for v-subtree in case of indestructible outgoing edge. Now what changes if the outgoing edge is destructible? We can destruct it by paying additional effort. In this case a dangerous solution turns into safe one. The root of tree has no outgoing edge.The problem answer is minimal effort of safe and dangerous solutions for the root ot DFS tree. The time complexity is O(N^2) because adjacency matrix is used, but can be easily reduced to O(N) with neighbors lists.```int res[MAXN][2];   void DFS(int v, int f) { //traverse v-subtree, f is father of v vis[v] = true; //mark v as visited res[v][0] = (occ[v] ? 1000000000 : 0); //result in case of leaf res[v][1] = (occ[v] ? 0 : 1000000000); for (int s = 0; s= 0 && res[v][0] > res[v][1] + matr[v][f]) //we can destroy upgoing edge (v-f) res[v][0] = res[v][1] + matr[v][f]; } ... DFS(0, -1); //run DFS from 0 (with no father) return min(res[0][0], res[0][1]); //whether root is dangerous does not matter } ```
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply ConstructionFromMatchesWe need to construct grid from matches obeying certain rules which have local effect. And the grid is 2 x n which is ideal for DP with profiles. Of course, it is better to slice the grid to 2-cell layers instead of N-cell ones. The profile consists of thicknesses of the last two vertical sticks. They are required to check sum when filling the next layer. To perform the transition, we have to iterate through all possible variants of next five sticks thicknesses and choose only variants that satisfy two equations on cell sums. If we perform it bruteforce, we will get O(N*K^7) algorithm, that's slow. It can be easily optimized if we calculate thicknesses of two sticks explicitly by using cell sum equations. This way the DP will be O(N*K^5) in time and O(N*K^2) in space. That is more than enough for the problem.If you want a better solution, you can notice that:1. You can use "storing two layers" optimization and reduce space complexity to O(K^2).2. You can transform the solution to DP with broken profiles. To do it you should add intermediate "half" states (i,p,b,v). hres[i][p][b][v] means minimal cost of full construction of (2*i+1) cells - i layers and a top cell in the next layer. The DP with broken profiles will require O(N*K^4) time and O(N*K^3) space, which can be reduced with technique 1 to O(K^3).The code for simple solution is given below. Since width of profile is well-known and very small, it is easier to write transition code as nested loops instead of recursive search.```//...?? aa //schema of profile and transition //... u p //u and v comprise profile //... u p //a, b, c, p, q are five added sticks //...?? bb //system of equations is: //... v q //{u + a + p + b = top[i] //... v q //{v + b + q + c = bottom[i] //...?? cc //the depicted transition leads to (i+1,p,q) state   memset(res, 63, sizeof(res)); //DP base for (int u = 0; u= k) continue; //though it can be bad... for (int q = 0; q= k) continue;   int nres = cres + cost[p] + cost[q] //the new solution cost + cost[a] + cost[b] + cost[c]; if (res[i+1][p][q] > nres) //relaxing the destination res[i+1][p][q] = nres; } } }   int answer = 1000000000; //the last two sticks do not matter for (int u = 0; u res[n][u][v]) answer = res[n][u][v]; if (answer >= 1000000000) answer = -1; //do not forget "impossible" case ```
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply GameWithGraphAndTreeThis problem is solved by very tricky DP on the tree and subsets. We are required to find number of mappings of the tree on the graph. First of all, we choose root of the tree because it is easier to handle rooted tree. Clearly, we should consider all possible submapping of all subtrees on all vertex-subsets of the graph. The number of these submapping is huge, so we have to determine which properties of these submappings are important for extending the mapping. It turns out that these properties are:1. Subtree denoted by its root vertex v. Necessary to check the outgoing edge mapping later.2. Vertex of graph p which is the image of v. Again: necessary to check the mapping of added edge.3. The full image of v-subtree in graph - set s of already mapped vertices in graph. Necessary to maintain bijectivity of mapping.Therefore we define state domain (v,p,s)->GR. GR is number of submappings with the properties we are interested in. To combine results of sons in tree we need to run another "internal" DP. Remember that internal DP is local for each vertex v of the tree. The first parameter will be number of sons already merged - this is quite standard. Also we'll use additional parameters p and s inside. The state domain is (k,p,s)->IR where IR is the number of submappings of partial v-subtree on graph with properties:1. The vertex v and subtrees corresponding to its first k sons are being mapped (called domain).2. Image of v is vertex p in graph.3. The full image of mapping considered is s - subset of already used vertices.The transition of this internal DP is defined by adding one subtree corresponding to k-th son to the domain of mapping. For example, if w is the k-th son, then we add global state GR[w,q,t] to internal state IR[k,p,s] and get internal state IR[k+1,p,s+t]. Here we must check that there is an edge p-q in the graph and that sets s and t have no common elements. The combinations considered in GR[w,q,t] and IR[k,p,s] are independent, so we can just add their product to the destination state. The answer of internal DP is IR[sk,p,s] which is stored as a result GR[k,p,s] of global DP.This is correct solution of this problem. Unfortunately, it runs in O(4^N * N^3) if implemented like it is in the code below. Of course it gets TL. You need to optimize the solution even further to achieve the required performance. The recipe "Optimizing DP solution" describes how to get this solution accepted.```int gres[MAXN][MAXN][SIZE]; //global DP on subtrees: (v,p,s)->GR int ires[MAXN][MAXN][SIZE]; //internal DP: (k,p,s)->IR   void DFS(int v) { //solve DP for v subtree vis[v] = true; vector sons; for (int u = 0; u p for (int k = 0; k q; w-subtree -> t subset; if (s & t) continue; //do not break bijectivity add(ires[k+1][p][s^t], mult(ires[k][p][s], gres[w][q][t])); } //add product of numbers to solution } } memcpy(gres[v], ires[sk], sizeof(ires[sk])); //since partial v-subtree with k=sk is full v-subtree } //we have GR[v,p,s] = IR[sk,p,s] ... DFS(0); //launch DFS from root = 0-th vertex int answer = 0; //consider all variants for i - image of 0 for (int i = 0; i
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply This recipe is great, in my opinion, considering there's a few interesting tutorials about dp in the web (not just knapsack and LCS but with some more content) if any. Sought for one some time ago (especially wanted to read about dp over subsets) and now I have a chance to read it. I mean it's good not only for Cookbook but for me so thank you very much.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply Thank you! This is the most useful recipe I have read so far. Perhaps it will finally help me to understand DP and move to yellow...
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply For the multidimensional array, I don't understand what does Result[i1][i2][i3]...[ik] represent in terms of the Knapsack problem? Does it mean the maximum cost that we can obtain by using i1 number of items 1, i2 number of items2, ..., ik number of k-th item? If so, then it is no longer 0-1 Knapsack, but more like unbounded Knapsack. By the way, it will be great if you can use the Knapsack example for all the representation types.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply For the subsets of a given set you should mention that M[x,y] is the distance between cities x and y. You should use M in the code (instead of matr). You should explain the reasoning behind R[X,a]+M[a,0]: R[x,a] is the shortest path that visits every node once starting from node 0, so R[X,a]+M[a,0] is the shortest Hamilton cycle for X.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply For substrings of a given string. Can you avoid using 'l' in the code, because it looks too similar to one. You can use L and R instead. What is "tres=?;"? What is a in this case and is it actually used in the code?
 Re: 2.4.? Commonly used DP state domains (response to post by dimkadimon) | Reply 1. The Result[i1,i2,...,ik] has nothing is common with knapsack. It is array of results for general problem with multidim-like state domain. Knapsack was an example of such a problem with dim=2.2. I added explanation for M matrix and about the way answer is got. About M and matr - it is usual thing. Mathematics is used to short variable names (1-2 letters) whereas in programming short names is bad habbit. I have a strong feeling against naming a matrix with one letter, sorry=) By the way, didn't you notice that state result is denoted as R in text but as res in code?...3. Changed l,m,r to L,M,R. (l+1) looks really weird=) Question mark represents something that depends on the problem. It is usually neutral element i.e. 0 in combinatoric problem, +inf in minimization problem, -inf in maximization problem. I don't want to explain it here and there may be exceptions. That's why I just put question mark. Look one of examples and you'll see 10^18 in this place.a represents additional parameters. It must be explained somewhere in the recipe=) I added a note on this to the first place it is met.I don't like that additional parameters are represented by letter "a". Because it is english article also... I've enclosed parameter "a" in quotes in most places in text so that the do not confuse reader.
 Re: 2.4.? Commonly used DP state domains (response to post by syg96) | Reply This is a great post! Thanks for sharing it and congrats you are a smart guy!
 Forums TopCoder Cookbook Algorithm Competitions - New Recipes 2.4.? Commonly used DP state domains Previous Thread  |  Next Thread [ 1 2 ]    NEXT >