Apparatus and method for efficiently partitioning a weighted array6324540Abstract A method and apparatus for determining an approximated solution to the partitioning of a two or greater dimensional array in an amount of time on the order of O(1). Given a desired maximum weight or a maximum number of partitions, an array of data is partitioned by determining a temporary division of the array of blocks such that each block has a weight of at least two times the desired maximum weight. Within each block, rectangles are determined which contain array elements greater than or equal to an arbitrary value corresponding to a guess at a maximum weight. From among these rectangles, a largest set of independent rectangles is determined, and partitions are defined based on the sides of the largest set of independent rectangles determined for each block. Select ones of the partitions may be deleted until a total number of defined partitions is equal to or less than a desired maximum number of partitions. Claims What is claimed is: Description BACKGROUND OF THE INVENTION
REFERENCE RESULT
Grigni & Manne NP-Hardness
Charikar et al. APX-Hardness
Halldorsson & Manne O(p) approximation
There is a need to improve the speed of an approximated solution for an array either given the desired maximum weight of any block or given the desired number of partitions, for a two and greater dimensional array, and/or to strike an improved balance between greater speed and increased accuracy. SUMMARY OF THE INVENTION A method and apparatus for partitioning a two-dimensional array of data based on a weight given to any one block in the array is provided. A temporary division of the array into a plurality of blocks is determined such that each of the blocks has a weight of at least approximately two times a desired maximum weight. Within each block, rectangles containing array elements greater than or equal to an arbitrary value are determined. Then, from among these rectangles a largest set of rectangles which are independent is determined for each block, and partitions of the array are defined corresponding to sides of the largest set of rectangles which are independent for each block. BRIEF DESCRIPTION OF THE DRAWINGS Features and advantages of the present invention will become apparent to those skilled in the art from the following description with reference to the drawings, in which: FIG. 1 shows a method for efficiently partitioning a weighted array in accordance with the principles of the present invention. DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS Given a desired maximum weight value .delta. and using the present invention to partition a solution to the two dimensional .delta.-weight problem, one can easily obtain an O((log n).sup.2) approximation to an optimal partitioning using an additive weight function F. However, the present invention provides an increased accuracy of the solution in a shortened amount of time. This improved accuracy is achieved by recognizing an inherent connection and balance between "independent" blocks having greater weight within the array, and the time required to achieve an optimal solution. With suitable pre-processing of the two-dimensional array, a locally optimal collection of independent blocks can be used to generate a solution which is at most a multiple or linear factor away from the optimal solution. Conventional methods require a much longer amount of time relating to the square root of the number of partitions p, i.e., on the order of O(p). Two Dimensional .delta.-weight Partitioning Using Hamming Weight Function H.sub.c In geometry, a rectangle is considered to be "stabbed" by a line if the line passes through the interior of the rectangle. Stated as a problem, this is: Given a set of axis-parallel rectangles in the [1, n].times.[1, n] two dimensional integer grid, determine a set R of grid rows and C of grid columns such that each rectangle is stabbed by one of the rows in R or one of the columns in C and furthermore, S=max{.vertline.R.vertline., .vertline.C.vertline.} is minimized. The present invention provides a polynomial-time O(log n) approximation for solving the stabbing problem. That is, we give a polynomial algorithm that finds a set R of rows and set C of columns which stab the given set of rectangles and max{.vertline.R.vertline., .vertline.C.vertline.} is O(s* log n), where s* is the optimum solution. Our solution reduces the problem to the well-known "set cover" problem. The "set cover" problem can be stated as: For each (i,j), 1.ltoreq.i,j .ltoreq.n, Sij denotes the set of rectangles that are stabbed by either the grid row i or the grid column j. Let R' denote the given collection of rectangles and S' denote the collection of sets S.sub.ij, 1.ltoreq.i,j .ltoreq.n. Find C'.epsilon.S such that .orgate.{S.sub.ij.vertline.S.sub.ij.epsilon.C'}=R' Such a C' is a set cover for R'. Consider a solution R={r.sub.1, r.sub.2, . . . , (r.sub..vertline.R.vertline. } of grid rows and C={c.sub.1, C.sub.2, . . . , C.sub..vertline.C.vertline. } of grid columns that stab the given set of rectangles. Without loss of generality, let .vertline.R.vertline..ltoreq..vertline.C.vertline.. Let C'={S.sub.r1c1, S.sub.r2c2, . . . , S.sub.r.vertline.R.vertline.C.vertline.R.vertline., S.sub.r.vertline.R.vertline.c.vertline.R.vertline.+1, . . . S.sub.r.vertline.R.sub.c.vertline.} Since each rectangle is stabbed by one of the grid rows in R or the grid columns in C, C' covers each rectangle in R'. Furthermore, .vertline.C'.vertline.=.vertline.C.vertline.=max{.vertline.R.vertline.,.ve rtline.C.vertline.}. In the opposite direction, say C' is a set cover for R'. Then there is a set of grid rows and columns of maximum size .vertline.C'.vertline. that stab the given set of rectangles. We construct these grid rows and columns for the rectangles as follows. Let R be the set of all row indices in C', that is, R={i.vertline.S.sub.ij.epsilon.C', 1.ltoreq.j.ltoreq.n} Similarly, C={i.vertline.S.sub.ij.epsilon.C', 1.ltoreq.i.ltoreq.n}. Clearly max{.vertline.R.vertline., .vertline.C.vertline.}.ltoreq..vertline.C'.vertline.. Since each rectangle in R' is in a set S.sub.ii for some Sij .epsilon.C', it is hit by either the grid row i or the grid column j; therefore, it is stabbed by the grid rows in R or the grid columns in C. Combining those two arguments, it follows that the minimum sized set cover gives the optimum solution to the stabbing problem. Standard results on the set cover problem give a polynomial time algorithm with an approximation of O(log maxi.sub.ij.vertline.S.sub.ij.vertline.)=O(log n) (since there are at most n.sup.4 grid rectangles in any set, and in fact overall). This in turn gives an O(log n) approximation to the stabbing problem. Theorem 2: There exists a polynomial time O(log n) factor approximation for the two dimensional .delta.-weight partition problem using a Hamming weight function H.sub.c. Proof. This problem is reduced to the stabbing problem as discussed above. Consider the collection of all possibly overlapping minimal rectangles where the additive weight function F value of each rectangle is >.delta.; rectangles are minimal in the sense that if two rectangles have an additive weight function F value >.delta. and one is contained in the other, we retain the smaller one. Now the two dimensional .delta.-weight partition problem is precisely the stabbing problem for which a O(log n) factor approximation exists. Two Dimensional p.times.p Partitioning Using Additive Weight Function F Grigni and Manne [GM96] have shown that the two dimensional p.times.p partition problem using an additive weight function F is NP-Complete. In this section, we present a polynomial time heuristic which provides a O(1) factor approximation to an optimum two dimensional p.times.p partitioning. The following Lemma 3 is important to our arguments. Lemma 3: Let c and d be two positive integers, c, d.ltoreq.k. If there exists a k.times.k partitioning such that MAX norm of the blocks is B using an additive weight function F, then there exists a k/c.times.k/d partitioning with MAX norm<cdB using the additive weight function F. Proof. Consider a k.times.k partitioning with MAX norm B and take every cth row as well as every dth column. The maximum F value of a block of this k/c.times.k/d partitioning is at most cdB since each new block contains cd of the previous blocks. This lemma can be combined with the observation that Theorem 2 holds for two dimensional .delta.-weight partition problem using an additive weight function F as well, to get the following. Theorem 3: There exists a polynomial time O(log.sup.2 n)-approximation for the one dimensional p.times.p partition problem using an additive weight function F. Proof. Our approach is geometric. Let B* be the optimal solution to the two dimensional partitioning problem. Consider a collection of all possibly overlapping minimal rectangles where the f value of each rectangle is .gtoreq.B; rectangles are minimal in the sense that if two rectangles have f value .gtoreq.B and one is contained in the other, we retain the smaller one. If such a set of rectangles have a stabbing, then B*.ltoreq.B. On the other hand, if there does not exist a stabbing, then B*.gtoreq.B. Thus it suffices to find the minimum B for which a stabbing exists. This observation combined with Lemma 3 implies the following: if we solve the stabbing problem with c-approximation for B =1,2,4, . . . (with p/c.times.p/c partitioning) and so on, and return the smallest B where the stabbing is possible, then that value of B is .gtoreq.B* and .ltoreq.C.sup.2 B*. (If fewer than p rows or p columns are used, we can always add arbitrary row or column partitions to pad up the numbers). Since Theorem 2 holds using an additive weight function F as well, we may pick c=O(log n). The stabbing problem is then solved O(log((log n).sup.2 B*)) times each of which requires constructing the appropriate rectangles (trivially O(n.sup.4) time) and solving the stabbing problem (trivially O(n.sup.4 polylog n) time). Since B*.ltoreq.n.sup.2 M where M is the maximum value of any array entry, the whole algorithm works in time polynomial in the input size. The main result is a substantially improved approximation method which computes a solution to within an O(1) factor approximation. Let a <W,I> partition be a I.times.I partition such that the MAX norm of the blocks is at most W. We will now show that given an input instance for which a <W,I> partition exists, we can construct in polynomial time a <O(W),I> partition. The basic idea behind our algorithm is the notion of independent rectangles: Definition 1: Two axis-parallel rectangles are said to be independent if there projections are disjoint along both the x-axis and the y-axis. Clearly, no single horizontal or vertical line can stab a pair of independent rectangles. So if an array has a <W,I> partition, then it may contain at most 2I independent rectangles of weight strictly greater than W. As a result, independent rectangles constitute a useful tool in establishing a lower bound on the optimal solution value. The method presented below builds on this idea to construct a partition whose cost is O(W). An Improved Method Let W be the optimal solution value. We assume a knowledge of this value in the presentation below--this value will be determined by performing a binary search over the interval [0,.SIGMA..sub.i,j A[i,j]] Observe that W.gtoreq.max.sub.ij A[i,j]. A method of solving a two dimensional partitioning problem in accordance with the principles of the present invention is now described with reference to FIG. 1. The goal of this method is given a predetermined number of partitions p, minimize the maximum weight of the elements contained within the partitions. In FIG. 1, the first step 100 assumes a value for the weight W, e.g., determined empirically by the practitioner. This initial guess is but a starting point. Then, initial partitions are placed, e.g., an I.times.I partition of the array is obtained, such that each row or column within any block in the partition has a weight of at most 2W. This can done by performing independent horizontal and vertical scans. During the horizontal scan, we keep a running sum of the weight of each row since the most recent vertical partition and set down the next vertical partition when the weight of any one of the rows exceeds W. Likewise, we set horizontal partitions based on running sums of the weights of columns during the vertical scan. Since each time a new column (row, respectively) is considered, the weight of the rows (columns, respectively) can increase by at most W, it follows that the weight of any row (column, respectively) within any block induced by the vertical and horizontal partitions does not exceed 2W. Henceforth we consider the array with this I.times.I partition which we refer to as the partition P. Thus, as a result of this first step, the initial partitions are such that any block has at most two times the desired weight W. or 2W. In the second step 200, for each block formed in step 100, all possible rectangles having a weight just greater than the desired weight W are determined for each block. Thus, a set S of all minimal rectangles whose weight exceeds Wand which are entirely contained within the blocks induced by the partition from step 100 are built. A rectangle is considered minimal if there does not exist another rectangle properly contained in it with weight larger than W. This can be done by starting from each location within a block and considering rectangles with their top left corner at that location in turn in the order of increasing sides until all minimal rectangles of weight strictly greater than Ware discovered. In the third step 300, Definition 1 is important, i.e., defining independent rectangles. For each of the rectangles determined for each partitioned block, the maximum number of independent rectangles are determined. Thus, a local 3-optimal set M .OR right. S of independent rectangles is determined. M is a local 3-optimal set if there does not exist i .epsilon.{1, 2, 3} independent rectangles in S-M which can be added to M by removing at most (i-1) rectangles from M without violating the independence condition. Such a set can be easily constructed in polynomial time by repeatedly performing swaps which increase the size of the current independent collection. Each swap takes polynomial time and the procedure terminates in polynomial time since any independent collection can have at most O(n) rectangles. In the fourth step 400, tentative final partitions are formed on the basis of the maximum number of independent rectangles determined in step 400. For each independent rectangle, the left, right, top and bottom sides are extended through the array in the horizontal or vertical direction to form partitions. Thus, in more detail another partition is based on M. For each rectangle in M, we set two straddling horizontal and two straddling vertical partitions so as to induce that rectangle. In all, this introduces at most 2M horizontal and 2M vertical partitions. The partition P from step 100 together with this partition induced by rectangles in M is our new partition now. In the fifth step 500, if two many partitions remain after the performance of step 400, then partitions are removed in the fifth step 500 until only the desired number of partitions remain. For instance, every other partition may be removed in the vertical and horizontal directions and the resulting partitioning tested to determine if it meets the given criteria. Thus, in more detail, a partition of the input array now remains which uses h.ltoreq.2M+I horizontal lines and v.ltoreq.2M+I vertical lines. To get an I.times.I partition from this, we simply retain only every .left brkt-top.hII.right brkt-top.th horizontal line and only every .left brkt-top.vII.right brkt-top.th vertical line. By Lemma 3, this increases the maximum block weight by at most a factor of .left brkt-top.hII{character pullout}vII.right brkt-top.. Resultant Approximation Accuracy Two properties of the above method must be established: (a) given a choice of weight W for which the input array has a <W, I> partition, the weight of any block in the partition constructed by the above algorithm is O(W), and (b) the smallest value W for which the analysis of the algorithm holds, identified via binary search, is upper bounded by the optimum solution value. We begin by establishing the first property above; the following lemma is central to the analysis here. Lemma 4: Let b be a block contained in some block of the partition P constructed in step 100 above. Then if the weight of block b is at least 27W, it can be partitioned into 3 independent rectangles, each with weight strictly exceeding W. Proof: Given a block of weight at least 27W, we construct three independent rectangles of weight exceeding W as follows. First we perform a vertical scan, placing a horizontal cut as soon as the weight of the slab seen thus far exceeds 7W; we place two horizontal cuts in all. This gives us three slabs each of weight strictly greater than 7W. Now we perform a horizontal scan from right to left placing the first vertical cut as soon as one of the horizontal slabs exceeds weight W. Without loss of generality assume that it is the top slab. Then the top right block has weight greater than W but does not exceed 3W, and the two lower horizontal slabs to the left of that vertical cut have weight greater than 4W each. Now in a similar manner we place a second vertical cut to obtain two independent blocks of weight exceeding W from these two horizontal slabs. Thus we get three independent rectangles of weight greater than W each. Lemma 5: The weight of any block in the partition constructed at the end of step 400 is O(W). Proof: We begin by observing the following easily verifiable properties of the solution: (a) each block of the solution is completely contained in some block of the partition P, and (b) given a block b.epsilon.M and another block b'.epsilon slash.M, their projections on the x-axis or the y-axis are either completely disjoint or have a perfect overlap. Now consider a block b in the solution; using the preceding observations, it is readily seen to fall into one of the following categories: (1) the block b belongs to M, or (2) the block b does not belong to M but has a perfect overlap along one of the axes with a block b'.epsilon.M, or (3) the block b does not belong to M but has a perfect overlap along the x-axis with a block b'.epsilon.M and a perfect overlap along the x-axis with a block b".epsilon.M. In Case 1, the weight of b is O(W) since the set S as defined in step 200 has the property that any rectangle r in it has weight at most 3W This is because otherwise, we can always remove either a row or a column (of weight at most 2W) from r to obtain a rectangle r' of weight greater than W, contained in r, which violates the minimality of the rectangles in S. In Cases 2 and 3, each block has weight at most 27W; this follows from an application of Lemma 4. We observe that at most two blocks in M, say b' and b", may not be independent of a block which falls into these two cases. So if b has weight greater than 27W, we can replace b' and b" with at least three independent rectangles which are constructible from b (and are contained in S). But this contradicts the local 3-optimality of the collection M constructed in step 300. Hence b must weigh at most 27W. Lemma 6: The number of rectangles in M is 2I for any choice W for which there exists a <W, I> partition of the input array. Proof: If M had x rectangles, then each of those rectangles must be stabbed in the optimal solution since the optimal solution value is bounded by W and every rectangle in M has weight strictly greater than W. Stabbing x rectangles requires at least x/2 horizontal or vertical partitions and hence x must be at most 2I. Lemma 7: The weight of any block in the final solution returned in step 500 is at most O(W) for any choice W for which there exists a <W,/I> partition of the input array Proof: Lemma 6 tells us that the number of horizontal and vertical partitions at the end of step 400 is O(I) each. This fact, along with an application of Lemma 3, allows us to conclude that the weight of every resulting block in the I.times.I partition is O(W). This completes the proof of the first property of our algorithm that it gives a solution of weight O(W) whenever a <W,I> partition exists. To conclude, we observe that the least value W for which the algorithm either fails to construct the partition P in step 100 or yields a collection M in step 300 with more than 2I rectangles, must exceed the optimum. Thus the binary search procedure works to identify a suitable W. Theorem 4: There exists a polynomial time algorithm that computes an O(1)-factor approximation to the two dimensional block partitioning problem. Although the range of the weight function in the described embodiments of the present invention are not restricted, improved bounds may be obtained if the weight functions are given a restricted range. While the invention has been described with respect to p.times.p partitioning a two-dimensional array, it is equally applicable to the partitioning of arrays having higher dimensions, e.g., three, four, or five dimensions. Moreover, although the additive weight function F and the Hamming weight function H.sub.c are described with respect to the disclosed embodiments, the present invention is equally applicable to other weight functions and tilings. While the invention has been described with reference to the exemplary preferred embodiments thereof, those skilled in the art will be able to make various modifications to the described embodiments of the invention without departing from the true spirit and scope of the invention.
|
Same subclass Same class Consider this |
||||||||||
