Merging a set of polygons with non-stable borders

Gregor Klajnsek

Laboratory for Geometric Modeling and Multimedia Algorithms
Institute of Computer Science
Faculty of Electrical Engineering and Computer Science
University of Maribor
Maribor / Slovenia


1. Introduction
2. Description of the problem
3. Algorithm
    3.1. Determination of non-stable areas
    3.2. Removal of non-stable areas
        3.2.1. Fixing gap areas
        3.2.2. Fixing polygon intersections
        3.2.3. Fixing combined areas
4. Conclusions

5. References

Postscript version


The paper introduces an algorithm dealing with polygons with non-stable borders. Such polygons may appear because of measuring errors or during scanning and vectorisation of blueprints. The aim of the algorithm is to remove these errors within prescribed tolerance. The algorithm uses a sweep-line approach and works in two steps. In the first step, all non-stable areas are determined and in the second step fixed.

Keywords: algorithms, computational geometry, polygons, non-stable areas.

  1. Introduction
  2. Polygons present basic geometric structures in many 2D applications. Perhaps the most important are geographic information systems (GIS) where polygons represents piece of land considered as parcels. It is reasonable to expect that the parcels fit completely together because in reality, there is no overlapping or empty spaces between them. However, the borders of these parcels are obtained by measuring with given tolerance. In reality this tolerance is +/- 12 cm. Because of this, it could happen, that in GIS database the borders of the parcels do not fit exactly together, especially if different logical units are tried to be unified. The problem is even more difficult, if they have been stored in different databases. In such cases, the borders of the parcels do not fit together entirely but instead, they overlap or there are empty spaces among them. Such data represents real problems in further operations on geometric data and therefore, they have to be corrected. Indeed, the corrected polygons do not express the exact real data, but they are consistent within the given tolerance.


    In this paper, an algorithm for detecting and eliminating described problem is considered. The algorithm works in two steps: At first, by the use of a sweep-line, the borders of the polygons that are not consistent are determined, and in the second step, they are corrected. Although practically important, we have found no solution to presented problem in the literature.

  3. Description of the problem

In Figure 1 an example set of polygons is shown. Some of them are consistent regarding their neighbours, while others are not. At first, the polygons, which are consistent, are merged together and a smaller set of non-consistent polygons is obtained. Let us assume we already have a routine performing merging. Here, just a brief idea is given. We have to eliminate all polygon edges, which are doubled in the database. With the proper organization of data structures, this task can be finished very fast. For example, Zalik reported that 100.000 polygons with 1.000.000 edges might be merged in 38 seconds measured on an ordinary PC [1]. The reduced set of polygons is then obtained and it is shown in Figure 1b. However, some vertices of those polygons are not consistent, this means, that they are not adjusted with the vertices in the neighbouring polygons. Therefore, the edges are also non-consistent and we considered such polygons as polygons with non-stable borders. Desired result of merging is shown in Figure 1c and can be obtained by the process of “stabilization” of the polygon borders. This process is described in the continuation. For clarity, the algorithm is explained when just two polygons are presented, although the generalization to more polygons is simple.

Figure 1: Input set of polygons (a) Result of merging (b) Expected result of merging (c)

Let us determine the possible cases of instability, which may appear.

Figure 2: The cases of non-stable areas: gap (a) Intersection area (b) Combined area (c)

Obviously, the algorithm needs to know how wide the non-stable area could be. In reality, this is connected with the measurement error. Here, we will denote it by a parameter called the e -distance, or shortly e . Each polygon, whose all edges are further apart than e , is not considered in this algorithm. Indeed, such polygons must be eliminated from the process of non-stable area determination as soon as possible.

  1. Algorithm

As mentioned, the algorithm for fixing the non-stable areas works in two steps:


These steps are described in the continuation.


    1. Determination of non-stable areas

A sweep-line algorithm has been used for detecting non-stable areas. The sweep-line algorithm introduces a sweep-line moving from –¥ to ¥ across the plane [2]. Let us suppose it is vertical. It stops only at the event points - polygon vertices in our case. At the stop, the algorithm must update so-called sweep-line status (SLS) and perform some operations, if necessary. SLS contains all polygon edges, which are intersected by the current sweep-line. These edges are usually named active edges [3]. Because, we have to handle non-stable edges, each edge being inserted into SLS is prolonged for e in x direction. When the sweep line hits a new event point the following actions are executed:


To ensure that the sweep-line algorithm works correctly, all polygon vertices have to be sorted regarding their x coordinate.


Let us highlight the algorithm by an example shown in Figure 3. For this purpose, let us observe what happened in individual event points.


Figure 3: Determination of non-stable areas

Sweep line S0: Vertex 00 is detected. As SLS is empty, edges 0001 and 0004 are added into sweep line status.


Sweep line S1: Sweep line stops at vertex 04. Horizontal distance between the right end of all edges in SLS and vertex 04 is calculated. As no horizontal distance is greater then e all edges remain in SLS. As edges 0004,0001 and vertex 04 belong to the same polygon, calculation of Euclid distances between the vertex and the edges is skipped. Edge 0304 is added into SLS.


Sweep line S2: Horizontal distance between vertex 03 and the right end of edge 0004 is greater then e -distance. Therefore, edge 0004 is erased from SLS. Other edges in SLS have smaller horizontal distances then e and remain in SLS. All edges still belong to the same polygon and no other calculations have to be done. Edge 0203 is added into SLS, which now consists of edges 0001, 0304, and 0203.


Sweep line S3: Sweep line stops in vertex 01. After calculation of horizontal distances, edge 0403 is removed from SLS, after that edge 0102 is added to SLS.


Sweep line S4: Sweep line hits point 10. After calculation of horizontal distances, between edges in SLS and vertex 10, edge 0304 is removed from SLS. Edges in SLS belong to polygon P0, while vertex 10 belongs to polygon P1. Because of this, we have to calculate Euclid distances between the vertex and the edges. Distances from edges 0001 and 0203 to vertex 10 are greater than e and therefore, these edges are ignored. But, the distance between the edge 0102 and the vertex 10 is smaller then e indicating that the non-stable area has been entered. For the next step of the algorithm available information about non-stable area is stored. Up to now we know, that edge 0102 and vertex 10 belong to this area. After that, edges 1011 and 1014, which derive from vertex 10 are added into SLS.


Sweep line S5: Edge 0001 has horizontal distance to vertex 02 greater then e and therefore is removed from the SLS. As vertex belongs to polygon P0, Euclid distances between vertex 02 and edge 1011 and 1014 are determined. The distance between vertex 02 and edge 1011 is greater than e , and therefore edge 1011 does not fall into non-stable area. On the other hand, because the distance between the vertex 02 and edge 1014 is smaller then e it falls into non-stable area. As vertex 02 is already in the data structure describing non-stable areas, only additional information just found is added (edge 1014).


Sweep line S6: Edges 0102 and 0203 are removed from SLS, as their horizontal distance to vertex 14 is greater then e . All remaining edges in SLS and vertex 14 belong to polygon P1 so determination of Euclid distances is skipped. Edge 1314 is added into SLS, which now consists of edges 1011,1014, and 1314.


Sweep line S7: Edge 1014 is deleted from SLS. Determination of Euclid distances is skipped and edge 1213 is added to SLS.


Sweep line S8: Sweep line hits vertices 11 and 12. After horizontal distance is calculated all edges remain in SLS. Edge 1112 is added into SLS.


For described example, the sweep line algorithm produced following sets of data describing non-stable areas:

List of non-stable polygons: {P0,P1}

List of non-stable borders: {0102,1014}

List of non-stable vertices: {01,02,10,14}

    1. Removal of non-stable areas
    2. In this step, the algorithm constructs the additional polygons obtained from the information about the non-stable areas. As we already know, three possible cases exist: gaps, intersections and combined non-stable areas. How they are constructed is described in the continuation.

      1. Fixing gap areas

A polygon describing a gap consists of all the edges surrounding the gap and two additional edges, which must be determined to close the polygon. The creation of the new polygon follows the next algorithm:


Let us consider the example in Figure 4a:

From the data structure filled in the first part of the algorithm, the sets of edges are obtained. Using the information from the input polygons, two chains are constructed. The left chain consists of edges 0102, 0203, and 0304, and the right chain of edges 1016, 1615, and 1514. Let us consider now just the “upper” pair of edges of both chains – edges 0102 and 1016. From the ending points of both chains (in our case from the vertices 01 and 10), two perpendicular lines are calculated. Let us denote them as rleft and rright. These lines are shown in Figure 4b. It happened, that rright intersects left edge 0102. At the intersection point, edge 0102 is split and a new vertex V0 is obtained. This vertex is then connected with the ending vertex of the right chain (vertex 01). Similarly, the algorithm solves the “bottom” part of the chains.

In Figure 4c the result of the merging operation, polygon Pm, after addition of the new polygon is shown.

Figure 4: Set of polygons forming the gap area (a) Determination of additional edge (b) Final result of merging, after the “stabilization” (c)

      1. Fixing polygon intersections

Two polygons, which intersect are transformed into three polygons. One polygon fits the intersection area. The intersection area is subtracted from the original polygons to obtain other two polygons. The creation of these three polygons follows the next algorithm:


Let us highlight the algorithm using an example in Figure 5a:

From the data structure filled in the first part of the algorithm, two chains are constructed One chain consists of edges 0102, 0203, 0304, 0405 and 0506 and the second chain from edges 1015, 1514, 1413, 1312 and 1211. From the end edges of both chains, two intersection points are determined. Vertex V0, represents intersection point between edges 1015 and 0102 and vertex V1 is the in the intersection point of edges 0506 and 1211. Edge 0102, is split into two edges 01V0 and V002, and edge 1510 is split into edges 15V0 and V010. As new edges 01V0 and V010 are outside the intersection area, they are deleted from the chains. Similarly, removal of the edges is performed at the “bottom” of intersection area. Now the right chain consists of edges V002, 0203, 0304, 0405, and 05V1 and the left chain consists of edges V015, 1514, 1413, 1312 and 12V1. These chains now form the polygon P2. Chains are swapped to modify the polygons P0 and P1. Polygons P0, P1 and P2 are shown in Figure 5b.

Figure 5: Set of intersecting polygons (a) Constructed set of consistent polygons (b)

      1. Fixing combined areas

Combined areas consist of gap and intersection areas. All the gap areas must be covered with additional polygons, and the intersection areas have to be subtracted from the original polygons. New polygons fitting intersection areas must be constructed. The algorithm is described in continuation:


Let us consider the example in Figure 6a:

From the data structure filled in the first part of the algorithm, two chains are constructed One chain consists of edges 0102, 0203, 0304, 0405 and 0506, and the second chain contains edges 1017, 1716, 1615 and 1514. Edges 0102 and 1017 are chosen as starting edges. As these edges do not intersect, additional edge 01V0 is formed.

Next intersecting edges are edges are edges 0203 and 1017. At their intersection, vertex V1 is created, which splits both edges. “Upper” parts of edges are added to the partial chains. One partial chain now consists of edges 0102 and 02V1 and the second of edge V0V1. These partial chains and edge 01V0 form the polygon P2 which fills first gap area.

The next stop is intersection of edges 1716 and 0304. At their intersection new vertex V2 is created which splits the edges. One partial chain consists now of edges V103 and 03V2 and the second from edges V117 and 17V2. The partial chains form polygon P3, which covers the intersection area. In polygons P0 and P2 chains between vertices V1 and V2 are swapped and polygons are updated.

Polygon P4 is formed from partial chains V204, 04V3 and V216, 16V3. As polygon P4 fills the gap area, polygons P0 and P1 are not modified.

Polygon P5 fits the last intersection area. It consist of chains V315, 15V4 and V305, 05V4. In polygons P0 and P1 chains between vertices V3 and V4 are swapped.

The final result of the algorithm are polygons P0 (consisting of edges 0001, 0102, 02V1, V117, 17V2, V204, 04V3, V315, 15V4, V406, 0607, and 0700), P1 (consisting of edges 1011, 1112, 1213, 1314, 14V4, V405, 05V3, V316, 16V2, V203, 03V1, V1V0, and V010), P2, P3, P4, and P5. All polygons are shown in Figure 6b.


Figure 6: Set of polygons forming a combined area (a) Constructed set of consistent polygons (b)

  1. Conclusions
  2. The paper presents an algorithm for correcting the polygons with non-stable borders. As described, such polygon easily appears at engineer practice, where polygons are obtained by some measurements subjected to measuring errors. We mention geographic information systems as an example. The algorithm presented here works in two steps: at first so-called non-stable areas are determined and in the second step, the additional artificial (blind) polygons are created. To accelerate the first step, a sweep-line algorithm is proposed. The second step handles three different cases how to construct these additional polygons.


    The algorithm is being implemented at the time of writing. We hope that the implementation will confirm the correctness of the proposed algorithm.

  3. References

[1] Zalik, B.: An Incremental Randomized Approach to the Envelope Determination of a Huge Set of Topologically Consistent Polygons, paper sent to ACM ToG.

[2] Preparata, F. P., Shamos, M. I.: Computational geometry - an introduction, Springer-Verlag, New York, 1985.

[3] de Berg, M., van Kreveld, M., Overmars, M., and Schwarzkopf, O.: Computational geometry – Algorithms and Applications, Springer, Berlin, 1997.