Merging a set of polygons with non-stable borders

Merging a set of polygons with non-stable borders

Gregor Klajnsek
gregor.klajnsek@siol.net

Laboratory for Geometric Modeling and Multimedia Algorithms
Institute of Computer Science
Faculty of Electrical Engineering and Computer Science
University of Maribor
Maribor / Slovenia

Contents:

Abstract
1. Introduction
2. Description of the problem
3. Algorithm
    3.1. Determination of non-stable areas
    3.2. Removal of non-stable areas
        3.2.1. Fixing gap areas
        3.2.2. Fixing polygon intersections
        3.2.3. Fixing combined areas
4. Conclusions
5. References
Postscript version

Abstract

The paper introduces an algorithm dealing with polygons with non-stable borders. Such polygons may appear because of measuring errors or during scanning and vectorisation of blueprints. The aim of the algorithm is to remove these errors within prescribed tolerance. The algorithm uses a sweep-line approach and works in two steps. In the first step, all non-stable areas are determined and in the second step fixed.

Keywords: algorithms, computational geometry, polygons, non-stable areas.

Introduction

Polygons present basic geometric structures in many 2D applications. Perhaps the most important are geographic information systems (GIS) where polygons represents piece of land considered as parcels. It is reasonable to expect that the parcels fit completely together because in reality, there is no overlapping or empty spaces between them. However, the borders of these parcels are obtained by measuring with given tolerance. In reality this tolerance is +/- 12 cm. Because of this, it could happen, that in GIS database the borders of the parcels do not fit exactly together, especially if different logical units are tried to be unified. The problem is even more difficult, if they have been stored in different databases. In such cases, the borders of the parcels do not fit together entirely but instead, they overlap or there are empty spaces among them. Such data represents real problems in further operations on geometric data and therefore, they have to be corrected. Indeed, the corrected polygons do not express the exact real data, but they are consistent within the given tolerance.

In this paper, an algorithm for detecting and eliminating described problem is considered. The algorithm works in two steps: At first, by the use of a sweep-line, the borders of the polygons that are not consistent are determined, and in the second step, they are corrected. Although practically important, we have found no solution to presented problem in the literature.

Description of the problem

In Figure 1 an example set of polygons is shown. Some of them are consistent regarding their neighbours, while others are not. At first, the polygons, which are consistent, are merged together and a smaller set of non-consistent polygons is obtained. Let us assume we already have a routine performing merging. Here, just a brief idea is given. We have to eliminate all polygon edges, which are doubled in the database. With the proper organization of data structures, this task can be finished very fast. For example, Zalik reported that 100.000 polygons with 1.000.000 edges might be merged in 38 seconds measured on an ordinary PC [1]. The reduced set of polygons is then obtained and it is shown in Figure 1b. However, some vertices of those polygons are not consistent, this means, that they are not adjusted with the vertices in the neighbouring polygons. Therefore, the edges are also non-consistent and we considered such polygons as polygons with non-stable borders. Desired result of merging is shown in Figure 1c and can be obtained by the process of “stabilization” of the polygon borders. This process is described in the continuation. For clarity, the algorithm is explained when just two polygons are presented, although the generalization to more polygons is simple.

Figure 1: Input set of polygons (a) Result of merging (b) Expected result of merging (c)

Let us determine the possible cases of instability, which may appear.

Gap area. If the borders of two polygons are partially far way for less than prescribed tolerance e , but they do not intersect, a tiny gap appears between the polygons (see Figure2a). To get correct result gaps are filled by additional polygons that disappear during the merging process, which is rerun after stabilization is complete.

Intersection area. Figure 2b shows an example of intersection area. Again, additional polygons are created at first, and then they are removed from the result.

Combined are. This case includes gaps and intersection areas (Figure 2c). Again, additional polygons are created and classified as gap or intersection areas.

Figure 2: The cases of non-stable areas: gap (a) Intersection area (b) Combined area (c)

Obviously, the algorithm needs to know how wide the non-stable area could be. In reality, this is connected with the measurement error. Here, we will denote it by a parameter called the e -distance, or shortly e . Each polygon, whose all edges are further apart than e , is not considered in this algorithm. Indeed, such polygons must be eliminated from the process of non-stable area determination as soon as possible.

Algorithm

As mentioned, the algorithm for fixing the non-stable areas works in two steps:

Determination of non-stable areas

Removal of non-stable areas.

These steps are described in the continuation.

Determination of non-stable areas

A sweep-line algorithm has been used for detecting non-stable areas. The sweep-line algorithm introduces a sweep-line moving from –¥ to ¥ across the plane [2]. Let us suppose it is vertical. It stops only at the event points - polygon vertices in our case. At the stop, the algorithm must update so-called sweep-line status (SLS) and perform some operations, if necessary. SLS contains all polygon edges, which are intersected by the current sweep-line. These edges are usually named active edges [3]. Because, we have to handle non-stable edges, each edge being inserted into SLS is prolonged for e in x direction. When the sweep line hits a new event point the following actions are executed:

For each edge in SLS, at first the horizontal distances between its points and the event point are determined. If both distances are greater e , the edge is removed from SLS.

The shortest Euclid distances between the event point and all remaining edges in SLS are calculated. If any distance is less then e , the non-stable area is found.

The edges which are connected with the actual event point and are not yet in SLS are inserted into it (i.e. the second vertex of these edges has x coordinate larger than x coordinate of the event point).

To ensure that the sweep-line algorithm works correctly, all polygon vertices have to be sorted regarding their x coordinate.

Let us highlight the algorithm by an example shown in Figure 3. For this purpose, let us observe what happened in individual event points.

Figure 3: Determination of non-stable areas

Sweep line S₀: Vertex 0₀ is detected. As SLS is empty, edges 0₀0₁ and 0₀0₄ are added into sweep line status.

Sweep line S₁: Sweep line stops at vertex 0₄. Horizontal distance between the right end of all edges in SLS and vertex 0₄ is calculated. As no horizontal distance is greater then e all edges remain in SLS. As edges 0₀0₄,0₀0₁ and vertex 0₄ belong to the same polygon, calculation of Euclid distances between the vertex and the edges is skipped. Edge 0₃0₄ is added into SLS.

Sweep line S₂: Horizontal distance between vertex 0₃ and the right end of edge 0₀0₄ is greater then e -distance. Therefore, edge 0₀0₄ is erased from SLS. Other edges in SLS have smaller horizontal distances then e and remain in SLS. All edges still belong to the same polygon and no other calculations have to be done. Edge 0₂0₃ is added into SLS, which now consists of edges 0₀0₁, 0₃0₄, and 0₂0₃.

Sweep line S₃: Sweep line stops in vertex 0₁. After calculation of horizontal distances, edge 0₄0₃ is removed from SLS, after that edge 0₁0₂ is added to SLS.

Sweep line S₄: Sweep line hits point 1₀. After calculation of horizontal distances, between edges in SLS and vertex 1₀, edge 0₃0₄ is removed from SLS. Edges in SLS belong to polygon P₀, while vertex 1₀ belongs to polygon P₁. Because of this, we have to calculate Euclid distances between the vertex and the edges. Distances from edges 0₀0₁ and 0₂0₃ to vertex 1₀ are greater than e and therefore, these edges are ignored. But, the distance between the edge 0₁0₂ and the vertex 1₀ is smaller then e indicating that the non-stable area has been entered. For the next step of the algorithm available information about non-stable area is stored. Up to now we know, that edge 0₁0₂ and vertex 1₀ belong to this area. After that, edges 1₀1₁ and 1₀1_4, which derive from vertex 1₀ are added into SLS.

Sweep line S₅: Edge 0₀0₁ has horizontal distance to vertex 0₂ greater then e and therefore is removed from the SLS. As vertex belongs to polygon P₀, Euclid distances between vertex 0₂ and edge 1₀1₁ and 1₀1₄ are determined. The distance between vertex 0₂ and edge 1₀1₁ is greater than e , and therefore edge 1₀1₁ does not fall into non-stable area. On the other hand, because the distance between the vertex 0₂ and edge 1₀1₄ is smaller then e itfalls into non-stable area. As vertex 0₂ is already in the data structure describing non-stable areas, only additional information just found is added (edge 1₀1₄).

Sweep line S₆: Edges 0₁0₂ and 0₂0₃ are removed from SLS, as their horizontal distance to vertex 1₄ is greater then e . All remaining edges in SLS and vertex 1₄belong to polygon P₁ so determination of Euclid distances is skipped. Edge 1₃1₄ is added into SLS, which now consists of edges 1₀1₁,1₀1₄, and 1₃1₄.

Sweep line S₇: Edge 1₀1₄ is deleted from SLS. Determination of Euclid distances is skipped and edge 1₂1₃ is added to SLS.

Sweep line S₈: Sweep line hits vertices 1₁ and 1_2. After horizontal distance is calculated all edges remain in SLS. Edge 1₁1₂ is added into SLS.

For described example, the sweep line algorithm produced following sets of data describing non-stable areas:

List of non-stable polygons: {P₀,P₁}

List of non-stable borders: {0₁0₂,1₀1₄}

List of non-stable vertices: {0₁,0₂,1₀,1₄}

Removal of non-stable areas

In this step, the algorithm constructs the additional polygons obtained from the information about the non-stable areas. As we already know, three possible cases exist: gaps, intersections and combined non-stable areas. How they are constructed is described in the continuation.

Fixing gap areas

A polygon describing a gap consists of all the edges surrounding the gap and two additional edges, which must be determined to close the polygon. The creation of the new polygon follows the next algorithm:

The set of non-stable-borders is split into two sets. In one set are all edges, which belong to polygon P₀ and in the second are edges belonging to polygon P₁.

The left and the right chain have to be determined. After that, the end edges of the non-stable area are determined.

Four perpendicular lines regarding to the end edges are constructed. From these, two, which actually intersect the edges on the opposite chain, are chosen. An intersection point between the perpendicular line and the corresponding edge is calculated. The intersected edge is then split and a new edge connecting the intersection point with the end vertex on the corresponding chain is generated.

New polygon is constructed in this way and it fills the gap.

Let us consider the example in Figure 4a:

From the data structure filled in the first part of the algorithm, the sets of edges are obtained. Using the information from the input polygons, two chains are constructed. The left chain consists of edges 0₁0₂, 0₂0₃, and 0₃0₄, and the right chain of edges 1₀1₆, 1₆1₅, and 1₅1₄. Let us consider now just the “upper” pair of edges of both chains – edges 0₁0₂ and 1₀1₆. From the ending points of both chains (in our case from the vertices 0₁ and 1₀), two perpendicular lines are calculated. Let us denote them as r_left and r_right. These lines are shown in Figure 4b. It happened, that r_right intersects left edge 0₁0₂. At the intersection point, edge 0₁0₂ is split and a new vertex V₀ is obtained. This vertex is then connected with the ending vertex of the right chain (vertex 0₁). Similarly, the algorithm solves the “bottom” part of the chains.

In Figure 4c the result of the merging operation, polygon P_m, after addition of the new polygon is shown.

Figure 4: Set of polygons forming the gap area (a) Determination of additional edge (b) Final result of merging, after the “stabilization” (c)

Fixing polygon intersections

Two polygons, which intersect are transformed into three polygons. One polygon fits the intersection area. The intersection area is subtracted from the original polygons to obtain other two polygons. The creation of these three polygons follows the next algorithm:

The set of non-stable-borders is split into two sets. In one set all edges belonging to polygon P₀ are located and in the second are edges belonging to polygon P₁.

The left and the right chain have to be determined. After that, end edges of the non-stable area are determined.

Intersection points between the end edges are calculated. These intersection points split end edges. All parts of end edges, which are not in the intersection area, are taken from the chains.

All edges, which have remained in both chains, form the polygon, which fits the intersection area.

The chains belonging to P₀ and P₁ are swapped and in this way polygons P₀ and P₁ are modified.

Let us highlight the algorithm using an example in Figure 5a:

From the data structure filled in the first part of the algorithm, two chains are constructed One chain consists of edges 0₁0₂, 0₂0₃, 0₃0₄, 0₄0₅and 0₅0₆ and the second chain from edges 1₀1₅, 1₅1₄, 1₄1₃, 1₃1₂ and 1₂1₁. From the end edges of both chains, two intersection points are determined. Vertex V₀, represents intersection point between edges 1₀1₅ and 0₁0₂and vertex V₁ is the in the intersection point of edges 0₅0₆ and 1₂1₁. Edge 0₁0₂, is split into two edges 0₁V₀ and V₀0₂, and edge 1₅1₀ is split into edges 1₅V₀ and V₀1₀. As new edges 0₁V₀ and V₀1₀ are outside the intersection area, they are deleted from the chains. Similarly, removal of the edges is performed at the “bottom” of intersection area. Now the right chain consists of edges V₀0₂, 0₂0₃, 0₃0₄, 0₄0₅, and 0₅V₁ and the left chain consists of edges V₀1₅, 1₅1₄, 1₄1₃, 1₃1₂ and 1₂V₁. These chains now form the polygon P₂. Chains are swapped to modify the polygons P₀ and P₁. Polygons P₀, P₁ and P₂ are shown in Figure 5b.

Figure 5: Set of intersecting polygons (a) Constructed set of consistent polygons (b)

Fixing combined areas

Combined areas consist of gap and intersection areas. All the gap areas must be covered with additional polygons, and the intersection areas have to be subtracted from the original polygons. New polygons fitting intersection areas must be constructed. The algorithm is described in continuation:

The set of non-stable borders is split in two sets and the chains are determined.

Two corresponding end edges are chosen as starting edges.

If starting edges cross, the first area is intersection area, and the intersection point is determined. If end edges do not cross, first area is the gap area and additional edge is created.

A walk-about is performed along the edges in chains, until the next intersecting edges are found or end of chains is reached. At each detected intersection point, the chains are split. If there are no intersecting edges, the algorithm jumps to the last step. Otherwise, the intersection point is determined, and the intersecting edges are split. New polygon is created from partial chains. If algorithm is fixing the intersection area partial chains are swapped in the original polygons, too. The partial chains are then removed. As gap area always follows the intersection area and vice-versa, the classification of next area is simple. The process is repeated until the algorithm deals with all edges.

If the last two edges, do not intersect we have the gap at the “end” of intersection area. The additional edge is determined and a new polygon is formed from partial chains and the new edge.

Let us consider the example in Figure 6a:

From the data structure filled in the first part of the algorithm, two chains are constructed One chain consists of edges 0₁0₂, 0₂0₃, 0₃0₄, 0₄0₅ and 0₅0₆, and the second chain contains edges 1₀1₇, 1₇1₆, 1₆1₅ and 1₅1₄. Edges 0₁0₂ and 1₀1₇ are chosen as starting edges. As these edges do not intersect, additional edge 0₁V₀ is formed.

Next intersecting edges are edges are edges 0₂0₃ and 1₀1₇. At their intersection, vertex V₁ is created, which splits both edges. “Upper” parts of edges are added to the partial chains. One partial chain now consists of edges 0₁0₂ and 0₂V₁ and the second of edge V₀V₁. These partial chains and edge 0₁V₀ form the polygon P₂ which fills first gap area.

The next stop is intersection of edges 1₇1₆ and 0₃0₄. At their intersection new vertex V₂ is created which splits the edges. One partial chain consists now of edges V₁0₃ and 0₃V₂ and the second from edges V₁1₇ and 1₇V₂. The partial chains form polygon P₃, which covers the intersection area. In polygons P₀ and P₂ chains between vertices V₁ and V₂ are swapped and polygons are updated.

Polygon P₄ is formed from partial chains V₂0₄, 0₄V₃ and V₂1₆, 1₆V₃. As polygon P₄ fills the gap area, polygons P₀ and P₁ are not modified.

Polygon P₅ fits the last intersection area. It consist of chains V₃1₅, 1₅V₄ and V₃0₅, 0₅V₄. In polygons P₀ and P₁ chains between vertices V₃ and V₄ are swapped.

The final result of the algorithm are polygons P₀(consisting of edges 0₀0₁, 0₁0₂, 0₂V₁, V₁1₇, 1₇V₂, V₂0₄, 0₄V₃, V₃1₅, 1₅V₄, V₄0₆, 0₆0₇, and 0₇0₀), P₁ (consisting of edges 1₀1₁, 1₁1₂, 1₂1₃, 1₃1₄, 1₄V₄, V₄0₅, 0₅V₃, V₃1₆, 1₆V₂, V₂0₃, 0₃V₁, V₁V₀, and V₀1₀), P₂, P₃, P₄, and P₅. All polygons are shown in Figure 6b.

Figure 6: Set of polygons forming a combined area (a) Constructed set of consistent polygons (b)

Conclusions

The paper presents an algorithm for correcting the polygons with non-stable borders. As described, such polygon easily appears at engineer practice, where polygons are obtained by some measurements subjected to measuring errors. We mention geographic information systems as an example. The algorithm presented here works in two steps: at first so-called non-stable areas are determined and in the second step, the additional artificial (blind) polygons are created. To accelerate the first step, a sweep-line algorithm is proposed. The second step handles three different cases how to construct these additional polygons.

The algorithm is being implemented at the time of writing. We hope that the implementation will confirm the correctness of the proposed algorithm.

References

[1] Zalik, B.: An Incremental Randomized Approach to the Envelope Determination of a Huge Set of Topologically Consistent Polygons, paper sent to ACM ToG.

[2] Preparata, F. P., Shamos, M. I.: Computational geometry - an introduction, Springer-Verlag, New York, 1985.

[3] de Berg, M., van Kreveld, M., Overmars, M., and Schwarzkopf, O.: Computational geometry – Algorithms and Applications, Springer, Berlin, 1997.