Home | Docs | Issue Tracker | FAQ | Download | |
Date: | 2011/02/19 |
---|---|
Author: | Tamas Szekeres |
Contact: | szekerest at gmail.com |
Last Edited: | $Date$ |
Status: | Adopted (Implemented on 2011-03-04) |
Version: | MapServer 6.0 |
Id: | $Id$ |
Description: This RFC proposes an implementation for clustering multiple features from a layer to single (aggregated) features based on their relative positions.
In order to make the maps perspicuous at a given view, we may require to limit the number of the features rendered at neighbouring locations which would normally overlap each other. Currently there’s no such mechanism in MapServer which would prevent from the symbols to overlap based on their relative locations. In a feasible solution we should provide rendering the isolated symbols as is, but create new (clustered) features for those symbols that would overlap in a particular scale. According to the example at http://trac.osgeo.org/mapserver/attachment/ticket/3700/cluster.png the number of the features forming the clusters are displayed in the labels for each clustered features.
This functionality will be implemented as a separate layer data provider (implemented in mapcluster.c) This provider will be used internally (being invoked in msLayerOpen). The clustering parameters can be specified for each layer type as follows:
LAYER
TYPE POINT
CONNECTIONTYPE OGR
NAME cluster
CLUSTER
MAXDISTANCE 20 # in pixels
REGION ellipse # can be rectangle or ellipse
GROUP (expression) # we can define an expression to create separate groups for each value
FILTER (expression) # we can define a logical expression to specify the grouping condition
END
...
END
We can also use multiple classes to display the clustered shapes. The referred example above use the following layer definition:
LAYER
CLUSTER
MAXDISTANCE 50
REGION "rectangle"
END
LABELITEM "Cluster:FeatureCount"
CLASSITEM "zoomcode"
CLASS
TEMPLATE "query.html"
STYLE
SYMBOL "image4"
END
LABEL
...
END
EXPRESSION "Cluster:Empty"
END
CLASS
TEMPLATE "query.html"
STYLE
SYMBOL "image1"
END
LABEL
...
END
EXPRESSION "5"
END
CLASS
TEMPLATE "query.html"
STYLE
SYMBOL "image2"
END
LABEL
...
END
EXPRESSION "4"
END
CLASS
TEMPLATE "query.html"
STYLE
COLOR 0 0 255
SYMBOL "image3"
END
LABEL
...
END
EXPRESSION "3"
END
...
END
In the proposed solution msLayerOpen will call the vtable method of the cluster layer provider instead of the original vtable method depending on the existence of the clustering options, something like:
if (layer->cluster.region)
return msClusterLayerOpen(layer);
In msClusterLayerOpen the data provider will override the vtable functions so that the subsequent LayerWhichShapes/LayerNextShape/LayerClose (and some further) functions will be handled by this provider and not by the original data source. The clustering process itself will be handled in the LayerWhichShapes call. This is the only place where the features are retrieved from the original data source and then cached in the local clustering database (stored in layerinfo).
The clustering process itself will be implemented in the following way:
n = n + 1
avg(n) = avg(n-1) * (n-1) / n + x(n) / n
var(n) = var(n-1) * (n-1) / n + pow2(x(n) - avg(n)) / (n-1)
In LayerNextShape the features are served from the finalization list which is preserved until the layer is open. In LayerClose the vtable of the layer will be restored to the original methods (by calling msInitializeVirtualTable)
The clustered layer itself will provide the following aggregate attributes:
1) Cluster:FeatureCount - count of the features in the clustered shape 1) Cluster:Group - The group value of the cluster (to which the group expression is evaluated)
These attributes can be used to configure the labels of the features and can also be used in expressions. The clustered layer will also support to get further attributes from the original data source as referenced in the LAYER configuration. The ITEMS processing option can be used to specify additional attributes from the source layer according to the user preference.
If we retrieve the original attributes then the layer provider will provide only those values which are equal for each shapes in the cluster. The other values are set to “Cluster:Empty”. In the future we may probably extend this by implementing aggregate functions to define the attributes, like min([attributename]) or max([attributename])
We can define the symbology and labelling of the clustered layers in the same way as any other layer by specifying the classes and styles. STYLEITEM AUTO is not considered to be supported at this phase.
In the query operations the clustered features are retrieved as single shapes with the attribute set as specified in the ITEMS processing option. The clustered features are preserved until the layer is open, so the single pass query approach is provided by this driver.
In order to implement this enhancement the following changes should be made in the MapServer codebase:
The following files will be modified/created by this RFC:
Makefile.vc
Makefile.in
maplayer.c (msLayerOpen and msLayerWhichItems)
mapfile.c, mapfile.h(for handling clusterObj)
mapcluster.c (the code of the new data provider)
cluster.i (SWIG interface file to expose clusterObj)
maplexer.l
mapserver.h
The new object (clusterObj) will be exposed to the mapscript interface.
This change provides a new functionality with no backwards compatibility issues being considered.
+1 from SteveL, AssefaY, ThomasB and TamasS