MapServer banner Home | Docs | Issue Tracker | FAQ | Download
en de es fr zh_cn it

Cache Types

Author:Thomas Bonfort
Contact:tbonfort at terriscope.fr

This document details the different cache backends that can be used to store tiles

Disk Caches

The disk based cache is the simplest cache to configure, and the one with the the fastest access to existing tiles. It is ideal for small tile repositories, but may cause trouble for sites hosting millions of tiles, as the number of files or directory may rapidly overcome the capabilities of the underlying filesystem. Additionaly, the block size chosen for the filesystem must closely match the mean size of a stored tile: ideally, any given tile should just fit inside a filesystem block, so as not to waste storage space inside each block, and not have to use up multiple blocks per tile.

The location of the files/directories has to be readable and writable by the user running the tile server.

There are two types of disk caches, that create a different hierarchy of files:

Default Structure

The default disk cache stores tiles in a structure nearly identical to the file/directory hierarchy used by TileCache. The only change is that a top level directory corresponding to the name of the grid is added (as MapCache supports multiple grids per tileset)

This cache is capable of detecting blank (i.e. uniform color) tiles and using a symbolic link to a single blank tile to gain disk space.

<cache name="disk" type="disk">
   <base>/tmp</base>
   <symlink_blank/>
</cache>

The two only configuration keys are the root directory where the tiles will be stored, and a key to activate the symbolic linking of blank tiles

Template Structure

The template based disk cache allows you to create (or reuse an existing) tile structure that you define in advance. The <template> parameter takes a string argument where various template entries will be replaced at runtime by the correct value for each tile to store.

<cache name="tmpl" type="disk">
   <!-- template

       string template that will be used to map a tile (by tileset, grid name, dimension,
       format, x, y, and z) to a filename on the filesystem.
       the following replacements are performed:
       - {tileset} : the tileset name
       - {grid} : the grid name
       - {dim} : a string that concatenates the tile's dimension
       - {ext} : the filename extension for the tile's image format
       - {x},{y},{z} : the tile x,y,z values
       - {inv_x}, {inv_y}, {inv_z} : inverted x,y,z values (inv_x = level->maxx - x - 1). This
            is mainly used to support grids where one axis is inverted (e.g. the google schema)
            and you want to create on offline cache.

      * note that this type of cache does not support blank-tile detection and symlinking.

      * warning: it is up to you to make sure that the template you chose creates a unique
        filename for your given tilesets. e.g. do not ommit the {grid} parameter if your
        tilesets reference multiple grids. Failure to do so will result in filename
        collisions !

   -->
   <template>/tmp/template-test/{tileset}#{grid}#{dim}/{z}/{x}/{y}.{ext}</template>
</cache>

BerkeleyDB Caches

The BerkeleyDB cache backend stores tiles in a key-value flat-file database. and therefore does not have the disadvantages of disk caches with regards to the number of files stored on the filesystem. As the image blobs are stored contiguously, the block size chosen for the filesystem has no influence on the storage capacity of the volume.

Note that for a given bdb cache, only a single database file is created, which will store the tiles of its associated tilesets (i.e. there is not a database file created per tileset, grid, and or dimension). If you need to store different tilesets to different files, then use multiple dbd cache entries. It is not possible to use multiple database files for tileset grids or dimensions.

The berkeleyDB based caches are a bit faster than the disk based caches during reads, but may be a bit slower during concurrent writes if a high number of threads all try to insert new tiles concurrently.

<cache name="bdb" type="bdb">
   <!-- base (required)
      absolute filesystem path where the berkeley db database file is to be stored.
      this directory must exist, and be writable
   -->
   <base>/tmp/foo/</base>

   <!-- key_template (optional)
      string template used to create the key for a tile entry in the database.
      defaults to the value below. you should include {tileset}, {grid} and {dim} here
      unless you know what you are doing, or you will end up with mixed tiles
   <key_template>{tileset}-{grid}-{dim}-{z}-{y}-{x}.{ext}</key_template>
   -->
</cache>

Sqlite Caches

There are two different sqlite caches that vary by the database schema they create and query. Sqlite caches have the advantage that they store tiles as blobs inside a single database file, and therefore do not have the disadvantages of disk caches with regards to the number of files stored. As the image blobs are stored contiguously, the block size chosen for the filesystem has no influence on the storage capacity of he volume.

The sqlite based caches are a bit slower than the disk based caches, and may have write-locking issues at seed time if a high number of threads all try to insert new tiles concurrently.

Default Schema

Tiles are stored in the configured sqlite file created by mapcache with

create table if not exists tiles(
   tileset text,
   grid text,
   x integer,
   y integer,
   z integer,
   data blob,
   dim text,
   ctime datetime,
   primary key(tileset,grid,x,y,z,dim)
);
<cache name="sqlite" type="sqlite3">
</cache>

You may also add custom sqlite pragmas that will be executed when first connecting to a sqlite db, e.g. to override some compiled in sqlite defaults

<cache name="sqlitetemplate" type="sqlite3">
   <dbfile>/tmp/sqlitefile.db</dbfile>
   <pragma name="max_page_count">10000000</pragma>
</cache>

<pragma> entries will result in a call to

PRAGMA max_page_count = 1000000;

Custom Schema

This cache can use any database schema, it is up to you to supply the SQL that will be exectuted to select or insert a new tile.

This cache type is not fully implemented yet (there is no way to configure it from the mapcache.xml configuration file yet)

MBTiles Schema

This cache type is a shortcut to the previous custom schema sqlite cache, with pre-populated SQL queries that correspond to the MBTiles specification.

Although the default mbtiles schema is very simple, mapcache uses the same multi- table schema found in most downloadable mbtiles file, to notably enable storing blank (i.e. uniform) tiles without duplicating the encoded image data (in the same way the disk cache supports tile symlinking).

The mbtiles schema is created with:

create table if not exists images(
  tile_id text,
  tile_data blob,
  primary key(tile_id));
create table if not exists map (
  zoom_level integer,
  tile_column integer,
  tile_row integer,
  tile_id text,
  foreign key(tile_id) references images(tile_id),
  primary key(tile_row,tile_column,zoom_level));
create table if not exists metadata(
  name text,
  value text); -- not used or populated yet
create view if not exists tiles
  as select
     map.zoom_level as zoom_level,
     map.tile_column as tile_column,
     map.tile_row as tile_row,
     images.tile_data as tile_data
  from map
     join images on images.tile_id = map.tile_id;
<cache name="mbtiles" type="mbtiles">
   <dbfile>/Users/XXX/Documents/MapBox/tiles/natural-earth-1.mbtiles</dbfile>
</cache>

Note

Contrarily to the standard sqlite mapcache schema, the mbtiles db file only supports a single tileset per cache. The bahavior if multiple tilesets are associated to the same mbtiles cache is undefined, and will definitely produce incorrect results.

Memcache Caches

This cache type stores tile to an external memcached server running on the local machine or accessible on the network. This cache type has the advantage that memcached takes care of expiring tiles, so the size of the cache will never exceed what has been configured in the memcache instance.

Memcache support requires a rather recent version of the apr-util library. Note that under very high loads (usually only atainable on synthetic benchmarks on localhost), the memcache implementation of apr-util may fail and start dropping connections for some intervals of time before coming back online afterwards.

You can add multiple <server> entries.

<cache name="memcache" type="memcache">
   <server>
      <host>localhost</host>
      <port>11211</port>
   </server>
</cache>

Note

Tiles stored in memcache backends are configured to expire in 1 day by default. This can be overriden at the tileset level with the <auto_expire> keyword.

(Geo)TIFF Caches

TIFF caches are the most recent addition to the family of caches, and use the internal tile structure of the TIFF specification to access tile data. Tiles can be stored in JPEG only (TIFF does not support PNG tiles).

As a single tiff file may contain many tiles, there is a drastic reduction in the number of files that have to be stored on the filesystem, which solves the major shortcomings of the disk cache. Another advantage is that the same TIFF files can be used by programs or WMS servers that only understand regular GIS raster formats, and be served up with high performance for tile access.

The TIFF cache should be considered read-only for the time being. Write access is already possible but should be considered experimental as there might be some file corruption issues, notably on network file-systems. Note that until all the tiles in a given tiff file have been seeded/created, the TIFF file is said to be “sparse” in the sense that it is missing a number of jpeg tiles. As such most non-gdal based programs will have problems opening these incomplete files.

Note that the tiff tile structure must exactly match the structure of the grid used by the tileset, and the tif file names must follow strict naming rules.

Defining the TIFF file sizes

The number of tiles stored in each the horizontal and vertical direction must be defined:

  • <xcount> the number of tiles stored along the x (horizontal) direction of the TIFF file.
  • <ycount> the number of tiles stored along the y (vertical) direction of the TIFF file.
<cache name="tiff" type="tiff">
   <xcount>64</xcount>
   <ycount>64</ycount>
   ...
</cache>

Setting up the file naming convention

The <template> tag sets the template to use when looking up a TIFF file name given the x,y,z of the requested tile

<cache name="tiff" type="tiff">
   <template>/data/tiffs/{tileset}/{grid}/L{z}/R{inv_y}/C{x}.tif</template>
   ...
</cache>

The following template keys are available for operating on the given tile’s x,y, and z:

  • {x} is replaced by the x value of the leftmost tile inside the TIFF file containing the requested tile
  • {inv_x} is replaced by the x value of the rightmost tile.
  • {y} is replaced by the y value of the bottommost tile.
  • {inv_y} is replaced by the y value of the topmost tile.
  • {div_x} is replaced by the index of the TIFF file starting from the left of the grid (i.e. {div_x} = {x}/<countx>)
  • {inv_div_x} same as {div_x} but starting from the right.
  • {div_y} is replaced by the index of the TIFF file starting from the bottom of the grid (i.e. {div_y} = {y}/<county>)
  • {inv_div_y} same as {div_y} but starting from the top.

Note

{inv_x} and {inv_div_x} will probably be rarely used, whereas {inv_y} and {inv_div_y} will find some usage for people who prefer to index their TIFF files from top to bottom rather than bottom to top.

Customizing the template keys

In some occurences, it may be desirable to have a precise hand on the filename to use for a given x,y,z, tile lookup, e.g. to look for a file named “Z03-R00003-C000009.tif” instead of just “Z3-R3-C9.tif”. The <template> entry supports formatting attributes, following the unix printf syntax ( c.f.: http://linux.die.net/man/3/printf ), by suffixing each template key with “_fmt”, e.g.:

<cache name="tiff" type="tiff">
   <template
      x_fmt="%08d"
      inv_y_fmt="%08d"
   >/data/tiffs/{tileset}/{grid}/L{z}/R{inv_y}/C{x}.tif</template>
</cache>

Note

If not specified, the default behavior is to use “%d” for formatting.

Setting JPEG compression options

An additional optional parameter defines which JPEG compression should be applied to the tiles when saved into the TIFF file:

  • <format> the name of the (jpeg) format to use

See also

JPEG Format

<cache name="tiff" type="tiff">
   ...
   <format>myjpeg</format>
</cache>

In this example, assuming a grid using 256x256 tiles, the files that are read to load the tiles are tiled TIFFs with jpeg compression, who’s size are 16384x16384. The number of files to store on disk is thus reduced 4096 times compared to the basic disk cache.

GeoTIFF Support

If compiled with GeoTiff and write support, MapCache will add referencing information to the TIFF files it creates, so that the TIFF files can be used in any geotiff- enabled software. Write support does not produce full geotiffs with the definition of the projection used, but only the pixel scale and tie-points, i.e. what is usually found in .tfw files.

For reference, here is the gdalinfo output on a TIFF file created by MapCache when compiled with geotiff support:

LOCAL_CS["unnamed",
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]]]
Origin = (-20037508.342789247632027,20037508.342789247632027)
Pixel Size = (156543.033928040997125,-156543.033928040997125)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  COMPRESSION=YCbCr JPEG
  INTERLEAVE=PIXEL
  SOURCE_COLOR_SPACE=YCbCr
Corner Coordinates:
Upper Left  (-20037508.343,20037508.343)
Lower Left  (-20037508.343,-20037508.343)
Upper Right (20037508.343,20037508.343)
Lower Right (20037508.343,-20037508.343)
Center      (   0.0000000,   0.0000000)