Published April 20, 2023 | Version v1
Knowledge Package

High resolution national population mapping: Top-down population disaggregation modelling

WorldPop¹
  • 1. University of Southampton

Description

Background

Gridded population estimates are particularly useful because they give decision-makers and data users the flexibility to aggregate population estimates into different spatial units, whether existing enumeration areas or custom areas. They can be aggregated over various levels of administrative units, but also over areal units that do not follow administrative boundaries, such as a hospital catchment area. This enables integration and analysis with a range of other spatial datasets that is not possible using standard census counts mapped to administrative boundaries. The modelled datasets do not, however, replace the need for a full census, which usually includes a more precise collection of demographic and socioeconomic data, as well as a housing census.
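The aggregation described above can be illustrated with a minimal sketch. It assumes the gridded estimates and the zone boundaries have already been rasterised onto the same grid; the function name, the toy grids and the zone labels are hypothetical and are not taken from any WorldPop tool.

```python
from collections import defaultdict


def aggregate_to_zones(population_grid, zone_grid):
    """Sum gridded population counts within each zone.

    Both grids are lists of rows with identical shape. A zone label of
    None marks a cell that falls outside every zone and is skipped.
    """
    totals = defaultdict(float)
    for pop_row, zone_row in zip(population_grid, zone_grid):
        for pop, zone in zip(pop_row, zone_row):
            if zone is not None:
                totals[zone] += pop
    return dict(totals)


# Toy 2x3 grid of population counts per cell (hypothetical values).
population = [
    [10.0, 5.0, 0.0],
    [2.5, 7.5, 1.0],
]

# The same cells labelled by administrative unit ...
admin_units = [
    ["A", "A", "B"],
    ["A", "B", "B"],
]

# ... and by hospital catchment, which cuts across the admin units.
catchments = [
    ["H1", "H1", None],
    ["H2", "H1", "H1"],
]

admin_totals = aggregate_to_zones(population, admin_units)
catchment_totals = aggregate_to_zones(population, catchments)
```

The same grid yields totals for either zoning scheme, which is the flexibility that counts tied to administrative boundaries do not offer.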

WorldPop produces a range of gridded population estimate datasets and tools, and the best choice depends on your needs and situation. Population and housing censuses are the most important resource for producing accurate population data at the national and sub-national level. These are typically undertaken every decade, and simple projections can be used to create subnational estimates in the intervening years. However, these data are typically only available as counts per administrative unit, which masks small-area variations and makes them difficult to integrate with other datasets.

WorldPop top-down modelling methods take population counts at administrative unit level and disaggregate them to counts for each 100x100m grid square across the country or region of interest, using machine learning methods that leverage relationships with a stack of 100x100m resolution geospatial covariate datasets. This differs from the 'bottom-up' approach; the differences are outlined here: https://www.worldpop.org/methods/populations/, as well as in the documents provided in this work package.

In some cases WorldPop works directly with a government to disaggregate its own census data or official projections to produce bespoke modelled outputs (e.g. outputs in the WorldPop Open Population Repository: https://wopr.worldpop.org/). WorldPop also produces standardized mapping for all countries in the world. This involves taking a global database of administrative unit-based census and projection counts for each year 2000-2020 and using a set of detailed geospatial datasets to disaggregate them to counts for either (i) each 100x100m or 1x1km grid cell on the planet (top-down unconstrained) or (ii) each 100x100m or 1x1km grid cell classified as settled by humans (top-down constrained). These datasets are available to download through the WorldPop Data Catalog: https://www.worldpop.org/datacatalog/. Differences between 'constrained' and 'unconstrained' are outlined here: https://www.worldpop.org/methods/top_down_constrained_vs_unconstrained/.
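The redistribution step of top-down disaggregation can be sketched as follows. This is a simplified illustration, not the WorldPop implementation: the weighting layer here is a hypothetical stand-in for the per-cell surface that the machine learning model would predict from covariates, and the function name, grids and totals are all assumed for the example. Passing a settlement mask gives the constrained variant; omitting it gives the unconstrained one.

```python
def disaggregate(admin_totals, admin_grid, weight_grid, settled_grid=None):
    """Spread each administrative unit's total over its grid cells.

    Each cell receives a share of its unit's total proportional to its
    weight. If settled_grid is given, unsettled cells get weight zero
    (constrained); otherwise every cell in the unit is eligible
    (unconstrained).
    """
    def cell_weight(r, c):
        if settled_grid is not None and not settled_grid[r][c]:
            return 0.0
        return weight_grid[r][c]

    # First pass: total weight inside each administrative unit.
    weight_sums = {}
    for r, row in enumerate(admin_grid):
        for c, unit in enumerate(row):
            if unit is not None:
                weight_sums[unit] = weight_sums.get(unit, 0.0) + cell_weight(r, c)

    for unit, total_weight in weight_sums.items():
        if total_weight <= 0:
            raise ValueError(f"No eligible cells with positive weight in unit {unit!r}")

    # Second pass: allocate each unit's total in proportion to cell weight.
    result = []
    for r, row in enumerate(admin_grid):
        out_row = []
        for c, unit in enumerate(row):
            if unit is None:
                out_row.append(0.0)
            else:
                share = cell_weight(r, c) / weight_sums[unit]
                out_row.append(admin_totals[unit] * share)
        result.append(out_row)
    return result


# Hypothetical inputs: two admin units, two cells each.
admin_grid = [["A", "A"], ["B", "B"]]
weights = [[3.0, 1.0], [1.0, 1.0]]          # stand-in for model predictions
settled = [[True, False], [True, True]]     # stand-in settlement mask
totals = {"A": 100.0, "B": 50.0}

unconstrained = disaggregate(totals, admin_grid, weights)
constrained = disaggregate(totals, admin_grid, weights, settled_grid=settled)
```

In both variants the cell values within a unit sum back to that unit's input count; the constrained variant simply concentrates the same total in the settled cells.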

Advantages of 'top-down' modelled population estimates:

-Consistent and complete estimates for each year 2000-2020 for every country, including breakdowns by age and sex

-Maintains the 'official' population estimates or census counts at the administrative unit level of the input data, with adjusted versions available that match UN national estimates
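One simple way an adjustment to a national total could work is uniform proportional scaling, sketched below. This is an assumption made for illustration: the text above states only that adjustments to match UN national estimates are available, not how they are computed, and the function name and figures are hypothetical.

```python
def scale_to_national_total(population_grid, national_target):
    """Rescale every cell by one factor so the grid sums to national_target."""
    current_total = sum(sum(row) for row in population_grid)
    if current_total <= 0:
        raise ValueError("Grid total must be positive to rescale")
    factor = national_target / current_total
    return [[cell * factor for cell in row] for row in population_grid]


# Hypothetical gridded estimates summing to 150, adjusted to a target of 165.
grid = [[75.0, 25.0], [25.0, 25.0]]
adjusted = scale_to_national_total(grid, 165.0)
adjusted_total = sum(sum(row) for row in adjusted)
```

Because a single factor is applied everywhere, the relative distribution between cells is unchanged, while the administrative unit totals shift by the same proportion as the national total.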

Disadvantages of 'top-down' modelled population estimates:

-For countries that have not had a census for a long time and/or where significant subnational variations in migration, fertility and mortality exist, the input 'official' population counts and projections based on them can be highly uncertain

 

The Work Package

This work package consists of a set of papers, tutorials, datasets and tools that give users a comprehensive introduction to 'top-down' population modelling. It includes (i) a set of three academic papers describing the random forests top-down disaggregation method and illustrating its application for different countries and time periods, (ii) an R package with accompanying instructions and examples for implementing the random forests method, (iii) a tutorial showing how to use the code and method for top-down disaggregation of population counts from aggregate administrative unit level to 100x100m grid square estimates, (iv) a tutorial showing how to use the code and method for top-down disaggregation of population counts from large aggregate administrative unit level to smaller administrative unit level estimates, and (v) links to country-specific top-down disaggregation datasets in the WorldPop Open Population Repository (WOPR).

Elements of the Knowledge Package

Additional details

Created:
April 20, 2023
Modified:
June 6, 2023