Conflict Exposure methodology

Methodology detail for ACLED's Conflict Exposure measure and tool.

Estimates of the size of the exposed population vary with the proximity boundary set around the location of the event. ACLED makes available estimates of the population living within 1 km, 2 km, and 5 km of an event. Estimates of the exposed population for every new event are released along with ACLED's real-time weekly data. A “best” estimate can also be generated, which changes the boundary distance based on the type and intensity of the event. For example, the best estimate of the population exposed to an explosion event uses a 5 km radius around the event location, whereas the best estimate for a protest uses a 1 km radius.

See the "How does the best estimate vary?" question for more details on the best exposure estimation.
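As a minimal sketch of how a best-estimate radius could be looked up, the snippet below encodes only the two radii stated above. The event-type labels and the function name are hypothetical, and the dependence on event intensity is not modeled here.

```python
# Radii for which estimates are made available, in km (from the text above).
AVAILABLE_RADII_KM = (1, 2, 5)

# Hypothetical event-type labels; only these two radii are stated in the text.
BEST_RADIUS_KM = {
    "explosion": 5,
    "protest": 1,
}


def best_radius_km(event_type: str) -> int:
    """Return the best-estimate radius for an event type documented above."""
    if event_type not in BEST_RADIUS_KM:
        raise KeyError(f"no best-estimate radius documented here for {event_type!r}")
    return BEST_RADIUS_KM[event_type]


print(best_radius_km("explosion"))  # 5
print(best_radius_km("protest"))    # 1
```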

ACLED and WorldPop global population data

WorldPop is an applied research group at the University of Southampton, United Kingdom. The group produces many different types of small-area population estimate datasets to meet a wide range of needs and applications. These often involve different trade-offs in modeling methods and input datasets. This document briefly describes the datasets used in constructing the conflict exposure metrics, their limitations, and ongoing efforts to improve and update them. For further details, please see the WorldPop website, this overview paper, and available webinars on the production of population estimates and for emergency response.

To estimate the number of people exposed to conflict in all countries over multiple years, a globally consistent multi-temporal dataset is used that provides the estimated number of residents in each grid cell. These cells are then spatially linked with ACLED's event-based conflict data. WorldPop’s global annual age/sex-structured population estimates for 2020-22 are used and extrapolated.

Constructing a consistent multi-year time series of small-area population estimates across the globe requires a range of methodological assumptions and data trade-offs. The resulting estimates are unlikely to be as accurate as those purpose-built for a single, recent time point or a single country. Nevertheless, the global datasets represent the best currently available option from the WorldPop library, with updates and improvements coming soon (see below).

How are WorldPop data built?

The WorldPop global 2020-22 datasets are built through "top-down" disaggregation modeling. The starting point is a database of population estimates linked to administrative units, derived from subnational censuses and projections and constructed by CIESIN at Columbia University; it primarily includes censuses from the 2000 and 2010 rounds. Machine learning methods are used to model the relationships between administrative unit-based population densities and a harmonized library of high spatial resolution gridded covariate data layers. This library and the modeled relationships are then used to disaggregate the administrative unit-based population counts into predicted numbers of people residing in each 100×100 meter or 1×1 kilometer grid square globally. An assembly of subnational demographic datasets is then used to break down these population totals by sex and age class. Finally, national population totals are adjusted to ensure they match United Nations World Population Prospects estimates.
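The disaggregation step can be illustrated with a small sketch. It assumes that each grid cell in an administrative unit already has a non-negative weight (standing in for the density predicted by the machine learning model) and splits the unit's total in proportion to those weights. The function name and the numbers are hypothetical; this is not WorldPop's implementation.

```python
def disaggregate(unit_total: float, weights: list[float]) -> list[float]:
    """Split an administrative-unit total across its cells in proportion to weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("at least one cell needs a positive weight")
    return [unit_total * w / total_weight for w in weights]


# Hypothetical unit with 1,000 residents and four grid cells.
# A weight of 0 keeps a cell empty (for example, an unsettled cell).
cells = disaggregate(1000.0, [0.0, 1.0, 3.0, 6.0])
print(cells)  # [0.0, 100.0, 300.0, 600.0]
```

Because the weights are normalized within each unit, the cell values always sum back to the unit's census-based total, which is what makes the approach "top-down".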

Calculation technique

When calculating populations exposed to conflict using the WorldPop data, the following two-step approach is applied:

Creating buffer zones around conflict locations:

First, a circular "buffer zone" is drawn around each conflict location. These buffers can vary in size, with radii typically set at 1 km, 2 km, or 5 km. In areas with multiple conflicts, however, these circles can overlap. Summing the populations of overlapping buffers counts the same people more than once and so overestimates the population affected. To address this, we use a technique called Voronoi tessellation.
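To give a sense of how large the double-counting can be, the sketch below computes the area shared by two buffers of equal radius using the standard circle-circle intersection formula. It assumes planar coordinates in kilometers; under a uniform population density, the double-counted population would be proportional to this shared area. The function name and the distances are illustrative only.

```python
import math


def circle_overlap_area(r: float, d: float) -> float:
    """Area shared by two circles of equal radius r whose centers are d apart."""
    if d >= 2 * r:
        return 0.0  # the buffers do not overlap
    return 2 * r * r * math.acos(d / (2 * r)) - (d / 2) * math.sqrt(4 * r * r - d * d)


r = 5.0  # buffer radius in km
for d in (0.0, 2.0, 5.0, 10.0):
    shared = circle_overlap_area(r, d)
    naive = 2 * math.pi * r * r  # sum of both buffer areas
    union = naive - shared       # area actually covered by the two buffers
    print(f"d = {d:4.1f} km: naive sum = {naive:6.1f} km^2, union = {union:6.1f} km^2")
```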

When a set of points (representing conflict locations) is placed on a two-dimensional plane, Voronoi tessellation divides the plane into polygons, or cells, each containing exactly one of these points. The boundaries of each cell are formed by the perpendicular bisectors of the lines connecting the cell's point to its neighbors. Each Voronoi cell therefore covers the area that is closer to its point than to any other point. Because the cells do not overlap, each place is assigned to a single conflict location, which makes Voronoi tessellation well suited to delineating regions.
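The defining property of a Voronoi cell can be checked without constructing any polygons: a location belongs to the cell of whichever point is nearest to it. The sketch below illustrates this with hypothetical planar coordinates and a hypothetical helper name; it is a toy illustration, not a tessellation routine.

```python
import math

Point = tuple[float, float]


def nearest_site(p: Point, sites: list[Point]) -> int:
    """Index of the site closest to p, i.e. the Voronoi cell that p falls in."""
    return min(range(len(sites)), key=lambda i: math.dist(p, sites[i]))


sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]  # hypothetical conflict locations
print(nearest_site((1.0, 0.5), sites))  # 0
print(nearest_site((3.0, 1.0), sites))  # 1

# Points on the perpendicular bisector of sites 0 and 1 (the line x = 2)
# are equidistant from both sites, which is why cell boundaries lie on bisectors.
for y in (-1.0, 0.0, 1.0):
    on_bisector = (2.0, y)
    assert math.isclose(math.dist(on_bisector, sites[0]),
                        math.dist(on_bisector, sites[1]))
```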

In areas with a high density of conflicts, the resulting Voronoi cells can be quite small, while in regions where conflict points are more dispersed, cells can be larger. To keep the influence zones consistent, we compute the geometric intersection between each Voronoi cell and the circular buffer initially created for its point (see graphic below). The resulting zones never extend beyond the chosen radius and never overlap, which avoids double-counting when populations are summed across conflict locations.

Figure 1. Intersection of Voronoi cells with the circular buffers around conflict locations.
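The effect of intersecting buffers with Voronoi cells can be sketched on a set of grid-cell centroids: a centroid belongs to a location's zone only if that location is its nearest one and lies within the buffer radius. The coordinates, radius, grid spacing, and function name below are hypothetical, and planar distances in kilometers are assumed.

```python
import math


def zone_of(p, sites, radius):
    """Index of the clipped zone containing p, or None if p is outside every buffer."""
    i = min(range(len(sites)), key=lambda k: math.dist(p, sites[k]))
    return i if math.dist(p, sites[i]) <= radius else None


sites = [(0.0, 0.0), (1.5, 0.0)]  # two nearby conflict locations
radius = 2.0                      # km, so the two buffers overlap

# Centroids of a 0.25 km grid covering both buffers.
step = 0.25
centroids = [(-3 + step * (ix + 0.5), -3 + step * (iy + 0.5))
             for ix in range(32) for iy in range(24)]

# Naive approach: count centroids per buffer and add the counts up.
naive = sum(sum(math.dist(p, s) <= radius for p in centroids) for s in sites)
# Reference: centroids covered by at least one buffer.
union = sum(any(math.dist(p, s) <= radius for s in sites) for p in centroids)

# Clipped zones: each centroid is counted for at most one location.
zone_counts = [0] * len(sites)
for p in centroids:
    z = zone_of(p, sites, radius)
    if z is not None:
        zone_counts[z] += 1

print("naive sum:", naive, "union:", union, "per zone:", zone_counts)
assert sum(zone_counts) == union  # no double-counting
assert naive > union              # the naive sum overcounts the overlap
```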

Zonal statistics to extract population data

The "zonal statistics" operation involves extracting the population data from the grid cells within the zones as determined in the previous steps. All cells with centroids falling inside the zones become the subject of the statistics. Since our primary interest is in the total population exposed, we aggregate the population counts within these zones.
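The centroid rule described above can be sketched in plain Python. The example below is a toy stand-in for a zonal statistics routine: it uses a hypothetical 3×3 grid of 1 km cells, a hypothetical polygonal zone, and a standard ray-casting point-in-polygon test. Unlike most real rasters, row 0 is taken to be the bottom row to keep the arithmetic simple.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test for a polygon given as a list of (x, y) vertices.

    Points exactly on the boundary are not handled specially.
    """
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does this edge cross the horizontal line through y, to the right of x?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_sum(grid, polygon, cell_size=1.0, origin=(0.0, 0.0)):
    """Sum the cells whose centroid falls inside the polygon.

    grid[row][col] is the population of the cell whose lower-left corner is at
    (origin_x + col * cell_size, origin_y + row * cell_size).
    Returns (population, number_of_cells).
    """
    ox, oy = origin
    pop, n = 0.0, 0
    for row, values in enumerate(grid):
        for col, value in enumerate(values):
            cx = ox + (col + 0.5) * cell_size
            cy = oy + (row + 0.5) * cell_size
            if point_in_polygon(cx, cy, polygon):
                pop += value
                n += 1
    return pop, n


grid = [[10, 20, 30],
        [40, 50, 60],
        [70, 80, 90]]
zone = [(0.2, 0.2), (1.8, 0.2), (1.8, 0.9), (0.2, 0.9)]  # covers two centroids
print(zonal_sum(grid, zone))  # (30.0, 2)
```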

The actual geometry of the buffer zones might be complex or disproportionate to the size of the grid cells. This mismatch means that the grid cells inside the zones may not perfectly align with the defined geometry (see graphic below). To account for this discrepancy, an adjustment is necessary to refine our population estimate. 

Figure 2. Illustration of the buffer zone (black polygon) around a conflict location (black dot) used to aggregate the population count from the gridded dataset (rectangular grids). Two cells are identified as being inside the zone.

Consider the following: a buffer zone encompasses an area A (in km²) and contains n grid cells (each cell is 1 km²). If pop is the initial population count obtained by summing these cells, we adjust it to better reflect the actual population within the buffer zone. The adjusted population count, pop’, is calculated as follows:

pop’ = pop × A / n
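As a hedged illustration of this adjustment, the sketch below assumes area-proportional scaling, that is, the summed population multiplied by the ratio of the zone's area A to the area of the n counted 1 km² cells. This form is inferred from the definitions above rather than taken from the published scripts; the function name, the handling of the n = 0 case, and the numbers are assumptions.

```python
def adjusted_population(pop: float, zone_area_km2: float, n_cells: int,
                        cell_area_km2: float = 1.0) -> float:
    """Scale the summed cell population by zone area / area of the counted cells."""
    if n_cells == 0:
        return 0.0  # assumption: with no counted cells there is nothing to scale
    return pop * zone_area_km2 / (n_cells * cell_area_km2)


# Hypothetical zone of 1.5 km^2 whose two counted cells hold 300 people in total.
print(adjusted_population(pop=300.0, zone_area_km2=1.5, n_cells=2))  # 225.0
```

In this example the two counted cells cover 2 km² while the zone covers only 1.5 km², so the raw sum is scaled down by a factor of 0.75.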

By applying these methods, we ensure a more precise estimation of the population exposed to conflicts, taking into account both the specific locations of these conflicts and the intricate geometries of the affected zones.

Tradeoffs in application

In constructing the population estimates used for the conflict exposure measures, choices are made based on needs and logistical limitations. To create "conflict exposure," the following choices are made:

  1. The "constrained" dataset is used. WorldPop global datasets are available with either mapped populations constrained to satellite-defined settlement boundaries, or with predictions made for all grid cells. The difference is described here. Constrained data are found to be more accurate and appropriate for many global applications, and especially so for African countries where recent building footprint data were available to use.
  2. Interim 2021 and 2022 global datasets were constructed through simple adjustments of 2020 national population totals to match those in the UN World Population Prospects. The WorldPop global time series dataset was constructed for each year from 2000 to 2020 through a project funded by the Bill and Melinda Gates Foundation. Through a new project, WorldPop is producing a new and improved time series covering the 2015-2030 period, which is expected to be ready in mid-2024 (see more below on "updates").
  3. To enable the efficient calculation and rapid updates of exposure estimates, the 1 km spatial resolution global datasets were used, rather than the 100 m resolution data.
  4. Conflict often results in population displacement and migration. These are likely not captured in many settings due to the process of top-down disaggregation of census and projection data. WorldPop is engaged in multiple projects and activities to develop small-area population estimates that account for and update distributions when population movements occur. These include, for example, the integration of displacement survey and refugee camp data in South Sudan, disaggregating Common Operational Datasets on Population Statistics with UNFPA, the use of humanitarian datasets in Yemen, and estimating population baseline and change distributions in Ukraine. At present, none of these efforts and resulting datasets are included and, therefore, uncertainties in exposure measures will be especially high where large-scale displacements have taken place. We aim to explore the integration of population estimates produced using methods that account for population movements in future work.
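The "simple adjustment" of national totals mentioned in the second item can be pictured as a uniform rescaling: every cell in a country is multiplied by the same factor so that the grid sums to the target national total. The sketch below is an assumption about what such an adjustment could look like, with hypothetical numbers and a hypothetical function name.

```python
def rescale_to_total(cell_values: list[float], target_total: float) -> list[float]:
    """Multiply every cell by one factor so that the cells sum to target_total."""
    current = sum(cell_values)
    if current <= 0:
        raise ValueError("cannot rescale an empty or zero-population grid")
    factor = target_total / current
    return [v * factor for v in cell_values]


cells_2020 = [100.0, 300.0, 600.0]                 # hypothetical national grid
cells_2021 = rescale_to_total(cells_2020, 1020.0)  # hypothetical UN total for 2021
print(cells_2021, sum(cells_2021))
```

A uniform factor changes the national total but leaves the relative distribution of people across cells exactly as it was in 2020, which is one reason displacement is not captured, as the fourth item notes.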

Together, these trade-offs and methodological choices impose limitations on the output estimates.

Location updates by Python scripts

The extraction process described above has been implemented in a suite of Python scripts, which are publicly accessible on GitHub. The scripts build on several Python libraries, each with a specific role in the analysis. NumPy and pandas handle large numerical and tabular datasets efficiently. GeoPandas, SciPy (the scipy.spatial module), and Shapely are used to create the circular buffers and clip them with the Voronoi cells. Finally, the rasterstats package performs the zonal statistics operation, extracting population counts from the gridded data within the defined zones.

A key feature of these scripts is their adaptability to dynamic conflict data. They are designed so that new conflict locations can be integrated into the analysis pipeline without rerunning the entire analysis from scratch. Instead, the scripts update the population counts efficiently for the newly added locations. This keeps the analysis current as real-world situations change, which is valuable for researchers and analysts monitoring conflict-affected populations.
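One way to see why a full rerun is unnecessary is that a new location can only change the clipped zone of an existing location if it lies less than two buffer radii away; beyond that distance, the bisector between the two points cannot reach the existing buffer. The sketch below uses this observation to select the existing locations that would need recomputing. It assumes planar coordinates in kilometers and a single radius for all locations, and it is not a description of how the published scripts perform the update.

```python
import math


def sites_to_recompute(existing, new_sites, radius):
    """Indices of existing sites whose clipped zone may change when new_sites are added.

    A new site q can cut into the radius-r buffer of an existing site p only if
    dist(p, q) < 2 * r, because the bisector between them lies dist(p, q) / 2 from p.
    """
    return sorted({i for i, p in enumerate(existing)
                   for q in new_sites if math.dist(p, q) < 2 * radius})


existing = [(0.0, 0.0), (3.0, 0.0), (20.0, 0.0)]  # hypothetical locations, km
new = [(5.0, 0.0)]
print(sites_to_recompute(existing, new, radius=2.0))  # [1]
```

Only the returned locations, together with the new ones, would need their zones and population counts recalculated; all other results could be reused as they are.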

Figure 3. General workflow of extracting population exposed to conflicts at new locations.
