Rainfall anomalies from Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS)¶

Products used: rainfall_chirps_monthly

Background¶

Rainfall anomalies are deviations of rainfall from long-run averages. They are useful for identifying wet and dry periods which can be linked to climatically influenced patterns such as flooding, river flows, and agricultural production.

Description¶

In this real world example we will calculate rainfall anomalies for a selected African country using the CHIRPS monthly rainfall dataset. Standardised anomaly is calculated as:

\begin{equation} \text{Standardised anomaly }=\frac{x-m}{s} \end{equation}

x is the seasonal mean, m is the long-term mean, and s is the long-term standard deviation.

This means we need a long-term reference period (m) and a period of interest (x) for which we'll calculate the anomalies. This notebook names datasets ds_rf_m and ds_rf_x accordingly.

The notebook outlines:

Loading a shapefile for African countries and selecting a single country
Loading and rainfall data and masking it to the selected country.
Calculating monthly rainfall anomalies and plotting the result, aggregated over space, as a bar chart.
Calculating and plotting monthly rainfall anomalies spatially.

Getting started¶

To run this analysis, run all the cells in the notebook, starting with the "Load packages" cell.

Load packages¶

In [1]:

%matplotlib inline

import datacube
import numpy as np
import pandas as pd
import geopandas as gpd
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datacube.utils.geometry import Geometry, CRS
from deafrica_tools.spatial import xr_rasterize
from deafrica_tools.dask import create_local_dask_cluster

Set up a Dask cluster¶

Dask can be used to better manage memory use and conduct the analysis in parallel. For an introduction to using Dask with Digital Earth Africa, see the Dask notebook.

Note: We recommend opening the Dask processing window to view the different computations that are being executed; to do this, see the Dask dashboard in DE Africa section of the Dask notebook.

To use Dask, set up the local computing cluster using the cell below.

In [2]:

create_local_dask_cluster()

Client

Scheduler: tcp://127.0.0.1:46059
Dashboard: /user/mickwelli/proxy/8787/status

Cluster

Workers: 1
Cores: 2
Memory: 13.11 GB

Analysis parameters¶

The following cell sets important parameters for the analysis:

country: In this analysis, we'll select an African country to mask the dataset and analysis.
time_m: CHIRPS monthly rainfall is available from 1981. The long-term mean for rainfall anomalies is often calculated on a 30-year period, so we'll use 1981 to 2011 in this example.
time_x: This is the period for which we want to calculate anomalies.
resolution: We'll use 5,000 m, which is approximately equal to the default resolution shown above.
dask_chunks: the size of the dask chunks, dask breaks data into manageable chunks that can be easily stored in memory, e.g. dict(x=1000,y=1000)

Standardised anomaly is calculated as:

\begin{equation} \text{Standardised anomaly }=\frac{x-m}{s} \end{equation}

$x$ is the seasonal mean, $m$ is the long-term mean, and $s$ is the long-term standard deviation.

This means we need a long-term reference period (m) and a period of interest (x) for which we'll calculate the anomalies. This notebook names datasets ds_rf_m and ds_rf_x accordingly.

If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.

In [3]:

# Select a country, for the example we will use Kenya, a complete list of countries is available below.
country = "Kenya"

# Set the range of dates for the climatology, this will be the reference period (m) for the anomaly calculation.
# Standard practice is to use a 30 year period, so we've used 1981 to 2011 in this example.
time_m = ('1981', '2011')

# time period for monthly anomaly (x)
time_x = ('1981', '2021')

# CHIRPS has a spatial resolution of ~5x5 km
resolution = (-5000, 5000)

#size of dask chunks
dask_chunks = dict(x=500,y=500)

Connect to the datacube¶

Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.

In [4]:

dc = datacube.Datacube(app='rainfall_anomaly')

Load the African Countries shapefile¶

This shapefile contains polygons for the boundaries of African countries and will allows us to calculate rainfall anomalies within a chosen country

In [5]:

african_countries = gpd.read_file('../Supplementary_data/Rainfall_anomaly_CHIRPS/african_countries.geojson')
african_countries.explore()

Out[5]:

Make this Notebook Trusted to load map: File -> Trust Notebook

List countries¶

You can change the country in the analysis parameters cell to any African country. A complete list of countries is printed below.

In [6]:

print(np.unique(african_countries.COUNTRY))

['Algeria' 'Angola' 'Benin' 'Botswana' 'Burkina Faso' 'Burundi' 'Cameroon'
 'Cape Verde' 'Central African Republic' 'Chad' 'Comoros'
 'Congo-Brazzaville' 'Cote d`Ivoire' 'Democratic Republic of Congo'
 'Djibouti' 'Egypt' 'Equatorial Guinea' 'Eritrea' 'Ethiopia' 'Gabon'
 'Gambia' 'Ghana' 'Guinea' 'Guinea-Bissau' 'Kenya' 'Lesotho' 'Liberia'
 'Libya' 'Madagascar' 'Malawi' 'Mali' 'Mauritania' 'Morocco' 'Mozambique'
 'Namibia' 'Niger' 'Nigeria' 'Rwanda' 'Sao Tome and Principe' 'Senegal'
 'Sierra Leone' 'Somalia' 'South Africa' 'Sudan' 'Swaziland' 'Tanzania'
 'Togo' 'Tunisia' 'Uganda' 'Western Sahara' 'Zambia' 'Zimbabwe']

Setup polygon¶

The country selected needs to be transformed into a geometry object to be used in the dc.load() function.

In [7]:

idx = african_countries[african_countries['COUNTRY'] == country].index[0]
geom = Geometry(geom=african_countries.iloc[idx].geometry, crs=african_countries.crs)

Load rainfall data¶

First, let's have a look at the product information for CHIRPS rainfall. We can see that it is stored in monthly timestamps and its native resolution is approximately 0.05 degrees.

In [8]:

dc.list_products().loc[dc.list_products()['name'].str.contains('chirps')]

Out[8]:

.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }

	name	description	license	default_crs	default_resolution
name
rainfall_chirps_monthly	rainfall_chirps_monthly	Rainfall Estimates from Rain Gauge and Satelli...	None	EPSG:4326	(-0.05000000074505806, 0.05000000074505806)

Load data for long-term climatology (m)¶

Using the analysis parameters defined above, we will load CHIRPS monthly rainfall data for the 30-year reference period (m).

In [9]:

query = {'geopolygon': geom,
         'time': time_m,
         'output_crs': 'epsg:6933',
         'resolution': resolution,
         'measurements': ['rainfall'],
         'dask_chunks':dask_chunks
        }

ds_rf = dc.load(product='rainfall_chirps_monthly', **query)
ds_rf

Out[9]:

/* CSS stylesheet for displaying xarray objects in jupyterlab. * */ :root { --xr-font-color0: var(--jp-content-font-color0, rgba(0, 0, 0, 1)); --xr-font-color2: var(--jp-content-font-color2, rgba(0, 0, 0, 0.54)); --xr-font-color3: var(--jp-content-font-color3, rgba(0, 0, 0, 0.38)); --xr-border-color: var(--jp-border-color2, #e0e0e0); --xr-disabled-color: var(--jp-layout-color3, #bdbdbd); --xr-background-color: var(--jp-layout-color0, white); --xr-background-color-row-even: var(--jp-layout-color1, white); --xr-background-color-row-odd: var(--jp-layout-color2, #eeeeee); } html[theme=dark], body.vscode-dark { --xr-font-color0: rgba(255, 255, 255, 1); --xr-font-color2: rgba(255, 255, 255, 0.54); --xr-font-color3: rgba(255, 255, 255, 0.38); --xr-border-color: #1F1F1F; --xr-disabled-color: #515151; --xr-background-color: #111111; --xr-background-color-row-even: #111111; --xr-background-color-row-odd: #313131; } .xr-wrap { display: block; min-width: 300px; max-width: 700px; } .xr-text-repr-fallback { /* fallback to plain text repr when CSS is not injected (untrusted notebook) */ display: none; } .xr-header { padding-top: 6px; padding-bottom: 6px; margin-bottom: 4px; border-bottom: solid 1px var(--xr-border-color); } .xr-header > div, .xr-header > ul { display: inline; margin-top: 0; margin-bottom: 0; } .xr-obj-type, .xr-array-name { margin-left: 2px; margin-right: 10px; } .xr-obj-type { color: var(--xr-font-color2); } .xr-sections { padding-left: 0 !important; display: grid; grid-template-columns: 150px auto auto 1fr 20px 20px; } .xr-section-item { display: contents; } .xr-section-item input { display: none; } .xr-section-item input + label { color: var(--xr-disabled-color); } .xr-section-item input:enabled + label { cursor: pointer; color: var(--xr-font-color2); } .xr-section-item input:enabled + label:hover { color: var(--xr-font-color0); } .xr-section-summary { grid-column: 1; color: var(--xr-font-color2); font-weight: 500; } .xr-section-summary > span { display: inline-block; padding-left: 0.5em; } .xr-section-summary-in:disabled + label { color: var(--xr-font-color2); } .xr-section-summary-in + label:before { display: inline-block; content: '►'; font-size: 11px; width: 15px; text-align: center; } .xr-section-summary-in:disabled + label:before { color: var(--xr-disabled-color); } .xr-section-summary-in:checked + label:before { content: '▼'; } .xr-section-summary-in:checked + label > span { display: none; } .xr-section-summary, .xr-section-inline-details { padding-top: 4px; padding-bottom: 4px; } .xr-section-inline-details { grid-column: 2 / -1; } .xr-section-details { display: none; grid-column: 1 / -1; margin-bottom: 5px; } .xr-section-summary-in:checked ~ .xr-section-details { display: contents; } .xr-array-wrap { grid-column: 1 / -1; display: grid; grid-template-columns: 20px auto; } .xr-array-wrap > label { grid-column: 1; vertical-align: top; } .xr-preview { color: var(--xr-font-color3); } .xr-array-preview, .xr-array-data { padding: 0 5px !important; grid-column: 2; } .xr-array-data, .xr-array-in:checked ~ .xr-array-preview { display: none; } .xr-array-in:checked ~ .xr-array-data, .xr-array-preview { display: inline-block; } .xr-dim-list { display: inline-block !important; list-style: none; padding: 0 !important; margin: 0; } .xr-dim-list li { display: inline-block; padding: 0; margin: 0; } .xr-dim-list:before { content: '('; } .xr-dim-list:after { content: ')'; } .xr-dim-list li:not(:last-child):after { content: ','; padding-right: 5px; } .xr-has-index { font-weight: bold; } .xr-var-list, .xr-var-item { display: contents; } .xr-var-item > div, .xr-var-item label, .xr-var-item > .xr-var-name span { background-color: var(--xr-background-color-row-even); margin-bottom: 0; } .xr-var-item > .xr-var-name:hover span { padding-right: 5px; } .xr-var-list > li:nth-child(odd) > div, .xr-var-list > li:nth-child(odd) > label, .xr-var-list > li:nth-child(odd) > .xr-var-name span { background-color: var(--xr-background-color-row-odd); } .xr-var-name { grid-column: 1; } .xr-var-dims { grid-column: 2; } .xr-var-dtype { grid-column: 3; text-align: right; color: var(--xr-font-color2); } .xr-var-preview { grid-column: 4; } .xr-var-name, .xr-var-dims, .xr-var-dtype, .xr-preview, .xr-attrs dt { white-space: nowrap; overflow: hidden; text-overflow: ellipsis; padding-right: 10px; } .xr-var-name:hover, .xr-var-dims:hover, .xr-var-dtype:hover, .xr-attrs dt:hover { overflow: visible; width: auto; z-index: 1; } .xr-var-attrs, .xr-var-data { display: none; background-color: var(--xr-background-color) !important; padding-bottom: 5px !important; } .xr-var-attrs-in:checked ~ .xr-var-attrs, .xr-var-data-in:checked ~ .xr-var-data { display: block; } .xr-var-data > table { float: right; } .xr-var-name span, .xr-var-data, .xr-attrs { padding-left: 25px !important; } .xr-attrs, .xr-var-attrs, .xr-var-data { grid-column: 1 / -1; } dl.xr-attrs { padding: 0; margin: 0; display: grid; grid-template-columns: 125px auto; } .xr-attrs dt, .xr-attrs dd { padding: 0; margin: 0; float: left; padding-right: 10px; width: auto; } .xr-attrs dt { font-weight: normal; grid-column: 1; } .xr-attrs dt:hover span { display: inline-block; background: var(--xr-background-color); padding-right: 10px; } .xr-attrs dd { grid-column: 2; white-space: pre-wrap; word-break: break-all; } .xr-icon-database, .xr-icon-file-text2 { display: inline-block; vertical-align: middle; width: 1em; height: 1.5em !important; stroke-width: 0; stroke: currentColor; fill: currentColor; }

<xarray.Dataset>
Dimensions:      (time: 372, y: 238, x: 155)
Coordinates:
  * time         (time) datetime64[ns] 1981-01-16T11:59:59.500000 ... 2011-12...
  * y            (y) float64 5.875e+05 5.825e+05 ... -5.925e+05 -5.975e+05
  * x            (x) float64 3.272e+06 3.278e+06 ... 4.038e+06 4.042e+06
    spatial_ref  int32 6933
Data variables:
    rainfall     (time, y, x) float32 dask.array<chunksize=(1, 238, 155), meta=np.ndarray>
Attributes:
    crs:           epsg:6933
    grid_mapping:  spatial_ref

xarray.Dataset

Dimensions:
- time: 372
- y: 238
- x: 155

Coordinates: (4)

time

(time)

datetime64[ns]

1981-01-16T11:59:59.500000 ... 2...

units : seconds since 1970-01-01 00:00:00

array(['1981-01-16T11:59:59.500000000', '1981-02-14T23:59:59.500000000',
       '1981-03-16T11:59:59.500000000', ..., '2011-10-16T11:59:59.500000000',
       '2011-11-15T23:59:59.500000000', '2011-12-16T11:59:59.500000000'],
      dtype='datetime64[ns]')

y
(y)
float64
5.875e+05 5.825e+05 ... -5.975e+05
units : metre resolution : -5000.0 crs : epsg:6933
```
array([ 587500.,  582500.,  577500., ..., -587500., -592500., -597500.])
```

(x)

float64

3.272e+06 3.278e+06 ... 4.042e+06

units : metre resolution : 5000.0 crs : epsg:6933

array([3272500., 3277500., 3282500., 3287500., 3292500., 3297500., 3302500.,
       3307500., 3312500., 3317500., 3322500., 3327500., 3332500., 3337500.,
       3342500., 3347500., 3352500., 3357500., 3362500., 3367500., 3372500.,
       3377500., 3382500., 3387500., 3392500., 3397500., 3402500., 3407500.,
       3412500., 3417500., 3422500., 3427500., 3432500., 3437500., 3442500.,
       3447500., 3452500., 3457500., 3462500., 3467500., 3472500., 3477500.,
       3482500., 3487500., 3492500., 3497500., 3502500., 3507500., 3512500.,
       3517500., 3522500., 3527500., 3532500., 3537500., 3542500., 3547500.,
       3552500., 3557500., 3562500., 3567500., 3572500., 3577500., 3582500.,
       3587500., 3592500., 3597500., 3602500., 3607500., 3612500., 3617500.,
       3622500., 3627500., 3632500., 3637500., 3642500., 3647500., 3652500.,
       3657500., 3662500., 3667500., 3672500., 3677500., 3682500., 3687500.,
       3692500., 3697500., 3702500., 3707500., 3712500., 3717500., 3722500.,
       3727500., 3732500., 3737500., 3742500., 3747500., 3752500., 3757500.,
       3762500., 3767500., 3772500., 3777500., 3782500., 3787500., 3792500.,
       3797500., 3802500., 3807500., 3812500., 3817500., 3822500., 3827500.,
       3832500., 3837500., 3842500., 3847500., 3852500., 3857500., 3862500.,
       3867500., 3872500., 3877500., 3882500., 3887500., 3892500., 3897500.,
       3902500., 3907500., 3912500., 3917500., 3922500., 3927500., 3932500.,
       3937500., 3942500., 3947500., 3952500., 3957500., 3962500., 3967500.,
       3972500., 3977500., 3982500., 3987500., 3992500., 3997500., 4002500.,
       4007500., 4012500., 4017500., 4022500., 4027500., 4032500., 4037500.,
       4042500.])

spatial_ref
()
int32
6933
spatial_ref : PROJCS["WGS 84 / NSIDC EASE-Grid 2.0 Global",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Cylindrical_Equal_Area"],PARAMETER["standard_parallel_1",30],PARAMETER["central_meridian",0],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","6933"]] grid_mapping_name : lambert_cylindrical_equal_area
```
array(6933, dtype=int32)
```

Data variables: (1)

rainfall

(time, y, x)

float32

dask.array<chunksize=(1, 238, 155), meta=np.ndarray>

units : mm nodata : -9999 crs : epsg:6933 grid_mapping : spatial_ref

	Array	Chunk
Bytes	54.89 MB	147.56 kB
Shape	(372, 238, 155)	(1, 238, 155)
Count	744 Tasks	372 Chunks
Type	float32	numpy.ndarray

155 238 372

Attributes: (2)
crs : epsg:6933 grid_mapping : spatial_ref

Mask the rainfall dataset using the country boundary¶

Below, the country polygon is rasterized so the rainfall dataset is masked within that raster.

In [10]:

african_countries = african_countries.to_crs('epsg:6933')

mask = xr_rasterize(african_countries[african_countries['COUNTRY'] == country], ds_rf)

#mask the rainfall dataset
ds_rf = ds_rf.where(mask)

# Plot the mask
mask.plot()

Out[10]:

<matplotlib.collections.QuadMesh at 0x7f0a7f7fd790>

Calculate monthly rainfall climatology¶

We want to capture both the monthly mean (i.e. each month averaged over thirty years) and the monthly standard deviation of rainfall within the country polygon for each year from 1981 to 2011. Firstly, rainfall is grouped by month and a mean is calculated, then the standard devation in rainfall total for each month is calculated

In [11]:

ds_rf_m = ds_rf.where(ds_rf !=-9999.) #convert missing values to NaN

# monthly means
climatology_mean = ds_rf_m.groupby('time.month').mean('time').compute()

#calculate monthly std dev
climatology_std = ds_rf_m.groupby('time.month').std('time').compute()

Now we can plot the rainfall mean climatology, this is the average rainfall (over 30 years) for each month

In [12]:

climatology_mean['rainfall'].plot.imshow(cmap='viridis', col='month', col_wrap=6, label=False);

Load data for the anomaly period¶

Using the analysis parameters defined above, we will load CHIRPS rainfall data for the period over which we want to calculate anomalies (x). We also need to mask this dataset to the country polygon.

In [13]:

#load rainfall data for the anomaly period matching the spatial extent of the climatologies
ds_rf_x = dc.load(product='rainfall_chirps_monthly',
                  like=ds_rf.geobox,
                  time=time_x,
                  dask_chunks=dask_chunks)

#mask with country polygon
ds_rf_x=ds_rf_x.where(mask)

Calculate standardised anomalies¶

We can visualise the anomalies spatially and see if they are associated with certain landscape features.

Do the spatial anomalies shown in the plots below align with the aggregated anomalies shown above?

In [14]:

stand_anomalies = xr.apply_ufunc(
    lambda x, m, s: (x - m) / s,
    ds_rf_x.groupby("time.month"),
    climatology_mean,
    climatology_std,
    output_dtypes=[ds_rf_x.rainfall.dtype],
    dask="allowed"
).compute()

Plot average rainfall anomalies across the country¶

Below, the spatial mean is taken so we can present the monthly anomalies aggregated across the selected country.

In [15]:

spatial_mean_anoms = stand_anomalies.rainfall.mean(['x','y']).to_dataframe().drop(['spatial_ref', 'month'], axis=1)

Below, we plot a bar graph that will show the average rainfall anomaly across the country

In [16]:

fig, ax = plt.subplots(figsize=(21,6))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
ax.set_ylim(-2,3.5)
ax.bar(spatial_mean_anoms.index,
       spatial_mean_anoms.rainfall,
       width=35, align='center',
       color=(spatial_mean_anoms['rainfall'] > 0).map({True: 'b', False: 'r'}))
ax.axhline(0, color='black', linestyle='--')
plt.title('Monthly standardised rainfall anomalies for '+country)
plt.ylabel('Std. deviations from monthly mean');

Per-pixel plots of rainfall anomalies¶

Average anomalies across the entire country obscure details on how rainfall anomalies are spatially distrbuted within the country. Below, enter a start and end date (in format 'YYYY-MM') that is within the time_x range you entered in the Analysis parameters section to plot per-pixel anomalies for the range of dates you specify.

In [17]:

#Select a time-range to plot
start='2020-07'
end='2021-06' 

Plot the per-pixel anomalies¶

In [18]:

stand_anomalies['rainfall'].sel(time=slice(start,end)).plot(cmap='RdBu', label=False, col="time", col_wrap=6, robust=True);

Additional information¶

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on Github.

Compatible datacube version:

In [19]:

print(datacube.__version__)

1.8.6

Last Tested:

In [20]:

from datetime import datetime
datetime.today().strftime('%Y-%m-%d')

Out[20]:

'2021-11-24'

{"state": {}, "version_major": 2, "version_minor": 0}