The ihsMW package is
a dedicated toolkit designed to clean, harmonise, and aggregate
household survey data from the Malawi Integrated Household Survey (IHS)
series. It is built to support researchers and analysts working with the
IHS2 (2004/05), IHS3 (2010/11), IHS4 (2016/17), and IHS5 (2019/20)
datasets.
You can install the stable release of ihsMW from
CRAN:
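Assuming the package is published on CRAN under the name ihsMW, as the sentence above states, the standard installation call would be:

```r
# Standard CRAN installation; assumes the package is listed on CRAN as "ihsMW"
install.packages("ihsMW")
```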
Or install the development version from GitHub:
The Malawi National Statistical Office (NSO) conducts the Integrated Household Survey (IHS) periodically to track poverty, household expenditure, agriculture, and other socio-economic indicators. The primary rounds include IHS2 (2004/05), IHS3 (2010/11), IHS4 (2016/17), and IHS5 (2019/20).
Due to licensing restrictions, the raw microdata cannot be
redistributed directly within R packages. Researchers must first
register and manually download the survey data in Stata
(.dta) format from the World Bank Microdata
Library.
Once downloaded, place the files in a structured folder hierarchy on your local machine.
Each round of the IHS uses different variable names for the same
question. For example, household size is recorded under different column
names depending on the round. ihsMW uses a comprehensive
crosswalk to harmonise these variable names.
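To make the idea of a crosswalk concrete, here is a minimal base R sketch of name harmonisation. It is not the package's implementation: the crosswalk table, the raw column names (`hh_size_a`, `hhsize`), the harmonised name `hh_size`, and the helper `harmonise_names()` are all hypothetical and chosen only for illustration.

```r
# Hypothetical crosswalk: one row per (round, raw name) pair.
# The raw and harmonised names are invented for this example.
crosswalk <- data.frame(
  round      = c("IHS4", "IHS5"),
  raw_name   = c("hh_size_a", "hhsize"),
  harmonised = c("hh_size", "hh_size"),
  stringsAsFactors = FALSE
)

# Rename the columns of `data` that have a mapping for the given round.
# Columns without a mapping (e.g. case_id) are left unchanged.
harmonise_names <- function(data, round, crosswalk) {
  cw  <- crosswalk[crosswalk$round == round, ]
  idx <- match(names(data), cw$raw_name)  # NA where no mapping exists
  names(data)[!is.na(idx)] <- cw$harmonised[idx[!is.na(idx)]]
  data
}

# Toy stand-ins for two raw survey files
ihs4_raw <- data.frame(case_id = 1:2, hh_size_a = c(4, 6))
ihs5_raw <- data.frame(case_id = 3:4, hhsize = c(5, 3))

ihs4 <- harmonise_names(ihs4_raw, "IHS4", crosswalk)
ihs5 <- harmonise_names(ihs5_raw, "IHS5", crosswalk)

# Both rounds now share the same column names and can be stacked
pooled <- rbind(ihs4, ihs5)
```

Keeping the mapping in a table rather than in code means a new round only requires new rows, not new renaming logic.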
To load and harmonise a raw survey file:
To find variables mapped in the crosswalk, use
ihs_search(). You can search by keywords or labels:
# Search for consumption-related variables
ihs_search("consumption")
# Search for age within a specific round
ihs_search("age", round = "IHS5")

To view a summary of the crosswalk coverage and flag variables
needing review, use ihs_crosswalk_check():
ihsMW provides tools to clean standard survey anomalies,
handle missing value codes, and winsorize extreme values:
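For intuition, the two operations can be sketched in plain base R before turning to the package functions. This is an illustration under stated assumptions, not the package's internals: the set of missing codes, the 5th/95th percentile cut-offs, and the helper `winsorize()` are assumptions made for this example.

```r
# Toy data; the column names mirror the examples in this README
df <- data.frame(
  urban    = c(1, 1, 1, 1, 0, 0, 0, 0),
  food_exp = c(100, 120, -99, 5000, 40, 50, 60, -98)
)

# Step 1: recode survey missing codes to NA (the code set is an assumption)
missing_codes <- c(-99, -98)
df$food_exp[df$food_exp %in% missing_codes] <- NA

# Step 2: cap values at chosen quantiles; the cut-offs are an assumption
winsorize <- function(x, probs = c(0.05, 0.95)) {
  bounds <- quantile(x, probs = probs, na.rm = TRUE)
  pmin(pmax(x, bounds[1]), bounds[2])  # NA values stay NA
}

# Apply the cap separately within each stratum (urban vs rural)
df$food_exp_w <- ave(df$food_exp, df$urban, FUN = winsorize)
```

Recoding missing values first matters: otherwise codes such as -99 would be treated as real expenditures and distort the quantiles used for capping.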
# Convert standard survey missing codes (-99, -98, etc.) to NA
df_clean <- ihs_standardize_missing(harmonised_data)
# Winsorize outliers (e.g. food expenditure) stratified by urban/rural
df_winsor <- ihs_winsorize(df_clean, value_col = "food_exp", strata_col = "urban")
# Run the master cleaning wrapper which applies both steps and logs changes
df_cleaned <- ihs_clean(
  data = harmonised_data,
  missing_cols = c("food_exp", "nonfood_exp"),
  winsorize_cols = "food_exp",
  strata_col = "urban"
)

Agricultural modules in the IHS allow households to report harvest
quantities in non-standard units (e.g., pails, basins, ox-carts, bags)
rather than standard kilograms. ihsMW bundles official NSO
conversion factors to convert these quantities to standard
kilograms:
# Convert quantities reported in non-standard units to kilograms
crop_data <- data.frame(
  crop_code = c(1, 2),
  unit_code = c(3, 4),
  quantity = c(10, 5),
  region = c(1, 2)
)
crop_data_kg <- ihs_convert_units(
  data = crop_data,
  crop_col = "crop_code",
  unit_col = "unit_code",
  qty_col = "quantity",
  region_col = "region"
)
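Conceptually, a conversion of this kind amounts to a lookup join followed by a multiplication. The base R sketch below shows that logic only; it does not reproduce how ihs_convert_units() works internally. The `factors` table, its `kg_per_unit` column, and the factor values are invented for illustration and are not the NSO conversion factors bundled with the package.

```r
# Hypothetical conversion factors (kg per reported unit).
# These values are made up and are NOT the official NSO factors.
factors <- data.frame(
  crop_code   = c(1, 2),
  unit_code   = c(3, 4),
  region      = c(1, 2),
  kg_per_unit = c(18.5, 50)
)

crop_data <- data.frame(
  crop_code = c(1, 2),
  unit_code = c(3, 4),
  quantity  = c(10, 5),
  region    = c(1, 2)
)

# Left join: rows without a matching factor are kept, with NA as the factor
merged <- merge(
  crop_data, factors,
  by = c("crop_code", "unit_code", "region"),
  all.x = TRUE
)

# Quantity in kilograms = reported quantity x kg per reported unit
merged$quantity_kg <- merged$quantity * merged$kg_per_unit
```

Using a left join keeps unmatched crop, unit, and region combinations visible as NA rather than dropping them silently, which makes gaps in the factor table easy to spot.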