Map Methodology Note
How the 205 state-level choropleth maps were built, what data they use, and what they can and cannot tell you.
What These Maps Are
The set comprises 205 state-level choropleth maps of India covering demography, food and culture, economy, health, gender, caste, religion, education, environment, democracy, social protection, and reproductive choices. Each map shows a single indicator across India's 28 states and 8 union territories (36 units).
Data Sources
The maps draw primarily on seven source families:
| Source | Data Year | Coverage | ~Maps |
|---|---|---|---|
| NFHS-5 | 2019-21 | Health, nutrition, family planning, domestic violence, women's empowerment, dietary patterns | ~80 |
| Census of India | 2011 | Demographics, language, religion, education, housing, disability, migration, household assets | ~45 |
| PLFS | 2022-23 | Employment, labour force participation, workforce informality | ~10 |
| NCRB Crime in India | 2022 | Crimes against women, crimes against SCs, dowry deaths, trafficking, acid attacks | ~12 |
| NHA / HCES / SRS | 2018-23 | Health expenditure, consumption, mortality | ~15 |
| Government Dashboards | 2022-24 | MGNREGA, JJM, NSAP, Udyam, PMFBY, UDISE+, DPIIT Startup India | ~20 |
| Other | Various | IMD, CGWB, CPCB, FSI, ECI, ADR, TRAI, RBI, NASA VIIRS, NACO, Pew, CSDS, SFLC.in, IHDS-II, NABARD, Dharmaviki/IIT-B | ~23 |
Note on NCRB data: NCRB data reflects registered cases, not incidence. A high registration rate may indicate more responsive policing rather than worse safety, and a low rate may indicate suppressed reporting rather than less crime. This caveat is flagged on every NCRB-sourced map.
Data Quality Flags
Each map carries one of three flags:
BUILT: Values taken directly from published official tables. No interpolation.
NEW: Values compiled from published official sources during map preparation. Cross-checked against source publications.
UNCERTAIN: Values derived from proxy indicators, crowdsourced databases, or methodologically contested sources. 18 maps carry this flag, including temple density, inter-caste marriage, child labour (post-2011), honour killings, witch-hunting deaths, manual scavenging, and gig workers. Each carries a specific caveat in its source line.
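The rule that every UNCERTAIN map states a caveat can be enforced mechanically. The following is a minimal sketch, not the project's actual code: the `QualityFlag` enum, the `source_line` helper, and the output format are all hypothetical names chosen for illustration.

```python
from enum import Enum


class QualityFlag(Enum):
    """The three data quality flags described above."""
    BUILT = "taken directly from published official tables"
    NEW = "compiled from published official sources during map preparation"
    UNCERTAIN = "derived from proxy, crowdsourced, or contested sources"


def source_line(source, year, flag, caveat=""):
    """Build a source line (hypothetical format); UNCERTAIN requires a caveat."""
    caveat = caveat.strip()
    if flag is QualityFlag.UNCERTAIN and not caveat:
        raise ValueError("UNCERTAIN maps need a specific caveat in the source line")
    line = f"Source: {source} ({year}) [{flag.name}]"
    return f"{line}. Caveat: {caveat}" if caveat else line
```

Failing loudly at build time means an UNCERTAIN map cannot be rendered without its caveat.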
How the Maps Were Rendered
Maps are rendered programmatically using Python 3 with GeoPandas and Matplotlib. The base shapefile is Natural Earth 10m Admin-1 (ne_10m_admin_1_states_provinces.shp), filtered to India.
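As an illustration of the filtering step, the sketch below reads the shapefile and keeps India's polygons. It is not taken from maps_v2.py. It assumes the Natural Earth Admin-1 layer names the country in an `admin` column, and both function names are hypothetical. The second helper lists names that would fail to join, which matters because state spellings in a shapefile may differ from those in the data.

```python
def load_india_states(shapefile_path):
    """Read the Admin-1 shapefile and keep only India's polygons."""
    import geopandas as gpd  # imported here so the name check below runs without it

    world = gpd.read_file(shapefile_path)
    # Assumption: the country name is stored in the "admin" column.
    return world[world["admin"] == "India"].copy()


def unmatched_names(shape_names, values):
    """Compare polygon names with data keys before joining values to shapes."""
    shape_set, data_set = set(shape_names), set(values)
    return {
        "no_value": sorted(shape_set - data_set),    # polygons left uncoloured
        "no_polygon": sorted(data_set - shape_set),  # values with nowhere to go
    }
```

For example, `unmatched_names(["Bihar", "Odisha"], {"Bihar": 1.0, "Orissa": 2.0})` reports "Odisha" as lacking a value and "Orissa" as lacking a polygon.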
The rendering script (maps_v2.py) takes a standardised data dictionary for each map containing the title, unit, source, year, takeaway sentence, shock statistic, highest and lowest state callouts, national average, and a complete state-to-value data dictionary. Given the same input, it produces the same output.
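The fields listed above could be held in a structure like the one sketched here. This is an assumption about shape only: `MapSpec`, its field names, and the `problems` check are hypothetical and do not come from maps_v2.py. The check confirms that all 36 units are present and that the stated callouts agree with the data.

```python
from dataclasses import dataclass

EXPECTED_UNITS = 36  # 28 states + 8 union territories


@dataclass(frozen=True)
class MapSpec:
    """Hypothetical container mirroring the fields listed in the text."""
    title: str
    unit: str
    source: str
    year: str
    takeaway: str
    shock_stat: str
    highest: str             # name of the unit called out as highest
    lowest: str              # name of the unit called out as lowest
    national_average: float
    data: dict               # state or UT name -> value

    def problems(self):
        """Return a list of consistency problems (empty if none)."""
        found = []
        if len(self.data) != EXPECTED_UNITS:
            found.append(f"expected {EXPECTED_UNITS} units, got {len(self.data)}")
        if self.data:
            if self.highest != max(self.data, key=self.data.get):
                found.append("highest callout does not match the data")
            if self.lowest != min(self.data, key=self.data.get):
                found.append("lowest callout does not match the data")
        return found
```

Running such a check before rendering catches a stale callout or a missing union territory early, and keeping the structure immutable supports the same-input, same-output property.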
Northeast India's eight states (the seven sister states plus Sikkim) are rendered in a dedicated inset panel overlaying the main map's NE region, since their polygons are too small for readable labels at national scale. Every state receives a two-letter abbreviation and its data value.
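One way to route labels to the right panel is sketched below. It assumes the two-letter abbreviations follow ISO 3166-2:IN codes for the eight northeastern states, Sikkim included. The maps' own abbreviations may differ, and the function names are hypothetical.

```python
# Assumption: ISO 3166-2:IN codes for the northeastern states, Sikkim included.
NE_INSET_CODES = {"AR", "AS", "MN", "ML", "MZ", "NL", "SK", "TR"}


def split_panels(values):
    """Split a code -> value mapping into (main panel, NE inset panel)."""
    main = {code: v for code, v in values.items() if code not in NE_INSET_CODES}
    inset = {code: v for code, v in values.items() if code in NE_INSET_CODES}
    return main, inset


def label(code, value, decimals=1):
    """Two-line label: abbreviation above the data value."""
    return f"{code}\n{value:.{decimals}f}"
```

With this split, `split_panels({"BR": 1.0, "SK": 2.0})` places Bihar's label on the main map and Sikkim's in the inset.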
Design Conventions
- Warm off-white background (#F8F6F1), 16:9 aspect ratio, 200 DPI PNG output
- Amaranth Bold ALL CAPS for titles; Inter (variable weight) for all other text
- Each of the 12 thematic sections has a dedicated three-stop colour gradient (light to dark)
- A red shock-stat box (top right) highlights the single most audience-confronting number
- The colourbar (bottom centre) carries an amber marker at the national average value
- Source and data year appear bottom left; copyright line bottom right
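The gradient and national-average marker conventions can be sketched in plain Python, as below. This is an illustration rather than the script's actual code: the figure size in inches is an assumption (the text fixes only the aspect ratio and DPI), any palette passed in is hypothetical, and the function names are invented here.

```python
BACKGROUND = "#F8F6F1"      # from the conventions above
DPI = 200                   # from the conventions above
FIGSIZE_INCHES = (16, 9)    # assumption: any 16:9 size would satisfy the text


def hex_to_rgb(colour):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    colour = colour.lstrip("#")
    return tuple(int(colour[i:i + 2], 16) for i in (0, 2, 4))


def three_stop_colour(stops, t):
    """Interpolate linearly through (light, middle, dark) hex stops; t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)
    light, middle, dark = (hex_to_rgb(s) for s in stops)
    if t < 0.5:
        start, end, local = light, middle, t * 2
    else:
        start, end, local = middle, dark, (t - 0.5) * 2
    channels = (round(a + (b - a) * local) for a, b in zip(start, end))
    return "#" + "".join(f"{c:02X}" for c in channels)


def colourbar_position(value, vmin, vmax):
    """Fraction along the colourbar at which to place a marker for value."""
    if vmax == vmin:
        return 0.5
    return (value - vmin) / (vmax - vmin)
```

In Matplotlib, `LinearSegmentedColormap.from_list` builds an equivalent colormap from three stops, and the amber marker would sit at `colourbar_position(national_average, vmin, vmax)` along the bar.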
Key Limitations
Temporal mismatch
Maps span data from 2011 to 2024. No attempt is made to harmonise time periods. Each map's source line states its data year.
State-level aggregation
State averages conceal large within-state variation. Bihar's state average, for example, masks the gap between Patna (near-national-average performance) and Seemanchal (far below).
NCRB reporting bias
NCRB figures count registered cases, not incidence. States with high crime registration rates may have better police responsiveness rather than worse safety. States with low rates may suppress reporting.
Census 2011 age
For rapidly changing indicators (urbanisation, digital access, employment), Census 2011 is directionally valid but quantitatively outdated. Where newer survey data exists, it is preferred.
Estimates are not fabrications
The 18 UNCERTAIN maps use the best available published data. Their values carry greater uncertainty than those on BUILT or NEW maps and should be presented with their caveats stated.
These maps are pedagogical tools designed for facilitated discussion. They are not intended as standalone data references. For policy or academic citation, consult the underlying source publications directly.
Version 2. March 2026. Rendered with maps_v2.py using GeoPandas + Matplotlib.
Data work by Varna.