Climate-Smart Agriculture

Problem Statement

Maji Ndogo is a fictional region used as a simulated dataset by ExploreAI Academy, modelled on real agricultural challenges across Sub-Saharan Africa. The dataset draws place names from across the continent, including areas in Nigeria, Kenya, and South Africa, to represent diverse farming landscapes and climate conditions.

Smallholder farmers across Sub-Saharan Africa face increasing climate uncertainty. Erratic rainfall, rising temperatures, and shifting seasons make traditional farming knowledge less reliable. This project analysed crop yield data against climate variables within the Maji Ndogo dataset to identify patterns that could help farmers and agricultural planners make better decisions about what to grow, when to plant, and where to invest.

Approach

Starting from raw agricultural data in a SQLite database, I built a structured Python pipeline that takes data from ingestion through to final insights — designed to be reproducible and transparent at every stage.

Loaded and validated raw agricultural datasets using Pandas and SQLAlchemy
Cleaned missing values, corrected data types, and fixed spelling errors
Performed statistical analysis to identify correlations between climate variables and crop yields
Built visualisations with Matplotlib to communicate findings clearly
Exported a summarised Excel report for non-technical stakeholders

Pipeline Architecture

# Data Ingestion & Pipeline Architecture
def create_db_engine(db_path):
    """Creates and returns a SQLAlchemy database engine."""
    engine = create_engine(db_path)
    return engine

def query_data(engine, sql_query):
    """Queries the database and returns a DataFrame."""
    with engine.connect() as connection:
        df = pd.read_sql_query(text(sql_query), connection)
        return df

def read_from_web_CSV(URL):
    """Reads a CSV file from a web URL into a DataFrame."""
    df = pd.read_csv(URL)
    return df

# Run the full pipeline
field_df   = query_data(create_db_engine(config_params['db_path']),
                        config_params['sql_query'])
weather_df = read_from_web_CSV(config_params['weather_csv_path'])
        

Screenshots

Python data pipeline in Jupyter Notebook — Jupyter Notebook — Modular data ingestion pipeline functions

Python statistical analysis and visualisations — Python — Statistical analysis and Matplotlib visualisations

Excel summary report for agriculture project — Excel — Crop yield and climate summary exported from Python

Results

Identified strong correlation between seasonal rainfall and crop yields across regions
Flagged temperature anomaly years where yields dropped significantly
Produced a clean Excel summary accessible to agricultural extension officers
Built a reproducible pipeline that can be rerun as new data becomes available

Skills Demonstrated

Python data pipeline design (Pandas, SQLAlchemy, Matplotlib)
Data cleaning and validation at scale
Statistical correlation analysis
Data visualisation for storytelling
Excel reporting for non-technical audiences
Jupyter Notebook documentation