Course Coordinators

Dr. Ogechi Okanya Cookey
Department: Journalism and Media Studies

Professor Zubairul Islam
Department: Geospatial Sciences
HU-JMS 106: Data Journalism and Reporting in the Extractive Industry (2 Units; C — LH: 30)
Overview
The course “Data Journalism and Reporting in the Extractive Industry” connects journalism, data, policy, law and accountability in Nigeria’s extractive industry. It is especially relevant in regions like the Niger Delta, where oil and gas activities intersect with governance, transparency, community rights, environmental protection and sustainable development. The course introduces students to the principles and practices of data-driven journalism and accountability reporting in the context of oil, gas, and mineral extraction.
Objectives
- Explain what data is.
- Define data journalism and explain its purpose.
- Explain the scope and significance of the extractive industries.
- Identify the role and importance of data journalism in enhancing transparency and accountability in the extractive sector.
- Examine the challenges of data journalism.
- Identify key data journalism sources in Nigeria’s extractive industry landscape.
- Create hard and soft news stories from extractive industry data.
Learning Outcomes
- Explain what data is, differentiating it from information and outlining its form and sources.
- Define data journalism in at least two (2) ways and explain its purpose.
- Explain the scope and significance of the extractive industries.
- Identify at least five (5) roles of data journalism in enhancing transparency and accountability in the extractive sector.
- Discuss at least four (4) challenges of data journalism in Nigeria’s extractive sector.
- Identify at least three (3) data journalism sources in Nigeria’s extractive industry landscape.
- Create at least one hard news story and one soft news story from extractive industry data.
Course Contents
- What is Data?
- Data Categorization
- Data According to Form and Source
- Data and Information
- Understanding Data Journalism
- Scope and Significance of Nigeria’s Extractive Industries
- Roles/Importance of Data Journalism in the Extractive Sector
- Challenges of Data Journalism in Nigeria’s Extractive Sector
- Data Journalism Sources in Nigeria’s Extractive Industry Landscape
- Storifying Extractive Industry Data
1) What is Data?
Working definition: Data are recorded facts or observations (numbers, text, symbols, images, coordinates) about people, places, things, or events. In this course, data are the raw inputs we transform into information (organized/contextualized data) and then into insight (answers that support accountability reporting in the extractive sector).
Data → Information → Insight (quick example)
- Data: “3,500 barrels; Nembe Creek; 2024-05-10; cause: equipment failure.”
- Information: “On 10 May 2024, an equipment failure caused a 3,500-barrel spill at Nembe Creek.”
- Insight: “Aging equipment is a leading driver of recent spills in Bayelsa; enforcement targets should prioritize maintenance audits.”
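A minimal Python sketch of the first step (data to information), using the example record above. The field names (`volume_bbl`, `location`, `date`, `cause`) and the helper `to_information` are illustrative assumptions, not a standard schema. The insight step is left out on purpose: it needs many records plus reporting, not one row.

```python
from datetime import date

# Raw record from the example above; field names are illustrative assumptions.
record = {
    "volume_bbl": 3500,
    "location": "Nembe Creek",
    "date": "2024-05-10",
    "cause": "equipment failure",
}

def to_information(rec):
    """Turn one raw spill record into a contextualized sentence."""
    d = date.fromisoformat(rec["date"])
    when = f"{d.day} {d.strftime('%B %Y')}"
    cause = rec["cause"]
    # Pick "a" or "an" from the first letter of the cause.
    article = "an" if cause[0].lower() in "aeiou" else "a"
    return (
        f"On {when}, {article} {cause} caused a "
        f"{rec['volume_bbl']:,}-barrel spill at {rec['location']}."
    )

sentence = to_information(record)
print(sentence)
```

The same four values appear in both forms; what changes is the order, the units, and the context a reader needs.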
Common data you’ll meet on this beat
- Quantitative: oil/gas production (bbl, m³, mscf), gas flaring volumes, royalties/taxes (₦, $), spill counts/volumes, pipeline length (km), emissions (tCO₂e).
- Qualitative: company PR/CSR statements, community testimonies, court rulings, policy texts, photos/videos.
- Spatial: well/field coordinates, pipeline routes, facility footprints, affected LGAs/communities.
- Temporal: monthly production series, incident timelines, budget cycles.
Measurement & structure
- Discrete vs. Continuous: incidents (counts) vs. volumes/lengths (measures).
- Cross-section vs. Time series vs. Panel: one period; many periods; same entities over time.
- Structured / Semi-structured / Unstructured: CSV/XLSX; JSON/XML; PDFs, scans, images, press releases (need extraction).
Unit of analysis & granularity (be precise)
- Unit: well, field/block, pipeline segment, facility, company, community/LGA, incident, payment.
- Granularity: transaction-level payments; monthly well output; annual company totals. Finer granularity enables stronger accountability but requires more cleaning.
Always check definitions & units
- “Production” (gross vs. net), “royalty base/rate,” “spill” (threshold), “flaring” (measured vs. estimated), “jobs” (direct vs. indirect).
- Currency (₦ vs $), constant vs current prices, unit conversions (bbl↔m³), coordinate reference systems for maps.
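A short sketch of the unit checks above, assuming volumes arrive in barrels or cubic metres. The barrel factor follows from 1 oil barrel = 42 US gallons; the function names are made up for this example, and the exchange rate is a required argument because it must come from a cited source and date.

```python
# 1 oil barrel = 42 US gallons = 0.158987294928 cubic metres.
M3_PER_BBL = 0.158987294928

def bbl_to_m3(barrels):
    """Convert barrels to cubic metres."""
    return barrels * M3_PER_BBL

def m3_to_bbl(cubic_metres):
    """Convert cubic metres to barrels."""
    return cubic_metres / M3_PER_BBL

def usd_to_ngn(amount_usd, ngn_per_usd):
    """Convert dollars to naira; the caller supplies the rate and should cite it."""
    return amount_usd * ngn_per_usd

spill_m3 = bbl_to_m3(3500)
print(f"3,500 bbl = {spill_m3:.1f} m3")
```

Convert once, early, and record the factor or rate you used in your methods note.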
Typical sources
- Regulators and MDAs (production, flaring, licences, fines), NEITI/EITI reports, company annual/ESG reports.
- Courts/parliamentary records, community/CSO logs, academic/think-tank datasets.
- Remote sensing (flares, spills, land change) and sensor networks.
Quality & ethics checklist (before you publish)
- Accuracy (triangulate), Completeness (missing fields?), Consistency (same units/defs), Timeliness, Comparability, Transparency (method/source).
- Protect sensitive data; avoid harm to communities/sources; disclose uncertainty/limitations; keep a reproducible trail (files, dates, versions).
2) Data Categorization
Why categorize? Clear categories help you choose the right questions, methods, visuals, and safeguards. In the extractive beat, the same “number” can behave very differently depending on its category.
A) By measurement (what the values mean)
- Qualitative (text/labels): incident cause, licence type, community name.
- Quantitative (numbers):
- Discrete (counts): number of spills, arrests, wells.
- Continuous (measures): production volume (bbl), pipeline length (km), emissions (tCO₂e).
- Scale (for stats/visuals): nominal (company), ordinal (severity level), interval (°C), ratio (barrels, ₦).
B) By structure (how the data are stored)
- Structured: neat tables (CSV/XLSX/DB) with rows/columns.
- Semi-structured: JSON/XML (keys/values vary across records).
- Unstructured: PDFs, scanned forms, images, videos, press releases (require extraction/OCR).
C) By time (how observations relate in time)
- Cross-section: one period (e.g., 2024 royalty payments by company).
- Time series: repeated periods (monthly flaring 2015–2025).
- Panel/longitudinal: same entities tracked over time (each field, monthly output).
- Frequency: daily, monthly, quarterly, annual—impacts comparability and seasonality.
D) By space (where it is on the map)
- Spatial types: point (well), line (pipeline), polygon (field/LGA), raster (satellite grid).
- Scale: facility → community → LGA → state → national.
- CRS/units: confirm coordinates and projection before mapping or measuring distance/area.
E) By source & provenance (who created it and how)
- Primary: regulator measurement logs, sensor feeds, field surveys.
- Secondary: NEITI/EITI summaries, company reports, academic datasets.
- Derived: your calculations (rates per km), model outputs, satellite-based estimates.
F) By access & sensitivity (rights and risks)
- Open/public vs restricted/proprietary vs confidential/leaked.
- Note licences, privacy, potential harm to sources/communities; document permissions.
G) By update behaviour (how it changes)
- Static: fixed once published (a yearbook PDF).
- Dynamic: refreshed (dashboards/APIs).
- Revised: back-filled corrections; keep versions and note revisions in stories.
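One way to catch revised series is to compare two dated snapshots of the same dataset. The sketch below assumes each snapshot is a dictionary keyed by period; the figures and the helper name `find_revisions` are made up for illustration.

```python
def find_revisions(old, new, tolerance=0.0):
    """Return {period: (old_value, new_value)} for periods whose value changed."""
    changed = {}
    for period in sorted(old.keys() & new.keys()):
        if abs(new[period] - old[period]) > tolerance:
            changed[period] = (old[period], new[period])
    return changed

# Made-up monthly figures, purely for illustration.
snapshot_jan = {"2024-01": 120.0, "2024-02": 115.0, "2024-03": 130.0}
snapshot_jun = {"2024-01": 120.0, "2024-02": 118.5, "2024-03": 130.0, "2024-04": 126.0}

revisions = find_revisions(snapshot_jan, snapshot_jun)
added = sorted(snapshot_jun.keys() - snapshot_jan.keys())
print("revised:", revisions)
print("new periods:", added)
```

A changed value for a past period is a revision worth noting in the story; a new period is just an update.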
Quick cheat sheet
Category | Extractive example | Why it matters |
---|---|---|
Quantitative (ratio) | Barrels produced per month | Supports rates, percentages, and log scales |
Qualitative (nominal) | Incident cause category | Use bar charts, not averages |
Time series | Monthly flaring by field | Check for seasonality and revisions |
Spatial polygon | Affected LGAs | Requires correct CRS for area comparisons |
Derived metric | Spills per 1,000 km pipeline | Normalizes exposure for fair comparisons |
Mini practice
Label each item with at least two categories (e.g., measurement + time): “NEITI 2023 company-level royalty totals (₦)”, “Daily satellite-estimated flaring by field”, “Community survey transcripts about spill impacts”.
3) Data According to Form and Source
Goal: Understand the form (how data is packaged) and the source (where it comes from and how it was produced) so you can plan extraction, analysis, and publishing effectively—especially on mobile-first workflows.
A) Form — how the data is packaged
- Tabular (CSV/XLSX): Best for quick analysis and charts. Prefer UTF-8 CSV, tidy columns (date, geo, measure, unit). Avoid merged cells.
- APIs (JSON/GeoJSON): Good for repeatable updates. Note endpoints, parameters, paging, and rate limits. Normalize nested fields.
- Geospatial (Vector: GeoJSON/Shapefile/GPKG/KML): Use for maps and spatial joins. Confirm CRS (e.g., EPSG:4326), geometry type, and unique IDs.
- Geospatial (Raster: GeoTIFF/COG): Use for satellite layers/indices. Check pixel size, nodata value, projection, and scaling factors.
- Documents (PDF/DOCX): Often reference only. Try to obtain the original spreadsheet; if OCR is needed, record accuracy and manual fixes.
- Media (JPG/PNG/MP4): Evidence/context. Preserve originals, timestamps/EXIF where lawful, and document any edits.
- Logs (CSV/JSON Lines): Time-stamped events. Ensure ISO-8601 with timezone; handle duplicates and gaps.
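For time-stamped logs, the checks named above (timezone present, duplicates, gaps) can be sketched with the standard library. The event rows, the one-day expected interval, and the function names are assumptions for this example.

```python
from datetime import datetime, timedelta

# Made-up event log rows: (ISO 8601 timestamp with offset, event_id).
raw_events = [
    ("2025-01-01T06:00:00+01:00", "E1"),
    ("2025-01-01T06:00:00+01:00", "E1"),  # exact duplicate
    ("2025-01-02T06:00:00+01:00", "E2"),
    ("2025-01-05T06:00:00+01:00", "E3"),  # two days missing before this one
]

def parse_events(rows):
    """Parse timestamps, reject ones without a timezone, drop exact duplicates."""
    seen, events = set(), []
    for stamp, event_id in rows:
        ts = datetime.fromisoformat(stamp)
        if ts.tzinfo is None:
            raise ValueError(f"timestamp has no timezone: {stamp}")
        if (ts, event_id) not in seen:
            seen.add((ts, event_id))
            events.append((ts, event_id))
    return sorted(events)

def find_gaps(events, expected=timedelta(days=1)):
    """Return (start, end) pairs where consecutive events are further apart than expected."""
    times = [ts for ts, _ in events]
    return [(a, b) for a, b in zip(times, times[1:]) if b - a > expected]

events = parse_events(raw_events)
gaps = find_gaps(events)
print(len(events), "unique events;", len(gaps), "gap(s)")
```

A gap is not proof that nothing happened; it may mean nothing was recorded, so report it as a limitation.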
Schema essentials (capture these fields)
- ID (unique key), When (date/time), Where (admin area/coords + CRS), What (measure + unit/currency), Who (entity), How (method/definition).
- Use standardized names/codes (state/LGA), ISO 8601 dates (YYYY-MM-DD), and consistent units/currencies.
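The schema fields can be written down as a small record type so every row carries the same columns. This is a sketch only: the class name `Observation`, its attribute names, and the sample values are assumptions, not a required standard.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Observation:
    record_id: str   # ID: unique key
    when: date       # When: stored as a date, written out as YYYY-MM-DD
    where: str       # Where: admin area name or code
    crs: str         # Where: coordinate reference system, e.g. "EPSG:4326"
    measure: str     # What: name of the measure
    value: float     # What: the number itself
    unit: str        # What: unit or currency
    who: str         # Who: reporting entity
    how: str         # How: method or definition used

    def to_row(self):
        """Flatten to a dict (e.g. for csv.DictWriter) with an ISO 8601 date."""
        row = dict(self.__dict__)
        row["when"] = self.when.isoformat()
        return row

# Illustrative values only; not taken from any real dataset.
obs = Observation("spill-0001", date(2024, 5, 10), "Nembe Creek", "EPSG:4326",
                  "spill_volume", 3500.0, "bbl", "Operator A", "operator report")
row = obs.to_row()
print(row)
```

Keeping unit and method as columns, rather than in a footnote, stops them from being lost when files are merged.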
B) Source — where the data comes from & how to treat it
- Portals/downloads: Save files plus the data dictionary/release notes.
- APIs: Archive the full request (URL + params) and response metadata.
- Scrapes: Record selectors, versions, and manual cleaning steps.
- FOI/official requests: Keep request letters, responses, and version dates.
- Field/monitoring/sensors: Store calibration notes, device IDs, collection windows.
Source documentation (minimum metadata)
- Title & publisher, link/URL, date_downloaded (UTC), coverage period, geographic coverage.
- Method (how collected), units/currency, definitions (links), license/permissions.
- Update frequency, revision policy/change log, and a file checksum (e.g., SHA-256).
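A minimal sketch of a source note with a UTC download stamp and a SHA-256 checksum, using only the standard library. The helper names and the metadata values are placeholders; the throwaway file stands in for a real download.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_source_note(path, **fields):
    """Combine hand-entered metadata with a UTC download stamp and checksum."""
    note = dict(fields)
    note["date_downloaded"] = datetime.now(timezone.utc).isoformat(timespec="seconds")
    note["sha256"] = sha256_of_file(path)
    return note

# Demo with a throwaway file standing in for a real download.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"abc")
demo_path = tmp.name

note = build_source_note(
    demo_path,
    title="Example dataset (placeholder)",
    publisher="Example publisher (placeholder)",
    coverage_period="2024-01/2024-12",
    units="bbl",
    license="unknown; check before reuse",
)
print(json.dumps(note, indent=2))
os.remove(demo_path)
```

If the checksum of a later download differs, the publisher changed the file, even if the filename did not.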
Fit-for-use checklist (fast)
- Form: machine-readable, tidy, has ID/time/geo/measure/unit.
- Source: clear method, known update cycle, revision notes.
- Reproducibility: can someone else re-download and recreate your subset?
Mini practice
Given a spreadsheet of monthly payments by company: write the schema (ID, when, where, what/unit, who, how), list the source metadata you will store, and choose the form you’ll publish for readers (CSV, chart, map) with a one-line justification.
4) Data and Information
Core idea: Data are raw records (numbers, text, coordinates, timestamps). Information is data that have been cleaned, structured, and given context to answer a specific question for an audience.
Why the distinction matters (journalism lens)
- Data alone rarely convinces; information explains what is happening, where, when, and how much.
- Information must be reliable, comparable, and understandable to support accountability reporting.
From data to information (a simple workflow)
- Define the question: e.g., “Are spills increasing in Bayelsa LGAs?”
- Clean: fix dates and units, handle missing values, remove duplicates.
- Standardize: currency (₦ vs $), measurement (bbl ↔ m³), names (consistent LGA/State spellings), CRS for maps.
- Reshape: make the dataset “tidy” (each row = one observation; columns = variables; values = measurements).
- Join context: add geography, population, pipeline length, company IDs, policy periods.
- Summarize: group by time/place, compute totals/means/percent change, rates per exposure (e.g., spills per 1,000 km pipeline).
- Validate: triangulate with at least one independent source and document any discrepancies.
- Explain: write a plain-language takeaway that answers the question and notes limitations.
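The clean, standardize, and summarize steps can be sketched on a tiny made-up spill log. Everything here is illustrative: the rows, the pipeline lengths in `PIPELINE_KM`, and the function names are assumptions, and the validate and explain steps still need human reporting.

```python
import csv
import io
from collections import defaultdict

# Made-up rows: mixed units, inconsistent spelling, one exact duplicate.
RAW = """date,lga,volume,unit
2023-03-01,Nembe,100,bbl
2023-03-01,Nembe,100,bbl
2023-07-15,NEMBE ,15.9,m3
2024-02-10,Brass,250,bbl
2024-09-05,nembe,50,bbl
"""

M3_PER_BBL = 0.158987294928                      # 1 barrel in cubic metres
PIPELINE_KM = {"Nembe": 400.0, "Brass": 250.0}   # assumed denominators

def clean(text):
    """Standardize names and units, and drop exact duplicate rows."""
    seen, rows = set(), []
    for r in csv.DictReader(io.StringIO(text)):
        lga = r["lga"].strip().title()
        volume = float(r["volume"])
        if r["unit"] == "m3":
            volume /= M3_PER_BBL
        # Caution: two real spills on the same day with the same volume would
        # also collapse here; check with the source before dropping rows.
        key = (r["date"], lga, round(volume, 3))
        if key not in seen:
            seen.add(key)
            rows.append({"year": r["date"][:4], "lga": lga, "volume_bbl": volume})
    return rows

def spills_per_1000_km(rows):
    """Count spills by (year, LGA) and divide by pipeline length."""
    counts = defaultdict(int)
    for r in rows:
        counts[(r["year"], r["lga"])] += 1
    return {k: n * 1000 / PIPELINE_KM[k[1]] for k, n in counts.items()}

rows = clean(RAW)
rates = spills_per_1000_km(rows)
for key in sorted(rates):
    print(key, rates[key])
```

Brass has fewer spills than Nembe in this toy data, but its rate per 1,000 km is higher than Nembe's 2024 rate, which is why the workflow asks for a denominator.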
Context turns numbers into information
- Time: trends, seasonality, before/after policy changes.
- Place: LGA/State comparisons, urban vs rural, onshore vs offshore.
- Exposure/denominator: per km of pipeline, per well, per ₦1bn revenue, per 100,000 residents.
- Benchmarks: national average, previous year, company targets, regulatory limits.
Common pitfalls (and how to avoid them)
- Mixed units/currencies: convert first; label clearly.
- Boundary changes: note LGA splits/mergers; align to a consistent geography.
- Inconsistent definitions: “spill,” “production,” “CSR spend” can vary—cite the definition used.
- Aggregation traps: totals can hide patterns (watch for Simpson’s paradox). Show both rates and counts.
- Cherry-picking time windows: justify your date range; test alternate windows.
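The aggregation trap is easier to see with numbers. In this made-up example, Company X has a lower spill rate than Company Y both onshore and offshore, yet a higher rate overall, because most of X's pipeline is onshore where rates are higher for everyone. The companies and figures are invented to show the effect.

```python
# Made-up figures chosen to show the effect; not real operators.
# Each entry is (spills, pipeline_km).
DATA = {
    "Company X": {"onshore": (45, 900), "offshore": (1, 100)},
    "Company Y": {"onshore": (6, 100), "offshore": (18, 900)},
}

def rate_per_1000_km(spills, km):
    """Spills per 1,000 km of pipeline."""
    return spills * 1000 / km

# Rates within each terrain.
by_terrain = {
    company: {terrain: rate_per_1000_km(*pair) for terrain, pair in strata.items()}
    for company, strata in DATA.items()
}

# Rates after adding the terrains together.
overall = {
    company: rate_per_1000_km(
        sum(s for s, _ in strata.values()),
        sum(km for _, km in strata.values()),
    )
    for company, strata in DATA.items()
}

print(by_terrain)
print(overall)
```

Reporting only the overall rate would make X look worse; reporting only the terrain rates would hide how exposed X is. Show both and explain the mix.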
Mini newsroom example (before → after)
- Raw data: “Spill logs: volume, date, LGA, cause, operator (2019–2025).”
- Information: “Since 2022, two LGAs have accounted for 55% of spill volume, driven by equipment failure; spills per 1,000 km of pipeline rose 18% year-on-year despite a 6% drop in total volume.”
Minimum information package (what to publish with your chart/map)
- Source & method: who collected, how measured/estimated, definitions.
- Coverage: time span, geography, entities included/excluded.
- Units & transformations: conversions, inflation adjustment, rate calculations.
- Uncertainty/limitations: missing months, suspected under-reporting, model assumptions.
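Inflation adjustment is one transformation worth stating explicitly. A common approach, sketched here, rescales a nominal amount by the ratio of a price index in the base period to the index in the payment period. The index values below are placeholders, not real CPI figures, and the function name is an assumption.

```python
def to_constant_prices(nominal, index_then, index_base):
    """Re-express a nominal amount in base-period prices using a price index."""
    if index_then <= 0:
        raise ValueError("index_then must be positive")
    return nominal * index_base / index_then

# Placeholder index values, not real CPI figures.
real = to_constant_prices(nominal=200.0, index_then=160.0, index_base=100.0)
print(real)
```

State which index and base period you used; a different index can change the size of a reported real increase or fall.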
Quick practice
Take a CSV of monthly flaring by field. Convert all units to the same measure, compute a rate per producing well, and write a one-sentence finding that a non-expert can understand. Note one limitation.
5) Understanding Data Journalism
Definition: Data journalism is reporting that uses data as a source, a method, and part of the storytelling. It turns verified datasets into public-interest stories through analysis, visuals, and clear explanations.
What it is (and isn’t)
- Is: evidence-based reporting with transparent methods (how data were obtained, cleaned, analysed).
- Isn’t: number-dumping or pretty charts without context, definitions, or verification.
Core workflow (newsroom-friendly)
- Find/Obtain: identify datasets; request (FOI), download, or scrape legally.
- Clean/Document: fix dates/units, de-duplicate, record every change in a notes file.
- Analyse: compute rates, trends, comparisons; test competing explanations.
- Verify: triangulate with independent sources; talk to experts and affected communities.
- Visualise: pick charts/maps that match the question (time → line; comparison → bar; part-to-whole → stacked bar; place → map).
- Narrate & Publish: write a clear lead, show the most important number first, add methods box and links.
- Release/Archive: share source files or summaries when lawful; keep versions for reproducibility.
Why it matters on the extractive beat
- Follow the money: payments, royalties, fines, budget flows.
- Track impacts: spills, flaring, emissions, land-use change, health indicators.
- Accountability: promises vs outcomes; policy targets vs results.
- Equity: who benefits, who bears the costs (by LGA/community/company).
Common story types
- Explainers: “How royalties are calculated—and why your LGA got less this year.”
- Investigations: “Unreported spills cluster around aging pipelines.”
- Monitoring: periodic dashboards of production, flaring, or compliance.
- Service journalism: interactive maps that help communities understand risk.
Good practice (quick rules)
- Use the right denominator (per km pipeline, per well, per ₦1bn revenue).
- Be explicit about definitions (what counts as a spill? how is flaring measured?).
- Show uncertainty/limitations (missing months, estimated values, revisions).
- Prefer simple visuals with labelled units, clear legends, and sensible color scales.
- Reproduce your steps: keep data, code (if used), and a methods note.
Ethics & safety
- Minimise harm: redact sensitive locations/identifiers when necessary.
- Avoid misinterpretation: correlation ≠ causation; note confounders.
- Represent communities fairly: include local voices and context, not just numbers.
Typical tools (you can mix & match)
- Spreadsheets for cleaning/quick analysis.
- Python/R for repeatable workflows; simple scripts are enough.
- GIS tools for mapping (QGIS/ArcGIS) and remote-sensing layers when relevant.
- Simple charting libraries or newsroom graphics tools for publish-ready visuals.
Mini practice
Pick one question (e.g., “Are spills declining after Policy X?”). List the dataset(s) you’d need, the main metric (level and rate), one verification source, and the best visual to tell the story.
6) Scope and Significance of Nigeria’s Extractive Industries
Purpose: Understand what “extractives” cover in Nigeria, how the value chain works, and why it matters for revenue, development, environment, and accountability reporting.
A) What the sector includes (scope)
- Oil & Gas: crude oil, condensates, natural gas and liquids—onshore, shallow offshore, deepwater.
- Solid Minerals: metallic (e.g., gold), industrial (e.g., limestone, gypsum), energy minerals (e.g., coal), construction materials (e.g., granite).
- Value chain (simplified): Upstream (exploration, appraisal, development, production) → Midstream (processing, transport, storage) → Downstream (refining/beneficiation, distribution, marketing).
B) Key actors & frameworks (orientation only)
- Government/regulators: ministries, commissions, and agencies responsible for petroleum, mining, environment, standards, and revenue.
- Companies: NOCs, IOCs, independents, service companies, artisanal/small-scale miners (ASM) in the minerals space.
- Communities & CSOs: host communities, traditional institutions, civil society/advocacy groups, media.
- Rules: licensing regimes, fiscal instruments (royalties, taxes, fees), local content policies, environmental and social safeguards, EITI/NEITI disclosure standards.
C) Why it matters (significance)
- Public finance: major source of government revenue and foreign exchange; influences budgets, debt, and public investment.
- Development & jobs: direct/indirect employment, infrastructure, local content opportunities; also risks of enclave growth if linkages are weak.
- Markets & stability: exposure to price volatility, OPEC quotas, exchange rate movements, and global demand shifts (energy transition).
- Environment & health: gas flaring, spills, artisanal mining impacts (water/soil/air), land use change, remediation and rehabilitation obligations.
- Social license: community relations, benefit-sharing, host-community development funds/Trusts, conflict sensitivity, security of assets.
- Governance & accountability: transparency of payments, contracts, production; leakages (theft, under-reporting), subsidy and pricing policies.
D) Typical datasets you will work with in this course
- Production & exports: by field/block/company; crude grades; gas volumes (produced, flared, utilised).
- Licensing & contracts: awards, terms, holders, expiry/renewal milestones.
- Revenue flows: royalties, taxes, fees, dividends, subnational allocations; reconciliation statements.
- Operational indicators: refinery runs, pipeline length and outages, downtime, lifting schedules.
- Environment & safety: spill logs (count/volume/locations/causes), flare data, emissions, remediation status, mine closure plans.
- Local content & social: procurement shares, employment metrics, community projects, host-community payments.
E) Geography to keep in mind
- Niger Delta core & coastal/offshore: concentration of oil/gas production and infrastructure.
- Inland basins & gas hubs: emerging exploration/processing zones and pipeline corridors.
- Minerals corridors: clusters of quarrying/artisanal and small-scale mining across multiple states.
F) Reporting angles you can build from this scope
- Money trail: who pays what, when, and where it goes (national vs subnational).
- Performance: targets vs outcomes (production, flaring reduction, local content).
- Risk & impact: environmental hotspots, community exposure, remediation progress.
- Equity: distribution of benefits/costs among LGAs, communities, and companies.
Mini practice
Pick one state/LGA. List: (1) two upstream datasets, (2) one revenue dataset, (3) one environment dataset you’d request or download. Write one sentence explaining the public-interest question you could answer with them.
7) Roles/Importance of Data Journalism in the Extractive Sector
Why it matters: Data journalism makes complex extractive activity understandable, verifiable, and actionable. It connects numbers to people—clarifying who benefits, who pays, and who bears the costs.
A) Core roles
- Transparency & accountability: reveal payments, contracts, production, spills, flaring, and compliance—show gaps between policy promises and outcomes.
- Follow-the-money: trace royalties, taxes, fines, and subnational allocations; compare budgeted vs. actuals; spotlight leakages and delays.
- Impact tracking: quantify environmental and social effects (spill volumes, remediation status, emissions, health indicators) across LGAs and communities.
- Public service information: present risks and rights in accessible maps, charts, and explainers to help residents make informed decisions.
- Agenda setting: surface under-reported patterns (e.g., clusters of incidents around aging assets) that warrant legislative oversight or audits.
- Misinformation defense: verify claims with data; publish methods and sources to build trust.
B) What good data-driven coverage enables
- For citizens & communities: understand benefits/costs, monitor projects, and engage authorities with evidence.
- For policymakers: identify where policies underperform; target inspections, remediation, and social investments.
- For regulators: spot non-compliance trends, prioritize enforcement, validate company disclosures.
- For companies: benchmark performance, address community concerns, and improve ESG transparency.
C) Typical story angles (build with rates and context)
- Performance vs targets: flaring reduction goals, local content quotas, remediation timelines.
- Equity & distribution: who receives revenues and who experiences harms—by LGA/community/company.
- Time & place patterns: incident hotspots, seasonal spill trends, repeat offenders.
- Cost–benefit balance: public revenue gains vs. environmental liabilities and health risks.
D) Key metrics to watch (use clear denominators)
- Production per field/company; rates per active well.
- Spills per 1,000 km of pipeline; spill volume per 1 million barrels produced.
- Flaring as % of gas produced; emissions per unit output.
- Revenue collection vs. forecasts; time-to-transfer to subnationals.
- Remediation backlog cleared per quarter; time-to-cleanup after incidents.
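Three of these metrics written as small functions, so the denominator is explicit in the code. This is a sketch under assumptions: the function names and input figures are made up, and both arguments of each ratio must already be in the same unit.

```python
from datetime import date

def flaring_share_pct(gas_flared, gas_produced):
    """Flared gas as a percentage of gas produced (same unit for both)."""
    if gas_produced <= 0:
        raise ValueError("gas_produced must be positive")
    return gas_flared * 100 / gas_produced

def spill_bbl_per_million_bbl(spill_bbl, produced_bbl):
    """Barrels spilled per 1 million barrels produced."""
    if produced_bbl <= 0:
        raise ValueError("produced_bbl must be positive")
    return spill_bbl * 1_000_000 / produced_bbl

def days_to_cleanup(incident, cleanup_certified):
    """Whole days between an incident and its certified clean-up."""
    return (cleanup_certified - incident).days

# Made-up inputs for illustration only.
share = flaring_share_pct(gas_flared=120.0, gas_produced=1500.0)
intensity = spill_bbl_per_million_bbl(spill_bbl=3500, produced_bbl=20_000_000)
delay = days_to_cleanup(date(2024, 5, 10), date(2024, 8, 8))
print(share, intensity, delay)
```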
E) Good practice (making impact responsibly)
- Be comparable: standardize units, currencies, geographies, and time windows.
- Show uncertainty: note missing data, estimates, and revisions; avoid false precision.
- Use the right visuals: lines for trends, bars for comparisons, maps for location—label units and sources.
- Reproducibility: publish a methods note; keep data, code (if any), and version history.
- Do no harm: protect sensitive locations/identifiers when necessary; include community perspectives.
Mini practice
Pick one LGA. Draft a headline and nut graf using two metrics (one level, one rate) that compare last year to this year. Add a one-line note on data limitations you would include in the story.
8) Challenges of Data Journalism in Nigeria’s Extractive Sector
Why this matters: Extractives data are politically sensitive, technically complex, and often incomplete. Knowing the common obstacles—and how to mitigate them—keeps your reporting accurate and safe.
A) Access & openness
- Limited disclosure: key files (contracts, detailed production, remediation status) may be unpublished or redacted.
- FOI delays/denials: requests can be slow, partial, or refused; responses may arrive in non-machine-readable formats.
- Paywalls/proprietary data: commercial datasets may restrict sharing or reproducing figures.
B) Data quality & comparability
- Inconsistent definitions: “spill,” “production,” “CSR” differ across publishers and years.
- Mixed units/currencies: bbl vs m³, ₦ vs $, nominal vs inflation-adjusted values.
- Revisions & gaps: months missing; later backfills change totals; version history unclear.
- Aggregation traps: national totals hide LGA patterns; rates per exposure are missing.
C) Format & technical hurdles
- PDFs/scans: tables require scraping/OCR; accuracy must be checked.
- Messy spreadsheets: merged cells, inconsistent headers, multiple tables on one sheet.
- APIs without docs: pagination, throttling, or schema changes break your scripts.
D) Geographic & temporal issues
- Boundary changes: LGA/ward splits/renames complicate comparisons over time.
- CRS problems: mismatched projections cause wrong distances/areas on maps.
- Geocoding: facilities or incidents lack clean coordinates; locations are vague or disputed.
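The CRS problem shows up as soon as you measure distance. Latitude/longitude values (for example in EPSG:4326) are degrees, not kilometres, so a flat-map formula gives meaningless numbers. The sketch below uses the haversine great-circle formula on a spherical Earth as a rough check; the coordinates are illustrative, not real facility locations, and for precise work a projected CRS in a GIS tool is the safer route.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def naive_degree_distance(lat1, lon1, lat2, lon2):
    """Wrong on purpose: treats degrees as if they were flat map units."""
    return math.hypot(lat2 - lat1, lon2 - lon1)

# Illustrative points one degree of longitude apart at latitude 4.5 N.
point_a = (4.5, 6.0)
point_b = (4.5, 7.0)

km = haversine_km(*point_a, *point_b)
degrees = naive_degree_distance(*point_a, *point_b)
print(f"haversine: {km:.1f} km; naive: {degrees:.1f} 'units' (degrees, not km)")
```

The naive result of 1.0 is a count of degrees; reading it as a distance would understate the separation by a factor of about a hundred.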
E) Environmental & sensing constraints
- Detection limits: small spills or nighttime events may be missed; cloud cover affects optical imagery.
- Attribution: distinguishing sabotage vs equipment failure needs corroboration beyond a single dataset.
F) Institutional, legal & safety risks
- Pushback/pressure: PR spin, threats of litigation, or attempts to discredit methods.
- Source safety: community informants risk retaliation; field visits may face security concerns.
- Licensing & ethics: unclear reuse rights; risk of exposing sensitive locations or identities.
G) Practical mitigation (quick playbook)
- Document everything: keep a data diary (who/when/where downloaded, versions, cleaning steps).
- Standardize early: convert units/currencies; adopt ISO dates; fix LGA names; pin a single CRS for the project.
- Version control: save dated snapshots; never overwrite raw files; note revisions in your story.
- Triangulate: compare at least two independent sources for pivotal numbers (production, spills, payments).
- Normalize: publish rates (per km pipeline, per well, per ₦1bn revenue) alongside totals.
- Be transparent: include a methods/limitations box; explain assumptions and data gaps.
- Safety first: redact sensitive coordinates/identifiers when needed; follow newsroom security protocols.
- Legal check: verify licenses/permissions; consult editors/lawyers before publishing contentious material.
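The triangulation step from the playbook can be partly automated: line up two sources and flag where they disagree or where one is silent. The 5% threshold, the figures, and the function names below are assumptions for illustration; choose a threshold you can defend for the metric in question.

```python
def relative_gap(a, b):
    """Absolute difference between two figures as a share of their mean."""
    mean = (a + b) / 2
    return abs(a - b) / mean if mean else 0.0

def triangulate(source_a, source_b, threshold=0.05):
    """Compare two sources key by key; flag large gaps and missing keys."""
    report = {}
    for key in sorted(source_a.keys() | source_b.keys()):
        if key not in source_a or key not in source_b:
            report[key] = "missing in one source"
        elif relative_gap(source_a[key], source_b[key]) > threshold:
            report[key] = "discrepancy"
        else:
            report[key] = "consistent"
    return report

# Made-up annual figures from two hypothetical publishers.
regulator = {"2022": 1000.0, "2023": 900.0, "2024": 950.0}
company = {"2022": 1010.0, "2023": 700.0}

report = triangulate(regulator, company)
for year, status in report.items():
    print(year, status)
```

A flagged discrepancy is a reporting lead, not a finding: ask both publishers about definitions and revision dates before deciding which figure to use.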
Mini practice
You receive a PDF of quarterly spill data with inconsistent LGA names and missing months. List: (1) three cleaning steps, (2) one triangulation source, (3) one limitation you will disclose in the story.
9) Data Journalism Sources in Nigeria’s Extractive Industry Landscape
Goal: Know where to look for trustworthy, regularly updated datasets and documents about oil, gas, and solid minerals in Nigeria—plus how to capture enough context to publish responsibly.
A) Government & official records
- Regulators (petroleum): upstream, midstream, and downstream commissions/authorities—licensing, production, flaring, spills, penalties, compliance bulletins.
- Solid minerals agencies: mining cadastre (titles, coordinates, status), inspectorate data (production/royalties), environment/safety notices.
- Environment agencies: incident/spill registers, remediation status, environmental impact assessments, monitoring results.
- Statistics & finance: national statistics office (sector output, trade), treasury/budget/FAAC allocations, central bank/foreign exchange reports.
- Parliament & judiciary: committee hearings, investigative reports, motions, court filings/judgments on extractive disputes and liabilities.
- Open portals: contract registers, licensing rounds, procurement notices, open treasury dashboards.
B) Multi-stakeholder & international
- EITI/NEITI: reconciliation reports, data portals (payments, production, subnational transfers), contract transparency materials.
- Multilateral sources: development banks and international energy/commodities agencies (price series, global production/trade, flaring/emissions trackers).
- Regional bodies & initiatives: cross-border pipeline/shipping datasets, maritime traffic summaries.
C) Company disclosures
- Annual reports & financial statements: production, reserves, CAPEX/OPEX, litigation notes, provisions for decommissioning.
- ESG/sustainability reports: spills, flaring, emissions, workforce, community investment; check methods/assurance statements.
- Securities filings: material events, asset sales, incident disclosures, governance changes.
- Press releases & websites: project milestones, turnarounds, shutdowns—verify with regulator/community data.
D) Community, CSO & academic sources
- Host communities & CSOs: incident logs, petitions, field photos/videos, independent monitoring.
- NGOs/think-tanks: thematic studies (remediation, health impacts, revenue tracking), watchdog dashboards.
- Universities & research labs: peer-reviewed studies, open datasets, methods you can replicate or adapt.
E) Remote sensing & geospatial
- Satellites (optical/radar): land change around facilities, shoreline/oil sheen detection (with caution), flood exposure near pipelines.
- Flaring & emissions trackers: night-time lights/flaring estimates; compare with company/regulator figures.
- Basemaps & infrastructure: coastline, rivers, protected areas, settlements, roads/pipelines (where lawfully available).
F) Trade, shipping & logistics
- Customs/trade stats: crude/product exports/imports by destination/origin.
- Maritime/port: vessel movements, berth logs, cargo types (use ethically and within terms).
G) Minimum “source note” you should store with every file
- Who/where: publisher, portal/page, contact if any.
- When: coverage period and the date/time you downloaded (UTC).
- What/how: description, units/currency, definitions, methodology, license/permissions.
- Versioning: filename with date stamp, checksum (optional), known revisions or caveats.
H) Quick vetting questions
- Is this the authoritative source for this metric? If not, who is?
- Are definitions/units consistent with other datasets you’ll combine?
- Can you reproduce the result (same filters/time window) from the raw file or the API?
- What biases or incentives might shape how this number was produced?
Mini practice
Pick one story idea (e.g., “gas flaring trend by LGA since 2020”). List: (1) the primary regulator dataset, (2) one community/CSO source, (3) one satellite-based proxy, and (4) how you will reconcile differences between them in your methods box.
Resources
- NEITI: https://neiti.gov.ng/
10) Storifying Extractive Industry Data
Goal: Turn cleaned, verified datasets into clear, human-centred stories that inform the public, support accountability, and are easy to read on mobile.
A) Start with a focused question & audience
- Question: What exactly are we answering? (e.g., “Did flaring fall after Policy X?”)
- Audience: Who needs this most—community members, policymakers, regulators?
- Impact: What could change if readers understand this number?
B) Find the angle (four-part checklist)
- Level: how big? (total spills, ₦ revenues)
- Rate/denominator: per km of pipeline, per well, per ₦1bn revenue
- Change: trend vs last month/year; before/after a policy
- Comparison: by LGA/company/field; national average vs local
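The four-part checklist can be run as a quick calculation before you write. The sketch below assumes spill counts by LGA and year plus a pipeline length per LGA; the LGAs, counts, and lengths are made up, and `angle_summary` is a hypothetical helper.

```python
# Made-up counts and pipeline lengths for two hypothetical LGAs.
SPILLS = {
    ("LGA A", 2023): 40, ("LGA A", 2024): 30,
    ("LGA B", 2023): 10, ("LGA B", 2024): 12,
}
PIPELINE_KM = {"LGA A": 800.0, "LGA B": 100.0}

def angle_summary(lga, year):
    """Level, rate, and change for one LGA and year."""
    level = SPILLS[(lga, year)]
    previous = SPILLS[(lga, year - 1)]
    return {
        "level": level,
        "rate_per_1000_km": level * 1000 / PIPELINE_KM[lga],
        "change_pct": (level - previous) * 100 / previous,
    }

summary = {lga: angle_summary(lga, 2024) for lga in PIPELINE_KM}

# Comparison: rank LGAs by rate, not by raw count.
ranked = sorted(summary, key=lambda lga: summary[lga]["rate_per_1000_km"], reverse=True)

for lga in ranked:
    print(lga, summary[lga])
```

Here LGA A has more spills, but LGA B has the higher rate and a rising trend, so the angle depends on which of the four numbers you lead with.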
C) Put people and place in the numbers
- Characters: name the communities affected; include a verified quote.
- Place: specify LGA/coordinates; show proximity to schools, rivers, farmlands when appropriate.
- Time: highlight the relevant period (e.g., “Jan–Jun 2025”).
D) Structure your narrative (pick one)
- Inverted pyramid: key finding → evidence → context → quotes → methods/limitations.
- Problem → Evidence → What’s next: define the issue, show data, outline remedies/accountability steps.
- Timeline: when events/policies changed the numbers; annotate turning points.
E) Write the core quickly
- Headline: Subject + metric + time frame + location (avoid hype).
Example: “Gas flaring fell 18% in Bayelsa since 2023; rates per producing well still rising.”
- Lead: one-sentence summary of the most important number and why it matters.
- Nut graf: how we know (source/method), who is affected, what to watch next.
F) Choose the right visuals
- Line chart: trends over time (monthly flaring per field).
- Bar chart: comparisons across LGAs/companies.
- Slope chart: before/after policy comparisons.
- Map: location of incidents or exposure (keep labels legible on mobile; use clear legend and units).
- Always: label units, show sources, note data gaps/estimates; keep color palettes simple and accessible.
G) Make numbers relatable (without distorting)
- Show absolute numbers and rates.
- Use plain analogies sparingly (only if accurate).
- Benchmark against targets, last year, or national averages.
H) Methods & limitations box (publish with the story)
- Data sources, coverage period, units/currency, definitions used.
- Cleaning steps, denominators, adjustments (e.g., inflation, unit conversions).
- Known gaps/revisions; how conflicts between sources were resolved.
I) Ethics & safety
- Protect sensitive locations/identifiers where necessary.
- Avoid implying causation from correlation; disclose uncertainty.
- Represent communities fairly—pair data with local context and voices.
J) Quick templates
Template 1 — Explainer lead
“[Metric] in [Place] changed by [% / level] since [Date], driven by [factor], our analysis of [Source] shows.”
Template 2 — Accountability lead
“Despite [Policy/Promise], [Metric rate] in [Place] rose by [%] over [Period], raising questions about [Actor/Agency] oversight.”
Template 3 — Map-driven nut graf
“The hotspots cluster along [corridor/river/LGA], where [x%] of incidents occurred within [y km] of [asset/community].”
K) Common pitfalls
- Cherry-picking time windows; switching denominators mid-story.
- Overcrowded charts; unlabeled axes/units; decorative maps with unclear legends.
- Confusing estimates with measurements; ignoring revisions to official series.
Mini practice
Draft a headline + 2-sentence lead using: (a) monthly flaring per field (2022–2025), (b) pipeline length per LGA, (c) remediation backlog. State one denominator and one limitation you will publish.