
wikigeocode – Coordinate Extraction from Wikipedia and Wikidata

GitHub Repository

wikigeocode is a Python library that looks up geographic coordinates for place names using Wikipedia and Wikidata. It is designed for historical research, geospatial pipelines, and automated text-to-map workflows.


Overview

The library provides a streamlined interface to query location data using both the Wikipedia REST API and SPARQL queries to Wikidata. It prioritizes modern place names while supporting fallback logic to account for disambiguation, redirects, and historical variants.
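As a rough illustration of the Wikidata side of this lookup, the sketch below builds a SPARQL query for items that carry a given label and a coordinate location (property P625), and parses the WKT point literal that the Wikidata Query Service returns. The function names `build_coordinate_query` and `parse_wkt_point` are hypothetical and are not taken from the wikigeocode source; the exact query the library issues may differ.

```python
import re
from typing import Tuple

# Public Wikidata Query Service endpoint (shown for reference, not called here).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_coordinate_query(label: str, lang: str = "en", limit: int = 5) -> str:
    """Build a SPARQL query for items with this exact label and a P625 value.

    Hypothetical helper: the real library may match labels differently.
    """
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
SELECT ?item ?itemLabel ?coord WHERE {{
  ?item rdfs:label "{escaped}"@{lang} ;
        wdt:P625 ?coord .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}
}}
LIMIT {int(limit)}
""".strip()


def parse_wkt_point(wkt: str) -> Tuple[float, float]:
    """Convert a WKT literal such as 'Point(12.4833 41.8933)' to (lat, lon).

    WKT stores longitude first, so the two values are swapped on return.
    """
    match = re.fullmatch(r"Point\(\s*(\S+)\s+(\S+)\s*\)", wkt.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    longitude, latitude = float(match.group(1)), float(match.group(2))
    return latitude, longitude


if __name__ == "__main__":
    print(build_coordinate_query("Rome"))
    print(parse_wkt_point("Point(12.4833 41.8933)"))
```

The query string can be sent to the endpoint as the `query` parameter with `format=json`; the `?coord` binding in each result row is the literal that `parse_wkt_point` expects.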

wikigeocode was developed to support batch processing of ambiguous or loosely structured geographic inputs, particularly in the context of archival and historical data extraction.


Features

  • Automated geocoding of place names via Wikipedia/Wikidata integration
  • Disambiguation handling with redirect and alias resolution
  • SPARQL querying for coordinate precision and filtering
  • Language fallback and custom User-Agent support for API compliance
  • Designed for command-line use or programmatic integration into larger pipelines
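The language fallback, disambiguation handling, and User-Agent features listed above could be combined along the following lines. This is a minimal sketch under stated assumptions, not the library's actual code: `summary_url`, `fetch_summary`, `geocode_with_fallback`, and the `USER_AGENT` string are hypothetical names, and the standard-library `urllib` is used here instead of `requests` only to keep the example dependency-free. It relies on the Wikipedia REST page summary endpoint, which reports a `type` field and, where available, a `coordinates` object.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Optional, Sequence

# Hypothetical identifier; Wikimedia asks clients to send a descriptive User-Agent.
USER_AGENT = "my-geocoder/0.1 (contact@example.org)"


def summary_url(title: str, lang: str) -> str:
    """Return the REST summary URL for a page title in one language edition."""
    quoted = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{quoted}"


def fetch_summary(title: str, lang: str, timeout: float = 10.0) -> Optional[dict]:
    """Fetch the page summary as a dict, or None if the page does not exist."""
    request = urllib.request.Request(
        summary_url(title, lang),
        headers={"User-Agent": USER_AGENT, "Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return None
        raise


def geocode_with_fallback(
    name: str,
    languages: Sequence[str] = ("en", "de", "fr"),
    fetch: Callable[[str, str], Optional[dict]] = fetch_summary,
) -> Optional[dict]:
    """Try each language edition in order and return the first coordinates found."""
    for lang in languages:
        page = fetch(name, lang)
        # Skip missing pages and disambiguation pages, which carry no single location.
        if not page or page.get("type") == "disambiguation":
            continue
        coords = page.get("coordinates")
        if coords:
            return {
                "latitude": coords["lat"],
                "longitude": coords["lon"],
                "label": page.get("title", name),
                "language": lang,
            }
    return None
```

Passing `fetch` as a parameter keeps the fallback logic testable without network access: a stub that returns canned dictionaries can stand in for the HTTP call.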

Implementation Details

The library is structured as a Python module with well-defined query and resolution logic. It uses the requests library for API communication and implements Wikidata queries via SPARQL for enhanced coordinate accuracy.

Internally, wikigeocode manages API rate limits, disambiguation paths, and structured fallbacks for unresolved or historically ambiguous names. Output includes structured data with latitude, longitude, matched label, and relevant metadata.
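To make the two ideas in this paragraph concrete, here is a small sketch of a structured result record and a simple request throttle. Both `GeocodeResult` and `Throttle` are hypothetical names chosen for illustration; the field layout and the rate-limiting strategy used inside wikigeocode may differ.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class GeocodeResult:
    """Hypothetical output record: coordinates, matched label, and metadata."""

    latitude: float
    longitude: float
    label: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject values that cannot be valid WGS84 coordinates.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


class Throttle:
    """Enforce a minimum interval between consecutive API calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(min_interval=0.5)
    throttle.wait()  # call before each outgoing request
    result = GeocodeResult(41.8933, 12.4833, "Rome", {"source": "wikidata"})
    print(asdict(result))
```

Calling `throttle.wait()` before every request spaces calls out evenly, and `asdict(result)` yields a plain dictionary that is easy to serialize to JSON or CSV in a batch pipeline.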

The project uses pyproject.toml for configuration and supports editable installs during development.


Installation

To install locally for use or development:

```bash
git clone https://github.com/SiottoTamat/wikigeocode.git
cd wikigeocode
pip install -e .
```