
Data Pipeline Project

EconGE.

Georgian Economic Data Pipeline

An automated pipeline that collects, processes, and publishes economic data from Georgia's official statistical sources — making it accessible through a browsable data portal and structured API.

21.7K Observations
1,258 Datasets
3 Sources
Last Updated Mar 22

Technology

How It's Built

A modern data stack running on a single VPS — no cloud vendor lock-in.

Orchestration

Dagster · Dagster Daemon · Cron Schedules
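To make the orchestration layer concrete, here is a minimal sketch of what a Dagster definitions module for a pipeline like this could look like: one asset per extract step, a downstream normalization asset, and a cron-driven schedule that the Dagster daemon evaluates. Asset names, the job name, and the cron string are illustrative assumptions, not the project's actual definitions.

```python
# Sketch of a Dagster definitions module (illustrative names throughout).
from dagster import Definitions, ScheduleDefinition, asset, define_asset_job


@asset
def geostat_raw():
    """Pull the latest GeoStat releases (extract step)."""
    ...


@asset
def observations(geostat_raw):
    """Normalize raw pulls into the shared observations table."""
    ...


# One job materializes every asset; a cron schedule triggers it daily.
daily_job = define_asset_job("daily_refresh", selection="*")

defs = Definitions(
    assets=[geostat_raw, observations],
    schedules=[ScheduleDefinition(job=daily_job, cron_schedule="0 6 * * *")],
)
```

Declaring `geostat_raw` as a parameter of `observations` is how Dagster infers the dependency edge, so the asset graph shown in the web UI matches the pipeline diagram below.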

Data Collection

Python · httpx · openpyxl · REST APIs
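A collection step in this style boils down to: call a REST endpoint, pull out the rows, drop gaps, and coerce values to a common shape. The sketch below assumes an httpx-like client (anything with a `.get()` returning an object with `.json()`); the endpoint path and response shape are invented for illustration and are not the real GeoStat or NBG API.

```python
# Sketch of one collector step: fetch a time series and coerce it into
# (period, value) observations. `client` is anything httpx-like; in
# production this would be an httpx.Client. The assumed response shape
# is {"data": [{"period": ..., "value": ...}, ...]} — an illustration only.

def fetch_observations(client, url: str) -> list[tuple[str, float]]:
    resp = client.get(url)
    payload = resp.json()
    out = []
    for row in payload["data"]:
        if row["value"] is None:   # official sources often publish gaps
            continue
        out.append((row["period"], float(row["value"])))
    return sorted(out)             # chronological, given ISO-style periods
```

Taking the client as a parameter keeps the function testable with a stub, and `float(...)` absorbs sources that serialize numbers as strings.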

Storage

PostgreSQL 16 · Normalized Schema · Docker Volumes
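"Normalized schema across providers" usually means every source feeds the same observations table, keyed by dataset. The DDL below is a guess at that shape with generic types so it runs anywhere; the table and column names are illustrative, not the actual `econge` schema.

```python
# Illustrative normalized schema: one row per dataset per provider,
# one row per (dataset, period) observation. Not the real econge DDL.
SCHEMA = """
CREATE TABLE datasets (
    id     INTEGER PRIMARY KEY,
    source TEXT NOT NULL,   -- 'geostat', 'nbg', ...
    code   TEXT NOT NULL,   -- provider's own dataset identifier
    title  TEXT NOT NULL,
    UNIQUE (source, code)
);
CREATE TABLE observations (
    dataset_id INTEGER NOT NULL REFERENCES datasets(id),
    period     TEXT NOT NULL,   -- '2024-Q1', '2024-03', ...
    value      REAL NOT NULL,
    PRIMARY KEY (dataset_id, period)
);
"""
```

The `(dataset_id, period)` primary key is what makes idempotent re-runs cheap: a loader can upsert on that key instead of tracking which rows it has already written.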

Infrastructure

Docker Compose · Nginx · FastAPI · Let's Encrypt

Pipeline Architecture

  ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
  │  GeoStat API │     │   NBG API    │     │  Excel Files │
  └──────┬───────┘     └──────┬───────┘     └──────┬───────┘
         │                    │                    │
         └─────────────┬──────┴────────────────────┘
                       │
               ┌───────▼────────┐
               │    Dagster     │
               │ (Orchestrator) │
               └───────┬────────┘
                       │
               ┌───────▼────────┐
               │ Python Assets  │
               │  (Extract +    │
               │   Transform)   │
               └───────┬────────┘
                       │
               ┌───────▼────────┐
               │ PostgreSQL 16  │
               │ (econge schema)│
               └───────┬────────┘
                       │
          ┌────────────┴────────────┐
          │                         │
   ┌──────▼───────┐      ┌──────────▼───────┐
   │  Data Portal │      │  Dagster Web UI  │
   │  (FastAPI)   │      │  (Pipeline Ops)  │
   └──────────────┘      └──────────────────┘

About the Project

Why This Exists

Georgian economic data is scattered across multiple government websites, published in inconsistent formats — Excel spreadsheets, PDFs, and fragmented APIs. Finding historical time series or comparing indicators across sources is painful.

EconGE automates the collection, normalization, and storage of this data into a single PostgreSQL database with a clean, browsable interface on top. It runs on a schedule, so the data stays fresh without manual intervention.

This is both a practical tool and a portfolio piece — demonstrating end-to-end data engineering: ingestion, transformation, storage, orchestration, and data delivery.

What It Does

Automated daily scraping of official sources
Normalized schema across all data providers
Browsable data portal with search and charts
Time series visualization with Chart.js
Dagster-managed pipeline with monitoring
Dockerized deployment on a single VPS
JSON API for programmatic access
Incremental loading — only new data is fetched
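The incremental-loading idea in the list above can be sketched in a few lines: look up the latest stored period for a dataset (e.g. `SELECT MAX(period) ...`), then keep only source rows strictly newer than it. The function name and row shape are illustrative, not the project's actual code.

```python
# Sketch of incremental loading: given the latest period already stored
# for a dataset, keep only newer (period, value) rows from the source.
# Illustrative helper, not the project's real loader.

def filter_new(rows, latest_period):
    """Return observations strictly newer than what's already stored.

    Periods are assumed to sort lexicographically in chronological
    order ('2024-01' < '2024-02'), which holds for ISO-style labels.
    """
    if latest_period is None:          # first run: everything is new
        return list(rows)
    return [r for r in rows if r[0] > latest_period]
```

On each scheduled run the loader fetches the source, applies this filter, and inserts only the remainder, which keeps daily runs cheap even as the history grows.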

Explore the Data

Browse datasets, view time series charts, or check out the pipeline orchestration.