Resume

Onyinyechi C. Ugba

Data Science & Data Engineering Intern

onyi.ugba@outlook.com · Göttingen, Germany


Data and AI professional with several years of experience delivering analytics, data engineering, and machine learning solutions across financial and business domains. Experienced in building production-grade data pipelines, Retrieval-Augmented Generation (RAG) systems, and ML models that improve data quality, operational efficiency, and decision-making. Strong at bridging data, engineering, and business requirements to transform complex information into scalable, actionable insights.

Skills

Python · SQL · Apache Airflow · ETL / ELT Pipelines · PostgreSQL · Docker · REST APIs · Incremental & Idempotent Loading · Data Modeling · Machine Learning · Time Series Forecasting · Retrieval-Augmented Generation (RAG) · LLMs (Google Gemini, LangChain) · Vector Databases (ChromaDB) · Git & GitHub

Experience & Projects

Data Science & Data Engineering Intern

Oct 2025 – Present

Webeet.io · Berlin (Remote)

  • Built production-grade, Dockerized Apache Airflow ETL pipelines ingesting complex GitHub API data (issues, comments, timelines) using Python and PostgreSQL
  • Reduced API overhead by 40% through incremental loading strategies and watermarking, enabling idempotent pipeline re-runs
  • Improved system reliability with Airflow Secrets Management, structured logging, SLAs, automated retries, and in-memory caching
  • Achieved 99%+ pipeline uptime while maintaining GitHub API rate-limit compliance through optimized request strategies
  • Modeled deeply nested API responses into analytical schemas supporting historical event tracking and reproducible analytics
  • Engineered geospatial data pipelines extracting 150+ Berlin library records from OpenStreetMap using Python and GeoPandas
  • Improved data completeness from 60% to 95% via automated address enrichment using the Nominatim API with rate limiting
  • Designed PostgreSQL schemas with constraints and validation workflows to ensure spatial data integrity
  • Developed multi-stage ETL pipelines for NYC education datasets, delivering insights on school safety and SAT performance trends
Python · Apache Airflow · ETL / ELT · Incremental Loading · PostgreSQL · Docker · REST APIs · GeoPandas · Data Modeling

Financial Officer

Jul 2017 – Jul 2018

Post-Graduate Fellowship, Nsukka · Nigeria

  • Built structured data tracking systems for departmental financial records exceeding ₦500K per quarter
  • Implemented standardized data entry and validation processes, improving data consistency
Data Management · Data Validation · Reporting · Process Optimization

Data Assistant

Jun 2015 – May 2016

Ikenne Local Government Area · Nigeria

  • Processed and validated 5,000+ municipal records using systematic quality control checks
  • Created standardized data templates to support monthly and quarterly reports, reducing reporting errors by 15%
Data Quality · Data Processing · Reporting · Data Organization

Machine Learning (Data Science)

Jul 2025 – Aug 2025

Masterschool Berlin · Remote

  • Developed a retail demand forecasting solution using XGBoost and 3+ years of historical sales data
  • Engineered time-based features including seasonality, holidays, day-of-week, and lag variables
  • Evaluated model performance using time-series validation strategies
  • Delivered an end-to-end forecasting pipeline supporting inventory and promotion planning
Machine Learning · Time Series Forecasting · XGBoost · Python · Feature Engineering

Links