DevelopersWork-Labs/data-engineering-portfolio-2026

12-Week Data Engineering + GenAI Transformation 2026

This repository is part of my structured journey to transition into a Senior Data Engineer / AI Data Engineer role.
I am combining my existing experience in Databricks and PySpark with modern GenAI workflows, vector search, and LLM-based systems.

Mission

  • Rebuild data engineering foundations with advanced PySpark, Delta Lake, and distributed systems
  • Design scalable ETL pipelines using Lakehouse architecture
  • Build a real-time streaming + CDC platform using Kafka and Spark Structured Streaming
  • Develop an enterprise-grade RAG pipeline using Databricks Mosaic AI & Vector Search
  • Strengthen interview skills with DSA, system design, and portfolio storytelling
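
As a rough illustration of the CDC idea in the mission list, here is a minimal, pure-Python sketch of applying change events to a keyed table. It is not code from this repository: the event shape (`op`, `key`, `seq`, `row`), the op names `upsert` and `delete`, and the function `apply_changes` are all hypothetical. A real pipeline would read the events from Kafka and merge them into a Delta table with Spark rather than into a dict.

```python
from typing import Dict, Iterable, Optional, TypedDict


class ChangeEvent(TypedDict):
    """Hypothetical CDC event shape, assumed for illustration only."""
    op: str               # "upsert" or "delete"
    key: int              # primary key of the affected row
    seq: int              # monotonically increasing sequence number
    row: Optional[dict]   # new row image for upserts, None for deletes


def apply_changes(table: Dict[int, dict],
                  events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Return a new table with the change events applied in sequence order."""
    result = dict(table)  # copy so the caller's snapshot is left untouched
    # Sorting by seq lets late-arriving events land in the right order.
    for ev in sorted(events, key=lambda e: e["seq"]):
        if ev["op"] == "delete":
            result.pop(ev["key"], None)  # deleting a missing key is a no-op
        elif ev["op"] == "upsert":
            result[ev["key"]] = ev["row"]
        else:
            raise ValueError(f"unknown op: {ev['op']!r}")
    return result


if __name__ == "__main__":
    snapshot = {2: {"name": "x"}}
    events = [
        {"op": "upsert", "key": 1, "seq": 2, "row": {"name": "b"}},
        {"op": "upsert", "key": 1, "seq": 1, "row": {"name": "a"}},
        {"op": "delete", "key": 2, "seq": 3, "row": None},
    ]
    print(apply_changes(snapshot, events))
```

The events for key 1 arrive out of order here, and the sort by `seq` means the later image wins. The same "latest change per key" rule is what a merge-based CDC job has to enforce at scale.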

Long-Term Vision

To build data platforms that treat LLMs as first-class citizens, enabling intelligent data retrieval, automation, and AI-native applications.

Links
