Rahul Joshi

Distinguished Data Engineer specializing in cloud-native data platforms for financial services. M.Tech, IIT Kharagpur. I write about lakehouse architecture, data engineering at scale, and building AI-ready data platforms.

Data-Centric AI: Engineering Platforms for Pre-Model Intelligence

Journal article on why data platform architecture — not model architecture — is the primary determinant of AI system reliability.

January 1, 2025 · Rahul Joshi

Lakehouse Convergence: Delta Lake & Iceberg

Analysis of format convergence between Delta Lake and Apache Iceberg — what it means for lakehouse architecture.

June 1, 2024 · Rahul Joshi

Delta Lake Transaction Logs Explained

Deep dive into Delta Lake’s transaction log internals — how ACID transactions work on object storage.

January 1, 2024 · Rahul Joshi

Understanding the Evolution of Data Lakes

From Hadoop to the modern lakehouse: tracing the architectural evolution of data lakes.

June 1, 2023 · Rahul Joshi

Displacement Based Unsupervised Metric for Evaluating Rank Aggregation

Proposes a variant of the Kendall tau distance metric for evaluating rank aggregation. Published in Springer PReMI 2011.

June 1, 2011 · Rahul Joshi