Azure Data Engineer (Lead) with skills Data Engineering, Python, Apache Hadoop, Apache Hive, Apache Airflow, Synapse, Databricks, SQL, Apache Spark, Azure Data Factory, PySpark, GenAI Fundamentals, Cloud Pub/Sub, BigQuery for location Gurugram, India
ROLES & RESPONSIBILITIES

Key Responsibilities

  • Lead design and execution of Dataproc → Databricks PySpark migration roadmap.

  • Define modernization strategy, including data ingestion, transformation, orchestration, and governance.

  • Architect scalable Delta Lake and Unity Catalog–based solutions.

  • Manage and guide teams on code conversion, dependency mapping, and data validation.

  • Collaborate with platform, infra, and DevOps teams to optimize compute costs and performance.

  • Own the automation & GenAI acceleration layer, integrating code parsers, lineage tools, and validation utilities.

  • Conduct performance benchmarking, cost optimization, and platform tuning (Photon, Auto-scaling, Delta Caching).

  • Mentor senior and mid-level developers, ensuring quality standards, documentation, and delivery timelines.

Technical Skills

  • Languages: Python, PySpark, SQL

  • Platforms: Databricks (Jobs, Workflows, Delta Live Tables, Unity Catalog), GCP Dataproc

  • Data Tools: Hadoop, Hive, Pig, Spark (RDD & DataFrame APIs), Delta Lake

  • Cloud & Integration: GCS, BigQuery, Pub/Sub, Cloud Composer, Airflow

  • Automation: GenAI-powered migration tools, custom Python utilities for code conversion

  • Version Control & DevOps: Git, Terraform, Jenkins, CI/CD pipelines

  • Other: Performance tuning, cost optimization, and lineage tracking with Unity Catalog

Preferred Experience

  • 10–14 years of data engineering experience with at least 3 years leading Databricks or Spark modernization programs.

  • Proven success in migration or replatforming projects from Hadoop or Dataproc to Databricks.

  • Exposure to AI/GenAI in code transformation or data engineering automation.

  • Strong stakeholder management and technical leadership skills.

EXPERIENCE
  • 11-12 Years
SKILLS
  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Python, Apache Hadoop, Apache Hive, Apache Airflow, Synapse, Databricks, SQL, Apache Spark, Azure Data Factory, PySpark, GenAI Fundamentals, Cloud Pub/Sub, BigQuery