Azure Data Engineer (Senior) | Skills: Data Engineering, Python, Apache Hadoop, Apache Hive, Apache Airflow, Synapse, Databricks, SQL, Apache Spark, Azure Data Factory, PySpark, GenAI Fundamentals, Cloud Pub/Sub, BigQuery | Location: Gurugram, India
ROLES & RESPONSIBILITIES

Key Responsibilities

  • Analyze existing Hadoop, Pig, and Spark scripts from Dataproc and refactor them into Databricks-native PySpark.

  • Implement data ingestion and transformation pipelines using Delta Lake best practices.

  • Apply conversion rules and templates for automated code migration and testing.

  • Conduct data validation between legacy and migrated environments (schema, count, and data-level checks).

  • Collaborate on developing AI-driven tools for code conversion, dependency extraction, and error remediation.

  • Ensure best practices for code versioning, error handling, and performance optimization.

  • Participate in UAT, troubleshooting, and post-migration validation activities.

Technical Skills

  • Core: Python, PySpark, SQL

  • Databricks: Delta Lake, Unity Catalog, Databricks Workflows, MLflow (basic understanding)

  • GCP: Dataproc, BigQuery, GCS, Composer/Airflow, Cloud Functions

  • Data Engineering: Hadoop, Hive, Pig, Spark SQL

  • Automation: Experience with migration utilities or AI-assisted code transformation tools

  • CI/CD: Git, Jenkins, Terraform (preferred)

  • Validation: Data comparison utilities (Delta-to-Delta, DataFrame diffing, schema validation)

Preferred Experience

  • 5–8 years in data engineering or big data application development.

  • Hands-on experience migrating Spark or Hadoop workloads to Databricks.

  • Familiarity with Delta architecture, data quality frameworks, and GCP cloud integration.

  • Exposure to GenAI-based tools for automation or code refactoring is a plus.

EXPERIENCE
  • 6-8 Years
SKILLS
  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Python, Apache Hadoop, Apache Hive, Apache Airflow, Synapse, Databricks, SQL, Apache Spark, Azure Data Factory, PySpark, GenAI Fundamentals, Cloud Pub/Sub, BigQuery