Azure Data Engineer (Lead) | Skills: Data Engineering, Python, Databricks, SQL, Azure Data Factory | Location: Bangalore, India
ROLES & RESPONSIBILITIES

Key Responsibilities

1. Data Quality Framework Design & Leadership

  • Define and implement enterprise-wide data quality frameworks and governance standards.

  • Architect automated DQ pipelines using Databricks (Delta Lake), PySpark, and Ataccama ONE.

  • Design DQ monitoring architecture—profiling, lineage integration, and alerting mechanisms.

  • Establish KPIs and DQ scorecards to measure and communicate data trust metrics across domains.
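One common way a DQ scorecard KPI can be computed is as a weighted pass rate across quality dimensions. The sketch below is a minimal, framework-free illustration of that idea; the dimension names, weights, and the 0–100 scale are assumptions for illustration, not a prescribed formula.

```python
from dataclasses import dataclass

@dataclass
class RuleResult:
    """Outcome of one DQ rule run: rows checked vs. rows passing."""
    dimension: str  # e.g. "completeness", "validity", "uniqueness" (illustrative)
    checked: int
    passed: int

def dq_score(results, weights=None):
    """Weighted pass rate across DQ dimensions, scaled 0-100.

    `weights` maps dimension -> relative weight; dimensions default to 1.0.
    An empty result set scores 100 (no rules violated).
    """
    weights = weights or {}
    total_w, acc = 0.0, 0.0
    for r in results:
        w = weights.get(r.dimension, 1.0)
        acc += w * (r.passed / r.checked if r.checked else 1.0)
        total_w += w
    return round(100 * acc / total_w, 1) if total_w else 100.0

# Example scorecard entry for a hypothetical "customer" domain,
# weighting completeness more heavily than the other dimensions.
results = [
    RuleResult("completeness", checked=1000, passed=990),
    RuleResult("validity", checked=1000, passed=950),
    RuleResult("uniqueness", checked=1000, passed=1000),
]
score = dq_score(results, weights={"completeness": 2.0})
```

Publishing such a score per domain over time is one way to make "data trust" measurable and comparable across teams.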

2. Advanced Data Quality Development & Automation

  • Build and optimize complex validation, reconciliation, and anomaly detection workflows using PySpark and Python.

  • Implement rule-based and ML-based DQ checks, leveraging Ataccama workflows and open-source frameworks.

  • Integrate DQ rules into CI/CD and orchestration platforms (Airflow, ADF, or Databricks Workflows).

  • Partner with data engineers to embed DQ checks into ingestion and transformation pipelines.
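The rule-based validation and quarantine pattern described above can be sketched in plain Python. At enterprise scale the same pattern would typically run as PySpark over Delta tables or as Ataccama-managed rules; the rule names, fields, and `_failed_rules` column below are hypothetical.

```python
# Declarative DQ rules: name -> predicate over a row (dict). Illustrative only.
RULES = {
    "customer_id_not_null": lambda row: row.get("customer_id") is not None,
    "email_has_at_sign": lambda row: "@" in (row.get("email") or ""),
    "amount_non_negative": lambda row: (row.get("amount") or 0) >= 0,
}

def validate(rows, rules=RULES):
    """Split rows into (valid, quarantined).

    Quarantined rows carry the list of failed rule names so downstream
    exception reporting can aggregate failures per rule.
    """
    valid, quarantined = [], []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            quarantined.append({**row, "_failed_rules": failed})
        else:
            valid.append(row)
    return valid, quarantined

rows = [
    {"customer_id": 1, "email": "a@x.com", "amount": 10.0},
    {"customer_id": None, "email": "bad-email", "amount": -5.0},
]
valid, quarantined = validate(rows)
```

Embedding a step like this between ingestion and transformation lets pipelines fail fast or route bad records aside rather than propagating them downstream.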

3. Root Cause Analysis & Continuous Improvement

  • Lead root-cause investigations for recurring DQ issues and drive long-term remediation solutions.

  • Create and enforce best practices for rule versioning, DQ exception handling, and reporting.

  • Own the playbook for DQ incident response and continuous optimization.

4. Stakeholder Management & Governance

  • Act as the primary liaison between business data owners, IT, and governance teams.

  • Translate business DQ requirements into technical implementation strategies.

  • Drive executive-level reporting on DQ KPIs, SLAs, and issue trends.

  • Contribute to metadata management, lineage documentation, and master data alignment.

5. Mentorship & Leadership

  • Guide junior analysts and data engineers in developing robust DQ solutions.

  • Lead cross-functional squads to implement new data quality capabilities or upgrades.

  • Contribute to capability uplift—training peers on DQ best practices, tools, and technologies.


Core Technical Skills

  • Data Engineering & Quality: Databricks (Delta Lake), PySpark, SQL, Python

  • DQ Platforms: Ataccama ONE / Studio (rule authoring, workflow automation, profiling)

  • Orchestration & CI/CD: Apache Airflow, Azure Data Factory, Databricks Workflows, GitHub Actions

  • Data Warehouses: Databricks Lakehouse

  • Cloud & Infrastructure: Azure (preferred), AWS, or GCP; familiarity with Terraform or IaC concepts

  • Version Control / CI-CD: Git, GitHub Actions, Azure DevOps

  • Metadata & Governance: Collibra, Alation, Ataccama Catalog, OpenLineage

  • Monitoring & Observability: Grafana, Datadog, Prometheus for DQ metrics and alerts


Qualifications & Experience

  • Bachelor’s or Master’s in Computer Science, Information Systems, Statistics, or related field.

  • 9–12 years of experience in data quality, data engineering, or governance-focused roles.

  • Proven experience designing and deploying enterprise DQ frameworks and automated checks.

  • Strong expertise in Databricks, PySpark, and Ataccama for data profiling and rule execution.

  • Advanced proficiency in SQL and Python for large-scale data analysis and validation.

  • Solid understanding of data models, lineage, reconciliation, and governance frameworks.

  • Experience integrating DQ checks into CI/CD pipelines and orchestrated data flows.


Soft Skills & Leadership Attributes

  • Strong analytical thinking and systems-level problem solving.

  • Excellent communication and presentation skills for senior stakeholders.

  • Ability to balance detail orientation with strategic vision.

  • Influencer with a proactive, ownership-driven mindset.

  • Comfortable leading cross-functional teams in fast-paced, cloud-native environments.


Preferred / Nice to Have

  • Experience in financial, manufacturing, or large enterprise data environments.

  • Familiarity with MDM, reference data, and data stewardship processes.

  • Exposure to machine learning-driven anomaly detection or predictive data quality.

  • Certifications: Databricks, Ataccama, or Cloud Data Engineering certifications (Azure/AWS).


Success Indicators

  • Increased DQ rule coverage and automation across key data domains.

  • Reduced manual DQ exceptions and faster remediation cycle times.

  • Measurable improvement in data trust metrics and reporting accuracy.

  • High stakeholder satisfaction with data availability and reliability.

EXPERIENCE
  • 8-11 Years
SKILLS
  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Python, Databricks, SQL, Azure Data Factory
ABOUT THE COMPANY

Infogain is a human-centered digital platform and software engineering company headquartered in Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).

Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.
