GCP Data Engineer (Senior) with skills Data Engineering, Big Data, GCP-Apps, Pyspark, BigQuery for location Any Infogain Base Location (Noida, Gurugram, Bangalore, Mumbai, Pune)
ROLES & RESPONSIBILITIES

Core Skills

Required Skills & Experience

  • 6–10 years of experience in data engineering or analytics, including 3+ years of hands-on GCP experience.

  • Strong experience with PySpark, Dataproc, GCS, BigQuery, and JDBC ingestion.

  • Proven experience migrating SAS workloads to PySpark or SQL-based systems.

  • Hands-on knowledge of GCP Medallion architecture (Bronze/Silver/Gold).

  • Understanding of Dataplex, IAM, policy tags, and secure data handling.

  • Experience with CI/CD (Cloud Build/GitHub Actions) and workflow orchestration (Cloud Composer/Airflow).

  • Strong problem-solving ability, debugging skills, and ability to guide teams through technical challenges.


Preferred Skills

  • Experience with Vertex AI, ML Ops, and ML pipeline deployment.

  • Knowledge of Delta/Iceberg/Hudi table formats on GCS.

  • Exposure to real-time ingestion (Pub/Sub, Dataflow).

  • Google Cloud Professional Data Engineer or Cloud Architect certification.


Soft Skills

  • Strong leadership and mentoring capabilities.

  • Excellent communication skills to support developers, architects, and business teams.

  • Ability to manage multiple priorities, resolve conflicts, and maintain steady progress under pressure.


 

Key Responsibilities

1. Hands-on Technical Leadership

  • Work closely with development teams on a daily basis to guide solution design, troubleshoot issues, and resolve technical blockers.

  • Enforce engineering best practices, coding standards, and architectural guidelines across all data pipelines and workloads.

  • Perform design and code reviews, ensuring quality, scalability, and reliability of the platform.

2. Data Engineering on GCP

  • Lead development of ingestion pipelines via direct JDBC connectivity from Oracle and Teradata into the Raw/Bronze layer on GCS (see the ingestion sketch at the end of this section).

  • Develop and optimize PySpark workloads on Dataproc for data cleansing, transformation, and harmonization into the Curated/Silver layer.

  • Contribute to the design of the Gold layer in BigQuery, including table structures, partitioning, clustering, and performance optimization.
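
As an illustration of the ingestion bullet above, the following is a minimal PySpark sketch of a direct JDBC read from Oracle landed into the Bronze zone on GCS. The connection URL, credentials, table name, and bucket paths are placeholders, not the project's actual configuration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze-ingest-demo").getOrCreate()

# Read the source table over direct JDBC; partitioned reads parallelize the pull.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")  # placeholder host
    .option("dbtable", "SALES.ORDERS")                               # hypothetical table
    .option("user", "etl_user")
    .option("password", "REDACTED")
    .option("fetchsize", 10000)            # larger fetch size reduces round trips
    .option("numPartitions", 8)            # parallel readers, split on a numeric key
    .option("partitionColumn", "ORDER_ID")
    .option("lowerBound", 1)
    .option("upperBound", 10000000)
    .load()
)

# Stamp a load date and land the data as-is in the Raw/Bronze zone on GCS.
(orders.withColumn("load_date", F.current_date())
       .write.mode("append")
       .partitionBy("load_date")
       .parquet("gs://example-lake/bronze/sales/orders/"))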

3. Migration from SAS to GCP

  • Translate existing SAS logic into PySpark, ensuring functional parity, improved performance, and operational efficiency (an illustrative translation follows this section).

  • Provide guidance on PySpark coding patterns, UDFs, optimization strategies, shuffle/skew handling, and best practices for Dataproc jobs.
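
The translation bullet above can be made concrete with a small, hypothetical example: a SAS PROC SQL aggregation and one possible PySpark equivalent. Table and path names are illustrative; a real migration would also validate row counts, null handling, and numeric types to confirm functional parity.

# SAS original (hypothetical):
#   proc sql;
#     create table curated.orders_agg as
#     select customer_id, sum(amount) as total_amount
#     from bronze.orders
#     where status = 'SHIPPED'
#     group by customer_id;
#   quit;

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sas-migration-demo").getOrCreate()

orders = spark.read.parquet("gs://example-lake/bronze/sales/orders/")

# Equivalent aggregation expressed in PySpark.
orders_agg = (
    orders.filter(F.col("status") == "SHIPPED")
          .groupBy("customer_id")
          .agg(F.sum("amount").alias("total_amount"))
)

# Write the curated result to the Silver layer (path is a placeholder).
orders_agg.write.mode("overwrite").parquet("gs://example-lake/silver/sales/orders_agg/")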

4. BigQuery Engineering & Optimization

  • Build and optimize SQL models, materialized views, and analytical datasets in BigQuery.

  • Apply query optimization techniques, cost controls, and data modeling best practices (star/snowflake schemas); a partitioning/clustering sketch follows this section.

  • Implement RLS/CLS for secure reporting and work with BI teams to integrate BigQuery into reporting tools.
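
As a sketch of the partitioning/clustering point above, the DDL below creates a hypothetical Gold-layer fact table through the google-cloud-bigquery Python client. Project, dataset, and column names are assumptions for illustration only.

from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # placeholder project

ddl = """
CREATE TABLE IF NOT EXISTS gold.fact_orders
(
  order_id      INT64,
  customer_id   INT64,
  order_date    DATE,
  total_amount  NUMERIC
)
PARTITION BY order_date        -- date partitioning prunes scanned bytes (and cost)
CLUSTER BY customer_id         -- clustering co-locates rows on common filter/join keys
"""

client.query(ddl).result()  # .result() blocks until the DDL job completes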

5. Vertex AI & ML Support

  • Assist data scientists in building ML pipelines using Vertex AI (training, prediction, feature engineering).

  • Guide integration of feature pipelines from the Silver layer into the Vertex AI Feature Store.

  • Ensure reproducibility, lineage, and model monitoring (drift, bias).

6. Data Governance & Security (Dataplex + IAM)

  • Implement and enforce governance standards using Dataplex, including cataloging, policy tags, and data domains.

  • Ensure datasets follow proper IAM roles, tagging, and compliance (PII/PCI/PHI masking where needed).

  • Support lineage metadata, DQ implementation, and documentation.

7. Operations, Monitoring & Cost Optimization

  • Optimize Dataproc clusters (autoscaling, preemptible/Spot VMs), GCS storage lifecycle policies, and BigQuery costs (see the lifecycle sketch at the end of this section).

  • Establish monitoring dashboards, logs, alerts, and operational KPIs.

  • Troubleshoot and resolve production issues, ensuring high availability and reliability.
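
For the storage-cost bullet above, one possible lifecycle policy sketch using the google-cloud-storage client is shown below; the bucket name and retention windows are assumptions, not the project's actual policy.

from google.cloud import storage

client = storage.Client(project="example-project")  # placeholder project
bucket = client.get_bucket("example-lake")          # placeholder bucket

# Move objects to colder storage after 90 days, delete after roughly two years.
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.add_lifecycle_delete_rule(age=730)
bucket.patch()  # persists the updated lifecycle configuration on the bucket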

 

EXPERIENCE
  • 6-8 Years
SKILLS
  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Big Data, GCP-Apps, Pyspark, BigQuery