GCP Data Architect (Principal) with skills Data Engineering, Big Data, GCP-Apps, PySpark, BigQuery for location Any Infogain Base Location (Noida, Gurugram, Bangalore, Mumbai, Pune)
ROLES & RESPONSIBILITIES

Qualification

·       18+ years in data/analytics engineering with 10+ years architecting solutions on public cloud; 5+ years hands-on with GCP.

·       Proven delivery of Medallion (Bronze/Silver/Gold) architectures on GCP with GCS + Dataproc + BigQuery at enterprise scale.

·       Expert in PySpark and Dataproc (job orchestration, autoscaling, cluster policies, tuning, troubleshooting).

·       Strong BigQuery expertise: storage/compute separation, partitioning, clustering, materialized views, BI Engine, slot management, RLS/CLS.

·       Hands-on experience with Vertex AI (Pipelines, Feature Store, training/serving, registry, monitoring) and MLOps best practices.

·       Implemented Dataplex for centralized governance (catalog, policy tags) and IAM for least-privilege, plus data security/compliance controls.

·       Practical integration with Oracle and Teradata via JDBC; familiarity with CDC patterns and schema evolution.

·       CI/CD for data platforms (Cloud Build/GitHub Actions), orchestration (Cloud Composer/Airflow), and Infrastructure as Code (Terraform).

·       Deep understanding of data modeling, data quality, lineage, and observability for data systems.

·       Excellent communication, stakeholder management, and leadership across technical and business teams.

·       Google certifications: Professional Cloud Architect and/or Professional Data Engineer (highly preferred).

·       Experience modernizing SAS workloads and translating SAS macros/PROCs to PySpark/SQL on GCP.

·       Knowledge of streaming (Pub/Sub, Dataflow/Flink/Spark Structured Streaming) for near-real-time requirements.

·       Experience with VPC Service Controls, Private Service Connect, Organization Policy, Workload Identity Federation.

·       Familiarity with Delta/Iceberg/Hudi tables and open table formats on GCS; data sharing patterns (Analytics Hub).

·       Bachelor’s/Master’s in Computer Science, Engineering, Information Systems, or equivalent experience.
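As a concrete illustration of the BigQuery partitioning/clustering expertise listed above, here is a minimal sketch of a helper that emits a partitioned, clustered `CREATE TABLE` statement of the kind used to build cost-efficient Gold-layer tables. The table and column names (`analytics.sales_gold`, `order_ts`, `region`, `customer_id`) are hypothetical, not from this posting:

```python
def gold_table_ddl(table: str, partition_col: str, cluster_cols: list[str]) -> str:
    """Build a BigQuery DDL string for a date-partitioned, clustered table.

    Partitioning on a date/timestamp column prunes scanned bytes per query;
    clustering on frequently filtered columns reduces slot usage for
    selective queries. Names here are illustrative placeholders.
    """
    cols = ", ".join(cluster_cols)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} "
        f"PARTITION BY DATE({partition_col}) "
        f"CLUSTER BY {cols} "
        f"AS SELECT * FROM {table}_silver"  # hypothetical Silver source table
    )

# Example: a Gold sales table partitioned by order timestamp and
# clustered by the columns BI dashboards filter on most often.
ddl = gold_table_ddl("analytics.sales_gold", "order_ts", ["region", "customer_id"])
```

In practice the generated DDL would be run through the BigQuery client or a CI/CD pipeline; the sketch keeps only the statement-building step.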

Job Description:

·       Design and own the end-to-end analytics architecture on GCP, ensuring alignment with business, security, cost, and performance goals.

·       Implement a Medallion architecture:

o   Bronze (Raw): ingest data to GCS via JDBC from Oracle/Teradata.

o   Silver (Curated): transform data with PySpark on Dataproc.

o   Gold (Optimized): serve analytics- and BI-ready models in BigQuery.

·       Define canonical data models, storage formats (Parquet/ORC/Delta/Iceberg), and partitioning/clustering strategies.

·       Lead migration from SAS to PySpark, establish coding standards, and optimize Spark jobs.

·       Build JDBC ingestion pipelines with CDC, robust retries, schema evolution, and orchestrate workflows via Cloud Composer/Airflow and CI/CD.

·       Architect BigQuery models, manage cost/performance, enforce SLAs, and integrate securely with BI tools using RLS/CLS.

·       Define MLOps workflows on Vertex AI, including feature pipelines, automated training, deployment, and model monitoring for drift/bias.

·       Implement centralized governance via Dataplex (catalog, policy tags), IAM least privilege, VPC-SC, and data security/compliance controls.

·       Drive cost optimization, reliability/SRE practices, monitoring, DR/BCP, and FinOps governance.

·       Provide architectural leadership, mentor teams, set standards, and create roadmaps, ADRs, and executive-level communication.
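The JDBC ingestion responsibilities above (CDC-style incremental pulls with robust retries) can be sketched in plain Python; the table name, watermark column, and backoff parameters below are illustrative assumptions, not details from this posting:

```python
import time


def incremental_query(table: str, watermark_col: str, last_watermark: str) -> str:
    """Build the SELECT pushed down over JDBC so only rows changed since the
    last successful load are read from the source (e.g. Oracle/Teradata)."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_col} > TIMESTAMP '{last_watermark}'"
    )


def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Run a flaky extract with exponential backoff, re-raising once the
    retry budget is exhausted."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** i)  # 1s, 2s, 4s, ... between attempts
```

In a real pipeline the watermark would be read from a state store and the query handed to `spark.read.jdbc` (or a Dataproc job); the sketch keeps only the control logic the bullet describes.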

EXPERIENCE
  • 18-21 Years
SKILLS
  • Primary Skill: Data Engineering
  • Sub Skill(s): Data Engineering
  • Additional Skill(s): Big Data, GCP-Apps, PySpark, BigQuery