Big Data Engineer (Standard) | Skills: Data Engineering, Kafka, Big Data, Apache Hive, SQL Server DBA, CI/CD, Apache Spark | Location: Noida, India
ROLES & RESPONSIBILITIES
Job Description:
· Build pipelines to ingest a wide variety of data from multiple sources within the organization, as well as from social media and public data sources (a minimal ingestion sketch follows this list).
· Collaborate with cross-functional teams to source data and make it available for downstream consumption.
· Work with the team to provide an effective solution design to meet business needs.
· Ensure regular communication with key stakeholders; understand any key concerns about how the initiative is being delivered, along with any risks or issues that have either not yet been identified or are not being progressed.
· Ensure dependencies and challenges (risks) are escalated and managed. Escalate critical issues to the Sponsor and/or Head of Data Engineering.
· Ensure timelines (milestones, decisions, and delivery) are managed and the value of the initiative is achieved, without compromising quality and within budget.
· Ensure an appropriate and coordinated communications plan is in place for initiative execution and delivery, both internally and externally.
· Ensure final handover of the initiative to business-as-usual processes, carry out a post-implementation review (as necessary) to confirm that initiative objectives have been delivered, and feed any lessons learned into future initiative management processes.
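The ingestion responsibility above pairs with Kafka in the skills list below. As a minimal, purely illustrative sketch (not a prescribed design), the following PySpark Structured Streaming job lands raw Kafka events into Parquet. The broker address, topic name, and paths are hypothetical, and the job assumes the spark-sql-kafka connector package is available on its classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-raw-ingest").getOrCreate()

    # Hypothetical topic carrying social-media events as JSON strings.
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
        .option("subscribe", "social_events")               # hypothetical topic
        .load()
    )

    # Kafka delivers key/value as binary; cast the payload to string for later parsing.
    events = raw.select(
        F.col("key").cast("string").alias("key"),
        F.col("value").cast("string").alias("payload"),
        F.col("timestamp"),
    )

    # Land raw events as Parquet with a checkpoint so the job can restart safely.
    (
        events.writeStream.format("parquet")
        .option("path", "/data/raw/social_events")           # hypothetical path
        .option("checkpointLocation", "/chk/social_events")  # hypothetical path
        .start()
        .awaitTermination()
    )

The checkpoint location is what makes the ingest restartable; without it, a failed job would lose or reprocess Kafka offsets.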
Who we are looking for:
Competencies & Personal Traits
· Work as a team player
· Excellent problem analysis skills
· Experience with at least one cloud infrastructure provider (Azure/AWS)
· Experience in building batch data pipelines with Apache Spark (Spark SQL, DataFrame API) or Hive Query Language (HQL); a batch pipeline sketch follows this list
· Knowledge of big data ETL processing tools
· Experience with Hive and Hadoop file formats (Avro / Parquet / ORC)
· Basic knowledge of scripting (shell/Bash)
· Experience of working with multiple data sources including relational databases (SQL Server / Oracle / DB2 / Netezza).
· Basic understanding of CI/CD tools such as Jenkins, JIRA, Bitbucket, Artifactory, Bamboo, and Azure DevOps.
· Basic understanding of DevOps practices using Git version control
· Ability to debug, fine-tune, and optimize large-scale data processing jobs
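As a concrete, purely illustrative example of the batch skills above, the sketch below uses the Spark DataFrame API to read a table from SQL Server over JDBC, aggregate it, and write partitioned Parquet (one of the Hadoop file formats named in this list). The connection details, table, column names, and output path are hypothetical placeholders, not part of this posting.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-daily-batch").getOrCreate()

    # Hypothetical SQL Server source; real jobs would pull credentials from a secret store.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=sales")
        .option("dbtable", "dbo.orders")
        .option("user", "etl_user")
        .option("password", "<from-secret-store>")
        .load()
    )

    # DataFrame API transformation: derive a date column and aggregate per region.
    daily = (
        orders.withColumn("order_date", F.to_date("order_ts"))
        .groupBy("order_date", "region")
        .agg(F.sum("amount").alias("total_amount"))
    )

    # Write as Parquet, partitioned by date, so Hive or Spark SQL can query it efficiently.
    daily.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/daily_orders")

The same transformation could equally be expressed in Spark SQL (via spark.sql) or as HQL against a Hive table; the DataFrame form is shown only because it is the most compact.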
Working Experience
· 1-3 years of broad experience working with enterprise IT applications in cloud platform and big data environments.
Professional Qualifications
· Certifications related to data and analytics would be an added advantage
Education
· Master’s or Bachelor’s degree in a STEM field (Science, Technology, Engineering, or Mathematics)
Language
· Fluency in written and spoken English
EXPERIENCE
- 3-4.5 Years
SKILLS
- Primary Skill: Data Engineering
- Sub Skill(s): Data Engineering
- Additional Skill(s): Kafka, Big Data, Apache Hive, SQL Server DBA, CI/CD, Apache Spark