Datafin

Data Engineer III – Platform (CPT/JHB)

IT – Analyst, Data Management
Cape Town – Western Cape / Johannesburg – Gauteng

ENVIRONMENT:
A dynamic Financial Services provider seeks a highly analytical Data Engineer III whose core role will entail contributing to the design and development of new cloud workloads for Platform and Product teams, evolving data platforms, and empowering data consumers. The successful incumbent must possess a Bachelor’s Degree in IT and a minimum of 3 years’ proven experience in Computer Programming and Data Engineering, including proven experience with the AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS Lake Formation), operationalizing Batch and/or Real-time data pipelines, Python, PySpark or Scala, version control in Git, CI/CD deployment, and any Infrastructure as Code tool.
 
DUTIES:
  • Contribute to the design and development of new cloud workloads for Platform and Product teams, to empower data consumers using data platforms to deliver client value.
  • Maintain and manage the existing cloud data environments and enable data producers to easily contribute to these environments.
  • Contribute to evolving the data platforms through sharing of knowledge, contributing new data features, and enhancing/streamlining existing processes e.g., improved re-use of code.
 
REQUIREMENTS:
Qualifications –
  • Bachelor’s Degree in Information Technology or Information Technology – Programming.
 
Experience/Skills –
  • At least 3 years’ proven experience in Computer Programming and Data Engineering together with a relevant 3-year tertiary qualification, OR at least 4–5 years’ proven experience in Computer Programming and Data Engineering.
  • Application Development with scripting languages (Python).
  • Relational database management systems.
  • Provisioning cloud resources using Infrastructure as Code (Terraform).
  • Core AWS services (S3, EC2, VPC, IAM).
  • Cloud data lake and warehouse concepts.
  • Software Testing practices.
  • Basic Terminal/bash usage.
  • Software Version Control systems (Git) and deployment tools (CI/CD).
  • Structured vs Unstructured data.
Proven experience in:
  • AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS Lake Formation).
  • Operationalizing Batch and/or Real-time data pipelines.
  • Python, PySpark, or Scala.
  • Version Control in Git, and CI/CD deployment.
  • Any Infrastructure as Code tool.
 
Ideal to have –
  • Honours Degree in Information Technology – Computer Science or Information Technology – Systems Engineering.
  • At least 3 years’ proven experience in Cloud Data Engineering, particularly in AWS, together with a relevant 3-year tertiary qualification, OR at least 4–5 years’ proven experience in Cloud Data Engineering, particularly in AWS.
  • Proven experience in:
    • Apache Spark, Hudi, Presto.
    • Distributed Systems (Apache Hadoop, Amazon EMR).
    • Advanced shell scripting.
    • Infrastructure as Code (Terraform).
  • AWS serverless services (Step Functions, Lambda, EventBridge, API Gateway).
  • AWS data lake and warehousing services (Glue, Lake Formation, EMR).
  • Data lake and warehouse architecture.
  • AWS Well-Architected Framework.
  • Collaboration tools (JIRA, Confluence, Draw.io).
  • Trusted insights into Data Governance, Data Management, Data Quality, Data Security and Master Data Management.
  • Computer Literacy (MS Word, MS Excel, MS Outlook).
  • Solid understanding of the banking systems environment and banking business model.
 
ATTRIBUTES:
  • Analytical.
  • Communication skills.
  • Interpersonal & Relationship Management skills.
  • Problem solving skills.