Datafin

Python Engineer (Data) (CPT/JHB)

IT – Analyst, Data Management
Cape Town – Western Cape / Johannesburg – Gauteng

ENVIRONMENT:
A fast-paced & innovative Financial Institution seeks the technical expertise of a Python Engineer (Data) with strong AWS skills whose core focus will be to empower data consumers. You will contribute to the design and development of new cloud workloads for Platform & Product teams while maintaining and managing the existing cloud data environments. The ideal candidate must possess a Bachelor’s Degree in IT or an IT-related field, have 3–5 years’ proven Computer Programming and Data Engineering experience, and have proven experience with the AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS Lake Formation), Python, PySpark/Scala, operationalizing batch and/or real-time data pipelines, Git, CI/CD, and an Infrastructure as Code tool such as Terraform.
 
DUTIES:
  • Contribute to the design and development of new cloud workloads for Platform and Product teams, empowering data consumers who use the data platforms to deliver client value.
  • Maintain and manage the existing cloud data environments and enable data producers to easily contribute to these environments.
  • Contribute to evolving the data platforms through sharing of knowledge, contributing new data features, and enhancing/streamlining existing processes (e.g., improved re-use of code).
 
REQUIREMENTS:
Qualifications –
  • Bachelor’s Degree in Information Technology or Information Technology – Programming.
  • Honours Degree in Information Technology – Computer Science or Information Technology – Systems Engineering (Preferred).
 
Experience/Skills –
  • At least 3 years’ proven experience in Computer Programming and Data Engineering, together with a relevant 3-year tertiary qualification, OR at least 4–5 years’ proven experience in Computer Programming and Data Engineering.
  • Proven experience in:
    • AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS Lake Formation).
    • Operationalizing batch and/or real-time data pipelines (a minimal sketch follows this list).
    • Python, PySpark, or Scala.
    • Version control in Git, and CI/CD deployment.
    • Any Infrastructure as Code tool.
  • Relational database management systems.
  • Provisioning cloud resources using Infrastructure as Code (Terraform).
  • Core AWS services (S3, EC2, VPC, IAM).
  • Cloud data lake and warehouse concepts.
  • Software Testing practices.
  • Basic terminal/Bash usage.
  • Structured vs. unstructured data.
  • Clear criminal and credit record.
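
For illustration only (not part of the formal requirements): a minimal sketch of the kind of batch pipeline this role would operationalize, written in PySpark. All bucket names, paths, and column names are hypothetical placeholders, and the job assumes a Spark runtime with S3 access configured (e.g., via hadoop-aws).

# Minimal PySpark batch job sketch: reads a raw CSV extract from S3, applies
# light cleansing, and writes partitioned Parquet to a curated zone.
# All bucket names, paths, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-transactions-batch").getOrCreate()

# Read the raw daily extract (schema inference kept simple for the sketch).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3a://example-raw-zone/transactions/2024-01-01/")
)

# Basic cleansing: de-duplicate and derive a partition column.
curated = (
    raw.dropDuplicates(["transaction_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
)

# Write partitioned Parquet for downstream data consumers.
(
    curated.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-curated-zone/transactions/")
)

spark.stop()

In production, a job like this would typically run as an AWS Glue job or on EMR, be deployed through CI/CD, and have its buckets and IAM roles provisioned with an Infrastructure as Code tool such as Terraform.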
 
Ideal to have –
  • At least 3 years’ proven experience in cloud data engineering, particularly in AWS, together with a relevant 3-year tertiary qualification OR
  • At least 4–5 years’ proven experience in cloud data engineering, particularly in AWS.
  • Proven experience in:
    • Apache Spark, Hudi, Presto.
    • Distributed Systems (Apache Hadoop, Amazon EMR).
    • Advanced Shell Scripting.
    • Infrastructure as Code (Terraform).
  • AWS serverless services (Step Functions, Lambda, EventBridge, API Gateway) – briefly illustrated after this list.
  • AWS data lake and warehousing services (Glue, Lake Formation, EMR).
  • Data lake and warehouse architecture.
  • AWS Well-Architected Framework.
  • Collaboration tools (JIRA, Confluence, Draw.io).
  • Ability to provide trusted insights into Data Governance, Data Management, Data Quality, Data Security and Master Data Management.
  • Solid understanding of:
    • Banking systems environment.
    • Banking business model.
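
Similarly for illustration only: a hypothetical AWS Lambda handler, as might be invoked on an EventBridge schedule to start the batch job above as a Glue job. The Glue job name is an assumption.

# Hypothetical Lambda handler: invoked by an EventBridge schedule, it kicks
# off a Glue job run via boto3. The job name below is a placeholder.
import boto3

glue = boto3.client("glue")

def handler(event, context):
    # start_job_run triggers the named Glue job and returns its run id.
    response = glue.start_job_run(JobName="daily-transactions-batch")
    return {"JobRunId": response["JobRunId"]}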
 
ATTRIBUTES:
  • Analytical.
  • Communication skills.
  • Interpersonal & Relationship Management skills.
  • Problem-solving skills.