Datafin

Python Engineer (Data) (CPT/JHB)

IT – Analyst, Data Management
Cape Town – Western Cape / Johannesburg – Gauteng

ENVIRONMENT:
A fast-paced & innovative Financial Institution seeks the technical expertise of a Python Engineer (Data) to contribute to the design and development of new cloud workloads for platform and product teams, and to empower data consumers who use the data platforms to deliver client value. You will also be expected to maintain and manage the existing cloud data environments and enable data producers to contribute to these environments easily. You will require 3–5 years’ proven experience in Computer Programming and Data Engineering, including the AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS LakeFormation), operationalizing Batch and/or Realtime data pipelines, Python, PySpark, or Scala, version control in Git, CI/CD deployment, and any infrastructure as code tool.
 
DUTIES:
Development and Design –
  • Design and develop cloud solutions for the data platforms, according to best practices, the Data Product Life Cycle (DPLC), and Way of Work (WoW), including enhancements to the existing data platforms.
  • Gather context and requirements, and define scope for new development requests.
  • Work autonomously and take a high level of ownership in delivering software components.
  • Document developed solutions thoroughly and in a way that facilitates ease of use.
  • Include relevant automated tests in developed solutions, such as unit or integration tests.
 
Support and Maintenance –
  • Maintain data platforms, by investigating and fixing reported issues.
  • Provide support to the data platforms and the platform users, including standby duties, and responding to and resolving issues.
  • Provide support to specific value stream projects using the data platforms, through the design and development of new cloud workloads, and providing technical guidance to these projects.
 
Research and Continuous Improvement –
  • Eagerly learn new relevant skills through just-in-time learning, by researching and deep-diving into the problem or feature currently under development. Be willing to tackle new work requiring unfamiliar knowledge and skills.
  • Stay informed about developments in areas of technology that are relevant to the data platforms (e.g., AWS services, distributed data processing, source control tools, testing libraries, infrastructure as code, programming languages).
  • During code reviews, ensure that fellow team members’ platform contributions follow standards and best practices.
  • Seek to understand and learn from solutions contributed by fellow team members, and in turn be willing to knowledge share.
  • Take ownership to improve own technical knowledge about the internal workings of the data platforms.
  • Onboard and mentor new starters in the team.
  • Collaborate and share experiences and knowledge with relevant communities by being an active participant who is motivated and encourages others to participate.
 
REQUIREMENTS:
At least 3 years’ proven experience in Computer Programming and Data Engineering, together with a relevant 3-year tertiary qualification.
OR
At least 4–5 years’ proven experience in Computer Programming and Data Engineering.
Proven experience in:
    • AWS data stack (AWS Glue, AWS Redshift, AWS S3, AWS LakeFormation)
    • Operationalizing Batch and/or Realtime data pipelines
    • Python, PySpark, or Scala
    • Version control in Git, and CI/CD deployment
    • Any infrastructure as code tool
 
Must have detailed knowledge of –
  • Application development with scripting languages (Python).
  • Relational database management systems.
  • Provisioning cloud resources using Infrastructure as Code (Terraform).
  • Core AWS services (S3, EC2, VPC, IAM).
  • Cloud data lake and warehouse concepts.
  • Software testing practices.
  • Basic Terminal/bash usage.
  • Software Version Control systems (git) and deployment tools (CI/CD).
  • Structured vs Unstructured data.
 
Ideal to have –
  • AWS serverless services (Step Functions, Lambda, EventBridge, API Gateway).
  • AWS data lake and warehousing services (Glue, LakeFormation, EMR).
  • Data lake and warehouse architecture.
  • AWS Well-Architected Framework.
  • Collaboration tools (JIRA, Confluence, Draw.io).
  • Trusted insights into Data Governance, Data Management, Data Quality, Data Security and Master Data Management.
  • Banking systems environment.
  • Banking business model.