What will you accomplish?
- Implement data pipelines to extract data from multiple sources.
- Implement data pipelines to transform data using stream, micro-batch, or batch methods, preparing it for consumption by ML model training and inference, business intelligence analytics tools, and self-service platforms.
- Deploy data pipelines following DevOps best practices, including automated testing.
- Build and manage robust cloud architectures that process and transform data efficiently.
- Be a team player: foster a speed-oriented culture and help cultivate our fast-growing data team.
- Work autonomously and make decisions collaboratively within a team that will support and challenge you as you grow and develop.
What do we need from you?
- 3+ years of experience developing highly scalable and resilient data pipelines.
- Well-versed in multiple programming languages and paradigms, preferably two or more of Python, Go, SQL, Scala, and Java.
- Experience with orchestration tools (such as Apache Airflow or Luigi) and data processing frameworks (such as Apache Beam or Spark).
- Experience using Terraform to provision and manage infrastructure (Infrastructure as Code).
- Strong ability to implement, maintain, and manage relational databases, cloud-based data warehouses, non-relational databases, and storage solutions for unstructured data.
- Familiarity with data integration tools (such as dbt, Stitch, Fivetran, Talend, Informatica, or Matillion).
- An understanding of best practices for storing and processing data, as well as computer science fundamentals, software engineering best practices, automated testing, networking protocols, and distributed systems.
- Excellent verbal and written communication skills, with the ability to understand and explain complex concepts to both technical and non-technical audiences.