Apache Airflow

Scheduled Jobs Orchestration for Risk Engine

Introduction

Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows.

We use it to deploy the Data Engineering and ML Engineering pipelines that run as batch jobs.

For example, as the screenshot shows, two jobs currently run daily:

  • intotheblock: pulls data from the IntoTheBlock (ITB) API and loads it into our BigQuery tables.

  • collarisk1: reads the ITB data from BigQuery, transforms it into features, and writes them to our PostgreSQL database.

Airflow UI
