We are working on integrating Argo & K8S for Metaflow, an open-source ML framework originally developed at Netflix https://github.com/Netflix/metaflow/pull/434
Kubeflow Pipelines uses Argo Workflows under the hood (and not very hidden at that).
And it's not like KFP makes CI/CD (MLOps) any easier than Argo Workflows itself.
Argo had better Kubernetes + surrounding-ecosystem integration out of the box; it was designed to run containers by default, which suited us because we had mixed-language workloads. Airflow was mostly Python-specific unless you ran plugins and extensions; the config/pipeline definition was written in Python, which I didn't want to do after witnessing my teammates write the worst Python I've seen in my career; and the last time I evaluated it, it depended on a bunch of external, Python-specific tools (Celery etc.) that I had previously found painful to run.
Isn't that fairly common, which is why there are "ML engineers" who then productionize (clean up and optimize) the original code to be plugged into a production workflow / pipeline system?
I had used Airflow for a few years, and looked into Prefect; in retrospect I'm very happy we chose Argo.
Use Argo if:
- Your tasks are containerized.
- You're using Kubernetes, and can benefit from what it can offer — individually sized containers, autoscaling, fault-tolerance.
- You have loosely coupled tasks — ones that at most pass files to each other, rather than Python objects.
- You don't have tens of thousands of tasks / streaming / etc.
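To make the first and third bullets concrete, here's a minimal sketch of what such a workflow looks like in Argo: two containers, each individually specified, with the first passing a file to the second as an output artifact. All names, images, and paths here are illustrative, not from any real pipeline.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: file-passing-   # illustrative name
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: produce
            template: writer
        - - name: consume
            template: reader
            arguments:
              artifacts:
                - name: data
                  from: "{{steps.produce.outputs.artifacts.data}}"
    # Each step is just a container image; tasks in different
    # languages coexist because only files cross the boundary.
    - name: writer
      container:
        image: python:3.11
        command: [python, -c]
        args: ["open('/tmp/out.txt', 'w').write('hello')"]
      outputs:
        artifacts:
          - name: data
            path: /tmp/out.txt
    - name: reader
      inputs:
        artifacts:
          - name: data
            path: /tmp/in.txt    # Argo mounts the file here
      container:
        image: alpine:3
        command: [cat, /tmp/in.txt]
```

Because each template declares its own image (and could declare its own resource requests), Kubernetes schedules each task on whatever node fits it, rather than a fixed worker pool.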
Airflow can run on Kubernetes, but with Airflow we ended up having equally sized workers up 24/7, whether they were running an expensive job, a query on a remote system, or nothing at all.