Metis Machine's Skafos

Machine Learning Delivered. A Machine Learning deployment platform built to unite Data Science, DevOps, and Engineering.

Welcome to the Metis Machine documentation hub. You'll find comprehensive guides and documentation to help you start working with Metis Machine's Skafos platform as quickly as possible, as well as support if you get stuck. Fire it up!

Get Started    

The Project is the central construct of Skafos.
Regardless of the complexity of your business use case, there are probably multiple steps in your end-to-end machine learning pipeline:

  • External data needs to be ingested, processed, and stored
  • ML features need to be engineered and a model needs to be trained
  • New incoming data needs to be scored by one or more live models
  • Model output needs to be exposed via REST API
  • Predictions need to be monitored for model drift

Moreover, each of these steps serves a different function within your ML pipeline, making each one a prime candidate to be treated as an independent microservice rather than part of a single monolithic Python file. Skafos provides tooling that makes it easy to discretize each step as a single job and then orchestrate those jobs as a pipeline. Each step, or chunk of code, is called a Job; Jobs can run independently or work together to form a Project.

What's in a Project

A Project is a user-managed repository containing the following items:

  • Code for each Job
  • Run-time Configuration
  • Project Dependencies & Requirements

You can create a brand new Project to deploy and monitor your ML pipeline directly from the command line interface (CLI).

$ skafos init my_new_project

The Skafos CLI section contains detailed installation and usage information.
Once you’ve created a project, you’ll need to create jobs, define any dependencies, configure the project, and deploy the project into your operational systems.
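As a rough sketch, a minimal project repository might look like the following. The file names other than metis.config.yml and requirements.txt are illustrative, not mandated by Skafos; main.py is simply the entrypoint used in the Basic Example below.

my_new_project/
├── metis.config.yml     # run-time configuration for each Job (generated at init)
├── requirements.txt     # project dependencies
└── main.py              # entrypoint for a single Job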

Note:

We also provide a series of starter Templates to help get you moving quickly!

Configuration

Because a project is a collection of Jobs, Skafos enables you to configure the way each job runs. When designing an end-to-end ML pipeline, it may be useful to:

  • Schedule jobs to run at specific times (hourly, daily, weekly, every 2 minutes, etc.); see the cron sketch after this list.
  • Chain jobs to run one after the other.
  • Parallelize jobs by running multiple instances at once.
  • Scale up a job’s computational limits with additional CPU and memory resources.
  • Define a job’s unique entrypoint.
  • Utilize an AddOn such as a Skafos Queue or Spark Cluster to increase speed and performance.
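The schedule option, shown in the complex example below, uses cron expressions. A few illustrative values, assuming Skafos accepts standard five-field cron syntax (including step values):

schedule: "*/2 * * * *"    # every 2 minutes
schedule: "0 * * * *"      # hourly, on the hour
schedule: "0 11 * * *"     # daily at 11:00
schedule: "0 11 * * 1"     # weekly, on Mondays at 11:00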
Skafos provides a simple means to manage your project’s configuration options. Each project comes with its own metis.config.yml file, described next.

The Config File

The config file is the central orchestration component of each project. Living at the top level of the project code repository, it outlines the user-defined runtime behavior of deployed Jobs. When your project is first initialized, a metis.config.yml file is also generated.

Basic Example

project_token: <project_token>
name: my_new_project
jobs: 
  - job_id: <job_id>
    language: python
    name: Main
    entrypoint: "main.py"

Complex Example

If your project contains multiple Jobs, each may require specific run-time settings. Below is an example with several jobs that work together:

project_token: <project_token>
name: my_new_project
jobs:
  - job_id: <job_id_1>
    language: python
    name: ingest
    entrypoint: "data-ingest.py"
    schedule: "0 11 * * *"
  - job_id: <job_id_2>
    language: python
    name: train
    entrypoint: "model-train.py"
    dependencies: ["<job_id_1>"]
    resources:
      limits:
        cpu: 6
        memory: 6Gi
  - job_id: <job_id_3>
    language: python
    name: score
    entrypoint: "score.py"
    dependencies: ["<job_id_2>"]
    resources:
      limits:
        cpu: 1
        memory: 4Gi
  - job_id: <job_id_4>
    language: python
    name: report
    entrypoint: "report.py"
    dependencies: ["<job_id_3>"]

In the complex example, four jobs are scheduled and chained together via dependencies, each requiring different resource allocations. Most ML pipelines will require configuration files of this type. Each job can have requirements (e.g. scheduling, resources) that differ from the other jobs within the project, forming a powerful constellation of microservices that operationalizes your ML pipeline.
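Because the dependencies field is a list of job IDs, a fan-in step that waits on more than one upstream job is a natural extension. The snippet below is only a sketch; the examples above do not confirm that multiple upstream job IDs are accepted, and the job name and entrypoint are hypothetical:

  - job_id: <job_id_5>
    language: python
    name: summarize
    entrypoint: "summarize.py"
    dependencies: ["<job_id_3>", "<job_id_4>"]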

Dependencies & Requirements

In addition to run-time configurations, a project contains a list of requirements that are needed to deploy your pipeline. Skafos abstracts away ugly dependency & environment management so that you can focus on your models. List out each dependency in the requirements.txt file included in your project repository:

skafossdk==1.1.2
pandas==0.23.4
scikit-learn
numpy

If your project requires a more sophisticated environment, Skafos also supports an environment.yml file included in your project repo. This relies on Conda to manage both Python packages and system-level dependencies. See the Conda documentation to learn how to create one.
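A minimal sketch of what an environment.yml might look like; the channels, Python version, and package versions here are illustrative, not required by Skafos:

name: my_new_project
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.6            # illustrative Python version
  - pandas=0.23.4
  - scikit-learn
  - pip
  - pip:
    - skafossdk==1.1.2    # pip-installed packages can be listed under the pip key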

Deployment

After you’ve created a new project, written several Jobs, and outlined configurations & dependencies, you are ready to deploy. Skafos handles all of the backend orchestration for you so that deployment is 100% serverless. Read more about this in the Deployments section! Once your project is deployed, head over to the Dashboard section to learn how you can stay on top of issues/failures as they arise.