TensorFlow Extended (TFX) was developed by Google as an end-to-end system for deploying production machine learning (ML) pipelines.
It is used internally at Google, but other businesses can adopt it too. This guide explores TFX and the built-in components you can use when building a pipeline.
What Is TFX?
TFX sits under the TensorFlow umbrella. It is a production-scale machine learning platform built on TensorFlow and maintained by Google.
It offers a range of libraries, frameworks, and components that can be used for machine learning, particularly when it comes to creating and launching machine learning models. It may also be used as part of an Industry Solutions strategy to improve operations and gain better insights.
TFX covers the following areas of the ML workflow:
- Modelling
- Serving
- Training
- Managing deployments
These can assist you in building an efficient ML pipeline.
TFX libraries and components are used together in an end-to-end machine learning pipeline, which starts with data ingestion and ends with model serving.
TFX is distributed as a Python package. You can install TFX itself (for example with pip install tfx), which pulls in the libraries it depends on, or install the individual libraries separately if you only need part of the toolkit.
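To see which of these packages are already present in your environment, a small check like the one below can help. It is only a sketch: the module names in the list are the import names these libraries are normally published under, so treat the list as an assumption and adjust it to your own setup.

```python
from importlib.util import find_spec

# Import names of TFX and some companion libraries (assumed; adjust as needed).
TFX_MODULES = [
    "tfx",
    "tensorflow_data_validation",
    "tensorflow_transform",
    "ml_metadata",
]


def installed_modules(names):
    """Map each top-level module name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in names}


for name, present in installed_modules(TFX_MODULES).items():
    print(f"{name}: {'installed' if present else 'missing'}")
```

Nothing is imported by this check, so it is quick to run even when the heavier libraries are installed.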
Is TFX Open Source?
Yes, TFX is an open source ML pipeline framework from Google. This means that the original source code can be shared and modified. TFX can help simplify pipeline definitions and reduce the amount of boilerplate code needed for each task.
TFX and Orchestrators
Orchestrators are systems that have been built to execute TFX pipelines. These platforms have the capabilities to schedule, author and monitor workflows. They’re used to ensure that each job is being executed at the right time with the correct input.
Examples of orchestrators include Apache Beam and Kubeflow. Kubeflow is discussed in more detail later.
The machine learning lifecycle has many stages, so TFX provides a set of components, each with different functionality. Depending on the lifecycle stage, you may wish to extend a component’s functionality or replace it altogether.
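To make the orchestrator’s job more concrete, here is a minimal sketch of the core idea: run each step only after the steps it depends on, and hand it their outputs. This is a toy written with the Python standard library (it needs Python 3.9 or later for graphlib), not the TFX or Kubeflow API, and the three step names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(components, dependencies):
    """Run each component once, after all of its upstream components.

    components:   dict mapping a name to a function(upstream_outputs) -> output
    dependencies: dict mapping a name to the set of names it depends on
    """
    outputs = {}
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        upstream = {dep: outputs[dep] for dep in dependencies.get(name, ())}
        outputs[name] = components[name](upstream)
    return order, outputs


# Hypothetical three-step pipeline: ingest -> transform -> train.
components = {
    "ingest": lambda _: [1.0, 2.0, 3.0],
    "transform": lambda up: [x * 2 for x in up["ingest"]],
    "train": lambda up: sum(up["transform"]) / len(up["transform"]),
}
dependencies = {"ingest": set(), "transform": {"ingest"}, "train": {"transform"}}

order, outputs = run_pipeline(components, dependencies)
print(order)             # ['ingest', 'transform', 'train']
print(outputs["train"])  # 4.0
```

A real orchestrator adds scheduling, retries and monitoring on top of this ordering logic.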
What Does MLOps Stand for?
Put simply, MLOps stands for Machine Learning Operations. It’s a key function of machine learning engineering, whether applied to call centre AI applications or one of its many other uses. It has collaborative functionality, with a focus on streamlining the process of taking machine learning models to production and monitoring them.
Machine learning is now used across many industries. Its lifecycle includes many components such as data ingest, data prep, model training and model monitoring. Working on machine learning requires a lot of collaboration between data engineering teams and data scientists. MLOps embraces experimentation and continuous improvement for machine learning.
TensorFlow Extended Pipelines
Let’s dive into TensorFlow Extended (TFX) pipelines. Google’s in-house platform was released publicly in early 2019 to help enterprises implement an industry-standard production ML system. It comes with a configuration framework and shared libraries for integrating common components, which also helps when it comes to monitoring machine learning pipelines.
Pipeline nodes are part of the TFX pipeline. They have been specifically created to conduct advanced metadata operations, such as flagging current ML metadata, using artifact properties.
The most popular pipeline node is the importer node. It is a special node that registers an external resource in the ML Metadata (MLMD) library, so that downstream nodes can use the registered artifact as input.
The main purpose of this node is to bring external artifacts (a schema, for instance) into the TFX pipeline, where they can be used by the Transform and Trainer components.
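The idea behind the importer node can be sketched in a few lines. The example below is a toy, in-memory stand-in and not the TFX or MLMD API: the class and function names are invented for illustration, and the schema path is hypothetical. It shows an external schema being registered once, and a downstream step resolving it from the store rather than from a hard-coded path.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    type_name: str
    uri: str
    properties: dict = field(default_factory=dict)


class ToyMetadataStore:
    """In-memory stand-in for a metadata store (not the MLMD API)."""

    def __init__(self):
        self._artifacts = []

    def register(self, artifact):
        self._artifacts.append(artifact)
        return len(self._artifacts)  # 1-based artifact id

    def latest(self, type_name):
        matches = [a for a in self._artifacts if a.type_name == type_name]
        return matches[-1] if matches else None


def import_artifact(store, type_name, uri, **properties):
    """Register an artifact that was produced outside the pipeline."""
    artifact = Artifact(type_name, uri, dict(properties))
    store.register(artifact)
    return artifact


def trainer(store):
    """Downstream step: resolves the schema from the store, not from a path."""
    schema = store.latest("Schema")
    if schema is None:
        raise LookupError("no Schema artifact has been registered")
    return f"training with schema at {schema.uri}"


store = ToyMetadataStore()
import_artifact(store, "Schema", "/data/schema/v1", source="hand-written")
print(trainer(store))  # training with schema at /data/schema/v1
```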
Using Kubeflow
What is Kubeflow? It’s an open-source platform that has been designed for developing and running ML workloads. In short, it can help to manage operations on chatbots, conversational AI, and other applications.
Kubeflow allows you to manage end-to-end ML pipeline processes. It runs workflows in different environments, whether on-premises or in the cloud, and it supports workflow visualisation. A workflow is often easier to manage when you can see the steps ahead.
So what kind of user issues can Kubeflow solve? It can help to efficiently manage and scale machine learning workflows. It offers tools and frameworks (such as TensorFlow) to streamline model development and training. It’s also used by data scientists who want to build ML pipelines and ML engineers who want to deploy ML systems.
Here are some of the components included in Kubeflow:
- Pipelines – Kubeflow Pipelines can be utilised for building machine learning workflows using Kubernetes.
- AutoML – Kubeflow can be used for automated machine learning (AutoML). It supports hyperparameter tuning, early stopping and neural architecture search.
- Model serving – This provides interfaces for frameworks such as TensorFlow.
- Model training – Kubeflow has a training operator that’s a unified interface for model training. It can run training jobs for popular frameworks including TensorFlow.
Environment Setup
Here’s a look at what you need to get started with TensorFlow Extended for MLOps.
Base Image
The first step is to build a Linux base image with a minimal set of required packages. It must be Linux, as it is the only platform that comes with TF-DF (TensorFlow Decision Forests) distributions. All of your downstream local deployments, such as Kubeflow, will build on this image.
Data and Model of Choice
You can start by building a simple pipeline and running it end to end with all the standard components in TensorFlow Extended. You can use any model you like, but keep it simple.
Pipeline Definition and Run
Using the TFX and TF-DF documentation, you can now run the pipeline inside the development container, and its outputs will be written to the local file system on your computer. Some things you may notice include:
- Changes to be made in categories
- Removing all rows with blank values
- How it would run in the real-world
It is interesting to see exactly how your dummy pipeline will run. From this, you can analyse the pipeline output to investigate the artifacts closely.
You can connect to the ML metadata (MLMD) database to find information about your latest run.
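If your local run was configured with a SQLite-backed metadata store, which is an assumption about your setup, a quick first look is to list the tables in the database file and count their rows. The sketch below uses only Python’s sqlite3 module. Because the table layout depends on your MLMD version, the demonstration runs against a tiny stand-in database with a made-up table, not a real MLMD file.

```python
import os
import sqlite3
import tempfile


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Stand-in database with a made-up table, used only to demonstrate the function.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "metadata.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE demo_artifact (id INTEGER PRIMARY KEY, uri TEXT)")
    conn.executemany(
        "INSERT INTO demo_artifact (uri) VALUES (?)", [("a",), ("b",)]
    )
    conn.commit()
    conn.close()
    counts = table_row_counts(path)

print(counts)  # {'demo_artifact': 2}
```

For anything beyond a quick look, it is safer to go through the ML Metadata library itself than to read the tables directly.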
TFX Libraries
Orchestration is a TFX module. It is required to coordinate components and manage pipelines. Businesses use the orchestrator’s administration interface to trigger tasks and monitor components.
TFX is built on the TensorFlow libraries, which can also be used to create Python user-defined functions. TFX extends the functionality of the TF libraries by offering reusable building blocks known as standard components. This means that, in theory, you can construct solid pipelines with relatively little code.
Let’s take a look at TFX libraries that are available:
1. TensorFlow Data Validation (TFDV)
TensorFlow Data Validation is a library for analysing and validating machine learning data. Its features include scalable calculation of summary statistics and a schema viewer. If you want to analyse data in more detail, it also offers dataset comparison.
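The core idea of schema-based validation fits in a short sketch: infer a schema from training data, then check new data against it. The code below is a plain-Python toy, not the TFDV API, and the column names and rows are made up for illustration.

```python
def infer_schema(rows):
    """Infer, per column, the value type and whether it is always present."""
    schema = {}
    for column in sorted({key for row in rows for key in row}):
        present = [row[column] for row in rows if row.get(column) is not None]
        schema[column] = {
            "type": type(present[0]).__name__ if present else None,
            "required": len(present) == len(rows),
        }
    return schema


def find_anomalies(rows, schema):
    """List (row_index, column, problem) for rows that violate the schema."""
    anomalies = []
    for index, row in enumerate(rows):
        for column in row:
            if column not in schema:
                anomalies.append((index, column, "unexpected column"))
        for column, spec in schema.items():
            value = row.get(column)
            if value is None:
                if spec["required"]:
                    anomalies.append((index, column, "missing value"))
            elif spec["type"] is not None and type(value).__name__ != spec["type"]:
                anomalies.append((index, column, "wrong type"))
    return anomalies


training_rows = [{"age": 34, "city": "Leeds"}, {"age": 51, "city": None}]
serving_rows = [{"age": "unknown", "city": "York"}, {"city": "Bath", "zip": "BA1"}]

schema = infer_schema(training_rows)
anomalies = find_anomalies(serving_rows, schema)
print(anomalies)
# [(0, 'age', 'wrong type'), (1, 'zip', 'unexpected column'), (1, 'age', 'missing value')]
```

Here "city" is inferred as optional because one training row has no value for it, so a missing city in new data is not flagged.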
2. TensorFlow Transform (TFT)
TensorFlow Transform is a library for using TensorFlow to pre-process data. This means you can keep all operations in one place instead of having to use another system to pre-process data and then move to TensorFlow.
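The benefit of keeping pre-processing in one place is easiest to see with a feature that needs a full pass over the data, such as z-score scaling. The sketch below is plain Python, not the TensorFlow Transform API. It computes the statistics once from the training data and then applies exactly the same transformation at training and at serving time. The function names and values are illustrative.

```python
from statistics import fmean, pstdev


def analyze(values):
    """Full pass over the training data: compute the statistics needed later."""
    return {"mean": fmean(values), "std": pstdev(values)}


def transform(value, stats):
    """Apply the same z-score scaling at training and at serving time."""
    if stats["std"] == 0:
        return 0.0
    return (value - stats["mean"]) / stats["std"]


training_values = [2.0, 4.0, 6.0, 8.0]
stats = analyze(training_values)  # mean is 5.0

scaled_training = [transform(v, stats) for v in training_values]
serving_value = transform(5.0, stats)  # reuses the training statistics
print(serving_value)  # 0.0
```

Because the serving path reuses the stored statistics instead of recomputing them, the model sees features on the same scale in both settings.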
3. KerasTuner
KerasTuner is a library for tuning model hyperparameters, so it is mainly used during model training. It helps you pick the best set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning application is called hyperparameter tuning, or hypertuning.
So, what are hyperparameters? They are variables that control the training process of an ML model, and they remain the same throughout the process. Model hyperparameters, such as the number and width of hidden layers, influence model selection. Algorithm hyperparameters, such as the learning rate, influence the speed and quality of the learning algorithm.
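To show what a tuner does at its simplest, the sketch below runs a grid search over two algorithm hyperparameters (learning rate and number of steps) for a toy problem: minimising (w - 3)^2 with gradient descent. It is plain Python and not the KerasTuner API. The search space is made up, and real tuners use smarter search strategies than trying every combination.

```python
from itertools import product


def train(learning_rate, steps):
    """Minimise (w - 3)^2 by gradient descent from w = 0; return final loss."""
    w = 0.0
    for _ in range(steps):
        gradient = 2 * (w - 3)
        w -= learning_rate * gradient
    return (w - 3) ** 2


def grid_search(search_space):
    """Try every combination and return (best_params, best_loss)."""
    names = list(search_space)
    best_params, best_loss = None, float("inf")
    for combo in product(*(search_space[name] for name in names)):
        params = dict(zip(names, combo))
        loss = train(**params)
        if loss < best_loss:  # strict comparison keeps the first of any ties
            best_params, best_loss = params, loss
    return best_params, best_loss


search_space = {"learning_rate": [0.01, 0.1, 0.5, 1.1], "steps": [10, 50]}
best_params, best_loss = grid_search(search_space)
print(best_params, best_loss)  # {'learning_rate': 0.5, 'steps': 10} 0.0
```

Note how a learning rate of 1.1 makes training diverge on this problem, which is exactly the kind of setting a tuner helps you avoid.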
4. TensorFlow Metadata (TFMD)
TensorFlow Metadata provides standard representations for metadata that are useful when training machine learning models. It can be used alongside ML Metadata for lineage tracking and component exchange.
It may be produced manually or automatically during input data analysis.
Metadata formats include:
- A schema for tabular data.
- A problem statement laying out some of the objectives of a model.
- Summary statistics over such datasets.
5. ML Metadata (MLMD)
ML Metadata is essentially a library for storing and fetching metadata. This metadata is often used by machine learning developers, data scientists and other people who work with big data.
Thanks to its libraries, TFX is scalable and fast. It can also run effectively on both streaming and batch pipelines.
Using TFX for MLOps
End-to-end machine learning systems are becoming more commonly used. Using TensorFlow for MLOps can be a learning process, and building pipelines is not a straightforward task. It also requires a lot of knowledge and experience around TFX.
However, it is a great tool to have if you want to deploy a machine learning model, and ML is one of the fundamental concepts for developers working in the industry. It can also help your business grow as you learn more about your business model and what works, such as how to build a QoS analytics solution for streaming video services with strong data backing. As TFX was created by Google and is still used there internally, it is a popular way of compiling and using metadata among machine learning developers and some data scientists.