The last decade has seen exponential growth in deep learning capabilities and their application in research and development. Traditionally, deep learning as a discipline has been limited to those with considerable training and knowledge of machine learning and AI. Fortunately, we’ve seen growing efforts to democratize deep learning, such as the creation of the Ludwig toolbox, an open source deep learning tool built on top of TensorFlow that allows users to train and test deep learning models without writing code.
Piero Molino is a Senior Research Scientist at Uber AI with a focus on machine learning for language and dialogue. I spoke with him before his presentation at Codemotion’s online conference, The Italian Edition, to find out more.
What are the origins of Ludwig toolbox?
Piero describes Ludwig toolbox as “a project that got started quite a while ago. There’s probably still some line of code in the current code base that comes from a project that I worked on when I was at this startup called Geometric Intelligence. We were trying to do visual question answering: given images and questions about them, answering those questions. For instance, is there a cat in this image, or who’s jumping over the boom in the image?
“We wanted to compare different models for doing this task, so I started to create the abstractions that are there now in Ludwig.”
Uber acquired Geometric Intelligence in 2016, and Piero details, “While at Uber, I was tasked to solve a bunch of different machine learning problems. One was in customer support, another was graph neural networks for recommender systems, and another was a dialogue system. So I had many different tasks. And I tried to reuse the code that I wrote when I was at Geometric Intelligence, and make it more general so it could be applied to all these different tasks. It was just me building tools for myself, making my life easier.”
The code Piero was writing was available inside the company, and other people started using it internally. After a couple of years, in February 2019, he decided to make it open source so people could use it externally. He recalls, “In the last year and a half I’ve kept on improving it and updating it. A new version was released about a month ago which introduces a bunch of new features.”
What was your motivation for making it open source as opposed to proprietary?
Piero notes that Ludwig toolbox is built on top of other open source libraries such as TensorFlow, scikit-learn, pandas and spaCy. “On the one hand, this was a way to give back to the community. On the other hand, I was working at Uber, and Uber is not a company that sells machine learning platforms. So there was no real advantage to keeping it proprietary. Open source means that other people from the community could use it and also improve it. So I felt like it was win-win.”
The open source release has resulted in some unexpected use cases. Researchers in biology, for example, have used it to analyze images of worms, a task that would otherwise be out of reach because they lack the expertise to build deep learning models themselves.
“The software means you don’t have to write your own machine learning model; Ludwig does it for you. You can do it through a command line interface, so you don’t have to write code. One command can do as much as 500 lines of handcrafted TensorFlow or PyTorch code.”
Piero describes Ludwig in terms of declarative machine learning rather than no-code, “because you’re just declaring what model you want. Saying, these are my inputs, and these are my outputs. And then Ludwig figures out how to write the model for you depending on those.”
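To make the declarative idea concrete, here is a minimal Python sketch. It is a toy illustration only, not Ludwig's actual implementation: the `plan_model` function, the component names and the type-to-component tables are all hypothetical, and the `input_features` / `output_features` keys are simply one plausible way to write down "these are my inputs, and these are my outputs".

```python
# Toy illustration of declarative model building.
# All names below are hypothetical and are NOT Ludwig's real internals.

# Assumed lookup tables: declared feature type -> default component name.
DEFAULT_ENCODERS = {
    "text": "text_encoder",
    "image": "image_encoder",
    "category": "embedding",
}
DEFAULT_DECODERS = {
    "category": "classifier",
    "numerical": "regressor",
    "text": "generator",
}


def plan_model(config):
    """Turn a declaration of inputs and outputs into an ordered list of parts."""
    plan = []
    # One encoder per declared input, chosen from its type.
    for feature in config["input_features"]:
        plan.append((feature["name"], DEFAULT_ENCODERS[feature["type"]]))
    # A single step that merges all encoded inputs.
    plan.append(("combiner", "concat"))
    # One decoder per declared output, chosen from its type.
    for feature in config["output_features"]:
        plan.append((feature["name"], DEFAULT_DECODERS[feature["type"]]))
    return plan


# The user only declares what goes in and what comes out,
# here for a visual question answering task.
config = {
    "input_features": [
        {"name": "image_path", "type": "image"},
        {"name": "question", "type": "text"},
    ],
    "output_features": [{"name": "answer", "type": "category"}],
}

model_plan = plan_model(config)
for name, component in model_plan:
    print(f"{name}: {component}")
```

The point of the sketch is the division of labour: the declaration carries no modelling code at all, and every architectural decision is derived from the declared feature types.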
What have been the biggest challenges in evolving Ludwig?
According to Piero, the community aspect and management of expectations have been challenging. “As a project attached to a company like Uber, the expectations from the users, particularly in the beginning, were really high in terms of how fast I could answer requests for adding features or requests for solving issues. I tried to do my best, but basically, it was just me and a small number of people who later assisted, not a huge team.
“In the beginning, I was checking and answering messages every hour. I learned that that’s probably not sustainable. Now I’m a little bit more disciplined in doing that. But at the same time, there have been other people emerging who really liked the project and wanted to assist, so that has been very beneficial.”
From a technical point of view, Piero notes that the biggest hurdle was “probably the shift between TensorFlow 1 and TensorFlow 2, because I started developing Ludwig during TensorFlow 1. So the whole code base was structured around the abstraction of TensorFlow. TensorFlow 2 changes the abstraction, so it took quite some time to adapt.”
Fortunately, the effort was worth it: TensorFlow 2 offered significant advantages in terms of code structure, such as extensibility, general quality and the ease of dealing with the underlying TensorFlow layer.
What’s next for Ludwig toolbox?
Ludwig’s design makes it open to improvement: it provides, to a certain extent, interfaces that allow people to “add their own models, their own features, add new optimizers, new learning rate schedulers. All these things can be plugged into Ludwig relatively easily.
“Maintainers can be used in many different ways: we can add more models and features, or we can improve what is there, making it more scalable, making it faster, etc.”
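One common way to get this kind of plug-in behaviour is a registry, where new components announce themselves under a name and the rest of the code looks them up by that name. The sketch below shows the general pattern in Python under stated assumptions: `OPTIMIZERS`, `register_optimizer`, `build_optimizer` and the `SGD` class are hypothetical names invented for illustration and do not describe Ludwig's real interfaces.

```python
# Generic registry pattern for pluggable components.
# Every name here is hypothetical; this is not Ludwig's actual API.

OPTIMIZERS = {}


def register_optimizer(name):
    """Class decorator that makes an optimizer available under `name`."""
    def decorator(cls):
        OPTIMIZERS[name] = cls
        return cls
    return decorator


@register_optimizer("sgd")
class SGD:
    """Plain gradient descent on a single scalar weight."""

    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate

    def step(self, weight, gradient):
        # Move the weight against the gradient, scaled by the learning rate.
        return weight - self.learning_rate * gradient


def build_optimizer(name, **kwargs):
    """Look up a registered optimizer by name and instantiate it."""
    if name not in OPTIMIZERS:
        raise ValueError(f"Unknown optimizer: {name!r}")
    return OPTIMIZERS[name](**kwargs)


# A contributor adds a new optimizer by registering one more class;
# the calling code only ever refers to it by name.
optimizer = build_optimizer("sgd", learning_rate=0.5)
new_weight = optimizer.step(weight=1.0, gradient=1.0)
print(new_weight)
```

With this design, adding a learning rate scheduler or a new model would follow the same shape: a second registry and a second decorator, with no changes to the code that consumes them.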
Piero shares that they are currently pursuing scalability in terms of data pipelines, pre-processing and interaction with data. “This is important because that will basically enable it to be used in industrial contexts where there is a huge amount of data. Right now, those kinds of tasks are not very well supported. What is better supported is the use case of a data scientist trying to solve their own problems, rather than a company that has a terabyte of data sitting on a remote machine and wants to train a model on it.”
Community involvement is most welcome – Piero hopes that interested people will help to add new models and features, especially once the large-scale use cases are in operation.
Want to learn more about Ludwig toolbox?
Join Piero and learn more about the deep learning toolbox at The Codemotion Virtual conference: The Italian Edition, held November 3-5, from 14:00 to 19:00 CET.
A single ticket grants you attendance to four conferences spread over the week, offering a deep dive into a plethora of topics relating to Backend, Frontend, Emerging Technologies, and AI / ML / DL. It’s a fantastic opportunity to learn first-hand about the best state-of-the-art technology, activities, good practices, and case studies for everyone working in tech regardless of your profile or your level of experience.