Building AI Models Like Open Source

This paper is short and readable, compared to many in the field. It also gave me a better sense of scale in our industry — learned that the cost to train GPT-3 ran to millions of dollars.

The ability to incorporate open source packages into a piece of software allows developers to easily add new functionality. This kind of modular reuse of subcomponents is currently rare in machine learning models. In a future where continuously improved and backward-compatible models are commonplace, it might be possible to greatly improve the modularity of machine learning models. For example, a core “natural language understanding” model could be augmented with an input module that allows it to process a text in a new language, a “retrieval” module that allows it to look up information in Wikipedia, or a “generation” output module that allows it to conditionally generate text. Including a shared library in a software project is made significantly easier by package managers, which allow a piece of software to specify which libraries it relies on. Modularized machine learning models could also benefit from a system for specifying that a model relies on specific subcomponents. If a well-defined semantic versioning system is carefully followed, models could further specify which version of their dependencies they are compatible with.

But mostly it left me wondering, “how and when?”

Leave a Reply

Your email address will not be published. Required fields are marked *