Building AI Models Like Open Source

This paper is short and readable, compared to many in the field. It also gave me a better sense of scale in our industry: I learned that the cost to train GPT-3 ran to millions of dollars.

The ability to incorporate open source packages into a piece of software allows developers to easily add new functionality. This kind of modular reuse of subcomponents is currently rare in machine learning models. In a future where continuously improved and backward-compatible models are commonplace, it might be possible to greatly improve the modularity of machine learning models. For example, a core “natural language understanding” model could be augmented with an input module that allows it to process a text in a new language, a “retrieval” module that allows it to look up information in Wikipedia, or a “generation” output module that allows it to conditionally generate text. Including a shared library in a software project is made significantly easier by package managers, which allow a piece of software to specify which libraries it relies on. Modularized machine learning models could also benefit from a system for specifying that a model relies on specific subcomponents. If a well-defined semantic versioning system is carefully followed, models could further specify which version of their dependencies they are compatible with.
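To make the package-manager idea concrete for myself, here is a minimal sketch of how a model might declare its subcomponents and check versions. Everything in it is an assumption on my part: the `ModelManifest` format, the module names, and the simplified compatibility rule (same major version, at least the stated minimum) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


def parse_version(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def is_compatible(installed, minimum):
    """Simplified semver rule: same major version, and not older than minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    return have[0] == need[0] and have >= need


@dataclass
class ModelManifest:
    """Hypothetical manifest: a model plus the submodules it relies on."""
    name: str
    version: str
    # submodule name -> minimum compatible version
    requires: dict = field(default_factory=dict)


def unmet_dependencies(manifest, available):
    """Return the names of required submodules that are missing or incompatible."""
    problems = []
    for dep, minimum in manifest.requires.items():
        installed = available.get(dep)
        if installed is None or not is_compatible(installed, minimum):
            problems.append(dep)
    return problems


# Illustrative names only; no such modules exist.
core = ModelManifest(
    name="nlu-core",
    version="2.1.0",
    requires={"retrieval-wikipedia": "1.2.0", "input-new-language": "3.0.0"},
)
available = {"retrieval-wikipedia": "1.4.2", "input-new-language": "4.0.0"}
missing = unmet_dependencies(core, available)
print(missing)
```

Here the retrieval module satisfies its constraint, while the input module is flagged because its major version changed, which under semantic versioning signals a breaking change.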

But mostly it left me wondering, “how and when?”

Me and ChatGPT

This is not a post about the widely publicized ChatGPT hallucinations, which appear to have been a momentary glitch, easily rolled back and resolved. This is something I experienced about a week earlier. I use ChatGPT primarily to help with programming questions — particularly those that are too general or broad to be easily searched on StackOverflow and the like. But occasionally, I get bored and try things just to see what will happen…

Last week I decided to go back to an exercise I had first tried about two years earlier, with an LLM on a different service. We played an imaginary game of blackjack.

Excerpt from ChatGPT interaction

 

Something surprising happened. ChatGPT tried repeatedly to “take over” as dealer, rather than waiting for me to announce the cards. I don’t know whether to categorize this as a “hallucination,” or just an unexpected result in an unscripted interaction.

Link to full conversation.

I found the entire exchange fascinating and rewarding, at a deep and existential level. Next time, we’ll have to talk about Plato and the Cave.

What Is Truth?

Defining truth is a task impossible for mere mortals. Defining facts is a good deal easier.

Determining whether a statement is provable is still a subjective exercise, but one familiar to journalists and juries. The distinction also has economic value. To give just a few examples, it plays a role in determining insurance premiums and whether medical research does or does not get funded. Distinguishing between positive and normative statements is essential for figuring out budgets: where schools and hospitals will be built, and the funds allocated to police and first responders. In other words, the ability to understand and recognize facts affects people’s lives and health directly.

Can an AI be taught to recognize the same types of distinctions? I would argue that it most certainly can. It probably doesn’t even require an LLM. Just a simple text classification model.

Here’s how I would do it.

We have a hypothetical function:

is_provable(x)

Input: String of up to 1000 characters in length.

Output: Boolean

Goal: Evaluate whether this string fits the definition of a factual statement (i.e. one that can be verified as either true or false).

It should be possible to train a text classification model to evaluate different statements and determine whether they are likely to be provable or impossible to prove.

Examples of provable statements:

Quantitative Expressions – Ex. “The population of the United States is 330 million people.”

Comparative Statements – Ex. “The Nile is the longest river in Africa.”

Direct Quotations – Ex. “John F. Kennedy told the people of Berlin, ‘Ich bin ein Berliner.’”

Descriptions of Past or Present Events – Ex. “On June 6, 1944, Allied forces landed on the beaches of Normandy.”

In general, data that can be cited or attributed may be considered factual. However, this depends on trust in the methods and judgment of those compiling the information source.

I need to stress that the goal here is not to determine whether the statement itself is true or false. It is only to predict whether the statement can be verified at all (i.e. whether it is “a fact”).
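As a rough illustration of the idea, here is a minimal sketch of is_provable(x) as a tiny naive Bayes text classifier, written with only the Python standard library. It is a toy under stated assumptions, not a real implementation: the training set has six sentences, the "provable" ones are the examples quoted above, and the "not provable" ones are invented here purely for illustration. The helper names (tokenize, train, MODEL) are also my own.

```python
import math
import re
from collections import Counter

MAX_CHARS = 1000  # input limit from the spec above

# Toy training set. The True examples are quoted from this post; the False
# examples are invented for illustration and come from no real dataset.
TRAINING_DATA = [
    ("The population of the United States is 330 million people.", True),
    ("The Nile is the longest river in Africa.", True),
    ("On June 6, 1944, Allied forces landed on the beaches of Normandy.", True),
    ("Blackjack is the most beautiful game ever invented.", False),
    ("Everyone should read more poetry.", False),
    ("Life would be better if people trusted each other.", False),
]


def tokenize(text):
    """Lowercase word tokens; any run of digits becomes the token '<num>'."""
    raw = re.findall(r"[a-z]+|\d+", text.lower())
    return ["<num>" if tok.isdigit() else tok for tok in raw]


def train(examples):
    """Count tokens and documents per class for a multinomial naive Bayes model."""
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, class_counts


MODEL = train(TRAINING_DATA)


def is_provable(x, model=MODEL):
    """Predict whether x looks like a verifiable statement (not whether it is true)."""
    if len(x) > MAX_CHARS:
        raise ValueError(f"input longer than {MAX_CHARS} characters")
    word_counts, class_counts = model
    vocab = set(word_counts[True]) | set(word_counts[False])
    total_docs = sum(class_counts.values())
    tokens = tokenize(x)
    scores = {}
    for label in (True, False):
        total_tokens = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # class prior
        for tok in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero out a class.
            score += math.log((word_counts[label][tok] + 1) / (total_tokens + len(vocab)))
        scores[label] = score
    return scores[True] > scores[False]
```

Calling is_provable("The Nile is the longest river in Africa.") returns a Boolean, as the spec asks. With six training sentences the model has memorized vocabulary rather than learned anything about provability, so its answers on new statements mean very little. A useful version would need the kind of labeled data described below.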

Not every important statement can be proven. For example, scientific hypotheses are not provable, because future evidence may call them into question — but summaries of experimental results certainly qualify as factual statements.

Of course, there are plenty of rabbit holes and pitfalls with this approach. I should emphasize that I am proposing something more akin to sentiment analysis than a precise epistemology. The reason this exercise might be at all worthwhile is that at the end of the day, it could be used to build a repository with some very interesting and relevant applications.

Wikipedia would be the obvious place to go for training data. A pool of human undergraduates (pre-law, economics, psychology, and philosophy) could provide a secondary source of validation data. Members of debate clubs would be ideal candidates.

Yes, we live in the era of fake news. Stock valuations are difficult to pin down. The concept of calling a politician or corporate leader out for telling a lie seems quaint and old-fashioned. In practice, we are overwhelmed with information. Trust is a vanishing commodity. Training an AI model to distinguish between factual and non-factual statements will not restore that trust. What it will do is allow collection of a dataset that may form the basis of a worldview: one defined by human standards, but accessible and interpretable by machines. (And by the way, we are certainly talking about an LLM here, one trained with a massive number of parameters.)

To some, that prospect may seem frightening. But I would argue that if unchecked, reliance on AI systems incapable of distinguishing between fact and fantasy could result in far greater harm.

This Poem Really Spoke to Me

It was written in 1981 (the author died in 2017) but feels emblematic of the present day.

Hard Times

by John Ashbery
 
Trust me. The world is run on a shoestring.
They have no time to return the calls in hell
And pay dearly for those wasted minutes. Somewhere
In the future it will filter down through all the proceedings

But by then it will be too late, the festive ambience
Will linger on but it won’t matter. More or less
Succinctly they will tell you what we’ve all known for years:
That the power of this climate is only to conserve itself.

Whatever twists around it is decoration and can never
Be looked at as something isolated, apart. Get it? And
He flashed a mouthful of aluminum teeth there in the darkness
To tell however it gets down, that it does, at last.

Once they made the great trip to California
And came out of it flushed. And now every day
Will have to dispel the notion of being like all the others.
In time, it gets to stand with the wind, but by then the night is closed off.

Iterations of an Iteration

My ex-partner and I have a running joke about being trapped in a simulation.

Or is it a joke? Hard to say.

That conversation thread inspired me to watch “Hang the DJ” from Black Mirror a second time. I think after “San Junipero,” it is probably my second favorite episode. I picked up a few things I hadn’t noticed on the first viewing… like the way the script calls attention to the number “44” (the episode is season 4, episode 4, and four is the number associated with death in Chinese tradition).

It does seem to me that “The System” worked pretty much the way that a dating simulation of the interaction of two personalities actually would work. So much so that I actually had to try a variant with ChatGPT.

It turned out to be sexist drivel, but something I could certainly see a PR agency serving up to a client. What else could I do but iterate further? On the second prompt, I specifically requested non-sexist dialogue. And what I got was a brand new conversation about iteration in software development… this time between two men.

 

>>
PROMPT #1: Generate sample dialogue for a screenplay about iteration.

ChatGPT

INT. TECH STARTUP OFFICE – DAY

The office is buzzing with energy as JASON, a young and ambitious software developer, sits at his desk, typing furiously on his laptop. His colleague, LISA, approaches with a cup of coffee.

LISA (leaning over) Hey, Jason, what’s got you so focused?

JASON (grinning) I’m working on the latest iteration of our app. It’s going to be a game-changer.

LISA (excited) Really? What’s new this time?

JASON (looking at his screen) We’ve revamped the user interface, made it more intuitive. And check this out – we’ve added a feature that allows users to customize their profiles in ways we never thought of before.

LISA (impressed) Nice! But didn’t we just release an update last month?

JASON (nodding) Yeah, but that’s the thing about iteration. It never stops. We learn from each release, gather feedback, and keep improving.

LISA (smiling) So, it’s like a never-ending cycle of upgrades?

JASON Exactly! Each iteration takes us a step closer to perfection. It’s all about refining, optimizing, and giving our users the best experience possible.

LISA (sipping her coffee) I love it. But how do you handle the pressure? The constant need to improve?

JASON (grinning) Pressure is part of the game. It’s what keeps us on our toes. Plus, seeing the impact of our work and making something better each time? That’s the real reward.

LISA (looking around the office) Well, everyone seems pretty engaged in their work. I guess iteration is the heartbeat of this place.

JASON (nodding) Absolutely. It’s not just about the product; it’s about the journey of making it better every single day. And hey, speaking of which, I’ve got to get back to this iteration. The code won’t write itself.

LISA (laughing) Alright, go work your magic. Can’t wait to see what this version brings.

Jason gets back to typing, and Lisa walks away, inspired by the constant pursuit of improvement in the world of iteration.

 

>>
PROMPT #2: Generate non-sexist dialogue for a screenplay about iteration.

ChatGPT

INT. SOFTWARE DEVELOPMENT OFFICE – DAY

The development team is busy at work, typing away on their keyboards. ALEX, a seasoned programmer, is sitting at a desk reviewing code. JORDAN, a junior developer, approaches with a notebook in hand.

JORDAN Hey, Alex, mind if I bounce some ideas off you for the next iteration?

ALEX Sure, Jordan, shoot. What are you thinking?

JORDAN (smiling) I was thinking about streamlining the onboarding process for new users. The feedback from our last release pointed out some confusion.

ALEX (nodding) That’s a great focus. Simplifying onboarding is always a win. What’s your plan?

JORDAN I thought about breaking it down into smaller steps and adding tooltips for guidance. It might make the whole process more user-friendly.

ALEX (leaning back) Nice approach. It aligns with our user-centric philosophy. Have you considered how it might affect the overall user experience?

JORDAN I did. I think it will make the initial interaction smoother without overwhelming users with information.

ALEX Good thinking. And how do you plan to gather feedback during this iteration?

JORDAN I was thinking of setting up a quick survey for users who go through the onboarding. Something short and focused on their experience.

ALEX Smart move. Continuous feedback is key to refining our work. Remember, iteration isn’t just about the code; it’s about understanding the user journey.

JORDAN Absolutely. I appreciate your input, Alex.

ALEX No problem, Jordan. It’s all about collaboration. If you need any help with implementation, feel free to reach out.

JORDAN Thanks, Alex. I’ll definitely take you up on that. Let’s make this iteration our best one yet.

They both get back to work, each motivated to contribute to the success of the next iteration.