Thank you for joining us for today’s 5 Minute Tech Challenge! We’re so glad you’re a part of our community. Today, we learn from Haroon Choudery, co-founder and CEO of Autoblocks, a foundation model monitoring startup. Take it away, Haroon!
Ever heard of ChatGPT? I know, I know — that probably seems like a rhetorical question. Chances are you’ve heard of, if not tried, the fastest-growing app of all time!
But ChatGPT doesn’t only signal the arrival of really great chatbots (although that’s true). The implications are much broader than that. Foundation Models, like the Large Language Models behind ChatGPT, are one of the most incredible technologies we’ve ever seen, and I’m here to tell you more about them.
Foundation Models vs Traditional Machine Learning
But what makes Foundation Models (FMs) like ChatGPT so special? Let’s compare them to traditional machine learning (ML) models to find out.
Firstly, they’re trained differently from traditional AI models. Whereas traditional ML models (specifically, supervised ML models) are trained on labeled data with human supervision, FMs are trained on vast amounts of data using a process called self-supervised learning. This means FMs can learn from unlabeled data, with no human annotation required, and generalize the patterns they find, which makes them highly adaptable to different tasks and industries.
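To make the “no labels needed” idea concrete, here’s a toy Python sketch. It’s purely illustrative (real FMs are transformer networks trained on enormous text corpora, not word-count tables), but it shows how the training signal comes from the raw text itself rather than from human labels:

```python
from collections import defaultdict

# Toy "self-supervised" objective: predict the next word from raw, unlabeled text.
# The training target (the next word) comes from the data itself, so no human
# annotation is required.
raw_text = "foundation models learn patterns from unlabeled text so foundation models adapt"
words = raw_text.split()

next_word_counts = defaultdict(lambda: defaultdict(int))
for current, following in zip(words, words[1:]):
    next_word_counts[current][following] += 1  # the "label" is just the next word

def predict_next(word):
    """Return the most frequently observed follower of `word`, if any."""
    candidates = next_word_counts[word]
    return max(candidates, key=candidates.get) if candidates else None

print(predict_next("foundation"))  # -> "models", learned without any labeled data
```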
Traditional models are like students who have only studied one subject and can only solve problems related to that subject. Foundation Models, on the other hand, are like super students (aka “overachievers”) who can learn from a lot of data without the need for a teacher to show them the right answer. This makes them highly adaptable to different tasks and industries.
This adaptability is well illustrated by FMs’ capacity for few-shot learning, which allows them to pick up new tasks or concepts from just one or a few examples. The model can quickly adapt to new scenarios and perform well with limited training data.
Let’s walk through an example. Say you have an FM that has been trained to generate text in a specific domain, such as product descriptions for an e-commerce website. With few-shot learning, you could provide the model with just a few examples of a new product, and it would generate a description based on its existing knowledge of the domain. This works because the FM has a general understanding of product descriptions (including their language and style) and can quickly adapt to new products.
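In practice, a few-shot prompt is often nothing more than a handful of examples packed into the input text. Here’s a minimal sketch (the store, products, and wording are all made up):

```python
# Few-shot prompting: the "training" is just a couple of example product
# descriptions placed directly in the prompt. Any text-completion or chat API
# could consume this string; no model weights are changed.
few_shot_prompt = """Write a product description in our store's style.

Product: Stainless steel water bottle, 750 ml
Description: Stay hydrated in style with this sleek, double-walled bottle that keeps drinks cold for 24 hours.

Product: Organic cotton tote bag
Description: Carry your essentials sustainably in this roomy, machine-washable tote made from 100% organic cotton.

Product: Bamboo cutting board with juice groove
Description:"""

# Send `few_shot_prompt` to a foundation model and it will typically continue
# with a description for the cutting board in the same tone, even though it was
# never explicitly trained on this store's catalog.
print(few_shot_prompt)
```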
Few-shot learning is a valuable feature of FMs because it lets them be applied across industries and use cases where data is scarce or difficult to obtain. Going back to the analogy of FMs as overachievers: imagine an overachiever who needs minimal training to get really good at new challenges (we all know someone like this). That’s an FM.
What challenges do Foundation Models have?
Although it may seem like we’re on the brink of AGI and a robot takeover, it’s worth understanding that there are some tasks that Foundation Models may not be good at out of the box. For example, a generic FM like ChatGPT may not perform well on tasks that require a deep understanding of a specific industry or domain, such as legal or medical terminology.
However, one powerful aspect of FMs is their capability to be fine-tuned with domain-specific or task-specific training data. This makes them highly adaptable and capable of performing well on a wide range of tasks.
This is a little different from the few-shot learning we described above. Few-shot learning means including examples in the prompt after the model has been trained, while fine-tuning involves training the FM further on domain-specific or task-specific data. Fine-tuning can give much better results than few-shot learning, but it requires access to niche data; few-shot learning can be sufficient for tasks that only require a specific style or tone.
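For contrast with the few-shot prompt above, here’s a rough sketch of what fine-tuning can look like, using the open-source Hugging Face libraries as a stand-in. The base model, the two example sentences, and the hyperparameters are purely illustrative; a real fine-tune would use far more domain data:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# A small, generic base model stands in for a foundation model here.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Niche, domain-specific text (here: our store's catalog copy) that we want the
# model to internalize by updating its weights.
domain_texts = [
    "Stay hydrated in style with this sleek, double-walled stainless steel bottle.",
    "Carry your essentials sustainably in this roomy, machine-washable organic cotton tote.",
]
train_dataset = Dataset.from_dict({"text": domain_texts}).map(
    lambda example: tokenizer(example["text"], truncation=True), remove_columns=["text"]
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="store-copy-model", num_train_epochs=1),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # unlike few-shot prompting, this actually updates the model's weights
```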
So why is this important? Well, FMs can be used to create really great chatbots, like ChatGPT, but they can also be used for a lot of other things: they can help businesses generate enterprise-grade marketing copy or create specific types of art. As the field of Foundation Models continues to advance, we can expect them to cover more specific use cases.
Whereas traditional ML models were great specialists, FMs can flex between being generalists and specialists, or even act as both at once!
In summary, Foundation Models like the Large Language Models behind ChatGPT are one of the most incredible technologies we’ve ever seen. They are highly adaptable, capable of performing a wide range of tasks, and valuable tools for businesses in various sectors. Few-shot learning and fine-tuning on domain-specific or task-specific data allow them to perform well across an enormous range of use cases, including many that have yet to be uncovered. With FMs, we have a new era of AI that can revolutionize the way we work and solve complex problems.
Coming up in two weeks, we will be learning from Dave Holmes-Kinsella, a full-stack player-coach data scientist at Synctera who builds new things from scratch & makes existing things better. Dave will be teaching us about Window Functions in SQL: Why order by and limit are the work of the very devil himself! I can’t wait to learn more.