
A Primer on AI

I’ve spent the last few years studying AI

Like you, I had heard of LLMs, GPT, agents, and ML, but wanted a full picture that connected the dots. Before we dive in, we first need to define intelligence

I like Max Tegmark’s definition: “Intelligence is the ability to achieve complex goals”

This definition matters because every AI system is, at bottom, a machine for achieving a complex goal. And AI has been around far longer than we think

The 1950s brought the first big wave, with early attempts to model artificial neurons

Alan Turing proposed his famous test for whether a machine could convincingly imitate human intelligence

Herbert Simon predicted that within 20 years, machines would be capable of doing any work a man can do. It was not to be: AI entered its first winter in the 1970s

AI's pioneers were especially interested in machine learning (ML): building a model that produces useful outputs from inputs, learned from "training data"

The goal is to find statistical patterns in a training set that generalize beyond it. Machine learning does this in three main ways.

The 3 are: 1) Supervised learning: both inputs and outputs (labels) are provided, and the model learns to predict outputs for new inputs 2) Unsupervised learning: only inputs are provided, with no categories or labels, and the model finds structure on its own 3) Reinforcement learning: no labelled examples at all; the model learns by acting and receiving rewards or penalties
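The supervised case above can be sketched in a few lines. This is an illustrative toy, not from the article: a one-parameter least-squares fit that learns a rule from labelled examples and then predicts an output for an input it never saw.

```python
# Toy supervised learning: learn a rule from labelled examples,
# then predict outputs for inputs the model has never seen.

def fit_line(xs, ys):
    """Fit y = w * x by minimising squared error (one-parameter least squares)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Training data: inputs paired with known outputs (the "labels").
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]   # hidden rule: y = 2x

w = fit_line(train_x, train_y)

# Generalisation: predict for an input outside the training set.
print(w)          # learned weight: 2.0
print(w * 10.0)   # prediction for unseen x = 10: 20.0
```

The pattern found in the training set (multiply by 2) generalizes to inputs outside it, which is the whole point of ML.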

The 80s saw a new type of model. Geoffrey Hinton championed neural networks. Hinton, today often called the godfather of AI, helped pioneer the training of feedforward neural networks

Much like a human learning from mistakes, this approach turns the network's own error into an input for learning. This new implementation of ML would be called "deep learning"

A deep learning (DL) algorithm adjusts itself for accuracy through backpropagation: feeding the error back through the network

DL stacks layers of computing units, or neurons, into an artificial neural network. It would inspire an old tech giant to go Deep, literally
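The "feed the error back" idea can be shown at its smallest possible scale. A minimal sketch, not from the article: one linear neuron with one weight, nudged by its own error each step via gradient descent. Real networks do this across millions of weights and many layers, but the core loop is the same.

```python
# Toy "error fed back" loop: one linear neuron, one weight.
# Each step computes the error and pushes it back to adjust the weight --
# the core idea behind backpropagation, in its simplest form.

inputs  = [1.0, 2.0, 3.0]
targets = [3.0, 6.0, 9.0]   # true rule: y = 3x
w = 0.0                     # start with a wrong weight
lr = 0.05                   # learning rate

for _ in range(200):
    for x, t in zip(inputs, targets):
        pred = w * x
        error = pred - t          # how wrong were we?
        w -= lr * error * x       # gradient of squared error w.r.t. w

print(round(w, 3))  # converges to 3.0
```

Each pass shrinks the gap between the weight and the true rule, which is exactly the "adjusts itself for accuracy" behaviour described above.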

The 90s saw the first big win for AI. A computer called Deep Blue beat world chess champion Garry Kasparov in 1997

IBM became the AI poster child of the era, while Google quietly built an algorithm called PageRank. AI had started to seep into everyday life, but ordinary people couldn't wield its power directly

2023 isn’t AI’s first big hype cycle. In the 2010s, research went deep, but nobody built a breakout consumer-facing app

Deep inside Google Brain, a team of researchers was about to invent a new DL model. Their 2017 paper would be called “Attention Is All You Need”

The model?

“Transformer”

The transformer used “attention”, loosely mimicking human attention: it learned to give higher weight to certain inputs over others

Attention simplified the neural network, improved its outputs, and reduced training time
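The "weigh some inputs more than others" idea boils down to a softmax over relevance scores. A minimal sketch, assuming made-up scores and values; a real transformer derives the scores from query/key vectors, which is omitted here.

```python
# Toy attention: score each input, softmax the scores into weights,
# then take a weighted sum -- higher-scored inputs get more "attention".

import math

def attention(scores, values):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]   # softmax: weights sum to 1
    output = sum(w * v for w, v in zip(weights, values))
    return output, weights

values = [10.0, 20.0, 30.0]
scores = [0.1, 0.1, 5.0]    # the third input is far more relevant

out, weights = attention(scores, values)
print(weights)  # third weight dominates
print(out)      # output pulled close to 30.0
```

Because the weights are learned rather than fixed, the model can decide, per input, which parts of a sentence matter for predicting the next word.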

Not too far away, a little non-profit called OpenAI put this into practice. Generative Pre-trained Transformers, or GPT, would be developed by OpenAI in 2018

GPT-1 in 2018 had 117M parameters and was trained on web data. It was a big start but lacked coherence

GPT-2 in 2019 had 1.5Bn parameters and got coherence right, but struggled at reasoning. GPT-3 in 2020 had 175Bn parameters and got reasoning right

Models like GPT, trained on large amounts of language, came to be called Large Language Models (LLMs)

OpenAI was onto something

By 2021, two key trends were converging to make AI mainstream

a) Compute costs falling exponentially
b) Model parameters increasing exponentially

Together, the two made large AI models both useful and accessible

OpenAI thought of plugging GPT-3 into a chat interface

Just before ChatGPT, a startup called Stability AI showed the impact of neural nets on images. Instead of a transformer, it used a diffusion model

Diffusion models worked poorly for text, but exceptionally for images
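The intuition behind diffusion: the forward process gradually drowns an image in noise, and the model learns to reverse that, step by step. A toy sketch under assumed numbers (the linear noise schedule here is illustrative, not any particular model's): it tracks only how much of the original signal survives each step.

```python
# Toy diffusion intuition: the forward process replaces signal with noise
# a little at a time; a diffusion model learns to undo each step.
# We track the surviving signal fraction under a simple linear schedule.

T = 50
betas = [0.02 + 0.18 * t / (T - 1) for t in range(T)]  # noise added per step

signal = 1.0
levels = []
for beta in betas:
    signal *= (1.0 - beta) ** 0.5   # variance-preserving: signal shrinks...
    levels.append(signal)            # ...as noise takes its place

print(round(levels[0], 3))   # mostly signal after one step
print(round(levels[-1], 3))  # almost pure noise by the last step
```

Generation runs this in reverse: start from pure noise and denoise step by step until an image emerges.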

Like a gas diffusing, the output spread everywhere: it went viral

ChatGPT dropped in November 2022 and exploded, reaching 1M users in 5 days. It had no referral loop, no social layer. Its output alone was viral

26 years after Deep Blue beat Kasparov, the power of AI was now in normal people’s hands

OpenAI put AI on the internet, creating magic

In March 2023, OpenAI released GPT-3.5 as an API. The release triggered a flood of Twitter threads hawking new tools

Apps would be most useful in professional work involving repetition, code and basic creativity

But most apps were wrappers; the real tech was OpenAI’s APIs

In April 2023, an open-source GitHub repo called “AutoGPT” went viral

AutoGPT introduced the concept of “agents”. Agents were LLMs, but with the power to operate autonomously. “Autonomously” may make them sound like all-knowing robots, but they were just smart LLMs

Unlike a “dumb” LLM, an agent could take an instruction and break it into achievable sub-goals. The task could be as large as “running a company” or as small as “summarise news from Twitter”
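The agent loop is simpler than it sounds. A hedged sketch: `call_llm` is a hypothetical stand-in (a real agent would call a model API there; it is stubbed with a canned plan for this demo), but the plan-then-execute structure is the essence of the idea.

```python
# Agent sketch: decompose a goal into sub-goals, then work through them.
# `call_llm` is a hypothetical stub -- a real agent would query an LLM API.

def call_llm(prompt):
    """Hypothetical LLM call, answered with a canned plan for this demo."""
    canned = {
        "plan: summarise news from Twitter": [
            "fetch recent posts",
            "filter for news",
            "write a summary",
        ],
    }
    return canned.get(prompt, ["done"])

def run_agent(goal):
    sub_goals = call_llm(f"plan: {goal}")      # 1. decompose the goal
    results = []
    for step in sub_goals:                     # 2. execute each sub-goal
        results.append(f"completed: {step}")   #    (each step could itself
    return results                             #     be another LLM call)

for line in run_agent("summarise news from Twitter"):
    print(line)
```

The autonomy comes from the loop, not from any new model: the same LLM that answers chat questions is asked to plan, then asked to execute each step of its own plan.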

These agents were a highly evolved version of IBM’s Deep Blue

AI will dramatically change the way we work, but I think the fears are overblown

I have built tools on GPT that are excellent at synthesis but lack human creativity

I see it as a potent tool, but tools cannot beat humans.

Humans with tools beat humans without tools
