Interconnects Audio
by Nathan Lambert
Audio format of posts on interconnects.ai -- generated with AI from the author.
Copyright: © 2024 Nathan Lambert
Episodes
We aren't running out of training data, we are running out of open training data
8m · Published
Data licensing deals, scaling, human inputs, and repeating trends in open vs. closed.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/the-data-wall
0:00 We aren't running out of training data, we are running out of open training data
2:51 Synthetic data: 1 trillion new tokens per day
4:18 Data licensing deals: High costs per token
6:33 Better tokens: Search and new frontiers
Name, image, and AI's likeness
9m · Published
Celebrity's power will only grow in the era of infinite content.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/name-image-and-ai-likeness
0:00 Name, image, and AI's likeness
1:11 OpenAI's second terrible, horrible, no good, very bad week
4:36 The expansion of name and likeness
7:46 Culture and AI development
OpenAI chases Her
12m · Published
ChatGPT leaves the textbox, and Google is building the same, and more, as practical tools.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/openai-and-her
00:00 OpenAI chases Her
02:10 Talking to ChatGPT
03:53 GPT-4o: Toward omnimodal models
08:25 Google's mirror with Gemini
10:11 OpenAI's AI Safety: Have your cake and eat it too
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/her/img_018.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/her/img_023.jpg
OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions
14m · Published
Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects -- shrinking the Overton window of RLHF bugs.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/openai-rlhf-model-spec
00:00 OpenAI's Model (behavior) Spec, RLHF transparency, and personalization questions
02:56 Reviewing the Model Spec
08:26 Where RLHF can fail OpenAI
12:23 From Model Specs to personalization
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_027.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_029.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_033.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_034.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_041.webp
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-spec/img_046.webp
RLHF: A thin line between useful and lobotomized
13m · Published
Many, many signs of life for preference fine-tuning beyond spoofing chat evaluation tools.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/how-rlhf-works-2
00:00 How RLHF works, part 2: A thin line between useful and lobotomized
04:27 The chattiness paradox
08:09 The mechanism for making models chattier
10:42 Next steps for RLHF research
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_012.webp
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_018.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_025.png
Phi 3 and Arctic: Outlier LMs are hints
9m · Published
Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/phi-3-and-arctic-llms
0:00 Phi 3 and Arctic: Outlier LMs are hints
1:01 Arctic & open mixture of expert trends
6:10 Phi 3, synthetic data, and small models
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_004.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_008.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_018.png
AGI is what you want it to be
10m · Published
Certain definitions of AGI are backing people into a pseudo-religious corner.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/agi-is-what-you-want-it-to-be
00:00 AGI is what you want it to be
04:01 RL still rules the AGI discourse
05:43 Modern AGI tests
07:37 Agency and shifting goalposts
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_018.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_020.png
Llama 3: Scaling open LLMs to AGI
15m · Published
Meta shows that scaling won't be a limit for open LLM players in the near future.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/llama-3-and-scaling-open-llms
00:00 Llama 3; scaling open LLMs to AGI
01:44 Pretraining, data, and basic evals
06:06 Alignment and human evaluations
10:08 Chatting with Meta AI and Llama 3 70B Instruct
11:55 Same Llama license (mostly)
12:52 The healthy open LLM ecosystem
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_011.jpeg
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_013.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_015.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_020.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_036.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_040.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_046.jpeg
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_061.png
Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_063.webp
Fig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_066.png
Fig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_068.jpeg
Stop "reinventing" everything to "solve" alignment
7m · Published
Integrating some non-computing science into reinforcement learning from human feedback can give us the models we want.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/reinventing-llm-alignment
0:00 Stop "reinventing" everything to "solve" AI alignment
2:19 Social Choice for AI Alignment: Dealing with Diverse Human Feedback
7:03 OLMo 1.7 7B: A truly open model with actually good benchmarks
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_018.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_024.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_027.png
The end of the "best open LLM"
6m · Published
Modeling the compute versus performance tradeoff of many open LLMs.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/compute-efficient-open-llms
0:00 The end of the "best open LLM"
3:05 Compute efficient open LLMs
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_004.jpeg
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_009.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_014.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_016.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_018.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_020.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_022.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_024.png
Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_028.png
Interconnects Audio has 35 episodes in total, all non-explicit content. Total playtime is 8:54:36. The language of the podcast is English. This podcast was added on December 24th, 2023. It might contain more episodes than the ones shown here. It was last updated on May 30th, 2024 at 21:40.