
LLaMA: Open and Efficient Foundation Language Models

Overview

Note: This section is under heavy development.

What's New?

This paper introduces a collection of foundation language models ranging from 7B to 65B parameters.

Training Data

The models are trained on trillions of tokens using only publicly available datasets.

Research Context

The work by Hoffmann et al. (2022) shows that, given a fixed compute budget, smaller models trained on more data can outperform larger counterparts. That work recommends, for example, training a 10B model on 200B tokens.

Key Finding: However, the LLaMA paper finds that the performance of a 7B model continues to improve even after 1T tokens.
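To make the scaling argument concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes two common approximations that are not stated on this page: training compute is roughly 6·N·D FLOPs for N parameters and D tokens, and the Chinchilla recipe corresponds to roughly 20 training tokens per parameter. Treat the numbers as illustrative only.

```python
# Back-of-the-envelope sketch (illustrative assumptions, not figures from the paper):
#   * training compute C ~= 6 * N * D FLOPs for N parameters and D tokens
#   * the Chinchilla recipe corresponds to roughly 20 training tokens per parameter

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute in FLOPs, using C ~= 6 * N * D."""
    return 6.0 * n_params * n_tokens

def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Compute-optimal token count under the ~20 tokens/parameter heuristic."""
    return tokens_per_param * n_params

# Hoffmann et al. (2022): a 10B model is compute-optimal at about 200B tokens.
print(f"10B model, compute-optimal tokens: {chinchilla_optimal_tokens(10e9):.1e}")   # ~2.0e11

# LLaMA-7B is instead trained on 1T tokens, roughly 7x past its compute-optimal
# point, and the paper reports that its performance keeps improving.
print(f"7B model, compute-optimal tokens:  {chinchilla_optimal_tokens(7e9):.1e}")    # ~1.4e11
print(f"7B model on 1T tokens, training FLOPs: {train_flops(7e9, 1e12):.1e}")        # ~4.2e22
```

Under these assumptions, a 10B model is compute-optimal near 200B tokens, while LLaMA-7B's 1T-token run sits far past that point, which is exactly the trade-off the paper embraces.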

Research Focus

LLaMA Research Overview

This work focuses on training models (LLaMA) that achieve the best possible performance at various inference budgets by training on more tokens than is typically used.
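The inference-budget argument can also be sketched with a rough rule of thumb: a dense transformer spends on the order of 2·N FLOPs per generated token for N parameters. That constant is an assumption for illustration, not a figure from the paper.

```python
# Rough inference-cost sketch (assumption: ~2 * N FLOPs per generated token for
# a dense N-parameter transformer; this constant is illustrative, not from the paper).

def inference_flops_per_token(n_params: float) -> float:
    """Approximate forward-pass compute per token, ~2 * N FLOPs."""
    return 2.0 * n_params

llama_13b = inference_flops_per_token(13e9)
gpt3_175b = inference_flops_per_token(175e9)

print(f"LLaMA-13B : {llama_13b:.1e} FLOPs/token")
print(f"GPT-3 175B: {gpt3_175b:.1e} FLOPs/token")
print(f"LLaMA-13B is ~{gpt3_175b / llama_13b:.0f}x cheaper per generated token")
```

The roughly 13x gap per token is the payoff the paper targets: extra training compute is paid once, while inference savings accrue on every request.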

Capabilities & Key Results

Performance Comparison

LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being:

  • 10x smaller
  • possible to run on a single GPU (see the loading sketch below)

LLaMA-65B is competitive with the best existing models, such as:

  • Chinchilla-70B
  • PaLM-540B
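As a concrete illustration of the single-GPU point above, the sketch below loads a 13B LLaMA-class checkpoint in half precision with the Hugging Face transformers library. This is not the paper's original codebase; the model identifier is a placeholder (the official weights were distributed separately under a research license), and fp16 weights for 13B parameters need roughly 26 GB of GPU memory, so 8-bit or 4-bit quantization may be required on smaller cards.

```python
# Hedged sketch: running a 13B LLaMA-class checkpoint on a single GPU with the
# Hugging Face `transformers` library (not the paper's original codebase).
# `model_id` is a placeholder; the official LLaMA weights must be obtained separately.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_id = "path/to/a-llama-13b-checkpoint"  # placeholder, not a real hub id

tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~26 GB of weights for 13B parameters
    device_map="auto",          # requires `accelerate`; maps layers onto the available GPU
)

prompt = "LLaMA is a collection of foundation language models that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```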

Resources

Paper

LLaMA: Open and Efficient Foundation Language Models

Code

GitHub Repository


Key Takeaways

  1. Efficient Scaling: Smaller models trained on more data can outperform larger counterparts
  2. Continuous Improvement: 7B model performance keeps improving even after 1T tokens
  3. Cost-Effective: LLaMA-13B outperforms GPT-3 (175B) while being 10x smaller
  4. Single GPU Compatible: LLaMA-13B can run on a single GPU
  5. Competitive Performance: LLaMA-65B competes with state-of-the-art models such as Chinchilla-70B and PaLM-540B
  6. Open Foundation: Provides a base for many derivative models and applications