Survey notes
A Survey of Large Language Models
- PLM (Pre-trained Language Models)
ELMo: pre-trains a bidirectional LSTM, then fine-tunes for downstream tasks (context-aware)
BERT: pre-trains on large-scale unlabeled corpora, then fine-tunes for downstream tasks (context-aware); see the fill-mask sketch below
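A minimal sketch of BERT's context-aware masked-language-model behavior, assuming the Hugging Face transformers library (with a backend such as PyTorch installed) and the public bert-base-uncased checkpoint; the example sentences are made up for illustration:

```python
# Minimal sketch: BERT's masked-LM head predicting a masked token from
# bidirectional context. Assumes `pip install transformers torch` and the
# public bert-base-uncased checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Different surrounding contexts lead to different predictions for the
# masked position, illustrating the context-aware representations noted above.
for text in [
    "The doctor prescribed some [MASK] for the infection.",
    "She deposited the check at the [MASK].",
]:
    top = fill_mask(text, top_k=1)[0]
    print(f"{text} -> {top['token_str']} (score={top['score']:.2f})")
```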
Papers on improving the pre-training approach:
"RoBERTa: A Robustly Optimized BERT Pretraining Approach" (CoRR)
"Multitask Prompted Training Enables Zero-Shot Task Generalization" (ICLR 2022)
"What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?"
- LLM (Large Language Models)
Scaling up PLMs yields surprisingly strong results, following predictable scaling laws: "Scaling Laws for Neural Language Models" (see the formula below); for the resulting capability jumps, see "Emergent Abilities of Large Language Models"
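For reference, the parameter-count scaling law from "Scaling Laws for Neural Language Models" (Kaplan et al.) takes a power-law form; the fitted constants below are the values reported in that paper for their setup and should be read as illustrative, not universal:

```latex
% Test loss as a function of non-embedding parameter count N,
% when data and compute are not bottlenecks (Kaplan et al., 2020):
L(N) = \left( \frac{N_c}{N} \right)^{\alpha_N},
\qquad \alpha_N \approx 0.076, \quad N_c \approx 8.8 \times 10^{13}
```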
Question: how do LLMs attain their superior abilities?
Article: "How Does GPT Obtain Its Ability? Tracing Emergent Abilities of Language Models to Their Sources"
Emergent abilities
1. In-context learning: the model performs a new task from a few input-output demonstrations placed directly in the prompt, with no parameter updates
2. Instruction following: the model carries out tasks described only by natural-language instructions (more detailed prompts), without explicit demonstrations; e.g., LaMDA-PT
3. Step-by-step reasoning: handles complex problems (such as mathematical reasoning) by generating a chain of intermediate reasoning steps (chain-of-thought prompting); prompt sketches for all three follow this list
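A minimal sketch of what the three prompting styles look like as raw prompts; the tasks (English-to-French translation, a small arithmetic word problem) and wording are hypothetical examples for illustration, not taken from the survey:

```python
# Illustrative prompt templates for the three emergent abilities above.

# 1. In-context learning: a few demonstrations, then a new query; the model
#    is expected to continue the pattern without any fine-tuning.
icl_prompt = (
    "sea otter -> loutre de mer\n"
    "cheese -> fromage\n"
    "peppermint -> menthe poivree\n"
    "plush giraffe ->"
)

# 2. Instruction following: no demonstrations, just a natural-language task
#    description (the format that instruction tuning teaches models to obey).
instruction_prompt = "Translate the following English phrase to French: plush giraffe"

# 3. Step-by-step (chain-of-thought) reasoning: elicit intermediate steps so
#    the model decomposes a multi-step problem before giving the answer.
cot_prompt = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Let's think step by step."
)

for name, prompt in [("in-context", icl_prompt),
                     ("instruction", instruction_prompt),
                     ("chain-of-thought", cot_prompt)]:
    print(f"--- {name} ---\n{prompt}\n")
```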