【导读】cedrickchee维护这个项目包含用于自然语言处理(NLP)的大型机器(深度)学习资源,重点关注转换器(BERT)的双向编码器表示、注意机制、转换器架构/网络和NLP中的传输学习。https://github.com/cedrickchee/awesome-bert-nlpPapersBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context by Zihang Dai, Zhilin Yang, Yiming Yang, William W. Cohen, Jaime Carbonell, Quoc V. Le and Ruslan Salakhutdinov.Uses smart caching to improve the learning of long-term dependency in Transformer. Key results: state-of-art on 5 language modeling benchmarks, including ppl of 21.8 on One Billion Word (LM1B) and 0.99 on enwiki8. The authors claim that the method is more flexible, faster d
………………………………