
57,300,563 Researchers
310,487,224 Publications
8,935,750 Concepts
2,266,326,996 Citations
Topic
Hardware-Aligned and Natively Trainable Sparse Attention
The latest paper from DeepSeek introduces NSA, a natively trainable sparse attention mechanism for ultra-fast long-context training and inference.
YiFan Zhang, Shanglin Lei, Runqi Qiao, Zhuoma GongQue, Xiaoshuai Song, Guanting Dong, Qiuna Tan, Zhe Wei, Peiqing Yang, Ye Tian, Yadong Xue, Xiaofei Wang, ...
CoRR (2024)
Cited: 0, Views: 11050, Rating: 4.5 stars
Computing Research Repository (2024)
Cited: 7, Views: 1872
Topic
Mixture of Block Attention for Long-Context LLMs
Kimi proposes MoBA, a new attention mechanism that applies the principles of Mixture of Experts (MoE) to improve the efficiency of LLMs in long-context scenarios without sacrificing performance.
Minghao Xu, Lichuan Xiang, Xu Cai, Hongkai Wen
CoRR (2024)
Cited: 2, Views: 2012
Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Nathan Cooper, Griffin Adams, ...
CoRR (2024)
Cited: 69, Views: 1426
Frank F. Xu, Yufan Song, Boxuan Li, Yuxuan Tang, Kritanjali Jain, Mengxue Bao, Zora Z. Wang, Xuhui Zhou, Zhitong Guo, Murong Cao, Mingyang Yang, Hao Yang Lu, ...
Computing Research Repository (2024)
Cited: 22, Views: 1201