RoPE explorer
Linear Attention for Efficient Bidirectional Sequence Modeling
Model, Theory, Systems, Results