more references fix
parent e522ecf630
commit f00d10b59f
2 changed files with 3 additions and 3 deletions
@@ -289,6 +289,9 @@ Theoretical analysis suggests 2-3x improvements in inference throughput. For a d
- Looped transformer cyclic trajectories and input injection (rosinality): https://x.com/rosinality/status/2043953033428541853
- Parcae scaling laws for stable looped language models — thread (Hayden Prairie): https://x.com/hayden_prairie/status/2044453231913537927
- RoPE-like loop index embedding idea to differentiate functions across iterations (davidad): https://x.com/davidad/status/2044453231913537927
- On the Looped Transformers Controversy by ChrisHayduk: https://x.com/ChrisHayduk/status/2045947623572688943
- On the Looped Transformers Controversy Summary by @realsigridjin: https://x.com/realsigridjin/status/2046012743778766875
### Papers
@@ -191,6 +191,3 @@ def mythos_1t() -> MythosConfig:
lora_rank=256,
max_output_tokens=131072,
)
# fmt: on