My best attempt at implementing Grok in PyTorch
Grok, or Grok-1, is a recently open-sourced mixture-of-experts language model by xAI.
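To make the mixture-of-experts idea concrete before diving in: a router scores each token, and only the top-k experts contribute to that token's output. The sketch below is a generic top-k MoE layer, not Grok's actual routing; the dimensions, expert count, and top-k value are placeholders.

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative only, not Grok's routing)."""
    def __init__(self, dim: int = 64, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        logits = self.router(x)                              # (batch, seq, num_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)   # route each token to its top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            # combined gate for tokens routed to expert i, zero elsewhere
            gate = (weights * (indices == i)).sum(dim=-1, keepdim=True)
            # note: runs every expert densely for clarity; real MoE layers
            # dispatch only the routed tokens to each expert
            out = out + gate * expert(x)
        return out

# usage: ToyMoE()(torch.randn(2, 16, 64))  ->  (2, 16, 64)
```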
- This implementation is not intended to be run; it is a reference for understanding the architecture of the Grok model, which is also why I wrote it. Personally, I also find it easier to reason about a model's architecture when shapes are provided via type hints (see the sketch after this list).
- There are probably still bugs in this implementation; I have not tested it extensively, and I may have missed aspects of the architecture. It should nevertheless give you an idea of how Grok works.
- The original JAX/Haiku implementation of Grok can be found here.
- Certain parts of the code were adapted from x-transformers (MIT License), mema (MIT License), and huggingface/transformers (Apache 2.0 License).
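As an example of what shape-carrying type hints can look like, here is a minimal module annotated with the jaxtyping library; the hints in this repo may use a different style, and the module itself is made up for illustration.

```python
import torch
from torch import Tensor, nn
from jaxtyping import Float  # pip install jaxtyping

class Projection(nn.Module):
    """Toy linear projection whose signature documents tensor shapes."""
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim_out, dim_in))

    def forward(
        self, x: Float[Tensor, "batch seq dim_in"]
    ) -> Float[Tensor, "batch seq dim_out"]:
        # the annotations make the expected shapes visible at a glance
        return x @ self.weight.T
```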
```bibtex
@misc{su2023roformer,
    title  = {RoFormer: Enhanced Transformer with Rotary Position Embedding},
    author = {Jianlin Su and Yu Lu and Shengfeng Pan and Ahmed Murtadha and Bo Wen and Yunfeng Liu},
    year   = {2023},
    eprint = {2104.09864},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CL}
}
```
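For readers unfamiliar with rotary embeddings, here is the core computation from the paper reduced to a standalone function. The base of 10000 and the pairing of channels follow the paper; the implementation in this repo may organize the computation differently.

```python
import torch

def rotary_embedding(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embedding to x of shape (..., seq, dim); dim must be even."""
    *_, seq, dim = x.shape
    # one rotation frequency per pair of channels, geometrically spaced
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * inv_freq  # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]  # split channels into (even, odd) pairs
    # rotate each pair by its position-dependent angle
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```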
```bibtex
@misc{vaswani2017attention,
    title  = {Attention Is All You Need},
    author = {Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin},
    year   = {2017},
    eprint = {1706.03762},
    archivePrefix = {arXiv},
    primaryClass  = {cs.CL}
}
```
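And the scaled dot-product attention at the heart of that paper, as a minimal causal variant. Grok's actual attention layer differs in its details, so treat this as the textbook computation only.

```python
import math
import torch

def causal_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with a causal mask.
    q, k, v: (batch, heads, seq, head_dim)."""
    seq = q.shape[-2]
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])  # (batch, heads, seq, seq)
    # mask out the strictly upper triangle so each position attends only to the past
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool, device=q.device), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return scores.softmax(dim=-1) @ v
```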
```bibtex
@misc{zhang2019root,
    title  = {Root Mean Square Layer Normalization},
    author = {Biao Zhang and Rico Sennrich},
    year   = {2019},
    eprint = {1910.07467},
    archivePrefix = {arXiv},
    primaryClass  = {cs.LG}
}
```
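RMSNorm is small enough to state in full: normalize by the root mean square of the features, apply a learned gain, and skip the mean subtraction and bias of standard LayerNorm. A sketch (the eps value and parameter name are arbitrary choices here):

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: x / rms(x) * g, no mean subtraction, no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.g = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 1 / sqrt(mean(x^2) + eps), computed over the feature dimension
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.g
```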