Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper
Updated Jun 8, 2024 - Python
A repository collecting papers and code in the field of AI.
Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
Yet Another Transformer Implementation
Seq2SeqSharp is a tensor-based, fast, and flexible deep neural network framework written in .NET (C#). Its highlights include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.
Developing Natural Language Processing tools to enhance Learning Analytics. Creating an automated dashboard that diagnoses strengths and weaknesses from educational data.
Sentiment analysis on the IMDB dataset using Bag of Words models (Unigram, Bigram, Trigram, Bigram with TF-IDF) and Sequence to Sequence models (one-hot vectors, word embeddings, pretrained embeddings like GloVe, and transformers with positional embeddings).
Extractive Nepali Question Answering System | Browser Extension & Web Application
A novel implementation fusing ViT with Mamba into a fast, agile, high-performance multimodal model. Powered by Zeta, the simplest AI framework ever.
This is the project repo associated with the paper "Disentangling and Integrating Relational and Sensory Information in Transformer Architectures" by Awni Altabaa, John Lafferty
Slides from my NLP course on the transformer architecture
This study investigates the effectiveness of three Transformers (BERT, RoBERTa, XLNet) in handling data sparsity and cold-start problems in recommender systems. We present a Transformer-based hybrid recommender system that predicts missing ratings and extracts semantic embeddings from user reviews to mitigate these issues.
Simple character level Transformer
Simple PyTorch implementation of the paper "Attention Is All You Need" - https://arxiv.org/abs/1706.03762
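Several of the repositories above implement the attention mechanism from "Attention Is All You Need". Its core operation, scaled dot-product attention, is softmax(QKᵀ/√d_k)·V; the following is a minimal pure-Python sketch for illustration, not code from any listed repository:

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V are lists of row vectors (lists of floats)."""
    d_k = len(K[0])
    # Similarity of each query to each key, scaled by sqrt(d_k)
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in K] for q_row in Q]
    # Row-wise softmax (shifted by the row max for numerical stability)
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    # Each output row is the attention-weighted sum of the value rows
    return [[sum(w * v_row[j] for w, v_row in zip(w_row, V))
             for j in range(len(V[0]))] for w_row in weights]
```

Real implementations batch this over heads and use tensor libraries such as PyTorch, but the arithmetic is the same.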
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
A desktop application to assist in learning languages, using a deep learning model to generate translations.
Inference Llama 2 in one file of pure 🔥
Transformer implementation from scratch for next-character prediction.
Facial attribute recognition using the Transformer architecture, achieving 91% accuracy on CelebA.
Official implementation of "Particle Transformer for Jet Tagging".