Training LLMs with QLoRA + FSDP

Jupyter Notebook 1,474 192 Updated Nov 9, 2024

Build LLM-powered applications in Ruby

Ruby 1,754 239 Updated Apr 29, 2025

Language-Agnostic SEntence Representations

Jupyter Notebook 3,633 462 Updated May 2, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 21,159 2,615 Updated Mar 4, 2025

Incremental Skip-gram Model with Negative Sampling

Shell 69 8 Updated Jun 30, 2019

Word2Vec naïve version from scratch vs Word2Vec parallelized version.

Jupyter Notebook 1 Updated Aug 4, 2022

Package for evaluating word embeddings

Python 437 111 Updated Jan 4, 2021

RiverText is a framework that standardizes the incremental word embedding methods proposed in the state of the art. Please feel welcome to open an issue in case you have any questions or a pull request if you…

Python 22 1 Updated Feb 26, 2025

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,703 1,186 Updated Jul 29, 2024

🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations

Jupyter Notebook 578 38 Updated Feb 24, 2024

A collection of ORM-style clients to public patent data

Python 106 42 Updated Apr 8, 2025

Painterro - JavaScript painting plugin

JavaScript 652 88 Updated Sep 18, 2024

🔥 Use pre-trained models in PyTorch to extract vector embeddings for any image

Python 606 97 Updated Dec 23, 2023

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,972 4,900 Updated Apr 28, 2025

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,359 258 Updated Oct 18, 2024

Header-only C++/Python library for fast approximate nearest neighbors

C++ 4,664 706 Updated Apr 20, 2025

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,476 459 Updated Sep 21, 2024

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)

C 1,150 194 Updated Jun 1, 2024

Hash function quality and speed tests

C++ 1,971 183 Updated Apr 22, 2025

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

C++ 339 50 Updated Apr 1, 2024

JavaScript Canvas Library, SVG-to-Canvas (& canvas-to-SVG) Parser

TypeScript 29,967 3,571 Updated Apr 22, 2025

Zest is a compression-based text classifier using Meta's Zstandard compression algorithm. Zest is language-agnostic and this approach simplifies configuration, avoids careful feature extraction and…

Python 5 Updated Jan 15, 2022

Datasets and SOTA results for every field of Chinese NLP

HTML 1,804 271 Updated Apr 7, 2022

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,547 525 Updated Oct 16, 2024

PyTorch version of BERT-whitening

Python 308 44 Updated Oct 9, 2021

PISA: Performant Indexes and Search for Academia

C++ 982 67 Updated Apr 27, 2025

BERT models for Japanese text.

Python 533 55 Updated Mar 23, 2024

PyTorch code for SpERT: Span-based Entity and Relation Transformer

Python 699 147 Updated Feb 1, 2024