It is a multi-lingual (97 languages) text content automatic recognition and segmentation tool. 强大的TTS多语言(97种语言)混合文本内容自动分词工具。
-
Updated
Sep 7, 2024 - Python
It is a multi-lingual (97 languages) text content automatic recognition and segmentation tool. 强大的TTS多语言(97种语言)混合文本内容自动分词工具。
GlotLID: Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
Faster, modernized fork of the language identification tool langid.py
An AI model that automatically detects the language of a given text supported by 3 pre-trained NLP models and one ML algorithm.
.NET Port of Language Identification Library for langid-java. 移植自langid-java的语言识别库。
A Julia package for language identification.
Language detection (majority voting)
To 1) create train/test samples of Tatoeba sentences for NLP-related tasks & 2) evaluate the performance of different solutions for detecting the language of a given text.
Add a description, image, and links to the langid topic page so that developers can more easily learn about it.
To associate your repository with the langid topic, visit your repo's landing page and select "manage topics."