funasr1.0.4
LauraGPT committed Jan 30, 2024
1 parent c47ad73 commit f1c1cb0
Showing 5 changed files with 37 additions and 75 deletions.
24 changes: 18 additions & 6 deletions README.md
@@ -3,11 +3,10 @@
([简体中文](./README_zh.md)|English)

# FunASR: A Fundamental End-to-End Speech Recognition Toolkit
<p align="left">
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>


[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)


<strong>FunASR</strong> aims to bridge academic research and industrial applications in speech recognition. By supporting the training and fine-tuning of industrial-grade speech recognition models, it lets researchers and developers conduct research on and production of speech recognition models more conveniently, and helps the speech recognition ecosystem grow. ASR for Fun!

@@ -28,6 +27,7 @@

<a name="whats-new"></a>
## What's new:
- 2024/01/30: funasr-1.0 has been released ([docs](https://github.com/alibaba-damo-academy/FunASR/discussions/1319))
- 2024/01/25: Offline File Transcription Service 4.2 and Offline File Transcription Service of English 1.3 released, with an optimized VAD (Voice Activity Detection) data processing method that significantly reduces peak memory usage, plus memory leak optimizations; Real-time Transcription Service 1.7 released, with client-side optimizations ([docs](runtime/readme.md))
- 2024/01/09: The FunASR SDK for Windows version 2.0 has been released, featuring support for the offline file transcription service (CPU) of Mandarin 4.1, the offline file transcription service (CPU) of English 1.2, and the real-time transcription service (CPU) of Mandarin 1.6. For more details, please refer to the official documentation or release notes ([FunASR-Runtime-Windows](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
- 2024/01/03: File Transcription Service 4.0 released: added support for 8k models, optimized timestamp mismatch issues and added sentence-level timestamps, improved the effectiveness of English word FST hotwords, supported automated configuration of thread parameters, and fixed known crash issues as well as memory leak problems; refer to ([docs](runtime/readme.md#file-transcription-service-mandarin-cpu)).
@@ -48,7 +48,19 @@
<a name="Installation"></a>
## Installation

Please refer to the [installation docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
```shell
pip3 install -U funasr
```
Or install from source code:
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip3 install -e ./
```
Optionally, install modelscope to use the pretrained models:

```shell
pip3 install -U modelscope
```
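
After running the install commands above, it can be useful to confirm which versions actually landed in the current environment. The following is a minimal sketch using only the Python standard library (`importlib.metadata`, available from Python 3.8); `installed_version` is a hypothetical helper introduced here for illustration and is not part of FunASR or modelscope.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # modelscope is optional, so a missing entry for it is not an error.
    for name in ("funasr", "modelscope"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

The check reads package metadata instead of importing the packages, so it stays fast and does not load any models.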

## Model Zoo
FunASR has open-sourced a large number of models pre-trained on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models; for more, please refer to the [Model Zoo]().
24 changes: 18 additions & 6 deletions README_zh.md
@@ -3,11 +3,9 @@
(简体中文|[English](./README.md))

# FunASR: A Fundamental End-to-End Speech Recognition Toolkit
<p align="left">
<a href=""><img src="https://img.shields.io/badge/OS-Linux%2C%20Win%2C%20Mac-brightgreen.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Python->=3.7,<=3.10-aff.svg"></a>
<a href=""><img src="https://img.shields.io/badge/Pytorch-%3E%3D1.11-blue"></a>
</p>

[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)


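FunASR aims to build a bridge between academic research and industrial applications of speech recognition. By releasing the training and fine-tuning of industrial-grade speech recognition models, it lets researchers and developers conduct research on and production of speech recognition models more conveniently, and helps the speech recognition ecosystem grow. Make speech recognition more fun!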

@@ -31,6 +29,7 @@ FunASR aims to build a bridge between academic research and industrial applications of speech recognition

<a name="最新动态"></a>
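## What's new
- 2024/01/30: funasr-1.0 released; for the update notes see the [docs](https://github.com/alibaba-damo-academy/FunASR/discussions/1319)
- 2024/01/25: Mandarin Offline File Transcription Service 4.2 and English Offline File Transcription Service 1.3 released, with an optimized VAD data processing method that greatly reduces peak memory usage, plus memory leak optimizations; Mandarin Real-time Transcription Service 1.7 released, with client-side optimizations; for details see ([deployment docs](runtime/readme_cn.md))
- 2024/01/09: The FunASR community package for Windows version 2.0 released, supporting the latest features of Mandarin offline file transcription 4.1, English offline file transcription 1.2, and Mandarin real-time transcription service 1.6; for details see ([FunASR community package for Windows](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
- 2024/01/03: Mandarin Offline File Transcription Service 4.0 released: added support for 8k models, optimized timestamp mismatch issues and added sentence-level timestamps, improved the effectiveness of English word FST hotwords, supported automated configuration of thread parameters, and fixed known crash issues and memory leak problems; for details see ([deployment docs](runtime/readme_cn.md#中文离线文件转写服务cpu版本))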
@@ -49,7 +48,20 @@ FunASR aims to build a bridge between academic research and industrial applications of speech recognition

<a name="安装教程"></a>
## Installation
For the FunASR installation guide, please read [Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)

```shell
pip3 install -U funasr
```
Or install from source code:
```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
pip3 install -e ./
```
To use the industrial pre-trained models, install modelscope (optional):

```shell
pip3 install -U modelscope
```

## Model Zoo

2 changes: 1 addition & 1 deletion funasr/version.txt
@@ -1 +1 @@
1.0.3
1.0.4
30 changes: 0 additions & 30 deletions funasr/quick_start.md → runtime/quick_start.md
@@ -132,33 +132,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au

For more examples, please refer to [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_offline.md)


## Industrial Model Egs

To use the pre-trained industrial models in ModelScope for inference or fine-tuning, refer to the following example:

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
```

More examples can be found in the [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)

## Academic model egs

To train from scratch, which is typical for academic models, start training and inference with the following command:

```shell
cd egs/aishell/paraformer
. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
```
More examples can be found in the [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)
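
The `run.sh` invocation above passes the GPU list and the GPU count as two separate options. The sketch below builds that command line from a single list of GPU ids so the two values cannot drift apart. It assumes that `--gpu_num` is meant to equal the number of ids in `--CUDA_VISIBLE_DEVICES`, which is inferred from the example and not confirmed by the source; `build_run_command` is a hypothetical helper, not part of FunASR.

```python
from typing import Sequence


def build_run_command(gpu_ids: Sequence[int], script: str = "./run.sh") -> str:
    """Build the shell command for run.sh with gpu_num derived from gpu_ids.

    Assumption: gpu_num should equal the number of visible devices.
    """
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    if len(set(gpu_ids)) != len(gpu_ids):
        raise ValueError("GPU ids must be unique")
    visible = ",".join(str(i) for i in gpu_ids)
    return f'. {script} --CUDA_VISIBLE_DEVICES="{visible}" --gpu_num={len(gpu_ids)}'


if __name__ == "__main__":
    # Reproduces the two-GPU command from the quick start.
    print(build_run_command([0, 1]))
```

The function only returns the command string; it is meant to be run from `egs/aishell/paraformer` in a shell, as in the original instructions.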
32 changes: 0 additions & 32 deletions funasr/quick_start_zh.md → runtime/quick_start_zh.md
@@ -130,35 +130,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au



### Industrial model egs

To use the pre-trained industrial models in ModelScope for inference or fine-tuning, refer to the following example:


```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
)

rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
```

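For more examples, see [here](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)


### Academic model egs

To train from scratch, which is typical for academic models, start training and inference with the following command: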

```shell
cd egs/aishell/paraformer
. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
```

For more examples, see [here](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html)
