Releases: alibaba/EasyTransfer
AppZoo Supports DSW, Add Meta-FT, Tianchi
AppZoo
- Supports exporting the best checkpoint via the args `export_best_checkpoint` and `export_best_checkpoint_metric` (see the sketch after this list)
- Supports reading/writing `inputTable`/`outputTable` from/to OSS
- Supports the DSW PAI-TF Docker image; supports reading local/OSS files from DSW
- Text match: supports dimension reduction of the `pool_output` of the `text_match_bert_two_tower` model (see the sketch after this list)
- Text classification: fixes the export bug of `PairedClassificationRegressionPreprocessor`
- Sequence labeling: fixes a preprocessing bug where special tokens in the input cause the BERT tokenizer to fail
- MRC: fixes a prediction bug in the HAE model
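
For the best-checkpoint export, this release only names the two arguments. The sketch below is not EasyTransfer's implementation; it illustrates the underlying mechanism with TensorFlow 1.x's `tf.estimator.BestExporter`, which keeps the checkpoint that scores highest on a chosen eval metric. The metric name `eval_accuracy` and the serving signature are illustrative assumptions.

```python
import tensorflow as tf

# Illustrative stand-in for the export_best_checkpoint_metric argument.
BEST_METRIC = "eval_accuracy"

def compare_by_metric(best_eval_result, current_eval_result):
    # Keep the checkpoint whose chosen eval metric beats the best seen so far.
    return current_eval_result[BEST_METRIC] > best_eval_result[BEST_METRIC]

def serving_input_fn():
    # Placeholder serving signature; the real feature spec depends on the model.
    features = {"input_ids": tf.placeholder(tf.int32, [None, 128], name="input_ids")}
    return tf.estimator.export.ServingInputReceiver(features, features)

best_exporter = tf.estimator.BestExporter(
    name="best",
    serving_input_receiver_fn=serving_input_fn,
    compare_fn=compare_by_metric,
    exports_to_keep=1)

# Attached to the EvalSpec, the exporter writes out the best checkpoint
# whenever evaluation runs during tf.estimator.train_and_evaluate.
# eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn, exporters=[best_exporter])
```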
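The two-tower `pool_output` dimension reduction can be pictured as a learned projection applied to each tower's pooled BERT output before the match score is computed. Below is a minimal TensorFlow 1.x sketch under that assumption; `projection_dim` and the cosine-similarity scoring are illustrative choices, not necessarily what `text_match_bert_two_tower` does internally.

```python
import tensorflow as tf

projection_dim = 128  # hypothetical reduced size of pool_output

def reduce_pooled_output(pooled_output, scope):
    # Project the 768-d pooled output down to projection_dim dimensions.
    with tf.variable_scope(scope):
        return tf.layers.dense(pooled_output, projection_dim, activation=tf.nn.tanh)

def two_tower_score(query_pooled, doc_pooled):
    # Reduce each tower independently, then compare with cosine similarity.
    q = tf.nn.l2_normalize(reduce_pooled_output(query_pooled, "query_proj"), axis=-1)
    d = tf.nn.l2_normalize(reduce_pooled_output(doc_pooled, "doc_proj"), axis=-1)
    return tf.reduce_sum(q * d, axis=-1)
```
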
Scripts
Init release easytransfer
Main Features
• Language model pre-training tool: provides a comprehensive pre-training tool for pre-training language models such as T5 and BERT. With this tool, users can easily train models that achieve strong results on benchmark leaderboards such as CLUE, GLUE, and SuperGLUE;
• ModelZoo with rich and high-quality pre-trained models: supports continual pre-training and fine-tuning of mainstream language models such as BERT, ALBERT, RoBERTa, and T5. It also includes FashionBERT, a multi-modal model developed with fashion-domain data at Alibaba (see the fine-tuning sketch after this list);
• AppZoo with rich and easy-to-use applications: supports mainstream NLP applications as well as models developed within Alibaba, e.g. HCNN for text matching and BERT-HAE for MRC.
• Automatic knowledge distillation: supports task-adaptive knowledge distillation to distill knowledge from a teacher model into a small task-specific student model. The resulting method, AdaBERT, uses neural architecture search to find a task-specific architecture that compresses the original BERT model. The compressed models are 12.7x to 29.3x faster than BERT in inference and 11.5x to 17.0x smaller in parameter size, with comparable performance.
• Easy-to-use and high-performance distributed strategy: built on in-house PAI features, it provides an easy-to-use and high-performance distributed strategy for multi-CPU/GPU training.
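
As a concrete picture of the ModelZoo fine-tuning feature above, here is a minimal text-classification sketch in the style of the quickstart Python API. The module paths and method names (`base_model`, `model_zoo.get_pretrained_model`, `preprocessors.get_preprocessor`, the loss and evaluator helpers) are taken as assumptions and may differ between releases; treat it as an outline rather than copy-paste code.

```python
from easytransfer import base_model, layers, model_zoo, preprocessors
from easytransfer.losses import softmax_cross_entropy
from easytransfer.evaluators import classification_eval_metrics

class TextClassification(base_model):
    """Fine-tunes a ModelZoo checkpoint (e.g. BERT) for sequence classification."""

    def build_logits(self, features, mode=None):
        # Load the matching preprocessor and the pre-trained backbone by name,
        # e.g. pretrain_model_name_or_path="google-bert-base-zh" (illustrative).
        preprocessor = preprocessors.get_preprocessor(self.pretrain_model_name_or_path)
        model = model_zoo.get_pretrained_model(self.pretrain_model_name_or_path)
        dense = layers.Dense(self.num_labels)
        input_ids, input_mask, segment_ids, label_ids = preprocessor(features)
        _, pooled_output = model([input_ids, input_mask, segment_ids], mode=mode)
        return dense(pooled_output), label_ids

    def build_loss(self, logits, labels):
        return softmax_cross_entropy(labels, self.num_labels, logits)

    def build_eval_metrics(self, logits, labels):
        return classification_eval_metrics(logits, labels, self.num_labels)
```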