This will help you build a simulation environment that is consistent with the online system of the competition, and provide an environment that is easy to develop, including installing all dependencies and providing ssh acess and jupyter notebook.
1.get dev
image
pull image from dockerhub
docker pull padeoe/cail:dev
or build it by yourself
docker build \
-t cail:dev \
--build-arg apt_mirror="mirrors.aliyun.com" \
--build-arg pypi_mirror="https://mirrors.aliyun.com/pypi/simple" \
-f docker/dev/Dockerfile .
2.run container
BERT_PRETRAINED_MODEL_DIR="/data/resources/bert"
SSH_PORT=2229
JUPYTER_PORT=2230
docker run \
--name cail_dev \
--restart always \
--runtime nvidia \
-v $BERT_PRETRAINED_MODEL_DIR:/bert \
-p $SSH_PORT:22 \
-p $JUPYTER_PORT:8888 \
-d \
padeoe/cail:dev
Now we can access ssh at ssh -p 2229 root@localhost
and jupyter notebook at http://localhost:2230.
The ssh password is cail
by default, you can change it in Dockerfile
This container will perform model training and output model files.
1.get train
image
pull image from dockerhub
docker pull padeoe/cail:train
or build it by yourself
docker build \
-t cail:train \
--build-arg apt_mirror="mirrors.aliyun.com" \
--build-arg pypi_mirror="https://mirrors.aliyun.com/pypi/simple" \
-f docker/train/Dockerfile .
2.prepare data
The dataset provided by organizer is downloaded during image building process,
so the dataset is bundled in docker image in /cail/data
.
If you want to use your own dataset, put dataset files at DATA_DIR
,
the layout should be like this:
$ tree data/
data/
├── raw
│ └── CAIL2019-SCM-big
│ └── input.txt
└── test
├── ground_truth.txt
└── input.txt
3 directories, 3 files
3.train models
BERT_PRETRAINED_MODEL="/data/resources/bert/pytorch_chinese_L-12_H-768_A-12/"
OUTPUT_MODELS_DIR="$PWD/model"
DATA_DIR="/data/padeoe/project/cail/data"
docker run \
--name cail-train \
--rm \
-it \
--runtime nvidia \
-e LOCAL_USER_ID=`id -u $USER` \
-v $BERT_PRETRAINED_MODEL:/bert/pytorch_chinese_L-12_H-768_A-12 \
-v ${DATA_DIR}:/cail/data \
-v ${OUTPUT_MODELS_DIR}:/cail/model \
padeoe/cail:train
The model will be saved at OUTPUT_MODELS_DIR
.
This container will test your model and compress the codes into zip format required by the organizer.
1.get submit
image
pull image from dockerhub
docker pull padeoe/cail:submit
or build it by yourself
docker build \
-t cail:submit \
--build-arg apt_mirror="mirrors.aliyun.com" \
--build-arg pypi_mirror="https://mirrors.aliyun.com/pypi/simple" \
-f docker/submit/Dockerfile .
2.run test and zip codes
docker run \
--rm \
-it \
--runtime nvidia \
-e LOCAL_USER_ID=`id -u $USER` \
-e SUBMIT_FILES="main.py model.py model" \
-v "$PWD":/codes \
-v "$PWD/data/submit_zip":/output_zip \
padeoe/cail:submit
It will execute main.py
and judger.py
and output the the accuracy of evaluation.
Finally, the compressed codes and models for submit will stored in $PWD/data/submit_zip
.