PaddleClas
Training environment setup
https://www.paddlepaddle.org.cn/documentation/docs/zh/install/docker/linux-docker.html
# Pull the Docker image
sudo docker pull registry.baidubce.com/paddlepaddle/paddle:2.4.2-gpu-cuda11.2-cudnn8.2-trt8.0
# Official image
sudo docker run -id --name face --gpus all --shm-size=64G -p 33331:22 -p 9100:9100 -p 9101:9101 registry.baidubce.com/paddlepaddle/paddle:2.4.2-gpu-cuda11.2-cudnn8.2-trt8.0 bash
# Custom-built image
sudo docker run -id --name folds --gpus all --shm-size=32G -p 33332:22 -p 9102:9102 -p 9103:9103 face_shape_cls:1.0 bash
sudo docker exec -it folds bash
# Start SSH for remote debugging
service ssh start
service ssh status
An error that can be ignored:
ERROR: onnx 1.13.1 has requirement protobuf<4,>=3.20.2, but you'll have protobuf 3.20.0 which is incompatible.
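The pip message above says that onnx 1.13.1 expects protobuf >=3.20.2 and <4, while 3.20.0 is installed. To check whether a version falls inside such a range, a minimal sketch follows. The helper names `parse_version`, `in_range`, and `installed_version` are hypothetical, and the comparison only handles plain dotted numeric versions (no `rc` or `post` suffixes).

```python
try:
    from importlib import metadata  # available in Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata


def parse_version(text):
    """Turn a plain dotted version such as '3.20.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower, upper):
    """Return True if lower <= version < upper (all dotted numeric strings)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Range copied from the pip message: protobuf<4,>=3.20.2
print(in_range("3.20.0", "3.20.2", "4"))  # False
```

Inside the container, `installed_version("protobuf")` returns the version actually installed, which can then be passed to `in_range`.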
Model training and inference
Edit the configuration file
# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: gpu
  save_interval: 100
  eval_during_train: True
  eval_interval: 1
  epochs: 300
  print_batch_step: 10
  use_visualdl: False
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# model architecture
Arch:
  name: ResNet50_vd
  class_num: 3

# loss function config for training/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
  Eval:
    - CELoss:
        weight: 1.0

Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.0125
    # learning_rate: 0.05
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    coeff: 0.00002

# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/folds/
      cls_label_path: ./dataset/folds/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 16
      drop_last: False
      shuffle: True
    loader:
      num_workers: 2
      use_shared_memory: True
  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/folds/
      cls_label_path: ./dataset/folds/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 32
      drop_last: False
      shuffle: False
    loader:
      num_workers: 2
      use_shared_memory: True

Infer:
  infer_imgs: docs/images/inference_deployment/whl_demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 1
    class_id_map_file: ./dataset/folds/label_list.txt

Metric:
  Train:
    - TopkAcc:
        topk: [1, 1]
  Eval:
    - TopkAcc:
        topk: [1, 1]
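The config points `cls_label_path` at train_list.txt and val_list.txt, and `class_id_map_file` at label_list.txt. In PaddleClas list files, each line holds an image path relative to `image_root` followed by an integer label, and each line of the label map holds a class id followed by its name. The sketch below generates the three files. It assumes one sub-directory per class under `image_root`; that layout, the 0.8 train ratio, and the function name `write_list_files` are assumptions, not something these notes specify.

```python
import random
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_list_files(image_root, train_ratio=0.8, seed=0):
    """Write train_list.txt, val_list.txt and label_list.txt under image_root.

    Assumes one sub-directory per class, e.g. image_root/<class_name>/*.jpg.
    File names must not contain spaces, because a space separates path and label.
    """
    root = Path(image_root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    rng = random.Random(seed)
    train_lines, val_lines = [], []

    for label, name in enumerate(class_names):
        images = sorted(
            p for p in (root / name).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
        rng.shuffle(images)
        split = int(len(images) * train_ratio)
        for index, image in enumerate(images):
            # Path relative to image_root, then the integer class label.
            line = f"{image.relative_to(root).as_posix()} {label}"
            (train_lines if index < split else val_lines).append(line)

    (root / "train_list.txt").write_text("\n".join(train_lines) + "\n", encoding="utf-8")
    (root / "val_list.txt").write_text("\n".join(val_lines) + "\n", encoding="utf-8")
    (root / "label_list.txt").write_text(
        "".join(f"{label} {name}\n" for label, name in enumerate(class_names)),
        encoding="utf-8",
    )
    return class_names
```

For the config above, this would be called as `write_list_files("./dataset/folds/")`. The number of class folders should match `Arch.class_num`.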
Training command
python tools/train.py -c ./ppcls/configs/quick_start/ResNet50_vd.yaml -o Arch.pretrained=True
After training, the model files are generated under the PaddleClas/output/ directory.
Partial results:
CELoss: 0.08261, loss: 0.08261, top1: 0.98000
[2023/06/20 13:36:26] ppcls INFO: [Eval][Epoch 100][Iter: 0/4]CELoss: 0.07598, loss: 0.07598, top1: 0.96875, batch_cost: 0.66089s, reader_cost: 0.56775, ips: 48.41936 images/sec
[2023/06/20 13:36:27] ppcls INFO: [Eval][Epoch 100][Avg]CELoss: 0.03693, loss: 0.03693, top1: 0.99000
[2023/06/20 13:36:27] ppcls INFO: Already save model in ./output/ResNet50_vd/best_model
[2023/06/20 13:36:27] ppcls INFO: [Eval][Epoch 100][best metric: 0.9899999499320984]
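To track accuracy across epochs without reading the log by hand, the per-epoch `[Eval][Epoch N][Avg]` lines can be parsed. This is a minimal sketch: the regular expression is based only on the log lines shown above, so other PaddleClas versions or metrics may need a different pattern, and the function name `eval_top1_by_epoch` is hypothetical.

```python
import re

# Matches the per-epoch evaluation summary, for example:
# "[Eval][Epoch 100][Avg]CELoss: 0.03693, loss: 0.03693, top1: 0.99000"
AVG_PATTERN = re.compile(
    r"\[Eval\]\[Epoch (?P<epoch>\d+)\]\[Avg\].*?top1: (?P<top1>\d+\.\d+)"
)


def eval_top1_by_epoch(log_lines):
    """Return {epoch: top1} for every [Eval][Epoch N][Avg] line."""
    results = {}
    for line in log_lines:
        match = AVG_PATTERN.search(line)
        if match:
            results[int(match.group("epoch"))] = float(match.group("top1"))
    return results


sample = [
    "[2023/06/20 13:36:26] ppcls INFO: [Eval][Epoch 100][Iter: 0/4]CELoss: 0.07598, "
    "loss: 0.07598, top1: 0.96875, batch_cost: 0.66089s",
    "[2023/06/20 13:36:27] ppcls INFO: [Eval][Epoch 100][Avg]CELoss: 0.03693, "
    "loss: 0.03693, top1: 0.99000",
]
print(eval_top1_by_epoch(sample))  # {100: 0.99}
```

Per-iteration lines such as `[Iter: 0/4]` are skipped on purpose, so only the averaged result for each epoch is kept.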
Inference commands
# Inference command for the dark-circles model
python tools/infer.py -c black_ResNet50.yaml -o Infer.infer_imgs=dataset/black/black/2000855.jpg -o Global.pretrained_model=output/ResNet50_vd/best_model
# Inference command for the nasolabial-folds model
python tools/infer.py -c folds_ResNet50.yaml -o Infer.infer_imgs=dataset/folds/folds/2000358.png -o Global.pretrained_model=output/ResNet50_vd/best_model
Serving deployment
Exporting the classification model
Reference: https://github.com/PaddlePaddle/PaddleClas/blob/release/2.5/docs/zh_CN/deployment/export_model.md
https://blog.csdn.net/loutengyuan/article/details/126674945
# Export the inference model for dark circles
python tools/export_model.py \
-c black_ResNet50.yaml \
-o Global.pretrained_model=output/ResNet50_vd/best_model \
-o Global.save_inference_dir=./deploy/model/black_ResNet50_infer
# Export the inference model for nasolabial folds
python tools/export_model.py \
-c folds_ResNet50.yaml \
-o Global.pretrained_model=output/ResNet50_vd/best_model \
-o Global.save_inference_dir=./deploy/model/folds_ResNet50_infer
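The inference model is then saved to the path you specified.
Use the paddle_serving_client command to convert the exported inference model into a format that is easy to deploy on the server: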
cd deploy/model
# Convert the ResNet50_vd models
# Dark circles
python3.7 -m paddle_serving_client.convert \
--dirname ./black_ResNet50_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./black_ResNet50_vd_serving/ \
--serving_client ./black_ResNet50_vd_client/
# Nasolabial folds
python3.7 -m paddle_serving_client.convert \
--dirname ./folds_ResNet50_infer/ \
--model_filename inference.pdmodel \
--params_filename inference.pdiparams \
--serving_server ./folds_ResNet50_vd_serving/ \
--serving_client ./folds_ResNet50_vd_client/
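The conversion commands above read inference.pdmodel and inference.pdiparams from each `--dirname`. Before converting, a quick check that the export step actually produced both files can save a confusing failure. This is a minimal sketch meant to be run from deploy/model; the function name `missing_inference_files` is hypothetical.

```python
from pathlib import Path

# File names passed to --model_filename and --params_filename above.
REQUIRED_FILES = ("inference.pdmodel", "inference.pdiparams")


def missing_inference_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    directory = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (directory / name).is_file()]


if __name__ == "__main__":
    for model_dir in ("black_ResNet50_infer", "folds_ResNet50_infer"):
        missing = missing_inference_files(model_dir)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{model_dir}: {status}")
```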
Commands to start the services
python3.7 folds_classification_web_service.py &>folds_log.txt &
python3.7 black_classification_web_service.py &>black_log.txt &
# Stop the services
python3.7 -m paddle_serving_server.serve stop
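While a service is running, it can be exercised over HTTP. The sketch below follows the request shape used by the pipeline HTTP client example shipped with PaddleClas (a JSON body with `key` and `value` lists, the image sent as base64). Whether the two custom web service scripts here keep that shape is an assumption, and the URL, port, and service name in the usage comment are placeholders that must match what the scripts and their config actually define.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes):
    """Encode raw image bytes into the JSON body of a prediction request."""
    encoded = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({"key": ["image"], "value": [encoded]})


def predict(url, image_path, timeout=10):
    """POST one image to the service and return the decoded JSON response."""
    with open(image_path, "rb") as handle:
        body = build_payload(handle.read()).encode("utf8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf8"))


# Hypothetical usage: 9102 is one of the ports published by the docker run
# command earlier, and "folds" is a placeholder service name.
# print(predict("http://127.0.0.1:9102/folds/prediction",
#               "dataset/folds/folds/2000358.png"))
```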