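Project Overview

This project implements the TSN (Temporal Segment Network) video action classification model using PaddlePaddle's dynamic graph mode, and trains and validates it on a reduced version of the HMDB51 dataset.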

Installation commands

## CPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/cpu paddlepaddle

## GPU version
pip install -f https://paddlepaddle.org.cn/pip/oschina/gpu paddlepaddle-gpu

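Model Overview

Temporal Segment Network (TSN) is a classic 2D-CNN-based approach to video classification. It targets action recognition over long videos: instead of sampling frames densely, it samples them sparsely, which captures information from the whole video while removing redundancy and reducing computation. The per-frame features are then averaged into a single video-level feature, which is used for classification. The model implemented here is a single-stream TSN that takes RGB images only, with ResNet-50 as the backbone.

[Figure: TSN model architecture]

Paper: Temporal Segment Networks: Towards Good Practices for Deep Action Recognition
Recommended blog post: TSN (Temporal Segment Networks) algorithm notes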
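
The sparse sampling and average fusion described above can be sketched in plain Python. This is only an illustration, not the project's reader or model code: the function names are hypothetical, and the choice of a random offset during training and the segment centre at test time is an assumption based on common TSN practice. The values `seg_num=3` and `seglen=1` match the MODEL settings printed later in the evaluation log.

```python
import random


def sample_frame_indices(num_frames, seg_num=3, seglen=1, train=True, rng=random):
    """Pick `seglen` consecutive frames from each of `seg_num` equal segments.

    Hypothetical helper: the project's actual reader may differ.
    """
    avg_len = num_frames // seg_num  # length of one segment
    indices = []
    for seg in range(seg_num):
        start = seg * avg_len
        if avg_len >= seglen:
            if train:
                # Assumption: random position inside the segment while training
                offset = rng.randint(0, avg_len - seglen)
            else:
                # Assumption: centre of the segment at evaluation time
                offset = (avg_len - seglen) // 2
            first = start + offset
        else:
            # Video too short for this segment size: fall back to segment start
            first = start
        for k in range(seglen):
            indices.append(min(first + k, num_frames - 1))
    return indices


def average_consensus(per_frame_scores):
    """Average per-frame class scores into one video-level score vector."""
    n = len(per_frame_scores)
    return [sum(column) / n for column in zip(*per_frame_scores)]


if __name__ == "__main__":
    # A 90-frame video is reduced to 3 frames, one per segment
    print(sample_frame_indices(90, seg_num=3, train=False))
    # Three frames, two classes: the video score is the column-wise mean
    print(average_consensus([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]))
```

Only `seg_num * seglen` frames pass through the backbone per video, regardless of its length, which is where the saving over dense sampling comes from.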

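Data

This project trains and tests on a demo subset of HMDB51: 10 classes taken from the original dataset, with 10 samples per class. The data must be preprocessed before running, as follows:

  • Run avi2jpg.py to split each video file in the hmdb51 data into frames, saved as jpg files in a folder named after the video.
  • Run jpg2pkl.py to store the paths of the jpg files belonging to one video, together with its label, in a pkl file named after the video. For each class, 80% of the videos go to the training set, 10% to the validation set, and 10% to the test set; the split parameters can be changed in the script.
  • Run data_list_gener.py to generate the corresponding train.list, val.list, and test.list files.

Note: the file paths in the scripts above must be adjusted to match your environment.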
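
As a rough illustration of the per-class 80/10/10 split described above, the sketch below divides each class separately and then merges the results. It is not the code in jpg2pkl.py: the function names, the shuffling, and the fixed seed are assumptions made for the example.

```python
import random


def split_class(samples, train_ratio=0.8, val_ratio=0.1, seed=0):
    """Split the samples of ONE class into train / val / test parts."""
    items = list(samples)
    # Assumption: shuffle with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_ratio)
    n_val = round(len(items) * val_ratio)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


def split_dataset(videos_by_class, **kwargs):
    """Apply the per-class split and merge the three parts across classes."""
    train, val, test = [], [], []
    for class_name in sorted(videos_by_class):
        tr, va, te = split_class(videos_by_class[class_name], **kwargs)
        train += tr
        val += va
        test += te
    return train, val, test


if __name__ == "__main__":
    # Hypothetical video names: 10 classes with 10 videos each
    demo = {f"class{c}": [f"class{c}_video{i}" for i in range(10)] for c in range(10)}
    train, val, test = split_dataset(demo)
    print(len(train), len(val), len(test))
```

Splitting per class rather than over the whole pool keeps every class represented in all three sets, which matters when each class has only 10 videos.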

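File Structure

|--configs					# configuration files
|--model					# model definition
|--reader					# data reading
|--data						# data
|--data_list_gener.py				# generates train.list, val.list, test.list
|--infer.py					# model inference
|--avi2jpg.py					# splits videos into frames saved as jpg
|--train.py					# training script
|--utils.py					# common utilities
|--jpg2pkl.py					# packs jpg paths and labels into pkl files
|--config.py					# reads and builds the configuration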
In[1]
!tar xf data/data10072/hmdb_data_demo.tar -C data 2>/dev/null
!python avi2jpg.py
!python jpg2pkl.py
!python data_list_gener.py
{'brush_hair': 0, 'dive': 1, 'dribble': 2, 'catch': 3, 'chew': 4, 'draw_sword': 5, 'clap': 6, 'climb_stairs': 7, 'climb': 8, 'cartwheel': 9}
{'brush_hair': 0, 'dive': 1, 'dribble': 2, 'catch': 3, 'chew': 4, 'draw_sword': 5, 'clap': 6, 'climb_stairs': 7, 'climb': 8, 'cartwheel': 9}
brush_hair_jpg brush_hair
dive_jpg dive
dribble_jpg dribble
catch_jpg catch
chew_jpg chew
draw_sword_jpg draw_sword
clap_jpg clap
climb_stairs_jpg climb_stairs
climb_jpg climb
cartwheel_jpg cartwheel
80
10
10
In[5]
!python train.py --use_gpu True --epoch 50 --pretrain True
# !python train.py --use_gpu True --epoch 10
Loss at epoch 43 step 6: [1.0806234], acc: [0.6]
Loss at epoch 43 step 7: [1.1876048], acc: [0.6]
Loss at epoch 44 step 0: [0.8507346], acc: [0.7]
Loss at epoch 44 step 1: [1.7061481], acc: [0.4]
Loss at epoch 44 step 2: [0.69134974], acc: [0.7]
Loss at epoch 44 step 3: [0.6366395], acc: [0.8]
Loss at epoch 44 step 4: [1.6385108], acc: [0.3]
Loss at epoch 44 step 5: [0.8179408], acc: [0.7]
Loss at epoch 44 step 6: [1.1013589], acc: [0.8]
Loss at epoch 44 step 7: [0.7362377], acc: [0.7]
Loss at epoch 45 step 0: [1.1794446], acc: [0.6]
Loss at epoch 45 step 1: [1.1445456], acc: [0.7]
Loss at epoch 45 step 2: [0.60544634], acc: [0.8]
Loss at epoch 45 step 3: [1.365793], acc: [0.5]
Loss at epoch 45 step 4: [0.83137083], acc: [0.6]
Loss at epoch 45 step 5: [1.2792709], acc: [0.5]
Loss at epoch 45 step 6: [1.4185008], acc: [0.7]
Loss at epoch 45 step 7: [0.58891225], acc: [0.7]
Loss at epoch 46 step 0: [0.6643461], acc: [0.7]
Loss at epoch 46 step 1: [0.9874472], acc: [0.5]
Loss at epoch 46 step 2: [0.35808498], acc: [1.]
Loss at epoch 46 step 3: [0.44791618], acc: [0.9]
Loss at epoch 46 step 4: [1.0225163], acc: [0.5]
Loss at epoch 46 step 5: [0.50967324], acc: [0.9]
Loss at epoch 46 step 6: [0.6359672], acc: [0.7]
Loss at epoch 46 step 7: [1.260846], acc: [0.6]
Loss at epoch 47 step 0: [0.7233087], acc: [0.8]
Loss at epoch 47 step 1: [1.5215951], acc: [0.4]
Loss at epoch 47 step 2: [0.9552907], acc: [0.5]
Loss at epoch 47 step 3: [0.6822446], acc: [0.7]
Loss at epoch 47 step 4: [0.4305572], acc: [0.9]
Loss at epoch 47 step 5: [0.71233517], acc: [0.6]
Loss at epoch 47 step 6: [0.9817282], acc: [0.8]
Loss at epoch 47 step 7: [2.074246], acc: [0.3]
Loss at epoch 48 step 0: [0.95157], acc: [0.6]
Loss at epoch 48 step 1: [1.1726437], acc: [0.6]
Loss at epoch 48 step 2: [1.3101113], acc: [0.4]
Loss at epoch 48 step 3: [0.5663533], acc: [0.8]
Loss at epoch 48 step 4: [0.71649504], acc: [0.7]
Loss at epoch 48 step 5: [0.8225], acc: [0.8]
Loss at epoch 48 step 6: [0.9775295], acc: [0.7]
Loss at epoch 48 step 7: [1.2217357], acc: [0.7]
Loss at epoch 49 step 0: [0.6619309], acc: [0.7]
Loss at epoch 49 step 1: [0.7392478], acc: [0.7]
Loss at epoch 49 step 2: [0.9415503], acc: [0.6]
Loss at epoch 49 step 3: [0.6194843], acc: [0.8]
Loss at epoch 49 step 4: [1.1828535], acc: [0.4]
Loss at epoch 49 step 5: [0.60173714], acc: [0.8]
Loss at epoch 49 step 6: [0.7926864], acc: [0.8]
Loss at epoch 49 step 7: [0.52622277], acc: [0.8]
Final loss: [0.52622277]
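
Each epoch in the log above has 8 steps (80 training videos at a batch size of 10), and each accuracy value covers only 10 videos, so individual steps are noisy. A small sketch like the following, which is not part of the project, averages loss and accuracy per epoch from the printed lines. The regular expression assumes the exact log format shown above.

```python
import re
from collections import defaultdict

# Matches lines such as: Loss at epoch 46 step 2: [0.35808498], acc: [1.]
LOG_LINE = re.compile(
    r"Loss at epoch (\d+) step (\d+): \[([\d.eE+-]+)\], acc: \[([\d.eE+-]+)\]"
)


def epoch_means(log_text):
    """Return {epoch: (mean_loss, mean_acc)} for all matching log lines."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # loss sum, acc sum, step count
    for match in LOG_LINE.finditer(log_text):
        epoch = int(match.group(1))
        sums[epoch][0] += float(match.group(3))
        sums[epoch][1] += float(match.group(4))
        sums[epoch][2] += 1
    return {
        epoch: (loss_sum / count, acc_sum / count)
        for epoch, (loss_sum, acc_sum, count) in sorted(sums.items())
    }


if __name__ == "__main__":
    sample = (
        "Loss at epoch 46 step 2: [0.35808498], acc: [1.]\n"
        "Loss at epoch 46 step 3: [0.44791618], acc: [0.9]\n"
    )
    for epoch, (loss, acc) in epoch_means(sample).items():
        print(f"epoch {epoch}: mean loss {loss:.4f}, mean acc {acc:.2f}")
```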
In[6]
!python eval.py --weights 'checkpoints_models/tsn_model' --use_gpu True
[INFO: eval.py:  118]: Namespace(batch_size=1, config='configs/tsn.txt', filelist=None, infer_topk=1, log_interval=1, model_name='tsn', save_dir='./output', use_gpu=True, weights='checkpoints_models/tsn_model')
{'MODEL': {'name': 'TSN', 'format': 'pkl', 'num_classes': 10, 'seg_num': 3, 'seglen': 1, 'image_mean': [0.485, 0.456, 0.406], 'image_std': [0.229, 0.224, 0.225], 'num_layers': 50}, 'TRAIN': {'epoch': 45, 'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 10, 'use_gpu': True, 'num_gpus': 1, 'filelist': './data/hmdb_data_demo/train.list', 'learning_rate': 0.01, 'learning_rate_decay': 0.1, 'l2_weight_decay': 0.0001, 'momentum': 0.9, 'total_videos': 80}, 'VALID': {'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 2, 'filelist': './data/hmdb_data_demo/val.list'}, 'TEST': {'seg_num': 7, 'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 10, 'filelist': './data/hmdb_data_demo/test.list'}, 'INFER': {'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 1, 'filelist': './data/hmdb_data_demo/test.list'}}
[INFO: config.py:   68]: ---------------- Valid Arguments ----------------
[INFO: config.py:   70]: MODEL:
[INFO: config.py:   72]:     name:TSN
[INFO: config.py:   72]:     format:pkl
[INFO: config.py:   72]:     num_classes:10
[INFO: config.py:   72]:     seg_num:3
[INFO: config.py:   72]:     seglen:1
[INFO: config.py:   72]:     image_mean:[0.485, 0.456, 0.406]
[INFO: config.py:   72]:     image_std:[0.229, 0.224, 0.225]
[INFO: config.py:   72]:     num_layers:50
[INFO: config.py:   70]: TRAIN:
[INFO: config.py:   72]:     epoch:45
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:10
[INFO: config.py:   72]:     use_gpu:True
[INFO: config.py:   72]:     num_gpus:1
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/train.list
[INFO: config.py:   72]:     learning_rate:0.01
[INFO: config.py:   72]:     learning_rate_decay:0.1
[INFO: config.py:   72]:     l2_weight_decay:0.0001
[INFO: config.py:   72]:     momentum:0.9
[INFO: config.py:   72]:     total_videos:80
[INFO: config.py:   70]: VALID:
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:1
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/val.list
[INFO: config.py:   70]: TEST:
[INFO: config.py:   72]:     seg_num:7
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:10
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/test.list
[INFO: config.py:   70]: INFER:
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:1
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/test.list
[INFO: config.py:   73]: -------------------------------------------------
W0306 17:51:40.394389  2943 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0306 17:51:40.398815  2943 device_context.cc:245] device: 0, cuDNN Version: 7.3.
Validation accuracy: 0.699999988079071
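
The value 0.699999988079071 is what 0.7 looks like after a round trip through single precision, which is consistent with 7 of the 10 validation videos being classified correctly and the accuracy being held as a float32. The check below uses only the standard library; that eval.py actually stores the accuracy as float32 is an assumption inferred from the printed digits.

```python
import struct


def as_float32(x):
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    # 7 correct out of 10, stored in single precision
    print(as_float32(7 / 10))
```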
In[7]
!python infer.py --weights 'checkpoints_models/tsn_model' --use_gpu True
[INFO: infer.py:  114]: Namespace(batch_size=1, config='configs/tsn.txt', filelist=None, infer_topk=1, log_interval=1, model_name='tsn', save_dir='./output', use_gpu=True, weights='checkpoints_models/tsn_model')
{'MODEL': {'name': 'TSN', 'format': 'pkl', 'num_classes': 10, 'seg_num': 3, 'seglen': 1, 'image_mean': [0.485, 0.456, 0.406], 'image_std': [0.229, 0.224, 0.225], 'num_layers': 50}, 'TRAIN': {'epoch': 45, 'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 10, 'use_gpu': True, 'num_gpus': 1, 'filelist': './data/hmdb_data_demo/train.list', 'learning_rate': 0.01, 'learning_rate_decay': 0.1, 'l2_weight_decay': 0.0001, 'momentum': 0.9, 'total_videos': 80}, 'VALID': {'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 2, 'filelist': './data/hmdb_data_demo/val.list'}, 'TEST': {'seg_num': 7, 'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 10, 'filelist': './data/hmdb_data_demo/test.list'}, 'INFER': {'short_size': 240, 'target_size': 224, 'num_reader_threads': 1, 'buf_size': 1024, 'batch_size': 1, 'filelist': './data/hmdb_data_demo/test.list'}}
[INFO: config.py:   68]: ---------------- Infer Arguments ----------------
[INFO: config.py:   70]: MODEL:
[INFO: config.py:   72]:     name:TSN
[INFO: config.py:   72]:     format:pkl
[INFO: config.py:   72]:     num_classes:10
[INFO: config.py:   72]:     seg_num:3
[INFO: config.py:   72]:     seglen:1
[INFO: config.py:   72]:     image_mean:[0.485, 0.456, 0.406]
[INFO: config.py:   72]:     image_std:[0.229, 0.224, 0.225]
[INFO: config.py:   72]:     num_layers:50
[INFO: config.py:   70]: TRAIN:
[INFO: config.py:   72]:     epoch:45
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:10
[INFO: config.py:   72]:     use_gpu:True
[INFO: config.py:   72]:     num_gpus:1
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/train.list
[INFO: config.py:   72]:     learning_rate:0.01
[INFO: config.py:   72]:     learning_rate_decay:0.1
[INFO: config.py:   72]:     l2_weight_decay:0.0001
[INFO: config.py:   72]:     momentum:0.9
[INFO: config.py:   72]:     total_videos:80
[INFO: config.py:   70]: VALID:
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:2
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/val.list
[INFO: config.py:   70]: TEST:
[INFO: config.py:   72]:     seg_num:7
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:10
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/test.list
[INFO: config.py:   70]: INFER:
[INFO: config.py:   72]:     short_size:240
[INFO: config.py:   72]:     target_size:224
[INFO: config.py:   72]:     num_reader_threads:1
[INFO: config.py:   72]:     buf_size:1024
[INFO: config.py:   72]:     batch_size:1
[INFO: config.py:   72]:     filelist:./data/hmdb_data_demo/test.list
[INFO: config.py:   73]: -------------------------------------------------
W0306 17:51:51.912968  3005 device_context.cc:237] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0306 17:51:51.917469  3005 device_context.cc:245] device: 0, cuDNN Version: 7.3.
True label ['boom_snap_clap_(challenge)_clap_u_nm_np1_fr_med_0'], predicted: clap
True label ['Basic_Basketball_Moves_dribble_f_cm_np1_ri_goo_7'], predicted: dribble
True label ['Bubblegum_Wigger_chew_h_nm_np1_fr_goo_0'], predicted: climb_stairs
True label ['Torwarttraining_Arminia_Bielefeld_catch_f_cm_np1_ri_med_5'], predicted: catch
True label ['Eishin_Ryu_Iaido_-_Toho_Forms_draw_sword_f_cm_np1_ri_med_0'], predicted: cartwheel
True label ['H_I_I_T__Swamis_stairs_with_Max_Wettstein_featuring_Donna_Wettstein_climb_stairs_f_cm_np1_ba_med_4'], predicted: brush_hair
True label ['Beim_Radschlag_Hose_gerissen_(Nick)_xD_cartwheel_f_cm_np1_ri_med_0'], predicted: cartwheel
True label ['BASE_JUMPING_COMPILATION_PART_3_AMAZING!!!!!!!_dive_f_cm_np1_fr_bad_5'], predicted: catch
True label ['Brunette_Foxyanya_ultra_silky_long_hair_brushing_hairjob_brush_hair_f_nm_np1_le_goo_3'], predicted: brush_hair
True label ['Axel_beim_Klettern_an_der_Unisportwand_climb_f_cm_np2_ba_med_0'], predicted: climb
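
The inference output prints the video name rather than the class, so the true class has to be read off the name. The sketch below is not part of infer.py. It assumes the HMDB51 naming convention in which the class name is followed by six underscore-separated metadata fields, and it takes the class list from the label map printed during preprocessing. Applied to the ten results above, it counts 6 of 10 predictions as correct.

```python
CLASSES = [
    "brush_hair", "dive", "dribble", "catch", "chew",
    "draw_sword", "clap", "climb_stairs", "climb", "cartwheel",
]


def true_class(video_name, classes=CLASSES):
    """Recover the class from a video name, or None if nothing matches."""
    # Assumption: the last six fields are metadata, e.g. f_cm_np1_ri_med_0
    stem = video_name.rsplit("_", 6)[0]
    matches = [c for c in classes if stem == c or stem.endswith("_" + c)]
    # Prefer the longest match so climb_stairs is not mistaken for climb
    return max(matches, key=len) if matches else None


# (video name, predicted class) pairs copied from the inference output
RESULTS = [
    ("boom_snap_clap_(challenge)_clap_u_nm_np1_fr_med_0", "clap"),
    ("Basic_Basketball_Moves_dribble_f_cm_np1_ri_goo_7", "dribble"),
    ("Bubblegum_Wigger_chew_h_nm_np1_fr_goo_0", "climb_stairs"),
    ("Torwarttraining_Arminia_Bielefeld_catch_f_cm_np1_ri_med_5", "catch"),
    ("Eishin_Ryu_Iaido_-_Toho_Forms_draw_sword_f_cm_np1_ri_med_0", "cartwheel"),
    ("H_I_I_T__Swamis_stairs_with_Max_Wettstein_featuring_Donna_Wettstein"
     "_climb_stairs_f_cm_np1_ba_med_4", "brush_hair"),
    ("Beim_Radschlag_Hose_gerissen_(Nick)_xD_cartwheel_f_cm_np1_ri_med_0", "cartwheel"),
    ("BASE_JUMPING_COMPILATION_PART_3_AMAZING!!!!!!!_dive_f_cm_np1_fr_bad_5", "catch"),
    ("Brunette_Foxyanya_ultra_silky_long_hair_brushing_hairjob_brush_hair"
     "_f_nm_np1_le_goo_3", "brush_hair"),
    ("Axel_beim_Klettern_an_der_Unisportwand_climb_f_cm_np2_ba_med_0", "climb"),
]


def count_correct(results):
    """Count predictions that equal the class recovered from the video name."""
    return sum(true_class(name) == predicted for name, predicted in results)


if __name__ == "__main__":
    print(f"{count_correct(RESULTS)}/{len(RESULTS)} correct")
```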
 


To try this project hands-on in AI Studio, follow this link: https://aistudio.baidu.com/aistudio/projectdetail/205004


09-05 00:38