
Video Action Classification with Keras

Processing temporal information with an LSTM or RNN is very time-consuming.

Instead, we first train a CNN to classify the sports activity in individual frames, and then apply it to video classification using a rolling prediction average.

Naive approach

  1. Loop over all frames in the video file
  2. For each frame, pass the frame through the CNN
  3. Classify each frame individually and independently of each other
  4. Choose the label with the largest corresponding probability
  5. Label the frame and write the output frame to disk

If you've ever tried to apply simple image classification to video classification, you have likely encountered "prediction flickering": the predicted label rapidly switches between classes on consecutive frames.

It is therefore important to keep the CNN from flickering between two labels.

Rolling prediction averaging

This method assumes that consecutive frames of a video share the same semantic content.

  1. Loop over all frames in the video file
  2. For each frame, pass the frame through the CNN
  3. Obtain the predictions from the CNN
  4. Maintain a list of the last K predictions
  5. Compute the average of the last K predictions and choose the label with the largest corresponding probability (with three classes per frame, the K×3 prediction matrix is averaged into a single 3-vector, whose largest entry gives the label; see the sketch after this list)
  6. Label the frame and write the output frame to disk
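
To make steps 4 and 5 concrete, here is a minimal sketch of the averaging logic, assuming a three-class model; the per-frame probability vectors below are made up for illustration:

import numpy as np
from collections import deque

K = 3
Q = deque(maxlen=K)  # automatically discards predictions older than K frames

# hypothetical per-frame softmax outputs from the CNN; the middle
# frame "flickers" toward a different class
frame_predictions = [np.array([0.7, 0.2, 0.1]),
                     np.array([0.3, 0.6, 0.1]),
                     np.array([0.6, 0.3, 0.1])]

for preds in frame_predictions:
    naive_label = np.argmax(preds)       # naive per-frame argmax -> 0, 1, 0
    Q.append(preds)
    results = np.array(Q).mean(axis=0)   # average the len(Q) x 3 matrix into a 3-vector
    smoothed_label = np.argmax(results)  # rolling average -> 0, 0, 0 (no flicker)
    print(naive_label, smoothed_label)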

Implementation

Code:

https://github.com/perfectism13/learning_colab

Setup

# mount Google Drive and change into the project directory
from google.colab import drive
drive.mount('/content/drive')
import os
os.chdir(r'/content/drive/My Drive/colab/keras-video-classification/keras-video-classification')
print(os.getcwd())
!ls

# extract the sports dataset archive and inspect the directory layout
!apt-get install p7zip
!7z x sports-type-classifier-data.7z
!ls
!tree --dirsfirst --filelimit 50

Training the model

# USAGE
# python train.py --dataset Sports-Type-Classifier/data --model model/activity.model --label-bin model/lb.pickle --epochs 50

# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.layers.pooling import AveragePooling2D
from keras.applications import ResNet50
from keras.layers.core import Dropout
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras.layers import Input
from keras.models import Model
from keras.optimizers import SGD
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import pickle
import cv2
import os

# construct the argument parser and parse the arguments
# (these are the command-line options used when running the script)
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to input dataset")
ap.add_argument("-m", "--model", required=True,
    help="path to output serialized model")
ap.add_argument("-l", "--label-bin", required=True,
    help="path to output label binarizer")
ap.add_argument("-e", "--epochs", type=int, default=25,
    help="# of epochs to train our network for")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
    help="path to output loss/accuracy plot")
args = vars(ap.parse_args())

# initialize the set of labels from the sports activity dataset we are
# going to train our network on
LABELS = set(["weight_lifting", "tennis", "football"])

# grab the list of images in our dataset directory (via imutils.paths),
# then initialize the list of data (i.e., images) and class labels
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))
data = []
labels = []

# loop over the image paths
for imagePath in imagePaths:
    # extract the class label from the filename
    label = imagePath.split(os.path.sep)[-2]

    # if the label of the current image is not part of the labels
    # we are interested in, then ignore the image
    if label not in LABELS:
        continue

    # load the image, convert it to RGB channel ordering, and resize
    # it to be a fixed 224x224 pixels, ignoring aspect ratio
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))

    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)

# convert the data and labels to NumPy arrays
data = np.array(data)
labels = np.array(labels)

# perform one-hot encoding on the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)

# partition the data into training and testing splits using 75% of
# the data for training and the remaining 25% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.25, stratify=labels, random_state=42)

# initialize the training data augmentation object
trainAug = ImageDataGenerator(
    rotation_range=30,
    zoom_range=0.15,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.15,
    horizontal_flip=True,
    fill_mode="nearest")

# initialize the validation/testing data augmentation object (no
# augmentation is applied at test time; we'll only be adding mean
# subtraction to it)
valAug = ImageDataGenerator()

# define the ImageNet mean subtraction (in RGB order) and set the
# mean subtraction value for each of the data augmentation objects
# (these per-channel means are used to normalize the images)
mean = np.array([123.68, 116.779, 103.939], dtype="float32")
trainAug.mean = mean
valAug.mean = mean

# load the pre-trained ResNet-50 network from keras.applications,
# ensuring the head FC layer sets are left off
baseModel = ResNet50(weights="imagenet", include_top=False,
    input_tensor=Input(shape=(224, 224, 3)))

# construct the head of the model that will be placed on top of the
# base model
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(len(lb.classes_), activation="softmax")(headModel)

# place the head FC model on top of the base model (this will become
# the actual model we will fine-tune)
model = Model(inputs=baseModel.input, outputs=headModel)

# loop over all layers in the base model and freeze them so they will
# *not* be updated during the training process
for layer in baseModel.layers:
    layer.trainable = False

# compile our model with an initial learning rate and SGD optimizer
# (this needs to be done after setting our layers to be non-trainable)
print("[INFO] compiling model...")
opt = SGD(lr=1e-4, momentum=0.9, decay=1e-4 / args["epochs"])
model.compile(loss="categorical_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train the head of the network for a few epochs (all other layers
# are frozen) -- this will allow the new FC layers to start to become
# initialized with actual "learned" values versus pure random;
# fit_generator trains on the batches produced by the data generator
print("[INFO] training head...")
H = model.fit_generator(
    trainAug.flow(trainX, trainY, batch_size=32),
    steps_per_epoch=len(trainX) // 32,
    validation_data=valAug.flow(testX, testY),
    validation_steps=len(testX) // 32,
    epochs=args["epochs"])

# evaluate the network using sklearn's classification_report
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=32)
print(classification_report(testY.argmax(axis=1),
    predictions.argmax(axis=1), target_names=lb.classes_))

# plot the training loss and accuracy
N = args["epochs"]
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")  # place the legend in the lower-left corner
plt.savefig(args["plot"])

# serialize the trained model to disk
print("[INFO] serializing network...")
model.save(args["model"])

# serialize the label binarizer to disk
f = open(args["label_bin"], "wb")
f.write(pickle.dumps(lb))
f.close()

!python train.py --dataset data --model output/activity.model \
    --label-bin output/lb.pickle --epochs 50
Using TensorFlow backend.
[INFO] loading images...
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94658560/94653016 [==============================] - 7s 0us/step
[INFO] compiling model...
[INFO] training head...
Epoch 1/50
48/48 [==============================] - 29s 597ms/step - loss: 1.2660 - acc: 0.3822 - val_loss: 0.9862 - val_acc: 0.5332
Epoch 2/50
48/48 [==============================] - 24s 504ms/step - loss: 1.0399 - acc: 0.4962 - val_loss: 0.8028 - val_acc: 0.6337
Epoch 3/50
48/48 [==============================] - 24s 496ms/step - loss: 0.8882 - acc: 0.5950 - val_loss: 0.6543 - val_acc: 0.7119
Epoch 4/50
48/48 [==============================] - 23s 478ms/step - loss: 0.8053 - acc: 0.6517 - val_loss: 0.5938 - val_acc: 0.7510
Epoch 5/50
48/48 [==============================] - 23s 479ms/step - loss: 0.7165 - acc: 0.6940 - val_loss: 0.4964 - val_acc: 0.8045
Epoch 6/50
48/48 [==============================] - 24s 490ms/step - loss: 0.6342 - acc: 0.7500 - val_loss: 0.4970 - val_acc: 0.8148
Epoch 7/50
48/48 [==============================] - 23s 471ms/step - loss: 0.6217 - acc: 0.7526 - val_loss: 0.4313 - val_acc: 0.8313
Epoch 8/50
48/48 [==============================] - 22s 464ms/step - loss: 0.5881 - acc: 0.7832 - val_loss: 0.3984 - val_acc: 0.8683
Epoch 9/50
48/48 [==============================] - 23s 475ms/step - loss: 0.5611 - acc: 0.7741 - val_loss: 0.4083 - val_acc: 0.8498
Epoch 10/50
48/48 [==============================] - 23s 471ms/step - loss: 0.5241 - acc: 0.8073 - val_loss: 0.3564 - val_acc: 0.8807
Epoch 11/50
48/48 [==============================] - 22s 459ms/step - loss: 0.4947 - acc: 0.8171 - val_loss: 0.3755 - val_acc: 0.8642
Epoch 12/50
48/48 [==============================] - 22s 467ms/step - loss: 0.4841 - acc: 0.8190 - val_loss: 0.3499 - val_acc: 0.8704
Epoch 13/50
48/48 [==============================] - 22s 463ms/step - loss: 0.4801 - acc: 0.8196 - val_loss: 0.3397 - val_acc: 0.8704
Epoch 14/50
48/48 [==============================] - 22s 455ms/step - loss: 0.4430 - acc: 0.8242 - val_loss: 0.3491 - val_acc: 0.8786
Epoch 15/50
48/48 [==============================] - 22s 456ms/step - loss: 0.4179 - acc: 0.8477 - val_loss: 0.2985 - val_acc: 0.8909
Epoch 16/50
48/48 [==============================] - 22s 448ms/step - loss: 0.4189 - acc: 0.8379 - val_loss: 0.3302 - val_acc: 0.8807
Epoch 17/50
48/48 [==============================] - 22s 454ms/step - loss: 0.4167 - acc: 0.8425 - val_loss: 0.3139 - val_acc: 0.8868
Epoch 18/50
48/48 [==============================] - 22s 464ms/step - loss: 0.3985 - acc: 0.8444 - val_loss: 0.3067 - val_acc: 0.8926
Epoch 19/50
48/48 [==============================] - 21s 447ms/step - loss: 0.3881 - acc: 0.8587 - val_loss: 0.3024 - val_acc: 0.8951
Epoch 20/50
48/48 [==============================] - 22s 459ms/step - loss: 0.3944 - acc: 0.8548 - val_loss: 0.2902 - val_acc: 0.8992
Epoch 21/50
48/48 [==============================] - 21s 446ms/step - loss: 0.3782 - acc: 0.8620 - val_loss: 0.2824 - val_acc: 0.9115
Epoch 22/50
48/48 [==============================] - 21s 440ms/step - loss: 0.4034 - acc: 0.8412 - val_loss: 0.3105 - val_acc: 0.9012
Epoch 23/50
48/48 [==============================] - 22s 461ms/step - loss: 0.3600 - acc: 0.8691 - val_loss: 0.2667 - val_acc: 0.9012
Epoch 24/50
48/48 [==============================] - 22s 456ms/step - loss: 0.3357 - acc: 0.8750 - val_loss: 0.2630 - val_acc: 0.9177
Epoch 25/50
48/48 [==============================] - 21s 440ms/step - loss: 0.3411 - acc: 0.8737 - val_loss: 0.3130 - val_acc: 0.8951
Epoch 26/50
48/48 [==============================] - 22s 460ms/step - loss: 0.3312 - acc: 0.8848 - val_loss: 0.2747 - val_acc: 0.9198
Epoch 27/50
48/48 [==============================] - 22s 448ms/step - loss: 0.3401 - acc: 0.8691 - val_loss: 0.2811 - val_acc: 0.9033
Epoch 28/50
48/48 [==============================] - 21s 445ms/step - loss: 0.3326 - acc: 0.8776 - val_loss: 0.2546 - val_acc: 0.9280
Epoch 29/50
48/48 [==============================] - 21s 442ms/step - loss: 0.3445 - acc: 0.8692 - val_loss: 0.2683 - val_acc: 0.9156
Epoch 30/50
48/48 [==============================] - 22s 454ms/step - loss: 0.3072 - acc: 0.8854 - val_loss: 0.2658 - val_acc: 0.9095
Epoch 31/50
48/48 [==============================] - 22s 449ms/step - loss: 0.3029 - acc: 0.8978 - val_loss: 0.2819 - val_acc: 0.9033
Epoch 32/50
48/48 [==============================] - 22s 457ms/step - loss: 0.3123 - acc: 0.8900 - val_loss: 0.2680 - val_acc: 0.9198
Epoch 33/50
48/48 [==============================] - 22s 450ms/step - loss: 0.3020 - acc: 0.8867 - val_loss: 0.2422 - val_acc: 0.9198
Epoch 34/50
48/48 [==============================] - 22s 463ms/step - loss: 0.3244 - acc: 0.8802 - val_loss: 0.2712 - val_acc: 0.9115
Epoch 35/50
48/48 [==============================] - 22s 462ms/step - loss: 0.2898 - acc: 0.8952 - val_loss: 0.2588 - val_acc: 0.9160
Epoch 36/50
48/48 [==============================] - 22s 454ms/step - loss: 0.3157 - acc: 0.8854 - val_loss: 0.2807 - val_acc: 0.9095
Epoch 37/50
48/48 [==============================] - 22s 465ms/step - loss: 0.3021 - acc: 0.8848 - val_loss: 0.2514 - val_acc: 0.9218
Epoch 38/50
48/48 [==============================] - 22s 453ms/step - loss: 0.3053 - acc: 0.8731 - val_loss: 0.2646 - val_acc: 0.9053
Epoch 39/50
48/48 [==============================] - 22s 454ms/step - loss: 0.2845 - acc: 0.8874 - val_loss: 0.2502 - val_acc: 0.9177
Epoch 40/50
48/48 [==============================] - 21s 443ms/step - loss: 0.2850 - acc: 0.8972 - val_loss: 0.2571 - val_acc: 0.9136
Epoch 41/50
48/48 [==============================] - 22s 461ms/step - loss: 0.2892 - acc: 0.8997 - val_loss: 0.2667 - val_acc: 0.9156
Epoch 42/50
48/48 [==============================] - 22s 450ms/step - loss: 0.2804 - acc: 0.8971 - val_loss: 0.2466 - val_acc: 0.9115
Epoch 43/50
48/48 [==============================] - 21s 436ms/step - loss: 0.2756 - acc: 0.8978 - val_loss: 0.2548 - val_acc: 0.9177
Epoch 44/50
48/48 [==============================] - 21s 445ms/step - loss: 0.2730 - acc: 0.9010 - val_loss: 0.2562 - val_acc: 0.9239
Epoch 45/50
48/48 [==============================] - 21s 443ms/step - loss: 0.2724 - acc: 0.9017 - val_loss: 0.2561 - val_acc: 0.9259
Epoch 46/50
48/48 [==============================] - 22s 455ms/step - loss: 0.2871 - acc: 0.8906 - val_loss: 0.2700 - val_acc: 0.9239
Epoch 47/50
48/48 [==============================] - 22s 456ms/step - loss: 0.2734 - acc: 0.9017 - val_loss: 0.2244 - val_acc: 0.9259
Epoch 48/50
48/48 [==============================] - 22s 453ms/step - loss: 0.2570 - acc: 0.9062 - val_loss: 0.2566 - val_acc: 0.9156
Epoch 49/50
48/48 [==============================] - 22s 450ms/step - loss: 0.2457 - acc: 0.9069 - val_loss: 0.2649 - val_acc: 0.9177
Epoch 50/50
48/48 [==============================] - 21s 446ms/step - loss: 0.2753 - acc: 0.8997 - val_loss: 0.2346 - val_acc: 0.9300
[INFO] evaluating network...
                precision    recall  f1-score   support

      football       0.92      0.95      0.93       196
        tennis       0.92      0.91      0.92       179
weight_lifting       0.94      0.91      0.92       143

      accuracy                           0.92       518
     macro avg       0.93      0.92      0.92       518
  weighted avg       0.92      0.92      0.92       518

[INFO] serializing network...

Applying the model to video classification with rolling prediction averaging

# USAGE
# python predict_video.py --model model/activity.model --label-bin model/lb.pickle --input example_clips/lifting.mp4 --output output/lifting_128avg.avi --size 128

# import the necessary packages
from keras.models import load_model
from collections import deque  # deque implements the rolling average window
from matplotlib import pyplot as plt
import numpy as np
import argparse
import pickle
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", required=True,
    help="path to trained serialized model")
ap.add_argument("-l", "--label-bin", required=True,
    help="path to label binarizer")
ap.add_argument("-i", "--input", required=True,
    help="path to our input video")
ap.add_argument("-o", "--output", required=True,
    help="path to our output video")
ap.add_argument("-s", "--size", type=int, default=128,
    help="size of queue for averaging")
args = vars(ap.parse_args())

# load the trained model and label binarizer from disk, using the
# paths given on the command line
print("[INFO] loading model and label binarizer...")
model = load_model(args["model"])
lb = pickle.loads(open(args["label_bin"], "rb").read())

# initialize the image mean for mean subtraction (ImageNet means,
# in RGB order) along with the predictions queue
mean = np.array([123.68, 116.779, 103.939], dtype="float32")
# Q is a double-ended queue whose maximum length is set by --size
Q = deque(maxlen=args["size"])

# initialize the video stream (OpenCV's VideoCapture), pointer to
# output video file, and frame dimensions
vs = cv2.VideoCapture(args["input"])
writer = None
(W, H) = (None, None)

# loop over frames from the video file stream
while True:
    # read the next frame from the file
    (grabbed, frame) = vs.read()

    # if the frame was not grabbed, then we have reached the end
    # of the stream
    if not grabbed:
        break

    # if the frame dimensions are empty, grab them
    if W is None or H is None:
        (H, W) = frame.shape[:2]

    # clone the output frame, then convert it from BGR to RGB
    # ordering, resize the frame to a fixed 224x224 (casting the
    # pixels to float32), and then perform mean subtraction
    output = frame.copy()
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame = cv2.resize(frame, (224, 224)).astype("float32")
    frame -= mean

    # make predictions on the frame (expand_dims adds the batch
    # dimension) and then append them to the predictions queue
    preds = model.predict(np.expand_dims(frame, axis=0))[0]
    Q.append(preds)

    # perform prediction averaging over the current history of
    # previous predictions: average the queued probability vectors
    # first, then take the argmax
    results = np.array(Q).mean(axis=0)
    i = np.argmax(results)
    label = lb.classes_[i]

    # draw the activity on the output frame
    text = "activity: {}".format(label)
    cv2.putText(output, text, (35, 50), cv2.FONT_HERSHEY_SIMPLEX,
        1.25, (0, 255, 0), 5)

    # check if the video writer is None
    if writer is None:
        # initialize our video writer
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(args["output"], fourcc, 30,
            (W, H), True)

    # write the output frame to disk
    writer.write(output)

    # show the output image (cv2.imshow raises "cannot connect to
    # X server" on a headless machine such as Colab, so draw with
    # matplotlib instead; output is BGR, so reverse the channels)
    # cv2.imshow("Output", output)
    plt.imshow(output[:, :, ::-1])
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# release the file pointers
print("[INFO] cleaning up...")
writer.release()
vs.release()
!sudo apt-get install tree
!tree --dirsfirst --filelimit 50
.
├── data
│ ├── badminton [938 entries exceeds filelimit, not opening dir]
│ ├── baseball [746 entries exceeds filelimit, not opening dir]
│ ├── basketball [495 entries exceeds filelimit, not opening dir]
│ ├── boxing [705 entries exceeds filelimit, not opening dir]
│ ├── chess [481 entries exceeds filelimit, not opening dir]
│ ├── cricket [715 entries exceeds filelimit, not opening dir]
│ ├── fencing [635 entries exceeds filelimit, not opening dir]
│ ├── football [799 entries exceeds filelimit, not opening dir]
│ ├── formula1 [687 entries exceeds filelimit, not opening dir]
│ ├── gymnastics [719 entries exceeds filelimit, not opening dir]
│ ├── hockey [572 entries exceeds filelimit, not opening dir]
│ ├── ice_hockey [715 entries exceeds filelimit, not opening dir]
│ ├── kabaddi [454 entries exceeds filelimit, not opening dir]
│ ├── models
│ │ ├── res50-stage-1.pth
│ │ ├── res50-stage-2.pth
│ │ ├── res50-stage-3.pth
│ │ ├── stage-1.pth
│ │ ├── tmp.pth
│ │ └── unfreeze-stage-1.pth
│ ├── motogp [679 entries exceeds filelimit, not opening dir]
│ ├── shooting [536 entries exceeds filelimit, not opening dir]
│ ├── swimming [689 entries exceeds filelimit, not opening dir]
│ ├── table_tennis [713 entries exceeds filelimit, not opening dir]
│ ├── tennis [718 entries exceeds filelimit, not opening dir]
│ ├── volleyball [713 entries exceeds filelimit, not opening dir]
│ ├── weight_lifting [577 entries exceeds filelimit, not opening dir]
│ ├── wrestling [611 entries exceeds filelimit, not opening dir]
│ ├── wwe [671 entries exceeds filelimit, not opening dir]
│ ├── badminton_urls.txt
│ ├── baseball_urls.txt
│ ├── basketball_urls.txt
│ ├── boxing_urls.txt
│ ├── chess_urls.txt
│ ├── cleaned.csv
│ ├── cricket_urls.txt
│ ├── export.pkl
│ ├── fencing_urls.txt
│ ├── football_urls.txt
│ ├── formula1_urls.txt
│ ├── gymnastics_urls.txt
│ ├── hockey_urls.txt
│ ├── ice_hockey_urls.txt
│ ├── kabaddi_urls.txt
│ ├── motogp_urls.txt
│ ├── shooting_urls.txt
│ ├── swimming_urls.txt
│ ├── table_tennis_urls.txt
│ ├── tennis_urls.txt
│ ├── volleyball_urls.txt
│ ├── weight_lifting_urls.txt
│ ├── wrestling_urls.txt
│ └── wwe_urls.txt
├── example_clips
│ ├── lifting.mp4
│ ├── soccer.mp4
│ └── tennis.mp4
├── model
├── output
│ ├── activity.model
│ ├── lb.pickle
│ ├── tennis_128frames_smoothened.avi
│ ├── tennis_128frames_smoothened (convert-video-online.com).mp4
│ └── tennis_1frame.avi
├── Sports-Type-Classifier
│ ├── readme_images
│ │ ├── acc_sports.png
│ │ ├── cric.png
│ │ ├── heat_cric.png
│ │ ├── si_sports.png
│ │ ├── sports_confusion_matrix.png
│ │ ├── sports_data_aug.png
│ │ └── sports.png
│ ├── README.md
│ └── sports_classifier.ipynb
├── plot.png
├── predict_video.py
├── sports_classifier.ipynb
├── sports-type-classifier-data.7z
├── sports-type-classifier-data.zip
└── train.py

29 directories, 53 files
!python predict_video.py --model output/activity.model \
--label-bin output/lb.pickle \
--input example_clips/lifting.mp4 \
--output output/lifting_128frame.avi \
--size 128
!python predict_video.py --model output/activity.model \
--label-bin output/lb.pickle \
--input example_clips/tennis.mp4 \
--output output/tennis_128frames_smoothened.avi \
--size 128
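
For comparison, passing --size 1 caps the deque at a single prediction, so the "average" is just the current frame's output and the naive per-frame method is reproduced; the tennis_1frame.avi file in the directory listing above was presumably generated this way:

!python predict_video.py --model output/activity.model \
    --label-bin output/lb.pickle \
    --input example_clips/tennis.mp4 \
    --output output/tennis_1frame.avi \
    --size 1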

References

  1. https://www.pyimagesearch.com/2019/07/15/video-classification-with-keras-and-deep-learning/