A Detailed Guide to Image Recognition in Python with ImageAI

Date: 2023-02-03 心随而动

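Background

ImageAI is a Python library for computer vision programming that supports state-of-the-art machine learning algorithms. Its main application areas include image prediction, object detection, and video object detection and tracking. With ImageAI, developers can build applications with deep learning and computer vision capabilities in just a few lines of code.

ImageAI currently supports image prediction and training on the ImageNet dataset with several different machine learning algorithms. The ImageNet project began in 2006 as an ongoing research effort to give researchers around the world an easily accessible image database.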

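Image Prediction

Algorithms

Image prediction is the process of using a predictor built on one of several algorithms to analyze an input image or video frame and return the names of the objects it contains, together with their percentage probabilities.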
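To make "object names with percentage probabilities" concrete, here is a minimal sketch of how raw class scores can be turned into a ranked list of percentages. It assumes the classifier ends in a softmax layer, which is common for ImageNet models but is not something this article states about ImageAI's internals. The function name top_predictions, the labels, and the scores are all hypothetical.

```python
import math

def top_predictions(scores, labels, result_count=5):
    """Turn raw class scores into (label, percentage) pairs, highest first."""
    # Softmax; subtracting the largest score keeps exp() numerically stable
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    percentages = [100.0 * e / total for e in exps]
    # Rank classes by percentage and keep the best result_count entries
    ranked = sorted(zip(labels, percentages), key=lambda pair: pair[1], reverse=True)
    return ranked[:result_count]

# Hypothetical scores for three made-up classes
labels = ["sports_car", "convertible", "minivan"]
scores = [2.0, 1.0, 0.1]
for name, pct in top_predictions(scores, labels, result_count=2):
    print(f"{name}: {pct:.2f}")
```

The percentages over all classes sum to 100, which is why asking for more results mostly adds low-probability entries at the tail.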

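ImageAI provides four algorithm models for image prediction, all trained on the ImageNet dataset:

(1) SqueezeNet, proposed by F. N. Iandola's team (fastest prediction, moderate accuracy).

(2) ResNet50, from Microsoft (fast prediction, fairly high accuracy).

(3) InceptionV3, from Google (slower prediction, high accuracy).

(4) DenseNet121, from Facebook (slowest prediction, highest accuracy).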
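Because the four models trade speed against accuracy, it can be handy to choose one by name at run time. The sketch below maps a short name to the corresponding setter method. The setter names are those of the ImageAI 2.x ImagePrediction class used in this article (an assumption if you run a different release, where names may differ); the helper select_model and the short keys are hypothetical.

```python
# Setter method names as exposed by the ImageAI 2.x ImagePrediction class
MODEL_SETTERS = {
    "squeezenet": "setModelTypeAsSqueezeNet",
    "resnet": "setModelTypeAsResNet",
    "inceptionv3": "setModelTypeAsInceptionV3",
    "densenet": "setModelTypeAsDenseNet",
}

def select_model(predictor, name):
    """Call the setter matching `name` on an ImagePrediction-like object."""
    try:
        setter_name = MODEL_SETTERS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown model: {name!r}") from None
    # Look the method up by name and invoke it with no arguments
    getattr(predictor, setter_name)()
    return setter_name
```

A call such as select_model(ImagePrediction(), "densenet") would then replace a hard-coded setter call. The matching weights file still has to be supplied through setModelPath().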

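ImageAI can run prediction on a single image or on several images. The two simple examples below demonstrate each case.

Single-Image Prediction

Single-image prediction relies on the predictImage() method of ImageAI's ImagePrediction class. The main steps are:

(1) Create an instance of ImagePrediction().

(2) Set the model type with setModelTypeAsResNet() and the model path with setModelPath().

(3) Call loadModel() to load the model.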

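(4) Call predictImage() to run the prediction. It takes two parameters: the file to predict, and result_count, the number of predictions to return (any value from 1 to 1000). It returns the predicted object names and their percentage probabilities.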

The following example sets the model type to ResNet, but any of the other algorithms listed above can be used instead. Sample code for single-image prediction with ImageAI:

from imageai.Prediction import ImagePrediction
import os
import time
# Start timing
start_time = time.time()
execution_path = os.getcwd()
# Instantiate the ImagePrediction class
prediction = ImagePrediction()
# Set the algorithm model type
prediction.setModelTypeAsResNet()
# Path to the pretrained ResNet50 weights file
prediction.setModelPath(os.path.join(execution_path, 'resnet50_weights_tf_dim_ordering_tf_kernels.h5'))
prediction.loadModel()
predictions, probabilities = prediction.predictImage(os.path.join(execution_path, 'sample.jpg'), result_count=5)
end_time = time.time()
for eachPrediction, eachProbability in zip(predictions, probabilities):
    print(eachPrediction + ":" + str(eachProbability))
print('Total time cost:', end_time - start_time)
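The sample above reports a single total that includes model loading, which usually dominates the first run. A minimal sketch for timing each stage separately is shown below. The helper timed and the stage names are hypothetical, and time.sleep() stands in for the real loadModel() and predictImage() calls so the snippet runs without ImageAI installed.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(stage):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

# Stand-ins for prediction.loadModel() and prediction.predictImage(...)
with timed("load"):
    time.sleep(0.02)
with timed("predict"):
    time.sleep(0.01)

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.3f}s")
```

Separating the two numbers makes it easier to compare models fairly, since the load cost is paid once while the prediction cost is paid per image.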

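Multi-Image Prediction

For multiple images, we could call predictImage() repeatedly, but it is simpler to call predictMultipleImages() once. The workflow is:

(1) Create an instance of ImagePrediction().

(2) Set the model type with setModelTypeAsResNet() and the model path with setModelPath().

(3) Call loadModel() to load the model.

(4) Create an array and add the paths of all images to be predicted.

(5) Call predictMultipleImages() with the array of image paths to run the prediction. The result_count_per_image parameter (default 2) sets how many predictions are returned for each image.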

# Multi-image prediction
from imageai.Prediction import ImagePrediction
import os
execution_path = os.getcwd()
# Initialize the predictor
multiple_prediction = ImagePrediction()
multiple_prediction.setModelTypeAsResNet()
# Set the path to the model file
multiple_prediction.setModelPath(os.path.join(execution_path, 'resnet50_weights_tf_dim_ordering_tf_kernels.h5'))
# Load the model
multiple_prediction.loadModel()
# Collect every .jpg and .png file in the current directory
all_images_array = []
all_files = os.listdir(execution_path)
for each_file in all_files:
    if each_file.endswith('.jpg') or each_file.endswith('.png'):
        all_images_array.append(each_file)
results_array = multiple_prediction.predictMultipleImages(all_images_array, result_count_per_image=3)
for each_result in results_array:
    predictions, percentage_probabilities = each_result['predictions'], each_result['percentage_probabilities']
    for index in range(len(predictions)):
        print(predictions[index] + ':' + str(percentage_probabilities[index]))
    # Separator between the results of consecutive images
    print('-----------')
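The file-collection loop above matches only lowercase .jpg and .png names and stores bare file names, which resolve correctly only while the working directory stays the same. As a hedged alternative, the sketch below builds the list with pathlib, ignores case, and returns absolute paths. The helper collect_images and the suffix set are hypothetical, and the temporary directory exists only to make the demonstration self-contained.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def collect_images(folder):
    """Return sorted absolute paths of image files directly inside `folder`."""
    return sorted(
        str(p.resolve())
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

# Demonstration on a throwaway directory with two images and one text file
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("a.JPG", "b.png", "notes.txt"):
        Path(tmp, file_name).touch()
    found = [Path(p).name for p in collect_images(tmp)]

print(found)
```

The returned list can be passed to predictMultipleImages() in place of all_images_array.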

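Object Detection

ImageAI provides convenient and powerful methods for running object detection on images and extracting each detected object.

Image Object Detection

Image object detection with ImageAI relies on the detectObjectsFromImage() method of the ObjectDetection class.

Sample code: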

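# Object detection
from imageai.Detection import ObjectDetection
import os
import time
start_time = time.time()
execution_path = os.getcwd()  # Current directory
detector = ObjectDetection()  # Instantiate the ObjectDetection class
detector.setModelTypeAsRetinaNet()  # Use the RetinaNet model
# The model path must be set before loadModel(). The file name below is an assumption
# (the RetinaNet weights file distributed for ImageAI 2.x); adjust it to your download.
detector.setModelPath(os.path.join(execution_path, 'resnet50_coco_best_v2.0.1.h5'))
detector.loadModel()  # Load the model
# Detect objects; lowering the minimum percentage probability to 30 reports more objects (the default is 50)
detections = detector.detectObjectsFromImage(input_image=r'D:\Image\four.jpg', output_image_path=r'D:\Image\fourr.jpg', minimum_percentage_probability=30)
end_time = time.time()
for eachObject in detections:
    print(eachObject['name'], ":", eachObject['percentage_probability'], ":", eachObject['box_points'])
print('Total Time cost:', end_time - start_time)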
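Each detection printed above is a dictionary with the keys name, percentage_probability, and box_points. A minimal sketch for post-processing such a list follows: it filters by class and confidence and computes the box area. It assumes box_points holds [x1, y1, x2, y2] pixel coordinates; the helper names and the sample data are hypothetical and not produced by a real model.

```python
def box_area(box_points):
    """Area in pixels of a box given as [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = box_points
    return max(0, x2 - x1) * max(0, y2 - y1)

def filter_detections(detections, name=None, min_probability=0.0):
    """Keep detections matching `name` (if given) at or above `min_probability`."""
    return [
        d for d in detections
        if (name is None or d["name"] == name)
        and d["percentage_probability"] >= min_probability
    ]

# Hypothetical detections in the same shape as the sample output
sample = [
    {"name": "person", "percentage_probability": 91.2, "box_points": [10, 20, 110, 220]},
    {"name": "dog", "percentage_probability": 45.0, "box_points": [200, 50, 260, 90]},
    {"name": "person", "percentage_probability": 33.5, "box_points": [300, 40, 340, 120]},
]

people = filter_detections(sample, name="person", min_probability=50)
for d in people:
    print(d["name"], d["percentage_probability"], box_area(d["box_points"]))
```

Filtering after detection lets you run the model once with a low threshold and then try stricter cut-offs without repeating the expensive step.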

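Video Object Detection

Video object detection has a wide range of applications, including dynamic object tracking and automated human gait recognition. Because video contains a large amount of temporally and spatially redundant information, detecting objects in it is very demanding on hardware. I therefore recommend running these tasks on a machine with a GPU and the GPU build of the TensorFlow deep learning framework; video object detection on a CPU-only machine will be slow.

Video object detection uses the detectObjectsFromVideo() method of ImageAI's VideoObjectDetection class.

Sample code: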

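# Video object detection
from imageai.Detection import VideoObjectDetection
import os
import time
start_time = time.time()
execution_path = os.getcwd()
detector = VideoObjectDetection()  # Instantiate the video detection class
detector.setModelTypeAsRetinaNet()
# The model path must point to the weights file, not to the video.
# The file name is an assumption (RetinaNet weights for ImageAI 2.x).
detector.setModelPath(os.path.join(execution_path, 'resnet50_coco_best_v2.0.1.h5'))
detector.loadModel()  # Load the model
# ImageAI appends ".avi" to output_file_path, so it is given without an extension here
video_path = detector.detectObjectsFromVideo(input_file_path=r'D:\Image\haha.mp4', output_file_path=r'D:\Image\hahaha', frames_per_second=20, log_progress=True)
print(video_path)  # Path of the annotated video that was written
end_time = time.time()
print('Total time cost:', end_time - start_time)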
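Windows paths are easy to get wrong in ordinary Python string literals, because sequences such as "\f" are escape characters. A minimal sketch of one way to avoid hand-written backslashes is to build the paths with pathlib. PureWindowsPath is used here only so the example behaves the same on any operating system; the directory and file names are the ones from the sample.

```python
from pathlib import PureWindowsPath

# Forward slashes are accepted and normalised to backslashes
video_dir = PureWindowsPath("D:/Image")
input_path = str(video_dir / "haha.mp4")
output_path = str(video_dir / "hahaha")

print(input_path)
print(output_path)

# A raw string literal spells the same path without any escape processing
same_as_raw = input_path == r"D:\Image\haha.mp4"
print(same_as_raw)
```

Either form, the pathlib result or a raw string, can be passed as input_file_path and output_file_path.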
