Artificial Intelligence application to classify random pictures according to the objects they show.

mamannon/Picture_Classification

Picture_Classifier

Picture_Classifier is a command-line application for sorting image files by their content. It uses neural network models to recognize the objects in a picture and then classifies the picture according to the objects it contains. The application is written in Python and uses, among others, the TensorFlow, Keras, NumPy, scikit-learn and pandas libraries.

Currently Picture_Classifier can classify only RGB (24-bit) JPG files. The application is designed to be able to classify any image file format, but that is not implemented yet.

At the moment the object detectors work as they should, but the actual picture classification based on the detected objects does not. There are at least two reasons for that:

  • There isn't enough training material to teach classification: the only material is for the object detectors, and that alone is not enough for teaching picture classification.

  • The current implementation applies transfer learning to the picture classification model, so that the picture classification model does not have to be created from scratch when a new object detector model is added. This should not actually work: new classes can be added to an existing model, but a new object detector adds new features to the dataset, and new features cannot be added without creating the model from scratch.

I don't like the idea of adding extra training material or training everything from scratch every time a new object detector model is added, so I may implement picture classification without machine learning in the future, if that is possible.

Principle

The basic idea is that the user creates one folder with a mixed set of image files and another folder for the results. The application sorts the image files and moves them from the original folder into subfolders that it creates in the second folder. To perform this sorting, the application needs to know the sorting categories and learn from the model data which kind of image file belongs to which category. For that it has to recognize different objects in the pictures, such as cars, faces, flowers, dogs, etc.
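
The move step can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name `move_to_category` and its parameters are hypothetical, and the category is assumed to have been decided already by the classifier.

```python
import shutil
from pathlib import Path


def move_to_category(image_path, target_root, category):
    """Move one image file into target_root/category (hypothetical helper)."""
    destination_dir = Path(target_root) / category
    # Create the category subfolder on first use.
    destination_dir.mkdir(parents=True, exist_ok=True)
    destination = destination_dir / Path(image_path).name
    shutil.move(str(image_path), str(destination))
    return destination
```

A call such as `move_to_category("mixed/img001.jpg", "sorted", "cars")` would leave the file at `sorted/cars/img001.jpg`; the paths here are examples only.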

To teach the artificial intelligence to sort pictures, it has to be given model data to study, and this data has to be organized in the right way: the folder tree structure of the files tells which categories the pictures are sorted into and what kind of objects should be found in each category. The folder tree has the following structure from top to bottom:

  • Root folder: this folder contains all of the material to be studied. It contains only Class folders.

  • Class folders: the folders directly below the root folder represent the sorting categories. Their names are the category names that Picture_Classifier uses when it creates the folders into which it moves the sorted image files. A Class folder directly contains only Object folders (main category).

  • Object folders (main category): each of these folders represents a general topic, for example cars, faces, flowers or whatever, which is part of the classification criteria but does not define any classification by itself. An Object folder (main category) cannot contain picture files, only folders: it directly contains only Object folders (subcategory).

  • Object folders (subcategory): each of these folders represents a specific topic, for example cars or family cars, which is the dominant part of the classification criteria and does define the classification by itself. An Object folder (subcategory) directly contains only image files.

The main category/subcategory division makes it easier to maintain the training data; it is not important for recognizing objects in the Picture_Classifier application. Objects are always identified at the subcategory level.
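
To make the four folder levels concrete, the sketch below walks such a tree and returns it as nested dictionaries. It is only an illustration under assumptions: the function name `read_training_tree` and the `IMAGE_SUFFIXES` set are hypothetical and do not come from the project, which reads its training data through the TensorFlow dataset utilities.

```python
from pathlib import Path

# Assumption: only JPG files count as training images.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def read_training_tree(root):
    """Return {class: {main_category: {subcategory: [image file names]}}}."""
    tree = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        tree[class_dir.name] = {}
        for main_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
            tree[class_dir.name][main_dir.name] = {}
            for sub_dir in sorted(p for p in main_dir.iterdir() if p.is_dir()):
                # Only the subcategory level holds image files.
                images = sorted(
                    f.name
                    for f in sub_dir.iterdir()
                    if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
                )
                tree[class_dir.name][main_dir.name][sub_dir.name] = images
    return tree
```

For a root folder holding `holiday/cars/family_cars/a.jpg` (example names), the result would be `{"holiday": {"cars": {"family_cars": ["a.jpg"]}}}`.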

Architecture

Picture_Classifier contains several AI models: one for sorting, and one identifying model for each general object type, or main category. An identifying model classifies all the subcategories included in its main category.

  • An object is identified from an image with the code in the objectfinder.py file. Object_finder finds an object if it fills the image, for example a face in a passport photo, provided the model is trained to recognize faces.

  • An object is identified from the image file to be sorted by searching for it at different scales and in different locations. For this, the image is divided into several grids representing different scales, and a recognizable object is searched for in each grid separately. This is implemented with the code in the objectclassifier.py file. In this way, it is possible to obtain several findings at different scales, but only the findings of the scale containing the most findings are taken into account.

  • When the objects in the image file have been identified and counted, the application has to decide, based on this information, in which category the image file will be placed. This is done with the code in the pictureclassifier.py file.
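
The multi-scale search and the "scale with the most findings" rule can be sketched roughly as below. This is not the code of objectclassifier.py: all three function names are hypothetical, and the way the overlap and the scale multiplier are applied is an assumption made for the sake of the example.

```python
from collections import Counter


def scale_tile_sizes(width, height, multiplier, min_size):
    """Tile sizes from the whole image downwards, shrinking by multiplier per scale."""
    if multiplier <= 1:
        raise ValueError("multiplier must be greater than 1")
    sizes = []
    size = float(min(width, height))
    while size >= min_size:
        sizes.append(int(size))
        size /= multiplier
    return sizes


def tile_boxes(width, height, tile_size, overlap):
    """Return (left, top, right, bottom) boxes for one scale.

    Assumption: the tile is shifted by tile_size // overlap pixels per step,
    so overlap=1 gives a plain non-overlapping grid. A remainder strip at the
    right or bottom edge is ignored in this sketch.
    """
    step = max(1, tile_size // overlap)
    boxes = []
    for top in range(0, max(1, height - tile_size + 1), step):
        for left in range(0, max(1, width - tile_size + 1), step):
            right = min(left + tile_size, width)
            bottom = min(top + tile_size, height)
            boxes.append((left, top, right, bottom))
    return boxes


def select_scale_findings(findings_by_scale):
    """Count the findings of the scale that produced the most findings.

    findings_by_scale maps a scale (for example a tile size) to the list of
    subcategory names recognized in its tiles. Ties go to the first scale.
    """
    if not findings_by_scale:
        return Counter()
    best = max(findings_by_scale, key=lambda s: len(findings_by_scale[s]))
    return Counter(findings_by_scale[best])
```

With `findings = {100: ["face"], 50: ["face", "face", "dog"], 25: ["dog"]}`, `select_scale_findings(findings)` keeps only the three findings of scale 50, that is two faces and one dog. These counts are the kind of input the picture classification step would then work from.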

In addition to the files mentioned above, Picture_Classifier contains four other source code files:

  • Main.py contains the main() function of the application (even though Python does not require a main function) and a loop through which the user operates the application.

  • The constants.py file contains all the constants defined in the application. The values of these constants can only be changed inside the constants.py file.

  • Image_dataset_utils.py is TensorFlow library source code, which is needed in the project so that the paths_and_labels_to_dataset and image_dataset_from_directory functions can be used directly.

  • Dataset_utils.py is TensorFlow library source code, which is needed to get image_dataset_utils.py to work.

Tuning

The application's behaviour can be tuned with the following constants in the constants.py file:

OBJECT_CLASSIFIER_FOLDER = the path to the folder where Picture_Classifier stores the serialized Object_finder models.

OBJECT_CLASSIFIER_IMPERTINENT_CLASS_NAME = the subcategory-level name for object classifications that are to be excluded from the process. This name must differ from every folder name in the training material.

PICTURE_CLASSIFIER_NAME = filename to save the serialized Picture_classifier model.

OBJECT_BATCH_SIZE = the number of data rows the Object_finder model studies in one cycle.

PICTURE_BATCH_SIZE = the number of data rows the Picture_classifier model studies in one cycle.

OBJECT_IMG_HEIGHT = the height in pixels to which the Object_finder model resizes an image before recognition.

OBJECT_IMG_WIDTH = the width in pixels to which the Object_finder model resizes an image before recognition.

OBJECT_TILE_OVERLAP_COEFFICIENT = an integer that tells how many overlapping identifications are performed on an image at one scale.

OBJECT_MULTI_TILE_SIZE_MULTIPLIER = a real number that acts as a coefficient when moving from one scale to another.

OBJECT_EPOCHS = an integer that tells how many cycles the Object_finder model studies before an intermediate save, from which the study is then continued.

OBJECT_THRESHOLD = a real number in the range [0, 1] giving the minimum probability with which the Object_finder model must identify an object for it to count as identified.

PICTURE_NUMBER_OF_HIDDEN_NEURONS = the number of hidden neurons in the Picture_classifier model.

PICTURE_LEARNING_RATE = a (small) positive real number that scales the update steps of the TensorFlow Keras optimizer.

PICTURE_TRAINING_STEPS = an integer that tells how many cycles in total the Picture_classifier model spends learning the classification.

PICTURE_DISPLAY_STEP = an integer that tells after how many cycles the Picture_classifier model reports its learning progress to the command line.
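
As an illustration of how OBJECT_THRESHOLD and OBJECT_CLASSIFIER_IMPERTINENT_CLASS_NAME might act on a single prediction, consider the sketch below. The function `identify` and its parameters are hypothetical, and it is an assumption that the model outputs one probability per subcategory and that findings carrying the impertinent class name are simply dropped.

```python
def identify(probabilities, class_names, threshold, impertinent_name):
    """Return the most probable subcategory, or None if nothing counts as identified."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    best_name = class_names[best_index]
    # Too uncertain: the probability stays below the threshold.
    if probabilities[best_index] < threshold:
        return None
    # Assumption: the impertinent class marks findings excluded from the process.
    if best_name == impertinent_name:
        return None
    return best_name
```

With a threshold of 0.5 and the example class names `["dog", "face", "none"]`, where `"none"` plays the role of the impertinent class, the probabilities `[0.1, 0.7, 0.2]` give `"face"`, while `[0.4, 0.35, 0.25]` and `[0.1, 0.1, 0.8]` both give `None`.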
