Search Image by Image
- Tutorial Difficulty: ★☆☆☆☆
- 7 min read
- Languages: SQL (100%)
- File location: tutorial_en/thanosql_search/search_image_by_image.ipynb
- References: Introduction of MNIST DataSet, A Simple Framework for Contrastive Learning of Visual Representations
Tutorial Introduction
Understanding Image Vectorization
An image (height x width x channel [RGB] x color intensity) is meaningless if the value of each pixel is randomly generated. In other words, an image is recognizable as an image only when each pixel follows a specific pattern in relation to its surrounding pixels. This implies that an image can be represented by a low-dimensional feature vector. Recent studies use machine learning to vectorize images and place them in a low-dimensional space according to their similarity.
There are several ways to define image similarity. It can mean that the colors are similar, that the objects in the image are similar, or that the context of the image is similar (e.g., a handwritten number). Although image similarity is difficult to define exactly, machine learning models learn these general features and encode them as vectors.
ThanoSQL uses a self-supervised learning model to store images in the database and retrieve similar images from it. When you upload images to the ThanoSQL database, a machine learning algorithm places similar images close together and dissimilar images far apart. Image features are learned from an unlabeled dataset and can be fine-tuned with a small amount of labeled data. The resulting features can then be used for classification or regression tasks.
In addition, ThanoSQL uses machine learning algorithms to vectorize datasets. The vectorized data is stored as a column in the image table and is used to calculate similarity (distance).
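To make the idea of similarity between vectorized images concrete, here is a minimal Python sketch of cosine similarity and its distance counterpart. It is an illustration only: the tutorial does not state which metric ThanoSQL uses internally, and the three-dimensional vectors and the function names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Distance counterpart: 0 for identical directions, 2 for opposite ones."""
    return 1.0 - cosine_similarity(a, b)


# Toy feature vectors (assumed values, not real model output).
eight_a = [0.9, 0.1, 0.3]
eight_b = [0.8, 0.2, 0.35]
one = [0.1, 0.9, 0.0]

print(cosine_similarity(eight_a, eight_b))  # close to 1: similar images
print(cosine_similarity(eight_a, one))      # much smaller: dissimilar images
```

A higher similarity (or a lower distance) means that two images lie closer together in the feature space.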
The following are examples and applications of the ThanoSQL similar image search algorithm.
- Input your favorite image and have similar artworks searched for and recommended to you.
- Find similar images within an album containing thousands of photos.
- Store your images in the ThanoSQL database and create your own search engine or machine learning model utilizing the ThanoSQL AutoML regression/classification model.
In This Tutorial
👉 This tutorial uses the MNIST handwriting dataset. Each image contains a handwritten number between 0 and 9 and is correctly labeled. The subset used in this tutorial consists of 1,000 training images and 200 test images.
Create a model with ThanoSQL that takes handwriting images as input and retrieves similar images from the database.
0. Prepare Dataset
As mentioned in the ThanoSQL Workspace, you must create an API token and run the commands below before you can execute ThanoSQL queries.
%load_ext thanosql
%thanosql API_TOKEN=<Issued_API_TOKEN>
Prepare Dataset
%%thanosql
GET THANOSQL DATASET mnist_data
OPTIONS (overwrite=True)
Success
Query Details
- "GET THANOSQL DATASET" downloads the specified dataset to the workspace.
- "OPTIONS" specifies the option values to be used for the GET THANOSQL DATASET clause.
- "overwrite": determines whether to overwrite a dataset if it already exists. If set as True, the old dataset is replaced with the new dataset (bool, optional, True|False, default: False)
%%thanosql
COPY mnist_train
OPTIONS (if_exists='replace')
FROM 'thanosql-dataset/mnist_data/mnist_train.csv'
Success
%%thanosql
COPY mnist_test
OPTIONS (if_exists='replace')
FROM 'thanosql-dataset/mnist_data/mnist_test.csv'
Success
Query Details
- "COPY" specifies the name of the dataset to be saved as a database table.
- "OPTIONS" specifies the option values to be used for the COPY clause.
- "if_exists": determines how to handle the case where the table already exists. The query can raise an error, append to the existing table, or replace the existing table (str, optional, 'fail'|'replace'|'append', default: 'fail')
1. Check Dataset
To create an image vectorization model, we use the mnist_train table located in the ThanoSQL workspace database. Run the query below to check the contents of the table.
%%thanosql
SELECT *
FROM mnist_train
LIMIT 5
| | image_path | filename | label |
---|---|---|---|
0 | thanosql-dataset/mnist_data/train/6782.jpg | 6782.jpg | 5 |
1 | thanosql-dataset/mnist_data/train/1810.jpg | 1810.jpg | 5 |
2 | thanosql-dataset/mnist_data/train/33617.jpg | 33617.jpg | 5 |
3 | thanosql-dataset/mnist_data/train/27802.jpg | 27802.jpg | 5 |
4 | thanosql-dataset/mnist_data/train/50677.jpg | 50677.jpg | 5 |
Understanding the Data Table
The mnist_train table contains the following information.
- image_path: image path
- filename: file name
- label: image label
2. Build an Image Vectorization Model
To create an image vectorization model with the name my_image_search_model using the mnist_train table, run the following query.
(Estimated duration of query execution: 1 min)
%%thanosql
BUILD MODEL my_image_search_model
USING SimCLR
OPTIONS (
    image_col='image_path',
    max_epochs=1,
    overwrite=True
)
AS
SELECT *
FROM mnist_train
Success
Query Details
- "BUILD MODEL" creates and trains a model named my_image_search_model.
- "USING" specifies SimCLR as the base model.
- "OPTIONS" specifies the option values used to create a model.
- "image_col": the name of the column containing the image path (str, default: 'image_path')
- "max_epochs": number of times to train with the training dataset (int, optional, default: 5)
- "overwrite": determines whether to overwrite a model if it already exists. If set as True, the old model is replaced with the new model (bool, optional, True|False, default: False)
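For readers curious about what a SimCLR-style model optimizes, the following is a minimal pure-Python sketch of the NT-Xent contrastive loss described in the SimCLR paper listed in the references. It is not ThanoSQL's implementation; the function names, the temperature of 0.5, and the two-dimensional embeddings are assumptions chosen for illustration.

```python
import math


def _cos(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss over N positive pairs (views_a[i], views_b[i]).

    Each pair holds the embeddings of two augmented views of the same image.
    """
    n = len(views_a)
    z = list(views_a) + list(views_b)  # 2N embeddings in one list
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Scaled similarities to every embedding except the anchor itself.
        logits = [_cos(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        log_denominator = math.log(sum(math.exp(v) for v in logits))
        total += log_denominator - _cos(z[i], z[pos]) / temperature
    return total / (2 * n)


# Toy embeddings (assumed values): image 0 points along x, image 1 along y.
views_a = [[1.0, 0.0], [0.0, 1.0]]
matching_b = [[0.9, 0.1], [0.1, 0.9]]    # second views agree with the first views
mismatched_b = [[0.1, 0.9], [0.9, 0.1]]  # second views swapped between images

print(nt_xent_loss(views_a, matching_b))    # lower loss
print(nt_xent_loss(views_a, mismatched_b))  # higher loss
```

The loss is lower when the two views of each image are embedded close together and views of different images are far apart, which is the behavior that training for more epochs reinforces.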
To vectorize the mnist_test images, run the following "CONVERT USING" query. The vectorized results are stored in a user-defined column (default: convert_result) in the mnist_test table.
%%thanosql
CONVERT USING my_image_search_model
OPTIONS (
    image_col='image_path',
    result_col='convert_result'
)
AS
SELECT *
FROM mnist_test
| | image_path | filename | label | convert_result |
---|---|---|---|---|
0 | thanosql-dataset/mnist_data/test/5099.jpg | 5099.jpg | 6 | [b'\xd7', b'q', b'&', b'<', b'\xa1', b'<', b'\... |
1 | thanosql-dataset/mnist_data/test/9239.jpg | 9239.jpg | 6 | [b'\x1e', b'@', b'\xa9', b'=', b']', b'K', b'\... |
2 | thanosql-dataset/mnist_data/test/2242.jpg | 2242.jpg | 6 | [b'\x87', b'\xd1', b' ', b'>', b'\x83', b'e', ... |
3 | thanosql-dataset/mnist_data/test/3451.jpg | 3451.jpg | 6 | [b'\xac', b'\xd2', b"'", b'>', b'*', b'\xfa', ... |
4 | thanosql-dataset/mnist_data/test/2631.jpg | 2631.jpg | 6 | [b'O', b'\xf1', b'M', b'=', b'9', b'w', b'?', ... |
... | ... | ... | ... | ... |
195 | thanosql-dataset/mnist_data/test/8045.jpg | 8045.jpg | 8 | [b'\x00', b'\x00', b'\x00', b'\x00', b'\xa4', ... |
196 | thanosql-dataset/mnist_data/test/9591.jpg | 9591.jpg | 8 | [b'\x99', b'\x15', b'9', b'>', b'\xb7', b'\x88... |
197 | thanosql-dataset/mnist_data/test/7425.jpg | 7425.jpg | 8 | [b'\x1c', b'H', b'\x8a', b'>', b'f', b'\x83', ... |
198 | thanosql-dataset/mnist_data/test/2150.jpg | 2150.jpg | 8 | [b'\xf4', b'Y', b'S', b'>', b'\x96', b'^', b'\... |
199 | thanosql-dataset/mnist_data/test/5087.jpg | 5087.jpg | 8 | [b'\xc8', b'\x80', b'o', b'=', b'\xe6', b'\xdc... |
200 rows × 4 columns
Query Details
- "CONVERT USING" uses my_image_search_model as the algorithm for image vectorization.
- "OPTIONS" specifies the options to be used for image vectorization.
- "image_col": the name of the column containing the image path (str, default: 'image_path')
- "result_col": defines the column name that contains the vectorized results (str, optional, default: 'convert_result')
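The convert_result column is displayed as a list of single bytes. The sketch below shows one way such a list could be turned back into numbers, under the assumption (not confirmed by this tutorial) that the bytes are a packed sequence of little-endian 32-bit floats. The function name decode_embedding is hypothetical.

```python
import struct


def decode_embedding(byte_cells):
    """Join single-byte cells and unpack them as little-endian float32 values.

    Assumption: the stored vector is a packed float32 array.
    """
    raw = b"".join(byte_cells)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4 (float32 size)")
    count = len(raw) // 4
    return list(struct.unpack("<%df" % count, raw))


# The first four bytes shown for 5099.jpg in the output above;
# the rest of that row is truncated in the display and is not reproduced here.
first_cells = [b"\xd7", b"q", b"&", b"<"]
print(decode_embedding(first_cells))

# Round trip with known values to show the layout this sketch expects.
packed = struct.pack("<3f", 0.5, -1.25, 2.0)
cells = [packed[i:i + 1] for i in range(len(packed))]
print(decode_embedding(cells))
```

If the decoded values look implausible for your data (for example, extremely large magnitudes), the byte layout assumption is probably wrong.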
Execute the "CONVERT USING" query statement below and save the converted result in a new table so that it can be used with other ThanoSQL query statements.
%%thanosql
CREATE TABLE mnist_test_convert_en AS
SELECT * FROM (
    CONVERT USING my_image_search_model
    OPTIONS (
        image_col='image_path',
        result_col='convert_result'
    )
    AS
    SELECT *
    FROM mnist_test
)
Success
3. Search for Similar Images Using Image Vectorization Models
This step uses the my_image_search_model image vectorization model and the mnist_test_convert_en table to search for images similar to the "923.jpg" image (a handwritten 8).
923.jpg Image File
%%thanosql
SEARCH IMAGE
USING my_image_search_model
OPTIONS (
    search_by='image',
    search_input='thanosql-dataset/mnist_data/test/923.jpg',
    emb_col='convert_result',
    result_col='search_result'
)
AS
SELECT *
FROM mnist_test_convert_en
| | image_path | filename | label | convert_result | search_result |
---|---|---|---|---|---|
0 | thanosql-dataset/mnist_data/test/923.jpg | 923.jpg | 8 | [b'\xcd', b'\x1d', b'\x08', b'>', b'\xf6', b'\... | 1.000000 |
1 | thanosql-dataset/mnist_data/test/7645.jpg | 7645.jpg | 8 | [b']', b'\xac', b'L', b'=', b'\xe2', b'\x99', ... | 0.997068 |
2 | thanosql-dataset/mnist_data/test/5087.jpg | 5087.jpg | 8 | [b'\xc8', b'\x80', b'o', b'=', b'\xe6', b'\xdc... | 0.996402 |
3 | thanosql-dataset/mnist_data/test/685.jpg | 685.jpg | 8 | [b'>', b'\x84', b';', b'=', b'\xb6', b'N', b'\... | 0.995973 |
4 | thanosql-dataset/mnist_data/test/6573.jpg | 6573.jpg | 8 | [b'\x94', b'\xf7', b'\xfe', b'=', b'\x11', b'\... | 0.995699 |
... | ... | ... | ... | ... | ... |
195 | thanosql-dataset/mnist_data/test/2220.jpg | 2220.jpg | 4 | [b'\xfc', b'6', b'\x16', b'>', b'\xa8', b'\x99... | 0.984654 |
196 | thanosql-dataset/mnist_data/test/3684.jpg | 3684.jpg | 3 | [b's', b'\x00', b'\xd1', b'=', b'\xa0', b'\x97... | 0.984400 |
197 | thanosql-dataset/mnist_data/test/7020.jpg | 7020.jpg | 1 | [b'\x15', b'\xe4', b'\xf1', b'<', b'n', b'\x86... | 0.984273 |
198 | thanosql-dataset/mnist_data/test/2938.jpg | 2938.jpg | 1 | [b'h', b'\xdc', b'\xb0', b'=', b'\xee', b'\x1d... | 0.983866 |
199 | thanosql-dataset/mnist_data/test/7176.jpg | 7176.jpg | 1 | [b'&', b'\xb1', b'4', b'=', b'\x9e', b'.', b'[... | 0.983214 |
200 rows × 5 columns
Query Details
- "SEARCH IMAGE [image|text|audio|video]" specifies the type of data to search for (image, text, audio, or video).
- "USING" defines the model used for image vectorization.
- "OPTIONS" specifies the options to be used for image searching.
- "search_by": defines the image|text|audio|video type to be used for the search (str)
- "search_input": defines the input to be used for the search (str)
- "emb_col": the column that contains the vectorized results (str)
- "result_col": defines the name of the column that contains the search results (str, optional, default: 'search_result')
- "AS" defines the embedding table to be used for searches. In this example, the mnist_test_convert_en table is used.
To display the four images most similar to the query image, wrap the "SEARCH IMAGE" query in a "PRINT IMAGE" clause and set the "top_k" option to 4, as in the following query. Although the model received only a minimal amount of training, you can see that images resembling the handwritten 8 are returned.
%%thanosql
PRINT IMAGE
AS (
    SELECT image_path, search_result
    FROM (
        SEARCH IMAGE
        USING my_image_search_model
        OPTIONS (
            search_by='image',
            search_input='thanosql-dataset/mnist_data/test/923.jpg',
            emb_col='convert_result',
            result_col='search_result',
            top_k=4
        )
        AS
        SELECT *
        FROM mnist_test_convert_en
    )
)
/home/jovyan/thanosql-dataset/mnist_data/test/923.jpg
/home/jovyan/thanosql-dataset/mnist_data/test/7645.jpg
/home/jovyan/thanosql-dataset/mnist_data/test/5087.jpg
/home/jovyan/thanosql-dataset/mnist_data/test/685.jpg
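Conceptually, a top-k similarity search scores every stored vector against the query vector and keeps the k best matches. The Python sketch below illustrates this with cosine similarity, which is an assumption about how search_result is computed (the scores above, with 1.0 for the query image itself, are consistent with it). The file names and vectors are made up, and top_k_similar is a hypothetical helper, not a ThanoSQL function.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def top_k_similar(query_vec, rows, k):
    """Return the k (path, score) pairs with the highest similarity to query_vec."""
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in rows]
    return heapq.nlargest(k, scored, key=lambda item: item[1])


# Toy embedding table (assumed values, not real model output).
rows = [
    ("query.jpg", [0.9, 0.1, 0.3]),
    ("similar_1.jpg", [0.8, 0.2, 0.35]),
    ("similar_2.jpg", [0.85, 0.05, 0.4]),
    ("different.jpg", [0.1, 0.9, 0.0]),
]

results = top_k_similar(rows[0][1], rows, k=3)
for path, score in results:
    print(path, round(score, 4))
```

As in the query output above, the query image itself ranks first because it is part of the searched table.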
Note
The training options of the algorithm are set so that an image is recognized regardless of left-right flipping and color differences. This is because a picture of a dog should be recognized as a dog even if it is flipped or its colors differ. If color is an important feature, as with clothing images, or if vertical and horizontal flips change the meaning, as with numbers, the training options should be changed.
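The effect described in the note comes from how training views are generated. The following Python sketch shows a simplified view generator with a switch for horizontal flipping. It is a stand-in for illustration, not the tutorial's actual training option: the names augment, allow_flip, and brightness_jitter are hypothetical, and brightness scaling on a grayscale image is used here in place of color changes.

```python
import random


def horizontal_flip(image):
    """Mirror a 2-D image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment(image, rng, allow_flip=True, brightness_jitter=0.2):
    """Return a randomly altered copy of a grayscale image with values in [0, 1]."""
    out = [list(row) for row in image]
    if allow_flip and rng.random() < 0.5:
        out = horizontal_flip(out)
    # Scale all pixels by a random factor, then clamp back into [0, 1].
    factor = 1.0 + rng.uniform(-brightness_jitter, brightness_jitter)
    return [[min(1.0, max(0.0, p * factor)) for p in row] for row in out]


# A tiny asymmetric pattern: flipping it produces a different shape,
# just as a mirrored digit is no longer the same digit.
digit_like = [
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
]

rng = random.Random(0)
view_with_flip = augment(digit_like, rng)                    # may be mirrored
view_no_flip = augment(digit_like, rng, allow_flip=False)    # orientation kept
print(view_with_flip)
print(view_no_flip)
```

A model trained on views that may be mirrored learns to treat an image and its mirror as the same, so for orientation-sensitive data such as digits the flip would be disabled.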
4. In Conclusion
In this tutorial, we used the MNIST handwriting dataset to vectorize images and perform an image search. As this is a beginner-level tutorial, we focused on the process rather than accuracy. The model's accuracy can be improved by fine-tuning it and by adding a small amount of labeled data for each dataset.
- How to Upload My Data to the ThanoSQL Workspace
- How to Create a Table Using My Data
- How to Upload My Model to the ThanoSQL Workspace
Inquiries About Deploying a Model for Your Own Service
If you have any difficulties creating your own model using ThanoSQL or applying it to your services, please feel free to contact us below😊
For inquiries regarding building an image similarity search model: contact@smartmind.team