# __Search Video by Text__

- Tutorial Difficulty: ★★☆☆☆
- 7 min read 
- Languages: [SQL](https://en.wikipedia.org/wiki/SQL) (100%)
- File location: tutorial_en/thanosql_search/search_video_by_text.ipynb   
- References: [Kinetics-700](https://paperswithcode.com/dataset/kinetics-700), [X-CLIP](https://arxiv.org/abs/2207.07285)

## Tutorial Introduction

<div class="admonition note">
    <h4 class="admonition-title">Understanding Multi-modal Learning</h4>
    <p>
    Multi-modal refers to a setting in which information is conveyed in several forms at once, where a modality is a type of data.
    Machine learning on multi-modal data learns jointly from different types of data, such as images, text, and sensor readings, which enables an integrated analysis.
    </p>
    <p>
    OpenAI's CLIP is an image-text multimodal deep learning model specialized in understanding text and images together.
    </p>
</div>

__The following are examples and applications of the ThanoSQL text-video search algorithm.__

- Search your own video collection with a text description to retrieve videos containing the scenes you want.
- Find a specific scene in online videos, such as YouTube videos, by describing it in text.

<div class="admonition note">
    <h4 class="admonition-title">In This Tutorial</h4>
    <p>👉 This tutorial uses the kinetics700-2020 dataset. Kinetics is a large video dataset of human actions released by DeepMind. Kinetics 700-2020 is a version of the Kinetics dataset released in 2020 that includes video clips covering 700 action classes. </p>
</div>

ThanoSQL's X-CLIP model is a pre-built model that extends the image-text multimodal CLIP model to understand the relationship between video and text. In this tutorial, we use this model to search, from a text input, for videos stored in the ThanoSQL workspace database.

## __0. Prepare Dataset & Model__

As mentioned in the [ThanoSQL Workspace](https://docs.thanosql.ai/1.5/en/getting_started/paas/workspace/lab/), you must create an API token and run the query below before you can execute ThanoSQL queries.

In [None]:
%load_ext thanosql
%thanosql API_TOKEN=<Issued_API_TOKEN>

### __Prepare Dataset__

In [2]:
%%thanosql
GET THANOSQL DATASET kinetics700_data
OPTIONS (overwrite=True)

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>GET THANOSQL DATASET</strong>" downloads the specified dataset to the workspace.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the <strong>GET THANOSQL DATASET</strong> clause.
        <ul>
            <li>"overwrite": determines whether to overwrite a dataset if it already exists. If set as True, the old dataset is replaced with the new dataset (bool, optional, True|False, default: False)</li>
        </ul>
        </li>
    </ul>
</div>

In [3]:
%%thanosql
COPY kinetics700
OPTIONS (if_exists='replace')
FROM 'thanosql-dataset/kinetics700_data/kinetics700.csv'

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>COPY</strong>" specifies the name of the dataset to be saved as a database table.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the <strong>COPY</strong> clause.
        <ul>
           <li>"if_exists": determines how to handle the case where the table already exists. The query can raise an error, append to the existing table, or replace the existing table (str, optional, 'fail'|'replace'|'append', default: 'fail')</li>
        </ul>
        </li>
    </ul>
</div>
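
The three "if_exists" modes described above can be pictured with a small Python model. This is only an illustrative sketch: the `copy_into` function and the dictionary standing in for the database are hypothetical and are not part of ThanoSQL.

```python
def copy_into(tables, name, rows, if_exists="fail"):
    """Toy model of the if_exists modes; `tables` is a dict acting as the database."""
    if if_exists not in ("fail", "replace", "append"):
        raise ValueError(f"unknown if_exists value: {if_exists!r}")
    if name in tables and if_exists == "fail":
        # Default mode: refuse to touch an existing table.
        raise ValueError(f"table {name!r} already exists")
    if name in tables and if_exists == "append":
        # Keep the old rows and add the new ones at the end.
        tables[name] = tables[name] + list(rows)
    else:
        # New table, or 'replace': the new rows become the whole table.
        tables[name] = list(rows)
    return tables


db = {}
copy_into(db, "kinetics700", ["row1", "row2"])
copy_into(db, "kinetics700", ["row3"], if_exists="append")
print(db["kinetics700"])
copy_into(db, "kinetics700", ["row9"], if_exists="replace")
print(db["kinetics700"])
```

Using 'replace', as the tutorial query does, makes the cell safe to re-run, because the table is rebuilt from the CSV file each time.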

### __Prepare the Model__

In [4]:
%%thanosql
GET THANOSQL MODEL xclip
OPTIONS (
    model_name='tutorial_search_xclip',
    overwrite=True
    )

Success


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>GET THANOSQL MODEL</strong>" downloads the specified model to the workspace.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the <strong>GET THANOSQL MODEL</strong> clause.
        <ul>
            <li>"model_name": the model name to store a given model in the ThanoSQL workspace (str, optional)</li>
            <li>"overwrite": determines whether to overwrite a model if it already exists. If set as True, the old model is replaced with the new model (bool, optional, True|False, default: False)</li>
        </ul>
        </li>
    </ul>
</div>

## __1. Check Dataset__

For this tutorial, we use the __kinetics700__ table located in the ThanoSQL workspace database. Run the query below to check the contents of the table.

In [5]:
%%thanosql
SELECT *
FROM kinetics700
LIMIT 5

Unnamed: 0,video_path,label,duration
0,thanosql-dataset/kinetics700_data/video/-dhP2A...,checking tires,10
1,thanosql-dataset/kinetics700_data/video/1ejgHK...,testifying,10
2,thanosql-dataset/kinetics700_data/video/2Yvab3...,checking tires,10
3,thanosql-dataset/kinetics700_data/video/3nFLLc...,punching person (boxing),10
4,thanosql-dataset/kinetics700_data/video/5PfhCJ...,kitesurfing,10


<div class="admonition note">
    <h4 class="admonition-title">Understanding the Data Table</h4>
    <p>The <strong>kinetics700</strong> table contains the following information.</p>
    <ul>
        <li>video_path: path to the video file</li>
        <li>label: action label of the video</li>
        <li>duration: length of the video</li>
    </ul>
</div>

In [6]:
%%thanosql
PRINT VIDEO
AS
SELECT video_path
FROM kinetics700
LIMIT 2

/home/jovyan/thanosql-dataset/kinetics700_data/video/-dhP2AH0eqI.mp4


/home/jovyan/thanosql-dataset/kinetics700_data/video/1ejgHKw8E3Y.mp4


## __2. Convert Using a Pre-built Model__ 

To vectorize the __kinetics700__ videos, run the "__CONVERT USING__" query. The vectorized results are stored in a user-defined column (default: 'convert_result') in the __kinetics700__ table. 

In [7]:
%%thanosql
CONVERT USING tutorial_search_xclip
OPTIONS (
    video_col='video_path',
    result_col='convert_result'
    )
AS
SELECT *
FROM kinetics700

Unnamed: 0,video_path,label,duration,convert_result
0,thanosql-dataset/kinetics700_data/video/-dhP2A...,checking tires,10,"[b'\x16', b'm', b'\xfb', b'\xbe', b'\xba', b'!..."
1,thanosql-dataset/kinetics700_data/video/1ejgHK...,testifying,10,"[b'X', b'\x05', b'\xe7', b'\xbe', b'\xf1', b'\..."
2,thanosql-dataset/kinetics700_data/video/2Yvab3...,checking tires,10,"[b'\x10', b'\x96', b'\xfa', b'\xbe', b'\xff', ..."
3,thanosql-dataset/kinetics700_data/video/3nFLLc...,punching person (boxing),10,"[b'\x19', b'O', b' ', b'?', b'\xf8', b'\xbf', ..."
4,thanosql-dataset/kinetics700_data/video/5PfhCJ...,kitesurfing,10,"[b' ', b'u', b'\xa3', b'?', b'\x12', b'D', b'L..."
...,...,...,...,...
93,thanosql-dataset/kinetics700_data/video/wwgl_8...,land sailing,10,"[b'\xb6', b'7', b'\xa0', b'?', b'\x9e', b']', ..."
94,thanosql-dataset/kinetics700_data/video/xICkLB...,cutting nails,10,"[b'\xcd', b'\xe3', b'\x04', b'\xbf', b'\x94', ..."
95,thanosql-dataset/kinetics700_data/video/xlRC0n...,testifying,10,"[b'\x86', b'\xbb', b'_', b'\xbd', b'\x97', b'\..."
96,thanosql-dataset/kinetics700_data/video/yyy2Vy...,bench pressing,10,"[b'B', b'\xd0', b'%', b'?', b'\xae', b'=', b'\..."


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>CONVERT USING</strong>" uses <strong>tutorial_search_xclip</strong> as an algorithm for video vectorization.</li>
        <li>"<strong>OPTIONS</strong>" specifies the options to be used for video vectorization.
        <ul>
            <li>"video_col": the name of the column containing the video path (str, default: 'video_path')</li>
            <li>"result_col": defines the column name that contains the vectorized results (str, optional, default: 'convert_result')</li>
        </ul>
        </li>
    </ul>
</div>
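
The "convert_result" column in the output above is displayed as a list of single bytes rather than as numbers. The sketch below shows one way such a list could be turned back into floating-point values. It assumes the bytes are a packed array of little-endian 32-bit floats, which the tutorial does not state; the `decode_float32` helper is hypothetical. Only the first four bytes of row 0 are used, because the rest of the row is truncated in the output.

```python
import struct

# First four single-byte items shown for row 0 of convert_result.
first_four = [b"\x16", b"m", b"\xfb", b"\xbe"]


def decode_float32(byte_items):
    """Join single-byte items and unpack them as little-endian float32 values.

    The float32 layout is an assumption, not documented ThanoSQL behavior.
    """
    raw = b"".join(byte_items)
    count = len(raw) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", raw[: count * 4]))


values = decode_float32(first_four)
print(values)
```

Under this assumption the four bytes decode to a single value of about -0.49, which is a plausible magnitude for one component of an embedding vector.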

Execute the "__CONVERT USING__" query statement below and save the converted result in a new table so that it can be used with other __ThanoSQL__ query statements.

In [9]:
%%thanosql
CREATE TABLE kinetics700_convert_en AS 
SELECT * FROM (
    CONVERT USING tutorial_search_xclip
    OPTIONS (
        video_col='video_path',
        result_col='convert_result'
        )
    AS
    SELECT *
    FROM kinetics700
)

Success


## __3. Search__ 

Perform a text-based video search using the "__SEARCH VIDEO__" query statement and the __tutorial_search_xclip__ model.
Execute the following query to calculate the similarity between the text "bench press" and the vectorized <strong>kinetics700</strong> videos stored in the <strong>kinetics700_convert_en</strong> table.

In [10]:
%%thanosql
SELECT video_path, label, score
FROM (
    SEARCH VIDEO 
    USING tutorial_search_xclip
    OPTIONS (
        search_by='text',
        search_input='bench press',
        emb_col='convert_result',
        result_col='score',
        top_k=10
        )
    AS 
    SELECT * 
    FROM kinetics700_convert_en
    )

Unnamed: 0,video_path,label,score
0,thanosql-dataset/kinetics700_data/video/qNB9qv...,bench pressing,0.312154
1,thanosql-dataset/kinetics700_data/video/yyy2Vy...,bench pressing,0.274932
2,thanosql-dataset/kinetics700_data/video/a9S4Ox...,golf chipping,0.202286
3,thanosql-dataset/kinetics700_data/video/ML7Oll...,snowkiting,0.198646
4,thanosql-dataset/kinetics700_data/video/zb9HGN...,country line dancing,0.196104
5,thanosql-dataset/kinetics700_data/video/AfKqHI...,parasailing,0.193912
6,thanosql-dataset/kinetics700_data/video/6MWLkJ...,kitesurfing,0.192512
7,thanosql-dataset/kinetics700_data/video/BEnKTN...,snowkiting,0.190333
8,thanosql-dataset/kinetics700_data/video/aKcKTY...,opening bottle (not wine),0.187058
9,thanosql-dataset/kinetics700_data/video/8DIU9c...,playing squash or racquetball,0.186208


<div class="admonition note">
    <h4 class="admonition-title">Query Details</h4>
    <ul>
        <li>"<strong>SEARCH VIDEO</strong>" searches for videos. The text description of the video is passed through the "search_input" option.</li>
        <li>"<strong>USING</strong>" specifies <strong>tutorial_search_xclip</strong> as the model.</li>
        <li>"<strong>OPTIONS</strong>" specifies the option values to be used for the video search.
        <ul>
            <li>"search_by": defines the type of input to be used for the search, one of image|text|audio|video (str)</li>
            <li>"search_input": defines the input to be used for the search (str)</li>
            <li>"emb_col": the column that contains the vectorized results (str)</li>
            <li>"result_col": defines the name of the column that contains the search results (str, optional, default: 'search_result')</li>
            <li>"top_k": number of rows to return. If set as None, returns the entire data table (int, optional, default: 1000)</li>
        </ul>
        </li>
        <li>"<strong>AS</strong>" defines the embedding table to be used for search. In this example, the <strong>kinetics700_convert_en</strong> table is used.</li>
    </ul>
</div>
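
Conceptually, a search of this kind embeds the query text into the same vector space as the videos, scores every video against the query, and returns the highest-scoring rows. The following is a minimal sketch of that ranking step. It assumes cosine similarity as the score, which is common for CLIP-style models but is not confirmed by this tutorial for ThanoSQL. The file names and 3-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, rows, top_k=None):
    """Score every (path, embedding) row and return the top_k best matches.

    top_k=None returns all rows, mirroring the option described above.
    """
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in rows]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Hypothetical embeddings; real ones have many more dimensions.
rows = [
    ("a.mp4", [1.0, 0.0, 0.0]),
    ("b.mp4", [0.6, 0.8, 0.0]),
    ("c.mp4", [0.0, 0.0, 1.0]),
]
query = [1.0, 0.0, 0.0]  # stands in for the embedded search text

results = search(query, rows, top_k=2)
for path, score in results:
    print(path, round(score, 3))
```

Because the video embeddings are computed once and stored in a table, only the query text needs to be embedded at search time.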

In [11]:
%%thanosql
PRINT VIDEO
AS (
    SELECT video_path
    FROM (
        SEARCH VIDEO 
        USING tutorial_search_xclip
        OPTIONS (
            search_by='text',
            search_input='bench press',
            emb_col='convert_result',
            result_col='score',
            top_k=2
            )
        AS 
        SELECT * 
        FROM kinetics700_convert_en
        )
    )

/home/jovyan/thanosql-dataset/kinetics700_data/video/qNB9qv6PqwI.mp4


/home/jovyan/thanosql-dataset/kinetics700_data/video/yyy2Vy_5DjI.mp4
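
The labels in the top-10 output above give a quick way to judge the search result. The sketch below copies the label and score columns from that output and computes precision at k, treating the "bench pressing" label as the correct answer for the query "bench press". The `precision_at_k` helper is hypothetical and is not a ThanoSQL function.

```python
# (label, score) pairs copied from the top-10 search output.
results = [
    ("bench pressing", 0.312154),
    ("bench pressing", 0.274932),
    ("golf chipping", 0.202286),
    ("snowkiting", 0.198646),
    ("country line dancing", 0.196104),
    ("parasailing", 0.193912),
    ("kitesurfing", 0.192512),
    ("snowkiting", 0.190333),
    ("opening bottle (not wine)", 0.187058),
    ("playing squash or racquetball", 0.186208),
]


def precision_at_k(ranked, target_label, k):
    """Fraction of the first k results whose label equals target_label (k >= 1)."""
    top = ranked[:k]
    hits = sum(1 for label, _ in top if label == target_label)
    return hits / len(top)


p_at_2 = precision_at_k(results, "bench pressing", 2)
p_at_10 = precision_at_k(results, "bench pressing", 10)
# Score gap between the last matching result and the first non-matching one.
gap = results[1][1] - results[2][1]

print(p_at_2, p_at_10, round(gap, 6))
```

Both matching videos rank first, and their scores are clearly separated from the rest, which is why `top_k=2` in the final query returns only bench-pressing videos.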


## __4. In Conclusion__
In this tutorial, we searched for videos in the __kinetics700 dataset__ by text using a multi-modal text/video vectorization model. As this is a beginner-level tutorial, we focused on the process and on showing visible results rather than on accuracy. You can often retrieve more accurate results by trying different text queries.

* [How to Upload My Data to the ThanoSQL Workspace](https://docs.thanosql.ai/1.5/en/getting_started/data_upload/)
* [How to Create a Table Using My Data](https://docs.thanosql.ai/1.5/en/how-to_guides/ThanoSQL_query/COPY_SYNTAX/)
* [How to Upload My Model to the ThanoSQL Workspace](https://docs.thanosql.ai/1.5/en/how-to_guides/ThanoSQL_query/UPLOAD_MODEL_SYNTAX/)

<div class="admonition tip">
    <h4 class="admonition-title">Inquiries About Deploying a Model for Your Own Service</h4>
    <p>If you have any difficulties creating your own model using ThanoSQL or applying it to your services, please feel free to contact us below 😊</p>
    <p>For inquiries regarding building a text-video search model: <a href="mailto:contact@smartmind.team">contact@smartmind.team</a></p>
</div>