CLIP

Notation Conventions

  • Parentheses () indicate literal parentheses.
  • Braces {} group a set of alternative options.
  • Square brackets [] indicate an optional clause.
  • A comma followed by an ellipsis in square brackets [, ...] means that the preceding item can be repeated as a comma-separated list.
  • The vertical bar | represents a logical OR.
  • VALUE represents a regular value.
  • literal: a fixed, unchangeable value, also known as a constant.

    Each literal has a specific data type, like a column in a table.

CONVERT Syntax

Use the "CONVERT" statement to convert image or text data into vectors and add them to the table.

query_statement:
    query_expr

CONVERT USING (model_name_expression)
OPTIONS (
    expression [ , ...]
    )
AS
(query_expr)

OPTIONS Clause

OPTIONS (
    (image_col=column_name),
    (text_col=column_name),
    (convert_type={'image'|'text'}),
    [batch_size=VALUE],
    [result_col=column_name]
    )

The "OPTIONS" clause allows you to change the value of a parameter. The definition of each parameter is as follows.

  • "image_col": the name of the column containing the image path (str, default: 'image_path')
  • "text_col": the name of the column containing the text (str, default: 'text')
  • "convert_type": the type of data to vectorize (str, 'image'|'text', default: 'image')
  • "batch_size": the number of data items processed together in a single batch (int, optional, default: 16)
  • "result_col": the name of the column that stores the vectorized results (str, optional, default: 'convert_result')
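To see how the parameters and their defaults fit together, the following Python sketch assembles the text of a "CONVERT" query. The helper build_convert_query is hypothetical and not part of ThanoSQL; it only formats a string and never contacts a workspace. It also assumes that only the source column matching "convert_type" needs to be passed, as in the tutorial query, which this page does not state explicitly.

```python
# Hypothetical helper (not a ThanoSQL API): formats a CONVERT query string
# from the options documented above, filling in the documented defaults.

DEFAULTS = {
    "image_col": "image_path",
    "text_col": "text",
    "convert_type": "image",
    "batch_size": 16,
    "result_col": "convert_result",
}


def build_convert_query(model_name, source_query, **options):
    """Return a CONVERT statement as a string, applying documented defaults."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")

    merged = {**DEFAULTS, **options}
    if merged["convert_type"] not in ("image", "text"):
        raise ValueError("convert_type must be 'image' or 'text'")
    if not isinstance(merged["batch_size"], int) or merged["batch_size"] < 1:
        raise ValueError("batch_size must be a positive integer")

    # Assumption: only the column matching convert_type is emitted.
    source_col = "image_col" if merged["convert_type"] == "image" else "text_col"
    keys = [source_col, "convert_type", "batch_size", "result_col"]

    def fmt(value):
        # Integers are written bare; strings are single-quoted.
        return str(value) if isinstance(value, int) else f"'{value}'"

    option_lines = ",\n".join(f"    {key}={fmt(merged[key])}" for key in keys)
    return (
        f"CONVERT USING {model_name}\n"
        f"OPTIONS (\n{option_lines}\n    )\n"
        f"AS\n{source_query}"
    )


query = build_convert_query(
    "tutorial_search_clip",
    "SELECT *\nFROM unsplash_data",
    batch_size=128,
)
print(query)
```

Options that are not passed, such as "result_col" here, fall back to the defaults listed above, and an unsupported "convert_type" is rejected before any query text is produced.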

CONVERT Example

An example "CONVERT" query can be found in Search Image by Text.

%%thanosql
CONVERT USING tutorial_search_clip
OPTIONS (
    image_col='image_path', 
    convert_type='image',
    batch_size=128,
    result_col='convert_result'
    )
AS 
SELECT *
FROM unsplash_data
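
The query above sets batch_size=128. As a rough illustration of what a batch size means, the Python sketch below splits a list of rows into consecutive groups of at most that size. It assumes simple sequential batching; how ThanoSQL batches rows internally is not described on this page, and iter_batches and the file names are hypothetical.

```python
def iter_batches(rows, batch_size=16):
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# 300 hypothetical image paths, processed 128 at a time.
paths = [f"img_{i}.jpg" for i in range(300)]
batch_sizes = [len(batch) for batch in iter_batches(paths, batch_size=128)]
print(batch_sizes)  # [128, 128, 44]
```

A larger batch size means fewer, larger groups; the last group holds whatever rows remain.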

SEARCH IMAGE Syntax

Use the "SEARCH IMAGE" statement to retrieve image data that matches a search input.

query_statement:
    query_expr

SEARCH IMAGE 
USING (model_name_expression)
OPTIONS (
    expression [ , ...]
    )
AS
(query_expr)

OPTIONS Clause

OPTIONS (
    (search_by={'image'|'text'|'audio'|'video'}),
    (search_input=expression),
    (emb_col=column_name),
    [result_col=expression],
    [top_k=VALUE]
    )

The "OPTIONS" clause allows you to change the value of a parameter. The definition of each parameter is as follows.

  • "search_by": the type of input used for the search (str, 'image'|'text'|'audio'|'video')
  • "search_input": the input used for the search (str)
  • "emb_col": the name of the column that contains the vectorized results (str)
  • "result_col": the name of the column that stores the search results (str, optional, default: 'search_result')
  • "top_k": the number of rows to return. If set to None, the entire table is returned (int, optional, default: 1000)
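
The interplay of "emb_col", "result_col", and "top_k" can be pictured with a small Python sketch. It rests on two assumptions not confirmed by this page: that rows are ranked by cosine similarity between the stored vector and the vector of the search input, and that the result column holds that score. The function names, sample rows, and two-dimensional vectors are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(rows, query_vec, emb_col="convert_result",
           result_col="search_result", top_k=1000):
    """Score every row against query_vec and return the best top_k rows."""
    scored = [
        {**row, result_col: cosine_similarity(row[emb_col], query_vec)}
        for row in rows
    ]
    scored.sort(key=lambda row: row[result_col], reverse=True)
    # top_k=None follows the documented behaviour: return every row.
    return scored if top_k is None else scored[:top_k]


rows = [
    {"image_path": "cat.jpg", "convert_result": [0.9, 0.1]},
    {"image_path": "dog.jpg", "convert_result": [0.1, 0.9]},
    {"image_path": "kitten.jpg", "convert_result": [0.8, 0.3]},
]
query_vec = [1.0, 0.0]  # stand-in for the vector of 'a black cat'

top_two = search(rows, query_vec, top_k=2)
print([row["image_path"] for row in top_two])  # ['cat.jpg', 'kitten.jpg']
```

With top_k=None the same call returns all three rows, still ordered from the highest score to the lowest.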

SEARCH IMAGE Example

An example "SEARCH IMAGE" query can be found in Search Image by Text.

%%thanosql
SEARCH IMAGE 
USING tutorial_search_clip
OPTIONS (
    search_by='text',
    search_input='a black cat',
    emb_col='convert_result',
    result_col='search_result'
    )
AS 
SELECT * 
FROM unsplash_data

Last update: 2023-08-09