CLIP
Notation Conventions
- Parentheses () indicate literal parentheses.
- Braces {} are used to group combinations of options.
- Brackets [] indicate an optional clause.
- An ellipsis following a comma in brackets [,...] means that the preceding item can be repeated as a comma-separated list.
- The vertical bar | represents a logical OR.
- VALUE represents a regular value.
- literal: a fixed or unchangeable value, also known as a constant. Each literal has a specific data type, such as that of a column in the table.
CONVERT Syntax
Use the "CONVERT" statement to convert data into vectors and add them to the table.
query_statement:
query_expr
CONVERT USING (model_name_expression)
OPTIONS (
expression [ , ...]
)
AS
(query_expr)
OPTIONS Clause
OPTIONS (
(image_col=column_name),
(text_col=column_name),
(convert_type={'image'|'text'}),
[batch_size=VALUE],
[result_col=column_name]
)
The "OPTIONS" clause sets the parameters of the statement. Each parameter is defined as follows.
- "image_col": the name of the column containing the image path (str, default: 'image_path')
- "text_col": the name of the column containing the text (str, default: 'text')
- "convert_type": the type of data to vectorize (str, 'image'|'text', default: 'image')
- "batch_size": the number of records bundled and processed together in a single pass (int, optional, default: 16)
- "result_col": defines the column name that contains the vectorized results (str, optional, default: 'convert_result')
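To make the roles of "batch_size" and "result_col" more concrete, here is a minimal Python sketch of batched conversion. It is not ThanoSQL's implementation: the names convert_in_batches, toy_encoder, and source_col are hypothetical, and the encoder is a placeholder that stands in for a real CLIP model.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]
Encoder = Callable[[Sequence[object]], List[List[float]]]


def convert_in_batches(
    rows: List[Row],
    encode_batch: Encoder,
    source_col: str = "image_path",      # plays the role of image_col / text_col
    batch_size: int = 16,                # mirrors the documented default
    result_col: str = "convert_result",  # mirrors the documented default
) -> List[Row]:
    """Return copies of rows with one vector per row stored in result_col."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    converted: List[Row] = []
    # Walk the table in chunks of at most batch_size rows.
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        vectors = encode_batch([row[source_col] for row in batch])
        for row, vector in zip(batch, vectors):
            new_row = dict(row)  # copy so the input rows stay unchanged
            new_row[result_col] = vector
            converted.append(new_row)
    return converted


def toy_encoder(values: Sequence[object]) -> List[List[float]]:
    """Placeholder encoder (NOT CLIP): one number per value, its string length."""
    return [[float(len(str(value)))] for value in values]


rows = [{"image_path": "a.jpg"}, {"image_path": "bb.jpg"}, {"image_path": "ccc.jpg"}]
converted = convert_in_batches(rows, toy_encoder, batch_size=2)
```

With batch_size=2 and three rows, the placeholder encoder is called twice (two rows, then one). A larger batch size means fewer encoder calls at the cost of more memory per call.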
CONVERT Example
An example "CONVERT" query can be found in Search Image by Text.
%%thanosql
CONVERT USING tutorial_search_clip
OPTIONS (
image_col='image_path',
convert_type='image',
batch_size=128,
result_col='convert_result'
)
AS
SELECT *
FROM unsplash_data
SEARCH IMAGE Syntax
Use the "SEARCH IMAGE" statement to retrieve the desired image data.
query_statement:
query_expr
SEARCH IMAGE
USING (model_name_expression)
OPTIONS (
expression [ , ...]
)
AS
(query_expr)
OPTIONS Clause
OPTIONS (
(search_by={'image'|'text'|'audio'|'video'}),
(search_input=expression),
(emb_col=column_name),
[result_col=expression],
[top_k=VALUE]
)
The "OPTIONS" clause sets the parameters of the statement. Each parameter is defined as follows.
- "search_by": the type of input used for the search (str, 'image'|'text'|'audio'|'video')
- "search_input": defines the input to be used for the search (str)
- "emb_col": the column that contains the vectorized results (str)
- "result_col": defines the name of the column that contains the search results (str, optional, default: 'search_result')
- "top_k": the number of rows to return. If set to None, the entire table is returned (int, optional, default: 1000)
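The way "emb_col", "result_col", and "top_k" interact can be sketched in Python as a top-k similarity search. This is an illustration under stated assumptions, not ThanoSQL's implementation: this page does not specify the similarity measure, so cosine similarity (common with CLIP embeddings) is assumed, and it is also assumed that the result column holds a similarity score. The function names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_top_k(
    rows: List[Row],
    query_vector: Sequence[float],
    emb_col: str = "convert_result",
    result_col: str = "search_result",  # mirrors the documented default
    top_k: Optional[int] = 1000,        # mirrors the documented default
) -> List[Row]:
    """Score every row against the query and keep the best top_k rows."""
    scored: List[Row] = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay unchanged
        # Assumption: the result column stores the similarity score.
        new_row[result_col] = cosine_similarity(row[emb_col], query_vector)
        scored.append(new_row)
    # Highest similarity first.
    scored.sort(key=lambda r: r[result_col], reverse=True)
    # top_k=None returns every row, as described for the "top_k" option.
    return scored if top_k is None else scored[:top_k]


table = [
    {"id": 1, "convert_result": [1.0, 0.0]},
    {"id": 2, "convert_result": [0.0, 1.0]},
    {"id": 3, "convert_result": [1.0, 1.0]},
]
best_two = search_top_k(table, query_vector=[1.0, 0.0], top_k=2)
```

In a real query the query vector would come from encoding "search_input" with the same model that produced the stored embeddings. Here it is supplied directly to keep the sketch self-contained.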
SEARCH IMAGE Example
An example "SEARCH IMAGE" query can be found in Search Image by Text.
%%thanosql
SEARCH IMAGE
USING tutorial_search_clip
OPTIONS (
search_by='text',
search_input='a black cat',
emb_col='convert_result',
result_col='search_result'
)
AS
SELECT *
FROM unsplash_data
Last update: 2023-08-09