{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "6cfd2a8c-fdfc-4233-abd1-ece097069522", "metadata": {}, "source": [ "# __Create an Image Classification Model__" ] }, { "attachments": {}, "cell_type": "markdown", "id": "407db758", "metadata": {}, "source": [ "- Tutorial Difficulty: ★☆☆☆☆\n", "- 10 min read\n", "- Languages: [SQL](https://en.wikipedia.org/wiki/SQL) (100%)\n", "- File location: tutorial_en/thanosql_ml/classification/image_classification.ipynb\n", "- References: [(AI-Hub) Product image data](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=64), [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "125a9e8e", "metadata": {}, "source": [ "## Tutorial Introduction\n", "\n", "
Classification is a type of machine learning that predicts which category (class) a target belongs to. Classification tasks include both binary classification (for example, classifying a person as male or female) and multi-class classification (for example, predicting an animal species such as dog, cat, or rabbit).
The human ability to classify the same data is estimated at about 95%.
\n", "You can also build a classification model based on the behavior of art enthusiasts to predict who is most likely to enjoy a particular piece of art. In other words, using only artwork images, you can create a model that predicts art preferences by age, gender, location, and so on.\n", "
👉 Build an image classification model to classify more than 10,000 products using the Product Image dataset from AI-Hub, a data sharing platform. The model can be used for detection and identification in smart warehouses and unmanned stores.\n",
" The dataset consists of a total of 1,440,000 images. In this tutorial, you will use 1,800 training images and 200 test images to learn how to use ThanoSQL.
\n", "
| | image_path | div_l | div_m | div_s | div_n | comp_nm | img_prod_nm | multi |
|---|---|---|---|---|---|---|---|---|
| 0 | thanosql-dataset/product_image_data/product_im... | 유제품 | 요구르트 | 떠먹는 요구르트 | 떠먹는 요구르트 | 기타 | 토핑오트&애플시나몬 | False |
| 1 | thanosql-dataset/product_image_data/product_im... | 홈클린 | 위생용품 | 일반비누 | 일반비누 | 크리오 | 크리오)골드디비누 | True |
| 2 | thanosql-dataset/product_image_data/product_im... | 면류 | 용기면 | 국물용기라면 | 짬뽕라면 | 농심 | 농심오징어짬뽕컵67G | True |
| 3 | thanosql-dataset/product_image_data/product_im... | 디저트 | 디저트/베이커리 | 냉장디저트 | 냉장디저트 | Dole 코리아 | Dole후룻볼슬라이스복숭아198g | False |
| 4 | thanosql-dataset/product_image_data/product_im... | 주류 | 기타주류 | 칵테일 | 칵테일 | 롯데주류 | 순하리소다톡바나나355ML | True |
\n", "
The product_image_train table contains the following information.
\n", "In this example, \"max_epochs\" is set to 1 so that the model trains quickly. In general, a larger \"max_epochs\" value improves inference performance at the cost of longer computation time.
\n", "\n", "
| | image_path | div_l | div_m | div_s | div_n | comp_nm | img_prod_nm | multi | predict_result |
|---|---|---|---|---|---|---|---|---|---|
| 0 | thanosql-dataset/product_image_data/product_im... | 생활용품 | 위생용품 | 면봉 | 면봉 | 기타 | 콩맥스전자담배용크리닝면봉 | True | 생활용품 |
| 1 | thanosql-dataset/product_image_data/product_im... | 소스 | 장류 | 쌈장 | 쌈장 | 씨제이제일제당 | 해찬들고기전용쌈장450G | False | 소스 |
| 2 | thanosql-dataset/product_image_data/product_im... | 디저트 | 디저트/베이커리 | 냉장디저트 | 냉장디저트 | Dole 코리아 | Dole후룻볼슬라이스복숭아198g | False | 디저트 |
| 3 | thanosql-dataset/product_image_data/product_im... | 음료 | 기능성음료 | 한방음료 | 한방음료 | 광동제약 | 유어스광동어성초500ml | False | 음료 |
| 4 | thanosql-dataset/product_image_data/product_im... | 주류 | 기타주류 | 칵테일 | 칵테일 | 롯데주류 | 순하리소다톡바나나355ML | False | 주류 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 197 | thanosql-dataset/product_image_data/product_im... | 의약외품 | 기능성음료 | 숙취해소음료 | 숙취해소음료 | 동아제약 | 동아제약)가그린제로100ML | False | 의약외품 |
| 198 | thanosql-dataset/product_image_data/product_im... | 소스 | 장류 | 쌈장 | 쌈장 | 씨제이제일제당 | 해찬들고기전용쌈장450G | True | 소스 |
| 199 | thanosql-dataset/product_image_data/product_im... | 주류 | 기타주류 | 칵테일 | 칵테일 | 롯데주류 | 순하리소다톡바나나355ML | True | 주류 |
| 200 | thanosql-dataset/product_image_data/product_im... | 유제품 | 요구르트 | 떠먹는 요구르트 | 떠먹는 요구르트 | 기타 | 토핑오트&애플시나몬 | False | 유제품 |
| 201 | thanosql-dataset/product_image_data/product_im... | 주류 | 기타주류 | 칵테일 | 칵테일 | 롯데주류 | 순하리소다톡바나나355ML | True | 주류 |
\n", "
202 rows × 9 columns
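
One simple way to read the prediction table is to compare the true label in div_l with the model output in predict_result, row by row. The snippet below is a minimal sketch of that check in plain Python, not part of ThanoSQL itself. The helper name `accuracy` and the list `visible_rows` are hypothetical, and the list holds only the ten rows displayed above, so the resulting figure says nothing about the rows hidden behind the ellipsis.

```python
# (div_l, predict_result) pairs copied from the ten rows shown in the
# prediction table. Rows hidden behind the '...' line are not included.
visible_rows = [
    ('생활용품', '생활용품'),
    ('소스', '소스'),
    ('디저트', '디저트'),
    ('음료', '음료'),
    ('주류', '주류'),
    ('의약외품', '의약외품'),
    ('소스', '소스'),
    ('주류', '주류'),
    ('유제품', '유제품'),
    ('주류', '주류'),
]


def accuracy(pairs):
    # Fraction of (label, prediction) pairs whose two values are equal.
    if not pairs:
        raise ValueError('no rows to evaluate')
    correct = sum(1 for label, predicted in pairs if label == predicted)
    return correct / len(pairs)


visible_accuracy = accuracy(visible_rows)
print(f'{visible_accuracy:.0%} of the {len(visible_rows)} visible rows match')
```

To evaluate the full result rather than the displayed excerpt, the same function could be applied to the div_l and predict_result columns of the complete prediction output.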
\n", "If you have any difficulties creating your own model using ThanoSQL or applying it to your service, please feel free to contact us below😊
\n", "For inquiries regarding building an image classification model: contact@smartmind.team
\n", "