Partial Visual-semantic Embedding: Fashion Intelligence System With Sensitive Part-by-part Learning
2022 · Ryotaro Shimizu, Takuma Nakamura, Masayuki Goto
Abstract
In this study, we propose the Fashion Intelligence System, a technology based on the visual-semantic embedding (VSE) model that quantifies abstract and complex expressions unique to fashion, such as "casual," "adult-casual," and "office-casual," and supports users' understanding of fashion. However, the existing VSE model does not handle images composed of multiple parts, such as hair, tops, pants, skirts, and shoes. We therefore propose partial VSE, which enables sensitive learning for each part of the fashion coordinates. The proposed model learns embedded representations part by part. This retains the various existing practical functionalities while enabling two tasks that were not possible with conventional models: image retrieval in which only the specified parts are changed, and image reordering that focuses on the specified parts. Based on both qualitative and quantitative evaluation experiments, we show that the proposed model is superior
Related papers
- Fashion-specific Attributes Interpretation Via Dual Gaussian Visual-semantic Embedding (2022) 0.00
- Fashionvil: Fashion-focused Vision-and-language Representation Learning (2022) 14.66
- Diversity In Fashion Recommendation Using Semantic Parsing (2019) 10.21
- Fashionfae: Fine-grained Attributes Enhanced Fashion Vision-language Pre-training (2024) 0.00
- Exploiting Latent Codes: Interactive Fashion Product Generation, Similar Image Retrieval, And Cross-category Recommendation Using Variational Autoencoders (2020) 0.00
- Fad-vlp: Fashion Vision-and-language Pre-training Towards Unified Retrieval And Captioning (2022) 7.81
- Fame-vil: Multi-tasking Vision-language Model For Heterogeneous Fashion Tasks (2023) 15.69
- Fine-grained Fashion Similarity Prediction By Attribute-specific Embedding Learning (2021) 15.29