VexIR2Vec: An Architecture-Neutral Embedding Framework for Binary Similarity
2023 · S. Venkatakeerthy, Soumya Banerjee, Sayan Dey, et al.
Abstract
Binary similarity involves determining whether two binary programs exhibit similar functionality, often originating from the same source code. In this work, we propose VexIR2Vec, an approach for binary similarity using VEX-IR, an architecture-neutral Intermediate Representation (IR). We extract the embeddings from sequences of basic blocks, termed peepholes, derived by random walks on the control-flow graph. The peepholes are normalized using transformations inspired by compiler optimizations. The VEX-IR Normalization Engine mitigates, with these transformations, the architectural and compiler-induced variations in binaries while exposing semantic similarities. We then learn the vocabulary of representations at the entity level of the IR using the knowledge graph embedding techniques in an unsupervised manner. This vocabulary is used to derive function embeddings for similarity assessment using VexNet, a feed-forward Siamese network designed to position similar functions closely and se
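The peephole extraction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the control-flow graph is given as a plain dictionary mapping each basic-block id to its successors, and the function name `sample_peepholes`, the parameters `walk_len` and `walks_per_block`, and the toy graph are all hypothetical choices made for illustration.

```python
import random


def sample_peepholes(cfg, walk_len=3, walks_per_block=1, seed=0):
    """Sample short basic-block sequences ("peepholes") by random walks.

    cfg maps each basic-block id to a list of successor block ids.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the walks are reproducible
    peepholes = []
    for start in cfg:
        for _ in range(walks_per_block):
            walk = [start]
            while len(walk) < walk_len:
                successors = cfg.get(walk[-1], [])
                if not successors:  # exit block: the walk ends early
                    break
                walk.append(rng.choice(successors))
            peepholes.append(walk)
    return peepholes


# Toy CFG (hypothetical): entry branches to "then"/"else", both reach "exit".
toy_cfg = {
    "entry": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}
peepholes = sample_peepholes(toy_cfg)
```

Each resulting peephole is a list of block ids that follows edges of the graph. In the pipeline the abstract outlines, such sequences would then be normalized and embedded, which this sketch does not attempt.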
Related papers
- VERSE: Versatile Graph Embeddings From Similarity Measures (2018) · 17.42
- Reveal Hidden Pitfalls And Navigate Next Generation Of Vector Similarity Search From Task-centric Views (2025) · 0.00
- Evaluating The Impact Of Word Embeddings On Similarity Scoring In Practical Information Retrieval (2026) · 0.00
- Search Efficient Binary Network Embedding (2019) · 3.58
- Approximate Vector Set Search Inspired By Fly Olfactory Neural System (2024) · 0.00
- Semantic Vector Encoding And Similarity Search Using Fulltext Search Engines (2017) · 6.77
- ViSiL: Fine-grained Spatio-temporal Video Similarity Learning (2019) · 13.70
- A Survey On Efficient Processing Of Similarity Queries Over Neural Embeddings (2022) · 0.00