Overview
Our similarity system compares parts by turning each part into a compact “fingerprint” called a vector. Later, vectors are compared to find parts that are close to each other.
There are two main cases:
When a part has detected features, the system uses much more shape-related information. It looks at things like holes, pockets, planes, curved surfaces, and other recognizable elements. It also considers measurements related to those features, such as their sizes and distribution. In that case, the similarity is based on a richer understanding of the model, so different parts are easier to tell apart.
When a part has no detected features, the system has much less to work with. In that case, it falls back mostly to a few general scalar values, mainly volume, surface area, and turnability. That means the part is represented in a much simpler way.
This is why two parts can look different to a human but still receive the same similarity vector. If neither part has detected features and their scalar values are close enough, the system may treat them as effectively the same. Normalization makes this worse: if the app also contains extremely large parts, the maximum values used for normalization become very high. Smaller parts then get compressed into a very narrow range, and their differences become much less visible to the algorithm. Finally, the values are rounded, so two different parts can end up with exactly the same result.
So the key point is: if features are missing, similarity becomes much less about real geometry and much more about a few summary numbers. If those numbers are close, different parts can collapse into the same similarity fingerprint.
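The collapse described above can be reproduced with a small sketch. It assumes that each scalar is divided by the app-wide maximum and then rounded to three decimals; the function name fallback_vector, the Scalars type, the rounding precision, and all sample values are hypothetical and are not taken from the actual implementation.

```python
from typing import NamedTuple


class Scalars(NamedTuple):
    """Fallback values used when a part has no detected features."""
    volume: float
    surface_area: float
    turnability: float


def fallback_vector(part: Scalars, app_max: Scalars, decimals: int = 3) -> tuple:
    """Divide each scalar by the app-wide maximum, then round.

    Both the division by the maximum and the number of decimals are
    assumptions made for illustration only.
    """
    return tuple(
        round(value / maximum, decimals) if maximum else 0.0
        for value, maximum in zip(part, app_max)
    )


# Two parts that differ clearly in volume and surface area.
part_a = Scalars(volume=120.0, surface_area=150.0, turnability=0.5)
part_b = Scalars(volume=180.0, surface_area=210.0, turnability=0.5)

# App containing only moderately sized parts: the vectors stay distinct.
moderate_max = Scalars(volume=1_000.0, surface_area=1_000.0, turnability=1.0)

# App that also contains extremely large parts: the maxima dominate.
huge_max = Scalars(volume=5_000_000.0, surface_area=2_000_000.0, turnability=1.0)

print(fallback_vector(part_a, moderate_max), fallback_vector(part_b, moderate_max))
print(fallback_vector(part_a, huge_max), fallback_vector(part_b, huge_max))
```

With the moderate maxima the two parts keep different vectors. With the very large maxima, volume and surface area both round to 0.0 for each part, so the two vectors become identical even though the parts are not.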
In simple terms:
With features: similarity is more detailed and shape-aware.
Without features: similarity is much simpler and easier to flatten.
If app-wide maximum values are extremely large, small and medium parts can become too similar after normalization.
A few small notes about the similarity filter logic:
Similarity is searched only among other parts that already have an embedding.
The current part itself is excluded from the results.
Only parts from the same app are considered.
Only parts from tickets in confirmed state are included.
Only parts whose ticket has a currency are included.
Only parts with a price are included in the final candidate list.
If materialId is passed, the results are additionally filtered by that material.
Only parts above the requested minSimilarity threshold are kept.
The search first takes the nearest vector candidates, then applies the business filters on top.
Final sorting is by similarity DESC, then price, then id DESC.
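The filter rules above can be sketched as a single function applied to the nearest vector candidates. This is a minimal illustration, not the actual implementation: the Candidate type, its field names, and filter_and_sort are hypothetical. It also assumes numeric ids, ascending price order (the notes do not state a direction), and a threshold that is inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A nearest-vector candidate, so it already has an embedding (hypothetical shape)."""
    id: int
    app_id: str
    ticket_state: str
    ticket_currency: Optional[str]
    price: Optional[float]
    material_id: Optional[str]
    similarity: float


def filter_and_sort(
    candidates: List[Candidate],
    current_id: int,
    app_id: str,
    min_similarity: float,
    material_id: Optional[str] = None,
) -> List[Candidate]:
    """Apply the business filters on top of the vector candidates, then sort."""
    kept = [
        c for c in candidates
        if c.id != current_id                      # exclude the current part
        and c.app_id == app_id                     # same app only
        and c.ticket_state == "confirmed"          # confirmed tickets only
        and c.ticket_currency is not None          # ticket must have a currency
        and c.price is not None                    # part must have a price
        and (material_id is None or c.material_id == material_id)
        and c.similarity >= min_similarity         # inclusive threshold (assumption)
    ]
    # similarity DESC, then price (ascending assumed), then id DESC
    return sorted(kept, key=lambda c: (-c.similarity, c.price, -c.id))
```

Because the business filters run after the nearest-candidate lookup, the final list can be shorter than the number of vector candidates that were fetched.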
Future Objectives
Extend similarity to cover more parameters, so that users can tune its behavior more precisely.