> not useful for the task of finding an optimal candidate
That statement is just flat-out incorrect on its face; however, it did make me think of something I hadn't thought of before, which is this:
Embedding vectors can be given a "scale" (multiplier) on specific terms, representing the amount of "weight" to add to that term. For example, if I have 10 years of experience in Java Web Development, we can take the components of that embedding vector (i.e. for the string "Java Web Development") and multiply them by some factor proportional to 10, which results in a vector that is "further" in that direction. Its length represents an "amount" of travel in the Java Web direction.
So this means even with vector embeddings we can scale out to specific amounts of experience. Now here's the cool part. You can then take all THOSE scaled vectors (one for each individual job candidate skill) and average them to get a single point in space, which CAN be compared with what the Job Requirements specify using a single scalar distance.
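Here is a minimal sketch of that scale-then-average idea in Python with NumPy. Everything in it is an assumption for illustration: the three-dimensional `TOY_EMBEDDINGS` stand in for real model embeddings, the scale factor is simply the number of years, and `profile_point` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical toy embeddings; a real system would get these from an embedding model.
TOY_EMBEDDINGS = {
    "Java Web Development": np.array([1.0, 0.0, 0.0]),
    "SQL": np.array([0.0, 1.0, 0.0]),
    "Project Management": np.array([0.0, 0.0, 1.0]),
}


def unit(v):
    """Return v scaled to length 1, so only its direction remains."""
    return v / np.linalg.norm(v)


def profile_point(years_by_skill):
    """Average the unit skill directions, each scaled by years of experience."""
    scaled = [
        years * unit(TOY_EMBEDDINGS[skill])
        for skill, years in years_by_skill.items()
    ]
    return np.mean(scaled, axis=0)


# The job requirements are turned into a point the same way as the candidates.
job = profile_point({"Java Web Development": 10, "SQL": 5})
alice = profile_point({"Java Web Development": 10, "SQL": 4})
bob = profile_point({"Java Web Development": 2, "Project Management": 8})

# One scalar distance per candidate; smaller means closer to the requirements.
dist_alice = float(np.linalg.norm(job - alice))
dist_bob = float(np.linalg.norm(job - bob))
print(dist_alice, dist_bob)
```

With these made-up numbers the first candidate lands much closer to the job point than the second. Note that averaging divides by the number of skills listed, so profiles with different skill counts are not directly comparable without some further convention.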
Then you would have to renormalize the vectors. You really, really want to keep them at unit length (so similarities stay in the range -1..1), because that is the special case where cosine similarity equals the dot product and Euclidean distance gives the same ranking.
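A quick numeric check of the unit-vector case, as a sketch using random vectors rather than real embeddings: for unit vectors the dot product and cosine similarity coincide exactly, and squared Euclidean distance is tied to them by d² = 2 − 2·cos, so all three rank neighbours the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random vectors, normalized to unit length.
a = rng.normal(size=8)
b = rng.normal(size=8)
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

dot = float(a @ b)
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
sq_dist = float(np.sum((a - b) ** 2))

# For unit vectors: dot == cosine, and sq_dist == 2 - 2 * cosine.
print(dot, cosine, sq_dist, 2 - 2 * cosine)
```

Once the vectors are scaled by years of experience they are no longer unit length, so these identities stop holding and the choice of metric starts to matter.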
I meant that the normalized hyperspace direction (unit vector) represents a particular "skill", and the distance along that direction (extending outside the unit hypersphere) is years of experience.
This is geometrically "meaningful", semantically. It would apply not just to a time magnitude (experience); in other contexts it could mean other things, for example money invested in a particular sector (hedge fund apps).
This makes me realize we could design a new type of Perceptron (MLP) where specific scalars for particular things (money, time, etc.) are wired into the actual NN architecture, in such a way that one specific input "neuron" is always fed a scalar for time, a different neuron a scalar for money, etc. You'd have to "prefilter" each training input to generate the individual scalars, then feed each one into the same "neuron" every time during training. My guess is this would improve overall "Intelligence" by a big amount.
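A rough sketch of what the fixed-slot wiring could look like, in plain NumPy. All names here (`EMBED_DIM`, `YEARS_SLOT`, `MONEY_SLOT`, `build_input`, `mlp_forward`) are hypothetical, the weights are random and untrained, and the "prefilter" step that extracts the scalars from text is assumed to have already happened. It only shows the input layout, not whether the idea helps.

```python
import numpy as np

EMBED_DIM = 4                 # size of the (assumed) text embedding
YEARS_SLOT = EMBED_DIM        # input index always reserved for the time scalar
MONEY_SLOT = EMBED_DIM + 1    # input index always reserved for the money scalar
INPUT_DIM = EMBED_DIM + 2


def build_input(text_embedding, years, money):
    """Place the embedding and the prefiltered scalars in fixed input slots."""
    x = np.zeros(INPUT_DIM)
    x[:EMBED_DIM] = text_embedding
    x[YEARS_SLOT] = years
    x[MONEY_SLOT] = money
    return x


def mlp_forward(x, w1, b1, w2, b2):
    """One hidden layer with ReLU, then a linear output."""
    hidden = np.maximum(0.0, w1 @ x + b1)
    return w2 @ hidden + b2


rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, INPUT_DIM))
b1 = np.zeros(8)
w2 = rng.normal(size=(1, 8))
b2 = np.zeros(1)

# The years value always enters through the same input neuron.
x = build_input(np.array([0.1, -0.3, 0.5, 0.2]), years=10.0, money=0.0)
y = mlp_forward(x, w1, b1, w2, b2)
print(x, y)
```

In practice the scalars would probably need rescaling (for example a log transform) so that large money values do not swamp the embedding components, but that is a design choice the idea above leaves open.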