Package org.mockserver.llm.codec
Class EmbeddingVectors
java.lang.Object
org.mockserver.llm.codec.EmbeddingVectors
Shared, deterministic embedding-vector generation used by every provider
embedding codec (OpenAI, Gemini, Ollama, Bedrock Titan/Cohere).
The vector is either reproducibly derived from the input text + seed +
dimensions (when EmbeddingResponse.getDeterministicFromInput() is
true and an input is present) or random, then L2-normalised so it behaves
like a real embedding (unit length, cosine-comparable). Provider codecs only
differ in the JSON envelope they wrap this vector in.
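The reproducible path described above can be sketched as follows. This is a minimal illustration, not the library's actual derivation: the class name, the `mixSeed` helper, and the use of FNV-1a hashing with `java.util.Random` are all assumptions made for the example. It shows only that the same input text, seed, and dimensions always yield the same unit-length vector.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Random;

final class DeterministicVectorSketch {

    // Hypothetical seed mixing (an assumption, not the library's scheme):
    // FNV-1a over the UTF-8 bytes of the input, combined with seed and dimensions.
    static long mixSeed(String input, int dimensions, long seed) {
        long hash = 0xcbf29ce484222325L; // FNV-1a 64-bit offset basis
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 0x100000001b3L; // FNV-1a 64-bit prime
        }
        return hash ^ (seed * 31L + dimensions);
    }

    // Draws Gaussian components from a seeded generator, then scales to unit length.
    static double[] deterministicUnitVector(String input, int dimensions, long seed) {
        Random random = new Random(mixSeed(input, dimensions, seed));
        double[] vector = new double[dimensions];
        double sumOfSquares = 0.0;
        for (int i = 0; i < dimensions; i++) {
            vector[i] = random.nextGaussian();
            sumOfSquares += vector[i] * vector[i];
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm > 0.0) {
            for (int i = 0; i < dimensions; i++) {
                vector[i] /= norm;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] first = deterministicUnitVector("hello", 8, 42L);
        double[] second = deterministicUnitVector("hello", 8, 42L);
        // Same input, seed, and dimensions reproduce the same vector.
        System.out.println(Arrays.equals(first, second)); // prints true
    }
}
```

Seeding a generator from the input is one common way to get reproducible mock vectors; the random path would presumably differ only in using an unseeded generator.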
-
Method Summary
Modifier and Type | Method | Description
static int | approximateTokens(String input) | Approximate the prompt token count from the input length, matching the convention used by the chat codecs (~4 chars per token).
static double[] | build(EmbeddingResponse embedding, String input, int defaultDimensions) | Build the embedding vector for an EmbeddingResponse and input, applying the determinism flag, seed, and default dimensions, then L2-normalising the result.
static double[] | generateDeterministicVector(String input, int dimensions, long seed) |
static double[] | generateRandomVector(int dimensions) |
static void | normalizeL2(double[] vector) |
-
Method Details
-
build
Build the embedding vector for an EmbeddingResponse and input, applying the determinism flag, seed, and default dimensions, then L2-normalising the result.
Parameters:
embedding - the configured embedding response
input - the request input text (may be null)
defaultDimensions - provider-specific default when no dimensions are configured
Returns:
an L2-normalised vector
-
generateDeterministicVector
-
generateRandomVector
public static double[] generateRandomVector(int dimensions)
-
normalizeL2
public static void normalizeL2(double[] vector)
-
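As a rough illustration of what in-place L2 normalisation involves, the sketch below divides every component by the vector's Euclidean norm. The class and method names are hypothetical, and leaving an all-zero vector unchanged is an assumption; the page does not say how normalizeL2 handles that case. The `dot` helper shows why unit-length vectors are cosine-comparable: their dot product is the cosine similarity.

```java
final class UnitLengthSketch {

    // Scales the array in place so its Euclidean (L2) norm becomes 1.
    static void normalizeInPlace(double[] vector) {
        double sumOfSquares = 0.0;
        for (double component : vector) {
            sumOfSquares += component * component;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm == 0.0) {
            return; // assumption: an all-zero vector is left unchanged
        }
        for (int i = 0; i < vector.length; i++) {
            vector[i] /= norm;
        }
    }

    // Dot product; for two unit-length vectors this equals their cosine similarity.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {4.0, 3.0};
        normalizeInPlace(a); // a is now approximately {0.6, 0.8}
        normalizeInPlace(b); // b is now approximately {0.8, 0.6}
        System.out.println(dot(a, b)); // cosine similarity, approximately 0.96
    }
}
```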
approximateTokens
Approximate the prompt token count from the input length, matching the convention used by the chat codecs (~4 chars per token).
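The "~4 chars per token" convention could look roughly like the sketch below. The class and method names are hypothetical, and two details are assumptions not stated on this page: null or empty input is counted as zero tokens, and the division rounds up.

```java
final class TokenEstimateSketch {

    // Estimates tokens as one per four characters.
    static int estimateTokens(String input) {
        if (input == null || input.isEmpty()) {
            return 0; // assumption: no input means no prompt tokens
        }
        return (input.length() + 3) / 4; // assumption: round up to whole tokens
    }

    public static void main(String[] args) {
        System.out.println(estimateTokens("hello world")); // 11 characters, prints 3
    }
}
```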
-