CardioEmbed · benchmark

Domain specialization, measured.

How a LoRA-adapted clinical-cardiology embedding compares against five baseline models on the published benchmark. Real numbers from arXiv:2511.10930. Model weights on Hugging Face.

Separation score · higher is better

The benchmark.

Separation score measures how well a model differentiates clinical cardiology concepts that look textually similar but mean different things (e.g., aortic stenosis vs. aortic regurgitation). CardioEmbed's LoRA adaptation gives it a 3.0× advantage over the strongest biomedical baseline (0.510 vs. 0.168 for BioLinkBERT-Large).
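To make the idea concrete, here is a minimal sketch of one way a separation-style score could be computed. It assumes the score is the mean cosine similarity of pairs that should match minus the mean cosine similarity of confusable pairs; this page does not state the exact formula, so treat that definition, the function names, and the toy vectors as illustrative assumptions rather than the benchmark's actual method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def separation_score(related_pairs, confusable_pairs):
    """ASSUMED definition: mean similarity of related pairs minus
    mean similarity of confusable pairs. Higher means the model pulls
    look-alike concepts further apart."""
    related = sum(cosine(u, v) for u, v in related_pairs) / len(related_pairs)
    confusable = sum(cosine(u, v) for u, v in confusable_pairs) / len(confusable_pairs)
    return related - confusable


# Toy 2-D "embeddings" (hypothetical, not model output).
stenosis = [1.0, 0.0]
stenosis_paraphrase = [1.0, 0.0]
regurgitation = [0.0, 1.0]

score = separation_score(
    related_pairs=[(stenosis, stenosis_paraphrase)],
    confusable_pairs=[(stenosis, regurgitation)],
)
print(f"separation = {score:.3f}")
```

With real models, the vectors would come from embedding the concept descriptions; a model that places aortic stenosis and aortic regurgitation close together would score near zero under this assumed definition.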

CardioEmbed · Qwen3-8B · LoRA-adapted

0.510 · +1452%

Flagship LoRA-adapted embedding for clinical cardiology. Best-in-class separation, 99.6% classification accuracy.

Gemma-2-2B · Google · baseline

0.455 · +700%

Efficient 2B-parameter baseline. Strong balance of speed and accuracy for production workloads.

Qwen3-4B · Alibaba · baseline

0.446 · +1184%

Latest Qwen3 architecture. Multilingual support for international deployments.

Qwen2.5-0.5B · Alibaba · baseline

0.327 · +1215%

Ultra-lightweight baseline. Suited to edge devices and other resource-constrained environments.

E5-large-v2 · Microsoft · baseline

0.284 · +787%

Industry-standard E5 embedding model. Common baseline for semantic search.

BioLinkBERT-Large · BioMed · baseline

0.168 · +438%

Pre-trained on PubMed abstracts using citation links. Strong general biomedical NLP baseline.

All metrics

What the numbers mean.

Model	Separation	Improvement	Dim	Accuracy
CardioEmbed · Qwen3-8B	0.510	+1452%	3584	99.6%
Gemma-2-2B	0.455	+700%	2304	—
Qwen3-4B	0.446	+1184%	1792	—
Qwen2.5-0.5B	0.327	+1215%	896	—
E5-large-v2	0.284	+787%	1024	—
BioLinkBERT-Large	0.168	+438%	1024	—
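
The 3.0× figure quoted for CardioEmbed can be checked directly from the Separation column. The short sketch below divides CardioEmbed's score by each baseline's score; the values are copied from the table, while the dictionary keys and variable names are just illustrative.

```python
# Separation scores copied from the table above.
separation = {
    "CardioEmbed (Qwen3-8B)": 0.510,
    "Gemma-2-2B": 0.455,
    "Qwen3-4B": 0.446,
    "Qwen2.5-0.5B": 0.327,
    "E5-large-v2": 0.284,
    "BioLinkBERT-Large": 0.168,
}

reference = separation["CardioEmbed (Qwen3-8B)"]

# Ratio of CardioEmbed's separation score to each model's score.
ratios = {name: reference / score for name, score in separation.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Against BioLinkBERT-Large the ratio is 0.510 / 0.168, or about 3.04, which rounds to the 3.0× stated earlier. The Improvement column is a separate figure and is not derived by this calculation.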

Run it yourself.

A live inference endpoint (deepneuro.ai/lora-api) will follow once the RunPod deployment lands. Until then, the model weights on Hugging Face are the canonical way to run the model.