VeriSpeak SDK

Speaker recognition for stand-alone or Web applications

VeriSpeak voice identification technology is designed for biometric system developers and integrators. The text-dependent speaker recognition algorithm assures system security by checking both voice and phrase authenticity. Voiceprint templates can be matched in 1-to-1 (verification) and 1-to-many (identification) modes.
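
The difference between the two matching modes can be illustrated with a minimal Python sketch. It is a generic illustration only and does not use the VeriSpeak API: the toy vector templates, the cosine similarity score, and the names `cosine`, `verify` and `identify` are all assumptions made for this example, and the actual VeriSpeak template format and matcher are not described here.

```python
import math

def cosine(a, b):
    """Similarity between two toy templates (assumed non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(probe, enrolled, threshold):
    """1-to-1: does the probe match the one claimed identity?"""
    return cosine(probe, enrolled) >= threshold

def identify(probe, gallery, threshold):
    """1-to-many: best-scoring enrolled subject, or None if no score reaches the threshold."""
    best_id, best_score = None, threshold
    for subject_id, template in gallery.items():
        score = cosine(probe, template)
        if score >= best_score:
            best_id, best_score = subject_id, score
    return best_id

# Hypothetical enrolled templates and a probe sample.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
probe = [0.9, 0.1]
```

Verification compares the probe against a single claimed identity, while identification searches the whole gallery, so its cost grows with the number of enrolled subjects.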

VeriSpeak is available as a software development kit (SDK) that enables the development of stand-alone and Web-based speaker recognition applications on Microsoft Windows, Linux, Mac OS X, iOS and Android platforms.

Reliability Tests

The VeriSpeak 9.0 algorithm has been tested with voice samples taken from the XM2VTS Database, as well as with voice samples from Neurotechnology's internal database.

Experiment 1
VeriSpeak ROC chart calculated using voice samples from the XM2VTS database

Experiments 2 and 3
VeriSpeak ROC chart calculated using voice samples from Neurotechnology's internal databases

These voice template matching experiments were performed with the VeriSpeak 9.0 text-dependent engine:

  • Experiment 1 used voice samples from the XM2VTS database. All samples included the same fixed phrase pronounced by all subjects.
  • Experiment 2 used voice samples from Neurotechnology's internal voice database 1. All samples included the same fixed phrase pronounced by all subjects.
  • Experiment 3 used voice samples from Neurotechnology's internal voice database 2. Each subject pronounced a unique phrase during their recording.

Receiver operating characteristic (ROC) curves are commonly used to demonstrate the recognition quality of an algorithm. A ROC curve shows how the false rejection rate (FRR) depends on the false acceptance rate (FAR) as the decision threshold varies. Charts with ROC curves for each of the experiments are available above.
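
To make the relationship between FAR and FRR concrete, the following minimal Python sketch reads a single operating point (such as "FRR at 0.1 % FAR") from two lists of match scores. The function name `frr_at_far`, the toy score lists, and the convention that a comparison is accepted when its score is at or above the threshold are assumptions made for this example; they are not part of the VeriSpeak SDK and do not reproduce the test procedure used for the experiments.

```python
def frr_at_far(genuine, impostor, target_far):
    """Return (threshold, far, frr) at the lowest threshold whose FAR <= target_far.

    genuine:  scores from same-speaker comparisons
    impostor: scores from different-speaker comparisons
    A comparison is accepted when score >= threshold (assumed convention).
    """
    # Candidate thresholds: every observed score, plus one above the maximum
    # so that a threshold with FAR = 0 always exists.
    candidates = sorted(set(genuine) | set(impostor))
    candidates.append(candidates[-1] + 1.0)
    for t in candidates:
        # FAR: fraction of impostor comparisons wrongly accepted.
        far = sum(s >= t for s in impostor) / len(impostor)
        if far <= target_far:
            # FRR: fraction of genuine comparisons wrongly rejected.
            frr = sum(s < t for s in genuine) / len(genuine)
            return t, far, frr

# Hypothetical toy scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5]

print(frr_at_far(genuine, impostor, 0.25))  # (0.4, 0.25, 0.0)
print(frr_at_far(genuine, impostor, 0.0))   # (0.7, 0.0, 0.25)
```

Raising the threshold lowers FAR and raises FRR, so sweeping the threshold over all candidates traces out the full ROC curve; a stricter FAR target can only keep FRR the same or increase it.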

VeriSpeak 9.0 text-dependent algorithm tests with XM2VTS and Neurotechnology's internal databases

                                        Exp. 1     Exp. 2     Exp. 3
Total voice samples in the database     2360       309        305
Subjects in the database                295        42         42
Recording sessions per subject          8          1 - 10     1 - 10
Average voice sample length (seconds)   6.167      4.975      6.214
FRR at 0.1 % FAR                        9.500 %    3.108 %    0.286 %