Basic Recommendations for Speaker Recognition

The speaker recognition accuracy of VeriSpeak and MegaMatcher depends on the audio quality during enrollment and identification. Certain constraints should be noted before or during integration of the algorithm into a speaker recognition system, whereas others can be overcome by enrolling the same phrase in different environments.

Voice samples at least 2 seconds long are recommended to ensure recognition quality.

General Security

If the speaker recognition system is used in a scenario with a unique phrase for each user, the passphrase should be kept secret and not spoken where other people may hear it.

Microphones

There are no particular constraints on models or manufacturers when using regular PC microphones, headsets or the built-in microphones in laptops, smartphones and tablets. However, these factors should be noted:

  • The same microphone model is recommended (if possible) for both enrollment and recognition, as different models can produce different sound quality. Also, some models may introduce specific noise or distortion into the audio, or may include hardware sound processing, which will not be present when using a different model. This recommendation is also valid for smartphones and tablets, as different device models may alter the voice in different ways.
  • The same microphone position and distance are recommended during enrollment and recognition. Headsets provide the optimal distance between user and microphone; this distance is also recommended when non-headset microphones are used.
  • Webcam built-in microphones should be used with care, as they are usually positioned at a rather long distance from the user and may provide lower sound quality. The sound quality may also be affected if users change their position relative to the webcam.

Sound Settings

Clear, unmodified sound should be ensured, as some audio software, hardware or drivers may have sound modification enabled by default. For example, Microsoft Windows usually has sound boost enabled by default.

A sampling rate of at least 11,025 Hz and a bit depth of at least 16 bits should be set for voice recording.
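
A recording can be checked against these numbers before it is passed on for enrollment or identification. The following is a minimal sketch that uses only the Python standard library and assumes the sample is stored as an uncompressed PCM WAV file. The function names check_format and check_wav are illustrative and are not part of the VeriSpeak or MegaMatcher SDK.

```python
import wave

MIN_SAMPLE_RATE_HZ = 11025  # minimum sampling rate recommended above
MIN_BIT_DEPTH = 16          # minimum bit depth recommended above
MIN_DURATION_S = 2.0        # minimum sample length recommended earlier


def check_format(sample_rate_hz, bit_depth, duration_s):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if sample_rate_hz < MIN_SAMPLE_RATE_HZ:
        problems.append(
            f"sampling rate {sample_rate_hz} Hz is below {MIN_SAMPLE_RATE_HZ} Hz"
        )
    if bit_depth < MIN_BIT_DEPTH:
        problems.append(f"bit depth {bit_depth} is below {MIN_BIT_DEPTH} bits")
    if duration_s < MIN_DURATION_S:
        problems.append(f"duration {duration_s:.2f} s is below {MIN_DURATION_S} s")
    return problems


def check_wav(path):
    """Read the header of an uncompressed PCM WAV file and check it."""
    with wave.open(path, "rb") as wav:
        sample_rate_hz = wav.getframerate()
        bit_depth = wav.getsampwidth() * 8  # sample width is reported in bytes
        duration_s = wav.getnframes() / sample_rate_hz
    return check_format(sample_rate_hz, bit_depth, duration_s)
```

Calling check_wav on a recorded file returns an empty list when the file meets all three recommendations, so an application could reject or re-record a sample whenever the list is not empty.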

Environment Constraints

The VeriSpeak and MegaMatcher speaker recognition algorithm is sensitive to background noise and to loud voices in the background, which may interfere with the user's voice and affect the recognition results. The following measures may be considered to reduce or eliminate these problems:

  • A silent environment for enrollment and recognition.
  • Several samples of the same phrase, recorded in different environments, can be stored in one biometric template. The user is later matched against all of these samples, which yields much higher recognition quality.
  • Close-range microphones (like those in headsets or smartphones) that are not affected by distant sources of sound.
  • Third-party or custom solutions for background noise reduction, like using two separate microphones for recording user voice and background sound, and later subtracting the background noise from the recording.
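
The two-microphone idea in the last item can be illustrated with a deliberately simplified sketch. It assumes that both recordings are sample-aligned, have the same length, and that the background sound reaches the user's microphone as a scaled copy of what the second microphone records. Real rooms rarely satisfy these assumptions, so practical solutions typically use adaptive filtering or spectral methods instead. The function name subtract_background is illustrative only.

```python
def subtract_background(primary, reference):
    """Remove a scaled copy of the reference signal from the primary signal.

    primary   -- samples from the user's microphone (voice plus background)
    reference -- samples from the background microphone (background only)

    The gain is the least-squares estimate of how strongly the reference
    signal leaks into the primary recording.
    """
    if len(primary) != len(reference):
        raise ValueError("recordings must have the same number of samples")

    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        return list(primary)  # silent reference: nothing to subtract

    gain = sum(p * r for p, r in zip(primary, reference)) / ref_energy
    return [p - gain * r for p, r in zip(primary, reference)]
```

The estimate works only as far as the voice is uncorrelated with the background signal; any part of the voice picked up by the background microphone would be partly subtracted as well.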

User Behavior and Voice Changes

These natural voice changes do not occur often but may affect speaker recognition accuracy:

  • A temporarily hoarse voice caused by a cold or other sickness
  • Different emotional states that affect the voice (e.g. a cheerful voice versus a tired voice)
  • Different pronunciation speeds during enrollment and identification

The aforementioned voice and user behavior changes can be managed in two ways:

  • Separate enrollments of the altered voice, with the records stored in the same person's template;
  • Controlled neutral voice during enrollment and identification.
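
The idea of storing several records in one person's template, and matching a new sample against all of them, can be sketched as follows. This is a generic illustration and not the VeriSpeak or MegaMatcher API: the class TemplateStore, the plain lists of numbers standing in for voice features, and the cosine similarity score are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors, in the range -1.0 to 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TemplateStore:
    """Keeps one template per person; each template holds several records."""

    def __init__(self):
        self._templates = {}  # person id -> list of enrolled feature vectors

    def enroll(self, person_id, features):
        # Each new enrollment is appended to the same person's template,
        # for example a normal voice first and a hoarse voice later.
        self._templates.setdefault(person_id, []).append(list(features))

    def identify(self, probe):
        # A person's score is the best score over all of their records.
        best_id, best_score = None, -1.0
        for person_id, records in self._templates.items():
            score = max(cosine_similarity(probe, r) for r in records)
            if score > best_score:
                best_id, best_score = person_id, score
        return best_id, best_score
```

Taking the best score per person means that an altered voice only needs to resemble one of the stored records, which is why additional enrollments help. A real system would also compare the returned score with a decision threshold before accepting the match.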
 
Copyright © 1998 - 2012 Neurotechnology