SDK Contents and Technical Information
The Neurotechnology AI SDK is intended for developers who want to use our automated speech recognition and speaker diarization engines in their on-premises systems. The SDK allows rapid development of speech-to-text applications. Developers provide their own audio input, text data, etc., and have complete control over the output data, so the Neurotechnology AI SDK functions can be used with any user interface or integrated into third-party systems.
List of components
The table below lists the components of the Neurotechnology AI SDK:
| Components | Microsoft Windows | Linux |
|---|---|---|
| Automated speech recognition engine (more info) | | |
| • ASR Dev. Edition | 1 single computer license | |
| • ASR-10 | Optionally available | |
| • ASR-30 | Optionally available | |
| • ASR-100 | Optionally available | |
| Speaker diarization engine (more info) | | |
| • Diarization Dev. Edition | 1 single computer license | |
| • Diarization-20 | Optionally available | |
| • Diarization-60 | Optionally available | |
| • Diarization-200 | Optionally available | |
| Wrappers for programming languages and platforms | | |
| • Python | + | + |
| • C++ | + | + |
| • .NET | + | |
| • Java | + | + |
| Simple programming samples | | |
| • .NET | + | |
| • Java | + | + |
| Diarization programming samples | | |
| • Java | + | + |
| Documentation | | |
| • Neurotechnology AI SDK documentation | + | |
Automated Speech Recognition Engine
The Automatic Speech Recognition (ASR) engine is responsible for transcribing audio samples into text. The ASR engine can use the output of the Speaker Diarization engine to process recordings with multiple speakers.
The ASR engine is available as these components, which differ in performance capabilities:
- ASR Dev. Edition – intended for initial integration, development, testing, and small-scale production workloads. It is designed to run on a PC with ADD: CPU, GPU and RAM requirements. One license is included with the Neurotechnology AI SDK.
- ASR-10 – intended for systems with small-scale processing capabilities, such as transcribing up to several thousand short phone calls per day. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- ASR-30 – intended for systems with moderate throughput, such as immediate phone call processing. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- ASR-100 – intended for large-scale systems that handle high volumes of audio recordings, such as immediate processing of recent TV or radio shows, or of audio/video uploaded to social media platforms. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- A custom version of the ASR engine with even higher performance is available upon request.
Automated speech recognition engine performance

| Component | ASR Dev. Edition | ASR-10 | ASR-30 | ASR-100 |
|---|---|---|---|---|
| Processing speed (seconds of recording processed per one real-time second) | 2 | 10 | 30 | 100 |
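To make the processing-speed factors above concrete, here is a minimal sketch (illustrative only, not part of the SDK) that converts a recording length and a speed factor from the table into an estimated transcription time:

```python
# Illustrative only: estimates transcription time from the processing-speed
# factors in the table above (seconds of recording per one real-time second).
ASR_SPEED_FACTORS = {
    "ASR Dev. Edition": 2,
    "ASR-10": 10,
    "ASR-30": 30,
    "ASR-100": 100,
}

def estimated_transcription_seconds(recording_seconds: float, component: str) -> float:
    """Rough time estimate for transcribing a recording with the given ASR component."""
    return recording_seconds / ASR_SPEED_FACTORS[component]

# Example: a one-hour recording (3600 s) with ASR-30 takes about 3600 / 30 = 120 s.
print(estimated_transcription_seconds(3600, "ASR-30"))  # 120.0
```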
Speaker Diarization Engine
The Speaker Diarization engine is responsible for recognizing who is speaking and when in a recording, and for marking the detected speakers with timestamps in that recording. The output of the diarization engine can be used by the ASR engine to process recordings with multiple speakers, as in the sketch below.
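A minimal sketch of this diarization-assisted transcription flow, assuming the Python wrapper, is given below. The module, class and method names (`neurotech_ai`, `SpeakerDiarizer`, `SpeechRecognizer`, `diarize`, `transcribe`) are hypothetical placeholders, not the actual SDK API; refer to the SDK documentation and the programming samples for the real interface.

```python
# Hypothetical sketch: diarization output (speaker-labelled time segments) is used
# to drive speech recognition on a multi-speaker recording. All names below are
# placeholders, NOT the real Neurotechnology AI SDK API.
from neurotech_ai import SpeakerDiarizer, SpeechRecognizer  # hypothetical module and classes

def transcribe_multi_speaker(audio_path: str) -> list[tuple[str, float, float, str]]:
    diarizer = SpeakerDiarizer()      # hypothetical
    recognizer = SpeechRecognizer()   # hypothetical

    # Step 1: diarization - who speaks, and when (speaker label + start/end timestamps).
    segments = diarizer.diarize(audio_path)  # hypothetical call

    # Step 2: speech recognition - transcribe each speaker-labelled segment.
    results = []
    for segment in segments:
        text = recognizer.transcribe(audio_path,
                                     start=segment.start,
                                     end=segment.end)  # hypothetical call
        results.append((segment.speaker, segment.start, segment.end, text))
    return results
```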
The Speaker Diarization engine is available as these components, which differ in performance capabilities:
- Diarization Dev. Edition – intended for initial integration, development, testing, and small-scale production workloads. It is designed to run on a PC with ADD: CPU, GPU and RAM requirements. One license is included with the Neurotechnology AI SDK.
- Diarization-20 – intended for systems with small-scale processing capabilities, such as transcribing up to several thousand short phone calls per day. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- Diarization-60 – intended for systems with moderate throughput, such as immediate phone call processing. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- Diarization-200 – intended for large-scale systems that handle high volumes of audio recordings, such as immediate processing of recent TV or radio shows, or of audio/video uploaded to social media platforms. It is designed to run on a server with ADD: CPU, GPU and RAM requirements. Licenses for this component are optionally available with the Neurotechnology AI SDK.
- A custom version of the Speaker Diarization engine with even higher performance is available upon request.
Speaker diarization engine performance

| Component | Diarization Dev. Edition | Diarization-20 | Diarization-60 | Diarization-200 |
|---|---|---|---|---|
| Processing speed (seconds of recording processed per one real-time second) | 4 | 20 | 60 | 200 |
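Since diarization and speech recognition are typically run one after the other on the same recording, the two performance tables can be combined into a rough end-to-end estimate. The sketch below is illustrative only; the sequential-processing assumption and the helper function are not taken from the SDK, while the speed factors come from the tables above.

```python
# Illustrative only: rough end-to-end estimate assuming diarization runs first
# and speech recognition runs afterwards on the same recording.
def estimated_pipeline_seconds(recording_seconds: float,
                               diarization_factor: float,
                               asr_factor: float) -> float:
    """Speed factors are 'seconds of recording per one real-time second' from the tables."""
    return recording_seconds / diarization_factor + recording_seconds / asr_factor

# Example: a 5-minute phone call (300 s) with Diarization-20 and ASR-10:
# 300 / 20 + 300 / 10 = 15 + 30 = 45 seconds.
print(estimated_pipeline_seconds(300, diarization_factor=20, asr_factor=10))  # 45.0
```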
Usage Recommendations
- The recording should be stored in RAM before speaker diarization and speech recognition are performed, so enough free RAM should be ensured. Very long, multi-hour recordings can be divided into shorter chunks (see the pre-processing sketch after this list).
- Voice recordings of at least 1 second in length are recommended to ensure the quality of speech recognition.
- Microphones – there are no particular constraints on models or manufacturers when using regular PC microphones, headsets or the built-in microphones in laptops, smartphones and tablets.
- A constant sound level (loudness) is recommended to ensure the quality of speech recognition.
- Settings for clear sound must be ensured, as some audio software, hardware or drivers may have sound modification enabled by default. For example, Microsoft Windows usually has sound boost enabled by default.
- A sampling rate of at least 8000 Hz, with at least 16-bit depth, should be used during voice recording.
- Environment constraints – in general, the speaker diarization and speech recognition engines produce the best results when clear voice recordings are provided. These specific considerations help ensure better recognition quality:
  - Background noise, which interferes with the speaker's voice, can affect the recognition results; third-party or custom solutions for background noise reduction can be used to pre-process the voice recordings.
  - Multiple people speaking at the same time can affect the recognition results.
  - Short overlaps of speech are acceptable.
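The sketch below illustrates the pre-processing steps mentioned in the recommendations above: checking the sampling rate and splitting a long recording into shorter chunks. It uses the third-party `soundfile` package as an example, and the 10-minute chunk length is an arbitrary illustration, not an SDK requirement.

```python
# Illustrative pre-processing sketch: verify the sampling rate and split a long
# recording into shorter chunks before diarization / speech recognition.
# Uses the third-party `soundfile` package; the 10-minute chunk length is an
# arbitrary example, not an SDK requirement.
import soundfile as sf

MIN_SAMPLE_RATE = 8000   # minimum recommended sampling rate (Hz)
CHUNK_SECONDS = 10 * 60  # example chunk length: 10 minutes

def split_recording(path: str, out_prefix: str) -> list[str]:
    data, sample_rate = sf.read(path)
    if sample_rate < MIN_SAMPLE_RATE:
        raise ValueError(f"Sampling rate {sample_rate} Hz is below the recommended 8000 Hz")

    chunk_len = CHUNK_SECONDS * sample_rate
    paths = []
    for i in range(0, len(data), chunk_len):
        chunk_path = f"{out_prefix}_{i // chunk_len:03d}.wav"
        sf.write(chunk_path, data[i:i + chunk_len], sample_rate)
        paths.append(chunk_path)
    return paths
```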