Unlocking precision and speed: Discover the latest enhancements to Kensho Scribe
The latest Kensho Scribe update delivers a 30% reduction in transcription errors, raising the bar for accuracy across everything from earnings calls to meeting notes.
We are excited to introduce the latest update to Kensho Scribe AI, our speech-to-text transcription tool. From meeting note documentation to earnings call transcription, our enhanced model offers cutting-edge technology to streamline your workflow and deliver precise results. This new model delivers a 30% reduction in errors compared to our previous model while significantly increasing output quality. These latest advancements can transform your transcription experience and drive greater efficiency in your operations.
What’s new?
Cutting-Edge Architecture
At the heart of this update is our new model architecture. We have incorporated a number of advancements in the Automatic Speech Recognition (ASR) space, leveraging the latest developments in neural network design to build a more robust and scalable system. This 1.1-billion parameter architecture enhances Scribe’s ability to process and understand complex speech patterns, providing more accurate transcriptions even in challenging audio conditions. State-of-the-art mechanisms such as global attention and transformer layers have also significantly improved the model’s contextual understanding and nuance detection.
Scribe AI is a powerful solution consisting of several models that work together: one for recognizing speech and understanding diverse accents (Acoustic Model), one for decoding language (Decoder Language Model), one for tracking when speakers change (Speaker Diarization Model), and one for identifying speakers (Speaker Identification Model). This integrated approach provides tremendous value to users by delivering highly accurate and contextually relevant transcriptions.
The results above were derived from a dataset of 220 hours of new earnings calls from August 2024. Word error rate (WER) was calculated using human-annotated transcriptions of those earnings calls. Kensho Scribe is trained specifically for the finance industry, including earnings calls, whereas open-source models are essentially attempting zero-shot formatting and transcription.
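For readers less familiar with the metric, here is a minimal sketch of how WER and a relative error reduction are commonly computed. It is an illustration only, not Kensho's evaluation code: the function names are hypothetical, tokens are split on whitespace and compared exactly (so casing and punctuation differences count as errors), and the 10% and 7% figures in the example are made-up values chosen to show what a 30% relative reduction looks like.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction by which the error rate dropped relative to the old rate."""
    if old_wer <= 0:
        raise ValueError("old_wer must be positive")
    return (old_wer - new_wer) / old_wer


# One dropped word out of four reference words gives a WER of 0.25.
print(word_error_rate("revenue rose ten percent", "revenue rose percent"))

# Hypothetical rates: going from 10% to 7% WER is a 30% relative reduction.
print(round(relative_reduction(0.10, 0.07), 2))
```

Note that a 30% reduction in this sense is relative to the previous error rate, not a drop of 30 percentage points.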
Updated Training Data
Data quality is crucial for accurate speech-to-text solutions. In this update, we trained our model extensively on 300,000 hours of up-to-date, high-quality business and finance data from S&P Global. Another unique competitive advantage lies in the production process of our training data. Models such as Large Language Models (LLMs) often improve with larger datasets scraped from a variety of sources, but ASR systems can experience diminishing returns with variously sourced data because of inconsistencies in labeling. With Kensho Scribe, our dedicated data transcription team meticulously labels earnings transcripts and investor calls, ensuring that Scribe AI is trained on reliable, orthographically consistent data. This enriched data ensures that our model is more precise not only in transcription but also in formatting style. Additionally, a breadth of data accommodating a global audience with diverse linguistic backgrounds and speech patterns has led to improved performance and more accurate transcription results.
Why it matters
Enhanced Accuracy
Our new model features a notable enhancement in transcription accuracy. Leveraging advanced architectures similar to those in modern LLMs and utilizing a broader training dataset, it effectively recognizes and transcribes diverse accents, dialects, proper nouns, and industry-specific jargon. With this update, users experience an overall 30% reduction in word error rates from our previous offering. This includes improvements to formatting, correctly assigning speech to the right speaker, and correctly identifying the right word. This means clearer, more reliable fully-formatted text outputs that require less post-processing and editing.
Comparison of transcripts from the Kensho Scribe audio file
Quicker Turnaround Times
Besides accuracy, speed is essential in transcription. This release enhances both our Scribe AI and Scribe Human-in-the-Loop (HITL) offerings. The new Scribe AI model transcribes one hour of audio in ten seconds. This faster processing ensures businesses receive high-quality information promptly, minimizes costs from delays or prolonged timelines, and boosts team productivity. Scribe AI also improves our Scribe HITL service workflow by reducing the time required for human review. This improvement accelerates turnaround times for our external clients, such as expert networks and investment researchers, who have strict service-level agreements. Additionally, it enables us to generate machine-readable earnings call transcripts more quickly during earnings call season.
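To put the ten-seconds-per-hour figure in perspective, the short sketch below converts it into a speedup over real time and a rough processing estimate for a larger batch. The function names are hypothetical, and the batch estimate assumes processing time scales linearly with audio length and ignores upload, queueing, and human review time, which the figures above do not cover.

```python
SECONDS_PER_HOUR = 3600


def speedup_over_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """How many times faster than playback speed the audio is processed."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_hours: float,
                                 seconds_per_audio_hour: float = 10.0) -> float:
    """Linear estimate of model processing time for a batch of audio.

    Assumption: throughput stays constant regardless of batch size.
    """
    return audio_hours * seconds_per_audio_hour


# One hour of audio in ten seconds.
print(speedup_over_real_time(SECONDS_PER_HOUR, 10))  # 360.0

# A 220-hour batch, under the linear-scaling assumption.
print(estimated_processing_seconds(220))  # 2200.0
```

Under these assumptions, the model runs about 360 times faster than real time, and a batch the size of the 220-hour evaluation set would take roughly 37 minutes of model time.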
Improved User Experience
Enhanced accuracy and faster turnaround times lead to a better user experience. Greater accuracy means fewer errors in transcribed text, and quicker turnaround times mean users receive their transcriptions faster. These efficiency gains allow users to streamline workflows, leading to increased productivity and a smoother user experience.
Kensho’s commitment to innovation drives us to continually enhance our offerings. With this new model, Kensho builds on Scribe AI’s existing advantages by automatically formatting transcripts and eliminating imperfections like filler words and disfluencies, making it ideal for downstream business use cases, including natural language processing applications like text mining and document search. This update also delivers notable benefits such as enhanced accuracy and increased efficiency, resulting in more precise transcriptions and streamlined performance. Additionally, it adapts the model to current market trends. By keeping pace with technological advancements, we not only gain a competitive edge but also uphold industry standards and compliance.