Bulk text, table and key-value extraction made easy.
Access the Data Stored Inside Your Documents
Kensho Extract has been trained on millions of financial documents to make it easier to get access to the data hidden inside your files. Whether you’re looking to make your financial documents machine readable, trying to tie table data into your proprietary database, or looking for specific data points across multiple documents, Kensho Extract can help.
See for yourself! Document Segmentation & Layout Analysis.
Kensho Extract is a fundamental machine learning capability that allows users to get access to the data stored inside their financial documents in a simple-to-use format for further analysis and action. Kensho Extract can be used independently or in conjunction with other services offered by Kensho.
Combining our document layout analysis and table structure recognition models, Kensho Extract allows users to quickly transform their unstructured documents into a machine-readable format that organizes the headers, titles, paragraphs, tables and footers detected within the document in their natural reading order. Our extraction capability interprets messy page layouts, structuring text into cohesive paragraphs that can be effectively analyzed and searched.
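To make the idea of a machine-readable, reading-order layout concrete, here is a minimal sketch in Python. It is not Kensho Extract's API or output schema: the `Block` class, its field names, the type labels, and the sample text are all hypothetical. Sorting by page, then top to bottom, then left to right is a deliberately naive stand-in for real reading-order detection, which also has to handle multi-column pages.

```python
from dataclasses import dataclass


# Hypothetical block model: the field names and type labels are illustrative
# assumptions, not Kensho Extract's actual output format.
@dataclass
class Block:
    kind: str    # "title", "header", "paragraph", "table" or "footer"
    page: int    # 1-based page number
    top: float   # distance from the top of the page, as a fraction of page height
    left: float  # distance from the left edge, as a fraction of page width
    text: str


def reading_order(blocks):
    """Sort blocks page by page, top to bottom, then left to right (naive)."""
    return sorted(blocks, key=lambda b: (b.page, b.top, b.left))


def body_text(blocks):
    """Join titles and paragraphs in reading order, skipping other block types."""
    keep = {"title", "paragraph"}
    return "\n".join(b.text for b in reading_order(blocks) if b.kind in keep)


# Made-up sample blocks, listed out of order on purpose.
blocks = [
    Block("footer", 1, 0.95, 0.1, "Page 1"),
    Block("paragraph", 1, 0.30, 0.1, "Revenue grew during the year."),
    Block("title", 1, 0.10, 0.1, "Annual Report"),
    Block("table", 1, 0.50, 0.1, "Revenue | 100"),
    Block("paragraph", 2, 0.10, 0.1, "Costs were flat."),
]

print(body_text(blocks))
```

Once every detected element carries a type and a position, downstream steps such as search, or pulling out only the tables, reduce to simple filters over the ordered list.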
Kensho Extract will work with you!
Kensho Extract can be accessed in one of three ways:
Human-In-The-Loop (HITL) Services
Automated extraction services are never 100% perfect, but in partnership with S&P Global, Kensho provides you with the best possible experience. The human-in-the-loop service can be staffed on your end with access to our UI, or by Kensho, to allow a more hands-off approach to achieving the highest possible data extraction quality for your unique specifications.
Kensho Extract Use Cases
Frequently Asked Questions (FAQs)
Do you support any type of document?
Do you support languages beyond English?
Do you support table extraction?
Do you ever miss tables?
I only care about a single table in each document (e.g., the income statement). Can you automate its extraction?
Why Kensho Extract?
Structured data is valuable. Whether you’re…
The fundamental block for all of these initiatives is having access to clean, structured data.
Unfortunately, the data most companies have is neither structured nor clean. Whether hidden in slide decks, PDFs, or a database that has mutated a dozen times since inception, data is frequently all but inaccessible. That is, unless you’re willing to invest a lot of incredibly valuable expert time in trying to understand the information and then attempting to structure it via liberal use of spreadsheets.
We feel your pain.
S&P Global employs thousands of trained analysts who process more than 5 million pages of financial content every year. Luckily, all that effort has created one of the largest sets of machine learning training data for corporate financial documents, allowing us to speed up our internal operations by anywhere from 50% to 100%, depending on the task at hand.