...links made to-date and counting fast!
Connected Data is More Valuable Data
Kensho Link is a machine learning service that maps entities in your database to unique ID numbers drawn from S&P Global’s world-class company database with precision and speed. Link uses an AI algorithm trained to return high-quality links, even when the data inputs are incomplete or contain errors.
Once mapped to these unique IDs, a database becomes much more powerful. It is linked and cleaned, making data enrichment and deduplication seamless.
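As a minimal sketch of why shared IDs make deduplication straightforward, the example below groups records by their linked ID. The records, the `company_id` field name, and the ID values are hypothetical and are not taken from Kensho Link's actual output format.

```python
from collections import defaultdict

# Hypothetical records after linking; "company_id" stands in for the linked unique ID.
records = [
    {"name": "Acme Corp", "company_id": "1001"},
    {"name": "ACME Corporation", "company_id": "1001"},
    {"name": "Globex", "company_id": "2002"},
]


def group_by_company_id(rows):
    """Collect the names of all rows that share the same linked ID."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["company_id"]].append(row["name"])
    return dict(groups)


# Any ID that appears on more than one row points to duplicate entries.
duplicates = {
    company_id: names
    for company_id, names in group_by_company_id(records).items()
    if len(names) > 1
}
print(duplicates)
```

Once duplicates share a key, merging them or joining in additional data becomes an ordinary lookup on that key.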
Whether you want to clean and organize your data or enrich it with S&P Global’s data, Kensho Link makes your data more valuable.
See for yourself
Kensho Link returns a unique ID for each of your entities.
Kensho Link compares your submitted data to S&P Global’s Capital IQ database, a leading company database with over 20 million (and growing) entities.
To find a match, Kensho Link works with what you have, even if it’s just a company name.
If available, Link will also use the following to increase the quality of its predictions:
- Phone Numbers
- Street Addresses
- Years founded
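The inputs listed above could be modeled as a record in which only the company name is required. This is an illustrative sketch only: the class name, field names, and payload shape are assumptions and do not describe the actual Kensho Link API schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CompanyRecord:
    """Hypothetical input record; only the name is mandatory."""
    name: str
    phone_number: Optional[str] = None
    street_address: Optional[str] = None
    year_founded: Optional[int] = None


def to_payload(record: CompanyRecord) -> dict:
    """Drop fields that were not provided so only known data is submitted."""
    return {key: value for key, value in asdict(record).items() if value is not None}


# A record with just a name is still valid input.
minimal = to_payload(CompanyRecord(name="Acme Corp"))
# Extra fields are included only when they are available.
richer = to_payload(CompanyRecord(name="Acme Corp", year_founded=1999))
```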
Save Time and Reduce Errors
Manual linking processes are impractical, and traditional automated solutions lack accuracy for all but the simplest linking jobs. Kensho Link accurately processes millions of company entities in hours.
Kensho Link’s strength is based on its algorithms and the unparalleled data of S&P Global backing it.
Using proprietary training data from S&P Global’s unique datasets, Kensho’s machine learning algorithms understand the patterns and data required to make highly accurate link predictions, even if your data is incomplete, uses aliases and abbreviations, or has incorrect information.
What are the benefits to using Kensho Link?
See how an investment management firm enhances its data subscriptions and supports future dataset growth
For one investment management company, the ability to map messy data in the form of company names, aliases, and domains to the CapIQ CompanyID enhances their data subscriptions and promotes future dataset growth.
Not being able to quickly and efficiently pull data on companies in their database is painful. Kensho Link enables this client to do the mapping at scale with confidence. To learn more about how a global media & advertising firm is using Kensho Link, click the button below.
How does Kensho Link work?
When you submit a database record, Kensho Link identifies which records in S&P Global’s databases match your entity.
First, we’ve designed our model to enable smart comparisons between your data and S&P Global’s dataset.
We use best practices from natural language processing (NLP) to compute the similarity of names and addresses much more effectively than traditional keyword matching or fuzzy logic.
We use state-of-the-art tree-based models to learn the optimal ways to weight the similarities of different fields against one another.
This allows us to create an advanced approach that is less susceptible to edge cases than a rules-based model.
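To illustrate why similarity measures can outperform exact keyword matching, here is a minimal sketch of character n-gram (Jaccard) similarity. It is a generic, well-known technique chosen for illustration; it is not Kensho Link's actual model, and the function names are hypothetical.

```python
def char_ngrams(text: str, n: int = 3) -> set:
    """Return the set of character n-grams of a normalized, space-padded string."""
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of the two strings' character n-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# An exact keyword comparison treats a misspelling as a complete miss...
exact_match = "Microsoft" == "Mircosoft"
# ...while the n-gram score still registers substantial overlap.
typo_score = ngram_similarity("Microsoft", "Mircosoft")
unrelated_score = ngram_similarity("Microsoft", "Oracle")
```

In a fuller system, per-field scores like this one would become inputs to a model that learns how much weight each field deserves, rather than being combined with hand-written rules.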
Second, our machine learning is backed by S&P Global’s world-class data and experience linking datasets, something other AI data companies typically cannot leverage.
S&P Global has decades of experience linking external data sources to their databases, and through machine learning we can build on the lessons from every past content set ingestion project when comparing your dataset to S&P Global’s.
A linking service can only be as good as the data it is linking to. With over 20 million companies and growing, S&P Global’s Capital IQ dataset represents a gold standard dataset that will power the links you need to gain the most value from your data.
API Guides & Tutorials
Find our developer documentation for building tools with Link’s APIs, and more
The Kensho Link Story
Kensho Link began as an internal S&P Global project, helping Market Intelligence and CapIQ integrate new datasets more quickly into their platform and Xpressfeed data offerings.
As we learned more about the challenges of integrating successive large datasets, we began applying the lessons learned toward building a generic company linking model, one that would be robust to recognizing nuances and differences among datasets.
In 2018, Kensho successfully linked Crunchbase’s private companies database to S&P Global’s Market Intelligence data, giving CapIQ and Market Intelligence platform users and data subscribers access to privately held companies licensed from Crunchbase.
In 2020, the Kensho Link product launched on Marketplace to support S&P Global’s data product offerings, giving customers direct access to Kensho Link’s capabilities for the first time.
Frequently Asked Questions
AI Linking is the process of joining two datasets together using machine learning models when they do not share a common unique identifier.
Kensho Link makes it simple to identify which records in S&P Global’s CapIQ database match up with your database. It looks at the attributes of your data (company name, address, country, etc.), compares them to candidates from the CapIQ database, and computes a score based on the information you’ve provided and the hundreds of thousands of hand-labeled annotations on which we’ve trained our models. This Link Score quantifies how good a match each of the candidates is.
Kensho Link’s strength comes from two places: its model design and the data backing it.
Link’s design: We’ve designed our Link model to enable smart comparisons between your data and S&P Global’s. We use best practices from NLP, such as TF-IDF-based methods, to compute the similarity of names and addresses in ways more sophisticated than keyword matching, and we use state-of-the-art tree-based models to learn the optimal ways to weight the similarities of different fields against one another. This allows us to create a robust approach that is much more accurate with difficult examples than a rules-based model.
Link’s data: Our industry-leading machine learning is backed by world-class data. S&P Global has decades of experience linking external data sources to their databases, and through machine learning, we leverage the lessons from every past content set ingestion project when comparing your dataset to S&P’s. Because S&P Global’s data universe is so broad — over 20 million companies and growing — Link is able to reliably provide the links you need to make your data actionable.
The Link Score represents the quality of a particular pairwise match between your input and a matched CapIQ CompanyID. Higher scores indicate better matches.
Kensho Link gives you a score associated with each link prediction. This score reflects the strength of the model’s prediction for a given matched pair. You can request up to the top five results for each link. By giving up to five predictions, Kensho Link demonstrates what machines do best — sorting through millions of records to surface the top few — and empowers you to do what people do best — make nuanced decisions about specific data points.
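Selecting the highest-scoring candidates for a single input can be sketched as follows. The candidate list, the `link_score` field name, and the score values are invented for illustration; only the limit of five results per link comes from the description above.

```python
import heapq

MAX_RESULTS = 5  # the description above allows up to five results per link


def top_candidates(candidates, k=MAX_RESULTS):
    """Return the k highest-scoring candidates, best first."""
    if not 1 <= k <= MAX_RESULTS:
        raise ValueError(f"k must be between 1 and {MAX_RESULTS}")
    return heapq.nlargest(k, candidates, key=lambda c: c["link_score"])


# Hypothetical candidates for one submitted record.
candidates = [
    {"company_id": "A", "link_score": 0.12},
    {"company_id": "B", "link_score": 0.97},
    {"company_id": "C", "link_score": 0.55},
    {"company_id": "D", "link_score": 0.31},
    {"company_id": "E", "link_score": 0.08},
    {"company_id": "F", "link_score": 0.74},
]

best_five = top_candidates(candidates)
best_one = top_candidates(candidates, k=1)
```

Keeping several ranked candidates rather than a single answer leaves room for a person to make the final call on ambiguous records.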
With Kensho Link, we make it simple for you to sort out which companies we have coverage for, which ones we can help you enrich, and which ones you might need to clean up. The link scores provide your team with the transparency and insight to develop trust in the model and its predictions, as well as cater the results you receive toward your own use cases.
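One way to act on link scores is to sort records into buckets by their best score. The sketch below assumes scores fall between 0 and 1 and uses arbitrary thresholds; both are assumptions, and the right cutoffs depend on your own use case and tolerance for errors.

```python
from typing import Optional


def triage(best_score: Optional[float], accept_at: float = 0.9, review_at: float = 0.5) -> str:
    """Bucket a record by its best link score (thresholds are illustrative)."""
    if best_score is None:
        return "no_candidates"      # nothing was returned for this record
    if best_score >= accept_at:
        return "linked"             # confident match, ready for enrichment
    if best_score >= review_at:
        return "needs_review"       # plausible match, worth a human look
    return "needs_cleanup"          # weak match, input may need cleaning


# Hypothetical best scores for four submitted records.
best_scores = {"rec-1": 0.97, "rec-2": 0.62, "rec-3": 0.10, "rec-4": None}
buckets = {record_id: triage(score) for record_id, score in best_scores.items()}
```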
Kensho Link is a machine learning service that uses state-of-the-art technology combined with S&P Global’s rich data ecosystem to make predictions for links you might have. The predictions made by Link are determined by a vast set of training data and the implicit patterns therein, not by explicit rules or set weights. We’ve leveraged hundreds of thousands of hand-annotated samples for our training data to provide the highest quality links for your database.
We use state-of-the-art machine learning methods to make our predicted links. Our linking has two key pieces: how we compute similarity between two records, and how we score links based upon those similarities. We use Natural Language Processing to compute similarity in ways smarter than simple keyword matching, and in order to weigh the different similarities against one another, we use a gradient-boosted decision tree, LightGBM, which is both fast and accurate. Specifically, we use both word- and character-based similarity metrics to handle our text-based input fields, which allows us to detect misspellings and weight terms by their relative rarity (for example, the term "Apple" is more important than the term "Incorporated").
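The idea of weighting terms by rarity can be illustrated with a small inverse-document-frequency (IDF) sketch. This is a generic textbook approach, not Kensho Link's implementation; the tiny corpus, the smoothing formula, and the function names are assumptions made for the example.

```python
import math
from collections import Counter


def idf_weights(corpus):
    """Smoothed IDF: terms that appear in fewer names get larger weights."""
    doc_freq = Counter()
    for name in corpus:
        doc_freq.update(set(name.lower().split()))
    total = len(corpus)
    return {term: math.log((1 + total) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def weighted_overlap(a, b, weights, default=1.0):
    """Weighted Jaccard overlap of word sets; unseen terms get the default weight."""
    terms_a, terms_b = set(a.lower().split()), set(b.lower().split())
    union = terms_a | terms_b
    if not union:
        return 0.0
    shared = sum(weights.get(t, default) for t in terms_a & terms_b)
    return shared / sum(weights.get(t, default) for t in union)


# Hypothetical corpus in which "Incorporated" is common and "Apple" is rare.
corpus = [
    "Apple Incorporated",
    "Banana Incorporated",
    "Cherry Incorporated",
    "Durian Incorporated",
]
weights = idf_weights(corpus)

# Sharing the rare term counts for more than sharing the common one.
same_rare_term = weighted_overlap("Apple Incorporated", "Apple Inc", weights)
same_common_term = weighted_overlap("Apple Incorporated", "Banana Incorporated", weights)
```

Scores like these would be one of several similarity features; per the description above, a gradient-boosted tree model then learns how to weigh such features against one another.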
Yes, start your free trial or talk to your S&P Global relationship manager to discuss running a sample.