Research
Our Approach
At Kensho, we work on cutting-edge research and develop leading ML and NLP capabilities for real-world impact.
We hire talented people and give them the freedom and resources needed to accomplish our shared goals. Our team conducts pure research with the goal of creating novel contributions and publishing our work in top-tier venues.
Our Projects
Learn more about what we’re researching.
We focus on problems in finance and business, choosing projects designed to push the envelope and drive impact.
Tokenization
What is the role of tokenization in training LLMs, particularly for financial and quantitative use cases? In our research, we dissect and improve the tokenization process, quantifying impact on pre-training and downstream performance.
LLMs
LLMs show strong performance on many challenging tasks, but they still struggle to solve many real-world problems in business and finance. We’re developing better models for the business world, focusing on the entire pipeline: tokenization, construction of high-quality datasets, alignment, and evaluation.
Numeric Understanding
LLMs often struggle with numeric understanding. While we can alleviate some of this difficulty with code generation, we still want our models to be able to understand and process numerical data. Our team is investigating how well language models use numbers and identifying the mechanisms they learn for doing so.
Benchmarks
LLMs require rigorous evaluation benchmarks, and targeting the reasoning skills needed in business and finance presents unique challenges. We are developing S&P AI Benchmarks, which evaluate a model’s ability to reason about realistic financial problems.
Multi-Document QA
Current state-of-the-art foundation models do not always correctly answer complex questions that require grounding knowledge from multiple sources. We are developing intelligent reading comprehension agents that can process and reason over a range of document collections.
Factuality
As GenAI applications become more prevalent in our daily lives, it’s increasingly important that they produce factually accurate outputs. We develop methods to monitor factuality, with the ultimate goal of providing a tool to manage the model outputs of GenAI products.
Meet the R&D Team
Chris created and leads Kensho’s R&D lab, and he holds a joint faculty appointment at MIT, where he teaches ML and NLP. Previously, he taught and advised graduate students at Harvard. Since 2004, he has conducted ML research within industry, government, and academia, including at MIT Lincoln Laboratory, Spotify, Google, and IBM Watson. He received his PhD from Brown University.
Michael leads language model efforts at Kensho, including training, alignment, and evaluation. Previously, Michael was part of a federally funded R&D center, leading research teams focusing on AI-enhanced decision making and ML security. He holds an M.S. in Computational Science and Engineering from Harvard.
Varshini’s research focuses on developing and evaluating language models for financial applications, with a specific focus on tokenization and retrieval-based approaches (RAG). Prior to joining Kensho’s R&D team as a Research Engineer, she obtained her Master’s in Data Science from Harvard.
Charlie is interested in language model alignment for conversational and code-based use cases, and in understanding what language models learn and how they learn it. Before joining Kensho, he completed his PhD at Brown University as part of the Language Understanding and Representation Lab.
Work with us!
We are looking for world-class researchers to join our growing team.
Publications
Learn more about our research through our publications.
Craig W. Schmidt, Varshini Reddy, Haoran Zhang, Alec Alameddine, Omri Uzan, Yuval Pinter, Chris Tanner
EMNLP - 2024
Vu Trong Kim, Michael Krumdick, Varshini Reddy, Franck Dernoncourt, Viet Dac Lai
EMNLP - 2024
Michael Krumdick, Rik Koncel-Kedziorski, Viet Dac Lai, Varshini Reddy, Charles Lovering, Chris Tanner
ACL - 2024
Varshini Reddy, Rik Koncel-Kedziorski, Viet Dac Lai, Michael Krumdick, Charles Lovering, Chris Tanner
ACL - 2024
Omri Uzan, Craig W. Schmidt, Chris Tanner, Yuval Pinter
ACL - 2024
Amir Pouran Ben Veyseh, Viet Dac Lai, et al.
NAACL (Findings) - 2024
Thuat Nguyen, Chien Van Nguyen, Viet Dac Lai, et al.
LREC-COLING - 2024
Viet Dac Lai, et al.
LREC-COLING - 2024
Duy Pham, Viet Dac Lai
NCME - 2024