Brightseed (San Francisco, CA) has announced a major expansion of its proprietary bioactive dataset. Now at 21 million natural compounds, its database has nearly doubled since 2023. The growth has come with large gains in efficiency: discovery costs have fallen more than 98 percent since the company’s founding in 2018, the company stated.
According to the company, data remains a foundational pillar of Brightseed’s platform and a core driver of its differentiated AI and machine learning capabilities. Combined with advances in profiling and data analysis, this expansion increases both the breadth and depth of the dataset, further strengthening the platform’s ability to generate novel, high-confidence insights.
Brightseed said its dataset enables the generation of functional inference models across more than 23 health territories. Its corporate partners use these models to accelerate innovation across agriculture, health, nutraceuticals, animal health and beyond, according to the company. For example, one of its consumer health partners is using Brightseed’s platform to model and discover complementary pathways in the GLP-1 signaling cascade, with a growing pipeline of candidate molecules already identified.
“The real story here is not just the size of our dataset, but its productivity,” said Lee Chae, PhD, co-founder and CEO of Brightseed. “Brightseed’s platform is built off of the world’s largest biologically connected and actionable library of proprietary natural compounds, and we’ve only tapped a small percentage of this dataset so far. There’s still so much gold for us to explore and discover.”
Unlike public datasets or generalized AI training data, Brightseed said its platform maps its proprietary dataset of small molecules to biological mechanisms and human health outcomes, enabling discovery and validation at unprecedented scale. This dataset powers Brightseed’s innovation platform, including its AI discovery engine, Forager, enabling customers to:
- Identify relationships among novel bioactives and mechanisms not found in public literature
- Evaluate biological relevance and feasibility earlier in development
- Generate differentiated, IP-ready innovation opportunities
- Increase the probability of commercial success
No comparable dataset exists that connects natural compounds to human health outcomes with this level of scale, structure and validation, according to the company.
Brightseed’s innovation platform addresses a structural challenge across life sciences: innovation is fragmented, slow, and high-risk. By integrating proprietary data with AI, Brightseed enables a continuous innovation model that shifts risk earlier, accelerates decision-making, and improves success rates.
For more information, visit www.brightseedbio.com.


