Data-Driven Insights: What 7,000+ Catalyst Proposals Tell Us About Cardano Innovation

Project Catalyst, which has awarded over $100M in grants to 1,800 projects, is the largest innovation fund of its kind on the planet. Through its exciting up-and-down history, one question looms large: what actually works? After diving deep into over 7,000 proposals across Project Catalyst Funds 7 to 12, the team at Sapient has uncovered some fascinating patterns about funding success, gaming and policy changes from fund to fund. We covered 17 open source data science libraries and published 20 data reports, an AI-for-Catalyst showcase, and a MongoDB database and API, all of which you can find on our GitHub. Here are some of the highlights.

The Power of Keywords (Or Why “DeFi” Might Tank Your Proposal)

It turns out that a proposal’s success might hinge on a single word. Our analysis revealed that proposals name-dropping “MLabs” in the title had an astounding 84% success rate. Talk about brand recognition. Meanwhile, proposals featuring “DeFi” struggled to match the average 19% success rate. Mentions of “Africa” consistently outperformed, reflecting Cardano’s strategic focus on the continent. And if you’re wondering about the secret sauce, proposals with “Aiken” in the title (top among popular smart contract innovations) enjoyed a remarkable 69% success rate. The takeaway? ADA holders are highly motivated by new technology and areas where Cardano is already leading.
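For readers who want to reproduce this kind of figure, here is a minimal sketch of a per-keyword success rate. It is not our actual pipeline: the record layout (`title`, `funded`), the function name and the toy data are assumptions made for illustration, and the matching is a plain case-insensitive substring test.

```python
def keyword_success_rates(proposals, keywords):
    """Return {keyword: (funded, total, rate)} for proposals whose title contains it.

    `proposals` is a list of dicts with hypothetical keys "title" and "funded".
    """
    counts = {kw: [0, 0] for kw in keywords}  # keyword -> [funded, total]
    for proposal in proposals:
        title = proposal["title"].lower()
        for kw in keywords:
            # Naive substring match; a real analysis may want word boundaries.
            if kw.lower() in title:
                counts[kw][1] += 1
                if proposal["funded"]:
                    counts[kw][0] += 1
    return {
        kw: (funded, total, funded / total if total else None)
        for kw, (funded, total) in counts.items()
    }


# Toy data, invented purely to exercise the function.
sample = [
    {"title": "Aiken testing toolkit", "funded": True},
    {"title": "DeFi lending on Cardano", "funded": False},
    {"title": "Aiken and DeFi audit helpers", "funded": True},
    {"title": "Community hub in Africa", "funded": True},
    {"title": "DeFi yield dashboard", "funded": False},
]
rates = keyword_success_rates(sample, ["Aiken", "DeFi", "Africa"])
for keyword, (funded, total, rate) in rates.items():
    print(f"{keyword}: {funded}/{total} funded ({rate:.0%})")
```

A rate like this describes correlation only. With few matching titles it is also noisy, so it helps to report the counts alongside the percentage.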

The Rise and Fall of Community Assessment

Remember when your high school grades determined everything? Project Catalyst went through a similar phase. Our analysis of reviewer scores across funding rounds revealed a fascinating trajectory: assessor scores initially gained tremendous influence over funding outcomes, only to see their impact diminish after Fund-10. The reason was a series of policy changes that gave the scores less prominence in the voting app and supported other channels for promoting proposals. It’s democracy’s eternal struggle: balancing expert opinion with community wisdom.

As the boxplot chart shows, the difference between the scores of unfunded (red) and funded (green) proposals was over 1.0 full marks for Fund-7, but narrowed markedly to below 0.2.
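The gap described here can be computed in a few lines. The sketch below uses the difference in median score between funded and unfunded proposals per fund. That choice of statistic is an assumption (a boxplot's centre line is the median), and the fund labels and scores are invented placeholders, not our data.

```python
from statistics import median


def score_gap_by_fund(rows):
    """Return {fund: median funded score - median unfunded score}.

    `rows` is an iterable of (fund, score, funded) tuples; this layout is
    hypothetical and chosen only for the example.
    """
    grouped = {}
    for fund, score, funded in rows:
        grouped.setdefault(fund, {True: [], False: []})[funded].append(score)

    gaps = {}
    for fund, groups in grouped.items():
        # A gap is only defined when both groups have at least one proposal.
        if groups[True] and groups[False]:
            gaps[fund] = median(groups[True]) - median(groups[False])
    return gaps


# Placeholder rows: "Fund-A" has a wide gap, "Fund-B" a narrow one.
rows = [
    ("Fund-A", 4.5, True), ("Fund-A", 4.0, True),
    ("Fund-A", 3.0, False), ("Fund-A", 3.5, False),
    ("Fund-B", 4.0, True), ("Fund-B", 4.25, True),
    ("Fund-B", 4.0, False), ("Fund-B", 4.0, False),
]
gaps = score_gap_by_fund(rows)
print(gaps)
```

Tracking this single number from fund to fund is a compact way to see how much weight assessor scores carried in each round.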

The Numbers Game: Cracking the Funding Code

Want to get funded? You might need anywhere between 40 and 50 million YES votes, but timing is everything. Fund-8 emerged as the “easiest” round to secure funding, while Fund-10 proved to be the toughest nut to crack. Special categories like Catalyst Operations required substantially more votes, a reflection of their prominence in the voting app and the community’s heightened scrutiny of governance-related proposals.

Clone Wars: The Copycats Strike Back

Community requests were at the heart of our project, and one recurring theme was the fear of plagiarism or multiple submissions of cloned proposals. Our analysis uncovered numerous near-identical proposals submitted just before deadlines. Using text analysis from open source NLP libraries, we identified proposal pairs with similarity scores above 0.80. While some similarities stemmed from offering similar services for different cross-chain bridges or languages, others raised red flags about the need for stronger duplicate detection mechanisms and clearer submission rules.
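To make the idea concrete, here is a small sketch of pairwise near-duplicate detection using the 0.80 threshold mentioned above. It uses cosine similarity over raw word counts with only the Python standard library. This is a stand-in chosen for illustration, not necessarily the measure behind our results, and the proposal IDs and texts are made up.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def near_duplicates(proposals, threshold=0.80):
    """Return (id_a, id_b, score) for every pair scoring above the threshold."""
    flagged = []
    for (id_a, text_a), (id_b, text_b) in combinations(proposals.items(), 2):
        score = cosine_similarity(text_a, text_b)
        if score > threshold:
            flagged.append((id_a, id_b, round(score, 3)))
    return flagged


# Invented examples: P1 and P2 differ only in the target chain.
texts = {
    "P1": "We will build a cross-chain bridge between Cardano and Ethereum with audits",
    "P2": "We will build a cross-chain bridge between Cardano and Solana with audits",
    "P3": "Community meetups for students learning Haskell",
}
flagged = near_duplicates(texts)
print(flagged)
```

The P1/P2 pair illustrates the ambiguity noted above: a high score can mean a clone, or a legitimate template reused for a different chain, so flagged pairs still need human review. Comparing every pair grows quadratically with the number of proposals, which is workable at a few thousand but worth keeping in mind.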

The AI Revolution Meets Cardano Governance

Our research also explored how AI could transform the proposal ecosystem with the latest innovations such as generative AI, data mining and natural language processing. From detecting fraudulent submissions to generating insights from historical data, the potential applications are vast. Think of it as having a digital assistant that’s read every proposal ever submitted and can spot patterns invisible to the human eye. The future might see AI helping with everything from proposal writing to voter education, though hopefully not replacing human creativity entirely.

The most promising areas, in our opinion, are chatbot-type LLMs trained on official docs, past funds, chat history and so on, for example to improve onboarding and post-funding assistance. For governance purposes, more targeted supervised learning algorithms show promise.

Building the Database of all Proposals’ Voting and Funding Data

To make this treasure trove of data accessible to all, we’ve created a MongoDB-powered database that houses every proposal detail since Fund-7. Unlike traditional CSV files, this system offers lightning-fast queries and real-time analytics. It can be accessed via a dedicated API with intuitive syntax such as:

python3 catalyst_query.py --fund Fund10 --challenge "Ecosystem" --min_score 4.5 --status FUNDED
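For the curious, the sketch below shows one way command-line flags like these could be translated into a MongoDB filter document. It is not the actual source of catalyst_query.py: the field names (`fund`, `challenge`, `score`, `status`) and the choice of a case-insensitive partial match for the challenge are assumptions. Only the four flag names come from the example command.

```python
import argparse
import re


def parse_args(argv):
    """Parse the four flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Query Catalyst proposals (sketch)")
    parser.add_argument("--fund")
    parser.add_argument("--challenge")
    parser.add_argument("--min_score", type=float)
    parser.add_argument("--status")
    return parser.parse_args(argv)


def build_filter(args):
    """Build a MongoDB filter dict; the document field names are hypothetical."""
    query = {}
    if args.fund:
        query["fund"] = args.fund
    if args.challenge:
        # Assumed behaviour: case-insensitive partial match on the challenge name.
        query["challenge"] = {"$regex": re.escape(args.challenge), "$options": "i"}
    if args.min_score is not None:
        query["score"] = {"$gte": args.min_score}
    if args.status:
        query["status"] = args.status
    return query


example = build_filter(parse_args([
    "--fund", "Fund10", "--challenge", "Ecosystem",
    "--min_score", "4.5", "--status", "FUNDED",
]))
print(example)

# With pymongo installed and a database available, the filter would be used as
# collection.find(example); the connection and collection names are omitted here
# because they are deployment-specific.
```

Building the filter as a plain dict keeps the query logic testable without a running database. Every flag is optional, so omitting one simply widens the search.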

The future of decentralized funding isn’t just about distributing ADA. It’s about distributing knowledge. By understanding what works, what doesn’t, and why, we’re learning how to make better decisions as a community, and hopefully how to tweak and improve the process from fund to fund.

Looking Ahead: Sapient Fund 13 Proposals

We’re introducing two projects that bring our Fund-11 toolkit and insights to a broader audience and add new technical capabilities. Both aim to foster fairness and smarter decision-making. “Arvo: Fair Funding for All” aims for a more balanced distribution of funds, giving smaller projects a fair chance to succeed. “dRep AI Assistant” empowers dReps with data-driven insights, supporting more informed voting decisions. Together, these initiatives aim to strengthen transparency, fairness, and community-driven impact in Catalyst’s next chapter.
