Scope, There It Is!
Redefining the front-end of AI innovation with a PSA—helping organizations move from vague ideas to viable project plans.
Across industries, AI has evolved from a buzzword to a real driver of innovation. Yet, when it comes to solving society’s most pressing problems (like public health, urban mobility, or disaster relief), the path from idea to AI solution remains frustratingly unclear. While companies and nonprofits alike are eager to harness AI for good, they often find themselves stuck at the very beginning: scoping the problem.
The Scoping Bottleneck
“Problem scoping” may not sound like the most glamorous part of launching an AI project, but it is arguably the most critical. It’s the phase where organizations figure out what problem they’re actually trying to solve and how AI could realistically help. In corporate settings, this often means refining vague goals into specific, solvable challenges. In social impact spaces (such as global health or humanitarian aid), it’s even more difficult. Teams often lack deep technical knowledge, access to relevant data, or alignment between domain experts and technologists.
Think of it like this: wanting to use AI to “improve maternal health outcomes” sounds ambitious and noble, but what exactly does that mean in practice? Should the system flag high-risk pregnancies early? Optimize clinic staffing? Predict medication stockouts? Without a sharply defined problem, even the best algorithms can’t do much.
This scoping process is slow, subjective, and resource-heavy. It relies on back-and-forth consultations between experts, extensive literature reviews, data audits, and endless iterations. Many well-intentioned AI-for-social-good initiatives never make it past this stage.
That’s the problem the researchers behind “Towards Automated Scoping of AI for Social Good Projects” set out to solve.
Enter the Problem Scoping Agent
The team developed a new AI-powered tool called the Problem Scoping Agent (PSA), a kind of digital consultant built to help organizations define the “what” and “how” of AI for good. At its core, the PSA uses cutting-edge large language models, or LLMs (the same kind of technology that powers ChatGPT), prompted to interpret complex questions, review research papers, synthesize data, and generate thoughtful responses.
But the innovation isn’t just in the tech. What makes the PSA special is how it’s deployed. Rather than trying to automate entire AI solutions, the PSA focuses exclusively on generating high-quality project proposals. These proposals lay out a potential problem, explain how AI might help, highlight relevant datasets, and consider implementation feasibility.
Here’s how it works: A user (say, a nonprofit project manager) provides a broad prompt like “use AI to improve childhood nutrition in low-income areas.” The PSA then scans available literature, data sources, and past projects. It synthesizes all this information and produces a structured proposal: outlining a well-defined problem, proposing specific AI techniques, identifying necessary datasets, and flagging potential risks or limitations.
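The paper describes this workflow in prose rather than publishing code here, so the sketch below is only a minimal illustration of the loop just described: gather context, narrow the problem, then assemble a structured proposal. Every name in it (ScopingAgent, Proposal, the search helpers, and the generic llm call) is an assumption made for the example, not the researchers’ actual implementation.

```python
# Hypothetical sketch of a problem-scoping pipeline in the spirit of the PSA.
# All names below are assumed stand-ins for illustration only.
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """The structured output the article describes: problem, methods, data, risks."""
    problem_statement: str
    ai_techniques: list[str] = field(default_factory=list)
    datasets: list[str] = field(default_factory=list)
    risks: list[str] = field(default_factory=list)


class ScopingAgent:
    def __init__(self, llm, search_literature, search_datasets):
        self.llm = llm                              # any text-in, text-out LLM call
        self.search_literature = search_literature  # returns prior work for a goal
        self.search_datasets = search_datasets      # returns candidate data sources

    def scope(self, broad_goal: str) -> Proposal:
        # 1. Gather context: prior projects and available data for the broad goal.
        papers = self.search_literature(broad_goal)
        datasets = self.search_datasets(broad_goal)

        # 2. Narrow the vague goal into one specific, solvable problem.
        problem = self.llm(
            f"Goal: {broad_goal}\nPrior work: {papers}\n"
            "Restate this goal as one specific, solvable problem."
        )

        # 3. Ground the rest of the proposal in the gathered context.
        techniques = self.llm(
            f"Problem: {problem}\nList suitable AI techniques, one per line."
        ).splitlines()
        risks = self.llm(
            f"Problem: {problem}\nDatasets: {datasets}\n"
            "List implementation risks or limitations, one per line."
        ).splitlines()

        return Proposal(problem, techniques, datasets, risks)
```

A nonprofit team could wire the same three-step skeleton to any LLM API and any literature or dataset index; the essential design choice is that the agent’s job ends with a proposal document, not a trained model.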
In short, the PSA helps teams jump-start their projects by cutting through ambiguity and surfacing the most viable AI opportunities (without needing a full-time data science team).
This kind of tool is more than a productivity booster. It’s a bridge—helping mission-driven teams turn intentions into actionable strategies, and expanding the reach of AI into sectors that have traditionally struggled to access it.
Once the researchers built the PSA, the next step was to put it to the test. Could an AI-driven assistant actually generate useful, actionable project proposals for AI-for-social-good initiatives (on par with those created by human experts)?
To answer this, the research team designed a series of controlled experiments. These weren’t theoretical demonstrations; they tested the PSA in real-world-like scenarios to assess its practical effectiveness.
Putting the PSA to Work
The researchers tasked the PSA with generating project proposals across a variety of domains: public health, education, transportation, and more. Each assignment started with a high-level prompt, such as “improve access to maternal health in underserved areas.” From there, the PSA independently generated a structured proposal that included a clearly defined problem, relevant AI methods, available datasets, and potential implementation risks.
But the researchers didn’t stop at merely generating outputs. To objectively assess quality, they conducted blind evaluations, a method borrowed from academic peer review and product design testing. A panel of subject matter experts reviewed a mix of proposals without knowing whether each had been written by the AI system or a human expert. Each proposal was scored on dimensions like clarity, relevance, depth of analysis, creativity, and feasibility.
This method created a level playing field. The goal wasn’t to prove that the PSA was superior to human experts; it was to see whether it could produce work of comparable quality. If an AI tool could consistently meet expert-level standards, it would mark a significant breakthrough in democratizing access to AI project development.
How Success Was Measured
The evaluation framework was rooted in real-world applicability. The research team wasn’t interested in flashy outputs or impressive-sounding buzzwords. They focused on whether the PSA’s proposals were actually useful for organizations considering AI-driven interventions.
Each proposal was evaluated on the following key criteria (a simple scoring sketch follows this list):
- Problem clarity: Is the problem clearly and specifically defined—avoiding vague or overly broad framing?
- Alignment with AI capabilities: Does the proposed solution fit the strengths and limitations of AI technology?
- Evidence grounding: Are the recommendations based on real data sources, relevant research, or prior work in the field?
- Feasibility and risks: Does the proposal acknowledge implementation constraints, such as data availability or ethical concerns?
- Innovativeness: Does the proposal surface fresh or underexplored ideas that a human team might not consider?
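The study’s exact rubric, scales, and scoring procedure aren’t reproduced in this article, but a rough sketch of how such a blind review could be tallied looks like the snippet below. The criterion keys mirror the list above, while the 1-to-5 scale and the data layout are assumptions.

```python
# Illustrative only: tallying a blind review along the criteria listed above.
# The criterion names mirror the article; the 1-5 scale and layout are assumed.
import random
from statistics import mean

CRITERIA = ["problem_clarity", "ai_alignment", "evidence_grounding",
            "feasibility_risks", "innovativeness"]


def blind_review(proposals, reviewers):
    """proposals: dicts with 'text' plus a hidden 'author' tag ('psa' or 'human').
    reviewers: callables mapping proposal text -> {criterion: score from 1 to 5}.
    Returns the mean score per criterion, grouped by the hidden author type."""
    order = list(proposals)
    random.shuffle(order)  # proposals arrive in random order, with no author labels
    scores = {"psa": {c: [] for c in CRITERIA}, "human": {c: [] for c in CRITERIA}}
    for proposal in order:
        for review in reviewers:
            rating = review(proposal["text"])  # only the text is shown to the reviewer
            for criterion in CRITERIA:
                scores[proposal["author"]][criterion].append(rating[criterion])
    return {author: {c: mean(vals) for c, vals in per_criterion.items() if vals}
            for author, per_criterion in scores.items()}
```

Comparing the two resulting rows of averages captures the question the researchers actually asked: not whether the PSA “wins,” but whether its scores land in the same range as those earned by human-written proposals.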
In addition to expert reviews, the researchers also used internal metrics from the PSA system to understand how it sourced information, filtered data, and synthesized its final outputs. This added layer of analysis ensured that the PSA wasn’t simply generating plausible-sounding content; it was reasoning through problems in a way that mirrored expert decision-making.
What They Found
The results were encouraging. The PSA frequently produced proposals that reviewers rated as comparable in quality to those written by experienced human teams. In some cases, it even outperformed the human-written proposals on novelty and structured thinking, thanks to its ability to scan vast bodies of literature and synthesize insights that might otherwise go unnoticed.
Crucially, the evaluation highlighted that while the PSA wasn’t perfect, it was good enough to be useful. That distinction matters: it means the tool isn’t replacing human strategists, but rather enhancing their ability to generate high-quality ideas quickly and at scale.
These findings validated the PSA’s core promise: that AI can play a meaningful role not just in solving problems, but also in defining which problems to solve in the first place.
The PSA’s success wasn’t measured by how impressive it sounded, but by how usable its outputs were for actual decision-makers. Evaluation hinged on one central question: Would a real-world team find this proposal credible, actionable, and valuable?
Reviewers weren’t told whether they were assessing proposals written by humans or by the PSA. They simply judged based on quality. This blind evaluation design was essential. It stripped away bias and put the tool’s work on even footing with experienced professionals. Across multiple rounds, the PSA held its own, producing proposals that met the standards expected from seasoned strategy teams or data scientists. That’s not just a technical win; it’s also a strategic one. It signals that the early stages of AI project development can be augmented, or even accelerated, with the right tools.
But no solution is without trade-offs.
Recognizing the Limits
Despite its strong performance, the PSA isn’t a magic wand. Its outputs, while often impressive, can vary in quality depending on the prompt, context, and available data. Like all AI systems, it is only as good as the information it draws from. In domains with limited public research or sparse data, the PSA might struggle to generate nuanced insights.
It also has no lived experience or institutional memory. For example, a human strategist working in education reform might intuitively understand that a certain intervention failed in the past due to political friction or community resistance. The PSA, on the other hand, will miss that unless such knowledge is documented in accessible sources. This makes human oversight not just helpful, but also necessary.
Additionally, while the PSA can surface plausible directions for AI applications, it doesn’t replace the need for cross-functional alignment, stakeholder buy-in, or on-the-ground testing. In other words, it helps frame the “what” and “how” of a project, but real execution still depends on human leadership.
What Comes Next
The future direction of this work is as much about collaboration as it is about technology. Researchers envision enhanced versions of the PSA that can incorporate user feedback, learn from past proposal performance, and even customize outputs based on organizational goals. Think of it evolving from a static idea generator into an interactive planning partner.
There’s also room to expand its reach. While the initial experiments focused on social-good projects, the same scoping challenges exist in corporate ESG strategy, sustainability planning, and digital transformation efforts. The PSA’s core engine (structured, knowledge-informed synthesis) could be tailored for many sectors where teams are under pressure to innovate but short on technical guidance.
Why It Matters
Perhaps the most important implication is what this tool represents: a shift in how we think about AI’s role in society. Too often, AI is treated like a black box, reserved for technologists or researchers. But the PSA flips the script. It empowers policy-makers, nonprofits, and business leaders to lead AI initiatives by lowering the barrier to entry. It opens the door for more inclusive participation in innovation.
By automating the most resource-intensive part of the AI project lifecycle (problem definition), the PSA helps redirect energy toward action. And in the context of urgent societal challenges, that’s not just helpful. It’s also vital.
In the end, this research isn’t just about a clever application of AI. It’s about reimagining the front end of problem-solving, where good intentions often stall—and replacing friction with momentum. That shift could unlock a new generation of AI-for-good initiatives, not driven by tech giants, but by the people and organizations closest to the problems.
Further Readings
- Emmerson, J., Ghani, R., & Shi, Z. R. (2025, April 28). Towards automated scoping of AI for social good projects. arXiv.org. https://arxiv.org/abs/2504.20010
- Mallari, M. (2025, April 30). Define and conquer. AI-First Product Management by Michael Mallari. https://michaelmallari.bitbucket.io/case-study/define-and-conquer/