Morgan Stanley uses AI evals to shape the future of financial services

Morgan Stanley collaborated with OpenAI to build AI solutions that empower financial advisors with faster insights, more informed decisions, and efficient summarization tools to deepen client relationships. Their success was grounded in a robust evaluation framework that ensures AI performs reliably, consistently, and at the high standards advisors expect.

By embedding GPT-4 into their workflows, Morgan Stanley Wealth Management has enhanced how financial advisors access the firm’s knowledge base and respond to client needs. Today, over 98% of advisor teams actively use AI @ Morgan Stanley Assistant—Morgan Stanley’s internal chatbot for answering financial advisors’ questions—for seamless internal information retrieval.

“This technology makes you as smart as the smartest person in the organization. Each client is different, and AI helps us cater to each client’s unique needs.”
Jeff McMillan, Head of Firmwide AI at Morgan Stanley

Building a foundation: evaluations that drive adoption

Rolling out AI in financial services required confidence that the technology would deliver outsized value while meeting the firm’s strict standards for quality and reliability.

Morgan Stanley met this challenge by implementing an evaluation (eval) framework to test every AI use case before deployment. Evals measure how models perform against real-world use cases and guide improvements, with expert feedback, at every step.

The team began with three targeted goals for their first AI use cases:

  • Faster information retrieval to save advisors hours of document searching.

  • Automation of repetitive tasks like summarizing research reports.

  • Enhanced insights tailored to client needs.

To evaluate GPT-4’s performance against their experts, Morgan Stanley ran summarization evals to test how effectively the model condensed vast amounts of intellectual capital and process-driven content into concise summaries. Advisors and prompt engineers graded AI responses for accuracy and coherence, allowing the team to refine prompts and improve output quality.
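As a rough illustration of how graded responses can guide prompt refinement, the sketch below averages human scores per prompt version. It is not Morgan Stanley's implementation: the `Grade` record, the 1-5 scale, the version labels, and the tie-breaking rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Grade:
    prompt_version: str  # hypothetical label for the prompt being tested
    accuracy: int        # human score, assumed 1-5 scale
    coherence: int       # human score, assumed 1-5 scale


def summarize_grades(grades):
    """Return mean accuracy and coherence, plus sample count, per prompt version."""
    by_version = {}
    for grade in grades:
        by_version.setdefault(grade.prompt_version, []).append(grade)
    return {
        version: {
            "accuracy": mean(g.accuracy for g in items),
            "coherence": mean(g.coherence for g in items),
            "n": len(items),
        }
        for version, items in by_version.items()
    }


def best_version(summary):
    """Pick the version with the highest mean accuracy, ties broken by coherence."""
    return max(
        summary,
        key=lambda v: (summary[v]["accuracy"], summary[v]["coherence"]),
    )


# Illustrative grades only; these are not real evaluation results.
sample_grades = [
    Grade("v1", accuracy=3, coherence=4),
    Grade("v1", accuracy=4, coherence=4),
    Grade("v2", accuracy=5, coherence=4),
    Grade("v2", accuracy=4, coherence=5),
]
summary = summarize_grades(sample_grades)
print(best_version(summary))
```

Ranking by accuracy first is one possible choice for a setting where a wrong summary costs more than an awkward one; a real framework would likely weigh more dimensions and far more graded samples.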

The eval framework wasn’t static; it evolved as the team learned. They next introduced translation evals for multilingual clients and worked closely with OpenAI to fine-tune retrieval methods, ensuring AI could handle an ever-expanding document library. 

“We went from being able to answer 7,000 questions to a place where we can now effectively answer any question from a corpus of 100,000 documents,” says David Wu, Head of Firmwide AI Product & Architecture Strategy at Morgan Stanley.

McMillan notes the impact that fast and reliable answers from AI @ Morgan Stanley Assistant have had on advisors’ conversations. “Now, advisors can engage clients on topics they haven’t discussed before because the friction between knowledge and communication has gone to zero.”

Scaling success from pilot to firmwide use

Building on the success of AI @ Morgan Stanley Assistant, the team launched AI @ Morgan Stanley Debrief, a meeting summary tool for financial advisors powered by Whisper and GPT-4.

With client consent, Debrief turns Zoom recordings into two kinds of actionable output: client notes, which are automatically integrated into CRM systems, and draft follow-ups that summarize key action items for advisors to refine and send.

Advisors review and adjust AI-generated outputs before finalizing them, maintaining a balance between automation and human oversight.

Both tools benefited from Morgan Stanley’s eval-driven approach. For Debrief, the team developed evaluation datasets representing various meeting types and rigorously tested the model’s ability to capture critical action items without introducing errors.
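One way to picture a check like this is to compare the action items a model extracts from a meeting against a reviewer-approved list. The sketch below is only an assumption about how such a score could be computed; the function name, the exact-match comparison, and the sample items are hypothetical, and a real eval would need fuzzier matching or human review.

```python
def score_action_items(expected, extracted):
    """Compare extracted action items against a reviewer-approved list.

    Matching is exact after trimming and lowercasing, a deliberate
    simplification for this sketch.
    """
    expected_set = {item.strip().lower() for item in expected}
    extracted_set = {item.strip().lower() for item in extracted}

    captured = expected_set & extracted_set
    missed = expected_set - extracted_set        # critical items the model dropped
    unsupported = extracted_set - expected_set   # items with no basis in the reference

    recall = len(captured) / len(expected_set) if expected_set else 1.0
    precision = len(captured) / len(extracted_set) if extracted_set else 1.0
    return {
        "recall": recall,
        "precision": precision,
        "missed": sorted(missed),
        "unsupported": sorted(unsupported),
    }


# Hypothetical example for a single meeting in an evaluation dataset.
reference = ["Send updated portfolio review", "Schedule follow-up call"]
model_output = ["send updated portfolio review", "Open a new account"]
result = score_action_items(reference, model_output)
print(result)
```

Here recall tracks whether critical action items were captured, while the `unsupported` list surfaces items the model introduced on its own.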

“The feedback from advisors has been overwhelmingly positive. They’re more engaged with clients, and follow-ups that used to take days now happen within hours.”
Kaitlin Elliott, Head of Firmwide Generative AI Solutions, Morgan Stanley

Strengthening trust with controls

To meet financial services’ rigorous compliance standards, Morgan Stanley integrated quality assurance into their eval framework. Daily testing with a regression suite of sample questions identified potential weaknesses and improved the system’s ability to deliver compliant outputs.
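A daily regression suite of this kind can be sketched as a list of sample questions, each paired with checks on the answer. Everything below is illustrative: the case format, the phrase-matching checks, and the `stub_assistant` function standing in for the real system are assumptions, not details from Morgan Stanley.

```python
def run_regression(answer_fn, cases):
    """Run each sample question through answer_fn and collect failing cases."""
    failures = []
    for case in cases:
        answer = answer_fn(case["question"]).lower()
        # Phrases a compliant answer is expected to contain.
        missing = [
            p for p in case.get("must_include", []) if p.lower() not in answer
        ]
        # Phrases a compliant answer must never contain.
        forbidden = [
            p for p in case.get("must_not_include", []) if p.lower() in answer
        ]
        if missing or forbidden:
            failures.append(
                {
                    "question": case["question"],
                    "missing": missing,
                    "forbidden": forbidden,
                }
            )
    return failures


# Hypothetical sample questions and checks, invented for this sketch.
SAMPLE_CASES = [
    {
        "question": "What is the fee schedule for account type A?",
        "must_include": ["fee schedule"],
        "must_not_include": ["guaranteed"],
    },
    {
        "question": "Can I promise a client a fixed return?",
        "must_include": ["cannot"],
        "must_not_include": ["guaranteed"],
    },
]


def stub_assistant(question):
    """Stand-in for the real assistant; returns canned answers."""
    canned = {
        "What is the fee schedule for account type A?":
            "See the current fee schedule document for account type A.",
        "Can I promise a client a fixed return?":
            "Yes, returns are guaranteed.",
    }
    return canned.get(question, "")


failures = run_regression(stub_assistant, SAMPLE_CASES)
for failure in failures:
    print(failure)
```

Running the same cases every day means a change to prompts or retrieval that breaks a previously passing answer shows up as a new entry in `failures`.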

“Based on all the questions we input and outputs we’re getting, we’d sit with OpenAI and say, ‘What can we change about our retrieval methods to help the accuracy we need at Morgan Stanley?’” says Elliott.

OpenAI’s zero data retention policy also addressed key security concerns, ensuring Morgan Stanley’s proprietary data remains private.

“One of the first questions we get is, is our information going to be used by OpenAI to train the public ChatGPT?” says Wu. “The OpenAI team’s willingness to ensure zero data retention has been really impactful.”

98% adoption, increased engagement, and new services potential

Morgan Stanley’s focus on quality and reliability has led to trusted, secure solutions that employees want to use:

  • Nearly all advisor teams now use AI tools like the Assistant daily, achieving over 98% adoption in wealth management.

  • Access to documents has jumped from 20% to 80%, dramatically reducing search time and increasing document retrieval efficiency.

  • Advisors spend more time on client relationships, thanks to task automation and faster insights.

Their strong eval framework has also unlocked a flywheel for future solutions and services. With AI @ Morgan Stanley becoming a “super app” for employees, Morgan Stanley envisions countless use cases across departments and is already scaling Assistant functionality for its institutional securities group.

“We’re building platforms that will support many other use cases,” says Wu. “Debrief is currently for advisors speaking to clients, but why couldn’t we make that available for the investment banker speaking to the CFO?” 

“It’s a fundamental change which both improves the quality of our content and creates new products and services that only people who are close to the problem can imagine,” says McMillan.