Accelerating and refining UX tests with AI and multi-agent systems.
As an experiment with Synthetic Personas shows, the value of AI lies not in replacing entire processes, but in solving specific bottlenecks.
Here at Taqtile, we view onboarding not just as an integration process, but as an invitation to practical experimentation with new technologies. One outcome of this approach was Ysabella Andrade's experiment using Synthetic Personas for UX testing, which reinforced one of the most important lessons about using technology:
Artificial Intelligence solutions deliver results when they start from specific bottlenecks and grow with each result achieved, rather than trying to automate entire processes.
In her approach, instead of trying to replace the complete UX research process, Ysabella focused on a critical stage: the intermediate refinement between the initial prototype and the first usability test. As a result, the product gains maturity before being tested with real people, adding speed and value to the process, since sessions with real participants can then be reserved for more strategic questions.
The article below details the process, covering the engineering of the agents involved and the lessons learned about the limits and potential of this approach for designers and PMs.
And if you want to know more about the adoption of AI in large companies, check out the Taqtile Radar, a report with key learnings and market data about technology adoption in 2025.
Synthetic Personas: Practical Learnings for UX Testing
By Ysabella Andrade.

Concept/Prompt: Ysabella Andrade.
And what if you could test your product with dozens of "users" before recruiting a single real participant?
It seems contradictory, but this is exactly what synthetic personas with AI allow you to do.
During my onboarding at Taqtile, I received a challenge: to understand how synthetic personas could optimize the refinement of the design/product before it is tested with real users.
Our goal was to use the onboarding period, a time for understanding processes and receiving training, to test whether AI could advance hypotheses and surface usability and business-rule failures before taking the product to real validation.
Below, I detail the process and the main learnings:
1. The Scenario: the journey in B2B logistics
For the prototype to make sense, it needs to solve a real problem and utilize real data. The product tested for our analysis was a B2B e-commerce platform aimed at the distribution of consumer goods. This business aims to make the entire purchase process digital through its official channels, especially the mobile app.
In this product, the heavy user is the company's representative who serves other businesses. We were not dealing with a common shopping cart, but with a complex journey that needed to cater to various logistical scenarios.
Thus, the challenge was to simplify this dense process without reducing its informational content, maintaining the trust of both buyers and sellers. Since time was tight, we saw an opportunity to test synthetic personas to mature the prototype and understand the possibilities and limits of AI in UX research: will it replace real people? Does it generate valuable insights backed by reliable evidence? And how do you structure a good test in this format?
2. The process: prompt engineering and security in the terminal
To ensure technical accuracy and data security, we did not use consumer interfaces (such as ChatGPT on the web, the Gemini web app, etc.). We operated via the Gemini CLI with an API key, directly in the terminal.
This choice gave us greater control over the variables and ensured that sensitive business data was processed in a controlled environment and not used to train the model.
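As a sketch, a minimal non-interactive invocation could look like the following. Note that this is illustrative: the flag and environment-variable names below reflect common Gemini CLI conventions, but may vary by version, so check `gemini --help` for your install.

```shell
# Sketch only: verify flag names against your Gemini CLI version.
# The API key is read from an environment variable, so business data
# goes straight to the API instead of a consumer web UI.
export GEMINI_API_KEY="your-api-key-here"

# Run a one-off prompt non-interactively from the terminal.
gemini -p "Summarize the interview transcriptions in ./research into personas."
```

Running in the terminal this way also makes the calls scriptable, which is what enables chaining agents into a pipeline.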
We structured our AI into two complementary agents, creating a "pipeline" for data analysis:
The Research Agent: its mission was synthesis. It consumed raw interview transcriptions, Looker metrics, and business journeys to generate actionable personas. No invented data: if the information was not in the source material, the agent reported the gap.
The Senior UX Agent: this one acted as an interview simulator. It received the personas created by the first agent and ran a roleplay. Through it, we could question the prototype from different perspectives, alternating between response modes (Roleplay, Critique, or Mixed).
The final artifact: from the two agents above, our synthetic persona was born, created by Agent 1 and used by Agent 2. With this artifact, we could anticipate how a real person might respond to the questions and screens we presented.

Above is part of the prompt from the “researcher” agent, with the description of the command, result format, and response limitations.
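To make the pipeline shape concrete, here is a minimal Python sketch of the two-agent chain described above. The prompt wording, function names, and the `call_model` callback are all assumptions for illustration; the actual experiment ran through the Gemini CLI with its own prompts.

```python
# Illustrative sketch of the two-agent pipeline: Agent 1 builds a
# persona strictly from source data; Agent 2 uses it to simulate
# an interview in a given response mode.

RESEARCHER_SYSTEM = (
    "You are a Research Agent. Build personas ONLY from the source "
    "material below. If information is missing, report the gap "
    "explicitly instead of inventing it."
)

SENIOR_UX_SYSTEM = (
    "You are a Senior UX Agent. Embody the persona provided and answer "
    "in the requested mode: roleplay, critique, or mixed."
)

def build_researcher_prompt(raw_material: str) -> str:
    """Agent 1: turns transcriptions/metrics/journeys into a persona."""
    return f"{RESEARCHER_SYSTEM}\n\n--- SOURCE MATERIAL ---\n{raw_material}"

def build_ux_prompt(persona: str, question: str, mode: str = "roleplay") -> str:
    """Agent 2: simulates the interview using the persona from Agent 1."""
    assert mode in ("roleplay", "critique", "mixed")
    return (
        f"{SENIOR_UX_SYSTEM}\n\nMODE: {mode}\n\n"
        f"--- PERSONA ---\n{persona}\n\n--- QUESTION ---\n{question}"
    )

def run_pipeline(raw_material: str, question: str, call_model, mode: str = "roleplay") -> str:
    """Chain the two agents: persona creation, then interview simulation."""
    persona = call_model(build_researcher_prompt(raw_material))
    return call_model(build_ux_prompt(persona, question, mode))
```

Here `call_model` would wrap whatever model access you use (CLI subprocess, SDK call, etc.), which keeps the pipeline logic independent of the provider.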
2.1. The synthetic personas: João and Susana
For the test to be effective, we selected two profiles that personify the different journeys of users:

After configuring the Research Agent (to create personas from the data) and the Senior UX Agent (to simulate the conversation with them), I ran the roleplay-mode command in the terminal and the usability test with AI came to life. It was in this synthetic dialogue that we saw the potential of synthetic personas to surface points for improvement in the app's flow!
2.2. What did we learn about using synthetic personas?
At the end of the process, although we gained important insights that led to a much more mature proposed flow, the biggest lessons were about how to work with AI in design.
2.2.1. The persona is a reflection of the data used
I learned that a synthetic persona is a "time capsule". If the source material (transcriptions, business information, research, etc.) is old or incomplete, the persona will struggle to evaluate recent changes, since real data about them simply does not exist. In other words, there is a good chance the results will not reflect the current behavior of users or the business's objectives.
Because of this, it is important to emphasize that AI does not replace the need for continuous field research; it merely enhances the analysis of what has already been collected. Therefore, to use synthetic personas, it helps to have a continuous discovery habit embedded in your team, so your results stay accurate.
2.2.2 Accuracy depends on the prompt structure
The test revealed that AI tends to ramble if the questions are generic. To extract value, it is necessary to direct the persona with explicit references.

Example of one of the questions (with directions and references) sent for the simulation.
For example, instead of asking "What do you think of the app's checkout?", we asked, "Based on your pain point around lack of fiscal transparency, what is your opinion of this grouping of discounts at the final stage of the page (citing the folder/document where the page is located)?". Structured, conditioned questions generate rich, detailed responses; loose questions generate confusing answers.
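One way to make this discipline repeatable is a small question-builder that forces every question to carry a pain point and a concrete screen reference. This is a sketch under assumptions: the template wording, the field names, and the example path are all hypothetical, not the exact format used in the experiment.

```python
# Illustrative helper: every question must be anchored in a documented
# pain point and point at a concrete artifact, so the persona cannot
# answer in the abstract. Field names and the path below are hypothetical.

def structured_question(pain_point: str, topic: str, screen_ref: str) -> str:
    """Compose a conditioned question with an explicit artifact reference."""
    return (
        f"Based on your pain point around {pain_point}, "
        f"what is your opinion of {topic}? "
        f"(reference: {screen_ref})"
    )

q = structured_question(
    pain_point="lack of fiscal transparency",
    topic="this grouping of discounts at the final checkout stage",
    screen_ref="prototypes/checkout/discount-summary",  # hypothetical path
)
```

A template like this also makes it easy to batch many conditioned questions for the roleplay, instead of improvising loose ones in the terminal.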
2.2.3 Roleplay as a way to gather quick critiques
The greatest advantage was the agility to conduct synthetic A/B tests. We were able to quickly contrast the reactions of two opposing profiles (the enthusiast vs. the skeptic). This allowed us to identify logical flaws and usability improvements in minutes, something that would take days in a traditional recruitment cycle.
Moreover, in a future iteration, automating the entire process and making the prompts more granular would make surfacing improvements even more agile and better supported by evidence!
2.2.4 The synthetic analysis as a support point
Using the agent in "mixed mode" (roleplay + UX reading) allowed the AI not only to simulate the user but also to provide technical feedback on the simulation itself. This helped map data gaps: the AI would notify us when it did not have enough evidence in the repository to respond confidently, or would cite directly which data source it had used.
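A mixed-mode answer like this is more useful when it comes back in a fixed shape that keeps evidence separate from inference. The sketch below shows one such shape; the field names are illustrative assumptions, not the structure the agent actually returned.

```python
# Illustrative structure for a mixed-mode response: the roleplay answer,
# the technical critique, the cited sources, and any flagged data gaps
# are kept in separate fields so evidence never blends into inference.

from dataclasses import dataclass, field

@dataclass
class MixedModeResponse:
    persona_answer: str                  # roleplay: what the persona says
    ux_critique: str                     # technical reading of the simulation
    evidence_sources: list = field(default_factory=list)  # cited data files
    data_gaps: list = field(default_factory=list)          # missing evidence

    def is_grounded(self) -> bool:
        """Treat a response as evidence-backed only if it cites sources."""
        return bool(self.evidence_sources)
```

With a flag like `is_grounded()`, ungrounded answers can be routed straight into the "needs real research" pile rather than into the findings.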
This opened doors to exploring tests in a multi-agent structure: dividing the testing objectives across more than one agent (for example, one to analyze previous recordings and capture moments of doubt and emotion; another to analyze the screens and raise design critiques; and so on).

Example of the analysis that the “Senior UX” agent brought after a simulation.
2.2.5 Accelerated learning curve about the business
An unexpected benefit was how synthetic personas helped me learn about the client's business faster. I was able to clarify many questions about the business and its users by asking the persona directly, since it had been fed all the product's prior materials.
Later, in a synchronous meeting, I confirmed with the team that the points the persona raised were consistent with the information the project team had.
I saw this as an opportunity to resolve doubts that previously would have required scheduled meetings with project colleagues, dependent on each person's agenda. Now, I can "ask" the persona first, especially when time is tight.
Of course, always respecting the limits of AI: I ask about subjects where we have data, and I always confirm the crucial points with the team before taking action.
3. AI for refining, humans for validating

A fundamental learning from this onboarding was the demystification of the role of AI. The use of synthetic personas is a refinement tool. It allows the designer to reach validation with humans with a much more mature prototype, having already "cleaned" obvious logical and flow errors.
In other words, it does not eliminate the need for validation with real users, but instead optimizes the product before real testing, so that the validation focus is on crucial points for the business.
4. The Boundaries of AI
Although the use of synthetic personas accelerated our design cycle, methodological maturity requires recognizing where the technology still faces barriers (and what we can and cannot iterate on). During the process, we mapped limitations that define the role of AI as a refinement tool, not a replacement:
4.1 The dependence on data
A synthetic persona is only as deep as the data that feeds it. If there are gaps in the original materials, or if they are outdated (as was our case), the AI will not be able to predict accurate reactions. For example, in our simulation the synthetic user frequently focused on a problem in its journey that has since been resolved; the tests with real users predated that improvement. In this scenario, we understood that the persona reflects the past in trying to anticipate the future, so constantly refreshing it with real data is important.
4.2 The "surprise" and emotional factor
The AI focuses on logic and pattern detection, but still fails to replicate human unpredictability. Complex emotions, such as hesitation when seeing a new button/screen, are aspects that the simulation could not capture. As pointed out by the NN Group study on the topic: human behavior is complex and depends on context, and synthetic users cannot capture this complexity.
4.3 Qualitative vs. quantitative accuracy
During the tests, we noticed that the analysis leans toward the qualitative. While AI is excellent at pointing out why a flow is confusing, measuring task time in a synthetic environment still lacks validation beyond what the AI itself suggests. The simulation answers the "what" and the "why", but the "how much" still belongs to real usability tests!
4.4 The risk of hallucination
Without a rigorous prompt structure (and, even more so, without a multi-agent structure), AI can "ramble" or try to fill information gaps with generic assumptions about the data. Organizing the synthetic persona's responses, followed by the agent's UX analysis, was crucial to separate potential evidence from inference, keeping the results "clean" of answers generated without context or reliable data.
Finally, recognizing these limitations does not diminish the value of the methodology. On the contrary, it gives the designer the insight needed to know exactly when to trust the simulation, when to iterate on the method, and when it is time to go out and listen to the real user.
Leading with awareness
During my onboarding, I learned that Taqtile has been living with technological transformations since its inception. Experiencing these changes, I understood that technology changes, but it is people who lead advancements.
In the case of AI, it’s no different! As this experiment showed, synthetic personas can advance weeks of learning in just a few days of simulation. However, it is not the technology itself, but how we use it that determines the success or failure of any project… or even an experiment.
The use of organized agents for each stage gave us the agility to fail fast and correct before we invested time in real recruitment and interviews.
Interested in learning more? Here at Taqtile, we face daily challenges of implementing Artificial Intelligence in large corporations. Follow our page on LinkedIn, where we are always sharing new learnings, techniques, studies, and strategies involving technology!