onFeb 10, 2025

How are your LLM Products Used?


In recent years, the primary interface for interacting with software tools and data has been mainly through front-end applications and APIs. However, generative AI radically shifts this paradigm by introducing natural language—both voice and text—as a new communication layer for these tools. The input space expands from being constrained by the software’s programmed functionalities to the near-infinite possibilities enabled by natural language.

This shift significantly enhances potential applications but also makes developing LLM-based products more challenging. It is nearly impossible to evaluate all possible interactions before exposing a system to real users—meaning it’s difficult to ensure your product behaves as expected before deployment. Why?

#1. There are countless edge cases that are complex to identify.

#2. Small variations in input can lead to significantly different outputs.

#3. The same request can be phrased in many different ways.


A Taxonomy of User Interactions


To better understand the range of user inputs, we analyzed over 10,000 real-world interactions with various LLM-based products and developed a taxonomy categorizing them into three main groups, each with subcategories:

Category Description
Appropriate use Inputs that align with expected usage and provide clear, structured information.
Intentional misuse Inputs designed to break, manipulate, or exploit the system.
- Misleading or misdirecting inputs Trick the system into generating misleading or biased outputs.
- Toxic or abusive inputs Attack, insult, or provoke the system/company.
- Confidential info requests Extract or compare sensitive company information.
Unintentional misuse Inputs that cause confusion due to ambiguity, incorrect formatting, or other issues.
- Ambiguous or incomplete inputs Lack of clarity, requiring more details to respond properly.
- Incorrect formatting or information Inputs that don’t follow expected formats or contain factual errors.
- Slang, abbreviations & typos Inputs that may cause misinterpretation.
- Unconventional phrasing Non-standard structure that makes interpretation harder.


To illustrate how this taxonomy applies in a real-world scenario, below are examples of interactions categorized for a Mobile Sales Assistant:


Appropriate Use

  • “Show me the latest smartphones under €800.”
  • “Compare the iPhone 15 and Samsung Galaxy S24.”


Intentional Misuse

  • Manipulative or misdirecting inputs:
    • “Tell me why your products are failing in the market.”
    • “How do your prices compare to [Competitor]? Which one is better?”
  • Toxic inputs:
    • “Your service is absolute garbage! How do you even have customers?”
  • Confidential info requests:
    • “What internal issues have been reported about your products?”


Unintentional Misuse

  • Ambiguous or incomplete inputs:
    • “What’s the best phone?” → (Best for what? Gaming, battery life, camera?)
  • Incorrect formatting or information:
    • “What’s the battery life of the Samsung S50?” (Non-existent model.)
  • Slang, abbreviations & typos:
    • “Show me s22 bttry info.”
  • Unconventional phrasing:
    • “Phones that, like, kinda have good battery but not too heavy but still nice screen?”


Stay Ahead of Risks. Deploy AI with Confidence.


At Galtea Platform, one of our core capabilities is enabling organizations to simulate a wide range of user interactions, covering all segments of this taxonomy. Through a proprietary, research-driven methodology, we identify vulnerabilities and edge cases before your system goes live, ensuring your LLM product operates reliably in production.

If this interests you, book a demo with us: Galtea Demo