Pulze.ai Playground

Re-tooling the experience of Pulze.ai's Playground to help users discover which LLMs work best for them.


01. Overview

Find the right LLM for you

background
I was contracted to improve the user experience of Pulze.ai’s Playground, a web-based, conversational AI interface that allows visitors to test, compare, and utilize the capabilities of top large language models (LLMs) in their apps via a single API. It also suggests which LLMs are best suited to the user’s needs. I led the redesign of this tool through two iterations.
problem
Users have struggled with understanding the value of the Playground, causing low engagement and stickiness.
objectives
Redesign the user experience to help boost intuitiveness and foster engagement.
results
2x increase in average session duration.
Over 300% increase in app creation.
tools
Figma & Zoom
project type
Redesign
role
UX Designer & Researcher
duration
3 Weeks

02. Discovery

Discovering the problem

Users struggled to understand the value of the Playground, as the tool was regularly perceived as a 'ChatGPT clone'. This led us to believe that the tool wasn't very intuitive and lacked stickiness, negatively affecting user engagement on both the Playground and the platform as a whole.
Low user engagement
Interface lacks intuitiveness
Disassociation from the broader platform
User pain points.

Understanding our users

I met daily with the CEO to communicate my research findings and validate any assumptions that I had about the redesign. It was mentioned that one of the major characteristics of the platform is that it should be easy enough for “anyone” to use. This referred to people with very little knowledge of the detailed workings of LLMs, such as executives, one of the company's main target audiences. With this in mind, I was able to expand my feedback circle to family, friends and former colleagues and gain a better starting point with regard to intuitiveness.
executives
Possess tremendous buying power
View and report benchmarking
Low knowledge of LLMs
software engineers
Implement the Pulze API in customer products
View and report benchmarking
Deep knowledge of LLMs
How might we help users easily find the best LLM for their needs and business?

Creating a plan

Understanding the company's urgency to make improvements and share them for investment purposes, I proposed a phased approach that would allow us to quickly turn around improvements, with limited user research, while still working towards a more holistic and feedback-based redesign.
In Phase 1, we would quickly improve most of the heuristic issues I identified, allowing users to gain a better understanding of the Playground's concept and the value it provides.
For Phase 2, we'd take the time to perform in-depth testing and create a fully connected, user-centered design that would be a more scalable solution for the future of the product.
phase 1
Improve standard UX issues that were low-hanging fruit.
Primarily utilize existing UI to minimize development resources.
phase 2
Perform in-depth user testing.
Rethink workflows about how to access and maximize the output of the Playground.
Update platform with elements of a new design library.

03. Research

Competitive & comparative analysis

I began my research by looking into some of the competitors in the market like Hugging Face, Perplexity and Anyscale, as well as similar LLM products in the same space like GPT-4, Claude and Cohere.
By evaluating these relatively mature and successful platforms, I was able to glean insights on industry-approved elements such as layout, workflows and terminology that would be expected by the experienced users we were targeting. From there, I had to digest these elements and simplify them for the opposite end of our user spectrum.
Competing/comparable products that I looked into.

Heuristic analysis

Since this was a redesign project, I conducted a heuristic analysis of the existing Playground. I then discussed the results with several internal stakeholders to confirm my findings and initial assumptions.
elements of note
Lacks instruction and visual hierarchy.
No conversion to the app creation process.
Difficult to compare prompt results.
Inability to share configurations.
Parameters are hidden by default.
A lot of wasted space.
Initial design that I was tasked with improving.

04. Design

Phase 1 redesign

Understanding the company's urgency to make improvements to the Playground, I proposed a phased approach that would allow us to quickly turn around improvements, with limited user research, while still working towards a more holistic and feedback-based redesign:
grouped input fields
Using Gestalt principles, I grouped all input and output UI elements together, allowing users to make the visual connection between interactions.
sharing & converting results
Adding options to 'Share' and 'Create App' allows the Playground results to become actionable.
chat / compare interface
Using two interfaces for chat and comparison allows each target user base to draw the most accurate assessment of all LLMs presented.
expert recommendations
In order to minimize clicks and allow users with little technical knowledge to easily make comparisons, we chose to show the results of the top 3 recommended LLMs all at once, highlighting the “Top Choice”.

Room for improvement

The MVP redesign was launched in Sept 2023.
I was able to reconnect with 8 of the 12 initial users that I spoke with to run usability tests on the Phase 1 redesign. There were a few additional points of feedback, so we knew there was room for improvement. Users mentioned that the interface was still a little difficult to get started with and that scrolling to see all of the responses was a little tedious.

100%

12 out of 12 users preferred the Phase 1 design over the original.

05. Final designs

Updating the style guide

The team also wanted to freshen up the look of the entire company brand with a dark theme that their engineer users would consider more attractive and up to date. I began by constructing a new style guide and components to test out in the Phase 2 designs of the Playground.

Phase 2 redesign

With the results from user testing, I wanted to apply some of my new learnings to the next iteration of the Playground. Here are some of the major updates that I made for this final iteration:
new start screen
To really give users an intuitive starting point, the new start screen shares a layout similar to ChatGPT, an interface most users were already highly familiar with. Jakob's Law states that users prefer interfaces to work similarly to interfaces that they're already familiar with.
creating space
Before, the primary navigation wasted ~20% of screen real estate. By restructuring the layout of the dashboard and moving the primary navigation to the top, I was able to recapture that space and let the design breathe a bit more.
interacting with results
In testing the Phase 1 designs, I realized most users already had some idea of which LLMs they wanted to test and compare. Showing only the response of the top recommended LLM lets users focus on the results without feeling overwhelmed.
a different approach
Testing the Phase 1 designs revealed that some users thought the look and feel of the "Top Choice" made them feel like they were being sold something. This created feelings of discomfort and an initial distrust in continuing to use the Playground. By adjusting this approach a bit, I was still able to highlight and promote the top recommendation in a more informative manner.
comparing llms
Comparing LLMs is now made easier by eliminating other UI distractions. Recommended options are highlighted to help users easily identify which LLMs are best for them.

06. Impact

What changed?

With the release of the redesign, we saw a big bump in user engagement and the number of apps created on the platform. Average session duration increased from less than 5 minutes to about 10 minutes, while app creation increased by over 300%.
With these metrics, I would say the redesign was successful in addressing the problems at hand. I would've loved to test with more users to optimize the Playground and platform even further.

+300%

Increase in conversion rate to app creation from the Playground to the main platform.

2x

Increase in average session duration on the Playground.

What I learned

the importance of accessibility design
Designing for this product gave me a peek into the world of designing for accessibility, something I hadn't had much experience with before. The impact of accessibility in design is tremendous and it will be top of mind for me at the beginning of every project that I work on from now on. It's always great to learn and grow as a designer.

If I had more time

If I had the opportunity, I would have continued tracking metrics like click-through rate and session duration on the final designs to optimize their effectiveness on a larger scale.