UX Research Project

Evaluating an AI Merchandising Block Across Desktop and Mobile

Timeline: March 2026
Team: Solo Project
Role: Lead UX Researcher

Overview

This study sits at the intersection of AI integration and product UX: AutoNation was evaluating whether an AI-generated content module belonged on the Vehicle Detail Page at all, and if so, in what form. The module in question synthesizes vehicle highlights, feature summaries, and condition context, replacing what would otherwise be manual merchandising copy with AI-authored content at scale.

Four design variants were developed, each representing a different hypothesis about how that AI content should be scoped, positioned, and labeled. Using an unmoderated concept test on UserTesting with 50 participants split across desktop and mobile, I collected quantitative preference data across three decision dimensions and think-aloud qualitative evidence to understand the conditions under which users accept, engage with, or reject AI-generated content in a high-stakes purchase context.

50 Participants · 4 Design Variants · 6 Qualitative Findings

Research Methods

Unmoderated Remote Testing · Concept / Preference Test · Think-Aloud Protocol · Thematic Analysis

Tools Used

UserTesting.com · Figma · Excel

Domains

Automotive Retail · AI-Generated Content · Mobile vs. Desktop UX · Information Architecture

Background & Context

The Vehicle Detail Page is where purchase decisions are made or abandoned. For used-car shoppers in particular, it carries a heavy informational load: it must establish trust, surface condition and authenticity signals, and give a shopper enough confidence in the vehicle to move forward, all in a medium where they cannot physically inspect the car.

AutoNation is in the process of integrating AI-generated content into that experience. The central question this study was designed to answer is not just a layout question but an integration question: can AI-authored content earn a place on a page where trust is the primary currency? The hypothesis was that a well-designed AI merchandising block, one scoped and positioned correctly, could serve as a credible shortcut for shoppers overwhelmed by raw spec sheets and scattered data points. The open question was what "correctly" actually means in practice.

Four candidate layouts were developed by the design team. They shared a common page skeleton but diverged meaningfully in how they organized and presented the AI-generated content: how prominently it appeared, how it was labeled, how much it dominated the page relative to structured vehicle data. Before committing to a direction, the team needed to understand not just which variant users preferred, but whether those preferences held across desktop and mobile contexts, and critically, what users actually thought about the fact that the content was AI-generated.

Research Objectives

The study was organized around three primary research questions about layout preference, each operationalized as a structured quantitative task, plus a substantive secondary thread on AI acceptance. That secondary thread, which probed user attitudes toward AI-generated content in a vehicle purchase context, was not an afterthought: the point of the whole evaluation was to determine whether this AI module could work, not just which version of it looked better.

1
Content quality: Which variant surfaces the most useful vehicle information from the shopper's perspective?
2
Information hierarchy: Which variant orders information in a way that aligns with how shoppers mentally approach a VDP?
3
Overall preference: All things considered, which variant do users prefer, and why?
4
AI acceptance: Do users engage with AI-generated content on a used-car VDP? What conditions increase or undermine their trust in it, and what does that tell us about how the module should be scoped?

Methodology

I ran two unmoderated tests on UserTesting.com in parallel, one targeting desktop users and one targeting mobile users, with identical tasks and stimuli. Recruiting two separate panels and running them concurrently let us isolate platform as a variable without introducing temporal confounds.

Participants were recruited via UserTesting's panel and asked to imagine they were actively shopping for a used vehicle online. They were presented with a Figma prototype containing all four layout variants side-by-side and instructed to think aloud as they explored the designs.

Study type: Unmoderated remote
Method: Concept / preference test
Stimulus: Figma prototype, 4 variants shown side-by-side
Vehicle shown: Used Mercedes-Benz
N (Desktop): 25 participants
N (Mobile): 25 participants

Task Structure

Tasks | Description | Type
1–2 | Orientation: opening Figma, learning zoom and pan navigation | Setup
3 | Free exploration: scroll through all 4 variants, think aloud | Qualitative
4–8 | Directed exploration: locate "Standout Features," "Why You'll Love This," or "Features" sections; comparative probes | Qualitative
9 | Which version is best for page content? (multiple choice) | Quantitative
10–11 | Follow-up probes on content and information ordering | Qualitative
12 | Which version has the best information order? (multiple choice) | Quantitative
13 | If feature summaries were missing, how would that change your evaluation? | Qualitative
14 | Overall, which version do you most prefer? (multiple choice) | Quantitative
15–16 | Rationale for preference choice; reaction to AI-generated content | Qualitative
Note on think-aloud data

The unmoderated format means participants varied considerably in how much they verbalized versus silently browsed. Qualitative themes are drawn from those who spoke substantively, a meaningful but not complete subset of the 50 participants. The quantitative responses are complete for all 50.

Full Research Readout

The complete annotated readout for this study, including bar charts, task-level data, and the full implications table, is available here: VDP AI Merchandising Features Block — Research Readout →

Prototype Variants

All four variants shared a common page skeleton (vehicle image, price, and CTA buttons at the top) but diverged in content depth, information hierarchy, and the style and framing of the AI-generated merchandising block.

Option A: Comprehensive / Long-form
Full "Standout Features" section near the top with a horizontal carousel; among the longest overall; detailed vehicle preparation info; marketing copy prominent throughout.

Option B: Condition-forward / Medium
Condition, mileage, and Carfax info prioritized near the top; condensed features section; "Why You'll Love This" present but secondary; medium length.

Option C: Expandable / Visual
Longest overall; expandable dropdown sections for feature categories; visual presentation of highlights; most comprehensive content coverage.

Option D: Succinct / Spec-forward (Overall Winner)
Shortest overall; vehicle specifications appear prominently near the top; bulleted feature list; minimal promotional copy; price breakdown prominent.

Quantitative Results

Task 9 — Best Version for Page Content

"Ultimately, which version is best when it comes to the content on the page?"

Option | Desktop (n=25) | Desktop % | Mobile (n=25) | Mobile %
Option A | 3 | 12% | 4 | 16%
Option B | 5 | 20% | 5 | 20%
Option C (desktop winner) | 10 | 40% | 6 | 24%
Option D (mobile winner) | 5 | 20% | 10 | 40%
About the same | 2 | 8% | 0 | 0%
Observation

Desktop users favored Option C for content richness (40%), reflecting a preference for comprehensive information on a larger screen. Mobile users favored Option D (40%), suggesting that on a smaller viewport brevity becomes a proxy for content quality: users are less tolerant of content they must scroll past to reach what they need.

Task 12 — Best Version for Information Order

"Ultimately, which version is best when it comes to how the information is ordered on the page?"

Option | Desktop (n=25) | Desktop % | Mobile (n=25) | Mobile %
Option A | 6 | 24% | 1 | 4%
Option B | 4 | 16% | 5 | 20%
Option C | 6 | 24% | 4 | 16%
Option D (winner, both platforms) | 8 | 32% | 15 | 60%
About the same | 1 | 4% | 0 | 0%
Observation

Option D's information order was preferred across both platforms, decisively so on mobile (60%). The spec-forward hierarchy resonated most strongly. Desktop preferences were more fragmented, with A and C each drawing roughly a quarter of votes, likely because participants found the additional content sections justifiable on a wider screen.

Task 14 — Overall Preference

"Overall, having explored all of the options, which version of the page do you most prefer?"

Option | Desktop (n=25) | Desktop % | Mobile (n=25) | Mobile % | Combined (n=50) | Combined %
Option A | 5 | 20% | 2 | 8% | 7 | 14%
Option B | 3 | 12% | 5 | 20% | 8 | 16%
Option C | 8 | 32% | 4 | 16% | 12 | 24%
Option D (winner, all) | 9 | 36% | 14 | 56% | 23 | 46%
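
The combined column is simple pooling of the two 25-person panels over the full sample of 50. As a sanity check, here is a minimal sketch of that arithmetic in TypeScript, using the Task 14 counts from the table above:

```typescript
// Pool desktop and mobile counts into combined preference shares.
// Counts are the Task 14 tallies from the table above.
const desktop: Record<string, number> = { A: 5, B: 3, C: 8, D: 9 };
const mobile: Record<string, number> = { A: 2, B: 5, C: 4, D: 14 };

const options = Object.keys(desktop);
const total = options.reduce((sum, o) => sum + desktop[o] + mobile[o], 0); // 50

for (const o of options) {
  const combined = desktop[o] + mobile[o];
  const pct = Math.round((combined / total) * 100);
  console.log(`Option ${o}: ${combined}/${total} (${pct}%)`);
}
```

Running it reproduces the combined column, e.g. Option D at 23/50, or 46%.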

Preference Summary Across All Three Tasks

Best Content: Split (C / D), platform-dependent
Best Order: Option D (46% combined · 60% mobile)
Overall Preference: Option D (46% combined · 56% mobile)

Qualitative Findings

The following themes emerged from think-aloud transcripts collected during free and directed exploration (Tasks 3–8) and open-ended follow-up probes (Tasks 10, 11, 13, 15, and 16). Themes are ordered by frequency and consistency of expression across participants. Finding 5 is worth reading as the through-line: it is the one that most directly addresses whether and how an AI module can earn user trust in this context.

Finding 1 — Vehicle-specific facts belong at the top

The most consistently expressed preference was for factual, vehicle-specific information (condition, mileage, price, key specs) to appear early in the page, before any promotional or AI-generated summary copy. Participants described a mental model that proceeds from objective facts toward subjective impressions, not the reverse. Options that led with promotional copy, particularly A and, in some readings, C, were described as "getting in the way," "like an ad," or requiring unnecessary scrolling before reaching needed information.

"We're talking about a used car, so the condition of the used car — information related to that — is always paramount and should be at the top. Then you can have your generalized information about the model."
Desktop participant, Option B preference

Finding 2 — Used-car context heightens scrutiny of condition information

Participants consistently framed their evaluation through the lens of used-car purchase risk. Condition status, mileage, Carfax/inspection records, and vehicle preparation details carried significantly more evaluative weight than they would on a new-car listing. Options that buried or omitted condition information, particularly D, lost credibility on this dimension, creating a notable tension: D was preferred for brevity and hierarchy, but some felt it was incomplete without condition detail. Condition information should be considered table stakes for the AI merchandising block.

"If it's not listing anything about the specific condition of the used car, I am not going to consider it. AI can and does hallucinate. For things like this — real, important information — just hire somebody. You need accuracy."
Mobile participant, Option B preference

Finding 3 — Enumerated features beat paragraph summaries

Bulleted or categorized feature lists (as in Option D's "Features" section) were widely preferred over the prose-heavy "Why You'll Love This" format in Options A and C. Participants described scanning feature lists rather than reading them; prose summaries were harder to skim and perceived as more marketing-oriented. The label "Why You'll Love This" was specifically flagged by several participants as too sales-forward. Some noted they treat copy written in that register the same way they treat advertising and skip it by default. "Highlights" or unlabeled feature grids were more neutral and trusted.

"When we see it laid out like [a promotional block], our brain might automatically discount it, throw it away as an advertisement and not even realize it's related to this specific car."
Desktop participant, Option B preference

Finding 4 — Brevity vs. comprehensiveness is a platform-mediated tension

Desktop users were more tolerant of comprehensive content and in some cases actively valued it. Option C's length was seen as justified by information density on desktop. Mobile users showed much lower tolerance for scrolling past content to find facts, which explains Option D's dominant mobile performance. This suggests a responsive or adaptive content strategy may be warranted: desktop can accommodate more depth in the AI merchandising block, while mobile benefits from aggressive content prioritization.

"Option D was just straight to the point. It wasn't overwhelming. I liked the bullet points, I liked the order. At this point, if I'm going to go in to test drive it, I don't need too much information."
Mobile participant, Option D preference
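
If the team pursues the adaptive direction this finding suggests, the gating logic can start as a simple breakpoint rule. The following is a minimal sketch with a hypothetical 768px cutoff and hypothetical depth labels; nothing here reflects an actual AutoNation implementation:

```typescript
// Hypothetical sketch of Finding 4's platform-adaptive content strategy:
// render the AI block at full depth on desktop, condensed on mobile.
type BlockDepth = "full" | "condensed";

const MOBILE_BREAKPOINT_PX = 768; // assumed cutoff, not a measured value

function aiBlockDepth(viewportWidthPx: number): BlockDepth {
  // Mobile: keep only the scannable bullet highlights (see Finding 3).
  // Desktop: add the expandable feature categories users tolerated there.
  return viewportWidthPx < MOBILE_BREAKPOINT_PX ? "condensed" : "full";
}

console.log(aiBlockDepth(390));  // "condensed" (typical phone width)
console.log(aiBlockDepth(1440)); // "full" (typical laptop width)
```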

Finding 5 — AI-generated content is conditionally accepted, with a clear scope boundary

When participants learned the page content was AI-generated, the majority responded neutrally or with mild acceptance. Importantly, this acceptance was not passive indifference: it was conditional on what the AI was understood to be doing. Participants were comfortable with AI doing aggregation and synthesis work, pulling together information that already exists into a readable summary. They drew a sharp line at factual claims. Condition status, specifications, and vehicle history were domains where several participants explicitly said AI should not be the source, citing the risk of hallucination and the stakes of the purchase.

This finding points toward a meaningful design constraint for the module: AI-generated copy works best when it is clearly doing descriptive and summary work, not when it is presented as a source of ground-truth vehicle facts. The discomfort with "Why You'll Love This" is related to this: it positions AI as making a personal recommendation, which feels like a different and less trusted register than summarizing what a vehicle has. The practical implication is that the AI module should be scoped to synthesis and description, with structured data handling all factual claims.

"I don't think it would really make a difference to me. When used ethically and in the right manner, AI really serves its purpose well — getting a lot of information in one place."
Desktop participant, Option D preference
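
One way to hold that scope boundary in the product itself is to enforce it at the data-contract level, so AI output can never masquerade as a factual field. Below is a minimal sketch with hypothetical type and field names, not AutoNation's actual schema:

```typescript
// Hypothetical VDP content contract enforcing the Finding 5 boundary:
// structured systems own factual claims; the AI block owns synthesis only.
interface VehicleFacts {
  // Sourced from inventory/inspection systems, never from AI inference.
  price: number;
  mileage: number;
  conditionSummary: string;      // e.g. Carfax / inspection status
  specs: Record<string, string>; // engine, drivetrain, trim, etc.
}

interface AiMerchandisingBlock {
  // AI-authored synthesis of information that already exists on the page.
  highlights: string[];          // scannable bullets, not prose (Finding 3)
  descriptiveSummary: string;    // descriptive register, not a recommendation
  disclosure: string;            // e.g. "Summary generated from listing data"
}

interface VehicleDetailPage {
  facts: VehicleFacts;           // rendered first (Finding 1)
  aiBlock: AiMerchandisingBlock; // rendered after the factual content
}
```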

Finding 6 — Price visibility confusion (adjacent issue)

Multiple participants across both conditions were confused by the relationship between the price displayed at the top of the page and a price breakdown shown lower on the page: two different figures appeared on the same VDP, one apparently reflecting a new-car MSRP and the other a used-car AutoNation price. This finding is outside the direct scope of the AI merchandising block study, but it surfaced consistently enough to warrant flagging as a separate issue for the VDP product team.

Implications & Considerations

The following are offered as design and strategy considerations based on the findings above. These are not formal recommendations, as prioritization should account for business context, technical constraints, and any divergent signals from other research streams.

Area | Consideration | Signal strength
Information hierarchy | Place vehicle-specific facts (condition, mileage, specs) before any AI-generated summary copy in the page scroll order. Promotional content should follow factual content, not precede it. | Strong
Content format | Enumerated, scannable feature lists outperformed prose summaries. Consider replacing or supplementing paragraph-format AI copy with structured bullet lists, especially for mobile. | Strong
Condition completeness | Any variant moving forward should include used-vehicle condition data (inspection status, Carfax, vehicle preparation) as a non-negotiable element. Its absence was a dealbreaker for several participants. | Strong
Content labeling | The "Why You'll Love This" label triggered ad-avoidance behavior in a meaningful subset of participants. Consider more neutral labeling ("Highlights," "Features," or category-specific labels). | Moderate
Platform adaptation | Desktop and mobile users showed meaningfully different preferences for content depth. A responsive content strategy, with more depth on desktop and tighter prioritization on mobile, may better serve both audiences than a single cross-platform layout. | Moderate
AI module scope | Users accepted AI for aggregation and synthesis but rejected it as a source of factual claims. The module should be scoped to descriptive and summary content. All factual vehicle data (condition, specs, history) should be sourced from structured data, not AI inference. This is both a design constraint and a trust requirement. | Strong
AI transparency | Participants who expressed concern about AI accuracy clustered around factual claims. Transparency about what the AI is doing (synthesizing, not verifying) and human review of output were suggested as mitigation strategies. Consider a lightweight disclosure pattern that frames AI's role accurately. | Moderate
Price clarity (separate issue) | Multiple participants were confused by two price figures appearing on the same VDP. Worth flagging as a separate issue outside the scope of this study. | Observed signal

Reflections

This study offered a relatively clean test of platform context as a moderating variable, and the results made that investment worthwhile. Running desktop and mobile as separate, concurrent tests was the right call; collapsing them into a single test would have obscured the most interesting finding, which is that the same variants perform very differently across screen contexts.

What Went Well

  • Structuring three separate quantitative preference questions (content, order, overall) gave us triangulation within the study. It showed that overall preference wasn't just being driven by content quality, but strongly by information hierarchy.
  • The think-aloud protocol yielded unusually clear qualitative evidence on the "Why You'll Love This" label aversion, something we might not have surfaced with a survey alone.
  • Running desktop and mobile panels concurrently eliminated temporal confounds while revealing platform-driven preference divergence.

What Was Hard / What I'd Do Differently

  • Option D being the overall winner while also receiving specific criticism for omitting condition detail is a nuanced finding that requires careful framing. "D won, but not as-is" is a harder message than a clean recommendation.
  • A follow-up moderated session with a small subset of participants would help pressure-test the hierarchy mental model finding, specifically whether the preference for facts-first reflects a universal pattern or is amplified by used-car anxiety.
  • The price confusion finding (two figures on the same VDP) deserves its own dedicated study; it surfaced too consistently to leave as a footnote.