Wearable Comfort Benchmarking
At Meta, I own the comfort benchmarking program for an AR/VR wearable device. For this multi-phase research program, I designed the research framework that measures whether the device meets user comfort requirements across physical fit/comfort, thermal comfort, and visual comfort, including eye strain and visually induced motion sickness (VIMS).
The information provided does not necessarily represent the views of Meta.
Problem Statement
Head-worn AR/VR devices present unique comfort challenges. Unlike phones or laptops, these products rest directly on the user's face and head for extended periods. Comfort failures are the primary reason users stop wearing a device. If it's too heavy, too hot, or slips during use, people take it off. For wearable products, comfort cannot be an afterthought. It needs to be measured, tracked, and validated at every stage of hardware development.
I was brought on as the hardware UX research lead to build the comfort benchmarking program from scratch: define what "comfortable enough" means, create the instruments to measure it, and run the studies that inform whether the product is ready to ship.
Research Approach
The benchmarking program is structured in two phases, each designed to isolate different comfort variables and build on the findings of the last.
Phase 1: Physical Comfort Baseline
The first phase focuses on physical comfort (fit, stability, weight distribution, and pressure points) using non-functional prototype hardware. By removing display, software, and thermal variables, these studies isolate the mechanical comfort of the device itself. This phase has run across multiple hardware revisions, creating a longitudinal dataset that tracks how design changes affect comfort over time.
Phase 2: Functional Comfort Validation
The second phase introduces functional devices with active displays, running software, and real thermal output. These studies measure the full comfort experience during realistic use scenarios. The key question: does comfort hold up when the device is actually running? Thermal output and visual strain introduce new variables that non-functional prototypes cannot capture.
Study Design
Each study follows a repeated-measures protocol with surveys administered at regular intervals during multi-hour wear sessions. The core survey instrument covers:
- Overall comfort and fit acceptability
- Discomfort severity and body location mapping
- Perceived weight and balance
- Stability and slip frequency
- Adjustment frequency
- Thermal comfort onset and severity
- Eye strain and VIMS
- Extrapolated wear duration and takeoff drivers
The survey instrument was designed to be consistent across study phases, enabling direct comparison between hardware revisions and between non-functional and functional conditions. Each phase adds new modules (thermal, visual, software experience) while preserving the core physical comfort battery.
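One way to picture the "stable core plus phase-specific modules" structure is the minimal sketch below. It is illustrative only: the item keys, module names, and phase labels are hypothetical stand-ins, not the actual survey items.

```python
# Hypothetical item keys for the core physical comfort battery.
CORE_BATTERY = [
    "overall_comfort",
    "discomfort_severity",
    "perceived_weight",
    "stability",
    "adjustment_frequency",
    "extrapolated_wear_duration",
]

# Hypothetical modules layered on top of the core in later phases.
PHASE_MODULES = {
    "phase_1": [],
    "phase_2": ["thermal_comfort", "eye_strain", "vims", "software_experience"],
}


def build_instrument(phase):
    """Return the ordered item list for a phase: core battery first, then modules."""
    return CORE_BATTERY + PHASE_MODULES[phase]


def comparable_items(phase_a, phase_b):
    """Items asked in both phases, i.e. the ones that can be compared directly."""
    items_b = set(build_instrument(phase_b))
    return [item for item in build_instrument(phase_a) if item in items_b]


shared = comparable_items("phase_1", "phase_2")
```

Because every phase starts from the same core list, the shared items between any two phases are always at least the full physical comfort battery.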
Participant Recruitment & Fitting
Wearable comfort is deeply personal. A device that fits one face shape well may be unusable on another. Each study recruits dozens of external participants screened for diverse anthropometrics. Before each session, participants go through a standardized fitting protocol to ensure they are wearing the correct size with properly adjusted components. This fitting process was developed iteratively across study phases and documented for vendor moderator training.
Data Collection & Analysis
Data is collected through Qualtrics surveys administered at regular intervals during multi-hour wear sessions, combined with in-depth exit interviews. This produces both quantitative comfort trajectories (how comfort changes over time) and qualitative insight into the specific experiences driving discomfort.
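As a rough illustration of what a comfort trajectory looks like in data terms, the sketch below averages ratings at each survey interval. The function name, the tuple layout, and the sample ratings are all assumptions made for the example; they are not study data or the actual analysis code.

```python
from collections import defaultdict
from statistics import mean


def comfort_trajectory(responses):
    """Average rating at each survey interval.

    responses: iterable of (participant_id, minutes_into_session, rating).
    Returns a list of (minutes, mean_rating) sorted by time.
    """
    by_time = defaultdict(list)
    for _participant, minutes, rating in responses:
        by_time[minutes].append(rating)
    return [(minutes, mean(ratings)) for minutes, ratings in sorted(by_time.items())]


# Made-up ratings on a hypothetical 1-7 comfort scale, for illustration only.
sample_responses = [
    ("P01", 0, 6), ("P02", 0, 5),
    ("P01", 60, 5), ("P02", 60, 4),
    ("P01", 120, 4), ("P02", 120, 2),
]
trajectory = comfort_trajectory(sample_responses)
```

A declining mean across intervals is the kind of pattern the exit interviews can then help explain.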
I also collaborated with a fellow researcher to vibe code a Python app that records device data over Wi-Fi in real time during sessions. This replaced an obstructive wired measurement setup and removed the extra cables that could have confounded the comfort signal.
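For readers curious what wireless logging of this kind can look like, here is a minimal sketch. It is not the app described above: it assumes the device streams newline-delimited JSON over a TCP connection, and the function name `record_stream` and the `temp_c` field are hypothetical. A local socket pair stands in for the device so the example runs without hardware.

```python
import csv
import io
import json
import socket
import time


def record_stream(conn, csv_file, fields, max_samples=None):
    """Read newline-delimited JSON samples from a socket and log them as CSV rows.

    Each row is stamped with the time it was received. Returns the sample count.
    """
    writer = csv.writer(csv_file)
    writer.writerow(["received_at", *fields])
    count = 0
    with conn.makefile("r", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank keep-alive lines
            sample = json.loads(line)
            writer.writerow([time.time(), *(sample.get(f) for f in fields)])
            count += 1
            if max_samples is not None and count >= max_samples:
                break
    return count


# Stand-in for the device: one end of a local socket pair sends two samples.
device_side, logger_side = socket.socketpair()
device_side.sendall(b'{"temp_c": 31.5}\n{"temp_c": 32.0}\n')
device_side.close()

log = io.StringIO(newline="")
n_samples = record_stream(logger_side, log, fields=["temp_c"])
logger_side.close()
rows = log.getvalue().splitlines()
```

In a real session, `conn` would come from something like `socket.create_connection((host, port))`, with the address and message format depending on the device.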
Analysis combines descriptive and inferential statistics, performance tracking against product requirements, and comparison across hardware revisions. I pull results into topline findings shared with cross-functional stakeholders within days of data collection, followed by detailed reports with design recommendations shortly after. This collaborative approach allows me to enable
AI-Augmented Research Workflows
I integrate AI tools like Claude and Gemini into my research workflows to maintain consistency and move faster while always remaining "in the loop" to validate output quality. I use AI to generate study scripts and training documentation for external research moderators, ensuring every moderator follows the same protocol across sessions and study phases. This is especially important when running studies with vendor partners who may be unfamiliar with the product or the fitting process. AI tools also help me draft survey instruments, synthesize qualitative data from exit interviews, and prepare stakeholder-facing reports more quickly.
Impact
The benchmarking program has directly influenced hardware design decisions across multiple product development cycles. Specific impacts include:
- Identified comfort regressions introduced by hardware design changes between prototype revisions, enabling the team to course-correct before finalizing design decisions.
- Defined product specifications for comfort, fit, stability, and thermal acceptability. These are the quantitative thresholds that determine whether the device meets its requirements.
- Built the validated survey instrument now used as the standard measurement tool for comfort research across this product line, enabling consistent tracking over time.
- Authored vendor safety and research protocols for handling unreleased hardware with external participants, including a leak prevention framework adopted by the broader research team.
Vendor & Stakeholder Management
Each study involves coordinating with an external research vendor for participant recruitment, scheduling, and moderation. I manage the vendor relationship end-to-end: scoping and quoting, writing training materials, conducting on-site moderator training, running daily debriefs during data collection, and performing quality checks on incoming data. Internally, I present findings in cross-functional hardware and product reviews, where comfort data directly informs go/no-go decisions on design changes.
What I Learned
Building a benchmarking program from scratch taught me that the instrument design matters as much as the study design. Consistency across phases is what makes the data comparable: the same core questions, the same fitting protocol, the same acceptance criteria. That consistency is what makes the trends meaningful and reliable. Without it, each study would be a standalone snapshot rather than a longitudinal signal.
I also learned the value of bridging UXR and hardware engineering. Comfort research for wearables sits at the intersection of human perception and physical design. The most useful findings came from combining subjective user data with objective sensor measurements.