Why Human Guidance Matters in Collaborative Vibe Coding
Haoyu Hu, Raja Marjieh, Katherine M Collins, Chenyi Li, Thomas L. Griffiths, Ilia Sucholutsky, Nori Jacoby
Human-guided AI pairs completed 23% more functional requirements and produced code rated higher on maintainability. The AI-only condition generated more lines of code but lower feature completeness—a volume-versus-value trap.
AI code generation through natural language—"vibe coding"—is popular, but its impact on productivity and the role of humans in the process remain unclear.
Method: Researchers ran a controlled experiment in which 60 participants built a web app over 90-minute sessions. One condition used AI alone (GPT-4); the other paired human guidance with the AI. The human-guided condition produced significantly more functional features and higher-quality code. The mechanism: humans provided strategic direction and context that kept the AI from pursuing technically valid but functionally irrelevant solutions.
Caveats: Sessions lasted only 90 minutes and used predefined tasks. Longer projects with evolving requirements may show different patterns.
Reflections: At what project complexity does human guidance become essential versus optional? · Can AI be trained to recognize when it needs strategic human input? · How does the guidance pattern change for expert versus novice developers?