Sony AI Unveils FHIBE, a Global Benchmark for Fair and Ethical AI

Key Points

  • Sony AI releases FHIBE, the first consent‑based, globally diverse image benchmark for bias testing.
  • Dataset includes nearly 2,000 volunteers from over 80 countries with detailed demographic annotations.
  • Tests reveal lower model accuracy for subjects using "she/her/hers" pronouns and heightened stereotype generation.
  • Biases identified include racial, skin‑tone, and pronoun‑related disparities in toxic response rates.
  • FHIBE provides granular diagnostics to help pinpoint and mitigate sources of AI bias.
  • The dataset is publicly available and documented in a Nature‑published research paper.
  • Sony AI plans continuous updates to expand FHIBE’s scope and utility.

Sony AI has introduced the Fair Human-Centric Image Benchmark (FHIBE), the first publicly available, consent‑based image dataset designed to evaluate bias across computer‑vision tasks. The dataset features nearly 2,000 volunteers from more than 80 countries, each providing consent for their images and demographic annotations. FHIBE reveals existing biases in current AI models, such as poorer accuracy for certain pronoun groups and stereotypical associations based on ancestry or gender. Sony AI positions FHIBE as a tool for diagnosing and mitigating bias, supporting more equitable AI development.

Introduction to FHIBE

Sony AI announced the release of the Fair Human-Centric Image Benchmark (FHIBE), describing it as the first publicly available, globally diverse, consent‑based human image dataset for assessing bias in computer‑vision models. FHIBE includes images of nearly 2,000 volunteers drawn from over 80 countries, all of whom have explicitly consented to the use of their likenesses. Participants retain the right to remove their images at any time.

Dataset Composition and Annotations

Each image in FHIBE carries detailed annotations covering demographic and physical characteristics, environmental factors, and camera settings. This comprehensive labeling enables researchers to examine how various attributes influence model performance. By collecting data with informed consent, Sony AI avoids the common practice of scraping large, unverified web collections.
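
As a rough illustration of how annotations like these can be put to work, the sketch below groups per-image model results by a single attribute and reports accuracy per group. The field names, record structure, and values are illustrative assumptions, not FHIBE's published schema.

```python
# Minimal sketch: slice per-image model results by one annotation attribute.
# Field names and records are hypothetical, not FHIBE's actual schema.
from collections import defaultdict

# Hypothetical per-image records: annotations plus whether the model's
# prediction for that image was correct.
annotations = [
    {"image_id": "img_001", "pronouns": "she/her/hers", "skin_tone": "dark",
     "lighting": "outdoor", "correct": True},
    {"image_id": "img_002", "pronouns": "he/him/his", "skin_tone": "medium",
     "lighting": "indoor", "correct": False},
    {"image_id": "img_003", "pronouns": "she/her/hers", "skin_tone": "light",
     "lighting": "indoor", "correct": True},
]

def accuracy_by(records, attribute):
    """Return per-group accuracy for a single annotation attribute."""
    totals, hits = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec[attribute]] += 1
        hits[rec[attribute]] += int(rec["correct"])
    return {group: hits[group] / totals[group] for group in totals}

print(accuracy_by(annotations, "pronouns"))   # e.g. {'she/her/hers': 1.0, 'he/him/his': 0.0}
print(accuracy_by(annotations, "skin_tone"))
```

Because every image carries the same set of labels, the same breakdown works for skin tone, lighting, camera settings, or any other annotated attribute.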

Revealing Existing Model Biases

Testing contemporary AI models against FHIBE confirmed several previously documented biases. For instance, models were less accurate on subjects who use "she/her/hers" pronouns, a disparity the study linked to greater variability in those participants' hairstyles. When asked neutral questions about a person's occupation, models also produced stereotypical descriptions, labeling individuals as sex workers, drug dealers, or thieves. Toxic responses occurred at higher rates for individuals of African or Asian ancestry, those with darker skin tones, and those who use "he/him/his" pronouns.

Diagnostic Capabilities

Beyond exposing bias, FHIBE offers granular diagnostic insights. By correlating performance drops with specific image attributes, developers can pinpoint the root causes of unfair outcomes and adjust training data or model architectures accordingly. Sony AI emphasizes that FHIBE can be used iteratively, with updates planned to expand its coverage and maintain relevance.
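
A hedged sketch of that diagnostic idea follows: rank annotation attributes by the gap between their best- and worst-performing groups, so that the largest gaps point to where training data or model choices deserve a closer look. Again, the attribute names and records are hypothetical rather than taken from FHIBE.

```python
# Hypothetical diagnostic sketch: rank annotation attributes by the accuracy
# gap between their best- and worst-performing groups.
from collections import defaultdict

def accuracy_gap(records, attribute):
    """Spread between the best and worst per-group accuracy for one attribute."""
    totals, hits = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec[attribute]] += 1
        hits[rec[attribute]] += int(rec["correct"])
    accuracies = [hits[group] / totals[group] for group in totals]
    return max(accuracies) - min(accuracies)

def rank_by_gap(records, attributes):
    """Attributes with the largest gaps are the first places to investigate."""
    return sorted(attributes, key=lambda attr: accuracy_gap(records, attr), reverse=True)

# Tiny hypothetical results table: one record per evaluated image.
results = [
    {"pronouns": "she/her/hers", "skin_tone": "dark", "lighting": "outdoor", "correct": False},
    {"pronouns": "she/her/hers", "skin_tone": "light", "lighting": "outdoor", "correct": True},
    {"pronouns": "he/him/his", "skin_tone": "medium", "lighting": "indoor", "correct": True},
    {"pronouns": "he/him/his", "skin_tone": "dark", "lighting": "outdoor", "correct": True},
]

print(rank_by_gap(results, ["pronouns", "skin_tone", "lighting"]))
```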

Availability and Future Plans

Sony AI has made FHIBE publicly accessible, encouraging researchers, developers, and policymakers to leverage the dataset for fairness assessments. A scholarly paper detailing the research was published in the journal Nature, underscoring the academic significance of the work. Sony AI intends to continue refining FHIBE, adding new participants and annotations to broaden its applicability across diverse AI systems.

Tags: Sony AI, FHIBE, ethical AI, bias dataset, computer vision, fairness, AI ethics, global diversity, consent-based data, AI bias mitigation
Generated with News Factory - Source: Engadget
