Rubric version v1.0 · effective from 2026-05-20 · last updated 2026-05-21
HireAIScore is an evaluator-driven directory. There are no user reviews and no crowd ratings. Every score in this directory is assigned by a named human reviewer working against the public rubric below, on the basis of evidence that the reviewer can show.
This page is the rubric, end to end. It does not move without a methodology version bump, and every change is recorded in the version history at the bottom of this page.
Why we built this rubric
By the end of 2026, anyone deploying an AI hiring tool in the EU, Colorado, Illinois, or New York City is on the hook for a documentation burden that the typical HR team does not have the expertise to satisfy alone. The deployer's obligations — fundamental rights impact assessments, post-market monitoring, human oversight design, bias audits — depend on what the vendor actually provides. A glossy product page is not enough; an Annex IV-aligned technical pack is.
This rubric scores vendors on the things deployers genuinely need from them to meet those obligations. It deliberately weights the categories that the law treats as non-optional over the categories that are nice-to-have.
The seven categories
Each vendor is scored 0–100 in each of the seven categories below. The category weights sum to 1.0; the total score is the weighted sum.
| # | Category | Weight | What we look for |
|---|---|---|---|
| 1 | Article 11 Technical Documentation | 0.20 | Does the vendor publish or provide on request Annex IV-aligned technical documentation? Completeness against the 9 sections of Annex IV. |
| 2 | Bias Audit Transparency | 0.18 | Has the vendor published a bias audit per NYC LL 144? How recent? Methodology disclosed? Results published vs gated? |
| 3 | FRIA Support | 0.15 | Does the vendor provide deployers the data needed to complete a Fundamental Rights Impact Assessment under Article 27? Templates, system cards, intended use definitions. |
| 4 | Data Governance Disclosure | 0.15 | Training data sources, validation methodology, performance metrics across demographic subgroups, data retention policies. |
| 5 | Human Oversight Design | 0.12 | How does the system support Article 14 human oversight? Override capabilities, confidence thresholds, decision auditability. |
| 6 | Post-Market Monitoring | 0.12 | Article 72 obligations: incident reporting channels, performance drift monitoring, model update disclosure. |
| 7 | Customer Documentation | 0.08 | Instructions for use (Article 13), system limitations clearly stated, intended use boundaries documented. |
What evidence we look for in each category
Article 11 Technical Documentation. Public Annex IV-aligned documentation, system cards, technical whitepapers, or a documented availability-on-request process with a named scope and contact path. A page that says "documentation available on request" without specifying what is in the pack scores below a vendor whose pack table-of-contents is public.
Bias Audit Transparency. Published bias audit reports (full or summary), audit methodology, audit cadence statement, third-party auditor identity. NYC LL 144 audits are the baseline; we credit vendors who run audits in jurisdictions that do not require them.
FRIA Support. FRIA templates, deployer guidance, system cards naming intended use, customer-facing data sheets. The Article 27 obligation falls on the deployer, but a vendor's documentation is what makes a complete FRIA possible. We grade on whether a deployer using only the vendor's public materials could complete each section of the FRIA.
Data Governance Disclosure. Training-data source descriptions, subgroup performance metrics, retention and deletion policy, data-rights process. We treat the absence of subgroup metrics — even when overall accuracy is disclosed — as a substantial gap.
Human Oversight Design. Override and stop controls, confidence threshold configuration, audit log surfaces, escalation paths in product. We credit vendors who design oversight as a first-class workflow, not as a settings toggle.
Post-Market Monitoring. Public incident-reporting channel, model-update changelog, drift monitoring evidence, customer notification protocol. A vendor with no public way to report concerns scores poorly here regardless of what is happening internally.
Customer Documentation. Instructions for use, limitations statement, intended-use boundaries, customer support documentation. Sales decks do not count; only what is given to a deployer post-purchase.
How we score
Each category gets an integer from 0 to 100. The total score is the weighted sum:
total = Σ (category_score × category_weight)
We round the total to the nearest integer for display, but the unrounded total drives the letter grade. There is no curve.
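As a worked illustration of the weighted sum, here is a minimal Python sketch. The weights are copied from the category table; the dictionary keys, the function name `total_score`, and the example vendor scores are hypothetical and invented for this example only.

```python
# Category weights from the rubric table. The key names are illustrative.
WEIGHTS = {
    "technical_documentation": 0.20,
    "bias_audit_transparency": 0.18,
    "fria_support": 0.15,
    "data_governance": 0.15,
    "human_oversight": 0.12,
    "post_market_monitoring": 0.12,
    "customer_documentation": 0.08,
}


def total_score(scores):
    """Return the unrounded weighted sum of the seven category scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the seven categories")
    for name, value in scores.items():
        if not isinstance(value, int) or not 0 <= value <= 100:
            raise ValueError(f"{name} must be an integer from 0 to 100")
    return sum(scores[name] * WEIGHTS[name] for name in WEIGHTS)


# Hypothetical vendor, not a real profile.
example_scores = {
    "technical_documentation": 80,
    "bias_audit_transparency": 70,
    "fria_support": 60,
    "data_governance": 75,
    "human_oversight": 90,
    "post_market_monitoring": 50,
    "customer_documentation": 85,
}

unrounded = total_score(example_scores)  # approximately 72.45
displayed = round(unrounded)             # integer shown on the profile
```

In this sketch the unrounded value is kept separately from the displayed integer, since the text above says the unrounded total is what drives the letter grade.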
| Score range | Grade | Label |
|---|---|---|
| 90–100 | A | Exemplary |
| 80–89 | B | Strong |
| 70–79 | C | Adequate |
| 60–69 | D | Concerning |
| 0–59 | F | Substantial gaps |
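The grade table can be sketched as a lookup on the unrounded total. This minimal Python example assumes each band starts at its lower number and runs up to the start of the next band, so an unrounded 89.6 grades B even though it would display as 90. That boundary reading is an assumption made for this illustration, not a rule stated in the rubric; the names `GRADE_BANDS` and `grade_for` are hypothetical.

```python
# (lower bound, grade, label), highest band first.
# Assumption: a band covers [lower, next band's lower); the top band includes 100.
GRADE_BANDS = [
    (90, "A", "Exemplary"),
    (80, "B", "Strong"),
    (70, "C", "Adequate"),
    (60, "D", "Concerning"),
    (0, "F", "Substantial gaps"),
]


def grade_for(total):
    """Map an unrounded total (0 to 100) to a (grade, label) pair."""
    if not 0 <= total <= 100:
        raise ValueError("total must be between 0 and 100")
    for lower, grade, label in GRADE_BANDS:
        if total >= lower:
            return grade, label
    # Unreachable: the last band starts at 0.
    raise AssertionError("no band matched")
```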
A grade is a statement about a vendor's posture against the rubric, not about the vendor as a company or a product. A C vendor can be the right choice for a given deployer; an A vendor can still be the wrong choice if its product does not fit the role.
Evidence types
Every category score must be backed by at least one piece of evidence. We classify evidence into five types:
- `documentation`: a public-facing technical document, whitepaper, or product page produced by the vendor.
- `audit_report`: a third-party audit, e.g. a bias audit per NYC LL 144.
- `public_statement`: a blog post, press release, regulatory filing, or conference talk.
- `integration`: a feature observable in the product. We credit only features we can verify ourselves.
- `absence`: a noted lack of evidence after good-faith search. This is itself evidence. When we use `absence`, we cite the places we looked so the vendor can dispute it with a real URL.
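The two rules in this section (at least one piece of evidence per category score, and a list of places searched whenever `absence` is used) can be expressed as a small validation step. The following Python sketch is illustrative only: the five type names come from the text above, while the `Evidence` record, its fields, and `validate_category` are hypothetical and do not describe how the directory actually stores evidence.

```python
from dataclasses import dataclass, field
from enum import Enum


class EvidenceType(Enum):
    DOCUMENTATION = "documentation"
    AUDIT_REPORT = "audit_report"
    PUBLIC_STATEMENT = "public_statement"
    INTEGRATION = "integration"
    ABSENCE = "absence"


@dataclass
class Evidence:
    # Hypothetical record shape, assumed for this example.
    type: EvidenceType
    summary: str
    places_searched: list = field(default_factory=list)


def validate_category(evidence):
    """Return a list of problems with one category's evidence (empty if none)."""
    problems = []
    if not evidence:
        problems.append("category score has no evidence")
    for item in evidence:
        if item.type is EvidenceType.ABSENCE and not item.places_searched:
            problems.append("absence evidence must cite the places searched")
    return problems
```

A category backed only by `Evidence(EvidenceType.ABSENCE, "no audit found")` would fail this check until `places_searched` is filled in.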
Update cadence
- Rubric review: annually. A version bump (e.g. v1.0 → v1.1) triggers re-scoring of every vendor before the new version is treated as canonical.
- Vendor scores: refreshed at least every six months, and immediately on a material vendor change (acquisition, major product launch, published audit, new system card).
- Last-reviewed dates are visible above the fold on every vendor profile.
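The vendor-score cadence above can be sketched as a date check. This Python example assumes "six months" means six calendar months from the last-reviewed date, with the day clamped to the end of shorter months; that interpretation, and the names `add_months` and `refresh_due`, are assumptions made for illustration.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping the day to month end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def refresh_due(last_reviewed, today, material_change=False):
    """True if a score needs refreshing: six months elapsed or a material change."""
    return material_change or today >= add_months(last_reviewed, 6)
```

For example, a profile last reviewed on 2026-05-20 would come due on 2026-11-20 under this reading, or immediately if `material_change` is set.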
Conflicts of interest
HireAIScore is a sister property to Casework, the AI hiring compliance firm. The two properties share operators but make decisions in separate forums — Casework engagement findings do not modify HireAIScore scores, and Casework engagement leads do not score vendors. The full policy, including the "no vendor money in either direction" and "named recommendations disclose relationships" clauses, is on the About page.
The three load-bearing rules for the rubric itself:
- No vendor pays for placement, scoring, or removal. Ever. No exceptions.
- If Casework has had a paid commercial relationship with a vendor in the past 24 months, that fact is disclosed on the vendor's profile and on the About page.
- The reviewer who scored the vendor is named on the vendor's profile. If they have a personal or financial relationship with the vendor, the score is reassigned to an independent reviewer who does not.
We treat the conflicts disclosure as a load-bearing piece of the rubric. A vendor profile that should carry a disclosure and does not is a bug; please report it via the address on the About page.
How vendors can request a review or response
Vendors are not consulted before a profile is published. The site is evaluator-driven; consulting vendors before publication would introduce the exact pressure the rubric is designed to resist.
After publication:
- To request a re-score when the underlying evidence has changed (new bias audit, new technical pack, published system card), email the address on the About page with the new evidence. Re-scores typically land within 30 days.
- To submit a response — a vendor right-of-reply — email the same address. Responses are published verbatim, attributed to a named respondent, on the vendor's profile, above the conflicts disclosure.
We do not edit vendor responses for tone or accuracy. We do reserve the right to decline responses that contain personal data, threats, or claims about other vendors.
Methodology version history
| Version | Effective from | Changes |
|---|---|---|
| v1.0 | 2026-05-20 | Initial rubric. Seven categories tied to the EU AI Act, Colorado AI Act, Illinois HB 3773, NYC LL 144, and current case-law. |