How to Audit Your AI Hiring Tools for Compliance
If you use AI in your hiring process, you need to audit it. Not because it sounds like best practice, but because an AI hiring audit is the only way to know whether your tools are working as intended, whether they are creating legal exposure, and whether they are actually improving your hiring outcomes. Many employers deploy AI hiring tools based on vendor promises and never look under the hood. That is a liability waiting to materialize. A structured compliance audit protects your organization, builds defensible documentation, and often reveals opportunities to improve how you use these tools.
This is the final post in our AI & Hiring Law series. The previous posts covered the legal landscape, EEOC guidelines, the organization vs decision distinction, and state-by-state regulations. This post gives you the practical tools to ensure compliance across all of those frameworks.
Why You Need an AI Hiring Audit
There are four categories of risk that an audit addresses:
- Legal risk. Multiple jurisdictions now require bias audits (NYC Local Law 144 mandates an annual independent audit). Even where audits are not yet legally required, the EEOC and state agencies expect employers to monitor their AI tools for disparate impact. Conducting an audit now creates documentation that demonstrates good faith compliance.
- Discrimination risk. AI tools can produce discriminatory outcomes without anyone intending it. Historical training data, proxy variables, and algorithmic design choices can all create patterns that systematically disadvantage protected groups. You will not know this is happening unless you measure it.
- Effectiveness risk. AI hiring tools are expensive and consequential. If your resume screener is ranking candidates in ways that do not correlate with actual job performance, you are paying for a tool that makes your hiring worse. An audit reveals whether the tool is delivering what was promised.
- Reputational risk. Candidates talk. If your AI tools produce obviously unfair outcomes — rejecting qualified candidates, penalizing certain backgrounds, or providing a poor candidate experience — your employer brand suffers. An audit helps you catch these problems before they become public.
The Five-Phase Audit Framework
This framework is designed to be thorough enough for legal compliance and practical enough for organizations of any size. You can conduct this audit internally or engage an external auditor, though NYC Local Law 144 specifically requires independent auditors for bias audits.
Phase 1: Inventory and Classification
Before you can audit your AI tools, you need to know what you have. This phase is more involved than it sounds, because AI is often embedded in tools that are not marketed as “AI hiring tools.”
Step 1: List every tool in your hiring process. Include your applicant tracking system (ATS), any assessment platforms, interview scheduling tools, background check services, sourcing platforms, and any other technology that touches the candidate journey.
Step 2: Identify AI components. For each tool, ask the vendor (or review the documentation) whether the tool uses any form of machine learning, statistical modeling, natural language processing, or algorithmic scoring. Many ATS platforms have AI features that were enabled by default and that your team may not even know about.
Step 3: Classify each AI component. For each AI feature you identify, classify it into one of three categories:
- Administrative: Scheduling, communication, pipeline management. Low decision-making impact.
- Organizational: Data presentation, candidate information formatting, structured question generation, score aggregation. Medium impact — influences what humans see and how they see it.
- Evaluative: Scoring, ranking, screening, recommending, filtering. High impact — directly affects which candidates advance and which do not.
Focus your audit effort on evaluative and organizational tools. Administrative tools carry minimal compliance risk.
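The Phase 1 inventory does not need special software; a structured record per tool is enough. Here is a minimal sketch in Python (the tool names, vendors, and fields are hypothetical examples, not recommendations) showing how the classification drives which tools enter the full audit:

```python
# Minimal AI tool inventory sketch (Phase 1). All tool and vendor
# names below are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class HiringTool:
    name: str
    vendor: str
    function: str
    classification: str  # "administrative", "organizational", or "evaluative"

inventory = [
    HiringTool("ATS resume ranker", "ExampleVendor A", "ranks resumes", "evaluative"),
    HiringTool("Interview scheduler", "ExampleVendor B", "books interviews", "administrative"),
    HiringTool("Question generator", "ExampleVendor C", "drafts structured questions", "organizational"),
]

# Evaluative and organizational tools get full audit attention (Phases 2-4);
# administrative tools carry minimal compliance risk.
audit_queue = [t for t in inventory if t.classification in ("evaluative", "organizational")]
for tool in audit_queue:
    print(f"AUDIT: {tool.name} ({tool.classification})")
```

Even a spreadsheet with the same columns works; the point is that every tool gets an explicit classification on record.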
Phase 2: Bias Testing
This is the core of the compliance audit. For every AI tool classified as evaluative, you need to test whether the tool produces disparate impact on protected groups.
Step 1: Gather outcome data. For each AI tool, collect the outcomes (scores, rankings, pass/fail determinations) for all candidates processed over the past 12 months. If possible, associate each outcome with the candidate's demographic information (race/ethnicity, sex, age group, disability status).
A note on demographic data: many employers do not collect this information from candidates. If you do not have it, you have two options: (1) begin collecting voluntary self-identification data going forward, or (2) use statistical methods (such as Bayesian Improved Surname Geocoding, or BISG) to estimate demographics based on publicly available data. Option 1 is more accurate; option 2 is a reasonable interim approach.
Step 2: Calculate selection rates. For each protected group, calculate the rate at which candidates pass, advance, or receive favorable AI scores. For example, if your AI resume screener advances 40% of male applicants and 28% of female applicants, those are your selection rates.
Step 3: Apply the four-fifths rule. Divide the lower selection rate by the higher. If the result is below 80%, the tool shows potential disparate impact. In the example above: 28% / 40% = 70%, which is below the threshold.
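In code, Steps 2 and 3 reduce to a few lines. A minimal sketch, using the example numbers above (the counts and variable names are illustrative):

```python
# Four-fifths (80%) rule check for one AI screening tool.
# Counts mirror the article's example: 40% of male and 28% of
# female applicants advanced past the AI resume screener.

def selection_rate(advanced: int, total: int) -> float:
    """Rate at which a group passes the AI screen."""
    return advanced / total

def four_fifths_ratio(rate_a: float, rate_b: float) -> float:
    """Lower selection rate divided by the higher."""
    return min(rate_a, rate_b) / max(rate_a, rate_b)

male_rate = selection_rate(advanced=400, total=1000)    # 0.40
female_rate = selection_rate(advanced=280, total=1000)  # 0.28

ratio = four_fifths_ratio(male_rate, female_rate)       # 0.70
flagged = ratio < 0.80  # below 80% signals potential disparate impact

print(f"ratio = {ratio:.2f}, flagged = {flagged}")
```

With very small applicant pools, the four-fifths ratio is noisy; larger employers often supplement it with statistical significance testing before drawing conclusions.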
Step 4: Investigate flagged disparities. If the four-fifths rule identifies potential disparate impact, investigate the specific criteria driving the disparity. Is the AI screening for a criterion that is not actually job-related? Is a legitimate criterion being measured in a way that disadvantages certain groups? Is there an alternative approach that would achieve the same business purpose with less disparate impact?
Step 5: Document findings and actions. For each AI tool tested, document the selection rates, the four-fifths analysis, any disparities identified, the investigation of root causes, and the remediation steps taken. This documentation is your legal defense if the tool's outcomes are ever challenged.
Phase 3: Vendor Assessment
Your AI vendors should be partners in compliance, not obstacles. This phase evaluates whether your vendors are providing the transparency and support you need.
Questions to ask every AI hiring vendor:
- What data does the tool use to make its assessments? You need a complete list of data inputs, not a vague description. “We analyze candidate responses” is not sufficient. What specific features of the responses? How are they weighted?
- Has the tool been validated for the specific use cases you are applying it to? A tool validated for customer service roles is not automatically valid for engineering roles. Validation must be specific to the job type, the candidate population, and the outcomes being predicted.
- Has the tool been independently audited for bias? If yes, request the audit report. If no, ask why not and what the vendor's timeline is for conducting one. A vendor that has never tested its tool for bias is not a vendor you should trust with hiring decisions.
- Can the tool provide audit trails for individual candidates? If a candidate challenges their rejection, can you show exactly what data the AI used, how it scored the candidate, and what criteria drove the outcome?
- What accommodations does the tool support? Can the tool accommodate candidates with disabilities? Can timed assessments be extended? Can alternative input methods be used? What is the process for requesting an accommodation?
- Does the tool comply with applicable state and local AI hiring laws? Ask specifically about NYC Local Law 144, the Illinois Artificial Intelligence Video Interview Act (AIVIA), and the Colorado AI Act. If the vendor is not familiar with these laws, that is a warning sign.
- What does the vendor's contract say about liability for discriminatory outcomes? Many vendor contracts include indemnification clauses that shift all liability to the employer. Understand your contractual exposure.
- How does the vendor update and retrain the AI model? AI models are not static. Understanding the retraining cadence and methodology helps you assess whether the tool's performance and fairness are being maintained over time.
If a vendor cannot answer these questions satisfactorily, you should seriously consider whether that vendor's tool belongs in your hiring process.
Phase 4: Process Review
Beyond the tools themselves, audit how the AI is integrated into your hiring process. The best AI tool, poorly implemented, can still create legal and practical problems.
Human oversight assessment: For each AI tool, verify that a human reviews the AI's output before any action is taken on a candidate. Check whether human reviewers have the time, training, and authority to meaningfully override AI recommendations. Review override rates — if humans never override the AI, the human review may not be meaningful. See our detailed framework on AI organization vs decision-making for the criteria that distinguish genuine oversight from rubber-stamping.
Candidate experience assessment: Walk through your entire hiring process as if you were a candidate. Note every point where AI is involved. Check whether candidates are notified of AI use, whether the notifications are clear and timely, and whether there is a visible path to request an accommodation or alternative process.
Training assessment: Interview your hiring managers and recruiters. Do they understand what the AI tools do? Can they explain how the AI scores or ranks candidates? Do they know they have the authority to override AI recommendations? If the people using the tools do not understand them, meaningful human oversight is not happening.
Phase 5: Documentation and Monitoring Plan
An audit is only as valuable as its documentation and follow-through. This phase establishes the records and ongoing processes that sustain compliance.
Required documentation:
- AI tool inventory. A complete list of all AI tools in your hiring process, their vendors, their functions, and their classification (administrative, organizational, evaluative).
- Bias audit results. For each evaluative tool, selection rates by protected group, four-fifths analysis, any flagged disparities, investigation findings, and remediation actions.
- Vendor assessment records. Vendor responses to compliance questions, validation documentation, audit reports, and contract terms related to liability and data handling.
- Process documentation. How each AI tool is integrated into the hiring process, what human oversight exists, how overrides work, and what candidate notifications are provided.
- Impact assessments. Required by Colorado and expected by other jurisdictions, these documents describe the purpose, data inputs, outputs, risks, and mitigation measures for each high-risk AI tool.
- Remediation records. Documentation of any changes made to AI tools, processes, or vendor relationships based on audit findings.
Ongoing monitoring plan:
- Quarterly: Review AI tool outcome data for emerging disparities. This does not require a full audit — a quick check of selection rates by group can flag issues early.
- Annually: Conduct a complete bias audit (required by NYC Local Law 144 and recommended everywhere). Update impact assessments. Re-evaluate vendor compliance.
- Upon change: Whenever an AI vendor updates their tool, whenever you change how a tool is used in your process, or whenever you add a new AI tool, review the relevant audit components.
- Retention: Retain all audit documentation for at least three years beyond the last use of each AI tool. Some jurisdictions may require longer retention.
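The quarterly check does not require a full audit pipeline. A small script that recomputes selection rates per tool and flags any group ratio below 80% is enough to surface emerging disparities between annual audits. A sketch, with hypothetical tool names, group labels, and outcome counts:

```python
# Quarterly monitoring sketch: flag any tool where the lowest group
# selection rate falls below four-fifths of the highest.
# Tool names, group labels, and (advanced, total) counts are
# hypothetical examples.

quarterly_outcomes = {
    "resume_screener": {"group_a": (120, 300), "group_b": (70, 250)},
    "video_assessment": {"group_a": (90, 200), "group_b": (88, 200)},
}

def flag_tools(outcomes: dict, threshold: float = 0.8) -> list[str]:
    flagged = []
    for tool, groups in outcomes.items():
        rates = [advanced / total for advanced, total in groups.values()]
        if min(rates) / max(rates) < threshold:
            flagged.append(tool)
    return flagged

print(flag_tools(quarterly_outcomes))
```

Any tool this check flags goes into the Step 4 investigation process from Phase 2; the flag itself is a trigger for review, not a finding of discrimination.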
What to Do When You Find a Problem
Audits exist to find problems. Finding a problem is not a failure — it is the audit working as intended. Here is how to respond:
If the Bias Audit Reveals Disparate Impact
- Determine whether the criterion driving the impact is job-related and consistent with business necessity. If that criterion is genuinely required for job performance, document the business necessity justification thoroughly.
- Evaluate less discriminatory alternatives. Even if the criterion is job-related, you must consider whether there is a different approach that would achieve the same purpose with less disparate impact.
- If the criterion is not job-related: remove it. Work with your vendor to modify the AI's scoring criteria, or stop using that feature of the tool.
- Monitor the change. After remediation, track outcomes to verify that the disparate impact has been reduced or eliminated.
If the Process Review Reveals Insufficient Human Oversight
- Identify where AI is making de facto decisions. Apply the five-question framework to each AI touchpoint.
- Restructure workflows. Insert meaningful human review at every point where the AI's output leads to candidate advancement or rejection.
- Train hiring managers. Ensure they understand their role in reviewing AI outputs and their authority to override.
- Track override rates. Use override frequency as a metric for whether human oversight is genuine.
If a Vendor Cannot Provide Adequate Compliance Support
- Document the gap. Record specifically what the vendor could not provide (audit data, validation evidence, accommodation support, etc.).
- Assess the risk. Determine whether you can mitigate the vendor's shortfall through your own processes, or whether the gap creates unacceptable legal exposure.
- Set a deadline. Give the vendor a reasonable timeline to address the shortfall. If they cannot or will not, begin evaluating alternatives.
- Consider the transition carefully. Switching AI tools mid-stream has its own risks. Plan the transition to minimize disruption to active hiring processes and ensure continuity of candidate data.
Who Should Conduct the Audit
For most organizations, a combination of internal and external resources is ideal:
- Bias testing (Phase 2): An independent auditor is required by NYC Local Law 144 and recommended everywhere. Look for auditors with expertise in industrial-organizational psychology, statistical analysis, and employment law.
- Vendor assessment (Phase 3): Your procurement or legal team, with input from HR and IT. The people who manage the vendor relationship should lead this phase.
- Process review (Phase 4): Internal HR or operations, ideally supplemented by someone outside the hiring team who can provide an objective perspective.
- Documentation and monitoring (Phase 5): Your compliance or legal function, with operational support from HR.
For smaller organizations without dedicated compliance teams, the audit can be led by whoever manages the hiring process, supplemented by an external auditor for the bias testing phase. The investment is modest compared to the legal and operational risk of unaudited AI tools.
Building Compliance into Your AI Hiring Tools from the Start
The best time to audit is before you deploy. When evaluating new AI hiring tools, use the vendor assessment questions in Phase 3 as part of your procurement process. Choose tools that are designed for compliance — tools that maintain human decision-making authority, provide transparent scoring, and support reasonable accommodations. PersonaScore, for example, is built around the principle that AI should organize and inform, never decide. Assessment data flows to human decision-makers with full transparency, and no candidate is automatically advanced or rejected by the system.
Choosing compliant tools from the start is dramatically less expensive than auditing non-compliant tools, remediating problems, and potentially switching platforms after issues are discovered.
The Bottom Line
An AI hiring audit is not a bureaucratic exercise. It is the mechanism by which you ensure that your AI tools are doing what they are supposed to do, are not creating legal liability, and are actually improving your hiring outcomes. The audit framework in this guide — inventory, bias testing, vendor assessment, process review, and documentation — is designed to be thorough enough for legal compliance and practical enough for organizations of any size.
Do not wait for a complaint, an enforcement action, or a bad outcome to prompt your first audit. The employers who are best positioned for the evolving AI regulatory landscape are the ones who are already auditing, already documenting, and already fixing the problems they find.
This concludes the AI & Hiring Law series. For more on how to use AI responsibly and effectively in your hiring process, read our guide on AI in Hiring: What Actually Works vs What's Just Hype. And for the full series, start from the beginning with AI in Hiring: What's Legal, What's Not, and What's Gray.