5 Accessibility Agent Lessons From Building a General AI Tool

Agents have reshaped how developers interact with code, and GitHub has embraced agent-based workflows across many of its initiatives. One of them is a general-purpose accessibility agent designed to catch issues early and provide real-time guidance. This pilot project has already reviewed more than 3,500 pull requests, with a 68% resolution rate for accessibility barriers that would otherwise have reached production. The experience offers practical lessons for any team considering similar automation.

Lesson 1: Define What the Agent Can and Cannot Do

One of the first and most important accessibility agent lessons from this experiment involves setting clear expectations. The team behind GitHub’s pilot did not try to solve every possible accessibility problem with automation. They understood that a single tool, no matter how sophisticated, cannot replace human judgment or cover every edge case.

The social model of disability teaches that barriers arise from how environments are built. Digital spaces work the same way. An agent can help remove visible obstacles, but it cannot fully understand context, intent, or the nuanced needs of every user. By acknowledging this limitation early, the team gained faster buy-in from stakeholders and launched the experiment more quickly.

This lesson applies broadly. When building any accessibility tool, define its scope honestly. An agent that tries to do everything often ends up doing nothing well. Focus on objective, repeatable issues where automation shines. Leave complex judgment calls to human reviewers.

Lesson 2: Structured Issue Tracking Creates a Goldmine for LLMs

GitHub had already invested years in a mature system for logging accessibility issues. Each report followed a structured template with steps to reproduce, metadata about severity, applicable WCAG success criteria, crosslinks to pull requests, and acceptance criteria. All issues lived in a single repository, creating a consistent corpus of high-quality examples.
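As an illustration of what such a template might look like in code, here is a minimal sketch. The class name, field names, and severity scale are assumptions made for this example and do not describe GitHub's actual schema. The point is only that every report carries the same fields, so gaps are easy to detect.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AccessibilityIssue:
    """One logged accessibility issue. Every field name here is hypothetical."""
    title: str
    steps_to_reproduce: List[str]
    severity: str                # assumed scale, e.g. "low" / "medium" / "high"
    wcag_criteria: List[str]     # WCAG success criterion numbers, e.g. "4.1.2"
    linked_pull_requests: List[str] = field(default_factory=list)
    acceptance_criteria: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of template fields that were left empty."""
        return [name for name, value in vars(self).items() if not value]


issue = AccessibilityIssue(
    title="Icon-only close button has no accessible name",
    steps_to_reproduce=["Open the dialog", "Tab to the close button with a screen reader"],
    severity="high",
    wcag_criteria=["4.1.2"],
    acceptance_criteria=["Screen reader announces the button with a clear name"],
)
print(issue.missing_fields())  # only linked_pull_requests was left empty
```

A check like `missing_fields` could run when an issue is filed, which keeps the corpus consistent enough to serve as reference material later.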

When the team began building the agent, this organized collection became one of its strongest assets. The agent could investigate past issues and extrapolate code snippets and language patterns that helped it recognize similar problems in new pull requests. The consistency of the data made it ideal for reference, even though it was originally created long before LLMs became popular.

One of the key accessibility agent lessons here is that preparation matters. If your organization has not already invested in manually identifying and documenting accessibility issues, you will face a harder road. The European Accessibility Act is now in effect, and Title II of the Americans with Disabilities Act will establish WCAG 2.1 AA as the legal standard for state and local government web content by April 2027. Organizations that wait will be at a disadvantage, not just legally but also technically, because they will lack the structured data needed to guide an agent effectively.

Start building your issue library today. Use templates. Add metadata. Centralize everything. This investment pays off whether you build an agent next year or five years from now.

Lesson 3: Vague Instructions Produce Poor Results

Telling an LLM to “use accessibility best practices” sounds reasonable but rarely works. The agent experiment demonstrated that vague instructions in skill files lead to weak outcomes. LLMs trained on massive codebases often inherit accessibility antipatterns because so much of the web has historically been built without accessibility in mind. Without specific guidance, the agent reproduces those same mistakes.

The GitHub team learned that the agent needed better content. They invested in cataloging actual issues and verified fixes. This collection of real problems and their resolutions provided highly contextual examples that the agent could reference. Instead of abstract principles, the agent worked from concrete before-and-after scenarios.

This is an old truth in specialized domains. A lawyer does not learn the law by reading a single paragraph about “being fair.” A doctor does not diagnose from a vague instruction to “help patients.” Accessibility requires the same depth. Write specific instructions. Provide real examples. Include failed attempts alongside successful fixes. The agent will perform dramatically better with precise guidance than with generic advice.
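To make "provide real examples" concrete, here is a small sketch of turning cataloged issues into before-and-after guidance text that could be placed in a skill file. The function name, dictionary keys, and the sample issue are all hypothetical, and the source does not describe the format GitHub uses.

```python
from typing import Dict, List


def build_guidance(examples: List[Dict[str, str]]) -> str:
    """Render cataloged issues as before-and-after guidance text."""
    sections = []
    for ex in examples:
        sections.append(
            f"Issue: {ex['problem']}\n"
            f"Before:\n{ex['before']}\n"
            f"After:\n{ex['after']}\n"
            f"Why: {ex['why']}"
        )
    return "\n\n".join(sections)


# Hypothetical catalog entry: a clickable div replaced by a native button.
examples = [
    {
        "problem": "Clickable div is not reachable by keyboard",
        "before": '<div class="btn" onclick="save()">Save</div>',
        "after": '<button type="button" onclick="save()">Save</button>',
        "why": "A native button is focusable and exposes a button role.",
    }
]
guidance = build_guidance(examples)
print(guidance)
```

Compared with a one-line instruction to "use accessibility best practices", each entry names the problem, shows the failing markup, shows the fix, and states the reason.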

This accessibility agent lesson extends beyond code. Any team training an LLM for a specialized task should invest in curated, high-quality examples rather than relying on broad instructions.

Lesson 4: Fuzzy Matching Is an Asset, Not a Liability

LLMs are non-deterministic. They do not produce the same output every time, and they do not match patterns exactly. Many engineers view this as a weakness. For accessibility work, it can be exactly what you need.

GitHub’s agent was instructed to investigate past issues and extrapolate related code and language snippets. Because LLMs match loosely rather than exactly, the agent could identify problems that did not look identical to anything in that corpus of past issues. A button with a slightly unconventional label or a heading structure that mostly followed patterns but deviated in one place might still be caught. Exact pattern matching would have missed these cases.

This flexibility proved valuable. In a domain like accessibility, where every interface is unique and barriers appear in countless forms, rigid matching fails. The agent needed to recognize structural similarities even when the surface details differed. The non-deterministic nature of LLMs, often seen as a bug, became a feature.
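The difference between exact and approximate matching can be shown with a deliberately simple stand-in. The sketch below uses Python's standard-library `difflib.SequenceMatcher`, which is deterministic string similarity and not what an LLM does internally. It only illustrates why a near-miss variant slips past an equality check but is still caught by a similarity score. The snippets and the 0.8 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical markup taken from a previously logged issue.
KNOWN_BAD = '<div class="btn" onclick="submit()">Send</div>'


def exact_match(snippet: str) -> bool:
    """Rigid check: only flags code identical to the logged example."""
    return snippet == KNOWN_BAD


def similarity(snippet: str) -> float:
    """Score from 0.0 to 1.0, where 1.0 means identical strings."""
    return SequenceMatcher(None, KNOWN_BAD, snippet).ratio()


variant = '<div class="btn" onclick="submitForm()">Send</div>'
unrelated = '<img src="logo.png">'

print(exact_match(variant))         # False: the handler name differs
print(similarity(variant) > 0.8)    # True: structurally almost the same
print(similarity(unrelated) > 0.8)  # False: different markup entirely
```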

One of the more surprising accessibility agent lessons is that embracing uncertainty can improve outcomes. Instead of fighting the probabilistic nature of LLMs, design your agent to work with it. Use confidence thresholds. Let the agent flag issues it is unsure about for human review. The fuzzy approach can catch real problems that a strict rule-based system would miss, as long as a human checks the uncertain cases.
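One way to act on confidence thresholds is a small triage step between the agent's findings and the pull request. The sketch below is an assumption about how that could look. The threshold values, bucket names, and the idea that the agent reports a numeric confidence are all hypothetical and are not taken from GitHub's implementation.

```python
from typing import Dict, List, Tuple

Finding = Tuple[str, float]  # (description, confidence between 0.0 and 1.0)


def triage(findings: List[Finding],
           comment_at: float = 0.85,
           review_at: float = 0.5) -> Dict[str, List[str]]:
    """Sort findings into buckets by confidence (thresholds are illustrative)."""
    buckets: Dict[str, List[str]] = {"comment": [], "human_review": [], "dropped": []}
    for description, confidence in findings:
        if confidence >= comment_at:
            buckets["comment"].append(description)       # post directly on the PR
        elif confidence >= review_at:
            buckets["human_review"].append(description)  # queue for a reviewer
        else:
            buckets["dropped"].append(description)       # too uncertain to surface
    return buckets


result = triage([
    ("Image has no text alternative", 0.95),
    ("Heading order may skip a level", 0.60),
    ("Label wording might be unclear", 0.30),
])
print(result["human_review"])  # ['Heading order may skip a level']
```

The middle bucket is where the fuzzy approach and human judgment meet: uncertain findings are neither trusted blindly nor thrown away.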

Lesson 5: Accessibility Is Holistic and Token-Consuming

Accessibility does not live in one discipline. It touches code, design, copywriting, interaction patterns, and content strategy. A button might fail accessibility not because the code is wrong but because the label text is unclear. A heading structure might break because a designer moved elements visually without updating the underlying semantic order.

This holistic nature means an agent must consider many dimensions at once. It also means token consumption becomes a real concern. Every check, every reference to past issues, every comparison against WCAG criteria consumes tokens. If the agent is not efficient, costs rise quickly and response times suffer.
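A rough way to keep token use in check is to cap how much reference material goes into each review. The sketch below assumes references arrive already sorted by relevance and uses a crude estimate of about four characters per token. That figure is a rule of thumb, not a real tokenizer, and the function names are hypothetical.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough estimate assuming about 4 characters per token (heuristic only)."""
    return max(1, len(text) // 4)


def select_within_budget(references: List[str], budget: int) -> List[str]:
    """Keep references in priority order until the token budget is spent."""
    chosen: List[str] = []
    used = 0
    for ref in references:
        cost = estimate_tokens(ref)
        if used + cost > budget:
            break  # stop at the first reference that does not fit
        chosen.append(ref)
        used += cost
    return chosen


# Three placeholder references of 40 characters each, roughly 10 tokens apiece.
references = ["a" * 40, "b" * 40, "c" * 40]
print(len(select_within_budget(references, budget=25)))  # 2
```

In practice the estimate would be replaced by the tokenizer of whichever model is in use. The budget logic stays the same.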

GitHub’s experiment handled this by focusing the agent on the most frequent and objective issue types. To date, the agent has reviewed 3,535 pull requests with a 68% resolution rate. The top five issue categories were:

Making structure and relationships clear to assistive technologies
Providing clear and concise names for interactive controls
Ensuring users are aware of important announcements
Ensuring text alternatives exist for non-text content
Moving keyboard focus through pages in a logical order

Each of these represents real friction removed for people who rely on assistive technology. These are not theoretical concerns. They are barriers that would have prevented users from navigating GitHub effectively. By concentrating on the highest-impact issues, the agent delivered meaningful improvements without excessive resource consumption.

The final accessibility agent lesson here is to resist the temptation to cover everything. Accessibility is broad, and an agent cannot master every dimension at once. Pick the areas where automation delivers the best return. Address the issues that appear most frequently in your audits. Let the agent handle those while human reviewers focus on deeper, more contextual problems that require empathy and experience.

This approach also makes the agent easier to improve over time. Start with a narrow scope. Measure results. Expand gradually as you learn what works.
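The narrow-scope idea can be sketched as an explicit allowlist. The category keys below are hypothetical shorthand for the five issue types listed above, and the routing function is an illustration, not a description of how GitHub's agent is configured.

```python
# Hypothetical keys for the five most frequent issue categories.
IN_SCOPE = {
    "structure-and-relationships",
    "control-names",
    "announcements",
    "text-alternatives",
    "focus-order",
}


def route(category: str) -> str:
    """Send in-scope categories to the agent and everything else to a human."""
    return "agent" if category in IN_SCOPE else "human"


print(route("focus-order"))     # agent
print(route("cognitive-load"))  # human
```

Expanding the agent's scope then becomes a reviewable one-line change to the allowlist, which fits the advice to measure results before widening coverage.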

What This Means for Your Team

The GitHub experiment is still ongoing, and the agent continues to learn. But the patterns are already clear. Building a general accessibility agent is not about replacing human effort. It is about creating a tool that amplifies what humans already do well. The agent catches the repeatable, objective issues that slow down reviewers. It provides answers quickly so developers do not have to interrupt their workflow to research accessibility questions.

For teams considering a similar path, the message is straightforward. Invest in structured issue tracking now. Define scope honestly. Write specific instructions. Embrace the fuzzy nature of LLMs. And focus on the issues that matter most to real users. The legal landscape is shifting, with the European Accessibility Act and ADA Title II raising standards in the EU and the United States. The teams that start building these systems today will be far better positioned as those requirements take hold.

Accessibility is not a feature. It is a fundamental property of good software. An agent cannot solve it alone, but it can be a powerful partner in the ongoing work of removing barriers.