A phased program to automate the manual document triage and bill-to-notes reconciliation work that today consumes 50–60 hours of staff time per week — without compromising the security posture you've spent fifteen years building.
Direct PT/DX receives thousands of physical therapy reports each week from hundreds of clinics across dozens of software platforms. Two of the most time-consuming workflows on the team today — document triage and bill-to-notes reconciliation — are still done by hand. They don't have to be.
A self-contained AI document processing service that lives inside the Direct PT/DX AWS perimeter, integrates directly with Phoenix, and applies enterprise-grade large language models to the two specific workflows your team identified.
The 50–60 hours per week your team spends opening PDFs and matching them to referrals is recovered. The reconciliation work currently limited to flagged providers expands to every bill — catching billing leakage that today goes undetected.
Direct PT/DX runs a mature, custom-built PHP application — Phoenix — that's been refined over fifteen years to handle a complex, EDI-driven workers' comp scheduling business. The system works. The gap is in the manual judgment layer that sits on top of it.
PDFs land in two places: the SFTP "leftover" folder (files that didn't auto-match their EDI 837 bill) and the Zoho inbox (faxes and email submissions). A staff member must open each file and answer:
Then the file is archived, deleted, or uploaded with corresponding database updates.
For flagged providers, the team verifies that billed CPT codes and units are actually supported by the clinical notes. A typical PT visit is four fifteen-minute units across one or more procedure codes.
The bias is conservative: anything ambiguous is failed for human review.
HIPAA-grade posture, white-hat tested by the cyber carrier. Files must not leave the Direct PT/DX environment unless via a vetted, BAA-covered API.
Production database, file server, and application servers are migrating from Cloud Eleven to AWS. New work should deploy directly into the AWS target environment.
Several hundred thousand lines of PHP 5.7. Integration via MySQL connector and the existing file server — not via Phoenix code changes.
A dedicated AI document-processing service runs alongside Phoenix in AWS, inside the Direct PT/DX perimeter. It uses an enterprise LLM (Claude or OpenAI under a Business Associate Agreement) for the parts of the problem that require judgment, and deterministic Python for everything else.
SFTP leftover folder · Zoho mailbox (fax + email) · Phoenix referral & bill data via MySQL · Provider CPT format library
Python service on an AWS EC2 instance · PDF text and vision extraction · Enterprise LLM calls (Claude/OpenAI w/ BAA) · Vector store for provider format memory · Deterministic matching logic
File server uploads · Phoenix database updates · Review queue for ambiguous cases · Drafted provider outreach emails · Audit log of every decision
Most document automation failures come from two mistakes: trying to use rules where you need judgment, and trying to use AI where you need precision. This architecture splits the work cleanly.
The LLM handles the unstructured judgment work — reading a faxed handwritten note, recognizing that Therapy South phrases 97110 differently than Encore, deciding whether a duplicate has new content.
Deterministic Python handles everything else — file movement, database updates, audit trails, integration with Phoenix. This is where reliability lives.
The result: an accuracy ceiling much higher than that of rule-based automation, with the operational reliability of traditional software.
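One way to picture the split is a deterministic gate that sits directly behind every LLM call. The sketch below is illustrative only: the JSON field names (`patient_name`, `dob`, `date_of_service`), the ISO date format, and the `parse_identity` helper are assumptions for this example, not the final Phase 0 extraction spec. The LLM does the reading; plain Python decides whether the answer is well-formed enough to act on.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """Fields the LLM is asked to extract (hypothetical schema)."""
    patient_name: str
    dob: date
    date_of_service: date


def parse_identity(llm_response: str) -> Optional[Identity]:
    """Return a validated Identity, or None when the item should go to review.

    Nothing here calls an LLM; it only checks what the LLM returned.
    """
    try:
        data = json.loads(llm_response)
        name = data["patient_name"].strip()
        dob = date.fromisoformat(data["dob"])
        dos = date.fromisoformat(data["date_of_service"])
    except (ValueError, KeyError, TypeError, AttributeError):
        # Malformed JSON, missing field, wrong type, or non-ISO date.
        return None
    if not name or dob > dos:
        # Empty name or a birth date after the visit: treat as ambiguous.
        return None
    return Identity(name, dob, dos)
```

Anything that returns `None` would route to the review queue rather than trigger a file move or database update.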
Each phase has a clear objective, defined scope, named deliverables, and explicit acceptance criteria.
The foundation phase. Before any code is written, we sign the BAA and MSA, obtain SFTP credentials, and collect a representative sample of 500+ real documents covering the major providers and the long tail. From that sample we catalog the format universe, define exactly which data fields we extract and how, and build the ground-truth test set every later phase will be measured against. We also finalize the LLM provider choice (Claude vs. OpenAI), the AWS deployment target, and the security review process.
Provision the AWS environment, stand up the Python service, build the integration layer between the LLM provider and Phoenix, and put logging, audit trails, error handling, and retries in place. By the end of this phase, we can ingest a PDF from any source, route it through an LLM, and write a result somewhere — generically. Phases 2 and 3 then specialize this foundation to the two specific workflows.
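As a rough illustration of the retry and logging plumbing this phase puts in place, here is a minimal sketch of a retry wrapper with exponential backoff. The function name `call_with_retries`, the logger name, and the default attempt count and delay are assumptions for the example; actual values would be set during Phase 1.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("docservice")  # hypothetical logger name


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, ...
    The final failure is re-raised so the caller can log it and route to review.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be tested without waiting; in production it would stay at its default.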
Built and shipped in six slices: identity extraction (patient name, DOB, date of service), provider classification, Phoenix referral matching, duplicate detection with addendum awareness, the action layer (archive / delete / upload + database update), and the Zoho mailbox handler that pulls fax PDFs out of email before they hit the leftover folder. Each slice is tested against the ground-truth set; the phase isn't complete until aggregate accuracy clears the agreed threshold.
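To make "duplicate detection with addendum awareness" concrete, here is a simplified sketch of one possible first pass. It assumes a visit key built from already-extracted identity fields and an in-memory `seen` map; both are stand-ins for whatever Phoenix lookup the real slice uses, and the names `fingerprint` and `classify` are hypothetical. An exact repeat is a duplicate, while new text for a visit we have already seen is flagged as a possible addendum for the LLM or a reviewer to judge.

```python
import hashlib
import re
from enum import Enum
from typing import Dict, Set, Tuple

VisitKey = Tuple[str, str, str]  # (patient name, DOB, date of service), assumed shape


class DupStatus(Enum):
    NEW = "new"
    DUPLICATE = "duplicate"
    POSSIBLE_ADDENDUM = "possible_addendum"


def fingerprint(text: str) -> str:
    """Hash of the text with whitespace collapsed and case folded."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(visit_key: VisitKey, text: str, seen: Dict[VisitKey, Set[str]]) -> DupStatus:
    """Classify a document against fingerprints already stored for its visit."""
    fp = fingerprint(text)
    known = seen.setdefault(visit_key, set())
    if fp in known:
        return DupStatus.DUPLICATE
    # Same visit, different content: may carry new information.
    status = DupStatus.POSSIBLE_ADDENDUM if known else DupStatus.NEW
    known.add(fp)
    return status
```

A hash comparison only catches byte-for-byte repeats after normalization; re-scanned faxes with OCR differences would still need the judgment layer.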
The CPT-to-narrative reconciliation problem. Build the procedure-code dictionary from the AMA reference, then layer per-provider format mappings — how Therapy South phrases 97110, how Encore phrases 97140, how Cora phrases re-evaluations. The LLM reasoning layer compares billed line items against documented work and returns pass / fail / uncertain. Conservative bias throughout: uncertainty always becomes a failure routed to review. Provider-specific overrides for the largest providers. By the end of this phase, every bill in the system can be reconciled — not just the flagged-provider subset that the team can manually handle today.
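The conservative bias can be sketched in a few lines. This example assumes the LLM layer has already produced a per-code count of documented units (or `None` where it could not tell), and it assumes "supported" means documented units are at least the billed units; both are simplifications for illustration, and the function names are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNCERTAIN = "uncertain"


def check_line(billed_units: int, documented_units: Optional[int]) -> Verdict:
    """Compare one billed CPT line against what the notes support."""
    if documented_units is None:
        return Verdict.UNCERTAIN  # notes unclear or code not found
    return Verdict.PASS if documented_units >= billed_units else Verdict.FAIL


def reconcile(billed: Dict[str, int], documented: Dict[str, Optional[int]]) -> Verdict:
    """Bill-level verdict: any failing line fails the bill, then any uncertainty."""
    if not billed:
        return Verdict.UNCERTAIN  # nothing to check is itself ambiguous
    verdicts = [check_line(units, documented.get(cpt)) for cpt, units in billed.items()]
    if Verdict.FAIL in verdicts:
        return Verdict.FAIL
    if Verdict.UNCERTAIN in verdicts:
        return Verdict.UNCERTAIN
    return Verdict.PASS


def needs_review(verdict: Verdict) -> bool:
    """Conservative routing: only a clean pass skips human review."""
    return verdict is not Verdict.PASS
```

For a typical four-unit visit billed as two units of 97110 and two of 97140, the bill passes only if the notes support both lines in full.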
A lightweight review console for Debbie's team to handle the cases the system flags. Each item arrives pre-analyzed with the system's reasoning visible, drafted provider outreach (when relevant), and one-click actions to confirm, override, or escalate. The feedback channel Paul described — "the system found an eighth thing I didn't train it on" — becomes a structured loop where novel patterns surface to a human reviewer rather than disappearing.
Security review aligned with the standards your cyber carrier already applies. Runbooks for every failure mode. Documentation written for the PHP journeyman being onboarded — not for engineers who already know the system. Training sessions with Paul, Mike, and Debbie. After Phase 5, the internal team has the documentation, access, and confidence to run the system without ongoing reliance on Bowers Enterprises.
A separate engagement to get Paul, Mike, and Debbie operating with modern AI tooling: Claude Code inside VS Code, structured prompting for Phoenix support tasks, and AI-assisted code review for the PHP journeyman. Structured as a kickoff workshop followed by four follow-up coaching sessions. Available as a separate statement of work.
Sequencing is deliberate: Phase 0 de-risks everything that follows. Phase 1 builds the foundation. Phases 2 and 3 ship the two use cases. Phase 4 wraps the human layer. Phase 5 hands the keys over. Each phase has a clean handoff to the next, with go/no-go decision points between them.
Legend. Solid teal = active development. Outlined cells = overlap weeks with the prior phase (testing, handoff, parallel work). Bars assume a smooth path through acceptance criteria; phases can flex ±2 weeks within the total envelope.
Discovery complete. BAA signed. Ground-truth test set in hand. Phase 1 approved to start.
Document triage in production. First measurable labor recovery on the team begins.
Reconciliation live across all providers. Full system handed off to the internal team.
| # | Phase | Scope |
|---|---|---|
| 00 | Discovery & Foundation Design | BAA, sample data, ground-truth test set, architecture spec |
| 01 | Toolkit & Infrastructure | AWS service, LLM integration, ingestion, Phoenix connector |
| 02 | Document Triage Automation | Identity extraction, referral matching, dedup, action layer, Zoho |
| 03 | Bill-to-Notes Reconciliation | CPT dictionary, provider mappings, reasoning engine, overrides |
| 04 | Human-in-the-Loop Workflow | Review console, drafted outreach, feedback loop |
| 05 | Hardening, Documentation & Handoff | Security review, runbooks, training, source transfer |
Every estimate above assumes the items below are in place. If any of them shift materially, we'll re-plan together before scope or timeline changes are locked in.
This proposal does not include changes to Phoenix's PHP code. The AI service integrates via the file server and MySQL — the same way Phoenix already exchanges data with its own automations. If a future phase requires Phoenix changes, that's separate scope and we'd quote it as such.
No project of this size lands without friction. The honest version: here are the things most likely to go sideways, and how the plan absorbs them.
Mitigation. Phase 0 builds the ground-truth set before we code. Phase 2 is sliced into six independent deliverables so we can measure accuracy incrementally and tune. If we miss the threshold at end of Phase 2, we extend by up to two weeks at no additional cost before re-scoping.
Mitigation. These cases are explicitly out of the auto-routing target. They route to the review queue with the system's best guess and the original PDF attached — same workflow your team uses today, but consolidated in one place rather than scattered across folders.
Mitigation. Phase 1 deploys into the AWS target regardless. If a Phoenix component is still on Cloud Eleven by Phase 2 kickoff, the service connects across the two environments, which is normal for hybrid deployments. No phase blocks on the migration completing.
Mitigation. Security posture is designed in from Phase 1, not bolted on at the end. Phase 0 includes a checkpoint conversation with whoever runs your security review so requirements are known before we build. Remediation work for any review finding falls inside the Phase 5 envelope unless it's a wholesale architecture change.
Mitigation. The LLM integration is built behind an abstraction layer. Swapping providers (or migrating to self-hosted later) is a configuration change, not a rewrite. Both major enterprise providers offer HIPAA-eligible tiers with similar capabilities, so a forced switch is recoverable.
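A minimal sketch of what such an abstraction layer could look like follows. The `LLMClient` interface, the stub classes, and the `llm_provider` config key are all hypothetical names for illustration; the stubs return canned strings and deliberately do not call any vendor SDK.

```python
from typing import Callable, Dict, Protocol


class LLMClient(Protocol):
    """The only surface the rest of the service is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class StubClaudeClient:
    """Placeholder; a real adapter would wrap the vendor SDK under the BAA."""

    def complete(self, prompt: str) -> str:
        return f"[claude-stub] {prompt}"


class StubOpenAIClient:
    """Placeholder for the alternative provider."""

    def complete(self, prompt: str) -> str:
        return f"[openai-stub] {prompt}"


PROVIDERS: Dict[str, Callable[[], LLMClient]] = {
    "claude": StubClaudeClient,
    "openai": StubOpenAIClient,
}


def make_client(config: Dict[str, str]) -> LLMClient:
    """Pick the provider named in config; unknown names fail loudly."""
    name = config["llm_provider"]
    if name not in PROVIDERS:
        raise ValueError(f"unknown llm_provider: {name!r}")
    return PROVIDERS[name]()
```

Under this shape, switching providers means changing one config value and supplying one adapter class; triage and reconciliation code never imports a vendor library directly.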
Mitigation. Phase 3 explicitly budgets per-provider override modules for the largest providers. If a provider's format defies general reasoning, we ship a targeted rule layer for that provider rather than expanding generic LLM work. The architecture supports this; the timeline absorbs up to three such override builds.
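The override mechanism can be thought of as a small registry that is consulted before the generic path. In the sketch below, the provider name `example_provider`, the "Units: N" note format, and the placeholder `generic_checker` are invented for illustration; real override rules would come from the provider format catalog built in Phase 0.

```python
import re
from typing import Callable, Dict, Optional

# A checker reads note text and returns documented units, or None if unclear.
Checker = Callable[[str], Optional[int]]

OVERRIDES: Dict[str, Checker] = {}


def override(provider: str) -> Callable[[Checker], Checker]:
    """Decorator that registers a targeted rule layer for one provider."""
    def register(fn: Checker) -> Checker:
        OVERRIDES[provider] = fn
        return fn
    return register


def generic_checker(note_text: str) -> Optional[int]:
    """Stand-in for the general LLM reasoning path; always 'unclear' here."""
    return None


@override("example_provider")
def example_provider_checker(note_text: str) -> Optional[int]:
    """Hypothetical provider that always writes 'Units: N' in its notes."""
    match = re.search(r"units:\s*(\d+)", note_text, re.IGNORECASE)
    return int(match.group(1)) if match else None


def documented_units(provider: str, note_text: str) -> Optional[int]:
    """Use the provider's override if one exists, else the generic path."""
    checker = OVERRIDES.get(provider, generic_checker)
    return checker(note_text)
```

Because each override is an isolated function, adding or removing one does not touch the generic reasoning path or other providers.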
If this proposal aligns with what you have in mind, three things move us into Phase 0 within a week.
Signed acceptance below, or a counter-proposal on any line item we should adjust. Either is welcome.
Bowers Enterprises will deliver draft documents within 48 hours of acceptance. Standard HIPAA-aligned BAA, standard professional services MSA. Direct PT/DX legal review and execution.
A working session with Paul Williams to map the document landscape, identify the test-set sources, and align on the architecture decisions that drive everything downstream. Phase 0 invoice issued at kickoff.