The Curalytics Prescription Pipeline

The Big Picture

Curant Health is the mail-order pharmacy that ships PrEP medications to Healthvana's telehealth patients. When a provider prescribes Descovy or Truvada through DoseSpot, Curant fills the order, packages it, and ships it to the patient's door.

Healthvana needs to know when that happens: whether a prescription was filled, when it shipped, and whether the patient is actually receiving their medication. This pipeline answers those questions. It scrapes dispensing data from Curant's portal, feeds it into Healthvana's record system, matches each fill event to the right patient and prescription, and creates structured records that providers can see in their dashboard.

🌐 Scrape: Browser-based download from Curalytics portal

→ ☁️ Ingest: S3 upload, Celery polling, ParseMessage creation

→ ⚙️ Parse: Patient & Rx matching, dispense record creation

Stage 1: The Download Tool

The curalytics-download repo is a small Next.js app deployed at rx.hvna.dev. It runs inside the Curalytics portal via a bookmarklet and extracts three distinct datasets that, when joined together, give us a complete picture of every prescription fill.

Why not an API?

Curant's portal (curalytics.azurewebsites.net) has no documented external API. The only way to get data out is through the portal's own authenticated web interface. The download tool works around this by injecting an iframe that piggy-backs on the user's existing browser session.

How it works

A Healthvana team member navigates to the Curalytics portal and clicks a bookmarklet. That injects a script that opens rx.hvna.dev in an iframe. The iframe communicates with the parent page through postMessage RPC, effectively proxying authenticated API requests through the user's session. No credentials are stored or transmitted outside the browser.

The user picks a date range and clicks download. The orchestrator fetches three endpoints sequentially, joins the results, and produces a single JSON file that downloads to the user's browser.

Three datasets, one join

Medication Summary: RxNumber, PatientFirstName, PatientLastName, PatientDateOfBirth, NDC, DispensedQuantity, DaysSupply, DateWritten, LastFillDate

Package Contents: RxNumber, PatientShipmentId, DrugName, PrescriptionName, Quantity

Dispensed Scripts: PatientShipmentId, ShipDate, DispensedDate, FillDate, Courier, TrackingNumber, TrackingLink, DeliveryDate

Medication Summary → Package Contents via RxNumber

Package Contents → Dispensed Scripts via PatientShipmentId

Joined Row: { summary, package, dispensed }

Summary gives us the prescription-level data: who the patient is, what drug was prescribed, and when. Package tells us which shipment contains which prescription (one shipment can have multiple Rx items). Dispensed gives us the shipping and delivery details for each shipment.

The join logic indexes packages by RxNumber and dispensed records by PatientShipmentId. For each summary row, it finds matching packages, then finds the dispensed record for each package's shipment. Summary rows with no matching package still appear in the output (with package: null).
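The join described above can be sketched in a few lines. This is an illustration in Python, not the tool's own code (the download tool is a Next.js app); the function name join_datasets is hypothetical, and setting dispensed to None when no shipment record is found is an assumption.

```python
from collections import defaultdict


def join_datasets(summaries, packages, dispensed):
    """Join the three datasets into {summary, package, dispensed} rows."""
    # One Rx can appear in several shipments, so each RxNumber maps to a list.
    packages_by_rx = defaultdict(list)
    for pkg in packages:
        packages_by_rx[pkg["RxNumber"]].append(pkg)

    # Assumption: at most one dispensed record per PatientShipmentId.
    dispensed_by_shipment = {d["PatientShipmentId"]: d for d in dispensed}

    rows = []
    for summary in summaries:
        matches = packages_by_rx.get(summary["RxNumber"], [])
        if not matches:
            # Summary rows without a package are kept, with package set to None.
            rows.append({"summary": summary, "package": None, "dispensed": None})
            continue
        for pkg in matches:
            rows.append({
                "summary": summary,
                "package": pkg,
                "dispensed": dispensed_by_shipment.get(pkg["PatientShipmentId"]),
            })
    return rows
```

Indexing both lookup tables up front keeps the join linear in the number of rows instead of scanning every package for every summary.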

Example joined JSON object

{
  "summary": {
    "RxNumber": "RX-102938",
    "PatientFirstName": "Jane",
    "PatientLastName": "Doe",
    "PatientDateOfBirth": "1/15/1990",
    "NDC": "61958200201",
    "MedicationName": "Descovy 200-25 Mg Tablet",
    "DispensedQuantity": 30,
    "DaysSupply": 30,
    "DateWritten": "2026-01-05T10:00:00Z",
    "LastFillDate": "1/10/2026",
    "CompletedDate": "1/10/2026"
  },
  "package": {
    "PatientShipmentId": "SHP-445566",
    "RxNumber": "RX-102938",
    "DrugName": "DESCOVY TAB 200-25MG",
    "PrescriptionName": "Descovy 200-25 Mg Tablet",
    "Quantity": "30"
  },
  "dispensed": {
    "PatientShipmentId": "SHP-445566",
    "ShipDate": "1/10/2026",
    "DispensedDate": "1/10/2026",
    "FillDate": "2026-01-10T14:30:00Z",
    "DeliveryDate": "1/13/2026",
    "Courier": "UPS",
    "DeliveryType": "UPS Next Day Air",
    "TrackingNumber": "1Z999AA10123456784",
    "TrackingLink": "https://ups.com/track?num=..."
  }
}

Manual step

The download tool saves a JSON file through the user's browser. Getting that file into S3 is currently a manual upload; the tool does not push to S3 automatically.

Stage 2: File Ingestion

Once the JSON file lands in an S3 bucket, Healthvana's existing feed infrastructure takes over. The same system that processes HL7 lab results, CSV imports, and SFTP feeds also handles Curant JSON files. No special plumbing was needed, just a new row in the FeedConfiguration table.

The feed system

A FeedConfiguration record in the database defines everything the system needs to know about a data source:

FeedConfiguration

feed_type = JSON_FEED
storage_class = HVS3BucketFeed
bucket_name = (configured per env)
file_prefix = curant-
file_extensions = json
parser = CurantJsonParser
process_func_path = json_process_files

From file to ParseMessage

A ParseMessage is a database record representing one piece of incoming data that needs to be interpreted. Think of it as an item in a queue: once created, it gets handed to the right parser for processing.

Celery polls S3

The process_files Celery task runs every 5 minutes (cron 2-57/5). It acquires a distributed lock, iterates active FeedConfiguration entries, and uses HVS3BucketFeed to list new files matching the configured prefix and extension.
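Two details of the polling step are easy to check by hand: which minutes the cron expression 2-57/5 fires on, and which S3 keys pass the prefix and extension filter. The sketch below is illustrative only; cron_minutes and is_feed_file are hypothetical helpers, and matching the prefix against the filename rather than the full key is an assumption.

```python
def cron_minutes(start=2, stop=57, step=5):
    """Expand the minute field '2-57/5' into the minutes at which the task fires."""
    return list(range(start, stop + 1, step))


def is_feed_file(key, file_prefix="curant-", file_extensions=("json",)):
    """Return True if the key's filename matches the configured prefix and extension."""
    filename = key.rsplit("/", 1)[-1]
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return False  # no extension at all
    return filename.startswith(file_prefix) and ext.lower() in file_extensions


print(cron_minutes())  # [2, 7, 12, 17, 22, 27, 32, 37, 42, 47, 52, 57]
```

The offset of 2 means the task never fires on the hour or on other round five-minute marks.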

Download to temp directory

New files are downloaded from S3 to a local temporary directory on the worker. The feed configuration's local_path determines where files land.

Parse JSON, create ParseMessages

json_process_files() reads each file. The JSON can be a single object or an array of objects; both are handled. For each object, it computes a message_hash and calls get_or_create on ParseMessage. If the hash already exists, the duplicate is skipped. Otherwise a new ParseMessage is created with the raw JSON, the parsed dict (plus filename), and the parser class path.
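A minimal sketch of that dedup step, using a plain dict in place of the ParseMessage table. The hashing scheme (SHA-256 over key-sorted JSON), the "filename" key, and the ingest function itself are assumptions for illustration; the source only states that a message_hash is computed and used with get_or_create.

```python
import hashlib
import json


def message_hash(obj):
    """Hash a canonical form of the object so key order does not matter."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(raw_text, filename, store):
    """Create one stored message per new object; return the hashes created."""
    payload = json.loads(raw_text)
    # A file may hold a single object or an array of objects.
    objects = payload if isinstance(payload, list) else [payload]

    created = []
    for obj in objects:
        digest = message_hash(obj)
        if digest in store:
            continue  # exact duplicate, skip (stand-in for get_or_create)
        store[digest] = {
            "raw": json.dumps(obj),
            "parsed": {**obj, "filename": filename},
            "parser": "CurantJsonParser",
        }
        created.append(digest)
    return created
```

Because the hash covers only the object's content in this sketch, the same row re-exported in a differently named file is still treated as a duplicate.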

Trigger parsing

Each new ParseMessage has its .parse() method called. This dynamically loads CurantJsonParser via an allowlist-gated importer (GetParser) and hands off to the parser's parse_message() method.
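An allowlist-gated importer can be sketched as follows. This is not the GetParser implementation; get_parser_class and ALLOWED_PARSERS are hypothetical, and a standard-library class stands in for the real parser path so the example runs on its own.

```python
import importlib

# Stand-in allowlist. In the real system this would hold parser class paths.
ALLOWED_PARSERS = frozenset({"json.JSONDecoder"})


def get_parser_class(dotted_path, allowed=ALLOWED_PARSERS):
    """Import a class by dotted path, but only if the path is allowlisted."""
    if dotted_path not in allowed:
        raise ValueError(f"Parser not allowlisted: {dotted_path}")
    module_path, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Checking the allowlist before importing means a corrupted or malicious parser path stored on a message can never trigger an arbitrary import.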

Idempotent by design

Re-uploading the same JSON file to S3 won't create duplicate records. The message_hash on ParseMessage catches exact duplicates at ingestion time, and update_or_create keyed on (emr_medication, shipment_id) catches duplicates at parse time. The pipeline is safe to re-run.

Stage 3: The Parser

This is the core of the pipeline. CurantJsonParser inherits from the same MessageParser base class used by every other parser in the system (HL7 labs, Athena results, etc.), but overrides the methods that matter for pharmacy data: patient matching, prescription matching, and dispense record creation.

Three models come up repeatedly in this section (full schema in Data Models): EmrMedication is a prescription, created upstream when a provider writes an Rx in DoseSpot. EmrMedicationDispensed is a single fill/dispense event, created by this parser. NdcRxnormMapping is a cache that maps NDC drug codes to RxNorm identifiers.

Parser lifecycle

The base class MessageParser.parse_message() orchestrates a fixed sequence. CurantJsonParser plugs into this sequence by implementing each hook:

1. Validate

Check that the message is non-empty and marked as final. Curant messages are always final (there are no partial/preliminary pharmacy results).

2. Receiving Facility

Resolve the receiving facility code "Curalytics" to a ReceivingFacility record and its parent Organization. This connects the data to the right org context.

3. Patient Match

Find the CompanyClient for this patient using name and date of birth. Multi-pass fuzzy matching handles compound last names and variations.

4. Records Request

Link to an existing Visit and RecordsRequest. Prefers the visit associated with the matched prescription. Falls back to the patient's most recent visit.

5. Process Pharmacy

Match the dispensed data to a specific EmrMedication (prescription), then create or update an EmrMedicationDispensed record. Update last_fill_date on the prescription.

Steps 1 and 2 are framework plumbing shared across all parsers. Step 4 links the result to a visit for record-keeping. The custom Curant logic lives in steps 3 (patient matching) and 5 (prescription matching and dispense record creation), detailed below.
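The fixed-sequence-with-hooks structure is a template method pattern. The sketch below shows the shape only: the class names, hook names, and the rule that a failing step stops the sequence are all assumptions, not the actual MessageParser interface.

```python
class MessageParserSketch:
    """Base class: owns the step order; subclasses fill in the hooks."""

    STEPS = (
        "validate",
        "resolve_receiving_facility",
        "match_patient",
        "link_records_request",
        "process_pharmacy",
    )

    def parse_message(self, message):
        context = {"message": message, "completed": []}
        for name in self.STEPS:
            # Assumption: a hook returning False stops the sequence.
            if not getattr(self, name)(context):
                context["stopped_at"] = name
                break
            context["completed"].append(name)
        return context

    def validate(self, context):
        return bool(context["message"])

    def resolve_receiving_facility(self, context):
        context["facility"] = "Curalytics"
        return True

    def match_patient(self, context):
        raise NotImplementedError

    def link_records_request(self, context):
        return True

    def process_pharmacy(self, context):
        raise NotImplementedError


class CurantParserSketch(MessageParserSketch):
    """Overrides only the two pharmacy-specific hooks (steps 3 and 5)."""

    def __init__(self, known_patients):
        self.known_patients = known_patients  # toy lookup: last name -> patient id

    def match_patient(self, context):
        last_name = context["message"]["summary"]["PatientLastName"]
        context["patient"] = self.known_patients.get(last_name)
        return context["patient"] is not None

    def process_pharmacy(self, context):
        context["dispense_written"] = True
        return True
```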

Patient matching

The parser receives a first name, last name, and date of birth from the Curalytics data. It needs to find the corresponding CompanyClient in Healthvana. This is harder than it sounds because names don't always match exactly between systems.

Exact match

First name (case-insensitive, unaccented) + last name (exact) + DOB. Scoped to the HVD company with a valid EMR system identifier. This catches the majority of patients.

Compound last name → single

If the Curalytics last name has spaces (e.g. "Garcia Lopez"), try matching on just the last segment ("Lopez"). Catches cases where Healthvana has only one part of a compound name. Requires the segment to be at least 4 characters.

Single → compound suffix

If the Curalytics last name is a single word, try a suffix match (iendswith) against Healthvana. Catches the reverse: Healthvana has "Garcia Lopez" but Curalytics only has "Lopez". Also requires 4+ characters.

Prescription gate

Even after a name match, the patient must have at least one existing EmrMedication under the HVD company. This prevents false matches against patients who were never prescribed anything through our telehealth service.

At every step, if multiple records match, the parser bails out rather than guessing. Unmatched messages are logged and skipped. There is no manual review queue.
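The passes above can be sketched against an in-memory list of clients. Everything here is illustrative: the dict field names and the has_prescription flag are hypothetical, and the sketch assumes that first name and DOB are required in every pass, that pass 2 compares the segment exactly, and that the prescription gate is applied to the single surviving match.

```python
import unicodedata

MIN_SEGMENT_LENGTH = 4  # from the text: shorter name segments are not tried


def fold(text):
    """Lowercase and strip accents, e.g. 'José' -> 'jose'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).casefold()


def match_patient(first, last, dob, clients):
    """Return the single matching client, or None if unmatched or ambiguous."""
    pool = [c for c in clients if fold(c["first_name"]) == fold(first) and c["dob"] == dob]
    last = last.strip()

    # Pass 1: exact last name.
    passes = [lambda c: c["last_name"] == last]
    if " " in last:
        # Pass 2: compound incoming name, try its last segment only.
        segment = last.split()[-1]
        if len(segment) >= MIN_SEGMENT_LENGTH:
            passes.append(lambda c: c["last_name"] == segment)
    elif len(last) >= MIN_SEGMENT_LENGTH:
        # Pass 3: single incoming name, case-insensitive suffix match.
        passes.append(lambda c: c["last_name"].casefold().endswith(last.casefold()))

    for matches_last_name in passes:
        hits = [c for c in pool if matches_last_name(c)]
        if len(hits) > 1:
            return None  # ambiguous: bail out rather than guess
        if hits:
            # Prescription gate: the match only counts if an Rx exists.
            return hits[0] if hits[0]["has_prescription"] else None
    return None
```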

Prescription matching

Once we have the patient, we need to figure out which of their prescriptions this fill event belongs to. The parser's get_emr_medication() method runs a multi-pass lookup with decreasing strictness.

The search space is: all EmrMedication records for this patient, at the HVD organization, from the Curant pharmacy, with matching quantity. Within that space, the parser tries combinations of two axes:

Date windows

Tight: date_written within ±1 day of the Curalytics DateWritten

Loose: created_at within the last 90 days

Drug identifiers

NDC: Exact 11-digit National Drug Code match

RxNorm: Resolved via NdcRxnormMapping (catches generic substitutions)

The parser tries the tight date window first (with NDC, then RxNorm), then falls back to the loose window (same order). Within each pass, results are ordered so that unfilled prescriptions come first (no last_fill_date), then by newest created_at.
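The pass order and candidate ordering can be sketched with plain dicts. This is a simplified illustration, not get_emr_medication(): the field names mirror the models described in this document, the helper names are hypothetical, and returning the hits of the first non-empty pass is an assumption about how the passes combine.

```python
from datetime import datetime, timedelta

# Tight window first (NDC, then RxNorm), then the loose window in the same order.
PASSES = [("tight", "ndc"), ("tight", "rxnorm_id"), ("loose", "ndc"), ("loose", "rxnorm_id")]


def in_window(rx, window, fill, now):
    if window == "tight":
        return abs(rx["date_written"] - fill["date_written"]) <= timedelta(days=1)
    return now - rx["created_at"] <= timedelta(days=90)


def order_candidates(candidates):
    """Unfilled prescriptions first, then newest created_at."""
    newest_first = sorted(candidates, key=lambda rx: rx["created_at"], reverse=True)
    # sorted() is stable, so the newest-first order survives within each group.
    return sorted(newest_first, key=lambda rx: rx["last_fill_date"] is not None)


def find_candidates(prescriptions, fill, now):
    """Return (pass used, ordered candidates) for the first pass with any hit."""
    for window, id_field in PASSES:
        hits = [
            rx for rx in prescriptions
            if in_window(rx, window, fill, now) and rx[id_field] == fill[id_field]
        ]
        if hits:
            return (window, id_field), order_candidates(hits)
    return None, []
```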

Refill capacity check

The parser doesn't just grab the first matching prescription. It checks whether the prescription still has room for another fill: filled_dispense_count < refills_allowed + 1. A prescription for 30 tablets with 3 refills can have at most 4 dispense events (the initial fill plus 3 refills). If all slots are used, the parser moves on to the next candidate. This prevents stacking fills onto an already-exhausted prescription.
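The capacity rule is a one-line comparison. The sketch below applies it to an ordered candidate list; the function names and dict fields are hypothetical.

```python
def has_fill_capacity(filled_dispense_count, refills_allowed):
    """A prescription allows the initial fill plus refills_allowed refills."""
    return filled_dispense_count < refills_allowed + 1


def first_with_capacity(candidates):
    """Return the first candidate that can still take a fill, or None."""
    for rx in candidates:
        if has_fill_capacity(rx["filled_dispense_count"], rx["refills_allowed"]):
            return rx
    return None


# 3 refills allowed: the 4th fill is accepted, a 5th is not.
print(has_fill_capacity(3, 3))  # True
print(has_fill_capacity(4, 3))  # False
```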

Creating dispense records

With patient and prescription matched, process_pharmacy() does the final write. It extracts fields from all three sections of the joined data:

From summary: NDC, quantity dispensed, days supply, date written
From package: Rx number, drug name, prescription name
From dispensed: shipment ID, carrier, tracking number/URL, ship date, delivery date, fill date

The NDC is resolved to an RxNorm ID through NdcRxnormMapping. If the mapping isn't cached locally, the system calls the NIH RxNav API (rxnav.nlm.nih.gov/REST/ndcstatus.json) and caches the result for future lookups.
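The cache-then-fetch behavior can be sketched with a dict and an injected lookup function. Nothing here calls the real RxNav API; resolve_rxnorm and fake_rxnav are hypothetical, and not caching a failed lookup is an assumption. The NDC and RxNorm values are the ones used in this document's examples.

```python
def resolve_rxnorm(ndc, cache, fetch_remote):
    """Return the RxNorm ID for an NDC, consulting the cache before the remote API."""
    if ndc in cache:
        return cache[ndc]
    rxnorm_id = fetch_remote(ndc)  # stand-in for the RxNav HTTP call
    if rxnorm_id is not None:
        cache[ndc] = rxnorm_id  # assumption: misses are not cached
    return rxnorm_id


calls = []

def fake_rxnav(ndc):
    """Fake remote lookup that records how often it is called."""
    calls.append(ndc)
    return {"61958200201": "1747697"}.get(ndc)


cache = {}
resolve_rxnorm("61958200201", cache, fake_rxnav)
resolve_rxnorm("61958200201", cache, fake_rxnav)
print(len(calls))  # 1: the second lookup is served from the cache
```

Passing the fetcher in as a parameter keeps the caching logic testable without network access.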

The dispense record is written with update_or_create, keyed on (emr_medication, shipment_id). If the same shipment is processed twice, the existing record gets updated rather than duplicated. If either package or dispensed data is missing from the joined row, the parser skips creating an EmrMedicationDispensed record but still updates the prescription's last_fill_date when possible.

That last_fill_date update is important: whenever a fill's date is more recent than the current value, the prescription record is updated. This is the field that status indicators, care plans, and the provider dashboard use to know whether a patient's medication is current.
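The write step combines two rules: upsert on (emr_medication, shipment_id), and move last_fill_date forward only. A minimal sketch with a dict standing in for the EmrMedicationDispensed table follows; record_dispense and its arguments are hypothetical, and a shipment_id of None represents a joined row that is missing package or dispensed data.

```python
from datetime import date


def record_dispense(prescription, dispensed_table, shipment_id, fill_date, fields):
    """Upsert a dispense row and advance last_fill_date. Returns True if a row was created."""
    created = False
    if shipment_id is not None:
        key = (prescription["id"], shipment_id)
        created = key not in dispensed_table
        # Same key twice updates the existing row instead of adding another.
        dispensed_table.setdefault(key, {}).update(fields)

    # last_fill_date only ever moves forward.
    current = prescription["last_fill_date"]
    if fill_date is not None and (current is None or fill_date > current):
        prescription["last_fill_date"] = fill_date
    return created
```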

Data Models

Three models carry the prescription and dispensing data. One existed before this pipeline (EmrMedication); two are new.

EmrMedication

The prescription. Created by the Healthie integration when a provider writes a prescription in DoseSpot.

user: FK → User

organization: FK → Organization

visit: FK → Visit

medication: CharField

ndc: CharField

rxnorm_id: CharField

pharmacy: CharField

quantity: IntegerField

refills_allowed: IntegerField

days_supply: IntegerField

date_written: DateTimeField

last_fill_date: DateField

status: CharField

↓ has many

EmrMedicationDispensed

A single fill/dispense event. Created by this pipeline. Unique on (emr_medication, shipment_id).

emr_medication: FK → EmrMedication

shipment_id: CharField (unique together with emr_medication)

fill_number: PositiveIntegerField

ndc_dispensed: CharField

rxnorm_id: CharField

quantity_dispensed: PositiveIntegerField

days_supply_dispensed: PositiveIntegerField

carrier: CharField

tracking_number: CharField

tracking_url: URLField

pharmacy_date_written: DateTimeField

fill_datetime: DateTimeField

dispensed_datetime: DateTimeField

ship_datetime: DateTimeField

delivery_datetime: DateTimeField

pharmacy_rx_number: CharField

pharmacy_drug_name: CharField

status: CharField (filled/partial/cancelled/returned)

NdcRxnormMapping

Cache table for NDC → RxNorm lookups. Populated automatically on cache miss via the NIH RxNav API.

ndc: CharField (unique)

rxnorm_id: CharField

medication_name: CharField

Ownership boundary

EmrMedication records are created upstream by the Healthie integration when a provider writes a prescription in DoseSpot. This pipeline never creates EmrMedication records. It only reads them for matching, then writes EmrMedicationDispensed records and updates the last_fill_date field. The prescribing system remains the source of truth for prescriptions; this pipeline adds the dispensing layer on top.

The Provider API

Prescriptions and their dispensing history surface through a single read-only endpoint:

GET /api/v2/medication/prescriptions/<user_id>/

The response returns EmrMedication records ordered by most recent first (-date_written, then -created_at), each with a nested dispensed_events array containing every EmrMedicationDispensed record for that prescription.

Example API response

{
  "items": [
    {
      "id": 1042,
      "user": 5678,
      "organization": 12,
      "medication": "Descovy 200-25 Mg Tablet",
      "ndc": "61958200201",
      "rxnorm_id": "1747697",
      "quantity": 30,
      "days_supply": 30,
      "refills_allowed": 3,
      "pharmacy": "Curant Health Florida, LLC DS ID 8589",
      "prescribed_by": "Dr. Smith",
      "date_written": "2026-01-05T10:00:00Z",
      "last_fill_date": "2026-02-10",
      "status": "active",
      "visit_id": 9001,
      "dispensed_events": [
        {
          "id": 201,
          "shipment_id": "SHP-445566",
          "fill_number": 0,
          "quantity_dispensed": 30,
          "days_supply_dispensed": 30,
          "carrier": "UPS",
          "tracking_number": "1Z999AA10123456784",
          "fill_datetime": "2026-01-10T14:30:00Z",
          "dispensed_datetime": "2026-01-10T14:30:00Z",
          "ship_datetime": "2026-01-10T16:00:00Z",
          "delivery_datetime": "2026-01-13T11:20:00Z",
          "status": "filled"
        },
        {
          "id": 202,
          "shipment_id": "SHP-556677",
          "fill_number": 1,
          "quantity_dispensed": 30,
          "days_supply_dispensed": 30,
          "carrier": "UPS",
          "tracking_number": "1Z999AA10987654321",
          "fill_datetime": "2026-02-10T10:00:00Z",
          "dispensed_datetime": "2026-02-10T10:00:00Z",
          "ship_datetime": "2026-02-10T15:00:00Z",
          "delivery_datetime": "2026-02-13T09:45:00Z",
          "status": "filled"
        }
      ]
    }
  ]
}

The endpoint uses standard pagination (ProviderAppAntPagination) and requires session authentication.
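A consumer of this response might reduce each prescription to a short fill summary. The sketch below works on one already-decoded item from the "items" array; summarize_prescription is hypothetical, and counting only events whose status is "filled" toward used fills is an assumption.

```python
def summarize_prescription(item):
    """Reduce one prescription item to fills used, fills remaining, and latest tracking."""
    filled = [e for e in item["dispensed_events"] if e["status"] == "filled"]
    latest = max(filled, key=lambda e: e["fill_number"], default=None)
    # Initial fill plus refills_allowed refills, as in the refill capacity check.
    total_fills = item["refills_allowed"] + 1
    return {
        "medication": item["medication"],
        "fills_used": len(filled),
        "fills_remaining": max(total_fills - len(filled), 0),
        "latest_tracking_number": latest["tracking_number"] if latest else None,
    }
```

Applied to the example response above, this yields 2 fills used, 2 remaining, and the tracking number of the second shipment.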

Why this API matters

Without this pipeline, Healthvana would have prescriptions but no structured dispense timeline from Curant. This endpoint exposes both layers together: the prescription itself and its concrete fill/shipping events.

End-to-End Pipeline

A single joined record flows through the full pipeline, from the Curalytics portal to the database:

🌐 Curalytics → 📥 Download → 📦 S3 → ⏱ Celery → ⚙️ Parser → 🗃 Database