Ratings Quality Assurance

Increase the breadth and depth of clinical assessment quality reviews in clinical trials. We provide fast, reliable quality indicators for 100% of administered clinical assessments, including reviews of administration and scoring.

Some of the most costly errors in clinical trials come from poorly administered and scored clinical interviews. Winterlight’s speech and language analysis platform quickly identifies potential quality issues to allow for faster site and rater remediation.

Our QA pipeline reviews and flags poorly administered or scored clinical interviews across a variety of cognitive tests and clinician-reported outcomes.

We also detect scoring errors through automated analysis of tasks such as word recall, serial sevens subtraction, and category fluency, as well as more complex items like word finding difficulty.

Product Offerings:

100% Assessment Review

We can run quality reviews on 100% of the clinical assessments conducted in a trial, including:

  • ADAS-Cog
  • CDR
  • MMSE
  • MoCA
  • HAM-D
  • And more

Extract quality metrics from rater and participant speech

We automatically extract scores and quality indicators from recordings, such as:

  • Rater’s adherence to protocol
  • Rater identity across visits
  • Assessment scores
  • Pace of rater speech
  • Speaking latency
  • Amount of rater and participant speech
  • And more

Expedite manual expert review

We can flag whole assessments that require expert review.

We can point reviewers to the subsection of the assessment requiring review.

We provide reviewers with assessment transcripts, making reviews faster and more thorough.

Audio QA

Automated analysis of rater and participant speech, including rater pacing, response latency, amount of speech, rater identity across visits, and more.

Content QA

Review of clinical content to identify administration issues, including adherence to clinical standards, clinician interruptions, or skipped segments.

Clinical Scoring

Automated clinical scores based on machine learning models, including word recall scores, word finding difficulty, and spoken language ability.

How we use speech metrics

Audio QA

  • Speech and language metrics, including the rater’s speech rate, response latency, amount of speech, the ratio of rater to participant speech, and more, can all tell us about the quality of assessment administration
  • Speaker identification. We use novel voice analysis algorithms to compare the identity of speakers across two audio files. With this state-of-the-art methodology, we can analyze clinical recordings to detect whether the rater changed across sessions, or to identify duplicate participant enrolment in a clinical trial
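The speaker-identification idea above can be sketched as follows: compare voice embeddings from two recordings by cosine similarity and flag low-similarity pairs for review. This is a minimal, hypothetical illustration; the toy embedding vectors and the 0.75 threshold are invented for the example and do not reflect Winterlight's actual models.

```python
# Hypothetical sketch: flagging a possible rater change by comparing voice
# embeddings from two visits. Embedding values and the threshold are
# illustrative stand-ins, not Winterlight's actual algorithm.
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def same_speaker(emb_visit_1, emb_visit_2, threshold=0.75):
    """Return True when the two embeddings are similar enough to be treated
    as the same speaker. A real system would calibrate the threshold on
    labeled same/different-speaker pairs."""
    return cosine_similarity(emb_visit_1, emb_visit_2) >= threshold

# Toy embeddings for illustration
visit_1 = [0.9, 0.1, 0.4]
visit_2 = [0.88, 0.12, 0.41]   # close to visit_1 -> likely same rater
visit_3 = [0.1, 0.9, -0.3]     # far from visit_1 -> flag for expert review
print(same_speaker(visit_1, visit_2))  # True
print(same_speaker(visit_1, visit_3))  # False
```

The same comparison, run between participants at different sites, is what makes duplicate-enrolment detection possible.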

Content QA

  • Adherence to clinical standards. Our transcriptionists ensure that raters adhere to the clinical standards of administration for the assessment and flag any deviations. This includes ensuring the rater:
    • Reads the script verbatim when appropriate
    • Provides the appropriate number of follow-up prompts
    • Avoids inappropriate interruptions
    • Follows up with questions appropriately

Clinical Scoring

  • Objective scoring of listing tasks. This includes scores for word recall, fluency tasks, serial sevens, and other tasks that involve counting and categorizing a participant’s responses.
  • Enhanced scoring of existing tasks. Enhanced scoring refers to additional metrics we provide, such as participant latency to respond, listing repetitions, recall order and grouping, and so on.
  • Scoring of complex subjective assessments. We can measure elements of speech such as clarity, pauses, stutters, and hesitations, which contribute to complex scores like Word Finding Difficulty or Spoken Language Ability on the ADAS-Cog.
  • Novel ML models to predict disease status. One of our goals is to predict disease status for a range of CNS diseases, including clinical outcome scores, using speech and language.
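As a rough illustration of the objective scoring of listing tasks described above, the sketch below counts target words in a transcript and checks a serial sevens sequence. The word list, transcript, and scoring rules are simplified stand-ins for the example, not the actual ADAS-Cog protocol.

```python
# Illustrative sketch of objective scoring for listing tasks: count how many
# target words appear in an ASR transcript of a recall trial, and score a
# serial sevens sequence. Words and rules here are simplified examples.

def score_word_recall(target_words, transcript):
    """Number of unique target words the participant produced."""
    spoken = set(transcript.lower().split())
    return sum(1 for w in target_words if w.lower() in spoken)

def score_serial_sevens(responses, start=100, steps=5):
    """Count correct subtractions. Each response is scored against the
    participant's previous response, so one slip does not fail the
    remaining items."""
    correct = 0
    previous = start
    for r in responses[:steps]:
        if r == previous - 7:
            correct += 1
        previous = r
    return correct

print(score_word_recall(["butter", "arm", "shore"], "um butter and the shore"))  # 2
print(score_serial_sevens([93, 86, 80, 73, 66]))  # 4 (one slip at 80)
```

Counting against the previous response rather than the ideal sequence is one common clinical convention; a production scorer would also need fuzzy matching to tolerate ASR transcription variants.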


CTAD 2022 Late-Breaking Abstract

Accuracy of automated scoring of word recall assessments

Natural language processing tools can be used to automate and standardize the scoring of clinical assessments. Many cognitive assessments used as endpoints in Alzheimer’s disease (AD) trials require manual scoring and review, which can be costly and time-consuming. Developments in natural language processing can be leveraged to build automated, objective tools that generate text transcripts and produce scores for cognitive assessments. As a proof of concept, we tested an automated method to score the word recall portion of the ADAS-Cog, a standard endpoint in AD research. We found that preconfigured automated systems approached human accuracy, although they still tended to underestimate scores due to transcription errors. Future work to refine the use of ASR for evaluating clinical endpoints includes optimizing ASR accuracy by filtering noise before processing samples, further customizing language models to suit the datasets at hand, and exploring the use of ASR in other elements of cognitive assessments to provide more efficient and scalable scoring methods.
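A minimal sketch of the kind of comparison the abstract describes: measuring the signed difference between automated and manual scores, where a negative bias corresponds to the underestimation attributed to transcription errors. All score values below are invented for illustration and are not study data.

```python
# Sketch of comparing automated word-recall scores against manual scores.
# A negative mean signed error means the automated system underestimates,
# e.g. when ASR drops correctly recalled words. Values are invented.

def score_bias(manual, automated):
    """Mean signed difference (automated - manual) across assessments."""
    diffs = [a - m for m, a in zip(manual, automated)]
    return sum(diffs) / len(diffs)

manual_scores = [7, 5, 9, 6]
auto_scores = [7, 4, 8, 6]     # two words lost to transcription errors
print(score_bias(manual_scores, auto_scores))  # -0.5
```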

Get in touch

Registered Company Address

Winterlight Labs
100 King Street West
1 First Canadian Place
Suite 6200, P.O. Box 50
Toronto ON M5X 1B8

Send us a message