Overview

Some of the most costly errors in clinical trials come from poorly administered and scored clinical interviews. Winterlight’s speech and language analysis platform quickly identifies potential quality issues to allow for faster site and rater remediation.

Our QA pipeline reviews and flags poorly administered or scored clinical interviews across a variety of cognitive tests and clinician-reported outcomes.

We also detect scoring errors through automated analysis of tasks such as word recall, serial sevens subtraction, and category fluency, as well as more complex items like word finding difficulty.

Product Offerings:

100% Assessment Review

We can run quality reviews on 100% of the clinical assessments conducted in a trial. We can provide reviews for the following assessments:

ADAS-Cog
CDR
MMSE
MoCA
PANSS
HAM-D
And more

Extract quality metrics from rater and participant speech

We automatically extract scores and quality indicators from recordings, such as:

Rater’s adherence to protocol
Rater identity across visits
Assessment scores
Pace of rater speech
Speaking latency
Amount of rater and participant speech
And more

Expedite manual expert review

We can flag whole assessments that require expert review.

We can point reviewers to the subsection of the assessment requiring review.

We provide reviewers with assessment transcripts, making reviews faster and more thorough.

Audio QA

Automated analysis of rater and participant speech, including the rater’s pacing, latency to respond, amount of speech, rater identity across visits, and more.

Content QA

Review of clinical content to identify administration issues, including lapses in adherence to clinical standards, clinician interruptions, or skipped segments.

Clinical Scoring

Automated clinical scores based on machine learning models, including word recall scores, word finding difficulty, and spoken language ability.

How we use speech metrics

Audio QA

Speech and language metrics, including the rater’s speech rate, latency to respond, amount of speech, the ratio of speech between the rater and the participant, and more, can all tell us about the quality of assessment administration.
Speaker identification. We use novel voice analysis algorithms to compare the identity of speakers across two audio files. With this state-of-the-art methodology, we can analyze clinical recordings to identify whether the rater changed across sessions, or to identify duplicate participant enrolment in a clinical trial.
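As a rough illustration of the kind of audio metrics described above, the sketch below computes rater speech rate, mean participant response latency, and the rater-to-participant speech ratio. It assumes the recording has already been split into speaker-labelled segments with timestamps and word counts. The `Segment` type, the speaker labels, and the metric names are hypothetical and chosen for this example only; they are not Winterlight's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One uninterrupted stretch of speech (hypothetical structure)."""
    speaker: str   # assumed labels: "rater" or "participant"
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    n_words: int   # number of transcribed words in the segment


def audio_qa_metrics(segments):
    """Compute simple administration-quality indicators from segments."""
    segments = sorted(segments, key=lambda s: s.start)

    talk_time = {"rater": 0.0, "participant": 0.0}
    words = {"rater": 0, "participant": 0}
    for seg in segments:
        talk_time[seg.speaker] += seg.end - seg.start
        words[seg.speaker] += seg.n_words

    # Latency: gap between the end of a rater turn and the start of
    # the participant turn that immediately follows it.
    latencies = [
        max(0.0, nxt.start - prev.end)
        for prev, nxt in zip(segments, segments[1:])
        if prev.speaker == "rater" and nxt.speaker == "participant"
    ]

    return {
        "rater_words_per_min": (
            60.0 * words["rater"] / talk_time["rater"]
            if talk_time["rater"] else None
        ),
        "mean_response_latency_s": (
            sum(latencies) / len(latencies) if latencies else None
        ),
        "rater_to_participant_speech_ratio": (
            talk_time["rater"] / talk_time["participant"]
            if talk_time["participant"] else None
        ),
    }


example = [
    Segment("rater", 0.0, 10.0, 30),
    Segment("participant", 12.0, 20.0, 10),
    Segment("rater", 21.0, 26.0, 15),
    Segment("participant", 27.0, 31.0, 5),
]
metrics = audio_qa_metrics(example)
```

In practice, values like these would be compared against study-specific expectations (for example, an unusually fast rater or very short latencies) before anything is flagged; any such thresholds are outside the scope of this sketch.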

Content QA

Adherence to clinical standards. Our transcriptionists check that raters adhere to the clinical standards of administration for the assessment and flag instances of deviation. This includes checking that the rater:

Reads the script verbatim when appropriate
Provides an appropriate number of follow-up prompts
Makes no inappropriate interruptions
Follows up with questions appropriately
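The verbatim-script check in the list above is described as a transcriptionist review. Purely as an illustration of how a transcript could be pre-screened before that review, the sketch below compares what the rater said against the expected script at the word level, using Python's standard `difflib`. The function names, the example script text, and the 0.9 threshold are assumptions made for this example, not part of any assessment manual or of Winterlight's pipeline.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def verbatim_similarity(script, spoken):
    """Return a 0..1 word-level similarity between the script and the transcript."""
    matcher = SequenceMatcher(None, normalize(script), normalize(spoken),
                              autojunk=False)
    return matcher.ratio()


def flag_deviation(script, spoken, threshold=0.9):
    """Flag the segment for human review if similarity falls below the threshold.

    The threshold is an illustrative assumption, not a validated cut-off.
    """
    return verbatim_similarity(script, spoken) < threshold


# Illustrative script text only; not taken from a real assessment.
script = "Please read each word aloud and try to remember it."
read_verbatim = "please read each word aloud, and try to remember it"
paraphrased = "Okay, so just say these out loud and remember them."

verbatim_flagged = flag_deviation(script, read_verbatim)
paraphrase_flagged = flag_deviation(script, paraphrased)
```

A similarity score like this can only point a reviewer to a segment; judging whether a deviation was appropriate still requires the human review described above.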

Clinical Scoring

Objective scoring of listing tasks. This includes scores for word recall, fluency tasks, serial sevens, and other tasks that involve counting and categorizing a participant’s responses.
Enhanced scoring of existing tasks. Enhanced scoring refers to additional metrics we provide, such as participant latency to respond, listing repetitions, recall order and grouping, and so on.
Scoring of complex subjective assessments. We can measure elements of speech such as clarity, pauses, stutters, and hesitations, which contribute to complex scores like Word Finding Difficulty or Spoken Language Ability on the ADAS-Cog.
Novel ML models to predict disease status. One of our goals is to use speech and language to predict disease status, including clinical outcome scores, for a range of CNS diseases.
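To make the listing-task scoring above more concrete, here is a minimal sketch of how a word recall trial and a serial sevens item might be scored from an already transcribed list of responses. The function names, the example word list, and the scoring conventions are assumptions for illustration: repetitions and intrusions are reported separately, and each serial sevens answer is judged against the previous answer. Actual scoring rules depend on the assessment manual in use.

```python
def score_word_recall(targets, responses):
    """Score one recall trial from transcribed responses.

    Returns the number of unique target words recalled, plus the
    "enhanced" details: recall order, repetitions, and intrusions.
    """
    target_set = {t.lower() for t in targets}
    seen = set()
    recalled, repetitions, intrusions = [], [], []
    for word in (r.lower() for r in responses):
        if word in seen:
            # Any word said more than once counts as a repetition here.
            repetitions.append(word)
        elif word in target_set:
            recalled.append(word)
        else:
            # A word that was not on the target list.
            intrusions.append(word)
        seen.add(word)
    return {
        "score": len(recalled),
        "recall_order": recalled,
        "repetitions": repetitions,
        "intrusions": intrusions,
    }


def score_serial_sevens(responses, start=100, step=7):
    """Count correct subtractions, judging each answer against the previous one.

    Under this assumed convention, one arithmetic slip costs a single
    point rather than invalidating every later answer.
    """
    correct = 0
    previous = start
    for answer in responses:
        if answer == previous - step:
            correct += 1
        previous = answer
    return correct


# Hypothetical word list and responses, for illustration only.
recall = score_word_recall(
    targets=["apple", "chair", "river", "pencil"],
    responses=["river", "apple", "river", "table", "Chair"],
)
sevens = score_serial_sevens([93, 86, 80, 73, 66])
```

With these inputs, `recall["score"]` is 3, with one repetition ("river") and one intrusion ("table"), and `sevens` is 4 because only the third answer is off by one from the previous answer minus seven.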

Announcements

CTAD 2022 late-breaking abstract

Accuracy of automated scoring of word recall assessments

Natural language processing tools can be used to automate and standardize the scoring of clinical assessments. Many cognitive assessments used as endpoints in Alzheimer’s disease (AD) trials require manual scoring and review, which can be costly and time-consuming. Developments in natural language processing technology, including automatic speech recognition (ASR), can be leveraged to develop automated and objective tools that generate text transcripts and produce scores for cognitive assessments. As a proof of concept, we tested an automated method to score the word recall portion of the ADAS-Cog, a standard endpoint in AD research. In this study, we found that preconfigured automated systems approached human accuracy, although they still tended to underestimate scores due to transcription errors. Future work to refine the use of ASR to evaluate clinical endpoints includes optimizing ASR accuracy by filtering noise before processing samples and further customizing language models to suit the datasets at hand, as well as exploring the use of ASR in other elements of cognitive assessments, to provide more efficient and scalable scoring methods.
