Transcript files don’t follow a single global standard. In fact, they can vary significantly depending on:
The transcription tool used
Whether the transcript was created manually or automatically
The formatting preferences of researchers or agencies
Industry-specific standards and legacy systems
As a result, transcripts may look very different from one another.
Let’s walk you through the formats that are supported and how DoReveal works with them.
At its core, DoReveal focuses on conversations between speakers. To process any transcript effectively, it looks for:
Who is speaking
Where speaker changes occur
What each speaker is saying
Speaker labels can take many forms, such as:
Moderator / Participant
Speaker 1 / Speaker 2
Names like John / Sarah
Even generic labels or partial identifiers
As long as speaker separation can be inferred, DoReveal can structure the conversation for analysis.
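The three things listed above (who is speaking, where the speaker changes, and what was said) can be illustrated with a small sketch. This is not DoReveal's actual implementation. It is a minimal example that assumes each speaker change starts a new line with a short label followed by ":". The names `Turn`, `LABEL_RE`, and `parse_turns` are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed label shape: starts with a letter, at most ~30 characters, then ":".
# Lines starting with a digit (for example "10:30") are not treated as labels.
LABEL_RE = re.compile(r"^(?P<speaker>[A-Za-z][\w .'-]{0,30}?)\s*:\s*(?P<text>.*)$")


@dataclass
class Turn:
    speaker: str  # who is speaking
    text: str     # what the speaker is saying


def parse_turns(transcript: str) -> list[Turn]:
    """Group transcript lines into speaker turns."""
    turns: list[Turn] = []
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LABEL_RE.match(line)
        if match:
            # A new label marks a speaker change.
            turns.append(Turn(match["speaker"], match["text"]))
        elif turns:
            # No label: treat the line as a continuation of the current turn.
            turns[-1].text = (turns[-1].text + " " + line).strip()
        # Unlabeled lines before the first label are ignored.
    return turns


if __name__ == "__main__":
    sample = "Moderator: Welcome.\nHow are you?\n\nSpeaker 1: Fine, thanks."
    for turn in parse_turns(sample):
        print(f"{turn.speaker} -> {turn.text}")
```

A known limitation of this sketch: a continuation line that itself begins with a short phrase and a colon would be mistaken for a new speaker.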
Standard Transcript Files
These can be Word documents or .txt files. DoReveal looks for speaker labels and the spoken text that follows them. Additional information, such as section labels or metadata at the top of the file, could reduce accuracy. Simple formats like the following work best:
Example 1: Speaker names, followed by what they said:
Example 2: A variation of the first example, with an additional separating character such as ":" after the speaker names
Example 3: Generic speaker labels instead of actual names
Handling Non-Standard Formats
Some transcripts don’t follow structured speaker labeling. DoReveal provides utilities that convert two common non-standard formats into a usable format. These are available through the transcript format tools on the DoReveal website.
Supported Transcript Utilities
DoReveal provides tools to help standardize two common non-standard formats:
1. Font-Based Transcript Format
Some transcripts do not explicitly label speakers. Instead, formatting is used to distinguish them. For example:
Moderator text may appear in bold
Participant text may appear in regular font
DoReveal can interpret this structure and convert it into clear speaker labels.
This format is most useful for one-on-one interviews (IDIs).
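As a rough illustration of the idea, the sketch below maps bold text to one speaker and regular text to another. It is not DoReveal's converter. It assumes the document has already been read into a flat list of text runs, each with a bold flag (a library such as python-docx exposes `run.text` and `run.bold` for Word files, where an unset `bold` value would be treated as regular here). The names `Run` and `label_by_font`, and the default labels, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    text: str
    bold: bool  # True for bold text, False (or unset) for regular text


def label_by_font(
    runs: list[Run],
    bold_speaker: str = "Moderator",
    regular_speaker: str = "Participant",
) -> list[tuple[str, str]]:
    """Convert font-formatted runs into (speaker, text) turns."""
    turns: list[tuple[str, str]] = []
    for run in runs:
        text = run.text.strip()
        if not text:
            continue  # skip whitespace-only runs
        speaker = bold_speaker if run.bold else regular_speaker
        if turns and turns[-1][0] == speaker:
            # Same formatting as the previous run: extend the current turn.
            turns[-1] = (speaker, turns[-1][1] + " " + text)
        else:
            # Formatting changed: a new speaker turn begins.
            turns.append((speaker, text))
    return turns


if __name__ == "__main__":
    runs = [
        Run("What did you think", True),
        Run(" of the app?", True),
        Run("It was easy to use.", False),
    ]
    for speaker, text in label_by_font(runs):
        print(f"{speaker}: {text}")
```

Because only two font styles are distinguished, this approach can separate only two speakers, which fits the one-on-one interview case.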
2. VTT-Like Structured Format
Another common format includes structured blocks such as:
Sequential IDs (1, 2, 3…)
Timestamp ranges
Speaker roles and names (or partial labels)
For example:
Moderator: Henry – spoken text
Respondent: John – spoken text
Or simply:
Henry – spoken text
John – spoken text
DoReveal can process these variations and normalize them into structured conversations.
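The block structure described above can be sketched as follows. This is an illustrative parser, not DoReveal's own. It assumes blocks are separated by blank lines, that timestamp ranges use a WebVTT-style "-->" arrow, and that the speaker label is separated from the spoken text by a dash with spaces on both sides. The names `parse_speaker_line` and `parse_blocks` are hypothetical.

```python
import re

# Assumed separator between label and text: an en dash or hyphen with spaces around it.
DASH_RE = re.compile(r"\s+[–-]\s+")


def parse_speaker_line(line: str):
    """Split 'Role: Name – text' or 'Name – text' into its parts."""
    parts = DASH_RE.split(line, maxsplit=1)
    if len(parts) != 2:
        return None  # no recognizable label/text separator
    label, text = parts
    role, _, name = label.rpartition(":")  # role is "" when no ":" is present
    return {"role": role.strip() or None, "name": name.strip(), "text": text.strip()}


def parse_blocks(transcript: str) -> list[dict]:
    """Parse blank-line-separated blocks into normalized cue dictionaries."""
    cues = []
    for block in re.split(r"\n\s*\n", transcript.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        cue = {"id": None, "time": None}
        if lines and lines[0].isdigit():
            cue["id"] = int(lines.pop(0))   # sequential ID
        if lines and "-->" in lines[0]:
            cue["time"] = lines.pop(0)      # timestamp range, kept as text
        if not lines:
            continue
        spoken = parse_speaker_line(" ".join(lines))
        if spoken is None:
            continue  # skip blocks without a speaker label
        cue.update(spoken)
        cues.append(cue)
    return cues


if __name__ == "__main__":
    sample = (
        "1\n00:00:01.000 --> 00:00:04.000\nModerator: Henry – How was the onboarding?\n\n"
        "2\n00:00:05.000 --> 00:00:09.000\nJohn – It was quick."
    )
    for cue in parse_blocks(sample):
        print(cue)
```

Both label variants end up in the same shape, with `role` left empty when only a name is given, so later steps can treat the two forms uniformly.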
If you need support for an additional format, please contact us at support@doreveal.com.