AI Performance Report

How our AI tools performed

A simple look at how fast and reliably Validra's AI tools respond when many people use them at the same time

What this report shows

This is a health-check of Validra's AI tools. We simulated several people using each tool at the same time (just like a busy day) and measured two simple things: did it work, and how long did people wait.
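
The kind of test described above can be sketched in a few lines. This is a minimal illustration only, not the harness used for this report: `call_tool` is a hypothetical stand-in that simply sleeps, and the number of simultaneous users is an assumed parameter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_tool(user_id: int) -> None:
    """Hypothetical stand-in for one request to a tool; replace with a real call."""
    time.sleep(0.05)  # pretend the tool takes about 50 ms to answer


def timed_call(user_id: int) -> tuple[bool, float]:
    """Run one request and record (did it work, how long the user waited)."""
    start = time.perf_counter()
    try:
        call_tool(user_id)
        ok = True
    except Exception:
        ok = False
    return ok, time.perf_counter() - start


def run_load_test(concurrent_users: int) -> dict:
    """Send all requests at the same time and summarise the two measurements."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        results = list(pool.map(timed_call, range(concurrent_users)))
    waits = [wait for _, wait in results]
    passed = sum(1 for ok, _ in results if ok)
    return {
        "requests": len(results),
        "success_rate": 100.0 * passed / len(results),
        "avg_wait_s": sum(waits) / len(waits),
    }


if __name__ == "__main__":
    print(run_load_test(concurrent_users=5))
```

A thread pool is used here because each simulated user mostly waits for a response, which is the situation threads handle well.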

Each section below covers one tool. Green numbers are good. A short note explains what the tool does and which numbers matter most. New to the terms? See “How to read the numbers” at the bottom of the page.

At a glance

All tools compared across every measure in one graph. Each group of bars is one measure; each colour is a tool. Bar heights are scaled within each measure (units differ) and the exact value is shown on top.

Measure  |  Report Master  |  AI Editor  |  AI Chunking
Response time  |  1m 29s  |  18s  |  10s
Total wait  |  1m 37s  |  2m 42s  |  29s
Tokens used  |  42.5k  |  1.9k  |  16.0k
Reliability  |  100%  |  100%  |  100%
AI speed  |  435  |  152  |  587
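
Because the units differ between measures, bar heights are only comparable inside one measure. The sketch below shows one way such scaling could work, assuming each bar is drawn relative to the largest value in its own measure; the report does not state the exact rule, and the function name is hypothetical. Times are converted to seconds and token counts to plain numbers.

```python
# Values copied from the chart above (times in seconds, tokens as counts).
measures = {
    "Response time (s)": {"Report Master": 89, "AI Editor": 18, "AI Chunking": 10},
    "Total wait (s)": {"Report Master": 97, "AI Editor": 162, "AI Chunking": 29},
    "Tokens used": {"Report Master": 42500, "AI Editor": 1900, "AI Chunking": 16000},
    "Reliability (%)": {"Report Master": 100, "AI Editor": 100, "AI Chunking": 100},
    "AI speed": {"Report Master": 435, "AI Editor": 152, "AI Chunking": 587},
}


def scale_within_measure(values: dict[str, float]) -> dict[str, float]:
    """Scale one measure so its largest value becomes 1.0 (a full-height bar)."""
    top = max(values.values())
    if top == 0:
        return {name: 0.0 for name in values}
    return {name: value / top for name, value in values.items()}


for measure, values in measures.items():
    scaled = scale_within_measure(values)
    print(measure, {name: round(height, 2) for name, height in scaled.items()})
```

Under this assumption the tallest bar in every group has the same height, which is why the exact value printed on top of each bar matters more than the bar itself.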

Report Master — Automatic Report Writing

What it does: turns your study data into a complete, written report — automatically. What to look for: Success Rate shows how often a report was produced without errors, and the Avg times show the typical wait to generate one.

Report Master mixed  |  1 request(s)  |  2026-06-02T06:02:01+00:00  |  model: qwen2.5:14b
Model: qwen2.5:14b
Requests: 1
Completed: 1
Failed: 0
Success Rate: 100.0%
Avg Client: 1m 37.1s
Avg Server: 1m 29.4s
Avg Tokens/sec: 580.1
Total Tokens: 42,475
LLM Calls: 26

Client-Side Latency

Generate click → Download ready

Min: 1m 37.1s  |  p50: 1m 37.1s  |  p75: 1m 37.1s  |  p85: 1m 37.1s  |  p90: 1m 37.1s  |  p95: 1m 37.1s  |  p99: 1m 37.1s  |  Max: 1m 37.1s
Mean: 1m 37.1s  |  Std Dev: 0.0s

Server-Side Duration

From API response

Min: 1m 29.4s  |  p50: 1m 29.4s  |  p75: 1m 29.4s  |  p85: 1m 29.4s  |  p90: 1m 29.4s  |  p95: 1m 29.4s  |  p99: 1m 29.4s  |  Max: 1m 29.4s
Mean: 1m 29.4s  |  Std Dev: 0.0s

Tokens/sec (higher = faster)

Min: 580.1  |  p50: 580.1  |  p75: 580.1  |  p85: 580.1  |  p90: 580.1  |  p95: 580.1  |  p99: 580.1  |  Max: 580.1

Per-Request Breakdown

#  |  Status  |  Client  |  Server  |  Tokens  |  LLM Calls  |  Tok/s  |  Job ID
1  |  completed  |  1m 37.1s  |  1m 29.4s  |  42,475  |  26  |  580.1  |  24612c9b-4cba-424d-b293-4d0136730e0e
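
The gap between the Client and Server columns is the part of the wait spent outside the server-side job. A small sketch of that arithmetic follows; `parse_duration` is a hypothetical helper written for the "1m 37.1s" style used in this report, and reading the gap as network and browser overhead is an assumption.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a duration such as '1m 37.1s' or '18s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m\s*)?(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2))
    return minutes * 60 + seconds


client = parse_duration("1m 37.1s")  # Generate click -> Download ready
server = parse_duration("1m 29.4s")  # duration reported by the API
overhead = client - server
print(f"time outside the server job: {overhead:.1f}s")  # 7.7s
```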

AI Editor — Edit & Generate Text

What it does: edits protocol text from plain instructions (for example, “change the wording to past tense”) and can generate new content. What to look for: the typical time each edit and each generation took, and how many succeeded.

AI Editor mixed  |  2 test(s)  |  http://100.107.115.34  |  2026-06-02T06:02:01+00:00  |  model: phi3 (vision: qwen2.5vl:7b)
Model: phi3 (vision: qwen2.5vl:7b)
Tests: 2
Edit Passed: 2/2
Generate Passed: 2/2
Avg Edit LLM: 17.4s
Avg Generate LLM: 18.1s
Avg Total Time: 2m 41.9s
Avg Edit Tok/s: 229.8
Avg Generate Tok/s: 203.1
Total Tokens: 1,879

Edit LLM Time

"change tense to past"

Min: 16.9s  |  p50: 16.9s  |  p75: 17.8s  |  p85: 17.8s  |  p90: 17.8s  |  p95: 17.8s  |  p99: 17.8s  |  Max: 17.8s
Mean: 17.4s  |  Std Dev: 0.5s

Generate LLM Time

"add table same as in image"

Min: 18.1s  |  p50: 18.1s  |  p75: 18.1s  |  p85: 18.1s  |  p90: 18.1s  |  p95: 18.1s  |  p99: 18.1s  |  Max: 18.1s
Mean: 18.1s  |  Std Dev: 0.0s

Edit Tokens/sec

higher = faster

Min: 203.0  |  p50: 203.0  |  p75: 256.7  |  p85: 256.7  |  p90: 256.7  |  p95: 256.7  |  p99: 256.7  |  Max: 256.7
Mean: 229.8  |  Std Dev: 26.8

Generate Tokens/sec

higher = faster

Min: 202.9  |  p50: 202.9  |  p75: 203.3  |  p85: 203.3  |  p90: 203.3  |  p95: 203.3  |  p99: 203.3  |  Max: 203.3
Mean: 203.1  |  Std Dev: 0.2

Per-Test Breakdown

#  |  Setup  |  Edit  |  Edit LLM  |  Edit Tokens  |  Edit Tok/s  |  Generate  |  Gen LLM  |  Gen Tokens  |  Gen Tok/s  |  Total  |  Error
1  |  2m 1.7s  |  passed  |  17.8s  |  493  |  203.0  |  passed  |  18.1s  |  498  |  202.9  |  2m 37.6s  |  -
2  |  2m 11.2s  |  passed  |  16.9s  |  390  |  256.7  |  passed  |  18.1s  |  498  |  203.3  |  2m 46.2s  |  -
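
In both rows the Total column equals Setup plus Edit LLM plus Gen LLM, so the breakdown can be checked with simple arithmetic. The sketch below does that check and shows how much of the total is setup; the dictionary keys are illustrative names, and all times are converted to seconds.

```python
# Per-test times from the breakdown above, in seconds.
rows = [
    {"setup": 121.7, "edit_llm": 17.8, "gen_llm": 18.1, "total": 157.6},
    {"setup": 131.2, "edit_llm": 16.9, "gen_llm": 18.1, "total": 166.2},
]


def step_sum(row: dict) -> float:
    """Add up the three timed steps of one test."""
    return row["setup"] + row["edit_llm"] + row["gen_llm"]


def format_duration(seconds: float) -> str:
    """Render seconds in the report's 'Xm Y.Ys' style."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}m {rest:.1f}s" if minutes else f"{rest:.1f}s"


# Each reported total should match the sum of its steps (allowing for rounding).
for row in rows:
    assert abs(step_sum(row) - row["total"]) < 0.05

avg_total = sum(row["total"] for row in rows) / len(rows)
setup_share = sum(row["setup"] for row in rows) / sum(row["total"] for row in rows)

print(format_duration(avg_total))       # 2m 41.9s
print(f"setup share: {setup_share:.0%}")  # 78%
```

Most of the AI Editor's total time in this run is setup rather than AI work, which is why its Total wait is much longer than its Response time.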

Data Analysis — Understand Your Data

What it does: reads an uploaded data file and prepares it for analysis. What to look for: Avg Analysis and Avg Processing are the usual wait times; Success Rate is how often it finished cleanly.

Data Analysis — AI Mode

No Data Analysis results found.

AI Chunking — Reading Long Documents

What it does: splits a long document into smaller sections the AI can read and understand. What to look for: the typical time to process a document, and how often it completed.

AI Chunking mixed  |  2 run(s)  |  http://100.107.115.34  |  2026-06-02T06:02:01+00:00  |  model: phi3
Model: phi3
Runs: 2
Completed: 2
Failed: 0
Success Rate: 100.0%
Avg Chunking: 10.3s
Avg Total Time: 28.6s
Avg Tokens/sec: 782.8
Total Doc Tokens: 16,046

Chunking Time

Process Document → Done

Min: 10.2s  |  p50: 10.2s  |  p75: 10.3s  |  p85: 10.3s  |  p90: 10.3s  |  p95: 10.3s  |  p99: 10.3s  |  Max: 10.3s
Mean: 10.3s  |  Std Dev: 0.1s

Tokens/sec

doc tokens ÷ chunking time

Min: 778.9  |  p50: 778.9  |  p75: 786.6  |  p85: 786.6  |  p90: 786.6  |  p95: 786.6  |  p99: 786.6  |  Max: 786.6
Mean: 782.8  |  Std Dev: 3.9

Per-Run Breakdown

#  |  Status  |  Setup  |  Chunking  |  Total  |  Doc Tokens  |  Tok/s  |  Error
1  |  completed  |  19.0s  |  10.3s  |  29.3s  |  8,023  |  778.9  |  -
2  |  completed  |  17.7s  |  10.2s  |  27.9s  |  8,023  |  786.6  |  -
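
For this tool the report defines Tokens/sec as doc tokens ÷ chunking time, and the two rows above can be reproduced from that definition. A minimal sketch, with a hypothetical function name:

```python
def tokens_per_second(doc_tokens: int, chunking_seconds: float) -> float:
    """Chunking throughput: document tokens divided by chunking time."""
    if chunking_seconds <= 0:
        raise ValueError("chunking time must be positive")
    return doc_tokens / chunking_seconds


# (doc tokens, chunking time in seconds) for runs 1 and 2.
runs = [(8023, 10.3), (8023, 10.2)]
rates = [round(tokens_per_second(tokens, secs), 1) for tokens, secs in runs]
print(rates)  # [778.9, 786.6]
```

The same division does not reproduce the AI Editor figures (493 tokens over 17.8s is about 28 tokens per second, not 203.0), so those are presumably measured over a different interval than the LLM time shown.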

How to read the numbers

Success rate — out of every attempt, how many finished correctly. Higher is better — 100% means nothing failed.
Typical wait time (Avg / Average) — the usual time a person waited for a result. Shown in seconds (s) or minutes (m). Lower is better.
p50, p85, p99 — a quick way to show the range of waits. p50 = half of people waited less than this. p85 = 85% waited less. p99 = almost everyone (99%) waited less. Lower is better.
Tokens / second — roughly how fast the AI writes its answer (a “token” is about ¾ of a word). Higher is better.
Setup — the time to open the page and get ready, before the AI work even begins.
Model — the specific AI engine used for that tool (for example, “qwen2.5:14b” or “phi3”). Different tools can use different engines.
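
The percentile and spread figures can be reproduced with a short calculation. The report does not say which percentile method it uses; the sketch below assumes the nearest-rank method, which matches the values shown, and the population form of standard deviation, which is consistent with the Std Dev figures. The sample data are the two AI Editor edit times.

```python
import math
import statistics


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


edit_times = [17.8, 16.9]  # Edit LLM time in seconds, tests 1 and 2

summary = {f"p{p}": percentile(edit_times, p) for p in (50, 75, 85, 90, 95, 99)}
mean = statistics.fmean(edit_times)    # about 17.35, shown as 17.4s
spread = statistics.pstdev(edit_times)  # about 0.45, shown as 0.5s

print(summary)
print(mean, spread)
```

With only one or two runs per tool, every percentile lands on the minimum or the maximum, so the percentile rows in this report mainly confirm the range rather than describe a distribution.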