Ten aggregate classification datasets computed from 54 anonymized voice AI conversations (2,334 turns). They cover OpenAI's Asking/Doing/Expressing trichotomy, voice-specific facets, per-turn sentiment (VADER), question-type and pronoun distributions, conversation-arc transitions, and time-of-day patterns. Raw conversation text is not included.
"Contains information from the HereSay Voice AI Classifications Dataset (2026-Q2-v2) by HereSay (heresay.live), which is made available under the ODC Attribution License (ODC-BY 1.0)."
Downloading requires a free HereSay account, so that you see the license at download time.
01_conversation_stats.csv: workhorse table (turns, char counts, redactions)
02_asking_doing_expressing.csv: OpenAI taxonomy label per conversation
03_voice_facets.csv: mic_test / practice / casual / emotional / info_seeking
04_turn_lengths.csv: char + word counts per turn
05_voice_signals.csv: filler words, mic-check phrases, ASR garble
06_question_types.csv: what / how / why / yes-no question counts
07_pronoun_usage.csv: 1st / 2nd / 3rd person counts per role
08_arc_transitions.csv: opening facet → closing facet (Sankey input)
09_sentiment.csv: VADER compound score per turn
10_time_of_day.csv: UTC hour bucket counts
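
Once downloaded, the files listed above can be inspected with the Python standard library. The following is a minimal sketch, not an official loader: it assumes the ten CSVs sit together in one local directory and that each has a header row. The column names are not documented here, so the code reads whatever headers it finds instead of assuming a schema. The names `EXPECTED_FILES`, `load_tables`, and `summarize` are hypothetical helpers introduced only for illustration.

```python
import csv
from pathlib import Path

# File names as given in the listing; the flat directory layout is an assumption.
EXPECTED_FILES = [
    "01_conversation_stats.csv",
    "02_asking_doing_expressing.csv",
    "03_voice_facets.csv",
    "04_turn_lengths.csv",
    "05_voice_signals.csv",
    "06_question_types.csv",
    "07_pronoun_usage.csv",
    "08_arc_transitions.csv",
    "09_sentiment.csv",
    "10_time_of_day.csv",
]


def load_tables(data_dir):
    """Return {file_name: list of row dicts} for each listed file found in data_dir."""
    tables = {}
    for name in EXPECTED_FILES:
        path = Path(data_dir) / name
        if not path.exists():
            continue  # skip files that were not downloaded
        with path.open(newline="", encoding="utf-8") as handle:
            # Assumes a header row; DictReader keys come from that row.
            tables[name] = list(csv.DictReader(handle))
    return tables


def summarize(tables):
    """Map each loaded file to (row_count, column_names) without assuming a schema."""
    return {
        name: (len(rows), list(rows[0].keys()) if rows else [])
        for name, rows in tables.items()
    }
```

Calling `summarize(load_tables("path/to/download"))` gives a quick view of row counts and headers per file before any analysis. Any derived work should carry the ODC-BY attribution sentence quoted above.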