[2025S] James Finch (Postdoc)

TBA

James Finch

Date: 2025-02-07 / 2:00 - 3:00 PM
Location: White Hall 100


Abstract

This presentation introduces a novel streaming-based approach to slot schema induction for task-oriented dialogue systems, addressing the limitations of clustering-based methods that rely on dense vector embeddings, large amounts of data, and sensitive hyperparameter tuning. Our method reframes schema induction as a text generation task, using a streaming paradigm to incrementally construct adaptable schemas in real time. To support this, we present the DOTS dataset (a fully automated, schema-consistent dialogue dataset generated with GPT-4o) and propose new evaluation metrics that align better with human judgment, overcoming the shortcomings of approaches based on embedding similarity. Additionally, we demonstrate that widely used benchmarks such as MultiWOZ and SGD are no longer suitable because LLMs have memorized their slot schemas, which motivates our new evaluation dataset. Our experiments, including comparisons with clustering-based models and on large-scale benchmarks, highlight the effectiveness of our methods across both synthetic and human-authored task scenarios.

Link

Presentation