Chatbot Evaluation

This project develops assessment methodologies for conversational AI systems. It aims to create frameworks for evaluating both open-domain and task-specific chatbots along key dimensions such as response relevance, coherence, and user satisfaction, with the goal of informing chatbot design across applications.


Director

  • Jinho Choi - Associate Professor at Emory University

Publications

  1. Exploring the Impact of Human Evaluator Group on Chat-Oriented Dialogue Evaluation. Finch, S. E. and Choi, J. D. Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING-LREC), 2024.
  2. Leveraging Large Language Models for Automated Dialogue Analysis. Finch, S. E.; Paek, E. S.; and Choi, J. D. Proceedings of the Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2023.
  3. Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems. Finch, J. D.; Finch, S. E.; and Choi, J. D. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
  4. DFEE: Interactive DataFlow Execution and Evaluation Kit. He, H.; Feng, S.; Bonadiman, D.; Zhang, Y.; and Mansour, S. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023.
  5. Report from the NSF Future Directions Workshop on Automatic Evaluation of Dialog: Research Directions and Challenges. Mehri, S.; Choi, J. D.; D'Haro, L. F.; Deriu, J.; Eskenazi, M.; Gasic, M.; Georgila, K.; Hakkani-Tur, D.; Li, Z.; Rieser, V.; Shaikh, S.; Traum, D.; Yeh, Y.; Yu, Z.; Zhang, Y.; and Zhang, C. arXiv, 2203.10012, 2022.
  6. What Went Wrong? Explaining Overall Dialogue Quality through Utterance-Level Impacts. Finch, J.; Finch, S.; and Choi, J. D. Proceedings of the EMNLP Workshop on NLP for Conversational AI (NLP4ConvAI), 2021.
  7. Towards Unified Dialogue System Evaluation: A Comprehensive Analysis of Current Evaluation Protocols. Finch, S. E. and Choi, J. D. Proceedings of the Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL), 2020.