CV | Chin-Jou Li

Basics

Name	Chin-Jou Li
Position	Master's Student @ CMU LTI
Email	chinjoul@andrew.cmu.edu

Work

2025.08 - Present
Teaching Assistant

11-751 Speech Recognition and Understanding, Carnegie Mellon University
2025.05 - Present
Research Collaborator

WAVLab, Carnegie Mellon University

Building a phone foundation model and an evaluation benchmark, supervised by Prof. Shinji Watanabe.
2025.01 - Present
Teaching Assistant

11-4/611 Natural Language Processing, Carnegie Mellon University
2024.09 - Present
Graduate Researcher

ChangeLing Lab, Carnegie Mellon University

Researching speech processing with a phonetic/phonological focus, supervised by Prof. David R. Mortensen.
2023.01 - 2023.12
Research Assistant

Department of Neurology Neurological Institute, Taipei Veterans General Hospital

Applied computer vision techniques to clinical seizure recordings, including action recognition and privacy protection.
2022.10 - 2024.06
Undergraduate Researcher

Intelligent Agents Lab, National Taiwan University

Handled data processing and model evaluation for Traditional Chinese LLM development, supervised by Prof. Jane Yung-Jen Hsu.
2022.09 - 2024.06
Teaching Assistant / DNS Team Member

Network Administration and System Administration Team, National Taiwan University
2022.07 - 2024.02
Research Assistant

Biomedical Acoustic Signal Processing Lab, Academia Sinica

Researched audio-visual speech enhancement and speaker diarization, supervised by Dr. Yu Tsao and Dr. Jen-Cheng Hou.

Education

2024.08 - 2025.12

Pittsburgh, PA, USA
M.S.

Carnegie Mellon University

Intelligent Information Systems (ML / NLP)
2020.09 - 2024.06

Taipei, Taiwan
B.Sc.

National Taiwan University

Computer Science and Information Engineering

Publications

2026.01

PRiSM: Benchmarking Phone Realization in Speech Models

Preprint
2025.10

POWSM: A Phonetic Open Whisper-Style Speech Foundation Model

Preprint
2025.10

Prompt-MII: Meta-Learning Instruction Induction for LLMs

ICLR 2026
2025.05

Towards Inclusive ASR: Investigating Voice Conversion for Dysarthric Speech Recognition in Low-Resource Languages

Interspeech 2025
2025.05

Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention

ACL 2025
2024.12

Epileptic Seizure Classification with Patient-Level and Video-Level Contrastive Pretraining

EMBC 2024
2024.09

Face swapping in seizure videos for patient deidentification

Epilepsy Research
2023.11

AI-based face transformation in patient seizure videos for privacy protection

MCP: Digital Health
2023.09

Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement

AVSEC, Interspeech 2024 Satellite Event

Skills

	Speech
	Speech Recognition
	Voice Conversion
	Speech Enhancement
	ESPnet

	NLP
	LLM
	MLLM
	Data Processing

	Infra
	PyTorch
	Linux
	Cluster Computing

Projects

2024.09 - Present
Universal Phone Recognition

Building a speech foundation model and an evaluation benchmark for phone recognition
- Speech Processing
- Phonetics & Phonology
2024.09 - Present
Speech Technologies for Pathological Speech

Investigating speech generation for cross-lingual dysarthric speech
- Speech Processing
- Atypical Speech
2023.09 - 2024.06
TAIDE - Trustworthy AI Dialogue Engine

A trustworthy generative AI dialogue engine tailored for Taiwan
- LLM Evaluation
- Data Curation
2023.02 - 2023.05
EGO4D Audio-Visual Speaker Diarization Challenge 2023

Applied S3L embeddings and people tracking to egocentric videos
- Multimodal
- Speech Processing
