CV
Basics
| Name | Chin-Jou Li |
| Position | Master's Student @ CMU LTI |
| Email | chinjoul@andrew.cmu.edu |
Work
-
2025.08 - Present -
2025.05 - Present Research Collaborator
WAVLab, Carnegie Mellon University
Building a phone foundation model and an evaluation benchmark, supervised by Prof. Shinji Watanabe.
-
2025.01 - Present -
2024.09 - Present Graduate Researcher
ChangeLing Lab, Carnegie Mellon University
Researching speech processing with a phonetic/phonological focus, supervised by Prof. David R. Mortensen.
-
2023.01 - 2023.12 Research Assistant
Department of Neurology, Neurological Institute, Taipei Veterans General Hospital
Applied computer vision techniques to clinical seizure recordings, including action recognition and privacy protection.
-
2022.10 - 2024.06 Undergraduate Researcher
Intelligent Agents Lab, National Taiwan University
Handled data processing and model evaluation for Traditional Chinese LLM development, supervised by Prof. Jane Yung-Jen Hsu.
-
2022.09 - 2024.06 Teaching Assistant / DNS Team Member
Network Administration and System Administration Team, National Taiwan University
-
2022.07 - 2024.02 Research Assistant
Biomedical Acoustic Signal Processing Lab, Academia Sinica
Researched audio-visual speech enhancement and speaker diarization, supervised by Dr. Yu Tsao and Dr. Jen-Cheng Hou.
Education
Publications
-
2025.10 -
2025.10 -
2024.09 Face swapping in seizure videos for patient deidentification
Epilepsy Research
-
2023.11 AI-based face transformation in patient seizure videos for privacy protection
MCP: Digital Health
-
2023.09 Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement
AVSEC, Interspeech 2024 Satellite Event
Skills
| Speech | Speech Recognition, Voice Conversion, Speech Enhancement, ESPnet |
| NLP | LLM, MLLM, Data Processing |
| Infra | PyTorch, Linux, Cluster Computing |
Projects
- 2024.09 - Present
Universal Phone Recognition
Building a speech foundation model and an evaluation benchmark for phone recognition
- Speech Processing
- Phonetics & Phonology
- 2024.09 - Present
Speech Technologies for Pathological Speech
Investigating speech generation on cross-lingual dysarthric speech
- Speech Processing
- Atypical Speech
- 2023.09 - 2024.06
TAIDE - Trustworthy AI Dialogue Engine
A trustworthy generative AI dialogue engine tailored for Taiwan
- LLM Evaluation
- Data Curation
- 2023.02 - 2023.05
EGO4D Audio-Visual Speaker Diarization Challenge 2023
Applied S3L embeddings and people tracking to egocentric videos
- Multimodal
- Speech Processing