CV
Basics
| Name | Chin-Jou Li | 
| Position | Master's Student @ CMU LTI | 
| Email | chinjoul@andrew.cmu.edu | 
Work
-  2025.08 - Present 
-  2025.05 - Present Research Collaborator, WAVLab, Carnegie Mellon University. Building a phone foundation model and an evaluation benchmark, supervised by Prof. Shinji Watanabe.
-  2025.01 - Present 
-  2024.09 - Present Graduate Researcher, ChangeLing Lab, Carnegie Mellon University. Researching speech processing with a phonetic/phonological focus, supervised by Prof. David R. Mortensen.
-  2023.01 - 2023.12 Research Assistant, Department of Neurology, Neurological Institute, Taipei Veterans General Hospital. Applied computer vision techniques to clinical seizure recordings, including action recognition and privacy protection.
-  2022.10 - 2024.06 Undergraduate Researcher, Intelligent Agents Lab, National Taiwan University. Handled data processing and model evaluation for Traditional Chinese LLM development, supervised by Prof. Jane Yung-Jen Hsu.
-  2022.09 - 2024.06 Teaching Assistant / DNS Team Member, Network Administration and System Administration Team, National Taiwan University.
-  2022.07 - 2024.02 Research Assistant, Biomedical Acoustic Signal Processing Lab, Academia Sinica. Researched audio-visual speech enhancement and speaker diarization, supervised by Dr. Yu Tsao and Dr. Jen-Cheng Hou.
Education
Publications
-  2024.09 Face swapping in seizure videos for patient deidentification. Epilepsy Research.
-  2023.11 AI-based face transformation in patient seizure videos for privacy protection. MCP: Digital Health.
-  2023.09 Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement. AVSEC, Interspeech 2024 Satellite Event.
Skills
| Speech | Speech Recognition, Voice Conversion, Speech Enhancement, ESPnet |
| NLP | LLM, MLLM, Data Processing |
| Infra | PyTorch, Linux, Cluster Computing |
Projects
-  2024.09 - Present Phonetic Speech Foundation Model. Building a speech foundation model that supports phoneme recognition. Tags: Speech Processing, Phonetics & Phonology.
 
-  2024.09 - Present Data Augmentation for Pathological Speech. Investigating voice conversion on cross-lingual dysarthric speech. Tags: Speech Processing, Atypical Speech.
 
-  2024.10 - 2024.12 Training Decoder-Only ASR with Long Context. Explored cache retrieval to extend context with limited compute. Tags: Speech Processing, Efficiency.
 
-  2023.09 - 2024.06 TAIDE - Trustworthy AI Dialog Engine. A trustworthy generative AI dialogue engine tailored for Taiwan. Tags: LLM Evaluation, Data Curation.
 
-  2023.02 - 2023.05 EGO4D Audio-Visual Speaker Diarization Challenge 2023. Applied S3L embeddings and people tracking to egocentric videos. Tags: Multimodal, Speech Processing.