CV
Basics
Name | Chin-Jou Li |
Position | Master's Student @ CMU LTI |
Email | chinjoul@andrew.cmu.edu |
Work
-
2025.01 - 2025.05 -
2024.09 - Present Graduate Researcher
ChangeLing Lab, Carnegie Mellon University
Researching speech processing with a phonetic/phonological focus, supervised by Prof. David R. Mortensen.
-
2023.01 - 2023.12 Research Assistant
Department of Neurology, Neurological Institute, Taipei Veterans General Hospital
Applied computer vision techniques to clinical seizure recordings, including action recognition and privacy protection.
-
2022.10 - 2024.06 Undergraduate Researcher
Intelligent Agents Lab, National Taiwan University
Handled data processing and model evaluation for Traditional Chinese LLM development, supervised by Prof. Jane Yung-Jen Hsu.
-
2022.09 - 2024.06 Teaching Assistant / DNS Team Member
Network Administration and System Administration Team, National Taiwan University
-
2022.07 - 2024.02 Research Assistant
Biomedical Acoustic Signal Processing Lab, Academia Sinica
Researched audio-visual speech enhancement and speaker diarization, supervised by Dr. Yu Tsao and Dr. Jen-Cheng Hou.
Education
Publications
-
2025.03 -
2024.09 Face swapping in seizure videos for patient deidentification
Epilepsy Research
-
2023.11 AI-based face transformation in patient seizure videos for privacy protection
MCP: Digital Health
-
2023.09 Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement
AVSEC, Interspeech 2024 Satellite Event
Skills
Speech | Speech Recognition, Voice Conversion, Speech Enhancement, ESPnet |
NLP | LLM, MLLM, Data Processing |
Infra | PyTorch, Linux, Cluster Computing |
Projects
- 2024.09 - Present
Phonetic Speech Foundation Model
Building a speech foundation model that supports phoneme recognition
- Speech Processing
- Phonetics & Phonology
- 2024.09 - Present
Data Augmentation for Pathological Speech
Investigating voice conversion for cross-lingual dysarthric speech
- Speech Processing
- Atypical Speech
- 2024.10 - 2024.12
Training Decoder-Only ASR with Long Context
Explored cache retrieval to extend context with limited compute
- Speech Processing
- Efficiency
- 2023.09 - 2024.06
TAIDE - Trustworthy AI Dialog Engine
A trustworthy generative AI dialogue engine tailored for Taiwan
- LLM Evaluation
- Data Curation
- 2023.02 - 2023.05
EGO4D Audio-Visual Speaker Diarization Challenge 2023
Applied S3L embeddings and people tracking to egocentric videos
- Multimodal
- Speech Processing