About Me
Name
Heejin Do (도희진)
- heejindo@postech.ac.kr
- docando71@gmail.com
Research Topic
Natural Language Processing (NLP)
↪ Automated Essay Scoring, Automated Pronunciation Assessment, LLM reasoning and evaluation, Natural Language Generation
Education
- Mar. 2016 – Aug. 2020: Bachelor’s degree in Computer Education and Data Science from Sungkyunkwan University
- Aug. 2019 – Dec. 2019: Exchange student in Information Science and Technology (IST) at Pennsylvania State University
- Aug. 2020 – Feb. 2025: Ph.D. at the Graduate School of Artificial Intelligence, POSTECH
- Current: Research Intern at NAVER Cloud AI Lab
Papers
International Journal Papers
- Target-Oriented Knowledge Distillation with Language-Family-Based Grouping for Multilingual NMT
Heejin Do, Gary Geunbae Lee, ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2022, Paper Link
International Conference Papers
- Towards Prompt Generalization: Grammar-aware Cross-Prompt Automated Essay Scoring
Heejin Do, Taehee Park, Sangwon Ryu, Gary Geunbae Lee, Findings of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, New Mexico
- Multimodal Cognitive Reframing Therapy via Multi-hop Psychotherapeutic Reasoning
Subin Kim, Hoonrae Kim, Heejin Do, Gary Lee, Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, New Mexico
- DyPCL: Dynamic Phoneme-level Contrastive Learning for Dysarthric Speech Recognition
Wonjun Lee, Solee Im, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Lee, Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, New Mexico
- Revisiting Early Detection of Sexual Predators via Turn-level Optimization
Jinmyeong An, Sangwon Ryu, Heejin Do, Yunsu Kim, Jungseul Ok, Gary Lee, Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), Albuquerque, New Mexico
- Autoregressive Multi-trait Essay Scoring via Reinforcement Learning with Scoring-aware Multiple Rewards
Heejin Do, Sangwon Ryu, Gary Geunbae Lee, Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, Florida, Paper Link
- Acoustic Feature Mixup for Balanced Multi-aspect Pronunciation Assessment
Heejin Do, Wonjun Lee, Gary Geunbae Lee, International Conference on Speech Communication and Technology (Interspeech 2024), Kos Island, Greece, Paper Link
- Key-Element-Informed sLLM Tuning for Document Summarization
Sangwon Ryu*, Heejin Do*, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok, International Conference on Speech Communication and Technology (Interspeech 2024), Kos Island, Greece, Paper Link
- Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
Sangwon Ryu*, Heejin Do*, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok, Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Paper Link
- Aspect-based Semantic Textual Similarity for Educational Test Items
Heejin Do, Gary Geunbae Lee, International Conference on Artificial Intelligence in Education (AIED 2024), Recife, Brazil, Paper Link | Code
- Autoregressive Score Generation for Multi-trait Essay Scoring
Heejin Do, Yunsu Kim, Gary Geunbae Lee, Findings of the European Chapter of the Association for Computational Linguistics (EACL 2024), Malta, Paper Link | Code
- Score-Balanced Loss for Multi-Aspect Pronunciation Assessment
Heejin Do, Yunsu Kim, Gary Geunbae Lee, International Conference on Speech Communication and Technology (Interspeech 2023), Dublin, Ireland, Paper Link | Code
- Prompt- and Trait Relation-aware Cross-prompt Essay Trait Scoring
Heejin Do, Yunsu Kim, Gary Geunbae Lee, Findings of the Association for Computational Linguistics (ACL 2023, Long), Toronto, Canada, Paper Link | Code
- Hierarchical Pronunciation Assessment with Multi-Aspect Attention
Heejin Do, Yunsu Kim, Gary Geunbae Lee, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes Island, Greece, Paper Link | Code
Domestic Conference Papers
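- Multilingual Neural Machine Translation with Language-Family-Based Knowledge Distillation (어족 기반 지식 증류 기법을 적용한 다국어 신경망 기계 번역)
Heejin Do, Gary Geunbae Lee, Korea Computer Congress 2021 (KCC 2021), June 2021, Online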
Patents & SW
Patents
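- Heejin Do, Gary Geunbae Lee, System, Apparatus, and Method for Multilingual Neural Machine Translation Using Language-Family-Based Knowledge Distillation, Application No. 10-2021-0097108, July 23, 2021 (registered)
- Heejin Do, Yunsu Kim, Gary Geunbae Lee, Method and Apparatus for Automatic Pronunciation Assessment, Application No. 10-2023-0132781, October 5, 2023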
SW
- C-2021-027720. Multilingual Neural Machine Translation Using Language-Family-Based Knowledge Distillation. July 9, 2021 (Heejin Do)
Honors and Awards
- POSTECHIAN Fellowship, POSTECH, 2024
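- [Korean Society for Security Ethics (한국보안윤리학회)] Joint Festival for Fostering Talent for the Fourth Industrial Revolution, Grand Excellence Award (2nd place) - Nov. 2017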
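- [Ministry of Science and ICT] SW Eduthon, Grand Prize (1st place), Minister of Science and ICT Award - Dec. 2018
- [Ministry of Oceans and Fisheries/UNIST/Ulsan Port Authority] 8th Big Data Analysis Competition, Excellence Award (2nd place) - Aug. 2019
- [Sungkyunkwan University] Sungkyun Family Award, Community Service Category, Grand Prize (1st place) - Nov. 2018
- [Sungkyunkwan University] R Coding/Statistics Camp Project, Excellence Award (3rd place) - Jan. 2019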
Experience
Software Education Volunteer
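- World Friends Korea/NIA/KIV, ICT Volunteer Corps at Tashkent University of Information Technologies (TUIT), Uzbekistan
- NAVER CONNECT Foundation, ‘SW야 놀자’ (Play with SW) program, activity committee member
- Code Club Korea Committee, activity committee member
- SKT Silver Net, Smartphone Innovation School, SW education volunteer
- Dongsung High School, SW education volunteer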
Internship
- ST Unitas, Process Planning Team
Etc
- Sungkyunkwan University Data Science R Statistics Camp Project
- Teaching Assistant for ‘Problem Solving and Algorithms’ and ‘Computational Thinking and SW Coding’ (SKKU)