publications | yonsei-ml3

^* denotes equal contribution.

2025

Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning

Yeongbin Seo, Dongha Lee, Jaehyung Kim, and Jinyoung Yeo

In Advances in Neural Information Processing Systems (NeurIPS), 2025

Spotlight Presentation, 688/21575=3.19%

Paper Code

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim, Sumin Park, Huiwon Jang, Jinwoo Shin, Jaehyung Kim^*, and Younggyo Seo^*

In Advances in Neural Information Processing Systems (NeurIPS), 2025

Paper

Personalized LLM Decoding via Contrasting Personal Preference

Hyungjune Bu^*, Chanjoo Jung^*, Minjae Kang, and Jaehyung Kim

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Paper Code Website

Improving Chemical Understanding of LLMs via SMILES Parsing

Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Paper

Personalized Language Models via Privacy-Preserving Evolutionary Model Merging

Kyuyoung Kim, Jinwoo Shin, and Jaehyung Kim

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Oral Presentation

Paper Code

Debiasing Online Preference Learning via Preference Feature Preservation

Dongyoung Kim, Jinsung Yoon, Jinwoo Shin, and Jaehyung Kim

In Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Paper Code

Structural Reasoning Improves Molecular Understanding of LLM

Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn

In Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Paper

ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification

Hyunseok Lee^*, Seunghyuk Oh^*, Jaehyung Kim, Jinwoo Shin, and Jihoon Tack

In Proceedings of the International Conference on Machine Learning (ICML), 2025

Paper Code

Few-shot Personalization of LLMs with Mis-aligned Responses

Jaehyung Kim and Yiming Yang

In Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Paper Code

Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment

Dongyoung Kim, Kimin Lee, Jinwoo Shin, and Jaehyung Kim

In International Conference on Learning Representations (ICLR), 2025

Oral Presentation, 207/11672=1.77%

Paper Code

Alternative Mixed Integer Linear Programming Optimization for Joint Job Scheduling and Data Allocation in Grid Computing

Shengyu Feng^*, Jaehyung Kim^*, Yiming Yang, Joseph Boudreau, Tasnuva Chowdhury, Adolfy Hoisie, Raees Khan, Ozgur O. Kilic, Scott Klasky, Tatiana Korchuganova, Paul Nilsson, Verena Ingrid Martinez Outschoorn, David K. Park, Norbert Podhorszki, Yihui Ren, Frederic Suter, Sairam Sri Vatsavai, Wei Yang, Shinjae Yoo, Tadashi Maeno, and Alexei Klimentov

Future Generation Computer Systems, 2025

Paper

Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges

Hamin Koo, Minseon Kim, and Jaehyung Kim

arXiv preprint, 2025

Paper

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces

Minju Gwak, Guijin Son, and Jaehyung Kim

arXiv preprint, 2025

Paper

TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA

Chanjoo Jung and Jaehyung Kim

arXiv preprint, 2025

Paper

Prior-based Noisy Text Data Filtering: Fast and Strong Alternative for Perplexity

Yeongbin Seo, Gayoung Kim, Jaehyung Kim, and Jinyoung Yeo

arXiv preprint, 2025

Paper

Efficient LLM Collaboration via Planning

Byeongchan Lee^*, Jonghoon Lee^*, Dongyoung Kim, Jaehyung Kim, Kyungjoon Park, Dongjun Lee, and Jinwoo Shin

arXiv preprint, 2025

Paper

Training-free LLM Verification via Recycling Few-shot Examples

Dongseok Lee, Jimyung Hong, Dongyoung Kim, and Jaehyung Kim

ICML Workshop ES-FoMo-III, 2025

Spotlight Presentation, 14/146=9.59%

Paper Code

Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs

Beomsik Cho and Jaehyung Kim

ICML Workshop ES-FoMo-III, 2025

Paper Code

LLMs Think, But Not in Your Flow: Reasoning-Level Personalization for Black-Box Large Language Models

Jieyong Kim^*, Tongyoung Kim^*, Soojin Yoon, Jaehyung Kim, and Dongha Lee

arXiv preprint, 2025

Paper

EMCee: Improving Multilingual Capability of LLMs via Bridging Knowledge and Reasoning with Extracted Synthetic Multilingual Context

Hamin Koo and Jaehyung Kim

arXiv preprint, 2025

Paper Code

2024

Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning

Jaehyun Nam^*, Kyuyoung Kim^*, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, and Jinwoo Shin

In Advances in Neural Information Processing Systems (NeurIPS), 2024

Paper Code

Online Adaptation of Language Models with a Memory of Amortized Contexts

Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, and Jonathan Richard Schwarz

In Advances in Neural Information Processing Systems (NeurIPS), 2024

Paper Code

Learning to Correct for QA Reasoning with Black-box LLMs

Jaehyung Kim, Dongyoung Kim, and Yiming Yang

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Paper Code

Tabular Transfer Learning via Prompting LLMs

Jaehyun Nam, Woomin Song, Seong Hyeon Park, Jihoon Tack, Sukmin Yun, Jaehyung Kim, Kyu Hwan Oh, and Jinwoo Shin

In Conference on Language Modeling (COLM), 2024

Paper Code

Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs

Woomin Song^*, Seunghyuk Oh^*, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, and Jinwoo Shin

In International Conference on Learning Representations (ICLR), 2024

Paper Code

SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs

Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, and Jinwoo Shin

In International Conference on Learning Representations (ICLR), 2024

Paper Code

Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity

Hyosoon Jang, Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn

arXiv preprint, 2024

Paper

SelectLLM: Can LLMs Select Important Instructions to Annotate?

Ritik Sachin Parkar^*, Jaehyung Kim^*, Jong Inn Park, and Dongyeop Kang

arXiv preprint, 2024

Paper

Under the Surface: Tracking the Artifactuality of LLM-Generated Data

Debarati Das^*, Karin De Langis^*, Anna Martin^*, Jaehyung Kim^*, Minhwa Lee^*, Zae Myung Kim^*, Shirley Hayati, Risako Owan, Bin Hu, Ritik Parkar, Ryan Koo, Jonginn Park, Aahan Tyagi, Libby Ferland, Sanjali Roy, Vincent Liu, and Dongyeop Kang

arXiv preprint, 2024

Paper

Meta-Crafting: Improved Detection of Out-of-distributed Texts via Crafting Metadata Space

Ryan Koo, Yekyung Kim, Dongyeop Kang, and Jaehyung Kim

AAAI 2024 Student Abstract, 2024

Website

2023

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, and Madian Khabsa

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Paper Code

A Universal Framework for Dataset Characterization with Multidimensional Meta-information

Jaehyung Kim, Yekyung Kim, Karin Johanna Langis, Jinwoo Shin, and Dongyeop Kang

In Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Paper Code

Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning

Jaehyung Kim, Jinwoo Shin, and Dongyeop Kang

In Proceedings of the International Conference on Machine Learning (ICML), 2023

Paper Code

Everyone’s Voice Matters: Quantifying Annotation Disagreement Using Demographic Information

Ruyuan Wan, Jaehyung Kim, and Dongyeop Kang

In AAAI Conference on Artificial Intelligence (AAAI), 2023

Oral Presentation

Paper Code

2022

Time Is MattEr: Temporal Self-supervision for Video Transformers

Sukmin Yun, Jaehyung Kim, Dongyoon Han, Hwanjun Song, Jung-Woo Ha, and Jinwoo Shin

In Proceedings of the International Conference on Machine Learning (ICML), 2022

Paper Code

Patch-level Representation Learning for Self-supervised Vision Transformers

Sukmin Yun, Hankook Lee, Jaehyung Kim, and Jinwoo Shin

In Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Oral Presentation

Paper Code

What Makes Better Augmentation Strategies? Augment Difficult but Not too Different

Jaehyung Kim, Dongyeop Kang, Sungsoo Ahn, and Jinwoo Shin

In International Conference on Learning Representations (ICLR), 2022

Code

Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation

Junhyun Nam, Jaehyung Kim, Jaeho Lee, and Jinwoo Shin

In International Conference on Learning Representations (ICLR), 2022

Paper

2020

Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, and Jinwoo Shin

In Advances in Neural Information Processing Systems (NeurIPS), 2020

Paper Code

M2m: Imbalanced Classification via Major-to-minor Translation

Jaehyung Kim^*, Jongheon Jeong^*, and Jinwoo Shin

In Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Code

2017

Simplified Stochastic Feedforward Neural Networks

Kimin Lee, Jaehyung Kim, Song Chong, and Jinwoo Shin

arXiv preprint, 2017

Paper