* denotes equal contribution.

2025

  1. Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
    Yeongbin Seo, Dongha Lee, Jaehyung Kim, and Jinyoung Yeo
    In Advances in Neural Information Processing Systems (NeurIPS), 2025
    Spotlight Presentation, top 3.19% of 21575 submissions
  2. Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
    Dongyoung Kim, Sumin Park, Huiwon Jang, Jinwoo Shin, Jaehyung Kim*, and Younggyo Seo*
    In Advances in Neural Information Processing Systems (NeurIPS), 2025
  3. Personalized LLM Decoding via Contrasting Personal Preference
    Hyungjune Bu*, Chanjoo Jung*, Minjae Kang, and Jaehyung Kim
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
  4. Improving Chemical Understanding of LLMs via SMILES Parsing
    Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
  5. Personalized Language Models via Privacy-Preserving Evolutionary Model Merging
    Kyuyoung Kim, Jinwoo Shin, and Jaehyung Kim
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
    Oral Presentation
  6. Debiasing Online Preference Learning via Preference Feature Preservation
    Dongyoung Kim, Jinsung Yoon, Jinwoo Shin, and Jaehyung Kim
    In Annual Meeting of the Association for Computational Linguistics (ACL), 2025
  7. Structural Reasoning Improves Molecular Understanding of LLM
    Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn
    In Annual Meeting of the Association for Computational Linguistics (ACL), 2025
  8. ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification
    Hyunseok Lee*, Seunghyuk Oh*, Jaehyung Kim, Jinwoo Shin, and Jihoon Tack
    In Proceedings of the International Conference on Machine Learning (ICML), 2025
  9. Few-shot Personalization of LLMs with Mis-aligned Responses
    Jaehyung Kim and Yiming Yang
    In Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2025
  10. Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
    Dongyoung Kim, Kimin Lee, Jinwoo Shin, and Jaehyung Kim
    In International Conference on Learning Representations (ICLR), 2025
    Oral Presentation, 207/11672=1.77%
  11. Alternative Mixed Integer Linear Programming Optimization for Joint Job Scheduling and Data Allocation in Grid Computing
    Shengyu Feng*, Jaehyung Kim*, Yiming Yang, Joseph Boudreau, Tasnuva Chowdhury, Adolfy Hoisie, Raees Khan, Ozgur O. Kilic, Scott Klasky, Tatiana Korchuganova, Paul Nilsson, Verena Ingrid Martinez Outschoorn, David K. Park, Norbert Podhorszki, Yihui Ren, Frederic Suter, Sairam Sri Vatsavai, Wei Yang, Shinjae Yoo, Tadashi Maeno, and Alexei Klimentov
    Future Generation Computer Systems, 2025
  12. Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
    Hamin Koo, Minseon Kim, and Jaehyung Kim
    arXiv preprint, 2025
  13. Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces
    Minju Gwak, Guijin Son, and Jaehyung Kim
    arXiv preprint, 2025
  14. TiTok: Transfer Token-level Knowledge via Contrastive Excess to Transplant LoRA
    Chanjoo Jung and Jaehyung Kim
    arXiv preprint, 2025
  15. Prior-based Noisy Text Data Filtering: Fast and Strong Alternative for Perplexity
    Youngbin Seo, Gayoung Kim, Jaehyung Kim, and Jinyoung Yeo
    arXiv preprint, 2025
  16. Efficient LLM Collaboration via Planning
    Byeongchan Lee*, Jonghoon Lee*, Dongyoung Kim, Jaehyung Kim, Kyungjoon Park, Dongjun Lee, and Jinwoo Shin
    arXiv preprint, 2025
  17. Training-free LLM Verification via Recycling Few-shot Examples
    Dongseok Lee, Jimyung Hong, Dongyoung Kim, and Jaehyung Kim
    ICML Workshop ES-FoMo-III, 2025
    Spotlight Presentation, 14/146=9.59%
  18. Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs
    Beomsik Cho and Jaehyung Kim
    ICML Workshop ES-FoMo-III, 2025
  19. LLMs Think, But Not in Your Flow: Reasoning-Level Personalization for Black-Box Large Language Models
    Jieyong Kim*, Tongyoung Kim*, Soojin Yoon, Jaehyung Kim, and Dongha Lee
    arXiv preprint, 2025
  20. EMCee: Improving Multilingual Capability of LLMs via Bridging Knowledge and Reasoning with Extracted Synthetic Multilingual Context
    Hamin Koo and Jaehyung Kim
    arXiv preprint, 2025

2024

  1. Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning
    Jaehyun Nam*, Kyuyoung Kim*, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, and Jinwoo Shin
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  2. Online Adaptation of Language Models with a Memory of Amortized Contexts
    Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, and Jonathan Richard Schwarz
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
  3. Learning to Correct for QA Reasoning with Black-box LLMs
    Jaehyung Kim, Dongyoung Kim, and Yiming Yang
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
  4. Tabular Transfer Learning via Prompting LLMs
    Jaehyun Nam, Woomin Song, Seong Hyeon Park, Jihoon Tack, Sukmin Yun, Jaehyung Kim, Kyu Hwan Oh, and Jinwoo Shin
    In Conference on Language Modeling (COLM), 2024
  5. Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
    Woomin Song*, Seunghyuk Oh*, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, and Jinwoo Shin
    In International Conference on Learning Representations (ICLR), 2024
  6. SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs
    Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, and Jinwoo Shin
    In International Conference on Learning Representations (ICLR), 2024
  7. Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity
    Hyosoon Jang, Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn
    arXiv preprint, 2024
  8. SelectLLM: Can LLMs Select Important Instructions to Annotate?
    Ritik Sachin Parkar*, Jaehyung Kim*, Jong Inn Park, and Dongyeop Kang
    arXiv preprint, 2024
  9. Under the Surface: Tracking the Artifactuality of LLM-Generated Data
    Debarati Das*, Karin De Langis*, Anna Martin*, Jaehyung Kim*, Minhwa Lee*, Zae Myung Kim*, Shirley Hayati, Risako Owan, Bin Hu, Ritik Parkar, Ryan Koo, Jonginn Park, Aahan Tyagi, Libby Ferland, Sanjali Roy, Vincent Liu, and Dongyeop Kang
    arXiv preprint, 2024
  10. Meta-Crafting: Improved Detection of Out-of-distributed Texts via Crafting Metadata Space
    Ryan Koo, Yekyung Kim, Dongyeop Kang, and Jaehyung Kim
    In AAAI Conference on Artificial Intelligence (AAAI), Student Abstract, 2024

2023

  1. RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
    Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, and Madian Khabsa
    In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
  2. infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information
    Jaehyung Kim, Yekyung Kim, Karin De Langis, Jinwoo Shin, and Dongyeop Kang
    In Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  3. Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning
    Jaehyung Kim, Jinwoo Shin, and Dongyeop Kang
    In Proceedings of the International Conference on Machine Learning (ICML), 2023
  4. Everyone’s Voice Matters: Quantifying Annotation Disagreement Using Demographic Information
    Ruyuan Wan, Jaehyung Kim, and Dongyeop Kang
    In AAAI Conference on Artificial Intelligence (AAAI), 2023
    Oral Presentation

2022

  1. Time Is MattEr: Temporal Self-supervision for Video Transformers
    Sukmin Yun, Jaehyung Kim, Dongyoon Han, Hwanjun Song, Jung-Woo Ha, and Jinwoo Shin
    In Proceedings of the International Conference on Machine Learning (ICML), 2022
  2. Patch-level Representation Learning for Self-supervised Vision Transformers
    Sukmin Yun, Hankook Lee, Jaehyung Kim, and Jinwoo Shin
    In Conference on Computer Vision and Pattern Recognition (CVPR), 2022
    Oral Presentation
  3. What Makes Better Augmentation Strategies? Augment Difficult but Not too Different
    Jaehyung Kim, Dongyeop Kang, Sungsoo Ahn, and Jinwoo Shin
    In International Conference on Learning Representations (ICLR), 2022
  4. Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation
    Junhyun Nam, Jaehyung Kim, Jaeho Lee, and Jinwoo Shin
    In International Conference on Learning Representations (ICLR), 2022

2020

  1. Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning
    Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, and Jinwoo Shin
    In Advances in Neural Information Processing Systems (NeurIPS), 2020
  2. M2m: Imbalanced Classification via Major-to-minor Translation
    Jaehyung Kim*, Jongheon Jeong*, and Jinwoo Shin
    In Conference on Computer Vision and Pattern Recognition (CVPR), 2020

2017

  1. Simplified Stochastic Feedforward Neural Networks
    Kimin Lee, Jaehyung Kim, Song Chong, and Jinwoo Shin
    arXiv preprint, 2017