Selected Publications


Dongjun Jang, Jean Seo, Sungjoo Byun, Taekyoung Kim, Minseok Kim, and Hyopil Shin (2024), CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean. arXiv preprint arXiv:2402.15046.


Dongjun Jang, Sungjoo Byun, Hyemi Jo, and Hyopil Shin (2024), KIT-19: A Comprehensive Korean Instruction Toolkit on 19 Tasks for Fine-Tuning Korean Large Language Models, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Sungjoo Byun, Jiseung Hong, Sumin Park, Dongjun Jang, Jean Seo, Minseok Kim, Chaeyoung Oh, and Hyopil Shin (2024), Korean Bio-Medical Corpus (KBMC) for Medical Named Entity Recognition, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)


Dongjun Jang, Sungjoo Byun, and Hyopil Shin (2024), A Study on How Attention Scores in the BERT Model are Aware of Lexical Categories in Syntactic and Semantic Tasks on the GLUE Benchmark, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)




Dongjun Jang, Sangah Lee, Sungjoo Byun, Jinwoong Kim, Jean Seo, Minseok Kim, Soyeon Kim, Chaeyoung Oh, Jaeyoon Kim, Hyemi Jo, and Hyopil Shin (2023), DaG LLM ver 1.0: Pioneering Instruction-Tuned Language Modeling For Korean NLP, arXiv:2311.13784v1.


Sungjoo Byun, Dongjun Jang, Hyemi Jo, and Hyopil Shin (2023), Automatic Construction of a Korean Toxic Instruction Dataset for Ethical Tuning of Large Language Models, Workshop on Instruction Tuning and Instruction Following, 37th Conference on Neural Information Processing Systems (NeurIPS 2023).


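Dongjun Jang, Eunjin Kim, and Hyopil Shin (2022), An Analysis of Decision-Boundary Factors in Language Models' Sentence Acceptability Judgments Using Affinity Prober (in Korean), Eoneo (언어) 47(4), 829-855.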


Sangah Lee, SeokGi Kim, Eunjin Kim, Minji Kang, and Hyopil Shin (2022), Contract Eligibility Verification Enhanced by Keyword and Contextual Embeddings, KIISE Vol. 49, No. 10.


Sangah Lee and Hyopil Shin (2021), The Korean Morphologically Tight-Fitting Tokenizer for Noisy User-Generated Texts, Proceedings of the 2021 EMNLP Workshop W-NUT: The Seventh Workshop on Noisy User-generated Text.


Sangah Lee and Hyopil Shin (2021), Combining Sentiment-Combined Model with Pre-Trained BERT Models for Sentiment Analysis, KIISE, Vol. 40, No. 7.


Sana Lee, Hansol Jang, Yunmee Baik, Suzi Park and Hyopil Shin (2020), A Small-Scale Korean-Specific BERT Language Model, Journal of KIISE, Vol 47., No.7.


Sujin Choi, Hyopil Shin and Seung-Shik Kang (2020), Predicting Audience-Rated News Quality: Using Survey, Text Mining, and Neural Network Methods, Digital Journalism, vol 9.


Suzi Park and Hyopil Shin (2019), Leveraging More Fine-grained Representation to Reduce Instability within Word Embeddings, Language and Information Vol. 23, No. 3.


Sana Lee, and Hyopil Shin (2018), An Analysis of Linear Argumentation Structure of Korean Debate texts Using Sequential Modeling and Linguistic Features, Journal of KIISE vol. 45 No. 12.


Timour Igamberdiev, and Hyopil Shin (2018), Metaphor Identification with Paragraph and Word Vectorization: An Attention-Based Neural Approach, Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation.


Youngsam Kim and Hyopil Shin (2018), Measuring Semantic Orientation of Words Using Temporal Difference Learning, Journal of KIISE vol. 45 No. 12.


Suzi Park, and Hyopil Shin (2018), Grapheme-level Awareness in Word Embeddings for Morphologically Rich Languages, 11th Edition of the Language Resources and Evaluation Conference(LREC2018).


Youngsam Kim, and Hyopil Shin (2017), Finding Sentiment Dimension in Vector Space of Movie Reviews: An Unsupervised Approach, Journal of Cognitive Science 18-1.


Akihiko Yamada, and Hyopil Shin (2017), Applying Word Embeddings to Measure the Semantic Adaptation of English Loanwords in Japanese and Korean, Language Research 23-3.


Munhyong Kim, and Hyopil Shin (2016), Automatic Product Review Helpfulness Estimation based on Review Information Types, Journal of KIISE vol. 43 No. 9.


Hyopil Shin, Munhyong Kim, and Suzi Park (2016), Modality-based Sentiment Analysis through the Utilization of the Korean Sentiment Analysis Corpus, Eoneohag 74.


Yulia Otmakhova and Hyopil Shin (2015), Do we Really Need Lexical Information? Towards a Top-down Approach to Sentiment Analysis of Product Reviews, NAACL-HLT 2015, pp. 1559-1568.


Munhyong Kim and Hyopil Shin (2014), Pinpointing Sentence-Level Subjectivity through Balanced Subjectivity and Objectivity Features, Lecture Notes in Computer Science: Advances in Natural Language Processing, Springer.


Hyopil Shin (2014), A Corpus Study of Nested Sources for Subjectivity Analysis, Eoneohag 69.


Suzi Park and Hyopil Shin (2014), Identification of Implicit Topics in Twitter Data Not Containing Explicit Search Queries, COLING 2014.


Hyopil Shin and Munhyong Kim (2013), Specifications and Analysis of the Korean Sentiment Analysis Corpus, Language Research 49-2.


Youngsam Kim, Honggi Kim, and Hyopil Shin (2013), A Comparative Study of Entity-Grid and LSA Models on Korean Sentence Ordering, Korean Journal of Cognitive Science 24-4.


Youngsam Kim, Munhyong Kim, Andrew Cattle, Julia Otmakhova, Suzi Park, and Hyopil Shin (2013), Applying Graph-based Keyword Extraction to Document Retrieval, IJCNLP 2013.


Youngsam Kim, and Hyopil Shin (2013), Romanization-based Approach to Morphological Analysis in Korean SMS Text Processing, IJCNLP 2013.


Hayeon Jang, Munhyong Kim, and Hyopil Shin (2013), KOSAC: A Full-fledged Korean Sentiment Analysis Corpus, 27th Pacific Asia Conference on Language, Information, and Computation.


Munhyong Kim, Yu-Mi Jo, Hayeon Jang, and Hyopil Shin (2013), KOSAC (Korean Sentiment Analysis Corpus): A Korean Sentiment and Opinion Analysis Corpus (in Korean), Korea Computer Congress 2013.


Munhyong Kim, Yu-Mi Jo, Hyun-Jo You, Yoon-shin Kim, Hayeon Jang, Seungho Nam, and Hyopil Shin (2012), Semantic Types and Representation of Korean Set Time Expressions, Language and Information 16-1. 


Yu-Mi Jo, Munhyong Kim, Hyun-Jo You, Yun-Shin Kim, Seungho Nam, and Hyopil Shin (2011), Problematic Set-Denoting Temporal Expressions in the Framework of ISO-TimeML, ICSC2011 Workshop on Semantic Annotation for Computational Linguistics Resources.


Hyun-Jo You, Hayeon Jang, Yu-Mi Jo, Yun-Shin Kim, Seungho Nam, and Hyopil Shin (2011), The Korean TimeML: A Study of Event and Temporal Information in Korean Texts, Language and Information 15-1.


Hayeon Jang and Hyopil Shin (2010), Sentiment Analysis of Korean Using Effective Linguistic Features and Adjustment of Word Senses, Language and Information 14-2.


Minsu Ko and Hyopil Shin (2010), Grading System of Movie Review through the Use of An Appraisal Dictionary and Computation of Semantic Segment, Korean Journal of Cognitive Science 21-4.


Juliano Paiva Junho, Yumi Jo and Hyopil Shin (2010), The KOLON System: Tools for Ontological Natural Language Processing in Korean, PACLIC24.


Hayeon Jang and Hyopil Shin (2010), Effective Use of Linguistic Features for Sentiment Analysis of Korean, PACLIC24.


Hayeon Jang and Hyopil Shin (2010), Language-Specific Sentiment Analysis in Morphologically Rich Languages, COLING2010.


Hyopil Shin (2010), KOLON(the KOrean Lexicon mapped onto the Mikrokosmos ONtology): Mapping Korean Words onto the Mikrokosmos Ontology and Combining Lexical Resources, Eoneohak 56.


Hyopil Shin and Hyunjo You (2009), Hybrid N-gram Probability Estimation in Morphologically Rich Languages, The 23rd Pacific Asia Conference on Language, Information, and Computation, Hong Kong.


Seohyun Im, Yoonshin Kim, Youmi Cho, Hayun Jang, Minsu Ko, Seungho Nam, and Hyopil Shin (2009), KTARSQI: The Annotation of Temporal and Event Expressions in Korean, 21st Annual Conference on Human and Cognitive Language Technology.


Hyunjo You, Munhyung Kim, Juliano Junho, Seungho Nam and Hyopil Shin (2009), Saken: the Korean Event Tagger, 21st Annual Conference on Human and Cognitive Language Technology.


Hyopil Shin (2009), Linguistics and Statistical Models(언어학과 통계 모델), Seoul National University Press.


Seohyun Im, Hyunjo You, Hayun Jang, Seungho Nam, and Hyopil Shin (2009), KTimeML: Specification of Temporal and Event Expressions in Korean Text, The 7th Workshop on Asian Language Resources, Association for Computational Linguistics.


Hyopil Shin and Insik Cho (2008), A Noun-Predicate Bigram-based Similarity Measure for Lexical Relations, Lecture Notes in Artificial Intelligence 5221, Springer.


Jung-Min Kim, Byoung-Il Choi, Hyo-Pil Shin and Hyoung-Joo Kim (2007), A methodology for constructing of philosophy ontology based on philosophical texts, Computer Standards & Interfaces 29-3.


Jung-Min Kim, Hyopil Shin, and Hyoung-Joo Kim (2007), Schema and Constraints-based Matching and Merging of Topic Maps, Information Processing and Management 43-3.


Hyopil Shin (2007), Mapping Korean Basic Verbs to the Mikrokosmos Ontology (in Korean), Eoneohak 49.


Hyopil Shin (2007), A Statistical Approach to Collocations Based on the Log Likelihood Ratio (in Korean), Eoneohak 47.


Hyopil Shin (2006), A Flat Korean Analysis Based on the Typed Feature Structures and LKB (Linguistic Knowledge Building) (in Korean), Eoneohak 44. 


Jung-Min Kim, Hyopil Shin, and Hyoung-Joo Kim (2006), A Multi-Strategic Mapping Approach for Distributed Topic Maps (in Korean), Journal of KISS: Software and Applications 33-1. https://koreascience.kr/article/JAKO200612842607060.view?orgId=anpor&hide=breadcrumb,journalinfo


Hyopil Shin (2005), Some Considerations on the Analysis of Linguistic Data based on Statistics (in Korean), Language Research 41-3.


Hyopil Shin (2005), Ontological Semantics (in Korean), Semantic and Syntactic Structure and Beyond. 


Hyopil Shin (2004), Ontology-based Conceptual Structures and Lexical Mapping (in Korean), Language Research 40-3.


Insik Cho, HyunJo Yoo and Hyopil Shin (2004), Specialized Words in the 21st Sejong Electronic Dictionary (in Korean), Korean Dictionary 3.


Hyopil Shin (2004), Ontology and Semantic Web As a Knowledge Base (in Korean), Communications of the Korean Information Processing.


Hyopil Shin (2003), Constructing A Korean-English Bilingual Dictionary For Well-formed English Sentence Generations in A Gloss-based System (in Korean), Korean Journal of Cognitive Science 14-2.


Hyopil Shin (2003), A Knowledge-based Question-Answering System: With a View to Constructing A Fact Database (in Korean), Korean Journal of Cognitive Science 13-1.


Hyopil Shin (2001), Toward More Efficient Processing of Typed Feature Structures in Korean, Eoneohak 29.


Hyopil Shin and Eugene Koontz (2001), KaBAL(Knowledge Base Access Language): A Language For Querying And Editing XML Documents, Applied To Linguistic Knowledge Base, IEEE NLP-KE, Tucson, USA.


Hyopil Shin and W. Ogden (2001), Combining Summarization and Machine Translation Facilities to Build an Interactive Cross-Language Retrieval System, The 19th International Conference on Computer Processing of Oriental Languages, Korea. 


Hyopil Shin and Spencer Koehler (2000), A Knowledge-Based Fact Database: Acquisition to Application, Knowledge Based Computer Systems 3, Allied Publisher. 


Hyopil Shin and Spencer Koehler (2000), Acquiring Factual Knowledge Through Ontological Instantiation, The Series of Lecture Notes in Computer Science, vol. 1886, Springer-Verlag Publisher.


Hyopil Shin and Jerrold Stach (2000), Using Long Runs as Predictors of Semantic Coherence in a Partial Document Retrieval System, Workshop of Syntax and Semantic Complexity in Natural Language Processing Systems, ANLP/NAACL 2000, Seattle, USA.


Hyopil Shin and Jerrold Stach (1999), Incorporating Probabilistic Semantic Categories (SEMCATs) Into Vector Space Techniques For Partial Document Retrieval, Journal of Computer Systems and Information Management, vol. 2-3, Maximillan Press. 


Hyopil Shin (1999), The VP-barrier Algorithm for a Robust Syntactic Processing in Head-Final Languages, Proceedings of the Natural Language Processing Pacific Rim Symposium, Beijing. 


Hyopil Shin (1999), Maximally Efficient Syntactic Parsing with Minimal Resources, 99 Korean and Korean Language Processing.


Hyopil Shin (1999), Syntactic and Semantic Interfaces for Lexically Unrealized Relations, Proceedings of Mid-America Linguistics Conference, University of Kansas.


W. Ogden, J. Cowie, M. Davis, E. Ludovik, H. Molina-Salgado and Hyopil Shin (1999), Getting Information from Documents You Cannot Read: An Interactive Cross-Language Text Retrieval and Summarization System, Joint ACM Digital Library/SIGIR Workshop on Multilingual Information Discovery and Access (MIDAS), Univ. of California, Berkeley.


Hyopil Shin, Incorporating Semantic Categories Into Partial Information Retrieval System, M.S. Thesis, University of Missouri-Kansas City. 


Hyopil Shin (1996), Syntactic and Semantic Structure of the Korean Relative Constructions: A Unified Approach, Taehaksa. 


J. Oh and Hyopil Shin (1995), Lexaurus: A Multilingual, Ontology-based Bilingual Electronic Dictionary, Language Research 31-3, Seoul National University.