Fajri Koto
2020 – today
- 2024
- [c27]Fajri Koto, Haonan Li, Sara Shatnawi, Jad Doughman, Abdelrahman Boda Sadallah, Aisha Alraeesi, Khalid Almubarak, Zaid Alyafeai, Neha Sengupta, Shady Shehata, Nizar Habash, Preslav Nakov, Timothy Baldwin:
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic. ACL (Findings) 2024: 5622-5640
- [c26]Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, Timothy Baldwin:
CMMLU: Measuring massive multitask language understanding in Chinese. ACL (Findings) 2024: 11260-11285
- [c25]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Tjeng Wawan Cenggoro, Jhonson Lee, Salsabil Maulana Akbar, Emmanuel Dave, Nuur Shadieq, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. ACL (1) 2024: 14899-14914
- [c24]Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych, Timothy Baldwin:
Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon. EACL (1) 2024: 298-320
- [c23]Chen Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych:
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings. NAACL-HLT 2024: 2016-2039
- [i27]Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych, Timothy Baldwin:
Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon. CoRR abs/2402.02113 (2024)
- [i26]Fajri Koto, Haonan Li, Sara Shatnawi, Jad Doughman, Abdelrahman Boda Sadallah, Aisha Alraeesi, Khalid Almubarak, Zaid Alyafeai, Neha Sengupta, Shady Shehata, Nizar Habash, Preslav Nakov, Timothy Baldwin:
ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic. CoRR abs/2402.12840 (2024)
- [i25]Fajri Koto, Rahmad Mahendra, Nurul Aisyah, Timothy Baldwin:
IndoCulture: Exploring Geographically-Influenced Cultural Commonsense Reasoning Across Eleven Indonesian Provinces. CoRR abs/2404.01854 (2024)
- [i24]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Emmanuel Dave, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Salsabil Maulana Akbar, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. CoRR abs/2404.06138 (2024)
- [i23]David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay P. Gala, Jiahui Geng, Jesús-Germán Ortiz-Barajas, Jinheon Baek, Jocelyn Dunstan, Laura Alonso Alemany, Kumaranage Ravindu Yasas Nagasinghe, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome A. Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal A. Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio, Alham Fikri Aji:
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark. CoRR abs/2406.05967 (2024)
- [i22]Holy Lovenia, Rahmad Mahendra, Salsabil Maulana Akbar, Lester James V. Miranda, Jennifer Santoso, Elyanah Aco, Akhdan Fadhilah, Jonibek Mansurov, Joseph Marvin Imperial, Onno Pepijn Kampman, Joel Ruben Antony Moniz, Muhammad Ravi Shulthan Habibi, Frederikus Hudi, Railey Montalan, Ryan Ignatius, Joanito Agili Lopo, William Nixon, Börje F. Karlsson, James Jaya, Ryandito Diandaru, Yuze Gao, Patrick Amadeus Irawan, Bin Wang, Jan Christian Blaise Cruz, Chenxi Whitehouse, Ivan Halim Parmonangan, Maria Khelli, Wenyu Zhang, Lucky Susanto, Reynard Adha Ryanda, Sonny Lazuardi Hermawan, Dan John Velasco, Muhammad Dehan Al Kautsar, Willy Fitra Hendria, Yasmin Moslem, Noah Flynn, Muhammad Farid Adilazuarda, Haochen Li, Johanes Lee, R. Damanhuri, Shuo Sun, Muhammad Reza Qorib, Amirbek Djanibekov, Wei Qi Leong, Quyet V. Do, Niklas Muennighoff, Tanrada Pansuwan, Ilham Firdausi Putra, Yan Xu, Ngee Chia Tai, Ayu Purwarianti, Sebastian Ruder, William-Chandra Tjhi, Peerat Limkonchotiwat, Alham Fikri Aji, Sedrick Keh, Genta Indra Winata, Ruochen Zhang, Fajri Koto, Zheng Xin Yong, Samuel Cahyawijaya:
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages. CoRR abs/2406.10118 (2024)
- [i21]Fajri Koto:
Cracking the Code: Multi-domain LLM Evaluation on Real-World Professional Exams in Indonesia. CoRR abs/2409.08564 (2024)
- 2023
- [c22]Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Muhammad Satrio Wicaksono, Ivan Halim Parmonangan, Ika Alfina, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Akbar Septiandri, James Jaya, Kaustubh D. Dhole, Arie Ardiyanti Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Muhammad Farid Adilazuarda, Ryan Hadiwijaya, Ryandito Diandaru, Tiezheng Yu, Vito Ghifari, Wenliang Dai, Yan Xu, Dyah Damapuspita, Haryo Akbarianto Wibowo, Cuk Tho, Ichwanul Muslim Karo Karo, Tirana Fatyanosa, Ziwei Ji, Graham Neubig, Timothy Baldwin, Sebastian Ruder, Pascale Fung, Herry Sujaini, Sakriani Sakti, Ayu Purwarianti:
NusaCrowd: Open Source Initiative for Indonesian NLP Resources. ACL (Findings) 2023: 13745-13818
- [c21]Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung:
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. EACL 2023: 815-834
- [c20]Fajri Koto, Nurul Aisyah, Haonan Li, Timothy Baldwin:
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU. EMNLP 2023: 12359-12374
- [c19]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Maulana Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Wahyuning Linuwih, Bryan Wilie, Galih Pradipta Muridan, Genta Indra Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. IJCNLP (1) 2023: 921-945
- [i20]Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin:
Bactrian-X : A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation. CoRR abs/2305.15011 (2023)
- [i19]Haonan Li, Yixuan Zhang, Fajri Koto, Yifei Yang, Hai Zhao, Yeyun Gong, Nan Duan, Timothy Baldwin:
CMMLU: Measuring massive multitask language understanding in Chinese. CoRR abs/2306.09212 (2023)
- [i18]Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Alham Fikri Aji, Zhengzhong Liu, Andy Hock, Andrew Feldman, Jonathan Lee, Andrew Jackson, Preslav Nakov, Timothy Baldwin, Eric P. Xing:
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models. CoRR abs/2308.16149 (2023)
- [i17]Chen Cecilia Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych:
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings. CoRR abs/2309.08591 (2023)
- [i16]Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Maulana Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Wahyuning Linuwih, Bryan Wilie, Galih Pradipta Muridan, Genta Indra Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. CoRR abs/2309.10661 (2023)
- [i15]Fajri Koto, Nurul Aisyah, Haonan Li, Timothy Baldwin:
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU. CoRR abs/2310.04928 (2023)
- [i14]Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar, Richard Fan, Yi Gu, Victor Miller, Yonghao Zhuang, Guowei He, Haonan Li, Fajri Koto, Liping Tang, Nikhil Ranjan, Zhiqiang Shen, Xuguang Ren, Roberto Iriondo, Cun Mu, Zhiting Hu, Mark Schulze, Preslav Nakov, Tim Baldwin, Eric P. Xing:
LLM360: Towards Fully Transparent Open-Source LLMs. CoRR abs/2312.06550 (2023)
- 2022
- [j1]Fajri Koto, Timothy Baldwin, Jey Han Lau:
FFCI: A Framework for Interpretable Automatic Evaluation of Summarization. J. Artif. Intell. Res. 73 (2022)
- [c18]Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder:
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. ACL (1) 2022: 7226-7249
- [c17]Fajri Koto, Timothy Baldwin, Jey Han Lau:
LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization. COLING 2022: 3427-3437
- [i13]Alham Fikri Aji, Genta Indra Winata, Fajri Koto, Samuel Cahyawijaya, Ade Romadhony, Rahmad Mahendra, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Timothy Baldwin, Jey Han Lau, Sebastian Ruder:
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia. CoRR abs/2203.13357 (2022)
- [i12]Genta Indra Winata, Alham Fikri Aji, Samuel Cahyawijaya, Rahmad Mahendra, Fajri Koto, Ade Romadhony, Kemal Kurniawan, David Moeljadi, Radityo Eko Prasojo, Pascale Fung, Timothy Baldwin, Jey Han Lau, Rico Sennrich, Sebastian Ruder:
NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages. CoRR abs/2205.15960 (2022)
- [i11]Samuel Cahyawijaya, Alham Fikri Aji, Holy Lovenia, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Fajri Koto, David Moeljadi, Karissa Vincentio, Ade Romadhony, Ayu Purwarianti:
NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages. CoRR abs/2207.10524 (2022)
- [i10]Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Rahmad Mahendra, Christian Wibisono, Ade Romadhony, Karissa Vincentio, Fajri Koto, Jennifer Santoso, David Moeljadi, Cahya Wirawan, Frederikus Hudi, Ivan Halim Parmonangan, Ika Alfina, Muhammad Satrio Wicaksono, Ilham Firdausi Putra, Samsul Rahmadani, Yulianti Oenang, Ali Akbar Septiandri, James Jaya, Kaustubh D. Dhole, Arie Ardiyanti Suryani, Rifki Afina Putri, Dan Su, Keith Stevens, Made Nindyatama Nityasya, Muhammad Farid Adilazuarda, Ryan Ignatius, Ryandito Diandaru, Tiezheng Yu, Vito Ghifari, Wenliang Dai, Yan Xu, Dyah Damapuspita, Cuk Tho, Ichwanul Muslim Karo Karo, Tirana Noor Fatyanosa, Ziwei Ji, Pascale Fung, Graham Neubig, Timothy Baldwin, Sebastian Ruder, Herry Sujaini, Sakriani Sakti, Ayu Purwarianti:
NusaCrowd: Open Source Initiative for Indonesian NLP Resources. CoRR abs/2212.09648 (2022)
- 2021
- [c16]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Evaluating the Efficacy of Summarization Evaluation across Languages. ACL/IJCNLP (Findings) 2021: 801-812
- [c15]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Top-down Discourse Parsing via Sequence Labelling. EACL 2021: 715-726
- [c14]Fajri Koto, Jey Han Lau, Timothy Baldwin:
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization. EMNLP (1) 2021: 10660-10668
- [c13]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Discourse Probing of Pretrained Language Models. NAACL-HLT 2021: 3849-3864
- [i9]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Top-down Discourse Parsing via Sequence Labelling. CoRR abs/2102.02080 (2021)
- [i8]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Discourse Probing of Pretrained Language Models. CoRR abs/2104.05882 (2021)
- [i7]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Evaluating the Efficacy of Summarization Evaluation across Languages. CoRR abs/2106.01478 (2021)
- [i6]Fajri Koto, Jey Han Lau, Timothy Baldwin:
IndoBERTweet: A Pretrained Language Model for Indonesian Twitter with Effective Domain-Specific Vocabulary Initialization. CoRR abs/2109.04607 (2021)
- 2020
- [c12]Fajri Koto, Afshin Rahimi, Jey Han Lau, Timothy Baldwin:
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP. COLING 2020: 757-770
- [c11]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Liputan6: A Large-scale Indonesian Dataset for Text Summarization. AACL/IJCNLP 2020: 598-608
- [c10]Fajri Koto, Ikhwan Koto:
Towards Computational Linguistics in Minangkabau Language: Studies on Sentiment Analysis and Machine Translation. PACLIC 2020: 138-148
- [i5]Fajri Koto, Ikhwan Koto:
Towards Computational Linguistics in Minangkabau Language: Studies on Sentiment Analysis and Machine Translation. CoRR abs/2009.09309 (2020)
- [i4]Fajri Koto, Afshin Rahimi, Jey Han Lau, Timothy Baldwin:
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP. CoRR abs/2011.00677 (2020)
- [i3]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Liputan6: A Large-scale Indonesian Dataset for Text Summarization. CoRR abs/2011.00679 (2020)
- [i2]Fajri Koto, Jey Han Lau, Timothy Baldwin:
FFCI: A Framework for Interpretable Automatic Evaluation of Summarization. CoRR abs/2011.13662 (2020)
2010 – 2019
- 2019
- [c9]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Improved Document Modelling with a Neural Discourse Parser. ALTA 2019: 67-76
- [i1]Fajri Koto, Jey Han Lau, Timothy Baldwin:
Improved Document Modelling with a Neural Discourse Parser. CoRR abs/1911.06919 (2019)
- 2017
- [c8]Fajri Koto, Gemala Y. Rahmaningtyas:
Inset lexicon: Evaluation of a word list for Indonesian sentiment analysis in microblogs. IALP 2017: 391-394
- 2016
- [c7]Fajri Koto:
A Publicly Available Indonesian Corpora for Automatic Abstractive and Extractive Chat Summarization. LREC 2016
- 2015
- [c6]Fajri Koto, Sakriani Sakti, Graham Neubig, Tomoki Toda, Mirna Adriani, Satoshi Nakamura:
A Study on Natural Expressive Speech: Automatic Memorable Spoken Quote Detection. IWSDS 2015: 145-152
- [c5]Fajri Koto, Mirna Adriani:
The Use of POS Sequence for Analyzing Sentence Pattern in Twitter Sentiment Analysis. AINA Workshops 2015: 547-551
- [c4]Fajri Koto, Mirna Adriani:
HBE: Hashtag-Based Emotion Lexicons for Twitter Sentiment Analysis. FIRE 2015: 31-34
- [c3]Fajri Koto, Mirna Adriani:
A Comparative Study on Twitter Sentiment Analysis: Which Features are Good? NLDB 2015: 453-457
- 2014
- [c2]Fajri Koto, Sakriani Sakti, Graham Neubig, Tomoki Toda, Mirna Adriani, Satoshi Nakamura:
The use of semantic and acoustic features for open-domain TED talk summarization. APSIPA 2014: 1-4
- [c1]Fajri Koto, Sakriani Sakti, Graham Neubig, Tomoki Toda, Mirna Adriani, Satoshi Nakamura:
Memorable spoken quote corpora of TED public speaking. O-COCOSDA 2014: 1-4
last updated on 2024-10-14 23:31 CEST by the dblp team
all metadata released as open data under CC0 1.0 license