AI4MOL/Mat-Instruction

Under Construction

Mat-Instructions: A Large-Scale Inorganic Material Instruction Dataset for Large Language Models

Abstract

Recent advancements in large language models (LLMs) have revolutionized research discovery across various scientific disciplines, including materials science. The discovery of novel materials, particularly crystal materials, is essential for achieving sustainable development goals (SDGs), as these materials drive breakthroughs in climate change mitigation, clean and affordable energy, and the promotion of industrial innovation. However, unlocking the full potential of LLMs in materials research remains challenging due to the lack of high-quality, diverse, and instruction-based datasets. Such datasets are crucial for guiding these models in understanding and predicting the structure, property, and function of materials across various tasks. To address this limitation, we introduce Mat-Instruction, a large-scale inorganic material instruction dataset, specifically designed to unlock the potential of LLMs in materials science. Extensive experiments on fine-tuning LLaMA with our Mat-Instruction dataset demonstrate its effectiveness in advancing materials science.

Links

Todo

  • ✅ Release the main code and dataset
  • Reconstruct the code structure and release the eval pipeline
  • Add pretrained models

Citation

If you find this work useful in your research, please consider citing:

@inproceedings{ijcai2025p1089,
    title     = {Mat-Instructions: A Large-Scale Inorganic Material Instruction Dataset for Large Language Models},
    author    = {Liu, Ke and Gao, Shangde and Fu, Yichao and Wu, Xiaoliang and Tong, Shuo and Rajan, Ajitha},
    booktitle = {Proceedings of the Thirty-Fourth International Joint Conference on
                Artificial Intelligence, {IJCAI-25}},
    publisher = {International Joint Conferences on Artificial Intelligence Organization},
    editor    = {James Kwok},
    pages     = {9799--9807},
    year      = {2025},
    month     = {8},
    note      = {AI and Social Good},
    doi       = {10.24963/ijcai.2025/1089},
    url       = {https://doi.org/10.24963/ijcai.2025/1089},
    }

About

[IJCAI'25] Mat-Instructions: A Large-Scale Inorganic Material Instruction Dataset for Large Language Models
