LLM-assisted tool to annotate research data with machine-understandable, semantic data dictionaries
Contributor: Tvisha Vedant
Mentors: JB Poline, Arman Jahanpour, Sebastian Urchs, Alyssa Dai, bcmcpher
To participate in Neurobagel query federation, datasets must conform to Neurobagel’s data model, so they must be annotated to harmonize them for federated queries. This project aims to reduce the manual effort of annotating individual data elements by adding Large Language Model (LLM) assistance to Neurobagel’s existing annotation tool. An LLM-based assistant will categorize and annotate each data element, and a human will then verify its suggestions. The work covers building the LLM-based automation, integrating it into the existing web page, and updating the UI accordingly.
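The suggest-then-verify workflow described above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CATEGORIES` list is not Neurobagel's actual vocabulary, `stub_classifier` is a placeholder standing in for a real LLM call, and `Suggestion`, `suggest_annotations`, and `apply_review` are hypothetical helpers, not part of the Neurobagel annotation tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical category labels for illustration only;
# not Neurobagel's actual data model vocabulary.
CATEGORIES = ["participant_id", "age", "sex", "diagnosis", "assessment", "unknown"]


@dataclass
class Suggestion:
    column: str             # column name from the tabular dataset
    category: str           # category proposed by the assistant
    verified: bool = False  # set to True only after human review


def suggest_annotations(
    columns: List[str], classify: Callable[[str], str]
) -> List[Suggestion]:
    """Ask the classifier for a category per column, unverified by default."""
    suggestions = []
    for column in columns:
        label = classify(column)
        # Guard against labels outside the allowed vocabulary.
        if label not in CATEGORIES:
            label = "unknown"
        suggestions.append(Suggestion(column=column, category=label))
    return suggestions


def apply_review(suggestion: Suggestion, corrected: Optional[str] = None) -> Suggestion:
    """Record the human decision: accept as-is, or override the category."""
    if corrected is not None:
        if corrected not in CATEGORIES:
            raise ValueError(f"Unknown category: {corrected}")
        suggestion.category = corrected
    suggestion.verified = True
    return suggestion


def stub_classifier(column: str) -> str:
    """Placeholder for an LLM call (assumption): simple name-based rules."""
    name = column.lower()
    if name.endswith("_id"):
        return "participant_id"
    if name in ("age", "sex"):
        return name
    return "unknown"


# Example column names are made up for this sketch.
suggestions = suggest_annotations(["participant_id", "age", "moca_total"], stub_classifier)
# A reviewer corrects the column the assistant could not categorize.
apply_review(suggestions[2], corrected="assessment")
```

Keeping `verified` false until `apply_review` runs reflects the idea that the assistant only proposes annotations and a person makes the final decision. Passing the classifier in as a function means the rule-based stub could later be swapped for an actual model call without changing the review logic.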
- Reduce the human effort to manually annotate individual data elements by automating the current annotation tool provided by Neurobagel using Large Language Models (LLMs).