Our research focuses on a wide variety of machine learning and large language model applications in cancer care. This includes both improving direct patient care and addressing operational challenges such as scheduling and documentation. We're ultimately trying to answer the question: how can these technologies make healthcare better for patients, providers, and researchers?
Large language models have already begun transforming the research landscape. Previously prohibitive natural language processing tasks have become approachable to far more investigators, opening new lines of research involving unstructured data. A major challenge, however, is the rapidly evolving set of large language model tools and infrastructure.
Many clinicians and researchers have ideas for applications but lack the contextual or technical knowledge needed to implement them, or to weigh the trade-offs of different approaches. To begin, install Git, Python, and pip on the computer, and run the verification commands in a terminal to confirm the installation. Then run the git clone command to download the repository and install the necessary requirements, as sketched below.
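As a sketch of those steps, assuming Python 3 and using placeholders for the repository URL and folder name:

    git --version
    python3 --version
    pip --version
    git clone <repository-url>
    cd <repository-folder>
    pip install -r requirements.txt    # assumes the repository ships a requirements.txt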
To create a vector database, edit the config.py file, replacing the value of the relevant variable with the file path to the folder containing the documents that will be used to augment the large language model (here, the articles directory), then save the updated file.
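For illustration only, the edit might look like the following; the variable name here is hypothetical and should be replaced with the one actually defined in config.py:

    # config.py (variable name is hypothetical)
    DOCUMENTS_PATH = "/path/to/articles"  # folder of documents used to augment the LLM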
Next, in a terminal in the same directory, execute python3 build_index.py to create and persist the database, and verify that the database is now saved in the vector database folder. To query the augmented LLM, run python3 run_augmented_llm.py in the terminal. Enter test queries to receive responses augmented by the data from the document set, then press Ctrl+C to exit when finished.
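Assuming the script names above and an illustrative name for the output folder, the terminal workflow is roughly:

    python3 build_index.py          # create and persist the vector database
    ls vector_db                    # folder name assumed; confirm the database was written
    python3 run_augmented_llm.py    # enter queries interactively; press Ctrl+C to exit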
To create MCQs, edit the file questions.py, taking note of the format of the examples. Add questions following a similar format, and save the file.
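A sketch of an added entry follows; the field names are illustrative, and the actual structure should mirror the examples already in questions.py:

    # questions.py (schema shown here is an assumption)
    questions = [
        {
            "question": "Example question text goes here?",
            "options": ["A. First option", "B. Second option", "C. Third option", "D. Fourth option"],
            "answer": "A",
        },
    ]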
If the objective is to benchmark against models from OpenAI or Hugging Face, edit the config.py file and add the API key for the corresponding provider.
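As an illustration, with variable names assumed rather than taken from the repository:

    # config.py (variable names are assumptions)
    OPENAI_API_KEY = "sk-..."          # needed only to benchmark OpenAI models
    HUGGINGFACE_API_KEY = "hf_..."     # needed only to benchmark Hugging Face models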
Now save the file. Then edit the compare_llms.py file and choose the set of models to test against by uncommenting them, as in the sketch below.
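A sketch of that selection step, with illustrative model identifiers:

    # compare_llms.py (list name and identifiers are illustrative)
    MODELS_TO_COMPARE = [
        "gpt-4o",
        # "gpt-3.5-turbo",            # uncomment to include in the comparison
        "mistralai/Mistral-7B-Instruct-v0.2",
    ]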
Finally, in a terminal, execute the code with python3 compare_llms.py, and after execution, view the model responses in the specified folder for grading or other review. Among the MCQs tested, the embedding-augmented model performed significantly better than the base one, as did the augmented OpenAI models.