One modern cancer research paradigm holds that understanding the genetic alterations leading to cancer will lead to improvements in prevention, early detection, identifying prognostic markers, and developing targeted therapies. This oncology database in Microsoft Access spreads across seven primary tables, including a patient's table with clinical information, a TMA table with protein expression data, and a DNA specimens table. With mutational data, relationships are developed to link the data among the tables so that the laboratory studies of tissue belonging to a particular patient can be connected with his or her clinical history.
The linked data can then be retrieved via a query of the database and statistically analyzed to determine relationships among protein expression, DNA, mutations and clinical outcomes. Ultimately, linking clinical and laboratory data facilitates the study of genetic aberrations and protein expression from the perspective of clinical endpoints in cancer patients. A key component of translational research is the ability to accrue clinical and basic science research data, and amalgamate the two in a meaningful relationship.
However, the bioinformatics tools necessary to do this are often limited and institution dependent. This database was developed to help our researchers answer key questions relevant to translational cancer research, such as how a patient's genetic and proteomic makeup can contribute to certain clinical endpoints, such as response to therapy or survival. Though this database was developed for researchers and clinicians involved with thoracic malignancies, this model can be applied to other diseases and to suit the needs of researchers and other institutions.
We hope that you will adopt this protocol for your own database projects. Prior to initiating a translational database project, it is essential to develop IRB approved protocols. To safeguard patient confidentiality, label all the collected specimens, tissue, blood, or other bodily fluids with a barcoded unique identifier, and appropriately document them in the sample procurement form.
To begin developing the database, first create primary tables to capture domains of interest, such as information about patients sample data, tissue microarray, and DNA specimens. Other tables of interest may include chest oncology conference, cell lines, and gans tables. In order to create a table in Microsoft Access, go to the create tab and select table.
Initially, the table is presented in data sheet view. However, in order to enter variables of interest, the table must be presented in design view. Under the home tab, select view, design view, enter and describe variables of interest in design view for the patient's table.
This will include the medical record number, last name, first name, date of birth, et cetera. In design view, the date of field must be described with a data type as the medical record number is a seven digit number. This variable will be described as a numeric one.
Since the variable last name is text, it is listed as a free text variable. It is also helpful to provide a description for the variable. For example, for the variable sex, it is important for the data entry person to know that male should be entered as one and female as two, design the tables to limit redundancy.
The patient's table should be the only place where detailed information about the patients, their cancer, their clinical course, their risk factors, and their outcomes are kept within the database. Develop a sample data table to link pathological specimens to their patient sources as multiple specimens may be obtained from each patient. This table should link each specimen's identification number to the patient's medical record number in A TMA table capture protein expression data resulting from tissue microarray immunohistochemical analysis.
Label the protein that has been studied and identify the level of expression of that protein using a standardized scoring system. Also identify the type of tissue that has been analyzed, such as tumor or normal tissue list, all DNA specimens that have undergone mutational analysis. In a DNA specimens table, clearly annotate genetic alterations, such as a description of the exact mutation, the gene for which there is a mutation, and the end result of such a mutation.
Generate a fifth chest oncology conference table to describe the weekly meetings of the thoracic oncology clinical team. Include pertinent information resulting from these meetings such as the availability of pathological specimens For each patient, establish a freestanding cell lines table that describes the cell lines utilized in the laboratory. For research purposes, describe important characteristics belonging to each cell.
Finally, create another freestanding table for CL elgan to list autologous receptor, tyrosine kinases and associated needle score. A measure of similarity with human proteins. To facilitate queries from multiple tables, individual tables must be linked via primary keys.
Establish a primary key or unique identifier for each table. A unique identifier is a number or word that represents only one unique individual and cannot be listed twice in a table. For example, a patient's medical record number is the primary key in the patient's table.
To establish a primary key, go to the design view of the table of interest. Right click on the field. That will become the primary key, and click the option primary key.
To link the tables, go to the database tools tab and click on the relationships icon. Drag the mouse from the field of interest in one table to the corresponding field in another table. In this manner, two tables are connected with relationships in place.
A query can then be run To obtain related information from multiple tables, select the query design option under the create tab, select and display tables, including the fields of interest. Then select variables of interest from each table. Run the query to produce a spreadsheet form of results and save the data using an appropriate file extension to import data.
Verify that there is a perfect match between the name of variables in the imported data and the name of variables in Microsoft Access. The data cannot be successfully imported if the variables are described differently, such as if there are spaces in one but not the other. Once symmetry is obtained between the two tables, use an append or update query to import the data into access.
An append query adds new rows of data, such as new patients, while an update query will update data on entries already within the database. To export the data as a Microsoft Excel spreadsheet, go to the external data tab and click export data to Excel. Microsoft Access also allows data to be exported in other commonly used formats.
A researcher is interested in correlating information for the protein paxin across the patient's table, the TMA table and the DNA specimens table. As the tables are already linked via the medical record number, the data can readily be queried In order to obtain the relevant information, she clicks on the create tab and clicks query design. She selects the three tables of interest patients, TMA and DNA specimens from each table.
She selects the variables of interest such as the patient's name from the patient's table, the origin of the sample from the TMA table paxin expression from the TMA table and Paxin mutations from the DNA specimens table. She hits run query to generate the query and the results appear in a query spreadsheet. All of the patients for whom there are both paxin, expressional, and mutational data will appear in this spreadsheet.
She can then identify the frequency of such mutations, correlate specific mutations with changes in expression or associate mutations with clinical variables such as survival. Please note that the results demonstrated here are for sample patients and not real patients identified within the database. The results can be exported for primary data analysis by the biostatistician, and the results are shared with the researcher.
When creating a database such as this, it is important to ensure that all data are entered in a consistent manner. Data should only be entered into a field. If the researcher is certain of its validity, It is necessary to reemphasize the importance of maintaining confidentiality and keeping a database that is in strict compliance with HIPAA and IRB regulations.
With this translational database project using Microsoft Access, we have been able to build a database with nearly 3000 unique patients and have expression data on 63 unique proteins Advances that we can make in identifying such mutations have the potential to lead to improvements in our ability to prevent, detect, and treat devastating diseases.