This protocol uses a deep learning algorithm to segment tongue images and automatically separate the tongue from the whole image, which can be used for further research. This protocol compares the performance of four tongue image segmentation algorithms to select the base algorithm that can segment a tongue with a clear contour. Patented instruments and algorithms are used in this study.
It is recommended that users have professional instruments and are proficient in computer operation. To begin, prepare the self-developed handheld tongue and face diagnostic instrument to collect tongue and face images of patients. Fill in the patient's name, gender, age, and disease on the computer page.
Next, ask the patient to sit upright and position the image acquisition device connected to the computer. Then, position the face of the patient in the image acquisition instrument and ask them to extend their tongue out of their mouth to the maximum extent. Verify through the images on the computer screen that the patient is in the correct position and that the tongue and face are fully exposed.
Press the shoot button on the computer screen three times to capture three images. Then, manually select the collected images of the face and the tongue and filter them on the Image Acquisition page of the software.
Collect three images from each patient at a time and export the images to a folder for manual screening. Next, select standard, fully exposed, well-illuminated, and clear images as the samples for algorithm training and testing. Delete the images that do not meet the criteria of a standard image.
The criteria for exclusion are incomplete tongue and face exposure and images that are too dark. To perform tongue image segmentation, open LabelMe and click on the Open button in the upper left corner of the interface. Select the folder containing the image to be segmented and open the photos.
Click the Create Polygon button to trace the points. Trace the tongue and tongue coating contours, name them according to the selected areas, and save. On finishing all tracing, click Save to save the image to the data folder.
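The polygons drawn in LabelMe are saved as JSON annotations. As a minimal sketch, the snippet below reads a LabelMe-style annotation and lists its labelled regions; the field names ("shapes", "label", "points") follow the standard LabelMe export format, while the file contents and label names here are illustrative only.

```python
import json

# Example LabelMe-style annotation (illustrative values, not real data)
annotation = json.loads("""
{
  "imagePath": "patient_001.jpg",
  "shapes": [
    {"label": "tongue", "points": [[10, 10], [100, 10], [100, 120], [10, 120]]},
    {"label": "tongue_coating", "points": [[30, 30], [80, 30], [80, 90], [30, 90]]}
  ]
}
""")

def list_labels(ann):
    """Return the label of every polygon in the annotation."""
    return [shape["label"] for shape in ann["shapes"]]

print(list_labels(annotation))  # ['tongue', 'tongue_coating']
```

Each polygon's point list can then be rasterized into a per-pixel mask for training.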
Set the device to capture images with a size of 1080 x 1920 pixels and ensure the size of the filled image is 1920 x 1920 pixels. Manually screen the images of each patient to select and retain one of the three images, which were taken to guard against uncontrollable factors such as subject blinking and lens blocking. For training the model, collect data from 200 individuals, or 600 images, and retain about 200 images that are usable after the screening.
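Filling the 1080 x 1920 capture to a 1920 x 1920 square means padding the narrow dimension symmetrically. A small sketch of the padding arithmetic (the actual pixel filling depends on the imaging library used):

```python
def pad_to_square(width, height, target=1920):
    """Compute symmetric zero-padding that turns a width x height
    capture into a target x target image."""
    pad_w = target - width
    pad_h = target - height
    left, top = pad_w // 2, pad_h // 2
    return {"left": left, "right": pad_w - left,
            "top": top, "bottom": pad_h - top}

# The device captures 1080 x 1920 images; padding the width to 1920
# yields the square 1920 x 1920 input described above.
print(pad_to_square(1080, 1920))  # {'left': 420, 'right': 420, 'top': 0, 'bottom': 0}
```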
Randomly divide all the tongue images according to the image number, placing 70% of them into the Training Set folder and 30% of the images into the Test Set folder. Download and install Anaconda, Python, and LabelMe from their official websites. Activate the environment and complete the installation and adjustment of the overall environment.
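The 70/30 split by image number can be sketched as follows; the fixed seed is an arbitrary choice added here for reproducibility and is not specified by the protocol:

```python
import random

def split_dataset(image_ids, train_fraction=0.7, seed=42):
    """Randomly divide image IDs into a training set and a test set.
    The 70/30 ratio matches the protocol."""
    ids = list(image_ids)
    rng = random.Random(seed)   # seeded for reproducibility (assumption)
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

# With ~200 retained images, this yields 140 training and 60 test images.
train, test = split_dataset(range(200))
print(len(train), len(test))  # 140 60
```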
Choose the appropriate model for the research purpose. In this study, UNet, SegNet, DeepLab Version 3, and PSPNet were selected for validation. Next, build the deep learning algorithm model in the installed environment.
Using LabelMe annotation and uniform image sizing, construct the required data set in conjunction with the research content. Tune the parameters by adjusting their values and complete the model training using the Training Set. Examine the segmentation results and evaluate model performance on four metrics: precision, recall, mean pixel accuracy (MPA), and mean intersection over union (MIoU), to get a more comprehensive evaluation.
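The four metrics are all derived from per-class pixel counts of the predicted and ground-truth masks. A minimal sketch with flat lists of class indices standing in for the per-pixel comparison done after inference (the toy labels below are illustrative, not study data):

```python
def segmentation_metrics(pred, truth, num_classes=3):
    """Compute per-class precision, recall, IoU, plus MPA and MIoU.
    pred/truth are flat sequences of class indices (0 = background)."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for p, t in zip(pred, truth):
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    precision = [tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
                 for c in range(num_classes)]
    recall = [tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
              for c in range(num_classes)]
    iou = [tp[c] / (tp[c] + fp[c] + fn[c]) if tp[c] + fp[c] + fn[c] else 0.0
           for c in range(num_classes)]
    mpa = sum(recall) / num_classes   # mean pixel accuracy = mean per-class recall
    miou = sum(iou) / num_classes     # mean intersection over union
    return precision, recall, mpa, miou

# Toy example: 6 pixels, 3 classes
pred  = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
precision, recall, mpa, miou = segmentation_metrics(pred, truth)
print(round(miou, 4))  # 0.5
```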
Compare the generated values of the four models. The higher the value, the higher the segmentation accuracy and the better the performance of the model. According to the index results, the UNet algorithm is superior to the other algorithms in MIoU, MPA, precision, and recall, and its segmentation accuracy is also higher.
PSPNet is better than DeepLab Version 3 in MIoU, MPA, and recall, while the DeepLab Version 3 model is lower than the SegNet model in all indexes. In tongue segmentation, the outer red area in the image results from tongue segmentation and the inner green area results from tongue coating segmentation. In this study, the UNet algorithm performed well in tongue segmentation, and further research can be conducted based on the UNet algorithm.