To establish the machine learning, or ML, strategy for the initial PSC colony states, prepare a dataset of bright-field images at zero hours or before CHIR treatment and the final cTNT fluorescence images. Quantify the differentiation efficiency of each well by the differentiation efficiency index computed from its cTNT fluorescence images. Quantify the morphological profiles of the zero-hour bright-field images by high-dimensional features that describe the properties of the colony's shape.
Randomly divide the dataset into a training set and a test set. Train a random forest regression model on the training set to predict differentiation efficiency from zero hour bright-field image features. Evaluate the trained random forest model on the test set.
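The split, train, and evaluate steps above can be sketched as follows. This is a minimal illustration using scikit-learn, not the protocol's actual code: the feature matrix and efficiency values are synthetic placeholders, and the 75/25 split, the number of trees, and the use of R squared as the test metric are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 200 wells x 20 morphological features.
# In practice X would hold the zero-hour bright-field colony features and
# y the differentiation efficiency index from the cTNT fluorescence images.
X = rng.normal(size=(200, 20))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Randomly divide the dataset into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train the random forest regressor on the training set only.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test)
test_r2 = r2_score(y_test, y_pred)
print(f"Test R^2: {test_r2:.2f}")
```

Fixing random_state makes the split and the forest reproducible, which helps when comparing feature sets across batches.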
Prepare a dataset consisting of whole-well bright-field image streams at zero to 12 hours and the corresponding CHIR doses, which are the combinations of CHIR concentration and duration. According to the displayed criteria, determine the low, optimal, and high CHIR concentration ranges under each CHIR duration in each batch. Under each duration, compute the delta CHIR concentration for each concentration to quantify its deviation from the optimum.
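One way to picture the delta CHIR concentration is as a signed distance from the optimal range. The sketch below assumes that definition (zero inside the optimal range, negative below it, positive above it); the exact criteria are the ones displayed in the protocol, and the concentrations and optimal range used here are hypothetical values.

```python
def delta_chir(conc, opt_low, opt_high):
    """Signed distance of a concentration from the optimal range.

    Assumed definition: 0 inside [opt_low, opt_high], negative below,
    positive above. Units follow the inputs (for example micromolar).
    """
    if conc < opt_low:
        return float(conc - opt_low)
    if conc > opt_high:
        return float(conc - opt_high)
    return 0.0


def chir_label(conc, opt_low, opt_high):
    """Map a concentration to a low / optimal / high label."""
    d = delta_chir(conc, opt_low, opt_high)
    if d < 0:
        return "low"
    if d > 0:
        return "high"
    return "optimal"


# Hypothetical example for one CHIR duration in one batch.
concentrations = [2, 4, 6, 8, 10, 12]
opt_low, opt_high = 6, 8  # assumed optimal range

deltas = [delta_chir(c, opt_low, opt_high) for c in concentrations]
labels = [chir_label(c, opt_low, opt_high) for c in concentrations]
for c, d, lab in zip(concentrations, deltas, labels):
    print(c, d, lab)
```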
Extract features of the image streams in the dataset. For each CHIR duration, train a logistic regression model to predict the CHIR concentration label from the image stream features on the training set. Evaluate the classification performance of the trained logistic regression model on the test set using accuracy, precision, recall, F1 score, and area under the curve.
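The classification step for a single CHIR duration might look like the sketch below, which would be repeated for each duration. It uses scikit-learn with synthetic features standing in for the image stream features; the standardization step, the macro averaging of the per-class metrics, and the one-vs-rest AUC are assumptions, since the narration does not specify them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in features (assumption): 100 wells per class, 10 features.
# Class codes: 0 = low, 1 = optimal, 2 = high CHIR concentration.
class_means = [-2.0, 0.0, 2.0]
X = np.vstack([rng.normal(loc=m, size=(100, 10)) for m in class_means])
y = np.repeat([0, 1, 2], 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit a multiclass logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="macro"),
    "recall": recall_score(y_test, y_pred, average="macro"),
    "f1": f1_score(y_test, y_pred, average="macro"),
    "auc": roc_auc_score(y_test, y_prob, multi_class="ovr"),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```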
To evaluate the model performance in CHIR dose assessment, in the test set, merge predicted labels of parallel wells with the same CHIR concentration using deviation scores ranging from minus one to one. Using the Pearson correlation coefficient, confirm that the predicted deviation scores highly correlate with the true delta CHIR concentration for each dose. Next, apply the trained logistic regression models to assess CHIR doses in new batches.
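The merging and correlation steps above can be illustrated with a short sketch. It assumes that each predicted label maps to a score (low to minus one, optimal to zero, high to one) and that the deviation score of a concentration is the mean over its parallel wells; that mapping, the well predictions, and the true delta values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Assumed mapping from predicted label to a score in [-1, 1].
LABEL_SCORE = {"low": -1.0, "optimal": 0.0, "high": 1.0}


def deviation_scores(wells):
    """Average label scores over parallel wells sharing a concentration.

    wells: iterable of (concentration, predicted_label) pairs.
    Returns {concentration: mean score in [-1, 1]}.
    """
    grouped = defaultdict(list)
    for conc, label in wells:
        grouped[conc].append(LABEL_SCORE[label])
    return {conc: sum(s) / len(s) for conc, s in grouped.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical predictions: three parallel wells per concentration.
predictions = [
    (4, "low"), (4, "low"), (4, "low"),
    (6, "low"), (6, "optimal"), (6, "optimal"),
    (8, "optimal"), (8, "optimal"), (8, "optimal"),
    (10, "optimal"), (10, "high"), (10, "high"),
    (12, "high"), (12, "high"), (12, "high"),
]
# Hypothetical true delta CHIR concentration for each dose.
true_delta = {4: -4.0, 6: -2.0, 8: 0.0, 10: 2.0, 12: 4.0}

scores = deviation_scores(predictions)
concs = sorted(scores)
r = pearson([scores[c] for c in concs], [true_delta[c] for c in concs])
print(scores, round(r, 3))
```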
With the model-based CHIR dose assessment, rescue wells under each suboptimal CHIR concentration accordingly by adjusting their CHIR duration or concentration towards optimum before 48 hours. Prepare a dataset consisting of bright-field images on day six, and manually annotate CM-committed CPCs by tracking the cTNT-positive cells in the image streams from day 12 back to day six. Crop the bright-field images and the corresponding manual annotation of CM-committed CPCs into patches.
Label patches containing at least 30% CM-committed CPCs as positive, and patches without CM-committed CPCs as negative. Train a deep convolutional neural network, ResNeSt, to learn to classify these patches. Use Grad-CAM to highlight, as heat maps, the regions that contribute most to the inference of ResNeSt.
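The patch labeling rule can be sketched with NumPy as below. The patch size, the non-overlapping tiling, and the handling of patches that fall between zero and 30% (returned here as None so they can be left out of training) are assumptions, since the narration only defines the positive and negative cases.

```python
import numpy as np


def label_patches(mask, patch_size, pos_fraction=0.30):
    """Label non-overlapping patches of a binary CPC annotation mask.

    Returns a list of (row, col, label) with label:
      1    if the CPC pixel fraction is >= pos_fraction,
      0    if the patch contains no CPC pixels,
      None otherwise (assumed to be excluded from training).
    """
    labels = []
    height, width = mask.shape
    for r in range(0, height - patch_size + 1, patch_size):
        for c in range(0, width - patch_size + 1, patch_size):
            frac = mask[r:r + patch_size, c:c + patch_size].mean()
            if frac >= pos_fraction:
                label = 1
            elif frac == 0:
                label = 0
            else:
                label = None
            labels.append((r, c, label))
    return labels


# Toy 8 x 8 annotation mask split into four 4 x 4 patches.
mask = np.zeros((8, 8), dtype=bool)
mask[0:4, 0:4] = True  # top-left patch: fully annotated as CPC
mask[4, 4] = True      # bottom-right patch: 1 of 16 pixels annotated

patch_labels = label_patches(mask, patch_size=4)
print(patch_labels)
```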
Merge the patch-level binarized heat maps to get the predicted CM-committed CPC regions, which are named image-recognized CPC, or IRCPC, regions. Compare the IRCPC regions with manually annotated masks on the test set using accuracy, F1 score, precision, recall, specificity, and intersection over union. Prepare a dataset consisting of bright-field images of CMs and the corresponding cTNT fluorescence images.
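For the comparison of IRCPC regions with manually annotated masks, the six pixel-level metrics follow from the confusion counts of two binary masks. The sketch below uses NumPy on a toy pair of masks; the function name and the choice to return zero when a denominator is empty are assumptions.

```python
import numpy as np


def mask_metrics(pred, truth):
    """Pixel-level metrics between a predicted and a reference binary mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))

    def safe_div(a, b):
        # Assumed convention: report 0.0 when the denominator is empty.
        return a / b if b else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "iou": safe_div(tp, tp + fp + fn),
    }


# Toy example: the prediction covers the annotated region plus two extra pixels.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:3] = True

result = mask_metrics(pred, truth)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```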
Train the pix2pix model on the training set, then apply the trained model to the test set. To purify the CM-committed CPCs, use a non-cytotoxic photoactivatable probe, dual-activatable cell tracker 1, or DACT-1, to region-selectively label non-CPCs. After irradiating the non-CPC regions with a 405 nanometer laser line, detect the DACT-1-labeled cells using a 561 nanometer laser line.
After sorting the dissociated cells, seed the sorted cells in a Matrigel-coated 96-well plate, and culture the DACT-1-labeled non-CPCs, unlabeled IRCPCs, and control group cells for six days in the medium.