Begin by adding all the multiomics input data sets to the input data folder. Here they contain data from patients with stable, chronic, and acute coronary syndromes. To pre-process the data, click on the folder symbol, then double-click on mofa_workflow, scripts, and configurations in turn to access the configuration folder.
Double-click on the data configuration CSV file to open it. In the value column, enter the paths to the input data and results folders. In the value column of the configuration name row, specify a name that is added to the file names of all saved files.
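For repeated runs, the same values could also be written into the configuration file with a short script instead of through the editor. The following is only a minimal sketch: the column names `parameter` and `value` and the row keys in the usage note are assumptions for illustration and must be matched to the actual header and rows of the data configuration CSV file.

```python
import csv
from pathlib import Path


def set_config_values(csv_path, updates, key_col="parameter", value_col="value"):
    """Overwrite the value cell of every row whose key appears in `updates`.

    `key_col` and `value_col` are hypothetical column names; adjust them to
    the header of the real configuration file.
    """
    csv_path = Path(csv_path)

    # Read the whole table first so the file can be rewritten in one pass.
    with csv_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Fail loudly instead of silently ignoring a misspelled row name.
    missing = set(updates) - {row[key_col] for row in rows}
    if missing:
        raise KeyError(f"rows not found in {csv_path.name}: {sorted(missing)}")

    for row in rows:
        if row[key_col] in updates:
            row[value_col] = updates[row[key_col]]

    # Write the table back with the original column order.
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call might look like `set_config_values("Data_Configs.csv", {"configuration_name": "run1"})`, where both the file name and the row key are placeholders. Changes made this way are written to disk directly.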
To save the changes, select File and Save CSV File from the menu at the top. Then, using the navigation menu on the left side, click on scripts to go to the scripts folder. Double-click on 00_Configuration_Update.ipynb to open the initialization notebook.
To run the script, click on the Restart Kernel and Run All Cells button at the top, and then click Restart in the popup. To navigate to the configurations folder, double-click on configurations.
Then double-click on 01_Pre_Processing_SC_Data.csv to open the file. Verify the automatically filled-in values, then select File and Save CSV File from the menu at the top to save the changes.
Then use the navigation menu on the left side and click on scripts to navigate to the scripts folder. Double-click on 01_Prepare_Pseudobulk.ipynb to open the notebook.
To run the script, click on the Restart Kernel and Run All Cells button at the top, and then click Restart in the popup. To navigate to the figures folder, double-click first on figures, and then on 01_figures. Open the newly generated plot, FIG01_Amount_of_Cells overview.
Then examine the plot to identify cell type clusters with a very low number of cells per sample. Note down the IDs of those clusters so that they can be excluded in the subsequent steps. To navigate back to the configuration folder, click on the dots and double-click on configurations.
Then open the file 02_Pre_Processing_Configuration_SC.csv. In the cell_type_exclusion column, add all cluster IDs identified for exclusion in the previous step, separated by commas. To save the changes, select File and Save CSV File from the menu at the top.
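If the cells-per-sample counts behind the overview plot are available as a table, the exclusion list could also be derived programmatically. The sketch below is an illustration only: the rule used here (median cells per sample below a cutoff) and the cutoff of 10 are assumptions, not part of the workflow, and the function name is hypothetical. It returns the comma-separated string expected in the cell_type_exclusion column.

```python
from statistics import median


def clusters_to_exclude(cells_per_sample, min_median_cells=10):
    """Return a comma-separated list of cluster IDs with too few cells.

    `cells_per_sample` maps cluster ID -> {sample ID: number of cells}.
    Samples in which a cluster is absent should be listed with a count of 0.
    The median-based rule and the default cutoff are assumptions.
    """
    flagged = [
        cluster
        for cluster, per_sample in cells_per_sample.items()
        if median(per_sample.values()) < min_median_cells
    ]
    # Sort for a reproducible entry in the configuration file.
    return ",".join(sorted(flagged))


# Made-up counts for three clusters across two samples.
example_counts = {
    "0": {"s1": 500, "s2": 430},
    "7": {"s1": 3, "s2": 8},
    "12": {"s1": 0, "s2": 40},
}
exclusion_entry = clusters_to_exclude(example_counts)
```

Visual inspection of the plot remains the reference; a fixed cutoff like this one is only a starting point and may need to be adapted to the data set.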
Now open the file 02_Pre_Processing_Configuration.csv, and adjust the pre-processing configuration for each dataset stored in the data input folder. Adjust the parameters in the columns as needed, depending on which pre-processing steps should be applied.
Save the changes by selecting File and Save CSV File. To navigate to the scripts folder, click on scripts. Open the notebook 02_Integrate_and_Normalize_Data_Sources.ipynb.
Click on the Restart Kernel and Run All Cells button at the top, and then click Restart in the popup. Next, navigate to the generated 02_results folder. Click on the folder symbol, then double-click on results and 02_results.
Verify that it includes the CSV file whose name consists of 02_Combined_Data, the configuration name, and INTEGRATED. This file contains the combined pre-processed input data.
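This check can also be done from a notebook cell. The following sketch assumes only what is stated above, namely that the file name starts with 02_Combined_Data, contains the configuration name and INTEGRATED, and ends in .csv. The exact separators in the file name are not assumed, and the function name is hypothetical.

```python
from pathlib import Path


def find_integrated_files(results_dir, configuration_name):
    """List CSV files in `results_dir` that look like the integrated output."""
    matches = []
    # The glob pattern is case-sensitive on most Linux file systems.
    for path in sorted(Path(results_dir).glob("02_Combined_Data*.csv")):
        if configuration_name in path.name and "INTEGRATED" in path.name:
            matches.append(path)
    return matches
```

An empty list returned for the 02_results folder would indicate that the integration notebook did not finish or that a different configuration name was used.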