Our protocol provides a comprehensive, unobtrusive, and accurate way to describe how people engage with online communities while they are recovering from drug and alcohol addiction. This technique goes beyond the self-report approach used in previous studies, allowing us to measure more implicit markers of recovery. Before beginning the extraction, load the required R packages, which provide the functions, datasets, or compiled code used to analyze, transform, or extract data, and load the external retention and user data into R as a data frame from a CSV file.
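As a minimal sketch of the loading step, the retention data could be read with base R's read.csv. The file contents and the column names anon_id and retention_days are hypothetical stand-ins, since the protocol does not name them; packages would be attached beforehand with library().

```r
# Toy stand-in for the external retention file. The real file and its
# columns are not given in the protocol, so these names are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("anon_id,retention_days",
             "u1,120",
             "u2,45",
             "u3,200"), csv_path)

# Read the CSV into a data frame, keeping IDs as character strings.
retention <- read.csv(csv_path, stringsAsFactors = FALSE)
str(retention)
```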
When all of the packages have been loaded, use the getGroup function from the Rfacebook package to extract data from the social media page of the community of interest, and save the data as a data frame. Using the getPost function from Rfacebook, along with the just-extracted post IDs, extract the data about post likes made on the page. Using the getPost function and the extracted post IDs, extract data on the comments made on each post, and save this data as a data frame.
Using the comment IDs, extract data on the comment likes made on each post, and save this data as a data frame. Then, combine the posts, post likes, comments, and comment likes data into one data frame, and add a monthly breakdown. To quantify the social media activity made and received by each client, count the posts, comments, post likes, and comment likes made by each client, and the posts, comments, post likes, and comment likes received by each client.
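One way to count activity made and received, sketched in base R under the assumption that the combined data frame has one row per action. The column names actor_id, target_id, and type, and the toy rows, are hypothetical and not taken from the protocol.

```r
# Toy combined activity table: one row per action (hypothetical columns).
# target_id is the client whose post or comment received the action.
activity <- data.frame(
  actor_id  = c("u1", "u2", "u3", "u2", "u1", "u3"),
  target_id = c(NA, "u1", "u1", "u1", "u2", "u2"),
  type      = c("post", "comment", "post_like", "post_like",
                "comment_like", "comment_like"),
  stringsAsFactors = FALSE
)

# Fix the set and order of activity types so every column always appears.
types <- c("post", "comment", "post_like", "comment_like")
activity$type <- factor(activity$type, levels = types)

# Activity made: cross-tabulate who acted by type of action.
made <- as.data.frame.matrix(table(activity$actor_id, activity$type))
names(made) <- paste0(names(made), "_made")

# Activity received: cross-tabulate who was acted upon (NA targets are dropped).
received <- as.data.frame.matrix(table(activity$target_id, activity$type))
names(received) <- paste0(names(received), "_received")

print(made)
print(received)
```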
Join the data frame of the social media activity made and received by each client to the retention data frame, and calculate the difference between the posts and comments with likes and those without likes, and the difference between the posts with comments and those without comments. Join the likes difference data to the retention data, and the comments difference data to the retention data. Calculate all of the likes made by each client, and all of the likes received by each client.
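The join and the like totals might look like the following base R sketch. All column names and values are hypothetical, and treating clients with no recorded activity as zero counts is an assumption made here, not something the protocol states.

```r
# Toy retention data and per-client like counts (hypothetical columns).
retention <- data.frame(
  anon_id        = c("u1", "u2", "u3", "u4"),
  retention_days = c(120, 45, 200, 30),
  stringsAsFactors = FALSE
)
activity_counts <- data.frame(
  anon_id                = c("u1", "u2", "u3"),
  post_likes_made        = c(0, 1, 1),
  comment_likes_made     = c(1, 0, 1),
  post_likes_received    = c(2, 0, 0),
  comment_likes_received = c(0, 2, 0),
  stringsAsFactors = FALSE
)

# Left join: keep every client in the retention data, even without activity.
joined <- merge(retention, activity_counts, by = "anon_id", all.x = TRUE)

# Assumption: a client absent from the activity table made/received nothing.
count_cols <- setdiff(names(joined), c("anon_id", "retention_days"))
for (col in count_cols) {
  joined[[col]][is.na(joined[[col]])] <- 0
}

# Total likes made and received by each client.
joined$likes_made     <- joined$post_likes_made + joined$comment_likes_made
joined$likes_received <- joined$post_likes_received + joined$comment_likes_received
print(joined)
```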
Then, identify which users did not participate in the social media group. To conduct a social network analysis, create an edge list of relationships within the social network, based on who liked posts and comments and who commented on posts, by looking at two columns within the dataset. The first column contains the anonymous ID of the person making the post, while the second column contains the anonymous ID of the person liking or commenting on the post.
Next, create a vertex list of all the individuals in the group by converting the two columns in the list of relationships into one column and removing any duplicate anonymous IDs, so that each anonymous ID appears only once. Using the graph.data.frame and get.adjacency functions in the igraph package, create graph and graph matrix objects from the edge and vertex lists. Then, use the degree and betweenness functions from the igraph package to obtain the network degree and betweenness statistics of the online group.
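The edge list, vertex list, and network statistics could be sketched as follows. This uses graph_from_data_frame and as_adjacency_matrix, the current igraph names for the graph data frame and adjacency functions; the column names, the toy IDs, and the choice of an undirected graph are assumptions for illustration only.

```r
library(igraph)

# Toy edge list: poster in the first column, liker/commenter in the second.
edges <- data.frame(
  poster_id    = c("u1", "u1", "u2", "u3"),
  responder_id = c("u2", "u3", "u3", "u4"),
  stringsAsFactors = FALSE
)

# Vertex list: stack both columns and keep each anonymous ID once.
vertices <- data.frame(
  name = unique(c(edges$poster_id, edges$responder_id)),
  stringsAsFactors = FALSE
)

# Graph object and adjacency matrix (undirected here by assumption).
g   <- graph_from_data_frame(edges, directed = FALSE, vertices = vertices)
adj <- as_adjacency_matrix(g, sparse = FALSE)

# Degree: number of connections; betweenness: how often a member lies
# on the shortest paths between other members.
deg <- degree(g)
btw <- betweenness(g, directed = FALSE)
print(data.frame(anon_id = names(deg), degree = deg, betweenness = btw))
```

In this toy network, u3 has the most connections and is the only member on the shortest paths to u4, so it has the highest degree and the only nonzero betweenness.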
To conduct a computerized linguistic analysis in the Linguistic Inquiry and Word Count (LIWC) software, export the textual social media data and the post comment ID column into CSV files. Import the CSV files of the textual social media data into the LIWC software by clicking Analyze Text, then Excel CSV File, and then selecting the column containing the posts and comments as the text to be analyzed. After LIWC has finished analyzing the textual data, save the output as a new CSV file.
Import the LIWC results CSV file into R, and merge the results with the existing data. The data will be matched by the post comment ID column, which exists in both the LIWC results and the existing data frames. Calculate the total LIWC scores for each user in posts and in comments, and join these scores to the retention data.
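A minimal base R sketch of the merge and the per-user totals is given here. The column names post_comment_id, anon_id, and kind, the two example categories (we and achieve), and all values are hypothetical; real LIWC output contains many more columns.

```r
# Toy text records and toy LIWC output, sharing a post/comment ID column.
text_data <- data.frame(
  post_comment_id = c("p1", "c1", "c2", "p2"),
  anon_id         = c("u1", "u2", "u1", "u2"),
  kind            = c("post", "comment", "comment", "post"),
  stringsAsFactors = FALSE
)
liwc <- data.frame(
  post_comment_id = c("p1", "c1", "c2", "p2"),
  we              = c(2.5, 0, 1.0, 3.0),
  achieve         = c(1.0, 0.5, 0, 2.0),
  stringsAsFactors = FALSE
)

# Match LIWC scores to the existing records by the shared ID column.
merged <- merge(text_data, liwc, by = "post_comment_id")

# Totals per user, separately for posts and comments.
by_kind <- aggregate(cbind(we, achieve) ~ anon_id + kind, data = merged, FUN = sum)

# Totals per user across posts and comments combined.
overall <- aggregate(cbind(we, achieve) ~ anon_id, data = merged, FUN = sum)

# Join the combined totals to (toy) retention data.
retention <- data.frame(anon_id = c("u1", "u2"), retention_days = c(120, 45),
                        stringsAsFactors = FALSE)
retention <- merge(retention, overall, by = "anon_id", all.x = TRUE)
print(retention)
```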
Calculate the total LIWC scores for each user across all of the textual posts and comments combined, and join these scores to the retention data. Then, remove any network analysis from the retention data frame. To determine if indicators of engagement with the online community predict retention in the offline recovery program, use the lm function in base R to conduct linear regression analysis with the retention data as the dependent variable, and the LIWC categories, comments, post likes, and comment likes as independent variables.
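A regression of this kind could be sketched with lm as follows. The data are invented toy values used only to make the call runnable, and the variable names retention_days, likes_received, and we_words are hypothetical; the actual models include more predictors.

```r
# Toy data only: values are made up so the example runs, not real results.
retention <- data.frame(
  retention_days = c(10, 14, 21, 22, 30, 33, 41, 39),
  likes_received = c(0, 1, 2, 3, 4, 5, 6, 7),
  we_words       = c(1, 0, 2, 1, 3, 2, 4, 2)
)

# Linear regression: retention as the dependent variable,
# engagement indicators as independent variables.
fit <- lm(retention_days ~ likes_received + we_words, data = retention)

# Collect estimates, standard errors, t values, and p values in a data frame.
coef_table <- as.data.frame(summary(fit)$coefficients)
coef_table$term <- rownames(coef_table)
print(coef_table)
```

Storing each model's coefficient table as a data frame makes it straightforward to stack the tables from several models with rbind.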
Then, combine the regression analysis results into one data frame. To create a monthly social network analysis map, prepare data frames for social network analysis maps, and create an edge list based on the monthly cumulative social media activity. Create a vertex list based on the monthly cumulative social media activity, and create graphs and graph matrices based on the monthly cumulative social media activity.
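The monthly cumulative edge lists, vertex lists, and graphs could be built along these lines with igraph. The month column, the toy IDs, and the undirected graph are assumptions for illustration; each month's graph includes all activity up to and including that month.

```r
library(igraph)

# Toy edge list with the month in which each interaction occurred.
edges <- data.frame(
  poster_id    = c("u1", "u1", "u2", "u3"),
  responder_id = c("u2", "u3", "u3", "u4"),
  month        = c(1, 1, 2, 3),
  stringsAsFactors = FALSE
)

months <- sort(unique(edges$month))

# For each month, keep all edges up to that month and build a graph.
monthly_graphs <- lapply(months, function(m) {
  cum_edges <- edges[edges$month <= m, c("poster_id", "responder_id")]
  vertices  <- data.frame(
    name = unique(c(cum_edges$poster_id, cum_edges$responder_id)),
    stringsAsFactors = FALSE
  )
  graph_from_data_frame(cum_edges, directed = FALSE, vertices = vertices)
})
names(monthly_graphs) <- paste0("month_", months)

# Matching adjacency matrices, one per month.
monthly_matrices <- lapply(monthly_graphs, as_adjacency_matrix, sparse = FALSE)

print(sapply(monthly_graphs, ecount))  # edges accumulated by each month
print(sapply(monthly_graphs, vcount))  # members present by each month
```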
Set the layout of the social network analysis maps based on the cumulative social media activity, and add colors based on the user roles. Then, create the social network analysis maps and save the maps to a file. To calculate the monthly cumulative activity of the social media group, calculate the monthly cumulative social media activity by the staff, clients, and other members of the group.
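The cumulative activity calculation could be sketched in base R with a cross-tabulation and cumsum. The role labels, the month column, and the toy rows are hypothetical.

```r
# Toy activity log: one row per post, comment, or like (hypothetical columns).
activity <- data.frame(
  month = c(1, 1, 2, 2, 2, 3),
  role  = c("staff", "client", "client", "other", "client", "staff"),
  stringsAsFactors = FALSE
)

# Monthly counts per role, then a running total across months.
monthly    <- table(role = activity$role, month = activity$month)
cumulative <- t(apply(monthly, 1, cumsum))

# Cumulative activity by all members, joined to the per-role rows.
combined <- rbind(cumulative, all_members = colSums(cumulative))
print(combined)
```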
Then, calculate the monthly cumulative social media activity by all of the members of the social media group, and join the monthly cumulative social media activity data frames together. Shown here is a visual representation of the social network and its evolution over a period of eight months, in the form of connections between all of the participants in the online community. The number of connections that an agent in the network has determines how central they will be in the social network.
These representative results support the argument that, overall, positive social interactions between members of an online recovery community are supportive of the recovery process. Participants' levels of engagement with the online community are measured by computing the contributions of all of the participants in the online community as the number of posts, comments, and likes made by the staff, clients, and broader community members. As the results of this analysis show, the levels of online interaction and in-group validation, as reflected by the number of likes received for posts and comments, predict program retention.
Program retention is also predicted by identification markers, as captured by the use of the pronoun "we" in posts, and by the use of achievement words in both posts and comments. Finally, where participants are situated within the social network also represents an important aspect of retention. When using this approach, we need to remember that this is only one way to capture psychological processes in online communities.
Ideally, other data sources should also be accessed. This method can be adapted to investigate online social interactions in other types of online communities, including online forums, discussion groups, chat rooms, and so on.