Visualizing Changing Roles Over Time
In the end, I will be using social network analysis to explain why people move through roles, but I will start by identifying what the roles are and producing some summary statistics of how people move through them.
In order to identify roles, I created monthly activity snapshots for each user, and then used a clustering algorithm in R to automatically identify different "behavioral roles". There is some evidence that the data don't cluster cleanly, but clustering isn't a central component of my research, so I am moving forward anyway.
I decided to use the k-medoids (aka "partitioning around medoids" or "pam") algorithm in R. I used the silhouette function to identify the best k (which was 3 clusters).
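The idea of picking k by average silhouette width can be sketched in plain Python. This is not R's `pam`: it is a simplified alternating k-medoids with a farthest-first start, and every name in it (`dist`, `assign`, `k_medoids`, `mean_silhouette`, `best_k`) is hypothetical and chosen only for illustration. It assumes each monthly snapshot is a numeric feature vector and that Euclidean distance is appropriate.

```python
def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def assign(points, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]


def k_medoids(points, k, max_iter=100):
    """Simplified alternating k-medoids (an assumption, not R's pam)."""
    # Farthest-first start: begin at point 0, then keep adding the point
    # that is farthest from all medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(dist(points[i], points[m]) for m in medoids)))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            updated.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(points, medoids), medoids


def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters score 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = []
    for i, p in enumerate(points):
        own = [j for j in clusters[labels[i]] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the point's own cluster
        a = sum(dist(p, points[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(sum(dist(p, points[j]) for j in idx) / len(idx)
                for lab, idx in clusters.items() if lab != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_k(points, candidates=range(2, 6)):
    """Return the candidate k with the highest average silhouette width."""
    return max(candidates,
               key=lambda k: mean_silhouette(points, k_medoids(points, k)[0]))
```

Calling `best_k(snapshots)` on a list of feature tuples returns the candidate k with the widest average silhouette. A low maximum silhouette is one sign that the data do not cluster cleanly.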
The data I used to create the clusters only included those months where a user made at least 5 edits. So, I created a 4th cluster to represent months where a user made fewer than 5 edits, and used a Python program to add the cluster results to the original stats file.
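A minimal sketch of that merge step follows. The field names (`user`, `month`, `edits`), the function name `add_roles`, and the choice of `0` as the code for low-activity months are all assumptions made for illustration; the real stats file may be laid out differently.

```python
MIN_EDITS = 5            # threshold used when building the clusters
LOW_ACTIVITY_ROLE = 0    # assumed code for months below the threshold


def add_roles(stats_rows, cluster_labels):
    """Attach a role to every user-month row.

    stats_rows: list of dicts with 'user', 'month' and 'edits' keys.
    cluster_labels: dict mapping (user, month) to a cluster id, present
    only for months that were included in the clustering.
    """
    merged = []
    for row in stats_rows:
        if row["edits"] < MIN_EDITS:
            role = LOW_ACTIVITY_ROLE
        else:
            role = cluster_labels[(row["user"], row["month"])]
        # Copy the row so the original stats are left untouched.
        merged.append({**row, "role": role})
    return merged
```

Months under the threshold never reach the lookup, so `cluster_labels` only needs entries for the months that were actually clustered.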
I then wrote another Python script to rearrange this data, so that it is in the format
ID  Month1  Month2  Month3  ...
1   1       2       2
2   1       0       2
so that Month1 is the user's role in their first month, Month2 the role in that user's 2nd month, etc.
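The rearrangement can be sketched roughly as below. It assumes the input is one `(user_id, month, role)` record per user-month, that months sort chronologically as strings (for example `"2009-03"`), and that there are no gaps between a user's months. The names `to_wide` and `format_wide` are hypothetical.

```python
from collections import defaultdict


def to_wide(records):
    """Turn (user_id, month, role) records into {user_id: [role, ...]}.

    Roles are ordered by month, so position 0 is the user's first month.
    """
    by_user = defaultdict(list)
    for user, month, role in records:
        by_user[user].append((month, role))
    return {user: [role for _, role in sorted(items)]
            for user, items in by_user.items()}


def format_wide(wide):
    """Render the wide data as space-separated text with a header row."""
    longest = max(len(seq) for seq in wide.values())
    header = ["ID"] + [f"Month{i}" for i in range(1, longest + 1)]
    lines = [" ".join(header)]
    for user in sorted(wide):
        lines.append(" ".join(str(v) for v in [user] + wide[user]))
    return "\n".join(lines)
```

Because the columns are relative months rather than calendar months, users with shorter tenures simply get shorter rows.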
For way, way too long today I've been trying to get R to display this data as a stacked area graph of the ratio of roles by month. It's been a huge pain to try to figure out how to reshape it, etc.
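One way around the reshaping problem is to compute the per-month role shares directly and only then plot them. The sketch below does this in Python rather than R, which is a substitution and not the approach described above. It assumes the wide data is a dict of role sequences, that roles are coded 0 to 3, and that matplotlib is installed for the optional plotting helper. `role_shares` and `plot_shares` are hypothetical names.

```python
from collections import Counter


def role_shares(wide, roles=(0, 1, 2, 3)):
    """Return {role: [share in Month1, share in Month2, ...]}.

    Each month's shares are taken over the users who have a value for
    that month, so the shares in every month sum to 1.
    """
    n_months = max(len(seq) for seq in wide.values())
    shares = {role: [] for role in roles}
    for m in range(n_months):
        counts = Counter(seq[m] for seq in wide.values() if len(seq) > m)
        total = sum(counts.values())
        for role in roles:
            shares[role].append(counts[role] / total)
    return shares


def plot_shares(shares):
    """Draw the shares as a stacked area chart (needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; assumed installed

    n_months = len(next(iter(shares.values())))
    months = list(range(1, n_months + 1))
    fig, ax = plt.subplots()
    ax.stackplot(months, *shares.values(),
                 labels=[f"role {r}" for r in shares])
    ax.set_xlabel("Month of tenure")
    ax.set_ylabel("Share of users")
    ax.legend(loc="upper right")
    return fig
```

Note that later months are based on fewer users, since only long-tenured users reach them, so the right-hand side of the chart can be noisy.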
Once I get that working, I want to compare that graph to the graph for only those users who were in each role at least X times during their tenure.
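Selecting that subset could look something like this, under one reading of the plan: for a given role, keep the users who held it in at least X of their months. The function name `users_with_role` and the dict-of-lists layout are assumptions.

```python
def users_with_role(wide, role, min_months):
    """Keep users who were in `role` for at least `min_months` months.

    wide: {user_id: [role in Month1, role in Month2, ...]}
    """
    return {user: seq for user, seq in wide.items()
            if seq.count(role) >= min_months}
```

The filtered dict has the same shape as the full one, so the same share calculation and chart can be run on it for a side-by-side comparison.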
