 
School of Information, University of Michigan
SI 614
Community structure in networks
Lecture 17

Outline
- One-mode networks and cohesive subgroups
  - measures of cohesion
  - types of subgroups
- Affiliation networks
  - team assembly

Why care about group cohesion?
- opinion formation and uniformity
- if each node adopts the opinion of the majority of its neighbors, different opinions can persist in different cohesive subgroups
- within a cohesive subgroup: greater uniformity

Other reasons to care
- Discover communities of practice (more on this next time)
- Measure isolation of groups
- Threshold processes:
  - I will adopt an innovation if some number of my contacts do
  - I will vote for a measure if a fraction of my contacts do

What properties indicate cohesion?
- mutuality of ties
  - everybody in the group knows everybody else
- closeness or reachability of subgroup members
  - individuals are separated by at most n hops
- frequency of ties among members
  - everybody in the group has links to at least k others in the group
- relative frequency of ties among subgroup members compared to nonmembers

Cliques
- Every member of the group has links to every other member
- Cliques can overlap
[Figure: overlapping cliques of size 3; clique of size 4]

Considerations in using cliques as subgroups
- Not robust
  - one missing link can disqualify a clique
- Not interesting
  - everybody is connected to everybody else
  - no core-periphery structure
  - no centrality measures apply
- How cliques overlap can be more interesting than that they exist
- Pajek
  - remember from class on motifs:
  - construct a network that is a clique of the desired size
  - Nets>Fragment (1 in 2)>Find

A less stingy definition of cohesive subgroups: k-cores
- Each node within a group is connected to at least k other nodes in the group
[Figure: 4-core; 3-core]
- Pajek: Net>Partitions>Core>Input,Output,All
  - Assigns each vertex to the highest-order k-core (largest k) it belongs to

Subgroups based on reachability and diameter
- n-cliques
  - maximal distance between any two nodes in the subgroup is n
[Figure: 2-cliques]
- theoretical justification
  - information flow through intermediaries

Frequency of in-group ties
- Compare the number of
  - within-group ties
  - ties from the group to nodes external to the group
- Given the number of edges incident on nodes in the group, what is the probability that the observed fraction of them fall within the group?
- The smaller the probability, the stronger the cohesion

Considerations with n-cliques
- problem
  - diameter may be greater than n
  - n-clique may be disconnected (paths go through nodes not in the subgroup)
[Figure: 2-clique with diameter = 3; path outside the 2-clique]
- fix
  - n-club: maximal subgraph of diameter at most n (e.g. a 2-club has diameter at most 2)

Cohesion in directed and weighted networks
- something we've already learned how to do:
  - find strongly connected components
- keep only a subset of ties before finding connected components
  - reciprocal ties
  - edge weight above a threshold

Example: political blogs (Aug 29th to Nov 15th, 2004)
[Figure: three panels (A), (B), (C) of the A-list blog network, nodes numbered as follows]
   1 Digbys Blog          11 Wonkette              21 JawaReport             31 Powerline
   2 JamesWalcott         12 TalkLeft              22 Voka Pundit            32 HughHewitt
   3 Pandagon             13 Political Wire        23 Roger L Simon          33 INDCJournal
   4 blog.johnkerry.com   14 Talking Points Memo   24 Tim Blair              34 RealClearPolitics
   5 Oliver Willis        15 Matthew Yglesias      25 Andrew Sullivan        35 Winds of Change
   6 America Blog         16 Washington Monthly    26 Instapundit            36 Allahpundit
   7 Crooked Timber       17 MyDD                  27 Blogs for Bush         37 MichelleMalkin
   8 DailyKos             18 JuanCole              28 LittleGreenFootballs   38 WizBang
   9 American Prospect    19 Left Coaster          29 Belmont Club           39 Dean’s World
  10 Eschaton             20 Bradford DeLong       30 Captain’s Quarters     40 Volokh
A) all citations between A-list blogs in the 2 months preceding the 2004 election
B) citations between A-list blogs with at least 5 citations in both directions
C) edges further limited to those exceeding 25 combined citations
- only 15% of the citations bridge communities

Affiliation networks
- otherwise known as
  - membership network
    - e.g. board of directors
  - hypernetwork or hypergraph
  - bipartite graphs
  - interlocks

m-slices
- transform to a one-mode network
  - weights of edges correspond to number of affiliations in common
- m-slice: maximal subnetwork containing the lines with a multiplicity equal to or greater than m

      1 1 1 1 0
      1 1 1 1 0
  A = 1 1 2 2 0
      1 1 2 4 1
      0 0 0 1 1

[Figure: 1-slice; 2-slice]
- Pajek:
  - Net>Transform>2Mode to 1-Mode> Include Loops, Multiple Lines
  - Info>Network>Line Values (to view)
  - Net>Partitions>Valued Core>First threshold and step

Scottish firms interlocking directorates
[Figure legend: 2 railways, 4 electricity, 5 domestic products, 6 banks, 7 insurance companies, 8 investment banks]

Methods used directly on bipartite graphs are rare
- Finding bicliques of users accessing documents
- An algorithm by Nina Mishra, HP Labs
[Figure: Documents; Users]

Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance
Roger Guimerà, Brian Uzzi, Jarrett Spiro, Luís A. Nunes Amaral
Science, 2005
[Figure: astronomy and astrophysics; social psychology; economics]

Issues in assembling teams
- Why assemble a team?
  - different ideas
  - different skills
  - different resources
- What spurs innovation?
  - applying proven innovations from one domain to another
- Is diversity (working with new people) always good?
  - spurs creativity + fresh thinking
  - but
    - conflict
    - miscommunication
    - loss of the sense of security that comes from working with close collaborators

Parameters in team assembly
1. m, # of team members
2. p, probability of selecting individuals who already belong to the network
3. q, propensity of incumbents to select past collaborators

Two phases
- giant component of interconnected collaborators
- isolated clusters

Creation of a new team
- incumbents (people who have already collaborated with someone)
- newcomers (people available to participate in new teams)
- pick incumbent with probability p
  - if incumbent, pick past collaborator with probability q

Time evolution of a collaboration network
- newcomer-newcomer collaborations
- newcomer-incumbent collaborations
- new incumbent-incumbent collaborations
- repeat collaborations
- after a time t of inactivity, individuals are removed from the network

BMI data
- Broadway musical industry
  - 2258 productions
  - from 1877 to 1990
  - musical shows performed at least once on Broadway
  - team: composers, writers, choreographers, directors, producers, but not actors
- Team size increases from 1877 to 1929
  - the musical as an art form is still evolving
- After 1929 team composition stabilizes to include 7 people:
  - choreographer, composer, director, librettist, lyricist, producer

Collaboration networks
- 4 fields (with the number of top journals in each field)
  - social psychology (7)
  - economics (9)
  - ecology (10)
  - astronomy (4)
- impact factor of each journal
  - ratio between citations and recent citable items published
  - A = total cites in 1992
  - B = 1992 cites to articles published in 1990-91 (this is a subset of A)
  - C = number of articles published in 1990-91
  - D = B/C = 1992 impact factor

Size of teams grows over time

Degree distributions
[Figure: data; data generated from a model with the same p and q and sequence of team sizes formed]

Predictions for the size of the giant component
- higher p means already published individuals are co-authoring: this links the network together and increases the giant component
- S = fraction of network occupied by the giant component

Predictions for the size of the giant component (cont’d)
- increasing q can slow the growth of the giant component: co-authoring with previous collaborators does not create new edges

Network statistics
  Field               teams    individuals   p      q      fR     S (size of giant component)
  BMI                  2,258    4,113        0.52   0.77   0.16   0.70
  social psychology   16,526   23,029        0.56   0.78   0.22   0.67
  economics           14,870   23,236        0.57   0.73   0.22   0.54
  ecology             26,888   38,609        0.59   0.76   0.23   0.75
  astronomy           30,552   30,192        0.76   0.82   0.39   0.98
- What stands out?
- What is similar across the networks?

Different network topologies
[Figure: ecology; economics; astronomy]

Main findings
- all networks except astronomy are close to the “tipping” point where the giant component emerges
  - sparse and stringy networks
- giant component takes up more than 50% of nodes in each network
- impact factor (how good the journal is where the work was published)
  - p positively correlated: going with experienced members is good
  - q negatively correlated: new combinations are more fruitful
  [Figure: ecology, economics, social psychology]
- S for individual journals positively correlated
  - more isolated clusters in lower-impact journals
  [Figure: ecology, social psychology]
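
Sketch for the k-cores slide: one common way to assign each vertex to the highest k-core it belongs to is to repeatedly peel off the vertex of smallest remaining degree. This is a minimal illustration in plain Python, not a description of how Pajek does it. The function name core_numbers and the toy edge list (a 4-clique a, b, c, d, a vertex f attached to a and b, and a pendant vertex e) are assumptions made for the example.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: largest k such that the node belongs to a k-core}."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        # Peel the node with the smallest degree among the nodes still present.
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])        # core order never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Hypothetical toy network: a 4-clique plus two less connected vertices.
EDGES = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("a", "e"), ("a", "f"), ("b", "f")]
CORES = core_numbers(EDGES)
# e -> 1, f -> 2, and a, b, c, d -> 3 (a clique of size 4 is a 3-core)
```

Note that the clique members land in the 3-core while f, which still has two ties into the group, is kept in the 2-core: the k-core criterion degrades gradually where the clique criterion would simply exclude f.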
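
Sketch for the m-slices slide: the one-mode network can be computed from a two-mode (actors by groups) incidence matrix by counting shared affiliations, and an m-slice then keeps the lines with multiplicity at least m. The incidence matrix below is an assumption: it is one hypothetical membership pattern, not taken from the slides, chosen because it reproduces the matrix A shown on the slide. The function names project and m_slice are also made up for this example.

```python
def project(incidence):
    """One-mode projection: entry [i][j] is the number of groups shared by
    actors i and j (the diagonal holds each actor's number of affiliations)."""
    n = len(incidence)
    return [[sum(a * b for a, b in zip(incidence[i], incidence[j]))
             for j in range(n)]
            for i in range(n)]


def m_slice(weights, m):
    """Lines (i, j, multiplicity) with i < j whose multiplicity is at least m."""
    n = len(weights)
    return [(i, j, weights[i][j])
            for i in range(n)
            for j in range(i + 1, n)
            if weights[i][j] >= m]


# Hypothetical incidence matrix: 5 actors (rows) by 4 groups (columns).
INCIDENCE = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]

A = project(INCIDENCE)        # equals the matrix A on the m-slices slide
ONE_SLICE = m_slice(A, 1)     # every pair sharing at least one affiliation
TWO_SLICE = m_slice(A, 2)     # only actors 2 and 3 (0-based) share two groups
```

The diagonal of A corresponds to the loops that Pajek's "Include Loops" option would add; m_slice ignores it because a loop does not link two different actors.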
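
Sketch for the team assembly slides: a rough simulation of the rule "pick incumbent with probability p; if incumbent, pick past collaborator with probability q", followed by a measurement of S, the fraction of the network in the giant component. Several details are assumptions not stated on the slides: a slot falls back to a newcomer when no incumbent is available, "past collaborator" means a past collaborator of someone already on the team, and the removal of individuals after a time t of inactivity is left out. The names simulate and giant_fraction are made up for the example.

```python
import random
from itertools import combinations


def simulate(num_teams, m, p, q, seed=0):
    """Grow a collaboration network one team at a time.

    Returns (adj, teams): adj maps each individual to the set of past
    collaborators; teams lists the teams in order of creation.
    """
    rng = random.Random(seed)
    adj, teams, next_id = {}, [], 0
    for _ in range(num_teams):
        team = []
        while len(team) < m:
            member = None
            # Incumbents (already in the network) who are not yet on this team.
            pool = sorted(set(adj) - set(team))
            if pool and rng.random() < p:
                # Past collaborators of the incumbents already on the team.
                past = sorted({c for t in team for c in adj.get(t, ())} - set(team))
                if past and rng.random() < q:
                    member = rng.choice(past)
                else:
                    member = rng.choice(pool)
            if member is None:
                member, next_id = next_id, next_id + 1   # bring in a newcomer
            team.append(member)
        for person in team:
            adj.setdefault(person, set())
        for u, v in combinations(team, 2):               # everyone on a team is linked
            adj[u].add(v)
            adj[v].add(u)
        teams.append(team)
    return adj, teams


def giant_fraction(adj):
    """S: fraction of individuals in the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best / len(adj) if adj else 0.0


adj, teams = simulate(num_teams=200, m=3, p=0.5, q=0.5, seed=1)
S = giant_fraction(adj)
```

Two limiting cases show the phases named on the slides. With p = 0 every team consists of newcomers, so the network is a set of isolated clusters and S is one over the number of teams. With p = 1 and q = 1 the first team keeps re-forming, which illustrates why repeat collaborations do not create new edges.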