Social Communities Identification and Creation

The topic of this page is the automatic identification and creation of social communities or groups based on social network data and context data. This topic has been investigated previously in the areas of social network analysis and context[1] and in the area of dynamic social networks[2][3][4].


Graph depiction of a social network with two hub users (indicated in black)
The first question to address when considering the automatic identification or creation of social communities is the purpose of the activity. The topic is relevant in a number of application areas with distinct requirements, for instance the identification of secret or hidden social communities, such as criminal or terrorist networks, within a larger society.

In the context of SAM, the motivation for identifying and explicitly creating social communities is to connect persons with similar known properties so that they can socialise around media. Depending on the data used to create and maintain them, the relevant communities may exhibit specific properties such as a short life span or constantly evolving group membership.

Relevance to SAM

Both the creation and modification of social communities and the provision of business intelligence techniques for commercial users of the SAM Platform require research into and development of techniques for analysing the specific context that is of concern for SAM. Methods are also needed for analysing social communities and networks in terms of the properties that are relevant for successfully creating and managing these communities, and for analysing the communities for the benefit of commercial users.

After receiving information from the user interface, and based on the context (assets used, asset metadata, user interaction etc.), Sentiment Analysis techniques will be employed in the SAM Platform to build and update a user characterisation that includes personal characteristics (e.g. age, location, etc.) and personal preferences (e.g. preferences about specific brands, TV programmes, services, products, etc.). This user characterisation can trigger personalised asset recommendations from suitably semantically characterised repositories. Sentiment analysis technologies also make it possible to obtain near-real-time statistics on user responses to media assets by analysing the comments that users post within the SAM Platform.

For this task, it is necessary to gather, organise and analyse different types of data that are relevant in the context of SAM, including the analysis of textual data provided by users, such as comments or short messages, and the analysis of user ratings data.

State of the Art Analysis

Social Network Analytics

Methods for the identification of communities in social networks apply social network analysis to identify structures and groupings within social networks that can be used to create communities. For analysis purposes, social networks are usually conceptualised as graphs with nodes representing persons and edges representing connections between persons (regardless of how edges are established). Social network analyses examine in particular the following aspects:

  • Connections between nodes and properties of connections
  • Distributions of nodes and connections in the overall graph
  • Segmentation of graphs into subgraphs or "cliques"

In particular the segmentability of graphs into cliques is of interest for finding communities for the purposes of SAM. J. Pattillo et al.[5] discuss relevant techniques.
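
As an illustration of clique-based segmentation, the following minimal sketch enumerates the maximal cliques of a small undirected graph with the basic Bron-Kerbosch recursion (without pivoting). It is not taken from the SAM code base or from the cited chapter; the class name `CliqueFinder` and the adjacency-map representation are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: enumerates maximal cliques (basic Bron-Kerbosch, no pivoting). */
class CliqueFinder {

    /**
     * @param adjacency user id -> ids of directly connected users
     *                  (assumed undirected, symmetric and without self-loops)
     */
    static List<Set<String>> maximalCliques(Map<String, Set<String>> adjacency) {
        List<Set<String>> result = new ArrayList<>();
        expand(new TreeSet<>(), new TreeSet<>(adjacency.keySet()), new TreeSet<>(),
               adjacency, result);
        return result;
    }

    private static void expand(Set<String> current, Set<String> candidates,
                               Set<String> excluded,
                               Map<String, Set<String>> adjacency,
                               List<Set<String>> result) {
        if (candidates.isEmpty() && excluded.isEmpty()) {
            // No user can extend the current clique, so it is maximal.
            result.add(new TreeSet<>(current));
            return;
        }
        // Iterate over a snapshot because 'candidates' shrinks inside the loop.
        for (String v : new ArrayList<>(candidates)) {
            Set<String> neighbours =
                    adjacency.getOrDefault(v, Collections.<String>emptySet());

            Set<String> nextCurrent = new TreeSet<>(current);
            nextCurrent.add(v);
            Set<String> nextCandidates = new TreeSet<>(candidates);
            nextCandidates.retainAll(neighbours);
            Set<String> nextExcluded = new TreeSet<>(excluded);
            nextExcluded.retainAll(neighbours);

            expand(nextCurrent, nextCandidates, nextExcluded, adjacency, result);

            // v has been fully explored: move it from candidates to excluded.
            candidates.remove(v);
            excluded.add(v);
        }
    }
}
```

Enumerating maximal cliques takes exponential time in the worst case, so a sketch of this kind is only practical for small graphs or for pre-filtered subgraphs.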

The determination of edges between nodes generally depends on the specific data that is available. Connections between users in a social network such as Facebook could, for instance, be derived primarily from users' friend lists, while edges for a social network based on Twitter data could be based on follower relationships as well as on other factors such as explicit references to user handles or hashtag terms.
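
The following sketch shows one way of combining several kinds of evidence into weighted, undirected edges. The class `EdgeBuilder`, the two weight constants and the pair encoding are assumptions chosen for illustration; they do not describe how SAM or any particular social network API represents connections.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: merges follower relations and mentions into weighted edges. */
class EdgeBuilder {
    // Illustrative weights (assumptions): a follow counts more than a mention.
    static final double FOLLOW_WEIGHT = 1.0;
    static final double MENTION_WEIGHT = 0.5;

    /** Canonical key for an undirected pair; assumes user ids never contain '|'. */
    static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    /**
     * @param follows  pairs {follower, followed}
     * @param mentions pairs {author, mentioned user}
     * @return edge key -> accumulated weight over all observed relations
     */
    static Map<String, Double> buildEdges(List<String[]> follows, List<String[]> mentions) {
        Map<String, Double> weights = new HashMap<>();
        for (String[] f : follows) {
            weights.merge(key(f[0], f[1]), FOLLOW_WEIGHT, Double::sum);
        }
        for (String[] m : mentions) {
            weights.merge(key(m[0], m[1]), MENTION_WEIGHT, Double::sum);
        }
        return weights;
    }
}
```

Treating the edges as undirected discards the direction of a follow or mention. Whether that is acceptable depends on the analysis that consumes the graph.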

Graph-Based Clustering Techniques for Identifying Communities

Social networks can relatively easily (one might argue "naturally") be represented as graphs of vertices and edges. A straightforward approach for identifying subgroups within such a graph is to apply graph-based clustering methods[6]. Hierarchical clustering methods[7] in particular may be useful for identifying further levels of grouping within identified clusters.
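
The idea of nested levels of grouping can be sketched with single-linkage clustering. Cutting a single-linkage hierarchy at a similarity threshold yields the connected components of the graph that keeps only edges at or above that threshold, so lowering the threshold merges clusters into coarser ones. The class `SingleLinkage`, its `Edge` type and the similarity values are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: single-linkage clusters at a chosen similarity threshold. */
class SingleLinkage {

    static final class Edge {
        final String a;
        final String b;
        final double similarity;

        Edge(String a, String b, double similarity) {
            this.a = a;
            this.b = b;
            this.similarity = similarity;
        }
    }

    /** Both endpoints of every edge are assumed to be listed in 'users'. */
    static List<Set<String>> clustersAt(List<String> users, List<Edge> edges,
                                        double threshold) {
        // Union-find forest: every user starts as its own cluster.
        Map<String, String> parent = new HashMap<>();
        for (String u : users) {
            parent.put(u, u);
        }
        // Merge the clusters joined by any sufficiently strong edge.
        for (Edge e : edges) {
            if (e.similarity >= threshold) {
                parent.put(find(parent, e.a), find(parent, e.b));
            }
        }
        // Group users by the representative of their cluster.
        Map<String, Set<String>> byRoot = new HashMap<>();
        for (String u : users) {
            byRoot.computeIfAbsent(find(parent, u), k -> new TreeSet<>()).add(u);
        }
        return new ArrayList<>(byRoot.values());
    }

    private static String find(Map<String, String> parent, String u) {
        while (!parent.get(u).equals(u)) {
            parent.put(u, parent.get(parent.get(u))); // path halving
            u = parent.get(u);
        }
        return u;
    }
}
```

Calling `clustersAt` with a decreasing sequence of thresholds traces the hierarchy from fine to coarse groupings.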

Open Problems: Using Small-World Network and Scale-Free Network Properties

Social networks tend to exhibit properties of small-world networks[8], which has implications for which analysis techniques can be applied to social networks in order to identify subcommunities within a larger social network graph.
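
Small-world networks are commonly characterised by a high clustering coefficient combined with a short average path length. The sketch below computes both indicators for an unweighted, undirected graph. The class name `SmallWorldMetrics` and the adjacency-map input are assumptions for illustration, and the code is not part of any tool listed on this page.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: two indicators often used to detect small-world structure. */
class SmallWorldMetrics {

    /** Mean local clustering coefficient; nodes with fewer than two neighbours count as 0. */
    static double averageClustering(Map<String, Set<String>> adj) {
        if (adj.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (Set<String> neighbourSet : adj.values()) {
            List<String> nb = new ArrayList<>(neighbourSet);
            int k = nb.size();
            if (k < 2) {
                continue;
            }
            int links = 0; // edges that exist among the neighbours
            for (int i = 0; i < k; i++) {
                Set<String> ni = adj.getOrDefault(nb.get(i), Collections.<String>emptySet());
                for (int j = i + 1; j < k; j++) {
                    if (ni.contains(nb.get(j))) {
                        links++;
                    }
                }
            }
            sum += 2.0 * links / (k * (k - 1));
        }
        return sum / adj.size();
    }

    /** Mean shortest-path length over all reachable ordered pairs (BFS from each node). */
    static double averagePathLength(Map<String, Set<String>> adj) {
        long total = 0;
        long pairs = 0;
        for (String source : adj.keySet()) {
            Map<String, Integer> dist = new HashMap<>();
            Deque<String> queue = new ArrayDeque<>();
            dist.put(source, 0);
            queue.add(source);
            while (!queue.isEmpty()) {
                String u = queue.remove();
                for (String v : adj.getOrDefault(u, Collections.<String>emptySet())) {
                    if (!dist.containsKey(v)) {
                        dist.put(v, dist.get(u) + 1);
                        queue.add(v);
                    }
                }
            }
            for (Map.Entry<String, Integer> d : dist.entrySet()) {
                if (!d.getKey().equals(source)) {
                    total += d.getValue();
                    pairs++;
                }
            }
        }
        return pairs == 0 ? 0.0 : (double) total / pairs;
    }
}
```

Both values are usually interpreted relative to a random graph with the same number of nodes and edges; that comparison is omitted here.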

One approach for identifying subcommunities within a larger social network is to assume that the network has properties of a scale-free network[9], in which a small number of hub nodes have far more connections than the rest. Subcommunities could then be created around the identified hub nodes/persons.
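
A minimal sketch of the hub-based idea: nodes whose degree reaches a chosen threshold are treated as hubs, and each hub together with its direct neighbours forms a candidate subcommunity. The class `HubCommunities` and the `minDegree` parameter are assumptions for this example; a real system would need a principled way of choosing the threshold.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: builds candidate communities around high-degree hub users. */
class HubCommunities {

    /**
     * @param adjacency user id -> directly connected user ids (assumed undirected)
     * @param minDegree minimum number of connections for a user to count as a hub
     * @return hub id -> the hub plus its direct neighbours (communities may overlap)
     */
    static Map<String, Set<String>> aroundHubs(Map<String, Set<String>> adjacency,
                                               int minDegree) {
        Map<String, Set<String>> communities = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : adjacency.entrySet()) {
            if (entry.getValue().size() >= minDegree) {
                Set<String> members = new TreeSet<>(entry.getValue());
                members.add(entry.getKey());
                communities.put(entry.getKey(), members);
            }
        }
        return communities;
    }
}
```

A user connected to several hubs appears in several candidate communities, which may or may not be desirable depending on the application.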

Open Problems: Changes over Time

Social network interactions and structures are not static, but change over time. Tantipathananandh et al.[10] examine how such changes affect the identification of communities in social networks. They propose a solution that uses dynamic programming together with heuristics they deem appropriate for the domain of social networks.

Tools, Frameworks and Services

  • JUNG Java Universal Network/Graph Framework[11]
  • NetMiner[12]

Related Projects

  • SOCIALNETS: Social networking for pervasive adaptation[13]

SAM Approach

The SAM dynamic community creation component is responsible for identifying communities of users that may benefit from being connected via the SAM social media component. Two sub-components are involved in creating and operating dynamic communities in SAM: the Dynamic Community sub-component, which identifies the candidate communities, and the Dynamic Community backend, which manages the created communities and ensures that messages are available to the users who are part of a community.

Architecture and Dependencies

In this subsection, the functionalities of the component associated with dynamic community management are identified with respect to the identified goals and user requirements. The top component is the Context Control, as defined in SAM deliverable D3.2.1 “Global Architecture Definition”. The sub-components that address the desired features and functionalities are the Community Structure Analyser, the Community Manager and the Community Actuator.


The key functionalities that are accomplished by the dynamic community sub-components are:

  • identify clusters of users using two different formats of user representation
  • identify and label viable clusters that may become user communities in the SAM platform
  • update the cluster composition as additional user profile data is added over time and in response to users accepting or declining invitations to join a community
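
The last functionality in the list, updating a community as clusters change and as users answer invitations, can be sketched as simple set bookkeeping. The class `CommunityDraft` and its methods are hypothetical and do not reflect the actual interfaces of the Community Manager or Community Actuator.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical bookkeeping for one community as clusters and invitations change. */
class CommunityDraft {
    private final Set<String> invited = new TreeSet<>();
    private final Set<String> members = new TreeSet<>();
    private final Set<String> declined = new TreeSet<>();

    /** Called after each clustering run: invite users who have not answered before. */
    void updateFromCluster(Set<String> cluster) {
        for (String user : cluster) {
            if (!members.contains(user) && !declined.contains(user)) {
                invited.add(user);
            }
        }
    }

    /** Only users with a pending invitation can accept it. */
    void accept(String user) {
        if (invited.remove(user)) {
            members.add(user);
        }
    }

    /** Users who decline are remembered so that they are not invited again. */
    void decline(String user) {
        if (invited.remove(user)) {
            declined.add(user);
        }
    }

    Set<String> members() {
        return Collections.unmodifiableSet(members);
    }

    Set<String> invited() {
        return Collections.unmodifiableSet(invited);
    }
}
```

In this sketch a declined invitation is permanent. Whether declined users should be re-invited after some time is a policy decision that the sketch leaves open.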

Implementation and Technologies

The system has been implemented in Java as a Java Servlet. As with the majority of SAM components, the sub-components use RESTful web services to communicate with other components of the SAM platform.

Latest Developments

The latest version of the social communities identification and creation tools provides the final integration of the dynamic community tools with the remainder of the SAM platform. It also integrates a configurable, deterministic rule-based module for the creation of dynamic communities, which allows business users to manually define rules for community management.
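
To illustrate what a deterministic rule-based module of this kind might look like, the sketch below evaluates an ordered list of rules against user profiles and assigns every matching user to the community named by the rule. All names (`RuleBasedCommunities`, `UserProfile`, `Rule`) and the profile fields are assumptions made for the example and are not taken from the SAM implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical sketch of deterministic, manually defined community rules. */
class RuleBasedCommunities {

    /** Minimal user profile; the fields are illustrative assumptions. */
    static final class UserProfile {
        final String id;
        final int age;
        final Set<String> interests;

        UserProfile(String id, int age, Set<String> interests) {
            this.id = id;
            this.age = age;
            this.interests = interests;
        }
    }

    /** A rule places every user who satisfies 'condition' into 'community'. */
    static final class Rule {
        final String community;
        final Predicate<UserProfile> condition;

        Rule(String community, Predicate<UserProfile> condition) {
            this.community = community;
            this.condition = condition;
        }
    }

    /** Applies the rules in order; the same input always yields the same communities. */
    static Map<String, Set<String>> apply(List<Rule> rules, List<UserProfile> users) {
        Map<String, Set<String>> communities = new LinkedHashMap<>();
        for (Rule rule : rules) {
            Set<String> members =
                    communities.computeIfAbsent(rule.community, k -> new TreeSet<>());
            for (UserProfile user : users) {
                if (rule.condition.test(user)) {
                    members.add(user.id);
                }
            }
        }
        return communities;
    }
}
```

A rule could, for example, be written as `new RuleBasedCommunities.Rule("young-football", u -> u.age < 30 && u.interests.contains("football"))`. A configurable module would presumably load such conditions from a configuration format rather than from Java lambdas.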


  1. N. Yu and Q. Han, “Context-Aware Community: Integrating Contexts with Contacts for Proximity-Based Mobile Social Networking,” in 2013 IEEE Int. Conf. Distributed Computing in Sensor Systems, Cambridge, MA, 2013, pp. 141-148.
  2. D. Greene et al., “Tracking the Evolution of Communities in Dynamic Social Networks,” in 2010 Int. Conf. Advances in Social Networks Analysis and Mining, Odense, 2010, pp. 176-183.
  3. R. Lubke et al., “MobilisGroups: Location-Based Group Formation in Mobile Social Networks,” in 2011 IEEE Int. Conf. Pervasive Computing and Communications Workshops, Seattle, WA, 2011, pp. 502-507.
  4. K. Xu et al., “Tracking Communities in Dynamic Social Networks,” in Proc. 4th Int. Conf. Social Computing, Behavioral-Cultural Modeling and Prediction, 2011, pp. 219-226.
  5. J. Pattillo, N. Youssef and S. Butenko, “Clique Relaxation Models in Social Network Analysis,” in M. Thai and P. Pardalos (eds.), Handbook of Optimization in Complex Networks: Communication and Social Networks, Springer, 2011.
  6. Wikipedia entry “Cluster analysis”.
  7. Wikipedia entry “Hierarchical Clustering”.
  8. Wikipedia entry “Small-World Network”.
  9. Wikipedia entry “Scale-Free Network”.
  10. C. Tantipathananandh, T. Berger-Wolf and D. Kempe, “A Framework for Community Identification in Dynamic Social Networks,” in Proc. 13th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, pp. 717-726.
  11. JUNG, the Java Universal Network/Graph Framework.
  12. NetMiner.
  13. SOCIALNETS: Social networking for pervasive adaptation.