Business Intelligence and Social Mining


Business Intelligence (BI) is the ability to transform data into information, and information into knowledge, in order to optimize decision making in a company. To support this, BI offers a set of methodologies, applications and technologies to gather, refine and transform data from transactional systems and unstructured sources into structured information that can be used for reporting and analysis, and ultimately converted into knowledge.

Introduction

To make decisions about a company's strategic lines, it is necessary to know the company's current situation. Based on this information, a company strategy can be defined. BI provides systems that support this strategic planning process. These systems make it possible to discover trends and other interesting information, allowing media producers, publishers and broadcasters to make decisions quickly and thereby improve their commercialisation and exploitation opportunities.

Relevance to SAM

One objective of the SAM project is to generate this kind of information to facilitate decision making in a company. One possible way to achieve this is to store data in a Cloud Storage system and analyse it a posteriori. Work in this area must take privacy and trust concerns into account. Strategies such as anonymisation, tight and transparent user agreements, and access control can be applied. Nevertheless, social media is often public by nature, so these concerns are not expected to restrict the aims of the project.

To generate this information, it is useful to apply advanced techniques based on Social Graph Analysis, Natural Language Processing and Business Intelligence. These techniques will provide the different stakeholders with configurable reports for analysing different aspects of the data.

Finally, a possible objective of the SAM project is to detect the sentiment expressed by different people on different social media and put it to tangible business use. The results of sentiment analysis could also be used by SAM's BI (Business Intelligence) [1] mechanism to provide companies with qualitative statistics.

State of the Art Analysis

Social Network Analysis (SNA)

Social Network Analysis (SNA) [2] is the analysis of social networks with the objective of describing network characteristics from a numeric or visual point of view, based on quantitative or qualitative information. SNA also provides information about how groups of people are connected to each other. It can therefore provide relationship analyses that are useful for producing reports for SAM end users.
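
As a minimal illustration of the numeric view of a network, the following sketch computes degree centrality (the fraction of other members each person is directly connected to) for a small graph. The names and edges are invented for illustration, and a real deployment would probably rely on a dedicated graph library rather than this hand-written version.

```python
from collections import defaultdict

# Hypothetical undirected "who interacts with whom" edges (illustrative data only).
EDGES = [
    ("ana", "ben"),
    ("ana", "cho"),
    ("ana", "dev"),
    ("ben", "cho"),
    ("dev", "eli"),
]


def build_adjacency(edges):
    """Turn an edge list into a mapping person -> set of direct neighbours."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neigh) / (n - 1) for node, neigh in adjacency.items()}


adjacency = build_adjacency(EDGES)
centrality = degree_centrality(adjacency)
most_connected = max(centrality, key=centrality.get)
```

Here `most_connected` identifies the person with the most direct connections, the kind of figure a relationship report could surface for end users.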

Data Warehousing

In computing, a data warehouse (DW) is a system used for reporting and data analysis. It is a central repository created by integrating data from one or more disparate sources. Data warehouses store current and historical data and are used to create trend reports for senior management, such as annual and quarterly comparisons. The data stored in the warehouse is uploaded from the operational systems (such as marketing and sales). The data may pass through an operational data store for additional operations before it is used in the DW for reporting. [3]
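
The idea of integrating several operational sources into one repository and then producing a quarterly comparison can be sketched with Python's built-in `sqlite3` module. The table name `facts`, its columns and the sample rows are assumptions made for this example, not part of any SAM schema.

```python
import sqlite3

# Hypothetical rows exported from two operational systems (illustrative data only).
SALES_ROWS = [("2014-02-10", 120.0), ("2014-05-03", 200.0), ("2014-05-20", 50.0)]
MARKETING_ROWS = [("2014-01-15", 30.0), ("2014-04-01", 45.0)]


def quarter_of(iso_date):
    """Map an ISO date string such as '2014-05-03' to a label such as '2014-Q2'."""
    year, month = iso_date[:4], int(iso_date[5:7])
    return f"{year}-Q{(month - 1) // 3 + 1}"


def load_warehouse(conn, source, rows):
    """Append rows from one source system to the central fact table."""
    conn.executemany(
        "INSERT INTO facts (source, quarter, amount) VALUES (?, ?, ?)",
        [(source, quarter_of(day), amount) for day, amount in rows],
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (source TEXT, quarter TEXT, amount REAL)")
load_warehouse(conn, "sales", SALES_ROWS)
load_warehouse(conn, "marketing", MARKETING_ROWS)

# Quarterly comparison across all integrated sources.
quarterly_totals = conn.execute(
    "SELECT quarter, SUM(amount) FROM facts GROUP BY quarter ORDER BY quarter"
).fetchall()
```

Because both sources land in the same table, the trend query does not need to know which operational system each row came from.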

Reports Generator

A reports generator is needed to produce interactive, tabular, graphical or free-form reports from the datasets that the SAM platform uses. It is also desirable to include rich data visualisation, such as charts, maps and sparklines, and to let users select from a variety of viewing formats, export reports to other applications, and subscribe to published reports.

A challenge for a reports generator is to offer online ('ad hoc') reporting to the end user, because this requires a production reporting system. Normally, this involves querying an OLTP database, a capability that BI tools can offer. A goal for this kind of reporting is that end users should not notice that they are using a BI tool, so the OLTP application needs to be embedded in the final product. The main feature of these products is that they offer both standard reports and custom reports defined by the end user. There are basically two kinds of environment, desktop and web, and today the web is predominant.

Natural Language Processing

Natural Language Processing (NLP) [4] is part of social media analytics, but it needs to be integrated with the BI process. NLP techniques exploit the structure of language to extract basic units of information, called named entities, from a sentence. These techniques also make it possible to infer the relations between entities and their features. Developing these processes requires identifying parts of speech and word groups, as well as the roles they play within each sentence.

Data Mining

Today, a big challenge for companies is to extract knowledge from the huge amount of data stored in their databases. Data Mining [5] makes it possible to extract this insight, and for this reason it is a basic pillar of a BI environment. Its main objective is to build analytical models based on discovered patterns and associations, to use them for classification and prediction, and to present the mining results.
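
To make "discovering associations" concrete, the sketch below counts how often pairs of items occur together and keeps the pairs that appear in at least a given share of the records (their support). This is only the simplest building block of association mining, and the viewing sessions used as input are invented for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical viewing sessions: the programmes each user watched (illustrative only).
SESSIONS = [
    {"news", "sports"},
    {"news", "sports", "drama"},
    {"drama", "comedy"},
    {"news", "sports", "comedy"},
]


def frequent_pairs(sessions, min_support):
    """Return item pairs whose share of sessions is at least min_support."""
    pair_counts = Counter()
    for items in sessions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    total = len(sessions)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


associations = frequent_pairs(SESSIONS, min_support=0.5)
```

With this sample data, only the pair that co-occurs in most sessions passes the threshold; lowering `min_support` would admit weaker associations.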

Several terms relate to knowledge extraction in social networks: "Opinion Mining", "Sentiment Analysis" [6], "Concept-Level Opinion Analysis", "Social Mining", etc. The basic idea of these approaches is to use Artificial Intelligence techniques to automatically obtain useful knowledge about the opinions, preferences and trends of users in social networks. These mining techniques provide an excellent opportunity to capture the views and feelings of the general public about:

  • Social Events.
  • Movements or political parties.
  • Business strategies.
  • Brand image.
  • Products and Services.
  • Relevant Persons.

Sentiment Analysis

The objective of sentiment analysis is to determine the attitude expressed in a sentence, phrase or comment. A simple approach is to count the positive and negative words in a sentence and categorise its sentiment accordingly. However, a sentence that uses positive words sarcastically probably carries a negative sentiment. It is therefore necessary to apply different sentiment analysis techniques that extract the true sentiment based on the context of the dialogue.
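
The word-counting approach, and its weakness with sarcasm, can be sketched as follows. The two word lists are tiny, hypothetical lexicons chosen for the example; a real system would use a curated lexicon and context-aware techniques.

```python
import re

# Tiny hypothetical lexicons; a real system would use a curated sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "enjoyed"}
NEGATIVE_WORDS = {"bad", "boring", "hate", "terrible", "awful"}


def word_count_sentiment(text):
    """Classify text by counting positive versus negative lexicon words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


plain = word_count_sentiment("I love this show, the cast is excellent")
negative = word_count_sentiment("The second episode was boring and the ending was awful")
# A sarcastic remark: the intended sentiment is negative, yet only positive words are found.
sarcastic = word_count_sentiment("Oh great, another delay. I just love waiting.")
```

The sarcastic sentence is labelled positive because the counter sees only "great" and "love", which is exactly the failure that motivates context-based techniques.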

BI Tools

Within the SAM context, the following are the major requirements for the BI solution:

  • An Intuitive Web User Interface: Content Providers will need to define and generate reports in an easy and fast way in order to make decisions based on these reports around the clock. For this reason, it is crucial to provide a friendly web user interface that provides access anytime, anywhere.
  • Ad Hoc Reporting and Analysis: SAM’s BI solution must offer simple ad hoc reporting capabilities that make it easy for workers at any level, including high-level staff members, to quickly build and run their own reports whenever they need them.
  • Flexible Formatting Options: SAM Content providers should be able to present their information in different ways. Therefore the selected BI technology must allow users to output their reports as Excel spreadsheets, Word documents, web pages, Adobe PDF files, or other common formats.
  • Dynamic Information Distribution: The BI tool should allow SAM Content Providers to schedule reports to run automatically at pre-set days and times. This helps ensure that the information used to support decision making is refreshed at regular intervals, so that it is up to date and accurate at all times.
  • Modular Design: Decoupling the data integration, ETL and data warehousing implementation from the user interface, with its reporting and analytics functionality, will maximise capabilities and allow a flexible design.
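
The "Flexible Formatting Options" requirement can be pictured as one report rendered by interchangeable exporters. The sketch below uses only the Python standard library and covers CSV and HTML; the column names, sample rows and the `EXPORTERS` registry are assumptions for illustration, and formats such as PDF or Word would need additional libraries.

```python
import csv
import html
import io

# Hypothetical report content; column names are illustrative, not SAM's schema.
HEADER = ["asset", "views", "positive_share"]
ROWS = [["Trailer A", 1200, 0.82], ["Episode <1>", 450, 0.64]]


def to_csv(header, rows):
    """Render the report as CSV text, which spreadsheet tools can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def to_html(header, rows):
    """Render the report as an HTML table, escaping cell content."""

    def row_html(cells, tag):
        inner = "".join(f"<{tag}>{html.escape(str(c))}</{tag}>" for c in cells)
        return "<tr>" + inner + "</tr>"

    body = [row_html(header, "th")] + [row_html(r, "td") for r in rows]
    return "<table>\n" + "\n".join(body) + "\n</table>"


# Registry of output formats; adding a format means adding one entry here.
EXPORTERS = {"csv": to_csv, "html": to_html}


def export_report(fmt, header=HEADER, rows=ROWS):
    """Dispatch to the exporter registered for the requested format."""
    return EXPORTERS[fmt](header, rows)


csv_text = export_report("csv")
html_text = export_report("html")
```

Keeping the report data separate from the exporters also reflects the modular design requirement: the same rows feed every output format.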

The following list presents some of the most promising BI technologies for the SAM context:

  • Logi Analytics [7] (formerly Logi XML) focuses on the user-facing aspect of BI, offering a very easy-to-use and embeddable platform that includes reporting, analysis and dashboards for both IT and business users, plus data integration.
  • GoodData [8] is a cloud BI and analytics specialist delivering a complete BI solution as SaaS. It provides a range of front-end BI capabilities and packaged analytic applications that complement its comprehensive cloud and on-premises source data integration and cloud-based data warehouse platform.
  • Birst[9] BI platform is primarily a cloud-based offering. It includes a broad range of components, such as data integration, federation and modelling, a data warehouse with a semantic layer, reporting, dashboards, mobile BI and a recently announced interactive visualization tool.
  • Jaspersoft[10] sells an end-to-end, open-source, BI and data integration platform featuring a low-cost-of-ownership value proposition often used to build embedded BI applications. Jaspersoft has a scalable, modular, standards-based design that allows the flexibility needed for a wide variety of deployments from on-premises to cloud.
  • Microsoft BI [11] offers a competitive and expanding set of BI and analytics capabilities, packaging and pricing. Its reporting and analytics features are primarily based on Excel with the Power View add-on, while data integration is handled by Power Pivot and Power Query, with the data warehouse implemented on SQL Server. On-premises as well as cloud deployments are supported, both via a SharePoint portal.
  • Revolution Extreme is an ETL and data warehouse solution provided by TIE Kinetix. It provides unique automated generation of data integration processes and monitoring tables. It aims at significantly speeding up and simplifying data warehouse development by providing a configuration user interface (rather than coding), automatic document creation, and traceability of data. It can connect to any source providing a SQL Server Integration Services interface.

Social Media Tools

The following list contains information about different Social Media Tools:

SAM Approach

The Business Intelligence (BI) subcomponent allows business users, Media Broadcasters and Information Brokers to monitor the actions and events within the SAM platform. It also provides functionalities to derive useful insights from this data, and it gives access to advanced Social Media features of the SAM platform, such as sentiment analysis, through the Social Mining subcomponent. It therefore aims to provide valuable feedback to the business user to enable business-related decision making.

Architecture and Dependencies

The Business Intelligence subcomponent provides the reports and analysis in the SAM platform. Therefore it is of maximum importance that the selected technology provides the following features:

  • An Intuitive Web User Interface: Content Providers will need to define and generate reports in an easy and fast way in order to make decisions based on these reports around the clock. For this reason, it is crucial to provide a user-friendly web user interface that provides access anytime, anywhere.
  • Ad-Hoc Reporting and Analysis: SAM’s BI solution must offer simple ad-hoc reporting capabilities that make it easy for workers at any level, including high-level staff members, to quickly build and run their own reports whenever they need them.
  • Flexible Formatting Options: SAM Content providers should be able to present their information in different ways. Therefore the selected BI technology must allow users to output their reports as Excel spreadsheets, Word documents, web pages, Adobe PDF files, or other common formats.
  • Dynamic Information Distribution: The BI tool should allow SAM Content Providers to schedule reports to run automatically at pre-set days and times. This helps ensure that the information used to support decision making is refreshed at regular intervals, so that it is up to date and accurate.
  • Modular Design: Decoupling the data integration and data warehouse implementation from the user interface with reporting and analytics functionality will allow the maximising of capabilities and a flexible design.

Figure: Analytics Architecture

Implementation and Technologies

Frontend Technologies

Backend Technologies

Subcomponents

A summary of the tasks carried out for each subcomponent of the first version of the prototype is shown in the following table.

Subcomponent | Task
Actors | Media Broadcasters and Information Brokers will use the BI functionalities to define and access the analytic reports.
Report Manager | Provides tools to help the user create the necessary information requests (queries) and visualise different kinds of reports. It will provide a graphical interface that makes it easy to see which data fields are available and to define the BI reports. Finally, it lets stakeholders access and browse the reports once they have been generated or updated.
ETL | Once the reports have been created, the next step is to generate the information needed to show them. To reach this goal, the BI component will Extract, Transform and Load (ETL) information into the system. This information will be extracted from the Cloud Storage and Social Mining components.
Data Warehouse | Stores the information extracted in the ETL process in a specific database. It will additionally store a metadata dictionary in Cloud Storage that can be used by the Report Manager.
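
The flow between the ETL and Data Warehouse subcomponents can be sketched as three small functions. Everything concrete here is an assumption made for the example: the record layouts standing in for Cloud Storage and Social Mining output, the `asset_activity` table, and the `METADATA` dictionary that mimics what the Report Manager might consult to list available fields.

```python
import sqlite3

# Hypothetical records as they might arrive from the two upstream components.
CLOUD_STORAGE_EVENTS = [
    {"asset_id": "a1", "action": "view"},
    {"asset_id": "a1", "action": "view"},
    {"asset_id": "a2", "action": "share"},
]
SOCIAL_MINING_RESULTS = [
    {"asset_id": "a1", "sentiment": "positive"},
    {"asset_id": "a2", "sentiment": "negative"},
]


def extract():
    """Extract: collect raw records from both (stubbed) sources."""
    return CLOUD_STORAGE_EVENTS, SOCIAL_MINING_RESULTS


def transform(events, sentiments):
    """Transform: count events per asset and attach the mined sentiment."""
    counts = {}
    for event in events:
        counts[event["asset_id"]] = counts.get(event["asset_id"], 0) + 1
    sentiment_by_asset = {s["asset_id"]: s["sentiment"] for s in sentiments}
    return [
        (asset_id, count, sentiment_by_asset.get(asset_id, "unknown"))
        for asset_id, count in sorted(counts.items())
    ]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS asset_activity "
        "(asset_id TEXT PRIMARY KEY, events INTEGER, sentiment TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO asset_activity VALUES (?, ?, ?)", rows)


# Hypothetical metadata dictionary describing the fields available for reports.
METADATA = {"asset_activity": ["asset_id", "events", "sentiment"]}

conn = sqlite3.connect(":memory:")
load(conn, transform(*extract()))
report_rows = conn.execute(
    "SELECT asset_id, events, sentiment FROM asset_activity ORDER BY asset_id"
).fetchall()
```

The final query stands in for the Report Manager reading from the warehouse; it touches only the loaded table, never the upstream sources.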

Functionality and UI Elements

Articles

References

<references>
  1. BI (Business Intelligence) http://en.wikipedia.org/wiki/Business_intelligence
  2. SNA http://en.wikipedia.org/wiki/Social_Network_Analysis
  3. Data Warehouse http://en.wikipedia.org/wiki/Data_warehouse
  4. NLP http://en.wikipedia.org/wiki/Natural_language_processing
  5. Data Mining http://en.wikipedia.org/wiki/Data_mining
  6. Sentiment Analysis http://en.wikipedia.org/wiki/Sentiment_analysis
  7. Logi Analytics http://www.logianalytics.com/info
  8. GoodData http://www.gooddata.com
  9. Birst http://www.birst.com/
  10. Jaspersoft https://www.jaspersoft.com/
  11. Microsoft BI http://www.microsoft.com/en-us/server-cloud/solutions/business-intelligence/