Data Mining Functionalities

Data mining has an important place in today’s world. It has become an important research area because most applications now generate huge amounts of data. This data must be processed in order to extract useful information and knowledge, which are not explicit in the raw data. Data mining is the process of discovering interesting knowledge from large amounts of data.

The kinds of patterns that can be discovered depend upon the data mining tasks employed. By and large, there are two types of data mining tasks: descriptive data mining tasks, which describe the general properties of the existing data, and predictive data mining tasks, which attempt to make predictions based on inference from the available data. The data mining functionalities and the variety of knowledge they discover are briefly presented in the following list:

  1. Characterization: This is the summarization of the general features of objects in a target class, and it produces what are called characteristic rules. The data relevant to a user-specified class are normally retrieved by a database query and run through a summarization module to extract the essence of the data at different levels of abstraction. For example, one may wish to characterize the customers of a store who regularly rent more than a certain number of movies a year. With concept hierarchies on the attributes describing the target class, the attribute-oriented induction method can be used to carry out data summarization. With a data cube containing a summarization of the data, simple OLAP operations fit the purpose of data characterization.
  2. Discrimination: Data discrimination produces what are called discriminant rules and is basically the comparison of the general features of objects between two classes referred to as the target class and the contrasting class.
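The two functionalities above can be illustrated with a minimal sketch. It is not the attribute-oriented induction algorithm itself, only a simplified version of the idea: each record is generalized through a small concept hierarchy, and the share of each generalized tuple is reported for the target class (characterization) or for the target and contrasting classes side by side (discrimination). The customer records, the `age_group` hierarchy, and the `FREQUENT_RENTALS` threshold are all hypothetical values chosen for illustration.

```python
from collections import Counter

# Hypothetical threshold separating the target class (frequent renters)
# from the contrasting class; the value is arbitrary.
FREQUENT_RENTALS = 20


def age_group(age):
    """Hypothetical one-level concept hierarchy for the age attribute."""
    if age < 30:
        return "under 30"
    if age < 60:
        return "30-59"
    return "60+"


def generalize(customer):
    """Replace low-level attribute values with higher-level concepts."""
    return (age_group(customer["age"]), customer["region"])


def characterize(customers):
    """Summarize a class as the share of each generalized tuple."""
    counts = Counter(generalize(c) for c in customers)
    total = sum(counts.values())
    return {combo: count / total for combo, count in counts.items()}


def discriminate(target, contrasting):
    """Pair each tuple's share in the target class with its share
    in the contrasting class."""
    t, c = characterize(target), characterize(contrasting)
    return {combo: (t.get(combo, 0.0), c.get(combo, 0.0))
            for combo in sorted(set(t) | set(c))}


# Made-up sample data, for illustration only.
customers = [
    {"age": 22, "region": "urban", "rentals": 35},
    {"age": 27, "region": "urban", "rentals": 41},
    {"age": 45, "region": "urban", "rentals": 28},
    {"age": 25, "region": "rural", "rentals": 25},
    {"age": 52, "region": "rural", "rentals": 4},
    {"age": 67, "region": "rural", "rentals": 2},
    {"age": 33, "region": "urban", "rentals": 9},
    {"age": 71, "region": "urban", "rentals": 1},
]

target = [c for c in customers if c["rentals"] > FREQUENT_RENTALS]
contrasting = [c for c in customers if c["rentals"] <= FREQUENT_RENTALS]

characteristic_rules = characterize(target)
discriminant_rules = discriminate(target, contrasting)

for combo, (in_target, in_contrast) in discriminant_rules.items():
    print(combo, f"target={in_target:.2f}", f"contrasting={in_contrast:.2f}")
```

In this toy data, a tuple that is common in the target class but absent from the contrasting class is the kind of pattern a discriminant rule would express. A real system would work with deeper concept hierarchies and far larger data sets, typically through a data cube or database queries.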

An Introduction to Data Mining

Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. These tools can include statistical models, mathematical algorithms, and machine learning methods such as neural networks or decision trees. Consequently, data mining consists of more than collecting and managing data; it also includes analysis and prediction. The objective of data mining is to identify valid, novel, potentially useful, and understandable correlations and patterns in existing data. Finding useful patterns in data is known by different names (e.g., knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing).

The term “data mining” is primarily used by statisticians, database researchers, and the business communities. The term KDD (Knowledge Discovery in Databases) refers to the overall process of discovering useful knowledge from data, where data mining is a particular step in this process. The steps in the KDD process, such as data preparation, data selection, data cleaning, and proper interpretation of the results of the data mining process, ensure that useful knowledge is derived from the data. Data mining is an extension of traditional data analysis and statistical approaches as it incorporates analytical techniques drawn from various disciplines like AI, machine learning, OLAP, data visualization, etc.

Data mining covers a variety of techniques for identifying nuggets of information or decision-making knowledge in bodies of data, and for extracting these in such a way that they can be put to use in areas such as decision support, prediction, forecasting and estimation. The data is often voluminous but, as it stands, of low value, since no direct use can be made of it; it is the hidden information in the data that is really useful.

Disaster Recovery Plan (DRP) in Business

Fire, flood, earthquake and accidental deletion of data are all events that can have disastrous consequences for data. Such disasters can prevent the network from operating normally, which in turn can hamper the organisation’s business. These disasters can be classified into man-made disasters and environmental disasters. Man-made disasters are caused by humans, intentionally or unintentionally. For example, a user may accidentally delete data, viruses and malicious programs can damage data, and various other events can cause data loss and downtime. Environmental disasters cannot be prevented, but their impact can be reduced if appropriate precautions are taken. Environmental disasters include fire, flood, earthquake, tornado and hurricane.

Disaster recovery deals with the recovery of data that has been damaged by destructive events. The time required to recover from a disaster depends on the disaster recovery plan implemented by the organisation. A good disaster recovery plan can help protect an organisation against many types of disruption.

Disaster Recovery Plan/Business Continuity Plan

A Disaster Recovery Plan (DRP) helps to identify threats to an existing business such as terrorism, fire, earthquake and flood. It also provides guidance on how to deal with the occurrence of such events. Disasters are unpredictable; hence, planning for the worst is important for any business. A DRP is also called a Business Continuity Plan (BCP). The only difference between the two is the focus: a Business Continuity Plan focuses on providing continuity of operations in the organisation, whereas a Disaster Recovery Plan focuses on the recovery and rebuilding of the organisation after a disaster has occurred.


Business Intelligence (BI)

Business Intelligence is the process of discovering and analyzing data to make informed business decisions. In today’s world, the management of any business needs this capability as part of the company’s integral infrastructure in order for the business to succeed. The data collected from the many data-collecting sources is used to determine trends, or to measure, manage and improve the performance of individuals, processes, teams and business units. The enterprise refers to any business organisation that uses computers as an integral part of its business and relies on them for that business’s development.

The History Of Business Intelligence

The term Business Intelligence was coined by the Gartner Group in the mid-1990s. But Business Intelligence was around before that; it originated in the Management Information Systems reporting systems of the 1970s. Reports in this era were only two-dimensional, and there was no analytical dimension to reporting. In the early 1980s, Executive Information Systems emerged. These introduced ad hoc (on demand) reporting, forecasting, prediction, trend analysis, drill-down to details, status access and critical success factors. They were available to top-level managers, who were the ones making decisions about the business’s future. In the 1990s, some of these capabilities appeared in products along with some new ones, and the result was called Business Intelligence. A good Business Intelligence based enterprise information system contains all the information executives need. By 2005, Business Intelligence systems had started to include artificial intelligence capabilities and more powerful analytical capabilities. The most sophisticated Business Intelligence products include most of these capabilities.

Case Study of Zara: Application of Business Intelligence in Retail Industry

Zara is a Spanish clothing and accessories retailer based in Arteixo, Galicia. Founded on 24 May 1975 by Amancio Ortega and Rosalía Mera, the brand is renowned for its ability to deliver new clothes to stores quickly and in small batches. Zara needs just two weeks to develop a new product and get it to stores, compared to the six-month industry average, and launches around 10,000 new designs each year. Zara was described by Louis Vuitton Fashion Director Daniel Piette as “possibly the most innovative and devastating retailer in the world.” The company produces about 450 million items a year for its 1,770 stores in 86 countries.

Zara has made use of Information Systems (IS) to advance in many areas, and this has resulted in huge success for the company. This included the application of Business Intelligence (BI), which involves technologies and practices for the collection, integration, analysis and presentation of business information. The main aim of business intelligence is to promote better business decision making.

BI describes a group of concepts and methods for improving decision making in business. This is achieved by employing fact-based support systems. The intelligence systems are data-driven and are sometimes used in executive information systems. Predictive views on business operations can be provided by BI systems, since historical and current data has been gathered into a data bank. Performance management benchmarking is also done, whereby information on other companies in the same industry is gathered.

Comparison Between Proprietary Software and Open Source Software

Proprietary software can be defined as closed software that is distributed under a license agreement that limits any modifications to the software. It is the opposite of the concept of open source software. Open source software can be defined as software that is distributed freely under a license agreement with no limitations on changes made to the source code.

Many proprietary corporations make software freely available to users. For example, Adobe provides users with the Adobe Acrobat Reader. The Adobe Acrobat Reader is an application that allows users to view documents that have been saved in the portable document format (PDF). The PDF format was developed by Adobe and has become a standard for saving files as electronic documents. ‘Standardization’ and ‘compatibility’ are the two main drivers of the success of proprietary software. Any user who opens a PDF formatted document with Adobe Acrobat Reader is confident that the document will be accessible and readable. The standardization of the PDF format for document files is what made almost every document processing application incorporate it into its processes. In turn, that made the PDF format compatible across document processing software. However, software corporations that incorporate a process to save files as PDF formatted documents must pay Adobe a licensing fee in order to have access to the code that will convert files into PDF in conformity with the Adobe Acrobat Reader application. The Adobe Acrobat Reader creates user dependency. Users who create documents want software that understands the PDF format, and the software vendor must pay Adobe for licensing in order to meet the requirements of its users.