Data are becoming an infinite commodity. The most common “trait” of any profession is the use of data, be it qualitative or quantitative. There are so many data-related terms that whole disciplines have grown up around them. COVID-19 has re-emphasized how much rigorous attention data requires, from raw data to biometric data used in facial recognition, pattern matching, etc. So, what are these terms that are significantly impacting our lives? A Data Scientist will be able to give you more accurate definitions, relationships, and scientific applications. In layman’s terms, however, I will just point out the relationships, the differences, and some example applications.
- Data Analysis: The part common to all data lingo is Data Analysis. It is present everywhere from Analytics to AI, and whether you work in Excel or Python, Data Analysis has to be performed. So, I do not want to bore you with such a simple but all-important term. It is hard to pin down when the term “Data Analysis” was first used, but it is safe to say it originated in the 1940s.
- Data Mining: Data Mining has its roots in the database systems of the 1960s. The term is often used interchangeably with Knowledge Discovery in Databases (KDD). However, KDD is the overall process of finding useful information and patterns in data, while Data Mining is the use of algorithms to extract that information and those patterns within the KDD process (Margaret H. Dunham). Widely used data mining tasks include Classification, Regression, Time Series Analysis, Prediction, Clustering, Summarization, Association Rules, Sequence Discovery, etc.
- Analytics: Analytics, sometimes used interchangeably with Business Analytics, is the use of software/information technology, statistical analysis, quantitative methods, and mathematical or computer-based models (James R. Evans). It is a process of extracting information with different tools to support informed business decisions. Business Analytics software, for instance MS Excel, SAS, and Minitab, is widely used by professionals ranging from Marketing, Finance, Accounting, and HR to Data Scientists.
- Data Science: Data Science is automated data processing through the principles, processes, and applications of data. It sounds pretty theoretical, but it is not. It also involves the Data Mining technologies discussed above. Note that although these processes support data processing technologies, not all of them are Data Science tools. Data Science also takes into account one more term: “Big Data”. Big Data, as you know, is data so complex and high in volume that it cannot be processed with a traditional software package.
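For readers who like to see things in code, here is a minimal Python sketch of the difference between plain Data Analysis (summarizing what the data already says) and a Data Mining task such as Regression (fitting a pattern that can be used for Prediction). The numbers, the variable names `ad_spend` and `sales`, and the helper `fit_line` are all made up for illustration; a real project would more likely lean on a dedicated library than on hand-written formulas.

```python
from statistics import mean

# Hypothetical monthly figures, invented purely for illustration.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

# Data Analysis: summarize what the data already says.
average_sales = mean(sales)


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Data Mining (Regression): learn a pattern, then use it for Prediction.
slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 60 + intercept  # forecast for an unseen ad spend of 60

print(f"Average sales: {average_sales}")
print(f"Fitted line: sales = {slope:.1f} * ad_spend + {intercept:.1f}")
print(f"Predicted sales at ad spend 60: {predicted_sales:.1f}")
```

The first step only describes the past; the second extracts a relationship that can say something about a case that is not in the data, which is the sense in which Data Mining goes beyond basic analysis.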
I will rest here, as the other terms will be described in the second part of this series. Whatever the term may be, the ultimate goal is to predict, process, and finally help stakeholders make decisions. So, bon voyage on the journey into the Data Domain. We will be exploring, mining, and processing, and finally simplifying the terms so that they do not look like jargon to us. I hope to see you there. Happy Data Driving.