What is Big Data – Know More about BIG Data

What is Big Data – Know More about BIG Data
“Big data is like teenage sex. Everybody is talking about it, everyone thinks everyone else is doing it, so everyone claims they are doing it.” 

Click Here to know : Big Data Implementation - Java-Based Tools


What is Big Data? You may ask; and more importantly why it is the latest trend in nearly every business domain? Is it just a hype or its here to stay?
As a matter of fact “Big Data” is a pretty straightforward term – its just what its says – a very large data-set. How large? The exact answer is “as large as you can imagine”!
How can this data-set be so massively big? Because the data may come from everywhere and in enormous rates: RFID sensors that gather traffic data, sensors used to gather weather information, GPRS packets from cell phones, posts to social media sites, digital pictures and videos, online purchase transaction records, you name it! Big Data is an enormous data-set that may contain information from every possible source that produces data that we are interested in.
Nevertheless Big Data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make businesses more agile, and to answer questions that were previously considered beyond our reach. That is why Big Data is characterized by four main aspects: Volume, Variety, Velocity, and Veracity(Value) known as “The four V's of Big Data”. Let’s briefly examine what each one of them stands for and what challenges it presents:


Volume

Volume references the amount of content a business must be able to capture, store and access. 90% of the world’s data has been generated in the past two years alone. Organizations today are overwhelmed with volumes of data, easily amassing terabytes—even petabytes—of information of all types, some of which needs to be organized, secured and analyzed.

Variety

80% of the world’s data is semi – structured. Sensors, smart devices and social media are generating this data through Web pages, weblog files, social-media forums, audio, video, click streams, e-mails, documents, sensor systems and so on. Traditional analytics solutions work very well with structured information, for example data in a relational database with a well formed schema. Variety in data types represents a fundamental shift in the way data is stored and analysis needs to be done to support today’s decision-making and insight process. Thus Variety represents the various types of data that can’t easily be captured and managed in a traditional relational database but can be easily stored and analyzed with Big Data technologies.

Velocity

Velocity requires analyzing data in near real time, aka “sometimes 2 minutes is too late!”. Gaining a competitive edge means identifying a trend or opportunity in minutes or even seconds before your competitor does. Another example is time-sensitive processes such as catching fraud where information must be analyzed as it streams into your enterprise in order to maximize its value. Time-sensitive data has a very short shelf-life; compelling organizations to analyze them in near real-time.

Veracity (Value)

Acting on data is how we create opportunities and derive value. Data is all about supporting decisions, so when you are looking at decisions that can have a major impact on your business, you are going to want as much information as possible to support your case. Nevertheless the volume of data alone does not provide enough trust for decision makers to act upon information. The truthfulness and quality of data is the most important frontier to fuel new insights and ideas. Thus establishing trust in Big Data solutions probably presents the biggest challenge one should overcome to introduce a solid foundation for successful decision making.
While the existing installed base of business intelligence and data warehouse solutions weren’t engineered to support the four V’s, big data solutions are being developed to address these challenges.

 The Importance of Big Data and What You Can Accomplish

The real issue is not that you are acquiring large amounts of data. It's what you do with the data that counts. The hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyze it to find answers that enable 1) cost reductions, 2) time reductions, 3) new product development and optimized offerings, and 4) smarter business decision making. For instance, by combining big data and high-powered analytics, it is possible to:
  1. Determine root causes of failures, issues and defects in near-real time, potentially saving billions of dollars annually.
  2. Optimize routes for many thousands of package delivery vehicles while they are on the road.
  3. Analyze millions of SKUs to determine prices that maximize profit and clear inventory.
  4. Generate retail coupons at the point of sale based on the customer's current and past purchases.
  5. Send tailored recommendations to mobile devices while customers are in the right area to take advantage of offers.
  6. Recalculate entire risk portfolios in minutes.
  7. Quickly identify customers who matter the most.
  8. Use clickstream analysis and data mining to detect fraudulent behavior.

 Challenges

Many organizations are concerned that the amount of amassed data is becoming so large that it is difficult to find the most valuable pieces of information.
  1. What if your data volume gets so large and varied you don't know how to deal with it?
  2. Do you store all your data?
  3. Do you analyze it all?
  4. How can you find out which data points are really important?
  5. How can you use it to your best advantage?
Until recently, organizations have been limited to using subsets of their data, or they were constrained to simplistic analyses because the sheer volumes of data overwhelmed their processing platforms. But, what is the point of collecting and storing terabytes of data if you can't analyze it in full context, or if you have to wait hours or days to get results? On the other hand, not all business questions are better answered by bigger data. You now have two choices:
A. Incorporate massive data volumes in analysis. If the answers you're seeking will be better provided by analyzing all of your data, go for it. High-performance technologies that extract value from massive amounts of data are here today. One approach is to apply high-performance analytics to analyze the massive amounts of data using technologies such as grid computing, in-database processing and in-memory analytics.

B. Determine upfront which data is relevant. Traditionally, the trend has been to store everything (some call it data hoarding) and only when you query the data do you discover what is relevant. We now have the ability to apply analytics on the front end to determine relevance based on context. This type of analysis determines which data should be included in analytical processes and what can be placed in low-cost storage for later use if needed.

No comments:

Post a Comment