What is Big Data?

Many people believe Big Data is simply a large amount of data, but it is defined by more than just size.
Gartner defines it as follows: "Big Data are high-volume, high-velocity, and/or high-variety
information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."

Big Data is described within the Gartner definition based on the three Vs:

  •  Volume: Size of data (how big it is)
  •  Velocity: How fast data is being generated
  •  Variety: Variation of data types to include source, format, and structure

In terms of the three Vs, the Gartner definition effectively says that:
"There is a lot of data, it is coming into the system rapidly, and it comes from many different sources in many different formats."
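To make the three Vs concrete, here is a minimal Python sketch that measures each one for a small batch of incoming records. The `Record` type, its field names, and the sample data are all hypothetical and chosen only for illustration; real systems would measure these properties at a much larger scale and with their own tooling.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A hypothetical incoming record; all field names are illustrative."""
    timestamp: float  # arrival time in seconds since the start of the stream
    source: str       # where the record came from
    fmt: str          # its format, e.g. "text", "json", "csv"
    payload: bytes    # the raw content


def three_vs(records):
    """Summarize a non-empty batch of records in terms of the three Vs."""
    # Volume: total size of the data in bytes.
    volume_bytes = sum(len(r.payload) for r in records)

    # Velocity: records per second over the observed time span.
    # Undefined (None) when all records share the same timestamp.
    times = [r.timestamp for r in records]
    span = max(times) - min(times)
    velocity = len(records) / span if span > 0 else None

    # Variety: number of distinct (source, format) combinations.
    variety = len({(r.source, r.fmt) for r in records})

    return {
        "volume_bytes": volume_bytes,
        "velocity_per_sec": velocity,
        "variety": variety,
    }


records = [
    Record(0.0, "web_log", "text", b"GET /index.html"),
    Record(1.0, "sensor", "json", b'{"temp": 21.5}'),
    Record(2.0, "crm", "csv", b"42,Alice,NY"),
    Record(4.0, "web_log", "text", b"GET /about"),
]

summary = three_vs(records)
print(summary)
```

For this toy batch the summary reports 50 bytes of volume, a velocity of 1.0 record per second, and a variety of 3 distinct source/format pairs. Data becomes "big" when these numbers grow beyond what traditional processing handles comfortably.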

IT companies are investing billions of dollars into research and development for Big Data, Business Intelligence (BI), data mining, and analytic processing technologies. This fact underscores the importance of accessing and making sense of Big Data in a fast, agile manner.
Big Data is important; those who can harness Big Data will have the edge in critical decision making. Companies utilizing advanced analytics platforms to gain real value from Big Data will grow faster than their competitors and seize new opportunities.

To support Big Data, modern analytic processing tools must:

  • Shift away from traditional, rearward-looking BI tools and platforms to more forward-thinking analytic platforms.
  • Support a data environment that is less focused on integrating with only traditional, corporate data warehouses and more focused on easy integration with external sources.
  • Support a mix of structured, semi-structured, and unstructured data without complex, time-consuming IT engineering efforts.
  • Process data quickly and efficiently to return answers before the business opportunity is lost.
  • Present the business user with an interface that doesn't require extensive IT knowledge to operate.
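
As a rough illustration of the "mix of structured, semi-structured, and unstructured data" requirement above, the following Python sketch loads three kinds of input into one common list of dictionaries using only the standard library. The loader names, the `_kind` tag, and the sample inputs are assumptions made for this example, not part of any particular analytics platform.

```python
import csv
import io
import json


def from_csv(text):
    """Structured data: every row has the same named columns."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def from_json_lines(text):
    """Semi-structured data: one JSON object per line, fields may vary."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def from_free_text(text):
    """Unstructured data: keep each non-empty line as raw text."""
    return [{"text": line.strip()} for line in text.splitlines() if line.strip()]


# Hypothetical registry mapping an input kind to its loader.
LOADERS = {
    "csv": from_csv,
    "jsonl": from_json_lines,
    "text": from_free_text,
}


def load(kind, text):
    """Load one input and tag every record with the kind it came from."""
    return [{"_kind": kind, **row} for row in LOADERS[kind](text)]


csv_data = "id,name\n1,Alice\n2,Bob\n"
jsonl_data = '{"id": 3, "tags": ["vip"]}\n'
text_data = "Customer called about a late delivery.\n"

combined = (
    load("csv", csv_data)
    + load("jsonl", jsonl_data)
    + load("text", text_data)
)

for record in combined:
    print(record)
```

Adding another input kind here means registering one more small loader function, which is the sort of low-effort integration the list above asks for. A production tool would also need schema handling, error recovery, and scale that this sketch does not attempt.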
