Big Data Challenges

The challenges of big data include capture, curation, storage, search, sharing, transfer, analysis and visualization. The trend toward larger data sets is driven by the additional information that can be derived from analyzing a single large set of related data, as compared to separate smaller sets with the same total amount of data. This allows correlations to be found to “spot business trends, prevent diseases, combat crime and so on”. A few of the challenges are summarized below:

  • Size of Big Data – Big data is… well… big in size! For a small company that is used to dealing with data in gigabytes, 10 TB of data would be BIG. For companies like Facebook and Yahoo, however, it takes petabytes to count as big. The size of big data makes it impossible (or at least cost prohibitive) to store in traditional storage such as databases or conventional filers, where even the cost of storing gigabytes of data adds up.
  • Big Data is unstructured or semi-structured – A lot of Big Data is unstructured.
    This lack of structure makes relational databases poorly suited to storing Big Data. In addition, not many databases can cope with storing billions of rows of data.
  • Processing this huge volume of data to mine intelligence out of it is also a big challenge.
  • Analyzing such voluminous unstructured data and making predictions from it is also challenging.

A parallel processing framework can address these problems by applying the divide-and-conquer approach. The solution involves dividing the data into smaller sets, which are then processed in parallel. However, it needs a robust storage platform that can scale to a very large degree (and at reasonable cost) as the data grows, and that tolerates system failures. Processing all this data may take thousands of servers, so the price of these systems must be affordable to keep the cost per unit of storage reasonable.
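The divide-and-conquer idea can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: a thread pool on one machine stands in for the many servers of a real cluster, and the function names (split_into_chunks, count_words, parallel_word_count) and the sample log lines are hypothetical, chosen purely for illustration. It does not model scalable storage or recovery from system failure.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Divide: cut the full data set into smaller, independent pieces."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def count_words(chunk):
    """Process one chunk on its own; it needs no data from other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, chunk_size=2, workers=4):
    """Conquer: process the chunks concurrently, then merge the partial results."""
    chunks = split_into_chunks(records, chunk_size)
    total = Counter()
    # Assumption: a local thread pool stands in for a cluster of servers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            total += partial
    return total


if __name__ == "__main__":
    # Hypothetical sample data, standing in for a much larger data set.
    logs = [
        "error disk full",
        "warning disk slow",
        "error network down",
        "info disk ok",
    ]
    print(parallel_word_count(logs).most_common(2))
```

Because each chunk is counted independently, the chunks can be handed to different workers in any order, and only the small partial results need to be brought together at the end. That independence is what lets the same pattern spread across many inexpensive servers.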
