Big Data Sources

The original big data was web data, as in the entire Internet! The scope of big data has since grown beyond niche sources to include sensor and machine data, transactional data, metadata, social network data and consumer-authored information. These days, big data comes from multiple sources:

  • Web data: it is still big data.
  • Click stream data: when users navigate a website, their clicks are logged for further analysis (such as navigation patterns). Click stream data is important in online advertising and e-commerce.
  • Sensor data: sensors embedded in roads to monitor traffic, along with many other applications, generate a large volume of data.
  • Connected devices: smartphones are a great example. When you use a navigation application like Google Maps or Waze, your phone sends pings back reporting its location and speed (this information is used for calculating traffic hotspots). Just imagine hundreds of millions (or even billions) of devices consuming and generating data.
  • Social network profiles or social media data: sites like Facebook, Twitter and LinkedIn generate a large amount of data. User profiles from Facebook, LinkedIn, Yahoo, Google, and specific-interest social or travel sites can be tapped to gather individuals’ profiles and demographic information, and extended to capture their hopefully like-minded networks.
  • Social influencers: editor, analyst and subject-matter expert blog comments, user forums, Twitter and Facebook “likes,” Yelp-style catalog and review sites, and other review-centric sites like Apple’s App Store, Amazon, etc.
  • Activity-generated data: computer and mobile device log files, along with the machine-generated data associated with “The Internet of Things.” This category includes website tracking information, application logs, and sensor data (such as check-ins and other location tracking), among other machine-generated content. Consider also the data generated by the processors found within vehicles, video games, cable boxes or, soon, household appliances.
  • Software as a Service (SaaS) and cloud applications: systems like Salesforce.com, NetSuite, SuccessFactors, etc. all hold data that is already in the cloud but is difficult to move and merge with internal data.
  • Public data: Microsoft Azure Marketplace/DataMarket, the World Bank, SEC/EDGAR, Wikipedia, IMDb, etc. This is data that is publicly available on the Web and may enhance the types of analysis that can be performed.
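To make the click stream item in the list concrete, here is a minimal Python sketch of one kind of navigation-pattern analysis: counting how often visitors move from one page to the next. The log format (session id, timestamp, page), the sample records and the function name `count_transitions` are all assumptions made up for illustration; real click logs are far larger and differ from site to site.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Hypothetical click log: (session_id, timestamp_seconds, page) tuples.
CLICKS = [
    ("s1", 10, "/home"),
    ("s1", 25, "/products"),
    ("s1", 40, "/cart"),
    ("s2", 12, "/home"),
    ("s2", 30, "/products"),
    ("s2", 55, "/home"),
    ("s3", 5, "/home"),
    ("s3", 9, "/products"),
]


def count_transitions(clicks):
    """Count page-to-page transitions within each session."""
    transitions = Counter()
    # Sort by session, then time, so each session's clicks are contiguous and ordered.
    ordered = sorted(clicks, key=itemgetter(0, 1))
    for _, session_clicks in groupby(ordered, key=itemgetter(0)):
        pages = [page for _, _, page in session_clicks]
        # Pair each page with the page visited immediately after it.
        transitions.update(zip(pages, pages[1:]))
    return transitions


if __name__ == "__main__":
    for (src, dst), n in count_transitions(CLICKS).most_common():
        print(f"{src} -> {dst}: {n}")
```

In this sample, the most frequent transition is from /home to /products. At real click stream volumes the same counting idea would typically run on a distributed processing system rather than in a single in-memory script.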