What is Big Data? Understanding the 8 V’s

If you want an effective User and Entity Behavior Analytics (UEBA) solution, you’re going to need to leverage Big Data analytics. Building on the three V’s first described in 2001, Gartner’s Big Data definition refers to “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation”. In other words, Big Data is made up of structured, semi-structured and unstructured data sets. These data sets are difficult to process using traditional database and software techniques because of the 3 V’s mentioned above. The data is simply too big (volume), moves too fast (velocity) or comes in too many different forms (variety). Read on to learn about Big Data analytics, data lakes, data warehouses, UEBA vendors offering open choice of big data, and more!

The 8 V’s of Big Data

First, there were the 3 V’s of Big Data: volume, velocity and variety. Then the list expanded to include three more: veracity, variability and value. Gurucul has since added two more: venue and vector.

Know the 8 V’s of Big Data:

  • Volume – The quantity of generated and stored data. The size of the data determines the value and potential insight and whether it can be considered big data or not.
  • Velocity – The speed at which data is generated and processed to meet real-time availability requirements, along with the demands and challenges that might impede access to it for efficient use and analysis.
  • Variety – The type and nature of the data, both structured and unstructured. A wider range of data gives analysts more critical context to draw on when producing useful insights.
  • Veracity – The quality of raw or refined captured data can vary greatly, affecting accurate analysis.
  • Variability – Inconsistency of the data set can hamper processes to handle and manage it.
  • Value – The benefit the data delivers once big data’s massive volume is brought under comprehensive control.
  • Venue – The scotoma, or blind spots, in a security perimeter that come with separate and unintegrated silos of data; a popular destination for hackers.
  • Vector – The channels by which data flows and is ingested into data lakes and elsewhere, as well as its effectiveness and cost.

Why Do We Need Big Data Analytics?

We’ve had relational database technology since the early 1970s. It can be described as “a collection of data items organized as a set of formally described tables with unique index keys. Data can be accessed or reassembled in different ways without having to reorganize the database tables, often in queries with boolean logic”.
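
To make that definition concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table names, columns and sample rows are hypothetical, invented purely for illustration. It shows two formally described tables with unique keys, and a query that reassembles the data with boolean logic without reorganizing either table.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # Each table is formally described up front and has a unique key.
    conn.execute(
        "CREATE TABLE users ("
        " user_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " department TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE logins ("
        " login_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL REFERENCES users(user_id),"
        " device TEXT NOT NULL,"
        " success INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "alice", "finance"), (2, "bob", "engineering")],
    )
    conn.executemany(
        "INSERT INTO logins VALUES (?, ?, ?, ?)",
        [(1, 1, "laptop", 1), (2, 1, "phone", 0),
         (3, 2, "tablet", 0), (4, 2, "laptop", 1)],
    )
    return conn


def failed_mobile_logins(conn):
    """Join the tables and filter with boolean logic (AND / OR)."""
    return conn.execute(
        "SELECT u.name, l.device"
        " FROM logins AS l JOIN users AS u ON u.user_id = l.user_id"
        " WHERE l.success = 0 AND (l.device = 'phone' OR l.device = 'tablet')"
        " ORDER BY l.login_id"
    ).fetchall()


if __name__ == "__main__":
    print(failed_mobile_logins(build_demo_db()))
```

The point of the sketch is that every row has to fit a schema declared in advance, which works well for tidy, predictable records like these.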

In today’s high-tech, mobile environment, it’s not uncommon for a user to have more than one device that exists outside of an organization’s physical environment. For example, an employee might have a company-provided laptop, a work phone and a tablet that they take home with them at the end of the workday. A reliable UEBA solution must monitor the streams of security activity data and access information from all of them. A relational database wouldn’t be able to keep up with the variety, volume or speed of the data coming in. That’s why we need big data analytics.

Data Lakes vs. Data Warehouses

Do you know the difference? Data lakes are not data warehouses – so, don’t get them confused.

Big Data takes in large amounts of data from multiple sources and pours it all into one big data lake. The information sits unfiltered, unprocessed and unstructured. Your UEBA solution will extract knowledge from it via machine learning to expose predictive patterns and insights.

A data warehouse stores data with everything organized, archived and ordered. It stores only the data that specific business users need for reporting and extraction. Data warehouses have a defined set of data to include and exclude, because data loads into the warehouse only when there is a use for it.

Data lakes store all raw data, even data that probably won’t ever be used. The lack of structure in a data lake makes it easy to configure. Data scientists are the typical users of data lakes, since they have the skills to do in-depth analysis, but the lake is accessible to all users.
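
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event records, the field names and the two-column WAREHOUSE_COLUMNS schema are hypothetical, and JSON lines stand in for raw data in general. The warehouse applies its schema when data is loaded and drops what doesn’t fit; the lake keeps everything and applies structure only when someone asks a question.

```python
import json

# Hypothetical reporting schema: the only fields the warehouse keeps.
WAREHOUSE_COLUMNS = ("user", "action")

# Hypothetical raw events arriving from different sources.
RAW_EVENTS = [
    '{"user": "alice", "action": "login", "device": "laptop"}',
    '{"user": "bob", "action": "file_download", "bytes": 1048576}',
    '{"sensor": "badge-reader-7", "status": "door_open"}',
]


def load_into_warehouse(raw_lines):
    """Structure on load: keep only records that fit, and only needed fields."""
    table = []
    for line in raw_lines:
        record = json.loads(line)
        if all(col in record for col in WAREHOUSE_COLUMNS):
            table.append(tuple(record[col] for col in WAREHOUSE_COLUMNS))
    return table


def load_into_lake(raw_lines):
    """Store every raw record untouched, whether or not it has a known use."""
    return list(raw_lines)


def query_lake(lake, field):
    """Structure on read: parse at query time and pull out one field."""
    values = []
    for line in lake:
        record = json.loads(line)
        if field in record:
            values.append(record[field])
    return values


if __name__ == "__main__":
    warehouse = load_into_warehouse(RAW_EVENTS)
    lake = load_into_lake(RAW_EVENTS)
    print(warehouse)                   # only rows matching the schema
    print(query_lake(lake, "sensor"))  # data the warehouse discarded
```

In this sketch the badge-reader event never reaches the warehouse because nobody planned a report for it, yet it is still available in the lake if it later turns out to matter.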

Choose Gurucul as your UEBA Vendor Offering Open Choice of Big Data

It’s true that not all UEBA vendors are equal. One of the biggest complaints we hear about other UEBA vendors is that they customize the backend of their data lake. So, even if you already own a data lake of the same flavor, you’ll have to purchase theirs too. Is that cost-efficient?

What you want is open choice of big data, but there’s only one UEBA vendor on the market offering it. Gurucul is not reliant on a single big data platform. We know that our customers could change their underlying data layer at any time, so we support any data lake. Additionally, there is no cap on the volume of data Gurucul’s UEBA solution can ingest!

Gurucul UEBA sits right on top of any data lake. If the customer doesn’t have one, Gurucul will give them Hadoop for free.

An effective UEBA solution requires the power of Big Data analytics. Contact us today to get started!

ABOUT THE AUTHOR:

Nilesh Dherange, Chief Technology Officer, Gurucul

Nilesh Dherange is responsible for development and execution of Gurucul’s technology vision. Nilesh brings a wealth of experience in inventing, designing, and building software from inception to release. Nilesh has been a technologist and leader at three startups and at one of the largest software development companies in the world. Prior to founding Gurucul, Nilesh was an integral member of a company that built a Roles and Compliance product acquired by Sun Microsystems. Nilesh was also a co-founder and VP of Engineering for BON Marketing Group, where he conceptualized and created BON Ticker, an innovative patented bid management system which used predictive analytics to determine advertising bids for PPC marketing campaigns on search engines like Google, Yahoo and MSN. Nilesh holds a B.A. in Social Science, a B.E. in Computer Engineering from the University of Mumbai and an M.S. in Computer Science from the University of Southern California.