What is raw data? Definition, examples
We live in the age of machine learning, AI solutions, and digital information. Our digital world is full of raw, unstructured data, and technologies built on that information are available for any marketing goal. According to a Forbes article, 2.5 quintillion bytes of data are generated online every day, and that pace is accelerating with the growth of the Internet. All of this data has to be organized and put to profitable use. Let’s take a look at what raw data is and how to use it effectively with data technologies.
Raw data - definition
Raw data is a set of information that was delivered from a certain data entity to the data provider and has not yet been processed by either a machine or a human. This information is gathered from online sources to deliver deep insight into users’ online behavior. With this information, marketers can create personalized online campaigns and reach target users with an accurate message at the right time.
It is worth noting that raw data on its own, before it is processed by algorithms, isn’t very useful. Usually it is a piece of code, such as a user cookie, which carries little information by itself. Once this data is integrated with the appropriate user profiles, however, it becomes really helpful for marketers and business analysts. The integration takes place on the data provider’s side, e.g., in a Data Management Platform (DMP).
A DMP uses AI algorithms to match raw data with the 3rd party data profiles available on the platform. DMP providers offer different volumes of data profiles, e.g., the OnAudience.com DMP includes over 27 billion user profiles. It is advisable to have data scientists on your staff in order to get the full benefit of raw data.
Composition of raw data
Raw data is the source of information for the Data Stream service that we offer. This service was deployed to deliver data produced by the cross-functional cooperation of integrated marketing systems, such as a Demand Side Platform (DSP), a Supply Side Platform (SSP) and a data provider (DMP). Read more about the opportunities that the Data Stream service can give your company.
Data Stream and raw data itself can be provided in various formats. At OnAudience.com it is available in four formats. Each has its own attributes, depending on the data chosen to be received.
1. Data Point format contains the following attributes:
- number of Data Point occurrences - shows how many events, such as opening a website or clicking a specific link, were generated by users
- user’s last activity
- user’s main country - users who travel can be assigned to several countries; the main country is the one that occurs most often
- last timestamp (in UNIX form) - the time when an event related to a specific Data Point last occurred
- cookie lifetime - the period during which a particular cookie is exchanged between the user’s web browser and the server
2. Segment format shares an encoded user ID and segment IDs. The included segments belong to the client and represent specific characteristics of web page visitors, such as interests or demographic data.
3. Hybrid format combines the two previous formats, but per particular Data Point. This data is more customizable, so it gives more precise information about users, such as a specific set of interests and demographic information.
4. URLs format is a set of information about a particular URL that was visited. The following fields are shared:
- timestamp (in UNIX form)
- userAgent - indicates what type of device was used
- short IP address
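To make the Data Point attributes listed above more concrete, here is a minimal Python sketch of how such a record could be represented. It is an illustration only, not the actual OnAudience.com schema: the class name, the field names, the helper functions, and the choice of days as the cookie lifetime unit are all assumptions, and only a subset of the attributes is modeled.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DataPointRecord:
    """One record in a Data Point style format (field names are hypothetical)."""
    occurrences: int           # number of events counted for this Data Point
    main_country: str          # country observed most often for the user
    last_timestamp: int        # UNIX time, in seconds, of the most recent event
    cookie_lifetime_days: int  # assumed unit: days the cookie has been in use


def pick_main_country(observed_countries):
    """Return the country that occurs most often in the observations."""
    return Counter(observed_countries).most_common(1)[0][0]


def unix_to_utc(unix_seconds):
    """Convert a UNIX timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


# A user seen twice in Poland and once in Germany gets "PL" as the main country.
record = DataPointRecord(
    occurrences=3,
    main_country=pick_main_country(["PL", "DE", "PL"]),
    last_timestamp=1_600_000_000,
    cookie_lifetime_days=30,
)
print(record.main_country, unix_to_utc(record.last_timestamp).isoformat())
```

Converting the UNIX timestamp to a readable date early on tends to make later analysis, such as checking how recently a user was active, easier to follow.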
How can you use raw data?
Raw data can be used in many areas. It is good source material for the planning stage of research, for prediction, and for final testing. The most popular fields are:
- Fraud detection & scoring - raw data can be used as source data for an anti-fraud algorithm. For example, timestamps, the number of cookie occurrences, or an analysis of data points can be used within a scoring system to detect fraud or to make sure that a message receiver is not a bot (so-called Non-Human Traffic).
- Artificial Intelligence - raw data can serve as a training set and a test set when building AI and machine learning algorithms.
- Profiling & personalization - raw data can be used to customize client profiles and divide them into segments, e.g., by gender or location (based on Data Point). The segments are used for precise targeting of online ads and for sending clients personalized messages.
- Business Intelligence - raw data is a source of information for BI systems and helps to enrich user profiles with more detailed information, e.g., purchase path or geodata. This information is good material for business analysis and predictive research.
- Targeting - data processed by data scientists can help to improve online campaigns and reach the target audience.
- CRM Enrichment - data can be integrated with the client’s CRM system. CRM integration makes it possible to fill the gaps in user profiles with demographic data, interests or buying intentions. By enriching their CRM systems, clients get a full view of their customers, which allows them to send highly personalized messages.
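As an illustration of the fraud detection use case above, the sketch below flags a cookie whose events arrive faster than a human could plausibly generate them. It is a deliberately simple heuristic, not a real anti-fraud algorithm: the function names and the threshold of 30 events per minute are assumptions chosen for the example, and a production scoring system would combine many more signals.

```python
def events_per_minute(timestamps):
    """Average event rate for one cookie, from UNIX timestamps in seconds."""
    if len(timestamps) < 2:
        return 0.0
    span_seconds = max(timestamps) - min(timestamps)
    # Treat a burst inside a single second as lasting one second,
    # which also avoids dividing by zero.
    minutes = max(span_seconds, 1) / 60
    return len(timestamps) / minutes


def looks_like_bot(timestamps, max_rate=30.0):
    """Flag a cookie whose event rate exceeds the assumed threshold."""
    return events_per_minute(timestamps) > max_rate


# Four page views spread over about 17 minutes versus 100 events in two seconds.
human = [1_600_000_000, 1_600_000_060, 1_600_000_300, 1_600_001_000]
burst = [1_600_000_000] * 50 + [1_600_000_001] * 50
print(looks_like_bot(human), looks_like_bot(burst))
```

The same timestamps and occurrence counts could feed a weighted score instead of a yes/no flag, which is usually easier to tune against real traffic.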
From raw data to customer segments
You can create segments according to various factors, such as age, interests, gender, marital status or industry. In fact, you can treat raw data as the foundation for those segments. A DMP lets you build segments with unique, custom attributes. This helps to deliver the right message to the right audience and improves brand experience. Read more about customer segmentation and how to use custom segments.
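The idea of grouping users into segments by a chosen attribute can be sketched in a few lines of Python. This is only a simplified illustration, assuming profiles are available as plain dictionaries; the `build_segments` function, the `user_id` key and the sample attributes are hypothetical and do not reflect how any particular DMP stores its data.

```python
from collections import defaultdict


def build_segments(profiles, attribute):
    """Group user IDs by the value of one profile attribute."""
    segments = defaultdict(list)
    for profile in profiles:
        value = profile.get(attribute)
        if value is not None:  # skip profiles that lack the attribute
            segments[value].append(profile["user_id"])
    return dict(segments)


# Hypothetical, already processed user profiles.
profiles = [
    {"user_id": "u1", "gender": "female", "interest": "travel"},
    {"user_id": "u2", "gender": "male", "interest": "travel"},
    {"user_id": "u3", "gender": "female"},
]
by_interest = build_segments(profiles, "interest")
by_gender = build_segments(profiles, "gender")
print(by_interest)
print(by_gender)
```

Profiles that lack the chosen attribute are simply left out of the result, so a segment only ever contains users for whom the attribute is known.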
DMP - how to manage large, chaotic data
A DMP is a platform where all data is integrated. A dedicated pixel, created in the DMP as a data point, is placed on the publisher’s website, and the attributes of each visitor are stored on the platform. This anonymous information forms part of users’ profiles and is later used for creating segments.
AI algorithms built into DMPs map the received data to existing profiles. This makes it possible to analyze the behavior of users who belong to your target audience, grow your customer network and run personalized online campaigns. In other words, a DMP helps you manage all stored data, segment it and use it to precisely reach selected audiences in your campaigns.