What a month this has been for Dell EMC Isilon! Not only are we announcing some pretty powerful innovations, we also won a Technology & Engineering Emmy® Award this month. Isilon and OneFS have made a powerful impact on the media and entertainment industry and we have been able to empower organizations to take control of their unstructured data and drive change. According to Gartner, “By 2024 enterprises will triple their unstructured data stored as file or object storage from what they have in 2019.”* As data continues to grow at this unrelenting pace, it is …
Enter the world of data lakes. Data lakes are repositories that can take in data from multiple sources. Rather than process data for immediate analysis, all received data is stored in its native format. This model allows data lakes to hold massive amounts of data while using minimal resources. Data is only processed upon being called for usage (compared to a data warehouse, which processes all incoming data). This ultimately allows data lakes to be an efficient way for storage, resource management, and data preparation.
But do you actually need a data lake, especially if your big data solution already has a data warehouse? The answer is a resounding yes. In a world where the volume of data transmitted across countless devices continues to increase, a resource-efficient means of accessing data is critical to a successful organization. In fact, here are four specific reasons why the need for a data lake is only going to get more urgent as time goes on.
90% of data has been generated since 2016
90% of all data ever is a lot—or is it? Consider what has become available to people as Wi-Fi, smartphones, and high-speed data networks have entered everyday life over the past twenty years. In the early 2000s, streaming was limited to audio, while broadband internet was used mostly for web surfing, emailing, and downloads. In that paradigm, device data was at a minimum and the actual data consumed was mostly about interpersonal communication, especially because videos and TV hadn’t hit a level of compression that supported high-quality streaming. Towards the end of the decade, smartphones became common and Netflix had shifted its business priority to streaming.
That means between 2010 and 2020, the internet has seen the growth of smartphones (and their apps), social media, streaming services for both audio and video, streaming video game platforms, software delivered through downloads rather than physical media, and so on, all creating exponential consumption of data. As for the part that is the most relevant to business? Consider how many businesses have associated apps constantly transmitting data to and from devices, whether to control appliances, provide instructions and specifications, or quietly transmit user metrics in the background.
With 5G data networks starting to deploy widely in 2019, bandwidth and speeds are only going to improve. As massive and significant as big data has been over the past few years, it will only get bigger as technology makes the world even more connected. Is your data repository ready?
95% of businesses handle unstructured data
In a digital world, businesses collect data from all types of sources, and most of that is unstructured. Consider the data collected by a company that sells services and makes appointments via an app. While some of that data comes structured—that is, in predefined formats and fields such as phone numbers, dates, transaction prices, time stamps, etc.—a company like that still has to archive and store a lot of unstructured data. Unstructured data is any type of data that doesn’t contain an inherent structure or predefined model, which makes it difficult to search, sort, and analyze without further preparation.
For the example above, unstructured data comes in a wide range of formats. For a user making an appointment, any text fields filled out to make that appointment count as unstructured data. Within the company itself, emails and documents are another form of unstructured data. The posts from a company’s social media channel are also unstructured data. Any photos or videos used by employees as notes while performing services are unstructured data. Similarly, any instructional videos or podcasts created by the company as marketing assets are also unstructured.
Unstructured data is everywhere, and as more devices connect to deliver a greater range of information, it becomes clear that organizations need a way to get their proverbial arms around all of it.
4.4 million GB of data are used by Americans every minute
More than 325 million people live in the US. Nearly 70% of them have smartphones. And even if you don’t count the people currently streaming media, consider what is happening on an average smartphone in a minute. It’s receiving an update on the weather. It’s checking for any new emails in the user’s inbox. It’s pushing data to social media, delivering voicemail over Wi-Fi, delivering strategic marketing notifications from apps, such as when a real estate app pushes a new housing listing. It’s sending text and images via chat apps, and downloading app/OS updates in the background.
Data is everywhere now, which means that in the minute that passed while you read the above paragraph, millions of gigabytes of data were transmitted across the country: 4.4 million GB every minute, according to Domo’s Data Never Sleeps report. And that’s just the United States; add the rest of the world and the total volume is far larger still. For businesses, collecting this kind of data is vital to all aspects of operations, from marketing to sales to communication. Thus, every organization must put a premium on safe, available, and accessible storage.
50% of businesses say that big data has changed their sales and marketing
Most people think of big data in terms of the technical aspects. Clearly, a company that works through a phone app or provides a form of streaming uses big data and is delivering a service that simply wasn’t feasible twenty years ago. However, big data is much more than delivery of streaming content. It can create significant improvements in sales and marketing—so much so that according to a McKinsey report, 50% of businesses say that big data is driving them to change their approach in these departments.
What’s the reason for this? With big data, organizations have a much more efficient path to understanding customers than in-person focus groups. Data allows for gathering a mass sample of actions from existing and potential customers. Everything from their website browsing prior to conversion to how long they engaged with certain features of a product or service is available at high volume, which creates a large enough sample size for a reliable customer model. To be in the cutting-edge 50%, an organization needs to have the data infrastructure to receive, store, and retrieve massive amounts of structured and unstructured data for processing.
Basically, you need a data lake
The above statistics all point to one thing—your organization needs a data lake. And if you don’t get ahead of the curve now in terms of managing data, it’s clear that the world will pass you by in all areas: operations, sales, marketing, communications, and other departments. Data is simply a way of life now, enabling precise insight-driven decisions and unparalleled discovery into root causes. When combined with machine learning and artificial intelligence, this data also allows for predictive modeling for future actions.
(Note: Corrected typo from Domo’s Data Never Sleeps citation.)
What is the difference between structured and unstructured data—and should you care? For many businesses and organizations, such distinctions may feel like they belong solely to the IT department dealing with big data. And while there is some truth to that, it’s worthwhile for everyone to understand the difference, because once you grasp the definition of structured data and unstructured data (along with where that data lives and how to process it), it’s possible to see how this can be used to improve any data-driven process.
And these days, nearly any workflow in any department is data-driven.
Sales, marketing, communications, operations, and human resources all produce data. Even the smallest of small businesses (say, a brick-and-mortar store with physical inventory and a local customer base) produces structured and unstructured data from things like email, credit card transactions, inventory purchases, and social media. Taking advantage of this data starts with understanding the two types and how they work together.
What Is Structured Data?
Structured data is data that uses a predefined and expected format. This can come from many different sources, but the common factor is that the fields are fixed, as is the way that it is stored (hence, structured). This predetermined data model enables easy entry, querying, and analysis. Here are two examples to illustrate this point.
First, consider transactional data from an online purchase. In this data, each record will have a timestamp, purchase amount, associated account information (or guest account), item(s) purchased, payment information, and confirmation number. Because each field has a defined purpose, it makes it easy to manually query (the equivalent of hitting CTRL+F on an Excel spreadsheet) and also easy for machine learning algorithms to identify patterns—and in many cases, identify anomalies outside of those patterns.
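To make the idea of fixed fields concrete, here is a minimal Python sketch of a transaction record and a manual query against it. The `Transaction` class, its field names, and the sample values are all hypothetical, chosen only to mirror the fields listed above; a real ecommerce system would define its own.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    """One purchase record; every field has a fixed name and purpose."""
    timestamp: datetime
    amount: float
    account_id: str      # "guest" stands in for a guest checkout
    items: tuple
    confirmation: str


# Hypothetical sample records, invented for illustration.
transactions = [
    Transaction(datetime(2020, 1, 5, 9, 30), 24.99, "acct-001", ("mug",), "C1001"),
    Transaction(datetime(2020, 1, 5, 11, 15), 310.00, "guest", ("desk", "lamp"), "C1002"),
    Transaction(datetime(2020, 1, 6, 14, 2), 12.50, "acct-001", ("notebook",), "C1003"),
]


def find_by_account(records, account_id):
    """Return every record whose account_id field matches exactly."""
    return [r for r in records if r.account_id == account_id]


def total_spent(records):
    """Sum the amount field across the given records, rounded to cents."""
    return round(sum(r.amount for r in records), 2)


matches = find_by_account(transactions, "acct-001")
print([r.confirmation for r in matches])
print(total_spent(matches))
```

Because the account field always means the same thing, the query is a one-line filter. That predictability is what makes structured data easy to search.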
Another example is data coming from a medical device. Even something as simple as a hospital EKG monitor produces structured data that comes down to two key fields: the electrical activity of a person’s heart and the associated timestamp. Those two fields are predefined and would easily fit into a relational or tabular database; machine learning algorithms could easily identify patterns and anomalies with just a few minutes’ worth of records.
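As a rough illustration of how two predefined fields make anomalies easy to spot, the sketch below flags readings that sit far from the median. This is a simple statistical rule standing in for the machine learning mentioned above, not a clinical method; the sample values, the `find_anomalies` name, and the tolerance of 5 are all assumptions made for the example.

```python
from statistics import median

# Hypothetical (seconds_elapsed, reading) pairs, invented for illustration.
readings = [
    (0, 1.00), (1, 1.02), (2, 0.98), (3, 1.01),
    (4, 2.50), (5, 0.99), (6, 1.00),
]


def find_anomalies(samples, tolerance=5.0):
    """Flag samples whose distance from the median exceeds
    `tolerance` times the median absolute deviation (MAD)."""
    values = [value for _, value in samples]
    center = median(values)
    mad = median([abs(value - center) for value in values])
    if mad == 0:
        # No measurable spread, so this rule has nothing to scale against.
        return []
    return [(t, v) for t, v in samples if abs(v - center) > tolerance * mad]


print(find_anomalies(readings))
```

With this sample data, only the reading at the 4-second mark is flagged.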
Despite the vast difference in technical complexity between these examples, both show that structured data boils down to established and expected elements. Timestamps arrive in a defined format; a device won’t (or can’t) transmit a timestamp described in words because that falls outside the structure. A predefined format allows for easy scalability and processing, even when handled manually.
Structured data can be used for anything as long as the source defines the structure. Some of the most common uses in business include CRM forms, online transactions, stock data, corporate network monitoring data, and website forms.
What Is Unstructured Data?
Structured data comes with definition; unstructured data is the opposite. Rather than predefined fields in a purposeful format, unstructured data can come in all shapes and sizes. Though typically text (like an open text field in a form), unstructured data can take many forms and be stored as objects: images, audio, video, document files, and other file formats. What all types of unstructured data share is a lack of definition. Unstructured data is more commonly available (more on that below), and its fields may not have the same character or space limits as structured data. Given the wide range of formats comprising unstructured data, it’s not surprising that this type typically makes up about 80% of an organization’s data.
Let’s look at some examples of unstructured data.
First, a company’s social posts are a specific example of unstructured data. The metrics behind each social media post—likes, shares, views, hashtags, and so on—are structured, in that they are predefined and purposeful for each post. The actual posts, though, are unstructured. The posts archive into a repository, but searching or relating the posts with metrics or other insights requires effort. There is no way of knowing what each post specifically contains without actually examining it, whether it’s customer service or promotion or an organizational news update. Compare that to structured data, where the purpose of fields (e.g., dates, names, geospatial coordinates) is clear.
A second example comes from media files. Something like a podcast has no structure to its content. Searching for the podcast’s MP3 file is not easy by default; metadata such as file name, timestamp, and manually assigned tags may help the search, but the audio file itself lacks context without further analysis or relationships.
Another example comes from video files. Video assets are everywhere these days, from short clips on social media to larger files that show full webinars or discussions. As with podcast MP3 files, content of this data lacks specificity outside of metadata. You simply can’t search for a specific video file based on its actual content in the database.
How Do They Work Together?
In today’s data-driven business world, structured and unstructured data tend to go hand in hand. For most instances, using both is a good way to develop insight. Let’s go back to the example of a company’s social media posts, specifically posts with some form of media attachment. How can an organization develop insights on marketing engagement?
First, use structured data to sort social media posts by highest engagement, then filter out hashtags that aren’t related to marketing (for example, removing any high-engagement posts with a hashtag related to customer service). From there, the related unstructured data can be examined—the actual social media post content—looking at messaging, type of media, tone, and other elements that may give insight as to why the post generated engagement.
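The sort-then-filter step described above could look something like this in Python. The post records, the hashtag list, and the engagement formula (likes plus shares) are hypothetical; a real social platform export would have its own fields and its own definition of engagement.

```python
# Hypothetical post records: structured metrics plus unstructured text.
posts = [
    {"id": 1, "text": "Meet our new spring lineup!", "likes": 120, "shares": 30,
     "hashtags": {"#launch"}},
    {"id": 2, "text": "Sorry for the outage today.", "likes": 300, "shares": 80,
     "hashtags": {"#support"}},
    {"id": 3, "text": "Last day of the sale.", "likes": 90, "shares": 10,
     "hashtags": {"#sale", "#launch"}},
    {"id": 4, "text": "Half off everything this weekend.", "likes": 200, "shares": 15,
     "hashtags": {"#sale"}},
]

# Hashtags treated as unrelated to marketing (an assumption for this example).
NON_MARKETING_TAGS = {"#support", "#help"}


def engagement(post):
    """A simple engagement score: likes plus shares."""
    return post["likes"] + post["shares"]


def top_marketing_posts(all_posts, excluded_tags, limit):
    """Sort by engagement, drop posts carrying an excluded hashtag,
    and return the top `limit` posts for content review."""
    ranked = sorted(all_posts, key=engagement, reverse=True)
    kept = [p for p in ranked if not (p["hashtags"] & excluded_tags)]
    return kept[:limit]


for post in top_marketing_posts(posts, NON_MARKETING_TAGS, limit=2):
    print(post["id"], engagement(post), post["text"])
```

The structured fields do the narrowing; what comes out is a short list of unstructured post text that a person or a text analysis tool can then examine.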
This may sound like a lot of manual labor is involved, and that was true several years ago. However, advances in machine learning and artificial intelligence are enabling levels of automation. For example, if audio files are run through natural-language processing to create speech-to-text output, then the text can be analyzed for keyword patterns or positive/negative messaging. These insights are expedited thanks to cutting-edge tools, which are becoming increasingly important due to the fact that big data is getting bigger and that the majority of that big data is unstructured.
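Once speech-to-text has produced a transcript, the keyword and tone analysis can start out very simply. The sketch below assumes the transcript already exists as a string (the speech-to-text step is not shown) and uses tiny hand-picked word lists. The lists, function names, and sample transcript are illustrative only and far cruder than a real natural-language processing tool.

```python
import re
from collections import Counter

# Tiny illustrative word lists; a real tool would use far richer models.
POSITIVE = {"great", "love", "easy"}
NEGATIVE = {"slow", "broken", "confusing"}
STOPWORDS = {"we", "the", "was", "but", "is"}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def tone_score(text):
    """Count positive words minus negative words."""
    words = tokenize(text)
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def top_keywords(text, n):
    """Return the n most frequent words that are not stopwords."""
    counts = Counter(w for w in tokenize(text) if w not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical transcript, standing in for speech-to-text output.
transcript = "We love the new app. Setup was easy, but sync was slow. The app is great."

print(tone_score(transcript))
print(top_keywords(transcript, 1))
```

A score above zero suggests mostly positive wording, and the keyword count hints at what the audio is about, all without anyone listening to the file.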
Where Data Comes From and Where It Goes
In today’s business world, data comes in from multiple sources. Let’s look at a mid-size company with a standard ecommerce setup. In this case, data likely comes from the following areas:
- Customer transactions
- Customer account data
- Customer feedback forms
- Inventory purchasing
- Logistical tracking
- Social media engagement
- Marketing outreach engagement
- Internal HR data
- Search engine crawling for keywords
- And much more
In fact, the amount of data pulled by any company these days is staggering. You don’t have to be one of the world’s biggest corporations to be part of the big data revolution. But how you handle that data is key to being able to utilize it. The best solution in many cases is a data lake.
Data lakes are repositories that receive both structured and unstructured data. The ability to consolidate multiple data inputs into a single source makes data lakes an essential part of any big data infrastructure. When data goes into a data lake, it is stored raw, with no schema imposed up front, which keeps the lake easily scalable and flexible. When the data is read and processed, it is then given structure and schema as needed, balancing both volume and efficiency.
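The idea of giving data structure only when it is read can be sketched in a few lines of Python. Here a plain list stands in for the lake, and a schema is applied to order payloads at read time. The source names, payloads, and `read_orders` function are hypothetical, and a real data lake would sit on object or file storage rather than an in-memory list.

```python
import json
from datetime import datetime

# The "lake": raw payloads kept as received, tagged only with their source.
raw_store = [
    ("orders.json", '{"order_id": "A1", "amount": "19.99", "ts": "2020-01-05T09:30:00"}'),
    ("feedback.txt", "Delivery was late but support was helpful."),
    ("orders.json", '{"order_id": "A2", "amount": "5.00", "ts": "2020-01-06T10:00:00"}'),
]


def read_orders(store):
    """Apply an order schema at read time; other payloads are left alone."""
    orders = []
    for source, payload in store:
        if source != "orders.json":
            continue
        record = json.loads(payload)
        orders.append({
            "order_id": str(record["order_id"]),
            "amount": float(record["amount"]),
            "ts": datetime.fromisoformat(record["ts"]),
        })
    return orders


orders = read_orders(raw_store)
print(len(orders), sum(order["amount"] for order in orders))
```

Nothing is converted at write time: the feedback text sits untouched next to the order payloads until some other reader needs it.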
Efficiency in storage is key because scalability and flexibility allow for including more data sources and more applications of cutting-edge tools such as machine learning. This means that the foundation for receiving structured and unstructured data needs to be built for the present and the future, and the industry consensus points to moving data to the cloud.
With the rise of the Internet of Things (IoT), Artificial Intelligence (AI), Digital Experiences and Big Data, the role of Unstructured Data has become more pronounced. Unfortunately, many organizations struggle to make sense of this data due to its nature. It tends to proliferate rapidly, is distributed across many systems, and is difficult to identify, and the way it needs to be used creates real-time requirements that were previously less prevalent. Organizations across every vertical (from Biotech, Oil and Gas, and Media & Entertainment to Retail, Manufacturing, and Banking & Financial Services) are learning to take advantage of data to power digital experiences and provide insights across their business.
If storing unstructured data is a challenge for businesses, it could be an opportunity for you—because they need your help to deliver the technology solutions that can help them unlock their data capital.
Data: A New Asset, Akin to Financial, Intellectual and Human Capital
The value of an organization has traditionally been measured in terms of its human capital (the talent of the workforce), intellectual property (patents and knowledge that give a competitive advantage), operations (the superior efficiencies built into the business processes) and infrastructure (physical properties and resources).
Yet, in a world in which digital transformation is determining winners (and losers), unlocking data capital allows businesses to extend the value of their traditional assets while also creating new opportunities and efficiencies that reach much further into their strategic imperatives. Data capital becomes the competitive advantage helping businesses succeed at a rapid pace. That’s why data capital is rapidly becoming an organization’s most valuable asset.
How Can You Help Organizations in this “Digital Gold Rush”?
Much like how Levi’s and Sears clothed and prepared prospectors for the harsh climates and conditions in their quest to strike it rich, so can you play the role of technology advisor and provider. Unlike the Gold Rush, however, this is an area in which your customers are likely to reap rewards. Still, 93% say that there are barriers to their organization becoming a successful digital business by 2030 or beyond.¹
In order to play the role of trusted advisor, you need to help customers understand what is at stake and how they can achieve their goals:
Educate them on the Impact of Data: IDC estimates that more than 50%² of global GDP will be digitized. Whether it’s mastering the logistics of getting coffee beans from Costa Rica to Seattle or finding new revenue streams for Digital Media, organizations will have to transform or risk losing share.
Digital Transformation starts with Data Capital: If organizations are prioritizing projects like IoT, Digital Transformation, or AI but are not spending the same amount of time considering the data platform, they risk limiting themselves. Autonomous vehicles, for instance, generate so much data that data gravity will make it challenging to leverage the cloud, and an organization that hasn’t accounted for this can waste considerable time and money.
Dell EMC Unstructured Data Solutions can help: Isilon and ECS offer best-of-breed technologies uniquely suited to address the needs of your customers’ organizations. By presenting Dell EMC solutions, you can help your customers swiftly and decisively make their digital transformation real. Additionally, by leading with our solutions, you can use deal registration to help improve the likelihood of winning and maximize your opportunity in competitive situations.
Meeting Your Customers’ Needs
Dell EMC delivers the data solutions that help make it easier to transform data into an asset that can power your Digital Experiences.
Dell EMC’s technologies address your data needs by providing:
- Unified Data lake: Cover your entire data footprint from edge, core, to cloud and eliminate silos while supporting existing and cloud-native apps
- Simplicity at scale: Stay in control with a system that scales performance, capacity, and management seamlessly into the Exabyte range
- Extracting value from data: Turn data into an asset by powering digital experiences, streamlining workflows, and gaining real-time insights
To find out more about data capital and how placing it at the center of your digital transformation (DX) strategy can help accelerate your organization’s success, read the IDC White Paper: Unlock the Power of Data Capital: Accelerate DX.
² IDC White Paper: Unlock the Power of Data Capital: Accelerate DX
Shared files, scanned images, photos, videos and audio files – they all constitute unstructured data. They’re a storage issue for individuals, and they’re an even bigger problem for organizations. It’s a problem that’s growing. Over 80% of the data in the datacenter is unstructured data.¹ And, according to IDC, scale-out file-based storage capacity shipments are expected to grow at a Compound Annual Growth Rate of 25.3% from 2016 to 2021.²
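To put that growth rate in perspective, a quick back-of-the-envelope calculation helps. The sketch below assumes five years of compounding between 2016 and 2021; the function name is just for illustration.

```python
def growth_multiple(cagr, years):
    """Total growth factor after compounding `cagr` for `years` years."""
    return (1 + cagr) ** years


# Assumption: 2016 to 2021 is treated as five compounding periods.
multiple = growth_multiple(0.253, 5)
print(f"{multiple:.2f}x")
```

Under that assumption, a 25.3% compound annual rate means annual capacity shipments roughly triple over the period.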
Dell EMC Isilon scale-out NAS storage and ECS storage solutions are designed to tackle the problem, and our event kit provides everything you’ll need to organize an event at which you can articulate their benefits. So, as our campaign theme says, your customers can ‘break through’ their unstructured data burden, no matter what the scale.
Who to Target…
Whether they are new or existing Dell EMC customers, they’re all set to benefit from the latest Isilon and ECS offerings.
Target organizations include leading global enterprises and mid-market businesses across a broad range of industry segments, including EDA, financial services, healthcare, life sciences, media & entertainment, and oil & gas.
… And What You Should Know About our Market Positioning
On Isilon, you’ll want to tell your customers that a Total Economic Impact study by Forrester Consulting in December 2016 shows that Isilon enables customers to lower their storage costs, simplify management, support growth and realize up to a 250% ROI in a three-year period. A single administrator can manage 20X more storage with Isilon, resulting in huge savings.³
On ECS, you can tell them the solutions provide a single, globally accessible repository that scales infinitely for both traditional and next-generation workloads, with up to a 48% lower TCO when compared to public cloud services.¹
In short, Isilon and ECS are great sales propositions. We’ll do all we can to help you articulate them—and convert customer contact into sales success.
What’s Included in the Isilon and ECS Event Kit
The Dell EMC Isilon and ECS event kit contains everything you need to spread the ‘break through’ message.
To raise awareness ahead of the event we’ve provided a brief call guide you can use to steer conversations you have with customers and prospects when you call them with an invitation. There’s also a series of emails—an invitation, a reminder and a registration confirmation—into which you can simply insert your own details before you send them to your customers and prospects.
On the day itself, we’ve provided plenty of material to dress the room, including a poster, a pop-up banner and a flyer. And naturally, we’ve also included a customer-facing presentation containing speaker notes, to which you can add your own branding.
The presentation includes details of the Isilon starter bundle, a low-price and partner-exclusive entry offering that’s simple, flexible, efficient, easily manageable and of course highly scalable. It also explains how customers can download fully-featured ECS Trial Version software at no cost and with no time limit for non-production use—so they can try out the environment for themselves.
After the event, the kit enables you to follow up with a thank-you email for those who came, and a sorry-we-missed-you email for those who didn’t. Both of these emails include links to an e-brochure. This e-brochure is also included in this kit, and contains a summary of Dell EMC Isilon solutions and highlights of the event topics.
It’s worth recalling that, as with the rest of Dell EMC’s extensive storage portfolio, the Future-Proof Loyalty Program applies to Isilon and ECS. It includes a three-year satisfaction guarantee, hardware investment protection, and seamless data migration—all designed for your customers’ peace of mind as they approach their purchase decision.
It’s also worth remembering the outstanding range of incentives available to you as a partner for every Isilon and ECS sales success you achieve.
Find out more
Dell EMC Isilon scale-out NAS storage and ECS storage solutions are great for your customers’ business – and they’re great for yours too. Make the most of our event kit—visit the Digital Marketing Portal.
¹ Analysis of Unstructured Data: Applications of Text Analytics and Sentiment Mining. SAS. Retrieved June 24, 2016.
² IDC Worldwide File- and Object-Based Storage Forecast, 2017-2021, US42280717. 2017.
³ The Total Economic Impact™ Of Dell EMC Isilon Scale-Out NAS: A Forrester Total Economic Impact™ Study Commissioned by Dell EMC, December 2016.