What’s the Connection Between Big Data and AI?

When people talk about big data, are they simply referring to numbers and metrics?

Yes.

And no.

Technically, big data is simply bits and bytes—literally, a massive amount (petabytes or more) of data. But to dismiss big data as mere ones and zeroes misses the point. Big data may physically be a collection of numbers, but when placed in the proper context, those numbers take on a life of their own.

This is particularly true in the realm of artificial intelligence (AI). AI and big data are intrinsically connected; without big data, AI simply couldn’t learn. On Oracle’s Practical Path To AI podcast episode Connecting the Dots Between Big Data and AI, members of the team in charge of Oracle’s Cloud Business Group (CBG) Product Marketing compare the AI learning process to the human experience.

The short version: the human brain ingests countless experiences every moment. Everything taken in by the senses is technically a piece of information or data—a note of music, a word in a book, a drop of rain, and so on. Infant brains learn from the moment they start taking in sensory information, and the more they encounter, the more they are able to assimilate, process, and respond to in new and informed ways.

AI works similarly. The more data an AI model encounters, the more intelligent it can become. Over time, as more and more data passes through the AI model, its outputs become increasingly accurate and useful. In that sense, AI models are trained by big data, just as human brains are trained by the data accumulated through a lifetime of experiences.

And while this may all seem scary at first, there’s a definite public shift toward trusting AI-driven software. Oracle’s CBG team discusses this further on the podcast episode, and it all goes back to the idea of human experiences. In the digital realm, people now have the ability to document, review, rank, and track those experiences. That knowledge becomes data points within big data, which are fed into AI models that begin validating or invalidating the experiences. With a large enough sample size, a determination can be made based on “a power of collective knowledge” that grows and reinforces the network.

However, that doesn’t mean that AI is the authority on everything, even with all the data in the world.

To hear more about this topic—and why human judgment is still a very real and very necessary part of, well, everything—listen to the entire podcast episode Connecting the Dots Between Big Data and AI and be sure to visit Oracle’s Big Data site to stay on top of the latest developments in the field of big data.

Guest author Michael Chen is a senior manager, product marketing with Oracle Analytics.


Check out the 19c New Features Learning Paths

Curious about Oracle Database 19c features? Well, if you liked the 18c learning path series, including the Apply Oracle 18c Database New Features learning path, then you are in luck.

The Database User Assistance group is happy to announce two new learning paths for 19c.

The learning paths provide a set of detailed tutorials that will help you explore new features of the database from your laptop. To get started, download the database here.

Let us know what you think!


Microsoft and Oracle to Interconnect Microsoft Azure and Oracle Cloud


Microsoft Corp. and Oracle Corp. on Wednesday announced a cloud interoperability partnership enabling customers to migrate and run mission-critical enterprise workloads across Microsoft Azure and Oracle Cloud. Enterprises can now seamlessly connect Azure services to Oracle Cloud services. Many enterprises already run their business on a combination of Microsoft and Oracle software and have invested heavily in both companies’ solutions for years. Now, for the first time, organizations can develop and deploy with both Microsoft and Oracle cloud services simultaneously, which enables easier migration of on-premises applications, the use of a broader range of tools, and the ability to take advantage of existing investments across both clouds.

As a result of this expanded partnership, the companies are today making available a new set of capabilities:

  • Connect Azure and Oracle Cloud seamlessly, allowing customers to extend their on-premises data centers to both clouds. This direct interconnect is available starting today in Ashburn (North America) and Azure US East, with plans to expand to additional regions in the future.

  • Unified identity and access management, via a single sign-on experience and automated user provisioning, to manage resources across Azure and Oracle Cloud. Also available today in early preview, Oracle applications can use Azure Active Directory as the identity provider and for conditional access.

  • Supported deployment of custom applications and packaged Oracle applications (JD Edwards EnterpriseOne, E-Business Suite, PeopleSoft, Oracle Retail, Hyperion) on Azure with Oracle databases (RAC, Exadata, Autonomous Database) deployed in Oracle Cloud. The same Oracle applications will also be certified to run on Azure with Oracle databases in Oracle Cloud.

  • A collaborative support model to help IT organizations deploy these new capabilities while enabling them to leverage existing customer support relationships and processes.

  • Oracle Database will continue to be certified to run in Azure on various operating systems, including Windows Server and Oracle Linux.

More information about specific cross-cloud capabilities, use cases, business advantages, and more can be found here: https://blogs.oracle.com/cloud-infrastructure/oracle-microsoft-azure-alliance


Autonomous Database – Now with Spatial Intelligence

We are pleased to announce that Oracle Autonomous Database now comes with spatial intelligence! If you are completely new to Oracle Autonomous Database (where have you been for the last 18 months?), here is a quick recap of the key features:

What is Oracle Autonomous Database?

Oracle Autonomous Database provides a self-driving, self-securing, self-repairing cloud service that eliminates the overhead and human error associated with traditional database administration. Oracle Autonomous Database takes care of configuration, tuning, backup, patching, encryption, scaling, and more. Additional information can be found at https://www.oracle.com/database/autonomous-database.html.

Special Thanks…

This post was prepared by David Lapp, who is part of the Oracle Spatial and Graph product management team and is extremely well known within our spatial and graph community. You can follow David’s posts on the Spatial and Graph blog.

Spatial Features

The core set of Spatial features has been enabled in Oracle Autonomous Database. Highlights of the enabled features include:

  • Native storage and indexing of point/line/polygon geometries
  • Spatial analysis and processing, such as proximity, containment, combining geometries, and distance/area calculations
  • Geofencing to monitor objects entering and exiting areas of interest
  • Linear referencing to analyze events and activities located along linear networks such as roads and utilities

For details on enabled Spatial features, please see the Oracle Autonomous Database documentation.
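
For instance, a simple distance calculation uses SDO_GEOM.SDO_DISTANCE. This is a minimal sketch with two hypothetical longitude/latitude points (SRID 4326); 0.005 is the tolerance, and the result for geodetic data is in meters:

SELECT sdo_geom.sdo_distance(
         sdo_geometry(2001, 4326, sdo_point_type(-122.4194, 37.7749, null), null, null),
         sdo_geometry(2001, 4326, sdo_point_type(-122.2712, 37.8044, null), null, null),
         0.005) AS distance_m
FROM dual;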

Loading Your Spatial Data into Autonomous Database

In Oracle Autonomous Database, data loading is typically performed using either Oracle Data Pump or Oracle/3rd party data integration tools. There are a few different ways to load and configure your spatial data sets (a conversion sketch follows the list):

  • Load existing spatial data
  • Load GeoJSON, WKT, or WKB and convert to Spatial using SQL
  • Load coordinates and convert to Spatial using SQL
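
Here is a minimal sketch of the second approach, assuming a hypothetical staging table SITES_STAGE that has already been loaded with WKT strings; the SDO_GEOMETRY constructor accepts WKT plus an SRID:

-- Hypothetical staging table already loaded with WKT strings
CREATE TABLE sites_stage (
  site_id NUMBER,
  wkt     VARCHAR2(4000)
);

-- Convert the WKT to native SDO_GEOMETRY (SRID 4326 = longitude/latitude)
CREATE TABLE sites AS
SELECT site_id,
       sdo_geometry(wkt, 4326) AS geometry
FROM   sites_stage;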

The files containing your spatial data sets can be located in your on-premises data center or on your desktop computer, but for the fastest data loading performance, Oracle Autonomous Database also supports loading from files stored in Oracle Cloud Infrastructure Object Storage and other cloud file stores. Details can be found here for ATP: https://docs.oracle.com/en/cloud/paas/atp-cloud/atpug/load-data.html and here for ADW: https://docs.oracle.com/en/cloud/paas/autonomous-data-warehouse-cloud/user/load-data.html.

Configuring Your Spatial Data

Routine Spatial data configuration is performed using Oracle SQL Developer GUIs or SQL commands for the following tasks (a SQL sketch follows the list):

  • Insertion of Spatial metadata
  • Creation of Spatial index
  • Validation of Spatial data
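
As a minimal sketch of those three steps in SQL, assuming a hypothetical PERMIT_SITES table with an SDO_GEOMETRY column named GEOMETRY:

-- 1. Insert Spatial metadata for the geometry column
--    (longitude/latitude bounds, 0.005 tolerance, SRID 4326)
INSERT INTO user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
VALUES ('PERMIT_SITES', 'GEOMETRY',
        sdo_dim_array(
          sdo_dim_element('X', -180, 180, 0.005),
          sdo_dim_element('Y',  -90,  90, 0.005)),
        4326);

-- 2. Create the spatial index
CREATE INDEX permit_sites_sidx ON permit_sites (geometry)
  INDEXTYPE IS MDSYS.SPATIAL_INDEX;

-- 3. Validate the data (rows returning anything other than 'TRUE' need attention)
SELECT site_id,
       sdo_geom.validate_geometry_with_context(geometry, 0.005) AS validation
FROM   permit_sites;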

Example Use Case for ATP

For the purposes of this post, let’s focus on an ATP-style use case. OLTP applications commonly require calculations that are invoked upon changes to data to support business functions and enforce business rules. For example:

  • Permitting system transactions require validation that the activity complies with regulations
  • Financial transactions require checks against known patterns of fraud
  • Service transactions require determination of optimal resources

In these and many other transaction processing scenarios, location plays an important role, and such scenarios are supported by the Spatial features of Oracle Autonomous Transaction Processing. Using the permitting scenario as an example, the major steps for location-based transaction validation are:

  • Load geospatial reference data for regulation enforcement, such as environmentally sensitive areas, school zones, and redevelopment zones
  • In the permit transaction process, capture the proposed activity locations
  • Use Spatial to perform location-based validations of proposed activities, for example the proximity of an activity involving hazardous materials to environmentally sensitive areas:

-- Proximity for validation
-- Use a SQL statement with SDO_WITHIN_DISTANCE
-- and a geometry constructor to validate the proximity rule

SELECT DECODE(count(*), 0, 'PASSED', 'FAILED') AS location_validation
FROM environmental_sensitive_areas
WHERE SDO_WITHIN_DISTANCE(
  geometry,
  sdo_geometry(2001, 4326,
    sdo_point_type(permit_longitude, permit_latitude, null),
    null, null),
  'distance=5 unit=MILE') = 'TRUE';

  • Use Spatial to determine notification requirements for a permitted activity. For example, notify public safety jurisdictions within proximity of proposed hazardous materials transport routes:


-- Proximity for notification
-- Use a SQL statement with SDO_WITHIN_DISTANCE
-- and route geometries to determine notifications

SELECT hazmat_route, jurisdiction
FROM jurisdictions, hazmat_routes
WHERE SDO_WITHIN_DISTANCE(
  jurisdictions.geometry,
  hazmat_routes.geometry,
  'distance=5 unit=MILE') = 'TRUE';

ROUTE      JURISDICTION
---------  ------------
2017-03-A  AL_REGION_7
2017-03-A  AL_REGION_9
2017-03-B  AL_REGION_9
2017-04-A  AL_REGION_2
2017-04-B  AL_REGION_3
2017-04-B  AL_REGION_9

These and other location-based transactions may be operationalized as triggers and procedures that invoke more involved business logic (a sketch follows). As a fully integrated feature of Oracle Autonomous Transaction Processing, location-based operations can be seamlessly blended with the mainstream aspects of transaction processing logic.
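
For illustration, here is a minimal sketch of such a trigger, assuming a hypothetical PERMITS table with PERMIT_LONGITUDE and PERMIT_LATITUDE columns and the ENVIRONMENTAL_SENSITIVE_AREAS table from the validation query above:

-- Reject permits for locations within 5 miles of a sensitive area (sketch)
CREATE OR REPLACE TRIGGER permit_location_check
BEFORE INSERT OR UPDATE ON permits
FOR EACH ROW
DECLARE
  v_hits NUMBER;
BEGIN
  SELECT count(*) INTO v_hits
  FROM environmental_sensitive_areas
  WHERE SDO_WITHIN_DISTANCE(
          geometry,
          sdo_geometry(2001, 4326,
            sdo_point_type(:NEW.permit_longitude, :NEW.permit_latitude, null),
            null, null),
          'distance=5 unit=MILE') = 'TRUE';
  IF v_hits > 0 THEN
    RAISE_APPLICATION_ERROR(-20001,
      'Proposed activity is within 5 miles of an environmentally sensitive area');
  END IF;
END;
/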

What about data warehouse use cases?

If you are interested in spatial use cases relating to data warehousing projects then click over to this blog post on the Data Warehouse Insider blog: Autonomous Data Warehouse – Now With Spatial Intelligence

Summary

For important best practices and further details on the use of these and many other Spatial operations, please refer to the Oracle Autonomous Transaction Processing documentation and the Autonomous Data Warehouse documentation.


Linguistic analysis comes to Autonomous Database

We are pleased to announce that Oracle Text indexes have been enabled in Oracle Autonomous Database, which means you can now do linguistic analysis in Autonomous Database! If you are completely new to Oracle Autonomous Database (where have you been for the last 18 months?), here is a quick recap of the key features:

What is Oracle Autonomous Database?

Oracle Autonomous Database provides a self-driving, self-securing, self-repairing cloud service that eliminates the overhead and human error associated with traditional database administration. Oracle Autonomous Database takes care of configuration, tuning, backup, patching, encryption, scaling, and more. Additional information can be found at https://www.oracle.com/database/autonomous-database.html.

Special Thanks…

This post was prepared by Roger Ford, the product manager for Oracle Text, who is extremely well known within the Oracle developer community. You can follow Roger’s posts on Text features and functions on his Oracle Text blog.

Let’s Get Started With Oracle Text…

Developers with a strong Oracle Database background will no doubt be familiar with Oracle Text indexes, but this post is aimed at those with some knowledge of SQL who are new to Oracle Database, as well as experienced Oracle users who need a refresher on this functionality.

What is Oracle Text?

Oracle Text allows you to do full-text searching on textual content in the database. Storing names and addresses? You can search them by last name, street name, or zip code. Storing full-length documents such as PDFs or Excel spreadsheets? You can find them by any word, phrase, or code used in the document. It’s a search engine built into the database, accessible directly through SQL or PL/SQL.

Oracle Text indexes use the database’s extensibility framework, which allows new indextypes to be created. Oracle Text provides three of these:

  • CONTEXT – a general-purpose full-text index
  • CTXCAT – a specialist catalog index
  • CTXRULE – an index on a pre-defined set of queries, run against one document at a time

Additionally, Oracle Text technology is used in the JSONSEARCH index, part of Oracle’s suite of JSON support tools.

In this discussion, we’re going to focus on the CONTEXT and JSONSEARCH indexes.

CONTEXT index

A CONTEXT index is a word-based index. That is, it allows you to find any word (or combination of words) within a document. Now, I should probably define document: although it can be a full-sized document such as a PDF or MS Word file in a BLOB column of the database, it can also be just a simple VARCHAR2 column value.

To create a CONTEXT index, we use the standard CREATE INDEX syntax, with the addition of the phrase ‘INDEXTYPE IS CTXSYS.CONTEXT’ (the CTXSYS schema owns all Oracle Text objects).

So let’s see an example:

CREATE TABLE demo (text VARCHAR2(200));
INSERT INTO demo VALUES ('David Copperfield: A book by Charles Dickens');
CREATE INDEX demoindex ON demo(text) INDEXTYPE IS CTXSYS.CONTEXT;

Now, each Oracle Text indextype has an associated search operator. In the case of a CONTEXT index, that is the CONTAINS function. CONTAINS takes the name of the column to search and a search expression, and returns a value representing whether the row contents match the search expression: zero for no match, or greater than zero for a match. So we can do a search like:

SELECT text FROM demo WHERE CONTAINS (text, 'copperfield') > 0;

Many people think a CONTEXT index is just an indexed version of the LIKE operator. This is not the case. You will note that, unlike a LIKE search, we didn’t need to put wildcards around “copperfield”, nor did we need to worry about case sensitivity. Because it’s a full-word index, a search like:

SELECT text FROM demo WHERE CONTAINS (text, 'copper') > 0;

will NOT succeed: ‘copper’ does not appear in the text as a word. We could have used a trailing wildcard to make the query work:

SELECT text FROM demo WHERE CONTAINS (text, 'copper%') > 0;

But note that the wildcard makes ‘copper’ match the word ‘copperfield’; it is not doing a substring search on the whole string.

Now, what else can we do with a CONTEXT search? Lots. Way too much to cover here. But here are a few examples which will hopefully be self-explanatory:

-- Phrase search
SELECT text FROM demo WHERE CONTAINS (text, 'charles dickens') > 0;
-- AND search
SELECT text FROM demo WHERE CONTAINS (text, 'copperfield AND dickens') > 0;
-- OR search
SELECT text FROM demo WHERE CONTAINS (text, 'copperfield OR (two cities)') > 0;
-- Proximity search: two terms within 20 words of each other
SELECT text FROM demo WHERE CONTAINS (text, 'NEAR((copperfield, dickens),20)') > 0;

It should be noted that CONTEXT indexes are not synchronous: the index is only updated when it is SYNC’d. By default, an index is only SYNC’d by an explicit call to the PL/SQL procedure ctx_ddl.sync_index (a minimal call is sketched below). However, we can make that automatic using the optional PARAMETERS clause of the CREATE INDEX statement. The simplest way is to specify SYNC(ON COMMIT), which means the index is automatically updated when changes are committed:

CREATE INDEX demoindex on demo(text) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('SYNC(ON COMMIT)');
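
For comparison, a manual synchronization of the index created earlier looks like this (using the demoindex name from the examples above):

BEGIN
  ctx_ddl.sync_index('demoindex');
END;
/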

SYNC(ON COMMIT) is fine for low-DML indexes. However, if you have many processes all doing inserts and updates on the indexed table, you may find that the many commits start to block each other. In this case, you might want to use SYNC(EVERY timeperiod) instead, where the time period is specified as a scheduler interval. For example, the following will invoke SYNC every 15 seconds:

CREATE INDEX demoindex on demo(text) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS ('SYNC(EVERY SYSDATE+1/24/60/4)');

There are many other customizations of the text index which are performed through the PARAMETERS clause. Interested readers are directed to the following books:

Text Reference Manual: https://docs.oracle.com/en/database/oracle/oracle-database/18/ccref/index.html
Text Application Developer's Guide: https://docs.oracle.com/en/database/oracle/oracle-database/18/ccapp/index.html

JSON Search index

When dealing with JSON (JavaScript Object Notation) documents, you can choose to index them in different ways. If you know the layout (schema) of all your documents, it’s often efficient to create a function-based index on particular elements of your JSON documents.

For example, if we have a table called j_purchaseorder which has an element PONumber, we might create an index thus:

CREATE UNIQUE INDEX po_num_idx1 ON j_purchaseorder (json_value(po_document, '$.PONumber' RETURNING NUMBER));

But one of the great things about working with JSON is that you don’t need to know the schema in advance. How can we therefore ensure that everything in the document is indexed, even though we don’t know in advance the layout of documents? In that case we can use the JSON search index.

JSON search indexes only work on tables with an ‘IS JSON’ constraint; if you don’t have one, you must add it to your table (a good idea anyway, to prevent rogue, invalid documents from getting into your system).

Although the index is a variation on the CONTEXT indextype, the syntax to create a JSON search index is somewhat simpler: CREATE SEARCH INDEX indexname ON jsontable(jsoncolumn) FOR JSON;

That will automatically add all the name/value pairs found in the document set into the index.

The index will be used automatically, where appropriate, to speed up any JSON searches, such as queries using the JSON_VALUE predicate. But it also has its own search function, quite similar to the CONTAINS clause used for CONTEXT indexes. For example, you could call:

SELECT id FROM table WHERE JSON_TEXTCONTAINS ( jsoncolumn, '$', 'dickens AND copperfield')

(note there’s no ‘> 0’ needed at the end here, unlike CONTAINS).

The second argument of JSON_TEXTCONTAINS is a path specification. Here we’re searching the whole document, but we could restrict it to a specific part of the document if we chose.

Let’s look at a full worked example:

-- create the table, not forgetting the IS JSON constraint
create table jsontab (
  id      number,
  jsoncol varchar2(200),
  constraint colIsJson check (jsoncol is json));
-- insert some test data
insert into jsontab values (1, '{ booktitle: "David Copperfield", bookauthor: "Charles Dickens" }');
-- create our Search index
create search index jsonindex on jsontab (jsoncol) for json;
-- search for Dickens and Copperfield anywhere in the doc
select * from jsontab where json_textcontains(jsoncol, '$', 'dickens AND copperfield');
-- search only bookauthor values for Dickens
select * from jsontab where json_textcontains(jsoncol, '$.bookauthor', 'dickens');

JSON_TEXTCONTAINS allows most of the syntax used in the Context CONTAINS clause, and has similar characteristics with regard to case sensitivity, word splitting, and so on.

Another thing that a JSON search index gives us is the DATAGUIDE: a description of the layout of our JSON documents, which can be really useful if you don’t know exactly what’s been inserted into your table. For the simple example above, we can get the dataguide using get_index_dataguide in the dbms_json package:

select dbms_json.get_index_dataguide(
         'jsontab', 'jsoncol',
         dbms_json.format_hierarchical,
         dbms_json.pretty)
from dual;

This gives the output:

{ "type" : "object", "properties" : { "booktitle" : { "type" : "string", "o:length" : 32, "o:preferred_column_name" : "JSONCOL$booktitle" }, "bookauthor" : { "type" : "string", "o:length" : 16, "o:preferred_column_name" : "JSONCOL$bookauthor" } }}

So we can see the layout of our documents, including information about the maximum length of any values in the table.
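
One practical use of the dataguide is to project those discovered fields as relational columns. Here is a minimal sketch using DBMS_JSON.CREATE_VIEW from the same package (the view name JSONTAB_VIEW is hypothetical):

-- Create a relational view over the JSON data using the hierarchical dataguide
BEGIN
  DBMS_JSON.CREATE_VIEW(
    viewname  => 'JSONTAB_VIEW',
    tablename => 'JSONTAB',
    jcolname  => 'JSONCOL',
    dataguide => DBMS_JSON.GET_INDEX_DATAGUIDE(
                   'JSONTAB', 'JSONCOL', DBMS_JSON.FORMAT_HIERARCHICAL));
END;
/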

What’s Next?

Over on the Oracle Text blog, Roger has posted a series of articles on getting started with Oracle Text.

Summary

As with CONTEXT indexes, we’ve only touched the surface of JSON search capabilities. For more information, the reader is encouraged to look at the Oracle Database JSON Developer's Guide: https://docs.oracle.com/en/database/oracle/oracle-database/18/adjsn/


A Few Things You Should Know About What It Takes To Be a Leader

17,600 minutes. Or 293 hours. That is the average amount of time Americans spend driving their cars each year (AAA Foundation report). Most of us rely heavily on cars for transportation and might not think twice about the significant functions they provide until we are without one or don’t have access to one. Imagine having to build your own fully functional car that is reliable, comfortable, and safe in order to have one. Where would you start? Would it be feasible? Would you trust the outcome? Without the proper resources, materials, research, and knowledge, the end result would most likely be questionable and unreliable.

Similarly, it is very difficult to run a business without the right technology and resources, especially in today’s digital economy, where the abundance of data is affecting companies big and small across all vertical markets. Data is doubling every two years; according to Forbes, 2.5 quintillion bytes of data are created each day at our current pace, and that pace is only accelerating. Businesses use all this digital data to make important business decisions. How many specialized experts would a company need to manage millions of lines and columns? A whole army might not be enough…

In 2017, during Oracle OpenWorld, Larry Ellison turned the data management world upside down when he announced the first and only self-driving autonomous database. This announcement was ground-breaking and revolutionary. Now, Oracle offers a fully autonomous database that uses machine learning to automatically tune, secure, patch, and upgrade itself, resulting in dramatically lower costs, fewer errors, higher security, and higher reliability. The tedious tasks of managing millions of lines and columns are now taken care of.

The fully autonomous database performs all of the mundane manual tasks behind the scenes and offers 100% security and availability of your data. Because of this, your time and focus can be spent on driving your business forward and making smarter decisions faster, rather than worrying about the behind-the-scenes work. The result? Unprecedented availability, high performance, and security, all at a much lower cost. The world’s first autonomous database is:

  • Self-Driving: Provides continuous adaptive performance tuning based on machine learning.

  • Self-Securing: Automatically upgrades and patches itself while running. Automatically applies security updates while running to protect against cyber attacks.

  • Self-Repairing: Provides automated protection from downtime. The SLA guarantees 99.995 percent reliability and availability, which reduces costly planned and unplanned downtime to less than 30 minutes a year.

Oracle continues to be a market and technology leader in data management, delivering leading-edge innovations for both IT and business. Gartner named Oracle a Leader in the 2018 Gartner Magic Quadrant for Operational Database Management Systems.

This marked the thirteenth time that Gartner has positioned Oracle as a Leader, owing to its progressive endeavors to constantly and consistently bring innovative database technologies to market. Additionally, Oracle scored highest in all four categories of the 2018 Gartner Critical Capabilities for Operational Database Management Systems report: traditional transactions (4.4/5), distributed variable data (4.2/5), event processing/data in motion (4.25/5), and operational and analytics convergence (4.33/5).

Fast forward to 2019: Oracle technology for data warehousing, data science, and data lakes placed furthest in both completeness of vision and ability to execute in the Gartner 2019 Magic Quadrant for Data Management Solutions for Analytics. Oracle has been named a Leader 13 times in a row.

These successes, of course, did not just happen overnight; they are the culmination of decades of research, investment, and engineering work, much like the progression of the car.


Oracle Database 19c, the latest generation of the world’s most popular database, is now available. This database cloud service is designed to support mixed workloads through any deployment strategy, on-premises and in the cloud.

So, what does it take to be a leader? It takes the willingness to constantly search out new ways to stay at the forefront of innovation, to know and deeply understand customers’ problems, and to deliver elegant, simple, yet effective solutions. Oracle has set a model to inspire IT leaders to reach their full potential and drive their businesses along their desired paths to success. Oracle Autonomous Database is revolutionizing how data is managed and analyzed, enabling faster, easier data access and helping to unlock the potential of your data so you can transform your business with innovation.

Explore premium analyst content and learn what makes Oracle Database the industry leader.

Disclaimer:

Gartner, Magic Quadrant for Data Management Solutions for Analytics, Adam Ronthal, Roxane Edjlali, Rick Greenwald, 21 January 2019

Gartner, Critical Capabilities for Operational Database Management Systems, Donald Feinberg, Merv Adrian, Nick Heudecker, 23 October 2018


This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from https://go.oracle.com/LP=80823?elqCampaignId=202976

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.


Why Oracle CEO Mark Hurd Thinks You Should Hire Big Data Experts

There are nearly 15,000 petabytes of data on the internet right now (there may be more than 15,000 by the time you read this article), and that data deluge grows by 70 terabytes every second.

To put that in perspective, streaming the latest episode of Game of Thrones probably required about three gigabytes of data. Trying to consume all the data currently stored online would be the equivalent of five billion people downloading the latest Game of Thrones episode at the same time.

But that’s not how online data really works.

One episode of a TV show is a tiny drop in the vast ocean of online data. Much of that data—those 70 new terabytes created every second—is generated by and for businesses as they go about their day-to-day work.

Somehow, all of this data needs to be understood.

During an age of less internet connectivity and fewer people online, it was possible for talented data analysts to make sense of the information flooding into their systems on their own. But today, as Oracle CEO Mark Hurd has said, “Whether you’re looking at information on employees, customers, or whatever it may be, the amount of data that companies now have is beyond the ability for even the most sophisticated data scientists to take advantage of.”

That’s where big data comes into play. Big data talent is more important than ever to modern enterprises. The best big data experts now have access to advanced artificial intelligence and machine learning technologies that can help make sense of the data deluge on a real-time basis. These capabilities allow big data experts to move beyond super scaled number crunching, letting them deploy their intelligence, creativity, and perception to find actionable benefits among the billions of bits and bytes flowing through their employers’ systems.

During his Oracle OpenWorld 2018 keynote, Hurd highlighted the impossibility of manual analysis at scale and went on to point out that this impossibility is “not true of machine learning.” Further, Hurd noted, “The opportunity to turn all that data into knowledge…[into] information that helps you sell more, [or into] information that helps you save more—AI will affect both.”

This ability to transform torrents of data into actionable business strategies can apply to many of a business’s core operational functions, including human capital management. Hurd made this connection at Oracle OpenWorld 2018 as well, noting, “35 percent of a recruiter’s day is spent sourcing and processing candidates…the ability to know whether a GPA matters, whether a major matters, whether your extracurricular activities in school matter…it’s very difficult to harness all of that data information. Not true when AI is applied.”

Recruiting departments can put big data experts to work for their cause. An analyst working for an enterprise with thousands of employees could save their employer millions of dollars by employing these technologies. By using Oracle HCM Cloud to gather and analyze data throughout the recruiting process, businesses can reduce turnover and improve the quality of hires.

Big data is more than a buzzword. Top talent can achieve measurable results for a wide range of businesses, including manufacturing, healthcare, and retail. You can see many benefits of big data on Oracle’s Big Data Use Cases page. You can also see real results big data experts achieved while working with Oracle Big Data Cloud on the Success Stories database. Big data experts have utilized Oracle Big Data Cloud to (among other things):

Does your organization have big data experts? If it doesn’t, finding them should be a matter of when, and not if. Having talent who can turn your organization’s flood of information into real operational results can make a huge difference in business performance and on the bottom line.


Migrate from Amazon Redshift to Oracle Autonomous Data Warehouse in 7 easy steps.

In this blog, I plan to give you a quick overview of how you can use the SQL Developer Amazon Redshift Migration Assistant to help you migrate your existing Amazon Redshift data warehouse to Oracle Autonomous Data Warehouse (ADW).

But first, why the need to migrate to Autonomous Data Warehouse?

Data-driven organizations differentiate themselves through analytics, furthering their competitive advantage by extracting value from all their data sources. Today’s digital world is creating data at an explosive rate, and the physical data warehouses that were once great for collecting data from across the enterprise for analysis can no longer keep pace with the storage and compute resources needed to support them. In addition, the manual, cumbersome tasks of patching, upgrading, and securing these environments pose significant risks to businesses.

A few cloud vendors serve this market. One of them is Amazon Redshift, a fully managed data warehouse cloud service built on technology licensed from ParAccel. Though it was an early entrant, its query processing architecture severely limits concurrency, making it unsuitable for large data warehouses or web-scale data analytics. Redshift is only available in fixed blocks of hardware configurations, so compute cannot be scaled independently of storage. This leads to excess capacity, making customers pay for more than they use. Additionally, resizing puts the cluster in a read-only state and may require downtime, which can take hours while data is redistributed.

Oracle Autonomous Data Warehouse is a fully managed database, tuned and optimized for data warehouse workloads, that supports both structured and unstructured data. It automatically and continuously patches, tunes, backs up, and upgrades itself with virtually no downtime. Integrated machine-learning algorithms drive automatic caching, adaptive indexing, advanced compression, and optimized cloud data loading, delivering unrivaled performance and allowing you to quickly extract data insights and make critical decisions in real time. With little human intervention, the product virtually eliminates human error, with dramatic implications not only for minimizing security breaches and outages but also for cost. Autonomous Data Warehouse is built on the latest Oracle Database software and technology that runs your existing on-premises marts, data warehouses, and applications, making it compatible with all your existing data warehouse, data integration, and BI tools.

Strategize your Data Warehouse Migration

Here is a proposed workflow for either an on-demand migration from Amazon Redshift or the generation of scripts for a scheduled manual migration that can be run at a later time.

1. Establish Connections: Establish connections to both Amazon Redshift (source) and Oracle Autonomous Data Warehouse (target) using the SQL Developer Migration Assistant.

Download SQL Developer 18.3 or a later version. It is a client application that can be installed on a workstation or laptop running Windows or Mac OS X. For the purposes of this blog, we will run it on Microsoft Windows. Also download the Amazon Redshift JDBC driver, which is needed to access the Amazon Redshift environment.

Open the SQL Developer application and add the Redshift JDBC driver as a third-party driver (Tools > Preferences > Database > Third Party JDBC Drivers).

Add a connection to the Amazon Redshift database: in the Connections panel, create a new connection, select the Amazon Redshift tab, and enter the connection information for Amazon Redshift.

Tip:

  • If you are planning to migrate multiple schemas it is recommended to connect with the master username to your Amazon Redshift instance.
  • If you deployed your Amazon Redshift environment within a Virtual Private Cloud (VPC), you have to ensure that your cluster is accessible from the Internet; here are the details on how to enable public Internet access.
  • If your Amazon Redshift client connection to the database appears to hang or time out when running long queries, here are the details with possible solutions to address this issue.

Add a connection to Oracle Autonomous Data Warehouse: in the Connections panel, create a new connection, select the Oracle tab, and enter the connection information along with the wallet details. If you haven’t provisioned Autonomous Data Warehouse yet, please do so now. Here are quick, easy steps to get you started. You can even start with a free trial account.

Test the connections for both Redshift and Autonomous Data Warehouse before you save them.

2. Capture / Map Schema: From the Tools menu of SQL Developer, start the Cloud Migration Wizard to capture metadata schemas and tables from the source database (Amazon Redshift).

First, connect to AWS Redshift from the connection profile and identify the schemas to be migrated. All objects, mainly tables, in each schema will be migrated. You have the option to migrate data as well. Migration to Autonomous Data Warehouse is done on a per-schema basis, and schemas cannot be renamed as part of the migration.

Note: When you migrate data, you have to provide the AWS access key, AWS Secret Access Key, and an existing S3 bucket URI where the Redshift data will be uploaded to and staged. The security credentials require privileges to store data in S3. If possible, create new, separate access keys for the migration. The same access keys will be used later to load the data into the Autonomous Data Warehouse using secure REST requests.

For example, if you provide the URI https://s3-us-west-2.amazonaws.com/my_bucket, the Migration Assistant will create the folders oracle_schema_name/oracle_table_name inside the bucket my_bucket:

"https://s3-us-west-2.amazonaws.com/my_bucket/oracle_schema_name/oracle_table_name/*.gz"

Redshift datatypes are mapped to Oracle datatypes, and Redshift object names are converted to Oracle names based on Oracle naming conventions. Column defaults that use Redshift functions are replaced by their Oracle equivalents.
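
As a purely illustrative sketch (hypothetical table, not actual tool output), a Redshift table like this:

CREATE TABLE sales (
  sale_id  INTEGER,
  amount   DECIMAL(10,2),
  sold_at  TIMESTAMP
);

might be converted to an Oracle equivalent along these lines, with Redshift's INTEGER and DECIMAL mapped to Oracle NUMBER:

CREATE TABLE sales (
  sale_id  NUMBER(10),
  amount   NUMBER(10,2),
  sold_at  TIMESTAMP
);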

3. Generate Schema: Connect to Autonomous Data Warehouse from the connection profile. Ensure the user has administrative privileges, as this connection is used throughout the migration to create schemas and objects. Provide a password for the migration repository that will be created in the Autonomous Data Warehouse (you can choose to remove this repository post-migration). Specify a directory on the local system to store the generated scripts necessary for the migration. To start the migration right away, choose ‘Migrate Now’.

Use ‘Advanced Settings’ to control the formatting options, the number of parallel threads to enable when loading data, and the reject limit (number of rows to reject before erroring out) during the migration.

Review the summary and click ‘Finish’. If you have chosen an immediate migration, the wizard stays open until the migration is finished. Otherwise, the migration process generates the necessary scripts in the specified local directory and does not run them.

If you chose to just generate the migration scripts in the local directory, then continue with the following steps (sketches of the kinds of statements these scripts contain appear after the list):

  4. Stage Data: Connect to the Amazon Redshift environment and run redshift_s3unload.sql to unload data from the Redshift tables into Amazon S3 (staging), using the access credentials and the S3 bucket that were specified in the migration wizard workflow.
  5. Deploy Target Schema: Connect to Autonomous Data Warehouse as a privileged user (for example, ADMIN) and run adwc_ddl.sql to deploy the generated schemas and DDL converted from Amazon Redshift.
  6. Copy Data: While connected to Autonomous Data Warehouse, run adwc_dataload.sql, which contains all the load commands necessary to load data straight from S3 into your Autonomous Data Warehouse.
  7. Review Migration Results: The migration task creates 3 files in the local directory: MigrationResults.log, readme.txt, and redshift_migration_reportxxx.txt. Each of them has information on the status of the migration.
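
For a sense of what these scripts contain, here are minimal sketches under stated assumptions (a table SALES in schema MY_SCHEMA, the bucket my_bucket, and a hypothetical credential name; the actual generated scripts will differ). The staging script issues Redshift UNLOAD statements like:

UNLOAD ('SELECT * FROM my_schema.sales')
TO 's3://my_bucket/MY_SCHEMA/SALES/'
CREDENTIALS 'aws_access_key_id=...;aws_secret_access_key=...'
GZIP DELIMITER ',';

and the data load script uses the DBMS_CLOUD package in Autonomous Data Warehouse, roughly along these lines:

BEGIN
  -- Register the AWS access keys (hypothetical credential name)
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 'REDSHIFT_MIG_CRED',
    username        => 'your AWS access key ID',
    password        => 'your AWS secret access key');

  -- Copy the staged, gzipped CSV files from S3 into the target table
  DBMS_CLOUD.COPY_DATA(
    table_name      => 'SALES',
    credential_name => 'REDSHIFT_MIG_CRED',
    file_uri_list   => 'https://s3-us-west-2.amazonaws.com/my_bucket/MY_SCHEMA/SALES/*.gz',
    format          => json_object('type' value 'csv'));
END;
/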

Run a few test queries to make sure all your data has migrated from Amazon Redshift. Oracle Autonomous Data Warehouse supports connections from various client applications; connect and test them.

Conclusion

With greater flexibility, lower infrastructure cost, and lower operations overhead, there’s a lot to love about Oracle Autonomous Data Warehouse. The unique value of Oracle comes from its complete cloud portfolio with intelligence infused at every layer, spanning infrastructure services, platform services, and applications. For Oracle, the autonomous enterprise goes beyond mere automation, in which machines respond to an action with an automated reaction; instead, it is based on applied machine learning, making it completely autonomous, eliminating human error, and delivering unprecedented performance, high security, and reliability in the cloud.
