GA of Oracle Database 20c Preview Release

The latest annual release of the world’s most popular database, Oracle Database 20c, is now available for preview on Oracle Cloud (Database Cloud Service Virtual Machine).

As with every new release, Oracle Database 20c introduces key new features and enhancements that further extend Oracle’s multi-model converged architecture, including Native Blockchain Tables, performance enhancements such as Automatic In-Memory (AIM), and a binary JSON data type. For a quick introduction, watch Oracle EVP Andy Mendelsohn discuss Oracle Database 20c during his most recent OpenWorld keynote.

For the complete list of new features in Oracle Database 20c, please refer to the new features guide in the latest documentation set. To learn more about some of the key new features and enhancements in Oracle Database 20c, check out the accompanying blog posts.

For availability of Oracle Database 20c on all other platforms on-premises (including Exadata) and in Oracle Cloud, please refer to My Oracle Support (MOS) note 742060.1.



What Is Oracle Cloud Infrastructure Data Catalog?

And What Can You Do with It?

Simply put, Oracle Cloud Infrastructure Data Catalog helps organizations manage their data by creating an organized inventory of data assets. It uses metadata to create a single, all-encompassing and searchable view to provide deeper visibility into your data assets across Oracle Cloud and beyond. This video provides a quick overview of the service.

This helps data professionals such as analysts, data scientists, and data stewards discover and assess data for analytics and data science projects. It also supports data governance by helping users find, understand, and track their cloud data assets and on-premises data as well—and it’s included with your Oracle Cloud Infrastructure subscription.

Never miss an update about big data! Subscribe to the Big Data Blog to receive the latest posts straight to your inbox!

Why Does Oracle Cloud Infrastructure Data Catalog Matter?

Hint: It has to do with self-service data discovery and governance.

Oracle Cloud Infrastructure Data Catalog matters because it’s a foundational part of the modern data platform—a platform where all of your data stores can act as one, and you can view and access that data easily, whether it resides in Oracle Cloud, object storage, an on-premises database, a big data system, or a self-driving database.

This means that data users—data scientists, data analysts, data engineers, and data stewards—can all find data across systems and the enterprise more easily because a data catalog provides a centralized, collaborative environment to encourage exploration. Now these key players can trust their data because they gain technical as well as business context around it. It means they don’t have to have SQL access, or understand what object storage is, or figure out the complexities of Hadoop—they can get started faster with their single unified view through their data catalog. It’s no longer necessary to have five different people with five different skillsets just to find where the right data resides.

Easy data discovery is now possible.

And of course, it’s not just data discovery that’s easier. Governance is also easier—and that is a key benefit with GDPR and ever more complex compliance requirements in today’s world of multiple enterprise systems, with on-premises, cloud, and multi-cloud environments.

With Oracle Cloud Infrastructure Data Catalog, you have better visibility into all of your assets, and business context is available in the form of a business glossary and user annotations. And of course, understanding the data you have is essential for governance.

How Does Oracle Cloud Infrastructure Data Catalog Work?

Oracle Cloud Infrastructure Data Catalog takes metadata—technical, business, and operational—from various data sources, users, and assets, and harvests it to turn it into a data catalog: a single collaborative solution for data professionals to collect, organize, find, access, enrich, and activate metadata to support self-service data discovery and governance for trusted data assets across Oracle Cloud.

And what’s so important about this metadata? Metadata is the key to Oracle Cloud Infrastructure Data Catalog. There are three types of metadata that are relevant and key to how our data catalog works:

  • Technical metadata: Describes the storage and structure of the data in a database or system
  • Business metadata: Contributed by users as annotations or business context
  • Operational metadata: Created from the processing and accessing of data, which indicates data freshness and data usage, and connects everything together in a meaningful way
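The three metadata types can be pictured as fields on a single catalog entry. Here is a minimal, hypothetical sketch in Python; the class and field names are illustrative, not the service’s actual data model:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class CatalogEntry:
    """One harvested data asset, carrying all three metadata types."""
    # Technical metadata: storage and structure of the data
    source: str
    columns: dict
    # Business metadata: user-contributed context
    business_terms: list = field(default_factory=list)
    annotations: list = field(default_factory=list)
    # Operational metadata: processing and access history
    last_harvested: Optional[datetime] = None
    access_count: int = 0

entry = CatalogEntry(
    source="object-storage://sales/orders.csv",   # hypothetical asset
    columns={"order_id": "NUMBER", "amount": "NUMBER"},
)
entry.business_terms.append("Revenue")   # a steward links a glossary term
entry.access_count += 1                  # an operational usage signal
```

The point of the sketch is that one entry carries all three kinds of metadata at once, which is what lets a single search serve both technical and business users.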

You can harvest this metadata from a variety of sources, including:

    • Oracle Cloud Infrastructure Object Storage
    • Oracle Database
    • Oracle Autonomous Transaction Processing
    • Oracle Autonomous Data Warehouse
    • Oracle MySQL Cloud Service
    • Hive
    • Kafka

And the supported file types for Oracle Cloud Infrastructure Object Storage include:

    • CSV, Excel
    • ORC, Avro, Parquet
    • JSON

Once the technical metadata is harvested, subject matter experts and data users can contribute business metadata in the form of annotations to the technical metadata. By organizing all this metadata and providing a holistic view into it, Oracle Cloud Infrastructure Data Catalog helps data users find the data they need, discover information on available data, and gain information about the trustworthiness of data for different uses.

How Can You Use a Data Catalog?

Metadata Enrichment

Oracle Cloud Infrastructure Data Catalog enables users to collaboratively enrich technical information with business context to capture and share tribal knowledge. You can tag or link data entities and attributes to business terms to provide a more all-inclusive view as you begin to gather data assets for analysis and data science projects. These enrichments also help with classification, search, and data discovery.

Business Glossaries

One of the first steps towards effective data governance is establishing a common understanding of business concepts across the organization, and establishing their relationships to the data assets in the organization. Oracle Cloud Infrastructure Data Catalog makes it possible to see associations and linkages between glossary terms and other technical terms, assets, and artifacts. This helps increase user trust because users understand the relationships and what they’re looking at.

Oracle Cloud Infrastructure Data Catalog makes this possible by including capabilities to collaboratively define business terms in rich text form, categorize them appropriately, and build a hierarchy to organize this vocabulary. You can also create parent-child relationships between various terms to build a taxonomy, or set business term owners and approval status so that users know who can answer their questions regarding specific terms. Once created, users can then link these terms to technical assets to provide business meaning and use them for searching as well.

Searchable Data Asset Inventory

By organizing all this metadata and providing a more complete view into it, Oracle Cloud Infrastructure Data Catalog helps users find the data they need, discover information on available data, and gain information about the trustworthiness of data for different uses.

Being able to search across data stores makes finding the right data so much easier. With Oracle Cloud Infrastructure Data Catalog, you have a powerful, searchable, standardized inventory of the available data sources, entities, and attributes. You can enter technical information, defined tags, or business terms to easily pull up the right data entities and assets. You can also use filtering options to discover relevant datasets, or browse metadata based on the technical hierarchy of data assets, entities, and attributes. These features make it easier to get started with data science, analytics, and data engineering projects.
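Conceptually, that search behaves like a filter over a standardized inventory. A stdlib-only Python sketch of the idea, with hypothetical entries (the real service indexes harvested metadata at scale):

```python
# Hypothetical in-memory inventory of data entities with tags and terms.
inventory = [
    {"name": "orders", "tags": ["sales"], "terms": ["Revenue"]},
    {"name": "customers", "tags": ["crm"], "terms": ["Customer"]},
    {"name": "clickstream", "tags": ["web", "sales"], "terms": []},
]

def search(query):
    """Match a query against technical names, defined tags, or business terms."""
    q = query.lower()
    def hit(entry):
        candidates = [entry["name"]] + entry["tags"] + entry["terms"]
        return any(q in c.lower() for c in candidates)
    return [entry["name"] for entry in inventory if hit(entry)]

print(search("sales"))    # matched by tag
print(search("Revenue"))  # matched by business term
```

A single query can thus pull up assets by technical name, tag, or glossary term, which is the behavior described above.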

Data Catalog API and SDK

Many of Oracle Cloud Infrastructure Data Catalog’s capabilities are also available as public REST APIs to enable integrations such as:

  • Searching and displaying results in applications that use the data assets
  • Looking up definitions of defined business terms in the business glossary and displaying them in reporting applications
  • Invoking job execution to harvest metadata as needed

Available search capabilities include:

  • Search data based on technical names, business terms, or tags
  • View details of various objects
  • Browse Oracle Cloud Infrastructure Data Catalog based on data assets

The single collaborative environment includes:

  • Homepage with helpful shortcuts and operational stats
  • Search and browse
  • Quick actions to manage data assets, glossaries, jobs, and schedules
  • Popular tags and recently updated objects


Oracle Cloud Infrastructure Data Catalog is the underlying foundation to data management that you’ve been waiting for—and it’s included with your Oracle Cloud Infrastructure subscription. Now, data professionals can use technical, business, and operational metadata to support self-service data discovery and governance for data assets in Oracle Cloud and beyond.

Leverage your data in new ways, and more easily than you ever could before. Try Oracle Cloud Infrastructure Data Catalog today and start discovering the value of your data. And don’t forget to subscribe to the Big Data Blog for the latest on Big Data straight to your inbox!



What Is Oracle Cloud Infrastructure Data Science?

And how does it work?

Incredible things can be done with data science, and more appear in the news every day—but there are still many barriers to success. These barriers range from a lack of proper support for data scientists to challenges around operationalizing and maintaining models in production.

That is why we created Oracle Cloud Infrastructure Data Science. Built on Oracle’s 2018 acquisition of DataScience.com, Oracle Cloud Infrastructure Data Science was designed with the goal of making data science collaborative, scalable, and powerful for every enterprise on Oracle Cloud Infrastructure. This short video gives an overview of the power of Oracle Cloud Infrastructure Data Science.

Oracle Cloud Infrastructure Data Science was created with the data scientist in mind—and it’s uniquely suited for data science success because of its support for team-based activity. When it comes to data science success, teams must collaborate at each step of the model lifecycle: from building models all the way through to deployment and beyond.

Oracle Cloud Infrastructure Data Science helps make all of that possible.

Never miss an update about data science! Introducing Oracle Data Science on Twitter — follow @OracleDataSci today for the latest updates!

What Is Oracle Cloud Infrastructure Data Science?

Oracle Cloud Infrastructure Data Science makes data science more structured and more efficient by offering:

Access to data and open-source tools

We are data-source agnostic. Your data can be on Autonomous Data Warehouse, on Object Storage, in MongoDB, or even in an Elasticsearch instance on Azure or AWS Redshift. It doesn’t matter to us where the data is; we just care about giving you access to your data to get things done.

With Oracle Cloud Infrastructure Data Science, you can use the best of open source, including:

  • Tools and languages like Python and JupyterLab
  • Visualization libraries like Plotly and Matplotlib
  • Machine learning libraries like TensorFlow, Keras, scikit-learn, and XGBoost
  • Version control with Git
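A notebook session can run any standard workflow built on this open-source stack; nothing below is service-specific, it is simply the kind of code the environment is built to host:

```python
# A toy end-to-end pass: load data, split it, train, and score a model.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = LogisticRegression(max_iter=500)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.2f}")
```

Because the session is just JupyterLab plus curated libraries, existing scikit-learn, TensorFlow, or XGBoost code carries over unchanged.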

Ability to utilize compute on demand

We’ll give you the client connectors you need to access your data and a configurable volume to store that data in your notebook compute environment.

But of course, it doesn’t stop there. You can also select the amount of compute you need to train your model on Oracle Cloud Infrastructure. For now, you can choose small to large CPU virtual machines. And in the near future, we’re planning to add GPUs.

Collaborative workflow

We make a big deal out of teamwork, because we believe that data science can’t truly be successful unless there’s an emphasis on making those teams efficient and successful. We’ve done everything we can to make this possible.

Data scientists can work in “projects” where it’s easy to see what’s happening with a high-level view. Data scientists can share and reuse data science assets and test their colleagues’ models.

Model deployment

Model deployment is usually challenging, but it’s made easier with Oracle Functions on Oracle Cloud Infrastructure. Create a machine learning model function that can be invoked from any application. It’s one of many possible deployment targets, and it’s fully managed, highly scalable, and on-demand.

What Makes Oracle Cloud Infrastructure Data Science Different?

With the growing popularity of data science and machine learning, products that claim to help are a dime a dozen. So, what makes Oracle Cloud Infrastructure Data Science different?

This isn’t an analytics tool with some machine learning capabilities embedded within it. Nor is it an app that offers AI capabilities across different products.

Oracle Cloud Infrastructure Data Science is a platform built for the modern, expert data scientist. And it was built by data scientists who were seeking a platform that would help them perform their complex work better. It’s not a drag-and-drop interface. This is meant for data scientists who write code in Python and need something with real power to enable real data science.

Oracle Cloud Infrastructure Data Science is right for you if you:

  • Have a team and see the benefits of centralized work
  • Prefer Python to drag-and-drop interfaces
  • Want to take advantage of the benefits of Oracle Cloud, with easy access to your data

Oracle Cloud Infrastructure Data Science is also right for you if you need:

  • The ability to train large models on large amounts of data with minimal infrastructure expertise
  • A system to evaluate and monitor models throughout their lifecycle
  • Improved productivity through automation and streamlined workflows
  • Capabilities to deploy models for varying use cases
  • Ability to collaborate with team members in an enterprise organization
  • A seamless, integrated Oracle Cloud Infrastructure user experience

How Does Oracle Cloud Infrastructure Data Science Work?

Oracle Cloud Infrastructure Data Science has:

Projects to centralize, organize, and document a team’s work. These projects describe the purpose of the work and allow users to organize notebook sessions and models.

Notebook Sessions for Python analyses and model development. Users can easily launch Oracle Cloud Infrastructure compute, storage, and networking for Python data science workloads. These sessions provide easy access to JupyterLab and other curated open-source machine-learning libraries for building and training models.

In addition, these notebook sessions come loaded with tutorials and example use cases to make getting started easier than ever.

Accelerated Data Science (ADS) SDK to make common data science tasks faster, easier, and less error-prone. This is a Python library that offers capabilities for data exploration and manipulation, model explanation and interpretation, and AutoML for automated model training.

Model Catalog to enable model auditability and reproducibility. You can track model metadata (including the creator, created date, name, and provenance), save model artifacts in service-managed object storage, and load models into notebook sessions for testing.
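A hypothetical sketch of what a catalog records per model: audit metadata alongside a serialized artifact. The real catalog persists artifacts in service-managed object storage, and the `provenance` value here is an invented placeholder:

```python
import pickle
from datetime import datetime, timezone

def save_to_catalog(model, name, creator, catalog):
    """Record audit metadata next to the serialized model artifact."""
    catalog[name] = {
        "creator": creator,
        "created": datetime.now(timezone.utc).isoformat(),
        "provenance": "notebook-session-1",   # invented placeholder
        "artifact": pickle.dumps(model),
    }

def load_from_catalog(name, catalog):
    """Rehydrate a saved model, e.g. into a notebook session for testing."""
    return pickle.loads(catalog[name]["artifact"])

catalog = {}
save_to_catalog({"weights": [0.3, 0.7]}, "churn-model", "alice", catalog)
restored = load_from_catalog("churn-model", catalog)
print(restored)   # -> {'weights': [0.3, 0.7]}
```

Keeping creator, date, and provenance beside the artifact is what makes any saved model auditable and reproducible later.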

How Does Oracle Cloud Infrastructure Data Science Help with Model Management?

The process of building a machine learning model is an iterative one, and it’s one that essentially never ends. Let’s walk through how Oracle Cloud Infrastructure Data Science makes it easier to manage models throughout every step of the lifecycle.

Building a Model

Oracle Cloud Infrastructure Data Science’s JupyterLab environment offers a variety of open-source libraries for building machine learning models. It also includes the Accelerated Data Science (ADS) SDK, which provides APIs on data ingestion, data profiling and visualization, automated feature engineering, automated machine learning, model evaluation, and model interpretation. It’s everything that’s needed in a unified Python SDK, accomplishing in a few lines of code what a data scientist would typically do in hundreds of lines of code.

Training a Model

Data scientists can automate model training through the ADS AutoML API. ADS can help data scientists find the best data transformations for datasets. After the model evaluation shows that the model is ready for production, the model can be made accessible to anybody who needs to use it.
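At its core, automated training is a search over candidate configurations, keeping the one that validates best. A stdlib-only sketch of that loop with a stand-in scoring function (ADS AutoML does far more, such as selecting data transformations):

```python
from itertools import product

def validate(config):
    # Stand-in scoring function: in practice this would train a model with
    # `config` and return a held-out validation score.
    depth, lr = config
    return 1.0 - abs(depth - 5) * 0.05 - abs(lr - 0.1)

grid = list(product([3, 5, 7], [0.01, 0.1, 0.5]))   # candidate configurations
best = max(grid, key=validate)
print(best)   # -> (5, 0.1)
```

The value of automating this loop is that the search over depths, learning rates, and transformations runs without a human babysitting each trial.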

Evaluating a Model

ADS also helps with model evaluation to ensure that your model is accurate and reliable. What percent accuracy can you achieve with the model? How can you make it more accurate? You want to feel confident in your model before you start to deploy it.
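Evaluation ultimately boils down to comparing predictions against held-out labels. A minimal, stdlib-only sketch of that comparison:

```python
from collections import Counter

def evaluate(y_true, y_pred):
    """Return accuracy plus (actual, predicted) pair counts."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    confusion = Counter(zip(y_true, y_pred))
    return accuracy, confusion

y_true = [1, 0, 1, 1, 0, 1]   # held-out labels
y_pred = [1, 0, 0, 1, 0, 1]   # model predictions
acc, conf = evaluate(y_true, y_pred)
print(round(acc, 3))    # -> 0.833
print(conf[(1, 0)])     # -> 1 (one positive was missed)
```

Breaking accuracy down by (actual, predicted) pairs shows not just how often the model is wrong, but in which direction, which informs whether it is ready to deploy.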

Explaining a Model

Model explainability is becoming an increasingly important part of machine learning and data science. Can your model give you more information about why it’s making the decisions it’s reaching? European regulations increasingly enshrine a right to know: GDPR, for example, states that the data subject has a right to an explanation of a decision reached by a model.
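One common model-agnostic explanation technique is permutation importance: shuffle one feature’s values and measure how much accuracy drops. A stdlib-only sketch on a toy model (ADS offers richer explainers; this only illustrates the intuition):

```python
import random

def permutation_importance(predict, X, y, feature, trials=20, seed=0):
    """Average accuracy drop when one feature's column is shuffled."""
    rng = random.Random(seed)
    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)
    base = accuracy(X)
    drops = []
    for _ in range(trials):
        col = [row[feature] for row in X]
        rng.shuffle(col)
        shuffled = [row[:feature] + [v] + row[feature + 1:]
                    for row, v in zip(X, col)]
        drops.append(base - accuracy(shuffled))
    return sum(drops) / trials

# Toy model: the decision depends only on feature 0.
predict = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.7], [0.1, 0.3]]
y = [1, 0, 1, 0]
print(permutation_importance(predict, X, y, feature=0))  # large drop
print(permutation_importance(predict, X, y, feature=1))  # no drop: irrelevant
```

A feature whose shuffling costs accuracy is one the model actually relies on; that is the kind of evidence a right-to-explanation request needs.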

Deploying a Model

Taking a trained machine learning model and getting it into the right systems is often a difficult and laborious process. But Oracle Cloud Infrastructure enables teams to operationalize models as scalable and secure APIs. Data scientists can load their model from the model catalog, deploy it using Oracle Functions, and secure the model endpoint with Oracle API Gateway. Then, the model’s REST API can be called from any application.
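A real Oracle Functions handler is written against the `fdk` library, but the scoring logic such a function wraps can be sketched with the standard library alone; the model and feature names below are invented for illustration:

```python
import json

# Toy "loaded" model; in practice this would come from the model catalog.
MODEL = {"intercept": -1.0, "weights": {"amount": 0.01}}

def score(payload: bytes) -> bytes:
    """Parse a JSON request, apply the model, return a JSON response:
    the per-request work a deployed function performs."""
    features = json.loads(payload)
    z = MODEL["intercept"] + sum(
        MODEL["weights"].get(k, 0.0) * v for k, v in features.items())
    return json.dumps({"fraud": z > 0}).encode()

print(score(b'{"amount": 250}'))   # -> b'{"fraud": true}'
```

Once wrapped in a function and fronted by API Gateway, any application can POST a JSON payload like the one above and receive a prediction back.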

Model Monitoring

Unfortunately, deploying a model isn’t the end of it. Models must be monitored after deployment to maintain good health. The data a model was trained on may no longer be relevant for future predictions after a while. For example, in the case of fraud detection, fraudsters may come up with new ways to defraud the system, and the model will no longer be as accurate. Oracle Cloud Infrastructure Data Science is working to provide data scientists with tools to easily track how a model continues to perform while it’s deployed, so that it becomes easier to monitor model accuracy over time.


Oracle Cloud Infrastructure Data Science is an enterprise-grade service in which teams of data scientists can collaborate to solve business problems and leverage the latest and greatest in Oracle Cloud Infrastructure to build, train, and deploy their models in the cloud.

It is part of Oracle’s data and AI platform, which makes it simple to integrate and manage your data and use the power of data science and machine learning for more business results.

With Oracle Cloud Infrastructure Data Science, it’s easier than ever before for data scientists to get started, work with the tools and libraries that they want, and gain streamlined access to all data in Oracle Cloud Infrastructure and beyond. For more information, see this overview video and don’t forget to subscribe to the Oracle Big Data blog to get the latest posts sent to your inbox.



On-Premises Autonomous Database

As Autonomous Database services have launched over the past two years, I’ve often been asked by customers, “will the Autonomous Database features become available to on-premises deployments?”

The answer is yes, but with a large caveat. Actually, the question itself is slightly misguided and made me realize that some fundamentals seem to be missing from the Autonomous Database dialog.

To clarify, we must first consider that there is a big difference between a database and a database service. The Autonomous Database is a service; it’s much more than a database.

A Database is software technology: a bundled set of binaries licensed and installed by a customer onto a predetermined set of infrastructure (compute, storage, and network). Once installed, the customer is responsible for all aspects of the database’s operation, including optimal workload configuration, the availability and expansion of physical resources such as storage for data, taking backups of the database for regulatory and recovery purposes, keeping the database available in the face of hardware- and software-level failures, updating the software to get new features or patch software flaws, addressing security concerns like protecting access to the data, and so on.

On the other hand, a Database Service is not only database software technology but a set of integrated capabilities designed to use automation to enhance the customer experience of using a database in an end-to-end solution. A Database Service transfers many of the responsibilities involved in operating the database to the service provider. One obvious example: not every customer must install a database (or even create one, for that matter); rather, the service provider has automated access so new databases can be available for use in minutes instead of hours or even days. Another example: a customer no longer must download and install software maintenance updates; rather, updates can simply be scheduled or initiated on demand, and the service provider’s automation takes care of all the dirty work to get the update completed. Database Services provide capabilities beyond database operations; they can include solution components like identity management, resource governance, and operational notification services, just to name a few. Also, at Oracle, the Database Services have additional data management capabilities built in, such as deep monitoring, data modeling tools, data visualization, and low-code development environments: an array of additional value that surrounds “the database”. This is in essence what customers of Oracle’s Co-Managed Database Services have when using Oracle Database Cloud Service (DBCS) or Exadata Cloud Service.

Now, the question is: what makes an Autonomous Database a Database Service that is self-driving, self-securing, and self-repairing? The answer is that an Autonomous Database includes an additional artificial intelligence (A.I.) software layer that leverages machine learning algorithms and decision making to harden software automation, so it’s proactive rather than reactive, and it becomes software rather than people operating the database.

Autonomous Database Service - more than a Database

Humans are especially good at abstracting: for example, coming up with a system design for dynamic process monitoring and automation. However, unlike a computer, humans are not good at scanning large quantities of data while looking for divergent patterns; for example, humans are not good at deep packet inspection of flowing zeros and ones in a network routing algorithm. The A.I. layer is a special-purpose computer process that efficiently examines large quantities of data in a way that is impossible for humans. The A.I. layer compares what it sees to patterns (commonly referred to as models) in that data, and is then capable of making fast decisions based on past patterns of successful operation as well as anti-patterns: data and patterns that lead to failure. Using an A.I. layer makes it possible to minimize the number of humans in the operating loop, so there is a faster, more repeatable, and more reliable response to emerging conditions.

The data and patterns being examined by the A.I. layers in an Autonomous Database go well beyond the database operating logs. The data and patterns extend to every aspect of the service including the operating system, hypervisor, compute and storage level metrics, network logs, logs of supporting processes as well as the logs of adjunct functions like modeling and low code development tools.

Even further, a decision made by the A.I. layer may involve some rather complex activities such as the decision to decommission a compute server, put another one in place to handle the old server’s activities, move the software processes to the new server, update networking layers to incorporate the new server in service call routing, etc. These things are possible because the Autonomous Database service is operating in the Cloud where there is an effectively unlimited amount of infrastructure available and an API enabling a software defined infrastructure setup. It is possible to spin up new resources on demand to proactively mitigate any approaching failure condition.

It is also important to note that the data and patterns of importance can be highly influenced by the underlying components of the service, whether that’s a specific version of some software library or a specific vendor and model of storage device, memory card, network switch, and so on. This is how real machine learning works: it needs a sample data set from a larger (bigger is better) system set to train the models, and if the system set changes, the models can become invalid.

So, let’s now revisit the question, “will the Autonomous Database features become available to on-premises deployments”?

From a traditional database deployment perspective, hopefully it’s clear now: Autonomous Database is a service, and it is much more than a database. Much of the additional value that comes with the service is not available in any on-premises form factor. There is no on-premises notion of a self-service software-defined infrastructure, a centralized logging service, a resource governance and operational notification service, nor any complete set of highly available tooling that supports the capabilities that surround the database. Finally, the system as a whole, its tooling, and the A.I. layer depend on both the accessibility of a virtually unlimited set of infrastructure that can be called upon on demand and a set of machine learning patterns (models) that depend on a data set specific to the configuration running inside the Oracle Cloud, from the supporting software libraries to the specific vendor hardware running in our Infrastructure as a Service layer.

Given all of this, the answer to our question for a traditional database deployment is necessarily, No. The large caveat is that Autonomous Database will not become available for traditional on-premises deployments.

So, why then is the answer given at the start of the blog, yes? Well, just when you thought it was all finally understood, let’s talk about non-traditional database deployments. There is a version of Autonomous Database coming to what is called an Oracle Gen 2 Exadata Cloud at Customer service deployment. This is a representative slice of the Oracle Cloud that a customer can host inside their data center, inside their network, and behind their own firewall. The Oracle Gen 2 Exadata Cloud at Customer has a lightweight design that allows the Autonomous Database A.I. layer and all of the supporting adjunct service capabilities to live and run in the Oracle Cloud, while using the A.I. and Autonomous Database automation to operate the database while it runs on your premises. Only in this specialized extension of the Oracle Cloud to your data center will it be possible to have an Autonomous Database on premises. Keep an eye out for a future blog with more details on this exciting Autonomous Database deployment option.



5 ways to get an Oracle Database

Do you want to get your hands on an Oracle Database but don’t know how? Here are 5 ways to get you going:

Do you just want to type some awesome SQL and need a database to do so? Then LiveSQL is your friend. LiveSQL is a browser-based SQL scratchpad that not only allows you to pull off some SQL magic but also lets you save and share your scripts with others. It also comes with a comprehensive library of tutorials and samples. LiveSQL is the best place for anybody who is completely unfamiliar with Oracle Database and wants to get going.

If you want to have an Oracle Database on your machine instead, but don’t want to worry about setup and configuration, the Oracle-provided Docker images are a good choice. All you need to do is install Docker on your machine (Mac or Windows) and build an image once from Oracle’s Docker GitHub repo; Docker will take care of the rest. From then on, all you have to remember is:

docker run --name oracle -p 1521:1521 oracle/database:19.3.0-ee


docker start oracle

Docker is great for running one or many instances and versions of an Oracle Database on your machine without having to know how to operate (start/stop/setup) them. What you end up with is still a full-fledged Oracle Database.

If you want to have an Oracle Database on your machine, but you prefer to run it inside a Virtual Machine, then the Oracle-provided Vagrant scripts will do a great job. HashiCorp’s Vagrant is a great tool for provisioning repeatable VM environments, including VirtualBox VMs. For this scenario, you will need to install Oracle’s VirtualBox and HashiCorp’s Vagrant on your machine first. Once you have done that, provision a VM via the scripts from the Oracle Vagrant Boxes GitHub repo and let Vagrant take care of the rest. All you have to remember is:

vagrant up


vagrant ssh

The Vagrant box is great if you want a scripted and repeatable way of creating a VirtualBox VM that contains an Oracle Database. You can also provision multiple VMs with different versions of the Oracle Database. The VM comes with port forwarding enabled by default, which means that you are able to connect any of your tools from your host directly, say SQL Developer for example, to the database inside the VM and treat the VM like a little embedded server.

If you like the VM approach but don’t want or need the repeatable nature of Vagrant, then the Oracle Database Application Development VM is the right choice for you. Simply download the .ova file, import it into VirtualBox and start the VM. The VM will boot into a graphical Linux desktop.

The Oracle Database App Dev VM comes with tools like SQL Developer and Oracle REST Data Services preinstalled, which makes it a great self-contained, one-stop-shop VM. It too has port forwarding enabled by default, in case you want to connect your tools from your host directly. Another bonus of the App Dev VM is that it also includes some hands-on labs that you can go through.

If you want an Oracle Database but not on your laptop, then you should check out the Oracle Cloud Free Tier which includes an Always Free Oracle Autonomous Database. Once you have signed up for the free tier and provisioned your Always Free Autonomous Database, you can head over to SQL Developer Web and get going.

The Always Free Oracle Autonomous Database is great if you want the latest and greatest that Oracle has to offer in terms of cloud databases. SQL Developer Web and APEX come out of the box, and you can connect any other app or IDE from anywhere around the world, as long as it has access to the internet. And the best part: as long as you use the database, it stays with you forever!

Now, what are you waiting for? Get yourself an Oracle Database!



Autonomous Database – Dedicated : Operational Notifications

In a previous blog on Autonomous Operations Policies, I detailed a bit about how Autonomous Database Dedicated Infrastructure deployments differ from Shared Infrastructure and illustrated how with Dedicated you can setup a policy to govern a development-test, pre-production, production software update lifecycle.

I had said this next blog would discuss how to monitor Autonomous Database – Dedicated Exadata Infrastructure operations. The objective is that a group of users can be asynchronously informed about maintenance activities, including when a new update operation is being scheduled, reminders when scheduled updates are going to occur, and when software updates are beginning and ending, along with a status so you know all is good. This is done by using a combination of Oracle Events and Notifications. It is, of course, extremely important for any business to optimize its own operations and streamline its response to any disruption in business.

As it turns out, since I wrote the first part of my Autonomous Database Dedicated Infrastructure blog series on operational controls, Todd Sharp, a colleague of mine, has written an excellent blog post that gives a general overview of Oracle Events and Notifications. Todd's blog includes how to configure a Notification Topic Subscription. So, rather than providing a step-by-step guide here again, I am going to refer you to Todd's blog and focus here on the details of what notifications and events are available specifically for Autonomous Dedicated deployments.

Oracle Cloud service Resources, which are API endpoints in Oracle Cloud, all generate Events about their activities. Those service Resource Events can be monitored using the Oracle Notifications service. Recall there are three key service Resources for Autonomous – Dedicated: autonomous-exadata-infrastructures (AEI), autonomous-container-databases (ACD), and autonomous-databases (ADB).

Oracle Notifications uses a publish-and-subscribe communications model. The idea is to create a Topic of interest to which relevant service Resource Events are published, and for which interested users create Subscriptions to be notified of each event by their chosen protocol, e.g., HTTP, email, or PagerDuty.

For example, one might create a Topic like MaintenanceActivities and then any service resource generating events related to maintenance can be configured to publish their events to that Topic. Users who want to monitor maintenance activities across all resources that are involved in maintenance can create a Subscription to the MaintenanceActivities Topic.

Topics are service Resources that are part of Oracle Notifications; they are defined by you and can represent an aggregation of Events that makes sense for how your organization is set up to monitor service operations. If you are a small company, you might create a single Topic like ServiceActivities and direct all Events to it, with perhaps one person subscribed to that Topic who gets all notices about all service activities. In larger companies where responsibilities are segmented, you might create a range of Topics like Compliance, Security, Administration, and Billing, target specific subsets of service events to each, and have different groups of people monitoring each Topic. A single Event can be sent to multiple Topics if it makes sense for more than one group to be aware of a specific kind of activity.
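As a rough illustration of that publish-and-subscribe fan-out, here is a sketch in plain Python. This is not the actual Notifications API; the topic name, event fields, and delivery callbacks are all invented for illustration.

```python
# Minimal sketch of the publish/subscribe model: every Subscription
# to a Topic receives every Event published to that Topic.
class Topic:
    def __init__(self, name):
        self.name = name
        self.subscriptions = []

    def subscribe(self, deliver):
        # 'deliver' stands in for a protocol endpoint (email, HTTP, PagerDuty).
        self.subscriptions.append(deliver)

    def publish(self, event):
        # Fan the event out to every subscriber of this Topic.
        for deliver in self.subscriptions:
            deliver(event)

maintenance = Topic("MaintenanceActivities")
inbox = []
maintenance.subscribe(inbox.append)
maintenance.publish({"eventType": "maintenance.scheduled", "resource": "ACD-1"})
print(inbox)
```

The real service adds durability, retries, and per-protocol delivery, but the fan-out shape is the same: one published Event, N notified Subscriptions.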

The obvious question becomes, what service Resource events are available for Autonomous Database? The current set of events for Autonomous Database dedicated deployments include:

Autonomous Exadata Infrastructure – Create(begin/end), Maintenance(scheduled/reminder/begin/end), Terminate(begin/end) … a total of 8 event types.

Autonomous Container Database – Compartment(change), Backup(begin/end), Create(begin/end), Restart(begin/end), Maintenance(scheduled/reminder/begin/end), Restore(begin/end), Terminate(begin/end), Update(begin/end) … a total of 19 event types.

Autonomous Database – Change Compartment(begin/end), Create(begin/end), Create Backup(begin/end), Generate Wallet, Restore(begin/end), Start(begin/end), Stop(begin/end), Terminate(begin/end) … a total of 14 event types.

Your first step toward monitoring maintenance activities would be to create a Topic for the maintenance-related events; let's say we create a Topic called SoftwareUpdateCompliance. Keep in mind that service Resources are specific to a given Compartment, so your Topic is created as a Resource in the same Compartment where the Events will be generated. In Todd's blog, you learned how to use Oracle Notifications to create the notification Topic. The details page of such a Topic would look as follows:

Of course, now that you have a Topic, you need at least one Subscription so that when events arrive, they will be directed to someone paying attention. You learned how to do that in Todd's blog, so I will not repeat it here, but ideally you will have created a Subscription that targets a protocol such as PagerDuty, as shown below, so your operations team can get asynchronous notifications.

You now need to set up maintenance-related events to be published to that Topic. Recall from Todd's blog that in the Events service, you create Event Rules that aggregate related events; when a Rule fires, it triggers an Action. An Action can be directed toward Oracle Functions, Streams, or Notifications. When an Action targets a Notification Topic, all Subscriptions to the Topic get notice that the Event has happened, with details about the Event.

To monitor maintenance activities for Autonomous Databases, you create a Rule for all maintenance-related events, then assign all possible events that have been defined for Maintenance across the Autonomous Database service Resources. You learned how to create an Event Rule in Todd's blog; below you will see that you need to create an aggregation of Event Types for the Database service that includes the Event Types Autonomous Exadata Infrastructure – Maintenance(scheduled/reminder/begin/end) and Autonomous Container Database – Maintenance(scheduled/reminder/begin/end).
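To make the aggregation concrete, here is a hedged sketch of what such a rule condition could look like as JSON. The exact eventType strings below are assumptions patterned after the backup event shown later in this post; check the Events console for the authoritative names for your resources.

```python
import json

# Build a rule condition that matches all 8 maintenance event types
# across the two resource types. The eventType strings are illustrative,
# modeled on the "com.oraclecloud.databaseservice..." naming convention.
phases = ("scheduled", "reminder", "begin", "end")
maintenance_event_types = [
    f"com.oraclecloud.databaseservice.autonomous.exadata.infrastructure.maintenance.{p}"
    for p in phases
] + [
    f"com.oraclecloud.databaseservice.autonomous.container.database.maintenance.{p}"
    for p in phases
]

# An Events rule condition matching any of these event types.
condition = json.dumps({"eventType": maintenance_event_types})
print(condition)
```

The point is simply that one Rule can aggregate many event types, so a single Action (and therefore a single Topic) covers maintenance across both resource types.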

Make sure your Rule's Action is set up for Notifications.

Rule Target setting to Notifications

Choose the Compartment where events will be generated, which is where you've created your Topic of interest.

Rule to Compartment Setting

Select the Topic that was just created, in this case SoftwareUpdateCompliance.

Setting Rule Topic

After clicking Create Rule, you will be taken to a Details page where you can test it. Maintenance is not easily triggered, as it's a scheduled activity, so it's important to run a test and make sure you see the event show up in the Slack channel associated with your Topic Subscription.

Because Maintenance cannot be directly triggered, below is an example where an operations-automation channel got an event of “eventType” : “com.oraclecloud.databaseservice.autonomous.database.backup.begin”. Today these events come in a raw JSON format; in the future you will have the option to request a human-readable format, but for now, getting it in JSON can be useful for further API automation.
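Since the payload is raw JSON, a few lines of code can route on the eventType. In the sketch below, only the eventType string is taken from the example above; the rest of the envelope is invented for illustration.

```python
import json

# A trimmed-down event envelope; real events carry more fields.
raw = (
    '{"eventType": "com.oraclecloud.databaseservice.autonomous.database.backup.begin",'
    ' "eventTime": "2020-02-01T00:00:00Z"}'
)

event = json.loads(raw)
# The last two dot-separated segments identify the operation and its phase,
# which is handy for routing in automation.
*_, operation, phase = event["eventType"].split(".")
print(operation, phase)  # backup begin
```

A webhook receiver could use exactly this split to decide, for instance, whether to open a ticket on `maintenance.begin` or close one on `maintenance.end`.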

Well, that’s all there is to it. It’s quite simple to set up Topics of interest for different categories of Events and direct all of the Events in each category to any Subscription created for the Topic. Using these Oracle Cloud features, one can effectively monitor the health of the databases supporting all business applications.



Why Data Lakes Need a Data Catalog

It’s no secret that big data is getting much bigger with each passing year—in fact, the world is seeing exponential growth in the amount of data generated, as plenty of research shows. That creates the issue of storage. If all those bits and bytes are being transmitted and you need access to them in order to analyze and derive insights via business intelligence, then the next logical step is a data lake.

But what happens when all of that data is sitting in the data lake? Finding anything specific within such a repository can be unwieldy by today’s standards. With the growing volume of data generated by all the world’s devices, the data lake will only grow wider and deeper with each passing day. Thus, while collecting it into a repository is key to using it, information needs to be cataloged and accessible in order for it to actually be usable. The sensible solution, then, is to implement a data catalog.

Never miss an update about big data! Subscribe to the Big Data Blog to receive the latest posts straight to your inbox!

What Is a Data Lake?

Before understanding why a data catalog can be so useful in this situation, it’s important to grasp the concept of a data lake. In layman’s terms, a data lake acts as a repository that stores data exactly the way it comes in. If it’s a structured dataset, it maintains that structure without adding any further indexing or metadata. If it’s unstructured data (for example, social media posts, images, MP3 files, etc.), it lands in the data lake as is, whatever its native format might be. Data lakes can take input from multiple sources, making them a functional single repository for an organization to use as a collection point. To go further into the lake metaphor, consider each data source as a stream or a river, all leading to the data lake, where raw and unfiltered datasets sit next to curated, enterprise-certified datasets.

Collecting data is only half of the equation, however. A repository only works well if data can be called up and used for analysis. In a data lake, data remains in its raw format until this step happens. At that point, a schema is applied to it for processing (schema on read), allowing analysts and data scientists to pick and choose what they work with and how they work with it.
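Schema on read can be sketched in a few lines of Python. The file contents and column names below are invented for illustration; the point is that the raw text sits untouched until a schema is applied at read time.

```python
import csv
import io

# Raw data as it sits in the lake: just text, no schema attached.
raw = "1,doorbell,2020-01-15\n2,camera,2020-02-03\n"

# The schema is chosen by the reader, at read time (schema on read).
schema = [("id", int), ("device", str), ("shipped", str)]

def read_with_schema(text, schema):
    # Apply column names and types only when the data is consumed.
    for row in csv.reader(io.StringIO(text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, row)}

records = list(read_with_schema(raw, schema))
print(records[0])  # {'id': 1, 'device': 'doorbell', 'shipped': '2020-01-15'}
```

Two analysts can read the same raw file with different schemas, which is precisely the flexibility a data lake is meant to provide.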

This is a very simple call-and-response action, but one element is missing: the search process. A data lake requires data governance. Without organization, searching for data is a chaotic, inefficient, and time-consuming process. And if too much time passes without clear organization and governance, the value of a data lake may collapse under its own accumulated data.

Enter the data catalog.

What Is a Data Catalog?

A data catalog is exactly as it sounds: it is a catalog for all the big data in a data lake. By applying metadata to everything within the data lake, data discovery and governance become much easier tasks. By applying metadata and a hierarchical logic to incoming data, datasets receive the necessary context and trackable lineage to be used efficiently in workflows.

Let’s use the analogy of notes in a researcher’s library. In this library, a researcher gets structured data in the form of books that feature chapters, indices, and glossaries. The researcher also gets unstructured data in the form of notebooks that feature no real organization or delineation at all. A data catalog would take each of these items without changing their native format and apply a logical catalog to them using metadata such as date received, sender, general topic, and other such items that could accelerate data discovery.

Given that most data lake situations lack a universal organizational tool, a true data catalog is an essential add-on. Without the level of organization of a data catalog, a data lake becomes a data swamp—and trying to pull data from a data swamp creates a process that is inefficient at best and a bottleneck at worst.
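A toy version of that idea: assets stay in their native formats, and a thin metadata layer over them makes the lake searchable. All asset names and metadata fields below are invented for illustration.

```python
# A tiny in-memory "catalog": each entry describes an asset in the lake
# without touching or restructuring the asset itself.
catalog = [
    {"asset": "iot_stream_2020.parquet", "topic": "device telemetry",
     "source": "doorbell", "received": "2020-01"},
    {"asset": "social_posts_raw.json", "topic": "package theft mentions",
     "source": "marketing", "received": "2020-01"},
]

def search(keyword):
    # Discovery happens against metadata only; the raw assets stay put.
    return [entry["asset"] for entry in catalog
            if keyword in entry["topic"] or keyword in entry["source"]]

print(search("theft"))  # ['social_posts_raw.json']
```

A real data catalog adds harvesting, lineage, and machine learning on top, but the core value is the same: finding data by what it is about, not by where it happens to live.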

How Data Lakes Work with Data Catalogs

Let’s take a look at a data scientist’s workflow from two different perspectives: without a data catalog and with a data catalog. Our hypothetical case study involves a smart doorbell that provides a stream of device data. At the same time, the company tracks mentions on social media by users who’ve had packages stolen, to determine the times when thieves are most likely to strike.

Without a data catalog: In this example, a data lake has datasets streaming in from Internet of Things (IoT) devices along with collected social media posts from the marketing team. A data analyst wants to examine the impact of a specific feature’s usage on social media sharing. Remember, the data in a data lake remains raw and unprocessed. In this case, data scientists will have to pull device datasets from the time period of the feature’s launch, then examine the individual data tables. To cross-reference against social media, they will have to pull all social media posts from this time period, then filter by keyword to drill down to mentions of the feature. While all this can be achieved using the data lake as a single source, it also requires quite a bit of manual preparation work.

With a data catalog: As datasets come into the data lake, a data catalog’s machine learning capabilities recognize the IoT data and create a universal schema based on those elements. Users still have the ability to apply their own metadata to enhance discoverability. Thus, when data scientists want to pull their data, a search within the data catalog brings up relevant results associated with the feature and other targeted keywords, allowing for much quicker preparation and processing.

This example illustrates the stark difference created by a data catalog. Without it, data scientists are essentially searching through folders without context—the information sought has to be already identified through some means such as data source, time range, and file type. In a small, controlled data environment with limited sources, this is workable. However, in a large repository featuring many sources and heavy collaboration, it quickly devolves into murky chaos.

A data catalog doesn’t completely automate everything, though its ability to intake structured data does feature significant automated processing. However, even with unstructured data, inherent machine learning and artificial intelligence capabilities mean that if a data scientist manually processes data with set patterns, then the catalog can begin to learn and provide first-cut recommendations to speed things up.

Position Your Data Lake for Success

The volume of data flowing into repositories is only getting bigger with each passing day. To ensure efficiency and accuracy, a form of governance is necessary for creating order among the chaos. Otherwise, a data lake quickly becomes a proverbial data swamp. Fortunately, data catalogs are a simple tool to achieve this, and by integrating one into a repository, organizations are set up for success now—and prepared to scale up as needed toward a bigger-than-big-data future.

Need to know more about data lakes and data catalogs? Check out Oracle’s big data management products, and don’t forget to subscribe to the Oracle Big Data blog to get the latest posts sent to your inbox.



Oracle Database(s) Top DB-Engines Ranking

Oracle Database and MySQL top the latest DB-Engines ranking for database management systems (see chart below).

The DB-Engines Index utilizes a scientific method to calculate the popularity of every database management system in use (whether that be relational, NoSQL, time-series, etc.). As of January 2020, Oracle Database is ranked the most popular among all 350 databases that have been analyzed. MySQL, the open source database that is developed, distributed, and supported by Oracle takes the #2 spot.

Not only do Oracle Database and MySQL top the DB-Engines ranking, but their 2019 increase in popularity was higher than any other database’s (see the chart below). In other words, the popularity gap between Oracle and its database competitors is widening.

This latest DB-Engines ranking follows the November 2019 Gartner report, ‘Critical Capabilities for Operational Database Management Systems’, where Oracle Database again achieved the highest scores in all four Use Cases.



Enterprise Manager CIS Benchmark Certification Eases Adoption of Secure Database Best Practices

It only takes a single mistake for the “bad guys” to be able to exploit a misconfiguration and exfiltrate your data. Thanks to the Center for Internet Security, Oracle Database users can avoid such scenarios by following the best practices defined by the CIS Benchmarks™. With the high rate of change in DevOps-oriented development teams and the proliferation of data across on-premises and cloud environments, database administrators now have an easy way to comply with these standards right within Oracle Enterprise Manager.

Configuration and Compliance management has been part of Oracle Enterprise Manager Database Lifecycle Management for a long time, and we’re happy to report that Oracle Enterprise Manager has been certified by CIS to compare the configuration status of Oracle Databases against the consensus-based best practice standards contained in the Oracle Database Benchmark v2.1.0, Level 1 – RDBMS Profile. Organizations that leverage Oracle Enterprise Manager can now ensure that the configurations of their critical assets align with the CIS Benchmarks consensus-based practice standards for all their database releases, including Oracle Database 18c and 19c. For more details on Oracle’s CIS listings, visit the Center for Internet Security website.

“Data is a company’s most valuable asset, and securing it has never been more important. We are pleased to support the industry standard CIS Benchmarks as part of our comprehensive Enterprise Manager automation and compliance offerings.”

Wim Coekaerts, Senior Vice President, Software Development

“Cybersecurity challenges are mounting daily, which makes the need for standard configurations imperative. By certifying its product with CIS, Oracle has demonstrated its commitment to actively solve the foundational problem of ensuring standard configurations are used throughout a given enterprise.”

Curtis Dukes, CIS Executive Vice President of Security Best Practices & Automation Group.

Enterprise Manager supports two flavors of the CIS Oracle Database v2.1.0 Benchmarks: one for Single-Instance Database and one for Cluster Database. Below is a screenshot of what the listings look like in the Compliance Framework.

Figure 1. CIS Benchmarks as they appear in the Enterprise Manager user interface.

CIS provides comprehensive configuration coverage for Oracle database, including:

  • Installation
  • Parameters
  • Connectivity
  • User Privileges
  • Auditing

Below are examples of some of the specific areas the Benchmark focuses on:

Figure 2. Samples of evaluation areas in the CIS Benchmarks for Oracle Database.

In addition to the CIS Benchmarks included in the latest release of Oracle Enterprise Manager, we’ve also included new Oracle-provided Security benchmarks for Database 18c and 19c. We’re committed to continuing to bring you best-in-class security offerings to harden your security posture across your data estate, whether on-premise or in the cloud.

For more information about Oracle Enterprise Manager and the Center for Internet Security (CIS), visit their respective websites.

About CIS

The Center for Internet Security, Inc. (CIS®) makes the connected world a safer place for people, businesses, and governments. We are a community-driven nonprofit, responsible for the CIS Controls® and CIS Benchmarks™, globally recognized best practices for securing IT systems and data. We lead a global community of IT professionals to continuously refine these standards to proactively safeguard against emerging threats. Our CIS Hardened Images® provide secure, on-demand, scalable computing environments in the cloud. CIS is home to the Multi-State Information Sharing and Analysis Center® (MS-ISAC®), the trusted resource for cyber threat prevention, protection, response, and recovery for U.S. State, Local, Tribal, and Territorial government entities, and the Elections Infrastructure Information Sharing and Analysis Center® (EI-ISAC®), which supports the cybersecurity needs of U.S. elections offices. To learn more, visit our website or follow us on Twitter: @CISecurity.



Cloud Day: What’s Possible and Where to Start

Want to get a peek into the future of modern IT? Then, come to Oracle Cloud Day, says Dain Hansen, VP of Product Marketing for IaaS and PaaS at Oracle. After speaking at Oracle Cloud Day events in Boston and Chicago last year, Hansen said that one of the things he liked best about the event was that it gave people a real view into what their future could be.

“Imagine a world where everything is automated. You can use AI to power the next level of insights, or you can build a modern application that you can talk to just like you talk to your phone,” Hansen said. “Those are things that we want people to experience. We want them to get first-hand knowledge of and use and touch and see what’s possible.”

Register here

This year, Hansen said, it’s all about how to use data to get a leg up—on the competition and in your career.

“You’re going to see all kinds of ways to use your data,” Hansen said.

Oracle Cloud Day will take a broad, yet detailed, look at all things data—how to manage it, how to secure it, how to draw insights from it, and how to create applications and services that use it in new ways.

But with so much to see at Oracle Cloud Day and so many new technologies to take in, we asked Hansen, “How does someone get the most from Oracle Cloud Day?”

Here are Hansen’s three tips.

Discover the Best Way to Do What You’re Trying to Do

Because there’s so much expertise on hand, Oracle Cloud Day is the perfect place to get information on best practices. Hansen recommends focusing first on what you’re trying to do within your organization, then finding the best way to do it.

If you’re a security person, maybe you want to learn about the latest security threats or figure out the best way to secure your data across cloud and on premises. If you’re an apps IT person, maybe you want to hear about the best way to migrate an application to the cloud.

Whatever it is, zero in on that topic, seek out the best way to do it, and take a look at how Oracle can help. Oracle Cloud Day is a great venue to experience technologies first hand and talk to experts about how they can help you with not only your needs, but the needs of your business as a whole.

Decide What You’re Going to Learn Next

Once you’ve identified how you can address your current needs, take a look at the horizon. What’s next?

“Everyone is always trying to learn something. Even for me, I’m always trying to study and see what I need to pick up on,” Hansen said.

Because of its emphasis on modern IT and the breadth of Oracle technology, Cloud Day is a great place to get up to date on what’s next for you and your business.

Hear From People Already Doing It

Maybe one of the best things about Cloud Day, Hansen said, is that attendees get to hear from companies already reaching their goals. Cloud Day will be packed with real-life stories told by customers who have made the journey.

“Customers don’t mess around. They don’t mince words. They tell it like it is. And the one thing I don’t want anyone to miss is hearing what our customers say about what they’re doing,” Hansen said.

With 15 sessions across three tracks—Modernizing Data Management, Modernizing Applications, and Transforming Business with Analytics and AI—plus the Developer Playground, industry experts and partners in the Innovation Lounge, and a keynote that brings it all together, there are plenty of opportunities to track down all the information you need for what you’re doing today and what you’ll want to do tomorrow.

Now that you know how to make the most of your time at Cloud Day, don’t forget to register. For more information about Oracle Cloud Day, visit the Oracle Cloud Day website.

