What Is Oracle Cloud Infrastructure Data Science?

And how does it work?

Incredible things can be done with data science, and more appear in the news every day—but there are still many barriers to success. These barriers range from a lack of proper support for data scientists to challenges around operationalizing and maintaining models in production.

That is why we created Oracle Cloud Infrastructure Data Science. Based on the acquisition of DataScience.com in 2018, Oracle Cloud Infrastructure Data Science was built with the goal of making data science collaborative, scalable, and powerful for every enterprise on Oracle Cloud Infrastructure.

Oracle Cloud Infrastructure Data Science was created with the data scientist in mind—and it’s uniquely suited for data science success because of its support for team-based activity. When it comes to data science success, teams must collaborate at each step of the model lifecycle: from building models all the way through to deployment and beyond.

Oracle Cloud Infrastructure Data Science helps make all of that possible.

What Is Oracle Cloud Infrastructure Data Science?

Oracle Cloud Infrastructure Data Science makes data science more structured and more efficient by offering:

Access to data and open-source tools

We are data-source agnostic. Your data can be in Autonomous Data Warehouse, in Object Storage, in MongoDB, in an Elasticsearch instance on Azure, or in Amazon Redshift. It doesn’t matter to us where the data is; we just care about giving you access to your data to get things done.

With Oracle Cloud Infrastructure Data Science, you can use the best of open source, including:

  • Tools and languages like Python and JupyterLab
  • Visualization like Plotly and Matplotlib
  • Machine-learning libraries like TensorFlow, Keras, scikit-learn, and XGBoost
  • Version control with Git

Ability to utilize compute on demand

We’ll give you the client connectors you need to access your data and a configurable volume to store that data in your notebook compute environment.

But of course, it doesn’t stop there. You can also select the amount of compute you need to train your model on Oracle Cloud Infrastructure. For now, you can choose small to large CPU virtual machines. And in the near future, we’re planning to add GPUs.

Collaborative workflow

We make a big deal out of teamwork, because we believe that data science can’t truly be successful unless there’s an emphasis on making those teams efficient and successful. We’ve done everything we can to make this possible.

Data scientists can work in “projects” where it’s easy to see what’s happening with a high-level view. Data scientists can share and reuse data science assets and test their colleagues’ models.

Model deployment

Model deployment is usually challenging, but it’s made easier with Oracle Functions on Oracle Cloud Infrastructure. Create a machine learning model function that can be invoked from any application. It’s one of many possible deployment targets, and it’s fully managed, highly scalable, and on-demand.

What Makes Oracle Cloud Infrastructure Data Science Different?

With the growing popularity of data science and machine learning, products that claim to help are a dime a dozen. So, what makes Oracle Cloud Infrastructure Data Science different?

This isn’t an analytics tool with some machine learning capabilities embedded within it. Nor is it an app that offers AI capabilities across different products.

Oracle Cloud Infrastructure Data Science is a platform built for the modern, expert data scientist. And it was built by data scientists who were seeking a platform that would help them perform their complex work better. It’s not a drag-and-drop interface. This is meant for data scientists who write code in Python and need something with real power to enable real data science.

Oracle Cloud Infrastructure Data Science is right for you if you:

  • Have a team and see the benefits of centralized work
  • Prefer Python to drag-and-drop interfaces
  • Want to take advantage of the benefits of Oracle Cloud, with easy access to your data

Oracle Cloud Infrastructure Data Science is also right for you if you need:

  • The ability to train large models on large amounts of data with minimal infrastructure expertise
  • A system to evaluate and monitor models throughout their lifecycle
  • Improved productivity through automation and streamlined workflows
  • Capabilities to deploy models for varying use cases
  • Ability to collaborate with team members in an enterprise organization
  • A seamless, integrated Oracle Cloud Infrastructure user experience

How Does Oracle Cloud Infrastructure Data Science Work?

Oracle Cloud Infrastructure Data Science has:

Projects to centralize, organize, and document a team’s work. These projects describe the purpose of the work and allow users to organize notebook sessions and models.

Notebook Sessions for Python analyses and model development. Users can easily launch Oracle Cloud Infrastructure compute, storage, and networking for Python data science workloads. These sessions provide easy access to JupyterLab and other curated open-source machine-learning libraries for building and training models.

In addition, these notebook sessions come loaded with tutorials and example use cases to make getting started easier than ever.

Accelerated Data Science (ADS) SDK to make common data science tasks faster, easier, and less error-prone. This is a Python library that offers capabilities for data exploration and manipulation, model explanation and interpretation, and AutoML for automated model training.

Model Catalog to enable model auditability and reproducibility. You can track model metadata (including the creator, created date, name, and provenance), save model artifacts in service-managed object storage, and load models into notebook sessions for testing.

How Does Oracle Cloud Infrastructure Data Science Help with Model Management?

The process of building a machine learning model is an iterative one, and it’s one that essentially never ends. Let’s walk through how Oracle Cloud Infrastructure Data Science makes it easier to manage models throughout every step of the lifecycle.

Building a Model

Oracle Cloud Infrastructure Data Science’s JupyterLab environment offers a variety of open-source libraries for building machine learning models. It also includes the Accelerated Data Science (ADS) SDK, which provides APIs on data ingestion, data profiling and visualization, automated feature engineering, automated machine learning, model evaluation, and model interpretation. It’s everything that’s needed in a unified Python SDK, accomplishing in a few lines of code what a data scientist would typically do in hundreds of lines of code.

Training a Model

Data scientists can automate model training through the ADS AutoML API. ADS can help data scientists find the best data transformations for datasets. After the model evaluation shows that the model is ready for production, the model can be made accessible to anybody who needs to use it.

Evaluating a Model

ADS also helps with model evaluation to ensure that your model is accurate and reliable. What percent accuracy can you achieve with the model? How can you make it more accurate? You want to feel confident in your model before you start to deploy it.
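As a rough illustration of the "what percent accuracy" question, the sketch below computes accuracy, precision, and recall by hand for a binary classifier. It uses only the Python standard library and made-up labels; it is not the ADS evaluation API, and the function name is hypothetical.

```python
def evaluate_binary(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (0 or 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels, for illustration only.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate_binary(actual, predicted)
print(metrics)
```

Accuracy alone can be misleading on imbalanced data (fraud detection is a typical case), which is why the sketch reports precision and recall alongside it.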

Explaining a Model

Model explainability is becoming an increasingly important part of machine learning and data science. Can your model give you more information about why it’s making the decisions it’s reaching? European regulations increasingly address the right to know. GDPR, for example, is widely read as giving the data subject a right to an explanation of a decision reached by a model.

Deploying a Model

Taking a trained machine learning model and getting it into the right systems is often a difficult and laborious process. But Oracle Cloud Infrastructure enables teams to operationalize models as scalable and secure APIs. Data scientists can load their model from the model catalog, deploy the model using Oracle Functions, and secure the model endpoint with Oracle API Gateway. Then, the model REST API can be called from any application.
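To make "called from any application" concrete, here is a minimal sketch of how a client might prepare a call to a model REST endpoint using only the Python standard library. The URL, the payload shape (an "input" key), and the bearer-token header are all assumptions for illustration; the real values depend on how the function and the gateway are configured.

```python
import json
import urllib.request


def build_prediction_request(endpoint_url, features, api_key=None):
    """Build (but do not send) a JSON POST request for a model endpoint."""
    body = json.dumps({"input": features}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Header name and scheme are assumptions; use what your gateway expects.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        endpoint_url, data=body, headers=headers, method="POST"
    )


# Hypothetical endpoint and feature vector, for illustration only.
request = build_prediction_request(
    "https://example.com/v1/predict", [[5.1, 3.5, 1.4, 0.2]]
)
print(request.get_method(), request.full_url)

# To actually call the endpoint:
# with urllib.request.urlopen(request, timeout=10) as response:
#     prediction = json.loads(response.read())
```

Building the request separately from sending it keeps the example safe to run without a live endpoint and makes the payload easy to inspect.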

Model Monitoring

Unfortunately, deploying a model isn’t the end of it. Models must always be monitored after deployment to maintain good health. The data a model was trained on may no longer be relevant for future predictions after a while. For example, in the case of fraud detection, the fraudsters may come up with new ways to defraud the system, and the model will no longer be as accurate. Oracle Cloud Infrastructure Data Science is working to provide data scientists with tools to track how a deployed model continues to perform, so that it becomes easier to monitor model accuracy over time.
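One simple way to picture accuracy monitoring is a rolling window over recent predictions whose true outcomes have since become known. The sketch below is a generic, standard-library illustration with hypothetical names and thresholds; it does not represent an Oracle Cloud Infrastructure Data Science feature.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window=100, alert_below=0.9):
        self.outcomes = deque(maxlen=window)  # old results drop off automatically
        self.alert_below = alert_below

    def record(self, predicted, actual):
        """Store whether a prediction matched the outcome observed later."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """Return True when windowed accuracy falls below the threshold."""
        acc = self.accuracy()
        return acc is not None and acc < self.alert_below


# Hypothetical (predicted, actual) pairs, for illustration only.
monitor = RollingAccuracyMonitor(window=4, alert_below=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.needs_attention())
```

A falling windowed accuracy, as in the fraud example above, is a signal that the model may need retraining on more recent data.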

Conclusion

Oracle Cloud Infrastructure Data Science is an enterprise-grade service in which teams of data scientists can collaborate to solve business problems and leverage the latest and greatest in Oracle Cloud Infrastructure to build, train, and deploy their models in the cloud.

It is part of Oracle’s data and AI platform, which makes it simple to integrate and manage your data and use the power of data science and machine learning for more business results.

With Oracle Cloud Infrastructure Data Science, it’s easier than ever before for data scientists to get started, work with the tools and libraries that they want, and gain streamlined access to all data in Oracle Cloud Infrastructure and beyond.

Related:

PureMessage for UNIX: Sample policy.siv file for Delay Queue

This article provides a sample policy.siv file with Delay Queue related tests and actions.

Applies to the following Sophos products and versions

PureMessage for Unix 6.4.0 and above.

You can download the sample file here.

Note: When you combine not with allof for the pmx_delayed_mail test you will face an error in the Policy tab of the Manager Interface. It will work for the Milter since there is no syntax or semantic error, although the GUI can’t display it. To work around this you need to write not and allof with two different nested if statements.

Incorrect statement

if allof (pmx_relay :memberof "internal-hosts",
          not pmx_delayed_mail) {
}

Correct statement

if pmx_relay :memberof "internal-hosts" {
    if not pmx_delayed_mail {
    }
}
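The two forms are logically equivalent, which is why the nested version is a safe workaround. The short Python sketch below (not Sieve, and using hypothetical function names) checks every combination of the two test results to confirm that the action fires in exactly the same cases.

```python
from itertools import product


def combined(is_internal_relay, is_delayed):
    """Mirrors: if allof (relay test, not delayed test) { action }"""
    return all([is_internal_relay, not is_delayed])


def nested(is_internal_relay, is_delayed):
    """Mirrors: if relay test { if not delayed test { action } }"""
    if is_internal_relay:
        if not is_delayed:
            return True
    return False


# Compare both forms for all four combinations of the two tests.
results = [
    combined(a, b) == nested(a, b)
    for a, b in product([False, True], repeat=2)
]
print(all(results))
```

Note that the equivalence holds for a plain if block. If an else branch is attached, the nested form needs the else handled at both levels to behave the same way.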

Related:

Installing and Connecting Anaconda on Windows to Autonomous Transaction Processing

Introduction

In the previous blog we provisioned and connected to an Autonomous Transaction Processing instance. Autonomous Transaction Processing supports a complex mix of high-performance transactions, reporting, batch, IoT, and machine learning in a single database, allowing much simpler application development and deployment and enabling real-time analytics, personalization, and fraud detection.

In this blog you will install the Oracle Client libraries, install the Microsoft Visual Studio redistributable, install Anaconda, and run a few simple commands in Jupyter Notebook.

Step 1: Download the Oracle Instant Client

In order to connect and run applications from your PC to remote Oracle databases, such as Autonomous Transaction Processing, Oracle client libraries must be installed on your computer. Oracle Instant Client enables applications to connect to a local or remote Oracle Database for development and production deployment. The Instant Client libraries provide the necessary network connectivity, as well as basic and high-end data processing features, to make full use of any Oracle database. It underlies the Oracle APIs of popular languages and environments including Node.js, Python and PHP, as well as providing access for OCI, OCCI, JDBC, ODBC and Pro*C applications. Tools included in Instant Client, such as SQL*Plus and Oracle Data Pump, provide quick and convenient data access.

Let us start with Oracle Instant Client for Microsoft Windows (x64) 64-bit. You can find it here. If you happen to run another operating system, you can find the relevant Oracle Instant Client libraries here.

  • Accept License Agreement and select Basic Lite Package.

  • This requires signing in to OTN with your SSO account. If you do not have an account, you need to create one.

  • Download the file and then proceed to the directory where the file was downloaded. Unzip the file into a directory. Open Command Prompt and navigate to the directory.

  • Add this directory to your path in Windows:
    • In Search, search for and then select: Advanced System Settings (Control Panel)
    • Click Environment Variables at the bottom of the screen
    • In System Variables, double click Path
    • In the screen that opens up, select NEW
    • Add the full path to the instant client directory (C:\instantclient_18_5)
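After editing PATH, it can be worth confirming that the directory was actually picked up (new Command Prompt windows are needed to see the change). The following is a minimal sketch, using a hypothetical helper name, that checks a Windows-style PATH string for a directory while ignoring case and trailing backslashes. It uses the standard `ntpath` module so the same check also runs on non-Windows machines.

```python
import ntpath
import os


def is_on_windows_path(directory, path_value=None):
    """Return True if `directory` is an entry in a Windows-style PATH string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(entry):
        # Strip quotes and whitespace, collapse separators, ignore case.
        return ntpath.normcase(ntpath.normpath(entry.strip().strip('"')))

    target = norm(directory)
    return any(
        norm(entry) == target
        for entry in path_value.split(";")
        if entry.strip()
    )


# Sample PATH value for illustration; omit the argument to check the real PATH.
sample_path = r"C:\Windows\system32;C:\Windows;C:\instantclient_18_5"
print(is_on_windows_path(r"c:\InstantClient_18_5", sample_path))
```

On your own machine, calling `is_on_windows_path(r"C:\instantclient_18_5")` with no second argument checks the PATH of the current process.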

Step 2: Installing Microsoft Visual Studio Redistributable

  • Oracle Client libraries for Windows require the presence of the correct Visual Studio redistributable. Follow the link below to install:

https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads#bookmark-vs2013

  • Select the correct architecture

  • Double Click the downloaded file and proceed with the installation

  • This completes the installation of the pre-requisites

Step 3: Installing Anaconda/Python/Jupyter

Anaconda with Jupyter is a popular Python distribution and notebook environment. It is very sensitive to other installed Python versions and to PATH entries left over from previous installations on your computer. If you have other versions of Python installed, remove them, along with any PATH entries and projects associated with them, or this installation may not work.

  • Download the software from www.anaconda.com/download
  • Select the Python 3.7 version download, and make sure you select the one for your architecture (32-bit or 64-bit)

  • Go to the folder where the file was downloaded and Double Click it. This brings up the Anaconda installation page. Go ahead and Click Next.

  • Click I agree on the next screen

  • In the next screen Select Just me and Click Next

  • Install in the following directory: C:\Anaconda3. If the directory does not exist, you must create it (the installer will not create it). Click Next

  • Make sure you Select Register Anaconda as my default Python 3.7. Leave Add Anaconda to your PATH environment variable non-selected. Click Install

  • The installation will take a few minutes. Once complete Click Next

  • You will get a prompt to install Microsoft VS Code. Skip this step.

  • Deselect both options in the next screen and Click Next.

  • You must add the new install directory to your PATH. Add C:\Anaconda3 and C:\Anaconda3\scripts to your PATH:

In Windows 10:

  • In Search, search for and then select: Advanced System Settings (control panel)
  • Click Environment Variables at bottom of screen
  • In the System variables double click Path
  • In the screen that opens up select NEW
  • Add the full path to the Anaconda directory (C:\Anaconda3)
  • Add the full path to the Anaconda scripts directory (C:\Anaconda3\scripts)

Hooray!!! Anaconda and Python are now installed.
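A quick way to confirm that the new installation is the one Windows will find is to ask Python where each tool resolves on PATH. This is a small sketch with a hypothetical helper name; the list of tool names is an assumption about what a typical Anaconda install provides.

```python
import shutil
import sys


def find_tools(names=("python", "conda", "jupyter", "pip")):
    """Map each tool name to the full path found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


print("Running interpreter:", sys.executable)
for tool, location in find_tools().items():
    print(f"{tool:8} -> {location or 'not found on PATH'}")
```

If any tool resolves to a directory other than the Anaconda install directory, an older Python installation is probably still ahead of it on PATH.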

Step 4: Using Anaconda/Jupyter/Python with Autonomous Transaction Processing

Before running any Python apps that access the database, the correct packages must be loaded into the Python environment. Open a Command Prompt window, navigate to the directory where you installed Anaconda (C:\Anaconda3), and run the following commands in order. pip is a package management system used to install and manage software packages written in Python. We will use pip to install the packages:

pip install --upgrade pip
pip install keyring
pip install cx_Oracle
pip install sql
pip install ipython-sql
pip install python-sql
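Once the installs finish, you can check from Python that the modules are importable before opening a notebook. The sketch below uses a hypothetical helper name and only the standard library. Note that import names can differ from pip package names: ipython-sql, for example, provides the `sql` extension that is loaded later with %load_ext sql.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names used later in this walkthrough.
required = ["cx_Oracle", "keyring", "sql"]
print("Missing modules:", missing_modules(required))
```

An empty list means every module was found; any name listed needs its pip install step repeated.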

  • To Start Anaconda/Jupyter, go to the Windows Start Icon, Click and Select Anaconda Navigator under Anaconda3. Once inside Anaconda, Select Jupyter

  • A new browser page will open, running Jupyter. Select New and then Python 3:

  • A new Python notebook will open. Libraries must be loaded each time a new Python session starts, and they are loaded with the import command. We will use 3 libraries. Copy the 3 lines below, paste them directly in the box next to the In[]: prompt, then select Run.

import cx_Oracle

import keyring

import os

  • Run a simple command to display your PATH. Run the following command (copy and paste into the box and select Run): print(os.environ["PATH"])

  • Now let us set the TNS_ADMIN variable. TNS_ADMIN is the location of the unzipped wallet files. Instructions on how to create a wallet can be found in the previous blog post. Below we set and then check the variable (the first command sets it, the second displays it back). Run the following commands (copy and paste into the box and select Run):

os.environ["TNS_ADMIN"] = r"c:\wallets"

print(os.environ["TNS_ADMIN"])
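Connection failures at this stage are often caused by TNS_ADMIN pointing at the wrong folder. As a hedged sketch, the helper below (a hypothetical name) reports which of the files an Oracle client typically reads from a wallet directory are absent. The file list is an assumption based on a standard Autonomous Database wallet zip.

```python
import os
import tempfile

# Files an Oracle client typically reads from the wallet directory (assumed).
EXPECTED_WALLET_FILES = ("tnsnames.ora", "sqlnet.ora", "cwallet.sso")


def missing_wallet_files(wallet_dir):
    """Return the expected wallet files that are absent from wallet_dir."""
    return [
        name
        for name in EXPECTED_WALLET_FILES
        if not os.path.isfile(os.path.join(wallet_dir, name))
    ]


# Demonstration with a temporary directory holding only tnsnames.ora.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "tnsnames.ora"), "w").close()
    demo_missing = missing_wallet_files(demo_dir)
print(demo_missing)

# In the notebook you would call: missing_wallet_files(os.environ["TNS_ADMIN"])
```

An empty list for your real wallet directory suggests the wallet was unzipped to the location TNS_ADMIN points at.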

  • Let’s make some external calls to the Autonomous Transaction Processing. For that we need to load another library. Run the command below which will load the library needed to call external sql databases (ignore warning/error messages, make sure to include the %):

%load_ext sql

  • Next let us connect to the Autonomous Transaction Processing database using a user name, password and service. Use your admin account and password created when the ATP database was created. The format of the command is:

%sql oracle+cx_oracle://user:password@service

Once connected you will get the message ‘Connected: admin@None’
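The connection string is a URL, so characters such as @, /, or # in a password must be percent-encoded or the string will be parsed incorrectly. The sketch below builds the string with the standard library. The function name, the sample password, and the service name myatp_high are hypothetical.

```python
from urllib.parse import quote_plus


def build_sql_magic_url(user, password, service):
    """Build an oracle+cx_oracle URL with the user and password URL-encoded."""
    return (
        f"oracle+cx_oracle://{quote_plus(user)}:{quote_plus(password)}@{service}"
    )


# Hypothetical credentials and service name, for illustration only.
url = build_sql_magic_url("admin", "P@ss/word#1", "myatp_high")
print(url)
```

Avoid saving a real password in a notebook cell. The keyring library installed earlier is one option for keeping it out of the notebook.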

  • To run a query once connected, put the oracle+cx_oracle connection string after the sql magic, followed by the SQL statement (notice there is no ; at the end of the statement). Because the statement below spans several lines, it uses the %%sql cell magic, which must be the first line of the cell. The SQL is the same one we ran in previous labs. Copy the statement below, paste it in the box, and click Run.

%%sql oracle+cx_oracle://user:password@service
SELECT channel_desc,
       TO_CHAR(SUM(amount_sold), '9,999,999,999') SALES$,
       RANK() OVER (ORDER BY SUM(amount_sold)) AS default_rank,
       RANK() OVER (ORDER BY SUM(amount_sold) DESC NULLS LAST) AS custom_rank
FROM sh.sales, sh.products, sh.customers, sh.times, sh.channels, sh.countries
WHERE sales.prod_id = products.prod_id
  AND sales.cust_id = customers.cust_id
  AND customers.country_id = countries.country_id
  AND sales.time_id = times.time_id
  AND sales.channel_id = channels.channel_id
  AND times.calendar_month_desc IN ('2000-09', '2000-10')
  AND country_iso_code = 'US'
GROUP BY channel_desc
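The query returns two rankings of the same totals: default_rank orders ascending and custom_rank orders descending. As a rough illustration of how SQL RANK() assigns numbers (ties share a rank and the next rank is skipped), here is a plain Python sketch with hypothetical sales totals. It does not model NULLS LAST, since the sample has no missing values.

```python
def sql_rank(values, descending=False):
    """Rank values the way SQL RANK() does: ties share a rank, then a gap."""
    ordered = sorted(values, reverse=descending)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first position at which each value appears.
        first_position.setdefault(value, position)
    return [first_position[value] for value in values]


# Hypothetical sales totals per channel, for illustration only.
totals = {"Direct Sales": 500, "Internet": 200, "Partners": 300, "Tele Sales": 200}
amounts = list(totals.values())
default_rank = sql_rank(amounts)                   # like ORDER BY SUM(...)
custom_rank = sql_rank(amounts, descending=True)   # like ORDER BY SUM(...) DESC

for channel, low, high in zip(totals, default_rank, custom_rank):
    print(f"{channel:12} default_rank={low} custom_rank={high}")
```

In this sample the two channels tied at 200 share rank 1 in the ascending ranking, and the next channel gets rank 3 rather than 2.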

Awesome. Now you are connected to Autonomous Transaction Processing using Anaconda.

Written by Philip Li & Sai Valluri

Related:

DLP 15.5 and Update Readiness Tool Warning

Dear,

I am testing version 15.5, and after running the Update Readiness Tool the results show these 6 warnings:

------------------------------------------------------------------------------

Start: Oracle System Parameter Validation – 2019-03-20 16:05:30
    Parameter Name                 Current Value        Recommended Value
    ------------------------------ -------------------- --------------------
    memory_target                  0                    3072
    pga_aggregate_target           1073741824           0
    sessions                       1528                 1500
    sga_max_size                   3221225472           0
    sga_target                     3221225472           0
    sort_area_size                 65536                0
End  : Oracle System Parameter Validation – elapsed .04s – WARNING (6 warnings)

------------------------------------------------------------------------------

My question is: do I really need to fix these? In my case it is a clean installation of Oracle on a single-tier server.
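For reference, the six warnings are simply the rows where the current value differs from the recommended one. The rough Python sketch below parses the table and lists them. It also shows that the current sga_target of 3221225472 bytes equals 3072 MB, the same figure recommended for memory_target, assuming that recommendation is expressed in megabytes. The helper names are hypothetical, and the sketch does not say whether the warnings must be fixed.

```python
# Rows copied from the Oracle System Parameter Validation output above.
REPORT = """\
memory_target                  0                    3072
pga_aggregate_target           1073741824           0
sessions                       1528                 1500
sga_max_size                   3221225472           0
sga_target                     3221225472           0
sort_area_size                 65536                0
"""


def parse_parameters(report):
    """Parse 'name current recommended' rows into (str, int, int) tuples."""
    rows = []
    for line in report.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            rows.append((parts[0], int(parts[1]), int(parts[2])))
    return rows


def mismatches(rows):
    """Return parameter names whose current value differs from the recommendation."""
    return [name for name, current, recommended in rows if current != recommended]


rows = parse_parameters(REPORT)
flagged = mismatches(rows)
current_values = {name: current for name, current, _ in rows}
sga_target_mb = current_values["sga_target"] // (1024 * 1024)

print("Flagged:", flagged)
print("sga_target in MB:", sga_target_mb)
```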

Hostname                 : 099W5751                                             
Upgrade Source Version   : 15.5                                                 
Upgrade Target Version   : 15.5
Readiness Tool Version   : 1
Database Schema Version  : 15.5
Oracle Version           : Oracle Database 12c Standard Edition Release 12.2.0.1.0 – 64bit Production
Oracle Patchset          : 12.2.0.1
Oracle Server Platform   : Microsoft Windows x86 64-bit
Date                     : 2019-03-20 16:04:50

Enforce Version      Date Installed                 Is Current Version?
-------------------- ------------------------------ -------------------
15.5.0.17018         2019-03-15 11:53:34            Y

Start: Oracle RAC support not enabled – 2019-03-20 16:04:50
End  : Oracle RAC support not enabled – elapsed 0s – PASSED

Start: Oracle CDC support not enabled – 2019-03-20 16:04:50
End  : Oracle CDC support not enabled – elapsed 0s – PASSED

Start: Oracle Securefile Validation – 2019-03-20 16:04:50
**** Oracle 12c detected *****
End  : Oracle Securefile Validation – elapsed .11s – PASSED

Start: Oracle Virtual Column Validation – 2019-03-20 16:04:51
End  : Oracle Virtual Column Validation – elapsed 2.09s – PASSED

Start: Oracle Partition Table Validation – 2019-03-20 16:04:53
End  : Oracle Partition Table Validation – elapsed 0s – PASSED

Start: Numeric Overflow Validation – 2019-03-20 16:04:53
End  : Numeric Overflow Validation – elapsed .02s – PASSED

Start: Table Definition Validation – 2019-03-20 16:04:53
End  : Table Definition Validation – elapsed 6.86s – PASSED

Start: Index Definition Validation – 2019-03-20 16:05:00
End  : Index Definition Validation – elapsed 20.09s – PASSED

Start: Foreign Key Validation – 2019-03-20 16:05:20
End  : Foreign Key Validation – elapsed 5.31s – PASSED

Start: Miscellaneous Object Validation – 2019-03-20 16:05:25
End  : Miscellaneous Object Validation – elapsed .05s – PASSED

Start: Invalid Object Validation – 2019-03-20 16:05:25
End  : Invalid Object Validation – elapsed .02s – PASSED

Start: Oracle System Privilege Validation – 2019-03-20 16:05:25
End  : Oracle System Privilege Validation – elapsed .11s – PASSED

Start: Oracle Object Privilege Validation – 2019-03-20 16:05:25
End  : Oracle Object Privilege Validation – elapsed .75s – PASSED

Start: Check Constraint Validation – 2019-03-20 16:05:26
End  : Check Constraint Validation – elapsed 1.17s – PASSED

Start: Sequence Validation – 2019-03-20 16:05:27
    Out-of-order sequences
    Sequence Name                  Table Name                     Column Name                    Sequence.NEXTVAL Highest PK Value
    ------------------------------ ------------------------------ ------------------------------ ---------------- ----------------
End  : Sequence Validation – elapsed 2.85s – PASSED

Start: Oracle System Parameter Validation – 2019-03-20 16:05:30
    Parameter Name                 Current Value        Recommended Value
    ------------------------------ -------------------- --------------------
    memory_target                  0                    3072
    pga_aggregate_target           1073741824           0
    sessions                       1528                 1500
    sga_max_size                   3221225472           0
    sga_target                     3221225472           0
    sort_area_size                 65536                0
End  : Oracle System Parameter Validation – elapsed .04s – WARNING (6 warnings)

Start: Data Validation – 2019-03-20 16:05:31
End  : Data Validation – elapsed 0s – PASSED

Data Objects Check Summary: There are total of 6 warnings and 0 errors.                                                                                                                                                                                                                                    
For details about the Update Readiness Tool output messages and troubleshooting information, see the Support Center article at https://support.symantec.com/en_US/article.DOC1066…                                                                                                                     
 
