Transforming Medical Research with a Big Data Services Platform

The era of big data has opened up new opportunities for medical innovation. A new wave of research projects is looking to accelerate medical discovery by bringing vast amounts of personal and operational healthcare information onto readily available compute and storage, where it can be explored with leading-edge analytics tools and techniques.

The opportunities are tremendous but the challenges are many. Rapid progress requires architectures and processes that overcome the biggest challenges of data management in the healthcare sector: rigorous data privacy regulations; diverse information standards; a proliferation of application and data silos and complex integration points; streams of real-time data from monitors, scanners, imaging devices, wearables, and mobile devices; and vast databases that hold data of every conceivable type.

Partners HealthCare is one of the world’s leading medical research organizations. Encompassing both Massachusetts General and Brigham and Women’s – the major teaching hospitals for Harvard Medical School – Partners HealthCare supports thousands of research projects each year. The Enterprise Research and Infrastructure Services (ERIS) group provides enabling technical capabilities for their research and innovation communities, ensuring that teams have access to the technology infrastructure, data, tools, and support resources they need to meet their project and operational goals.

The research and innovation teams at Partners are pioneers in the use of big data technologies. However, it was clear that the aforementioned challenges, along with infrastructure limitations and a lack of supporting services, were impeding adoption and progress. Recognizing that their customers required a new approach for service delivery, ERIS worked with Dell EMC Services to architect, build, and operationalize a platform for developing and executing big data medical and translational research projects faster, more efficiently, and at lower cost.

The result of this collaboration is the Integrated Data Environment for Analytics platform (IDEA). The IDEA platform provides the Partners HealthCare community of researchers and innovators with four key service capabilities that are fundamental to the enablement of big data solutions – storage, compute, analytics, and platform.

  • A pay-as-you-go storage solution that offers secure and unlimited capacity for a wide variety of data sets at a range of price and performance points. Designed around Isilon, CloudPools, and Elastic Cloud Storage (ECS), it provides a single place to store, secure, and analyze unstructured data sets critical for research initiatives in a privacy-aware environment. Using the multi-protocol capabilities of Isilon OneFS, a single data copy can be accessed from any instrument or system, while maintaining an authentication and authorization model that is integrated with Partners’ existing security processes. CloudPools allows data to be encrypted and warm-archived to ECS or public cloud providers, thereby providing unlimited secure storage, while adhering to Partners’ stringent security and regulatory requirements. This implementation strategy for a secure Data Lake is fundamental to enabling big data analytics in Healthcare and Life Science.
  • On-demand provisioning of Linux-based virtual machines pre-integrated with Partners’ security platform (authentication, authorization and auditing), storage, and common analytics and development tools. Supporting services include patching, maintenance, backup and high availability, relieving the research teams of common administrative burdens.
  • Integration with leading analytics and research applications that allows all data to be accessed and analyzed in place using a common data repository. Built upon the Dell EMC Analytic Insights Module (AIM), the platform provides foundational data management and processing capabilities based on the Hadoop ecosystem. Access to Spark, Hive, HBase, Sqoop, and HAWQ is available from purpose-built IDEA Virtual Desktop workstations. These high-powered VDI workstations include installations of popular open-source data science tools including R, Python, RStudio, Jupyter notebooks, and Spyder. Multiple relational and NoSQL datastore options are available, including MySQL, PostgreSQL, Greenplum, and MongoDB. IDEA is securely and seamlessly integrated with the ERIS High Performance Computing (HPC) environment, allowing for the development of fully integrated data pipelines between the two systems (a brief usage sketch follows this list).
  • An application development platform that allows research and development teams to rapidly translate their research analytics and processes into robust data applications that can be deployed as cloud resources for clinical and business use outside of the IDEA environment.
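To make the in-place analysis model concrete, here is a minimal sketch of the kind of job a researcher might run from an IDEA virtual workstation using the Spark access described above. The dataset path, column names, and application name are hypothetical and exist only for illustration.

    # Minimal sketch (assumed environment): summarize a shared research dataset
    # in place with PySpark. Paths and column names below are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("idea-cohort-summary")   # hypothetical job name
             .getOrCreate())

    # Read a de-identified cohort extract from the shared data repository.
    cohort = spark.read.parquet("/idea/research/cohort_extract.parquet")

    # Count distinct patients and encounters per diagnosis group without
    # copying the data out of the platform.
    summary = (cohort.groupBy("diagnosis_group")
                     .agg(F.countDistinct("patient_id").alias("patients"),
                          F.count("encounter_id").alias("encounters")))

    summary.orderBy(F.desc("patients")).show(20)

The same data could then be picked up in R or a Jupyter notebook on the workstation, since all tools point at the common repository rather than at private copies.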

The IDEA platform is used across the research and clinical innovation enterprise. Its scalability and flexibility allow it to serve both large, well-funded institutions and small innovation teams with limited budgets. Customers of the IDEA platform include:

  • The Center for Integrated Diagnostics, which integrates genomic profiling with advanced analytics across vast data sets to provide patients with a new approach for the personalized treatment of serious diseases. Using IDEA, the center has collaborated with Dell EMC and InterSystems on the development of a prototype next-generation precision medicine system (MRE) and has introduced several innovations into the clinical workflow.
  • The Martinos Center for Biomedical Imaging is one of the world’s premier research centers devoted to the development and application of advanced biomedical imaging technologies. The center is using IDEA data services to securely and efficiently share a 100+ TB neuroimaging data set, the Human Connectome Project, across its many research teams and analytics platforms.
  • The Center for Connected Health is a leader in the innovation and development of IoT-based solutions that are empowering patients to transform their health care experience. The center is using the compute and platform capabilities of IDEA in the development of next generation mobile healthcare solutions.

These teams, and many others, are only just beginning to explore and understand the power of the IDEA platform, and its potential for supporting medical innovation. Their excitement is palpable. With the support of the ERIS team and Dell EMC, the research teams at Partners HealthCare are shaping the future of healthcare in the big data age.

Related:

Smarter flights and lights

By 2020, the Internet of Things will comprise over 30 billion devices, growing to over 75 billion in 2025. It will transform cities, hospitals, factories. The way we work, shop and live will fundamentally change. Such exponential growth means enormous potential, and an avalanche of data from a vast array of sources and sensors. Capitalizing on this paradigm shift requires distilling this data into actionable insights.

Enter Predix: GE’s platform for developing applications for the Industrial Internet is already transforming cities, hospitals and factories. Built on the Pivotal Cloud Foundry platform, Predix offers an operating system for developing, testing and delivering secure software that powers everything from safer jet engines, to more intelligent streets.

Related:

Peter Principle: The Destroyer of Great Ideas…and Companies

Wikibon just released their “2017 Big Data Market Forecast.” How rosy that forecast looks depends upon whether you view Big Data as yet another technology exercise, or as a business discipline that organizations can unleash upon competitors and new market opportunities. To quote the research:

“The big data market is rapidly evolving. As we predicted, the focus on infrastructure is giving way to a focus on use cases, applications, and creating sustainable business value with big data capabilities.”

Leading organizations are in the process of transitioning the big data conversation from “what technologies and architectures do we need?” to “how effective is our organization at leveraging data and analytics to power our business models?”

We developed the Big Data Business Model Maturity Index to help our clients answer that question: to 1) understand where they sit today with respect to how effectively they leverage data and analytics to power their business models, and 2) lay out a roadmap for creating sustainable business value with big data capabilities (see Figure 1).

Figure 1: Big Data Business Model Maturity Index

So why do organizations struggle if it’s not a technology or an architecture challenge? Why do organizations struggle when the path is so clear, and the business and financial benefits so compelling?

I believe that organizations fail in creating sustainable business value with big data capabilities because of the Peter Principle.

“Peter Principle”: The Destroyer of Great Ideas

The Peter Principle is a management theory formulated by Laurence J. Peter in 1969. It states that the selection of a candidate for a position is based on the candidate’s performance in their current role, rather than on abilities relevant to the intended role. Thus, employees only stop being promoted once they can no longer perform effectively – that is, “managers rise to the level of their incompetence.”[1]

There are two key points in this concept that are hindering the widespread adoption of data and analytics to power – or transform – an organization’s business models:

  • “Selection of a candidate for a position is based on the candidate’s performance in their current role, rather than on abilities relevant to the intended role.” Never before have we had an opportunity to create and leverage superior customer, product, operational and market insights to disrupt business models and disintermediate customer relationships…never. Consequently, current business leadership lacks the experience to know what to do to make this happen. Organizations likely need a new generation of management (which we are seeing in the “born digital” companies like Amazon, Google, Uber and Netflix) or a massive un-education/re-education of their current business leadership (like what we are seeing at GE…more to follow on the GE transformation, so keep reading!!) to realize that analytics is a business discipline to drive differentiation and monetization opportunities.
  • “Managers rise to the level of their incompetence” which means that those in power are very reluctant to embrace any new approaches with which they are not already familiar. And we have all met these folks who can’t embrace a new way of thinking because they are so personally or professionally invested in the old way of thinking. Consequently, new ideas and concepts die before they are even given a chance because these folks are threatened by any thinking that did not get them to where they are today.

How do you teach the existing generation of management to “think differently” about how to leverage data and analytics to power their business models? How does one get an organization to open their minds and stop focusing on just “paving the cow path,” but instead focus on data and analytics-driven innovation? Let’s try a little exercise, my guinea pigs!!

Decision Modeling: Predictions Exercise

The Challenge: Can we transform business thinking by changing the verb from “automate” to “predict?” Instead of focusing on automating what we already know, in its place let’s try focusing on “predicting” what is likely to happen and “prescribing” what actions we should take.

“Automate” assumes that the current process is the best process, when in fact there may be opportunities to leverage new sources of data and new data science techniques to change, re-engineer, or even eliminate the process. Can we drive a more innovative approach by focusing not on “automation,” but on what predictions (in support of key business decisions) we are trying to make and what actions we should prescribe?

Let’s demonstrate the process using the Chipotle key business initiative of “Increase Same Store Sales.” (Note: this decision modeling exercise expands upon Step 8 in the “Thinking Like A Data Scientist” methodology).

  • First, list the use cases. In Table 1, we will start with just one use case: “Increase Store Traffic Via Local Events Marketing.”
  • Second, list the decisions that one would need to address to support the use case. For example, we would need to make a decision about “Which local events to support and with how much funding?”
  • Next, for each decision, brainstorm the predictions that one would need to make to enable the decision. It’s useful to start the predictions statement with the word “Predict.” For example, in support of the “Which local events to support” decision, we would need to “Predict attendance at the local events”.
  • Then, list the potential analytic scores that could be used to support the predictions that we are trying to make. The potential scores were identified in Step 7 in the “Thinking Like A Data Scientist” methodology, but this decision modeling exercise gives us a chance to validate and expand upon those potential analytic scores.
  • Finally, brainstorm the potential variables and metrics that might be better predictors of performance. Step 6 in the “Thinking Like A Data Scientist” methodology identified many of those variables and metrics, but again this decision modeling exercise gives us a chance to validate and expand the potential variables and metrics (a small data-structure sketch of the worksheet follows this list).
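For teams that want to keep the brainstorm machine-readable, here is a minimal sketch of the worksheet captured as plain Python data. The field names are my own invention; the initiative, use case, and entries are carried over from the Chipotle example in Table 1 below.

    # Minimal sketch (assumed structure): the predictions-exercise worksheet
    # as nested Python data. Field names are illustrative, not methodology terms.
    worksheet = {
        "business_initiative": "Increase Same Store Sales",
        "use_cases": [
            {
                "name": "Increase Store Traffic Via Local Events Marketing",
                "decisions": [
                    {
                        "decision": "Which local events to support and with how much funding?",
                        "predictions": [
                            "Predict attendance at local events",
                            "Predict composition of attendance at local events",
                        ],
                        "scores": ["Economic Potential Score", "Local Vitality Score"],
                    },
                ],
            },
        ],
    }

    # Every prediction should trace back to a decision, and every decision to a use case.
    for uc in worksheet["use_cases"]:
        for d in uc["decisions"]:
            print(uc["name"], "->", d["decision"], "->", len(d["predictions"]), "predictions")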

Table 1 shows the results of this process for one use case (Increase Store Traffic Via Local Events Marketing) that supports the “Increase Same Store Sales” business initiative.

Chipotle Business Initiative: Increase Same Store Sales

Use Case: Increase Store Traffic Via Local Events Marketing

Decisions -> Predictions

Which local events to support and with how much funding?

  • Predict attendance at local events (sporting events, concerts)
  • Predict composition of attendance at local events (parents, kids, teenagers)

How much staff do we need to support the local events?

  • Predict how many workers are required by hour to staff the store
  • Predict what special skills are needed by hour to staff the store
  • Predict how much overtime might be required

How much additional inventory do we need?

  • Predict how much additional food inventory is required to support the local event
  • Predict how many additional utensils and bowls are required to support local events
  • Predict store waste/shrinkage
  • Predict when we need to replenish store inventory and with what

From what suppliers do we source additional food inventory?

  • Predict suppliers’ excess capacity by food item
  • Predict time-to-delivery for food inventory replenishment
  • Predict (prioritize) which suppliers to engage for additional food procurement
  • Predict quality scores of the new suppliers

Scores/Metrics

Economic Potential Score

  • Local demographics
  • Increase in home values
  • Local economic indicators
  • Local unemployment rate
  • Change in city budget
  • Average income levels
  • Average education levels
  • Number of local IPOs

Local Vitality Score

  • Miles from high school
  • Miles from mall
  • Average mall attendance
  • Miles from business park
  • Number of college students
  • Number of local sporting events
  • Number of local entertainment events

Local Sourcing Potential

  • Number of local suppliers
  • Miles from stores
  • Supplier production capacity
  • Supplier quality
  • Supplier reliability
  • Delivery feasibility

Table 1: Predictions Exercise Worksheet

In the workshop or classroom, we would repeat this process for each use case (e.g., improve promotional effectiveness, improve market basket revenues). This analytics-driven approach can bring more innovative and out-of-the-box thinking to the organization.
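To make the “scores” column more concrete, here is a small sketch of how a composite Local Vitality Score might be assembled from metrics like those in Table 1. The chosen metrics, bounds, and weights are invented for illustration only; in practice they would be derived by the data science team rather than guessed.

    # Hypothetical sketch: combine a few normalized metrics into a composite
    # Local Vitality Score. All values, bounds, and weights are invented.
    def normalize(value, lo, hi):
        """Scale a raw metric into the 0-1 range given assumed bounds."""
        return max(0.0, min(1.0, (value - lo) / (hi - lo)))

    store_metrics = {
        "miles_from_high_school": 1.2,
        "average_mall_attendance": 8500,
        "local_sporting_events_per_month": 6,
    }

    weights = {
        "miles_from_high_school": -0.40,  # closer is better, so distance is penalized
        "average_mall_attendance": 0.35,
        "local_sporting_events_per_month": 0.25,
    }

    bounds = {
        "miles_from_high_school": (0, 10),
        "average_mall_attendance": (0, 20000),
        "local_sporting_events_per_month": (0, 12),
    }

    local_vitality_score = sum(
        weights[m] * normalize(v, *bounds[m]) for m, v in store_metrics.items()
    )
    print(round(local_vitality_score, 3))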

Summary: The GE Story

A recent article titled “You Can’t Outsource Digital Transformation” discusses what GE is doing to prepare for–if not lead–digital business transformation disruption. To quote the article:

“It’s the threat of a digital competitor who skates past all the traditional barriers to entry: the largest taxi service in the world that owns no cars; or a lodging service without any real estate; or a razor blade purveyor without any manufacturing.”

The author, Aaron Darcy, describes what GE is doing to “think differently” – that is to unlearn and relearn – regarding digital business model disruption. This includes:

  • Transforming their operating model with the creation of GE Digital to help lead their digital business transformation.
  • Creating an open partner software ecosystem that enables collaboration with partners and third-party developers to deliver business and financial value for all participants (Customer, Partner and GE).
  • Transforming (un-educating and re-educating) management leadership with lean startup principles that emphasize iterative innovation, space to experiment, and a fail-fast mentality.
  • Exploring new or alternative business models by focusing on delivering outcomes and creating sustainable business value with big data capabilities.

Nothing threatens the existence of your business like the Peter Principle. An organization’s unwillingness to un-learn and re-learn will ultimately be its undoing. Because, as IDC believes, “By 2018, 33% of all industry leaders will be disrupted by digitally enabled competitors.” Ouch.

[1] https://en.wikipedia.org/wiki/Peter_principle


Related:

Decisions Exercise: Identifying Where and How To Start the Big Data Journey

The recent deluge of rain in Northern California has flooded streets, brought down trees, and plugged storm sewers. As I was trying to make my way around the neighborhood, I thought of a classroom exercise to help my MBA students identify the use cases upon which they could focus data and analytics. In this exercise, I’m going to ask my students to pretend that they have been hired by the city to “Optimize Street Maintenance” after these rainstorms. In particular, the students need to address the following questions:

  • Where and how do you start to address this initiative?
  • What data might you need to support this initiative?

These are classic questions that I hear all the time when I meet with clients about their big data journeys.  Let’s walk through how I’ll teach my students to address this challenge.

Step 1:  Identify and Brainstorm the Decisions

“Where and how to start?” is such an open-ended question. How does one even begin to think about that question? We recommend that organizations start by identifying the decisions that need to be made to support the targeted business initiative, which is “Optimize Street Maintenance” in this exercise.

I will break up the students into small groups (3 to 5 students) and ask them to brainstorm the decisions that need to be made with respect to the “Optimize Street Maintenance” initiative.  Those decisions could include:

  • What streets and intersections need maintenance?
  • What storm sewers are blocked?
  • What is blocking those storm sewers?
  • What sort of maintenance is needed?
  • What is the impact of street cleaning and debris removal on flooding?
  • What streets and intersections should we fix first?
  • How busy are the streets and intersections?
  • What worker skills are needed to fix the street?
  • What equipment and materials are needed to fix the street?
  • What time of the day / day of the week is ideal for doing that maintenance work?
  • How many workers are available?
  • Do I have access to temporary workers?
  • How much overtime can I afford?
  • How do I warn residents that a road is flooded?
  • What options do I give residents when the major arteries are flooded?

This brainstorming is much more effective when you have brought together the different business stakeholders who either impact or are impacted by the “Optimize Street Maintenance” initiative (see Figure 1).

Figure 1: Brainstorm Decisions Across Different Stakeholders

Some key process points about Step 1:

  • Allow individuals to brainstorm on their own at first. When it is entirely a group exercise, some folks go quiet and we potentially lose some good ideas.
  • Be sure to capture each decision on a separate Post-It note for later usage.
  • Place the decisions/Post-it Notes on a flip chart (or two).
  • You don’t need to group decisions by business function. I just did it here to demonstrate the process.

Finally, “all ideas are worthy of consideration.” This is the key to any brainstorming session: creating an environment where everyone feels comfortable contributing without someone passing judgment about his or her thoughts or ideas.

Step 2:  Group Decisions Into Use Cases

Next, we want to group the decisions into common subject areas or use cases (which is much easier to do if each decision is captured on a separate Post-It note).  I will bring all the students together around the decisions on Post-it Notes, and have them look for logical groupings.

Looking over the decisions captured above, we can start to see some natural “Optimize Street Maintenance” use cases emerging, such as:

Prioritize Streets and Intersections

  • What streets and intersections should we fix first?
  • What streets and intersections are busiest at what times of the day?
  • What are the alternative route options during maintenance?
  • What are the alternative transportation options during maintenance?
  • What business parks or malls will be disrupted by the maintenance work?
  • Which streets and intersections raise safety concerns for bikers and pedestrians?

Estimate Maintenance Effort

  • What streets and intersections need maintenance?
  • What storm sewers need maintenance?
  • How much maintenance is needed?
  • What type of maintenance is needed?
  • What worker maintenance skills are needed?
  • What types of equipment and materials are needed?

Optimize Maintenance Effort

  • What worker skills are needed to fix the street?
  • How many workers with those skills are available?
  • What equipment is available to fix the street?
  • What tools are needed to fix the street?
  • What materials (concrete, asphalt) are needed to fix the street?
  • How effective is street cleaning and debris removal in preventing flooding?

Minimize Traffic Disruptions

  • Which streets are bottlenecks for schools and at what times of the day?
  • Which streets are bottlenecks for shopping malls and at what times of the day?
  • Which streets are bottlenecks for business parks and at what times of the day?
  • What are the alternative route options?
  • What are the public transportation options?

Minimize Maintenance Costs

  • How many workers are available?
  • To what temporary workers do we have access?
  • How much overtime can I afford?
  • How much maintenance budget is available?

Improve Resident Communications

  • What streets need maintenance?
  • What streets and intersections are likely to need maintenance?
  • What are alternative travel routes?
  • What are alternative transportation options?

Increase Resident Satisfaction

  • How many residents did the flooding impact?
  • How long were those residents impacted?
  • What comments or feedback are most important and/or relevant?
  • What phone calls are most important and/or relevant?
  • What social media postings are important and/or relevant?

See Figure 2 for an example of how the end point of Step 2 might look.

Figure 2: Group Decisions

A key process point about Step 2:

  • Ideally you will end up with 7 to 12 use cases. If you have fewer than 7, then look for ways to break up some of the groupings. If you have more than 12, then look for ways to aggregate similar use cases. Not sure why, but 7 to 12 use cases always seems to work out to be the right level of granularity.
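For teams that want to carry the Step 2 groupings into later steps, a simple mapping of use case to decisions is all that is needed. The sketch below captures just a subset of the groupings above; it is an illustration of the bookkeeping, not part of the methodology.

    # Sketch: capture the Step 2 groupings as a mapping of use case -> decisions.
    # Only a subset of the brainstormed decisions is shown.
    use_cases = {
        "Prioritize Streets and Intersections": [
            "What streets and intersections should we fix first?",
            "What streets and intersections are busiest at what times of the day?",
        ],
        "Estimate Maintenance Effort": [
            "What streets and intersections need maintenance?",
            "What type of maintenance is needed?",
        ],
        "Minimize Maintenance Costs": [
            "How many workers are available?",
            "How much overtime can I afford?",
        ],
    }

    # Quick sanity check against the 7-to-12 guideline discussed above.
    print(f"{len(use_cases)} use cases captured")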

Step 3:  Prioritize Use Cases

Not all use cases are equal, and some use cases are dependent upon other use cases.  The prioritization matrix takes the different business stakeholders through a facilitated process to prioritize each use case vis-à-vis its business value and implementation feasibility (see Figure 3).

Figure 3: Prioritization Matrix
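As a rough sketch of how the output of that facilitated session might be captured, the snippet below records a business value and an implementation feasibility score for each use case and sorts toward the upper right of the matrix. The numeric scores are placeholders, not real workshop results.

    # Sketch: rank use cases by stakeholder-assigned scores (1-10 scale assumed).
    # The scores below are placeholders, not actual workshop output.
    use_case_scores = {
        "Prioritize Streets and Intersections": {"business_value": 9, "feasibility": 7},
        "Estimate Maintenance Effort": {"business_value": 7, "feasibility": 8},
        "Minimize Traffic Disruptions": {"business_value": 8, "feasibility": 5},
        "Improve Resident Communications": {"business_value": 6, "feasibility": 9},
    }

    # Sort so that high value + high feasibility (the upper right of the matrix) comes first.
    ranked = sorted(
        use_case_scores.items(),
        key=lambda kv: kv[1]["business_value"] + kv[1]["feasibility"],
        reverse=True,
    )

    for name, scores in ranked:
        print(f"{name}: value={scores['business_value']}, feasibility={scores['feasibility']}")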

For more details on the prioritization process, check out these blogs:

Summary

The news really surprised no one: “MD Anderson Benches IBM Watson In Setback For Artificial Intelligence In Medicine.” From the article:

“The partnership between IBM and one of the world’s top cancer research institutions is falling apart. The project is on hold, MD Anderson confirms, and has been since late last year. MD Anderson is actively requesting bids from other contractors who might replace IBM in future efforts.  And a scathing report from auditors at the University of Texas says the project cost MD Anderson more than $62 million and yet did not meet its goals.”

If big data were only about buying and installing technology, then it would be easy.  Unfortunately, companies are learning the hard way that the “big bang” approach for implementing big data is fraught with misguided expectations and outright failures.

Organizations are so eager to realize the business benefits of big data that they don’t take the time to do the little things first, like identifying and prioritizing those use cases that offer the optimal mix of business value and implementation feasibility. While I applaud all efforts to cure cancer (my mom died from cancer, so I have a vested interest like so many others), sometimes “curing cancer” might not be the best place to start. Identifying and prioritizing those use cases that move the organization towards that “cure cancer” aspiration is the best way to achieve that goal.