
How to leverage advanced analytics in the healthcare domain

Learn how Teradata Vantage's advanced analytics capabilities can analyze and predict useful diagnoses and insights in biomedicine and healthcare.

Bilal Khaliq

June 23, 2020 · 3 min read

Imagine wanting to analyze the notes a doctor has typed into the electronic health record (EHR) of a patient who has tested positive for COVID-19, including detailed descriptions of symptoms and complications. These notes contain nuances that could be vital to understanding the development of the disease, the manner of transmission and the most effective treatments with the least side effects. Doing this in a timely and efficient manner can be critically important for better prevention, preparedness and even a cure. Similarly, there are vast amounts of healthcare and biomedical data for various diseases available through physician notes, insurance claims, EHRs, medical journals, news feeds, social media, etc. All of this data lacks utility unless it is mined and brought into shape. Emerging text processing techniques and resources open up an ocean of opportunities for providing useful insight, analysis and deduction that mimic the behavior of experts in healthcare and related domains.

This post illustrates the use of some of the latest technologies and resources to mine concepts from the biomedical domain, and applies Teradata Vantage’s advanced analytics capabilities to analyze and predict useful diagnoses and prescriptions.

Electronic health records (EHRs) are digital patient records entered by a physician or clinician after each visit or examination. The entries are manual, free-form text containing a variety of medical information, including patient demographics, diseases, anatomy, medication, treatments, dosages, etc., all of which lack structure. These records are often grammatically incorrect and contain misspelled names and acronyms that are difficult to disambiguate across different contexts of usage.

In order to process such complex and irregular domain-specific text, we need powerful tools that can disambiguate, mine and structure the text, which in turn provides the ground for further advanced analytics:

One powerful instrument for cleaning and shaping text is Regex. Using Vantage’s Regex functions, the text is transformed by removing non-ASCII characters and mark-up tags, segmenting sentences and performing other text normalization tasks.
Next, we use an important entity recognition tool, MetaMap, which maps biomedical text to Unified Medical Language System (UMLS) concepts. It uses a knowledge-intensive approach coupled with natural language processing and computational linguistics to categorize concepts and acronyms into 137 possible types and groups of categories. This key resource for understanding medical information is made freely available to promote and improve healthcare services. Through API calls, we transformed our dataset into a rich corpus tagged with medical entities and their inter-relations. An example of an entity-tagged sentence is shown in Figure 1.

Figure 1: Bio-Medical Entity Recognition
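
To make the Regex clean-up step described above more concrete, here is a minimal sketch in Python. It is not the Vantage SQL used in the post: it relies on Python's standard `re` module as a stand-in, and the sample note, the function names and the sentence-boundary rule are all illustrative assumptions.

```python
import re

# Assumed patterns for illustration; real clinical text needs more careful rules.
TAG_RE = re.compile(r"<[^>]+>")               # mark-up tags such as <p> or <br>
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")   # anything outside 7-bit ASCII
SPACE_RE = re.compile(r"\s+")                 # runs of whitespace
# Naive boundary: ., ! or ? followed by whitespace and a capital letter or digit.
SENTENCE_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z0-9])")


def normalize(note: str) -> str:
    """Strip tags and non-ASCII characters, then collapse whitespace."""
    text = TAG_RE.sub(" ", note)
    text = NON_ASCII_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def split_sentences(text: str) -> list[str]:
    """Split normalized text into sentences using the naive boundary rule."""
    return [s for s in SENTENCE_RE.split(text) if s]


# Hypothetical note, written for this example only.
raw = ("<p>Pt reports fever 38.5\u00b0C and dry cough.</p> "
       "No chest pain.<br>Started paracetamol 500 mg.")

clean = normalize(raw)
sentences = split_sentences(clean)
print(sentences)
```

Note that dropping non-ASCII characters outright also drops meaningful symbols such as the degree sign; a production pipeline would more likely map them to ASCII equivalents first.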

A syntactic dependency parser gives grammatical structure to a sentence, which in turn supports deeper analysis of the opinion expressed. Concept negations, conjunctions and adjectival terms help to extract aspect information and opinionated terms from the sentence. This makes it possible to identify the sentiment associated with a specific term or concept at a finer level, rather than a jumbled sentiment at the coarse sentence level. To build the dependency parse, advanced NLP libraries in Python are a good choice, whereas for sentiment analysis, built-in models and trainers are available within Vantage.
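
The idea of attaching sentiment to a concept through the parse, rather than to the whole sentence, can be sketched as follows. This is only an illustration under stated assumptions: the parse is written by hand (with labels in the style of common English dependency parsers) instead of coming from an NLP library, the tiny opinion lexicon is invented for the example, and the scoring does not use Vantage's built-in sentiment models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    index: int
    text: str
    head: int   # index of the governing token (the root points to itself)
    dep: str    # dependency label


# Hand-written parse of: "The cough is not severe but the headache is persistent"
PARSE = [
    Token(0, "The", 1, "det"),
    Token(1, "cough", 2, "nsubj"),
    Token(2, "is", 2, "ROOT"),
    Token(3, "not", 2, "neg"),
    Token(4, "severe", 2, "acomp"),
    Token(5, "but", 2, "cc"),
    Token(6, "the", 7, "det"),
    Token(7, "headache", 8, "nsubj"),
    Token(8, "is", 2, "conj"),
    Token(9, "persistent", 8, "acomp"),
]

# Hypothetical opinion lexicon: -1 is unfavourable, +1 is favourable.
OPINION_LEXICON = {"severe": -1, "persistent": -1, "mild": 1}


def children(parse, head_index, dep):
    """Tokens governed by head_index through the given dependency label."""
    return [t for t in parse
            if t.head == head_index and t.dep == dep and t.index != head_index]


def aspect_sentiment(parse):
    """Map each subject noun to the polarity of its adjectival complement."""
    scores = {}
    for tok in parse:
        polarity = OPINION_LEXICON.get(tok.text.lower())
        if tok.dep != "acomp" or polarity is None:
            continue
        subjects = children(parse, tok.head, "nsubj")
        if not subjects:
            continue
        # A negation on the verb or on the adjective flips the polarity.
        negated = bool(children(parse, tok.head, "neg")
                       or children(parse, tok.index, "neg"))
        scores[subjects[0].text.lower()] = -polarity if negated else polarity
    return scores


scores = aspect_sentiment(PARSE)
print(scores)  # {'cough': 1, 'headache': -1}
```

A sentence-level score would mix the two clauses together; following the dependency links keeps the negated "not severe" with the cough and "persistent" with the headache.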


Figure 2: Dependency Parse of Opinionated Sentence

Figure 3: Disorder type mentions in each visit report along with sentiment

For each report, using the features for various categories such as medication, diseases, body parts, etc., along with the sentiment associated with each aspect where available, we are able to build advanced models of patients’ medical conditions. Using native Vantage capabilities, example analytics are built to obtain useful insights and deductions:

Using the features for disorders and anatomy, we build a classifier to predict possible diagnoses for a patient. Such analytics can assist a physician’s decision-making in prescribing medication and treatments, taking into account the patient’s past and present conditions along with the historical treatment record.
Clustering physician reports based on various types of features, particularly disorder and anatomy, can reveal related examinations and patients with related symptoms and diseases. This is particularly useful when profiling patients based on their illness patterns.
Using NPath, it’s possible to trace prescribed medication and visualize how physicians have treated cases belonging to a particular medical condition.
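
As a rough illustration of how reports can be grouped by their disorder and anatomy features, the sketch below links any two reports whose concept sets overlap strongly and returns the connected groups. It is a simple threshold-based stand-in, not the clustering method used in the post: the report IDs, concept sets, Jaccard similarity measure and the 0.5 threshold are all assumptions chosen for the example.

```python
from itertools import combinations

# Hypothetical reports, each reduced to a set of disorder/anatomy concepts.
REPORTS = {
    "r1": {"fever", "cough", "lung"},
    "r2": {"fever", "cough", "throat"},
    "r3": {"fracture", "wrist"},
    "r4": {"fracture", "wrist", "swelling"},
}


def jaccard(a, b):
    """Share of concepts two reports have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_reports(reports, threshold=0.5):
    """Group reports linked by similarity >= threshold (connected components)."""
    parent = {rid: rid for rid in reports}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(reports, 2):
        if jaccard(reports[a], reports[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for rid in reports:
        groups.setdefault(find(rid), set()).add(rid)
    return sorted(sorted(g) for g in groups.values())


groups = group_reports(REPORTS)
print(groups)  # [['r1', 'r2'], ['r3', 'r4']]
```

Raising the threshold makes the groups stricter; at 0.6, for instance, r1 and r2 (similarity 0.5) would no longer be linked.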


Figure 4: Clustering visualized using PCA and t-SNE graphs

Figure 5: NPath tracing medication prescription
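
The kind of medication trace described above can be approximated in a few lines of plain Python. This sketch does not use or reproduce the Vantage path-analysis function or its syntax; it only shows the underlying idea of ordering each patient's prescriptions in time and counting the resulting sequences. The event tuples and the placeholder names `drug_a`, `drug_b` and `drug_c` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical prescription events: (patient_id, ISO visit date, medication).
EVENTS = [
    ("p1", "2020-03-01", "drug_a"),
    ("p2", "2020-03-02", "drug_a"),
    ("p3", "2020-03-03", "drug_a"),
    ("p1", "2020-03-04", "drug_b"),
    ("p2", "2020-03-05", "drug_b"),
    ("p3", "2020-03-06", "drug_b"),
    ("p1", "2020-03-09", "drug_c"),
]


def medication_paths(events):
    """Return each patient's medications in chronological order."""
    by_patient = defaultdict(list)
    # ISO dates sort correctly as strings, so sort by (patient, date).
    for patient, _date, medication in sorted(events, key=lambda e: (e[0], e[1])):
        by_patient[patient].append(medication)
    return {patient: tuple(meds) for patient, meds in by_patient.items()}


def path_counts(events):
    """Count how many patients followed each medication sequence."""
    return Counter(medication_paths(events).values())


paths = medication_paths(EVENTS)
counts = path_counts(EVENTS)
print(counts.most_common(1))  # [(('drug_a', 'drug_b'), 2)]
```

Restricting `EVENTS` to patients with a given diagnosis before counting would give the per-condition view of treatment paths that the post describes.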

Given that healthcare, pharmaceutical and cosmetic companies are looking towards AI-enabled technologies to help provide useful insight into medical diagnosis, the approach presented here showcases Teradata’s ability to combine Vantage’s advanced analytics offering -- seamlessly integrated with open-source tools and techniques in text processing -- to decipher complex healthcare-related issues pertinent to industry requirements.
