Elizabete Kalnozola Manager • about 3 years ago
Johnson & Johnson
Prize - Internship for Fresh Graduate, Working Adult & Undergraduate in Singapore
1. Predicting Length of Stay in ICU using MIMIC-III Dataset
Problem Description: Intensive Care Units (ICUs) provide critical care to patients with life-threatening illnesses and injuries. Accurately predicting the length of stay (LOS) of patients in ICU can help healthcare providers allocate resources efficiently and improve patient outcomes. In this challenge, participants will use the publicly available MIMIC-III dataset to develop models that predict the LOS of ICU patients.
Dataset: MIMIC-III (Medical Information Mart for Intensive Care III) is a large, freely available database of deidentified health-related data from over forty thousand patients who stayed in critical care units (https://mimic.mit.edu/docs/iii/).
Here are some recommended steps for approaching the problem:
Get access to the dataset
Understand the dataset including the tables, meaning of the attributes, etc.
Conduct EDA (exploratory data analysis) to visualize the key indicators impacting ICU length of stay
Build a predictive model to predict the length of stay in ICU
Design a methodology to verify your predictions and present the results.
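One possible way to approach the verification step is to hold out part of the data and compare a candidate model against a trivial baseline using mean absolute error (MAE, in days). The sketch below is only an illustration under stated assumptions: the rows are synthetic (not MIMIC-III data), the field names `unit`, `age`, and `los` are hypothetical, and the "model" is just a per-care-unit median. In MIMIC-III itself, the ICUSTAYS table records length of stay in fractional days, which would take the place of the synthetic `los` field.

```python
import random
import statistics


def make_synthetic_stays(n, seed=0):
    """Generate toy ICU stays. These are NOT real MIMIC-III records."""
    rng = random.Random(seed)
    # Made-up typical LOS (days) per care unit, for illustration only.
    typical_los = {"CCU": 2.0, "MICU": 4.0, "SICU": 7.0}
    rows = []
    for _ in range(n):
        unit = rng.choice(sorted(typical_los))
        age = rng.randint(18, 90)
        los = max(0.1, rng.gauss(typical_los[unit] + 0.02 * (age - 50), 1.0))
        rows.append({"unit": unit, "age": age, "los": los})
    return rows


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def fit_global_median(train):
    """Baseline: always predict the median LOS of the training set."""
    overall = statistics.median([r["los"] for r in train])
    return lambda row: overall


def fit_unit_median(train):
    """Candidate: predict the training median LOS of the row's care unit."""
    by_unit = {}
    for r in train:
        by_unit.setdefault(r["unit"], []).append(r["los"])
    medians = {u: statistics.median(v) for u, v in by_unit.items()}
    overall = statistics.median([r["los"] for r in train])
    # Fall back to the overall median for units unseen during training.
    return lambda row: medians.get(row["unit"], overall)


def mean_absolute_error(model, rows):
    """Average absolute gap between predicted and actual LOS, in days."""
    return statistics.fmean([abs(model(r) - r["los"]) for r in rows])


stays = make_synthetic_stays(1000)
train, test = train_test_split(stays)
mae_global = mean_absolute_error(fit_global_median(train), test)
mae_unit = mean_absolute_error(fit_unit_median(train), test)
print(f"baseline MAE: {mae_global:.2f} days, per-unit MAE: {mae_unit:.2f} days")
```

Reporting the baseline next to the model makes it easier to show how much of the accuracy comes from the model rather than from the data distribution. Any real submission would replace the per-unit median with the participant's own predictive model.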
Hint: Azure Machine Learning
2. Mall Analytics for Commercial Success
Problem Description: It is crucial to understand visitor traffic, customer demographic attributes and market trends to increase commercial success. Please develop a model to identify popular malls that are likely to receive high foot traffic from the public. Inferences about the age groups and customer profiles (such as affluence) of the people likely to visit these malls are highly valuable as well.
Here are some attributes you may consider, although other methods demonstrating innovation & creativity are welcome as well:
Distance to malls, MRT stations, and residential areas
Footfall traffic at nearby MRTs & bus stations
Type of amenities in surrounding area, and distance to these amenities
Number of parking lots at the mall
Google API data on “popular timings”, showing when a location has more visitors on an average day (e.g. https://github.com/m-wrzr/populartimes)
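As an illustration of how the distance-based attributes above might be turned into model features, here is a minimal sketch that computes great-circle (haversine) distances from a mall to nearby stations. The coordinates are hypothetical placeholders rather than real mall or MRT locations, and the function names and the 1 km radius are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_features(mall, stations, radius_km=1.0):
    """Distance to the nearest station and count of stations within radius_km.

    Assumes `stations` is a non-empty list of (lat, lon) tuples.
    """
    distances = [haversine_km(mall[0], mall[1], lat, lon) for lat, lon in stations]
    return {
        "nearest_station_km": min(distances),
        "stations_within_radius": sum(d <= radius_km for d in distances),
    }


# Hypothetical coordinates, for illustration only.
mall = (1.3000, 103.8000)
stations = [(1.3020, 103.8010), (1.3100, 103.8200), (1.2995, 103.8050)]
features = proximity_features(mall, stations)
print(features)
```

The same pattern could be reused for residential areas or amenities by swapping in a different list of points.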
Other possible data sources you may use include (but are not limited to):
Data published by LTA on Public transport: https://datamall.lta.gov.sg/content/datamall/en/dynamic-data.html
Data published by the government on carpark availability: https://data.gov.sg/dataset/carpark-availability
Hint: Azure Machine Learning

1 comment
Elizabete Kalnozola Manager • about 3 years ago
Problem Statement Session Recording - https://itraingroup-my.sharepoint.com/:v:/g/personal/syarifah_itraingroup_onmicrosoft_com/EZ7KL9d-NLNJub3vKzR19OgBYqfDm1OE1tCtVvqkuxT4Vw?e=WdKAbz