

Mohit Kukreja
DATA SCIENTIST | DATA ENGINEER ANALYST


SKILLS

PROJECTS
House price prediction using Geospatial Data
University of Florida, 2023
House Price Prediction using Satellite Imagery and Local Demographics:
Developed a house price prediction model for real estate pricing by integrating intrinsic features, high-resolution satellite imagery, and Points of Interest (POI) data.
- Utilized the King County Housing Dataset, containing details of 21,613 properties, prepared by the Center for Spatial Data Science at the University of Chicago.
- Integrated intrinsic features with Mapbox satellite imagery (at zoom levels 16, 18, and 19) and used pre-trained Inception V3 and VGG16 networks for transfer learning, forecasting house prices with an XGBoost regressor.
- Incorporated POI data from USGS and used geopandas for geospatial analysis and geodesic distance calculations.
- Achieved an R² score of 0.846 on the prepared training data and optimized the model with diverse feature sets.
Diagnosis and Disease Identification
University of Florida, 2023
Developed machine learning models using the Synthea COVID-19 dataset to predict patients’ mortality rates, demonstrating strong analytical and data manipulation skills in preprocessing large healthcare datasets.
- Performed a comparative analysis of several machine learning algorithms, including Random Forest, Logistic Regression, Gradient Boosting, and Support Vector Classifier.
- Achieved an accuracy of 98.2% and an F1 score of 0.97 with the Random Forest Classifier, outperforming the other models.
Early Detection of Glaucoma
University of Florida, 2023
Early Detection of Glaucoma from Fundus Images using Deep Learning:
Developed an automated deep learning system for early glaucoma detection by accurately segmenting the optic disc and cup in retinal fundus images.
- Integrated an image classifier with the segmentation model for automated glaucoma classification based on the optic cup-to-disc ratio (CDR), achieving a mean Dice score of 0.869.
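For illustration only, the two metrics named above (Dice score and cup-to-disc ratio) can be computed from binary segmentation masks roughly as follows. This is a minimal NumPy sketch, not the project's code: the function names `dice_score`, `vertical_extent`, and `vertical_cdr`, the choice of the vertical diameter for the CDR, and the toy masks are all assumptions.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / total


def vertical_extent(mask):
    """Number of image rows spanned by a binary mask (0 if the mask is empty)."""
    rows = np.flatnonzero(np.asarray(mask, dtype=bool).any(axis=1))
    return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)


def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio: cup height divided by disc height."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


# Toy masks (hypothetical data): an 8-row disc containing a 4-row cup.
disc = np.zeros((10, 10), dtype=bool)
disc[1:9, 1:9] = True
cup = np.zeros((10, 10), dtype=bool)
cup[3:7, 3:7] = True

cdr = vertical_cdr(cup, disc)
overlap = dice_score(cup, disc)
```

In a pipeline of this kind, the Dice score would compare a predicted mask against a ground-truth mask, while the CDR would be computed from the predicted cup and disc masks and passed to the classifier.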
Library Management System
University of Florida, 2023
Library Management System using Red-Black Trees and Min-Heaps:
Built a system for efficient patron and book management in a library environment, using a Red-Black Tree and a Min-Heap to keep reservation handling and other operations efficient.
- Uses a Red-Black Tree for book management and a Binary Min-Heap for processing book reservations.
- Supports all core operations: inserting, borrowing, returning, and deleting books.
- The RedBlackTree class provides a self-balancing binary search tree with added features, including color-flip optimizations during insertion and deletion, color-flip counting, and in-order traversal.
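A reservation queue of the kind described above could be sketched with Python's standard `heapq` module, which maintains a binary min-heap on a list. This is an illustrative sketch under assumptions, not the project's implementation: the class name `ReservationQueue`, the integer priority, and the insertion counter used to break ties are all hypothetical, and `heapq` stands in for whatever heap the project used.

```python
import heapq
import itertools


class ReservationQueue:
    """Min-heap of reservations: lowest priority number is served first,
    ties are broken by order of arrival (both rules are assumptions)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # monotonically increasing tie-breaker

    def reserve(self, patron_id, priority):
        # Tuples compare element by element, so (priority, arrival) orders the heap.
        heapq.heappush(self._heap, (priority, next(self._counter), patron_id))

    def next_patron(self):
        """Pop and return the patron who should receive the book next, or None."""
        if not self._heap:
            return None
        _, _, patron_id = heapq.heappop(self._heap)
        return patron_id

    def __len__(self):
        return len(self._heap)


queue = ReservationQueue()
queue.reserve("patron-A", priority=2)
queue.reserve("patron-B", priority=1)
queue.reserve("patron-C", priority=1)
order = [queue.next_patron() for _ in range(3)]
```

Both `reserve` and `next_patron` run in O(log n) time, which is the usual reason for choosing a heap over a sorted list for this kind of queue.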
Comparative Analysis of GAN and VAE
University of Florida, 2023
Comparative Analysis of GAN and VAE for Image Generation:
Developed and compared GAN (DCGAN) and VAE (CVAE) deep learning models on the MNIST dataset, optimizing image generation using convolutional neural networks and the Adam optimizer, with a decay rate of 0.00007 and a learning rate of 0.003.
- Demonstrated the effectiveness of convolutional generative models and the Adam optimization method in generating superior-quality images, with DCGAN outperforming the other models at 96.5% accuracy despite a longer training time.
Visual Summaries of Analyses
University of Florida, 2022
Explored impression logs and event logs of millions of users in the Microsoft News Recommendation dataset (MIND dataset), which has 16 different news categories, and performed data engineering using the pandas library and D3.js.
- Performed a three-phase exploratory analysis on the normalized data to uncover hidden patterns in how users behave toward the news articles they encounter in their day-to-day lives.
- Generated 5 interconnected Tableau dashboards containing visualization summaries and interactive reports, embedded them in a webpage, and used the resulting insights to help journalists and news officials design better news for users.
QUICK ID
Phone
+1 (352) 709-9105
Website
Address
Gainesville, Florida, United States
