Mohit Kukreja

DATA SCIENTIST | DATA ENGINEER ANALYST

ABOUT ME

  • LinkedIn
  • Instagram
  • Snapchat

Follow me on social networks

Hello, I'm Mohit Kukreja, a graduate student pursuing Computer and Information Science at the University of Florida. I am passionate about data science and data visualization, and I thrive on deriving insights and hidden patterns from large datasets.

 

With over 3 years of work experience as a Data Engineer Analyst, I have honed my skills in SQL and data analysis, and I have built a strong set of technical skills in data science and machine learning. I have worked on end-to-end data science projects, from collecting data from third-party sources to presenting insights and conclusions clearly and concisely. I work with programming languages such as Python and R, data analysis tools such as Pandas, NumPy, and Matplotlib, and machine learning libraries such as Scikit-learn and TensorFlow.

In addition to my technical skills, I have excellent communication and collaboration skills that allow me to work effectively with colleagues and stakeholders. I am always looking for new challenges and opportunities to improve my skills and stay up-to-date with the latest trends and technologies in the field of data science.

 

Thank you for taking the time to learn a little about me. Please feel free to explore my portfolio and reach out if you have any questions or if you would like to collaborate on a data science project.

SKILLS

[Figure: bar chart of skills]

PROJECTS

House price prediction using Geospatial Data
University of Florida, 2023

House Price Prediction using Satellite Imagery and Local Demographics:

Developed a house price prediction model that integrates intrinsic property features, high-resolution satellite imagery, and Points of Interest (POI) data.

  • Used the King County Housing Dataset, which contains details of 21,613 properties and was prepared by the Center for Spatial Data Science at the University of Chicago.
     

  • Combined intrinsic features with Mapbox satellite imagery (at zoom levels 16, 18, and 19), using pre-trained Inception V3 and VGG16 networks for transfer learning and an XGBoost regressor to predict house prices.
     

  • Incorporated POI data from USGS and used geopandas for geospatial analysis and geodesic distance calculations.
     

  • Achieved an R² score of 0.846 on the prepared training data and optimized the model across diverse feature sets.
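
As a rough illustration of the kind of distance feature described above, the sketch below computes the distance from a house to its nearest POI. It is not the project's code: the project used geopandas and geodesic distances, while this sketch swaps in the simpler haversine (great-circle) formula on a spherical Earth. The function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poi_km(house, pois):
    """Distance from one house to its closest POI; all points are (lat, lon)."""
    return min(haversine_km(house[0], house[1], p[0], p[1]) for p in pois)


# Hypothetical coordinates, for illustration only.
house = (47.5112, -122.257)
pois = [(47.6062, -122.3321), (47.4502, -122.3088)]
print(round(nearest_poi_km(house, pois), 2), "km to the nearest POI")
```

A value like this could be appended to each property's feature row before training the regressor. For distances within a single county, the spherical approximation typically differs from an ellipsoidal geodesic by well under one percent.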

Diagnosis and Disease Identification
University of Florida, 2023

Developed machine learning models using the Synthea COVID-19 dataset to predict patients’ mortality rates, demonstrating strong analytical and data manipulation skills in preprocessing large healthcare datasets

  • Compared several machine learning algorithms, including Random Forest, Logistic Regression, Gradient Boosting, and Support Vector Classifier.
     

  • Achieved an accuracy of 98.2% and an F1 score of 0.97 with the Random Forest Classifier, outperforming the other models.
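
For readers unfamiliar with the two metrics reported above, here is a minimal sketch of how accuracy and F1 are derived from binary predictions. The labels are made up for illustration and are unrelated to the Synthea data; the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
print(classification_metrics(y_true, y_pred))
```

Reporting F1 alongside accuracy matters when the classes are imbalanced, because accuracy alone can look high even when the minority class is predicted poorly.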

Early Detection of Glaucoma
University of Florida, 2023

Early Detection of Glaucoma from Fundus Images using Deep Learning:

Developed an automated deep learning system for early glaucoma detection by accurately segmenting the optic disc and cup in retinal fundus images.

  • Integrated an image classifier with the segmentation model for automated glaucoma classification based on the optic cup-to-disc ratio (CDR), achieving a mean Dice score of 0.869.
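
The two quantities mentioned above can be sketched on small binary masks as follows. This is an illustrative sketch, not the project's implementation: it assumes masks are nested lists of 0/1 values and that the CDR is the vertical one (cup height divided by disc height). The function names are hypothetical.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    intersection = sum(p & t
                       for prow, trow in zip(pred, truth)
                       for p, t in zip(prow, trow))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Two empty masks agree perfectly, so treat that case as 1.0.
    return 2 * intersection / total if total else 1.0


def vertical_extent(mask):
    """Number of rows spanned by the foreground pixels of a mask."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def cup_to_disc_ratio(cup, disc):
    """Vertical CDR (an assumed definition): cup height / disc height."""
    disc_height = vertical_extent(disc)
    return vertical_extent(cup) / disc_height if disc_height else 0.0


# Toy masks for illustration only.
disc = [[0, 1, 1, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 1, 1, 0]]
cup = [[0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0],
       [0, 0, 0, 0]]
print(cup_to_disc_ratio(cup, disc))
print(dice_score([[0, 1, 1], [0, 1, 0]], [[0, 1, 0], [0, 1, 0]]))
```

A larger CDR generally indicates a larger cup relative to the disc, which is why the ratio is a common input to glaucoma screening.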

Library Management System
University of Florida, 2023

Library Management System using Red-Black Trees and Min-Heaps:

An efficient system for managing patrons and books in a library. Book operations and reservation handling are kept efficient by using a Red-Black Tree and a Min-Heap.

  • For efficient book management, the system uses a Red-Black Tree, and for processing book reservations, it uses a binary Min-Heap.
     

  • The project supports all the core operations: inserting, borrowing, returning, and deleting books.
     

  • The RedBlackTree class provides a self-balancing binary search tree with added features, including color-flip optimizations during insertion and deletion, a count of color flips, and in-order traversal.
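
To show how a binary min-heap can order book reservations, here is a minimal sketch built on Python's heapq module. It is not the project's code. The ordering rule (lower priority number first, ties broken by arrival order) and the names ReservationQueue, reserve, and next_patron are assumptions made for illustration.

```python
import heapq
import itertools


class ReservationQueue:
    """Waitlist for one book, backed by a binary min-heap."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal priorities stay first-come, first-served.
        self._arrival = itertools.count()

    def reserve(self, patron_id, priority):
        """Add a patron; a lower priority number is served earlier (assumed rule)."""
        heapq.heappush(self._heap, (priority, next(self._arrival), patron_id))

    def next_patron(self):
        """Remove and return the next patron ID, or None if nobody is waiting."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


# Hypothetical patron IDs and priorities.
waitlist = ReservationQueue()
waitlist.reserve(patron_id=101, priority=2)
waitlist.reserve(patron_id=102, priority=1)
waitlist.reserve(patron_id=103, priority=2)
served = [waitlist.next_patron() for _ in range(3)]
print(served)
```

Both reserve and next_patron run in O(log n) time, which complements the O(log n) lookups, insertions, and deletions that a Red-Black Tree gives for the book catalog.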

Comparative Analysis of GAN and VAE
University of Florida, 2023

Comparative Analysis of GAN and VAE for Image Generation:

Developed and compared GAN (DCGAN) and VAE (CVAE) deep learning models on the MNIST dataset, optimizing image generation with convolutional neural networks and Adam optimization.

  • Developed and compared GAN (DCGAN) and VAE (CVAE) deep learning models on the MNIST dataset, optimizing image generation with convolutional neural networks and Adam optimization, using a decay rate of 0.00007 and a learning rate of 0.003.
     

  • Demonstrated the effectiveness of convolutional generative models and the Adam optimization method in generating high-quality images, with DCGAN outperforming the other models at 96.5% accuracy despite a longer training time.

Visual Summaries of Analyses
University of Florida, 2022

Explored the impression logs and event logs of millions of users in the Microsoft News Recommendation dataset (MIND dataset), which covers 16 different news categories, and performed data engineering using the pandas library and D3.js.

  • Explored the impression logs and event logs of millions of users in the MIND dataset.
     

  • Performed a three-phase exploratory analysis on the normalized data to understand how users behave toward the news articles they come across in their day-to-day lives and to uncover hidden patterns.
     

  • Built 5 interconnected Tableau dashboards containing visualization summaries and interactive reports, embedded them in a webpage, and later used these insights to help journalists and news officials design better news for users.

QUICK ID

Phone

+1 (352) 709-9105

Email

Website

Address

Gainesville, Florida, United States

CONTACT ME

