Name: Harshi Gupta

Passionate about: Data Science

Address: Delhi, India

Skills

Languages
Python, MySQL, C++
ML & Data Science
Scikit-learn, XGBoost, LightGBM, CatBoost, TabNet, Feature Engineering, ML Pipelines, Model Evaluation
Data Skills
EDA, Data Cleaning, Data Visualization, Statistical Analysis
Tools
Git, GitHub, Jupyter Notebooks, VS Code, Kaggle

About

About Me

I'm Harshi Gupta, a second-year B.Tech student in AI & Data Science at VIPS-TC, Delhi (CGPA: 9.16). I build applied ML systems: tabular data, predictive modeling, and end-to-end pipelines from training to deployed API. My work includes a mental health prediction system using TabNet (91.67% accuracy, 0.986 AUC), accepted at ICDAM 2026 (Springer LNNS, Scopus Indexed), and a book chapter on algorithmic fairness in AI. Competition highlights: Top 10 at IIT Roorkee's E-Summit, Round 2 at EY Techathon 6.0 (1.85L+ registrations) and YUVAi 2026 (2,400+ global teams). Secretary of CODEX. Open source contributor. Currently seeking ML research internships.

  • Domain: Data Science, Machine Learning
  • Education: Bachelor of Engineering
  • Languages: English, Hindi
  • Other Skills: Git, GitHub, VS Code, Jupyter Notebooks, Kaggle
  • Interests: Traveling, Travel Photography, Writing


LinkedIn

Resume


AI & Data Science undergraduate (CGPA: 9.16) specializing in tabular ML, predictive modeling, and production deployment. Research accepted at ICDAM 2026 (Springer LNNS, Scopus Indexed): 91.67% accuracy and 0.986 AUC via TabNet. Shortlisted at EY Techathon 6.0 (1.85L+ registrations) and YUVAi 2026 (2,400+ global teams). Seeking ML research internships.



Education


2024-2028

Bachelor of Engineering

Vivekananda Institute of Professional Studies - Technical Campus

CGPA: 9.16

2010-2024

Higher Secondary School

Richmondd Global School

Grade (Class XII): 81.2%

Grade (Class X): 89.7%

Projects


Below are sample data science projects built with Pandas, NumPy, Matplotlib, Seaborn, and Scikit-learn.

Multi-Model Mental Health Prediction System

Built an ML model to classify mental health risk in students using academic and demographic features: CGPA, depression indicators, treatment history, and year of study. Benchmarked XGBoost, LightGBM, CatBoost, and TabNet; TabNet performed best, with 91.67% accuracy and 0.986 AUC. Research accepted at ICDAM 2026, Springer LNNS (Scopus-indexed).

AutoWorthAI

AutoWorth AI is an end-to-end machine learning application that predicts used car prices from 426,000+ real-world Craigslist listings. Built a full scikit-learn preprocessing pipeline, trained Random Forest and Linear Regression models achieving an R² of 0.87 and an MAE of $2.7K, and deployed a production FastAPI backend with a live public REST API, accessible through a Streamlit web frontend.
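
As a rough illustration of what a scikit-learn preprocessing pipeline feeding a Random Forest regressor can look like, here is a minimal sketch. It is not the project's actual code: the column names (`year`, `odometer`, `manufacturer`), the synthetic data, and the hyperparameters are all assumptions made for the example, and the scores it prints have no relation to the figures quoted above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for car listings; columns and values are hypothetical.
rng = np.random.default_rng(0)
n = 400
cars = pd.DataFrame(
    {
        "year": rng.integers(2000, 2022, size=n),
        "odometer": rng.uniform(5_000, 200_000, size=n),
        "manufacturer": rng.choice(["ford", "toyota", "honda"], size=n),
    }
)
cars["price"] = (
    3_000
    + 1_000 * (cars["year"] - 1995)
    - 0.03 * cars["odometer"]
    + rng.normal(0, 1_500, size=n)
)
# Blank out a few odometer readings so the imputer has something to do.
cars.loc[::25, "odometer"] = np.nan

numeric_features = ["year", "odometer"]
categorical_features = ["manufacturer"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline(
    [("impute", SimpleImputer(strategy="median")), ("scale", StandardScaler())]
)
# Categorical columns: fill gaps with the mode, then one-hot encode.
categorical_steps = Pipeline(
    [
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocess = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_features),
        ("cat", categorical_steps, categorical_features),
    ]
)

# Preprocessing and model live in one object, so the same transformations
# are applied at training time and at prediction time.
model = Pipeline(
    [
        ("preprocess", preprocess),
        ("regressor", RandomForestRegressor(n_estimators=100, random_state=0)),
    ]
)

X = cars[numeric_features + categorical_features]
y = cars["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model.fit(X_train, y_train)
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MAE: {mae:.0f}  R2: {r2:.2f}")
```

Bundling preprocessing and the regressor into a single `Pipeline` means the fitted object can be serialized and loaded by an API backend as one unit.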

Netflix Content Ecosystem Analysis

Processed and explored a Netflix dataset of 8,800+ movies and TV shows across 12+ attributes, reducing missing values by 20-25% through data cleaning, type conversion, and feature engineering with Pandas. Derived content insights using Matplotlib and Seaborn, showing that 70%+ of titles are movies, the US contributes 30% of total content, and post-2015 releases account for over 55% of the catalog.

Titanic Survival Prediction

Processed the Titanic dataset (891 passengers) using Pandas, handling missing values in the Age and Embarked columns and preparing categorical features, improving overall data usability by 15%. Visualized survival patterns with Matplotlib/Seaborn, showing 74% survival for females vs 19% for males and higher survival rates for 1st-class passengers than for lower classes.

E-Commerce Sales Analysis

Analyzed 200+ order records across 50+ customers and 30+ products, performing data cleaning, feature engineering, and revenue analysis using Python (Pandas, Matplotlib) to compute Customer Lifetime Value (CLV), top-10 customers, and monthly sales trends. Identified the highest revenue-generating product category and analyzed repeat vs one-time customer behavior, supported by category-wise, time-based, and customer-distribution visualizations.
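
The kind of aggregation described above can be sketched in a few lines of Pandas. This is an illustrative example, not the project's code: the order records, the column names (`customer_id`, `order_date`, `quantity`, `unit_price`), and the helper functions are made up, and CLV is simplified here to each customer's total historical revenue, which may differ from the definition used in the project.

```python
import pandas as pd

# Toy order records; every value and column name here is hypothetical.
orders = pd.DataFrame(
    {
        "customer_id": ["C1", "C2", "C1", "C3", "C2", "C1"],
        "order_date": pd.to_datetime(
            [
                "2024-01-05", "2024-01-20", "2024-02-11",
                "2024-02-15", "2024-03-02", "2024-03-09",
            ]
        ),
        "quantity": [2, 1, 1, 3, 1, 1],
        "unit_price": [250.0, 1200.0, 400.0, 150.0, 300.0, 900.0],
    }
)
orders["revenue"] = orders["quantity"] * orders["unit_price"]


def customer_summary(df):
    """Per-customer total revenue (simplified CLV), order count, repeat flag."""
    summary = df.groupby("customer_id").agg(
        clv=("revenue", "sum"),
        n_orders=("revenue", "size"),
    )
    summary["is_repeat"] = summary["n_orders"] > 1
    return summary.sort_values("clv", ascending=False)


def monthly_sales(df):
    """Total revenue per calendar month."""
    return df.groupby(df["order_date"].dt.to_period("M"))["revenue"].sum()


summary = customer_summary(orders)
top_customers = summary.head(10)  # top-10 customers by simplified CLV
monthly = monthly_sales(orders)

print(top_customers)
print(monthly)
```

Sorting the summary by `clv` gives the top customers directly, and the `is_repeat` flag separates repeat buyers from one-time buyers for further comparison.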

More projects on GitHub

I love to uncover hidden data stories.


GitHub

Contact

Contact Me

Below are the details to reach out to me!

Address

Delhi, India

Contact Number

+91 8929604705

Email Address

gharshi089@gmail.com

Download Resume
