- Andrew Carnegie

"A delightful mixture of logic and creativity" is how the people closest to me would describe me. I hold a Master’s in Information Systems from Carnegie Mellon University, and my research interests span Statistics, Machine Learning, and Operations Research. Here is some more information about me:
[1] M. Sawai, M. Jih-Vieira, S. Chen, and S. Zhu, “Evaluating grid de-energization through spatio-temporal wildfire ignition prediction,” manuscript in progress; planned submission to Applied Energy.
[2] S. Chen, M. Jih-Vieira, M. Guo, M. Sawai, and S. Zhu, “Grid-bench: A multimodal benchmark dataset for modeling power grid resilience under climate and infrastructure stressors,” manuscript in progress; planned submission to KDD.
[3] M. Sawai, S. More, S. Pandhare, P. Nagardhane, and M. Ranjanikar, “A review of intelligent computing systems for diagnosing plant diseases,” SSRN working paper, Jul. 2021. https://ssrn.com/abstract=3879604. DOI: 10.2139/ssrn.3879604.
[4] M. Sawai, S. More, P. Nagardhane, S. Pandhare, and M. Ranjanikar, “Intelligent computing systems for diagnosing plant diseases,” in Computational Intelligence in Data Mining, J. Nayak, H. Behera, B. Naik, S. Vimal, and D. Pelusi, Eds., Singapore: Springer Nature Singapore, 2022, pp. 75–87, ISBN: 978-981-16-9447-9.
[5] M. Sawai, D. Sawai, and N. Sharma, “6-tier design framework for smart solution as a service (SSaaS) using big data,” in Data Management, Analytics and Innovation, V. E. Balas, N. Sharma, and A. Chakrabarti, Eds., Singapore: Springer Singapore, 2019, pp. 89–98, ISBN: 978-981-13-1274-8.
Role : Machine Learning Engineer | FinSat
March 2025 - Present
• As FinSat’s inaugural ML engineer, architected and built the end-to-end cloud ML infrastructure and developed from scratch a modular pipeline using Grounding DINO and SAM for mask generation, boosting rooftop segmentation mean IoU from 0.85 to 0.92 and eliminating all manual annotation for fully automated geospatial analysis
• Engineered and optimized a YOLOv8m-based obstruction detection module (integrating pre-trained weights, spatial filtering, and geometric merging) to isolate rooftop obstructions, raising mean IoU from 0.75 to 0.87 and reducing false positives by 40%
• As the first engineer to migrate FinSat’s end-to-end ML pipeline to Azure (Tesla T4 GPU), containerized and orchestrated services with Gunicorn and Nginx, implemented monitoring, and optimized inference to reduce latency from 8 s to under 2 s
Role : Software Engineering Intern (AI / ML) | Tata Consultancy Services and Carnegie Mellon University
August 2024 - Present (Capstone Project)
• Developed a real-time pothole detection and depth estimation system using YOLOv8 and Structure from Motion, achieving 92% detection accuracy and a mean depth error of 4.3 cm
• Built an LSTM-based trajectory prediction model integrated with RAFT optical flow, reducing pedestrian tracking error by 38% and enabling real-time inference at 30 FPS on edge devices
Role : Software Engineer and Machine Learning Engineer - Intern | Hyphenova Network
May 2024 - Aug 2024
• Engineered and fine-tuned state-of-the-art transformer models (BERT, RoBERTa) for text classification, improving hate speech detection accuracy by 30% and reducing false positives by 15%, driving robust content moderation for large-scale data systems
• Implemented large language models (LLMs) for multi-class political discourse classification, elevating model accuracy by 25% and achieving a 95% precision rate across 10,000+ text samples
• Transformed an existing no-code platform into a high-performance, scalable native application using React.js (frontend) and FastAPI (backend), resulting in a 45% increase in application speed
• Developed and integrated a dynamic AuthContext, enabling a 40% faster authentication process while maintaining secure and consistent session management across the application
• Engineered a robust JWT-based session management system, achieving a 30% reduction in session expiry incidents and enhancing user experience with automatic token refresh across app restarts.
Role : Software Engineer | Accenture
Aug 2021 - June 2023
• Spearheaded the development and testing of Barclays Mobile Banking, Barclays’ flagship application with over 10 million downloads; integrated new features, written primarily in Java, into existing builds to improve functionality and user experience
• Conceived, executed, and oversaw automated regression testing for new feature builds; conducted rigorous code reviews of Java and Spring Boot implementations deployed on Kubernetes; used Jenkins to streamline testing processes, cutting testing time by 50%
• Performed automated API testing with the Karate framework, Postman, and Katalon Studio; improved unit, integration, and end-to-end (E2E) test cases, raising production code coverage to 90%
• Recognized for outstanding performance in Accenture's Tech Expressway program, a rigorous six-month training initiative, demonstrating expertise in Java, C++, and Python
Role : Teaching Intern | ATSS College of Business Studies and Computer Applications
May 2020 - May 2021
I worked as a part-time intern at the ATSS College of Business Studies and Computer Applications in Pune, India.
• Delivered lectures for the Bachelor of Science (Computer Science) program, covering Data Structures and Algorithms, Python, Relational Database Management, Operating Systems, Computer Networks, Software Development Life Cycle (SDLC), and Scrum
• Leadership: Mentored undergraduates in software development, focusing on SDLC and Agile methodologies
Role : Graduate Machine Learning Researcher | Carnegie Mellon University
December 2024 - Present
• Researching reinforcement learning for sequential decision-making, optimizing policy learning with uncertainty estimation for disaster management
Role : Instructional Support Staff | Carnegie Mellon University
August 2024 - Present
• Course staff for 95702: Distributed Systems with Profs. Michael McCarthy and Martin Barrett at Carnegie Mellon University
• Mentoring 90+ graduate students in implementing distributed systems, APIs, software architecture patterns, networking protocols, message queueing mechanisms, Android applications, microservices, Blockchain, REST programming, and cryptography
• Responsible for preparing and teaching lab sessions, holding office hours, grading projects, and assisting instructors
Role : Peer Tutor for Machine Learning Course (10-601) | Carnegie Mellon University
August 2024 - Present
• Selected as a tutor based on exceptional academic performance in the course.
• Conducting review sessions for a group of four students, focusing on conceptual understanding, programming techniques, and the mathematical and computational foundations of Machine Learning.

Carnegie Mellon University
Master of Information Systems Management (August 2023 – December 2024)
Coursework: Machine Learning, Mathematical and Computational Foundations of Machine Learning, Distributed Systems, Java for Application Programmers, Data Focused Python, Data Structures for Application Programmers, Cyber Security in AI and ML, Database Management Systems, Intermediate Statistics, Agile Methods, Financial Analytics, Digital Transformation, Managing Disruptive Technologies
Dean's List : Spring Semester 2024
Dean's List : Fall Semester 2024

Bachelor of Engineering in Computer Science (July 2017 – July 2021)
Coursework: Object-Oriented Programming (C++ and Java), Machine Learning, Cloud Computing, Design and Analysis of Algorithms, Operating Systems, Theory of Computation, Software Engineering and Project Management, Web Technology, Software Testing, Computer Networks, Digital Electronics and Logic Design, Discrete Mathematics
Dean's List at Carnegie Mellon University (Spring 2024 and Fall 2024)
Honored with the Dean's List Award for exceptional academic performance, achieving a GPA of 3.85+ in consecutive semesters
GATE Exam 2022 (national entrance test for postgraduate study in India)
All India Rank: 5900 in the Computer Science exam
All India Rank: 1128 in the Mathematics exam
Graduate Record Examination (GRE) 2022
Quantitative Reasoning (Math): 170/170
Verbal Reasoning: 153/170
Total Score: 323/340
Unified Cyber Olympiad 2009
All India Rank: 5
Maharashtra State Government Scholarship Examination 2011
State Rank: 16
Percentile: 99.997
Being felicitated by the then Deputy Chief Minister, Mr. Ajit Pawar:

Football
Captained the Computer Science Department football team (undergrad) to various inter-collegiate tournament championships.


Cochlea, Pune, India
Volunteer Experience: Cochlea – Empowering Hearing-Impaired Children (2023)
In 2023, I volunteered with Cochlea, a non-profit organization dedicated to supporting and empowering hearing-impaired children. I contributed to initiatives aimed at enhancing accessibility, fostering inclusivity, and creating opportunities for these children to thrive in their personal and educational journeys.

I architected a social networking application for Carnegie Mellon University that validates CMU email IDs via an external third-party API, ensuring only authorized users can register. This project involved creating an Android application in Java, using multi-threading to keep the GUI responsive. The app made HTTP requests following RESTful principles and handled JSON/XML parsing, communicating with a backend web service deployed on Codespaces Cloud.
The project also included a desktop analytics dashboard built with JSP, HTML, and CSS to display analytics such as the most popular course at CMU. User information was securely stored in a MongoDB database hosted on Atlas Cloud, with the server program handling all CRUD operations.
Here is a glimpse of the application:


Project Overview:
Built a Recurrent Neural Network Language Model (RNN-LM) with self-attention to generate and score text, trained on the TinyStories dataset containing 2 million GPT-generated short stories with simplified vocabularies.
Data Preprocessing:
Implemented subword tokenization to break text into sequences of tokens, assigning numerical values to substrings such as words, characters, punctuation, and whitespace. This approach aligns with state-of-the-art techniques used in models like GPT-4 and LLaMA.
Model Architecture:
Embedding Layer: Converts token indices into dense vector representations.
RNN Layer: Acts as the backbone for sequence modeling, capturing temporal dependencies in the data.
Self-Attention: Enhances the model's ability to focus on relevant context across sequences.
Language Model Head (LM Head): Maps the processed features to output token probabilities.
Training Methodology:
Used PyTorch's module-based automatic differentiation for efficient implementation and backpropagation. Optimized the model using a cross-entropy loss function, ensuring alignment between predicted and true next-token distributions.
Text Generation Techniques:
Demonstrated advanced text generation methods, including:
Greedy Sampling: Selecting the token with the highest probability at each step.
Temperature Sampling: Adjusting randomness to control creativity in generated text.
Key Insights and Results:
The project highlighted the effectiveness of combining RNNs with self-attention mechanisms for language modeling tasks, emphasizing scalability and adaptability to diverse datasets.
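The two generation strategies listed above can be illustrated with a minimal, framework-free sketch. It is not the project's PyTorch code: the function names (`softmax`, `greedy_sample`, `temperature_sample`) and the example logits are hypothetical, chosen only to show how temperature reshapes the next-token distribution.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_sample(logits):
    """Greedy decoding: always pick the highest-scoring token index."""
    return max(range(len(logits)), key=lambda i: logits[i])


def temperature_sample(logits, temperature=1.0, rng=random):
    """Draw a token index at random from the temperature-scaled distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]  # hypothetical scores for a 3-token vocabulary
    print("greedy:", greedy_sample(logits))
    print("T=0.5:", softmax(logits, 0.5))  # sharper, closer to greedy
    print("T=2.0:", softmax(logits, 2.0))  # flatter, more random
    print("sampled:", temperature_sample(logits, 1.0, random.Random(0)))
```

A temperature below 1 concentrates probability on the top token, while a temperature above 1 spreads it across alternatives, which is the "creativity" control described above.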

Overview
This project explores Q-learning with linear function approximation to solve the Mountain Car environment, where an underpowered car must build momentum to reach the goal. An ε-greedy policy was implemented to balance exploration and exploitation.
State Representation & Function Approximation
To handle the continuous state space, tile coding was used to discretize position and velocity into sparse binary features. A linear function approximator with stochastic gradient updates estimated the Q-values, improving the policy over time.
Experience Replay & Temporal Difference Learning
An experience replay buffer improved sample efficiency by breaking the correlation between consecutive updates. The Q-learning update rule used temporal difference learning with a discount factor (γ) to optimize long-term rewards.
Results & Impact
The implementation successfully trained an RL agent to solve the Mountain Car game with stable convergence, demonstrating the effectiveness of Q-learning, function approximation, and policy optimization techniques.
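As a rough illustration of the approach described above, the sketch below combines a simple tile coder with a linear Q-function and an ε-greedy action rule. It is a simplified stand-in, not the project's code: the class names (`TileCoder`, `LinearQ`), the tiling layout, the hyperparameters, and the state bounds are assumptions, and the experience replay buffer is omitted for brevity.

```python
import random


class TileCoder:
    """Maps a continuous state to one active tile index per offset tiling."""

    def __init__(self, lows, highs, tiles_per_dim=8, num_tilings=4):
        self.lows, self.highs = lows, highs
        self.tiles = tiles_per_dim
        self.num_tilings = num_tilings
        # One extra tile per dimension so shifted tilings still cover the range.
        self.tiling_size = (tiles_per_dim + 1) ** len(lows)
        self.num_features = num_tilings * self.tiling_size

    def active_features(self, state):
        features = []
        for t in range(self.num_tilings):
            index = 0
            for x, lo, hi in zip(state, self.lows, self.highs):
                scaled = (x - lo) / (hi - lo) * self.tiles
                cell = int(scaled + t / self.num_tilings)  # shift each tiling
                cell = min(max(cell, 0), self.tiles)
                index = index * (self.tiles + 1) + cell
            features.append(t * self.tiling_size + index)
        return features


class LinearQ:
    """Q(s, a) is the sum of the weights of the active tiles for action a."""

    def __init__(self, coder, num_actions, alpha=0.1, gamma=0.99):
        self.coder = coder
        self.gamma = gamma
        self.alpha = alpha / coder.num_tilings  # spread the step over tilings
        self.weights = [[0.0] * coder.num_features for _ in range(num_actions)]

    def q(self, state, action):
        return sum(self.weights[action][i]
                   for i in self.coder.active_features(state))

    def act(self, state, epsilon, rng=random):
        """Epsilon-greedy: explore with probability epsilon, else exploit."""
        if rng.random() < epsilon:
            return rng.randrange(len(self.weights))
        values = [self.q(state, a) for a in range(len(self.weights))]
        return values.index(max(values))

    def update(self, state, action, reward, next_state, done):
        """One Q-learning step: move Q(s, a) toward the TD target."""
        target = reward
        if not done:
            target += self.gamma * max(
                self.q(next_state, a) for a in range(len(self.weights)))
        td_error = target - self.q(state, action)
        for i in self.coder.active_features(state):
            self.weights[action][i] += self.alpha * td_error
        return td_error


if __name__ == "__main__":
    # Assumed bounds for (position, velocity); adjust to the actual environment.
    coder = TileCoder(lows=(-1.2, -0.07), highs=(0.6, 0.07))
    agent = LinearQ(coder, num_actions=3)
    print(agent.update((-0.5, 0.0), 2, -1.0, (-0.49, 0.01), done=False))
```

Because exactly one tile per tiling is active, the gradient of the linear Q-function is a sparse binary vector, so each update touches only `num_tilings` weights.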
Blockchain Java Documentation and Implementation
I designed and implemented comprehensive Java documentation for Blockchain technology, covering essential methods such as proof of work, chain repair, transaction addition, and Blockchain validation. Key components include:
Blockchain and Block Classes: Developed classes with all necessary methods for executing Blockchain-based operations.
Deployment: Utilized a client-server architecture with TCP socket programming.
Client: Provided a menu-driven interface for user interactions.
Server: Handled backend Blockchain operations and authenticated clients using RSA digital signatures.
This implementation ensures secure and efficient Blockchain operations, showcasing my expertise in both Blockchain technology and Java programming.
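To make the proof-of-work and validation steps concrete, here is a minimal sketch written in Python rather than the project's Java. The block layout, the function names (`block_hash`, `proof_of_work`, `add_block`, `is_valid`), and the rule of requiring leading hexadecimal zeros are illustrative assumptions, not the project's actual design; chain repair, the TCP client-server layer, and RSA signatures are left out.

```python
import hashlib
import json


def block_hash(index, data, previous_hash, nonce):
    """SHA-256 over the block's fields, serialized in a fixed order."""
    payload = json.dumps([index, data, previous_hash, nonce])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def proof_of_work(index, data, previous_hash, difficulty):
    """Search for a nonce whose hash starts with `difficulty` hex zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(index, data, previous_hash, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def add_block(chain, data, difficulty=2):
    """Mine a new block and append it to the chain (a list of dicts)."""
    previous_hash = chain[-1]["hash"] if chain else ""
    index = len(chain)
    nonce, digest = proof_of_work(index, data, previous_hash, difficulty)
    chain.append({"index": index, "data": data,
                  "previous_hash": previous_hash, "nonce": nonce,
                  "difficulty": difficulty, "hash": digest})


def is_valid(chain):
    """Recompute every hash and check the links and the difficulty."""
    previous_hash = ""
    for block in chain:
        expected = block_hash(block["index"], block["data"],
                              block["previous_hash"], block["nonce"])
        if block["previous_hash"] != previous_hash:
            return False
        if block["hash"] != expected:
            return False
        if not expected.startswith("0" * block["difficulty"]):
            return False
        previous_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = []
    add_block(chain, "genesis")
    add_block(chain, "Alice pays Bob 5")
    print(is_valid(chain))        # chain as mined
    chain[1]["data"] = "tampered"
    print(is_valid(chain))        # stored hash no longer matches the data
```

Editing any field of a mined block changes its recomputed hash, so validation fails until the block and every later block are re-mined, which is the work a chain-repair step has to perform.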
• Pioneered the "Know Your State" web-based distributed system, utilizing Java Servlets, JSP, and MVC architecture on the TomEE server to enable interactive education about U.S. states, demonstrating proficiency in web application development and distributed computing
• Executed advanced web scraping using Jsoup and dynamically processed APIs, fetching and parsing JSON records with GSON to deliver real-time, detailed information and images about the state
• Developed a game in Java, implementing multithreading for optimal performance; utilized Java Swing for the GUI and Java AWT for efficient event handling
• Implemented a dedicated thread for each mole, enabling real-time score tracking and dynamic mole behavior
I developed a Student Management software application in Java to demonstrate the effectiveness and ease of OOP. It illustrates important concepts such as classes, abstraction, inheritance, polymorphism, interfaces, static nested classes, non-static nested classes, and anonymous classes, as well as the importance of making defensive copies for data security when dealing with mutable data.
• AI/ML: Led a team with the objective of detecting whether a plant is diseased using Computer Vision and Machine Learning (Neural Networks); published a research paper in Springer proceedings, 2022
• Full Stack Development: Orchestrated the design, implementation, and deployment of the project through a robust application coded in Python (backend), integrating Flask for RESTful API development; utilized modern frontend frameworks such as React for dynamic user interfaces; and employed asynchronous programming with Celery to optimize background tasks, ensuring a responsive and efficient user experience
• Pioneered a new algorithm called ‘Modified Data Encryption Standard’ for encryption of digital data
• Successfully reduced the key generation time for this algorithm as compared to the ‘Simplified Data Encryption Standard’ algorithm; decreased the possibility of a brute force attack
• Deployed a system to predict house prices from associated datasets by implementing different regression algorithms such as Linear Regression, Random Forest, and Decision Tree
• Compared these algorithms on various parameters and conducted exploratory data analysis in Python using Seaborn, Pandas, and Matplotlib