Curriculum Vitae
💼 Honours and Awards
🎓 Education
M.S. - Data Analytics Engineering
Northeastern University, Boston, MA (Jan 2021 - May 2023)
- GPA: 3.9/4.0
- Coursework: MLOps, Machine Learning, Deep Learning, Algorithms, Natural Language Processing, Probability and Statistics, Data Mining, Data Management for Analytics
B.Tech - Computer Science Engineering
Indian Institute of Information Technology (IIITS), India (Sept 2015 - May 2019)
- Coursework: Data Structures, Algorithms, Database Management, Linear Algebra, Engineering Probability, Statistics for Data Science
💼 Experience
ML & MLOps Engineer
Tausight, Boston, MA (July 2023 - Present)
- Engineered an NLP model for detecting Personally Identifiable Information (PII) in electronic health records. Achieved a state-of-the-art F1-score of 94% by fine-tuning an LLM using TensorFlow and TFX in an agile workflow.
- Applied MLOps best practices with tools such as Airflow, ELK, DVC, and MLflow for experiment tracking, model management, and data versioning.
- Established and managed robust MLOps pipelines integrating CI/CD, version control, and automated testing, enabling seamless deployment of models into production environments while reducing deployment time by 25%.
Machine Learning Co-op
Tausight, Boston, MA (Jan 2022 - Aug 2022)
- Researched and developed an unsupervised anomaly-detection model to identify malicious applications in healthcare systems, reducing false positives by 15%. Deployed it on customer endpoints through GCP and Airflow.
- Built end-to-end data pipelines using Python and SQL to process multi-terabyte customer data in BigQuery.
- Applied statistical tests, including chi-squared, z-test, t-test, and ANOVA, to validate model assumptions, fine-tune alert thresholds, and confirm the statistical significance of anomaly alerts.
- Generated comprehensive analytic reports on unprotected protected health information (PHI) on customer endpoints using a diverse array of data analysis tools while ensuring compliance with data protection regulations.
Machine Learning Engineer
Youngsoft Inc., India (Jan 2020 - Dec 2020)
- Engineered NLP-powered AI chatbot assistants for healthcare products, reducing customer support costs by 30% and enhancing customer engagement by 46%.
- Developed the chatbots in Python with Rasa (an open-source ML framework) and REST APIs, and deployed them using SQL, git/GitHub, Docker, and AWS EC2 instances, integrating with social media platforms (WhatsApp, Facebook, Telegram, etc.).
- Collaborated with cross-functional teams, including data analysts and stakeholders, to define project goals, requirements, and deliverables, ensuring alignment with business objectives.
Machine Learning - Teaching Assistant
Northeastern University, Boston, MA (Sept 2021 - May 2023)
- Tutored students in Python, NumPy, Pandas, and SciPy to program Machine Learning algorithms from scratch.
- Mentored students on Machine Learning concepts through one-on-one tutoring and regular out-of-class assistance, conducted code reviews, and evaluated assignments.
📑 Certifications
- Certified TensorFlow Developer
- DeepLearning.ai specialization by Prof. Andrew Ng (Coursera)
- Data Science Boot Camp (Udemy)
🔧 Skills
- Programming: Python, R, SQL, C++
- ML Frameworks: Scikit-learn, TensorFlow, Keras, HuggingFace, PyTorch
- Data Science Libraries: Pandas, NumPy, spaCy, NLTK, Gensim, OpenCV, SciPy, Dask, Matplotlib, Seaborn, Plotly
- ML Techniques: Hypothesis Testing, A/B Testing, Regression, Classification, Clustering, Decision Trees, Dimensionality Reduction, Neural Networks, CNN, RNN, LSTM, TF-IDF, word2vec, Embeddings, Transformers, BERT, GPT, NER
- MLOps: git/GitHub, Docker, MLflow, Airflow, Data Version Control (DVC), TFX, Kubernetes, ELK, Linux, GCP, AWS
📝 Projects
- Please check my Portfolio for more details.