Available for opportunities

Satyam Raj Purohit

AI/ML Software Engineer with 1.5+ years building production-grade ML systems, NLP pipelines, and data engineering solutions. Passionate about LLMs, RAG architectures, and turning research into real-world impact.

80% Processing Time Saved
90% Categorization Precision↑
10+ Awards & Recognitions
8 Certifications
Satyam Raj Purohit
Software Engineer @ Accenture · IIT Jammu Alumni
NLP LLMs RAG PyTorch Snowflake Azure FastAPI Databricks
1.5+ yrs Industry Exp.
IIT Jammu B.Tech CSE
Top 5 of 80+ teams
8.15 CGPA Academic Score
Employee of the Month
🏆 ACE Award Winner

Building AI that actually ships.

I'm an AI/ML Software Engineer at Accenture, Pune, where I design and deploy production-grade machine learning systems — from NLP pipelines and LLM-powered contract analysis to time-series forecasting and multi-agent AI systems.

I graduated with a B.Tech in Computer Science from IIT Jammu (CGPA: 8.15), where I also worked as an AI Research Assistant developing novel evaluation metrics for first-order logic translations by large language models.

Outside the day job, I enjoy building projects from scratch — whether it's an audio language converter, a VSCode extension, or implementing ML papers by hand to deepen my understanding.

🏢
Accenture, Pune Software Engineer — AI/ML & Data Engineering · Aug 2024 – Present
🎓
IIT Jammu B.Tech Computer Science & Engineering · CGPA 8.15 / 10
📍
Pune, Maharashtra Originally from Sirohi, Rajasthan


02 / Technical Skills

The tech I work with

A snapshot of languages, frameworks, and platforms I use to build and ship AI/ML systems.

💻
Languages
Python SQL JavaScript C/C++ TypeScript
🧠
AI / ML
NLP LLMs RAG Fine-tuning Transformers Deep Learning Vector Embeddings Multi-agent Systems Time Series Forecasting Anomaly Detection Semantic Search TF-IDF
⚙️
Frameworks & Libraries
PyTorch Scikit-learn Hugging Face NLTK Sentence Transformers FastAPI Flask Prophet ARIMA XGBoost Pandas NumPy
☁️
Cloud & DevOps
Azure GCP AWS Docker GitHub Actions GitHub Copilot Git Agile / Scrum
🗄️
Data Platforms
Snowflake Databricks MySQL H2O.ai
🛠️
Tools
VS Code Postman iTrack Jupyter MuleSoft
03 / Experience

Where I've worked

From production ML pipelines to cutting-edge NLP research.

Software Engineer — AI/ML & Data Engineering
Accenture Private Limited · Pune, India
Aug 2024 – Present
  • Collaborated with cross-functional teams across a 12-week Agile project to deliver a supply chain monitoring platform on Databricks and Snowflake, serving 184 Sourcing Managers.
  • Engineered 3 alert modules (including Supplier Health and Price Variance) and 2 forecasting pipelines (Prophet, ARIMA) with log-transform preprocessing, enabling early procurement risk detection.
  • Reduced contract clause verification time by ~80% (80–90 hrs/week → 10–20 hrs/week) using Sentence Transformers and vector embeddings to automate matching across 1,200+ PDF contracts.
  • Improved goods categorization precision by ~90%, narrowing a 20M USD → 2M USD predicted-vs-actual gap using TF-IDF + Balanced Random Forest + rule-based overrides.
  • Shipped 10+ RESTful API endpoints via FastAPI on Databricks (Azure App Services) and 15+ Snowflake stored procedures; resolved a production-critical pipeline issue that had been stalled for 2+ weeks.
  • Cut anomaly investigation time by ~9 hrs/week by prototyping a multi-agent detection system on Azure AI that auto-generates root-cause alerts.
  • Advanced to finals of company-wide Agentic AI hackathon (top 5 of 80+ teams) building a supply chain risk alert system using real-time market news and AI agents.
🏆 Employee of the Month (Oct 2025) ⭐ ACE Award 🎖️ 3 Client Certificates 📣 5 Manager Awards
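The clause-matching bullet above rests on comparing embedding vectors by cosine similarity. The following is a minimal sketch of that idea, not the production code: the function names, the clause IDs, the 0.8 threshold, and the toy 3-dimensional vectors are all hypothetical. In a real pipeline the vectors would come from a Sentence Transformers model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query_vec, clause_vecs, threshold=0.8):
    """Return (clause_id, score) for the most similar clause.

    clause_id is None when the best score is below the (hypothetical)
    threshold, which would flag the clause for manual review.
    """
    best_id, best_score = None, -1.0
    for clause_id, vec in clause_vecs.items():
        score = cosine_similarity(query_vec, vec)
        if score > best_score:
            best_id, best_score = clause_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Toy vectors standing in for real sentence embeddings (assumption).
contract_clauses = {
    "termination": [0.9, 0.1, 0.0],
    "payment_terms": [0.1, 0.9, 0.2],
}
standard_clause = [0.8, 0.2, 0.1]

match_id, match_score = best_match(standard_clause, contract_clauses)
print(match_id, round(match_score, 3))
```

A threshold like this is what lets clear matches pass automatically and sends only the low-similarity clauses to a human reviewer.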
Artificial Intelligence Research Assistant
Indian Institute of Technology, Jammu · Jammu, India
Mar 2024 – May 2024
  • Designed a syntactic AST-match metric inspired by CodeBLEU to evaluate first-order logic (FOL) translations by LLMs on structural correctness — achieving 15% more discriminative scoring than BLEU on the FOLIO dataset.
  • Built custom ASTs from FOL prefix-notation expressions to capture logical relations and predicates; excluded leaf nodes to isolate structural accuracy.
  • Assessed GPT-3 and BERT reasoning on FOLIO, identified structural failure patterns, and delivered 3 targeted improvement proposals to the supervising NLP professor.
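The AST-match idea described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the research code: it assumes parenthesized prefix notation such as (forall x (implies (Human x) (Mortal x))), treats variables and constants as leaves, and scores a candidate as the fraction of reference subtrees it reproduces, in the spirit of the syntactic match in CodeBLEU. All function names are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Split a parenthesized prefix expression into tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Parse one well-formed expression into (label, children) or a leaf string."""
    token = tokens.pop(0)
    if token != "(":
        return token  # leaf: a variable or constant
    label = tokens.pop(0)
    children = []
    while tokens[0] != ")":
        children.append(parse(tokens))
    tokens.pop(0)  # consume the closing ")"
    return (label, children)


def collect_signatures(node, out):
    """Append one signature per internal node to out; leaves are skipped."""
    if isinstance(node, str):
        return None
    label, children = node
    child_sigs = []
    for child in children:
        sig = collect_signatures(child, out)
        if sig is not None:
            child_sigs.append(sig)
    signature = "(" + " ".join([label] + child_sigs) + ")"
    out.append(signature)
    return signature


def ast_match(candidate, reference):
    """Fraction of reference subtrees that also appear in the candidate."""
    cand, ref = [], []
    collect_signatures(parse(tokenize(candidate)), cand)
    collect_signatures(parse(tokenize(reference)), ref)
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    matched = sum(min(n, cand_counts[s]) for s, n in ref_counts.items())
    return matched / len(ref) if ref else 0.0


reference = "(forall x (implies (Human x) (Mortal x)))"
score_renamed = ast_match("(forall y (implies (Human y) (Mortal y)))", reference)
score_wrong_op = ast_match("(forall x (and (Human x) (Mortal x)))", reference)
print(score_renamed, score_wrong_op)
```

Because leaves are dropped, renaming a variable leaves the score unchanged, while swapping a connective lowers it. That is the structural sensitivity a token-overlap metric such as BLEU does not isolate.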
04 / Projects

Things I've built

From research prototypes to production systems and open-source experiments.

🔍
Bias Detection in LLMs
Detected racial, gender, and religious bias in large language models. Fine-tuned DistilBERT and BERT for 3-category bias classification achieving 87% F1; raised it to 88% with a 2-stage summarization + classification pipeline. Benchmarked GPT-2, BERT, RoBERTa, and DistilBERT on the ToxicBias dataset with CONAN augmentation.
Python NLP DistilBERT BERT HuggingFace PyTorch
🔗
Supply Chain Risk Alert System
Advanced to the finals of a company-wide Agentic AI hackathon (top 5 of 80+ teams). Built a real-time supply chain risk alert system that ingests external market news and deploys AI agents to surface actionable procurement risk signals for enterprise teams.
Agentic AI Azure AI Python FastAPI Snowflake
🎙️
Audio Language Converter
An audio-to-audio language converter pipeline — takes speech in one language and outputs speech in a target language. Combines speech recognition, neural machine translation, and text-to-speech synthesis end-to-end.
Python Speech NLP TTS ASR
🔌
DeepSeek VSCode Extension
A VSCode extension integrating the DeepSeek LLM for in-editor AI assistance — code completions, explanations, and chat, built natively with the VSCode Extension API in TypeScript.
TypeScript DeepSeek LLM VSCode API
🏗️
AI From Scratch
Ground-up implementations of core AI/ML algorithms and architectures — neural networks, backpropagation, attention mechanisms, and more — built purely in Python to develop deep intuition beyond library abstractions.
Python Deep Learning NumPy Transformers
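As an illustration of one building block named above, here is a minimal sketch of scaled dot-product attention written with only the Python standard library. It is not taken from the project itself; the function names and the list-of-lists vector layout are assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention.

    queries and keys are lists of vectors with the same dimension;
    values holds one vector per key. Returns one output vector per query.
    """
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two identical keys receive equal weight, so the output is the mean of the values.
result = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[2.0, 0.0], [4.0, 2.0]],
)
print(result)
```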
🔐
Cryptography Systems
Implementations of classical and modern cryptographic systems in C++ — covering symmetric ciphers, asymmetric cryptography, and cryptanalysis techniques from Caesar cipher to RSA.
C++ Cryptography Systems
💬
ChatBox — TCP Networking
A TCP client-server chat implementation in C using BSD sockets — demonstrates low-level network programming, concurrent connection handling, and real-time bidirectional communication.
C TCP Sockets Systems
📄
Structure Representation for Text
Re-implementation of the AAAI 2018 paper "Learning Structured Representation for Text Classification via Reinforcement Learning", which uses policy gradient methods to discover structured text representations optimized for classification.
Python RL NLP Policy Gradient
05 / Certifications

Credentials & Certs

Industry-recognized certifications across cloud, AI, and developer tooling.

Associate Data Practitioner
Google Cloud Certified
✓ Issued May 2025 · Expires May 2028
Generative AI Leader
Google Cloud Certified
✓ Issued Dec 2025 · Expires Dec 2028
Azure AI Engineer Associate (AI-102)
Microsoft Certified
✓ Issued Aug 2025 · Expires Aug 2026
Azure AI Fundamentals (AI-900)
Microsoft Certified
✓ Issued June 2025
SnowPro Core
Snowflake Certified
✓ Certified
GitHub Actions
GitHub Certified
✓ Certified
GitHub Copilot
GitHub Certified
✓ Certified
Reinvention with Agentic AI
Accenture · Credly
✓ Certified
06 / Education

Academic background

Strong CS fundamentals from one of India's premier technical institutions.

Bachelor of Technology — Computer Science & Engineering
Indian Institute of Technology (IIT), Jammu
Jammu, India · Dec 2020 – May 2024 · CGPA: 8.15 / 10
B.Tech CSE CGPA 8.15 / 10 AI/ML Research NLP Thesis IIT Alumni
07 / Contact

Let's work together

Open to new opportunities

Whether you're looking for an AI/ML engineer to build production-grade systems, have a research collaboration in mind, or just want to connect — I'd love to hear from you.