Aquib Farhaan Hussain

Data Scientist

Data Science professional with 6+ years of experience developing AI products, solution-driven architectures, and automated pipelines. Skilled in Predictive Modeling, Data Analytics, and Statistical Analysis, with a focus on delivering scalable, impact-driven solutions.

Machine Learning
Deep Learning
NLP
GenAI
Statistics
Time Series

Work Experience

Data Scientist
Acceledge | Pune, Maharashtra
Apr 2023 - Present

Voice-Bot Platform

  • Developed a scalable voicebot integrated with a dialer API to manage outbound call campaigns, tracking 10,000+ queued, active, and completed calls daily via MongoDB and temporary caching.
  • Designed and implemented an end-to-end service for real-time audio interactions, integrating speech-to-text transcription, multilingual support, and dynamic conversational behavior using LLM-driven logic and intent classification with JSON-based dialogue mapping.
  • Upgraded the system by incorporating GenAI (OpenAI + LangChain) to enable scalable and dynamic response generation, enhancing performance and expanding conversational capabilities.
  • Integrated a robust audio response generation pipeline with session-aware memory management to simulate natural, voice-based conversations across diverse languages and user personas.

Web-Bot Platform

  • Engineered a scalable chatbot platform for web and WhatsApp applications, built on a JSON-based architecture for configuring bot flows and deployed on a Linux-based server.
  • Implemented MongoDB-backed session management, and applied RAG and fine-tuning with LLMs and Hugging Face models for language generation.
  • Integrated advanced functionalities into the chatbot platform, including API integration, sentiment analysis, translation mapping, intent classification, multilingual support, and GenAI-driven capabilities.

Audio Processing and NLP Automation

  • Built FastAPI microservices for multilingual audio pipelines, incorporating language identification (SpeechBrain), transcription (Google STT, SpeechRecognition), neural translation, and domain-specific NER models, processing 5,000+ audio files per month.
  • Architected APIs with flexible input support (file/base64), token-based security, modular utility functions, and resilient fault handling to ensure scalable deployments and seamless cross-language NLP tasks.

Data Scraping and Automation

  • Developed a Selenium-based web scraping tool to systematically navigate websites and extract updated data files on a daily basis. Automated the workflow using a scheduler to enhance operational efficiency.
  • Designed post-processing pipelines for extracted data and established a daily logging system, seamlessly integrating with an SQL database for real-time mapping and visualization on the client dashboard.

Data Scientist
Networth Corp | Bangalore, Karnataka
Feb 2021 - Apr 2023

Python Developer
Tech Mahindra | Noida
Feb 2019 - Dec 2020

Skills & Expertise

Technical Skills
Programming Skills

Projects

Voice-Bot Platform
Product Project

Developed a scalable voicebot integrated with a dialer API to manage outbound call campaigns, tracking 10,000+ queued, active, and completed calls daily via MongoDB and temporary caching. Designed and implemented an end-to-end service for real-time audio interactions, integrating speech-to-text transcription, multilingual support, and dynamic conversational behavior using LLM-driven logic and intent classification with JSON-based dialogue mapping.

GenAI
LLM
MongoDB
Speech-to-Text
NLP

Sentiment Analysis and Model Training
Research Project

Analyzed a Twitter dataset of 1.6M records to classify tweets as positive or negative. Trained and compared several models on this data, including GloVe + stacked Bi-LSTM, ANN, and Logistic Regression.

NLP
Deep Learning
GloVe
LSTM
ANN
Logistic Regression

Electricity Forecast in Breweries
Industry Project

Identified electricity consumption patterns in the MAZ region brewery dataset, in which kilowatts consumed are recorded every 15 minutes. Applied time series forecasting techniques, achieving an RMSPE of 37% with ARIMA models and 10% with LSTM models on the validation set.

Time Series
ARIMA
LSTM
Forecasting

Oil Well Shutdown Process Automation
Freelance Project

Investigated how neighboring wells within a specified radius behave when a center well is shut down. Managed data from 60 wells, handling a dataset of approximately 13 GB. Experimented with predictive analysis and visualization techniques, and automated the process to improve data representation.

Python
Data Analysis
Predictive Modeling
Visualization