Ganap Tewary

Advancing State-of-the-Art Vector Search with Advanced Computing Architectures

Graduate Researcher specializing in Hierarchical Navigable Small World (HNSW) graphs, Production-Scale RAG Systems, and FPGA & Parallel Computing

πŸŽ“ M.S. Computer Engineering @ Arizona State University

View Research
[Diagram: RAG System Architecture with HNSW. A user query ("Find similar docs") flows through query processing (tokenization, normalization) and an embedding model (BERT/Sentence-T5, 768-dimensional vectors) into a hierarchical HNSW vector index (entry point at Layer 2, descending through Layer 1 to Layer 0, 98%+ recall), backed by a document store of 10M+ documents with vectors and metadata in PostgreSQL/Pinecone. The retrieved context (top-K documents, re-ranked by score, with metadata) is passed via prompt engineering to a large language model (GPT-4/Claude/Llama, 128K context window), which produces a context-aware generated response.]
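For concreteness, here is a minimal Python sketch of the retrieval core of this pipeline, assuming FAISS's HNSW index and random vectors standing in for real BERT/Sentence-T5 embeddings; the placeholder corpus, query vector, and prompt template are illustrative only, not the production system.

    import faiss
    import numpy as np

    dim, n_docs, k = 768, 10_000, 5
    rng = np.random.default_rng(0)

    # Stand-in corpus: in the real pipeline these are BERT/Sentence-T5 embeddings
    doc_vectors = rng.random((n_docs, dim), dtype=np.float32)
    documents = [f"document {i}" for i in range(n_docs)]   # placeholder texts

    # Build an HNSW index (M = 32 neighbors per node) and tune search breadth
    index = faiss.IndexHNSWFlat(dim, 32)
    index.hnsw.efConstruction = 200
    index.hnsw.efSearch = 64
    index.add(doc_vectors)

    # Embed the user query (placeholder vector here) and retrieve the top-k neighbors
    query_vector = rng.random((1, dim), dtype=np.float32)
    _, ids = index.search(query_vector, k)

    # Assemble the retrieved context into a prompt for the LLM
    context = "\n".join(documents[i] for i in ids[0])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: <user question>"

In a full deployment, the flat float vectors would typically be replaced by a quantized variant, and document texts and metadata would live in a store such as PostgreSQL or Pinecone, as the diagram indicates.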

Academic Journey

From honors undergraduate to doctoral researcher, pioneering efficient ML systems

Incoming Ph.D. in Computer Science
Arizona State University

Starting August 2026 | Advisor: Dr. Jeff Zhang

Research Focus: Optimizing search and recommendation pipelines for speed, personalization, and sustainability across large-scale deployments and resource-constrained devices.
M.S. Computer Engineering (Thesis Track)
Arizona State University

Graduating May 2026

Master's Thesis: FPGA-Accelerated HNSW: Hardware Implementation for Ultra-Low Latency Vector Search
Key Objective: Targeting sub-microsecond query latency through custom datapaths
Advisor: Dr. Jeff Zhang
Key Courses: Deep Learning, Hardware Systems for ML, Advanced Algorithms, Computer Architecture
B.S.E. Computer Systems Engineering
Arizona State University

Graduated May 2025 | Summa Cum Laude, GPA: 3.86

Honors Thesis: Machine Learning Model for Financial Data Prediction
Barrett, The Honors College - Degree with Honors
Advisor: Dr. Jodi Menees
Key Courses: Machine Learning, Data Structures, Digital Hardware Design, FPGA Programming

Research Areas

Advancing the state of the art in efficient machine learning systems

πŸ“Š Efficient Recommendation Systems

Why: Modern platforms serve billions of users requiring real-time personalization at unprecedented scale with minimal latency

Developing next-generation recommendation systems along three complementary directions: HNSW algorithm optimization, heterogeneous hardware acceleration, and graph neural networks.

πŸ” HNSW Algorithm Optimization

Pioneering advancements in Hierarchical Navigable Small World graphs through density-aware quantization and multi-stage re-ranking.

Achievements:

  • 2.5Γ— throughput improvement
  • 75% memory reduction
  • 98%+ recall at billion-scale
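To make the quantize-then-re-rank pattern concrete, the toy Python sketch below scores candidates against 8-bit scalar-quantized vectors (4Γ— smaller than float32) and then re-ranks a shortlist with exact full-precision distances. This is a simplified stand-in, not the AQR-HNSW implementation: there, the coarse stage traverses the HNSW graph rather than scanning every vector, and the quantizer is density-aware rather than uniform.

    import numpy as np

    rng = np.random.default_rng(0)
    base = rng.standard_normal((100_000, 128)).astype(np.float32)   # full-precision corpus
    query = rng.standard_normal(128).astype(np.float32)

    # Stage 1: coarse candidate generation over 8-bit scalar-quantized vectors
    lo, hi = base.min(), base.max()
    scale = (hi - lo) / 255.0
    q_base = np.round((base - lo) / scale).astype(np.uint8)
    q_query = np.round((query - lo) / scale).astype(np.float32)
    coarse = ((q_base.astype(np.float32) - q_query) ** 2).sum(axis=1)
    candidates = np.argpartition(coarse, 200)[:200]                 # shortlist of 200

    # Stage 2: exact re-ranking of the shortlist in full precision
    exact = ((base[candidates] - query) ** 2).sum(axis=1)
    top10 = candidates[np.argsort(exact)[:10]]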

Heterogeneous Hardware-Accelerated RAG

Building production-scale Retrieval-Augmented Generation systems that combine the computational power of CPUs, FPGAs, and Intel Gaudi accelerators.

Performance:

  • Sub-100ms P95 latency
  • 100K QPS throughput
  • 10M+ embeddings

πŸ•ΈοΈ Graph Neural Networks

Leveraging GNN architectures to capture complex user-item interactions and social network effects for enhanced personalization.

Innovations:

  • Multi-hop reasoning
  • Temporal dynamics modeling
  • Cross-domain transfer
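As a concrete illustration of the interaction modeling, the PyTorch sketch below runs one LightGCN-style propagation step over a toy user-item bipartite graph and scores items for a single user; the interaction matrix and dimensions are invented, and this is a sketch of the general technique rather than the research code.

    import torch

    num_users, num_items, dim = 4, 6, 16
    torch.manual_seed(0)
    user_emb = torch.randn(num_users, dim)
    item_emb = torch.randn(num_items, dim)

    # Toy user-item interactions (1 = user engaged with item)
    R = torch.tensor([[1, 0, 1, 0, 0, 1],
                      [0, 1, 1, 0, 0, 0],
                      [1, 1, 0, 1, 0, 0],
                      [0, 0, 0, 1, 1, 1]], dtype=torch.float32)

    # Symmetric normalization of the interaction matrix, as in LightGCN
    d_u = R.sum(dim=1, keepdim=True).clamp(min=1)
    d_i = R.sum(dim=0, keepdim=True).clamp(min=1)
    R_norm = R / torch.sqrt(d_u) / torch.sqrt(d_i)

    # One propagation layer: users aggregate item embeddings, items aggregate users
    user_next = R_norm @ item_emb
    item_next = R_norm.t() @ user_emb

    # Score all items for user 0, mask already-seen items, and recommend the top 3
    scores = user_next[0] @ item_next.t()
    scores[R[0] > 0] = float("-inf")
    top_items = torch.topk(scores, k=3).indices

Stacking several such layers gives multi-hop reasoning over the graph; temporal and cross-domain signals can be incorporated as additional node or edge features.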

Industry Adoption:

Netflix YouTube Amazon Spotify TikTok Meta Google Microsoft OpenAI Pinterest LinkedIn Uber

🧬 Neural Architecture Search

Why: Manual network design is inefficient and often suboptimal for specific hardware

Automating the discovery of optimal neural architectures for resource-constrained environments. My NEXUS-NAS framework uses evolutionary algorithms and reinforcement learning to search for Pareto-optimal architectures.
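To ground the Pareto-optimality objective, the small hypothetical helper below filters candidate architectures to the accuracy-versus-latency Pareto front; the candidate names and numbers are made up for illustration.

    # Each candidate: (name, top-1 accuracy to maximize, latency in ms to minimize)
    candidates = [("arch_a", 0.76, 12.0), ("arch_b", 0.74, 6.5),
                  ("arch_c", 0.79, 30.0), ("arch_d", 0.73, 9.0)]

    def pareto_front(cands):
        """Keep candidates that no other candidate strictly dominates."""
        front = []
        for name, acc, lat in cands:
            dominated = any(a >= acc and l <= lat and (a > acc or l < lat)
                            for _, a, l in cands)
            if not dominated:
                front.append((name, acc, lat))
        return front

    print(pareto_front(candidates))
    # arch_d is dominated by arch_b (higher accuracy at lower latency); the rest form the front.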

Google AutoML Microsoft Facebook Amazon

πŸ” Quantum Circuits and Algorithms

Why: Classical computing faces fundamental limits in solving certain optimization problems

Exploring quantum algorithms for similarity search and optimization in high-dimensional spaces. Investigating QAOA for large-scale recommendation systems and quantum-enhanced HNSW.

IBM Quantum Google Microsoft Amazon

⚑ FPGA/Hardware Acceleration

Why: Software-only solutions cannot meet ultra-low latency requirements for real-time AI

Developing FPGA implementations of HNSW and NAS algorithms. Targeting sub-microsecond query latency through custom datapaths and memory hierarchies for edge deployment.

NVIDIA Intel Xilinx Qualcomm

Publications & Research

Contributing to the advancement of efficient ML systems

AQR-HNSW: Accelerating Approximate Nearest Neighbor Search via Density-aware Quantization and Multi-stage Re-ranking

Design and Automation Conference (DAC) 2026

Novel density-aware adaptive quantization achieving 4Γ— compression while preserving distance relationships. Demonstrates 1.8-2.5Γ— higher QPS than state-of-the-art HNSW implementations.

HNSW Vector Quantization SIMD C++
ONGOING WORK

FPGA-Accelerated HNSW: Hardware Implementation for Ultra-Low Latency Vector Search

Master's Thesis | Target: IEEE FCCM Reconfigurable Computing Challenge (RCC 2026)

Developing FPGA-based acceleration for HNSW algorithm targeting sub-microsecond query latency. Custom datapath optimizations for billion-scale deployments.

FPGA Verilog HLS Xilinx
ONGOING WORK

NEXUS-NAS: Multi-Fidelity Bayesian Optimization for Hardware-Aware Neural Architecture Search

Target: NeurIPS 2026

Automated neural architecture discovery with hardware constraints. Leveraging GNNs for architecture encoding and Bayesian optimization.

AutoML Bayesian Opt GNN PyTorch
ONGOING WORK

METIS-Graph: Adaptive Multi-Source RAG with Graph-Aware Autoscaling

Target: International Conference on Supercomputing (ICS) 2026

A system that enables adaptive Graph RAG through graph-aware query profiling that prunes 98% of the configuration space, dynamic hybrid retrieval mode selection, and multi-source autoscaling across CPU and GPU resources.

RAG Knowledge Graph Autoscaling LLMs

Teaching & Professional Experience

Mentoring the next generation while advancing industry applications

ML/NLP Intern

MyEdMaster Inc.

Aug 2024 – May 2025

Capstone Project

Fine-tuned transformer models using spaCy and PyTorch on scholastic corpora, improving classification accuracy by 20% and achieving an F1-score of 0.91. Deployed ML models to production serving 10,000+ active students.

Undergraduate Teaching Assistant

School of Computing & AI, ASU

Jan 2025 – May 2025

Supported 80+ undergraduate students in Embedded C and hardware systems projects, achieving 25% improvement in project outcomes. Delivered focused lectures on IoT systems and Edge AI integration.

Instructional Aide / Grader

School of Mathematical & Statistical Sciences, ASU

Aug 2023 – July 2025

Guided students through Calculus and advanced Statistics coursework. Assessed assignments using structured rubrics, providing constructive feedback to foster academic growth.

College Activities & Certificates

Google Developers Club
IEEE Student Club Member
Barrett Honors College
Dean's List (Multiple Semesters)

Key Projects

From academic research to production deployments

Production-Scale RAG System with HNSW

Master's Research | 2025 - Present

Leading development of a production RAG system integrating LLMs with HNSW-optimized vector search. Achieved 30% lower latency and 40% memory savings across 10M+ embeddings.

PyTorch FAISS LangChain pgvector

Machine Learning Model for Financial Data Prediction

Undergraduate Honors Thesis | Barrett, The Honors College | 2024-2025

Developed advanced ML models for financial time series prediction, exploring LSTM networks, ensemble methods, and feature engineering techniques for market analysis.

View Full Thesis
Python TensorFlow Time Series LSTM

Transformer Optimization for EdTech

MyEdMaster Inc. | Aug 2024 - May 2025

Fine-tuned transformer models on scholastic corpora, improving classification accuracy by 20% with an F1-score of 0.91. Deployed to production serving 10,000+ students.

spaCy PyTorch FastAPI AWS

Technical Arsenal

Comprehensive skillset spanning ML, systems, and hardware

πŸ€– Machine Learning & AI

HNSW Algorithms Vector Databases Neural Architecture Search Deep Learning Recommendation Systems NLP/Transformers Computer Vision RAG Systems

πŸ’» Programming and Tools

Python C/C++ PyTorch TensorFlow Java SQL R MATLAB AWS SageMaker Docker Elasticsearch Pinecone Weaviate Git & GitHub

βš™οΈ Hardware & Systems

FPGA/Verilog SIMD (AVX/NEON) ML Accelerators Embedded Systems Computer Architecture IoT Systems

Let's Connect

Open to collaborations and opportunities in ML systems research

Get in Touch

Email

ganapat0706@gmail.com

Location

Tempe, Arizona, USA

Seeking Opportunities

Actively seeking a Summer 2026 ML Engineering Internship focusing on:

  • HNSW and vector database optimizations
  • Large-scale recommendation systems
  • LLMs and RAG systems
  • Model architecture optimization
  • Hardware-accelerated ML

Open to research collaborations in approximate nearest neighbor search and efficient ML systems.

Resume Preview