2024 – Multimodal Robotic World Model In progress
Project developing a unified sensor and teleoperation AI model designed to accelerate robotic perception and enable safer, more intuitive human-robot collaboration. The project aims to create robots that learn complex tasks more efficiently by understanding and interacting with their environment through diverse sensory inputs and human guidance.
2024 – WallE: VR Based Robot Teleoperation Interface
Link: Demo
Engineered an intuitive VR teleoperation system that enables precise, real-time remote robot control by translating head movements into robot actions and providing immersive 3D visual feedback for enhanced depth perception. Key achievements include:
Real-time Head-Driven Control: Captured 3-DOF head-pose at 60 Hz from an Oculus headset via ROS, translating movements into joint commands for a multi-DOF robotic neck, enabling smooth, natural robot motion and data collection for imitation learning.
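The head-pose-to-joint mapping can be sketched as below. This is a minimal illustration; the joint names and limit values are hypothetical, not the project's actual parameters.

```python
# Minimal sketch of mapping a 3-DOF head pose (yaw, pitch, roll in radians)
# to joint targets for a robotic neck. Joint names and limits are
# illustrative, not the project's actual configuration.

def clamp(value, lo, hi):
    """Constrain a value to the joint's mechanical range."""
    return max(lo, min(hi, value))

# Per-joint limits in radians (hypothetical values).
JOINT_LIMITS = {
    "neck_yaw":   (-1.2, 1.2),
    "neck_pitch": (-0.6, 0.6),
    "neck_roll":  (-0.4, 0.4),
}

def head_pose_to_joint_targets(yaw, pitch, roll):
    """Translate one headset orientation sample into clamped joint targets."""
    pose = {"neck_yaw": yaw, "neck_pitch": pitch, "neck_roll": roll}
    return {joint: clamp(angle, *JOINT_LIMITS[joint])
            for joint, angle in pose.items()}

# A pose outside the mechanical range is clamped rather than passed through.
targets = head_pose_to_joint_targets(1.5, 0.3, -0.5)
```

Clamping at the mapping layer keeps out-of-range headset motion from ever reaching the servo controllers.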
Immersive 3D Visual Feedback: Integrated a stereo RGBD camera, streaming dual-view video into the Oculus and completing a closed loop: the headset user sees the robot's environment and can repeatedly perform actions with immediate feedback. This provides operators with the depth perception crucial for fine manipulation tasks.
Modular & Extensible Architecture: Designed a Unity-ROS-Dynamixel workflow with configurable DOF settings, ensuring adaptability to various robot platforms and paving the way for future full-body telemanipulation. Implemented ROS damping filters and H.264/H.265 video compression, achieving 60 Hz command updates with a low 45–60 ms round-trip delay and sub-degree motion accuracy, critical for effective teleoperation. Successfully executed neck movements and remote folding tasks, establishing a robust foundation for scalable imitation-learning data collection and complex remote operations.
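One plausible form for the damping filters mentioned above is a first-order low-pass (exponential smoothing) stage on each command channel; the alpha value below is illustrative, not the project's tuned parameter.

```python
# A first-order low-pass (exponential smoothing) filter, one plausible way
# to damp jittery head-tracking commands before they reach the servos.
# The alpha chosen at construction time is illustrative.

class LowPassFilter:
    def __init__(self, alpha):
        self.alpha = alpha      # 0 < alpha <= 1; smaller = heavier damping
        self.state = None

    def update(self, sample):
        """Blend the new sample with the filter state and return it."""
        if self.state is None:
            self.state = sample                 # first sample passes through
        else:
            self.state = self.alpha * sample + (1 - self.alpha) * self.state
        return self.state
```

At 60 Hz, each incoming pose sample would be passed through `update` before being converted to a joint command, trading a small amount of latency for much smoother motion.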
Designed and implemented a 16-bit CPU based on a reduced, x86-inspired instruction set architecture in SystemVerilog, showcasing end-to-end expertise from ISA specification through FPGA verification. Key features:
ISA Implementation: Developed a 16-bit processor featuring a 16-bit Program Counter (PC), Instruction Register (IR), general-purpose registers, ALU, and a 3-stage fetch–decode–execute pipeline. Supports core instructions including ADD, AND, NOT, BR, JMP, JSR, LDR, STR, and PAUSE. Constructed a state machine to precisely sequence memory access, ALU operations, and I/O interactions via a custom `cpu_to_io` bridge, interfacing with on-board switches and hexadecimal displays.
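The fetch–decode–execute flow can be illustrated with a toy software model. The 16-bit field layout below is an assumption for illustration (LC-3-style opcode and register fields), not the CPU's actual encoding.

```python
# Toy fetch-decode-execute step illustrating the 3-stage flow described
# above. The 16-bit field layout (4-bit opcode, 3-bit register fields)
# is illustrative, not the CPU's actual encoding.

def step(memory, regs, pc):
    instr = memory[pc]                 # fetch
    opcode = (instr >> 12) & 0xF       # decode: top 4 bits select the op
    if opcode == 0b0001:               # ADD DR, SR1, SR2 (register form)
        dr  = (instr >> 9) & 0x7
        sr1 = (instr >> 6) & 0x7
        sr2 = instr & 0x7
        regs[dr] = (regs[sr1] + regs[sr2]) & 0xFFFF   # execute, 16-bit wrap
    return pc + 1                      # next PC

regs = [0] * 8
regs[1], regs[2] = 5, 7
memory = [0b0001_000_001_000_010]      # ADD R0, R1, R2
pc = step(memory, regs, 0)
```

On hardware, each of these stages occupies a pipeline state rather than a Python statement, with the state machine sequencing memory and ALU accesses.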
Memory-Mapped I/O & BRAM Integration: Mapped UART, switches, and seven-segment displays into the CPU's address space utilizing on-chip Spartan-7 Block RAM (BRAM), efficiently handling read/write timing without requiring an external “ready” signal.
Graphics Controller: Developed an IP-core-based HDMI graphics controller for 80×30 character text rendering over AXI4 on Vivado IP Integrator. Implemented monochrome and color text output using VRAM and font ROM, supporting inverse text and palette-based coloring.
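Character-mode rendering reduces to simple VRAM addressing. The cell packing below (character in the low byte, palette index and inverse bit in the high byte) is an assumed layout for illustration, not necessarily the controller's actual word format.

```python
# Character-cell addressing for an 80x30 text mode: each cell maps to one
# VRAM word holding a character code plus attributes. The packing shown
# (char in low byte, palette/inverse bits in high byte) is illustrative.

COLS, ROWS = 80, 30

def vram_index(row, col):
    """Linear VRAM offset of the cell at (row, col)."""
    return row * COLS + col

def pack_cell(char, palette_index, inverse=False):
    """Pack a character and its attributes into one 16-bit VRAM word."""
    attr = (palette_index & 0x7F) | (0x80 if inverse else 0)
    return (attr << 8) | (ord(char) & 0xFF)
```

At scan-out, the controller reads the cell word, uses the character code to index the font ROM, and applies the palette and inverse bits per pixel.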
FPGA Verification: Successfully synthesized and achieved timing closure for the design in Vivado, deploying and verifying stable operation at 50 MHz on a Xilinx Spartan-7 FPGA board.
Developed a UNIX-style operating system kernel and a robust journaling filesystem from scratch for the RISC-V architecture. This project built my skills in low-level systems programming and OS design. Highlights include:
UNIX-style Kernel in C: Implemented core OS functionalities including a bootloader, trap handling, Sv39 virtual memory with demand paging, and process abstraction. Supports essential user-mode syscalls (open, close, read, write, ioctl, exec, fork, wait, usleep, fscreate, fsdelete). Created cooperative and preemptive threading models using condition variables and timer (mtime) interrupts.
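The cooperative threading model can be illustrated by analogy with generators: each thread runs until it voluntarily yields, and a round-robin scheduler picks the next one. The real kernel switches stacks in C; this is only a toy analogy.

```python
# A toy cooperative scheduler using generators, illustrating the voluntary
# yield-based control transfer behind cooperative threading. The real
# kernel switches stacks in C; this is only an analogy.

from collections import deque

def scheduler(threads):
    """Round-robin over threads; each runs until it yields or finishes."""
    ready = deque(threads)
    trace = []
    while ready:
        thread = ready.popleft()
        try:
            trace.append(next(thread))   # run until the next yield
            ready.append(thread)         # re-queue: it yielded voluntarily
        except StopIteration:
            pass                         # thread exited
    return trace

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"

trace = scheduler([worker("A", 2), worker("B", 1)])
```

Preemptive threading replaces the voluntary yield with a timer (mtime) interrupt that forces the context switch, which is exactly the difference between the two models the kernel supports.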
Custom Filesystem + Efficient Block Cache Layer: Engineered a block-based filesystem with a write-ahead journal for metadata consistency and robust crash recovery. Features support for create, read, write, delete, flush operations, and multi-level indirection. Mountable via VirtIO block device or an in-memory “memio” for rapid testing. Implemented a write-back cache with configurable associativity, significantly reducing I/O latency for both filesystem and block operations while ensuring data consistency through the journaling mechanism.
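The write-back cache described above can be sketched as follows. This is a simplified model in the spirit of that design (set-associative, LRU eviction, dirty tracking); the real implementation is in C with fixed-size block buffers.

```python
# Minimal sketch of a set-associative write-back block cache, in the spirit
# of the filesystem cache described above (details simplified). Dirty blocks
# reach the backing store only on eviction or flush.

from collections import OrderedDict

class WriteBackCache:
    def __init__(self, backing, num_sets, ways):
        self.backing = backing            # dict: block number -> data
        self.num_sets = num_sets
        self.ways = ways
        # One LRU-ordered dict per set: block -> (data, dirty)
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def _set(self, block):
        return self.sets[block % self.num_sets]

    def _evict_if_full(self, s):
        if len(s) >= self.ways:
            victim, (data, dirty) = s.popitem(last=False)   # LRU victim
            if dirty:
                self.backing[victim] = data                 # write back

    def read(self, block):
        s = self._set(block)
        if block in s:
            s.move_to_end(block)          # hit: refresh LRU position
            return s[block][0]
        self._evict_if_full(s)
        data = self.backing.get(block, b"\x00")
        s[block] = (data, False)          # fill clean
        return data

    def write(self, block, data):
        s = self._set(block)
        if block in s:
            s.move_to_end(block)
        else:
            self._evict_if_full(s)
        s[block] = (data, True)           # mark dirty; defer backing write

    def flush(self):
        for s in self.sets:
            for block, (data, dirty) in list(s.items()):
                if dirty:
                    self.backing[block] = data
                s[block] = (data, False)
```

Deferring writes this way is what cuts I/O latency; the write-ahead journal is what makes the deferral safe across crashes.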
Device Drivers & MMIO: Developed drivers for UART (polling & interrupt-driven), Real-Time Clock (RTC), Platform-Level Interrupt Controller (PLIC), VirtIO block & RNG devices (with custom ISR integration), and GPIO, SPI interfaces for embedded peripherals.
Games!: Built a unified I/O interface to load and execute ELF binaries (e.g., Star Trek game, Doom, Rogue, Zork) on QEMU RISC-V, validated by automated tests for correct loading, execution, and system-call handling.
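Before mapping a binary, an ELF loader performs a sanity check on the header. The sketch below shows the kind of check involved; the field offsets follow the ELF specification (RISC-V's `e_machine` value is 243).

```python
# Sketch of the sanity check an ELF loader performs before mapping a
# binary: verify the magic number, class, and machine fields. Offsets
# follow the ELF specification; EM_RISCV is 243 (0xF3).

import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS64 = 2      # e_ident[EI_CLASS] value for 64-bit objects
EM_RISCV = 243

def check_elf_riscv64(header):
    """Return True if the first 20 bytes look like a 64-bit RISC-V ELF."""
    if header[:4] != ELF_MAGIC:
        return False
    if header[4] != ELFCLASS64:
        return False
    # e_machine is a little-endian u16 at offset 18.
    (machine,) = struct.unpack_from("<H", header, 18)
    return machine == EM_RISCV
```

A loader that accepts the header then walks the program headers, mapping each PT_LOAD segment into the process's virtual address space before jumping to the entry point.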
Physics-Based Dynamic Scene Reconstruction Submitting to AAAI 2025
Using unsupervised, physics-informed deep learning frameworks to model and compensate for atmospheric turbulence and fluid dynamics in visual data. This research integrates convolutional encoders, optical flow estimation, and advanced digital signal processing (Fourier and wavelet domain filters) with physical optics simulations and 3D geometric scene reconstruction. The goal is to accurately predict and correct refractive and flow-induced distortions in real-world imagery, with model training conducted on parallel supercomputing clusters. This work has the potential to revolutionize imaging in challenging environmental conditions.
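The Fourier-domain filtering mentioned above can be sketched in one dimension: transform the signal, suppress the high-frequency bins where turbulence-induced jitter concentrates, and invert. The real pipeline operates on 2-D frames with learned components; this toy uses a plain DFT.

```python
# 1-D sketch of Fourier-domain filtering: transform, zero out bins above a
# cutoff, invert. The actual pipeline works on 2-D frames with learned
# filters; this is only a toy illustration of the principle.

import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def lowpass(x, keep):
    """Zero every frequency bin above `keep` (and its conjugate mirror)."""
    X = dft(x)
    n = len(X)
    for k in range(n):
        if min(k, n - k) > keep:
            X[k] = 0
    return idft(X)
```

Wavelet-domain filtering follows the same transform-threshold-invert pattern but localizes the suppression in both space and frequency.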
2024 – AI-Accelerated Hardware Design & Benchmarking Submitted to NeurIPS 2025
Leveraging Large Language Models to automate the generation and verification of SystemVerilog modules for complex hardware design tasks. Developed a comprehensive benchmarking pipeline to evaluate LLM performance in hardware design, featuring automated graders that test LLM-generated modules against testbenches using open-source simulation software. This suite includes a diverse array of tasks, varying in complexity and domain-specific requirements, to thoroughly assess and advance LLM capabilities in the hardware engineering lifecycle.
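The grading step can be sketched as: compile the LLM-generated module with its testbench, simulate, and score the log. The command line and PASS/FAIL log convention below are illustrative assumptions, not the pipeline's actual interface.

```python
# Sketch of an automated grader step: compile and simulate an LLM-generated
# module plus its testbench with an open-source simulator, then score the
# log. The command line and PASS/FAIL log convention are illustrative.

import re
import subprocess

def grade_from_log(log_text):
    """Count PASS/FAIL lines emitted by the testbench and return a score."""
    passed = len(re.findall(r"^PASS\b", log_text, flags=re.MULTILINE))
    failed = len(re.findall(r"^FAIL\b", log_text, flags=re.MULTILINE))
    total = passed + failed
    return passed / total if total else 0.0

def run_and_grade(sources):
    """Compile and simulate with Icarus Verilog, then grade the output."""
    subprocess.run(["iverilog", "-g2012", "-o", "sim.out", *sources],
                   check=True)
    result = subprocess.run(["vvp", "sim.out"],
                            capture_output=True, text=True)
    return grade_from_log(result.stdout)
```

Keeping the score a ratio rather than pass/fail lets the benchmark distinguish a module that fails one corner case from one that fails everything.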
Developed DoorGuardian, an autonomous security system that modernizes access control and addresses the inefficiencies of traditional key/card-based systems.
Problem Addressed: Traditional campus and home access methods relying on physical cards or keys often lead to delays, additional costs, and accessibility challenges.
Innovative Solution: DoorGuardian employs an ultrasonic sensor (HC-SR04) to detect approaching individuals, triggering an Arduino UNO and OV7670 camera. The captured image is streamed via UART to a PC and then to a Telegram bot. Property owners can remotely grant access by sending “door open” or “door close” commands via Telegram, which an ESP32 then uses to actuate SG90 servo motors, locking or unlocking the door.
Technology Stack:
Sensing & Power Management: HC-SR04 ultrasonic sensor (50 cm trigger range) for motion detection and power-saving standby mode.
Imaging: OV7670 camera module interfaced over UART.
Core Control: Arduino UNO for system coordination, ESP32 for Wi-Fi connectivity and Telegram API interaction.
Actuation: SG90 servo motors for the physical locking mechanism.
Workflow Automation: Telegram Bot integrated via Make.com for seamless remote control.
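The HC-SR04 reading behind the 50 cm trigger works by timing the ultrasonic echo: distance is pulse width times the speed of sound, halved for the round trip. A minimal sketch of that conversion (on the Arduino this runs in C, and the speed-of-sound constant assumes room temperature):

```python
# How the HC-SR04 reading works: the sensor reports the echo pulse width,
# and distance = (pulse time x speed of sound) / 2 for the round trip.
# The 50 cm threshold matches the trigger range described above; the
# speed-of-sound value assumes room temperature.

SPEED_OF_SOUND_CM_PER_US = 0.0343   # ~343 m/s

def echo_to_distance_cm(pulse_us):
    """Convert an echo pulse width in microseconds to distance in cm."""
    return pulse_us * SPEED_OF_SOUND_CM_PER_US / 2

def person_detected(pulse_us, threshold_cm=50):
    """Wake the system when something is within the trigger range."""
    return echo_to_distance_cm(pulse_us) < threshold_cm
```

Gating the camera and radio behind this check is what enables the power-saving standby mode: everything downstream stays idle until the distance drops below the threshold.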
Key Features & Impact: Provides a live video feed for verification, automated entry/exit logs for security monitoring, an easily updatable face database, and firmware that can be upgraded for future enhancements. This system offers significant real-world benefits, such as secure keyless access for Airbnb guests, ID-free entry for campus buildings, and improved accessibility for users with disabilities.
2018 – PotterMost Platform
Website: https://bit.ly/pottermost
This was my first project! Successfully cultivated and scaled a vibrant Harry Potter fan community to over 12,500 registered users and 4,000+ social media followers. Led and coordinated a team of 30 volunteers to develop engaging content, including quizzes and discussion forums, fostering a highly active and interactive online platform. This demonstrates strong leadership, community management, and content strategy skills.