‘Deep Gerchberg-Saxton’: A Physics Informed Neural Network Architecture
A novel deep learning approach to volumetric phase retrieval problems, applied to immersive displays
Summary
Less-Technical Overview: The aim of this project is to control special devices called phased arrays, which can produce tactile sensations in mid-air, simulating the feel of touching an object when, in reality, no object is there. Methods for doing this already exist, but they have been limited by how accurately they produce the desired sensations. The techniques described here also apply to radio, satellites, visual displays, and sound.
Abstract: ‘Deep Gerchberg-Saxton’ is a physics-informed neural network (PINN) architecture that addresses an unresolved area of applied physics with broad implications for engineering: phase retrieval problems. Phase retrieval problems are concerned with finding the correct phase of an oscillation so as to produce a desired state after a certain distance or time of propagation. For any non-trivial phase retrieval problem, given a desired state, there is no guarantee that an exact solution exists to produce it, nor is there a procedure guaranteed to find the closest possible answer. Gerchberg-Saxton is among the best techniques for approximating a phase retrieval solution, combining Fourier and inverse Fourier transforms and propagating through the Angular Spectrum Method. However, it is not very accurate, and because there is no training, inference is quite slow: around 200 iterations of Fourier layers and gradient descent per planar slice per instant in time.
‘Deep Gerchberg-Saxton’ is a learned replacement for iterative numerical solvers that reduces the inference complexity to around 3 Fourier layers and improves phase retrieval fidelity. The applications are diverse; the upshot is that devices like phased arrays can produce continuous volumetric amplitude fields with low computational power.
Due to IP constraints, some implementation details are redacted.
Key Contribution: Designed a physics-informed neural network (PINN) architecture with wave propagation operators embedded directly in the network, reducing inference from ~200 iterative Fourier propagation steps to ~3 learned Fourier layers while improving phase reconstruction fidelity and enabling improved phased array control in deployed devices.
Skills
ML Production Pipeline ▪︎ Physics Informed Neural Networks (PINNs) ▪︎ Custom Neural Network Architecture Design ▪︎ Loss Function Design for Physical Constraints ▪︎ Inference-Time Complexity Optimization ▪︎ Algorithmic Benchmarking vs Classical Methods ▪︎ ML Inference Optimization for Embedded / Low-Compute Systems ▪︎ Phase Retrieval Algorithms ▪︎ Volumetric Field Synthesis ▪︎ Wave Propagation & Angular Spectrum Method ▪︎ Computational Physics ▪︎ Fourier and Inverse Fourier Transforms ▪︎ Python ▪︎ NumPy ▪︎ Matplotlib ▪︎ TensorFlow ▪︎ PyTorch
Highlighted Role
Jupyter Notebook
Details:
Physics Background: Superposition to Angular Spectrum Method
Superposition is the property of waves, under the linear wave equation assumption, that the amplitude at a given point is the sum of the amplitudes produced at that point by each distinct source wave. Think of how two speakers playing the same music are generally louder than a single speaker on its own. In essence, the amplitudes (a linear counterpart of volume in dB) from the speakers are added together.
The Angular Spectrum Method (ASM) becomes useful when there are many source signals, say over 150, and we are interested in the amplitude over an area rather than at a single point. We use ASM because of its reduced computational complexity: O(N·M) for direct superposition versus O(M log M) for ASM, with N sources and M evaluation points.
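To make the O(N·M) cost of direct superposition concrete, here is a minimal sketch that sums the contribution of every source at every evaluation point. It assumes idealized monopole sources (real transducers are directional), and the function name, source positions, and the 8.6 mm wavelength (roughly 40 kHz ultrasound in air) are illustrative choices, not values from the project.

```python
import numpy as np

def direct_superposition(src_xyz, src_phase, eval_xyz, wavelength):
    """Sum the complex field of N point sources at M points: O(N*M) work."""
    k = 2 * np.pi / wavelength  # wavenumber
    # Pairwise source-to-point distances, shape (M, N).
    r = np.linalg.norm(eval_xyz[:, None, :] - src_xyz[None, :, :], axis=-1)
    # Spherical wave from each source (idealized monopole, an assumption).
    contributions = np.exp(1j * (k * r + src_phase[None, :])) / r
    # Superposition: add the contributions of all sources at each point.
    return contributions.sum(axis=1)

# Two sources equidistant from a single evaluation point (illustrative values).
sources = np.array([[-0.01, 0.0, 0.0], [0.01, 0.0, 0.0]])
points = np.array([[0.0, 0.0, 0.1]])
in_phase = direct_superposition(sources, np.array([0.0, 0.0]), points, 0.0086)
out_of_phase = direct_superposition(sources, np.array([0.0, np.pi]), points, 0.0086)
```

With equal phases the two contributions add, and with a phase offset of π they cancel. The (M, N) distance matrix is what makes this approach expensive as the number of sources and points grows.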
(Linear Wave Equation Assumption)
Here we see the graphs of two sine waves with the same amplitude and different frequencies plotted on a 2D graph. On the bottom we see a simple sum of the two functions.
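A minimal sketch of the same idea in code: two sine waves of equal amplitude and different frequency, added sample by sample. The sample rate and the two frequencies are assumed values chosen only for illustration.

```python
import numpy as np

fs = 1000                      # samples per second (assumed)
t = np.arange(0, 1, 1 / fs)    # one second of samples
f1, f2 = 5.0, 12.0             # frequencies in Hz (assumed)

wave_a = np.sin(2 * np.pi * f1 * t)
wave_b = np.sin(2 * np.pi * f2 * t)

# Linear superposition: the combined wave is the pointwise sum.
superposed = wave_a + wave_b
```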
Fourier Transforms
Fourier transforms, in simple terms, convert a spatial or time-domain representation of a function into a frequency representation. The graph here shows the frequency components of the superimposed function from the graph above, with frequency on the x-axis and relative magnitude on the y-axis. Notice that, for a linear sum, the prevalent frequencies are the component frequencies themselves; sum and difference frequencies would appear only if the waves were multiplied or otherwise mixed nonlinearly.
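As a small illustration, the sketch below takes the discrete Fourier transform of a linear sum of two sine waves and reads off the strongest frequencies. For a purely linear sum like this one, the magnitude spectrum peaks at the two component frequencies. The sample rate and frequencies are assumed values.

```python
import numpy as np

fs = 1000                      # samples per second (assumed)
t = np.arange(0, 1, 1 / fs)
f1, f2 = 5.0, 12.0             # component frequencies in Hz (assumed)
signal = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# Magnitude spectrum of a real signal and the frequency of each bin.
magnitude = np.abs(np.fft.rfft(signal)) / len(t)
freqs = np.fft.rfftfreq(len(t), d=1 / fs)

# The two strongest bins, sorted by frequency.
peaks = np.sort(freqs[np.argsort(magnitude)[-2:]])
```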
Angular Spectrum Method
The Angular Spectrum Method follows these four steps for propagating a sampled planar acoustic field (in this application the phased array plane) to a desired plane (in this case the user’s hand):
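One common way to break ASM into four steps is sketched below; the numbered comments mark them. This breakdown, the function name, and the choice to discard evanescent components are assumptions for illustration and may differ from the implementation used in the project.

```python
import numpy as np

def asm_propagate(field, dx, wavelength, z):
    """Propagate a sampled complex planar field a distance z (ASM sketch)."""
    ny, nx = field.shape
    k = 2 * np.pi / wavelength
    # 1) Decompose the sampled field into plane waves with a 2D FFT.
    spectrum = np.fft.fft2(field)
    # 2) Compute the spatial frequency of each plane-wave component.
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    FX, FY = np.meshgrid(fx, fy)
    kz_sq = k**2 - (2 * np.pi * FX) ** 2 - (2 * np.pi * FY) ** 2
    # 3) Apply the propagation transfer function exp(i*kz*z);
    #    evanescent components (kz_sq <= 0) are dropped (an assumption).
    propagating = kz_sq > 0
    kz = np.sqrt(np.where(propagating, kz_sq, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * z), 0.0)
    # 4) Recombine the plane waves at the new plane with an inverse FFT.
    return np.fft.ifft2(spectrum * transfer)

# A uniform plane wave should only pick up the phase k*z (illustrative values).
wavelength = 0.0086
plane_wave = np.ones((16, 16), dtype=complex)
out = asm_propagate(plane_wave, dx=wavelength, wavelength=wavelength, z=0.1)
```

Passing a negative z propagates backward, which is what the Gerchberg-Saxton loop relies on.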
Physics Background: Gerchberg-Saxton Algorithm for Phase Retrieval
The Gerchberg-Saxton algorithm is among the most widely used methods for phase retrieval. However, it is inefficient and lossy, which has led to many hybrid and specialized techniques tailored to particular applications. Its basic building blocks, using ASM and updating the amplitude field on each forward and backward propagation, lend themselves extremely well to deep learning. In essence, the deep learning component that I propose answers the question of which updates should be made at each iteration of ASM to best converge to the optimal solution.
Source
Gerchberg-Saxton (GS)
Gerchberg-Saxton (GS) first takes an initial guess at the phase in the emission plane (the phased array plane) and combines it with the known amplitude. Then it uses ASM for forward propagation to the target plane (a user’s hand) and constrains the amplitude to the target image (the desired acoustic field to produce tactile sensations), keeping the phase components. Next, ASM backward propagation brings the field back to the emission plane, where the amplitude is constrained to the known amplitude.
This process is repeated until the error criterion is met.
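The loop described above can be sketched as follows. This is classical GS only, not the learned system; the function names, the mean squared amplitude error used as the stopping criterion, the grid size, and the physical parameters are all assumptions made for illustration.

```python
import numpy as np

def asm(field, dx, wavelength, z):
    """Angular Spectrum Method: propagate a complex planar field by z."""
    ny, nx = field.shape
    FX, FY = np.meshgrid(np.fft.fftfreq(nx, d=dx), np.fft.fftfreq(ny, d=dx))
    kz_sq = (2 * np.pi / wavelength) ** 2 - (2 * np.pi * FX) ** 2 - (2 * np.pi * FY) ** 2
    propagating = kz_sq > 0
    kz = np.sqrt(np.where(propagating, kz_sq, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def gerchberg_saxton(target_amp, source_amp, dx, wavelength, z,
                     n_iter=200, tol=1e-6, seed=0):
    """Classical GS: alternate amplitude constraints between two planes."""
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0, 2 * np.pi, size=target_amp.shape)  # initial guess
    err = np.inf
    for _ in range(n_iter):
        # Emission plane: known amplitude with the current phase estimate.
        emitted = source_amp * np.exp(1j * phase)
        # Forward propagation to the target plane.
        at_target = asm(emitted, dx, wavelength, z)
        # Error criterion (assumed): mean squared amplitude error.
        err = float(np.mean((np.abs(at_target) - target_amp) ** 2))
        if err < tol:
            break
        # Constrain the amplitude to the target, keeping the phase.
        constrained = target_amp * np.exp(1j * np.angle(at_target))
        # Backward propagation; only the phase is kept at the emission plane.
        phase = np.angle(asm(constrained, dx, wavelength, -z))
    return phase, err

# Illustrative run on a target known to be reachable (all values assumed).
wavelength, z = 0.0086, 0.1
dx = wavelength
rng = np.random.default_rng(1)
source_amp = np.ones((32, 32))
true_phase = rng.uniform(0, 2 * np.pi, (32, 32))
target_amp = np.abs(asm(source_amp * np.exp(1j * true_phase), dx, wavelength, z))
phase_1, err_1 = gerchberg_saxton(target_amp, source_amp, dx, wavelength, z, n_iter=1, tol=0.0)
phase_50, err_50 = gerchberg_saxton(target_amp, source_amp, dx, wavelength, z, n_iter=50, tol=0.0)
```

Comparing `err_1` with `err_50` shows the error shrinking over iterations, and also why inference is costly: every iteration needs two full propagations.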
The limitations of Gerchberg-Saxton have been widely discussed. The most important is that convergence is not guaranteed: the algorithm often gets stuck in local minima. There are also the aforementioned inference complexity issues.
Limitations of Gerchberg-Saxton
Deployed Solution
Implementation details have been redacted due to IP constraints.
The core of the proposed system is to keep the intuition behind the iterative improvements typical of GS while using deep learning to make those updates, reaching the error criterion more often and with less computational complexity during inference than traditional GS.
Problem Formalization
Input: Discrete representation of target amplitude field, ŷ, at image plane.
Output: Discrete representation of amplitude-restricted phase distribution, y, at emission plane.
Objective: Minimize the reconstruction error between the target field, ŷ, and the actual field, ASM(y), obtained by propagating the planar phase distribution and amplitude from the emission plane to the target plane, i.e. Minimize(Loss( ASM(y), ŷ )).
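The objective can be written as a plain function, sketched here with a mean squared error on amplitudes. The project's actual loss is custom and redacted, so the MSE form, the function name, and the `propagate` callable (standing in for ASM) are assumptions.

```python
import numpy as np

def reconstruction_loss(phase, source_amp, target_amp, propagate):
    """Loss(ASM(y), y_hat): amplitude error after propagation (MSE assumed)."""
    # Field leaving the emission plane: fixed amplitude, candidate phase y.
    emitted = source_amp * np.exp(1j * phase)
    # Amplitude actually produced at the target plane.
    produced_amp = np.abs(propagate(emitted))
    # Compare with the desired amplitude field y_hat.
    return float(np.mean((produced_amp - target_amp) ** 2))

# Sanity check with a trivial propagator (identity), for illustration only.
amp = np.ones((8, 8))
phase = np.zeros((8, 8))
zero_loss = reconstruction_loss(phase, amp, amp, propagate=lambda f: f)
unit_loss = reconstruction_loss(phase, amp, amp + 1.0, propagate=lambda f: f)
```

Passing the propagator in as an argument keeps the loss independent of how propagation is implemented.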
Data Synthesis
These are simple examples of the data synthesis technique used. Unlike some machine learning paradigms, PINNs often use synthetic data generated according to known physical laws or desired physical states.
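As a purely hypothetical illustration of synthesizing "desired physical states", the sketch below builds target amplitude fields from a few randomly placed Gaussian focal spots. The function name, the Gaussian spot shape, and every parameter are assumptions; the project's actual synthesis procedure is not described in enough detail to reproduce.

```python
import numpy as np

def synth_target(n=64, n_points=3, sigma=2.0, rng=None):
    """Hypothetical target amplitude field: a few Gaussian focal spots."""
    rng = np.random.default_rng() if rng is None else rng
    yy, xx = np.mgrid[0:n, 0:n]
    target = np.zeros((n, n))
    for _ in range(n_points):
        # Random spot centre, kept away from the edges of the plane.
        cx, cy = rng.uniform(0.2 * n, 0.8 * n, size=2)
        target += np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma**2))
    # Normalize so the strongest point has amplitude 1.
    return target / target.max()

# A small batch of synthetic targets with a fixed seed for repeatability.
rng = np.random.default_rng(0)
batch = np.stack([synth_target(rng=rng) for _ in range(4)])
```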
Methodology
I followed machine learning best practices to prevent overfitting and data leakage and to estimate performance accurately with appropriate metrics.
This includes a 70/15/15 train/test/validate split followed by partitioned normalization, a custom MSE-based loss function, and hyperparameter tuning constrained to the training folds.
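A minimal sketch of a leakage-aware split is shown below. It reads "partitioned normalization" as computing normalization statistics on the training partition only and applying them to the other partitions; that reading, the function name, and the use of a single global mean and standard deviation are assumptions.

```python
import numpy as np

def split_and_normalize(x, seed=0, fractions=(0.70, 0.15, 0.15)):
    """70/15/15 train/test/validate split with train-only statistics."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))  # shuffle before splitting
    n_train = int(round(fractions[0] * len(x)))
    n_test = int(round(fractions[1] * len(x)))
    train = x[idx[:n_train]]
    test = x[idx[n_train:n_train + n_test]]
    val = x[idx[n_train + n_test:]]
    # Statistics come from the training partition only, so nothing about
    # the held-out data leaks into preprocessing.
    mean, std = train.mean(), train.std()

    def normalize(a):
        return (a - mean) / (std + 1e-8)

    return normalize(train), normalize(test), normalize(val)

# Illustrative data: 100 random 8x8 fields.
data = np.random.default_rng(0).normal(3.0, 2.0, size=(100, 8, 8))
train, test, val = split_and_normalize(data)
```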