Projects
A non-exhaustive list of projects I’ve worked on!
We explore methods for compressing and accelerating Stable Diffusion. The final compressed model reduces memory footprint by 80% and generates images ~4x faster, while maintaining text-to-image quality.
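A minimal sketch of one acceleration direction, assuming the `diffusers` library and a standard Stable Diffusion v1.5 checkpoint: half-precision weights plus memory-efficient attention. The project's full 80% reduction relied on stronger compression (e.g. distillation/quantization) not reproduced here.

```python
# Illustrative only: half-precision Stable Diffusion with attention slicing.
# This is NOT the project's full compression pipeline, just one simple lever.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed base checkpoint
    torch_dtype=torch.float16,          # halves weight memory vs fp32
)
pipe.enable_attention_slicing()         # trades a little speed for lower peak memory
pipe = pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
image.save("sample.png")
```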
We build and train custom predictive models to understand the trends of key economic indicators in Kenya. Our findings are presented in an interactive web visualization, built with RShiny, CSS and JS, to surface actionable insights.
I propose and implement a novel method for training functionally diverse deep neural network ensembles using a repulsive loss. I show that the proposed repulsive loss yields higher entropy, better calibration and better uncertainty quantification than standard deep ensembles.
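A minimal sketch of the idea, assuming a simple penalty-based repulsion: an RBF kernel measures how similar the ensemble members' predictive distributions are, and that similarity is added to the loss so members are pushed apart in function space. The actual repulsive loss in the project may take a different form (e.g. kernel gradients rather than a penalty).

```python
# Toy repulsive-ensemble training step (illustrative, not the project's exact loss).
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_member():
    return nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 3))

members = nn.ModuleList([make_member() for _ in range(5)])
opt = torch.optim.Adam(members.parameters(), lr=1e-3)

def repulsive_loss(x, y, beta=0.1, bandwidth=1.0):
    logits = torch.stack([m(x) for m in members])                # (M, B, C)
    nll = torch.stack([F.cross_entropy(l, y) for l in logits]).mean()

    # RBF-kernel similarity between members' predictive distributions;
    # minimizing the mean kernel value pushes members apart (repulsion).
    probs = logits.softmax(-1).flatten(1)                        # (M, B*C)
    sq_dists = torch.cdist(probs, probs).pow(2)                  # (M, M)
    repulsion = torch.exp(-sq_dists / (2 * bandwidth ** 2)).mean()

    return nll + beta * repulsion

# One training step on random data, just to show the update.
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
opt.zero_grad()
loss = repulsive_loss(x, y)
loss.backward()
opt.step()
```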
I implement a new deep neural network architecture for sequence modelling (S4) as a replacement for the transformer in offline reinforcement learning (RL). We show that S4’s ability to handle longer contexts improves offline RL performance on standard benchmarks by ~10%.
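A heavily simplified sketch of the architectural swap: a diagonal state-space layer in the spirit of S4/S4D replaces causal self-attention over the (return, state, action) token sequence. This omits the real S4 machinery (HiPPO initialization, convolutional kernel computation) and uses a plain sequential scan for clarity.

```python
# Simplified diagonal SSM layer standing in for S4 in a Decision-Transformer-style model.
import torch
import torch.nn as nn

class SimpleDiagonalSSM(nn.Module):
    def __init__(self, d_model, d_state=16):
        super().__init__()
        # Discrete-time diagonal dynamics: h_t = a * h_{t-1} + b * u_t, y_t = sum(c * h_t)
        self.log_a = nn.Parameter(torch.rand(d_model, d_state).log())  # keeps a in (0, 1)
        self.b = nn.Parameter(torch.randn(d_model, d_state) * 0.1)
        self.c = nn.Parameter(torch.randn(d_model, d_state) * 0.1)

    def forward(self, u):                        # u: (batch, time, d_model)
        a = self.log_a.exp()                     # stable per-dimension decay
        h = torch.zeros(u.size(0), *self.b.shape, device=u.device)  # (B, D, N)
        ys = []
        for t in range(u.size(1)):               # sequential scan over the trajectory
            h = a * h + self.b * u[:, t].unsqueeze(-1)
            ys.append((self.c * h).sum(-1))      # (B, D)
        return torch.stack(ys, dim=1)            # (B, T, D)

# Embedded (return-to-go, state, action) tokens would pass through stacked
# SimpleDiagonalSSM + MLP blocks instead of attention blocks.
layer = SimpleDiagonalSSM(d_model=128)
tokens = torch.randn(4, 60, 128)                 # e.g. 20 timesteps x 3 tokens
print(layer(tokens).shape)                       # torch.Size([4, 60, 128])
```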
Designed a custom Generative Adversarial Network (GAN), trained from scratch, to closely mimic specific digital art styles. The network learns from small datasets (~2000 images) without typical GAN overfitting issues and is trainable in <20 hours on consumer-grade hardware. Artwork generation is guided by image- or text-based inputs powered by CLIP.
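A minimal sketch of text-guided generation via CLIP, assuming OpenAI's `clip` package: a latent vector is optimized so the generated image's CLIP embedding matches the prompt's. The tiny `generator` below is an untrained placeholder for the project's custom GAN, included only so the snippet runs end to end.

```python
# CLIP-guided latent search (illustrative; the real generator and schedule differ).
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP package

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
text = clip.tokenize(["a pixel-art castle at sunset"]).to(device)
text_feat = F.normalize(clip_model.encode_text(text), dim=-1).detach()

# Placeholder generator mapping a 512-d latent to a 64x64 RGB image in [0, 1].
generator = torch.nn.Sequential(
    torch.nn.Linear(512, 3 * 64 * 64),
    torch.nn.Sigmoid(),
    torch.nn.Unflatten(1, (3, 64, 64)),
).to(device)

z = torch.randn(1, 512, device=device, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)
mean = torch.tensor([0.4815, 0.4578, 0.4082], device=device).view(1, 3, 1, 1)
std = torch.tensor([0.2686, 0.2613, 0.2758], device=device).view(1, 3, 1, 1)

for step in range(200):
    image = generator(z)                                      # (1, 3, 64, 64)
    image = F.interpolate(image, size=224, mode="bilinear")   # CLIP input resolution
    img_feat = F.normalize(clip_model.encode_image((image - mean) / std), dim=-1)
    loss = -(img_feat * text_feat).sum()                      # maximize cosine similarity
    opt.zero_grad()
    loss.backward()
    opt.step()
```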
AI-generated summaries often contain hallucinations and fail to faithfully reflect the source text. We propose a novel method for training seq2seq transformers to correct entity and predicate discrepancies between the source and summary text.
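A minimal sketch of one way to set up such a corrector, assuming a standard seq2seq transformer fine-tuned on (source, possibly-hallucinated summary) → corrected-summary pairs. The checkpoint, separator and example sentences are illustrative, not the project's actual data or recipe.

```python
# Toy fine-tuning step for a summary fact-correction model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

source = "The plant opened in 1994 and employs 2,000 people in Leeds."
hallucinated = "The plant opened in 1984 and employs 200 people in London."
corrected = "The plant opened in 1994 and employs 2,000 people in Leeds."

# Concatenate source and draft summary as input; the target is the corrected summary.
inputs = tokenizer(source + " </s> " + hallucinated, return_tensors="pt",
                   truncation=True, max_length=512)
labels = tokenizer(corrected, return_tensors="pt",
                   truncation=True, max_length=128).input_ids

out = model(**inputs, labels=labels)   # cross-entropy over corrected-summary tokens
out.loss.backward()
```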