
Academic Project
Specifically looked into and improved hybrid-reasoning models using LoRA and novel hybrid-training technique. Regular supervised fine-tuning on reasoning models causes them to forget their reasoning capabilities. This is solved using my novel hybrid training technique.