Maruf Bepary


© 2023-2025 Maruf Bepary

Alignment in LLMs

Academic Project

Description

Investigated and improved hybrid-reasoning models using LoRA and a novel hybrid-training technique. Standard supervised fine-tuning of reasoning models causes them to forget their reasoning capabilities. My hybrid-training technique addresses this problem.
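For readers unfamiliar with LoRA, here is a minimal sketch of the general idea in plain Python: a frozen weight matrix is left untouched, and only a small low-rank update is trained. It illustrates LoRA in general and is not the project's code or the hybrid-training technique, which this description does not detail. The class name `LoRALinear`, the helper `matvec`, and the default `rank` and `alpha` values are all hypothetical choices made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical illustration: frozen weight W plus a low-rank update B @ A."""

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen base weights, never updated
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially behaves exactly like the base layer.
        self.a = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.b = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.b, matvec(self.a, x))
        return [y + self.scale * u for y, u in zip(base, update)]

    def trainable_parameter_count(self):
        # Only A and B would receive gradient updates during fine-tuning.
        return sum(len(row) for row in self.a) + sum(len(row) for row in self.b)


weight = [[0.5, -1.0, 0.0, 2.0],
          [1.0, 0.0, 0.5, -0.5],
          [0.0, 1.5, -1.0, 1.0]]
layer = LoRALinear(weight, rank=2, alpha=4.0)
x = [1.0, 2.0, 3.0, 4.0]
print(layer.forward(x))
print(layer.trainable_parameter_count())
```

Because `B` starts at zero, the first forward pass matches the base model. Fine-tuning then moves only the entries of `A` and `B`, which are far fewer than the entries of the full weight matrix in a realistically sized layer.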

Language

Python

Links

Repository

Report