Maruf Bepary - Projects: Alignment in LLMs

Academic Project

Description

This project investigated and improved hybrid-reasoning models using LoRA and a novel hybrid-training technique. Regular supervised fine-tuning of reasoning models causes them to forget their reasoning capabilities. The hybrid-training technique developed in this project addresses that problem.
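For readers unfamiliar with LoRA, the following is a minimal NumPy sketch of the general idea: a frozen weight matrix plus a trainable low-rank update. It is not the project's code and does not show the hybrid-training technique, which this description does not detail. All names here (`lora_forward`, `merge_lora`, the shapes, and the `alpha` scaling value) are illustrative assumptions.

```python
import numpy as np


def lora_forward(x, W, A, B, alpha, r):
    """Linear layer output with a LoRA adapter applied.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    The adapter adds (alpha / r) * B @ A to the effective weight.
    """
    base = x @ W.T
    update = (x @ A.T) @ B.T
    return base + (alpha / r) * update


def merge_lora(W, A, B, alpha, r):
    """Fold the low-rank update into a single dense weight matrix."""
    return W + (alpha / r) * (B @ A)


# Hypothetical sizes chosen only for illustration.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 4, 2, 4.0

W = rng.normal(size=(d_out, d_in))      # frozen base weights
A = rng.normal(size=(r, d_in)) * 0.01   # small random init
B = np.zeros((d_out, r))                # zero init: adapter starts as a no-op
x = rng.normal(size=(3, d_in))          # batch of 3 inputs

y_initial = lora_forward(x, W, A, B, alpha, r)

# Pretend training has changed B; the merged weight gives the same output.
B_trained = rng.normal(size=(d_out, r))
y_adapter = lora_forward(x, W, A, B_trained, alpha, r)
y_merged = x @ merge_lora(W, A, B_trained, alpha, r).T
```

Only `A` and `B` would be trained in this setup, which is `r * (d_in + d_out)` parameters rather than `d_in * d_out`. Because `B` starts at zero, the adapted layer initially reproduces the base model's output exactly.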

Language

Python

Links

Repository

Report
