Resources
Discord
Research Paper
Information
This AI is based on the paper "Attention Is All You Need". Other transformer models, such as GPT-4o, also use architectures loosely based on this paper.
I used a 60-gigabyte text corpus for training data and a 20-gigabyte text corpus for testing data.
This model does not use any encoder layers (because I'm still learning the math and how it works); instead, it uses 4 decoder layers.
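To illustrate what "decoder only" means, here is a minimal sketch in plain Python, not the actual model code. It stacks 4 layers of causal self-attention, where each token can only look at itself and earlier tokens. Everything here is a simplifying assumption for illustration: the names `causal_self_attention` and `decoder_stack` are hypothetical, and there are no learned weights, no multiple heads, no feed-forward blocks, and no layer normalization, all of which a real decoder layer would have.

```python
import math


def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(x):
    """x is a list of T token vectors (lists of floats); returns T vectors.

    Assumption for brevity: queries, keys, and values are all x itself,
    i.e. there are no learned projection matrices.
    """
    d = len(x[0])
    out = []
    for t in range(len(x)):
        # Causal mask: position t only scores positions 0..t, never the future.
        scores = [
            sum(q * k for q, k in zip(x[t], x[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted average of the visible token vectors.
        out.append(
            [sum(w * x[s][i] for s, w in enumerate(weights)) for i in range(d)]
        )
    return out


def decoder_stack(x, n_layers=4):
    # Apply the same attention step n_layers times, each with a residual add.
    for _ in range(n_layers):
        attended = causal_self_attention(x)
        x = [[a + b for a, b in zip(row, att)] for row, att in zip(x, attended)]
    return x
```

Because of the causal mask, the output for the first token never depends on the tokens that come after it. That property is what lets a decoder-only model generate text one token at a time.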
The model is then embedded in a tkinter UI, with a robot character representing the LLM.
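A rough sketch of how a model can be wired into a tkinter window is shown below. It is not the project's actual UI code: `generate_reply` is a hypothetical stand-in for the trained model, `build_ui` is an illustrative name, and the robot is just a text label where the real UI presumably draws the character.

```python
def generate_reply(prompt: str) -> str:
    """Hypothetical stand-in for the trained model's text generation."""
    return f"(model reply to: {prompt})"


def build_ui(reply_fn=generate_reply):
    # Imported lazily so the reply logic above can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("LLM chat")

    # Placeholder for the robot character that represents the LLM.
    robot = tk.Label(root, text="[robot]", font=("TkDefaultFont", 24))
    robot.pack(padx=10, pady=10)

    # The robot's latest reply is shown here.
    output = tk.Label(root, text="", wraplength=400, justify="left")
    output.pack(padx=10, pady=5)

    entry = tk.Entry(root, width=50)
    entry.pack(padx=10, pady=5)

    def on_send(event=None):
        # Read the prompt, ask the model for a reply, and clear the entry box.
        prompt = entry.get().strip()
        if prompt:
            output.config(text=reply_fn(prompt))
            entry.delete(0, tk.END)

    entry.bind("<Return>", on_send)
    tk.Button(root, text="Send", command=on_send).pack(pady=5)
    return root
```

Calling `build_ui().mainloop()` opens the window. Passing a different `reply_fn` swaps in the real model without touching the UI code.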
Michael's Description
Honestly, I have no idea what to say about this one. It's a very small LLM (an SLM?). It took a VERY long time to train (I have 1 GPU), and I was quite proud of this model.
The character is Mirage from ULTRAKILL, probably one of the best games out there.
Started this in sophomore year after watching ALL of Andrej Karpathy's transformer videos (I love his videos; I watched all of them, not just those).
Visuals