felix089 a day ago

How did you structure the dataset for FT? Reminds me of: https://rosslazer.com/posts/fine-tuning/

  • jonpizza 17 hours ago

    I chunked my conversations by day so that each conversation in the dataset would be about the same topic throughout without random switching, which isn't perfect, ideally I would let an LLM chunk the conversations into logical start/stop points, but I didn't want to spend all that money on tokens. I also got rid of any conversations with images and group chat conversations to simplify.

ungreased0675 a day ago

Have you tried talking to yourself with this? Were there any unexpected insights?

  • jonpizza 17 hours ago

    I asked "myself" what my greatest fear was and it actually gave me an accurate answer. Then I asked it again and it said "Clowns". I don't think it was particuarly insightful, but it was slightly eeire. The tone and style are kinda spot on tbh, though the content is generally incorrect.

bn-l a day ago

I tried it out. Were many of the texts about hooking up and “cuties”?

  • jonpizza a day ago

    Not really. There are some, but I do think generally my conversations kinda biased the model over towards adjacent topics. I also included something in the system message along those lines ("Hi, I have a cute girl I want you to meet") for a sample response to a basic input like "Hi" just to make the conversation more interesting.