Thoughts on language, markets, numerical computation, among other things.
Comparing single-shot LLM paper selection against multi-run consensus voting.
When evaluating LLMs, the order you present data matters more than you’d think. I found that models consistently favor the second option when comparing summaries—unless you make them think first.
Long documents don’t fit in context windows. Here’s a practical chunking approach that iteratively distills text into concise notes, preserving what matters.
Quantization saves memory, but does it hurt quality on smaller 7B models? I ran the same summarization task across Q3, Q4, Q5, and Q8 variants. The differences were smaller than expected.
There’s drama on HuggingFace about data contamination in top-ranked models. I’ve been using una-cybertron-7B-v2 for summarization—time to take a closer look at what’s actually going on.
Q4_K_M, Q5_K, Q8_0—what do these cryptic codes mean? A quick reference for picking the right quantization method when running LLMs locally.
Does GPT-4 justify its cost for domain-specific tasks? I test both GPT-3.5 and GPT-4 on financial news sentiment classification to find out.
We talk about addiction in terms of substances, but the real dependencies run deeper. Some thoughts on the attachments we rarely examine.
A distillation of evidence-based longevity practices: exercise, sleep, diet, supplements. No magic bullets—just what the research actually supports.
Cholesky decomposition breaks a matrix into triangular parts. Useful for solving linear systems, simulating correlated variables, and speeding up optimization.
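The decomposition and the correlated-sampling trick mentioned above can be sketched in a few lines of NumPy (the matrix values here are just illustrative):

```python
import numpy as np

# A symmetric positive-definite covariance matrix (illustrative values)
cov = np.array([[4.0, 2.0],
                [2.0, 3.0]])

# The Cholesky factor L is lower-triangular and satisfies L @ L.T == cov
L = np.linalg.cholesky(cov)

# Simulating correlated variables: push independent standard normals
# through L, and the resulting samples have covariance ~cov
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 10_000))
x = L @ z
print(np.cov(x))  # close to cov
```

The same factor also speeds up solving `cov @ v = b`: two triangular solves instead of a general one.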
Word embeddings turn text into vectors that encode meaning. I walk through building one from scratch using TensorFlow and the Gutenberg Encyclopedia.
Neural networks are powerful but expensive. For simpler image tasks, PCA can get you surprisingly far. Here I classify landscapes vs impressionist paintings with just eigenvectors.
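A minimal sketch of the eigenvector approach, on synthetic stand-in data rather than real images (the two classes and the nearest-centroid classifier are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
# Stand-ins for flattened image vectors: two classes with different structure
class_a = rng.normal(0.0, 1.0, (50, 64)) + np.linspace(0, 3, 64)  # "landscapes"
class_b = rng.normal(0.0, 1.0, (50, 64)) - np.linspace(0, 3, 64)  # "paintings"
X = np.vstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

# PCA via eigendecomposition of the covariance matrix
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
top = eigvecs[:, np.argsort(eigvals)[::-1][:5]]  # top 5 eigenvectors

# Project onto the eigenvectors, then classify by nearest class centroid
Z = Xc @ top
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(((Z[:, None, :] - centroids) ** 2).sum(-1), axis=1)
print("train accuracy:", (pred == y).mean())
```

No gradient descent, no GPU: just a covariance matrix and its eigenvectors.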
FML posts are depressing; JustMadeMyDay posts are uplifting. I build a Naive Bayes classifier to tell them apart—and see which words carry the most weight.
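The core of that classifier fits in a few lines: Laplace-smoothed word likelihoods per class, and a per-word log-odds ratio to see which words carry the most weight (the toy corpus below is invented for illustration):

```python
from collections import Counter
import math

# Tiny stand-ins for FML (negative) and JustMadeMyDay (positive) posts
neg_docs = ["my car broke down again", "i failed the exam today"]
pos_docs = ["my friend surprised me with cake", "i passed the exam today"]

def word_counts(docs):
    c = Counter()
    for d in docs:
        c.update(d.split())
    return c

neg, pos = word_counts(neg_docs), word_counts(pos_docs)
vocab = set(neg) | set(pos)

def log_likelihood(word, counts):
    # Laplace-smoothed log P(word | class)
    total = sum(counts.values())
    return math.log((counts[word] + 1) / (total + len(vocab)))

# Per-word weight: log-odds ratio between the two classes
weights = {w: log_likelihood(w, pos) - log_likelihood(w, neg) for w in vocab}
for w in sorted(weights, key=weights.get, reverse=True)[:3]:
    print(w, round(weights[w], 2))
```

Words that appear only in one class end up with the largest absolute weights, which is exactly what the post digs into.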
I had 1,363 Bukowski poems lying around from a neural network project. Might as well run sentiment analysis and see what themes emerge from all that beautiful misery.