Large Language Model Testing Tool
The definitive proof AI can't reason. A browser-based tool for testing and evaluating large language models in real time.
Jul 11, 2025
The quick brown fox pangram lookup test. Create a language that maps real words to words drawn from the target word list. Then ask the LLM to work out the sentence and fill in the last word. Pick a unique seed number so that the model has not already seen the solved puzzle.
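The mapping step described above can be sketched roughly as follows. This is a minimal illustration in Python, not the tool's actual code: the function name `build_puzzle`, the use of Python's `random.Random` for seeding, the `____` blank marker, and the choice to expect the substitute word (rather than the real word) as the answer are all assumptions made for the example.

```python
import random

# The sentence to hide; the source names "The quick brown fox..." as the pangram.
PANGRAM = "the quick brown fox jumps over the lazy dog"


def build_puzzle(seed: int, word_list: list[str], sentence: str = PANGRAM) -> dict:
    """Hypothetical sketch: map each distinct real word to a distinct word
    from word_list, encode the sentence, and blank out the last word."""
    real_words = sentence.split()
    distinct = sorted(set(real_words))

    # Deduplicate and sort the pool so the result depends only on seed + contents.
    pool = sorted(set(word_list))
    if len(pool) < len(distinct):
        raise ValueError("word list is too small for this sentence")

    # A seeded generator makes the mapping reproducible for a given seed.
    rng = random.Random(seed)
    substitutes = rng.sample(pool, len(distinct))
    mapping = dict(zip(distinct, substitutes))

    encoded = [mapping[word] for word in real_words]
    return {
        "mapping": mapping,                           # shown to the LLM as the "language"
        "prompt": " ".join(encoded[:-1]) + " ____",   # encoded sentence, last word hidden
        "expected": encoded[-1],                      # assumed answer format: the substitute word
    }
```

Calling `build_puzzle(12345, words)` twice with the same `words` returns the same puzzle, while a different seed will usually reshuffle which substitute stands for which real word.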
Generate the output, pass it to a large language model, and then check the result.
While the input and seeding are deterministic, be careful: the word list may change if you reload the page and the server refreshes the list, in which case the same seed may produce a different puzzle.
Over time, LLMs learn to answer correctly for common inputs. However, they struggle to produce the desired output for less common seeds. Deep Research, which incorporates search and tool capabilities, can help the LLM find the answer. For instance, a search based on a phrase like "The quick brown fox..." will directly yield the sentence, indicating that the LLM can retrieve a correct result without requiring significant intelligence. Conversely, "think longer" style prompts, where the LLM is run repeatedly, demand genuine thinking or complex reasoning and therefore often fail.
The code can be found at the repository: github.com/snowdon-dev/node-llm-test