Update LLMEval example by timsneath · Pull Request #452 · ml-explore/mlx-swift-examples

timsneath · 2025-12-10T23:45:40Z

Proposed changes

This PR modernizes the LLMEval example application by refactoring it from a monolithic ContentView into a clean MVVM architecture and adding several new features.

New Features

Improved metrics panel: Better visual hierarchy for performance statistics
Preset prompts: Curated prompt library with support for tools and thinking mode, including long-form prompts
Enhanced loading UX: Visual progress indicators for model downloads with file counts
Collapsible prompt area: Expandable text input for longer prompts

Architecture Refactor

Extracted business logic into LLMEvaluator view model with improved state management
Split UI into focused, reusable components: HeaderView, OutputView, PromptInputView, MetricsView, LoadingOverlayView, PresetPromptsSheet
Created dedicated service layer with ToolExecutor for function calling
Organized models into separate files (PresetPrompts, ToolDefinitions)
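
For readers unfamiliar with the pattern, the split described above can be sketched roughly as follows. This is not the code from this PR: only the type names LLMEvaluator and ToolExecutor come from the description, and every member, the "add" tool, and the string-keyed argument format are assumptions made for illustration. The sketch leaves out SwiftUI and MLX so that it runs as a plain Swift script.

```swift
// Hypothetical sketch, NOT the implementation in this PR.
// Only the names LLMEvaluator and ToolExecutor appear in the PR description;
// all members and the "add" tool below are illustrative assumptions.

/// Service layer: looks up a tool by name and runs its handler.
struct ToolExecutor {
    enum ToolError: Error, Equatable {
        case unknownTool(String)
    }

    typealias Handler = ([String: String]) -> String

    private var handlers: [String: Handler] = [:]

    init() {}

    /// Registers a handler under a tool name (replacing any previous one).
    mutating func register(_ name: String, handler: @escaping Handler) {
        handlers[name] = handler
    }

    /// Runs the named tool, or throws if no handler was registered.
    func execute(_ name: String, arguments: [String: String]) throws -> String {
        guard let handler = handlers[name] else {
            throw ToolError.unknownTool(name)
        }
        return handler(arguments)
    }
}

/// View model: owns the state that views would render, so the views
/// themselves contain no business logic.
final class LLMEvaluator {
    private(set) var output = ""
    private(set) var isRunning = false
    private let tools: ToolExecutor

    init(tools: ToolExecutor) {
        self.tools = tools
    }

    /// Executes a tool call and stores the result (or the error) in `output`.
    func runTool(_ name: String, arguments: [String: String]) {
        isRunning = true
        defer { isRunning = false }
        do {
            output = try tools.execute(name, arguments: arguments)
        } catch {
            output = "Tool call failed: \(error)"
        }
    }
}

// Usage: register one toy tool and call it through the view model.
var tools = ToolExecutor()
tools.register("add") { args in
    let a = Int(args["a"] ?? "") ?? 0
    let b = Int(args["b"] ?? "") ?? 0
    return String(a + b)
}
let evaluator = LLMEvaluator(tools: tools)
evaluator.runTool("add", arguments: ["a": "2", "b": "3"])
print(evaluator.output)  // prints "5"
```

In a SwiftUI app the view model would typically also be made observable so that views update when `output` changes; that wiring is omitted here to keep the sketch self-contained.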

Configuration Updates

Changed default model from Phi-2 to Qwen3-8B-4bit to demonstrate improved prompt loading on M5 hardware
Updated README with clearer instructions for switching models
Added an app icon
Cleaned up entitlements

Checklist

Put an x in the boxes that apply.

I have read the CONTRIBUTING document
I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
I have added tests that prove my fix is effective or that my feature works
I have updated the necessary documentation (if needed)

davidkoski · 2025-12-11T00:54:22Z

The changes look good, but some things to think about:

- I think we want a very simple example that people could start with -- LLMEval is probably that app
  - that isn't to say that it shouldn't be improved or cleaned up, but feature-wise I think it should stay pretty bare
  - and ideally with few lines of code
  - but it hasn't been touched much for a long time so it can probably be modernized a bit!
- the MLXChatExample app, on the other hand, is a little more feature rich and shows what you might build
  - I wonder if these changes might be more appropriate there?

OR

- do we need a new application that better showcases features like feeding it the .md files as part of the prompt?
  - then we could actually simplify the LLMEval app to contain just the barest features

We should probably provide more guidance documentation-wise as to what to expect from each.

What do you think?

davidkoski · 2025-12-11T16:29:10Z

After offline discussion, the new plan is:

- take these improvements
- I will make a new "hello world" LLM application as a minimal example

davidkoski

Thank you for this improvement!

Update LLMEval example

c18241e

davidkoski reviewed Dec 11, 2025

Run swift-format and fix pbxproj

e6c41ee

davidkoski reviewed Dec 11, 2025

Tweak pbxproj for distribution

302dbd0

davidkoski approved these changes Dec 11, 2025

davidkoski merged commit fc3afc7 into ml-explore:main Dec 11, 2025
2 checks passed

timsneath deleted the llm-eval-new branch December 11, 2025 18:36

timsneath restored the llm-eval-new branch December 11, 2025 18:40

timsneath deleted the llm-eval-new branch December 11, 2025 18:41

davidkoski merged 3 commits into ml-explore:main from timsneath:llm-eval-new
