Following the discussion on running large language models (LLMs) on Apple's Neural Engine (ANE), users are examining the performance advantages Apple claims for ANE-optimized transformers: 'up to 10 times faster and 14 times lower peak memory consumption' compared to baseline implementations. The debate touches on the limitations of existing frameworks such as MLX and llama.cpp, which have not integrated ANE support because the ANE is reachable only through the Core ML API rather than a low-level programming interface. Some users are skeptical that the ANE is worth targeting at all, suggesting Apple's resources would be better spent on GPU capability instead. There is also curiosity about whether the neural cores in Apple Silicon could be used for training, since current evidence points to the ANE being effective mainly for inference. The thread reflects ongoing interest in extracting the full hardware potential of Apple devices.
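For readers who want to experiment with the constraint the thread describes, Core ML is currently the only public route to the ANE, and even then the framework decides operator placement rather than guaranteeing it. Below is a minimal sketch using coremltools to convert a toy PyTorch model and permit ANE execution; the `TinyMLP` model, file name, and tensor shapes are hypothetical placeholders, not anything from the discussion:

```python
# Minimal sketch: convert a PyTorch module to Core ML and permit ANE execution.
# Assumes coremltools >= 7 and torch are installed; the toy model is a placeholder.
import torch
import coremltools as ct


class TinyMLP(torch.nn.Module):
    """Stand-in model; any traceable torch module converts the same way."""

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, 10),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


example = torch.randn(1, 128)
traced = torch.jit.trace(TinyMLP().eval(), example)

# compute_units only *permits* the ANE; Core ML silently falls back to
# GPU/CPU for any op the ANE cannot run, and placement is not exposed here.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example.shape)],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)
mlmodel.save("tiny_mlp.mlpackage")

# Prediction goes through Core ML; keys must match the declared input names.
out = mlmodel.predict({"x": example.numpy()})
print(out)
```

Note that there is no public API to confirm at runtime that the ANE was actually used; placement can only be inspected indirectly, for example with Xcode's Core ML performance reports. This opacity is much of why frameworks like MLX and llama.cpp have stayed on the GPU path.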