October 17, 2025
By Anastasiia D.
Edge Computing, Edge AI
The need to fit machine learning (ML) solutions into battery-powered devices has created a new kind of engineering standoff. ML models are compute-hungry, and their demands grow exponentially, while battery capacity improves in slow, linear increments. As a result, you can’t just throw in a bigger battery.
Today, power management is not a background task. It’s a core challenge for engineering leaders building the next generation of IoT edge devices, especially in healthcare edge computing, where reliability and runtime are critical.
Simply keeping these devices "always on" – listening, sensing, and analyzing – pushes power limits too far. The consequence is shorter battery life and unwanted heat buildup. Meeting this challenge requires an integrated approach where hardware and software evolve together. Success depends on a strategy that stretches from silicon to software to the battery itself.
Below are four key strategies to help engineering teams design energy-efficient, high-performance ML devices that last longer and run faster.
The foundational layer of any effective power-management strategy for edge computing devices lies in hardware-software co-design. Instead of treating them as separate layers, this approach fosters a dynamic partnership in which software actively leverages the hardware’s power-saving capabilities.
Modern System-on-Chips (SoCs) come equipped with a rich toolkit of power-saving features. These are the physical levers that software can pull to balance performance and consumption in edge AI devices, typically including:

- Dynamic voltage and frequency scaling (DVFS), which lowers clock speed and supply voltage when full performance isn’t needed
- Clock and power gating, which shut off idle functional blocks entirely
- Multiple sleep and deep-sleep states with progressively lower power draw
- Fine-grained power control over peripherals, radios, and memory domains
Hardware alone can’t optimize itself. Software must act as the orchestrator, managing when and how to use those hardware levers.
Machine learning at the edge doesn’t run in neat, predictable cycles. Instead, it comes in bursts — short, intense computations followed by long idle stretches. Static power optimization can’t handle that volatility. The system must ramp up quickly, perform the inference, then slip back to near-zero power. This is where software expertise becomes decisive: OS-level programming, driver tuning, and real-time analytics determine whether an edge device drains its battery or conserves it.
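To make this concrete, here is a minimal sketch of such a burst-then-sleep loop on embedded Linux, assuming the kernel’s cpufreq sysfs interface is exposed and writable; the path, the sensor event hook, and `run_inference()` are illustrative placeholders, not a specific product’s code:

```python
# Sketch: ramp up for an inference burst, then drop back to low power.
# Assumes embedded Linux with the cpufreq sysfs interface available and
# write permission to it; run_inference() is a placeholder.
from pathlib import Path

GOVERNOR = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")

def set_governor(mode: str) -> None:
    """Ask the kernel's cpufreq driver to switch its scaling policy."""
    GOVERNOR.write_text(mode)

def run_inference(sample):
    """Placeholder for the actual model call (e.g., a TFLite interpreter)."""
    ...

def on_sensor_event(sample):
    set_governor("performance")   # ramp up for the short, intense burst
    try:
        return run_inference(sample)
    finally:
        set_governor("powersave")  # slip back toward near-zero power
```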
The biggest power draw in any ML-enabled edge device isn’t the screen or sensors; it’s the model itself. That makes model optimization not a luxury, but a necessity. The challenge is to take a model trained in the comfort of a cloud data center and reshape it to thrive within the tight energy and memory limits of a battery-powered edge computing device.
The goal is simple — reduce size and complexity while preserving accuracy. Achieving it requires three complementary techniques.
Optimization starts with the right foundation. Trying to retrofit a data center-scale model for the edge is like forcing a supercomputer into a smartwatch. Instead, begin with architectures designed specifically for efficiency — MobileNets, EfficientNets, and the broader TinyML ecosystem. These are purpose-built for edge AI devices, capable of running complex inferences on microcontrollers that sip power in milliwatts.
To bring these models to life, developers rely on edge-optimized frameworks like TensorFlow Lite and PyTorch Mobile, which leverage hardware acceleration while managing tight compute budgets. The result: real-time intelligence without real-time drain.
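As a sketch of how these two ideas combine, the snippet below instantiates an off-the-shelf MobileNetV2 and shrinks it with TensorFlow Lite’s post-training quantization — one common optimization workflow, not the only one. The input shape, class count, and random calibration data are assumptions you would replace with your own:

```python
# Sketch: pair an efficiency-first architecture with an edge-optimized
# framework. The calibration data below is a random stand-in for real
# samples.
import numpy as np
import tensorflow as tf

# Start from a purpose-built mobile architecture, not a data-center model.
model = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), weights=None, classes=10
)

def representative_data():
    # Calibration samples let the converter choose quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable quantization
converter.representative_dataset = representative_data
tflite_model = converter.convert()

with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)  # compact flatbuffer for the edge target
```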
The final step is translating the trained model into a binary that fits the embedded target. High-level languages like Python are invaluable for design and training, but deployment must eventually yield to the constraints of C++ or similar low-level environments.
A case in point: the OtoNexus Novoscope, a medical edge device designed for near-real-time diagnostic analysis. Our team developed a custom script that automatically generated optimized C++ class structures from Python-based models. This translation preserved ML accuracy while meeting strict performance and power targets — a prime example of how deep optimization turns theoretical ML into practical, deployable models.
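As a toy illustration of that translation step (not the actual Novoscope tooling), a code-generation script can dump trained weights as C++ arrays that the embedded build compiles in directly:

```python
# Toy Python-to-C++ code generator: emits trained weights as a constexpr
# array. Illustrative only; real tooling also generates the surrounding
# class structures and inference code.
import numpy as np

def emit_cpp_weights(name: str, weights: np.ndarray) -> str:
    """Render a weight tensor as a compilable C++ definition."""
    flat = ", ".join(f"{w:.8f}f" for w in weights.ravel())
    return f"constexpr float {name}[{weights.size}] = {{ {flat} }};\n"

if __name__ == "__main__":
    w = np.random.rand(4, 4).astype(np.float32)  # stand-in for trained weights
    print(emit_cpp_weights("dense1_weights", w))
```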
Beyond optimizing the primary workload, significant power savings can be achieved by managing the device's runtime behavior, particularly during periods of inactivity. The core principle here is to maximize the time spent in low-power sleep states.
A well-optimized edge device should sleep as much as possible. That means designing an event-driven, sleep-centric architecture, where activity is the exception, not the rule.
A well-defined state machine is the backbone of a reliable and power-efficient embedded system. It governs transitions between power states, ensuring the system always uses just enough energy. Typical states include:

- Active: the CPU and peripherals are fully powered for sensing and inference
- Idle: the system is awake but waiting, with clocks scaled back
- Sleep/deep sleep: most of the SoC is powered down, with only a wake source alive
- Charging: the focus shifts to battery care and thermal limits
- Low battery: non-essential features are shed to preserve core functions

A minimal sketch of such a state machine appears below.
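The sketch assumes illustrative states and events; a production design adds timeouts, fault handling, and transition-cost accounting:

```python
# Minimal power-state machine sketch. States and events are illustrative;
# map them to your hardware's actual power modes.
from enum import Enum, auto

class PowerState(Enum):
    DEEP_SLEEP = auto()
    IDLE = auto()
    ACTIVE = auto()
    CHARGING = auto()
    LOW_BATTERY = auto()

# Each (event, current state) pair maps to exactly one next state.
TRANSITIONS = {
    ("wake_event",   PowerState.DEEP_SLEEP): PowerState.ACTIVE,
    ("work_done",    PowerState.ACTIVE):     PowerState.IDLE,
    ("idle_timeout", PowerState.IDLE):       PowerState.DEEP_SLEEP,
    ("charger_in",   PowerState.IDLE):       PowerState.CHARGING,
    ("battery_low",  PowerState.IDLE):       PowerState.LOW_BATTERY,
}

def next_state(state: PowerState, event: str) -> PowerState:
    # Undefined (event, state) pairs keep the current state: fail safe.
    return TRANSITIONS.get((event, state), state)
```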
Power management isn’t just about saving energy — it’s about understanding when to spend it wisely. If a device wakes up too frequently for very short tasks, the cumulative energy cost of these transitions can outweigh the energy saved by sleeping.
This leads to a non-obvious optimization challenge: it can sometimes be more energy-efficient to batch several small tasks together and stay awake for a slightly longer, single period than to perform many rapid sleep/wake cycles. Engineering teams must profile not only the power consumption of each state but also the energy cost of the transitions between them.
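A back-of-the-envelope model, with made-up numbers, shows how to reason about that breakeven point; profile your own hardware for the real figures:

```python
# Illustrative energy model for sleep/wake cycling vs. batching.
# All figures are invented for the sake of the comparison.
E_TRANSITION_MJ = 5.0   # cost of one full sleep -> wake -> sleep cycle
P_ACTIVE_MW = 120.0     # power draw while awake
TASK_MS = 10.0          # duration of one small task

def energy_mj(n_tasks: int, batched: bool) -> float:
    active_s = n_tasks * TASK_MS / 1000.0    # total awake time
    transitions = 1 if batched else n_tasks  # wake once vs. every task
    return P_ACTIVE_MW * active_s + E_TRANSITION_MJ * transitions

for n in (1, 5, 20):
    print(f"{n:2d} tasks: cycled={energy_mj(n, False):6.1f} mJ, "
          f"batched={energy_mj(n, True):6.1f} mJ")
```

With these numbers, five 10 ms tasks cost about 31 mJ when cycled individually but only about 11 mJ when batched — the transition overhead, not the work itself, dominates.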
This approach made all the difference when our engineers worked on the Novoscope project, helping our client turn a proof of concept into a production-ready medical edge device for clinical diagnostics. Its finely tuned state machine enabled seamless transitions between charging, standby, and low-battery states – preserving readiness and extending runtime.
If your organization is exploring ways to optimize ML on edge devices for performance and battery life, let’s connect. Our engineers specialize in getting the most out of power management on edge devices.
The hardware and software platform you choose shapes not only how efficiently the system runs, but also how your teams build, test, and evolve it over time.
The embedded device market has largely consolidated around ARM-based processors due to their strong focus on performance-per-watt. Technologies like ARM big.LITTLE, which combines high-performance "big" cores with high-efficiency "LITTLE" cores, evolved in response to the need for power-optimized workload management.
A core can sprint when it must, running ML inference, and rest when it can, optimizing both speed and battery life. Many teams now invest in porting software to ARM64, ensuring applications and frameworks are tuned to fully leverage these efficiency advantages, from individual IoT edge devices to enterprise-scale edge deployments.
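On Linux, for instance, software can steer work between the two clusters with CPU affinity. A minimal sketch, assuming a hypothetical board where cores 0-3 form the LITTLE cluster (the numbering is board-specific; check /proc/cpuinfo):

```python
# Sketch: pin work to big or LITTLE clusters on Linux via CPU affinity.
# The core sets below are an assumed layout, not a universal one.
import os

LITTLE_CORES = {0, 1, 2, 3}   # high-efficiency cluster (assumed layout)
BIG_CORES = {4, 5, 6, 7}      # high-performance cluster (assumed layout)

def pin_to_little() -> None:
    """Keep background housekeeping on the efficiency cores."""
    os.sched_setaffinity(0, LITTLE_CORES)  # 0 = the calling process

def pin_to_big() -> None:
    """Sprint on the performance cores for an inference burst."""
    os.sched_setaffinity(0, BIG_CORES)
```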
True optimization goes deeper than recompiling code for a new architecture. Real gains come from rethinking how software interacts with hardware. Porting involves tackling compatibility issues, restructuring build pipelines, and tuning every layer of the stack for efficiency.
At Janea Systems, we specialize in this kind of deep engineering — helping organizations bring complex frameworks to new, more efficient architectures. Our experience in porting software to ARM64 has enabled enterprises to extend the life and performance of their products.
A standout example is porting PyTorch to Windows — a project that brought one of the world’s leading AI tools into the realm of edge AI devices. Our engineers had to resolve deep compatibility issues, restructure build pipelines, and tune and validate performance across the stack.
This type of foundational engineering enables an entire ecosystem of developers to build and deploy ML models on edge devices. However, architectural choices like this are strategic commitments. Moving to ARM64 or another specialized platform requires parallel investments in compilers, debuggers, testing environments, and talent.
If your organization is exploring a transition to ARM64 or optimizing existing frameworks for edge computing devices, our engineering team can help. Contact us to learn more.
True efficiency in edge AI devices isn’t achieved through a single optimization or a last-minute fix. It’s the product of a multi-layered engineering approach, one that touches every phase of design and development, including:

- Hardware-software co-design that exploits the SoC’s power-saving features
- Model optimization that shrinks ML workloads without sacrificing accuracy
- Sleep-centric runtime design governed by a well-defined state machine
- Platform choices, such as ARM64, that maximize performance-per-watt
Applying these strategies keeps edge devices reliable, responsive, and available, and our project experience bears this out.
We partnered with OtoNexus to engineer the Novoscope, a handheld medical edge device. By fine-tuning its power management and optimizing embedded data processing, we helped transform an early prototype into a production-ready clinical solution.
Our team led the ARM64 porting of PyTorch on Windows. This project extended the framework’s reach across healthcare, geospatial, and fintech applications, empowering developers to build and deploy ML models on edge devices with greater flexibility and energy efficiency.
If your organization is working on edge AI devices, our engineering team has expertise to share. Reach out to us to learn how we can help.
Ready to discuss your software engineering needs with our team of experts?