Observability for Mobile Devices

Businesses and organizations that develop mobile applications for critical end users, such as healthcare or security personnel, are wary of the volatile mobile device ecosystem that dominates the industry. Even the most secure and reliable mobile apps are vulnerable because the device market is so fragmented. Provisioning vetted devices still comes with substantial compromises in customization and security.

A level up, Mobile Device Management (MDM) platforms are a more suitable option for building, scaling, and managing hardware devices from start to finish, that is, from device manufacturing to the end user. Given those flexible alternatives, the next logical step is to apply modern Software Development Lifecycle practices such as DevOps, security engineering, CI/CD, and shift-left testing.

An important aspect of adopting an MDM solution is how to implement effective observability and telemetry. Let’s take a good look at the best observability use cases for mobile devices at every stage of their enrollment, from initial provisioning through everyday usage to decommissioning.

What is observability?

For the uninitiated, observability is the practice of inferring the inner state of a system by inspecting its external outputs. You could call it the art of debugging without breakpoints, in the sense that you don’t literally place breakpoints that halt the live program. Instead, you infer from external metrics that something is amiss. For example, random request timeouts happen because the target host is down, or CPU utilization is high because the deserialization step somewhere in the ORM layer is unoptimized. In each case, you have traces and metrics that pinpoint the culprit function calls.

In general terms, you want to use observability to record and satisfy specific non-functional requirements like performance, reliability, or recoverability.

Mobile Device Management Observability

When you apply observability controls to a mobile device, you are interested in understanding both the internal states of the distributed mobile apps that run on the device and the overall device health. This process should follow the policies of each individual organization and respect local law and privacy concerns. We can identify the following use cases.

Enrollment

When organizations choose to provide dedicated mobile devices to their end users, they have multiple ways to start, as explained in this article. After enrolling a device in an MDM, it’s important to establish a baseline of observability metrics and traces so that they can be attached to each user profile once the device is delivered. This way, any spikes in device behavior or performance can be picked up and resolved more easily by MDM operators.

This is critically important when you plan for updates over a period of time, and it enables operators to answer the following questions:

  • How is the current performance?
  • Is an issue affecting all users or just the ones with a particular model?
  • What logs are we capturing right now, and why?
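
As a minimal sketch of the baseline idea, the snippet below summarizes a handful of enrollment-time samples and flags later readings that sit far above them. The metric (app start-up time in milliseconds), the model names, the three-standard-deviation threshold, and all function names are assumptions made for illustration; they do not come from any particular MDM product.

```python
from collections import Counter
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize enrollment-time samples as a mean and a standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def is_spike(baseline, value, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations above the mean."""
    if baseline["stdev"] == 0:
        return value > baseline["mean"]
    return (value - baseline["mean"]) / baseline["stdev"] > threshold


def spikes_by_model(baseline, readings):
    """Count spikes per device model to see whether one model stands out."""
    return Counter(model for model, value in readings if is_spike(baseline, value))


# Hypothetical app start-up times (ms) recorded right after enrollment.
enrollment_samples = [410, 395, 402, 420, 398, 405]
baseline = build_baseline(enrollment_samples)

# Hypothetical readings collected later, tagged with the device model.
later_readings = [
    ("model-a", 410),
    ("model-b", 880),
    ("model-b", 910),
    ("model-a", 400),
]
print(spikes_by_model(baseline, later_readings))
```

Grouping the flagged readings by model is one simple way to approach the second question above: if every spike comes from a single model, the issue is unlikely to affect all users.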

Image updates

Once the devices are distributed to end-users, organizations should establish and observe usage metrics for rolling updates or OS upgrades that may affect the overall device behavior. An effective way to automate this process is by integrating this customization in the CI/CD pipeline so that each build is reproducible, immutable, and traceable. 

Assuming rolling updates and devices that are in use for a period of time, operators should be able to answer the following questions:

  • How has the performance been affected during the past three months?
  • How has performance changed since last month, or since the last image deployment?
  • How has latency changed?
  • Have any SLOs or SLAs been affected?
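
One way to approach these questions is to compare a latency percentile before and after an image deployment and check it against a target. The sketch below uses a nearest-rank 95th percentile; the sample values, the 200 ms target, and the function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compare_images(before, after, slo_p95_ms):
    """Compare p95 latency across two image versions and check the target."""
    p95_before = percentile(before, 95)
    p95_after = percentile(after, 95)
    return {
        "p95_before_ms": p95_before,
        "p95_after_ms": p95_after,
        "change_pct": round((p95_after - p95_before) / p95_before * 100, 1),
        "slo_met": p95_after <= slo_p95_ms,
    }


# Hypothetical request latencies (ms) sampled under the old and new images.
latency_old_image = [120, 130, 125, 140, 135, 128, 132, 138, 126, 131]
latency_new_image = [150, 160, 155, 170, 165, 158, 162, 168, 156, 210]

report = compare_images(latency_old_image, latency_new_image, slo_p95_ms=200)
print(report)
```

Because each image build in the pipeline is traceable, a report like this can be stored alongside the build identifier so that a regression can be tied to a specific deployment.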

Debugging and troubleshooting apps

Even if you deploy only approved applications on devices, there will be cases when the applications misbehave or have defects, because no software is bug-free. For developers to debug and resolve those issues effectively, they need access to the application logs and debug information.

Traditionally, for Android devices, this process is performed using the Android Debug Bridge (ADB). However, debugging with ADB is tricky to perform remotely, without physical access to the device and without customized solutions. Organizations should therefore establish a feasible way to perform partial or full remote debugging operations on demand or, as a last resort, rely on traces from a remote control application or related instrumentation.
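
To make the log-access part concrete, here is a small sketch that parses logcat output in the `threadtime` format and keeps only warnings and above. The `dump_logcat` helper shows how the text could be pulled with `adb -s <serial> logcat -d -v threadtime` when the device is reachable over ADB; how the same text would reach an operator in a remote setup is left open, since that depends on the MDM in use. The sample lines, tags, and function names are invented for illustration.

```python
import re
import subprocess

# threadtime format: MM-DD HH:MM:SS.mmm  PID  TID LEVEL TAG: message
LOGCAT_LINE = re.compile(
    r"^(?P<date>\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEF])\s+"
    r"(?P<tag>.*?)\s*:\s(?P<message>.*)$"
)
LEVELS = "VDIWEF"  # verbose, debug, info, warning, error, fatal


def dump_logcat(serial):
    """Dump the current log buffer of one device (requires adb on PATH)."""
    result = subprocess.run(
        ["adb", "-s", serial, "logcat", "-d", "-v", "threadtime"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_logcat(text, min_level="W"):
    """Return parsed entries at or above `min_level`, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = LOGCAT_LINE.match(line)
        if match and LEVELS.index(match["level"]) >= LEVELS.index(min_level):
            entries.append(match.groupdict())
    return entries


# Invented sample; in practice the text would come from dump_logcat(serial).
SAMPLE = """\
03-14 09:26:53.589  1234  1250 I SyncService: Sync started
03-14 09:26:54.102  1234  1250 W SyncService: Retrying after timeout
03-14 09:26:55.771  1234  1301 E PaymentService: Gateway unreachable
--------- beginning of crash
"""

for entry in parse_logcat(SAMPLE):
    print(entry["level"], entry["tag"], entry["message"])
```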

After the application defects are fixed and redeployed to devices, operators should be able to answer the following questions:

  • Did the fix introduce any other defects?
  • Has the performance been affected?
  • How can this be avoided in the future? Is there a preventive action to introduce?

Wear and tear

Once devices are distributed to end-users, they are susceptible to all sorts of wear and tear events. Devices may fall from a height, or into water; there may be a battery defect, or the fingerprint sensor might malfunction. And the list goes on. 

During that time, operators should be able to monitor overall device capabilities and perform preventive actions such as shutting down the device, stopping unused services to preserve battery life, etc.

This lower level of control, which operates directly at the OS level, is a powerful feature. It can be augmented with observability controls that monitor those specific aspects of the deployed devices.

When implemented that way, operators should be able to answer the following questions:

  • What is the overall device health score?
  • What hardware features are used, or not used at all?
  • How can we prevent device failures and data loss?
  • How can we maximize the usage life?
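
A device health score can be as simple as a weighted sum of a few normalized signals. The sketch below is one possible formulation, not a standard: the chosen signals, the weights, the penalty per crash or sensor failure, and the attention threshold are all assumptions that an organization would tune to its own fleet.

```python
from dataclasses import dataclass


@dataclass
class DeviceReading:
    battery_health_pct: float  # remaining capacity relative to design capacity
    storage_free_pct: float    # free storage as a percentage of the total
    crash_count_7d: int        # app or system crashes in the last seven days
    sensor_failures: int       # sensors currently reporting a fault


def health_score(reading):
    """Combine the signals into a 0-100 score using hypothetical weights."""
    crash_component = max(0.0, 100.0 - 10.0 * reading.crash_count_7d)
    sensor_component = max(0.0, 100.0 - 50.0 * reading.sensor_failures)
    score = (
        0.4 * reading.battery_health_pct
        + 0.2 * reading.storage_free_pct
        + 0.2 * crash_component
        + 0.2 * sensor_component
    )
    return round(score, 1)


def needs_attention(reading, threshold=50.0):
    """Flag devices whose score falls below an assumed threshold."""
    return health_score(reading) < threshold


healthy = DeviceReading(90, 50, crash_count_7d=2, sensor_failures=0)
worn = DeviceReading(55, 10, crash_count_7d=12, sensor_failures=1)

print(health_score(healthy), needs_attention(healthy))
print(health_score(worn), needs_attention(worn))
```

Tracking this score over time, rather than at a single moment, is what lets operators act before a failure instead of after it.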

Decommissioning

Eventually, mobile devices will have to be decommissioned. This could happen unexpectedly, due to irreparable physical damage, or because support has reached its end of life. The time between enrollment and decommissioning can vary from device to device and from vendor to vendor. When decommissioning, it is particularly important to back up valuable data properly, keep user preferences so that they migrate cleanly to the new devices, and perform smoke tests on those new devices.

The main objective is to provide sufficient and comprehensive insight into this transition phase so that there is no loss of continuity. For example, a smart dashboard gives the operator a thorough overview of the current and past devices of each user and, depending on the business requirements, a way to securely verify which settings or applications are installed on the device. For example, you may have to check that the right keystore system with the right cryptographic keys is in place when migrating from device to device.
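
A migration check of this kind could be sketched as a comparison between the inventory of the old device and that of its replacement. The inventory layout, the package names, the settings, and the key aliases below are all hypothetical, and the sketch compares key alias names only, never key material.

```python
def migration_gaps(old_device, new_device, required_key_aliases):
    """List what the new device is still missing compared to the old one."""
    missing_apps = sorted(set(old_device["apps"]) - set(new_device["apps"]))
    missing_keys = sorted(
        set(required_key_aliases) - set(new_device["key_aliases"])
    )
    changed_settings = {
        name: (value, new_device["settings"].get(name))
        for name, value in old_device["settings"].items()
        if new_device["settings"].get(name) != value
    }
    return {
        "missing_apps": missing_apps,
        "missing_keys": missing_keys,
        "changed_settings": changed_settings,
        "ready": not (missing_apps or missing_keys or changed_settings),
    }


# Hypothetical inventories as an MDM dashboard might expose them.
old_device = {
    "apps": ["com.example.triage", "com.example.chat"],
    "settings": {"locale": "en-GB", "vpn": "always-on"},
    "key_aliases": ["device-attestation", "records-signing"],
}
new_device = {
    "apps": ["com.example.triage"],
    "settings": {"locale": "en-GB", "vpn": "off"},
    "key_aliases": ["device-attestation"],
}

gaps = migration_gaps(
    old_device, new_device, ["device-attestation", "records-signing"]
)
print(gaps)
```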

Once more, a robust MDM platform will have security controls in place to ensure that the proper decommissioning steps are applied, and the physical removal of the device can be logged in the platform as part of this process.

Next Steps

MDM solutions should establish firm, end-to-end observability controls as a matter of good business practice. An effective observability strategy can reduce risk, prevent loss of personal information, detect broken applications, and avoid negative publicity.

Building trust with device management, however, goes both ways. To begin with, you have to inform the users of the safe and proper usage of provisioned devices so that they know the procedures and valid usage policies. Then on the other end, you train the Device Management Administrators (IT Support, Devs or Admins) on the correct ways to set up, interpret and evaluate observability events, in case something goes awry. With this accomplished, only then can you hope to have a higher level of operational maturity and uptime performance in your organization.

Theo Despoudis is a Senior Software Engineer, a consultant and an experienced mentor. He has a keen interest in Open Source Architectures, Cloud Computing, best practices and functional programming. He occasionally blogs on several publishing platforms and enjoys creating projects from inspiration. Follow him on Twitter @nerdokto. He can be contacted via http://www.techway.io/.