Remote Radio Unit Network Application from http://www.heliostelecom.com

Big Data and Pre-emptive Services in Wireless: Improving Performance at the Edge

To build on previous posts discussing what the wireless network of the very near future will deliver and enable, one area requiring exploration is artificial intelligence and how it can be used to improve network performance, particularly at the edge.  With current Long Term Evolution (LTE) fourth generation (4G) wireless networks, remote radio units (RRUs) are mounted at the top of the tower, and communication between them and the Digital Unit LTE (aka DUL, essentially a computer) is provided via a Common Public Radio Interface (CPRI) fiber optic link.  Placing electronics at the top of the tower reduces the signal loss incurred between ground-based radios and antennae, the configuration used in second generation (2G) and some third generation (3G) networks, but it also makes troubleshooting and repairs much more costly.  Tower climbing is labor intensive, obviously brings safety risk into play, and, at a site with chronic issues, can be repetitive.

Vodafone and O2 4G Remote Radio Unit, Peter Clarke, “Telecommunication Infrastructure,” Webpage http://www.pedroc.co.uk

In my previous position supporting a large U.S. Tier 1 carrier, we started looking at troubleshooting and repair challenges in late 2011, the early days of LTE, because we were receiving units returned for repair that were analyzed as “No Fault Found,” or NFF.  As any returned unit would have been retrieved and replaced via at least one tower climb, the cost and network availability impact are significant when one realizes that Tier 1 networks can contain over one hundred thousand (100,000) cell sites.  We realized that we needed to improve remote site diagnostics to enable some level of pinpointing of possible faults.

When an outage of a site (or larger) occurs within a network, we try to retrieve and store detailed logs from Operations Support Systems (OSS), as well as site diagnostics if available, in order to perform a Root Cause Analysis (RCA) of the event.  We have detailed records from customer trouble tickets which define problems, remedies, workarounds, and code fixes.  We have outage data from emergency handling systems that documents event times, symptoms, conditions, actions taken, and subscriber impact.  Thus, with a large installed base providing such operational performance data from customers, we had sufficient volume to create a “Big Data” platform from which to build better analytics and hopefully improve diagnostics.  We also had access to some internal tools to assist, notably a laptop-based Baseband Unit (BBU) and a test platform to verify RRU function prior to starting repair work.  We set a challenge for the analytics team: could we use these tools and the operational data to predict a pending failure of a single LTE site based on degradation in operational parameters?
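To make the idea of predicting a pending failure from degrading parameters concrete, here is a minimal sketch in Python.  It is an illustration only, not the analytics we built: it assumes a single metric that falls as a component degrades (for example, received optical power on a fiber link, in dBm), fits a straight line to recent samples, and flags the site if the trend would cross a failure threshold within a chosen horizon.  The function names, the threshold, and the sample values are all hypothetical.

```python
from statistics import mean


def linear_slope(values):
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def samples_until_threshold(values, threshold):
    """Estimate how many more samples until a falling metric hits threshold.

    Returns None if there is no data or the metric is not trending down,
    and 0.0 if the latest sample is already at or below the threshold.
    """
    if not values:
        return None
    latest = values[-1]
    if latest <= threshold:
        return 0.0
    slope = linear_slope(values)
    if slope >= 0:
        return None
    return (latest - threshold) / -slope


def is_degrading(values, threshold, horizon):
    """True if the trend reaches the threshold within `horizon` samples."""
    remaining = samples_until_threshold(values, threshold)
    return remaining is not None and remaining <= horizon


if __name__ == "__main__":
    # Hypothetical daily readings (dBm) and a hypothetical failure threshold.
    readings = [-10.0, -10.5, -11.0, -11.5, -12.0]
    print(is_degrading(readings, threshold=-14.0, horizon=5))
```

A real system would have to cope with noisy data, many parameters per site, and failure modes that do not degrade linearly, so a straight-line fit is only the simplest possible starting point.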

In addition to crunching operational data, the team spent several months working with the design team to understand failure modes within the chain of components involved in making an LTE network operate.  This includes components such as Small Form-factor Pluggable (SFP) optical transceivers, fiber optic cabling, power supplies, electrical connectors, remote electronic tilt (RET) motors, antennae, etc.  Many of these are commodities that can be sourced through many suppliers, and, while they may meet a design specification, they can have very different performance profiles in the field.  The ability to identify issues at that level is critical, because if a defect is traced to devices from one manufacturer, it may be necessary to remove and replace affected devices across an entire network.  If one doesn’t know where those products are installed, that results in a one hundred percent (100%) open-and-inspect operation, something that is extremely expensive and time consuming (six months, typically).
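The value of knowing where each supplier's parts are installed can be shown with a small sketch.  Again, this is only an illustration under stated assumptions, not our actual tooling: it assumes installation records are available as simple dictionaries with hypothetical fields (site_id, component, manufacturer), and it builds an index so that the sites holding a suspect part can be listed directly instead of opening and inspecting every site.

```python
from collections import defaultdict


def build_component_index(records):
    """Map (component, manufacturer) to the set of sites where it is installed."""
    index = defaultdict(set)
    for rec in records:
        key = (rec["component"], rec["manufacturer"])
        index[key].add(rec["site_id"])
    return index


def affected_sites(index, component, manufacturer):
    """Return a sorted list of sites that hold the suspect component."""
    return sorted(index.get((component, manufacturer), set()))


if __name__ == "__main__":
    # Hypothetical installation records; all names are made up.
    records = [
        {"site_id": "SITE-001", "component": "SFP", "manufacturer": "VendorA"},
        {"site_id": "SITE-002", "component": "SFP", "manufacturer": "VendorB"},
        {"site_id": "SITE-003", "component": "SFP", "manufacturer": "VendorA"},
        {"site_id": "SITE-003", "component": "RET motor", "manufacturer": "VendorC"},
    ]
    index = build_component_index(records)
    print(affected_sites(index, "SFP", "VendorA"))
```

In practice the records would also need to carry details such as part number, lot, and installation date, since a defect is often limited to one production run rather than to everything a manufacturer supplies.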

Through this work, we built the analytics capability to identify component-related degradation that can lead to a site-level failure if not corrected.  The system can generate and send alarms to a network management system platform, enabling an operator to dispatch technician(s) to the site to address the problem before an outage occurs.  We can also present site-level data using digital mapping to operations personnel via a smartphone or tablet to enable them to take action regardless of their location at the time the condition is noted.  While not perfect, the system is constantly being improved as the volume and granularity of the data grow.  We have demonstrated the capability to a couple of key customers who are very excited about the potential impact on network availability, and we hope to have an initial deployment very soon.

References

1. “RRU Application,” Diagram of Remote Radio Unit Application from http://www.heliostelecom.com

2. “Vodafone and O2 4G RRU,” Photograph of Vodafone and O2 4G Remote Radio Unit configuration from Peter Clarke via http://www.pedroc.co.uk
