Artificial Intelligence: Simply Complex

I believe that, in science, holding a simplistic view of a problem or a technology is risky and potentially misleading to others. In the same way, overcomplicating the challenges behind a given problem, or fantasizing about the capabilities of the corresponding solution, is typically a sign of incomprehension and lack of expertise. Interestingly enough, artificial intelligence (AI) and machine learning (ML) have become exceptionally prone to either judgement over the last decade, ranging from the awe of end users at the potential of emerging applications, to sharp statements like “It is all just about counting fast” by Craig Martell, Head of Science and Engineering at LinkedIn. Neither is wrong, neither is right; and still, in this confusing duality, AI has risen as one of the essential and most lucrative pillars of the IT space because: who could resist developing (and commercializing) an “enhanced” digital future while engineers keep doing the same (simple) statistical reasoning of the last 50 years? Rebranding & repurposing 101.

My opinion, based on moderate experience, is that the processes and algorithms underlying AI/ML, including neural nets, are truly straightforward to understand and develop, but the output logic or approximated functions may be outstandingly complex depending on (i) the quality and abundance of the data, and (ii) the size of the correlator, i.e. the processing engine. This is exactly the boring yet powerful beauty of AI/ML that Craig Martell talked about during the lecture: one only needs to worry about owning data and flops to create extraordinary services, arbitrary models, high-dimensional fitting functions… because, ultimately, one will simply “select the algorithm that works”. It is no surprise, given this convenient and naïve premise, that AI/ML keeps penetrating so wildly into our daily lives; perfectly synchronized with the rise of cloud computing and its services (e.g. SaaS); the digitization of everything and everyone (everything is a sensor, everyone a feature vector); the drastic drop in the cost of data thanks, in part, to progress in compute technology, electronics, and electro-optics (e.g. quantum computing); and the public release of rich databases by governmental institutions (see e.g. LoadDB), open industrial projects (e.g. Open Compute Project, Telecom Infra Project), and others (e.g. Kaggle). Concerning specific applications, end-user products represent just the tip of the iceberg: the most intuitive and marketable use of a very flexible tool. However, the power of data-driven AI/ML algorithms spans way beyond recommender systems, anti-spam filters, or Siri… and that is not so well known.
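
As a toy illustration of that “select the algorithm that works” workflow (my own sketch, not something from the lecture), the snippet below simply cross-validates a handful of off-the-shelf scikit-learn models on an arbitrary synthetic dataset and keeps the best scorer; the dataset and candidate models are assumptions made for the example.

    # A minimal sketch of "select the algorithm that works": try several models,
    # keep whichever cross-validates best. Dataset and candidates are arbitrary.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(),
        "neural_net": MLPClassifier(max_iter=500),
    }
    scores = {name: cross_val_score(model, X, y, cv=5).mean()
              for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    print(scores, "->", best)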

One of the industries that leverages ML most intensively is Telecom, a highly data-rich sector whose KPI for success is, precisely, remaining totally transparent to the end user. No news is good news. Mainly fueled by the development of next-gen 5G technologies and the decentralization of compute power and data (i.e. the Edge Cloud vision and IoT), the ICT segment has experienced a thrilling evolution; it has now reached a point in which the diversity of transported flows and network infrastructures is so high that the implementation of operation automation, traffic prediction, and adaptive transceivers is strictly mandatory for ensuring proper operation. Within the plethora of opportunities for ML in this outlook, I would like to bring the spotlight onto two particularly trending use cases: “cognitive” networking and universal transceivers.

Cognitive/Flexible/Elastic networking

Data communication systems, and concretely medium- and long-haul optical networks, are very expensive and sophisticated infrastructures moving information around the planet at tens of Tbps. So much so that, traditionally, equipment vendors and systems integrators have spent millions of euros on protection mechanisms (including dark fibers) to guarantee operation margins until end of life (EoL), hence wasting a lot of resources and making worst-case-ready equipment designs based on too-small and uncertain data sets. That is not the future of anything, and this is well reflected in the latest research and development work which, among many other things, considers a centralized software-defined networking approach. In this approach, the network is full of sensing ports (digital and analog; optical and electrical) that collect and forward status data to a centralized application that orchestrates the network as a single, dynamic entity. This enables optimal and automatic resource-management decisions, operation margin compression, signal integrity prediction, pro-active fiber break detection, on-demand resource reservation, network slicing for flow protection, and a plethora of other intelligent/cognitive ML-based services. The result? The Zettabyte Era is no longer a utopia.
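
To make the “cognitive” part a bit more concrete, here is a minimal sketch (my own illustration, not taken from any vendor implementation) of one such ML-based service: predicting the SNR margin of a lightpath from telemetry gathered by the centralized controller, so that worst-case design margins can be compressed. The telemetry features, the synthetic data, and the choice of a gradient-boosting regressor are all assumptions made for illustration.

    # Hypothetical margin-compression helper: learn to predict per-lightpath SNR
    # margin (dB) from controller telemetry. Features and data are synthetic.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    rng = np.random.default_rng(0)
    n = 5000
    # Hypothetical per-lightpath telemetry: span count, total length (km),
    # launch power (dBm), amplifier count, measured pre-FEC BER (log10).
    X = np.column_stack([
        rng.integers(1, 20, n),       # spans
        rng.uniform(50, 2000, n),     # length_km
        rng.uniform(-3, 3, n),        # launch_power_dbm
        rng.integers(1, 25, n),       # amplifiers
        rng.uniform(-6, -2, n),       # log10_prefec_ber
    ])
    # Synthetic target: SNR margin in dB (in practice, taken from field measurements).
    y = 10 - 0.003 * X[:, 1] - 0.2 * X[:, 3] + 0.5 * X[:, 2] + rng.normal(0, 0.3, n)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = GradientBoostingRegressor().fit(X_tr, y_tr)
    print(f"MAE on held-out lightpaths: {mean_absolute_error(y_te, model.predict(X_te)):.2f} dB")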

Universal transceivers

If neural nets are universal function approximators, could we replace the entire digital processing stack in charge of equalizing and decoding the information signal with one sophisticated net? Is that computationally tractable? How accurate would it be compared to the individually optimal algorithms? How much power would it consume? How fast could it adapt to fast variations in the communication channel? These are some of the questions currently being tackled by the telecom community responsible for the design of high-bitrate communication transmitters and receivers because, in some transmission scenarios, the end-to-end channel is so complex to parametrize, and the uncertainties are so high and diverse, that it is impractical to develop an analytic expression. And the interest goes beyond technical aspects, since a single chip design could provide the just-right service for all communication channels, thus leveraging economies of scale, strongly facilitating multi-vendor compatibility, and moving a great part of the development into the agile software space. Therefore, researchers are modeling the entire communication chain as an autoencoding neural net problem: Data -> BLACK BOX -> Channel -> XOB KCALB -> Data, where the objective is to learn the BLACK BOX that makes this system work with a bit error probability of 1·10^-15. This is a very complicated training process that benefits extraordinarily from the injection of “expert knowledge” across the process (e.g. in the neural net structure, input featurization, initialization, or learning regularization), a whole field of research by itself.
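
As a rough sketch of that autoencoder formulation (in the spirit of the O’Shea and Hoydis paper listed in the references, not an actual transceiver design), the snippet below jointly trains a small neural-net “transmitter” and “receiver” through a differentiable AWGN channel stand-in; the block size, layer widths, SNR, and training settings are all illustrative assumptions.

    # End-to-end autoencoder sketch: encoder = transmitter, decoder = receiver,
    # with an AWGN channel in between. All hyperparameters are illustrative.
    import torch
    import torch.nn as nn

    M = 16          # messages per block (4 bits)
    n_channel = 7   # real channel uses per block
    snr_db = 7.0

    encoder = nn.Sequential(nn.Linear(M, 32), nn.ReLU(), nn.Linear(32, n_channel))
    decoder = nn.Sequential(nn.Linear(n_channel, 32), nn.ReLU(), nn.Linear(32, M))
    optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    noise_std = (10 ** (-snr_db / 10)) ** 0.5   # unit per-dimension signal power assumed

    for step in range(2000):
        msgs = torch.randint(0, M, (256,))
        one_hot = nn.functional.one_hot(msgs, M).float()
        x = encoder(one_hot)
        x = (n_channel ** 0.5) * x / x.norm(dim=1, keepdim=True)  # average power constraint
        y = x + noise_std * torch.randn_like(x)                   # AWGN stands in for the medium
        loss = loss_fn(decoder(y), msgs)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        msgs = torch.randint(0, M, (10000,))
        x = encoder(nn.functional.one_hot(msgs, M).float())
        x = (n_channel ** 0.5) * x / x.norm(dim=1, keepdim=True)
        y = x + noise_std * torch.randn_like(x)
        bler = (decoder(y).argmax(dim=1) != msgs).float().mean()
        print(f"Block error rate at {snr_db} dB SNR: {bler:.4f}")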

Bottom line: artificial intelligence and machine learning algorithms are easy to grasp, as much as statistical math can be; but their power currently escapes our full understanding, simply because we are literally unable to digest, ourselves, the amount and assortment of data that this technology manipulates and correlates. In this context, the end user will soon (in the next 1-3 years) enjoy a few AI teasers like domotics, car automation, and assisted/enhanced reality. Who does not like a magic trick, right? But, in my opinion, those cases are an unfair reflection of the actual power of a well-trained data center and, hence, of its implications.

References

Cisco, “The Zettabyte Era: Trends and Analysis,” white paper [Online]. Available: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html

LoadDB website [Online]. Available: http://www.loadb.org/

Open Compute Project website [Online]. Available: https://www.opencompute.org/

Telecom Infra Project website [Online]. Available: https://telecominfraproject.com/

Nokia internal material on “dynamic optical networks” and “software-defined networking”

A. S. Thyagaturu et al., “Software Defined Optical Networks (SDONs): A Comprehensive Survey,” in IEEE Communications Surveys & Tutorials, vol. 18, no. 4, pp. 2738-2786, 2016.

T. J. O’Shea and J. Hoydis, “An Introduction to Deep Learning for the Physical Layer,” arXiv:1702.00832v2 [cs.IT], 2017.
