The network is the bedrock of nearly every significant innovation that we see around us. For example, high-speed 4G networks are helping to make self-driving cars, one of the biggest innovations of this era, a reality. A Deloitte report estimates that self-driving cars, enabled by wireless connectivity, will reduce emissions by 40% to 90%, travel times by nearly 40%, and delays by 20%. A 2013 study estimated that if 90% of vehicles were autonomous, it would save 21,700 lives and $447 billion per year.
Energy companies are also leveraging next-generation networks to build energy-efficient smart grids that save the average consumer hundreds of dollars per year. The Electric Power Research Institute (EPRI) estimates that an efficient and reliable smart grid could create $1.8 trillion in additive revenue for the U.S. economy from 2013 to 2020.
To achieve these revolutionary outcomes, network technologies are evolving at breakneck speed. It took just eight years to go from 3G to 4G. 5G is expected to replace 4G in about half that time. Gartner predicts 20.8 billion devices will be in use by 2020 on these high-speed networks.
To cater to the enormous volume of network traffic, the network must be virtualized and software-defined. For example, AT&T has committed to virtualizing 75% of its network by 2020. Deutsche Telekom and Vodafone are deploying multi-access edge compute infrastructure closer to the point of consumption to target the rapid growth in latency-sensitive applications. The next generation of network technology is right around the corner and will demand even more changes to micro-datacenters that run hyper-converged infrastructure.
Disrupt or be disrupted
• Proprietary hardware and fat-tree network architectures are competing against software-defined and “white-box” network equipment. The pace of change is accelerating as equipment buyers face cost pressures and must innovate faster, on timescales measured in days, not years. For example, Nutanix is credited with coining the term hyper-converged infrastructure (HCI) with the 2011 release of its appliance. Today, Nutanix is the largest HCI vendor in terms of customer deployments (3,100) and revenue (approximately a $460 million annual run rate). Software-defined storage (SDS) is also gaining traction worldwide, so traditional storage companies such as Dell EMC, HP, IBM and NetApp, as well as newer vendors such as Red Hat, Caringo and FalconStor, have all started launching SDS offerings.
• Datacenter workloads are moving away from traditional monolithic services to lightweight microservices. The changes in the datacenter and the rise of public and private cloud services have triggered the WAN to shift from expensive and complex MPLS connectivity to simplified, cost-effective virtual overlays that use SD-WAN to connect branch offices. The Deutsche Telekom (DT) Silicon Valley Innovation Center, for example, has deployed VeloCloud’s SD-WAN solution for secure VPN connections between Europe and North America, so that new services such as industrial 3D printing and robot automation can be introduced dynamically.
• New services are running on the edge of the network to enable ultra-low latency applications like vehicle-to-vehicle (V2V) communications, multi-player AR/VR gaming, immersive video experience in entertainment, remote surgery, robotics, and other applications that require near real-time mobile communications.
From simplicity comes complexity
The new network is getting simpler by design, but managing it is getting progressively more complex. According to Pyramid Research, mobile operators are spending three times more on operational expenditure (Opex) than on capital expenditure (Capex), or a total of $400 billion to $500 billion annually on Opex. This suggests there is significant added complexity in the care and feeding of the new network. Some reasons for this include:
• Interdependence. For example, a communication service provider (CSP) must typically tackle congestion, failures, call drops, security attacks, power optimization and other issues. However, new hardware, new services, new architectures and new devices mean CSPs must also commit to more demanding service-level agreements (SLAs) to ensure an uncompromising, high-quality customer experience.
• Millisecond response time. Delivering the speeds required for ultra-low latency applications demands an altogether new networking approach to launch, manage and maintain these critical services.
• N-architectural layers. Traditional mobile computing is a two-tier architecture consisting of the device (application) and the (public) cloud (application server). Edge computing uses a three-tier architecture: the device (application), the edge (the latency-sensitive component of the application server) and the cloud (the rest of the application server, which is not latency sensitive). Additional edge components like Docker networking add more complexity.
• Velocity of feature introductions. The pace of feature introduction is fast and getting faster. Releases today go live in days, not weeks, and certainly not months, to get ahead and stay ahead of the competition.
The common theme for all these issues is the triumvirate of productivity, automation and speed.
The time has arrived for the new network to think for itself, as traditional operations are not equipped for the velocity of changes and the inherent complexity of the architecture.
Industry leaders are already getting their acts together to make the network more intelligent. Juniper has embarked on a journey toward a production-ready, economically feasible Self-Driving Network, an autonomous network that is predictive and adapts to its environment. Cisco has ushered in a new era of networking with Cisco DNA (Digital Network Architecture), which constantly learns, adapts, and protects. It is a network designed to be intuitive: it simplifies management to turn hours of work into seconds, automates processes to lower costs and uses analytics to improve performance.
Network standardization bodies are also gearing up to explore this new area and come up with standard architectures. For example, ETSI has created a new Industry Specification Group, called Experiential Network Intelligence (ISG ENI), to define a cognitive network management architecture using AI and context-aware policies to adjust offered services based on changes in user needs, environmental conditions and business goals. Along similar lines, 5G PPP has established an AI project named CogNet for building an intelligent system of insights and actions for 5G network management.
Architecting a brain for networks
When traffic spikes occur on today’s networks, it’s difficult to distinguish a DDoS attack from the widespread downloading of a popular new music album. Using machine learning (ML) algorithms that interpret vast amounts of traffic behavior data, the network will predict performance issues before users are affected. In this example, connections with algorithms that scrape Twitter feeds will confirm the hypothesis: Have hacking groups been threatening action against a particular enterprise, or have fans been clamoring for the music album in the weeks leading up to the spike? The thinking network will analyze and adapt accordingly, either by shutting down ports to isolate the DDoS attack or adding bandwidth to accommodate the surge in album downloads.
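The decision described above can be sketched as a small classifier that combines a traffic signal with external context. This is a minimal illustration, not a production detector: the `SpikeContext` fields, the thresholds and the action names are all hypothetical, and a real system would learn them from traffic data rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class SpikeContext:
    completed_ratio: float   # fraction of connections that finish a full request
    threat_mentions: int     # posts threatening the target (hypothetical social feed)
    release_mentions: int    # posts anticipating a content release (hypothetical)


def classify_spike(ctx: SpikeContext) -> str:
    """Return 'ddos', 'flash_crowd' or 'unknown' using placeholder heuristics."""
    # Assumption for this sketch: hostile traffic rarely completes requests,
    # while genuine downloads usually do. The 0.3 / 0.7 cut-offs are arbitrary.
    looks_hostile = ctx.completed_ratio < 0.3
    looks_organic = ctx.completed_ratio > 0.7
    if looks_hostile and ctx.threat_mentions > ctx.release_mentions:
        return "ddos"
    if looks_organic and ctx.release_mentions > ctx.threat_mentions:
        return "flash_crowd"
    return "unknown"


def respond(label: str) -> str:
    """Map a classification to the action named in the text."""
    actions = {"ddos": "isolate_ports", "flash_crowd": "add_bandwidth"}
    # Anything ambiguous is handed to a human instead of acted on automatically.
    return actions.get(label, "escalate_to_operator")


attack = SpikeContext(completed_ratio=0.1, threat_mentions=50, release_mentions=2)
album = SpikeContext(completed_ratio=0.95, threat_mentions=1, release_mentions=400)
print(respond(classify_spike(attack)), respond(classify_spike(album)))
```

The point of the sketch is the shape of the decision: neither the traffic signal nor the social signal is trusted alone, and ambiguous cases fall back to an operator.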
With the advances in technologies like machine learning (ML), artificial intelligence (AI) and intent-based networking (IBN), it is now possible to build a brain for the network so it can think and take care of itself.
Google and Amazon are using predictive algorithms to create a brain in a datacenter. Google is using AI-based predictive algorithms from DeepMind (the British AI company it acquired for over $600 million in 2014) to slash the enormous electricity bill of its datacenters. Amazon AWS applies ML-based predictive models to one of the toughest puzzles in datacenter management: capacity planning. AWS uses ML to forecast cloud datacenter capacity demand and to figure out where on the planet to store additional datacenter components so that it can expand capacity quickly where and when it is needed.
Autonomous network architecture (ANA) is a paradigm that would practically eliminate operational complexity regardless of the type and volume of network traffic. With ANA, the network will get a brain to self-configure, monitor, manage, correct, defend and analyze with zero human intervention. It will be predictive and adapt to its environment and will optimize and personalize experiences for the user and their specific situations, such as at home, at work or at the gym.
An autonomous network is the end state of a progressive journey that begins with automation and programmability and builds through the integration and advancement of four technology areas: programmability, semantic telemetry, machine decision-making and intent-based networking.
• Programmability - Automation and programmability are the foundations of autonomous networking. Thanks to agility and flexibility introduced by software-defined networks (SDNs), network administrators are now able to programmatically initialize, control, change and manage network behavior dynamically using open interfaces and abstraction of lower-level functionalities. With SDN topology discovery, path computation, path installation, bandwidth management and other developments, the network can be fully automated. As a next step, the automation must become more intelligent. For example, bandwidth reservation is already responsive to traffic changes, but can we make it smarter? Can we automate service placement and motion?
• Semantic Telemetry - The telemetry interface enables the collection of data from remote points to support monitoring, analysis and visualization. But SNMP (Simple Network Management Protocol), pull-based telemetry, and naïve deep-packet inspection are starting to show limitations. For the success of the autonomous network, we need telemetry that is based on push semantics and ML-based anomaly detection.
• Machine Decision Making - Today’s rules-based systems involve simple programming (if X happens, then do Y). These rules are rigid and hand-coded. ML and AI techniques will move decision-making from static programming to algorithms. AI algorithms recognize patterns in data, make predictions, and take appropriate actions without having to be explicitly programmed. The more data that is fed into training algorithms, the smarter the networks become. ANA has several advanced AI/ML models built into it for finding quick and foolproof solutions for the real-world challenges faced by the network.
• Intent-Based Networking - In intent-based systems, administrators tell the network what they want done; the network determines how, and the specific tasks are automated to make it happen. Intent-based networking allows IT to move from tedious traditional processes to automating intent, making it possible to manage millions of devices in minutes. This is a crucial development to help organizations navigate today's ever-expanding technology landscape. For example, if a business wants to secure all traffic from accounting, that command is issued, and the system takes care of all the technical details. Network changes are automated and continuous, so if a worker moves, all the policies and network settings follow him or her. Gartner predicts that by 2020, 10% of enterprises will use intent-driven network design and operation tools (an increase from 0% today), which will reduce network outages by 65%.
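To make the intent-based idea concrete, the accounting example can be sketched as a tiny "intent compiler". Everything here is an assumption made for illustration: the `Intent` and `Endpoint` types, the inventory and the per-switch policy format are hypothetical and do not correspond to any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    group: str     # business-level group, e.g. "accounting"
    require: str   # desired outcome, e.g. "encrypt"


@dataclass
class Endpoint:
    user: str
    group: str
    switch: str    # switch the user is currently attached to
    port: int


def compile_intent(intent, endpoints):
    """Translate one intent into per-switch port policies (hypothetical format)."""
    plan = {}
    for ep in endpoints:
        if ep.group == intent.group:
            plan.setdefault(ep.switch, []).append(
                {"port": ep.port, "policy": intent.require}
            )
    return plan


endpoints = [
    Endpoint("alice", "accounting", "sw1", 3),
    Endpoint("bob", "engineering", "sw1", 4),
    Endpoint("carol", "accounting", "sw2", 7),
]
intent = Intent(group="accounting", require="encrypt")
plan_before = compile_intent(intent, endpoints)

# Alice moves desks: only the inventory changes, the intent stays the same,
# and recompiling makes the policy follow her to the new switch port.
endpoints[0].switch, endpoints[0].port = "sw2", 9
plan_after = compile_intent(intent, endpoints)
print(plan_before)
print(plan_after)
```

The administrator states the "what" once, and the "how" is regenerated whenever the network state changes. That separation is what lets policies follow a worker who moves.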
These four key technology areas, combined with local and global awareness, will help to build the right knowledge for the network. Local awareness comes through continuous monitoring of the underlying network. While local awareness will remain essential, increased global awareness will usher in the ANA, featuring root-cause analysis using supervised learning; time-based trending to establish and adapt baselines; correlation of information across geographies, layers and peers; and optimal local decisions based on a global state.
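One way to picture "time-based trending to establish and adapt baselines" is an exponentially weighted moving average (EWMA) over pushed telemetry samples. This is a generic technique chosen for illustration, not the method used by ANA; the class name, parameters and default values are assumptions.

```python
import math


class AdaptiveBaseline:
    """EWMA baseline that flags samples far from the learned norm."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
        self.alpha = alpha          # how quickly the baseline adapts
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # samples to observe before flagging anything
        self.mean = None
        self.var = 0.0
        self.count = 0

    def update(self, value):
        """Consume one pushed sample; return True if it looks anomalous."""
        self.count += 1
        if self.mean is None:
            self.mean = float(value)
            return False
        deviation = value - self.mean
        std = math.sqrt(self.var)
        anomalous = (
            self.count > self.warmup
            and std > 0
            and abs(deviation) > self.threshold * std
        )
        if not anomalous:
            # Only normal samples move the baseline, so a spike cannot
            # pull the baseline toward itself and hide later anomalies.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return anomalous


baseline = AdaptiveBaseline()
normal_samples = [100, 102, 98, 101, 99, 100, 101, 99, 100, 101]
normal_flags = [baseline.update(v) for v in normal_samples]
spike_flag = baseline.update(500)   # a sudden jump, e.g. in link utilization
print(normal_flags, spike_flag)
```

A single detector like this provides only local awareness. The correlation across geographies, layers and peers described above would sit on top of many such baselines.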
The Aricent Autonomous Network architecture is shown in Figure 1 below. The architecture enables the network to self-learn through continuous feedback using closed-loop automation. It uses a multi-level intent orchestrator to apply the identified intents back to the network in a simple way.
Figure 1: The Aricent Autonomous Network Architecture
The Aricent Autonomous Network brings immense value to our customers in five important ways:
• Enhanced end-user experience and reduced customer churn through proactive detection of network issues and by meeting dynamic customer needs.
• Higher operational efficiency by simplifying network operations and facilitating a consistent, error-free network.
• Guaranteed SLAs, to avoid the financial impact of SLA penalties and improve brand reputation.
• Improved employee productivity by freeing up expert resources from day-to-day tasks like debugging and root-cause analysis.
• Reliance on trusted and proven telecom expertise, to minimize the risk of inadequate solution capabilities and to gain a competitive advantage.
We take a staged approach to achieving a fully autonomous network, as shown in Figure 2 below.
Figure 2: Aricent Staged Approach for the Autonomous Network
The journey toward the autonomous network will be challenging, but the rewards will unfold in real-time. The most important decision is to take the first step.
In my next post, I will talk about the key use cases and the related AI/ML algorithms, tools and datasets that are required to create the relevant prediction models.
1. Wireless Connectivity Fuels Industry Growth and Innovation in Energy, Health, Public Safety, and Transportation https://www.ctia.org/docs/default-source/default-document-library/deloitte_20170119.pdf
2. Gartner Newsroom http://www.gartner.com/newsroom/id/3598917
3. AT&T's Network of the Future http://about.att.com/innovation/sdn
4. 12 most powerful hyper-converged-infrastructure vendors https://www.networkworld.com/article/3112622/hardware/12-most-powerful-hyperconverged-infrastructure-vendors.html
5. Velocloud Case Study - SD-WAN for industrial 3D printing and robot automation http://www.velocloud.com/customers/case-study-deutsche-telekom
6. Pyramid Research, Inc report: Demystifying Opex & Capex Budgets - Feedback from Operator Network Managers to their offering http://www.businesswire.com/news/home/20070305005538/en/Research-Markets-Mobile-Operators-Spending-US400-500bn-Annually
7. Juniper The Self-Driving Network™ Restoring Economic Sustainability to Your Infrastructure https://www.juniper.net/us/en/dm/the-self-driving-network/
8. Cisco - The Network Intuitive https://www.cisco.com/c/en_in/solutions/enterprise-networks/index.html
9. New ETSI group on improving operator experience using Artificial Intelligence http://www.etsi.org/news-events/news/1171-2017-02-new-etsi-group-on-improving-operator-experience-using-artificial-intelligence
10. Building an intelligent system of insight and action for 5G network management http://www.cognet.5g-ppp.eu/
11. DeepMind AI Reduces Google Data Centre Cooling Bill by 40% https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-40/
12. AI Tells AWS How Many Servers to Buy and When http://www.datacenterknowledge.com/archives/2017/05/19/report-ai-tells-aws-many-servers-buy
13. Gartner 2017 Strategic Roadmap for Networking https://www.gartner.com/doc/reprints?id=1-3YQCDRO&ct=170425&st=sb&oid=anren000267&elqTrackId=40b379c3fd274b81854fa6116b9b9abf&elqaid=5173&elqat=2