
Business Challenge

Guavus is a leading provider of big data analytics solutions for the Communications Service Provider (CSP) industry. The company counts 4 of the top 5 mobile network operators, 3 of the top 5 Internet backbone providers, as well as 80% of cable MSOs in North America as customers. The Guavus Reflex platform provides operational intelligence to these service providers. Reflex currently analyzes more than 50% of all US mobile data traffic and processes more than 2.5 petabytes of data per day.

Yet that data grows at an exponential rate. Ever-increasing data volume and velocity make it harder to generate timely insights. For instance, one operational issue can quickly cascade into multiple issues downstream in the network, which makes it critical to go from insight to action in a very short time frame.

Many service providers have deployed data lakes to explore more of their information. These hold data across all time periods and are well suited to analytic jobs that take several minutes to complete at best. Yet a much more urgent and important challenge is to enable decision making on incoming streams of data in real time. Therefore, Guavus needed an ability to produce meaningful, real-time insights on just the last seconds of data, correlating data sources across network equipment, end customer devices, subscriber and billing systems. A particular challenge for the first generation of the Reflex Platform was to filter specific events from large amounts of incoming data, which is necessary in order to achieve the goal of generating real-time operational intelligence.
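The idea of keeping only the last few seconds of a stream and filtering it for specific events can be illustrated with a small sketch. This is not how Reflex is implemented; the class name `RecentEventWindow`, the event fields, and the 5-second window are all hypothetical, and the sketch assumes events are added in non-decreasing timestamp order.

```python
from collections import deque


class RecentEventWindow:
    """Keeps only events whose timestamp falls within the last window_seconds."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, event) pairs, oldest first

    def add(self, timestamp: float, event: dict) -> None:
        # Assumes timestamps arrive in non-decreasing order.
        self._events.append((timestamp, event))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events older than the window, measured from the newest timestamp.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def matching(self, predicate):
        """Return the events still in the window that satisfy predicate."""
        return [event for _, event in self._events if predicate(event)]


# Hypothetical events from different sources (network, billing, device).
window = RecentEventWindow(window_seconds=5)
window.add(0.0, {"source": "network", "type": "link_down"})
window.add(2.0, {"source": "billing", "type": "charge"})
window.add(6.0, {"source": "network", "type": "link_down"})
window.add(7.0, {"source": "device", "type": "reboot"})

# Only the link_down event at t=6.0 is still inside the 5-second window.
recent_outages = window.matching(lambda e: e["type"] == "link_down")
```

A production streaming engine would distribute this state across machines and handle out-of-order data; the sketch only shows the windowing and filtering logic on a single process.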


Since the company’s launch in February 2006, Guavus has been developing big data analytics solutions based on Hadoop and MapReduce technologies. As customers’ requirements have become more time sensitive and the technologies have matured, Guavus looked to evolve its solutions from batch-processing (

Guavus’ second product generation Reflex 2.0™ was unveiled at Spark Summit and is built on Apache Spark and Hadoop YARN. Thanks to the capabilities of Apache Spark, the latest version of Reflex expands beyond batch processing to enable truly real-time continuous analysis at scale from multiple sources including data from billing systems, CRM, OSS, networks, applications, devices and clouds. Reflex 2.0 continuously correlates, fuses and analyzes these data streams with data at rest to provide communications service providers with a 360-degree view of what’s going on in their network.

Spark’s capabilities for iterative processing, filtering and enrichment of event stream data made it ideal for use with existing event filtering algorithms. Developers and end users embraced the new platform because of its ease of use and high-level abstractions. These included Spark machine learning libraries but also a distributed SQL query engine for Hadoop data.

As part of the company’s commitment to the open source community and strong belief in Apache Spark’s capabilities for streaming analytics, the Guavus Reflex 2.0 platform is now also a Certified Spark Distribution. Certification is significant, as it will allow Guavus to innovate even faster and enhance the Reflex platform with greater flexibility, while ensuring compatibility with the latest standards and support for the growing ecosystem around Spark.

Moreover, Guavus contributes back to the open source community in the area of real-time event processing. When conducting operational analyses on network data, it is important to distinguish between the time an event actually happened and the timestamp at which the system received it. Guavus’ team has evolved the D-streams (discretized streams) feature into a concept it calls bin-streams, which better addresses this common service provider use case. This represents Guavus’ first contribution back into the Spark community.
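To see why the distinction between event time and arrival time matters, consider a minimal sketch that counts the same events in fixed-width time bins, once keyed by when each event happened and once by when it was received. This is an illustration of the general idea only, not the bin-streams implementation; the `Event` type, the `count_by_bin` helper, and the 60-second bin width are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str
    event_time: float    # seconds at which the event actually occurred
    arrival_time: float  # seconds at which the system received the event


def bin_start(timestamp: float, bin_seconds: int) -> int:
    """Return the start of the fixed-width bin that contains timestamp."""
    return int(timestamp // bin_seconds) * bin_seconds


def count_by_bin(events, bin_seconds, use_event_time=True):
    """Count events per bin, keyed by event time or by arrival time."""
    counts = defaultdict(int)
    for event in events:
        ts = event.event_time if use_event_time else event.arrival_time
        counts[bin_start(ts, bin_seconds)] += 1
    return dict(counts)


# Hypothetical events; the second one is delayed in transit and arrives
# after the 60-second bin it belongs to has already ended.
events = [
    Event("router-a", event_time=100.0, arrival_time=101.0),
    Event("router-b", event_time=119.0, arrival_time=125.0),
    Event("router-a", event_time=121.0, arrival_time=122.0),
]

by_event_time = count_by_bin(events, bin_seconds=60, use_event_time=True)
by_arrival_time = count_by_bin(events, bin_seconds=60, use_event_time=False)
# by_event_time   == {60: 2, 120: 1}
# by_arrival_time == {60: 1, 120: 2}
```

Keyed by arrival time, the delayed event is attributed to the wrong bin, which would distort any per-interval analysis of what actually happened on the network.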

Value Realized

Spark enables an “analyze first” approach that allows service providers to reason on the data as it arrives, versus the traditional “store first, ask questions later” approach of Hadoop. This real-time analytic capability is disproportionately valuable, since it allows the service provider to take action while the shopper is in the store, while the customer is on the line with the call center agent, or while fraudulent transactions are in progress.

The metrics Guavus shared at Spark Summit are certainly impressive. Reflex 2.0 processes over 2.5 petabytes of data per day, which equals 250 billion records per day, and 2.5 million transactions per second. Guavus has observed a 3 to 5x performance improvement for its Reflex 2.0 product versus its 1.x product generation, while drastically reducing the hardware footprint required.
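For a rough sense of scale, the daily figures can be turned into per-record and per-second averages. The sketch below assumes a decimal petabyte (10^15 bytes) and a uniform load over 24 hours, neither of which the source states; the variable names are illustrative.

```python
PETABYTE = 10**15  # assumption: decimal petabyte

bytes_per_day = 2.5 * PETABYTE
records_per_day = 250 * 10**9
seconds_per_day = 24 * 60 * 60

# Average size of one record implied by the two daily figures.
avg_bytes_per_record = bytes_per_day / records_per_day

# Average record rate if the load were spread evenly over the day.
avg_records_per_second = records_per_day / seconds_per_day
```

Under these assumptions the average record is about 10 KB, and the average rate is roughly 2.9 million records per second, the same order of magnitude as the quoted 2.5 million transactions per second. The source does not say how transactions map to records, so the two figures need not match exactly.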

By analyzing data as it arrives within milliseconds of when it hits the network, Guavus customers can trigger immediate actions and improve decision-making.

Guavus has built a data layer that sits on top of the Reflex platform that transforms, aggregates and correlates data from multiple sources, including streaming and stored data, and applies machine learning algorithms to then feed the data into an analytic application. The data layer works in conjunction with the application to deliver rapid time to value, in some cases reducing development time by as much as 12 months. The data layer can also feed into third party applications and into data lakes for maximum extensibility. And by creating an abstraction layer that manages the data complexity and optimizes the processing for Extract Transform Load (ETL) and Enterprise Data Warehouse (EDW) at the edge vs. in a central repository, Guavus fundamentally changes the economics of analyzing the data.

For example, a Tier 1 US Multiple System Operator (MSO) leverages the Guavus Reflex platform to correlate call center events and network streaming events to detect anomalies and identify the root cause for timely resolution. Based on these insights, the MSO was able to make adjustments in the moment to improve the customer experience. The CareReflex application allowed the MSO to discriminate between device and network related issues using one-click root-cause analysis. From there, the Interactive Voice Response (IVR) could be deflected and the customer service agent script modified accordingly. In addition, field operations were dispatched to repair faulty network equipment rather than customer premises equipment, saving the MSO millions in unnecessary truck rolls and improving mean time to resolution (MTTR) for customer call agents. The MSO estimates that this single application will result in $50 million in savings annually.

Other examples of how CSPs can use the Guavus Reflex 2.0 operational intelligence platform include:

  • Develop solutions that can be embedded into workflows and business processes to improve CAPEX/OPEX efficiencies
  • Identify and prevent fraudulent activity in the network as it happens
  • Create highly targeted, personalized marketing mobile ad campaigns based on subscriber activities
