BT is one
of the largest telecommunications providers in the world, with £24 billion (US$33.5
billion) in revenue in 2017.
The ability to broaden and deepen customer
relationships is key to achieving sustainable, profitable growth in today’s
competitive landscape. In order to meet customer expectations, it is essential
for the company to know who their customers are, what services they are using
and how those services are performing. Maintaining the quality and integrity of
those data assets is a challenge.
Several years ago, Phillip Radley, chief data architect at
BT, was having a discussion
with colleagues about the next iteration of a critical extract, transform, and
load (ETL) “pipeline.”
In the legacy environment, business client
records were spread across multiple databases. They needed to be reconciled and
updated daily with Dun & Bradstreet data
in order to provide business units with the most relevant and up-to-date
Nearly one billion records were being
compared and reconciled daily, and BT’s legacy ETL platform, built on a
traditional relational database, couldn’t keep up with the pace. It was taking
more than 24 hours to process 24 hours’ worth of data. Consequently, BT’s
business units were working with day-old data at any given point in time.
the big data challenges
The team initially had a proposal to re-platform
the system to a new relational database.
“But as we sat down, our discussion turned
to [Apache] Hadoop. We realized we basically had a data velocity problem. We
had to process the data faster and increase the volume that we could
ingest—both of which Hadoop excels at,” Mr Radley said.
BT engaged Cloudera to install a
production-ready Hadoop cluster that replaced the batch ETL application with MapReduce
routines, and went from PowerPoint to production in nine months.
The company wanted its Linux administrators
to manage the data platform instead of hiring new talent. Cloudera provided the
required training, saving the company time and money.
“The Cloudera University training course
was not only high quality, but also the trainers were able to understand what
we were trying to accomplish and helped ramp up the team quickly. The same
people who run our 30,000 Linux servers also now run Hadoop, and they can do
that on top of their other responsibilities,” said Mr Radley.
The new enterprise data hub (EDH) approach
not only solved BT’s immediate ETL problem but also helped tackle a host of
big data challenges, allowing BT to fast-track the delivery of new offerings.
BT has 1,900 operational systems and
several of the world’s largest data warehouses. The EDH runs below the
operational systems and the systems extend their data into the EDH. The data
can then be shared and exposed as required. This unified, cost-effective
infrastructure enables BT to gain unified views of its data across its multiple
The platform also provides the ability to combine
batch, streaming, and interactive analytics and allows business intelligence
(BI) teams to perform SQL queries on the data.
Additionally, the environment enables the
company to extend data retention from one year to more than 10 years when
needed and implement innovative knowledge management use cases.
Security and stability were vital to the
platform’s success. Security had to be as good as business-as-usual security.
Cloudera Manager rolling upgrades allowed BT to keep the platform on the latest
release to get quick access to new features without service interruptions,
while the data governance solution, Cloudera
Navigator, saved time auditing the platform and tracking data lineage.
Accelerated data velocity
The move to the new platform enabled BT to increase
data velocity by a factor of 15, processing five times the data in a third of
the time. Businesses were now working with today’s data instead of yesterday’s.
The move also delivered substantial cost savings for BT.
Mr Radley said, “Putting the data on Hadoop
was much cheaper than putting it on a standalone system.”
One-year return on investment (ROI) from
the deployment was in the range of 200 to 250 percent. Moreover, BT could
now undertake new projects quickly and at a much lower incremental cost.
Better Broadband Service for customers and cost
savings for BT
Following the success of its ETL
initiative, BT started utilising the EDH to help deliver improved broadband
services. BT could use all the raw data, processing it faster and at a much
lower cost. The resulting improvement in network analytics helped BT understand
how to deliver better network performance, which is beneficial for customers.
The speed of an individual line is
dominated by its length (the distance from network equipment to a customer’s
premises), but many other factors can have a significant impact on customer
BT’s copper network has been in existence
for around 50 years. It predates the Internet and broadband services and has
significant legacy test infrastructure that did not always provide a reliable
indication of Internet performance.
The EDH was used to combine network
topology (Geographic Information System or GIS) data with terabytes of DSL (digital
subscriber line) performance (time series) and electrical line test data to
grade the quality of every line in the network. This helped indicate if slow
speed was a network issue or a customer issue. Using this network analysis, the
probability of a successful outcome of an engineer dispatch could be predicted,
reducing wasted in-person engineer visits.
Supporting Urban Planning with IoT Data
Cloudera also helped BT to take advantage
of the Internet of Things (IoT) with its fleet management services to utility
companies. Having the ability to instrument those vehicles and collect data
from them to enable predictive analytics around vehicle faults and failure
provides BT with a competitive edge.
Perhaps the best example of how BT taps into
the massive potential of IoT is its work with Milton Keynes (MK), a
fast-growing town in Buckinghamshire, England.
BT was part of the MK:Smart initiative, which concluded last
year. It was a large collaborative initiative, partly funded by HEFCE (the
Higher Education Funding Council for England) and led by The Open University.
It aimed to support sustainable growth without exceeding the capacity of
the infrastructure, while meeting key carbon reduction targets.
As part of MK:Smart, sensors
were installed in car parking spaces that broadcast whether the spots are vacant
or occupied. Citizens and visitors can then use a smartphone app that guides
them to the nearest free parking space based on the sensor data. This data
was analysed in the central MK Data Hub.
The data can ultimately
be used to make evidence-based decisions on multi-million-pound
infrastructure, such as the location and size of future car parks.
Mr Radley said that the data from things,
such as car parking spaces, recycling bins, and street lights, can provide
valuable insights and needs to be captured, analysed, and made available.
When this is scaled up for a large town or
city, the volumes of data can become large and meaningful, providing
significant insights and value to the community and business.
Hadoop MapReduce is a software framework for easily writing
applications which process vast amounts of data (multi-terabyte data-sets)
in-parallel on large clusters (thousands of nodes) of commodity hardware in a
reliable, fault-tolerant manner.
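
To make the programming model concrete, the following is a minimal, single-process sketch of the map, shuffle, and reduce phases in plain Python. It does not use the Hadoop API and is not BT's implementation. The record fields ("id", "name", "updated") and every function name are hypothetical, chosen only to mirror the idea of reconciling business client records from several sources by keeping the most recently updated copy of each.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def key_by_business_id(record):
    """Hypothetical mapper: emit each record under its business identifier."""
    yield record["id"], record


def keep_latest(key, values):
    """Hypothetical reducer: keep the most recently updated record.

    Assumes "updated" is an ISO 8601 date string, which sorts chronologically.
    """
    return max(values, key=lambda r: r["updated"])


def reconcile(records):
    """Run the three phases in sequence over an in-memory list of records."""
    pairs = map_phase(records, key_by_business_id)
    return reduce_phase(shuffle_phase(pairs), keep_latest)


if __name__ == "__main__":
    # Made-up sample data standing in for internal and reference records.
    internal = [
        {"id": "B1", "name": "Acme Ltd", "updated": "2017-01-10"},
        {"id": "B2", "name": "Bolt plc", "updated": "2017-03-02"},
    ]
    reference = [
        {"id": "B1", "name": "Acme Limited", "updated": "2017-02-01"},
    ]
    for business_id, record in sorted(reconcile(internal + reference).items()):
        print(business_id, record["name"])
```

On a real cluster the framework distributes the map and reduce calls across many nodes and handles the grouping and failure recovery itself; the sketch only shows the shape of the computation that makes that distribution possible.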