I was fascinated last week while reading an article on Bloomberg, which showed how Big Data has essentially transformed Guizhou, one of the poorest provinces in China, into the third-fastest growing province in the country, now boasting a rapid development pace of 10.5 percent. Historically, China has been the economic powerhouse of manufacturing and labor. But the next economic wave depends on China’s ability to capitalize on Big Data and advanced analytics. In fact, just last month, Gartner released the top 10 technology trends driving the Chinese economy, with Big Data listed as the crucial backbone for 6 of the 10 trends. With these technological investments, business intelligence shows promise, not only as a way to improve operations, but also as a means to derive value from the rapidly growing amount of consumer data.
BAT – Romance of the Three Kingdoms
“Romance of the Three Kingdoms”, a classical novel of Chinese literature, describes the wars among three powerful kingdoms fighting to replace the dwindling Han Dynasty around 1800 years ago. Similarly, with more than 650 million tech-savvy users, China’s internet world today is dominated by three kingdoms – Baidu, Alibaba and Tencent (collectively referred to as BAT). Baidu, sometimes called the Google of China, holds a commanding 71 percent of market share in search. Alibaba holds a similarly powerful market share in e-commerce, with a record $14.3 billion in total sales volume during Singles’ Day last year. Tencent is the dominant player in social media with over 600M active users. In fact, just in June this year, Tencent zoomed past Alibaba to become China’s most valuable tech company.
These three kingdoms have made huge progress towards becoming technological powerhouses in industries such as online services, smartphone technology and telecommunication. Their widely successful internet services give them a treasure trove of data to analyze and plenty of customers to experiment on. In early 2015, Baidu made a huge impact in Artificial Intelligence by announcing the Minwa Supercomputer, powered by a Mellanox network and NVIDIA GPUs. Alibaba, for its part, held the record for the fastest Daytona GraySort and MinuteSort, with 15.9TB/min and 7.7TB respectively, and for Indy GraySort and MinuteSort, with 18.2TB/min and 11TB respectively. Yesterday, Tencent, along with Mellanox and IBM, announced that it has been named the 2016 winner of Sort Benchmark’s annual global computing competition. Tencent broke records in the GraySort and MinuteSort categories, improving on Alibaba’s overall results from last year by up to five times and achieving more than one terabyte per second of sort performance. In addition, the per-node results improved by up to 33 times.
Terasorting on Tencent Cloud’s OpenPower-based Cluster
Each year, leading global companies and academic institutions participate in the Sort contest to evaluate the capability of their software and hardware system architectures, as well as their research results. The TeraSort benchmark (www.sortbenchmark.org) is touted as the gold standard of sort benchmarks. TencentCloud Intelligent Distributed Computing Platform participated in two of the four competition categories – GraySort and MinuteSort for both Daytona (general purpose sort) and Indy (special purpose sort).
Using 512 OpenPower-based servers, with NVMe-based storage and Mellanox ConnectX®-4 100Gbps Ethernet adapters, TencentCloud took less than 99 seconds to finish sorting a massive 100 terabytes of data, and used 85 percent fewer servers than the 3,377 servers used by last year’s winner. To achieve this, Tencent developed their own sort application and tuned it specifically for the benchmark. The combination of a tuned sort application, NVMe storage and high-performance CPUs pushes the analytics boundary, so the latency and bandwidth of the network play a crucial part in achieving maximum performance. With advanced hardware-based stateless offloads and a flow steering engine, Mellanox’s ConnectX-4 adapter reduces the CPU overhead of packet processing and provides the lowest latency and highest bandwidth.
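To put the headline figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above (100 terabytes, 99 seconds, 512 servers, 3,377 servers). Because the run took less than 99 seconds, the throughput values are lower bounds. The function name and the use of decimal units (1 TB = 1000 GB) are assumptions made for illustration, not part of the benchmark report.

```python
# Figures quoted in the article. ELAPSED_S is an upper bound
# ("less than 99 seconds"), so the throughputs below are lower bounds.
DATA_TB = 100
ELAPSED_S = 99
SERVERS = 512
PREV_SERVERS = 3377


def sort_summary(data_tb, elapsed_s, servers, prev_servers):
    """Return (cluster TB/s, per-node GB/s, percent fewer servers)."""
    cluster_tb_per_s = data_tb / elapsed_s
    # Assumes decimal units: 1 TB = 1000 GB.
    per_node_gb_per_s = cluster_tb_per_s * 1000 / servers
    fewer_servers_pct = (1 - servers / prev_servers) * 100
    return cluster_tb_per_s, per_node_gb_per_s, fewer_servers_pct


cluster, per_node, fewer = sort_summary(DATA_TB, ELAPSED_S, SERVERS, PREV_SERVERS)
print(f"Cluster sort rate: at least {cluster:.2f} TB/s")
print(f"Per-node sort rate: at least {per_node:.2f} GB/s")
print(f"Servers saved versus last year's winner: about {fewer:.0f}%")
```

The per-node figure is an end-to-end average over the whole sort, including reading and writing the data, so it is expected to sit well below the raw 100GbE line rate.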
“Mellanox ConnectX-4 100GbE NIC optimizations include enabling Large Send Offload (LSO), Large Receive Offload (LRO), and 64KB socket buffers to leverage LSO and LRO, using large packets (MTU 9000), and managing interrupt NUMA affinity. When the shuffle stage is run in isolation, per-node sustained throughput is close to 10GB/s.”
The cluster consists of 32 racks, each equipped with 16 servers, interconnected with Mellanox Spectrum-based 32-port 100GbE SN2700 switches (32 leaf switches and 16 spine switches) and Mellanox’s 100GbE LinkX optical cables. Sixteen ports in each leaf switch are connected to the servers in the rack, and the other 16 ports are connected to the spine switches, one port per spine. The SN2700 switch provides the highest performance fabric solution in a 1U form factor, delivering non-blocking throughput for big data workloads, with predictable low latency and zero packet loss. Due to bursty network traffic in big data workloads, non-blocking switches play a crucial role in delivering predictable real-time analytics. In addition, Mellanox DAC and optical cables offer reliable, high-quality connections, featuring error rates up to 100 times lower than the industry standard. To learn more, check out the Tencent whitepaper.
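The port arithmetic behind this leaf-spine fabric can be checked with a short Python sketch. It reads the description as 32 racks of 16 servers each (which matches the 512 servers mentioned earlier), with one leaf switch per rack and one uplink from every leaf to every spine. The function and field names are hypothetical and chosen only for illustration.

```python
# Figures taken from the cluster description in the article.
RACKS = 32             # assumed: one leaf switch per rack
SERVERS_PER_RACK = 16
SPINES = 16
SWITCH_PORTS = 32      # ports on each 100GbE switch
LINK_GBPS = 100


def leaf_spine_summary(racks, servers_per_rack, spines, switch_ports, link_gbps):
    """Check port budgets and compute the leaf oversubscription ratio."""
    leaves = racks
    downlinks = servers_per_rack   # leaf ports facing servers
    uplinks = spines               # one leaf port per spine switch
    if downlinks + uplinks > switch_ports:
        raise ValueError("leaf switch does not have enough ports")
    if leaves > switch_ports:
        raise ValueError("spine switch does not have enough ports")
    return {
        "servers": racks * servers_per_rack,
        "leaf_ports_used": downlinks + uplinks,
        "spine_ports_used": leaves,
        "leaf_spine_links": leaves * spines,
        # 1.0 means uplink capacity equals downlink capacity (non-blocking).
        "oversubscription": (downlinks * link_gbps) / (uplinks * link_gbps),
    }


summary = leaf_spine_summary(RACKS, SERVERS_PER_RACK, SPINES, SWITCH_PORTS, LINK_GBPS)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these assumptions every port on every switch is in use, and each leaf has as much capacity towards the spines as towards its servers, which is what the non-blocking claim amounts to.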
The key takeaway from this accomplishment is that a 10GbE-based network can no longer sustain the demand for real-time and advanced analytics as the industry rapidly migrates to faster CPUs with faster flash-based SSDs and NVMe storage. While fewer enterprise customers will jump to a 40GbE network, many will migrate to a more efficient and cost-effective 25/50/100GbE network. In fact, moving to 25GbE today makes perfect sense, allowing businesses to future-proof their data center fabrics. On the other hand, hyperscale companies such as Baidu, Alibaba and Tencent, which are in the vanguard of technological innovation, will drive the demand for 100GbE-based networks as a way to solve their challenging analytics problems.