When you think back on the famous cities you have visited, like Paris or New York, what usually comes to mind are the buildings, the parks, the boulevards, and maybe the museums. We tend to ignore the part of the city that thrives under the pavement--the tunnels, pipes, and wiring--even though this infrastructure plays a vital role in keeping the city alive and functioning.
Powerful things sometimes lie below the surface, easily escaping our attention. One often unnoticed and frequently under-appreciated component within Teradata Vantage is the BYNET.
What is the BYNET?
The BYNET is the system interconnect that allows the various components of the Vantage database to communicate. This is important because Vantage is composed of many self-contained virtual processors with no inherent connection among them. Just like underground cables, messages travel along the BYNET between the parsing engine modules, which do all logon and query pre-processing activities, and the AMPs. AMPs are the virtual processes that run in parallel and do the database work involved in query executions. Typically, the data from a relational table is spread across all AMPs in the configuration, with each AMP owning a subset of a table's rows.
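To make the idea of each AMP owning a subset of a table's rows concrete, here is a minimal sketch. It assumes rows are assigned to an AMP by hashing a key column; the names `owning_amp` and `distribute`, the `id` column, and the use of CRC32 are all illustrative stand-ins and not Vantage's actual hashing or API.

```python
import zlib

def owning_amp(key: str, num_amps: int) -> int:
    """Pick the AMP that owns a row (CRC32 is a stand-in hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_amps

def distribute(rows, num_amps):
    """Spread rows across AMPs so that each row lands on exactly one AMP."""
    amps = {amp: [] for amp in range(num_amps)}
    for row in rows:
        amps[owning_amp(row["id"], num_amps)].append(row)
    return amps

# Hypothetical table of 20 rows spread across 4 AMPs.
rows = [{"id": f"cust-{n}", "n": n} for n in range(20)]
placement = distribute(rows, 4)
for amp, owned in placement.items():
    print(amp, len(owned))
```

Because the owning AMP is computed from the key alone, any component can work out where a row lives without asking the other AMPs.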
When I first started at Teradata, the YNet was the interconnect that glued all the physical components within the database together. The YNet was architected to offer a wide range of efficiencies, all of which have been inherited by today's BYNET, Teradata's second-generation interconnect.
Although primarily a message delivery mechanism, the BYNET is so much more.
Three Key Capabilities of the BYNET that Are Critical to Overall Performance of Vantage
Here are three key capabilities of the BYNET that I consider critical to the overall performance delivered by Vantage today.
1. Multi-AMP Query Coordination
If a group of us is going to meet for lunch, we need to coordinate our plans to arrive at the same time and agree on how we are going to split up the check. The BYNET imposes similar coordination among AMPs that are working in parallel on the same query. To do this, the BYNET performs an on-the-spot association of just the AMPs working on the same piece of work, making sure that all the involved AMPs have a special communication channel set up just for them. This is just like co-workers sharing a Slack or Microsoft Teams channel for direct and easy communication.
The BYNET oversees query step completion and error handling, so that if one AMP fails during the execution of its share of query work, all other AMPs on the same channel will be notified immediately. This avoids a situation in which one AMP hits a problem while the other AMPs continue their part of the work, wasting platform resources. To do this coordination, the BYNET uses lightweight communications and signaling behind the scenes, rather than sending more complex messages. Tying the lightweight communications and signaling to our lunch example, we may send a text message instead of making a phone call to adjust for last-minute changes to our reservation time or location.
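The coordination described above can be sketched as a small in-memory model. This is only an illustration of the idea, assuming a per-step group of AMPs that share completion and failure signals; the `StepChannel` class, its method names, and the status strings are hypothetical and do not reflect the BYNET's real interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class StepChannel:
    """Hypothetical channel linking only the AMPs working on one query step."""
    amp_ids: List[int]
    status: Dict[int, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every participating AMP starts out running its share of the step.
        self.status = {amp: "running" for amp in self.amp_ids}

    def report_done(self, amp_id: int) -> None:
        # A lightweight signal: only a status flag changes, no payload is sent.
        if self.status[amp_id] == "running":
            self.status[amp_id] = "done"

    def report_failure(self, amp_id: int) -> None:
        # One failure is signalled to every AMP that is still working,
        # so none of them keeps spending resources on a doomed step.
        self.status[amp_id] = "failed"
        for other in self.amp_ids:
            if self.status[other] == "running":
                self.status[other] = "aborted"

    def step_complete(self) -> bool:
        return all(state == "done" for state in self.status.values())

channel = StepChannel(amp_ids=[0, 1, 2, 3])
channel.report_done(0)
channel.report_failure(2)
print(channel.status)
print(channel.step_complete())
```

In this sketch, AMP 0 had already finished, so it keeps its "done" status, while AMPs 1 and 3 stop as soon as AMP 2 reports its failure.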
2. Final Answer Set Ordering
One of the things that impresses me the most about the BYNET is its role in returning very large sorted answer sets in an effortless manner. In most database systems, sorting a large final answer set is costly because it often involves several sub-sorts and data merges. This can be I/O-intensive and time-consuming and usually involves writing and reading intermediate data sets. Think about what it would take to reorder all of the books in your local public library so that instead of being grouped by the Dewey Decimal System they were all ordered alphabetically by title. You'd have to lift a lot of books off the shelves and probably reorder a subset at a time in temporary locations. Afterward, you would have to move all of the books to their new shelf locations.
The BYNET knows about the parallelism of the AMPs and recognizes that each AMP has built up a small sorted answer set in a buffer for its portion of the data at the end of a query. The BYNET simply reads the data from all AMPs simultaneously while maintaining the specified sorted order. Like sucking on a straw that is forked into multiple milkshakes, the BYNET pulls data off the AMPs. The answer set emerges in sorted order and is returned to the client without ever having to land anywhere for one big sort/merge operation (See Figure 2). This is an elegant and efficient compilation of the final answer set across parallel units which bypasses I/O-intensive routines and speeds up query completion.
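The general technique behind this kind of ordered read is a k-way merge over inputs that are already sorted. The sketch below uses Python's standard `heapq.merge` to show the idea; the per-AMP buffers and the function name `merge_sorted_amp_buffers` are made up for illustration, and this is not a description of how the BYNET is implemented internally.

```python
import heapq
from typing import Iterable, Iterator, List

def merge_sorted_amp_buffers(amp_buffers: List[Iterable[int]]) -> Iterator[int]:
    """Lazily yield values in global sorted order from per-AMP sorted buffers.

    heapq.merge only looks at the current head of each input, so the
    combined answer set is never materialized and re-sorted in one place.
    """
    return heapq.merge(*amp_buffers)

# Hypothetical sorted answer-set buffers, one per AMP.
amp_buffers = [
    [3, 9, 27],
    [1, 8, 64],
    [2, 4, 16, 32],
]

result = list(merge_sorted_amp_buffers(amp_buffers))
print(result)  # [1, 2, 3, 4, 8, 9, 16, 27, 32, 64]
```

The merge always takes the smallest value among the buffer heads, which is the "forked straw" picture: each step draws from whichever milkshake holds the next row in order.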