You get “the call”: a service is down. You know the drill: “Is it a networking problem, or am I being blamed for some server or storage failure?” And the race begins. You know the network is probably fine, but until you prove it, you will be the one held responsible for your company losing money, losing customers and damaging its reputation.
“Time To Innocence” is the key. How fast can you prove it is not the network’s fault?
Let’s look at the tools you’ve got to solve your problem:
- SNMP MIBs
- Performance Monitoring Tools gathering the above data
So now you have to search for the piece of data that will solve the mystery, and usually you’ll go: “TX counters minus RX counters do not equal zero. Yes, we found packet loss!”
Great progress, now you’re just a few hours (!) away from identifying the issue: you know you’ve dropped packets, and now you can start investigating.
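The counter check above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample counter values are made up for this example, and it assumes you have already polled the TX counter on the sending port and the RX counter on the receiving port at two points in time.

```python
def counter_delta(before: int, after: int, width_bits: int = 64) -> int:
    """Increase of a monotonically growing counter, allowing for one wraparound."""
    return (after - before) % (1 << width_bits)


def lost_packets(tx_delta: int, rx_delta: int) -> int:
    """Packets the sender transmitted that the receiver never counted."""
    return tx_delta - rx_delta


# Hypothetical samples taken at the start and end of a polling interval.
tx = counter_delta(before=10_000, after=11_000)
rx = counter_delta(before=20_000, after=20_997)
print(f"lost packets in interval: {lost_packets(tx, rx)}")
```

A non-zero result tells you that packets went missing in the interval, but nothing about why, which is exactly the gap described here.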
You know the packet drop came from a specific port on a specific switch. But why? The switch won’t tell you why… it just drops the packet.
So now you want to reproduce the issue. For that, you need it to happen again so you can debug it yourself, or have the vendor debug it remotely or on site. The problem is that this packet drop doesn’t happen when you want it to; it happens when it happens… and you’re stuck.
Mellanox Introduces “What Just Happened,” Advanced Streaming Telemetry Technology
2019: Mellanox introduces “What Just Happened” (WJH), our new Advanced Streaming Telemetry Technology. WJH tells you why the packet was dropped, when, in which protocol and more.
Let’s rethink this statement: “Switches don’t tell us why, they just drop the packet”.
Does it have to be this way? Why wouldn’t switches tell you why a packet was dropped and save all the hassle? When switches drop packets, they do so for a reason. It’s not a bug, but rather what they are designed to do. Here are a few examples of scenarios in which a switch is asked to drop packets:
- ACL action: “Drop the packets sent from a specific IP”
- “Drop the packet if TTL=0” (i.e. the packet has traveled too long in the network, so there’s probably a loop)
- “Drop the packet if SMAC=DMAC” (in other words, you sent the packet to yourself)
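To make the point concrete, here is a minimal Python sketch of a drop decision that returns its reason along with the verdict. It is not how any real switch or WJH is implemented; the `Packet` fields, the `BLOCKED_SOURCES` set and the reason strings are all hypothetical and only mirror the three examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    ttl: int
    smac: str
    dmac: str


# Hypothetical ACL: source addresses the operator asked the switch to drop.
BLOCKED_SOURCES = {"10.0.0.66"}


def drop_reason(pkt: Packet) -> Optional[str]:
    """Return why the packet would be dropped, or None if it is forwarded."""
    if pkt.src_ip in BLOCKED_SOURCES:
        return "ACL deny: source IP is blocked"
    if pkt.ttl == 0:
        return "TTL expired: possible loop in the network"
    if pkt.smac.lower() == pkt.dmac.lower():
        return "Source MAC equals destination MAC"
    return None


looped = Packet(src_ip="10.0.0.5", ttl=0,
                smac="aa:bb:cc:00:00:01", dmac="aa:bb:cc:00:00:02")
print(drop_reason(looped))
```

The reason is already known at the moment the decision is made; reporting it is a matter of exporting that value instead of throwing it away.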
So… if the switch knows why it dropped the packet, why doesn’t it report that information?
Well, now it does.
Mellanox switches introduce “What Just Happened!”
Mellanox “What Just Happened” tells you why the packet was dropped, when it happened, who sent the packet, to whom, in which protocol, on which VLAN and more. WJH can even tell you whether the issue was related to the network or rather to the server or the storage. It provides recommended actions to ease troubleshooting, and it is available on three different network operating systems.
Your time is precious. Don’t waste it, talk to me: https://www.mellanox.com/products/what-just-happened/lets-talk/
Mellanox “What Just Happened”, available TODAY, taking network telemetry to the next level.