Playing with QoS and jumbo frames on the Nexus 5K

So you've introduced a Nexus stack in your datacenter instead of using those 3750x switches deployed because want DC grade switches running in your DC.

One thing you may come across is particular tuning for different types of traffic passing through your new NX switches. Whether it be iSCSI, NFS, FCoE, or just normal web traffic with little payload. With different traffic, you may think about classifying this traffic with QoS and putting policies in place. Lets start simple by classifying your iSCSI traffic for jumbo frames. The first thing we need to do is to match the traffic with an ACL:

ip access-list jumbo-frames
  10 permit ip any
  20 permit ip any
  30 permit ip any
ipv6 access-list jumbo-frames6
  10 permit ipv6 2001:db8:10::0/64 any

Now we can start by creating a class map for the QoS type of traffic using the above referenced ACL:

class-map type qos match-any jumbo-support
  match access-group name jumbo-frames
class-map type qos match-any jumbo-support6
  match access-group name jumbo-frames6

Now we have the class, where we can point out the networks we specified for jumbo frames. Now we can create a policy and mark the traffic accordingly. To keep it simple for this example, we can set a QoS group of 2, which is an internal label local to the Nexus. You can learn more about them in this article by Cisco. The next steps are to use these QoS markings to queue the traffic and set bandwidth, etc. First we will start by creating our first policy mapping, to do so:

policy-map type qos jumbo-support-qos-policy
  class jumbo-support
    set qos-group 2
  class jumbo-support6
    set qos-group 2

Now the traffic is officially marked. But if we were to apply it at this point, it still isn't doing anything. We configured the classes and applied them to the traffic. But we have no network or queuing policies to actually act on those class markings. So lets start off by configuring the queueing policies to allow at least 50% bandwidth for the jumbo frame traffic. By default, the default class is set to 100%, so the following policy will limit that also to 50:

class-map type queuing jumbo-support
  match qos-group 2
policy-map type queuing jumbo-support-queuing-policy
  class type queuing jumbo-support
    bandwidth percent 50
  class type queuing class-default
    bandwidth percent 50

And finally, to configure the network policies to actually allow for the jumbo frames, using a policy to set the MTU to a colossal size to accomodate:

class-map type network-qos jumbo-support
  match qos-group 2
policy-map type network-qos jumbo-support-network-qos-policy
  class type network-qos jumbo-support
    mtu 9216
  class type network-qos class-default

So now the class and policy mappings are complete. We are ready to apply them to your new nexus switches. To do this, we need to override the default policies. To override, you need to specify your three policies (qos, queueing, and network-qos) as the preferred system QoS policies. With the above policy examples, this looks like:

system qos
  service-policy type qos input jumbo-support-qos-policy
  service-policy type network-qos jumbo-support-network-qos-policy
  service-policy type queuing input jumbo-support-queuing-policy
  service-policy type queuing output jumbo-support-queuing-policy

So there you have it. Now your jumbo frames as specified in our ACL for IPv4 and IPv6 traffic, we now have set the proper markings to classify traffic to be used with our larger MTU size, with queuing and all. To verify that it is working, you can use the 'show queuing interface' command, like so:

prod-# show queuing int e 1/25
Ethernet1/25 queuing information:
  TX Queuing
    qos-group  sched-type  oper-bandwidth
        0       WRR             50
        2       WRR             50

  RX Queuing
    qos-group 0
    q-size: 100160, HW MTU: 1500 (1500 configured)
    drop-type: drop, xon: 0, xoff: 0
        Pkts received over the port             : 76808
        Ucast pkts sent to the cross-bar        : 67186
        Mcast pkts sent to the cross-bar        : 9622
        Ucast pkts received from the cross-bar  : 104692
        Pkts sent to the port                   : 3995088
        Pkts discarded on ingress               : 0
        Per-priority-pause status               : Rx (Inactive), Tx (Inactive)

    qos-group 2
    q-size: 100160, HW MTU: 9216 (9216 configured)
    drop-type: drop, xon: 0, xoff: 0
        Pkts received over the port             : 4342
        Ucast pkts sent to the cross-bar        : 4342
        Mcast pkts sent to the cross-bar        : 0
        Ucast pkts received from the cross-bar  : 6337
        Pkts sent to the port                   : 8375
        Pkts discarded on ingress               : 0
        Per-priority-pause status               : Rx (Inactive), Tx (Inactive)

You want to make sure you don't have any discarded packets or other signs of issues, which could mean many things such as improper configuration of the classes, policies or issues on the ports connecting to your ESXi server for example. Like mentioned above, this is a basic set of policies to get the point across and could be expanded much more. You don't have to match based on an access list, you could also match based on protocol or other traits, leading to some more advanced markings based on that.

Let me know if you have any questions or improvements to this basic tutorial.


IOS Tip: Write custom messages to Syslog

You often find yourself sifting through the logs when an issue arises on any given network device. You find useful information such as flapping indications, failed SLA ops, and several other things. These are all event-driven log messages that are written when a certain event occurs on the platform (software or hardware). It may be useful to write your own messages to the local syslog instance for various reasons:

  • To keep track of timestamps for various maintenance, research events. Useful for reporting incidents (ITIL) and also to aid in troubleshooting
  • Notes for next on-shift engineer. Things like including a CM ticket # before a change, for the next person to reference
  • Etc..

Well, this is a very simple thing to do, which involves entering the TCL shell, binding to syslog, writing your message, and then closing the syslog connection. This is done like so:


NY1-LIC-rtr01(tcl)#set log [open "syslog: " w+]

NY1-LIC-rtr01(tcl)#puts $log "MAINTENANCE BEGIN - Change: INF-6783"

NY1-LIC-rtr01(tcl)#close $log


And from there, if any issues arise during your MAINTENANCE BEGIN/END tags, you can clearly see the errors and correlate with timestamps, etc. Also, by leaving notes such as the ticket #, you can reference much more information such as what changes were made and any issues that had came up during the maintenance.

To view history of maintenance events for further troubleshooting:

NY1-LIC-rtr01#show log | i MAINT
*Apr 15 03:34:21: MAINTENANCE BEGIN - Change: INF-6711
*Apr 15 03:59:07: MAINTENANCE END - Change: INF-6711 - Successful
*Apr 22 06:34:44: MAINTENANCE BEGIN - Change: INF-6751
*Apr 22 07:25:58: MAINTENANCE END - Change: INF-6751 - Rollback
*Apr 29 01:02:22: MAINTENANCE BEGIN - Change: INF-6783
*Apr 29 01:49:18: MAINTENANCE END - Change: INF-6783 - Successful

There are other situations in which writing custom messages would be useful, be creative! I hope this is informative to anyone who hasn't done anything like this before.