# Arduino Code – Performance

I’m currently making a self-balancing robot powered by an Arduino Nano, and decided to drive it with NEMA 17 stepper motors.

The DRV8825 motor drivers are great, but they cannot be driven using a PWM signal. Consequently, the Arduino code must send a ‘step’ signal to the correct pin many times per second: easily thousands of times if you need a decent amount of speed, especially if you’re micro-stepping.

Performance is critical…

To make the stepper rotate one full turn requires it to step 200 times. In my case I’m using a micro-step level of ‘8’, meaning I actually have to send 1,600 ‘step’ signals (200 * 8) to the DRV8825 input pin for each revolution. So a speed of 2 revolutions per second actually needs 3,200 steps per second! Getting this sort of speed (combined with code to calculate motor acceleration, the angle of the robot, etc.) is no simple feat!
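To put numbers on that, here is a small sketch of the arithmetic. The constant and function names are my own for illustration; the values are the ones quoted above.

```cpp
#include <stdint.h>

// Values taken from the text above.
constexpr uint16_t FULL_STEPS_PER_REV = 200;  // full steps per turn
constexpr uint8_t  MICROSTEPS         = 8;    // micro-step level

// Step pulses needed for one full revolution (200 * 8 = 1,600).
constexpr uint32_t stepsPerRev() {
  return (uint32_t)FULL_STEPS_PER_REV * MICROSTEPS;
}

// Step pulses per second for a speed given in revolutions per second.
constexpr uint32_t stepsPerSecond(uint32_t revsPerSecond) {
  return stepsPerRev() * revsPerSecond;
}

// Time between step pulses in microseconds, rounded down.
// Assumes revsPerSecond is non-zero.
constexpr uint32_t stepIntervalMicros(uint32_t revsPerSecond) {
  return 1000000UL / stepsPerSecond(revsPerSecond);
}
```

At 2 revolutions per second `stepIntervalMicros(2)` comes out at 312, so everything else the loop does has to fit into roughly 312 µs between pulses.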

To achieve the best code performance you really need to analyse how long each loop cycle takes, trimming unnecessary calculations as far as possible. You may or may not know that different types of calculation take different amounts of time. For example, multiplying two `int`s is faster than multiplying two `double`s, and any sort of ‘divide’ operation is significantly slower again (see below!).
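One way to measure this is to time many repetitions of an operation and take the average. The helper below is a minimal sketch, not a polished benchmark: `averageMicros` is a name I made up, and the clock is passed in as a parameter so the same code can be tried off the board. On an Arduino you would pass `micros`, which only has a resolution of 4 µs on 16 MHz boards, hence the averaging.

```cpp
#include <stdint.h>

// Average time per call of `body`, in microseconds (rounded down).
// `now` is any callable returning a microsecond timestamp; on an
// Arduino that would be micros(). Unsigned subtraction keeps the
// result correct across a single timer rollover.
template <typename Clock, typename Fn>
uint32_t averageMicros(Clock now, Fn body, uint16_t iterations) {
  if (iterations == 0) return 0;
  const uint32_t start = now();
  for (uint16_t i = 0; i < iterations; ++i) {
    body();
  }
  const uint32_t elapsed = now() - start;
  return elapsed / iterations;
}

// Possible use inside an Arduino sketch (assumes a C++11 toolchain).
// The variables are volatile so the compiler cannot remove the divide:
//   volatile float a = 1.5f, b = 3.0f, r;
//   uint32_t us = averageMicros(micros, [&] { r = a / b; }, 1000);
```

The figure includes the loop and call overhead, so treat it as a way to compare operations rather than as an exact cost.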

There are various tricks you can use to keep performance high:

• Use the simplest possible numeric types.
• Apply fixed point math.
• Pre-calculate as many values as possible.
• Consider that a ‘divide’ operation can be achieved by multiplying by a value’s reciprocal. (E.g. x / y is the same as x * (1 / y), and (1 / y) might be pre-calculated.)
• Perform some operations (such as reading gyro values) only periodically. (Made simpler by using my Timed Events code!)
• Use faster pin I/O code.
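Two of these tricks can be sketched in a few lines: dividing by a pre-calculated fixed-point reciprocal, and writing to a pin by setting a bit in its port register instead of calling `digitalWrite()`. All of the function names here are hypothetical, and the code makes some assumptions: the divisor is non-zero, inputs are 16-bit unsigned values, and the port is passed in as a plain volatile byte (on a Nano's ATmega328P, digital pins 0 to 7 are bits 0 to 7 of `PORTD`).

```cpp
#include <stdint.h>

// Q16 fixed point: a fraction f is stored as f * 65536.
constexpr uint8_t FRACTION_BITS = 16;

// Done once, outside the hot loop: 1 / divisor in Q16, rounded up.
// This still costs a division, but only when the divisor changes.
uint32_t reciprocalQ16(uint16_t divisor) {
  return ((1UL << FRACTION_BITS) + divisor - 1) / divisor;
}

// Done in the hot loop: x / divisor becomes a multiply and a shift.
// The result is approximate. It matches integer division for small
// inputs but can be off by one for large ones, so check your range.
uint16_t divideByReciprocal(uint16_t x, uint32_t reciprocal) {
  return (uint16_t)(((uint32_t)x * reciprocal) >> FRACTION_BITS);
}

// Direct port access: set or clear a single bit of an 8-bit register.
// On the board, `port` would be something like PORTD.
inline void pinHigh(volatile uint8_t &port, uint8_t bit) {
  port = (uint8_t)(port | (1u << bit));
}

inline void pinLow(volatile uint8_t &port, uint8_t bit) {
  port = (uint8_t)(port & ~(1u << bit));
}
```

For example, `divideByReciprocal(1000, reciprocalQ16(10))` gives 100 without a divide in the loop. One caveat on the pin helpers: a `pinHigh` followed immediately by a `pinLow` may produce a pulse shorter than the driver accepts, so check the minimum STEP pulse width in the DRV8825 datasheet.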

There’s an extremely handy table showing how long various Arduino operations take. It’s definitely worth keeping at hand! I’ll include a copy here for reference:

```
Speed test
----------
F_CPU = 16000000 Hz
1/F_CPU = 0.0625 us
The next tests are runtime compensated for overhead
Interrupts are still enabled, because millis() is used for timing
nop                       : 0.063 us
avr gcc I/O               : 0.125 us
Arduino digitalWrite      : 5.092 us
pinMode                   : 4.217 us
multiply byte             : 0.632 us
divide byte               : 5.412 us
multiply integer          : 1.387 us
divide integer            : 14.277 us
multiply long             : 6.100 us
divide long               : 38.687 us
multiply float            : 7.110 us
divide float              : 79.962 us
itoa()                    : 13.397 us
ltoa()                    : 126.487 us
dtostrf()                 : 78.962 us
random()                  : 51.512 us
y |= (1<<x)               : 0.569 us
bitSet()                  : 0.569 us
```

Notice how much longer a ‘divide’ operation takes! Over 11x slower for a `float`!