A NeoPixel Driver using AVR Hardware
12th August 2025
This project describes a driver for NeoPixel (WS2812) LED displays based on an AVR processor, using the SPI peripheral in conjunction with a Timer/Counter and the Configurable Custom Logic (CCL):
NeoPixel Driver using the hardware peripherals in an AVR128DA28.
It's capable of driving anything from a single NeoPixel LED up to a strip of several hundred NeoPixels. I give details for running it on an AVR128DA28, but the same principles could be used with almost any recent AVR processor.
Introduction
WS2812 displays, nicknamed NeoPixels by Adafruit [1], are a popular chainable type of RGB LED display now available in a wide variety of formats. They use a non-standard protocol consisting of a serial stream of pulses, with the width of each pulse determining whether it is a '0' or a '1'. However the pulses are very short: a zero is defined as having a width of 350ns, which is just 8.4 cycles on a 24 MHz CPU. Most NeoPixel libraries therefore use hand-crafted assembler routines tailored to each processor for the low-level pulse generation.
Using peripherals
One hardware-based approach to writing a NeoPixel driver is made possible by the programmable input/output block (PIO) peripherals in the Raspberry Pi RP2040 and RP2350 processors [2]. This prompted me to wonder whether a NeoPixel driver could be implemented in a similar way on an AVR processor by using the SPI peripheral in conjunction with a hardware timer and the Configurable Custom Logic (CCL), resulting in this design.
Approach
This AVR NeoPixel Driver uses the SPI peripheral to generate the serial stream of bits, a Timer/Counter to generate waveforms with the appropriate pulse widths for the '0' and '1' bits, and the Configurable Custom Logic (CCL) to combine these signals into a single stream of pulses encoding the colours.
This approach has several advantages:
- It's less processor intensive, because each byte of data is serialised and output by the SPI peripheral independent of the processor.
- It's less tricky to program, because the timing is determined by a hardware timer, so it doesn't depend on counting instruction cycles or taking account of the effect of branches and loops.
- It's independent of the processor clock rate, so one routine could be designed to work at different clock rates.
This application would probably work using most members of the AVR DA, DB, and DD families; I chose to use an AVR128DA28 which is available in a 28-pin DIP package, ideal for experimenting with on a prototyping board.
The approach is described in greater detail below:
SPI peripheral
The colour for each NeoPixel display is specified by a serial stream of 24 bits: eight bits each for green, red, and blue, sent in that order.
This is repeated for each of the NeoPixel displays in the chain, so to light up a chain of 20 NeoPixels you transmit a stream of 60 bytes. So the first task of a NeoPixel driver is to serialise the colour data bytes.
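As a rough illustration of the serialising step, here's a sketch that runs on a PC rather than the AVR. The Rgb type and the packGrb() function are hypothetical names of my own, not part of the driver; they simply show the byte order and the three-bytes-per-pixel arithmetic:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper type for this sketch; the driver itself stores
// colours directly in a byte buffer.
struct Rgb {
    uint8_t r, g, b;
};

// Pack pixel colours into the byte order sent to the strip:
// green, red, blue for each pixel, first pixel in the chain first.
std::vector<uint8_t> packGrb(const std::vector<Rgb>& pixels) {
    std::vector<uint8_t> stream;
    stream.reserve(pixels.size() * 3);
    for (const Rgb& p : pixels) {
        stream.push_back(p.g);
        stream.push_back(p.r);
        stream.push_back(p.b);
    }
    return stream;
}
```

Packing 20 pixels this way gives a stream of 60 bytes, as described above.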
The AVR processors offer three alternative peripherals that convert bytes into a serial stream of bits: the USARTs, TWI/I2C peripherals, and SPI peripherals. I chose the SPI peripheral because it seemed easiest to adapt it for this application. The SPI peripheral can simultaneously transmit data to and receive data from another device using two serial lines, MOSI and MISO, and a clock, SCK.
The SPI peripheral can either be configured in Host mode (previously known as Master mode), in which case the SPI module generates the SCK signal to control the exchange, or Client mode (previously known as Slave mode), in which case it is clocked by the SCK signal from the Host.
I used the SPI in Client mode because this allows you to control the data rate based on the signal at the SCK input. In Client mode the data is shifted out on the MISO pin, and we can ignore the MOSI pin because we're not receiving data. The other important signal is SS which is normally high, and taking it low initiates the transfer of data. All we need to do then is write the next byte of data into the SPI DATA register before each set of eight bits has been shifted out. The SPI peripheral provides a flag that makes it easy to do this with an interrupt service routine.
The AVR128DA28 includes two SPI peripherals, but I'm only using one in this application.
Timer/Counter Type A
In the NeoPixel data stream a '0' bit is represented by a pulse of 350ns followed by a gap of 800ns, and a '1' bit is represented by a pulse of twice that length, 700ns, followed by a gap of 600ns [3].
All the timings have a tolerance of ±150 ns, and the end of a stream of data is marked by a low interval of at least 50µs.
The AVR processors provide a choice of three types of Timer/Counters that are ideal for generating waveforms and pulses of different widths. I chose the 16-bit Timer/Counter Type A, and there's one available in the AVR128DA28.
Using this Timer/Counter's Single-Slope PWM Generation mode allows you to generate two waveform outputs WO0 and WO1 with the same frequency, but with different duty cycles. I chose a frequency of 24MHz/30 or 800kHz, giving a period of 1250ns. For the two waveforms I chose pulse widths of 333ns for WO0 and 708ns for WO1, well within the specification.
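To check that these values really do fall inside the ±150ns tolerance, here's a small sketch that can be compiled on a PC. The function names are my own, and it assumes that the high time is 8 or 17 timer ticks out of a 30-tick period at 24MHz, as chosen above:

```cpp
#include <cstdint>

// Nominal WS2812 timings quoted in the text, in nanoseconds.
constexpr double T0H = 350, T0L = 800, T1H = 700, T1L = 600, TOL = 150;

// Convert a number of timer ticks to nanoseconds at the given clock rate.
constexpr double ticksToNs(uint32_t ticks, uint32_t clockHz) {
    return ticks * 1e9 / clockHz;
}

// True if 'actual' lies within the tolerance band around 'nominal'.
constexpr bool within(double actual, double nominal) {
    return actual >= nominal - TOL && actual <= nominal + TOL;
}

// True if a pulse of highTicks out of periodTicks meets both the
// high-time and the low-time limits for the given nominal values.
constexpr bool bitTimingOk(uint32_t highTicks, uint32_t periodTicks,
                           uint32_t clockHz, double nominalHigh,
                           double nominalLow) {
    return within(ticksToNs(highTicks, clockHz), nominalHigh) &&
           within(ticksToNs(periodTicks - highTicks, clockHz), nominalLow);
}
```

For example, bitTimingOk(8, 30, 24000000, T0H, T0L) and bitTimingOk(17, 30, 24000000, T1H, T1L) both hold, because the low times of about 917ns and 542ns are also inside their bands.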
Configurable Custom Logic (CCL)
The idea now is to select a cycle of waveform WO0 if the data in the MISO serial output is a zero, and WO1 if it's a one. We could do this with external logic gates, but fortunately the AVR processors include a Configurable Custom Logic (CCL) module that provides logic gates that you can configure with software. These are called Look-Up Tables (LUTs).
The 28-pin AVR DA devices provide four look-up tables, LUT0 to LUT3. Each look-up table has three inputs and one output, and you can construct more complex circuits by combining these. The following table shows the pins corresponding to each of the four look-up tables:
LUT | IN0 | IN1 | IN2 | OUT |
LUT0 | PA0 (22) | PA1 (23) | PA2 (24) | PA3 (25) |
LUT1 | PC0 (2) | PC1 (3) | PC2 (4) | PC3 (5) |
LUT2 | PD0 (6) | PD1 (7) | PD2 (8) | PD3 (9) |
LUT3 | PF0 (16) | PF1 (17) | - | - |
We only need one look-up table for this application, and I used LUT1:
Inputs to LUT1 for the NeoPixel Driver.
The look-up tables are programmed by defining a truth table, showing the output required for each of the eight combinations of the inputs.
The truth table will connect WO0 to the output when MISO is zero, and WO1 to the output when MISO is one:
MISO | WO1 | WO0 | OUT |
0 | 0 | 0 | 0 |
0 | 0 | 1 | 1 |
0 | 1 | 0 | 0 |
0 | 1 | 1 | 1 |
1 | 0 | 0 | 0 |
1 | 0 | 1 | 0 |
1 | 1 | 0 | 1 |
1 | 1 | 1 | 1 |
The resulting output waveform will be in the required NeoPixel format, giving a short pulse when the data in MISO is a zero, and a longer pulse when the data in MISO is a one.
For a simpler example of using the SPI peripheral in conjunction with the CCL to implement a serial protocol see the Manchester Encoder example, page 17, in Getting Started with CCL, TB3218.
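To make the combined behaviour concrete, here's a PC-side simulation of the output for one byte, sampled once per timer tick. It's only a model: the names are my own, and it assumes that each SPI bit lines up exactly with one 30-tick timer period and that the byte is shifted out most-significant bit first:

```cpp
#include <cstdint>
#include <vector>

constexpr int kPeriod = 30;   // timer ticks per bit (1250ns at 24MHz)
constexpr int kWidth0 = 8;    // WO0 high time in ticks
constexpr int kWidth1 = 17;   // WO1 high time in ticks

// Simulate the LUT output, one sample per timer tick, for one byte
// shifted out most-significant bit first.
std::vector<bool> encodeByte(uint8_t data) {
    std::vector<bool> out;
    for (int bit = 7; bit >= 0; bit--) {
        bool miso = (data >> bit) & 1;
        for (int tick = 0; tick < kPeriod; tick++) {
            bool wo0 = tick < kWidth0;
            bool wo1 = tick < kWidth1;
            out.push_back(miso ? wo1 : wo0);   // the multiplexer in the LUT
        }
    }
    return out;
}

// Count the high samples in bit cell 'cell' (0 = first bit sent).
int highTicks(const std::vector<bool>& wave, int cell) {
    int n = 0;
    for (int tick = 0; tick < kPeriod; tick++) {
        n += wave[cell * kPeriod + tick];
    }
    return n;
}
```

Encoding the byte 0xA0 (binary 10100000) gives high times of 17, 8, 17 ticks for the first three bit cells, and 8 ticks for the remaining zeros.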
Timings
To refresh a strip of NeoPixels each bit takes 1250ns, making 30µs per NeoPixel, or about 1ms for a strip of 30 LEDs. To animate NeoPixels without flicker we need to refresh them about 50 times a second, or at least once every 20ms. So with a software NeoPixel driver, which ties up the processor for the whole transfer, the longest strip you could refresh in that time is roughly 600 LEDs, and even that would leave no time for calculating the next pattern. This hardware NeoPixel Driver uses only about 10% of the processor time to update the strip, leaving 90% to calculate the next pattern.
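The arithmetic behind these figures can be written out as a short sketch. The function names are my own, and the calculation ignores the 50µs low interval that ends each stream:

```cpp
#include <cstdint>

constexpr uint32_t kBitNs = 1250;        // one bit cell, in nanoseconds
constexpr uint32_t kBitsPerPixel = 24;   // three colour bytes per NeoPixel

// Time to send the data for a strip, in microseconds.
constexpr uint32_t refreshTimeUs(uint32_t pixels) {
    return pixels * kBitsPerPixel * kBitNs / 1000;
}

// Largest strip whose data fits in one frame of frameUs microseconds.
constexpr uint32_t maxPixels(uint32_t frameUs) {
    return frameUs * 1000 / (kBitsPerPixel * kBitNs);
}
```

This gives 30µs for one NeoPixel, 900µs for 30, and a ceiling of 666 NeoPixels in a 20ms frame, which is the same ballpark as the round figure of 600 quoted above.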
The circuit
Here's the circuit of the NeoPixel Driver, using a similar layout to the circuit on the prototyping board:
Circuit of the NeoPixel Driver using the hardware peripherals in an AVR128DA28.
I used an AVR128DA28 in an SPDIP package [4]. The important pins are:
- PA0 and PA1 are the waveform outputs WO0 and WO1 from Timer/Counter TCA0.
- PA5, PA6, and PA7 are the MISO output, SCK input, and SS input in SPI0.
- PC0, PC1, and PC2 are the three inputs IN0, IN1, and IN2 in the logic gate LUT1.
- PC3 is the output OUT from the logic gate LUT1.
- PD4 is used as an enable output, EN, to control the SS input.
I have included an LED and current-limiting resistor on PD7; this is used for feedback about the NeoPixel driver.
Note that if you're using a 28-pin DB or DD family device pin 6 becomes VDDIO2, for the Multi-Voltage I/O, rather than PD0, and needs to be connected to VDD.
For the photograph at the beginning of this article I used the NeoPixel Driver to drive an Adafruit NeoPixel Stick containing eight 5050 RGB LEDs [5].
The program
Here's a description of the NeoPixel Driver program.
The NeoPixel data buffer
The NeoPixel data is read from a buffer which is defined using a union so it can be read either as sets of the three colours, or as a stream of individual bytes:
const int NumPixels = 20;

union {
  uint8_t col[NumPixels][3];
  uint8_t out[NumPixels*3];
} Buffer;

volatile uint8_t BufPtr;
enum colour { GRN, RED, BLU };
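One thing to watch if you lengthen the strip: BufPtr is a uint8_t, and it counts up to NumPixels*3. As far as I can see that limits this version to 85 NeoPixels, and a longer strip would need a wider index. Here's a sketch of the check, using helper names of my own that aren't in the program:

```cpp
#include <cstdint>

// BufPtr ends up equal to numPixels*3 (one past the last byte), so
// that value must be representable in an 8-bit index.
constexpr bool fitsIndex8(int numPixels) {
    return numPixels * 3 <= UINT8_MAX;
}

// Byte offset in the flat out[] view of colour c (0 to 2) of pixel p,
// which is how col[p][c] and out[] line up in the union.
constexpr int flatIndex(int p, int c) {
    return p * 3 + c;
}

static_assert(fitsIndex8(20), "20 pixels fit an 8-bit BufPtr");
```

If you do change BufPtr to a uint16_t, bear in mind that an 8-bit AVR reads a 16-bit variable in two steps, so reads of it outside the interrupt service routine may need protecting.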
Configuring the Timer/Counter Type A
First we configure Timer/Counter Type A to generate the two waveforms on WO0 and WO1 (PA0 and PA1):
  PORTMUX.TCAROUTEA = PORTMUX_TCA0_PORTA_gc;                 // Clear routing
  TCA0.SINGLE.CTRLD = 0;                                     // Normal mode
  PORTA.DIRSET = PIN0_bm | PIN1_bm;                          // WO0 and WO1 outputs
  TCA0.SINGLE.CTRLA = TCA_SINGLE_CLKSEL_DIV1_gc;             // Clock divided by 1
  TCA0.SINGLE.CTRLB = TCA_SINGLE_WGMODE_SINGLESLOPE_gc       // Single-slope PWM ...
    | TCA_SINGLE_CMP0EN_bm | TCA_SINGLE_CMP1EN_bm;           // waveform on WO0 and WO1
  TCA0.SINGLE.PER = 30-1;                                    // Period is 1250ns
  TCA0.SINGLE.CMP0 = 8-1;                                    // WO0 (PA0) is 333.33ns
  TCA0.SINGLE.CMP1 = 17-1;                                   // WO1 (PA1) is 708.33ns
The first two statements are needed to reset the configuration performed by DxCore.
The last three statements calculate the timings based on a 24MHz clock rate. They could be made independent of the clock rate by deriving them from the clock rate constant, F_CPU.
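Here's one way the F_CPU version might look. It's a sketch rather than tested driver code: nsToTicks() is a hypothetical helper of my own, and it assumes nominal targets of 1250ns, 350ns, and 700ns rounded to the nearest timer tick:

```cpp
#include <cstdint>

// Number of timer ticks nearest to 'ns' nanoseconds at a clock of 'hz',
// using integer arithmetic that stays within 32 bits for AVR clock rates.
constexpr uint32_t nsToTicks(uint32_t hz, uint32_t ns) {
    return (hz / 1000 * ns + 500000) / 1000000;
}

// In the sketch the three assignments would then become something like:
//   TCA0.SINGLE.PER  = nsToTicks(F_CPU, 1250) - 1;
//   TCA0.SINGLE.CMP0 = nsToTicks(F_CPU, 350) - 1;
//   TCA0.SINGLE.CMP1 = nsToTicks(F_CPU, 700) - 1;
```

At 24MHz this reproduces the values 30, 8, and 17 used above; at 16MHz it gives 20, 6, and 11, which correspond to 1250ns, 375ns, and 687.5ns.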
Configuring the SPI peripheral
Next we configure the SPI peripheral in Client mode:
  PORTA.DIRSET = PIN5_bm;                                    // MISO output
  SPI0.CTRLB = SPI_BUFEN_bm | SPI_BUFWR_bm                   // Buffer mode ...
    | SPI_MODE_0_gc;                                         // transfer mode 0
  SPI0.CTRLA = SPI_ENABLE_bm;                                // Enable in client mode
We select Buffer Mode, which double-buffers the data: after writing it to the DATA register it is buffered in a Transmit Buffer before it is written to the Shift Register. The DREIF (Data Register Empty) flag goes high when the Transmit Buffer is empty, indicating that we can write another byte to the DATA register, and the TXCIF (Transfer Complete) flag goes high when the Shift Register is empty, indicating that the whole transfer has finished. We will use an interrupt service routine to respond to these flags.
Defining the look-up table
We use the look-up table LUT1 to choose between WO0 or WO1 depending on the state of MISO. First we specify that LUT1 should take its inputs from the pins PC0, PC1, and PC2:
  CCL.LUT1CTRLB = CCL_INSEL0_IN0_gc | CCL_INSEL1_IN1_gc;
  CCL.LUT1CTRLC = CCL_INSEL2_IN2_gc;
We then need to define a truth table, specifying what state the output should be for the eight possible states of these inputs. The truth table is specified by an 8-bit number; each bit specifies the output for one set of states of the inputs. So, for example, bit 6 in the truth table specifies the state of the output for IN2 = 1, IN1 = 1, and IN0 = 0. Referring back to the truth table given earlier you'll see that the correct 8-bit number is as follows:
CCL.TRUTH1 = 0b11001010;
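If you'd rather not work the number out by hand, it can be generated from the logic function itself. This is a sketch for running on a PC; mux() and truthByte() are names I've made up for the illustration:

```cpp
#include <cstdint>

// The behaviour we want from the LUT: IN2 (MISO) selects IN1 (WO1)
// when it is high, and IN0 (WO0) when it is low.
constexpr bool mux(bool in2, bool in1, bool in0) {
    return in2 ? in1 : in0;
}

// Build the 8-bit truth value: bit n holds the output for the input
// combination where IN2, IN1, IN0 read as the binary number n.
uint8_t truthByte() {
    uint8_t truth = 0;
    for (int n = 0; n < 8; n++) {
        if (mux(n & 4, n & 2, n & 1)) {
            truth |= static_cast<uint8_t>(1u << n);
        }
    }
    return truth;
}
```

Calling truthByte() returns 0xCA, which is the same value as 0b11001010.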
Next we enable the LUT, with an output on the physical pin PC3. Note that there's no need to also define the pin as an output; this happens automatically:
CCL.LUT1CTRLA = CCL_OUTEN_bm | CCL_ENABLE_bm; // Enable, output on PC3
Finally we enable the whole CCL:
CCL.CTRLA = CCL_ENABLE_bm; // Enable CCL last
Enabling a LUT must be performed after configuring its control registers, because the LUT control registers are protected from being configured while a LUT is enabled; if you forget to do this, nothing will appear to work!
Configuring pins
Finally we use PD4 as an enable pin to control the SPI SS pin. We make this an output, and set it high initially to disable the SPI output:
  PORTD.OUTSET = PIN4_bm;                                    // PD4 EN high
  PORTD.DIRSET = PIN4_bm;                                    // PD4 EN output
Interrupt service routine
Here's the Interrupt Service Routine for the SPI interrupt generated by the DREIF and TXCIF flags:
ISR(SPI0_INT_vect) {
  if (BufPtr < NumPixels*3) {                                // More data to write?
    SPI0.DATA = Buffer.out[BufPtr++];                        // Output byte, clears flag
  } else if (SPI0.INTFLAGS & SPI_TXCIF_bm) {                 // Shift register empty?
    PORTD.OUTSET = PIN4_bm;                                  // Take EN high to stop SPI
    TCA0.SINGLE.CMP0 = 0;                                    // WO0 constant low signal
    TCA0.SINGLE.CMP1 = 0;                                    // WO1 constant low signal
    SPI0.INTFLAGS = SPI_TXCIF_bm;                            // Clear flag
    SPI0.INTCTRL = 0;                                        // Disable interrupts
  } else {
    SPI0.INTFLAGS = SPI_DREIF_bm;                            // Clear flag
    SPI0.INTCTRL = SPI0.INTCTRL & ~SPI_DREIE_bm;             // Disable DREIF interrupt
  }
}
There are three possible actions:
- If there are more bytes in Buffer[] to be transferred we write the next byte to the SPI DATA register, which automatically clears the DREIF flag.
- If the interrupt was caused by the TXCIF flag it indicates that the transfer has finished. In this case we take EN high to make SS high to halt the SPI peripheral and zero the TCA0 compare registers to generate a constant low signal on both of the WO0 and WO1 outputs. We also write to the TXCIF flag to clear it, and disable interrupts.
- Otherwise we disable the DREIF interrupt.
The interrupt service routine takes about 1µs to execute; I measured this by setting an I/O pin on entry, clearing it on exit, and viewing the duration of the pulse on an oscilloscope. Transmitting a byte takes 8 x 1250ns or 10µs, so transmitting the NeoPixel data stream uses about 10% of the processor time.
Starting a transfer
The routine StartTransfer() is called to start transferring the contents of the Buffer[] array to the NeoPixel strip:
void StartTransfer () {
  BufPtr = 0;
  SPI0.DATA = Buffer.out[BufPtr++];                          // Initial data
  SPI0.DATA = Buffer.out[BufPtr++];                          // Initial data
  SPI0.INTCTRL = SPI_DREIE_bm | SPI_TXCIE_bm;                // Enable interrupts
  TCA0.SINGLE.CTRLA &= ~TCA_SINGLE_ENABLE_bm;                // Stop Timer/Counter
  TCA0.SINGLE.CMP0 = 8-1;                                    // WO0 (PA0) is 333.33ns
  TCA0.SINGLE.CMP1 = 17-1;                                   // WO1 (PA1) is 708.33ns
  TCA0.SINGLE.CNT = 0;                                       // Clear Timer/Counter
  PORTD.OUTCLR = PIN4_bm;                                    // EN low to start output
  TCA0.SINGLE.CTRLA |= TCA_SINGLE_ENABLE_bm;                 // Start Timer/Counter
}
Because the SPI is double-buffered we need to write the first two bytes to the DATA register. We then enable the DREIE and TXCIE interrupts, set the correct compare register values for the WO0 and WO1 pulse widths, and clear the TCA0 Timer/Counter. Finally we take EN low to make SS low to start the SPI shift register, and start the Timer/Counter.
Subsequent bytes of data are copied to the SPI DATA register by the SPI interrupt service routine.
NeoPixel Driver demo
The following demo Waves() shows how to use the NeoPixel Driver. It gives a slowly changing display that cycles through all the colours:
int fix (int y) {
  y = y % 768;
  if (y >= 256) y = 511 - y;
  if (y < 0) y = 0;
  return y;
}

void Waves () {
  unsigned long t = millis();
  StartTransfer();                                           // Runs under interrupt
  for (int p=0; p<NumPixels; p++) {                          // Update LEDs
    int c = t/10 + (768 * p)/20;
    while (BufPtr <= p*3);                                   // Wait until output
    Buffer.col[p][GRN] = fix(c);
    Buffer.col[p][RED] = fix(c+256);
    Buffer.col[p][BLU] = fix(c+512);
  }
  if (millis() - t >= 20) PORTD.OUTSET = PIN7_bm;            // Error light
  else while (millis() - t < 20);                            // Wait until 20ms tick
}
After starting the transfer by calling StartTransfer() it calculates the colour values for the next update of each LED. The statement:
while (BufPtr <= p*3);
waits to ensure that each element of Buffer[] is not updated until the previous value has been output by the interrupt-service routine. This avoids the need for two buffers, which would require more RAM, and this approach is only marginally less efficient.
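It's worth spelling out exactly what this condition guarantees. As I read it, BufPtr > p*3 means that the first byte of pixel p has been handed to the SPI, but not necessarily the other two, so the red and blue bytes of a pixel can occasionally be sent with their new values one frame early. With a slowly changing display that's unlikely to be visible. The sketch below compares the condition with a stricter alternative; the function names are my own, and the stricter version is a suggestion rather than part of the program:

```cpp
// The condition used in Waves(): carry on once BufPtr has passed the
// first byte of pixel p.
constexpr bool firstByteTaken(int bufPtr, int p) {
    return bufPtr > p * 3;
}

// A stricter variant (an assumption, not the program's code): carry on
// only once all three bytes of pixel p have been handed to the SPI.
constexpr bool allBytesTaken(int bufPtr, int p) {
    return bufPtr >= p * 3 + 3;
}
```

For example, with BufPtr equal to 1 the first condition already allows pixel 0 to be updated, whereas the stricter one waits until BufPtr reaches 3.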
The following statements light the error light if updating the LED buffer has taken more than the available 20ms (see Timings above), and otherwise wait until the next 20ms tick:
  if (millis() - t >= 20) PORTD.OUTSET = PIN7_bm;            // Error light
  else while (millis() - t < 20);                            // Wait until 20ms
The Error light means you may start to get flicker from your animation, in which case you can try to optimise your update routine to avoid this, such as by taking calculations outside the main loop, or using integer rather than floating-point arithmetic.
To run the demo call it from loop():
void loop() {
  Waves();
}
The colour changes will be smoothest with a 5V supply, but it will also work with 3.3V.
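To see how the demo arrives at its colours, here's the fix() function again in a form you can run on a PC, together with a hypothetical colourAt() helper of my own that mirrors the three assignments in Waves(). Each channel ramps up over 256 steps, ramps down over the next 256, and stays off for the final 256, and the three channels are offset by a third of the 768-step cycle:

```cpp
// Copied from the demo: a triangular ramp that is zero for the last
// third of each 768-step cycle.
int fix(int y) {
    y = y % 768;
    if (y >= 256) y = 511 - y;
    if (y < 0) y = 0;
    return y;
}

// Hypothetical helper for this sketch: the three channel values for
// position c in the colour cycle, in the order the strip expects.
struct Grb {
    int g, r, b;
};

Grb colourAt(int c) {
    return { fix(c), fix(c + 256), fix(c + 512) };
}
```

So colourAt(0) is pure red, colourAt(256) is pure green, and colourAt(512) is pure blue, with the two neighbouring channels cross-fading in between.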
Uploading the program
First install Spence Konde's DxCore from GitHub: see DxCore - Installation. Note that this program gives a compiler error with DxCore version 1.5.11 due to a conflict with some compatibility extensions, so I recommend using 1.5.10 or earlier.
To program the processor the recommended option is to use a 5V or 3.3V USB to Serial board, such as the SparkFun FTDI Basic board [6], or a USB to Serial cable [7], connected with a Schottky diode as follows. You can substitute a 4.7kΩ resistor for the Schottky diode:
- Choose the AVR DA-series (no bootloader) option under the DxCore heading on the Board menu.
- Check that the subsequent options are set as follows (ignore any other options):
Board: "AVR DA-series (no bootloader)"
Chip: "AVR128DA28"
Clock Speed: "24 MHz internal"
You can leave the other options at their defaults, the first option on each submenu.
- Set Programmer to the first of the "SerialUPDI - 230400 baud" options.
- Select the USB port corresponding to the USB to Serial board in the Port menu.
- Choose Burn Bootloader from the Tools menu to set the fuses as appropriate.
- Then choose Upload on the Sketch menu to upload the program to the AVR128DA28.
Resources
Here's the NeoPixel Driver program with the demo: NeoPixel Driver Program.
Further suggestions
We could go one stage further, and use the AVR Events System to eliminate most of the interconnections between the pins on the above circuit without affecting its operation; I'll describe how to do this in a subsequent article.
- [1] NeoPixel Überguide on Adafruit.
- [2] See WS2812 LEDs in the RP2040 Datasheet, page 348.
- [3] WS2812 datasheet on Adafruit.
- [4] AVR128DA28-I/SP on Mouser.co.uk.
- [5] NeoPixel Stick - 8 x 5050 RGB LED on Adafruit.
- [6] SparkFun FTDI Basic Breakout - 5V on Sparkfun.
- [7] FTDI Serial TTL-232 USB Cable on Adafruit.