Skip to content

SPI methods are inconsistent with Arduino AVR SPI #2677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Makuna opened this issue Nov 9, 2016 · 19 comments
Closed

SPI methods are inconsistent with Arduino AVR SPI #2677

Makuna opened this issue Nov 9, 2016 · 19 comments

Comments

@Makuna
Copy link
Collaborator

Makuna commented Nov 9, 2016

Basic Infos

APIs across platforms should try to be consistent so that library writers don't have to pepper their code with ifdefs and custom calls for standard uses.
The Esp8266 SPI has method names for functions that are inconsistent with AVR SPI but are functionally the same.

on AVR
void transfer(void *buf, size_t count)
is the same as on Esp8266
void writeBytes(uint8_t * data, uint32_t size);

@mkeyno
Copy link

mkeyno commented Nov 12, 2016

sorry to be picky but what's big problem with that ? writeBytes much faster than normal transfer function

#2624

@Makuna
Copy link
Collaborator Author

Makuna commented Nov 12, 2016

The standard Arduino SPI library does not have a writeBytes, it does have a transfer that takes a byte pointer and count of bytes. The Esp8266 SPI library does not expose this transfer and further adds its own writeBytes. This is inconsistent and makes writing Arduino code across platforms harder.

I am not saying there is a need to remove writeBytes; but at least expose the standard of a transfer that takes a buffer and count of bytes.

@selste
Copy link

selste commented May 1, 2017

According to the writeBytes method comment the data has to be aligned to 32 bits - so it differs significantly from the transfer method that ships with the Arduino.
Would be great if the implementations would provide the same functionality.

@PaulStoffregen
Copy link

PaulStoffregen commented Jul 12, 2018

My improved performance Ethernet library doesn't compile on ESP8266, only because SPI.transfer(data, size) is missing.

Here's a quick implementation I added in SPI.h for testing. Maybe something like this could be merged, so Arduino sketches and libraries using this well established Arduino API can compile and run on ESP8266?

void transfer(uint8_t * data, uint32_t size) {
    uint8_t buf[40];
    while (size > sizeof(buf)) {
        memcpy(buf, data, sizeof(buf));
        transferBytes(buf, data, sizeof(buf));
        data += sizeof(buf);
        size -= sizeof(buf);
    }
    if (size) {
        memcpy(buf, data, size);
        transferBytes(buf, data, size);
    }
}

@d-a-v
Copy link
Collaborator

d-a-v commented Jul 12, 2018

The above comments are right, writeBytes() does the job without copy, using 32 bits registers, but require 32bits aligned buffers.
We can arrange something for the non aligned bytes and use this function.
Could SPI.transfer() take a const uint8_t* ?

off-topic:
There is a work in progress to use the ethernet-only part of the W5100 or an enc28j60 (ethernet only) to use with lwIP that we luckily have on the esp. That would allow ethernet to integrate better with the network part of this core like for #4914.
Any pointer to an improved SPI driver for the W5100 would be welcome.

@PaulStoffregen
Copy link

Could SPI.transfer() take a const uint8_t* ?

No, not if you want to be compatible with Arduino programs. According to Arduino's documentation "In case of buffer transfers the received data is stored in the buffer in-place (the old data is replaced with the data received)."

https://www.arduino.cc/en/Reference/SPITransfer

The reason to implement SPI.transfer(buffer, size) is for compatibility with Arduino programs.

@d-a-v
Copy link
Collaborator

d-a-v commented Jul 12, 2018

Yes I was just realizing that. The esp8266/arduino devs did things right.
We can probably find something that does not involve memcpy().

@d-a-v
Copy link
Collaborator

d-a-v commented Jul 12, 2018

@PaulStoffregen
Could you by chance test this arduino compatible spi transfer function
It uses the 32bits-aligned and 32bits registers optimized esp's transferBytes().
#4925

@d-a-v d-a-v self-assigned this Jul 12, 2018
@d-a-v d-a-v added component: libraries waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. labels Jul 13, 2018
@PaulStoffregen
Copy link

PaulStoffregen commented Jul 13, 2018

Tried it just now. Seems to work fine, at least testing with my Ethernet lib.

However, the performance is not as good as the memcpy-based code. But they're similar. With the memcpy version, I get 609.6 kbytes/sec overall speed fetching a web page from a fast Linux server on the same LAN, using a W5500 (Adafruit's Featherwing Ethernet). With this optimized version I get 582.9 kbytes/sec. I ran each a few times. The speeds are very consistent.

FWIW, ESP32's speed is 964.0 kbytes/sec on this test, using this in ESP32's SPI.h:

  void transfer(uint8_t * data, uint32_t size) { transferBytes(data, data, size); }

Unfortunately this doesn't work on ESP8266. Seems using the same pointer for read and write isn't allowed?

@PaulStoffregen
Copy link

Here's the exact code I tested in SPI.cpp.

#if 1
void SPIClass::transfer(void *buf, uint16_t count) {
    uint8_t *cbuf = reinterpret_cast<uint8_t*>(buf);
    for (; ((long)cbuf) & 3; cbuf++, count--)
        *cbuf = transfer(*cbuf);
    uint16_t count4 = count & ~3;
    transferBytes(cbuf, cbuf, count4);
    cbuf += count4;
    count -= count4;
    for (; count; cbuf++, count--)
        *cbuf = transfer(*cbuf);
}
#else
void SPIClass::transfer(void * ptr, uint16_t size) {
        uint8_t *data = (uint8_t *)ptr;
        uint8_t buf[40];
        while (size > sizeof(buf)) {
                memcpy(buf, data, sizeof(buf));
                transferBytes(buf, data, sizeof(buf));
                data += sizeof(buf);
                size -= sizeof(buf);
        }
        if (size) {
                memcpy(buf, data, size);
                transferBytes(buf, data, size);
        }
}
#endif

@d-a-v
Copy link
Collaborator

d-a-v commented Jul 13, 2018

esp8266's transferBytes() requires 32bits aligned buffers, that may be the reason why it does not work when you use it directly. The new transfer() addresses unaligned buffers and uses transferBytes() with the same buffer for in and out.
tranferBytes() needs further work. It is in fact optimized only for writing, not for reading, but that still doesn't explain why your memcpy version works faster.

I'll work on this as soon as I can. Thanks for your feedback.

@d-a-v d-a-v removed the waiting for feedback Waiting on additional info. If it's not received, the issue may be closed. label Jul 13, 2018
@PaulStoffregen
Copy link

PaulStoffregen commented Jul 13, 2018

Maybe my buffers are always aligned to 32 bit boundaries? Some are small buffers on the stack. Others are pointers to buffers from the user's sketch code, but in this test I'm only using a couple of those. Maybe they are aligned only by luck? (or because the first local variable in a function gets aligned?)

Anyway, the performance is similar. Probably not worth a lot of work for such a small difference. Maybe best to use the code which for-sure handles unaligned data?

@devyte
Copy link
Collaborator

devyte commented Jul 13, 2018

Aren't stack variables always 32bit-aligned? => buf[40] would always have an aligned base.

@Makuna
Copy link
Collaborator Author

Makuna commented Jul 16, 2018

The issue will be in class/struct members. You will need to test something like this...

struct Example {
    uint8_t variable1;
    uint8_t variable2[12]; // may not be aligned to 32 bits
}

@PaulStoffregen
Copy link

I tried placing my data at an offset within the buffer. Both seem to work fine. Does transferBytes() allow pointers to be unaligned from 32 bit word boundaries?

@PaulStoffregen
Copy link

The optimized version of #4925 is good enough. I'd recommend merging it. More optimization can always be done later. As things are now, libraries written for Arduino (like my recent work to speed up W5X00 Ethernet) are incompatible with ESP8266, only because this well documented Arduino API is missing.

@PaulStoffregen
Copy link

I released Arduino Ethernet 2.0.0 today.
https://github.com/arduino-libraries/Ethernet/releases/tag/2.0.0

It requires this issue fixed to compile for ESP8266. Users can of course still use the old versions, or Adafruit's Ethernet2 or others... without the improvement performance and new features.

@devyte
Copy link
Collaborator

devyte commented Jul 27, 2018

#4925 is merged.

@rsaxvc
Copy link
Contributor

rsaxvc commented Dec 29, 2019

I think if we were careful, we could implement SPI::writebytes() to support unaligned pointers for just a few cycles more per call. Would this be worth it? By sending a few single bytes, we could get the read-pointer aligned on 32-bits, then chuck the (hopefully) large amount of remaining data through the fast 32-bit function. Basically like this:

void SPIClass::writeBytes(const uint8_t * data, uint32_t size) {
    if( size>=1 &&( (uintptr_t)data & 1) ) {
        write(*data++);size--;
    }
    if( size>=2 &&( (uintptr_t)data & 2) ) {
        write(*data++);size--;
        write(*data++);size--;
    }
    while(size) {
        if(size > 64) {
            writeBytes_(data, 64);
            size -= 64;
            data += 64;
        } else {
            writeBytes_(data, size);
            size = 0;
        }
    }
}

If we wanted to get faster, we could remove the FIFO synchronization from the end of writeBytes_() and just check it once when returning from writeBytes().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants