
Exploring Coding on Embedded Devices

Recently, I've been digging deep into developing cross-platform embedded software for Particle, Arduino, and Unix in my library, ubsub-iot.

This is my first in-depth embedded application, and along the way I ran into a few lessons that I thought I'd share.

Cross-Platform Arduino/Particle

Luckily for me, there were only minor differences between the platforms that needed to be worked around, and all of them could be handled by isolating the platform-specific code and leveraging preprocessor macros.

  1. UDP Socket creation / data communication
  2. Logging

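For the first, I simply worked around it like so (e.g.):

static uint64_t getTime() {
// Assumed platform check: ARDUINO and PARTICLE are the macros those toolchains define
#if defined(ARDUINO) || defined(PARTICLE)
  return (uint64_t)(millis() / 1000);
#else
  return std::time(NULL);
#endif
}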

For the second issue, I implemented two strategies: on Unix I used stdout via std::cout, and on Arduino/Particle I added opt-in log-to-serial.

It's important to make this opt-in because the consumer of my library might choose to use Serial for something other than logging. Also, C strings tend to consume a lot of extra memory, and when the code has to fit into a small space, I didn't want to be responsible for consuming memory that may be important to the user.
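As a rough sketch of what opt-in logging can look like: the library logs through a function pointer that stays null until the user installs one. The names LogSink, setLogSink, logMsg, serialSink, and stdoutSink are made up for this example and are not the ubsub-iot API.

```cpp
#include <stdio.h>

// Hypothetical names for illustration; not the actual ubsub-iot API.
typedef void (*LogSink)(const char* msg);

// Logging is off until the user opts in by installing a sink.
static LogSink g_logSink = nullptr;

// Called by the library user; pass nullptr to turn logging off again.
static inline void setLogSink(LogSink sink) { g_logSink = sink; }

// Called inside the library; does nothing when no sink is installed.
static inline void logMsg(const char* msg) {
  if (g_logSink) g_logSink(msg);
}

#if defined(ARDUINO) || defined(PARTICLE)
// Ready-made sink for boards; the user must opt in to it explicitly.
static inline void serialSink(const char* msg) { Serial.println(msg); }
#else
// Ready-made sink for Unix builds.
static inline void stdoutSink(const char* msg) { printf("%s\n", msg); }
#endif
```

A sketch that wants logs would call setLogSink(serialSink) after Serial.begin(); one that uses Serial for something else simply never calls it.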

Memory Alignment

Memory alignment issues were the first big thing that I ran into where it "worked on my computer", but constantly hard-faulted on anything else.

I can't speak to every platform, but I was testing the code on my Particle Photon, and it would randomly hard-fault. After adding a ton of logging, I discovered that the faults mostly occurred around operations involving uint64_t. After a bit more digging and a small test program, I discovered that you can't write a uint64_t (or any other 64-bit type, for that matter) through a cast pointer into memory that isn't aligned for the processor.

E.g., this works fine:

char buf[128];
*(uint64_t*)(buf+4) = (uint64_t)0xAA;

But this would cause a hard fault:

char buf[128];
*(uint64_t*)(buf+1) = (uint64_t)0xAA;

I wrote the following to work around this problem (probably overkill, but it was fun):

#include <stdint.h>
#include <cstring>

#ifndef binio_h
#define binio_h

/*
Particle/Arduino can't use pointer arithmetic to copy memory of certain types
(namely 64-bit ones) into or out of non-aligned memory. These helpers
were provided as a method to work around that issue using memcpy() if necessary.
*/

// Enable byte-by-byte read/write, even on platforms where there are faster alternatives
// #define UBSUB_MANUAL_RW true

// Assumed guard: use memcpy on embedded targets, or whenever UBSUB_MANUAL_RW is set
template <typename T> static inline T read_le(const uint8_t* at) {
#if defined(ARDUINO) || defined(PARTICLE) || defined(UBSUB_MANUAL_RW)
    T ret = 0;
    memcpy(&ret, at, sizeof(T));
    return ret;
#else
    return *(T*)at;
#endif
}

template <typename T> static inline void write_le(uint8_t* to, const T& val) {
#if defined(ARDUINO) || defined(PARTICLE) || defined(UBSUB_MANUAL_RW)
  memcpy(to, &val, sizeof(T));
#else
  *(T*)to = val;
#endif
}

#endif


Update: I later discovered that on other platforms, such as the ESP8266, this will even happen for uint16_t or uint32_t. To be safe, it's probably smart to just use memcpy everywhere.
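One caveat with memcpy is that it copies bytes in the host's byte order, so the result is only little-endian on a little-endian CPU. A byte-by-byte variant avoids both the alignment and the byte-order question. The following is only a sketch; the names read_le_bytes and write_le_bytes are made up here, and it assumes T is an unsigned integer type.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical byte-by-byte helpers (not from ubsub-iot). T must be an unsigned integer type.

// Assembles a little-endian value one byte at a time, so alignment never matters.
template <typename T> static inline T read_le_bytes(const uint8_t* at) {
  T ret = 0;
  for (size_t i = 0; i < sizeof(T); ++i) {
    ret |= (T)((T)at[i] << (8 * i));
  }
  return ret;
}

// Stores the least-significant byte first, regardless of the host's byte order.
template <typename T> static inline void write_le_bytes(uint8_t* to, T val) {
  for (size_t i = 0; i < sizeof(T); ++i) {
    to[i] = (uint8_t)(val >> (8 * i));
  }
}
```

For example, write_le_bytes<uint64_t>(buf + 1, value) is safe even though buf + 1 is not aligned, at the cost of a short loop per access.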


Another problem that caused a hard fault once in a while was trying to use various formatting utilities (sprintf and Serial.printf) with %d representing a 64-bit value (have you sensed a theme yet?). %d expects an int, so passing it a 64-bit argument is undefined behavior.

My solution here was to quickly and naively implement a hex-string formatter for these values.

NOTE: This function uses static memory, meaning that you can only use it once per log message (or you must copy the result into its own memory before calling it again):

static char hextable[16] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
// Naive helper that stores in static memory. Don't use more than once per log msg (or copy out)
template <typename T> static const char* tohexstr(T val) {
  static char buf[32];
  int i=0;
  // Fill the digits from the least-significant nibble backwards
  for (; i<(int)sizeof(T)*2; ++i) {
    buf[sizeof(T)*2-i-1] = hextable[val & 0xF];
    val >>= 4;
  }
  buf[i] = '\0';
  return buf;
}

Buffer management

The last thing I want to talk about is buffer management. One of the core assumptions of this application is that it will run in a single-threaded context, so I share a lot of memory between operations instead of allocating it separately for each one. For now I have just defined a fairly low MTU (maximum transmission unit) to keep memory use suitably low, but one of the next things I would like to do is go over the code and optimize its memory usage, as more can be done.

Remember, a lot of the devices that I'm working on may only have 8 KB of memory, so I need to make sure that it can run in a very small space.
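To illustrate the idea, here is a minimal sketch of one statically allocated, MTU-sized buffer that is reused for every packet. The names kMtu, g_packetBuf, and buildPacket are hypothetical, and the value 256 is an assumption made for the example, not the MTU the library actually uses.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Assumed value for illustration only.
static const size_t kMtu = 256;

// One scratch buffer shared by all packets. This is only safe because
// everything runs in a single thread and packets are handled one at a time.
static uint8_t g_packetBuf[kMtu];

// Copies a payload into the shared buffer.
// Returns the packet length, or 0 if the payload does not fit in one MTU.
static inline size_t buildPacket(const uint8_t* payload, size_t len) {
  if (len > kMtu) return 0;
  memcpy(g_packetBuf, payload, len);
  return len;
}
```

The memory cost is fixed at compile time (kMtu bytes), which makes it easy to budget on a device with only a few kilobytes of RAM.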


Christopher LaPointe


