Who Is This For?
This guide is written for engineers preparing for embedded software engineer, firmware engineer, embedded systems engineer, or embedded Linux engineer roles. Whether you are a fresh graduate targeting your first embedded role or a mid-level engineer switching from application software into firmware, this roadmap will give you a structured path.
Embedded software interviews are notoriously difficult because they test a wide surface area: low-level C, hardware knowledge, operating system internals, real-time constraints, debugging skills, and sometimes even data structures and algorithms (DSA). The good news: the topic list is finite and well-defined.
How Embedded Software Interviews Are Structured
Most companies follow a 4–6 round process:
- Phone/Video Screen — 30–45 min. Resume review + 1–2 basic C/embedded questions.
- Technical Phone Screen — 60 min. Live coding in C (bitwise ops, pointers, linked lists) or multiple-choice questions (MCQ) on embedded concepts.
- Take-Home Assignment — Some companies give a small firmware project (e.g., implement a ring buffer, write a mock UART driver, simulate an RTOS scheduler).
- Onsite / Virtual Onsite (3–5 rounds):
  - Deep C/C++ coding round
  - Embedded systems concepts (protocols, memory, ISR)
  - RTOS / OS internals
  - System design (firmware architecture)
  - Behavioural / leadership
Phase 1: C Programming & Core Language Concepts
C is the lingua franca of embedded systems. Every embedded interview will test your C knowledge — not just syntax, but the why behind language features. Here are the most frequently tested topics:
volatile keyword
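The volatile keyword tells the compiler that a variable can change outside the program's visible control flow, so every read and write in the source must actually access memory rather than a cached register copy. It does not make accesses atomic or add memory barriers. Critical for: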
- Hardware registers (memory-mapped I/O)
- Variables shared between an ISR and main code
- Variables modified by DMA
volatile uint32_t *GPIOA_IDR = (volatile uint32_t *)0x40020010;
// Without volatile, the compiler may optimise away repeated reads
uint32_t pin_state = *GPIOA_IDR;
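The ISR-shared case from the list above can be sketched on a host machine as follows. This is a minimal illustration, not production code: `uart_rx_isr` and `poll_rx` are hypothetical names, and the "interrupt" is just a function call. Note that volatile only forces the accesses to happen; on a real target you would still need to reason about atomicity and ordering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the (simulated) ISR and the main loop, hence volatile. */
static volatile bool    rx_ready = false;
static volatile uint8_t rx_byte  = 0;

/* Hypothetical handler: on real hardware the interrupt controller calls it. */
void uart_rx_isr(uint8_t data_from_hw)
{
    rx_byte  = data_from_hw;
    rx_ready = true;            /* set the flag last, after the data */
}

/* Polled from the main loop; returns true when a byte was consumed. */
bool poll_rx(uint8_t *out)
{
    if (!rx_ready) {
        return false;
    }
    *out = rx_byte;
    rx_ready = false;
    return true;
}
```

Without volatile, an optimising compiler would be allowed to read `rx_ready` once and spin forever on the cached value in a `while (!rx_ready) {}` loop.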
static keyword — three distinct uses
- Static local variable: persists across function calls (stored in .data/.bss, not stack)
- Static function: limits scope to the translation unit — your go-to for encapsulation in C
- Static global: internal linkage — prevents name collisions across .c files
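The three uses can be seen side by side in one small translation unit. The names below (`s_error_count`, `clamp_to_limit`, `next_sequence_number`) are made up for illustration.

```c
#include <stdint.h>

/* Static global: internal linkage, invisible to other .c files. */
static uint32_t s_error_count = 0;

/* Static function: private helper, callable only inside this file. */
static uint32_t clamp_to_limit(uint32_t value, uint32_t limit)
{
    return (value > limit) ? limit : value;
}

/* Static local: initialised once, keeps its value between calls. */
uint32_t next_sequence_number(void)
{
    static uint32_t seq = 0;
    seq++;
    return seq;
}

/* Public API that uses the private state and helper. */
void report_error(void)
{
    s_error_count = clamp_to_limit(s_error_count + 1U, 1000U);
}

uint32_t error_count(void)
{
    return s_error_count;
}
```

Only `next_sequence_number`, `report_error`, and `error_count` are visible to the linker; everything marked static stays private to the file.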
Pointers, arrays, and function pointers
Expect questions on pointer arithmetic, pointer-to-pointer, const correctness, and function pointer syntax — the last one is especially common in callback-based embedded APIs and ISR registration.
// Function pointer for ISR callback registration
#define NUM_IRQS 16U
typedef void (*isr_handler_t)(void);
isr_handler_t handlers[NUM_IRQS];
void register_irq(uint8_t irq_num, isr_handler_t handler) {
    if (irq_num < NUM_IRQS) {   // reject out-of-range IRQ numbers
        handlers[irq_num] = handler;
    }
}
// Const correctness — know the difference:
const uint8_t *p1;         // pointer to const data
uint8_t * const p2;        // const pointer to mutable data
const uint8_t * const p3;  // const pointer to const data
Endianness
Know the difference between big-endian and little-endian, how to detect it at runtime using a union or pointer-cast trick, and how to byte-swap values (used constantly in network/protocol code).
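One way the union approach mentioned above could look is sketched here, together with byte-wise helpers for big-endian wire fields that work the same on any host. The function names are illustrative, not from a particular library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Union-based detection: look at which byte of the word is stored first. */
bool is_little_endian_union(void)
{
    union {
        uint32_t word;
        uint8_t  bytes[4];
    } probe = { .word = 0x01020304U };

    return probe.bytes[0] == 0x04;
}

/* Read a big-endian 32-bit field from a byte buffer (host-order independent). */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Write a 32-bit value into a byte buffer in big-endian order. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}
```

Shifting bytes explicitly avoids both the endianness check and any unaligned access, which is often why protocol code prefers it over casting a buffer pointer to `uint32_t *`.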
// Detect endianness at runtime
bool is_little_endian(void) {
    uint16_t val = 0x0001;
    return *(uint8_t *)&val == 0x01;
}
// Byte swap a 32-bit value
uint32_t bswap32(uint32_t x) {
    return ((x >> 24) & 0xFF) |
           ((x >> 8) & 0xFF00) |
           ((x << 8) & 0xFF0000) |
           ((x << 24) & 0xFF000000);
}
Bitwise operations — the most common embedded interview topic
You will almost certainly be asked to implement bit manipulation operations from scratch. Master these patterns:
#define BIT(n) (1U << (n))
#define SET_BIT(reg, n) ((reg) |= BIT(n))
#define CLEAR_BIT(reg, n) ((reg) &= ~BIT(n))
#define TOGGLE_BIT(reg, n) ((reg) ^= BIT(n))
#define READ_BIT(reg, n) (((reg) >> (n)) & 1U)
// Is x a power of 2?
bool is_power_of_two(uint32_t x) { return x && !(x & (x - 1)); }
// Count set bits (popcount)
uint32_t count_bits(uint32_t x) {
    uint32_t count = 0;
    while (x) { count += x & 1; x >>= 1; }
    return count;
}
Memory layout: .text, .data, .bss, stack, heap
Know exactly where each variable type lives in memory. This is asked directly and indirectly in MCU startup sequence questions.
- .text — compiled code (ROM/Flash)
- .data — initialised global/static variables (copied from Flash to RAM at startup)
- .bss — uninitialised and zero-initialised global/static variables (zeroed at startup)
- stack — local variables, function call frames (grows downward on ARM)
- heap — dynamic allocation via malloc/free
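The mapping above can be made concrete by annotating a few declarations. This is a sketch assuming a typical GCC-style toolchain; the `.rodata` placement of the const table is an addition not listed above, and exact placement always depends on the linker script.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

const uint32_t k_table[4] = {1, 2, 3, 4};  /* .rodata (usually kept in Flash) */
uint32_t g_initialised = 42;               /* .data: initial value copied from Flash */
uint32_t g_zeroed;                         /* .bss: zeroed by startup code */
static uint8_t s_buffer[64];               /* .bss: static, no initialiser */

uint32_t count_calls(void)
{
    static uint32_t calls = 0;  /* .bss: zero-initialised static local */
    uint32_t local = 1;         /* stack (or a register) */
    calls += local;
    return calls;
}

uint8_t *make_heap_copy(void)
{
    uint8_t *copy = malloc(sizeof s_buffer);  /* heap */
    if (copy != NULL) {
        memcpy(copy, s_buffer, sizeof s_buffer);
    }
    return copy;  /* caller must free() */
}
```

On a target, running `size` or inspecting the map file for a build of this unit is a quick way to confirm which section each symbol landed in.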
Phase 2: Bus & Communication Protocols
Hardware protocol questions are almost guaranteed in any embedded interview. Focus on the most common ones:
UART (Universal Asynchronous Receiver-Transmitter)
- Asynchronous — no shared clock. Start/stop bits frame each byte.
- Baud rate must match on both ends (common: 9600, 115200)
- Parity bits for basic error detection
- Full-duplex over two wires (TX, RX)
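Two calculations that follow from the framing rules above are the effective byte rate and the parity bit. The helper names below are hypothetical; the arithmetic assumes one start bit per frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits on the wire per byte: 1 start + data + optional parity + stop bits. */
uint32_t uart_bits_per_frame(uint8_t data_bits, bool parity, uint8_t stop_bits)
{
    return 1U + data_bits + (parity ? 1U : 0U) + stop_bits;
}

/* Maximum payload bytes per second at a given baud rate. */
uint32_t uart_bytes_per_second(uint32_t baud, uint8_t data_bits,
                               bool parity, uint8_t stop_bits)
{
    return baud / uart_bits_per_frame(data_bits, parity, stop_bits);
}

/* Even parity: the bit that makes the total number of 1s even. */
uint8_t uart_even_parity_bit(uint8_t data)
{
    data ^= (uint8_t)(data >> 4);
    data ^= (uint8_t)(data >> 2);
    data ^= (uint8_t)(data >> 1);
    return data & 1U;
}
```

With 8N1 framing each byte costs 10 bit times, so 115200 baud carries at most 11520 payload bytes per second.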
SPI (Serial Peripheral Interface)
- Synchronous, full-duplex. Four wires: SCLK, MOSI, MISO, CS̄
- Four modes based on CPOL (clock polarity) and CPHA (clock phase)
- Master drives clock and chip-select. Multi-slave via multiple CS lines.
- Typically faster than I2C — used for displays, flash, ADCs
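The mode numbering and the full-duplex behaviour can be modelled in a few lines. This is a simplified simulation that ignores clock edges: it treats master and slave as two 8-bit shift registers connected in a ring, shifting MSB first. The function names are illustrative.

```c
#include <stdint.h>

/* SPI mode number as conventionally defined: mode = (CPOL << 1) | CPHA. */
uint8_t spi_mode(uint8_t cpol, uint8_t cpha)
{
    return (uint8_t)(((cpol & 1U) << 1) | (cpha & 1U));
}

/* Simulate one 8-clock, MSB-first transfer between two shift registers. */
void spi_exchange_byte(uint8_t *master_reg, uint8_t *slave_reg)
{
    for (int clk = 0; clk < 8; clk++) {
        uint8_t mosi = (uint8_t)((*master_reg >> 7) & 1U);  /* master output */
        uint8_t miso = (uint8_t)((*slave_reg >> 7) & 1U);   /* slave output */
        *master_reg = (uint8_t)((*master_reg << 1) | miso);
        *slave_reg  = (uint8_t)((*slave_reg << 1) | mosi);
    }
}
```

After eight clocks the two registers have swapped contents, which is why every SPI write is also a read and why reading usually means clocking out dummy bytes.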
I2C (Inter-Integrated Circuit)
- Two-wire, half-duplex (SDA + SCL). Open-drain lines with pull-up resistors.
- 7-bit (or 10-bit) device addressing — 7-bit addressing gives 128 addresses, 16 of which are reserved, leaving 112 usable on one bus
- ACK/NACK after each byte. Clock stretching for slow slaves.
- Prone to bus hangs — know how to recover (toggle SCL for up to 9 clock cycles until the slave releases SDA, then issue a STOP)
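A common source of confusion is the difference between the 7-bit address and the byte that actually goes on the wire. The sketch below uses hypothetical helper names; the reserved ranges are the ones defined by the I2C specification for 7-bit addressing.

```c
#include <stdbool.h>
#include <stdint.h>

/* First byte after START: 7-bit address in bits 7..1, R/W in bit 0 (1 = read). */
uint8_t i2c_address_byte(uint8_t addr7, bool read)
{
    return (uint8_t)(((addr7 & 0x7FU) << 1) | (read ? 1U : 0U));
}

/* 0x00-0x07 and 0x78-0x7F are reserved (general call, 10-bit prefix, etc.). */
bool i2c_addr_is_reserved(uint8_t addr7)
{
    addr7 &= 0x7FU;
    return (addr7 <= 0x07U) || (addr7 >= 0x78U);
}

/* Count the 7-bit addresses that ordinary devices may use. */
unsigned i2c_usable_address_count(void)
{
    unsigned usable = 0;
    for (uint8_t addr = 0; addr < 128U; addr++) {
        if (!i2c_addr_is_reserved(addr)) {
            usable++;
        }
    }
    return usable;
}
```

Datasheets sometimes quote the shifted 8-bit form (for example 0xA0/0xA1 for a device at 0x50), so it is worth stating which form you mean in an interview.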
CAN bus
- Differential signaling (CANH/CANL) — robust to noise, used in automotive
- Multi-master. Arbitration via message ID (lower ID = higher priority)
- Built-in error detection: CRC, bit stuffing, ACK field
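Arbitration can be modelled as a wired-AND bus: a dominant 0 overrides a recessive 1, and a node that sends 1 but reads back 0 drops out. The simulation below is a sketch for standard 11-bit identifiers only, assuming every node has a unique ID; `can_arbitrate` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAN_STD_ID_BITS 11

/* Returns the index of the node that wins arbitration, or n if none matches. */
size_t can_arbitrate(const uint16_t *ids, size_t n)
{
    uint16_t bus_prefix = 0U;  /* ID bits observed on the bus so far */

    for (int bit = CAN_STD_ID_BITS - 1; bit >= 0; bit--) {
        uint16_t bus_bit = 1U;  /* recessive unless a node drives dominant */
        for (size_t i = 0; i < n; i++) {
            /* A node is still contending if its higher bits match the bus. */
            bool contending = (uint16_t)(ids[i] >> (bit + 1)) == bus_prefix;
            if (contending && ((ids[i] >> bit) & 1U) == 0U) {
                bus_bit = 0U;
            }
        }
        bus_prefix = (uint16_t)((bus_prefix << 1) | bus_bit);
    }

    for (size_t i = 0; i < n; i++) {
        if (ids[i] == bus_prefix) {
            return i;
        }
    }
    return n;
}
```

Because the ID is sent MSB first and 0 always wins, the numerically lowest ID ends up on the bus, and the losing nodes retry later without any frame being corrupted.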
Phase 3: RTOS & FreeRTOS Internals
RTOS questions are the most differentiating factor in embedded interviews. A candidate who understands scheduler internals, priority inversion, and ISR integration with RTOS primitives will stand out significantly.
Scheduling algorithms
- Preemptive priority-based scheduling — highest-priority ready task always runs (FreeRTOS default)
- Round-robin — equal-priority tasks time-slice via tick interrupt
- EDF (Earliest Deadline First) — used in hard real-time systems
- Cooperative scheduling — tasks voluntarily yield; no preemption
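The first two policies combine naturally: pick the highest-priority ready task, and rotate among ready tasks of equal priority. The sketch below is a simplified model, not FreeRTOS source; it assumes higher numbers mean higher priority (as in FreeRTOS) and uses made-up type and function names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t priority;  /* higher number = higher priority */
    bool    ready;
} task_t;

/* Returns the index of the next task to run, or n if no task is ready.
 * The scan starts just after `current`, so equal-priority tasks take turns. */
size_t pick_next_task(const task_t *tasks, size_t n, size_t current)
{
    size_t best = n;

    if (n == 0U) {
        return n;
    }
    for (size_t step = 1; step <= n; step++) {
        size_t i = (current + step) % n;
        if (!tasks[i].ready) {
            continue;
        }
        if (best == n || tasks[i].priority > tasks[best].priority) {
            best = i;
        }
    }
    return best;
}
```

Calling this from a tick handler gives time slicing among equals, while calling it whenever a task becomes ready gives preemption by higher-priority tasks.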
Priority inversion & priority inheritance
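Priority inversion is one of the most famous bugs in embedded systems (it caused the Mars Pathfinder resets). It occurs when a high-priority task is blocked waiting for a resource held by a low-priority task, which itself is preempted by a medium-priority task, so the high-priority task effectively waits on the medium-priority one. Priority inheritance bounds that delay by temporarily raising the resource holder to the priority of the highest-priority waiter until it releases the resource.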
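The inheritance mechanism can be illustrated with a toy mutex that tracks its owner. This is a minimal sketch under simplifying assumptions (single mutex, no wait queue, no nesting), with hypothetical names; a real RTOS implementation handles far more cases.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t base_priority;       /* priority assigned at creation */
    uint8_t effective_priority;  /* may be boosted while holding a mutex */
} pi_task_t;

typedef struct {
    pi_task_t *owner;  /* NULL when the mutex is free */
} pi_mutex_t;

/* Returns 1 if the caller acquired the mutex, 0 if it would have to block. */
int pi_mutex_take(pi_mutex_t *m, pi_task_t *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return 1;
    }
    /* Blocked: lend our priority to the owner so it cannot be starved. */
    if (caller->effective_priority > m->owner->effective_priority) {
        m->owner->effective_priority = caller->effective_priority;
    }
    return 0;
}

/* Release the mutex and drop any inherited priority. */
void pi_mutex_give(pi_mutex_t *m)
{
    if (m->owner != NULL) {
        m->owner->effective_priority = m->owner->base_priority;
        m->owner = NULL;
    }
}
```

While the low-priority owner is boosted, a medium-priority task can no longer preempt it, so the high-priority task waits only for the length of the critical section.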
Semaphore vs Mutex vs Binary Semaphore
- Mutex — ownership-based. Only the task that took it can release it. Supports priority inheritance.
- Binary semaphore — signalling between tasks (or ISR→task). No ownership. No priority inheritance.
- Counting semaphore — tracks availability of N identical resources.
ISR & RTOS integration (critical concept)
You cannot call blocking RTOS APIs from an ISR. FreeRTOS provides FromISR variants for this reason:
// In ISR — use FromISR variant and check if context switch needed
void EXTI0_IRQHandler(void) {
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    xSemaphoreGiveFromISR(xSemaphore, &xHigherPriorityTaskWoken);
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}
Stack overflow detection in FreeRTOS
- Enable configCHECK_FOR_STACK_OVERFLOW (mode 1 or 2)
- Use uxTaskGetStackHighWaterMark() to measure worst-case stack usage
- Mode 2 checks on each context switch that the end of the stack still holds the known fill pattern (0xA5) written when the task was created
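The pattern-fill idea behind high-water-mark measurement can be reproduced on any buffer. The sketch below is not FreeRTOS code: the function names are hypothetical, and it assumes a descending stack, so untouched bytes remain at the low-address end.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5U

/* Paint the whole stack region with the fill pattern before first use. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count bytes at the low end that still hold the pattern, i.e. the
 * smallest amount of free stack there has been since painting. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t unused = 0;
    while (unused < size && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

A result close to zero means the task has come close to overflowing at some point; FreeRTOS reports the equivalent figure in stack words rather than bytes.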
Phase 4: Bare-Metal Firmware
Bare-metal programming — running firmware directly on hardware with no OS — is the heart of many embedded roles, especially at automotive, medical device, and defence companies.
MCU startup sequence
Know every step from reset to main():
- Reset exception fires → on Cortex-M, the initial stack pointer is loaded from offset 0x00 of the vector table
- PC is then loaded from the reset vector (offset 0x04 in the vector table)
- Startup code copies .data from Flash to RAM
- Startup code zeroes .bss
- SystemInit() called (clock config, PLL setup); some vendor startup files call it before the .data/.bss steps
- C++ static constructors called (if applicable)
- main() called
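The copy and zero steps are small enough to write out. The sketch below takes the section boundaries as parameters so it can run on a host; in a real startup file they would be linker-script symbols, and `startup_init_ram` is a hypothetical name.

```c
#include <stdint.h>

/* Copy the .data load image into RAM, then zero .bss.
 * [data_start, data_end) and [bss_start, bss_end) are word-aligned ranges. */
void startup_init_ram(const uint32_t *data_load,
                      uint32_t *data_start, const uint32_t *data_end,
                      uint32_t *bss_start, const uint32_t *bss_end)
{
    while (data_start < data_end) {
        *data_start++ = *data_load++;  /* initial values live in Flash */
    }
    while (bss_start < bss_end) {
        *bss_start++ = 0U;             /* C guarantees zeroed statics */
    }
}
```

This must run before any C code relies on global state, which is why it sits in the reset handler ahead of main().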
Linker scripts
Interviewers at companies with bare-metal stacks (NXP, ST, Silicon Labs) will ask you to explain linker scripts. Key concepts:
- MEMORY block — defines Flash and RAM regions with origins and lengths
- SECTIONS block — maps input sections to memory regions
- KEEP() — prevents linker from garbage-collecting ISR vector table entries
- Exported symbols (e.g., _etext, _sdata) used by startup code
DMA (Direct Memory Access)
- DMA allows peripherals to transfer data to/from memory without CPU involvement
- Critical for: high-speed UART/SPI, ADC streaming, audio codecs
- Circular mode — DMA wraps around the buffer (ring buffer from hardware)
- Cache coherency issue: on Cortex-M7 (with D-cache), clean the cache before the DMA reads a buffer the CPU wrote, and invalidate it before the CPU reads a buffer the DMA wrote
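With circular mode, software usually derives the hardware write position from the controller's "transfers remaining" counter. The sketch below assumes a counter that counts down from the buffer size and reloads on wrap (STM32 parts call it NDTR); the function names are hypothetical.

```c
#include <stddef.h>

/* Index where the DMA will write next, from its down-counting remainder. */
size_t dma_write_index(size_t buf_size, size_t remaining)
{
    return (buf_size - remaining) % buf_size;
}

/* Bytes received but not yet consumed by software. */
size_t dma_bytes_available(size_t buf_size, size_t remaining, size_t read_index)
{
    size_t write_index = dma_write_index(buf_size, remaining);
    return (write_index + buf_size - read_index) % buf_size;
}
```

This scheme cannot tell a completely full buffer from an empty one, so the consumer has to drain data faster than one full buffer period or use half-transfer and transfer-complete interrupts to detect overruns.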
Watchdog timers
- IWDG (Independent Watchdog) — clocked by internal LSI, cannot be stopped once started. Resets MCU on timeout.
- WWDG (Window Watchdog) — must be kicked within a time window (not too early, not too late)
- HardFault/MemManage handlers — know how to decode the fault status registers (SCB->CFSR) to find the cause, and read MMFAR/BFAR (when their valid bits are set) or the stacked PC to find the faulting address
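Decoding CFSR is mostly a matter of testing bits. The sketch below takes the register value as a parameter so it can run off-target; the bit positions follow the ARMv7-M layout, while the enum and function names are made up for illustration and only a subset of the flags is covered.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFSR_IACCVIOL    (1UL << 0)   /* MMFSR: instruction access violation */
#define CFSR_DACCVIOL    (1UL << 1)   /* MMFSR: data access violation */
#define CFSR_MMARVALID   (1UL << 7)   /* MMFAR holds a valid address */
#define CFSR_PRECISERR   (1UL << 9)   /* BFSR: precise data bus error */
#define CFSR_IMPRECISERR (1UL << 10)  /* BFSR: imprecise data bus error */
#define CFSR_BFARVALID   (1UL << 15)  /* BFAR holds a valid address */
#define CFSR_UNDEFINSTR  (1UL << 16)  /* UFSR: undefined instruction */
#define CFSR_UNALIGNED   (1UL << 24)  /* UFSR: unaligned access */
#define CFSR_DIVBYZERO   (1UL << 25)  /* UFSR: divide by zero */

typedef enum {
    FAULT_NONE, FAULT_MEM_ACCESS, FAULT_BUS_PRECISE, FAULT_BUS_IMPRECISE,
    FAULT_UNDEF_INSTR, FAULT_UNALIGNED, FAULT_DIV_BY_ZERO, FAULT_OTHER
} fault_cause_t;

/* Classify the first recognised cause in a CFSR snapshot. */
fault_cause_t cfsr_decode(uint32_t cfsr)
{
    if (cfsr == 0U)                                return FAULT_NONE;
    if (cfsr & (CFSR_IACCVIOL | CFSR_DACCVIOL))    return FAULT_MEM_ACCESS;
    if (cfsr & CFSR_PRECISERR)                     return FAULT_BUS_PRECISE;
    if (cfsr & CFSR_IMPRECISERR)                   return FAULT_BUS_IMPRECISE;
    if (cfsr & CFSR_UNDEFINSTR)                    return FAULT_UNDEF_INSTR;
    if (cfsr & CFSR_UNALIGNED)                     return FAULT_UNALIGNED;
    if (cfsr & CFSR_DIVBYZERO)                     return FAULT_DIV_BY_ZERO;
    return FAULT_OTHER;
}

/* True when MMFAR or BFAR can be trusted to hold the faulting address. */
bool cfsr_fault_address_valid(uint32_t cfsr)
{
    return (cfsr & (CFSR_MMARVALID | CFSR_BFARVALID)) != 0U;
}
```

In a real handler you would capture `SCB->CFSR` along with the stacked register frame as early as possible, then pass the saved value to a decoder like this for logging.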
Phase 5: Embedded Linux & Kernel Drivers
For roles at companies building Linux-based embedded products (Raspberry Pi, automotive infotainment, industrial PLCs, networking hardware), expect deep Linux internals questions.
Kernel vs userspace
- System calls are the only bridge (write, read, ioctl, mmap)
- Kernel runs in privileged mode with direct hardware access
- Userspace drivers (UIO, VFIO) are growing in popularity
Linux device driver model
- Character devices — byte stream interface (most common for custom peripherals)
- Platform drivers — match against device tree compatible strings
- Device tree — hardware description passed to kernel at boot (ARM boards)
- probe()/remove() lifecycle
Memory allocation in kernel
- kmalloc(size, GFP_KERNEL) — physically contiguous, size-limited (~4MB)
- vmalloc(size) — virtually contiguous, larger allocations, slower TLB performance
- GFP_ATOMIC — use in ISR context (cannot sleep)
Debugging tools
- GDB + OpenOCD / JLink — on-chip debugging via JTAG/SWD
- ftrace — kernel function tracer, latency analysis
- perf — performance profiling
- Logic analyser + oscilloscope — hardware signal debugging (UART, SPI, I2C decoding)
Phase 6: DSA for Embedded Coders
Not all embedded interviews include LeetCode-style rounds, but they are becoming more common, especially at FAANG-adjacent companies (Apple, Amazon, Google) hiring embedded engineers. Even traditional embedded companies test linked lists, strings, and memory APIs.
Most frequently tested embedded DSA topics
- Linked lists — reverse, detect cycle (Floyd's), merge sorted lists, delete duplicates
- Ring buffer / circular buffer — implement from scratch. Used everywhere in embedded (UART RX buffer, audio codec, DMA double-buffering)
- Strings — implement strncpy, memcpy, safe atoi/itoa, reverse string in-place
- Stacks/Queues — array-based, used for state machines and command queues
- Binary trees / BST — DFS traversals (in/pre/post order), BFS level-order
- Sorting — know insertion sort (O(n²) but cache-friendly for small N on MCUs)
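Two of the linked-list staples from the list above, in-place reversal and Floyd's cycle detection, are sketched here without any dynamic allocation, which is often a constraint in embedded rounds. The type and function names are illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Reverse a singly linked list in place; returns the new head. */
node_t *list_reverse(node_t *head)
{
    node_t *prev = NULL;
    while (head != NULL) {
        node_t *next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}

/* Floyd's algorithm: a fast pointer catches a slow one only if a cycle exists. */
bool list_has_cycle(const node_t *head)
{
    const node_t *slow = head;
    const node_t *fast = head;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
            return true;
        }
    }
    return false;
}
```

Both run in O(n) time and O(1) extra space, which is usually the follow-up question.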
Implement a ring buffer (classic embedded interview question)
#define RB_SIZE 256U  // must be a power of 2 for the mask-based wrap
typedef struct {
    uint8_t  buf[RB_SIZE];
    uint8_t  head;   // next write index
    uint8_t  tail;   // next read index
    uint16_t count;  // ranges 0..256, so 8 bits are not enough
} ring_buf_t;
bool rb_push(ring_buf_t *rb, uint8_t byte) {
    if (rb->count == RB_SIZE) return false;  // full: byte is dropped
    rb->buf[rb->head] = byte;
    rb->head = (rb->head + 1) & (RB_SIZE - 1);  // power-of-2 wrap
    rb->count++;
    return true;
}
bool rb_pop(ring_buf_t *rb, uint8_t *out) {
    if (rb->count == 0) return false;  // empty
    *out = rb->buf[rb->tail];
    rb->tail = (rb->tail + 1) & (RB_SIZE - 1);
    rb->count--;
    return true;
}
How to Analyse a Coding Question in an Embedded Interview
The right framework for approaching an embedded coding question is different from a standard LeetCode problem. Here is a proven 5-step method:
1. Clarify constraints & environment
Ask immediately: Is this bare-metal or RTOS? What MCU/architecture? Is malloc available? What are the real-time requirements? Interrupt-driven or polling? These answers completely change the implementation.
2. Identify the data path
Trace where data comes from (sensor, peripheral, external host) and where it goes (memory, actuator, network). Draw a block diagram if possible. This shows the interviewer you think in systems, not just code.
3. Identify concurrency / interrupt safety requirements
Will this code run in ISR context? Will it be accessed from multiple tasks? Flag volatile usage, critical section requirements, FromISR API variants. Embedded interviewers heavily reward this thinking.
4. Write the API first, then the implementation
Declare your function signatures and structs before writing the body. This demonstrates design thinking and gives the interviewer a chance to redirect before you write 50 lines in the wrong direction.
5. Walk through edge cases specific to embedded
Buffer overflow (check bounds). Null pointer dereference. Integer overflow in timer math. Missed wakeup in semaphore patterns. Unaligned memory access on ARM. These are the edge cases that separate embedded specialists from generic software engineers.
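The timer-math edge case is worth being able to write from memory. Assuming a free-running 32-bit tick counter that wraps, unsigned subtraction gives the correct elapsed time across one wrap; the function names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Elapsed ticks between two samples of a wrapping 32-bit counter.
 * Unsigned arithmetic is modulo 2^32, so one wrap is handled for free. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Wrap-safe timeout check: compare durations, never absolute deadlines. */
bool timeout_expired(uint32_t start, uint32_t now, uint32_t timeout)
{
    return ticks_elapsed(start, now) >= timeout;
}
```

The buggy variant, `now >= start + timeout`, fails when `start + timeout` overflows, which typically shows up only after the device has been running for days.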
System Design for Embedded Roles
Senior embedded roles (L4+ at Apple, Staff at Qualcomm, Principal at NXP) will include a firmware architecture or embedded system design round. Common prompts:
- "Design an OTA (Over-The-Air) firmware update system for an IoT device"
- "Design the firmware architecture for a motor controller"
- "How would you implement a bootloader for a safety-critical system?"
- "Design a multi-sensor data acquisition system with DMA and double-buffering"
Framework for embedded system design
- Requirements: Real-time constraints? Safety level (IEC 61508, ISO 26262)? Power budget? Flash/RAM budget?
- Hardware interface: What peripherals, buses, sensors?
- Software architecture: Bare-metal vs RTOS? Event-driven vs polling? State machine structure?
- Data flow: Where does data enter, transform, and exit?
- Error handling & fault tolerance: Watchdog strategy, error codes, safe states
- Testability: Hardware-in-loop (HIL), unit tests with mocks for HAL
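For the OTA and bootloader prompts, the fault-tolerance step often comes down to choosing which image to boot. The sketch below assumes a two-slot (A/B) layout with a stored CRC-32 and version per slot; the `fw_slot_t` structure and `select_boot_slot` are hypothetical, and a production bootloader would typically add signature verification and rollback counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by Ethernet/zlib). */
uint32_t crc32_ieee(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFU;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1U) ? ((crc >> 1) ^ 0xEDB88320U) : (crc >> 1);
        }
    }
    return ~crc;
}

typedef struct {
    const uint8_t *image;   /* start of the firmware image */
    size_t   length;        /* image length in bytes */
    uint32_t expected_crc;  /* CRC stored when the image was written */
    uint32_t version;       /* higher = newer */
} fw_slot_t;

/* Returns the index of the newest slot whose CRC matches, or -1 if none do. */
int select_boot_slot(const fw_slot_t *slots, int slot_count)
{
    int best = -1;
    for (int i = 0; i < slot_count; i++) {
        if (crc32_ieee(slots[i].image, slots[i].length) != slots[i].expected_crc) {
            continue;  /* corrupt or partially written image */
        }
        if (best < 0 || slots[i].version > slots[best].version) {
            best = i;
        }
    }
    return best;
}
```

A return value of -1 is the "safe state" case: the bootloader stays resident and waits for a new image instead of jumping into corrupt code.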
8-Week Embedded Interview Study Plan
Weeks 1–2: C Mastery
- volatile, static, const
- Bitwise ops (50+ problems)
- Pointers & memory layout
- Endianness & byte manipulation
Weeks 3–4: Hardware & Protocols
- UART, SPI, I2C internals
- CAN bus basics
- Interrupts & ISR design
- PWM, ADC, timers
Weeks 5–6: RTOS & Bare-metal
- FreeRTOS tasks, queues, semaphores
- Priority inversion & inheritance
- MCU startup sequence
- Linker scripts, DMA, watchdog
Week 7: Linux & System Design
- Character drivers & device tree
- kmalloc/vmalloc
- OTA update design
- Motor/sensor system design
Week 8: Mock Interviews & Review
- 2x mock interviews
- Ring buffer + linked list impl
- Revisit weakest topics
- Behavioural stories (STAR)
Top Companies & What They Typically Ask
| Company | Focus Areas | DSA Round? |
|---|---|---|
| Apple | C++, real-time audio, power management, boot sequence | Yes (LeetCode medium) |
| Tesla | CAN, AUTOSAR, safety-critical firmware, bare-metal C | Sometimes |
| Qualcomm | ARM DSP, RTOS, modem firmware, Linux kernel | Yes |
| NXP / ST | MCU peripherals, linker scripts, HAL drivers, RTOS | Rarely |
| Texas Instruments | DSP/MCU firmware, CAN, motor control, bare-metal | Rarely |
| Amazon (Alexa/IoT) | FreeRTOS, OTA, MQTT, power optimization | Yes (LeetCode) |
| Arm Ltd. | Architecture internals, TrustZone, compiler behavior | Yes |
Ready to Start Practising?
EmbeddedPrep has 100+ hands-on problems covering every topic in this guide — bitwise ops, RTOS design, embedded C patterns, system design, and more.