You write a struct to represent a database entity. Maybe 10 fields, maybe 20. What could possibly go wrong?
Nothing, according to your tests. But somewhere in production, your heap is 30% larger than it should be, your garbage collector is working overtime, and your L1 cache holds fewer useful bytes than it could. The reason? Invisible padding bytes silently inflating every instance of your struct.
This is the story of struct field alignment: a memory optimization that costs nothing to implement but can significantly improve performance.
Why Alignment Exists
Modern CPUs don’t load memory one byte at a time. They operate on aligned words, typically 8 bytes on 64-bit systems. When a value’s address is not divisible by its type’s alignment requirement, one of two things happens:
- Performance penalty: The CPU may need multiple memory accesses to load the value, for example when it straddles a cache-line boundary
- Hardware fault: On some architectures (historically common, now rare), unaligned access causes a crash
To prevent this, the Go compiler automatically inserts padding bytes between struct fields. The padding ensures each field starts at a memory address that satisfies its alignment requirement.
The Go Spec on Alignment
According to the Go Language Specification:
For a variable x of any type: unsafe.Alignof(x) is at least 1. For a variable x of struct type: unsafe.Alignof(x) is the largest of all the values unsafe.Alignof(x.f) for each field f of x, but at least 1.
This means a struct’s alignment is determined by its largest-aligned field. On a 64-bit platform, a struct containing an int64 must be 8-byte aligned, even if its first field is a bool.
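The rule quoted above can be checked directly with unsafe.Alignof. The following is a minimal sketch; the flagFirst type and both helper functions are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// flagFirst is a hypothetical struct whose first field is a bool.
type flagFirst struct {
	Flag bool
	ID   int64
}

// structAlign returns the alignment of the struct as a whole.
func structAlign() uintptr {
	var s flagFirst
	return unsafe.Alignof(s)
}

// maxFieldAlign returns the largest alignment among flagFirst's fields.
func maxFieldAlign() uintptr {
	var s flagFirst
	largest := unsafe.Alignof(s.Flag)
	if a := unsafe.Alignof(s.ID); a > largest {
		largest = a
	}
	return largest
}

func main() {
	fmt.Println("struct alignment:       ", structAlign())
	fmt.Println("largest field alignment:", maxFieldAlign())
	fmt.Println("equal:", structAlign() == maxFieldAlign())
}
```

The two values should match on any platform; the concrete number depends on the architecture (8 on typical 64-bit targets).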
Demonstrating the Silent Bloat
Consider this innocent-looking struct with fields ordered “logically” (each flag next to the data it relates to):
type BadStruct struct {
	IsActive   bool    // 1 byte + 7 padding
	ID         uint64  // 8 bytes
	IsVerified bool    // 1 byte + 7 padding
	Name       string  // 16 bytes (ptr + len)
	IsAdmin    bool    // 1 byte + 7 padding
	Score      float64 // 8 bytes
	// ... more fields
}
Each 1-byte bool followed by an 8-byte type wastes 7 bytes of padding.
Let’s measure real structs with all common Go types:
// BadStruct: Fields interleaved to maximize padding
type BadStruct struct {
	IsActive    bool              // 1 byte + 7 padding
	ID          uint64            // 8 bytes
	IsVerified  bool              // 1 byte + 7 padding
	Name        string            // 16 bytes
	IsAdmin     bool              // 1 byte + 7 padding
	Score       float64           // 8 bytes
	IsPremium   bool              // 1 byte + 3 padding
	ParentID    uint32            // 4 bytes
	TinyVal     int8              // 1 byte + 1 padding
	SmallVal    int16             // 2 bytes + 4 padding
	Email       string            // 16 bytes
	IsDeleted   bool              // 1 byte + 7 padding
	Count       int64             // 8 bytes
	IsArchived  bool              // 1 byte + 3 padding
	Rating      float32           // 4 bytes
	Status      int8              // 1 byte + 7 padding
	Tags        []string          // 24 bytes
	Enabled     bool              // 1 byte + 7 padding
	Metadata    map[string]string // 8 bytes
	Ready       bool              // 1 byte + 7 padding
	CreatedAt   int64             // 8 bytes
	Done        bool              // 1 byte + 7 padding
	UpdatedAt   int64             // 8 bytes
	Callback    func()            // 8 bytes
	Flag        bool              // 1 byte + 7 padding
	Description string            // 16 bytes
}
Measuring with unsafe.Sizeof:
fmt.Println(unsafe.Sizeof(BadStruct{})) // Output: 224
224 bytes. Let’s see how much is wasted.
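One way to put a number on the waste is to subtract the sum of the field sizes from the struct's total size. The sketch below does this with reflection; the sample type (the first six fields of BadStruct) and the paddingBytes helper are hypothetical names used only for illustration, and nested structs would have their internal padding counted as payload.

```go
package main

import (
	"fmt"
	"reflect"
)

// sample mirrors the first six fields of BadStruct.
type sample struct {
	IsActive   bool
	ID         uint64
	IsVerified bool
	Name       string
	IsAdmin    bool
	Score      float64
}

// paddingBytes returns the total size of the struct value v, the sum of
// its direct field sizes, and the difference between the two (padding).
func paddingBytes(v any) (total, payload, padding uintptr) {
	t := reflect.TypeOf(v)
	total = t.Size()
	for i := 0; i < t.NumField(); i++ {
		payload += t.Field(i).Type.Size()
	}
	return total, payload, total - payload
}

func main() {
	total, payload, padding := paddingBytes(sample{})
	// On a 64-bit platform this prints total=56 payload=35 padding=21.
	fmt.Printf("total=%d payload=%d padding=%d\n", total, payload, padding)
}
```

Doing the same arithmetic by hand for the full 26-field BadStruct on a 64-bit platform gives 150 bytes of field data inside 224 bytes, so 74 bytes are padding.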
The Alignment Rules
Go’s alignment requirements are architecture-dependent but follow predictable rules on 64-bit systems:
| Type | Size (bytes) | Alignment (bytes) | Notes |
|---|---|---|---|
| bool | 1 | 1 | |
| int8, uint8, byte | 1 | 1 | |
| int16, uint16 | 2 | 2 | |
| int32, uint32, float32 | 4 | 4 | |
| int64, uint64, float64 | 8 | 8 | |
| int, uint, uintptr | 8 | 8 | On 64-bit systems |
| string | 16 | 8 | Header: {ptr, len} |
| slice ([]T) | 24 | 8 | Header: {ptr, len, cap} |
| map | 8 | 8 | Pointer to the runtime map structure |
| func | 8 | 8 | Pointer |
| interface{} | 16 | 8 | {type, data} |
| pointer (*T) | 8 | 8 | |
| chan | 8 | 8 | Pointer to hchan |
string is 16 bytes regardless of its content length.
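The values in this table can be reproduced on your own platform. The following sketch uses reflection to print the size and alignment of a zero value of each type; sizeAndAlign is a hypothetical helper name, and the numbers it prints depend on the target architecture.

```go
package main

import (
	"fmt"
	"reflect"
)

// sizeAndAlign reports the size and alignment, in bytes, of v's dynamic type.
func sizeAndAlign(v any) (size uintptr, align int) {
	t := reflect.TypeOf(v)
	return t.Size(), t.Align()
}

func main() {
	// Zero values of the types from the table. An interface type cannot be
	// sampled this way because TypeOf reports the dynamic type it holds.
	samples := []any{
		false, int8(0), int16(0), int32(0), float32(0), int64(0), float64(0),
		int(0), "", []string(nil), map[string]string(nil), (func())(nil),
		(*int)(nil), (chan int)(nil),
	}
	for _, v := range samples {
		size, align := sizeAndAlign(v)
		fmt.Printf("%-18T size=%-2d align=%d\n", v, size, align)
	}
}
```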
The Optimization Rule
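Order fields from largest alignment requirement to smallest. Slices, strings, and interfaces have the same 8-byte alignment as the other pointer-sized types, so placing them first by size is a convention rather than a requirement:
- 24-byte types: Slices ([]T)
- 16-byte types: Strings, interfaces
- 8-byte types: int64, uint64, float64, pointers, maps, funcs, chans
- 4-byte types: int32, uint32, float32
- 2-byte types: int16, uint16
- 1-byte types: int8, uint8, bool, byte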
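To see why this ordering works, it helps to simulate the layout: place each field at the next offset that is a multiple of its alignment, then round the total up to the struct's alignment. The sketch below is an illustrative model, not the compiler's actual code; the field, interleaved, fieldsOf, layoutSize, and byAlignment names are hypothetical, and the model ignores special cases such as trailing zero-size fields.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// field holds the size and alignment of one struct field.
type field struct {
	name        string
	size, align uintptr
}

// interleaved is a small struct with flags mixed between wider fields.
type interleaved struct {
	IsActive  bool
	ID        uint64
	IsPremium bool
	ParentID  uint32
	TinyVal   int8
	SmallVal  int16
	Name      string
}

// fieldsOf extracts size and alignment for each direct field of struct v.
func fieldsOf(v any) []field {
	t := reflect.TypeOf(v)
	fs := make([]field, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		fs = append(fs, field{name: f.Name, size: f.Type.Size(), align: uintptr(f.Type.FieldAlign())})
	}
	return fs
}

// roundUp rounds n up to the next multiple of align.
func roundUp(n, align uintptr) uintptr {
	return (n + align - 1) / align * align
}

// layoutSize simulates sequential layout of fs and returns the struct size.
func layoutSize(fs []field) uintptr {
	var offset uintptr
	maxAlign := uintptr(1)
	for _, f := range fs {
		offset = roundUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
	}
	return roundUp(offset, maxAlign)
}

// byAlignment returns a copy of fs sorted from largest to smallest alignment.
func byAlignment(fs []field) []field {
	sorted := append([]field(nil), fs...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].align > sorted[j].align })
	return sorted
}

func main() {
	fs := fieldsOf(interleaved{})
	// On a 64-bit platform: 48 bytes as declared, 40 bytes after sorting.
	fmt.Println("declared order:", layoutSize(fs))
	fmt.Println("sorted order:  ", layoutSize(byAlignment(fs)))
}
```

Because every Go type's size is a multiple of its alignment, sorting by alignment alone leaves no gaps between fields; only tail padding can remain.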
Applying this to our struct:
// GoodStruct: Fields ordered by alignment (largest to smallest)
type GoodStruct struct {
// 24-byte: Slices
Tags []string // 24 bytes
// 16-byte: Strings
Name string // 16 bytes
Email string // 16 bytes
Description string // 16 bytes
// 8-byte: int64, uint64, float64, pointers
ID uint64 // 8 bytes
Count int64 // 8 bytes
Score float64 // 8 bytes
CreatedAt int64 // 8 bytes
UpdatedAt int64 // 8 bytes
Metadata map[string]string // 8 bytes (pointer)
Callback func() // 8 bytes (pointer)
// 4-byte: int32, uint32, float32
ParentID uint32 // 4 bytes
Rating float32 // 4 bytes
// 2-byte: int16, uint16
SmallVal int16 // 2 bytes
// 1-byte: int8, bool (packed together)
TinyVal int8 // 1 byte
Status int8 // 1 byte
IsActive bool // 1 byte
IsVerified bool // 1 byte
IsAdmin bool // 1 byte
IsPremium bool // 1 byte
IsDeleted bool // 1 byte
IsArchived bool // 1 byte
Enabled bool // 1 byte
Ready bool // 1 byte
Done bool // 1 byte
Flag bool // 1 byte + 2 padding (struct alignment)
}
fmt.Println(unsafe.Sizeof(GoodStruct{})) // Output: 152
152 bytes. We saved 72 bytes (a 32% reduction) without touching any logic, just by reordering fields.
Measuring the Impact with Benchmarks
Here are real benchmarks run on an Apple M2 (darwin/arm64):
Struct Sizes
=== TestEntitySizes ===
BadStruct size: 224 bytes
GoodStruct size: 152 bytes
Memory saved: 72 bytes (32.1% reduction)
Allocation Benchmarks
goos: darwin
goarch: arm64
cpu: Apple M2

BenchmarkBadStruct_Alloc-8        17997783      61.11 ns/op      224 B/op    1 allocs/op
BenchmarkGoodStruct_Alloc-8       33894393      31.85 ns/op      160 B/op    1 allocs/op

BenchmarkBadStruct_Slice1k-8         95834      12318 ns/op   229377 B/op    1 allocs/op
BenchmarkGoodStruct_Slice1k-8       136102       8421 ns/op   155649 B/op    1 allocs/op

BenchmarkBadStruct_Slice10k-8        16683      71874 ns/op  2244613 B/op    1 allocs/op
BenchmarkGoodStruct_Slice10k-8       20770      60512 ns/op  1523715 B/op    1 allocs/op
Analysis
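| Benchmark | Bad | Good | Improvement |
|---|---|---|---|
| Single Alloc | 61.1 ns, 224 B | 31.9 ns, 160 B | 48% less time, 29% less memory |
| 1k Slice | 12.3 µs, 224 KiB | 8.4 µs, 152 KiB | 32% less time, 32% less memory |
| 10k Slice | 71.9 µs, 2.14 MiB | 60.5 µs, 1.45 MiB | 16% less time, 32% less memory |
The single GoodStruct allocation reports 160 B rather than 152 B because the allocator rounds each request up to its nearest size class.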
Run the Benchmarks Yourself
Download struct_alignment_benchmark_test.go, then run:
$ go mod init alignment_test
$ go test -v -run TestEntitySizes
$ go test -bench=. -benchmem
How Padding Works Behind the Scenes
Let’s trace through the field offsets to understand exactly where padding is inserted:
BadStruct Field Offsets:
| Field | Offset | Size | Explanation |
|---|---|---|---|
| IsActive | 0 | 1 | Starts at 0 |
| (padding) | 1 | 7 | ID needs 8-byte alignment |
| ID | 8 | 8 | Starts at 8 (divisible by 8) |
| IsVerified | 16 | 1 | Starts at 16 |
| (padding) | 17 | 7 | Name needs 8-byte alignment |
| Name | 24 | 16 | Starts at 24 (divisible by 8) |
| IsAdmin | 40 | 1 | |
| (padding) | 41 | 7 | Score needs 8-byte alignment |
| Score | 48 | 8 | |
| IsPremium | 56 | 1 | |
| (padding) | 57 | 3 | ParentID needs 4-byte alignment |
| ParentID | 60 | 4 | 60 is divisible by 4 |
| TinyVal | 64 | 1 | |
| (padding) | 65 | 1 | SmallVal needs 2-byte alignment |
| SmallVal | 66 | 2 | 66 is divisible by 2 |
| (padding) | 68 | 4 | Email needs 8-byte alignment |
| Email | 72 | 16 | 72 is divisible by 8 |
| … | | | |
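The padding rows in a table like this are simply the gaps between the end of one field and the offset of the next. As a rough sketch, reflection can list them for any struct; the record type, the gap type, and the findGaps helper are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// record mirrors a few BadStruct fields to keep the example short.
type record struct {
	IsActive  bool
	ID        uint64
	IsPremium bool
	ParentID  uint32
}

// gap describes a run of padding bytes inside a struct.
type gap struct {
	before string  // field that follows the gap; empty for tail padding
	offset uintptr // where the padding starts
	size   uintptr // number of padding bytes
}

// findGaps walks the direct fields of struct v and reports every gap
// between the end of one field and the start of the next, plus tail padding.
func findGaps(v any) []gap {
	t := reflect.TypeOf(v)
	var gaps []gap
	var end uintptr
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Offset > end {
			gaps = append(gaps, gap{before: f.Name, offset: end, size: f.Offset - end})
		}
		end = f.Offset + f.Type.Size()
	}
	if t.Size() > end {
		gaps = append(gaps, gap{offset: end, size: t.Size() - end})
	}
	return gaps
}

func main() {
	for _, g := range findGaps(record{}) {
		where := "end of struct"
		if g.before != "" {
			where = "before " + g.before
		}
		fmt.Printf("%d padding bytes at offset %d (%s)\n", g.size, g.offset, where)
	}
}
```

On a 64-bit platform this should report 7 bytes before ID and 3 bytes before ParentID, matching the pattern in the table.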
You can inspect these offsets programmatically:
var s BadStruct
fmt.Printf("IsActive offset: %d\n", unsafe.Offsetof(s.IsActive))     // 0
fmt.Printf("ID offset: %d\n", unsafe.Offsetof(s.ID))                 // 8
fmt.Printf("IsVerified offset: %d\n", unsafe.Offsetof(s.IsVerified)) // 16
fmt.Printf("Name offset: %d\n", unsafe.Offsetof(s.Name))             // 24
Why This Matters Beyond Memory
1. Reduced GC Pressure
Smaller structs mean:
- Smaller heap -> Less memory to scan during garbage collection
- Better cache locality -> GC mark phase runs faster
- Lower allocation churn -> GC triggers less frequently
If runtime.scanobject takes a significant share of your CPU profiles, you have GC pressure. Shrinking frequently allocated structs is one direct way to reduce this cost.
2. CPU Cache Efficiency
The L1 data cache line is typically 64 bytes on x86-64 (Apple silicon, including the M2 used for the benchmarks above, uses 128-byte lines). Compact structs fit more instances per cache line:
64-byte cache line capacity:
- BadStruct (224 bytes): 0.29 structs per line (3.5 lines per struct)
- GoodStruct (152 bytes): 0.42 structs per line (about 2.4 lines per struct)
When iterating over a slice of structs, compact layouts mean fewer cache misses and better hardware prefetching.
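The same arithmetic scales to a whole slice. The sketch below estimates how many cache lines a full scan touches, assuming 64-byte lines and a backing array that starts on a line boundary; cacheLine and linesTouched are hypothetical names for this illustration.

```go
package main

import "fmt"

// cacheLine is the assumed cache-line size in bytes.
const cacheLine = 64

// linesTouched returns how many cache lines a contiguous slice of n
// elements of elemSize bytes spans, assuming it starts on a line boundary.
func linesTouched(n, elemSize int) int {
	return (n*elemSize + cacheLine - 1) / cacheLine
}

func main() {
	fmt.Println("BadStruct x1000: ", linesTouched(1000, 224)) // 3500
	fmt.Println("GoodStruct x1000:", linesTouched(1000, 152)) // 2375
}
```

Scanning 1,000 elements therefore touches 1,125 fewer lines with the compact layout, the same 32% saving seen in the struct sizes.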
3. Network/Disk Efficiency
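Field-by-field encoders (protobuf, msgpack, gob) never write padding, so in-memory layout does not change their payloads. Layout matters when raw struct memory is written or shared directly, for example through unsafe casts or memory-mapped files, where it can affect:
- Network payload sizes
- Memory-mapped file efficiency
- Serialization/deserialization speed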
Detecting Alignment Issues
Tool: fieldalignment
The Go team provides an official analyzer:
$ go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
$ fieldalignment ./...
$ fieldalignment -fix ./...
Example output:
entity/user.go:15:6: struct of size 224 could be 152 (order fields by alignment)
entity/post.go:42:6: struct of size 120 could be 104 (order fields by alignment)
Manual Inspection
Use unsafe.Sizeof and unsafe.Offsetof to audit critical structs:
import (
	"fmt"
	"unsafe"
)

// Entity, Field1, and Field2 stand in for your own struct and its fields.
func auditStruct() {
	var e Entity
	fmt.Printf("Total size: %d bytes\n", unsafe.Sizeof(e))
	fmt.Printf("Field1 offset: %d\n", unsafe.Offsetof(e.Field1))
	fmt.Printf("Field2 offset: %d\n", unsafe.Offsetof(e.Field2))
	// Check for gaps between (offset + size) and next offset
}
Pitfalls and Caveats
1. Auto-fix Can Break Binary Compatibility
Running fieldalignment -fix rewrites your source to reorder fields. If your code depends on specific field ordering for binary serialization, memory-mapped files, or CGO interop, this will break things silently.
Always review changes before applying them. For CGO structs, field order must often match C struct definitions exactly.
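As an illustration of the binary-serialization case, the sketch below encodes two structs that hold the same values in a different field order using encoding/binary, which writes fields in declaration order. The headerV1 and headerV2 types and the encode helper are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// headerV1 is the original layout of a hypothetical on-disk header.
type headerV1 struct {
	Kind uint8
	Len  uint32
}

// headerV2 holds the same data after reordering for alignment.
type headerV2 struct {
	Len  uint32
	Kind uint8
}

// encode serializes a fixed-size value in little-endian byte order.
func encode(v any) []byte {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, v); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	before := encode(headerV1{Kind: 1, Len: 2})
	after := encode(headerV2{Len: 2, Kind: 1})
	fmt.Printf("before reorder: % x\n", before) // 01 02 00 00 00
	fmt.Printf("after reorder:  % x\n", after)  // 02 00 00 00 01
}
```

Both encodings are 5 bytes long, so a reader built against the old order would not fail outright; it would decode the wrong values.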
2. JSON/YAML Unmarshalling Is Unaffected
The common fear that reordering fields breaks JSON parsing is unfounded. The encoding/json package uses reflection and struct tags to map JSON keys to fields:
type User struct {
	Age  int    `json:"age"`  // Field order in memory
	Name string `json:"name"` // ≠ key order in JSON
}

// JSON: {"name":"Vikash","age":40}
// Works identically regardless of struct field order
3. Don’t Micro-optimize Everything
Alignment optimization matters when:
- The struct is instantiated thousands or millions of times
- The struct appears in hot paths (request handlers, tight loops)
- You’re seeing GC pressure (runtime.scanobject > 5% CPU)
- Memory is a constraint (embedded systems, large caches)
For structs used sparingly, readability may be more valuable than a few bytes.
4. Struct Tail Padding
Even optimally ordered structs may have tail padding. The struct’s total size must be a multiple of its alignment:
type Example struct {
	A int64 // 8 bytes
	B bool  // 1 byte + 7 padding = 16 total
}
This ensures arrays of structs maintain proper alignment for each element.
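The effect on arrays can be observed by measuring the distance between two consecutive elements. This is a small sketch; the pair type (same shape as Example) and the stride helper are hypothetical names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair has the same shape as Example: an int64 followed by a bool.
type pair struct {
	A int64
	B bool
}

// stride returns the distance in bytes between consecutive array elements.
func stride() uintptr {
	var arr [2]pair
	return uintptr(unsafe.Pointer(&arr[1])) - uintptr(unsafe.Pointer(&arr[0]))
}

// size returns unsafe.Sizeof for pair, tail padding included.
func size() uintptr {
	return unsafe.Sizeof(pair{})
}

// fieldAlign returns the alignment required by the int64 field.
func fieldAlign() uintptr {
	var p pair
	return unsafe.Alignof(p.A)
}

func main() {
	fmt.Println("Sizeof(pair):  ", size())
	fmt.Println("element stride:", stride())
	fmt.Println("stride is a multiple of A's alignment:", stride()%fieldAlign() == 0)
}
```

The stride equals the padded size, so the A field of every element lands on a properly aligned address.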
Quick Reference Checklist
When designing or refactoring structs:
- Order by alignment: 24-byte -> 16-byte -> 8-byte -> 4-byte -> 2-byte -> 1-byte
- Pack bools together: Place all bool fields at the end
- Verify with unsafe.Sizeof: Confirm your optimizations work
- Run fieldalignment: Catch what you miss, but review before applying
- Profile first: Only optimize structs that matter
// Quick verification
import (
	"fmt"
	"unsafe"
)

func checkSize[T any]() {
	var x T
	fmt.Printf("Type: %T, Size: %d bytes, Align: %d\n",
		x, unsafe.Sizeof(x), unsafe.Alignof(x))
}
Summary
| Concept | Key Point |
|---|---|
| Problem | CPU alignment requirements force compilers to insert padding |
| Symptom | Structs consume more memory than the sum of their fields |
| Solution | Order fields from largest to smallest alignment requirement |
| Benefit | Smaller structs (32% in the example above), better cache usage, lower GC pressure |
| Caveat | Don’t auto-fix binary-serialized or CGO structs without proper review and testing |
Field alignment is the kind of optimization that costs nothing to implement correctly. Once you internalize the ordering rules, writing compact structs becomes second nature. Your heap will be smaller, your GC will run faster, and your CPU cache will thank you.
Further Reading
- Go Language Specification: Size and Alignment - Go spec
- fieldalignment analyzer source - Understand how the tool works
- A Guide to the Go Garbage Collector - Why heap size matters
- unsafe package documentation - Sizeof, Offsetof, Alignof