NEON Vector Math Library

Overview

The Sandpiper SDK includes a NEON-optimized vector and matrix math library for 3D graphics applications. The library exposes C functions that use ARM NEON SIMD instructions when compiled with the -mfpu=neon -mfloat-abi=hard flags. A 128-bit NEON register holds four 32-bit floats, so most vector and matrix operations handle four components per instruction instead of one.

The library is portable: it uses the NEON code paths on ARMv7 targets with NEON support and falls back to standard C implementations on other platforms.

Header File: vec.h

Data Types

vec3_t - 3D Vector

typedef struct {
    float x, y, z;
    float _pad;  // Padding: brings the struct to 16 bytes (one 128-bit NEON register)
} vec3_t;
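
The effect of the padding field can be checked at compile time. The sketch below copies the layout under a hypothetical name (ref_vec3, not a library type) and asserts its size. Note that the padding fixes the size at 16 bytes; the alignment of a plain struct of floats is still that of float unless the header adds an alignment attribute, which this document does not show.

```c
#include <stddef.h>

/* Stand-in with the same layout as vec3_t; the ref_ name is hypothetical. */
typedef struct {
    float x, y, z;
    float _pad;
} ref_vec3;

/* Four floats: the struct fills exactly one 128-bit NEON register. */
_Static_assert(sizeof(ref_vec3) == 4 * sizeof(float),
               "ref_vec3 must be four floats wide");
_Static_assert(offsetof(ref_vec3, _pad) == 3 * sizeof(float),
               "_pad must follow z directly");

/* Byte stride between consecutive elements of a ref_vec3 array. */
static size_t ref_vec3_stride(void) {
    return sizeof(ref_vec3);
}
```

Because the stride equals the register width, an array of such vectors can be walked one register load per element.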

vec4_t - 4D Vector

typedef struct {
    float x, y, z, w;
} vec4_t;

mat3_t - 3x3 Matrix

typedef struct {
    float m[9];  // Column-major: m[col*3 + row]
} mat3_t;

mat4_t - 4x4 Matrix

typedef struct {
    float m[16];  // Column-major: m[col*4 + row]
} mat4_t;

quat_t - Quaternion

typedef struct {
    float x, y, z, w;
} quat_t;

Constants

#define NEON_PI          3.14159265358979323846f
#define NEON_TWO_PI      6.28318530717958647692f
#define NEON_HALF_PI     1.57079632679489661923f
#define NEON_DEG_TO_RAD  0.01745329251994329577f
#define NEON_RAD_TO_DEG  57.2957795130823208768f
#define NEON_EPSILON     1e-6f
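
A minimal sketch of how the conversion and epsilon constants are typically used. The REF_ macros copy the values above, and the two helper functions are hypothetical names for illustration, not library functions. An absolute tolerance of 1e-6 only makes sense for values of roughly unit magnitude.

```c
#include <math.h>
#include <stdbool.h>

/* Values copied from the constants above; the REF_ names are stand-ins. */
#define REF_PI          3.14159265358979323846f
#define REF_DEG_TO_RAD  0.01745329251994329577f
#define REF_EPSILON     1e-6f

/* True when a and b differ by at most REF_EPSILON (absolute tolerance). */
static bool ref_nearly_equal(float a, float b) {
    return fabsf(a - b) <= REF_EPSILON;
}

/* Degrees to radians by multiplying with the conversion constant. */
static float ref_deg_to_rad(float degrees) {
    return degrees * REF_DEG_TO_RAD;
}
```

For example, ref_nearly_equal(ref_deg_to_rad(180.0f), REF_PI) holds, while a direct == comparison of computed floats is not guaranteed to.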

Vector3 Functions

Creation

vec3_t vec3_create(float x, float y, float z);
vec3_t vec3_zero(void);
vec3_t vec3_one(void);
vec3_t vec3_up(void);      // (0, 1, 0)
vec3_t vec3_down(void);    // (0, -1, 0)
vec3_t vec3_left(void);    // (-1, 0, 0)
vec3_t vec3_right(void);   // (1, 0, 0)
vec3_t vec3_forward(void); // (0, 0, -1)
vec3_t vec3_back(void);    // (0, 0, 1)

Basic Operations

vec3_t vec3_add(vec3_t a, vec3_t b);
vec3_t vec3_sub(vec3_t a, vec3_t b);
vec3_t vec3_mul(vec3_t a, vec3_t b);  // Component-wise
vec3_t vec3_div(vec3_t a, vec3_t b);  // Component-wise
vec3_t vec3_scale(vec3_t v, float s);
vec3_t vec3_negate(vec3_t v);

Vector Math

float vec3_dot(vec3_t a, vec3_t b);
vec3_t vec3_cross(vec3_t a, vec3_t b);
float vec3_length(vec3_t v);
float vec3_length_squared(vec3_t v);
vec3_t vec3_normalize(vec3_t v);
float vec3_distance(vec3_t a, vec3_t b);
float vec3_distance_squared(vec3_t a, vec3_t b);

Interpolation & Utilities

vec3_t vec3_lerp(vec3_t a, vec3_t b, float t);
vec3_t vec3_min(vec3_t a, vec3_t b);
vec3_t vec3_max(vec3_t a, vec3_t b);
vec3_t vec3_clamp(vec3_t v, vec3_t min_val, vec3_t max_val);
vec3_t vec3_reflect(vec3_t incident, vec3_t normal);
vec3_t vec3_refract(vec3_t incident, vec3_t normal, float eta);

Matrix Transformations

vec3_t vec3_transform_mat3(vec3_t v, mat3_t m);
vec3_t vec3_transform_mat4(vec3_t v, mat4_t m);       // As point (w=1)
vec3_t vec3_transform_mat4_dir(vec3_t v, mat4_t m);  // As direction (w=0)
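
The difference between the point and direction variants is the implied w component. The sketch below assumes the column-vector convention (result = M * v) with the translation stored in column 3, as in OpenGL; the ref_ names are hypothetical.

```c
typedef struct { float x, y, z, _pad; } ref_vec3;  /* stand-in for vec3_t */
typedef struct { float m[16]; } ref_mat4;          /* column-major, as mat4_t */

/* Multiplies the matrix by (v.x, v.y, v.z, w) and returns the xyz part. */
static ref_vec3 ref_transform(ref_vec3 v, ref_mat4 m, float w) {
    ref_vec3 r;
    r.x = m.m[0] * v.x + m.m[4] * v.y + m.m[8]  * v.z + m.m[12] * w;
    r.y = m.m[1] * v.x + m.m[5] * v.y + m.m[9]  * v.z + m.m[13] * w;
    r.z = m.m[2] * v.x + m.m[6] * v.y + m.m[10] * v.z + m.m[14] * w;
    r._pad = 0.0f;
    return r;
}

/* Translation matrix with the offset in column 3 (m[12..14]). */
static ref_mat4 ref_translation(float tx, float ty, float tz) {
    ref_mat4 m = {{ 1, 0, 0, 0,
                    0, 1, 0, 0,
                    0, 0, 1, 0,
                    tx, ty, tz, 1 }};
    return m;
}
```

With w = 1 the translation column is added, so positions move; with w = 0 it is multiplied away, so directions and normals are only rotated and scaled.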

Vector4 Functions

Creation

vec4_t vec4_create(float x, float y, float z, float w);
vec4_t vec4_from_vec3(vec3_t v, float w);
vec4_t vec4_zero(void);
vec4_t vec4_one(void);

Basic Operations

vec4_t vec4_add(vec4_t a, vec4_t b);
vec4_t vec4_sub(vec4_t a, vec4_t b);
vec4_t vec4_mul(vec4_t a, vec4_t b);
vec4_t vec4_div(vec4_t a, vec4_t b);
vec4_t vec4_scale(vec4_t v, float s);
vec4_t vec4_negate(vec4_t v);

Vector Math

float vec4_dot(vec4_t a, vec4_t b);
float vec4_length(vec4_t v);
float vec4_length_squared(vec4_t v);
vec4_t vec4_normalize(vec4_t v);

Interpolation & Conversion

vec4_t vec4_lerp(vec4_t a, vec4_t b, float t);
vec4_t vec4_min(vec4_t a, vec4_t b);
vec4_t vec4_max(vec4_t a, vec4_t b);
vec4_t vec4_transform_mat4(vec4_t v, mat4_t m);
vec3_t vec4_to_vec3(vec4_t v);
vec3_t vec4_to_vec3_perspective(vec4_t v);  // Divides by w
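
The two vec4-to-vec3 conversions differ only in what happens to w. A minimal sketch with hypothetical ref_ names follows; the caller is assumed to guarantee w != 0 for the perspective variant, since this document does not say how the library treats that case.

```c
typedef struct { float x, y, z, _pad; } ref_vec3;  /* stand-in for vec3_t */
typedef struct { float x, y, z, w; } ref_vec4;     /* stand-in for vec4_t */

/* Drops w without using it. */
static ref_vec3 ref_drop_w(ref_vec4 v) {
    ref_vec3 r = { v.x, v.y, v.z, 0.0f };
    return r;
}

/* Homogeneous (perspective) divide: clip space to normalized device
   coordinates. Requires v.w != 0. */
static ref_vec3 ref_perspective_divide(ref_vec4 v) {
    float inv_w = 1.0f / v.w;
    ref_vec3 r = { v.x * inv_w, v.y * inv_w, v.z * inv_w, 0.0f };
    return r;
}
```

The perspective variant is the one to use on the output of a projection matrix, where w carries the depth used for foreshortening.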

Matrix 3x3 Functions

Creation

mat3_t mat3_identity(void);
mat3_t mat3_zero(void);
mat3_t mat3_from_rows(vec3_t r0, vec3_t r1, vec3_t r2);
mat3_t mat3_from_cols(vec3_t c0, vec3_t c1, vec3_t c2);
mat3_t mat3_from_mat4(mat4_t m);  // Extract upper-left 3x3

Basic Operations

mat3_t mat3_add(mat3_t a, mat3_t b);
mat3_t mat3_sub(mat3_t a, mat3_t b);
mat3_t mat3_scale(mat3_t m, float s);
mat3_t mat3_mul(mat3_t a, mat3_t b);
mat3_t mat3_transpose(mat3_t m);
float mat3_determinant(mat3_t m);
mat3_t mat3_inverse(mat3_t m);

Rotation Matrices

mat3_t mat3_rotation_x(float radians);
mat3_t mat3_rotation_y(float radians);
mat3_t mat3_rotation_z(float radians);
mat3_t mat3_rotation_axis(vec3_t axis, float radians);

2D Transformations

mat3_t mat3_translation_2d(float tx, float ty);
mat3_t mat3_scaling_2d(float sx, float sy);
mat3_t mat3_rotation_2d(float radians);

Normal Matrix

mat3_t mat3_normal_matrix(mat4_t model_matrix);  // For transforming normals
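
The normal matrix is conventionally the inverse transpose of the upper-left 3x3 block of the model matrix, which keeps normals perpendicular to surfaces under non-uniform scaling. Whether mat3_normal_matrix computes it exactly this way is an assumption. The sketch uses hypothetical ref_ names and the identity inverse-transpose(A) = cofactor(A) / det(A).

```c
typedef struct { float m[9]; } ref_mat3;   /* column-major: m[col*3 + row] */
typedef struct { float m[16]; } ref_mat4;  /* column-major: m[col*4 + row] */

/* Inverse transpose of the upper-left 3x3 of model. Returns the zero
   matrix when that block is singular (assumption for this sketch). */
static ref_mat3 ref_normal_matrix(ref_mat4 model) {
    float a[3][3];    /* a[row][col] of the upper-left block */
    float cof[3][3];  /* cofactor matrix of a */
    ref_mat3 n = {{ 0 }};

    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 3; ++r)
            a[r][c] = model.m[c * 4 + r];

    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            int r1 = (r + 1) % 3, r2 = (r + 2) % 3;
            int c1 = (c + 1) % 3, c2 = (c + 2) % 3;
            /* Cyclic indexing folds the cofactor sign into the minor. */
            cof[r][c] = a[r1][c1] * a[r2][c2] - a[r1][c2] * a[r2][c1];
        }
    }

    /* Determinant by expansion along the first row. */
    float det = a[0][0] * cof[0][0] + a[0][1] * cof[0][1] + a[0][2] * cof[0][2];
    if (det == 0.0f) {
        return n;
    }
    for (int c = 0; c < 3; ++c)
        for (int r = 0; r < 3; ++r)
            n.m[c * 3 + r] = cof[r][c] / det;
    return n;
}
```

For a model matrix that scales X by 2, the result scales normals' X by 0.5, which is why transformed normals should be renormalized afterwards.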

Matrix 4x4 Functions

Creation

mat4_t mat4_identity(void);
mat4_t mat4_zero(void);
mat4_t mat4_from_rows(vec4_t r0, vec4_t r1, vec4_t r2, vec4_t r3);
mat4_t mat4_from_cols(vec4_t c0, vec4_t c1, vec4_t c2, vec4_t c3);

Basic Operations

mat4_t mat4_add(mat4_t a, mat4_t b);
mat4_t mat4_sub(mat4_t a, mat4_t b);
mat4_t mat4_scale_scalar(mat4_t m, float s);
mat4_t mat4_mul(mat4_t a, mat4_t b);
mat4_t mat4_transpose(mat4_t m);
float mat4_determinant(mat4_t m);
mat4_t mat4_inverse(mat4_t m);

Transformation Matrices

mat4_t mat4_translation(float tx, float ty, float tz);
mat4_t mat4_translation_vec3(vec3_t t);
mat4_t mat4_scaling(float sx, float sy, float sz);
mat4_t mat4_scaling_uniform(float s);
mat4_t mat4_scaling_vec3(vec3_t s);
mat4_t mat4_rotation_x(float radians);
mat4_t mat4_rotation_y(float radians);
mat4_t mat4_rotation_z(float radians);
mat4_t mat4_rotation_axis(vec3_t axis, float radians);
mat4_t mat4_rotation_euler(float pitch, float yaw, float roll);

Compose/Decompose

mat4_t mat4_compose(vec3_t translation, quat_t rotation, vec3_t scale);
void mat4_decompose(mat4_t m, vec3_t* translation, quat_t* rotation, vec3_t* scale);
vec3_t mat4_get_translation(mat4_t m);
vec3_t mat4_get_scale(mat4_t m);

Camera & Projection Functions

View Matrices

mat4_t mat4_look_at(vec3_t eye, vec3_t target, vec3_t up);
mat4_t mat4_look_at_lh(vec3_t eye, vec3_t target, vec3_t up);  // Left-handed
mat4_t mat4_look_at_rh(vec3_t eye, vec3_t target, vec3_t up);  // Right-handed

Example:

vec3_t eye = vec3_create(0.0f, 10.0f, 20.0f);
vec3_t target = vec3_zero();
vec3_t up = vec3_up();
mat4_t view = mat4_look_at(eye, target, up);

Projection Matrices

mat4_t mat4_perspective(float fov_radians, float aspect, 
                        float near_plane, float far_plane);
mat4_t mat4_perspective_fov(float fov_radians, float width, float height,
                            float near_plane, float far_plane);
mat4_t mat4_ortho(float left, float right, float bottom, float top,
                  float near_plane, float far_plane);
mat4_t mat4_ortho_2d(float left, float right, float bottom, float top);

Example:

float fov = 45.0f * NEON_DEG_TO_RAD;
float aspect = 640.0f / 480.0f;
mat4_t proj = mat4_perspective(fov, aspect, 0.1f, 100.0f);

Quaternion Functions

Creation

quat_t quat_identity(void);  // (0, 0, 0, 1)
quat_t quat_create(float x, float y, float z, float w);
quat_t quat_from_axis_angle(vec3_t axis, float radians);
quat_t quat_from_euler(float pitch, float yaw, float roll);
quat_t quat_from_mat3(mat3_t m);
quat_t quat_from_mat4(mat4_t m);

Basic Operations

quat_t quat_add(quat_t a, quat_t b);
quat_t quat_sub(quat_t a, quat_t b);
quat_t quat_mul(quat_t a, quat_t b);  // Hamilton product
quat_t quat_scale(quat_t q, float s);
quat_t quat_conjugate(quat_t q);
quat_t quat_inverse(quat_t q);
float quat_length(quat_t q);
quat_t quat_normalize(quat_t q);
float quat_dot(quat_t a, quat_t b);

Interpolation

quat_t quat_lerp(quat_t a, quat_t b, float t);   // Linear interpolation
quat_t quat_slerp(quat_t a, quat_t b, float t);  // Spherical interpolation

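Note: Prefer quat_slerp for rotation animation. It interpolates at a constant angular rate along the arc between the two orientations, whereas quat_lerp moves along the straight chord between them, so the apparent rotation speed varies over the interval.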

Conversion

mat3_t quat_to_mat3(quat_t q);
mat4_t quat_to_mat4(quat_t q);
void quat_to_axis_angle(quat_t q, vec3_t* axis, float* angle);
vec3_t quat_to_euler(quat_t q);

Rotation

vec3_t quat_rotate_vec3(quat_t q, vec3_t v);
quat_t quat_look_rotation(vec3_t forward, vec3_t up);
float quat_angle(quat_t a, quat_t b);

Utility Functions

Angle Conversions

float deg_to_rad(float degrees);
float rad_to_deg(float radians);

Common Math

float clampf(float value, float min_val, float max_val);
float lerpf(float a, float b, float t);
float smoothstep(float edge0, float edge1, float x);
float inversesqrt(float x);
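
These helpers share names and argument order with the GLSL built-ins clamp, mix, and smoothstep. Assuming they follow the same definitions, the math is as sketched below; the ref_ names are hypothetical.

```c
/* Limits value to the range [min_val, max_val]. */
static float ref_clampf(float value, float min_val, float max_val) {
    return value < min_val ? min_val : (value > max_val ? max_val : value);
}

/* Linear blend: a at t = 0, b at t = 1. */
static float ref_lerpf(float a, float b, float t) {
    return a + (b - a) * t;
}

/* Hermite step: 0 below edge0, 1 above edge1, smooth in between.
   Requires edge0 != edge1. */
static float ref_smoothstep(float edge0, float edge1, float x) {
    float t = ref_clampf((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}
```

Unlike lerpf, smoothstep has zero slope at both edges, which avoids visible starts and stops in animation and fades.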

Fast Approximations

float fast_sin(float x);
float fast_cos(float x);
float fast_tan(float x);
float fast_inversesqrt(float x);

Note: Fast functions sacrifice some accuracy for performance.
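
This document does not say how the fast functions are implemented. Purely to illustrate the accuracy trade-off, the sketch below shows one well-known technique for an approximate inverse square root: a bit-level initial guess refined by a single Newton-Raphson step. It is not the library's code, and ref_fast_inversesqrt is a hypothetical name. NEON also provides estimate-and-refine intrinsics for the same purpose (vrsqrteq_f32 and vrsqrtsq_f32).

```c
#include <stdint.h>
#include <string.h>

/* Approximates 1/sqrt(x) for positive, finite x. */
static float ref_fast_inversesqrt(float x) {
    uint32_t bits;
    float y;

    /* Reinterpret the float's bits; memcpy avoids aliasing problems. */
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);   /* initial guess from the exponent */
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step: y = y * (1.5 - 0.5 * x * y * y). */
    return y * (1.5f - 0.5f * x * y * y);
}
```

For x = 4 the result is close to, but not exactly, 0.5; a second refinement step would reduce the error further at extra cost.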

Complete Example

3D Transformation Pipeline

#include "vec.h"
#include "vpu.h"
#include "platform.h"

void render_scene(struct SPPlatform* platform) {
    // Camera setup
    vec3_t eye = vec3_create(0.0f, 5.0f, 10.0f);
    vec3_t target = vec3_zero();
    vec3_t up = vec3_up();
    mat4_t view = mat4_look_at(eye, target, up);
    
    // Projection setup
    float fov = 45.0f * NEON_DEG_TO_RAD;
    float aspect = 640.0f / 480.0f;
    mat4_t proj = mat4_perspective(fov, aspect, 0.1f, 100.0f);
    
    // Object transformation
    vec3_t position = vec3_create(2.0f, 0.0f, 0.0f);
    vec3_t scale = vec3_one();
    
    // Rotation using quaternion
    vec3_t axis = vec3_up();
    float angle = 30.0f * NEON_DEG_TO_RAD;
    quat_t rotation = quat_from_axis_angle(axis, angle);
    
    // Compose model matrix
    mat4_t model = mat4_compose(position, rotation, scale);
    
    // Create MVP matrix
    mat4_t mv = mat4_mul(view, model);
    mat4_t mvp = mat4_mul(proj, mv);
    
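    // Transform a vertex to clip space, then apply the perspective divide
    vec3_t vertex = vec3_create(1.0f, 0.0f, 0.0f);
    vec4_t clip = vec4_transform_mat4(vec4_from_vec3(vertex, 1.0f), mvp);
    vec3_t transformed = vec4_to_vec3_perspective(clip);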
    
    // Transform a normal (use normal matrix)
    mat3_t normal_mat = mat3_normal_matrix(model);
    vec3_t normal = vec3_up();
    vec3_t transformed_normal = vec3_normalize(
        vec3_transform_mat3(normal, normal_mat)
    );
}

Quaternion Animation

#include "vec.h"

void update_rotation(float t) {
    // Start and end orientations
    quat_t start = quat_identity();
    vec3_t axis = vec3_up();
    quat_t end = quat_from_axis_angle(axis, NEON_PI);
    
    // Smooth interpolation
    quat_t current = quat_slerp(start, end, t);
    
    // Apply to object
    vec3_t point = vec3_forward();
    vec3_t rotated = quat_rotate_vec3(current, point);
}

Vector Math Example

#include "vec.h"

float calculate_lighting(vec3_t light_pos, vec3_t surface_pos,
                         vec3_t surface_normal) {
    // Light direction (surface_normal is assumed to be unit length)
    vec3_t light_dir = vec3_normalize(
        vec3_sub(light_pos, surface_pos)
    );

    // Diffuse lighting
    float diffuse = vec3_dot(surface_normal, light_dir);
    diffuse = clampf(diffuse, 0.0f, 1.0f);

    // Reflection direction; this would feed a specular term,
    // which this example does not compute
    vec3_t reflected = vec3_reflect(
        vec3_negate(light_dir),
        surface_normal
    );
    (void)reflected;

    // Distance attenuation
    float dist = vec3_distance(light_pos, surface_pos);
    float attenuation = 1.0f / (1.0f + 0.1f * dist + 0.01f * dist * dist);

    // Return the result so the caller can use it
    return diffuse * attenuation;
}

Compilation Notes

Enabling NEON Optimization

To enable NEON SIMD optimizations, compile with:

arm-none-linux-gnueabihf-gcc -mfpu=neon -mfloat-abi=hard -O3 your_code.c

Portability

The library automatically detects NEON support at compile time. If NEON is not available, it falls back to standard C implementations, ensuring your code works across different platforms.
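
Compile-time detection of this kind is usually done with the feature macro that ARM compilers define when NEON is enabled. The sketch below assumes that pattern; the library's actual macro names and internals are not shown in this document, and ref_add_f32 and REF_HAVE_NEON are hypothetical. It also illustrates batch processing of four floats per iteration.

```c
#include <stddef.h>

/* __ARM_NEON is the ACLE feature macro; older GCC also sets __ARM_NEON__. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define REF_HAVE_NEON 1
#else
#define REF_HAVE_NEON 0
#endif

/* Adds two float arrays element-wise; n is the element count. */
static void ref_add_f32(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#if REF_HAVE_NEON
    /* Four floats per iteration in one 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar path: the whole array without NEON, only the tail with it. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The same source compiles on any platform; only the inner loop changes, so callers never need their own #if blocks.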

Performance Tips

Batch operations: Process multiple vectors/matrices in loops for better SIMD utilization
Alignment: Data is automatically aligned for optimal NEON performance
Fast math: Use fast approximation functions when high precision isn't critical
Minimize conversions: Keep data in NEON-friendly formats (vec3_t, vec4_t) throughout pipelines

Matrix Convention

Column-Major Order

All matrices use column-major ordering, compatible with OpenGL, Vulkan, and Metal:

mat4_t m;
// m.m[0..3]   = column 0
// m.m[4..7]   = column 1
// m.m[8..11]  = column 2
// m.m[12..15] = column 3

// Element at column c, row r:
float element = m.m[c * 4 + r];
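
The indexing rule can be wrapped in small accessors, sketched here with hypothetical ref_ names. The transpose shows the rule in action: swapping column and row swaps the two index terms.

```c
typedef struct { float m[16]; } ref_mat4;  /* column-major, as mat4_t */

/* Reads the element at column col, row row. */
static float ref_mat4_get(const ref_mat4 *m, int col, int row) {
    return m->m[col * 4 + row];
}

/* Writes the element at column col, row row. */
static void ref_mat4_set(ref_mat4 *m, int col, int row, float value) {
    m->m[col * 4 + row] = value;
}

/* Transpose: element (col, row) moves to (row, col). */
static ref_mat4 ref_mat4_transpose(ref_mat4 m) {
    ref_mat4 t;
    for (int c = 0; c < 4; ++c) {
        for (int r = 0; r < 4; ++r) {
            t.m[r * 4 + c] = m.m[c * 4 + r];
        }
    }
    return t;
}
```

Setting column 3, row 0 writes m[12], which is the X translation slot in the OpenGL layout.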

Coordinate System

The library uses a right-handed coordinate system by default:

+X = right
+Y = up
+Z = toward the viewer (out of the screen)

