Unambiguous types

Most of these mappings are obvious, but there are some nuances and gotchas with Rust FFI (Foreign Function Interface).

This document defines clear, one-to-one mappings between primitive types in C and Rust (and possibly other languages in the future). Its purpose is to eliminate ambiguity in type widths, signedness, and binary representation across platforms and languages.

For Git, the only header required to use these unambiguous types in C is git-compat-util.h.

Boolean types

C Type	Rust Type
bool¹	bool

Integer types

In C, <stdint.h> (or an equivalent) must be included.

C Type	Rust Type
uint8_t	u8
uint16_t	u16
uint32_t	u32
uint64_t	u64
int8_t	i8
int16_t	i16
int32_t	i32
int64_t	i64
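
As a minimal sketch of how these fixed-width mappings are used, the struct below mirrors a C struct declared with <stdint.h> types. The names ObjectHeader, kind, flags and size are hypothetical and do not come from Git.

```rust
use std::mem::size_of;

/// Hypothetical Rust mirror of a C struct declared as:
///
///     struct object_header {
///         uint8_t  kind;
///         uint16_t flags;
///         uint64_t size;
///     };
///
/// `#[repr(C)]` asks Rust to lay the fields out the way a C compiler
/// for the same target would.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ObjectHeader {
    pub kind: u8,   // uint8_t
    pub flags: u16, // uint16_t
    pub size: u64,  // uint64_t
}

/// Widths in bytes of the three field types. These are the same on
/// every target, which is the point of using fixed-width types.
pub fn field_widths() -> (usize, usize, usize) {
    (size_of::<u8>(), size_of::<u16>(), size_of::<u64>())
}
```

The total size of the struct still depends on the target's alignment rules for padding, but the width and signedness of each field do not.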

Floating-point types

Rust requires IEEE-754 semantics. In C, that is typically true, but not guaranteed by the standard.

C Type	Rust Type
float²	f32
double²	f64
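
The following sketch shows the IEEE-754 bit patterns that Rust uses for f32 and f64. A C side has to agree on these for the mapping above to hold. The helper names f32_bits and f64_bits are hypothetical.

```rust
/// Raw IEEE-754 binary32 bit pattern of an `f32`.
pub fn f32_bits(x: f32) -> u32 {
    x.to_bits()
}

/// Raw IEEE-754 binary64 bit pattern of an `f64`.
pub fn f64_bits(x: f64) -> u64 {
    x.to_bits()
}
```

For example, 1.0 is 0x3F800000 as an f32 and 0x3FF0000000000000 as an f64.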

Size types

These types represent pointer-sized integers and are typically defined in <stddef.h> or an equivalent header.

Size types should be used any time pointer arithmetic is performed, e.g. when indexing an array or describing the number of elements in memory.

C Type	Rust Type
size_t³	usize
ptrdiff_t³	isize
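
A minimal sketch of a pointer-plus-length interface using these types follows. The function count_newlines is hypothetical and not part of Git. A version actually exported to C would also need a no_mangle attribute, which is omitted here because its spelling depends on the Rust edition.

```rust
use std::slice;

/// Hypothetical function that C would declare as:
///
///     size_t count_newlines(const uint8_t *buf, size_t len);
///
/// # Safety
/// `buf` must point to at least `len` readable bytes, unless `len` is 0.
pub unsafe extern "C" fn count_newlines(buf: *const u8, len: usize) -> usize {
    if buf.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller guarantees `buf` points to `len` readable bytes.
    let bytes = unsafe { slice::from_raw_parts(buf, len) };
    // The element count is again a usize, matching size_t on the C side.
    bytes.iter().filter(|&&b| b == b'\n').count()
}
```

Both the length argument and the returned count use usize, so neither side has to guess how wide a "length" is.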

Character types

This is where C and Rust don’t have a clean one-to-one mapping.

A C char and a Rust u8 share the same bit width, so any C struct containing a char will have the same size as the corresponding Rust struct using u8. In that sense, such structs are safe to pass over the FFI boundary, because their fields will be laid out identically. However, beyond bit width, C char has additional semantics and platform-dependent behavior that can cause problems, as discussed below.

The C language leaves the signedness of char implementation-defined. Because our developer build enables -Wsign-compare, comparing a value of type char with either a signed or an unsigned integer may trigger compiler warnings.

Note: Rust’s char type is not a C-style character. It is a 32-bit Unicode scalar value, so it must not be used as the counterpart of C char.
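
A small sketch of why Rust’s char cannot stand in for C char: it is four bytes wide, and not every 32-bit value is a valid char. The helper name is_valid_scalar is hypothetical.

```rust
use std::mem::size_of;

/// Size in bytes of Rust's `char` and of `u8`, the type with the same
/// width as C `char`.
pub fn char_and_byte_sizes() -> (usize, usize) {
    (size_of::<char>(), size_of::<u8>())
}

/// Returns true if `v` is a valid Unicode scalar value, i.e. a value
/// that a Rust `char` is allowed to hold.
pub fn is_valid_scalar(v: u32) -> bool {
    char::from_u32(v).is_some()
}
```

Surrogate code points such as 0xD800 and values above 0x10FFFF are rejected, which is a restriction C char does not have.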

Notes

¹ This is only true if stdbool.h (or equivalent) is used.
² C does not enforce IEEE-754 compatibility, but Rust expects it. If the platform/arch for C does not follow IEEE-754 then this equivalence does not hold. Also, it’s assumed that float is 32 bits and double is 64, but there may be a strange platform/arch where even this isn’t true.
³ C also defines uintptr_t, ssize_t and intptr_t, but these types are discouraged for FFI purposes. For functions like read() and write() ssize_t should be cast to a different, and unambiguous, type before being passed over the FFI boundary.
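
As a sketch of the advice in note ³, assume a hypothetical C wrapper that casts the ssize_t result of read() to int64_t before returning it. The Rust side can then turn that value into a length or an error. The name bytes_read is an assumption made for illustration.

```rust
use std::convert::TryFrom;

/// Interprets the `int64_t` return value of a hypothetical C wrapper
/// around read().
///
/// A non-negative value that fits in `usize` is a byte count. Anything
/// else (a negative error indicator, or a count too large for this
/// target's `usize`) is handed back unchanged as the error value.
pub fn bytes_read(ret: i64) -> Result<usize, i64> {
    usize::try_from(ret).map_err(|_| ret)
}
```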

Problems with std::ffi::c_* types in Rust

TL;DR: In practice, Rust’s c_* types aren’t guaranteed to match C types for all possible C compilers, platforms, or architectures, because Rust only ensures correctness of C types on officially supported targets. These definitions have changed over time to match more targets, which means that the c_* definitions will differ depending on which Rust version Git chooses to use.

Current list of safe Rust-side FFI types in Git:

c_void
CStr
CString

Even then, they should be used sparingly, and only where the semantics match exactly.
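
A minimal sketch of using CString and CStr for NUL-terminated strings, without touching c_char directly. The helper names to_c_string and from_c_bytes are hypothetical.

```rust
use std::ffi::{CStr, CString};

/// Builds an owned, NUL-terminated string suitable for handing to C.
/// Returns None if `s` contains an interior NUL byte.
pub fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Borrows a UTF-8 string from a byte buffer that must end in exactly
/// one NUL byte. Returns None if the terminator is missing, if there is
/// an interior NUL, or if the contents are not valid UTF-8.
pub fn from_c_bytes(bytes: &[u8]) -> Option<&str> {
    CStr::from_bytes_with_nul(bytes).ok()?.to_str().ok()
}
```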

The std::os::raw::c_* types directly inherit the problems of core::ffi, whose definitions change over time and appear to be a best guess at the correct definition for a given platform/target. This is probably not a problem for the platforms that Rust currently supports, but can anyone say that Rust got it right for all C compilers on all platforms/targets?

To give an example, c_long is defined as follows in two Rust versions [1] [2]:

Rust version 1.63.0

mod c_long_definition {
    cfg_if! {
        if #[cfg(all(target_pointer_width = "64", not(windows)))] {
            pub type c_long = i64;
            pub type NonZero_c_long = crate::num::NonZeroI64;
            pub type c_ulong = u64;
            pub type NonZero_c_ulong = crate::num::NonZeroU64;
        } else {
            // The minimal size of `long` in the C standard is 32 bits
            pub type c_long = i32;
            pub type NonZero_c_long = crate::num::NonZeroI32;
            pub type c_ulong = u32;
            pub type NonZero_c_ulong = crate::num::NonZeroU32;
        }
    }
}

Rust version 1.89.0

mod c_long_definition {
    crate::cfg_select! {
        any(
            all(target_pointer_width = "64", not(windows)),
            // wasm32 Linux ABI uses 64-bit long
            all(target_arch = "wasm32", target_os = "linux")
        ) => {
            pub(super) type c_long = i64;
            pub(super) type c_ulong = u64;
        }
        _ => {
            // The minimal size of `long` in the C standard is 32 bits
            pub(super) type c_long = i32;
            pub(super) type c_ulong = u32;
        }
    }
}

Even for the cases where C types are correctly mapped to Rust types via std::ffi::c_*, there are still problems. Take c_char, for example: on some platforms it is u8, on others it is i8.

Subtraction underflow in debug mode

The following code will panic in debug builds on platforms that define c_char as u8, but won’t where it is i8.

let mut x: std::ffi::c_char = 0;
x -= 1;
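
The difference can be reproduced on any target by spelling out the two possible underlying types. This sketch uses checked_sub, which returns None exactly where the plain subtraction would overflow and therefore panic in a debug build. The helper names are hypothetical.

```rust
/// What `x - 1` does on a target where c_char is u8:
/// None means the plain subtraction would overflow.
pub fn dec_unsigned(x: u8) -> Option<u8> {
    x.checked_sub(1)
}

/// What `x - 1` does on a target where c_char is i8.
pub fn dec_signed(x: i8) -> Option<i8> {
    x.checked_sub(1)
}
```

Starting from 0, the unsigned variant overflows while the signed variant simply yields -1.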

Inconsistent shift behavior

x will be 0xC0 for platforms that use i8, but will be 0x40 where it’s u8.

// Cast from u8 so that the literal is also accepted where c_char is i8.
let mut x = 0x80u8 as std::ffi::c_char;
x >>= 1;
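
The two outcomes can be made explicit by shifting the same bit pattern once as u8 and once as i8. This is a sketch with hypothetical helper names. The signed version performs an arithmetic shift, which copies the sign bit.

```rust
/// Logical shift: what `x >> 1` does where c_char is u8.
pub fn shr_unsigned(bits: u8) -> u8 {
    bits >> 1
}

/// Arithmetic shift: what `x >> 1` does where c_char is i8.
/// The input and result are reinterpreted as u8 so that the bit
/// patterns can be compared directly.
pub fn shr_signed(bits: u8) -> u8 {
    ((bits as i8) >> 1) as u8
}
```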

Equality fails to compile on some platforms

The following will not compile on platforms that define c_char as i8, but will where it is u8. Casting x, e.g. assert_eq!(x as u8, b'a');, makes it compile everywhere, but then you get a warning on platforms that use u8 and a clean compilation where i8 is used.

let mut x: std::ffi::c_char = 0x61;
assert_eq!(x, b'a');
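
If c_char cannot be avoided at some boundary, one possible way to get a comparison that compiles identically on both kinds of target is to reinterpret the value through its byte representation instead of casting. This is only a sketch, and the helper names as_byte and is_lower_a are hypothetical. Using u8 in the interface to begin with avoids the problem entirely.

```rust
use std::ffi::c_char;

/// Reinterprets a c_char as a u8 without an `as` cast. Both i8 and u8
/// provide to_ne_bytes(), which returns a one-element byte array.
pub fn as_byte(c: c_char) -> u8 {
    c.to_ne_bytes()[0]
}

/// Compares against a byte literal regardless of c_char's signedness.
pub fn is_lower_a(c: c_char) -> bool {
    as_byte(c) == b'a'
}
```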

Enum types

Rust enum types should not be used as FFI types. Rust enum types are more like C union types than C enums. For something like:

#[repr(u8)]
enum Fruit {
    Apple,
    Banana,
    Cherry,
}

it’s easy enough to make sure the Rust enum matches what C would expect. But consider a more complex type like:

enum HashResult {
    SHA1([u8; 20]),
    SHA256([u8; 32]),
}

Here the Rust compiler has to add a discriminant to the enum to distinguish between the variants. The width, location, and values of that discriminant are up to the Rust compiler and are not ABI stable.
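
One possible alternative, sketched here under the assumption that both sides agree on the tag values, is to pass a plain struct with an explicit tag and a fixed-size buffer, and to convert to and from the Rust enum at the boundary. The names HashResultFfi, HASH_ALGO_SHA1, HASH_ALGO_SHA256, to_ffi and from_ffi are hypothetical and not part of Git.

```rust
/// Hypothetical tag values agreed upon by the C and Rust sides.
pub const HASH_ALGO_SHA1: u8 = 1;
pub const HASH_ALGO_SHA256: u8 = 2;

/// Rust-only representation; never crosses the FFI boundary.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HashResult {
    SHA1([u8; 20]),
    SHA256([u8; 32]),
}

/// FFI representation: an explicit tag plus a buffer large enough for
/// the longest digest. Unused trailing bytes are zero.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HashResultFfi {
    pub algo: u8,
    pub bytes: [u8; 32],
}

/// Converts the Rust enum into the FFI struct.
pub fn to_ffi(hash: &HashResult) -> HashResultFfi {
    let mut bytes = [0u8; 32];
    let algo = match hash {
        HashResult::SHA1(d) => {
            bytes[..20].copy_from_slice(d);
            HASH_ALGO_SHA1
        }
        HashResult::SHA256(d) => {
            bytes.copy_from_slice(d);
            HASH_ALGO_SHA256
        }
    };
    HashResultFfi { algo, bytes }
}

/// Converts the FFI struct back, rejecting unknown tag values.
pub fn from_ffi(raw: &HashResultFfi) -> Option<HashResult> {
    match raw.algo {
        HASH_ALGO_SHA1 => {
            let mut d = [0u8; 20];
            d.copy_from_slice(&raw.bytes[..20]);
            Some(HashResult::SHA1(d))
        }
        HASH_ALGO_SHA256 => Some(HashResult::SHA256(raw.bytes)),
        _ => None,
    }
}
```

Because every field is a u8 or an array of u8, the struct has no padding and its layout is the same on every target.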

1. c_long in 1.63.0

2. c_long in 1.89.0