
How to copy byte by byte in C?

Published in Memory Manipulation 8 mins read

In C, copying data byte by byte is a fundamental memory operation that gives you precise control over how data is moved. You can write a manual byte-by-byte loop, but the library functions memcpy() and memmove() (standard C) and memccpy() (POSIX, added to the C standard in C23) are generally preferred because they are well tested and usually heavily optimized.

Understanding Byte-by-Byte Copying in C

Copying byte by byte means treating memory as a sequence of individual bytes, regardless of the data types they represent. This is crucial for tasks such as:

  • Transferring raw data between buffers.
  • Serializing or deserializing data structures.
  • Implementing custom memory allocators or management schemes.
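
As a small illustration of the serialization case, the sketch below packs the fields of a struct into a byte buffer with memcpy() and reads them back. The struct sensor_reading and the functions pack_reading() and unpack_reading() are hypothetical names made up for this example. The bytes are written in host byte order, so this is only meant for round trips on the same machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record used only for this illustration. */
struct sensor_reading {
    uint16_t id;
    int32_t  value;
};

/* Size of the packed form: the fields only, without struct padding. */
#define READING_PACKED_SIZE (sizeof(uint16_t) + sizeof(int32_t))

/* Pack the fields back to back into out (host byte order).
   out must have room for READING_PACKED_SIZE bytes. */
size_t pack_reading(unsigned char *out, const struct sensor_reading *r) {
    assert(out != NULL && r != NULL);
    size_t off = 0;
    memcpy(out + off, &r->id, sizeof r->id);
    off += sizeof r->id;
    memcpy(out + off, &r->value, sizeof r->value);
    off += sizeof r->value;
    return off; /* number of bytes written */
}

/* Reverse of pack_reading(): rebuild the struct from the byte buffer. */
void unpack_reading(struct sensor_reading *r, const unsigned char *in) {
    assert(r != NULL && in != NULL);
    size_t off = 0;
    memcpy(&r->id, in + off, sizeof r->id);
    off += sizeof r->id;
    memcpy(&r->value, in + off, sizeof r->value);
}
```

Copying field by field rather than copying the whole struct keeps compiler-inserted padding bytes out of the buffer.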

Core C Library Functions for Byte Copying

The <string.h> header provides several powerful functions designed for byte-level memory operations.

1. memcpy(): Copying Non-Overlapping Memory

The memcpy() function is the most common and often the fastest way to copy a specified number of bytes from one memory location to another. It assumes that the source and destination memory regions do not overlap. If they do, the behavior is undefined and can lead to data corruption.

  • Syntax:

    void *memcpy(void *restrict dest, const void *restrict src, size_t n);
    • dest: Pointer to the destination memory area.
    • src: Pointer to the source memory area.
    • n: The number of bytes to copy.
  • Example:

    #include <stdio.h>
    #include <string.h>
    
    int main() {
        char source[] = "Hello, C bytes!";
        char destination[20]; // Ensure destination is large enough
    
        // Copy 10 bytes from source to destination
        memcpy(destination, source, 10);
        destination[10] = '\0'; // Null-terminate for printing as a string
    
        printf("Source: %s\n", source);
        printf("Destination (10 bytes): %s\n", destination); // Output: Hello, C b
    
        int numbers[] = {10, 20, 30, 40, 50};
        int copied_numbers[3];
        memcpy(copied_numbers, numbers, 3 * sizeof(int)); // Copy 3 integers
    
        printf("Copied numbers: %d, %d, %d\n", copied_numbers[0], copied_numbers[1], copied_numbers[2]);
    
        return 0;
    }
  • Practical Insight: memcpy() is highly optimized for performance and should be your go-to choice when you are certain that the source and destination memory blocks do not overlap.
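
Because memcpy() works on raw bytes, it can also be used to look at the byte pattern of one type through another type of the same size. The following is a minimal sketch; float_bits() is a hypothetical helper, and the example assumes a 32-bit float (checked at compile time).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the raw bits of a float as a 32-bit integer. */
uint32_t float_bits(float f) {
    /* Assumption: float is exactly 32 bits wide on this platform. */
    static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits wide");
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits); /* copy the object representation byte by byte */
    return bits;
}
```

On platforms that use IEEE 754 single precision, float_bits(1.0f) yields 0x3F800000. Copying through memcpy() avoids the aliasing problems of casting a float * to a uint32_t *.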

2. memmove(): Copying Overlapping Memory Safely

Unlike memcpy(), the memmove() function is designed to handle cases where the source and destination memory regions might overlap. It behaves as if the source bytes were first copied to a temporary buffer and then written to the destination, so the destination always receives the original source data. In practice, implementations typically copy backward (from the last byte to the first) when the destination starts at a higher address inside the source region, and forward otherwise.

  • Syntax:

    void *memmove(void *dest, const void *src, size_t n);
    • The parameters have the same meaning as in memcpy(). The pointers are not restrict-qualified because the regions are allowed to overlap.
  • Example:

    #include <stdio.h>
    #include <string.h>
    
    int main() {
        char buffer[] = "0123456789";
    
        printf("Original buffer: %s\n", buffer); // Output: 0123456789
    
        // Overlapping copy: move "3456789" (7 bytes) so that it starts at buffer[1].
        // Source starts at buffer + 3, destination starts at buffer + 1.
        memmove(buffer + 1, buffer + 3, 7);
    
        // Bytes 8 and 9 lie outside the destination range and keep their old values.
        printf("Buffer after memmove: %s\n", buffer); // Output: 0345678989
    
        char data[] = "ABCDEFGHIJ";
        printf("Original data: %s\n", data); // ABCDEFGHIJ
    
        // Move "DEFG" (4 bytes) from data + 3 to data + 1.
        // The source (data[3..6]) and destination (data[1..4]) ranges overlap.
        memmove(data + 1, data + 3, 4); // data is now "ADEFGFGHIJ"
        data[1 + 4] = '\0'; // Truncate right after the moved bytes
        printf("Data after memmove (overlap): %s\n", data); // Output: ADEFG
    
        return 0;
    }
  • Practical Insight: Always use memmove() when there's a possibility of memory regions overlapping to prevent unpredictable behavior and ensure data integrity.
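
To make the direction choice concrete, here is a simplified sketch of an overlap-safe copy. The name sketch_memmove() is hypothetical, and this is not how any particular C library implements memmove(). It assumes uintptr_t is available and that comparing the converted addresses reflects the memory layout, which holds on common flat-memory platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative overlap-safe copy; real implementations are far more optimized. */
void *sketch_memmove(void *dest, const void *src, size_t n) {
    assert(n == 0 || (dest != NULL && src != NULL));
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Destination starts lower: copying forward never overwrites
           source bytes that have not been read yet. */
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];
        }
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* Destination starts higher: copy from the last byte down so the
           tail of the source is read before it is overwritten. */
        for (size_t i = n; i > 0; i--) {
            d[i - 1] = s[i - 1];
        }
    }
    /* If d == s there is nothing to do. */
    return dest;
}
```

With char a[] = "ABCDEFGHIJ", calling sketch_memmove(a + 3, a + 1, 4) takes the backward branch and leaves "ABCBCDEHIJ" in the array, whereas a naive forward loop would have produced "ABCBCBCHIJ".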

3. memccpy(): Conditional Byte Copying with a Character Stop

The memccpy() function provides a more specialized way to copy bytes. It copies bytes from a source memory area to a destination memory area, but with an added condition: it stops copying after the first occurrence of a specified byte c (converted to an unsigned char) is copied, or after a maximum of n bytes have been copied, whichever comes first.

  • Syntax:

    void *memccpy(void *restrict dest, const void *restrict src, int c, size_t n);
    • dest: Pointer to the destination memory area.
    • src: Pointer to the source memory area.
    • c: The byte to search for (as an int, but treated as unsigned char).
    • n: The maximum number of bytes to copy.
  • Return Value: memccpy() returns a pointer to the byte after the copied byte c in the destination, or NULL if c was not found within the first n bytes.

  • Example:

    #include <stdio.h>
    #include <string.h> // For memccpy (POSIX; part of standard C only since C23)
    
    int main() {
        char source[] = "This is a test string for memccpy.";
        char destination[50];
        void *end_ptr;
    
        // Copy through the first 't' (inclusive), or 20 bytes if no 't' is found first
        end_ptr = memccpy(destination, source, 't', 20);
    
        if (end_ptr != NULL) {
            // end_ptr points just past the copied 't' in destination
            *(char*)end_ptr = '\0'; // Null-terminate the copied part
            printf("Copied (up to 't' or 20 bytes): %s\n", destination); // Output: This is a t
            printf("Byte 't' found and copied.\n");
        } else {
            // No 't' in the first 20 bytes, so exactly 20 bytes were copied.
            // (Not reached with this source string; shown for completeness.)
            destination[20] = '\0';
            printf("Copied (20 bytes, 't' not found within limit): %s\n", destination);
        }
    
        char long_source[] = "abcdefghijklmnopqrstuvwxyz";
        char short_dest[10];
        void *no_char_found_ptr = memccpy(short_dest, long_source, 'Z', 5); // 'Z' not in first 5 bytes
        if (no_char_found_ptr == NULL) {
            short_dest[5] = '\0'; // Manually null-terminate after copying 5 bytes
            printf("Copied (5 bytes, 'Z' not found): %s\n", short_dest); // Output: abcde
        }
    
        return 0;
    }
  • Practical Insight: memccpy() is useful when you need to copy a portion of memory until a specific delimiter byte is encountered, or a maximum length is reached, such as parsing data streams or copying sub-strings without explicit length calculations if a delimiter exists.
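
As one possible use of the delimiter behavior, the sketch below extracts the key from a "key=value" line. copy_key() is a hypothetical helper written for this example. The _XOPEN_SOURCE definition is there because memccpy() may not be declared in strict pre-C23 build modes; default compiler settings on Linux and macOS usually declare it anyway.

```c
#define _XOPEN_SOURCE 700 /* request POSIX declarations such as memccpy() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the text before '=' from line into key.
   Returns the key length, or -1 if no '=' is found or the key does not fit.
   key and line must not overlap. */
int copy_key(char *key, size_t key_size, const char *line) {
    assert(key != NULL && line != NULL && key_size > 0);

    /* Never read past the end of line or write past the end of key. */
    size_t limit = strlen(line);
    if (limit > key_size) {
        limit = key_size;
    }

    char *end = memccpy(key, line, '=', limit);
    if (end == NULL) {
        key[0] = '\0'; /* leave a valid empty string on failure */
        return -1;
    }
    end[-1] = '\0'; /* replace the copied '=' with a terminator */
    return (int)(end - key - 1);
}
```

For the input "mode=fast" and a 16-byte buffer, copy_key() stores "mode" and returns 4. Because the copied delimiter is overwritten with the terminator, the key needs no separate length calculation.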

Manual Byte-by-Byte Copying with a Loop

You can also copy byte by byte with a simple for loop. For large blocks this is typically slower than the library functions (although optimizing compilers sometimes recognize the pattern and substitute a memcpy() call), but it gives you the most granular control and is valuable for understanding the underlying mechanism or for specific scenarios where the library functions do not fit.

  • Approach:

    1. Cast both source and destination pointers to char * or unsigned char *. This allows you to treat them as arrays of individual bytes.
    2. Iterate n times, copying one byte at a time.
  • Example:

    #include <stdio.h>
    #include <string.h> // Only for comparison, not strictly needed for the loop itself
    
    void custom_byte_copy(void *dest, const void *src, size_t n) {
        char *d = (char *)dest;
        const char *s = (const char *)src;
        for (size_t i = 0; i < n; i++) {
            d[i] = s[i];
        }
    }
    
    int main() {
        char source_data[] = "Manual copy example";
        char destination_buffer[30];
    
        custom_byte_copy(destination_buffer, source_data, strlen(source_data) + 1); // Copy including null terminator
    
        printf("Source: %s\n", source_data);
        printf("Destination (manual): %s\n", destination_buffer);
    
        // Example with integer array
        int arr_src[] = {1, 2, 3, 4, 5};
        int arr_dest[5];
        custom_byte_copy(arr_dest, arr_src, sizeof(arr_src)); // Copy all bytes of the array
    
        printf("Manual copied integers: %d, %d, %d, %d, %d\n", arr_dest[0], arr_dest[1], arr_dest[2], arr_dest[3], arr_dest[4]);
    
        return 0;
    }
  • Practical Insight: Manual loops are good for educational purposes, debugging, or when you need to perform additional operations on each byte during the copy process (e.g., encryption, checksum calculation, byte swapping). For raw, fast copying, stick to memcpy() or memmove().
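
As an example of doing extra per-byte work during the copy, the sketch below computes a simple additive checksum while copying. copy_with_checksum() is a hypothetical helper, and the modulo-256 sum is chosen only for simplicity; it is not a robust integrity check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n bytes from src to dest and return the sum of
   the copied bytes modulo 256. The regions must not overlap. */
uint8_t copy_with_checksum(void *dest, const void *src, size_t n) {
    assert(n == 0 || (dest != NULL && src != NULL));
    unsigned char *d = dest;
    const unsigned char *s = src;
    uint8_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        d[i] = s[i];                 /* the copy itself */
        sum = (uint8_t)(sum + s[i]); /* the extra per-byte work */
    }
    return sum;
}
```

Copying the four bytes 0x01, 0x02, 0xFF, 0x10 returns 0x12, since their sum (274) wraps around modulo 256.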

Key Considerations for Byte Copying

  • Memory Overlap: Always be mindful of whether your source and destination memory regions overlap. Use memcpy() for non-overlapping regions and memmove() for potentially overlapping ones.
  • Buffer Size: Ensure that the destination buffer is large enough to accommodate all n bytes being copied. Failing to do so will result in a buffer overflow, a common source of security vulnerabilities and program crashes.
  • Data Type Agnosticism: Functions like memcpy(), memmove(), and memccpy() operate on void * pointers, meaning they are generic and can copy any type of data by treating it as a sequence of bytes. You are responsible for knowing the actual size of the data you intend to copy.
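  • Performance: The standard library functions, memcpy() and memmove() in particular, are heavily optimized in most C libraries, often with hand-written assembly or platform-specific vector instructions. On large data blocks they are usually faster than a hand-written byte loop.
  • Include Headers: To use memcpy(), memmove(), or memccpy(), you must include the <string.h> header. Because memccpy() comes from POSIX and joined the C standard only in C23, strict pre-C23 build modes may also require a feature-test macro such as _XOPEN_SOURCE.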
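
The buffer-size point can be enforced with a small wrapper that refuses to copy more bytes than the destination can hold. This is only a sketch; checked_copy() is a hypothetical name, and it uses memmove() so that overlapping arguments are also safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copy n bytes only if they fit into dest_size bytes.
   Returns true on success, false (with dest untouched) if n is too large. */
bool checked_copy(void *dest, size_t dest_size, const void *src, size_t n) {
    assert(dest != NULL && src != NULL);
    if (n > dest_size) {
        return false; /* would overflow the destination buffer */
    }
    memmove(dest, src, n);
    return true;
}
```

A call such as checked_copy(buf, sizeof buf, input, input_len) then fails cleanly instead of writing past the end of buf. Note that sizeof only gives the buffer size for a real array, not for a pointer to one.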

Summary of Byte Copying Functions

Function    | Description                                                                                                                                                                                      | Handles Overlap? | Primary Use Case
memcpy()    | Copies n bytes from src to dest. Assumes non-overlapping regions.                                                                                                                                | No (undefined)   | Fastest copy for distinct memory areas.
memmove()   | Copies n bytes from src to dest. Safely handles overlapping memory regions.                                                                                                                      | Yes              | Copying within the same array or when overlap is possible.
memccpy()   | Copies bytes from src to dest, stopping after the first occurrence of byte c is copied, or after n bytes are copied, whichever comes first. Returns a pointer to the byte after c in dest or NULL. | No (undefined)   | Copying up to a specific delimiter byte or a maximum length.
Manual Loop | User-implemented loop (for) to copy byte by byte.                                                                                                                                                | Yes (if coded)   | Learning, specific per-byte operations, or very custom scenarios (less efficient for plain copy).