Convert raw vectors to R vectors
Usage
raw_to_uint8(x)
raw_to_uint16(x)
raw_to_uint32(x)
raw_to_int8(x)
raw_to_int16(x)
raw_to_int32(x)
raw_to_int64(x)
raw_to_float(x)
raw_to_string(x)
Details
For numeric conversions, the function names are straightforward.
For example,
raw_to_uintN
converts raw vectors to unsigned integers, and
raw_to_intN
converts raw vectors to signed integers. The number
'N'
stands for the number of bits used to store the integer.
For example raw_to_uint8
uses 8 bits (1 byte) to store an integer,
hence the value range is 0-255
.
The input data length must be multiple of the element size represented by
the underlying data. For example uint16
integer uses 16 bites, and
one raw number uses 8 bits, hence two raw vectors can form one unsigned
integer-16. That is, raw_to_uint16
requires the length of input
to be multiple of two. An easy calculation is: the length of x
times
8, must be divided by 'N'
(see last paragraph for definition).
The returned data uses the closest available R native data type that can
fully represent the data. For example, R does not have single float
type, hence raw_to_float
returns double
type, which can
represent all possible values in float
. For raw_to_uint32
,
the potential value range is 0 - (2^32-1)
. This exceeds the limit of
R integer type (-2^31) - (2^31-1)
. Therefore, the returned values
will be real (double float) data type.
There is no native data type that can store integer-64 data in R, package
bit64
provides integer64
type, which will be used by
raw_to_int64
. Currently there is no solution to convert raw to
unsigned integer-64 type.
raw_to_string
converts raw to character string. This function respects
null
character, hence is slightly different than the native
rawToChar
, which translates raw byte-by-byte. If each
raw byte represents a valid character, then the above two functions returns
the same result. However, when the characters represented by raw bytes are
invalid, raw_to_string
will stop parsing and returns only the valid
characters, while rawToChar
will still try to parse, and
most likely to result in errors.
Please see Examples for comparisons.
Examples
# 0x00, 0x7f, 0x80, 0xFF
x <- as.raw(c(0, 127, 128, 255))
raw_to_uint8(x)
#> [1] 0 127 128 255
# The first bit becomes the integer sign
# 128 -> -128, 255 -> -1
raw_to_int8(x)
#> [1] 0 127 -128 -1
## Comments based on little endian system
# 0x7f00 (32512), 0xFF80 (65408 unsigned, or -128 signed)
raw_to_uint16(x)
#> [1] 32512 65408
raw_to_int16(x)
#> [1] 32512 -128
# 0xFF807F00 (4286611200 unsigned, -8356096 signed)
raw_to_uint32(x)
#> [1] 4286611200
raw_to_int32(x)
#> [1] -8356096
# ---------------------------- String ---------------------------
# ASCII case: all valid
x <- charToRaw("This is an ASCII string")
raw_to_string(x)
#> [1] "This is an ASCII string"
rawToChar(x)
#> [1] "This is an ASCII string"
x <- c(charToRaw("This is the end."),
as.raw(0),
charToRaw("*** is invalid"))
# rawToChar will raise error
raw_to_string(x)
#> [1] "This is the end."
# ---------------------------- Integer64 ------------------------
# Runs on little endian system
x <- as.raw(c(0x80, 0x00, 0x7f, 0x80, 0xFF, 0x50, 0x7f, 0x00))
# Calculate bitstring, which concaternates the followings
# 10000000 (0x80), 00000000 (0x00), 01111111 (0x7f), 10000000 (0x80),
# 11111111 (0xFF), 01010000 (0x50), 01111111 (0x7f), 00000000 (0x00)
if(.Platform$endian == "little") {
bitstring <- paste0(
"00000000011111110101000011111111",
"10000000011111110000000010000000"
)
} else {
bitstring <- paste0(
"00000001000000001111111000000001",
"11111111000010101111111000000000"
)
}
# This is expected value
bit64::as.integer64(structure(
bitstring,
class = "bitstring"
))
#> integer64
#> [1] 35836380344942720
# This is actual value
raw_to_int64(x)
#> integer64
#> [1] 35836380344942720