Deconstructing arrays

//by Jon Ripley, August 2006//

This article assumes that you have read the manual section 'Array storage in memory'. It expands on that documentation and shows how to use the information in a real program.

Understanding the layout and structure of arrays in memory has practical applications when dealing with arrays from assembly language and when writing code that manipulates arrays. This information can also be used to implement methods to deal with arrays in ways not natively supported by BBC BASIC.

Reading the array pointer
The most important part of an array is the pointer to the array block, and reading it is the first step in deconstructing an array. We also need an array to work with, so an example is created below.

code format="bb4w"
DIM array(1, 2, 3, 4)
parr% = ^array():REM Pointer to array
code

Here we dimension an example array **array** and set **parr%** to the pointer to this array.

Basic array information
code format="bb4w"
PRINT "Pointer to parameter block address: &";~parr%
PRINT "Address of parameter block: &";~!parr%
PRINT "Number of dimensions: ";?!parr%
PRINT "Address of data area: &";~!parr%+1+?!parr%*4
code

Here we display basic information about the array. The pointer to the parameter block could be expressed as **^array()**, the parameter block address as **!^array()** and the number of dimensions as **?!^array()**.


**Note:** The following code could be rewritten with **^array()** replacing each instance of **parr%**, but this would limit us to examining only an array called **array**. By storing the pointer to the array in **parr%** we can use this code to examine any type of array.
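
As an illustration of this point, the following minimal sketch passes the pointers of two different arrays to the same routine. The procedure name **PROCshow_dims** and the arrays **a%()** and **b$()** are hypothetical names chosen for this example only.

code format="bb4w"
DIM a%(9), b$(2, 3)
PROCshow_dims(^a%())
PROCshow_dims(^b$())
END

REM Hypothetical helper: works on any array because it only uses the pointer
DEF PROCshow_dims(parr%)
PRINT "Number of dimensions: ";?!parr%
ENDPROC
code

With the arrays dimensioned as above, this should report one dimension for **a%()** and two for **b$()**.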

Reading the size of each dimension
To find the size of each dimension in the array and calculate the total number of elements in the array use the following code:

code format="bb4w"
tele% = 1:REM Total number of elements
FOR i% = 0 TO ?!parr% - 1
  nele% = !(!parr%+1+4*i%)
  PRINT "Size of dimension #";i%": ";nele%-1" + 1"
  tele% *= nele%
NEXT i%
PRINT "Total number of elements: ";tele%
code

Here we iterate through the dimensions of the array, read the number of elements in each dimension (**nele%**) and display the result. The running total number of elements in the array is calculated and stored in **tele%**; this value is later used to calculate the size of the array data area.
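
The same loop can be wrapped in a reusable function. This is only a sketch: the name **FNarray_elements** is hypothetical and not part of BBC BASIC, and the function assumes it is given a valid array pointer obtained with the 'address of' operator.

code format="bb4w"
DIM array(1, 2, 3, 4)
PRINT "Total number of elements: ";FNarray_elements(^array())
END

REM Hypothetical helper: multiplies together the sizes of all dimensions
DEF FNarray_elements(parr%)
LOCAL i%, tele%
tele% = 1
FOR i% = 0 TO ?!parr% - 1
  tele% *= !(!parr%+1+4*i%)
NEXT i%
= tele%
code

For the example array the dimension sizes are 2, 3, 4 and 5, so this should report 120 elements.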

Determining the array type
Arrays can contain different types of variable, and each variable type occupies a different number of bytes in memory. Knowing the type of an array is important when manipulating the data in the array or the array itself. The array variable name precedes the array pointer in memory, which allows us to find the type of variable stored in the array. The byte immediately before the pointer is the zero terminator of the name and the byte before that is the '**(**' character, so the type suffix is stored three bytes before the array pointer.

We can use this information to find the type of variable stored in the array:

code format="bb4w"
PRINT "Array type: ";
CASE parr%?-3 OF
  WHEN ASC"%": PRINT "Integer": esize%=4
  WHEN ASC"#": PRINT "Double": esize%=8
  WHEN ASC"$": PRINT "String": esize%=6
  WHEN ASC"&": PRINT "Byte": esize%=1
  OTHERWISE: PRINT "Single": esize%=5
ENDCASE
PRINT "Size of each element: ";esize%" bytes"
code

Here **esize%** is set to the size in bytes of the variable type stored in the array. One variable type, the 40-bit real, does not have a type suffix, so we assume that any variable name that does not end in **%**, **#**, **$** or **&** is a 40-bit real. When BBC BASIC is in //float 64// mode all new real variables have the **#** suffix silently added to the end of the name. This is essential for BBC BASIC to tell the difference between the two types of real variable.


**Note:** We are ignoring structures and arrays here as these variable types cannot be contained within an array. Structure variable names have a '**{**' suffix and array variable names have a '**(**' suffix.

Calculating the total size of the array
The total size of the array data area equals the total number of elements (**tele%**) multiplied by the size of each element (**esize%**). The total size of the array block equals the size of the data area plus one byte holding the number of dimensions and four bytes holding the size of each dimension.

We can now use all the information collected so far to determine the total size of the array in memory:

code format="bb4w"
PRINT "Total data area size: ";tele%*esize%" bytes"
PRINT "Total array size: ";1+4*?!parr%+tele%*esize%" bytes"
code
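
The steps so far can also be combined into a single function that returns the total size of an array block. This is a minimal sketch rather than a definitive routine: the name **FNarray_size** is hypothetical, the element sizes are the ones listed in this article, and the function assumes it is given a valid array pointer.

code format="bb4w"
REM Hypothetical helper: total size in bytes of the array block
DEF FNarray_size(parr%)
LOCAL i%, tele%, esize%
REM Multiply together the sizes of all dimensions
tele% = 1
FOR i% = 0 TO ?!parr% - 1
  tele% *= !(!parr%+1+4*i%)
NEXT i%
REM Element size from the type suffix three bytes before the pointer
CASE parr%?-3 OF
  WHEN ASC"%": esize% = 4
  WHEN ASC"#": esize% = 8
  WHEN ASC"$": esize% = 6
  WHEN ASC"&": esize% = 1
  OTHERWISE: esize% = 5
ENDCASE
REM One byte for the dimension count, four bytes per dimension, then the data
= 1 + 4*?!parr% + tele%*esize%
code

As a worked example, an array dimensioned with DIM array(1, 2, 3, 4) has 2 * 3 * 4 * 5 = 120 elements. Assuming 40-bit reals of 5 bytes each, the data area is 600 bytes and the header is 1 + 4 * 4 = 17 bytes, so the function should return 617 for that array.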

How to verify the array pointer
When writing code that manipulates arrays by reference it is a good idea to check that the array pointer **parr%** is actually a pointer to an array. Not doing so will lead to undefined behaviour and may crash //BBC BASIC// if your code is called with an invalid array pointer.

To check if a pointer points to an array use the following code:

code format="bb4w"
IF NOT (parr%?-1 = 0 AND parr%?-2 = 40) THEN
  REM parr% does not point to an array
ELSE
  REM parr% points to an array
ENDIF
code

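Here we check for a variable descriptor that matches an array: the byte immediately before the pointer must be the zero terminator of the variable name, and the byte before that must be '**(**' (character code 40). If the array pointer **parr%** does not point to a valid array block the routine should not attempt to access or manipulate data in the invalid array.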
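
The test can be packaged as a function so that every routine taking an array pointer can guard itself in the same way. This is a sketch only; the name **FNis_array** is hypothetical. It examines only the two bytes preceding the pointer, so it should be treated as a sanity check rather than a guarantee.

code format="bb4w"
DIM array(1, 2, 3, 4)
IF FNis_array(^array()) THEN PRINT "Pointer looks like an array pointer"
END

REM Hypothetical helper: TRUE if the pointer is preceded by "(" and a zero terminator
DEF FNis_array(parr%)
= (parr%?-1 = 0 AND parr%?-2 = 40)
code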