ARRAYS AND STRINGS




ARRAYS



In principle arrays in C are similar to those found in other languages.


As we shall shortly see, arrays are defined slightly differently, and there are many subtle differences due to the close link between arrays and pointers.


We will look more closely at the link between pointers and arrays.


SINGLE AND MULTI-DIMENSIONAL ARRAYS



Setting the value of an array element is as easy as accessing the element and performing an assignment. The general form is


array_name[index] = value;


for instance,


my_string[0] = 'C'; /* set the first element of my_string to the letter C */


or, for two-dimensional arrays,

array_name[row][column] = value;



Let us first look at how we define arrays in C:
 


int listofnumbers[50];



BEWARE:


In C, array subscripts start at 0 and end at one less than the array size. For example, in the above case valid subscripts range from 0 to 49.

This is a big difference between C and languages whose arrays start at 1, and it takes a bit of practice to get into the right frame of mind.

Elements can be accessed in the following ways:


thirdnumber = listofnumbers[2];
listofnumbers[5] = 100;

Multi-dimensional arrays can be defined as follows:

int tableofnumbers[50][50];

for two dimensions.

For further dimensions simply add more [ ]:

int bigd[50][50][40][30]......[50];

Elements can be accessed in the following ways:

anumber = tableofnumbers[2][3];
tableofnumbers[25][16] = 100;
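
As a minimal sketch of the indexing rules above, the following functions fill a one-dimensional array and total a two-dimensional one. The names ROWS, COLS, fill_squares and sum_table are made up for this illustration; they do not come from the original text.

```c
#include <stddef.h>

#define ROWS 3   /* hypothetical sizes chosen for the illustration */
#define COLS 4

/* Store i*i in element i. Valid subscripts run from 0 to count - 1. */
void fill_squares(int list[], size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        list[i] = (int)(i * i);
    }
}

/* Add up every element of a two-dimensional array, row by row. */
int sum_table(int table[ROWS][COLS])
{
    int total = 0;
    int row, col;
    for (row = 0; row < ROWS; row++) {
        for (col = 0; col < COLS; col++) {
            total += table[row][col];
        }
    }
    return total;
}
```

Both loops use < rather than <= in their conditions, which is the usual way to stay inside the valid range of subscripts.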



C STRINGS



This lesson will discuss C-style strings, which you may have already seen in the array section. In fact, C-style strings are really arrays of chars with a little bit of special sauce to indicate where the string ends.

This tutorial will cover some of the tools available for working with

strings--things like copying them, concatenating them, and getting their

length.

WHAT IS A STRING?


Note that along with C-style strings, which are arrays, there are also string literals, such as "this". In reality, both of these string types are merely collections of characters sitting next to each other in memory.



The only difference is that you cannot modify string literals, whereas you can modify arrays. Functions that take a C-style string will be just as happy to accept string literals unless they modify the string (in which case the behavior is undefined and your program will most likely crash).
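
To make the distinction concrete, here is a small sketch of a function that modifies the string it is given. The name capitalize_first is hypothetical, and toupper comes from the standard header ctype.h, which the text does not cover.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first character of s in place.
   s must point to modifiable memory, such as a char array. */
void capitalize_first(char *s)
{
    if (s[0] != '\0') {
        s[0] = (char)toupper((unsigned char)s[0]);
    }
}
```

Declaring char word[] = "this"; and then calling capitalize_first(word) is fine, because the array holds its own modifiable copy of the characters. Calling capitalize_first("this") directly would try to change the literal itself, which is exactly the case to avoid.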



Some things that might look like strings are not strings; in particular, a

character enclosed in single quotes, like this, 'a', is not a string.



It's a single character, which can be assigned to a specific location in a

string, but which cannot be treated as a string.


(Remember how arrays act like pointers when passed into functions? Characters don't, so if you pass a single character into a function that expects a string, it won't work; the function is expecting a char*, not a char.)
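
The difference in types shows up clearly in a function that takes one of each. This is only a sketch; count_char is a made-up name used for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: counts how often the single character c occurs
   in the string s. Note the types: s is a char*, c is a plain char. */
size_t count_char(const char *s, char c)
{
    size_t count = 0;
    while (*s != '\0') {
        if (*s == c) {
            count++;
        }
        s++;
    }
    return count;
}
```

A call such as count_char("banana", 'a') passes a string first and a single character second. Swapping the quotes, as in count_char('b', "a"), does not match the parameter types and a compiler should reject or warn about it.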




To recap: strings are arrays of chars. String literals are words surrounded by double quotation marks.


"This is a static string"


Remember that special sauce mentioned above? Well, it turns out that C-style strings are always terminated with a null character, literally a '\0' character (with the value of 0). To declare a string of 49 letters, you need to account for it by adding an extra character, so you would want to say:

char string[50];



This declares an array of 50 chars. Do not forget that arrays begin at zero, not 1, for the index number, so the valid indexes run from 0 to 49.



The extra element accounts for the null character, literally a '\0' character, that ends the string.



It's important to remember that there will be an extra character at the end of a string, just like there is always a period at the end of a sentence.



Since this string terminator is unprintable, it is not counted as a letter,

but it still takes up a space.



Technically, in a fifty char array you could only hold 49 letters and one null

character at the end to terminate the string.
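
The arithmetic can be checked with a short sketch that fills an array by hand. The function name fill_letters is hypothetical and exists only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: fills buf with as many copies of letter as fit,
   leaving the last element for the null terminator. size is the total
   number of chars in buf, so at most size - 1 letters are stored. */
void fill_letters(char *buf, size_t size, char letter)
{
    size_t i;
    if (size == 0) {
        return;               /* no room even for the terminator */
    }
    for (i = 0; i < size - 1; i++) {
        buf[i] = letter;
    }
    buf[size - 1] = '\0';     /* the terminator takes the final slot */
}
```

With char string[50]; the call fill_letters(string, sizeof string, 'x') stores 49 letters in elements 0 to 48 and the terminator in element 49.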



Note that something like


char *my_string;


can also be used as a string. If you have read the tutorial on pointers, you can do something such as the following, where arry is a pointer to char and malloc is declared in stdlib.h:


arry = malloc( sizeof(*arry) * 256 );


which allows you to access arry just as if it were an array. To free the memory you allocated, just use free. For example:


free ( arry );
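
Put together, the allocation and the array-style access might look like the sketch below. The function make_greeting is a hypothetical name, and the check for NULL is an addition that the text above does not mention: malloc returns NULL when it cannot provide the memory.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates room for 256 chars, stores the string
   "Hi" in it using array subscripts, and returns the pointer.
   Returns NULL if the allocation fails. The caller must call free(). */
char *make_greeting(void)
{
    char *arry = malloc(sizeof(*arry) * 256);
    if (arry == NULL) {
        return NULL;
    }
    arry[0] = 'H';
    arry[1] = 'i';
    arry[2] = '\0';   /* terminate the string by hand */
    return arry;
}
```

Whoever calls make_greeting owns the memory and should pass the returned pointer to free once the string is no longer needed.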


USING STRINGS


Strings are useful for holding all types of long input. If you want the user

to input his or her name, you must use a string.


Using scanf() with the %s format to input a string works, but it will stop reading at the first space, and moreover, because scanf doesn't know how big the array is, it can lead to "buffer overflows" when the user inputs a string that is longer than the array that holds it (which acts as an input "buffer").
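
Both effects can be seen in a small sketch. It uses sscanf, the variant of scanf that reads from a string instead of the keyboard, so the input is fixed, and it puts a field width in the format to cap how many characters are stored. The function name first_word and the buffer size of 16 are assumptions made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first whitespace-delimited word of text
   into word, which must have room for at least 16 chars. The width 15
   in the format leaves one element for the null terminator.
   Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *text, char word[16])
{
    return sscanf(text, "%15s", word) == 1;
}
```

Given the text "John Smith", only "John" ends up in word, because %s stops at the space. Given a word longer than 15 characters, only the first 15 are stored, so the array cannot overflow.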


There are several approaches to handling this problem, but probably the

simplest and safest is to use the fgets function, which is declared in

stdio.h.



The prototype for the fgets function is:


char *fgets (char *str, int size, FILE* file);




There are a few new things here. First of all, let's clear up the questions

about that funky FILE* pointer.


This parameter exists because fgets is supposed to be able to read from any file on disk, not just from the user's keyboard (or other "standard input" device).


For the time being, whenever we call fgets, we'll just pass in a variable

called stdin, defined in stdio.h, which refers to "standard input".


This effectively tells the program to read from the keyboard. The other two arguments to fgets, str and size, are simply the place to store the data read from the input and the size of the array that str points to.


Finally, fgets returns str whenever it successfully reads from the input, and NULL if an error occurs or the input ends before any characters are read.


When fgets actually reads input from the user, it will read up to size - 1

characters and then place the null terminator after the last character it

read.


fgets will read input until it either has no more room to store the data or

until the user hits enter.



Notice that fgets may fill up the entire space allocated for str, but it will

never return a non-null terminated string to you.



Let's look at an example of using fgets, and then we'll talk about some

pitfalls to watch out for.


For example:


#include <stdio.h>

int main()
{
    /* A nice long string */
    char string[256];

    printf( "Please enter a long string: " );

    /* Notice stdin being passed in */
    fgets ( string, 256, stdin );

    printf( "You entered a very long string, %s", string );

    getchar();
}





Remember that you are actually passing the address of the array when you pass string: arrays do not require an address operator (&) to pass their addresses. That is why fgets can modify the values in the array string.


The one thing to watch out for when using fgets is that it will include the

newline character ('\n') when it reads input unless there isn't room in the

string to store it. This means that you may need to manually remove the newline.


One way to do this would be to search the string for a newline and then

replace it with the null terminator.


What would this look like? See if you can figure out a way to do it before

looking below:


char input[256];
int i;

fgets( input, 256, stdin );

/* Stop at the terminator so we never look past the end of the string */
for ( i = 0; i < 256 && input[i] != '\0'; i++ ) {
    if ( input[i] == '\n' ) {
        input[i] = '\0';
        break;
    }
}







Here, we just loop through the input until we come to a newline, and when we do, we replace it with the null terminator. Notice that if fewer than 255 characters were read, the user must have hit enter, which would have included the newline character in the string!
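
The reading and the newline removal can be wrapped into one function, sketched below under a few assumptions. The name read_line is hypothetical. It also swaps the hand-written loop for strcspn, a standard function from string.h that the text does not cover, and it checks the return value of fgets.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into buf (size chars big)
   and removes the trailing newline if there is one.
   Returns 1 on success, 0 if fgets could not read anything. */
int read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL) {
        return 0;
    }
    /* strcspn returns the index of the first '\n', or the index of the
       terminator if there is none, so this assignment is always safe. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

To read from the keyboard, pass stdin as the last argument, for example read_line(input, 256, stdin).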


MANIPULATING C STRINGS USING STRING.H


string.h is a header file that contains many functions for manipulating

strings. One of these is the string comparison function.


int strcmp ( const char *s1, const char *s2 );


strcmp will accept two strings. It will return an integer. This integer will

either be:


Negative if s1 is less than s2.


Zero if s1 and s2 are equal.


Positive if s1 is greater than s2.


strcmp performs a case-sensitive comparison; if the strings are the same except for a difference in cAse, then they're counted as being different.

As with fgets, passing a character array to strcmp passes the address of the array, which allows the function to access its characters.
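
As a minimal sketch of how the return value is usually tested, the function below turns the three cases into words. The name describe_order is made up for this illustration.

```c
#include <string.h>

/* Hypothetical helper: turns the sign of strcmp's result into a word. */
const char *describe_order(const char *s1, const char *s2)
{
    int result = strcmp(s1, s2);
    if (result < 0) {
        return "before";
    }
    if (result > 0) {
        return "after";
    }
    return "equal";
}
```

The code tests only whether the result is below, above or equal to zero. Comparing it against specific values such as -1 or 1 is not reliable, because only the sign is guaranteed.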


char *strcat ( char *dest, const char *src );


strcat is short for "string concatenate"; concatenate is a fancy word that

means to add to the end, or append. It adds the second string to the first

string. It returns a pointer to the concatenated string. Beware of this function: it assumes that dest is large enough to hold the entire contents of src as well as its own contents.
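
One way to guard against that assumption is to check the sizes before calling strcat. The sketch below is one possible approach, not the only one; append_if_fits is a hypothetical name, and the caller has to supply the size of dest because strcat itself has no way of knowing it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result, including
   the null terminator, fits in dest_size chars. dest must already hold
   a terminated string.
   Returns 1 if src was appended, 0 if there was not enough room. */
int append_if_fits(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t extra = strlen(src);
    if (used + extra + 1 > dest_size) {
        return 0;             /* dest is left unchanged */
    }
    strcat(dest, src);
    return 1;
}
```

With char name[12] = "John"; the call append_if_fits(name, sizeof name, " Smith") succeeds, since the ten letters plus the terminator need 11 chars. A further attempt to append "son" is refused.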


char *strcpy ( char *dest, const char *src );


strcpy is short for string copy, which means it copies the entire contents of src, including the null terminator, into dest. Like strcat, it assumes that dest is large enough. The contents of dest after strcpy will be exactly the same as src such that strcmp ( dest, src ) will return 0.
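
The same kind of size check works for copying. In this sketch copy_if_fits is a hypothetical name, and the size of dest is again something the caller must pass in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest only if src, including its
   null terminator, fits in dest_size chars.
   Returns 1 if the copy was made, 0 if dest is too small. */
int copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    if (strlen(src) + 1 > dest_size) {
        return 0;             /* dest is left unchanged */
    }
    strcpy(dest, src);
    return 1;
}
```

After a successful call, strcmp ( dest, src ) returns 0. A six-element array can take "hello" (five letters plus the terminator) but not "hello!".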


size_t strlen ( const char *s );


strlen will return the length of a string, not counting the terminating character ('\0'). The size_t is nothing to worry about. Just treat it as an integer that cannot be negative, which is what it actually is. (The type size_t is just a way to indicate that the value is intended for use as a size of something.)
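
To see what that count means, here is a hand-written equivalent, for illustration only. The name my_strlen is hypothetical, and real programs should simply call strlen.

```c
#include <stddef.h>

/* Hypothetical re-implementation, for illustration only: counts chars
   until the null terminator is reached, without counting the terminator. */
size_t my_strlen(const char *s)
{
    size_t length = 0;
    while (s[length] != '\0') {
        length++;
    }
    return length;
}
```

For char buffer[50] = "hello"; the length is 5 even though the array itself occupies 50 chars: the length depends on where the terminator sits, not on the size of the array.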



