CHAPTER 11
A. The '>>' Operator

Whenever we write a statement of the form 'cin >> variable', the bits being read from the input stream are interpreted in terms of the type of the variable on the right hand side. In all cases, the default behavior is to ignore leading white space (blanks, ASCII character code 32, as well as tabs and newlines) and then, when the first non-blank character appears on the stream, read data into the variable until the next white space character is found or a character appears that cannot be part of the value. For example, if the variable is of type 'int' the system will skip all blanks and attempt to translate the codes that follow as digits in the range 0 - 9 (with the first character being an optional '+' or '-'). If the first of these characters is not a digit or a sign - a letter or, for example, a comma - the input fails because it cannot translate this code into an integer value. (If a non-digit appears after one or more digits, reading simply stops there and the digits already read form the value.) When the input fails, the cin stream has a status of 'fail'. (We will see the implications of this in a bit.)

Similar rules apply for doubles. However, with variables of type char, only one character is read, and it is much harder to have an input failure. Note that in both cases all leading white space (blanks) is again skipped.

The same process occurs with strings, but here one has to be more careful. Since the system only stops removing characters from the stream when it encounters a blank, it is possible to overflow the memory set aside for the string. If this happens, the system will blithely continue reading characters into memory meant for other purposes until a blank is encountered. The consequences of this could be disastrous. One nice feature: when a trailing blank is encountered, a NULL character is automatically inserted into the string, so the string manipulation operations will work as long as too many characters are not entered.

B. The 'get' Member Function
The 'get' member function can be used to read in one character at a time. Since this function does not skip leading white space, it can be used to read in a blank. (One cannot do this with the '>>' operator because it always skips leading blanks.) If blanks have significance as input, this could be very useful.

Earlier, we saw how functions can be overloaded - to take on more than one purpose - by using different types and/or number of parameters. The 'get' function has been overloaded a number of times. When called with, first, a string parameter and, second, an integer parameter, it can be used to read characters until it encounters a newline character ('\n') or until it has read "integer - 1" characters. Thus, if we write a call such as 'cin.get(string, 5)', the system will read in up to four characters (stopping sooner if it encounters a newline character) AND add a NULL character. The programmer must make sure that 'string' is large enough to hold four characters and a NULL character, but this is much safer than assuming that the user will not enter too many characters.

This form of 'get' actually has a third parameter. We don't need to use it because it has a default value of '\n'. Thus, when a call to 'get' does not provide a third parameter, the function performs as described - using the default value to indicate the character to look for to stop reading in input. However, if, for example, one wishes a program to read in possibly multiple lines of characters until 80 characters or a period (.) is entered, one could supply 81 as the second parameter and '.' as the third.
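As a minimal sketch of both forms of the call, the two helpers below wrap 'get' so that it can be tried on any input stream. The helper names and array sizes are illustrative assumptions, not part of the library.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads at most four characters,
// stopping sooner if a newline is encountered.
std::string read_up_to_four(std::istream& in) {
    char text[5] = "";   // four characters plus the terminating null
    in.get(text, 5);     // the third parameter defaults to '\n'
    return text;
}

// Hypothetical helper: reads across lines until a period
// is found or 80 characters have been read.
std::string read_until_period(std::istream& in) {
    char text[81] = "";      // 80 characters plus the terminating null
    in.get(text, 81, '.');   // '.' replaces '\n' as the delimiter
    return text;
}
```

Either helper can be called with 'std::cin' or, for experimenting, with a 'std::istringstream' holding prepared text. Note that the delimiter itself is not stored in the array and stays in the stream.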
In this case, if one entered text spread over three separate lines and ending with a period, and the program then printed the string, the output would also be on three lines - because of the newline characters stored in the string!

Having learned about function overloading and parameter default values, we can imagine how the function 'get' is declared in the file "iostream.h" (<iostream> in standard C++). There are some obscure points about what are called 'far' pointers and 'unsigned characters' in the actual declarations that we will skip, but if you examine this file you will see three function declarations for 'get' that are relevant to us: one that takes no parameters, one that takes a single 'char&' parameter, and one that takes a 'char*', an 'int', and a 'char' with a default value.

The second and third versions of this function, for reasons that are unimportant here, return the modified stream. Notice that a reference to the stream (in effect, its address) is returned (that is what the '&' means after the return type) so that a copy of the stream is not made. The second version also needs to return a character via its sole parameter; thus it has the '&' after the parameter type. We have not discussed this before, but technically one need not put a variable name for each parameter in the declarations. All that is required is the parameter type. However, it seems safer and simpler to understand if we continue to provide parameter names in our declarations.

We have already discussed the third version. It returns a string through its first parameter; thus the first parameter is a 'char*'. We do not need the '&' symbol because the '*' means the address is being passed - which is what the '&' also indicates. The second parameter is the maximum length of the string, while the third parameter is the delimiter, the character at which input will stop if encountered before the maximum number of characters have been read. Note that this third parameter has a default value. In other words, as we already know, if the third parameter is dropped in a call to 'get', the delimiter will automatically be the newline (\n) symbol.

C. The 'getline' Member Function
If 'getline' is called in the same way as the third version of 'get', with a string parameter and an integer parameter of 81, it will read in up to 80 characters or until the newline character is encountered. There is no version of this function declared with simply a 'char' as the first parameter. Therefore, you cannot read data into a character type variable using 'getline' - as you can with the first two versions of 'get'. This makes sense because, as its name implies, 'getline' is meant to read in a line at a time. You can, however, define what a 'line' means to you by choosing to call this function with a third parameter, thus overriding the value of '\n' as the delimiter.

The main difference between 'getline' and 'get' is that 'getline' does remove the delimiter from the stream (without storing it in the string). As we saw in chapter 10, Section 8, when using the 'get' function, it is often necessary to also use the 'ignore' member function to remove the newline character before proceeding with the next 'get'. If this is not done, the next 'get' will also encounter the newline and will stop immediately. The 'getline' function does all this for us.
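The difference can be sketched with two helpers that each read two lines, one using 'get' followed by 'ignore' and the other using 'getline' alone. The helper names and the 81-character arrays are assumptions made for illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: reads two lines with 'get'.
// 'get' leaves the newline in the stream, so 'ignore' must discard it.
std::pair<std::string, std::string> two_lines_with_get(std::istream& in) {
    char first[81] = "";
    char second[81] = "";
    in.get(first, 81);
    in.ignore();          // remove the newline that 'get' left behind
    in.get(second, 81);
    return std::make_pair(std::string(first), std::string(second));
}

// Hypothetical helper: reads two lines with 'getline'.
// 'getline' removes the newline itself, so no 'ignore' is needed.
std::pair<std::string, std::string> two_lines_with_getline(std::istream& in) {
    char first[81] = "";
    char second[81] = "";
    in.getline(first, 81);
    in.getline(second, 81);
    return std::make_pair(std::string(first), std::string(second));
}
```

Given the same two lines of input, both helpers return the same pair of strings. Dropping the 'ignore' call from the first helper would leave its second string empty.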
With keyboard input, where typing the 'Enter' key is crucial to getting information processed, the difference between 'get' and 'getline' is not too important. One can either use 'get' and 'ignore' together or use 'getline' by itself. However, in processing files, the difference can be more significant.

D. The Use of the setw Manipulator

The manipulator 'setw' (short for set width) can be used to indicate the maximum number of characters to be read in with the '>>' operator. More precisely, 'setw(n)' allows at most n - 1 characters to be read, leaving room for the NULL character. In other words, it is a way to use '>>' to read in strings and guarantee that the string array does not overflow. It does not, however, change the fact that '>>' stops when it encounters a blank. Thus you cannot use it to read strings that contain blanks.

Consider the following code:
Now consider the same code with two lines added:
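A minimal sketch of the protected form of the read, assuming a hypothetical five-character array named 'word' and a helper that accepts any input stream:

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one word of at most four characters.
std::string read_limited_word(std::istream& in) {
    char word[5] = "";             // four characters plus the terminating null
    in >> std::setw(5) >> word;    // setw applies to this one '>>' only
    return word;
}
```

If the stream holds a word longer than four characters, the first call returns its first four characters and the remainder stays in the stream for the next read.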
Note that 'setw' only affects the very next '>>' operation. Thus, it needs to be repeated each time you want to restrict the number of characters read. Also, to use it in a program, one must include the header file 'iomanip.h' (<iomanip> in standard C++).

E. Handling Invalid Input

When, for example, the system attempts to read a non-numeric character into an integer variable, an error occurs that sets the stream status to 'fail'. Once this happens, any further request to read is ignored. Therefore, if a user types, for example, a 'Q' when asked for a number by a loop that reads integers into 'x' until a 0 is entered, the status flag is set to 'fail', the input is ignored, and the system returns to the top of the loop. Since 'x' does not yet equal 0, the loop again attempts to execute the line that reads 'x'. However, the instruction is ignored because the status flag has been set to 'fail'. This means that the value of 'x' has not changed and the loop repeats again. In fact, 'x' will never change because the status flag will always be 'fail' and the system will ignore the read instruction. We have an infinite loop! (Compilers that follow the C++11 standard or later store 0 in 'x' when the read fails, so this particular loop would end instead, but the stream is still left in the 'fail' state and must be repaired before any further input can be read.)

To fix this we need to check for the 'fail' error status, clear the status flag if necessary, get rid of the illegal character and then continue. Below is a code fragment that demonstrates. It uses a 'while' loop that simply keeps asking for numbers until the user enters a 0. Note that the 'while' instruction itself is not part of the error handling. It is included because such error handling is needed when an input fails inside a loop.
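A minimal sketch of such a fragment, written as a hypothetical function 'read_until_zero' that takes any input stream so it can be tried without a keyboard. Collecting the valid numbers in a vector is an assumption made here in place of announcing each success.

```cpp
#include <iostream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads integers until a 0 is read,
// recovering from invalid input along the way.
std::vector<int> read_until_zero(std::istream& in) {
    std::vector<int> values;
    int x = -1;
    while (x != 0) {
        in >> x;
        if (in.fail()) {
            if (in.eof()) {
                break;        // the input ran out, nothing left to repair
            }
            in.clear();       // clear the status flag
            in.ignore();      // get rid of one illegal character
            x = -1;           // a failed read may have stored 0 in x
        } else if (x != 0) {
            values.push_back(x);   // valid data: use it
        }
    }
    return values;
}
```

With the input "5 Q 7 0" the function collects 5 and 7, because the 'Q' is discarded instead of blocking every later read.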
The fragment first checks the error status of 'cin'. If an error has occurred, the code clears the status flag and gets rid of the illegal character. If an error has not occurred, this code simply announces that the read was a success. In a real application, the line that makes this announcement would be replaced by whatever was to be done with valid input. In both cases (valid or invalid data), another character is read and the system starts again.

Here you see another pattern (Pattern G1, "Clean Handling of Invalid Data") that you might want to consider adding to your code whenever you are working with input that could be invalid.

By the way, you may have noticed that one of the so-called error states was 'EOF' or "end of file". Of course, this is not really an error state. It represents the state of a stream when it has encountered the end of file character code. This code is equivalent to striking "Ctrl-D" on your keyboard (hitting the CTRL key and the 'D' key simultaneously) on Unix-like systems, or "Ctrl-Z" on Windows. Therefore, although it may seem strange, the 'cin' stream could be in the 'eof' state if the user typed Ctrl-D and all input prior to this had been processed.
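To tell the two situations apart, a program can ask the stream which state it is in once a read has failed. The sketch below assumes a hypothetical helper 'sum_all' that adds up integers until a read fails. The caller then checks 'eof' to see whether the input simply ran out.

```cpp
#include <iostream>
#include <sstream>

// Hypothetical helper: adds up integers until a read fails.
// The caller can then inspect the stream to see why reading stopped.
int sum_all(std::istream& in) {
    int total = 0;
    int x = 0;
    in >> x;
    while (!in.fail()) {   // stop at the first failed read
        total += x;
        in >> x;
    }
    return total;
}
```

After the call, 'in.eof()' is true if all of the input was consumed and false if an invalid character stopped the reading.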
That is all we will say here about input operators, functions, and manipulators. There are more ways to control input and more ways to accomplish the same things discussed here. For example, one can also use the member function 'width' to set the width of a string to be input, calling it on the stream just before the '>>' operation.
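A minimal sketch of the 'width' approach, assuming a hypothetical five-character array and a helper that accepts any input stream:

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: limits the next '>>' by calling 'width' on the stream.
std::string read_word_with_width(std::istream& in) {
    char word[5] = "";   // four characters plus the terminating null
    in.width(5);         // applies to the next '>>' only, like setw(5)
    in >> word;
    return word;
}
```

As with 'setw', the width returns to its default after the read, so the call has to be repeated before each '>>' that needs the limit.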