Basic Input Using Instances of the Class istream
A. The '>>' Operator
As you know, 'cin' is an instance of the class 'istream'. We could
declare our own additional instances of this class, but there rarely is a need
to. This section, then, will first review the functionality we already know, related to input with 'cin', and then explore some
additional elements.
Whenever we write a statement of the form 'cin >> variable;',
the bits being read from the input stream are interpreted in terms
of the type of the variable on the right hand side. In all cases,
the default behavior is to skip leading white space (blanks, ASCII character
code 32, as well as tabs and newlines) and then, when the first non-blank
character appears on the stream, read data into the variable until a
character is found that cannot belong to the value. For example, if the
variable is of type 'int' the system will skip all leading white space and
attempt to translate the codes that follow as digits in the range 0 - 9
(with the first character being an optional '+' or '-'). Reading stops at
the next blank or at the first character that is not a digit, and that
character is left on the stream. If no digits at all can be read, because
the first non-blank character is, for example, a letter or a comma, the input
fails because the system cannot translate this code into an integer value.
When this happens, the cin stream has a status of 'fail'. (We will see the
implications of this in a bit.)
Similar rules apply for doubles. However, with variables of type
char, only one character is read, and it is much harder to have
an input failure. Note that in both cases all leading white space
(blanks) is again skipped. The same process occurs with strings,
but here one has to be more careful. Since the system only stops
removing characters from the stream when it encounters a blank,
it is possible to overflow the memory set aside for the string.
If this happens, the system will blithely continue reading characters
into memory meant for other purposes until a blank is encountered.
The consequences of this could be disastrous. One nice feature:
when the trailing blank is encountered, a null character ('\0') is automatically
appended to the string so the string manipulation operations
will work as long as too many characters are not entered.
B. The 'get' Member Function
Since there is this danger of overflow and because it is not possible
using the ">>" operator to read in a string with
embedded blanks (the input stops at the first blank after the
first non-blank character), it is often better to use the member
functions 'get' or 'getline' when reading strings. The function
'get' actually has a number of different forms. The simplest form:
can be used to read in one character at a time. Since this function
does not skip leading white space, it can be used to read in a
blank. (One cannot do this with the '>>' operator because it always skips
leading blanks.) If blanks have significance as input, this could
be very useful.
Earlier, we saw how functions can be overloaded - to take on more
than one purpose - by using different types and/or number of parameters.
The 'get' function has been overloaded a number of times. When
called with, first, a string parameter and, second, an integer
parameter, it can be used to read characters until it encounters
a newline character ('\n') or until it has read "integer - 1"
characters. Thus, if we write 'cin.get(string, 5);'
the system will read in up to four characters (stopping sooner
if it encounters a newline character) AND add a NULL character.
The programmer must make sure that 'string' is large enough to
hold four characters and a NULL character but this is much safer
than assuming that the user will not enter too many characters.
This form of 'get' actually has a third parameter. We don't need
to use it because it has a default value of '\n'. Thus, when a
call to 'get' does not provide a third parameter, the function
performs as described - using the default value to indicate the
character to look for to stop reading in input. However, if, for
example, one wishes a program to read in possibly multiple lines
of characters until 80 characters or a period (.) is entered,
one could write:
In this case if one entered the following (on three separate lines):
the output would be on three lines - because of the newline characters
stored in the string!
Having learned about function overloading and parameter default
values, we can imagine how the function 'get' is declared in
the file "iostream.h" (named <iostream> in standard C++). There are some obscure points
about what are called 'far' pointers and 'unsigned characters'
in the actual declarations that we will skip, but if you examine
this file you will see three function declarations for 'get' that
are relevant to us:
The second and third versions of this function, for reasons that
are unimportant here, return the modified stream. Notice that
a reference to the stream is returned (that is what the '&'
means after the return type) so that a copy of the stream is not
made. The second version also needs to return a character via
its sole parameter; thus it has the '&' after the parameter type.
We have not discussed this before but technically one need not
put a variable name for each parameter in the declarations. All
that is required is the parameter type. However, it seems safer
and simpler to understand if we continue to provide parameter
names in our declarations.
We have already discussed the third version. It returns a string through its first parameter, thus that parameter is
a 'char*'. We do not need the '&' symbol because the '*' means
the address of the string is being passed, which lets the function change the caller's string just as the '&' lets it change the caller's character.
The second parameter is the maximum length of the string, including the null character, while
the third parameter is the delimiter, the character at
which input will stop if encountered before the maximum number
of characters have been read. Note that this third parameter
has a default value. In other words, as we already know, if the
third parameter is dropped in a call to 'get', the delimiter will
automatically be the newline (\n) symbol.
C. The 'getline' Member Function
Similar to 'get' is the 'getline' member function. The single
declaration of importance to us is:
If called as :
this function will read in up to 80 characters or until the newline character is encountered. There is no version of this function declared with simply a 'char' as the first parameter. Therefore, you cannot read data into a character type variable using 'getline' – as you can with the first two versions of 'get'. This makes sense because, as its name implies, 'getline' is meant to read in a line at a time. You can, however, define what a 'line' means to you by choosing to call this function with a third parameter, thus overriding the value of '\n' as the delimiter.
The main difference between 'getline' and 'get' is that 'getline' does
remove the delimiter from the stream. As we saw in chapter 10, Section 8,
when using the 'get' function, it is often necessary to also use
the 'ignore' member function to remove the newline character before
proceeding with the next 'get'. If this is not done, the next
'get' will also encounter the newline and will stop immediately.
The 'getline' function does all this for us. With keyboard input,
where typing the 'Enter' key is crucial to getting information
processed, the difference between 'get' and 'getline' is not too
important. One can either use 'get' and 'ignore' together or use 'getline' by itself. However, in processing files, the difference can be more significant.
D. The Use of the setw Manipulator
There are a number of function-like operators, called manipulators,
that can be used to change the way the input and output operators
('>>' and '<<') work. We have already seen one of
these - 'endl'. There is a second one we will explore briefly
here in conjunction with input.
The manipulator 'setw' (short for set width) can be used to indicate
the maximum number of characters to be read in with the '>>'
operator. In other words, it is a way to use '>>' to read
in strings and guarantee that the string array does not overflow.
It does not, however, change the fact that '>>' stops when
it encounters a blank. Thus you cannot use it to read strings
that contain blanks.
Consider the following code:
Now consider the same code with two lines added:
Note that 'setw' only affects the very next '>>' operation.
Thus, it needs to be repeated each time you want to restrict the
number of characters read. Also, to use it in a program, one must
include the header file 'iomanip.h' (named <iomanip> in standard C++).
E. Handling Invalid Input
By now it is likely that you have managed more than once to get
yourself into an infinite loop by entering some invalid data.
For example, if you have a loop that expects the user to continuously
enter integers until some sentinel value is entered, but the user
types a non-numeric character, the program will go into an infinite
loop. Consider the following very simplistic code:
int x;
cout << "Enter a number\n";
cin >> x;
while (x != 0)
{   cout << " A number was read\n";
    cout << "Enter a number\n";
    cin >> x;
}
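This code will cause such an infinite loop if the user types in
non-numeric characters. (The symbols '+' and '-' are legal as the first character.)
Note that this describes compilers in which a failed read leaves 'x' unchanged.
Under C++11 and later, a failed read stores 0 in 'x', so this particular loop
would simply end; the stream is still left in the 'fail' state, however, and the
discussion below applies either way.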
The reason has nothing to do with the quality of the code and
every thing to do with the way C++ represents and handles streams.
As noted earlier, associated with every stream variable such as
'cin' and 'cout', is a data structure representing the status of
the input or output operations on the stream. Possible status
values include:
When, for example, the system attempts to read a non-numeric character
into an integer variable, an error occurs that sets the stream
status to 'fail'. Once this happens, any further request to read
is ignored. Therefore, if a user types, for example, a 'Q' when
asked for a number in the above code, the status flag is set
to 'fail', the input is ignored, and the system returns to the
top of the loop. Since 'x' does not yet equal 0, the loop again
attempts to execute the line 'cin >> x;'.
However, the instruction is ignored because the status flag has
been set to 'fail'. This means that the value of 'x' has not changed
and the loop repeats again. In fact, 'x' will never change because
the status flag will always be 'fail' and the system will ignore
the read instruction. We have an infinite loop!
To fix this we need to check for the 'fail' error status, clear the status flag if necessary, get rid of the illegal character and then continue. Below is a code fragment that demonstrates. It uses a 'while' loop that simply keeps asking for numbers until the user enters a 0. Note that the 'while' instruction itself is not part of the error handling. It is included because such error handling is needed when an input fails inside a loop.
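int x;
char ch1;
cout << "Enter a number\n";
cin >> x;
while (cin.fail() || x != 0)     // keep looping after a failed read, too
{   if (cin.fail())
    {   cin.clear();             // reset the status so reads work again
        cin >> ch1;              // remove the offending character
        cout << "Invalid value entered\n";
    }
    else
    {   cout << " A number was read\n";
    }
    cout << "Enter a number\n";
    cin >> x;
}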
The function call 'cin.fail()'
checks the error status of 'cin'. If an error has occurred, the
code inside the 'if' clears the status flag ('cin.clear()'), removes the
offending character from the stream ('cin >> ch1') and reports the problem.
If an error has not occurred, this code simply announces that the read was a success. In a real application, the line
that prints " A number was read" would be replaced by whatever was to be done with valid input.
In both cases (valid or invalid data), another number is requested and read, and the system starts
again. Here you see another pattern (Pattern G1, "Clean Handling of Invalid Data") that you might want to consider
adding to your code, whenever you are working with input that could
be invalid.
By the way, you may have noticed that one of the so-called error
states was 'EOF' or "end of file". Of course, this
is not really an error state. It represents the state of a stream
when it has reached the end of its input. From the keyboard, this
is signaled by striking "Ctrl-D"
(hitting the CTRL key and the 'D' key simultaneously) on Unix-like systems;
on Windows the equivalent is Ctrl-Z followed by Enter. Therefore,
although it may seem strange, the 'cin' stream could be in the
'eof' state, if the user typed Ctrl-D and all input prior to this
had been processed.
That is all we will say here about input operators, functions,
and manipulators. There are more ways to control input and more
ways to accomplish the same things discussed here. For example,
one can also use the member function 'width' to set the width
of a string to be input as in:
Topics Covered in the "Essentials of C++"